Vincent Slot (Team lead – Search R&D) and Kasper Kok (Product owner – Knowledge Resources) offer insight into the topic in our on-demand webinar, answering:
- What is AI?
- What are the practical applications of AI in talent acquisition?
- What powers AI technology (in ways I can understand)?
Don't want to watch the entire webinar?
Kasper Kok: Hello everybody and welcome again to this webinar on AI for talent acquisition, brought to you by Textkernel. We’ll be talking about a bit of the technology behind AI today, hoping that by the end of this session all of you will understand what this technology is about. One practical note before we get started: if you have any questions, or if anything we say goes over your head, feel free to ask in the Q&A functionality of Zoom. You can find this in your Zoom menu. Do not use the chat but the Q&A button; that way we’ll make sure to get back to your questions at the end of the webinar.
My name is Kasper Kok. I am a product owner here at Textkernel, working with all of the R&D teams and especially the Knowledge Graph Technologies here, and with me is Vincent.
Vincent Slot: Hello, my name is Vincent Slot. I lead the search R&D team here at Textkernel. I have a background in AI and I’m excited to do this webinar today.
Kasper Kok: So, before we begin, a few notes about Textkernel. We are a software company. We make software for matching people to jobs and jobs to people using AI technology. We’ve been doing this for over 15 years now, so we were doing this well before AI became the big hype that it is nowadays. We also have software in the domain of labor market intelligence, but that will not be the focus of our webinar today. We are headquartered in Amsterdam and have been proud members of the Career Builder Group since 2015.
So, the mission that we have in this webinar is to sort of demystify the concept of AI. So, to make you understand the basic concepts of what this technology is about and how it can be applied in talent acquisition. After a general introduction, we will be focusing on sort of zooming in on three of the main applications that we are most familiar with. Which is transforming documents into data structures, semantic search, and finally we’ll tell you how all of that comes together in trying to match people to jobs and vice versa using AI technology.
So, let’s get started with AI and talent acquisition. I’d first like to give you a historical perspective. One way to see why AI is such a hype nowadays is that it can be seen as a sort of 4th industrial revolution. An industrial revolution is generally characterized by the invention of a set of techniques that make something possible that wasn’t possible before. In the 1700s and 1800s, the invention of steam and electricity drastically lowered the cost of the production of goods and of transport by making certain processes much faster and more efficient. In the 1900s, when computers came along, the sharing, retrieval and storage of information became a lot more efficient and cheaper. And similarly, since the 1990s a new set of techniques has been developed, collectively called artificial intelligence, which can help to drastically lower the cost of cognitive tasks.
And cognitive tasks are basically every sort of operation you would perform with the human brain, which can be anything from reading documents to recognizing images or audio: anything we do with our brain. In talent acquisition, there are of course quite a few of those tasks. The ones that are most suitable for being either performed or simplified using AI technology are those that are relatively simple and repetitive, so things we do over and over again using the same thought processes. One good example of that is capturing information from HR documents such as CVs.
So, traditionally it would be a human job to read CVs, big files of CVs, extract information from those and put it in a database. Nowadays, this is typically something that can be automated to a large extent using a technique called Machine Learning, which is a way of doing artificial intelligence. Other tasks are finding the right search terms or dealing with linguistic variation, which we’ll talk about in the 3rd part of this webinar, and the general task of matching CVs to jobs and vice versa. This is also something that can be simplified to a large extent using AI.
Now in talent acquisition as a whole of course there’s a lot more going on and there are also things such as social engagement with candidates, making decisions in a strategic context, maybe interviews. These are things that are generally a bit less suitable for AI we believe. This will probably always stay in the hands of human intelligence. And this is because AIs are generally not very well able to learn social behaviors in the way that humans do.
So, let’s get to the core of the question: what is AI? One definition would be that AI is the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. And this definition is kind of neutral to the type of technique that is being used. Any simple program or simple piece of code that in some way mimics human intelligence could be considered AI. However, in modern times, what people really mean when they talk about artificial intelligence is machine learning. So, machine learning is one sort of implementation or one type of creating AI systems that is based on large amounts of data, and algorithms that can automatically learn patterns from those data in order to transform any input into an output. And we’ll get back to this on the next slide.
Another term you might have heard a lot in recent years is Deep Learning. So, deep learning is a type of machine learning which involves an algorithm called a Deep Neural Network. Deep learning is the type of technique that is often used in things like image recognition and self-driving cars and it’s a very powerful sort of way of approaching machine learning.
Now this slide shows you the three basic steps of machine learning or making a machine learning system for in this case kind of a toy task which is to recognize pictures of cats and dogs. And the first step that would be involved in doing so would be to collect lots and lots of data. In this case, the data would consist of large collections of labeled pictures of dogs and cats. The next step would be to train a model as we call it to learn the relationship between the content of the pictures and the label. And a model can be anything from a very simple mathematical function to something more complex.
What you see in this picture depicts a neural network, which is a way of stacking lots and lots of simple mathematical functions on top of each other. These neural networks are able to learn very complex relationships between input and output. After training a model to understand these relationships, the last step is that it’s able to make predictions for new inputs. So, if you feed this network a picture of a cat that it has never seen before, one that was not part of the training data, it will be able to classify this picture with a certain confidence level.
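The three steps described here (collect labeled data, train a model, predict on unseen input) can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: instead of real pictures and a neural network, it uses two made-up numeric features per animal and a very simple nearest-centroid model. The feature names, the data and the function names `train` and `predict` are all hypothetical and not taken from the webinar.

```python
import math

# Step 1: collect labeled data. Each example is (features, label).
# The two features (ear pointiness, snout length) are invented for illustration.
TRAINING_DATA = [
    ((0.9, 0.2), "cat"),
    ((0.8, 0.3), "cat"),
    ((0.7, 0.1), "cat"),
    ((0.3, 0.8), "dog"),
    ((0.2, 0.9), "dog"),
    ((0.4, 0.7), "dog"),
]


def train(examples):
    """Step 2: learn one average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def predict(model, features):
    """Step 3: classify an unseen input and return (label, confidence)."""
    # Closer centroids get higher scores; scores are normalized to sum to 1.
    scores = {
        label: math.exp(-math.dist(features, centroid))
        for label, centroid in model.items()
    }
    total = sum(scores.values())
    label = max(scores, key=scores.get)
    return label, scores[label] / total


model = train(TRAINING_DATA)
# This input does not occur in the training data.
label, confidence = predict(model, (0.85, 0.25))
print(label, round(confidence, 2))
```

A real image classifier would learn from pixels rather than hand-picked features, but the shape of the process is the same: labeled examples in, a trained model out, and a prediction with a confidence level for new input.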
So, these are the three main steps involved in machine learning, and this is also exactly what we do at Textkernel when we try to make sense of CVs or vacancies. Vincent will explain how that works in the next section. But before we get there, I’d like to summarize how we see the role of AI in talent acquisition in relation to, let’s say, human intelligence. We see this really as a collaboration, in the sense that AI will be simplifying the tasks that machines are good at, which is generally dealing with repetitive cognitive processes that involve large sets of text, images or any other kind of data. Humans can then focus on what they are good at. Relative to computers, humans really excel at social interactions, bringing different contexts together, stakeholder management, strategic thinking, all these things. So we believe that AI is really for humans and not supposed to replace humans.
Vincent Slot: Right, so Kasper has given a high-level overview of what AI is, so let’s dive a bit deeper into how AI can actually help people working in the HR domain or in talent acquisition. A typical pipeline of AI systems in the HR domain could be represented like this. On the left we have the raw data. In this case, that would be HR documents, like CVs or vacancies, but in theory that could be any type of document that contains text. The first step is to turn that raw data into structured data, because structured data is something that we can work with and that has a meaning. Just a big text about a topic is not something a computer can understand.
So, the first step is to extract the structured data from the raw document. Hidden in that data are many patterns, and there’s meaning hidden in the text; those structures can be identified by AI as well. Lately you may have heard the term big data a lot. That’s generally the process of finding those patterns in the data. If we relate that to the HR domain: which skills are related to each other? Which skills are relevant for a certain job? Which jobs are related to each other? That kind of knowledge is hidden in the data, and we use AI to capture it.
Well, so we use both this structured data and the additional knowledge that we mine from the data in searching and matching. This is of course an application that is very common in HR and TA. So, if we can make that more efficient using this data, that would be awesome.
Let’s start with the first part here, at the beginning, when we have raw documents and we want to get some sensible information from them that we can use. If we look back at how Kasper explained how AI generally works, the three steps: well, since this is the human resources domain and not the animal resources domain, cats and dogs are not the type of data that we’re interested in. It would probably look more like this, where for example the input document is a CV. And instead of piling the CVs into two groups, cats and dogs or other groups, we annotate on this CV where the interesting bits of information are and what they are, like what the name in a resume is or what type of work experience a person has.
If we feed a lot of those annotated, labeled documents into a machine learning model, then it will be able to generalize. One of the most important properties of a machine learning model is that it can generalize. So, it recognizes a pattern that occurs a lot in resumes, like date, job at company. Here we see 1977, human resources generalist at Avnet. That is a very common pattern, so the network learns to generalize that. If this sentence had said 1977, human resources generalist with Avnet instead of at, then the model would have been able to generalize and still provide the correct labels, even though that input sentence was probably never in the training data. So, that’s the big advantage, the takeaway message here: these models can generalize really well, based on all the examples that they’ve seen.
So, if we look at the information that we can get from a resume, then there are all types of interesting things in there. Like the education that a person had or the skills that a person has, and of course you want to look for those in the appropriate section of the resume. So, you don’t want to look for skills in a personal section or you don’t want to look for work experience in education for example. So, the first step is to segment this CV into the most, well let’s say common sections that resumes are made of. And then you search for the interesting bits of information within those sections.
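The two-stage idea described here, first segmenting a CV into sections and then looking for specific information only inside the appropriate section, can be sketched in Python as follows. This is a hand-written stand-in and not the approach from the webinar: a trained model learns section cues and patterns from annotated CVs, whereas this sketch hard-codes a few hypothetical headings and one regular expression for the "year, job at company" pattern. The sample CV text, the name and the university are invented.

```python
import re

# Hypothetical section headings; a trained model would learn such cues
# from annotated CVs instead of using a fixed lookup table.
SECTION_HEADINGS = {
    "personal": "personal",
    "work experience": "experience",
    "education": "education",
    "skills": "skills",
}

# Hand-written stand-in for the learned pattern "year, job title at/with company".
EXPERIENCE_PATTERN = re.compile(
    r"(?P<year>\d{4}),?\s+(?P<title>.+?)\s+(?:at|with)\s+(?P<company>.+)"
)


def segment(cv_text):
    """Stage 1: split a CV into sections keyed by normalized section name."""
    sections, current = {}, None
    for line in cv_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        heading = SECTION_HEADINGS.get(stripped.lower())
        if heading:
            current = heading
            sections[current] = []
        elif current:
            sections[current].append(stripped)
    return sections


def extract_experience(sections):
    """Stage 2: look for work experience only inside the experience section."""
    results = []
    for line in sections.get("experience", []):
        match = EXPERIENCE_PATTERN.match(line)
        if match:
            results.append(match.groupdict())
    return results


# Invented sample document for illustration only.
CV = """Personal
Jane Doe

Work experience
1977, Human Resources Generalist at Avnet

Education
1975, Bachelor of Arts at Example University
"""

sections = segment(CV)
experience = extract_experience(sections)
print(experience)
```

Note that the education line has the same surface pattern as the work experience line, but it is not picked up as work experience because the extractor only looks inside the experience section.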
So, this is also what the annotated CVs would look like. We would mark this section and say it contains this information. For the vacancy side, we can do more or less the same thing, right? There is a general, common structure to vacancies as well, and we can annotate those and look for the interesting bits. In the case of vacancies, the interesting bits are often the requirements for a job, and we can give those to a machine learning model. So, I guess what makes the AI strong here is that you can write a resume or a vacancy in a million ways, and this machine learning model is still able to handle it. In this example, we have a benefits section.
Well, that’s not something that every vacancy has. The machine learning model would be able to generalize and still do a correct parsing. So, that’s all very nice: we know what the bits of information are, what they mean, what type of information they are. But how can we actually use that? This first step may already save a lot of time, because you no longer have to fill in forms manually based on resumes. Not having to enter data is already a huge time saver. But we can also use this information in a smarter way, for searching.
How do we do that? This very specific information enables us to do what’s called faceted search, and you all know it from web shops. It’s a very common way to search web shops, where you can search very precisely on the attributes that you want to look for. So, this is what AI enables us to do on these unstructured HR documents, because we now have much more structure in the data. We can do a much more precise search on that. So a more powerful tool is Semantic search and Kasper is going to tell a bit more about that.
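Faceted search as described here boils down to filtering structured records on specific attributes rather than matching words anywhere in a document. A minimal Python sketch follows. The `Candidate` fields, the sample records and the `faceted_search` function are all hypothetical; a real system would run this against a search index rather than a list in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Structured data as it might come out of a CV parser (fields are illustrative)."""
    name: str
    job_title: str
    years_experience: int
    skills: set = field(default_factory=set)
    location: str = ""


# Invented sample records.
CANDIDATES = [
    Candidate("A", "Java Developer", 7, {"java", "spring"}, "Amsterdam"),
    Candidate("B", "Data Engineer", 3, {"hadoop", "python"}, "Utrecht"),
    Candidate("C", "Java Developer", 2, {"java"}, "Amsterdam"),
]


def faceted_search(candidates, job_title=None, min_years=None, skills=None, location=None):
    """Keep only candidates that satisfy every facet that was specified."""
    wanted_skills = {s.lower() for s in (skills or [])}
    results = []
    for c in candidates:
        if job_title is not None and c.job_title.lower() != job_title.lower():
            continue
        if min_years is not None and c.years_experience < min_years:
            continue
        if not wanted_skills <= c.skills:
            continue
        if location is not None and c.location.lower() != location.lower():
            continue
        results.append(c)
    return results


hits = faceted_search(CANDIDATES, job_title="java developer", min_years=5)
print([c.name for c in hits])
```

Because each facet is checked against its own field, a search for the skill "java" does not accidentally match a candidate who only mentions Java in, say, a hobby section.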
Kasper Kok: Alright, so now we know how, by using machine learning, we’re able to extract structured information or transform documents into a database format, and how that can be leveraged to search this information. There’s one problem left before we can do this effectively. Human language has this annoying property of expressing exactly the same things in different ways. So, if you see here on the left, there’s a vacancy and a CV. On the right, what you see is that the three concepts that are mentioned, the ones that are marked bold, are exactly the same things but written in different ways. So, you might write about having managed projects instead of project management. Maybe someone calls the same thing e-learning that another person would call digital learning. Advising and consulting are of course kind of synonyms. But in a traditional search system, these would not be recognized as referring to the same thing.
So, this is why one could add knowledge to a search system. This constitutes what can be called semantic search. Semantic actually just means ‘meaning’, so it’s about making the search more meaningful rather than searching for exact words. In order to do this, there’s something needed in the background which is called a ‘knowledge base’. Knowledge bases are basically structured inventories of all sorts of concepts. In the TA world, this would be things like all professions that people could have, all skills, education levels, etc., as well as the relations between these professions. So, for instance, you might store which words actually refer to the same profession or which ones are closely related to each other.
This is then extracted from the parsed documents through a method called ‘Clustering’. We’re not going to explain the details of that, but it’s similar to machine learning in the sense that it uses large quantities of data to extract statistical patterns. Using this knowledge base, it is possible to create a more effective search system, for instance by leveraging this knowledge in autocomplete. Autocomplete, which you might also know from other search engines, is where you type a few letters and get suggestions as to what you would like to type.
Now the Autocomplete suggestions come from this knowledge base and will tell you what are the most likely things you want to search for. And it helps to sort of enforce a common language between what you’re typing and what you’re likely to find in a document. Another thing that’s possible to do using semantic search is to expand a search query. This means that alternative wordings or phrasings of something you type could be automatically added to this. So, if you type project manager in a semantic search system, it could be automatically augmented with other variations of this word. Such as projects manager or maybe without a space or just a PM. So, this would be an acronym meaning the same thing.
Similarly, in German, where the gender of professions is often explicitly marked as being male or female, it would be important to automatically add, in this case, the feminine version to the masculine one. This way we also avoid gender bias. Adding spelling variations is of course very useful and makes searching much easier, because as a user you won’t have to worry about that so much. Apart from that, it’s also possible to leverage AI-based knowledge resources in order to add domain knowledge. Here you see an example of someone searching for a candidate with experience in Hadoop. Hadoop is a big data engineering framework. What you see is that there are a lot of related technologies suggested to be automatically added to the search query.
And the philosophy here is that someone with experience in any of these technologies is very likely to be a good candidate for a Hadoop job, because these technologies are so similar. Now, some experienced IT and big data recruiters might know this and might be able to handle this themselves. But the big benefit of semantic search is that all of this domain knowledge is embedded in the software and therefore no longer has to come from the recruiter. That can be a very big advantage, especially for those with less experience in a certain professional domain.
In general, you do this for all the words you type. Here you see on the left that a 3-word user search, Java Hadoop Cloud, would under the hood actually result in a search query of more than 60 words, which drastically increases your chances of finding relevant candidates. So, to sum up some of the benefits of AI-enhanced search: first of all, you can search in a very targeted manner on specific attributes of candidates or jobs rather than just using full-text search. Second, there’s no need to worry about all the different ways people could refer to the same things, for instance using different spelling variations or synonyms. And finally, domain knowledge is embedded in the software and therefore no longer required of the user of this system. Now let’s get to the grand finale about how this all comes together in matching people and jobs.
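The query expansion described in this part can be sketched in a few lines of Python. The knowledge base below is a tiny hand-written dictionary with illustrative entries; in the setup described in the webinar, this knowledge is mined from large amounts of parsed documents. The data structure, the related technologies listed for Hadoop and the function names are assumptions made for this example.

```python
# Tiny illustrative knowledge base: per concept, spelling variants or synonyms
# and closely related concepts. The entries are examples, not real product data.
KNOWLEDGE_BASE = {
    "project manager": {
        "synonyms": ["projects manager", "projectmanager", "pm"],
        "related": [],
    },
    "hadoop": {
        "synonyms": [],
        "related": ["spark", "hive", "mapreduce"],
    },
}


def expand_term(term, include_related=True):
    """Return the term plus its known variants and, optionally, related concepts."""
    key = term.lower()
    entry = KNOWLEDGE_BASE.get(key, {})
    expanded = [key]
    expanded += entry.get("synonyms", [])
    if include_related:
        expanded += entry.get("related", [])
    return expanded


def expand_query(terms, include_related=True):
    """Expand every term the user typed; unknown terms are kept as they are."""
    return [expand_term(term, include_related) for term in terms]


def to_query_string(expanded):
    """Variants of one term are OR-ed; the user's separate terms are AND-ed."""
    groups = [
        "(" + " OR ".join(f'"{variant}"' for variant in variants) + ")"
        for variants in expanded
    ]
    return " AND ".join(groups)


query = to_query_string(expand_query(["Project Manager", "Hadoop"]))
print(query)
```

The user still types only two terms, while the query that actually runs contains every variant the knowledge base knows about. Keeping related concepts behind an `include_related` switch is one possible design choice for letting a user decide how broad the search should be.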
Vincent Slot: Yeah, so searching for candidates via search queries is one thing. But if this AI system has extracted all these useful bits of information from both the vacancies and the CVs that we have, then why would we still want to type the query manually, when we know exactly which things are required and what skills people have to offer? So, that’s what we see here, and it works regardless of how a certain job is referred to in either a CV or a vacancy. For example, on the left here we see Java developers; on the right we see a vacancy that has the Dutch translation. These two phrases refer to the same thing, namely a developer who knows Java.
So, if our semantic system understands all that, it understands that if you’re looking for someone with 5 to 10 years’ experience and a person has listed 7 years of experience on the resume, then those two are a good match. So, instead of having to type all these requirements in there, we can automatically search for those, because we know them from both sides. We know exactly what a person has to offer. We know exactly what a vacancy is looking for. So, yeah, that’s something that can save you a lot of time, and you can see on which facets the search result matches.
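Matching a parsed vacancy against a parsed CV, with a per-facet explanation of the result, could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the concept table that maps an English and a Dutch job title to one shared identifier, the dictionary layout of the vacancy and the CV, and the equal weighting of the three facets.

```python
# Illustrative concept table: different wordings map to one shared concept.
# "Java ontwikkelaar" is used here as a Dutch wording of "Java developer".
TITLE_CONCEPTS = {
    "java developer": "JAVA_DEVELOPER",
    "java ontwikkelaar": "JAVA_DEVELOPER",
}


def normalize_title(title):
    """Map a job title to its concept; unknown titles fall back to lowercase text."""
    key = title.strip().lower()
    return TITLE_CONCEPTS.get(key, key)


def match(vacancy, cv):
    """Compare a parsed vacancy with a parsed CV and return (score, facets)."""
    min_years, max_years = vacancy["experience_range"]
    required = {s.lower() for s in vacancy["skills"]}
    offered = {s.lower() for s in cv["skills"]}
    facets = {
        "title": normalize_title(vacancy["title"]) == normalize_title(cv["title"]),
        "experience": min_years <= cv["years_experience"] <= max_years,
        # Fraction of required skills that the candidate offers.
        "skills": len(required & offered) / len(required) if required else 1.0,
    }
    # Equal weights per facet: a simplifying assumption for this sketch.
    score = (facets["title"] + facets["experience"] + facets["skills"]) / 3
    return score, facets


# Invented sample data in the shape a parser might produce.
vacancy = {"title": "Java ontwikkelaar", "experience_range": (5, 10), "skills": ["Java", "SQL"]}
cv = {"title": "Java Developer", "years_experience": 7, "skills": ["java"]}

score, facets = match(vacancy, cv)
print(round(score, 2), facets)
```

No query is typed anywhere: the vacancy itself acts as the query, and the returned facets show on which points the CV does and does not match.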
So basically, instead of spending a lot of time finding the right query and combining terms, we can do that in one click if we use AI. And this opens up a plethora of use cases, because if AI powers this automatic matching, then we can use that in very many places, like sourcing from your own ATS for example, or from external databases like LinkedIn. You can reduce wasted opportunity by trying to place your silver medalists. There are all kinds of different ways in which AI can help you make your work more efficient.
So, if the whole technical explanation of AI was, let’s say, a bit too complicated, then this is basically what you need to remember. AI is not a magic algorithm that’s going to solve all your problems. It’s not going to replace humans in this process, but there are some very useful and very, let’s say, down-to-earth ways in which AI can benefit talent acquisition. You don’t have to type your data into a database anymore. You don’t have to have all the domain knowledge when you want to search for the right candidate. You don’t even need to type queries anymore. So, AI enables all of these processes that can really enhance the search for the right candidate.
Kasper Kok: Alright, it is time for the Q&A session. I see we have quite a few questions that have come in. We’re going to start with Stein’s questions; there are actually two. How do you look at dealing with soft skills and matching? And: we will move beyond CVs more and more, so what’s your vision on this?
Vincent Slot: Yes, that is actually a very interesting question. This is one of those cases where just searching on a term is not going to help you. If you just search for ‘must be able to communicate well’, I mean, no one has exactly that in their resume. So, understanding which soft skills relate to each other and which soft skills mean more or less the same thing can already help you greatly. Something that we didn’t cover in this webinar is that this is typically something that Deep Learning can help you with in matching, because it has a more subtle grasp of what the meaning of words is. I’m not going to explain it right now, because I could talk for an hour about that. But yeah, this is typically something that AI would be very helpful for.
Kasper Kok: I think there are a couple more questions about soft skills coming up. But first, let’s move on to this question: do you have to train your CV parser or your vacancy parser for every language separately? This is also an interesting question. The answer is generally yes. We have CV parsers in over 20 languages, for which we repeat this process over and over again. So, for every language, we have native speakers annotate data and then we train models, and this is how parsers are created. However, recently some interesting developments have been going on, where it turns out that for closely related languages it is possible to train a single model that performs well for these different languages simultaneously.
We’ve actually implemented this for a number of Slavic languages, where actually we have gotten really promising results and this is already in production right now. So, that’s basically the answer. For closely related language families, it is possible to sort of leverage what you know for one language for the other ones as well.
Vincent Slot: Yeah, the advantage of that is obviously that there are languages for which you have very few resources, where it’s hard to get annotated documents for. So typically, for those languages, you would want to leverage whatever cross lingual information you can use.
Kasper Kok: Yeah. Another question is about soft skills from the perspective of the knowledge base. It is of course possible to store collections of soft skills, just like one could do for other skills. This is something that is generally used in combination with other features and other skills. It is also possible to connect custom knowledge bases with skills, but bear in mind that maintaining a knowledge base is really something that is not to be underestimated. We would recommend only having this maintained by specialized, let’s say, knowledge base experts.
Vincent Slot: Okay, let’s maybe take one more question. Unfortunately, it looks like we’re not going to have time to answer all of the questions. So, one more question that we can answer. Kim asks: how do you prevent AI from becoming biased? That’s actually a very interesting question, because lately you may have heard a lot of stories about AI discriminating or doing things that are socially not acceptable. The answer to that is that our searching and matching is all white box. So, you see what you’re searching for, and that is of course something that any AI system could do. In the background, you use AI to enhance every aspect of searching, but what you’re actually searching for is what you see. And on top of that, you can for example leverage the gender normalization that Kasper mentioned. So, AI can account for that.
Kasper Kok: So, two more notes before we conclude this. Please feel free to ask any other questions through email. You see our email addresses here on the screen, and you can always get more information or contact us through our website as well. The recording of the webinar will be shared with you through a link by the end of the week. And before we fully conclude, we are going to launch a poll, and we’d really appreciate it if you would fill in the couple of questions that will be appearing on your screen right now. That’s it for us. Thank you very much for tuning in.