Understanding bias in recruitment AI
What is bias and why is it a problem?
The process of recruitment strives to find the perfect candidate for an employment position. Ideally the candidate should fulfill all required criteria to be able to execute the job as well as possible.
However, in reality human judgment tends to be less objective. In practice, factors that are not necessary to execute the job satisfactorily play a role in the selection process. For example, recruiters may take ethnicity, gender, or their familiarity with a candidate’s educational institutions or previous employers into account when making a decision. In most cases the recruiter is not even aware of this effect. This is called ‘unconscious bias’.
Numerous studies have confirmed that in HR, unconscious bias is a significant factor in the unfair distribution of opportunities and in decreasing diversity in the labour market.
At Textkernel, we are dedicated to championing responsible AI in recruitment, placing a premium on ethical practices and inclusivity to pave the way for a brighter, more equitable future.
MITIGATING BIAS AND ENSURING RESPONSIBLE AI
Using AI can be a double-edged sword. It can cause harm when used carelessly, but it can equally promote fairness and reduce bias when employed responsibly – which is crucial to ensuring ethical outcomes.
Bias in AI systems
Just like a recruiter, Artificial Intelligence (AI) recruitment software can have biases. To understand how that happens, we need to understand at a very high level how AI systems work. Don’t worry, we’ll also explain how Textkernel works to prevent biases from appearing in its AI systems, and how our AI systems can increase fairness and diversity in your recruitment process.
Understanding the Learning Process of AI Systems in Recruitment
AI systems are learning systems. They are ‘taught’ to make a prediction based on training data.
For example, an AI system can learn to predict which candidates might be suitable for a job, based on previous hiring decisions made by recruiters. The system may learn that if a candidate mentions experience with programming languages, that person may be suitable for a software engineer job. After all, successful candidates for software engineer jobs in the past had experience with programming languages.
Gender Disparities and Unconscious Bias in Tech Recruitment
However, male employees are simply more common in the tech industry than female employees. Combine this with a possible unconscious bias of tech recruiters to slightly favor male candidates, and this may lead to an overrepresentation of successful male candidates in the historical data from which the AI system learns. As a result, such an AI system may wrongly learn that male candidates are more suitable for engineering jobs, and therefore disadvantage female candidates!
AI’s complexity makes it nearly impossible to fully eliminate biases
Even simply withholding all gender information from the document when training the AI system is not enough. These systems are so powerful that they pick up other, unforeseen subtle signals to reconstruct the gender information. For example, an algorithm can still infer the candidate’s gender from phrases in the CV (like “women’s chess club captain”). Armed with this inferred gender, the algorithm would still perpetuate the bias from its training data. Without the proper tools and processes in place, the developers of the AI system would not even realize that their system is now reinforcing and even amplifying an existing bias.
The latest generation of AI algorithms (Deep Neural Networks) is mathematically so complex that it is impossible to completely detect and remove these biases in the current state of technology.
REAL WORLD APPLICATIONS
Responsible use of AI in real world applications
Now that we’ve looked at how AI can be harmful when used carelessly, it’s time to look at how to use AI in a safe and ethical manner. This is called Responsible AI. In fact, when used responsibly, AI can help reduce bias instead of amplifying it.
DESIGNING SYSTEMS AROUND USER AND AI BIAS
There are numerous examples of AI algorithms being employed successfully in a wide variety of applications where bias is unlikely to occur. Think, for example, of spam filtering. This application is not very sensitive to introducing or perpetuating bias, because it operates in a problem space that doesn’t rely on factors that may be related to sensitive data. Deciding whether an email is spam does not depend on the ethnicity, gender or religion of the user.
Unfortunately, in the last few years, there have also been numerous examples in the news of AI systems making biased decisions: for example, a bank deciding whether to give someone credit, or a government deciding whether someone poses a risk of social welfare fraud. We’ve also seen famous examples of AI-based candidate-job matching gone wrong. The common problem in all these cases? Letting AI mimic previous human decisions for problems that are very sensitive to bias and directly affect the lives of real people. The kinds of decisions in these examples require, on top of the “hard” data, common sense, intuition and empathy, all of which AI lacks in the current state of technology.
Thus, given that AI can create or perpetuate biases, the first question we should always ask is whether AI is the appropriate tool for the job. For solving problems where bias is lurking around the corner, the answer is most often: NO.
At Textkernel, we separate the problem of matching candidates to jobs into two different steps: document understanding and the matching itself. We have seen that AI can learn and amplify the biases in your hiring data, so it is better not to use it in the matching algorithm. There are other algorithms that can do the matching in a transparent and controllable fashion. However, when it comes to extracting information from a CV or job description, the risk of bias is small if this is done in a responsible manner.
Mitigating bias in AI
Whenever AI is the right tool for the job, it is crucial to have the right checks and balances in place. This will ensure that any bias that may arise from the AI solution is minimized. Recently, many tech giants (e.g. Google and IBM) have formalized processes to minimize the bias that their machine learning algorithms produce. Following in the steps of these companies, Textkernel has also formalized such processes in a Fairness Checklist. Following this checklist makes us aware of the potential biases that may arise. Based on it, we can decide whether we must take measures to mitigate bias, so that the software we develop is safe and as unbiased as possible.
Let’s look at some examples of these measures in the context of profession normalization. This is the process of ‘normalizing’ a free-form job title to a concept. For example, there are many ways to write “Java Developer”, ranging from “J2EE Full Stack Engineer” to “Java Ninja” and everything in between. This is in fact a lower-risk problem for AI, as the result of the job title normalization is not influenced by a person’s ethnicity or religion but only by the free-form job title.
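To make the input and output of profession normalization concrete, here is a minimal sketch in Python. It is not Textkernel's implementation: the keyword table, the concept names and the function name are hypothetical, and a real system would learn the mapping from data rather than use a hand-written lookup.

```python
import re
from typing import Optional

# Hypothetical toy taxonomy: normalized concept -> keywords that signal it.
# A production system would learn this mapping; the lookup only
# illustrates what goes in and what comes out.
CONCEPT_KEYWORDS = {
    "Java Developer": {"java", "j2ee"},
    "Waiter/Waitress": {"waiter", "waitress"},
}

def normalize_job_title(raw_title: str) -> Optional[str]:
    """Map a free-form job title to a concept, or None if nothing matches."""
    # Split into lowercase alphanumeric tokens so that "JavaScript"
    # does not accidentally match the keyword "java".
    tokens = set(re.findall(r"[a-z0-9]+", raw_title.lower()))
    for concept, keywords in CONCEPT_KEYWORDS.items():
        if tokens & keywords:
            return concept
    return None

print(normalize_job_title("J2EE Full Stack Engineer"))
print(normalize_job_title("Java Ninja"))
print(normalize_job_title("JavaScript Developer"))
```

Note that the only input is the job title itself, which is why no ethnicity or religion signal can enter this step.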
However, the gender of the person could influence the quality of this AI system. This is because “gendered” job titles could be more common in the AI training data in one of their forms (e.g. actor/actress, waiter/waitress). In such cases, the AI algorithm could learn to normalize one of the forms better than the other. Therefore, one of the measures we can take is to ensure that the training data for the algorithm is balanced and representative. That means that it should be a fair representation of the variety of data that we may encounter in a real-life setting.
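One simple way to inspect this kind of imbalance is to count how often each form of a gendered job title occurs in the training data. The sketch below assumes a hypothetical list of gendered pairs and a plain list of training titles. It only reports the shares; deciding what counts as balanced and representative remains a judgment call.

```python
from collections import Counter

# Hypothetical pairs of gendered variants of the same profession.
GENDERED_PAIRS = [("actor", "actress"), ("waiter", "waitress")]

def variant_shares(training_titles, pairs=GENDERED_PAIRS):
    """Return, per pair, the share of each form among the pair's occurrences."""
    counts = Counter(title.strip().lower() for title in training_titles)
    shares = {}
    for first, second in pairs:
        total = counts[first] + counts[second]
        if total:  # skip pairs that do not occur at all
            shares[(first, second)] = (counts[first] / total,
                                       counts[second] / total)
    return shares

# Toy training data, invented for illustration.
titles = ["Waiter", "waiter", "waiter", "waitress", "actor", "actress"]
shares = variant_shares(titles)
for pair, (a, b) in shares.items():
    print(pair, f"{a:.2f} / {b:.2f}")
```

In this toy data "waiter" makes up three quarters of its pair, which would be a reason to look more closely at how well "waitress" is normalized.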
Once we have ensured that the training data is fair and balanced, another measure is to test the quality of normalization for both genders separately. The values for both groups should be close to each other. That way we can ensure that both female and male candidates get equal opportunity when normalized job titles are used for tasks like candidate-job matching.
With the right measures in place, bias can be well mitigated for these ‘low risk’ AI tasks. However, as the task becomes more complex and impactful (like the job-candidate matching task itself), the AI algorithms will also need to become much more complex and less controllable. Managing bias in such systems is a very difficult task. In fact, de-biasing AI and making its decisions explainable is still an active area of research. There is still no way to guarantee a bias-free AI model for complex tasks. Any company that claims to have solved this problem is likely exaggerating.
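The per-gender quality check for job title normalization can be sketched as follows. This is an illustration under assumptions: the test set, its group labels and the deliberately flawed toy normalizer are invented here, and accuracy is used as the quality measure although the text does not prescribe a specific metric.

```python
from collections import defaultdict

def accuracy_by_group(examples, normalize):
    """examples: iterable of (raw_title, expected_concept, group)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for raw_title, expected, group in examples:
        total[group] += 1
        if normalize(raw_title) == expected:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

def max_gap(scores):
    """Largest difference in quality between any two groups."""
    return max(scores.values()) - min(scores.values())

# Toy normalizer that deliberately only knows the male forms.
def toy_normalize(title):
    known = {"waiter": "Waiter/Waitress", "actor": "Actor/Actress"}
    return known.get(title.lower())

# Hypothetical labeled test set: (title, expected concept, group).
test_set = [
    ("Waiter", "Waiter/Waitress", "male"),
    ("Actor", "Actor/Actress", "male"),
    ("Waitress", "Waiter/Waitress", "female"),
    ("Actor", "Actor/Actress", "female"),
]

scores = accuracy_by_group(test_set, toy_normalize)
gap = max_gap(scores)
print(scores, gap)
```

A large gap between the groups, as in this toy case, signals that one form is normalized worse than the other and that the training data or the model needs attention.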
Responsible AI in practice
The Textkernel solution
Enhance data extraction, standardization, and enrichment to reduce bias and improve document understanding.
Source & Match
Improve sourcing and matching precision with AI-driven search queries and enriched criteria that enhance search accuracy.
Responsible AI use
Ensure transparency and control in AI-driven processes to reduce bias and foster fair and ethical AI practices.
Reducing human unconscious bias
Textkernel aims to reduce disparities and foster fairness in critical decision-making processes.
The first step of any automated recruitment process is to understand the data. Our Parsing product is a perfect example of this. Understanding a document means being able to extract the relevant information from it and enrich it with domain-specific knowledge. For example, when we parse a CV, the system reads what work experience the candidate has, but also which skills and degrees they possess, and so on (i.e. extraction). On top of that, it can also standardize the job title and skills to existing taxonomies (i.e. normalization), derive in which work field the candidate is working, or infer likely skills for that candidate, even though these things are not explicitly mentioned in the document (i.e. enrichment).
We can apply the same process of extraction and enrichment to a job posting, to give us the structured information for the job. In the case of job postings, this entails things like the required experience level, skills and degree.
Searching and matching
This extracted and enriched knowledge is a very powerful tool for sourcing and matching. For example, understanding a document allows us to search only on professional skills instead of keyword matching on the entire document, or to search on normalized job titles so we can find the candidate no matter how they expressed their job title. This leads to a more accurate search. Another example is that we can search on inferred information (e.g. the experience level for a candidate, even though that experience level was not explicitly mentioned in their profile). Enrichment is useful not only for documents but also for search queries. For example, we can add synonyms or related terms to the search query.
Knowing all qualifications of candidates and all the requirements for job postings allows us to automate one more step: matching. To achieve this, we automatically generate a search query given an input document. Let’s say we want to match all suitable candidates for a given job. The search query will then contain all required and desired criteria for that vacancy. Each criterion will have its own appropriate weight to optimize the quality of the result set of the query.
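The idea of turning a parsed job posting into a weighted query can be sketched as below. Everything specific here is an assumption made for illustration: the field names, the weights, the rule that a missing required criterion excludes a candidate, and the tiny in-memory scoring function standing in for a real term-based search engine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Criterion:
    attribute: str          # which part of the profile to look at
    term: str               # the (normalized) term to match
    weight: float           # hypothetical weight, not a real setting
    required: bool = False

def build_query(job):
    """Generate weighted criteria from a parsed job posting (a dict)."""
    query = [Criterion("job_title", job["job_title"], 3.0)]
    query += [Criterion("skill", s, 2.0, required=True)
              for s in job["required_skills"]]
    query += [Criterion("skill", s, 1.0) for s in job["desired_skills"]]
    return query

def score(candidate, query):
    """Return (score, matched criteria); (0.0, []) if a required one is missing."""
    matched = [c for c in query
               if c.term in candidate.get(c.attribute, set())]
    if any(c.required and c not in matched for c in query):
        return 0.0, []
    return sum(c.weight for c in matched), matched

job = {"job_title": "Java Developer",
       "required_skills": ["java"],
       "desired_skills": ["sql", "docker"]}
query = build_query(job)

candidate_a = {"job_title": {"Java Developer"}, "skill": {"java", "sql"}}
candidate_b = {"job_title": {"Java Developer"}, "skill": {"sql"}}

total_a, why_a = score(candidate_a, query)
total_b, why_b = score(candidate_b, query)
print(total_a, [c.term for c in why_a])
print(total_b, why_b)
```

Because the score is a sum over explicit criteria, the list of matched criteria doubles as an explanation of the result, and a recruiter could change a weight or drop a criterion and see the effect directly.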
Responsible use of AI in Textkernel solution
Why does all this matter? Well, most importantly: AI doesn’t do the matching for you. The matching is done in a term-based search engine. We employ powerful AI algorithms only for document understanding (to extract information and enrich documents and queries) but leave the matching part to more transparent and controllable algorithms. This way we give the recruiter full control over the matching, while still benefiting from our AI-driven, world-leading parsing capabilities.
However, even when employing transparent and controllable algorithms, bias may arise through properties of the language. For example, a simple term-based search on “waiter” will favor male candidates for that job, since the job title is male by definition. Enrichment of search and match queries helps reduce this type of bias. When recruiting for a waiter job, the query will be automatically enriched with the waitress job title to remove the gender bias inherent to that job title. A similar bias reduction can be achieved by normalizing job titles (as discussed before) and skills, and using the normalized concepts in queries: this ensures that no matter how the candidate expresses a skill or previous experience, the concept will still be matched.
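The waiter/waitress enrichment can be illustrated with a short sketch. The variant table and function names are hypothetical and the matching is a bare exact-term comparison; it only shows why adding the other gendered form to the query changes who is found.

```python
# Hypothetical table of gendered job-title pairs.
GENDERED_PAIRS = [("waiter", "waitress"), ("actor", "actress")]
# Map each form to its full pair, e.g. "waitress" -> ("waiter", "waitress").
VARIANTS = {form: pair for pair in GENDERED_PAIRS for form in pair}

def enrich_title_terms(terms):
    """Add the other gendered form of each job title to the query terms."""
    enriched = []
    for term in terms:
        key = term.lower()
        for variant in VARIANTS.get(key, (key,)):
            if variant not in enriched:
                enriched.append(variant)
    return enriched

def title_matches(cv_title, query_terms):
    """Simple exact term match on the job title."""
    return cv_title.lower() in query_terms

plain_query = ["waiter"]
enriched_query = enrich_title_terms(plain_query)

print(enriched_query)
print(title_matches("Waitress", plain_query))
print(title_matches("Waitress", enriched_query))
```

With the plain query, a CV that says "Waitress" is missed; with the enriched query, both forms are found on equal terms.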
To control any bias that could potentially arise in the AI-powered document understanding steps of the process, we enforce our R&D Fairness Checklist.
Reducing human unconscious bias
Having fully controllable and transparent matching has another benefit: by matching on objective criteria, we may actually mitigate any unconscious bias that a recruiter may have. This will improve equal opportunities and diversity in your HR processes.
Current research suggests that if used carefully, AI can help avoid discrimination, and even raise the bar for human decision-making.
Of course, when searching with Textkernel’s Source and Match, the user is unable to search on discriminatory attributes such as gender or religion.
Guiding Textkernel’s ethical approach
Textkernel’s AI principles
At Textkernel, our approach to responsible AI is ingrained in our principles. We believe that AI should serve as a tool guided by humans, not an unsupervised decision maker. Transparency, diversity, and data security are all vitally important for us.
AI driven by humans
Humans are in control of our solutions and understand what the solutions are doing to achieve the results that they strive to obtain. Our AI will not make any decisions for you but will support you by taking over time-consuming processes to increase your efficiency. Our products are designed so that our users can always evaluate and override the suggestions provided by the technology and remain the final decision maker.
Transparency of results and white box AI approach
We strive to ensure that our AI and the way in which our solutions work are always explainable. For more complex tasks, like matching candidates to jobs, we advocate strong explainability and transparency. Our solution can indicate exactly which criteria are used to construct the match, in a way that the end user can interpret, understand, and influence.
Matching based on objective, measurable criteria helps reduce bias in your recruitment process. Our solution will disregard any candidate properties that are irrelevant for successful execution of the job (e.g. gender, ethnicity or age), promoting diversity and inclusion within your organization.
Robust data protection and security
To ensure the trustworthiness of our AI, we have in place robust security measures and mechanisms to protect (personal) data against potential attacks throughout all phases of the AI lifecycle.