AI has always been our method of choice in our mission to accelerate staffing, recruitment and HR processes. In fact, we pioneered AI solutions for the recruitment domain over 20 years ago, and have been monitoring and applying developments in AI ever since. And that’s not just because it’s an exciting technology: AI actually makes our customers more effective! Whether it’s about automating data entry from CVs or vacancies, shortlisting candidates for jobs, or enabling market analytics, AI-driven software can hugely improve process efficiency. And over time we’ve learned that embracing new developments in AI is key to making sure that the quality of these systems keeps improving.
With all the media frenzy these days, many people would be surprised to learn that AI has been around since the invention of computers. What has changed over the years is the AI algorithms used to make computers intelligent.
The early days
The early AI algorithms of the 1980s consisted of a set of hard-coded assumptions and rules made by domain experts. Think of rules like “if a CV contains a 10-digit number, then it must be a phone number”, or “whatever follows the phrase ‘Name:’ is someone’s name”. It turns out that language is far too complex to be captured with rules: phone numbers can be written with dashes between the digits, and the phrase ‘Name:’ can occur inside phrases like “School Name:”. Rule-based AI systems tend to grow into a large stack of exceptions on top of exceptions, which makes them error-prone and difficult to maintain. Practical applications of such systems were out of reach.
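To make the two example rules concrete, here is a minimal Python sketch of how they might be written and where they break. The regular expressions and function names are illustrative assumptions, not rules from any real parsing system.

```python
import re

# Hypothetical hand-written rules in the spirit of the examples above.
PHONE_RULE = re.compile(r"\b\d{10}\b")   # "a 10-digit number is a phone number"
NAME_RULE = re.compile(r"Name:\s*(.+)")  # "whatever follows 'Name:' is a name"


def extract_phone(text):
    """Return the first 10-digit number in the text, or None."""
    match = PHONE_RULE.search(text)
    return match.group(0) if match else None


def extract_name(text):
    """Return whatever follows 'Name:' on the same line, or None."""
    match = NAME_RULE.search(text)
    return match.group(1).strip() if match else None


print(extract_phone("Phone: 0612345678"))    # the rule fires as intended
print(extract_phone("Phone: 06-1234-5678"))  # dashes: the rule misses the number
print(extract_name("Name: Jane Doe"))        # the rule fires as intended
print(extract_name("School Name: Springfield High"))  # fires, but on a school
```

Each failure can be patched with another rule or exception, which is exactly how such systems end up as a stack of exceptions on top of exceptions.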
Statistical machine learning
In the late 1990s statistical machine learning came to the rescue. Instead of writing rules manually, statistical algorithms (e.g. Hidden Markov Models in the early 2000s) can infer rules and patterns from annotated data. Those rules are generally better than those found by human engineers: they strike the right balance between being specific and generalizable, and use patterns in the data that humans wouldn’t have seen. Employing machine learning models in combination with various rich data sources, Textkernel achieved best-of-breed accuracy levels on the problems it set out to solve.
Introducing Deep Learning
But early machine learning models still had their limits: they were not able to digest a lot of context, and they still relied heavily on human expertise to decide which signals or features are relevant for a specific problem. To understand what a given word means, they would basically only consider the words in its direct neighborhood. A good understanding of a CV or job ad, however, requires understanding the context of the entire paragraph or even the full document.
This is why we invested in upgrading our models to a special kind of machine learning technology: Deep Learning. These somewhat more complex neural networks allowed for a much more contextualized form of document understanding. In addition, they could figure out by themselves which textual features are relevant to solve a given task. Deep Learning took academia by storm in the 2010s, and by 2017 it was mature enough to be applied to business problems. Once we applied it to parsing, it gave another substantial boost to our accuracy levels.
Recently we’ve been closely monitoring one of the most disruptive developments in language technology so far: Large Language Models (the technology behind ChatGPT) and their impressive ability to perform well on just about any language task and to encode knowledge of the world.
What are LLMs and why do they work so well?
Language models are AI systems with a surprisingly simple objective: “simulate” language. Given a sequence of words, their task is to predict the most likely next word. For example, “bank” or “ATM” are the most likely words to follow the sequence “I withdrew some money from the …”. Language models have been around for about 30 years. In the past few years, people have been building language models using increasingly bigger neural networks with a special attention mechanism (transformers) and using more and more language data (see table below). It turns out that these Large Language Models (LLMs) exhibit abilities that surprised even their creators:
Performing language tasks: in order to “simulate” language, they become very good at language tasks. They can generate high quality text, summarize text, rewrite text in specific styles, etc.
Encoding knowledge of the world: language cannot be simulated well without world knowledge (e.g. you cannot write good quality text about Obama unless you know he was a president of the USA). LLMs magically capture and represent that knowledge just by reading lots of text.
Some cognitive skills: the text that LLMs try to simulate was written by people applying various cognitive skills: inference, deduction, simple reasoning, etc. LLMs seem to develop, or at least mimic, such skills in order to be good at simulating text. It is hypothesized that the size of the neural network and the attention mechanism are key for this. In addition, since their training data also includes computer programs, their documentation and the text around them, LLMs are surprisingly good at generating code too. In fact, LLMs can even learn new skills.
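The next-word objective behind all of these abilities can be illustrated with a toy example. The sketch below is a simple bigram counter in Python, not a neural network, and the four-sentence corpus is made up; it only shows what “predict the most likely next word” means in practice.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real language models train on vastly more text.
corpus = [
    "i withdrew some money from the bank",
    "i withdrew some money from the atm",
    "i withdrew some money from the bank yesterday",
    "she walked home from the office",
]

# Count which word follows each word (a bigram model).
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1


def predict_next(word, k=2):
    """Return up to k most frequent continuations of `word` with probabilities."""
    counts = next_word_counts[word]
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(k)]


print(predict_next("the"))  # "bank" comes out as the most likely continuation
```

A bigram model like this only looks at one preceding word. LLMs pursue the same objective but condition on far more context, which is where the very large networks and the attention mechanism come in.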
Key ingredients in LLMs
Very large neural networks
Because LLMs are extremely large neural networks, they have an advanced ability to learn abstract structures from raw data, and to draw abstractions over abstractions over abstractions, etc. While most traditional language models were only able to learn relations between individual words in a sentence, LLMs draw abstractions on much higher levels. They are wired to be able to learn relations between sentences, blocks of text, conversational turns, or entire documents.
Attention mechanism
Attention is a central element of any well-performing language processing system. Unlike more traditional deep learning models, LLMs are based on a type of architecture called “transformer”, which allows them to figure out which elements of an input sequence are most relevant to the production of the desired output sequence. This greatly enhances their capacity to produce content that’s relevant to the questions they’re being asked. It is not very different from how human attention helps us stick to the point in a conversation.
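As a rough illustration of how a model can weigh input elements by relevance, here is a minimal sketch of scaled dot-product attention for a single query, written in plain Python. The vectors are made-up numbers; a real transformer learns its query, key and value vectors during training and runs many such attention operations in parallel.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Relevance of each input element: dot product of the query with its key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Made-up 2-dimensional vectors for three input tokens (illustrative only).
tokens = ["money", "from", "bank"]
keys = [[1.0, 0.0], [0.0, 0.1], [0.9, 0.2]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.0]  # a query that "looks for" money-related tokens

weights, output = attend(query, keys, values)
for token, weight in zip(tokens, weights):
    print(f"{token}: {weight:.2f}")
```

With these numbers, “money” and “bank” receive more weight than “from”, so the output is dominated by the tokens most relevant to the query.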
LLMs in recruitment: potential and limitations
The HR media are flooded with suggestions on how ChatGPT and similar tools can be applied to streamline workflows. Ideas range from automated content generation (vacancies, interview questions, marketing content) to improved candidate screening and automated communication. Some of these will be more fruitful than others, but one thing is for sure: recruitment and HR are among the many industries that will be shaken up, if not revolutionized, by this new generation of AI technology.
Apart from giving rise to innovative products, it’s also clear that LLMs will help existing AI-based tools reach higher accuracy and improve their user experience. That’s also true for our software: just as previous AI developments brought significant quality improvements, LLMs will most certainly improve the quality of our software for document understanding, candidate sourcing and matching, data enrichment, and analytics. In the next parts of this blog series we will share how we’re using the technology at the moment, and what’s to come.
Not so fast?
Having pursued AI-driven innovation for over two decades, at Textkernel we are well aware that technological breakthroughs are not merely reasons for excitement. And we’re not the first to note that the use of technologies like ChatGPT comes with risks and limitations. There are technical limitations concerning scalability and cost. For example, building LLMs is a very complex and very expensive process. It is estimated that it cost OpenAI 4 million dollars to train GPT-3. Keep in mind that ChatGPT is based on an even newer version, GPT-3.5. At least for the near future, companies are expected to use LLMs from a small number of providers rather than build an in-house LLM. Running LLMs is also costly, which in turn affects the cost of services built on top of them.
Lastly, and not to be underestimated, there are valid concerns about data privacy, transparency and bias. These concerns should be taken very seriously, and the various upcoming forms of AI legislation, such as the EU AI Act and the NY AEDT Law, will help ensure that they are.
Stay tuned for the next parts of this blog series to hear more about how LLMs relate to AI legislation and how we envision combining compliance with cutting-edge innovation.