Textkernel at DIR 2020!

On the 3rd of December the 19th edition of the Dutch-Belgian Information Retrieval Workshop (DIR) took place.

The conference was supposed to take place in Antwerp, but due to the coronavirus pandemic, it was held fully online. As at so many previous editions, Textkernel was there, this time with a digital booth!

DIR is a conference where the most recent academic developments in information retrieval are presented. This year the conference had a ‘lightning-talks’ theme: Dutch or Belgian speakers from major international conferences were asked to present their papers in shorter talks. As a result, the conference gave a valuable overview of the state of the art in several areas of information retrieval.

What is Information Retrieval?

I’ve used the term information retrieval (IR) a couple of times now, but that may not be a familiar phrase to everyone. Still, we all interact with IR systems on a daily basis. IR is basically the science behind search engines, recommendation systems and matching engines. Essentially it’s about the challenge of finding the right information in a huge amount of data. This branch of computer science gained real traction once the internet became a household concept in the 1990s and 2000s. We all remember early web search engines like AltaVista and Yahoo!.

Since then, the amount of data held in computer systems has grown tremendously, and so has the need for good ways to search it. Enter the field of information retrieval.

Typical questions IR researchers ask themselves include:

  • How can we best perform searches to get the most relevant matches or recommendations?
  • How can we incorporate the latest AI technologies into searching and matching?
  • How can we ensure that ranking is fair and unbiased?

Application in Textkernel’s technology

These are questions that we also continuously need to consider for our searching and matching technology at Textkernel.

One example of how these topics relate to technology at Textkernel is the way we can leverage the power of Deep Learning (one of the most influential recent developments in artificial intelligence) in our matching technology. Throughout the field of IR, Deep Learning has been shown to improve searching and matching systems. Deep Learning systems can capture complex and nuanced patterns in documents, which can greatly improve matching. However, these systems work more or less like a black box: it’s hard to tell why the algorithm matches a certain result. Therefore we need to combine this powerful learning algorithm with our regular transparent and explainable matching technology, to get the best of both worlds.
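
To make the "best of both worlds" idea a bit more concrete, here is a minimal Python sketch of one common way to blend a transparent score with a learned one. It is only an illustration under stated assumptions, not a description of Textkernel's actual matching engine: the function names, the keyword-overlap score, the cosine similarity over pre-computed embedding vectors and the `alpha` weight are all hypothetical choices made for this example.

```python
import math


def keyword_score(query_terms, doc_terms):
    """Transparent part: fraction of query terms found in the document.

    Returns the score and the matched terms, so the result can be explained.
    """
    query = set(query_terms)
    if not query:
        return 0.0, []
    matched = sorted(query & set(doc_terms))
    return len(matched) / len(query), matched


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hybrid_score(query_terms, doc_terms, query_vec, doc_vec, alpha=0.5):
    """Blend the explainable score with a learned similarity.

    `query_vec` and `doc_vec` stand in for embeddings produced by some
    Deep Learning model (assumed to exist; not computed here).
    `alpha` is a hypothetical weight: 1.0 means fully transparent,
    0.0 means fully learned.
    """
    explainable, matched = keyword_score(query_terms, doc_terms)
    learned = cosine(query_vec, doc_vec)
    return {
        "score": alpha * explainable + (1 - alpha) * learned,
        "explainable": explainable,
        "learned": learned,
        "matched_terms": matched,  # the part we can show to the user
    }
```

Calling `hybrid_score(["python", "sql"], ["python", "java"], [1.0, 0.0], [1.0, 0.0])` returns the blended score together with the matched term `python`, so the transparent half of the match can still be shown to the user even though the learned half stays opaque.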

Another example of how the aforementioned questions relate to Textkernel’s R&D: one common strategy to optimize search results is to learn from user clicks and user feedback on search results. However, in the HR domain this method is more controversial: the learning algorithm may also learn the user’s (unconscious) bias. Letting an algorithm learn a bias is something we need to avoid at all costs. Therefore we take bias and fairness in the ranking into account, to ensure that we match purely on a candidate’s skills and competencies, and not on characteristics that are irrelevant to the job.
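
One way to check a ranking for this kind of skew is to measure how much "exposure" each group of results receives. The Python sketch below uses a position discount of 1 / log2(rank + 1), a weighting often used in ranking evaluation, and reports each group's share of the total. It is a simplified illustration rather than the method we use: the function name and the group labels are hypothetical, and such labels would be used only to audit a ranking, never as input to the matching itself.

```python
import math
from collections import defaultdict


def exposure_share(ranked_groups):
    """Share of position-discounted exposure per group.

    `ranked_groups` is a list of group labels in ranked order (best first).
    A result at rank r (starting at 1) contributes 1 / log2(r + 1),
    so higher positions count for more.
    """
    exposure = defaultdict(float)
    for rank, group in enumerate(ranked_groups, start=1):
        exposure[group] += 1.0 / math.log2(rank + 1)
    total = sum(exposure.values())
    # An empty ranking yields an empty dict, so no division by zero occurs.
    return {group: value / total for group, value in exposure.items()}
```

For a ranking such as `["a", "a", "b", "b"]`, group `a` ends up with a clearly larger share than group `b`, even though both groups have the same number of results. A gap like that is a signal to investigate whether the ordering is driven by job-relevant factors.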

A conference like DIR provides the latest insights into techniques for applying Deep Learning to searching and matching, while mitigating bias and controlling the variables on which a match is based.

 

This blog post was written by Vincent Slot, our Team Lead Search R&D at Textkernel.
December 10th, 2020.