Textkernel is happy to release a new version of Extract! CV parsing. This release contains a brand-new parser for Greek CVs, as well as improvements for all languages and language-specific improvements for English, Flemish, Slovak and French.
New: Greek resume parsing
In response to customer demand, Textkernel is introducing its 17th CV parsing language: Greek. To add a new language, Textkernel’s parsing engine needs to be trained and tuned on resumes in that language. Textkernel’s research engineers were able to overcome the Greek language’s diversity, richness and complexity and have developed a state-of-the-art language model.
The R&D team at Textkernel has made several improvements to its parsing engine that result in parsing enhancements for all languages.
New: extraction of Apple’s Pages file format (.pages)
In addition to the standard file types (such as .doc, .docx, .pdf, .html, .text), Textkernel’s parsers can now also process Apple Pages files.
Support for even more Microsoft Word and PDF file subtypes
Textkernel made improvements to its preprocessor, which converts the original CV into text that is then used for parsing. As a result, Textkernel can now accept even more subtypes of Microsoft Word and PDF resumes.
Improved extraction of phone numbers
Improved extraction of dates
Improved extraction of all skills
Language-specific parsing improvements
English: better extraction of the candidate name, especially when only the first name is present in the CV.
English: improvements to the experience and education sections:
Better segmentation of items
Better recognition of English dates
US: better extraction of city, region and country
US: better extraction of company names and locations
Belgian: better extraction of names from Flemish CVs
Slovak: improved classification of education items and degrees