28 Jul

Data science is one of the most important building blocks of the 8vance system. It includes obtaining huge amounts of rich data from public data sources (“big data”), effective storage, pre-processing, merging and intelligent enhancement of incomplete profile data.

Finding really good matches for a given job requires two things: a lot of talent data, and data that is rich enough to give good insight into the talents’ profiles. While the 8vance system enables talents to register and create profiles, the internet provides a much larger source of publicly available talent profiles that can be obtained by scraping. This enables us to also consider people who are not actively seeking a job.

But online talent profiles are often incomplete and thus not rich enough for accurate matching. Furthermore, they are language dependent, and there is no commonly used standard vocabulary for describing, for example, skills and job functions.

We tackle these problems with several approaches. First, we apply background knowledge in the form of taxonomies that describe, for example, occupations, education and skills. These models were mainly created by human experts and also contain translations and synonyms. They allow us to map free text to a more formal vocabulary, but applying them requires additional syntactic and semantic analysis. Second, the large amount of available data allows us to generate models automatically, e.g. for linking occupations, industries, skills and education. These can be used to enhance incomplete profiles. A third approach to complementing talent data is to combine different sources. For example, talent data from different web pages can be aggregated, or the companies people have worked for can indicate the industries they are familiar with.
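
To make the first two approaches more concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the actual 8vance implementation: the tiny `TAXONOMY`, the profile layout and all function names are hypothetical, and the exact-match lookup stands in for the syntactic and semantic analysis a real system would need.

```python
from collections import Counter, defaultdict

# Hypothetical mini-taxonomy: canonical skill -> synonyms and translations.
TAXONOMY = {
    "software development": ["programming", "coding", "softwareentwicklung"],
    "machine learning": ["ml", "maschinelles lernen"],
    "project management": ["projektmanagement"],
}


def build_lookup(taxonomy):
    """Invert the taxonomy so every known term points to its canonical form."""
    lookup = {}
    for canonical, synonyms in taxonomy.items():
        lookup[canonical.lower()] = canonical
        for synonym in synonyms:
            lookup[synonym.lower()] = canonical
    return lookup


def normalize_skills(raw_skills, lookup):
    """Map free-text skills to canonical terms; unknown terms are dropped."""
    result = []
    for raw in raw_skills:
        canonical = lookup.get(raw.strip().lower())
        if canonical is not None and canonical not in result:
            result.append(canonical)
    return result


def cooccurrence_model(profiles):
    """Count, per occupation, how often each canonical skill appears."""
    model = defaultdict(Counter)
    for profile in profiles:
        model[profile["occupation"]].update(profile["skills"])
    return model


def suggest_skills(profile, model, top_n=2):
    """Suggest the most frequent skills for the occupation that the profile lacks."""
    known = set(profile["skills"])
    ranked = model.get(profile["occupation"], Counter()).most_common()
    return [skill for skill, _ in ranked if skill not in known][:top_n]
```

Calling `normalize_skills(["Programming", " ML "], build_lookup(TAXONOMY))` maps both entries to their canonical terms regardless of case or language, and `suggest_skills` proposes skills that frequently occur with an occupation but are missing from a given profile. Simple counting is only one possible way to build such a model; it is used here because it is easy to follow.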

