How we mapped the “skills genome” of emerging jobs

December 10, 2019

In today’s newly published Emerging Jobs Report, we introduce the “skills genome methodology” to highlight the unique skills for emerging jobs: occupations that have experienced tremendous growth in hiring, but may not have had a correspondingly large or established workforce. Often, these jobs are relatively new, such as the top job globally on this year’s report: Artificial Intelligence Specialist. Highlighting the unique skills associated with each emerging job provides insights about what aptitude or knowledge is valued when employers are hiring for a particular role. It can also show how one job can be differentiated from others, even if the titles or roles appear to be very similar.

In this blog post, we’ll discuss why skills data is important for a wide range of stakeholders, explain how we compute the skills genome for a given job, and provide some illustrative examples of how unique skill insights can be uncovered by using our skills genome methodology to compare industries, regions, and time periods.

Skills as encoders of economic opportunity

Skills are the building blocks of human capital. A job is increasingly defined by the skills people use to perform necessary tasks and responsibilities. To understand a role, it helps to answer three questions: which skills uniquely constitute the job, which skills differentiate it from other jobs, and how the unique skills it requires are changing.

Answers to these questions can be useful to a wide range of different groups:

  • Students and career starters: understanding the unique skills used for different jobs and finding the jobs that their skills are the best match for.

  • Mid-career job seekers looking to pivot: finding jobs that use a similar set of skills and identifying the gaps between their skill set and that of other jobs so that they can upskill and pivot.

  • Employers: for positions that are difficult to fill, identifying internal and external candidates whose skill sets are similar to what the role requires.

  • Policymakers: setting up training programs that help workers reskill or upskill, particularly in the context of transitioning out of declining occupations.

At LinkedIn, we have developed a methodology to uncover the skills “genome” of a job in order to help answer these questions. The skills genome of a job tracks a set of skills that are most unique to and most representative of a job, based on the skills LinkedIn members feature on their profiles. 

Introducing the skills genome methodology

Since emerging jobs reflect new technological, workplace, and social trends in an industry or industries, it can be interesting to consider what makes one job different from another, rather than generalizing based on similarities that may not reflect the newest developments in a field. For instance, data analysts, AI specialists, and data engineers share many similar skills, but perform unique roles within an engineering team, even if all three roles require familiarity with statistics. By focusing on unique skills, the skills genome of an emerging job offers a glimpse of the skills that may be in demand in the future.   

Mining skill data for unique insights
To uncover the skills genome of a job, we not only look at what skills people are acquiring, but also when and where they acquire them. We then apply a model that is analogous to term frequency-inverse document frequency (TF-IDF), a commonly used data mining technique for textual analysis, to extract unique skills.

This step is necessary because a set of generic and commonly held skills can often obscure the unique skills that are used on a specific job. For example, data scientist appears on the emerging jobs list this year for the third time. If we look at the frequency with which members add skills to their profiles, we find that Microsoft Word and Microsoft Excel are among the skills most frequently added by data scientists. However, Microsoft Word and Microsoft Excel are commonly held skills that hardly distinguish a data scientist from other jobs.

To extract the most unique skills, we apply a weighting scheme that is analogous to the TF-IDF weighting scheme. We give each skill a weighted score for each emerging job based on two factors: how likely members in that job are to add the skill to their profiles, and how likely members in any job are to add it. The more often a skill is added by members across a wide range of jobs, the lower its weight.
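As a rough illustration of this kind of weighting, the Python sketch below scores skills with a TF-IDF-style formula. The function names, the toy profiles, and the exact formula (the share of a job's members listing a skill, multiplied by the log of the inverse share of jobs in which the skill appears) are assumptions made for illustration, not the model used for the report.

```python
import math
from collections import Counter


def skill_weights(profiles_by_job):
    """Score each skill per job with a TF-IDF-style weight.

    profiles_by_job maps a job title to a list of skill sets, one per member.
    Assumed formula: (share of the job's members listing the skill)
    * log(number of jobs / number of jobs in which the skill appears).
    """
    n_jobs = len(profiles_by_job)
    share = {}
    for job, profiles in profiles_by_job.items():
        # Count each skill at most once per member profile.
        counts = Counter(skill for skills in profiles for skill in set(skills))
        share[job] = {s: c / len(profiles) for s, c in counts.items()}
    # Number of distinct jobs in which each skill appears at all.
    jobs_with_skill = Counter(s for per_job in share.values() for s in per_job)
    return {
        job: {s: f * math.log(n_jobs / jobs_with_skill[s]) for s, f in per_job.items()}
        for job, per_job in share.items()
    }


def top_skills(weights, n=10):
    """Return the n highest-weighted skills, ties broken alphabetically."""
    return sorted(weights, key=lambda s: (-weights[s], s))[:n]


# Toy, invented profiles: every job lists Microsoft Excel.
profiles = {
    "data scientist": [
        {"Microsoft Excel", "Python", "Machine Learning"},
        {"Microsoft Excel", "Python"},
    ],
    "accountant": [{"Microsoft Excel", "Bookkeeping"}, {"Microsoft Excel"}],
    "recruiter": [{"Microsoft Excel", "Sourcing"}],
}

weights = skill_weights(profiles)
print(top_skills(weights["data scientist"], n=2))
```

In this toy data, Microsoft Excel appears in every job, so its weight is zero and it drops below Python and Machine Learning for data scientists, even though every data scientist lists it.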

After applying the weighting scheme, we can already begin to extract insights about the unique context related to a given job or role. For example, we find that the skills genome of data scientist spans a wide range of skills, including skills related to artificial intelligence (AI), such as machine learning and natural language processing (NLP). Data scientists also tend to be familiar with data query and programming languages such as SQL and Python and data visualization tools such as Tableau.

We can compute the distance between the skills genomes of a pair of jobs to find similar jobs. A close “cousin” of data scientist is actually artificial intelligence specialist, which tops this year’s emerging jobs list. Both jobs feature AI skills, particularly deep learning and NLP. In contrast, data visualization and statistics rank higher in the toolkit of data scientists, while artificial intelligence specialists emphasize tools built specifically for machine learning (e.g., TensorFlow and Keras).
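The post does not say which distance measure is used. As one possible sketch, the Python below treats each skills genome as a sparse vector of skill weights and uses cosine distance, which is an assumption. The function names and all weights are invented for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two sparse skill-weight dicts (0 = same direction)."""
    dot = sum(w * b.get(skill, 0.0) for skill, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty genome as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def nearest_jobs(job, genomes, k=1):
    """Return the k jobs whose genomes are closest to the given job's genome."""
    ranked = sorted(
        (cosine_distance(genomes[job], genome), name)
        for name, genome in genomes.items()
        if name != job
    )
    return [name for _, name in ranked[:k]]


# Invented weights, for illustration only.
genomes = {
    "data scientist": {"Deep Learning": 0.8, "NLP": 0.7, "Statistics": 0.6},
    "artificial intelligence specialist": {"Deep Learning": 0.9, "NLP": 0.8, "TensorFlow": 0.7},
    "robotics engineer": {"AutoCAD": 0.9, "Programmable Logic Controller": 0.8},
}

print(nearest_jobs("data scientist", genomes))
```

With these made-up weights, the shared deep learning and NLP skills place artificial intelligence specialist closest to data scientist, while the robotics engineer genome shares no skills with it and sits at the maximum distance of 1.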

Unique skills genome insights

The skills genome for a job can often reflect a context-specific evolutionary path. For example, not only can we learn about the skills genome of a data scientist, but we can also compare and contrast the skills genome of the same role across industries, regions, or over time. 

Commanders of physical and virtual robots
Comparing the skills genome of robotics engineers in the automotive industry with that of robotics engineers in the financial services industry reveals two distinct profiles. Those in the automotive industry are familiar with the design, programming, and manipulation of physical robots, with skills such as Fanuc Robots (largest industrial robot manufacturer), Programmable Logic Controller (industrial digital computer controlling robotic devices), and AutoCAD (computer-aided design program). In contrast, robotics engineers in financial services tend to focus on designing and implementing a digital “robotic” process to eliminate repetitive digital tasks. Topping their list of skills are robotic process automation (RPA) and familiarity with the leading players in RPA that provide implementation solutions, such as UiPath, Blue Prism, and Automation Anywhere.

Successful customer skills
Another example compares the skills genome of a customer success specialist in New York with that of a customer success specialist in San Francisco. While a customer success specialist’s top skills generally focus on Customer Relationship Management (CRM), in the tech hub of San Francisco, familiarity with cloud computing is also prominent in their skill set. In contrast, digital marketing, including social media marketing, is among the most important skills of a customer success specialist in New York. This provides insights for customer success specialists who are looking to move geographically and want to identify potential skill gaps to fill.

Dynamic skill trends over time
Skills genomes can be dynamic. Comparison of the skills genome of a job over time highlights how the job is changing. Although most emerging jobs are still nascent in the labor market, in the past five years, the skill set of some has evolved tremendously. 

  • Engineering jobs, such as site reliability engineer (SRE), require constant learning of new software, tools, and programming languages. Edging into the top 10 skills of SREs today are IT automation and orchestration systems like Kubernetes, Docker Products, Terraform, and Ansible. Many of these are closely associated with cloud computing and none of them were among the top 10 skills in 2015. 

  • Chief revenue officer (CRO) is a fast-growing emerging C-suite title. Comparing the top skills of CROs four years ago to today, lead generation has dropped out of the top 10, while go-to-market strategy and cross-functional team leadership have risen in importance. 
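Comparisons like the ones above come down to finding which skills entered or left a job's top-ranked list between two snapshots. A minimal Python sketch follows. The function name and both skill lists are invented placeholders, not figures from the report.

```python
def compare_top_skills(earlier, later):
    """Return (entered, dropped) skills between two ranked top-skill lists."""
    earlier_set, later_set = set(earlier), set(later)
    # Preserve the ranking order of each list in the output.
    entered = [skill for skill in later if skill not in earlier_set]
    dropped = [skill for skill in earlier if skill not in later_set]
    return entered, dropped


# Hypothetical, shortened top-skill lists for one job at two points in time.
top_earlier = ["Linux", "Python", "Shell Scripting", "Puppet"]
top_later = ["Linux", "Kubernetes", "Python", "Terraform"]

entered, dropped = compare_top_skills(top_earlier, top_later)
print("entered:", entered)
print("dropped:", dropped)
```

The same comparison works across industries or regions: pass the two top-skill lists for the same job in each context instead of two time periods.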


The skills genome methodology mines the wealth of skills data at LinkedIn and extracts useful information about the unique skill composition of a job. Information from tools like skills genomes and research like the Emerging Jobs Report provides a missing link for both employers and potential employees: identifying the skills associated with jobs that will likely be more common in the future, so that people can learn new skills as they grow their careers over time. We look forward to using the skills genome methodology for more research and other applications in the future.


Many thanks to Di Mo for co-developing the skills genome methodology and to Nick Doulos and Banu Muthukumar for comments.