Learning Hiring Preferences: The AI Behind LinkedIn Jobs
February 12, 2019
Co-authors: Benjamin McCann and Nadeem Anjum
Last June, we introduced Recommended Matches: candidate recommendations for your open job posting that get more targeted over time based on your feedback. We're now rolling out this feature globally alongside an updated version of the algorithm powering it. This new algorithm, which is used throughout the Jobs platform, performs nearly 20% better than the previous version in generating recommendations when we simulate our members' past hiring activity.
The technique we use to make this targeting smarter is called "online learning": learning that happens in real time as our members use the product. Based on how you interact with candidates, our algorithm learns your preferences and delivers increasingly relevant candidates across the Jobs product. If you're consistently interested in candidates who are, say, accountants with leadership skills, or project managers who are adept at social media, we'll send you more of those. And this all happens online in real time, so your feedback is taken into account instantly.
LinkedIn provides multiple avenues for finding success in filling your job posting. Our matching technology shows up on both the job-seeker and company side throughout our Jobs product. One of the challenges we face in building our products is that we'd like the relevance to be unified across these different channels so that it feels like a cohesive experience. For example, when you use job targeting to specify the skills or years of experience you're looking for, we can use that information to determine which candidates you should reach out to, highlight your job to the members who may be most interested in and qualified for it, and place the most qualified applicants at the top of your review list. The online learning component helps power all of those product features.
This online learning-powered recommendation system uses signals such as the job description, candidates you’ve reached out to or archived, and members interested in jobs like yours to match the most suitable candidates to your open role. Interactions with an applicant versus interactions with a recommended match may be fundamentally different, and so we explicitly represent the channel in which the candidate was discovered in our machine learning model. This also gives us the ability to incorporate online learning from user feedback across additional channels outside our Jobs product in the future like Recruiter Search.
Convincing someone to consider your company and role is harder when they’re not actively seeking a job. One of the engineering challenges here is that we need to optimize for two-way interest. We want candidates to find the messages they receive compelling and not miss out on any opportunities, and we want hirers to reach out to candidates who are qualified and might be interested. As a result, in passive sourcing channels like Recommended Matches, we utilize a member’s Open Candidate status and other indicators of job seeking intent to present those members higher in the ordering. Our algorithms present the candidates most likely to accept your outreach based upon whether they are qualified, have displayed job seeking intent, and would be interested in your job.
Deep dive into online machine learning
Feedback about candidates is aggregated in real time and associated with the corresponding hiring project. A hiring project represents an opening to be filled, and may have a job post, search queries, candidate feedback, and other useful details associated with it. The aggregated information is utilized to produce features in our machine learning model. This aggregation is done at the hiring project-level so that we can learn a customer’s preferences for the specific opening.
For each hiring project, we want to learn which profile attributes (e.g., skill, title, industry, etc.) might be most relevant based upon the feedback for each candidate with those attributes. We refer to these profile attributes as "term types." Each feedback type (e.g., Message, Archive, Skip, etc.) from each candidate channel (e.g., Recommended Matches, Job Applicants, Recruiter Search, etc.) could mean something slightly different, so we create Personalization Features for each (channel, feedback type, term type) combination.
For those interested in more detail, we provide the actual equations we use below. Let S be the set of all channels, R be the set of all feedback types (so that the set of all actions is the Cartesian product of S and R), and T be the set of term types. The weight of a term i of type t from rating r in sourcing channel s is determined by the following equation:
where c_{i,t,s,r} is the number of candidates having term i of type t that were shown (impressed) in sourcing channel s and rated r.
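One form consistent with this definition of c_{i,t,s,r} — a sketch, assuming the weight is simply the feedback count normalized within its (term type, channel, rating) bucket — would be:

```latex
w_{i,t,s,r} \;=\; \frac{c_{i,t,s,r}}{\sum_{j} c_{j,t,s,r}}
```

Under this form, terms that appear on many candidates receiving the same feedback get proportionally larger weights.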
Let w_{t,s,r} be the vector of all term weights of type t from rating r in sourcing channel s, and p_t be the Boolean vector of terms of type t in candidate X's profile. The Personalization Feature for candidate X with term type t from rating r in sourcing channel s is the dot product of these two vectors: z_{t,s,r} = w_{t,s,r} · p_t.
The idea here is to learn which profile attribute values might be most relevant for a given hiring project. This is done by creating a Personalization Feature for each profile attribute that combines the various profile attribute values and their respective weights. These Personalization Features z_{t,s,r} can be calculated in real time based on the actions a user takes, and they represent the different online learning features used in our model.
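A minimal sketch of this dot product (the weights below are made-up values, not learned ones): since p_t is a Boolean vector over a candidate's profile terms, the dot product reduces to summing the weights of the terms the candidate actually has.

```python
# Illustrative term weights w_{t,s,r} for term type "skill", learned from
# "message" feedback in the "recommended_matches" channel.
weights = {"accounting": 0.5, "leadership": 0.3, "social-media": 0.2}

# Boolean vector p_t, represented as the set of skills on candidate X's profile.
profile_terms = {"accounting", "leadership"}

# Personalization Feature z_{t,s,r} = w_{t,s,r} . p_t: sum the weights of
# the terms present on the candidate's profile.
z = sum(w for term, w in weights.items() if term in profile_terms)
print(z)  # 0.8
```

In production there is one such score per (channel, feedback type, term type) combination, so a single candidate gets a vector of personalization scores rather than a single number.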
These Personalization Features are then included among the many features used in our XGBoost model. XGBoost produces an ensemble of decision trees, so the model can learn rules like "add 10 points to the final model score if the personalized skill match is greater than 0.7 and the personalized title match is greater than 0.6."
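XGBoost learns thresholds like these automatically from data; conceptually, a single learned rule behaves like this hand-written stand-in (the feature names, thresholds, and score are hypothetical):

```python
def tree_rule(features: dict) -> float:
    """A hand-written stand-in for one decision rule an XGBoost tree
    might learn over the personalization features (values are made up)."""
    if (features.get("personalized_skill_match", 0.0) > 0.7
            and features.get("personalized_title_match", 0.0) > 0.6):
        return 10.0  # boost this candidate's final model score
    return 0.0

print(tree_rule({"personalized_skill_match": 0.8,
                 "personalized_title_match": 0.65}))  # 10.0
```

The real model sums contributions from many such trees, each splitting on different features and thresholds, rather than relying on any single rule.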
The impact of these features is profound. In fact, the online learning features are currently 7 of the top 10 most important features in the model. When comparing a model with online learning against a model without it, we found that the online learning features provide a 49.61% lift in NDCG@1 (averaged over all search queries, with each search query generating one candidate recommendation).
The road ahead
Our online learning capabilities are improving quickly. We've been actively surveying our job posters, and many of the improvements we've made have come from this member feedback. Thank you to all of our customers who have tried the new Recommended Matches feature and let us know how it's working. We'll continue to make improvements based upon your feedback. Our vision at LinkedIn is to provide economic opportunity for every member of the global workforce. Our members use LinkedIn Jobs and Recommended Matches to hire for thousands of new positions every day.
Thank you to Nadeem Anjum for co-authoring the post and Neha Jain for reviewing a draft.
Thank you to Neha Jain and Erik Buchanan for leading the engineering teams responsible for these innovations and to Skylar Payne, Nadeem Anjum, David DiCato, Gio Borje, Jerry Lin, Bo Yao, Huanyu Zhao, Mohit Kumar, Jinqi Huang, Heyang Liu, and Quentin DCosta for building this feature, which was chosen by LinkedIn executives as the most innovative engineering product across the company in 2017. We’d also like to thank our partners in product, design, product marketing management, and data science for designing and informing this tremendously impactful product that we’ve all enjoyed so much getting to build together.