The building blocks of LinkedIn Skill Assessments

Christian Mathiesen

CTO @ Frigade (YC W23)

September 17, 2019

Co-authors: Christian Mathiesen and Jie Zhang

Your LinkedIn profile is intended to be a representative picture of your professional life and career, and a key part of that picture is the skills you’ve acquired. In pursuit of our mission to create a place where everyone has access to opportunities based on the skills they have, we’re rolling out a new way to assess, validate, and showcase your proficiency in the skills you spend time cultivating.

LinkedIn Skill Assessments are short-form, standardized assessments designed by third-party subject matter and learning experts to assess and validate skills, covering everything from C++ to Adobe Photoshop. If you pass (that is, score in the 70th percentile or above), you have the option to add a “verified skill” badge to your profile. Whether or not you pass, you’re offered free, relevant LinkedIn Learning courses to improve your skills after completing the assessment. You have complete control over the visibility of your results: pass or fail, it’s up to you to decide what is displayed on your profile.

The skill badges on members’ profiles are only the tip of the iceberg, though. In fact, one of the biggest benefits to passing a Skill Assessment is how we use this information in the overall hiring ecosystem of LinkedIn. The badges feed into our search indexes, artificial intelligence, and hiring products such as LinkedIn Jobs and LinkedIn Recruiter, allowing recruiters and hiring managers to more effectively pinpoint a better match on the basis of proven skill. As a result, members are also served more accurate job recommendations.

In the first portion of this blog post, we’ll share a behind-the-scenes look at how this system was built. In the second half, we’ll discuss how this new offering interplays with the overall hiring ecosystem at LinkedIn, thereby offering members and recruiters an opportunity to find the right hire faster.

Content creation pipeline

To build the Skill Assessments system, we first needed a scalable content management pipeline so that we can continue adding new assessments and questions. We leveraged our existing content management platform used for products like LinkedIn Learning, which allows our content writers to easily build out learning materials.

LinkedIn Skill Assessments are produced by leading industry subject matter experts sourced through LinkedIn Learning. The content is rigorously reviewed and then entered through the content management platform, eventually being stored in an Espresso database. This database is managed by our assessment management platform service, which also manages other assessment types at LinkedIn, such as our interview preparation tools. It exposes assessment content through a RESTful API using Rest.li. At the moment, the assessment management platform service only manages content and no other mechanisms, so we built a dedicated microservice to manage the Skill Assessments flow by storing the assessment taker’s responses (skill assessment status) in another Espresso database.

Leveraging Adaptive Testing

The individual quizzes leverage Adaptive Testing to tailor the assessment based on each taker’s ability. We decided to use Adaptive Testing because of these three main benefits:

Fewer questions needed to evaluate ability
More fair and accurate scoring
Randomness of questions helps prevent cheating

As the diagram above illustrates, the quiz is generated on the fly rather than being a predefined set of questions. First, a member is presented with a random, medium-level question from the bank of questions. If the member picks the right answer, a more difficult question (worth more points) is presented next. If the member answers incorrectly, an easier question worth fewer points is selected next.
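
The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the production algorithm: the three-level difficulty ladder, the point values, and the names next_level and run_quiz are all assumptions made for this example.

```python
import random

# Hypothetical difficulty ladder and point values; the post does not give real ones.
LEVELS = ["easy", "medium", "hard"]
POINTS = {"easy": 1, "medium": 2, "hard": 3}


def next_level(current, answered_correctly):
    """Step up one level after a correct answer, down one after a wrong one."""
    i = LEVELS.index(current)
    if answered_correctly:
        i = min(i + 1, len(LEVELS) - 1)
    else:
        i = max(i - 1, 0)
    return LEVELS[i]


def run_quiz(bank, answer_fn, num_questions, rng=random):
    """Generate a quiz on the fly, starting from a random medium question.

    bank maps a level name to a list of question ids, and
    answer_fn(question_id) returns True for a correct answer.
    Returns (asked question ids, total points earned).
    """
    level, asked, score = "medium", [], 0
    for _ in range(num_questions):
        candidates = [q for q in bank[level] if q not in asked]
        if not candidates:
            break  # no unseen questions left at this level
        question = rng.choice(candidates)
        asked.append(question)
        correct = answer_fn(question)
        if correct:
            score += POINTS[level]
        level = next_level(level, correct)
    return asked, score
```

A taker who answers everything incorrectly only ever sees the first medium question followed by easy ones, which is the property the anti-cheating discussion later relies on.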

So how do we determine the difficulty (and the associated points) for a given question? This is done through our calibration and question selection mechanism, which we cover in the next section.

Question calibration

We use a Rasch Model to calibrate question difficulty based on answer response data. Each question is in one of three statuses: draft, limited ramp, or full ramp. Newly added questions start in “draft” status and aren’t surfaced to members. Questions marked as “draft” will subsequently be randomly picked and surfaced as actual assessment questions for our members, thus entering the “limited ramp” stage.
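
For readers unfamiliar with the Rasch Model, the sketch below shows its core idea: the probability of a correct answer depends only on the gap between the taker's ability and the question's difficulty. The calibration function is a simplified illustration that treats taker abilities as already known and fits a single question by Newton's method; the function names and this simplification are assumptions, and the post does not describe how the real calibration is implemented.

```python
import math


def rasch_probability(ability, difficulty):
    """Rasch model: probability that a taker answers a question correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def calibrate_difficulty(responses, iterations=25):
    """Estimate one question's difficulty by maximum likelihood.

    responses is a list of (taker_ability, answered_correctly) pairs.
    Treating abilities as known is a simplifying assumption of this sketch.
    """
    difficulty = 0.0
    for _ in range(iterations):
        probs = [rasch_probability(a, difficulty) for a, _ in responses]
        # Gradient of the log-likelihood with respect to difficulty.
        gradient = sum(p - int(ok) for p, (_, ok) in zip(probs, responses))
        information = sum(p * (1.0 - p) for p in probs)
        if information < 1e-9:
            break  # responses are all right or all wrong; no finite estimate
        # Newton step, clamped so a single update cannot overshoot wildly.
        difficulty += max(-1.0, min(1.0, gradient / information))
    return difficulty
```

For example, if one in four takers of average ability (0.0) answers correctly, the estimated difficulty converges to ln(3), meaning the question sits above those takers' ability level.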

Due to the Rasch Model requirements, we only put a limited number of questions into the “limited ramp” status to avoid polluting the results of the already-calibrated questions. While a question is marked as “limited ramp,” it does not count towards the final score of the assessments it’s part of; it is surfaced only to gather aggregated response data. After a “limited ramp” question has collected enough data to determine its difficulty, it is considered calibrated and is fully ramped to all members. Thus, as more members take assessments, they help refine the scoring and accuracy of each question. This lets us rapidly build out Skill Assessments at massive scale without having to hardcode the difficulty of each question from the outset.
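
The status lifecycle and its effect on scoring can be sketched as a small state machine. The response threshold, the function names, and the shape of the answer records below are hypothetical; the post only states that a question needs "enough data" before it is fully ramped.

```python
from enum import Enum


class QuestionStatus(Enum):
    DRAFT = "draft"
    LIMITED_RAMP = "limited ramp"
    FULL_RAMP = "full ramp"


# Hypothetical threshold; the post does not say how much data is "enough".
MIN_RESPONSES_TO_CALIBRATE = 500


def promote(status, response_count=0, picked_for_ramp=False):
    """Return the next status of a question, or the same one if nothing changes."""
    if status is QuestionStatus.DRAFT and picked_for_ramp:
        return QuestionStatus.LIMITED_RAMP
    if (status is QuestionStatus.LIMITED_RAMP
            and response_count >= MIN_RESPONSES_TO_CALIBRATE):
        return QuestionStatus.FULL_RAMP
    return status


def final_score(answers):
    """Sum points for correct answers, ignoring questions still being calibrated.

    answers is a list of (status, points, answered_correctly) tuples.
    """
    return sum(
        points
        for status, points, correct in answers
        if correct and status is QuestionStatus.FULL_RAMP
    )
```

Keeping the scoring rule in one place makes it explicit that a "limited ramp" question gathers data but never changes a member's result.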

We have three Hadoop workflows to manage these three statuses:

Initial Calibration: For new assessments, before fully releasing them to the public, we first collect enough answer response data to calibrate the initial difficulty for each question. This data will mainly come from our own employees or a small group of dynamically-selected LinkedIn members. This Hadoop workflow will calibrate the answer response data for all questions within the new assessment.
On-Fly Calibration: New questions will go through the “draft” and “limited ramp” flow before they are “fully ramped.” This Hadoop workflow will only calibrate the answer response data for those new questions within an assessment.
Recalibration: The difficulty of fully ramped questions will be recalibrated periodically with new answer response data, so that their difficulty will always be relevant.

To utilize the calibrated questions, the Hadoop workflows generate a question ranking key-value dataset with Fisher information, and store the data in a key-value store (Venice) for online services to use. The key-value structure is illustrated in the diagram below. It uses the skill ID and the ability as the key, and the value is an ordered question array based on Fisher information, which is calculated based on question frequency, difficulty, and status.
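
A rough sketch of such a dataset follows. Under a Rasch Model, the Fisher information of a question at a given ability is p * (1 - p), which peaks when the question's difficulty matches the taker's ability. The post says the real calculation also takes question frequency and status into account; this example uses difficulty only, and the ability grid, the skill ID, and the function names are assumptions.

```python
import math


def fisher_information(ability, difficulty):
    """Item information under a Rasch model: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
    return p * (1.0 - p)


def build_ranking(skill_id, difficulties, ability_grid):
    """Map (skill_id, ability) -> question ids, most informative first.

    difficulties maps a question id to its calibrated difficulty.
    ability_grid is a hypothetical discretization of ability levels.
    """
    ranking = {}
    for ability in ability_grid:
        ranking[(skill_id, ability)] = sorted(
            difficulties,
            key=lambda q: fisher_information(ability, difficulties[q]),
            reverse=True,
        )
    return ranking


# Example with made-up question ids and difficulties.
example = build_ranking(
    "python",
    {"q_easy": -2.0, "q_mid": 0.0, "q_hard": 2.0},
    [-2.0, 0.0, 2.0],
)
```

An online service can then look up the key for the taker's current ability estimate and take the first question in the list that has not been asked yet.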

With this generated question ranking key-value dataset, the Skill Assessments algorithm is able to adapt to the assessment taker’s ability level and select the next question in an assessment based on their answers to previous questions. The algorithm, coupled with the assessment termination criteria, powers a fair and reliable scoring experience.
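
The post does not spell out how ability is updated or what the termination criteria are. As one common approach in adaptive testing, the sketch below re-estimates ability by maximum likelihood after each answer and stops once the standard error of that estimate is small enough or a question cap is reached. The thresholds and function names are assumptions for illustration only.

```python
import math


def rasch_p(ability, difficulty):
    """Rasch probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(responses, iterations=25):
    """Maximum-likelihood ability from (difficulty, answered_correctly) pairs.

    Returns (ability, standard_error).
    """
    ability = 0.0
    for _ in range(iterations):
        probs = [rasch_p(ability, d) for d, _ in responses]
        gradient = sum(int(ok) - p for p, (_, ok) in zip(probs, responses))
        information = sum(p * (1.0 - p) for p in probs)
        if information < 1e-9:
            break  # no responses, or all right / all wrong
        ability += max(-1.0, min(1.0, gradient / information))
    information = sum(
        rasch_p(ability, d) * (1.0 - rasch_p(ability, d)) for d, _ in responses
    )
    standard_error = 1.0 / math.sqrt(information) if information > 0 else math.inf
    return ability, standard_error


def should_stop(responses, max_questions=15, target_se=0.6):
    """Hypothetical termination rule: question cap or a precise enough estimate."""
    if len(responses) >= max_questions:
        return True
    _, standard_error = estimate_ability(responses)
    return standard_error <= target_se
```

The standard error shrinks as more informative questions are answered, which is why selecting questions by Fisher information lets an adaptive test finish with fewer questions than a fixed one.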

Preventing cheating

To make the assessment system trustworthy, anti-cheating protections are another critical component. To achieve this, we have made efforts to make it harder to search for the answers online or figure out the questions through other LinkedIn members.

First of all, we apply the basic measures that make copying questions hard: a Skill Assessment can only be retaken a finite number of times within a certain timespan, copying text content is disabled on the web and mobile clients, and a time limit is enforced for answering each question.
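
Two of these measures, the retake limit and the per-question time limit, amount to simple server-side checks. The sketch below shows one way to express them; the specific limits (two attempts per 90 days, 90 seconds per question) and the function names are invented for illustration, since the post gives no concrete values.

```python
from datetime import datetime, timedelta

# Hypothetical limits; the post only says "a finite number of times
# within a certain timespan" and "a time limit".
MAX_ATTEMPTS = 2
RETAKE_WINDOW = timedelta(days=90)
QUESTION_TIME_LIMIT = timedelta(seconds=90)


def can_take_assessment(previous_attempts, now):
    """Allow a new attempt only if fewer than MAX_ATTEMPTS fall inside the window."""
    recent = [t for t in previous_attempts if now - t < RETAKE_WINDOW]
    return len(recent) < MAX_ATTEMPTS


def answer_is_on_time(question_shown_at, answered_at):
    """Accept an answer only if it arrives within the per-question time limit."""
    return answered_at - question_shown_at <= QUESTION_TIME_LIMIT
```

Enforcing both checks on the server rather than in the client matters, because a client-side timer or counter can be bypassed.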

A second and more foolproof component to prevent cheating is the nature of Adaptive Testing itself. That is, if a member tries to quickly click through an assessment to learn the questions beforehand (likely on a different LinkedIn account), they will only be surfaced easy, low-point questions. Guessing the wrong answers to these means they will continue to be surfaced easy, low-point questions. They will then get a low score and never see the intermediate and advanced questions.

Finally, due to the dynamic nature of our content creation pipeline, questions will continuously be rotated and retired, so keeping a static cheatsheet of all the questions available is simply not feasible.

How skills help our members land their next opportunity

After we register that a member has passed a Skill Assessment, a few things happen. First, an active job seeker who has signed up for job alerts will automatically start receiving alerts for opportunities that they are pre-qualified for based on the assessment results. This is done by feeding the Skill Assessment signal into our search stack, Galene, as seen in the figure below:

[Figure: Skill Assessment signal in our search stack]

Espresso has built-in support for setting up Kafka topics for each table through Brooklin. Our Recruiter search service consumes the resulting events and indexes a member’s passed skill within a few milliseconds.
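
Conceptually, the consuming side keeps an inverted index from skill to the members who have passed it. The sketch below models that with an in-memory structure fed by plain dictionaries; it does not use any Kafka, Brooklin, or Galene API, and the event field names (member_id, skill_id, status) are assumptions made for this example.

```python
from collections import defaultdict


class PassedSkillIndex:
    """In-memory stand-in for a search index of passed skills."""

    def __init__(self):
        self._members_by_skill = defaultdict(set)

    def apply(self, event):
        """Apply one change event; the field names are hypothetical."""
        members = self._members_by_skill[event["skill_id"]]
        if event["status"] == "PASSED":
            members.add(event["member_id"])
        else:
            members.discard(event["member_id"])

    def search(self, skill_id):
        """Return the ids of members who have passed the given skill."""
        return sorted(self._members_by_skill.get(skill_id, ()))


index = PassedSkillIndex()
for change in [
    {"member_id": 1, "skill_id": "python", "status": "PASSED"},
    {"member_id": 2, "skill_id": "python", "status": "FAILED"},
    {"member_id": 3, "skill_id": "python", "status": "PASSED"},
    {"member_id": 1, "skill_id": "python", "status": "PASSED"},  # redelivered event
]:
    index.apply(change)
```

Because applying the same event twice leaves the index unchanged, a handler shaped like this tolerates redelivered events, which is a useful property for any stream consumer.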

Our Recruiter Search product also reads directly from the same search service, allowing headhunters, staffing agencies, and recruiters around the world to immediately start filtering based on members with specific passed skills.

By feeding the Skill Assessments data into our Talent Solutions stack, LinkedIn amplifies your chances of landing your next job. In fact, for those who are actively job-seeking, early results show a significant improvement (~30%) in the likelihood of hearing back from a recruiter if they pass a LinkedIn Skill Assessment.

Acknowledgements

Special thanks to all the engineers who made this product possible: Richard Meng, Joey Bai, Shiqi Wu, DJ Kim, Peter Tong, Enrique Torrendell, Kirill Talanine, Jia Li, Xin Liu, Xianyun Mao, Vijay Ramamurthy, Jefferson Lai, Vinyas Maddi, Himanshu Khurana, Smitha George, Yubo Wang, Eileen Ho, Mahir Shah, Joyce Liang, and Emre Kazdagli.
