Personalized Recommendations in LinkedIn Learning
December 6, 2016
We recently launched LinkedIn Learning, an online learning platform that enables students and professionals to take courses and learn the skills required to meet their career goals. As part of this platform, we provide personalized course recommendations. A/B testing indicates a 58% higher engagement rate when we provide personalized recommendations compared to generic or randomized recommendations.
It’s important to call out that these personalized recommendations are made possible by the robust, highly-structured knowledge base of member-skill-job connections that we have assembled at LinkedIn. For more information about this foundational work that enables machine learning and relevance at LinkedIn, please refer to Building the LinkedIn Knowledge Graph by Qi He, Bee-Chung Chen, and Deepak Agarwal.
Recommendations: A skill-based approach
As the workforce continues its transition toward a knowledge economy, “lifelong learning” becomes a necessity for students and professionals alike. With the technological advances of the past decade, the job market has become more dynamic, retiring skills and making some jobs obsolete while creating new jobs that require new skills. There is thus a growing incentive to continue learning beyond school, whether you are a student transitioning from school to work, or a professional looking to re-skill (keep up with the competencies required by the job market) or make a career transition. To meet this demand, online learning has become an indispensable mechanism, enabling people all over the world to learn remotely over the last decade.
Online learning can be self-paced, and its flexibility enables individuals of all abilities, demographics, and constraints to learn by creating personalized learning paths. It not only helps overcome the limitations of formal education, but has also led to a paradigm shift in the hiring strategies used by employers. Experts believe that if employers hired based on skills and absolute skill level, as opposed to primarily on formal education, then candidates could choose any valid learning path that meets the goals and standards required by employers. Career and education needs vary from person to person based on career goals and skills gaps. In that context, personalized course recommendation algorithms are important in helping learners pick relevant courses that will help them achieve their specific goals.
Bootstrapping recommendations for new users
The course recommendation algorithms employed by most online learning applications rely on activity signals from existing users in order to make high-quality recommendations. For new users or new platforms, such signals are absent, and this creates an onboarding problem, also known as the “cold start problem.” We mitigate this problem either by using an explicit questionnaire asking members to pick the skills they are interested in learning, or, even better, by implicitly inferring these skills via a relevance framework that utilizes members’ information available through their LinkedIn profiles.
The overall system architecture of the course recommendation engine is shown above. There are two sources of data that we use: the course database (which contains information related to courses, like description, author, category, video transcription, and other metadata) and member information (both explicit profile information and behavior on LinkedIn). The resulting course recommendations for a member are stored in offline and online stores and are served through our online service.
Mapping skills to courses and members
The central theme of our approach is to use skills as features to represent both members and courses. Each course can be mapped to a skill vector representing the skills that would be acquired by taking it. Members can also be represented in this skill space using various sources of information. The explicit skills listed on a member’s profile are one such source, but others can be used to infer implicit skills, for instance the member’s activity on LinkedIn, or the connections the member has through our rich professional network. Once courses and members have been mapped into this skill space, we compute affinity scores between member-course pairs.
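As a minimal sketch (not LinkedIn’s production code), the affinity between a member and a course represented as sparse skill vectors can be computed as a dot product; the `affinity` helper and the skill weights below are illustrative assumptions.

```python
from typing import Dict


def affinity(member_skills: Dict[str, float], course_skills: Dict[str, float]) -> float:
    """Dot product between two sparse skill vectors (skill name -> weight)."""
    # Iterate over the smaller vector so lookups hit the larger one.
    if len(member_skills) > len(course_skills):
        member_skills, course_skills = course_skills, member_skills
    return sum(w * course_skills[s] for s, w in member_skills.items() if s in course_skills)
```

In practice both vectors would be extremely sparse relative to the full skill vocabulary, which is why a hash-map representation is more natural here than a dense array.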
Some skills are more popular (and thus, less unique) than others. To ensure that skill-based course recommendations for a member also take into account the uniqueness of their skills, we compute an “inverse document frequency” (IDF) score for each skill. To compute a skill’s IDF score, member profiles are treated as “documents” and skills are treated as “words” appearing in those “documents.” A smoothed IDF score for a skill is then computed as IDF(s) = log(1 + M/N(s)), where M is the total number of members and N(s) is the number of members having skill “s” on their profile. Skills that are extremely rare (N(s) < 100) are not considered. A member’s skill vector is weighted by these IDF scores before being used in the dot product, yielding a weighted similarity computation. This way, courses are recommended to members based on their more distinctive skills.
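The IDF computation above can be sketched directly from the formula; this is an illustrative implementation, with the rarity cutoff exposed as a `min_count` parameter (the post uses a cutoff of N(s) < 100).

```python
import math
from typing import Dict, List, Set


def skill_idf(profiles: List[Set[str]], min_count: int = 100) -> Dict[str, float]:
    """Smoothed IDF(s) = log(1 + M / N(s)), treating profiles as documents
    and skills as words; skills on fewer than `min_count` profiles are dropped."""
    m = len(profiles)  # M: total number of members
    counts: Dict[str, int] = {}
    for profile in profiles:
        for skill in profile:
            counts[skill] = counts.get(skill, 0) + 1  # N(s)
    return {s: math.log(1 + m / n) for s, n in counts.items() if n >= min_count}
```

Multiplying each entry of a member’s skill vector by its IDF score before the dot product then down-weights ubiquitous skills and up-weights distinctive ones.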
There is also a post-processing step we use to diversify our recommended courses, so that several very similar courses do not dominate the top of the list. To illustrate: say there are three courses tagged with the same set of skills (these could be newer versions of an older course, or the same course at different difficulty levels). In either case, the affinity score of all three courses will be exactly the same for a member who has some or all of those skills listed on their profile. If these three courses happen to have the highest affinity score, then the top three recommendations for the member will be those three courses. This is the classic problem of incorporating diversity into recommendations. To address it, we use a round-robin algorithm: courses with the exact same skill overlap with the member’s skills are placed in the same bucket, and each bucket is ordered by a secondary feature, say course creation date. Given a set of skills for which a course recommendation needs to be made, the most recent course is picked from the corresponding bucket. After one recommendation has been drawn from that bucket, we draw the next course recommendation from the next bucket, which has a different skill set. After picking a course recommendation from the last bucket, we return to the first bucket. This approach ensures diversity of skills in course recommendations.
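The round-robin step described above can be sketched as follows; this is a simplified illustration that assumes each candidate course arrives (already sorted by affinity) with a pre-computed skill set and a creation date, and the function and field names are our own.

```python
from collections import OrderedDict
from typing import FrozenSet, List, Tuple


def diversify(scored: List[Tuple[str, FrozenSet[str], str]]) -> List[str]:
    """Round-robin over buckets keyed by the overlapping skill set.

    `scored` is a list of (course, skills, creation_date) tuples, assumed
    pre-sorted by affinity; within a bucket, ties are broken by recency.
    """
    buckets: "OrderedDict[FrozenSet[str], List[Tuple[str, str]]]" = OrderedDict()
    for course, skills, created in scored:
        buckets.setdefault(skills, []).append((created, course))
    for bucket in buckets.values():
        bucket.sort(reverse=True)  # most recent creation date first
    ranked: List[str] = []
    while any(buckets.values()):
        # One pass over the buckets yields at most one course per skill set.
        for bucket in buckets.values():
            if bucket:
                ranked.append(bucket.pop(0)[1])
    return ranked
```

With three equally-scored Java courses and one SQL course, this interleaves the SQL course into the second position instead of letting the Java variants occupy the entire top of the list.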
Along with relevant course recommendations, it is also important to provide explanations for why the recommendation is meaningful to the member. These explanations appear as “context annotations” for the course recommendations. Some examples of contexts for course recommendations are: “Trending in your title,” “Because you have skill ‘Java’,” “Stay sharp on ‘Object-oriented programming’,” “Because you watched Course ‘Java Essential Training’,” “Based on your title ‘Software Engineering’,” etc. Each context annotation translates to a carousel in the grid of courses that are recommended on the LinkedIn Learning desktop interface.
Testing recommendation quality
We experimented with different ways of creating the course-to-skill and member-to-skill mappings and tested the recommendation quality on channels like the feed and in marketing email campaigns. We measured Click Through Rate (CTR) on our feed and metrics like “Open to Click Rate” on our email campaigns. Our feed experiments heavily relied on our A/B testing framework. From our email campaigns, we learned that personalized recommendations are far more effective in acquiring new learners than non-personalized or hand-curated recommendations. From our feed experiments, we learned that the accuracy of the course-to-skill model is critical in ensuring quality recommendations. We also learned that a member’s professional network or profile information is very important for computing inferred implicit skills, and comes in handy when a member’s profile has an incomplete explicit skills section.
In a high-traffic channel like the LinkedIn feed, our recommendations need to be fresh, and we need to be very careful about fatigue. When there is not enough member activity on a particular course recommendation, showing the same recommendation repeatedly in subsequent sessions is futile and risks causing fatigue. One way to solve this problem is impression discounting. Our first, naive implementation was a simple Impression Cool-Off mechanism: for the past X days, we found the list of members who were served course recommendations and/or engaged with those impressions via clicks. These members were filtered out from our final recommendations, and hence were not served impressions on the (X + 1)th day. In our experiments, we learned that Impression Cool-Off increases CTR by 10%, but at the cost of losing 34% of impressions, which is too many for the gain in CTR. We are therefore working on building an Impression Discounting pipeline, which will discount impressions at the course level instead of at the member level.
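The member-level cool-off amounts to a set-difference filter; a minimal sketch follows, assuming we already have the set of members impressed or clicked in the past X days (all names here are illustrative, not LinkedIn’s actual pipeline).

```python
from typing import Dict, List, Set


def cool_off_filter(recommendations: Dict[str, List[str]],
                    recently_impressed: Set[str]) -> Dict[str, List[str]]:
    """Drop members who were served (or clicked on) course-recommendation
    impressions within the cool-off window; survivors keep their full list."""
    return {member: courses
            for member, courses in recommendations.items()
            if member not in recently_impressed}
```

The coarseness is visible here: a member who saw only one course loses all of their recommendations, which is exactly why discounting at the course level is the more efficient follow-up.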
Our work in personalizing course recommendations for learners has only just begun, as we just launched the product to the general public in September (and announced 4,000 new German, French, Spanish, and Japanese courses last week). During the launch of LinkedIn Learning, we solved the cold start problem using LinkedIn’s rich profile data. However, several challenges and exciting problems lie ahead of us. As we gather more activity signals from learners (their browsing and course-viewing behaviors), we can incorporate this user feedback into our recommendations in a supervised manner. Given members’ click behavior on LinkedIn Learning, we can also learn which contexts are relevant to them and personalize the ordering of the carousels. For example, if a member is interested in Data Mining and frequently clicks or views courses shown in the “Because you have Data Mining” context, then we can ensure that this carousel appears at the top of the grid on the member’s LinkedIn Learning landing page.
Learning Relevance is a distributed team of engineers and scientists in New York, NY (Christopher Lloyd), Sunnyvale, CA (Deepak Kumar, Gungor Polatkan, Shivani Rao), and San Francisco, CA (Jeffrey Gee, Konstantin Salomatin, Mahesh Joshi). We'd also like to extend a special thanks to Luke Duncan and Anish Nair on the Monetization Relevance team, without whom this project couldn't have been completed. We are also grateful for the support of our partners in the LinkedIn Learning backend engineering team, content-production team, product managers, standardization, and search team (located in Bangalore, India).