An Introduction to AI at LinkedIn

Deepak Agarwal

Chief AI Officer @ LinkedIn

October 9, 2018

Editor’s note: The use of AI in LinkedIn products has been the subject of multiple press articles and research papers (some highlighted on this blog). With the release of a new LinkedIn Learning course about AI at LinkedIn, we asked our Head of AI, Deepak Agarwal, for a brief overview of what AI is and how it works, geared towards people who are interested in this growing field. In this post, we discuss AI as a broad topic and look at a few ways that it influences product design at LinkedIn.

Back in 2005, I was working in my first job at AT&T Bell Labs. The telecommunications industry was struggling due to price wars and increased competition from wireless carriers. I was left wondering what I should do as I watched colleague after colleague slowly leave to find other positions in the booming consumer internet industry spearheaded by players like Google and Yahoo!.

Although LinkedIn existed at the time, I didn’t know about it. So what did I do? I reached out to my network, talked to my previous boss, and then participated in one job interview after another, until I finally ended up at Yahoo! Research later that year. This was the beginning of my career in the tech industry.

I share this anecdote because in many ways, it echoes the story of many LinkedIn members: I reached out to my network and found opportunity. At LinkedIn, enabling this kind of economic mobility at scale is our job—we want to connect every member of the professional workforce in the world to opportunity. To help tackle this mammoth task, we use artificial intelligence (AI) to assist with everything from finding our members the right job openings to surfacing better candidates for our customers. With AI, we’re able to efficiently sort through the massive amount of data we have—job postings, people you may want to connect with, feed content, and more—and align recommendations with members’ interests.

We’ve been incorporating AI into our products and services for years, and we’ve also written previously on our blog about several specific applications of AI at LinkedIn. In this post, I’d like to take a step back and reflect more broadly on how we use AI to improve our members’ and customers’ experiences. As AI has continued to advance, it has become omnipresent at LinkedIn, rather than being siloed into only one or two applications. Taking a broader view of how we use AI at LinkedIn helps to illustrate and explain how it is now woven into the fabric of everything that we do.

What is artificial intelligence?

According to a basic definition, AI is the science and engineering of building intelligent computer programs that can achieve complex goals, such as driving a car, identifying a cat in an image, or suggesting a job you may be interested in. Underneath the broad umbrella of AI are specialized branches, such as machine learning and deep learning.

In order to understand how AI systems help us achieve our goals, it’s important to step back and look at how these algorithms work.

You identify a broad objective for the AI system, like “provide new job opportunities for our members that match their skills and interests” or “provide recruiters with a list of candidates that both match the given search criteria and are likely to result in a successful hire.”
You define a set of intermediate metrics (called “relevance” metrics in Figure 1) that are used as a proxy for how well the system is achieving its goal. This is often necessary because the original product metrics (for example, successful hires) are not something that a machine learning algorithm can easily optimize directly. In our example, these metrics could include the number of members that apply to the jobs they are given, the number of confirmed hires, the number of members that click on job listings, etc.
You create an algorithm that improves (according to your relevance metrics) upon your existing method of generating results from data. For example, a model could use different criteria to recommend job opportunities to members, resulting in an increased number of members clicking on job listings, which is used as an indication that the job recommendations have improved.

Finally, you use the scientific method to test the algorithm, changes to the algorithm, and competing algorithmic approaches to see which changes to the system yield the best results. An example of this is A/B testing.
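To make the last step concrete, here is a minimal sketch of how an A/B test on a relevance metric might be evaluated. It assumes the metric is click-through rate on job listings and uses a standard two-proportion z-test; the function name and the traffic numbers are hypothetical illustrations, not LinkedIn's actual experimentation tooling.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare the click-through rate of variant B against control A.

    Returns (relative lift, z statistic, two-sided p-value).
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lift = (rate_b - rate_a) / rate_a
    return lift, z, p_value


# Hypothetical traffic: control (A) vs. a new recommendation model (B).
lift, z, p_value = two_proportion_z_test(
    clicks_a=1000, views_a=20000, clicks_b=1150, views_b=20000
)
print(f"lift={lift:.1%}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the change in click-through rate is unlikely to be noise, but as discussed below, a win on one relevance metric does not by itself mean the member experience improved.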

Figure 1: An illustration of the relationship between product design, AI systems, and A/B testing

One key consideration for any company using AI is making sure that you are using the right metrics. For instance, when trying to increase the number of interactions with job recommendations, you could accidentally hurt the member experience by providing too many job recommendations, particularly to users who may not be looking for a new job. Similarly, members typically do not want to spend their time applying for a job if there is a small chance that they will be accepted. A key focus at LinkedIn is using a holistic approach to machine learning that optimizes for utility on both sides of the customer-member experience, whether it be content in your feed, job recommendations, or providing candidate search results to recruiters. You can read some of our research on this topic from KDD ‘14 (video) and about how this approach manifests in members’ newsfeeds.

Another important consideration is making sure that you are being responsive to the feedback you receive from members. Whether it is the results of an A/B test or qualitative techniques like interviews or focus groups, taking a holistic approach to feedback is equally important for long-term success.

How do we use AI at LinkedIn?

In one way or another, AI powers everything at LinkedIn. We use it in ways that our members see every day, like giving them the right job recommendation, encouraging them to connect with someone, or providing them with helpful content in the feed. It is at work in products for our enterprise customers, such as helping a salesperson predict the responsiveness of their leads, serving relevant advertising to our members, or helping a recruiter find new talent pools. It also works in the background, doing things like making sure that our members are protected from harmful content, routing connections to ensure a fast site speed experience, and making sure that the notifications sent to our members are informative, but not annoying.

Figure 2: Examples of the LinkedIn Feed and a Jobs page

People + machines to leverage data at scale
Many people think of AI as a completely automated process with no human input, but much of the data used by our AI systems and many of the ways we deploy those systems are reliant on human input. Take the example of profile data. At a fundamental level, almost all our member data is generated by members themselves. As a result, one company might have a job called “senior software engineer,” while at another company, the same role would have the title “lead developer.” Multiply this by millions of member profiles, and you begin to realize that providing a good search experience for recruiters, where all of these varying job titles show up, can be a very challenging task! Standardizing that data in a way that our AI systems can understand is an important first step in creating a good search experience, and that standardization involves both human and machine efforts. We have taxonomists who create taxonomies of titles, and we use machine learning models (LSTM models, other kinds of neural networks, etc.) that then suggest ways that titles are related. Understanding these relationships allows us to infer further skills for each member beyond what is listed on their profile; for instance, someone who has a set of “machine learning” skills also understands (at least a subset of) “AI.” This is just one example of the kinds of taxonomies and relationships that make up the LinkedIn Knowledge Graph.
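As a rough illustration of what standardization and skill inference involve, the sketch below maps raw titles to a canonical title and walks a small skill hierarchy. Both lookup tables are hypothetical and hand-written here; in the approach described above, taxonomists curate the taxonomy and machine learning models suggest the relationships rather than a static dictionary.

```python
# Hypothetical taxonomy entries, hand-written for illustration only.
TITLE_SYNONYMS = {
    "senior software engineer": "senior software engineer",
    "lead developer": "senior software engineer",
    "sr. software engineer": "senior software engineer",
}

# Hypothetical child -> parent skill relationships.
SKILL_PARENTS = {
    "deep learning": "machine learning",
    "machine learning": "AI",
}


def standardize_title(raw_title):
    """Normalize case and whitespace, then map to a canonical title if known."""
    key = " ".join(raw_title.lower().split())
    return TITLE_SYNONYMS.get(key, key)


def infer_skills(listed_skills):
    """Return the listed skills plus every broader skill they imply."""
    inferred = set(listed_skills)
    for skill in listed_skills:
        current = skill
        # Walk up the hierarchy until a skill has no known parent.
        while current in SKILL_PARENTS:
            current = SKILL_PARENTS[current]
            inferred.add(current)
    return inferred


canonical = standardize_title("  Lead   Developer ")
skills = infer_skills(["deep learning"])
print(canonical, sorted(skills))
```

With titles mapped to one canonical form, a recruiter search for one title can also match members who describe the same role differently.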

As you can see, our approach to AI is neither completely machine-driven nor completely human-driven; it’s a combination of the two. We believe that both elements working together in harmony is the best solution.

Deep learning for personalization and content understanding
To perform personalization at the member level, we need machine learning algorithms that can understand content in a comprehensive fashion. Combining machine learning with member intent signals, profile data, and information about a member’s network, we can extensively personalize the recommendations and search results for our members.

We heavily leverage deep learning, a branch of machine learning that automatically learns complex hierarchical structures present in data using neural networks with multiple layers, to understand content of all types. We have developed new classes of machine learning models based on generalized mixed effects models (GLMix) to combine disparate sources of data for personalization at the member level.
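For readers who want a feel for what a mixed effects model looks like, one common way to write a model of this kind for a member m viewing a job j is sketched below. The notation is illustrative and is not taken from this post: x_{mj} is a combined member-job feature vector with global coefficients b, s_j is the job's feature vector with member-specific coefficients \alpha_m, q_m is the member's feature vector with job-specific coefficients \beta_j, and g is a link function such as the logit.

```latex
g\big(\mathbb{E}[y_{mj}]\big)
  = \underbrace{x_{mj}^{\top} b}_{\text{global (fixed) effect}}
  + \underbrace{s_{j}^{\top} \alpha_{m}}_{\text{per-member effect}}
  + \underbrace{q_{m}^{\top} \beta_{j}}_{\text{per-job effect}}
```

The global term captures patterns shared by everyone, while the per-member and per-job terms let the model adjust to an individual's own history, which is where the member-level personalization comes from.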

In addition, deep learning methods can effectively capture nonlinear patterns in temporal, sequential, and spatial data. We employ three broad classes of deep learning methods for most of our natural language processing and computer vision tasks: the aforementioned LSTMs, CNNs, and sequence-to-sequence models. We also employ canonical multi-layered perceptrons wherever necessary for some supervised learning tasks.

Putting AI into production at scale

Getting an AI system up and running can be a daunting challenge. When I started working on the AI team at LinkedIn several years ago, we already had a rich collection of data from many different sources. This benefited us for one aspect of AI creation, but our remaining challenge was twofold: scaling our people (due to a worldwide AI talent shortage) and scaling our infrastructure for deploying sophisticated models that are compute-hungry and built by processing very large datasets. These challenges still face many in the tech industry today.

Scaling our people
To scale our AI engineers, statisticians, and data scientists, we’ve adopted a centralized organizational model that embeds our experts with product teams but maintains the reporting relationship within a centralized AI organization. This allows us to find unique opportunities to cross-collaborate and problem solve for the entire member experience, while still applying more localized optimizations for machine learning problems at the product level. Our engineers often find ways to collaborate on disparate projects and share knowledge more easily because of our centralized organization.

LinkedIn AI Academy is another program that helps equip our employees across the company—in areas like engineering, product management, etc.—with the knowledge they need to optimally deliver impactful AI experiences to our members. As part of this program, engineers, for example, take a course that consists of five one-day-per-week deep-dive classes, and a subsequent, one-month apprenticeship with the core AI team. It takes participants from understanding how to incorporate and maintain an AI system to the step of actually shipping one for their team. For product managers and company executives, there is a single day-long deep-dive session that focuses on the specific domain knowledge that they’ll need to manage AI products.

A platform to train and deploy any AI model
Each AI system can only utilize certain types of data, a restriction that’s dictated by “features” that are built into a model. These features describe different kinds of information that we think might be useful to make better recommendations. For example, your job title may be a feature that can be used to match you to new job opportunities down the road. Our experts and A/B testing framework then teach the AI system how to use these features to make better recommendations based on previously available data (e.g., someone with the job title “intern” might be more interested in junior developer listings than senior developer listings).
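The idea of learning how to use features from previously available data can be sketched with a tiny logistic regression. Everything here is a hypothetical illustration: the feature names, the made-up click history, and the plain gradient-descent trainer are stand-ins, not the models or data used in production.

```python
import math


def featurize(member_title, job_level):
    """Turn a (member, job) pair into named features: a bias and a cross feature."""
    return {"bias": 1.0, f"title={member_title}&level={job_level}": 1.0}


def predict(weights, features):
    """Predicted probability that the member clicks on the listing."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def train(examples, epochs=200, learning_rate=0.1):
    """Fit feature weights with stochastic gradient descent on log loss."""
    weights = {}
    for _ in range(epochs):
        for features, clicked in examples:
            error = predict(weights, features) - clicked
            for name, value in features.items():
                weights[name] = weights.get(name, 0.0) - learning_rate * error * value
    return weights


# Made-up history: interns mostly clicked junior listings, never senior ones.
history = [
    (featurize("intern", "junior"), 1),
    (featurize("intern", "junior"), 1),
    (featurize("intern", "senior"), 0),
    (featurize("intern", "senior"), 0),
    (featurize("intern", "junior"), 0),
]
weights = train(history)
p_junior = predict(weights, featurize("intern", "junior"))
p_senior = predict(weights, featurize("intern", "senior"))
print(f"junior: {p_junior:.2f}, senior: {p_senior:.2f}")
```

The trained model scores junior listings higher for interns simply because that is what the historical data shows, which is also why the choice of features and the quality of past data matter so much.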

Doing this work can be a time-consuming process. Just at LinkedIn, we have hundreds of models in production across our various products, and hundreds of thousands of features. So we built an “AI automation” platform called Pro-ML that allows us to centrally manage the features and machine learning models for every engineering team at the company from one system. This system provides a single platform for the entire lifecycle of developing, training, deploying, and testing machine learning models. It has already massively accelerated the speed at which we can build and deploy new products at LinkedIn.

Scaling our infrastructure
On the data infrastructure side, we have a long history at LinkedIn of innovation in this space.

For example, we use our now-famous data messaging system, Kafka, as the “central nervous system” of everything at LinkedIn. We have our own stream processing framework, called Samza, which is also open sourced and used by other companies around the world. In addition to these streaming data systems, we have contributed to the Hadoop ecosystem and a variety of other projects, such as Ambry. We’ve also contributed new open source projects to help accelerate machine learning use cases for Spark.

We consume a wide variety of open source software for our projects as well. For example, our deep learning workflows extensively use TensorFlow, which is a project that originated at Google. We use Spark with Scala extensively for data processing, and use Pig and Hive for data analytics.

In addition to these open source innovations, our recent collaborations with Microsoft have allowed us to take advantage of some of the artificial intelligence services offered on Azure. For example, as detailed in previous blog posts, we use the Microsoft Text Analytics API for dynamic translation of content in the feed.

Making magic happen

AI is like oxygen at LinkedIn—it powers everything that we do. But why do we think that everything we do can benefit from AI? Here are a few reasons.

Our AI systems have had a huge impact for members who are trying to find a job. We saw a 30% increase in job applications from deploying just one AI system that improved the personalization of Jobs You May Be Interested In (JYMBII).
Job applications overall have grown more than 40% year-over-year, based on a variety of AI-driven optimizations that have been made to both sides of the member-recruiter ecosystem.
AI-driven improvements to our recruiter products have helped increase InMail response rates by 45%, while at the same time cutting down on the notifications that we send to our members.
AI has improved article recommendations in the feed by 10-20% (based on click-through rate).

If you’ve read this article and are interested in learning more about AI at LinkedIn, be sure to watch the free course videos for “AI the LinkedIn Way: A Conversation with Deepak Agarwal,” now on LinkedIn Learning.
