
Helping members connect to opportunity through AI

Co-authors: Rupesh Gupta and Lingjie Weng

LinkedIn’s mission is to connect the world’s professionals to make them more productive and successful. Becoming more productive and successful is a lifelong journey. In this journey, members leverage LinkedIn in different ways, such as to build their professional networks, learn new skills, showcase their profiles, find jobs, and stay informed about their industries. As members move through these different stages, they sometimes benefit from guidance on how to realize increased value from LinkedIn. Providing this guidance is an ambitious goal because every member is different, every member is at a different stage in their journey, and there are several ways of realizing increased value from LinkedIn at each stage.

Fortunately, we are equipped with AI algorithms that are well suited for such tasks. These algorithms can learn how members who realize value from LinkedIn leverage the platform, and these learnings can then be used to design interventions that guide other members to leverage LinkedIn in similar ways. In this blog post, we explain the AI-based, four-step approach that we employ to help our members realize increased value from LinkedIn.


Figure 1: AI-based, four-step approach for helping members realize increased value

Step 1: Defining a success metric

We started by defining our target metric of success. There are several ways of measuring whether a member is realizing value from LinkedIn. When we first implemented our approach, we assumed that a member is realizing value from LinkedIn if that member visits LinkedIn at least once within a week. Thus, in our quest to increase the number of members receiving value from LinkedIn, we sought to maximize the number of weekly active users (WAUs). This allowed us to define our goal in terms of a quantifiable success metric.
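
As a concrete illustration, here is a minimal sketch of how this metric could be computed from a log of member visits. The data structures and names are illustrative, not our production pipeline:

from datetime import date, timedelta

def weekly_active_users(visits, week_start):
    """Count members with at least one visit in the seven-day window starting at week_start."""
    week_end = week_start + timedelta(days=7)
    return sum(
        1
        for member_visits in visits.values()
        if any(week_start <= v < week_end for v in member_visits)
    )

visits = {
    "alice": [date(2023, 1, 3)],   # active in the week starting 2023-01-02
    "bob": [date(2022, 12, 20)],   # not active in that week
}
print(weekly_active_users(visits, date(2023, 1, 2)))  # prints 1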

Step 2: Identifying features that strongly correlate with the success metric

Next, we trained a “probability of weekly activeness” model (called the pWA model hereafter) to predict the probability that a member will be active within the next week. We prepared training examples for this model as follows. For each member, we snapshotted a large number of their features on a specific date, and observed their activeness over the next seven-day period to prepare labels. The label was 1 if the member was active (visited LinkedIn) during that seven-day period, and 0 otherwise. We snapshotted the following five types of features:

  • Profile features, such as industry, skills, profile completeness, etc.

  • Asset features, such as whether the member has the mobile app, whether the member has a premium subscription, etc.

  • Activity features, such as number of sessions in the last N days, number of jobs viewed in the last N days, number of jobs posted in the last N days, number of feed interactions in the last N days, number of articles shared in the feed in the last N days, etc.

  • Liquidity features, such as number of notifications of various types received by the member in the last N days, number of updates in the member’s feed, etc.

  • Network features, such as number of connections, number of connections who are active, number of entities the member follows, number of groups the member is a part of, etc.

With these training examples, we trained an XGBoost classification model. After this model was trained, we sorted the features based on their importance in the model to identify a set of features that strongly correlated with the probability of a member being active within the next week. Some of the features in this set included: whether the member has the mobile app, number of sessions in the last N days, number of notifications of various types received by the member in the last N days, and number of connections who are active. As a simple example, we could say that if a member has the mobile app, then that member is likely to be active within the next week. Similarly, if a member has a small number of connections who are active, then that member is unlikely to be active within the next week. 
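
To make this step concrete, here is a minimal sketch of the model training and feature ranking, using synthetic data and illustrative feature names in place of our feature snapshot pipeline:

import numpy as np
import xgboost as xgb

# Toy stand-ins for the snapshotted features and the seven-day activeness labels
rng = np.random.default_rng(0)
feature_names = ["has_mobile_app", "sessions_last_7d", "notifications_last_7d", "active_connections"]
X = rng.random((1000, len(feature_names)))
y = (X[:, 1] + 0.5 * X[:, 3] + rng.normal(0, 0.2, size=1000) > 0.8).astype(int)

# Train the pWA classifier
model = xgb.XGBClassifier(n_estimators=100, max_depth=4)
model.fit(X, y)

# Sort features by importance to surface those most strongly correlated with weekly activeness
ranked = sorted(zip(feature_names, model.feature_importances_), key=lambda p: p[1], reverse=True)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")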

Step 3: Estimating the impact of a change in each strongly correlated feature on the success metric

Next, for each feature in the above set, we considered whether it was possible for us to directly change the value of that feature through an intervention. If it was, we kept the feature in the set; otherwise, we removed it. For example, we kept the feature “whether the member has the mobile app,” as we can nudge a member to install the mobile app through a prompt on the web app, but we removed the feature “number of sessions in the last N days,” as we cannot create an intervention to directly change its value.

Then, for each feature in this smaller subset, we conducted an observational causal analysis to get a rough estimate of the expected impact of a change in the feature’s value on our success metric (WAUs). This step was necessary because correlation does not necessarily imply causation. For example, based on correlation, we can say that if a member has a small number of connections who are active, then that member is unlikely to be active within the next week. However, without further analysis, we cannot say whether the likelihood of that member being active within the next week would increase if they added more connections who were active. Here is how we estimated the expected impact of a change in the value of the feature “number of connections who are active.” We selected a random date d, looked at the historical data surrounding that date, and divided our members into two groups:

  • Group 1: Members who connected with at least one active member (a member who visited LinkedIn at least once in the week before date d) between the dates d and d+7.

  • Group 2: Members who did not connect with any active member between the dates d and d+7. 

Then, we calculated the number of WAUs in each of these groups in the one-week period before date d and the one-week period after date d+7.


Figure 2: Estimating the impact of an increase in the value of the feature “number of connections who are active” on the WAUs metric from historical data

From this, we calculated the expected impact of an increase in the value of the feature “number of connections who are active” as:

[WAUs(Group 1, after) - WAUs(Group 1, before)] -
         [WAUs(Group 2, after) - WAUs(Group 2, before)]

The first part of this formula captures the change in WAUs due to connecting with at least one active member, as well as other possible factors. The second part captures the change in WAUs due to factors other than connecting with an active member. Subtracting the second from the first therefore isolates the change in WAUs attributable to the connection itself.
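
This is a difference-in-differences style estimate, and the computation itself is simple once the four WAU counts are available. A minimal sketch with illustrative names:

def estimated_impact(waus_g1_before, waus_g1_after, waus_g2_before, waus_g2_after):
    """Difference-in-differences estimate of the impact on WAUs of
    connecting with at least one active member."""
    change_with_connection = waus_g1_after - waus_g1_before  # connection effect + background factors
    background_change = waus_g2_after - waus_g2_before       # background factors only
    return change_with_connection - background_change

Since the two groups generally differ in size, in practice the counts would typically be normalized (for example, to per-member activeness rates) before taking this difference.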

After running this type of analysis for each of the features in our subset, we ordered the features based on the expected impact of a change in their values on our success metric.

Step 4: Designing interventions to change values of features with a large expected impact on the success metric

Next, for each feature with a large expected impact on our success metric, we designed interventions (or experiments) for changing its value. For example, “number of connections who are active” had a large expected impact on WAUs. We therefore designed an intervention to facilitate a change in the value of this feature as follows. 

Our People You May Know (PYMK) recommender suggests people that a member might want to connect with. Roughly speaking, this recommender previously maximized the total number of connections among members on the platform by recommending people who were most likely to receive and accept an invite from the member. The score for a candidate recommendation for a member was computed as:

score(candidate) = pConnect(member, candidate)

where pConnect(member, candidate) is the probability that a connection will be formed between the member and the candidate.

Although this maximized the number of connections being formed, it did not necessarily maximize the value being realized by each member. For example, if Alice and Bob are good friends in real life, then Bob would be recommended to Alice with the above scoring function, as Alice is very likely to send an invite to Bob and Bob is very likely to accept an invite from Alice. However, if Alice is currently looking for a job, then she might not realize as much value from connecting with Bob as she could from connecting with a prospective employer. To maximize the value realized by each member, we added maximization of our success metric (WAUs) as a new objective. This resulted in a new scoring function with an additional term for the new objective:

score(candidate) = pConnect(member, candidate) +
     𝛼 [ΔpWA(member | member-candidate) + ΔpWA(candidate | member-candidate)]   - Equation (1)

where ΔpWA(member | member-candidate) is the expected change in the probability that the member will be active within the next week if a connection is formed between the member and the candidate, and ΔpWA(candidate | member-candidate) is the expected change in the probability that the candidate will be active within the next week if a connection is formed between the member and the candidate.

Note that in this scoring function, we captured the impact of a connection between a candidate and the member on the weekly activeness of both the member as well as the candidate. Also, we introduced a parameter 𝛼. This parameter controls the tradeoff between the new objective of WAUs and the earlier objective of number of connections. In the above example, replacing Bob with a prospective employer might reduce the number of connections formed, as Alice might not send an invite to that prospective employer or that prospective employer might not accept an invite from Alice. 
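
Here is a minimal sketch of this scoring function; the value of 𝛼 and the numbers in the usage example are illustrative, not our production settings:

def score(p_connect, delta_pwa_member, delta_pwa_candidate, alpha=0.5):
    """Equation (1): blend connection likelihood with the expected lift in
    weekly activeness for both sides of the potential connection."""
    return p_connect + alpha * (delta_pwa_member + delta_pwa_candidate)

# A candidate with lower pConnect can still rank higher if the connection
# is expected to lift weekly activeness more
print(score(0.9, 0.01, 0.0))   # good friend: 0.905
print(score(0.7, 0.35, 0.10))  # prospective employer: 0.925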

Now, to be able to score each candidate recommendation using Equation (1), we needed the ability to predict ΔpWA(P | P-Q), i.e., the expected change in the probability that a member P will be active within the next week if a connection is formed between member P and another member Q. Learning a model for making such a prediction is a non-trivial exercise, because we do not have all the required information for training this model in our historical data. For example, let’s say a member A connected with a member B on date d in our historical data. Then, to create a training example from this event, we need the following label for this training example to accurately capture the change in the weekly activeness of A due to the connection with B:

label = (weekly activeness of A after date d) -
     (weekly activeness of A after date d if A had not connected with B)

The difficulty lies in the fact that while (weekly activeness of A after date d) is observed, the counterfactual (weekly activeness of A after date d if A had not connected with B) is not. To get an accurate estimate of the counterfactual, we would have to find another member A’ who was identical to A on date d but did not connect with B, and then look at the observed weekly activeness of A’ after date d. However, finding A’ is not easy. So, we made the assumption that (weekly activeness of A before date d) is a good approximation of (weekly activeness of A after date d if A had not connected with B).

With this assumption, we prepared two training examples from each connection event A-B in our historical data: one for the change in weekly activeness of A, and one for the change in weekly activeness of B, due to the connection A-B. For each training example, we used features similar to the ones used in the pWA model for the two entities A and B, as well as some pair features, such as the profile similarity between A and B. With these training examples, we trained an XGBoost model for ΔpWA(P | P-Q).
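
Putting the above assumption into code, the training-example preparation could look like the following sketch, where the feature and activity helpers are illustrative stand-ins for our real pipelines:

from datetime import timedelta

def delta_pwa_examples(a, b, d, member_features, pair_features, was_active_in_week):
    """Build the two training examples for a connection event A-B on date d.
    A member's weekly activeness before d stands in for the unobserved
    counterfactual (their activeness after d had the connection not formed)."""
    examples = []
    for subject, other in ((a, b), (b, a)):
        label = (was_active_in_week(subject, d)                         # observed: week after d
                 - was_active_in_week(subject, d - timedelta(days=7)))  # proxy for the counterfactual
        features = member_features(subject, d) + member_features(other, d) + pair_features(subject, other, d)
        examples.append((features, label))
    return examples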

Our model learned some intuitive patterns, such as: 

  • ΔpWA(P | P-Q) being high if P is a job seeker (viewed some jobs in the past) who is not very active (few sessions in the recent past) and Q is a job poster (posted jobs in the past) who is active (several sessions in the recent past). 

  • ΔpWA(P | P-Q) being high if P is a content consumer (some feed interactions in the past) who is not very active (few sessions in the recent past) and Q is a content producer (shared content in the feed) who is active (several sessions in the recent past). 

With this model in Equation (1), we can see that if a member is currently looking for a job in their journey towards becoming more productive and successful, then our intervention should help them connect with active recruiters and hiring managers. That member should see more active recruiters and hiring managers in their PYMK recommendations, and should also appear in the PYMK recommendations of active recruiters and hiring managers. Then, once this member finds a job and starts looking for news about their industry, our intervention should help them connect with individuals who actively produce content.

When we deployed our intervention, we did see an increase in the number of WAUs, but we also saw a slight drop in the number of connections formed. We tuned the parameter 𝛼 in Equation (1) to achieve an acceptable tradeoff between the two objectives.

Similarly, we designed other interventions for changing the values of other features with a large expected impact on WAUs. It is worth pointing out here that:

  • Predicting the impact (ΔpWA) of non-personalized interventions is much simpler. For example, to predict the impact of nudging a member to install the mobile app through a prompt on the web app, we can simply run an experiment where we show this prompt to a small percentage of randomly selected members and then compare the weekly activeness of members in the treatment group with that of members in the control group (see the sketch after this list). The comparison can be done for members with any chosen characteristics — for example, members in the treatment and control groups who were in the internet industry before the start of the experiment.

  • ΔpWA should be close to 0 for most interventions for a member who is already very active, or, in other words, already realizing large value from LinkedIn.
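
For the randomized experiment described in the first bullet above, the comparison reduces to a difference in weekly-activeness rates between the two groups. A minimal sketch with illustrative names:

def activeness_lift(treatment, control):
    """Difference in weekly-activeness rates between randomly assigned
    treatment members (who saw the prompt) and control members. Each list
    holds one boolean per member: was the member active in the week after
    the experiment started?"""
    treatment_rate = sum(treatment) / len(treatment)
    control_rate = sum(control) / len(control)
    return treatment_rate - control_rate

Filtering both lists to a chosen cohort (for example, members who were in the internet industry before the experiment) yields the same estimate at the cohort level.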

Learnings

Here’s a summary of our learnings based on our experience thus far:

  • It is very important to select a success metric early on, as it is difficult to make progress without it. There are usually many options for the success metric, but it’s good to pick a metric that aligns well with the mission of the product and can be observed easily.

  • Training a simple correlational model with a large number of member features greatly helps narrow down the search space of possible interventions. 

  • Changing the value of features that do correlate strongly with the success metric doesn't necessarily result in improvement of the success metric; for this reason, it's best to evaluate the impact of any change in the value of each strongly correlated feature on the success metric through an observational causal analysis.

  • Interventions need to be carefully designed when trying to change the values of features with a large expected impact on the success metric. It’s best to add maximization of the success metric as an objective to the system through which an intervention is going to be delivered. 

  • When preparing training data for a model to predict the impact of an intervention on a member, certain assumptions usually need to be made (depending on the type of intervention).

  • To improve the success metric for a specific cohort of members (such as members who are students), the correlation and observational causal analysis can be rerun for just those members to identify features with a large expected impact for that cohort in particular.

  • AI helps identify impactful interventions based on historical data about how different members have leveraged the product to achieve their goals. However, there can certainly be many other impactful interventions that enhance member experience in new ways, such as a new product capability, a redesign of the user interface, etc.

Acknowledgements

The content of this blog post is based on the work of the Retention AI team at LinkedIn: Jiaqi Ge, Keren Wang, Lingjie Weng, Rupesh Gupta, Smriti Ramakrishnan, Yan Gao, Yao Pan, and Yung-Yu Chung.

We would like to thank Shipeng Yu, Hema Raghavan, Nikhil Joshi, Shira Gasarch, Albert Cui, Bobby Nakamoto, Ye Tian, Caitlin Crump, Joon Lim, Abhisek Kumar, Guangde Chen, Ajith Muralidharan, Netra Malagi, Kathleen Shim, Vivek Gupta, Sigal Traister, Chanh Nguyen, Parag Agrawal, Aastha Nigam, Eric Lawrence, Abdulla Al-Qawasmeh, and many others who helped us.