Spreading the Love in the LinkedIn Feed with Creator-Side Optimization
October 16, 2018
Our 567M members use the LinkedIn feed to talk to each other a lot: more than a million posts, videos, and articles flow through the LinkedIn feed each day. This is the story of how we discovered some growing pains for both creators and viewers in the feed; how we solved the problems with a smarter feed relevance model; and how we combined multiple experimental techniques to understand the impact of the changes on the whole interconnected feed ecosystem. So, let’s start with the problem.
Watch the author talk about creator-side optimization and incentives in social networks, in this video from Strata Data NY 2018.
The problem: Big trouble for small creators
Members can participate in conversations in the feed in two distinct roles: as creators who share posts, and as feed viewers who read those posts and respond to them. When a feed viewer visits the LinkedIn homepage, the feed relevance model selects the best posts to show at the top of the feed. The viewer can give feedback to creators through viral actions by liking and commenting on their posts. Of course, the same person can play both roles; many viewers also create posts, and creators usually also view the feed and respond to other members’ posts.
More and more people are using the feed and giving feedback to their network’s posts: our members generate tens of millions of viral actions (likes, comments, and reshares), and the number is increasing more than 50% YoY. However, we found that these increases weren’t equally distributed. In fact, at the beginning of 2018, we were in danger of creating an economy where all the gains in viral actions accrued to the top 1% power users, while the majority of creators who don’t receive much feedback were receiving less than ever. The distribution was already skewed to start with, which occurs naturally in any system where virality propagates. Influencers like Bill Gates get orders of magnitude more feedback than the average person; he has millions of followers, so this is normal and expected. In some ways, a “rich get richer” feedback loop is to be expected in social media (indeed, a related phenomenon dubbed the “Matthew effect” has been observed in many areas of social interaction). But in this case, we saw that the number of creators who get zero feedback when they post was actually increasing. Yikes!
Members skew wildly in terms of their connections on social networking sites, and LinkedIn is no exception. Combined with limited space in the member’s feed, optimizing for the wrong metric can lead to negative ecosystem effects.
This was a huge problem because research shows that giving and receiving feedback on internet posts really does help people feel closer to each other. Our members tell us, and we also see in our data, that getting feedback is critical for helping creators feel successful so that they’ll want to come back and post again in the future. Members who receive 10+ likes when they post are 17% more likely to post again the following week compared to members who post but don’t get any feedback. In light of this, an increase in posters receiving no feedback at all was very concerning, especially when, overall, feed viewers were giving more feedback. It was clear that we couldn’t just grow our way out of this problem by encouraging feed viewers to give more and more feedback. If that feedback kept going to the top 1% of posters, who were already getting plenty, the lesser-known creators would continue to be starved.
A second aspect of this problem was also becoming apparent among feed viewers. We heard anecdotal reports that irrelevant hyper-viral posts were gaming the feed and crowding out posts from closer connections. This wasn’t due to any kind of bug; the feed relevance model was doing exactly what we told it to do by showing lots of broad-interest, hyper-viral content. If many people have already enjoyed, liked, and shared a piece of content, then the feed will correctly guess that a new viewer is also highly likely to enjoy it. However, these posts may not provide much incremental value despite their popularity, and our members’ time is in short supply; if we take up all of their time with popular, general-interest content, they’ll miss important posts from close connections and people they know personally.
The solution: A new feed ranking optimization function
Previously, the feed ranked posts using signals such as the probability that the viewer would click the post or give feedback on it by liking, commenting, or resharing it. We also incorporated the predicted downstream viral impact of that feedback on the ecosystem; this aspect of the model makes recommendations by considering both an individual viewer’s preference, and her network’s. The missing link here is that the model didn’t consider how much the creator may appreciate receiving feedback from the viewer. The model had a blind spot when it came to the value of feedback to the creator because, historically, we always considered the ranking problem primarily from the viewer’s perspective. The novelty here is considering the perspective of the creator whose content is being viewed. For the top 1% of creators, one more like or comment from an unknown follower may not mean much. In fact, they may already get so many responses that one more will just go unnoticed. But for the average creator, getting one more piece of feedback from a close colleague can actually be very meaningful—and could even lead to a conversation and maybe a career opportunity.
To solve the “concentration of likes” problem and take the creator’s perspective into account, we added an additional term in the optimization function of the relevance model, so that we are now optimizing for a new utility function along with the existing ones. This new term quantifies the value that the creator will receive from the viewer providing feedback on the post, accounting for the fact that the first few pieces of feedback are the most important, and each new response beyond the first few provides diminishing returns. Now, the feed “knows” how much a given creator will appreciate getting feedback from a given viewer, and it uses this information when ranking the posts. Of course, not every post deserves feedback (some posts are spam or just not very good!), so the model also considers the quality of the post to avoid spamming viewers with low-quality posts. The effect is that we are redistributing a little bit of the attention in the system from the power users to the other creators, so that no one is left behind. This helps ensure that the “small” creators who create high-quality posts can reach out to the community that cares about them.
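To make the idea concrete, here is a minimal sketch of what a creator-side term with diminishing returns could look like. This is purely illustrative: the function names, the hyperbolic decay, the quality gate, and the `alpha` weight are all assumptions, not LinkedIn’s actual model.

```python
def creator_utility(feedback_already_received, viewer_creator_affinity):
    """Hypothetical creator-side utility term.

    The value of one more piece of feedback is concave in the feedback the
    post already has: the first few responses matter most, and each
    additional one brings diminishing returns. Hyperbolic decay is an
    assumption; any concave, decreasing function would express the idea.
    """
    diminishing = 1.0 / (1.0 + feedback_already_received)
    return viewer_creator_affinity * diminishing


def rank_score(p_engage, downstream_value, post_quality,
               feedback_already_received, affinity, alpha=0.3):
    """Combine the existing viewer-side terms with the new creator-side term.

    The creator term is gated by post quality so that low-quality or spammy
    posts are not boosted into viewers' feeds. All weights are illustrative.
    """
    return (p_engage
            + downstream_value
            + alpha * post_quality
            * creator_utility(feedback_already_received, affinity))
```

Under this sketch, a good post with zero feedback from a close connection outscores the same post after it has already collected dozens of responses, which is exactly the redistribution effect described above.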
Now that we had a new model, we needed to determine whether it would actually fulfill the goals we set out to achieve. For this, we turned to experiments.
Optimization from the creator’s perspective.
Experimentation in a world of network impacts
We do a lot of A/B testing at LinkedIn, and it’s absolutely critical to understanding whether a new feature is working as intended and is actually improving the member experience. However, standard A/B testing is woefully inadequate to assess the impact of a feature like this new relevance model on our ecosystem. In fact, many of our experiments at LinkedIn have network impacts that are hard to measure and hard to reason about. Let’s see why that’s so.
In a traditional A/B test, we randomly choose whether each member is in the treatment or control group, and then we treat the treatment members with some experimental condition. In this example, the feed viewers are the treatment group, and the treatment consists of ranking posts higher if we believe the creator who shared the post will appreciate the viewer’s feedback. The viewer sees a feed with more content from their close connections and might give more feedback to those people. However, they’ll probably give about the same amount of feedback overall—instead of liking or commenting on a post from Bill Gates, they’ll like or comment on a post from their close connection. From the viewer’s perspective, not very much has changed; both posts are good and both posts are worth liking or commenting on, so it’s hard to see much difference in standard engagement metrics.
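The member-level randomization itself is usually implemented as deterministic hashing rather than a stored coin flip, so each member lands in the same group every time they visit. A minimal sketch, with illustrative names (this is not LinkedIn’s experimentation platform):

```python
import hashlib


def assign_variant(member_id, experiment, treatment_fraction=0.5):
    """Deterministically bucket a member into treatment or control.

    Hashing (experiment, member_id) yields a stable, roughly uniform value
    in [0, 1), so assignment needs no per-member state and is consistent
    across sessions. All names here are hypothetical.
    """
    digest = hashlib.sha256(f"{experiment}:{member_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    return "treatment" if bucket < treatment_fraction else "control"
```

Because the hash also includes the experiment name, the same member gets independent assignments across different experiments.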
The big change is for the creator—if the creator’s network is in the treatment group, they’ll get more feedback. Yay! However, because of how we’ve randomized the experiment, about half of each creator’s network will be in treatment and half will be in the control group. There are no two groups of creators we can compare to each other. So, how can we actually measure the impact on creators?
Standard A/B testing can’t measure network impact.
Our first choice might be to do two experiments: one randomized on the viewers and one randomized on the creators. To do the creator-side experiment, we could take the creators and boost half of them in the feeds of their network, while leaving the other half un-boosted so we can compare them to each other. We’ve frequently used this technique in other areas of LinkedIn, such as notifications. Unfortunately, that option doesn’t work in this case due to technical limitations, so for now we’d have to rely on randomizing on the feed viewers.
A second option, which we used in this case, is using “upstream/downstream metrics.” These are metrics calculated on a single member, but which measure some aspect of their impact on their network. As a basic example, if we create an experiment that causes members to post more, we can measure the downstream impact on their networks with a metric like “total feedback received.” For a given poster, the “total feedback received” metric calculates the number of likes and comments they received on their posts. This would help us distinguish between a treatment that causes members to create boring posts that get very little feedback versus a treatment that helps members create interesting posts that get more feedback.
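A downstream metric like “total feedback received” is just an aggregation over feedback events keyed by the creator. A toy sketch, with an assumed event schema:

```python
from collections import Counter

# Assumption: each event is a (creator_id, action) pair; the schema and
# action names are illustrative, not LinkedIn's actual event format.
VIRAL_ACTIONS = {"like", "comment", "reshare"}


def total_feedback_received(feedback_events):
    """Count likes, comments, and reshares received per creator."""
    totals = Counter()
    for creator_id, action in feedback_events:
        if action in VIRAL_ACTIONS:
            totals[creator_id] += 1
    return totals
```

Comparing this per-creator aggregate between treatment and control networks is what distinguishes “members post more but get ignored” from “members post more and get more feedback.”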
In the case of this new feed model, to get a sense of the potential impact on creators, we used a suite of upstream metrics, including “first likes given.” This metric quantifies how often I, as a feed viewer, like a post that didn’t previously have any likes. If I see a post that doesn’t have any likes, and I click the like button for it, then I just created one “first like given.” The suite of metrics contains several variations on this theme involving comments, freshness of the post, and the changing value of feedback beyond the first piece (which rolls into a metric called “creator love given”), but it all follows the pattern of measuring value given to the creator. The big caveat here is that even if a viewer gives 10 more “first likes given” in the treatment group than in the control group, that doesn’t mean creators will receive 10 more first likes if we ramp the experiment to everyone. This is because if someone in the treatment group didn’t give that first like, it’s possible that someone in the control group would still have given it later, so all our treatment has done is give the creator their first like a bit sooner than they would’ve gotten it otherwise. The metric gives us a directional indication that we’re changing things in the way that we want to, but it doesn’t scale linearly to the whole population. If we want to accurately measure the actual impact on creators, we’ll need a different method.
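The “first likes given” computation can be sketched as a single pass over chronologically ordered like events, crediting the viewer whenever they like a post that has no likes yet. The event schema is an assumption for illustration:

```python
def first_likes_given(like_events):
    """Count "first likes given" per viewer.

    like_events: chronologically ordered (viewer_id, post_id) pairs.
    A viewer earns one "first like given" when their like is the first
    one a post receives. Schema and names are illustrative.
    """
    liked_posts = set()   # posts that already have at least one like
    counts = {}
    for viewer_id, post_id in like_events:
        if post_id not in liked_posts:
            counts[viewer_id] = counts.get(viewer_id, 0) + 1
            liked_posts.add(post_id)
    return counts
```

Note how this mirrors the caveat in the text: the metric credits whoever liked first in the observed data, so a treatment can shift *who* gives the first like (or when) without changing how many posts ultimately receive one.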
Our “creator love” upstream metrics describe how the creator feels about the viewer’s actions.
To measure the impact on creators more accurately, we used “ego clusters,” in which we randomize clusters of members rather than individuals into the treatment and control groups. In this method, we take a sample of members called the Egos and randomize them into either the treatment or control group. But instead of treating the Egos themselves, we treat their Alters (their connections) with either treatment or control. Thus, the treatment unit of the experiment isn’t a single member, but rather a member plus their whole network. Ego cluster experiments answer the question: if my whole network is treated, what’s the impact on me? There are some caveats here (lower power because we have a smaller sample size; incomplete treatment of the Alters because the clusters overlap), but it’s currently our gold standard method for measuring network impacts in the feed. Guillaume Saint-Jacques and team have a paper on the method coming soon; stay tuned.
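The ego-cluster assignment logic can be sketched as follows. This is a simplified illustration of the randomization scheme described above, not the production implementation; in particular, the tie-breaking rule for Alters shared by multiple Egos is an assumption.

```python
import random


def ego_cluster_assignment(egos, network, seed=42):
    """Ego-cluster randomization sketch.

    Randomize each sampled Ego into treatment or control, then apply that
    condition to the Ego's Alters (their connections), not to the Ego
    itself. The Ego's own metrics then answer: "if my whole network is
    treated, what happens to me?"

    network: dict mapping each Ego to a list of its Alters (illustrative).
    Overlapping networks mean some Alters are claimed by multiple Egos;
    here the first assignment wins, which is one source of the
    "incomplete treatment" caveat mentioned above.
    """
    rng = random.Random(seed)
    ego_arm = {ego: rng.choice(["treatment", "control"]) for ego in egos}
    alter_arm = {}
    for ego, arm in ego_arm.items():
        for alter in network.get(ego, []):
            alter_arm.setdefault(alter, arm)
    return ego_arm, alter_arm
```

Because the unit of randomization is a member plus their network, the effective sample size is the number of Egos, which is why the method has lower statistical power than member-level A/B tests.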
Ego clusters randomize on groups of members.
Just to round out the methodologies, in some cases we also use edge-based bootstrapping on Bernoulli randomization, pioneered among our analytics team by Jim Sorenson. This only works for experiments where each interaction is one-to-one between the sender and receiver, such as messaging. In the feed, where actions such as posting can be one-to-many, we don’t have the option of considering single edges between pairs of individuals, because everything is an interconnected web.
Four methods used to measure network impact.
Results: This story has a happy ending
It turns out that the new feed model exceeded our expectations. We worried that we might see some declines in feed engagement that we would have to weigh against the benefits we were bringing to creators, but actually, this feature turned out to be win-win for both creators and feed viewers. Members like seeing more content from people they know!
Of course, prioritizing posts from close connections means we have less space in the feed to show posts from top creators. The overall effect of the model was to take about 8% of all feedback away from the top 0.1% of creators and redistribute it to the bottom 98%. Since rolling out the model, we’ve seen a reduction in out-of-network highly-viral posts shown in top slots of the feed, indicating that, in aggregate, this change is making more room for posts from close connections at the top of the feed.
As a result, we saw big wins for creators, especially the members with smaller networks. The fraction of creators receiving feedback on their posts increased, and as a result we saw a 5% increase in creators returning to post again.
If this all sounds like our top creators are now going to be hurting because we’ve taken feedback away from them, recall that we’re in an environment of 50% YoY growth in overall responses to posts—taking 8% of the likes away from the top 0.1% still leaves them better off than they were a year ago. These changes just help ensure that the rising tide is lifting all the boats in a fair and equitable fashion.
Of course, we have more plans for optimizing and extending this model, as well as teaching the feed relevance model lots of new tricks involving content understanding, transfer learning, and more. We want to optimize the feed as an ecosystem; after all, LinkedIn is a social network where your actions not only affect your own experience, but also impact many other members. We also want to understand members’ long-term behavior, such as interactions with the feed over time, and incorporate that into the relevance model. Other signals that we want to explore further include dwell time, freshness, and affinity, among others. Beyond the feed, we plan to also incorporate creator-side optimization into other areas of LinkedIn content recommendations, such as notifications.
We’re not done exploring new methods of experimentation and analysis either. We’re working on new and better ways of measuring the impact of our work on complex, interconnected systems. We know we’ll win some and lose some, but we always want to try lots of new things, take intelligent risks, learn, and iterate, so that ultimately we can enable members to advance their careers through their LinkedIn community.
This story spans the work of several teams, including feed AI (Wei Xue, Ying Xuan, Souvik Ghosh, Tim Jurka) and the greater flagship relevance org (Shaunak Chatterjee, Kinjal Basu), analytics (Guillaume Saint-Jacques, Jia Ding, Ya Xu), and product (Pete Davies).