Maximizing Our Publishing Platform Reach with Network Distribution

August 13, 2014

One of the primary differentiators between the LinkedIn publishing platform and other existing blogging platforms is the power of network distribution - distribution of published content to a member’s connections and followers. To give you an idea of the reach of network distribution, we know that members who have published at least one article have an average of 1049 1st degree connections and 42 followers.

Within network distribution there are three channels to distribute content:

  1. Distribution to feeds
  2. 1st degree network notifications
  3. Distribution to Pulse Email

This post also covers how we filter out low-quality and spam articles so that only high-quality articles are distributed to a member’s network.

Distribution to Feeds

At LinkedIn, the Feed-Mixer system is used to create and optimize feeds for each LinkedIn member. This system works by blending many different types of updates together into a single feed that is then optimized to provide high quality and relevant content for the viewer.

Feed-Mixer supports the most important network distribution channel for member published articles - the LinkedIn Feed on Mobile and Desktop. These feeds help LinkedIn members keep up with their connections’ activities by consuming updates on the LinkedIn.com homepage feed or on LinkedIn mobile applications. When an author publishes an article on LinkedIn, a “member published” activity is distributed to all of the author's 1st degree connections and followers. Viral activity on that update - comments, likes, and reshares - extends the article's distribution beyond the author's 1st degree network and exposes it to additional members who can then follow the author. For many of our authors, over time, the followership base that they build on LinkedIn will dwarf their 1st degree network.

Let's talk about how member published updates get pushed to Feed-Mixer and delivered to members. Below is a data flow from publishing service to Feed-Mixer.

Data flow from Publishing Service to Feed-Mixer

When an author publishes an article, the publishing service posts a “member published” update to the Activities Index, which is hosted on Espresso. When there is a client request for a member’s network feed, Feed-Mixer fetches activities from the Unified Social Content Platform (USCP) and other updates and recommendations from different first pass rankers that serve personalized professional content for a feed. Feed-Mixer then merges this content into a single logical feed based on relevance, freshness, and diversity, and serves it to clients.
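The merge step can be illustrated with a small sketch. This is not Feed-Mixer's actual algorithm, which the post does not describe; it only assumes that each candidate update carries a relevance score from a first pass ranker, that freshness can be modeled as an exponential decay over the update's age, and that diversity can be approximated by capping how many updates of one type appear. The names `Update`, `score`, `merge_feed`, `half_life_hours`, and `max_per_type` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    update_id: str
    update_type: str   # e.g. "member_published" or "recommendation" (illustrative)
    relevance: float   # assumed first-pass ranker score in [0, 1]
    age_hours: float   # time since the update was created


def score(update, half_life_hours=24.0):
    """Combine relevance and freshness; the decay model is an assumption."""
    freshness = 0.5 ** (update.age_hours / half_life_hours)
    return update.relevance * freshness


def merge_feed(sources, limit=10, max_per_type=2):
    """Blend several candidate lists into one feed.

    Candidates are ranked by score, and at most `max_per_type` updates of
    any one type are kept, as a crude stand-in for diversity.
    """
    candidates = [update for source in sources for update in source]
    feed = []
    per_type = {}
    for update in sorted(candidates, key=score, reverse=True):
        count = per_type.get(update.update_type, 0)
        if count >= max_per_type:
            continue  # skip: this type is already well represented
        per_type[update.update_type] = count + 1
        feed.append(update)
        if len(feed) == limit:
            break
    return feed
```

With three “member published” activities and one recommendation as input, a cap of two per type lets the lower-scored recommendation into the feed ahead of the third activity.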

Notifications to 1st Degree Network

Today, when a post is published by a member, we notify their connections via in-app mobile notifications as well as on www.linkedin.com.

Creating this publishing notification is very straightforward. On the server side, we first define a new notification type for the publishing event. These notifications are created asynchronously after a member publishes a post, which allows us to implement custom business logic for how widely and when to distribute them. The publishing service invokes the notification system’s rest.li API to send out the notifications. On the client side, we created templates for both desktop and mobile clients to render the new notification type.

However, there are three additional requirements for publishing notifications:

  1. We want members to only receive notifications for high quality and relevant articles
  2. Connections should be able to unsubscribe from notifications about any author if they choose to
  3. Notifications from the same author should be aggregated into one update

To ensure members only receive high quality and relevant publishing notifications, we do two things. First, every post must pass our spam and low quality filter before a notification is published for it. Second, only connections that we deem to be strong connections receive these notifications. We determine this by leveraging the connection strength score from the LinkedIn cloud service, which maintains connection relationships between members.

To ensure that members can unsubscribe from these notifications, we use the notification system’s rest.li API, which lets clients look up the unsubscription status of any given member. The publishing service filters the list of notification recipients by their subscription status before creating notifications for them.
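Taken together, the quality check, the connection strength check, and the unsubscribe check amount to a recipient selection step. The sketch below shows that logic in isolation, under stated assumptions: the function name `select_recipients`, the `min_strength` cutoff of 0.5, and the plain dictionary and set inputs are hypothetical stand-ins for the filter result, the cloud service's connection strength score, and the notification system's unsubscription lookup. It does not call any real LinkedIn API.

```python
def select_recipients(post_passed_filter, connection_strengths, unsubscribed,
                      min_strength=0.5):
    """Return the connections who should be notified about a new post.

    post_passed_filter:   True if the post cleared the spam / low quality filter.
    connection_strengths: dict of connection id -> strength score (assumed 0..1).
    unsubscribed:         set of connection ids who unsubscribed from this author.
    min_strength:         hypothetical cutoff for a "strong" connection.
    """
    if not post_passed_filter:
        return []  # no notifications at all for filtered posts
    return sorted(
        member_id
        for member_id, strength in connection_strengths.items()
        if strength >= min_strength and member_id not in unsubscribed
    )
```

For example, given strengths of 0.9, 0.2, and 0.7 for three connections and an unsubscribe from the third, only the first connection is selected.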

For the last requirement, the notification system provides a feature that lets clients determine how unread notifications should be aggregated. For all publishing notifications, we use the option to aggregate all unread notifications into one update.
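As an illustration of the aggregation requirement, the following sketch groups unread publishing notifications by author so that each author yields a single update. The notification system's real aggregation options are not described in this post, so the dictionary fields (`author`, `title`, `read`) and the function `aggregate_unread` are assumptions made for the example.

```python
def aggregate_unread(notifications):
    """Collapse unread notifications into one update per author.

    Each notification is assumed to be a dict with "author", "title",
    and "read" keys. Authors keep the order of their first unread
    notification.
    """
    grouped = {}
    for notification in notifications:
        if notification["read"]:
            continue  # already seen, nothing to aggregate
        grouped.setdefault(notification["author"], []).append(notification["title"])
    return [
        {"author": author, "count": len(titles), "titles": titles}
        for author, titles in grouped.items()
    ]
```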

Desktop notifications when a connection publishes an article

Distribution to LinkedIn Pulse Email

LinkedIn sends out a daily or weekly Pulse email to active content consumers to provide our members with timely and relevant professional content. In the email, there is a “Published by your network” section which contains a list of articles published by your connections and by members that you are following.

All of the data needed to generate this section, such as each member’s connections, follow relationships, and published articles, is ETL’d to HDFS for offline processing. To generate the “Published by your network” section, we run a daily or weekly Hadoop job that computes each member’s network (connections + follows) and retrieves the most recent articles published by that network since the last email was sent.
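The core of that computation can be sketched for a single member in plain Python. The production version is a Hadoop job over data in HDFS; this in-memory version only mirrors the logic described above (union of connections and follows, then articles newer than the last email). The function `published_by_network` and the shapes of its inputs are assumptions made for illustration.

```python
from datetime import datetime


def published_by_network(member_id, connections, follows, articles, last_email_sent):
    """List articles for a member's "Published by your network" section.

    connections, follows: dicts of member id -> set of member ids.
    articles:             list of dicts with "author", "title", "published_at".
    last_email_sent:      datetime of the member's previous Pulse email.
    """
    # The member's network is the union of connections and followed members.
    network = connections.get(member_id, set()) | follows.get(member_id, set())
    recent = [
        article
        for article in articles
        if article["author"] in network and article["published_at"] > last_email_sent
    ]
    # Most recent articles first.
    return sorted(recent, key=lambda article: article["published_at"], reverse=True)
```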

“Published by your network” section in Pulse Email

Low Quality and Spam Filtering of Posts

As discussed earlier, before distributing articles in our network distribution channels, we need to ensure the article content is neither low quality nor spam. The publishing service is integrated with LinkedIn’s Unified Content Filtering (UCF) Service, a client-side library which provides information such as an article's spam score and confidence scores for how likely an article is to be a job posting or an event promotion. If any of these scores exceeds its threshold, the article will not be distributed to the author’s network. UCF also provides the publishing service with information indicating whether the publishing member is blocked by another member. The returned content filtering result is then stored in the publishing platform's domain and is integrated into our rest.li APIs for clients to access.
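A minimal sketch of the threshold check follows. It does not use the UCF library, whose interface is not shown in this post. The `FilterResult` fields mirror the three scores named above, while the threshold values and the function `eligible_for_distribution` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterResult:
    spam_score: float
    job_posting_confidence: float
    event_promotion_confidence: float


# Hypothetical thresholds; the real values are not stated in this post.
THRESHOLDS = {
    "spam_score": 0.8,
    "job_posting_confidence": 0.9,
    "event_promotion_confidence": 0.9,
}


def eligible_for_distribution(result, thresholds=THRESHOLDS):
    """Return True only if no score exceeds its threshold."""
    return all(
        getattr(result, name) <= limit
        for name, limit in thresholds.items()
    )
```

An article with a spam score of 0.95 would be held back from feeds, notifications, and the Pulse email under these assumed thresholds, while one whose scores all sit below the limits would be distributed.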

Publishing service’s integration with UCF
