Measuring downstream impact on social networks by using an attribution framework
July 21, 2022
Co-authors: Qiannan Yin, Derek Koh, and Jenny Wu
The network effect in social networks increases the complexity of conducting analysis, as the actions of members can impact others within the network. For instance, a member posting an article on LinkedIn could trigger downstream actions where other members see the post and interact with it by liking, commenting, or sharing. These downstream actions make the overall impact harder to measure, because we must combine the primary impact from the posters with the downstream impact from the resulting actions; this is especially challenging in A/B testing.
Even though quantifying such downstream impact is widely studied and documented in the literature (Backstrom and Kleinberg 2011; Katzir, Liberty, and Somekh 2012; Saveski et al. 2017; Karrer et al. 2020), applying these methods is often impractical because of their heavy implementation cost. In this post, we introduce a practical framework for running experiments that captures downstream effects without the complex experiment setups that other solutions in the literature require. We share the details later in this post so that others can adopt this framework.
CAMEL framework for downstream sessions
CAMEL (Customizable Allocation of Modular Empirical Likelihood) is a practical framework we developed to estimate the downstream impact of a given action in social networks. For instance, a member posts an article on LinkedIn, leading to the downstream action of another member sharing the post with their own connections. This can induce a snowball effect leading to more actions from other members. The actions other members take because of a given actor’s actions are what we refer to as “downstream session impact.”
Let 𝜏downstream be the downstream session impact and Y'i be the downstream sessions caused by member i due to their original action. Here, we are only considering one generation of activities, ignoring the downstream of downstream actions. In traditional A/B tests, the population is randomly split into two groups, where each entity i = 1, ..., N is assigned to a group by an assignment vector Z ∈ {0, 1}^N. Treated units have Zi = 1 and control units have Zi = 0. We can then estimate the downstream session impact by comparing the downstream sessions of members in the treatment group against those in the control group:

𝜏downstream = (1/N1) Σ{i: Zi = 1} Y'i − (1/N0) Σ{i: Zi = 0} Y'i

where N1 and N0 are the sizes of the treatment and control groups.
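As a minimal sketch (assuming the per-member downstream sessions Y'i have already been computed, and taking a simple difference in mean downstream sessions between the treatment and control groups):

```python
def downstream_effect(z, y_prime):
    """Difference-in-means estimate of the downstream session impact.

    z: 0/1 treatment assignments (Z_i), one entry per member
    y_prime: downstream sessions attributed to each member (Y'_i)
    """
    treated = [y for zi, y in zip(z, y_prime) if zi == 1]
    control = [y for zi, y in zip(z, y_prime) if zi == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)
```

For example, `downstream_effect([1, 1, 0, 0], [2.0, 2.0, 1.5, 1.5])` estimates a lift of 0.5 downstream sessions per member.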
To get Y', we use an attribution framework to trace the recipient session impact back to the actor whose action caused the recipient to come to the site. This is very intuitive in simple scenarios. For example, if Jane sends Mary a message, Mary will receive a notification about receiving a message. When Mary clicks on the notification, she is redirected to the site, resulting in the start of a session. In this case, Mary’s session is a recipient session of Jane’s action of sending a message (Figure 1).
Figure 1: A message actor sending a message to a message receiver
Although the attribution is straightforward when a recipient only receives one notification, it becomes more complicated when a recipient gets multiple notifications from different actors’ actions. Before we go deeper into this, let’s first take a look at the notification system. There are three types of notifications: emails, push notifications, and badge notifications.
Figure 2: Three types of notifications at LinkedIn
For emails and push notifications, although a member can receive multiple notifications, they can only click on one when being directed to the main site. So for emails and push notifications, we can still directly attribute the downstream session to the actor action, as we know the causal flow—the recipient sees the notification, clicks it, and enters the site. For badge notifications, the member sees the badge icon with the notification count and responds by opening the mobile app, starting a session. This presents a problem for attribution, as the notifications are aggregated into a count and we do not observe a direct causal link that ties a specific notification to the session generated from opening the mobile app.
Here, we treat the downstream session as a sum of effects from all the badge notifications. Assume a recipient comes to the site due to M badge notifications, and the probability of badge notification m causing a recipient session is pm, m = 1, ..., M. The expected number of recipient sessions from badge notification m is then also pm. Conditional on these badge notifications having caused a session, we have:
p1+p2+...+pM = 1
Since members won’t see the content of the badge notifications before they open the app, we assume pm does not depend on the content of the notification, but only on the total number of badge notifications (M). So we estimate pm as:

pm = 1/M, m = 1, ..., M
That is, we equally attribute the downstream session to all the actions that produce badge notifications. With this framework, each action will have an allocated fractional downstream session impact.
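A minimal sketch of this equal-split rule (the function name is ours, for illustration):

```python
def badge_credit(num_badges):
    """Equal fractional credit p_m = 1/M for each of the M badge
    notifications pending when the session starts."""
    if num_badges <= 0:
        return []
    return [1.0 / num_badges] * num_badges
```

The credits always sum to one, so each downstream session is counted exactly once in total.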
Another issue with badge notifications is that there is uncertainty on whether members open the mobile app due to the new badge notification. Sometimes members open the app, but because they do not click through all the existing notifications, the badge still remains on the app icon for the next time they reopen the app. In such a case, if we attribute downstream sessions to all the existing badge notifications at the time of the session start, we may overestimate the downstream sessions due to old badge notifications still being active.
Intuitively, the older the badge notifications are, the less likely it is that they are driving new sessions. To mitigate this, we do not want to attribute a downstream session to badge notifications that were sent long ago. On the other hand, the risk of completely discounting notifications due to time is that there may be members who take more time to react to notifications. We studied the distribution of notification response time (the time between when a notification is received and when a corresponding session is generated) for different modes of notifications, and found that over 95% of the sessions generated by notifications are within one day of the receipt of notifications. Thus, we only attribute a downstream session to badge notifications that are received within one day of that session.
So the final estimator is:

pm = 1/M'

where M' is the number of badge notifications received within one day of the session.
Now that we have the fractional downstream session impact for each action, we can get the total downstream sessions, Y'i, attributed to a specific member i by simply summing up the fractional sessions attributed to member i’s actions across all recipients:

Y'i = Σ{s ∈ Ji} s

where Ji is the set of all fractional downstream sessions attributed to member i.
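Summing the fractional credits per actor can be sketched as follows (assuming the attribution records are available as (actor_id, fraction) pairs):

```python
from collections import defaultdict

def total_downstream_sessions(fractional_credits):
    """Compute Y'_i for each actor by summing the fractional downstream
    session credits attributed to that actor's actions.

    fractional_credits: iterable of (actor_id, fraction) pairs, one per
    attributed notification.
    """
    totals = defaultdict(float)
    for actor_id, fraction in fractional_credits:
        totals[actor_id] += fraction
    return dict(totals)
```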
We validated the accuracy of CAMEL's downstream session impact estimation via two intentionally designed experiments.
Downstream session from a product
The downstream session impact from a specific product is defined as the total downstream sessions caused by members' actions on the product. For example, People You May Know (PYMK) is a module on LinkedIn where we recommend a list of people to our members to connect with. Members can take actions to "connect" with other people by sending invitations to them. The downstream session impact from PYMK is defined as downstream sessions caused by invitations sent from PYMK.
Figure 3: People You May Know (PYMK) at LinkedIn
To validate whether the CAMEL framework can accurately measure the downstream sessions from PYMK, we designed a PYMK filter experiment and compared the CAMEL measurement with the experiment readout.
We designed the PYMK filter experiment in such a way that members in the treatment group are removed from the PYMK recommendation system, so that they won't show up in other members' PYMK modules. As a result, they won't receive invitations from PYMK during the experiment period; it’s worth noting that the experiment lasts for only a week, to minimize any negative member impact. Members in the control group experience no change: they are still shown on PYMK, and they can still receive invitations from PYMK. In this way, we can directly measure the recipient-side session impact from PYMK invitations, which is the downstream session impact from PYMK.
Figure 4: PYMK filter experiment
On the other hand, CAMEL can attribute either a full or fractional recipient-side session to each invitation, based on the circumstances of the notification. So, CAMEL can measure the downstream session impact from PYMK by aggregating all the recipient-side sessions associated with invitations sent from PYMK.
We performed a hypothesis test and found no statistically significant difference between the CAMEL estimate and the experiment readout. This validates the CAMEL framework's ability to accurately estimate the downstream session impact from a product.
Downstream session impact in A/B test
The purpose of the second experiment is to validate whether CAMEL can correctly measure the downstream session impact in traditional A/B tests.
In the second experiment, we randomized the actors and measured the effect on recipients. For this experiment, an actor is a member who posts on the feed, and the recipients are the members in the actor's network who receive a notification of the post. Note that a member can be both an actor and a recipient at the same time.
Figure 6: Post notification at LinkedIn
Attempting to measure the impact of the actor on the recipients is complex because in traditional A/B tests, the experimental unit is limited to members who are actors, and thus the measurement of the experiment can only describe the impact on actors. As the recipients are not randomized, the A/B test cannot quantify recipient impact.
Figure 7: Recipients not randomized in a poster (actor) experiment
Thus, similar to what has been done at Amazon (Bajari et al. 2019), we employed a two-level randomization: an actor-randomized experiment and a recipient-randomized experiment that are orthogonal to each other. In this dual randomized experiment, each member has two labels corresponding to their randomized group in each experiment. This means that a member can be in either treatment or control in the poster experiment, and either treatment or control in the recipient experiment.
For the recipient experiment, we withhold feed post notifications sent to the treatment group if the actor is in the treatment group of the actor experiment. Thus the recipients in treatment receive fewer feed notifications than the control, mimicking what would happen if actors posted less, and we can measure how that affects the recipients' sessions.
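One common way to implement two orthogonal randomizations is to hash each member id with an independent per-experiment salt; the salts and function names below are illustrative, not LinkedIn's actual assignment system:

```python
import hashlib

def bucket(member_id, experiment_salt):
    """Deterministically assign a member to treatment (1) or control (0)
    by hashing the member id with a per-experiment salt; independent
    salts yield approximately orthogonal assignments."""
    digest = hashlib.sha256(f"{experiment_salt}:{member_id}".encode()).hexdigest()
    return int(digest, 16) % 2

def withhold_notification(actor_id, recipient_id):
    """Withhold a feed post notification only when the actor is treated
    in the actor experiment AND the recipient is treated in the
    recipient experiment."""
    return (bucket(actor_id, "actor-exp") == 1
            and bucket(recipient_id, "recipient-exp") == 1)
```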
Figure 8: Dual randomization experiment setup
For the poster experiment, this setup results in actors in the treatment group losing some of their audience. (The experiment is limited to one week in order to minimize any negative member impact.) We can then measure via CAMEL how a reduction in member post notifications impacts recipient sessions, by summing the attributed downstream sessions from the post notifications for each poster and running a poster-side t-test.
With both experiments measuring the impact of reduced post notifications, we find that the CAMEL estimate is well aligned with the recipient-side experiment readout, with no statistically significant difference. This further supports our belief that CAMEL is effective in estimating downstream session impact in experiments.
Outlier effect and winsorization
An outlier refers to an observation far away from most or all other observations. It is a common problem in social networks due to power members (Lim et al. 2011; Muchnik et al. 2013). At LinkedIn, a handful of influencers have an extremely large number of followers, whereas most members have, at most, a few hundred connections and followers. As a result, when an influencer posts on LinkedIn, the network effect can be over a hundred times larger than a post from a typical LinkedIn member. The effect of such an outlier may cause a problem when we want to generalize impact to an average member, making the statistical interpretation misleading (Ghosh and Vogt 2012).
In practice, we observed that outliers led to unreliable A/B testing results when we evaluated downstream session impact in our daily experiments using CAMEL. When interpreting A/B results using the downstream sessions metric, we observed as many as 30% of experiments showing statistically significant movement of the metric without any reasonable explanation. In one experiment case study, we observed a large downstream session lift from a small product change. Digging into the treatment and control data, we found that three influencers had posted during the experiment period and all happened to be assigned to the treatment group. Altogether, these three influencers generated millions of downstream sessions. Just by removing these three data points from the experiment, which had a sample size in the tens of millions, the downstream session impact estimate dropped by over 60%.
Besides shifting the mean of the metric, outliers also inflate variance estimates, so that larger sample sizes are needed to get a statistically significant experiment readout. As such, we found that many experiments were under-powered or did not produce statistically significant results, contrary to the convictions and expectations of the experimenters.
Common procedures to treat outliers are winsorizing and trimming. Winsorization replaces outliers with a less extreme value that is not considered an outlier, while trimming removes all the outliers from the data. A rule of thumb defines an outlier as any observation that, if removed, causes the parameter of interest to change by more than 10% (Hansen, Madow, and Tepping 1983).
We took a look at the percentiles of downstream sessions and observed that this metric was severely skewed. We tried both trimming and winsorization and found that trimming had a larger impact on the metric. So, we decided not to completely negate the influence of outliers, but rather to apply winsorization at the 99.9th percentile to the downstream sessions before using them in experiments. This adjustment made the average statistic less susceptible to extreme outliers and increased the experiment power by reducing the standard deviation of the metric by almost 90%.
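A sketch of one-sided winsorization of this kind, capping only the upper tail at the 99.9th percentile:

```python
import numpy as np

def winsorize_upper(values, upper_quantile=0.999):
    """Cap values above the given quantile; values below it are
    unchanged. Applied to a skewed metric such as downstream sessions
    before computing A/B test statistics."""
    arr = np.asarray(values, dtype=float)
    cap = np.quantile(arr, upper_quantile)
    return np.minimum(arr, cap)
```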
Extension to other downstream metrics
After attributing downstream sessions, we found that downstream actions can also be easily attributed based on the downstream session attribution. For example, when members send invitations, the recipients will come to the site to accept the invitation, and this is a downstream session attributed to the invitation sender, as described in the previous sections. At that point, a recipient may perform other activities—for instance, interacting with their news feed—which may also be attributed to the original action.
To quantify downstream actions that occur during a downstream session, we extended the CAMEL framework one step further to attribute the downstream actions performed by the recipient in their induced session to the corresponding downstream session. By identifying the recipient sessions, we attribute actions performed in these sessions to the original actor. The action metrics are multiplied by the fractional downstream session value to get the quantity attributed to the original actor.
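A sketch of this extension, assuming each attributed session record carries the actor id, the fractional credit, and the raw action metrics observed in that session (the record shape is illustrative):

```python
from collections import defaultdict

def attribute_session_metrics(attributed_sessions):
    """Credit action metrics observed in downstream sessions back to the
    original actors, scaled by each session's fractional attribution.

    attributed_sessions: iterable of (actor_id, fraction, metrics) where
    metrics maps a metric name (e.g., 'feed_interactions') to its value
    in that session.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for actor_id, fraction, metrics in attributed_sessions:
        for name, value in metrics.items():
            totals[actor_id][name] += fraction * value
    return {actor: dict(m) for actor, m in totals.items()}
```

For instance, a session credited at 0.5 that contained four feed interactions contributes 2.0 feed interactions to the upstream actor.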
With this extension, the CAMEL framework can now provide downstream impact measurement for many other metrics in addition to downstream sessions, as long as the metric can be associated with a downstream session. Quantifying the downstream impact of metrics—such as revenue generated from viewing ads, or post interactions on feeds that enrich the social network—allows for a simple way to understand the evolution of important metrics as we improve the social network at LinkedIn.
One significant advantage of using the CAMEL framework to measure downstream impact is that the implementation is relatively simple.
To set up the CAMEL framework in practice, you need to be able to track the data of members performing actions on a platform. More specifically, when a recipient clicks on a notification, you need to record the id of the recipient and the id of the notification that they clicked. Additionally, for each notification sent, there needs to be a tracked record of the notification, including the member id of the actor that initiated it. At LinkedIn, we have a scalable notification infrastructure (Nelamangala and Xiao 2018; Shi and Fuad 2018) that is used to send notifications. For each notification sent, we log the send event and its details. With this comprehensive tracking, we are able to trace downstream sessions back to the upstream actors and actions.
Once complete tracking is established, an offline flow is run to process the logged data based on the framework. Currently, we run this on a daily basis and produce a data set containing the actor attribution and fractional downstream session attribution for every notification. This data can then be used in A/B testing and downstream session related analysis. As the setup and maintenance of this offline processing flow is similar to that of other metrics used in our experimentation system, there is no difference between running a regular A/B test and an A/B test where one needs to understand the downstream impact. This is an advantage over many existing methods, which need special setup for experiments.
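The core of such an offline flow is a join between the notification-send log and the notification-click log; a simplified pure-Python sketch (the record shapes are illustrative, not our production schema):

```python
def build_attribution(notification_log, click_log):
    """Map each downstream session back to its upstream actor by joining
    notification sends with notification clicks.

    notification_log: iterable of (notification_id, actor_id) send records
    click_log: iterable of (notification_id, session_id) click records
    Returns (session_id, actor_id) attribution pairs; clicks on unknown
    notifications are dropped.
    """
    actor_by_notification = dict(notification_log)
    return [(session_id, actor_by_notification[notification_id])
            for notification_id, session_id in click_log
            if notification_id in actor_by_notification]
```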
Reveal counterintuitive experiment results
The CAMEL framework has been very successful in discovering downstream impacts that were otherwise neglected in our daily experiments. One such example is a PYMK experiment we previously conducted.
One pain point that we had heard from our members was that some were receiving too many unwanted invitations from people they didn’t know. To help those members, we adjusted the PYMK recommendation model so that members who had received too many invitations were ranked lower in the recommendation list. As a result, they get fewer impressions and thus receive fewer invitations.
We ran an experiment to test the performance of this change, and observed that invitations sent dropped by 1%, while the sender-side session impact was neutral. This was not surprising, as we were modifying the result of the AI model that optimizes the invitations sent. And when we saw the invitation numbers drop, our intuition was that the downstream sessions would also drop as a result.
However, with the CAMEL framework, we observed that downstream sessions had actually increased by 1%. This is because the new recommender model’s ranking led to more invitations being sent to members who had received fewer invitations, and invitations are more effective at driving sessions for members who actually want to receive them.
Optimize AI models
In the previous example, we have seen that downstream sessions can be a very important component for AI models. This has led us to include downstream impact metrics as inputs for AI model optimization. One such model that incorporated our optimization was the AI model that powers our feed.
At LinkedIn, we have AI models to power the ranking of updates shown in the feed. Historically, the AI models were only designed to optimize for viewers, i.e., show updates that interest the viewers. With the CAMEL framework, we incorporated posters into our AI model optimization and optimized for content that both the viewers were interested in and the posters would be likely to engage with. In this way, we improved the overall member experience by creating more opportunities for platform interactions, which also led to a significant increase in overall sessions.
Understand LinkedIn's ecosystem
Another way CAMEL is used is in understanding the sessions ecosystem of LinkedIn. Because CAMEL is able to attribute downstream sessions to an upstream actor, we are able to build dashboards and visualize how sessions are generated at LinkedIn and understand how to optimize this area.
As an illustration of this ability, we took a subset of members from an unnamed geographic region and showed the breakdown of where their downstream sessions come from (the data may not generalize to all of LinkedIn). In Figure 9, we see that a third of all sessions come from messaging, feed, and invitations. We can further break down the downstream sessions by the types of notifications that caused them. Feed activities such as liking, commenting, or sharing primarily resulted in badge notification sessions, while messaging sessions are evenly split among the three notification types. Finally, half of member invitations generated push notification sessions. With this information, we can use the best channel for each product (e.g., push notifications for invitations) to more effectively reach members with updates.
Figure 9: Notifications type makeup for sessions
(The white space in the first chart represents organic sessions)
The CAMEL framework allows for the measurement of downstream effects, providing an improved understanding of metrics in a social network. As opposed to other solutions to downstream impact measurement, CAMEL offers a simple and practical way to measure downstream effects. Such a framework can be quickly employed in companies with similar ecosystems of members and app notification functionalities. If there is adequate logging of user events, an offline processing job can simply perform the calculation of a downstream metric that the company can use directly in a regular A/B experiment setting. This is in contrast to previous solutions that require changing the experimentation setup via customized treatment assignments, which can be computationally intensive and require a lot more setup cost.
In this blog post, we highlighted the importance of downstream impact and how its consideration can alter our conclusions of experiment outcomes. We have shown that outside of experimentation, the CAMEL framework also provides an understanding of how sessions transpire in the LinkedIn ecosystem. Such information is not only useful for product managers and engineers to design better product features, but is also useful as inputs to optimize AI models.
We would like to thank Franco Liang, Craig Tutterow, Joonhyung Lim, Gigi Zhang, Shuoze Wang, Andrew Ruland, Meghan Frate, and Janice Tian for their contributions to this project, and Ken Soong, Nanyu Chen, and Bonnie Barrilleaux for helping review the blog, and Jia Ding, Wenjing Zhang, and Ya Xu for their support and feedback on this project.