How LinkedIn Uses Data to Improve Video Performance
January 14, 2019
At LinkedIn, we use data to improve our members’ experiences while using our site. On the video team, we value metrics that yield insight on how long our videos take to load, why certain videos draw more attention than others, and how our members tend to interact with videos across web, iOS, and Android. In short, the various data points collected during video playback on LinkedIn are leveraged to drive powerful improvements in video performance.
This post uses the following terms, defined here for your convenience:
Iframe: An element that can render the content of an external web page inside of it. This is very useful in the case of video, as it allows us to render videos from third-party domains (e.g., YouTube, Vimeo) directly within our site.
Viewport: The portion of a website that is visible on the screen.
DOM: The representation of the web page as a tree made up of many content nodes.
Capturing data during playback
Our systems capture troves of data on how a video performs during playback. By focusing on the following data points, we have been able to dramatically improve video performance on LinkedIn.com:
Media Initialization Start: When the player starts to initialize.
For videos played through an iframe, such as third-party videos, this metric marks when the iframe is first rendered on the page.
For HTML5, or native, videos that are rendered directly on the page, this metric marks when the loadstart event is emitted by the video player.
Media Initialization End: When the player initialization has completed.
This metric essentially marks when the video has emitted the loadeddata event.
Media Buffering Start: When the media first begins to buffer, prior to the video playing.
Media Buffering End: When the media stops buffering, just before the video begins to play.
Time to Start (TTS): The time between when the player is initialized and when the player is ready to play the video.
Note: This is the sum of the amount of time the video spent in initialization and buffering.
Perceived Time to Start (PTTS): The time between when a member requests a video to play and when the video actually starts playing.
Media Initialization Time: The amount of time between the Media Initialization Start and the Media Initialization End events.
Media Initialization Rate: The percentage of videos that enter the viewport and successfully load before they exit the viewport.
If this rate drops, it tells us that our videos are likely taking too long to load.
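To make the definitions above concrete, here is a minimal sketch of how these data points could be captured and combined for a native video. It is an illustration under stated assumptions, not LinkedIn's implementation: the names `PlaybackTimeline`, `MediaEventSource`, `trackInitialization`, `mediaInitializationTime`, `timeToStart`, and `mediaInitializationRate` are hypothetical, and only the `loadstart` and `loadeddata` events come from the definitions above.

```typescript
// Hypothetical record of the timestamps (in ms) captured for one native video.
interface PlaybackTimeline {
  initStart?: number;      // Media Initialization Start (loadstart)
  initEnd?: number;        // Media Initialization End (loadeddata)
  bufferingStart?: number; // Media Buffering Start
  bufferingEnd?: number;   // Media Buffering End
}

// Minimal event-source shape, so the sketch can be exercised with a fake
// in a test. In a browser, a video element would play this role.
interface MediaEventSource {
  addEventListener(type: string, listener: () => void): void;
}

// Records the two initialization timestamps from the events named above.
// `now` is injected so that a test can control the clock.
function trackInitialization(
  video: MediaEventSource,
  now: () => number,
): PlaybackTimeline {
  const timeline: PlaybackTimeline = {};
  video.addEventListener("loadstart", () => {
    timeline.initStart = now();
  });
  video.addEventListener("loadeddata", () => {
    timeline.initEnd = now();
  });
  return timeline;
}

// Media Initialization Time: initialization end minus initialization start.
function mediaInitializationTime(t: PlaybackTimeline): number | undefined {
  if (t.initStart === undefined || t.initEnd === undefined) return undefined;
  return t.initEnd - t.initStart;
}

// TTS: time spent in initialization plus time spent buffering.
function timeToStart(t: PlaybackTimeline): number | undefined {
  const init = mediaInitializationTime(t);
  if (init === undefined) return undefined;
  if (t.bufferingStart === undefined || t.bufferingEnd === undefined) return undefined;
  return init + (t.bufferingEnd - t.bufferingStart);
}

// Media Initialization Rate: percentage of videos that entered the viewport
// and finished loading before they left it.
function mediaInitializationRate(
  enteredViewport: number,
  loadedBeforeExit: number,
): number {
  if (enteredViewport === 0) return 0;
  return (loadedBeforeExit / enteredViewport) * 100;
}
```

Each function returns `undefined` when a timestamp is missing, so a video that never finished loading is not reported with a misleading duration.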
Later on in this post, we’ll zoom in on a couple of experiments that leveraged many of the above data points to improve one of our most important metrics, PTTS.
Using data to benefit our members
Now that we have amassed a wealth of insightful video playback data, how can we use it to improve the experiences of our members? We approach this problem in two ways.
Detailed, real-time metrics reporting
At LinkedIn, we leverage multiple internal tools and services that allow us to store data and visualize changes in this data in real time. One particularly helpful tool is InGraphs, which allows us to visualize the many data points collected across our products.
In addition to the charts that InGraphs provides, we have services that notify relevant teams if any core metrics fall below a pre-set threshold. These tools allow us to take immediate action should we detect a degradation in member experience within one of our products.
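As a rough illustration of threshold-based notification, the sketch below checks the latest metric values against pre-set minimums. It does not describe InGraphs or any internal LinkedIn service: `ThresholdRule`, `findBreaches`, and the metric names are hypothetical, and treating a missing metric as a breach is an assumption made for this example.

```typescript
// Hypothetical rule: notify when `metric` falls below `minimum`.
interface ThresholdRule {
  metric: string;
  minimum: number;
}

// Returns the names of the metrics whose latest value is below its threshold.
function findBreaches(
  rules: ThresholdRule[],
  latest: Record<string, number | undefined>,
): string[] {
  return rules
    .filter((rule) => {
      const value = latest[rule.metric];
      // Assumption: a metric with no reported value counts as a breach,
      // because we cannot confirm that it is healthy.
      return value === undefined || value < rule.minimum;
    })
    .map((rule) => rule.metric);
}

// Invented values for illustration only.
const exampleRules: ThresholdRule[] = [
  { metric: "mediaInitializationRate", minimum: 90 },
  { metric: "videoPlays", minimum: 1000 },
];
const exampleBreaches = findBreaches(exampleRules, {
  mediaInitializationRate: 85,
  videoPlays: 1500,
});
```

A "falls below" rule suits metrics where higher is better, such as Media Initialization Rate. A metric such as PTTS, where lower is better, would need the comparison reversed.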
Continuous A/B testing of features
We are constantly experimenting with new features, as well as tweaks to existing ones, with the overarching goal of producing the best experience for our members. We use metrics—in conjunction with reporting tools like InGraphs—to paint a clear picture of how a given experiment affects user interaction across our site.
For example, imagine a fictitious experiment in which we tested the effect of showing only video content for the first thirty posts in each member’s feed. The experiment may initially seem like a success, as the amount of video watched by our members has gone up. However, after a closer look at InGraphs, we notice that the number of posts shared by our members has dropped. Weighing both effects together, we would terminate the experiment for having a negative impact on member experience.
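The reasoning in this fictitious experiment can be sketched as a check that looks at every tracked metric rather than only the one the experiment targets. The names `MetricComparison` and `findRegressions` and all of the numbers are invented for illustration. The sketch also ignores statistical significance, which a real experiment analysis would need to account for.

```typescript
// Hypothetical summary of one metric across the two experiment groups.
interface MetricComparison {
  name: string;
  control: number;
  treatment: number;
  higherIsBetter: boolean;
}

// Returns the metrics that moved in the wrong direction in the treatment group.
function findRegressions(metrics: MetricComparison[]): string[] {
  return metrics
    .filter((m) =>
      m.higherIsBetter ? m.treatment < m.control : m.treatment > m.control,
    )
    .map((m) => m.name);
}

// Invented numbers that mirror the fictitious video-only feed experiment.
const videoOnlyFeed: MetricComparison[] = [
  { name: "minutesOfVideoWatched", control: 100, treatment: 130, higherIsBetter: true },
  { name: "postsShared", control: 50, treatment: 41, higherIsBetter: true },
];
const regressedMetrics = findRegressions(videoOnlyFeed);
```

Here the targeted metric improves while `postsShared` regresses, which is the signal that would lead to the experiment being terminated.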
Ensuring that our data is accurate
Data is only as useful as it is accurate. If we cannot trust the data we store, we have no basis for evaluating the experiments that we run. In addition to the data monitoring services mentioned above, we make heavy use of automated (unit, integration, and acceptance) tests to ensure a given feature works correctly. As you might imagine, it is not scalable at LinkedIn’s size to manually test all existing features during the development of new ones. Instead, tests run existing features in isolation and verify that, through various interactions, each feature performs as expected.
For example, we might write a test that asserts that clicking on a video’s play button both starts the video and captures data about the video’s loading performance. Automated tests therefore help our engineers ensure that the metrics emitted by their feature remain accurate long after the feature has been created. In addition to automated tests, LinkedIn engineers have some handy tools (see the Tracking Overlay mentioned in a prior blog post on engineering infrastructure at scale) at their disposal to facilitate the validation of metrics emitted by a given feature.
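The example test described above might look something like the following sketch. It is framework-agnostic and written against a hypothetical `PlayerUnderTest` interface. The event names, `checkPlayEmitsMetrics`, and `createFakePlayer` are assumptions for illustration and do not reflect LinkedIn's actual test suite or tracking schema.

```typescript
// Hypothetical tracking event emitted by a video player.
interface TrackingEvent {
  name: string;
  timestampMs: number;
}

// Hypothetical surface that a test needs from the player.
interface PlayerUnderTest {
  clickPlay(): void;
  isPlaying(): boolean;
  emittedEvents(): TrackingEvent[];
}

// Assumed names for the events the test expects after pressing play.
const REQUIRED_EVENTS = ["mediaInitializationStart", "mediaInitializationEnd"];

// Clicks play, then returns a list of failures. An empty list means the
// player both started playing and emitted the expected metrics.
function checkPlayEmitsMetrics(player: PlayerUnderTest): string[] {
  const failures: string[] = [];
  player.clickPlay();
  if (!player.isPlaying()) failures.push("video did not start playing");
  const names = player.emittedEvents().map((e) => e.name);
  for (const required of REQUIRED_EVENTS) {
    if (names.indexOf(required) === -1) failures.push("missing event: " + required);
  }
  return failures;
}

// Fake player used only to exercise the check in this sketch.
function createFakePlayer(emitsMetrics: boolean): PlayerUnderTest {
  let playing = false;
  const events: TrackingEvent[] = [];
  return {
    clickPlay: () => {
      playing = true;
      if (emitsMetrics) {
        events.push({ name: "mediaInitializationStart", timestampMs: 0 });
        events.push({ name: "mediaInitializationEnd", timestampMs: 120 });
      }
    },
    isPlaying: () => playing,
    emittedEvents: () => events,
  };
}
```

Running the check against a player that plays but emits nothing, such as `createFakePlayer(false)`, reports both missing events. That is the kind of silent metrics regression such a test is meant to catch.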
Using data for video performance
Due to the naturally large size of a video asset, video performance requires a unique approach: we need a way to download enough of the video so that it starts to play right away, while also ensuring that we do not slow down the rest of the elements rendering on the page.
Case study: Perceived time to start (PTTS)
At LinkedIn, we measure perceived loading times to get an understanding of how long our members are waiting for videos to play. The principal metric that we use to gain insight into how long a video takes to load is perceived time to start (PTTS). PTTS captures the time the browser spends loading a video plus the time the video spends buffering prior to playback.
Let’s take a look at the above chart, which provides some insight into a particular member’s experience in waiting for a video to load. Once the video entered the viewport, it took 2,700ms for the video to be initialized, followed by another 3,300ms of video buffering before the video began to play. The PTTS in this case was roughly 6,000ms. We can now use this metric, along with millions of other data points, as a baseline for experiments to speed up video loading. Let’s take a look below at a couple of the experiments we ran.
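The arithmetic in this example, and one way a baseline could be summarized from many such measurements, can be sketched as follows. The 2,700ms and 3,300ms figures come from the example above. Everything else is an assumption: the post does not say how the data points are aggregated, so the nearest-rank percentile and the sample values are used here purely for illustration.

```typescript
// PTTS for one playback: initialization time plus buffering time, in ms.
function perceivedTimeToStart(initializationMs: number, bufferingMs: number): number {
  return initializationMs + bufferingMs;
}

// Nearest-rank percentile over a set of PTTS samples (an assumed
// aggregation, chosen only to illustrate building a baseline).
function percentile(samples: number[], p: number): number | undefined {
  if (samples.length === 0) return undefined;
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[Math.min(rank, sorted.length) - 1];
}

// The member experience described above: 2,700ms + 3,300ms.
const examplePtts = perceivedTimeToStart(2700, 3300);

// Invented samples: the example above alongside four faster playbacks.
const pttsSamples = [examplePtts, 1200, 800, 3000, 2000];
const medianPtts = percentile(pttsSamples, 50);
const p90Ptts = percentile(pttsSamples, 90);
```

Tracking a high percentile alongside the median helps keep slow experiences like the 6,000ms example visible instead of letting them be averaged away.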
Eagerly load all videos in the DOM
At LinkedIn, we have experimented with both eagerly and lazily loading videos. To eagerly load a video is to begin downloading it as soon as it enters the DOM. In contrast, a lazily loaded video is not downloaded until it enters the viewport. Eager loading allows the video to load in the background before it enters the viewport. This delivers a great user experience because videos begin to play as soon as they enter the viewport, with little to no buffering.
At first glance, this experiment was successful in that it decreased PTTS, meaning videos appeared to begin playing in less time. However, a closer look at the metrics uncovered some fascinating results. While our members with higher bandwidth did indeed enjoy a decrease in PTTS, those with lower bandwidth experienced a decrease in Media Initialization Rate and an increase in Media Initialization Time.
Imagine, for example, a member scrolling through the LinkedIn feed on their phone while riding the subway. Given the subway’s weak internet connection, the member will already face lag in loading content, let alone a video asset. With eager loading, we are not only downloading the content in the viewport but also trying to load videos behind the scenes. This places an excessively large load on the member’s relatively weak connection, potentially resulting in none of the feed’s posts loading. This phenomenon explains the decreased Media Initialization Rate and increased Media Initialization Time noted above, and was the motivation behind our next experiment.
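The difference between the two strategies comes down to which videos are allowed to start downloading at a given moment. The sketch below expresses that as a pure function. `LoadStrategy`, `FeedVideo`, and `videosToLoad` are hypothetical names, and the feed contents are invented.

```typescript
type LoadStrategy = "eager" | "lazy";

// Hypothetical description of a video that is already in the DOM.
interface FeedVideo {
  id: string;
  inViewport: boolean;
}

// Returns the ids of the videos that should begin downloading now.
// Eager: every video in the DOM. Lazy: only videos in the viewport.
function videosToLoad(videos: FeedVideo[], strategy: LoadStrategy): string[] {
  return videos
    .filter((video) => strategy === "eager" || video.inViewport)
    .map((video) => video.id);
}

// Invented feed: one visible video and two further down the page.
const exampleFeed: FeedVideo[] = [
  { id: "video-1", inViewport: true },
  { id: "video-2", inViewport: false },
  { id: "video-3", inViewport: false },
];
const eagerIds = videosToLoad(exampleFeed, "eager");
const lazyIds = videosToLoad(exampleFeed, "lazy");
```

With this feed, eager loading starts three downloads at once while lazy loading starts one. On a weak connection, those extra downloads compete with the content the member is actually looking at.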
Queued video loading
Queued loading is a loading strategy in which videos are added to a loading queue and loaded one at a time, as opposed to loading all videos in the DOM at once (as is the case with eager loading). Queued loading aims to combine the benefits of eager loading (decreased PTTS) and lazy loading (more accessible for members with less network bandwidth). It does this by loading videos outside of the viewport, but only once the videos in the viewport have been successfully loaded.
With queued loading, we observed a slight increase in PTTS, likely because fewer videos are being loaded outside of the viewport, but an increase in Media Initialization Rate and a decrease in Media Initialization Time for members with weaker network connections. This led us to conclude that queued loading yields the best tradeoff between aggressively loading videos for our members on strong network connections and prioritizing the loading of content in the viewport for our members with weaker network connections.
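A minimal sketch of such a queue is shown below, assuming a caller-supplied `startLoad` callback that begins a download and signals completion. `VideoLoadQueue`, `StartLoad`, and `PendingVideo` are hypothetical names. As a simplification, each video's viewport status is fixed when it is enqueued, whereas a real feed would need to update it as the member scrolls.

```typescript
// Caller-supplied function that starts downloading a video and calls
// `onLoaded` once the download has finished (hypothetical signature).
type StartLoad = (videoId: string, onLoaded: () => void) => void;

interface PendingVideo {
  id: string;
  inViewport: boolean;
}

class VideoLoadQueue {
  private pending: PendingVideo[] = [];
  private busy = false;
  private startLoad: StartLoad;

  constructor(startLoad: StartLoad) {
    this.startLoad = startLoad;
  }

  // Adds a video to the queue and starts loading if nothing is in flight.
  enqueue(id: string, inViewport: boolean): void {
    this.pending.push({ id, inViewport });
    this.loadNext();
  }

  // Loads one video at a time, preferring videos in the viewport.
  private loadNext(): void {
    if (this.busy || this.pending.length === 0) return;
    // Pick the oldest pending viewport video, or the oldest pending
    // video overall when no viewport video is waiting.
    let index = this.pending.findIndex((video) => video.inViewport);
    if (index === -1) index = 0;
    const next = this.pending.splice(index, 1)[0];
    if (next === undefined) return;
    this.busy = true;
    this.startLoad(next.id, () => {
      this.busy = false;
      this.loadNext();
    });
  }
}
```

Because only one download is in flight at a time and viewport videos are served first, an off-screen video never competes for bandwidth with a video the member is waiting on.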
The large size of a video asset and the expectation that it load quickly without negatively impacting the rest of the site’s speed make video performance at scale an inherently tough problem to solve. To further complicate the problem, before we run performance-related experiments we must also consider differences in network speeds and browser capabilities, as well as the varying ways that our members use our site. By properly using data, we can quickly pinpoint and fix performance degradations, while also ensuring that no performance regressions emerge along the way.
I would like to thank Shane Afsar and Kris Teehan for their help in writing this post, as well as Kevin O’Connell and the LinkedIn Engineering Blog team for their help in editing this post. Shout out to the video team in NYC, working tirelessly on improving video performance and overall video experience.