Feature Highlight: Scaling Autoplay Videos for Hundreds of Millions

Evan Farina

Sr. Staff Software Engineer at LinkedIn

March 5, 2019

On the video team, we’re always looking for ways to improve our members’ experience with video. In late 2016, LinkedIn’s video feature was still young and our autoplay feature remained in the planning phases. Two years have since passed, and while autoplay has become a key component of the LinkedIn video experience, we’re still working on perfecting this feature due to its complex product requirements and inherent performance implications. This post will outline our product criteria for autoplay, along with the technology and architecture developed by our engineers to support it. Finally, we will take a look at some performance challenges that we faced in building an autoplay solution that can scale to hundreds of millions of members.

Technical terms

This post will refer to several frontend terms and technologies, which we define as follows:

iframe: An iframe is an element that can render the content of external web pages inside of itself. At LinkedIn, we use iframes to render videos from third-party domains (e.g., YouTube, Vimeo) directly within our site.
Viewport (or “above the fold”): The portion of a website that is visible on the screen.
Spaniel: LinkedIn’s in-house solution for tracking elements as they move in and out of the viewport.
postMessage: postMessage is a native browser technology that allows two websites on different domains to communicate with one another. We use postMessage to interact with the video APIs that third-party domains provide.
Publish-subscribe (pub-sub) pattern: A communication pattern used by applications in which programmatic events are not sent to specific subscribers, but are instead blindly emitted without knowing which components within the application may be subscribed to the events.
Debounce: Limiting the number of calls to a particular method that can occur within a given timeframe. Debouncing is especially useful when dealing with user interactions (e.g., scrolling quickly through the page) that can cause many events to occur in a short amount of time.
DOM: Represents the web page as a tree made up of many content nodes.
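
To make the postMessage definition concrete, here is a minimal TypeScript sketch of sending a command to an embedded third-party player and checking where a reply came from. The command shape and the `sendPlayerCommand` and `isFromTrustedOrigin` helpers are hypothetical names chosen for illustration; each real provider documents its own message format, and this is not LinkedIn's implementation.

```typescript
// Anything with a postMessage method; in a browser this would be
// an iframe's contentWindow.
interface MessageTarget {
  postMessage(message: unknown, targetOrigin: string): void;
}

// Hypothetical command shape; real third-party player APIs differ.
interface PlayerCommand {
  action: "play" | "pause" | "mute" | "unmute";
}

// Send a command to the embedded player. Passing an explicit targetOrigin
// (instead of "*") keeps the message from reaching an unexpected document.
function sendPlayerCommand(
  target: MessageTarget,
  targetOrigin: string,
  command: PlayerCommand
): void {
  target.postMessage(JSON.stringify(command), targetOrigin);
}

// Incoming "message" events should be checked against known origins
// before their data is trusted.
function isFromTrustedOrigin(
  event: { origin: string },
  trustedOrigins: readonly string[]
): boolean {
  return trustedOrigins.indexOf(event.origin) !== -1;
}
```

In a page, `target` would typically be `iframe.contentWindow`, and `isFromTrustedOrigin` would be called at the top of a `message` event listener.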

Product criteria

From both an engineering and a product perspective, autoplay is one of the more complicated features that we’ve built on the video team because with autoplay the devil is in the details. We focused on several key criteria:

only one video can play at a time;
autoplayed videos should pause as they exit the viewport (the caveat to this rule is if a user has interacted with the video; more on this below); and
when a user interacts with a video, or any of its controls, the video should play with sound and should not pause as it exits the viewport.

Architecture overview

There are four main aspects to LinkedIn’s autoplay architecture:

HTML5 video: This is the browser’s native video implementation.
Video manager: A singleton responsible for keeping track of which videos are playing, and whether or not they are playing with sound. The video manager controls which videos play via an event emitter, which uses the pub-sub pattern.
Video wrapper: A JavaScript object that wraps an HTML5 video element, communicates with the video manager’s public API, and subscribes to events emitted by the video manager.
Viewport management: We use Spaniel to keep track of the video elements as they move in and out of the viewport. Viewport management plays an important role in each of our video loading strategies, which is a topic that we’ll cover later on in this post.
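
The relationship between the video manager and the video wrappers can be sketched roughly as follows. This is a simplified illustration of the pub-sub pattern under stated assumptions, not LinkedIn's actual code: the class names, the `ActiveVideoEvent` shape, and the `Playable` interface (a stand-in for an HTML5 video element) are all hypothetical.

```typescript
// The minimal surface of an HTML5 video element that this sketch needs.
interface Playable {
  play(): unknown;
  pause(): void;
}

// Event emitted whenever the active video changes (hypothetical shape).
interface ActiveVideoEvent {
  activeId: string | null;
  withSound: boolean;
}

type Listener = (event: ActiveVideoEvent) => void;

class VideoManager {
  private listeners = new Set<Listener>();
  private active: ActiveVideoEvent = { activeId: null, withSound: false };

  // Returns an unsubscribe function.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => { this.listeners.delete(listener); };
  }

  // Record the new active video and emit to every subscriber, without
  // knowing which wrappers are listening.
  requestPlay(id: string, withSound: boolean): void {
    this.active = { activeId: id, withSound };
    this.listeners.forEach((listener) => listener(this.active));
  }

  get current(): ActiveVideoEvent {
    return this.active;
  }
}

// A module-level instance stands in for the singleton.
const videoManager = new VideoManager();

class VideoWrapper {
  isPlaying = false;
  private id: string;
  private video: Playable;
  private manager: VideoManager;

  constructor(id: string, video: Playable, manager: VideoManager = videoManager) {
    this.id = id;
    this.video = video;
    this.manager = manager;
    // Pause whenever a different video becomes active, which enforces
    // the "only one video can play at a time" criterion.
    manager.subscribe((event) => {
      if (event.activeId !== this.id && this.isPlaying) {
        this.isPlaying = false;
        this.video.pause();
      }
    });
  }

  play(withSound: boolean = false): void {
    this.isPlaying = true;
    this.video.play();
    this.manager.requestPlay(this.id, withSound);
  }
}
```

Because wrappers only react to emitted events, the manager never needs a reference to any individual video, which keeps the two sides loosely coupled.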

User experience considerations

Autoplay is a naturally complicated feature to get right because there is a lot to consider from a user experience perspective. Below are a few of the many aspects of user experience that we took into account when building this feature.

Viewing context
Video can show up in a substantial number of contexts on LinkedIn.com, from the feed to private messages to learning playlists. Each of these contexts requires its own consideration because our members interact with each of these pages in different ways. In the feed, for example, we have the unique challenge of having to manage a collection of videos at once. We have found that our members do not want videos to automatically play with sound, but do want videos to unmute once they have been interacted with. Within the LinkedIn Learning app, videos are loaded as playlists and each successive video needs to respect the volume setting from the video that played before it. It’s important to do a deep dive on the various contexts in which your users will be interacting with video and tailor a unique autoplay solution for each case.

The viewport
In the context of the LinkedIn feed on desktop, videos are played as soon as they enter the viewport, and paused when they exit the viewport. The exception to this rule is when a video is playing with sound: in this case, we assume that the member has shown enough interest in the video that they will want it to keep playing in the background as they scroll through the feed.
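
The viewport rule above can be expressed as a small decision function. This is a sketch only: the `VideoState` fields and the action names are hypothetical, and it assumes that a member's interaction is what causes a video to play with sound, as described in the product criteria.

```typescript
// Hypothetical per-video state tracked by a wrapper.
interface VideoState {
  isPlaying: boolean;
  userInteracted: boolean; // true once the member has clicked the video or its controls
}

type ViewportAction = "play-muted" | "pause" | "none";

// Decide what to do when a video enters or exits the viewport.
function onViewportChange(
  state: VideoState,
  inViewport: boolean,
  autoplayEnabled: boolean
): ViewportAction {
  if (inViewport) {
    // Autoplay starts muted; sound is only enabled by an interaction.
    return autoplayEnabled && !state.isPlaying ? "play-muted" : "none";
  }
  // Exiting: only videos that merely autoplayed are paused. A video the
  // member interacted with keeps playing in the background.
  if (state.isPlaying && !state.userInteracted) {
    return "pause";
  }
  return "none";
}
```

Keeping the rule in a pure function like this makes it easy to check each case in isolation from the viewport-tracking code that calls it.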

Fine-grained control over settings
Given the adverse effect that autoplay can have on some users, it is essential that they have the ability to turn off this feature if desired. At LinkedIn, we expose a setting to our members that allows them to easily disable autoplay should they want to do so.

Site performance
Videos are data hogs: they require a lot of data to play and they attempt to download that data as fast as possible. Given the bandwidth limits of internet networks, coupled with the various limits put in place by desktop browsers, optimizing for video downloads can quickly degrade the loading performance of other assets on the page. For this reason, it is imperative that overall site performance be considered at the forefront of your autoplay strategy. We’ll dive into this topic more deeply in the next section.

Performance considerations

On the video team, we are constantly calibrating the aggressiveness of our video loading strategy. On one hand, we want to prioritize the downloading of video content so that our members are not spending too much time waiting for the videos to buffer. On the other hand, given the inherently large size of a video asset, we need to be careful to not place too much strain on our members’ networks as we request these assets from the server. Furthermore, the concern about asset size with regard to network strain increases with the number of videos on a given page; not only do we need to consider total data requested, but we are also concerned with the timeframe in which the data is downloaded, as browsers place a limit on how many simultaneous network requests can be processed. Below, we’ll take a closer look at the aforementioned considerations.

Network bandwidth
Network bandwidth can vary depending on a handful of factors, such as:

Location: Internet infrastructure can vary from region to region. For example, this Akamai State of the Internet Report from Q1 2017 states that the average connection speed in India was 6.5 Mbps, whereas the average connection speed in the United States at that time was 18.7 Mbps. It’s important that we keep our members with lower connection speeds in mind when designing an autoplay solution, as downloading every video asset that enters the viewport could quickly consume most or all of the network’s bandwidth.
Connection type: By connection type, we are referring to the mechanism through which a member is connecting to the internet (e.g., Ethernet, Wi-Fi, or mobile data). We take this information into account not only for discrepancies in connection speeds, but also because we want to be careful to not use up too much of our members’ data plans by automatically downloading video assets.
Device type: People can browse the internet from just about any device they own, be it a watch, phone, tablet, or a desktop computer. The browser implementations vary from device to device, specifically with regard to the number of concurrent network requests that can be processed. In the context of autoplay, we cannot afford to clog up the network by loading videos in the background in preparation for automatically playing them as they enter the viewport. Instead, we want to prioritize the downloading of the content that is currently within the viewport.

We can mitigate the above concerns by:

Giving members fine-grained control of when a video can autoplay (e.g., members on mobile devices can choose to have videos autoplay only when they are connected via Wi-Fi).
Queued loading, which is a strategy where videos are loaded via a queueing system. This system ensures that we are not downloading multiple videos simultaneously and that we are not prioritizing too heavily the downloading of videos over other content on the page.

Mobile data plans
Many of our members browse LinkedIn using their mobile data plans and we need to be respectful of the fact that videos can quickly consume a large amount of data. For this reason, videos will, by default, only autoplay on a mobile device when the device is connected to a Wi-Fi network. Furthermore, the loading process for a video on our mobile web experience does not begin until a member has interacted with it.
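
As an illustration of how the member setting and the connection type might combine, here is a minimal sketch. The setting values, the `AutoplayContext` shape, and the function names are assumptions made for this example; how the connection type is detected is outside its scope.

```typescript
type ConnectionType = "ethernet" | "wifi" | "cellular" | "unknown";
type AutoplaySetting = "always" | "wifi-only" | "never";

interface AutoplayContext {
  setting?: AutoplaySetting; // the member's explicit choice, if they made one
  connection: ConnectionType;
  isMobile: boolean;
}

// Assumed defaults: mobile devices autoplay on Wi-Fi only, desktop always.
function defaultSetting(isMobile: boolean): AutoplaySetting {
  return isMobile ? "wifi-only" : "always";
}

function shouldAutoplay(ctx: AutoplayContext): boolean {
  const setting = ctx.setting ?? defaultSetting(ctx.isMobile);
  switch (setting) {
    case "never":
      return false;
    case "wifi-only":
      // An unknown connection is treated like cellular to protect data plans.
      return ctx.connection === "wifi";
    case "always":
      return true;
  }
}
```

Treating an unknown connection as metered is the conservative choice here: a video that fails to autoplay costs a tap, while an unwanted download costs the member data.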

Scrolling performance
If your website displays long lists of information on a page, such as a feed of some sort, it’s likely that you are interacting with the browser’s scroll event. Given the rapid rate at which scroll events are fired, it’s paramount to understand the impact that doing DOM manipulations within the scroll event handler can have on overall page render performance. The browser does the majority of its render work within two cycles: reflows and repaints. As mentioned in this article by Google, a reflow computes the layout of the page and can occur when a CSS style is changed, a DOM node is moved, or a scroll event occurs, among other things. A repaint, on the other hand, occurs when a style change is made that affects a DOM node’s visual appearance but does not change the node’s layout, or position on the screen. The browser’s goal is to limit the number of reflows and repaints that occur, so it batches them whenever possible; the native requestAnimationFrame method lets page code schedule its DOM changes to run just before the next repaint, which helps multiple changes share a single reflow and repaint cycle.

With the above in mind, let’s take a look at how scrolling through the page can negatively impact the page’s render performance. When a user scrolls through a browser page, the browser is forced to recalculate the layout of DOM nodes that are moving with the scrolling page. If the page is doing any manipulation of DOM nodes within the event handler of the scroll event, the browser will once again be forced to reflow and repaint. Therefore, doing DOM manipulation within the scroll event handler can quickly become expensive and lead to a visual degradation known as layout thrashing.

To avoid making the browser work too hard, it’s important to debounce your scroll events. This ensures that the handler’s DOM work, and the reflow it triggers, happens only once the scrolling of the page has paused, as opposed to each time the page is scrolled.
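
A trailing-edge debounce can be written in a few lines. The sketch below is a generic implementation rather than LinkedIn's; the 150 ms wait and the `checkVisibleVideos` handler in the usage comment are hypothetical.

```typescript
// Returns a wrapper that delays calling `fn` until `waitMs` milliseconds
// have passed without another call. Only the last call's arguments are used.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A): void => {
    // Each new call cancels the pending one, so a burst of events
    // collapses into a single invocation.
    if (timer !== undefined) {
      clearTimeout(timer);
    }
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}

// Usage in a browser (hypothetical handler name):
//   window.addEventListener("scroll", debounce(checkVisibleVideos, 150), { passive: true });
```

The trade-off is latency: the handler runs `waitMs` after scrolling pauses, so the wait should be short enough that a video entering the viewport still starts promptly.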

Video loading strategies

When developing a video loading strategy, placing an emphasis on the aforementioned performance considerations is crucial if you want to ensure that all of your users have an optimal user experience on your site. Below, we will take a closer look at a few of the experiments, along with their respective pros and cons, that we have run at LinkedIn. Each of these experiments was carefully crafted with a focus on both video load time and overall site performance.

Eagerly loading all videos in the DOM

Figure: Eagerly loading videos in the LinkedIn feed

At LinkedIn, we have experimented with both eagerly and lazily loading videos. To eagerly load a video is to begin downloading the video as soon as it enters the DOM. This is different from lazily loading a video, in which the video is not downloaded until it enters the viewport. In the case of eager loading, videos are essentially loaded in the background.

The benefit of eagerly loading a video is that the video will have completed most, or all, of its buffering in the background. The more content that is loaded in the background, the less content needs to load once the video enters the viewport. Eagerly loaded videos therefore spend less time buffering in the viewport compared to a video that was not eagerly loaded.

The drawback of eagerly loading a video is most evident in scenarios where connection speeds are relatively low. As we download video assets in the background, there is less bandwidth available that can be used to download other content in the viewport. In addition to bandwidth concerns, we also need to keep in mind that browsers, both on mobile and desktop devices, are limited in the number of HTTP requests that they can handle in parallel. By fetching background resources, we are potentially causing the loading of content in the viewport to be delayed.

The bottom line is that, in the above diagram, all three videos are loaded as soon as the video element is attached to the DOM, regardless of whether or not the video element has entered the viewport.

Limited queued loading

Figure: Loading videos in the LinkedIn feed using a limited queue

The limited queued loading system addresses the drawbacks of both unlimited eager loading (high bandwidth and HTTP request usage) and the limitless loading queue described in the next section (high HTTP request usage) by placing a limit on how many videos can be eagerly loaded.

The benefit of the limited queue system is similar to that of the limitless loading queue: videos can load in the background before they enter the viewport. The benefit is smaller, however, because of the limit on how many videos can load in the background.

The drawback to the limited queue system is that we are limiting how many videos can be loaded in the background. This drawback is particularly evident in cases where a member has high network bandwidth and a feed that contains more videos than the maximum size of the queue.

Limitless loading queue

Figure: Loading videos in the LinkedIn feed using a limitless queue

The limitless loading queue aims to keep the benefits of eager loading while limiting its drawbacks. It works by adding all videos on the page, regardless of whether or not they are in the viewport, to a queue. Each video in the queue is loaded by the browser in the order that it was added to the queue. While many videos can exist in the queue simultaneously, we only ever load one video at a time, which ensures that we are not clogging the browser’s available HTTP connections.

A major benefit of the queued loading system is that videos outside of the viewport (i.e., in the background) can be eagerly loaded, allowing videos to play with little to no buffering once they enter the viewport.

A potential drawback to the limitless loading queue is that we may, in some cases, be requesting that the browser download a sizeable amount of data in a short amount of time if a member’s feed contains a lot of video updates. For our members with weaker network connections, this can cause a sluggish viewing experience and can have a negative impact on page load time.

The bottom line is that, in the above diagram, all three videos have the opportunity to load eagerly; the videos will not be loaded in parallel, however, which allows for the loading of content above the fold to be prioritized. We have found that the limitless loading queue creates the best balance between user experience and performance for our members.
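
The queued loading strategies can be sketched with a single class. This is an illustration under assumptions, not LinkedIn's implementation: the `VideoLoadQueue` name, the `LoadVideo` callback, and the `maxEager` parameter are hypothetical, and the cap is interpreted here as the total number of videos the queue will accept. Leaving `maxEager` at its default gives the limitless queue; passing a number gives a limited one.

```typescript
// Starts loading one video and resolves when enough of it has buffered.
// In a browser this might set preload and wait for a readiness event;
// that part is left to the caller.
type LoadVideo = (videoId: string) => Promise<void>;

class VideoLoadQueue {
  private pending: string[] = [];
  private accepted = 0;
  private loading = false;
  private load: LoadVideo;
  private maxEager: number;

  constructor(load: LoadVideo, maxEager: number = Infinity) {
    this.load = load;
    this.maxEager = maxEager;
  }

  // Returns false when the cap is reached; such a video would fall back
  // to loading lazily once it enters the viewport.
  enqueue(videoId: string): boolean {
    if (this.accepted >= this.maxEager) {
      return false;
    }
    this.accepted += 1;
    this.pending.push(videoId);
    void this.drain();
    return true;
  }

  // Loads queued videos strictly one at a time, in insertion order.
  private async drain(): Promise<void> {
    if (this.loading) {
      return;
    }
    this.loading = true;
    try {
      while (this.pending.length > 0) {
        const videoId = this.pending.shift() as string;
        try {
          await this.load(videoId);
        } catch {
          // A failed load should not block the videos behind it.
        }
      }
    } finally {
      this.loading = false;
    }
  }
}
```

Since only one `load` call is in flight at any time, the queue uses at most one connection for background video, leaving the rest for content above the fold.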

Conclusion

When building the autoplay feature at LinkedIn, we learned the hard way that it is a deceptively complicated feature. Everything from network bandwidth to browser rendering cycles to the device on which a member is browsing our website comes into play and affects our members’ overall experience with this feature. When used responsibly, autoplay can have an immediate positive effect on your site’s visitor engagement because website users in the twenty-first century crave content and, more importantly, crave fast content. Video is a fantastic medium through which we can deliver lots of content to our members, but without a holistic approach to performance, the feature would likely degrade the overall user experience. We hope that this reflection on our experience proves useful to those of you looking to implement autoplay on your own site.

Acknowledgements

Shoutout to the video team at LinkedIn NYC who built, and continues to iterate on, our autoplay feature. Thanks to the LinkedIn Engineering Blog team for their help in editing this post.
