Building a Native Video Player Library for Android

Jack K.

Machine Learning, Deep Learning, NLP, Computer Vision

August 2, 2016

At LinkedIn, we recognize that video has become a popular medium for people to communicate and share information. We recently launched a feature where members can hear directly from Influencers on timely and thought-provoking topics through the rich experience of video. In this post, we will discuss some of the technical challenges involved in developing a shared video player library for Android. This library powers the video viewing experience on LinkedIn.

Mobile constraints

Compared to text and images, video requires far more bandwidth and can become costly for members who are on metered data plans. The video viewing experience is also highly sensitive to the spotty coverage and unreliable connections that are common on mobile networks.

To support the many LinkedIn app teams, we have built a shared native video player library that will power video viewing experiences across our apps.

Video delivery

Mobile connection bandwidth often fluctuates across locations and time of day. A device may drop out of 4G coverage and fall back to a slower cell tower. Network utilization, which affects network speed, can vary from hour to hour. For reasons like these, it is important that the video delivery be adaptive to dynamic network conditions.

First-party LinkedIn videos are relatively short: less than 30 seconds long. Because the videos are so brief, we believe that changing video quality in the midst of playback would result in a jarring member experience. For this reason, we select the video quality only once, at the start of playback. Since that removes the need for mid-stream adaptation, we chose progressive download instead of HLS as the video delivery mechanism. The team decided we would revisit HLS when we support longer videos in the future.

All else being equal, a device on a fast Wi-Fi connection should show a higher quality video stream than a device on a slow connection. But unlike HLS, a progressive download is unaware of network conditions. So we came up with a hybrid approach: the player measures network throughput itself and uses that estimate to choose which progressive-download stream to request.

We designed a Bandwidth Meter that measures current network throughput between the device and the video content delivery network. Every time video data is read from the network stream, the data size and the associated read start and end times are recorded. For each measurement, we calculate the estimated bandwidth (in bytes per second) by dividing the number of bytes downloaded by the number of seconds the read took. In real time, we compute an exponential moving average of the collected bandwidth data points (so newer measurements carry more weight than older ones) to estimate the current network throughput. The estimated throughput is a key factor in selecting the optimally encoded video stream.
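To make the measurement concrete, here is a minimal Java sketch of an exponential-moving-average estimator along the lines described above. It is an illustration, not the library's actual code: the class name, the method names, the smoothing factor `alpha`, and the choice to seed the average with the first sample are all assumptions.

```java
/** Hypothetical EMA-based throughput estimator; names and parameters are illustrative. */
final class BandwidthMeter {
    private final double alpha;          // assumed smoothing factor in (0, 1]; higher favors new samples
    private double estimateBytesPerSec;  // current exponential moving average
    private boolean hasSample;           // false until the first valid read is recorded

    BandwidthMeter(double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Records one network read: bytes received plus read start/end timestamps in milliseconds. */
    synchronized void onRead(long bytes, long startMs, long endMs) {
        long elapsedMs = endMs - startMs;
        if (bytes <= 0 || elapsedMs <= 0) {
            return; // skip degenerate samples rather than divide by zero
        }
        double sampleBytesPerSec = bytes * 1000.0 / elapsedMs;
        if (!hasSample) {
            estimateBytesPerSec = sampleBytesPerSec; // seed the average with the first sample
            hasSample = true;
        } else {
            estimateBytesPerSec = alpha * sampleBytesPerSec + (1.0 - alpha) * estimateBytesPerSec;
        }
    }

    /** Returns the estimated throughput in bytes per second, or -1 if nothing has been measured. */
    synchronized double getEstimateBytesPerSec() {
        return hasSample ? estimateBytesPerSec : -1.0;
    }
}
```

With `alpha` set to 0.5, a first read of 1,000 bytes in one second followed by a read of 3,000 bytes in one second yields an estimate of 2,000 bytes per second. A real meter would likely also filter out very small reads, whose timings are dominated by latency rather than throughput.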

Each source video is encoded into a range of bitrates optimized for a variety of network types and screen sizes. JSON metadata containing the stream info (including a set of stream URLs) is sent to the video player. The player chooses the stream that will result in the best video viewing experience given the current network condition. The player avoids choosing a bitrate that is too high, as that would result in long and frequent buffering disruptions. It also avoids choosing a bitrate that is too low, since the player would then not be showing the best video quality possible.
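One plausible selection rule, sketched below in Java, is to pick the highest bitrate that fits within a fraction of the estimated throughput and to fall back to the lowest bitrate when nothing fits. The post does not spell out the actual rule, so the `VideoStream` and `StreamSelector` classes, the `safetyFactor` headroom, and the example URLs are all hypothetical.

```java
import java.util.List;

/** Hypothetical description of one encoded variant, as it might be parsed from the JSON metadata. */
final class VideoStream {
    final String url;
    final long bitsPerSecond;

    VideoStream(String url, long bitsPerSecond) {
        this.url = url;
        this.bitsPerSecond = bitsPerSecond;
    }
}

final class StreamSelector {
    /**
     * Picks the highest-bitrate stream that fits within safetyFactor * estimated throughput.
     * Falls back to the lowest-bitrate stream when none fits.
     */
    static VideoStream select(List<VideoStream> streams, double estimatedBytesPerSec, double safetyFactor) {
        if (streams.isEmpty()) {
            throw new IllegalArgumentException("at least one stream is required");
        }
        // Convert bytes/s to bits/s and keep some headroom (the factor is an assumed tuning knob).
        double budgetBitsPerSec = estimatedBytesPerSec * 8.0 * safetyFactor;
        VideoStream best = null;
        VideoStream lowest = streams.get(0);
        for (VideoStream stream : streams) {
            if (stream.bitsPerSecond < lowest.bitsPerSecond) {
                lowest = stream;
            }
            boolean fits = stream.bitsPerSecond <= budgetBitsPerSec;
            if (fits && (best == null || stream.bitsPerSecond > best.bitsPerSecond)) {
                best = stream;
            }
        }
        return best != null ? best : lowest;
    }
}
```

For example, with variants at 400 kbps, 1 Mbps, and 2.5 Mbps, an estimate of 200,000 bytes per second, and a safety factor of 0.75, the budget is 1.2 Mbps and the 1 Mbps variant is chosen. Because the choice is made once at the start of playback, the headroom matters more here than it would in an adaptive scheme that can correct itself mid-stream.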

[Figure: A simplified view of the key components of the LinkedIn video player library.]

Video performance monitoring

We recently updated our internally developed Android performance monitoring library to support several new video performance metrics, such as buffering ratio, which is key to member engagement. Data points such as buffering start time and end time are uploaded to our data center and processed by Kafka and Samza. Our video performance dashboards alert us to streaming problems in real time. This monitoring allows us to experiment with various encoding parameters to optimize for mobile delivery (i.e., optimizing encoding and serving costs while delivering a great video viewing experience). We can even run A/B tests to compare different encoding parameters and evaluate their performance. The performance data also help pinpoint issues that may be specific to particular devices or OS versions.
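As an illustration of how a metric like buffering ratio can be derived from the uploaded start and end times, the Java sketch below uses one common definition: time spent buffering divided by total session time. That definition, the `BufferingRatio` class, and the assumption that buffering intervals do not overlap are not taken from the monitoring library itself.

```java
import java.util.List;

/** Hypothetical helper; assumes buffering ratio = buffering time / session time. */
final class BufferingRatio {
    /**
     * @param bufferingIntervals each element is {startMs, endMs} for one buffering event;
     *                           intervals are assumed not to overlap
     * @return a value in [0, 1], or 0 if the session has no duration
     */
    static double compute(List<long[]> bufferingIntervals, long sessionStartMs, long sessionEndMs) {
        long sessionMs = sessionEndMs - sessionStartMs;
        if (sessionMs <= 0) {
            return 0.0;
        }
        long bufferingMs = 0;
        for (long[] interval : bufferingIntervals) {
            // Clamp each interval to the session window and ignore malformed ones.
            long start = Math.max(interval[0], sessionStartMs);
            long end = Math.min(interval[1], sessionEndMs);
            if (end > start) {
                bufferingMs += end - start;
            }
        }
        return (double) bufferingMs / sessionMs;
    }
}
```

Two buffering events of 500 ms each in a 10-second session give a ratio of 0.1 under this definition.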

Building off an existing Android video player

We evaluated a few choices for Android video players:

Google ExoPlayer
Android MediaPlayer (Android SDK)
Third-party commercial solutions

When evaluating options, our requirements were:

Support of HLS and progressive downloads;
Native hardware decoding;
Closed Captioning;
Ease of extensibility/customizability;
High performance (power consumption and memory overhead);
And ideally, support of Android SDK version 15 and above.

We chose ExoPlayer, as it satisfies all of our requirements except the last (support of SDK version 15). ExoPlayer takes advantage of hardware decoding, so battery life is conserved. The big draw for ExoPlayer is that it is open source and architected to follow the open/closed principle, meaning that its various components are open for extension but closed for modification. For example, without modifying the ExoPlayer source code, we replaced the default networking stack with the LinkedIn networking library by creating a custom implementation of com.google.android.exoplayer.upstream.HttpDataSource.

Memory optimization: Video player recycling

A member's news feed can potentially contain tens or even hundreds of videos. A naive approach is to create a new video player for each video in the feed. With this approach, memory costs grew with every player instance created, and the app became increasingly sluggish. Eventually, an OutOfMemoryError was raised on some older devices.

To optimize memory usage, we reused a single instance of the player. This is fine for our use case because at most, only one video should be playing at any time. When a video is not playing, we show a still video frame as a placeholder where the video player would be drawn.
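The recycling idea can be sketched in plain Java as a holder that creates one player lazily and hands it to whichever feed item is currently active. The `SimplePlayer` interface and `SharedPlayerHolder` class are hypothetical stand-ins; a real implementation would wrap the actual player and also manage surfaces and placeholder frames.

```java
import java.util.function.Supplier;

/** Hypothetical minimal player contract; stands in for the real player. */
interface SimplePlayer {
    void play(String videoId);
    void stop();
}

/** Lends a single, lazily created player to whichever feed item is currently playing. */
final class SharedPlayerHolder {
    private final Supplier<SimplePlayer> factory;
    private SimplePlayer player;   // created once, then reused for every video
    private String activeVideoId;  // null when nothing is playing

    SharedPlayerHolder(Supplier<SimplePlayer> factory) {
        this.factory = factory;
    }

    /** Starts playing videoId, stopping whatever was playing before. */
    void play(String videoId) {
        if (player == null) {
            player = factory.get();
        }
        if (activeVideoId != null) {
            player.stop(); // the previous feed item goes back to showing its placeholder frame
        }
        activeVideoId = videoId;
        player.play(videoId);
    }

    /** Stops playback, if any, but keeps the player instance for later reuse. */
    void stop() {
        if (player != null && activeVideoId != null) {
            player.stop();
            activeVideoId = null;
        }
    }

    String activeVideoId() {
        return activeVideoId;
    }
}
```

However many feed items call `play`, the factory runs only once, so memory use stays flat as the member scrolls.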

Data usage optimization: Video cache

Downloading a typical one-minute video may consume a few megabytes of data. An effective video caching strategy can dramatically improve video replay start time while conserving precious mobile data. ExoPlayer doesn’t come with a caching mechanism, so we had to build our own.

Older Android phones have less than 32 MB of heap space available to each running app. Just a few 5-minute-long HD videos can take up all of the heap memory available, so a disk cache is far more practical than an in-memory cache. We decided to cache videos in the internal cache directory, which is private to the app, rather than in the less secure external storage, which is accessible to other apps.

Android doesn’t provide any disk cache management functionality, but it does provide a getCacheDir() method that returns the app-specific cache directory on the file system. Even though the cache directory returned from getCacheDir() is private to an app, the total cache space is shared among all apps on a device. To ensure we don’t use more than our fair share of the cache space, the video cache trims itself according to the well-known least recently used (LRU) eviction policy. When the number of stored bytes exceeds a limit we have set, the cache removes the least recently used video files one by one until the total is back under the limit. We came up with a simple heuristic that derives the video cache size limit from the total available disk space on the device.
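The byte-budgeted LRU trimming described above can be sketched with an access-ordered `LinkedHashMap` that tracks file sizes. This is a simplified Java illustration rather than the library's implementation: the `LruByteBudget` class is hypothetical, it only does the bookkeeping, and the caller is assumed to delete the files for the keys it returns.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical bookkeeping for a size-limited LRU disk cache; file I/O is left to the caller. */
final class LruByteBudget {
    private final long maxBytes;
    private long totalBytes;
    // accessOrder = true keeps entries ordered from least to most recently used.
    private final LinkedHashMap<String, Long> sizes = new LinkedHashMap<>(16, 0.75f, true);

    LruByteBudget(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Marks key as recently used; returns true if it is currently tracked. */
    synchronized boolean touch(String key) {
        return sizes.get(key) != null; // get() moves the entry to the most-recent end
    }

    /** Records a cached file and returns the keys whose files should now be deleted. */
    synchronized List<String> put(String key, long sizeBytes) {
        Long previous = sizes.put(key, sizeBytes);
        totalBytes += sizeBytes - (previous == null ? 0L : previous);
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = sizes.entrySet().iterator();
        // Remove least recently used entries one by one until the total fits the limit.
        while (totalBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, Long> eldest = it.next();
            totalBytes -= eldest.getValue();
            evicted.add(eldest.getKey());
            it.remove();
        }
        return evicted;
    }

    synchronized long totalBytes() {
        return totalBytes;
    }
}
```

A real cache would also need to rebuild this index from the cache directory at startup, for example by ordering files by last-modified time, since the in-memory map does not survive process death.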

Total Disk Space	Max Video Cache Size
>500 MB	100 MB
200-500 MB	40 MB
<200 MB	30 MB
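The table translates directly into a small lookup function. In the Java sketch below, the class name, the use of 1 MB = 1,048,576 bytes, and the treatment of the exact 200 MB and 500 MB boundaries as part of the middle tier are assumptions; the table itself does not specify them.

```java
/** Hypothetical helper that maps total disk space to a video cache size limit. */
final class VideoCacheLimits {
    private static final long MB = 1024L * 1024L; // assumed definition of a megabyte

    /** Returns the maximum video cache size in bytes for a device with the given disk space. */
    static long maxCacheBytes(long totalDiskBytes) {
        if (totalDiskBytes > 500L * MB) {
            return 100L * MB;
        }
        if (totalDiskBytes >= 200L * MB) {
            return 40L * MB;
        }
        return 30L * MB;
    }
}
```

On a device, the input could come from a call such as `File.getTotalSpace()` on the cache directory, though which measure of disk space the library actually uses is not stated here.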

A naive implementation may use a simple hash (e.g., MD5 hash) of the video URL as the cache key. This turns out to be suboptimal, as our video processing backend encodes each source video into multiple bitrates addressed by different URLs. If the device is connected to Wi-Fi, the video player may fetch a high bitrate/resolution version of the video. If the member watches the same video again later while connected to a slower cell network, this URL cache key scheme would result in a cache miss, because the lower bitrate version of the video will have a different URL. To solve this problem, we use the "video URN" generated by our backend service as the cache key. The video URN identifies a specific piece of video content independent of encoding parameter variations.
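To illustrate the keying scheme, the Java sketch below derives a cache file name from the video URN alone, so every bitrate variant of the same content maps to the same file. The `CacheKeys` class, the SHA-256 hashing step (used here only to get a file-system-safe name), the `.video` suffix, and the example URN format are all assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that turns a video URN into a stable, file-system-safe cache file name. */
final class CacheKeys {
    static String fileNameFor(String videoUrn) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(videoUrn.getBytes(StandardCharsets.UTF_8));
            StringBuilder name = new StringBuilder(digest.length * 2 + 6);
            for (byte b : digest) {
                name.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return name.append(".video").toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the stream URL never enters the key, a lookup for a made-up URN such as `urn:li:video:123` resolves to the same file whether the player would have requested the high-bitrate or the low-bitrate URL.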

With the disk cache, video replays have a near-zero buffering ratio, which is great for the member experience and also reduces our members’ data usage.

Conclusion

Video presents unique challenges on mobile devices and networks. To address those challenges, we’ve made use of dynamic bitrate selection, video metrics monitoring, caching to disk, and more in the mobile player. We expect to continue to refine and evolve our video architecture to support innovative video use cases and may open source this library if there is enough interest from the community.
