Feature Highlight: Scaling Autoplay Videos for Hundreds of Millions

On the video team, we’re always looking for ways to improve our members’ experience with video. In late 2016, LinkedIn’s video feature was still young and our autoplay feature remained in the planning phases. Two years have since passed, and while autoplay has become a key component of the LinkedIn video experience, we’re still working on perfecting this feature due to its complex product requirements and inherent performance implications. This post will outline our product criteria for autoplay, along with the technology and architecture developed by our engineers to support it. Finally, we will take a look at some performance challenges that we faced in building an autoplay solution that can scale to hundreds of millions of members.

Technical terms

This post will refer to several frontend terms and technologies, which we define as follows:

  • iframe: An iframe is an element that can render the content of external web pages inside of itself. At LinkedIn, we use iframes to render videos from third-party domains (e.g., YouTube, Vimeo) directly within our site.

  • Viewport (or “above the fold”): The portion of a website that is visible on the screen.

  • Spaniel: LinkedIn’s in-house solution for tracking elements as they move in and out of the viewport.

  • postMessage: postMessage is a native browser technology that allows two websites on different domains to communicate with one another. We use postMessage to interact with the video APIs that third-party domains provide.

  • Publish-subscribe (pub-sub) pattern: A communication pattern used by applications in which programmatic events are not sent to specific subscribers, but are instead blindly emitted without knowing which components within the application may be subscribed to the events.

  • Debounce: Delaying a call to a particular method until a given amount of time has passed since it was last invoked, which limits how many times the method can run within that timeframe. Debouncing is especially useful when dealing with user interactions (e.g., scrolling quickly through the page) that can cause many events to occur in a short amount of time.

  • DOM: The Document Object Model, which represents the web page as a tree made up of many content nodes.

Product criteria

From both an engineering and a product perspective, autoplay is one of the more complicated features that we’ve built on the video team because with autoplay the devil is in the details. We focused on several key criteria:

  • only one video can play at a time;

  • autoplayed videos should pause as they exit the viewport (the caveat to this rule is if a user had interacted with the video; more on this below); and

  • when a user interacts with a video, or any of its controls, the video should play with sound and should not pause as it exits the viewport.
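The per-video rules above can be modeled as a small state transition function. The following is a minimal sketch under stated assumptions, not LinkedIn’s actual code: the names VideoState, VideoEvent, and nextState are hypothetical, and the "only one video at a time" rule is left out because it spans multiple videos.

```typescript
// Hypothetical per-video state; all names here are illustrative.
interface VideoState {
  playing: boolean;
  muted: boolean;
  userInteracted: boolean;
}

type VideoEvent = "enterViewport" | "exitViewport" | "userInteraction";

// Assumption: a video starts paused, muted, and untouched by the member.
const initialState: VideoState = {
  playing: false,
  muted: true,
  userInteracted: false,
};

function nextState(state: VideoState, event: VideoEvent): VideoState {
  switch (event) {
    case "enterViewport":
      // Autoplay on entry; the mute setting is left unchanged.
      return { ...state, playing: true };
    case "exitViewport":
      // Autoplayed videos pause on exit, unless the member has interacted.
      return state.userInteracted ? state : { ...state, playing: false };
    case "userInteraction":
      // Interaction plays the video with sound and exempts it from pausing.
      return { playing: true, muted: false, userInteracted: true };
  }
}
```

Keeping the rules in a pure function like this makes each criterion easy to check in isolation before wiring it to real video elements.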

Architecture overview

There are four main aspects to LinkedIn’s autoplay architecture:

  • HTML5 video: This is the browser’s native video implementation.

  • Video manager: A singleton responsible for keeping track of which videos are playing, and whether or not they are playing with sound. The video manager controls which videos play via an event emitter, which uses the pub-sub pattern.

  • Video wrapper: A JavaScript object that wraps HTML5 video, communicates with the video manager’s public API, and subscribes to events emitted by the video manager.

  • Viewport management: We use Spaniel to keep track of the video elements as they move in and out of the viewport. Viewport management plays an important role in each of our video loading strategies, which is a topic that we’ll cover later on in this post.
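To make the relationship between the video manager and the video wrapper more concrete, here is a minimal sketch of a pub-sub based manager that enforces the one-video-at-a-time rule. It is an illustration under assumptions rather than LinkedIn’s implementation: Emitter, VideoManager, VideoWrapper, Playable, and requestPlay are hypothetical names, Playable stands in for the native HTML5 video element, and sound tracking and Spaniel integration are omitted.

```typescript
type Listener = (activeId: string) => void;

// Minimal pub-sub emitter: the publisher does not know who is subscribed.
class Emitter {
  private listeners = new Set<Listener>();

  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    // Return an unsubscribe function.
    return () => {
      this.listeners.delete(listener);
    };
  }

  emit(activeId: string): void {
    this.listeners.forEach((listener) => listener(activeId));
  }
}

// Hypothetical singleton that tracks which video is currently playing.
class VideoManager {
  private static instance: VideoManager | undefined;
  readonly events = new Emitter();
  private activeId: string | null = null;

  private constructor() {}

  static get(): VideoManager {
    if (!VideoManager.instance) {
      VideoManager.instance = new VideoManager();
    }
    return VideoManager.instance;
  }

  get active(): string | null {
    return this.activeId;
  }

  // Record the new active video and announce it to every subscriber.
  requestPlay(id: string): void {
    this.activeId = id;
    this.events.emit(id);
  }
}

// Stand-in for the parts of the HTML5 video element used here (assumption).
interface Playable {
  play(): void;
  pause(): void;
}

class VideoWrapper {
  readonly id: string;
  private media: Playable;
  private manager: VideoManager;

  constructor(id: string, media: Playable, manager: VideoManager = VideoManager.get()) {
    this.id = id;
    this.media = media;
    this.manager = manager;
    // Pause whenever a different video becomes the active one.
    this.manager.events.subscribe((activeId) => {
      if (activeId !== this.id) this.media.pause();
    });
  }

  play(): void {
    this.manager.requestPlay(this.id);
    this.media.play();
  }
}
```

Because wrappers only react to emitted events, the manager never needs a direct reference to any individual video, which is the main appeal of the pub-sub pattern in this setting.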

User experience considerations

Autoplay is a naturally complicated feature to get right because there is a lot to consider from a user experience perspective. Below are a few of the many aspects of user experience that we took into account when building this feature.

Viewing context
Video can show up in a substantial number of contexts on LinkedIn.com, from the feed to private messages to learning playlists. Each of these contexts requires its own consideration, because our members interact with each of these pages in different ways. In the feed, for example, we have the unique challenge of having to manage a collection of videos at once. We have found that our members do not want videos to automatically play with sound, but do want videos to unmute once they have been interacted with. Within the LinkedIn Learning app, videos are loaded as playlists and each successive video needs to respect the volume setting from the video that played before it. It’s important to do a deep dive on the various contexts in which your users will be interacting with video and tailor a unique autoplay solution for each case.

The viewport
In the context of the LinkedIn feed on desktop, videos are played as soon as they enter the viewport, and paused when they exit the viewport. The exception to this rule is when a video is playing with sound: in this case, we assume that the member has shown enough interest in the video that they will want it to keep playing in the background as they scroll through the feed.

Fine-grained control over settings
Given the adverse effect that autoplay can have on some users, it is essential that they have the ability to turn off this feature if desired. At LinkedIn, we expose a setting to our members that allows them to easily disable autoplay should they want to do so.

Site performance
Videos are data hogs: they require a lot of data to play and they attempt to download that data as fast as possible. Given the bandwidth limits of internet networks, coupled with the various limits put in place by desktop browsers, optimizing for video downloads can quickly degrade the loading performance of other assets on the page. For this reason, it is imperative that overall site performance is at the forefront of your autoplay strategy. We’ll dive into this topic more deeply in the next section.

Performance considerations

On the video team, we are constantly calibrating the aggressiveness of our video loading strategy. On one hand, we want to prioritize the downloading of video content so that our members are not spending too much time waiting for the videos to buffer. On the other hand, given the inherently large size of a video asset, we need to be careful to not place too much strain on our members’ networks as we request these assets from the server. Furthermore, the concern about asset size with regard to network strain increases with the number of videos on a given page: not only do we need to consider the total data requested, but we are also concerned with the timeframe in which the data is downloaded, as browsers limit how many simultaneous network requests can be processed. Below, we’ll take a closer look at the aforementioned considerations.

Network bandwidth
Network bandwidth can vary depending on a handful of factors, such as:

  • Location: Internet infrastructure can vary from region to region. For example, this Akamai State of the Internet Report from Q1 2017 states that the average connection speed in India was 6.5 Mbps, whereas the average connection speed in the United States at that time was 18.7 Mbps. It’s important that we keep our members with lower connection speeds in mind when designing an autoplay solution, as downloading every video asset that enters the viewport could quickly consume most or all of the network’s bandwidth.

  • Connection type: By connection type, we are referring to the mechanism through which a member is connecting to the internet (e.g., Ethernet, Wi-Fi, or mobile data). We take this information into account not only for discrepancies in connection speeds, but also because we want to be careful to not use up too much of our members’ data plans by automatically downloading video assets.

  • Device type: People can browse the internet from just about any device they own, be it a watch, phone, tablet, or desktop computer. Browser implementations vary from device to device, specifically with regard to the number of concurrent network requests that can be processed. In the context of autoplay, we cannot afford to clog up the network by loading videos in the background in preparation for automatically playing them as they enter the viewport. Instead, we want to prioritize the downloading of the content that is currently within the viewport.

We can mitigate the above concerns by:

  • Giving members fine-grained control of when a video can autoplay (e.g., members on mobile devices can choose to have videos autoplay only when they are connected via Wi-Fi).

  • Queued loading, which is a strategy where videos are loaded via a queueing system. This system ensures that we are not downloading multiple videos simultaneously and that we are not prioritizing too heavily the downloading of videos over other content on the page.
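One simple way to express queued loading is a promise chain that starts each load only after the previous one has settled. The sketch below illustrates that general technique and is not a description of LinkedIn’s queueing system: LoadQueue and LoadTask are hypothetical names, and prioritizing video downloads against other page content is out of scope here.

```typescript
// A load task is any function that starts a download and resolves when done.
type LoadTask = () => Promise<void>;

class LoadQueue {
  // Tail of the chain; every new task waits for this promise to settle.
  private tail: Promise<void> = Promise.resolve();

  // Schedule a task to run after all previously queued tasks.
  // The returned promise settles with the outcome of this task.
  enqueue(task: LoadTask): Promise<void> {
    const run = this.tail.then(task);
    // Swallow failures on the chain so one failed load does not block the rest.
    this.tail = run.catch(() => undefined);
    return run;
  }
}
```

With this shape, a caller can queue the videos that are currently in the viewport in order, and at most one download is in flight at any moment.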

Mobile data plans
Many of our members browse LinkedIn using their mobile data plans, and we need to be respectful of the fact that videos can quickly consume a large amount of data. For this reason, videos will, by default, only autoplay on a mobile device when the device is connected to Wi-Fi. Furthermore, on our mobile web experience, the loading process for a video does not begin until a member has interacted with it.
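The connection-based default described above boils down to a small decision function. This is a hedged sketch only: ConnectionType, AutoplayPreference, and shouldAutoplay are hypothetical names, the "always" option and the treatment of Ethernet as equivalent to Wi-Fi are assumptions, and how the connection type is detected is deliberately left out.

```typescript
// Hypothetical connection categories; detection is out of scope for this sketch.
type ConnectionType = "ethernet" | "wifi" | "cellular" | "unknown";

// Hypothetical member setting. "wifiOnly" mirrors the default described above;
// "never" mirrors the ability to turn autoplay off; "always" is an assumption.
type AutoplayPreference = "always" | "wifiOnly" | "never";

function shouldAutoplay(
  connection: ConnectionType,
  preference: AutoplayPreference = "wifiOnly",
): boolean {
  switch (preference) {
    case "never":
      return false;
    case "always":
      return true;
    case "wifiOnly":
      // Assumption: Ethernet counts as unmetered, and unknown connections
      // are treated conservatively so data plans are not used by surprise.
      return connection === "wifi" || connection === "ethernet";
  }
}
```

For example, shouldAutoplay("cellular") evaluates to false under the default preference, while shouldAutoplay("wifi") evaluates to true.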

Scrolling performance
If your website displays long lists of information on a page, such as a feed of some sort, it’s likely that you are interacting with the browser’s scroll event. Given the rapid rate at which scroll events are fired, it’s paramount to understand the impact that doing DOM manipulations within the scroll event handler can have on overall page render performance. The browser does the majority of its render work within two cycles: reflows and repaints. As mentioned in this article by Google, a reflow computes the layout of the page and can occur when a CSS style is changed, a DOM node is moved, or a scroll event occurs, among other things. A repaint, on the other hand, occurs when a style change is made that affects a DOM node’s visual appearance but does not change the node’s layout, or position on the screen. The browser’s goal is to limit the number of reflows and repaints that occur, so it batches this work whenever possible; the native requestAnimationFrame method lets page code schedule its own DOM updates to run just before the next repaint.

With the above in mind, let’s take a look at how scrolling through the page can negatively impact the page’s render performance. When a user scrolls through a browser page, the browser is forced to recalculate the layout of DOM nodes that are moving with the scrolling page. If the page is doing any manipulation of DOM nodes within the event handler of the scroll event, the browser will once again be forced to reflow and repaint. Therefore, doing DOM manipulation within the scroll event handler can quickly become expensive and lead to a visual degradation known as layout thrashing.

To avoid making the browser work too hard, it’s important to debounce your scroll events. This ensures that a reflow only occurs once the scrolling of the page has stopped, as opposed to each time the page is scrolled.
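A generic trailing-edge debounce helper can be sketched in a few lines. This is a common textbook form of the technique, not the helper used at LinkedIn; the names debounce and waitMs are illustrative.

```typescript
// Returns a wrapper that delays calling `fn` until `waitMs` milliseconds
// have passed without the wrapper being called again.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;

  return (...args: A): void => {
    // Each new call cancels the pending one, so only the last call survives.
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

In a browser, a scroll handler wrapped this way (for example, passed to window.addEventListener("scroll", ...)) would do its DOM work once after scrolling pauses rather than on every scroll event. The right wait time is a tuning decision that depends on the page.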

Video loading strategies

When developing a video loading strategy, placing an emphasis on the aforementioned performance considerations is crucial if you want to ensure that all of your users have an optimal user experience on your site. Below, we will take a closer look at a few of the experiments, along with their respective pros and cons, that we have run at LinkedIn. Each of these experiments was carefully crafted with a focus on both video load time and overall site performance.

Eagerly loading all videos in the DOM
