Infrastructure

Accelerating the LinkedIn Experience with Azure Front Door

Co-authors: Samir Jafferali, Viranch Mehta, and Thanglalson Gangte

We announced the completion of LinkedIn’s migration to Azure’s edge offering, Azure Front Door (AFD), in June of 2020 and since then we have seen numerous benefits from the switch. Microsoft has continued to aggressively expand the AFD footprint to new countries, giving us quick and direct access to our members’ ISPs, which has resulted in measurable latency improvements, especially in Africa. We also continue to ramp up additional LinkedIn web properties like Real-Time Bidding and the lnkd.in URL shortener. Being on Azure has also enabled us to streamline our site operations and start redesigning our infrastructure for greater scalability. In this blog post, we’ll explain some of the technology that runs the edge, further explore the performance wins we’ve seen, and recap interesting challenges from this multi-year migration.

The technology that runs the edge

How do Points of Presence (PoPs) accelerate traffic?
The bytes that make up the LinkedIn experience are delivered to your devices from LinkedIn’s U.S. data centers. Data flows from our data centers (first mile), through the internet (middle mile), and is then handed off to the member’s ISP (last mile). 

image-of-how-data-flows-through-miles

Fig 1. The three different internet “miles” that traffic flows through

Accelerating this byte stream is complex because the disciplines involved span physical infrastructure and software stacks across the network, hardware, and application spaces. Hundreds of parameters affect performance, but the one with the most impact is the latency between the server and the client. Because of the geographic distance between some members and our data centers, sending members directly to one of our U.S. data centers would introduce significant latency. To minimize this, HTTP proxies are distributed geographically in points of presence (PoPs) closer to members’ last-mile networks.
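
To make the distance argument concrete, here is a back-of-the-envelope sketch (not a LinkedIn measurement) of the round-trip floor that fiber distance alone imposes, assuming the common rule of thumb that light travels roughly 200 km per millisecond in fiber; the distances are illustrative:

```python
# Back-of-the-envelope sketch: the RTT floor that fiber distance alone imposes.
# The ~200 km/ms propagation speed (about 2/3 the speed of light) is a rule of
# thumb; the distances below are illustrative assumptions, not measurements.
FIBER_KM_PER_MS = 200

def min_rtt_ms(path_km: float) -> float:
    """Lower bound on round-trip time over a fiber path of the given length."""
    return 2 * path_km / FIBER_KM_PER_MS

print(f"~13,000 km to a U.S. data center: {min_rtt_ms(13_000):.0f} ms RTT floor")
print(f"~100 km to an in-country PoP:     {min_rtt_ms(100):.0f} ms RTT floor")
```

Real paths are longer than the straight-line distance and add queuing and processing delay, so observed round-trip times are higher, but the relative gap is what terminating connections at a nearby PoP attacks.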

We started out by building our very own in-house CDN (Content Delivery Network) for dynamic content, spanning 19 PoPs across 10 countries. By migrating to AFD, we’re able to extend our reach to 300 PoPs, many in countries where we had no PoP presence previously. The following animation illustrates some of the advantages of leveraging a close-by PoP versus a far-away data center.

Fig 2. How Points of Presence (PoPs) accelerate HTTP

The first area of optimization is between the member and the PoP; traffic in this last mile normally traverses the public internet. Leveraging the peering between our members’ ISPs and Azure PoPs creates an express lane, so member traffic flows directly into the Microsoft infrastructure.

The resulting low latency greatly improves the TCP handshake times, SSL setup and resumption times, the growth of various protocol timing windows, and the ability to correct for lost packets and retries. Low latency also reduces time and distance on the wire, thereby reducing opportunities for buffering, jitter, or packet loss. This all translates into an accelerated and reliable transport of our members’ data. 
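
The round-trip time matters even more because connection setup multiplies it: the TCP three-way handshake costs one round trip, and a full TLS handshake costs one to two more before the first byte of application data. The sketch below uses assumed RTT values purely for illustration to show how terminating connections at a nearby PoP shrinks that setup cost:

```python
# Illustrative only: connection setup cost as a multiple of RTT.
# One RTT for the TCP three-way handshake, plus one (TLS 1.3) or two (TLS 1.2)
# RTTs for the TLS handshake. The RTT figures are assumptions, not measurements.
def setup_ms(rtt_ms: float, tls_round_trips: int = 2) -> float:
    return rtt_ms * (1 + tls_round_trips)

for label, rtt in [("terminate TLS at a nearby PoP", 20),
                   ("terminate TLS at a U.S. data center", 250)]:
    print(f"{label:38s}: ~{setup_ms(rtt):.0f} ms before the first request is sent")
```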

Cold potato and hot potato routing
As the member’s traffic is forwarded by the PoP toward LinkedIn’s data centers, it traverses the middle mile. This leg of the journey is usually where the most time is spent and can span continents or oceans. One of the design considerations in our migration was how to accelerate this middle mile. Even though our edge had migrated to AFD, we had various choices for how traffic traveled back to LinkedIn’s data centers.

image-of-AFD-PoP-sending-traffic-to-LinkedIn-data-center

Fig 3. Method 1 (Hot potato): AFD PoP sends traffic to LinkedIn data center over public internet

The name “hot potato” comes from the children’s game, where the holder quickly tosses the “hot potato” to their neighbor. In the rightward journey depicted above, ingressing toward the data center, traffic at the MSFT PoP is immediately handed off to the internet, which is why this is considered hot potato. Once on the public internet, we have little control over how the traffic is routed. While this option lowers cost, it comes at the expense of speed and availability. This is the approach commonly employed by CDNs.

image-of-AFD-PoP-sending-traffic-to-closest-LinkedIn-PoP

Fig 4. Method 2 (Hot potato): AFD PoP sends traffic to closest LinkedIn PoP, which carries data on LinkedIn backbone to LinkedIn data center 

With this second approach, AFD PoPs would hand off traffic to the closest LinkedIn PoP (often on the same continent), and we would carry that traffic to our data centers on our own backbone instead of the public internet. Because traffic is immediately handed off from MSFT to LinkedIn, this is also known as hot potato. This method would accelerate the member experience compared to the first approach, but comes with the caveat that we have to maintain our own backbone network across the globe.

image-of-AFD-PoP-carrying-traffic-on-Microsoft-backbone-to-LinkedIn-data-center

Fig 5. Method 3 (Cold potato): AFD PoP carries traffic on Microsoft backbone all the way to LinkedIn data center

With cold potato routing, AFD PoP traffic stays on the Microsoft backbone all the way into the LinkedIn data center. For this to work, we peered our data centers with Microsoft’s network using private links. Our A/B testing compared the various options and demonstrated that cold potato routing delivered both the lowest latency and the highest throughput, so this was the design we selected. Peering with Azure is a service offering available to customers that push a lot of latency-sensitive traffic.
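
As a rough illustration of how such a comparison works (the latency samples below are made-up placeholders, not our actual measurements), real-user samples can be bucketed by the routing design that served them and the medians compared:

```python
# Hypothetical illustration of comparing routing designs with RUM data.
# The latency samples are placeholder values; only the methodology is real.
from statistics import median

rum_rtt_ms = {
    "hot potato (public internet)":     [310, 295, 350, 300, 280],
    "hot potato (LinkedIn backbone)":   [265, 250, 270, 255, 240],
    "cold potato (Microsoft backbone)": [215, 205, 225, 210, 200],
}

for design, samples in rum_rtt_ms.items():
    print(f"{design:36s} median RTT: {median(samples):.0f} ms")
```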

In the animation below, we compare traffic routed from India to our Oregon data center either entirely by AFD, or entirely by LinkedIn. We used actual internet measurements to time the various hops and slowed the animation travel time by 50x.

gif-of-trip-between-last-mile-in-Bangalore-and-Oregon-data-center

Fig 6. Single round-trip between last mile in Bangalore and Oregon data center

AFD is quicker for several reasons:

  1. AFD has 4 PoPs in India versus 1 LinkedIn PoP, which provides PoP termination closer to members.
  2. Microsoft's diverse peering allows traffic to directly enter the AFD PoP. Traffic going towards LinkedIn PoPs must first go over the public internet.
  3. AFD’s diverse backbone shortens the route by directing traffic straight to our Oregon data center, cold potato style. LinkedIn’s backbone has fewer diverse physical paths and must route traffic through California before reaching Oregon.

In our June 2020 blog post, we shared real-world data demonstrating the impact the AFD PoPs had on reductions in page load times (PLT) and improved business metrics like the number of page views and sessions.

histogram-of-percent-page-load-time-reduction-across-major-markets

Fig 7. Histogram of percent page load time reduction across major markets

Africa wins

Africa possesses the world's fastest-growing workforce due to its young median age and burgeoning internet connectivity, making it imperative to support this workforce and enable further digital transformation. The infrastructure Microsoft is building across Africa was a key factor in LinkedIn's migration to Azure Front Door, allowing us to reliably deliver the tools that help members close the skills gap and improve their employability.

During our AFD evaluation, it was clear that the experience would improve based on the reduced page load times, as seen by synthetic browser tests we ran across Africa. And with Microsoft’s current and future PoPs across the continent, it made sense to ride along with their investments. 

We've been ramped on AFD in Africa for over a year now, and looking back at the real user metrics collected, we see clear improvements to the member experience enabled by the local infrastructure. Median round-trip times have been at least halved, as reflected in the sharp decrease in TCP and TLS connect times, translating into improved availability and page load times.

image-of-AFD-PoP-locations-in-africa

Fig 8. AFD PoP locations in Africa

figure-of-median-TLS-handshake-durations-across-2020

Fig 9. Median TLS handshake durations across 2020

AFD has been adding new PoP locations at breakneck speed, even during the pandemic. When AFD turns up a site, it announces anycast network address space that transparently draws in local traffic. Examples of this can be observed in the Q4 2020 timeframe in the above graphs, when 20 new locations were added, including Kenya and Egypt. TLS handshake durations dropped in May 2020 when we ramped Africa to AFD PoPs in Europe, and dropped again when AFD routed requests to the new African PoPs. At the time of writing, AFD has just turned up a PoP in Angola and we’ve already started to track performance improvements.

We are now starting to use our insights into member performance metrics to recommend to Azure Front Door the future PoP sites that would benefit our members the most. As the AFD footprint grows, all tenants of the service, as well as our members, will benefit from the improved performance.

Key challenges faced during migration

This site-wide migration to AFD posed many complex challenges because of the many moving parts. Below, we’ve highlighted two roadblocks that were the most technically challenging and involved multiple teams, and explained how we solved them.  

Interpreting HTTP RFCs
Some of our third-party API clients were incompatible with our AFD migration because of variations in how the content-length header was used. AFD’s HTTP proxies expect a content-length header of 0 for POST requests with no body, but LinkedIn’s HTTP proxies didn’t have this requirement.

figure-of-post-request

Fig 10. POST request with content-length=0 header (left) vs. POST request without content-length header (right)

When AFD received POSTs with no content-length header, it would reject the request with a 404 response, which is not ideal. LinkedIn’s partner API documentation recommended not including the content-length header for legacy reasons. Since we had a contractual obligation to give our API partners several months’ heads-up on any API changes, we could not immediately drop support for this behavior.
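
For illustration, the difference between the two client behaviors can be reproduced with Python’s low-level http.client interface, which only sends a content-length header when you add it explicitly; the host and path below are placeholders, not real endpoints:

```python
# Sketch of the two client behaviors: a bodyless POST with and without
# a Content-Length: 0 header. Host and path are placeholders.
import http.client

def bodyless_post(host: str, path: str, send_content_length: bool) -> int:
    conn = http.client.HTTPSConnection(host)
    conn.putrequest("POST", path)
    if send_content_length:
        conn.putheader("Content-Length", "0")  # the form AFD's proxies expected
    conn.endheaders()                          # otherwise no Content-Length header at all
    status = conn.getresponse().status
    conn.close()
    return status

# bodyless_post("api.example.com", "/v2/ping", send_content_length=True)
# bodyless_post("api.example.com", "/v2/ping", send_content_length=False)
```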

While collaborating with the Microsoft AFD team on workarounds and digging into the difference in proxy behaviors, we realized that the various HTTP RFCs on content-length were ambiguous. While AFD expected a content-length header, this expectation had previously been relaxed on the LinkedIn Traffic stack with a custom Apache Traffic Server (ATS) plugin. Since relaxing the content-length requirement on AFD would have necessitated a code change in a core Windows kernel library, we agreed to change the behavior on our stack to resolve this. We then communicated the deprecation of this behavior to our partners, giving them several months to update their clients.

In order to continue ramping through the partner notice period, we needed to devise a quick workaround.

LinkedIn has an intelligent mechanism to detect and target members’ recursive DNS resolvers. This valuable information revealed that most of the impacted API partners used a handful of cloud services. Using our software-based DNS, we targeted these resolver IPs and safely directed them to LinkedIn PoPs, away from AFD, which let us resume and complete our AFD ramps of member traffic without ever impacting our API partners.
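
Conceptually, the workaround looked something like the sketch below: a software DNS layer that answers from a different PoP pool when the query comes from a known partner resolver prefix. The prefixes and pool names are hypothetical placeholders, not our actual configuration:

```python
# Hypothetical sketch of resolver-based DNS steering.
import ipaddress

# Resolver prefixes of the affected partners' cloud providers (placeholder ranges).
PARTNER_RESOLVER_PREFIXES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def answer_pool(resolver_ip: str) -> str:
    """Choose which PoP pool a www.linkedin.com query is answered from."""
    ip = ipaddress.ip_address(resolver_ip)
    if any(ip in prefix for prefix in PARTNER_RESOLVER_PREFIXES):
        return "linkedin-pops"  # keep partners on our proxies during the notice period
    return "afd-pops"           # everyone else continues the AFD ramp

print(answer_pool("203.0.113.42"))  # -> linkedin-pops
print(answer_pool("192.0.2.10"))    # -> afd-pops
```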

Not all DNS recursive resolvers are equal
Qualifying performance using A/B testing is a fundamental part of assessing new infrastructure, and we used DNS round robin to compare the baseline control to the new treatment. Over the course of several months, DNS queries for www.linkedin.com were answered with either an IP corresponding to AFD or one corresponding to LinkedIn, at random with equal probability. This allowed us to ramp all our devices (iOS, Android) and every browser all at once, with no software code changes in our apps. Our members reported Real User Measurements (RUM) that we used to compare the improvement or deterioration in their experience. The millions of recursive DNS resolvers across the internet generally respect a 50/50 split, so the number of data points gathered for either platform was fairly uniform.

In order to safely conduct the ramps and observe the behavior, we ramped a small percentage of global traffic rather than our usual 50%. While most DNS recursive resolvers honored our low-percentage ramps, some key ISPs in certain countries were sending virtually no traffic to AFD, especially in India. This was unusual behavior, and it added a significant skew to the metrics, preventing accurate measurement of the experiment. The result was a protracted war room to investigate the behavior.
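
The skew is straightforward to surface once the RUM data is grouped by resolver or ISP: compare the share of traffic that actually landed on AFD against the configured ramp percentage. The counts below are hypothetical placeholders that only illustrate the check:

```python
# Hypothetical sketch of a per-resolver skew check; the counts are placeholders.
RAMP_FRACTION = 0.05  # e.g., a 5% global ramp to AFD

observed = {
    # resolver/ISP label -> (requests that landed on AFD, total requests)
    "resolver-a": (510, 10_000),
    "resolver-b": (3, 10_000),   # virtually no AFD traffic: a skewed resolver
}

for resolver, (afd_hits, total) in observed.items():
    share = afd_hits / total
    verdict = "OK" if abs(share - RAMP_FRACTION) < 0.02 else "SKEWED"
    print(f"{resolver}: AFD share {share:.2%} vs target {RAMP_FRACTION:.0%} -> {verdict}")
```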

Our Bangalore SRE team met with the infrastructure engineers at the largest ISP in India to debug the issue, walking through their turnkey recursive DNS platform and configs. We pored over the user manuals of their DNS platform, going through all the possibilities for the caching aberrations. Despite the ISP assuring us their platform was correctly configured, our members’ DNS queries were not getting AFD IPs.

As we couldn’t go about fixing all the ISP resolvers that were behaving incorrectly, we decided to pause our migration and build the A/B test mechanism directly into our Android and iOS apps. This gave us precise control of ramps without relying on recursive DNS infrastructure. It increased data quality through perfectly uniform measurement density, and because it leveraged our established LiX testing platform, our data science team could analyze the data with their existing workflow.
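
The shape of the client-side mechanism is simple: the app asks the experimentation platform for its treatment and picks the edge it connects to accordingly. The sketch below is a language-agnostic illustration of that idea; the function name and hostnames are hypothetical, not LiX’s actual API or our real endpoints:

```python
# Hypothetical sketch of an in-app A/B split between edges.
def edge_hostname(treatment: str) -> str:
    """Map an experiment treatment to the hostname the app should use."""
    if treatment == "afd":
        return "www-afd.example.com"  # placeholder for an AFD-fronted hostname
    return "www-pop.example.com"      # placeholder for a LinkedIn-PoP hostname

# treatment = experiment_client.get_treatment("edge-migration")  # hypothetical call
print(edge_hostname("afd"))
print(edge_hostname("control"))
```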

Refining this methodology added three months to the migration time, but delivered an accurate conclusion: migrating to AFD was the right direction for our edge. Our takeaway from this exercise is that it is crucial to scrutinize the data behind an experiment, and that high-impact experiments should use the A/B test mechanism that delivers the highest quality of results.

Wins since ramping

Evolution from edge PoPs to data center PoPs
With the majority of our traffic flowing through AFD, it made sense to decommission our edge PoPs and instead invest in building out origin Data Center (DC) PoPs that terminate AFD traffic within our DCs. Our edge PoPs that spanned the EU, LATAM, and APAC regions were expensive to maintain and decommissioning them has resulted in significant opex/capex savings.

DC PoPs have access to orders of magnitude more compute and connectivity than our edge PoPs, allowing us to oversize them to handle significantly more traffic. The redesign also allowed us to simplify our PoP architecture, flatten our tiers, and modernize our stack, including the adoption of HAProxy. In the process of flattening our tiers, we eliminated an L7 proxy hop, which removed a source of failure, operational complexity, and development overhead. It simplified the complex ordering requirements in routing configuration deployments and removed the need to maintain multiple versions of software across the tiers.

Reducing toil and site failures through Azure integrations
Maintaining a global edge requires significant resources to build and support critical dial tone infrastructure and tooling. Migrating our traffic stack to Azure enables us to refocus our time on engineering rather than operations. Two prime examples of this are maintaining TLS certs and logging. 

We’ve had to build and maintain our own certificate management portal for hundreds of external TLS certs. With the original author long gone and subject matter expert ownership changing hands several times, the aging code base has become difficult to maintain, and it has led to public lapses from expired certificates on external domains. We’re now able to orchestrate certificate auto-renewal and deployment using Azure Key Vault.
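
As a sketch of what that orchestration can look like with the Azure SDK for Python (the vault URL, certificate name, and issuer below are placeholders, and the exact policy we use may differ), a certificate can be created with a lifetime action that triggers auto-renewal before expiry:

```python
# Hedged sketch: an auto-renewing certificate policy in Azure Key Vault.
# Requires the azure-identity and azure-keyvault-certificates packages.
from azure.identity import DefaultAzureCredential
from azure.keyvault.certificates import (
    CertificateClient, CertificatePolicy, LifetimeAction, CertificatePolicyAction,
)

client = CertificateClient(
    vault_url="https://example-vault.vault.azure.net",  # placeholder vault
    credential=DefaultAzureCredential(),
)

policy = CertificatePolicy(
    issuer_name="example-issuer",   # an issuer configured in Key Vault (placeholder)
    subject="CN=www.example.com",   # placeholder subject
    validity_in_months=12,
    lifetime_actions=[
        # Ask Key Vault to re-issue the certificate 30 days before it expires.
        LifetimeAction(action=CertificatePolicyAction.auto_renew, days_before_expiry=30),
    ],
)

poller = client.begin_create_certificate("www-example-com", policy=policy)
certificate = poller.result()
```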

We had a similar problem with our global HTTP logs. SREs often resorted to grepping large log files across fleets of hosts in order to debug live production issues. Our billions of daily HTTP requests are now automatically sent by AFD to Azure Data Explorer, allowing instant SQL-like querying. We’re exploring using this data to discover patterns and trends in our traffic.
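
As a sketch of what that querying looks like from Python with the Kusto SDK (the cluster, database, table, and column names below are placeholders, not the real AFD log schema), a question like “which PoPs are returning 5xx errors right now” becomes a short KQL query:

```python
# Hedged sketch of querying edge access logs in Azure Data Explorer (Kusto).
# Requires the azure-kusto-data package; all names below are placeholders.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(
    "https://example-cluster.westus2.kusto.windows.net"  # placeholder cluster
)
client = KustoClient(kcsb)

query = """
AccessLogs
| where Timestamp > ago(1h) and HttpStatus >= 500
| summarize errors = count() by PopLocation, bin(Timestamp, 5m)
| order by errors desc
"""

response = client.execute("edge-logs", query)  # placeholder database name
for row in response.primary_results[0]:
    print(row["PopLocation"], row["errors"])
```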

What’s next?

The AFD migration has enabled us to re-engineer resilient solutions and reduce toil on infrastructure operations. With the Azure ecosystem, we’re able to wrap our day-to-day in Infrastructure-as-Code, invest in business continuity planning, gather better insights into our traffic, and experiment with the cutting edge of internet technologies.

In order to enable efficient collaboration with our partners at Microsoft, we’re expanding our SRE operations to a LinkedIn Seattle office. We already have several collaboration projects in the pipeline as we work with various Azure Edge teams. We are thrilled to be participating in the development and A/B testing of the latest web technologies like HTTP/3 (aka HTTP-over-QUIC). We have recorded very promising benefits from QUIC in prior experiments, so enabling it on AFD is a top priority to better serve LinkedIn and other users of AFD. Our long-term roadmap also includes IP coalescing with Azure CDN, 0-RTT with TLS 1.3, and tracking the gains we continue to see from an ever-growing PoP footprint.