Resilience Articles

  • Hodor: Overload scenarios and the evolution of their detection and handling

    February 23, 2023

    Co-authors: Abhishek Gilra, Nizar Mankulangara, Salil Kanitkar, and Vivek Deshpande. To connect professionals and make them more productive, it is crucial that LinkedIn is available at all times. For us, downtime means that our members and customers don’t have access to the conversations, connections, and knowledge that are essential to them...

  • Hodor: Detecting and addressing overload in LinkedIn microservices

    February 18, 2022

    LinkedIn launched in its initial form over 18 years ago, which is an eternity in the technology world. The early site was a single monolithic Java web application, and as it gained in popularity and the user base grew, the underlying technology had to adapt in order to support our ever-growing scale. We now operate well over 1,000 separate microservices running...

  • LinkedOut: A Request-Level Failure Injection Framework

    May 24, 2018

    LinkedIn has made significant investments in resilience engineering over the past few years. As Site Reliability Engineers (SREs), we've consistently witnessed the effects of Murphy's Law: "Anything that can go wrong, will go wrong." In a complex, distributed technology stack, it's important to understand the points where things can go wrong in your system and...

  • Improving Resiliency and Stability of a Large-scale...

    November 28, 2017

    Co-authors: Maulin Patel, Erek Gokturk, and Chris Stufflebeam. How do you increase the resiliency and stability of a monolithic API...

  • Resilience Engineering at LinkedIn with Project Waterbear

    November 10, 2017

    Co-authors: Bhaskaran Devaraj and Xiao Li. Over the last several years, many companies have discussed ways to improve the resiliency of...

  • Dyno: How LinkedIn Determines the Capacity Limits of Its...

    February 17, 2017

    Co-authors: Susie Xia and Anant Rao. Editor's note: This blog has been updated due to the renaming of the project since publication....