SRE Articles

  • Eliminating toil with fully automated load testing

    December 6, 2019

    Introduction: In 2013, when LinkedIn moved to multiple data centers across the globe, we needed a way to redirect traffic from one data center to another in order to mitigate potential member impact in the event of a disturbance to our services. This need led to the birth of one of the most important pieces of engineering at LinkedIn, called TrafficShift. It...

  • A look at our biggest SRE[in]con yet

    November 14, 2019

    Co-authors: Todd Palino, Samir Jafferali, Kurt Andersen, and Carolyn Blood. LinkedIn hosted its 4th annual SRE[in]con conference in late October that brought together over 700 LinkedIn site engineers, as well as partners from Microsoft, GitHub, Drawbridge and Glint, for more than 60 talks, workshops, and main stage keynotes. The purpose? To provide engineers...

  • Iris Mobile: An Open Source, Mobile Interface for Incident Management

    May 9, 2019

    At LinkedIn, our on-call incidents are managed using Iris and Oncall, two tools that we released as open source to the community about two years ago. Oncall allows our teams to manage their on-call shifts in a largely automated fashion, scheduling rotations without any human intervention. At the same time, it allows teams to be agile and adaptable when defining...

  • Coding Conversations: The “Perfect Storm” that Brought Down...

    November 16, 2018

    Editor’s Note: This article originally appeared as a guest post on VentureBeat titled “What I learned by bringing down LinkedIn.com.”...

  • LinkedOut: A Request-Level Failure Injection Framework

    May 24, 2018

    LinkedIn has made significant investments in resilience engineering over the past few years. As Site Reliability Engineers (SREs),...

  • Introducing and Open Sourcing shiv

    May 10, 2018

    At LinkedIn, we ship hundreds of command-line utilities to every machine in our data centers and to all of our employees’ workstations...