SRE Articles

  • Coding Conversations: The “Perfect Storm” that Brought Down LinkedIn.com

    November 16, 2018

    Editor’s Note: This article originally appeared as a guest post on VentureBeat titled “What I learned by bringing down LinkedIn.com.” Reprinted here in full, the post tells the story of how Katie accidentally crashed LinkedIn.com. After the immediate problem was resolved, the incident resulted in sitewide technical improvements and turned out to be a growth...

  • LinkedOut: A Request-Level Failure Injection Framework

    May 24, 2018

    LinkedIn has made significant investments in resilience engineering over the past few years. As Site Reliability Engineers (SREs), we've consistently witnessed the effects of Murphy's Law: "Anything that can go wrong, will go wrong." In a complex, distributed technology stack, it's important to understand the points where things can go wrong in your system and...

  • Introducing and Open Sourcing shiv

    May 10, 2018

    At LinkedIn, we ship hundreds of command-line utilities to every machine in our data centers and to all of our employees’ workstations. The vast majority of these utilities are written in Python. In addition to developing these command-line utilities, we have hundreds of supporting libraries that are constantly being iterated on, with new versions published...

  • Evolution of Couchbase at LinkedIn

    May 1, 2018

    Author's note: My colleague, Michael Kehoe, wrote a blog post on the Couchbase Ecosystem at LinkedIn. I encourage you to read it if...

  • The Makeup of Successful Geographically-Distributed SRE...

    March 27, 2018

    In part one of this series, we discussed some of the key principles to consider when developing geographically distributed (GD) SRE...

  • The Makeup of Successful Geographically-Distributed SRE...

    March 15, 2018

    Why geographically-distributed SRE teams? In today’s hyper-connected technological world, there is a need for geographically...