SRE Articles

  • The Makeup of Successful Geographically-Distributed SRE Teams: Part 2

    March 27, 2018

    In part one of this series, we discussed some of the key principles to consider when developing geographically distributed (GD) SRE teams. Similar to the first article, we’re leveraging the journey of LinkedIn’s SRE team as the point of reference for the topics discussed here in part two. Within this post, we’ll discuss growth planning, the challenges associated...

  • The Makeup of Successful Geographically-Distributed SRE Teams: Part 1

    March 15, 2018

    Why geographically-distributed SRE teams? In today’s hyper-connected technological world, there is a need for geographically widespread technical teams to facilitate global growth. Businesses that scale to this level need global teams to handle that reach. However, as development teams continue to ramp, it quickly becomes infeasible for them to solve operational...

  • Project STAR*: Streamlining Our On-Call Process

    January 10, 2018

    Co-authors: Bef Ayenew and Adam Hobson

    Consider the following conversation that used to be typical at LinkedIn: "Folks, we may have an on-call problem this week..." "What’s wrong?" "Our Android engineer is missing." "Where are they?" "They’ve been on leave for two weeks now." "?!?!&*?!?!" Welcome to the on-call rotation for the LinkedIn flagship mobile app and...

  • Automating Your Oncall: Open Sourcing Fossor and Ascii Etch

    December 14, 2017

    One of our sayings in Site Reliability Engineering (SRE) is that the goal of your job is to “automate yourself out of the job.” While...

  • Couchbase Ecosystem at LinkedIn

    December 6, 2017

    Couchbase is a highly scalable, distributed data store that plays a critical role in LinkedIn’s caching systems. Couchbase was first...

  • Resilience Engineering at LinkedIn with Project Waterbear

    November 10, 2017

    Co-authors: Bhaskaran Devaraj and Xiao Li

    Over the last several years, many companies have discussed ways to improve the resiliency...