Nishant Singh

Posts by Nishant Singh

  • Spike detection in Alert Correlation

    December 22, 2021

    Introduction: LinkedIn’s stack consists of thousands of different microservices and the associated complex dependencies among them. When a production outage happens due to an issue with misbehaving services, finding the exact service responsible for the outage is challenging and time-consuming. Although each service has multiple alerts configured in a distributed...

  • Eliminating toil with fully automated load testing

    December 6, 2019

    Introduction: In 2013, when LinkedIn moved to multiple data centers across the globe, we needed a way to redirect traffic from one data center to another in order to mitigate potential member impact in the event of a disturbance to our services. This need led to the birth of one of the most important pieces of engineering at LinkedIn, called TrafficShift. It...