LinkedIn launched in its initial form over 18 years ago, which is an eternity in the technology world. The early site was a single monolithic Java web application, and as it gained in popularity and the user base grew, the underlying technology had to adapt in order to support our ever-growing scale. We now operate well over 1,000 separate microservices running...
Resilience Articles
- Topics: microservices, Resilience, Low Latency
LinkedIn has made significant investments in resilience engineering over the past few years. As Site Reliability Engineers (SREs), we've consistently witnessed the effects of Murphy's Law: "Anything that can go wrong, will go wrong." In a complex, distributed technology stack, it's important to understand the points where things can go wrong in your system and...
- Topics: Resilience, SRE
Co-authors: Maulin Patel, Erek Gokturk, and Chris Stufflebeam
How do you increase the resiliency and stability of a monolithic API service that is used by three different platforms, serving 500+ million members, developed by over 400 engineers, deployed three times per day, and consuming almost 300 downstream services? The API layer service used by LinkedIn.com...
- Topics: Performance, scalability, multicluster, Resilience
Co-authors: Bhaskaran Devaraj and Xiao Li
Over the last several years, many companies have discussed ways to improve the resiliency of...
- Topics: Resilience, SRE
Co-authors: Susie Xia and Anant Rao
Editor's note: This blog has been updated due to the renaming of the project since publication....
- Topics: Performance, Resilience, Testing, Automation
Failure induction is a process of non-functional testing in which a set of failures is induced against a perfectly healthy service....
- Topics: Resilience, Testing, Open Source, Simoorg