Our streams infrastructure powers all of our near real-time data processing, serving as LinkedIn’s application backbone. Developed at LinkedIn, Apache Kafka, Apache Samza, and Brooklin form this world-class data processing infrastructure, which enables the experiences of our more than 660 million members.
Teams & Project Spotlights
Apache Kafka is a core part of our infrastructure at LinkedIn. It was originally developed in-house as a stream processing platform and was subsequently open sourced. Today, it’s widely used across the industry and has an active community, though few companies, if any, run it at LinkedIn’s scale.
Brooklin is a distributed service for streaming data in near real-time and at scale. It has been running in production at LinkedIn since 2016, where it powers thousands of data streams and carries more than 4 trillion messages per day.
Apache Samza enables data processing in near real-time. At LinkedIn, Samza operates at a massive scale, enabling thousands of applications and tens of thousands of containers to process trillions of messages each day.