LinkedIn has a diverse set of needs for online storage, including images, documents, message attachments, videos, and more. To scale our infrastructure to handle growth in membership, traffic, and data, we built Ambry, Espresso, and Venice.
Ambry is a distributed immutable object store that acts as our source of truth for media. Created in 2014, it has grown alongside our needs for persistent online object storage. Ambry manages billions of blobs spread across thousands of machines in multiple datacenters, and we’re proud to have contributed it to the open source community.
Espresso is our primary online datastore and the focal point of our online data infrastructure. Serving millions of queries per second and petabytes of data at low latency, Espresso powers hundreds of applications, including Profiles, InMail, and our feed.
Venice is our main derived data platform. Its architecture is heavily stream-oriented, making it one of the most demanding and sophisticated Kafka use cases. Venice unlocks the value of data produced in Hadoop and Samza by supporting high-throughput ingestion and low-latency online queries. It is used to build both Lambda and Kappa architecture applications.
Cluster Management (Hercules)
The Cluster Management (Hercules) team is responsible for developing and maintaining highly available and scalable cluster management solutions for our stateful data systems and stateless services.