LinkedIn has a diverse set of needs for online storage. We store images such as member profile photos, documents such as resumes, a variety of message attachments, videos, and more.
Ambry is LinkedIn’s source-of-truth distributed blob storage system. As a source-of-truth online system, the durability and availability of Ambry are paramount, along with its ability to serve even large objects with very low latency. As in any distributed system, failures of individual nodes are a routine part of our operations. The redundancy in Ambry’s deployment means we are prepared to lose a single node, a whole rack, or even an entire datacenter without incurring any data loss.
LinkedIn conceived of and built Ambry to meet our growing needs for persistent online object storage. As LinkedIn has evolved and our data processing, storage, and serving needs have grown more complex, we’re expanding our ability to serve our partner teams by building integrations with Hadoop and by acting as a cross-datacenter transport layer for very large datasets.
We’ve been operating Ambry at LinkedIn for many years now, consistently delivering over four nines (99.99%) of availability. With LinkedIn’s infrastructure migration to Azure, we’re building a cloud-native version of Ambry to provide a first-class hybrid public-cloud/on-premises blob storage system that will power LinkedIn’s blob storage needs for many more years to come.
Ambry manages billions of blobs spread across thousands of machines in multiple datacenters, and we’re proud to have contributed Ambry to the open source community.