Ambry: LinkedIn's Scalable Geo-Distributed Object Store

Shadi Abdollahian Noghabi (University of Illinois at Urbana-Champaign), Sriram Subramanian (LinkedIn Corp), Priyesh Narayanan (LinkedIn Corp), Sivabalan Narayanan (LinkedIn Corp), Gopalakrishna Holla (LinkedIn Corp), Mammad Zadeh (LinkedIn Corp), Tianwei Li (LinkedIn Corp), Indranil Gupta (University of Illinois at Urbana-Champaign), Roy Campbell (University of Illinois at Urbana-Champaign)

SIGMOD 2016


The infrastructure beneath a worldwide social network has to continually serve billions of variable-sized media objects such as photos, videos, and audio clips. These objects must be stored and served with low latency and high throughput by a system that is geo-distributed, highly scalable, and load balanced. Existing file systems and object stores face several challenges when serving such large objects. We present Ambry, a production-quality system for storing large, immutable objects (called blobs). Ambry is designed in a decentralized way and leverages techniques such as logical blob grouping, asynchronous replication, rebalancing mechanisms, zero-cost failure detection, and OS caching. Ambry has been running in LinkedIn’s production environment for the past 2 years, serving up to 10,000 requests per second across more than 400 million users. Our experimental evaluation reveals that Ambry offers high efficiency (utilizing up to 88% of the network bandwidth), low latency (less than 50 ms latency for a 1 MB object), and load balancing (improving imbalance of request rate among disks by 8x-10x).
