Distributed Systems Articles

  • The exabyte club: LinkedIn’s journey of scaling the Hadoop Distributed File System

    May 27, 2021

    Co-authors: Konstantin V. Shvachko, Chen Liang, and Simbarashe Dzinamarira

    LinkedIn runs its big data analytics on Hadoop. During the last five years, the analytics infrastructure has experienced tremendous growth, almost doubling every year in data size, compute workloads, and in all other dimensions. It recently reached two important milestones. LinkedIn now...

  • Jhubbub on Helix: Stateless and elastic made easy

    August 27, 2020

    Co-authors: Hunter Lee and Dru Pollini

    LinkedIn was built to help professionals achieve more in their careers, and every day millions of people use our products to make connections, discover new opportunities, and get better at what they do. An important part of our mission is helping people to find other professionals who are interested in the same things they...

  • LinkedIn NYC Tech Talk series: Engineering Excellence Meetup

    August 28, 2019

    We regularly play host to a series of meetups here at the LinkedIn office in the Empire State Building. Open to the community, these events cover a range of topics—from distributed systems, web and mobile development, to machine learning—and are a great way for engineers to meet, share notes, and learn from each other on various technical topics. At our latest...

  • Improving performance and capacity for Espresso with new...

    June 27, 2019

    In this blog post, we’ll share how we migrated Espresso, LinkedIn’s distributed data store, to a new Netty4-based framework and...

  • Star-tree index: Powering fast aggregations on Pinot

    June 14, 2019

    Pinot is an open source, scalable distributed OLAP data store that recently entered Apache incubation. Developed at LinkedIn, it...

  • Managing Distributed Tasks with Helix Task Framework

    January 24, 2019

    Co-authors: Junkai Xue and Hunter Lee

    Stateless tasks are widely used for serving large-scale data processing systems. Lots of...