Open Source Articles

  • title-card

    Project Magnet, providing push-based shuffle, now available in Apache Spark 3.2

    October 20, 2021

    Co-authors: Venkata Krishnan Sowrirajan and Min Shen. We are excited to announce that push-based shuffle (codenamed Project Magnet) is now available in Apache Spark as part of the 3.2 release. Since the SPIP vote on Project Magnet passed in September 2020, there has been a lot of interest in getting it into Apache Spark. As of March 2021, 100% of LinkedIn’s Spark...

  • graph-of-linkedin-cluster-trends-for-hdfs-space-used-total-name-node-objects-and-yarn-compute-capacity

    Scaling LinkedIn's Hadoop YARN cluster beyond 10,000 nodes

    September 8, 2021

    Co-authors: Keqiu Hu, Jonathan Hung, Haibo Chen, and Sriram Rao. At LinkedIn, we use Hadoop as our backbone for big data analytics and machine learning. With data volume growing exponentially and the company investing heavily in machine learning and data science, we have been doubling our cluster size year over year to match the growth in compute workload. Our...

  • diagram-of-http2-network-client-architecture

    HTTP/2 in infrastructure: Ambry network stack refactoring

    August 24, 2021

    Co-authors: Ze Mao, Matt Wise, Casey Getz, Justin Lin, Ashish Singhai, and Rob Block. Introduction: Ambry is LinkedIn's scalable geo-distributed object store. Developed in-house and open sourced in 2016, Ambry stores tens of petabytes of data. At LinkedIn, Ambry is used to store objects like photos, videos, and resume uploads, as well as internal binary data....

  • lambda-learner-logo

    Lambda Learner: Nearline learning on data streams

    August 11, 2021

    Co-authors: Kirill Talanine, Jeffrey D. Gee, Rohan Ramanath, Konstantin Salomatin, Gungor Polatkan, Onkar Dalal, and Deepak Kumar...

  • new-dualip-project-logo

    DuaLip: Solving extreme-scale linear programs for web...

    August 6, 2021

    Co-authors: Kinjal Basu, Yao Pan, Rohan Ramanath, Konstantin Salomatin, Amol Ghoting, and S. Sathiya Keerthi. Building thriving...

  • diagram-of-how-tony-works-with-horovod

    TonY joins LF AI & Data Foundation

    July 15, 2021

    Co-authors: Keqiu Hu, Jonathan Hung, and Junfan Zhang. Today, TonY is joining the LF AI & Data Foundation, an umbrella foundation of...