Data Articles

  • [Image: high-level diagram of user migration and dataset deprecation tool]

    Co-authors: Steven Chuang, Qinyu Yue, Aravind Rao, and Srihari Duddukuru

    Introduction: Having recently transitioned LinkedIn’s analytics stack (including 1400+ datasets, 900+ data flows, and 2100+ users) to one based on open source big data technologies, we wanted to give an overview of the journey in a blog post. This move freed us from the limits imposed by...

  • [Image: an illustration of the distributed tier merge]

    Distributed tier merge: How LinkedIn tackles stragglers in search index build

    September 27, 2021

    Co-authors: Andy Li and Hongbin Wu

    Indexing plays a key role in modern search engines for fast and accurate information retrieval, and the ability to swiftly build indexes is crucial for LinkedIn to provide up-to-date information, such as candidates to recruiters, job posts to members, etc. In some instances, such as if a member profile is missing and...

  • [Image: graph of LinkedIn cluster trends for HDFS space used, total NameNode objects, and YARN compute capacity]

    Scaling LinkedIn's Hadoop YARN cluster beyond 10,000 nodes

    September 8, 2021

    Co-authors: Keqiu Hu, Jonathan Hung, Haibo Chen, and Sriram Rao

    At LinkedIn, we use Hadoop as our backbone for big data analytics and machine learning. With data volume growing exponentially and the company investing heavily in machine learning and data science, we have been doubling our cluster size year over year to match the growth in compute workload. Our...

  • [Image: diagram of HTTP/2 network client architecture]

    HTTP/2 in infrastructure: Ambry network stack refactoring

    August 24, 2021

    Co-authors: Ze Mao, Matt Wise, Casey Getz, Justin Lin, Ashish Singhai, and Rob Block

    Introduction: Ambry is LinkedIn's scalable...

  • [Image: Lambda Learner logo]

    Lambda Learner: Nearline learning on data streams

    August 11, 2021

    Co-authors: Kirill Talanine, Jeffrey D. Gee, Rohan Ramanath, Konstantin Salomatin, Gungor Polatkan, Onkar Dalal, and Deepak Kumar...

  • [Image: GIF showing new Recruiter and Jobs experience]

    New Recruiter & Jobs: The largest enterprise data migration...

    July 23, 2021

    Co-authors: Xiaoyang Gu, Xie Lu, and Xiaoguang Wang

    Introduction: In August 2019, we introduced our members and customers to the idea...