  • graph-of-linkedin-cluster-trends-for-hdfs-space-used-total-name-node-objects-and-yarn-compute-capacity

    Scaling LinkedIn's Hadoop YARN cluster beyond 10,000 nodes

    September 8, 2021

    Co-authors: Keqiu Hu, Jonathan Hung, Haibo Chen, and Sriram Rao

    At LinkedIn, we use Hadoop as our backbone for big data analytics and machine learning. With data volume growing exponentially and the company investing heavily in machine learning and data science, we have been doubling our cluster size year over year to match the growth in compute workload. Our...

  • encoded-activity-sequence-showing-requests-made-by-a-member-that-was-not-using-abusive-automation

    Using deep learning to detect abusive sequences of member activity

    September 2, 2021

    Co-authors: James Verbus and Beibei Wang

    The Anti-Abuse AI Team at LinkedIn creates, deploys, and maintains models that detect and prevent many types of abuse, including the creation of fake accounts, member profile scraping, automated spam, and account takeovers. As we prevent abuse using machine learning, there are several challenges we can face: Maximizing...

  • diagram-of-http2-network-client-architecture

    HTTP/2 in infrastructure: Ambry network stack refactoring

    August 24, 2021

    Co-authors: Ze Mao, Matt Wise, Casey Getz, Justin Lin, Ashish Singhai, and Rob Block

    Introduction: Ambry is LinkedIn's scalable geo-distributed object store. Developed in-house and open sourced in 2016, Ambry stores tens of petabytes of data. At LinkedIn, Ambry is used to store objects like photos, videos, and resume uploads, as well as internal binary data....

  • lambda-learner-logo

    Lambda Learner: Nearline learning on data streams

    August 11, 2021

    Co-authors: Kirill Talanine, Jeffrey D. Gee, Rohan Ramanath, Konstantin Salomatin, Gungor Polatkan, Onkar Dalal, and Deepak Kumar...

  • high-level-architecture-of-native-video-on-linkedin

    Building Microsoft-powered native video meetings on LinkedIn

    August 9, 2021

    Co-authors: Crystal Hsieh, Kyle Huynh, Sameera Padhye, Kalyan Punukollu, Federico Hlawaczek, Pavan Mellamputi, and Christian Byza

    As...

  • new-dualip-project-logo

    DuaLip: Solving extreme-scale linear programs for web...

    August 6, 2021

    Co-authors: Kinjal Basu, Yao Pan, Rohan Ramanath, Konstantin Salomatin, Amol Ghoting, and S. Sathiya Keerthi

    Building thriving...