Open Source Articles

  • FastTreeSHAP: Accelerating SHAP value computation for trees

    March 15, 2022

    Co-authors: Jilei Yang, Humberto Gonzalez, and Parvez Ahammad

    In this blog post, we announce the open sourcing of FastTreeSHAP, a Python package based on the paper "Fast TreeSHAP: Accelerating SHAP Value Computation for Trees" (presented at the NeurIPS 2021 XAI4Debugging Workshop). FastTreeSHAP enables an efficient interpretation of tree-based...

  • Performance-Adaptive Sampling Strategy (PASS) for GNNs: Open sourcing PASS

    March 7, 2022

    Co-authors: Jaewon Yang, Minji Yoon, Sufeng Niu, Dash Shi, and Qi He

    Graphs are a universal way to represent relationships among entities. Social graphs represent how people interact with each other, professional graphs represent how people collaborate, and so on. Graph Neural Networks (GNNs) are deep learning models that are specialized for understanding graphs...

  • Project Magnet, providing push-based shuffle, now available in Apache Spark 3.2

    October 20, 2021

    Co-authors: Venkata Krishnan Sowrirajan and Min Shen

    We are excited to announce that push-based shuffle (codenamed Project Magnet) is now available in Apache Spark as part of the 3.2 release. Since the SPIP vote on Project Magnet passed in September 2020, there has been a lot of interest in getting it into Apache Spark. As of March 2021, 100% of LinkedIn’s Spark...

  • Scaling LinkedIn's Hadoop YARN cluster beyond 10,000 nodes

    September 8, 2021

    Co-authors: Keqiu Hu, Jonathan Hung, Haibo Chen, and Sriram Rao

    At LinkedIn, we use Hadoop as our backbone for big data analytics and...

  • HTTP/2 in infrastructure: Ambry network stack refactoring

    August 24, 2021

    Co-authors: Ze Mao, Matt Wise, Casey Getz, Justin Lin, Ashish Singhai, and Rob Block

    Ambry is LinkedIn's scalable...

  • Lambda Learner: Nearline learning on data streams

    August 11, 2021

    Co-authors: Kirill Talanine, Jeffrey D. Gee, Rohan Ramanath, Konstantin Salomatin, Gungor Polatkan, Onkar Dalal, and Deepak Kumar...