Open Source Articles

  • LinkedIn-Kafka-ecosystem

    How LinkedIn customizes Apache Kafka for 7 trillion messages per day

    October 8, 2019

    Co-authors: Jon Lee and Wesley Wu. Apache Kafka is a core part of our infrastructure at LinkedIn. It was originally developed in-house as a stream processing platform and was subsequently open sourced, and it is widely adopted externally today. While many other companies and projects leverage Kafka, few, if any, do so at LinkedIn’s scale. Kafka is used extensively...

  • isolation-tree

    Detecting and Preventing Abuse on LinkedIn Using Isolation Forests

    August 13, 2019

    The Anti-Abuse AI Team at LinkedIn creates, deploys, and maintains models that detect and prevent various types of abuse, including the creation of fake accounts, member profile scraping, automated spam, and account takeovers. There are several unique challenges we face when using machine learning to prevent abuse on a large professional network, including:...

  • change-data-capture

    Open Sourcing Brooklin: Near Real-Time Data Streaming at Scale

    July 16, 2019

    Brooklin, a distributed service for streaming data in near real-time and at scale, has been running in production at LinkedIn since 2016, powering thousands of data streams and over 2 trillion messages per day. Today, we are pleased to announce that Brooklin is now open source and that the source code is available in our GitHub repo! Why Brooklin? At LinkedIn,...

  • star-tree-data-structure

    Star-Tree Index: Powering Fast Aggregations on Pinot

    June 14, 2019

    Pinot is an open source, scalable, distributed OLAP data store that recently entered the Apache Incubator. Developed at LinkedIn, it...

  • iris-mobile-view

    Iris Mobile: An Open Source, Mobile Interface for Incident ...

    May 9, 2019

    At LinkedIn, our on-call incidents are managed using Iris and Oncall, two tools that we released as open source to the community about...

  • ambry-logo

    Introducing Data Compaction in Ambry

    May 8, 2019

    Introduction: Three years ago, LinkedIn announced and open sourced Ambry, a distributed, highly available, and horizontally scalable...