Open Source Articles

  • datahub-logo

    Open sourcing DataHub: LinkedIn’s metadata search and discovery platform

    February 18, 2020

    Co-authors: Kerem Sahin, Mars Lan, and Shirshanka Das

    Finding the right data quickly is critical for any company that relies on big data insights to make data-driven decisions. Not only does this impact the productivity of data users (including analysts, machine learning developers, data scientists, and data engineers), but it also has a direct impact on end...

  • how-we-retired-python-2

    How we retired Python 2 and improved developer happiness

    January 29, 2020

    Nearly 20 years after the first release of Python 2 and 11 years after the first release of Python 3, the Python development community has retired Python 2.7, the last of the Python 2 series. This marks the end of all upstream support for Python 2, including bug and security fixes, and allows developers to devote their time fully to Python 3, which is faster,...

  • lightweight-hardware-accelerated-video/audio-transcoder-for-android

    LiTr: A lightweight video/audio transcoder for Android

    December 19, 2019

    If a picture’s worth a thousand words, then what about a video? In 2017, we launched video sharing to give our members the ability to share video content on the feed via the LinkedIn mobile app or a web browser. When posting a video from an Android device, the member could either record it using their device camera app or pick an existing video from the gallery....

  • LinkedIn-Kafka-ecosystem

    How LinkedIn customizes Apache Kafka for 7 trillion...

    October 8, 2019

    Co-authors: Jon Lee and Wesley Wu

    Apache Kafka is a core part of our infrastructure at LinkedIn. It was originally developed in-house...

  • isolation-tree

    Detecting and preventing abuse on LinkedIn using isolation...

    August 13, 2019

    The Anti-Abuse AI Team at LinkedIn creates, deploys, and maintains models that detect and prevent various types of abuse, including...

  • change-data-capture

    Open sourcing Brooklin: Near real-time data streaming at...

    July 16, 2019

    Editor's note: This blog has been updated. Brooklin—a distributed service for streaming data in near real-time and at scale—has been...