Data Articles

  • LinkedIn’s GraphQL journey for integrations and partnerships: How we accelerated development by 90%

    October 6, 2022

    Co-authors: Mimi Chen, Calvin Lei, and Amit Yadav. Background: LinkedIn’s mission is to connect the world’s professionals to make them more productive and successful. One way we advance this mission is by partnering with other organizations to deliver world-class integrations. We are developing a platform-as-a-service (PaaS) that provides exploratory access,...

  • Super Tables: The road to building reliable and discoverable data products

    September 28, 2022

    Co-authors: David Lu, Hong Liu, Thomas Kwan, Christopher Harris, and Weiping Si. Many companies, including LinkedIn, have experienced exponential data growth ever since the Apache Hadoop adoption a decade ago. With a proliferation of self-service data authoring tools and publishing platforms, different teams have created and shared datasets to address business needs...

  • Open Sourcing Venice – LinkedIn’s Derived Data Platform

    September 26, 2022

    We are proud to announce the open sourcing of Venice, LinkedIn’s derived data platform that powers more than 1800 of our datasets and is leveraged by over 300 distinct applications. Venice is a high-throughput, low-latency, highly-available, horizontally-scalable, eventually-consistent storage system with first-class support for ingesting the output of batch and...

  • Since she was a child, Deepti has been motivated to help people. This drive led her on a career journey with many pivots and moves —...

  • Towards data quality management at LinkedIn

    June 9, 2022

    Co-authors: Liangzhao Zeng, Ting Yu (Cliff) Leung, Jimmy Hong, and Kevin Lau. Introduction: Data is at the heart of all our products and...

  • Shifting left on governance: DataHub and schema annotations

    May 17, 2022

    Co-authors: Joshua Shinavier and Shirshanka Das. Data governance is easy… as long as the data to be governed is small and simple. A...