Data Articles

  • Shifting left on governance: DataHub and schema annotations

    May 17, 2022

    Co-authors: Joshua Shinavier and Shirshanka Das

    Data governance is easy… as long as the data to be governed is small and simple. A handful of developers creating a startup company can get away with relatively lightweight solutions for managing their data, but things change as scale and complexity increase. Like a hermit crab outgrowing its shell, we constantly...

  • Opal: Building a mutable dataset in data lake

    March 16, 2022

    Co-authors: Bhupendra Kumar Jain, Aditya Narain Gupta, Kuai Yu, and Hung Tran

    At LinkedIn, trusted data platforms and quality data pipelines are essential to meaningful business metrics and sound decision-making. Today, a considerable percentage of data at LinkedIn comes from online data stores. Whether the online data systems fall into SQL or NoSQL categories,...

  • DARWIN: Data Science and Artificial Intelligence Workbench at LinkedIn

    January 28, 2022

    Co-authors: Varun Saxena, Harikumar Velayutham, and Balamurugan Gangadharan

    LinkedIn is the largest global professional network and generates massive amounts of high-quality data. Our data infrastructure scales to store exabytes of data; data analysts, data scientists, and AI engineers then use this data to power several LinkedIn products and the platform as a...

  • After joining LinkedIn Argentina, Juan took an Ireland-based opportunity to build a new EMEA (i.e., Europe, Middle East, and Africa)...

  • Co-authors: Steven Chuang, Qinyu Yue, Aravind Rao, and Srihari Duddukuru

    Introduction

    Having recently transitioned LinkedIn’s...

  • Distributed tier merge: How LinkedIn tackles stragglers in ...

    September 27, 2021

    Co-authors: Andy Li and Hongbin Wu

    Indexing plays a key role in modern search engines for fast and accurate information retrieval,...