  • Super Tables: The road to building reliable and discoverable data products

    September 28, 2022

    Co-authors: David Lu, Hong Liu, Thomas Kwan, Christopher Harris, Weiping Si

    Many companies, including LinkedIn, have experienced exponential data growth ever since the Apache Hadoop adoption a decade ago. With a proliferation of self-service data authoring tools and publishing platforms, different teams have created and shared datasets to address business needs...

  • Open Sourcing Venice – LinkedIn’s Derived Data Platform

    September 26, 2022

    We are proud to announce the open sourcing of Venice, LinkedIn’s derived data platform that powers more than 1800 of our datasets and is leveraged by over 300 distinct applications. Venice is a high-throughput, low-latency, highly-available, horizontally-scalable, eventually-consistent storage system with first-class support for ingesting the output of batch and...

  • Real-time analytics on network flow data with Apache Pinot

    September 13, 2022

    The LinkedIn infrastructure has thousands of services serving millions of queries per second. At this scale, having tools that provide observability into the LinkedIn infrastructure is imperative to ensure that issues in our infrastructure are quickly detected, diagnosed, and remediated. This level of visibility helps prevent the occurrence of outages so we can...

  • Feathr joins LF AI & Data Foundation

    September 12, 2022

    Co-authors: Hangfei Lin, Jinghui Mo

    We’re excited to announce today that Feathr is joining LF AI & Data, the Linux Foundation’s...

  • Originally from Argentina, systems & infrastructure engineering leader Federico was a founding member of the Media Infrastructure team...

  • Operating system upgrades at LinkedIn’s scale

    August 31, 2022

    Co-authors: Hengyang Hu, Dinesh Dhakal, Kalyanasundaram Somasundaram

    Completing recurring operating system (OS) upgrades...