Data Articles

  • Nuage: Making Data Systems Management Scalable

    August 7, 2018

    With more than half a billion members on LinkedIn, we have had to create new ways to scale our infrastructure and support the tremendous growth in data. We’re always looking for the best ways to manage our growth, from our internally built systems to leveraging technology from Azure for external cloud integrations. One of the ways we manage this data influx is...

  • Voices: a Text Analytics Platform for Understanding Member Feedback

    June 10, 2016

    In the era of big data, corporations and businesses are increasingly collecting immense amounts of unstructured data in the form of free text, from customer service conversations to market research surveys. While it is clear that such member feedback, or “Voice of the Member” (VOM), contains valuable information, it is often less clear how to best analyze such...

  • Open Sourcing WhereHows: A Data Discovery and Lineage Portal

    March 3, 2016

    In modern data-driven businesses, the complexity that arises from fast-paced analytics, data mining and ETL processes makes metadata increasingly important. In this blog post, we share our own journey and a new open source effort that aims to boost productivity and data provenance. WhereHows, a project of the LinkedIn Data team, works by creating a central...

  • Benchmarking Apache Samza: 1.2 million messages per second ...

    August 24, 2015

    Update Apr 13, 2016: There have been numerous improvements to the Samza cachestore (SAMZA-658, SAMZA-812, SAMZA-873, etc.) since our last test...

  • Benchmarking Apache Samza: 1.2 million messages per second ...

    August 24, 2015

    Apache Samza has been run in production and is used by many LinkedIn services to solve a variety of stream processing scenarios. For...
