Kafka Articles

  • Stream Processing Hard Problems – Part 1: Killing Lambda

    June 27, 2016

    We live in an age where we want to know relevant things happening around the world as soon as they happen; an age where digital content is updated instantly based on our likes and dislikes; an age where credit card fraud, security breaches, device malfunctions and site outages need to be detected and remedied as soon as they happen. It is an age where events are...

  • Kafka Ecosystem at LinkedIn

    April 19, 2016

    Apache Kafka is a highly scalable messaging system that plays a critical role as LinkedIn’s central data pipeline. Kafka was developed at LinkedIn back in 2010, and it currently handles more than 1.4 trillion messages per day across over 1400 brokers. Kafka’s strong durability and low latency have enabled us to use Kafka to power a number of newer...

  • Gobblin Gobbles Camus, Looks Towards the Future

    April 13, 2016

    We shared Gobblin with the open source community a year ago. Since then, we’ve seen increasing interest and adoption among engineers, researchers and analysts in using Gobblin to integrate data from a variety of sources into Hadoop. In previous blog posts, publications, and talks, we’ve described our motivations for building a unified ingestion framework that is...

  • Bridging Batch and Streaming Data Ingestion with Gobblin

    September 28, 2015

    Genesis: Less than a year ago, we introduced Gobblin, a unified ingestion framework, to the world of Big Data. Since then, we’ve shared...

  • Prototyping Venice: Derived Data Platform

    August 10, 2015

    This is an interview with Clement Fung, who interned with the Voldemort team last year and liked LinkedIn so much that he decided to...

  • A Brief History of Scaling LinkedIn

    July 20, 2015

    LinkedIn started in 2003 with the goal of connecting to your network for better job opportunities. It had only 2,700 members the first...