Big Data in Real Time: Processing Data Streams at LinkedIn - a tech talk by Jay Kreps, creator of Kafka and Voldemort

September 8, 2011

Come by LinkedIn Headquarters on Thursday, September 15 for a public tech talk "Big Data in Real Time: Processing Data Streams at LinkedIn". Jay Kreps, creator of Kafka and Voldemort, will be presenting. If you are planning to attend, please RSVP here.


Life happens in real time. From breaking news to breaking servers, real-world events require petabytes of data to be collected, analyzed, and acted on as those events happen. Until recently, real-time data processing was an exotic practice, relegated to narrow domains like time-series analysis for financial markets. For most of us performing analytics -- and particularly web analytics -- batch processing has been the dominant approach.

This paradigm is shifting. Here at LinkedIn, we are building the next generation of scalable real-time data stream processing. Technologies like Kafka and S4 have become indispensable for enabling real-time recommendations, fraud detection, trending topic identification, and personalized news feeds.

In this talk, I will discuss the state of up-and-coming stream processing technologies and how they fit in the broader data infrastructure ecosystem -- from live storage systems to Hadoop. I will explore problems that are amenable to real-time stream processing, solutions that change and shape the way we think about data, and challenges and lessons that we have learned while building LinkedIn’s data infrastructure.


Jay Kreps is a Principal Engineer at LinkedIn. He was the original author of Voldemort, a distributed key-value storage system recently recognized by the OSCON Data Innovation Award as one of LinkedIn’s major contributions to the open source community to support data analytics. Jay has also made key contributions to Kafka, a persistent distributed message queue, and Azkaban, a simple batch scheduler for constructing and running Hadoop jobs or other offline processes. His team builds the core, data-driven features that delight LinkedIn’s users, including People You May Know, Who's Viewed My Profile, Skill Pages, and the collaborative filtering applications for LinkedIn's various recommendations. Jay has a BS and MS in computer science from the University of California, Santa Cruz.


2027 Stierlin Court
Mountain View, CA

LinkedIn Map


  • The talk is on Thursday, September 15, 2011, from 4:00 to 6:00 PM
  • Doors open at 3:30 PM
  • Please RSVP here