This post is the second in a series that discusses some of the hard problems in stream processing. In the previous post, we explored the use of lambda architecture in stream processing and discussed techniques to avoid it. In this post, we’ll focus on one of the main bottlenecks in high-scale stream processing applications: “accessing data.” Background...
Posts by Kartik Paramasivam
- Topics: Apache Samza, Stream Processing, Big Data, Kafka, Samza
We live in an age where we want to know relevant things happening around the world as soon as they happen; an age where digital content is updated instantly based on our likes and dislikes; an age where credit card fraud, security breaches, device malfunctions and site outages need to be detected and remedied as soon as they happen. It is an age where events are...
- Topics: Apache Samza, Stream Processing, Big Data, realtime, Kafka, Samza
At LinkedIn, events pertaining to application and system monitoring, member behavior tracking, inter-application communication, etc., are all ingested into our pub-sub messaging system (Kafka). A staggering 1.3 trillion events are published into Kafka per day with peaks of 4.5 million messages/sec per cluster (these numbers don’t correlate directly with site...
- Topics: Apache Samza, Samza
How We’re Improving and Advancing Kafka at LinkedIn
Kartik Paramasivam September 2, 2015
Kafka continues to be one of the key pillars in LinkedIn’s data infrastructure. One of our engineers has described it as LinkedIn’s...
- Topics: Kafka