Introduction
Apache Kafka is an open-source event streaming platform where users create Kafka topics as units of data transmission, then publish or subscribe to those topics with producers and consumers. While most Kafka topics are actively used, some are no longer needed because business needs changed or because the topics themselves are ephemeral. Kafka...
Kafka Articles
-
- Topics: GC, Garbage Collection, Kafka
-
The LinkedIn infrastructure has thousands of services serving millions of queries per second. At this scale, tools that provide observability into that infrastructure are imperative to ensure that issues are quickly detected, diagnosed, and remediated. This level of visibility helps prevent outages so we can...
-
At LinkedIn, Apache Kafka is used heavily to carry all kinds of data, such as member activity, logs, metrics, and a multitude of inter-service messages. LinkedIn maintains multiple data centers with multiple Kafka clusters per data center, each of which contains an independent set of data. Mirroring (i.e., replicating) Kafka topics across the...
- Topics: Pinot, Kafka, Open Source
-
Co-authors: Zihan Li, Sudarshan Vasudevan, Lei Sun, and Shirshanka Das
Data analytics and AI power many business-critical use cases at...
- Topics: Stream Processing, Hadoop, Data, batch processing, Open Source, Gobblin, Kafka
-
Co-authors: Xiang Zhang and Jingyu Zhu
Introduction
The Lambda architecture has become a popular architectural style that promises...
- Topics: Stream Processing, Pinot, Profile, Architecture, Kafka, batch processing
-
Co-authors: Khai Tran and Steve Weiss
Batch and streaming computations are often combined in the Lambda architecture, but...
- Topics: Stream Processing, batch processing, Data, Pinot, Gobblin, Kafka, Samza