Co-authors: Zihan Li, Sudarshan Vasudevan, Lei Sun, and Shirshanka Das
Data analytics and AI power many business-critical use cases at LinkedIn. We need to ingest data in a timely and reliable way from a variety of sources, including Kafka, Oracle, and Espresso, bringing it into our Hadoop data lake for subsequent processing by AI and data science pipelines. We...
batch processing Articles
- Topics:
- Stream Processing
- Hadoop
- Data
- batch processing
- Open Source
- Gobblin
- Kafka
Co-authors: Xiang Zhang and Jingyu Zhu
Introduction
The Lambda architecture has become a popular architectural style that promises both speed and accuracy in data processing by using a hybrid approach of both batch processing and stream processing methods. But it also has some drawbacks, such as complexity and additional development/operational overheads. One of...
- Topics:
- Stream Processing
- Pinot
- Profile
- Architecture
- Kafka
- batch processing
Co-authors: Khai Tran and Steve Weiss
Batch and streaming computations are often combined in the Lambda architecture, but this carries the cost of maintaining two different code bases for the same logic. We have previously shared on the blog a behind-the-scenes look at our approach to enabling the seamless translation of declarative batch code into streaming...
- Topics:
- Stream Processing
- batch processing
- Data
- Pinot
- Gobblin
- Kafka
- Samza
The existing Lambda architecture
With the evolution of big data technologies over time, two classes of computations have been...
- Topics:
- Stream Processing
- batch processing
- Data