Co-authors: Yuhong Cheng, Shangjin Zhang, Xinyu Liu, and Yi Pan
Efficient data processing is crucial to reducing learning curves, simplifying maintenance, and decreasing operational complexity. This, in turn, helps engineers develop and deploy data processing applications quickly and easily, powering various business requirements and enhancing member...
Stream Processing Articles
-
- Topics:
- Apache Samza
- Spark
- Stream Processing
- apache
- Data Streams
-
Co-authors: Zihan Li, Sudarshan Vasudevan, Lei Sun, and Shirshanka Das
Data analytics and AI power many business-critical use cases at LinkedIn. We need to ingest data in a timely and reliable way from a variety of sources, including Kafka, Oracle, and Espresso, bringing it into our Hadoop data lake for subsequent processing by AI and data science pipelines. We...
- Topics:
- Stream Processing
- Hadoop
- Data
- batch processing
- Open Source
- Gobblin
- Kafka
-
Co-authors: Xiang Zhang and Jingyu Zhu
Introduction: The Lambda architecture has become a popular architectural style that promises both speed and accuracy in data processing by combining batch processing and stream processing. But it also has drawbacks, such as complexity and additional development and operational overhead. One of...
- Topics:
- Stream Processing
- Pinot
- Profile
- Architecture
- Kafka
- batch processing
-
Co-authors: Yixing Zhang, Bingfeng Xia, Ke Wu, and Xinyu Liu
Since the Beam Samza runner was developed in 2018 at LinkedIn, we now have...
- Topics:
- Stream Processing
- Apache Samza
- Performance
- Benchmark
-
Co-authors: Khai Tran and Steve Weiss
Batch and streaming computations are often combined in the Lambda architecture, but...
- Topics:
- Stream Processing
- batch processing
- Data
- Pinot
- Gobblin
- Kafka
- Samza
-
Editor's note: This blog has been updated.
Brooklin, a distributed service for streaming data in near real-time and at scale, has been...
- Topics:
- scale
- Stream Processing
- infrastructure
- Kafka
- Data
- Open Source