Apache Samza Articles

  • Unified Streaming And Batch Pipelines At LinkedIn: Reducing Processing time by 94% with Apache Beam

    March 23, 2023

    Co-authors: Yuhong Cheng, Shangjin Zhang, Xinyu Liu, and Yi Pan

    Efficient data processing is crucial in reducing learning curves, simplifying maintenance efforts, and decreasing operational complexity. This, in turn, helps engineers to develop and deploy data processing applications quickly and easily, powering various business requirements, and enhancing member...

  • Near real-time features for near real-time personalization

    March 1, 2022

    Co-authors: Rupesh Gupta, Sasha Ovsankin, Qing Li, Seunghyun Lee, Benjamin Le, and Sunil Khanal

    At LinkedIn, we strive to serve the most relevant recommendations to our members, whether that’s a job they may be interested in, a member they may want to connect with, or another type of suggestion. In order to do that, we need to know their intent and preferences,...

  • Building a better and faster Beam Samza runner

    October 1, 2020

    Co-authors: Yixing Zhang, Bingfeng Xia, Ke Wu, and Xinyu Liu

    Since the Beam Samza runner was developed at LinkedIn in 2018, we have grown to 100+ Samza Beam jobs running in production. As our usage grew, we wanted to better understand how the Samza runner performs compared to other runners and identify areas of improvement. In general, for stream processing platforms,...

  • Test Strategy for Samza/Kafka Services

    April 27, 2017

    Over a decade ago, test strategies invested heavily in UI-driven tests. Backend and mid-tier services were tested using automated...

  • Asynchronous Processing and Multithreading in Apache Samza,...

    January 6, 2017

    This post is the second in a series discussing asynchronous processing and multithreading in Apache Samza. In the previous post, we...

  • Asynchronous Processing and Multithreading in Apache Samza,...

    January 4, 2017

    As part of the Apache Samza 0.11 release, we rebuilt Samza’s underlying event processing engine to use an asynchronous and parallel...