Co-authors: Junkai Xue and Hunter Lee
Stateless tasks are widely used to serve large-scale data processing systems. Many systems that rely on Apache Helix have requested that a stateless task management feature be added to it. Recently, our team decided to explore new ways to manage stateless tasks, in addition to our ongoing work to...
Distributed Systems Articles
-
- Topics:
- Apache Helix,
- Distributed Systems,
- Data,
- Open Source
-
Co-authors: Saurabh Goyal and Janardh Bantupalli
In our previous blog post introducing Brooklin, we outlined the reasons why we created our own framework for near real-time incremental data capture from production. This framework feeds data to our larger data ingestion pipeline for the hundreds of nearline applications processing data that are distributed across...
- Topics:
- Stream Processing,
- database,
- Distributed Systems
-
Co-authors: Divye Kapoor, Zheng Li, and Pujita Mathur
Introduction
As a professional social network serving more than 500 million members worldwide, LinkedIn is the premier destination for professional conversations. We have a wide variety of posts that attract significant engagement, and some of these posts go viral. These posts attract likes and comments in...
-
Editor's note: This blog has been updated.
Typical distributed data systems are clusters composed of a set of machines. If the dataset...
- Topics:
- Apache Helix,
- Big Data,
- Distributed Systems,
- Open Source
-
We build a lot of our own infrastructure systems here at LinkedIn. Many people have heard of Kafka, our distributed message buffer. We...
- Topics:
- Venice,
- Craftsmanship,
- Distributed Systems
-
Editor's note: This blog has been updated.
Background
Like many internet companies, LinkedIn has faced data growth challenges....
- Topics:
- Venice,
- Distributed Systems,
- Cluster Management,
- helix