Co-authors: Bhupendra Kumar Jain, Aditya Narain Gupta, Kuai Yu, and Hung Tran
At LinkedIn, trusted data platforms and quality data pipelines are essential to meaningful business metrics and sound decision-making. Today, a considerable percentage of data at LinkedIn comes from online data stores. Whether the online data systems fall into SQL or NoSQL categories,...
apache Articles
-
Open source software is no longer restricted to addressing small, low-level problems or to companies that can’t afford to build it themselves. Today it is backed by a mature community that creates and adopts it on behalf of some of the world’s biggest technology companies. Why the shift? Why should companies focus some of their efforts on open source? Why...
- Topics:
- apache,
- Lucene,
- Kafka,
- Open Source
-
Apache Samza is a stream processing framework that LinkedIn developed to solve some of our toughest stream processing challenges. We open sourced it in September of 2013 as an Apache Incubator project. I'm very pleased to announce that Samza recently graduated from Apache Incubator into a top-level Apache project. Apache Incubator is the entry point into the...
-
This post originally appeared as a contributed piece on The New Stack. LinkedIn began processing “big data” on Apache Hadoop six years...
- Topics:
- Stream Processing,
- apache,
- Kafka,
- Samza,
- Open Source
-
At LinkedIn, we use a log-centric system called Apache Kafka to move tons of data around. If you're not familiar with Kafka, you can...