When I started my journey at LinkedIn ten years ago, the company was just beginning to experience extreme growth in the volume, variety, and velocity of our data. Over the next few years, my colleagues and I in LinkedIn’s data infrastructure team built out foundational technology like Espresso, Databus, and Kafka, among others, to ensure that LinkedIn would...

Posts by Shirshanka Das
- Topics: Data, Metadata, Open Source
Bridging Batch and Streaming Data Ingestion with Gobblin
Shirshanka Das, September 28, 2015
Genesis: Less than a year ago, we introduced Gobblin, a unified ingestion framework, to the world of Big Data. Since then, we’ve shared ongoing progress through a talk at Hadoop Summit and a paper at VLDB. Today, we’re announcing the open source release of Gobblin 0.5.0, a big milestone that includes Apache Kafka integration. Our motivations for building Gobblin...
- Topics: Big Data, Hadoop, Open Source, Distributed Systems, ETL, Gobblin, Kafka