Shirshanka Das

Posts by Shirshanka Das

  • DataHub: Popular metadata architectures explained

    December 7, 2020

    When I started my journey at LinkedIn ten years ago, the company was just beginning to experience extreme growth in the volume, variety, and velocity of our data. Over the next few years, my colleagues and I in LinkedIn’s data infrastructure team built out foundational technology like Espresso, Databus, and Kafka, among others, to ensure that LinkedIn would...

  • Bridging Batch and Streaming Data Ingestion with Gobblin

    September 28, 2015

    Genesis: Less than a year ago, we introduced Gobblin, a unified ingestion framework, to the world of Big Data. Since then, we’ve shared ongoing progress through a talk at Hadoop Summit and a paper at VLDB. Today, we’re announcing the open source release of Gobblin 0.5.0, a big milestone that includes Apache Kafka integration. Our motivations for building Gobblin...