Hadoop Articles

  • Gobblin Gobbles Camus, Looks Towards the Future

    April 13, 2016

    We shared Gobblin with the open source community a year ago. Since then, we’ve seen increasing interest and adoption among engineers, researchers and analysts in using Gobblin to integrate data from a variety of sources into Hadoop. In previous blog posts, publications, and talks, we’ve described our motivations for building a unified ingestion framework that is...

  • Open Sourcing Dr. Elephant

    April 8, 2016

    We are proud to announce today that we are open sourcing Dr. Elephant, a powerful tool that helps users of Hadoop and Spark understand, analyze, and improve the performance of their flows. We first presented Dr. Elephant to the community last year during the eighth annual Hadoop Summit, a leading conference for the Apache Hadoop community. Hadoop is a framework...

  • Sizr: Visualizing HDFS utilization at LinkedIn

    January 8, 2016

    Co-authors: Vamshi Hardageri, Brian Jue. Sizr is an interactive visualization tool developed at LinkedIn for the Hadoop Distributed File System (HDFS). It provides insights into HDFS disk space and namespace utilization. It can forecast and track weekly, monthly, and quarterly growth, and detect inefficient file storage. This post outlines the need for this tool,...

  • Bridging Batch and Streaming Data Ingestion with Gobblin

    September 28, 2015

    Genesis: Less than a year ago, we introduced Gobblin, a unified ingestion framework, to the world of Big Data. Since then, we’ve shared...

  • Rewinder: Interactive Analysis of Hadoop's...

    September 23, 2015

    Co-authors: Teja Thotapalli, Brian Jue, Tu Tran, Sandhya Ramu. As LinkedIn continues to grow in size and stature, the data volume being...

  • Open-Sourcing the LinkedIn Gradle Plugin and DSL for Apache...

    August 13, 2015

    I'm proud to announce that the Hadoop Dev Team at LinkedIn has open-sourced the LinkedIn Gradle Plugin for Apache Hadoop ("Hadoop...