The Analytics team provides platforms for analytics data storage, transport, access/management, compute frameworks, and query engines.
Gobblin is a distributed data integration framework that simplifies common aspects of big data integration, such as ingestion, replication, organization, and lifecycle management, for both streaming and batch ecosystems.
Unified Metrics Platform (UMP)
Unified Metrics Platform (UMP) is a specification and a set of tools for creating streamlined, consistent metrics data at LinkedIn.
Dali is a collection of libraries, services, and development tools united by the common goal of providing a logical data access layer for Hadoop and Spark.
To meet our own needs for distributed machine learning, and because many others who run large Hadoop deployments share that interest, we built and open sourced TensorFlow on YARN (TonY).
Dr. Elephant is a performance monitoring and tuning tool for Hadoop and Spark that aims to improve developer productivity and increase cluster efficiency.
Apache Spark has a significant impact on the way we process and analyze data. Learn more about what we're up to, including our Spark-based open source project Avro2TF.