Analytics Platforms
The Analytics team provides platforms for analytics data storage, transport, access/management, compute frameworks, and query engines.
Gobblin
Gobblin is a distributed data integration framework that simplifies common aspects of big data integration, such as ingestion, replication, organization, and lifecycle management, for both streaming and batch ecosystems.
Unified Metrics Platform (UMP)
Unified Metrics Platform (UMP) is a specification and a set of tools used to facilitate the creation of streamlined and consistent metrics data at LinkedIn.
Dali
Dali is a collection of libraries, services, and development tools united by the common goal of providing a logical data access layer for Hadoop and Spark.
TonY
TonY (TensorFlow on YARN) is a framework we built and open sourced to run distributed machine learning on Hadoop, both to meet our own needs and because many others running large Hadoop deployments are interested in doing the same.
ThirdEye
ThirdEye is a platform for real-time monitoring of metrics that covers a wide variety of use-cases such as site performance, adoption of new features, and system security.
Pinot
Pinot, a real-time distributed OLAP data store built at LinkedIn, delivers scalable real-time analytics with low latency.
Dr. Elephant
Dr. Elephant is a performance monitoring and tuning tool for Hadoop and Spark with the goal of improving developer productivity and increasing cluster efficiency.
Spark
Apache Spark has a significant impact on the way we process and analyze data. Learn more about what we're up to, including our Spark-based open source project Avro2TF.
TensorFlow
Artificial intelligence plays a big role in how we deliver content and create opportunity for our members. Many of these use cases are built on TensorFlow.