Pinot Joins Apache Incubator

Neha Pawar

Head of Data Infra at StarTree

March 12, 2019

Pinot is a scalable distributed OLAP data store developed at LinkedIn to deliver real-time, low-latency analytics. Pinot’s powerful analytics capabilities have made it the de facto real-time analytics infrastructure at LinkedIn, serving a wide variety of production use cases. The most notable ones are customer-facing applications, such as “Who Viewed My Profile,” Talent Insights, and news feed customization, all of which involve voluminous data, serve thousands of queries per second, and require very low latency. We open sourced Pinot four years ago and have received tremendous interest from the open source community. The developers of Pinot, both internal and external, have worked relentlessly to enhance it. Today, we are thrilled to share that Pinot has entered Apache incubation!

As of today, Pinot can sustain thousands of queries per second (QPS) and deliver results in tens to hundreds of milliseconds, while ingesting new data at millions of events per second. Inside LinkedIn, we host over 50 site-facing analytics use cases on Pinot.

Pinot also powers internal analytics at LinkedIn, such as dashboards used by business analysts. Pinot serves as the backend for ThirdEye, a comprehensive platform for real-time monitoring of metrics, as well as XLNT, an end-to-end A/B testing platform. All of these applications rely on Pinot’s ability to slice and dice high-dimensional data on demand, providing users insights into LinkedIn’s datasets to make data-driven decisions.

What’s new in Pinot

Here’s a quick recap of the existing components of Pinot. For a deep dive into the details, see here.

Pinot Server: Hosts one or more data segments and serves queries from those segments.
Pinot Broker: Accepts queries from clients, routes them to one or more servers, and returns a consolidated response to the client.
Pinot Controller: Manages other Pinot components and controls assignment of data segments to servers.
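
To make the division of labor concrete, here is a minimal Python sketch of the three roles. It is a toy model, not Pinot code: the class names, the round-robin assignment, and the count-only query are all assumptions made for illustration.

```python
class Server:
    """Hosts named segments (lists of row dicts) and answers queries over them."""

    def __init__(self, segments):
        self.segments = segments  # segment name -> list of rows

    def count_where(self, segment_names, predicate):
        # Partial result computed only over the segments this server was asked for.
        return sum(
            1
            for name in segment_names
            for row in self.segments[name]
            if predicate(row)
        )


def assign_segments(segment_names, server_names):
    """Controller stand-in (hypothetical): round-robin assignment of segments to servers."""
    table = {name: [] for name in server_names}
    for i, segment in enumerate(segment_names):
        table[server_names[i % len(server_names)]].append(segment)
    return table


class Broker:
    """Scatters a query to the servers in its routing table and merges partial counts."""

    def __init__(self, servers, routing_table):
        self.servers = servers              # server name -> Server
        self.routing_table = routing_table  # server name -> segments to query there

    def count_where(self, predicate):
        partials = [
            self.servers[name].count_where(segments, predicate)
            for name, segments in self.routing_table.items()
        ]
        return sum(partials)


all_segments = {
    "seg0": [{"country": "US"}, {"country": "IN"}],
    "seg1": [{"country": "US"}],
    "seg2": [{"country": "DE"}],
}
routing_table = assign_segments(sorted(all_segments), ["server-a", "server-b"])
servers = {
    name: Server({seg: all_segments[seg] for seg in segs})
    for name, segs in routing_table.items()
}
broker = Broker(servers, routing_table)
us_rows = broker.count_where(lambda row: row["country"] == "US")  # 2
```

The broker never touches row data itself; it only combines the partial results that each server computes over its own segments.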

Pinot supports near real-time data ingestion by reading events directly from streams like Kafka as well as data pushes from offline systems like Hadoop.

Pinot architecture

Since the time Pinot was open sourced, it has grown by leaps and bounds. Here are some of the highlights of what we’ve been up to.

Deep storage
We have introduced a Pinot filesystem abstraction, thereby providing users the option to plug in their own storage backend. Natively, we support Hadoop, NFS, and Azure Data Lake. Find out more about configuring these at Pluggable Storage.
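
A pluggable storage layer of this kind usually comes down to a small interface plus a registry of implementations. The Python sketch below illustrates that idea only. The names `SegmentStore`, `LocalDirStore`, and `open_store` are hypothetical and do not mirror the actual Pinot filesystem API.

```python
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SegmentStore(ABC):
    """Hypothetical storage interface that each backend would implement."""

    @abstractmethod
    def upload(self, local_path, remote_name): ...

    @abstractmethod
    def download(self, remote_name, local_path): ...

    @abstractmethod
    def exists(self, remote_name): ...


class LocalDirStore(SegmentStore):
    """Backend that keeps segments in a local (or NFS-mounted) directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def upload(self, local_path, remote_name):
        shutil.copyfile(local_path, self.root / remote_name)

    def download(self, remote_name, local_path):
        shutil.copyfile(self.root / remote_name, local_path)

    def exists(self, remote_name):
        return (self.root / remote_name).exists()


# Registry keyed by URI scheme; a new backend plugs in by adding an entry here.
STORE_BY_SCHEME = {"file": LocalDirStore}


def open_store(scheme, root):
    return STORE_BY_SCHEME[scheme](root)


def round_trip(payload):
    """Upload a segment file to the store, download it again, and return its bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        store = open_store("file", tmp / "deep-store")
        source = tmp / "segment.bin"
        source.write_bytes(payload)
        store.upload(source, "myTable_0")
        restored = tmp / "restored.bin"
        store.download("myTable_0", restored)
        return restored.read_bytes()
```

Callers depend only on the abstract interface, so swapping the backend means changing the scheme passed to `open_store`, not the calling code.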

Smart routing strategies
Previously, Pinot brokers by default used all available servers for scatter-gather during query execution, resulting in high tail latency. We have added smart routing and segment assignment strategies that limit the server fan-out for a query, which has helped bring the 99th percentile latency down from 200ms to under 100ms in certain cases.
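
One plausible way to limit fan-out is to organize servers into replica groups, each holding a full copy of the data, and to route a query to a single group. The Python sketch below compares that with picking a replica independently per segment. The layout and function names are assumptions for illustration, not a description of Pinot's actual strategies.

```python
import random


def balanced_routing(segment_to_servers, rng):
    """Pick a replica independently for each segment; may touch every server."""
    routing = {}
    for segment, replicas in segment_to_servers.items():
        routing.setdefault(rng.choice(replicas), []).append(segment)
    return routing


def replica_group_routing(segment_to_servers, replica_groups, rng):
    """Pick one replica group and serve every segment from servers in that group."""
    group = set(rng.choice(replica_groups))
    routing = {}
    for segment, replicas in segment_to_servers.items():
        server = next(s for s in replicas if s in group)
        routing.setdefault(server, []).append(segment)
    return routing


# Hypothetical layout: two replica groups of two servers, each group holding all segments.
replica_groups = [["s1", "s2"], ["s3", "s4"]]
segment_to_servers = {
    f"seg{i}": [replica_groups[0][i % 2], replica_groups[1][i % 2]] for i in range(8)
}

rng = random.Random(0)
wide = balanced_routing(segment_to_servers, rng)       # up to 4 servers queried
narrow = replica_group_routing(segment_to_servers, replica_groups, rng)  # 2 servers
```

Fewer servers per query means fewer chances for one slow server to delay the whole response, which is why fan-out affects tail latency in particular.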

Pluggable real-time streams
Pinot real-time is no longer tightly integrated with Kafka for ingestion. Users can develop their own plugins to read from any pub-sub system. Examples of pub-sub streams for which plugins can be written are Azure EventHubs and Amazon Kinesis. Find out more at Pluggable Streams In Realtime.
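
The shape of such a plugin can be sketched as an abstract consumer that the ingestion loop calls without knowing which system sits behind it. The Python below is a hypothetical illustration: `StreamConsumer`, `InMemoryStream`, and `ingest` are invented names, not the Pinot plugin interface.

```python
from __future__ import annotations

import json
from abc import ABC, abstractmethod


class StreamConsumer(ABC):
    """Hypothetical plugin contract: return (offset, payload) pairs from start_offset."""

    @abstractmethod
    def fetch(self, start_offset: int, max_messages: int) -> list[tuple[int, bytes]]: ...


class InMemoryStream(StreamConsumer):
    """Stand-in for a real pub-sub client; offsets are simply list positions."""

    def __init__(self, messages):
        self._messages = list(messages)

    def fetch(self, start_offset, max_messages):
        end = min(start_offset + max_messages, len(self._messages))
        return [(offset, self._messages[offset]) for offset in range(start_offset, end)]


def ingest(consumer, decode, batch_size=2):
    """Drain the stream in batches; return decoded rows and the next offset to read."""
    rows, offset = [], 0
    while True:
        batch = consumer.fetch(offset, batch_size)
        if not batch:
            break
        for message_offset, payload in batch:
            rows.append(decode(payload))
            offset = message_offset + 1
    return rows, offset


stream = InMemoryStream([b'{"user": 1}', b'{"user": 2}', b'{"user": 3}'])
rows, next_offset = ingest(stream, json.loads)
```

Because `ingest` only relies on `fetch`, a plugin for another pub-sub system would only need to provide its own `StreamConsumer` subclass.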

TDigest support
The calculation of percentiles is tricky in a distributed environment because percentiles are non-additive: exact per-server percentiles cannot be combined into an exact overall percentile. Pinot can now accept byte-serialized TDigest data and store it natively, which can then be queried to compute percentiles as follows:

SELECT percentileTDigest95(tDigestColumn) from myTable where… group by… TOP N

As a byproduct of this, Pinot also supports the byte[] data type natively. Multiple operators, such as Avg, MinMaxRange, and DistinctCountHLL, can take either serialized data structures or raw values.
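
The non-additivity problem is easy to reproduce. In the Python sketch below, averaging per-server medians gives a badly wrong answer, while merging per-server summaries and querying the merged result does not. The summary here is an exact value-count table used as a simple stand-in for a TDigest. A real TDigest is a compact, approximate structure, and nothing below reflects its internals or Pinot's implementation.

```python
import math
import statistics
from collections import Counter

server_a = [1, 1, 1, 1, 50]
server_b = [2, 60, 70]

# Naive approach: combine per-server medians. The result is far from the true median.
naive = statistics.mean([statistics.median(server_a), statistics.median(server_b)])  # 30.5
exact = statistics.median(server_a + server_b)  # 1.5


def summarize(values):
    """Mergeable summary: value -> count (a stand-in for a TDigest)."""
    return Counter(values)


def percentile_from_summary(summary, q):
    """Nearest-rank percentile (0 < q <= 100) computed from a value -> count summary."""
    total = sum(summary.values())
    rank = max(1, math.ceil(q / 100 * total))
    seen = 0
    for value in sorted(summary):
        seen += summary[value]
        if seen >= rank:
            return value


# Summaries merge by addition, so each server can ship its summary to the broker.
merged = summarize(server_a) + summarize(server_b)
p50 = percentile_from_summary(merged, 50)  # 1
p95 = percentile_from_summary(merged, 95)  # 70
```

The key property is that the summary, unlike the percentile itself, can be merged across servers before the percentile is read off.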

New indexing techniques
Apart from existing indexing techniques (bitmap inverted index, sorted index) built atop dictionary encoding, we have introduced some new indexing techniques.

Star-Tree index: The Star-Tree index is built on multiple columns and uses pre-aggregated results to significantly reduce the number of values to be processed, thus improving query performance. We support multiple Star-Trees with various types of aggregations.

Raw value forward index: When aggregating a large number of values, raw value forward indexes take advantage of good locality of the values for scanning, thus vastly improving the performance.
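
The pre-aggregation idea behind the Star-Tree index can be illustrated with a flat lookup table in which a dimension is either kept or replaced by a star, meaning "any value". This Python sketch is a deliberate simplification: a real Star-Tree is a tree and does not materialize every combination, and the function names here are hypothetical.

```python
from collections import defaultdict
from itertools import product

STAR = "*"  # marks a dimension that has been aggregated away


def build_preaggregates(rows, dimensions, metric):
    """Precompute SUM(metric) for every keep-or-star combination of the dimensions."""
    table = defaultdict(int)
    for row in rows:
        for mask in product([False, True], repeat=len(dimensions)):
            key = tuple(
                STAR if starred else row[dim] for dim, starred in zip(dimensions, mask)
            )
            table[key] += row[metric]
    return dict(table)


def query_sum(table, dimensions, filters):
    """Answer SUM with equality filters via one lookup instead of a row scan."""
    key = tuple(filters.get(dim, STAR) for dim in dimensions)
    return table.get(key, 0)


rows = [
    {"country": "US", "browser": "chrome", "views": 10},
    {"country": "US", "browser": "firefox", "views": 5},
    {"country": "IN", "browser": "chrome", "views": 7},
]
dims = ["country", "browser"]
table = build_preaggregates(rows, dims, "views")

us_views = query_sum(table, dims, {"country": "US"})       # 15
chrome_views = query_sum(table, dims, {"browser": "chrome"})  # 17
all_views = query_sum(table, dims, {})                      # 22
```

The trade-off is extra storage for the pre-aggregated records in exchange for touching far fewer values at query time.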

Pinot Minion
Pinot Minion is a new component that leverages the Helix Task Framework. It can be attached to an existing Pinot cluster, where it executes tasks provided by the controller. By serving as the single generic place for running background jobs, Pinot Minions help offload computationally intensive tasks, such as adding indexes to segments and merging segments, from other components.

UDF support
Many use cases require transformations on column values before and/or after selection, aggregation, and group-by. We added UDF support to make such custom transformations possible, with some built-in UDFs for common use cases such as time column transformation.
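
As a rough illustration of a time column transformation applied before a group-by, the Python sketch below registers a function that converts epoch milliseconds to days and uses it as the grouping key. The registry, the `millis_to_days` name, and the query helper are assumptions for this example and are not Pinot's UDF syntax.

```python
from collections import defaultdict

TRANSFORMS = {}  # hypothetical registry: transform name -> function

MILLIS_PER_DAY = 86_400_000


def register(name):
    """Decorator that makes a function available to queries under the given name."""
    def decorator(fn):
        TRANSFORMS[name] = fn
        return fn
    return decorator


@register("millis_to_days")
def millis_to_days(epoch_millis):
    return epoch_millis // MILLIS_PER_DAY


def sum_grouped_by(rows, column, transform_name, metric):
    """SUM(metric) grouped by transform(column), applied row by row."""
    transform = TRANSFORMS[transform_name]
    totals = defaultdict(int)
    for row in rows:
        totals[transform(row[column])] += row[metric]
    return dict(totals)


events = [
    {"ts": 0, "clicks": 1},
    {"ts": 3_600_000, "clicks": 2},   # one hour in, still day 0
    {"ts": 86_400_000, "clicks": 4},  # first millisecond of day 1
]
clicks_per_day = sum_grouped_by(events, "ts", "millis_to_days", "clicks")  # {0: 3, 1: 4}
```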

Pinot @SIGMOD 2018
Our paper on Pinot was published and presented at SIGMOD 2018. Check out Pinot: Realtime OLAP for 530 Million Users to find out more about Pinot and how it compares with similar products in this space (e.g., Druid).

Coming soon

Here’s a sneak peek into some of the exciting upcoming features in Pinot.

Segment merge service: Pinot’s query execution happens at the segment level, which can suffer from a substantial query planning overhead if faced with too many small segments. We’re designing a segment merge service, which will automatically merge small segments into larger ones, thus enhancing the query performance. Merging segments will also achieve better compression, which will lead to improved storage utilization.

Decoupling Pinot and Helix controllers: We will be working on separating the Pinot and Helix controllers, which will facilitate independently scaling and debugging the two controllers.

Distributed Pinot controllers: We plan to add a mechanism to distribute the controller’s duties across all available controllers. This will help eliminate the strong dependency on the lead controller and improve resource utilization across controllers.

Multi-tenancy support: We are actively working on multiple features (segment assignment strategies, segment relocation, resource isolation, query admission control) to enhance the multi-tenancy support in Pinot.

HOCON-based config format: We will soon launch a new HOCON-based config format with an easy and intuitive syntax. This feature will greatly reduce the risk of misconfigurations.
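
Returning to the segment merge service mentioned above: since it is still being designed, the following Python sketch is only a guess at what the planning step might look like. It greedily groups consecutive small segments until a target row count would be exceeded. The function name, the input format, and the greedy policy are all assumptions.

```python
def plan_merges(segment_sizes, target_rows):
    """Group consecutive segments so each merged segment stays within target_rows.

    segment_sizes is a list of (segment_name, row_count) pairs in time order.
    A segment that alone exceeds target_rows ends up in a group of its own.
    """
    plans, current, current_rows = [], [], 0
    for name, rows in segment_sizes:
        # Close the current group if adding this segment would overshoot the target.
        if current and current_rows + rows > target_rows:
            plans.append(current)
            current, current_rows = [], 0
        current.append(name)
        current_rows += rows
    if current:
        plans.append(current)
    return plans


small_segments = [("s0", 100), ("s1", 200), ("s2", 800), ("s3", 50), ("s4", 50)]
merge_plan = plan_merges(small_segments, target_rows=1000)
# [["s0", "s1"], ["s2", "s3", "s4"]]
```

Here five segments collapse into two, so the query planner has fewer segments to consider for the same data.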

Shout out to the open source community

The open source community has been instrumental in spearheading some important developments and interesting proposals in Pinot.

Pinot @Uber
Uber uses Pinot to power real-time dashboards and business intelligence applications across a wide range of use cases in the “Engineering & Operational City” teams. The Pinot contributors at Uber have been proactive in making improvements to the real-time pipeline to improve latencies and storage utilization. They are working on integrating Apache Superset with Pinot. They have also proposed and worked on features such as deep storage support in real-time ingestion (eliminating the need for storage in real-time service), PQL enhancements, metadata upload in real-time segment completion, and so on. We thank Xiang Fu, James Shao, Haibo Wang, Ting Chen, Devesh Agrawal, and Chinmay Soman for their ongoing contributions!

Pinot @Microsoft
Microsoft Teams has started to pilot Pinot for low-latency reporting and analytics. The Teams Telemetry group has Pinot running on 100% Azure infrastructure and is now evaluating Pinot as the backend for providing usage reports to customer IT admins, in-product analytics, and internal business intelligence reports. The Pinot contributors at Microsoft have made many improvements to segment generation code. Their collaboration in the design of deep storage support in Pinot is highly appreciated. They also put forth numerous interesting proposals such as a plugin to ingest from Azure Event Hubs and SSL config support in controllers and brokers. Thanks Vish Balasubramanian, Richard Ninh, Nandini Malempati, Fred Weitendorf, and Greg Koehler for your involvement in Pinot!

Get Involved

We welcome everyone to get involved in the Apache Pinot community. Please join our dev and user mailing lists to get started.

You can check out the Pinot documentation here. Last week, we published our first release Apache Pinot (Incubating) 0.1.0, which can be downloaded here. We would be more than happy to answer any questions you might have about Pinot. There are multiple ways to get in touch with us. You can open a JIRA ticket, create an open source issue, or shoot an email to one of our mailing lists. Feel free to create pull requests to make contributions. We look forward to hearing your interesting proposals and brainstorming ideas with the community!

Big Data meetup
We are delighted to announce that we will be hosting a Big Data Meetup on March 27 at LinkedIn’s Mountain View campus. Please join us for a glass of Pinot and hear speakers from LinkedIn, Microsoft, and Slack talk about interactive analytics at scale.

Acknowledgments

This project would not have been possible without the relentless efforts of the Pinot team within LinkedIn. This includes the remarkable engineers Sunitha Beeram, Jeffrey Bowles, Jennifer Dai, John Gutmann, Walter Huf, Jean-Francois Im, Xiaotian Jiang, Seunghyun Lee, Jialiang Li, Dino Occhialini, Mayank Shrivastava, and Subbu Subramaniam. We thank Ravi Aringunram, Kishore Gopalakrishna, Prasanna Ravi, and Shraddha Sahay for their leadership, vision, and guidance.

We would also like to express our sincere gratitude to our Apache mentors Felix Cheung, Jim Jagielski, Olivier Lamy, and Roman Shaposhnik. And finally, we extend thanks to Kapil Surlaker and Igor Perisic for their continued support.
