CIO Influence

The Apache Software Foundation Announces Apache Pinot as a Top-Level Project
Open Source distributed real-time Big Data analytics infrastructure in use at Amazon-Eero, Doordash, Factual/FourSquare, LinkedIn, Stripe, Uber, Walmart, Weibo, and WePay, among others.

The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, announced today Apache Pinot as a Top-Level Project (TLP).

Apache Pinot is a distributed Big Data analytics infrastructure created to deliver scalable real-time analytics at high throughput with low latency. The project was first created at LinkedIn in 2013, open-sourced in 2015, and entered the Apache Incubator in October 2018.

“We are pleased to successfully adopt ‘the Apache Way’ and graduate from the Apache Incubator,” said Kishore Gopalakrishna, Vice President and original co-creator of Apache Pinot. “Pinot initially pushed the boundaries of real-time analytics by delivering insights to millions of LinkedIn users. Today, as an Apache Top-Level Project, Pinot is in the hands of developers across the globe who are building it to power several user-facing analytical applications and unlock the value of data within their organizations.”

Scalable to trillions of records, Apache Pinot provides online analytical processing (OLAP) over both online and offline data, ingesting from Apache Kafka, Apache Spark, Apache Hadoop HDFS, flat files, and Cloud storage in real time. Pinot can ingest millions of events and serve thousands of queries per second, and provides unified analytics in a distributed, fault-tolerant fashion. Features include:

  • Speed: answers OLAP queries with low latency on real-time data
  • Pluggable indexing: Sorted, Inverted, Text Index, Geospatial Index, JSON Index, Range Index, Bloom filters
  • Smart Materialized Views: fast aggregations via star-tree index
  • Near real-time ingestion from different stream systems: Apache Kafka, Confluent Kafka, and Amazon Kinesis, with customizable input formats and out-of-the-box support for Avro and JSON
  • Highly available, horizontally scalable, and fault tolerant
  • Supports lookup joins natively and full joins using PrestoDB/Trino
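
To make the indexing feature concrete, the following is a minimal, self-contained Python sketch of how an inverted index can answer a filtered aggregation without scanning every row. It is an illustration of the general technique only, not Pinot's implementation or API; the sample rows, the column names, and the helper functions `build_inverted_index` and `sum_clicks` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy rows; the column names are illustrative only.
ROWS = [
    {"country": "US", "device": "mobile", "clicks": 3},
    {"country": "IN", "device": "desktop", "clicks": 5},
    {"country": "US", "device": "desktop", "clicks": 2},
    {"country": "IN", "device": "mobile", "clicks": 7},
]


def build_inverted_index(rows, column):
    """Map each distinct value of `column` to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[column]].add(row_id)
    return index


def sum_clicks(rows, indexes, **filters):
    """Sum `clicks` over rows matching every equality filter.

    Matching row ids come from intersecting index postings, so rows that
    cannot match are never read.
    """
    matching = set(range(len(rows)))
    for column, value in filters.items():
        matching &= indexes[column].get(value, set())
    return sum(rows[row_id]["clicks"] for row_id in matching)


# One inverted index per filterable column (an assumption of this sketch).
INDEXES = {col: build_inverted_index(ROWS, col) for col in ("country", "device")}

if __name__ == "__main__":
    print(sum_clicks(ROWS, INDEXES, country="US"))
    print(sum_clicks(ROWS, INDEXES, country="IN", device="mobile"))
```

In this sketch, each filter narrows the candidate set by intersecting precomputed row-id sets, which is the basic reason an inverted index helps low-latency filtering. A production system would typically store postings in compressed bitmaps rather than Python sets.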

Apache Pinot is used to power internal and external analytics at Adbeat, Amazon-Eero, Cloud Kitchens, Confluera, Doordash, Factual/FourSquare, Guitar Center, LinkedIn, Publicis Sapient, Razorpay, Scale U********, Startree, Stripe, Traceable, Uber, Walmart, Weibo, WePay, and more.

Examples of how Apache Pinot helps organizations across numerous verticals include: 1) a fintech company uses Pinot to achieve financial data visibility across 500+ terabytes of data and sustain half a million queries per second with financial transactions; 2) a food delivery service leveraged Pinot in the midst of the COVID-19 pandemic to analyze real-time data to provide a socially distanced pick-up experience for its riders and restaurants; and 3) a large retail chain with geographically distributed franchises and stores uses Pinot for revenue-generating opportunities by analyzing real-time data for internal use cases, as well as real-time cart analysis to i*************.

“We rely on Apache Pinot for all our real-time analytics needs at LinkedIn,” said Kapil Surlaker, Vice President of Engineering at LinkedIn. “It’s battle-tested at LinkedIn scale for hundreds of our low-latency analytics applications. We believe Apache Pinot is the best tool out there to build site-facing analytics applications and we will continue to contribute heavily and collaborate with the Apache Pinot community. We are very happy to see that it’s now a Top-level Apache project.”

“We use Apache Pinot in our real-time analytics platform to power external user-facing applications and critical operational dashboards,” said Ujwala Tulshigiri, Engineering Manager at Uber. “With Pinot’s multi-tenancy support and horizontal scalability, we have scaled to hundreds of use cases that run complex aggregation queries on terabytes of data at millisecond latencies, with the minimal overhead of cluster management.”

“We’ve been using Apache Pinot since last year, and it’s been a huge win for our client’s dashboard project,” said Ken Krugler, President of Scale U********. “Pinot’s ability to rapidly generate aggregation results over billions of records, with modest hardware requirements, was critical for the success of the project. We’ve also been able to provide patches to add functionality and fix issues, which the Pinot community has quickly integrated and released. There was never any doubt in our minds that Pinot would graduate from the Apache incubator and become a successful top-level project.”

“Last year, we started without analytics built into our product,” said Pradeep Gopanapalli, technical staff member at Confluera. “By the end of the year, we were using Apache Pinot for real-time analytics in production. Not many of our competitors can even dream of having such results. We are very happy with our choice.”

“Pinot is critical to our real-time analytics platform and allowed us to scale without degrading latency,” said software engineer Elon Azoulay. “Pinot enables us to onboard large datasets effortlessly, run complex queries which return in milliseconds and is super reliable. We would like to emphasize how helpful and engaged the community is and are certain that we made the right choice with Pinot, it continues to impress us and satisfy our real-time analytics needs.”

“We created Pinot at LinkedIn with the goal of tackling the low-latency OLAP problem for site-facing use cases at scale. We evolved it to solve numerous OLAP use cases, and open-sourced it because there aren’t many technologies in that domain,” said Subbu Subramaniam, member of the Apache Pinot Project Management Committee, and Senior Staff Engineer at LinkedIn. “It is heart-warming to see such a wide adoption and great contributions from the community in improving Pinot over time.”

“We are at the beginning of this transformation and we cannot wait to see every software company build real-time applications using Apache Pinot,” added Gopalakrishna. “We welcome everyone to join our community Slack channel and contribute to the project.”
