CIO Influence Interview with Priyank Patel, VP of Product Management at Cloudera

“Right now, everyone is experimenting with new Generative AI capabilities, but very few are implementing them in production. In the future, the differentiation between AI leaders and AI laggards will be the efficiency with which these successful experiments can move to production for wide impact.”

Hi Priyank, welcome to our Interview Series. Please tell us a little bit about your role and responsibilities at Cloudera.

Thanks for having me, Sudipto. As VP of Product Management at Cloudera, my role involves leading the strategy and development of our AI products across both public and private cloud, ensuring they align with market needs and enterprise-level requirements today and in the future.

Could you tell us more about the recent updates supporting NVIDIA’s data engineering and AI technologies? 

Cloudera is committed to providing our customers with the best-performing open lakehouse on the market, and to that end it’s important that we partner with industry leaders like NVIDIA. This year NVIDIA has become synonymous with AI, but our partnership goes beyond the obvious AI use cases.

Recently, we introduced NVIDIA RAPIDS-accelerated Spark in our Data Services.

With NVIDIA’s GPUs and libraries like RAPIDS, CDP can accelerate the entire data lifecycle, from the edge to AI, all with minimal changes to existing CPU-based code, in both public and private cloud environments.
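
To make that concrete, below is a minimal, illustrative PySpark setup for the RAPIDS Accelerator for Apache Spark. The plugin and configuration keys shown are the accelerator’s documented ones, but the resource values, file path, and column names are placeholder assumptions that will differ by cluster and CDP release; the point is that the existing DataFrame code itself needs no changes.

    from pyspark.sql import SparkSession

    # Illustrative only: assumes the RAPIDS Accelerator jar and GPU resources
    # are already available on the cluster (setup varies by environment).
    spark = (
        SparkSession.builder
        .appName("gpu-accelerated-etl")
        # Load the RAPIDS Accelerator so eligible SQL/DataFrame operators run on GPUs.
        .config("spark.plugins", "com.nvidia.spark.SQLPlugin")
        .config("spark.rapids.sql.enabled", "true")
        # Standard Spark 3 GPU scheduling: one GPU per executor, shared across tasks.
        .config("spark.executor.resource.gpu.amount", "1")
        .config("spark.task.resource.gpu.amount", "0.25")
        .getOrCreate()
    )

    # The application code is unchanged CPU-style PySpark; the plugin decides
    # which operators to offload to the GPU.
    df = spark.read.parquet("/data/transactions")      # hypothetical path
    df.groupBy("merchant_id").sum("amount").show()     # hypothetical columns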

How do you bring together AI and Zero Trust capabilities for advanced security management with the Cloudera Data Platform (CDP)?

Our work with customers adopting AI in the enterprise has made it clear to us that any trust in AI-driven decisions has to start with the trust in the data used for that AI.

At the center of the Cloudera Data Platform is the Shared Data Experience, or SDX, our data fabric solution that integrates security and governance across the entire platform in a single place, ensuring safe data access with reduced risk. Our purpose-built data services for every stage of the data life cycle, from data ingestion to AI inference, are all integrated with SDX out of the box.

With SDX, organizations can leverage Cloudera’s AI and machine learning capabilities while maintaining full control and visibility over their data, ultimately enabling them to drive insights and value securely and responsibly.

What is the most impactful Enterprise-level Trusted AI framework that could accelerate Gen AI DevOps pipelines in 2024? 

That is a loaded question, but I’ll give you my opinion based on first principles. Any Trusted AI framework that hopes to become the most impactful in 2024 should have a few key attributes.

First and foremost, this framework has to be secure and offer robust privacy features to protect any organization’s most valuable asset: its data. Second, the framework should facilitate ethical AI practices, including transparency, fairness, and accountability, and offer simple governance tools to manage and audit AI applications. We anticipate this to be a key topic in 2024.

Third, the framework must be able to scale efficiently to handle large and complex datasets, including support for seamless distributed computing with GPUs.

Last but not least, the framework has to be user-friendly, including intuitive interfaces, APIs, and documentation, all of which will contribute to community acceptance and support.

We know that no single technology provider can be the end-all for such a broad framework. This is why at Cloudera we are building deep partner integrations with the likes of AWS, Pinecone, NVIDIA, Azure OpenAI, and many more to make this a reality for enterprises.

Please tell us more about the best practices in scaling machine learning with MLOps.

Can you elaborate more about these capabilities from a customer perspective?

The core concepts behind MLOps are automation, efficient and scalable deployment, and performance monitoring. These pillars remain the same starting points, even for the Generative AI applications and models we see at customers. Cloudera has several customers that have deployed LLMs and other Generative AI models into production and are seeing tremendous value out of the gate.

Customers have told us they care about flexibility of where to build, tune, and deploy the models.

Oftentimes, they want to experiment quickly with SaaS services like AWS Bedrock and Azure OpenAI, but when it comes to operationalization they have to consider lower TCO, privacy, security compliance, and control over the models.

Such a pattern of AI adoption fits hand in glove with Cloudera’s hybrid platform – which is built to provide agility in the public cloud and through SaaS model endpoints, but is fully capable of running in a customer-controlled environment on-premise – all without rewriting the applications.
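
As a rough illustration of that portability pattern (not a Cloudera-specific API), the sketch below keeps the application code fixed and moves only configuration: during experimentation it can point at a managed SaaS endpoint, and in production at a privately hosted, OpenAI-compatible model server. The environment variables, URLs, and model name are assumptions; services with their own SDKs, such as Bedrock, would need a thin adapter on top of the same idea.

    import os

    from openai import OpenAI  # works with any OpenAI-compatible endpoint

    # Endpoint and credentials come from configuration, not code, so the same
    # application can target a SaaS service or an in-house inference server.
    client = OpenAI(
        base_url=os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1"),
        api_key=os.environ["LLM_API_KEY"],
    )

    response = client.chat.completions.create(
        model=os.environ.get("LLM_MODEL", "gpt-4o-mini"),  # placeholder model name
        messages=[{"role": "user", "content": "Summarize today's risk exposure report."}],
    )
    print(response.choices[0].message.content)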

In summary, it is not so much the MLOps capabilities themselves that serve as the important differentiator for customers, but the ability to have them in any environment.

What kind of infrastructure does a financial services organization need to streamline the data engineering lifecycle?

How does CDP help with these efforts?

Cloudera is a trusted partner and proven provider to the financial services industry, our largest industry vertical.

Our platform powers 82% of the largest global banks, of which 27 are globally systemic banks, 4 of the top 5 stock exchanges, 8 of the top 10 wealth management firms, all 4 of the top credit card networks, more than 20 central banks, and approximately a dozen financial regulators.

Financial services organizations need a flexible and scalable data infrastructure to handle large volumes of structured and unstructured data.

Cloudera Data Platform (CDP) is an integrated, end-to-end data platform, with experiences and tools that cover every aspect of the data life cycle. Cloudera is a true enterprise data platform that can be used on premises, on private cloud, on public cloud, or in a hybrid environment. CDP facilitates operational efficiency through automation, using ML and MLOps, and provides end-to-end data governance and security with SDX to simplify data protection, sharing, and compliance. CDP can help in several ways:

  • CDP’s open lakehouse provides data services to ingest, process, and analyze data of any type and scale. This streamlines the full data engineering lifecycle from raw data to insights.
  • CDP has shared data experience (SDX) to provide consistent security, governance, and lineage across multi-cloud environments. This makes it easier to operationalize and scale data pipelines.
  • CDP automates infrastructure provisioning and data pipeline orchestration to accelerate development.

Our platform supports a variety of use cases, including risk management, trade analytics, regulatory compliance, fraud detection, customer 360, ESG initiatives around climate risk, and more.
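
As a simple illustration of that raw-data-to-insights flow, here is a generic PySpark sketch rather than a CDP-specific recipe; the paths, schema, and column names are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("counterparty-exposure-rollup").getOrCreate()

    # Ingest: raw trade records landed by an upstream ingestion service.
    raw = spark.read.option("header", "true").csv("/landing/trades/2024-01-01/")

    # Process: standardize types and drop records that cannot be valued.
    clean = (
        raw.withColumn("notional", F.col("notional").cast("double"))
           .filter(F.col("notional").isNotNull())
    )

    # Analyze: aggregate exposure per counterparty -- the kind of curated output
    # that risk reporting, compliance, or ML services would consume downstream.
    exposure = clean.groupBy("counterparty_id").agg(
        F.sum("notional").alias("total_exposure")
    )
    exposure.write.mode("overwrite").parquet("/curated/counterparty_exposure/")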

Your take on the future of next-gen MLOps and data engineering trends in hardened security environments:

Right now, everyone is experimenting with new Generative AI capabilities, but very few are implementing them in production. In the future, the differentiation between AI leaders and AI laggards will be the efficiency with which these successful experiments can move to production for wide impact. We are already seeing this in the early adopters who are far outpacing their established competitors.

Data engineering is moving towards the destruction of silos – with the shift to open lakehouses and community-accepted table formats like Apache Iceberg.

I would bet against proprietary lakehouses. Open lakehouses give customers the flexibility to choose the right tool for data management and the right tool for processing – without having to move and duplicate those data assets.
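
A minimal sketch of what that looks like in practice, assuming an Iceberg-enabled Spark session (the catalog name, warehouse path, and table are illustrative): Spark writes an Apache Iceberg table once, and any engine that understands Iceberg metadata, such as Impala, Flink, or Trino, can query the same files without copying them.

    from pyspark.sql import SparkSession

    # Assumes the Apache Iceberg Spark runtime jar is on the classpath.
    spark = (
        SparkSession.builder
        .appName("iceberg-lakehouse-demo")
        .config("spark.sql.catalog.lakehouse", "org.apache.iceberg.spark.SparkCatalog")
        .config("spark.sql.catalog.lakehouse.type", "hadoop")
        .config("spark.sql.catalog.lakehouse.warehouse", "/warehouse/iceberg")
        .getOrCreate()
    )

    spark.sql("""
        CREATE TABLE IF NOT EXISTS lakehouse.finance.payments (
            payment_id BIGINT,
            amount     DOUBLE,
            ts         TIMESTAMP
        ) USING iceberg
    """)
    spark.sql("INSERT INTO lakehouse.finance.payments VALUES (1, 99.50, current_timestamp())")

    # Another engine pointed at the same catalog reads the identical table --
    # no export, copy, or proprietary format conversion required.
    spark.sql("SELECT count(*) FROM lakehouse.finance.payments").show()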

Accelerated compute for data engineering is another trend that we see becoming more important as organizations further scrutinize the TCO of running these large AI applications.

Lastly, open communities in AI and AIOps will be an important driver of enterprise access to these cutting-edge technologies. The silent power of open source communities in AI will be an equalizer for the next generation of platforms that enterprises adopt in high-security environments.

Lighter notes:

Burn the midnight oil or soak in the sun?

Soak in the sun…

Coffee or tea?

Tea

Your favorite Cloudera product marketing initiative that you want everyone to know about?

It would be when we launched our open lakehouse initiative in 2022 with partners from LinkedIn, Netflix, and co-creators of Apache Iceberg. In 20 years of experience in data management and analytics, the open lakehouse is the most promising attempt I have seen at breaking down the data silos that companies are burdened with (sometimes by choice, but more often not).

Within weeks of our launch, requests and questions were constantly pouring in from customers and users who wanted to understand more and chart the easiest path to adoption.

First memorable experience in your career as a Cloud and data engineering leader?

When we were building Arcadia Data, it took us several quarters to land our first paid customer. Going through that process was eye-opening: it showed us how new technologies get adopted and why it is important to build as little as you can in new products, iterating until you find the minimal surface area with the highest and most differentiated value proposition – i.e., the MVP. Tasting success in that phase of our team’s journey was unforgettable.

Most useful app that you currently use: 

DuckDuckGo, Signal, and ExpressVPN – what can I say, I value privacy 🙂

Thank you, Priyank! That was fun and we hope to see you back on CIO Influence soon.

[To participate in our interview series, please write to us at sghosh@martechseries.com]

Priyank Patel is Vice President of Product Management at Cloudera where he leads the ML and Open Lakehouse Data Services. He was a founder and Chief Product Officer of Arcadia Data, which was acquired by Cloudera in 2019.

At Arcadia, he helped lead the creation of Arcadia’s data-native Business Intelligence product, which was recognized as a leader in the Forrester Wave on scalable BI platforms, and received numerous accolades from industry analysts and customers.

Prior to Arcadia, Priyank was a founding engineer at Aster Data, acquired by Teradata. At Aster he designed core modules of the Aster Database and worked on its SQL-MapReduce and in-database analytical framework.

Cloudera

At Cloudera, we believe data can make what is impossible today, possible tomorrow. We empower people to transform their data into trusted enterprise AI so they can reduce costs and risks, increase productivity, and accelerate business performance.

Our open data lakehouse enables secure data management and portable cloud-native data analytics helping organizations manage and analyze data of all types, on any cloud, public or private. With as much data under management as the hyperscalers, we’re a data partner for the top companies in almost every industry.

Cloudera has guided the world on the value and future of data, and continues to lead a vibrant ecosystem powered by the relentless innovation of the open source community.
