Striim, Inc. announced Striim for Databricks, the first streaming SaaS solution to deliver database change streams, captured with change data capture (CDC) technologies from enterprise-grade databases such as Oracle, SQL Server, PostgreSQL, MySQL, and other sources, to the Databricks Lakehouse. Customers can quickly build a new data pipeline that streams transactional data from hundreds or thousands of tables to Databricks with sub-second end-to-end latency, so they can run streaming analytics, refresh their AI/ML models in real time, and address time-sensitive operational issues.
“Enterprises are increasingly seeking solutions that help bring critical data stored in databases into the Databricks Lakehouse Platform with speed and reliability,” said Roger Murff, VP of Technology Partners at Databricks. “With this integration, customers can quickly and easily integrate their data into Databricks and begin analyzing and driving business value with data throughout their organizations.”
Organizations replicate data from multiple databases to cloud data warehouses, data lakes, and data lakehouses so that their data science and analytics teams can optimize decision-making and business workflows. Legacy data warehouses are neither easily scalable nor performant enough to deliver real-time analysis, while cloud-based data ingestion platforms can require significant effort to set up.
Striim for Databricks builds on Striim’s award-winning data integration and streaming capabilities to simplify building and operating data pipelines and to enable real-time streaming workloads on the lakehouse. Using the newly designed user interface, customers can configure their data pipelines, observe their ongoing and historical health and performance, reconfigure them to add or remove tables on the fly, and easily repair them in case of failures.
“A unified approach that combines the best of data warehouses and data lakes for modern workloads to make real-time decisions relies on fresh data being delivered in an open format,” said Alok Pareek, co-founder and Executive Vice President of Engineering and Products at Striim. “Our customers increasingly need operational data in Delta tables for their data analytics needs. We have designed Striim for Databricks to support Delta tables and Databricks Unity Catalog for operational ease, data sharing, flexibility, and resiliency so that our customers can use Spark Dataframes or SQL to easily extract business value from their data. We have automated schema management, snapshot, CDC coordination, and failure handling in the data pipelines to deliver a delightful user experience.”
Striim for Databricks provides a high level of automation. Customers can set up their data pipelines with a few clicks, and Striim takes care of the rest. Striim uses patented technologies and Databricks best practices to parallelize writes to Databricks, maximizing pipeline throughput and reducing end-to-end latency. Striim continuously monitors and reports pipeline health and performance. Striim for Databricks natively stores and reports health and performance data so customers can quickly analyze and optimize pipeline performance based on real-time, near-term, and historical data.
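For readers unfamiliar with the pattern, the snapshot-plus-CDC flow mentioned in the announcement can be pictured with a minimal Python sketch. It is a generic illustration and not Striim's implementation or any Databricks API: the `ChangeEvent` type, the `apply_changes` function, and the sample rows are all hypothetical, and the target table is assumed to be a plain in-memory dictionary keyed by primary key.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record: one row-level operation from a source database."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image (None for deletes)


def apply_changes(target: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Apply change events, in order, to a keyed copy of the table."""
    for event in events:
        if event.op in ("insert", "update"):
            # Upsert: the latest row image replaces whatever was there.
            target[event.key] = dict(event.row or {})
        elif event.op == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
    return target


# Initial snapshot of the source table (assumed sample data).
snapshot = {1: {"status": "new"}}

# Changes captured after the snapshot was taken (assumed sample data).
events = [
    ChangeEvent("insert", 2, {"status": "new"}),
    ChangeEvent("update", 1, {"status": "shipped"}),
    ChangeEvent("insert", 3, {"status": "new"}),
    ChangeEvent("delete", 2),
]

# Work on a copy so the original snapshot is left untouched.
replica = apply_changes(dict(snapshot), events)
```

Because every event is applied as an upsert or an idempotent delete, replaying the same ordered batch after a failure leaves the replica in the same state, which is one common reason pipelines of this kind are built around keyed merges.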