CIO Influence

StreamSets Expands Databricks Partnership with New Connector for Databricks Delta Lake Integration

Databricks Partner Integration Gallery Now Features StreamSets Cloud Integration Connector to Enable Users to Easily Ingest, Integrate and Monitor Data in Delta Lake

StreamSets, provider of the industry’s first DataOps platform, announced an expansion of its partnership with Databricks by participating in Databricks’ newly launched Data Ingestion Network. As part of the expanded partnership, StreamSets is adding a new connector for Databricks Delta Lake to its platform. With it, users can configure their pipelines to write data from any source, in batch or streaming mode, directly into Delta Lake. Data teams can now deliver more data in a shorter time frame, driving BI, analytics and, ultimately, digital transformation.

Today, companies need systems that support diverse data applications such as real-time monitoring, machine learning and data science, and that can process unstructured data such as text, images, video and audio. A decade ago, data lakes replaced data warehouses as the preferred repositories for this raw data; however, they neither support transactions nor enforce data quality. They also lack consistency guarantees, which makes it almost impossible to mix batch and streaming jobs, or appends and reads, on the same data.

Data lakehouses combine the best of data warehouses and data lakes and remedy their limitations, but friction in ingesting fresh data remains. With this partnership, Databricks users can capitalize on the data lakehouse paradigm without that friction. They can connect to the StreamSets platform and use out-of-the-box connectors to load batch, change data capture (CDC) or streaming data from any source (such as relational databases, on-premises data lakes and warehouses, and cloud applications) into Databricks Delta Lake. With StreamSets, data engineers can build and operate data pipelines for modern and legacy data sources, migrate to a data lakehouse platform and continuously refresh it with relevant data.

Specifically, the new StreamSets connector for Databricks Delta Lake offers several benefits that give greater operational control over the full life cycle of data:

  • Faster migration to the cloud with fewer data engineering resources
  • Drag-and-drop interface to simplify data movement from multiple disparate sources
  • Improved management of operations and performance for cloud data lakes with Delta Lake
  • Change-data-capture capability from several data sources into Delta Lake
  • Built-in Kubernetes containerization and native cloud scaling

Combined with Delta Lake, the connector also makes it possible to unify batch and streaming data while preserving ACID compliance, so that transactional operations work with timely data.
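
To make the change-data-capture idea concrete, the following is a minimal Python sketch of applying a batch of change records to a table in an all-or-nothing fashion. It is a conceptual illustration only and does not use the StreamSets connector or any Delta Lake API; the function name `apply_cdc_batch`, the record fields (`op`, `key`, `row`) and the in-memory dictionary standing in for a table are all assumptions made for this example.

```python
from copy import deepcopy


def apply_cdc_batch(table, changes):
    """Apply a batch of change records to `table` all-or-nothing.

    `table` maps a primary key to a row dict. Each change is a dict with
    "op" ("insert", "update" or "delete"), "key", and, for non-deletes, "row".
    Returns a new table; the input table is never modified.
    """
    staged = deepcopy(table)  # work on a copy so readers never see a partial batch
    for change in changes:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            staged[key] = change["row"]  # upsert: the latest change wins
        elif op == "delete":
            staged.pop(key, None)  # deleting a missing key is a no-op
        else:
            # Abort the whole batch; the original table stays untouched.
            raise ValueError(f"unknown op: {op!r}")
    return staged  # the caller swaps this in as the new table version


table = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
changes = [
    {"op": "update", "key": 1, "row": {"name": "Ada L."}},
    {"op": "delete", "key": 2},
    {"op": "insert", "key": 3, "row": {"name": "Cy"}},
]
new_table = apply_cdc_batch(table, changes)
```

Because the batch is staged on a copy and only returned once every record has been applied, a failure partway through leaves the previous version of the table intact, which is the property the ACID guarantee described above is meant to provide at scale.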

“Along with Apache Spark, the use of Databricks’ Delta Lake is rapidly expanding in the market,” said Pankaj Dugar, vice president of business development at Databricks. “With StreamSets’ extended support for Delta Lake, small and midsize companies now have an easy way to ingest data from their cloud-based service into Databricks’ Delta Lake so they can maximize their analytics efforts with fresh data in their data lakehouse.”

“This connector is another step forward in our alliance with Databricks to deliver more data, faster, to drive analytics — which is critical to the survival and success of today’s organizations,” said Jobi George, general manager of Cloud Business at StreamSets. “We’re excited to continue our work with Databricks to drive innovation in the industry.”
