New Spark-native offering makes Senzing the first entity resolution vendor to offer batch, transactional, and hybrid deployment models with full end-to-end agentic automation
Senzing, an identity intelligence company, announced the opening of its Senzing for Apache Spark beta program, bringing the company’s industry-leading entity resolution technology to distributed batch workloads for the first time. Organizations running Spark on AWS EMR, Databricks, or Snowflake can now resolve and relate billions of records across multiple data sources—from fully autonomous data profiling and preparation through to publishing the resolved entity graph to downstream systems.
The launch marks a significant milestone for the entity resolution market. Until now, enterprises faced a binary choice: batch processing systems built on Spark, or real-time transactional systems. Senzing for Spark eliminates that tradeoff. With this release, Senzing becomes the only entity resolution vendor to offer all three deployment modes: Spark batch, transactional SQL, and hybrid.
“Picking an entity resolution vendor has long forced a binary choice: batch Spark or transactional SQL. We’re excited to turn this ‘or’ into an ‘and.’ With Senzing for Spark, customers get all the intelligence found in our real-time SDK—principle-based entity resolution, entity-centric learning, relationship awareness, global name, address and cross-script matching, and explainability—running natively inside their Spark platform of choice.”
— Brian Macy, Head of Operations and Engineering, Senzing
Fully Agentic from Preparation to Publication
Senzing® entity resolution for Spark is designed for agentic AI workflows end-to-end. Powered by the Senzing MCP Server, AI agents execute each stage of the pipeline autonomously:
- Data preparation and mapping: Agents profile, prepare, map, and validate each data source to Senzing-ready dataframes autonomously.
- Distributed entity resolution: With validated dataframes, agents trigger and manage distributed entity resolution jobs across the Spark cluster, executing across all data sources in parallel at any scale.
- Publishing the resolved entity graph: Agents propagate results to any downstream destination—Elasticsearch, knowledge graphs, data lakes—or implant the resolved entity graph directly into an existing live Senzing instance, giving real-time systems an immediate entity intelligence boost.
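To make the three-stage flow above concrete, here is a minimal toy sketch of preparation, resolution, and publication. Every function and field name in it is illustrative only; this is not the Senzing for Spark API, and the matching logic is a naive stand-in (records sharing an email collapse into one entity) rather than Senzing's principle-based entity resolution:

```python
# Toy sketch of the three pipeline stages. All names are hypothetical,
# not the actual Senzing for Spark API.

def prepare(records, mapping):
    """Stage 1: map raw source fields onto a common, resolution-ready schema."""
    return [{target: rec.get(source) for source, target in mapping.items()}
            for rec in records]

def resolve(prepared_sources):
    """Stage 2: toy entity resolution - records from any source that share
    an EMAIL value collapse into one entity. Real principle-based ER
    considers names, addresses, relationships, and much more."""
    entities = {}
    for source in prepared_sources:
        for rec in source:
            entities.setdefault(rec["EMAIL"], []).append(rec)
    return entities

def publish(entities, sink):
    """Stage 3: push the resolved entity graph to a downstream sink
    (here just a dict standing in for Elasticsearch, a graph, etc.)."""
    for entity_id, members in entities.items():
        sink[entity_id] = {"members": members, "size": len(members)}
    return sink

# Two sources with different schemas, mapped to one schema, then resolved.
crm = [{"full_name": "Ann Lee", "mail": "ann@x.com"}]
billing = [{"name": "A. Lee", "email_addr": "ann@x.com"}]

prepared = [
    prepare(crm, {"full_name": "NAME", "mail": "EMAIL"}),
    prepare(billing, {"name": "NAME", "email_addr": "EMAIL"}),
]
graph = publish(resolve(prepared), {})
# Both source records resolve to a single entity keyed by the shared email.
```

In the product described above, an AI agent would drive each stage autonomously via the Senzing MCP Server, with stage 2 running as distributed Spark jobs rather than an in-memory loop.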


