Senzing Announcement
Senzing Agentic Entity Resolution for Apache Spark
Resolve and relate billions of records en masse.
For the first time, Senzing’s world-class entity resolution intelligence – no training/fine-tuning, entity-centric learning, relationship awareness, and global name, address and cross-script matching – runs natively on Apache Spark.
The breakthrough
Same genius. More options.
Senzing has always been the gold standard for transactional entity resolution – superior accuracy with unprecedented simplicity, designed specifically for real-time workloads. With today's announcement, that same accurate, simple entity resolution now runs natively on Spark.
Picking an entity resolution vendor has long forced a binary choice: Batch Spark or Transactional SQL. We're excited to turn this "or" into an "and." With Senzing for Spark, customers get all the intelligence found in our real-time SDK – principle-based entity resolution, entity-centric learning, relationship awareness, global name, address and cross-script matching, and explainability – running natively inside their Spark platform of choice.
โ Brian Macy, Head of Operations and Engineering, Senzing
As the leader in entity resolution, we are excited to be first to market with three flavors of entity resolution: Spark batch, transactional with SQL, and hybrid. And, of course, agentic if you use the Senzing MCP server.
Agentic Entity Resolution · Spark Style
Agents do the work.
You get identity intelligence.
Senzing for Spark is built for agentic workflows end to end – from data source preparation all the way through to publishing the resolved entity graph to downstream systems.
Agentically prepare and map each data source
Using the Senzing MCP server, profiling, preparing and mapping each data source to Senzing-ready dataframes is handled without a human writing a line of code.
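To make this concrete, a Senzing-ready record is a flat JSON document whose attribute names (DATA_SOURCE, RECORD_ID, NAME_FULL, and so on) come from the Senzing entity specification. The sketch below shows the kind of per-row mapping an agent would generate for one hypothetical source – the source column names (cust_id, full_name, mailing_addr, dob) are illustrative, not a real schema:

```python
import json

def to_senzing_record(row: dict) -> str:
    """Map one raw source row to a Senzing-ready JSON record.

    Source column names (cust_id, full_name, ...) are hypothetical;
    the target attribute names (DATA_SOURCE, RECORD_ID, NAME_FULL,
    ADDR_FULL, DATE_OF_BIRTH) follow the Senzing entity specification.
    """
    record = {
        "DATA_SOURCE": "CUSTOMERS",
        "RECORD_ID": str(row["cust_id"]),
        "NAME_FULL": row.get("full_name"),
        "ADDR_FULL": row.get("mailing_addr"),
        "DATE_OF_BIRTH": row.get("dob"),
    }
    # Drop attributes the source row did not populate.
    return json.dumps({k: v for k, v in record.items() if v is not None})
```

In a Spark job, a mapping like this would typically be applied row by row across each source dataframe (for example via a UDF) to produce Senzing-ready dataframes.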
Agentically kick off the Spark jobs
With sources mapped and validated, agents trigger and manage the distributed resolution job across your Spark cluster – multi-source batch ER at any scale, with full entity-centric learning, relationship discovery, and match-key explainability running in parallel.
Agentically publish the resolved entity graph
Once resolved, agents can propagate your entity graph to wherever you want it to land – Elasticsearch, knowledge graphs, data lakes. You can also load the results straight into an existing Senzing real-time instance.
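As a sketch of that publish step, an agent might flatten each resolved entity into per-record documents for a downstream index such as Elasticsearch. Note the input shape here (ENTITY_ID, RECORDS, RELATED) is an assumed example, not the actual Sz Spark output schema:

```python
def entity_to_docs(entity: dict) -> list[dict]:
    """Flatten one resolved entity into per-record documents for indexing.

    The resolved-entity shape (ENTITY_ID, RECORDS, RELATED) is an assumed
    sketch, not the actual Sz Spark output schema.
    """
    related_ids = [r["ENTITY_ID"] for r in entity.get("RELATED", [])]
    return [
        {
            "entity_id": entity["ENTITY_ID"],
            "data_source": rec["DATA_SOURCE"],
            "record_id": rec["RECORD_ID"],
            "match_key": rec.get("MATCH_KEY"),   # explainability: why it matched
            "related_entities": related_ids,     # relationship awareness
        }
        for rec in entity["RECORDS"]
    ]
```

Each document carries the entity id, the match-key explanation, and the related entities, so the downstream system keeps both the resolution result and the reasoning behind it.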
Senzing for Spark · Roadmap
What's coming.
Here’s where we are – and what’s right behind it.
Sz Spark v1.0
Multi-source batch ER
Resolve and relate billions of records across multiple data sources in a single Spark job. Outputs a fully-resolved entity graph. Deploy on AWS EMR, Databricks, Snowflake, or just Apache Spark – or pre-populate a real-time Senzing SQL instance.
Sz Spark v2.0 – Hybrid
Transactional ER coexisting with Spark batches
Splice batch entity resolution results directly into your live Senzing transactional instance – no downtime, no record-by-record ingestion. Fast-track the onboarding of large, new datasets at Spark speed.
Sign up for the beta here.
We’re onboarding select partners. If you have a Spark cluster and want to explore Senzing-quality entity resolution for your financial crime, insurance fraud, national security, or customer 360 project – we want to hear from you.