The Senzing MCP Server

A Senzing Expert On Demand

What It Does

It Puts a Senzing Expert Inside Your AI Assistant

Entity resolution implementations have steep learning curves — data mapping specs, flag combinations, SDK patterns, error codes, working code examples scattered across dozens of repositories. Getting from raw data to resolved entities takes time.

The Senzing MCP Server connects your AI assistant to the full depth of Senzing technical knowledge so you can build faster, debug faster, and go further with less friction.

Ask it anything. Put it to work. Get authoritative, Senzing-specific answers in seconds instead of hours.

What It Is

Not a Wrapper. An Expert.

It is not a wrapper around the Senzing engine. It doesn’t call a live Senzing instance, and no source data ever flows to Senzing. Your data stays between you and your AI client.

What the server provides is expertise: the knowledge an LLM needs to guide you through a real Senzing implementation, from first data mapping to production deployment.

Connect it once. Ask it anything.

What It Covers

A Radically Different Approach to Senzing Implementation

Data Mapping

An interactive workflow for profiling your source data, planning the mapping, generating field-level code, and validating output quality — for even the most complex data sources.

SDK Development

Working scaffold code for common workflows: record ingestion, entity search, redo processing, and more. Backed by searchable access to GitHub repositories of working Senzing code examples.

Troubleshooting

Complete coverage of Senzing error codes, causes, resolution steps, and context — so you can diagnose and fix issues without hunting through documentation.

Documentation

Senzing’s complete documentation library in one searchable place: the Entity Specification, SDK guides, Docker/AWS/Azure quickstarts, database tuning, pricing, and more.

Architecture & Deployment

How the embedded SDK library runs in-process and air-gapped with no source data leaving your environment, plus install steps, database setup, and scale-out clustering for high-volume repositories.

Reporting & Visualization

Patterns for exporting resolved entities, building aggregate reports and an analytical data mart, designing dashboards and network graphs, and measuring resolution quality with precision and recall.

Use Cases & Fit

How entity resolution applies to fraud, KYC, sanctions and trade-enforcement screening, and master data management — including multilingual and cross-script matching.

Evaluation & ROI

The business case for entity resolution and the DSR pricing model, plus real, ready-to-load reference datasets (CORD) and a proof-of-concept methodology for validating results before you commit.

Why It’s Different

Designed to Teach, Not Just to Wrap

Most MCP servers are wrappers: the LLM picks a tool, fills in slots, and the wrapper calls an API on its behalf. That architecture has a ceiling. The wrapper can only expose what the wrapper author built, and it has no opinion about what you’re actually trying to accomplish.

The Senzing MCP Server is designed differently. It teaches your AI assistant how Senzing actually works: the principles, the patterns, the failure modes, the right architecture for your situation.

The LLM applies that expertise to your specific context — your data, your stack, your constraints. The output is code and guidance fitted to you, not generic tool-call results.

Think of a Forward Deployed Engineer embedded in your development environment — someone who knows every Senzing API, has seen every failure mode, and can write production-ready code for your specific data and architecture.

That’s how a Forward Deployed Engineer works. That’s what you get.

How to Get Started

Works With Any MCP-Compatible AI Client.
No Senzing Installation Required.

1. Open your AI client’s MCP settings

Follow your AI client’s instructions for adding an MCP server. Setup takes minutes.

2. Add the Senzing MCP Server URL

Use this URL when prompted:

https://mcp.senzing.com/mcp

Pro tip: Prefix your questions with “Use the Senzing MCP Server to…” to ensure your AI draws on MCP tools rather than general training data. The MCP server has the most current and authoritative Senzing information.

Ask It Anything

Explore. Learn. Build.

Explore how Senzing works

How does Senzing entity resolution work?
Senzing uses Entity-Centric Learning — each inbound record is compared against existing resolved entities (not against other individual records), then principle-based rules determine whether it is the same entity, possibly the same, or possibly related. Features like names, addresses, and phone numbers are mapped, standardized, and scored using purpose-built comparators. When the system encounters a new alias or variation it has not seen before, it re-evaluates earlier decisions automatically. No pre-training, tuning, or custom rule configuration required.
What databases does Senzing support?
SQLite (development and testing), PostgreSQL, Aurora PostgreSQL, Microsoft SQL Server, Azure SQL, MySQL, and Oracle. Each has a dedicated setup guide covering schema DDL, connection string format, and Senzing-specific tuning recommendations. For production workloads exceeding one billion records, Senzing supports a HYBRID multi-node clustering backend that distributes the entity repository across CORE, RES, and LIBFEAT database nodes.
What does error SENZ0005 mean?
SENZ0005 is EAS_ERR_EXCEEDED_MAX_RETRIES — the engine exceeded the maximum number of retries allowed for an operation, typically caused by high concurrency or transient database contention. To diagnose: enable verbose logging to capture additional context, then cross-reference the Senzing error code documentation. If the error persists, Senzing support resources and community forums maintain known patterns for this error class.
What flags can I use on a get entity call?
Start with the two composite flags: SZ_ENTITY_DEFAULT_FLAGS (includes all relation types, representative features, entity name, record summary, record data, and matching info) or the lighter SZ_ENTITY_CORE_FLAGS for a response without relations. Layer individual flags as needed: SZ_ENTITY_INCLUDE_ALL_FEATURES or SZ_ENTITY_INCLUDE_REPRESENTATIVE_FEATURES for feature detail; SZ_ENTITY_INCLUDE_POSSIBLY_SAME_RELATIONS, SZ_ENTITY_INCLUDE_POSSIBLY_RELATED_RELATIONS, or SZ_ENTITY_INCLUDE_DISCLOSED_RELATIONS for relationship types; SZ_ENTITY_INCLUDE_RECORD_DATES for FIRST_SEEN_DT and LAST_SEEN_DT timestamps; SZ_INCLUDE_MATCH_KEY_DETAILS for the per-feature evidence breakdown behind each match.
What’s the ROI case for fixing mismatched identity data?
A 2026 analysis by Senzing founder Jeff Jonas estimates the annual cost of mismatched identity data across the US economy at $0.7 to $0.8 trillion — spanning data quality, marketing, government, financial services, and healthcare. A 2025 Forrester Total Economic Impact study found 226% ROI and $19.7M in benefits for a composite enterprise customer over three years. The investment case for entity resolution is rarely close.
How does Senzing run air-gapped, and what data leaves my environment?
Nothing leaves your environment. The Senzing SDK is an embedded library that runs in-process inside your application. After installation, no internet connectivity is required — Senzing does not phone home or connect to any Senzing servers. The only network traffic is between your application and your own database. For fully air-gapped environments, Senzing provides a dedicated quickstart guide for transferring packages and Docker images to disconnected systems with no Senzing server contact whatsoever.
Is Senzing a fit for sanctions screening, KYC, or fraud detection?
Yes — these are core Senzing use cases. For fraud detection, entity-centric learning catches bad actors who deliberately vary names, addresses, and phone numbers across records, which record-to-record matching cannot do. USCIS deployed Senzing to detect immigration fraud across complex multi-source systems and realized significant cost reductions. For sanctions and KYC, OFAC/SDN screening is fundamentally entity resolution — fuzzy name matching, alias resolution, and beneficial ownership tracing through layered corporate structures.

Put it to work

Use the Senzing mapping workflow to map this file to Senzing JSON
When you share your actual file, the MCP server launches an 8-step guided mapping workflow: it profiles your source data, helps you plan the entity structure (identifying master entities, child records, and disclosed relationships), maps your fields to the correct Senzing feature attributes, then generates validated sample JSON output plus ready-to-run Python mapper code. The workflow prevents the most common mapping errors — wrong attribute names, missing RECORD_ID fields, incorrect feature structures — that hand-coded mappings routinely produce.
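To make the field-mapping step concrete, here is a minimal sketch of what a generated mapper might look like for one source row. It is an illustration under stated assumptions, not output from the workflow: the source columns (`cust_id`, `full_name`, `address`, `phone`), the `CUSTOMERS` data source code, and the `map_row` helper are all hypothetical, and the attribute names used (`DATA_SOURCE`, `RECORD_ID`, `NAME_FULL`, `ADDR_FULL`, `PHONE_NUMBER`) should be checked against the Entity Specification for your data.

```python
import json

# Hypothetical source-column -> Senzing attribute mapping for a customer file.
FIELD_MAP = {
    "full_name": "NAME_FULL",
    "address": "ADDR_FULL",
    "phone": "PHONE_NUMBER",
}


def map_row(row: dict, data_source: str = "CUSTOMERS") -> dict:
    """Convert one source row into a flat Senzing-style record (illustrative only)."""
    record_id = str(row.get("cust_id") or "").strip()
    if not record_id:
        # A record without a stable identifier is one of the common mapping errors.
        raise ValueError("row has no cust_id to use as RECORD_ID")

    record = {"DATA_SOURCE": data_source, "RECORD_ID": record_id}
    for column, attribute in FIELD_MAP.items():
        value = str(row.get(column) or "").strip()
        if value:  # omit empty values rather than emitting blank attributes
            record[attribute] = value
    return record


sample_row = {
    "cust_id": 1001,
    "full_name": " Robert Smith ",
    "address": "123 Main St, Las Vegas, NV 89101",
    "phone": "",
}
sample_record = map_row(sample_row)
print(json.dumps(sample_record))
```

The sketch trims whitespace, drops empty fields, and refuses rows without an identifier; a real mapping would also need to handle child records and disclosed relationships, which the guided workflow covers.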
Use the Senzing scaffold tool to generate Python code to load records
When you ask this, the MCP server returns working Python scaffold code sourced directly from Senzing official GitHub repositories, with provenance URLs so you can inspect the original source. The scaffold covers the full pipeline: initialize the SDK, register your data source, load records with production-ready threading, and drain the redo queue. Code is generated for the current Senzing v4 SDK, avoiding the method name and API pattern errors that training-data-based code generation commonly produces.
Walk me through the right architecture for a fraud screening pipeline
Ask the MCP server and it returns platform-specific architecture guidance. The core principle: embed the Senzing SDK in-process in your application (not as a separate service) and feed records via a message queue — SQS Standard queue on AWS, not FIFO. Run concurrent loader threads with add_record() for real-time in-transaction resolution, and run a dedicated redo processing worker alongside — skipping redo processing produces incomplete resolution results. The MCP server covers AWS, Azure, and GCP topology, database selection, container sizing, and horizontal scaling patterns.
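The loader-plus-redo-worker shape described above can be sketched in plain Python. This is a structural illustration only: `StubEngine` is a hypothetical in-memory stand-in, not the Senzing SDK, the in-process `queue.Queue` stands in for a message queue such as SQS, and the `get_redo_record` and `process_redo_record` method names are assumptions to verify against the v4 SDK documentation.

```python
import queue
import threading
import time


class StubEngine:
    """Hypothetical stand-in for an embedded engine; NOT the Senzing SDK."""

    def __init__(self):
        self._lock = threading.Lock()
        self._redo = queue.Queue()
        self.loaded = []
        self.redo_done = []

    def add_record(self, data_source, record_id, record_json):
        with self._lock:
            self.loaded.append(record_id)
            # Pretend every third record leaves deferred work behind.
            if len(self.loaded) % 3 == 0:
                self._redo.put(record_id)

    def get_redo_record(self):
        try:
            return self._redo.get_nowait()
        except queue.Empty:
            return ""

    def process_redo_record(self, redo_record):
        with self._lock:
            self.redo_done.append(redo_record)


def loader(engine, inbound):
    """Consume messages until a None sentinel arrives."""
    while True:
        message = inbound.get()
        if message is None:
            break
        engine.add_record(*message)


def redo_worker(engine, stop):
    """Drain redo work alongside the loaders; exit only once stopped AND empty."""
    while True:
        stopping = stop.is_set()  # read before polling to avoid missing late work
        redo_record = engine.get_redo_record()
        if redo_record:
            engine.process_redo_record(redo_record)
        elif stopping:
            break
        else:
            time.sleep(0.01)


def run_pipeline(records, loader_threads=4):
    engine = StubEngine()
    inbound = queue.Queue()
    stop = threading.Event()

    redo_thread = threading.Thread(target=redo_worker, args=(engine, stop))
    redo_thread.start()
    loaders = [
        threading.Thread(target=loader, args=(engine, inbound))
        for _ in range(loader_threads)
    ]
    for thread in loaders:
        thread.start()

    for record in records:
        inbound.put(record)
    for _ in loaders:
        inbound.put(None)  # one sentinel per loader thread
    for thread in loaders:
        thread.join()

    stop.set()  # loaders are done, so no new redo work can appear
    redo_thread.join()
    return engine


engine = run_pipeline([("CUSTOMERS", f"R{i}", "{}") for i in range(9)])
print(len(engine.loaded), len(engine.redo_done))
```

The point of the design is that the redo worker runs for the whole lifetime of the load and is shut down only after the loaders finish and the redo queue is empty, so no deferred work is skipped.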
How do I deploy Senzing on Azure with Azure SQL, and scale past a billion records?
For Azure: Ubuntu 24.04 LTS VM (Standard_D8ds_v6 or larger for production), Azure SQL Database Hyperscale accessed via private endpoint (required for performance), TLS required in the connection string (encrypt=yes). Apply the Senzing schema DDL before SDK initialization. For billion-record scale: use the Senzing® HYBRID database backend to distribute the entity repository across a 3-node cluster — CORE, RES, and LIBFEAT nodes on separate Azure SQL instances. Verisk ran 1.6 billion records and 420 million resolved entities with sub-second search response times.
What SQL gives me cross-source overlap and duplication rates?
Two queries from the Senzing data mart: for cross-source overlap, join sz_dm_record to itself on entity_id where data_source differs — this counts entities that appear in both sources. For within-source deduplication (compression ratio), group by data_source and divide record_count by entity_count — a ratio above 1.0 means records are resolving together. The full Senzing reporting guide also provides SQL for entity size distribution, match key frequency analysis, and review queues for possible-match and ambiguous-match cases.
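The two calculations can be tried out on a toy table. The sketch below uses an in-memory SQLite database with an assumed minimal schema (`data_source`, `record_id`, `entity_id`) for `sz_dm_record`; the real data mart schema and the official reporting SQL may differ, so treat the column names and queries as illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the actual data mart table may carry more columns.
conn.execute(
    "CREATE TABLE sz_dm_record (data_source TEXT, record_id TEXT, entity_id INTEGER)"
)
conn.executemany(
    "INSERT INTO sz_dm_record VALUES (?, ?, ?)",
    [
        ("CRM", "1", 100),
        ("CRM", "2", 100),       # resolves with CRM record 1
        ("CRM", "3", 101),
        ("WATCHLIST", "1", 101), # same entity as CRM record 3
        ("WATCHLIST", "2", 102),
    ],
)

# Cross-source overlap: entities that have records in both sources of a pair.
overlap = conn.execute(
    """
    SELECT a.data_source, b.data_source, COUNT(DISTINCT a.entity_id)
    FROM sz_dm_record a
    JOIN sz_dm_record b
      ON a.entity_id = b.entity_id
     AND a.data_source < b.data_source
    GROUP BY a.data_source, b.data_source
    """
).fetchall()

# Within-source compression ratio: records per resolved entity.
compression = conn.execute(
    """
    SELECT data_source,
           COUNT(*) AS record_count,
           COUNT(DISTINCT entity_id) AS entity_count,
           1.0 * COUNT(*) / COUNT(DISTINCT entity_id) AS ratio
    FROM sz_dm_record
    GROUP BY data_source
    ORDER BY data_source
    """
).fetchall()

print(overlap)
print(compression)
```

In this toy data, CRM has three records resolving to two entities (ratio 1.5), WATCHLIST has no internal duplicates (ratio 1.0), and one entity is shared between the two sources.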
Give me sample records from the Las Vegas dataset to test with
The Las Vegas CORD dataset (Collections Of Relatable Data) contains 367,671 real historical records from 11 public-record sources covering the Las Vegas area, including healthcare providers, business registrations, and other public identifiers. The data is available for evaluation use only. Ask the MCP server and it returns a preview in source format plus a download URL for the full dataset. Additional CORD datasets include London (5 sources, international data) and Moscow (6 sources, Cyrillic script) for globalization and non-Roman script testing.

From Raw Data to Resolved Entities — Faster

What used to take days now takes seconds. The Senzing MCP Server compresses the learning curve for Senzing implementations, whether you’re a developer starting your first integration or an experienced team onboarding a new data source.