How Senzing Performs Real-Time Entity Resolution
This process overview describes how Senzing performs entity resolution on inbound records as they are received.
Entity resolution involves a number of steps. If you already understand entity resolution, several of them will be familiar. Senzing also performs some unique processes that can result in higher-quality outcomes than other methods, including:
• Automatically resolving entities without any pre-tuning, training or custom configurations
• Matching records to entities, instead of matching records to records like most other systems
• Discovering and managing relationships between entities as part of the entity resolution process
• Revisiting previous decisions as the system continuously learns in real time
ENTITY RESOLUTION OVERVIEW
The image below shows the steps Senzing takes to resolve inbound records.
These steps are similar when searching for, deleting or updating records.
ENTITY RESOLUTION DETAILS
1. RECEIVE AND PROCESS RECORDS
Source systems provide new data in real time or in batches. Inbound records are stored as they are received, then processed to generate information required downstream.
A. RECEIVE RECORD – each inbound record is added to the record library in its original form. If the inbound record does not contain a unique ID from the source system, Senzing generates one.
B. MAP AND STANDARDIZE FEATURES – record data is grouped into features and mapped to elements within those features. Some elements are standardized or used to generate variants and then added to existing features. All features are stored in the feature library. See below for an example of how it works using the name feature John Smith.
C. COMPUTE NEW FEATURES – additional features are automatically produced based on combinations of features. If needed, new feature combinations can easily be created.
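The receive, standardize and compute steps above can be pictured with a minimal Python sketch. This is not Senzing's API or internal implementation: the function names, the RECORD_ID, NAME, PHONE and DOB keys, the hash-based ID, the tiny nickname table and the NAME_DOB combination are all assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical in-memory stand-in for the record library (not Senzing's storage).
record_library = {}

# Assumed, deliberately tiny nickname table used to generate name variants.
NICKNAMES = {"john": ["jon", "johnny"]}


def receive_record(data_source, record):
    """Store the record in its original form; generate an ID if none is supplied."""
    record_id = record.get("RECORD_ID")
    if record_id is None:
        # Assumption: derive a deterministic ID by hashing the record content.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        record_id = hashlib.sha1(payload).hexdigest()
    record_library[(data_source, record_id)] = dict(record)
    return record_id


def map_and_standardize(record):
    """Group raw fields into features, standardize them and add variants."""
    features = {}
    if record.get("NAME"):
        standardized = " ".join(record["NAME"].lower().split())
        first, _, rest = standardized.partition(" ")
        variants = {standardized}
        for nickname in NICKNAMES.get(first, []):
            variants.add(f"{nickname} {rest}".strip())
        features["NAME"] = sorted(variants)
    if record.get("PHONE"):
        features["PHONE"] = ["".join(ch for ch in record["PHONE"] if ch.isdigit())]
    if record.get("DOB"):
        features["DOB"] = [record["DOB"]]
    return features


def compute_new_features(features):
    """Derive combination features, here name plus date of birth."""
    derived = dict(features)
    if "NAME" in features and "DOB" in features:
        derived["NAME_DOB"] = [
            f"{name}|{dob}" for name in features["NAME"] for dob in features["DOB"]
        ]
    return derived
```

For example, a record with the name "John  Smith" would yield the standardized value "john smith" plus the assumed variants "jon smith" and "johnny smith", each of which is then paired with the date of birth to form a combined feature.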
2. IDENTIFY CANDIDATE ENTITIES TO CONSIDER
Features of the inbound record are used to select entity resolution candidates from the entity-resolved graph database.
A. DETERMINE CANDIDATE FEATURES – features of the inbound record are analyzed to determine which should be used for candidate selection. This process includes ignoring generic values, such as a bank’s toll-free phone number already associated with hundreds of people.
B. SELECT CANDIDATES – the final list of non-generic features is used to retrieve candidate entities.
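Candidate selection as described above can be sketched as a lookup against a feature index that skips generic values. The index layout, the function names and the threshold of three entities are assumptions for illustration only; the source does not state how Senzing decides that a value is generic.

```python
# Assumed cutoff: a value already shared by this many entities is "generic".
GENERIC_THRESHOLD = 3


def determine_candidate_features(features, feature_index, threshold=GENERIC_THRESHOLD):
    """Return (feature_type, value) keys that are specific enough to search on."""
    keys = []
    for feature_type, values in features.items():
        for value in values:
            key = (feature_type, value)
            if len(feature_index.get(key, ())) >= threshold:
                continue  # generic value, e.g. a bank's toll-free number
            keys.append(key)
    return keys


def select_candidates(keys, feature_index):
    """Retrieve every entity that shares at least one non-generic feature."""
    candidates = set()
    for key in keys:
        candidates |= set(feature_index.get(key, ()))
    return candidates


# Hypothetical index: (feature_type, value) -> IDs of entities holding that value.
feature_index = {
    ("PHONE", "8005550100"): {1, 2, 3, 4, 5},  # shared toll-free number
    ("NAME", "john smith"): {1},
}
inbound = {"NAME": ["john smith"], "PHONE": ["8005550100"]}
keys = determine_candidate_features(inbound, feature_index)
candidates = select_candidates(keys, feature_index)
```

In this sketch the toll-free number is ignored because five entities already share it, so only the entity that shares the name is retrieved as a candidate.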
3. RESOLVE ENTITIES AND DETECT RELATIONSHIPS
Features of the inbound record are compared with those of candidate entities, then entity resolution principles are applied to determine whether the inbound record matches, is related to or is unrelated to each candidate. The real-time learning loop reevaluates historical decisions.
A. COMPARE FEATURES – features of the inbound record are compared with candidate entities and scores are assigned. Purpose-built comparators, including world-class name and address comparators, score each feature as same, close, likely, plausible, unlikely or not the same. Users can also add their own comparators.
B. APPLY PRINCIPLES – principles efficiently determine if an inbound record is the same as, possibly the same as or possibly related to each candidate entity. The software comes preconfigured with principles for people and organizations. Additional features and principles can be added. For more information, read Principle-based Entity Resolution Explained.
C. LEARN SOMETHING NEW? – as statistics evolve or new features are added, changed or deleted, earlier entity resolution assertions are reevaluated.
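The comparison and principle steps can be illustrated with a small sketch. It swaps in a generic string-similarity ratio from Python's standard library for Senzing's purpose-built comparators, and the thresholds and the three principles shown are invented for illustration; they are not the principles that ship with the software. The reevaluation of earlier decisions is not modeled here.

```python
from difflib import SequenceMatcher

# Score levels named in the text, ordered from strongest to weakest.
LEVELS = ["SAME", "CLOSE", "LIKELY", "PLAUSIBLE", "UNLIKELY", "NOT_SAME"]

# Assumed similarity cutoffs for each level.
THRESHOLDS = [(1.0, "SAME"), (0.9, "CLOSE"), (0.8, "LIKELY"),
              (0.6, "PLAUSIBLE"), (0.4, "UNLIKELY")]


def compare_values(a, b):
    """Generic stand-in comparator: bucket a similarity ratio into a level."""
    ratio = SequenceMatcher(None, a, b).ratio()
    for minimum, level in THRESHOLDS:
        if ratio >= minimum:
            return level
    return "NOT_SAME"


def compare_features(inbound, candidate):
    """Score each shared feature type by its best-matching pair of values."""
    scores = {}
    for feature_type in inbound.keys() & candidate.keys():
        levels = [compare_values(a, b)
                  for a in inbound[feature_type] for b in candidate[feature_type]]
        if levels:
            scores[feature_type] = min(levels, key=LEVELS.index)
    return scores


# Hypothetical principles, evaluated in order; the first that holds wins.
PRINCIPLES = [
    ("SAME", lambda s: s.get("NAME") in ("SAME", "CLOSE") and s.get("DOB") == "SAME"),
    ("POSSIBLY_SAME", lambda s: s.get("NAME") in ("SAME", "CLOSE", "LIKELY")
                                and s.get("PHONE") == "SAME"),
    ("POSSIBLY_RELATED", lambda s: s.get("PHONE") == "SAME"),
]


def apply_principles(scores):
    """Return the outcome of the first principle satisfied, else UNRELATED."""
    for outcome, rule in PRINCIPLES:
        if rule(scores):
            return outcome
    return "UNRELATED"
```

Under these assumed rules, an inbound record whose name and date of birth both score as same against a candidate resolves to that entity, while a record that shares only a phone number with a candidate is treated as possibly related.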
4. UPDATE DATABASE AND RESPOND WITH OUTCOMES
The entity-resolved graph is updated and outcomes are reported as required.
A. UPDATE DATABASE – new entity and relationship information is added to the entity-resolved graph.
B. RESPONSE REQUESTED? – if a response is requested, information is provided about how the inbound record was handled and any changes to the entity-resolved graph.
C. PROCESS COMPLETE
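The update-and-respond step can be sketched with a toy in-memory graph. The EntityGraph class, its update method, the respond flag and the shape of the returned dictionary are hypothetical and do not reflect Senzing's database schema or response format. For brevity, the sketch attaches a record to the lowest-numbered matching entity and does not merge entities.

```python
import itertools


class EntityGraph:
    """Toy entity-resolved graph: entities hold records, edges hold relationships."""

    def __init__(self):
        self.entities = {}          # entity_id -> set of record keys
        self.relationships = set()  # (lower_id, higher_id, relationship_type)
        self._next_id = itertools.count(1)

    def update(self, record_key, outcomes, respond=False):
        """Apply outcomes ({candidate_entity_id: outcome}) for one inbound record."""
        matched = sorted(e for e, outcome in outcomes.items() if outcome == "SAME")
        created = not matched
        if matched:
            entity_id = matched[0]
        else:
            entity_id = next(self._next_id)
            self.entities[entity_id] = set()
        self.entities[entity_id].add(record_key)

        # Record a relationship for every non-matching but related candidate.
        new_edges = []
        for other, outcome in outcomes.items():
            if other == entity_id or outcome in ("SAME", "UNRELATED"):
                continue
            edge = (min(entity_id, other), max(entity_id, other), outcome)
            if edge not in self.relationships:
                self.relationships.add(edge)
                new_edges.append(edge)

        if not respond:
            return None
        return {"record": record_key, "entity_id": entity_id,
                "created_entity": created, "new_relationships": new_edges}


graph = EntityGraph()
first = graph.update(("CRM", "1"), {}, respond=True)
second = graph.update(("CRM", "2"), {1: "SAME"}, respond=True)
third = graph.update(("CRM", "3"), {1: "POSSIBLY_RELATED"}, respond=True)
silent = graph.update(("CRM", "4"), {})
```

Here the first record creates an entity, the second joins it, the third creates a new entity related to the first, and the fourth is processed without a response because none was requested.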
For more information about Senzing software, visit senzing.com or read Uniquely Senzing.