Consuming "get_entity" Responses
The “get entity” methods retrieve information describing the composition and relationships of an entity. The response from these methods, like that of most Senzing SDK methods, is a JSON document. This document outlines the structure of the “get entity” response, how to influence its contents, and considerations for consuming the resulting information in your own applications and services that leverage Senzing’s entity resolution.
Get Entity Methods
The names of the “get entity” methods are specific to each SDK language. Outside of this table, the method names and examples used in this document are for the Python SDK.
Language | By Entity ID | By Record ID |
---|---|---|
C# | GetEntity | GetEntity |
Java | getEntity | getEntity |
Python | get_entity_by_entity_id | get_entity_by_record_id |
Get an Entity by Entity ID
Retrieves information about an entity where the entity is specified using an entity ID.
...
result = sz_engine.get_entity_by_entity_id(787)
...
Get an Entity by Record ID
Retrieves information about an entity where the entity is specified using the data source code and record ID of a previously added record.
...
result = sz_engine.get_entity_by_record_id("WATCHLIST", "102934")
...
JSON Document Example
The following is an example JSON document from calling get_entity_by_entity_id for entity ID 1. This example uses the default output specification, which can be modified as covered in Using Flags to Specify the Output Document Contents.
{
"RESOLVED_ENTITY": {
"ENTITY_ID": 1,
"ENTITY_NAME": "Robert Smith",
"FEATURES": {
"ADDRESS": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 29,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 29
},
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 93
}
]
},
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 72,
"USAGE_TYPE": "MAILING",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 72
}
]
}
],
"DOB": [
{
"FEAT_DESC": "11/12/1979",
"LIB_FEAT_ID": 92,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "11/12/1979",
"LIB_FEAT_ID": 92
}
]
},
{
"FEAT_DESC": "12/11/1978",
"LIB_FEAT_ID": 2,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "12/11/1978",
"LIB_FEAT_ID": 2
},
{
"FEAT_DESC": "11/12/1978",
"LIB_FEAT_ID": 28
}
]
}
],
"EMAIL": [
{
"FEAT_DESC": "bsmith@work.com",
"LIB_FEAT_ID": 3,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "bsmith@work.com",
"LIB_FEAT_ID": 3
}
]
}
],
"NAME": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 91,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 91
}
]
},
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 71,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 71
},
{
"FEAT_DESC": "Bob J Smith",
"LIB_FEAT_ID": 1
},
{
"FEAT_DESC": "Bob Smith",
"LIB_FEAT_ID": 27
}
]
}
],
"PHONE": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 30,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 30
}
]
},
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 30,
"USAGE_TYPE": "MOBILE",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 30
}
]
}
],
"RECORD_TYPE": [
{
"FEAT_DESC": "PERSON",
"LIB_FEAT_ID": 7,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "PERSON",
"LIB_FEAT_ID": 7
}
]
}
]
},
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 4
}
],
"RECORDS": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1002",
"INTERNAL_ID": 3,
"MATCH_KEY": "",
"MATCH_LEVEL_CODE": "",
"ERRULE_CODE": ""
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1003",
"INTERNAL_ID": 1,
"MATCH_KEY": "+NAME+DOB+EMAIL",
"MATCH_LEVEL_CODE": "RESOLVED",
"ERRULE_CODE": "SF1_PNAME_CSTAB"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1004",
"INTERNAL_ID": 8,
"MATCH_KEY": "+NAME+DOB+ADDRESS",
"MATCH_LEVEL_CODE": "RESOLVED",
"ERRULE_CODE": "CNAME_CFF_CEXCL"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1001",
"INTERNAL_ID": 9,
"MATCH_KEY": "+NAME+DOB+PHONE+EMAIL",
"MATCH_LEVEL_CODE": "RESOLVED",
"ERRULE_CODE": "SF1_SNAME_CFF_CSTAB"
}
]
},
"RELATED_ENTITIES": [
{
"ENTITY_ID": 6,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME+ADDRESS-DOB",
"ERRULE_CODE": "CNAME_CFF_DEXCL",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Robert E Smith Sr",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 1
},
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
]
},
{
"ENTITY_ID": 144,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME",
"ERRULE_CODE": "SNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Robert Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
]
},
{
"ENTITY_ID": 147,
"MATCH_LEVEL_CODE": "POSSIBLY_RELATED",
"MATCH_KEY": "+SURNAME+ADDRESS",
"ERRULE_CODE": "CFF_SURNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Patricia Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
]
}
]
}
JSON Document Structure
The JSON document consists of a RESOLVED_ENTITY object and a RELATED_ENTITIES array.
{
"RESOLVED_ENTITY": {
"ENTITY_ID": 1,
"ENTITY_NAME": "Robert Smith",
"FEATURES": {...
},
"RECORD_SUMMARY": [...
],
"RECORDS": [...
]
},
"RELATED_ENTITIES": [...
]
}
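As a minimal sketch of consuming this top-level structure, the response string can be parsed with Python's standard json module. The `response` string below is a trimmed, hand-written stand-in for what a get entity call returns, and `parse_entity_response` is a hypothetical helper name, not part of the SDK.

```python
import json

# Trimmed stand-in for a get entity response (assumption: in a real
# application this string comes from the SDK call).
response = """
{
  "RESOLVED_ENTITY": {
    "ENTITY_ID": 1,
    "ENTITY_NAME": "Robert Smith",
    "FEATURES": {},
    "RECORD_SUMMARY": [],
    "RECORDS": []
  },
  "RELATED_ENTITIES": [{"ENTITY_ID": 6}, {"ENTITY_ID": 144}]
}
"""


def parse_entity_response(response_json):
    """Split a response into the resolved entity object and the related entities list."""
    document = json.loads(response_json)
    resolved = document.get("RESOLVED_ENTITY", {})
    related = document.get("RELATED_ENTITIES", [])
    return resolved, related


resolved, related = parse_entity_response(response)
print(resolved["ENTITY_ID"], resolved["ENTITY_NAME"])
print([entity["ENTITY_ID"] for entity in related])
```

Using `.get` with a default keeps the helper usable when a section of the document is not present in a given response.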
RESOLVED_ENTITY
This object provides details about the requested entity.
- ENTITY_ID - The current unique identifier for this entity
- ENTITY_NAME - A name this entity goes by; an entity could have multiple names
In addition to this overview information, the following object and arrays provide specific details for the requested entity:
- A FEATURES object
- A RECORD_SUMMARY array
- A RECORDS array
FEATURES
This object contains details about the features that constitute the entity. The example entity has address, date of birth, email, name, phone, and record type features. Each of the FEATURES keys is an array of one or more objects describing the corresponding features.
...
"FEATURES": {
"ADDRESS": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 29,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 29
},
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 93
}
]
},
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 72,
"USAGE_TYPE": "MAILING",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 72
}
]
}
],
"DOB": [...
],
"EMAIL": [...
],
"NAME": [...
],
"PHONE": [...
],
"RECORD_TYPE": [...
]
}
...
- FEAT_DESC - Describes the attributes that make up the feature
  - When using the SZ_ENTITY_INCLUDE_REPRESENTATIVE_FEATURES flag this is the representative feature
- LIB_FEAT_ID - The unique identifier for the feature value
- USAGE_TYPE - Label identifying how some features are being used (can also change the behavior of some features)
- FEAT_DESC_VALUES - The feature values clustered under this feature, each with its own FEAT_DESC and LIB_FEAT_ID
In the snippet above there are 2 objects for the ADDRESS feature. Notice the first object has 2 further objects in the FEAT_DESC_VALUES array; multiple feature values scored as being similar are clustered together. The 2 addresses are similar: the first uses Lane and 89111 whereas the second uses Ln and 89132. Moreover, both addresses specified the ADDR_TYPE attribute (which is a USAGE_TYPE) as HOME on the records loaded into Senzing. When features have the same USAGE_TYPE they are clustered around both the USAGE_TYPE and similar values.
The clustering above happens in the default output document or when the SZ_ENTITY_INCLUDE_REPRESENTATIVE_FEATURES flag is used.
If one of the 2 similar addresses didn’t have a USAGE_TYPE, or had one that was not HOME, there would be 3 ADDRESS objects without any clustering.
...
"FEATURES": {
"ADDRESS": [
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 93,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 93
}
]
},
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 29,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 29
}
]
},
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 72,
"USAGE_TYPE": "MAILING",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 72
}
]
}
],
...
The same can be seen for the NAME features. The second object contains 3 similar names clustered around the USAGE_TYPE of PRIMARY. However, the first object only contains B Smith. Although B could be considered to match Bob, it could also represent Barry, Ben, Belinda, etc., so it doesn’t score as similar.
...
"NAME": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 91,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 91
}
]
},
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 71,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 71
},
{
"FEAT_DESC": "Bob J Smith",
"LIB_FEAT_ID": 1
},
{
"FEAT_DESC": "Bob Smith",
"LIB_FEAT_ID": 27
}
]
}
],
...
When the output clusters similar features as in the examples above, the FEAT_DESC in the root of the object is the longest representation of the similar clustered values from the FEAT_DESC_VALUES array.
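The clustered layout can be walked with a couple of nested loops. The following is a sketch only: `features_json` is copied by hand from the NAME example in this document, and `list_feature_clusters` is a hypothetical helper that pairs each representative FEAT_DESC with the other values clustered under it.

```python
import json

# Hand-copied NAME features from the example in this document.
features_json = """
{
  "NAME": [
    {"FEAT_DESC": "B Smith", "LIB_FEAT_ID": 91, "USAGE_TYPE": "PRIMARY",
     "FEAT_DESC_VALUES": [{"FEAT_DESC": "B Smith", "LIB_FEAT_ID": 91}]},
    {"FEAT_DESC": "Robert Smith", "LIB_FEAT_ID": 71, "USAGE_TYPE": "PRIMARY",
     "FEAT_DESC_VALUES": [
       {"FEAT_DESC": "Robert Smith", "LIB_FEAT_ID": 71},
       {"FEAT_DESC": "Bob J Smith", "LIB_FEAT_ID": 1},
       {"FEAT_DESC": "Bob Smith", "LIB_FEAT_ID": 27}]}
  ]
}
"""


def list_feature_clusters(features):
    """Return (feature type, usage type, representative, other values) tuples."""
    clusters = []
    for feature_type, entries in features.items():
        for entry in entries:
            representative = entry["FEAT_DESC"]
            # Every clustered value except the representative itself.
            variants = [
                value["FEAT_DESC"]
                for value in entry.get("FEAT_DESC_VALUES", [])
                if value["FEAT_DESC"] != representative
            ]
            # USAGE_TYPE is optional, so .get returns None when it is absent.
            clusters.append(
                (feature_type, entry.get("USAGE_TYPE"), representative, variants)
            )
    return clusters


clusters = list_feature_clusters(json.loads(features_json))
for feature_type, usage_type, representative, variants in clusters:
    print(feature_type, usage_type, representative, variants)
```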
RECORD_SUMMARY
Each element of this array is an overview of a source system the entity’s records came from. Elements are grouped by the data source code used when adding a record to Senzing.
...
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 4
}
],
...
- DATA_SOURCE - A label for identifying the system a record came from
- RECORD_COUNT - The number of records for the entity with the same data source code
The example entity consists of 4 records, all from the CUSTOMERS data source. The following entity consists of 2 records, 1 from the CUSTOMERS data source and the other from the WATCHLIST data source.
...
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 1
},
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
],
...
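A record summary is convenient to fold into a dictionary keyed by data source. This sketch assumes the array has already been parsed from the response; `count_by_data_source` is a hypothetical helper name, and the sample data is the two-source summary shown above.

```python
# The two-source RECORD_SUMMARY from the example above, already parsed.
record_summary = [
    {"DATA_SOURCE": "CUSTOMERS", "RECORD_COUNT": 1},
    {"DATA_SOURCE": "WATCHLIST", "RECORD_COUNT": 1},
]


def count_by_data_source(summary):
    """Map each data source code to the number of records it contributed."""
    return {item["DATA_SOURCE"]: item["RECORD_COUNT"] for item in summary}


counts = count_by_data_source(record_summary)
total_records = sum(counts.values())
on_watchlist = "WATCHLIST" in counts

print(counts, total_records, on_watchlist)
```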
RECORDS
This array contains details about the records that have resolved to and constitute the entity. Each element of the array provides details about a single record and why it resolved to the entity at the point in time the decision was made.
...
"RECORDS": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1002",
"INTERNAL_ID": 3,
"MATCH_KEY": "",
"MATCH_LEVEL_CODE": "",
"ERRULE_CODE": ""
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1003",
"INTERNAL_ID": 1,
"MATCH_KEY": "+NAME+DOB+EMAIL",
"MATCH_LEVEL_CODE": "RESOLVED",
"ERRULE_CODE": "SF1_PNAME_CSTAB"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1004",
"INTERNAL_ID": 8,
"MATCH_KEY": "+NAME+DOB+ADDRESS",
"MATCH_LEVEL_CODE": "RESOLVED",
"ERRULE_CODE": "CNAME_CFF_CEXCL"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1001",
"INTERNAL_ID": 9,
"MATCH_KEY": "+NAME+DOB+PHONE+EMAIL",
"MATCH_LEVEL_CODE": "RESOLVED",
"ERRULE_CODE": "SF1_SNAME_CFF_CSTAB"
}
]
...
- DATA_SOURCE - The data source code
- RECORD_ID - Unique identifier for the record in the data source
- INTERNAL_ID - Internal identifier for the record
- MATCH_KEY - Representation of matched source record features
- MATCH_LEVEL_CODE - The type of match that occurred for the record
- ERRULE_CODE - Identifier of the entity resolution rule that was triggered
MATCH_KEY - Consider a MATCH_KEY value of +NAME+ADDRESS-DOB. This indicates the name and address matched but the date of birth did not. + means the feature contributed to the match and - means it detracted from the match. - is only shown for exclusive features that could break a match.
Notice the first object in the array has empty string values for MATCH_KEY, MATCH_LEVEL_CODE, and ERRULE_CODE; this was the initial record that formed the entity. The other records subsequently resolved to this entity, and their MATCH_KEY, MATCH_LEVEL_CODE, and ERRULE_CODE give an overview of the reason at the time this happened. The second object in the array, the record from the CUSTOMERS data source with record ID 1003, resolved to the entity when it was loaded because its name, date of birth, and email matched. This satisfied the conditions of the SF1_PNAME_CSTAB entity resolution rule.
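To illustrate the + and - convention, a match key of the simple form shown here can be split with a regular expression. This is a sketch under the assumption that feature names contain neither + nor -; match keys with richer content would need a more careful parser. `parse_match_key` is a hypothetical helper, not an SDK function.

```python
import re


def parse_match_key(match_key):
    """Split a simple match key into contributing and detracting feature names."""
    contributed, detracted = [], []
    # Each token is a sign followed by a run of characters that are not signs.
    for sign, feature in re.findall(r"([+-])([^+-]+)", match_key):
        if sign == "+":
            contributed.append(feature)
        else:
            detracted.append(feature)
    return contributed, detracted


# Trimmed records from the example; an empty MATCH_KEY marks the initial record.
records = [
    {"RECORD_ID": "1002", "MATCH_KEY": ""},
    {"RECORD_ID": "1003", "MATCH_KEY": "+NAME+DOB+EMAIL"},
]
initial_records = [r["RECORD_ID"] for r in records if r["MATCH_KEY"] == ""]

print(parse_match_key("+NAME+ADDRESS-DOB"))
print(initial_records)
```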
RELATED_ENTITIES Array
Elements in this array are objects describing relationships between this entity and other entities in the system; each object describes 1 relationship. If an entity has no relationships the RELATED_ENTITIES array will be empty.
...
"RELATED_ENTITIES": [
{
"ENTITY_ID": 6,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME+ADDRESS-DOB",
"ERRULE_CODE": "CNAME_CFF_DEXCL",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Robert E Smith Sr",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 1
},
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
]
},
{
"ENTITY_ID": 144,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME",
"ERRULE_CODE": "SNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Robert Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
]
},
{
"ENTITY_ID": 147,
"MATCH_LEVEL_CODE": "POSSIBLY_RELATED",
"MATCH_KEY": "+SURNAME+ADDRESS",
"ERRULE_CODE": "CFF_SURNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Patricia Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
]
}
]
- ENTITY_ID - Current unique identifier for the related entity
- MATCH_LEVEL_CODE - The type of relationship that was established
- MATCH_KEY - Representation of matched source record features
- ERRULE_CODE - Identifier of the entity resolution rule that was triggered
- IS_DISCLOSED - Indicates if this is a disclosed relationship
- IS_AMBIGUOUS - Indicates if this is an ambiguous relationship
- ENTITY_NAME - A name the related entity goes by
- RECORD_SUMMARY - Overview of the source systems the records came from comprising this related entity
- DATA_SOURCE - The data source code
- RECORD_COUNT - Number of records for the related entity with the same data source code
Entity 1 has 3 relationships to other entities, identified as entity IDs 6, 144, and 147. The relationships to entities 6 and 144 have a match level of POSSIBLY_SAME and the relationship to entity 147 has a match level of POSSIBLY_RELATED.
Similar to the RECORDS array, there are overview details on why a relationship was established. The relationship to entity ID 6 was established because the NAME and ADDRESS features matched but the DOB didn’t, triggering the CNAME_CFF_DEXCL rule.
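Grouping relationships by their match level is a common first step when consuming this array. The sketch below uses a trimmed copy of the RELATED_ENTITIES example; `group_by_match_level` is a hypothetical helper name.

```python
from collections import defaultdict

# Trimmed copy of the RELATED_ENTITIES example above.
related_entities = [
    {"ENTITY_ID": 6, "MATCH_LEVEL_CODE": "POSSIBLY_SAME",
     "MATCH_KEY": "+NAME+ADDRESS-DOB"},
    {"ENTITY_ID": 144, "MATCH_LEVEL_CODE": "POSSIBLY_SAME",
     "MATCH_KEY": "+NAME"},
    {"ENTITY_ID": 147, "MATCH_LEVEL_CODE": "POSSIBLY_RELATED",
     "MATCH_KEY": "+SURNAME+ADDRESS"},
]


def group_by_match_level(related):
    """Map each MATCH_LEVEL_CODE to the related entity IDs at that level."""
    groups = defaultdict(list)
    for entity in related:
        groups[entity["MATCH_LEVEL_CODE"]].append(entity["ENTITY_ID"])
    return dict(groups)


groups = group_by_match_level(related_entities)
print(groups)
```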
Using Flags to Specify the Output Document Contents
It is beyond the scope of this document to go into every flag option and associated outcome. For additional details on flags, including those for get_entity_by_entity_id and get_entity_by_record_id, see Engine Flags.
The get entity methods, like many other SDK methods, accept an optional flags argument to control the content and details of the JSON response document. The example output thus far is the default without specifying any flags; in this situation the output is constructed using the flag SZ_ENTITY_DEFAULT_FLAGS. The following examples produce the same output:
...
result = sz_engine.get_entity_by_entity_id(787)
...
...
result = sz_engine.get_entity_by_entity_id(787, flags=SzEngineFlags.SZ_ENTITY_DEFAULT_FLAGS)
...
The SZ_ENTITY_DEFAULT_FLAGS flag is a composite of multiple other flags:
- SZ_ENTITY_INCLUDE_ALL_RELATIONS
- SZ_ENTITY_INCLUDE_REPRESENTATIVE_FEATURES
- SZ_ENTITY_INCLUDE_ENTITY_NAME
- SZ_ENTITY_INCLUDE_RECORD_SUMMARY
- SZ_ENTITY_INCLUDE_RECORD_DATA
- SZ_ENTITY_INCLUDE_RECORD_MATCHING_INFO
- SZ_ENTITY_INCLUDE_RELATED_ENTITY_NAME
- SZ_ENTITY_INCLUDE_RELATED_RECORD_SUMMARY
- SZ_ENTITY_INCLUDE_RELATED_MATCHING_INFO
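The idea of a composite flag can be illustrated without the SDK. The sketch below uses a hypothetical `DemoFlags` enum whose member names echo a few of the flags above; the numeric values are invented for illustration and are not the SDK's values. It assumes only that flags combine with bitwise OR, as in the examples in this document.

```python
from enum import IntFlag, auto


class DemoFlags(IntFlag):
    """Hypothetical stand-in for SDK flags; the values are invented."""
    INCLUDE_ENTITY_NAME = auto()
    INCLUDE_RECORD_SUMMARY = auto()
    INCLUDE_RECORD_DATA = auto()
    INCLUDE_ALL_RELATIONS = auto()


# A composite is simply the bitwise OR of its members.
DEMO_DEFAULT = (
    DemoFlags.INCLUDE_ENTITY_NAME
    | DemoFlags.INCLUDE_RECORD_SUMMARY
    | DemoFlags.INCLUDE_RECORD_DATA
    | DemoFlags.INCLUDE_ALL_RELATIONS
)

# Start from the composite and clear one member to request less.
reduced = DEMO_DEFAULT & ~DemoFlags.INCLUDE_ALL_RELATIONS

print(DemoFlags.INCLUDE_RECORD_DATA in DEMO_DEFAULT)
print(DemoFlags.INCLUDE_ALL_RELATIONS in reduced)
```

Building a flag set up from individual members, as the request that follows does, is usually easier to read than subtracting from a composite.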
Consider a scenario where an application utilizing Senzing needed to retrieve the entity details for a previously loaded record, but required less information than the default output provides. Using the flags argument, such a request could be:
...
data_source_code = "CUSTOMERS"
record_id = "1001"
sz_flags = (
SzEngineFlags.SZ_ENTITY_INCLUDE_ALL_FEATURES
| SzEngineFlags.SZ_ENTITY_INCLUDE_ENTITY_NAME
| SzEngineFlags.SZ_ENTITY_INCLUDE_RECORD_DATA
)
result = sz_engine.get_entity_by_record_id(data_source_code, record_id, sz_flags)
...
The resulting JSON response document is reduced by using flags to specify only the data required to satisfy the request:
{
"RESOLVED_ENTITY": {
"ENTITY_ID": 1,
"ENTITY_NAME": "Robert Smith",
"FEATURES": {
"ADDRESS": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 26,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 26
}
]
},
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 78,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 78
}
]
},
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 60,
"USAGE_TYPE": "MAILING",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 60
}
]
}
],
"DOB": [
{
"FEAT_DESC": "11/12/1978",
"LIB_FEAT_ID": 25,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "11/12/1978",
"LIB_FEAT_ID": 25
}
]
},
{
"FEAT_DESC": "11/12/1979",
"LIB_FEAT_ID": 77,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "11/12/1979",
"LIB_FEAT_ID": 77
}
]
},
{
"FEAT_DESC": "12/11/1978",
"LIB_FEAT_ID": 2,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "12/11/1978",
"LIB_FEAT_ID": 2
}
]
}
],
"EMAIL": [
{
"FEAT_DESC": "bsmith@work.com",
"LIB_FEAT_ID": 3,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "bsmith@work.com",
"LIB_FEAT_ID": 3
}
]
}
],
"NAME": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 76,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 76
}
]
},
{
"FEAT_DESC": "Bob J Smith",
"LIB_FEAT_ID": 1,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Bob J Smith",
"LIB_FEAT_ID": 1
}
]
},
{
"FEAT_DESC": "Bob Smith",
"LIB_FEAT_ID": 24,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Bob Smith",
"LIB_FEAT_ID": 24
}
]
},
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 59,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 59
}
]
}
],
"PHONE": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61
}
]
},
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61,
"USAGE_TYPE": "MOBILE",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61
}
]
}
],
"RECORD_TYPE": [
{
"FEAT_DESC": "PERSON",
"LIB_FEAT_ID": 22,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "PERSON",
"LIB_FEAT_ID": 22
}
]
}
]
},
"RECORDS": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1002"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1003"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1001"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1004"
}
]
}
}
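Because the flags decide which keys appear, consuming code should not assume a key is present. In the reduced document above there is no RELATED_ENTITIES array and the RECORDS elements carry no matching information. A minimal sketch of tolerant access, using a trimmed stand-in response and a hypothetical `summarize` helper:

```python
import json

# Trimmed stand-in for the reduced response shown above.
reduced_response = """
{
  "RESOLVED_ENTITY": {
    "ENTITY_ID": 1,
    "ENTITY_NAME": "Robert Smith",
    "RECORDS": [
      {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1002"},
      {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1001"}
    ]
  }
}
"""


def summarize(response_json):
    """Collect a few fields, tolerating keys the flags left out."""
    document = json.loads(response_json)
    resolved = document.get("RESOLVED_ENTITY", {})
    records = resolved.get("RECORDS", [])
    return {
        "entity_id": resolved.get("ENTITY_ID"),
        "entity_name": resolved.get("ENTITY_NAME"),
        "record_keys": [(r["DATA_SOURCE"], r["RECORD_ID"]) for r in records],
        # None for each record when matching info was not requested.
        "match_keys": [r.get("MATCH_KEY") for r in records],
        # Empty when relationships were not requested.
        "related_ids": [
            e["ENTITY_ID"] for e in document.get("RELATED_ENTITIES", [])
        ],
    }


summary = summarize(reduced_response)
print(summary)
```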
Recommendations for Using Flags
The main recommendation is to carefully consider the minimal level of detail required from the get entity methods. In large Senzing systems ingesting from many sources, entities can become substantial in size and can have many relationships to other entities.
- Don’t rely on the default response document without using flags; the default is to provide a rounded response.
- Do use specific flags to control what information is returned, and request only what is required.
- The same application or service may be calling get entity for different purposes. Don’t rely on a single set of flags when one request requires minimal information and another requires more thorough information.
- Returning features requires additional IO; if they are not needed, don’t request them.
- When you do need to return the entity features, examine the difference between the SZ_ENTITY_INCLUDE_ALL_FEATURES and SZ_ENTITY_INCLUDE_REPRESENTATIVE_FEATURES flags. The latter involves additional CPU usage to score and cluster the features on similar values.
- Using SZ_ENTITY_INCLUDE_REPRESENTATIVE_FEATURES is typically only required when displaying entities to end users. The representative feature is the value of the FEAT_DESC key in the root JSON object of each element in the feature type array.
- If requesting RELATED_ENTITIES, don’t get the related entity record summary or record data (SZ_ENTITY_INCLUDE_RELATED_RECORD_SUMMARY and SZ_ENTITY_INCLUDE_RELATED_RECORD_DATA) unless necessary; these also incur additional IO cost. Generally, these are not required for an overview of any relationships, and the information can be requested when a deeper understanding of each related entity is required.
Example Use Cases and Flags
Advanced Real-time Replication and Analytics
Methods that can modify entity resolution outcomes, such as add_record, can optionally provide a JSON response document detailing entities affected by the operation. This can be used for passing changes to downstream systems for analytics and/or replication; see Advanced Real-time Replication and Analytics. For this use case it’s typical to only require the data sources and record IDs of the entity, plus related entity details if they are important to you.
Consider calling add_record("CUSTOMERS", "1004", <RECORD_DATA>, flags=SzEngineFlags.SZ_WITH_INFO) to add a record, complete entity resolution, and return a “with info” document indicating that entity ID 767 was affected by the call.
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1004",
"AFFECTED_ENTITIES": [
{
"ENTITY_ID": 767
}
]
}
To reflect changes in the composition of this entity in a downstream system, the data sources and record IDs for the entity are required. Calling get_entity_by_entity_id(767) with the flags SZ_ENTITY_INCLUDE_RECORD_DATA and SZ_ENTITY_INCLUDE_ALL_RELATIONS produces the following JSON document that can be used to update the downstream system:
{
"RESOLVED_ENTITY": {
"ENTITY_ID": 767,
"RECORDS": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1001"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1002"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1003"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1004"
}
]
},
"RELATED_ENTITIES": [
{
"ENTITY_ID": 11
},
{
"ENTITY_ID": 141
},
{
"ENTITY_ID": 145
}
]
}
In this scenario we are also interested in the entity relationships and request them. Only the entity IDs of the relationships are required in the downstream system. If the composition of any of the related entities had also changed, they would be in the AFFECTED_ENTITIES array of the “with info” result from calling the add_record method, in which case get_entity_by_entity_id would also be called on them to retrieve their data sources and record IDs for updating downstream. If RELATED_ENTITIES weren’t required, use only the SZ_ENTITY_INCLUDE_RECORD_DATA flag.
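One way to turn such a response into a downstream update is sketched below. The `response` string is a stand-in copied from the example for entity 767; `to_downstream_update` and the `previous_records` set (what a downstream store might already hold) are hypothetical and only illustrate the comparison.

```python
import json

# Stand-in copied from the entity 767 example in this document.
response = """
{
  "RESOLVED_ENTITY": {
    "ENTITY_ID": 767,
    "RECORDS": [
      {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1001"},
      {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1002"},
      {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1003"},
      {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1004"}
    ]
  },
  "RELATED_ENTITIES": [{"ENTITY_ID": 11}, {"ENTITY_ID": 141}, {"ENTITY_ID": 145}]
}
"""


def to_downstream_update(response_json):
    """Reduce a response to the record keys and related entity IDs."""
    document = json.loads(response_json)
    resolved = document["RESOLVED_ENTITY"]
    return {
        "entity_id": resolved["ENTITY_ID"],
        "records": sorted(
            (r["DATA_SOURCE"], r["RECORD_ID"]) for r in resolved.get("RECORDS", [])
        ),
        "related_entity_ids": sorted(
            e["ENTITY_ID"] for e in document.get("RELATED_ENTITIES", [])
        ),
    }


# Hypothetical state previously stored downstream for entity 767.
previous_records = {
    ("CUSTOMERS", "1001"), ("CUSTOMERS", "1002"), ("CUSTOMERS", "1003"),
}

update = to_downstream_update(response)
added = sorted(set(update["records"]) - previous_records)
removed = sorted(previous_records - set(update["records"]))
print(added, removed, update["related_entity_ids"])
```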
Populating a Graph
An application used to analyze and visualize investigations can search Senzing for entities of interest and display them and their relationships, for example by representing them as a graph structure. An analyst issues a search asking if Senzing can recall any entities that could match a name and a passport of an individual of interest to them. A matching entity is located, and the analyst requests that the entity and any first degree relationships be rendered in the application. What flags would be useful for initially rendering this information?
The graph doesn’t need the full set of information for the matching entity, or entities it is related to; presenting everything would also clutter a graph. In the graph entities are represented as nodes and relationships as edges. For the nodes it would be useful to visualize the entity name, its current entity ID, and change the color or icon of the node if any of the source records came from an interesting data source. Additionally, for the initial render of the graph, nodes and edges would be rendered representing the relationships and brief details on why the relationship was established. These flags would retrieve enough information for this initial rendering:
- SZ_ENTITY_INCLUDE_ENTITY_NAME
- SZ_ENTITY_INCLUDE_RECORD_SUMMARY
- SZ_ENTITY_INCLUDE_ALL_RELATIONS
- SZ_ENTITY_INCLUDE_RELATED_ENTITY_NAME
- SZ_ENTITY_INCLUDE_RELATED_MATCHING_INFO
- SZ_ENTITY_INCLUDE_RELATED_RECORD_SUMMARY
The result from the search indicated entity ID 145 is a potential match. Calling get_entity_by_entity_id with these flags produces:
{
"RESOLVED_ENTITY": {
"ENTITY_ID": 145,
"ENTITY_NAME": "Patricia Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_COUNT": 1
}
]
},
"RELATED_ENTITIES": [
{
"ENTITY_ID": 29,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME",
"ERRULE_CODE": "SNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Patrick Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 2
}
]
},
{
"ENTITY_ID": 30,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME",
"ERRULE_CODE": "SNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Patricia Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 2
}
]
},
{
"ENTITY_ID": 1,
"MATCH_LEVEL_CODE": "POSSIBLY_RELATED",
"MATCH_KEY": "+SURNAME+ADDRESS",
"ERRULE_CODE": "CFF_SURNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Robert Smith",
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 4
}
]
}
]
}
The response JSON document is succinct yet contains enough information for the application to render a meaningful graph.
- The RESOLVED_ENTITY object contains the searched entity name, entity ID, and record summary for the node of the entity of interest.
- The only record comprising this entity came from a WATCHLIST data source; the icon or color of the node can be changed to signify an entity of interest.
- The RELATED_ENTITIES array shows the entity of interest has 3 relationships; each element in the array has the required information for their nodes and edges to link to the entity of interest.
- None of the related entities have records from an interesting data source, but the nodes could still be modified to visually indicate that the related entities are customers.
- There are 2 different types of relationships, POSSIBLY_SAME and POSSIBLY_RELATED. The visual formatting of the edges can be modified to indicate these differences, and a label added for the type of relationship.
- The MATCH_KEY for each of the relationships indicates why the relationship was established and can be used on the edges for quick reference.
- Each of the entity nodes can add an indicator of how many records comprise each entity. It’s possible an entity with a lot of records is interesting to an analyst.
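The points above can be sketched as a small transformation from the response into node and edge dictionaries. The node and edge field names (`id`, `label`, `highlight`, and so on) are hypothetical choices for an imagined graph front end, and treating WATCHLIST as the interesting data source is an assumption taken from this scenario. The sample document is trimmed from the response above.

```python
# Trimmed from the entity 145 response above, already parsed.
document = {
    "RESOLVED_ENTITY": {
        "ENTITY_ID": 145,
        "ENTITY_NAME": "Patricia Smith",
        "RECORD_SUMMARY": [{"DATA_SOURCE": "WATCHLIST", "RECORD_COUNT": 1}],
    },
    "RELATED_ENTITIES": [
        {"ENTITY_ID": 29, "MATCH_LEVEL_CODE": "POSSIBLY_SAME",
         "MATCH_KEY": "+NAME", "ENTITY_NAME": "Patrick Smith",
         "RECORD_SUMMARY": [{"DATA_SOURCE": "CUSTOMERS", "RECORD_COUNT": 2}]},
        {"ENTITY_ID": 1, "MATCH_LEVEL_CODE": "POSSIBLY_RELATED",
         "MATCH_KEY": "+SURNAME+ADDRESS", "ENTITY_NAME": "Robert Smith",
         "RECORD_SUMMARY": [{"DATA_SOURCE": "CUSTOMERS", "RECORD_COUNT": 4}]},
    ],
}


def build_graph(doc, interesting_sources=frozenset({"WATCHLIST"})):
    """Build node and edge dictionaries for an initial graph render."""

    def make_node(entity):
        summary = entity.get("RECORD_SUMMARY", [])
        return {
            "id": entity["ENTITY_ID"],
            "label": entity.get("ENTITY_NAME", ""),
            "record_count": sum(s["RECORD_COUNT"] for s in summary),
            # Highlight nodes with any record from an interesting source.
            "highlight": any(
                s["DATA_SOURCE"] in interesting_sources for s in summary
            ),
        }

    resolved = doc["RESOLVED_ENTITY"]
    nodes = [make_node(resolved)]
    edges = []
    for related in doc.get("RELATED_ENTITIES", []):
        nodes.append(make_node(related))
        edges.append({
            "source": resolved["ENTITY_ID"],
            "target": related["ENTITY_ID"],
            "type": related.get("MATCH_LEVEL_CODE"),
            "label": related.get("MATCH_KEY"),
        })
    return nodes, edges


nodes, edges = build_graph(document)
print(nodes)
print(edges)
```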
It can also be helpful for each of the related entity nodes to indicate how many relationships it has, prompting an analyst to expand and explore the graph further. In addition to the original get_entity_by_entity_id call for the graph, additional calls for each of the 3 related entities would be issued to support this. Using only the SZ_ENTITY_INCLUDE_ALL_RELATIONS flag on each of the related entities would result in a response such as the following, allowing a relationship count indicator to be placed on each related entity node. Entity ID 1 has 3 relationships and the graph currently represents only the relationship to entity ID 145, indicating to the analyst that there are 2 additional relationships to investigate.
{
"RESOLVED_ENTITY": {
"ENTITY_ID": 1
},
"RELATED_ENTITIES": [
{
"ENTITY_ID": 11
},
{
"ENTITY_ID": 141
},
{
"ENTITY_ID": 145
}
]
}
This process can continue to explore, recall, and render further relationships in the application.
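Working out which relationships are not yet on the graph is a set difference. In this sketch `unexplored_relationships` is a hypothetical helper, and `rendered_entity_ids` stands for whatever the application already tracks as drawn; the sample document is the minimal response for entity ID 1 shown above.

```python
# The minimal relations-only response for entity ID 1, already parsed.
document = {
    "RESOLVED_ENTITY": {"ENTITY_ID": 1},
    "RELATED_ENTITIES": [
        {"ENTITY_ID": 11}, {"ENTITY_ID": 141}, {"ENTITY_ID": 145},
    ],
}


def unexplored_relationships(doc, rendered_entity_ids):
    """Return related entity IDs that are not yet rendered on the graph."""
    related_ids = {e["ENTITY_ID"] for e in doc.get("RELATED_ENTITIES", [])}
    return sorted(related_ids - set(rendered_entity_ids))


# Entities already drawn: the entity of interest and its 3 related entities.
rendered_entity_ids = {145, 29, 30, 1}

remaining = unexplored_relationships(document, rendered_entity_ids)
print(len(document["RELATED_ENTITIES"]), remaining)
```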
Detailed Entity Report
In the previous 2 examples minimal information was requested, but there are of course use cases where returning extended entity information is required. Consider an application where users need to review a detailed report for an entity, or, in the analyst example, need to review a detailed résumé for an entity in their graph and understand everything Senzing knows about it.
The flags to use depend on the level of information required by the end users. Keep in mind the points in Recommendations for Using Flags and only request the minimum required! These flags provide similar details to SZ_ENTITY_DEFAULT_FLAGS; either might be a good starting point for exploration, adding or removing flags for your individual needs:
- SZ_ENTITY_INCLUDE_ALL_FEATURES
- SZ_ENTITY_INCLUDE_ENTITY_NAME
- SZ_ENTITY_INCLUDE_RECORD_SUMMARY
- SZ_ENTITY_INCLUDE_RECORD_DATA
- SZ_ENTITY_INCLUDE_ALL_RELATIONS
- SZ_ENTITY_INCLUDE_RELATED_ENTITY_NAME
- SZ_ENTITY_INCLUDE_RELATED_MATCHING_INFO
- SZ_ENTITY_INCLUDE_RELATED_RECORD_DATA
{
"RESOLVED_ENTITY": {
"ENTITY_ID": 1,
"ENTITY_NAME": "Robert Smith",
"FEATURES": {
"ADDRESS": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 26,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
"LIB_FEAT_ID": 26
}
]
},
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 78,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "1515 Adela Ln Las Vegas NV 89132",
"LIB_FEAT_ID": 78
}
]
},
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 60,
"USAGE_TYPE": "MAILING",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "123 Main Street, Las Vegas NV 89132",
"LIB_FEAT_ID": 60
}
]
}
],
"DOB": [
{
"FEAT_DESC": "11/12/1978",
"LIB_FEAT_ID": 25,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "11/12/1978",
"LIB_FEAT_ID": 25
}
]
},
{
"FEAT_DESC": "11/12/1979",
"LIB_FEAT_ID": 77,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "11/12/1979",
"LIB_FEAT_ID": 77
}
]
},
{
"FEAT_DESC": "12/11/1978",
"LIB_FEAT_ID": 2,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "12/11/1978",
"LIB_FEAT_ID": 2
}
]
}
],
"EMAIL": [
{
"FEAT_DESC": "bsmith@work.com",
"LIB_FEAT_ID": 3,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "bsmith@work.com",
"LIB_FEAT_ID": 3
}
]
}
],
"NAME": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 76,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "B Smith",
"LIB_FEAT_ID": 76
}
]
},
{
"FEAT_DESC": "Bob J Smith",
"LIB_FEAT_ID": 1,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Bob J Smith",
"LIB_FEAT_ID": 1
}
]
},
{
"FEAT_DESC": "Bob Smith",
"LIB_FEAT_ID": 24,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Bob Smith",
"LIB_FEAT_ID": 24
}
]
},
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 59,
"USAGE_TYPE": "PRIMARY",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "Robert Smith",
"LIB_FEAT_ID": 59
}
]
}
],
"PHONE": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61,
"USAGE_TYPE": "HOME",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61
}
]
},
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61,
"USAGE_TYPE": "MOBILE",
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "702-919-1300",
"LIB_FEAT_ID": 61
}
]
}
],
"RECORD_TYPE": [
{
"FEAT_DESC": "PERSON",
"LIB_FEAT_ID": 22,
"FEAT_DESC_VALUES": [
{
"FEAT_DESC": "PERSON",
"LIB_FEAT_ID": 22
}
]
}
]
},
"RECORD_SUMMARY": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_COUNT": 4
}
],
"RECORDS": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1002"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1003"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1001"
},
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1004"
}
]
},
"RELATED_ENTITIES": [
{
"ENTITY_ID": 11,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME+ADDRESS-DOB",
"ERRULE_CODE": "CNAME_CFF_DEXCL",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Robert E Smith Sr",
"RECORDS": [
{
"DATA_SOURCE": "CUSTOMERS",
"RECORD_ID": "1005"
},
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_ID": "1006"
}
]
},
{
"ENTITY_ID": 141,
"MATCH_LEVEL_CODE": "POSSIBLY_SAME",
"MATCH_KEY": "+NAME",
"ERRULE_CODE": "SNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Robert Smith",
"RECORDS": [
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_ID": "1008"
}
]
},
{
"ENTITY_ID": 145,
"MATCH_LEVEL_CODE": "POSSIBLY_RELATED",
"MATCH_KEY": "+SURNAME+ADDRESS",
"ERRULE_CODE": "CFF_SURNAME",
"IS_DISCLOSED": 0,
"IS_AMBIGUOUS": 0,
"ENTITY_NAME": "Patricia Smith",
"RECORDS": [
{
"DATA_SOURCE": "WATCHLIST",
"RECORD_ID": "1007"
}
]
}
]
}
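A detailed response like this can be flattened into report lines for display. The following is a sketch only: `format_report` is a hypothetical helper, the line layout is an arbitrary choice, and the sample document is a heavily trimmed version of the response above.

```python
# Heavily trimmed version of the detailed response above, already parsed.
document = {
    "RESOLVED_ENTITY": {
        "ENTITY_ID": 1,
        "ENTITY_NAME": "Robert Smith",
        "FEATURES": {
            "ADDRESS": [{"FEAT_DESC": "1515 Adela Lane Las Vegas NV 89111",
                         "USAGE_TYPE": "HOME"}],
            "EMAIL": [{"FEAT_DESC": "bsmith@work.com"}],
        },
        "RECORDS": [{"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1001"}],
    },
    "RELATED_ENTITIES": [
        {"ENTITY_ID": 145, "MATCH_LEVEL_CODE": "POSSIBLY_RELATED",
         "MATCH_KEY": "+SURNAME+ADDRESS", "ENTITY_NAME": "Patricia Smith"},
    ],
}


def format_report(doc):
    """Flatten a get entity document into plain-text report lines."""
    resolved = doc["RESOLVED_ENTITY"]
    name = resolved.get("ENTITY_NAME", "(no name requested)")
    lines = [f"Entity {resolved['ENTITY_ID']}: {name}"]
    for feature_type, entries in resolved.get("FEATURES", {}).items():
        for entry in entries:
            usage = entry.get("USAGE_TYPE")
            # Show the usage type only when the feature carries one.
            label = f"{feature_type} ({usage})" if usage else feature_type
            lines.append(f"  {label}: {entry['FEAT_DESC']}")
    for record in resolved.get("RECORDS", []):
        lines.append(f"  Record: {record['DATA_SOURCE']} {record['RECORD_ID']}")
    for related in doc.get("RELATED_ENTITIES", []):
        lines.append(
            f"  Related: {related['ENTITY_ID']} {related.get('ENTITY_NAME', '')} "
            f"[{related.get('MATCH_LEVEL_CODE', '')}] {related.get('MATCH_KEY', '')}"
        )
    return lines


report = format_report(document)
print("\n".join(report))
```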