Senzing v4 Linux Quickstart Guide

This article covers installing the Senzing SDK on Linux, loading data and performing entity resolution, analyzing and exploring the results, and preparing and loading your own data into Senzing.

Info

Senzing provides 500 source records for ingestion and evaluation for free. If you require additional records for an evaluation, or any assistance when following this guide, please contact Senzing Support. Support is 100% FREE!

The installation steps add the Senzing software repository to your Linux distribution. These steps only need to be completed once. During installation you will be asked to accept the End User License Agreement (EULA). On Red Hat-based distributions you will also be prompted to accept the Senzing public key.

Info

To help you get started quickly, an embedded SQLite database is configured when you create a Senzing project. SQLite is convenient for evaluation; production systems would use an enterprise-level RDBMS such as Postgres. For additional information see Technical - Database.

Installing Senzing - Debian Based Distributions

Add APT repository

Add and enable the Senzing APT repository in the list of repositories managed by apt. This step only needs to be completed once.

sudo apt install apt-transport-https
wget https://senzing-production-apt.s3.amazonaws.com/senzingrepo_2.0.1-1_all.deb
sudo apt install ./senzingrepo_2.0.1-1_all.deb
sudo apt update

Install package

Info

The latest version of Senzing can now be installed. As part of the installation you will be asked to accept the End User License Agreement (EULA).

sudo apt update
sudo apt install senzingsdk-poc

Installing Senzing - Red Hat Based Distributions

Add YUM repository

Add and enable the Senzing YUM repository to the currently configured list managed by yum. This step only needs to be completed once.

sudo yum install https://senzing-production-yum.s3.amazonaws.com/senzingrepo-2.0.1-1.noarch.rpm

Install package

Info

The latest version of Senzing can now be installed. As part of the installation you will be asked to accept the End User License Agreement (EULA).

sudo yum install senzingsdk-poc

Tip

The first time you install Senzing on a system, you will also be prompted to accept the Senzing public key. Accepting the prompt imports the public key, which is used to verify that future installations come from Senzing.

Retrieving key from https://senzing-production-yum.s3.amazonaws.com/senzing-production.key
Importing GPG key 0xD99E309D:
 Userid : "Senzing, Inc. <buildmgr@senzing.com>"
 Fingerprint: e38c a28c f7ab 06d5 120b bda7 4f67 bf4d d99e 309d
 From : https://senzing-production-yum.s3.amazonaws.com/senzing-production.key
Is this ok [y/N]: y

Create a Senzing Project

To begin using Senzing, first create a project. This deploys an instance of Senzing into a specified path. The project folder must not already exist and will be created by the /opt/senzing/er/bin/sz_create_project utility.

/opt/senzing/er/bin/sz_create_project <senzing_project_path>

Creating and using projects provides independent and isolated instances of Senzing. Projects can be upgraded from prior Senzing versions.

For example, the following command creates the Senzing project in a new directory named senzing under the current user's home directory:

/opt/senzing/er/bin/sz_create_project ~/senzing

Configure Environment

To use your new project, environment variables must be set that tell Senzing where to find the project's resources. The setupEnv script is specific to each project and must be sourced in every new shell session before you work with that project, for example after logging out and back in. To set up the environment, change to your project directory and source the setupEnv file.

cd <senzing_project_path>
source setupEnv

Info

<senzing_project_path> refers to the path specified with the /opt/senzing/er/bin/sz_create_project command when creating a project.

Updating Database with Senzing ER Configuration

A Senzing instance is configured with a Senzing Entity Resolution (ER) configuration, which is stored as a JSON document. On a fresh installation this configuration needs to be registered in the Senzing database. This step only needs to be performed once for a new project. From the root of your project directory, run the following commands and enter y when prompted:

cd <senzing_project_path>
source setupEnv
./bin/sz_setup_config

Loading the Truth Set Data

To get started with some data, load the Senzing example truth set: download the files, add their data sources, and load the files as described in the sections below.

Understanding the Truth Set Files

The truth set demo includes three main types of files, each serving a distinct purpose in entity resolution:

  • Customers: Represents your subjects of interest, in this case customers. They could just as easily be employees for insider threat detection, vendors for supply chain management, or other tracked entities. These records form the core dataset you aim to analyze and resolve.
  • Watchlist: Contains entities you want to avoid due to potential risks. Examples include past fraudsters, known terrorists, money launderers, or entities on mandated exclusion lists (e.g., sanctions lists like OFAC). By integrating watchlist data, Senzing helps you identify high-risk entities by matching them against your subject records. This enables risk assessment by flagging connections to undesirable entities, helping you mitigate threats like fraud, regulatory non-compliance, or reputational damage.
  • Reference List: Includes supplemental data purchased or acquired about individuals (e.g., demographics, past addresses, contact methods) or companies (e.g., firmographics, corporate structure, executives, ownership). This data enriches your understanding of your subjects by providing additional context, such as historical addresses to track entity movement or corporate hierarchies to identify ultimate beneficial owners. This deeper insight improves entity resolution accuracy and supports use cases like customer profiling or due diligence.

Download the files

wget https://raw.githubusercontent.com/Senzing/truth-sets/main/truthsets/demo/customers.jsonl
wget https://raw.githubusercontent.com/Senzing/truth-sets/main/truthsets/demo/reference.jsonl
wget https://raw.githubusercontent.com/Senzing/truth-sets/main/truthsets/demo/watchlist.jsonl
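
Before adding data sources and loading, it can help to confirm that each downloaded file parses cleanly and to see which data sources it contains. The following is a minimal Python sketch and not part of the Senzing tooling. The function name summarize_jsonl is hypothetical, and the sketch assumes each non-blank line holds one JSON object carrying a DATA_SOURCE key.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Count records per DATA_SOURCE in a JSON Lines file.

    Assumption: one JSON object per non-blank line. Records without a
    DATA_SOURCE key are counted under "(missing)".
    """
    counts = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as err:
                raise ValueError(
                    f"{path}: line {line_number} is not valid JSON: {err}"
                ) from err
            if not isinstance(record, dict):
                raise ValueError(f"{path}: line {line_number} is not a JSON object")
            counts[record.get("DATA_SOURCE", "(missing)")] += 1
    return counts


if __name__ == "__main__":
    # File names match the wget commands above.
    for name in ("customers.jsonl", "reference.jsonl", "watchlist.jsonl"):
        if Path(name).exists():
            print(name, dict(summarize_jsonl(name)))
```

The distinct DATA_SOURCE values reported should correspond to the names registered with addDataSource in the next step, and the totals give a quick record count per file.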

Add the data sources

source setupEnv
sz_configtool
Type help or ? for help
addDataSource CUSTOMERS
Data source successfully added!
addDataSource REFERENCE
Data source successfully added!
addDataSource WATCHLIST
Data source successfully added!
save
WARNING: This will immediately update the current configuration in the Senzing repository with the current configuration!

Are you certain you wish to proceed and save changes? (y/n)
y
Configuration changes saved
quit

Load the files

source setupEnv
sz_file_loader -f customers.jsonl
sz_file_loader -f reference.jsonl
sz_file_loader -f watchlist.jsonl

Explore the results

See EDA Tools: Basic Exploration

source setupEnv
sz_explorer

  ____|  __ \     \
  __|    |   |   _ \   Senzing
  |      |   |  ___ \  Exploratory Data Analysis
 _____| ____/ _/    _\


Type help or ? to list commands.

(szeda) get CUSTOMERS 1070

Entity summary for entity 98: Jie Wang
┼───────────┼────────────────────────────────────────┼─────────────────┼
│ Sources   │ Features                               │ Additional Data │
┼───────────┼────────────────────────────────────────┼─────────────────┼
│ CUSTOMERS │ NAME: Jie Wang (PRIMARY)               │ AMOUNT: 100     │
│ 1069      │ NAME: 王杰 (NATIVE)                    │ AMOUNT: 200     │
│ 1070      │ DOB: 9/14/93                           │ DATE: 1/26/18   │
│           │ GENDER: Male                           │ DATE: 1/27/18   │
│           │ GENDER: M                              │ STATUS: Active  │
│           │ ADDRESS: 12 Constitution Street (HOME) │                 │
│           │ NATIONAL_ID: 832721 Hong Kong          │                 │
│           │ NATIONAL_ID: 832721                    │                 │
│           │ RECORD_TYPE: PERSON                    │                 │
┼───────────┼────────────────────────────────────────┼─────────────────┼
│ REFERENCE │ NAME: Wang Jie (PRIMARY)               │ CATEGORY: Owner │
│ 2013      │ DOB: 1993-09-14                        │ STATUS: Current │
│           │ RECORD_TYPE: PERSON                    │                 │
│           │ REL_POINTER: 2011 (OWNS 60%)           │                 │
┼───────────┼────────────────────────────────────────┼─────────────────┼
└── Disclosed relation (1)
    └── --> OWNS 60% (1)
        └── 182: Hajah Mamunah Jln Pisang CUSTOMERS (1) | REFERENCE (1) +REL_POINTER(OWNS 60%:)

Mapping Your Own Data

At this point you are ready to map and load your own data. Mapping is the process of converting your source data into a structure that Senzing understands and can load.

Info

To learn more about mapping, including the dictionary of terms and samples that help you prepare your own data sources for loading and entity resolution, review the Senzing Entity Specification.

Consider these examples. In your data, an attribute describing a person's full name is stored in a database table under the column name fullname. In Senzing, a full name is represented by the term NAME_FULL. Similarly, your database column for address line 1 is named addressline1, and in Senzing this is represented by the term ADDR_LINE1.
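
The renaming described above can be sketched as a small lookup table. This is an illustrative Python sketch only: the column names fullname and addressline1 and the terms NAME_FULL and ADDR_LINE1 come from the examples above, while COLUMN_TO_TERM and rename_columns are hypothetical names and not part of any Senzing API.

```python
# Hypothetical lookup from source column names to Senzing terms.
COLUMN_TO_TERM = {
    "fullname": "NAME_FULL",
    "addressline1": "ADDR_LINE1",
}


def rename_columns(row):
    """Return a new dict with mapped columns renamed to Senzing terms.

    Columns without a mapping are dropped, and empty values are skipped.
    """
    mapped = {}
    for column, value in row.items():
        term = COLUMN_TO_TERM.get(column)
        if term is None or value is None or str(value).strip() == "":
            continue
        mapped[term] = value
    return mapped


row = {"fullname": "Robert Smith", "addressline1": "123 Main Street", "notes": ""}
print(rename_columns(row))
# {'NAME_FULL': 'Robert Smith', 'ADDR_LINE1': '123 Main Street'}
```

Keeping the mapping in one table makes it easy to review which source columns are being sent for entity resolution and which are deliberately left out.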

Your task in mapping is to determine which attributes in your data source(s) are appropriate for use in entity resolution, extract those attributes and construct the structure describing those attributes to send to Senzing. The following is an example of a Senzing mapped JSON structure for an entry from a data source.

{
  "DATA_SOURCE": "CUSTOMERS",
  "RECORD_ID": "1001",
  "RECORD_TYPE": "PERSON",
  "PRIMARY_NAME_LAST": "Smith",
  "PRIMARY_NAME_FIRST": "Robert",
  "DATE_OF_BIRTH": "12/11/1978",
  "ADDR_TYPE": "MAILING",
  "ADDR_LINE1": "123 Main Street, Las Vegas NV 89132",
  "PHONE_TYPE": "HOME",
  "PHONE_NUMBER": "702-919-1300",
  "EMAIL_ADDRESS": "bsmith@work.com"
}
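
Mapped records can be written one JSON object per line, matching the .jsonl files loaded earlier in this guide. The following Python sketch shows one way to do that. The helper names to_jsonl_line and write_jsonl and the output file name my_customers.jsonl are hypothetical, and treating DATA_SOURCE and RECORD_ID as mandatory is an assumption made for this example; check the Senzing Entity Specification for the authoritative rules.

```python
import json


def to_jsonl_line(record):
    """Serialize one mapped record as a single JSON Lines entry.

    Assumption for this sketch: every record needs DATA_SOURCE and RECORD_ID.
    """
    for key in ("DATA_SOURCE", "RECORD_ID"):
        if record.get(key) in (None, ""):
            raise ValueError(f"mapped record is missing {key}")
    # ensure_ascii=False keeps non-ASCII names readable in the output file.
    return json.dumps(record, ensure_ascii=False)


def write_jsonl(records, path):
    """Write mapped records to path, one JSON object per line."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(to_jsonl_line(record) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    example = {
        "DATA_SOURCE": "CUSTOMERS",
        "RECORD_ID": "1001",
        "RECORD_TYPE": "PERSON",
        "PRIMARY_NAME_LAST": "Smith",
        "PRIMARY_NAME_FIRST": "Robert",
    }
    print(write_jsonl([example], "my_customers.jsonl"), "record(s) written")
```

A file produced this way could then be passed to sz_file_loader with the -f option, in the same way as the truth set files, once its data source has been added.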

Start Developing

Members of our team have created GitHub projects that show more of what you can do quickly.

Info

If you have any questions, contact Senzing Support. Support is 100% FREE!