Finding PPP Loan Data Hidden Connections Using Senzing
By Jeff Jonas, published September 23, 2020

At Senzing we have created the first real-time AI for entity resolution. We make it quick and easy to accurately combine data about people and companies from different data sources. No other technology that exists today can do this in real time, at scale, with this level of accuracy without any training, tuning or experts. It is also the most affordable option available!

You can try it right now on Paycheck Protection Program (PPP) data in under 20 minutes.

To help make this fast and simple, we have prepared three Senzing-ready .csv files filtered to contain Las Vegas related records.
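
For readers who want to filter a full download to one city themselves, here is a minimal Python sketch of that kind of filter. It is not how the Senzing-ready files were actually prepared. The column names (BusinessName, Address, City, State) and the sample rows are made up for illustration, and real source files may use different headers.

```python
import csv
import io


def filter_rows_by_city(rows, city, city_field="City"):
    """Yield only the rows whose city field equals `city`, ignoring case and padding."""
    wanted = city.strip().casefold()
    for row in rows:
        if (row.get(city_field) or "").strip().casefold() == wanted:
            yield row


# Tiny made-up sample standing in for a real download (hypothetical columns).
SAMPLE_CSV = """BusinessName,Address,City,State
ACME DENTAL LLC,100 Main St,Las Vegas,NV
EXAMPLE BAKERY INC,200 Oak Ave,Reno,NV
SAMPLE CLINIC,300 Pine Rd,LAS VEGAS,NV
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
las_vegas_rows = list(filter_rows_by_city(reader, "Las Vegas"))

for row in las_vegas_rows:
    print(row["BusinessName"])
```

To run it against a real file, replace the io.StringIO sample with an opened file handle and set city_field to that file's city column.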

[NOTE: Instructions for running the whole PPP loan file are located at the bottom of this blog post.]

 

IMPORTANT DISCLAIMERS

  • The Senzing-ready .csv links provided are snapshots from the past, so the information is out of date. If you are doing real research, be sure to download the latest files (see links at bottom).
  • Many organizations have multiple legal entities, sometimes similarly named. Without more data, Senzing may match these entities if they are located at the same business address. Such apparent duplicates are likely legitimate, separate entities. Note: as more data is loaded, these overmatches begin to self-correct automatically, which is a unique capability of Senzing.


Loading and Exploring Las Vegas PPP Data

1. Download and install the free Senzing App here. [No personal data flows to Senzing, Inc.]

2. Launch the Senzing App.

3. Create a Project.

  • Select “Projects” (left toolbar, icon with the hammer).
  • Select “Add Project.”
  • Name the project whatever you like, e.g., “PPP Las Vegas.”
  • Select “Create.”

4. Load the PPP file into Senzing.

  • Download the “PPP_Loans_Over_$150k_LasVegas.csv” that we have prepared here.
  • Select “Data” (left toolbar, icon with the cylinder).
  • Drag and drop the “PPP_Loans_Over_$150k_LasVegas.csv” file onto the canvas.
  • Click “Load” on the card.

5. Review the results.

  • Once loading is complete, click “Review.”
  • Explore the Duplicates — records Senzing thinks belong to the same organization.
  • On the far right click the little “expand” icon (looks like a small blue clock) that appears as you hover over the “Other Data” column.
  • Once finished exploring “Duplicates,” click on “Possibly Related.” The Match Key column explains why they are related.
  • Click on any “Entity ID” (left column in chart) to see the entity’s resume.

Highlights:

  • Notice in the top blue bubble there are 40 duplicates.
  • Looking over these duplicates, you will notice some are probably false positives, e.g., the three entities “NG WASHINGTON”, “NG WASHINGTON II” and “NG WASHINGTON III” are probably different legal entities, each eligible for a PPP loan. Records like these match because of name and address similarity.
  • You may notice other duplicates that look like identical legal entities – these being examples where further human analysis is required.
  • Select “Search” (left toolbar, icon with a magnifying glass) and search for this address: “3130 S Durango Dr STE 400 Las Vegas.” Then click any of the possibly related entities.
  • Click “Show Match Key” in the lower right corner to see how these three entities, “BOYACK AND ASSOCIATES INC”, “BIA LAS VEGAS LLC” and “BIA NEVADA, LLC”, are related.
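
To see why similarly named entities at one address can look like duplicates, here is a rough Python sketch using the standard library's difflib. It is not Senzing's matching logic. The 0.8 threshold, the normalize helper and the “1 EXAMPLE ST” address are assumptions made up for illustration, and the sketch also assumes the Boyack records share the Durango Dr address used in the search above.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Upper-case, drop commas and periods, and collapse runs of whitespace."""
    return " ".join(text.upper().replace(",", " ").replace(".", " ").split())


def name_similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def looks_like_duplicate(rec_a, rec_b, threshold=0.8):
    """Naive rule: identical normalized address and sufficiently similar names."""
    same_address = normalize(rec_a["address"]) == normalize(rec_b["address"])
    return same_address and name_similarity(rec_a["name"], rec_b["name"]) >= threshold


# Placeholder address: the real address of these records is not given in this post.
ng_1 = {"name": "NG WASHINGTON", "address": "1 EXAMPLE ST, LAS VEGAS"}
ng_2 = {"name": "NG WASHINGTON II", "address": "1 Example St Las Vegas"}

# Assumed to share the address searched for in the walkthrough.
boyack = {"name": "BOYACK AND ASSOCIATES INC", "address": "3130 S Durango Dr STE 400 Las Vegas"}
bia_lv = {"name": "BIA LAS VEGAS LLC", "address": "3130 S Durango Dr STE 400 Las Vegas"}

print(looks_like_duplicate(ng_1, ng_2))      # near-identical names at one address
print(looks_like_duplicate(boyack, bia_lv))  # same address, but the names differ
```

A rule this simple overmatches the NG WASHINGTON records and cannot connect the Boyack records on name alone, which is why additional data matters.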


Add Reference Data to Improve Accuracy

Reference data sets are carefully curated and can be used to improve entity resolution accuracy. For this demonstration, we will be using a publicly available National Provider Identifier (NPI) file, which contains a list of US health care providers curated by the Department of Health and Human Services.

1. Load the NPI file into Senzing.

  • Download the “NPI_Orgs_LasVegas.csv” that we have prepared here.
  • Select “Data.”
  • Drag and drop the “NPI_Orgs_LasVegas.csv” file onto the canvas.
  • Click “Load” on the card.

2. Review the results.

  • Once loading is complete, click “Review.”
  • Once loaded, click “Review” on the PPP LOANS OVER… card.
  • Notice there are now 41 duplicates in the PPP data – recall, before loading the NPI file there were only 40. Which match is new? Hint: Use the More button to reveal records from other data sources that may have contributed to the matching decision.
  • Notice there are now two possible duplicates – recall, before loading the NPI file there were zero.
  • Click on the two (2) “Possible Duplicates.” Can you figure out what Senzing learned that caused it to change its mind about these matches?

Highlights:

  • Using the NPI reference data, these three PPP records came together: “BIA LAS VEGAS LLC”, “BOYACK AND ASSOCIATES INC”, and “BIA NEVADA, LLC”. Because of entity-centric learning, when the NPI record revealed BIA was a DBA (doing business as) “BOYACK AND ASSOCIATES”, Senzing reevaluated the earlier decision and improved it, in real time.
  • In a similar manner, the NPI reference data surfaced two possible matches – these have close names at the same address.
  • Other popular reference data that can significantly improve matching results are commercially available from data providers like Dun & Bradstreet, Moody’s and OpenCorporates.
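
The idea that one new record can pull previously separate records together can be sketched with a small union-find structure in Python. This is only an illustration, not Senzing's implementation. The EntityClusters class, the record IDs and the hard-coded link decisions are all hypothetical, and in a real engine the links would come from comparing names, addresses and identifiers.

```python
class EntityClusters:
    """Minimal union-find that groups record IDs into entities."""

    def __init__(self):
        self._parent = {}

    def add(self, record_id):
        # A new record starts out as its own single-record entity.
        self._parent.setdefault(record_id, record_id)

    def find(self, record_id):
        # Walk up to the cluster root, shortening the path along the way.
        while self._parent[record_id] != record_id:
            self._parent[record_id] = self._parent[self._parent[record_id]]
            record_id = self._parent[record_id]
        return record_id

    def link(self, a, b):
        # Record a decision that two records describe the same entity.
        self.add(a)
        self.add(b)
        self._parent[self.find(a)] = self.find(b)

    def same_entity(self, a, b):
        return self.find(a) == self.find(b)


clusters = EntityClusters()
ppp_records = ["PPP:BOYACK AND ASSOCIATES INC", "PPP:BIA LAS VEGAS LLC", "PPP:BIA NEVADA, LLC"]
for record_id in ppp_records:
    clusters.add(record_id)

# With PPP data alone, the names are too different to merge.
before = clusters.same_entity(ppp_records[0], ppp_records[1])

# A hypothetical NPI record carries both the BIA name and the Boyack DBA name,
# so it links to every PPP record and bridges them into one entity.
npi_record = "NPI:BIA dba BOYACK AND ASSOCIATES"
for record_id in ppp_records:
    clusters.link(npi_record, record_id)

after = clusters.same_entity(ppp_records[0], ppp_records[1])
print(before, after)
```

The PPP records themselves never changed; only the arrival of the bridging record changed which entity they belong to.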


How to Combine Other Data to Improve Context

Combining additional data from other public and private sources is easy too. For example, publicly available data from the US Department of Labor Wage and Hour Compliance Actions can be easily added to discover which PPP recipients also have labor violations.

3. Load the DOL Compliance Actions file into Senzing.

  • Download the “Dept_Labor_Whisard_LasVegas.csv” that we have prepared here.
  • Select “Data.”
  • Drag and drop the “Dept_Labor_Whisard_LasVegas.csv” file onto the canvas.
  • Click “Load” on the card.

4. Review the results.

  • Once loading is complete, click “Review.”
  • Once loaded, click “Review” on the PPP LOANS OVER… card.
  • In the upper right area of the screen you’ll see “PPP Loans Over 150k” in a drop-down. To the right of this you will see the word “NONE”. Click this drop-down and change “NONE” to the “US DOL – WHD” data source.
  • Now click in the middle of the blue circles to see the matches between these data sources.
  • Notice the CASE_VIOLTN_CNT values (Case Violations) on the far right.
  • Use the More button to reveal records from other data sources that may have contributed to the matching decision.
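
The cross-source view in the steps above amounts to finding resolved entities that contain records from both data sources. Here is a minimal Python sketch of that idea, assuming entity resolution has already assigned an entity ID to every record. The record layout, source labels, names and counts are made up, and only the CASE_VIOLTN_CNT field name comes from the DOL data described here.

```python
from collections import defaultdict

# Hypothetical already-resolved records: same entity_id means same organization.
resolved_records = [
    {"entity_id": 1, "source": "PPP", "name": "EXAMPLE DINER LLC"},
    {"entity_id": 1, "source": "DOL", "name": "EXAMPLE DINER", "CASE_VIOLTN_CNT": 4},
    {"entity_id": 2, "source": "PPP", "name": "SAMPLE CLINIC INC"},
    {"entity_id": 3, "source": "DOL", "name": "OTHER SHOP", "CASE_VIOLTN_CNT": 2},
]


def entities_in_both(records, source_a, source_b):
    """Return {entity_id: records} for entities seen in both data sources."""
    by_entity = defaultdict(list)
    for record in records:
        by_entity[record["entity_id"]].append(record)
    return {
        entity_id: group
        for entity_id, group in by_entity.items()
        if {source_a, source_b} <= {r["source"] for r in group}
    }


overlap = entities_in_both(resolved_records, "PPP", "DOL")

# Total violations per PPP recipient that also appears in the DOL data.
violations = {
    entity_id: sum(r.get("CASE_VIOLTN_CNT", 0) for r in group)
    for entity_id, group in overlap.items()
}
print(violations)
```

The join itself is trivial once records share an entity ID; the hard part, deciding which records belong together, is what the entity resolution step provides.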

Highlights:

  • Before loading compliance actions there were only two possible duplicates. Now there are three. To see this, change the “US DOL – WHD” data source back to “NONE”. Then click on the three (3) possible duplicates. Take a look; one of these is new. Takeaway: although this is not considered reference data, new data from any source can be used to help improve past, present and future matches.
  • While on the same PPP Possible Duplicates screen, check out the Match Key column. Notice one of the rows has “-NPI_Number”, which means the NPI numbers on these records were different. Had these not disagreed, Senzing would have considered these duplicates.
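
A match key of this kind can be pictured as a string listing which features agreed and which disagreed. The Python sketch below is a guess at the general shape, not Senzing's actual format or logic. The post only confirms that a leading “-” marks a disagreement, so the “+” prefix for agreement, the exact-equality comparison, and the feature names other than NPI_Number are assumptions.

```python
def build_match_key(rec_a, rec_b, features):
    """Concatenate +FEATURE for agreeing values and -FEATURE for conflicting ones."""
    parts = []
    for feature in features:
        value_a, value_b = rec_a.get(feature), rec_b.get(feature)
        if value_a is None or value_b is None:
            continue  # a feature missing on either side says nothing either way
        parts.append(("+" if value_a == value_b else "-") + feature)
    return "".join(parts)


# Made-up records: same name and address, but different NPI numbers.
record_a = {"NAME": "EXAMPLE CLINIC", "ADDRESS": "1 EXAMPLE ST", "NPI_Number": "1111111111"}
record_b = {"NAME": "EXAMPLE CLINIC", "ADDRESS": "1 EXAMPLE ST", "NPI_Number": "2222222222"}

key = build_match_key(record_a, record_b, ["NAME", "ADDRESS", "NPI_Number"])
print(key)

# A conflicting identifier holds the pair back as a possible duplicate.
is_possible_duplicate_only = "-" in key
```

With the NPI_Number values removed from both records, the same call yields only agreeing features, which mirrors the remark that these records would otherwise have been considered duplicates.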

SHAZAM!

Look what you have been able to do so quickly! Unlike other technologies that take a long time to set up and configure, Senzing is so easy. Feel free to entity resolve your own data, e.g., your contacts, Salesforce accounts, vendor file, marketing list, etc. If you want additional info on getting started, check out this article.

Hope you enjoyed Senzing. We would love to hear any feedback, especially suggestions on how to make it better. You can reach us here.

Thank you.

 

BONUS SECTION

Just for fun, check out these additional Senzing-ready files, filtered for Las Vegas:

Instructions for running all the PPP loan data:

The Senzing API, our main product, is for developers. Our technology makes the complicated task of entity resolution trivial for programmers. Senzing is real-time and scalable to billions of records. More on our unique technology here.

If you are not a developer, the simple Senzing App is for you. While 100k records are free, an affordable license upgrade is available here.

To speed up your full-file PPP project, here are some key links. Use the source link if you need current information for real work. Otherwise, if you are just experimenting, try our Senzing-ready links, which are out-of-date snapshots:

PPP Loans over $150k                  Source Link    Senzing-ready Link
National Provider Identifier (NPI)    Source Link    Senzing-ready Link (filtered for organizations)
Dept of Labor Compliance Actions      Source Link    Senzing-ready Link
Medicare Supplier Directory           Source Link    Senzing-ready Link
Physician Compare                     Source Link    Senzing-ready Link
OIG Exclusions                        Source Link    Senzing-ready Link (filtered for organizations)


REFERENCE LINKS
Senzing’s Developer Page
Uniquely Senzing White Paper
Entity Resolution Processes White Paper
Slow Motion Entity Resolution Video
Entity-Centric Learning
Architecture Pattern for Perpetual Insights
Our Customers & Partners