Entity Resolution Breakthrough Using AWS ECS Fargate & Aurora PostgreSQL Serverless
By Jeff Jonas, published November 2, 2020
“Good, fast, cheap. Choose two.”
There are plenty of references to this dynamic “law,” if you will, applied to various domains, e.g., project management and marketing. For a long time, entity resolution has lived under this same law too, but no longer!
Team Senzing is elated to report a breakthrough: good, fast, and cheap entity resolution.
BEHOLD: Watch this 17-minute video to witness the deployment of a fully scalable AWS serverless stack from start to finish in 24 minutes of clock time. Then entity resolve 10M synthetic records in ~3 hours for under $100 in AWS compute. Use this exact fully-managed infrastructure to scale up to billions of records.
Developers can now deploy scalable entity resolution into applications with unprecedented accuracy, ease and cost:
Good: When comparing accuracy on real-world data, Senzing proves to be more accurate in most cases. Use our Exploratory Data Analysis (EDA) audit tool on up to 100M records to see how our accuracy compares to other methods.
Fast: The estimated time includes the total amount of time to provision AWS, deploy Senzing, and process prepared data. Notably, Senzing does not require training or tuning — though in a minority of cases, config might need to be slightly tweaked. Note: Mapping/transforming each data source to Senzing JSON is not included in the time estimate.
Cheap: The Senzing quoted price is for a one-month production license (via Senzing). The AWS price is per run. If you’re interested in non-production pricing, Contact Us. For annual pricing for the real-time Senzing API, try our Pricing Calculator.
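The "Fast" note above leaves out the step of mapping each data source to Senzing JSON. As a rough illustration of what that step can look like, here is a minimal sketch. The source column names (id, full_name, address, phone, dob) and the helper name to_senzing_json are hypothetical, and the attribute names are assumed to follow Senzing's generic entity specification, so check them against the current Senzing documentation before relying on them.

```python
import json


def to_senzing_json(row, data_source):
    """Map one source row to a single JSON line (hypothetical helper).

    `row` is a dict with made-up column names; the output keys are assumed
    to match Senzing's generic entity specification.
    """
    record = {
        "DATA_SOURCE": data_source,
        "RECORD_ID": str(row["id"]),
        "NAME_FULL": row.get("full_name"),
        "ADDR_FULL": row.get("address"),
        "PHONE_NUMBER": row.get("phone"),
        "DATE_OF_BIRTH": row.get("dob"),
    }
    # Leave out attributes the source row does not actually provide.
    record = {k: v for k, v in record.items() if v not in (None, "")}
    return json.dumps(record)


# Example: one CRM-style row becomes one JSON line ready for loading.
example_row = {"id": 1001, "full_name": "Ann Lee", "phone": "555-0100"}
print(to_senzing_json(example_row, "CUSTOMERS"))
```

Each data source would get its own small mapper like this one; the time that takes depends on the source and is separate from the estimate above.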
Wait! There’s more.
Get this: Senzing is a truly transactional, real-time engine. When you add more data, whether batch or streaming, Senzing handles these new records incrementally (e.g., adding 10k more records to a 1B-record database costs only a couple of dollars on AWS serverless). Other entity resolution technologies require complete reloading to account for adds, changes, and deletes, e.g., requiring redundant A/B systems so one can be online while the other system re-boils the ocean (reloads). This is more time-consuming and expensive. Not true with Senzing.
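To make the incremental idea concrete, here is a toy sketch. It is not Senzing's algorithm or API. It assumes records match only when they share an exact key value (such as the same email), and it uses a union-find structure plus a key index so that adding a record looks up only that record's own keys instead of reprocessing everything loaded so far. The class and method names are invented for illustration.

```python
class IncrementalResolver:
    """Toy resolver: records sharing any exact key value join one entity.

    Illustration only; real entity resolution uses far richer matching and
    must also handle changes and deletes, which this sketch does not.
    """

    def __init__(self):
        self.parent = {}       # record_id -> parent record_id (union-find)
        self.key_index = {}    # (field, value) -> first record_id seen with it
        self.records_processed = 0

    def _find(self, rid):
        # Walk to the root, shortening the path as we go (path halving).
        while self.parent[rid] != rid:
            self.parent[rid] = self.parent[self.parent[rid]]
            rid = self.parent[rid]
        return rid

    def add(self, rid, keys):
        """Add one record; only this record's keys are looked up."""
        self.parent[rid] = rid
        self.records_processed += 1
        for key in keys.items():
            other = self.key_index.setdefault(key, rid)
            if other != rid:
                # Merge the new record's entity with the matching one.
                self.parent[self._find(rid)] = self._find(other)

    def entity_of(self, rid):
        return self._find(rid)


resolver = IncrementalResolver()
resolver.add("A1", {"email": "ann@example.com"})
resolver.add("B7", {"phone": "555-0100"})
# A later record links the two earlier ones; nothing is reloaded.
resolver.add("C3", {"email": "ann@example.com", "phone": "555-0100"})
linked = resolver.entity_of("A1") == resolver.entity_of("B7")
```

In this sketch the third add touches two index entries and merges two entities, while a reload-style design would have to reprocess all three records to reach the same result.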
More about Senzing’s game-changing features can be found in our Uniquely Senzing and Entity Resolution Capabilities to Consider white papers.
Link to the GitHub serverless project with 10M entity resolution test records
Link to the GitHub serverless project with 100M entity resolution test records