3 Mistakes to Avoid When Evaluating Entity Resolution Software
By Jeff Jonas, published April 20, 2022
There are three common mistakes that many companies make when evaluating entity resolution software. Any one of them can result in a whole lot of buyer’s remorse later on.
The Big Three include:
- Overlooking operational impacts
- Allowing behind-the-curtain wizardry
- Inaccurately testing for accuracy
While the impact of each is different for every company, all are worth avoiding.
1. How to Define Entity Resolution Operational Impacts
Imagine evaluating a car by driving it around the block when you are looking for a low maintenance, all-terrain vehicle with minimal operating costs. How does this testing scenario help with your evaluation? It doesn’t.
The same is true when evaluating entity resolution software by simply batch loading a couple of data sets when you need a system that operates 24×7 in real time, supports dozens of data sources and so on. Just like a quick trip around the block in a car, it’s not a sufficient test!
Taking the time to predefine your short- and long-term operational requirements is key. In addition to your initial use, think through what you’ll need one, two and three years out. Consider a range of different scenarios for how you might expand the system’s data and use over time.
Try to avoid these common oversights:
- Not fully understanding what is required to onboard new data sources. The actual time, resources and costs to add new data sources are often much greater than expected. Be sure you know what’s involved, whether you expect to add hundreds, dozens or just a few new data sources over time.
→ It’s important to clearly assess what skills and time are required to prepare, map and tune new sources. You also need to know if the entity resolution software you’re evaluating must be fully reloaded every time you add a new source. This is important because, when reloading is required, you may have to reprocess growing numbers of records, which will be increasingly expensive and time consuming.
- Miscalculating total cost of ownership (TCO). Missing one or more costs could throw off your entire TCO calculation and make the comparison of entity resolution systems like comparing apples to oranges. As you scope the TCO of each entity resolution system, consider both initial and ongoing operations. Areas that often result in higher-than-expected costs include the following:
→ Onboarding new data sources – For some entity resolution technologies, adding a new data source can take a month or more, even with experts. The costs of adding data sources can add up fast if you expect to add many new sources, or even periodically add sources. Don’t underestimate the number of experts or the amount of time you’ll need.
→ Ongoing operating costs – How many and what kind of dedicated resources are needed to operate the entity resolution software? Remember to include the costs of maintaining your bench strength and any new-hire training programs to backfill for attrition.
→ Maintenance expenses – Batch-based technologies usually handle initial data loads quickly but require periodic, compute-intensive reloads. If 24×7 operations are needed, batch methods require two systems (one serving requests while the other reloads). Conversely, real-time transactional systems typically take longer to bulk load historical data than batch systems, but never require batch reloading when adding new data sources. This difference can have a huge impact on TCO over the long term (a rough cost sketch follows this list).
→ Long-term resource requirements – Adding new data sources or expanding operations in the future can significantly increase costs for hardware, software and other resources. Be sure to factor these costs into your overall TCO calculations.
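To make the comparison concrete, here is a minimal back-of-the-envelope sketch in Python. Every figure in it (expert day rates, onboarding effort, staffing, reload costs, growth rate) is a hypothetical placeholder, not a benchmark; the point is simply to show how full batch reloads, repeated as sources are added, change the long-term math relative to incremental loading.

```python
# Back-of-the-envelope TCO sketch. Every number below is a hypothetical
# placeholder -- substitute your own estimates for each system you evaluate.

YEARS = 3
NEW_SOURCES_PER_YEAR = 6           # data sources you expect to onboard per year
ONBOARDING_DAYS_PER_SOURCE = 20    # expert days to prepare, map and tune one source
EXPERT_DAY_RATE = 1_200            # fully loaded cost per expert day
OPS_STAFF_COST_PER_YEAR = 150_000  # dedicated operations and bench-strength costs
RELOAD_COMPUTE_COST = 5_000        # compute cost of one full batch reload at today's volume

def three_year_tco(requires_full_reload: bool) -> float:
    onboarding = YEARS * NEW_SOURCES_PER_YEAR * ONBOARDING_DAYS_PER_SOURCE * EXPERT_DAY_RATE
    operations = YEARS * OPS_STAFF_COST_PER_YEAR
    reloads = 0.0
    if requires_full_reload:
        # Each new source triggers a full reload, and each reload reprocesses
        # a growing record count (assume ~25% growth per source added).
        for n in range(1, YEARS * NEW_SOURCES_PER_YEAR + 1):
            reloads += RELOAD_COMPUTE_COST * (1 + 0.25 * n)
    return onboarding + operations + reloads

print(f"Incremental (no reloads): ${three_year_tco(False):,.0f}")
print(f"Full reload per source:   ${three_year_tco(True):,.0f}")
```

Even in this toy model, the reload component is the one that compounds as record volumes grow, which is why it deserves explicit attention in any long-term comparison.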
It’s common when evaluating entity resolution software to focus on the basics or the minimum viable product (MVP). To reduce your risk of buyer’s remorse, spend more time up front thinking about your overall journey.
2. Why Testing Entity Resolution Behind the Curtain Isn’t Advised
Imagine evaluating a car by having a seasoned professional, who works for the car manufacturer, drive it around the block and report back to you on their experience. This approach would leave you totally in the dark about the amount of training and expertise needed, how it handles, and the noise level.
When an entity resolution vendor runs your data for you during an evaluation, the same thing happens. You have no idea what happened behind the curtain, including the amount of complexity and effort required (potentially Herculean) to make their system’s results look good.
We think it’s critical for you to be involved in the evaluation process before you make a purchase decision, whether you run your own entity resolution evaluation in house (which we recommend) or at least sit up front in the passenger seat while the vendor drives. If you aren’t involved, here’s what you might miss:
- Many nuances and complexities in your data are usually uncovered during the data preparation and mapping processes. If you’re involved, you’ll gain awareness about the quality of your data and how its structure aligns with the entity resolution technology under evaluation. You’ll also better understand the full range of skills needed to launch and operate the production system.
- Configuration and tuning steps needed to achieve optimal matching results become obvious during evaluation. It’s important to participate in each run and audit. Every run can result in new discoveries related to data preparation, mapping and configuration. For some systems, these discoveries will result in significant additional effort. What you learn from this experience can be invaluable and isn’t likely to be found in any software user guide. [Note: Some configurations may deliver great results on small data sets during evaluation, but fail, or lack the performance and scalability needed, on larger data sets in production. A brief discussion of this complex topic is covered in the Inaccurately Testing for Accuracy section below.]
- Determining how to best size your operating environment. Even though your production environment is likely to be different, you’ll gain some understanding of the hardware, operating system and software versions needed, and of the software’s footprint in terms of provisioned nodes, cores, memory, I/O and storage. Sometimes evaluations are conducted entirely in memory when test data sets are small. It is important to know, in advance, whether that is the case, because the performance of a production system using persistent storage will be much different (a simple footprint-tracking sketch follows this list).
- Understanding how much human capital and compute time was required to complete the evaluation.
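One lightweight way to stay close to the process is to record the footprint of every evaluation run yourself. The sketch below is illustrative only: the fields and the naive per-million-records estimate are assumptions rather than a formal sizing method, but capturing figures like these during the evaluation makes later production sizing conversations far more grounded.

```python
# Hypothetical footprint log for an evaluation run. The fields and the naive
# per-million-records estimate are illustrative assumptions, not a formal
# sizing method -- adapt them to whatever your runs actually consume.
from dataclasses import dataclass

@dataclass
class EvaluationFootprint:
    records_loaded: int
    nodes: int
    cores_per_node: int
    memory_gb: int
    storage_gb: float
    ran_entirely_in_memory: bool  # if True, expect different I/O behavior in production
    person_days_spent: float      # preparation, mapping, tuning, auditing, reruns
    wall_clock_hours: float

    def storage_gb_per_million_records(self) -> float:
        return self.storage_gb / (self.records_loaded / 1_000_000)

# Example values observed during a (hypothetical) evaluation run
run = EvaluationFootprint(
    records_loaded=2_000_000, nodes=1, cores_per_node=16, memory_gb=64,
    storage_gb=40.0, ran_entirely_in_memory=True,
    person_days_spent=12.0, wall_clock_hours=6.5,
)
print(f"~{run.storage_gb_per_million_records():.1f} GB of storage per million records "
      f"(run held entirely in memory: {run.ran_entirely_in_memory})")
```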
To reduce your risk of buyer’s remorse, get up close and personal when evaluating entity resolution software. Do the work! It will pay off big time.
3. How You Could Inaccurately Test Entity Resolution Accuracy
Imagine conducting vehicle safety tests, such as measuring stopping distance or handling in high-speed turns, in a small parking lot set up with some orange cones. Obviously, the results of these types of tests will be quite different from the actual safety you’ll experience in the real world.
Similarly, when insufficient data sets or volumes are used to evaluate entity resolution software for accuracy, it is highly unlikely the results will reflect the accuracy you’ll see in the real world.
To better approximate your real-world accuracy, avoid these common oversights:
- Using poorly constructed test data or truth sets. The wrong slice of real data or data without enough volume or diversity will produce bogus results. Furthermore, avoid regarding your truth set as a source of golden answers, since even good truth sets have gaps and errors!
→ You should use real data to create your entity resolution truth set and make sure you use enough of it. How much is enough data? One way to tell is when interesting surprises pop up from time to time, despite your careful curation process. Be sure to keep an open mind during testing because certain outcomes may change your point of view, and thus your truth set, or even what you measure. For more details, review the articles on how to create an entity resolution truth set and the path to a successful proof of concept.
- Evaluating entity-centric matching using a truth set created for record matching technology. If you do this, you’ll get inaccurate comparisons and hide the higher levels of accuracy entity-centric matching provides. The blog, entity-centric learning vs. record matching methods, provides more details about the differences between entity-centric matching and record matching.
→ When evaluating entity resolution systems that use entity-centric matching, be sure your truth set includes some records that test and expose the quality of entity-centric logic. Your audit tools should also support entity-centric matching. If not, you’ll need to manually audit exceptions for accuracy and, in many cases, the exceptions will actually be correct.
- Not considering ambiguous matching conditions, i.e., when a record could plausibly match more than one existing entity (for example, a new record that fits two different John Smith entities equally well). If not handled properly, ambiguous records are arbitrarily assigned to one of the candidate entities, often the wrong one, creating invisible false positives.
→ Be sure your truth set includes ambiguous records. If your audit isn’t finding any, it may be because your entity resolution algorithm doesn’t handle them, your audit process isn’t detecting them, or the data sets you’re testing don’t make these ambiguous records apparent. As you add data sources in the future, ambiguous records can create significant accuracy issues, so don’t shortcut this one. It can become a big deal down the road!
- Relying on only high-level statistics such as precision, recall and F1 scores to quantify accuracy. One system may have 50% more matches than another, not because it is more accurate but because it has so many false positives.
→ When comparing systems, start with a record-level audit that manually inspects the discrepancies between systems. Keep a close eye on what each system missed. Sample enough of every category to get a feel for what is really happening. For instance, you may like the additional matches one technology found that were missed by the truth set or another competing method. Remember: one system’s false positives are another’s false negatives. (A minimal audit sketch follows this list.)
- Testing for accuracy with synthetically generated data. This is dangerous because it is almost impossible to produce synthetic data that is truly representative of real data, so avoid it whenever you can.
→ Instead of using synthetic data, try to find another organization already using the same entity resolution technology on data that is similar to yours. Ask them what level of accuracy they’re seeing. If that isn’t an option and you must use a synthetic truth set, take great care when creating it.
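Here is a minimal sketch of the kind of record-level audit described above, assuming each system’s output has been reduced to a set of matched record pairs and that you have an (admittedly imperfect) truth set. The record IDs and pair sets are made up for illustration; in practice you would load them from your audit extracts.

```python
# Hypothetical record-level audit. Each system's output is reduced to a set of
# matched record pairs; the IDs and pairs below are made up for illustration.

truth_pairs    = {("r1", "r2"), ("r3", "r4"), ("r5", "r6")}   # imperfect truth set
system_a_pairs = {("r1", "r2"), ("r3", "r4"), ("r7", "r8")}   # misses one truth pair, adds one extra
system_b_pairs = {("r1", "r2"), ("r3", "r4"), ("r5", "r6"),
                  ("r2", "r9"), ("r4", "r10")}                # finds all truth pairs, plus extras

def scores(predicted: set, truth: set) -> tuple:
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

for name, pairs in [("A", system_a_pairs), ("B", system_b_pairs)]:
    p, r, f = scores(pairs, truth_pairs)
    print(f"System {name}: matches={len(pairs)}  precision={p:.2f}  recall={r:.2f}  f1={f:.2f}")

# Summary scores alone don't settle the question -- inspect the discrepancies:
print("Pairs only System A found:", sorted(system_a_pairs - system_b_pairs))
print("Pairs only System B found:", sorted(system_b_pairs - system_a_pairs))
print("Truth pairs neither system found:", sorted(truth_pairs - system_a_pairs - system_b_pairs))
# Some "extra" pairs may be genuine matches the truth set missed; one system's
# false positives can be another's false negatives.
```

In this toy example, System B reports more matches and a higher F1, yet only a manual look at its extra pairs tells you which of them are genuine matches the truth set missed and which are false positives.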
To reduce your risk of buyer’s remorse, use real data to evaluate entity resolution technologies. Be sure to run a record-level audit while keeping an eye out for entity-centric matching and ambiguous conditions.
If you wouldn’t let a manufacturer do your test drive for you, why would you let an entity resolution vendor perform your evaluation testing? Getting actively involved in the evaluation process gives you a true sense of how entity resolution software works on your own data. During an evaluation, we recommend you focus on understanding how the software works from start to finish, from installation and configuration to operation and production. This is essential if you want to reduce your risk of buyer’s remorse.
Also, if you are a like-minded, kindred spirit, please join our Entity Resolution LinkedIn Group.