Data Finds Data
By Jeff Jonas, published August 1, 2018
Two years after the 9/11 attacks, I found myself meeting with a three-letter agency’s counter-terrorism analyst. Mid-conversation, a question popped into my head: “What do you wish you could do with technology, if you could make it do anything?”
She replied: “Get answers to questions faster.”
I thought about this for a moment and realized it was a catch-22.
1. What if the question you’re asking today isn’t a smart question, but it would be a great question in the future after more data has arrived? Answer: That could happen.
2. Is it even possible to think of and ask every smart question every day? Answer: Nope.
On a humorous note, I should have been thinking “we are all going to die” if protecting our country relies on the ability of humans to think of and ask every smart question every day!
Nonetheless, it was in this moment that a big thought struck me: a new way to describe my work. We need systems where…
“Data finds data and the relevance finds you.”
In other words: each new arriving piece of data (observation) is the question.
Whether you are onboarding a new vendor or changing an existing customer’s email address, with each new piece of data you receive, your organization has learned something. If a customer’s new phone number has a nexus to a previous investigation, how long do you want to wait until this insight is flagged for human review? Wouldn’t you want to know right away?
Organizations can’t get this information immediately today… unless they have systems that perform Data Finds Data. For example, the moment a new phone number is posted to customer #123, when Data Finds Data, a mechanism automatically notices that the same phone number is also associated with pending investigation #XYZ.
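To make the phone number example concrete, here is a minimal sketch in Python of the idea that each arriving observation is itself the question. It is only an illustration under stated assumptions: the `ObservationIndex` class, its `observe` method, and the sample phone number are hypothetical names invented for this example, and a production system would sit on top of a real entity resolution engine rather than an exact-match dictionary.

```python
from collections import defaultdict


class ObservationIndex:
    """Toy in-memory index (hypothetical): every new observation is a query."""

    def __init__(self):
        # Maps (attribute, value) -> set of record ids that carry that value.
        self._by_feature = defaultdict(set)

    def observe(self, record_id, attribute, value):
        """Store a new observation and return the other records it connects to."""
        key = (attribute, value)
        # Look up who already has this value BEFORE adding the new record,
        # so the arriving data "finds" the data that was already there.
        related = sorted(self._by_feature[key] - {record_id})
        self._by_feature[key].add(record_id)
        return related


index = ObservationIndex()

# An investigation was recorded earlier with this (fictional) phone number.
index.observe("investigation#XYZ", "phone", "+1-702-555-0100")

# Later, the same number is posted to customer #123.
hits = index.observe("customer#123", "phone", "+1-702-555-0100")
print(hits)  # ['investigation#XYZ']
```

The point of the design is that nobody had to think of and ask the question "does customer #123 share a phone number with an open investigation?" The connection surfaces at the moment the new data is written.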
When systems contain entity data (e.g., people, companies, vehicles, vessels), transactional entity resolution is the core technology needed to make this type of real-time awareness possible. Of course, these entity resolution engines must be smart, or organizations will either be flooded with false alarms or miss key insights. For example, a phone number can easily change owners, so a system that fails to notice the change will raise false alarms. In addition, two records of the same phone number might be formatted differently (for example, only one includes a country code), and an unsophisticated system might miss the match.
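Both pitfalls can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any particular entity resolution engine works: `normalize_phone`, `same_phone`, and `likely_same_owner` are hypothetical helpers, the rule that a 10-digit number is missing country code "1" is an assumption, and the one-year window used to guess at an ownership change is an arbitrary made-up threshold.

```python
import re
from datetime import date


def normalize_phone(raw, default_country_code="1"):
    """Reduce a phone number to digits, adding a country code when absent.

    Assumption: a number with exactly 10 digits is a national number that
    lacks the default country code. Real numbering plans are more varied.
    """
    digits = re.sub(r"\D", "", raw)  # drop spaces, dots, dashes, parentheses, '+'
    if len(digits) == 10:
        digits = default_country_code + digits
    return digits


def same_phone(a, b):
    """Compare two phone numbers after normalization."""
    return normalize_phone(a) == normalize_phone(b)


def likely_same_owner(last_seen_a, last_seen_b, max_gap_days=365):
    """Crude guard against numbers that changed owners (arbitrary threshold).

    If the two sightings of a number are far apart in time, the number may
    have been reassigned, so the match deserves less weight.
    """
    return abs((last_seen_a - last_seen_b).days) <= max_gap_days


# Same number, different formatting: a naive string comparison would miss this.
print(same_phone("+1 (702) 555-0100", "702.555.0100"))  # True

# Sightings eight years apart: possibly a different owner, so treat with care.
print(likely_same_owner(date(2010, 1, 1), date(2018, 1, 1)))  # False
```

A real engine would weigh many features together (names, addresses, dates, identifiers) instead of relying on a single phone number rule, but the sketch shows why both normalization and temporal context matter.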
Unless the decisions you make are final and never worth re-reviewing, you need systems that deliver Data Finds Data behavior.
Banks don’t just need Know Your Customer (KYC) systems. They need Continuous KYC.
Vendors don’t just need vendor vetting systems that perform vendor onboarding due diligence. They need Continuous Vendor Vetting.
Want a more robust insider threat program? Yep, you must incorporate Continuous Risk Assessment. The typical once-every-five-years (or never) reevaluation isn’t good enough anymore.
If you want to read more about this, check out Chapter 7 “Data Finds Data” in O’Reilly Media’s excellent book, Beautiful Data: The Stories Behind Elegant Data Solutions.
You can also get the PDF of the Data Finds Data chapter here.*
* PDF provided here under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.