At SEON, we pride ourselves on our data enrichment processes. But the term can be confusing to business owners – and even to fraud managers. So today, we’ll see what data enrichment is, when it’s used efficiently, and how it improves fraud detection.
What is Data Enrichment?
At its core, data enrichment is refining and enhancing information. Organizations acquire raw data, but that data isn’t always useful. It can contain mistakes. It can be too isolated to be useful (data silos). It can be too vague to be meaningful.
Data enrichment is therefore the process that takes raw data and improves its value. Countering our three examples above, data enrichment can:
- Correct data flaws
- Link data to other sources and reduce silos
- Refine data by breaking it down more granularly
Data Enrichment: Manual and Automated
Picture the classic scene in TV shows where a detective pieces information together: this is manual data enrichment. You have one piece of information, and you try to connect the dots. You cross-reference other pieces of information. You try to confirm the value of your data.
Automated data enrichment is something we all experience every day, maybe without being aware of it. Ever relied on Google’s autofill feature? Or their spelling suggestions? What these tools do is take your raw data (text) and automatically link it to their huge databases to enrich it, thereby providing more accurate (useful) results.
Note that the above example refines data internally. You will often need to source data externally in order to make yours valuable. This is particularly true in the world of fraud prevention, as we’ll see below.
Data Enrichment in Fraud Prevention
Fraud prevention is all about discovering who you are dealing with: which users should be allowed into your system, and which ones will try to scam you in the long term. This is where enriching simple data fields externally can make all the difference. Of course, there is a variety of data points you can use, but we’ll look at four noteworthy examples:
Email address data enrichment
Chances are that all your users will need to sign up with an email address. This simple data field can reveal a lot when compared with external databases:
- Is the address free or paid?
- Is it disposable?
- Is the domain registered?
- Has it been involved in previous data breaches?
- Is it used to register with social services (Facebook, Instagram, Spotify, etc.)?
Deep social media profiling and domain verification, for instance, are capabilities SEON is proud to offer through our Email Risk Analysis API.
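To make the idea concrete, here is a minimal sketch of how two of the checks above (free and disposable providers) could be derived from a raw address. The provider lists and the `enrich_email` helper are our own illustrative assumptions, not the Email Risk Analysis API; a real system would query maintained external databases instead of hard-coded sets.

```python
from dataclasses import dataclass

# Hypothetical reference lists, for illustration only.
# A real system would query maintained external databases.
FREE_PROVIDERS = {"gmail.com", "yahoo.com", "outlook.com"}
DISPOSABLE_PROVIDERS = {"mailinator.com", "10minutemail.com"}


@dataclass
class EmailEnrichment:
    address: str
    domain: str
    is_free: bool
    is_disposable: bool


def enrich_email(address: str) -> EmailEnrichment:
    """Derive extra fields from a raw email address."""
    cleaned = address.strip()
    if "@" not in cleaned:
        raise ValueError(f"not an email address: {address!r}")
    # Everything after the last "@" is the domain; compare in lowercase.
    domain = cleaned.rpartition("@")[2].lower()
    return EmailEnrichment(
        address=cleaned,
        domain=domain,
        is_free=domain in FREE_PROVIDERS,
        is_disposable=domain in DISPOSABLE_PROVIDERS,
    )


result = enrich_email("Jane.Doe@Mailinator.com")
print(result)
```

The single raw field now carries three extra attributes that a risk rule or a model can act on.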
IP address data enrichment
Similar to email addresses, the IP address of your user can reveal a lot about who they truly are.
- Where is the user based?
- Are they connecting through open ports – communicating with other servers?
- Are they using proxies, VPNs or Tor?
- Is the IP on any spam blacklists?
- Is it a datacenter IP or a residential connection (belonging to a homeowner)?
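As a rough sketch, two of these questions (datacenter range and spam blacklist) can be answered by comparing the address against reference data. The network range, the blacklist, and the `enrich_ip` helper below are hypothetical placeholders built on documentation-reserved addresses; in practice this data comes from external providers.

```python
import ipaddress

# Hypothetical reference data using documentation-reserved ranges.
# Real datacenter ranges and blacklists come from external sources.
DATACENTER_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
SPAM_BLACKLIST = {ipaddress.ip_address("198.51.100.7")}


def enrich_ip(raw_ip: str) -> dict:
    """Attach simple risk attributes to a raw IP address string."""
    ip = ipaddress.ip_address(raw_ip)  # raises ValueError on malformed input
    return {
        "ip": str(ip),
        "version": ip.version,
        "is_datacenter": any(ip in net for net in DATACENTER_NETWORKS),
        "is_blacklisted": ip in SPAM_BLACKLIST,
    }


print(enrich_ip("203.0.113.10"))
```

Geolocation, open ports, and proxy or VPN detection need live external lookups, so they are left out of this sketch.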
BIN number data enrichment
This is another great example of enriching simple data points through external data sources. BINs (Bank Identification Numbers), also known as IINs (Issuer Identification Numbers), can tell us a lot about a card:
- What bank issued the card?
- What kind of card is it?
- What is the bank’s phone number?
- What is the card’s level (ATM only, Gold, Platinum, World Elite or Infinite, depending on the provider)?
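A minimal sketch of a BIN lookup follows. The BIN is traditionally the first six digits of the card number, so the code extracts those digits and looks them up. The `BIN_TABLE` contents (issuer name, card type, level) are invented for illustration; real BIN data is sourced from an external database.

```python
# Hypothetical BIN table, for illustration only.
BIN_TABLE = {
    "411111": {"issuer": "Example Bank", "card_type": "credit", "level": "Gold"},
}


def enrich_card(card_number: str, bin_length: int = 6) -> dict:
    """Extract the BIN from a card number and attach any known issuer data."""
    # Keep digits only, so spaces or dashes in the input are ignored.
    digits = "".join(ch for ch in card_number if ch.isdigit())
    if len(digits) < bin_length:
        raise ValueError("card number is too short to contain a BIN")
    card_bin = digits[:bin_length]
    info = BIN_TABLE.get(card_bin)
    return {"bin": card_bin, "known": info is not None, **(info or {})}


card = enrich_card("4111 1111 1111 1111")
print(card)
```

An unknown BIN is returned with `known` set to `False` rather than raising an error, since a missing entry can itself be a useful signal.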
Device data enrichment
This is what, at SEON, we call Device Fingerprinting. Quite simply, it’s the ability to tell a lot based on the device the user connects to your site with:
- Has the user connected with the same device before?
- Is it running a virtual machine?
- What kind of browser is installed?
- Is it a mobile or desktop device?
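The first question, whether a device has been seen before, can be sketched by hashing a set of device attributes into a stable identifier. This is a deliberately simplified illustration and not a description of how SEON's Device Fingerprinting works; the attribute names and the in-memory `seen_devices` store are assumptions.

```python
import hashlib
import json

# In-memory store of fingerprints seen so far (a stand-in for a real database).
seen_devices = set()


def device_fingerprint(attributes: dict) -> str:
    """Hash device attributes into a stable hex identifier."""
    # Sorting keys makes the hash independent of attribute order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def enrich_device(attributes: dict) -> dict:
    """Return the fingerprint plus simple derived attributes."""
    fingerprint = device_fingerprint(attributes)
    seen_before = fingerprint in seen_devices
    seen_devices.add(fingerprint)
    return {
        "fingerprint": fingerprint,
        "seen_before": seen_before,
        # Hypothetical attribute name and values.
        "is_mobile": attributes.get("platform") in {"android", "ios"},
    }


device = {"platform": "android", "browser": "Chrome", "screen": "1080x2400"}
first_visit = enrich_device(device)
second_visit = enrich_device(device)
print(first_visit["seen_before"], second_visit["seen_before"])
```

An exact hash changes whenever any single attribute changes, which is one reason production fingerprinting is considerably more involved than this sketch.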
Leveraging Data Enrichment
Of course, all the points above are easy to fake or enter with stolen information. Which is why data enrichment is only truly valuable in fraud prevention when combined with a solution designed to understand it.
How is that possible? In two words: machine learning. Now that we have created a large-scale data set, it’s time to feed it into an ML model to analyze, report, and reveal insights. At this stage, organizations may create different models based on their needs.
- Whitebox model: a machine learning model that delivers readable rules through a Decision Tree algorithm. Each applied rule creates a new branch where the nodes are clear parameters.
- Blackbox model: a model that relies on complex probability-based classification, which removes transparency for the sake of scores.
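To show what "readable rules" can look like, here is a small sketch of the kind of rule tree a whitebox model might output. The tree below is written by hand for illustration and is not learned from data; the feature names, thresholds, and decisions are all assumptions. Each `if` is a branch, and the function returns the path it followed so that the decision can be explained.

```python
from typing import Dict, List, Tuple


def score_user(features: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Walk a hand-written rule tree and return (decision, path taken)."""
    path: List[str] = []

    # Branch 1: disposable email addresses are declined outright.
    if features["email_is_disposable"]:
        path.append("email_is_disposable = True")
        return "decline", path
    path.append("email_is_disposable = False")

    # Branch 2: proxy users are approved only on a known device.
    if features["ip_is_proxy"]:
        path.append("ip_is_proxy = True")
        if features["device_seen_before"]:
            path.append("device_seen_before = True")
            return "approve", path
        path.append("device_seen_before = False")
        return "review", path

    path.append("ip_is_proxy = False")
    return "approve", path


decision, path = score_user(
    {"email_is_disposable": False, "ip_is_proxy": True, "device_seen_before": False}
)
print(decision, "because", " -> ".join(path))
```

A blackbox model would return only a score for the same input, without the path that explains it.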
Whichever model you choose, once it has been fed through an ML system, your simple data point has been enriched, transformed, and turned into a powerful weapon in your fight against fraudulent users.
Data enrichment isn’t just an option for fraud prevention: it is one of the most crucial processes which allows you to get a clear picture of who your users are – whether it’s in the travel industry or for an online gambling platform.
However, data enrichment is just one part of the process. It is meaningless without proper analysis. And the only way to pore over this data and turn it into insights is through machine learning tools. Combining data enrichment with machine learning is what improves risk-based decision making, maximizes the resources you have to fight fraud, and outsmarts malicious users whose goal is to damage your business.