Fraud Detection for Machine Learning & AI: What Is It & How It Works?

by Florian

There has seldom been a Risk Ops conference in the past few years that didn’t address the topic of machine learning in fraud detection. 

Some go as far as saying it will completely replace manual reviews. But can we really trust algorithms to understand how fraudsters target a business? And what is machine learning anyway? All the answers are in this post.

What is AI & Machine Learning for Fraud Detection?

In fraud detection, machine learning is a collection of artificial intelligence (AI) algorithms trained with your historical data to suggest risk rules. You can then implement the rules to block or allow certain user actions, such as suspicious logins, identity theft, or fraudulent transactions. 

When training the machine learning engine, you must flag previous cases of fraud and non-fraud to avoid false positives and to improve your risk rules’ precision. The longer the algorithms run, the more accurate the rule suggestions will be.
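As a rough illustration of how labeled history can turn into rule suggestions, here is a minimal sketch. It is not how any particular vendor's engine works: the `history` data, the `suggest_rules` function, and both thresholds are assumptions made up for this example. It simply measures how often each connection type was flagged as fraud and suggests a rule when the rate is high enough.

```python
from collections import Counter

# Hypothetical labeled history: (connection type, was it flagged as fraud?)
history = [
    ("vpn", True), ("vpn", True), ("vpn", False),
    ("residential", False), ("residential", False),
    ("residential", True), ("residential", False),
    ("tor", True), ("tor", True),
]


def suggest_rules(labeled, min_fraud_rate=0.6, min_support=2):
    """Suggest 'block if connection == value' rules from labeled cases.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    totals, frauds = Counter(), Counter()
    for value, was_fraud in labeled:
        totals[value] += 1
        if was_fraud:
            frauds[value] += 1

    suggestions = {}
    for value, total in totals.items():
        rate = frauds[value] / total
        # Require enough examples so one unlucky case does not create a rule.
        if total >= min_support and rate >= min_fraud_rate:
            suggestions[value] = round(rate, 2)
    return suggestions


suggestions = suggest_rules(history)
print(suggestions)
```

Mislabeled cases change the counts directly, which is why flagging both fraud and non-fraud accurately matters so much for the quality of the suggestions.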

Difference Between AI & Machine Learning for Fraud Detection

The terms AI and machine learning are often used interchangeably. However, while every form of machine learning counts as AI, not every AI uses machine learning. 

AI is the broader concept: building machines that simulate human thinking. Machine learning is a subset of AI that allows machines to learn from data without being explicitly programmed for each task.

It’s also worth noting that machine learning has its own subset, called Deep Learning. It employs algorithms and structures modeled after the human brain. 

The Benefits of Machine Learning For Fraud Management

Because machines have a much easier job processing a large dataset than humans, what you get is the ability to slice and dice huge amounts of information. That means:

  • Faster and more efficient detection: the system gets to quickly identify suspicious patterns and behaviors that might have taken human agents months to establish.
  • Reduced manual review time: similarly, the amount of time spent on manually reviewing information can be drastically reduced when you let machines analyze all the data points for you.
  • Better predictions with large datasets: the more data you feed a machine learning engine, the more trained it becomes. That is to say, while large datasets can sometimes make it challenging for humans to find patterns, it’s actually the opposite with an AI-driven system.
  • Cost-effective solution: unlike hiring more RiskOps agents, you only need one machine-learning system to go through all the data you throw at it, regardless of the volume. This is ideal for businesses with seasonal ebbs and flows in traffic, checkouts, or signups. A machine learning system is a great ally to scale up your company without increasing risk management costs drastically at the same time.

Last but not least: algorithms don’t need breaks, holidays, or sleep. Fraud attacks can happen 24/7, but even the best fraud managers might come to work on Monday morning with a backlog of manual reviews. Machines can ease up the process by sorting through the obviously fraudulent or acceptable cases.

According to a whitepaper by computer scientists from the University of Jakarta, machine learning algorithms achieved up to 96% accuracy in reducing fraud for eCommerce businesses.

Disadvantages of Machine Learning for Fraud Prevention

In spite of its advantages, there will always be cases where old-fashioned manual reviews are preferable:

  • Less control: this is especially true of blackbox machine learning engines, which can make mistakes without anyone noticing them.
  • False positives: if a legitimate action is marked as fraud and you don’t realize it, it will influence the whole system negatively. In that sense, a badly calibrated machine learning engine can create a negative loop: the more false positives go unflagged, the less precise your results will be in the future.
  • No human understanding: if you’re trying to get to the bottom of why a user action is suspicious, it’s hard to beat good old psychology.

A few examples where human reviewers are often preferred to automated systems include AML (anti-money laundering) and reviewing high-value transactions, such as payment for a jewelry store or high-end electronics.

“I love where the machines have taken us because AI and Machine Learning can take out those things that are very very bad, but a lot of the time there’s false positives. And if you’re teaching a machine in a certain way you have to deal with the fallout from that. So there’s that grey space in the middle and that’s where I like my teams to operate.”

Jacqueline Hart, Head of Trust & Safety at Patreon, as heard on the SEON podcast.

Differences Between Blackbox and Whitebox Machine Learning

While machine learning tends to be a selling point for most fraud prevention vendors, not all solutions are created equal. Notably, there is a key difference between whitebox and blackbox machine learning:

  • Blackbox machine learning: the system is designed to work in a ‘set and forget’ mode, where the decisions are opaque and automated. It can be great for small businesses that do not need to dive into the nitty-gritty of tweaking their risk rules.
  • Whitebox machine learning: the system will give you clear explanations as to why a risk rule was suggested. This makes it easier to understand where the risk is and gives fraud managers more flexibility to improve their fraud prevention strategy. 

Both systems have their pros and cons. However, at SEON, we opted for a whitebox system, which allows risk managers to gain more control over the engine in order to tweak, test, and measure the results of each risk rule.

Getting Started With Machine Learning Fraud Prevention

The term machine learning may seem intimidating, but getting started with an algorithmic system is actually straightforward. 
In this example, we’ll be looking at reducing transaction fraud (and therefore chargeback costs).

1. Feeding the input data

Every AI or ML system needs data to get started. In this scenario, it will be transaction data such as:

  • Transaction value
  • Product SKU
  • Type of credit card
  • Etc.

But we’ll also add data relating to how the customers connect to the site:

  • IP data
  • Device type
  • VPN, proxy or TOR usage
  • Etc.

Note that the more data you have to start with, the more accurate your results will be. This is particularly important if your fraud prevention software does not allow custom fields, as you could be missing out on crucial information.
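To make the input step concrete, here is a minimal sketch of one record that combines the transaction fields and the connection fields listed above. The `TransactionEvent` class, its field names, and the sample values are all hypothetical; a real integration would use whatever schema your fraud prevention software expects.

```python
from dataclasses import dataclass, asdict


@dataclass
class TransactionEvent:
    # Transaction data
    value: float
    product_sku: str
    card_type: str
    # Connection data
    ip_address: str
    device_type: str
    uses_anonymizer: bool  # True if a VPN, proxy or TOR was detected


def to_feature_row(event):
    """Flatten an event into a plain dict, with the boolean encoded as 0/1."""
    row = asdict(event)
    row["uses_anonymizer"] = int(row["uses_anonymizer"])
    return row


# Sample values are invented; 203.0.113.7 is a documentation-only IP address.
event = TransactionEvent(
    value=249.99,
    product_sku="SKU-1042",
    card_type="credit",
    ip_address="203.0.113.7",
    device_type="mobile",
    uses_anonymizer=True,
)
row = to_feature_row(event)
print(row)
```

Custom fields would simply be extra attributes on the record, which is why software that does not allow them can leave useful signals out of training.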

2. Generating the rules

SEON’s machine learning can generate two main types of rules:

  • Single parameter rules, also known as heuristic rules: an example of a single parameter rule would be: block if the IP is X.
  • Complex rules: including multiple parameters.

Each listed rule shows an accuracy score. You can adjust accuracy thresholds to tighten or loosen triggering conditions.

Note that the rule names are extremely descriptive, so you can understand at a glance why each one was generated. In this example, the rules are designed to show how the way a customer connected to the site could affect the transaction value lost to fraud.
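The two rule types and the accuracy threshold can be sketched as follows. This is an illustration only, not SEON's rule syntax: the rule names, accuracy scores, event fields, and the `active_rules` and `decide` helpers are all assumptions invented for the example.

```python
def ip_is_blocklisted(event):
    """Single parameter (heuristic) rule: block if the IP is a specific value."""
    return event["ip"] == "203.0.113.7"


def vpn_mobile_high_value(event):
    """Complex rule: several parameters must match at once."""
    return (
        event["uses_vpn"]
        and event["device_type"] == "mobile"
        and event["value"] > 500
    )


# Hypothetical suggestions, each with an invented accuracy score.
suggested = [
    {"name": "ip_is_blocklisted", "check": ip_is_blocklisted, "accuracy": 0.91},
    {"name": "vpn_mobile_high_value", "check": vpn_mobile_high_value, "accuracy": 0.78},
]


def active_rules(rules, min_accuracy):
    """Keep only the rules whose accuracy score meets the chosen threshold."""
    return [rule for rule in rules if rule["accuracy"] >= min_accuracy]


def decide(event, rules):
    """Return the outcome and the name of the first rule that fired, if any."""
    for rule in rules:
        if rule["check"](event):
            return "DECLINED", rule["name"]
    return "APPROVED", None


event = {"ip": "198.51.100.4", "uses_vpn": True, "device_type": "mobile", "value": 900.0}

loose = decide(event, active_rules(suggested, min_accuracy=0.75))
strict = decide(event, active_rules(suggested, min_accuracy=0.90))
print(loose, strict)
```

Raising the threshold drops the less accurate complex rule, so the same transaction that was declined under the looser setting is approved under the stricter one.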

3. Reviewing and activating the rule

SEON allows you to filter the rules by any data point, including type and predicted accuracy. The accuracy figure is particularly useful, and it is calculated using a confusion matrix.

By default, machine-learning suggestions are switched off. You can quickly enable them using the ON/OFF switch. It’s also possible to manually create and adjust thresholds for the rule to be triggered.

4. Training the algorithm

Providing feedback data is the key to refining rules and getting better accuracy. With SEON, there are two ways to provide feedback and label the actions:

  • Via the GUI: a simple, visually-friendly way to mark actions
  • Using the Label API: you can mark actions programmatically via API calls

However you do it, the actions should be marked as either APPROVED, REVIEWED, or DECLINED.

The algorithms retrain themselves every day based on the last 180 days worth of data. You can access them at any time in your backend and scoring engine (where you manage all the risk rules).
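A rolling training window like this can be pictured with a short sketch. It does not call any real Label API: the `training_window` function, the event fields, and the sample data are hypothetical, and treating the cutoff day as included is an assumption.

```python
from datetime import date, timedelta

WINDOW_DAYS = 180
ALLOWED_LABELS = {"APPROVED", "REVIEWED", "DECLINED"}


def training_window(labeled_events, today, window_days=WINDOW_DAYS):
    """Keep validly labeled events from the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    return [
        event
        for event in labeled_events
        if event["label"] in ALLOWED_LABELS and event["date"] >= cutoff
    ]


today = date(2024, 7, 1)
events = [
    {"id": 1, "date": date(2024, 6, 20), "label": "DECLINED"},
    {"id": 2, "date": date(2023, 12, 1), "label": "APPROVED"},  # too old
    {"id": 3, "date": date(2024, 3, 15), "label": "REVIEWED"},
    {"id": 4, "date": date(2024, 5, 2), "label": "UNKNOWN"},    # not a valid label
]

recent = training_window(events, today)
print([event["id"] for event in recent])
```

Because old events fall out of the window each day, feedback has to keep coming in for the suggestions to stay current.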

5. Testing rules on historical data

Good fraud prevention software should allow you to revisit past cases to see if the rules would have helped. This is done in a sandbox environment, where you can switch the rules on and off and see their accuracy for yourself.

Running a test creates a confusion matrix based on previous transactions over the selected time frame and highlights the estimated accuracy rate of the rule. In machine learning, a confusion matrix, or error matrix, is a table layout that visualizes the performance of an algorithm. This allows you to calculate accuracy over a specific date range, selectable from the last hour through to the last year.
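For readers who want to see the arithmetic, here is a minimal sketch of a confusion matrix for one rule replayed over past transactions. The sample data and function names are assumptions for illustration; accuracy here is the standard definition, (true positives + true negatives) divided by all cases.

```python
from collections import Counter


def confusion_matrix(rule_fired, was_fraud):
    """Count true/false positives and negatives for one rule."""
    counts = Counter()
    for fired, fraud in zip(rule_fired, was_fraud):
        if fired and fraud:
            counts["tp"] += 1  # rule fired on a fraudulent transaction
        elif fired and not fraud:
            counts["fp"] += 1  # rule fired on a legitimate transaction
        elif not fired and fraud:
            counts["fn"] += 1  # rule missed a fraudulent transaction
        else:
            counts["tn"] += 1  # rule correctly stayed silent
    return counts


def accuracy(matrix):
    """Share of all cases where the rule's verdict matched the label."""
    total = matrix["tp"] + matrix["fp"] + matrix["fn"] + matrix["tn"]
    return (matrix["tp"] + matrix["tn"]) / total if total else 0.0


def precision(matrix):
    """Share of the rule's hits that really were fraud."""
    fired = matrix["tp"] + matrix["fp"]
    return matrix["tp"] / fired if fired else 0.0


# Hypothetical replay of one rule over eight past transactions.
rule_fired = [True, True, False, False, True, False, False, False]
was_fraud = [True, False, False, False, True, True, False, False]

matrix = confusion_matrix(rule_fired, was_fraud)
print(dict(matrix), accuracy(matrix), precision(matrix))
```

Accuracy alone can look flattering when fraud is rare, so it is worth reading it next to precision and the raw false positive count.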

In the right hands, this gives fraud managers complete control over their risk strategy, allowing them not only to reduce but also to monitor, test, and tweak results at will.

Outsourced Vs On-Site Machine Learning Fraud Detection

While it is entirely possible for a talented team to build their own machine learning models in-house, it’s worth considering the time, effort and costs involved:

  • The costs of sourcing talent: you’ll need data scientists, engineers and machine learning specialists to build the models.
  • Time to prepare the data: preparing and cleaning the raw data. This can be a lengthy process that can take 60–80% of the entire timeline between receiving input and suggesting risk rules.
  • No shared data: another advantage of outsourced machine learning engines is that they can benefit from shared data from multiple customers. It doesn’t mean that rules get applied across the board, but rather that vendors can use their knowledge of an industry to create highly-accurate rules that other competitors can benefit from.
  • Not out-of-the-box integration: last but not least, integrating machine learning with a risk management strategy can be lengthy, complex, and costly.

5 Use Cases of Machine Learning for Fraud Detection

AI-driven fraud prevention is industry-agnostic. It only needs data to work, which is why you’ll find it has been deployed in a variety of verticals such as:

Online stores and Transaction Fraud

Analyzing data for thousands of transactions can be difficult. This is why fraud managers at many big eCommerce websites use machine learning to understand why some transactions weren’t initially flagged as fraudulent by the system.

And it’s more important than ever: Juniper Research forecasts a $50.5B loss to fraud for online retailers by 2024.

So after letting your ML system run for a while, you can learn which items are the most targeted by fraudsters, what kind of shipping information involves the most risk, and, of course, which card payments should be blocked to avoid high chargeback rates and more.

Financial Institutions and Compliance

Fintech companies, established financial institutions and even insurance providers have strict compliance requirements they must meet to avoid regulatory fines. In other words, they need to verify that they are dealing with real users, not fraudsters.

However, they must also work fast to remain competitive. This is how fraudulent profiles slip through the net. With a machine learning system in place, many of these companies can gain invaluable insights into what makes a legitimate versus fake user profile.

iGaming and Bonus Abuse or Multi Accounting

Online gaming companies, casinos, and betting platforms must do their best to ensure all the players are real. They also tend to offer high-value rewards to new customers. This creates a double incentive for fraudsters to create multiple accounts (multi-accounting) and to claim the signup bonuses as well as engage in collusive play.

According to TransUnion, 2021 saw a 43% increase in online gambling identity fraud, which proves that measures are needed more than ever.

A machine learning system can be used to analyze data points that point to suspicious user behavior. This can work in your favor to detect poker bots, cheating players, and even bad affiliates that bring in a lot of low-quality traffic to your site.
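One simple signal of this kind is several accounts sharing the same device. The sketch below groups signups by a device identifier; the account names, device identifiers, and the `shared_device_clusters` function are hypothetical, and a shared device is only a hint for review, not proof of multi-accounting.

```python
from collections import defaultdict

# Hypothetical signups: (account name, device identifier)
signups = [
    ("alice", "device-A"),
    ("bob", "device-B"),
    ("carol", "device-A"),
    ("dave", "device-A"),
    ("erin", "device-C"),
]


def shared_device_clusters(signups, min_accounts=2):
    """Group accounts by device and keep devices used by several accounts."""
    by_device = defaultdict(set)
    for account, device in signups:
        by_device[device].add(account)
    return {
        device: sorted(accounts)
        for device, accounts in by_device.items()
        if len(accounts) >= min_accounts
    }


clusters = shared_device_clusters(signups)
print(clusters)
```

In practice a signal like this would be one input among many, since households and shared computers produce the same pattern legitimately.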

BNPL and Account Takeover (ATO Attacks)

Buy Now Pay Later accounts are essentially becoming online digital wallets. If a fraudster manages to log into a user account, they can use it to purchase goods and services illegally. This is called an account takeover attack, or ATO.

The best way to protect accounts is to understand how users log onto your platform. The problem is that this may vary greatly depending on your market, seasonality, and other parameters. By running a machine learning engine on the login data points, you can understand how to better authenticate your users to protect their online accounts.

Payment Gateways and Chargeback Fraud

Yet another example where it’s very hard to manually review every transaction – especially when speed is of the essence. Payment gateways must process thousands of transactions as quickly as possible, which makes it virtually impossible to employ human agents to review all of them.

A machine learning engine can act as a kind of fraud monitoring analytics system, where you train it to detect fraudulent transactions that would otherwise incur chargeback costs. 

How SEON Helps Combine Machine Learning with Manual Reviews

There are clearly numerous advantages to letting a machine-learning system oversee your fraud prevention strategy. But sometimes, your focus isn’t just to block or accept user actions, but rather to give all the right information to your risk analysts as fast as possible. You always get these cases that fall into a grey area where the best algorithms can’t help by themselves.

This is precisely where you can use SEON’s whitebox machine learning system to suggest rules. With our engine, which also includes powerful data enrichment, you get a full understanding of the rules, and a human can still have the final say.

SEON’s transparency helps your analysts get a fuller picture of the problem and its potential solution. They can even test the rules on your own historical data and tweak them to get better results in a sandbox environment.

Effectively, this means that you get the best of both worlds: a powerful AI-driven system to tell you where it thinks fraudsters are hiding, and human intelligence to oversee the suggestions. 
In short, machines are fantastic at processing and memorizing knowledge; humans are still better at applying it. This is why SEON believes that combining human insight with machine learning fraud detection is the ultimate way to fight bad agents online.

Frequently Asked Questions About Machine Learning in Fraud Prevention

Why use machine learning in fraud detection?

A fraud detection system with machine learning will be able to detect risk based on your historical data. It can then suggest or implement rules to reduce the fraud risk automatically.

Is fraud detection with machine learning expensive?

Most machine learning systems are fully integrated with the fraud prevention vendor’s system. Pricing varies, and you can even find solutions where you only pay depending on the number of API checks you make each month.

What is the difference between whitebox and blackbox machine learning?

A blackbox machine learning system is designed to run automatically with little human supervision. A whitebox system, however, offers clearly readable suggestions so that you can accept, update or reject the risk rules it found based on your data.

Florian
Communication Specialist

Florian helps tech startups and global leaders organise their thoughts, find their voices, and connect with customers worldwide.

