Discover the pros and cons of blackbox and whitebox AI solutions and understand which one is the best fit for you.
Artificial Intelligence (AI) is transforming every facet of our personal and professional lives, but businesses globally are still struggling to understand the implications of this new wave of disruptive technologies.
Fraudsters are no longer working as individuals, instead opting to work together in international criminal gangs and global networks to share new malicious techniques to defraud businesses. The best technology for fighting fraud is one that can change and adapt as quickly as the fraudsters’ tactics.
- Exploring the different forms of ML in Fraud Detection: data, multiplicity, integration, white-boxing, ongoing monitoring, experimentation.
- Supervised or unsupervised learning for fraud detection.
- Historical data and labelled data.
- Explainable AI.
- ML Regulations – central regulations.
- Pros and cons of blackbox and whitebox solutions.
- The future direction of machine learning in fraud-fighting.
- Stefania Boná – Data Science Product Manager at Checkout.com
- James Hunt – Fraud Advisory at Bottomline Technologies
- János Szendi-Varga – Data Team Lead at SEON
The host was our very own co-founder and COO, Bence Jendruszak.
Key Answers From the Q&A
How much training data is needed to make ML useful for fraud detection? One day? A week? More?
“I think the first problem when it comes to machine learning and card payments is that there’s a lag. So you’re not going to know straight away if a transaction was fraudulent. We do need to wait for Visa, Amex, Mastercard, etc. to tell us if it was fraud or not. So usually I’d say 3-6 months’ worth of data. The lag is massive. And it’s not just about how much data, but about the quality of the data. It’s something I need to keep stressing: your model is only as good as the data that comes into it.” – Stefania
“There are so many factors to consider, so I really think the only answer I can give is: it depends.” – János
“I’ll agree. It depends on the provider you’re potentially using. If I’m looking at the industry as a whole, there are plenty of providers who just take your own data, either in a structured or unstructured way, and will try to make sense of it. Now the time to value could be completely different with a cloud-based vendor that’s using their industry knowledge and the talent pool, right? For example, there’s about 10% of Swift traffic we can use in our modelling when our customers come to us. The one key thing I’d say is: if you’re talking to a vendor, you should understand what the time to value is. Understand what the methods and time scales are, and choose what’s best for your business model.” – James
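Stefania's point about label lag can be illustrated with a short Python sketch. It is not any panelist's actual pipeline: the `mature_transactions` helper, the dictionary layout and the 90-day default (roughly the low end of her 3-6 month estimate) are all assumptions made for illustration. The idea is simply to train only on transactions old enough for the card schemes to have reported fraud on them.

```python
from datetime import date, timedelta


def mature_transactions(transactions, as_of, label_lag_days=90):
    """Keep only transactions whose fraud label has had time to arrive.

    `label_lag_days` is a hypothetical setting; pick it from how long
    fraud reports actually take to reach you.
    """
    cutoff = as_of - timedelta(days=label_lag_days)
    return [tx for tx in transactions if tx["day"] <= cutoff]


# Toy history: the June transaction is too recent to trust its label yet.
history = [
    {"day": date(2021, 1, 15), "amount": 120.0, "is_fraud": False},
    {"day": date(2021, 3, 20), "amount": 900.0, "is_fraud": True},
    {"day": date(2021, 6, 1), "amount": 75.0, "is_fraud": False},
]

training_set = mature_transactions(history, as_of=date(2021, 6, 30))
print(len(training_set), "of", len(history), "transactions are usable for training")
```

A recent transaction marked as not fraud may only mean that nobody has reported it yet, which is why the newest data is held back rather than treated as clean.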
What is the administrative workload for a machine learning service? How much of your resources should be allocated to it (team size, etc…)?
“Here again, I think it depends on the deployment of the technology itself. Sometimes it’s more labour-intensive. It will depend on the size of the business too. Ultimately I work with multiple customers on a daily basis and can manage that quite easily, so it depends, but for the bare minimum, having one person who’s multidisciplined can work. You might not need them to concentrate on ML full time, but that would be my benchmark.” – James
“Yes, it does depend on the reasons. You can outsource it and have someone taking care of the end-to-end management and that will be cheaper. In my case, if you are a company saying: ‘I want to build my own machine learning model for fraud detection’, then you’ll definitely need to have data scientists and engineers continually updating those models. Because fraud changes, so you need to make sure you update those models as frequently as possible to keep up with changing fraud patterns.” – Stefania
“When you are an early-stage startup you need one person who understands the business, understands the data, the technology. So a real data superhero with business knowledge. And that’s enough to start. But you have to scale up quickly when your business is growing. If you’re in an industry that’s riskier than others, you’ll definitely need more fraud analysts and experts than others.” – János
“Yes, it’s also so important to have the relevant knowledge. Like you said you need someone with inside knowledge around the intricacies of that specific industry.” – Bence
Are you using Neo4j? Or which database management platform is your favourite?
“I’m definitely a big fan of Neo4j’s graph database because they are the market leader and they got a huge investment – the biggest in database company history. So I think it’s a validation of their business. We’re using Neo4j, implementing a solution based on it. We’re still working on it and we have big plans with them.” – János
How does your company capture false positives or false negatives?
“When I used to do the demo calls for our company back in 2017, we were always focused on developing a whitebox system. Not just for machine learning, but even for the decisioning of the fraud platform. When rules are created, they have to be readable and so on. And so for every rule that the fraud manager applies, we have a backtesting functionality. This was really interesting to me because nobody else in the fraud prevention space was doing that. So you can backtest it and calculate over a specific timeframe what that rule would have given you in terms of false positives and false negatives. Unfortunately, I’m not doing many demo calls anymore but I think that’s a great feature of SEON’s.” – Bence
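The rule backtesting Bence describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SEON's implementation: the `Transaction` fields, the `backtest_rule` function and the example rule are hypothetical names. It replays a readable rule over labelled historical transactions in a chosen timeframe and counts how often the rule would have been right or wrong.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass(frozen=True)
class Transaction:
    day: date
    amount: float
    country: str
    is_fraud: bool  # confirmed outcome, known only in hindsight


def backtest_rule(
    rule: Callable[[Transaction], bool],
    transactions: Iterable[Transaction],
    start: date,
    end: date,
) -> dict:
    """Replay `rule` over transactions dated start..end (inclusive)."""
    counts = {
        "true_positives": 0,
        "false_positives": 0,
        "false_negatives": 0,
        "true_negatives": 0,
    }
    for tx in transactions:
        if not (start <= tx.day <= end):
            continue  # outside the backtesting window
        flagged = rule(tx)
        if flagged and tx.is_fraud:
            counts["true_positives"] += 1
        elif flagged and not tx.is_fraud:
            counts["false_positives"] += 1  # legitimate customer blocked
        elif not flagged and tx.is_fraud:
            counts["false_negatives"] += 1  # fraud the rule missed
        else:
            counts["true_negatives"] += 1
    return counts


def high_value_foreign(tx: Transaction) -> bool:
    """Hypothetical readable rule: flag large payments from outside GB."""
    return tx.amount > 500 and tx.country != "GB"


history = [
    Transaction(date(2021, 1, 5), 900.0, "US", True),
    Transaction(date(2021, 1, 9), 750.0, "FR", False),
    Transaction(date(2021, 1, 12), 40.0, "GB", True),
    Transaction(date(2021, 1, 20), 60.0, "GB", False),
    Transaction(date(2021, 3, 1), 999.0, "US", True),  # outside the window
]

result = backtest_rule(high_value_foreign, history, date(2021, 1, 1), date(2021, 1, 31))
print(result)
```

Because the rule is an ordinary readable function, a fraud manager can adjust its threshold and rerun the same backtest to compare the false positive and false negative counts before the rule goes live.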
What is a big challenge when starting out with AI detection?
“If you’re starting from scratch you need to invest in a lot of resources. You need to make sure the data you have is clean and of good quality, and that you have enough of it. Data scientists and engineers must turn these models into a reality and you need to make sure you have the infrastructure to do that. It’s definitely challenging to make sure everything works smoothly from the moment the data is ingested to the time of the output.
Another challenge is that you don’t really understand what’s in the market out there because everyone is really secretive about how their models are performing. So being tasked to come up with goals for the team has been challenging.” – Stefania
“I think Stefania hit the nail on the head. If you’re looking to start building AI models, especially from a fraud operations standpoint, it’s going to be a case of ROI: how much money you’re going to be able to put in and what you’re hoping to get out. I was talking to a few people the other week, and one of the questions some financial institutions were asking was: why should I choose a vendor when I could just build it in house? But why would you? Especially if it’s a cloud-based solution, you get a much better price, you get quicker ROI, you get the team, etc. The way I look at it is you need to look at how much time and resources you want to put into it. It’s an in-house vs outsourced argument, but I’ll always be a fan of outsourcing.” – James
“Only one addition to this: to start you need to hire good people. That’s the biggest challenge nowadays, to get the proper AI or machine learning engineers. When you have a team it’s not so hard to start.” – János
“Yes, and I’m on the recruitment side so every day I see how we’re trying to help the team out by getting more and more talented people on our team.” – Bence
Watch our webinar to learn about the future direction of machine learning in fraud-fighting.