Digital companies face a new era of fraud. In this article, we look at fraud beyond financial transactions. “Soft fraud” is about loopholes in marketing incentives or policies, rather than the typical “hard” definitions of payment or identity fraud. The goal is to look at fraud that could silently happen to you and how to address it with data. Lastly, we check what is needed for successful fraud detection with machine learning.
Many companies transform digitally to stay ahead of the curve. At the same time they expose themselves in a digital ecosystem. As digital presence grows, so does the surface area that attracts malicious actors. “The crime of getting money by deceiving people” according to the Cambridge Dictionary takes many forms when you deceive systems instead of people. Once fraudsters identify a loophole, they scale their approach with bots leading to substantial financial loss. This likely explains why fraud and debt analytics ranks among the top ten AI use cases according to McKinsey’s state of AI report.
Fraud that is less clear-cut from a legal perspective involves bad actors that systematically exploit loopholes within usage policies, marketing campaigns or products. We could refer to it as soft fraud:
Bad actors systematically misuse policies, products or services to divert money or goods from the digital ecosystem to themselves.
So, what forms can soft fraud take?
Photo by Noelle Otto
Digital marketing giveaways. The digital economy offers a vast range of services, and so does it offer endless possibilities for fraud. One of the biggest areas is digital marketing. It gets attacked from two sides: Humans and algorithms that mimic human behavior, also known as bots. Both try to exploit usage policies, ad campaigns or incentive schemes. For example, a customer creates accounts to claim sign-up bonuses, also called sign-up fraud. Another one involves a customer that uses a product once and yet returns it, referred to as return fraud. Sharing accounts across friends or family is a famous example for companies like Netflix. Non-human actors, like bots, click on paid-ads or exploit affiliate schemes to claim rewards, such as a payout for each new customer registration.
Humans reap bonuses. Most of the traffic still comes from humans, estimated around 60%. They become interested in your product and explore your digital offering. Some try to take advantage of promotional schemes such as newsletter sign-up bonuses, giveaways or related incentives. They reap bonuses multiple times, for example by using generic email addresses. Others try to push boundaries on usage policies. For example, when multiple persons use one account or share content protected by paywall. With a genuine interest in your product, they count as “friendly fraudsters”, happily using blind spots in web-tracking or marketing campaigns. Those customers invest time to access your products. So, they reveal a strong preferences for your offering. Rigorously blocking them to bring down fraud may hit innocent customers as false positives. Additionally it kills the potential to re-engage with previous fraudsters in a more secure way. That is why in the world of fraud detection, experts refer to it as the “insult rate”.
Bots dilute metrics. Up to estimated 40% of website traffic comes from bots. They click ads, fill out web forms and reap giveaways. The entire lifecycle of digital marketing gets compromised. Bots dilute key performance metrics which leave you wondering about low conversion rates, high cost-per-click or low lead quality. They negatively impact key metrics such as cost per acquisition (CPA), customer lifetime value (LTV), cost per click (CPC), marketing qualified leads (MQL), etc.
Adapt fraud detection to these types
Photo by lil artsy
Below you find a list that provides an overview about fraud types you can encounter. It divides into non-human actors like bots, human actors like users and eventually both. It includes anyone who gets incentivized by your digital presence to commit fraud.
Non-human actors like bots
- Click fraud: Viewing ads to get paid per click.
- Inventory fraud: Buying limited goods like sneakers or tickets and holding inventories.
- Fake account creation: Registering as users to dilute the customer base.
- Campaign life-cycle fraud: Competitors deploy bots which eat up marketing budgets.
- Lead generation fraud: Filling out forms to sabotage sales efforts
Human-only actors like customers or competitors
- Multi-account usage: Different persons use a personalized account.
- Return fraud: Customer uses product and returns it damaged
- Bonus fraud: Get discounts multiple times after newsletter sign-up or account registration.
- Account takeover: Leaked login details or weak user authentication
- Friendly fraud: Customers receive a product, dispute the purchase and chargeback the money
Either human or non-human
- Affiliate fraud: Bots click exploit a strategy in affiliate campaigns to unlock compensation
- Bad-reputation fraud: An attack on your product reviews from competitors
Some of these can be tackled with data analytics and possibly machine learning, while some are more about designing policies and services in a safer way, so that they cannot be easily exploited.
Effective fraud detection builds on data
Now that we have seen different types of fraud, what can we do about it? Do we want to detect them when they happen, or do we want to prevent them from happening at all? Let us see how data & analytics can help us.
Leverage machine learning. Fraud tends to happen systematically. Systematic actors need a systematic response. If your data captures these patterns and lets you identify fraud, you have everything to build effective solutions with rules, heuristics or eventually machine learning. Machine learning is an approach to learn complex patterns from existing data and use these patterns to make predictions on unseen data (Huyen, C., 2022. Designing Machine Learning Systems).
Rephrasing this from a business perspective would lead to the starting question for machine learning:
Do you face a (1) business-relevant and (2) complex problem which can be (3) represented by data?
- Business-relevance: Can you become more profitable
- Complexity: Is data available in volume or complexity that heuristics likely fail?
- Data representation: Is data extensive and consistent enough for a model to identify patterns?
Machine learning requires detailed and consistent data to make it work. There is no silver bullet.
Identify fraud in data. Preventing fraud comes down to data. How well you can track web sessions, impressions and click paths becomes central in dealing with fraud. Without tracking data, chances are low to do anything about it. Even third-party anti-fraud software might be ineffective since it solves generic use cases by design. Different firms attract different fraud types. Third party solutions cannot possibly know the specifics on a complex range of products or services and their vulnerabilities. Therefore, a tailored approach built together with internal domain experts such as product or marketing could effectively prevent fraud.
Machines or humans. One major challenge is to differentiate between bots and humans. Nowadays, bots have become better at mimicking human behavior. At worst they come in thousands to interact with whatever incentive you expose to the outside world. Due to the sheer traffic volume it is infeasible to manually analyze patterns. You have to fend off algorithms with algorithms. The depth of data you have, directly determines whether you have any chance to deploy machine learning.
Honeypots for bots. One way to label bots is to use so-called honeypots to lure bots. Honeypots are elements on your website invisible to humans, like hidden buttons or input forms. Bots scrape the website source-code to discover elements they can interact with. If your website tracking logs an interaction with these hidden elements, you clearly identify bots. You can see a summary of the honeypot method in this article by PerimeterX: How to use honeypots to lure and trap bots.
As bots act more like humans, their digital footprint blends in with anyone else’s. This poses a major challenge to any data-driven solution and there is no magic solution to that. Creating honeypots that lure bots could be one way forward. Along the lines of Gartner’s Market Guide for Online Fraud Detection, a vendor on bot detection would be the safest bet, such as Arkose Labs, Imperva, GeeTest or Human to name a few.
This article talks about the rise of novel fraud types that modern fraud detection faces. Firms increasingly expose their offerings in the digital ecosystem which leads to losses due to fraud. Policy loopholes and marketing giveaways erode their digital budgets. For example, customers reaping signup bonuses multiple times with generic emails on the one hand, and sophisticated bots creating fake accounts that dilute your customer base on the other hand. Both forms lead to losses along the digital supply chain.
I personally find the world of fraud detection fascinating. It constantly changes where preventive technology and creative fraudsters move in tandem. With the rise of bots, fraud detection becomes more complex and difficult to do with conventional approaches. If you start on your fraud detection journey, I recommend you start thinking about how your company’s digital presence is reflected by the data you have. Web tracking needs to be deep enough to enable analytics or even machine learning.
At Solita we have the skillset to both build strategic roadmaps and create data solutions with our team of data experts. Feel free to reach out how we can help you on the data groundwork towards effective fraud detection.