*By Bart Baesens, Véronique Van Vlasselaer and Wouter Verbeke*

The need for effective fraud detection and prevention systems is highlighted by some recent estimates of the *size* and financial impact of fraud:

- A typical organization loses 5% of its revenues to fraud each year (www.acfe.com)
- The total cost of insurance fraud (non-health insurance) in the U.S.A. is estimated to be more than $40 billion per year (www.fbi.gov)
- Fraud is costing the U.K. £73 billion a year (National Fraud Authority)
- Credit card companies "lose approximately seven cents per every hundred dollars of transactions due to fraud" (Andrew Schrage, Money Crashers Personal Finance, 2012)
- The average size of the informal economy, as a percent of official GNI in the year 2000, in developing countries is 41%, in transition countries 38%, and in OECD countries 18% (Schneider, 2002)

Even though these numbers are rough estimates rather than exact measurements, they are based on evidence and indicate the importance and impact of the phenomenon, and therefore the need for organizations and governments to actively fight and prevent fraud with all means at their disposal. They also suggest that investing in fraud detection and prevention systems is likely worthwhile, since a significant financial return on investment can be made. However, estimating the return on investment in analytical approaches to fighting fraud is not straightforward: it requires an assessment of the *total cost of ownership* of the analytical models, the full impact of fraud on the organization, and the *total utility* of fraud detection and investigation.

**Total Cost of Ownership**

The Total Cost of Ownership (TCO) of a fraud analytical model refers to the cost of owning and operating the analytical model over its expected lifetime, from inception to retirement. It should consider both quantitative and qualitative costs and is a key input for strategic decisions about how to optimally invest in fraud analytics. The costs involved can be decomposed into acquisition costs, ownership and operation costs, and post-ownership costs, as illustrated with some examples in Table 1.

*Table 1 Example costs for calculating Total Cost of Ownership (TCO).*

The goal of TCO analysis is to get a comprehensive view of all costs involved. From an economic perspective, this should also include the timing of the costs through proper discounting, using e.g. the weighted average cost of capital (WACC) as the discount rate. Furthermore, it should help identify any potential hidden and/or sunk costs. In many fraud analytics projects, the combined cost of hardware and software is smaller than the people cost that comes with developing and using the analytical models (e.g. training, employment and management costs). TCO analysis also makes it possible to pinpoint cost problems before they become material. For example, the change management costs of migrating from a legacy fraud model to a new analytical fraud model are often largely underestimated. TCO analysis is a key input for strategic decisions such as vendor selection, buy-versus-lease decisions, in- versus outsourcing, overall budgeting and capital calculation. Note that when making these investment decisions, it is also very important to include the benefits in the analysis, since TCO only considers the cost perspective.
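The discounting step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `CostItem` structure, the cost labels, the amounts and the 8% WACC are all made-up assumptions, and each cost is simply discounted by the number of years after inception in which it falls.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    label: str     # hypothetical cost category
    year: int      # years after inception (0 = today)
    amount: float  # undiscounted cost in monetary units


def discounted_tco(costs, wacc):
    """Present value of all cost items, discounted at the given WACC."""
    return sum(c.amount / (1.0 + wacc) ** c.year for c in costs)


# Illustrative figures only; none of these numbers come from the article.
costs = [
    CostItem("software and hardware acquisition", 0, 100_000.0),
    CostItem("model development and training", 0, 150_000.0),
    CostItem("operation and maintenance", 1, 80_000.0),
    CostItem("operation and maintenance", 2, 80_000.0),
    CostItem("decommissioning and migration", 3, 30_000.0),
]

tco = discounted_tco(costs, wacc=0.08)
print(f"Discounted TCO: {tco:,.2f}")
```

Because later costs are discounted more heavily, the discounted TCO is lower than the plain sum of the amounts, which matters when comparing options with different cost timing, such as buying versus leasing.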

**Return on Investment**

Return on Investment (ROI) is defined as the ratio of a return (benefit or net profit) over the investment of resources that generated this return. Both the return and the investment are typically expressed in monetary units, whereas the ROI is calculated as a percentage. In this section we discuss how to calculate the ROI of fraud detection, which may be less straightforward to calculate than the ROI of a financial product, but nonetheless can provide useful insights to an organization.

The returns of a fraud detection system depend on the number of cases investigated and the fraction of those that are effectively fraudulent. Note that this fraction is a property of the fraud detection system, and depends on the power of the system to detect fraudulent cases.

The optimal amount of resources to allocate to fraud investigation, and hence the sample to investigate, is the amount that maximizes the total utility associated with inspecting a sample. This sample can be selected either as the top fraction of the most suspicious cases, i.e. those with the highest scores assigned by a detection model, or as the top fraction of the cases with the *highest expected fraud amount*, defined as the probability of being fraudulent times the estimated fraud amount.
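As a rough sketch of the second selection rule, the snippet below ranks cases by expected fraud amount and keeps a top fraction. The field names `p_fraud` and `est_amount`, the case data and the 50% fraction are hypothetical; in practice the probability would come from a detection model and the amount from a separate estimate.

```python
def top_fraction_by_expected_amount(cases, fraction):
    """Return the top `fraction` of cases ranked by probability * estimated amount."""
    ranked = sorted(
        cases,
        key=lambda c: c["p_fraud"] * c["est_amount"],
        reverse=True,
    )
    # Number of cases to inspect: rounded down, but at least one.
    k = max(1, int(fraction * len(ranked)))
    return ranked[:k]


# Made-up cases: probability of fraud and estimated fraud amount.
cases = [
    {"id": "A", "p_fraud": 0.90, "est_amount": 100.0},
    {"id": "B", "p_fraud": 0.20, "est_amount": 5_000.0},
    {"id": "C", "p_fraud": 0.60, "est_amount": 500.0},
    {"id": "D", "p_fraud": 0.05, "est_amount": 200.0},
]

selected = top_fraction_by_expected_amount(cases, fraction=0.5)
print([c["id"] for c in selected])
```

In this toy data, case A has the highest score but a small estimated amount, so ranking by expected fraud amount selects B and C, whereas ranking by score alone would select A and C.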

The utility of different *outcomes* is expressed as a net monetary value, either positive or negative, representing the costs and benefits to an organization (of any nature, both economic and non-economic, yet always expressed in monetary units) associated with the decision to inspect or not to inspect either a fraudulent or non-fraudulent case.
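One way to make this concrete is a small utility table indexed by decision and outcome, from which the total utility of inspecting the top k cases can be computed. The sketch below is only an illustration: the utility values are hypothetical, and it assumes a historical sample that is already ranked by the detection model and for which the true fraud labels are known.

```python
# Hypothetical net utilities (monetary units) per (decision, is_fraud) outcome.
UTILITY = {
    ("inspect", True): 900.0,    # fraud caught
    ("inspect", False): -100.0,  # legitimate case inspected needlessly
    ("skip", True): -1000.0,     # fraud missed
    ("skip", False): 0.0,        # legitimate case left alone
}


def total_utility(labels_ranked, k):
    """Total utility of inspecting the first k cases of a ranked list of fraud labels."""
    inspected = sum(UTILITY[("inspect", y)] for y in labels_ranked[:k])
    skipped = sum(UTILITY[("skip", y)] for y in labels_ranked[k:])
    return inspected + skipped


def best_sample_size(labels_ranked):
    """Number of top-ranked cases to inspect that maximizes total utility."""
    sizes = range(len(labels_ranked) + 1)
    return max(sizes, key=lambda k: total_utility(labels_ranked, k))


# True labels of a made-up sample, ordered from most to least suspicious.
labels = [True, True, False, True, False, False, False, False]

k = best_sample_size(labels)
print(k, total_utility(labels, k))
```

With these numbers, inspecting the top four cases catches all three frauds at the price of one needless inspection; inspecting further only adds cost.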

The investment required to generate the total returns or total utility can be assumed equal to the total cost of ownership, which covers the full investment required to build, operate, and maintain a fraud detection system. However, the total cost of ownership does not include the cost of the resources required to act upon the outputs of the detection system, i.e. inspecting and handling suspicious cases. These costs, which include inspection costs, legal costs, etc., will together be denoted as the *Total Cost of Fraud Handling*. Calculating the total cost of fraud handling may be a cumbersome task, yet it is indispensable for calculating the ROI.

Hence, we get to a general ROI formula for assessing investments in fraud detection and prevention, which can be fine-tuned to the specific setting of any organization:
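A minimal Python sketch of such a calculation is given below. It rests on one reading of the definitions above and should be taken as an assumption rather than as the authors' exact formula: the return is the total utility, and the investment is the total cost of ownership plus the total cost of fraud handling. The function name and all figures are hypothetical.

```python
def roi_percent(total_utility, tco, fraud_handling_cost):
    """ROI as a percentage: return divided by investment.

    Assumption: investment = TCO + total cost of fraud handling, and
    `total_utility` has not already been reduced by those handling costs.
    """
    investment = tco + fraud_handling_cost
    if investment <= 0:
        raise ValueError("investment must be positive")
    return 100.0 * total_utility / investment


# Illustrative figures only.
print(roi_percent(total_utility=500_000.0, tco=200_000.0, fraud_handling_cost=50_000.0))
```

If the outcome utilities already net out inspection costs, only the remaining handling costs (e.g. legal costs) should be passed as `fraud_handling_cost`, to avoid counting them twice.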

**Conclusion**

Fraud has a significant impact on organizations of all sorts and sizes. Estimating the size of the impact in terms of financial losses is difficult and the resulting figures are typically rather sensitive to the underlying assumptions. Yet, calculating the returns of investing in a powerful fraud detection system can and definitely should be done to evaluate whether the system is delivering value to the organization as well as to quantify how much value. This article briefly introduces a formula to calculate ROI in this setting, indicating different sources of costs and benefits to be taken into account.

Further detailed information on this topic and on how to optimize fraud detection and prevention efforts, as well as a methodology to calculate capital requirements to cover fraud losses, can be found in a book written by the authors of this article, entitled *Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection*, published in August 2015 by Wiley in the SAS Business Series. The book presents methods and techniques to develop powerful fraud detection and prevention systems using a data-driven approach, *from A to Z.*

**About the Authors**

*Professor Bart Baesens is a professor at KU Leuven (Belgium), and a lecturer at the University of Southampton (United Kingdom). He has done extensive research on analytics, customer relationship management, web analytics, fraud detection, and credit risk management. He has also developed two self-paced e-learning courses: Advanced Analytics in a Big Data World and Credit Risk Modeling. See www.dataminingapps.com for more information about his research.*

*Professor Wouter Verbeke, Ph.D., is an assistant professor of Business Informatics and Business Analytics at Vrije Universiteit Brussel, Brussels, Belgium. He graduated in 2007 as a civil engineer and obtained a Ph.D. in applied economics at KU Leuven in 2012.*

*Véronique Van Vlasselaer graduated magna cum laude as Master Information Systems Engineer at the Faculty of Business and Economics, KU Leuven (Belgium). With her master's thesis ‘Mining Data on Twitter’, she won the Best Thesis Award from the faculty. In October 2015, Véronique obtained her degree of Doctor in the Applied Economic Sciences with Prof. Bart Baesens at the Department of Decision Sciences and Information Management. The title of her Ph.D. dissertation is ‘FAIR: Forecasting and Network Analytics for Collection Risk Management’, which mainly focused on the development of fraud detection techniques using social network analysis. During her Ph.D., she worked together with Smals Research (RSZ-ONSS) and Atos Worldline. She currently works as an Analytical Consultant at SAS Belgium & Luxembourg.*
