Detecting Hidden Fraud Risk from Public Data

by Hudson Hollister

Detecting which of the federal government’s millions of contracts most likely involve fraud used to require insider access to agencies’ IT systems. Data analytics offers greater efficacy and a higher hit rate than traditional investigative methods, and it can now be performed using only public data.

To help agencies detect contract fraud more effectively, Elder Research has developed a risk model and data visualization tool. Now undergoing beta testing by procurement and oversight experts from four agencies, the tool scores the likelihood of fraud for every contract award, using only publicly available data and without needing a connection to agencies' internal systems. This blog post explains how it works, why it’s important, and how you can participate in the program.

How It Works: Applying Effective Risk Models to Public Data

Elder Research has helped agencies analyze their data for over two decades. In the 2000s, inspectors general started asking us to help maximize the impact of their small teams of fraud investigators. The goal was to develop risk models that predict the likelihood that a given contract involves fraud, acquisition violations, or other irregularities. That way, their teams could focus on the riskiest contracts, find fraud faster, arrest more bad guys, and save taxpayers more money.

The fraud risk models Elder Research developed for federal contract data create and combine metrics drawn from an agency's internal financial and procurement data. For example, a high number of contract modifications and a high payment frequency are both correlated with higher risk. The models combine many such metrics and can be refined based on experience.
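To make the idea of combining metrics concrete, here is a minimal sketch of one simple way to do it: a weighted sum of normalized metrics. This is not Elder Research's actual model. The `Contract` fields, the weights, the caps, and the sample contracts are all hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    contract_id: str
    modification_count: int
    payments_per_year: float


# Hypothetical weights and caps; a real model would fit or tune these.
WEIGHTS = {"modification_count": 0.6, "payments_per_year": 0.4}
CAPS = {"modification_count": 20, "payments_per_year": 52}


def normalize(value, cap):
    """Scale a raw metric to the range [0, 1], clipping at the cap."""
    return min(max(value, 0), cap) / cap


def risk_score(contract):
    """Weighted sum of normalized metrics, scaled to 0-100."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        total += weight * normalize(getattr(contract, name), CAPS[name])
    return round(100 * total, 1)


# Made-up contracts for demonstration only.
contracts = [
    Contract("A-001", modification_count=2, payments_per_year=4),
    Contract("A-002", modification_count=15, payments_per_year=26),
    Contract("A-003", modification_count=30, payments_per_year=60),
]

# Highest-risk contracts first.
ranked = sorted(contracts, key=risk_score, reverse=True)
for c in ranked:
    print(c.contract_id, risk_score(c))
```

Refining the model "based on experience" would, in this simplified picture, amount to adjusting the weights and caps as investigators confirm or rule out fraud on the contracts the scores flag.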

To simplify use and make it easy to act on model insights, we developed a visualization platform called the Risk Assessment Data Repository, or RADR (it’s pronounced just like “radar,” signifying the way it can detect hidden risk). RADR displays ranked lists and mapped plots of an agency’s contracts, with numerical and color-coded risk scores that make it very clear where investigators should focus their efforts. Using the RADR platform, the U.S. Postal Service opened 113 investigations that aided in over $11 million in recoveries, restitution, and cost avoidance in the first year, reduced hours spent on a case by 30%, and increased dollars returned per case by 35%.

But the RADR platform required inspectors general to extract data from their agency’s financial and procurement systems into a secure environment accessible to the risk models. Once the data was analyzed and scored, many of the high-risk contracts investigated turned out to involve fraud, allowing the inspectors general to build criminal cases and shut the schemes down. However, these tools could only be deployed by inspectors general within their own agencies, and only when they managed to gain access to the required internal systems.

Fortunately, to increase transparency, Congress passed legislation requiring data from these same internal systems to be made available for public scrutiny. First came the Federal Funding Accountability and Transparency Act of 2006 (FFATA). FFATA required the White House to display a summary of every federal contract, grant, loan, and other financial assistance award of more than $25,000 on a publicly accessible and searchable website, giving the American public access to information on how their tax dollars are being spent. (Check it out at USAspending.gov.) The Digital Accountability and Transparency Act (DATA Act), enacted on May 9, 2014, dramatically expanded the universe of public spending data by disclosing direct agency expenditures and linking federal contract, loan, and grant spending information to federal agency programs. The DATA Act also improved data quality on USAspending by requiring the Treasury Department and the White House to work together on data standards.

As a result, USAspending is now the official source for public access to spending data for the U.S. Government. Its mission is to show the American public how the federal government spends their tax dollars. The site allows money to be tracked from Congressional appropriations, to the federal agencies, and down to local communities and businesses. Best of all, the data on USAspending is available for download.

Building a Workflow to Manage Risk

Once USAspending’s public data became more extensive, more reliable, and easily downloadable, we compared the performance of our contract fraud risk models using internal data to results from the public data. We found that a significant number of our risk metrics, though not a majority of them, can be drawn from the USAspending data (and from some other public portals, such as FedBizOpps.gov). For example, from USAspending data the RADR platform can determine how many times a contract has been modified.
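As an illustration of how a metric like modification count might be derived from a downloaded file, here is a minimal sketch using only the Python standard library. The column names (`award_id_piid`, `modification_number`, `federal_action_obligation`), the convention that a modification number of "0" or blank marks the base award, and the sample rows are all assumptions made for this example. Check them against the actual file you download before relying on them.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; column names are assumptions, not a confirmed schema.
SAMPLE_CSV = """award_id_piid,modification_number,federal_action_obligation
W0001,0,100000.00
W0001,P00001,25000.00
W0001,P00002,-5000.00
W0002,0,48000.00
"""


def count_modifications(csv_file, id_col="award_id_piid",
                        mod_col="modification_number"):
    """Count non-base transactions per contract.

    Assumes a modification number of "0" (or blank) marks the base award
    and that every other row for the same contract ID is a modification.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        contract_id = row[id_col]
        counts[contract_id] += 0  # ensure unmodified contracts still appear
        if row[mod_col].strip() not in ("", "0"):
            counts[contract_id] += 1
    return dict(counts)


mod_counts = count_modifications(io.StringIO(SAMPLE_CSV))
print(mod_counts)
```

To run this against a real download, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`, where `path` points to your own CSV.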

We created new risk models and a new version of RADR optimized for use with publicly available data. This new version of RADR no longer requires access to an agency’s internal data systems. Anyone with access to the tool can view fraud risk rankings for contracts at a particular agency or even government-wide.

The beta version of RADR provides four ways to review and analyze the fraud risk for federal contracts:

  • Individual Contracts (independent of agency or recipient) - Scored with transparency as to how the score was built
  • Agency-based view (top 100 scored Contracts) - Score binning of all Contracts associated with the Agency
  • Recipient-based view (top 100 scored Contracts) - Score binning of all Contracts associated with the Recipient
  • Product Service Code view (top 100 scored Contracts) - Score binning of all Contracts associated with the Product Service Code
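The views listed above all rest on two operations: ranking contracts by score and binning scores within a group. The sketch below shows one way those operations could look. It is not RADR's implementation. The bin edges, the labels, the tuple layout, and the sample scores are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical scored contracts: (contract_id, agency, score on a 0-100 scale).
scored = [
    ("C-1", "Agency A", 91.0),
    ("C-2", "Agency A", 42.5),
    ("C-3", "Agency B", 77.0),
    ("C-4", "Agency B", 12.0),
    ("C-5", "Agency A", 68.0),
]

# Hypothetical bin edges: (inclusive lower bound, label), highest first.
BINS = [(75, "high"), (40, "medium"), (0, "low")]


def score_bin(score):
    """Return the label of the first bin whose lower bound the score meets."""
    for lower, label in BINS:
        if score >= lower:
            return label
    return "low"  # fallback for negative scores


def top_n(rows, n=100):
    """Return the n highest-scored contracts, highest first."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


def bin_counts_by_group(rows, group_index=1):
    """Count contracts per score bin within each group (here, the agency)."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[group_index]][score_bin(row[2])] += 1
    return {group: dict(c) for group, c in counts.items()}


top = top_n(scored, n=3)
by_agency = bin_counts_by_group(scored)
print(top)
print(by_agency)
```

Grouping by recipient or by Product Service Code would work the same way: add that field to each row and pass its position as `group_index`.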

The following figures illustrate some of RADR’s capabilities:

Figure 1 Contract Score Summary

Figure 2 Top 100 Scored Contracts (data in first 3 columns removed for privacy)

 

Figure 3 Overall Score Bin

Because fewer metrics can be drawn from the publicly available data, results from this version of RADR are not as accurate as those derived from each agency's internal financial and procurement systems data. But they are much better than applying no data science at all!

Why It’s Important: Government-Wide Risk Management

The new version of RADR is now in beta testing with four agencies and one private group to understand:

  • How they review and act on risk information
  • How to improve the RADR platform user experience
  • How to make the underlying risk models more effective

Until we adapted our risk models and developed this beta version, there was simply no way to get a holistic government-wide picture of fraud risk in federal contracting. Now, our beta users get an instant, contract-by-contract picture of fraud risk within every agency across the government.

Once the beta program is completed, we will decide how best to extend the benefits of instant fraud risk detection to all the stakeholders who may need it: agency executives, oversight leads, contracting officials, Congress, voters, and contractors.

The possibilities are exciting! In the near term, every inspector general office could employ advanced analytics to assess risk for its agency’s contracts. The future promises even more advanced capabilities. Imagine, for example, when it’s possible for a contractor to follow their risk profile in real time and proactively take steps to mitigate that risk. Then, fraud might be prevented before it ever begins.
