
Healthcare fraud detection still uses cave-man data mining techniques

The Washington Education Association (WEA, in Washington State) is partnering with Aon Hewitt (Illinois), a verification company, to eliminate a specific type of health insurance fraud: teachers reporting non-qualifying people, such as an unemployed friend with no health insurance, as dependents. This fraud is committed by "nice" people (teachers) to provide health insurance to people who would otherwise have none, by reporting them as a spouse or child.

Interestingly, I saw the letter sent to all WEA teachers. It requires you to fill out lots of paperwork and provide multiple proofs of identity (tax forms, birth certificates, marriage certificates, etc.), similar to the ID documents (Form I-9) required to be allowed to work for a company.

It is easy to cheat on the paper documentation that you have to mail to the verification company (e.g. by producing fake birth certificates or claiming you don't have one). In addition, asking people to fill out so much paperwork wastes time and natural resources (trees used to produce paper), results in lots of errors, privacy issues and ID theft risk, and costs WEA a lot of money.

So why don't they use modern methods to detect fraud? Data mining techniques can detect suspicious SSNs: identifying SSNs reported as dependents by multiple households based on IRS tax data, SSNs not showing up in any tax forms submitted to the IRS, address mismatches, etc. (Note that a 5-day-old baby probably has no record in the IRS database, yet is eligible as a dependent for tax or health insurance purposes.)
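As a rough illustration of the first two checks, here is a minimal Python sketch. It assumes the claims are available as (household, SSN) pairs and that the set of SSNs appearing in tax records can be queried; the function name `flag_suspicious_ssns`, the household IDs and the SSN-like values are all hypothetical, not part of any real IRS or WEA system.

```python
from collections import defaultdict


def flag_suspicious_ssns(claims, ssns_in_tax_records):
    """Flag dependent SSNs that deserve a closer look.

    claims: iterable of (household_id, dependent_ssn) pairs (hypothetical format).
    ssns_in_tax_records: set of SSNs known from tax filings (hypothetical input).
    """
    # Group the households that claim each SSN as a dependent.
    households_by_ssn = defaultdict(set)
    for household_id, ssn in claims:
        households_by_ssn[ssn].add(household_id)

    flags = {}
    for ssn, households in households_by_ssn.items():
        reasons = []
        if len(households) > 1:
            reasons.append("claimed by multiple households")
        if ssn not in ssns_in_tax_records:
            # Only a hint: a newborn may legitimately have no tax record yet.
            reasons.append("absent from tax records")
        if reasons:
            flags[ssn] = reasons
    return flags


# Made-up data; the 000-prefixed values are deliberately not valid SSNs.
claims = [
    ("H1", "000-00-0001"),
    ("H2", "000-00-0001"),
    ("H2", "000-00-0002"),
    ("H3", "000-00-0003"),
]
known_ssns = {"000-00-0001", "000-00-0002"}

flags = flag_suspicious_ssns(claims, known_ssns)
for ssn, reasons in flags.items():
    print(ssn, reasons)
```

A flag here is a reason for review, not proof of fraud, which is why the newborn case is only marked rather than rejected.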

Why not use data mining technology, instead of paper - with all the advantages that data mining offers over paper? What advantages does paper offer? I don't see any.

Here's how data collection and processing should have been performed:

  • Ask Washington teachers to provide a list of dependents with, for each dependent, the relationship (spouse / child), address, phone number(s), date of birth, and maybe driver's license number with state - and nothing more, no paper.
  • Teachers fill out the form online (those without Internet access use school computers), and the online questionnaire is designed to minimize the risk of errors / typos. IP addresses are also tracked.
  • The verification agency uses a service such as Intelius to check whether the dependents named by a teacher live under the same roof as the teacher, and whether the age / date of birth matches what the teacher reported. The data mining algorithm performs fuzzy matching. Do the dependents also share the teacher's last name? If not, additional scrutiny is required.
  • Teachers reporting more than 3 dependents are subject to increased scrutiny by data mining algorithms, and to random manual verification (e.g. via phone calls).
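The checks in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the records are a made-up list standing in for whatever a people-search service would return (no real Intelius API is used), fuzzy matching is done with the standard library's `difflib.SequenceMatcher`, and the function names, field names and the 0.85 threshold are hypothetical choices.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity score between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def review_reasons(teacher, dependents, records, threshold=0.85, max_dependents=3):
    """List the reasons a teacher's declaration deserves extra scrutiny."""
    reasons = []
    if len(dependents) > max_dependents:
        reasons.append(f"more than {max_dependents} dependents")

    for dep in dependents:
        name = dep["name"]
        if name.split()[-1].lower() != teacher["last_name"].lower():
            reasons.append(f"{name}: last name differs from teacher's")

        # Fuzzy-match the declared name against the external records.
        best = max(records, key=lambda r: similarity(name, r["name"]), default=None)
        if best is None or similarity(name, best["name"]) < threshold:
            reasons.append(f"{name}: no matching record found")
            continue
        # "Same roof" check: compare the record's address with the teacher's.
        if similarity(teacher["address"], best["address"]) < threshold:
            reasons.append(f"{name}: address differs from teacher's")
        if dep["dob"] != best["dob"]:
            reasons.append(f"{name}: date of birth mismatch")
    return reasons


# Made-up example data.
teacher = {"last_name": "Smith", "address": "12 Main St, Seattle WA"}
dependents = [
    {"name": "Jon Smith", "dob": "2005-04-01"},
    {"name": "Alex Jones", "dob": "1980-09-15"},
]
records = [
    {"name": "John Smith", "address": "12 Main Street, Seattle WA", "dob": "2005-04-01"},
    {"name": "Alex Jones", "address": "99 Oak Ave, Spokane WA", "dob": "1980-09-15"},
]

for reason in review_reasons(teacher, dependents, records):
    print(reason)
```

Fuzzy matching lets a typo such as "Jon" for "John", or "St" for "Street", pass without a flag, while a different last name or a different address still triggers review.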

This should eliminate most of the fraud, at a very low cost, and with very little burden on teachers.



