Fraud detection in the health care industry

Another opportunity to apply data mining techniques in the health care industry is helping health care insurers detect fraudulent transactions, so that other patients can receive better and more affordable health care services. Insurance fraud occurs when individuals deceive an insurance company to try to obtain money to which they are not entitled. It happens when someone puts false information on an insurance application, or when false or misleading information is given, or important information is omitted, in an insurance transaction or claim.

Data that can be collected from the patients:

  • Patient name
  • Age and gender (sample values: 24 - M, 30 - F)
  • Date of service
  • Service provider
  • Diagnosis reports
  • Services utilized
  • Cost of service
  • No. of visits

Apart from the data collected from the patients, we will gather a second dataset with the details of all hospitals in the locality, including their diagnoses and quality:

  • Quality ratings
  • Average no. of patients
  • Doctors' availability (sample value: Mon, Tue)
  • Average service cost


Evaluation: Here we cannot accurately classify whether a transaction is fraudulent, because of the challenges in collecting the data related to the hospitals.

We can use data mining algorithms such as decision trees and naïve Bayes classification to classify whether a claim is fraudulent, based on:

  • The deviation of the incurred cost from the average cost in that hospital, in that locality, and for the same hospital in other localities.
  • A comparison of the patient's rating of the service/quality with the ratings from other patients.
  • The services utilized and the doctor’s availability on the date of service.
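As a minimal sketch of how the three signals above could be turned into inputs for a classifier, the following derives them for a handful of made-up claims. The record layout, the field names (`hospital`, `cost`, `rating`, `service_date`), the `doctor_days` lookup, and all values are assumptions for illustration. Only the same-hospital average is computed here; the locality-level comparisons would follow the same pattern.

```python
from datetime import date
from statistics import mean

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # date.weekday() order

# Hypothetical data: hospital names, field names and values are illustrative only.
doctor_days = {"A": {"Mon", "Tue"}, "B": {"Wed"}}
claims = [
    {"hospital": "A", "cost": 120.0, "rating": 4, "service_date": date(2017, 2, 6)},
    {"hospital": "A", "cost": 130.0, "rating": 4, "service_date": date(2017, 2, 7)},
    {"hospital": "A", "cost": 390.0, "rating": 1, "service_date": date(2017, 2, 9)},
    {"hospital": "B", "cost": 200.0, "rating": 5, "service_date": date(2017, 2, 8)},
]


def claim_features(claim, all_claims, availability):
    """Derive the three signals for one claim, relative to the other
    claims at the same hospital."""
    peers = [c for c in all_claims
             if c["hospital"] == claim["hospital"] and c is not claim]
    if peers:
        avg_cost = mean(c["cost"] for c in peers)
        avg_rating = mean(c["rating"] for c in peers)
        cost_deviation = (claim["cost"] - avg_cost) / avg_cost
        rating_gap = claim["rating"] - avg_rating
    else:
        # No comparable claims, so no deviation can be measured.
        cost_deviation, rating_gap = 0.0, 0.0
    weekday = DAYS[claim["service_date"].weekday()]
    return {
        "cost_deviation": cost_deviation,
        "rating_gap": rating_gap,
        "doctor_available": weekday in availability.get(claim["hospital"], set()),
    }


features = [claim_features(c, claims, doctor_days) for c in claims]
```

In this toy data the third claim stands out on all three signals: its cost is far above the average of the other claims at hospital A, its rating is well below theirs, and its service date falls on a day when no doctor is listed as available. Feature rows like these are what a decision tree or naïve Bayes classifier would be trained on, given labelled examples.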

Here we also have to consider small deviations, because that is where most of the fraudulent transactions are likely to occur: the extra charges will be added cautiously, for example to the costs incurred for medicines, consultation, etc.

We will also use regression analysis to check whether the average cost for that diagnosis is increasing. The hospital may have upgraded its service based on ratings or feedback and modified the diagnosis steps, which also affects the cost.
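One simple way to run such a check is to fit a straight line to the average cost over time and look at the sign of the slope. The sketch below uses ordinary least squares on made-up monthly averages; the `months` and `avg_cost` values are hypothetical, and a real analysis would also need to judge whether the trend is statistically meaningful.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical average cost of one diagnosis at one hospital, by month.
months = [1, 2, 3, 4, 5, 6]
avg_cost = [100.0, 102.0, 101.0, 105.0, 108.0, 110.0]

trend = ols_slope(months, avg_cost)
cost_is_rising = trend > 0
```

A positive slope suggests that the baseline cost itself is drifting upward, so a claim that looks expensive against an older average may still be legitimate.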

Even though we use analytics, in my view the success of classification for this particular opportunity will depend on domain knowledge and manual analysis, such as comparing the date of service with the doctor’s availability.

Integration: We can integrate the model into the insurance claims functionality. We have to design the application in such a way that, whenever a customer submits a claim, it collects all the details necessary for our analysis. Then we have to clean the data for any missing values, and based on the classification the finance section of the company will take the appropriate decision.
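As a rough sketch of the data collection and cleaning step, a submitted claim could be checked for missing values before it reaches the classifier. The field names below are assumptions modelled on the patient data listed earlier, and `missing_fields` is a hypothetical helper, not part of any existing claims system.

```python
# Hypothetical required fields, modelled on the patient data listed earlier.
REQUIRED_FIELDS = (
    "patient_name", "age", "gender", "date_of_service",
    "service_provider", "diagnosis", "services_utilized", "cost_of_service",
)


def missing_fields(claim):
    """Return the required fields that are absent or empty in a submitted claim."""
    return [f for f in REQUIRED_FIELDS if claim.get(f) in (None, "")]


# Illustrative submission with two gaps.
submitted = {
    "patient_name": "Jane Doe", "age": 30, "gender": "F",
    "date_of_service": "2017-02-07", "service_provider": "",
    "diagnosis": "X-ray", "services_utilized": "Radiology",
}
gaps = missing_fields(submitted)
```

A claim with a non-empty result could be sent back to the customer for completion rather than being passed to the model with gaps.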




Comment by Aravind Reddy on February 9, 2017 at 6:34am

Hi Mirjana,

Yeah, I am planning to add more content to this article. Currently I am doing research on how to gather the data mentioned above.

As of now I have not thought of any particular country, but I believe that this can be implemented anywhere.

Thanks for your time in viewing the post.

Comment by Mirjana Panova on February 7, 2017 at 6:47pm

Is there any more to this article???

Which country's health care industry?
