What are your chances of being audited? And how can data science help answer this question, for each of us individually?
Some factors increase the chance of an audit, including:
Companies such as Deloitte or KPMG probably compute tax audit risks quite accurately (including the penalties in case of an audit) for their large clients, because they have access to large internal databases of audited and non-audited clients.
But for you and me, how could data science help predict our risk of an audit? Is there data publicly available to build a predictive model? Or is it a good idea for a new startup: creating a website where
Indeed, this would amount to reverse-engineering the IRS algorithms.
Which metrics should be used to assess tax audit risk? And how accurate can the prediction be? Could such a system attain 99.5% accuracy, that is, wrong predictions for only 0.5% of taxpayers? Right now, if you tell someone "you won't be audited this year", you are correct 98% of the time, so any useful model has to beat that baseline. More interestingly, what level of accuracy can be achieved for higher-risk taxpayers?
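To make the baseline concrete, here is a minimal Python sketch comparing the "nobody gets audited" prediction with a made-up model on a made-up population. Only the 2% audit rate is implied by the 98% figure above; the 1,000 taxpayers, the model's predictions, and all function names are assumptions for illustration. It shows why accuracy alone says little when audits are rare, and why recall and precision on the audited group are worth tracking too.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn), treating True as 'audited'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    return tp, fp, fn, tn


def accuracy(actual, predicted):
    """Share of taxpayers whose audit outcome was predicted correctly."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return (tp + tn) / len(actual)


def recall(actual, predicted):
    """Share of audited taxpayers that the prediction flagged."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(actual, predicted):
    """Share of flagged taxpayers who were actually audited."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical population: 1,000 taxpayers, 20 of them audited (2%).
actual = [True] * 20 + [False] * 980

# Baseline: tell everyone "you won't be audited this year".
always_no = [False] * 1000

# Hypothetical model: flags 30 taxpayers, 15 of whom are really audited.
model = [True] * 15 + [False] * 5 + [True] * 15 + [False] * 965

for name, pred in [("baseline", always_no), ("model", model)]:
    print(name, accuracy(actual, pred), recall(actual, pred), precision(actual, pred))
```

In this toy setup both predictions have the same accuracy, yet the baseline never identifies a single audited taxpayer, while the hypothetical model catches most of them. That gap is what the question about higher-risk taxpayers is really about.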
Finally, if your model predicts both the risk of an audit and the penalty if audited, then you can make smarter decisions about which risks to take and which to avoid. This is pure decision science to recoup dollars from the IRS, not by exploiting tax law loopholes (dangerous), but through honest, fair and smart analytics: in a nutshell, outsmarting the IRS data science algorithms rather than outsmarting tax laws.
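The decision logic described above can be sketched as a simple expected-value calculation. This is an illustrative simplification, not an actual tax rule: it assumes that an audit always reverses the tax saving and adds a penalty, and the function names and dollar amounts are hypothetical.

```python
def expected_net_benefit(tax_saving, audit_probability, penalty_if_audited):
    """Expected dollar gain from a filing choice.

    Assumption (for illustration only): if audited, the saving is
    reversed and the penalty is charged on top of it.
    """
    if not 0.0 <= audit_probability <= 1.0:
        raise ValueError("audit_probability must be between 0 and 1")
    expected_loss = audit_probability * (tax_saving + penalty_if_audited)
    return tax_saving - expected_loss


def worth_taking(tax_saving, audit_probability, penalty_if_audited):
    """True if the expected net benefit is positive."""
    return expected_net_benefit(
        tax_saving, audit_probability, penalty_if_audited
    ) > 0


# Hypothetical numbers: a $1,000 saving under two different risk profiles.
low_risk = expected_net_benefit(1000.0, 0.02, 200.0)
high_risk = expected_net_benefit(1000.0, 0.50, 1500.0)

print("low risk: ", round(low_risk, 2), worth_taking(1000.0, 0.02, 200.0))
print("high risk:", round(high_risk, 2), worth_taking(1000.0, 0.50, 1500.0))
```

With a predicted audit probability and penalty for each choice, the same comparison can be repeated item by item, which is where the quality of the risk model starts to translate into dollars.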