Does anyone have advice on good data quality audit processes or checks?

I have a live data set and a number of criteria that each record must meet. These are sensible, logical checks: missing fields, conditional rules (if this field is x, then that field must have a date), and anomaly detection.


Does anyone have any suggestions? Currently the process I run is very manual: I filter the data by hand to check it. I wonder if there is a better way.

I am thinking of an algorithm that I can simply run, which performs these checks and flags outliers and any records that do not pass.

Tags: audit, data, quality, wrangling

Replies to This Discussion

Here is a good place to start: https://github.com/SauceCat/pydqc. Good luck!
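
For the kind of press-run check described in the question, a minimal rule-based sketch in plain Python might look like the following. It does not use pydqc and is not taken from it. The field names (id, status, closed_date, amount), the "closed" rule, and the outlier threshold are all hypothetical placeholders; the outlier test shown is a modified z-score based on the median and MAD, which is only one of many possible anomaly checks.

```python
from statistics import median

# Hypothetical required fields; replace with the columns of your own data set.
REQUIRED_FIELDS = ["id", "status", "amount"]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def check_required(record):
    """Flag every required field that is missing from the record."""
    return [f"missing {f}" for f in REQUIRED_FIELDS if is_missing(record.get(f))]


def check_closed_has_date(record):
    """Conditional rule (assumed): if status is 'closed', closed_date must be set."""
    if record.get("status") == "closed" and is_missing(record.get("closed_date")):
        return ["status is closed but closed_date is missing"]
    return []


def find_outliers(records, field, threshold=3.5):
    """Return indices of records whose numeric field has a modified z-score above threshold."""
    values = [(i, r[field]) for i, r in enumerate(records)
              if isinstance(r.get(field), (int, float))]
    if len(values) < 3:
        return set()  # too few points to judge
    med = median(v for _, v in values)
    mad = median(abs(v - med) for _, v in values)
    if mad == 0:
        return set()  # no spread, so nothing can be called an outlier
    return {i for i, v in values if 0.6745 * abs(v - med) / mad > threshold}


def audit(records):
    """Run all checks; return {record index: [reasons]} for failing records only."""
    report = {}
    for i, record in enumerate(records):
        problems = check_required(record) + check_closed_has_date(record)
        if problems:
            report[i] = problems
    for i in sorted(find_outliers(records, "amount")):
        report.setdefault(i, []).append("amount is an outlier")
    return report


if __name__ == "__main__":
    # Made-up sample rows, purely for illustration.
    sample = [
        {"id": 1, "status": "open", "closed_date": None, "amount": 10.0},
        {"id": 2, "status": "closed", "closed_date": None, "amount": 12.0},
        {"id": 3, "status": "closed", "closed_date": "2019-03-01", "amount": 11.0},
        {"id": 4, "status": "open", "closed_date": None, "amount": None},
        {"id": 5, "status": "open", "closed_date": None, "amount": 9.0},
        {"id": 6, "status": "open", "closed_date": None, "amount": 500.0},
    ]
    for index, reasons in audit(sample).items():
        print(index, reasons)
```

Each rule is a small function that returns a list of reasons, so adding a new check means writing one more function and calling it inside audit. The same structure carries over to pandas if the data already lives in a DataFrame.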

© 2019 Data Science Central ®