Does anyone have advice on good data quality audit processes or checks?
I have a live data set, and each record must meet a number of criteria. These are sensible, logical checks: missing fields, conditional rules (if this field is x, then that field must have a date), and anomaly detection.
Does anyone have any suggestions? Currently my process is very manual: I filter the data to check it. I wonder if there is a better way.
I am thinking of an algorithm that I can simply run, which performs these checks and flags outliers and any records that do not pass.
Here is a good start to do that: https://github.com/SauceCat/pydqc. Good luck!
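If you would rather script the checks yourself, here is a minimal sketch of the three kinds of checks mentioned in the question (missing fields, a conditional date rule, and outlier flagging), using only the Python standard library. The field names (`id`, `status`, `amount`, `closed_date`), the "closed" rule, and the threshold of 3.5 are all made up for illustration; swap in your own. The outlier check uses a modified z-score based on the median, which is just one common choice and may not suit your data.

```python
from statistics import median

# Hypothetical field names, used only for illustration.
REQUIRED_FIELDS = ["id", "status", "amount"]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def check_record(record):
    """Return a list of rule violations for one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if is_missing(record.get(field)):
            problems.append(f"missing {field}")
    # Conditional rule (assumed): a "closed" record must carry a closed_date.
    if record.get("status") == "closed" and is_missing(record.get("closed_date")):
        problems.append("closed without closed_date")
    return problems


def flag_outliers(records, field, threshold=3.5):
    """Return indexes of records whose value is an outlier by modified z-score."""
    values = [
        (i, r[field])
        for i, r in enumerate(records)
        if isinstance(r.get(field), (int, float))
    ]
    if len(values) < 3:
        return set()
    med = median(v for _, v in values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(v - med) for _, v in values)
    if mad == 0:
        return set()
    return {i for i, v in values if 0.6745 * abs(v - med) / mad > threshold}


def audit(records):
    """Run all checks and return {record index: [problems]} for failing records."""
    report = {}
    outliers = flag_outliers(records, "amount")
    for i, record in enumerate(records):
        problems = check_record(record)
        if i in outliers:
            problems.append("amount is an outlier")
        if problems:
            report[i] = problems
    return report


# Made-up sample data to show the shape of the report.
SAMPLE = [
    {"id": 1, "status": "open", "amount": 10.0},
    {"id": 2, "status": "closed", "amount": 12.0, "closed_date": "2023-01-05"},
    {"id": 3, "status": "closed", "amount": 11.0},
    {"id": 4, "status": "open", "amount": None},
    {"id": 5, "status": "open", "amount": 9.0},
    {"id": 6, "status": "open", "amount": 500.0},
]

if __name__ == "__main__":
    for index, problems in audit(SAMPLE).items():
        print(index, problems)
```

Each rule is a plain function, so adding a new check means adding a few lines to `check_record`. Once it works on a sample, you can load your live data into the same list-of-dicts shape (for example with `csv.DictReader`) and run it on a schedule instead of filtering by hand.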
One of my teachers, who is an Excel spreadsheet consultant, explained it to me like this: "Data quality encompasses five characteristics: accuracy, completeness, reliability, relevance, and timeliness." So, if possible and if the system is designed to do so, the procedure attempts to rectify problems as it examines the data quality. The system uses its defined data quality management (DQM) criteria to determine whether or not to address a data quality concern.