There’s no such thing as perfect data, but several factors qualify data as good.
Following a few best practices will ensure that any data you collect and analyze will be as good as it gets.
1. Collect Data Carefully
Good data sets will come with flaws, and these flaws should be readily apparent. For example, an honest data set will have any errors or limitations clearly noted. However, it’s really up to you, the analyst, to make an informed decision about the quality of the data once you have it in hand. Use the same due diligence you would use when making a major purchase: once you’ve found your “perfect” data set, run more web searches with the goal of uncovering any flaws.
Some key questions to consider:
Three great sources to collect data from
US Census Bureau
U.S. Census Bureau data is available to anyone for free. To download a CSV file:
The wide range of good data held by the Census Bureau is staggering. For example, I typed “Institutional” to bring up the population in institutional facilities by sex and age, while data scientist Emily Kubiceka used U.S. Census Bureau data to compare hearing and deaf Americans.
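Once a CSV file is downloaded, a quick look at its size and its gaps is a reasonable first quality check. The sketch below uses only Python's standard library. The sample rows and column names are invented for illustration and do not reflect actual Census Bureau output, and `summarize_csv` is a hypothetical helper name.

```python
import csv
import io

# Stand-in for a downloaded file; these rows and column names are invented.
# With a real download, pass open("your_file.csv", newline="") instead.
SAMPLE_CSV = """age_group,sex,population
Under 18,Male,1200
Under 18,Female,
18 to 64,Male,5400
18 to 64,Male,5400
"""

def summarize_csv(handle):
    """Return row count, blank cells per column, and exact duplicate rows."""
    reader = csv.DictReader(handle)
    missing = {name: 0 for name in reader.fieldnames}
    seen = set()
    duplicates = 0
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            # Short rows yield None, so treat None and whitespace as blank.
            if not (row[name] or "").strip():
                missing[name] += 1
        key = tuple(row[name] for name in reader.fieldnames)
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"rows": rows, "missing": missing, "duplicates": duplicates}

summary = summarize_csv(io.StringIO(SAMPLE_CSV))
print(summary)
```

Blank cells and repeated rows are not always errors, but each one is worth a note explaining how you handled it.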
Data.gov contains data from many different US government agencies on topics including climate, food safety, and government budgets. There's a staggering amount of information to be gleaned. As an example, I found 40,261 datasets for "covid-19".
Kaggle is a huge repository for public and private data. It’s where you’ll find data from The University of California, Irvine’s Machine Learning Repository, data on the Zika virus outbreak, and even data on people attempting to buy firearms. Unlike the government websites listed above, you'll need to check the license information before re-using a particular dataset. Plus, not all data sets are wholly reliable: check your sources carefully before use.
2. Analyze with Care
So, you’ve found the ideal data set, and you’ve checked it to make sure it’s not riddled with flaws. Your analysis is going to be passed along to many people, most (or all) of whom aren’t mind readers. They may not know what steps you took in analyzing your data, so make sure your steps are clear with the following best practices:
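One possible way to keep your steps visible to readers is to record each cleaning step as you apply it. This is only a sketch under assumptions: the readings, the 100-unit cutoff, and the helper name `run_steps` are all invented for illustration.

```python
def run_steps(data, steps):
    """Apply each (description, function) pair in order and log the effect."""
    log = []
    for description, func in steps:
        before = len(data)
        data = func(data)
        log.append(f"{description}: {before} -> {len(data)} values")
    return data, log

# Invented readings; None marks a missing measurement.
readings = [12.0, None, 15.5, 980.0, 14.2]

steps = [
    ("Drop missing readings",
     lambda xs: [x for x in xs if x is not None]),
    # The cutoff of 100 is an assumed limit, chosen only for this example.
    ("Drop readings above 100",
     lambda xs: [x for x in xs if x <= 100]),
]

cleaned, log = run_steps(readings, steps)
for line in log:
    print(line)
```

The printed log can travel with the analysis, so anyone who receives it can see what was removed and why.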
3. Don’t Be the Weak Link in the Chain
Bad data doesn’t appear from nowhere. That data set you started with was created by someone, possibly several people, in several different stages. If they too have followed these best practices, then the result will be a helpful piece of data analysis. But if you introduce errors and fail to account for them, those errors will be compounded as the data gets passed along.
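To get a feel for how errors compound, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every stage independently corrupts the same fraction of records; the 2% rate is invented, and real pipelines rarely behave this neatly.

```python
def share_still_correct(error_rate, stages):
    """Share of records untouched by error after a number of hand-offs.

    Assumes each stage independently corrupts the same fraction of records.
    """
    return (1 - error_rate) ** stages

# Illustrative 2% error rate per stage (an assumption, not a measured value).
for stages in (1, 3, 5):
    print(stages, round(share_still_correct(0.02, stages), 4))
```

Under these assumptions, a small per-stage error rate leaves roughly one record in ten affected after five hand-offs, which is why each person in the chain needs to document and correct what they can.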
Data set image: Pro8055, CC BY-SA 4.0 via Wikimedia Commons