Five Critical Words When Searching for Data Prep Tools

This is a guest blog from David Lefkowich, VP Sales and Marketing for FreeSight Software.

I have yet to find an organization that doesn’t have occasional or consistent data quality concerns. Some concerns are innocuous inconveniences, but others can have large and costly impacts.


Whatever the source(s) of your data problems, you must address them somewhere in your processes to report accurate results and perform other business critical data-driven tasks.


When looking for the right data prep and analysis tool(s) for your organization, you’ll want to look for product features that are relevant to your specific types of data quality issues. And of course you’ll spend time on the obvious:


Price Point. Ease of Implementation. Ease of Learning.


Here are five additional words to keep in mind as “must-have” capabilities when you look for software tools that can help you with data cleansing, analysis and reporting:


Dynamic. Persistent. Transparent. Reversible. Automatic.


Dynamic: Your data prep tool should allow you to work with a copy or representation of your data in real time, so that the effects of any changes made, functions applied, or manipulations performed can be seen immediately. If you need to correct or edit your process later on, you should be able to modify the appropriate step without having to repeat or rebuild all of the subsequent steps. All changes should ripple through the process automatically.
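To make the "Dynamic" idea concrete, here is a minimal Python sketch, not a description of how any particular product works. It assumes a process can be modeled as an ordered list of step functions applied to a copy of the source data. The sample rows and the step names (trim_names, lower_names, title_names, parse_amounts) are hypothetical.

```python
import copy

# Hypothetical sample data; names and values are illustrative only.
source_rows = [
    {"name": " Alice ", "amount": "10"},
    {"name": "BOB", "amount": "25"},
]

def trim_names(rows):
    # Remove leading and trailing whitespace from each name.
    return [{**r, "name": r["name"].strip()} for r in rows]

def lower_names(rows):
    # Convert each name to lowercase.
    return [{**r, "name": r["name"].lower()} for r in rows]

def title_names(rows):
    # Convert each name to title case.
    return [{**r, "name": r["name"].title()} for r in rows]

def parse_amounts(rows):
    # Convert the amount column from text to an integer.
    return [{**r, "amount": int(r["amount"])} for r in rows]

def run(steps, rows):
    # Always start from a copy so the source data is never modified.
    result = copy.deepcopy(rows)
    for step in steps:
        result = step(result)
    return result

steps = [trim_names, lower_names, parse_amounts]
first = run(steps, source_rows)

# Change only the second step; the later step is reused unchanged
# and the new result ripples through when the process is re-run.
steps[1] = title_names
second = run(steps, source_rows)
```

Only the edited step is touched here. The steps after it are neither rewritten nor rebuilt, which is the behavior described above.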


Persistent: Perform your work and build your cleansing or analysis process once. Save it as a query or program. From then on, whenever you have updates to your data and you need to perform the same task again, you should be able to use the same query or program with a simple ‘update’ or ‘refresh’ command. You shouldn’t have to redo your work from scratch.
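As a rough illustration of "Persistent", the Python sketch below saves a cleansing process as data (a JSON recipe) and replays it on a later batch. The recipe layout, the OPERATIONS registry and the sample rows are all assumptions made for this example, not a format used by any specific tool.

```python
import json

# Hypothetical registry mapping operation names to functions.
OPERATIONS = {
    "strip": lambda value: value.strip(),
    "upper": lambda value: value.upper(),
}

def apply_recipe(recipe, rows):
    # Apply each recipe step, in order, to a copy of every row.
    result = []
    for row in rows:
        new_row = dict(row)
        for step in recipe:
            column = step["column"]
            new_row[column] = OPERATIONS[step["op"]](new_row[column])
        result.append(new_row)
    return result

# Build the process once...
recipe = [
    {"column": "state", "op": "strip"},
    {"column": "state", "op": "upper"},
]
# ...and save it. Here it is kept as a string; it could be written to a file.
saved = json.dumps(recipe)

# Illustrative data: an initial batch and a later update.
january = [{"state": " ny "}, {"state": "ca"}]
february = [{"state": "tx "}]

# "Refresh": reload the saved recipe and run it on each batch.
loaded = json.loads(saved)
january_clean = apply_recipe(loaded, january)
february_clean = apply_recipe(loaded, february)
```

The second batch is cleaned with the same saved recipe, so none of the work is redone from scratch.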


Transparent: You should have easy access to a workflow or audit trail of all data manipulations, changes, calculations or corrections made to your data, preferably onscreen. With drill-down or equivalent functions, you should be able to quickly answer any question of “what is this data and where did it come from?”


Reversible: Have you ever deleted cells, rows or columns of data by mistake, or made some other change, and wished you hadn’t? Your tool should let you work in real time and also reverse or correct any action anywhere in the process, at any time, even after you’ve saved the document and reopened it later. Original data should always be preserved and accessible whenever needed.
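One way to picture "Transparent" and "Reversible" together is an edit log layered on top of untouched original data. The Python sketch below is only an illustration under that assumption. The EditLog class, its method names and the sample rows are hypothetical.

```python
class EditLog:
    """Keeps the original rows untouched and records every edit as a step."""

    def __init__(self, original_rows):
        # Store a private copy so the original data is always preserved.
        self._original = tuple(dict(r) for r in original_rows)
        self._edits = []  # list of (description, function) pairs

    def add(self, description, func):
        # Record an edit along with a human-readable description.
        self._edits.append((description, func))

    def remove(self, index):
        # Reverse an edit anywhere in the process by dropping it from the log.
        return self._edits.pop(index)[0]

    def history(self):
        # The audit trail: what was done, in order.
        return [description for description, _ in self._edits]

    def original(self):
        return [dict(r) for r in self._original]

    def current(self):
        # Recompute the current view by replaying all recorded edits.
        rows = self.original()
        for _, func in self._edits:
            rows = func(rows)
        return rows


log = EditLog([{"id": 1, "qty": 5}, {"id": 2, "qty": -3}, {"id": 3, "qty": 8}])
log.add("drop rows with negative qty",
        lambda rows: [r for r in rows if r["qty"] >= 0])
log.add("double qty",
        lambda rows: [{**r, "qty": r["qty"] * 2} for r in rows])

after_edits = log.current()

# Undo the first edit (the accidental deletion) while keeping the later one.
undone = log.remove(0)
restored = log.current()
```

Because edits are replayed from the preserved original, removing an early step brings the deleted row back without disturbing the later step, and history() answers "what was done to this data?" at any point.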


Automatic: Your tool should simplify and automate repetitive, time-consuming manual steps. Then, once your data cleansing, analysis or report-generating processes or models are completed, tested and correct, your tool should allow scripting that enables them to run automatically, in real time or in batch.


That said, whichever tool you ultimately choose, the most critical point is to ensure that your end users (the people who have their hands in the data and are responsible for the data deliverable) can learn it quickly and get direct benefit from it. If they don’t, they will be extremely creative at finding ways not to use it.


David Lefkowich is the VP Sales and Marketing for FreeSight Software, a data integration, cleaning, analysis and reporting tool (www.freesightweb.com).


