Guest blog from David Lefkowich
Almost every notable project we’ve completed with our clients over the past 3 years has had one major factor in common. Each originated from a complicated spreadsheet-driven data manipulation and reporting process that the company had outgrown. The corporate business needs evolved or expanded. Perhaps a new regulation or requirement came into effect, or an acquisition took place and new data needed to be integrated into the reporting process. The reasons are many.
The data inputs, or the processing rules, or the output reports changed one too many times and the process broke. Perhaps the existing process became too complicated, too ‘kludgy’ for the spreadsheet-driven solution to handle. Or perhaps the spreadsheet expert who built it was no longer available to fix or maintain it and it was impossible to reverse engineer… or a host of other factors.
Spreadsheets are such powerful and flexible tools, and they’re so ubiquitous, that we have gotten into the habit of using them as our ‘go-to’ tool for just about everything numbers-related. As great as they are, spreadsheets and spreadsheet-driven processing are not always the best solution for the jobs at hand.
There is a quickly growing pool of powerful software tools in the market designed specifically with end-users in mind, to help them do many of the manual, onerous, time-consuming and error-prone tasks that they currently default to spreadsheets for.
If your concerns or tasks at hand require any of the following, then it behooves your organization to consider replacing existing spreadsheet-driven tasks or systems in support of other tools that were designed for these purposes.
Data Integration – Anyone who has had to append additional rows or columns into a dataset or merge multiple datasets together knows what an awkward, time-consuming and finicky process this is – especially when the files are not all the exact same format or have the same column names. When there are many files or datasets to join, the task moves from painful and onerous to excruciating and complicated. There are tools on the market that are easy to access and learn, designed to simplify and even automate this process. There is no longer any excuse for doing copy and paste or writing scripts for data integrations.
Data Cleansing and Normalization – Data cleansing is a huge and never-ending task. Data prep tools can simplify and automate a huge percentage of your data cleansing and normalization for you, leaving you the task of addressing only the issues that require human intervention. Plus, once the task is completed the first time, then the data cleansing rules can remain in place to automate this process for the next time. Running queries and sorts to do repetitive manual data corrections should be a thing of the past.
Data Summarizations and Data Discovery – What is your data telling you? A spreadsheet expert can easily create all kinds of sorts and filters, charts and pivot tables to explore your data – even if the methods are manual. But most users aren’t spreadsheet experts. There are end-user tools available that give “regular” spreadsheet users the ability do ad hoc querying and data summarizing without relying on an expert.
Data Analysis, Visualization and Reporting – Charts, graphs and tables for analysis and reporting are also no longer only the domain of power users. Some of today’s data analysis tools make it easy for all users to explore the data and create charts and tables of notable interest – with intuitive point and click / drag and drop functions. Also, the charts and tables can be set up to seamlessly and automatically reflect data updates and automate the creation of recurring reports for export, as needed, into standard format output files or existing templates.
Performance – Available memory and power resources are enabling or limiting factors in users’ computers. Spreadsheets have size limitations that are pretty hard and fast and, for the average data analyst user on a PC, not difficult to exceed. Once that file size is reached the system performance degrades dramatically. Many tools are optimized to manage these resources more efficiently, allowing users to work effectively with larger file sizes and datasets.
Data Governance, Data Integrity, Transparency and Visibility – From IT’s perspective, this may be the biggest concern about spreadsheet-driven processes. There is virtually no control, audit trail or visibility into what a user can do in a spreadsheet. From a data integrity perspective, you need to find tools (and they do exist) that will track or manage access to source data, easily allow you to identify all operations performed on your data, and trace back how the results in any report or analysis were attained – without imposing on or impeding users’ abilities to creatively and efficiently complete their work.
Reminder: the easier your data tools are to implement, learn and use - and the quicker they can provide demonstrable results - the more likely it is that your users will adopt them. Otherwise they will continue to use the tools they know, even if they are not the best tools for the job.