I found some leftover hamburgers in the fridge and decided to stack a couple of them together to form a colossal “super-burger.” At the time, I didn’t appreciate that doing so would make it almost impossible to physically fit the burger in my mouth. I squished and squeezed the burger until it was flat enough to eat. Such are the problems of physics that sometimes become apparent only after one tests the limits of design. I find that a similar problem occurs with spreadsheets. Spreadsheets are straightforward, powerful, and effective. So it should come as no surprise that people try to fit everything on a single spreadsheet: “Sure, I will do up a spreadsheet to take care of it.” And this is precisely what they do. After all, if a massive relational table can hold astronomical amounts of data, why not put everything on a single spreadsheet? I will be sharing my perspective, or bias, in this blog.

My general observation is that the single-spreadsheet approach - if it works at the outset - can often continue working for at least a couple of weeks. On the other hand, if the single spreadsheet is going to fail, operational failure might occur in less than a month. The crux of the problem is as follows: ease of use cannot substitute for good design indefinitely. Consider quality control in a hospital where managers have decided to start “collecting data.” Since everything needs a beginning, I consider it reasonable for initial efforts to be relatively superficial. However, for the data to provide “actionable insights,” it often cannot simply be aggregated. If the bathroom facilities on the third floor of Wing 4 are being poorly maintained, lumping all of the data together as if location doesn’t matter is hardly constructive. At the same time, if aggregation occurs, it becomes nearly impossible to detect “structural insulation” (i.e. negligence or gross incompetence of managers) except through non-design investigation. By this I mean that the data collected should be defined and determined by patients rather than hospital administrators; only in this manner does the data reflect the underlying reality.
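To make the aggregation point concrete, here is a minimal Python sketch. The scores, the 1-to-5 scale, and the location names are all invented for illustration; the only thing it tries to show is how an overall average can look tolerable while one location is clearly failing.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cleanliness scores (1 = poor, 5 = excellent).
# Locations and values are assumptions made up for this example.
scores = [
    ("Wing 1, Floor 1", 5),
    ("Wing 1, Floor 1", 4),
    ("Wing 2, Floor 2", 5),
    ("Wing 2, Floor 2", 4),
    ("Wing 4, Floor 3", 1),
    ("Wing 4, Floor 3", 2),
]

# Aggregated view: a single number for the whole hospital.
overall = mean(score for _, score in scores)

# Disaggregated view: one average per location.
by_location = defaultdict(list)
for location, score in scores:
    by_location[location].append(score)

location_means = {loc: mean(vals) for loc, vals in by_location.items()}
worst = min(location_means, key=location_means.get)

print(f"Overall mean: {overall:.2f}")
for loc, avg in location_means.items():
    print(f"  {loc}: {avg:.2f}")
print(f"Lowest-scoring location: {worst}")
```

With these made-up numbers the hospital-wide mean sits in the middle of the scale, and only the per-location breakdown points at the third floor of Wing 4.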

So we return to the single spreadsheet where, in all likelihood, somebody has decided to enter scoring data in a column for every record. Then one afternoon we hear the following: “You know - there is a column here for bathroom cleanliness. But there’s a big difference between bathrooms - even between individual stalls. I don’t think this spreadsheet has enough columns to hold everything we need.” Ah, so a decision is made to expand the spreadsheet, which becomes extremely large. It also becomes necessary to sort the rows in order to make location a delineating factor. Sorting the rows by location makes sense. But after the first reporting period, it becomes necessary to sort the rows by reporting period and location. It quickly becomes apparent that simply making the spreadsheet larger is a bad idea. From a structural standpoint, it is necessary to accept the fact that making a file structurally larger is actually the opposite of what one should attempt to do. Bulk and bigness don't lead to inherent advantages. However, design developments are unavoidable. It is necessary to adapt the system as demands change.
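One common alternative to adding columns, offered here only as a sketch and not as something this post prescribes, is to store one observation per row, with reporting period and location as ordinary fields. The `Observation` type, the helper names `ordered` and `select`, and all of the sample values below are hypothetical.

```python
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    """One score for one metric at one location in one reporting period."""
    period: str    # reporting period, e.g. "2019-01" (assumed format)
    location: str  # e.g. "Wing 4 / Floor 3 / Stall 1" (assumed format)
    metric: str
    score: int


observations = [
    Observation("2019-02", "Wing 4 / Floor 3 / Stall 2", "bathroom cleanliness", 2),
    Observation("2019-01", "Wing 4 / Floor 3 / Stall 1", "bathroom cleanliness", 1),
    Observation("2019-01", "Wing 1 / Floor 1 / Stall 1", "bathroom cleanliness", 5),
    Observation("2019-02", "Wing 4 / Floor 3 / Stall 1", "bathroom cleanliness", 3),
]

# A newly tracked stall is just another row; no column has to be added.
observations.append(
    Observation("2019-02", "Wing 4 / Floor 3 / Stall 3", "bathroom cleanliness", 4)
)


def ordered(rows):
    """Sort by reporting period first, then by location."""
    return sorted(rows, key=lambda o: (o.period, o.location))


def select(rows, *, period: Optional[str] = None,
           location_prefix: Optional[str] = None):
    """Keep rows matching a period and/or a location prefix."""
    return [
        o for o in rows
        if (period is None or o.period == period)
        and (location_prefix is None or o.location.startswith(location_prefix))
    ]


for obs in ordered(select(observations, period="2019-02", location_prefix="Wing 4")):
    print(obs.period, obs.location, obs.score)
```

In this layout the sort order is a property of the query rather than of the file, so adding a reporting period or a finer location does not change the structure.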

Now, this blog is disguised as something else. I will reveal right now that I generally don’t care about the amount of data. I believe in handling an unlimited amount. However, the fact that there might be a vast amount of data doesn’t negate the need to ensure its usefulness. More importantly, for those organizations making the effort to collect actionable data, I consider the threat of collecting superficial and overly simplistic data quite high. I call it “census” type data. “Last month, 25 employees were injured on the job. Previously, there had been 32 employees injured. This represents an improvement of nearly 22 percent!” But the numbers can turn in the other direction just as easily. The company is not actually collecting “actionable insights” but rather a bunch of stale hindsights. In order to take action, it is necessary to collect data beyond the immediately quantitative. Design works best when steps are taken to overcome flaws in design.
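The arithmetic behind the injury example is easy to check. The short Python sketch below uses the two counts quoted above; the helper name `percent_change` is my own, and the reverse scenario (going from 25 back up to 32) is a hypothetical added to show how the same census-type figure swings the other way.

```python
def percent_change(previous: float, current: float) -> float:
    """Relative change from previous to current, expressed in percent."""
    if previous == 0:
        raise ValueError("percent change is undefined when previous is 0")
    return (current - previous) / previous * 100


# Counts quoted in the post: 32 injuries, then 25 the following month.
drop = percent_change(32, 25)

# Hypothetical next month: the count returns to 32.
rise = percent_change(25, 32)

print(f"32 -> 25 injuries: {drop:.1f}%")
print(f"25 -> 32 injuries: {rise:.1f}%")
```

Neither figure says anything about where, how, or why the injuries happened, which is the sense in which such counts are hindsight rather than insight.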

© 2019 Data Science Central