Subscribe to DSC Newsletter

Excel has become the de-facto spreadsheet tool for business users. It currently has approximately 500 million users, and it might have a bright future ahead.

"To a large extent, Excel (and the alike) democratized programming by making its power available to non-experts to solve their problems on their own. What will be the Excel of machine learning? Chances are it may just be ... Excel. It makes sense to collocate AI with the data."
François Chollet, Artificial Intelligence Researcher and creator of Keras

Excel divides the world into those who love it, those who hate it, and many have a love-hate relationship. Emotions aside, given that the corporate world we live in has fundamentally changed since Excel was released in 1987: What is a viable use case for Excel today, and what is not?

This question started boggling my mind, once I realized how many large businesses literally still "run on Excel", such as banks of which many do their risk assessment in Excel across the entire organization. Apparently, no company wants to talk about their misuse of Excel, and that is often the case with organizations and individuals alike: The bigger the problem, the less likely it is they are going to admit it.

Below my definition of what appropriate use cases for Excel are, and which are not. Please share your thoughts, make additions (my list is by all means not all-encompassing). I am happy to see this discussion kick off. That said, here's my list:

These are viable Excel use cases in my view

Education - Excel is a great platform to learn the fundamentals of working with data from ETL, to math, probability, statistics, and their application in the context of business analytics and data science. It's almost like learning to ride a bicycle with training wheels.

Rapid Prototyping - Excel is an excellent tool to play around with data, brainstorm observations and discuss the possible hypothesis. Modern Excel provides relational database capabilities, and it can ingest data from a broad range of sources, such as Google BigQuery, IBM, Oracle or SAP HANA databases. Surprisingly few Excel power users are familiar with those capabilities.

Small Businesses - in the case of businesses with a small handful of analysts, Excel can be a viable solution. However, this is a double edge sword. Every company has their "Excel Guru", and this person can unwillingly cause huge damage when quitting his job. His spreadsheet creations might be likely incomprehensible for others.

Self Service Portals - whenever data is appropriately managed, and business users are only allowed to access and consume well-prepared data, Excel can shine, as well as data visualization tools such as Microsoft Power BI for example.

Data Storytelling - it's easy for Excel users to switch to Power BI, which comes with Power Query and Power Pivot, very much like modern Excel. Power BI has web publishing capabilities allowing analysts to share their insights on internal web portals or their corporate website.

These are not viable Excel use cases in my view

Whenever:

  • the subject of analysis is mission-critical, sensitive data
  • datasets are beyond 100MM rows and thus just too big to handle for Excel
  • data is semi-structured or unstructured
  • insights need to be gained in real time
  • entire teams work collaboratively on data
  • high level of automation is critical for the company's success
  • operational processes and decisions need to be centralized
  • regulatory authorities require auditability

What are your thoughts: What is a viable use case for Excel, and what is not?

Views: 737

Reply to This

Replies to This Discussion

I found your article to be insightful.  In my work we tend to use R, SAS and Excel.  We are not giving up Excel anytime soon.

My biggest issue with Excel is that its use is extremely personal.  Everybody has their own way of structuring spreadsheets, of accessing data and interacting with VBA.  When tasks are passed between people the new person either recreates the spreadsheet, reverts to a manual process, or uses the existing spreadsheet without understanding how it works or how to recognize problems. 

A smaller problem I have with Excel is that as spreadsheets become more complicated they become much harder to validate.  More than once I have found that people have inserted mistakes into columns of copied formulas. 

While Excel is the tool that I use most, I am wondering why it is so slow when processing (say) one million rows. I'am used to code in Perl (and its ancestors, like C or C++, which share the same DNA) but there are unexpected benefits about Excel that few people are aware of. For instance, the random number generator in Excel is good enough, while the one in Perl, can only produce 32,768 values, which is useless in all my applications (I wrote my own random number in Perl, which can produce trillions of different values, enough for my needs.)

By the way, here is an example of Excel used to deal with unstructured data: https://www.datasciencecentral.com/profiles/blogs/advanced-machine-...

Thanks for the open-minded view of Excel.  We've had some great success in using Excel as a user interface platform, pulling data and pushing results into a DBMS back end.  We use this as an alternative to a web front end if there are a small number of users who have been working with Excel spreadsheets, it makes the transition to a more structured application a bit gentler if they can fall back into Excel for ad hoc calculations or updates.  We also use it as a prototyping tool where we can provide a "quick and dirty" user interface (although they end up being very effective, if a bit outdated in style) to help establish a more structured process and move the users quickly to a shared data back end while establishing and validating the front end requirements.  Obviously there are many limitations, but I believe that it can be a very effective tool for advanced analytics and budget-forecast-actual efforts, and I am sure there are many other places too.

I think an undervalued benefit of Excel, especially in corporations both large and small, is its ability to easily and reliably share information. Users often prepare their data in Excel and then simply forward the file to the recipient. Recipients then easily copy and paste or use their preferred ETL tool to extract the data. No date formatting or delimiter issues. It is the convention in Germany to use comma as the thousands separator, dot as the decimal delimiter and semicolon as the field delimiter. CSV format is too unreliable in this regard. Other formats such as xml or json too unwieldy for many users.

RSS

Videos

  • Add Videos
  • View All

Follow Us

© 2018   Data Science Central ®   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service