The market for data visualization software has bloomed. I'm suspicious.

Companies like Tableau, Spotfire, SAS Visual Analytics, Qlik and Zoomdata are positioning their tools far beyond traditional business intelligence. Capabilities for graphically navigating data, recognizing patterns and finding relationships are growing in both functional and economic scope. These new tools can provide chart forms that were only imagined in the last decade, like word clouds, circular hierarchies, tree maps and stream graphs. Check out the D3 (Data-Driven Documents) JavaScript library for inspiration. All this innovation raises a critical question:

Is data visualization

  • an entirely new dimension of data management
  • a subject within analytics, emerging with new tools
  • a rebranding of old subjects like business intelligence, dashboards and reporting
  • or something else?

On the one hand, visual is not new. In 1983, Tufte wrote "The Visual Display of Quantitative Information", which has never stopped selling. In 1987, Rockart and De Long offered "Executive Support Systems" and launched the very user-centric EIS age. Comshare, IRI, Pilot and Arbor Software launched the 1990s OLAP generation with its own concepts. And over the last decade, we've seen familiar players in business intelligence leapfrog each other, continually competing on presentation. Face it - "visual" sells software.

Analytic visuals aren't new either. Archimedes had charts. He just used a pen. The statistical suites all have rough-and-ready graph capabilities. Basic, un-pretty plots are among the first steps of exploratory data analysis. So, treating data visualization as innovative comes with a very high burden of proof.

I could buy an argument that Big Data has the consequence of Big Graphics. We are capturing more detail, we can store it and we need to study it. So, data science has grounds to need advanced visuals. But I'm guessing that data visualization licenses outnumber data scientists a hundredfold.

Tufte's original book included a famous duck. The picture illustrated, for Tufte, irrelevant and useless presentation. So far, I haven't seen reasoning to treat data visualization as much more than a next-generation duck.

Comment by Andrej Lapajne on October 23, 2014 at 4:16am

That's a very good point. The Internet is flooded with "ducks": bad visualizations, meaningless dashboards and inefficient reports. Some of them might "look good" and very sexy, but only a small portion of them really help readers understand the messages and reveal the stories that the numbers want to convey. The situation with internal reports and business presentations within corporations is similar.

However, you can always create a bad visualization or a good (efficient) one. A good visualization does help understand data. Quite universally and even more so when datasets are complex. Authors like J. Bertin, E. Tufte and R. Hichert have shown us how.

Also, not all visualization and reporting tools are the same. For example, at Zebra BI (check examples at http://zebra.bi) we have tried to focus on functional and consistent visualizations instead of decoration. Clear and appropriate business charts, proper labelling and correct scaling are the 3 key ingredients. Software should help users achieve that.

Comment by Sean McClure on October 20, 2014 at 9:30am

Data visualization is an important piece of the data science toolbox but, yes, as with any hype, it attracts too many annoying vendors selling more sex than substance. Visualization is useful when there is real data for it to sit on. And the only immediate visualizations you can get out of the box are ones based on simple SQL queries. If your organization is gaining new insight from simple SQL queries then you have bigger problems than a lack of visualization tools. Visualization is best used when new insights are gleaned from advanced analyses on large datasets. If simple queries are revealing 'amazing' trends that you can now show off on dashboards with sexy D3 libraries that's fine...but you're not doing data science. BI is not data science. Data science is machine learning, not Tableau sitting on top of MySQL with a sexy dashboard. Start changing the organization to have a real data-driven culture that takes out the gut feeling and focuses on advanced pattern recognition to discover trends that humans cannot. Sitting some dashboard on top of a SQL query means the human is still doing the pattern recognition...this is NOT data science.

Comment by Richard Ordowich on October 20, 2014 at 6:55am

Most of the visualizations I see are in the form of Datatainment. Correlations gone wild. With the advent of the term "big data" has come the inevitable exaggeration that you can turn data into gold, like alchemy.

People spend too much time trying to pretty up the data rather than addressing the substance behind it, and the audience is mostly data illiterate, not asking relevant or critical questions about the data.

If the data has little substance, dress it up like a duck :)

Comment by Georges Grinstein on October 20, 2014 at 6:31am

Data visualization presents data visually. Your blog above is a data visualization as is a table or a spreadsheet. When the data is quite large (sorry about the singular verb) humans need support to build a mental model to help us reason effectively. Hence graphical visualization techniques act as tools to support us - as buffers (Archimedes' paper, Galileo's drawings, ...) and as objects that can be interacted with for hypothesis generation and exploration. Thus data visualizations can be considered data views augmenting data, or as interfaces supporting model building, whether that model is cognitive or mathematical. Finally visualizations can facilitate interpreting and presenting complex results to others. These latter are presentation visualizations (Tufte, Few, ...). All in all I view visualizations as simply alternative representations of objects we must reason about or deal with. Ideally these are good and well done alternatives - that's key.

© 2019   Data Science Central ®