I am putting this question to current data scientists (DS) for their expert opinion. First, I am fascinated by all aspects of data science. I know a little about some areas and nothing about others, but I am straight up a newbie with a Master's degree, a strong curiosity, and a desire to learn more about every area of the field. I have basically started at the bottom rung, learning on my own, as becoming a ninja DS would certainly complement my career. Since the beginning of my journey, I have questioned "how" and "why" I should learn each thing, with regard to its current and possible future importance or value. Probably the only contribution, outside of healthcare, I could give to you experts is that I am not aligned with any software company and bring no bias or preconceived ideas of where DS is headed.
My first example: recently, I watched an online presentation about IBM Watson and what they referred to as cognitive data, and I was stunned. If this presentation is even halfway true, why would you need DS everywhere?
My second example pertains to self-service analytics and the software companies that pitch this type of "silver bullet": all you need for deep analytics horsepower for your corporate decision-makers is our program. I watched a demo presentation recently (I cannot recall the vendor name), and it looked to me like a click-and-drag program with no need for programming, modeling, or statistical interpretation skills. With a few clicks, the graphics looked boardroom-ready. It was so automated, one could produce the output from a dataset and not even know the meaning, much less field specific questions about the how, when, where, why, etc. as it pertains to the data and findings.
Then I realized this is where the phrase "citizen data scientist" comes from: everyone in management will be expected to automatically know how to capture, manage, analyze, interpret, and visualize their department's data in order to be a data-driven decision-maker.
Considering these two examples, albeit from my own personal observation as an amateur data science enthusiast, I would like to know why you think more and more companies will need real expert DS such as yourselves in the future, say 3, 5, or 10 years out, if or when the above-mentioned techniques become ubiquitous. Sure, there will always be a need and market demand for DS, and yes, the current demand is white hot, but do you really see the growth in the number of real-deal DS continuing? I sure hope demand continues to grow, but I would like to hear your thoughts as to why demand will continue to exceed supply in the timeframe given.
Many thanks, D
I'm by no means an expert, but I think I can add some thoughts here. First we'll start with the overall question: "Is data science needed/necessary/important?" I think it's safe to say yes. The business world has changed structurally over the last decade, so that for a business to be competitive, understand trends, and take profitable actions, it needs to collect, store, use, and analyze data. These days, we're talking about large amounts of data that come from a number of different sources - stores, websites, mobile phones, social media, etc. Since the core of DS is to use a scientific process to analyze and produce actionable results from data (I'm sure there's a better definition out there, but let's go with that for now), the business world needs data scientists now and into the foreseeable future.
Now let's get to your two examples. First off, IBM Watson. Although Watson is very impressive, in the end it's just a tool for understanding data. It's also the only one of its kind out there (from what I know), and even though there might be some other software systems like Watson being built, you still have to go to IBM and ask to use their system, and probably pay a boatload in the process. For some companies it might be worthwhile, but for the majority out there it's not the right solution for a host of reasons.
Let me put it another way. Take a mid-market company that has an overworked IT department, a barebones financial team, and a C-team that's driving everyone to drink by asking for the impossible weekly. The company decides that it needs to be more competitive to drive profits higher, and someone pipes up that because their analysts are overworked already, maybe they need to bring in a bigger gun who can crunch all the data that's sitting around and come up with the stuff they just aren't seeing. Watson is not going to provide this - it's going to be a jack-of-all-trades data scientist who understands databases, coding, business, a bit of finance, and how to talk to people to figure out what's important and what's not. In the end, no single piece of software is going to fix this problem, because it's not a problem, it's a job description.
For the second example - a demo is designed with one specific goal in mind: to sell you the software. Usually the people running the demos are salespeople, and sometimes they bring along more techie types, so you don't really want to trust what you see in a demo. But I think here's where you answered your own question beautifully. You said of the demo: "It was so automated, one could produce the output from a dataset and not even know the meaning, much less field specific questions about the how, when, where, why, etc. as it pertains to the data and findings."
And this is the core of why data scientists will be needed more in the future, and most likely won't be replaced any time soon by a software system. Data can be interpreted - and misinterpreted - in many different ways. If you study data science for a while, you'll quickly understand that there are no silver bullets, no magic formulas, no easy answers. Everything is an approximation, or an educated guess, or a probability. Part of being a successful analyst is questioning the data - over and over again. Is that right? Why am I getting this result? Does this make sense? Where is this data coming from? Are the data sources reliable? Am I missing any data? How am I going to validate this? And so on. These are the important questions, and woe to any business that makes decisions based solely on a software report.
Also - go to a packed boardroom with a printout that you know nothing about, pass it around to everyone and see what happens. You'll get pinned to a wall with questions. For every analysis I've ever presented to a C-team, I've found that I need a decent stack of supporting information to answer all the questions that come from one report. Someone's got to think of those questions, and then try to find answers even before they're asked. It takes business knowledge, experience, and a number of years of putting together analytics - sometimes in record time - to get to that level.
Data scientists are becoming integral to all aspects of business. It's a growing field, and will be for the foreseeable future. Also, until software can start asking questions, and not just answering the ones that are fed to it, there's no fear that software will replace the human in this field anytime soon.
I'd recommend checking out the job descriptions for various data scientist roles. You'll see how enormously complex they are - statistics, programming, business knowledge and experience, financial know-how. Also find someone in your area that's doing DS at a company, offer to take them out to lunch and ask them what their typical day is like. You might be surprised at how versatile you have to be in a job like that.
All my own (and non-expert) $0.02. Take what you can. Good luck!