10 Data Science, Machine Learning and IoT Predictions for 2017

It's time again to share your predictions for 2017. I did my homework and came up with these 10 predictions. I invite you to post your predictions in the comment section, or write a blog about it. Ramon Chen's predictions are posted here, while you can read Tableau's predictions here. Top programming languages for 2017 can be found here. Gil Press' top 10 hot data science technologies is also worth reading. For those interested, here were the predictions for 2016. Finally, MariaDB discusses the future of analytics and data warehousing in their Dec 20 webinar.

My Predictions

  1. Data science and machine learning will become more mainstream, especially in the following industries: energy, finance (banking, insurance), agriculture (precision farming), transportation, urban planning, healthcare (customized treatments), even government.
  2. Some people, with no familiarity with data science, will want to create a legal framework governing how data can be analyzed and how algorithms should behave, and to force public disclosure of algorithmic secrets. I believe they will fail, though Obamacare is an example where predictive algorithms were required to ignore metrics such as gender and age when computing premiums, resulting in more expensive premiums for everyone.
  3. The rise of sensor data - that is, IoT - will create data inflation. Data quality, data relevancy, and security will continue to be of critical importance.
  4. With the rise of IoT, more processes will be automated (piloting, medical diagnosis and treatment) using machine-to-machine or device-to-device communications powered by algorithms relying on artificial intelligence (AI), deep learning, and automated data science. I am currently writing an article that describes the differences between machine learning, IoT, AI, deep learning and data science. You can sign up on DSC to make sure that you won't miss it.
  5. The frontier between AI, IoT, data science, machine learning, deep learning and operations research will become more fuzzy. Statistical engineering will be present in more and more applications, be it machine learning, AI or data science. 
  6. Many systems will continue to not work properly. The solution will have to be found not in algorithms, but in people. Read my article Why so many Machine Learning Implementations Fail. An example is Google Analytics, which fails to catch huge amounts of robotic traffic that is so rudimentary and so obvious, you don't need any statistical or data science knowledge to filter or block it. People publish elementary solutions to address these issues, yet the problem continues unabated. Fake reviews, fake news, undetected hate speech on Twitter, and undetected plagiarism in Google search are in the same category. Eventually this leaves room for new players to jump in and build a system that will actually work.
  7. Reliance on public data and public news will come under bigger scrutiny. Some say that the failure to predict the elections is a data science failure. In my opinion, it is a different type of failure: it is the failure to recognize that the media are biased (they publish whatever predictions fit their agenda) and maybe even that those doing the surveys are biased or incompetent (there are lies, damn lies, and statistics, as the saying goes). It is also a failure to recognize the very high volatility in these elections, and the fact that day-to-day variations were huge. Anyone able to compute sound confidence intervals that incorporate historical data would have said that the results were not reliably predictable. Finally, I always thought that the winner would be the one best at manipulation and playing tricks, be it hacking or paying the media.
  8. More and more data cleaning, pre-processing, and exploratory data analysis will be automated. We will also face more unstructured data, along with powerful ways to structure it. Multiple algorithms and models will increasingly be blended together to provide the best pattern-recognition and predictive systems, and boost accuracy.
  9. Data science education will evolve, with perhaps a comeback of strong university curricula run by leading practitioners, and fewer people finding a job through data science camps alone, as many of these camps do not train you to become a data scientist, but instead a Python / R / SQL coder with classic, elementary, even outdated and dangerous statistical knowledge. Either way, data camps will have to evolve, or risk becoming another kind of University of Phoenix.
  10. Attacks against data-dependent infrastructure will switch from stealing or erasing data, to modifying data. Some will be launched from IoT devices if security holes are not fixed.
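Prediction 6 claims that much robotic traffic is so rudimentary it can be filtered without any statistical knowledge. As a minimal sketch of that idea, here is a purely rule-based filter over hypothetical log records (the field names and thresholds are illustrative assumptions, not any real analytics API):

```python
# Hypothetical hit records with user_agent, pages, session_seconds fields.
# Known bot-like substrings often found in user-agent strings.
BOT_TOKENS = ("bot", "crawler", "spider", "curl", "python-requests")

def looks_robotic(hit):
    """Flag a hit as robotic using obvious, non-statistical rules."""
    ua = hit.get("user_agent", "").lower()
    if any(token in ua for token in BOT_TOKENS):
        return True
    # Dozens of pageviews in under a second is not human browsing.
    if hit.get("pages", 0) > 30 and hit.get("session_seconds", 1) < 1:
        return True
    return False

hits = [
    {"user_agent": "Mozilla/5.0", "pages": 3, "session_seconds": 120},
    {"user_agent": "python-requests/2.9", "pages": 50, "session_seconds": 0},
]
print([looks_robotic(h) for h in hits])  # → [False, True]
```

Real filtering would of course need more rules and maintenance, but the point stands: the obvious cases fall to simple heuristics.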
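Prediction 7 argues that sound confidence intervals would have shown the election results were not reliably predictable. A minimal sketch of that check, using the standard normal-approximation interval for a poll proportion (the daily poll numbers below are made up for illustration):

```python
import math

def poll_confidence_interval(p_hat, n, z=1.96):
    """95% normal-approximation confidence interval for a poll proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - z * se, p_hat + z * se)

# Hypothetical daily poll results for one candidate: (share, sample size).
# Note the large day-to-day swings the article describes.
daily = [(0.48, 900), (0.52, 850), (0.47, 1000), (0.51, 950)]

intervals = [poll_confidence_interval(p, n) for p, n in daily]
# If an interval straddles 50%, that day's race is within the margin
# of error and the outcome is not reliably predictable.
too_close = sum(1 for lo, hi in intervals if lo < 0.50 < hi)
print(f"{too_close} of {len(daily)} daily polls straddle 50%")
```

With intervals this wide and swings this large, every day is a statistical toss-up, which is exactly the point: the honest conclusion was "too close to call," not a confident forecast.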


Comment by Antal Sofalvy on January 30, 2017 at 4:57am

Point 6 - Google Analytics

I do not really believe they have failed to recognize these issues - namely artificial robot/bot traffic. Filtering it simply does not match their primary business goal: click -> paid visit = revenue. As long as site owners and ad placers are not yelling (i.e., real human conversion is still acceptable), the bigger numbers generate bigger revenue.

OK, in the end this is caused by "people", but it is not a human-error type of behaviour...

Comment by Amy Flippant on January 4, 2017 at 3:45am

Really interesting predictions Vincent - it does seem that the progress made in 2016 will see a shift in data handling and data integration technology in 2017, particularly towards the incorporation of data virtualization as noted by Gartner:

“As data integration architectures continue to shift from physical bulk/batch movement to virtualized and real-time granular data delivery, data and analytics leaders must intertwine integration styles to match all requirements of business changes.” (The State and Future of Data Integration: Optimizing Your Portfolio...

I agree with Eric below insofar as the need to "shift from a technology focus to one of true organization transformation" - for that shift to be realized, business intelligence must be accessible at an enterprise level, so that technical users and business users alike are able to make informed decisions.

Data virtualization creates a virtual data layer, connecting to all data sources (of all types, formats and structures) and publishing the data, making it accessible to all consuming applications. The full capabilities of DV have not yet been realized at an enterprise level - but the many issues companies will face in 2017 regarding big data analytics, data governance and organization transformation can be solved quite simply through data virtualization technology...watch this space.

Comment by Eric A. King on December 14, 2016 at 2:05pm

2016 has been another year of heavy buzz and non-productive disruption -- even as Big Data slips into Gartner’s Hype Cycle’s “trough of disillusionment.” 

In trying to sort out all of the new pressures and opportunities in the broad area of data science, I believe that the lessons learned in 2016 will motivate an era of reset, reflection, comprehensive assessment, prioritized project planning, full team collaboration, and strategic implementation at the enterprise-level toward longer-term residual gains with predictive analytics in 2017.  I truly hope that organizations have learned the costly lessons in 2016 of abruptly redirecting to grab seemingly low-hanging fruit, accommodate the latest buzz and vendor hype, or respond to social media whims and rants. 


Most of all, if organizations are to succeed and sustain in predictive analytics, they need to shift from a technology focus to one of true organizational transformation.  That transformation requires vendor-neutral training; longer-term vision; uncomfortable change management; strategic consultation; a framework for operating and collaborating at the enterprise level; and a view on returns beyond the next quarter’s results.


The organizations that commit to that level of transformation in 2017 (healthcare or otherwise) will be the leaders in 2018.  They will be the ones who have operationalized the shift from gut-level decisioning to automated and targeted data-driven decisioning.  This shift will also free up their highly valued SMEs to focus their time, talent and experience on more creative and meaningful endeavors.
