A few universities and other organizations have started to offer data science curricula, training and certificates.
The following excerpts are taken from the respective programs.
- University of Washington: Develop the computer science, mathematics and analytical skills needed to enter the field of data science. Use data science techniques to analyze and extract meaning from extremely large data sets, or big data. Become familiar with relational and non-relational databases. Apply statistics, machine learning, text retrieval and natural language processing to analyze data and interpret results. Learn to apply data science in fields such as marketing, business intelligence, scientific research and more.
- Northwestern University: As businesses seek to maximize the value of vast new stores of available data, Northwestern University’s Master of Science in Predictive Analytics program prepares students to meet the growing demand in virtually every industry for data-driven leadership and problem solving. Advanced data analysis, predictive modeling, computer-based data mining, and marketing, web, text, and risk analytics are just some of the areas of study offered in the program.
- Berkeley: The UC Berkeley School of Information offers the only professional Master of Information and Data Science (MIDS) delivered fully online. This exciting program offers:
  - A multidisciplinary curriculum that prepares you to solve real-world problems using complex and unstructured data.
  - A web-based learning environment that blends live, face-to-face classes with interactive, online course work designed and delivered by UC Berkeley faculty.
  - A project-based approach to learning that integrates the latest tools and methods for identifying patterns, gaining insights from data, and effectively communicating those findings.
  - Full access to the I School community including personalized technical and academic support.
  - The chance to build connections in the Bay Area — the heart of the data science revolution — and through the UC Berkeley alumni network.
- CUNY: The Online Master’s Degree in Data Analytics (M.S.) prepares graduates to make sense of real-world phenomena and everyday activities by synthesizing and mining big data with the intention of uncovering patterns, relationships, and trends. Big data has emerged as the driving force behind critical business decisions. Advances in our ability to collect, store, and process different kinds of data from traditionally unconnected sources enables us to answer complex, data-driven questions in ways that have never been possible before.
- New York University: Our initiative is university-wide because data science has already started to revolutionize most areas of intellectual endeavor found at NYU and in the world. We believe this revolution is just beginning. Data science is becoming a necessary tool to answer some of the big scientific questions and technological challenges of our times: How does the brain work? How can we build intelligent machines? How do we better find cures for diseases? Data science overlaps multiple traditional disciplines at NYU such as mathematics (pure and applied), statistics, computer science and an increasingly large number of application domains. It also stands to impact a wide range of spheres — from healthcare to business to government — in which NYU’s schools and departments are engaged.
- Columbia University: The Institute for Data Sciences and Engineering at Columbia University strives to be the single world-leading institution in research and education in the theory and practice of the emerging field of data science broadly defined. Equally important in this mission is supporting and encouraging entrepreneurial ventures emerging from the research the Institute conducts. To accomplish this goal, the Institute seeks to forge closer relationships between faculty already at the University, to hire new faculty, to attract interdisciplinary graduate students interested in problems relating to big data, and to build strong and mutually beneficial relationships with industry partners. The Institute seeks to attract external funding from both federal and industrial sources to support its research and educational mission.
- Stanford: With the rise of user-web interaction and networking, as well as technological advances in processing power and storage capability, the demand for effective and sophisticated knowledge discovery techniques has grown exponentially. Businesses need to transform large quantities of information into intelligence that can be used to make smart business decisions. With the Mining Massive Data Sets graduate certificate, you will master efficient, powerful techniques and algorithms for extracting information from large datasets such as the web, social-network graphs, and large document repositories. Take your career to the next level with skills that will give your company the power to gain a competitive advantage. The Data Mining and Applications graduate certificate introduces many of the important new ideas in data mining and machine learning, explains them in a statistical framework, and describes some of their applications to business, science, and technology. Stanford says that this broad-based approach makes it appropriate for everyone from strategy managers to researchers in social, medical and scientific fields as well as data analysts.
- North Carolina State: If you have a mind for mathematics and statistical programming, and a passion for working with data to solve challenging problems, this is your program. The MSA is uniquely designed to equip individuals like yourself for the task of deriving and effectively communicating actionable insights from a vast quantity and variety of data—in 10 months.
Many other institutions have strong analytics programs, including Carnegie Mellon, Harvard, MIT, Georgia Tech and Wharton (Analytics Initiative). For a map of academic data science programs, visit
A typical academic program (University of Washington) features the following courses:
- Introduction to data (data types, data movement, terminology, etc.)
- Storage and Concurrency Preliminaries
- Files and File-based data systems
- Relational Database Management Systems
- Hadoop Introduction
- NoSQL - MapReduce vs. Parallel RDBMS
- Search and Text Analysis
- Entity Resolution
- Inferential Statistics
- Gaussian Distributions, Other Distributions and The Central Limit Theorem
- Testing and Experimental Design
- Bayesian vs. Classical Statistics
- Probabilistic Interpretation of Linear Regression, and Maximum Likelihood
- Graph Algorithms
- Raw Data to Inference Model
- Motivation & Applications of Machine Learning
- Supervised Learning
- Linear and Non-Linear Learning Models
- Classification, Clustering and Dimensionality Reduction
- Advanced Non-Linear Models
- Collaborative Filtering and Recommendation
- Models that are Robust
- Data Sciences with Text and Language
- Data Sciences with Location
- Social Network Analysis
Certifications and other training
A few organizations, both private companies and professional societies, offer certifications or training. In particular:
- INFORMS (Operations Research Society): Analytics certificate
- Digital Analytics Association: certificate
- TDWI: courses focused on database architecture
- American Statistical Association: chartered statistician certificate
- Data Science Central: data science apprenticeship (no cost)
- International Institute for Analytics
Fees range from below $100 for a certification with no exam, to a few thousand dollars for a full program.
It is also possible to get data science training at almost any of the many conferences focusing on analytics, big data or data science:
- Predictive Analytics World
- GoPivotal Data Science
- SAS data mining and analytics
- IEEE analytics / big data / data science
- IE Group
- Text Analytics News
- White Hall Media
Vendors such as EMC, SAS or Teradata also offer valuable training. Web sites such as Kaggle.com let you participate in data science competitions and get access to real data; sometimes the prize is a job at a company such as Facebook or Netflix.
Coursera.com offers online training at no cost. Instructors are respected university professors, so the material can sometimes feel a bit academic. Here are a few of its offerings at the time of writing:
- Machine learning (Stanford University)
- Data structures and algorithms (Peking University)
- Web intelligence and big data (Indian Institute of Technology)
- Introduction to data science (University of Washington)
- Maps and geospatial revolution (Penn State)
- Introduction to databases (Stanford University, self-study)
- Computing for data analysis (Johns Hopkins University)
- Initiation à la programmation (EPFL, Switzerland, in French)
- Statistics one (Princeton University)
Finally, in the next article, we will describe our online, on-demand, free program that comes with real-life projects involving big data, and provide an update.