*This article was written by Tomasz Malisiewicz.*

You might go to a cutting-edge machine learning research conference like NIPS hoping to find some mathematical insight that will help you take your deep learning system's performance to the next level. Unfortunately, as Andrew Ng reiterated to a live crowd of 1,000+ attendees this past Monday, there is no secret AI equation that will let you escape your machine learning woes. All you need is some *rigor*, and much of what Ng covered in his remarkable NIPS 2016 presentation, titled "*The Nuts and Bolts of Building Applications using Deep Learning*", is not rocket science. Today we'll dissect the lecture and Ng's key takeaways. Let's begin.

Figure 1. Andrew Ng delivers a powerful message at NIPS 2016.

Andrew Ng's lecture at NIPS 2016 in Barcelona was phenomenal -- truly one of the best presentations I have seen in a long time. In a juxtaposition of two influential presentation styles, the *CEO-style* and the *Professor-style*, Andrew Ng mesmerized the audience for two hours. His wisdom from managing large-scale AI projects at Baidu, Google, and Stanford really shows. In his talk, Ng addressed one of the key challenges facing most of the NIPS audience -- *how do you make your deep learning systems better*? Rather than showing off new research findings from his cutting-edge projects, he presented a simple recipe for analyzing and debugging today's large-scale systems. With no need for equations, just a handful of diagrams and several checklists, Andrew Ng delivered a two-whiteboards-in-front-of-a-video-camera lecture, something you would expect at a group research meeting. However, Ng made sure not to delve into Research-y areas, which are likely to make your brain fire on all cylinders but to make you and your company very few dollars in the foreseeable future.

Andrew Ng highlighted the fact that while NIPS is a research conference, many of the newly generated ideas are simply ideas, not yet battle-tested vehicles for converting mathematical acumen into dollars. The bread and butter of money-making deep learning is supervised learning, with recurrent neural networks such as LSTMs in second place. Research areas such as Generative Adversarial Networks (GANs), Deep Reinforcement Learning (Deep RL), and just about anything branding itself as unsupervised learning are simply Research, with a capital R. These ideas are likely to influence the next 10 years of Deep Learning research, so it is wise to focus on publishing and tinkering if you really love such open-ended Research endeavours. Applied deep learning research is much more about taming your problem (understanding the inputs and outputs), casting it as a supervised learning problem, and hammering it with ample data and ample experiments.

Most issues in applied deep learning come from a training-data / testing-data mismatch. In some scenarios this issue just doesn't come up, but you'd be surprised how often applied machine learning projects use training data (which is easy to collect and annotate) that is different from the data seen in the target application. Andrew Ng's discussion is centered around the basic idea of the bias-variance tradeoff. You want a classifier with a good ability to fit the data (low bias is good) that also generalizes to unseen examples (low variance is good). Too often, applied machine learning projects running at scale forget this critical dichotomy. Here are the four numbers you should always report:

- Training set error
- Testing set error
- Dev (aka Validation) set error
- Train-Dev (aka Train-Val) set error
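To make the list above concrete, here is a minimal sketch of reporting the four numbers side by side. It assumes a plain classification setting where error is the fraction of wrong predictions, and it assumes the Train-Dev set is a held-out slice drawn from the same distribution as the training data. The helper names `error_rate` and `report_errors` and the toy labels are hypothetical, made up purely for illustration.

```python
from typing import Dict, Sequence, Tuple

# Each split is a pair: (true labels, model predictions).
Split = Tuple[Sequence[int], Sequence[int]]


def error_rate(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of examples whose prediction differs from the label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute an error rate on an empty split")
    wrong = sum(1 for y, p in zip(labels, predictions) if y != p)
    return wrong / len(labels)


def report_errors(splits: Dict[str, Split]) -> Dict[str, float]:
    """Compute one error rate per named split."""
    return {
        name: error_rate(labels, preds)
        for name, (labels, preds) in splits.items()
    }


# Toy data only; real splits would come from your own pipeline.
toy_splits = {
    "train": ([0, 1, 1, 0], [0, 1, 1, 0]),
    "train_dev": ([1, 0, 1, 1], [1, 0, 0, 1]),
    "dev": ([0, 0, 1, 1], [1, 0, 0, 1]),
    "test": ([1, 1, 0, 0], [0, 1, 1, 0]),
}

report = report_errors(toy_splits)
for name, err in report.items():
    print(f"{name:>9}: {err:.2%}")
```

Keeping all four numbers in one report makes it harder to quietly skip one of them when comparing experiments.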

Andrew Ng suggests the following recipe:

Figure 2. Andrew Ng's "Applied Bias-Variance for Deep Learning Flowchart" for building better deep learning systems.
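The flowchart itself is not reproduced in this text, so the following is only a rough sketch of how a bias-variance style diagnosis over the four error numbers might look in code, not a transcription of Ng's figure. It assumes the common reading in which each gap between consecutive numbers points at a different problem. The function name `diagnose`, the `target` error, the `tolerance` threshold, and the wording of each finding are all assumptions chosen for illustration.

```python
from typing import List


def diagnose(
    train: float,
    train_dev: float,
    dev: float,
    test: float,
    target: float = 0.0,     # assumed acceptable error, e.g. human-level
    tolerance: float = 0.02,  # assumed threshold for a "large" gap
) -> List[str]:
    """Flag which gaps between the four error numbers look large."""
    findings = []
    # Training error far above the target: the model underfits.
    if train - target > tolerance:
        findings.append("high bias")
    # Train-dev much worse than train: poor generalization on the
    # same distribution as the training data.
    if train_dev - train > tolerance:
        findings.append("high variance")
    # Dev much worse than train-dev: training data differs from the
    # data seen in the target application.
    if dev - train_dev > tolerance:
        findings.append("train/test data mismatch")
    # Test much worse than dev: the dev set has been over-tuned on.
    if test - dev > tolerance:
        findings.append("overfitting to the dev set")
    return findings


# Example: low train and train-dev error, but a jump on the dev set.
print(diagnose(train=0.01, train_dev=0.02, dev=0.10, test=0.105))
```

A fixed tolerance is a simplification; in practice what counts as a large gap depends on the task and on how noisy each estimate is.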

