
*Quora contribution written by Chomba Bupe.*

I am actually not even aware of any machine learning (ML) problem that is considered to have been solved recently or in the past. This tells you a lot about how hard things really are in ML. Of course, if you read media outlets, it may seem like researchers are sweeping the floor clean with deep learning (DL), solving ML problems one…

Added by Andrea Manero-Bastin on April 21, 2019 at 6:00am

*This article was written by V Sharma.*

Astonishing Hierarchy of Machine Learning Needs – Artificial intelligence and machine learning are often used interchangeably, but they are not the same. Machine learning is one of the most active areas and a way to achieve AI. There are a couple of reasons why ML is so good today. Machine learning depends entirely upon…

Added by Andrea Manero-Bastin on April 10, 2019 at 2:30am

*This article was written by Michael Grogan.*

A dataset often contains significant outliers, that is, observations that fall well outside the range of most other observations in the dataset. Let us see how we can use *robust regressions* to deal with this issue.

I described in…
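As a rough illustration of why robust methods help with outliers, here is a minimal sketch that compares ordinary least squares with the Theil-Sen estimator (the median of pairwise slopes). Theil-Sen is only one robust approach and is an assumption here; the article may well use a different estimator. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean, median


def ols_fit(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def theil_sen_fit(xs, ys):
    """Robust fit: median of all pairwise slopes, then median intercept."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Toy data (assumed): the line y = 2x + 1 with one corrupted observation.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[9] = 100  # a single large outlier

ols_slope, ols_intercept = ols_fit(xs, ys)
ts_slope, ts_intercept = theil_sen_fit(xs, ys)
print("OLS:      ", ols_slope, ols_intercept)
print("Theil-Sen:", ts_slope, ts_intercept)
```

On this toy data the single outlier pulls the least-squares slope far from 2, while the median-based estimate stays on the underlying line.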

Added by Andrea Manero-Bastin on April 4, 2019 at 9:30am

*This article was written by Patricia Jones.*

The advent of the digital age has led to several innovations that will play a huge role in the future of international society, and one of these is blockchain technology. The…

Added by Andrea Manero-Bastin on April 4, 2019 at 9:00am

*This article was written by Datapred.*

In a previous post, we explained the concept of cross-validation for time series, aka backtesting, and why proper backtests matter for time series modeling.

The goal here is to dig deeper and discuss a few coding tips that will help you cross-validate your predictive models correctly.…
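To make the idea of backtesting concrete, here is a minimal sketch of an expanding-window split, in which every training set ends strictly before its test window begins. It is not taken from the Datapred post; the function name `expanding_window_splits` and its parameters are assumptions for illustration.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Training indices always precede the test indices, so no future
    observation leaks into the model being evaluated.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train = list(range(test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train, test in splits:
    print("train:", train, "test:", test)
```

Unlike shuffled k-fold splitting, the order of observations is preserved, which is the property a time series backtest depends on.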

Added by Andrea Manero-Bastin on March 28, 2019 at 10:04am

*This article, from the Facebook research team, was written by Ben Letham, Brian Karrer, Guilherme Ottoni and Eytan Bakshy.…*

Added by Andrea Manero-Bastin on March 19, 2019 at 9:30am

*This article was written by Krishna Kumar Mahto.*

So, three days into SVM, I was 40% frustrated, 30% restless, 20% irritated and 100% inefficient in terms of getting my work done. I was stuck with the Maths part of Support Vector Machine. I went through a number of YouTube videos, a number of documents, PPTs and PDFs of lecture notes, but…

Added by Andrea Manero-Bastin on March 14, 2019 at 6:30am

*This article was written by Jason Brownlee.*

Artificial neural networks have two main hyperparameters that control the architecture or topology of the network: the number of layers and the number of nodes in each hidden layer. You must specify values for these parameters when configuring your network. The most reliable way to configure…

Added by Andrea Manero-Bastin on February 17, 2019 at 2:00am

*This article was written by Enda Ridge.*

Data Scientists need to communicate without jargon so customers understand, believe and care about their recommendations. Here is a Data Science jargon buster to help.

Data Science is a technical…

Added by Andrea Manero-Bastin on February 17, 2019 at 1:30am

*This article was written by Vitaly Shmatikov.*

Machine learning is eating the world. The abundance of training data has helped ML achieve amazing results for object recognition, natural language processing, predictive analytics, and all manner of other tasks. Much of this training data is very sensitive, including personal photos, search queries,…

Added by Andrea Manero-Bastin on February 17, 2019 at 1:30am

*This article was written by Sondos Atwi.*

In machine learning, cross-validation is a resampling method used for model evaluation to avoid testing a model on the same dataset on which it was trained. Testing on the training data is a common mistake, especially since a separate testing dataset is not always available. However, this usually leads to inaccurate performance measures (as the model will have an almost perfect…
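A minimal sketch of the resampling idea described above: k-fold splitting partitions the sample indices so that each observation is used for testing exactly once and never appears in the training set of its own fold. The function name `k_fold_indices` and the fixed seed are assumptions for illustration, not code from the article.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


folds = list(k_fold_indices(n_samples=10, k=5))
for train, test in folds:
    print("train:", sorted(train), "test:", sorted(test))
```

A model would be fitted on each `train` set and scored on the matching `test` set, and the k scores averaged to estimate out-of-sample performance.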

Added by Andrea Manero-Bastin on February 10, 2019 at 10:30am

*This article was written by Sondos Atwi.*

**What is Cross-Validation?**

In machine learning, cross-validation is a resampling method used for model evaluation to avoid testing a model on the same dataset on which it was trained. Testing on the training data is a common mistake, especially since a separate…

Added by Andrea Manero-Bastin on January 28, 2019 at 11:30pm

*This article was written by Kevin Hartnett.*

The nearest neighbor problem asks where a new point fits into an existing data set. A few researchers set out to prove that there was no universal way to solve it. Instead, they found such a way.…
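For readers unfamiliar with the problem, here is a minimal sketch of the brute-force baseline: scan every stored point and return the one closest to the query under Euclidean distance. This is not the universal method the article reports on; it only shows what a nearest neighbor query asks. The function name and the sample points are assumptions.

```python
import math


def nearest_neighbor(points, query):
    """Return the stored point closest to `query` (Euclidean distance).

    Linear scan: cost grows with the number of stored points, which is
    why faster approximate methods are of interest.
    """
    return min(points, key=lambda p: math.dist(p, query))


points = [(0, 0), (5, 5), (2, 1)]
closest = nearest_neighbor(points, query=(2, 2))
print(closest)
```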

Added by Andrea Manero-Bastin on January 15, 2019 at 9:00am

*This article was written by Harsh Sikka. This version is a summary of the original article.*

Start with …

Added by Andrea Manero-Bastin on January 9, 2019 at 7:30am

*This article was written by James Le. Here is a brief summary. A link to the full article is provided at the bottom. Some techniques are not mentioned in Le's article, for instance neural networks, K-NN, density estimation, time series models, survival analysis, Markov chains, Bayesian statistics, graph models, and spatial processes. However, his article is a great read, with the 10 topics explained in detail,…*

Added by Andrea Manero-Bastin on January 9, 2019 at 7:30am

*This article was written by Hunter Heidenreich.*

A look at what a generative adversarial network is and how it works.

**What’s…**

Added by Andrea Manero-Bastin on January 2, 2019 at 9:30am

*This article was written by Will Koehrsen.*

Here’s the complete code: just copy and paste into a Jupyter Notebook or Python script, replace with your data and run:

The final result is a complete decision tree as…

Added by Andrea Manero-Bastin on December 22, 2018 at 7:30am

*This article was written by Jim Frost. Here we present a summary, with a link to the original article.*

Ordinary Least Squares (OLS) is the most common estimation method for linear models—and that’s true for a good reason. As long as your model satisfies the OLS assumptions for linear…

Added by Andrea Manero-Bastin on December 13, 2018 at 5:00pm

*This article was written by Tristan Handy.*

This post is about how to create the analytics competency at your organization. It’s not about what metrics to track (there are plenty of good posts about that); it’s about how to actually get your business to produce them. As it turns out, the implementation question - How do I build a…

Added by Andrea Manero-Bastin on November 27, 2018 at 9:00pm

*This article comes from the blog of the website KNIME. Below is a summary. The highest reduction ratio without performance degradation is obtained by analyzing the decision cuts in many random forests (Random Forests/Ensemble Trees). However, even just counting the number of missing values, measuring the column variance, and measuring the correlation of pairs of columns can lead to a satisfactory reduction rate…*
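Two of the simpler filters mentioned in the summary, counting missing values and measuring column variance, can be sketched as follows. This is a minimal illustration rather than the KNIME workflow; the function name, the thresholds, and the toy columns are assumptions.

```python
from statistics import pvariance


def drop_uninformative_columns(columns, max_missing_ratio=0.5, min_variance=1e-8):
    """Keep columns with few missing values and non-trivial variance.

    `columns` maps a column name to a list of numbers, with None marking
    a missing value. Both thresholds are illustrative defaults.
    """
    kept = {}
    for name, values in columns.items():
        if not values:
            continue
        present = [v for v in values if v is not None]
        missing_ratio = 1 - len(present) / len(values)
        if missing_ratio > max_missing_ratio:
            continue  # too many missing values
        if len(present) < 2 or pvariance(present) < min_variance:
            continue  # (nearly) constant column carries little information
        kept[name] = values
    return kept


data = {
    "a": [1, 2, 3, 4],           # informative
    "b": [None, None, None, 1],  # mostly missing
    "c": [5, 5, 5, 5],           # constant
}
reduced = drop_uninformative_columns(data)
print(list(reduced))
```

In practice, columns on different scales would need normalizing before a single variance threshold is meaningful.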

Added by Andrea Manero-Bastin on November 17, 2018 at 8:30pm

- Unsolved Problems in Machine Learning
- Astonishing Hierarchy of Machine Learning Needs
- Robust Regressions: Dealing with Outliers
- Five Industries Where Blockchain Has Innovated Beyond Cryptocurrency
- Advanced cross-validation tips for time series
- Efficient Tuning of Online Systems Using Bayesian Optimization
- Demystifying the Math of Support Vector Machines (SVM)

- Data Science Jargon Explained to the Layman
- The Math Required for Machine Learning
- Unsolved Problems in Machine Learning
- What is a Generative Adversarial Network?
- Seven Techniques for Data Dimensionality Reduction
- The 10 Statistical Techniques Data Scientists Need to Master
- Astonishing Hierarchy of Machine Learning Needs

© 2019 Data Science Central ®
