Ensemble methods take several machine learning techniques and combine them into one predictive model. It is a two-step process: Generate the Base Learners: Choose any c...
Logistic regression is regressing data to a line (i.e. finding an average of sorts) so you can fit data to a particular equation and make predictions for your data. This ...
This article from the Facebook research team was written by Ben Letham, Brian Karrer, Guilherme Ottoni and Eytan Bakshy. It features some of the techniques used b...
This is a simple overview of the k-NN process. Perhaps the most challenging step is finding a k that’s “just right”. The square root of n can put you i...
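The square-root rule mentioned in the k-NN teaser above can be sketched in a few lines. This is only a rough starting point, not the article's own code: the function name `sqrt_k` is hypothetical, and nudging k to an odd number is an added assumption (a common convention for reducing tied votes in two-class problems). In practice the starting value would still be tuned, for example with cross-validation.

```python
import math


def sqrt_k(n_samples: int) -> int:
    """Return a starting guess for k in k-NN: roughly sqrt(n)."""
    if n_samples < 1:
        raise ValueError("n_samples must be positive")

    # Rule of thumb from the text: start near the square root of n.
    k = max(1, round(math.sqrt(n_samples)))

    # Assumption (not from the source): prefer an odd k so that
    # majority votes between two classes are less likely to tie.
    if k % 2 == 0:
        k += 1

    # k can never exceed the number of available samples.
    return min(k, n_samples)


if __name__ == "__main__":
    for n in (1, 100, 400):
        print(n, sqrt_k(n))
```

With 100 training points the sketch suggests k = 11 (the square root, 10, bumped to the next odd number).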
Determining sample sizes is a challenging undertaking. For simplicity, I’ve limited this picture to one of the most common testing situations: testing for differ...
This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlati...
This article was written by Krishna Kumar Mahto. So, three days into SVM, I was 40% frustrated, 30% restless, 20% irritated and 100% inefficient in terms of getting my ...
Witnessing the data science field’s meteoric rise in demand across pretty much all industries and areas of scientific research, it’s easy to anticipate efforts to cre...
Last month I had the honor of participating in data science project reviews for the new graduates of General Assembly’s Data Science Immersive program. In the span of ...
Determining the number of clusters when performing unsupervised clustering is a tricky problem. Many data sets don’t exhibit well-separated clusters, and two human ...