Many articles have been written about the top machine learning algorithms: click here and here for instance. Most of them seem to define top as oldest, and thus most used, ignoring modern, efficient algorithms fit for big data, such as indexation, attribution modeling, collaborative filtering, or recommendation engines used by companies such as Amazon, Google, or Facebook. I received this morning an advertisement for a (self-published) book called Master Machine Learning Algorithms, and I could not resist posting the author's list of top 10 machine learning algorithms:

Linear Algorithms:

• Algorithm 1: Linear Regression
• Algorithm 2: Logistic Regression
• Algorithm 3: Linear Discriminant Analysis

Nonlinear Algorithms:

• Algorithm 4: Classification and Regression Trees
• Algorithm 5: Naive Bayes
• Algorithm 6: K-Nearest Neighbors
• Algorithm 7: Learning Vector Quantization
• Algorithm 8: Support Vector Machines

Ensemble Algorithms:

• Algorithm 9: Bagged Decision Trees and Random Forest
• Algorithm 10: Boosting and AdaBoost

• The Gradient Descent algorithm is also covered, as it is used as the optimization algorithm at the core of so many machine learning algorithms.

You can check the book here.

Some of these techniques, such as Naive Bayes (variables are almost never uncorrelated), Linear Discriminant Analysis (clusters are almost never separated by hyperplanes), or Linear Regression (numerous model assumptions - including linearity - are almost always violated in real data), have been so abused that I would hesitate to teach them. This is not a criticism of the book; most textbooks mention pretty much the same algorithms, and in this case, even skip all graph-related algorithms. Even k-Nearest Neighbors has modern, fast implementations not covered in traditional books - we are indeed working on this topic and expect to have an article published shortly about it.

If anything, it proves that modern techniques take a lot of time to hit the classroom and the textbooks. You might have to attend classes taught by real practitioners (people who worked for big data solutions vendors) to learn modern tools that will give you a competitive edge on the job market, though you can discover a lot of this free "hidden knowledge" on our website, using our data science search engine. A publisher such as O'Reilly, as well as some universities with an applied data science department, provides good education about these state-of-the-art techniques, with case studies. My upcoming book Data Science 2.0 will cover much of the topic, and my previous Wiley book is a good starting point. You can also learn quite a bit from our apprenticeship (for self-learners only at this time).


### Replies to This Discussion

Thanks for the article. Indeed good.

There is another book about machine learning, "The Master Algorithm", by Pedro Domingos: https://homes.cs.washington.edu/~pedrod/

"The Five Tribes of Machine Learning, and What You Can Take from Each: There are five main schools of thought in machine learning, and each has its own master algorithm – a general-purpose learner that can in principle be applied to any domain. The symbolists have inverse deduction, the connectionists have backpropagation, the evolutionaries have genetic programming, the Bayesians have probabilistic inference, and the analogizers have support vector machines. What we really need, however, is a single algorithm combining the key features of all of them."

http://www.slideshare.net/SessionsEvents/pedro-domingos-professor-u...

Also take a look at the article covering the importance of Machine Learning applications in various spheres - https://www.cleveroad.com/blog/importance-of-machine-learning-appli...

Lovely and very informative article! Good job. Btw, here you can find additional info about What is Machine Learning and What is It Not

Here's another very interesting writeup from Towardsdatascience.com. Posting its gist here -

1. Linear Regression: Linear regression is a supervised learning algorithm and tries to model the relationship between a continuous target variable and one or more independent variables by fitting a linear equation to the data.
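As a rough illustration (not taken from the quoted write-up), fitting a line to one independent variable can be sketched with the closed-form least-squares formulas. The function name `fit_line` and the toy data are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both up to the same 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical toy data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

On noise-free data like this the fit recovers the generating line; on real data the linearity assumption should be checked first.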

2. Support Vector Machine: SVM distinguishes classes by drawing a decision boundary. How to draw or determine the decision boundary is the most critical part in SVM algorithms.
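One common way to find such a boundary, sketched here under simplifying assumptions, is subgradient descent on the regularised hinge loss for a linear SVM in two dimensions. This is not the method from the quoted write-up; `train_linear_svm`, `svm_predict`, the hyperparameters `lam` and `lr`, and the toy points are all illustrative choices.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b by stochastic subgradient descent on the regularised hinge loss."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # The L2 penalty shrinks w slightly on every step.
            w = [w[0] * (1 - lr * lam), w[1] * (1 - lr * lam)]
            if margin < 1:
                # Point is misclassified or inside the margin: move the boundary.
                w = [w[0] + lr * y * x1, w[1] + lr * y * x2]
                b += lr * y
    return w, b

def svm_predict(w, b, point):
    """Classify by the side of the line w.x + b = 0 the point falls on."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1

# Hypothetical, linearly separable toy data with labels +1 / -1.
points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -3), (-2, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
predictions = [svm_predict(w, b, p) for p in points]
print(predictions)
```

Production SVM libraries typically solve the same objective with more careful optimisers and support kernels for nonlinear boundaries.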

3. Naive Bayes: Naive Bayes is a supervised learning algorithm used for classification tasks. Hence, it is also called the Naive Bayes Classifier. Naive Bayes assumes that features are independent of each other and there is no correlation between features.
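To make the independence assumption concrete, here is a minimal sketch for categorical features with Laplace smoothing. The function names, the `n_values` parameter (assumed number of distinct values per feature), and the spam/ham toy data are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
    return class_counts, value_counts

def predict_naive_bayes(model, row, n_values=2):
    """Pick the class maximising log P(class) + sum of log P(feature | class)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            seen = value_counts[(label, i)][value]
            # Features are treated as independent given the class (the "naive" part).
            score += math.log((seen + 1) / (count + n_values))  # Laplace smoothing
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (contains a link?, contains a greeting?)
rows = [("link", "no_greeting"), ("link", "no_greeting"), ("link", "greeting"),
        ("no_link", "greeting"), ("no_link", "greeting"), ("link", "greeting")]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
model = train_naive_bayes(rows, labels)
print(predict_naive_bayes(model, ("link", "no_greeting")))
print(predict_naive_bayes(model, ("no_link", "greeting")))
```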

4. Logistic Regression: Logistic regression is a supervised learning algorithm that is mostly used for binary classification problems. Although the word “regression” seems at odds with “classification”, the focus here is on the word “logistic”, which refers to the logistic function that does the classification task in this algorithm.
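A small sketch of the idea, assuming a single feature and plain batch gradient descent on the log-loss: the logistic (sigmoid) function turns a linear score into a probability, and a 0.5 cutoff gives the class. The names `sigmoid` and `fit_logistic`, the learning rate, and the data are illustrative.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: small x values belong to class 0, large ones to class 1.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(xs, ys)
p_low = sigmoid(w * 1.0 + b)   # probability of class 1 at x = 1.0
p_high = sigmoid(w * 4.0 + b)  # probability of class 1 at x = 4.0
print(p_low < 0.5, p_high > 0.5)
```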

5. K-Nearest Neighbors (kNN): K-nearest neighbors (kNN) is a supervised learning algorithm that can be used to solve both classification and regression tasks. The main idea behind kNN is that the value or class of a data point is determined by the data points around it.
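The classification case can be sketched in a few lines: sort the training points by distance to the query and take a majority vote among the `k` closest. The function name `knn_predict` and the toy points are assumptions for this example; this brute-force version scans every training point and is not one of the fast implementations mentioned in the article.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(train_points, train_labels, (2, 2)))
print(knn_predict(train_points, train_labels, (6, 5)))
```

For regression, the vote would be replaced by the mean of the neighbors' target values.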

6. Decision Trees: A decision tree is built by iteratively asking questions that partition the data. The partitioning is easier to conceptualize with a visual representation of the tree.
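Each "question" is a split on a feature. As a hedged sketch, the code below picks the single best threshold on one numeric feature by minimising the weighted Gini impurity; a full tree would apply the same search recursively to each side. Gini is one common criterion, not necessarily the one the quoted write-up has in mind, and `gini`, `best_split`, and the data are illustrative.

```python
def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Find the threshold on one numeric feature with the lowest weighted Gini."""
    best_threshold, best_impurity = None, float("inf")
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2  # candidate question: "is x <= threshold?"
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity

# Hypothetical toy data: age versus whether a customer buys.
ages = [22, 25, 30, 35, 40, 45]
buys = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, buys)
print(threshold, impurity)
```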

7. Random Forest: Random forest is an ensemble of many decision trees. Random forests are built using a method called bagging in which decision trees are used as parallel estimators.
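The bagging part can be sketched as follows: train each tree on a bootstrap sample (drawn with replacement) and combine predictions by majority vote. To keep the example short, each "tree" is a one-split stump on a single feature; a real random forest also samples a random subset of features at each split, which this sketch omits. All names, the seed, and the data are assumptions.

```python
import random
from collections import Counter

def train_stump(xs, ys):
    """One-split tree: choose the threshold with the fewest training errors."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return best[1:]

def train_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(train_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Majority vote over the independent (parallel) estimators."""
    votes = Counter(left if x <= threshold else right
                    for threshold, left, right in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: class 0 for small x, class 1 for large x.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
forest = train_bagged_stumps(xs, ys)
print(predict_forest(forest, 0.5), predict_forest(forest, 9.5))
```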

8. Gradient Boosted Decision Trees (GBDT): GBDT is an ensemble algorithm that uses the boosting method to combine individual decision trees.

Boosting means combining learning algorithms in series to achieve a strong learner from many sequentially connected weak learners.
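A minimal sketch of the sequential idea, assuming squared-error regression on one feature: each new weak learner (a one-split regression stump) is fitted to the residuals left by the learners before it, and its output is added with a small learning rate. The function names, `learning_rate`, `n_rounds`, and the data are illustrative choices, not taken from the quoted write-up.

```python
def fit_stump(xs, residuals):
    """Regression stump: the split that minimises squared error on the residuals."""
    best = None
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, n_rounds=50, learning_rate=0.5):
    """Add stumps one after another, each correcting the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((t, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= t else right_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

# Hypothetical toy data.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
base, stumps, preds = boost(xs, ys)
baseline_mse = mse(ys, [base] * len(ys))  # error of predicting the mean only
final_mse = mse(ys, preds)                # training error after boosting
print(baseline_mse, final_mse)
```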

9. K-Means Clustering: Clustering is a way to group a set of data points so that similar data points end up together. Therefore, clustering algorithms look for similarities or dissimilarities among data points.
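For k-means specifically, similarity is distance to a cluster centre. A rough sketch of the usual iteration (Lloyd's algorithm) alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The function name, the fixed iteration count, and the choice of starting centroids are assumptions for this example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate the assignment and centroid-update steps."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical toy data; the two starting centroids are simply two of the points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, [points[0], points[3]])
print(centroids)
print(clusters)
```

Results depend on the starting centroids, so practical implementations usually try several initialisations.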

10. Hierarchical Clustering: Hierarchical clustering means creating a tree of clusters by iteratively grouping or separating data points. There are two types of hierarchical clustering:

• Agglomerative clustering
• Divisive clustering
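The agglomerative (bottom-up) variant can be sketched as: start with every point in its own cluster and repeatedly merge the two closest clusters. The sketch assumes single linkage (cluster distance is the distance between the closest pair of members) and stops at a requested number of clusters instead of building the full tree; both are simplifications chosen for this example.

```python
import math

def agglomerative(points, n_clusters):
    """Single-linkage agglomerative clustering down to n_clusters clusters."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)  # j > i, so index i is still valid afterwards
        clusters[i].extend(merged)
    return clusters

# Hypothetical toy data.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
result = agglomerative(points, 3)
print(result)
```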

11. DBSCAN Clustering: Partition-based and hierarchical clustering techniques are highly efficient with normal-shaped clusters. However, when it comes to arbitrarily shaped clusters or detecting outliers, density-based techniques are more efficient.
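A compact sketch of the density-based idea: a point with at least `min_pts` neighbours within radius `eps` is a core point, clusters grow outward from core points, and points reachable from no core point are labelled as noise (-1). This brute-force version recomputes neighbourhoods on demand; the function name, parameter values, and data are illustrative.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # Includes the point itself, as in the usual formulation.
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # only core points expand the cluster
                queue.extend(nbrs)
    return labels

# Hypothetical toy data: two dense groups and one isolated outlier.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (5, 20)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```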

12. Principal Component Analysis: PCA is a dimensionality reduction algorithm which basically derives new features from the existing ones while keeping as much information as possible. PCA is an unsupervised learning algorithm but it is also widely used as a preprocessing step for supervised learning algorithms.
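As a small sketch of "deriving new features", the code below reduces 2-D points to one dimension by projecting them onto the direction of largest variance. It relies on the closed-form angle of the leading eigenvector of a symmetric 2x2 covariance matrix, so it only works for two features; general PCA uses an eigendecomposition or SVD. The function name and data are made up for illustration.

```python
import math

def pca_2d(points):
    """Return the first principal direction, the 1-D projection, and variance kept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    direction = (math.cos(theta), math.sin(theta))
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    # Share of the total variance retained by the single new feature.
    explained = (sum(p * p for p in projected) / (n - 1)) / (cxx + cyy)
    return direction, projected, explained

# Hypothetical toy data lying close to the line y = x.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, projected, explained = pca_2d(points)
print(direction, explained)
```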