
Many articles have been written about the top machine learning algorithms: click here and here for instance. Most of them seem to define top as oldest, and thus most used, ignoring modern, efficient algorithms fit for big data, such as indexation, attribution modeling, collaborative filtering, or recommendation engines used by companies such as Amazon, Google, or Facebook. 

This morning I received an advertisement for a (self-published) book called Master Machine Learning Algorithms, and I could not resist posting the author's list of top 10 machine learning algorithms:

Linear Algorithms:

  • Algorithm 1: Linear Regression
  • Algorithm 2: Logistic Regression
  • Algorithm 3: Linear Discriminant Analysis

Nonlinear Algorithms:

  • Algorithm 4: Classification and Regression Trees
  • Algorithm 5: Naive Bayes
  • Algorithm 6: K-Nearest Neighbors
  • Algorithm 7: Learning Vector Quantization
  • Algorithm 8: Support Vector Machines

Ensemble Algorithms:

  • Algorithm 9: Bagged Decision Trees and Random Forest
  • Algorithm 10: Boosting and AdaBoost

Bonus #1: Gradient Descent

  • The Gradient Descent algorithm is also covered, as it is used as the optimization algorithm at the core of so many machine learning algorithms
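To illustrate how gradient descent acts as the optimizer inside another algorithm, here is a minimal sketch that fits a straight line (Algorithm 1, linear regression) by minimizing mean squared error. It is not taken from the book; the function name `fit_line_gd`, the learning rate, the step count, and the toy data are all assumptions chosen for illustration.

```python
def fit_line_gd(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by plain batch gradient descent on mean squared error.

    lr and steps are illustrative defaults, not recommended settings.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line_gd(xs, ys)
print(w, b)
```

On this noise-free data the estimates should approach the slope and intercept used to generate it. With a learning rate that is too large for the data's scale, the same loop diverges, which is one reason practical libraries add step-size control.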

You can check the book here.

Some of these techniques, such as Naive Bayes (variables are almost never uncorrelated), Linear Discriminant Analysis (clusters are almost never separated by hyperplanes), or Linear Regression (numerous model assumptions - including linearity - are almost always violated in real data), have been so abused that I would hesitate to teach them. This is not a criticism of the book; most textbooks mention pretty much the same algorithms, in this case even skipping all graph-related algorithms. Even k-Nearest Neighbors has modern, fast implementations not covered in traditional books - we are indeed working on this topic and expect to publish an article about it shortly.
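The Naive Bayes remark above can be made concrete with a small, hypothetical sketch. Assume two equally likely classes A and B and one binary feature with P(x=1 | A) = 0.8 and P(x=1 | B) = 0.2 (made-up numbers). If that feature is entered several times as perfectly correlated copies, the correct posterior for A given x=1 stays at 0.8, but a model that treats the copies as independent multiplies the same evidence repeatedly. The function name `naive_posterior` is an assumption for illustration.

```python
def naive_posterior(copies, p_given_a=0.8, p_given_b=0.2):
    """P(A | every copy of the feature equals 1), computed as if the
    copies were independent, with equal class priors (assumed values)."""
    like_a = p_given_a ** copies  # naive product of per-copy likelihoods
    like_b = p_given_b ** copies
    return like_a / (like_a + like_b)


# One copy gives the correct posterior; extra correlated copies inflate it.
for k in (1, 2, 3, 5):
    print(k, round(naive_posterior(k), 4))
```

The more redundant copies are added, the closer the naive posterior moves to 1, even though no new information has arrived. This is only a toy case of perfect correlation; real data sits somewhere between this and true independence.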

If anything, it proves that modern techniques take a lot of time to hit the classroom and the textbooks. You might have to attend classes taught by real practitioners (people who worked for big data solutions vendors) to learn modern tools that will give you a competitive edge on the job market, though you can discover a lot of this free "hidden knowledge" on our website, using our data science search engine. A publisher such as O'Reilly, as well as some universities with an applied data science department, provide good education about these state-of-the-art techniques, with case studies. My upcoming book Data Science 2.0 will cover much of the topic, and my previous Wiley book is a good starting point. You can also learn quite a bit from our apprenticeship (for self-learners only at this time).


Replies to This Discussion

Thanks for the article. Indeed good.

There is another book about machine learning, "The Master Algorithm", by Pedro Domingos: https://homes.cs.washington.edu/~pedrod/

"The Five Tribes of Machine Learning, and What You Can Take from Each: There are five main schools of thought in machine learning, and each has its own master algorithm – a general-purpose learner that can in principle be applied to any domain. The symbolists have inverse deduction, the connectionists have backpropagation, the evolutionaries have genetic programming, the Bayesians have probabilistic inference, and the analogizers have support vector machines. What we really need, however, is a single algorithm combining the key features of all of them." 

http://www.slideshare.net/SessionsEvents/pedro-domingos-professor-u...


© 2017 Data Science Central
