A great resource that uses spam filtering as a running example to teach the Python basics needed for data science. It comes with many charts, code snippets, and comments, and is a very useful way to learn Python.

It has the following sections:

  • Load data, look around
  • Data preprocessing
  • Data to vectors
  • Training a model, detecting spam
  • How to run experiments?
  • How to tune parameters?
  • Productionalizing a predictor

Caveat: it uses Naive Bayes, the worst possible technique for spam filtering. Naive Bayes is responsible for extremely poor spam filters, with large numbers of false positives and false negatives, that are still in use today. So this is definitely not a good resource for learning data science, but it is a great tutorial for learning Python, especially since Naive Bayes is extremely easy to implement. Alternative and far better techniques, such as hidden decision trees, are almost as easy to code.
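To give a sense of how little code Naive Bayes needs, here is a minimal sketch in plain Python. It is not taken from the tutorial: the function names (`tokenize`, `train`, `predict`), the whitespace tokenizer, the add-one smoothing, and the four toy messages are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(messages):
    """Count documents and words per label from (label, text) pairs."""
    doc_counts = Counter()   # number of messages seen per label
    word_counts = {}         # label -> Counter of token frequencies
    vocab = set()            # every token seen during training
    for label, text in messages:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return doc_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: share of training messages carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch.
TOY_MESSAGES = [
    ("spam", "win a free prize now"),
    ("spam", "free cash claim your prize"),
    ("ham", "are we still meeting for lunch"),
    ("ham", "see you at lunch tomorrow"),
]

model = train(TOY_MESSAGES)
print(predict(model, "claim your free prize"))
print(predict(model, "lunch tomorrow"))
```

The classifier treats every word as independent given the label, which is the "naive" assumption behind the weaknesses mentioned above. A four-message training set says nothing about real-world accuracy; the sketch only shows the mechanics.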

Read the tutorial

© 2020 Data Science Central ®
