- Covers basic to advanced topics in an easy step-oriented manner
- Concise on theory, strong focus on practical and hands-on approach
- Explores advanced topics such as hyperparameter tuning, deep natural language processing, neural networks, and deep learning
- Describes state-of-the-art best practices for tuning models to improve accuracy

This book is a practical guide to going from novice to master in machine learning with Python in six steps. The six-step path is based on the “six degrees of separation” theory, which states that everyone and everything is at most six steps away. Note that the theory deals with the quality of connections rather than their existence. Accordingly, great effort has gone into designing six simple steps that progress gradually from fundamentals to advanced topics, helping a beginner move from little or no knowledge of machine learning in Python to being a master practitioner. The book is also useful for current machine learning practitioners who want to learn advanced topics such as hyperparameter tuning, various ensemble techniques, natural language processing (NLP), deep learning, and the basics of reinforcement learning.

Each topic has two parts: the first covers the theoretical concepts, and the second covers practical implementation with different Python packages. The traditional math-first approach to machine learning, i.e., learning all the mathematics and then working out how to apply it to solve problems, takes a great deal of time and effort, which has proven inefficient for working professionals looking to switch careers. Hence this book focuses on simplification: the theory and math behind the algorithms are covered only to the extent required to get you started.

I recommend that you work with the book instead of just reading it. Real learning happens only through active participation. Hence, all the code presented in the book is available as IPython notebooks so that you can try out the examples yourself and extend them to your advantage or interest later.

- Examine the fundamentals of the Python programming language
- Review machine learning history and evolution
- Learn various machine learning system development frameworks
- Learn fundamentals to advanced text mining techniques
- Learn and implement deep learning frameworks

This book will serve as a great resource for learning machine learning concepts and implementation techniques for:

- Python developers or data engineers looking to expand their knowledge or career into the machine learning area.
- Current non-Python (R, SAS, SPSS, Matlab, or any other language) machine learning practitioners looking to expand their implementation skills in Python.
- Novice machine learning practitioners looking to learn advanced topics such as hyperparameter tuning, various ensemble techniques, natural language processing (NLP), deep learning, and the basics of reinforcement learning.

- Introduction
- Chapter 1: Step 1 – Getting Started in Python
- Chapter 2: Step 2 – Introduction to Machine Learning
- Chapter 3: Step 3 – Fundamentals of Machine Learning
- Chapter 4: Step 4 – Model Diagnosis and Tuning
- Chapter 5: Step 5 – Text Mining and Recommender Systems
- Chapter 6: Step 6 – Deep and Reinforcement Learning
- Chapter 7: Conclusion

- The Best Things in Life Are Free
- The Rising Star
- Python 2.7.x or Python 3.4.x?
- Windows Installation
- OSX Installation
- Linux Installation
- Python from Official Website
- Running Python

- Key Concepts
- Python Identifiers
- Keywords
- My First Python Program
- Code Blocks (Indentation & Suites)
- Basic Object Types
- When to Use List vs. Tuples vs. Set vs. Dictionary
- Comments in Python
- Multiline Statement
- Basic Operators
- Control Structure
- Lists
- Tuple
- Sets
- Dictionary
- User-Defined Functions
- Module
- File Input/Output
- Exception Handling

- Endnotes

- Artificial Intelligence Evolution
- Different Forms
- Statistics
- Data Mining
- Data Analytics
- Data Science
- Statistics vs. Data Mining vs. Data Analytics vs. Data Science

- Machine Learning Categories
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning

- Frameworks for Building Machine Learning Systems
- Knowledge Discovery in Databases (KDD)
- Cross-Industry Standard Process for Data Mining
- SEMMA (Sample, Explore, Modify, Model, Assess)
- KDD vs. CRISP-DM vs. SEMMA

- Machine Learning Python Packages
- Data Analysis Packages
- NumPy
- Pandas
- Matplotlib

- Machine Learning Core Libraries

- Data Analysis Packages
- Endnotes

- Machine Learning Perspective of Data
- Scales of Measurement
- Nominal Scale of Measurement
- Ordinal Scale of Measurement
- Interval Scale of Measurement
- Ratio Scale of Measurement

- Feature Engineering
- Dealing with Missing Data
- Handling Categorical Data
- Normalizing Data
- Feature Construction or Generation

- Exploratory Data Analysis (EDA)
- Univariate Analysis
- Multivariate Analysis

- Supervised Learning – Regression
- Correlation and Causation
- Fitting a Slope
- How Good Is Your Model?
- Polynomial Regression
- Multivariate Regression
- Multicollinearity and Variance Inflation Factor (VIF)
- Interpreting the OLS Regression Results
- Regression Diagnosis
- Regularization
- Nonlinear Regression
- Supervised Learning – Classification
- Logistic Regression
- Evaluating a Classification Model Performance
- ROC Curve
- Fitting Line
- Stochastic Gradient Descent
- Regularization
- Multiclass Logistic Regression
- Generalized Linear Models
- Supervised Learning – Process Flow
- Decision Trees
- Support Vector Machine (SVM)
- k Nearest Neighbors (kNN)
- Time-Series Forecasting

- Unsupervised Learning Process Flow
- Clustering
- K-means
- Finding Value of k
- Hierarchical Clustering
- Principal Component Analysis (PCA)

- Endnotes

- Optimal Probability Cutoff Point
- Which Error Is Costly?

- Rare Event or Imbalanced Dataset
- Known Disadvantages

- Which Resampling Technique Is the Best?
- Bias and Variance
- Bias
- Variance

- K-Fold Cross-Validation
- Stratified K-Fold Cross-Validation
- Ensemble Methods
- Bagging
- Feature Importance
- RandomForest
- Extremely Randomized Trees (ExtraTree)
- How Does the Decision Boundary Look?
- Bagging – Essential Tuning Parameters

- Boosting
- Example Illustration for AdaBoost
- Gradient Boosting
- Boosting – Essential Tuning Parameters
- XGBoost (eXtreme Gradient Boosting)

- Ensemble Voting – Machine Learning’s Biggest Heroes United
- Hard Voting vs. Soft Voting

- Stacking
- Hyperparameter Tuning
- GridSearch
- RandomSearch

- Endnotes

- Text Mining Process Overview
- Data Assemble (Text)
- Social Media
- Step 1 – Get Access Key (One-Time Activity)
- Step 2 – Fetching Tweets

- Data Preprocessing (Text)
- Convert to Lower Case and Tokenize
- Removing Noise
- Part of Speech (PoS) Tagging
- Stemming
- Lemmatization
- N-grams
- Bag of Words (BoW)
- Term Frequency-Inverse Document Frequency (TF-IDF)

- Data Exploration (Text)
- Frequency Chart
- Word Cloud
- Lexical Dispersion Plot
- Co-occurrence Matrix

- Model Building
- Text Similarity
- Text Clustering
- Latent Semantic Analysis (LSA)

- Topic Modeling
- Latent Dirichlet Allocation (LDA)
- Non-negative Matrix Factorization

- Text Classification
- Sentiment Analysis
- Deep Natural Language Processing (DNLP)
- Recommender Systems
- Content-Based Filtering
- Collaborative Filtering (CF)

- Endnotes

- Artificial Neural Network (ANN)
- What Goes Behind, When Computers Look at an Image?
- Why Not a Simple Classification Model for Images?
- Perceptron – Single Artificial Neuron
- Multilayer Perceptrons (Feedforward Neural Network)
- Load MNIST Data
- Key Parameters for scikit-learn MLP

- Restricted Boltzmann Machines (RBM)
- MLP Using Keras
- Autoencoders
- Dimension Reduction Using Autoencoder
- De-noise Image Using Autoencoder

- Convolutional Neural Network (CNN)
- CNN on CIFAR10 Dataset
- CNN on MNIST Dataset

- Recurrent Neural Network (RNN)
- Long Short-Term Memory (LSTM)

- Transfer Learning
- Reinforcement Learning
- Endnotes

- Summary
- Tips
- Start with Questions/Hypothesis Then Move to Data!
- Don’t Reinvent the Wheels from Scratch
- Start with Simple Models
- Focus on Feature Engineering
- Beware of Common ML Imposters

- Happy Machine Learning

- Apress Link: Click here!
- Amazon links by location: US, United Kingdom, India, Brazil, Canada, France, Germany, Italy, Japan, Mexico, Netherlands, Spain

© 2020 Data Science Central ®