Probability and statistics underpin much of Data Science and Machine Learning, and interest in both has grown enormously in recent years. Turning the concepts into code can be tricky, though, and often takes a good deal of thought. Allen B. Downey addresses exactly this problem with a book in his ‘Think’ series.

**Think Stats**

Think Stats is one of the books in the ‘Think’ series by Allen B. Downey, published by O’Reilly Media, and it covers probability and statistics for Python programmers. The book helps readers translate the concepts into Python code, using simple techniques to explore real data sets and answer interesting questions. It contains a case study using data from the National Institutes of Health, and it asks readers to work on real-life projects using realistic datasets.

Of course, readers need basic Python skills to make full use of the book. Think Stats is built around a Python library for probability distributions, namely Probability Mass Functions (PMFs) and Cumulative Distribution Functions (CDFs), which includes tools to represent and plot them. Many of the exercises use short programs to run experiments that help readers develop a deeper understanding of the topic.
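To make the two ideas concrete, here is a minimal sketch in plain Python. It does not use the book's own library; the function names (`make_pmf`, `make_cdf`, `cdf_at`) and the sample data are assumptions made up for illustration. A PMF maps each value to its relative frequency, and a CDF accumulates those probabilities in sorted order.

```python
from bisect import bisect_right
from collections import Counter


def make_pmf(values):
    """Map each distinct value to its relative frequency in the sample."""
    counts = Counter(values)
    total = len(values)
    return {x: n / total for x, n in sorted(counts.items())}


def make_cdf(pmf):
    """Return the sorted values and their cumulative probabilities."""
    xs = sorted(pmf)
    ps = []
    running = 0.0
    for x in xs:
        running += pmf[x]
        ps.append(running)
    return xs, ps


def cdf_at(xs, ps, x):
    """Evaluate P(X <= x) for any x, including values not in the sample."""
    i = bisect_right(xs, x)
    return ps[i - 1] if i > 0 else 0.0


# Hypothetical sample, invented for this example.
sample = [1, 2, 2, 3, 5]
pmf = make_pmf(sample)
xs, ps = make_cdf(pmf)

print(pmf[2])              # probability that a value equals 2
print(cdf_at(xs, ps, 2))   # probability that a value is at most 2
```

Keeping the CDF as two parallel sorted lists makes lookups a binary search, which is one reasonable design choice among several.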

Other important topics include outliers, conditional probability, plotting histograms, different types of distributions (e.g. exponential, Pareto, normal), Bayes’ theorem, hypothesis testing, and estimation.

What sets the book apart from others is its inclusion and clear explanation of Bayesian statistics for programmers, a subject the author feels is too important to be neglected. By taking advantage of the PMF and CDF libraries, even beginners can learn the concepts and solve challenging problems that involve them.
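As a rough illustration of how a PMF supports Bayesian reasoning, the sketch below represents a prior as a dictionary of hypotheses and probabilities, multiplies each by the likelihood of the observed data, and renormalizes. This is not the book's code; the `bayes_update` helper and the coin scenario are assumptions invented for this example.

```python
def bayes_update(prior, likelihood):
    """Return the posterior PMF given a prior PMF and per-hypothesis likelihoods.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(data | hypothesis)
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the data has zero probability under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical scenario: a coin is either fair or biased towards heads
# (P(heads) = 0.75), with equal prior belief in both. We observe one head.
prior = {"fair": 0.5, "biased": 0.5}
likelihood_of_heads = {"fair": 0.5, "biased": 0.75}

posterior = bayes_update(prior, likelihood_of_heads)
print(posterior)
```

Calling `bayes_update` again with the posterior as the new prior handles further observations one at a time.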

