5 Free Data Science eBooks For Your Summer Reading List

Going somewhere nice for your summer holidays? Somewhere with a nice beach perhaps – Goa, Grand Cayman or Grimsby? Or a bustling city break? Wherever you’re going, there are sure to be long stretches with little to do but read, so I thought I’d throw together a few free eBooks for your Kindle to while away the hours in the airport, in a traffic jam or on the beach.

The books cover data, analysis, statistics and R programming. They’re all very popular and well suited to early-stage data scientists, and they should get your mental juices flowing with ideas for tackling your data when you get back to your desk.

There’s even a book about data analysis for the life sciences in here, and a bonus book at the end about data cleaning (everybody likes bonuses, right?).

Best of all, they’re all free. My favourite price!

 

Well, in no particular order, here they are.

1. R Programming for Data Science

Author: Roger D. Peng

https://leanpub.com/rprogramming

 

R Programming for Data Science is about the fundamentals of R programming. Starting with the basics of R, you will learn how to manipulate datasets, write functions, and debug and optimise code.

According to Roger:

“Data science has taken the world by storm. Every field of study and area of business has been affected as people increasingly realize the value of the incredible quantities of data being generated. But to extract value from those data, one needs to be trained in the proper data science skills. The R programming language has become the de facto programming language for data science. Its flexibility, power, sophistication, and expressiveness have made it an invaluable tool for data scientists around the world.”

With over 90,000 readers on LeanPub alone, this free ebook has proved a big hit worldwide. As an accompaniment to the Coursera R course, it is a good and easy read on the basics of R. The programming concepts are explained in layman’s terms, and the skills taught in this book will lay the foundation for your journey into data science.

 

The book is offered on the Pay-What-You-Want model, including free, but there is a minimum donation level if you want the accompanying datasets, R code and lecture videos.

 

2. The Elements of Data Analytic Style

Author: Jeff Leek

https://leanpub.com/datastyle

 

Jeff contends that data analysis is “at least as much art as it is science”. This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks.

Like Roger, Jeff is also one of the co-developers of the Johns Hopkins Specialization in Data Science, and this book is a useful reference tool for people tasked with reading and critiquing data analyses.

The Elements of Data Analytic Style, currently being enjoyed by almost 60,000 LeanPub readers, is a concise introduction to all stages of data analysis. It is a good starting point for newcomers and is also useful as a quick reference to check that you’re on the right track.

 

The book is offered on the Pay-What-You-Want model, including free.

 

3. Regression Models for Data Science in R

Author: Brian Caffo

https://leanpub.com/regmods

 

Although at the moment only 90% complete, this book has still attracted over 18,000 LeanPub readers and gives a brief, but rigorous, treatment of regression models from a practical perspective.

The book is intended for practicing data scientists, so you will need a basic understanding of statistical concepts and R programming. As long as you tick these boxes you should be fine.

After reading the book you should be able to perform multivariate regressions and understand their interpretations.

 

This book is also Pay-What-You-Want, including free, and there is a minimum donation level if you want the accompanying datasets, R code and lecture videos.

 

4. OpenIntro Statistics

Authors: David Diez and Mine Cetinkaya-Rundel

https://leanpub.com/openintro-statistics

 

Written by OpenIntro, whose mission is to make educational products that are free and transparent and that lower barriers to education, this book has been used as the course text everywhere from community colleges to Ivy League colleges.

This book provides an excellent introduction to statistical analysis and model thinking, along with tools to challenge yourself along the way (quizzes, tests, real-world examples), and it moves quickly from introductory material to more complex statistics. There is also a website that provides some sample data to use in R and includes useful code snippets.

It is not so much a reference for statistics as a great book for learning how to reason logically about data, probability and statistical tools.

 

The book is offered on the Pay-What-You-Want model, including free, and, helpfully, it is also available as a free tablet-friendly PDF.

 

5. Data Analysis for the Life Sciences

Authors: Rafael A Irizarry and Michael I Love

https://leanpub.com/dataanalysisforthelifesciences

 

This book differs from many statistical textbooks in that it focuses less on mathematics and more on using a computer to perform data analysis. Instead of explaining the mathematics and theory and then showing examples, the authors start with a practical data-related life science challenge. The book also includes the computer code that solves each problem, which helps illustrate the concepts behind the solution and gives you a better intuition for the concepts, the mathematics, and the theory.

This is a good introduction to statistics at the college level, and is particularly good for those entering the life sciences.

 

The book is offered on the Pay-What-You-Want model, including free.

 

Bonus Book: Practical Data Cleaning

Author: Lee Baker

Get Your Copy Here

 

Most of the books in this list are focused on statistics and R programming, so I thought I’d throw in something a little different for your summer reading list.

Practical Data Cleaning is a brief but thorough introduction to the basics of data cleaning, for beginners and the more experienced alike. Following the 19 tips outlined in the book will help you to get organised and avoid many of the most common pitfalls of data collection, cleaning, classification and data integrity.

There is also a free Microsoft Excel Practical Data Cleaning template to help you get a good start with your data.

 

This book is being offered for free, exclusive to the Data Science Central crowd.

***Latest News***

Practical Data Cleaning is now available as a free online video course

 

Summary

 

So there you have it – 5 free eBooks (plus a bonus book) for your summer reading.

I hope you enjoy them, wherever you go.


What do you think?

It would be great if you would leave brief reviews of these books in the comments below – I’m sure all the authors would appreciate your comments and shares.

Join the debate below and let me know your thoughts...


About the Author

Lee Baker is an award-winning software creator with a passion for turning data into a story.

A proud Yorkshireman, he now lives by the sparkling shores of the East Coast of Scotland. Physicist, statistician and programmer, child of the flower-power psychedelic ‘60s, it’s amazing he turned out so normal!

Turning his back on a promising academic career to do something more satisfying, as the CEO and co-founder of Chi-Squared Innovations he now works double the hours for half the pay and 10 times the stress - but 100 times the fun!

He also wanted to be rich, famous and good looking. Ah well...

PS - Don't forget to connect with me on Twitter: @eelrekab



Disclaimer: Practical Data Cleaning was written by the author of this blog post

Comment by Lee Baker on August 17, 2016 at 12:15am

@Tri

you're welcome - I hope you derive great value out of these books

Comment by Tri Puguh Santoso on August 16, 2016 at 7:20am

Cool..thank you so much :)

Comment by Lee Baker on August 3, 2016 at 1:41pm

@Sione

thank you for your comments.

I want to add that I started my programming journey with Matlab, and it's still my go-to application when I want to do something quick and dirty. For me, it's so much quicker and easier to bash something out and get it working than any other language I've worked with.

Comment by Sione Palu on August 2, 2016 at 6:01am

I want to add that the topics I listed in my previous message have plenty of Matlab software available, which the authors have published on the net together with their papers. There is some in R, but since Matlab is predominantly the language of engineering & scientific computing, the majority of researchers publish their papers & develop their code in Matlab, which means that if one needs to learn advanced topics in data science, then knowledge of Matlab is a must. There are sophisticated algorithms available on the net that have no versions in R, Java, Python & what have you. An interested developer or data scientist who is not familiar with Matlab can either implement the algorithm from the author's paper in his/her own language (R, Python, Java, etc.), which is time consuming & can be very complex algorithm-wise, or just grab the author's Matlab package from his site, the Matlab Central repository or GitHub, then experiment or explore.

Comment by Sione Palu on August 2, 2016 at 5:53am

Good books for beginners. The books cover traditional classical techniques that have been familiar to statisticians for a long time. What I mean is that the techniques covered in all those books are what Vincent Granville has blogged about as old techniques, & I agree. Anyway, the books are good for beginners to start with. When they are proficient with those classical techniques, they can of course move on to more recent state-of-the-art techniques, to name a few: multi-view learning, multi-task learning, multi-target learning, semi-supervised learning, low-rank non-negative matrix & tensor factorisation, tensor & matrix completion, etc.

Comment by Lee Baker on July 30, 2016 at 12:12am

@Kelechi

You're very welcome - I hope these books take your data science skills to even greater heights

Comment by Kelechi Francisca Njoku on July 29, 2016 at 7:57pm

Thank you very much for the books.
