Going somewhere nice for your summer holidays? Somewhere with a nice beach perhaps – Goa, Grand Cayman or Grimsby? Or a bustling city break? Wherever you’re going there’s sure to be long periods where you’ll sit for hours on end with little to do but read, so I thought I’d throw together a few free eBooks for your Kindle to while away the long hours in the airport, in a traffic jam or on the beach.
A mixture of books about data, analysis, statistics and R programming, they’re all very popular and great for early-stage data scientists. They should also get your mental juices flowing with ideas about how to tackle your data when you get back to your desk.
There’s even a book about data analysis for the life sciences in here, and a bonus book at the end about data cleaning (everybody likes bonuses, right?).
Best of all, they’re all free. My favourite price!
Well, in no particular order, here they are.
Author: Roger D. Peng
https://leanpub.com/rprogramming
R Programming for Data Science is about the fundamentals of R programming. Starting with the basics of R, you will learn how to manipulate datasets, write functions, and how to debug and optimise code.
According to Roger:
“Data science has taken the world by storm. Every field of study and area of business has been affected as people increasingly realize the value of the incredible quantities of data being generated. But to extract value from those data, one needs to be trained in the proper data science skills. The R programming language has become the de facto programming language for data science. Its flexibility, power, sophistication, and expressiveness have made it an invaluable tool for data scientists around the world.”
With over 90,000 readers on LeanPub alone, this free ebook has proved a big hit worldwide. As an accompaniment to the Coursera R course, it is a good and easy read on the basics of R. With the programming theories written in layman’s terms, the skills taught in this book will lay the foundation to begin your journey learning data science.
The book is offered on the Pay-What-You-Want model, including free, but there is a minimum donation level if you want the accompanying datasets, R code and lecture videos.
Author: Jeff Leek
Jeff contends that data analysis is “at least as much art as it is science”. This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks.
Like Roger, Jeff is one of the co-developers of the Johns Hopkins Specialization in Data Science, and this book is a useful reference tool for people tasked with reading and critiquing data analyses.
The Elements of Data Analytic Style, currently being enjoyed by almost 60,000 LeanPub readers, is a concise introduction to all stages of data analysis. It is a good starting point for newcomers and also works well as a look-up reference to check that you’re on the right track.
The book is offered on the Pay-What-You-Want model, including free.
Author: Brian Caffo
Although at the moment only 90% complete, this book has still attracted over 18,000 LeanPub readers and gives a brief, but rigorous, treatment of regression models from a practical perspective.
The book is intended for practicing data scientists, but as long as you have a basic understanding of statistical concepts and R programming you should be fine.
After reading the book you should be able to perform multivariate regressions and understand their interpretations.
This book is also Pay-What-You-Want, including free, and there is a minimum donation level if you want the accompanying datasets, R code and lecture videos.
Authors: David Diez and Mine Cetinkaya-Rundel
https://leanpub.com/openintro-statistics
Written by OpenIntro, whose mission is to make educational products that are free and transparent and that lower barriers to education, this book has been used as a course text everywhere from community colleges to Ivy League colleges.
This book provides an excellent introduction to statistical analysis and model thinking, with tools to challenge yourself along the way (quizzes, tests, real-world examples), and it quickly accelerates from the introduction to more complex statistics. There is also a website that provides some sample data to use in R and includes useful code snippets.
It is not so much a statistics reference as a great book for learning how to reason logically about data, probability and statistical tools.
The book is offered on the Pay-What-You-Want model, including free, and helpfully, they also offer it as a tablet-friendly pdf, also free.
Authors: Rafael A Irizarry and Michael I Love
https://leanpub.com/dataanalysisforthelifesciences
This book differs from many statistical textbooks in that it focuses less on mathematics and more on using a computer to perform data analysis. Instead of explaining the mathematics and theory and then showing examples, the authors start with a practical data-related life science challenge. The book also includes the computer code that solves the problem and helps illustrate the concepts behind the solution, giving you a better intuition for the concepts, the mathematics, and the theory.
This is a good introduction to statistics at the college level, and is particularly good for those entering the life sciences.
The book is offered on the Pay-What-You-Want model, including free.
Author: Lee Baker
Most of the books in this list are focused on statistics and R programming, so I thought I’d throw in something a little different for your summer reading list.
Practical Data Cleaning is a brief but thorough introduction to the basics of data cleaning for beginners and the more experienced. Following the 19 tips outlined in the book will help you to get organised and avoid many of the most common pitfalls of data collection, cleaning, classification and data integrity.
There is also a free Microsoft Excel Practical Data Cleaning template to help you get a good start with your data.
This book is being offered for free, exclusive to the Data Science Central crowd.
***Latest News***
Practical Data Cleaning is now available as a free online video course
So there you have it – 5 free eBooks (plus a bonus book) for your summer reading.
I hope you enjoy them, wherever you go.
What do you think?
It would be great if you would leave brief reviews of these books in the comments below – I’m sure all the authors would appreciate your comments and shares.
Join the debate below and let me know your thoughts...
About the Author
Lee Baker is an award-winning software creator with a passion for turning data into a story.
A proud Yorkshireman, he now lives by the sparkling shores of the East Coast of Scotland. Physicist, statistician and programmer, child of the flower-power psychedelic ‘60s, it’s amazing he turned out so normal!
Turning his back on a promising academic career to do something more satisfying, as the CEO and co-founder of Chi-Squared Innovations he now works double the hours for half the pay and 10 times the stress - but 100 times the fun!
He also wanted to be rich, famous and good looking. Ah well...
PS - Don't forget to connect with me on Twitter: @eelrekab
Disclaimer: Practical Data Cleaning was written by the author of this blog post
Comment
@Tri
you're welcome - I hope you derive great value out of these books
Cool..thank you so much :)
@Sione
thank you for your comments.
I want to add that I started my programming journey with Matlab, and it's still my go-to application when I want to do something quick and dirty. For me, it's so much quicker and easier to bash something out and get it working than any other language I've worked with.
I want to add that the topics I mentioned in my previous message have plenty of Matlab software available, released online by the authors alongside their papers. There are some in R, but since Matlab is predominantly the language of engineering & scientific computing, the majority of researchers publish their papers & develop their code in Matlab, which means that if one needs to learn advanced topics in data science, then knowledge of Matlab is a must. There are sophisticated algorithms available on the net that have no versions in R, Java, Python & what have you. An interested developer or data scientist who is not familiar with Matlab may want to implement the algorithm from a particular author's paper in his/her own language (R, Python, Java, etc.), which is time consuming & can be very complex to do algorithm-wise, or just grab the author's Matlab package from his site, the Matlab Central repository or GitHub, then experiment or explore.
Good books for beginners. The books cover traditional classical techniques that have been familiar to statisticians for a long time. What I mean is that the techniques covered in all those books are what Vincent Granville has blogged about and called old techniques, & I agree. Anyway, the books are good for beginners to start with. When they are proficient with those classical techniques, they can of course move on to more recent state-of-the-art techniques, to name a few: multi-view learning, multi-task learning, multi-target learning, semi-supervised learning, low-rank non-negative matrix & tensor factorisation, tensor & matrix completion, etc.
@Kelechi
You're very welcome - I hope these books take your data science skills to even greater heights
Thank you very much for the books.