This book is also part of our apprenticeship. Some of its content, along with new material, is in a separate document called the Addendum, which is available for download. The book is available on Barnes and Noble. Also, read our article on strong correlations to see how various sections of the book apply to modern data science. If you are starting from zero, read my data science cheat sheet first: it will greatly facilitate reading the book.
My second book - Data Science 2.0 - can be checked out here. The book described on this page is my first book.
About the Author
Dr. Vincent Granville is a visionary data scientist with 15 years of experience in big data, predictive modeling, and digital and business analytics. Vincent is widely recognized as the leading expert in scoring technology, fraud detection, and web traffic optimization and growth. Over the last ten years, he has worked in real-time credit card fraud detection with Visa, advertising mix optimization with CNET, change point detection with Microsoft, online user experience with Wells Fargo, search intelligence with InfoSpace, automated bidding with eBay, and click fraud detection with major search engines, ad networks, and large advertising clients.
Most recently, Vincent launched Data Science Central, the leading social network for big data, business analytics and data science practitioners. Vincent is a former postdoctoral researcher at Cambridge University and the National Institute of Statistical Sciences. He was among the finalists at the Wharton School Business Plan Competition and at the Belgian Mathematical Olympiads. Vincent has published 40 papers in statistical journals (including Journal of the Royal Statistical Society - Series B, IEEE Transactions on Pattern Analysis and Machine Intelligence, and Journal of Number Theory) and a Wiley book on data science, and is an invited speaker at international conferences. He also developed a new data mining technology known as hidden decision trees, owns multiple patents, published the first data science eBook, and raised $6MM in start-up funding. Vincent is one of the top 20 big data influencers according to Forbes, and was also featured on CNN.
Introduction
To find out whether this book might be useful to you, read my introduction.
Table of Contents
Chapter 1 - What Is Data Science? 1
Real Versus Fake Data Science 2
The Data Scientist 9
Data Science Applications in 13 Real-World Scenarios 13
Data Science History, Pioneers, and Modern Trends 30
Summary 39
Chapter 2 - Big Data Is Different 41
Two Big Data Issues 41
Examples of Big Data Techniques 51
What MapReduce Can’t Do 60
Communication Issues 63
Data Science: The End of Statistics? 65
The Big Data Ecosystem 70
Summary 71
Chapter 3 - Becoming a Data Scientist 73
Key Features of Data Scientists 73
Types of Data Scientists 78
Data Scientist Demographics 82
Training for Data Science 82
Data Scientist Career Paths 89
Summary 107
Chapter 4 - Data Science Craftsmanship, Part I 109
New Types of Metrics 110
Choosing Proper Analytics Tools 113
Visualization 118
Statistical Modeling Without Models 122
Three Classes of Metrics: Centrality, Volatility, Bumpiness 125
Statistical Clustering for Big Data 129
Correlation and R-Squared for Big Data 130
Computational Complexity 137
Structured Coefficient 140
Identifying the Number of Clusters 141
Internet Topology Mapping 143
Securing Communications: Data Encoding 147
Summary 149
Chapter 5 - Data Science Craftsmanship, Part II 151
Data Dictionary 152
Hidden Decision Trees 153
Model-Free Confidence Intervals 158
Random Numbers 161
Four Ways to Solve a Problem 163
Causation Versus Correlation 165
How Do You Detect Causes? 166
Life Cycle of Data Science Projects 168
Predictive Modeling Mistakes 171
Logistic-Related Regressions 172
Experimental Design 176
Analytics as a Service and APIs 178
Miscellaneous Topics 183
New Synthetic Variance for Hadoop and Big Data 187
Summary 193
Chapter 6 - Data Science Application Case Studies 195
Stock Market 195
Encryption 209
Fraud Detection 216
Digital Analytics 230
Miscellaneous 245
Summary 253
Chapter 7 - Launching Your New Data Science Career 255
Job Interview Questions 255
Testing Your Own Visual and Analytic Thinking 263
From Statistician to Data Scientist 268
Taxonomy of a Data Scientist 273
400 Data Scientist Job Titles 279
Salary Surveys 281
Summary 285
Chapter 8 - Data Science Resources 287
Professional Resources 287
Career-Building Resources 295
Summary 298
Index 299
Comment
Lucky me! I actually bought this book in June 2014, then learned about this website when I was reading page 85. Now I can devote my time to this valuable website as well as the resourceful book.
Are there any plans to have it in different languages? Spanish for example... How can I contribute?
Just ordered this book through Amazon. Eagerly waiting to read it.
Bought it on Amazon...looking forward to it arriving.
How can I get a copy of that book?
Got a copy from amazon.com last week. I am more than halfway through.
Thank you Dr. Granville for such a wonderful book. This helps me focus and reprioritize my learning goals.
Looking forward to honing my data science skills and completing the Data Science Apprenticeship.
Just got the epub version from Kobo. Looking forward to seeing how R and Hadoop are intertwined.
I just got my copy of the data science book from Wiley. I want to say a big thank you to Dr. Vincent for writing such a practical handbook, devoid of any clutter or hype on data science.
Great work with practical relevance. Most of the current books or courses on data science tell you either Step-0 or Step-10. This book will teach you from Step-0 to Step-10 and build the foundation for the future.
A must for everyone who wants to plunge into data science.
Enjoy the ride!
Good book. I am about halfway through it. It is very informative, to the point, and clearly written.
© 2018 Data Science Central