This book is also part of our apprenticeship. Some of its content, along with new material, is in a separate document called the Addendum, which is available for download. The book is available at Barnes and Noble. Also, read our article on strong correlations to see how various sections of the book apply to modern data science. If you are starting from scratch, read my data science cheat sheet first: it will make the book much easier to read.
My second book, Data Science 2.0, is also available. The book described on this page is my first book.
About the Author
Dr. Vincent Granville is a visionary data scientist with 15 years of experience in big data, predictive modeling, and digital and business analytics. Vincent is widely recognized as the leading expert in scoring technology, fraud detection, and web traffic optimization and growth. Over the last ten years, he has worked on real-time credit card fraud detection with Visa, advertising mix optimization with CNET, change point detection with Microsoft, online user experience with Wells Fargo, search intelligence with InfoSpace, automated bidding with eBay, and click fraud detection with major search engines, ad networks, and large advertising clients.
Most recently, Vincent launched Data Science Central, the leading social network for big data, business analytics, and data science practitioners. Vincent is a former postdoctoral researcher at Cambridge University and the National Institute of Statistical Sciences. He was among the finalists at the Wharton School Business Plan Competition and at the Belgian Mathematical Olympiads. Vincent has published 40 papers in statistical journals (including the Journal of the Royal Statistical Society, Series B; IEEE Transactions on Pattern Analysis and Machine Intelligence; and the Journal of Number Theory) as well as a Wiley book on data science, and he is an invited speaker at international conferences. He also developed a new data mining technology known as hidden decision trees, owns multiple patents, published the first data science eBook, and raised $6MM in start-up funding. Vincent is one of the top 20 big data influencers according to Forbes, and was also featured on CNN.
Introduction
To find out whether this book might be useful to you, read my introduction.
Table of Contents
Chapter 1 - What Is Data Science? 1
Real Versus Fake Data Science 2
The Data Scientist 9
Data Science Applications in 13 Real-World Scenarios 13
Data Science History, Pioneers, and Modern Trends 30
Summary 39
Chapter 2 - Big Data Is Different 41
Two Big Data Issues 41
Examples of Big Data Techniques 51
What MapReduce Can’t Do 60
Communication Issues 63
Data Science: The End of Statistics? 65
The Big Data Ecosystem 70
Summary 71
Chapter 3 - Becoming a Data Scientist 73
Key Features of Data Scientists 73
Types of Data Scientists 78
Data Scientist Demographics 82
Training for Data Science 82
Data Scientist Career Paths 89
Summary 107
Chapter 4 - Data Science Craftsmanship, Part I 109
New Types of Metrics 110
Choosing Proper Analytics Tools 113
Visualization 118
Statistical Modeling Without Models 122
Three Classes of Metrics: Centrality, Volatility, Bumpiness 125
Statistical Clustering for Big Data 129
Correlation and R-Squared for Big Data 130
Computational Complexity 137
Structured Coefficient 140
Identifying the Number of Clusters 141
Internet Topology Mapping 143
Securing Communications: Data Encoding 147
Summary 149
Chapter 5 - Data Science Craftsmanship, Part II 151
Data Dictionary 152
Hidden Decision Trees 153
Model-Free Confidence Intervals 158
Random Numbers 161
Four Ways to Solve a Problem 163
Causation Versus Correlation 165
How Do You Detect Causes? 166
Life Cycle of Data Science Projects 168
Predictive Modeling Mistakes 171
Logistic-Related Regressions 172
Experimental Design 176
Analytics as a Service and APIs 178
Miscellaneous Topics 183
New Synthetic Variance for Hadoop and Big Data 187
Summary 193
Chapter 6 - Data Science Application Case Studies 195
Stock Market 195
Encryption 209
Fraud Detection 216
Digital Analytics 230
Miscellaneous 245
Summary 253
Chapter 7 - Launching Your New Data Science Career 255
Job Interview Questions 255
Testing Your Own Visual and Analytic Thinking 263
From Statistician to Data Scientist 268
Taxonomy of a Data Scientist 273
400 Data Scientist Job Titles 279
Salary Surveys 281
Summary 285
Chapter 8 - Data Science Resources 287
Professional Resources 287
Career-Building Resources 295
Summary 298
Index 299
Comment
This looks like it will be an eclectic -- and valuable -- list of topics. Congrats on your progress and I'll definitely be buying this as soon as it's released!
Great effort. Looking forward to the book.
Looking forward to the release.
I expect this book to be a bit more oriented to the stats side of the process compared to other titles.
I'm looking forward to this release and to buying the hard copy.
Good job Vincent. I look forward to reading it. I'm sure it's better than John Foreman's book, which is IMO the best one out there at the moment.
Congrats on the book and the progress. (I feel your pain: I'm about 75% done with my book that is to be completed late in December). I'm looking forward to reading it.
Great! Eagerly awaiting the completed version.
Vincent, This is fabulous stuff. Looking forward to the complete book.
© 2019 Data Science Central ®