This book is also part of our apprenticeship. Some of its content, along with new material, is in a separate document called the Addendum. Click here to download the addendum. The book is available at Barnes and Noble. Also, read our article on strong correlations to see how various sections of the book apply to modern data science. If you are starting from scratch, read my data science cheat sheet first: it will make the book much easier to read.
My second book - Data Science 2.0 - can be checked out here. The book described on this page is my first book.
About the Author
Dr. Vincent Granville is a visionary data scientist with 15 years of big data, predictive modeling, digital and business analytics experience. Vincent is widely recognized as the leading expert in scoring technology, fraud detection and web traffic optimization and growth. Over the last ten years, he has worked in real-time credit card fraud detection with Visa, advertising mix optimization with CNET, change point detection with Microsoft, online user experience with Wells Fargo, search intelligence with InfoSpace, automated bidding with eBay, and click fraud detection with major search engines, ad networks and large advertising clients.
Most recently, Vincent launched Data Science Central, the leading social network for big data, business analytics and data science practitioners. Vincent is a former postdoctoral researcher at Cambridge University and the National Institute of Statistical Sciences. He was among the finalists at the Wharton School Business Plan Competition and at the Belgian Mathematical Olympiads. Vincent has published 40 papers in statistical journals (including Journal of the Royal Statistical Society - Series B, IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal of Number Theory) and a Wiley book on data science, and he is an invited speaker at international conferences. He also developed a new data mining technology known as hidden decision trees, owns multiple patents, published the first data science eBook, and raised $6MM in start-up funding. Vincent is one of the top 20 big data influencers according to Forbes, and was also featured on CNN.
Introduction
To find out whether this book might be useful to you, read my introduction.
Table of Contents
Chapter 1 - What Is Data Science?
Real Versus Fake Data Science
The Data Scientist
Data Science Applications in 13 Real-World Scenarios
Data Science History, Pioneers, and Modern Trends
Summary
Chapter 2 - Big Data Is Different
Two Big Data Issues
Examples of Big Data Techniques
What MapReduce Can’t Do
Communication Issues
Data Science: The End of Statistics?
The Big Data Ecosystem
Summary
Chapter 3 - Becoming a Data Scientist
Key Features of Data Scientists
Types of Data Scientists
Data Scientist Demographics
Training for Data Science
Data Scientist Career Paths
Summary
Chapter 4 - Data Science Craftsmanship, Part I
New Types of Metrics
Choosing Proper Analytics Tools
Visualization
Statistical Modeling Without Models
Three Classes of Metrics: Centrality, Volatility, Bumpiness
Statistical Clustering for Big Data
Correlation and R-Squared for Big Data
Computational Complexity
Structured Coefficient
Identifying the Number of Clusters
Internet Topology Mapping
Securing Communications: Data Encoding
Summary
Chapter 5 - Data Science Craftsmanship, Part II
Data Dictionary
Hidden Decision Trees
Model-Free Confidence Intervals
Random Numbers
Four Ways to Solve a Problem
Causation Versus Correlation
How Do You Detect Causes?
Life Cycle of Data Science Projects
Predictive Modeling Mistakes
Logistic-Related Regressions
Experimental Design
Analytics as a Service and APIs
Miscellaneous Topics
New Synthetic Variance for Hadoop and Big Data
Summary
Chapter 6 - Data Science Application Case Studies
Stock Market
Encryption
Fraud Detection
Digital Analytics
Miscellaneous
Summary
Chapter 7 - Launching Your New Data Science Career
Job Interview Questions
Testing Your Own Visual and Analytic Thinking
From Statistician to Data Scientist
Taxonomy of a Data Scientist
400 Data Scientist Job Titles
Salary Surveys
Summary
Chapter 8 - Data Science Resources
Professional Resources
Career-Building Resources
Summary
Index
Comment
Vincent, I'm looking forward to reading it. sure it'll help me to be more oriented in BD business
Gr8 book , looking forward to grab copy.
Great work so far, looks like a good read.
You don't happen to have an indicative release date do you?
@Brian: I can bring the issue to Wiley, not sure if they will/can change the title, though there was some hesitation before deciding on a title.
Interesting book and I look forward to reading it. One thing. Can you change the title to "Developing Data Scientists", "Developing Data Analysts" or something that does not use the word "Talent" in it? The word Talent has been massively abused by HR people and seems to connote that there is something innate about analytic or any other learned ability. There is enough research out there that shows that there is no such thing as innate talent and contradicts what you are trying to do which is teach people how to become Data Scientists/Analysts. We are people, not "Talent" needing an agent to represent us.
Thorough and clear. Do future chapters address competencies other than those technical (Analytics, Data Management, Software)? Topics such as how to interface with business generalists, build EQ, and influence organizations would be valuable, given how data scientists are expanding beyond their traditional back room role.
Vincent I don't know yet if you have included comments on "new trends" like e.g. sentiment analysis. Waiting to read it and may be I will be in a position (as far as my knowledge is concerned) to help. I think this book could be evolved in a valuable knowledge device for our expertise and profession.
Vincent, Is there any way we can get draft content of selected chapter to read and comment before its publication? Something like the "rough cuts" that OReilly gives? Right now we have table of contents to read but reading the actual content even if in draft form would be great!
Congrats! Would love to read the book.
© 2021 TechTarget, Inc.