Full title: Applied Stochastic Processes, Chaos Modeling, and Probabilistic Properties of Numeration Systems. An alternative title is Organized Chaos. Published June 2, 2018. Author: Vincent Granville, PhD. (104 pages, 16 chapters.)
This book is intended for professionals in data science, computer science, operations research, statistics, machine learning, big data, and mathematics. In 100 pages, it covers many new topics, offering a fresh perspective on the subject. It is accessible to practitioners with a two-year college-level exposure to statistics and probability. The compact and tutorial style, featuring many applications (Blockchain, quantum algorithms, HPC, random number generation, cryptography, Fintech, web crawling, statistical testing) with numerous illustrations, is aimed at practitioners, researchers and executives in various quantitative fields.
New ideas, advanced topics, and state-of-the-art research are discussed in simple English, without using jargon or arcane theory. The book unifies topics that are usually part of different fields (data science, operations research, dynamical systems, computer science, number theory, probability), broadening the knowledge and interest of the reader in ways that are not found in any other book. This short book contains a large amount of condensed material that would typically be covered in 500 pages in traditional publications. Thanks to cross-references and redundancy, the chapters can be read independently, in random order.
This book is available for Data Science Central members exclusively. The text in blue consists of clickable links to provide the reader with additional references. Source code and Excel spreadsheets summarizing computations are also accessible as hyperlinks for easy copy-and-paste or replication purposes. The most recent version of this book is available from this link, accessible to DSC members only.
A complement to this book is my article (March 2019) about the theory of randomness, available here. This long article will be part of my upcoming Machine Learning book, entitled The Art of Data Science. An original business application can be found here.
About the author
Vincent Granville is a start-up entrepreneur, patent owner, author, investor, and pioneering data scientist with 30 years of corporate experience in companies small and large (eBay, Microsoft, NBC, Wells Fargo, Visa, CNET). He is also a former VC-funded executive, with a strong academic and research background including Cambridge University.
Download the book (members only)
Click here to get the book. For Data Science Central members only. If you have any issues accessing the book, please contact us at [email protected]. To become a member, click here.
Content
The book covers the following topics:
1. Introduction to Stochastic Processes
We introduce these processes, used routinely by Wall Street quants, with a simple approach consisting of re-scaling random walks to make them time-continuous, with a finite variance, based on the central limit theorem.
2. Integration, Differentiation, Moving Averages
We introduce more advanced concepts about stochastic processes, yet we make these concepts easy to understand even for the non-expert. This is a follow-up to Chapter 1.
3. Self-Correcting Random Walks
We investigate here a breed of stochastic processes that are different from Brownian motion, yet are better models in many contexts, including Fintech.
4. Stochastic Processes and Tests of Randomness
In this transition chapter, we introduce a different type of stochastic process, with number theory and cryptography applications, analyzing statistical properties of numeration systems along the way. This is a recurrent theme in the next chapters, offering many research opportunities and applications. While we are dealing with deterministic sequences here, they behave very much like stochastic processes, and are treated as such. Statistical testing is central to this chapter, introducing tests that will also be used in the last chapters.
5. Hierarchical Processes
We start discussing random number generation, and numerical and computational issues in simulations, applied to an original type of stochastic process. This will become a recurring theme in the next chapters, as it applies to many other processes.
6. Introduction to Chaotic Systems
While typically studied in the context of dynamical systems, the logistic map can be viewed as a stochastic process, with an equilibrium distribution and probabilistic properties, just like numeration systems (next chapters) and processes introduced in the first four chapters.
7. Chaos, Logistic Map and Related Processes
We study processes related to the logistic map, including a special logistic map discussed here for the first time, with a simple equilibrium distribution. This chapter offers a transition between chapter 6 and the next chapters on numeration systems (the logistic map being one of them).
8. Numerical and Computational Issues
These issues have been mentioned in chapter 7, and also appear in chapters 9, 10 and 11. Here we take a deeper dive and offer solutions, using high precision computing with BigNumber libraries.
9. Digits of Pi, Randomness, and Stochastic Processes
Deep mathematical and data science research (including a result about the randomness of Pi, which is just a particular case) is presented here, without using arcane terminology or complicated equations. The numeration systems discussed here are a particular case of deterministic sequences behaving just like the stochastic processes investigated earlier, in particular the logistic map, which is itself a particular case.
10. Numeration Systems in One Picture
Here you will find a summary of much of the material previously covered on chaotic systems, in the context of numeration systems (in particular, chapters 7 and 9).
11. Numeration Systems: More Statistical Tests and Applications
In addition to featuring new research results and building on the previous chapters, the topics discussed here offer a great sandbox for data scientists and mathematicians.
12. The Central Limit Theorem Revisited
The central limit theorem explains the convergence of discrete stochastic processes to Brownian motions, and has been cited a few times in this book. Here we also explore a version that applies to deterministic sequences. Such sequences are treated as stochastic processes in this book.
13. How to Detect if Numbers are Random or Not
We explore here some deterministic sequences of numbers, behaving like stochastic processes or chaotic systems, together with another interesting application of the central limit theorem.
14. Arrival Time of Extreme Events in Time Series
Time series, as discussed in the first chapters, are also stochastic processes. Here we discuss a topic rarely investigated in the literature: the arrival times, as opposed to the extreme values (a classic topic), associated with extreme events in time series.
15. Miscellaneous Topics
We investigate topics related to time series as well as other popular stochastic processes such as spatial processes.
16. Exercises
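As a rough illustration of the idea behind chapters 6 and 7 (a deterministic map that has an equilibrium distribution, like a stochastic process), here is a minimal Python sketch. It is not taken from the book. It assumes the standard logistic map x -> 4x(1-x), whose equilibrium distribution is known to be the arcsine law with CDF (2/pi) * arcsin(sqrt(t)). The seed 0.3, the orbit length, and all function names are illustrative choices. Plain double-precision floats are used, so long orbits are subject to the numerical issues that chapter 8 addresses with high precision computing.

```python
import math


def logistic_orbit(x0, n, burn_in=100):
    """Iterate x -> 4x(1-x) from x0; return n values after a short burn-in."""
    x = x0
    for _ in range(burn_in):
        x = 4.0 * x * (1.0 - x)
    orbit = []
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
        orbit.append(x)
    return orbit


def arcsine_cdf(t):
    """Equilibrium CDF of the logistic map with parameter 4 (arcsine law)."""
    return (2.0 / math.pi) * math.asin(math.sqrt(t))


def empirical_cdf(values, t):
    """Fraction of observed values that are <= t."""
    return sum(1 for v in values if v <= t) / len(values)


# Illustrative run: seed and length are arbitrary assumptions.
orbit = logistic_orbit(0.3, 50_000)
for t in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(t, round(empirical_cdf(orbit, t), 3), round(arcsine_cdf(t), 3))
```

The two printed columns should be close to each other: the orbit is fully deterministic, yet its values are spread over [0, 1] as if they were drawn from the equilibrium distribution.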
Comment
Hi Mohammed,
I believe so. If you look at chapter 5 (six degrees of separation) it applies to Youtube videos as well, in the sense that there is a path involving no more than six links from any Youtube video to any other one. Using a recursive algorithm for (automated) crawling is not a good idea though, as explained in chapter 5. Also, some videos are somewhat disconnected from the vast majority of Youtube videos. For instance, can you start with a video of the Beatles, and end up after any amount of browsing, discovering a machine learning video? Maybe not, and it means that the Youtube graph is not fully connected, and you need a number of seed videos from each connected component when doing your browsing, in order to retrieve all of them.
Vincent
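The crawling point in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from chapter 5: the link graph is a made-up toy dictionary, the names crawl and toy_links are hypothetical, and a breadth-first search with an explicit queue stands in for "a non-recursive algorithm". It shows why one seed per connected component is needed to retrieve everything.

```python
from collections import deque


def crawl(links, seeds):
    """Breadth-first crawl with an explicit queue (no recursion).

    links: dict mapping a page to the list of pages it links to.
    seeds: starting pages, ideally one or more per connected component.
    """
    visited = set()
    queue = deque()
    for seed in seeds:
        if seed not in visited:
            visited.add(seed)
            queue.append(seed)
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return visited


# Toy graph (made up): two groups of videos with no links between them.
toy_links = {
    "beatles_1": ["beatles_2"],
    "beatles_2": ["beatles_1", "beatles_3"],
    "beatles_3": [],
    "ml_1": ["ml_2"],
    "ml_2": ["ml_1"],
}

print(sorted(crawl(toy_links, ["beatles_1"])))
print(sorted(crawl(toy_links, ["beatles_1", "ml_1"])))
```

Starting from a single seed only reaches the videos in that seed's component. Adding a seed from the other component is what makes the full set reachable.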
Thank you for the book, Vincent!
We all appreciate your work!
Just an innocent question: could this be applied to youtube browsing ? and then what can be learned from that ?
For those that are members, but still can't access the page, please clear your cache and cookies or try with a different browser. If you still have issues please email us at [email protected]
For everyone that can't download the book, you need to have patience for your membership to be approved once you registered. Please don't leave your email in the comments, since it will be publicly available.
For any issues, contact us at [email protected]
I updated the page where the PDF copy is located, to make it far easier to find and download the book. I also replied to all requests, with a copy of the PDF document.
Thank you!
Thank you Vincent!