Subscribe to DSC Newsletter

A Simple Introduction to Complex Stochastic Processes - Part 2

In my first article on this topic (see here) I introduced some of the complex stochastic processes used by Wall Street data scientists, using a simple approach that can be understood by people with no statistics background other than a first course such as stats 101. I defined and illustrated the continuous Brownian motion (the mother of all these stochastic processes) using approximations by discrete random walks, simply re-scaling the X-axis and the Y-axis appropriately, and making time increments (the X-axis) smaller and smaller, so that the limiting process is a time-continuous one. This was done without using any complicated mathematics such as measure theory or filtrations.

Here I am going one step further, introducing the integral and derivative of such processes, using rudimentary mathematics. All the articles that I've found on this subject are full of complicated equations and formulas. It is not the case here. Not only do I explain this material in simple English, but I also provide pictures to show how an Integrated Brownian motion looks like (I could not find such illustrations in the literature), how to compute its variance, and focus on applications, especially to number theory, Fintech and cryptography problems. Along the way, I discuss moving averages in a theoretical but basic framework (again with pictures), discussing what the optimal window should be for these (time-continuous or discrete) time series.

1. General Framework

As in my previous article, we define a time-continuous process as the limit of a time-discrete process. The time-discrete process is referred to as the base process. The fundamental example is as follows:

Start with discrete random variables U(k), k = 1, 2, and so on (the base process) that are independently and identically distributed, with mean equal to 0. Typically, the time series { U(k) } is a white noise.

  1. Define X(n) = U(1) + ... + U(n)
  2. Standardize X(n) so that its variance does not depend on n, that is, introduce Y(n) = X(n) / SQRT(n). This step consists of re-scaling the Y-axis.
  3. Re-scale the time-axis (the X-axis) so that time increments are now equal to 1 / n rather than 1, and let n tends to infinity.  The limiting variable for Y(n), as n tends to infinity, is denoted as Z(1). The value at time t (t being continuous this time) is the limiting value of Y(INT(nt)) as n tends to infinity, where INT is the integer part function, and it is denoted as Z(t).

The collection of random variables { Z(t) } defines the resulting, time-continuous, properly re-scaled stochastic process. In this case, Var[Z(t)] = t Var[U(1)]. Also Z(t) has a Gaussian distribution, by construction and by virtue of the central limit theorem. This process is known as a Brownian motion. The initial random variables U(k) could be Gaussian, or uniform on {-1, +1}, or uniform on [-1, +1]. It does not matter. See illustration here

This process is continuous everywhere but nowhere differentiable. Thus the idea to build processes that are derived from { Z(t) }, but smoother (differentiable everywhere.) We introduce two such types of processes that meet this goal:

  • The cumulative or integrated process { S(t) } derived from { Z(t) }
  • The theoretical moving average process { M(t) } derived from { Z(t) }

Finally, we also define the inverse operation of integration as differentiation. The differentiated process of S(t) is Z(t). In practice, the smoother (integrated or moving average) process is easier to study and sometimes displays patterns that can't be identified in the original process. More on this in the last section.

2. Integrated, Moving Average and Differential Process

Here we define three operations to transform a stochastic process into another one, hopefully more useful for interpretation and decision making, than the original process: Integration, differentiation, and moving average. In all three cases, the construction follows the same principle: In the construction of the Brownian motion described in the previous section, replace X(n) = U(1) + ... + U(n) by X(n) = V(1) + ... + V(n), where V(k) is described below for each transformation. We also discuss some of the challenges to make this methodology mathematically robust.

  • Integrated Process, Construction: V(k) = U(1) + ... + U(k).
  • Differential Process, Construction: V(k) = U(k+1) - U(k). If { Z(t) } is the Brownian motion described in the introduction, then the resulting process is a white noise: continuous and differentiable nowhere.
  • Moving Average Process, Construction: V(k) = U(k) + U(k + 1) + ... + U(k + h(k)) where h(k) is as small as possible to make the resulting process continuous and differentiable everywhere.  For Brownian motions,  h(k) = SQRT(k) works. Does h(k) = log(k) work? This would make the resulting process far more similar to the original one, but maybe barely (if at all) continuous -- in other words, more chaotic than with h(k) = SQRT(k).


The general construction procedure described above needs to be further studied from a theoretical point of view, for the following reasons:

  • Are the derived processes depending on U(1), U(2) and so on? This should not be the case. This issue is especially critical for the differentiated process.
  • Is the differentiation operation truly the reverse of integration?
  • Are my definitions of differential and integrated processes compatible with or equivalent to the highly technical definitions found in the literature?

Proper Re-scaling and Variance Computation

The second step in the general framework (see first section) needs to be adjusted to get things right, when constructing differential, integrated, or moving average processes. In short, you must keep the variance of X(n) not depending on n as n tends to infinity. Let us show you how it works for the integrated Brownian motion. In this case, for the integrated process, we have:


So, the proper re-scaling factor for the Y-axis, as n tends to infinity, is in this case Y(n) = X(n) / SQRT(n^3 / 3). Using similar arguments, one can easily prove that

Also, S(t) has a Gaussian distribution for the same reasons as described in the first section. The same logic applies to compute Var[M(t)]. The details are left as an exercise. A more complicated exercise consists of computing the covariance between S(t) and S(t - s) for s > 0, and proving that {S(t) } is NOT a Brownian motion itself (being differentiable everywhere unlike Brownian motions.) 

Figure 1: Brownian motion realization (blue) and its moving average (red)

In Figures 1 and 2, the X-axis represents the time axis, between 0 and 1. The Brownian motion is continuous everywhere while differentiable nowhere. To the contrary, the moving average and integrated processes are both continuous and differentiable everywhere.

Figure 2:  Brownian motion realization (blue) and its integrated process (green)

3. Application to Number Theory

The purpose here is to study pairs of large prime numbers p and q with p  <  q, used to design cryptographic keys m = pq. Some of these primes exhibit strong patterns that make them unsuitable for use in highly secure cryptographic systems, making it less difficult to factor m. Factoring a product of two large primes each with hundreds of digits, is usually an intractable problem if p and q are carefully chosen. In other words, we are looking for primes p and q that are not as "random" (or to put it differently, less strongly prime)  than your typical large prime numbers. Factoring m allows you to break the key in cryptographic systems, something you want to avoid precisely by carefully choosing p and q.

We discuss here one example of such bad pair, namely p = 1879 and q = 3803. We use the same construction technique as outlined in the first section, resulting in a process that looks exactly like a Brownian motion up to some value of t, then suddenly exhibits two big jolts very early on the time scale, in this particular example.

The base process { U(k) } (see first section) is defined by

The brackets in the formula represent the integer part function. The resulting process { Z(t) } created using the technique described in the first section, is a Brownian motion up to the first jolt occurring at k = SQRT(m); the second jolt occurs at k = q.  Note that for small k's, { U(k) } is reminiscent of number representations in various systems, as described in this article.  In other examples, the interesting jolt occurs at k = p (the smaller prime.) In most cases, no obvious patterns are found.

Figure 3: Brownian motion up to first jolt; second jolt used to break cryptographic key

Figure 3 is based on the first 25,000 observations. The base process { U(k) } does not display these jolts. They are only visible in the process { Z(t) } pictured in Figure 3.  Detecting these jolts (the second one) is of particular interest since it allows you to factor m to retrieve the two large prime factors p and q. To do this, one solution consists of using sampled values of U(k) together with interpolation techniques. Note that as k becomes large (higher than 12,000) the U(k)'s become so heavily auto-correlated that the process { Z(t) } is no longer a Brownian motion. In some ways, it could be used as a better model to simulate stock market prices, than standard Brownian motions.

For testing purposes, tables of large prime numbers are easy to find on the Internet, for instance here. Details about the computations are found in in this spreadsheet.

For related articles by the same author, click here or visit Follow me on LinkedIn.

DSC Resources

Views: 3389


You need to be a member of Data Science Central to add comments!

Join Data Science Central

Comment by Vincent Granville on March 22, 2018 at 7:29am

Hi Marco, thanks for pointing out this mistake. I corrected it, you are right.

Comment by Marco Gorelli on March 22, 2018 at 5:44am

Great article, but when you write

"Standardize X(n) so that its variance does not depend on n, that is, introduce Y(n) = X(n) / SQRT(Var[U(1)]). This step consists of re-scaling the Y-axis."

should it not read Y(n) = X(n) / SQRT(n) ?

This way

Var(Y(n)) = Var(X(n) / SQRT(n)) = 1/n Var(X(n)) = Var(U(1)).

Comment by Zhongmin Luo on March 20, 2018 at 2:59pm

Stochastic process, indeed, is used a lot in Finance, especially in derivative pricing. Well-known Black-Scholes (and Merton) model for European Call option assumes stock prices follow Geometric BrownianMotion.

Just to name a few:

Interest rates:

  • Vasicek
  • Cox-Ingersoll-Ross or CIR
  • Hull-White 1-factor and 2-factor
  • Black-Karasinski
  • Libor Market Model
  • Heath-Jarrow-Morton


  • Heston stochastic volatility


  • CIR++
  • CIR++ with Jump

Best regards,


Follow Us


  • Add Videos
  • View All


© 2018   Data Science Central   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service