Picture by By Tatiana Shepeleva/shutterstock.com
One of the most challenging problems in modern theoretical physics is the so-called many-body problem. Typical many-body systems are composed of a large number of strongly interacting particles. Few such systems are amenable to exact mathematical treatment and numerical techniques are needed to make progress. However, since the resources required to specify a generic many-body quantum state depend exponentially on the number of particles in the system (more precisely, on the number of degrees of freedom), even today’s best supercomputers lack sufficient power to exactly encode such states (they can handle only relatively small systems, with less than ~45 particles).
As we shall see, recent applications of machine learning techniques (artificial neural networks in particular) have been shown to provide highly efficient representations of such complex states, making their overwhelming complexity computationally tractable.
In this article, I will discuss how to apply (a type of) artificial neural network to represent quantum states of many particles. The article will be divided into three parts:
There is a fascinating story recounted by one of Albert Einstein's scientific collaborators, the Polish physicist Leopold Infeld, in his autobiography.
Einstein and Infeld in Einstein’s home (source).
According to Infeld, after the two physicists spent several months performing long and grueling calculations, Einstein would make the following remark:
“God [Nature] does not care about our mathematical difficulties. He integrates empirically.”
— Einstein (1942).
What Einstein meant was that, while humans must resort to complex calculations and symbolic reasoning to solve complicated physics problems, Nature does not need to.
Quick Note: Einstein used the term “integrate” here because many physical theories are formulated using equations called “differential equations” and to find solutions of such equations one must apply the process of “integration”.
As noted in the introduction, a notoriously difficult problem in theoretical physics is the many-body problem. This problem has been investigated for a very long time in both classical systems (physical systems based on Newton's three laws of motion and its refinements) and quantum systems (systems based obeying quantum mechanical laws).
The first (classical) many-body problem to be extensively studied was the 3-body problem involving the Earth, the Moon, and the Sun.
A simple orbit of a 3-body system with equal masses.
One of the first scientists to attack this many-body problem was none other than Issac Newton in his masterpiece, the Principia Mathematica:
“Each time a planet revolves it traces a fresh orbit […] and each orbit is dependent upon the combined motions of all the planets, not to mention their actions upon each other […]. Unless I am much mistaken, it would exceed the force of human wit to consider so many causes of motion at the same time, and to define the motions by exact laws which would allow of an easy calculation.”
— Isaac Newton (1687)
Newton’s Principia Mathematica, arguably the most important scientific book in history.
Since essentially all relevant physical systems are composed by a collection of interacting particles, the many-body problem is extremely important.
One can define the problem as “the study of the effects of interactions between bodies on the behavior of a many-body system ”.
Collisions of gold ions generate a quark-gluon plasma, a typical many-body system.
The meaning of “many” in this context can be anywhere from three to infinity. In a recent paper, my colleagues and I showed that the signatures of quantum many-body behavior can be found already for N=5 spin excitations (figure below).
The density of states of a type of spin system (XX model). As the number of spin excitations increases from 2 to 5, a Gaussian distribution (typical of many-body systems with 2-body couplings) is approached.
In the present article, I will focus on the quantum many-body problem which has been my main topic of research since 2013.
The complexity of quantum many-body systems was identified by physicists already in the 1930s. Around that time, the great physicist Paul Dirac envisioned two major problems in quantum mechanics.
The English physicist Paul Dirac.
The first, according to him, was “in connection with the exact fitting in of the theory with relativity ideas”. The second was that “the exact application of these [quantum] laws leads to equations much too complicated to be soluble”. The second problem was precisely the quantum many-body problem.
Luckily, the quantum states of many physical systems can be described using much less information than the maximum capacity of their Hilbert spaces. This fact is exploited by several numerical techniques including the well-known Quantum Monte Carlo (QMC) method.
Simply put, a quantum wave function describes mathematically the state of a quantum system. The first quantum system to receive an exact mathematical treatment was the hydrogen atom.
The probability of finding the electron in a hydrogen atom (represented by the brightness).
In general, a quantum state is represented by a complex probability amplitude Ψ(S), where the argument S contains all the information about the system’s state. For example, in a spin-1/2 chain:
A 1D spin chain: each particle has a value for σ in the z-axis.
From Ψ(S), probabilities associated with measurements made on the system can be derived. For example, the square modulus of Ψ(S), a positive real number, gives the probability distribution associated with Ψ(S):
The properties of a quantum system are encapsulated by the system’s Hamiltonian operator H. The latter is the sum of two terms:
The allowed energy levels of a quantum system (its energy spectrum) can be obtained by solving the so-called Schrodinger equation, a partial differential equation that describes the behavior of quantum mechanical systems.
The Austrian physicist Erwin Schrodinger, one of the fathers of quantum mechanics .
The time-independent version of the Schrödinger equation is given by the following eigenvalue system:
The eigenvalues and the corresponding eigenstates are
The lowest energy corresponds to the so-called “ground state” of the system.
For concreteness, let us consider the following example: the quantum harmonic oscillator. The QHO is the quantum-mechanical counterpart of the classical harmonic oscillator (see the figure below), which is a system that experiences a force when displaced from its initial that restores it to its equilibrium position.
A mass-spring harmonic oscillator
The animation below compares the classical and quantum conceptions of a simple harmonic oscillator.
Wave function describing a quantum harmonic oscillator (Wiki).
While a simple oscillating mass in a well-defined trajectory represents the classical system (blocks A and B in the figure above), the corresponding quantum system is represented by a complex wave function. In each block (from C onwards) there are two curves: the blue one is the real part of Ψ, and the red one is the imaginary part.
In quantum mechanics, spin can be roughly understood as an “intrinsic form of angular momentum” that is carried by particles and nuclei. Though it is intuitive to think of spin as a rotation of a particle around its own axis this picture is not quite correct since then the particle would rotate at a faster than light speed which would violate fundamental physical principles. If fact spins are quantum mechanical objects without classical counterpart.
Example of a many-body system: a spin impurity propagating through a chain of atoms
Quantum spin systems are closely associated with the phenomena of magnetism. Magnets are made of atoms, which are often small magnets. When these atomic magnets become parallelly oriented they give origin to the macroscopic effect we are familiar with.
Magnetic materials often display spin waves, propagating disturbances in the magnetic order.
I will now provide a quick summary of the basic components of machine learning algorithms in a way that will be helpful for the reader to understand their connections with quantum systems.
Machine learning approaches have two basic components (Carleo, 2017):
The two components of machine learning.
Artificial neural networks are usually non-linear multi-dimensional nested functions. Their internal workings are only heuristically understood and investigating their structure does not generate insights regarding the function being it approximates.
Simple artificial neural network with two hidden layers.
Due to the absence of a clear-cut connection between the network parameters and the mathematical function which is being approximated, ANNs are often referred to as “black boxes”.
What are Restricted Boltzmann Machines?
Restricted Boltzmann Machines are generative stochastic neural networks. They have many applications including:
RBMs belong to a class of models known as Energy-based Models. They are different from other (more popular) neural networks which estimate a value based on inputs while RBMs estimate probability densities of the inputs (they estimate many points instead of a single value).
RBMs have the following properties:
The energy functional to be minimized is given by:
Eq.1: Energy functional minimized by RBMs.
The joint probability distribution of both visible and hidden units reads:
Eq.2: Total probability distribution.
where the normalization constant Z is called the partition function. Tracing out the hidden units, we obtain the marginal probability of a visible (input) vector:
Eq.3: Input units marginal probability distribution,
Since, as noted before, hidden (visible) unit activations are mutually independent given the visible (hidden) unit activations one can write:
Eq.4: Conditional probabilities becomes products due to mutual independence.
Eq. 5: Same as Eq.4.
Finally, the activation probabilities read:
Eq.6: Activation probabilities.
where σ is the sigmoid function.
The training steps are the following:
The following analysis is heavily based on this excellent tutorial. The three figures below show how a RBM processes inputs.
A simple RBM processing inputs.
RBMs perform an unsupervised process called “reconstruction”. They learn to reconstruct the data performing a long succession of passes (forward and backward ones) between its two layers. In the backward pass, as shown in the diagram below, the activation functions of the nodes in the hidden layer become the new inputs.
The product of these inputs and the respective weights are summed and the new biases b from the visible layer are added at each input node. The new output from such operations is called “reconstruction” because it is an approximation of the original input.
Naturally, the reconstructions and the original inputs are very different at first (since the values of w are randomly initialized). However, as the error is repeatedly backpropagated against the ws, it is gradually minimized.
We see therefore that:
Joining both conditional distributions, the joint probability distribution of x and a is obtained i.e. the RBM learns how to approximate the original data (the structure of the input).
In a recent article published in Science magazine, it was proposed that one can treat the quantum wave function Ψ(S) of a quantum many-body system as a black-box and then approximate it using an RBM. The RBM is trained to represent Ψ(S) via the optimization of its parameters.
RBM used by Carleo and Troyer (2017) that encodes a spin many-body quantum state.
The question is how to reformulate the (time-independent) Schrodinger equation, which is an eigenvalue problem, as a machine learning problem.
As it turns out, the answer has been known for quite some time, and it is based on the so-called variation method , an alternative formulation of the wave equation that can be used to obtain the energies of a quantum system. Using this method we can write the optimization problem as follows:
where E[Ψ] is a functional that depends on the eigenstates and Hamiltonian. Solving this optimization problem we obtain both the ground state energy and its corresponding ground state.
In Carleo and Troyer (2017), RBMs are used to represent a quantum state Ψ(S). They generalize RBMs to allow for complex network parameters.
It is easy to show that the energy functional can be written as
where the argument of the expectation value after the last equal sign is the local energy. The neural network is then trained using the method of Stochastic Reconfiguation (SR). The corresponding optimization iteration reads:
The gradient descent update protocol.
where η is the learning rate and S is the stochastic reconfiguration matrix which depends on the eigenstates and its logarithmic derivatives.
Carleo and Troyer (2017)were interested specifically in quantum systems of spin 1/2 and they write the quantum state as follows:
In this expression the W argument of Ψ is the set of parameters:
where the components on a and b are real but W can be complex. The absence of intralayer interactions, typical of the RBMs architecture allows hidden variables to be summer over (or traced out), considerably simplifying the expression above to:
To train the quantum wave functions one follows a similar procedure as described above for RBMs.
The figure below shows the negligible relative error of the NQS ground-state energy estimation. Each plot corresponds to a test case which is a system with known exact solutions. The horizontal axis is the hidden-unit density i.e. the ratio between the number of hidden and visible units. Notice that even with relatively few hidden units, the accuracy of the model is already extremely impressive (one part per million error!)
The error of the model ground-state energy relative to the exact value in three tests cases.
In this brief article, we saw that Restricted Boltzmann Machines (RBMs), a simple type of artificial neural network, can be used to compute with extremely high accuracy the ground-state energy of quantum systems of many particles.
Thanks for reading!
As always, constructive criticism and feedback are welcome!
This article was originally published on here.
Comments are closed for this blog post