After being confined for 2 months and hearing the word pandemic almost daily, seeing graphs, opinions, timelines and values that rise and fall, I became curious. First of all, what is a pandemic? Is it the same as an epidemic? A plague? Was the plague an epidemic? What do epidemiologists study? Yes, they study a discipline called epidemiology.
What is epidemiology?
Epidemiology studies the distribution, frequency, relationships, predictions, and control of all factors related to health and disease in the human population. There is more to it, though. Descriptive epidemiology, as you can imagine, focuses on recording and observing all the available information in order to generate a hypothesis. Analytical epidemiology uses statistical and probabilistic tools to search for relationships among the factors to which a population and a disease are exposed. Experimental epidemiology is based on experimental studies, generally on animals. Eco-epidemiology is similar, but in the field of ecology. Finally, theoretical epidemiology is based on mathematical models.
Now I think we can talk about what an epidemic actually is, right? Wikipedia defines it very well: it is what happens when a disease affects more people than expected during a certain period of time. When an epidemic spreads to different geographical regions, it is called a pandemic. At least I now have something clear, and I have even looked for examples of epidemics.
In this last, more recent century, we have the epidemics of influenza A, the Zika virus and the pandemic we are currently going through, COVID-19.
The importance of epidemiology and mathematical models
It was Daniel Bernoulli who, in 1760, published the first article applying mathematics to an infectious disease, smallpox. Soon after, d’Alembert continued his work and was the first to describe the spread of a disease using a model. As a curiosity, Ronald Ross, winner of the 1902 Nobel Prize, showed that eliminating mosquitoes would end malaria.
It was not until 1927 that Kermack and McKendrick published a model to predict the final size of an epidemic and how it spreads, together with what is known as the threshold theorem, one of the most important contributions of theoretical epidemiology. This theorem basically tells us that introducing an infectious case into a susceptible community might not give rise to an outbreak or an epidemic if the density of susceptibles is below a certain critical value. We will go into more detail later.
I wanted to highlight the importance of mathematical models in the field of health and disease, since historically epidemics have been one of people's great fears, completely isolating the populations that suffered them. Their importance lies mainly in the fact that they reveal causal relationships that are not easily observable at first sight: we can use mathematical models to predict consequences and to understand how a disease spreads through a population under multiple scenarios.
First of all, I want to remind you that we should never forget that we are dealing with biological processes. Faced with a disease that is transmitted through an agent (rats, for example), we might formulate a model as if that disease were transmitted between humans. We might even believe that quarantining infected individuals would curb the disease, as happened in the village of Eyam in 1665 with the bubonic plague. What happened is that rats continued to spread the disease throughout England.
We can say that there are different factors within a disease that make it impossible to study all diseases in the same way. These factors can be the mode of transmission, infectious agents, the affected population and the states through which an individual can pass.
The mode of transmission can be person to person, as with HIV or COVID-19, through the environment, as with cholera, or through agents (insects), as with malaria. It varies according to the disease.
Infectious agents are microorganisms capable of producing an infection or an infectious disease, for example viruses, bacteria, worms, etc. These agents condition the various states through which those affected by a disease pass.
The affected population
You have to study whether there is immigration, emigration, births, deaths, etc.
The different states that an individual can go through
As I said before, the states are not the same for all diseases, but the following are usually used:
We usually consider the population size to be normalized, that is, the population has size 1. Therefore, each state represents the proportion of individuals in that state with respect to the total number of individuals in the population.
We have: S + E + I + R + M = 1
Types of diseases
Types of mathematical models in epidemics
There are three types of models:
Epidemic or endemic
We have to know how to differentiate between epidemic and endemic: one of the most important things to study is whether or not an epidemic will become endemic.
To find out, we analyze the indicator R_0. R_0 is the basic reproduction number: the average number of secondary infections that occur when an infectious individual (known as patient zero) is introduced into a susceptible population. In plain English, how many individuals patient zero will directly infect.
This value of R_0 is fundamental in epidemiology, since it tells us whether the infection is going to spread: if R_0 < 1 it will disappear, and if R_0 > 1 we are facing an epidemic.
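To make the threshold at R_0 = 1 concrete, here is a minimal sketch. It assumes the early phase of an outbreak, where almost everyone is still susceptible and each case infects R_0 others on average, so the expected number of new cases is multiplied by R_0 in every generation. The function name and the values of R_0 are illustrative only.

```python
def expected_cases_by_generation(r0, generations, initial_cases=1.0):
    """Expected new cases per generation when each case infects r0 others on average.

    Assumption: the susceptible pool is not depleted (early outbreak only).
    """
    cases = [initial_cases]
    for _ in range(generations):
        cases.append(cases[-1] * r0)
    return cases


if __name__ == "__main__":
    # Illustrative values: below, at, and above the threshold R_0 = 1.
    for r0 in (0.5, 1.0, 2.0):
        print(r0, expected_cases_by_generation(r0, 5))
```

With R_0 below 1 the sequence shrinks towards zero, and with R_0 above 1 it grows geometrically, which is the behaviour the threshold describes.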
SIR model for epidemics
There is a vast literature on mathematical models that help predict whether or not an epidemic will spread. To start with a basic but functional one, we are going to focus on the SIR model in which, as you may have guessed, S stands for those Susceptible to contracting the disease, I for the Infected and R for the Recovered.
For a given instant t, we can define:
This model will consider two parameters, which are:
Thus, we can define the rate of passing from susceptible to infected as the average number of contacts times the probability of being infected (at time t).
That is, the variation of susceptibles at time t is the rate of passing from susceptible to infected (with a negative sign, since susceptibles decrease as infections occur) multiplied by the total number of susceptibles.
The variation of infected is the rate of passing from susceptible to infected multiplied by the total number of susceptibles (the infected increase) minus the transition rate from infected to recovered multiplied by the total number of infected.
Finally, the variation of recovered, as we have defined it, is simply the transition rate from infected to recovered multiplied by the total number of infected.
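As a sketch, the three statements above can be written out as follows. This assumes that β denotes the average number of contacts per unit of time, that γ denotes the transition rate from infected to recovered, and that N is the total population, so the rate of passing from susceptible to infected is β I(t)/N. The symbols β, γ and N appear elsewhere in this text, but the exact notation of the original equations is an assumption.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta\,\frac{I(t)}{N}\,S(t),\\
\frac{dI}{dt} &= \beta\,\frac{I(t)}{N}\,S(t) - \gamma\,I(t),\\
\frac{dR}{dt} &= \gamma\,I(t).
\end{aligned}
```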
Since the number of susceptibles plus the number of infected plus the number of recovered remains constant and equals the total population N, we have:
dS/dt + dI/dt + dR/dt = 0
Then R(t) = N - S(t) - I(t), so the system of three equations reduces to a system of two equations.
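Here is a minimal sketch of how the reduced two-equation system could be integrated numerically. It assumes the standard SIR form dS/dt = -β S I/N and dI/dt = β S I/N - γ I, and recovers R as N - S - I at each step. The function name `simulate_sir`, the forward Euler scheme and all parameter values are illustrative assumptions, not the method used in the course.

```python
def simulate_sir(beta, gamma, n, i0, days, dt=0.1):
    """Integrate the reduced SIR system (S and I only) with forward Euler steps.

    R is not integrated: it is recovered as N - S - I at every step.
    Returns a list of (t, S, I, R) tuples, including the initial state.
    """
    s, i = n - i0, i0
    trajectory = [(0.0, s, i, n - s - i)]
    steps = int(round(days / dt))
    for k in range(steps):
        new_infections = beta * s * i / n   # flow from S to I per unit of time
        recoveries = gamma * i              # flow from I to R per unit of time
        s += -new_infections * dt
        i += (new_infections - recoveries) * dt
        trajectory.append(((k + 1) * dt, s, i, n - s - i))
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameters: beta / gamma = 5, so an outbreak is expected.
    result = simulate_sir(beta=0.5, gamma=0.1, n=1000.0, i0=1.0, days=100)
    peak_time, _, peak_infected, _ = max(result, key=lambda row: row[2])
    print(f"peak of {peak_infected:.0f} infected around t = {peak_time:.1f}")
```

Forward Euler is chosen here only because it needs no external libraries. A smaller `dt` or a dedicated ODE solver would give more accurate trajectories.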
On the other hand, the solution S(t) can be obtained by dividing the first equation by the last one and integrating.
The value β/γ that appears in that solution is called the basic reproduction number, which is the famous coefficient R_0.
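The step of dividing and integrating can be sketched as follows, assuming the first and last equations have the standard form dS/dt = -β S I/N and dR/dt = γ I, and that R(0) = 0. The final approximation additionally assumes that almost the whole population is susceptible at the start, S(0) ≈ N.

```latex
\frac{dS}{dR} = \frac{dS/dt}{dR/dt}
             = \frac{-\beta\,S\,I/N}{\gamma\,I}
             = -\frac{\beta}{\gamma}\,\frac{S}{N}
\quad\Longrightarrow\quad
S(t) = S(0)\,e^{-\frac{\beta}{\gamma}\frac{R(t)}{N}}
     \approx N\,e^{-\frac{R_0\,R(t)}{N}},
\qquad R_0 = \frac{\beta}{\gamma}.
```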
This coefficient is very important for the study of the infected, since it lets us rewrite the equation for the infected in terms of R_0.
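One way to write that rewritten equation, assuming the standard form dI/dt = β S I/N - γ I and the approximation S(t) ≈ N e^(-R_0 R(t)/N), is sketched below. It shows why the comparison between R_0 and e^((R_0 R(t))/N) decides whether the number of infected is growing.

```latex
\frac{dI}{dt} = \gamma\,I(t)\left(R_0\,\frac{S(t)}{N} - 1\right)
             \approx \gamma\,I(t)\left(R_0\,e^{-\frac{R_0\,R(t)}{N}} - 1\right),
\qquad\text{so}\qquad
\frac{dI}{dt} > 0 \iff R_0 > e^{\frac{R_0\,R(t)}{N}}.
```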
Suppose that at times t_1,…,t_n we have observed S(t_i), I(t_i) and R(t_i). To estimate R_0, we substitute these values into S(t):
We can see that there is a linear relationship between the variables Y = N ln S(t) and X = R(t), since N ln N is constant.
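Spelled out, and assuming the solution has the form S(t) = N e^(-R_0 R(t)/N), taking logarithms and multiplying by N gives a straight line in X = R(t) whose intercept is N ln N and whose slope is -R_0:

```latex
N \ln S(t_i) = N \ln N - R_0\,R(t_i)
\quad\Longleftrightarrow\quad
Y_i = a + b\,X_i,
\qquad a = N \ln N,\quad b = -R_0 .
```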
Once we have calculated R_0 from this regression line and confirmed a good fit with R^2, we can check whether R_0 is greater or less than e^((R_0 R(t))/N) at some time t, and conclude whether the infection will keep expanding (R_0 greater) or start to decrease (R_0 smaller). Visually, R_0 is the slope of that regression line with its sign changed, since N ln S(t) decreases as R(t) grows.
As the SIR model is one of the most basic and does not take the number of deaths into account, I advise that, if it is used, deaths be added to the number of infected, so that we keep the population N constant and assume that there are no births.
In one of the exercises of the course "Learn to analyze COVID-19 data with R and Python" we are asked to calculate R_0 and compare it with e^((R_0 R(t))/N) for the Community of Madrid. As of May 12, 2020, when the debate is whether or not it is too early to move to phase I in the Community of Madrid, this simple SIR model gives the following result:
Since R_0 is sufficiently greater than e^((R_0 R(t))/N), we can say that the virus is still spreading, so we still need to stay alert (R^2 = 0.98).
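The procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observations are synthetic, generated from S = N e^(-R_0 R/N) with a made-up R_0 of 2.5 (they are not the Madrid figures), the line is fitted by ordinary least squares, and the helper names `fit_line`, `estimate_r0` and `still_spreading` are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x, plus R^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def estimate_r0(s_obs, r_obs, n):
    """Regress Y = N ln S(t_i) on X = R(t_i); R_0 is minus the fitted slope."""
    ys = [n * math.log(s) for s in s_obs]
    slope, _, r_squared = fit_line(list(r_obs), ys)
    return -slope, r_squared


def still_spreading(r0, r_now, n):
    """True when R_0 > e^(R_0 R(t) / N), i.e. the infected are still increasing."""
    return r0 > math.exp(r0 * r_now / n)


if __name__ == "__main__":
    # Synthetic observations (assumption): N = 1000 and a true R_0 of 2.5.
    n = 1000.0
    r_obs = [0.0, 10.0, 25.0, 50.0, 80.0]
    s_obs = [n * math.exp(-2.5 * r / n) for r in r_obs]
    r0_hat, r_squared = estimate_r0(s_obs, r_obs, n)
    print(r0_hat, r_squared, still_spreading(r0_hat, r_obs[-1], n))
```

Note that the fit leaves the intercept free rather than forcing it to N ln N. With real data, comparing the fitted intercept with N ln N is one more check on how well the model describes the observations.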