This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

**Summary**

In this report, the spread of the **pandemic influenza A** (**H1N1**) outbreak in Kolkata, West Bengal, India, in 2010 is simulated. The basic **epidemic SIR model** is used. It describes three populations: a **susceptible** population, an **infected** population, and a **recovered** population, and it assumes that the total population (the sum of these three populations) is fixed over the period of study.

There are two parameters for this model: the attack rate (**β**) per infected person per day through contacts and the recovery rate (**α**). Initially there is a small number of infected persons in the population. The following questions are to be answered with the simulation / analysis:

1. Will the number of infected persons increase substantially, producing an epidemic, or will the flu fizzle out?

2. Assuming there is an epidemic, how will it end? Will there still be any susceptibles left when it is over?

3. How long will the epidemic last?

**Euler’s method** will be primarily used to solve the **system of differential equations** for the **SIR model** and to compute the **equilibrium points** (along with some analytic solution attempts for a few simplified special cases). Here are the conclusions obtained from the simulations:

1. When the recovery rate **α** is **≈ 0** or very low compared to the attack rate **β**, the flu will turn into an epidemic and the entire population will be infected first (the higher **β** is, the quicker the epidemic breaks out).

2. To be more precise, when the initial susceptible population **S(0)** is greater than the inverse of the basic reproduction number **1/R0 = α/β**, a **proper epidemic** will break out.

3. When the initial susceptible population **S(0)** is less than the inverse of the basic reproduction number **1/R0=α/β**, then a **proper epidemic** will **never** break out.

4. If the initial susceptible population is non-zero, in the end (at equilibrium) there will always be some susceptible population.

5. When there is an epidemic, it will eventually end at the equilibrium point with 0 infected population. How fast it reaches the equilibrium depends upon the recovery rate (the higher α is, the quicker the infection is removed).

6. The time to reach the equilibrium can be computed using Euler’s method. It depends on the parameters α (the higher the quicker) and β (the higher the quicker) and on the initial infected population size I(0) (the higher the quicker).

Introduction

In 2010, there was an outbreak of the pandemic influenza A (H1N1) in Kolkata, West Bengal, India. An increased number of cases of influenza-like illness (ILI) was reported in the Greater Kolkata Metropolitan Area (GKMA) during July and August 2010, as stated in [3]. The main motivation for this research project is to understand the spread of the pandemic, to compute the equilibrium points, and to find the impact of the initial size of the infected population and of the attack / recovery rate parameters on the spread of the epidemic, through simulations with the basic epidemic SIR model.

Euler’s method will be primarily used to solve the system of differential equations for the SIR model and to compute the equilibrium points. First, a few simplified special cases will be considered, and both analytic and numerical methods (with Euler’s method) will be used to compute the equilibrium points. Then the solution for the generic model will be found. As described in [6], the SIR model can also be effectively used (in a broader context) to model the propagation of computer viruses in computer networks, particularly for networks with an Erdos-Renyi type random graph topology.

SIR Epidemic Model

The SIR model is an epidemiological model that computes the theoretical number of people infected with a contagious illness in a closed population over time. One of the basic one-strain SIR models is the **Kermack-McKendrick Model**. The Kermack-McKendrick Model is used to explain the rapid rise and fall in the number of infective patients observed in epidemics. It assumes that the population size is fixed (i.e., no births and no deaths, whether due to the disease or to natural causes), that the incubation period of the infectious agent is instantaneous, and that the duration of infectivity is the same as the length of the disease. It also assumes a completely homogeneous population with no age, spatial, or social structure.

The following figure 2.1 shows an electron microscope image of the re-assorted **H1N1** influenza virus photographed at the CDC Influenza Laboratory. The viruses are 80 − 120 nm in diameter [1].

**2.1 Basic Mathematical Model**

The starting model for an epidemic is the so-called SIR model, where **S** stands for susceptible population, the people that can be infected. **I** is the already infected population, the people that are contagious, and **R** stands for the recovered population, people who are not contagious any more.

**2.1.1 Differential Equations**

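The SIR model is described by the following system of differential equations (2.1):

$$\frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \alpha I, \qquad \frac{dR}{dt} = \alpha I \tag{2.1}$$

• The terms **dS/dt**, **dI/dt** and **dR/dt** denote the rates of change of the susceptible, infected and recovered populations, respectively.

• The terms **β** and **α** indicate the attack rate (the rate at which susceptible persons get infected per day) and the recovery rate of the flu (the inverse of the number of days a person remains infected), respectively.

• A high value of **α** means that a person will be infected by the flu for fewer days, and a high value of **β** means that the epidemic will spread quickly.

• Also, it can be shown from the differential equations that the total population **(S + I + R)** remains constant.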
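The constancy of the total population can be checked in one line. This is a sketch that assumes the standard Kermack-McKendrick right-hand sides (−βSI for S, βSI − αI for I, and αI for R); summing the three rates gives zero:

$$\frac{d}{dt}\,(S + I + R) = -\beta S I + (\beta S I - \alpha I) + \alpha I = 0 \;\Longrightarrow\; S(t) + I(t) + R(t) = S(0) + I(0) + R(0) = N$$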

**2.1.2 Collected data, units and values for the constants**

• As can be seen from the following figure 2.2, the focus of this analysis will be limited to the population in Kolkata Metropolitan Corporation (KMC, XII) area where the population can be assumed to be ≈ 4.5 million or 4500 thousands, as per [7].

• **Units**

– All population (S, I, R) units are in thousand persons (so that the total population N = 4500).

– As can be derived from the differential equations 2.1, the unit of β is 10^(−6) / person / day (β = 25 means that 25 persons in a million get infected through susceptible-infected contact per infected person per day).

– Similarly, the unit of α is 10^(−3) / day (α = 167 means that a fraction 167 × 10^(−3) of the infected persons recovers from the flu per day).

• The attack rate is 20-29/100000 and the number of days infected (i.e. the inverse of recovery rate) = 5−7 days on average (with a few exceptions), as per [3].

• In these units, typical values for β and α can be assumed to be 25 × 10^(−6) / person / day and (10^3/6) × 10^(−3) ≈ 167 × 10^(−3) / day, respectively.
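As a small arithmetic check of the recovery-rate values, the following sketch converts the reported infectious period of 5 to 7 days into α in the document's unit of 10^(−3) / day. The function name `recovery_rate_milli` is a hypothetical helper introduced only for this illustration.

```python
# Convert an infectious period (in days) into a recovery rate alpha,
# expressed in the document's unit of 10^(-3) per day.
infectious_days = [5, 6, 7]  # range reported in [3]

def recovery_rate_milli(days):
    """Recovery rate in units of 10^(-3)/day: alpha = 1 / days, scaled by 1000."""
    return 1000.0 / days

for d in infectious_days:
    print(f"{d} days infected -> alpha = {recovery_rate_milli(d):.0f} x 10^(-3) / day")
```

The middle value (6 days) gives the α ≈ 167 × 10^(−3) / day used throughout the report.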

**2.2 Simplified Model 1 (with α = 0)**

• At first, a simplified model is created by assuming that α = 0 (/ day) and that R = 0, so once infected, a person stays contagious forever. Because S(t) + I(t) + R(t) = S(t) + I(t) = N is constant (since the population size N is fixed), S(t) can be eliminated and a single differential equation in just I(t) is obtained, equation 2.2:

$$\frac{dI}{dt} = \beta I (N - I) \tag{2.2}$$

• Also, let the (fixed) population size be N = 4500 = S(0) + I(0) (in thousand persons), with the initial number of infected persons I(0) = 1 (in thousand persons) and the initial number of susceptible persons S(0) = N − I(0) = 4499 (in thousand persons). Let β = 25 × 10^(−6) (/ person / day) to start with.

**2.2.1 Analytic Solution**

• The analytic solution can be found by following the steps shown in Appendix A. The final solution is the logistic curve of equation 2.3:

$$I(t) = \frac{N\, I(0)}{I(0) + (N - I(0))\, e^{-\beta N t}} \tag{2.3}$$

• The following figure 2.3 shows the logistic (bounded) growth in I(t) (in thousands persons) w.r.t. the time (in days) for different values of attack rate β×10^(−6) (/ person / day). As expected, the higher the attack rate, the quicker all the persons in the population become infected.
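Appendix A is not reproduced here, so the following is only a sketch of how the logistic solution can be derived, assuming the simplified model is dI/dt = βI(N − I). Separating the variables and using partial fractions:

$$\frac{dI}{I\,(N - I)} = \beta\, dt, \qquad \frac{1}{I\,(N - I)} = \frac{1}{N}\left(\frac{1}{I} + \frac{1}{N - I}\right)$$

$$\frac{1}{N}\,\ln\frac{I}{N - I} = \beta t + C \;\Longrightarrow\; \frac{I(t)}{N - I(t)} = \frac{I(0)}{N - I(0)}\, e^{\beta N t} \;\Longrightarrow\; I(t) = \frac{N\, I(0)}{I(0) + (N - I(0))\, e^{-\beta N t}}$$

As t → ∞ the exponential vanishes and I(t) → N, which is the bounded growth described above.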

**2.2.2 Finding the equilibrium points for I**

• The equilibrium points are the points where the rate of change of I is zero, that is, the points that satisfy the following equation:

$$\frac{dI}{dt} = \beta I (N - I) = 0 \;\Longrightarrow\; I = 0 \ \text{ or } \ I = N$$

• Considering a small neighborhood of the equilibrium point at I = 0, it can be seen from the figure 2.4 that whenever I > 0, *dI/dt > 0*, so I increases and goes away from the equilibrium point.

• Hence, the equilibrium point at I = 0 is unstable.

• At I = N = 4500 (in thousand persons) it is a stable equilibrium. As can be seen from the following figure 2.4, in a small neighborhood of the equilibrium point at I = 4500, it always increases / decreases towards the equilibrium point.

• In a small ε > 0 neighborhood at I = 4500 (in thousand persons),

1. *dI/dt > 0*, so I increases when I <= 4500 − ε.

2. *dI/dt < 0*, so I decreases when I >= 4500 + ε.

• The same can be observed from the direction fields from the figure 2.5.

• Hence, the equilibrium at I = 4500 is stable.
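The stability argument can be checked numerically by evaluating the sign of dI/dt on either side of each equilibrium. This is a minimal sketch that assumes the simplified model dI/dt = βI(N − I); the neighborhood size `eps` is an illustrative choice, not a value from the report.

```python
BETA = 25e-6   # attack rate, per person per day (value used in section 2.2)
N = 4500.0     # total population, in thousand persons

def dI_dt(I):
    """Right-hand side of the assumed simplified model: dI/dt = BETA * I * (N - I)."""
    return BETA * I * (N - I)

eps = 1.0  # illustrative neighborhood size, in thousand persons
for I in (0.0, eps, N - eps, N, N + eps):
    print(f"I = {I:7.1f}   dI/dt = {dI_dt(I):+.6f}")
```

A positive rate just above I = 0 pushes the solution away from that equilibrium, while the rate is positive just below I = N and negative just above it, so solutions are pulled back towards I = N.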

**2.2.3 Numerical Solution with the Euler’s Method**

• The algorithm (taken from the course slides) shown in the following figure 2.6 will be used for numerical computation of the (equilibrium) solution using the Euler’s method.

• As can be seen from the figure 2.6, the infection at the next timestep is (linearly) approximated, iteratively, as the sum of the infection at the current timestep and the product of the step size and the derivative of the infection evaluated at the current timestep.
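The update rule described above can be sketched in a few lines of Python. This is not the course-slide algorithm itself, only a minimal fixed-step Euler loop under the assumption that the simplified model is dI/dt = βI(N − I); the logistic closed form is used as a reference. The function names are hypothetical.

```python
import math

BETA, N, I0 = 25e-6, 4500.0, 1.0   # values from section 2.2 (populations in thousands)

def euler(f, y0, t_end, dt):
    """Fixed-step Euler: y_{n+1} = y_n + dt * f(y_n). Returns the estimate of y(t_end)."""
    y = y0
    for _ in range(round(t_end / dt)):
        y = y + dt * f(y)
    return y

def logistic_rhs(I):
    """Assumed simplified model: dI/dt = BETA * I * (N - I)."""
    return BETA * I * (N - I)

def logistic_exact(t):
    """Closed-form logistic solution, used here only as a reference."""
    return N * I0 / (I0 + (N - I0) * math.exp(-BETA * N * t))

t = 78.0  # day at which the step-size comparison is reported
approx = euler(logistic_rhs, I0, t, dt=1 / 512)
exact = logistic_exact(t)
print(f"Euler: {approx:.3f}   exact: {exact:.3f}   difference: {abs(approx - exact):.3f}")
```

Passing the right-hand side as a function keeps the stepping loop independent of the model, so the same `euler` helper could be reused for the β = 0 case.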

**2.2.3.1 Finding the right step size (with β = 25 × 10^(−6)/person/day)**

• In order to decide the best step size for the Euler method, first the Euler’s method is run with different step sizes as shown in the figure 2.7.

• As can be seen from the following table 2.1 and the figure 2.7, the largest differences in the value of I (with two consecutive step sizes) occurs around 78 days:

• As can be seen from the table in the Appendix B, the first time when the error becomes less than 1 person (in thousands) is with the step size **1/512** , hence this step size will be used for the Euler method.
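The step-size search can be sketched as a halving loop: keep halving ∆t until two consecutive step sizes agree at the day with the largest discrepancy. The exact error criterion behind table 2.1 is not recoverable from the text, so the tolerance `TOL` below is an assumption, as are the function names.

```python
BETA, N, I0 = 25e-6, 4500.0, 1.0   # values from section 2.2 (populations in thousands)
T_CHECK = 78.0                      # day with the largest differences, per the text
TOL = 1.0                           # assumed tolerance on I, in the document's units

def euler_logistic(dt, t_end=T_CHECK):
    """Euler estimate of I(t_end) for the assumed model dI/dt = BETA * I * (N - I)."""
    I = I0
    for _ in range(round(t_end / dt)):
        I += dt * BETA * I * (N - I)
    return I

def pick_step_size(tol=TOL, max_halvings=15):
    """Halve dt until the estimates for dt and dt/2 differ by less than tol."""
    dt = 1.0
    coarse = euler_logistic(dt)
    for _ in range(max_halvings):
        fine = euler_logistic(dt / 2)
        if abs(fine - coarse) < tol:
            return dt / 2, abs(fine - coarse)
        dt, coarse = dt / 2, fine
    raise RuntimeError("no step size met the tolerance")

best_dt, last_diff = pick_step_size()
print(f"chosen step size: 1/{round(1 / best_dt)}, last difference: {last_diff:.4f}")
```

The step size this sketch selects depends on the assumed tolerance and on which of the two compared step sizes is returned, so it need not coincide with the value reported in Appendix B.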

**2.2.3.2 Computing the (stable) equilibrium point**

• Now, this step size is used to solve the problem and find the equilibrium time *teq*.

• From the analytic solution 2.3 and the following figure 2.8, it can be verified that the *teq* solution obtained with Euler’s method is accurate (to within the ε tolerance).
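A minimal sketch of the equilibrium-time computation follows. It assumes that "equilibrium" means the infected population has come within a tolerance ε = 10^(−6) of N, which is an assumption about the stopping rule rather than something stated in this section; the function name is hypothetical.

```python
BETA, N, I0 = 25e-6, 4500.0, 1.0   # values from section 2.2 (populations in thousands)
DT = 1 / 512                        # step size selected in section 2.2.3.1
EPS = 1e-6                          # assumed tolerance for "reached equilibrium"

def time_to_equilibrium(i0, beta=BETA, n=N, dt=DT, eps=EPS, t_max=2000.0):
    """Step dI/dt = beta * I * (n - I) with Euler until n - I <= eps; return elapsed days."""
    I, steps = i0, 0
    max_steps = int(t_max / dt)  # safety cap so the loop always terminates
    while n - I > eps and steps < max_steps:
        I += dt * beta * I * (n - I)
        steps += 1
    return steps * dt

t_eq = time_to_equilibrium(I0)
print(f"t_eq = {t_eq:.2f} days")
```

Under these assumptions the result lands near 272 days, in line with the value reported for I(0) = 1 in section 2.2.3.4.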

**2.2.3.3 Results with β = 29 × 10^(−6) / person / day, I(0) = 1 person**

• Following the same iterations as above, the steepest error is obtained at t = 67 days in this case, as shown in the figure 2.9.

• The first time the error becomes less than one person at t = 67 days with Euler’s method is again with step size **1/512**.

• The solution obtained is *teq* = 234.76953125 days.

**2.2.3.4 Results with β = 25 × 10^(−6) / person / day, with different initial values for infected persons (I(0))**

• Following the same iterations as above, the equilibrium point is computed using the Euler’s method with different values of initial infected population I(0), as shown in the figure 2.10.

• The solutions obtained are **teq** = 272.33, 258.02, 251.85, 248.23, 245.66 days for I(0) = 1, 5, 10, 15, 20 (in thousand persons), respectively. So the equilibrium is reached earlier when the initial infected population is larger, as expected.
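These Euler results can be cross-checked against the logistic closed form. Solving I(t) = N − ε for t, with I(t) = N.I(0) / (I(0) + (N − I(0)).exp(−βNt)), gives t = ln[(N − I(0))(N − ε) / (I(0).ε)] / (βN). The sketch below evaluates this expression; the tolerance ε = 10^(−6) is an assumption, and the function name is hypothetical.

```python
import math

BETA, N, EPS = 25e-6, 4500.0, 1e-6   # EPS is an assumed equilibrium tolerance

def t_eq_exact(i0):
    """Time at which the logistic solution reaches I = N - EPS."""
    return math.log((N - i0) * (N - EPS) / (i0 * EPS)) / (BETA * N)

for i0 in (1, 5, 10, 15, 20):
    print(f"I(0) = {i0:2d}  ->  t_eq = {t_eq_exact(i0):.2f} days")
```

The closed-form times agree with the Euler values listed above to within a few hundredths of a day, and they decrease as I(0) grows.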

**2.3 Simplified Model 2 (with β = 0)**

• Next, another simplified model is considered by assuming that β = 0 and α > 0, so the flu can no longer infect anyone (possibly because everyone has already been infected and no susceptibles remain), and an infected person recovers from the flu with rate α. This situation can again be described with a single differential equation in just I(t), equation 2.4:

$$\frac{dI}{dt} = -\alpha I \tag{2.4}$$

• Also, let the entire population be infected initially, I(0) = N = 4500 (in thousand persons), so that the number of susceptible persons is S(0) = 0. Let α = 167 × 10^(−3) (/ day) to start with.

**2.3.1 Analytic Solution**

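• The analytic solution can be found by separating the variables, as shown in equations 2.5:

$$\frac{dI}{I} = -\alpha\, dt \;\Longrightarrow\; \ln I(t) = -\alpha t + \ln I(0) \;\Longrightarrow\; I(t) = I(0)\, e^{-\alpha t} \tag{2.5}$$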

• The following figure 2.11 shows the exponential decay in I(t) (in thousand persons) w.r.t. the time (in days) for different values of recovery rate α × 10^(−3) (/ day). As expected, the higher the recovery rate, the quicker all the persons in the population get rid of the infection.

• Now, I(t) + R(t) = N (since S(t) = 0 forever, as there is no more infection) and I(0) = N. Combining this with the above analytic solution **I(t) = I(0).exp(−αt) = N.exp(−αt)**, the following equation is obtained: **R(t) = N − I(t) = N.(1 − exp(−αt))**.

• The following figure 2.12 shows the growth in R(t) (in thousand persons) w.r.t. the time (in days) for different values of recovery rate α × 10^(−3) (/ day). As expected, the higher the recovery rate, the quicker all the persons in the population move to the removed state.

**2.3.2 Numerical Solution with the Euler’s Method**

**2.3.2.1 Solution with α = 167 × 10^(−3) / day**

• Following the same iterations as above, the steepest error is obtained at t = 6 in this case, as shown in the figure 2.16.

• The first time the error becomes less than one person at t = 6 with Euler’s method is with step size **1/256**.

• The solution obtained with Euler’s method is 133.076171875 days ≈ 133 days to remove the infection from the population with a 10^(−6) tolerance. From the analytic solution, I(133) = N.exp(−αt) = 1.016478E−06, so a similar result is obtained.
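The β = 0 case can be sketched the same way. The loop below applies the Euler update to dI/dt = −αI and stops once I falls below the 10^(−6) tolerance; the closed-form time ln(N/ε)/α is computed alongside for comparison. The function name is hypothetical, and the stopping rule is an assumption about how the tolerance was applied.

```python
import math

ALPHA = 167e-3   # recovery rate, per day
N = 4500.0       # initial infected population I(0) = N, in thousand persons
DT = 1 / 256     # step size reported for this case
EPS = 1e-6       # tolerance on the remaining infected population

def time_to_clear(i0=N, alpha=ALPHA, dt=DT, eps=EPS):
    """Step dI/dt = -alpha * I with Euler until I <= eps; return elapsed days."""
    I, steps = i0, 0
    while I > eps:
        I += dt * (-alpha * I)
        steps += 1
    return steps * dt

t_clear_euler = time_to_clear()
t_clear_exact = math.log(N / EPS) / ALPHA   # solves N * exp(-alpha * t) = EPS
print(f"Euler: {t_clear_euler:.2f} days   exact: {t_clear_exact:.2f} days")
```

Both estimates come out at about 133 days, consistent with the value quoted above.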

**2.3.2.2 Results**

The following figure 2.16 shows the solutions obtained with different step sizes using the Euler’s method.

**2.4 Generic Model (with α, β > 0)**

First, the numeric solution will be attempted for the generic model (using the Euler’s method) and then some analytic insights will be derived for the generic model.

**2.4.1 Numerical Solution with the Euler’s Method**

• The following algorithm 2.14 shown in the next figure is going to be used to obtain the solution using Euler method (the basic program for Euler’s method, adapted to include three dependent variables and three differential equations).

• As can be seen from the figure 2.14, first the vector X(0) is formed by combining the three variables S, I, R at timestep 0. Then the value of the vector at the next timestep is (linearly) approximated, iteratively, as the (vector) sum of the vector value at the current timestep and the product of the step size and the derivative of the vector evaluated at the current timestep.
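The vector form of the update can be sketched as follows. This is not the algorithm of figure 2.14 itself, only a minimal illustration that assumes the standard SIR right-hand sides (−βSI, βSI − αI, αI), a step size of 1/512, and a stopping rule I ≤ 10^(−6); the function names are hypothetical.

```python
ALPHA = 167e-3   # recovery rate, per day
BETA = 25e-5     # attack rate, per person per day (one of the values used in section 2.4.1.2)
N = 4500.0       # total population, in thousand persons
DT = 1 / 512     # assumed step size
EPS = 1e-6       # assumed tolerance on I for "equilibrium reached"

def sir_rhs(state, alpha=ALPHA, beta=BETA):
    """Derivative vector (dS/dt, dI/dt, dR/dt) of the assumed SIR system."""
    S, I, R = state
    new_infections = beta * S * I
    recoveries = alpha * I
    return (-new_infections, new_infections - recoveries, recoveries)

def euler_sir(state, dt=DT, eps=EPS, t_max=2000.0):
    """Apply X_{n+1} = X_n + dt * X'(X_n) until I <= eps; return (days, final state)."""
    steps = 0
    max_steps = int(t_max / dt)  # safety cap so the loop always terminates
    while state[1] > eps and steps < max_steps:
        deriv = sir_rhs(state)
        state = tuple(x + dt * dx for x, dx in zip(state, deriv))
        steps += 1
    return steps * dt, state

t_eq, (S_end, I_end, R_end) = euler_sir((N - 1.0, 1.0, 0.0))
print(f"t_eq = {t_eq:.2f} days, S = {S_end:.3f}, I = {I_end:.2e}, R = {R_end:.3f}")
```

Because the three derivative components sum to zero, the update preserves S + I + R = N up to rounding error, which gives a cheap sanity check on any run.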

**2.4.1.1 Equilibrium points**

• At the equilibrium point, all three rates of change vanish:

$$\frac{dS}{dt} = \frac{dI}{dt} = \frac{dR}{dt} = 0 \;\Longrightarrow\; \alpha I = 0 \;\Longrightarrow\; I = 0$$

There will be no infected person at the equilibrium point (the infection should get removed).

• As can also be seen from the following figure 2.15, **I = 0** is an **equilibrium** point, which is quite expected, since at equilibrium all of the infected population will have moved to the removed state.

• Also, at every point the invariant **S + I + R = N** holds.

• In this particular case shown in figure 2.15, the susceptible population S also becomes 0 at equilibrium (since all the population got infected initially, all of them need to move to removed state) and R = N = 4500 (in thousand persons).

**2.4.1.2 Results with Euler method**

• As explained in the previous sections, the same iterative method is used to find the right step size for Euler’s method. The smaller of the two step sizes so determined is used.

• The following figures show the solutions obtained with different values of α and β, with the initial infected population size I(0) = 1 (in thousand persons). Values of β higher than the one obtained from the literature are used for the simulation, since β = 25 × 10^(−6) /person /day is too small for the epidemic to grow (the results are not interesting) when using Euler’s method, at least down to ∆t = 1/2^15, beyond which the iterative Euler’s method becomes very slow.

• As can be seen, from the figures 2.16, 2.17 and 2.19, at equilibrium, I becomes zero.

• The solution (number of days to reach equilibrium) obtained at α = 167 × 10^(−3) /day and β = 25 × 10^(−5) /person /day is **teq** = 143.35546875 ≈ 144 days with I(0) = 1 (in thousand persons); the corresponding figure is figure 2.16.

• The solution (number of days to reach equilibrium) obtained at α = 167 × 10^(−3) /day and β = 5 × 10^(−5) /person /day is **teq** ≈ 542 days with I(0) = 1 (in thousand persons); the corresponding figure is figure 2.17.

• Hence, the higher the β value, the earlier the equilibrium is reached.

• The solution obtained at α = 500 × 10^(−3) /day and β = 25 × 10^(−5) /person /day is **teq** ≈ 78 days with I(0) = 1 (in thousand persons); the corresponding figure is figure 2.19.

• Hence, the higher the α value, the earlier the equilibrium is reached.

• The solution obtained at α = 167 × 10^(−3) /day and β = 25 × 10^(−5) /person /day is **teq** = 140 days with I(0) = 10 (in thousand persons). Hence, as expected, the larger the initial infected population, the quicker the equilibrium is reached.

• At equilibrium, S does not necessarily become close to zero, since sometimes the entire population may not get infected ever, as shown in the figure 2.17, where at equilibrium the susceptible population is non-zero.

• As can be seen from the phase planes from following figure 2.21, at equilibrium, the infected population becomes 0.

**2.4.2 Analytic Solution and Insights**

**2.4.2.1 Basic Reproduction Number (R0)**

The **basic reproduction number** (also called basic reproduction ratio) is defined by **R0 = β / α** (with β in /person /day and α in /day, its unit is /person, so that 1/R0 can be compared directly with a population size). As explained in [2], this ratio is derived as the expected number of new infections (these new infections are sometimes called secondary infections) from a single infection in a population where all subjects are susceptible. How the dynamics of the system depends on R0 will be discussed next.

**2.4.2.2 The dynamics of the system as a function of R0**

• By dividing the first equation by the third in 2.1, as done in [2], the following equation (2.7) is obtained:

$$\frac{dS}{dR} = -\frac{\beta}{\alpha}\, S = -R_0 S \;\Longrightarrow\; S(t) = S(0)\, e^{-R_0 (R(t) - R(0))} \tag{2.7}$$

• Now, as t → ∞, the equilibrium must have been reached and all infections must have been removed, so that *lim* (t→∞) I(t) = 0.

• Also, let R∞ = *lim* (t→∞) R(t).

• Then, from the above equation 2.7 and S∞ = N − R∞, **R∞ = N − S(0).exp(−R0.(R∞ − R(0)))**.

• As explained in [2], the above equation shows that at the end of an epidemic, unless S(0) = 0, not all individuals of the population have recovered, so some must remain susceptible.

• This means that the end of an epidemic is caused by the decline in the number of infected individuals rather than an absolute lack of susceptible subjects [2].

• The role of the basic reproduction number is extremely important, as explained in [2]. From the differential equation for I, the following equation can be obtained:

$$\frac{dI}{dt} = (\beta S - \alpha)\, I = \beta \left(S - \frac{1}{R_0}\right) I$$

• **S(t) > 1/R0** ⇒ **dI(t)/dt > 0** ⇒ there will be a **proper epidemic outbreak** with an increase in the number of the infectious (which can reach a considerable fraction of the population).

• **S(t) < 1/R0 ⇒ dI(t)/dt < 0** ⇒ independently of the initial size of the susceptible population, the disease can never cause a proper epidemic outbreak.

• As can be seen from the following figures 2.21 and 2.22 (from the simulation results obtained with Euler method), when **S(0) > 1/R0** , there is a peak in the infection curve, indicating a proper epidemic outbreak.

• Also, from the figures 2.21 and 2.22, when **S(0) > 1/R0**, the larger the gap between **S(0)** and **1/R0**, the higher the peak is (the more people get infected) and the quicker the peak is attained.

• Again, from the figure 2.22, when 4490 = **S(0) < 1/R0** = 5000, it never causes a proper epidemic outbreak.
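Both the outbreak threshold and the final-size relation can be evaluated numerically. The sketch below tests S(0) > α/β and solves R∞ = N − S(0).exp(−R0.R∞) by bisection, assuming R(0) = 0 and the document's definition R0 = β/α. The function names are hypothetical and the bisection is just one convenient root-finder, not the method used in the report.

```python
import math

def outbreak_expected(s0, alpha, beta):
    """Threshold test: a proper outbreak needs S(0) > 1/R0 = alpha / beta."""
    return s0 > alpha / beta

def final_size(s0, i0, alpha, beta, tol=1e-9):
    """Solve R_inf = N - S(0) * exp(-R0 * R_inf) by bisection, assuming R(0) = 0.

    Returns (R_inf, S_inf) in the same units as s0 and i0.
    """
    n = s0 + i0
    r0 = beta / alpha

    def g(r):
        return n - s0 * math.exp(-r0 * r) - r

    lo, hi = 0.0, n          # g(0) = i0 > 0 and g(n) < 0, so a root lies in between
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    r_inf = 0.5 * (lo + hi)
    return r_inf, n - r_inf

# Populations in thousand persons, alpha = 167 x 10^(-3) / day.
print(outbreak_expected(4499.0, 167e-3, 25e-6))   # beta value from the literature
print(outbreak_expected(4499.0, 167e-3, 25e-5))   # larger beta used in the simulations
r_inf, s_inf = final_size(4499.0, 1.0, 167e-3, 25e-5)
print(f"R_inf = {r_inf:.2f}, S_inf = {s_inf:.2f} (thousand persons)")
```

With the literature value of β the threshold test fails, which matches the earlier remark that β = 25 × 10^(−6) is too small to grow an epidemic, while the larger β passes it and still leaves a small positive susceptible population at the end.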


• Again, by dividing the second equation by the first in 2.1, the following equation (2.8) is obtained:

$$\frac{dI}{dS} = -1 + \frac{1}{R_0 S} \;\Longrightarrow\; I = -S + \frac{1}{R_0} \ln S + C \tag{2.8}$$

• As can be noticed from the above figure 2.23, because the formulas differ only by an additive constant, these curves are all vertical translations of each other.

• The line I(t) = 0 consists of equilibrium points.

• Starting out at a point on one of these curves with I(t) > 0, as time goes on one needs to travel along the curve to the left (because dS/dt < 0), eventually approaching I(t) = 0 at some positive value of S(t).

• This must happen since, on any of these curves, I(t) → −∞ as S(t) → 0, from equation 2.8.

• So the answer to question (2) is that the epidemic will end with S(t) approaching some positive value, and thus there must always be some susceptibles left over.

• As can be seen from the following figure 2.24 (from the simulation results obtained with Euler’s method), when S(0) > 1/R0, the smaller the gap between S(0) and 1/R0, the larger the population that remains susceptible at equilibrium (or as t → ∞).
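The phase-plane argument can be made explicit with a short derivation. This is a sketch that assumes the standard SIR right-hand sides and the document's definition R0 = β/α. Dividing dI/dt by dS/dt and integrating:

$$\frac{dI}{dS} = \frac{\beta S I - \alpha I}{-\beta S I} = -1 + \frac{1}{R_0 S} \;\Longrightarrow\; I(t) + S(t) - \frac{1}{R_0} \ln S(t) = I(0) + S(0) - \frac{1}{R_0} \ln S(0)$$

Setting I = 0 at the end of the epidemic gives an implicit equation for the remaining susceptible population S∞:

$$S_\infty - \frac{1}{R_0} \ln S_\infty = I(0) + S(0) - \frac{1}{R_0} \ln S(0)$$

Since the left-hand side tends to +∞ as S∞ → 0, the equation cannot be satisfied at S∞ = 0, so a positive number of susceptibles always remains.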

**Conclusions**

In this report, the spread of the pandemic influenza A (H1N1) outbreak in Kolkata, West Bengal, India, in 2010 was simulated using the basic epidemic SIR model. Initially there was a small number of infected persons in the population, most of the population consisted of susceptible persons (not yet infected but prone to infection), and there were zero removed persons. Given the initial values of the variables and the parameter values (attack and recovery rates of the flu), an attempt was made to answer the following questions with the simulation / analysis:

1. Will the number of infected persons increase substantially, producing an epidemic, or will the flu fizzle out?

2. Assuming there is an epidemic, how will it end? Will there still be any susceptibles left when it is over?

3. How long will the epidemic last?

The following conclusions are obtained after running the simulations with

different values of the parameters and the initial values of the variables:

1. When the recovery rate α is ≈ 0 or very low compared to the attack rate β (so that R0 = β / α >> 1) and I(0) > 1, the flu will turn into an epidemic and the entire population will be infected first (the higher β is, the quicker the epidemic breaks out).

2. To be more precise, when the initial susceptible population S(0) is greater than the inverse of the basic reproduction number 1/R0 = α / β, a proper epidemic will break out.

3. When the initial susceptible population S(0) is less than the inverse of the basic reproduction number 1/R0 = α/β, then a proper epidemic will never break out.

4. If the initial susceptible population is non-zero, in the end (at equilibrium) there will always be some susceptible population.

5. When there is an epidemic, it will eventually end at the equilibrium point with 0 infected population. How fast it reaches the equilibrium depends upon the recovery rate (the higher α is, the quicker the infection is removed).

6. The time to reach the equilibrium can be computed using Euler’s method. It depends on the parameters α (the higher the quicker) and β (the higher the quicker) and on the initial infected population size I(0) (the higher the quicker).

7. Scope of improvement: The SIR model could be extended to The Classic Endemic Model [5] where the birth and the death rates are also considered for the population (this will be particularly useful when a disease takes a long time to reach the equilibrium state).
