ENG Global Risks

Coronavirus: a simulation on R to understand

The purpose of this article is to present the results of a simple simulation carried out to better explain the R0 parameter that commands the spread of an epidemic such as that of the new Coronavirus Covid-19.

I have already mentioned the subject of Coronavirus and its direct and indirect economic impact on the Chinese economy in a first article.

The purpose of this new article is to present the results of a small simulation that I carried out to better explain the parameters that command the spread of an epidemic such as coronavirus, which is fast becoming a global pandemic.

First of all, I am not an epidemiologist, I am not a public health specialist, let alone a microbiologist. The analyses carried out here are only a way of popularizing certain notions used by specialists to identify the dynamics of this epidemic, which broke out in China at the end of 2019 and is now taking on a global scale. I strongly invite anyone interested to consult the work of specialists on these issues, including those at Johns Hopkins University in the United States, and imperial college in London.

In my first article I outlined some basic sets on the compartmental models used in epidemiology, and I mentioned a number of recent articles and studies on the new coronavirus (Covid-19) and its spread in China in the first two months of the year, as well as the response of the Chinese authorities.

I sought to go further by performing a simulation based on a model of the type SEIR (Susceptible, Exposed, Infected, Recovered).

I also sought to integrate a “network effect” through modelling of the number of daily contacts within a given population. This parameter, which is a proxy for “population density,” is crucial in understanding how quickly an infection spreads according to spatial configuration.

For this, I used the library of Programs in R contained in the Epimodel package, which can be found on the eponymous site. I have also integrated a lethality risk profile that increases according to the age of individuals, through a logistics-type “rise-up” function. Finally, I did not take into account the natural demographic changes in the population so as not to burden the model. But it would not have had a real impact on the results, given that we are looking at the evolution of an epidemic over a short period of time (180 days).

Here are the different parameters I used to calibrate the model:

  • Total number of individuals: 1000. Selected by sampling based on a uniform distribution between 1 and 80 years (rectangular demographic pyramid).
  • Number of “active” daily interactions between individuals: 5
  • Number of individuals originally infected: 1
  • Number of actual contacts per day between an infected person and a healthy person when matched in the network: 1. This is an indirect parameter, because for reasons of computatibility of the model, it was easier to assume that there were 1/10 contacts each day within each set-up S-I (Susceptible – Infected) and that the average duration of such a pairing was 10 days. But such a hypothesis does not change the overall dynamics of the model in a homogeneous population in which all individuals have an identical risk of infection.
  • Average non-contagious incubation period of the virus (E-I): 6.5 days (i.e. an Epsilon transition rate – 0.15 between Exposed and Infected states)
  • Average duration of contagious infection until recovery (I-R): 6.5 days
  • The average probability of death from infection (Mu) was set at 2%. The probability of death by age has been calibrated as the distribution function of a logistic law with an expectation of 0.5 and a scale parameter of 0.1. The result is that the maximum probability of death for the most vulnerable, i.e. the oldest, is 4%.

We selected four variants for the basic reproduction rate (R0): 0.9 , 1.35 , 1.8 , 2.25.

  • These variants are obtained by taking different probabilities of infection during contact between a healthy individual and an infected individual (which result from different levels of precaution). We could also have changed the number of contacts. It would have had the same effect.
  • The probability of infection multiplied by the expected total number of contacts within the pairings between Susceptible-Infected individuals (S-I) gives Beta, i.e. the probability of a person’s transition from the susceptible state (S) to the carrier state virus (E or Exposed).
  • To obtain R0 it is necessary to multiply Beta by the average time to pass from an infected state to the state of recovering (R) or the state of death (D). This delay is the opposite of the sum of the Gamma and Mu parameters, i.e. the probability of transition between infectious state (I) and convalescnce (R) – Gamma – or between infectious state (I) and death (D) – Mu.
  • R0 = Beta / (Gamma + Mu) in the simplified case where the natural demographic dynamics of the population (natural birth and mortality rates) are not taken into account. The evolution of deaths is only due to the epidemic.

It is indeed fundamental to reason in flux when studying a phenomenon such as an epidemic and to compare inflows (S–E–I) and outflows (I —R or D). The basic reproductive rate is a synthetic ratio that reflects the overall dynamics of an epidemic by reporting inflows to outflows. Nevertheless, it should be kept in mind that this R0 setting is itself a random variable. The model used here hypothesizes that certain key events that impact R0, such as the likelihood of infection of a healthy individual and the likelihood of death of an infected individual follow a binomial law. It is also assumed that the underlying distributions of these events are stable over time and more fundamentally that the random processes that are associated with these variables are markovian. That is, the state of the system at the t+1 moment depends only on its state at the t moment and a constant transition probability. In addition, the network of interactions between individuals is homogeneous in our case. This means that there are no different groups in the population, other than the age factor, which only influences the fatality rate. These limitations are important to take into account in order to understand the essentially didactic nature of this exercise.

Below are the detailed results of the simulation performed, indicating the prevalence of each of the different states (S, E, I, R, D) at any given time in the total population, out from a single infected individual at the beginning of the simulation. It can be observed that the share of the population infected with the virus increases well with R0, but that the prevalence of the infection in the total population remains low, due to the speed of the transition between the healthy, exposed, infected and recovered states.

Population evolution of different compartments: S (Susceptible), E (Exposed), I (Infected), R (Recovered), D (Deceased)

We also show here the number of deaths obtained in each variant of R0. In the case of Covid-19, several studies and expert comments, which we cited in our first article shows that the case fatality rate of 2% initially advanced by the authorities is probably greatly exaggerated, due to the numerous asymptomatic cases that have not been detected. In any case, our results should be read with caution and interpreted only as the product of a hypothetical (“what if”) modelling exercise. Their interest lies in the comparison between different scenarios, according to the basic reproductive number (R0), highlighting the role played by different inflows and outflows in the dynamics of an epidemic. This puts into perspective the disproportionate attention given by the mass media to the number of infections by the virus, at the expense of a better understanding of the overall dynamics of the epidemic.

Number of simulated deaths on a population of 1000 individuals for each baseline reproductive rate, assuming no precautionary measures are taken to reduce R0.


Those who want to learn more about the modeling methods used can read the following two articles:

  1. BUTTS, Carter T. network: A Package for Managing Relational Data in R. Journal of Statistical Software[S.l.], v. 24, Issue 2, p. 1 – 36, May 2008. ISSN 1548-7660. Available at: <https:””>.</https:> 
  2. JENNESS, Samuel M.; GOODREAU, Steven M.; MORRIS, Martina. EpiModel: An R Package for Mathematical Modeling of Infectious Disease over Networks. Journal of Statistical Software, [S.l.], v. 84, Issue 8, 1 – 47, A.C. 2018. ISSN 1548-7660. Available at: <https:””>.</https:>

2 replies on “Coronavirus: a simulation on R to understand”

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.