In this report we model the COVID-19 epidemic in South Africa. We use mobility data to model the reproduction number of the epidemic over time with a Bayesian hierarchical model. Results are calibrated to reported deaths only. The approach adapts the work in [1] to South Africa. Similar models have been built for Brazil [2] and the United States [3].

We use data as compiled in [4] and mobility indexes by province from [5]. This report is automatically generated using R [6] and contains data available on 29 May 2020.


Key features of the approach employed are:

See a more detailed description of the methodology and assumptions below.


Below we show the mobility data from [5]. We use the indexes at a provincial level but here we plot the national indexes.


Below we plot the mobility data from [5] for South Africa as a whole. Several trends are clearly observable over time:

  • Mobility generally declines ahead of the lockdown on 27 March.
  • Mobility increases in the days just before the lockdown, particularly in the Grocery & Pharmacy and Retail indexes. This is perhaps an indication of pre-lockdown “panic buying”.
  • Mobility is relatively stable, at low levels, during level 5 lockdown.
  • Mobility increases when level 4 starts.

Below we summarise this chart into the 3 indicators used in the model:

The model uses the provincial versions of these indexes, which we do not show separately here.


This section shows the calibration of the model to the data for individual provinces. Three panels are plotted for each province:

  1. The first panel shows the modelled daily number of infections (blue) compared to confirmed case counts (brown) as reported for the province. Note that this model does not calibrate to case data, but this data is shown for reference.
  2. The second panel shows the daily count of deaths reported in the province (brown) compared to the modelled deaths (blue).
  3. The third panel shows the estimates for \(R_{t,m}\) for each province.

In all the charts the darker shaded area represents a 50% credible interval and the lighter shaded area represents a 95% credible interval.
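As an illustration of how shaded bands of this kind are typically derived from a Bayesian model's output, the sketch below computes 50% and 95% intervals per day from posterior draws. It assumes equal-tailed percentile intervals, which this report does not state explicitly; the function name `credible_bands` and the synthetic draws are hypothetical and are not taken from the report's code.

```python
import numpy as np


def credible_bands(samples):
    """Summarise posterior draws into a median and 50% / 95% bands.

    samples: array of shape (n_draws, n_days), one row per posterior draw.
    Assumption: equal-tailed percentile intervals are used for the bands.
    """
    samples = np.asarray(samples, dtype=float)
    lo95, lo50, median, hi50, hi95 = np.percentile(
        samples, [2.5, 25.0, 50.0, 75.0, 97.5], axis=0
    )
    return {
        "median": median,
        "band50": (lo50, hi50),  # darker shaded area
        "band95": (lo95, hi95),  # lighter shaded area
    }


# Synthetic draws for illustration only; these are NOT the report's results.
rng = np.random.default_rng(0)
draws = rng.lognormal(mean=np.log(1.5), sigma=0.2, size=(4000, 10))
bands = credible_bands(draws)
```

By construction the 50% band always lies inside the 95% band, which is why the darker area in each chart is nested within the lighter one.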

In general, it is noted:

Western Cape

The Western Cape has the most reported deaths of all provinces and hence the most data to calibrate to. Below we plot the modelled infections. It is clear that infections are far outpacing reported cases. Over the last 14 days it would appear that only 2.7% of all new infections in the Western Cape were tested.

Below we plot reported deaths (brown) against modelled deaths (blue). Deaths in this province are increasing, and the model appears to fit reasonably well. The data is quite variable from day to day, which may reflect data processing delays causing clumping of reported deaths.

Western Cape \(R_{t,m}\) is currently in a band between 1.5 and 2.0, having never moved much below 1.5. We are therefore probably dealing with an epidemic that is already spreading rapidly.

Eastern Cape

The Eastern Cape has also recently had an increase in deaths. Data is still relatively sparse, but the model is clearly trending upwards to accommodate more frequent reporting of deaths. \(R_{t,m}\) for the Eastern Cape appears to be slightly higher than for the Western Cape, which is a concerning trend. Over the last 14 days it would appear that 2.7% of all new infections were tested.


Gauteng

Gauteng has few reported deaths. The long period of relatively low deaths is pulling the values of \(R_{t,m}\) during level 4 lockdown towards 1. It is encouraging to see a slower epidemic in Gauteng. Over the last 14 days it would appear that 17.3% of all new infections were tested.