Using information theory to optimise epidemic models for real-time prediction and estimation.
Parag KV., Donnelly CA.
The effective reproduction number, Rt, is a key time-varying prognostic for the growth rate of any infectious disease epidemic. Significant changes in Rt can forewarn about new transmissions within a population or predict the efficacy of interventions. Inferring Rt reliably and in real time from observed time series of infected case counts (demographic data) is an important problem in population dynamics. The renewal or branching process model is a popular solution that has been applied to Ebola and Zika virus disease outbreaks, among others, and is currently being used to investigate the ongoing COVID-19 pandemic. This model estimates Rt using a heuristically chosen piecewise function. While this facilitates real-time detection of statistically significant Rt changes, inference is highly sensitive to the function choice. Improperly chosen piecewise models might ignore meaningful changes or over-interpret noise-induced ones, yet still produce visually reasonable estimates. No principled piecewise selection scheme exists. We develop a practical yet rigorous scheme using the accumulated prediction error (APE) metric from information theory, which deems most justified the model that describes the observed data using the fewest bits. We derive exact posterior prediction distributions for infected population size and integrate these within an APE framework to obtain an exact and reliable method for identifying the piecewise function best supported by available epidemic data. We find that this choice optimises short-term prediction accuracy and can rapidly detect salient fluctuations in Rt, and hence in the infected population growth rate, in real time over the course of an unfolding epidemic. Moreover, we emphasise the need for formal selection by exposing how common heuristic choices, which seem sensible, can be misleading. Our APE-based method is easily computed and broadly applicable to statistically similar models found, for example, in phylogenetics and macroevolution.
Our results explore the relationships among estimate precision, forecast reliability and model complexity.
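
As a rough illustration of the APE idea described above, the following minimal sketch scores candidate window lengths for a piecewise-constant Rt under a Poisson renewal model. It is not the authors' implementation. It assumes a gamma prior on Rt (so that the one-step-ahead posterior predictive distribution is negative binomial), a known generation-time distribution, and a window that simply excludes the seed time point. The function names, the prior hyperparameters and the example incidence series are all hypothetical.

```python
import math

def total_infectiousness(incidence, weights, t):
    """Lambda_t = sum over u >= 1 of w_u * I_{t-u}, with 0-based time index t."""
    return sum(w * incidence[t - u]
               for u, w in enumerate(weights, start=1) if t - u >= 0)

def log_neg_binomial(x, shape, p):
    """Log pmf of a negative binomial with real-valued shape and success probability p."""
    return (math.lgamma(x + shape) - math.lgamma(shape) - math.lgamma(x + 1)
            + shape * math.log(p) + x * math.log1p(-p))

def ape(incidence, weights, k, prior_shape=1.0, prior_scale=5.0):
    """Accumulated one-step-ahead prediction error (in bits) for window length k.

    Assumed model: I_t ~ Poisson(R * Lambda_t) with R constant over the last
    k time points and R ~ Gamma(prior_shape, scale=prior_scale).
    """
    n = len(incidence)
    lam = [total_infectiousness(incidence, weights, t) for t in range(n)]
    total = 0.0
    for t in range(1, n - 1):            # predict I[t+1] from data up to time t
        lo = max(1, t - k + 1)           # skip t = 0, where Lambda_0 = 0
        shape = prior_shape + sum(incidence[lo:t + 1])
        rate = 1.0 / prior_scale + sum(lam[lo:t + 1])
        if lam[t + 1] <= 0:              # no infectious pressure: nothing to score
            continue
        p = rate / (rate + lam[t + 1])   # gamma-Poisson mixture is negative binomial
        total -= log_neg_binomial(incidence[t + 1], shape, p)
    return total / math.log(2)           # convert nats to bits

def select_window(incidence, weights, candidates):
    """Return the candidate window length with the smallest APE, plus all scores."""
    scores = {k: ape(incidence, weights, k) for k in candidates}
    return min(scores, key=scores.get), scores

# Hypothetical incidence curve and generation-time weights, for illustration only.
example_incidence = [5, 6, 8, 10, 13, 17, 22, 20, 15, 11, 8, 6, 4, 3]
example_weights = [0.2, 0.5, 0.3]
best_k, all_scores = select_window(example_incidence, example_weights, (1, 3, 7))
```

Under these assumptions, short windows track changes in Rt quickly but give noisy predictions, while long windows smooth over real changes. The accumulated log-loss offers one way to balance the two using the data alone.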