Abstract

The renewal model uses the observed incidence across an epidemic to estimate its underlying time-varying effective reproductive number, R(t). The skyline model infers the time-varying effective population size, N(t), responsible for the shape of an observed phylogeny of sequences sampled from an infected population. While both models solve different epidemiological problems, the bias and precision of their estimates depend on p-dimensional piecewise-constant descriptions of their variables of interest. At large p, estimates can detect rapid changes but are noisy, while at small p, inference, though precise, lacks temporal resolution. Surprisingly, no transparent, principled approach for optimally selecting p, for either model, exists. Usually, p is set heuristically, or obscurely controlled using complex algorithms. We present an easily computable and interpretable method for choosing p based on the minimum description length (MDL) formalism of information theory. Unlike many standard model selection techniques, MDL accounts for the additional statistical complexity induced by how parameters interact. As a result, our method optimises p so that R(t) and N(t) estimates properly adapt to the available data. It also outperforms comparable Akaike and Bayesian information criteria over several model classification problems. Our approach requires some knowledge of the parameter space, and exposes the similarities between renewal and skyline models.
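
To make the role of p concrete, the following is a minimal sketch of the baseline the abstract compares against: scoring p-segment piecewise-constant fits of R(t) with the Akaike and Bayesian information criteria. It is not the MDL criterion of the paper, whose exact form is not given in the abstract. The Poisson renewal likelihood, the equal-width segments, the generation-time weights `w`, and the noise-free synthetic incidence are all assumptions made for illustration.

```python
import math

def total_infectiousness(incidence, w):
    """Lambda_t = sum_s w[s-1] * I_{t-s}: past cases weighted by generation time."""
    return [sum(w[s - 1] * incidence[t - s] for s in range(1, min(t, len(w)) + 1))
            for t in range(len(incidence))]

def simulate(r_true, w, seed_cases=10):
    """Noise-free toy epidemic: I_t = round(R_t * Lambda_t). Illustrative only."""
    incidence = [seed_cases]
    for r in r_true:
        t = len(incidence)
        lam_t = sum(w[s - 1] * incidence[t - s] for s in range(1, min(t, len(w)) + 1))
        incidence.append(round(r * lam_t))
    return incidence

def fit_piecewise_r(incidence, lam, p):
    """Maximum-likelihood R per segment for p equal-width segments.

    Assumes I_t ~ Poisson(R_k * Lambda_t) within segment k and that every
    segment contains at least one case, so all fitted means are positive.
    """
    n = len(incidence)
    r_hat, loglik = [], 0.0
    for k in range(p):
        a, b = k * n // p, (k + 1) * n // p
        r = sum(incidence[a:b]) / sum(lam[a:b])  # Poisson MLE for this segment
        r_hat.append(r)
        for i_t, l_t in zip(incidence[a:b], lam[a:b]):
            mu = r * l_t
            loglik += i_t * math.log(mu) - mu - math.lgamma(i_t + 1)
    return r_hat, loglik

# Hypothetical generation-time weights and a single step change in R(t).
w = [0.2, 0.5, 0.3]
r_true = [2.0] * 20 + [0.7] * 20
cases = simulate(r_true, w)
lam = total_infectiousness(cases, w)

# Condition on the seed day: fit only t = 1..40, where Lambda_t > 0.
obs, lam_obs = cases[1:], lam[1:]
n = len(obs)

scores = {}
for p in range(1, 9):
    r_hat, loglik = fit_piecewise_r(obs, lam_obs, p)
    aic = 2 * p - 2 * loglik
    bic = p * math.log(n) - 2 * loglik
    scores[p] = (r_hat, aic, bic)

best_aic = min(scores, key=lambda p: scores[p][1])
best_bic = min(scores, key=lambda p: scores[p][2])
r_hat_best = scores[best_bic][0]
print("p chosen by AIC:", best_aic, "| p chosen by BIC:", best_bic)
print("R estimates at the BIC choice:", [round(r, 3) for r in r_hat_best])
```

On this toy series both criteria recover the two-segment description, because the only change in R(t) coincides with a segment boundary and extra segments add penalty without improving the fit. Both penalties depend only on the count p, which is the limitation the abstract attributes to standard criteria: they ignore the complexity induced by how the parameters interact.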

Original publication

DOI

10.1101/703751

Type

Journal article

Publication Date

16/07/2019