Cookies on this website
We use cookies to ensure that we give you the best experience on our website. If you click 'Continue' we'll assume that you are happy to receive all cookies and you won't see this message again. Click 'Find out more' for information on how to change your cookie settings.

One approach to the reconstruction of infectious disease transmission trees from pathogen genomic data has been to use a phylogenetic tree, reconstructed from pathogen sequences, and annotate its internal nodes to provide a reconstruction of which host each lineage was in at each point in time. If only one pathogen lineage can be transmitted to a new host (i.e., the transmission bottleneck is complete), this corresponds to partitioning the nodes of the phylogeny into connected regions, each of which represents evolution in an individual host. These partitions define the possible transmission trees that are consistent with a given phylogenetic tree. However, the mathematical properties of the transmission trees given a phylogeny remain largely unexplored. Here, we describe a procedure to calculate the number of possible transmission trees for a given phylogeny, and we then show how to uniformly sample from these transmission trees. The procedure is outlined for situations where one sample is available from each host and trees do not have branch lengths, and we also provide extensions for incomplete sampling, multiple sampling, and the application to time trees in a situation where limits on the period during which each host could have been infected and infectious are known. The sampling algorithm is available as an R package (STraTUS).

Original publication

DOI

10.1093/molbev/msz058

Type

Journal article

Journal

Molecular biology and evolution

Publication Date

06/2019

Volume

36

Pages

1333 - 1343

Addresses

Nuffield Department of Medicine, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom.