Bayesian hierarchical mixture modeling to assign copy number from a targeted CNV array.

Cardin N.; Holmes C.; Wellcome Trust Case Control Consortium; Donnelly P.; Marchini J.

Accurate assignment of copy number at known copy number variant (CNV) loci is important both for understanding the structural evolution of genomes and for carrying out association studies of copy number with disease. As with calling SNP genotypes, the task can be framed as a clustering problem, but for a number of reasons assigning copy number is much more challenging. CNV assays have lower signal-to-noise ratios than SNP assays, often display heavy-tailed and asymmetric intensity distributions, contain outlying observations, and may exhibit systematic technical differences among cohorts. In addition, the number of copy-number classes at a CNV in the population may be unknown a priori. Due to these complications, automatic and robust assignment of copy number from array data remains a challenging problem. We have developed a copy number assignment algorithm, CNVCALL, for a targeted CNV array, such as that used by the Wellcome Trust Case Control Consortium's recent CNV association study. We use a Bayesian hierarchical mixture model that robustly identifies both the number of different copy number classes at a specific locus and the relative copy number of each individual in the sample. This approach is fully automated, which is a critical requirement when analyzing large numbers of CNVs. We illustrate the method's performance using real data from the Wellcome Trust Case Control Consortium's CNV association study and using simulated data.
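To make the clustering framing concrete, the following is a minimal sketch of copy number assignment as a one-dimensional mixture problem in which the number of classes is not known in advance. It is not CNVCALL and not the Bayesian hierarchical model described in the abstract: it swaps in a plain Gaussian mixture fitted by EM, with the number of classes chosen by BIC, and it does not address heavy tails, asymmetry, outliers, or cohort effects. All function names, parameter values, and the simulated intensities are assumptions made for illustration.

```python
import math
import random


def log_mixture_terms(v, weights, means, variances):
    """Log of weight_j * Normal(v | mean_j, variance_j) for every component j."""
    return [
        math.log(w) - 0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        for w, m, s2 in zip(weights, means, variances)
    ]


def log_sum_exp(terms):
    """Numerically stable log(sum(exp(t) for t in terms))."""
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def fit_mixture(x, k, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative, not CNVCALL)."""
    n = len(x)
    lo, hi = min(x), max(x)
    span = (hi - lo) or 1.0
    # Assumption: start with component means evenly spaced over the data range.
    means = [lo + (j + 0.5) * span / k for j in range(k)]
    variances = [(span / (2 * k)) ** 2] * k
    weights = [1.0 / k] * k
    var_floor = 1e-4 * span ** 2  # keeps a component from collapsing onto one point
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each observation.
        resp = []
        for v in x:
            terms = log_mixture_terms(v, weights, means, variances)
            total = log_sum_exp(terms)
            resp.append([math.exp(t - total) for t in terms])
        # M-step: re-estimate weights, means and variances from those posteriors.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-9)
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            variances[j] = max(var, var_floor)
            weights[j] = max(nj / n, 1e-9)
    loglik = sum(
        log_sum_exp(log_mixture_terms(v, weights, means, variances)) for v in x
    )
    return weights, means, variances, loglik


def bic(loglik, k, n):
    """Bayesian information criterion; a k-component fit has 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


def call_copy_number_classes(x, max_k=4):
    """Choose the number of classes by BIC, then assign each sample to a class."""
    fits = {k: fit_mixture(x, k) for k in range(1, max_k + 1)}
    scores = {k: bic(fits[k][3], k, len(x)) for k in fits}
    best_k = min(scores, key=scores.get)
    weights, means, variances, _ = fits[best_k]
    labels = []
    for v in x:
        terms = log_mixture_terms(v, weights, means, variances)
        labels.append(terms.index(max(terms)))  # most probable class
    return best_k, means, labels, scores


# Simulated intensities for three hypothetical copy number classes.
rng = random.Random(0)
intensities = (
    [rng.gauss(0.0, 0.1) for _ in range(50)]
    + [rng.gauss(1.0, 0.1) for _ in range(200)]
    + [rng.gauss(2.0, 0.1) for _ in range(150)]
)
best_k, class_means, labels, bic_scores = call_copy_number_classes(intensities)
```

Here `labels` gives a class index per sample, ordered by increasing mean intensity, which corresponds to relative rather than absolute copy number. A fully Bayesian treatment such as the one the abstract describes would replace the point estimates and the BIC step with priors and posterior inference.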

Original publication

DOI

10.1002/gepi.20604

Type

Journal article

Journal

Genetic epidemiology

Publication Date

09/2011

Volume

Pages

536 - 548

Addresses

Department of Statistics, University of Oxford, 1 South Parks Road, Oxford, United Kingdom. NiallC@gmail.com

Keywords

Wellcome Trust Case Control Consortium, Humans, Cluster Analysis, Bayes Theorem, Cohort Studies, Genotype, Gene Dosage, Algorithms, Models, Genetic, Computer Simulation, Molecular Epidemiology, DNA Copy Number Variations
