A method for pooling alleles from different genotyping experiments.
Aulchenko YS, Bertoli-Avella AM, van Duijn CM.
Short tandem repeat (STR) polymorphisms are widely used in linkage and association studies. One drawback of these markers is that genetic data from different experiments cannot easily be pooled, because both allele length and binning distance may change between experiments. As large studies in which multiple series of subjects are included sequentially become more common, there is increasing interest in pooling the genetic data obtained in different experiments. Correct reconstruction of allelic correspondences between genotyping experiments is particularly crucial for association-oriented studies, such as candidate gene studies and genome-wide association studies in isolated populations. Here, we suggest a maximum-likelihood framework to find the best correspondence between alleles typed in different genotyping experiments. We also address the issues of goodness-of-fit and robustness. We perform a simulation study reproducing the results of a genome scan using 787 STR markers. The simulations show that the suggested method yields a low error rate (3% errors) even when the samples to be pooled are as small as 10 subjects, though only 9% of alleles pass our tests at that sample size. As sample sizes increase to 250 subjects, the proportion of alleles pooled reaches 96%, with an error rate of <0.1%.
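To illustrate the general idea of a likelihood-based allele correspondence, the following is a minimal sketch and not the authors' procedure. It assumes, as a simplification, that the two experiments differ only by a constant integer shift in called allele length (changes in binning distance are ignored) and that both samples are drawn from the same population. For each candidate shift, the two sets of allele counts are pooled and scored by the multinomial log-likelihood evaluated at the pooled allele-frequency estimates; the shift with the highest score is selected. The function names, the `max_shift` parameter, and the allele counts are hypothetical.

```python
from collections import Counter
from math import log


def pooled_log_likelihood(counts_a, counts_b, shift):
    """Multinomial log-likelihood of both samples under shared allele
    frequencies, after relabelling experiment B's alleles by `shift`.

    Assumption: a single constant shift relates the two experiments.
    """
    pooled = Counter(counts_a)
    for length, n in counts_b.items():
        pooled[length + shift] += n
    total = sum(pooled.values())
    # Frequencies are estimated as pooled count / total (their MLE).
    return sum(n * log(n / total) for n in pooled.values() if n > 0)


def best_shift(counts_a, counts_b, max_shift=6):
    """Return the integer shift in [-max_shift, max_shift] with the
    highest pooled log-likelihood."""
    candidates = range(-max_shift, max_shift + 1)
    return max(
        candidates,
        key=lambda s: pooled_log_likelihood(counts_a, counts_b, s),
    )


# Hypothetical allele-length counts (in base pairs) for one marker.
exp_a = {120: 30, 122: 50, 124: 15, 126: 5}
exp_b = {121: 28, 123: 55, 125: 12, 127: 5}

print(best_shift(exp_a, exp_b))
```

With these made-up counts, the selected shift is -1, which maps each allele of the second experiment onto the allele of similar frequency in the first. A correct alignment concentrates the pooled counts on fewer allele labels, which is why it scores a higher likelihood than a misaligned one. This sketch does not include the goodness-of-fit or robustness tests mentioned in the abstract.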