Malaria represents one of the major worldwide challenges to public health. A recent breakthrough in the study of the disease follows the annotation of the genome of the malaria parasite Plasmodium falciparum and the mosquito vector (an organism that spreads an infectious disease) Anopheles. Of particular interest is the molecular biology underlying the immune response system of Anopheles, which actively fights against Plasmodium infection. This article reports a statistical analysis of gene expression time profiles from mosquitoes that have been infected with a bacterial agent. Specifically, we introduce a Bayesian model-based hierarchical clustering algorithm for curve data to investigate mechanisms of regulation in the genes concerned; that is, we aim to cluster genes having similar expression profiles. Genes displaying similar, interesting profiles can then be highlighted for further investigation by the experimenter. We show how our approach reveals structure within the data not captured by other approaches. One of the most pertinent features of the data is the sample size, which records the expression levels of 2,771 genes at 6 time points. Additionally, the time points are unequally spaced, and there is expected nonstationary behavior in the gene profiles. We demonstrate our approach to be readily implementable under these conditions, and highlight some crucial computational savings that can be made in the context of a fully Bayesian analysis. © 2006 American Statistical Association.

Original publication

DOI

10.1198/016214505000000187

Type

Journal article

Journal

Journal of the American Statistical Association

Publication Date

01/03/2006

Volume

101

Pages

18 - 29