BDI Seminar: Exact and efficient tests for heritability and set tests
Regev Schweiger, Computer Science, Tel Aviv University
Monday, 26 February 2018, 10am to 11am
Seminar Room 0, Big Data Institute, Old Road Campus, Oxford OX3 7LF
Abstract:
Testing for the existence of variance components in linear mixed models (LMMs) is a fundamental task in many applied fields. In statistical genetics, the score test has recently become instrumental in testing for an association between a set of genetic markers and a phenotype. With few markers, this amounts to set-based variance component tests, which attempt to increase power in association studies by aggregating weak individual effects. When the entire genome is considered, it allows testing for the heritability of a phenotype, defined as the proportion of phenotypic variance explained by genetics. In the popular score-based Sequence Kernel Association Test (SKAT) method, the assumed distribution of the score test statistic is poorly calibrated in small samples, and correcting it is computationally expensive. This may cause severe inflation or deflation of p-values, even when the null hypothesis is true. I will characterize the conditions under which this discrepancy arises, and show that it can also occur in large real datasets, such as a dataset from the Wellcome Trust Case Control Consortium 2 study (n=13,950), particularly when the individuals in the sample are unrelated. To address this limitation, we suggest an efficient method to calculate exact p-values for the score test, which can speed up the analysis by orders of magnitude. Our results enable fast and accurate application of the score test in heritability testing and in set-based association tests. Time permitting, I will also discuss nonparametric testing for cases where the LMM is not suitable. I will review an efficient method for permutation testing of heritability that achieves a speedup of several orders of magnitude and does not suffer from model misspecification.