Identification of patients with atrial fibrillation: a big data exploratory analysis of the UK Biobank.
Oster J., Hopewell JC., Ziberna K., Wijesurendra R., Camm CF., Casadei B., Tarassenko L.
Objective: Atrial fibrillation (AF) is the most common cardiac arrhythmia, with an estimated prevalence of around 1.6% in the adult population. The analysis of the electrocardiogram (ECG) data acquired in the UK Biobank represents an opportunity to screen for AF in a large sub-population in the UK. The main objective of this paper is to assess ten machine-learning methods for automated detection of subjects with AF in the UK Biobank dataset.
Approach: Six classical machine-learning methods based on support vector machines are proposed and compared with state-of-the-art techniques (including a deep-learning algorithm) and, finally, with a combination of classical machine-learning and deep-learning approaches. Evaluation is carried out on a subset of the UK Biobank dataset, manually annotated by human experts.
Main results: The combined classical machine-learning and deep-learning method achieved an F1 score of 84.8% on the test subset and a Cohen's kappa coefficient of 0.83, which is similar to the inter-observer agreement between two human experts.
Significance: This level of performance indicates that automated detection of AF is possible in patients whose data are stored in a large database such as the UK Biobank. Such automated identification of AF patients would enable further investigations aimed at identifying the different phenotypes associated with AF.
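The two metrics reported in the main results, the F1 score and Cohen's kappa coefficient, can both be derived from a binary confusion matrix (AF versus non-AF). The following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function names and the counts in the usage example are hypothetical and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall for the AF class."""
    precision = tp / (tp + fp)  # fraction of predicted AF that is true AF
    recall = tp / (tp + fn)     # fraction of true AF that was detected
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # proportion of cases where the raters agree
    # Chance agreement: product of the marginal rates for each class, summed.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only (NOT results from the paper).
tp, fp, fn, tn = 40, 10, 10, 940
print(f"F1 = {f1_score(tp, fp, fn):.3f}")
print(f"kappa = {cohen_kappa(tp, fp, fn, tn):.3f}")
```

Because AF is rare in a screening population, plain accuracy would be dominated by the non-AF majority. F1 ignores true negatives and kappa discounts chance agreement, which is presumably why these two measures suit this kind of comparison against human annotators.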