Oxford Big Data Institute

DrugGPT: A Knowledge-Grounded Collaborative Large Language Model for Evidence-based Drug Analysis

Lightweight transformers for clinical natural language processing

Specialised pre-trained language models are becoming more common in Natural Language Processing (NLP) since they can potentially outperform models trained on generic texts. BioBERT (Sanh et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019) and BioClinicalBERT (Alsentzer et al. Publicly available clinical BERT embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, pp. 72-78, 2019) are two examples of such models that have shown promise in medical NLP tasks. Many of these models are overparametrised and resource-intensive, but thanks to techniques like knowledge distillation, it is possible to create smaller versions that perform almost as well as their larger counterparts. In this work, we specifically focus on the development of compact language models for processing clinical texts (i.e. progress notes, discharge summaries, etc.). We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning, with the number of parameters ranging from million to million. These models performed comparably to larger models such as BioBERT and BioClinicalBERT and significantly outperformed other compact models trained on general or biomedical data. Our extensive evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks, including natural language inference, relation extraction, named entity recognition and sequence classification. To our knowledge, this is the first comprehensive study specifically focused on creating efficient and compact transformers for clinical NLP tasks. The models and code used in this study can be found on our Huggingface profile at https://huggingface.co/nlpie and GitHub page at https://github.com/nlpie-research/Lightweight-Clinical-Transformers, respectively, promoting reproducibility of our results.

Decoding 2.3 Million ECGs: Interpretable Deep Learning for Advancing Cardiovascular Diagnosis and Mortality Risk Stratification

MRI economics: Balancing sample size and scan duration in brain wide association studies.

A pervasive dilemma in neuroimaging is whether to prioritize sample size or scan duration given fixed resources. Here, we systematically investigate this trade-off in the context of brain-wide association studies (BWAS) using resting-state functional magnetic resonance imaging (fMRI). We find that total scan duration (sample size × scan duration per participant) robustly explains individual-level phenotypic prediction accuracy via a logarithmic model, suggesting that sample size and scan duration are broadly interchangeable. The returns of scan duration eventually diminish relative to sample size, which we explain with principled theoretical derivations. When accounting for fixed costs associated with each participant (e.g., recruitment, non-imaging measures), we find that prediction accuracy in small-scale BWAS might benefit from much longer scan durations (>50 min) than typically assumed. Most existing large-scale studies might also have benefited from smaller sample sizes with longer scan durations. Both logarithmic and theoretical models of the relationships among sample size, scan duration and prediction accuracy explain well-predicted phenotypes better than poorly-predicted phenotypes. The logarithmic and theoretical models are also undermined by individual differences in brain states. These results replicate across phenotypic domains (e.g., cognition and mental health) from two large-scale datasets with different algorithms and metrics. Overall, our study emphasizes the importance of scan time, which is ignored in standard power calculations. Standard power calculations inevitably maximize sample size at the expense of scan duration. The resulting prediction accuracies are likely lower than would be produced with alternate designs, thus impeding scientific discovery. Our empirically informed reference is available for future study design: WEB_APPLICATION_LINK.
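The trade-off described in this abstract can be made concrete with a small sketch. It assumes a purely logarithmic relationship between prediction accuracy and total scan duration; the function names, the intercept and slope, and all cost figures are hypothetical placeholders rather than values from the study. The abstract also notes that returns from scan duration eventually diminish relative to sample size, which this simple form does not capture.

```python
import math


def total_scan_minutes(n_participants, minutes_per_participant):
    """Total scan duration = sample size x scan duration per participant."""
    return n_participants * minutes_per_participant


def log_model_accuracy(n_participants, minutes_per_participant, intercept, slope):
    """Illustrative logarithmic model: accuracy = intercept + slope * log(N * T).

    intercept and slope are hypothetical placeholders, not fitted values.
    """
    return intercept + slope * math.log(
        total_scan_minutes(n_participants, minutes_per_participant)
    )


def affordable_participants(budget, fixed_cost, cost_per_minute, minutes_per_participant):
    """Participants affordable when each one incurs a fixed cost plus scan time."""
    per_participant = fixed_cost + cost_per_minute * minutes_per_participant
    return int(budget // per_participant)


# Two designs with the same total scan duration get the same modelled accuracy.
design_a = log_model_accuracy(1000, 10, intercept=0.0, slope=0.05)
design_b = log_model_accuracy(500, 20, intercept=0.0, slope=0.05)

# With a fixed per-participant cost (hypothetical figures), longer scans can
# buy more total scan time from the same budget.
n_short = affordable_participants(
    100_000, fixed_cost=500, cost_per_minute=10, minutes_per_participant=10
)
n_long = affordable_participants(
    100_000, fixed_cost=500, cost_per_minute=10, minutes_per_participant=50
)
```

Under these made-up costs the longer-scan design recruits fewer participants but accumulates more total scan minutes, which is the mechanism by which fixed per-participant costs can favour longer scans.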

Authors’ reply to the Discussion of ‘Martingale Posterior Distributions’

Martingale posterior distributions

The prior distribution is the usual starting point for Bayesian uncertainty. In this paper, we present a different perspective that focuses on missing observations as the source of statistical uncertainty, with the parameter of interest being known precisely given the entire population. We argue that the foundation of Bayesian inference is to assign a distribution on missing observations conditional on what has been observed. In the i.i.d. setting with an observed sample of size n, the Bayesian would thus assign a predictive distribution on the missing Yn+1:∞ conditional on Y1:n, which then induces a distribution on the parameter. We utilize Doob’s theorem, which relies on martingales, to show that choosing the Bayesian predictive distribution returns the conventional posterior as the distribution of the parameter. Taking this as our cue, we relax the predictive machine, avoiding the need for the predictive to be derived solely from the usual prior to posterior to predictive density formula. We introduce the martingale posterior distribution, which returns Bayesian uncertainty on any statistic via the direct specification of the joint predictive. To that end, we introduce new predictive methodologies for multivariate density estimation, regression and classification that build upon recent work on bivariate copulas.
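The idea of imputing the missing observations and reading off the statistic of interest can be sketched as follows. This is a minimal illustration only: it uses the empirical (Pólya urn) predictive, which is a simple stand-in rather than the copula-based predictives the paper introduces, and it truncates the unobserved sequence at a finite number of forward draws. The function name and its parameters are hypothetical.

```python
import random


def martingale_posterior_mean(observed, n_forward, n_draws, seed=0):
    """Draw samples of the mean by imputing unseen observations.

    Each draw extends the observed sample by n_forward imputed values, where
    every new value is sampled uniformly from all values seen so far (the
    empirical predictive), and then records the mean of the extended sequence.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        sequence = list(observed)
        for _ in range(n_forward):
            # One-step-ahead predictive: resample from the current sequence.
            sequence.append(rng.choice(sequence))
        draws.append(sum(sequence) / len(sequence))
    return draws


posterior_draws = martingale_posterior_mean(
    [1.0, 2.0, 3.0, 4.0, 5.0], n_forward=200, n_draws=200
)
```

The spread of `posterior_draws` expresses uncertainty about the mean that arises solely from the unobserved part of the sequence, with no prior on the parameter specified directly.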

Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers

Recent work has reported that respiratory audio-trained AI classifiers can accurately predict SARS-CoV-2 infection status. However, it has not yet been determined whether such model performance is driven by latent audio biomarkers with true causal links to SARS-CoV-2 infection or by confounding effects, such as recruitment bias, present in observational studies. Here we undertake a large-scale study of audio-based AI classifiers as part of the UK government’s pandemic response. We collect a dataset of audio recordings from 67,842 individuals, with linked metadata, of whom 23,514 had positive polymerase chain reaction tests for SARS-CoV-2. In an unadjusted analysis, similar to that in previous works, AI classifiers predict SARS-CoV-2 infection status with high accuracy (ROC–AUC = 0.846 [0.838–0.854]). However, after matching on measured confounders, such as self-reported symptoms, performance is much weaker (ROC–AUC = 0.619 [0.594–0.644]). Upon quantifying the utility of audio-based classifiers in practical settings, we find them to be outperformed by predictions on the basis of user-reported symptoms. We make best-practice recommendations for handling recruitment bias, and for assessing audio-based classifiers by their utility in relevant practical settings. Our work provides insights into the value of AI audio analysis and the importance of study design and treatment of confounders in AI-enabled diagnostics.

VertXNet: an ensemble method for vertebral body segmentation and identification from cervical and lumbar spinal X-rays.

Accurate annotation of vertebral bodies is crucial for automating the analysis of spinal X-ray images. However, manual annotation of these structures is a laborious and costly process due to their complex nature, including small sizes and varying shapes. To address this challenge and expedite the annotation process, we propose an ensemble pipeline called VertXNet. This pipeline currently combines two segmentation mechanisms, semantic segmentation using U-Net and instance segmentation using Mask R-CNN, to automatically segment and label vertebral bodies in lateral cervical and lumbar spinal X-ray images. VertXNet adopts a rule-based strategy (termed the ensemble rule) to combine the segmentation outcomes from U-Net and Mask R-CNN. It determines vertebral body labels by recognizing specific reference vertebral instances, such as cervical vertebra 2 ('C2') in cervical spine X-rays and sacral vertebra 1 ('S1') in lumbar spine X-rays. These references are usually relatively easy to identify at the edge of the spine. To assess the performance of our proposed pipeline, we conducted evaluations on three spinal X-ray datasets, including two in-house datasets and one publicly available dataset. The ground truth annotations were provided by radiologists for comparison. Our experimental results show that the proposed pipeline outperformed two state-of-the-art (SOTA) segmentation models on our test dataset, with a mean Dice of 0.90 vs. a mean Dice of 0.73 for Mask R-CNN and 0.72 for U-Net. We also demonstrated that VertXNet is a modular pipeline that can incorporate other SOTA models, such as nnU-Net, to further improve its performance. Furthermore, to evaluate the generalization ability of VertXNet on spinal X-rays, we directly tested the pre-trained pipeline on two additional datasets. A consistently strong performance was observed, with mean Dice coefficients of 0.89 and 0.88, respectively.
In summary, VertXNet demonstrated significantly improved performance in vertebral body segmentation and labeling for spinal X-ray imaging. Its robustness and generalization were presented through the evaluation of both in-house clinical trial data and publicly available datasets.
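For readers unfamiliar with the Dice scores quoted above, the following sketch shows how a Dice coefficient is commonly computed for a pair of binary masks. It is a generic illustration, not the evaluation code used for VertXNet; the function name and the toy masks are hypothetical.

```python
def dice_coefficient(pred_mask, truth_mask):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks.

    Masks are given as nested lists (rows of 0/1 values) of equal shape.
    """
    pred_flat = [bool(v) for row in pred_mask for v in row]
    truth_flat = [bool(v) for row in truth_mask for v in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 true pixels, 2 of them shared.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
score = dice_coefficient(predicted, reference)
```

A score of 1 indicates perfect overlap and 0 indicates none; a mean Dice is typically obtained by averaging such scores over images or over labelled structures.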

Association between health insurance cost-sharing and choice of hospital tier for cardiovascular diseases in China: a prospective cohort study.

BACKGROUND: Hospitals in China are classified into tiers (1, 2 or 3), with the largest (tier 3) having more equipment and specialist staff. Differential health insurance cost-sharing by hospital tier (lower deductibles and higher reimbursement rates in lower tiers) was introduced to reduce overcrowding in higher tier hospitals, promote use of lower tier hospitals, and limit escalating healthcare costs. However, little is known about the effects of differential cost-sharing in health insurance schemes on choice of hospital tiers. METHODS: In a 9-year follow-up of a prospective study of 0.5 M adults from 10 areas in China, we examined the associations between differential health insurance cost-sharing and choice of hospital tiers for patients with a first hospitalisation for stroke or ischaemic heart disease (IHD) in 2009-2017. Analyses were performed separately in urban areas (stroke: n = 20,302; IHD: n = 19,283) and rural areas (stroke: n = 21,130; IHD: n = 17,890), using conditional logit models and adjusting for individual socioeconomic and health characteristics. FINDINGS: About 64-68% of stroke and IHD cases in urban areas and 27-29% in rural areas chose tier 3 hospitals. In urban areas, higher reimbursement rates in each tier and lower tier 3 deductibles were associated with a greater likelihood of choosing their respective hospital tiers. In rural areas, the effects of cost-sharing were modest, suggesting a greater contribution of other factors. Higher socioeconomic status and greater disease severity were associated with a greater likelihood of seeking care in higher tier hospitals in urban and rural areas. INTERPRETATION: Patient choice of hospital tiers for treatment of stroke and IHD in China was influenced by differential cost-sharing in urban areas, but not in rural areas. Further strategies are required to incentivise appropriate health seeking behaviour and promote more efficient hospital use. 
FUNDING: Wellcome Trust, Medical Research Council, British Heart Foundation, Cancer Research UK, Kadoorie Charitable Foundation, China Ministry of Science and Technology, and National Natural Science Foundation of China.

Causal association between snoring and stroke: a Mendelian randomization study in a Chinese population.

Background: Previous observational studies established a positive relationship between snoring and stroke. We aimed to investigate the causal effect of snoring on stroke. Methods: Based on 82,339 unrelated individuals with qualified genotyping data of Asian descent from the China Kadoorie Biobank (CKB), we conducted a Mendelian randomization (MR) analysis of snoring and stroke. Genetic variants identified in the genome-wide association analysis (GWAS) of snoring in CKB and UK Biobank (UKB) were selected for constructing genetic risk scores (GRS). A two-stage method was applied to estimate the associations of the genetically predicted snoring with stroke and its subtypes. Besides, MR analysis among the non-obese group (body mass index, BMI <24.0 kg/m²), as well as multivariable MR (MVMR), were performed to control for potential pleiotropy from BMI. In addition, the inverse-variance weighted (IVW) method was applied to estimate the causal association with genetic variants identified in CKB GWAS. Findings: Positive associations were found between snoring and total stroke, hemorrhagic stroke (HS), and ischemic stroke (IS). With GRS of CKB, the corresponding HRs (95% CIs) were 1.56 (1.15, 2.12), 1.50 (0.84, 2.69), 2.02 (1.36, 3.01), and the corresponding HRs (95% CIs) using GRS of UKB were 1.78 (1.30, 2.43), 1.94 (1.07, 3.52), and 1.74 (1.16, 2.61). The associations remained stable in the MR among the non-obese group, MVMR analysis, and MR analysis using the IVW method. Interpretation: This study suggests that, among Chinese adults, genetically predicted snoring could increase the risk of total stroke, IS, and HS, and the causal effect was independent of BMI. Funding: National Natural Science Foundation of China, Kadoorie Charitable Foundation Hong Kong, UK Wellcome Trust, National Key R&D Program of China, Chinese Ministry of Science and Technology.
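The inverse-variance weighted (IVW) method mentioned in this abstract can be sketched with the standard fixed-effect formula applied to per-variant summary statistics. This is a generic illustration under stated assumptions, not the study's analysis code: the function name is hypothetical, the input values are invented, and the standard error uses the common first-order approximation that ignores uncertainty in the variant-exposure estimates.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    estimate = sum(bx * by / se_y^2) / sum(bx^2 / se_y^2)
    se       = sqrt(1 / sum(bx^2 / se_y^2))
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per variant")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Invented summary statistics for two variants (illustration only).
estimate, standard_error = ivw_estimate(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[0.05, 0.10],
    se_outcome=[0.01, 0.02],
)
```

Each variant contributes a ratio estimate (outcome effect divided by exposure effect), and the IVW estimate is their weighted average with more precisely estimated variants weighted more heavily.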

Prevalence of persistent SARS-CoV-2 in a large community surveillance study.

Persistent SARS-CoV-2 infections may act as viral reservoirs that could seed future outbreaks, give rise to highly divergent lineages and contribute to cases with post-acute COVID-19 sequelae (long COVID). However, the population prevalence of persistent infections, their viral load kinetics and evolutionary dynamics over the course of infections remain largely unknown. Here, using viral sequence data collected as part of a national infection survey, we identified 381 individuals with SARS-CoV-2 RNA at high titre persisting for at least 30 days, of which 54 had viral RNA persisting at least 60 days. We refer to these as 'persistent infections' as available evidence suggests that they represent ongoing viral replication, although the persistence of non-replicating RNA cannot be ruled out in all. Individuals with persistent infection had more than 50% higher odds of self-reporting long COVID than individuals with non-persistent infection. We estimate that 0.1-0.5% of infections may become persistent with typically rebounding high viral loads and last for at least 60 days. In some individuals, we identified many viral amino acid substitutions, indicating periods of strong positive selection, whereas others had no consensus change in the sequences for prolonged periods, consistent with weak selection. Substitutions included mutations that are lineage defining for SARS-CoV-2 variants, at target sites for monoclonal antibodies and/or are commonly found in immunocompromised people. This work has profound implications for understanding and characterizing SARS-CoV-2 infection, epidemiology and evolution.

Assessing the importance of primary care diagnoses in the UK Biobank.

The UK Biobank has made general practitioner (GP) data (censoring date 2016-2017) available for approximately 45% of the cohort, whilst hospital inpatient and death registry (referred to as "HES/Death") data are available cohort-wide through 2018-2022 depending on whether the data comes from England, Wales or Scotland. We assessed the importance of case ascertainment via different data sources in UKB for three diseases that are usually first diagnosed in primary care: Parkinson's disease (PD), type 2 diabetes (T2D), and all-cause dementia. Including GP data at least doubled the number of incident cases in the subset of the cohort with primary care data (e.g. from 619 to 1390 for dementia). Among the 786 dementia cases that were only captured in the GP data before the GP censoring date, only 421 (54%) were subsequently recorded in HES. Therefore, estimates of the absolute incidence or risk-stratified incidence are misleadingly low when based only on the HES/Death data. For incident cases present in both HES/Death and GP data during the full follow-up period (i.e. until the HES censoring date), the median time difference between an incident diagnosis of dementia being recorded in GP and HES/Death was 2.25 years (i.e. recorded 2.25 years earlier in the GP records). Similar lag periods were also observed for PD (median 2.31 years earlier) and T2D (median 2.82 years earlier). For participants with an incident GP diagnosis, only 65.6% of dementia cases, 69.0% of PD cases, and 58.5% of T2D cases had their diagnosis recorded in HES/Death within 7 years since GP diagnosis. The effect estimates (hazard ratios, HR) of established risk factors for the three health outcomes mostly remain in the same direction and with a similar strength of association when cases are ascertained either using HES only or further adding GP data. The confidence intervals of the HR became narrower when adding GP data, due to the increased statistical power from the additional cases. 
In conclusion, it is desirable to extend both the coverage and follow-up period of GP data to allow researchers to maximise case ascertainment of chronic health conditions in the UK.

Lithium response in bipolar disorder is associated with focal adhesion and PI3K-Akt networks: a multi-omics replication study.

Lithium is the gold standard treatment for bipolar disorder (BD). However, its mechanism of action is incompletely understood, and prediction of treatment outcomes is limited. In our previous multi-omics study of the Pharmacogenomics of Bipolar Disorder (PGBD) sample combining transcriptomic and genomic data, we found that focal adhesion, the extracellular matrix (ECM), and PI3K-Akt signaling networks were associated with response to lithium. In this study, we replicated the results of our previous study using network propagation methods in a genome-wide association study of an independent sample of 2039 patients from the International Consortium on Lithium Genetics (ConLiGen) study. We identified functional enrichment in focal adhesion and PI3K-Akt pathways, but we did not find an association with the ECM pathway. Our results suggest that deficits in the neuronal growth cone and PI3K-Akt signaling, but not in ECM proteins, may influence response to lithium in BD.

[Progress and practice of objective measurement of physical behaviors in large-scale cohort research].

Because traditional self-completed questionnaires have limited reliability, measurements of physical behaviors (physical activity, sedentary behavior and sleep) based on them are not very accurate. With the development of technology, wearable devices (e.g. accelerometers) can measure physical behaviors more accurately and have great potential for application in large-scale research. However, objectively measured data on physical behaviors from large-scale cohort research in Asian populations are still limited. Between August 2020 and December 2021, the 3rd resurvey of the China Kadoorie Biobank (CKB) project used the Axivity AX3 wrist-worn triaxial accelerometer to collect data on participants' daily activity and sleep status. A total of 20 370 participants from 10 study areas were included in the study, of whom 65.2% were women, and the age was (65.4±9.1) years. The participants' physical activity level varied greatly across study areas. The objective measurement of participants' physical behaviors in the CKB project has provided valuable resources for describing 24-hour patterns of physical behaviors and for evaluating the health effects of physical activity, sedentary behavior and sleep, as well as their associations with diseases, in the elderly in China.

Characterizing prostate cancer risk through multi-ancestry genome-wide discovery of 187 novel risk variants.

The transferability and clinical value of genetic risk scores (GRSs) across populations remain limited due to an imbalance in genetic studies across ancestrally diverse populations. Here we conducted a multi-ancestry genome-wide association study of 156,319 prostate cancer cases and 788,443 controls of European, African, Asian and Hispanic men, reflecting a 57% increase in the number of non-European cases over previous prostate cancer genome-wide association studies. We identified 187 novel risk variants for prostate cancer, increasing the total number of risk variants to 451. An externally replicated multi-ancestry GRS was associated with risk that ranged from 1.8 (per standard deviation) in African ancestry men to 2.2 in European ancestry men. The GRS was associated with a greater risk of aggressive versus non-aggressive disease in men of African ancestry (P = 0.03). Our study presents novel prostate cancer susceptibility loci and a GRS with effective risk stratification across ancestry groups.

At Breaking Point or Already Broken? The National Health Service in the United Kingdom.

Evaluating approaches for constructing polygenic risk scores for prostate cancer in men of African and European ancestry.

Genome-wide polygenic risk scores (GW-PRSs) have been reported to have better predictive ability than PRSs based on genome-wide significance thresholds across numerous traits. We compared the predictive ability of several GW-PRS approaches to a recently developed PRS of 269 established prostate cancer-risk variants from multi-ancestry GWASs and fine-mapping studies (PRS269). GW-PRS models were trained with a large and diverse prostate cancer GWAS of 107,247 cases and 127,006 controls that we previously used to develop the multi-ancestry PRS269. Resulting models were independently tested in 1,586 cases and 1,047 controls of African ancestry from the California Uganda Study and 8,046 cases and 191,825 controls of European ancestry from the UK Biobank and further validated in 13,643 cases and 210,214 controls of European ancestry and 6,353 cases and 53,362 controls of African ancestry from the Million Veteran Program. In the testing data, the best performing GW-PRS approach had AUCs of 0.656 (95% CI = 0.635-0.677) in African and 0.844 (95% CI = 0.840-0.848) in European ancestry men and corresponding prostate cancer ORs of 1.83 (95% CI = 1.67-2.00) and 2.19 (95% CI = 2.14-2.25), respectively, for each SD unit increase in the GW-PRS. Compared to the GW-PRS, in African and European ancestry men, the PRS269 had larger or similar AUCs (AUC = 0.679, 95% CI = 0.659-0.700 and AUC = 0.845, 95% CI = 0.841-0.849, respectively) and comparable prostate cancer ORs (OR = 2.05, 95% CI = 1.87-2.26 and OR = 2.21, 95% CI = 2.16-2.26, respectively). Findings were similar in the validation studies. This investigation suggests that current GW-PRS approaches may not improve the ability to predict prostate cancer risk compared to the PRS269 developed from multi-ancestry GWASs and fine-mapping.

Assessing the Causal Effect of Blood Pressure on Renal Cancer in the UK Biobank via Mendelian Randomisation

Healthcare resource allocation decisions and non-emergency treatments in the aftermath of Covid-19 pandemic. How should children with chronic illness feature in prioritisation processes?

Background: In the aftermath of the Coronavirus disease 2019 (Covid-19) pandemic, allocation of non-urgent medical interventions is a persistent ethical challenge as health systems currently face an unprecedented backlog of patients requiring treatment. Difficult decisions must be made that prioritise certain patients over others. Ethical resource allocation requires that the needs of all patients are considered properly, but at present there is no guidance that can help support such decision-making which explicitly considers the needs of children with chronic and complex conditions. Methods: This paper reviews the NHS guidance for priorities and operational planning and examines how the needs of children with chronic illness are addressed in NHS objectives for restoring services and meeting elective care demands. Results: The usual criteria for prioritisation featured in the NHS guidance fail to account for the distinct needs of children with chronic illnesses and fail to match more general considerations of what constitutes fair resource allocation decisions. To address this issue, two considerations, namely 'protecting age-related opportunity' and 'recognising complexity of care,' are proposed as additions to the existing approach. Conclusion: By providing a broader conception of needs, these criteria address inefficiencies of the current guidance and relevant ethical frameworks and help to embed a currently missing children-related ethical approach to healthcare policy making in general.

Glaucoma Patients Have a Lower Abundance of Butyrate-Producing Taxa in the Gut.

Glaucoma is an eye disease that is the most common cause of irreversible blindness worldwide. It has been suggested that gut microbiota can produce reactive oxygen species and pro-inflammatory cytokines that may travel from the gastric mucosa to distal sites, for example, the optic nerve head or trabecular meshwork. There is evidence for a gut-eye axis, as microbial dysbiosis has been associated with retinal diseases. We investigated the microbial composition in patients with glaucoma and healthy controls. Moreover, we analyzed the association of the gut microbiome with intraocular pressure (IOP; risk factor of glaucoma) and vertical cup-to-disc ratio (VCDR; quantifying glaucoma severity). The discovery analyses included participants of the Rotterdam Study and the Erasmus Glaucoma Cohort. A total of 225 patients with glaucoma and 1247 age- and sex-matched participants without glaucoma were included in our analyses. Stool samples were used to generate 16S rRNA gene profiles. We assessed associations with 233 genera and species. We used data from the TwinsUK and the Study of Health in Pomerania (SHIP) to replicate our findings. Several butyrate-producing taxa (e.g. Butyrivibrio, Caproiciproducens, Clostridium sensu stricto 1, Coprococcus 1, Ruminococcaceae UCG 007, and Shuttleworthia) were less abundant in people with glaucoma compared to healthy controls. The same taxa were also associated with lower IOP and smaller VCDR. The replication analyses confirmed the findings from the discovery analyses. Large human studies exploring the link between the gut microbiome and glaucoma are lacking. Our results suggest that microbial dysbiosis plays a role in the pathophysiology of glaucoma.
