Intelligent data gathering
Summary
Careful decision making when gathering new data is often as important as the subsequent processing, modelling, and analysis of that data. This theme will be centered around developing and applying principled techniques to aid such decision making, with a particular focus on problems related to clinical trials. We will develop techniques for adaptive design strategies that learn on the fly, leveraging already collected data to optimize the target objective function under system constraints such as those imposed by regulatory restrictions, ethical concerns, privacy requirements, and budget restrictions.
Key areas include:
- Bayesian experimental design (BED): recent advancements in BED [1] have transformed its applicability. One area of this theme will be to exploit these advancements for real problems in clinical trials and beyond, for example by using BED to conduct intelligent adaptive trials and to answer questions such as whether it is better to measure ‘omics in fewer patients more frequently or in more patients less frequently.
- Gathering data with mis-specified models: linking with the “decision analysis under model mis-specification” theme, we will look at how imperfect models can still be usefully exploited to make intelligent data-gathering decisions, without inducing the pathologies that naive usage can incur.
- Enrichment strategies in clinical development plans: using principles from experimental design, we will develop techniques to improve the selection of patients for trials and/or the choice of which patients within a trial should undergo more expensive or invasive assays [2]. We will also develop techniques to ensure that the analysis of any data collected in such trials is sound and accounts for the active selection of patients.
- Active drug selection: we will develop techniques for actively selecting proteins/drugs to prioritize for testing at different stages of the drug development process. For example, by using techniques from Bayesian optimization, experimental design, and/or active learning, we can help decide which molecules are physically tested for key properties, such as binding affinity, at the early stages of drug development.
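To make the BED question above concrete (fewer patients measured more often, or more patients measured less often), here is a minimal sketch under strong simplifying assumptions that are ours and not part of the theme description. It assumes a single scalar treatment effect with a Gaussian prior, a Gaussian per-patient random effect, Gaussian measurement noise, and known variances. Under those assumptions the expected information gain (EIG) about the effect has the closed form 0.5 * log(1 + prior variance / variance of the grand mean). The function names, default parameter values, and cost model (a fixed cost per recruited patient plus a cost per visit) are all hypothetical.

```python
import math


def eig_gaussian(n_patients, n_visits, prior_sd=1.0, patient_sd=1.0, noise_sd=2.0):
    """Expected information gain (in nats) about a scalar effect theta.

    Assumed toy model (hypothetical, for illustration only):
      theta ~ N(0, prior_sd^2)
      b_i   ~ N(0, patient_sd^2)            per-patient random effect
      y_ij  = theta + b_i + eps_ij,  eps_ij ~ N(0, noise_sd^2)
    with n_patients patients and n_visits measurements per patient.
    """
    # Variance of the grand mean of all measurements, given theta.
    var_mean = (patient_sd**2 + noise_sd**2 / n_visits) / n_patients
    # Conjugate Gaussian result: EIG = 0.5 * log(prior_var / posterior_var).
    return 0.5 * math.log1p(prior_sd**2 / var_mean)


def best_design(budget, cost_patient, cost_visit, **model):
    """Enumerate balanced designs that fit the budget; return (EIG, patients, visits)."""
    designs = []
    max_patients = int(budget // (cost_patient + cost_visit))
    for n in range(1, max_patients + 1):
        # Largest number of visits per patient affordable with n patients.
        m = int((budget - n * cost_patient) // (n * cost_visit))
        if m >= 1:
            designs.append((eig_gaussian(n, m, **model), n, m))
    if not designs:
        raise ValueError("budget too small for even one patient with one visit")
    return max(designs)


if __name__ == "__main__":
    eig, n, m = best_design(budget=100, cost_patient=5, cost_visit=1)
    print(f"best design: {n} patients x {m} visits, EIG = {eig:.3f} nats")
```

In this toy model the trade-off is driven entirely by the ratio of between-patient to within-patient variance and by the relative cost of recruitment versus measurement. Realistic trial models rarely admit a closed-form EIG, which is where the estimators surveyed in [1] become relevant.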
References
[1] Rainforth, T., Foster, A., Ivanova, D. R., & Bickford Smith, F. (2024). Modern Bayesian experimental design. Statistical Science, 39(1), 100-114.
[2] Ahuja, V., & Birge, J. R. (2020). An approximation approach for response-adaptive clinical trial design. INFORMS Journal on Computing, 32(4), 877-894.
[3] Notin, P., Kollasch, A., Ritter, D., Van Niekerk, L., and others (2024). ProteinGym: Large-scale benchmarks for protein fitness prediction and design. NeurIPS.