Machine-learned models for predicting treatment efficacy against schistosomiasis in sub-Saharan Africa
An estimated 250 million people worldwide have schistosome infections, and a total of 700 million people are at risk of infection. Over 90% of these individuals live in sub-Saharan Africa. Parasitic blood flukes (‘worms’) cause the infection and associated disease. Transmission is closely linked to extreme poverty, as endemic areas often lack access to safe water and adequate sanitation. If left untreated, chronic infections such as those caused by the species Schistosoma mansoni can cause liver fibrosis, portal hypertension, gastrointestinal haemorrhage, diarrhoea, anaemia, and malnutrition, as well as other impairments such as reduced cognitive development, educational attainment, and work productivity.
Praziquantel is the only available treatment for schistosomiasis. The medicine works by targeting mature flukes and mature eggs. Immature flukes and immature eggs can survive treatment, and treatment is not protective, as individuals are rapidly reinfected. Drug efficacy is also imperfect: only ~76% of individuals infected with S. mansoni are temporarily cured. Yet the determinants of praziquantel failure, and of the variation in treatment outcomes between individuals, are poorly understood. Failure is defined as a cure rate or egg reduction rate below what would normally be expected for medicine efficacy. Failure may be due to drug resistance, individual biosocial factors, or environmental conditions that affect parasite virulence.
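To make the two efficacy measures concrete, the sketch below computes a cure rate and an egg reduction rate from paired pre- and post-treatment egg counts. It assumes commonly used definitions that this description does not spell out: cure rate as the share of people egg-positive before treatment who are egg-negative afterwards, and egg reduction rate as one minus the ratio of arithmetic mean egg counts after and before treatment. The egg counts and the EXPECTED_ERR threshold are hypothetical placeholders, not SchistoTrack data or a project-defined cut-off.

```python
# Hypothetical egg counts (eggs per gram of stool) for five people.
# These are illustrative values only, not study data.
pre_counts = [120, 48, 0, 240, 24]
post_counts = [0, 0, 0, 60, 12]

# Hypothetical threshold for "expected" efficacy, used only for illustration.
EXPECTED_ERR = 0.90


def infected_pairs(pre, post):
    """Keep only people who were egg-positive before treatment."""
    pairs = [(a, b) for a, b in zip(pre, post) if a > 0]
    if not pairs:
        raise ValueError("no one was egg-positive before treatment")
    return pairs


def cure_rate(pre, post):
    """Share of baseline-positive people who are egg-negative after treatment."""
    pairs = infected_pairs(pre, post)
    cured = sum(1 for _, after in pairs if after == 0)
    return cured / len(pairs)


def egg_reduction_rate(pre, post):
    """One minus the ratio of arithmetic mean egg counts (post / pre)."""
    pairs = infected_pairs(pre, post)
    mean_pre = sum(before for before, _ in pairs) / len(pairs)
    mean_post = sum(after for _, after in pairs) / len(pairs)
    return 1 - mean_post / mean_pre


cr = cure_rate(pre_counts, post_counts)
err = egg_reduction_rate(pre_counts, post_counts)
print(f"cure rate: {cr:.1%}, egg reduction rate: {err:.1%}")
print("below expected efficacy:", err < EXPECTED_ERR)
```

Restricting both measures to people who were egg-positive at baseline is a design choice made here so that uninfected participants do not inflate the apparent efficacy.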
This project aims to understand the patterns of praziquantel cure rates and egg reduction rates, as well as the predictors of medicine failure, in a rural Ugandan population with at least a decade of praziquantel treatment. This is a unique opportunity to work with a dataset that is rare in type and scale for a low-income setting. The project will use data from rural Uganda collected as part of the SchistoTrack cohort led by the PI, Dr Goylette Chami. Nearly 4000 participants were examined pre- and post-treatment, with a wide range of ecological, behavioural, social, and biological factors recorded. The objectives of this internship will be to:
1) Characterise the patterns of infection clearance by age using machine-learned smoothing techniques.
2) Rank the importance of features for predicting medicine failure using machine learning and compare different machine-learned models.
3) Compare machine-learned models of variable selection/importance to applied statistical models.
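As a rough illustration of the first objective, the sketch below smooths a binary clearance outcome over age. The description does not name a specific smoother, so a Gaussian kernel (Nadaraya-Watson) smoother is used here purely as a simple stand-in; the ages, outcomes, bandwidth, and the function name kernel_smooth are all hypothetical.

```python
import math

# Hypothetical data: age in years and whether infection cleared (1) or not (0).
# Illustrative values only, not study data.
ages = [6, 8, 11, 14, 19, 25, 32, 40, 47, 55]
cleared = [0, 1, 0, 1, 1, 1, 0, 1, 1, 1]


def kernel_smooth(x, y, grid, bandwidth=5.0):
    """Nadaraya-Watson estimate of the mean of y at each grid point.

    Each observation is weighted by a Gaussian kernel of its distance
    from the grid point, so nearby ages contribute most.
    """
    smoothed = []
    for g in grid:
        weights = [math.exp(-0.5 * ((xi - g) / bandwidth) ** 2) for xi in x]
        total = sum(weights)
        smoothed.append(sum(w * yi for w, yi in zip(weights, y)) / total)
    return smoothed


grid = list(range(5, 60, 5))
curve = kernel_smooth(ages, cleared, grid)
for age, p in zip(grid, curve):
    print(f"age {age:2d}: estimated clearance proportion {p:.2f}")
```

A larger bandwidth gives a flatter curve and a smaller one follows the data more closely; in practice the bandwidth would be chosen by cross-validation rather than fixed by hand as it is here.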
During this internship, you will gain experience in R programming, common machine learning techniques, and applied statistical models for infectious diseases. You will also undertake background reading to understand schistosomiasis epidemiology.
12 weeks, June-August, with exact timing to be confirmed with the candidate
This project is suitable for candidates with a strong statistical or programming background from applied mathematics, engineering, statistics, computer science, or a related discipline. Candidates should have prior experience writing scripts in R, Stata, or Python.