Graph learning and complex network analysis of clinical symptoms related to severe schistosomiasis
Project summary
An estimated 250 million people worldwide have schistosome infections, and a total of 700 million people are at risk of infection. Over 90% of these individuals live in sub-Saharan Africa. Parasitic blood flukes (‘worms’) cause the infection and associated disease. Transmission is closely linked to extreme poverty, as endemic areas often lack access to safe water and adequate sanitation. If left untreated, chronic infections such as those caused by the species Schistosoma mansoni can cause liver fibrosis, diarrhoea, gastrointestinal haemorrhage, anaemia, malnutrition, and portal hypertension, as well as other impairments such as reduced cognitive development, educational attainment, and work productivity.
Getting people to health centres and ensuring the availability of appropriate diagnostics, medical care, and continued follow-up remains challenging in resource-limited settings. Key ‘signals’ for individuals and health workers are clinical symptoms such as upper gastrointestinal tract bleeding and abdominal pain. It is difficult to untangle symptoms related to schistosome infections from those with other causes that exist in the same geographical areas, such as malaria, hepatitis B, and HIV. Despite the difficulty in associating symptoms with infections, it may be possible to associate sets of symptoms with disease states to serve as a triage or screening tool that lessens the burden on overstrained health systems in poor rural areas. The open question is which sets of symptoms are indicative of which disease states.
Complex network analyses seek to simplify real-world relationships into a set of nodes and edges. The focus of this project is to discover the symptomatic network structure relevant to different schistosomiasis disease states. This problem involves testing different graph generation methods, exploring ways of choosing edge thresholds to determine the density of the graph, and exploring the statistical properties of the networks through semi-supervised or unsupervised clustering algorithms.
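As a rough illustration of the steps described above, the sketch below builds a symptom co-occurrence graph from binary records, applies an edge threshold, and reports the resulting graph density and symptom groupings. Everything in it is an assumption made for illustration: the records are made-up toy data, the symptom names are hypothetical labels, Jaccard similarity stands in for whichever graph generation method the project adopts, and connected components stand in for a proper clustering algorithm.

```python
from itertools import combinations

# Toy, made-up binary records (1 = symptom reported); not real study data.
# Symptom names are hypothetical labels chosen for illustration.
SYMPTOMS = ["abdominal_pain", "diarrhoea", "haematemesis", "fatigue", "pallor"]
RECORDS = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
]


def jaccard(i, j, records):
    """Jaccard similarity between symptoms i and j across all records."""
    both = sum(1 for r in records if r[i] and r[j])
    either = sum(1 for r in records if r[i] or r[j])
    return both / either if either else 0.0


def build_graph(records, n_symptoms, threshold):
    """Keep an undirected edge wherever similarity meets the threshold."""
    adjacency = {i: set() for i in range(n_symptoms)}
    for i, j in combinations(range(n_symptoms), 2):
        if jaccard(i, j, records) >= threshold:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def density(adjacency):
    """Fraction of possible undirected edges that are present."""
    n = len(adjacency)
    edges = sum(len(nb) for nb in adjacency.values()) / 2
    return edges / (n * (n - 1) / 2) if n > 1 else 0.0


def connected_components(adjacency):
    """Group nodes by reachability; a crude stand-in for clustering."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adjacency[node] - comp)
        seen |= comp
        components.append(comp)
    return components


# Sweep the edge threshold to see how density and groupings change.
for threshold in (0.2, 0.5, 0.9):
    graph = build_graph(RECORDS, len(SYMPTOMS), threshold)
    groups = [sorted(SYMPTOMS[i] for i in c) for c in connected_components(graph)]
    print(threshold, round(density(graph), 2), groups)
```

Raising the threshold removes weaker edges, so the graph becomes sparser and splits into more, smaller groups. In practice the similarity measure, the threshold rule, and the clustering method would each be among the choices the project sets out to compare.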
During this internship, you will gain experience with machine learning techniques and applied statistical models. You will also undertake a short literature review of complex network construction for clinical symptom data and learn the foundations of schistosomiasis epidemiology. You will be embedded in the interdisciplinary SchistoTrack Group at the Big Data Institute.
Timescale
12 weeks. Start date no later than 1 July 2024.
Day-to-day supervision
Dr Yin-Cong Zhi
Suitability
This project is suitable for candidates with a strong statistical or programming background from engineering, computer science, statistics, applied mathematics, or a related discipline. Candidates should have prior experience writing scripts in either R or Python.
This project can be tailored to be more focused on machine learning or applied statistics depending on the skillset of the selected candidate.