Algorithm validation for data science
Professor Joachim M. Buhmann, Department of Computer Science, ETH Zurich, Switzerland
Wednesday, 21 September 2022, 11am to 12pm
Richard Doll Lecture Theatre, Richard Doll Building, Old Road Campus, Headington, OX3 7LF
Abstract
Data Science (DS) algorithms interpret outcomes of empirical experiments with random influences. Often, such algorithms are cascaded into long processing pipelines, especially in biomedical applications. The validation of such pipelines poses an open question, since data compression of the input should preserve as much information as possible to distinguish between possible outputs. Starting with a minimum description length argument for model selection, we motivate a localization criterion as a lower bound that achieves information-theoretic optimality. Uncertainty in the input causes a rate-distortion tradeoff in the output when the DS algorithm is adapted by learning. We present design choices for algorithm selection and sketch a theory of validation. The concept is demonstrated in neuroscience applications of diffusion tensor imaging for tractography and brain parcellation.
Biography
Joachim M. Buhmann is a Professor of Computer Science at ETH Zurich. He studied physics at TU Munich and performed postdoctoral research at USC and LLNL in California. Until 2003 he was a Professor of Applied Computer Science at the University of Bonn. His teaching and research include machine learning in theory and applications, for example in the life sciences. He is a member of the Swiss Academy of Engineering Sciences (SATW) and a Fellow of the IAPR, and he serves as a research council member of the Swiss National Science Foundation.