Access to large, labelled sets of relevant images is crucial for developing and testing biomedical imaging algorithms. Images published in biomedical research articles could go some way towards meeting this need. However, this approach depends critically on being able to identify the most relevant images within very large sets of potentially useful figures. In this paper, a deep convolutional neural network (CNN) classifier is trained using only synthetic data to rapidly and accurately label raw images taken from biomedical articles. We apply this method to the detection of faces in biomedical images and show that the classifier retrieves figures containing faces with an average precision of 94.8% from a dataset of over 31,000 images taken from articles held in the PubMed database. The utility of the classifier is then demonstrated through a case study in which it aids the mining of photographs of patients with rare genetic disorders from targeted articles. The approach is readily adaptable to the retrieval of other categories of biomedical images.
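
For readers unfamiliar with the retrieval metric quoted above, the following is a minimal sketch of how average precision is commonly computed for a ranked list of figures. It assumes the standard information-retrieval definition (the mean of precision at each rank that holds a relevant item); the abstract does not state which exact variant the authors used. The function name `average_precision` and the toy scores and labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision of a ranking induced by classifier scores.

    scores: classifier confidence that each figure contains a face.
    labels: 1 if the figure really contains a face, else 0.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # Rank figures from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            # Precision at this rank: relevant figures so far / figures so far.
            precision_sum += hits / rank

    # Convention assumed here: a ranking with no relevant figures scores 0.
    return precision_sum / hits if hits else 0.0


# Hypothetical toy data: five figures, three of which contain faces.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 1, 0]
ap = average_precision(toy_scores, toy_labels)
print(f"average precision = {ap:.3f}")
```

A score close to 1 means that figures containing faces are concentrated at the top of the ranking, which is the property that matters when mining a large collection of figures for a small relevant subset.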

Original publication

Conference paper

Pages

562 - 567