Deep learning for detection and segmentation of artefact and disease instances in gastrointestinal endoscopy.
Ali S., Dmitrieva M., Ghatwary N., Bano S., Polat G., Temizel A., Krenzer A., Hekalo A., Guo YB., Matuszewski B., Gridach M., Voiculescu I., Yoganand V., Chavan A., Raj A., Nguyen NT., Tran DQ., Huynh LD., Boutry N., Rezvy S., Chen H., Choi YH., Subramanian A., Balasubramanian V., Gao XW., Hu H., Liao Y., Stoyanov D., Daul C., Realdon S., Cannizzaro R., Lamarque D., Tran-Nguyen T., Bailey A., Braden B., East JE., Rittscher J.
The Endoscopy Computer Vision Challenge (EndoCV) is a crowd-sourcing initiative to address pressing problems in developing reliable computer-aided detection and diagnosis systems for endoscopy, and to suggest a pathway for clinical translation of these technologies. Whilst endoscopy is a widely used diagnostic and treatment tool for hollow organs, endoscopists often face several core challenges, mainly: 1) the presence of multi-class artefacts that hinder visual interpretation, and 2) difficulty in identifying subtle precancerous precursors and cancer abnormalities. Artefacts often affect the robustness of deep learning methods applied to the organs of the gastrointestinal tract, as they can be confused with the tissue of interest. The EndoCV2020 challenges are designed to address research questions in these remits. In this paper, we present a summary of the methods developed by the top 17 teams and provide an objective comparison of state-of-the-art methods and methods designed by the participants for two sub-challenges: i) artefact detection and segmentation (EAD2020), and ii) disease detection and segmentation (EDD2020). Multi-center, multi-organ, multi-class, and multi-modal clinical endoscopy datasets were compiled for both the EAD2020 and EDD2020 sub-challenges. The out-of-sample generalization ability of the detection algorithms was also evaluated. Whilst most teams focused on accuracy improvements, only a few methods are credible candidates for clinical use. The best-performing teams provided solutions to tackle class imbalance and variability in size, origin, modality, and occurrence by exploring data augmentation, data fusion, and optimal class thresholding techniques.
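As an illustration of what per-class thresholding can look like, the sketch below picks a separate confidence threshold for each class by maximizing F1 on held-out validation scores. It is a minimal sketch under stated assumptions, not the procedure used by any participating team: the function names `f1_score` and `best_thresholds`, the F1 selection criterion, and the threshold grid are all hypothetical choices made for this example.

```python
import numpy as np


def f1_score(y_true, y_pred):
    """Binary F1 from 0/1 arrays; returns 0.0 when there are no positives at all."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def best_thresholds(scores, labels, grid=None):
    """Pick one confidence threshold per class (hypothetical helper).

    scores: (n_samples, n_classes) array of confidences in [0, 1].
    labels: (n_samples, n_classes) array of 0/1 ground truth.
    grid:   candidate thresholds; the default grid is an assumption.
    """
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)
    thresholds = np.zeros(scores.shape[1])
    for c in range(scores.shape[1]):
        # Evaluate F1 at every candidate threshold for this class only.
        f1s = [f1_score(labels[:, c], (scores[:, c] >= t).astype(int)) for t in grid]
        # np.argmax returns the first maximum, so ties go to the lowest threshold.
        thresholds[c] = grid[int(np.argmax(f1s))]
    return thresholds
```

A class whose confidences are systematically low, as can happen for a rare class, ends up with a lower threshold than a well-represented one, so a single global cut-off does not suppress it.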