Aberrant Crypt Foci (ACFs) are considered early markers of colorectal neoplasms, yet their segmentation remains challenging due to limited annotated data and large lesion variability. We propose a novel diffusion-based synthetic augmentation framework to generate realistic ACF images under diverse artefact conditions and shapes. By conditioning a denoising diffusion probabilistic model on multi-class masks (blood, dye, and waste) and depth cues, our method expands both morphological and contextual variability in small ACF datasets. We evaluate performance gains on five segmentation architectures: three convolutional networks (U-Net, U-Net++, and DeepLab V3+) and two models with transformer backbones (TransNetR and PVT-CASCADE). In our experiments, the less complex CNN models achieve the most substantial boosts in test Dice (e.g., +22.1% for U-Net), while DeepLab V3+ sees a +8.4% gain. The transformer-backbone architectures also benefit, with improvements of +0.6% for TransNetR and +2.7% for PVT-CASCADE. Qualitative assessments confirm that the diffusion-generated images replicate the annotated artefacts and reduce the performance drop that typically occurs when training with limited real data. Our results indicate that our framework, CtrlEndoDiff, improves ACF segmentation accuracy by adding visually realistic synthetic samples to the training process.
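
The gains above are reported in test Dice. As a reminder of what that metric measures, the following is a minimal sketch in plain Python. The function name `dice_score` and the nested-list mask representation are illustrative assumptions, not the authors' evaluation code, and the convention of returning 1.0 for two empty masks is one common choice among several.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1 values (assumed format)


def dice_score(pred: Mask, target: Mask) -> float:
    """Return the Dice coefficient 2|P ∩ T| / (|P| + |T|) for two binary masks."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    # Count pixels that are foreground in both masks.
    intersection = sum(
        p & t
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    )
    pred_area = sum(sum(row) for row in pred)
    target_area = sum(sum(row) for row in target)

    # Assumed convention: two empty masks count as a perfect match.
    if pred_area + target_area == 0:
        return 1.0
    return 2.0 * intersection / (pred_area + target_area)


# Toy example: 2 overlapping pixels, 3 foreground pixels in each mask.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
score = dice_score(predicted, reference)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.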

DOI

10.1007/978-3-032-05472-2_31

Type

Chapter

Publication Date

2026-01-01T00:00:00+00:00

Volume

16128 LNCS

Pages

319 - 329

Total pages

10