Breast cancer is one of the cancers with the highest incidence in women. Early detection, mostly achieved through breast screening programmes using mammographic X-ray imaging, is therefore very important for a good prognosis. Computer Aided Detection (CADe) and Diagnosis (CADx) tools have been proposed to help radiologists in this task, with a potential impact on minimising reader variability and optimising reading times, sensitivity and radiologists’ workflow. Supervised deep learning methods are currently the dominant technology powering CADe tools, but they have a limiting property: they need large amounts of labelled data at training time. This requirement is especially restrictive for medical datasets, whose annotations are costly and time-consuming to produce. The aim of this work is to propose a Deep Convolutional Generative Adversarial Network (DCGAN) to generate synthetic breast mass lesions for data augmentation and to analyse the effect of their inclusion when training a breast mass detection system. In a first step, DCGANs are trained on subsets of mammographic data of increasing size and used to generate diverse and realistic mammographic lesions of size 128x128. Subsequently, the effect of including the generated images and/or applying horizontal and vertical flipping is analysed in a CADe mass detection framework (based on a fully convolutional neural network) built to discriminate between normal tissue and lesion patches. A 1-to-10 imbalanced dataset is analysed with respect to the training size. The best results are obtained using a combination of conventional flipping and DCGANs, with a 9% improvement of the F1 score compared to no data augmentation. Qualitative results also show that the synthetic images are of high quality and visually comparable to real mammographic lesions. In summary, we show that DCGANs can be used to synthesise realistic and considerably diverse mammographic mass patches, with a positive impact on training CADe tools.
Presenter: Dr. Robert Marti.
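
As a rough illustration of the conventional augmentation and the evaluation metric mentioned in the abstract, the sketch below flips a patch horizontally and vertically and computes the F1 score for the lesion class. It is not the authors' implementation: the function names, the toy 2x2 patch (standing in for a 128x128 patch), the predictions, and the reading of the 1-to-10 imbalance as one lesion patch per ten normal patches are all assumptions made for illustration. Plain Python lists are used to keep the example dependency-free; a real pipeline would more likely use an array library.

```python
def flip_horizontal(patch):
    """Mirror a 2D patch (a list of rows) left to right."""
    return [row[::-1] for row in patch]


def flip_vertical(patch):
    """Mirror a 2D patch top to bottom."""
    return [row[:] for row in patch[::-1]]


def augment_with_flips(patches):
    """Return each original patch followed by its two flipped copies."""
    augmented = []
    for patch in patches:
        augmented.extend([patch, flip_horizontal(patch), flip_vertical(patch)])
    return augmented


def f1_score(y_true, y_pred):
    """F1 score for the positive class (assumed here: lesion = 1, normal = 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy 2x2 "patch" standing in for a 128x128 lesion patch (hypothetical data).
lesion = [[1, 2],
          [3, 4]]
augmented = augment_with_flips([lesion])
print(len(augmented))  # 3: the original plus two flipped copies

# Assumed 1-to-10 imbalance: 1 lesion patch, 10 normal patches.
y_true = [1] + [0] * 10
y_pred = [1, 1] + [0] * 9  # hypothetical predictions with one false positive
print(f1_score(y_true, y_pred))
```

F1 is a natural choice under this kind of class imbalance because it ignores true negatives, so a detector cannot score well simply by labelling every patch as normal tissue.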