DEEP LEARNING BASED APPROACH FOR DETECTING DISEASES IN ENDOSCOPY

Vishnusai Y^1,*, Prithvi Prakash^1,*, Nithin Shivashankar^1
^1 Mimyk Medical Simulations Pvt. Ltd        * Authors contributed equally

Copyright (c) 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT

In this paper, we discuss our submissions for the Endoscopic Disease Detection Challenge (EDD2020) [1], which had two sub-challenges. The first task involved bounding-box based multi-class detection of diseases, namely Polyp, Barrett's Esophagus (BE), Cancer, Suspicious and High-Grade Dysplasia (HGD). The second task involved creating semantic masks of the images for the aforementioned classes of diseases. For the disease detection task we submitted the predictions of a Faster R-CNN with a ResNeXt-101 backbone and achieved a dscore of 0.1335±0.0936. For the semantic segmentation task, we employed a U-NET with a ResNeXt-50 backbone that achieved an sscore of 0.5031.

[Fig. 1. Sample results on the test dataset for the disease detection and semantic segmentation tasks.]

1. METHOD

1.1. Disease Detection Task

For the disease detection task we used a Faster R-CNN [2] object detector with a ResNeXt-101 backbone. Before feeding the data into the network, we applied augmentation techniques based on RandAugment [3] to improve its generalization capability. From a pool of 16 augmentation techniques, two transformations were selected at random for each image. We observed that magnitudes of 4, 5 and 6 produced the most effective augmentations, so this range was used. The images were resized to 1300x800 pixels, and the Faster R-CNN model was trained for 10 epochs with a learning rate of 0.01 (a hedged code sketch of this pipeline is given after Section 2).

1.2. Semantic Segmentation Task

The U-NET [4] architecture was used for the semantic segmentation task. Five separate U-NET models were trained, one to segment each disease. Before feeding the data to each U-NET, the images and masks were scaled to 256x256 pixels. The data was then split so that a proportionate sample of the true classes was present in both the training and validation sets: the images were grouped with the K-Means clustering algorithm and an 80-20 split was sampled from each bucket, with the number of buckets chosen using the elbow method (see the sketch after Section 2). We then applied flip, zoom and rotate augmentations to the training images and trained each U-NET, which used a ResNeXt-50 backbone, for 150 epochs.

2. RESULTS AND CONCLUSION

Disease Detection Task
Sl No   Model                   mAP
1       ResNet-101              0.1724
2       ResNeXt-101             0.2235

Semantic Segmentation Task
Sl No   Model                   Train IoU   Val IoU
1       Single Model            0.381       0.121
2       BE Model                0.871       0.542
3       Cancer Model            0.782       0.217
4       HGD Model               0.814       0.313
5       Polyp Model             0.932       0.571
6       Suspicious Model        0.434       0.115
7       Aggregate Model (2-6)   0.766       0.351

Table 1. Mean average precision (mAP) on the test data for the disease detection task, and intersection over union (IoU) on the training and validation data for the semantic segmentation task.

The results of the disease detection and segmentation tasks are summarised in Table 1. For the disease detection task, the ResNeXt-101 backbone outperformed the ResNet-101; on submission we obtained a dscore of 0.1335±0.0936. For the semantic segmentation task, the individual disease models performed better than a single model trained for all diseases. This prompted us to adopt an aggregate model that combines the results of the individual disease models; submitting the predictions of this aggregate model on the test dataset yielded an sscore of 0.5031.
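The following is a minimal sketch of the detection pipeline described in Section 1.1, assuming torchvision >= 0.13 with a PyTorch backend. The class-name strings, optimizer momentum, data loader format and training loop are illustrative assumptions rather than the exact code behind our submission; only the backbone choice, the RandAugment settings, the image size, the epoch count and the learning rate come from Section 1.1.

import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone
from torchvision.transforms import RandAugment

# EDD2020 disease classes; the exact label strings are assumptions.
CLASSES = ["BE", "suspicious", "HGD", "cancer", "polyp"]

# RandAugment: two transforms drawn at random per image; a fixed magnitude
# of 5 stands in for the 4-6 range reported in Section 1.1. This would be
# applied inside the Dataset, before batching (not shown here).
augment = RandAugment(num_ops=2, magnitude=5)

# Faster R-CNN with a ResNeXt-101 FPN backbone. min_size/max_size make
# torchvision resize inputs to roughly 1300x800 pixels.
backbone = resnet_fpn_backbone(backbone_name="resnext101_32x8d", weights=None)
model = FasterRCNN(backbone,
                   num_classes=len(CLASSES) + 1,  # +1 for background
                   min_size=800, max_size=1300)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def train(model, loader, epochs=10, device="cuda"):
    # loader is assumed to yield (images, targets) in the standard
    # torchvision detection format (lists of image tensors and target dicts).
    model.to(device).train()
    for _ in range(epochs):
        for images, targets in loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss = sum(model(images, targets).values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

ImageNet-pretrained weights for the ResNeXt-101 backbone could be supplied through the weights argument of resnet_fpn_backbone; the sketch leaves them out since the paper does not state whether backbone pretraining was used.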
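The cluster-then-split procedure of Section 1.2 can be sketched as follows, assuming NumPy and scikit-learn. The clustering feature (the fraction of positive pixels per disease class in each 256x256 mask) and the function name cluster_split are assumptions made for illustration; the paper only states that K-Means buckets were formed and an 80-20 split was sampled from each bucket.

import numpy as np
from sklearn.cluster import KMeans

def cluster_split(masks, n_clusters, val_frac=0.2, seed=0):
    # masks: binary array of shape (N, num_classes, 256, 256).
    # Returns train and validation index arrays with an ~80-20 split drawn
    # from every K-Means bucket, so that both sets keep a proportionate
    # sample of the true classes.
    rng = np.random.default_rng(seed)
    # One feature vector per image: fraction of pixels in each class.
    features = masks.reshape(len(masks), masks.shape[1], -1).mean(axis=2)
    buckets = KMeans(n_clusters=n_clusters, n_init=10,
                     random_state=seed).fit_predict(features)
    train_idx, val_idx = [], []
    for c in range(n_clusters):
        idx = np.where(buckets == c)[0]
        rng.shuffle(idx)
        if len(idx) < 2:            # tiny bucket: keep it all in training
            train_idx.extend(idx)
            continue
        n_val = max(1, int(round(val_frac * len(idx))))
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return np.array(train_idx), np.array(val_idx)

# The number of buckets would be chosen with the elbow method, e.g. by
# inspecting KMeans(n_clusters=k).fit(features).inertia_ over a range of k.

Splitting inside each bucket, rather than over the whole dataset at once, is what keeps the class mix of the validation set close to that of the training set, which is the proportionate-sampling goal stated in Section 1.2.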
3. REFERENCES

[1] Sharib Ali, Noha Ghatwary, Barbara Braden, Dominique Lamarque, Adam Bailey, Stefano Realdon, Renato Cannizzaro, Jens Rittscher, Christian Daul, and James East. Endoscopy disease detection challenge 2020. arXiv preprint arXiv:2003.03376, 2020.

[2] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1, NIPS'15, pages 91-99, Cambridge, MA, USA, 2015. MIT Press.

[3] Ekin D. Cubuk, Barret Zoph, Jonathan Shlens, and Quoc V. Le. RandAugment: Practical automated data augmentation with a reduced search space. arXiv preprint arXiv:1909.13719, 2019.

[4] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), volume 9351 of LNCS, pages 234-241, 2015.