ARTEFACT DETECTION AND SEGMENTATION BASED ON A DEEP LEARNING SYSTEM

Xiaohong (Sharon) Gao(1), Barbara Braden(2)
(1) Department of Computer Science, Middlesex University, London, NW4 4BT, UK
(2) John Radcliffe Hospital, University of Oxford, Oxford, UK

Copyright (c) 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

ABSTRACT

This paper presents the results of detecting and segmenting artefacts in endoscopic video frames for the EAD2020 competition. In this competition, a deep learning based system built upon RetinaNet is applied. Since RetinaNet is a one-stage method that does not produce masks of segmented objects, the system, inspired by work on real-time instance segmentation, accomplishes object segmentation through two parallel branches that generate a set of prototype masks and predict per-object mask coefficients, respectively. Overall, a top-7 position (out of 32 entries) was achieved on the competition leaderboard.

1. METHODS

Figure 1 illustrates the network system applied in this competition [1, 2, 3, 4, 5, 6, 7, 8, 9], built upon RetinaNet. It accomplishes object segmentation through two parallel branches (prototype and prediction coefficient), which generate a set of prototype masks and predict per-object mask coefficients, respectively. The backbone model of ResNet-101 is applied for all three tasks.

Fig. 1. The network applied in the competition.

2. RESULTS

Table 1 presents the final results submitted from this work, while Table 2 gives the mAP values for segmentation at IoU thresholds of 0.50, 0.70 and 0.95; the overall classification accuracy was 63%. Although the results rank within the top 7, further enhancement is needed to improve the robustness of this model.

Model parameters                 Score            Rank
Epoch = 412, Threshold = 0.17    0.2020 / 0.0768  18
Epoch = 226, Threshold = 0.13    0.2205 / 0.0843  7

Table 1. Leaderboard scores for the two submissions put in to the EAD2020 competition.

                 mAP at IoU thresholds
Region type      0.50     0.70     0.95     All
box              77.85    61.64    0.99     47.42
mask             76.91    61.51    0.87     46.17

Table 2. The mAP values for segmentation.

3. REFERENCES

[1] Sharib Ali, Felix Zhou, Christian Daul, Barbara Braden, Adam Bailey, Stefano Realdon, James East, Georges Wagnieres, Victor Loschenov, Enrico Grisan, et al. Endoscopy artifact detection (EAD 2019) challenge dataset. arXiv preprint arXiv:1905.03209, 2019.
[2] Sharib Ali, Felix Zhou, Adam Bailey, Barbara Braden, James East, Xin Lu, and Jens Rittscher. A deep learning framework for quality assessment and restoration in video endoscopy. arXiv preprint arXiv:1904.07073, 2019.
[3] Sharib Ali, Noha Ghatwary, Barbara Braden, Dominique Lamarque, Adam Bailey, Stefano Realdon, Renato Cannizzaro, Jens Rittscher, Christian Daul, and James East. Endoscopy disease detection challenge 2020. arXiv preprint arXiv:2003.03376, 2020.
[4] Sharib Ali, Felix Zhou, Barbara Braden, Adam Bailey, Suhui Yang, Guanju Cheng, Pengyi Zhang, Xiaoqiong Li, Maxime Kayser, Roger D. Soberanis-Mukul, Shadi Albarqouni, Xiaokang Wang, Chunqing Wang, Seiryo Watanabe, Ilkay Oksuz, Qingtian Ning, Shufan Yang, Mohammad Azam Khan, Xiaohong W. Gao, Stefano Realdon, Maxim Loshchenov, Julia A. Schnabel, James E. East, Georges Wagnieres, Victor B. Loschenov, Enrico Grisan, Christian Daul, Walter Blondel, and Jens Rittscher. An objective comparison of detection and segmentation algorithms for artefacts in clinical endoscopy. Scientific Reports, 10, 2020.
[5] T. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar. Focal loss for dense object detection. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2999-3007, 2017.
[6] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 580-587, USA, 2014. IEEE Computer Society.
[7] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott E. Reed, Cheng-Yang Fu, and Alexander C. Berg. SSD: Single shot multibox detector. In ECCV (1), volume 9905 of Lecture Notes in Computer Science, pages 21-37. Springer, 2016.
[8] Joseph Redmon, Santosh Kumar Divvala, Ross B. Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 779-788, 2016.
[9] Daniel Bolya, Chong Zhou, Fanyi Xiao, and Yong Jae Lee. YOLACT: Real-time instance segmentation. In ICCV, 2019.
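To make the two-branch design described in Section 1 concrete, the sketch below assembles YOLACT-style [9] instance masks from a set of prototype masks and per-object mask coefficients. The function name, array shapes and toy values are illustrative assumptions, not the competition code:

```python
import numpy as np

def assemble_masks(prototypes, coefficients, threshold=0.5):
    """Combine prototype masks with per-object coefficients (YOLACT-style).

    prototypes:   (H, W, k) array of k prototype masks from the prototype branch.
    coefficients: (n, k) array of mask coefficients, one row per detected object.
    Returns an (n, H, W) boolean array of instance masks.
    """
    # Linear combination of prototypes for every detection, then a sigmoid.
    logits = np.einsum("hwk,nk->nhw", prototypes, coefficients)
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs > threshold

# Toy example: 2 prototypes on a 4x4 grid, 1 detected object.
protos = np.zeros((4, 4, 2))
protos[:2, :, 0] = 5.0   # first prototype activates the top half
protos[2:, :, 1] = 5.0   # second prototype activates the bottom half
coeffs = np.array([[1.0, -1.0]])  # keep the top half, suppress the bottom
mask = assemble_masks(protos, coeffs)  # mask is True only on the top half
```

Negative coefficients let a detection subtract a prototype, which is what allows a small shared set of prototypes to express many different instance shapes.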
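Table 2 reports mAP at IoU thresholds of 0.50, 0.70 and 0.95. As a reminder of the overlap measure underlying those thresholds, a minimal box-IoU computation is sketched below; the function name and the (x1, y1, x2, y2) corner convention are assumptions for illustration, not the evaluation code used by the challenge:

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle; width/height clamp to 0 if disjoint.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 2x2 boxes overlapping in a 1x1 region: IoU = 1 / (4 + 4 - 1) = 1/7.
iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A detection counts as a true positive at a given threshold only when its IoU with a ground-truth box reaches that threshold, which is why mAP drops sharply between IoU 0.70 and 0.95 in Table 2.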