=Paper=
{{Paper
|id=Vol-2595/endoCV2020_Choi_et_al
|storemode=property
|title=Centernet-based Detection Model And U-net-based Multi-class Segmentation Model For Gastrointestinal Diseases
|pdfUrl=https://ceur-ws.org/Vol-2595/endoCV2020_paper_id_32.pdf
|volume=Vol-2595
|authors=Yoon Ho Choi,Yeong Chan Lee,Sanghoon Hong,Junyoung Kim,Hong-Hee Won,Taejun Kim
|dblpUrl=https://dblp.org/rec/conf/isbi/ChoiLHKWK20
}}
==Centernet-based Detection Model And U-net-based Multi-class Segmentation Model For Gastrointestinal Diseases==
CENTERNET-BASED DETECTION MODEL AND U-NET-BASED MULTI-CLASS SEGMENTATION MODEL FOR GASTROINTESTINAL DISEASES

Yoon Ho Choi^1, Yeong Chan Lee^2, Sanghoon Hong^2, Junyoung Kim^3, Hong-Hee Won^2†, Taejun Kim^2,3†

^1 Dept. of Health Sciences & Tech., Samsung Advanced Institute for Health Sciences & Tech. (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
^2 Dept. of Digital Health, Samsung Advanced Institute for Health Sciences and Tech. (SAIHST), Sungkyunkwan University, Samsung Medical Center, Seoul, Republic of Korea
^3 Dept. of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea

ABSTRACT

From the perspective of a computer-aided diagnosis system, it is important to build automated techniques that detect and diagnose lesions to reduce the miss rate of clinicians. Recently, various diagnostic techniques using computer vision and artificial intelligence have been developed. However, they need to diagnose various lesions more accurately before they can be used in actual clinical practice. Accordingly, we developed a CenterNet-based object detection model and a U-Net-based class-wise binary segmentation model. These models were trained with random augmentation methods, including color and morphological changes. For the 43 test set images, our model shows a mean average precision of 0.1932 ± 0.0622 (mean ± standard deviation) in detection and a semantic score of 0.2544 ± 0.2080 in segmentation.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. INTRODUCTION

Endoscopists can recognize diverse lesions related to digestive disorders in gastrointestinal organs through endoscopic examinations. A detected lesion is clinically managed or resected in compliance with medical guidelines. However, it is typically not diagnosed until the results of pathological examination are known. Some endoscopic examinations are effective for the early diagnosis and prevention of gastrointestinal disease, but detecting lesions is highly dependent on the skill and experience of the endoscopist. For example, some studies have reported that the miss rate of polyps during colonoscopy ranges from 17% to 28% [1].

Recently, computer-aided systems for medical imaging have improved remarkably. In particular, recent studies have shown that artificial intelligence can meet endoscopists' needs. A prospective randomized controlled trial showed that adenoma detection rates of colorectal polyps significantly increased when endoscopists worked together with a real-time automatic detection system [2]. Another randomized controlled trial showed that a deep convolutional neural network using deep reinforcement learning achieved real-time monitoring of blind spots with high accuracy during esophagogastroduodenoscopy [3].

We participated in sub-challenge II, Endoscopic Disease Detection and Segmentation (EDD2020), of the Endoscopy Computer Vision Challenges on Segmentation and Detection (EndoCV2020). For this challenge, deep learning models were developed for detecting or segmenting lesions from 4 different organs: a CenterNet-based model was designed to detect lesions, and a class-wise U-Net-based model was developed to segment lesions.

2. DATASETS

In total, 386 endoscopic images of the training set were obtained from 5 centers [4]. Every image was assigned to at least 1 of 5 disease classes across 4 different organs: Barrett's esophagus (BE), high-grade dysplasia (HGD), cancer, polyp, and suspicious region. These images had corresponding bounding boxes and pixel-level labels for each lesion and were annotated by medical experts. The number of images in the training set was imbalanced across disease classes (BE: 160, HGD: 74, cancer: 53, polyp: 127, suspicious: 88).

3. METHODS

3.1. Image preprocessing

Class imbalance can lead to results biased towards a particular class during model training.
Thus, prior to image pre-processing, we randomly duplicated images of under-represented classes to balance the number of images across all classes. At this point, it was important to minimize the number of duplicated images, since indiscriminately duplicated images may cause substantial bias in the trained model. Therefore, in every round we identified the class with the highest number of images and the class with the lowest number, and then randomly duplicated images of the lowest class. To limit bias, images containing objects of the highest class were excluded from the random duplication.

After balancing the number of images belonging to each class, we preprocessed the training data to reduce overfitting of our models and to generalize them to the test data. First, all images in the training data were standardized per channel and randomly augmented 86 times using rotation, flipping, contrast enhancement, and brightness adjustment. Next, to train the model to be scale-invariant, we randomly changed the resolution of the original image between 320 and 602 pixels every 10 epochs and then converted it to a size of 512 × 512 pixels.

3.2. Model development for detection

For disease detection, we focused on a single-stage object detection model with fast execution speed, which is appropriate for real-time object detection and could possibly be used in clinical practice, because endoscopic images consist of video frames rather than still images. CenterNet has been shown to work more simply and efficiently by predicting both key points and bounding boxes of objects in an image at the same time, instead of sliding anchors that compute image features over possible bounding boxes [5]. Because it has recently demonstrated excellent performance in real-time target detection, we applied CenterNet to endoscopic disease detection. Our CenterNet-based EDD detection model predicts the center points of the lesions, offsets along the x and y axes, and the width and height of the bounding boxes.

The backbone architecture of our detection model is a ResNet50 [6] model pre-trained on the PASCAL VOC 2012 and EDD2020 [4] datasets for multiclass classification. We fine-tuned this detection model with the following training options: the batch size and number of epochs were 8 and 150, respectively, and the initial learning rate was 5e-4, divided by 10 after every 80 epochs. The input image size was 512, and the test output was restored to the original image size by applying an affine transformation. The threshold of the confidence score was set to 0.2.

3.3. Model development for segmentation

For disease segmentation, we modified the decoder part of the vanilla U-Net [7] to build a multi-class segmentation model that can infer an independent result for each class. Because some lesions overlap with other disease classes in the EDD2020 data, it would be inappropriate to implement general multi-class segmentation with a softmax operation as the final layer. Therefore, we replaced the final layer of the vanilla U-Net with class-wise binary segmentation branches. As shown in Fig. 1, we designed a branch structure in which the last up-convolution layer of the U-Net performs segmentation for each class independently. Through these branches, the class-wise binary segmentation model was trained with a Dice similarity coefficient loss. The same backbone architecture used for the detection model was used for our segmentation model.

Fig. 1. The architecture of the U-Net-based class-wise binary segmentation model.
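The round-based class balancing described in Sec. 3.1 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each training image is a pair of an identifier and its set of class labels, and the function name and record layout are hypothetical.

```python
import random
from collections import Counter

def balance_by_duplication(samples, max_rounds=1000, seed=0):
    """samples: list of (image_id, set_of_class_labels) pairs.
    Each round: find the most and least frequent classes, then duplicate a
    random image of the least frequent class, excluding images that also
    contain the most frequent class (so duplication does not amplify it)."""
    rng = random.Random(seed)
    out = list(samples)
    for _ in range(max_rounds):
        counts = Counter(c for _, classes in out for c in classes)
        hi = max(counts, key=counts.get)  # dominant class this round
        lo = min(counts, key=counts.get)  # rarest class this round
        if counts[hi] == counts[lo]:
            break  # balanced
        pool = [s for s in out if lo in s[1] and hi not in s[1]]
        if not pool:
            break  # cannot balance further without duplicating the dominant class
        out.append(rng.choice(pool))
    return out
```

Because a duplicated image may carry several labels, the round-by-round recomputation of the extremes keeps multi-label side effects in check.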
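The detection head in Sec. 3.2 outputs center-point heatmaps, x/y offsets, and box width/height. A minimal NumPy sketch of how such outputs could be decoded into boxes with the paper's 0.2 confidence threshold; shapes and the function name are illustrative, and real CenterNet decoding additionally keeps only local maxima of the heatmap:

```python
import numpy as np

def decode_centers(heatmap, offset, size, threshold=0.2):
    """heatmap: (H, W) center-point scores in [0, 1];
    offset: (2, H, W) sub-pixel x/y corrections;
    size: (2, H, W) predicted box width/height per cell.
    Returns a list of (x1, y1, x2, y2, score) boxes."""
    boxes = []
    ys, xs = np.where(heatmap > threshold)   # candidate center cells
    for y, x in zip(ys, xs):
        cx = x + offset[0, y, x]             # refined center x
        cy = y + offset[1, y, x]             # refined center y
        w, h = size[0, y, x], size[1, y, x]
        boxes.append((cx - w / 2, cy - h / 2,
                      cx + w / 2, cy + h / 2,
                      float(heatmap[y, x])))
    return boxes
```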
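The class-wise branches in Sec. 3.3 emit one independent sigmoid mask per class, so a pixel may belong to several overlapping lesion classes, which a softmax layer would forbid; training uses a Dice similarity coefficient loss. A small NumPy illustration of such a loss (a generic sketch, not the authors' implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dice_loss(logits, targets, eps=1e-6):
    """logits, targets: arrays of shape (C, H, W). Each class channel is
    treated as an independent binary mask via a sigmoid, which is why
    class-wise branches replace a single softmax output layer."""
    probs = sigmoid(logits)
    inter = (probs * targets).sum(axis=(1, 2))          # per-class overlap
    denom = probs.sum(axis=(1, 2)) + targets.sum(axis=(1, 2))
    dice = (2 * inter + eps) / (denom + eps)            # per-class Dice score
    return float(1.0 - dice.mean())                     # loss: 1 - mean Dice
```

Minimizing this loss pushes each class's predicted mask towards its ground-truth mask independently of the other classes.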
Training of our segmentation model was carried out with a batch size of 4 for 150 epochs, and the initial learning rate was 5e-4, divided by 10 after every 80 epochs.

4. RESULTS

For the 43 test set images, our model showed a mean average precision of 0.1932 ± 0.0622 in detection and a semantic score of 0.2544 ± 0.2080 in segmentation.

5. DISCUSSION & CONCLUSION

EndoCV2020 is an annual global competition for detecting and segmenting lesions in endoscopic images of gastrointestinal organs. We developed deep learning models for each task. The detection model achieved a mean average precision of 0.1932 ± 0.0622 and the segmentation model achieved a semantic score of 0.2544 ± 0.2080 on the test dataset.

The main challenge was the extremely small data size: only 386 images were given as a training set to classify and localize 5 imbalanced classes. The suspicious class even comprised unclear regions that endoscopists could not define. To overcome this problem, the images of minority classes in the training set were oversampled to balance them with the other classes, and all images were augmented through various image preprocessing techniques.

Further research is required to develop an artificial intelligence model that can meet the standard for practical endoscopic examination.

6. REFERENCES

[1] Nam Hee Kim, Yoon Suk Jung, Woo Shin Jeong, Hyo-Joon Yang, Soo-Kyung Park, Kyuyong Choi, and Dong Il Park. Miss rate of colorectal neoplastic polyps and risk factors for missed polyps in consecutive colonoscopies. Intestinal Research, 15(3):411, 2017.

[2] Pu Wang, Tyler M Berzin, Jeremy Romek Glissen Brown, Shishira Bharadwaj, Aymeric Becq, Xun Xiao, Peixi Liu, Liangping Li, Yan Song, Di Zhang, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut, 68(10):1813–1819, 2019.

[3] Lianlian Wu, Jun Zhang, Wei Zhou, Ping An, Lei Shen, Jun Liu, Xiaoda Jiang, Xu Huang, Ganggang Mu, Xinyue Wan, et al. Randomised controlled trial of WISENSE, a real-time quality improving system for monitoring blind spots during esophagogastroduodenoscopy. Gut, 68(12):2161–2169, 2019.

[4] Sharib Ali, Noha Ghatwary, Barbara Braden, Dominique Lamarque, Adam Bailey, Stefano Realdon, Renato Cannizzaro, Jens Rittscher, Christian Daul, and James East. Endoscopy disease detection challenge 2020. arXiv preprint arXiv:2003.03376, 2020.

[5] Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, and Qi Tian. CenterNet: Keypoint triplets for object detection. In Proceedings of the IEEE International Conference on Computer Vision, pages 6569–6578, 2019.

[6] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385, 2015.

[7] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.