Glioblastoma Multiforme Classification on High Resolution Histology Images Using a Deep Spatial Fusion Network

P. Sobana Sumi (corresponding author)^1 and Radhakrishnan Delhibabu^1,2

1 School of Computer Science and Engg., Vellore Institute of Technology, Vellore, India. sobanasumi.p2018@vitstudent.ac.in
2 Modeling Evolutionary Algorithms Simulation and Artificial Intelligence, Faculty of Electrical & Electronics Engineering, Ton Duc Thang University, Ho Chi Minh City, Vietnam. r.delhibabu@vit.ac.in, radhakrishnandelhibabu@tdtu.edu.vn

Copyright (c) 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract. A brain tumor is a growth of abnormal cells in the brain, which can be cancerous or non-cancerous. Brain tumors present scarce symptoms, so they are very difficult to classify. Diagnosing brain tumors from histology images helps to classify tumor types efficiently, but histology-based image analysis is sometimes rejected because of variations in morphological features. Deep learning CNN models help to overcome this problem through feature extraction and classification. Here we propose a method to classify high-resolution histology images. InceptionResNetV2, a CNN model, is adopted to extract hierarchical features without loss of information, and a deep spatial fusion network is built to capture the spatial relationships between patches and to correct predictions drawn from unpredictable discriminative features. With 10-fold cross-validation on the histology images, the method achieves 95.6 percent accuracy on 4-class classification (benign, malignant, Glioblastoma, Oligodendroglioma), and 99.1 percent accuracy with 99.6 percent AUC on 2-way classification (necrosis and non-necrosis).

Keywords: Glioblastoma Multiforme · Deep spatial fusion network · InceptionResNetV2 · Classification · Patches · CNN

1 Introduction

A cancerous tumor anywhere in the body can spread to the brain, or a tumor can start in the brain itself. A brain tumor can be normal (benign) or cancerous (malignant) based on its characteristics, and it occurs in both children and adults. Brain tumors are divided into two types, Low Grade Gliomas (LGG) and High Grade Gliomas (HGG): grade 1 and grade 2 are defined as LGG, grade 3 and grade 4 as HGG. Astrocytomas are grade 1 and grade 2 tumors, Oligodendroglioma is a grade 3 tumor, and Glioblastoma Multiforme is a grade 4 tumor. Children are affected by Astrocytoma, Ependymoma and Medulloblastoma; adults suffer from Astrocytoma, Oligodendroglioma, Meningioma, Glioblastoma and Schwannoma. Glioblastoma Multiforme (GBM) is the mature stage of brain tumor, with scarce symptoms, so it is very hard to classify; diagnosing this kind of tumor at the right time helps to increase patient survival.

Histology images are obtained through biopsy, the process of taking tissue from the tumor, and the tumor tissue is analyzed under an electron microscope. Histology images have to be analyzed in large numbers, with multiple stainings, to diagnose a single case, which is time consuming, and such analysis is sometimes rejected because of large variations in pathological features. At the initial stage, Computer Assisted Diagnosis (CAD) systems are used to detect tumors in histology images.
In this technique, images are scanned first; the digital images are then processed and analyzed by visual feature extraction with machine learning techniques. Color differences occur due to different scanners, staining procedures, patient ages and tissue thickness, and color normalization helps to make colors comparable among samples [1]. CAD works well on breast cancer images, but not on brain tumors.

Xu et al. [2] proposed deep activation features for large-scale histology images: a CNN is used for feature extraction, and the extracted features are passed to an SVM to classify patches into necrosis and non-necrosis areas. This method achieved 90 percent accuracy on small datasets for classification and segmentation. Fukuma et al. [3] explored feature extraction and disease stage classification for glioma histology images. Tumors are distinguished as LGG or GBM using significant features, namely object-level features and spatial arrangement features, to classify the disease stage; the features are evaluated by the K-S test, and the resulting set is classified with an SVM classifier. Classification accuracy is low when compared with other, non-significant features. In their work on automated discrimination of low and high grade glioma, Mousavi et al. [4] detect pseudopalisading necrosis areas by cell segmentation and cell count profiles, while microvascular proliferation (MVP) is detected by spatial and morphological feature extraction; the final hierarchical decision is made through a decision tree. MVP detection accuracy is lower than that of necrosis detection because of its structural complexity. Macyszyn et al. [5] examined a multidimensional pattern classification method in which an SVM classifier separates patient survival into short (6 months) and long (18 months) terms, with two-fold cross-validation for short-term and three-fold cross-validation for long-term survival; here MRI images are analyzed to predict survival. Barker et al. [6] explored a coarse-to-fine method to analyze the characteristics of pathology images: spatial features such as shape, color and texture are extracted from tiled regions and passed to clustering for better classification, with K-means used for clustering and PCA used to reduce data dimensionality and classification complexity. Powell et al. [7] examined low grade gliomas using a bag-of-words approach. An edge detection algorithm is used for nuclear segmentation on hematoxylin and eosin stains, with a global threshold value; K-means is used for feature extraction, and an SVM classifies overall patient survival into short and long terms. Xu et al. [8] suggested the CNN architecture AlexNet for classification and achieved better results than previous methods, although the image analysis was done with a limited number of images, whereas CNNs work best on large datasets. Yonekura et al. [9] proposed a deep CNN architecture consisting of three convolution layers, three pooling layers and three ReLUs; its classification accuracy is low compared with other networks, but the work mainly focuses on disease stage classification. Yonekura et al. [10] suggested a deep CNN architecture based on the LeNet network to extract features and classify the disease stage; its classification accuracy is also low compared with other networks.

Classifying high-resolution histology images thus remains a major problem.
A CNN can extract unpredictable discriminative features, but training a CNN directly on a high-resolution image is computationally expensive. Moreover, the discriminative features of a histology image are unpredictably distributed, which is a challenge for patch-based CNN classification. To solve this problem, the InceptionResNetV2 architecture is adopted for hierarchical feature extraction, and a deep spatial fusion network is used to model the spatial features found between patches. The proposed system gives better accuracy.

The paper is organised as follows. Section 2 introduces the studied problem. Section 3 describes two freely available databases with high-resolution brain tumor histology images. Section 4 introduces the used architecture of deep neural networks. Section 5 briefly describes the types of neural networks used in computer vision and image analysis and the peculiarities of histology images. Section 6 provides the essence of the proposed solution, while Section 7 explains training process aspects. Section 8 is devoted to machine experiments and results discussion. Finally, Section 9 concludes the paper.

2 Problem Statement

Histology image diagnosis is entirely a human process. Sometimes this kind of analysis is disputed because of large differences in morphological features. To diagnose a single tissue sample and classify its disease stage, the tissue has to be analyzed under various magnification factors, and several tissue samples have to be analyzed by a pathologist to reach a conclusion. In some cases, surgery and tissue diagnosis have to be done at the same time, which leads to time pressure. Datasets are scarce and images of good quality are rare, while deep learning techniques require large datasets; many diseases are not diagnosed for lack of training data. Histology images are currently diagnosed from H and E stains only. Imbalanced datasets can be addressed by data augmentation, and molecular features beyond H and E stains can be brought in to diagnose histology images.

3 Dataset

TCGA and TCIA are the popular databases from which the high-resolution brain tumor histology images are taken. The TCGA database consists of 2034 high-resolution brain tumor histology images of size 2048*1536, and the TCIA database consists of 2005 images. Each histology image is obtained by biopsy, preserving its molecular composition and original structure. Each image is an H and E stained microscopic histology image, at one of various magnification factors: 40X, 100X, 200X and 400X. The images fall into four classes — benign, malignant, Astrocytoma and Oligodendroglioma — and both datasets contain these four class labels evenly; Astrocytoma and Oligodendroglioma are malignant-type tumors. To avoid data imbalance and overfitting, data augmentation is performed in TensorFlow by rotation, saturation adjustment, etc. Before augmentation, normalization to the interval [-1, 1] is done to reduce the variance introduced by H and E staining [14]. For better classification, the images at all four magnifications are split in the ratio 7:3 between training and testing. Tumor classification works mainly focus on benign versus malignant binary classification.
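As an illustration of the preprocessing just described, the following sketch normalizes RGB values to [-1, 1], applies rotation and saturation augmentation, and makes the 7:3 split. The paper performs augmentation in TensorFlow; a torchvision equivalent is shown here for consistency with the PyTorch experiments of Section 8. The folder layout, path and jitter parameters are illustrative assumptions, and the stain-variance normalization of [14] is a separate step not shown.

```python
# A minimal preprocessing sketch, assuming the images are organized on disk in
# one folder per class (the directory path and ImageFolder layout are
# illustrative assumptions, not part of the paper).
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.RandomRotation(degrees=90),       # rotation augmentation
    transforms.ColorJitter(saturation=0.2),      # saturation adjustment
    transforms.ToTensor(),                       # uint8 image -> CxHxW in [0, 1]
    transforms.Normalize(mean=[0.5, 0.5, 0.5],   # rescale [0, 1] -> [-1, 1]
                         std=[0.5, 0.5, 0.5]),
])

dataset = datasets.ImageFolder("tcga_histology/", transform=transform)

# 7:3 train/test split, as described above.
n_train = int(0.7 * len(dataset))
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
```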
4 Architecture

The proposed architecture is shown in figure 1. A high-resolution brain tumor histology image is given as input. Unpredictable discriminative features are present sparsely over the entire image, which means that not all patches are necessarily consistent with the image-wise label. To model this fact, and to obtain good image-wise predictions, a spatial fusion network is proposed. First, an adapted InceptionResNetV2 is trained to extract hierarchical discriminative features and to predict the probabilities of the different cancer types for local image patches. Compared to VGG [11], InceptionResNetV2 (INRV2) performs well because of its shortcut connections: the skip connection structure of INRV2 reduces several problems that arise when backpropagating through deep neural networks and improves feature extraction. Secondly, a deep spatial fusion network is specially designed to learn the spatial relationships between patches; it takes the spatial feature map as input, with the patch-wise probability vectors as the base units of that map. The fusion model learns to adjust the bias of the patch-wise predictions and produces more reliable image-wise predictions than typical fusion methods.

Fig. 1. Proposed architecture of the spatial fusion network. A 512*512 pixel patch is given as input to the deep patch process. The spatial fusion network is trained on INRV2 outputs: the probabilistic vectors form the base of the spatial feature map, which is processed by the deep spatial fusion network. Dropout layers are added to avoid overfitting and increase robustness.

5 Related Theory

5.1 Convolutional Neural Network

A CNN is a category of neural network that has proved effective for image classification, image recognition, etc. Its operations are convolution, non-linearity (ReLU), pooling or subsampling, and classification (fully connected layer). Extracting features from the input image is the main job of the convolution part: the image is convolved with a filter to give a feature map. The rectified linear unit is a non-linear, element-wise operation that replaces all negative values in the feature map with zero. Pooling reduces the dimensionality of each feature map without loss of important information. A fully connected layer is a multilayer perceptron that uses a softmax activation function in the output layer. The convolutional and pooling layers output high-level features, and the fully connected layer uses these high-level features to classify the input image.

5.2 InceptionResNetV2

InceptionResNetV2 is a CNN trained on more than a million images from the ImageNet database. It is a 164-layer network that can classify images into 1000 object categories (varieties of animals, birds, everyday objects, etc.), so it has learned rich features from a wide range of images. Its residual skip connections pass information forward without loss, so training with this network is much faster and yields better accuracy than the plain Inception network. INRV2 achieves 19.9 percent top-1 error and 4.9 percent top-5 error. In the proposed work this network consists of 24 layers and 4 blocks.

Fig. 2. InceptionResNetV2 network with its 1536-dimensional feature vector.
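To make the patch-wise component concrete, here is a minimal sketch of obtaining patch-wise class probabilities from a pretrained InceptionResNetV2. The timm library and its inception_resnet_v2 weights are our assumption for illustration; the paper's adapted 24-layer, 4-block variant (with its 10*10 output map) is not reproduced, only the patch-to-probability interface it exposes.

```python
# A minimal sketch, assuming the timm library provides the pretrained
# InceptionResNetV2 backbone; not the authors' adapted network.
import timm
import torch

model = timm.create_model("inception_resnet_v2", pretrained=True, num_classes=4)
model.eval()

patch = torch.randn(1, 3, 512, 512)             # one normalized 512*512 RGB patch
with torch.no_grad():
    fmap = model.forward_features(patch)        # (1, 1536, H', W') spatial features
    probs = torch.softmax(model(patch), dim=1)  # 4-class patch probability vector
print(fmap.shape, probs.shape)
```

Because the network is fully convolutional up to its global pooling layer, the 512*512 patches described below can be fed directly despite the 299*299 ImageNet training size.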
5.3 Histology Image Analysis

The brain tumor histology image is analyzed from H and E stained slides only. The malignancy state can be determined by the presence or absence of certain histological features, such as necrosis, mitotically active cells, nuclear atypia and microvascular proliferation (enlarged blood vessels). These H and E stains are universally used for histological tissue examination (figure 3). Classification and grading of brain tumors can be improved by including other molecular information along with the histology-image-based information. In our proposed method, a deep patch-based process and a deep spatial fusion network are used to classify high-resolution histology images.

Fig. 3. Hematoxylin stains acidic molecules in shades of blue; eosin stains basic materials in shades of red, pink and orange.

6 Proposed Work

6.1 Patch-wise INRV2

Instead of adopting a plain feedforward CNN, our architecture uses InceptionResNetV2. Compared to other CNN architectures, INRV2 effectively reduces the difficulty of training a deep network through shortcut connections and residual learning; thanks to its skip connections, no information is lost through the non-linear functions. The extracted hierarchical features, from low level to high level, are combined to make the final predictions, since discriminative features are distributed across the image from the cellular to the tissue level. The input layer receives normalized image patches of size 512*512 sampled from the whole histology image. The depth of the network is 24 layers with 4 block units for exploring region patterns at different scales: the four block groups cover receptive fields of 19*19 to 43*43, 51*51 to 99*99, 115*115 to 211*211 and 243*243 to 435*435 pixels. These sizes respond to region patterns at the level of nuclei, structure organization and tissue organization.

6.2 Deep Spatial Fusion Network

The main purpose of the fusion model is to predict the image-wise label \hat{z} among Y classes C = \{C_1, C_2, \ldots, C_Y\}, given all patch-wise feature maps F produced by the proposed INRV2 network. This image-wise prediction is defined as a MAP estimate [12]:

\hat{z} = \arg\max_{z \in C} P(z \mid F).

If the entire high-resolution image is divided into M * N patches, all the patch-wise probability maps are arranged in spatial order:

F = \begin{pmatrix}
F_{11} & F_{12} & F_{13} & \cdots & F_{1N} \\
F_{21} & F_{22} & F_{23} & \cdots & F_{2N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
F_{M1} & F_{M2} & F_{M3} & \cdots & F_{MN}
\end{pmatrix}

A deep neural network (DNN) is then used to exploit the spatial relationships between patches, as shown in figure 5. The proposed fusion model contains 4 fully connected layers, each followed by a ReLU activation function [13]. During image-wise training, this multilayer perceptron converts the spatial distribution of local probability maps into a global class probability vector. A dropout layer is inserted before each hidden layer, including between the flattened probabilistic vector and the first hidden layer, to avoid overfitting and increase robustness. By dropping out half of the probability maps, the model learns image-wise prediction from only half of the patch information while minimizing the cross-entropy loss during training.

Fig. 4. Training of the deep patch model. A 512*512 pixel patch is given as input to the deep patch process; after training, each 512*512 patch is encoded into a 10*10 feature map and 4 probabilistic values.

Fig. 5. Training of the spatial fusion network. The image is divided into 12 patches of 512*512 pixels; each patch is given separately as input to InceptionResNetV2 and, after training, is encoded into a 10*10 feature map and 4 probabilistic values.
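The fusion model of Section 6.2 reduces to a small multilayer perceptron over the flattened probability map. The following is a minimal sketch, assuming a 3*4 grid for the 12 patches and a hidden width of 256 (neither is stated in the paper); the final layer outputs logits, with softmax left to the cross-entropy loss.

```python
# A sketch of the deep spatial fusion network: the M*N grid of patch-wise
# probability vectors is flattened and passed through four fully connected
# layers, with a dropout layer (p = 0.5, i.e. half of the probability maps
# dropped) before each hidden layer. Grid shape and hidden width are assumed.
import torch
import torch.nn as nn

class SpatialFusionNet(nn.Module):
    def __init__(self, m=3, n=4, num_classes=4, hidden=256):
        super().__init__()
        in_dim = m * n * num_classes             # 12 patches * 4 probabilities
        self.mlp = nn.Sequential(
            nn.Dropout(0.5), nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Dropout(0.5), nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),      # image-wise class logits
        )

    def forward(self, prob_map):                 # prob_map: (B, M, N, Y)
        return self.mlp(prob_map.flatten(1))     # image-wise logits: (B, Y)

fusion = SpatialFusionNet()
prob_map = torch.rand(8, 3, 4, 4)                # a batch of 8 probability maps
print(fusion(prob_map).shape)                    # torch.Size([8, 4])
```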
7 Network Training

Overlapping 512*512 pixel patches are extracted from the high-resolution image and given as input to the deep patch-based model shown in figure 4. Since patch-wise labels are not given in the training dataset, we assume that patch labels are consistent with the image-wise ground truth. This assumption introduces bias during patch-based training and reduces patch-wise classification accuracy; in the second stage, under supervised learning on image labels, the bias is reduced during image-based training. To counter dataset imbalance and overfitting, augmentation is done by random rotation, contrast and brightness changes, horizontal flipping, etc. This generates 202,174 patches from the TCGA database and 140,099 patches from the TCIA database. InceptionResNetV2 is trained with mini-batches of size 32 to minimize the cross-entropy cost function, using Adam optimization [15] with a learning rate of 10^-5 for 50 epochs. After training, this patch-wise network encodes each 512*512 patch into a 10*10 feature map and 4 class probability values.

To train the spatial fusion network, data augmentation is performed again, generating 5,890 high-resolution images from the TCGA training dataset and 3,998 images from the TCIA training dataset. After augmentation, each high-resolution image is divided into 12 non-overlapping patches of 512*512 pixels. Each patch is given individually as input to InceptionResNetV2, which outputs 512 feature maps of size 10*10 and a class probability vector of size 1*4. The probability vectors of the patches of the current image are then combined into a probability map following their spatial order, which is given as input to the spatial fusion network. This probability map can be viewed as a high-level feature map that encodes all the patch-wise discriminative features together with the image-wise spatial context, and it is compared against the image ground truth. The spatial fusion network weights are learned using mini-batch gradient descent with a batch size of 32 and Adam optimization. To minimize the cross-entropy loss during training, the spatial fusion model encodes the biased probability map into a k-class vector approximating the image ground truth (k=4). By using the spatial context-aware features hidden in the probability map, image-based classification accuracy is improved.
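Under the settings above (Adam, learning rate 10^-5, mini-batch size 32, cross-entropy, 50 epochs), the patch-wise training stage reduces to a standard PyTorch loop. This is a condensed sketch reusing the hypothetical `model` and `train_set` objects from the earlier sketches; patch labels are simply inherited from the image-wise ground truth, as described.

```python
# A condensed training sketch with the paper's reported hyperparameters.
# `model` and `train_set` are the backbone and dataset from earlier sketches.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"
loader = DataLoader(train_set, batch_size=32, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
criterion = nn.CrossEntropyLoss()

model.to(device).train()
for epoch in range(50):
    for patches, labels in loader:                # labels: image-wise ground truth
        patches, labels = patches.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(patches), labels)  # cross-entropy on patch logits
        loss.backward()
        optimizer.step()
```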
8 Experimental Results

We evaluated the performance of the patch-based InceptionResNetV2 and then focused on the effectiveness of the spatial fusion network through multiple comparison experiments on the two datasets. The first experiment (Baseline) on the TCIA dataset is the baseline method [16], a patch-based plain CNN with multiple-vote-based fusion. The second (Residual and Vote), also on the TCIA dataset, replaces the plain CNN with a patch-wise residual network. Our proposed work is examined on both datasets; the results are shown in Table 1. All methods are evaluated with ten-fold cross-validation.

Table 1. Comparison results on the TCIA and TCGA datasets

Dataset  Method                                   4-class ACC  2-class ACC  STD
TCIA     Baseline                                 0.778        0.844        -
TCIA     Residual and Vote                        0.816        0.850        -
TCIA     InceptionResNetV2 and Spatial Network    0.868        0.891        -
TCGA     CNN and GDT                              0.872        0.938        0.026
TCGA     InceptionResNetV2 and Spatial Network    0.956        0.991        0.022

On the TCIA dataset, the proposed method achieves an accuracy of 86.8 percent for 4-class classification, which outperforms the baseline method [16] by 8.4 percent; the residual-and-vote method brings an improvement of 4.9 percent. As a comparison on the TCGA dataset, CNN with GDT [17] is evaluated, which ensembles several CNN networks (ResNet50, InceptionV3, VGG16) and classifies the extracted features with a gradient boosting tree classifier. On the TCGA dataset, our proposed method achieves 95.6 percent accuracy on 4-class classification, and 99.1 percent accuracy with 99.6 percent AUC on 2-class classification (necrosis and non-necrosis).

Fig. 6. Confusion matrix (without normalization) for the 10-fold cross-validation on 4-class classification with 400 high-resolution histology images.

Classification performance in terms of the ROC (receiver operating characteristic) curve and the confusion matrix is shown in figures 6 and 7. This performance was obtained with the PyTorch library on an NVIDIA 1080Ti GPU; classifying a single high-resolution histology image takes around 80 ms.

9 Conclusion

This paper describes a deep spatial fusion network that handles the complex composition of discriminative features over patches and learns to adjust the bias of patch-wise predictions on high-resolution histology images. A patch-wise InceptionResNetV2 is adopted to extract features from the cellular to the tissue level, and the fusion network analyzes the spatial relationships between patches. Compared to previous CNN experiments with various architectures, our proposed method gives better performance. This work can be extended with other networks that efficiently analyze more types of malignant tumors beyond Glioblastoma and Oligodendroglioma.

Fig. 7. ROC curve of the 2-way classification into necrosis and non-necrosis.

References

[1] Lyon, H.O., De Leenheer, A.P., Horobin, R.W., Lambert, W.E., Schulte, E.K.W., Van Liedekerke, B., Wittekind, D.H.: Standardization of reagents and methods used in cytological and histological practice with emphasis on dyes, stains and chromogenic reagents. Histochem. J. 26(7), 533-544 (1994)
[2] Xu, Y., Jia, Z., Ai, Y., Zhang, F., Lai, M., Chang, E.I-C.: Deep convolutional activation features for large scale brain tumor histopathology image classification and segmentation. In: IEEE ICASSP (2015)
[3] Fukuma, K., Kawanaka, H., Prasath, S., Aronow, B.J., Takase, H.: Feature extraction and disease stage classification for glioma histopathology images. In: 17th International Conference on E-health Networking, Application and Services (HealthCom) (2015)
[4] Mousavi, H.S., Monga, V., Rao, G., Rao, A.U.K.: Automated discrimination of lower and higher grade gliomas based on histopathological image analysis. J. Pathol. Inform., 1-15 (2015)
[5] Macyszyn, L., Akbari, H., Pisapia, J.M., Da, X., Attiah, M., Pigrish, V., Bi, Y., Pal, S., Davuluri, R.V., Roccograndi, L., Dahmane, N., Martinez-Lage, M., Biros, G., Wolf, R.L., Bilello, M., O'Rourke, D.M., Davatzikos, C.: Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques. Neuro-Oncology (2015)
[6] Barker, J., Hoogi, A., Depeursinge, A., Rubin, D.L.: Automated classification of brain tumor type in whole-slide digital pathology images using local representative tiles. Elsevier (2015)
[7] Powell, R.T., Olar, A., Narang, S., Rao, G., Sulman, E., Fuller, G.N., Rao, A.: Identification of histological correlates of overall survival in lower grade gliomas using a bag-of-words paradigm: a preliminary analysis based on hematoxylin and eosin stained slides from the lower grade glioma cohort of The Cancer Genome Atlas. J. Pathol. Inform. (2017)
[8] Xu, Y., Jia, Z., Wang, L.-B., Ai, Y., Zhang, F., Lai, M., Chang, E.I-C.: Large scale tissue histopathology image classification, segmentation, and visualization via deep convolutional activation features. BMC Bioinformatics (2017)
[9] Yonekura, A., Kawanaka, H., Prasath, V.B.S., Aronow, B.J., Takase, H.: Improving the generalization of disease stage classification with deep CNN for glioma histopathological images. In: International Workshop on Deep Learning in Bioinformatics, Biomedicine, and Healthcare Informatics (DLB2H), pp. 1222-1226 (2017)
[10] Yonekura, A., Kawanaka, H., Prasath, V.B.S., Aronow, B.J., Takase, H.: Automatic disease stage classification of glioblastoma multiforme histopathological images using deep convolutional neural network. Korean Society of Medical and Biological Engineering and Springer-Verlag GmbH Germany (2018)
[11] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
[12] Greig, D.M., Porteous, B.T., Seheult, A.H.: Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society, Series B (Methodological), pp. 271-279 (1989)
[13] Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807-814 (2010)
[14] Macenko, M., Niethammer, M., Marron, J., Borland, D., Woosley, J.T., Guan, X., Schmitt, C., Thomas, N.E.: A method for normalizing histology slides for quantitative analysis. In: IEEE International Symposium on Biomedical Imaging (ISBI'09), pp. 1107-1110 (2009)
[15] Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
[16] Araujo, T., Aresta, G., Castro, E., Rouco, J., Aguiar, P., Eloy, C., Polonia, A., Campilho, A.: Classification of breast cancer histology images using convolutional neural networks. PLoS ONE 12(6), e0177544 (2017)
[17] Rakhlin, A., Shvets, A., Iglovikov, V., Kalinin, A.A.: Deep convolutional neural networks for breast cancer histology image analysis. arXiv preprint arXiv:1802.00752 (2018)