<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title/>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1007/s12652-019</article-id>
      <title-group>
        <article-title>Automated Detection of Cervical Pre-Cancerous Lesions Using Regional-Based Convolutional Neural Network</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stephen Kiptoo</string-name>
          <email>kiptoos@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lawrence Nderu</string-name>
          <email>lnderu@jkuat.ac.ke</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leah Mutanu</string-name>
          <email>lmutanu@usiu.ac.ke</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computing and Information Technology (SCIT), Jomo Kenyatta University of Agriculture and Technology</institution>
          ,
          <addr-line>Nairobi</addr-line>
          ,
          <country country="KE">Kenya</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Science and Technology (SST), United States International University</institution>
          ,
          <addr-line>USIU- Africa, Nairobi</addr-line>
          ,
          <country country="KE">Kenya</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>4</volume>
      <issue>4</issue>
      <fpage>19</fpage>
      <lpage>28</lpage>
      <abstract>
        <p>A cervical colposcopy image is an image of a woman's cervix taken with a digital colposcope after the application of acetic acid. The procedure is used to visualize pre-cancerous and abnormal areas of the cervix, making it a useful step in automating regional-based detection of cervical pre-cancerous lesions. This has revolutionized the early detection of cervical pre-cancerous traits. For cancerous ailments, detection in the early stages largely determines whether the ailment can be managed. This research uses R-CNN, a regional-based convolutional neural network applied to digital colposcopy images, to visualize pre-cancerous lesions. It is a mathematical model-based classification algorithm that extracts the features used to determine pre-cancerous traits. The model was trained on a dataset of 10,383 cervical image samples derived from the public Kaggle and UCI dataset repositories. The training samples comprised grade 1, 2, and 3 cervical pre-cancerous traits. With an accuracy of 86%, this approach heralds a promising development in the detection of cervical pre-cancerous cases.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Keywords</title>
      <p>Convolutional Neural Networks (CNN), CNN Architecture, Cervical Colposcopy, Regional-Based Convolutional Neural Network (R-CNN)</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>Cervical cancer is responsible for approximately 300,000 deaths worldwide, with 90% affecting women in low- to middle-income nations [1], [2]. Health sectors, particularly in the management of chronic diseases, face great challenges with the rise of the aging population [3]. One of these challenges arises in several rural parts of the world, where many women at high risk for cervical cancer receive treatment that will not work for them due to the position of their cervix. In Kenya, for instance, cervical cancer has been established as the leading cause of cancer death among women and the second most prevalent cancer affecting them, with about 5,250 new cervical cancer cases diagnosed annually in women between 15 and 44 years of age [4].</p>
      <p>Conventional cytological screening involves manual smearing and staining [5]. The complexity of the acquisition process for conventional cytology requires mobilizing expert teams to the field for evaluation [6]. Digital colposcopy is a very effective, non-invasive diagnostic tool used to review pre-malignancy cases of cervical cancer [7]. In this procedure, the cervix is examined non-invasively using a binocular stereomicroscope, where abnormal cervical cell regions show an aceto-white (AWE) lesion after application of 3-5% acetic acid [8]. The lesion is then biopsied under digital colposcopy guidance for histopathological evaluation and confirmation [9]. Today, colposcopy is considered reliable for the detection and treatment of pre-cancerous lesions of the cervix, and women in low-resource settings worldwide are benefiting through programs where cancer is identified and treated, or treatment planned, in a single visit [10].</p>
      <p>Currently, there is a void in specialized image processing software with the ability to process images acquired in colposcopy [11]. Several experts with different levels of expertise have worked on various aspects of the problem, through a competition organized by Intel and MobileODT for the analysis of vulnerable populations: from image quality assessment [12] and enhancement [13] of digital colposcopies, to the segmentation of cervical image anatomic regions [14], to the final diagnosis [15], and to automating the analysis of colposcopy images in order to support the medical decision process and provide a data-driven channel for communicating findings [16]. We are facing a possible turning point in the area, driven by interest in the new advent of deep learning and neural networks [17]. Therefore, our work aimed at proposing a neural network architecture for the detection of cervical pre-cancerous lesions by making use of a regional-based convolutional neural network.</p>
      <p>II. CONVOLUTIONAL NEURAL NETWORK (CONVNET)</p>
      <sec id="sec-2-1">
        <title>A. Convolutional Layer</title>
        <p>This layer accepts an input volume and holds a set of learnable filters as parameters, with dimensions such as height, breadth, and number of channels, which are essential in the convolution operation for deriving complex features such as the edges of the input image [18]. These filters are small spatially but extend through the full depth of the input volume. The operation computes dot products between the filter and logical regions of the input, sliding across the whole width and height until the entire picture is traversed [19].</p>
        <p>This operation yields a 2-dimensional feature map containing the responses of that filter at every spatial position [20]. Intuitively, the network will learn filters that activate when a type of visual feature, such as an edge of some orientation, is detected in the opening layer of the network. Each convolution layer holds an entire set of filters (for example, 12), and each filter produces its own 2-dimensional activation map [21]. These activation maps are stacked along the depth dimension, resulting in the output volume [22].</p>
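        <p>The sliding dot-product described above can be sketched in NumPy (a library the paper's pipeline already uses). The tiny image and vertical-edge filter are illustrative values, not data from the study:</p>

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image, taking a dot product at each
    logical region to build the 2-D activation (feature) map."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            region = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(region * kernel)  # dot product with the filter
    return out

# A vertical-edge filter responds where pixel intensity changes left-to-right.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
edge_filter = np.array([[1, -1],
                        [1, -1]], dtype=float)
fmap = convolve2d(image, edge_filter)
```

        <p>The feature map activates only along the column where the intensity changes, which is exactly the edge response described above.</p>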
      </sec>
      <sec id="sec-2-2">
        <title>B. Rectified Linear Unit Layer(ReLU)</title>
        <p>The activation function in a neural network is responsible for transforming the summed weighted input of a node into that node's output [7]. The ReLU activation function is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. This layer performs a pixel-wise operation, replacing all negative values in the activation map with zero to obtain a rectified activation map [23]. It has become the default activation function for many types of neural networks because a model that uses it is easier to train and often converges faster with better accuracy.</p>
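        <p>As a minimal sketch, the pixel-wise rectification amounts to one NumPy call; the sample activation map is illustrative:</p>

```python
import numpy as np

def relu(activation_map):
    """Pixel-wise ReLU: negative values are replaced by zero,
    positive values pass through unchanged."""
    return np.maximum(activation_map, 0)

# Illustrative activation map (not from the paper's data).
amap = np.array([[-2.0, 0.5],
                 [ 3.0, -0.1]])
rectified = relu(amap)
```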
      </sec>
      <sec id="sec-2-3">
        <title>C. Fully Connected Layer</title>
        <p>The cheapest way to learn a non-linear function is through a fully connected layer. The outputs of the convolution layers represent spatial combinations of high-level features, providing a fully expressive, low-dimensional, and invariant attribute space [24]. All the neurons from the preceding layer, be it a pooling, convolution, or fully connected layer, are connected to each neuron within it. Each neuron receives weights that prioritize the most appropriate tag. Having transformed the input image into a form appropriate for a multi-level perceptron, the image is flattened into a column vector.</p>
        <p>The resulting flattened output is fed to a feed-forward neural network, and backpropagation is applied at every iteration of training [25]. The main goal of a fully connected layer is to accumulate the results of the convolution and pooling processes, separate the dominating features from certain low-level ones, and use them to categorize images into basic classes.</p>
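        <p>The flatten-then-classify step can be sketched as follows; the shapes (two 3×3 pooled maps, three output classes) are illustrative assumptions, not the paper's actual layer sizes:</p>

```python
import numpy as np
rng = np.random.default_rng(0)

# Pooled feature maps (depth 2, 3x3 spatial), flattened into a column vector.
feature_maps = rng.standard_normal((2, 3, 3))
flat = feature_maps.reshape(-1)                 # 18-element vector

# Fully connected layer: every input connects to every output neuron,
# and each neuron's weights prioritize its own class.
weights = rng.standard_normal((3, flat.size))   # 3 illustrative output classes
bias = np.zeros(3)
logits = weights @ flat + bias

# Softmax turns logits into class probabilities for the final categorization.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```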
      </sec>
      <sec id="sec-2-4">
        <title>D. Pooling Layer</title>
        <p>The pooling layer uses height, width, depth, and stride as its parameters, controlling the number of pixels and the kernel size, to reduce the spatial dimensions, but not the depth, of a convolutional neural network model. This gains computational performance, reduces the chance of overfitting, and provides some translation invariance [21]. In the max-pooling operation, the layer operates on the input spatially, resizing it with a filter of 2 by 2 dimensions, which is the most common form of pooling layer. A stride of two is applied along the width and the height at every depth slice, ensuring that each max operation takes the maximum of four numbers, while the depth dimension of the pooled layer remains unchanged [22].</p>
        <p>Apart from the max-pooling operation, pooling units can perform other pooling operations such as L2-norm pooling and average pooling. Using pooling layers effectively creates a brief version of the features detected, generating scaled-down or pooled feature maps [26]. This illustrates the usefulness of convnets: for small alterations in the location of a feature in the input spotted by a convolutional layer, the result is a pooled feature map with the feature at a similar location [27].</p>
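        <p>The 2×2, stride-2 max-pooling operation described above can be sketched in NumPy on an illustrative 4×4 feature map:</p>

```python
import numpy as np

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2: each output is the max of four numbers;
    spatial size halves while depth (applied per channel) is unchanged."""
    h, w = fmap.shape
    return fmap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 7, 8],
                 [3, 2, 1, 0],
                 [1, 2, 3, 4]], dtype=float)
pooled = max_pool_2x2(fmap)
```

        <p>Each 2×2 block collapses to its maximum, so a 4×4 map becomes 2×2.</p>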
      </sec>
      <sec id="sec-2-5">
        <title>E. Convolutional Neural Network Architecture</title>
        <p>Convolutional neural networks (CNNs) have recently proved a remarkable success in natural language processing and computer vision, being biologically stimulated by the structure of the mammalian visual cortex. [28] followed up on the same idea by adapting it to computer vision. CNNs typically comprise different types of layers that can automatically acquire highly distinguishable feature representations without the need for hand-crafted features. [29] suggested AlexNet, a deep convnet architecture comprising seven hidden layers with millions of parameters, which achieved state-of-the-art performance on the ImageNet dataset [30].</p>
        <p>Conventional convolution operations require a huge number of multiplications, which tends to increase inference time and restricts the applicability of CNNs in low-memory and time-constrained applications [31]. Many real-world applications, such as robots, healthcare, and mobile applications, perform tasks that need to be conducted on computationally limited platforms in a timely manner. Hence, different modifications to CNNs have been made.</p>
        <p>In addition, image classification using a convolutional neural network is a feed-forward process with the following steps [32]:
• First, feed the image dimensions (pixels) into the convolutional layer.
• Select the image features (parameters), apply strides and padding if essential, and convolve the image using an activation map.
• Apply pooling to reduce the image dimensionality, and increase the convolutional layers until the model achieves the best accuracy desired.
• Then, flatten the results and feed them into the last layer, called a fully connected layer.
• Finally, produce the output using an activation function, which classifies the target image into the respective class.</p>
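        <p>The feed-forward steps above can be sketched end-to-end in NumPy; every shape and value here is illustrative (a single 8×8 channel, one 3×3 filter, two classes), not the paper's configuration:</p>

```python
import numpy as np
rng = np.random.default_rng(1)

def conv_valid(img, k):
    """Convolve the image with a learnable filter (valid padding)."""
    oh = img.shape[0] - k.shape[0] + 1
    ow = img.shape[1] - k.shape[1] + 1
    return np.array([[np.sum(img[y:y + k.shape[0], x:x + k.shape[1]] * k)
                      for x in range(ow)] for y in range(oh)])

def max_pool(f):
    """2x2 max pooling with stride 2 reduces the spatial dimensionality."""
    h, w = f.shape[0] // 2 * 2, f.shape[1] // 2 * 2
    return f[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = rng.standard_normal((8, 8))              # step 1: feed in the pixels
kernel = rng.standard_normal((3, 3))             # a single learnable filter
fmap = np.maximum(conv_valid(image, kernel), 0)  # step 2: convolution + ReLU
pooled = max_pool(fmap)                          # step 3: pooling
flat = pooled.reshape(-1)                        # step 4: flatten
w = rng.standard_normal((2, flat.size))          # step 5: fully connected layer
logits = w @ flat
probs = np.exp(logits - logits.max())            # step 6: softmax classification
probs = probs / probs.sum()
```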
        <p>
          The architecture outperformed other architectures, outshining the runner-up, which had an error rate of 26.2%, with a test error of 15.3% [33]. Subsequently, these efforts saw the making of a very deep CNN architecture consisting of 16 layers, which further significantly improved accuracy. This pointed to a trend in which every increase in the depth of a CNN architecture brought a significant increase in its accuracy, although beyond a point the increase in depth resulted in a decrease in performance. Research focusing on CNN reviews is still ongoing and has room for significant improvement. Generally, observation of
          <xref ref-type="bibr" rid="ref12">occurrences between 2015</xref>
          and 2020 shows a significant improvement in CNN performance. The capacity of deep learning and a CNN usually depends on its depth; in a sense, an enriched variable set ranging from simple to more complex abstractions can aid in learning the complex problems that arise.
        </p>
        <p>Fig. 1. The proposed architecture: input cervix images (150 by 150 by 3) with zero-centre normalization, a convolutional layer of 64 filters (3, 3) with ReLU activation, MaxPooling2D(2, 2), a Flatten layer, dense layers with ReLU activation, Dropout(0.5), and the output layer.</p>
        <p>However, the main setback faced by deep neural network architectures is that of the vanishing gradient. Previously, researchers attempted to eradicate this problem by connecting intermediate blocks of layers to auxiliary learners [34]. The emerging area of research was mainly the development of new connections to improve the convergence rate of deep CNN architectures. In this regard, different ideas such as information gating mechanisms across multiple layers, skip connections, and cross-layer channel connectivity were introduced [35].</p>
        <p>To date, several improvements in CNN architectures have been reported. In this regard, the architectural focus of research has been on designing new layers that can scale up the network representation by exploiting feature maps, and on processing the input representation by adding more artificial inputs. Moreover, the focus is on designing architecture types that make CNNs applicable on all devices without affecting performance.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>METHODS</title>
      <p>In this section, an automated model is proposed using a regional-based convolutional neural network architecture capable of classifying cervical precancerous lesion images. The neural network is designed in Python 3.8.5 using the TensorFlow and Keras deep learning libraries. Other Python libraries, such as pandas, NumPy, seaborn, and matplotlib, are also used in data preparation and visualization.</p>
      <sec id="sec-3-1">
        <title>A. Abbreviations and Acronyms</title>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>CIN: Cervical Intraepithelial Neoplasia.</title>
      <p>AWE: Acetowhite Lesion of the cervix categorized as
precancerous lesions.</p>
      <p>CIN 1: low-grade neoplasia, which involves about one-third of the thickness of the epithelium.</p>
      <p>CIN 2: abnormal changes involving between one-third and two-thirds of the epithelial layers.</p>
      <p>CIN 3: mostly severe changes, affecting more than two-thirds of the epithelium.</p>
      <sec id="sec-4-1">
        <title>B. Exploring the dataset</title>
        <p>The design was done in the Jupyter notebook in anaconda
environment. Initially, the dependent frameworks and
libraries were installed into a local machine then imported
into the Jupyter notebook editor.</p>
      </sec>
      <sec id="sec-4-2">
        <title>C. Dataset and Annotation</title>
        <p>We obtained 10,383 de-identified images for training through available online datasets from the Kaggle and UCI machine learning repositories. According to [36], these images were annotated as per the focus score output from a trained classifier, and the annotations were quantized into types. Through this method, 10,383 cervical images were used as training samples, 712 cervical images as test samples, and 100 as validation samples.</p>
        <p>We used a convolutional neural network (CNN) as well as transfer learning methods to evaluate pre-cancerous lesions, as per the anatomy of the cervix and as recommended by screening guidelines [11]. Below are randomly selected images from Kaggle, cropped to include the region of interest (ROI); subsequent steps of the process were performed within it, thus avoiding the confusing patterns and colors that occupy the rest of the image [37].</p>
        <p>Fig. 2. Few of Cervix Screened Images from UCI Repository</p>
        <p>Assessment and enhancement of medical cervical image quality are often application-specific and require extensive domain knowledge.</p>
      </sec>
      <sec id="sec-4-3">
        <title>D. Cervix Image Patch Extraction and Detection</title>
        <p>A total of 10,383 cervical precancerous lesion images were represented in patches of 15 × 15 pixels with a depth of 3, as shown in Fig. 2. Of these, 9,883 images were used to train the CNN model, 712 for testing, and 100 for validation, as per the anatomy of the cervix; hence, the bounding-box information of the cervix region is crucial. Using the bounding-box localization of the cervix region, we can restrict the quality assessment to our area of interest (the cervix) instead of including irrelevant regions [28].</p>
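        <p>Non-overlapping 15 × 15 × 3 patch extraction from an input image can be sketched as follows; the 150 × 150 × 3 input size follows the proposed architecture, and a blank image stands in for real data:</p>

```python
import numpy as np

def extract_patches(image, size=15):
    """Cut an H x W x 3 cervix image into non-overlapping size x size x 3
    patches; edge remainders are discarded in this simple sketch."""
    h, w, d = image.shape
    patches = [image[y:y + size, x:x + size, :]
               for y in range(0, h - size + 1, size)
               for x in range(0, w - size + 1, size)]
    return np.stack(patches)

image = np.zeros((150, 150, 3))   # input size from the proposed architecture
patches = extract_patches(image, size=15)
```

        <p>A 150 × 150 image yields a 10 × 10 grid, i.e. 100 patches.</p>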
        <p>Fig. 3. Separated findings: CIN 1, CIN 2, CIN 3.</p>
        <p>The goal in developing the training model using this proposed architecture was to obtain accurate coordinates of the cervical precancerous lesions for training purposes. The visual cortex of primates first receives input from the retinotopic area, whereby the lateral geniculate nucleus performs multi-scale high-pass filtering and later contrast normalization [38]. We make the final decision by using the largest predicted score and the associated bounding annotation. This was manually verified to ensure that the accuracy of the trained and validated automated detection of precancerous lesions was achieved, as illustrated in Fig. 3 on the separated findings.</p>
      </sec>
      <sec id="sec-4-4">
        <title>E. Technique Of Evaluating Neural Network Architectures</title>
        <p>Cross-validation, a simple method of running multiple architectures while selecting the best-performing one based on the validation set, was applied. Even so, only the best-performing architecture is selected from among architectures chosen manually from a large number of choices. Exhaustive grid search has been considered the most effective strategy for hyper-parameter optimization [12]. This is done by manually trying all the thinkable combinations of every stated hyper-parameter. In the works mentioned by [39], the architecture is considerably shallow, with a couple of densely connected layers within three groups of convolutional pooling layers; trained on a dataset of 485 images, the architecture achieved 50% accuracy in recognizing three balanced classes [40].</p>
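        <p>Exhaustive grid search can be sketched with the standard library; the hyper-parameter names and values below are illustrative assumptions (only the batch size of 20 and learning rate of 1e-5 appear in the paper), and the scoring function is a dummy stand-in for a real train-and-validate step:</p>

```python
from itertools import product

# Hypothetical hyper-parameter grid (values are illustrative).
grid = {"learning_rate": [1e-3, 1e-4, 1e-5],
        "batch_size": [10, 20],
        "dropout": [0.3, 0.5]}

# Exhaustive grid search tries every combination of the stated hyper-parameters.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

best = None
for params in combinations:
    # score = train_and_validate(params)  # placeholder for the real objective
    score = -params["learning_rate"]      # dummy stand-in so the sketch runs
    if best is None or score > best[0]:
        best = (score, params)
```

        <p>With three, two, and two candidate values, the grid tries all 12 combinations.</p>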
        <p>Lately, random search has reported results better than grid search by selecting hyper-parameters randomly in a well-defined search space. Nevertheless, neither random nor grid search has been shown to use preceding evaluations in identifying the succeeding set of hyper-parameters for testing so as to better the anticipated architecture [41].</p>
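        <p>By contrast, random search samples each hyper-parameter independently, with no use of preceding evaluations; a minimal sketch, with an illustrative search space:</p>

```python
import random
random.seed(0)

# Hypothetical search space (values are illustrative).
space = {"learning_rate": [1e-3, 1e-4, 1e-5],
         "batch_size": [10, 20],
         "dropout": [0.3, 0.5]}

# Each trial draws every hyper-parameter at random from its candidate list,
# independently of how earlier trials scored.
trials = [{name: random.choice(values) for name, values in space.items()}
          for _ in range(5)]
```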
        <p>To calculate the accuracy of the model, the model must first be compiled. Compiling the model includes specifying the optimizer, the loss function, and the metric evaluation method. Finally, the model uses accuracy to evaluate the model metrics. The proposed model summary is illustrated in Fig. 4; this is the stage of reviewing the model to confirm that everything is as expected.</p>
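        <p>The two quantities specified at compile time, the accuracy metric and the cross-entropy loss, reduce to the following computations; the labels and predicted probabilities here are illustrative:</p>

```python
import numpy as np

# Illustrative binary labels and predicted probabilities.
y_true = np.array([1, 0, 1, 1, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.7, 0.6])
y_pred = (y_prob >= 0.5).astype(int)

# Accuracy metric: fraction of predictions matching the labels.
accuracy = float(np.mean(y_pred == y_true))

# Binary cross-entropy loss, monitored during training.
bce = float(-np.mean(y_true * np.log(y_prob)
                     + (1 - y_true) * np.log(1 - y_prob)))
```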
        <p>Fig. 4. The proposed model summary</p>
        <p>Several technological researchers implement a hybrid of already proposed architectures to improve deep CNN performance [40]; [42]; [43]; [44]; [45]; [46]. In 2018, work focused mainly on designing generic blocks capable of being incorporated at any learning stage in a CNN architecture to improve the network representation [46]. In 2019, Khan et al. introduced the new idea of channel boosting to raise the performance of a CNN by learning distinct features as well as exploiting already learned features through the concept of transfer learning (TL) [47].</p>
        <p>This has resulted in probabilistic models built on preceding objective function evaluations. Techniques that have implemented Bayesian optimization (BO) include Sequential Model-based Algorithm Configuration (SMAC), which is grounded on random forests, and Spearmint, which uses a Gaussian process model [48]. Fig. 1 describes the architecture of the CNN used to achieve this evaluation accuracy.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>RESULTS</title>
      <sec id="sec-5-1">
        <title>A. Set Up Configuration</title>
        <p>We set up an algorithm in the Keras environment, with other libraries, to validate the accuracy of the performance. The entire dataset was split into a training set, a validation set, and a testing set in a ratio of 70:20:10. This was done automatically by initializing the weights and rescaling pixel values by 1/255. A batch size of 20 was used in binary mode, with metrics to determine accuracies, cross-entropy for loss checking and monitoring, and the RMSprop optimizer at a learning rate of 1e-5. We validated over 100 epochs using 50 validation steps.</p>
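        <p>The 70:20:10 split over the 10,383 samples can be sketched as a shuffled index partition; the random seed and the index-based approach are illustrative, not the paper's exact procedure:</p>

```python
import numpy as np
rng = np.random.default_rng(42)

n = 10383                        # total cervix image samples
indices = rng.permutation(n)     # shuffle before splitting

# 70:20:10 split into training, validation, and testing sets.
n_train = int(0.7 * n)
n_val = int(0.2 * n)
train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]
# Pixel values would then be rescaled by 1/255 before training.
```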
        <p>Unlike several other research studies [18], [49] performed with traditional machine learning approaches such as SVM, [15] showed approximately 50% validation classification accuracy. Recently, most research has focused on improving the recognition accuracy of models, and very few studies have demonstrated recurrent architectures using convolutional neural networks. That strategy is exceptional for context modeling from inputs; while the result in itself is not satisfactory, it suggests that deep learning techniques have the potential to detect lesions in cervix images from colposcopy.</p>
        <p>Therefore, this research relied on deep learning approaches to leverage systematic, successive higher-level feature extraction. Training was performed on numerous sets of cervix image training samples while documenting the outcomes each time. At each instance, the following parameters were kept constant: the training batch size was set to 20, with the model trained for 10 epochs and 100 steps per epoch.</p>
        <p>Initially, the model was trained on a scaled-down training set, and this setup was also replicated for the testing samples. At each of these instances, the number of validation samples was matched to the samples in the training set, as observed from the results tabulated in Table I. In the first two instances the models were trained for 20 epochs; for each of the remaining instances the models were trained over 10 epochs.</p>
        <p>Training the model beyond 20 epochs saw it suffer intense overfitting, as illustrated in Fig. 1, which in turn reversed the gains that had been learned by the model. As such, to prevent negative gain, the maximum number of epochs was capped at 20.</p>
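        <p>Capping training at a maximum epoch count while tracking the best validation score can be sketched as follows; the simulated accuracy curve (improving, then declining as overfitting sets in) is illustrative:</p>

```python
def train_with_cap(val_scores, max_epochs=20):
    """Iterate epoch scores up to max_epochs and keep the best
    validation accuracy seen, mimicking the epoch cap used to
    avoid negative gain from overfitting."""
    best, best_epoch = float("-inf"), 0
    for epoch, score in enumerate(val_scores[:max_epochs], start=1):
        if score > best:
            best, best_epoch = score, epoch
    return best, best_epoch

# Simulated validation accuracies that improve, then overfit and decline.
scores = [0.30, 0.45, 0.60, 0.72, 0.70, 0.65, 0.55]
best, epoch = train_with_cap(scores)
```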
        <p>Model accuracies for models trained for 20 epochs appear to be the lowest, between 30% and 40%. The increase observed could be attributed to the increase in training as well as testing samples. This low result is apparent when a comparison is made with the models trained for 10 epochs, which post accuracy results between 48% and 86%.</p>
        <p>Gradually increasing the training samples together with the testing samples results in a steady increase in accuracy. Initially, the model trained on 500 training samples posted 33% accuracy, compared to the model trained on 10,383 samples, which posted 86% accuracy.</p>
        <p>Training accuracies for models trained for 20 epochs appear to show a steady increase in training and validation accuracies. This is partly because of the low number of training and testing samples; the results point to a low outcome ranging between 30% and 40%.</p>
        <p>On increasing the training samples to 6,000 figures, with about 200 testing samples, the prediction accuracy of the trained model increases slightly to 48%. From the training and validation accuracy diagram in Table 1, there appears to be a steady increase in training accuracies and then a decline in validation accuracies. This appears to suggest that an increase in training samples with barely enough testing samples will affect the accuracy of the trained model.</p>
        <p>On increasing the samples to 2,000 and having the test samples at 1,000, the prediction accuracy of the trained model rose to 73.66%. A further increase of training samples to 10,383 raised the performance of the model to 85.78%, as illustrated in Table 1.</p>
      </sec>
      <sec id="sec-5-2">
        <title>B. Discussions</title>
        <p>For deep learning, the Keras neural network and TensorFlow libraries were used. The present study investigated whether neural networks could be applied to the categorization of images from colposcopy. Several types of cervical images are increasingly available through the Internet via public repositories, and inexpensive high-end smartphones and digital colposcopes are readily available to general researchers, facilitating the uploading and sharing of this information, such as pictures for data and image processing.</p>
        <p>
          Currently, machine learning and statistical analysis can be performed using high-performance personal computers that are affordable for individuals where there is a limited volume of information. In addition, deep learning and neural network technologies are becoming more accessible to research individuals and interested corporations. For instance, Google's software library for machine learning, TensorFlow, was released under an open-source
          <xref ref-type="bibr" rid="ref12">license in 2015</xref>
          [50]. Based on these emerging trends, the present study aimed to apply deep learning neural networks to gynecological clinical practice.
        </p>
        <p>The architecture used consists of several blocks of convolutional layers, each followed by a max-pooling layer. In the second-to-last layer of the structure, global max-pooling is used, followed by a soft-max layer at the end [51]. This architecture demonstrated state-of-the-art accuracy for object classification at the time [52], [53]. It uses a combination of two popular techniques, CNN and LSTM, with extraction performed to identify how features vary with respect to time. The proposed model shows better performance for visual image analysis.</p>
        <p>For computer vision problems, and especially cervical cancer diagnosis using deep learning approaches, having multiple high-resolution images is a step towards favorable classification outcomes. However, having multiple images is simply not enough. Drawing comparisons from Tables I, II, and III, a significantly low number of training and testing samples negatively affects the capacity of a model to generate the favorable classification outcomes desired.</p>
        <p>Further training of such a model would result in the model gaining further characteristics that negatively impact it, resulting in overfitting. This is captured in Fig. 5.</p>
        <p>In stark contrast to these research findings, previous studies differ: [54] reported a training set accuracy of 80.1% using a deep CNN, and [18] used 345 randomly selected images and reported an accuracy of 83% by resizing the images to 120 × 120 pixels and passing them to a LeNet CNN. Our proposed R-CNN model recorded 85.75%, slightly higher, with the 10,383 training samples improving the performance of the model.</p>
        <p>The lower classification accuracies observed for the training-sample outcomes of model case index I could also be attributed to misclassified images. Misclassification could result from the presence of intra-uterine devices (IUDs), hair, and even the speculum often used in an examination for dilating the orifice. The presence of such foreign matter not only causes the model to learn incorrect and unimportant features of precancerous cervical representations but also introduces new variables unrelated to cervical cancer characteristics, further complicating the classification process.</p>
      </sec>
      <sec id="sec-5-3">
        <title>C. Conclusion</title>
        <p>Early screening and the subsequent discovery of any precancerous traits have over time proven to be an effective way of dealing with cancerous ailments. With significant strides having been made in the diagnosis and detection of these ailments, the same has been replicated in the detection of precancerous lesions.</p>
        <p>The proposed model utilizes the same approach, using pre-cancerous lesions to aid in the detection of imminent progression. With an accuracy of 86%, the proposed R-CNN model demonstrates its ability to detect the presence of cervical pre-cancerous traits and could greatly aid in the diagnosis of imminent progression of cervical precancerous traits.</p>
        <p>To ensure higher classification accuracies, the proposed R-CNN model made use of sufficiently large training and testing samples. The use of sufficient training samples was meant not only to ensure that the model gains the relevant characteristic traits of cervical pre-cancerous lesions but also to make it generalizable, which is crucial in the detection process.</p>
        <p>Despite the high accuracy rating, the detection of cervical precancerous traits is not always a straightforward feat. Low-quality, low-resolution images and the presence of foreign objects such as an IUD or hair make the detection algorithm susceptible to erroneous classifications. To ensure a higher accuracy rating, careful selection and preparation of the images should be undertaken.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>ACKNOWLEDGMENTS</title>
      <p>The authors would like to thank all reviewers and editors for their comments and contributions to this paper. We also acknowledge the Kaggle and UCI dataset repositories and all subjects whose cervical images were used for learning purposes. We also recognize the input of medical gynecologic oncology expert Prof. Omenge E. Orango, Moi University School of Medicine, towards the success of this research.</p>
      <p>F. Islami, L. A. Torre, J. M. Drope, E. M. Ward, and A. Jemal,
“Global cancer in women: Cancer control priorities,” Cancer
Epidemiol. Biomarkers Prev., vol. 26, no. 4, pp. 458–470, 2017,
doi: 10.1158/1055-9965.EPI-16-0871.</p>
      <p>
        World Health Organization, Guide to cancer early
diagnosis. <xref ref-type="bibr" rid="ref11 ref24">2017</xref>.
“Kenya Cancer Statistics &amp; National Strategies,” Kenyan Network
of Cancer Organizations, Feb. 18, 2013.
https://kenyacancernetwork.wordpress.com/kenya-cancer-facts/
(accessed Aug. 13, 2020).
“Screening for Cervical Cancer Using Automated
Analysis of PAP-Smears,” ResearchGate.
https://www.researchgate.net/publication/261923436_Screening_for_Cervical_Cancer_Using_Automated_Analysis_of_PAP-Smears
(accessed Aug. 14, 2020).
“Wei <xref ref-type="bibr" rid="ref20">et al. - 2017</xref> -
Cervical cancer histology image identification met.pdf.”
      </p>
      <p>X. Q. Zhang and S. G. Zhao, “Cervical image classification based
on image segmentation preprocessing and a CapsNet network
model,” Int. J. Imaging Syst. Technol., vol. 29, no. 1, pp. 19–28,
2019, doi: 10.1002/ima.22291.
“National-Cancer-Screening-Guidelines-2018.pdf.” Accessed: Aug.
13, 2020. [Online]. Available:
https://www.health.go.ke/wp-content/uploads/2019/02/National-Cancer-Screening-Guidelines-2018.pdf.</p>
      <p>
        L. Wei, Q. Gan, and T. Ji, “Cervical cancer histology image
identification method based on texture and lesion area features,”
Comput. Assist. Surg., vol. 22, no. sup1, pp. 186–199,
Oct. <xref ref-type="bibr" rid="ref20">2017</xref>,
doi: 10.1080/24699322.2017.1389397.
“Intel &amp; MobileODT Cervical Cancer Screening.”
https://kaggle.com/c/intel-mobileodt-cervical-cancer-screening
(accessed Aug. 14, 2020).
      </p>
      <p>World Health Organization, Ed., WHO guidelines for screening and
treatment of precancerous lesions for cervical cancer prevention.
Geneva: World Health Organization, 2013.</p>
      <p>
        K. Fernandes, D. Chicco, J. S. Cardoso, and J. Fernandes,
“Supervised deep learning embeddings for the prediction of
cervical cancer diagnosis,” PeerJ Comput. Sci., vol. 4, p. e154,
<xref ref-type="bibr" rid="ref6 ref7">May 2018</xref>,
doi: 10.7717/peerj-cs.154.
“Maini and Aggarwal - 2010 - A Comprehensive Review of Image
Enhancement Techni.pdf.”
“A Novel Analysis of Clinical Data and Image Processing
Algorithms in Detection of Cervical Cancer,” ResearchGate.
https://www.researchgate.net/publication/277667329_A_Novel_Analysis_of_Clinical_Data_and_Image_Processing_Algorithms_in_Detection_of_Cervical_Cancer
(accessed Aug. 14, 2020).
      </p>
      <p>M. Sato et al., “Application of deep learning to the classification of
images from colposcopy,” Oncol. Lett., Jan. 2018, doi:
10.3892/ol.2018.7762.</p>
      <p>
        N. Muinga et al., “Digital health Systems in Kenyan Public
Hospitals: a mixed-methods survey,” BMC Med. Inform. Decis.
Mak., vol. 20, no. 1, p. 2, De
        <xref ref-type="bibr" rid="ref12">c. 2020</xref>
        , doi:
10.1186/s12911-0191005-7.
      </p>
      <p>M. Nielsen, “Neural networks and deep learning.” 2019.</p>
      <p>A. Mittal and M. Juneja, “Cervix Cancer Classification using
Colposcopy Images by Deep Learning Method,” Int. J. Eng.
Technol. Sci. Res., vol. 5, no. 3, pp. 426–432, 2018.</p>
      <p>
        C. Data Science, “An Intuitive Explanation of Convolutional
Neural Networks – the data science blog.”
<xref ref-type="bibr" rid="ref11 ref24">2017</xref>.
      </p>
      <p>R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi,
“Convolutional neural networks: an overview and application in
radiology,” Insights Imaging, vol. 9, no. 4, pp. 611–629, 2018, doi:
10.1007/s13244-018-0639-9.</p>
      <p>H. A. Almubarak et al., “Convolutional Neural Network Based
Localized Classification of Uterine Cervical Cancer Digital
Histology Images.,” Procedia Comput. Sci., vol. 114, pp. 281–287,
2017, doi: 10.1016/j.procs.2017.09.044.</p>
      <p>
        Guillaume Berger, “CS231n Convolutional Neural Networks for
Visual Recognition.” <xref ref-type="bibr" rid="ref20">2016</xref>.
      </p>
      <p>J. Brownlee, “A Gentle Introduction to the Rectified Linear Unit
(ReLU),” Machine Learning Mastery, Jan. 08, 2019.
https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/
(accessed Aug. 14, 2020).
“CS231n Convolutional Neural Networks for Visual Recognition.”
https://cs231n.github.io/convolutional-networks/ (accessed Aug.
14, 2020).
</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre, and A. Jemal, “Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries,” CA. Cancer J. Clin., vol. 68, no. 6, pp. 394–424, 2018, doi: 10.3322/caac.21492.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>“Convolutional Neural Network. In this article, we will see what are… | by Arunava | Towards Data Science.” https://towardsdatascience.com/convolutional-neural-network-17fb77e76c05 (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>21, 2019. https://machinelearningmastery.com/pooling-layers-for-convolutional-neural-networks/ (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>“Litjens et al. - 2017 - A Survey on Deep Learning in Medical Image Analysi.pdf.”</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>J. Tompson, A. Jain, Y. LeCun, and C. Bregler, “Joint training of a convolutional network and a graphical model for human pose estimation,” Adv. Neural Inf. Process. Syst., vol. 2, no. January, pp.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>M. R. Minar and J. Naher, “Recent Advances in Deep Learning: An Overview,” vol. 2006, pp. 1–31, 2018, doi: 10.13140/RG.2.2.24831.10403.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>M. Wu, C. Yan, H. Liu, Q. Liu, and Y. Yin, “Automatic classification of cervical cancer from cytological images by using convolutional neural network,” Biosci. Rep., vol. 38, no. 6, pp. 1–9, 2018, doi: 10.1042/BSR20181769.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>https://www.springerprofessional.de/en/detecting-driver-drowsiness-in-real-time-through-deep-learning-b/16772484 (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>Prabhu, “Understanding of Convolutional Neural Network (CNN) - Deep Learning,” Medium, Nov. 21, 2019.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148 (accessed Aug. 27, 2020).</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>G. Litjens et al., “A Survey on Deep Learning in Medical Image Analysis,” Med. Image Anal., vol. 42, pp. 60–88, Dec. 2017, doi: 10.1016/j.media.2017.07.005.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the Inception Architecture for Computer Vision,” ArXiv151200567 Cs, Dec. 2015, Accessed: Aug. 15, 2020.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[Online]. Available: http://arxiv.org/abs/1512.00567.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>R. Barth, J. Hemming, and E. J. Van Henten, “Optimising realism of synthetic images using cycle generative adversarial networks for improved part segmentation,” Comput. Electron. Agric., vol. 173, p. 105378, Jun. 2020, doi: 10.1016/j.compag.2020.105378.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>S. Bosse, D. Maniry, T. Wiegand, and W. Samek, “Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; Department of Electrical Engineering, Technical University of Berlin, Germany,” pp. 1–5.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>“Find Open Datasets and Machine Learning Projects | Kaggle.” https://www.kaggle.com/datasets (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>Natu, “The functional neuroanatomy of face perception: from brain measurements to deep neural networks,” Interface Focus, vol. 8, no. 4, p. 20180013, Aug. 2018, doi: 10.1098/rsfs.2018.0013.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>“Multimodal Deep Learning for Cervical Dysplasia Diagnosis | Request PDF,” ResearchGate.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>https://www.researchgate.net/publication/308816998_Multimodal_Deep_Learning_for_Cervical_Dysplasia_Diagnosis (accessed Aug.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>T. Xu et al., “Multi-feature based Benchmark for Cervical Dysplasia Classification Evaluation,” Pattern Recognit., vol. 63, pp. 468–475, Mar. 2017, doi: 10.1016/j.patcog.2016.09.027.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>“Bergstra and Bengio - Random Search for Hyper-Parameter Optimization.pdf.” Accessed: Aug. 14, 2020. [Online]. Available: https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>M. Z. Alom, M. Hasan, C. Yakopcic, and T. M. Taha, “Inception Recurrent Convolutional Neural Network for Object Recognition,” ArXiv170407709 Cs, Apr. 2017, Accessed: Aug. 14, 2020.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[Online]. Available: http://arxiv.org/abs/1704.07709.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>G. Larsson, M. Maire, and G. Shakhnarovich, “FractalNet: Ultra-Deep Neural Networks without Residuals,” ArXiv160507648 Cs, May 2017, Accessed: Aug. 14, 2020. [Online]. Available: http://arxiv.org/abs/1605.07648.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>https://www.researchgate.net/publication/338402067_Neural_Network_Based_Rhetorical_Status_Classification_for_Japanese_Judgment_Documents (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>