<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title/>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1007/s12652-019</article-id>
      <title-group>
        <article-title>Automated Detection of Cervical Pre-Cancerous Lesions Using Regional-Based Convolutional Neural Network</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stephen Kiptoo</string-name>
          <email>kiptoos@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lawrence Nderu</string-name>
          <email>lnderu@jkuat.ac.ke</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leah Mutanu</string-name>
          <email>lmutanu@usiu.ac.ke</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computing and Information Technology (SCIT), Jomo Kenyatta University of Agriculture and Technology</institution>
          ,
          <addr-line>Nairobi</addr-line>
          ,
          <country country="KE">Kenya</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Science and Technology (SST), United States International University</institution>
          ,
          <addr-line>USIU- Africa, Nairobi</addr-line>
          ,
          <country country="KE">Kenya</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>4</volume>
      <issue>4</issue>
      <fpage>19</fpage>
      <lpage>28</lpage>
      <abstract>
        <p>A cervical colposcopy image is an image of a woman's cervix taken with a digital colposcope after the application of acetic acid. The procedure is used to visualize pre-cancerous and abnormal areas of the cervix, making it a useful step in automating regional-based detection of cervical pre-cancerous lesions. This has revolutionized the early detection of cervical pre-cancerous traits. For cancerous ailments, detection in the early stages largely determines whether the ailment can be managed. This research uses R-CNN, a regional-based convolutional neural network applied to digital colposcopy images, to visualize pre-cancerous lesions. It is a mathematical model-based classification algorithm that extracts the features used to determine pre-cancerous traits. The model was trained on a dataset of 10,383 cervical image samples derived from the public Kaggle and UCI dataset repositories. The training samples comprised grade 1, 2, and 3 cervical pre-cancerous traits. With an accuracy of 86%, this approach heralds a promising development in the detection of cervical pre-cancerous cases.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Keywords</title>
      <p>Convolutional Neural Networks (CNN), CNN Architecture, Cervical Colposcopy, Regional-Based Convolutional Neural Network (R-CNN)</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>Cervical cancer is responsible for approximately 300,000 deaths worldwide, with 90% affecting women in low- to middle-income nations [1], [2]. Health sectors, particularly in the management of chronic diseases, face great challenges with the rise of the aging population [3]. One of these challenges arises in several rural parts of the world, where many women at high risk for cervical cancer receive treatment that will not work for them due to the position of their cervix. In Kenya, for instance, cervical cancer has been established as the leading cause of cancer death among women and the second most prevalent cancer affecting them, with about 5,250 new cervical cancer cases diagnosed annually in women between 15 and 44 years of age [4].</p>
      <p>Conventional cytological screening involves manual smearing and staining [5]. The complexity of the acquisition process for conventional cytology requires mobilizing expert teams to the field for evaluation [6]. Digital colposcopy is a very effective, non-invasive diagnostic tool used to review pre-malignancy cases of cervical cancer [7]. In this procedure, the cervix is examined non-invasively using a binocular stereomicroscope, where abnormal cervical cell regions show an aceto-white (AWE) lesion after application of 3-5% acetic acid [8]. The lesion is then biopsied under digital colposcopy guidance for histopathological evaluation and confirmation [9]. Today, colposcopy is considered reliable for the detection and treatment of pre-cancerous lesions of the cervix, and women in low-resource settings worldwide are benefiting through programs where cancer is identified and treated, or treatment planned, in a single visit [10].</p>
      <p>Currently, there is a void in specialized image processing software with the ability to process images acquired in colposcopy [11]. Several experts with different levels of expertise have worked on various aspects of the problem, through a competition organized by Intel and MobileODT for the analysis of vulnerable populations: from image quality assessment [12] and enhancement [13] of digital colposcopies, to the segmentation of cervical image anatomic regions [14], to the final diagnosis [15], and to automating the analysis of colposcopy images in order to support the medical decision process and provide a data-driven channel for communicating findings [16]. We are facing a possible turning point in the area, driven by interest in the new advent of deep learning and neural networks [17]. Therefore, our work aimed at proposing a neural network architecture for the detection of cervical pre-cancerous lesions by making use of a regional-based convolutional neural network.</p>
      <p>II. CONVOLUTIONAL NEURAL NETWORK (CONVNET)</p>
      <sec id="sec-2-1">
        <title>A. Convolutional Layer</title>
        <p>This layer accepts an input volume and holds a set of learnable filters as parameters, with dimensions such as height, breadth, and number of channels, which are essential in the convolution operation for deriving complex features such as the edges of the input image [18]. These filters are small spatially but extend through the full depth of the input volume. The operation computes dot products between the filter and logical regions of the input, sliding across the whole width and height until the entire picture is traversed [19].</p>
        <p>This operation yields a 2-dimensional feature map containing the responses of that filter at every spatial position [20]. Intuitively, the network will learn filters that activate when a type of visual feature, such as an edge of some orientation, is detected in the opening layer of the network. Each convolution layer holds an entire set of filters (for example, 12), and each filter produces its own 2-dimensional activation map [21]. These activation maps are stacked along the depth dimension, resulting in the output volume [22].</p>
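        <p>The sliding dot-product described above can be sketched in NumPy (a library the paper's pipeline already uses). The tiny image and vertical-edge filter are illustrative values, not data from the study:</p>

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image, taking a dot product at each
    logical region to build the 2-D activation (feature) map."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            region = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(region * kernel)  # dot product with the filter
    return out

# A vertical-edge filter responds where pixel intensity changes left-to-right.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
edge_filter = np.array([[1, -1],
                        [1, -1]], dtype=float)
fmap = convolve2d(image, edge_filter)
```

        <p>The feature map activates only along the column where the intensity changes, which is exactly the edge response described above.</p>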
      </sec>
      <sec id="sec-2-2">
        <title>B. Rectified Linear Unit Layer(ReLU)</title>
        <p>The activation function in a neural network is responsible for transforming the summed weighted input of a node into that node's output [7]. The ReLU activation function is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. This layer performs a pixel-wise operation, replacing all negative values in the activation map with zero to obtain a rectified activation map [23]. It has become the default activation function for many types of neural networks because a model that uses it is easier to train and often converges faster with better accuracy.</p>
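        <p>As a minimal sketch, the pixel-wise rectification amounts to one NumPy call; the sample activation map is illustrative:</p>

```python
import numpy as np

def relu(activation_map):
    """Pixel-wise ReLU: negative values are replaced by zero,
    positive values pass through unchanged."""
    return np.maximum(activation_map, 0)

# Illustrative activation map (not from the paper's data).
amap = np.array([[-2.0, 0.5],
                 [ 3.0, -0.1]])
rectified = relu(amap)
```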
      </sec>
      <sec id="sec-2-3">
        <title>C. Fully Connected Layer</title>
        <p>The cheapest way to learn a non-linear function is through a fully connected layer. The outputs of the convolution layers represent spatial combinations of high-level features, providing a fully expressive, low-dimensional, and invariant attribute space [24]. All the neurons from the preceding layer, be it a pooling, convolution, or fully connected layer, are connected to each neuron within it. Each neuron receives weights that prioritize the most appropriate tag. Having transformed the input image into a form appropriate for a multi-level perceptron, the image is flattened into a column vector.</p>
        <p>The resulting flattened output is fed to a feed-forward neural network, and backpropagation is applied at every iteration of training [25]. The main goal of a fully connected layer is to accumulate the results of the convolution and pooling processes, separate the dominating features from certain low-level ones, and use them to categorize images into basic classes.</p>
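        <p>The flatten-then-classify step can be sketched as follows; the shapes (two 3×3 pooled maps, three output classes) are illustrative assumptions, not the paper's actual layer sizes:</p>

```python
import numpy as np
rng = np.random.default_rng(0)

# Pooled feature maps (depth 2, 3x3 spatial), flattened into a column vector.
feature_maps = rng.standard_normal((2, 3, 3))
flat = feature_maps.reshape(-1)                 # 18-element vector

# Fully connected layer: every input connects to every output neuron,
# and each neuron's weights prioritize its own class.
weights = rng.standard_normal((3, flat.size))   # 3 illustrative output classes
bias = np.zeros(3)
logits = weights @ flat + bias

# Softmax turns logits into class probabilities for the final categorization.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```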
      </sec>
      <sec id="sec-2-4">
        <title>D. Pooling Layer</title>
        <p>The pooling layer uses height, width, depth, and stride as its parameters, controlling the number of pixels and the kernel size, to reduce the spatial dimensions, but not the depth, of a convolutional neural network model. This gains computational performance, reduces the chance of overfitting, and provides some translation invariance [21]. In the max-pooling operation, the layer operates on the input spatially, resizing it with a filter of 2 by 2 dimensions, which is the most common form of pooling layer. A stride of two is applied along the width and the height at every depth slice, ensuring that each max operation takes the maximum of four numbers, while the depth dimension of the pooled layer remains unchanged [22].</p>
        <p>Apart from the max-pooling operation, pooling units can perform other pooling operations such as L2-norm pooling and average pooling. Using pooling layers effectively creates a brief version of the features detected, generating scaled-down or pooled feature maps [26]. This illustrates the usefulness of convnets: for small alterations in the location of a feature in the input spotted by a convolutional layer, the result is a pooled feature map with the feature at a similar location [27].</p>
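        <p>The 2×2, stride-2 max-pooling operation described above can be sketched in NumPy on an illustrative 4×4 feature map:</p>

```python
import numpy as np

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2: each output is the max of four numbers;
    spatial size halves while depth (applied per channel) is unchanged."""
    h, w = fmap.shape
    return fmap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 7, 8],
                 [3, 2, 1, 0],
                 [1, 2, 3, 4]], dtype=float)
pooled = max_pool_2x2(fmap)
```

        <p>Each 2×2 block collapses to its maximum, so a 4×4 map becomes 2×2.</p>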
      </sec>
      <sec id="sec-2-5">
        <title>E. Convolutional Neural Network Architecture</title>
        <p>Convolutional neural networks (CNNs) have recently proved a remarkable success in natural language processing and computer vision, being biologically stimulated by the structure of the mammalian visual cortex. [28] followed up on the same idea by adapting it to computer vision. CNNs typically comprise different types of layers that can automatically acquire highly distinguishable feature representations without the need for hand-crafted features. [29] suggested AlexNet, a deep convnet architecture comprising seven hidden layers with millions of parameters, which achieved state-of-the-art performance on the ImageNet dataset [30].</p>
        <p>Conventional convolution operations require a huge number of multiplications, which tends to increase inference time and restricts the applicability of CNNs in low-memory and time-constrained applications [31]. Many real-world applications, such as robots, healthcare, and mobile applications, perform tasks that need to be conducted on computationally limited platforms in a timely manner. Hence, different modifications to CNNs have been made.</p>
        <p>In addition, image classification using a convolutional neural network is a feed-forward process with the following steps [32]:
• First, feed the image dimensions (pixels) into the convolutional layer.
• Select the image features (parameters), apply strides and padding if essential, and convolve the image using an activation map.
• Apply pooling to reduce the image dimensionality, and increase the convolutional layers until the model achieves the best accuracy desired.
• Then, flatten the results and feed them into the last layer, called a fully connected layer.
• Finally, produce the output using an activation function, which classifies the target image into the respective class.</p>
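        <p>The feed-forward steps above can be sketched end-to-end in NumPy; every shape and value here is illustrative (a single 8×8 channel, one 3×3 filter, two classes), not the paper's configuration:</p>

```python
import numpy as np
rng = np.random.default_rng(1)

def conv_valid(img, k):
    """Convolve the image with a learnable filter (valid padding)."""
    oh = img.shape[0] - k.shape[0] + 1
    ow = img.shape[1] - k.shape[1] + 1
    return np.array([[np.sum(img[y:y + k.shape[0], x:x + k.shape[1]] * k)
                      for x in range(ow)] for y in range(oh)])

def max_pool(f):
    """2x2 max pooling with stride 2 reduces the spatial dimensionality."""
    h, w = f.shape[0] // 2 * 2, f.shape[1] // 2 * 2
    return f[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

image = rng.standard_normal((8, 8))              # step 1: feed in the pixels
kernel = rng.standard_normal((3, 3))             # a single learnable filter
fmap = np.maximum(conv_valid(image, kernel), 0)  # step 2: convolution + ReLU
pooled = max_pool(fmap)                          # step 3: pooling
flat = pooled.reshape(-1)                        # step 4: flatten
w = rng.standard_normal((2, flat.size))          # step 5: fully connected layer
logits = w @ flat
probs = np.exp(logits - logits.max())            # step 6: softmax classification
probs = probs / probs.sum()
```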
        <p>
          The architecture outperformed other architectures, outshining the runner-up, which had an error rate of 26.2%, with a test error of 15.3% [33]. Subsequently, these efforts saw the making of a very deep CNN architecture consisting of 16 layers, which further significantly improved accuracy. This pointed to a trend in which every increase in the depth of a CNN architecture brought a significant increase in its accuracy, although beyond a point the increase in depth resulted in a decrease in performance. Research focusing on CNN reviews is still ongoing and has room for significant improvement. Generally, observation of
          <xref ref-type="bibr" rid="ref12">occurrences between 2015</xref>
          and 2020 shows a significant improvement in CNN performance. The capacity of deep learning and a CNN usually depends on its depth; in a sense, an enriched variable set ranging from simple to more complex abstractions can aid in learning the complex problems that arise.
        </p>
        <p>Fig. 1. The proposed architecture: input cervix images (150 by 150 by 3) with zero-centre normalization, a convolutional layer of 64 filters (3, 3) with ReLU activation, MaxPooling2D(2, 2), a Flatten layer, dense layers with ReLU activation, Dropout(0.5), and the output layer.</p>
        <p>However, the main setback faced by deep neural network architectures is that of the vanishing gradient. Previously, researchers attempted to eradicate this problem by connecting intermediate blocks of layers to auxiliary learners [34]. The emerging area of research was mainly the development of new connections to improve the convergence rate of deep CNN architectures. In this regard, different ideas such as information gating mechanisms across multiple layers, skip connections, and cross-layer channel connectivity were introduced [35].</p>
        <p>To date, several improvements in CNN architectures have been reported. In this regard, the architectural focus of research has been on designing new layers that can scale up the network representation by exploiting feature maps, and on processing the input representation by adding more artificial inputs. Moreover, the focus is on designing architecture types that make CNNs applicable on all devices without affecting performance.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>METHODS</title>
      <p>In this section, an automated model is proposed using a regional-based convolutional neural network architecture capable of classifying cervical precancerous lesion images. The neural network is designed in Python 3.8.5 using the TensorFlow and Keras deep learning libraries. Other Python libraries, such as pandas, NumPy, seaborn, and matplotlib, are also used in data preparation and visualization.</p>
      <sec id="sec-3-1">
        <title>A. Abbreviations and Acronyms</title>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>CIN: Cervical Intraepithelial Neoplasia.</title>
      <p>AWE: Acetowhite Lesion of the cervix categorized as
precancerous lesions.</p>
      <p>CIN 1: low-grade neoplasia, which involves about one-third of the thickness of the epithelium.</p>
      <p>CIN 2: abnormal changes involving between one-third and two-thirds of the epithelial layers.</p>
      <p>CIN 3: mostly severe changes, affecting more than two-thirds of the epithelium.</p>
      <sec id="sec-4-1">
        <title>B. Exploring the dataset</title>
        <p>The design was done in the Jupyter notebook in anaconda
environment. Initially, the dependent frameworks and
libraries were installed into a local machine then imported
into the Jupyter notebook editor.</p>
      </sec>
      <sec id="sec-4-2">
        <title>C. Dataset and Annotation</title>
        <p>We obtained 10,383 de-identified images for training through available online datasets from the Kaggle and UCI machine learning repositories. According to [36], these images were annotated as per the focus score output from a trained classifier, and the annotations were quantized into types. Through this method, 10,383 cervical images were used as training samples, 712 cervical images as test samples, and 100 as validation samples.</p>
        <p>We used a convolutional neural network (CNN) as well as transfer learning methods to evaluate pre-cancerous lesions, as per the anatomy of the cervix and as recommended by screening guidelines [11]. Below are randomly selected images from Kaggle, cropped to include the region of interest (ROI); subsequent steps of the process were performed within it, thus avoiding the confusing patterns and colors that occupy the rest of the image [37].</p>
        <p>Fig. 2. Few of Cervix Screened Images from UCI Repository</p>
        <p>Assessment and enhancement of medical cervical image quality are often application-specific and require extensive domain knowledge.</p>
      </sec>
      <sec id="sec-4-3">
        <title>D. Cervix Image Patch Extraction and Detection</title>
        <p>A total of 10,383 cervical precancerous lesion images were represented in patches of 15 × 15 pixels with a depth of 3, as shown in Fig. 2. Of these, 9,883 images were used to train the CNN model, 712 for testing, and 100 for validation, as per the anatomy of the cervix; hence, the bounding-box information of the cervix region is crucial. Using the bounding-box localization of the cervix region, we can restrict the quality assessment to our area of interest (the cervix) instead of including irrelevant regions [28].</p>
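        <p>Non-overlapping 15 × 15 × 3 patch extraction from an input image can be sketched as follows; the 150 × 150 × 3 input size follows the proposed architecture, and a blank image stands in for real data:</p>

```python
import numpy as np

def extract_patches(image, size=15):
    """Cut an H x W x 3 cervix image into non-overlapping size x size x 3
    patches; edge remainders are discarded in this simple sketch."""
    h, w, d = image.shape
    patches = [image[y:y + size, x:x + size, :]
               for y in range(0, h - size + 1, size)
               for x in range(0, w - size + 1, size)]
    return np.stack(patches)

image = np.zeros((150, 150, 3))   # input size from the proposed architecture
patches = extract_patches(image, size=15)
```

        <p>A 150 × 150 image yields a 10 × 10 grid, i.e. 100 patches.</p>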
        <p>Fig. 3. Separated findings: CIN 1, CIN 2, CIN 3.</p>
        <p>The goal in developing the training model using this proposed architecture was to obtain accurate coordinates of the cervical precancerous lesions for training purposes. The visual cortex of primates first receives input from the retinotopic area, whereby the lateral geniculate nucleus performs multi-scale high-pass filtering and later contrast normalization [38]. We make the final decision by using the largest predicted score and the associated bounding annotation. This was manually verified to ensure that the accuracy of the trained and validated automated detection of precancerous lesions was achieved, as illustrated in Fig. 3 on the separated findings.</p>
      </sec>
      <sec id="sec-4-4">
        <title>E. Technique Of Evaluating Neural Network Architectures</title>
        <p>Cross-validation, a simple method of running multiple architectures while selecting the best-performing one based on the validation set, was applied. Even so, only the best-performing architecture is selected from among architectures chosen manually from a large number of choices. Exhaustive grid search has been considered the most effective strategy for hyper-parameter optimization [12]. This is done by manually trying all the thinkable combinations of every stated hyper-parameter. In the works mentioned by [39], the architecture is considerably shallow, with a couple of densely connected layers within three groups of convolutional pooling layers; trained on a dataset of 485 images, the architecture achieved 50% accuracy in recognizing three balanced classes [40].</p>
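        <p>Exhaustive grid search can be sketched with the standard library; the hyper-parameter names and values below are illustrative assumptions (only the batch size of 20 and learning rate of 1e-5 appear in the paper), and the scoring function is a dummy stand-in for a real train-and-validate step:</p>

```python
from itertools import product

# Hypothetical hyper-parameter grid (values are illustrative).
grid = {"learning_rate": [1e-3, 1e-4, 1e-5],
        "batch_size": [10, 20],
        "dropout": [0.3, 0.5]}

# Exhaustive grid search tries every combination of the stated hyper-parameters.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

best = None
for params in combinations:
    # score = train_and_validate(params)  # placeholder for the real objective
    score = -params["learning_rate"]      # dummy stand-in so the sketch runs
    if best is None or score > best[0]:
        best = (score, params)
```

        <p>With three, two, and two candidate values, the grid tries all 12 combinations.</p>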
        <p>Lately, random search has reported results better than grid search by selecting hyper-parameters randomly in a well-defined search space. Nevertheless, neither random nor grid search has been shown to use preceding evaluations in identifying the succeeding set of hyper-parameters for testing so as to better the anticipated architecture [41].</p>
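        <p>By contrast, random search samples each hyper-parameter independently, with no use of preceding evaluations; a minimal sketch, with an illustrative search space:</p>

```python
import random
random.seed(0)

# Hypothetical search space (values are illustrative).
space = {"learning_rate": [1e-3, 1e-4, 1e-5],
         "batch_size": [10, 20],
         "dropout": [0.3, 0.5]}

# Each trial draws every hyper-parameter at random from its candidate list,
# independently of how earlier trials scored.
trials = [{name: random.choice(values) for name, values in space.items()}
          for _ in range(5)]
```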
        <p>To calculate the accuracy of the model, the model must first be compiled. Compiling the model includes specifying the optimizer, the loss function, and the metric evaluation method. Finally, the model uses accuracy to evaluate the model metrics. The proposed model summary is illustrated in Fig. 4; this is the stage of reviewing the model to confirm that everything is as expected.</p>
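        <p>The two quantities specified at compile time, the accuracy metric and the cross-entropy loss, reduce to the following computations; the labels and predicted probabilities here are illustrative:</p>

```python
import numpy as np

# Illustrative binary labels and predicted probabilities.
y_true = np.array([1, 0, 1, 1, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.7, 0.6])
y_pred = (y_prob >= 0.5).astype(int)

# Accuracy metric: fraction of predictions matching the labels.
accuracy = float(np.mean(y_pred == y_true))

# Binary cross-entropy loss, monitored during training.
bce = float(-np.mean(y_true * np.log(y_prob)
                     + (1 - y_true) * np.log(1 - y_prob)))
```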
        <p>Fig. 4. The proposed model summary</p>
        <p>Several technological researchers implement a hybrid of already proposed architectures to improve deep CNN performance [40]; [42]; [43]; [44]; [45]; [46]. In 2018, work focused mainly on designing generic blocks capable of being incorporated at any learning stage in a CNN architecture to improve the network representation [46]. In 2019, Khan et al. introduced the new idea of channel boosting to raise the performance of a CNN by learning distinct features as well as exploiting already learned features through the concept of transfer learning (TL) [47].</p>
        <p>This has resulted in probabilistic models built on preceding objective function evaluations. Techniques that have implemented Bayesian optimization (BO) include Sequential Model-based Algorithm Configuration (SMAC), which is grounded on random forests, and Spearmint, which uses a Gaussian process model [48]. Fig. 1 describes the architecture of the CNN used to achieve this evaluation accuracy.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>RESULTS</title>
      <sec id="sec-5-1">
        <title>A. Set Up Configuration</title>
        <p>We set up an algorithm in the Keras environment, with other libraries, to validate the accuracy of the performance. The entire dataset was split into a training set, a validation set, and a testing set in a ratio of 70:20:10. This was done automatically by initializing the weights and rescaling pixel values by 1/255. A batch size of 20 was used in binary mode, with metrics to determine accuracies, cross-entropy for loss checking and monitoring, and the RMSprop optimizer at a learning rate of 1e-5. We validated over 100 epochs using 50 validation steps.</p>
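        <p>The 70:20:10 split over the 10,383 samples can be sketched as a shuffled index partition; the random seed and the index-based approach are illustrative, not the paper's exact procedure:</p>

```python
import numpy as np
rng = np.random.default_rng(42)

n = 10383                        # total cervix image samples
indices = rng.permutation(n)     # shuffle before splitting

# 70:20:10 split into training, validation, and testing sets.
n_train = int(0.7 * n)
n_val = int(0.2 * n)
train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]
# Pixel values would then be rescaled by 1/255 before training.
```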
        <p>Unlike several other research studies [18], [49] performed with traditional machine learning approaches such as SVM, [15] showed approximately 50% validation classification accuracy. Recently, most research has focused on improving the recognition accuracy of models, and very few studies have demonstrated recurrent architectures using convolutional neural networks. That strategy is exceptional for context modeling from inputs; while the result in itself is not satisfactory, it suggests that deep learning techniques have the potential to detect lesions in cervix images from colposcopy.</p>
        <p>Therefore, this research relied on deep learning approaches to leverage systematic, successive higher-level feature extraction. Training was performed on numerous sets of cervix image training samples while documenting the outcomes each time. At each instance, the following parameters were kept constant: the training batch size was set to 20, with the model trained for 10 epochs and 100 steps per epoch.</p>
        <p>Initially, the model was trained on a scaled-down training set, and this setup was also replicated for the testing samples. At each of these instances, the number of validation samples was matched to the samples in the training set, as observed from the results tabulated in Table I. In the first two instances the models were trained for 20 epochs; for each of the remaining instances the models were trained over 10 epochs.</p>
        <p>Training the model beyond 20 epochs saw it suffer intense overfitting, as illustrated in Fig. 1, which in turn reversed the gains that had been learned by the model. As such, to prevent negative gain, the maximum number of epochs was capped at 20.</p>
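        <p>Capping training at a maximum epoch count while tracking the best validation score can be sketched as follows; the simulated accuracy curve (improving, then declining as overfitting sets in) is illustrative:</p>

```python
def train_with_cap(val_scores, max_epochs=20):
    """Iterate epoch scores up to max_epochs and keep the best
    validation accuracy seen, mimicking the epoch cap used to
    avoid negative gain from overfitting."""
    best, best_epoch = float("-inf"), 0
    for epoch, score in enumerate(val_scores[:max_epochs], start=1):
        if score > best:
            best, best_epoch = score, epoch
    return best, best_epoch

# Simulated validation accuracies that improve, then overfit and decline.
scores = [0.30, 0.45, 0.60, 0.72, 0.70, 0.65, 0.55]
best, epoch = train_with_cap(scores)
```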
        <p>Model accuracies for models trained for 20 epochs appear to be the lowest, between 30% and 40%. The increase observed could be attributed to the increase in training as well as testing samples. This low result is apparent when a comparison is made with the models trained for 10 epochs, which post accuracy results between 48% and 86%.</p>
        <p>Gradually increasing the training samples together with the testing samples results in a steady increase in accuracy. Initially, the model trained on 500 training samples posted 33% accuracy, compared to the model trained on 10,383 samples, which posted 86% accuracy.</p>
        <p>Training accuracies for models trained for 20 epochs appear to show a steady increase in training and validation accuracies. This is partly because of the low number of training and testing samples; the results point to a low outcome ranging between 30% and 40%.</p>
        <p>On increasing the training samples to 6,000 figures, with about 200 testing samples, the prediction accuracy of the trained model increases slightly to 48%. From the training and validation accuracy diagram in Table 1, there appears to be a steady increase in training accuracies and then a decline in validation accuracies. This appears to suggest that an increase in training samples with barely enough testing samples will affect the accuracy of the trained model.</p>
        <p>On increasing the samples to 2,000 and having the test samples at 1,000, the prediction accuracy of the trained model rose to 73.66%. A further increase of training samples to 10,383 raised the performance of the model to 85.78%, as illustrated in Table 1.</p>
      </sec>
      <sec id="sec-5-2">
        <title>B. Discussions</title>
        <p>For deep learning, the Keras neural network and TensorFlow libraries were used. The present study investigated whether neural networks could be applied to the categorization of images from colposcopy. Several types of cervical images are increasingly available through the Internet via public repositories, and inexpensive high-end smartphones and digital colposcopes are readily available to general researchers, facilitating the uploading and sharing of this information, such as pictures for data and image processing.</p>
        <p>
          Currently, machine learning and statistical analysis can be performed using high-performance personal computers that are affordable for individuals where there is a limited volume of information. In addition, deep learning and neural network technologies are becoming more accessible to research individuals and interested corporations. For instance, Google's software library for machine learning, TensorFlow, was released under an open-source
          <xref ref-type="bibr" rid="ref12">license in 2015</xref>
          [50]. Based on these emerging trends, the present study aimed to apply deep learning neural networks to gynecological clinical practice.
        </p>
        <p>The architecture used consists of several blocks of convolutional layers, each followed by a max-pooling layer. In the second-to-last layer of the structure, global max-pooling is used, followed by a soft-max layer at the end [51]. This architecture demonstrated state-of-the-art accuracy for object classification at the time [52], [53]. It uses a combination of two popular techniques, CNN and LSTM, with extraction performed to identify how features vary with respect to time. The proposed model shows better performance for visual image analysis.</p>
        <p>For computer vision problems, and especially cervical cancer diagnosis using deep learning approaches, having multiple high-resolution images is a step towards favorable classification outcomes. However, having multiple images is simply not enough. Drawing comparisons from Tables I, II, and III, a significantly low number of training and testing samples negatively affects the capacity of a model to generate the favorable classification outcomes desired.</p>
        <p>Further training of such a model would result in the model gaining further characteristics that negatively impact it, resulting in overfitting. This is captured in Fig. 5.</p>
        <p>In stark contrast to these research findings, previous studies differ: [54] reported a training set accuracy of 80.1% using a deep CNN, and [18] used 345 randomly selected images and reported an accuracy of 83% by resizing the images to 120 × 120 pixels and passing them to a LeNet CNN. Our proposed R-CNN model recorded 85.75%, slightly higher, with the 10,383 training samples improving the performance of the model.</p>
        <p>The lower classification accuracies observed for the training-sample outcomes of model case index I could also be attributed to misclassified images. Misclassification could result from the presence of intra-uterine devices (IUDs), hair, and even the speculum often used in an examination for dilating the orifice. The presence of such foreign matter not only causes the model to learn incorrect and unimportant features of precancerous cervical representations but also introduces new variables unrelated to cervical cancer characteristics, further complicating the classification process.</p>
      </sec>
      <sec id="sec-5-3">
        <title>C. Conclusion</title>
        <p>Early screening and the subsequent discovery of any precancerous traits have over time proven to be an effective way of dealing with cancerous ailments. With significant strides having been made in the diagnosis and detection of these ailments, the same has been replicated in the detection of precancerous lesions.</p>
        <p>The proposed model utilizes the same approach, using pre-cancerous lesions to aid in the detection of imminent progression. With an accuracy of 86%, the proposed R-CNN model demonstrates its ability to detect the presence of cervical pre-cancerous traits and could greatly aid in the diagnosis of imminent progression of cervical precancerous traits.</p>
        <p>To ensure higher classification accuracies, the proposed R-CNN model made use of sufficiently large training and testing samples. The use of sufficient training samples was meant not only to ensure that the model gains the relevant characteristic traits of cervical pre-cancerous lesions but also to make it generalizable, which is crucial in the detection process.</p>
        <p>Despite the high accuracy rating, the detection of cervical precancerous traits is not always a straightforward feat. Low-quality, low-resolution images and the presence of foreign objects such as an IUD or hair make the detection algorithm susceptible to erroneous classifications. To ensure a higher accuracy rating, careful selection and preparation of the images should be undertaken.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>ACKNOWLEDGMENTS</title>
      <p>The authors would like to thank all reviewers and editors for their comments and contributions to this paper. We also acknowledge the Kaggle and UCI dataset repositories and all subjects whose cervical images were used for learning purposes. We also recognize the input of medical gynecologic oncology expert Prof. Omenge E. Orango, Moi University School of Medicine, towards the success of this research.</p>
      <p>F. Islami, L. A. Torre, J. M. Drope, E. M. Ward, and A. Jemal,
“Global cancer in women: Cancer control priorities,” Cancer
Epidemiol. Biomarkers Prev., vol. 26, no. 4, pp. 458–470, 2017,
doi: 10.1158/1055-9965.EPI-16-0871.</p>
      <p>
        World Health Organization, Guide to cancer early
diagnosis. <xref ref-type="bibr" rid="ref11 ref24">2017</xref>.
“Kenya Cancer Statistics &amp; National Strategies,” Kenyan Network
of Cancer Organizations, Feb. 18, 2013.
https://kenyacancernetwork.wordpress.com/kenya-cancer-facts/
(accessed Aug. 13, 2020).
“Screening for Cervical Cancer Using Automated
Analysis of PAP-Smears,” ResearchGate.
https://www.researchgate.net/publication/261923436_Screening_for_Cervical_Cancer_Using_Automated_Analysis_of_PAP-Smears
(accessed Aug. 14, 2020).
“Wei <xref ref-type="bibr" rid="ref20">et al. - 2017</xref> -
Cervical cancer histology image identification met.pdf.”
      </p>
      <p>X. Q. Zhang and S. G. Zhao, “Cervical image classification based
on image segmentation preprocessing and a CapsNet network
model,” Int. J. Imaging Syst. Technol., vol. 29, no. 1, pp. 19–28,
2019, doi: 10.1002/ima.22291.
“National-Cancer-Screening-Guidelines-2018.pdf.” Accessed: Aug.
13, 2020. [Online]. Available:
https://www.health.go.ke/wp-content/uploads/2019/02/National-Cancer-Screening-Guidelines-2018.pdf.</p>
      <p>
        L. Wei, Q. Gan, and T. Ji, “Cervical cancer histology image
identification method based on texture and lesion area features,”
Comput. Assist. Surg., vol. 22, no. sup1, pp. 186–199,
Oct. <xref ref-type="bibr" rid="ref20">2017</xref>,
doi: 10.1080/24699322.2017.1389397.
“Intel &amp; MobileODT Cervical Cancer Screening.”
https://kaggle.com/c/intel-mobileodt-cervical-cancer-screening
(accessed Aug. 14, 2020).
      </p>
      <p>World Health Organization, Ed., WHO guidelines for screening and
treatment of precancerous lesions for cervical cancer prevention.
Geneva: World Health Organization, 2013.</p>
      <p>
        K. Fernandes, D. Chicco, J. S. Cardoso, and J. Fernandes,
“Supervised deep learning embeddings for the prediction of
cervical cancer diagnosis,” PeerJ Comput. Sci., vol. 4, p. e154,
<xref ref-type="bibr" rid="ref6 ref7">May 2018</xref>,
doi: 10.7717/peerj-cs.154.
“Maini and Aggarwal - 2010 - A Comprehensive Review of Image
Enhancement Techni.pdf.”
“A Novel Analysis of Clinical Data and Image Processing
Algorithms in Detection of Cervical Cancer,” ResearchGate.
https://www.researchgate.net/publication/277667329_A_Novel_Analysis_of_Clinical_Data_and_Image_Processing_Algorithms_in_Detection_of_Cervical_Cancer
(accessed Aug. 14, 2020).
      </p>
      <p>M. Sato et al., “Application of deep learning to the classification of
images from colposcopy,” Oncol. Lett., Jan. 2018, doi:
10.3892/ol.2018.7762.</p>
      <p>
        N. Muinga et al., “Digital health Systems in Kenyan Public
Hospitals: a mixed-methods survey,” BMC Med. Inform. Decis.
Mak., vol. 20, no. 1, p. 2, De
        <xref ref-type="bibr" rid="ref12">c. 2020</xref>
        , doi:
10.1186/s12911-0191005-7.
      </p>
      <p>M. Nielsen, “Neural networks and deep learning.” 2019.</p>
      <p>A. Mittal and M. Juneja, “Cervix Cancer Classification using
Colposcopy Images by Deep Learning Method,” Int. J. Eng.
Technol. Sci. Res., vol. 5, no. 3, pp. 426–432, 2018.</p>
      <p>
        C. Data Science, “An Intuitive Explanation of Convolutional
Neural Networks – the data science blog.”
<xref ref-type="bibr" rid="ref11 ref24">2017</xref>.
      </p>
      <p>R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi,
“Convolutional neural networks: an overview and application in
radiology,” Insights Imaging, vol. 9, no. 4, pp. 611–629, 2018, doi:
10.1007/s13244-018-0639-9.</p>
      <p>H. A. Almubarak et al., “Convolutional Neural Network Based
Localized Classification of Uterine Cervical Cancer Digital
Histology Images.,” Procedia Comput. Sci., vol. 114, pp. 281–287,
2017, doi: 10.1016/j.procs.2017.09.044.</p>
      <p>
        Guillaume Berger, “CS231n Convolutional Neural Networks for
Visual Recognition.” <xref ref-type="bibr" rid="ref20">2016</xref>.
      </p>
      <p>J. Brownlee, “A Gentle Introduction to the Rectified Linear Unit
(ReLU),” Machine Learning Mastery, Jan. 08, 2019.
https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/
(accessed Aug. 14, 2020).
“CS231n Convolutional Neural Networks for Visual Recognition.”
https://cs231n.github.io/convolutional-networks/ (accessed Aug.
14, 2020).
</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre, and A. Jemal, “Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries,” CA. Cancer J. Clin., vol. 68, no. 6, pp. 394–424, 2018, doi: 10.3322/caac.21492.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>“Convolutional Neural Network. In this article, we will see what are… | by Arunava | Towards Data Science.” https://towardsdatascience.com/convolutional-neural-network-17fb77e76c05 (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>21, 2019. https://machinelearningmastery.com/pooling-layers-for-convolutional-neural-networks/ (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>“Litjens et al. - 2017 - A Survey on Deep Learning in Medical Image Analysi.pdf.”</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>J. Tompson, A. Jain, Y. LeCun, and C. Bregler, “Joint training of a convolutional network and a graphical model for human pose estimation,” Adv. Neural Inf. Process. Syst., vol. 2, no. January, pp.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>M. R. Minar and J. Naher, “Recent Advances in Deep Learning: An Overview,” vol. 2006, pp. 1–31, 2018, doi: 10.13140/RG.2.2.24831.10403.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>M. Wu, C. Yan, H. Liu, Q. Liu, and Y. Yin, “Automatic classification of cervical cancer from cytological images by using convolutional neural network,” Biosci. Rep., vol. 38, no. 6, pp. 1–9, 2018, doi: 10.1042/BSR20181769.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>https://www.springerprofessional.de/en/detecting-driver-drowsiness-in-real-time-through-deep-learning-b/16772484 (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>Prabhu, “Understanding of Convolutional Neural Network (CNN) - Deep Learning,” Medium, Nov. 21, 2019.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148 (accessed Aug. 27, 2020).</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>G. Litjens et al., “A Survey on Deep Learning in Medical Image Analysis,” Med. Image Anal., vol. 42, pp. 60–88, Dec. 2017, doi: 10.1016/j.media.2017.07.005.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the Inception Architecture for Computer Vision,” ArXiv151200567 Cs, Dec. 2015, Accessed: Aug. 15, 2020.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[Online]. Available: http://arxiv.org/abs/1512.00567.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>R. Barth, J. Hemming, and E. J. Van Henten, “Optimising realism of synthetic images using cycle generative adversarial networks for improved part segmentation,” Comput. Electron. Agric., vol. 173, p. 105378, Jun. 2020, doi: 10.1016/j.compag.2020.105378.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>S. Bosse, D. Maniry, T. Wiegand, and W. Samek, “Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; Department of Electrical Engineering, Technical University of Berlin, Germany,” pp. 1–5.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>“Find Open Datasets and Machine Learning Projects | Kaggle.” https://www.kaggle.com/datasets (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>Natu, “The functional neuroanatomy of face perception: from brain measurements to deep neural networks,” Interface Focus, vol. 8, no. 4, p. 20180013, Aug. 2018, doi: 10.1098/rsfs.2018.0013.</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>“Multimodal Deep Learning for Cervical Dysplasia Diagnosis | Request PDF,” ResearchGate.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>https://www.researchgate.net/publication/308816998_Multimodal_Deep_Learning_for_Cervical_Dysplasia_Diagnosis (accessed Aug.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>T. Xu et al., “Multi-feature based Benchmark for Cervical Dysplasia Classification Evaluation,” Pattern Recognit., vol. 63, pp. 468–475, Mar. 2017, doi: 10.1016/j.patcog.2016.09.027.</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>“Bergstra and Bengio - Random Search for Hyper-Parameter Optimization.pdf.” Accessed: Aug. 14, 2020. [Online]. Available: https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>M. Z. Alom, M. Hasan, C. Yakopcic, and T. M. Taha, “Inception Recurrent Convolutional Neural Network for Object Recognition,” ArXiv170407709 Cs, Apr. 2017, Accessed: Aug. 14, 2020.</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[Online]. Available: http://arxiv.org/abs/1704.07709.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>G. Larsson, M. Maire, and G. Shakhnarovich, “FractalNet: Ultra-Deep Neural Networks without Residuals,” ArXiv160507648 Cs, May 2017, Accessed: Aug. 14, 2020. [Online]. Available: http://arxiv.org/abs/1605.07648.</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>https://www.researchgate.net/publication/338402067_Neural_Network_Based_Rhetorical_Status_Classification_for_Japanese_Judgment_Documents (accessed Aug. 14, 2020).</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>