<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Demonstrating Machine Learning for Cancer Diagnostics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paul Walsh</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jennifer Lynch</string-name>
          <email>jennifer.lynch@nsilico.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brian Kelly</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cintia Palu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Onofrio Gigliotta</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raffaele Di Fuccio</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>NSilico Life Science</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Naples</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>185</fpage>
      <lpage>194</lpage>
      <abstract>
        <p>This paper describes how machine learning systems can be explained and demystified for non-technical audiences through the use of an online simulation. This research is the result of a European Union funded project, SageCare, which focuses on developing machine learning systems to classify clinical and genomic data. In disseminating the use of machine learning to non-specialists, we often encounter resistance to, or suspicion about, the veracity of the approach. Hence, we introduce artificial intelligence and machine learning for non-specialists and present a case study and an interactive simulation showing how machine learning can be used in cancer diagnostics. The simulation system serves as a basis for informing clinical practitioners how machine learning can be used to build diagnostic models. We also describe how feedback from users will be gathered and analyzed to assess how machine learning is viewed in such an application.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The SageCare Project [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] tackles the important area of personalized medicine by
addressing health informatics holistically: it creates a platform that interlinks
spatially distributed clinical care information sources, electronic health records (EHRs) and associated genomic
sequences, thereby allowing clinicians to make reasoned queries using machine learning
over vast knowledge bases of health and research data. Success requires bringing together a number of
disciplines and skills, including
clinicians active in the diagnosis and treatment of cancer. To gauge the effectiveness of
machine learning in the domain of cancer diagnostics, a JavaScript simulation, based
on a simulator developed by [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], is configured to build a machine learning model using
a real cancer data set. The simulation serves as a basis for explaining the dynamics of
machine learning to potential end users.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Cancer Diagnostics</title>
      <p>Cancer has a huge impact on patients and their families, so
understanding how artificial intelligence (AI) can be leveraged to aid diagnosis is important
in helping to reduce the burden of this disease. This paper outlines
how machine learning driven AI can be used to aid the diagnosis of
cancer by building a model that assesses visual input features of cell nuclei. It also
serves as a useful example for non-specialists interested in AI, helping them to understand
the dynamics of machine learning algorithms and how to assess their
performance.</p>
      <p>
        Breast cancer is one of the most commonly occurring cancers, with over 2 million
new cases diagnosed globally every year [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. While around 5% to 10% of cases are due
to inherited genes, such as variants of BRCA [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], there is a higher risk of developing
this form of cancer linked to lifestyle factors such as alcohol consumption and obesity
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. For example, overweight women have an increased invasive breast cancer risk
versus women of normal weight [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. However, the major risk factors associated with this disease
are age, due to the likelihood of mutations caused by cell division, and gender, as breast
cancer mainly affects women. Breast cancer frequently occurs in the cells lining the
milk ducts, where it is referred to as ductal carcinoma, and in the tissue that produces the
milk supplied to these ducts, where it is referred to as lobular carcinoma
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Diagnosing such carcinomas involves taking a biopsy of cells from the site in
question, which may be deep within the breast tissue. Early diagnosis is key to the effective
treatment of such cancers, as studies have shown increases in cancer survival due to
advances in early detection and treatment [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], so performing an effective assessment is
critical. X-rays of the breast, known as mammograms, are frequently used as a screening
method to identify potentially cancerous growths, along with physical
examination to determine whether there is a need for further investigation. Suspect tissue is often
biopsied using fine needle aspiration, whereby a narrow hollow needle is inserted into
the tissue to collect a sample of cells [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. An example image of an invasive ductal
carcinoma biopsy is given in Figure 1.
These cells are then prepared for examination by a pathologist, who examines the
characteristics of individual cells, as many different cell features are thought to be highly
correlated with malignancy [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Malignant cells tend to be irregular compared to normal
cells, so larger values for features related to shape, such as symmetry, fractal dimension and
concavity, tend to indicate that the cells are cancerous. It is possible to use machine vision
to detect such cell features from biopsies via a digital microscope. This is the basis of
the widely studied Wisconsin breast cancer dataset, where 569 biopsies were collected
and the following ten geometric features calculated for cells in each of the samples [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]:
1. The radius of the cell nucleus.
2. The perimeter of the cell nucleus.
3. The area of the cell nucleus.
4. The perimeter and area are combined to give a measure of the compactness of the
cell nuclei using the formula:
<disp-formula><tex-math>\text{compactness} = \frac{\text{perimeter}^{2}}{\text{area}} - 1</tex-math></disp-formula>
Cell nuclei that have an irregular shape will have a higher measure of compactness.
5. The smoothness as measured by the difference between the radii across the cell
nucleus.
6. The number and severity of concave features around the cell nucleus.
7. The number of concave points around the cell nucleus.
8. A measure of symmetry, sampled at points around the cell nucleus.
9. A measure of the fractal dimension along the cell.
10. The texture of the cell nucleus, measured as the grayscale intensity variation
across pixels within the cell nucleus.
      </p>
      <p>The mean, maximum and standard error of each feature are computed for each image to
give a total of 30 input features per sample, which are suitable for machine learning.</p>
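      <p>As a rough illustration of how per-nucleus measurements could be summarised into per-image input features, the following Python sketch computes the mean, maximum and standard error for a few features. The feature subset, the function names and the sample measurements are hypothetical and are not taken from the dataset; the standard error is assumed to be the sample standard deviation divided by the square root of the number of nuclei.</p>
      <preformat>
```python
import math
import statistics

# Hypothetical subset of the ten geometric features, for brevity.
FEATURES = ["radius", "perimeter", "area"]


def summarise(values):
    """Return (mean, maximum, standard error) of one feature across nuclei."""
    n = len(values)
    mean = statistics.fmean(values)
    # Assumed definition: sample standard deviation divided by sqrt(n).
    std_err = statistics.stdev(values) / math.sqrt(n) if n > 1 else 0.0
    return mean, max(values), std_err


def image_features(nuclei):
    """Flatten per-nucleus measurements into one feature dictionary per image."""
    vector = {}
    for name in FEATURES:
        values = [nucleus[name] for nucleus in nuclei]
        mean, largest, std_err = summarise(values)
        vector[name + "_mean"] = mean
        vector[name + "_max"] = largest
        vector[name + "_se"] = std_err
    return vector


# Made-up measurements for three nuclei in one image (illustrative only).
nuclei = [
    {"radius": 10.0, "perimeter": 64.0, "area": 310.0},
    {"radius": 12.0, "perimeter": 76.0, "area": 450.0},
    {"radius": 14.0, "perimeter": 88.0, "area": 610.0},
]
features = image_features(nuclei)
print(len(features))  # three features times three statistics
```
      </preformat>
      <p>With all ten features, the same procedure would yield the 30 values per sample described above.</p>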
    </sec>
    <sec id="sec-3">
      <title>Machine Learning Approach</title>
      <p>
        Machine learning is a computational approach to AI that uses algorithms that iterate
over datasets to build statistical models [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Machine learning techniques can be
broadly classified as supervised, which use labelled input data to train a model, or
unsupervised, which cluster data into related groups. The power of supervised
machine learning is its ability to generalise, that is, to correctly classify unseen data based on
models built using training data. We use a Support Vector Machine (SVM) to build a
machine learning model for the Wisconsin breast cancer dataset, using a portion of the
data (80%) for training and the rest for testing the model (20%).
      </p>
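      <p>The 80%/20% split can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the simulation: the function name, the fixed random seed and the use of integers as stand-ins for biopsy records are all assumptions.</p>
      <preformat>
```python
import random


def train_test_split(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it into training and test sets."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for the 569 biopsy records.
samples = list(range(569))
train, test = train_test_split(samples)
print(len(train), len(test))
```
      </preformat>
      <p>Shuffling before splitting avoids a test set that reflects only the order in which samples were recorded.</p>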
      <p>
        The SVM is a supervised learning algorithm that has been shown to have good
performance as a classifier [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The SVM algorithm trains by iterating over a set of
labelled samples, in this case entries from the Wisconsin breast cancer dataset
that are labelled as either benign or malignant. A good way to explain the operation
of machine learning is to use a two-dimensional input feature space, as this allows us to
more easily visualize the decision boundary that the algorithm produces. Figure 3
shows a number of examples from the Wisconsin breast cancer dataset, plotting the radius
feature on the x-axis against the texture feature on the y-axis. An SVM algorithm finds
an optimal decision boundary by finding the data points, known as support vectors, that
maximise the separation between classes.
      </p>
      <p>
        One approach to gauging the performance of the classifier is to compute the F1 score,
which is a useful measure of the level of precision and recall in a machine learning
system [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Precision is the proportion of instances classified as positive that are
truly positive, while recall, or sensitivity, is the proportion of all truly positive
instances that are correctly classified as positive. An
algorithm with high precision over a data set will return more relevant results than irrelevant
ones. For cancer diagnosis, this is critical as both false positive and false negative
errors should be avoided. In particular, a false negative result should be avoided as it
could result in missed life-saving treatment. Precision can be thought of as the
ratio of correctly classified true positives tp over the sum of true positives tp and falsely
classified positives fp:
<disp-formula><tex-math>\text{precision} = \frac{tp}{tp + fp}</tex-math></disp-formula>
      </p>
      <p>An algorithm with high recall will classify most of the relevant data correctly. Recall
can be thought of as the ratio of correctly classified true positives tp over the sum of
true positives tp and false negatives fn (the number of positive instances falsely classified as
negative):
<disp-formula><tex-math>\text{recall} = \frac{tp}{tp + fn}</tex-math></disp-formula></p>
      <p>There is a trade-off between precision and recall, as it is possible to have an algorithm
with high precision but low recall and vice versa. For example, the algorithm may be
precise by correctly classifying a subset of malignant breast cancer cases; however, it
could achieve this by being stringent in its classification and so exclude many other
malignant cases, which would give it a low recall.</p>
      <p>
        The balance between precision and recall can be captured using an F1 score which
is the harmonic mean of the precision and recall scores, where a score of 1 indicates
perfect precision and recall [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
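      <p><disp-formula><tex-math>F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}</tex-math></disp-formula></p>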
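      <p>The definitions of precision, recall and the F1 score in this section can be checked with a short Python sketch. The function names and the toy labels are illustrative assumptions, and malignant is treated as the positive class.</p>
      <preformat>
```python
def confusion_counts(y_true, y_pred, positive="malignant"):
    """Count true positives, false positives and false negatives."""
    tp = fp = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
    return tp, fp, fn


def f1_report(y_true, y_pred, positive="malignant"):
    """Return (precision, recall, F1); each is 0.0 when its denominator is zero."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Toy labels, not taken from the dataset.
y_true = ["malignant", "malignant", "malignant", "benign", "benign", "malignant"]
y_pred = ["malignant", "malignant", "benign", "malignant", "benign", "benign"]
precision, recall, f1 = f1_report(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```
      </preformat>
      <p>In this toy case two malignant samples are missed, so recall (0.5) is lower than precision (about 0.667), and the F1 score (about 0.571) sits between the two.</p>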
      <p>The machine learning model should be trained in such a way that the algorithm does
not overfit, which occurs when the algorithm fits a decision boundary tightly to all of
the data, including any noise in the training data, so that it generalises poorly to
unseen input. To avoid over-estimating model performance, a test data set is held
back and used as the final unbiased measure of the algorithm’s performance on
unseen data. A model that produces a high score on the training set but a low score on
the test set has overfitted, while a model that produces a high score on both the training set and
the test set should provide good classifications. A model that underfits,
by failing to find any useful decision boundary, will perform poorly on both data sets.</p>
      <p>The simulation in Figure 3 shows the F1 scores for the algorithm on the training set
and the test set, thereby allowing users to gauge the performance of the algorithm. This
also challenges the user to investigate how tuning the hyper-parameters for a machine
learning algorithm affects its performance and so will enhance their understanding of
the dynamics of the problem.</p>
      <p>In the example demonstrated to the non-specialist audience, a two-dimensional
feature space was presented. The simulator developed by Karpathy was enhanced to allow
users to select which features they would like to evaluate. Figure 3 shows the use of a linear
kernel when trying to find the ideal separation, along with the F1 score for both the training
set and the test set. The choice of kernel is an important machine learning
hyperparameter; practitioners need to consider whether the data set is linearly separable or
not. This simulation is presented to users as a game, where the goal is to reach a
perfect F1 score of 1 on the test data set.</p>
      <p>
        The use of a non-linear kernel is shown in Figure 3. Choosing a non-linear kernel for
a linearly separable data set will tend to cause the model to overfit the data, which will reduce its
ability to generalize, as indicated by a poor F1 score on the test data set.
SVMs use a technique known as the kernel trick, which implicitly maps data points to a higher
dimensional space where a linear separation may be found [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
A support vector machine can be tuned via a cost parameter, denoted C, which penalises
the algorithm for points that fall within the margin. A small value of C imposes a low
penalty for misclassification, thereby allowing a "soft margin", which promotes better
generalisation at the risk of lower precision. A large value of C imposes a high cost for
misclassification, thereby producing a "hard margin", which promotes higher precision
but poorer generalisation and recall. The JavaScript framework [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] allows users to
modify the cost parameter C and challenges them to find a balance that maximises the F1 score.
      </p>
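      <p>One way to see the role of C is through the standard textbook soft-margin objective, in which C weights the total hinge loss of points that violate the margin. The Python sketch below assumes this primal formulation for illustration only; it is not necessarily how the JavaScript framework computes its solution, and the weights and data points are made up.</p>
      <preformat>
```python
def hinge_loss(w, b, x, y):
    """Hinge loss for one sample; y is +1 or -1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)


def svm_objective(w, b, data, C):
    """Soft-margin objective: 0.5 * ||w||^2 + C * sum of hinge losses."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    penalty = sum(hinge_loss(w, b, x, y) for x, y in data)
    return margin_term + C * penalty


# Made-up points: the third lies inside the margin of the boundary x1 = 0.
data = [((2.0, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 0.0), 1)]
w, b = (1.0, 0.0), 0.0

soft = svm_objective(w, b, data, C=0.1)   # margin violation is cheap
hard = svm_objective(w, b, data, C=10.0)  # margin violation dominates
print(round(soft, 3), round(hard, 3))
```
      </preformat>
      <p>For the same boundary, the single margin violation adds 0.05 to the objective when C is 0.1 but 5.0 when C is 10, so a large C pushes the optimiser towards boundaries that tolerate fewer violations.</p>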
      <p>For the non-linear kernel, the Karpathy SVM JavaScript framework uses a Gaussian
radial basis function, which allows the SVM algorithm to fit the maximum margin
separating hyperplane in a transformed input feature space. The radial basis function is
controlled by the parameter sigma (σ), which determines the influence that feature vectors
have on the kernel mapping. Intuitively, low values of sigma narrow the region of
influence of the kernel for vectors in the feature space, which can cause the SVM to
overfit the data. High values of sigma widen the region of influence, making the
algorithm better at generalizing at the expense of precision. Users can experiment
with features, kernels and hyperparameters as shown in the adaptation of Karpathy’s
software; see Figure 3 (b). Communicating the effect of σ and other parameters to
non-technical audiences in a visual manner supports the objective of this study: to
investigate how an interactive tool enhances their understanding and use of machine learning
tools.</p>
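      <p>The effect of sigma can be illustrated with a small Python sketch of a Gaussian radial basis function. The form exp(-d^2 / (2 sigma^2)), where d is the Euclidean distance between two feature vectors, is the common textbook definition and is assumed here; the function name and the example points are hypothetical.</p>
      <preformat>
```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian RBF similarity between two feature vectors (assumed form)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Two made-up points whose squared distance is 2.
x, z = (0.0, 0.0), (1.0, 1.0)
similarities = {sigma: rbf_kernel(x, z, sigma) for sigma in (0.5, 1.0, 2.0)}
for sigma, value in similarities.items():
    print(sigma, round(value, 4))
```
      </preformat>
      <p>For the same pair of points, the similarity rises from about 0.018 at sigma 0.5 to about 0.779 at sigma 2.0, which is the widening region of influence described above.</p>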
    </sec>
    <sec id="sec-4">
      <title>Understanding AI</title>
      <p>In the last decade, we have witnessed a growing interest in AI applications. Numerous
advertisements for a variety of everyday objects (e.g. mobile phones, vacuum cleaners,
thermostats etc.) present AI as an important added value. Indeed, it is. With AI-powered
cameras we can get better photographs, with an autonomous vacuum cleaner we
can gain more spare time and with a smart thermostat we can save a lot of money on
heating bills.</p>
      <p>However, despite this growing interest, AI is very often perceived by the general
public as a sort of magic or, to borrow the words of Arthur C. Clarke, as an
advanced technology indistinguishable from magic. The problem with such a perception
is that it presents the technology’s inner mechanisms as incomprehensible.
Rather, AI is a powerful tool that can be harnessed for personal or professional
purposes, even by lay people, thanks to off-the-shelf software packages in which users are
asked only to add their data. Adding data alone, however, could simplify the process
too much, preventing users from grasping the inner mechanics of what they are using
and leading to potential mistakes or misuse.</p>
      <p>AI undoubtedly represents an effective tool for solving or simplifying many relevant
problems in our daily lives. For this reason, we should disclose as much as possible
about how certain algorithms work. Doing so can improve a
basic understanding of a particular algorithm, whether a user simply wishes to better
grasp a topic or is interested in using that kind of technology with an
increased awareness for their own purposes. Explaining the technology to potential users
provides the following benefits, as they can:
1) make better use of the tools,
2) better understand the problems it can and cannot solve and
3) make a more informed assessment and evaluation of the produced solution.</p>
    </sec>
    <sec id="sec-5">
      <title>Exploratory focus group</title>
      <p>
        In order to understand which kind of information should be conveyed about
intelligent technology in general, and about a classifier system such as that presented
in this paper in particular, we organized a focus group, held in Rome in December 2018, with a small
number of participants. Focus groups have a long tradition in the behavioral sciences, where
they have been used to understand how an issue or a product is perceived by a group of
people [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>Seven participants (1 woman and 6 men) with an average age of 37.57 (SD = 6.23)
and a background in AI research were identified within an Italian research center. For
the aim of this exploratory work, participants were chosen for their unfamiliarity with
support vector machines, although they were well-versed in other AI techniques such as neural
networks and genetic algorithms. The focus group was organized with the following
structure:
1) Short introduction on the topic of the focus group
2) Brief presentation of the participants
3) Discussion of the topic:
• Question 1: How do you think people perceive AI?
• Question 2: Do you think it is necessary to explain how AI works?
• Question 3: Which features of a classifier system should be stressed for
educational purposes?
4) Presentation of SVMs through the software we developed
5) Request of feedback on the software as an educational tool</p>
    </sec>
    <sec id="sec-6">
      <title>Qualitative results</title>
      <p>The focus group lasted three hours and stimulated a very interesting discussion on the
general importance of AI in our lives and the features that should be shared in order to
increase people’s awareness of specific algorithms.</p>
      <p>The first question raised a dualistic view of AI. According to the participants, people seem to
see AI either as the ultimate evil or as magic. Both polarized views, however, lack a
realistic perspective, and all the participants agreed that AI is at the moment
an inflated term used for marketing purposes.</p>
      <p>The second question divided the participants. Three of them underlined that
understanding many AI algorithms requires a strong background in maths, hence it is
impossible to convey such concepts to a general public who are likely to lack
specific hard skills. The rest of the group, in different ways, highlighted the need to explain
how algorithms work. In particular, two proposals emerged on how best to explain how
AI works to members of the general public: a) using very simple language without
mathematical jargon; and b) demonstrating algorithms with
micro-educational software in which users can manipulate data and parameters.</p>
      <p>The final question first elicited a series of answers related to the fact that the
outcome of a classifier system, regardless of the algorithm being used, is strictly
connected to the data we put in. Secondly, although very complicated maths is sometimes
required to understand the specific aspects of an algorithm, an extremely simple
formula can often be used to evaluate the outcome of a classifier (for example, precision
and recall).</p>
      <p>After the discussion raised by the first three questions, we presented our software
(by explaining the objectives and the algorithm behind it) to the participants and asked
them if it was, in their opinion, viable for educational purposes.</p>
      <p>Participants appreciated two aspects of the software: 1) that it is web-based and
runs seamlessly on mobile phones, and 2) that the two windows make it easy
to see what happens to the outcome when different parameters are applied to
the underlying algorithm. The graphical aspect was less appreciated: participants
suggested improving the graphics in order to make the training and testing sets more
visible.</p>
      <p>An overall positive view emerged about the possible use of the software as
an educational tool. Although not experts in SVMs, the participants understood how this type of
classifier works in a relatively simplified way (recall that the participants
shared an AI background).</p>
    </sec>
    <sec id="sec-7">
      <title>Future research</title>
      <p>AI and its applied arm, machine learning, are becoming an important part of our daily
life. Our mobile phones recognize our voice commands, and AI-powered cameras
can take pictures of professional quality. Applications, however, are not limited to
our spare time. AI can also be added to the toolkit of our professions: a biologist or a
psychologist, for example, could exploit a machine learning solution for their own
purposes. In particular, a biologist could use a classifier such as the one described in this
paper to classify their own data points or re-run the algorithm with newly collected
data. Doing so does not require being a data scientist, only the ability to use
an off-the-shelf solution with proper awareness.</p>
      <p>Qualitative data collected in an exploratory focus group seem to suggest that our
approach goes in this direction; however, in order to evaluate its effectiveness we need
quantitative data. Gathering these quantitative data will be the objective of the next step
of this research.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>PW is supported by funding through Science Foundation Ireland (SFI) Multi-Gene
Assay Cloud Computing Platform (16/IFA/4342). BK, JL, CP, OG and RDF are funded
under H2020 RISE project SemAntically integrating Genomics with Electronic health
records for Cancer CARE (SageCare), grant number 644186.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1
          <string-name>
            <given-names>S.</given-names>
            <surname>Consortium</surname>
          </string-name>
          ,
          <article-title>"Periodic Reporting for period 1 - SAGE-CARE (SemAntically integrating Genomics with Electronic health records for Cancer CARE),"</article-title>
          <source>H2020 CORDIS European Commission</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2
          <string-name>
            <given-names>A.</given-names>
            <surname>Karpathy</surname>
          </string-name>
          ,
          <article-title>"Support Vector Machine in Javascript,"</article-title>
          [Online]. https://cs.stanford.edu/people/karpathy/svmjs/demo/.
          [Accessed August
          <year>2018</year>
          ].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3
          <string-name>
            <given-names>T.</given-names>
            <surname>Vos</surname>
          </string-name>
          et al.,
          <article-title>"Global, regional, and national incidence, prevalence, and years lived with disability for 310 diseases and injuries, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015,"</article-title>
          <source>The Lancet</source>
          , vol.
          <volume>388</volume>
          , no.
          <issue>10053</issue>
          , pp.
          <fpage>1545</fpage>
          -
          <lpage>1602</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>King</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Marks</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Mandell</surname>
          </string-name>
          ,
          <article-title>"Breast and ovarian cancer risks due to inherited mutations in BRCA1 and BRCA2,"</article-title>
          <source>Science</source>
          , vol.
          <volume>302</volume>
          , no.
          <issue>5645</issue>
          , pp.
          <fpage>643</fpage>
          -
          <lpage>646</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Morimoto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Chlebowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hays</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kuller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Lopez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Manson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. L.</given-names>
            <surname>Margolis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. C.</given-names>
            <surname>Muti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Stefanick</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>McTiernan</surname>
          </string-name>
          ,
          <article-title>"Obesity, body size, and risk of postmenopausal breast cancer: the Women's Health Initiative (United States),"</article-title>
          <source>Cancer Causes &amp; Control</source>
          , vol.
          <volume>13</volume>
          , no.
          <issue>8</issue>
          , pp.
          <fpage>741</fpage>
          -
          <lpage>751</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Neuhouser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Aragaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Prentice</surname>
          </string-name>
          et al.,
          <article-title>"Overweight, Obesity, and Postmenopausal Invasive Breast Cancer Risk,"</article-title>
          <source>JAMA oncology</source>
          , vol.
          <volume>1</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>611</fpage>
          -
          <lpage>621</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7
          <string-name>P. A. T. E. B. P. B. C. T. B. M. N. C. I. NCI</string-name>
          ,
          <article-title>"Breast Cancer Treatment (PDQ®)-Patient Version,"</article-title>
          <year>2019</year>
          . [Online]. Available: https://www.cancer.gov/types/breast/patient/breasttreatment-pdq. [Accessed 2 2 2019].
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Siegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Mariotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Kramer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Rowland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Alteri</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Jemal</surname>
          </string-name>
          ,
          <article-title>"Cancer treatment and survivorship statistics, 2016,"</article-title>
          <source>CA: A Cancer Journal for Clinicians</source>
          , vol.
          <volume>66</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>271</fpage>
          -
          <lpage>289</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9
          <string-name>
            <given-names>M.</given-names>
            <surname>Wu</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Burstein</surname>
          </string-name>
          ,
          <article-title>"Fine needle aspiration,"</article-title>
          <source>Cancer Investigation</source>
          , vol.
          <volume>22</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>620</fpage>
          -
          <lpage>628</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10
          <string-name>
            <given-names>W. N.</given-names>
            <surname>Street</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. H.</given-names>
            <surname>Wolberg</surname>
          </string-name>
          and
          <string-name>
            <given-names>O. L.</given-names>
            <surname>Mangasarian</surname>
          </string-name>
          ,
          <article-title>"Nuclear Feature Extraction for Breast Tumor Diagnosis,"</article-title>
          <source>Biomedical Image Processing and Biomedical Visualization</source>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11
          <string-name>
            <given-names>W. N.</given-names>
            <surname>Street</surname>
          </string-name>
          et al.,
          <article-title>"Breast Cancer Wisconsin (Diagnostic) Data Set,"</article-title>
          [Online]. Available: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic). [Accessed August 2018].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12
          <string-name>
            <given-names>A.</given-names>
            <surname>Samuel</surname>
          </string-name>
          ,
          <article-title>"Some Studies in Machine Learning Using the Game of Checkers,"</article-title>
          <source>IBM Journal of Research and Development</source>
          , vol.
          <volume>3</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>210</fpage>
          -
          <lpage>229</lpage>
          ,
          <year>1959</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13
          <string-name>
            <given-names>C.</given-names>
            <surname>Cortes</surname>
          </string-name>
          and
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          ,
          <article-title>"Support-vector networks,"</article-title>
          <source>Machine Learning</source>
          , vol.
          <volume>20</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>273</fpage>
          -
          <lpage>297</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14
          <string-name>
            <given-names>D. M. W.</given-names>
            <surname>Powers</surname>
          </string-name>
          ,
          <article-title>"Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness &amp; Correlation,"</article-title>
          <source>Journal of Machine Learning Technologies</source>
          , vol.
          <volume>2</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>63</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sasaki</surname>
          </string-name>
          ,
          <article-title>"The truth of the F-measure,"</article-title>
          <source>Teach Tutor Mater</source>
          , vol.
          <volume>1</volume>
          , no.
          <issue>5</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16
          <string-name>
            <given-names>B. E.</given-names>
            <surname>Boser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. M.</given-names>
            <surname>Guyon</surname>
          </string-name>
          and
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Vapnik</surname>
          </string-name>
          ,
          <article-title>"A training algorithm for optimal margin classifiers,"</article-title>
          <source>Proceedings of the Fifth Annual Workshop on Computational Learning Theory</source>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Stewart</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Shamdasani</surname>
          </string-name>
          ,
          <source>Focus Groups: Theory and Practice</source>
          , 3rd ed.,
          <publisher-name>Sage</publisher-name>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Pike</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. V.</given-names>
            <surname>Spicer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dahmoush</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Press</surname>
          </string-name>
          ,
          <article-title>"Estrogens, progestogens, normal breast cell proliferation, and breast cancer risk,"</article-title>
          <source>Epidemiologic Reviews</source>
          , vol.
          <volume>15</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>35</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>