<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Clustering Cancer Drugs According to their Mechanisms of Action</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Syed Abdullah Ali</string-name>
          <email>syeabdullah.ali@postgrad.manchester.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riza Batista-Navarro</string-name>
          <email>riza.batista@manchester.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Manchester</institution>
          ,
          <addr-line>Oxford Road, Manchester, M13 9PL</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This research investigates similarities between cancer-related drugs with respect to their mechanisms of action (MOA), guided by information extracted from two resources: scientific literature and ontologies. To find similarity between drug pairs, the Chemical Entities of Biological Interest (ChEBI) ontology and the Gene Ontology (GO) were leveraged to compute drug and drug target features, respectively. A graph of drugs was formed from these drug pairs and clustered using two unsupervised graph clustering algorithms: Chinese Whispers and the Louvain Method. As a result of clustering, drugs that share the same dominant MOA were placed in the same cluster. Additionally, the most prominent drugs in the entire graph and within each cluster were identified according to graph centrality measures. The quality of the clusters was assessed by calculating silhouette coefficient values, checking the consistency of the results generated by the two algorithms, and enlisting a domain expert who carried out a manual evaluation.</p>
      </abstract>
      <kwd-group>
        <kwd>Mechanisms of Action</kwd>
        <kwd>Drug Clustering</kwd>
        <kwd>Chinese Whispers</kwd>
        <kwd>Louvain Method</kwd>
        <kwd>Cancer</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Cancer, the abnormal growth of cells due to gene mutations, is one of the top
causes of mortality. In 2014 alone, cancer claimed 163,000 lives in the UK, which
means that, on average, about 450 people died every day [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. One in every two people
born after 1960 in the UK will be diagnosed with cancer at some point in
their life [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. With over 200 different types, cancer is usually progressive and
chronic, and has a mild phenotype, making it hard to diagnose.
      </p>
      <p>
        The discovery of anti-cancer drugs has not been a successful endeavour. The
rate at which new molecular entities (NMEs) are approved for clinical
investigation is low. In the US, between 2010 and 2014, the highest
number of NMEs approved in a single year was only 11 (in 2012) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In addition, many
drugs fail in clinical trials. The recorded success rate in cancer trials is three
times lower than in clinical trials for cardiovascular diseases [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Numerous efforts, based on different approaches, have been made to
facilitate anti-cancer drug discovery, such as proteomic analysis to highlight
protein drug targets [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], metabolic control analysis to highlight likely pathological
metabolism targets [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] or the development of drug ontology databases [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ] for an
in-depth understanding of the mechanisms of action (MOA) of known drugs. A MOA is
a biochemical interaction that results in a drug producing its desired
therapeutic pharmacological effect [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. It depicts events happening at the chemical level,
including drugs binding with drug targets, as well as reactions with enzymes or
receptor sites [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. To understand these mechanisms, different approaches have
been proposed, e.g., analysis of the chemical properties of drugs as well as of drug
targets, and the extraction of drug-protein relationships from scientific literature.
However, no effort has attempted to combine these approaches to group drugs
according to their MOA. Clustering drugs which share similar MOA streamlines
the drug discovery process as it constrains the number of drugs that need to be
investigated with respect to their drug targets. It also provides the foundation
for understanding and advancing combinatorial drug therapies for cancer.
      </p>
      <p>To fill this research gap, we propose an approach that makes use of
information from ontologies for clustering drugs in order to highlight similarities and
differences between them according to a range of physicochemical properties. The
approach is novel in that it seeks to identify groups or clusters of drugs which
are similar in terms of three features: (1) chemical structure, (2) biological role
and (3) drug target properties. In forming groups amongst a total of 831 drugs,
we investigated the use of unsupervised graph clustering techniques.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Clustering drugs based on similarity is a well-known approach to drug discovery,
although there is great variation in terms of how similarity measures have been
defined. Gemma et al. 2006 [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], for instance, clustered lung cancer drugs based
on their gene expression profiles and sensitivity to multiple lung cancer cell lines.
The results suggested that one of the drugs acted markedly differently from the
rest of the drugs; hence this drug might be useful in second-line chemotherapy
if it was not administered to a patient initially. A similar attempt was made
in Uhr et al. 2015 [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], whose research focussed on clustering 37 breast cancer
drugs based on their sensitivity to 42 breast cancer cell lines. This resulted in
six clusters which highlighted relationships between drugs and their sensitivities.
Meanwhile, Jeon et al. 2011 [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] employed k-means clustering on gene
expression data to understand changes in mitochondrial proteins brought about by
mitochondrial DNA depletion. Their research revealed how cells compensate for
mitochondrial DNA depletion and it also led to the identification of proteins
that repair this depletion.
      </p>
      <p>
        Ross et al. 2000 [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] presented a clustering approach which analysed 60 cell
lines (NCI-60) based on microarray analysis, i.e., the parallel monitoring of gene
expression levels in thousands of genes within the context of a specific biological
process across different environments or tissue samples (e.g., tumours).
Hierarchical clustering was performed to cluster cell lines and genes separately. Their
results showed that two of the breast cancer cell lines are similar to melanoma
cell lines, suggesting a relationship between the two types of cancer.
      </p>
      <p>
        A data-driven approach was presented in Udrescu et al. 2016 [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] where
information on drug-drug interactions (DDI) from the DrugBank database was
utilised to find communities based on the Louvain Method [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Prominent drugs
were ranked using graph centrality measures [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] such as degree, betweenness,
closeness, eigenvector and PageRank centrality [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. A total of 1,141 nodes
(corresponding to drugs) and 11,688 links (corresponding to DDIs) were divided
into nine different clusters, each of which represented a specific class of drugs.
For example, one cluster represented drugs that targeted the immune system;
another pertained to drugs for the nervous system, and so on. Effectively, the
research identified functional drug categories and relationships.
      </p>
      <p>While each of the related studies presented above resembles our
own research methodology in attempting to cluster drugs based on their
chemical interactions, our work proposes a different set of features for highlighting
similarities between drugs. Specifically, the work last mentioned above assumes
all drug similarities to be of equal importance whereas we calculate similarity
scores based on various features, described in the next section.</p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>
        Our approach to clustering drugs according to their mechanisms of action is
based on their similarity with respect to the following types of information
derived from ontologies: (1) chemical structure, (2) biological roles, and (3) drug
target properties. Guided by a set of biomolecular events automatically extracted
from scientific literature, we formed a list of drugs and drug targets of interest,
whose features were calculated with the help of two ontologies: the Chemical
Entities of Biological Interest (ChEBI) ontology [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and the Gene Ontology (GO)
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The rest of this section describes in detail our processing pipeline, also
depicted in Figure 1.
      </p>
      <sec id="sec-3-1">
        <title>Information extraction</title>
        <p>
          A set of biomolecular events, i.e., interactions between drugs and their targets,
served as input to our clustering approach. These events were obtained using a
text mining workflow developed using the Argo workbench [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] and employed in
the work carried out by Zerva et al. [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]. Based on the processing of 6,529
full-text papers relevant to melanoma (retrieved by calling the Europe PubMed Central
API with a query containing "melanoma", its synonyms and names of melanoma cell lines),
a total of 3,168 events were extracted. Out of
these, 293 pertained to interactions between drugs (which we will henceforth refer
to as drug-drug interactions or DDIs) while 2,875 were drug-protein interactions
(DPIs). A list of 831 drugs was formed by combining all unique drugs in the
DDI and DPI sets. For each drug in the DPI set, we also created a list of drug
targets.
        </p>
        <p>ChEBI is an ontology that catalogues chemical structures and biological roles
of drugs. For each drug of interest in our list, we obtained related terms from
the "chemical structure" and "biological role" sub-trees of the ontology.
Structural information among the tree terms was discarded and only terms appearing
between the drug and three levels up were stored for each individual
drug.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Drug target feature extraction</title>
        <p>The Gene Ontology (GO) contains information on cellular components and
molecular functions of genes, as well as the biological processes they are involved
in. For each of the proteins that a particular drug interacts with (according to
our list of DPIs), we obtained a list of gene products from GO. For each gene
product, we obtained all related terms in GO up to three levels up the tree.
The result is a combined list of GO terms associated with a particular drug.
A depiction of the different types of features extracted for any given drug is
shown in Figure 2.</p>
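        <p>As a rough illustration of the "three levels up" rule applied to both ChEBI and GO terms, the following Python sketch collects the ancestors of a term up to a fixed depth. It is only a sketch under stated assumptions: the function name, the parent-map representation and the term names are hypothetical, and it does not reproduce the ontology access actually used in this work.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def ancestors_up_to(term, parents, max_levels=3):
    """Collect ancestors of `term` that lie at most `max_levels` steps above it.

    `parents` maps each term to the set of its direct parents (an assumed,
    simplified stand-in for an ontology such as ChEBI or GO).
    """
    found = set()
    frontier = {term}
    for _ in range(max_levels):
        # Move one level up from every term reached in the previous step.
        frontier = {p for t in frontier for p in parents.get(t, ())} - found
        if not frontier:
            break
        found |= frontier
    return found


# Toy hierarchy with hypothetical term names (not real ontology identifiers).
toy_parents = {
    "drug": {"level1"},
    "level1": {"level2a", "level2b"},
    "level2a": {"level3"},
    "level2b": {"level3"},
    "level3": {"level4"},
}
features = ancestors_up_to("drug", toy_parents)
```
]]></preformat>
        <p>In this toy hierarchy, the term four levels above the drug is left out of the feature set, which mirrors the depth limit described above.</p>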
      </sec>
      <sec id="sec-3-3">
        <title>Drug similarity measurement</title>
        <p>Based on the features obtained in the steps described above, we measured the
similarity between drugs. Specifically, given any two drugs, two types of similarity
scores were calculated, i.e., the number of shared ChEBI terms and the number
of shared GO terms. From a total of 423,801 drug pairs, there were 92,700 and
208,499 pairs that have a non-zero number of shared ChEBI and GO terms,
respectively. The similarity score (i.e., the number of shared terms) between a
pair of drugs based on ChEBI terms varied from 1 to 49 while that based on
shared GO terms ranged from 1 to 8,526. In order to normalise them, we divided
each score by the respective maximum value observed, and then multiplied the
result by 100 to get a percentage. (The number of ontology levels used during
feature extraction was constrained to three to ensure that succeeding steps would
be computationally feasible.)</p>
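        <p>The counting and normalisation step can be sketched as follows. This is a minimal illustration, assuming each drug is represented by a set of term identifiers; the function name and the toy drugs and terms are hypothetical and are not taken from our data.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from itertools import combinations


def shared_term_percentages(terms_by_drug):
    """Count shared terms per drug pair and scale by the largest count seen."""
    counts = {}
    for a, b in combinations(sorted(terms_by_drug), 2):
        shared = len(terms_by_drug[a].intersection(terms_by_drug[b]))
        if shared:  # pairs with no shared term get no score at all
            counts[(a, b)] = shared
    if not counts:
        return {}
    largest = max(counts.values())
    # Divide by the maximum observed count and express as a percentage.
    return {pair: 100.0 * n / largest for pair, n in counts.items()}


# Hypothetical drugs and term identifiers, for illustration only.
toy_terms = {
    "drugA": {"t1", "t2", "t3"},
    "drugB": {"t2", "t3", "t4"},
    "drugC": {"t3", "t9"},
}
scores = shared_term_percentages(toy_terms)
```
]]></preformat>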
        <p>There were, however, many drug pairs whose similarity scores were very low,
indicating insignificant similarity between drugs. To overcome this problem, we
retained only drug pairs whose ChEBI and GO similarity scores were both at
least 10%. The threshold was set to this level based on our analysis of the data
distribution, which informed our decision on an optimal cut-off. A combined
similarity score was finally calculated by taking the mean of the two scores. If only
one of the scores exists for a drug pair (e.g., in cases where a score was obtained
by counting the number of shared ChEBI terms but none based on shared GO
terms, or vice versa), we take half of the value of the lone score as the combined
similarity, but only if it is at least 50%. Otherwise, the pair is discarded. After
this step, only 261 drug pairs remained, involving 89 unique drugs.</p>
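        <p>The filtering and combination rule can be written down compactly. The sketch below is one possible reading of it, not a definitive implementation: the function and parameter names are hypothetical, a missing score is represented by None, and the 50% condition is assumed to apply to the lone score before it is halved.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def combined_similarity(chebi_pct, go_pct, both_min=10.0, lone_min=50.0):
    """Return the combined score for a drug pair, or None if it is discarded.

    A score of None means that no shared terms were found for that ontology.
    """
    if chebi_pct is not None and go_pct is not None:
        # Both scores exist: keep the pair only if both reach the threshold.
        if min(chebi_pct, go_pct) >= both_min:
            return (chebi_pct + go_pct) / 2.0
        return None
    lone = chebi_pct if chebi_pct is not None else go_pct
    # Assumption: the 50% condition applies to the lone score before halving.
    if lone is not None and lone >= lone_min:
        return lone / 2.0
    return None
```
]]></preformat>
        <p>For example, under these assumptions a pair with scores of 20% and 40% receives a combined score of 30%, whereas a pair with a lone score of 40% is discarded.</p>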
        <p>
          The remaining drug pairs were combined to form an interconnected graph
following the Yifan Hu representation [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. In this graph representation, nodes
represent drugs and weighted edges between nodes correspond to their similarity
score. We found that approximately 20% of the nodes in the graph were
not connected to the larger central interconnected graph. We consider them as
pertaining to drugs which are either highly distinctive in terms of their MOA, or
irrelevant to cancer but spuriously included during the information extraction
step. These nodes were therefore filtered out, retaining only 72 nodes.
        </p>
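        <p>Keeping only the central interconnected graph amounts to retaining the largest connected component. The following sketch shows one way to do this with a breadth-first search; the edge-list format, function name and toy values are assumptions made for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict, deque


def largest_component(weighted_edges):
    """Return the node set of the largest connected component."""
    neighbours = defaultdict(set)
    for u, v, _weight in weighted_edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen = set()
    best = set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search from a node that has not been visited yet.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for other in neighbours[node]:
                if other not in component:
                    component.add(other)
                    queue.append(other)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical drug pairs with combined similarity scores.
toy_edges = [("a", "b", 30.0), ("b", "c", 25.0), ("d", "e", 40.0)]
kept_nodes = largest_component(toy_edges)
kept_edges = [e for e in toy_edges if e[0] in kept_nodes and e[1] in kept_nodes]
```
]]></preformat>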
      </sec>
      <sec id="sec-3-4">
        <title>Drug clustering and ranking</title>
        <p>
          We investigated two algorithms for clustering the graph of drugs, namely,
Chinese Whispers (CW) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] and Louvain Method (LM) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. In iterative cycles, CW assigns to each node the class label
that has the highest total edge weight among its neighbouring nodes.
Meanwhile, LM optimises graph modularity, which is a measure of
separation between densely connected regions of the graph.
        </p>
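        <p>To make the label propagation idea behind CW concrete, a simplified sketch is given below. It is not the implementation used in our experiments: the function name, the edge dictionary format, the fixed number of iterations and the random seed are all assumptions, and the toy graph is hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import random
from collections import defaultdict


def chinese_whispers(edges, iterations=20, seed=0):
    """Simplified Chinese Whispers; `edges` maps (u, v) to an edge weight."""
    rng = random.Random(seed)
    neighbours = defaultdict(dict)
    for (u, v), weight in edges.items():
        neighbours[u][v] = weight
        neighbours[v][u] = weight

    # Every node starts in its own class.
    labels = {node: node for node in neighbours}
    nodes = sorted(neighbours)
    for _ in range(iterations):
        rng.shuffle(nodes)  # nodes are visited in random order
        for node in nodes:
            totals = defaultdict(float)
            for other, weight in neighbours[node].items():
                totals[labels[other]] += weight
            best = max(totals.values())
            # Ties between equally strong classes are broken at random.
            candidates = sorted(l for l, t in totals.items() if t == best)
            labels[node] = rng.choice(candidates)
    return labels


# Two tightly connected triangles joined by one weak edge (hypothetical data).
toy_edges = {
    ("a", "b"): 3.0, ("b", "c"): 2.0, ("a", "c"): 1.0,
    ("d", "e"): 3.0, ("e", "f"): 2.0, ("d", "f"): 1.0,
    ("c", "d"): 0.5,
}
labels = chinese_whispers(toy_edges)
```
]]></preformat>
        <p>On this toy graph the weak bridge never outweighs the edges inside a triangle, so each triangle ends up with a single label of its own.</p>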
        <p>All drugs in the graph were then ranked with respect to their prominence (i.e.,
their importance based on their centrality within the graph) according to:
(1) degree, the number of immediate neighbours;
(2) closeness, based on the distance between a node and all other nodes in the graph;
(3) betweenness, the number of times a node is traversed along the shortest
paths between pairs of other nodes in the graph; (4) PageRank, the
likelihood of stopping on the node while
randomly traversing connected nodes; and (5) eigenvector, the node's entry in the principal eigenvector of the
adjacency matrix of the graph. Additionally, drugs were also ranked within each
cluster using the same centrality measures.</p>
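        <p>Two of these measures, degree and closeness, are simple enough to sketch directly. The example below ignores edge weights and assumes a connected graph, which is a simplification of our setting; the function name and the toy graph are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import deque


def degree_and_closeness(adjacency):
    """Degree and closeness centrality for an unweighted, connected graph.

    `adjacency` maps each node to the set of its neighbours.
    """
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    closeness = {}
    for source in adjacency:
        # Breadth-first search gives shortest-path lengths counted in hops.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for other in adjacency[node]:
                if other not in distance:
                    distance[other] = distance[node] + 1
                    queue.append(other)
        total = sum(distance.values())
        # Closeness: number of other reachable nodes over the summed distances.
        closeness[source] = (len(distance) - 1) / total if total else 0.0
    return degree, closeness


# Toy path graph a - b - c with hypothetical node names.
toy_graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
degree, closeness = degree_and_closeness(toy_graph)
ranking = sorted(toy_graph, key=lambda n: (-closeness[n], n))
```
]]></preformat>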
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments and Results</title>
      <p>Upon the application of Chinese Whispers and the Louvain Method to our data,
we obtained the results shown in Figures 3 and 4, respectively, following the
Yifan Hu graph representation. For simplicity, each cluster is assigned a cluster
identifier and represented by a unique colour. Its distribution, i.e., the proportion
of the nodes in the graph that belong to the cluster, is also indicated.</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th>Cluster ID</th><th>Distribution</th></tr>
          </thead>
          <tbody>
            <tr><td>CW C1</td><td>41.67%</td></tr>
            <tr><td>CW C2</td><td>27.78%</td></tr>
            <tr><td>CW C3</td><td>20.83%</td></tr>
            <tr><td>CW C4</td><td>9.72%</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>In both cases, the graph of 72 drugs was divided into four clusters of varying
sizes. It is noticeable that similar clusters were produced by the two algorithms.
We can consider the following as equivalent to each other: CW C1 and LM C1,
CW C2 and LM C2, CW C3 and LM C3, and CW C4 and LM C4. While each
of the first two cluster pairs has only a one-node difference, the latter two pairs
correspond to exactly the same clusters.</p>
      <p>Table 1 presents the most prominent drugs within the entire graph while
Table 2 provides a list of the same but only within each cluster. In both tables,
drugs are ranked according to prominence, calculated based on our chosen graph
centrality measures, as mentioned in Section 3.5. For corresponding CW and LM
clusters, the three highest-ranked drugs across the different centrality measures
are exactly the same, except for one difference, shown in Table 2.</p>
    </sec>
    <sec id="sec-5">
      <title>Analysis and Discussion</title>
      <p>The thresholds which were applied to ensure that only drugs with considerable
similarity were retained (described in Section 3) significantly reduced the
number of drugs which formed the graph. Out of 831 drugs in our initial list, only
72 were included in the graph for clustering. As both the CW and LM clustering
algorithms are non-deterministic, they were applied to the data several times;
results were consistent over the different runs.</p>
      <p>
        To assess the quality of our clustering results, we first computed the value
of the silhouette coefficient [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], a widely accepted metric for judging the quality
of clusters where class labels are not predefined. Its value varies from -1 to 1,
where positive values are taken to mean good clustering while anything less than
zero is undesirable. CW and LM produced mean silhouette coefficient values of
0.294 and 0.292, respectively. The cluster consistency is not
very strong, but it is still substantial and falls within the acceptable range [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
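      <p>For reference, the mean silhouette coefficient can be computed as in the sketch below. The function name and the toy data are hypothetical, the input must contain at least two clusters, and the choice of distance function is an assumption: how similarity scores are turned into distances is not specified here.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def mean_silhouette(labels, distance):
    """Mean silhouette coefficient over all items (needs two or more clusters).

    `labels` maps item -> cluster id; `distance(x, y)` returns a dissimilarity.
    """
    clusters = {}
    for item, label in labels.items():
        clusters.setdefault(label, []).append(item)

    scores = []
    for item, label in labels.items():
        own = [other for other in clusters[label] if other != item]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the item's own cluster
        a = sum(distance(item, other) for other in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(distance(item, other) for other in members) / len(members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Four points on a line forming two well-separated groups (toy data).
toy_labels = {0.0: "c1", 1.0: "c1", 10.0: "c2", 11.0: "c2"}
score = mean_silhouette(toy_labels, lambda x, y: abs(x - y))
```
]]></preformat>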
      <p>
        CW and LM use different techniques in generating clusters. While CW
relies on random class distribution, LM is based on the optimisation of modularity.
We computed the value of the Adjusted Rand Index (ARI), a measure of similarity
between two clusterings that takes by-chance grouping into account [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. The
index equals 1 for a perfect match and is close to 0 when the agreement is no
better than expected by chance; negative values indicate agreement worse than
chance. We obtained 0.95 as the value
of ARI for the clusterings produced by CW and LM, which is very high. This
strengthens our confidence in the results, as it suggests that the drug clusters
are consistent even across different algorithms.
      </p>
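      <p>The ARI itself follows from pair counts over the contingency table of the two clusterings. The sketch below is a minimal standard-library version given for illustration; the function name and the toy labelings are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    # Pairs of items placed together in both clusterings.
    pairs_both = sum(comb(n, 2) for n in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(n, 2) for n in Counter(labels_a).values())
    pairs_b = sum(comb(n, 2) for n in Counter(labels_b).values())
    total_pairs = comb(len(labels_a), 2)

    expected = pairs_a * pairs_b / total_pairs  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2.0
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings have a single cluster
    return (pairs_both - expected) / (maximum - expected)


same = adjusted_rand_index(["x", "x", "y", "y"], ["p", "p", "q", "q"])
crossed = adjusted_rand_index(["x", "x", "y", "y"], ["p", "q", "p", "q"])
```
]]></preformat>
      <p>Cluster names do not matter, only the grouping: the first toy pair is a perfect match, while the second pair agrees less than would be expected by chance.</p>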
      <p>Furthermore, our results were reviewed by a domain expert who was
convinced by the results and suggested that it is worth pursuing further research
into highlighting the dominant features for each cluster in order to uncover
relevant mechanisms of action. He expressed confidence in the viability of the
research and indicated that it holds great potential.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Future Work</title>
      <p>In this paper, we proposed a novel approach to highlighting similarities between
drugs, according to features derived from scientific literature and ontologies. Two
unsupervised clustering methods were employed, namely, Chinese Whispers and the
Louvain Method. Upon the application of our approach to 72 drugs, nearly identical
sets of four clusters were independently produced by the two clustering methods.
We then obtained a list of the most prominent drugs within the entire graph of
drugs as well as within each cluster.</p>
      <p>The approach is a preliminary investigation and leaves room for improvement
and future work. The most important next step is the experimental validation of
the dominant MOA represented by each cluster using sources such as annotated
data sets or manual annotation by domain experts. Other methods that could
potentially improve the results include hierarchical clustering to explore
subcluster relationships or soft clustering to take into account membership of a drug
in multiple clusters. Feature engineering can also be extended, e.g., to increase
tree depth when extracting terms from ontologies.</p>
      <p>Acknowledgment. The authors would like to express their gratitude to the
National Centre for Text Mining (NaCTeM) for sharing their event extraction
results. The authors also thank Prof. Ross King for his help in evaluating the
clustering results.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.P.</given-names>
            <surname>Adams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.Q.</given-names>
            <surname>Urban</surname>
          </string-name>
          ,
          <source>Pharmacology: Connections to Nursing Practice</source>
          (Pearson Education, UK,
          <year>2015</year>
          ). ISBN 9780133896817. https://books.google.co.uk/books?id=usOgBwAAQBAJ
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ormiston-Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sasieni</surname>
          </string-name>
          ,
          <article-title>Trends in the lifetime risk of developing cancer in Great Britain: comparison of risk for those born from 1930 to 1960</article-title>
          .
          <source>British Journal of Cancer</source>
          <volume>112</volume>
          (
          <issue>5</issue>
          ),
          <fpage>943</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.I.</given-names>
            <surname>Archakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.M.</given-names>
            <surname>Govorun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.V.</given-names>
            <surname>Dubanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.D.</given-names>
            <surname>Ivanov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.V.</given-names>
            <surname>Veselovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Janssen</surname>
          </string-name>
          ,
          <article-title>Protein-protein interactions as a target for drugs in proteomics</article-title>
          .
          <source>Proteomics</source>
          <volume>3</volume>
          (
          <issue>4</issue>
          ),
          <fpage>380</fpage>
          –
          <lpage>391</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Biemann</surname>
          </string-name>
          ,
          <article-title>Chinese whispers: an efficient graph clustering algorithm and its application to natural language processing problems</article-title>
          ,
          <source>in Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing</source>
          ,
          <publisher-name>Association for Computational Linguistics</publisher-name>
          ,
          <year>2006</year>
          , pp.
          <fpage>73</fpage>
          –
          <lpage>80</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.D.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Guillaume</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lambiotte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lefebvre</surname>
          </string-name>
          ,
          <article-title>Fast unfolding of communities in large networks</article-title>
          .
          <source>Journal of Statistical Mechanics: Theory and Experiment</source>
          <volume>2008</volume>
          (
          <issue>10</issue>
          ),
          <fpage>10008</fpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Cascante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.G.</given-names>
            <surname>Boros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Comin-Anduix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>de Atauri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.J.</given-names>
            <surname>Centelles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.W.-N.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Metabolic control analysis in drug discovery and disease</article-title>
          .
          <source>Nature Biotechnology</source>
          <volume>20</volume>
          (
          <issue>3</issue>
          ),
          <fpage>243</fpage>
          –
          <lpage>249</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <collab>Gene Ontology Consortium</collab>
          , et al.,
          <article-title>The Gene Ontology project in 2008</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>36</volume>
          (suppl 1),
          <fpage>440</fpage>
          –
          <lpage>444</lpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Degtyarenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>De Matos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ennis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hastings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zbinden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>McNaught</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Alcantara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Darsow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Guedj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ashburner</surname>
          </string-name>
          ,
          <article-title>ChEBI: a database and ontology for chemical entities of biological interest</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>36</volume>
          (suppl 1),
          <fpage>344</fpage>
          –
          <lpage>350</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gemma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sugiyama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Matsuda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Seike</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kosaihira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Minegishi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Noro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Seike</surname>
          </string-name>
          , et al.,
          <article-title>Anticancer drug clustering in lung cancer based on gene expression profiles and sensitivity database</article-title>
          .
          <source>BMC Cancer</source>
          <volume>6</volume>
          (
          <issue>1</issue>
          ),
          <fpage>174</fpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>Efficient, high-quality force-directed graph drawing</article-title>
          .
          <source>Mathematica Journal</source>
          <volume>10</volume>
          (
          <issue>1</issue>
          ),
          <fpage>37</fpage>
          –
          <lpage>71</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jeon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.H.</given-names>
            <surname>Jeong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-H.</given-names>
            <surname>Baek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-J.</given-names>
            <surname>Koo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-H.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-H.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.K.</given-names>
            <surname>Pak</surname>
          </string-name>
          ,
          <article-title>Network clustering revealed the systemic alterations of mitochondrial protein expression</article-title>
          .
          <source>PLoS Computational Biology</source>
          <volume>7</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1002093</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaufman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.J.</given-names>
            <surname>Rousseeuw</surname>
          </string-name>
          ,
          <source>Finding groups in data: an introduction to cluster analysis</source>
          , vol.
          <volume>344</volume>
          (John Wiley &amp; Sons,
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>I.</given-names>
            <surname>Kola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Landis</surname>
          </string-name>
          ,
          <article-title>Opinion: Can the pharmaceutical industry reduce attrition rates?</article-title>
          <source>Nature Reviews Drug Discovery</source>
          <volume>3</volume>
          (
          <issue>8</issue>
          ),
          <fpage>711</fpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>D.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stroh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.A.</given-names>
            <surname>Graham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Musib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.L.</given-names>
            <surname>Lum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <article-title>A survey of new oncology drug approvals in the USA from 2010 to 2015: a focus on optimal dose and related postmarketing activities</article-title>
          .
          <source>Cancer Chemotherapy and Pharmacology</source>
          <volume>77</volume>
          (
          <issue>3</issue>
          ),
          <fpage>459</fpage>
          –
          <lpage>476</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>McQueen</surname>
          </string-name>
          ,
          <source>Comprehensive Toxicology</source>
          (Elsevier Science,
          <year>2010</year>
          ). ISBN 9780080468846. https://books.google.co.uk/books?id=jzCAKsa2CpMC
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <source>Networks: An Introduction</source>
          (OUP Oxford, UK,
          <year>2009</year>
          ). ISBN 9780191637766. https://books.google.co.uk/books?id=7LmNAQAACAAJ
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>Page</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Motwani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Winograd</surname>
          </string-name>
          ,
          <article-title>The PageRank citation ranking: Bringing order to the web</article-title>
          .
          <source>Technical report</source>
          , Stanford InfoLab,
          <year>1999</year>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Rak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rowley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Black</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ananiadou</surname>
          </string-name>
          ,
          <article-title>Argo: an integrative, interactive, text mining-based workbench supporting curation</article-title>
          .
          <source>Database</source>
          <volume>2012</volume>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>W.M.</given-names>
            <surname>Rand</surname>
          </string-name>
          ,
          <article-title>Objective criteria for the evaluation of clustering methods</article-title>
          .
          <source>Journal of the American Statistical Association</source>
          <volume>66</volume>
          (
          <issue>336</issue>
          ),
          <fpage>846</fpage>
          –
          <lpage>850</lpage>
          (
          <year>1971</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.T.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Scherf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.B.</given-names>
            <surname>Eisen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.M.</given-names>
            <surname>Perou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Spellman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Iyer</surname>
          </string-name>
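          ,
          <string-name>
            <given-names>S.S.</given-names>
            <surname>Jeffrey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Van de Rijn</surname>
          </string-name>
          ,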
          <string-name>
            <given-names>M.</given-names>
            <surname>Waltham</surname>
          </string-name>
          , et al.,
          <article-title>Systematic variation in gene expression patterns in human cancer cell lines</article-title>
          .
          <source>Nature Genetics</source>
          <volume>24</volume>
          (
          <issue>3</issue>
          ),
          <fpage>227</fpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P.J.</given-names>
            <surname>Rousseeuw</surname>
          </string-name>
          ,
          <article-title>Silhouettes: a graphical aid to the interpretation and validation of cluster analysis</article-title>
          .
          <source>Journal of Computational and Applied Mathematics</source>
          <volume>20</volume>
          ,
          <fpage>53</fpage>
          –
          <lpage>65</lpage>
          (
          <year>1987</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>L.</given-names>
            <surname>Udrescu</surname>
          </string-name>
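          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sbârcea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Topîrceanu</surname>
          </string-name>
          ,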
          <string-name>
            <given-names>A.</given-names>
            <surname>Iovanovici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kurunczi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bogdan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Udrescu</surname>
          </string-name>
          ,
          <article-title>Clustering drug-drug interaction networks with energy model layouts: community analysis and drug repurposing</article-title>
          .
          <source>Scientific Reports</source>
          <volume>6</volume>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>K.</given-names>
            <surname>Uhr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.J.</given-names>
            <surname>Prager-van der Smissen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Heine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ozturk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Smid</surname>
          </string-name>
          , H.W. Gohlmann,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jager</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.A.</given-names>
            <surname>Foekens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.W.</given-names>
            <surname>Martens</surname>
          </string-name>
          ,
          <article-title>Understanding drugs in breast cancer through drug sensitivity screening</article-title>
          .
          <source>SpringerPlus</source>
          <volume>4</volume>
          (
          <issue>1</issue>
          ),
          <fpage>611</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <collab>Cancer Research UK</collab>
          ,
          <article-title>Cancer Statistics for the UK</article-title>
          . Accessed: 2017-10-05. http://www.cancerresearchuk.org/health-professional/cancer-statistics-forthe-uk
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zerva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Batista-Navarro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Day</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ananiadou</surname>
          </string-name>
          ,
          <article-title>Using uncertainty to link and rank evidence from biomedical literature for model curation</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>466</volume>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>