<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The p-median Problem with Order for Two-source Clustering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xenia Klimentova</string-name>
          <email>xenia.klimentova@inesctec.pt</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anton V. Ushakov</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Igor Vasilyev</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INESC TEC</institution>
          ,
          <addr-line>Campus da FEUP, Rua Dr. Roberto Frias, 378, 4200-465 Porto</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Matrosov Institute for System Dynamics and Control Theory</institution>
          ,
          <addr-line>Lermontov 134, 664033, Irkutsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>536</fpage>
      <lpage>544</lpage>
      <abstract>
        <p>In this paper we present a hybrid approach to integrative clustering based on the p-median problem with clients' preferences. We formulate the problem of simultaneous clustering of a set of objects, characterized by two sets of features, as a bi-level p-median model. An exact approach combining a branch-and-cut method with a simulated annealing algorithm is used to find a two-source clustering. The proposed approach is compared with some well-known mathematical optimisation based clustering techniques applied to the NCI-60 tumour cell line anticancer drug screen dataset. The results obtained demonstrate that our approach is able to find competitive integrative clusterings.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Recent advances in microarray technologies have given rise to the collection of huge amounts of
high-dimensional omic data, generated simultaneously for the same biological samples. A
large number of techniques and approaches have been proposed to carefully combine
and analyse continuous, discrete and categorical multisource data, mainly based on
probabilistic and statistical modelling. Recent reviews [
        <xref ref-type="bibr" rid="ref17 ref20">17, 20</xref>
        ] cover a wide range of
approaches to omic data integration. Modern techniques in statistics for integrative
analyses of cancer data, including incorporating multiple heterogeneous genomic data
types, are also reviewed in [
        <xref ref-type="bibr" rid="ref19 ref27">19, 27</xref>
        ].
      </p>
      <p>This paper focuses on cluster analysis, which is one of the main unsupervised
learning methods. Clustering is an exploratory tool that consists in dividing a set of
objects (biological samples, observations) into subsets (clusters) containing similar
objects, while objects from different clusters are dissimilar. Although a large number of
general clustering algorithms have been proposed, there is a lack of such methods for
efficient integrative clustering of multisource data. The simplest and most obvious approach
to integrative clustering consists in concatenating the feature vectors of the
considered objects or samples and then applying some well-known clustering algorithm to
the data obtained. Though this approach may be useful in some rare instances,
it is rather inflexible, since it cannot capture important features that are
specific to each dataset, especially in the case of heterogeneous data.</p>
      <p>Copyright © by the paper's authors. Copying permitted for private and academic purposes.</p>
      <p>
        We propose an approach to integrative clustering of two-source data which is based
on bi-level integer linear programming and metaheuristic optimization. As opposed
to previously developed integrative clustering methods, which are based on modelling
each dataset using mixture models or on post hoc combination of multiple base clusterings,
our method is a hybrid integer programming approach based on the bi-level p-median
problem with clients' preferences. The approach is compared to those presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
indicating that our approach can provide clustering solutions competitive with other
distance-based integrative clustering algorithms. One of the first gene-drug integrative
clustering approaches was proposed in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] and was based on a hierarchical clustering
algorithm. Other mathematical optimization based approaches that have been applied
to simultaneously clustering gene expression and drug activity profiles are the Soft
Topographic Vector Quantization [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], relational k-means [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], genetic programming [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
and a consensus p-median approach [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>The paper is structured as follows. The generic p-median clustering model is
described in Section 2. In Section 3 an extension to the bi-level problem is presented.
Finally, Section 4 illustrates the comparison of different approaches and gives some
concluding remarks.</p>
    </sec>
    <sec id="sec-2">
      <title>The p-median clustering problem</title>
      <p>
        The p-median problem, proposed in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], consists in choosing p sites for locating
facilities from a set of potential sites in order to minimize the sum of weighted distances
between customers and open facilities. This model can naturally be formulated as
the following combinatorial optimization problem:
        <disp-formula><tex-math><![CDATA[\min_{S \subseteq I} \Bigl\{ \sum_{j \in J} \min_{i \in S} d_{ij} \;:\; |S| = p \Bigr\},]]></tex-math></disp-formula>
where I = {1, …, m} is a set of potential facility sites, J = {1, …, l} is a set of
customers, and d<sub>ij</sub> is the distance (or the cost of satisfying the demand) between
customer j ∈ J and the facility located at site i ∈ I.
      </p>
      <p>
        The p-median problem is known to be NP-hard in the strong sense and is by now
considered well studied. Apart from a number of applications, the p-median problem is
a powerful tool for cluster analysis. To transform the problem into a clustering tool,
let us assume that I = J, i.e. the potential sites and customer locations coincide. In this
case the p-median problem (also referred to as minimum-sum-of-stars clustering [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ])
can be formulated on a simple digraph G = (I, A) with the node set I corresponding to
the set of samples to be clustered and the arc set A = {(i, j) : i, j ∈ I, i ≠ j}, where each arc
(i, j) ∈ A has a weight (distance) d<sub>ij</sub> &gt; 0 measuring the dissimilarity between the corresponding pair
of samples. The p-median problem is then to select p nodes, called medians, in
order to minimize the sum of the distances between each node and its nearest median.
Any feasible solution to the problem consists of p stars with medians at the centres.
Each star is a cluster and its median is the cluster representative (prototype).
      </p>
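      <p>The clustering interpretation above can be illustrated with a minimal Python sketch. It assumes a small symmetric dissimilarity matrix and simply enumerates all p-subsets of nodes, which is viable only for tiny instances and is not the solution method used in this paper. The function names and the toy data are hypothetical.</p>
      <preformatted><![CDATA[
```python
from itertools import combinations


def pmedian_cost(dist, medians):
    """Sum over all nodes j of the distance to the nearest selected median."""
    return sum(min(dist[i][j] for i in medians) for j in range(len(dist)))


def solve_pmedian_bruteforce(dist, p):
    """Enumerate every p-subset of nodes and keep the cheapest one."""
    n = len(dist)
    best = min(combinations(range(n), p), key=lambda s: pmedian_cost(dist, s))
    # Each star (cluster) collects the nodes whose nearest median is its centre.
    clusters = {i: [] for i in best}
    for j in range(n):
        nearest = min(best, key=lambda i: dist[i][j])
        clusters[nearest].append(j)
    return best, pmedian_cost(dist, best), clusters


# Toy data (hypothetical): two well-separated groups of points on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in points] for a in points]

medians, cost, clusters = solve_pmedian_bruteforce(dist, 2)
print(medians, cost, clusters)
```
]]></preformatted>
      <p>On this toy instance the two selected medians are the middle points of the two groups, and each star contains one group.</p>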
      <p>
        The p-median model has several important advantages over centroid-based
parametric clustering approaches, like k-means and its variations. First of all, modern
hybrid heuristics and exact approaches are able to find optimal (global) or near-optimal
solutions to p-median instances of huge size (e.g. see [
        <xref ref-type="bibr" rid="ref12 ref15 ref4">4, 12, 15</xref>
        ]), while k-means or
k-means type algorithms, like PAM, CLARA, CLARANS, k-medians, are heuristics
that converge fast to a local optimum. Secondly, the classical k-means algorithm
presupposes using the squared Euclidean distance as the measure of dissimilarity, which does not
always result in a good clustering. For other types of parametric clustering problems,
several algorithms using a k-means type iterative relocation scheme have been proposed,
e.g. k-medians in the case of the Manhattan distance or the Linde-Buzo-Gray algorithm
when the Itakura-Saito distance is considered [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        The p-median problem can be formulated as an integer linear program. For each
node i ∈ I, let us consider a binary variable y<sub>i</sub>, which takes the value 1 if node i is a
median (sample i is a cluster representative), and the value 0 otherwise. For each
arc (i, j) ∈ A let x<sub>ij</sub> be the binary variable which is equal to 1 if node i is the closest
median to node j (sample j is assigned to cluster i), and 0 otherwise.
Let also δ<sup>−</sup>(j) = {i ∈ I | (i, j) ∈ A} be the set of tail nodes of the arcs entering node j,
and let δ<sup>+</sup>(i) = {j ∈ I | (i, j) ∈ A} be the set of head nodes of
the arcs leaving node i. Thus, the p-median problem can be written as follows [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]:
        <disp-formula><tex-math><![CDATA[\begin{align}
\min\ & \sum_{(i,j) \in A} d_{ij}\, x_{ij} \tag{1}\\
& \sum_{i \in \delta^{-}(j)} x_{ij} + y_j = 1, && j \in I, \tag{2}\\
& x_{ij} \le y_i, && (i,j) \in A, \tag{3}\\
& \sum_{i \in I} y_i = p, \tag{4}\\
& x_{ij} \in \{0,1\}, && (i,j) \in A, \tag{5}\\
& y_i \in \{0,1\}, && i \in I. \tag{6}
\end{align}]]></tex-math></disp-formula>
      </p>
      <p>
        Several studies considered the p-median model for clustering various types of data,
e.g. psychological data [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] or gene expression profiles and drug responses [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
Moreover, the p-median model has proved to be a powerful clustering tool that provides high
quality clustering solutions outperforming those provided by CLARA and CLARANS [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]
or k-means and k-means++ [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Bi-level p-median clustering</title>
      <p>Suppose that we are given a set I of objects (cell lines), each characterized by two feature
vectors representing gene expression and drug activity profiles. Here we propose
an approach to two-source integrative clustering based on a bi-level version of the
p-median problem and study its application to clustering cancer cell lines. Let two
matrices d<sub>ij</sub> ≥ 0 and g<sub>ij</sub> ≥ 0 indicate the dissimilarity of a pair of cell lines (i, j) ∈
A = {(i, j) : i ∈ I, j ∈ I, i ≠ j} in the drug and gene space respectively. Then, let us
consider the following bi-level p-median clustering problem:
        <disp-formula><tex-math><![CDATA[\begin{align}
\min_{y}\ & \sum_{(i,j) \in A} d_{ij}\, x_{ij}(y) \tag{7}\\
& \sum_{i \in I} y_i = p, \tag{8}\\
& y_i \in \{0,1\}, && i \in I, \tag{9}
\end{align}]]></tex-math></disp-formula>
where x<sub>ij</sub>(y) is the optimal solution to the following lower-level problem:
        <disp-formula><tex-math><![CDATA[\begin{align}
\min_{x}\ & \sum_{(i,j) \in A} g_{ij}\, x_{ij} \tag{10}\\
& \sum_{i \in \delta^{-}(j)} x_{ij} + y_j = 1, && j \in I, \tag{11}\\
& x_{ij} \le y_i, && (i,j) \in A, \tag{12}\\
& x_{ij} \in \{0,1\}, && (i,j) \in A. \tag{13}
\end{align}]]></tex-math></disp-formula>
      </p>
      <p>On the upper level one seeks p cell lines to be the cluster representatives such that
the sum of dissimilarities of drug activity profiles between cell lines and their assigned
representatives is minimised. On the lower level all cell lines are assigned to the cluster
representatives selected on the upper level, in order to minimise the sum of dissimilarities
of gene expression profiles between the cell lines of each cluster and their
representatives. In other words, the decision about which cell lines will be the medians
(cluster representatives) is made on the upper level according to the matrix {d<sub>ij</sub>}, while
the assignment of the remaining cell lines to clusters is performed on the lower level taking into
account the dissimilarity matrix {g<sub>ij</sub>}, (i, j) ∈ A. Thus, if g<sub>ij</sub> ≤ g<sub>kj</sub> and both i, k ∈ I
are cluster representatives, then cell line j is assigned to cluster C<sub>i</sub> in preference to C<sub>k</sub>.</p>
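      <p>The interplay of the two levels can be sketched in a few lines of Python. The sketch assumes that the entries of each column of the gene matrix are distinct, so that the lower-level assignment is unique, and it enumerates all median sets, which is feasible only for very small instances. The matrices and function names are hypothetical and serve for illustration only.</p>
      <preformatted><![CDATA[
```python
from itertools import combinations


def assign_by_lower_level(g, medians):
    """Lower level: every non-median j joins the median i with the smallest g[i][j]."""
    medians = set(medians)
    return {j: (j if j in medians else min(medians, key=lambda i: g[i][j]))
            for j in range(len(g))}


def upper_level_cost(d, assignment):
    """Upper level: drug-space cost of the assignment chosen in the gene space."""
    return sum(d[i][j] for j, i in assignment.items() if i != j)


def solve_bilevel_bruteforce(d, g, p):
    """Try every set of p medians; the lower level reacts to each choice."""
    best = None
    for medians in combinations(range(len(d)), p):
        assignment = assign_by_lower_level(g, medians)
        cost = upper_level_cost(d, assignment)
        if best is None or cost < best[1]:
            best = (medians, cost, assignment)
    return best


# Hypothetical 4-object instance: d = drug space, g = gene space.
d = [[0, 1, 4, 5], [1, 0, 2, 6], [4, 2, 0, 3], [5, 6, 3, 0]]
g = [[0, 3, 1, 2], [3, 0, 5, 1], [1, 5, 0, 4], [2, 1, 4, 0]]

medians, cost, assignment = solve_bilevel_bruteforce(d, g, 2)
print(medians, cost, assignment)
```
]]></preformatted>
      <p>In this toy instance object 3 is assigned to median 0 because 0 is its closer representative in the gene space, although median 2 would be cheaper in the drug space. This is exactly the effect of the lower-level preferences.</p>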
      <p>
        Note that when all the columns of the lower-level matrix {g<sub>ij</sub>} are sorted in
ascending order or when d<sub>ij</sub> = g<sub>ij</sub>, the problem (7)–(13) reduces to the p-median
problem. Hence the p-median problem is a special case of (7)–(13), which is therefore
NP-hard in the strong sense and does not belong to the class
APX [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In the general case the solution x<sub>ij</sub>(y) may not be unique, so one has to specify
what kind of solution is considered optimal to the clustering problem (7)–(13).
Two extreme cases, i.e. cooperative and non-cooperative decision-making strategies,
are often distinguished [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], depending on whether cell lines on the lower level are
assigned to their closest representatives in order to respectively minimise or maximise
the value of the upper-level objective. To avoid these cases for the sake of simplicity,
we suppose that all elements of each column j of the matrix {g<sub>ij</sub>} are distinct [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
      </p>
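      <p>
        The bi-level p-median clustering problem can be reduced to a single-level integer
linear program [
        <xref ref-type="bibr" rid="ref14 ref7">14, 7</xref>
        ]. Let us denote W<sub>ij</sub> = {k ∈ I : g<sub>ij</sub> &lt; g<sub>kj</sub>}, (i, j) ∈ A; then the
single-level problem is
        <disp-formula><tex-math><![CDATA[\begin{align}
\min\ & \sum_{(i,j) \in A} d_{ij}\, x_{ij} \tag{14}\\
& \sum_{k \in W_{ij}} x_{kj} + y_i \le 1, && (i,j) \in A, \tag{15}\\
& \sum_{i \in \delta^{-}(j)} x_{ij} + y_j = 1, && j \in I, \tag{16}\\
& x_{ij} \le y_i, && (i,j) \in A, \tag{17}\\
& \sum_{i \in I} y_i = p, \tag{18}\\
& x_{ij} \in \{0,1\}, && (i,j) \in A, \tag{19}\\
& y_i \in \{0,1\}, && i \in I. \tag{20}
\end{align}]]></tex-math></disp-formula>
      </p>
      <p>This formulation is identical to the p-median formulation except for constraints (15), which guarantee
that if i ∈ I is a cluster representative, then a cell line j ∈ I is not assigned to the more
dissimilar (according to gene expression profiles) representatives from the set W<sub>ij</sub>.
Thus, x<sub>ij</sub>, (i, j) ∈ A, is the optimal solution to the lower-level problem for any y<sub>i</sub>, i ∈ I.
Note that one can similarly consider the bi-level p-median clustering problem with the
matrices {g<sub>ij</sub>} and {d<sub>ij</sub>} on the upper and lower levels respectively.</p>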
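      <p>As an illustration of how the sets W and the preference constraints work, the following Python sketch builds the sets from a gene dissimilarity matrix and checks whether a given integer assignment respects the condition "no object is assigned to a representative it ranks worse than an open one". The representation of a solution as a dictionary mapping each object to its representative, the function names and the data are assumptions made for this example.</p>
      <preformatted><![CDATA[
```python
def worse_sets(g):
    """W[i][j]: objects k that j ranks strictly worse than i, i.e. g[i][j] < g[k][j]."""
    n = len(g)
    return [[{k for k in range(n) if k != j and g[i][j] < g[k][j]}
             for j in range(n)] for i in range(n)]


def satisfies_order_constraints(g, medians, assignment):
    """Check sum_{k in W_ij} x_kj + y_i <= 1 for every arc (i, j) with i != j."""
    W = worse_sets(g)
    medians = set(medians)
    n = len(g)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            y_i = 1 if i in medians else 0
            # x_kj equals 1 only for the single representative k of j.
            k = assignment[j]
            x_sum = 1 if (k != j and k in W[i][j]) else 0
            if x_sum + y_i > 1:
                return False
    return True


# Hypothetical gene-space matrix with distinct off-diagonal entries per column.
g = [[0, 3, 1, 2], [3, 0, 5, 1], [1, 5, 0, 4], [2, 1, 4, 0]]

ok = satisfies_order_constraints(g, (0, 2), {0: 0, 1: 0, 2: 2, 3: 0})
bad = satisfies_order_constraints(g, (0, 2), {0: 0, 1: 2, 2: 2, 3: 0})
print(ok, bad)
```
]]></preformatted>
      <p>The second assignment is rejected because object 1 is attached to representative 2 although representative 0, which it prefers in the gene space, is also open.</p>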
      <p>
        To find the optimal solution to the bi-level p-median clustering problem, we have
developed an exact approach comprising a branch-and-cut algorithm and a metaheuristic
that searches for initial upper bounds on the optimal value; the approach is detailed in [24–26]. The
method is based on the family of facet-defining inequalities proposed in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] for the
simple plant location problem with order. For a node j and a subset S ⊆ I let us denote by
b<sub>j</sub>(S) ∈ S the node of S nearest to j, i.e. W<sub>ij</sub> ⊂ W<sub>b<sub>j</sub>(S)j</sub> for all i ∈ S \ {b<sub>j</sub>(S)}.
      </p>
      <p>Theorem 1. For all i, u, v ∈ I the inequalities
        <disp-formula><tex-math><![CDATA[\sum_{k \in W_{iu}} x_{ku} + \sum_{k \in U_{itu}} x_{kv} + y_t \le 1, \tag{21}]]></tex-math></disp-formula>
where t = b<sub>v</sub>(I \ W<sub>iu</sub>) and U<sub>itu</sub> = I \ (W<sub>iu</sub> ∪ {t}), are valid for the polytope of the problem
(14)–(20).</p>
      <p>
        The cutting plane method for this family of inequalities was implemented using two
computational tricks to reduce the number of violated inequalities added. At each
iteration we add the inequalities whose hyperplanes are most distant from the
current fractional solution, and we prevent almost parallel inequalities from being added. To
find an upper bound on the optimal value, a standard simulated annealing
scheme with the flip neighbourhood was implemented (see [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] for more detail).
      </p>
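      <p>A generic simulated annealing scheme for the bi-level objective might look like the Python sketch below. It is not the implementation used in this paper: the move is assumed to be a swap of one median with one non-median (a pair of flips that keeps exactly p medians), and the cooling schedule, parameter values and function names are hypothetical choices made for illustration.</p>
      <preformatted><![CDATA[
```python
import math
import random


def bilevel_cost(d, g, medians):
    """Assign each object by the gene matrix g, then pay the drug distance d."""
    total = 0.0
    for j in range(len(d)):
        if j in medians:
            continue
        i = min(medians, key=lambda k: g[k][j])  # lower-level choice
        total += d[i][j]                         # upper-level cost
    return total


def simulated_annealing(d, g, p, iters=2000, t0=1.0, cooling=0.995, seed=0):
    """Swap-based annealing over median sets of size p (requires p < len(d))."""
    rng = random.Random(seed)
    n = len(d)
    current = set(rng.sample(range(n), p))
    cur_cost = bilevel_cost(d, g, current)
    best, best_cost = set(current), cur_cost
    temp = t0
    for _ in range(iters):
        out = rng.choice(sorted(current))
        inn = rng.choice(sorted(set(range(n)) - current))
        cand = (current - {out}) | {inn}
        cand_cost = bilevel_cost(d, g, cand)
        delta = cand_cost - cur_cost
        # Accept improvements always, deteriorations with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = set(current), cur_cost
        temp *= cooling
    return best, best_cost


# Hypothetical 4-object instance: d = drug space, g = gene space.
d = [[0, 1, 4, 5], [1, 0, 2, 6], [4, 2, 0, 3], [5, 6, 3, 0]]
g = [[0, 3, 1, 2], [3, 0, 5, 1], [1, 5, 0, 4], [2, 1, 4, 0]]

best, best_cost = simulated_annealing(d, g, 2)
print(sorted(best), best_cost)
```
]]></preformatted>
      <p>The value returned by such a heuristic is only an upper bound on the optimum; in a cut-and-branch setting it would be passed to the exact solver as an initial incumbent.</p>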
      <p>Finally, as an exact method we have used a cut-and-branch scheme, one of the
effective methods tested in previous works for a similar problem. At the root node of the branching
tree the cutting plane method is run, and the new formulation obtained, together with
an upper bound provided by simulated annealing, is then used to solve the
problem with a branch-and-bound algorithm.</p>
    </sec>
    <sec id="sec-4">
      <title>Clustering analysis of the NCI-60 dataset</title>
      <p>
        The proposed approach was implemented in C++. The callable library of the MIP solver FICO Xpress was used as the branch-and-cut
framework. We compare our results with those obtained in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] as well as
with other integrative clustering techniques presented in that paper. The number of
clusters p was equal to 9.
      </p>
      <p>
        As a measure of dissimilarity between cell lines both in the drug and gene spaces, we
apply one of the most widely used measures, based on the Pearson correlation coefficient,
i.e. dist<sub>ij</sub> = 1 − corr<sub>ij</sub>, (i, j) ∈ A [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        We report the results of a two-source cluster analysis of the National Cancer
Institute (NCI)-60 panel of human tumour cell lines [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. The dataset consists
of 60 cell lines from 9 cancer tissues. For the sake of comparison, we use the same dataset
as was previously considered in [
        <xref ref-type="bibr" rid="ref10 ref21">10, 21</xref>
        ]. It includes 1376 gene expression profiles and
1400 drug activity patterns. Thus, I is a set of m = 60 cell lines, and the dissimilarity
matrices {g<sub>ij</sub>} and {d<sub>ij</sub>} are computed for each pair (i, j) ∈ A as dist<sub>ij</sub>, using the given
gene expression and drug response data.
      </p>
      <p>To evaluate the quality of a clustering solution we use the average Pearson correlation
coefficient
        <disp-formula><tex-math><![CDATA[P = \sum_{k=1}^{p} \frac{2}{m\,(|C_k| - 1)} \sum_{i,j \in C_k :\, i < j} \mathrm{corr}_{ij},]]></tex-math></disp-formula>
where C<sub>k</sub> is cluster k. This coefficient was computed both in the
gene and drug spaces, thus providing the values P<sup>G</sup> and P<sup>D</sup> respectively.</p>
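      <p>The correlation-based dissimilarity and the average within-cluster correlation can be computed as in the following Python sketch. It assumes that the inner sum of the quality measure runs over pairs of objects inside each cluster and that singleton clusters contribute nothing; both points are our reading rather than statements of the source, and the profiles and function names are hypothetical.</p>
      <preformatted><![CDATA[
```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equally long, non-constant vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def dissimilarity_matrix(profiles):
    """dist[i][j] = 1 - corr(i, j), with zeros on the diagonal."""
    m = len(profiles)
    return [[0.0 if i == j else 1.0 - pearson(profiles[i], profiles[j])
             for j in range(m)] for i in range(m)]


def average_cluster_correlation(profiles, clusters):
    """P = sum_k 2 / (m (|C_k| - 1)) * sum of corr over pairs i < j inside C_k."""
    m = len(profiles)
    total = 0.0
    for members in clusters:
        if len(members) < 2:
            continue  # assumption: a singleton cluster adds nothing
        pair_sum = sum(pearson(profiles[a], profiles[b])
                       for idx, a in enumerate(members)
                       for b in members[idx + 1:])
        total += 2.0 * pair_sum / (m * (len(members) - 1))
    return total


# Hypothetical profiles: objects 0, 1 rise together; objects 2, 3 fall together.
profiles = [[1, 2, 3], [2, 4, 6], [3, 2, 1], [6, 4, 2]]
dist = dissimilarity_matrix(profiles)
quality = average_cluster_correlation(profiles, [[0, 1], [2, 3]])
print(dist[0][1], dist[0][2], quality)
```
]]></preformatted>
      <p>Running the same function once on the gene profiles and once on the drug profiles yields the two quality values for a given clustering.</p>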
      <p>
        The results of the computational experiments are presented in Table 1, where the
first column gives the method applied and the second column contains the values
of the parameters setting the step size of the consensus p-median approach and the
weighting coefficients of the Soft Topographic Vector Quantization method, respectively,
from [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The notations (d) and (g) indicate whether the drug or the gene dissimilarity matrix
is used on the upper level.
      </p>
      <p>
        Note that the clustering results presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] were obtained on the basis of
leave-one-out cross-validation, i.e. with cell line i removed from the set of cell lines for each i ∈ {1, …, m}, and the confidence
intervals as well as the mean were then estimated. The leave-one-out cross-validation
procedure shows a small effect size for most of the approaches, both for correlations in
the gene and drug space; thus we compare our results with the reported mean values.
      </p>
      <p>Analysing the results obtained, we can conclude that our approach provides
better integrative clustering solutions than most of the methods under consideration, i.e.
STVQ, p-median, k-means, relational k-means, and probabilistic d-clustering. These
results are better with respect to the cluster homogeneity in the drug space in all cases.
They are also better in the gene space when STVQ with a weighting coefficient of at most 0.2 is
considered. Compared with the consensus p-median clustering approach, our method
provides competitive or slightly worse clustering solutions. Nevertheless, the main advantage
of the proposed approach is that one does not have to solve a series of integer linear programs
with different values of the step-size parameter, which in the case of large problem instances may be of great
importance. Moreover, our method can be a useful tool and provide competitive
solutions when no prior information about the data structure is known. This is especially
important in the field of unsupervised machine learning.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><given-names>E.</given-names> <surname>Alekseeva</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Kochetov</surname></string-name>, and
          <string-name><given-names>A.</given-names> <surname>Plyasunov</surname></string-name>.
          <article-title>An exact method for the discrete (r|p)-centroid problem</article-title>.
          <source>J. Glob. Optim.</source>,
          <volume>63</volume>(<issue>3</issue>):<fpage>445</fpage>–<lpage>460</lpage>,
          <year>2015</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><given-names>E.</given-names> <surname>Alekseeva</surname></string-name>,
          <string-name><given-names>Yu.</given-names> <surname>Kochetov</surname></string-name>, and
          <string-name><given-names>A.</given-names> <surname>Plyasunov</surname></string-name>.
          <article-title>Complexity of local search for the p-median problem</article-title>.
          <source>Eur. J. Oper. Res.</source>,
          <volume>191</volume>(<issue>3</issue>):<fpage>736</fpage>–<lpage>752</lpage>,
          <year>2008</year>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><given-names>F.</given-names> <surname>Archetti</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Giordani</surname></string-name>, and
          <string-name><given-names>L.</given-names> <surname>Vanneschi</surname></string-name>.
          <article-title>Genetic programming for anticancer therapeutic response prediction using the NCI-60 dataset</article-title>.
          <source>Comput. Oper. Res.</source>,
          <volume>37</volume>(<issue>8</issue>):<fpage>1395</fpage>–<lpage>1405</lpage>,
          <year>2010</year>.
          Operations Research and Data Mining in Biological Systems.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><given-names>P.</given-names> <surname>Avella</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Boccia</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Salerno</surname></string-name>, and
          <string-name><given-names>I.</given-names> <surname>Vasilyev</surname></string-name>.
          <article-title>An aggregation heuristic for large scale p-median problem</article-title>.
          <source>Comput. Oper. Res.</source>,
          <volume>39</volume>(<issue>7</issue>):<fpage>1625</fpage>–<lpage>1632</lpage>,
          <year>2012</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><given-names>P.</given-names> <surname>Avella</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Sassano</surname></string-name>, and
          <string-name><given-names>I.</given-names> <surname>Vasilyev</surname></string-name>.
          <article-title>Computational study of large-scale p-median problems</article-title>.
          <source>Math. Program.</source>,
          <volume>109</volume>(<issue>1</issue>):<fpage>89</fpage>–<lpage>114</lpage>,
          <year>2007</year>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name><given-names>A.</given-names> <surname>Banerjee</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Merugu</surname></string-name>,
          <string-name><given-names>I. S.</given-names> <surname>Dhillon</surname></string-name>, and
          <string-name><given-names>J.</given-names> <surname>Ghosh</surname></string-name>.
          <article-title>Clustering with Bregman divergences</article-title>.
          <source>J. Mach. Learn. Res.</source>,
          <volume>6</volume>:<fpage>1705</fpage>–<lpage>1749</lpage>,
          <year>2005</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name><given-names>L.</given-names> <surname>Cánovas</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>García</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Labbé</surname></string-name>, and
          <string-name><given-names>A.</given-names> <surname>Marín</surname></string-name>.
          <article-title>A strengthened formulation for the simple plant location problem with order</article-title>.
          <source>Oper. Res. Lett.</source>,
          <volume>35</volume>(<issue>2</issue>):<fpage>141</fpage>–<lpage>150</lpage>,
          <year>2007</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><given-names>J.-H.</given-names> <surname>Chang</surname></string-name>,
          <string-name><given-names>K.-B.</given-names> <surname>Hwang</surname></string-name>, and
          <string-name><given-names>B.-T.</given-names> <surname>Zhang</surname></string-name>.
          <source>Methods of Microarray Data Analysis II: Papers from CAMDA '01</source>, chapter
          <article-title>Analysis of Gene Expression Profiles and Drug Activity Patterns by Clustering and Bayesian Network Learning</article-title>,
          pages <fpage>169</fpage>–<lpage>184</lpage>. Springer, Boston,
          <year>2002</year>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name><given-names>J. H.</given-names> <surname>Do</surname></string-name> and
          <string-name><given-names>D. K.</given-names> <surname>Choi</surname></string-name>.
          <article-title>Clustering approaches to identifying gene expression patterns from DNA microarray data</article-title>.
          <source>Mol. Cells</source>,
          <volume>25</volume>(<issue>2</issue>):<fpage>279</fpage>–<lpage>288</lpage>,
          <year>2008</year>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name><given-names>E.</given-names> <surname>Fersini</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Messina</surname></string-name>, and
          <string-name><given-names>F.</given-names> <surname>Archetti</surname></string-name>.
          <article-title>A p-median approach for predicting drug response in tumour cells</article-title>.
          <source>BMC Bioinformatics</source>,
          <volume>15</volume>(<issue>1</issue>):<fpage>1</fpage>–<lpage>19</lpage>,
          <year>2014</year>.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name><given-names>E.</given-names> <surname>Fersini</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Messina</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Archetti</surname></string-name>, and
          <string-name><given-names>C.</given-names> <surname>Manfredotti</surname></string-name>.
          <article-title>Combining gene expression profiles and drug activity patterns analysis: A relational clustering approach</article-title>.
          <source>J. Math. Mod. Alg.</source>,
          <volume>9</volume>(<issue>3</issue>):<fpage>275</fpage>–<lpage>289</lpage>,
          <year>2010</year>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name><given-names>S.</given-names> <surname>García</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Labbé</surname></string-name>, and
          <string-name><given-names>A.</given-names> <surname>Marín</surname></string-name>.
          <article-title>Solving large p-median problems with a radius formulation</article-title>.
          <source>INFORMS J. Comput.</source>,
          <volume>23</volume>(<issue>4</issue>):<fpage>546</fpage>–<lpage>556</lpage>,
          <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name><given-names>S. L.</given-names> <surname>Hakimi</surname></string-name>.
          <article-title>Optimum distribution of switching centers in a communication network and some related graph theoretic problems</article-title>.
          <source>Oper. Res.</source>,
          <volume>13</volume>(<issue>3</issue>):<fpage>462</fpage>–<lpage>475</lpage>,
          <year>1965</year>.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name><given-names>P.</given-names> <surname>Hanjoul</surname></string-name> and
          <string-name><given-names>D.</given-names> <surname>Peeters</surname></string-name>.
          <article-title>A facility location problem with clients' preference orderings</article-title>.
          <source>Regional Sci. Urban Econom.</source>,
          <volume>17</volume>(<issue>3</issue>):<fpage>451</fpage>–<lpage>473</lpage>,
          <year>1987</year>.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>P.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Brimberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Urosevic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Mladenovic</surname>
          </string-name>
          .
          <article-title>Solving large p-median clustering problems by primal-dual variable neighborhood search</article-title>
          .
          <source>Data Min. Knowl. Discov.</source>
          ,
          <volume>19</volume>
          (
          <issue>3</issue>
          ):
          <fpage>351</fpage>
          –
          <lpage>375</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>P.</given-names>
            <surname>Hansen</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Jaumard</surname>
          </string-name>
          .
          <article-title>Cluster analysis and mathematical programming</article-title>
          .
          <source>Math. Program.</source>
          ,
          <volume>79</volume>
          (
          <issue>1-3</issue>
          ):
          <fpage>191</fpage>
          –
          <lpage>215</lpage>
          ,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>E. R.</given-names>
            <surname>Holzinger</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Ritchie</surname>
          </string-name>
          .
          <article-title>Integrating heterogeneous high-throughput data for meta-dimensional pharmacogenomics and disease-related studies</article-title>
          .
          <source>Pharmacogenomics</source>
          ,
          <volume>13</volume>
          (
          <issue>2</issue>
          ):
          <fpage>213</fpage>
          –
          <lpage>222</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>H. F.</given-names>
            <surname>Köhn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Steinley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Brusco</surname>
          </string-name>
          .
          <article-title>The p-median model as a tool for clustering psychological data</article-title>
          .
          <source>Psychol. Methods</source>
          ,
          <volume>15</volume>
          (
          <issue>1</issue>
          ):
          <fpage>87</fpage>
          –
          <lpage>95</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>V. N.</given-names>
            <surname>Kristensen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. C.</given-names>
            <surname>Lingjærde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. G.</given-names>
            <surname>Russnes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. K. M.</given-names>
            <surname>Vollan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Frigessi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.-L.</given-names>
            <surname>Børresen-Dale</surname>
          </string-name>
          .
          <article-title>Principles and methods of integrative genomic analyses in cancer</article-title>
          .
          <source>Nat. Rev. Cancer</source>
          ,
          <volume>14</volume>
          (
          <issue>5</issue>
          ):
          <fpage>299</fpage>
          –
          <lpage>313</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Ritchie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. R.</given-names>
            <surname>Holzinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Pendergrass</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Methods of integrating data to uncover genotype-phenotype interactions</article-title>
          .
          <source>Nat. Rev. Genet.</source>
          ,
          <volume>16</volume>
          :
          <fpage>85</fpage>
          –
          <lpage>97</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>U.</given-names>
            <surname>Scherf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. T.</given-names>
            <surname>Ross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Waltham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. H.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tanabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. W.</given-names>
            <surname>Kohn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. C.</given-names>
            <surname>Reinhold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Myers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. T.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Scudiero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Eisen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Sausville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Pommier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Botstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. O.</given-names>
            <surname>Brown</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. N.</given-names>
            <surname>Weinstein</surname>
          </string-name>
          .
          <article-title>A gene expression database for the molecular pharmacology of cancer</article-title>
          .
          <source>Nat. Genet.</source>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ):
          <fpage>236</fpage>
          –
          <lpage>244</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Shoemaker</surname>
          </string-name>
          .
          <article-title>The NCI60 human tumour cell line anticancer drug screen</article-title>
          .
          <source>Nat. Rev. Cancer</source>
          ,
          <volume>6</volume>
          (
          <issue>10</issue>
          ):
          <fpage>813</fpage>
          –
          <lpage>823</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Ushakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. L.</given-names>
            <surname>Vasilyev</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T. V.</given-names>
            <surname>Gruzdeva</surname>
          </string-name>
          .
          <article-title>A computational comparison of the p-median clustering and k-means</article-title>
          .
          <source>International Journal of Artificial Intelligence</source>
          ,
          <volume>13</volume>
          (
          <issue>1</issue>
          ):
          <fpage>229</fpage>
          –
          <lpage>242</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
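          24.
          <string-name>
            <given-names>I.</given-names>
            <surname>Vasil'ev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Klimentova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yu.</given-names>
            <surname>Kochetov</surname>
          </string-name>
          .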
          <article-title>New lower bounds for the facility location problem with clients' preferences</article-title>
          .
          <source>Comput. Math. Math. Phys.</source>
          ,
          <volume>49</volume>
          (
          <issue>6</issue>
          ):
          <fpage>1010</fpage>
          –
          <lpage>1020</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>I.</given-names>
            <surname>Vasilyev</surname>
          </string-name>
          and
          <string-name>
            <given-names>X.</given-names>
            <surname>Klimentova</surname>
          </string-name>
          .
          <article-title>The branch and cut method for the facility location problem with clients preferences</article-title>
          .
          <source>J. Appl. Ind. Math.</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ):
          <fpage>441</fpage>
          –
          <lpage>454</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>I.</given-names>
            <surname>Vasilyev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Klimentova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Boccia</surname>
          </string-name>
          .
          <article-title>Polyhedral study of simple plant location problem with order</article-title>
          .
          <source>Oper. Res. Lett.</source>
          ,
          <volume>41</volume>
          (
          <issue>2</issue>
          ):
          <fpage>153</fpage>
          –
          <lpage>158</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wei</surname>
          </string-name>
          .
          <article-title>Integrative analyses of cancer data: A review from a statistical perspective</article-title>
          .
          <source>Cancer Inform.</source>
          , pages
          <fpage>173</fpage>
          –
          <lpage>181</lpage>
          , 05
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>