<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Approach to Data Mining Inside PostgreSQL Based on Parallel Implementation of UDFs</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Proceedings of the XIX International Conference “Data Analytics and Management in Data Intensive Domains” (DAMDID/RCDL'2017)</institution>
          ,
          <addr-line>Moscow</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Timofey Rechkalov South Ural State University</institution>
          ,
          <addr-line>Chelyabinsk</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>114</fpage>
      <lpage>121</lpage>
      <abstract>
        <p>Relational DBMSs remain the most popular tool for data processing. However, most stand-alone data mining packages process flat files outside a DBMS. In-database data mining avoids the bottleneck of exporting data and importing results that stand-alone mining packages incur, and keeps all the benefits provided by a DBMS. The paper describes an approach to data mining inside PostgreSQL based on a parallel implementation of user-defined functions (UDFs) for modern Intel many-core platforms. Each UDF performs a single mining task on data from the specified table and produces a resulting table. The UDF is organized as a wrapper of an appropriate mining algorithm, which is implemented in C and parallelized using OpenMP and thread-level parallelism. The library of such UDFs supports a cache of precomputed mining structures to reduce computation costs. We compare the performance of our approach with the R data mining package, and experiments show the efficiency of the proposed approach.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Relational DBMSs currently remain the most popular
facility for storing, updating and querying structured
data. At the same time, most data mining algorithms
assume that flat files are processed outside a DBMS.
However, exporting data sets and importing mining
results impede the analysis of large databases outside a
DBMS [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In addition to avoiding the export-import
bottleneck, data mining inside a DBMS
gives the end user many benefits, such as query
optimization, data consistency and security.
      </p>
      <p>Existing approaches to integrating data mining with
relational DBMSs include special data mining
languages and SQL extensions, implementations of mining
algorithms in plain SQL, and user-defined functions
(UDFs) implemented in a high-level language such as C++.
The last approach lends itself to
parallel processing on modern many-core platforms.</p>
      <p>
        In this paper, we present an approach to data mining
inside the open-source PostgreSQL DBMS that exploits the
capabilities of the modern Intel MIC (Many Integrated
Core) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] platform. Our approach relies on a library of
UDFs, each of which performs a single mining
task on data from the specified table and produces a
resulting table. Each UDF is organized as a wrapper of
an appropriate mining algorithm, which is implemented
in C and parallelized for the Intel MIC platform
using OpenMP and thread-level parallelism.
      </p>
      <p>The paper is structured as follows. We describe the
proposed approach in Section 2. The results of the
experimental evaluation of our approach are given in
Section 3. Section 4 briefly discusses related work.
Section 5 contains concluding remarks and directions
for future research.</p>
    </sec>
    <sec id="sec-3">
      <title>2 Embedding of data mining functions into PostgreSQL</title>
      <sec id="sec-3-1">
        <title>2.1 Motivating example</title>
        <p>Our approach aims to provide a database application
programmer with a library of data mining functions
that can be run inside the DBMS, as shown in Fig. 1.</p>
        <p>
          In this example, the mining function performs
clustering with the Partitioning Around Medoids (PAM) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
algorithm on the data points from the specified input table
and saves the results in the output table (given the
number of input table columns to use, the number
of clusters and the accuracy). The application programmer
does not need to export the data to be mined from the DBMS and
import the mining results back. At the same time, PAM here
encapsulates a parallel implementation [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] based on
OpenMP technology and thread-level parallelism.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>2.2 Component structure</title>
        <p>The pgMining library consists of two
subsystems, namely Frontend and Backend, where the
former provides the presentation layer and the latter the data
access layer for an application programmer.</p>
        <p>The Frontend provides a set of functions for mining
inside PostgreSQL. Each function performs a single
mining task (e.g. clustering, classification, pattern
search, etc.) and produces a resulting table.</p>
        <p>The Backend consists of two modules, namely
Wrapper and Cache manager. The Wrapper provides
functions that serve as envelopes for the respective
mining functions from the mcMining library. The Cache
manager maintains a cache of precomputed mining structures
to reduce computation costs.</p>
        <p>The mcMining library provides a set of functions that
solve various data mining tasks in main memory and
exploit the capabilities of Intel many-core platforms.</p>
      </sec>
      <sec id="sec-3-3">
        <title>2.3 Frontend</title>
        <p>An example of a Frontend function is given in Fig. 3.
Such a function connects to PostgreSQL, carries out
a mining task and returns an exit code (0 in case of
success, otherwise a negative error code). As a side
effect, the function creates a table with the mining results.
The function's mandatory parameters are the ID of the
PostgreSQL connection, the name of the input table, the name of
the output table and the number of leftmost columns of the
input table that contain the data to be mined. The remaining
parameters are specific to the task (e.g. number of clusters,
accuracy, etc.).</p>
        <p>In fact, a Frontend function wraps the respective
UDF from the Backend, which is loaded into PostgreSQL
and executed as an “INSERT INTO … SELECT …” query to
save the mining results in the specified table.</p>
      </sec>
      <sec id="sec-3-4">
        <title>2.4 Backend</title>
        <p>The Cache manager provides a buffer pool to store
precomputed mining structures. A distance matrix is a
typical example of a mining structure to be saved in the
cache. Indeed, a distance matrix <italic>A</italic> = (<italic>a</italic><sub><italic>ij</italic></sub>) stores the distance
between each pair of elements <italic>i</italic> and <italic>j</italic> of the input data set.
Once precomputed, the distance matrix can be used
many times to perform clustering or kNN-based
classification with various parameters (e.g. number of
clusters, number of neighbors, accuracy, etc.).</p>
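        <p>As a minimal sketch of such a structure, the following Python fragment computes a full pairwise distance matrix once so that it can be reused by later runs. The Euclidean metric, the name distance_matrix and the sample points are assumptions made for illustration only; the library described in this paper is implemented in C.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def distance_matrix(points):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # The matrix is symmetric, so each pair is computed only once.
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            dist[i][j] = d
            dist[j][i] = d
    return dist


# Hypothetical sample data: three points on a line in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = distance_matrix(points)
```
]]></preformat>
        <p>Because the matrix depends only on the input points and not on parameters such as the number of clusters, the same result can serve several mining runs over the same table.</p>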
        <p>The Cache manager exports the two basic
functions depicted in Fig. 5. The putObject function
loads a mining structure, specified by its ID, buffer
pointer and size, into the cache. The getObject function searches the
cache for an object with the given ID. The ID of a mining
structure is a string formed by concatenating the
input table's name and the object's informational string (e.g.
“_distMatrix”).</p>
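        <p>The lookup-then-compute pattern that such a cache enables can be sketched in Python as follows. This is an illustrative analogue, not the library's C interface: the class MiningCache, the methods put_object and get_object, the helper cached_distance_matrix and the table name points_table are all hypothetical, and a Python object stands in for the buffer pointer and size.</p>
        <preformat preformat-type="code"><![CDATA[
```python
class MiningCache:
    """In-memory store of precomputed mining structures keyed by string ID."""

    def __init__(self):
        self._objects = {}

    def put_object(self, object_id, obj):
        # Analogue of putObject: store a structure under its ID.
        self._objects[object_id] = obj

    def get_object(self, object_id):
        # Analogue of getObject: return the structure, or None if absent.
        return self._objects.get(object_id)


def cached_distance_matrix(cache, table_name, compute):
    """Return the cached matrix for a table, computing it on a cache miss."""
    # The ID concatenates the table name and an informational string.
    object_id = table_name + "_distMatrix"
    matrix = cache.get_object(object_id)
    if matrix is None:
        matrix = compute()
        cache.put_object(object_id, matrix)
    return matrix


computations = []  # records how many times the costly step actually ran


def expensive_matrix():
    computations.append("run")
    return [[0.0, 1.0], [1.0, 0.0]]


cache = MiningCache()
first = cached_distance_matrix(cache, "points_table", expensive_matrix)
second = cached_distance_matrix(cache, "points_table", expensive_matrix)
```
]]></preformat>
        <p>In this sketch the second call finds the matrix in the cache, so the costly computation runs only once for the table.</p>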
      </sec>
      <sec id="sec-3-5">
        <title>2.5 Library of parallel many-core algorithms</title>
        <p>
          In this example, we use the Partitioning Around Medoids
(PAM) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] clustering algorithm, which is used in a wide
spectrum of applications where minimal sensitivity to
noisy data is required. PAM has this
property because it represents cluster centers by points of the input
data set (medoids).
        </p>
        <p>PAM first calculates the distance matrix for the
given data points. Then, in the BUILD phase, an initial
clustering is obtained by the successive selection of
medoids until the required number of clusters has been
found. Next, in the SWAP phase, the algorithm attempts
to improve the clustering in accordance with an objective
function. However, for large and high-dimensional
datasets, the computations of PAM are very costly.</p>
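        <p>The two phases can be sketched sequentially in Python as shown below. This is a plain textbook-style illustration operating on a precomputed distance matrix, not the parallel mcPAM implementation; the function names, the one-dimensional sample values and the omission of an accuracy parameter are assumptions of the sketch. The objective function is taken to be the sum of distances from each point to its nearest medoid.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def total_cost(dist, medoids):
    """Objective: sum of distances from every point to its nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def build(dist, k):
    """BUILD phase: greedily add the medoid that lowers the objective most."""
    medoids = []
    while len(medoids) < k:
        candidates = [c for c in range(len(dist)) if c not in medoids]
        best = min(candidates, key=lambda c: total_cost(dist, medoids + [c]))
        medoids.append(best)
    return medoids


def swap(dist, medoids):
    """SWAP phase: apply the best medoid/non-medoid exchange until none helps."""
    medoids = list(medoids)
    best_cost = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        best_pair = None
        for pos in range(len(medoids)):
            for h in range(len(dist)):
                if h in medoids:
                    continue
                trial = medoids[:pos] + [h] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best_cost:
                    best_cost, best_pair = cost, (pos, h)
        if best_pair is not None:
            pos, h = best_pair
            medoids[pos] = h
            improved = True
    return medoids, best_cost


def pam(dist, k):
    """Run BUILD then SWAP and label each point with its nearest medoid."""
    medoids, cost = swap(dist, build(dist, k))
    labels = [min(range(k), key=lambda p: row[medoids[p]]) for row in dist]
    return medoids, labels, cost


# Hypothetical sample data: two well-separated groups on a line.
values = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in values] for a in values]
medoids, labels, cost = pam(dist, 2)
```
]]></preformat>
        <p>Every candidate evaluation in both phases rescans the distance matrix, which illustrates why the computations grow costly on large data sets and why the matrix is worth precomputing.</p>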
        <p>
          In our previous research [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ], we parallelized PAM
for the Intel Xeon CPU and the Intel Xeon Phi coprocessor. To
perform best on Intel many-core platforms, the
parallel version of PAM restructures loops
to enable vectorization of calculations and processes
data chunk by chunk to decrease the number of cache
misses.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3 Experimental evaluation</title>
      <sec id="sec-4-1">
        <title>3.1 Hardware, datasets and goals of experiments</title>
        <p>
          To evaluate the developed approach, we performed
experiments on the Tornado SUSU supercomputer [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]
whose nodes each provide two different platforms, namely an
Intel Xeon CPU and an Intel Xeon Phi coprocessor (cf.
Tab. 1 for the specifications).
        </p>
        <p>
          In the experiments, we studied the following aspects
of the developed approach. Firstly, we investigated the
speedup of the mcPAM function to understand its
scalability on both platforms depending on the number of threads
employed. Secondly, we evaluated the runtime of the
mcPAM function to understand how the performance on
both platforms depends on the number of data points and
what benefits we could derive from precomputing
the distance matrix. Finally, we compared the
performance of the pgPAM function with the implementation of the
PAM algorithm from the R data mining package [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2 Results of experiments</title>
        <p>The results of the first series of experiments on mcPAM
speedup are depicted in Fig. 7. On both platforms,
mcPAM’s speedup is close to linear, when the number
of threads matches the number of physical cores the
algorithm is running on (i.e. 12 cores for Intel Xeon and
60 cores for Intel Xeon Phi, respectively).</p>
        <p>Speedup becomes sub-linear when the algorithm
uses more than one thread per physical core. The mcPAM
achieves up to 15× and 120× speedup on Intel Xeon and
Intel Xeon Phi, respectively. Summing up, mcPAM
demonstrates good scalability on both platforms.</p>
        <p>Figure 7 Speedup of the mcPAM function: (a) Intel Xeon CPU; (b) Intel Xeon Phi coprocessor</p>
        <p>Figure 8 Performance of the mcPAM function: (a) FCS Human dataset; (b) MixSim dataset; (c) US Census dataset; (d) Power Consumption dataset</p>
        <p>Overall performance is better on Intel Xeon Phi than
on Intel Xeon when the algorithm deals with
high-dimensional datasets, owing to intensive
vectorization of the distance matrix calculations. Since
calculating the distance matrix takes from 15 to 80 percent of
the overall runtime, we can derive substantial benefits from
caching the distance matrix.</p>
        <p>The results of the third series of experiments,
comparing the performance of pgPAM and PAM from the R
data mining package, are illustrated in Fig. 9. We carried
out this series of experiments on the Intel Xeon platform
only, for the following reason. Running PostgreSQL
on the Intel MIC platform requires Intel Xeon Phi Knights
Landing (KNL), the next-generation product
from Intel, which is a bootable device. However, Intel Xeon
Phi KNL is not yet available on the Tornado SUSU
supercomputer. We plan to perform this study in further
research.</p>
        <p>We can see that pgPAM significantly outperforms R's
PAM both when one thread is employed and when the maximum
number of threads is employed. Caching the distance
matrix reduces the overall runtime by up to 80 percent
(in the case of a high-dimensional dataset).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 Related work</title>
      <p>The problem of integrating data analytics with relational
DBMSs has been studied since the origins of data mining
research.</p>
      <p>
        Data mining query languages include DMQL [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ],
MSQL [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], MINE RULE operator [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and Microsoft's
DMX [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ].
      </p>
      <p>
        There are many SQL implementations of data mining
algorithms. SQL versions of classical clustering
algorithms include K-Means [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], EM [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], Fuzzy
C-Means [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. SQL versions of association rule mining
algorithms include K-Way-Join, Three-Way-Join,
Subquery and Two-Group-Bys [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], Set-oriented
Apriori [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ], Quiver [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], Propad [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. Classification
includes SQL implementations of decision trees [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ],
kNN [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] and Bayesian classification [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. SQL is also
successfully used to mine data of a
“non-relational” nature, such as graphs, for instance in the search
for frequent graphs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], detection of cycles in graph [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
graph partitioning [
        <xref ref-type="bibr" rid="ref11 ref22">11, 22</xref>
        ], etc.
      </p>
      <p>
        User-defined functions-based approach. Integration
of correlation, linear regression, PCA and clustering
into the Teradata DBMS based on UDFs is proposed
in [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. There are two sets of UDFs that work in a
single table scan: an aggregate UDF to compute
summary matrices and a set of scalar UDFs to score
data sets. Experiments showed that UDFs are faster than
SQL queries and more efficient than C++ code run
outside the DBMS, because of long export times. In [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], UDFs implementing
common vector operations were presented, and it was
shown that UDFs are as efficient as automatically
generated SQL queries with arithmetic expressions, and that
queries calling scalar UDFs are significantly more
efficient than equivalent queries using SQL aggregations.
      </p>
      <p>
        In-database mining frameworks. The ATLAS [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] is
a framework for in-database analytics, which provides
an SQL-like database language with user-defined
aggregates (UDAs) and table functions. The system's
language processor translates ATLAS programs into C++
code, which is then compiled and linked with the
database storage manager and user-defined external
functions. The authors presented ATLAS-based
implementations of several data mining algorithms.
      </p>
      <p>
        The MADlib [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is an open-source library of
in-database analytical algorithms for PostgreSQL.
MADlib is implemented by a large team and provides
many methods for supervised learning, unsupervised
learning and descriptive statistics. MADlib exploits
UDAs, UDFs, and a sparse matrix C library to provide
efficient representations on disk and in memory. As
many statistical methods are iterative (i.e. they make
many passes over a data set), the authors wrote a driver
UDF in Python to control iteration in such a way that all
large data movement is done within the database engine
and its buffer pool.
      </p>
      <p>Comparison. In this paper, we suggest an approach
to embedding data mining functions into PostgreSQL.
Like some of the methods mentioned above, our approach
exploits UDFs. It differs from previous work
in the following respects. Our approach assumes that
UDFs are parallelized for the many-core platform that the
DBMS is running on. All the parallelization details are
encapsulated in the implementation of the UDF and are
hidden from the DBMS, so our approach could be ported
to another open-source DBMS (with possibly
non-trivial but mechanical software development effort). In
addition, our approach includes a special module,
which provides a cache of precomputed mining
structures and lets a UDF reuse these structures to
reduce computation costs.</p>
    </sec>
    <sec id="sec-6">
      <title>5 Conclusion</title>
      <p>In this paper, we touch upon the problem of organizing
data mining inside a DBMS. We present an approach to
implementation of in-database analytical functions for
PostgreSQL that exploits capabilities of modern Intel
many-core platforms.</p>
      <p>Our approach involves two
libraries, namely pgMining and mcMining.
pgMining is a library of data mining functions, each
of which is run inside PostgreSQL. mcMining
is a library that exports functions for various data
mining tasks, parallelized for Intel MIC
platforms.</p>
      <p>pgMining consists of Frontend and Backend
subsystems. A Frontend function loads a UDF
from the Backend into PostgreSQL and executes it as an
“INSERT INTO … SELECT …” query to save mining
results in a table. The Backend consists of the Wrapper and
Cache manager modules. The Wrapper provides
functions that serve as envelopes for the respective
mcMining functions. The Cache manager maintains a
cache of precomputed mining structures to reduce computation
costs.</p>
      <p>Since our approach assumes hiding details of
parallel implementation from PostgreSQL, such an approach
could be ported to some other open-source DBMS (with
possible non-trivial but mechanical software
development effort).</p>
      <p>We have evaluated our approach on a previously
implemented parallel clustering algorithm from the mcMining
library and four real datasets. Experiments showed good
speedup and performance of the algorithm, and showed that
our approach benefits from caching
precomputed mining structures and outperforms the R data mining
package.</p>
      <p>As future work, we plan to implement other mining
algorithms for the mcMining library and conduct
experiments on the Intel Xeon Phi Knights Landing platform.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgement</title>
      <p>This work was financially supported by the Russian
Foundation for Basic Research (grant No. 17-07-00463),
by Act 211 of the Government of the Russian Federation
(contract No. 02.A03.21.0011) and by the Ministry of
Education and Science of the Russian Federation (government
order 2.7905.2017/8.9).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Balachandran</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Padmanabhan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chakravarthy</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Enhanced DB-Subdue: Supporting Subtle Aspects of Graph Mining Using a Relational Approach</article-title>
          . In: W.K. Ng,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kitsuregawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <surname>K.</surname>
          </string-name>
          Chang (eds.)
          <article-title>Advances in Knowledge Discovery and Data Mining, 10th Pacific-Asia Conf</article-title>
          .,
          <source>PAKDD</source>
          <year>2006</year>
          , Singapore, April 9-
          <issue>12</issue>
          ,
          <year>2006</year>
          ,
          <source>Proc., Lecture Notes in Computer Science</source>
          ,
          <volume>3918</volume>
          , pp.
          <fpage>673</fpage>
          -
          <lpage>678</lpage>
          . Springer (
          <year>2006</year>
          ).
          doi:10.1007/11731139_77
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Duran</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klemm</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>The Intel Many Integrated Core Architecture</article-title>
          . In: W.W. Smari, V. Zeljkovic (eds.) HPCS, pp.
          <fpage>365</fpage>
          -
          <lpage>366</lpage>
          . IEEE (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Engreitz</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jr.</surname>
            ,
            <given-names>B.J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marshall</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Altman</surname>
          </string-name>
          , R.B.:
          <article-title>Independent Component Analysis: Mining Microarray Data for Fundamental Human Gene Expression Modules</article-title>
          .
          <source>J. of Biomedical Informatics</source>
          .
          <volume>43</volume>
          (
          <issue>6</issue>
          ), pp.
          <fpage>932</fpage>
          -
          <lpage>944</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Garcia</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ordonez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Efficient Algorithms Based on Relational Queries to Mine Frequent Graphs</article-title>
          . In: A.
          <string-name>
            <surname>Nica</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          Varde (eds.)
          <source>Proc. of the Third Ph.D. Workshop on Information and Knowledge Management</source>
          ,
          <string-name>
            <surname>PIKM</surname>
          </string-name>
          <year>2010</year>
          , Toronto, Ontario, Canada, October
          <volume>30</volume>
          ,
          <year>2010</year>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>24</lpage>
          . ACM (
          <year>2010</year>
          ). doi:10.1145/1871902.1871906
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>J</given-names>
          </string-name>
          .,
          <string-name>
            <surname>Fu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gong</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koperski</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rajan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stefanovic</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaiane</surname>
            ,
            <given-names>O.R.</given-names>
          </string-name>
          :
          <article-title>Dbminer: A System for Mining Knowledge in Large Relational Databases</article-title>
          . In: E. Simoudis, J. Han,
          <string-name>
            <surname>U</surname>
          </string-name>
          .M. Fayyad (eds.)
          <source>Proc. of the Second Int. Conf. on Knowledge Discovery and Data Mining (KDD-96)</source>
          , Portland, Oregon, USA, pp.
          <fpage>250</fpage>
          -
          <lpage>255</lpage>
          . AAAI Press (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Hellerstein</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Re</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schoppmann</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fratkin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gorajek</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>K.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Welton</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feng</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The MADlib Analytics Library or MAD Skills, the SQL</article-title>
          .
          <source>PVLDB</source>
          <volume>5</volume>
          (
          <issue>12</issue>
          ), pp.
          <fpage>1700</fpage>
          -
          <lpage>1711</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Imielinski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Virmani</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>MSQL: A Query Language for Database Mining</article-title>
          .
          <source>Data Min. Knowl. Discov</source>
          .
          <volume>3</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>373</fpage>
          -
          <lpage>408</lpage>
          (
          <year>1999</year>
          ). doi:10.1023/A:1009816913055
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Kaufman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rousseeuw</surname>
            ,
            <given-names>P.J.</given-names>
          </string-name>
          :
          <article-title>Finding Groups in Data: An Introduction to Cluster Analysis</article-title>
          . John Wiley (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Kostenetskiy</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Safonov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>SUSU Supercomputer Resources</article-title>
          . In: L. Sokolinsky, I. Starodubov (eds.)
          <source>PCT'2016, Int. Scientific Conf. on Parallel Computational Technologies</source>
          , Arkhangelsk, Russia, March 29-31, 2016, pp.
          <fpage>561</fpage>
          -
          <lpage>573</lpage>
          . CEUR Workshop Proceedings.
          <volume>1576</volume>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Lichman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>UCI Machine Learning Repository</article-title>
          [http://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption]. Irvine, CA: University of California, School of Information and Computer Science (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>McCaffrey</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          :
          <article-title>A Hybrid System for Analyzing Very Large Graphs</article-title>
          . In: S. Latifi (ed.)
          <source>Ninth Int. Conf. on Information Technology: New Generations, ITNG 2012</source>
          , Las Vegas, Nevada, USA, 16-18 April, 2012, pp.
          <fpage>253</fpage>
          -
          <lpage>257</lpage>
          . IEEE Computer Society (
          <year>2012</year>
          ). doi: 10.1109/ITNG.2012.43
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Meek</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thiesson</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heckerman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The Learning-curve Sampling Method Applied to Model-based Clustering</article-title>
          .
          <source>J. of Machine Learning Research</source>
          <volume>2</volume>
          , pp.
          <fpage>397</fpage>
          -
          <lpage>418</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Melnykov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>W.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maitra</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>MixSim: An R Package for Simulating Data to Study Performance of Clustering Algorithms</article-title>
          .
          <source>J. of Statistical Software, Articles</source>
          <volume>51</volume>
          (
          <issue>12</issue>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>25</lpage>
          (
          <year>2012</year>
          ). doi: 10.18637/jss.v051.i12
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Meo</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Psaila</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>A New SQL-like Operator for Mining Association Rules</article-title>
          . In: T.M. Vijayaraman, A.P. Buchmann, C. Mohan, N.L. Sarda (eds.)
          <source>VLDB'96, Proc. of 22nd Int. Conf. on Very Large Data Bases</source>
          , September 3-6, 1996, Mumbai (Bombay), India, pp.
          <fpage>122</fpage>
          -
          <lpage>133</lpage>
          . Morgan Kaufmann (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Miniakhmetov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zymbler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Integration of Fuzzy c-means Clustering Algorithm with PostgreSQL Database Management System</article-title>
          .
          <source>Numerical Methods and Programming</source>
          <volume>13</volume>
          (
          <issue>2(26)</issue>
          ), pp.
          <fpage>46</fpage>
          -
          <lpage>52</lpage>
          (
          <year>2012</year>
          )
          (in Russian)
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Ordonez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Integrating k-means Clustering with a Relational DBMS Using SQL</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>18</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>188</fpage>
          -
          <lpage>201</lpage>
          (
          <year>2006</year>
          ). doi: 10.1109/TKDE.2006.31
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Ordonez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Building Statistical Models and Scoring with UDFs</article-title>
          . In: C.Y. Chan, B.C. Ooi, A. Zhou (eds.)
          <source>Proc. of the ACM SIGMOD Int. Conf. on Management of Data</source>
          , Beijing, China, June 12-14,
          <year>2007</year>
          , pp.
          <fpage>1005</fpage>
          -
          <lpage>1016</lpage>
          . ACM (
          <year>2007</year>
          ). doi: 10.1145/1247480.1247599
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Ordonez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Statistical Model Computation with UDFs</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>22</volume>
          (
          <issue>12</issue>
          ), pp.
          <fpage>1752</fpage>
          -
          <lpage>1765</lpage>
          (
          <year>2010</year>
          ). doi: 10.1109/TKDE.2010.44
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Ordonez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cereghini</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>SQLEM: Fast Clustering in SQL Using the EM Algorithm</article-title>
          . In: W. Chen, J.F. Naughton, P.A. Bernstein (eds.)
          <source>Proc. of the 2000 ACM SIGMOD Int. Conf. on Management of Data</source>
          , May 16-18, 2000, Dallas, Texas, USA, pp.
          <fpage>559</fpage>
          -
          <lpage>570</lpage>
          . ACM (
          <year>2000</year>
          ). doi: 10.1145/342009.335468
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Ordonez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garcia-Garcia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Vector and Matrix Operations Programmed with UDFs in a Relational DBMS</article-title>
          . In: P.S. Yu, V.J. Tsotras, E.A. Fox, B. Liu (eds.)
          <source>Proc. of the 2006 ACM CIKM Int. Conf. on Information and Knowledge Management</source>
          , Arlington, Virginia, USA, November 6-11, 2006, pp.
          <fpage>503</fpage>
          -
          <lpage>512</lpage>
          . ACM (
          <year>2006</year>
          ). doi: 10.1145/1183614.1183687
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Ordonez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pitchaimalai</surname>
            ,
            <given-names>S.K.</given-names>
          </string-name>
          :
          <article-title>Bayesian Classifiers Programmed in SQL</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>22</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>139</fpage>
          -
          <lpage>144</lpage>
          (
          <year>2010</year>
          ). doi: 10.1109/TKDE.2009.127
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zymbler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Very Large Graph Partitioning by Means of Parallel DBMS</article-title>
          . In: B. Catania, G. Guerrini, J. Pokorny (eds.)
          <source>Advances in Databases and Information Systems - 17th East European Conf., ADBIS 2013</source>
          , Genoa, Italy, September 1-4, 2013. Proc., Lecture Notes in Computer Science,
          <volume>8133</volume>
          , pp.
          <fpage>388</fpage>
          -
          <lpage>399</lpage>
          . Springer (
          <year>2013</year>
          ). doi: 10.1007/978-3-642-40683-6_29
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Rantzau</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Frequent Itemset Discovery with SQL Using Universal Quantification</article-title>
          . In: R. Meo, P.L. Lanzi, M. Klemettinen (eds.)
          <source>Database Support for Data Mining Applications: Discovering Knowledge with Inductive Queries</source>
          , Lecture Notes in Computer Science,
          <volume>2682</volume>
          , pp.
          <fpage>194</fpage>
          -
          <lpage>213</lpage>
          . Springer (
          <year>2004</year>
          ). doi: 10.1007/978-3-540-44497-8_10
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Rechkalov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zymbler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Accelerating Medoids-based Clustering with the Intel Many Integrated Core Architecture</article-title>
          . In:
          <source>9th Int. Conf. on Application of Information and Communication Technologies, AICT 2015</source>
          , October 14-16, 2015, Rostov-on-Don, Russia. Proceedings, pp.
          <fpage>413</fpage>
          -
          <lpage>417</lpage>
          . IEEE (
          <year>2015</year>
          ). doi: 10.1109/ICAICT.2015.7338591
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Sarawagi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications</article-title>
          .
          <source>Data Min. Knowl. Discov</source>
          .
          <volume>4</volume>
          (
          <issue>2/3</issue>
          ), pp.
          <fpage>89</fpage>
          -
          <lpage>125</lpage>
          (
          <year>2000</year>
          ). doi: 10.1023/A:1009887712954
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunemann</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>SQL Database Primitives for Decision Tree Classifiers</article-title>
          . In:
          <source>Proc. of the 2001 ACM CIKM Int. Conf. on Information and Knowledge Management</source>
          , Atlanta, Georgia, USA, November 5-10, 2001, pp.
          <fpage>379</fpage>
          -
          <lpage>386</lpage>
          . ACM (
          <year>2001</year>
          ). doi: 10.1145/502585.502650
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Shang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geist</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>SQL Based Frequent Pattern Mining with FPGrowth</article-title>
          . In: D. Seipel, M. Hanus, U. Geske, O. Bartenstein (eds.)
          <source>Applications of Declarative Programming and Knowledge Management, 15th Int. Conf. on Applications of Declarative Programming and Knowledge Management, INAP 2004, and 18th Workshop on Logic Programming, WLP 2004</source>
          , Potsdam, Germany, March 4-6, 2004, Revised Selected Papers, Lecture Notes in Computer Science,
          <volume>3392</volume>
          , pp.
          <fpage>32</fpage>
          -
          <lpage>46</lpage>
          . Springer (
          <year>2004</year>
          ). doi: 10.1007/11415763_3
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Tang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maclennan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>P.P.</given-names>
          </string-name>
          :
          <article-title>Building Data Mining Solutions with OLE DB for DM and XML for Analysis</article-title>
          .
          <source>SIGMOD Record</source>
          ,
          <volume>34</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>80</fpage>
          -
          <lpage>85</lpage>
          (
          <year>2005</year>
          ). doi: 10.1145/1083784.1083805
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chakravarthy</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Performance Evaluation and Optimization of Join Queries for Association Rule Mining</article-title>
          . In: M.K. Mohania, A.M. Tjoa (eds.)
          <source>Data Warehousing and Knowledge Discovery, First Int. Conf., DaWaK '99</source>
          , Florence, Italy, August 30 - September 1, 1999, Proc., Lecture Notes in Computer Science,
          <volume>1676</volume>
          , pp.
          <fpage>241</fpage>
          -
          <lpage>250</lpage>
          . Springer (
          <year>1999</year>
          ). doi: 10.1007/3-540-48298-9_26
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaniolo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>ATLAS: A Small but Complete SQL Extension for Data Mining and Data Streams</article-title>
          . In: VLDB, pp.
          <fpage>1113</fpage>
          -
          <lpage>1116</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Yao</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>K Nearest Neighbor Queries and kNN-joins in Large Relational Databases (almost) for Free</article-title>
          . In: F. Li, M.M. Moro, S. Ghandeharizadeh, J.R. Haritsa, G. Weikum, M.J. Carey, F. Casati, E.Y. Chang, I. Manolescu, S. Mehrotra, U. Dayal, V.J. Tsotras (eds.)
          <source>Proc. of the 26th Int. Conf. on Data Engineering, ICDE 2010</source>
          , March 1-6, 2010, Long Beach, California, USA, pp.
          <fpage>4</fpage>
          -
          <lpage>15</lpage>
          . IEEE Computer Society (
          <year>2010</year>
          ). doi: 10.1109/ICDE.2010.5447837
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>