<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Identifying Algebraic Properties to Support Optimization of Unary Similarity Queries?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>onica Ribeiro Porto Ferreira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agma J. M. Traina</string-name>
          <email>agma@icmc.usp.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ires Dias</string-name>
          <email>iresdias@icmc.usp.br</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Richard Chbeir</string-name>
          <email>richard.chbeir@u-bourgogne.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Caetano Traina Junior</string-name>
          <email>caetano@icmc.usp.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Dept., ICMC-Univ. of Sa~o Paulo</institution>
          ,
          <addr-line>Sa~o Carlos-SP</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LE2I Laboratory UMR-CNRS Univ. of Bourgogne</institution>
          ,
          <addr-line>Dijon</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Mathematics Dept., ICMC-Univ. of Sa~o Paulo</institution>
          ,
          <addr-line>Sa~o Carlos-SP</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Conventional operators for data retrieval are either based on exact matching or on total order relationship among elements. Neither of them is appropriate to manage complex data, such as multimedia data, time series and genetic sequences. In fact, the most meaningful way to compare complex data is by similarity. However, the Relational Algebra, employed in the Relational Database Management Systems (RDBMS), cannot express similarity criteria. In order to address this issue, we provide here an extension of the Relational Algebra, aimed at representing similarity queries in algebraic expressions. This paper identi es fundamental properties to allow the integration of the unary similarity operators into the Relational Algebra to handle similarity-based operators, either alone or combined with the existing (exact matching and/or relational) operators. We also show how to take advantage of such properties to optimize similarity queries, including these properties into a similarity query optimizer developed for a Similarity Retrieval Engine, which uses an existing RDBMS to answer similarity queries.</p>
      </abstract>
      <kwd-group>
        <kwd>similarity algebra</kwd>
        <kwd>algebraic properties</kwd>
        <kwd>query optimization</kwd>
        <kwd>unary similarity queries</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        In 1970, Codd [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] introduced the relational model, which is the foundation for
most of the actual commercial DataBase Management Systems (DBMS). It is
based on the mathematical relation theory: the database is represented as a set
of relations, where each relation is a table with tuples (or rows) and attributes
(or columns). The domain of possible values for each attribute is restricted by
the data types.
      </p>
      <p>Initially, the relational model supported only traditional data, i.e., numerical
and string data types. Elements of these types can be compared using exact
matching (= and 6=) and relational (&lt;, &gt;, and ) operators. Now, with the
? This work has been supported by FAPESP, CNPq and CAPES/Fulbright
advent of multimedia and spatial applications, the Relational DBMS (RDBMS)
must be able to support new data types, operators and kinds of queries. Thus,
similarity emerges as the natural way to compare elements in complex domains,
such as images, audios, videos, genomic sequences, and time series, and
consequently handling operations based on similarity (or distance) between data
becomes a must. To illustrate this, let us take the following examples:</p>
      <p>Q1: In a health-care information system: \Given a mammography exam
with images of left and right breast from cranio-caudal (RCC) and medio-lateral
oblique (RMLO) views of a patient, show the exams whose texture do not di er
more than 10 units from those in the exam ".</p>
      <p>Q2: In a health-care information system: \Given a head tomography exam of
a patient showing a pathology, retrieve the 5 exams most similar not presenting
pathology, and that texture do not di er more than 5 units from those in the
exam".</p>
      <p>Q3: In Geographic Information Systems (GIS): \Find the 15 districts nearest
to `Arequipa' that are not farther than 15 miles, and where the population having
between 21 and 64 year is greater than 65-year-old population and over ".</p>
      <p>
        To solve similarity-based queries, several extensions of relational algebra have
been provided in the literature aimed at including the similarity functionality in
RDBMS from various perspectives. The rst algebra to consider this issue has
been the Multi-Similarity Algebra (MSA), presented by Adali et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. It has
been designed to integrate di erent interpretations of similarity values coming
from multiple similarity implementation in a common framework. However, it
remains at a higher abstraction level and thus does not address the problem of
an \operational" algebra usable for modeling, optimizing, and processing queries
with similarity-based operations [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Therefore, it is not fully consistent to the
relational model.
      </p>
      <p>
        Other works have associated similarity to uncertainty and provided fuzzy
logic-based methods to solve this [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. The problem of those approaches is
that they assume that complex data manipulation involves evaluation of their
similarity, but this does not mean that these data or the similarity evaluation
are uncertain or imprecise (as only exact match comparisons are useless in these
domains). In fact, it is possible to execute similarity queries resulting in either
approximated or exact answers.
      </p>
      <p>
        Likewise, other approaches have been based on the notion of ranking, i.e.,
ordering among tuples or elements [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. It is true that they are consistent to
the relational model and can be applied to similarity queries considering the
distance functions as the ranking criterion, but they depend on ranking criteria
that are independent from queries, whereas the ranking criterion of a similarity
query varies with the query.
      </p>
      <p>
        None of these previous works has addressed optimizations based on query
rewriting for the similarity-based select operators in complex expressions. Traina
et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] proposed an extension of relational algebra considering complex
similarity queries with two or more similarity predicates combined with Boolean
operators. However, they have only treated queries with the same query element
(unique center), which is very restrictive and does not cover all cases occurring
in a RDBMS.
      </p>
      <p>In this paper, we present the fundamental properties of a Similarity Algebra
aiming at integrating both unary similarity operators with the relational algebra,
which allows optimizing similarity queries in relational DBMS. The properties
allow handling queries including any number of query centers, and suitable to
support both similarity-based and traditional operators in the same query.</p>
      <p>The remainder of this paper is structured as follows. Section 2 presents the
Similarity Algebra. Section 3 shows experimental results conducted to evaluate
the relevance of our approach. Finally, Section 4 concludes this paper and draws
our future steps.
2
2.1</p>
    </sec>
    <sec id="sec-2">
      <title>Similarity Algebra</title>
      <p>Preliminaries
In order to execute similarity queries in relational DBMS, it is necessary to
provide a measurement of how to quantify similarity between two elements.
Usually, it is done by de ning a distance function d, which is the basis to
create a metric space M = hS; di, where S denotes the universe of valid
elements (domain) and d is a function d : S S ! R+ that expresses a
\distance" between elements of S. The distance function d must satisfy the
following properties: (i) symmetry: d(s1; s2) = d(s2; s1); (ii) non-negativity:
0 &lt; d(s1; s2) &lt; 1; if s1 6= s2 and d(s1; s1) = 0; and (iii) triangular
inequality: d(s1; s2) d(s1; s3) + d(s3; s2); 8s1; s2; s3 2 S .</p>
      <p>An attribute is comparable by similarity only if it is associated to a
similarity measure d. Although distance functions can theoretically be assigned to
any attribute, they are of utter importance when applied to complex attributes.
Therefore, without loss of generality, we call complex attributes1 and,
correspondingly, its domains, those associated to distance functions, and the others
we call simple attributes.</p>
      <p>Relations that have complex attributes should follow the same properties
and de nitions of traditional relations. In this paper, we employ the following
notation to express relations. Let Ah Ah be a simple attribute in a domain Ah
that allows comparisons using traditional operators; Sj Sj be an complex
attribute in a domain Sj in a metric space that allows comparisons using complex
operators; and T be an relation with any number of both simple and complex
attributes. That is, let T = fA1; : : : ; Am; S1; : : : Spg be a relational schema, a
relation T T is a set of elements represented as tuples T = fA1; : : : ; Am; S1; : : : Spg,
which has for each tuple t = ha1; : : : ; am; s1; : : : spi values ah (1 h m)
obtained in the domain Ah and values sj (1 j p) obtained in the domain Sj .
Thus, let ti(Sj ) (1 i n) be the value of the Sj complex attribute of the
ith tuple in the relation, and correspondingly let ti(Ah) be the value of the Ah
simple attribute. To alleviate the notation of handling several attributes in a
1 Distinctly from object-oriented models, we employ here the term \complex attribute"
to refer to those having a distance function assigned. Examples are images, audios,
geographical coordinates, genomaic sequences, etc.
relation, in the remainder of the paper, we will use just S and S to refer to a
complex attribute Sj and its respective domain Sj , and A and A refer
respectively to a simple attribute Ah and its respective domain Ah whenever the focus
of the text is over only one attribute.
2.2</p>
      <p>Unary similarity queries operations
Traditional selections follow the format (A a) T , where is a comparison
operator valid in the domain A of the attribute A, and `a' is either a constant taken
in the domain of A or the value of another attribute from the same domain of
A in the same tuple. Similarity selections follow the same format: c (S c sq) T ,
where c represents a similarity selection, c is a similarity operator valid in the
domain S of the attribute S and `sq' is either a constant taken in the domain
of S or the value of another attribute from the same domain of S in the same
tuple.</p>
      <p>There are two similarity operators commonly employed: range and k-nearest
neighbor. As their properties can be di erent from those of the traditional
selection, we initially use the symbols ^ and  to represent range and kN N selections,
and ^ and  to represent range and kN N operators, respectively. They are
described as follows.</p>
      <p>De nition 1. Range query - Rq: Let S be a complex attribute taken in domain
S over which the similarity condition is expressed, d be a distance function, be
the similarity threshold and sq 2 S be the query element. The query ^(S ^(d; ) sq)T
returns every tuple ti 2 T such that d (ti (S) ; sq)
. That is:
^(S ^(d; ) sq) T = fti 2 T j d (ti (S) ; sq)
g :
(1)
De nition 2. k-Nearest Neighbor query - kN N : Let S be a complex
attribute taken in domain S over which the similarity condition is expressed, d be
a distance function, k 2 N be the similarity threshold and sq 2 S be the query
element. The query (S (d;k) sq) T returns the tuples from T whose value of the
attribute S is one of the k elements in S nearest to the query element sq based
on the distance function d. That is:
(S (d;k) sq) T = T 0 = fti 2 T j 8 tj 2 [T</p>
      <p>T 0] ; T 0 = ti=1;:::;k;
d (ti (S) ; sq)
d (tj (S) ; sq)g :
(2)
2.3</p>
      <p>
        Algebraic properties
The query optimizer of RDBMSs employs algebraic equivalences to rewrite
queries into equivalent expressions which are expected to be executed faster.
Selections are important operations because they reduce the size of relations.
In the subsections following, we identify algebraic properties useful to rewrite
expressions of both range operator ^ and k-nearest neighbor operator . Due to
space restriction, formal proofs of these properties are omitted here (they can
be found in Ferreira et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]).
      </p>
      <p>Range Selection - ^.</p>
      <p>Properties 1 and 2 apply conjunctive and disjunctive conditions involving only
^ operations, respectively.</p>
      <p>Property 1. Conjunctions of ^ operators can be rewritten into a cascade of
individual ^ operations or a sequence of intersection operations, i.e.,
(3)
(4)
(5)
(6)
^(S1 ^(d1; 1) sq1) ^ (S2 ^(d2; 2) sq2) T = ^(S1 ^(d1; 1) sq1)
^(S2 ^(d2; 2) sq2) T
= ^(S1 ^(d1; 1) sq1) T \ ^(S2 ^(d2; 2) sq2) T :
A special case exists when sq1 = sq2, as follows.</p>
      <p>Property 1.1. Special case where sq1 = sq2 = sq.</p>
      <p>^(S ^(d; 1) sq) T
\
^(S ^(d; 2) sq) T
=
^(S ^(d; 1) sq) ^ (S ^(d; 2) sq) T = ^(S ^(d;min( 1; 2)) sq) T :
Property 2. Disjunctions of ^ operators can be rewritten into a sequence of
union operations as follows.</p>
      <p>^(S1 ^(d1; 1) sq1) _ (S2 ^(d2; 2) sq2) T =
^(S1 ^(d1; 1) sq1) T
[
^(S2 ^(d2; 2) sq2) T
:
A special case exists when sq1 = sq2, as follows.</p>
      <p>Property 2.1. Special case where sq1 = sq2 = sq.</p>
      <p>^(S ^(d; 1) sq) T
[
^(S ^(d; 2) sq) T
=
^(S ^(d; 1) sq) _ (S ^(d; 2) sq) T = ^(S ^(d;max( 1; 2)) sq) T :</p>
      <p>Properties 3 and 4 explore the commutativity of ^ operation with its
composition and traditional operation.</p>
      <p>Property 3. The Rq selection operation commutes under its composition, i.e.,
^(S1 ^(d1; 1) sq1)
^(S2 ^(d2; 2) sq2) T
= ^(S2 ^(d2; 2) sq2)
^(S1 ^(d1; 1) sq1) T : (7)
Property 4. The Rq selection operation and the traditional selection operation
commute under their composition, i.e.,
^(S ^(d; ) sq)
(A a) T
= (A a) ^(S ^(d; ) sq) T
:
(8)</p>
      <p>As ^ operation is commutative with operation, Properties 1 and 2 can also
be employed for these operations. Therefore, we can use Properties 1 and 2 with
either the ^ operation only or ^ and operations.</p>
      <p>The next set of properties involving ^ allows pushing range selection through
the traditional binary operators: union ([), intersection (\), di erence ( ), cross
product ( ) and join (on). Property 5 shows that ^ is distributive over the set
binary operators [, \ and . The relations T1 and T2 must be union compatible.</p>
      <p>Property 5. The operator ^ is distributive over the set binary operators [,
and \ as follows.</p>
      <p>Property 5.1. For union, the following expression holds:
^(S ^(d; ) sq) (T1 [ T2) =
^(S ^(d; ) sq) T1
[
^(S ^(d; ) sq) T2</p>
      <p>Property 6. When the complex attribute mentioned in the range predicate
belongs to only one of the joined relations, the operation ^ is distributive over on
or . Let T1 be the relation that has the complex attribute S. Thus:
^(S ^(d; ) sq) (T1</p>
      <p>T2) =
^(S ^(d; ) sq) T1</p>
      <p>T2 ;
(12)
for any
=on or</p>
      <p>.</p>
      <p>Properties 1 to 6 show that range selection shares the same algebraic
equivalences as the traditional selection. Moreover, Property 4 shows the
commutativity property between similarity-based selections and traditional ones. This is
an important result, as it allows the RDBMS query optimizer to treat range
selection as traditional selection. Therefore, we can use the symbol instead of
^ to represent range selections, only using ^ to represent the range operator,
without lost of generality.
2.3.2</p>
      <p>k-Nearest Neighbor Selection - .</p>
      <p>Distinctly from range and traditional selections, kN N selections have only three
properties to rewrite algebraic expressions. Property 7 regards the conjunctive
selection conditions, as follows:
Property 7. Conjunctions of  operators can be rewritten into a sequence of
intersection operations but they cannot be rewritten as a cascade of individual 
operations, i. e.,</p>
      <p>(S1 (d1;k1) sq1) ^ (S2 (d2;k2) sq2) T =
Property 8. Disjunctions of  operators can be rewritten into a sequence of
union operations as follows (this property requires that the relation T is a set
because, in this way, duplications will be correctly eliminated):</p>
      <p>(S1 (d1;k1) sq1) _ (S2 (d2;k2) sq2) T =
Property 9. For complex conditions, each selection should be executed
separately and the intersection (for conjunctions) or the union (for disjunctions) of
results must be returned, since the operator  is not commutative neither with
other selection operators nor with itself. That is, for conjunctive conditions:
and for disjunctive conditions:
The same property can be employed combining either \ \ / [ ^" or \ \ / [</p>
      <p>When query elements sq1 and sq2 are the same, there are special cases where
the kN N selection operation becomes commutative with range and self selection
operations. First, for a composition of kN N selection operation, this
expression can be rewritten in the conjunction of kN N condition; therefore, only the
kN N selection with the smallest k condition needs to be executed. Second, for a
composition of kN N and range selection operation, this expression can also be
rewritten in the conjunction of kN N and range condition; then, the intersection
of the results from both basic operators should be executed. Finally, for the
disjunction of range and kN N condition, the union of the results from both basic
operators needs to be executed.</p>
      <p>For the set of properties involving traditional binary operators, no property
involving  exists, because  is not distributive over these operators.</p>
      <p>
        As kN N selection operations accepts only three properties (7, 8 and 9) and
ve special cases (over the same query element), they do not allow optimization
algorithms equivalent to traditional and range selections. Thus, speci c
optimization algorithms should be implemented in the query optimizer to optimize
this kind of selection. The kAndRange and the kOrRange algorithms [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] are
examples speci cally created to handle the commutativity property of Range and
kN N query over the same query element.
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>Experimental Results</title>
      <p>
        In this section, we present experiments comparing the evaluation of the similarity
queries both optimized and not optimized using the properties of the Similarity
Algebra presented in Section 2. To obtain the measurements, this algebra was
incorporated into the SIREN query optimizer. SIREN is a similarity retrieval
engine that allows expressing similarity queries in SQL and executing them [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
We call the new version of SIREN able to perform optimization on queries
involving similarity as SIREN+O. The experiments analyze the performance of
SIREN and SIREN+O to execute similarity queries. As we will see here, the
rst results show that the proposed algebra leads SIREN+O to perform faster
than SIREN.
      </p>
      <p>
        The test framework was implemented in C++, and the experiments ran on
an AMD Athlon XP 3000+ processor with 1024MB of main memory, under the
Windows XP operational system. The RDBMS employed was Oracle 9i. Every
test was performed using both sequential scan and a Slim-tree index. Due to
space limitations, we only highlight here the performance regarding total time
(in milliseconds) as it summarizes the whole computational cost. Four data sets
were used:
{ RCCMammography : a set of 658 medical images obtained from
mammograms of right breast with cranio-caudal view (CC). They were compared
using the texture distance function [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ];
{ RMLOMammography : a set of 695 medical images obtained from
mammography exams of right breast with medio-lateral oblique view (MLO). They
were also compared using the same texture distance function [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
{ MedImage: a set of 5,180 medical images obtained from three human body
parts (abdomen, cranium and thorax) by computerized tomographies (CT).
      </p>
      <p>
        They were compared using the metric histogram distance function [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
{ PeruDistricts : a set of 1,829 Peruvian districts. They were compared using
the Euclidean distance function.
      </p>
      <p>The rst three sets were obtained from the Clinical Hospital at Ribeir~ao
Preto of the University of S~ao Paulo and the last set was obtained from Peru
Instituto Nacional de Estad stica e Informatica (INEI).</p>
      <p>The experiments evaluated the execution time of Queries Q1 over
RCCMammography and RMLOMammography data sets, Q2 over MedImage
data set and Q3 over PeruDistricts data set stated in Section 1. The queries
were performed 30 times and the values shown are the average of performing the
same query, but varying query elements sq.</p>
      <p>
        Table 1 summarizes the results executing SIREN and SIREN+O using both
sequential scan and the Slim tree index, a well-known index structure for metric
data [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>Query Q1 involves a traditional join and a range selection and it can be
algebraically expressed as ^(S ^(texture;0:1) sq) (RCC on RM LO). Property 6 was
employed to optimize the query. Its gain was about 30.39% using sequential scan
and 30.16% using a Slim-tree index.</p>
      <p>Both Queries Q2 and Q3 involve traditional selection, range selection and
kNN selection. Therefore, Properties 4 and 9 as well as their special cases
should be used to optimize these queries. Q2 can be algebraically expressed as:
(pathology=`N0) ^(S ^(texture;0:05) sq) (S (texture;5) sq) (M edImage) . The
gain obtained was about 64.68% using a Slim-tree index and 62.92% using
sequential scan.</p>
      <p>Q3 can be expressed as:
(adultpop &gt; oldpop) (^(S ^(Euclidean;15) sq)
((S (Euclidean;15) sq)(P eruDistricts))), and the gain obtained was about
63.82% using sequential scan and 62.61% using a Slim-tree index.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>Nowadays, storing and retrieving multimedia data is a requirement that must be
provided by RDBMS. In order to allow query compilers to optimize a similarity
query execution, we presented here the properties holding for the unary similarity
operators, that is, for Range and k-Nearest Neighbor Selection operations. We
also presented the experiments conducted to show the performance obtained
with SIREN query optimizer using Similarity Algebra (SIREN+O) reducing
total time in up to 64% over the performance of queries in SIREN without the
algebra regardless of the usage of an index. As a follow-up of this paper, we are
currently working on the properties to extend the Similarity Algebra to support
similarity join. This will surely open the possibility to support the storage and
retrieval of complex data in RDBMS. We are also working on developing better
statistics that can be measured over data, to create heuristics able to control the
DBMS query optimizer for similarity queries.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Codd</surname>
            ,
            <given-names>E.F.</given-names>
          </string-name>
          :
          <article-title>A relational model of data for large shared data banks</article-title>
          .
          <source>CACM</source>
          <volume>13</volume>
          (
          <issue>6</issue>
          ) (
          <year>1970</year>
          )
          <volume>377</volume>
          {
          <fpage>387</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Adali</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bonatti</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sapino</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Subrahmanian</surname>
          </string-name>
          , V.:
          <article-title>A multi-similarity algebra</article-title>
          .
          <source>In: SIGMOD</source>
          , ACM Press (
          <year>1998</year>
          )
          <volume>402</volume>
          {
          <fpage>413</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Atnafu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chbeir</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Coquil</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brunie</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Integrating similarity-based queries in image DBMSs</article-title>
          . In: SAC, ACM Press (
          <year>2004</year>
          )
          <volume>735</volume>
          {
          <fpage>739</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Belohlavek</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Opichal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , Vychodil.,
          <string-name>
            <surname>V.</surname>
          </string-name>
          :
          <article-title>Relational algebra for ranked tables with similarity: properties and implementation</article-title>
          .
          <source>In: IDA</source>
          . Volume
          <volume>4723</volume>
          of LNCS., Springer Verlag (
          <year>2007</year>
          )
          <volume>140</volume>
          {
          <fpage>151</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ciaccia</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montesi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Penzo</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trombetta</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Imprecision and user preferences in multimedia queries: a generic algebraic approach</article-title>
          . In: FoIKS. Volume
          <volume>1762</volume>
          of LNCS., Springer Verlag (
          <year>2000</year>
          )
          <volume>50</volume>
          {
          <fpage>71</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Adali</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sapino</surname>
            ,
            <given-names>M.L.</given-names>
          </string-name>
          :
          <article-title>Ranked relations: query languages and query processing methods for multimedia</article-title>
          .
          <source>MTAJ</source>
          <volume>24</volume>
          (
          <issue>3</issue>
          ) (
          <year>2004</year>
          )
          <volume>197</volume>
          {
          <fpage>214</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>K.C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ilyas</surname>
            ,
            <given-names>I.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>RankSQL: query algebra and optimization for relational top-k queries</article-title>
          .
          <source>In: SIGMOD</source>
          , ACM Press (
          <year>2005</year>
          )
          <volume>131</volume>
          {
          <fpage>142</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Traina-Jr.</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina</surname>
            ,
            <given-names>A.J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vieira</surname>
            ,
            <given-names>M.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arantes</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faloutsos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>E cient processing of complex similarity queries in RDBMS through query rewriting</article-title>
          .
          <source>In: CIKM</source>
          , ACM Press (
          <year>2006</year>
          )
          <volume>4</volume>
          {
          <fpage>13</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ferreira</surname>
            ,
            <given-names>M.R.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina-Jr.</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina</surname>
            ,
            <given-names>A.J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dias</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Extending SQL to support unary similarity queries</article-title>
          .
          <source>Technical Report 325</source>
          , ICMC/USP (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Barioni</surname>
            ,
            <given-names>M.C.N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Razente</surname>
            ,
            <given-names>H.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina</surname>
            ,
            <given-names>A.J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina-Jr.</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>SIREN: A similarity retrieval engine for complex data</article-title>
          .
          <source>In: VLDB</source>
          , ACM Press (
          <year>2006</year>
          )
          <volume>1155</volume>
          {
          <fpage>1158</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Felipe</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina</surname>
            ,
            <given-names>A.J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina-Jr.</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Retrieval by content of medical images using texture for tissue identi cation</article-title>
          . In: CBMS, IEEE Computer Society (
          <year>2003</year>
          )
          <volume>175</volume>
          {
          <fpage>180</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Traina</surname>
            ,
            <given-names>A.J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina-Jr.</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bueno</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chino</surname>
            ,
            <given-names>F.J.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Azevedo-Marques</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>E cient content-based image retrieval through metric histograms</article-title>
          .
          <source>WWW</source>
          <volume>6</volume>
          (
          <issue>2</issue>
          ) (
          <year>2003</year>
          )
          <volume>157</volume>
          {
          <fpage>185</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Traina-Jr.</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Traina</surname>
            ,
            <given-names>A.J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seeger</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faloutsos</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Slim-trees: High performance metric trees minimizing overlap between nodes</article-title>
          .
          <source>In: EDBT</source>
          . Volume
          <volume>1777</volume>
          of LNCS., Springer Verlag (
          <year>2000</year>
          )
          <volume>51</volume>
          {
          <fpage>65</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>