<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Scalable Benchmark for OBDA Systems: Preliminary Report</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Diego Calvanese</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Lanti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Rezk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mindaugas Slusnys</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guohui Xiao</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Computer Science, Free University of Bozen-Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In ontology-based data access (OBDA), the aim is to provide a high-level conceptual view over potentially very large (relational) data sources by means of a mediating ontology. The ontology is connected to the data sources through a declarative specification given in terms of mappings that relate each (class and property) symbol in the ontology to an (SQL) view over the data. Although prototype OBDA systems providing the ability to answer SPARQL queries over the ontology are available, a significant challenge remains: performance. To properly evaluate OBDA systems, benchmarks tailored towards the requirements in this setting are needed. OWL benchmarks, which have been developed to test the performance of generic SPARQL query engines, however, fail at 1) exhibiting a complex real-world ontology, 2) providing challenging real-world queries, 3) providing large amounts of real-world data, and the possibility to test a system over data of increasing size, and 4) capturing important OBDA-specific measures related to the rewriting-based query answering approach in OBDA. In this work, we propose a novel benchmark for OBDA systems based on a real-world use case adopted in the EU project Optique. We validate our benchmark on the system Ontop, showing that it is more adequate than previous benchmarks not tailored for OBDA.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In ontology-based data access (OBDA), the aim is to provide a high-level conceptual
view over potentially very large (usually relational) data sources by means of a mediating
ontology. Queries are posed over such conceptual layer and then translated into queries
over the data layer. The ontology is connected to the data sources through a declarative
specification given in terms of mappings that relate each (class and property) symbol in
the ontology to an (SQL) view over the data. The W3C standard R2RML [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] was created
with the goal of providing a standardized language for the specification of mappings in
the OBDA setting.
      </p>
      <p>To properly evaluate the performance of OBDA systems, benchmarks tailored
towards the requirements in this setting are needed. OWL benchmarks, which have been
developed to test the performance of generic SPARQL query engines, however, fail at
1) exhibiting a complex real-world ontology, 2) providing challenging real world queries,
3) providing large amounts of real-world data, and the possibility to test a system over
data of increasing size, and 4) capturing important OBDA-specific measures related to the
rewriting-based query answering approach in OBDA.</p>
      <p>This paper is supported by the EU under the large-scale integrating project (IP) Optique (Scalable End-user Access to Big Data), grant agreement n. FP7-318338.</p>
      <p>
        For instance, in the Berlin SPARQL Benchmark [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], although it is possible to configure the data size (in triples), there is no
ontology to measure reasoning tasks and the queries are rather simple. Therefore, it is hard
to evaluate some of the key features of OBDA systems, such as rewritings with respect
to an ontology and/or a set of mappings. Moreover, the data is fully artificial, hence it
is difficult to assess the significance of the obtained performance results with respect to
real-world settings. Another popular benchmark is FishMark [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In this benchmark the
queries are more challenging, and “real-world” data is used. However, the benchmark
does not come with an ontology and the data size is rather small (20M triples). A
popular benchmark that does come with an ontology and with the possibility of generating
data (triples) of arbitrary size is LUBM [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. However, the ontology is rather small, and
the benchmark is not tailored towards OBDA, since no mappings to data sources are
provided. Moreover, the queries in combination with the ontology appear to provide
unnatural results, e.g., they either return no results or a very large portion of the data.
      </p>
      <p>
        In this work, we propose a novel benchmark for OBDA systems based on the
Norwegian Petroleum Directorate (NPD), which is a real-world use case adopted in the
EU project Optique [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Specifically, we adopt the NPD Fact Pages as the dataset, the NPD
Ontology, which has been mapped to the NPD Fact Pages stored in a relational database,
and queries over this ontology developed by domain experts. The main challenge we
address here has been to develop a data generator for generating datasets of increasing
size, starting from the available data. This problem has been studied before in the context
of databases [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], where increasing the data size is achieved by encoding domain-specific
information into the data generator [
        <xref ref-type="bibr" rid="ref12 ref8 ref9">12, 8, 9</xref>
        ]. One drawback of this approach is that
each benchmark requires its own ad-hoc generator, and that it disregards OBDA-specific
aspects. In the context of triple stores, [
        <xref ref-type="bibr" rid="ref16">24, 16</xref>
        ] present an interesting approach based on
machine learning. Unfortunately, the approach proposed in these papers is specifically
tailored for triple stores, and thus it is not directly applicable to the OBDA setting.
Applying these approaches to OBDA is, in fact, far from trivial and closely related to the
view update problem [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>We present the NPD benchmark (https://github.com/ontop/npd-benchmark) in Section 2, discuss our data generator in Section 3,
validate our benchmark in Section 4, and conclude in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>NPD Benchmark</title>
      <p>
        The Norwegian Petroleum Directorate [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] (NPD) is a governmental organisation
whose main objective is to help maximize the value that society can obtain from
oil and gas activities. The initial dataset that we use is the NPD Fact Pages [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which
contains information regarding the petroleum activities on the Norwegian continental
shelf. The ontology, the query set, and the mappings to the dataset have all been developed
at the University of Oslo [22], and are freely available online [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Next we provide more
details on each of these items.
      </p>
      <p>The Ontology. The ontology contains OWL axioms specifying comprehensive
information about the underlying concepts in the dataset. Since we are interested in benchmarking
OBDA systems that are able to rewrite SPARQL queries over the ontology into first-order (FOL)
queries (and hence SQL queries) that can be evaluated by a relational DBMS, we
concentrate here on the OWL 2 QL profile [20] of OWL, which guarantees FO-rewritability of</p>
      <sec id="sec-2-1">
        <title>1 https://github.com/ontop/npd-benchmark</title>
        <p>
          unions of conjunctive queries (see, e.g., [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]). Table 1 shows the statistics for the maximal
OWL 2 QL subset of the NPD ontology. This ontology is suitable for benchmarking
reasoning tasks, given that (i) it is a complex ontology in terms of number of classes,
maximum depth of the class hierarchy, and average number of sibling classes for each
class [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], making it suitable for reasoning w.r.t. hierarchies; and (ii) it contains
existentials in the right-hand side of ontology axioms. These axioms infer unnamed individuals
in the virtual instance that cannot be retrieved as part of the answer, but they can affect the
evaluation of the query, as they strongly affect the size of the rewritten query.
        </p>
        <p>
The Query Set. The original NPD query set contains 25 queries obtained by interviewing
users of the NPD dataset. Starting from the original NPD query set, we devised 12
queries having different degrees of complexity (see Table 2). In the OBDA context, it is
recognized that queries involving classes (or object/data properties) with a rich hierarchy,
or making use of existentially quantified variables in a rather sophisticated way (i.e.,
giving rise to tree witnesses [
          <xref ref-type="bibr" rid="ref17">17, 21</xref>
          ]) are harder to answer than other queries, because they
produce more complex rewritings. We also made some minor fixes, e.g., handling the absence in
the ontology of certain concepts present in the queries, removing aggregates not supported
by Ontop, and flattening nested sub-queries.
        </p>
        <p>The Mappings. The mapping consists of 1190 assertions mapping a total of 464
classes, object properties, and data properties. The SQL queries in the mappings, in
terms of their direct Datalog translation, count an average of 2.6 rules, with 1.7 joins
per rule. We observe that the mappings have not been optimised to take full advantage of
an OBDA framework, e.g., by trying to minimize the number of mappings that refer to
the same ontology class or property, so as to reduce the size of the SQL query generated
by unfolding the mapping.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>The Data Generator</title>
      <p>The generator produces data according to certain features of an input training database.
The algorithm is not specific to NPD, and it can in principle be applied to any database.
We took special care to make the generation process fast, so as to be able to generate
very large datasets. The algorithm starts from a non-empty database D. Given a desired
increment f &gt; 0, it generates a new database such that |T′| = |T| · (1 + f), for each
table T of D that has to be incremented (where |T| denotes the number of tuples of T, and T′ is the table corresponding to T in the new database).</p>
      <p>Duplicate Values Generation. Values in each column T.C of a table T ∈ D are generated
with a duplicate ratio (||T.C|| − |T.C|) / ||T.C||, where ||T.C|| (resp., |T.C|) denotes the
cardinality of the column T.C under the multi-set (resp., set) semantics. A duplicate ratio
“close to 1” indicates that the content of the column is essentially independent of the
size of the database, and it should not be increased by the data generator.</p>
      <p>Fresh Values Generation. For each non-string column T.C over a totally ordered domain,
the generator chooses values from the interval I := [min(T.C), max(T.C)]. If the
number of values to insert exceeds the number of different fresh values that can be chosen
from the interval I, then values greater than max(T.C) are allowed.</p>
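      <p>As a rough illustration of the size target and the duplicate ratio described above, the following Python sketch computes both for a single integer column and then grows that column. The function names (duplicate_ratio, target_size, grow_int_column) and the policy of splitting the new values into fresh and repeated ones according to the duplicate ratio of the input are assumptions made for this example only; the actual generator may proceed differently.</p>
      <preformat preformat-type="code">
```python
import random


def duplicate_ratio(column):
    """(||T.C|| - |T.C|) / ||T.C|| for a column given as a list of values."""
    total = len(column)           # cardinality under multi-set semantics
    distinct = len(set(column))   # cardinality under set semantics
    return (total - distinct) / total


def target_size(current_size, f):
    """Number of tuples after growing a table by increment f: |T| * (1 + f)."""
    return round(current_size * (1 + f))


def grow_int_column(column, f, rng):
    """Append values to an integer column (hypothetical policy, see lead-in)."""
    ratio = duplicate_ratio(column)
    n_new = target_size(len(column), f) - len(column)
    n_fresh = round(n_new * (1 - ratio))   # share of new values that are fresh
    low, high = min(column), max(column)
    used = set(column)
    # Fresh values are taken from the interval [min, max] first.
    # Enumerating the interval is only reasonable for small examples.
    fresh = [v for v in range(low, high + 1) if v not in used]
    rng.shuffle(fresh)
    fresh = fresh[:n_fresh]
    # If the interval is exhausted, values greater than max are allowed.
    next_value = high + 1
    while n_fresh > len(fresh):
        fresh.append(next_value)
        next_value += 1
    # The remaining new values repeat values already present in the column.
    repeats = [rng.choice(column) for _ in range(n_new - len(fresh))]
    return column + fresh + repeats


grown = grow_int_column([1, 1, 2, 5], 1, random.Random(0))  # 8 values in total
```
      </preformat>
      <p>In this sketch a column with duplicate ratio close to 1 receives almost no fresh values, which mirrors the remark that such columns should not be enlarged with new content.</p>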
      <p>Chase Cycles. The data generation is done respecting foreign keys. Let T1 → T2 denote
the presence of a foreign key from table T1 to table T2. In case of a cycle T1 → T2 →
⋯ → Tk → T1, inserting a tuple in T1 could potentially trigger an infinite number of
insertions. By analyzing the input database, the generator prevents an infinite chain of
insertions while ensuring that no foreign key constraint is violated.</p>
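      <p>The text does not spell out how the generator analyzes the input database. One plausible first step, sketched here in Python under that assumption, is to detect foreign key cycles with a depth-first search over the graph whose edges are the foreign keys. The function name find_fk_cycle and the dictionary representation of the schema are hypothetical.</p>
      <preformat preformat-type="code">
```python
def find_fk_cycle(foreign_keys):
    """Return one cycle as a list of tables (first table repeated at the end),
    or None if the foreign key graph is acyclic.

    foreign_keys maps each table name to the list of tables it references.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the stack, finished
    color = {table: WHITE for table in foreign_keys}
    stack = []

    def visit(table):
        color[table] = GREY
        stack.append(table)
        for target in foreign_keys.get(table, []):
            state = color.get(target, WHITE)
            if state == GREY:
                # target is on the current path: a cycle has been closed
                return stack[stack.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        stack.pop()
        color[table] = BLACK
        return None

    for table in list(foreign_keys):
        if color[table] == WHITE:
            cycle = visit(table)
            if cycle:
                return cycle
    return None


schema = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
cycle = find_fk_cycle(schema)   # ["T1", "T2", "T3", "T1"]
```
      </preformat>
      <p>Once a cycle is known, a generator could, for instance, close it by pointing the last inserted tuple at an already existing key instead of creating a new one; this is one possible strategy, not necessarily the one used here.</p>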
      <p>Geometric Types. The NPD database makes use of geometric datatypes available in
MySQL. Some of them come with constraints, e.g., a polygon is a closed non-intersecting
line composed of a finite number of straight lines. For each geometric column in the
database, the generator first identifies the minimal rectangular region of space enclosing
all the values in the column, and then it generates values in that region.</p>
      <p>Generator Validation. We aim at producing synthetic data that approximates the
real-world data well. By considering the mapping assertions, it is possible to establish the
expected growth of the ontology elements w.r.t. the growth of the database. For our
analysis we focus here on those ontology elements for which the expected growth has
to be either linear (w.r.t. the growth of the database) or absent. In this way we tested 138
concept names, 28 object properties, and 226 data properties.</p>
      <p>Table 3 reports the results of the aforementioned analysis. The first column indicates
the type of ontology elements being analyzed, and the specified increment f (e.g.,
“class npd2” refers to the population of classes for the database incremented with factor
f = 2). The columns “avg err” show the averages of the errors between the expected
growth and the observed growth, in terms of percentage of the expected growth. The
remaining columns report the number and percentage of ontology elements for which the
error was greater than 50%.</p>
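      <p>The error measure described above can be made concrete with a small Python sketch. It assumes that the error of an ontology element is the absolute difference between observed and expected growth, expressed as a percentage of the expected growth, and that elements with zero expected growth are handled separately; both the function names and these choices are assumptions for illustration, and the numbers in the example are made up rather than taken from Table 3.</p>
      <preformat preformat-type="code">
```python
def growth_error_pct(expected_growth, observed_growth):
    """Error as a percentage of the expected growth (expected_growth must be non-zero)."""
    return 100 * abs(observed_growth - expected_growth) / expected_growth


def summarize_errors(errors, threshold=50.0):
    """Return (average error, number of elements above threshold, their percentage)."""
    average = sum(errors) / len(errors)
    above = [e for e in errors if e > threshold]
    return average, len(above), 100 * len(above) / len(errors)


# Two hypothetical ontology elements, each expected to gain 100 new instances.
errors = [growth_error_pct(100, 120), growth_error_pct(100, 10)]
summary = summarize_errors(errors)
```
      </preformat>
      <p>For the two made-up elements the errors are 20% and 90%, so the average error is 55% and one element out of two exceeds the 50% threshold.</p>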
      <p>
        Concerning concepts, our generator seems to behave well. This is because concepts
are usually populated starting from primary keys in the database, or by simple queries
that do not involve the Cartesian product of two or more tables (in this last case the
growth would be quadratic). More attention, instead, has to be paid when populating
object or data properties. Our data generator, indeed, considers the distribution of the
elements in a column-wise manner, and ignores the distribution of the actual relations in
the ontology. Observe that, in order to solve this problem, analyzing the distribution at the
level of tuples in relations might in general be insufficient, since object/data properties
can be populated with columns from different tables. On the other hand, a more accurate
approximation working at the level of the triples in the ontology [24] might be infeasible,
because of the well-known view update problem [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and of the quantity of triples that
need to be generated in order to create big datasets. However, the problem is not the same
as the view update problem, since we are not interested in reflecting a particular “virtual instance”
into a “physical (relational) source”, but instead we aim at creating a physical source
that produces a virtual instance satisfying certain statistics on some key features. Further
investigation is left to future work.
      </p>
      <p>Finally, the tests performed and the data reported in Table 3 indicate that our heuristic
generator substantially outperforms a pure random generator.</p>
    </sec>
    <sec id="sec-4">
      <title>Benchmark Results</title>
      <p>We ran the benchmark on the Ontop system (http://ontop.inf.unibz.it/) [21], which, to the best of our knowledge,
is the only fully implemented OBDA system that is freely available. MySQL was
used as the underlying relational database system. The hardware consisted of an HP Proliant
server with 24 Intel Xeon X5690 CPUs (144 cores @3.47GHz), 106GB of RAM and a
1TB 15K RPM HD. The OS was Ubuntu 12.04 LTS. Due to space constraints, we present
the results for only one running client.</p>
      <p>Results were obtained with a configuration that does not consider anonymous
individuals. This is motivated by the fact that none of the queries for which this kind of reasoning
would give additional answers (i.e., queries with tree witnesses) could be rewritten, and
executed, within the given timeout (1 min.).</p>
      <sec id="sec-4-1">
        <title>2 http://ontop.inf.unibz.it/</title>
        <p>In order to test the scalability of the system w.r.t. the growth of the database, we used
the data generator described in Section 3 and produced several databases, the largest
being approximately 1500 times bigger than the original one (“NPD1500” in Table 4,
117 GB on disk).</p>
        <p>Table 4 refers to a mix of the 7 easiest queries from the initial query set, for which
the unfolding of the mapping produces exactly one non-union SQL query. The mix was
executed 10 times, each time with different parameters in the filter conditions, so that the
effect of caching is minimized, and statistics were collected for each execution. These are
the only queries which display a response time—that is, the sum of the query execution
time (avg(ex time)) and the time spent by the system to display the results to the user
(avg(out time))—that is less than one minute for every database in the benchmark. Results
are in milliseconds, and columns qmpH and avg(res size) refer to query mixes per hour
and the average number of results for queries in the mix, respectively.</p>
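        <p>The relation between the reported measures can be sketched as follows. The sketch assumes that the time of one query mix is the sum of the response times of its queries, and that qmpH is obtained by dividing one hour (in milliseconds) by the average mix time over all runs; the text does not state the exact formula, so this is only one reasonable reading. The function names and the sample timings are hypothetical.</p>
        <preformat preformat-type="code">
```python
MS_PER_HOUR = 3_600_000


def response_time(ex_time_ms, out_time_ms):
    """Response time of a query: execution time plus time to output the results."""
    return ex_time_ms + out_time_ms


def query_mixes_per_hour(mix_runs):
    """mix_runs is a list of runs; each run lists (ex_time_ms, out_time_ms) per query."""
    mix_totals = [
        sum(response_time(ex, out) for ex, out in run)
        for run in mix_runs
    ]
    average_mix_ms = sum(mix_totals) / len(mix_totals)
    return MS_PER_HOUR / average_mix_ms


# Two made-up runs of a mix with two queries each (times in milliseconds).
runs = [[(100, 20), (200, 40)], [(150, 30), (250, 10)]]
qmph = query_mixes_per_hour(runs)
```
        </preformat>
        <p>With these made-up timings the two runs take 360 ms and 440 ms, the average mix takes 400 ms, and the resulting rate is 9000 mixes per hour.</p>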
        <p>Table 5 contains results for the 5 hardest queries. Each query was run once with a
timeout of 1 minute on the response time. Observe that the response time tends to grow
faster than the growth of the underlying database. This follows from the complexity of the
queries produced by the unfolding step (column #unfolding indicates the number of union
operators in the produced SQL query), which usually contain several joins (recall that
the worst-case cardinality of a result set produced by a join is quadratic in the size of
the original tables). Column NPD10 RAND shows how using a purely random data
generator gives rise to datasets for which the queries are much simpler to evaluate. This
is mainly due to the fact that a random generation of values tends to decrease the ratio of
duplicates inside columns, resulting in smaller join results over the tables [23]. Hence,
purely randomly generated datasets are not appropriate for benchmarking.</p>
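        <p>The effect of duplicates on join sizes can be seen with a small Python sketch that counts the tuples produced by an equi-join of two columns. The function name equijoin_size and the sample columns are made up for illustration and are unrelated to the NPD data.</p>
        <preformat preformat-type="code">
```python
from collections import Counter


def equijoin_size(left, right):
    """Number of pairs (l, r) with l == r, i.e. the size of the equi-join result."""
    left_counts = Counter(left)
    right_counts = Counter(right)
    # Each value contributes (occurrences on the left) * (occurrences on the right).
    return sum(n * right_counts[value] for value, n in left_counts.items())


with_duplicates = [7, 7, 7, 7]   # duplicate ratio 0.75
all_distinct = [1, 2, 3, 4]      # duplicate ratio 0

size_dup = equijoin_size(with_duplicates, with_duplicates)
size_distinct = equijoin_size(all_distinct, all_distinct)
```
        </preformat>
        <p>Joining the column with duplicates with itself yields 16 tuples, the quadratic worst case for 4 rows, whereas the column of distinct values yields only 4. This is consistent with the observation that lowering the ratio of duplicates makes joins cheaper to evaluate.</p>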
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>
        The benchmark proposed in this work is the first one that thoroughly analyzes a
complete OBDA system in all significant components. So far, little or no work has been
done in this direction, as pointed out in [19], since the research community has mostly
focused on rewriting engines. Thanks to our work, we have gained a better understanding
of the current state of the art for OBDA systems: first, we confirm that the unfolding
phase is the real bottleneck of modern OBDA systems [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]; second, more research work
is needed in order to understand how to improve the design of mappings, avoiding the
use of mappings that give rise to huge queries after unfolding.
      </p>
      <p>We conclude by observing that for a better analysis it is crucial to refine the generator
in such a way that domain-specific information is taken into account, and a better
approximation of real-world data is produced.</p>
      <p>19. Mora, J., Corcho, O.: Towards a systematic benchmarking of ontology-based query rewriting
systems. In: Proc. of the 12th Int. Semantic Web Conf. (ISWC 2013). Lecture Notes in
Computer Science, vol. 8218, pp. 369–384. Springer (2013)</p>
      <p>20. Motik, B., Fokoue, A., Horrocks, I., Wu, Z., Lutz, C., Cuenca Grau, B.: OWL 2 web ontology
language profiles. W3C Recommendation, World Wide Web Consortium (Oct 2009), available
at http://www.w3.org/TR/owl-profiles/</p>
      <p>21. Rodriguez-Muro, M., Kontchakov, R., Zakharyaschev, M.: Ontology-based data access: Ontop
of databases. In: Proc. of the 12th Int. Semantic Web Conf. (ISWC 2013). Lecture Notes in
Computer Science, vol. 8218, pp. 558–573. Springer (2013)</p>
      <p>22. Skjaeveland, M.G., Lian, E.H.: Benefits of publishing the Norwegian Petroleum Directorate’s
FactPages as Linked Open Data. In: Proc. of Norsk informatikkonferanse (NIK 2013). Tapir
(2013)</p>
      <p>23. Swami, A., Schiefer, K.: On the estimation of join result sizes. In: Proc. of the 4th Int. Conf.
on Extending Database Technology (EDBT 1994). Lecture Notes in Computer Science, vol.
779, pp. 287–300. Springer (1994)</p>
      <p>24. Wang, S.Y., Guo, Y., Qasem, A., Heflin, J.: Rapid benchmarking for semantic web knowledge
base systems. In: Proc. of the 4th Int. Semantic Web Conf. (ISWC 2005). Lecture Notes in
Computer Science, vol. 3729, pp. 758–772. Springer (2005)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Norwegian Petroleum Directorate, http://www.npd.no/en/, accessed: 2014-05-20
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. NPD FactPages, http://factpages.npd.no/factpages/, accessed: 2014-05-20
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <article-title>NPD's FactPages as semantic web data</article-title>
          , http://sws.ifi.uio.no/project/npd-v2/, accessed: 2014-05-01
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Optique:
          <article-title>Scalable end-user access to Big Data</article-title>
          , http://www.optique-project.eu/, accessed: 2014-05-01
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bail</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alkiviadous</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Workman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Harmelen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goncalves</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garilao</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>FishMark: A linked data application benchmark</article-title>
          .
          <source>In: Proc. of the Joint Workshop on Scalable and High-Performance Semantic Web Systems (SSWS+HPCSW 2012)</source>
          , vol.
          <volume>943</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          . CEUR Electronic Workshop Proceedings, http://ceur-ws.org/ (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Bitton</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dewitt</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turbyfill</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Benchmarking database systems: A systematic approach</article-title>
          .
          <source>In: Proc. of the 9th Int. Conf. on Very Large Data Bases (VLDB</source>
          <year>1983</year>
          ). pp.
          <fpage>8</fpage>
          -
          <lpage>19</lpage>
          (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schultz</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Berlin SPARQL benchmark</article-title>
          .
          <source>Int. J. on Semantic Web and Information Systems</source>
          <volume>5</volume>
          (
          <issue>2</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Bruno</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaudhuri</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Flexible database generators</article-title>
          .
          <source>In: Proc. of the 31st Int. Conf. on Very Large Data Bases (VLDB</source>
          <year>2005</year>
          ). pp.
          <fpage>1097</fpage>
          -
          <lpage>1107</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Bösche</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sellam</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pirk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beier</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mieth</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manegold</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Scalable generation of synthetic GPS traces with real-life data characteristics</article-title>
          .
          <source>In: Selected Topics in Performance Evaluation and Benchmarking, Lecture Notes in Computer Science</source>
          , vol.
          <volume>7755</volume>
          , pp.
          <fpage>140</fpage>
          -
          <lpage>155</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>De Giacomo</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poggi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez-Muro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Ontologies and databases: The DL-Lite approach</article-title>
          . In:
          <string-name>
            <surname>Tessaris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franconi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <source>Reasoning Web. Semantic Technologies for Informations Systems - 5th Int. Summer School Tutorial Lectures (RW 2009), Lecture Notes in Computer Science</source>
          , vol.
          <volume>5689</volume>
          , pp.
          <fpage>255</fpage>
          -
          <lpage>356</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Cosmadakis</surname>
            ,
            <given-names>S.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papadimitriou</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          :
          <article-title>Updates of relational views</article-title>
          .
          <source>J. of the ACM</source>
          <volume>31</volume>
          (
          <issue>4</issue>
          ),
          <fpage>742</fpage>
          -
          <lpage>760</lpage>
          (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Crolotte</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghazal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Introducing skew into the TPC-H Benchmark</article-title>
          .
          <source>In: Topics in Performance Evaluation, Measurement and Characterization, Lecture Notes in Computer Science</source>
          , vol.
          <volume>7144</volume>
          , pp.
          <fpage>137</fpage>
          -
          <lpage>145</lpage>
          . Springer (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sundara</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>R2RML: RDB to RDF mapping language</article-title>
          .
          <source>W3C Recommendation, World Wide Web Consortium</source>
          (Sep
          <year>2012</year>
          ), available at http://www.w3.org/TR/r2rml/
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Di Pinto</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lembo</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancini</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poggi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosati</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruzzi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savo</surname>
            ,
            <given-names>D.F.</given-names>
          </string-name>
          :
          <article-title>Optimizing query rewriting in ontology-based data access</article-title>
          .
          <source>In: Proc. of the 16th Int. Conf. on Extending Database Technology (EDBT 2013)</source>
          . pp.
          <fpage>561</fpage>
          -
          <lpage>572</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heflin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>LUBM: A benchmark for OWL knowledge base systems</article-title>
          .
          <source>J. of Web Semantics</source>
          <volume>3</volume>
          (
          <issue>2-3</issue>
          ),
          <fpage>158</fpage>
          -
          <lpage>182</lpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qasem</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heflin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>A requirements driven framework for benchmarking semantic web knowledge base systems</article-title>
          .
          <source>IEEE Trans. on Knowledge and Data Engineering</source>
          <volume>19</volume>
          (
          <issue>2</issue>
          ),
          <fpage>297</fpage>
          -
          <lpage>309</lpage>
          (Feb
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Kontchakov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zakharyaschev</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The combined approach to query answering in DL-Lite</article-title>
          .
          <source>In: Proc. of the 12th Int. Conf. on the Principles of Knowledge Representation and Reasoning (KR 2010)</source>
          . pp.
          <fpage>247</fpage>
          -
          <lpage>257</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>LePendu</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jonquet</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alexander</surname>
            ,
            <given-names>P.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>N.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musen</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Optimize first, buy later: Analyzing metrics to ramp-up very large knowledge bases</article-title>
          .
          <source>In: Proc. of the 9th Int. Semantic Web Conf. (ISWC 2010). Lecture Notes in Computer Science</source>
          , vol.
          <volume>6496</volume>
          , pp.
          <fpage>486</fpage>
          -
          <lpage>501</lpage>
          . Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>