<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Posting: a New Frontier for Data Exchange in the Big Data Era?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Domenico Saccà</string-name>
          <email>sacca@unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edoardo Serra</string-name>
          <email>eserra@deis.unical.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DIMES, Università della Calabria</institution>
          ,
          <addr-line>87036 Rende</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Data exchange [5, 1] is the problem of migrating a data instance from a source schema to a target schema such that the materialized data on the target schema satisfies a number of given integrity constraints (mainly inclusion and functional dependencies). The target schema typically contains some new attributes that are defined using existentially quantified variables: the main issue is to reduce arbitrariness in selecting the values of such variables. Therefore a data exchange solution is required to be “universal” in the sense that a homomorphism exists from it into every possible solution, i.e., a universal solution enjoys a sort of “minimal arbitrariness” property. The main research goal of the large data exchange literature is to single out situations in which a universal solution exists and can be computed in polynomial time. Recently a different approach to data exchange has been proposed in [11] that considers a new type of data dependency, called count constraint (an extension of cardinality constraint), which prescribes that the result of a given count operation on a relation be within a certain range. We illustrate this approach by means of an example. Consider a source relation scheme S with the following attributes: I (Item), B (Brand), P (Price). The target scheme is the relation scheme T with attributes: I, B, P, W (Warehouse), C (product Category), R (price Range). We assume that the domains of I, B and P for T are the projections of the source relation S on the respective attributes, e.g., $D_B = \pi_B(S)$. Also the domains of the other attributes of T are finite and are defined by supplementary source relations: $D_W$ (the list of all available warehouses for storing items), $D_C$ (a 2-arity relation listing the category for each product) and $D_R$ (a 3-arity relation listing all price ranges together with the interval extremes).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Preliminaries on Data Exchange and Count Constraints</title>
      <p>The mapping is defined as follows (as usual, lower-case and upper-case letters denote variables that are respectively universally and existentially quantified; in addition, dotted letters denote free variables used for defining sets):</p>
      <p>(1): $S(i, p, b) \wedge D_C(i, c) \wedge D_R(r, \underline{p}, \overline{p}) \wedge (\underline{p} \le p &lt; \overline{p}) \rightarrow 1 \le \#(\{\dot{W} : T(i, p, b, \dot{W}, c, r)\}) \le 5$</p>
      <p>(2): $T(\_, \_, b, w, \_, \_) \rightarrow \#(\{\dot{I} : T(\dot{I}, P, b, w, C, R)\}) \ge 5$</p>
      <p>(3): $T(\_, \_, \_, w, c, \_) \rightarrow \#(\{\dot{I} : T(\dot{I}, P, B, w, c, R)\}) \ge 5$</p>
      <p>(Note that $\#$ is an interpreted function symbol for computing the cardinality of a
set; that existentially quantified variables are local in a set term, i.e., $\{\dot{I} : T(\dot{I}, P, b, w, C, R)\}$
stands for $\{\dot{I} : \exists P\, \exists C\, \exists R\; T(\dot{I}, P, b, w, C, R)\}$; and that $T(\_, \_, b, w, \_, \_)$ and $T(\_, \_, \_, w, c, \_)$
stand for the projections of T on B, W and on W, C respectively.) The values of the new
attributes C and R are uniquely determined by the domains $D_C$ and $D_R$, whereas the
values for the attribute W may be arbitrarily taken from the domain $D_W$ provided that
the following count constraints are satisfied: (1) every item must be stored in at least
one and at most 5 warehouses, (2)-(3) if a warehouse stores an item of a given brand
(resp., category), it must store at least 4 other items of the same brand (resp., category).<fn><p>The research was partially funded by MIUR (PON Project “InMoto – Information Mobility for Tourism”) and by Calabria Region (POR Regional Innovation Pole on ICT).</p></fn></p>
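      <p>The three count constraints of the example can be checked mechanically on a candidate target instance. The following is only a minimal sketch under stated assumptions: the function name, the toy data and the input list <italic>expected</italic> (the tuples implied by S, DC and DR) are hypothetical, and tuples follow the attribute order used in the formulas, i.e., (I, P, B, W, C, R).</p>
      <preformat>
```python
def check_example_constraints(expected, target):
    """Check count constraints (1)-(3) of the running example.

    expected: (i, p, b, c, r) tuples implied by S, DC and DR (hypothetical input).
    target:   (i, p, b, w, c, r) tuples of T, in the order used by the formulas.
    Returns a list of (constraint, key, count) triples, empty if all hold.
    """
    violations = []
    # (1): every expected item is stored in at least 1 and at most 5 warehouses.
    for (i, p, b, c, r) in expected:
        n = len({t[3] for t in target if t == (i, p, b, t[3], c, r)})
        if not (n >= 1 and 5 >= n):
            violations.append(("(1)", (i, p, b, c, r), n))
    # (2): a warehouse storing an item of brand b stores at least 5 items of b.
    for key in sorted({(t[2], t[3]) for t in target}):
        n = len({t[0] for t in target if (t[2], t[3]) == key})
        if 5 > n:
            violations.append(("(2)", key, n))
    # (3): the same requirement for each (warehouse, category) pair.
    for key in sorted({(t[3], t[4]) for t in target}):
        n = len({t[0] for t in target if (t[3], t[4]) == key})
        if 5 > n:
            violations.append(("(3)", key, n))
    return violations


# Toy data: five items of one brand and one category, all stored in warehouse w1.
expected = [(f"item{k}", 10 + k, "acme", "toys", "low") for k in range(5)]
good = [(i, p, b, "w1", c, r) for (i, p, b, c, r) in expected]
bad = good[:4]  # item4 is stored nowhere, and w1 is left with only 4 items

print(check_example_constraints(expected, good))                 # []
print([v[0] for v in check_example_constraints(expected, bad)])  # ['(1)', '(2)', '(3)']
```
      </preformat>
      <p>Dropping a single tuple from the satisfying instance breaks all three constraints at once, which illustrates how the lower bounds couple the placement of different items.</p>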
      <p>
        The approach of [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] has received three strong criticisms from a number of
reviewers of the paper during its long publication process: (1) the high complexity of
deciding whether an admissible solution for T exists (NEXP-complete under combined
complexity), (2) the lack of a universal solution and (3) an allegedly fictitious nature of
data exchange with count constraints.
      </p>
      <p>The first criticism is a common drawback for many approaches – e.g., data exchange
is undecidable in the general case. The real (and open) issue is: are there tractable (and,
at the same time, meaningful) cases? Moreover, as we shall argue later in the paper, we
believe that intractability must increasingly be dealt with, as is currently
done in many knowledge discovery tasks.</p>
      <p>
        The second criticism is motivated by the relevance of a universal solution to
answer certain queries. Nevertheless, at the risk of being accused of heresy, we believe that
answering certain queries is not a “must” for data exchange. In fact, one may want to
get answers from the transferred data without caring about certainty w.r.t. source data.
Indeed, as discussed in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], within a scenario of privacy-preserving data exchange, one
could even have the opposite goal: defining a target schema for which answering a
number of given “sensitive” queries is “uncertain”! As the main point of our data exchange
setting is the choice of some attribute values, we have presented in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] an extension of
Datalog to provide an alternative formalization of the problem. To give an intuition, the
above constraint (1) can be expressed using Datalog with choice as follows:
      </p>
      <p>$T(I, P, B, W, C, R) \leftarrow S(I, P, B),\ D_C(I, C),\ D_R(R, \underline{P}, \overline{P}),\ \underline{P} \le P &lt; \overline{P},\ D_W(W),\ \mathit{choice}(I, W)[5]$</p>
      <p>
        The construct $\mathit{choice}(I, W)[5]$ extends the classical choice construct: instead of
choosing exactly one value of W for each I, the new construct enables the selection
of up to five distinct values.
      </p>
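      <p>The effect of the bounded choice construct can be imitated procedurally. The sketch below is not the Datalog semantics itself: it simply draws, for each value of I, a random subset of its candidate W values of size at most the bound. The function name and the sample data are hypothetical, and we assume that at least one value is always kept, in line with the lower bound of constraint (1).</p>
      <preformat>
```python
import random


def choice_up_to(pairs, limit, rng):
    """Keep, for each I, between 1 and `limit` distinct W values among its candidates."""
    candidates = {}
    for i, w in pairs:
        candidates.setdefault(i, set()).add(w)
    chosen = set()
    for i in sorted(candidates):
        options = sorted(candidates[i])
        k = rng.randint(1, min(limit, len(options)))  # how many values to keep
        chosen.update((i, w) for w in rng.sample(options, k))
    return chosen


items = ["i1", "i2", "i3"]
warehouses = [f"w{k}" for k in range(1, 8)]  # seven hypothetical warehouses
pairs = [(i, w) for i in items for w in warehouses]
selected = choice_up_to(pairs, limit=5, rng=random.Random(0))
for i in items:
    print(i, sorted(w for (j, w) in selected if j == i))
```
      </preformat>
      <p>Setting the bound to 1 gives back the behaviour of the classical choice construct, where exactly one W is selected for each I.</p>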
      <p>Concerning the criticism about the fictitious nature of data exchange with count
constraints, we acknowledge that the original formulation of our approach was intended to address the
problem of generating fact tables (i.e., relations used in OLAP applications) satisfying
a number of given count constraints, mainly to perform benchmark experiments on
artificial data cubes reflecting patterns extracted from reality. On the sidelines of our
work, we eventually realized that the same setting can be used for a new variant of
data exchange that transforms database relations into Web contents.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Data Posting: a New Paradigm for Sharing Data in Big Data Platforms</title>
      <p>
It is well known that a Web search engine such as Google mainly executes string (word)
selection queries across public resources on the Web. In a sense, for those who have
spent decades of their research effort elaborating query languages that advance SQL, it
may be frustrating to eventually witness the victory of query languages that are much
more elementary than SQL, as they only allow listing words to make a simple
selection. The complexity lies behind the query language: in the large technological
infrastructure for crawling, indexing and accessing huge amounts of content on the Web. We have
added “may be” to the above-mentioned frustration as we believe that advanced query
languages may take their revenge if moved backstage: they may have a crucial
role in enriching the semantics of database tuples that are posted on the Web cloud. We
fully agree with the arguments of [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]: to solve the challenges of the emerging
“Big Data” platforms, database technology (including database theory) may continue to
have a crucial role, if it is suitably revised and immersed into the new technological
and applicative perspectives. For instance, much effort is presently being put into
providing intelligent answers to simple queries; see the numerous Semantic Web proposals
and, among them, the Knowledge Graph of Google [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The database community has
significant expertise in declarative data processing, in doing it efficiently and in
making it scale. The community should apply its expertise to the design of more
intelligent and more efficient future Big Data platforms.
      </p>
      <p>
        Google has recently provided an interesting solution, the Google Search
Appliance (GSA), to access public and private contents, such as emails and database tuples,
that cannot be directly browsed by the search engine. To this end, so-called connectors
extend the reach of the GSA to non-Web repositories [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. For instance, database tuples
can be materialized as XML documents by ad-hoc connectors and, afterwards,
transformed into HTML documents for the search. A database connector can be thought of
as a simple data exchange setting for posting data on the Web.
      </p>
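      <p>The two steps performed by a database connector (tuple to XML, then XML to HTML) can be pictured with a few lines of Python. This is only an illustrative sketch: the element names <italic>record</italic> and <italic>field</italic>, the function names and the sample tuple are hypothetical and do not reproduce the actual feed format used by the GSA connectors.</p>
      <preformat>
```python
import xml.etree.ElementTree as ET


def tuple_to_xml(relation, attributes, row):
    """Materialize one database tuple as an XML element (hypothetical layout)."""
    record = ET.Element("record", {"relation": relation})
    for name, value in zip(attributes, row):
        field = ET.SubElement(record, "field", {"name": name})
        field.text = str(value)
    return record


def xml_to_html(record):
    """Render the XML record as a small HTML page that a crawler could index."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    table = ET.SubElement(body, "table")
    for field in record.findall("field"):
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "th").text = field.get("name")
        ET.SubElement(row, "td").text = field.text
    return ET.tostring(html, encoding="unicode")


record = tuple_to_xml("S", ["I", "P", "B"], ("item0", 10, "acme"))
page = xml_to_html(record)
print(page)
```
      </preformat>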
      <p>Inspired by this solution, we have shaped our vision on data posting: during the
process of database publishing, the contents can be enriched by supplying additional
concepts. In our example, we added values derived from a sort of “ontology” for the
classification of products and of prices. We have also added the attribute W (warehouse)
together with its domain, but in this case the values for the attribute are not
predetermined by the available domain values but are selected on the basis of the constraints. We
point out that the issue of inventing new values to be included into the target relation
is one of the goals of the classical data exchange setting. The main difference with data
posting is the focus: to preserve the relationships with the source database, classical
data exchange only considers dependencies delivering universal solutions, whereas we
look for more expressive constraints for enriching the contents of the exchanged data at
the cost of losing certainty. As witnessed by data mining applications, the process of
knowledge discovery is inherently uncertain.</p>
      <p>
        Another important peculiarity of data posting is the structure of the target database
scheme. We assume that it consists of a unique relation scheme that corresponds to a flat
fact table as defined in OLAP analysis – we recall that an OLAP system is characterized
by multidimensional data cubes that enable manipulation and analysis of data stored in
a source database from multiple perspectives (see for instance [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]).
      </p>
      <p>
        A fact table is a relation scheme whose attributes are dimensions (i.e., properties,
possibly structured at various levels of abstraction) and measures (i.e., numeric values)
– but measures can be seen as dimensions as well. For instance, in our example, the
attribute P (price) is a typical measure but together with R (price range) it forms a pair of
2-layered dimensions. In general, in addition to a fact table, an OLAP scheme includes
other tables describing dimension attributes (e.g., star or snowflake scheme [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]). We
instead denormalize all tables into a unique flat fact table in order to comply with search
engine strategies – string selection queries are easier to express on denormalized tables
and can be massively parallelized on them (perhaps, after almost thirty years, the time
has come for revenge of the Universal Relation [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]). We observe that the very objective
of Google Knowledge Graph is to add hidden dimensions to a fact table (corresponding
to a Web document) on-the-fly during the search in order to enrich the semantics of
strings<sup>1</sup>.
      </p>
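      <p>For the running example, denormalization amounts to joining the source relation with the category and price-range relations, so that every row carries its own dimension values. The sketch below assumes one category per item and half-open price intervals (lower extreme included, upper extreme excluded); the function name and the sample data are hypothetical.</p>
      <preformat>
```python
def flatten(source, categories, ranges):
    """Join S(i, p, b) with DC(i, c) and DR(r, lo, hi) into flat rows (i, p, b, c, r)."""
    category_of = dict(categories)  # assumes one category per item, as in the example
    rows = []
    for (i, p, b) in source:
        for (r, lo, hi) in ranges:
            # keep the range whose interval contains the price: lo included, hi excluded
            if i in category_of and p >= lo and hi > p:
                rows.append((i, p, b, category_of[i], r))
    return rows


source = [("item0", 10, "acme"), ("item1", 120, "acme")]
categories = [("item0", "toys"), ("item1", "games")]
ranges = [("low", 0, 100), ("high", 100, 1000)]
flat = flatten(source, categories, ranges)
print(flat)
```
      </preformat>
      <p>Each flat row can then be searched with a plain string selection, without any join at query time; only the warehouse attribute W remains to be chosen under the count constraints.</p>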
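      <p>Let $\mathbf{S} = \langle S_1, \ldots, S_n \rangle$ be a source database schema with relation schemes $S_1, \ldots, S_n$, let $\mathbf{D} = \langle D_1, \ldots, D_m \rangle$ be a domain database schema with domain relation schemes $D_1, \ldots, D_m$, and let $T$ be a target flat fact table whose attribute domains are in $\mathbf{D}$.</p>
      <p>A source-to-target count constraint is a dependency over $\langle \mathbf{S}, \mathbf{D}, T \rangle$ of the form
$\forall \mathbf{x}\, (\varphi(\mathbf{x}) \rightarrow \psi_T(\mathbf{x}))$, where $\mathbf{x}$ is a list of variables, the formula $\varphi$ is a conjunction of
atoms with predicate symbols in $\mathbf{S} \cup \mathbf{D}$ whose variables are exactly the ones in $\mathbf{x}$,
and $\psi_T$ is a formula of the form:
\[ \beta_1 \le \#(\{\mathbf{y} : \exists \mathbf{z}\, \gamma(\mathbf{x}, \mathbf{y}, \mathbf{z})\}) \le \beta_2. \]
The above formula is a 3-arity comparison predicate, where $\beta_1$ and $\beta_2$ are simple terms
that are either constants or variables in $\mathbf{x}$, the two lists $\mathbf{y}$ and $\mathbf{z}$ consist of distinct
variables that are also different from the ones in $\mathbf{x}$, the formula $\gamma(\mathbf{x}, \mathbf{y}, \mathbf{z})$ is a conjunction of
atoms $T(\mathbf{x}, \mathbf{y}, \mathbf{z})$, and $\#$ is an interpreted function symbol that computes the cardinality
of the (possibly empty) set defined by $\{\mathbf{y} : \exists \mathbf{z}\, \gamma(\mathbf{x}, \mathbf{y}, \mathbf{z})\}$.</p>
      <p>A target count constraint differs from a source-to-target constraint only in the body
formula $\varphi$, which in this case is a conjunction of atoms with $T$ as predicate symbol.</p>
      <p>Given finite source instances $I_S$ for $\mathbf{S}$, $I_D$ for $\mathbf{D}$ and $I_T$ for $T$, $\langle I_S, I_D, I_T \rangle$ satisfies
a count constraint if, for each substitution $\mathbf{x}/\mathbf{v}$ that makes $\varphi(\mathbf{x})[\mathbf{x}/\mathbf{v}]$ true, the cardinality
of the set $W$ is in the range between $\beta_1[\mathbf{x}/\mathbf{v}]$ and $\beta_2[\mathbf{x}/\mathbf{v}]$, where $W$ is the (possibly
empty) projection on $\mathbf{y}$ of the selection of $I_T$ defined by $\{\mathbf{y} : \exists \mathbf{z}\, \gamma(\mathbf{x}, \mathbf{y}, \mathbf{z})\}[\mathbf{x}/\mathbf{v}]$.</p>
      <p>Observe that a Tuple Generating Dependency (TGD) $\forall \mathbf{x}\, (\varphi(\mathbf{x}) \rightarrow \exists \mathbf{y}\, \psi_T(\mathbf{x}, \mathbf{y}))$
of the classical data exchange setting can be formulated by the following count
constraint:
\[ \forall \mathbf{x}\, (\varphi(\mathbf{x}) \rightarrow \#(\{\mathbf{y} : \psi_T(\mathbf{x}, \mathbf{y})\}) \ge 1). \]</p>
      <p>Also an Equality Generating Dependency (EGD) $\forall \mathbf{x}\, (\psi_T(\mathbf{x}) \rightarrow x_1 = x_2)$, where
$x_1$ and $x_2$ are variables in $\mathbf{x}$, can be formulated by the following count constraint:
\[ \forall \mathbf{x}\, (\mathit{true} \rightarrow \#(\{y : \psi_T(\mathbf{x}) \wedge (y = x_1 \vee y = x_2)\}) \le 1), \]
where $y$ is a new variable not included in $\mathbf{x}$. The extension of our formalism to include
“safe” comparison predicates such as $(y = x_1 \vee y = x_2)$ is straightforward.</p>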
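      <p>The encoding of an EGD as a count constraint can be verified on a small case. The sketch below uses a hypothetical binary relation T(I, W) and the EGD stating that T(i, w1) and T(i, w2) imply w1 = w2; it compares a direct check of the dependency with the count-based formulation, in which the set of values y equal to w1 or w2 (when the body holds) must have at most one element. All names and data are illustrative assumptions.</p>
      <preformat>
```python
from itertools import product


def egd_holds(instance):
    """Direct check of the EGD: T(i, w1) and T(i, w2) imply w1 = w2."""
    return all(w1 == w2
               for (i1, w1) in instance for (i2, w2) in instance if i1 == i2)


def egd_as_count_constraint(instance):
    """Same dependency as a count constraint: #{y : body and (y = w1 or y = w2)} is at most 1."""
    items = {i for (i, _) in instance}
    values = {w for (_, w) in instance}
    # Substitutions outside the active domain make the body false, so they can be skipped.
    for i, w1, w2 in product(items, values, values):
        body = (i, w1) in instance and (i, w2) in instance
        ys = {y for y in (w1, w2) if body}
        if len(ys) > 1:
            return False
    return True


functional = {("item0", "w1"), ("item1", "w2")}
not_functional = {("item0", "w1"), ("item0", "w2")}
print(egd_holds(functional), egd_as_count_constraint(functional))          # True True
print(egd_holds(not_functional), egd_as_count_constraint(not_functional))  # False False
```
      </preformat>
      <p>The set has two elements exactly when the body holds for two different values, which is the situation the EGD forbids.</p>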
      <p><fn id="fn1"><label>1</label><p>Actually we were tempted to title this paper “The Elegant Search Universe: Superstrings, Hidden Dimensions and the Quest for the Ultimate Big Data Theory”, but we later discovered that a similar title was already used by a physics book. Ugh, we are always late!</p></fn></p>
      <p>We are now ready to formulate the data posting problem:</p>
      <p>Definition 1. The data posting setting $(\mathbf{S}, \mathbf{D}, T, \Sigma_{st}, \Sigma_t)$ consists of a source database
schema $\mathbf{S}$, a domain database schema $\mathbf{D}$, a target flat fact table $T$, a set $\Sigma_{st}$ of
source-to-target count constraints and a set $\Sigma_t$ of target count constraints. The data posting
problem associated with this setting is: given finite source instances $I_S$ for $\mathbf{S}$ and $I_D$ for
$\mathbf{D}$, find a finite instance $I_T$ for $T$ such that $\langle I_S, I_D, I_T \rangle$ satisfies both $\Sigma_{st}$ and $\Sigma_t$. □</p>
      <p>The main difference w.r.t. classical data exchange is the presence of the domain
database schema that stores “new” values (dimensions) to be added while exchanging
data. As a motto, we can say that “data posting is enriching data while exchanging them”.</p>
      <p>Theorem 1. Given $(\mathbf{S}, \mathbf{D}, T, \Sigma_{st}, \Sigma_t)$ and finite source instances $I_S$ for $\mathbf{S}$ and $I_D$
for $\mathbf{D}$, the problem of deciding whether there exists an instance $I_T$ of $T$ such that
$\langle I_S, I_D, I_T \rangle$ satisfies $\Sigma_{st} \cup \Sigma_t$ is NEXP-complete under combined complexity and
NP-complete under data complexity. □</p>
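      <p>The flavour of the decision problem can be conveyed by a naive guess-and-check procedure: enumerate candidate target instances over the finite domains and test each one against the constraints. The sketch below is only an illustration on a scaled-down version of the example with pairs (item, warehouse); the function names, the bounds and the data are hypothetical, and the enumeration is exponential in the number of candidate tuples.</p>
      <preformat>
```python
from itertools import combinations


def solve_by_enumeration(candidates, is_solution):
    """Try every subset of candidate tuples, smallest first; exponential in len(candidates)."""
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            if is_solution(set(subset)):
                return set(subset)
    return None


def make_checker(items, max_warehouses, min_group):
    """Build a checker for a scaled-down example over pairs (item, warehouse)."""
    def is_solution(instance):
        # Every item is stored in at least 1 and at most max_warehouses warehouses.
        for i in items:
            n = len({w for (j, w) in instance if j == i})
            if not (n >= 1 and max_warehouses >= n):
                return False
        # Every warehouse in use stores at least min_group items.
        for w in {w for (_, w) in instance}:
            if min_group > len({j for (j, v) in instance if v == w}):
                return False
        return True
    return is_solution


items = ["a", "b", "c"]
candidates = [(i, w) for i in items for w in ("w1", "w2")]
print(solve_by_enumeration(candidates, make_checker(items, 2, 2)))  # some instance with 3 tuples
print(solve_by_enumeration(candidates, make_checker(items, 2, 4)))  # None: groups of 4 are impossible
```
      </preformat>
      <p>Checking a given instance is cheap, whereas finding one may require exploring a search space that grows exponentially with the domains, which is consistent with the NP-completeness under data complexity stated above.</p>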
      <p>
        The proof of NEXP-completeness can be easily derived from a similar result
presented in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] (in fact, NEXP-completeness also holds for binary domains). The proof
of NP-completeness consists of a rather straightforward reduction from the graph
3-coloring problem. We stress that our complexity results derive from the assumption
that the domains of the attributes in T are finite and are part of the input.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 Research Lines for Data Posting</title>
      <p>
        In our example we have used count constraints to perform an elementary task of
grouping items into warehouses on the basis of their categories and brands. Grouping objects
is the goal of two important data mining techniques: clustering and classification [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
A first research line is to include some features of these techniques in the data
posting setting. This is consistent with our ambitious goal of posting data with added knowledge
value. It also explains why we do not insist on finding tractable cases: most
data mining problems are indeed intractable, yet there are many approaches
aimed at finding solutions for small-sized or well-structured instances or at searching
for approximate solutions.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] we have shown that a version of count constraint implementing the
“group-by” operator can be used to mimic another classical data mining technique: frequent
itemset mining. As an example, the following “group-by” count constraint requires the
items a, b and c to be frequent together in the warehouses (the frequency threshold is
fixed at 3):
\[ x = \{a, b, c\} \rightarrow \#(\{\dot{W} : x \subseteq \{\dot{I} : T(\dot{I}, P, B, \dot{W}, C, R)\}\}) \ge 3 \]
      </p>
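      <p>In a “group-by” count constraint, the formula $\#(\{\mathbf{y} : \exists \mathbf{z}\, \gamma(\mathbf{x}, \mathbf{y}, \mathbf{z})\})$ is replaced
by $\#(\{\mathbf{y} : t \odot \{\mathbf{z}_1 : \exists \mathbf{z}_2\, \gamma(\mathbf{x}, \mathbf{y}, \mathbf{z})\}\})$, where $\odot$ is a set operator such as $=$, $\subseteq$ or $\supseteq$,
and $t$ is a bounded set term.</p>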
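      <p>The frequent-itemset reading of a “group-by” count constraint can be sketched as follows: group the target tuples by warehouse, collect the set of items stored in each one, and count the warehouses whose set includes the given itemset. The function names and the toy instance are hypothetical; tuples follow the order (I, P, B, W, C, R) used in the formulas.</p>
      <preformat>
```python
def support(target, itemset):
    """Number of warehouses whose stored items include every item of `itemset`."""
    items_in = {}
    for (i, p, b, w, c, r) in target:
        items_in.setdefault(w, set()).add(i)
    return sum(1 for stored in items_in.values() if set(itemset).issubset(stored))


def is_frequent(target, itemset, threshold=3):
    """True if the itemset occurs together in at least `threshold` warehouses."""
    return support(target, itemset) >= threshold


# Hypothetical target tuples (I, P, B, W, C, R): a, b and c are stored in w1, w2 and w3.
target = [(i, 10, "acme", w, "toys", "low")
          for w in ("w1", "w2", "w3") for i in ("a", "b", "c")]
target.append(("a", 10, "acme", "w4", "toys", "low"))  # w4 stores only item a

print(support(target, {"a", "b", "c"}), is_frequent(target, {"a", "b", "c"}))  # 3 True
print(is_frequent(target, {"a", "b", "c"}, threshold=4))                       # False
```
      </preformat>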
      <p>Continuing in our efforts to add semantics to data posting, we point out that in our
example we have used hierarchy domains C and R to classify items. We dared to say
with a bit of shame that the two domains represent a sort of ontology. A second line of
research is to add “real” ontology tools to the data posting setting.</p>
      <p>
        Let us now discuss the issue of relaxing the assumption that the domains of all
attributes of the target relation T are finite and are explicitly listed in the input. Consider
first the case that one of such domains, say D, is still finite but its values are not listed.
Thus only the cardinality k of the domain is given in input; in addition, the k values can
be generated using a polynomial-time function. This case has been considered in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
to show that, in the presence of “group-by” count constraints, the data complexity of
the data posting decision problem also becomes NEXP-complete. A third research line is to analyze the
risk of undecidability in the case that D is a countably infinite set; obviously a
polynomial-time function for generating domain values must be available.
      </p>
      <p>
        We conclude by pointing out some possible interesting relationships between data
posting and data integration. It is known that a target instance need not be
materialized in data integration; the main focus there is on answering queries posed over the
target schema using views that express the relationship between the target and source
schemata [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Data posting can be thought of as a bottom-up enriched view directly
provided by information source experts, which should be integrated with the classical
top-down local view designed by the data integration administrator.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name><surname>ARENAS</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>BARCELÓ</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>FAGIN</surname>, <given-names>R.</given-names></string-name>,
          AND
          <string-name><surname>LIBKIN</surname>, <given-names>L.</given-names></string-name>
          <article-title>Locally consistent transformations and query answering in data exchange</article-title>.
          In <source>PODS</source> (<year>2004</year>), C. Beeri and A. Deutsch, Eds., ACM, pp. <fpage>229</fpage>-<lpage>240</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>BORKAR</surname>, <given-names>V. R.</given-names></string-name>,
          <string-name><surname>CAREY</surname>, <given-names>M. J.</given-names></string-name>,
          AND
          <string-name><surname>LI</surname>, <given-names>C.</given-names></string-name>
          <article-title>Inside “Big Data Management”: Ogres, Onions, or Parfaits?</article-title>
          In <source>EDBT</source> (<year>2012</year>), E. A. Rundensteiner, V. Markl, I. Manolescu, S. Amer-Yahia, F. Naumann, and I. Ari, Eds., ACM, pp. <fpage>3</fpage>-<lpage>14</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name><surname>CATTELL</surname>, <given-names>R.</given-names></string-name>
          <article-title>Scalable SQL and NoSQL data stores</article-title>.
          <source>SIGMOD Record</source> <volume>39</volume>, <issue>4</issue> (<year>2010</year>), <fpage>12</fpage>-<lpage>27</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name><surname>CHAUDHURI</surname>, <given-names>S.</given-names></string-name>,
          AND
          <string-name><surname>DAYAL</surname>, <given-names>U.</given-names></string-name>
          <article-title>An Overview of Data Warehousing and OLAP Technology</article-title>.
          <source>SIGMOD Record</source> <volume>26</volume>, <issue>1</issue> (<year>1997</year>), <fpage>65</fpage>-<lpage>74</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name><surname>FAGIN</surname>, <given-names>R.</given-names></string-name>,
          <string-name><surname>KOLAITIS</surname>, <given-names>P. G.</given-names></string-name>,
          AND
          <string-name><surname>POPA</surname>, <given-names>L.</given-names></string-name>
          <article-title>Data Exchange: getting to the core</article-title>.
          <source>ACM Trans. Database Syst.</source> <volume>30</volume>, <issue>1</issue> (<year>2005</year>), <fpage>174</fpage>-<lpage>210</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          GOOGLE DOCUMENTATION.
          <article-title>Getting the Most from Your Google Search Appliance</article-title>.
          In <source>Google Developers Site</source> (November, <year>2011</year>). https://developers.google.com/searchappliance/documentation/614/QuickStart/quick start intro.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name><given-names>JIAWEI</given-names> <surname>HAN</surname></string-name>,
          <string-name><given-names>MICHELINE</given-names> <surname>KAMBER</surname></string-name>,
          <string-name>J. P.</string-name>
          <source>Data Mining: Concepts and Techniques</source>. Morgan Kaufmann Publishers,
          <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><surname>LENZERINI</surname>, <given-names>M.</given-names></string-name>
          <article-title>Data Integration: A Theoretical Perspective</article-title>.
          In <source>PODS</source> (<year>2002</year>), L. Popa, S. Abiteboul, and P. G. Kolaitis, Eds., ACM, pp. <fpage>233</fpage>-<lpage>246</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name><surname>MAIER</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>ULLMAN</surname>, <given-names>J. D.</given-names></string-name>,
          AND
          <string-name><surname>VARDI</surname>, <given-names>M. Y.</given-names></string-name>
          <article-title>On the Foundations of the Universal Relation Model</article-title>.
          <source>ACM Trans. Database Syst.</source> <volume>9</volume>, <issue>2</issue> (<year>1984</year>), <fpage>283</fpage>-<lpage>308</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name><surname>SACCÀ</surname>, <given-names>D.</given-names></string-name>,
          AND
          <string-name><surname>SERRA</surname>, <given-names>E.</given-names></string-name>
          <article-title>Data Exchange in Datalog Is Mainly a Matter of Choice</article-title>.
          In <source>Datalog</source> (<year>2012</year>), P. Barceló and R. Pichler, Eds., vol. <volume>7494</volume> of Lecture Notes in Computer Science, Springer, pp. <fpage>153</fpage>-<lpage>164</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name><surname>SACCÀ</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>SERRA</surname>, <given-names>E.</given-names></string-name>,
          AND
          <string-name><surname>GUZZO</surname>, <given-names>A.</given-names></string-name>
          <article-title>Count Constraints and the Inverse OLAP Problem: Definition, Complexity and a Step toward Aggregate Data Exchange</article-title>.
          In <source>FoIKS</source> (<year>2012</year>), T. Lukasiewicz and A. Sali, Eds., vol. <volume>7153</volume> of Lecture Notes in Computer Science, Springer, pp. <fpage>352</fpage>-<lpage>369</lpage>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name><surname>SINGHAL</surname>, <given-names>A.</given-names></string-name>
          <article-title>Introducing the Knowledge Graph: things, not strings</article-title>.
          In <source>Official Google Blog</source> (May, <year>2012</year>). http://googleblog.blogspot.com/2012/05/introducing-knowledge-graph-things-not.html.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>