<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Googling the Deep Web</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea Cal</string-name>
          <email>andrea@dcs.bbk.ac.uk</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Martinenghi</string-name>
          <email>davide.martinenghi@polimi.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Riccardo Torlone</string-name>
          <email>torlone@dia.uniroma3.it</email>
        </contrib>
      </contrib-group>
      <abstract>
        <p>The Deep Web is constituted by data that are accessible through Web pages, but not indexable by search engines as they are returned in dynamic pages. In this paper we propose a conceptual framework for answering keyword queries on Deep Web sources represented as relational tables with so-called access limitations. We formalize the notion of optimal answer and characterize queries for which an answer can be found.</p>
      </abstract>
      <kwd-group>
        <kwd>Keyword Query</kwd>
        <kwd>Access Pattern</kwd>
        <kwd>Deep Web</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>r1 =</p>
      <sec id="sec-1-1">
        <title>Depti Emp</title>
        <p>IT John t11
AI Mike t12
r2 =</p>
      </sec>
      <sec id="sec-1-2">
        <title>Empi Proj</title>
        <p>John P1 t21
Ann P2 t22
Mike P2 t23
r3 =</p>
      </sec>
      <sec id="sec-1-3">
        <title>Proji Emp Role</title>
        <p>P1 John DBA t31
P1 Ann Analyst t32
Relation r1 represents a form that, given a department, returns all the
employees working in it; relation r2 a form that, given an employee, returns all the
projects he/she works on; and relation r3 a form that, given a project, returns
the employees working in it along with their role. These access modalities are
commonly referred to as access limitations, in that data can only be queried
according to given patterns.</p>
        <p>
          Di erent approaches have been proposed in the literature for querying
databases with access limitations: conjunctive queries [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], natural language [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ],
and SQL-like statements [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. In this paper, we address the novel problem of
accessing the Deep Web by just providing a set of keywords, in the same way in
which we usually search for information on the Web with a search engine.
        </p>
        <p>Consider for instance the case in which the user only provides the keywords
\DBA" and \IT" for querying the portion of the Deep Web represented by the
relations above. Intuitively, he/she is searching for employees with the DBA role
in the IT department. Given the access limitations, this query can be concretely
answered by rst accessing relation r1 using the keyword IT, which allows us
to extract the tuple t11. Then, using the value John in t11, we can extract the
tuple t21 from relation r2. Finally, using the value P1 in t21, we can extract the
tuples t31 and t32 from relation r3. Now, since t31 contains DBA, it turns out
that the set of tuples ft11; t21; t31g is a possible answer to the input query in
that the set is connected (every two tuples in it share a constant) and contains
the given keywords. However, the tuple t21 is somehow redundant and can be
safely eliminated from the solution, since the set ft11; t31g is also connected and
contains the keywords. This example shows that, in this context, the keyword
query answering problem can be involved and tricky, even in simple situations.</p>
        <p>In the rest of this paper, we formally investigate this problem in depth. We
rst propose, in Section 2, a precise semantics of (optimal) answer to a keyword
query in the Deep Web. We then tackle, in Section 3, the problem of nding
an answer to a keyword query by assuming that the domains of the keywords
are known in advance. This allows us to perform static analysis to immediately
discard irrelevant cases from our consideration. Section 4 ends the paper with
some conclusions and future works.</p>
        <p>Related work To the best of our knowledge, ours is the rst comprehensive
approach to the problem of querying the Deep Web using keywords. Other works
addressed keyword search on relational data previously extracted from the Deep
Web [?]. This paper reports results previously published in [?], the paper where
we illustrate our techniques for keyword search in the Deep Web.</p>
        <p>
          The problem of query processing in the Deep Web has been widely
investigated in the last years, with di erent approaches and under di erent perspectives
including: data crawling [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], integration of data sources [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], query plan
optimization [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], and generic structured query models [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. However, none of them has
tackled the problem that we have addressed in this paper.
        </p>
        <p>
          The idea of querying structured data using keywords emerged more than a
decade ago [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] as a way to provide high-level access to data and free the user from
the knowledge of query languages and data organization. Since then, a lot of work
has been done in this eld (see, e.g., [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] for a survey) but never in the context
of the Deep Web. This problem has been investigated in the context of various
data models: relational [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], semi-structured [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], XML [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], and RDF [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Within
the relational model, the common assumption is that an answer to a keyword
query is a graph of minimal size in which the nodes represent tuples, the edges
represent foreign key references between them, and the keywords occur in some
node of the graph [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Our de nition of query answer follows this line but it is
more general, since it is only based on the presence of common values between
tuples, while not forcing the presence of foreign keys.
        </p>
        <p>
          The various approaches to keyword query answering over relational databases
are commonly classi ed into two categories: schema-based and schema-free.
Schema-based approaches [
          <xref ref-type="bibr" rid="ref1 ref10">1, 10</xref>
          ] make use, in a preliminary phase, of the
database schema to build SQL queries that are able to return the answer.
Conversely, schema-free approaches [
          <xref ref-type="bibr" rid="ref12 ref7">7, 12</xref>
          ] rely on exploration techniques over
a graph-based representation of the whole database. Since the search for an
optimal answer consists in nding a minimal Steiner tree on the graph, which is
known to be an NP-Complete problem [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], the various proposals rely on
heuristics aimed at generating approximations of Steiner trees. Our approach makes
use of the schema of the data sources but cannot be classi ed in any of the
approaches above since, given the access limitations, it rather relies on building
a minimal query plan of accesses to the data sources.
2
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries and problem de nition</title>
      <p>We model data sources as relations of a relational database and we assume that,
albeit autonomous, they have \compatible" attributes. For this, we x a set
of abstract domains D = fD1; : : : ; Dmg, which, rather than denoting concrete
value types (such as string or integer), represent data types at a higher level of
abstraction (for instance, car or country ). Therefore, in an abstract domain an
object is uniquely represented by a value. The set of all values is denoted by
D = Sn</p>
      <p>i=1 Di. For simplicity, we assume that all abstract domains are disjoint.
We then say that a (relation) schema r, customarily indicated as r(A1; : : : ; Ak),
is a set of attributes fA1; : : : ; Akg, each associated with an abstract domain
dom(Ai) 2 D, 1 i k. A database schema S is a set of schemas fr1; : : : ; rng.</p>
      <p>As usual, given a schema r, a tuple t over r is a function that associates a
value c 2 dom(A) with each attribute A 2 r, and a relation instance rI of r is a
set of tuples over r. For simplicity, we also write dom(c) to indicate the domain
of c. A (database) instance I of a database schema S = fr1; : : : ; rng is a set of
relation instances fr1I ; : : : ; rIng, where riI denotes the relation instance of ri in I.</p>
      <p>For the sake of simplicity, in the following we assign the same name to
attributes of di erent schemas that are de ned over the same abstract domain.
De nition 1 (Access pattern). An access pattern for a schema
r(A1; : : : ; Ak) is a mapping : fA1; : : : ; Akg ! M , where M = fi; og is called
access mode, and i and o denote input and output, respectively; Ai is
correspondingly called an input (resp., output) attribute for r wrt .
Henceforth, we denote input attributes with an `i' superscript, e.g., Ai. Moreover,
we assume that each relation has exactly one access pattern.</p>
      <sec id="sec-2-1">
        <title>De nition 2 (Binding). Let A01; : : : ; A0` be all the input attributes for r wrt</title>
        <p>; any tuple b = hc1; : : : ; c`i such that ci 2 dom(A0i) for 1 i ` is called a
binding for r wrt .</p>
      </sec>
      <sec id="sec-2-2">
        <title>De nition 3 (Access). An access is a pair h ; bi, where is an access pattern</title>
        <p>for a schema r and b is a binding for r wrt . The output of such an access on
an instance I is the set T of all tuples in the relation rI 2 I over r that match
the binding, i.e., such that T = A1=c1;:::;A`=c` (r):
Intuitively, we can only access a relation if we can provide a binding for it, i.e.,
a value for every input attribute.</p>
      </sec>
      <sec id="sec-2-3">
        <title>De nition 4 (Access path). Given an instance I for a database schema S,</title>
        <p>a set of access patterns for the relations in S, and a set of values C D, an
access path on I (for S, and C) is a sequence !b1r1I T1 !b2r2I !bnrnI Tn,
where, for 1 i n, (i) bi is a binding for a relation ri 2 S wrt a pattern
i 2 for ri, (ii) Ti is the output of access h i; bii on I, and (iii) each value
in bi either occurs in Tj with j &lt; i or is a value in C.</p>
        <p>De nition 5 (Reachable portion). A tuple t in I is said to be reachable
given C if there exists an access path P (for S, and C) such that t is in the
output of some access in P ; the reachable portion reach(I; ; C) of I is the set
of all reachable tuples in I given C.</p>
        <p>In the following, we will write S</p>
        <p>to refer to schema S under access patterns
Example 1. Consider the following instance
fr1(Ai; B); r2(Bi; A); r3(Ai; B; C)g.</p>
        <p>Ai B
Then, for instance, ft11g is the output of the access with binding ha1i wrt
r1(Ai; B), and h!a1ri1I ft11g h!b1ri2I ft21; t23g, is an access path for S, and
C = fa1g, since, given C, we can extract t11 from r1 and, given fb1g from t11,
we can extract t21 and t23 from r2. The reachable portion of I, given C, is
reach(I; ; C) = ft11; t12; t21; t23; t31; t33g, while ft22; t32g \ reach(I; ; C) = ;.
Figure 1a shows the reachable portion I0 of I given C along with the access
paths used to extract it, with dotted lines enclosing outputs of accesses.
The de nition of answer to a keyword query in our setting requires the
preliminary notion of join graph.</p>
        <p>De nition 6 (Join graph). Given a set T of tuples, the join graph of T is a
node-labeled undirected graph hN; Ei constructed as follows: (i) the nodes N are
labeled with tuples of T , with a one-to-one correspondence between tuples of T
and nodes of N ; and (ii) there is an arc between two nodes n1 and n2 whenever
the tuples labeling n1 and n2 have at least one value in common.</p>
        <p>A1</p>
        <p>A2</p>
        <p>A keyword query (KQ) is a non-empty set of values in D called keywords.
De nition 7 (Answer to a KQ). An answer to a KQ q against a database
instance I over a schema S is a set of tuples A in reach(I; ; q) such that:
(i) each keyword k 2 q occurs in at least one tuple t in A; (ii) the join graph
of A is connected; (iii) no proper subset A0 A satis es both Conditions (i)
and (ii) above.</p>
        <p>
          It is straightforward to see that there could be several answers to a KQ; below
we give a widely accepted criterion for ranking such answers [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-4">
        <title>De nition 8. Let A1; A2 be two answers to a KQ q on an instance I. We say</title>
        <p>that A1 is better than A2 if jA1j jA2j. The optimal answers are those of
minimum size.</p>
        <p>Example 3. Consider a KQ q = fa1; c1g over the instance I of Example 1.
Figure 1a shows two possible answers: A1 = ft11; t31g and A2 = ft11; t23; t33g. A1
is better than A2 and is the optimal answer to q.
3</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Detecting non-answerable queries</title>
      <p>For convenience of notation, we sometimes write c : D to denote value c and
indicate that dom(c) = D. In addition, in our examples, the name of an attribute
will also indicate its abstract domain.
3.1</p>
      <sec id="sec-3-1">
        <title>Compatible queries</title>
        <p>In order to focus on meaningful queries, we semantically characterize queries for
which an answer might be found.</p>
        <p>De nition 9 (Compatibility). A KQ q is said to be compatible with a schema</p>
        <sec id="sec-3-1-1">
          <title>S if there exist a set of access patterns and an instance I over S such that there is an answer to q against I.</title>
          <p>Example 4. The KQ q1 = fa : A; c : Cg is not compatible with schema S1 =
fr1(A; B); r2(C; D)g, since no set of tuples from S containing all the keywords in
q1 can ever be connected, independently of the access patterns for S1. Conversely,
q1 is compatible with S2 = fr1(A; B); r3(B; C)g, as witnessed by a possible
answer fr1(a; b); r3(b; c)g and patterns such that S2 = fr1(Ai; B); r3(B; C)g.</p>
          <p>Similarly, KQ q2 = fa : A; a0 : Ag is compatible with a schema S3 =
fr1(A; B)g, as witnessed by a possible answer fr1(a; b); r1(a0; b)g and patterns
such that S3 = fr1(Ai; B)g. However, q2 is not compatible with a schema
S4 = fr4(A)g, since a unary relation, alone, can never connect two keywords.
Checking compatibility of a KQ with a schema essentially amounts to checking
reachability on a graph. The main idea is that in order for an answer to ever be
possible, we must nd an instance that exhibits a witness (i.e., a set of tuples)
satisfying all the conditions of De nition 7.
3.2</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>Answerable queries</title>
        <p>A stricter requirement than compatibility is given by the notion of answerability.
De nition 10 (Answerability). A KQ q is answerable against a schema S
if there is an instance I over S such that there is an answer to q against I.
Example 5. Consider KQ q = fa : A; c : Cg and schema S1 = fr(Ai; B);
s(B; C; Di)g. Although q is compatible with S1, it is not answerable against
S1 , since no tuple from S1 can be extracted under (no values for domain D
are available). Conversely, q is answerable against S2 = fr(Ai; B); s(Bi; C; D)g,
since an answer like fr(a; b); s(b; c; d)g could be extracted by rst accessing r
with binding hai, thus extracting value b, and then s with binding hbi.
In order to check answerability, we need to check that all the required relations
can be accessed according to the access patterns. To this end, we rst refer to a
schema enriched with unary relations representing the keywords in the KQ. 1
De nition 11 (Expanded schema). Let q be a KQ over a schema S . The
expanded schema Sq of S wrt. q is de ned as Sq = S [frc(C)jc 2 qg, where
rc is a new unary relation, not occurring in S , whose only attribute C is an
output attribute with abstract domain dom(C) = dom(c).</p>
        <p>Then, we use the notion of dependency graph (d-graph) to denote output-input
dependencies between relation arguments, indicating that a relation under access
patterns needs values from other relations.
1 If other values are known besides the keywords, this knowledge may be represented
by means of appropriate unary relations with output mode in the schema.</p>
        <p>A
a</p>
        <p>Ai
B
r</p>
        <p>B
C
Di
s</p>
        <p>Ci
D
s</p>
        <p>C
c</p>
        <p>C
c</p>
        <p>Ai
B
r</p>
        <p>A
a</p>
        <p>A
a</p>
        <p>B
D
Ei
u</p>
        <p>Ai
B
r</p>
        <sec id="sec-3-2-1">
          <title>De nition 12 (d-graph). Let q be a KQ over a schema S . The d-graph GqS</title>
          <p>is a directed graph hN ; E i de ned as follows. For each attribute A of each relation
in the expanded schema Sq , there is a node in N labeled with A's access mode
and abstract domain. There is an arc uyv in E whenever: (i) u and v have the
same abstract domain; (ii) u is an output node; and (iii) v is an input node.
Some relations are made invisible by the access patterns and can be discarded.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>De nition 13 (Visibility). An input node vn 2 N in a d-graph hN ; E i is</title>
          <p>visible if there is a sequence of arcs u1yv1; : : : ; unyvn in E such that (i) u1's
relation has no input attributes, and (ii) vi's and ui+1's relation are the same,
for 1 i n 1. A relation is visible if all of its input nodes are.
Example 5 (cont.). Consider the KQ and the schemas from Example 5. The
dgraphs GqS1 and GqS2 are shown in Figures 2a and Figure 2b, respectively.
Bi
C
D
s</p>
          <p>C
c
(a) D-graph GqS1 from Example 5; s is
not visible.
(b) D-graph GqS2 from Example 5; all
relations are visible.
(c) D-graph GqS from Example 6; u is
not visible.</p>
          <p>Answerability of a KQ q is checked by means of compatibility with a schema in
which all non-visible relations have been eliminated.</p>
          <p>Example 6. Consider a KQ q = fa : A; c : Cg and a schema S =
fr(Ai; B); s(Ci; D); u(B; D; Ei)g. Relation u is not visible in GqS (Figure 2c).
Then, q is not answerable in S , since q is not compatible with schema
fr(A; B); s(C; D)g (i.e., S without relation u).</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusions and future work</title>
      <p>We have de ned the problem of keyword search in the Deep Web. We are
currently working on minimizing the number of accesses to the data sources.</p>
      <p>We believe that several interesting issues can be studied in the framework
de ned in this paper. We plan, e.g., to leverage known values (besides the
keywords) and ontologies to speed up the search for an optimal answer as well as
to consider the case in which nodes and arcs of the join graph are weighted to
model source availability and proximity, respectively.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Sanjay</given-names>
            <surname>Agrawal</surname>
          </string-name>
          , Surajit Chaudhuri, and
          <string-name>
            <surname>Gautam Das</surname>
          </string-name>
          .
          <article-title>Dbxplorer: A system for keyword-based search over relational databases</article-title>
          .
          <source>In ICDE</source>
          , pages
          <volume>5</volume>
          {
          <fpage>16</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Meghyn</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          et al.
          <article-title>Dealing with the deep web and all its quirks</article-title>
          .
          <source>In Proc. of VLDS</source>
          , pages
          <volume>21</volume>
          {
          <fpage>24</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Cal</surname>
          </string-name>
          and
          <string-name>
            <given-names>Davide</given-names>
            <surname>Martinenghi</surname>
          </string-name>
          .
          <article-title>Querying data under access limitations</article-title>
          .
          <source>In ICDE</source>
          , pages
          <volume>50</volume>
          {
          <fpage>59</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Cal</surname>
          </string-name>
          and
          <string-name>
            <given-names>Davide</given-names>
            <surname>Martinenghi</surname>
          </string-name>
          .
          <article-title>Querying the deep web</article-title>
          .
          <source>In EDBT</source>
          , pages
          <volume>724</volume>
          {
          <fpage>727</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Andrea</given-names>
            <surname>Cal</surname>
          </string-name>
          , Davide Martinenghi, and
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Torlone</surname>
          </string-name>
          .
          <article-title>Keyword search in the deep web</article-title>
          .
          <source>In Proc. of the 9th AMW</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Garey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Graham</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Johnson</surname>
          </string-name>
          .
          <article-title>The complexity of computing Steiner minimal trees</article-title>
          .
          <source>SIAM J. on Applied Mathematics</source>
          ,
          <volume>32</volume>
          (
          <issue>4</issue>
          ):
          <volume>835</volume>
          {
          <fpage>859</fpage>
          ,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Konstantin</given-names>
            <surname>Golenberg</surname>
          </string-name>
          , Benny Kimelfeld, and
          <string-name>
            <given-names>Yehoshua</given-names>
            <surname>Sagiv</surname>
          </string-name>
          .
          <article-title>Keyword proximity search in complex data graphs</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <volume>927</volume>
          {
          <fpage>940</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lin</surname>
            <given-names>Guo</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feng Shao</surname>
            , Chavdar Botev, and
            <given-names>Jayavel</given-names>
          </string-name>
          <string-name>
            <surname>Shanmugasundaram</surname>
          </string-name>
          .
          <article-title>Xrank: ranked keyword search over xml documents</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <volume>16</volume>
          {
          <fpage>27</fpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Bin</surname>
            <given-names>He</given-names>
          </string-name>
          , Zhen Zhang, and
          <string-name>
            <surname>Kevin</surname>
            <given-names>Chen-Chuan</given-names>
          </string-name>
          <string-name>
            <surname>Chang</surname>
          </string-name>
          .
          <article-title>Metaquerier: querying structured web sources on-the- y</article-title>
          .
          <source>In Proc. of SIGMOD</source>
          , pages
          <volume>927</volume>
          {
          <fpage>929</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Vagelis</given-names>
            <surname>Hristidis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yannis</given-names>
            <surname>Papakonstantinou</surname>
          </string-name>
          . Discover:
          <article-title>Keyword search in relational databases</article-title>
          .
          <source>In VLDB</source>
          , pages
          <volume>670</volume>
          {
          <fpage>681</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Hasan</surname>
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Jamil</surname>
            and
            <given-names>Hosagrahar V.</given-names>
          </string-name>
          <string-name>
            <surname>Jagadish</surname>
          </string-name>
          .
          <article-title>A structured query model for the deep relational web</article-title>
          .
          <source>In CIKM</source>
          , pages
          <volume>1679</volume>
          {
          <fpage>1682</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Benny</given-names>
            <surname>Kimelfeld</surname>
          </string-name>
          and
          <string-name>
            <given-names>Yehoshua</given-names>
            <surname>Sagiv</surname>
          </string-name>
          .
          <article-title>Finding and approximating top-k answers in keyword proximity search</article-title>
          .
          <source>In PODS</source>
          , pages
          <volume>173</volume>
          {
          <fpage>182</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Jens</surname>
          </string-name>
          Lehmann et al.
          <article-title>Deqa: Deep web extraction for question answering</article-title>
          .
          <source>In ISWC'12</source>
          , pages
          <fpage>131</fpage>
          {
          <fpage>147</fpage>
          . Springer-Verlag,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Guoliang</surname>
          </string-name>
          Li et al.
          <article-title>EASE: an e ective 3-in-1 keyword search method for unstructured, semi-structured and structured data</article-title>
          .
          <source>In SIGMOD</source>
          , pages
          <volume>903</volume>
          {
          <fpage>914</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Jayant</surname>
            <given-names>Madhavan</given-names>
          </string-name>
          , Loredana Afanasiev, Lyublena Antova, and
          <string-name>
            <surname>Alon</surname>
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Halevy</surname>
          </string-name>
          .
          <article-title>Harnessing the deep web: Present and future</article-title>
          .
          <source>In CIDR</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <article-title>Sriram Raghavan and Hector Garcia-Molina. Crawling the hidden web</article-title>
          .
          <source>In VLDB</source>
          , pages
          <volume>129</volume>
          {
          <fpage>138</fpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Thanh</surname>
          </string-name>
          Tran et al.
          <article-title>Top-k exploration of query candidates for e cient keyword search on graph-shaped (rdf) data</article-title>
          .
          <source>In ICDE</source>
          , pages
          <volume>405</volume>
          {
          <fpage>416</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Je rey Xu Yu</surname>
            ,
            <given-names>Lu</given-names>
          </string-name>
          <string-name>
            <surname>Qin</surname>
            , and
            <given-names>Lijun</given-names>
          </string-name>
          <string-name>
            <surname>Chang</surname>
          </string-name>
          .
          <article-title>Keyword search in relational databases: A survey</article-title>
          .
          <source>IEEE Data Eng. Bull.</source>
          ,
          <volume>33</volume>
          (
          <issue>1</issue>
          ):
          <volume>67</volume>
          {
          <fpage>78</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>