<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>User driven Information Extraction with LODIE</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anna Lisa Gentile</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Suvodeep Mazumdar</string-name>
          <email>s.mazumdar@sheffield.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Sheffield</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Information Extraction (IE) is the technique for transforming unstructured or semi-structured data into a structured representation that can be understood by machines. In this paper we use a user-driven Information Extraction technique to wrap entity-centric Web pages. The user can select concepts and properties of interest from available Linked Data. Given a number of websites containing pages about the concepts of interest, the method exploits (i) recurrent structures in the Web pages and (ii) available knowledge in Linked Data to extract the information of interest from the Web pages.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        rules to previously unseen websites of the same domain [
        <xref ref-type="bibr" rid="ref10 ref9">9,10</xref>
        ]. Completely
unsupervised methods (e.g. RoadRunner [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and EXALG [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]) do not require any
training data, nor an initial extraction template (indicating which concepts and
attributes to extract), and they only assume the homogeneity of the considered
pages. The drawback of unsupervised methods is that assigning semantics to the produced
results is left as a post-processing step for the user. Hybrid methods [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] intend to find a
trade-off between these two limitations by proposing a supervised strategy, where the
training data is automatically generated by exploiting Linked Data. In this work
we perform IE using the method proposed in [
        <xref ref-type="bibr" rid="ref2 ref3">2,3</xref>
        ] and follow the general IE
paradigm from [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>User-driven Information Extraction</title>
      <p>In LODIE we adopt a user-driven paradigm for IE. As a first step, the user must
define her/his information need. This is done via a visual exploration of linked
data (Figure 1).</p>
      <p>
        The user can explore the underlying linked data using the Affective Graphs
visualization tool [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and select the concepts and properties she/he is interested in
(a screenshot is shown in Figure 1). These concepts and properties are added
to the side panel. Once the selection is finished, she/he can start the IE
process. The IE starts with a dictionary generation phase. A dictionary d<sub>i,k</sub> consists
of the values of attribute a<sub>i,k</sub> for instances of concept c<sub>i</sub>. Noisy entries in the
dictionaries are removed using a cleaning procedure detailed in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. As a
running example we will assume the user wants to extract the title and author for the
concept Book. We retrieve from the Web k websites containing entity pages of
the concept types selected by the user, and save the pages as W<sub>c<sub>i</sub>,k</sub>. Following the
Book example, the Barnes&amp;Noble<sup>2</sup> or AbeBooks<sup>3</sup> websites can be used, with the pages
collected in W<sub>book,barnesandnoble</sub> and W<sub>book,abebooks</sub>.
      </p>
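      <p>As an illustration only, the dictionary generation phase can be sketched in Python as follows. The sketch assumes that the Linked Data has already been queried and flattened into (concept, attribute, value) tuples. The function name build_dictionaries, the sample tuples and the simple lower-casing are hypothetical, and the cleaning procedure of [<xref ref-type="bibr" rid="ref3">3</xref>] is not reproduced.</p>
      <preformat>
```python
from collections import defaultdict


def build_dictionaries(triples, selection):
    """Group attribute values by (concept, attribute) pair.

    triples   -- iterable of (concept, attribute, value) tuples (assumed input)
    selection -- set of (concept, attribute) pairs chosen by the user
    """
    dictionaries = defaultdict(set)
    for concept, attribute, value in triples:
        # Lower-case and collapse whitespace; this is NOT the cleaning of [3].
        cleaned = " ".join(value.lower().split())
        # Keep only non-empty values of the attributes the user selected.
        if cleaned and (concept, attribute) in selection:
            dictionaries[(concept, attribute)].add(cleaned)
    return dict(dictionaries)


# Hypothetical sample data for the running Book example.
sample_triples = [
    ("Book", "title", "Breaking Dawn"),
    ("Book", "title", "The Host"),
    ("Book", "author", "Stephenie Meyer"),
    ("Book", "publisher", "Little, Brown"),
]
user_selection = {("Book", "title"), ("Book", "author")}
book_dictionaries = build_dictionaries(sample_triples, user_selection)
```
      </preformat>
      <p>In this sketch each dictionary is a set, so duplicate values retrieved for the same attribute collapse into a single entry.</p>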
      <p>For each W<sub>c<sub>i</sub>,k</sub> we generate a set of extraction patterns for every attribute.
In our example we produce 4 sets of patterns, one for each combination of website and
attribute. To produce the patterns we (i) use our dictionaries to generate
brute-force annotations on the pages in W<sub>c<sub>i</sub>,k</sub> and then (ii) use statistical (occurrence
frequency) and structural (position of the annotations in the webpage) clues to
choose the final extraction patterns.<fn id="fn2"><label>2</label><p>http://www.barnesandnoble.com/</p></fn><fn id="fn3"><label>3</label><p>http://www.abebooks.co.uk</p></fn></p>
      <p>
        Briefly, a page is transformed into a simplified page representation P<sub>c<sub>i</sub></sub>: a
collection of pairs 〈xpath<sup>4</sup>, text value〉. Candidates are generated by matching the
dictionaries d<sub>i,k</sub> against possible text values in P<sub>c<sub>i</sub></sub> (Figure 2).
        <preformat>/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[1]/H2[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/DIV[1]/H2[1]/EM[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/TABLE[10]/TBODY[1]/TR[1]/TD[3]/B[1]/A[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/TABLE[1]/TBODY[1]/TR[1]/TD[3]/B[1]/A[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/TABLE[2]/TBODY[1]/TR[1]/TD[3]/B[1]/A[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/TABLE[3]/TBODY[1]/TR[1]/TD[3]/B[1]/A[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/TABLE[6]/TBODY[1]/TR[1]/TD[3]/B[1]/A[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/TABLE[8]/TBODY[1]/TR[1]/TD[3]/B[1]/A[1]/text()[1]    breaking dawn
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[3]/DIV[3]/UL[1]/LI[2]/A[1]/text()[1]    the host
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[3]/DIV[3]/UL[1]/LI[5]/A[1]/text()[1]    new moon</preformat>
        Fig. 2: Example of candidates for book title for a Web page on the book "Breaking Dawn", from the
website AbeBooks.
      </p>
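      <p>A minimal Python sketch of the candidate generation step is given below. It assumes that a page has already been converted into a list of (xpath, text value) pairs; the function names, the third xpath and the sample values are hypothetical, and exact matching after normalisation is a simplification rather than the matching strategy actually used in LODIE.</p>
      <preformat>
```python
def normalize(text):
    """Lower-case and collapse whitespace so that matching is lenient."""
    return " ".join(text.lower().split())


def generate_candidates(page_pairs, dictionary):
    """Return the (xpath, text) pairs whose text value is a dictionary entry."""
    entries = {normalize(value) for value in dictionary}
    return [(xpath, text) for xpath, text in page_pairs
            if normalize(text) in entries]


# Hypothetical simplified page representation, for illustration only.
page = [
    ("/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[1]/H2[1]/text()[1]",
     "Breaking Dawn"),
    ("/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[3]/DIV[3]/UL[1]/LI[2]/A[1]/text()[1]",
     "The Host"),
    ("/HTML[1]/BODY[1]/DIV[1]/SPAN[1]/text()[1]",
     "Sign in"),
]
title_dictionary = ["breaking dawn", "the host", "new moon"]
candidates = generate_candidates(page, title_dictionary)
```
      </preformat>
      <p>As in Figure 2, titles of other books that appear on the page (here "The Host") are also returned as candidates, which is why a subsequent selection step is needed.</p>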
      <p>
        Final patterns are chosen among the candidates by exploiting frequency
information and other heuristics. Details of the method can be found in [
        <xref ref-type="bibr" rid="ref2 ref3">2,3</xref>
        ].
In the running example, the highest-scoring patterns for extracting book titles from
the AbeBooks website are shown in Figure 3.
        <preformat>/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[1]/H2[1]/text()[1]    329.0
/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[4]/DIV[1]/H2[1]/EM[1]/text()[1]    329.0</preformat>
      </p>
      <p>Fig. 3: Extraction patterns for book titles from AbeBooks website.</p>
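      <p>The frequency-based part of the pattern selection can be sketched in Python as follows. This is only an approximation under stated assumptions: each xpath is scored by the number of pages on which it produced a candidate, and the xpaths tied at the highest score are kept. The function names and sample data are hypothetical, and the additional heuristics described in [<xref ref-type="bibr" rid="ref2 ref3">2,3</xref>] are omitted.</p>
      <preformat>
```python
from collections import Counter


def score_patterns(candidates_per_page):
    """Count, for each xpath, the pages on which it produced a candidate."""
    scores = Counter()
    for candidates in candidates_per_page:
        # Use a set so that an xpath is counted at most once per page.
        scores.update({xpath for xpath, _text in candidates})
    return scores


def select_patterns(scores, top_n=1):
    """Keep every xpath whose score is among the top_n distinct scores."""
    best = sorted(set(scores.values()), reverse=True)[:top_n]
    return sorted(xpath for xpath, score in scores.items() if score in best)


# Hypothetical candidates for three pages of one website.
TITLE_XPATH = "/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[2]/DIV[1]/H2[1]/text()[1]"
LIST_XPATH = ("/HTML[1]/BODY[1]/DIV[2]/DIV[2]/DIV[3]/DIV[3]"
              "/UL[1]/LI[2]/A[1]/text()[1]")
pages = [
    [(TITLE_XPATH, "breaking dawn"), (LIST_XPATH, "the host")],
    [(TITLE_XPATH, "new moon")],
    [(TITLE_XPATH, "the host"), (LIST_XPATH, "new moon")],
]
pattern_scores = score_patterns(pages)
final_patterns = select_patterns(pattern_scores)
```
      </preformat>
      <p>Keeping ties mirrors Figure 3, where two patterns share the same score; whether LODIE resolves ties in this way is an assumption of the sketch.</p>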
      <p>All extraction patterns are then used to extract target values from all W<sub>c<sub>i</sub>,k</sub>.
Results are produced as linked data, using the concept and properties initially
selected by the user for representation, and made accessible to the user via an
exploration interface (Figure 4), implemented using Simile Widgets<sup>5</sup>.</p>
      <p>A video showing the proposed system used with the running Book
example can be found at http://staffwww.dcs.shef.ac.uk/people/A.L.Gentile/demo/iswc2014.html.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusions and future work</title>
      <p>In this paper we describe the LODIE approach to performing IE on user-defined
extraction tasks. The user is presented with a visual tool to explore available linked
data and choose the concepts for which she/he wants to mine additional material
from the Web. We learn extraction patterns to wrap relevant websites and return
structured results to the user.<fn id="fn4"><label>4</label><p>http://www.w3.org/TR/xpath/</p></fn><fn id="fn5"><label>5</label><p>http://www.simile-widgets.org/</p></fn></p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Kushmerick</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Wrapper induction for information extraction</article-title>
          .
          <source>In: IJCAI97</source>
          . (
          <year>1997</year>
          )
          <fpage>729</fpage>
          –
          <lpage>735</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Gentile</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Augenstein</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciravegna</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Unsupervised wrapper induction using linked data</article-title>
          .
          <source>In: Proc. of the seventh international conference on Knowledge capture. K-CAP '13</source>
          , New York, NY, USA, ACM (
          <year>2013</year>
          )
          <fpage>41</fpage>
          –
          <lpage>48</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Gentile</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciravegna</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Self training wrapper induction with linked data</article-title>
          .
          <source>In: Proceedings of the 17th International Conference on Text, Speech and Dialogue (TSD 2014)</source>
          . (
          <year>2014</year>
          )
          <fpage>295</fpage>
          –
          <lpage>302</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Ciravegna</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gentile</surname>
            ,
            <given-names>A.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>LODIE: Linked open data for web-scale information extraction</article-title>
          .
          <source>In: SWAIE</source>
          . (
          <year>2012</year>
          )
          <fpage>11</fpage>
          –
          <lpage>22</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Muslea</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Minton</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knoblock</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Hierarchical wrapper induction for semistructured information sources</article-title>
          .
          <source>Autonomous Agents and Multi-Agent Systems</source>
          (
          <year>2001</year>
          )
          <fpage>1</fpage>
          –
          <lpage>28</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Soderland</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Learning information extraction rules for semi-structured and free text</article-title>
          .
          <source>Mach. Learn.</source>
          <volume>34</volume>
          (
          <issue>1-3</issue>
          ) (
          <year>February 1999</year>
          )
          <fpage>233</fpage>
          –
          <lpage>272</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Muslea</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Minton</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knoblock</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Active Learning with Strong and Weak Views: A Case Study on Wrapper Induction</article-title>
          .
          <source>IJCAI'03 18th International Joint Conference on Artificial Intelligence</source>
          (
          <year>2003</year>
          )
          <fpage>415</fpage>
          –
          <lpage>420</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Dalvi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soliman</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Automatic wrappers for large scale web extraction</article-title>
          .
          <source>Proc. of the VLDB Endowment</source>
          <volume>4</volume>
          (
          <issue>4</issue>
          ) (
          <year>2011</year>
          )
          <fpage>219</fpage>
          –
          <lpage>230</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Learning to adapt web information extraction knowledge and discovering new attributes via a Bayesian approach</article-title>
          .
          <source>Knowledge and Data Engineering, IEEE</source>
          <volume>22</volume>
          (
          <issue>4</issue>
          ) (
          <year>2010</year>
          )
          <fpage>523</fpage>
          –
          <lpage>536</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Hao</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>From One Tree to a Forest: a Unified Solution for Structured Web Data Extraction</article-title>
          .
          <source>In: SIGIR 2011</source>
          . (
          <year>2011</year>
          )
          <fpage>775</fpage>
          –
          <lpage>784</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Crescenzi</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mecca</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Automatic information extraction from large websites</article-title>
          .
          <source>Journal of the ACM</source>
          <volume>51</volume>
          (
          <issue>5</issue>
          ) (
          <year>September 2004</year>
          )
          <fpage>731</fpage>
          –
          <lpage>779</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Arasu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garcia-Molina</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Extracting structured data from web pages</article-title>
          .
          <source>In: Proc. of the 2003 ACM SIGMOD international conference on Management of data, ACM</source>
          (
          <year>2003</year>
          )
          <fpage>337</fpage>
          –
          <lpage>348</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mazumdar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petrelli</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elbedweihy</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lanfranchi</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciravegna</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Affective graphs: The visual appeal of linked data</article-title>
          .
          <source>Semantic Web – Interoperability, Usability, Applicability</source>
          . IOS Press (to appear,
          <year>2014</year>
          ) (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>