<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Extraction from Semantic Annotated Deep Web Sites</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eduardo Martín Rojo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vicente Luque Centeno</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidad Carlos III de Madrid Av. Universidad 30</institution>
          ,
          <addr-line>28911 Leganés (Madrid), España</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Automatically navigating and gathering information from Deep Web sites requires Web Wrappers that simulate human interaction with those sites. Web Wrappers have some drawbacks: their implementations are specific to the accessed site, and their source code needs constant maintenance in order to keep up with changes to the Web site. In this work we propose an annotation model for Deep Web sites that can be used for data extraction from the point of view of a Web client. Using these annotations will make Web Wrappers more adaptable to Web site changes.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>user-friendly tools that allow natural language references like Chickenfoot 7, and
also graphical IDE solutions like OpenKapow RoboMaker 8.</p>
      <p>The main characteristic shared by all current Wrapper development
tools is that, although they are simple to operate, they remain strongly
dependent on the Web site structure, so the created Wrappers demand continuous
maintenance. In our work we define a formalization of Web
structure that can be used for the semantic annotation of Deep Web sites in order to
represent their navigation behaviour. These annotations, combined with current
Wrapper development tools, will allow the implementation of Wrappers focused on
the Web site model, avoiding coupling with the site structure. The maintenance
required by Web site changes will be isolated inside this model layer: any
future modification of the structure of the Web site will only require changing
the semantic annotation of that site, and none of the Wrappers that make use
of those annotations will need to be changed.</p>
      <p>
        One of the main problems raised by navigation model generation is
the choice of a point of view. In [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], it is noted that a Web navigation
model can be generated from three points of view: the server point of view,
analyzing the physical structure of the Web site; the client point of view, analyzing
client interaction with the Web site; and a hybrid point of view combining
the client and server points of view. In our work we have chosen the client
point of view for the following reasons:
– The navigation model must support the changeability and diversity of Web
content. Client-side technologies like AJAX or Flash are widespread, and the
model must also support this kind of content.
– It is non-intrusive; it does not require changing the Web site being annotated,
unlike other semantic annotation methods such as RDFa [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and Microformats [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
– A client-only point of view allows third parties and end users to participate and
collaborate in the creation of annotations for any Web site, even though they
do not have access to the server.
      </p>
      <p>In this document, section 2 introduces previous work related to Deep
Web navigation modeling and data extraction. The annotation model for
expressing navigation graphs that we present is described in section 3 by defining
its main classes and properties; we also present an example annotation, using our
model, for a shop Web site. Section 4 describes how the
annotations can be used for information extraction, through an example that performs
a query over different sources. Finally, future work and conclusions are
described in section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Works</title>
      <p>
        The task of generating a navigation model from a Web site has been
addressed before, but previous works do not provide a formalization of Web model
annotations.
7 http://groups.csail.mit.edu/uid/chickenfoot/
8 http://openkapow.com/
There have been attempts to extract the model automatically from real navigation
examples, such as [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], where the authors use navigation examples
obtained by recording and replaying the actions performed by a user, and [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
which focuses on generating models for visualizing Web site hierarchies.
      </p>
      <p>
        Deep Web navigation model generation is treated in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], where the authors
generate a navigation model using keyword matching to identify
different Deep Web pages. Another work, more focused on interacting with Deep
Web sites, is Transcendence [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], a system that allows users to generalize their
queries in order to expand their search scope. It also allows defining data
extraction patterns for obtaining output data that may be combined with other data
sources.
      </p>
      </p>
      <p>
        One of the main problems of the Deep Web is that a Deep Web page
cannot always be represented uniquely by a URL9. The problem of
referencing Deep Web pages is addressed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], where the authors present a way of
bookmarking a Deep Web page using a sequence of steps, specified
with Chickenfoot scripts, that simulates the user interaction needed to reach the
bookmarked page. This is combined with images that show the visual
representation of the bookmarked page.
      </p>
      <p>
        Extracting semantic data from Web sites is a problem that has been addressed
in Marmite [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], a system that provides an end-user interface for
constructing the workflow needed to extract the data, and in Piggy Bank [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a system
where the user can employ scripts called Screen Scrapers to convert
HTML into semantic data in RDF. Related to this last work is Sifter, a tool
that
      </p>
      <p>
        In our demo system we use Chickenfoot as the Wrapper development tool. The
main characteristics of this tool and its natural-language references to HTML
elements are presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Chickenfoot enables the specification of
client-side Web interactions with a simple Javascript-based language.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Deep Web Annotation Model</title>
      <p>The objective of a Web client is to reach a specific state through interaction
with the Web. Our model represents the states and transitions that compose
the navigation as a graph describing all the situations that could occur
in a Web site from the client’s point of view. In the graph, vertices represent all
possible logical states that require an action produced by the user, while edges
represent the actions that allow transitions between logical states.</p>
      <p>Every state can be divided into elements called fragments, and these
fragments can in turn be divided into new fragments. Every fragment represents a type
of semantic content that can be extracted by selecting part of the element it
belongs to. Figure 1 shows an example of the division of a state into different fragments.
Fragments of logical states will allow us to extract semantic information from
Deep Web pages if we know the location of the state inside the navigation graph
of the Web site.
9 Uniform Resource Locator, defined in http://tools.ietf.org/html/rfc1738</p>
      <p>A transition represents the actions available to the user. These actions
allow moving among the different states of the Web site. Transitions are composed
of, first, an ordered sequence of interactions that causes the change of state,
and second, a list of all possible destinations reachable through the
transition. There can be several possible destinations because the Web site
may behave differently depending on dynamic factors such as the state of
the service framework, the interaction history, the date and time, etc.</p>
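      <p>As an illustration of how a client could pick the actual destination among several candidates after performing a transition, the following sketch is our own (the checks and page strings are hypothetical; the model itself expresses this through postconditions, described below):</p>

```python
def resolve_destination(candidates, observed_page):
    """Pick the destination state whose check matches what the client
    actually observes after performing the transition's actions."""
    for state, postcondition in candidates:
        if postcondition(observed_page):
            return state
    return None

# Hypothetical checks: a results page contains the product-list container,
# while an error page contains a "no results" message.
candidates = [
    ("state1",      lambda page: "atfResults" in page),
    ("state_error", lambda page: "no results" in page),
]

dest = resolve_destination(candidates, "page containing div id='atfResults'")
```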
      <p>Our model of states and transitions for the navigation graph is represented
by the following types of elements:
1. PageState: a uniquely identified state that represents a Deep Web page. A
state contains fragments.
2. Fragment: represents an element inside a PageState or another Fragment. It
is the domain of the following properties:
(a) fragmentOf: defines this fragment as part of another Fragment or of
a PageState, forming a hierarchy.
(b) locatedBy: an XPath expression that identifies the position of the fragment
inside the XHTML representation of the PageState.
(c) numResults: an integer that defines how many times this fragment
appears inside its parent Fragment or PageState in the hierarchy.
(d) optional: a boolean that indicates whether the fragment appears optionally
inside its parent Fragment or PageState. This may happen when a page has
variable elements.
(e) semantic: a set of RDF triples that represents the knowledge referred to by
this fragment. This property is also defined in Input and Action.
3. transitions: represents all possible transitions to other states that can be
followed from this state. It has the following properties:
(a) actions: any transition is originated by an ordered sequence of actions
(instances of the Action class).
(b) sources and destinations: all the states that can be a source or a
destination of this transition, respectively.
(c) precondition and postcondition: the preconditions and postconditions required
in order to use this transition for travelling from sources to destinations.
Postconditions can be used to distinguish between different possible
destinations.
4. Action: an interaction that can be performed over a fragment. It contains
the following properties:
(a) hasType: the type of interaction (clicking an element, selecting an element,
entering text. . . ). It is represented by a Chickenfoot command in our demo
annotations.
(b) inputs: all the form values needed to perform the interaction are
referred to by this property as a list of elements of type Input.</p>
      <p>For each input, a name and a description must be provided.</p>
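      <p>The model elements enumerated above could be sketched as plain data structures; the property names follow the list above, while the class layout itself is our own illustration, not part of the OWL ontology:</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Fragment:
    """An element inside a PageState (or inside another Fragment)."""
    located_by: str                       # XPath, relative to the parent's location
    fragment_of: Optional["Fragment"] = None
    num_results: int = 1                  # occurrences under the parent
    optional: bool = False                # may be absent on some renderings
    semantic: List[str] = field(default_factory=list)   # RDF triples/rules as text

@dataclass
class Input:
    name: str
    description: str
    semantic: List[str] = field(default_factory=list)

@dataclass
class Action:
    has_type: str                         # e.g. a Chickenfoot command: click, enter
    target: Fragment
    inputs: List[Input] = field(default_factory=list)
    semantic: List[str] = field(default_factory=list)

@dataclass
class PageState:
    name: str
    fragments: List[Fragment] = field(default_factory=list)

@dataclass
class Transition:
    actions: List[Action]                 # ordered sequence triggering the change
    sources: List[PageState]
    destinations: List[PageState]
    precondition: str = ""
    postcondition: str = ""               # can disambiguate among destinations
```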
      <p>
        We have represented this formalization in an OWL10 ontology that can be
accessed at [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>In figure 2 we show an example of the annotated navigation
model for searching products in a typical Shop of Books site. The initial
page of this site is represented by node state0, which has two fragments: element0
(a search textbox) and element1 (a button that starts the search).
From state0 it is possible to go to state1 through transition0. This transition
requires performing two actions (represented by the orderedActions list):
the first is performed by inserting a search string in the element0 textbox,
and the second by clicking the element1 button. At state1, three
fragments can be accessed: element2, element3 and element4.
Because each of them is part of another fragment, the XPath expression referred to by
the locatedBy property is relative to its parent’s locatedBy property. For example,
element4 is constructed from the element3 and element2 locations, so its location
would be //div[@id=’atfResults’][$INDEX]//div[@class=’productPrice’].</p>
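      <p>The relative-location composition described above can be sketched as follows (a minimal illustration; the element locations come from the figure 2 example, while the helper function is our assumption):</p>

```python
def absolute_location(fragment_paths):
    """Concatenate the locatedBy expressions along the fragmentOf chain,
    from the outermost fragment down to the target fragment."""
    return "".join(fragment_paths)

# fragmentOf chain for element4 in the Shop of Books example:
element2 = "//div[@id='atfResults']"       # list of products
element3 = "[$INDEX]"                      # one product of the list
element4 = "//div[@class='productPrice']"  # price inside that product

path = absolute_location([element2, element3, element4])
```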
      <p>In this figure we have used a hypothetical ontology for defining Shop
concepts, but the annotation model presented can be combined with any ontology
for defining semantic knowledge inside the Fragments, Actions or Inputs of the
model. This knowledge can be provided as RDF data inside the semantic property
of these elements.
10 http://www.w3.org/TR/owl-features/</p>
      <p>[Fig. 2: annotation graph for the Shop of Books Web site. Fragments element0 (locatedBy //input[@id=’twotabsearchtextbox’]) and element1 (locatedBy //input[@id=’navGoButtonPanel’]) belong to state0. transition0 (orderedActions: enter(element0, $INPUT); click(element1); with input $SEARCHED_PRODUCT of type rdfs:String, whose semantic rule binds :searchText via log:content) leads to state1. The fragments of state1 carry semantic annotations: element2 (locatedBy //div[@id=’atfResults’], ADD(this a shop:ListOfProducts.), ADD(this shop:searchedBy :searchText.)), element3 (locatedBy [$INDEX], ADD(this a shop:Product.)) and element4 (locatedBy //div[@class=’productTitle’], with a rule deriving ?father shop:productTitle ?text from this log:content ?text and this :fragmentOf ?father).]</p>
      <p>
        Representing the semantics of a specific element inside a state of a Deep Web
site requires not only the knowledge inferred from that specific state, but also the
knowledge of all the transitions that have been followed to reach it.
This knowledge may be transformed during the transitions, and new knowledge
may be added to or removed from the client’s working memory of assertions,
depending on the client’s interactions with the Deep Web site, as in a Production
System or Rule-Based System [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>As can be seen in figure 2, Fragment, Input and Action define the
semantic property, which indicates which knowledge (represented as RDF triples
or inference rules) is added (ADD) to or deleted (DEL) from the working memory.
Adding an RDF triple simply adds it to the working memory, while adding an
RDF rule executes it inside the working memory; every time new RDF triples
are added, the rule must be reapplied. Deleting a triple or a rule eliminates its
effect from the working memory. The effect of adding or removing RDF data in the
client’s working memory follows these rules:
1. If the client is located at a State that contains fragments, the semantic
property of the fragments is processed inside the working memory from outer
fragments to inner fragments following the fragmentOf property. Fragments at
the same level of nesting can be processed in any order.
2. If the client has travelled through a Transition that contains an ordered list of
actions, the semantic property of the actions is processed following the order
of that list.
3. If the client uses an input defined in a fragment of a State or in an action
of a Transition, the semantic property of the input is processed after all the
other semantic properties of the State or Transition.</p>
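      <p>A minimal sketch of this processing order follows (our own illustration, not the paper’s implementation; fragments and actions are plain dictionaries and semantic entries are opaque strings here):</p>

```python
def process_state(state, working_memory):
    """Apply fragment semantics outer-to-inner along the fragmentOf hierarchy."""
    def depth(fragment):
        d, f = 0, fragment
        while f.get("fragmentOf") is not None:
            d, f = d + 1, f["fragmentOf"]
        return d

    # Outer fragments (smaller depth) are processed before inner ones;
    # fragments at the same depth may run in any order (here: list order).
    for frag in sorted(state["fragments"], key=depth):
        working_memory.extend(frag.get("semantic", []))

def process_transition(transition, working_memory):
    """Apply action semantics in orderedActions order, then input semantics."""
    for action in transition["orderedActions"]:
        working_memory.extend(action.get("semantic", []))
    # Input semantics are processed after all other semantics of the Transition.
    for action in transition["orderedActions"]:
        for inp in action.get("inputs", []):
            working_memory.extend(inp.get("semantic", []))
```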
      <p>Semantic information is associated with every instance of Fragment, Action
or Input through the semantic property. With this property, the user
annotating a Web site indicates which kind of information the fragment,
action or input represents. Its content is expressed in RDF.</p>
      <p>For defining semantic information, in figure 2 we make use of the following
properties and concepts (based on the built-in functions of the CWM reasoner11):
– log:implies → property that relates the antecedent and consequent of a rule
expressed in RDF. Antecedents and consequents are written between braces {
and }.
– log:content → this property relates the logical representation of an element
with its string representation. It is used for defining the text content of an
input, or for defining the specific values of a piece of information.
– this → when used inside the semantic property of an element, it
represents the element itself. It is used for adding knowledge to an element.</p>
      <p>In order to facilitate sharing and reusing annotations, ontologies are needed
to model this semantic content.
11 http://www.w3.org/2000/10/swap/doc/CwmBuiltins.html</p>
    </sec>
    <sec id="sec-4">
      <title>Data Extraction using Annotations</title>
      <p>
        The content of the semantic property makes it possible to locate a particular piece of
information inside the navigation map by performing a query that specifies the conditions under
which the information can be found in the working memory. As our model deals
with knowledge expressed as RDF triples, we have selected SPARQL12 as the query
language for this purpose. SPARQL allows indicating the Named Graph [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] that
a set of conditions refers to. We make use of this functionality for relating
annotations of different Deep Web sites in order to perform a distributed query
over their graphs. Figure 3 shows an example of a query that uses two different
maps. The SPARQL query asks for the price of a book that fulfills the following
conditions, expressed in the WHERE part of the query:
1. Select from the RSS Web Site any element identified as a Product that has a
defined title.
2. Select from the Shop of Books Web Site any element identified as a Product that
has been searched for on the site using the string that represents the title of the
product previously obtained from the RSS Web Site.
      </p>
      <p>A set of conditions expressed for a specific navigation model in the query can
be satisfied by a list of transitions. The actions of these transitions may require
client inputs, and inputs may also be needed for accessing specific
fragments in a state. In the example of figure 3, the RSS map requires an input
called INDEX for accessing a fragment inside rss state0, and the Shop of Books map
requires the input SEARCHED PRODUCT for performing the actions of
transition0. Inputs are the points where the SPARQL query can relate data among
different maps. Because inputs require that some information be supplied,
the SPARQL query must handle the possible situations with these rules:
– If the conditions clearly state the specific value of the input, use this value.
– If the conditions indicate that the information required by the input must be
obtained from another graph, then the query must first perform its actions
in that other graph. This happens in the example with the Shop of Books map
at the condition ?product2 shop:searchedBy ?title, because title is referred to by the
RSS map.
– If the required information cannot be inferred, then the SPARQL query must
try all possible values for the input. In the example,
this happens with the input INDEX, which is not provided, so the query must
iterate over all the elements.</p>
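      <p>The three input-resolution rules above can be sketched as follows (our own illustration; the data shapes are assumptions, with conditions and cross-graph bindings represented as dictionaries):</p>

```python
def resolve_input(name, conditions, other_graphs, all_values):
    """Return the candidate values for an input, following the three rules."""
    # Rule 1: the conditions state the specific value explicitly.
    if name in conditions:
        return [conditions[name]]
    # Rule 2: the value must be obtained from another map's results first.
    for graph_bindings in other_graphs:
        if name in graph_bindings:
            return [graph_bindings[name]]
    # Rule 3: nothing can be inferred; iterate over every possible value.
    return list(all_values.get(name, []))
```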
      <p>
        We have developed a demo system that may be accessed through [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Our
demo generates Web wrapper scripts composed of Chickenfoot13 commands
that can be executed in Firefox with the Chickenfoot plugin installed.
Chickenfoot is a programming environment that provides a set of special functions for
performing Web tasks like clicking a link, entering data in an HTML form, etc.
Chickenfoot scripts are written in a superset of Javascript. One of its main
characteristics is its support for natural-language naming of HTML elements, which
eases the development of automatic Web tasks for end users [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
12 http://www.w3.org/TR/rdf-sparql-query/
13 http://groups.csail.mit.edu/uid/chickenfoot/
      </p>
      <p>[Fig. 3: SPARQL query relating two annotation maps: the RSS Interesting Books site (rss_state0, with fragments annotated as shop:Product and shop:ListOfProducts, and an INDEX input whose rule binds :positionSelected via log:content) and the Shop of Books site (transition0 with orderedActions enter(element0, $SEARCHED_PRODUCT) and the fragments of figure 2).]</p>
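      <p>The kind of wrapper script the demo emits could be assembled as in the following simplified sketch (the real generator and its templates are not shown in this article; the action tuples and the natural-language element names are hypothetical):</p>

```python
def generate_chickenfoot(actions, inputs):
    """Render an ordered list of (command, target, value) actions as
    Chickenfoot commands, substituting $NAME placeholders with inputs."""
    lines = []
    for command, target, value in actions:
        if command == "enter":
            lines.append("enter(%s, %r);" % (target, inputs[value.strip("$")]))
        elif command == "click":
            lines.append("click(%s);" % target)
    return "\n".join(lines)

# transition0 from the Shop of Books example, using Chickenfoot's
# natural-language references to the two fragments:
actions = [
    ("enter", "'search textbox'", "$SEARCHED_PRODUCT"),
    ("click", "'go button'", None),
]
script = generate_chickenfoot(actions, {"SEARCHED_PRODUCT": "Semantic Web"})
```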
      <p>
        The scripts are generated from SPARQL queries that can use annotations
expressed in the formalization presented in this article, also accessible
as OWL at URL [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Although SPARQL cannot express very complex
tasks, we think it is a good starting point for defining a more powerful language
that may be used in semantic-oriented Web Wrapper development. We think
this new approach may be useful for solving the drawbacks of previous Web
Wrapper development on Deep Web sites: dependencies on the Web
structure, constant maintenance of wrapper scripts, the need for deep knowledge of
the physical structure of the accessed Web sites, etc.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Future work</title>
      <p>Accessing and integrating data from non-structured heterogeneous sources is
a problem currently solved with Web Wrappers despite their drawbacks.
Web Wrappers are also tools that may be used to give structure to that
non-structured information and make it available to the Web of Data. We think that
wrapper development must be adapted to the new environment of linked data
using semantic-oriented wrapper definitions instead of site-specific
implementations. In order to achieve this, we are interested in researching more powerful
languages and development tools for this semantic approach to
wrapper development.</p>
      <p>We think our model will improve the development of Wrappers, because any
change to the Web site structure will only require modifying the annotation model,
and all Wrappers that make use of the annotations will be corrected automatically. However,
an exhaustive evaluation of the user effort required by our annotation
model would be appropriate, and will be carried out in future work.</p>
      <p>We are interested in continuing to work on the integration of ontologies with
semantic wrapper implementations in order to exploit the concepts shared among
different Web sites. The same semantic wrapper implementation may then be used on
any Web site annotated with the same ontology.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>This work has been partially funded by the Spanish Ministry of Education and
Science, project ITACA No. TSI2007-65393-C02-01.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Robert</given-names>
            <surname>Baumgartner</surname>
          </string-name>
          , Michal Ceresna, and
          <string-name>
            <given-names>Gerald</given-names>
            <surname>Ledermuller</surname>
          </string-name>
          .
          <article-title>Deep web navigation in web data extraction</article-title>
          .
          <source>In CIMCA '05: Proceedings of the International Conference on Computational Intelligence for Modelling</source>
          ,
          <source>Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce</source>
          Vol-
          <volume>2</volume>
          (CIMCA-IAWTIC'
          <volume>06</volume>
          ), volume
          <volume>2</volume>
          , pages
          <fpage>698</fpage>
          -
          <lpage>703</lpage>
          , Washington, DC, USA,
          <year>2005</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Jeffrey</surname>
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Bigham</surname>
          </string-name>
          , Anna C. Cavender, Ryan S. Kaminsky,
          <string-name>
            <surname>Craig M. Prince</surname>
          </string-name>
          , and
          <string-name>
            <surname>Tyler</surname>
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Robison</surname>
          </string-name>
          .
          <article-title>Transcendence: enabling a personal view of the deep web</article-title>
          .
          <source>In IUI '08: Proceedings of the 13th international conference on Intelligent user interfaces</source>
          , pages
          <fpage>169</fpage>
          -
          <lpage>178</lpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Michael</given-names>
            <surname>Bolin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robert C.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Naming page elements in end-user web automation</article-title>
          .
          <source>SIGSOFT Softw. Eng. Notes</source>
          ,
          <volume>30</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Michael</given-names>
            <surname>Bolin</surname>
          </string-name>
          , Matthew Webber, Philip Rha, Tom Wilson, and
          <string-name>
            <surname>Robert</surname>
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Automation and customization of rendered web pages</article-title>
          .
          <source>In UIST '05: Proceedings of the 18th annual ACM symposium on User interface software and technology</source>
          , pages
          <fpage>163</fpage>
          -
          <lpage>172</lpage>
          , New York, NY, USA,
          <year>2005</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Ronald</given-names>
            <surname>Brachman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hector</given-names>
            <surname>Levesque</surname>
          </string-name>
          .
          <source>Knowledge Representation and Reasoning</source>
          (The Morgan Kaufmann Series in Artificial Intelligence). Morgan Kaufmann, May
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Jeremy</surname>
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Carroll</surname>
            , Christian Bizer, Pat Hayes, and
            <given-names>Patrick</given-names>
          </string-name>
          <string-name>
            <surname>Stickler</surname>
          </string-name>
          .
          <article-title>Named graphs, provenance and trust</article-title>
          .
          <source>In WWW '05: Proceedings of the 14th international conference on World Wide Web</source>
          , pages
          <fpage>613</fpage>
          -
          <lpage>622</lpage>
          , New York, NY, USA,
          <year>2005</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Sudarshan</given-names>
            <surname>Chawathe</surname>
          </string-name>
          , Hector Garcia-molina, Joachim Hammer, Kelly Irel, Yannis Papakonstantinou, Jeffrey Ullman, and
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Widom</surname>
          </string-name>
          .
          <article-title>The tsimmis project: Integration of heterogeneous information sources</article-title>
          .
          <source>In In Proceedings of IPSJ Conference</source>
          , pages
          <fpage>7</fpage>
          -
          <lpage>18</lpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Darris</given-names>
            <surname>Hupp</surname>
          </string-name>
          and
          <string-name>
            <given-names>Robert C.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Smart bookmarks: automatic retroactive macro recording on the web</article-title>
          .
          <source>In UIST '07: Proceedings of the 20th annual ACM symposium on User interface software and technology</source>
          , pages
          <fpage>81</fpage>
          -
          <lpage>90</lpage>
          , New York, NY, USA,
          <year>2007</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>David</given-names>
            <surname>Huynh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Stefano</given-names>
            <surname>Mazzocchi</surname>
          </string-name>
          , and David Karger.
          <article-title>Piggy Bank: Experience the Semantic Web inside your web browser</article-title>
          , volume
          <volume>5</volume>
          , pages
          <fpage>16</fpage>
          -
          <lpage>27</lpage>
          . Elsevier Science Publishers B. V., Amsterdam, The Netherlands, The Netherlands,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Rohit</given-names>
            <surname>Khare</surname>
          </string-name>
          and
          <string-name>
            <given-names>Tantek</given-names>
            <surname>Celik</surname>
          </string-name>
          .
          <article-title>Microformats: a pragmatic path to the semantic web</article-title>
          . pages
          <fpage>865</fpage>
          -
          <lpage>866</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Dirk</given-names>
            <surname>Kukulenz</surname>
          </string-name>
          .
          <article-title>Adaptive site map visualization based on landmarks</article-title>
          .
          <source>In IV '05: Proceedings of the Ninth International Conference on Information Visualisation</source>
          , pages
          <fpage>473</fpage>
          -
          <lpage>479</lpage>
          , Washington, DC, USA,
          <year>2005</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. W3C.
          <article-title>RDFa Primer</article-title>
          .
          <source>W3C Working Group Note 14 October</source>
          <year>2008</year>
          ,
          <year>October 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Yang</surname>
            Wang and
            <given-names>Thomas</given-names>
          </string-name>
          <string-name>
            <surname>Hornung</surname>
          </string-name>
          .
          <article-title>Deep web navigation by example</article-title>
          .
          <source>In Tomasz Kaczmarek Marek Kowalkiewicz Tadhg Nagle Jonny Parkes Dominik Flejter</source>
          , Slawomir Grzonkowski, editor,
          <source>BIS 2008 Workshop Proceedings</source>
          , Innsbruck, Austria,
          <fpage>6</fpage>
          -7
          <source>May</source>
          <year>2008</year>
          , pages
          <fpage>131</fpage>
          -
          <lpage>140</lpage>
          . Department of Information Systems, Poznań University of Economics,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. WebTLab.
          <article-title>Site annotation demo</article-title>
          . http://corelli.gast.it.uc3m.es/siteannotation.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. WebTLab.
          <article-title>Site annotation ontology</article-title>
          . http://corelli.gast.it.uc3m.es/siteannotation/ ontology.owl.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>Jeffrey</given-names>
            <surname>Wong</surname>
          </string-name>
          and
          <string-name>
            <surname>Jason I. Hong.</surname>
          </string-name>
          <article-title>Making mashups with marmite: towards end-user programming for the web</article-title>
          .
          <source>In CHI '07: Proceedings of the SIGCHI conference on Human factors in computing systems</source>
          , pages
          <fpage>1435</fpage>
          -
          <lpage>1444</lpage>
          , New York, NY, USA,
          <year>2007</year>
          . ACM.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>