<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Hyperlocal Event Extraction of Future Events</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tobias Arrskog</string-name>
          <email>tobias.arrskog@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Exner</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hakan Jonsson</string-name>
          <email>hakan.jonsson@sonymobile.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Norlander</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pierre Nugues</string-name>
          <email>pierre.nugues@cs.lth.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Lund University</institution>
          ,
          <addr-line>Advanced Application Labs, Sony Mobile Communications</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>From metropolitan areas to tiny villages, there is a wide variety of organizers of cultural, business, entertainment, and social events. These organizers publish such information to an equally wide variety of sources. Every source of published events uses its own document structure and provides different sets of information. This raises significant customization issues. This paper explores the possibilities of extracting future events from a wide range of web sources, to determine if the document structure and content can be exploited for time-efficient hyperlocal event scraping. We report on two experimental knowledge-driven, pattern-based programs that scrape events from web pages using both their content and structure.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
        There has been considerable work on extracting events from text available from
the web; see [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for a collection of recent works. A variety of techniques have
been reported: [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] successfully used data-driven approaches for the extraction
of news events, while knowledge-driven approaches have been applied to extract
biomedical [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], historical [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], or financial events [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] among others.
      </p>
      <p>
        Much previous research focuses on using the body text of the document, while
some authors also use the document structure. For example, [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] apply semantic
role labelling to unstructured Wikipedia text while [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] use both the document
structure and body text to extract events from the same source.
      </p>
      <p>The focus of this paper is on extracting future events using the body text of
web pages as well as their DOM structure when the content has multiple levels of
structure. We naturally use the body text from the web page as it contains
essential information, e.g. time, date, and location instances. We also exploit the
DOM structure as a source of information. Although HTML embeds some sort
of structure, the actual structure is not homogeneous across websites. We report
on the problem of extracting event information from a variety of web pages and
we describe two systems we implemented and the results we obtained.</p>
    </sec>
    <sec id="sec-2">
      <title>Properties of Local Events</title>
      <p>The events we are interested in are those that typically appear in calendars and
listings, such as cultural, entertainment, educational, social, business
(exhibitions, conferences), and sport events that the general public
may have an interest in.</p>
      <p>The end goal of this project is to be able to serve users with information about
events that match their current interest and context, e.g. using location-based
search, by aggregating these events from hyperlocal sources.</p>
      <p>Event aggregators already exist, e.g. Eventful and Upcoming, that collect
and publish event information, but they tend to only gather information about
major events in cooperation with organizers or publishers. By contrast, we want
to extract existing information directly from the publisher.</p>
      <p>The main challenge is time-efficient scaling, since there is a great number of
hyperlocal organizers and sources, as well as variations in the formats and DOM
structure of the sources. We may also have to deal with missing,
ambiguous, or contradictory information. For example, locations can appear in
the title:</p>
      <p>Concert – Bruce Springsteen (This time in the new arena),
and contradict the location indicated elsewhere. Another example is a title:</p>
      <p>Outdoor dining now every Friday and Saturday
containing date information, which narrows or sometimes contradicts the dates
indicated elsewhere on the page.</p>
      <p>The domain we are interested in deals with future events. This is a very
wide area, where only little historically-annotated data is available. This makes a
statistical approach problematic, at least initially. Instead, we chose a
knowledge-driven, pattern-based approach, where we process both the structure of HTML
documents and their content. We analyze the content using knowledge of the
event domain, e.g. event keywords.</p>
      <p>In this paper, we report on the problem of extracting event information from
given web pages and we describe two systems we implemented and the results
we obtained.</p>
    </sec>
    <sec id="sec-3">
      <title>Applications and Requirements for Event Structures</title>
      <p>From the possible properties of an event, we chose to extract the title, date,
time, location, event reference (source), and publisher, which answer the when,
where, and what questions about an event. These are however the most basic
attributes, and for a useful application, further information could be extracted,
including topic, organizer, cost, and target audience.</p>
      <p>In this paper, we do not cover the semantic representation of
event data, but future research may need to address representing the above
attributes in existing event data models.</p>
      <sec id="sec-3-1">
        <title>System Architecture</title>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Designing a Simple Scraper</title>
      <p>For each site in the list, we created a unique script. These scripts contained a
hand-crafted set of rules to extract the correct information for that specific site.
To expand the list, additional hand-crafted scripts are required, which demands
a good deal of manual effort and leads to high costs
when scaling to many sources.</p>
      <p>In order to limit scaling costs, the scripts need to be simplistic. For this
reason, we required that the internal structure of the
information in the events be the same from one event to another, so that a
small set of rules can extract the information from all the events.</p>
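As a minimal sketch, such a hand-crafted per-site script could be a small set of extraction rules applied to each event block; the markup, class names, and field names below are hypothetical, not taken from any of the actual sites.

```python
import re

# Hypothetical hand-crafted rules for one site: each rule is a regex
# that captures one attribute from the site's fixed event markup.
SITE_RULES = {
    "title": r'<h3 class="event-title">(.*?)</h3>',
    "date": r'<span class="event-date">(.*?)</span>',
    "location": r'<span class="event-venue">(.*?)</span>',
}

def scrape_events(html: str) -> list[dict]:
    """Apply the site-specific rules to every event block on the page."""
    events = []
    # The simple scraper assumes every event on the page shares the
    # same internal structure, so one rule set covers all of them.
    for block in re.findall(r'<div class="event">(.*?)</div>', html, re.S):
        event = {}
        for field, pattern in SITE_RULES.items():
            match = re.search(pattern, block)
            event[field] = match.group(1) if match else None
        events.append(event)
    return events

html = ('<div class="event"><h3 class="event-title">Concert</h3>'
        '<span class="event-date">2013-05-04</span>'
        '<span class="event-venue">Mejeriet</span></div>')
print(scrape_events(html))
```

The rigid regexes are exactly what makes such scripts cheap to write per site but costly to maintain at scale.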
    </sec>
    <sec id="sec-5">
      <title>Designing a Generic Scraper</title>
      <p>We investigated if it would be possible to create a generic scraper which could
handle all websites without manual labour.</p>
      <p>The first step to generically scrape a website is to find all the pages that
contain events. This is currently done using domain knowledge, i.e. the system
is given only pages which are known to contain events. The possibility of finding
pages without manual labour is further discussed in Sect. 5. The system uses
six steps to scrape the events from a given web page. Figure 1 shows the system
architecture. We implemented the first three steps using the ad-hoc scripts of
Sect. 2.1.</p>
      <p>[Figure 1. System architecture: classify page; extract default values and domain knowledge; identify the event list; identify each specific event within the list; scrape; reevaluate selected attributes by looking at the entire event list; rank and select attributes for each event; annotate.]</p>
    </sec>
    <sec id="sec-6">
      <title>Attribute Annotation and Interpretation</title>
      <p>The system uses rules to annotate and interpret text. The benefit of a
rule-based system is that it can both parse the text and create structured data. As
previous work suggests, extracting the time and date of events can be solved
through rules. Although more problematic, the system is able to extract named
entities, for example named locations, as well. To do this, the system uses three major
rules:
1. Keyword detection preceding a named location, e.g. looking for location: or
arena:
2. Keyword detection succeeding a named location, for example a city
3. Structured keyword detection preceding a named location, e.g. looking for
location or arena when isolated in a separate structure. As an example:
location Boston, which corresponds to "&lt;b&gt;location&lt;/b&gt; Boston" using
HTML tags.</p>
      <p>When the rules above return a named location, we query it against a named
location database. Using these rules and a database lookup, we can minimize
the false positives.</p>
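The three rules plus the database lookup can be sketched as follows; the regular expressions and the contents of the location database are illustrative assumptions, not the system's actual patterns.

```python
import re

# Hypothetical named-location "database" for the lookup step.
KNOWN_LOCATIONS = {"Boston", "Lund", "Mejeriet"}

def annotate_locations(text: str) -> list[str]:
    """Apply the three keyword rules, then confirm candidates against
    the named-location database to minimize false positives."""
    candidates = []
    # Rule 1: keyword preceding a named location, e.g. "location: Boston".
    candidates += re.findall(r'(?:location|arena):\s*(\w+)', text, re.I)
    # Rule 2: keyword succeeding a named location, e.g. "Mejeriet in Lund".
    candidates += re.findall(r'(\w+)\s+in\s+\w+', text)
    # Rule 3: structured keyword, e.g. "<b>location</b> Boston".
    candidates += re.findall(r'<b>(?:location|arena)</b>\s*(\w+)', text, re.I)
    # Database lookup: keep only candidates that are known locations.
    return [c for c in candidates if c in KNOWN_LOCATIONS]

print(annotate_locations("<b>location</b> Boston, concert at Mejeriet in Lund"))
```

Without the final lookup, rule 2 in particular would over-generate, since any "X in Y" phrase produces a candidate.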
    </sec>
    <sec id="sec-7">
      <title>Attribute Ranking and Selection</title>
      <p>The system uses domain knowledge to choose what data to extract:
– The system extracts only one title and chooses the most visually
distinguished text it can, as implied by the DOM structure.
– Dates and times follow a hierarchy of complexity, where the system takes those
of highest complexity first. Some sites used a structure where event structures
were grouped by date. To avoid false positives with dates in these event
structures, the scraper chose dates between the event structures if less than
half of the event structures contained dates.
– The extraction of the location for the event was done in the following order:
If the event structure contained a location coordinate, choose it. Otherwise,
use a default location. If the event site had no default location, use the most
commonly referred-to city in the event structure.</p>
      <sec id="sec-7-1">
        <title>Evaluation</title>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Scoring</title>
      <p>We evaluated the performance of the simple and generic scrapers and
compared them with a scoring defined in Table 1.
At the start of the project, we gathered a training set composed of nine different
event sites found in the Lund and Malmö area, Sweden. With the help of the
training set, we could change the rules or add new ones and easily monitor their
overall effect. This concerned the rules of the annotator and scraper, as well as the
location lookup.</p>
    </sec>
    <sec id="sec-9">
      <title>Evaluation</title>
      <p>In order to evaluate the system, we gathered a test set of nine previously unseen
event web sites. The goal was to extract information about all (max. 30) events.
The tests were conducted in three parts.
1. In the first part, we used the generic scraper (Sect. 2.2);
2. In the second part, we built simple scrapers (Sect. 2.1) for each of the test
sites;
3. In the third part, we extracted the events manually.
The results from the first two parts were then compared against the third.</p>
      <p>The generic scraper and the simple scrapers were compared in how accurately
they extracted the title, date, time, and location of the event. The time of the
setup was also compared for both the generic and simple scrapers.</p>
      <p>We built a simple scraper for each site specifically to extract the text
containing the title, date, time, and location. The text strings containing the dates
and times were then sent to the same algorithm that the generic scraper uses to
parse the date and time. Once the text containing the location is extracted, we
use the same location lookup in all the scrapers.</p>
    </sec>
    <sec id="sec-10">
      <title>Bias Between the Training and Test Sets</title>
      <p>The sites in the training set were all composed of a list with events where all
the necessary information (title, date, time, location) could be found. In the
test set, most of the sites had a structure that did not have all the required
information: each event had a separate page, the event details page, with all the
information. The information on the event details page was not in the
typical compact structured form but rather consisted of more body text. Of the
nine sites in the test set, three sites (lund.cc, nsf, dsek) did not require an event
details page for the necessary information. However, the information on the sites nsf
and dsek was, in its structure, more comparable to body text. A concept to
handle this, concerning the extraction of the title, is presented in Sect. 4.1.</p>
      <p>F1 scores per site, for full and partial matches:
Site       | Full:  Title  Date   Time   Location Average | Partial: Title  Date   Time   Location Average
lu         |        0.0    0.967  0.767  0.433    0.542   |          0.4    0.967  0.933  0.633    0.733
mah        |        0.068  1.0    0.0    0.6      0.417   |          0.915  1.0    1.0    1.0      0.979
babel      |        0.0    0.818  0.0    1.0      0.830   |          1.0    0.909  0.818  1.0      0.932
lund.cc    |        1.0    0.667  1.0    0.652    0.714   |          1.0    0.967  1.0    0.652    0.905
mollan     |        0.0    0.857  1.0    1.0      0.75    |          0.0    0.857  1.0    1.0      0.714
nsf        |        1.0    1.0    1.0    0.0      0.673   |          1.0    1.0    1.0    0.286    0.822
malmo.com  |        1.0    1.0    0      0.691    0.543   |          1.0    1.0    0      0.963    0.741
burlov     |        0.889  0.75   0.333  0.2      0.369   |          1.0    0.875  0.333  0.2      0.602
dsek       |        0.0    0.2    0.444  0.833    0.588   |          1.0    0.2    1.0    0.833    0.758
Average F1 |        0.440  0.807  0.505  0.601    0.603   |          0.813  0.864  0.787  0.730    0.799</p>
      <sec id="sec-10-1">
        <title>Conclusion</title>
        <p>The setup for the generic scraper took on average 12 minutes, compared to
creating a simple scraper for each site, which took on average 39 minutes (Table 5).
The setup for the generic scraper is thus more than three times faster than creating
a simple scraper for each site. This can be compared to pure manual labor,
which took on average 34 minutes per site; both scrapers essentially have
a payback time of one pass.</p>
      </sec>
    </sec>
    <sec id="sec-11">
      <title>Title</title>
      <p>The generic scraper performs rather poorly on the test set while it shows better
results on the training set. This is due either to training overfit or to a significant
mismatch between the training and test sites. Sect. 3.4 analyzes the mistakes and
discusses this problem. When using the system on these pages without loading,
they do yield better results, as shown in Table 3. The rest of the failing test sites
failed because the system looked too much at the structure where it should have
analyzed the layout instead, i.e. it chose links when it should have chosen the
elements that were more visually prominent.</p>
    </sec>
    <sec id="sec-12">
      <title>Date</title>
      <p>The simple scraper is 5% better than the generic scraper on date identification,
on average, for both the full and partial matches. Examining the scores
for the full match more closely (Tables 2 and 4), the score for the generic scraper is the
same as or better than the score for the simple scraper for every site except burlov
and dsek. We even observe a complete failure for dsek. We investigated it and
discovered that dsek expressed the dates relative to the current date, e.g. today,
tomorrow. This had not yet been implemented, which made the generic scraper pick
another strategy for choosing dates; as a result, the correct dates were forfeited.</p>
    </sec>
    <sec id="sec-13">
      <title>Time</title>
      <p>The average scores for the time extraction by the generic and the simple
scrapers are rather similar. The system does find the correct times but
reports many false positives, which, according to the scoring set in Sect. 3.1, yields
only a partial match. The system tends to over-detect times. We programmed
it to prefer times coupled with dates over solitary times, but in the test set it
was rather common to have times and dates further apart. This makes
the system choose all times, where it should have chosen a subset. Another
pattern was also found: for some sites, the system returned both start and end
times separately, which shows that the system lacks rules to bind start and
end times together.</p>
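One such missing rule could treat two times joined by a dash as a single start/end pair rather than two solitary times; the regular expression below is an illustrative assumption, not the system's own:

```python
import re

def extract_time_ranges(text: str) -> list[tuple[str, str]]:
    """Bind start and end times written as a range, e.g. '19:00-21:00',
    returning them as one (start, end) pair instead of two solitary times."""
    pattern = r'(\d{1,2}:\d{2})\s*[-–]\s*(\d{1,2}:\d{2})'
    return re.findall(pattern, text)

print(extract_time_ranges("Doors 18:30, concert 19:00-21:00"))
```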
    </sec>
    <sec id="sec-14">
      <title>Location</title>
      <p>The difference between the simple and generic scrapers is negligible, and the problem
of location is less about selection and more about actually finding and understanding
the named locations (Tables 2 and 4). The system uses assumed knowledge to
fill in what is left out of the events, i.e. a known city, region, or location which it
can use as a fallback or to base the search around. Using this assumed knowledge
proved useful for babel, mollan, dsek, lu, and mah, and this
should hold true on all hyperlocal websites. Even if the system has some basic
knowledge about the web page, the location annotation and selection still have
problems with disambiguation. This disambiguation problem is partly rooted in
the fact that the named locations are within the domain knowledge of the site.
As an example, a university website might list lecture halls or classrooms
as the location of an event. These named locations could have the same name
as a pub in another city or a scientist, or simply be nonexistent in any named location
database.</p>
    </sec>
    <sec id="sec-15">
      <title>Final Words</title>
      <p>At the end of the test cycle, we concluded that a generic scraper is not
only possible, but in some cases even better than a simple one. The hardest
problem with scraping sites is not necessarily understanding the structure, even
if it is vague. The problem for a scraper is rather to understand what can only be
described as domain knowledge. Sites use a lot of assumed knowledge, which
can be hard for a machine to understand, or whose interpretation could be
completely wrong in the context. For example, a lecture hall can have the
same name as a pub in the same region, making it hard for a system to determine if the
location is correct or not. This might be attainable with better heuristics, e.g.
if the location lookup can be made with some hierarchical solution and domain
knowledge can be extracted from the sites prior to the extraction of events.</p>
      <sec id="sec-15-1">
        <title>Future Work</title>
      </sec>
    </sec>
    <sec id="sec-16">
      <title>Page Classification</title>
      <p>On the Internet, sites show significant variation, and most of them do not
contain entertainment events. Therefore, a first step in a generic system, the
dashed box "Classify" in Figure 1, would be to identify whether the input web page
contains events. If it does not, it makes no sense to scrape it, and doing so could
even lead to false positives. If web pages could be classified with reasonable
certainty, classification could also be used with a crawler to create an endless supply of
event pages to scrape.</p>
    </sec>
    <sec id="sec-17">
      <title>Exploring Repetitiveness</title>
      <p>To solve the dashed box "Identify the event list" shown in Figure 1, we
investigated the repetitiveness of the event list. By weighting structural
elements, e.g. P, STRONG, and H3, this yielded some interesting results on small sites.
This technique can potentially be further refined by calibrating the weights if the
page is annotated as described in Sect. 2.3.</p>
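The idea can be sketched as scoring candidate blocks by how often their weighted tag signature repeats on the page; the weights below are hypothetical, not those used in the experiment:

```python
from collections import Counter

# Hypothetical weights for structural elements that often delimit events.
TAG_WEIGHTS = {"p": 1, "strong": 2, "h3": 3}

def repetitiveness_score(tag_sequences: list[tuple]) -> dict:
    """Score each distinct tag signature by its repetition count times
    the summed weights of its structural elements."""
    counts = Counter(tag_sequences)
    return {
        seq: count * sum(TAG_WEIGHTS.get(t, 0) for t in seq)
        for seq, count in counts.items()
    }

# Three event blocks share the (h3, p) signature; one sidebar block does not.
page = [("h3", "p"), ("h3", "p"), ("h3", "p"), ("strong",)]
print(repetitiveness_score(page))
```

A repeated, heavily weighted signature then marks the likely event list.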
    </sec>
    <sec id="sec-18">
      <title>Rank and Select with Help of Layout</title>
      <p>
        While the system uses a very limited rank and selection based on an implied
layout for the title (preferring H3, H2, etc. over raw text), it would be interesting to have
the selection fully use layout. To attract attention and to create desire, the vital
information about an event is among the first things the reader is supposed to
notice and comprehend. Thus it is usually presented in a visually distinguishing
way. This can be achieved by coloring the text differently, making it larger, or
simply setting it in a different font or typeface. This layout is bundled within the HTML
document, possibly modified by the CSS; looking at these clues with some
heuristics thus allows finding the visually distinguishing sentences [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. As an example,
an event might use an H3 element for the title, bold for the location, or it might
have another background color for the date. If the entire system used layout
to aid the selection, we believe that the system would perform better and yield
fewer false positives.
      </p>
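The current title selection can be sketched as a prominence ranking over candidate elements; the prominence scores below are illustrative assumptions standing in for the layout heuristics discussed above:

```python
# Hypothetical prominence scores implied by the DOM: headings outrank
# bold text, which outranks links and raw text.
PROMINENCE = {"h1": 5, "h2": 4, "h3": 3, "b": 2, "a": 1, "text": 0}

def pick_title(candidates: list[tuple[str, str]]) -> str:
    """Choose the text of the most visually distinguished (tag, text) pair."""
    tag, text = max(candidates, key=lambda c: PROMINENCE.get(c[0], 0))
    return text

print(pick_title([("a", "Read more"),
                  ("h3", "Concert - Bruce Springsteen"),
                  ("text", "Friday 19:00")]))
```

A fuller version would also fold in CSS-derived cues such as font size and color, as reference [7] does for visual block detection.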
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Hogenboom</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frasincar</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaymak</surname>
          </string-name>
          , U., de Jong, F.:
          <article-title>An Overview of Event Extraction from Text</article-title>
          . In van Erp, M.,
          <string-name>
            <surname>van Hage</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hollink</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jameson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Troncy</surname>
          </string-name>
          , R., eds.: Workshop on Detection, Representation, and
          <article-title>Exploitation of Events in the Semantic Web (DeRiVE</article-title>
          <year>2011</year>
          ) at Tenth International Semantic Web Conference (ISWC
          <year>2011</year>
          ). Volume 779 of CEUR Workshop Proceedings, CEUR-WS.org (
          <year>2011</year>
          )
          <volume>48</volume>
          –
          <fpage>57</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          :
          <article-title>Extracting key entities and significant events from online daily news</article-title>
          . In Fyfe,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.Y.</given-names>
            ,
            <surname>Yin</surname>
          </string-name>
          , H., eds.
          <source>: Intelligent Data Engineering and Automated Learning - IDEAL 2008. Volume 5326 of Lecture Notes in Computer Science</source>
          . Springer Berlin / Heidelberg (
          <year>2008</year>
          )
          <volume>201</volume>
          –
          <fpage>209</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chun</surname>
          </string-name>
          , H.W.,
          <string-name>
            <surname>Hwang</surname>
            ,
            <given-names>Y.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rim</surname>
            ,
            <given-names>H.C.</given-names>
          </string-name>
          :
          <article-title>Unsupervised event extraction from biomedical literature using co-occurrence information and basic patterns</article-title>
          .
          <source>In: Proceedings of the First international joint conference on Natural Language Processing. IJCNLP'04</source>
          , Berlin, Heidelberg, Springer-Verlag (
          <year>2005</year>
          )
          <volume>777</volume>
          –
          <fpage>786</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Exner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nugues</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Using Semantic Role Labeling to Extract Events from Wikipedia</article-title>
          . In van Erp, M.,
          <string-name>
            <surname>van Hage</surname>
            ,
            <given-names>W.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hollink</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jameson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Troncy</surname>
          </string-name>
          , R., eds.: Workshop on Detection, Representation, and
          <article-title>Exploitation of Events in the Semantic Web (DeRiVE</article-title>
          <year>2011</year>
          ) at Tenth International Semantic Web Conference (ISWC
          <year>2011</year>
          ). Volume 779 of CEUR Workshop Proceedings, CEUR-WS.org (
          <year>2011</year>
          )
          <volume>38</volume>
          –
          <fpage>47</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Borsje</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogenboom</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frasincar</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Semi-automatic financial events discovery based on lexico-semantic patterns</article-title>
          .
          <source>Int. J. Web Eng. Technol</source>
          .
          <volume>6</volume>
          (
          <issue>2</issue>
          ) (
          <year>January 2010</year>
          )
          <volume>115</volume>
          –
          <fpage>140</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hienert</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luciano</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Extraction of historical events from wikipedia</article-title>
          .
          <source>In: Proceedings of the First International Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data, CEUR-WS.org</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cai</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          , Ma, W.Y.:
          <article-title>Extracting content structure for web pages based on visual representation. In: Proceedings of the 5th Asia-Paci c web conference on Web technologies and applications</article-title>
          .
          <source>APWeb'03</source>
          , Berlin, Heidelberg, Springer-Verlag (
          <year>2003</year>
          )
          <volume>406</volume>
          –
          <fpage>417</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>