<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Ontology-Based Query Expansion Widget for Information Retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jouni Tuominen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tomi Kauppinen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kim Viljanen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eero Hyvonen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Semantic Computing Research Group (SeCo) Helsinki University of Technology (TKK) and University of Helsinki</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we present an ontology-based query expansion widget which utilizes the ontologies published in the ONKI Ontology Service. The widget can be integrated into a web page, e.g. a search system of a museum catalogue, enhancing the page by providing query expansion functionality. We have tested the system with general, domain-specific and spatio-temporal ontologies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In information retrieval systems the relevancy of search results depends on the
user's ability to represent her information needs in a query [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. If the vocabularies
used by the user and the system differ, or if the shared
vocabulary is used at different levels of specificity, the search results are usually poor.
Query expansion has been proposed to solve these issues and to improve
information retrieval by expanding the query with terms related to the original query
terms. Query expansion can be based on a corpus, e.g. by analyzing co-occurrences of
terms, or on knowledge models, such as thesauri [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] or ontologies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Methods
based on knowledge models are especially useful in cases of short, incomplete
query expressions with few terms found in the search index [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>
        We have implemented a web widget providing query expansion functionality
to web-based systems as an easily integrable service with no need to change
the underlying system. The widget uses ontologies to expand the query terms
with semantically related concepts. The widget extends the previously
developed ONKI Selector widget, which is used for selecting concepts especially for
annotation purposes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Thanks to the autocompletion search feature of the widget, the user does not
have to be familiar with the ontologies used in content annotations, as the
system suggests matching concepts while the user is typing the query string. Also,
to help the user disambiguate concepts, the ONKI Ontology Browsers [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] can
be used to get a better understanding of the semantics of the concepts, e.g. by
providing a concept hierarchy visualization.
      </p>
      <p>
        The query expansion widget supports both Semantic Web systems and legacy systems
(by legacy systems we mean systems that do not use URIs as identifiers),
i.e. either the concept URIs or the concept labels can be used in queries. In
legacy systems, cross-language search can be performed if the ontology used
contains concept labels in several languages. In addition to the widget, the query
expansion service can also be utilized via JavaScript and Web Service APIs. The
query expansion widget and the APIs are available for public use as part of
the ONKI Ontology Service<sup>2</sup> [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The JavaScript code needed for integrating
the widget into a search system can be generated by using the ONKI Widget
Generator<sup>3</sup>.
      </p>
      <p>The contribution of this paper is to present an approach for performing query
expansion in systems cost-effectively, not to evaluate how the chosen query
expansion methods improve information retrieval in the systems.</p>
    </sec>
    <sec id="sec-2">
      <title>Ontologies used for Query Expansion</title>
      <p>The ONKI query expansion widget can be used with any ontology published in
the ONKI Ontology Service. The service contains some 60 ontologies at the time
of writing. Users are encouraged to submit their own ontologies to be published
in the service by using the Your ONKI Service<sup>4</sup>. In the following, we describe
how we have used different types of ontologies for query expansion.</p>
      <p>2.1 Query Expansion with General and Domain-specific Ontologies</p>
      <p>
        For expanding general and domain-specific concepts in queries we have used the
Finnish Collaborative Holistic Ontology KOKO<sup>5</sup>, which consists of the Finnish
General Upper Ontology YSO [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and several domain-specific ontologies
extending it. To improve poor search results caused by using vocabularies at different
levels of specificity in queries and in the search index, we have used the transitive
is-a relation (rdfs:subClassOf<sup>6</sup>) to expand the query concepts with their
subclasses. For example, when the query concept publications is selected, the query
is expanded with the concepts magazines, books, reports and so on.
      </p>
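      <p>The transitive subclass expansion described above can be sketched as a breadth-first traversal. This is a minimal illustration and not the ONKI implementation: the hierarchy in SUBCLASSES is a hypothetical toy dictionary (the concept textbooks is invented to show transitivity), and the function name is an assumption.</p>
      <preformat>
```python
from collections import deque

# Hypothetical toy hierarchy: maps a concept to its direct subclasses
# (the inverse direction of rdfs:subClassOf). Not taken from YSO or KOKO.
SUBCLASSES = {
    "publications": ["magazines", "books", "reports"],
    "books": ["textbooks"],
}


def expand_with_subclasses(concept, subclasses):
    """Return the concept followed by all of its transitive subclasses."""
    expanded = [concept]
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child in subclasses.get(current, []):
            if child not in seen:  # guards against cycles and duplicates
                seen.add(child)
                expanded.append(child)
                queue.append(child)
    return expanded


print(expand_with_subclasses("publications", SUBCLASSES))
```
</preformat>
      <p>The original concept is kept first in the result so that a caller can still distinguish it from the concepts added by the expansion.</p>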
      <p>
        Using other relations in addition to or instead of the is-a relation in query
expansion might be beneficial. When considering general associative relations, caution
should be exercised, as their use in query expansion can lead to uncontrolled
expansion of result sets, and thus to a potential loss in precision [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. In the case of
a legacy system (not handling URIs, using labels instead), the use of alternative
labels of concepts (synonyms) may improve the search. The relations used in the
query expansion of an ontology can be configured when publishing the ontology
in the ONKI Ontology Service.
      </p>
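      <p>For a legacy system that matches on labels, the terms sent to the search could be gathered roughly as follows. This is only a sketch under stated assumptions: the record layout, the example URI and all labels are hypothetical, and the function is not part of the ONKI APIs.</p>
      <preformat>
```python
# Hypothetical concept record; the URI and the labels are illustrative only.
CONCEPTS = {
    "http://example.org/onto/books": {
        "prefLabel": {"fi": "kirjat", "en": "books"},
        "altLabel": {"fi": ["teokset"], "en": ["volumes"]},
    },
}


def labels_for_legacy_query(uri, concepts, languages):
    """Collect preferred and alternative labels of a concept, per language."""
    record = concepts[uri]
    labels = []
    for lang in languages:
        pref = record["prefLabel"].get(lang)
        if pref is not None:
            labels.append(pref)
        # Alternative labels (synonyms) follow the preferred label.
        labels.extend(record["altLabel"].get(lang, []))
    return labels


print(labels_for_legacy_query("http://example.org/onto/books", CONCEPTS, ["fi", "en"]))
```
</preformat>
      <p>Passing several language codes gives a simple form of cross-language search, since the labels of all requested languages end up in the same query.</p>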
      <sec id="sec-2-1">
        <title>2 http://www.yso.fi/</title>
      </sec>
      <sec id="sec-2-2">
        <title>3 http://www.yso.fi/onkiselector/</title>
      </sec>
      <sec id="sec-2-3">
        <title>4 http://www.yso.fi/upload/</title>
      </sec>
      <sec id="sec-2-4">
        <title>5 http://www.seco.tkk.fi/ontologies/koko/</title>
      </sec>
      <sec id="sec-2-5">
        <title>6 Defined in the RDFS Recommendation, http://www.w3.org/TR/rdf-schema/</title>
        <p>2.2 Query Expansion with the Spatio-temporal Ontology SAPO</p>
        <p>
          A spatial query can explicitly contain spatial terms (e.g. Helsinki) and spatial
relations (e.g. near), but implicitly it can include even more spatial terms that
could be used in query expansion [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. For example, in the query "museums near
Helsinki" not only Helsinki is a relevant spatial term, but also its neighboring
municipalities. Spatial terms, i.e. geographical places, do not exist just in
space but also in time [
          <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
          ]. This is especially true for museum collections, where
objects have references to places from different times. This sets a requirement to
also utilize relations between historical places and more contemporary places in
query expansion. To provide these mappings we used the spatio-temporal ontology
SAPO (The Finnish Spatio-temporal Ontology) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          In SAPO, regional overlap mappings are expressed as depicted in Figure 1,
where example Turtle RDF<sup>7</sup> statements<sup>8</sup> express that the region of the latest
temporal part of the place sapo:Joensuu (i.e. the one valid from the beginning of
year 2009) overlaps the region of the temporal part of sapo:Eno of years 1871–2008.
The temporal part of a place simply means the place during a certain
time period, such that different temporal parts might have different extensions
(i.e. borders) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <preformat>sapo:Joensuu(2009-)
    sapo:begin "2009-01-01" ;
    sapo:overlaps
        sapo:Eno(1871-2008) ,
        sapo:Pyhaselka(1925-2008) ,
        sapo:Joensuu(2005-2008) .</preformat>
        <p>Fig. 2. A place is a union of its
temporal parts. Moreover, places may have
overlapped other places at some time.</p>
        <p>For example, the place sapo:Joensuu is a union of four temporal parts, defined
in the example depicted in Figure 2. However, annotations of items likely utilize
places rather than their temporal parts. For this reason the model uses the property
sapo:overlapsAtSomeTime to make explicit that e.g. the place sapo:Joensuu has, at
some point in history, overlapped five different places (sapo:Eno
and four others). In other words, at least one temporal part of sapo:Joensuu
has overlapped at least one temporal part of sapo:Eno. We have used this more
generic property sapo:overlapsAtSomeTime between places for query expansion.</p>
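        <p>One way to picture how the place-level property relates to the overlaps between temporal parts is the following sketch. It is not the SAPO implementation. It uses only the statements shown in Figure 1, so it yields fewer overlapping places than the full ontology; the mapping from temporal parts to places is inferred from the identifiers, and treating the relation as symmetric and skipping overlaps between two parts of the same place are assumptions made for the illustration.</p>
        <preformat>
```python
# Temporal part -> place it belongs to (assumed from the identifiers in Figure 1).
PART_OF_PLACE = {
    "sapo:Joensuu(2009-)": "sapo:Joensuu",
    "sapo:Joensuu(2005-2008)": "sapo:Joensuu",
    "sapo:Eno(1871-2008)": "sapo:Eno",
    "sapo:Pyhaselka(1925-2008)": "sapo:Pyhaselka",
}

# sapo:overlaps statements between temporal parts, as in Figure 1.
PART_OVERLAPS = [
    ("sapo:Joensuu(2009-)", "sapo:Eno(1871-2008)"),
    ("sapo:Joensuu(2009-)", "sapo:Pyhaselka(1925-2008)"),
    ("sapo:Joensuu(2009-)", "sapo:Joensuu(2005-2008)"),
]


def overlaps_at_some_time(part_overlaps, part_of_place):
    """Lift overlaps between temporal parts to overlaps between places."""
    result = {}
    for part_a, part_b in part_overlaps:
        place_a = part_of_place[part_a]
        place_b = part_of_place[part_b]
        if place_a == place_b:
            continue  # assumption: a place is not listed as overlapping itself
        result.setdefault(place_a, set()).add(place_b)
        result.setdefault(place_b, set()).add(place_a)  # assumption: symmetric
    return result


places = overlaps_at_some_time(PART_OVERLAPS, PART_OF_PLACE)
print(sorted(places["sapo:Joensuu"]))
```
</preformat>
        <p>Expanding the place sapo:Joensuu in a query then amounts to adding the places found under its key.</p>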
      </sec>
      <sec id="sec-2-6">
        <title>7 http://www.dajobe.org/2004/01/turtle/</title>
        <p>8 The example uses the following prefix - sapo: http://www.yso.fi/onto/sapo/</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>A Use Case of the Query Expansion Widget</title>
      <p>We have created a demonstration search interface<sup>9</sup> consisting of the original
Kantapuu.fi search form<sup>10</sup> and integrated ONKI widgets for query expansion.
Kantapuu.fi is a web user interface for browsing and searching the collections of
Finnish museums of forestry, using a simple algorithm that matches free-text query
terms against the item index terms. The ontologies used in the query expansion are
the same ones as used in the annotation of the items<sup>11</sup>, namely The Finnish General
Upper Ontology YSO, Ontology for Museum Domain MAO<sup>12</sup> and Agriforest
Ontology AFO<sup>13</sup>. For expanding geographical places the Finnish Spatio-temporal
Ontology SAPO is used.</p>
      <p>When a desired query concept is selected from the results of the
autocompletion search of the widget or by using the ONKI Ontology Browser, the concept is
expanded. The resulting query expression is the disjunction of the original query
concept and the concepts expanding it, formed using the Boolean operation OR.
The query expression is placed into a hidden input field, which is sent to the
original Kantapuu.fi search page when the HTML form is submitted.</p>
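      <p>Forming the disjunctive query expression can be sketched as below. The function name is hypothetical, and the exact query syntax accepted by the target search system is an assumption; here the terms are simply joined with the keyword OR.</p>
      <preformat>
```python
def build_or_query(original, expansions):
    """Join the original concept and its expansions into one OR expression."""
    terms = [original]
    for term in expansions:
        if term not in terms:  # keep each term once, preserving order
            terms.append(term)
    return " OR ".join(terms)


# The resulting string is what would be placed into the hidden input field.
print(build_or_query("publications", ["magazines", "books", "reports"]))
```
</preformat>
      <p>For the example concepts this gives the expression publications OR magazines OR books OR reports.</p>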
      <p>An example query is depicted in Figure 3, where the user is interested in old
publications from the place Joensuu. The user has used the autocompletion feature of the
widget to enter the query term "publicat" into the keywords field; the term has been
autocompleted to the concept publications, which has been further expanded to
its subclasses (their Finnish labels). Similarly, the place Joensuu has been added
to the field place of usage and expanded with the places it overlaps.</p>
      <p>The result set of the search contains four items, of which two are magazines
used in the place Eno and the other two are cabinets for books used in the place Joensuu.
Without query expansion the result set would have been empty, as the
place Eno and the concept books were not in the original query.</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>When implementing the demonstration search interface for the Kantapuu.fi
system with ONKI widgets we faced some challenges. If a query concept has many
subconcepts, the expanded query string may become inconveniently long, as
the concept URIs/labels of the subconcepts are added to the query. This may
cause problems because the HTTP server, database system or other
software components in use may limit the length of the query string. With lengthy
queries the system may not function properly or its response times
may increase.</p>
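      <p>One possible mitigation, which is not proposed in this paper and is shown only as an illustration, is to split the expanded terms into several shorter queries whose results the caller then combines. The function name and the length limit below are hypothetical.</p>
      <preformat>
```python
def split_into_queries(terms, max_length, separator=" OR "):
    """Greedily pack terms into OR queries of at most max_length characters.

    A single term longer than max_length still gets a query of its own.
    """
    queries = []
    current = []
    current_length = 0
    for term in terms:
        added = len(term) if not current else len(separator) + len(term)
        if current and current_length + added > max_length:
            queries.append(separator.join(current))
            current = [term]
            current_length = len(term)
        else:
            current.append(term)
            current_length += added
    if current:
        queries.append(separator.join(current))
    return queries


print(split_into_queries(["publications", "magazines", "books", "reports"], 30))
```
</preformat>
      <p>The trade-off is that several requests are sent instead of one, so this only helps when the length limit, rather than the total response time, is the bottleneck.</p>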
      <sec id="sec-4-1">
        <title>9 http://www.yso.fi/kantapuu-qe/</title>
        <p>10 http://www.kantapuu.fi/, follow the navigation link "Kuvahaku".
11 To be precise, the ontologies are based on thesauri that have been used in annotation
of the items.
12 http://www.seco.tkk.fi/ontologies/mao/
13 http://www.seco.tkk.fi/ontologies/afo/</p>
        <p>Future work includes user testing to find out whether users consider the query
expansion of the concepts and places useful. Also, a systematic evaluation of the
search systems used would be essential to find out whether the query expansion
improves information retrieval, and specifically which semantic relations
improve the results the most. The user interface of the query expansion widget
needs further development, e.g. the user should be able to select/unselect the
suggested query expansion concepts.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>We thank Ville Komulainen for his work on the original ONKI server and Leena
Paaskoski and Leila Issakainen for cooperation on integrating the ONKI query
expansion widgets into the Kantapuu.fi system. This work has been partially
funded by Lusto The Finnish Forest Museum<sup>14</sup> and partially by the IST funded
EU project SMARTMUSEUM<sup>15</sup> (FP7-216923). The work is a part of the
National Semantic Web Ontology project in Finland<sup>16</sup> (FinnONTO) and its
follow-up project Semantic Web 2.0<sup>17</sup> (FinnONTO 2.0, 2008-2010), funded mainly by
the National Technology and Innovation Agency (Tekes) and a consortium of 38
private, public and non-governmental organizations.</p>
      <p>14 http://www.lusto.fi
15 http://smartmuseum.eu/
16 http://www.seco.tkk.fi/projects/finnonto/
17 http://www.seco.tkk.fi/projects/sw20/</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Voorhees</surname>
            ,
            <given-names>E.M.:</given-names>
          </string-name>
          <article-title>Query expansion using lexical-semantic relations</article-title>
          .
          <source>In: Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval</source>
          , Dublin, Ireland (July 3-6
          <year>1994</year>
          )
          <fpage>61</fpage>
          –
          <lpage>69</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Y.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vandendorpe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Evens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Relational thesauri in information retrieval</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          <volume>36</volume>
          (
          <issue>1</issue>
          ) (
          <year>1985</year>
          )
          <fpage>15</fpage>
          –
          <lpage>27</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Viljanen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuominen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Hyvonen, E.:
          <article-title>Publishing and using ontologies as mashup services</article-title>
          .
          <source>In: Proceedings of the 4th Workshop on Scripting for the Semantic Web (SFSW</source>
          <year>2008</year>
          ),
          <source>5th European Semantic Web Conference</source>
          <year>2008</year>
          (
          <article-title>ESWC 2008), Tenerife, Spain (June 1-5</article-title>
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Viljanen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuominen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Hyvonen, E.:
          <article-title>Ontology libraries for production use: The Finnish ontology library service ONKI</article-title>
          .
          <source>In: Proceedings of the 6th European Semantic Web Conference (ESWC</source>
          <year>2009</year>
          ). (May 31 - June 4
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Hyvonen, E.,
          <string-name>
            <surname>Viljanen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuominen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Seppala,
          <string-name>
            <surname>K.</surname>
          </string-name>
          :
          <article-title>Building a national semantic web ontology and ontology service infrastructure|the FinnONTO approach</article-title>
          .
          <source>In: Proceedings of the 5th European Semantic Web Conference (ESWC</source>
          <year>2008</year>
          ).
          <article-title>(June 1-5</article-title>
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Tudhope</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alani</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Augmenting thesaurus relationships: Possibilities for retrieval</article-title>
          .
          <source>Journal of Digital Information</source>
          <volume>1</volume>
          (
          <issue>8</issue>
          ) (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hollink</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schreiber</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wielinga</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Patterns of semantic relations to improve image content search</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>5</volume>
          (
          <issue>3</issue>
          ) (
          <year>2007</year>
          )
          <fpage>195</fpage>
          –
          <lpage>203</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>C.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abdelmoty</surname>
            ,
            <given-names>A.I.</given-names>
          </string-name>
          :
          <article-title>Ontology-based spatial query expansion in information retrieval</article-title>
          .
          <source>In: Lecture Notes in Computer Science</source>
          , Volume
          <volume>3761</volume>
          , On the Move to Meaningful Internet Systems: ODBASE 2005. (
          <year>2005</year>
          )
          <fpage>1466</fpage>
          –
          <lpage>1482</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kauppinen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , Hyvonen, E.:
          <article-title>Modeling and reasoning about changes in ontology time series</article-title>
          . In Kishore, R.,
          <string-name>
            <surname>Ramesh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sharman</surname>
          </string-name>
          , R., eds.: Ontologies:
          <article-title>A Handbook of Principles, Concepts and Applications in Information Systems</article-title>
          .
          <source>Integrated Series in Information Systems</source>
          , New York, NY, Springer-Verlag, New York (NY) (
          <year>January</year>
          15
          <year>2007</year>
          )
          <volume>319</volume>
          {
          <fpage>338</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jones</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Abdelmoty</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fu</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Maintaining ontologies for geographical information retrieval on the web</article-title>
          . Volume
          <volume>2888</volume>
          , Sicily, Italy, Springer Verlag (November
          <year>2003</year>
          )
          <fpage>934</fpage>
          –
          <lpage>951</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kauppinen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , Vaatainen, J., Hyvonen, E.:
          <article-title>Creating and using geospatial ontology time series in a semantic cultural heritage portal</article-title>
          . In: S. Bechhofer et al.(Eds.):
          <source>Proceedings of the 5th European Semantic Web Conference 2008 ESWC</source>
          <year>2008</year>
          , LNCS 5021,
          <string-name>
            <surname>Tenerife</surname>
            ,
            <given-names>Spain.</given-names>
          </string-name>
          (June 1-5
          <year>2008</year>
          )
          <volume>110</volume>
          {
          <fpage>123</fpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>