<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Collaborative Editing and Linking of Astronomy Vocabularies Using Semantic Mediawiki</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stuart Chalmers</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Norman Gray</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iadh Ounis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alasdair Gray</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computing Science, University of Glasgow</institution>
          ,
          <addr-line>Glasgow</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Physics and Astronomy, University of Glasgow</institution>
          ,
          <addr-line>Glasgow</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Computer Science, Manchester University</institution>
          ,
          <addr-line>Manchester</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The International Virtual Observatory Alliance (IVOA) comprises 17 Virtual Observatory (VO) projects and facilitates the creation and coordination of, and collaboration on, standards promoting the use and reuse of astronomical data archives. The Semantics working group in the IVOA has repurposed five existing vocabularies (modelled using SKOS), capturing concepts within specific areas of astronomy expertise and applications. A major task, however, is to promote the uptake of these semantic representations within the astronomy community and, further, to let astronomers model their own custom vocabularies and in turn create links from them to these existing definitions. In this paper we show how Semantic Mediawiki (SMW) can be used to support expert interaction in the lifecycle of vocabulary creation, linking, and maintenance.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Astronomy as a discipline incorporates a broad range of topics and data analysis
across the wavelength spectrum, from gamma-rays to radio waves, and a wide
range of expertise from professional researchers to amateurs. Because of the
collaborative nature of astronomy working groups and projects, and a culture
where sharing data is the norm, there is a well-established need for consensus
definitions describing data (mostly image and object catalogue data). To this
end a number of standardised vocabularies have emerged, which are mostly, at
present, focused on the search for and retrieval of resources, primarily data and
journal articles.</p>
      <p>Thus, multiple independent controlled vocabularies have evolved to meet the
various terminological needs of these different sub-communities (Table 1). The
most widely known of these is the keyword list maintained jointly by the three
main astronomy journals A&amp;A, ApJ and MNRAS (these keywords are used to
tag journal articles, so most astronomers are familiar with this set),
and the largest is a thesaurus developed by the International Astronomical Union
(IAU); the IVOA has started work on an update, the IVOAT. Newer than
both are the AVM vocabulary, a recent effort intended for use when tagging
astronomy outreach images, and the UCD list, in increasingly wide use as a set
of standardised database column headings<sup>4</sup>. For further discussion see [1].
<fn><label>4</label><p>http://www.ivoa.net/Documents/latest/Vocabularies.html</p></fn></p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <table>
          <tbody>
            <tr><td>Journal Keywords</td><td>Journal publishers</td></tr>
            <tr><td>Astronomy Visualization Metadata (AVM)</td><td>various</td></tr>
            <tr><td>The IAU Thesaurus (IAUT)</td><td>IAU</td></tr>
            <tr><td>The IVOA Thesaurus (IVOAT)</td><td>IVOA</td></tr>
            <tr><td>Universal Content Descriptors (UCD)</td><td>IVOA</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>While the IVOA vocabularies have provided a basis for standardisation of
experimental terminology, a few problems remain:
<list list-type="bullet">
<list-item><p>There are no standardised tools or methodology for creating custom
experimental descriptions based on these vocabularies.</p></list-item>
<list-item><p>Users may be familiar with the specific IVOA vocabularies relating to their
subdiscipline, but not with others, so their descriptions may not describe
their data as fully as a searching colleague might require.</p></list-item>
<list-item><p>Searching of user-defined vocabularies and data is limited to terminology in
the IVOA vocabulary or vocabularies used to define them. For instance, a user
vocabulary described using the IVOAT has no relation to searches using
keywords from the IAUT.</p></list-item>
</list></p>
      <p>Recent work in the Explicator project<sup>5</sup> has laid the foundations for a solution
to these problems, by representing the main IVOA vocabularies in SKOS, and
exploiting SKOS relationships to help domain experts articulate cross-vocabulary
links [2].</p>
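      <p>As a minimal sketch of how mapping links can address the third problem listed above, the following Python fragment expands a search on one vocabulary's concept to resources tagged with equivalent concepts from other vocabularies. The concept identifiers, the mapping statements and the tagged resources are all invented for illustration; only the property names skos:exactMatch and skos:broadMatch come from the SKOS standard, and nothing here is taken from the Explicator implementation.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Hypothetical mapping statements between concepts in different vocabularies.
MAPPINGS = [
    ("ivoat:spiralGalaxy", "skos:exactMatch", "iaut:SpiralGalaxies"),
    ("avm:Galaxy.Spiral", "skos:exactMatch", "ivoat:spiralGalaxy"),
    ("ivoat:spiralGalaxy", "skos:broadMatch", "aakeys:Galaxies"),
]

# Hypothetical resources, each tagged with a concept from one vocabulary.
TAGGED = {
    "catalogue-A": "ivoat:spiralGalaxy",
    "image-B": "avm:Galaxy.Spiral",
    "article-C": "iaut:SpiralGalaxies",
}


def equivalents(concept, mappings):
    """Return the concept plus everything reachable via skos:exactMatch."""
    graph = defaultdict(set)
    for subj, pred, obj in mappings:
        if pred == "skos:exactMatch":
            # exactMatch is symmetric and transitive, so links can be
            # followed in both directions and chained.
            graph[subj].add(obj)
            graph[obj].add(subj)
    seen, stack = {concept}, [concept]
    while stack:
        for neighbour in graph[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


def search(concept, tagged, mappings):
    """Find resources tagged with the concept or any equivalent concept."""
    wanted = equivalents(concept, mappings)
    return sorted(name for name, tag in tagged.items() if tag in wanted)
```
]]></preformat>
      <p>With the mappings in place, a search for iaut:SpiralGalaxies finds all three resources; with an empty mapping list it finds only the article tagged from the IAUT itself, which is the isolation described in the third problem.</p>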
    </sec>
    <sec id="sec-2">
      <title>Current Vocabulary Building Tools</title>
      <p>The Explicator project has developed a number of tools for the creation and
use of SKOS astronomy vocabularies. The main entry point for searching and
exploring terminology is the Web Vocabulary Explorer<sup>6</sup>, built upon the Terrier
Information Retrieval Platform [3]. It provides an AJAX frontend for
searching and browsing the astronomy vocabularies: the user enters a simple search string
to find matching concepts. Fig. 1 (left) shows the search results for "star". The
use of Terrier is important for ranking results usefully: the
vocabularies contain a large number of labels with common strings, so a naive
search for "star" produces more than 600 concepts which have that string
somewhere in their label, with the key concept 'Star' appearing uselessly far down
the list. Using Terrier's ranking support, however, the appropriate concepts from
the three searched vocabularies appear at the beginning of this list. The explorer
allows users to expand results and view details of concepts, such as alternate
labels, available definitions and semantic relationships. Related concepts, both
within a vocabulary and across vocabularies, can be explored by following links
to broader, narrower, related, and equivalent concepts. Searches can be configured
by selecting sets of vocabularies and mappings. This service is also available
via XML-RPC, so that it can be embedded within other applications.
<fn><label>5</label><p>http://explicator.dcs.gla.ac.uk</p></fn>
<fn><label>6</label><p>http://explicator.dcs.gla.ac.uk/WebVocabularyExplorer</p></fn></p>
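      <p>The difference between a naive substring search and a ranked one can be illustrated with a toy example. The scoring function below is an assumption made purely for illustration (it rewards exact label matches, whole-word matches and short labels); it is not Terrier's weighting model, and the label list is invented.</p>
      <preformat><![CDATA[
```python
# A handful of invented concept labels; a real vocabulary has hundreds
# of labels containing the string "star".
LABELS = [
    "Binary star",
    "Star",
    "Star cluster",
    "Starburst galaxy",
    "Neutron star",
    "Galaxy",
]


def naive_search(query, labels):
    """Return every label containing the query, in vocabulary order."""
    q = query.lower()
    return [label for label in labels if q in label.lower()]


def ranked_search(query, labels):
    """Return matching labels, best match first (toy scoring only)."""
    q = query.lower()

    def score(label):
        text = label.lower()
        exact = 2.0 if text == q else 0.0               # whole label matches
        word = 1.0 if q in text.split() else 0.0        # query is a full word
        coverage = len(q) / len(text)                   # favour short labels
        return exact + word + coverage

    return sorted(naive_search(query, labels), key=score, reverse=True)
```
]]></preformat>
      <p>Here the naive search returns 'Star' wherever it happens to sit in the vocabulary, whereas the ranked search places it first and pushes partial-word matches such as 'Starburst galaxy' to the end.</p>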
      <p>To create links between the main vocabularies in Table 1 we have a Java
mapping application, whose GUI lets users declare mappings between
vocabularies that can then be integrated into the Web Vocabulary Explorer.
The five vocabularies listed here were pre-existing ones, though not published as
SKOS, and so were converted from their original formats as part of the process of
developing [4]. The tool also allows the inclusion of automatically created RDF
representations of databases, created using the D2RQ database-to-RDF mapping
tool<sup>7</sup>. The other important source of ontology information within the VO is the
IVOA's resource registry<sup>8</sup>, which curates resource metadata using a standardised
set of XML Schemas; we have also converted these to RDF Schemas using XSLT
transformations.</p>
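      <p>To give a feel for what converting a pre-existing vocabulary to SKOS involves, here is a minimal sketch that turns a list of (label, broader label) pairs into SKOS concepts serialised as Turtle. The input layout and the example.org namespace are assumptions; the original formats of the five vocabularies, and the conversion code actually used, are not described in this paper.</p>
      <preformat><![CDATA[
```python
import re

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
BASE = "http://example.org/vocab/demo#"  # hypothetical namespace


def slug(label):
    """Turn a human-readable label into a URI-safe local name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", label.strip()).strip("_")


def to_skos_turtle(rows):
    """Serialise (label, broader_label_or_None) pairs as SKOS Turtle.

    Labels are assumed to contain no double quotes or backslashes;
    a real converter would have to escape them.
    """
    lines = [SKOS_PREFIX, ""]
    for label, broader in rows:
        lines.append(f"<{BASE}{slug(label)}> a skos:Concept ;")
        statement = f'    skos:prefLabel "{label}"@en'
        if broader:
            statement += f" ;\n    skos:broader <{BASE}{slug(broader)}>"
        lines.append(statement + " .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_skos_turtle([("Galaxy", None), ("Spiral galaxy", "Galaxy")]))
```
]]></preformat>
      <p>Each row becomes one skos:Concept with a preferred label, and the optional second column becomes a skos:broader link, which is the hierarchy that the Web Vocabulary Explorer lets users follow.</p>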
      <p>Part of the point of the tool's search functionality is to help users find relevant
concepts in multiple vocabularies, and to support them in articulating
inter-vocabulary mappings. However, we do not aim to do any automatic vocabulary
alignment.</p>
    </sec>
    <sec id="sec-3">
      <title>Semantic Mediawiki in the Vocabulary Lifecycle</title>
      <p>While the astronomy community is in general technically adept, the immediate
payoff from adopting the tools described in section 2 and converting to SKOS
representations is not obvious enough to users to make this an
attractive option (this is a general problem, also discussed in [5]). What is needed
is a cohesive, familiar and easily understandable interface that integrates these
tools in a way that allows the creation of SKOS-based experiment descriptions
and vocabularies (based on and utilising current IVOA standard vocabularies)
with minimal expenditure on learning the underlying semantic representations.
<fn><label>7</label><p>http://www4.wiwiss.fu-berlin.de/bizer/d2rq/</p></fn>
<fn><label>8</label><p>http://rofr.ivoa.net</p></fn></p>
      <p>[Fig. 2: the vocabulary lifecycle, with numbered steps (1) to (7) referred to in the text. Labels recoverable from the figure: D2R/XSLT; UML models; RDBMS schemas; Python parsing; maintenance; Jena parser; tools; Semantic MediaWiki; SKOS vocabs &amp; mappings [master copy]; Astronomers; lookup (JSON?); Vocabulary Explorer; Lookup service.]</p>
      <p>To this end we have proposed a coherent vocabulary 'lifecycle' methodology
(creation, collaborative editing, linking and searching/use); see Fig. 2. This
uses SMW as a collaborative vocabulary building tool to create and edit
vocabularies (1), link (1) these to existing IVOA vocabularies (2) and have them
automatically exported to (3) and imported from (5) their corresponding SKOS
representations (4) for use in the Web Vocabulary Explorer.
To link SMW to our existing tools (6), we have developed a general set of
Python scripts (7), using pywikipediabot<sup>9</sup> and the rdflib<sup>10</sup> library to automate
the parsing of our SKOS vocabularies and their upload as wiki pages<sup>11</sup>. The
SMW pages are based on a simple semantic form/template structure, parsed
from the main SKOS vocabularies (4) and uploaded using the Python bots.
Similarly, we use a Jena-based parser to parse the SMW OWL/RDF export (3)
for a particular vocabulary and create the corresponding SKOS version for
re-inclusion in the Web Vocabulary Explorer search.
<fn><label>9</label><p>http://meta.wikimedia.org/wiki/Pywikipedia</p></fn>
<fn><label>10</label><p>http://www.rdflib.net/</p></fn>
<fn><label>11</label><p>We currently host this testbed at http://vocabularies.referata.com</p></fn></p>
      <p>This linking of the five main IVOA vocabularies into SMW pages means that
we now have a base set of terms for users to begin using in their own
experimental vocabularies. To help users find related terminology (e.g. for broader,
narrower, or related matches in their SKOS terms) we use simple inline queries
embedded in the main vocabulary term template to show (on each term's page)
the possible related terminology. Fig. 1 (right) shows the inline query used in
the main template of the vocabulary wiki pages and an example, the AOIM
term 'Galaxy'. This shows the main definition (scopeNote, prefLabel, altLabel,
broader, narrower and related) and a table of the possible related terms
(including TheGalaxy in the AAKeys vocabulary and src.class.starGalaxy from the
UCD vocabulary) that may be linked to by the user as cross-vocabulary related
terms.</p>
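      <p>The template-based page structure and the inline query described in this section can be sketched roughly as follows. This is an illustration under stated assumptions, not the scripts used for the testbed: the template name VocabTerm, the category Concept and the use of SKOS field names as wiki property names are hypothetical. The only syntax relied on is MediaWiki's template call and SMW's #ask inline query with the ~ pattern comparator.</p>
      <preformat><![CDATA[
```python
# Fields shown on each term page, in display order.
FIELDS = ("prefLabel", "altLabel", "scopeNote", "broader", "narrower", "related")


def concept_to_wikitext(concept, template="VocabTerm"):
    """Render a concept (dict of field -> string or list) as a template call.

    The template name is hypothetical; the wiki template would be
    responsible for turning each parameter into a semantic annotation.
    """
    lines = ["{{" + template]
    for field in FIELDS:
        values = concept.get(field, [])
        if isinstance(values, str):
            values = [values]
        if values:
            lines.append("|" + field + "=" + ", ".join(values))
    lines.append("}}")
    return "\n".join(lines)


def related_terms_query(label, category="Concept"):
    """Build an SMW inline query for pages whose prefLabel contains label.

    Category and property names are assumptions for this sketch.
    """
    return (
        "{{#ask: [[Category:" + category + "]] [[prefLabel::~*" + label + "*]]"
        " |?prefLabel |?scopeNote |format=table}}"
    )


if __name__ == "__main__":
    galaxy = {"prefLabel": "Galaxy", "altLabel": ["Galaxies"]}
    print(concept_to_wikitext(galaxy))
    print(related_terms_query("Galaxy"))
```
]]></preformat>
      <p>A bot would upload the first string as the content of the term's page, while the second would sit inside the template itself so that every term page lists candidate related terms from all loaded vocabularies.</p>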
    </sec>
    <sec id="sec-4">
      <title>Related and future work</title>
      <p>There are other vocabulary development systems in existence, including the
NeOn project's ontology editor<sup>12</sup>, and its Cicero project, which is also based
on SMW, and which supports an elaborate argumentation structure for
collaborative ontology development (NeOn deliverable 2.3.1). On a similar theme is
LexWiki<sup>13</sup>, which is a platform for developing a biomedical vocabulary. The
problem we are addressing, however, is not that of collaboratively creating a
large ontology from scratch, but supporting the collaborative inter-relation of
multiple existing vocabularies from various sources, with a community which
is made more rather than less comfortable by having some of the underlying
technology visible, and repurposable from user-written applications.</p>
      <p>At present we are working on a MediaWiki extension that will allow us to use
the XML-RPC search from the Web Vocabulary Explorer to find related terms.
This will use the Terrier search described above to provide more accurately ranked
searches for related terms than is possible with the existing inline queries.</p>
      <p>A key advantage, for us, of using a wiki-based solution is that it provides a
good match to the expectations of the domain experts: they feel comfortable
and in control when using it. Both the wiki and its embedded functionality must
therefore evolve in tune with the user base, and an important strand of our
future work on this project is to evaluate the provided functionality in use.
<fn><label>12</label><p>http://www.neon-project.org/</p></fn>
<fn><label>13</label><p>http://informatics.mayo.edu/vkcdemo/lexwiki1/</p></fn></p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Vocabularies in the VO</article-title>
          . In Bohlender, D., et al., eds.
          <source>: Proc. Astronomical Data Analysis and Software Systems Conference (ADASS XVIII)</source>
          . Volume
          <volume>411</volume>
          ., Astronomical Society of the Pacific (
          <year>2009</year>
          )
          <fpage>179</fpage>
          –
          <lpage>182</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>A.J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>C.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Finding the right term: Retrieving and exploring semantic concepts in astronomical vocabularies</article-title>
          .
          <source>Information Processing and Management</source>
          (
          <year>2009</year>
          ). In press.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ounis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Amati</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plachouras</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lioma</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Terrier: A high performance and scalable information retrieval platform</article-title>
          .
          In:
          <source>Proceedings of ACM SIGIR'06 Workshop on Open Source Information Retrieval (OSIR 2006)</source>
          , Seattle (Washington, USA),
          <publisher-name>ACM</publisher-name>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>A.J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hessman</surname>
            ,
            <given-names>F.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Preite Martinez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Vocabularies in the virtual observatory</article-title>
          .
          <source>IVOA Recommendation</source>
          (
          <year>2009</year>
          ) Available at: http://www.ivoa.net/Documents/latest/Vocabularies.html.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Linde</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Andrews</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>SKUA - retrofitting semantics</article-title>
          . In Auer, S., et al., eds.
          <source>: Proc. 5th Workshop on Scripting and Development for the Semantic Web at ESWC</source>
          <year>2009</year>
          , Heraklion, Greece. Volume
          <volume>449</volume>
          <source>of CEUR Workshop Proceedings ISSN 1613-0073</source>
          . (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>