<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Neologism: Easy Vocabulary Publishing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Cosmin Basca</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stephane Corlosquet</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Richard Cyganiak</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergio Fernandez</string-name>
          <email>sergio.fernandez@fundacionctic.org</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Schandl</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Digital Enterprise Research Institute, National University of Ireland, Galway</institution>
          ,
          <addr-line>Galway</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Fundacion CTIC</institution>
          ,
          <addr-line>Gijon, Asturias</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2004</year>
      </pub-date>
      <abstract>
        <p>Creating, documenting, publishing and maintaining an RDF Schema vocabulary is a complex, time-consuming task. This makes vocabulary maintainers reluctant to evolve their creations quickly in response to user feedback; it prevents the use of RDF for casual, ad-hoc data publication about niche topics; it leads to poorly documented vocabularies; and it contributes to poor compliance of vocabularies with best-practice recommendations. Neologism is a web-based vocabulary editor and publishing system that dramatically reduces the time required to create, publish and modify vocabularies. By removing much of the pain from this process, Neologism will contribute to a generally more interesting, relevant and standards-compliant Semantic Web.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Anyone who wants to publish information as RDF on the Semantic Web first
faces the choice of which RDF Schema vocabulary or OWL ontology to use. Some
areas, such as social networks (FOAF), online communities (SIOC) or general
document metadata (DC), are covered by established vocabularies. Outside of
these domains, registries like SchemaWeb<sup>3</sup> and search services like Falcons
Concept Search<sup>4</sup> assist in the task of finding vocabularies for niche topics, but what
they find might be of insufficient quality or might not cover all required terms,
and at present many areas of interest are not covered by any vocabulary at all.</p>
      <p>In summary, most efforts to publish information on the Semantic Web first
require an effort to create, extend or modify an RDF Schema vocabulary or OWL
ontology. But this is a complex and time-consuming task in itself. It involves:
        <list list-type="bullet">
          <list-item><p>creating the formal specification of the vocabulary in RDFS or OWL,</p></list-item>
          <list-item><p>writing documentation that is clear and helpful for users of the ontology,</p></list-item>
          <list-item><p>keeping both documents in sync as the vocabulary evolves,</p></list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-2">
      <title>Footnote 3: http://www.schemaweb.info/</title>
    </sec>
    <sec id="sec-3">
      <title>Footnote 4: http://iws.seu.edu.cn/services/falcons/conceptsearch/</title>
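      <p>
        <list list-type="bullet">
          <list-item><p>archiving older versions of the documents,</p></list-item>
          <list-item><p>defining and maintaining mappings to related vocabularies,</p></list-item>
          <list-item><p>configuring the web server in accordance with W3C best practices [<xref ref-type="bibr" rid="ref3">3</xref>].</p></list-item>
        </list>
      </p>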
      <p>In this paper we present an online vocabulary editor and publishing system
based on Drupal<sup>5</sup>, implemented in PHP and ActionScript, which will support
vocabulary authors in the tasks above and thereby dramatically reduce the time
required to create, publish and modify vocabularies. The work presented in this
paper is in progress.</p>
      <sec id="sec-3-1">
        <title>The value of vocabularies</title>
        <p>
          We define vocabularies as simple, "lightweight" ontologies, such as FOAF, DC,
SIOC and SKOS. They usually comprise fewer than 50 terms. Expressivity is
limited to RDF Schema plus selected OWL features, e.g. inverse functional
properties and class disjointness. Their value is in providing common terminology
for exchanging information between programs. The actual information is in the
RDF instance data that is expressed with the vocabulary's terms, while in more
complex ontologies, the actual information lies in the definitions of the classes
and properties. A vocabulary is created by publishing a description of its terms
in natural language (using HTML) or in a formal language (using RDFS/OWL). Since classes
and properties are identified by URIs, it is considered good practice to make
these URIs resolvable [
          <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
          ]. This enables clients to look up definitions of the
vocabulary terms, with the following benefits:
          <list list-type="bullet">
            <list-item><p>Information publishers can refer to a specification. This is important to
create interoperability around a vocabulary. The top ten most popular
vocabularies of 2006<sup>6</sup> all have such a specification.</p></list-item>
            <list-item><p>RDF-aware tools such as data browsers (e.g. Tabulator [<xref ref-type="bibr" rid="ref2">2</xref>]), SPARQL query
builders and RDF instance editors can use the formal specification to improve
the user experience, e.g. by showing friendlier labels and comments, listing
available terms and providing widgets appropriate to a property's data type.</p></list-item>
            <list-item><p>Inference can be performed to increase recall when performing queries or
lookups against RDF data, which is especially useful when terms are mapped
to other vocabularies. Systems that use such techniques include the Tabulator
data browser [<xref ref-type="bibr" rid="ref2">2</xref>] and the Sindice semantic lookup index [<xref ref-type="bibr" rid="ref6">6</xref>].</p></list-item>
          </list>
        </p>
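        <p>The recall benefit of term mappings can be illustrated with a minimal sketch, assuming a toy in-memory set of triples rather than any particular RDF store or reasoner. The <monospace>ex:</monospace> terms, the helper names and the mapping of <monospace>ex:colleague</monospace> to <monospace>foaf:knows</monospace> are hypothetical; only the RDFS subproperty entailment itself is standard.</p>
        <preformat><![CDATA[
```python
# Triples are (subject, predicate, object) tuples; all ex: terms are hypothetical.
SUBPROP = "rdfs:subPropertyOf"


def superproperties(prop, schema):
    """Return prop plus every property reachable via rdfs:subPropertyOf."""
    seen, todo = set(), [prop]
    while todo:
        current = todo.pop()
        if current in seen:
            continue  # guards against cycles in the mapping
        seen.add(current)
        todo.extend(o for s, p, o in schema if s == current and p == SUBPROP)
    return seen


def infer(data, schema):
    """Copy each data triple to every superproperty of its predicate."""
    inferred = set(data)
    for s, p, o in data:
        for sup in superproperties(p, schema):
            inferred.add((s, sup, o))
    return inferred


def objects(triples, subject, predicate):
    """Sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# A niche vocabulary term mapped to an established FOAF term.
schema = {("ex:colleague", SUBPROP, "foaf:knows")}
data = {
    ("ex:alice", "ex:colleague", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
}

before = objects(data, "ex:alice", "foaf:knows")
after = objects(infer(data, schema), "ex:alice", "foaf:knows")
print(before)  # ['ex:carol']
print(after)   # ['ex:bob', 'ex:carol']
```
]]></preformat>
        <p>A query for <monospace>foaf:knows</monospace> misses the statement made with the niche term until the published mapping is applied, which is why a resolvable, machine-readable specification matters to consumers.</p>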
      </sec>
      <sec id="sec-3-2">
        <title>Current approaches to vocabulary publishing</title>
        <p>
          Vocabulary maintenance with text editors and custom scripts. Many popular
vocabularies such as FOAF and SIOC are maintained by a process involving
hand-authoring of RDF and HTML les and custom scripts, e.g. SpecGen7.
Often, complex custom Apache con gurations are employed to follow best practices
regarding content negotiation, MIME types and resolvable URIs [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Footnote 5: http://drupal.org/</title>
    </sec>
    <sec id="sec-5">
      <title>Footnote 6: http://ebiquity.umbc.edu/resource/html/id/196/</title>
    </sec>
    <sec id="sec-6">
      <title>Footnote 7: http://sioc-project.org/specgen</title>
      <p>
        Offline ontology editors. OWL ontology editors such as Protégé [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], TopBraid
Composer<sup>8</sup> and SWOOP<sup>9</sup> can be used to create the formal specification of a
vocabulary. While they are great tools for knowledge engineering professionals, these
applications have a steep learning curve and intimidate casual users. They
use a file-based, offline model, where ontology files are stored on the local user's
computer. Remote publishing, if supported at all, is an afterthought.
      </p>
      <p>
        Web-based systems. OntoWiki [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] provides basic ontology editing, but its main
focus is the display and editing of RDF instance data. myOntology [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] focuses
on collaborative editing in a larger community, in the hope of creating rich
knowledge bases, while the creation of simple vocabularies typically does not involve
many collaborating users. Knoodl<sup>10</sup> is a hosted service with strong community
features and an easy-to-use vocabulary editor, but it does not publish created
vocabularies with resolvable URIs or according to best-practice guidelines.
      </p>
      <p>
        Areas for improvement. We identify four points where we can simplify the
process: (i) instant web-based publishing instead of file-based offline editing; (ii)
focus on a limited subset of RDFS and OWL; (iii) no instance editing or
browsing; (iv) handling of HTTP details like URI management, content negotiation
and redirects within the web-based application.
      </p>
      <sec id="sec-6-1">
        <title>Easier vocabulary publishing with Neologism</title>
        <p>Neologism<sup>11</sup> is a web-based vocabulary editor and publishing platform designed
to address these issues. It is currently being implemented and will soon be
released as an open-source project. This section presents Neologism's current state.</p>
        <p>Public interface. To non-authenticated users on the Web, Neologism presents
a very simple interface: a homepage that lists one or more vocabularies, and
for each of them a vocabulary page containing some general information about
the vocabulary (Figure 1), followed by the descriptions of all its classes and
properties.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Footnote 8: http://www.topbraidcomposer.com/</title>
    </sec>
    <sec id="sec-8">
      <title>Footnotes: 9 http://www.mindswap.org/2004/SWOOP/; 10 http://knoodl.com/; 11 http://neologism.deri.ie/</title>
      <p>Editor. After a vocabulary maintainer logs in, additional links become visible on
the vocabulary page and allow adding new terms, as well as editing of existing
terms. Terms are created and edited through a web form (Figure 2). The form
allows entry of an ID (to become part of the term's URI), label, comment,
subclasses, subproperties, domain, range, disjoint classes, inverse properties, and
marking a property as inverse functional. Authenticated users can also create
new vocabularies and modify the vocabulary metadata.</p>
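      <p>To make the form fields concrete, the following is a minimal sketch of how one property term could be held in memory and written out as Turtle. It is an illustration only: Neologism itself is implemented in PHP and serializes with RAP, and the class name, field names, helper functions and the <monospace>http://example.org/books</monospace> vocabulary URI are all hypothetical.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional

# Prefix declarations needed by the Turtle emitted below.
PREFIXES = (
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n\n"
)


@dataclass
class PropertyTerm:
    """Hypothetical model of a subset of the editor form's fields."""
    term_id: str                      # becomes the fragment of the term URI
    label: str
    comment: str
    domain_uri: Optional[str] = None  # full URI of the domain class
    range_uri: Optional[str] = None   # full URI of the range class
    inverse_functional: bool = False


def escape(text):
    """Escape backslashes, quotes and newlines for a Turtle string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def to_turtle(term, vocab_uri):
    """Serialize one property term as a Turtle statement block."""
    types = ["rdf:Property"]
    if term.inverse_functional:
        types.append("owl:InverseFunctionalProperty")
    parts = [
        "a " + ", ".join(types),
        f'rdfs:label "{escape(term.label)}"',
        f'rdfs:comment "{escape(term.comment)}"',
    ]
    if term.domain_uri:
        parts.append(f"rdfs:domain <{term.domain_uri}>")
    if term.range_uri:
        parts.append(f"rdfs:range <{term.range_uri}>")
    subject = f"<{vocab_uri}#{term.term_id}>"
    return subject + "\n    " + " ;\n    ".join(parts) + " .\n"


term = PropertyTerm(
    term_id="isbn",
    label="ISBN",
    comment="The ISBN of a book.",
    domain_uri="http://example.org/books#Book",
    inverse_functional=True,
)
print(PREFIXES + to_turtle(term, "http://example.org/books"))
```
]]></preformat>
      <p>Keeping the term as structured data and generating the formal description from it is what allows the HTML documentation and the RDFS/OWL specification to stay in sync without manual effort.</p>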
      <p>Overview diagram. The vocabulary page provides access to a diagram that shows
the vocabulary's classes and their relationships (Figure 3). The vocabulary
maintainer can arrange the diagram into a sensible layout and then save its
current state, which will henceforth be shown to all users.</p>
      <p>RDFS output, URIs and content negotiation. The URIs identifying classes and
properties are always generated by appending the hash character and the term's
ID to the URI of the vocabulary page. This ensures that the vocabulary
page is returned when these URIs are resolved. HTTP requests to the vocabulary
page are subject to content negotiation. Web browsers will see the HTML variant
shown in Figure 1. RDF-aware clients will receive the RDFS/OWL specification,
either in RDF/XML or N3 syntax. In a nutshell, Neologism publishes
standards-compliant vocabularies on the Web without requiring any additional effort on
the part of vocabulary maintainers.</p>
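      <p>The two mechanisms in this paragraph can be sketched as follows. This is a simplified illustration under stated assumptions, not the PHP Content Negotiation library that Neologism uses: the function names, the example vocabulary URI and the media type chosen for N3 (<monospace>text/n3</monospace>) are assumptions, and the Accept header handling covers only media ranges and <monospace>q</monospace> values.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urldefrag

# Variants of the vocabulary page, in server preference order (assumed types).
VARIANTS = ["text/html", "application/rdf+xml", "text/n3"]


def term_uri(vocab_uri, term_id):
    """Mint a hash URI: vocabulary page URI + '#' + term ID."""
    return f"{vocab_uri}#{term_id}"


def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        ranges.append((parts[0].lower(), q))
    return ranges


def matches(media_range, media_type):
    """True if media_type falls within media_range (e.g. text/* or */*)."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def specificity(media_range):
    """Rank ranges so that text/html beats text/*, which beats */*."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2


def negotiate(accept_header, variants=VARIANTS):
    """Return the best variant for the header, or None if none is acceptable."""
    ranges = parse_accept(accept_header or "*/*")
    best, best_q = None, 0.0
    for variant in variants:
        hits = [(specificity(r), q) for r, q in ranges if matches(r, variant)]
        if not hits:
            continue
        q = max(hits)[1]  # q value of the most specific matching range
        if q > best_q:    # ties keep the earlier, server-preferred variant
            best, best_q = variant, q
    return best


uri = term_uri("http://example.org/books", "Book")
# An HTTP client drops the fragment, so the vocabulary page itself is requested.
print(urldefrag(uri).url)
print(negotiate("text/html,application/xml;q=0.9,*/*;q=0.8"))
print(negotiate("application/rdf+xml"))
```
]]></preformat>
      <p>Because the fragment never reaches the server, a single negotiated resource per vocabulary is enough to serve every term, which is what removes the need for per-term redirect rules in the web server configuration.</p>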
      <p>Implementation. Neologism is implemented in PHP as a Drupal module. Drupal
reduces development time by providing many features for free, such as account
management. It also makes integration with a larger Drupal-based site very easy,
for example to provide a news blog and discussion forum for each vocabulary. All
data is stored in a MySQL database. RAP<sup>12</sup> is used to serialize RDF/XML and
N3. The PHP Content Negotiation library<sup>13</sup> is used instead of the usual Apache
rules to implement content negotiation, and Vapour<sup>14</sup> was used to validate its
correctness. The overview diagram is implemented using Adobe Flex and coded
in ActionScript; the ObjectHandles and Tweener libraries are used for object
handling and animation, respectively.</p>
      <sec id="sec-8-1">
        <title>Future Work</title>
        <p>Hosted Neologism service. Currently, vocabulary maintainers must install
Neologism on their own webspace. A central hosted service, which could be easily
built on the Drupal platform, would remove this barrier.</p>
        <p>
          Branching and revision tracking. Neologism does not yet offer revision control.
Some desirable features for vocabulary revision control are: archival of all prior
versions; grouping of several small edits into a single version to avoid putting
the vocabulary into an inconsistent intermediate state; publishing changes as a
draft before accepting them as a new version.
        </p>
        <p>
          Plugin system. We intentionally kept the set of supported class and property
annotations small to simplify the user experience, and don't support many
possible further annotations, such as OWL cardinality constraints, plural and
inverse labels<sup>15</sup>, multilingual labels or associating Fresnel lenses [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] with classes
and properties. Such additional annotations could be supported through plugins
that are installed by vocabulary maintainers.
        </p>
        <p>Footnotes: 12 http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/; 13 http://ptlis.net/source/php-content-negotiation/; 14 http://vapour.sourceforge.net/</p>
        <p>Consistency checking. Neologism doesn't check the created vocabulary for
consistency. This can become an issue when a vocabulary is integrated with several
external vocabularies. A solution could be the integration of an external
reasoning service that performs consistency checks and is invoked through an API over
the Web.</p>
      </sec>
      <sec id="sec-8-2">
        <title>Conclusion</title>
        <p>We have shown a web-based vocabulary publishing system that simplifies the
process of creating, publishing and maintaining RDF vocabularies by (i) instant
web-based publishing, (ii) focus on a limited subset of RDFS and OWL, (iii)
avoiding instance editing or browsing, and (iv) handling URI management and
HTTP content negotiation. We hope that the presented system will encourage
the creation of new vocabularies and thereby contribute to a generally more
interesting, relevant and standards-compliant Semantic Web.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dietzold</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Riechert</surname>
          </string-name>
          .
          <article-title>OntoWiki, a tool for social, semantic collaboration</article-title>
          .
          <source>The Semantic Web - ISWC 2006</source>
          ,
          <volume>4273</volume>
          :
          <fpage>736</fpage>
          -
          <lpage>749</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Connolly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Dhanaraj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hollenbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lerer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sheets</surname>
          </string-name>
          .
          <article-title>Tabulator: Exploring and Analyzing Linked Data on the Semantic Web</article-title>
          . In
          <source>The 3rd International Semantic Web User Interaction Workshop (SWUI06)</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>D.</given-names>
            <surname>Berrueta</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Phipps</surname>
          </string-name>
          .
          <article-title>Best Practice Recipes for Publishing RDF Vocabularies</article-title>
          . Working Draft, W3C,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>C.</given-names>
            <surname>Bizer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Pietriga</surname>
          </string-name>
          .
          <article-title>Fresnel, a Browser-Independent Presentation Vocabulary for RDF</article-title>
          . In
          <source>International Semantic Web Conference 2006</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>H.</given-names>
            <surname>Knublauch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Fergerson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Noy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Musen</surname>
          </string-name>
          .
          <article-title>The Protégé OWL Plugin: An open development environment for semantic web applications</article-title>
          .
          <source>The Semantic Web - ISWC 2004</source>
          ,
          <volume>3298</volume>
          :
          <fpage>229</fpage>
          -
          <lpage>243</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>E.</given-names>
            <surname>Oren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Delbru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Catasta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Stenzhorn</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Tummarello</surname>
          </string-name>
          .
          <article-title>Sindice.com: A document-oriented lookup index for open linked data</article-title>
          .
          <source>International Journal of Metadata, Semantics and Ontologies</source>
          ,
          <volume>3</volume>
          (
          <issue>1</issue>
          ),
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>K.</given-names>
            <surname>Siorpaes</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Hepp</surname>
          </string-name>
          .
          <article-title>myOntology: The marriage of ontology engineering and collective intelligence</article-title>
          . In
          <source>ESWC 2007 Workshop Bridging the Gap between Semantic Web and Web 2.0</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>