=Paper= {{Paper |id=Vol-55/paper-4 |storemode=property |title=Easing Participation in the Semantic Web |pdfUrl=https://ceur-ws.org/Vol-55/haustein.pdf |volume=Vol-55 |dblpUrl=https://dblp.org/rec/conf/sww/HausteinP02 }} ==Easing Participation in the Semantic Web== https://ceur-ws.org/Vol-55/haustein.pdf
                         Easing Participation in the Semantic Web

                                                        Stefan Haustein, Jörg Pleumann
                                                                 Computer Science VIII, X
                                                                  University of Dortmund,
                                                                     Baroper Str. 301
                                                               D-44221 Dortmund, Germany
                                             {stefan.haustein, joerg.pleumann}@udo.edu


Permission to make digital or hard copies of all or part of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for profit or commercial
advantage and that copies bear this notice and the full citation on the first page. To copy
otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific
permission by the authors.
Semantic Web Workshop 2002 Hawaii, USA
Copyright by the authors.

ABSTRACT
Although a promising idea, the Semantic Web currently seems to have a problem duplicating the
success story of its predecessor, the World Wide Web. The number of people actively participating
in the Semantic Web has been very limited until now, because people can't see the benefits
originating from the extra effort they invest into semantically rich web pages. Unfortunately,
this advantage is barely visible at all until a critical mass of RDF-annotated pages is available
on the net, making it difficult to recruit new participants for the Semantic Web. This article
tries to break this vicious circle by showing that the use of appropriate tools may both ease
participation in the Semantic Web and provide a number of additional advantages not directly
related to the Semantic Web. The latter, in particular, may convince a larger number of people to
participate, and thus bring the Semantic Web nearer its critical mass.

1.     INTRODUCTION

  The Semantic Web is a great idea. Yet it has not quite taken off so far. Why is this the case?
Some argue that RDF [19], the language for adding the semantic information to existing web pages,
is the problem. These critics see RDF as being too complicated or under-specified [11, 6]. While
RDF truly has its problems in some areas, we don't think that the language itself is the main
obstacle that hinders people from participating in the Semantic Web. But to find out where the
problem actually lies, we first need to take a step back and look at what made the original web
such a tremendous success.
  In our opinion, there were four important reasons for the success of the World Wide Web:

     • Simplicity. HTML was easily understood and quickly written down. Even novices could design
       a few basic web pages with little effort, put them in a matching directory structure and
       start an HTTP daemon to deliver the content to clients.

     • Immediate feedback. After an HTML page had been designed in a text editor, the result could
       be displayed in any HTML client to get an impression of the results. Thus, the user had
       immediate feedback on his or her work.

     • Additional benefits. Even though their original purpose was to present information to other
       people, HTML pages could be used as a means of discussion or documentation for people
       participating in a project or even for personal use. Thus, there was an additional gain
       users got from participating in the World Wide Web, which made the system even more
       attractive to them.

     • Low critical mass. As a networked effort, the World Wide Web required a minimum (but large
       enough) number of participants to raise the interest of outside people, convincing them to
       become involved. Yet, since the World Wide Web was the first system of its kind, and there
       was no similar system to compete with, this critical mass was relatively low.

  When we compare these points to the Semantic Web in its current form, we notice that most of
them are not fulfilled:

     • Simplicity is only partially given. The mixture of RDF and DAML+OIL is understood in all its
       details only by people that have a background in AI or related fields. Novices will only be
       able to use basic concepts of RDF and might thus have trouble seeing the real advantages of
       the Semantic Web.

     • Immediate feedback is not given. Unfortunately, there is no specific client software for the
       Semantic Web that gives users an impression of their RDF fact base. One could argue that it
       doesn't even make sense to ask for such software, because the clients of the Semantic Web
       are programs rather than human beings.

     • There are no additional benefits, at least none that are obvious to "ordinary end-users".
       While human-readable HTML pages primarily designed for other people can also be used for
       personal purposes, this is not true for RDF facts, which are meant to be read by programs.

     • The critical mass is considerably higher. Why is this the case? This time, there already is
       an existing system, the original World Wide Web, and
       most people nowadays tend to use the "brute force" method to find a specific piece of
       information in it, namely Google or some other search engine. Thus, it is more difficult to
       convince people to take part in another system, even if it is an extension to the existing
       one.

   As long as the first three points are true, the critical mass of users needed to make the
Semantic Web "take off" will be hard to reach. Unfortunately, seen the other way round, the
Semantic Web hardly has any real benefit unless there is a large enough number of participants
that make RDF-specified information available to others, that is, until the critical mass is
reached. The current situation could be seen as a vicious circle that has to be broken before the
Semantic Web has a chance to succeed.

                       Figure 1: Simple UML Diagram for university department's web site

2.   TOOLS TO BREAK THE CIRCLE
   To break the circle, we have to get rid of as many as possible of the four problems shown in
the previous section. Since we cannot lower the critical mass for mainstream acceptance of the
Semantic Web (except, possibly, by forcing people into it), we have to focus on the other three:
simplicity, immediate feedback, and additional benefits. A very promising way to achieve this
seems to be the use of appropriate tools. These tools would have to ease participation in the
Semantic Web, but would also have to provide some "added value" that makes them attractive to
end-users. When using the tools, people will then likely participate in the Semantic Web as well,
even if that is not their original motivation. The following sections try to show what features
these tools might offer.

2.1 Generative approach

  Looking at existing tools developed for or related to the Semantic Web, for example Protege-2000
[23] or Ontobroker [15], one notices that these are primarily designed to support ontology and fact
management. Information is stored in a knowledge base providing fine-grained access. The ontology
is utilized to make sure that the content corresponds to the desired structure. The two systems
mentioned above are able to export their fact base to an RDF representation.
   While these tools aim in the right direction, they still have a problem: As long as one wants a
machine-readable RDF version of the facts as well as a human-readable HTML version, duplicate
effort is required to maintain both. Take, for example, a typical web site for a university
department containing information about the department's staff, research topics, projects, and
publications. A highly structured site like this is suitable for participating in the Semantic
Web, and it can easily be modelled using a corresponding domain ontology. Yet, a change as simple
as a telephone number has to be propagated to the RDF version as well as the HTML version.
   If a Semantic Web tool followed a generative approach, the situation would be easier: Assume
this tool were able to incorporate regular HTML for the unstructured part of the web site, and
these pages could contain placeholders for insertion of information contained in the fact base.
The tool would then be able to generate the actual HTML pages automatically from the existing RDF
information (or even both from a common fact base), thus requiring the user to maintain this fact
base only, at least as far as structured information is concerned. If the generation of pages
takes place at run-time, we arrive at a tool that could be seen as a "Semantic Web-enabled HTTP
server".
   While the avoidance of redundancy already is a big advantage addressing simplicity, the
generative approach provides other advantages that fall into the area of "added value":

     • In contrast to editing HTML directly, a uniform look and feel can easily be established for
       the whole site, given an appropriate template mechanism.

     • In addition to HTML and RDF, other target formats
       like WML and cHTML can be generated from the same fact base, lowering redundancy even
       further.

     • In contrast to plain HTML files, ontology-based consistency checks can be performed
       automatically while entering data, e.g. avoiding dangling links inside the system.

2.2 Incorporation of database features
   To broaden the possible target audience of our Semantic Web server, we might try to incorporate
database-like features and thus position it as an alternative to a "heavy-weight" database
solution.
   While relational databases with HTML-generating front-ends are quite common these days (e.g.
Cold Fusion [8], PHP [2], Enhydra [1] etc.), these solutions are mainly used for sites with a
simple, low-dimensional structure, such as guest books or news pages (e.g. Slashdot.org). More
complex domains such as university departments often still use plain HTML files for their web
presentation, or make only limited use of database tables.
   Here, the reason may be that a high number of tables would be required for modelling even
simple ontologies, mainly because associations are not first-class members of relational database
systems. Revisiting the university department scenario, we need at least tables for persons,
research topics, projects, and publications. Figure 1 shows a possible UML class diagram of the
database's conceptual model. Since all n:n associations require separate association tables, this
results in quite a lot of normalised tables (more than 10), each of which potentially contains
only a very small subset of all the possible instances.
   In this case, the benefit for the creator, that is, the dynamic generation of HTML or (in our
case) RDF from a single set of data, does not outweigh the extra effort inherent in maintaining
the tables.
   Using Semantic Web tools, the picture may change significantly. For a low number of instances,
the internal knowledge base provided by a Semantic Web tool may be sufficient. Associations are
directly supported, and the ontology language also allows integrity constraints to be specified
for them at an appropriate level. Since Semantic Web tools usually come with a generic user
interface, the need to create HTML forms for editing the tables is avoided.

2.3 Incorporation of Content Management Features
   Another area that a Semantic Web tool might address is content management. Content management
systems, such as Hyperwave [3], Zope [5] or OpenCMS [4], provide user, version and metadata
management for a set of HTML pages or binary documents in other formats such as PDF or Word.
Their set of metadata, however, is usually fixed and tailored to the most common needs. Here,
ontology-based Semantic Web tools provide much more flexibility, and may be superior to general
content management systems in domains where the metadata requirements differ significantly from
the standard set provided by content management systems.

2.4 Openness to Alternative Schema Languages
  In the introduction, we claimed that besides providing no gain that becomes immediately obvious,
RDF annotation is complex. In its current form, the Semantic Web requires users to learn yet
another formal description language. Users having a background in AI may be expected to be
familiar with description logics and corresponding ontology modelling tools. For mainstream
acceptance, though, integration of recognised standards like UML [20] may help to improve
acceptance of Semantic Web tools and thus lower the entrance barrier [13]. Most students of
computer science or related engineering disciplines can be assumed to be familiar with UML and
modelling tools like Together or Rational Rose. These students could easily apply their modelling
knowledge to the Semantic Web and thus contribute to its group of early adopters.

3. THE INFORMATION LAYER
   In order to demonstrate that participation in the Semantic Web actually can be simple, and that
using a server based on a fine-grained fact base instead of HTML or RDF files can provide
immediate gains, we have started to model our own unit's web pages accordingly. For this purpose,
we used our Information Layer system, which stores data in a simple XML format that is determined
by a given ontology. The Information Layer uses an object-oriented model for data representation.
Objects consist of atomic attributes and relations to other objects. The consistency of relations
in both directions is ensured automatically, avoiding inconsistencies inside the system. The
concepts and relations are defined per application in an external ontology definition file. All
files used by the Information Layer are stored as XML documents.
   The InfoLayer system was originally designed as an integrated information platform for software
agents and human users in a conference scenario. The system was used in the COMRIS project [21] in
order to make conference information available in appropriate formats to human users as well as
software agents, utilizing the same underlying knowledge base. Access to the content is possible
via a generic HTML interface as well as a FIPA [16] based XML interface [18]. When information is
already machine-readable for software agents, it is not a big leap to make this information
available for the Semantic Web as well.
   In the process of modelling our unit web pages, we made several improvements to our system,
simplifying its use as a replacement for a "regular" web server. While there may be alternative
paths appropriate for other systems, our main purpose was to show that using Semantic Web systems
may provide direct advantages over regular web servers, even without relying on advanced features
such as knowledge integration from different sources (e.g. KAON-REVERSE [17]).

3.1 XMI Import
   The original version of the Information Layer system used its own proprietary XML-based
ontology description language. In order to simplify the initial step of generating the application
ontology, we have replaced the internal format by XMI [20], the XML-based exchange format for UML
diagrams. Figure 1 shows a simplified version of the UML model currently used as a basis for our
unit web pages.
   We have chosen UML as ontology modelling language [13] instead of RDFS [7] because it is
difficult to avoid contact with UML when working in computer science or in the IT industry in
general. For most computer scientists, a UML editor like Rational Rose or Together is part of
their standard
tool box. Thus, the extra effort of installing and getting familiar with an RDFS editor, possibly
preventing people from getting in touch with the Semantic Web, is avoided.
   Compared to other languages suitable for ontology modelling, UML currently still lacks clearly
defined semantics. However, there are significant efforts to solve this problem [22, 10].
   This aspect may be less important for systems providing their own comfortable ontology editor.

3.2 HTML Generation
   The most important capability required for being able to replace existing web servers is, of
course, the generation of HTML pages.
   The Information Layer contains a module that provides built-in web-server functionality. The
server is able to generate HTML dynamically: For any object, the attributes are simply displayed,
and the associations to other objects are converted to sets of hyperlinks to the related objects.
Concepts are displayed as a clickable list of instances corresponding to the concept. The HTML
interface can also be used to edit the content of the system using forms generated dynamically
based on the ontology. In the COMRIS project, the HTML interface was used for interaction with the
end user as well as for debugging and inspection purposes.
   In addition to generic HTML generation, templates can be used in order to generate HTML pages
conforming to a given look and feel. In the COMRIS project, we have also used the template
mechanism to generate the input structure required by the text generation system TG/2 [9], which
was used to generate natural language output for a wearable device. The template mechanism is
described in some more detail in the next section.

3.3 SWRC and RDF Integration
  The Semantic Web Research Community (SWRC) Ontology [24] is an ontology designed in order to
describe the structure of the Semantic Web Research Community, namely the members, events, topics
and projects, in a machine-readable manner. It is available in DAML+OIL and FLogic formats.
Figure 2 shows a subset of the inheritance hierarchy of the SWRC ontology.

          Figure 2: A Subset of the Semantic Web Research Community ontology concept Hierarchy

   Since our "local" research unit ontology was primarily designed to fit the needs of our
"regular" web presentation, it does not match the "shared" SWRC ontology exactly. However, using
the template mechanism of our system, we are able to generate RDF pages corresponding to the SWRC
ontology on the fly. Figure 3 shows a simplified example template that is used to generate
SWRC-compliant RDF content for instances of the class "Member". In the templates, elements in a
special namespace, denoted by the t prefix in the example, are replaced by content queried from
the Information Layer with respect to the current instance, which is determined from the page URL.
   Thus, it is possible to participate in the Semantic Web without needing to extend a predefined
shared ontology, which may be bloated and still not fulfil all local requirements. Instead, the
domain of interest can be modelled using a lean, domain-specific local ontology. The SWRC person
name slot illustrates the advantage of this approach: SWRC contains only one person name slot that
is not split into first and last name. If the local application requires having both parts
available separately, it would be necessary to duplicate the corresponding information when
building the local ontology on top of the SWRC ontology. Also, SWRC concepts like "Organization"
may not be required in a local ontology covering a single organization. Information about the
local organization can be stored in a single static RDF file, not bloating the local ontology.
   In addition to template-based RDF generation, it would be possible to generate RDF directly
corresponding to the local ontology automatically [12]. However, this feature is not implemented
yet.

3.4 Infrastructure Integration
   For simpler integration with the existing Web server infrastructure, we changed the Information
Layer implementation to become a Java Servlet instead of a stand-alone program. Running the
Information Layer as a Java Servlet allows smooth integration with existing Web presentations,
without any hard switch. The service can simply be added where it makes most sense, and then later
be extended to
[Figure: architecture diagram. Templates feed the template-based XML generation of the Infolayer,
which runs inside a servlet container (e.g. Tomcat) and produces XHTML and RDF output.]
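
The template mechanism described in Section 3.3 can be illustrated with a minimal sketch. It is not the Information Layer implementation: the element name `t:value`, its `attr` attribute, both namespace URIs, the property names and the in-memory fact base are assumptions made only for this example. The sketch shows one instance, selected by the page URL, being rendered both as HTML and as RDF, and a local ontology with separate first and last names filling a single shared name slot.

```python
import xml.etree.ElementTree as ET

# All names below are hypothetical and chosen only for this sketch.
T_NS = "urn:example:infolayer-template"   # namespace bound to the "t" prefix
SWRC_NS = "urn:example:swrc"              # placeholder, NOT the real SWRC namespace URI
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("swrc", SWRC_NS)

# Stand-in for the fact base: one instance of a local "Member" class.
FACT_BASE = {
    "member/jdoe": {"firstName": "Jane", "lastName": "Doe", "phone": "555-0100"},
}

HTML_TEMPLATE = (
    f'<html xmlns:t="{T_NS}"><body>'
    '<h1><t:value attr="firstName"/> <t:value attr="lastName"/></h1>'
    '<p>Phone: <t:value attr="phone"/></p>'
    '</body></html>'
)

RDF_TEMPLATE = (
    f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:swrc="{SWRC_NS}" xmlns:t="{T_NS}">'
    '<swrc:Person>'
    # The local ontology keeps two name parts; the shared slot takes one string.
    '<swrc:name><t:value attr="firstName"/> <t:value attr="lastName"/></swrc:name>'
    '<swrc:phone><t:value attr="phone"/></swrc:phone>'
    '</swrc:Person>'
    '</rdf:RDF>'
)


def instance_for_url(path):
    """Determine the current instance from the page URL path."""
    return FACT_BASE[path.strip("/")]


def expand(element, instance):
    """Replace each <t:value attr="..."/> child with the instance's attribute text."""
    previous = None  # last child that was kept; receives text via its tail
    for child in list(element):
        if child.tag == f"{{{T_NS}}}value":
            text = instance[child.get("attr")] + (child.tail or "")
            if previous is None:
                element.text = (element.text or "") + text
            else:
                previous.tail = (previous.tail or "") + text
            element.remove(child)
        else:
            expand(child, instance)
            previous = child


def render(template, path):
    """Expand a template for the instance addressed by the given URL path."""
    root = ET.fromstring(template)
    expand(root, instance_for_url(path))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render(HTML_TEMPLATE, "/member/jdoe"))
    print(render(RDF_TEMPLATE, "/member/jdoe"))
```

Because both templates read from the same instance, changing the telephone number in the fact base changes the HTML and the RDF output together, which is the redundancy argument made in Section 2.1.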