=Paper=
{{Paper
|id=Vol-105/paper-14
|storemode=property
|title=Web Design for the Semantic Web
|pdfUrl=https://ceur-ws.org/Vol-105/Plessers-annotation-final.pdf
|volume=Vol-105
|dblpUrl=https://dblp.org/rec/conf/www/PlessersT04
}}
==Web Design for the Semantic Web==
Web Design for the Semantic Web
Peter Plessers*, Olga De Troyer
Vrije Universiteit Brussel, Department of Computer Science, WISE, Pleinlaan 2, 1050
Brussel, Belgium
{Peter.Plessers, Olga.DeTroyer}@vub.ac.be
Abstract

To realize the vision of the Semantic Web, an important bottleneck that needs to be solved is an easy and intuitive approach for the annotation of websites with semantic information. Annotating websites defines the data they contain in a form that is suitable for interpretation by machines. In this paper, we present a new approach to annotate websites by taking the annotation process to a conceptual level and by integrating it into an existing website design method. By this means, we are able to solve some of the problems current annotation solutions have.

1. Introduction

The importance of being able to express the semantics of the information presented on the World Wide Web (WWW) was neglected for a long time. It was the vision of the Semantic Web [1] that brought this issue to the foreground. The idea of the Semantic Web states that the information available on the WWW should be defined such that it remains usable for human interpretation, but also becomes usable for machines. By realizing this vision, some limitations of the current WWW (e.g. its restricted query possibilities) can be overcome. Although a lot of work has been done in recent years in the research domain of the Semantic Web, an easy and intuitive approach for authoring websites with semantic markup still remains an important bottleneck. As mentioned in [13], the generation of such semantic markup should be a by-product of normal computer use.

A step towards this goal has been taken in recent years by annotation approaches such as SHOE [11] [12], MindSwap [7] and CREAM [9]. The earliest annotation systems were based on manual editing of the HTML pages to add the needed semantic information. This process of manual editing soon proved to be a cumbersome and error-prone task, and the necessity of supporting tools became indisputable. The most well-known annotation tool is the SHOE Knowledge Annotator [12] of the SHOE project, which allows the user to annotate existing web pages using a graphical user interface. While such tools solve a number of issues like syntactic mistakes or inconsistencies with the used ontology, a number of fundamental problems still remain.

The main reason for these problems is that current tools define a linkage between an ontology and the actual data of the website on an implementation level, resulting in a strong interweaving of semantics and implementation. We list some of the problems we encounter in current annotation approaches:
• Despite the introduction of supporting tools, the annotation process remains a very heavy and time-consuming task. In addition, in most current approaches this process is an additional activity, and the ones that will benefit from the annotations are usually not the ones that have to do the job. Therefore, the motivation for performing the annotation process is low.
• It is usually assumed that the granularity of the concepts defined in the ontology matches exactly the granularity of the data on the website, although this assumption cannot be taken for granted. It must therefore be possible to define a link between concepts that are semantically equivalent but have a different level of granularity.
• Most of the supporting tools only allow annotating static websites, page by page, on an implementation level. Even approaches that support the annotation of dynamically generated websites (by annotating the database) create a direct link between the implementation structure of the database (i.e. tables and columns for a relational database) and concepts in the ontology. For static web pages this has as a consequence that the work done for one page needs to be repeated for similarly structured web pages and that the maintenance of the metadata becomes a heavy task with a huge cost. Also note that for both static and dynamic websites, every time one changes the implementation of the website or database, even though nothing has changed in the semantics of the presented data, the defined linkage between the web pages or database and the ontologies can be affected.
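The last problem can be made concrete with a minimal Python sketch. It is only an illustration under assumptions: the table and column names, the `ex:` concept identifiers and the helper `concept_for_column` are hypothetical and are not taken from SHOE, SMORE or CREAM.

```python
# Hypothetical implementation-level annotation: ontology concepts are
# attached directly to (table, column) pairs of the database.
column_annotation = {
    ("Employee", "first_name"): "ex:firstName",
    ("Employee", "surname"): "ex:surname",
}


def concept_for_column(annotation, table, column):
    """Return the ontology concept linked to a column, or None if the link is broken."""
    return annotation.get((table, column))


# A purely technical change: the column is renamed, its meaning stays the same.
renamed_column = ("Employee", "given_name")

link_before = concept_for_column(column_annotation, "Employee", "first_name")
link_after = concept_for_column(column_annotation, *renamed_column)

print(link_before)  # concept found through the old column name
print(link_after)   # no concept found: the annotation has to be repaired by hand
```

Nothing changed in the semantics of the data, yet the lookup through the renamed column no longer finds a concept.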
* This research is partially performed in the context of the e-VRT Advanced Media project (funded by the Flemish government), which consists of a joint collaboration between VRT, VUB, UG, and IMEC.
In this paper we show that elevating the annotation process to a conceptual level provides an answer to the problems mentioned above. It is also our belief that (whenever possible) the annotation is best performed while designing the website, not after it is implemented. In this way we can take advantage of the information available during the website design process to ease and improve the annotation process. Therefore, we propose to integrate the annotation process into an existing website design method. Several website design methods have already been proposed in the literature. We will use WSDM (Web Site Design Method) [3] [4] in our approach as this method is well suited for our purpose: it uses an explicit information-modeling step at a conceptual level. In fact, we propose an approach that bridges classical website design methods and annotation techniques developed for the Semantic Web. Using website design methods in the context of the Semantic Web can provide great value and benefits for the annotation process.

The rest of the paper is organized as follows. In section 2 we give a short overview of existing annotation approaches. We present an overview of our approach in section 3. In sections 4 and 5, more details on the important aspects of the method are given, making use of a small example. The next section lists the advantages of our approach and the paper is concluded with future work and conclusions.

2. Related work

Current annotation approaches in use are fully decoupled from existing web design methods. The most well-known approach is the SHOE Knowledge Annotator [12] of the SHOE project. It provides the user with a form-based graphical user interface to mark up existing web pages using SHOE ontologies without having to worry about syntax. This tool only supports the annotation of static web pages; no support for dynamic pages is provided. The annotation process also remains an additional task that needs to be performed after the website is completed. Furthermore, it doesn’t give any support to solve the granularity problem between the data on a website and the concepts of an ontology (as mentioned earlier in the introduction).

Another system is the SMORE (Semantic Markup, Ontology and RDF Editor) application [15] of the MindSwap project, which is based on the same principles as the SHOE Knowledge Annotator but provides a more advanced user interface. It contains an embedded HTML editor, web browser and ontology browser, which allow the user to create web page elements as instances of ontology concepts by means of drag and drop. The Ont-O-Mat tool [10] of the CREAM project uses a similar graphical user interface. Both tools allow annotating web pages by markup and by authoring. SMORE also offers the possibility to create a new ontology borrowing concepts from existing web ontologies.

CREAM is, as far as we know, the only approach that supports the annotation of dynamically generated websites. In contrast to the annotation tools previously mentioned, the database is annotated instead of the HTML page. The following information is published to a web page to be able to link concepts of a given ontology to tables and columns of a data source: 1) which database is used and how the database can be accessed; 2) which query is used to retrieve data from the database; and 3) which elements of the query result are used to create the dynamic web page. Using this information it can be determined which data on the web page originates from which column of which table. By defining a linkage between the database columns and concepts of an ontology, semantic meaning is added to the data stored in the columns.

Although the linkage between the database and the ontology is defined at a somewhat higher level than the linkage between static HTML pages and ontologies, it is still defined in an implementation-dependent way. As can be seen in the case of CREAM, which supports dynamic pages, the direct linkage between the database columns and the concepts in the ontology can easily be broken by a change in the structure of the database. This shows that an annotation approach on a higher level, a conceptual level, is necessary.

3. Overview of the approach

Figure 1 gives an overview of the global architecture of our annotation approach. The different phases of WSDM that are relevant for our annotation approach are shown at the left: Task Modeling, Navigational Design, Page & Presentation Design, Database Design and finally the Implementation. Our approach is integrated into the original phases of the WSDM design method. A short overview of each step of the WSDM method, together with the enhancements (if any) we made for our annotation approach, is given below.

• Mission Statement Specification: Specifies the subject and goal of the website and declares the target audience. No enhancements are needed in this step.
• Audience Modeling: In this phase the different types of users are identified and classified into audience classes. For each audience class, the different requirements and characterizations are formulated. Also in this step, nothing additional is needed.
• Task Modeling: A task model is defined for each requirement of each audience class. Each task defined in the task model is elaborated into elementary tasks. For each elementary task a data model (called ‘object
chunk’) is created, which models the necessary information and/or functionality needed to fulfill the requirement of that elementary task. ORM (Object Role Modeling) [8] is used as the representation language for the object chunks. For our purpose, we added an annotation process to the Task Modeling phase. This results in the creation of a linkage between the object types and roles of the different object chunks and the concepts of one or more ontologies. This annotation is called the conceptual annotation (arrow A in Figure 1) because it is performed on a conceptual level. In this way we define the semantic meaning of the object types and roles used in the object chunks. This conceptual annotation is performed for static as well as dynamic websites.
• Navigational Design: In this phase of WSDM the navigational structure of the website is described by defining components, connecting object chunks to those components and linking components to one another.
• Page Design: During Page Design, the components of the navigational structure and their associated object chunks are mapped onto a page structure defining the pages that will be implemented for the website. Using this step together with the previous one (the Navigational Design), we can identify which object chunks will be placed on a given page. This is necessary to know, for the actual implementation, which annotations we have to add to a page.
• Presentation Design: For each page defined in the Page Design a page template is created defining the layout of the page. This layout is defined in an implementation-independent way. To implement the actual web pages in a chosen implementation language (e.g. HTML, XML, …), an instantiation of these page templates can be generated. For this, the templates are filled with the proper data to obtain the actual pages.
• Data Design: As explained in [6] we can derive an integrated conceptual schema from the object chunks made during Task Modeling. This integrated object schema is called the Business Information Model (BIM) and can be used as the basis for a database schema from which an underlying database can be created. The Data Design is only done when we deal with dynamically generated websites querying a database. For static web pages the Data Design step is omitted as the actual data will not originate from a database, but will be supplied by the designer during implementation. For our approach, we need to keep track of two mappings: 1) the mapping from the object types and relationships of the different object chunks to their counterparts in the integrated BIM (called the object chunk mapping) (B in Figure 1); and 2) the mapping between the BIM, used as the conceptual database schema, and the actual implementation (called the database mapping) (C in Figure 1). In this way we are able to determine the mapping between the queries specified at the (conceptual) level of the object chunks and the actual database.
• Implementation: In this phase of WSDM the actual implementation of a website, based on the models created in the previous phases, is generated. To this step we added the generation of the actual annotation of the website (called the page annotation) (D in Figure 1). Here we have to distinguish between static websites and dynamically generated websites. For static websites only the conceptual annotation is needed. For dynamic websites the chunk integration and the database mapping also have to be taken into consideration. A more detailed explanation is given in section 5.

Figure 1 - Architectural overview

4. Conceptual Annotation

To explain the different steps in our approach, we introduce a simple example situated in the domain of universities. Assume the following two requirements for a university website:
1. We want to be able to retrieve a list of all the labs with their associated research domain(s) and the name of the professor who is the head of the lab.
2. It must be possible to see some detailed information of all employees (professors, assistants, technical personnel, …) working for a certain department.
These requirements are formulated during the Audience Modeling phase of the WSDM method. The information needed to fulfill these requirements is expressed by means of two object chunks given respectively in Figure 2 and
Figure 3. These object chunks are constructed during Task Modeling.

Figure 2 - Object Chunk LabOverview

Figure 3 - Object Chunk EmployeeOverview

As already explained, while creating an object chunk, the designer performs the conceptual annotation. The designer will create associations between the concepts used in the object chunk (the object types, e.g. ‘Professor’, ‘Lab’, ‘research domain’, …, and the roles, e.g. ‘works for’, ‘has name’, …) and semantically equivalent concepts defined in one or more ontologies. In this way, we allow designers to define the meaning of the different object types and roles they introduce during conceptual modeling. As already indicated, this conceptual annotation is used to automatically generate the actual page annotation for the website implementation.

The conceptual annotation is defined as a mapping from the different object chunk entities (object types and roles) onto the different ontology entities (concepts and relationships). We distinguish between three different cases:
• One-to-one mapping: an object chunk entity can be mapped in a one-to-one way onto an ontology entity;
• One-to-many mapping: an object chunk entity cannot be mapped onto one single ontology entity but onto a combination of ontology entities;
• Many-to-one mapping: an object chunk entity cannot be mapped onto one single ontology entity but a combination of object chunk entities can be mapped onto a single ontology entity.

The conceptual annotation is illustrated for the ‘LabOverview’ object chunk (see Figure 2) in Table 1. The left column contains the different object chunk entities; the right column lists the corresponding ontology concepts and relationships. The ontology used is omitted from this paper due to space limitations. Note that for the entities ‘first_name’ and ‘surname’, we define the conceptual annotation as a many-to-one mapping between the tuple <‘first_name’, ‘surname’> and the ontology concept ‘name’. It would be incorrect to define a direct annotation between the object type ‘first name’ and the ontology concept ‘name’ and/or between the object type ‘surname’ and the ontology concept ‘name’. The other conceptual annotations are all defined as one-to-one mappings.

Object Chunk Entities          | Ontology Entities
Professor                      | Professor
Lab                            | Lab
<‘first_name’, ‘surname’>      | name
research domain                | researchField
has name                       | hasName
name                           | labName
…                              | …

Table 1 - Conceptual Annotation example

We conclude this section with a possible outline of an implementation of this conceptual annotation. For a one-to-one mapping, the annotation is straightforward as we can define a direct link between the entity in the object chunk and the ontology entity. This is not possible for the one-to-many and many-to-one mappings. To solve this, we introduce an intermediate ontology, called the Extended Ontology. This ontology extends the ontologies used; it contains new entities that are constructed from the existing ones by applying some operators (e.g. concatenation) to these entities. We introduce this intermediate ontology because it is not always allowed to modify or extend an existing ontology (e.g. because of a lack of sufficient permissions). For our example, the Extended Ontology will contain three new concepts: ‘first_name’, ‘surname’ and ‘name’, where we define ‘name’ 1) as equivalent to the concept ‘name’ in the original ontology; and 2) as the concatenation of ‘first_name’ and ‘surname’ in the Extended Ontology. Then, a one-to-one mapping from the object type ‘first name’ to the Extended Ontology concept ‘first_name’ and from the object type ‘surname’ to the Extended Ontology concept ‘surname’, respectively, is possible.

5. Generating the page annotation
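For static pages, the inputs to this generation step are the conceptual annotation of section 4 (Table 1 and the Extended Ontology) and the Page Design. The following Python sketch illustrates that lookup under assumptions. The dictionary layout, the `ext:` prefix for Extended Ontology concepts, the page name `labs.html` and both helper functions are hypothetical; this is not the implementation used in WSDM.

```python
# One-to-one conceptual annotations of the 'LabOverview' chunk, following Table 1.
one_to_one = {
    "Professor": "Professor",
    "Lab": "Lab",
    "research domain": "researchField",
    "has name": "hasName",
    "name": "labName",
}

# Many-to-one case, expressed through the Extended Ontology (assumed layout):
# 'name' is equivalent to the original concept 'name' and is defined as the
# concatenation of the Extended Ontology concepts 'first_name' and 'surname'.
extended_ontology = {
    "name": {"equivalent_to": "name", "concatenation_of": ["first_name", "surname"]},
}
# Object types that map one-to-one onto Extended Ontology concepts.
to_extended = {"first name": "first_name", "surname": "surname"}

# Assumed Page Design output: object chunk entities placed on each page.
page_design = {
    "labs.html": ["Lab", "name", "research domain", "Professor", "first name", "surname"],
}


def page_annotation(page):
    """Map every object chunk entity on a page to its ontology entity."""
    result = {}
    for entity in page_design[page]:
        if entity in one_to_one:
            result[entity] = one_to_one[entity]
        elif entity in to_extended:
            result[entity] = "ext:" + to_extended[entity]
    return result


def derived_concepts(annotation):
    """Extended Ontology concepts whose parts are all annotated on the page."""
    present = {v[len("ext:"):] for v in annotation.values() if v.startswith("ext:")}
    return [concept for concept, definition in extended_ontology.items()
            if set(definition["concatenation_of"]) <= present]


annotation = page_annotation("labs.html")
print(annotation)
print(derived_concepts(annotation))
```

Keeping the Extended Ontology in a separate structure mirrors the point made in section 4: the original ontology is left untouched, and the concatenation is recorded alongside it.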
Starting from the conceptual annotation provided by the designer(s), the actual page annotation can be generated. Note that the conceptual annotation is the only information that is requested from the designers (concerning the annotation process), as the following steps can be done automatically. For this generation process a distinction has to be made between static and dynamic websites. For static web pages, at this point of the method, all necessary information has been gathered. Through the conceptual annotation we can trace which ontology concepts are associated with the object chunk entities, and through the Page Design we know which object chunk entities will be implemented on a page.

5.1 The object chunk mapping

In case of a website dynamically generated from the content of a database, a database design needs to be done. In WSDM, the database design is done during the Data Design phase by integrating the different object chunks into one integrated schema, called the Business Information Model (BIM). The conceptual annotation can be used to drive the integration process as it identifies semantically equivalent and related object types (e.g. it can be derived that the object type ‘Professor’ in the ‘LabOverview’ object chunk is a subtype of the object type ‘Employee’ in the ‘EmployeeOverview’ object chunk if the ontology concepts linked with these object types are also involved in a subtype relationship). For a more in-depth overview of the object chunk integration itself, we refer to [6]. We illustrate the chunk integration with our example. Assume that the conceptual design only consists of the two object chunks given in Figures 2 and 3. Then, the integrated schema is as shown in Figure 4. During integration it was recognized that a ‘name’ (of an employee) is equivalent to the concatenation of ‘first name’ and ‘surname’ (of this employee) (this can e.g. be derived from the conceptual annotation). Therefore, it was decided to keep the ‘first_name’ and ‘surname’ for an employee and to drop the ‘name’: the object type ‘name’ (as an employee’s complete name) is not included in the BIM because it would be superfluous.

It should be noted that it cannot be guaranteed that there always exists a one-to-one mapping between an object chunk entity and an entity of the BIM. In general, an entity of an object chunk is mapped onto a view of the BIM. Let us illustrate this with our example. Take for instance the role ‘has as first name’ defined between the object types ‘Professor’ and ‘first name’ in our ‘LabOverview’ object chunk (see Figure 2). In the integrated BIM this information is modeled by means of a role that is more general, i.e. a role between ‘Employee’ and ‘first_name’, and the fact that ‘Professor’ is a subtype of ‘Employee’. Then, the mapping of the role ‘has as first name’ is as follows:

‘has as first name’ → ‘has as first name’ where <‘Employee’ is ‘Professor’>

Note that ‘has as first name’ where <‘Employee’ is a ‘Professor’> is the view expressing that we should only consider the role ‘has as first name’ for those ‘Employee’ instances which are also instances of ‘Professor’.

If we consider the second object chunk (Figure 3) with the object type ‘name’, the mapping of this object type would be as follows (‘X’ is the operator to express the Cartesian Product):

‘name’ → ‘first_name’ X ‘surname’

Figure 4 - BIM example

Figure 5 - Database schema

5.2 The database mapping

The next step is to generate an actual (relational) database schema from the BIM. This can be done using one of the known mapping algorithms for ORM, for example RMap [8]. Which mapping algorithm is used is not important, but the mapping has to be made explicit. This is essential, because we need to know in which database columns we can find the instances of a particular object type or role. Again, we cannot assume that the algorithm maps an object type or role to exactly one
column. In general, an entity of the BIM will be mapped Head:
onto a (relational) view in the data schema.
We have applied the RMap algorithm to our example.
Figure 5 shows the resulting database schema. Note that
the column ‘function’ in the table ‘Employee’ is used to
check if an ‘Employee’ instance is an instance of
‘Professor’ (for professors, the value of ‘function’ will be …
‘P’). If we take the object type ‘Professor’ (see Figure 4),
we see that this object type is mapped on a part of the