=Paper=
{{Paper
|id=Vol-105/paper-12
|storemode=property
|title=Designing and Creating a Web Site Based on RDF Content
|pdfUrl=https://ceur-ws.org/Vol-105/swehg_www2004.pdf
|volume=Vol-105
|dblpUrl=https://dblp.org/rec/conf/www/HyvonenHV04
}}
==Designing and Creating a Web Site Based on RDF Content==
Eero Hyvönen, Markus Holi, and Kim Viljanen
Helsinki Institute for Information Technology (HIIT), University of Helsinki
P.O. Box 26, 00014 UNIV. OF HELSINKI, FINLAND,
FirstName.LastName@cs.Helsinki.FI
http://www.cs.helsinki.fi/group/seco/
Abstract
This paper presents a method and a tool for designing and automatically creating an HTML web site for publishing Semantic Web content represented in RDF(S). The idea is to specify the needed RDF to HTML transformation on two separate levels. On the HTML level, the layout of the pages is described by an HTML layout designer using templates and tags. On the RDF level, the semantics of the tags are specified by a system programmer in terms of logical rules based on the RDF(S) repository. Logic is thus applied to defining the semantic linkage structure and the indices of the page repository. The method has been implemented as a tool called SWeHG for generating a static, semantically linked site of HTML pages from an RDF repository. As real-life case applications, web exhibitions generated from museum collection metadata are presented.
1. Two Views of the Semantic Web
The notion of the Semantic Web¹ [1, 3] has two interpretations. From the machine’s viewpoint, the Semantic Web manifests itself as a distributed source of interpretable metadata concerning resources, such as web pages², documents, photos, and real-world objects. The metadata descriptions are given in terms of ontologies using frameworks and languages such as RDF(S)³ and OWL⁴. From the human’s viewpoint, the Semantic Web looks like the current web, i.e., it is a repository of HTML pages, but empowered with more useful semantics-based links, search engines, and intelligent web services.
¹ http://www.w3.org/2001/sw/
² See, e.g., http://dmoz.org.
³ http://www.w3.org/RDF/
⁴ http://www.w3.org/2001/sw/WebOnt/
Figure 1. Rendering RDF(S) content as an HTML web site.
A central question in the development of Semantic Web applications is how the content represented for the machine can be transformed for the human to view, i.e., how machine-interpretable RDF(S) or OWL content proliferating on the web can be rendered to the human end-user as a searchable and browsable HTML web site or space. In this paper we present a new approach and tool named “Semantic Web HTML Generator” (SWeHG) [9] to address this problem (cf. figure 1). The idea is to specify the structure and the layout of an HTML web site in terms of a set of HTML templates using a tag language. The templates can be used by a web layout designer who does not know the details of the underlying RDF(S) content or Semantic Web technologies. The semantics of the tags, i.e., the machine’s view on the RDF level, is specified by a Semantic Web programmer in terms of logic predicates. A benefit of separating the HTML and RDF levels is that ontological details and variance can be hidden from the HTML designer. By modifying the semantics of a tag, content represented using different ontological structures can be mapped onto the same HTML tags that the HTML designer is capable of using. The tag definitions can be re-used directly in applications based on similar ontologies and annotation schemas. The templates provide a declarative description of the web site structure, indices, and linkage. By modifying the templates alone in HTML, the same RDF(S) content can more easily be rendered in different ways in different applications to human end-users.
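The separation of the two levels can be illustrated with a minimal Python sketch. It is not SWeHG code: SWeHG defines tag semantics as Prolog predicates, whereas here plain functions stand in for the rules. The placeholder syntax `$persons`, the property names `ex:subject` and `ex:playedBy`, the sample triples, and all function names are hypothetical. The same template is rendered against two differently structured annotation schemas by swapping only the rule bound to the tag.

```python
from string import Template

# Hypothetical triples (subject, predicate, object) in two annotation schemas.
DIRECT = [("photo1", "ex:subject", "Linus Torvalds")]
VIA_ROLE = [
    ("photo1", "ex:subject", "role1"),
    ("role1", "ex:playedBy", "Linus Torvalds"),
]

def persons_direct(triples, photo):
    """Persons attached to the photo directly."""
    return sorted(o for s, p, o in triples if s == photo and p == "ex:subject")

def persons_via_role(triples, photo):
    """Persons attached to the photo through an intermediate role instance."""
    roles = {o for s, p, o in triples if s == photo and p == "ex:subject"}
    return sorted(o for s, p, o in triples if s in roles and p == "ex:playedBy")

# The layout designer only sees the template and the tag name.
TEMPLATE = "<p>Persons: $persons</p>"

def render(template, rules, triples, resource):
    """Evaluate every rule for the resource and fill in the template."""
    values = {name: ", ".join(rule(triples, resource))
              for name, rule in rules.items()}
    return Template(template).substitute(values)

html_a = render(TEMPLATE, {"persons": persons_direct}, DIRECT, "photo1")
html_b = render(TEMPLATE, {"persons": persons_via_role}, VIA_ROLE, "photo1")
```

Both calls yield the same HTML although the underlying triples differ, which is the point of hiding ontological variance behind the tag definition.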
In the following, we first discuss two examples of semantically indexed and linked HTML web sites generated by SWeHG. The layout specifications with the corresponding tag definitions needed for the RDF to HTML transformation are then discussed. After this, the transformation process and its implementation are presented. In conclusion, experiences of our research and experimentation are summarized, related work is described, and directions for further research are outlined.
2. Example Applications
2.1. Helsinki University Museum
A virtual exhibition was generated from a photo archive of the Helsinki University Museum⁵. The archive contained 629 photographs of the promotion ceremonies of the University of Helsinki. The content of the archive had been transformed into RDF(S) format in another application project [7] and was used as is by SWeHG. The domain knowledge consists of six ontologies with 329 promotion-related concept classes, such as “Person” and “Building”, 125 properties, and 2890 instances, such as “Linus Torvalds” and the “Entrance of Cathedral of Helsinki”.
In the photo annotation schema, the subject of a photograph is represented by a collection of ontology classes and individuals that appear in the image⁶. For example, if Linus Torvalds appears in a photo on a particular street, then the photo record is related directly to the corresponding person and street resources by a property corresponding to dc:subject. However, the relation between photos and subjects can also be indirect, involving traversal through several RDF arcs in the underlying knowledge base. For example, if Linus Torvalds is present in a photograph as an Honorary Doctor, then only an instance of that role is associated with the image. The person instance is not directly linked with the image, but indirectly through the role instance. The SWeHG predicate definition facility is very handy in hiding such annotation schema specific details from the HTML designer: the persons can be associated with images either directly or indirectly through roles. The criterion for association can be defined freely and conveniently by a declarative predicate.
Using SWeHG to publish the archive provides the end-users with two services. First, the photos can be found along the different orthogonal views based on the ontologies. Second, the photos can be browsed by using the links created between semantically related photos. The links are grouped based on the semantics of the link. For example, there is a link group that points to other photos taken of the same person.
⁵ http://www.helsinki.fi/museo/
⁶ The annotations also include other metadata, such as the photographer, free text descriptions, some technical information of the images, etc.
2.2. Espoo City Museum
Figure 2 presents the home page of the exhibition “Espoo City Museum on the Semantic Web” that was generated using SWeHG for the museum⁷. Seven RDF(S) ontologies are used with some 10,000 classes and individuals, and the metadata is described in terms of 38 properties. The RDF(S) repositories were originally created for the semantic portal MuseumFinland [6]. In this work, we could re-use the semantic recommendation predicates and the inference rule base developed for the original system, and the exhibition could be generated in a day or two.
Figure 2. A photo exhibition generated with SWeHG.
⁷ The exhibition is on the web at http://www.cs.helsinki.fi/group/seco/swehg/ekmdemo/
In the RDF(S) repository, each ontological property of the collection objects in the exhibition, such as “material”, is associated with a domain ontology of its own. For example, artifact, material, and technique ontologies have been defined based on the Finnish MASA Thesaurus [10] of keywords used in several museums for indexing data. The ontology MAO [8] created based on MASA contains some 6600 classes organized in a taxonomy. There is also a location ontology that defines geographical concepts such as “country” and “town”. Their instances are individual areas
and places. The places are related with each other by a part-of meronymy. In the same way, an agent ontology defines concepts such as “person” and “company”, whose instances are active individuals. There is also an ontology for time periods and an ontology of collections in different museums. Still another ontology of “activities and processes” contains a taxonomy of concepts such as “wedding” and “fishing”. It is used to provide the end-user with an event-based view to cultural artifacts by associating them with corresponding events through annotations and logical rules. Each object’s metadata and annotations are given in an RDF card that points to different classes and instances of the ontologies by the respective URIs through RDF properties. Some of the properties in an RDF card have literal values, and some point to resources by using URIs.
The created HTML site consists of some 1200 resource web pages (RPage) describing objects in the museum’s collection database, pages indexing the contents along different classifications, and a short user’s guide. On the left in figure 2, three frames containing indices for the underlying content are seen. The alphabetical index (“Aakkostettu hakemisto”) contains links to the RPages in alphabetical order. By selecting a link, the respective RPage is shown on the right. In figure 2, the user has selected a link to an RPage depicting perfume bottles. Before making a selection, the user’s guide was shown in the same frame. The classified index (“Hakemisto aiheittain”) is based on the RDFS taxonomy of the underlying cultural MAO ontology [8] that was used when creating the collection metadata. When selecting a concept, the rightmost frame shows links to its subconcepts together with links to RPages whose objects are directly related to the concept. By selecting a subconcept link there, the taxonomy can be browsed further downward; by selecting a link to an RPage, the corresponding collection object with its metadata can be viewed in the frame. The third index (“Hakemisto tapahtumittain”) classifies the collection objects by associating them with the different events, processes, or activities in which the objects are used or to which they are otherwise related.
By using the indices, the user can find collection objects of interest. An alternative way is to use a conventional search engine. In the upper right corner of figure 2, a form for using Google to search for the pages in the repository is seen. The hit list will be shown in the rightmost frame.
After finding an RPage of interest, the collection can be browsed by using the semantic links generated between related collection items. For example, in figure 2 links to objects manufactured at the same location, objects of similar material, etc. can be clicked. The semantic links are generated based on the underlying ontologies, metadata, and logical recommendation rules.
The museum can publish the content by just copying the pages into a public HTML directory. This is of practical importance, since museums typically do not have competent IT personnel, servers, and resources to create and maintain semantic portals of their own.
To sum up, the output of SWeHG is a semantically linked space of HTML pages of the following kinds: 1) Resource pages (RPage) depict selected resources with their metadata. 2) Index pages (IPage) classify RPages along conceptual hierarchical classifications that will be called facets or views [11]. By using IPages, RPages can be found along different facets. 3) A home page (HPage) defines the entrance page to the HTML repository.
3. Specifying the Transformation
Figure 3. Transforming an RDF repository into HTML pages. [Diagram: an RDF repository with resources R1 to R7 is transformed, using templates and rules, into an HTML repository consisting of an HPage, IPage1 and IPage2, and the RPages R1Page to R5Page.]
Figure 3 depicts the RDF to HTML transformation. The RDF graph is on the left. Each [symbol] corresponds to a resource that represents a data entry in the RDF repository. In our example, the data entries are collection objects with their metadata. On the right, the HPage has links to various IPages classifying the underlying RPages that are related with each other by semantic links.
The transformation is based on descriptions on two levels: 1) The layout of the HTML pages is described on the HTML level by templates using custom tags. 2) The semantics of the tags is defined on the RDF level in terms of logical rules based on the input RDF(S) content. The idea is that an HTML designer can design the layout of the page repository to be generated by using tags without knowing details of the underlying RDF structures, RDFS ontologies, and Prolog programming. RDF(S)-related knowledge and Prolog programming skills are needed only by the system programmer when defining the tags. The same tag definitions can be re-used in applications conforming to
similar ontological schemas.
SWeHG provides the HTML designer with three major tags: getProperty, getLinks, and getView. The tag getProperty is used for rendering a label related to the resource underlying an RPage. For example, the metadata property values of the bottles and the photo in figure 2 are rendered in this way. The relation can be specified freely by the system programmer on the RDF level by a binary logical predicate.
The tag getLinks is used for rendering links between RPages. For example, the tag [...] could expand into the following HTML code linking photographs taken at the same location:
- View from Eiffel-tower
- Cafe Parisienne
...
On the RDF level, the criterion SameLocation for the links is defined by rules⁸. The rule swehg_relation_rule below associates the attribute value SameLocation with the HTML link label “Same Place” and with the predicate photosWithSameLocation defining the link relation.
    % Bind the attribute value 'SameLocation' to a link label and a predicate.
    swehg_relation_rule('SameLocation', 'Same Place', photosWithSameLocation).
    % Two distinct photos are linked if they share the same place.
    photosWithSameLocation(Context, Target) :-
        photo(Context), photo(Target),
        rdf(Context, _:place, Location),
        rdf(Target, _:place, Location),
        Context \== Target.
⁸ The examples are presented in SWI-Prolog (http://www.swi-prolog.org) syntax. Here RDF triples are presented as rdf(Subject, Predicate, Object). Underscore “_” is an unnamed variable.
The tag getView renders into a hierarchical index-like view of category resources used in IPages. Each category is associated with a set of subcategories and additional individuals of the categories. A view is defined by specifying 1) the root resource selector, 2) a binary subcategory relation predicate, and 3) a binary relation predicate that maps the hierarchy categories to the individuals used as leaves in the view. For example, the tag [...] expands recursively into a hierarchical unordered tree (ul), where the leaves are links to photo record resources related to different building categories. The predicate definitions defining the meaning of the attribute values can be, for example, the following:
    % Root selector.
    buildings(URI) :-
        rdf(URI, rdf:type, 'http://some.org#building').
    % Subcategory relation.
    subclass(SubCategory, SuperCategory) :-
        rdf(SubCategory, rdfs:subClassOf, SuperCategory).
    % Leaf relation: photo records whose subject is an instance of the class.
    photoOf(Class, Record) :-
        rdf(Instance, rdf:type, Class),
        rdf(Record, dc:subject, Instance).
Here buildings selects the class building as the view root, and the hierarchy is expanded along the rdfs:subClassOf property. The photoOf predicate relates each building type of this tree with a set of photo record resources which are used as the leaf categories of the view. These are rendered as HTML links to the corresponding RPages. The tag definitions could also be much more complex than this, depending on the structure [...]. The view expansion into HTML can be controlled with the help of additional tag attributes for, e.g., ordering the categories.
The following is an example of a complete RPage template. It could be used for rendering the images using the HTML img-tag and links to related RPages:
    [...]
    " />
    Photos from the same place:
    [...]
The tag attribute selector of the template tells the criterion for selecting context resources from the RDF repository. Each context resource will have an RPage of its own on the HTML level. The attribute value, here photo, is the name of a unary Prolog predicate called selector that should evaluate true for context resource URIs.
An example of a complete IPage template is given below using the view definitions above:
    Building index
    [...]
    branches="subclass"
    leaves="photoOf"
    orderby="order_alphabetically"
    listType="ul"/>
4. Web Site Generation
The process for transforming an RDF(S) repository into HTML pages is defined by algorithms 1 and 2. The input of the procedure is a set of HTML templates and an RDF(S) repository. The output is an HTML page repository conforming to the templates. The transformation is based
on a set of logical rules for selectors, properties, links, and views.
The pages are generated using the HTML templates one after another. If a template is associated with a selector, then it is expanded into a set of RPages corresponding to the selected context resources; otherwise it is expanded once without a reference to a context resource. In the latter case, the HPage and IPages are created. When generating an HTML page, the tags are expanded into HTML in the ways described in the previous section.
    Algorithm: RDF2HTML
    Data: Templates T, RDF(S) repository R
    HTMLPageRepository H = empty;
    foreach Template t in T do
        if t has a selector rule S then
            foreach RDF Resource r in R do
                if S(r) == true then
                    h = createHTMLpage(t, r);
                    add h to H;
                end
            end
        else
            h = createHTMLpage(t);
            add h to H;
        end
    end
Algorithm 1: Main procedure for the RDF to HTML transformation.
    Algorithm: createHTMLpage
    Data: Template t, Context Resource r
    Result: HTML page
    String H = t;
    foreach Tag in H do
        h = executeRule(Tag.rulename, r);
        replace Tag in H with h;
    end
    return H;
Algorithm 2: Algorithm createHTMLpage for rendering an HTML template. Tag.rulename returns the name of the rule, e.g., “getProperty”.
Figure 4 depicts the architecture of our implementation. The main program is a Perl script which first builds an XSLT⁹ template out of the HTML templates using the module “Template processor”. This module also writes out a set of “Processing instructions” into a separate Prolog source code file. These instructions link template tags with the Prolog predicates used in them as attribute values. The module “XML page generator” is a Prolog program that applies the predicates used in the HTML tags with respect to the RDF repository according to the Processing instructions. The result is a set of XML files describing the page contents. These XML files are then transformed into the final HTML pages using Apache Xalan¹⁰ with the help of the XSLT templates generated earlier.
Figure 4. Internal architecture of SWeHG. [Diagram: the inputs (RDF(S) repository, Prolog predicates, HTML templates) pass through the template processor, XML page generator, XSL transformer, and link analyzer, via processing instructions, layout XSL, and page content XML, to produce the outputs (HTML pages and a linkage analysis report).]
⁹ http://w3.org/TR/xslt
¹⁰ xml.apache.org/xalan-j/
The intermediate XML files in figure 4 are also used as a basis for the “Linkage analyzer” module that tries to identify the following potential problems: Self loops (a link that points to the page itself), Bad links (a link pointing to a nonexistent page), Dead ends (an RPage with no outbound links), No way in (an RPage with no inbound links from any RPages or IPages), Not in index (an RPage with no inbound links from any IPage), and Unused rules (rules that are never referred to when generating the HTML repository). The analysis results are represented as HTML pages. This helps the designer in debugging the specifications.
Figure 5. An analysis page created by SWeHG.
Figure 5 depicts a portion of the result from the analyzer.
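The link-based checks of the linkage analyzer can be sketched in Python as follows. This is an illustration under assumptions, not the SWeHG module itself: the generated site is modeled as a hypothetical dictionary from page name to the list of link targets, the page kinds are passed in as sets, and the function name `analyze_links` is invented. The check for unused rules is left out because it needs the rule base rather than the link graph.

```python
def analyze_links(links, rpages, ipages):
    """Classify linkage problems in a generated page set.

    links  -- dict: page name -> list of link target names (assumed model)
    rpages -- set of resource page names
    ipages -- set of index page names
    """
    pages = set(links) | rpages | ipages
    report = {"self_loops": set(), "bad_links": set(), "dead_ends": set(),
              "no_way_in": set(), "not_in_index": set()}
    inbound = {page: set() for page in pages}
    for source, targets in links.items():
        for target in targets:
            if target == source:
                report["self_loops"].add(source)          # page links to itself
            elif target not in pages:
                report["bad_links"].add((source, target))  # target does not exist
            else:
                inbound[target].add(source)
    for page in rpages:
        if not links.get(page):
            report["dead_ends"].add(page)       # no outbound links at all
        if not inbound[page] & (rpages | ipages):
            report["no_way_in"].add(page)       # unreachable from RPages and IPages
        if not inbound[page] & ipages:
            report["not_in_index"].add(page)    # not listed by any IPage
    return report

# Hypothetical four-page site.
site = {
    "index": ["r1"],
    "r1": ["r2", "r1"],
    "r2": ["missing"],
    "r3": [],
}
report = analyze_links(site, rpages={"r1", "r2", "r3"}, ipages={"index"})
```

In this toy site, r2 is reachable from r1 but from no index page, which corresponds to the kind of finding reported for individual pages in figure 5.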
On the analysis page, the number of incoming and outgoing links can be seen for each RPage together with a status explanation. The analyzer has found out that the page with label “Aikaisempien yleisten ...” is not connected with any other page or index. Furthermore, the page “Airueet” has one incoming and two outgoing links but was not included in any index. This kind of connectivity information is vital when debugging the logical rules that produce the HTML pages.
5. Discussion
5.1. Benefits and Limitations
Our initial experiences indicate that the presented RDF to HTML transformation method is feasible. HTML templates can be created fairly easily and can be adapted to different RDF repositories. Moreover, changes in ontology versions do not affect the usage of the templates on the HTML level in any way. The idea of using logic and Prolog for defining the semantics of the tags seems powerful. Complicated semantic link relations and views can be defined and modified easily thanks to the declarative nature of logic programming. By using generic rules it is possible, in principle, to create tag definitions that will apply to any RDF repository. In contrast to view-based search systems, such as [11, 5], the views are projected from the RDF(S) ontologies. The main benefit is that arbitrary mappings between view categories and data resources can be flexibly defined. The system infers the mapping between views and resources, which gives it an “intelligent” flavor. Furthermore, the HTML pages are linked semantically with each other according to the ontologies, metadata, and rule base used. To the end-user, the underlying hidden associations between collection objects are a most interesting aspect of cultural collections. The nature of the associations can be explained to the user by the labels of the links.
The tag definitions are not application specific and can also be used in different applications that use the same RDF(S) content. For example, we could use the linkage rules, the selector rules, and the rules generating the views of the indices of the “Espoo City Museum on the Semantic Web” developed originally for a semantic portal [6]. On the other hand, the tag language of SWeHG is limited, and it cannot be extended easily. Also, the set of different HTML outputs that the tags produce is limited at the moment. The output varies from simple strings to lists of links. In addition, SWeHG does not offer sufficient tools for testing the tag definitions before the actual transformation. A preview or debug function would be useful, because the transformation process is long when the RDF(S) database is large.
SWeHG generates static pages in a batch process before publishing them on the web. This approach has the following benefits when compared with dynamic semantic portals: The page repository can be published easily by just copying it into a public HTML directory. SWeHG can be adapted to different contents conforming to different ontologies. The publication process is independent of semantic portal providers: no special server software is needed. The pages need no special maintenance. The static pages are indexed and searched for by general search engines. The pages can be viewed efficiently. Data security problems are minimal. The properties of the resulting HTML page set can be analyzed efficiently.
On the other hand, the static approach taken in SWeHG also has, of course, its limitations. First, static pages cannot adapt their content dynamically to different users or patterns of usage. Second, dynamic systems can be connected more easily with other services providing additional functionality. Third, if the RDF repository, the rules, or the HTML templates change, the site usually has to be regenerated from scratch. Dynamic systems can adapt better to such changes. Fourth, if the RDF repository is large and many templates are used, then the number and size of generated pages can be large.
Clearly, both the dynamic and static approaches have their own virtues and application possibilities.
5.2. Related Work
Logic and dynamic link creation on the semantic web have been discussed, e.g., in [4, 2]. Our approach is different in its use of HTML templates and Prolog for describing the static HTML output. In the RDF Twig tool¹¹ the RDF to HTML transformation is based on XSLT. A problem here is that an RDF graph can be serialized in many ways in XML. Different applications may produce different XML serializations of the same RDF graph, and thus a number of XSLT templates would have to be written for a single graph. In our approach only actual changes in the graph structures are relevant, because in SWI-Prolog, by which we define the logical rules, the RDF graph is processed purely as triples. In Spectacle¹² the RDF to HTML transformation is based on APIs. Then the user must write programs that use the API, and an application server is also needed. In contrast, our approach is based on tags, is declarative, and the result is a set of static pages whose linkage structure is inferred by logical linking predicates.
¹¹ http://rdftwig.sourceforge.net/
¹² http://www.aidministrator.nl/spectacle/
5.3. Directions for Future Work
SWeHG is a research prototype. More work and testing are still needed in order to evaluate and enhance the usability and extensibility of the system in different applications. More
work is also needed in optimizing the efficiency of the code and in providing better development tools for the HTML designer and system programmer using the system.
Acknowledgments
A. Valo contributed significantly to the implementation of SWeHG. Also M. Kiesilä, V. Komulainen, R. Köppä-Laitinen, and J. Muhonen participated in the implementation project. Our work was mainly funded by the National Technology Agency Tekes, Nokia Corp., TietoEnator Corp., the Espoo City Museum, the Foundation of the Helsinki University Museum, the National Board of Antiquities, and the Antikvaria-group of some 20 Finnish museums.
References
[1] T. Berners-Lee, J. Hendler, and O. Lassila. The semantic web. Scientific American, 284(5):34–43, May 2001.
[2] P. Dolog, N. Henze, and W. Nejdl. Logic-based open hypermedia for the semantic web. In Proceedings of the Int. Workshop on Hypermedia and the Semantic Web, Hypertext 2003 Conference, Nottingham, UK, 2003.
[3] D. Fensel, J. Hendler, H. Lieberman, and W. Wahlster, editors. Weaving the Semantic Web. The MIT Press, 2002.
[4] C. Goble, S. Bechhofer, L. Carr, D. De Roure, and W. Hall. Conceptual open hypermedia = the semantic web? In Proceedings of the WWW2001, Semantic Web Workshop, Hong Kong, 2001.
[5] M. Hearst, A. Elliott, J. English, R. Sinha, K. Swearingen, and K.-P. Lee. Finding the flow in web site search. CACM, 45(9):42–49, 2002.
[6] E. Hyvönen, M. Junnila, S. Kettula, E. Mäkelä, S. Saarela, M. Salminen, A. Syreeni, A. Valo, and K. Viljanen. Finnish Museums on the Semantic Web. User’s perspective on MuseumFinland. In Proceedings of Museums and the Web 2004 (MW2004), Arlington, Virginia, USA, 2004. http://www.archimuse.com/mw2004/papers/hyvonen/hyvonen.html.
[7] E. Hyvönen, S. Saarela, and K. Viljanen. Ontogator: combining view- and ontology-based search with semantic browsing. In Proceedings of the XML Finland 2003 conference, Kuopio, Finland, 2003. http://www.cs.helsinki.fi/u/eahyvone/publications/xmlfinland2003/yomXMLFinland2003.pdf.
[8] E. Hyvönen, M. Salminen, S. Kettula, and M. Junnila. A content creation process for the Semantic Web, 2004. Proceedings of OntoLex 2004: Ontologies and Lexical Resources in Distributed Environments, May 29, Lisbon, Portugal (forthcoming).
[9] E. Hyvönen, A. Valo, K. Viljanen, and M. Holi. Publishing semantic web content as semantically linked HTML pages. In Proceedings of XML Finland 2003, Kuopio, Finland, 2003. http://www.cs.helsinki.fi/u/eahyvone/publications/xmlfinland2003/swehg_article_xmlfi2003.pdf.
[10] R. L. Leskinen, editor. Museoalan asiasanasto. Museovirasto, Helsinki, Finland, 1997.
[11] A. S. Pollitt. The key role of classification and indexing in view-based searching. Technical report, University of Huddersfield, UK, 1998. http://www.ifla.org/IV/ifla63/63polst.pdf.