A Framework for Ontological Description of Archaeological Scientific Publications

Andrea Bonomi, Glauco Mantegari and Giuseppe Vizzari
Department of Informatics, Systems and Communication, University of Milan–Bicocca
Via Bicocca degli Arcimboldi 8, 20126 Milano, Italy
{andrea.bonomi, glauco.mantegari, giuseppe.vizzari}@disco.unimib.it


   Abstract: This paper describes the results of the first step in the development of a
comprehensive project aimed at realizing a portal and a set of advanced services
supporting the sharing of knowledge about Prehistory and Protohistory in the Italian
context. In particular, one of these services is a digital library whose entries (i.e.
bibliographic descriptions of publications) will be ontologically described. The paper
introduces the approach that was adopted to support the acquisition and representation
of ontological knowledge. The software modules that were developed to support these
phases allow, on the one hand, the management of the assertional component of the
ontology and, on the other, the association of the related entities with digital library
contents. These descriptions will be exploited to support effective strategies for
bibliographic information retrieval as well as semantic navigation schemes, through the
recommendation of contents related to the one currently being viewed.

I. INTRODUCTION

   The term e-Science is used to describe science performed through global
collaborations between scientists, enabled by Internet technologies, in order to solve
scientific problems [1]. Today, no researcher can work in isolation: his/her work
depends on the resources available in the scientific community. Publications provide
one of the main channels for the dissemination of scientific results, and it is very
important to have access to the right publications when they are needed. Moreover, in
most scientific fields, the number of publications is growing exponentially [2] and
finding the right information is correspondingly getting harder. The growth in the
number of scientific publications is not a new phenomenon: in 1960, Maron and Kuhns
reported that documentary data were being generated at an alarming rate, doubling every
12 years [3].
   Today there are many on-line digital libraries that help users find information about
scientific publications. There are two main approaches to populating these libraries:
manual editing of their contents and automatic population. The first approach is
generally used by libraries, publishers, editors, laboratories, researchers, and so on,
whereas the second is often adopted by Internet portals and general-purpose search
engines, like Google Scholar¹ or CiteSeer². These search engines actively retrieve new
documents and automatically tag and link the metadata inherent in the structure of a
scientific publication, without any human intervention. On the other hand, the libraries
that use the first approach are maintained with massive human effort, which is justified
only if the quality offered is much higher. For example, in DBLP³ (Digital Bibliography
& Library Project) the entries and the related information (authors, conferences,
journals, etc.) are manually standardized to guarantee that every entity is always
represented by the same string (e.g. every author name is always spelt the same way).
This process is necessary because different bibliographic information sources provide
information in different formats (e.g. some sources give full author names, while in
others names are abbreviated).
   Another difference between the two approaches concerns the description of publication
contents. The manual approach consists in (manually) associating publications with
keywords from a dictionary or a classification system. General classification systems
are available (e.g. the ACM Computing Classification System⁴ or the Dewey Decimal
Classification System⁵); however, they are extremely generic and do not support the
description of relations between different publications (e.g. stating that a
publication is part of a collection, or defining links between an article and a
technical report). The automatic approach consists in extracting keywords from the
full-text documents and associating them with the related publication description. This
approach requires access to the full-text documents in processable form (i.e. if the
documents are digitized from hard copies, they must first be processed by means of an
OCR tool).

  ¹ http://scholar.google.com/
  ² http://citeseer.ist.psu.edu/
  ³ http://dblp.uni-trier.de/
  ⁴ http://www.acm.org/class/
  ⁵ http://en.wikipedia.org/wiki/Dewey_Decimal_Classification

   This paper describes instead a manual description approach adopted in the specific
domain of Archaeology. This work is set in the wider context of a project aimed at
realizing a portal and a set of advanced services supporting the sharing of knowledge
about Prehistory and Protohistory in the Italian area. For these specific activities,
partners in Archaeology Departments participate in the project by providing their domain
knowledge, and also by providing the active participation of (thesis, master, PhD)
students who can carry out document description activities. However, in order to
effectively describe contents beyond a keyword-based approach, and thus to support
effective forms of information retrieval and semantic
navigation, human annotators must have available a domain ontology whose elements can be
selected as relevant indicators of the topics treated in the described publication. The
paper introduces the ontological description approach, as well as the software modules
developed to support the definition of the archaeological domain ontology and the
e-Library document annotation.
   The remainder of this paper is organized as follows. Section II provides a
description of the application scenario, which is followed by a discussion of related
work. In Section IV we discuss the chosen content description approach and in Section V
we describe the overall system architecture. We end with an outlook on possible future
extensions.

II. APPLICATION SCENARIO

   In the course of 2005, the chair of Prehistory and Protohistory of the University of
Milan, in collaboration with the Department of Informatics, Systems and Communication of
the University of Milan-Bicocca and the Department of Archaeology of the University of
Bologna, started a long-term project for the creation of a set of Web-oriented services
aimed at supporting the sharing of knowledge on prehistory and protohistory in Italy.
   The main objective of the project has been the creation of a Web portal, named
ArcheoServer⁶, which will provide a collaborative platform for the exchange of
scientific information among the communities of Italian archaeology researchers. A first
type of information concerns the preliminary results of research in progress (e.g. those
relating to the archaeological excavations conducted during the year), which are rarely
communicated to the scientific community before being revised and included in larger
studies. Moreover, the portal will provide easy access to more articulated and
analytical contributions on specific topics (e.g. those discussed in a PhD thesis or in
an article in a scientific journal), by means of the electronic publishing of
traditional papers.

          Fig. 1.   A screenshot of the ArcheoServer Home Page

   A particularly relevant section of the portal is devoted to an e-Library, which was
devised to supply an effective mechanism for the retrieval of digital resources related
to prehistory and protohistory and, in general, to archaeological research
methodologies. In fact, even though a growing number of initiatives provide for the
electronic publishing of scientific papers (for example, the digital archives of the
Italian Institute for Prehistory and Protohistory⁷ or the BibAr⁸ project hosted at the
University of Siena), their indexing by traditional search engines is often
unsatisfactory.
   The main requirement of the portal is to give the community itself the possibility of
autonomously managing the contents by means of simple editing tools. In this regard, we
must keep in mind that, in most cases, archaeologists have only limited technical
competence, and the development of a complex editing system may result in the failure of
the project.
   In our scenario there are two principal classes of editors. The first is represented
by the Archaeology students of the Universities involved in the project, who are
responsible for content creation; the second is represented by Archaeology professors or
researchers who, beyond creating contents, supervise the work of the students.
   However, our intent is also to create a platform for experimentation in computer
science research, focusing on those aspects that can lead to real innovation, such as
the semantic description and retrieval of contents. In fact, scientific publications in
archaeology reflect its strongly interdisciplinary nature in the richness and
articulation of their contents. For this reason it may be interesting to describe, by
means of a specific ontology, all the publications that will be archived in the
e-Library section, in order to provide advanced instruments for a more effective
retrieval of the specific information a user is interested in. Moreover, the system may
even suggest relevant contents that are semantically related to the ones the user is
currently viewing.
   Therefore, the e-Library must allow content editors to semantically describe all the
publications in a simple and collaborative way, through a simple web-based user
interface. In particular, this description will be performed manually by the students,
while archaeology professors and researchers will supervise the work and progressively
refine and maintain the domain ontology.

  ⁶ http://www.archeoserver.it/
  ⁷ http://www.iipp.it/
  ⁸ http://www.bibar.unisi.it/

   Since ontologies are complex to build and understand, the ontology terminological
component (roughly speaking, the structure of the ontology) has to be designed by
archaeology professors and researchers with the aid of knowledge engineers. In our
scenario, since structural modifications rarely occur after the initial design of the
ontology, an ontology editing tool external to the e-Library Web-based system can be
used for this activity. The e-Library only has to support the maintenance of the domain
ontology assertion component (the instances of the concepts defined in the
terminological component). Figure 2 shows a scenario that conveys how the different user
groups should interact with the main components of the application.
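
   The division of labour described in this section, with a terminological component
that is designed once and rarely changed, and an assertional component that is
maintained day by day inside the e-Library, can be illustrated with a minimal Python
sketch. It is not the framework's implementation: the role names, the DomainOntology
class and the "Bronze" instance are hypothetical, and the permission rules are a
simplified assumption based on the user groups named in the text.

```python
from dataclasses import dataclass, field

# Hypothetical role names: the paper names the user groups only informally.
KNOWLEDGE_ENGINEER = "knowledge_engineer"
ONTOLOGY_MAINTAINER = "ontology_maintainer"


@dataclass
class DomainOntology:
    # Terminological component: the concepts (classes) of the domain.
    concepts: set = field(default_factory=set)
    # Assertional component: maps each instance name to its concept.
    instances: dict = field(default_factory=dict)

    def add_concept(self, role, concept):
        """Structural change: assumed rare and reserved for knowledge engineers."""
        if role != KNOWLEDGE_ENGINEER:
            raise PermissionError(f"{role} may not edit the terminological component")
        self.concepts.add(concept)

    def add_instance(self, role, name, concept):
        """Routine maintenance: add an instance of an already defined concept."""
        if role != ONTOLOGY_MAINTAINER:
            raise PermissionError(f"{role} may not edit the assertional component")
        if concept not in self.concepts:
            raise ValueError(f"unknown concept: {concept}")
        self.instances[name] = concept


ontology = DomainOntology()
ontology.add_concept(KNOWLEDGE_ENGINEER, "Material")
# "Bronze" is an illustrative instance, not taken from the paper.
ontology.add_instance(ONTOLOGY_MAINTAINER, "Bronze", "Material")
print(ontology.instances)
```

   Keeping the two components behind separate operations mirrors the choice of editing
the structure with an external tool while exposing only instance maintenance through
the Web interface.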
[Figure 2: use case diagram. Actors and their actions: Knowledge Engineer (edit the
ontology terminological components), Ontology maintainer (maintenance of the ontology
assertion components), Bibliography editor (edit the bibliography, describe the
publications), End-User (search and navigation). Arrows labelled "edit" and "view"
connect these actions to the system components: the Ontology (T-Box and A-Box), the
Publications descriptions and the Bibliography.]

Fig. 2.   Use case displaying user groups and their actions over the system

   A non-functional requirement for the e-Library is the adoption of OWL⁹ (Web Ontology
Language) as the ontology language used to describe publication contents. OWL was
adopted because it allows ontological knowledge to be represented and exported in an
interoperable way.
   It must be noted that this paper reports the first results of the project, but we
also aim to apply this approach to the ontological description of other contents of the
portal, from images depicting findings and sites to specific elements of interest in the
webGIS (e.g. sites, settlements). The description of these aspects of the project,
however, is outside the scope of this paper.

III. ANALYSIS OF THE RELATED SYSTEMS

   Before choosing how to develop the e-Library, we analyzed different available
bibliographic information systems, semantic annotation frameworks, and OWL editors and
viewers. A summary of this analysis is given in this section.
   The e-Library is mainly inspired by DBLP¹⁰ (Digital Bibliography & Library Project).
DBLP is a Computer Science bibliography developed by the University of Trier that allows
searching a huge collection of bibliographic information (more than 800,000 publications
in October 2006) through an easy-to-use Web interface. The Web interface also allows
browsing the bibliography by following links to authors, citations, journals and
conferences. DBLP collects bibliographic information provided by publishers, editors and
so on. A detailed description of DBLP, its architecture, evolution and perspectives can
be found in [4]. Since DBLP was started as a prototype Web application in 1993, several
years before the birth of the Semantic Web initiative, it does not provide any form of
semantic description of the publications.
   CiteSeer¹¹ is another example of a Web-based scientific literature digital library,
which was developed by the NEC Research Institute. The aim of the project is not only to
create a digital library but also to provide algorithms, metadata, services, techniques,
and software that can be used in other libraries. CiteSeer offers users features similar
to those of DBLP but uses a different approach to populate the library: CiteSeer
actively retrieves new documents and automatically tags and links the metadata inherent
in an academic document's syntactic structure [5]. In our opinion, CiteSeer offers many
interesting features, but since it is not an open source product, we cannot use it for
our e-Library framework.
   Another application for assisting users in managing, searching, and sharing
bibliographic information is Bibster¹² [6]. It allows searching bibliographic
information on a distributed peer-to-peer network using Semantic Web technologies and
provides an easy way to share data with other users. Bibliographic data are represented
following the SWRC (Semantic Web Research Community) Ontology [7]. This ontology defines
a shared and common domain theory that represents a research community, its researchers,
topics, publications, tools, and the relations between them. However, Bibster does not
match our requirements, since the project requires a web-based e-Library application.
   Outside the bibliographic domain, we analyzed several ontology-based Web search
applications. OntoWeb [8] is a semantic portal through which knowledge can be gathered,
stored and accessed by members of a certain community. Knowledge retrieval and
extraction are based on the ontological annotation of documents. In the portal, the
hierarchical organization of the different concepts of the ontology is graphically
represented as a dynamic tree, in which users can view the instances of a class by
expanding the tree nodes and selecting the element of interest. OntoWeb graphically
displays only the relations between classes, not the relations between individuals. In
our opinion, this kind of visualization is not suitable for our requirements, because
the relations between specific e-Library contents (i.e. individuals) are extremely
relevant.

  ⁹ http://www.w3.org/TR/owl-features/
  ¹⁰ http://dblp.uni-trier.de/
  ¹¹ http://citeseer.ist.psu.edu/
  ¹² http://bibster.semanticweb.org/
  ¹³ http://www.openrdf.org/

   Sesame¹³ [9] is an open source framework with support for inferencing and querying on
RDF and RDF Schema. Although it is mainly a library for building applications that need
to work with RDF, Sesame comes with an interface that allows access to semantic
repositories through a Web browser. The interface supports both semantic queries and
navigation of the ontology via hyperlinks. However, this interface is not intended to
support end-users with little or no knowledge of ontology languages, and thus it offers
only basic support for our requirements. From the developer's point of view, the API
provided by Sesame is comparable to the Jena API. In an evaluation of different
knowledge bases presented in [10], Sesame appears to be faster than Jena. However, we
chose
Jena for developing our framework because Sesame lacks complete support for OWL.
   In order to develop the e-Library user interface, many ontology editors and
visualization tools were investigated. In our opinion, these applications are critical
because the diffusion of Semantic Web technology depends on the availability of
convenient and flexible tools for editing and browsing ontologies.
   The most popular ontology editor is Protégé¹⁴. It is a free, open source ontology
editor and knowledge-base framework. A detailed description of Protégé is outside the
scope of this paper and can be found in [11]. In our opinion, Protégé is one of the best
OWL editors, but its user interface is too complex for a user with no experience of OWL,
and it lacks some useful functions, such as the inspection of elements via hyperlinks
and comfortable editing and visualization facilities for the A-Box [12].
   An interesting Web-based OWL ontology exploration tool is OntoXpl, which is described
in [12]. In particular, an interesting feature of OntoXpl is its visualization facility
for the A-Box, which can be displayed as a tree whose nodes are individuals and whose
arcs are properties. This kind of visualization is suitable for A-Boxes with many
individuals. OntoXpl also supports the inspection of ontology elements via hyperlinks.
OntoXpl inspired the design of the framework's user interface, particularly the
navigation tree and the A-Box Editor.

IV. CONTENT DESCRIPTION APPROACH

   Following the previously introduced requirements, three e-Library user groups have
been identified: ontology maintainers, content editors and end-users. End-users may have
no knowledge of ontologies and related editors, and ontology maintainers are assumed to
have only a limited background in ontologies. Thus, one of the most important decisions
in the design of the e-Library is how to display and edit the ontology terminological
component (a set of classes and properties, called T-Box in the following) and assertion
component (a set of T-Box-compliant individuals, called A-Box in the following) in a
user-friendly way.
   As mentioned in Section II, in this framework there is no specific tool to edit the
T-Box. We adopted Protégé to edit the ontology terminological component and save it as
an OWL file, and we then import this file into the framework.
   From the user's point of view, the developed framework is composed of different
modules: the A-Box Editor, the Publication Description Interface and the End-Users
Interface. This last module is divided into three submodules: the Semantic Query
Interface, the Semantic Navigation Interface and the A-Box Viewer. Not all the modules
are fully implemented yet; in particular, the Semantic Query and Semantic Navigation
Interfaces are still under development. In the following paragraphs, more details about
each module are given.

A. A-Box Editor

   The A-Box Editor is available only to ontology maintainers and enables them to edit
the ontology A-Box.
   As shown in Figure 3, the ontology navigation tree is placed on the left part of the
interface and the individuals and properties editor on the right. The aim of the
navigation tree is to explore the A-Box and select the individual to edit. The
navigation tree is not a hierarchy of classes, but rather of individuals connected by
partOf or superType¹⁵ properties. OWL does not contain specific primitives for partOf or
superType properties, but it provides suitable mechanisms to express the features we
wanted to specify for these properties.

                 Fig. 3.   A screenshot of the A-Box Editor

   We defined both these properties as transitive (e.g. if Varese Province is part of
Lombardy, and Lombardy is part of North Italy, then Varese Province is part of North
Italy). For each property, we also defined a sub-property which is direct and
non-transitive (e.g. we defined the property partOf_directly as a sub-property of
partOf). These sub-properties directly link an individual with its “father” and are used
to build the navigation tree. For example, if we assert that Varese Province is directly
part of Lombardy, a reasoner infers that Varese Province is part of Lombardy and, given
the analogous assertions for Lombardy and North Italy, that Varese Province is part of
Italy. This approach to the representation of the part-whole relation is described in
[13].

[Figure 4: ontology graph. Nodes: Geographics Place, Italy, North Italy, Central Italy,
South Italy, Lombardy, Activity, Human Activity. Arc labels: "instance of",
"partOf_directly" and "partOf (inferred)".]

Fig. 4. Navigation tree and graphical representation of the corresponding ontology graph

  ¹⁴ http://protege.stanford.edu/
  ¹⁵ Our superType property is different from the OWL subClassOf: subClassOf is a
relation between classes, whereas superType is a relation between individuals.

   We decided that the displayed navigation tree should not exactly reflect the
structure of the ontology A-Box, but rather should attempt to provide a clear and usable
presentation of the ontology to the users. An example of navigation tree
is shown in Figure 4. The root of the navigation tree is the
“fake” element Thing; it is not actually part of the ontology
and it is only a placeholder. Under the tree root node, there are
the top-level individuals (e.g. Human activity or Italy). These
individuals are connected to the underlying individuals with
partOf or superType properties (e.g. North Italy is part of Italy
or Farming has super type Human activity).
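
   The relationship between the direct, non-transitive sub-property and the transitive
partOf property, and the way the former drives the navigation tree, can be sketched in a
few lines of Python. This is only an illustration under stated assumptions, not the
framework's code (which relies on an OWL reasoner): every individual is assumed to have
at most one direct "father", the assertions are assumed to be acyclic, and the function
names are hypothetical. The superType property could be handled in the same way.

```python
from collections import defaultdict

# Direct part-whole assertions (part -> whole), using the paper's examples.
PART_OF_DIRECTLY = {
    "Varese Province": "Lombardy",
    "Lombardy": "North Italy",
    "North Italy": "Italy",
    "Central Italy": "Italy",
    "South Italy": "Italy",
}


def infer_part_of(direct):
    """Return every (part, whole) pair implied by transitivity of partOf."""
    inferred = set()
    for part in direct:
        whole = direct[part]
        # Walk up the chain of direct parents (assumes no cycles).
        while whole is not None:
            inferred.add((part, whole))
            whole = direct.get(whole)
    return inferred


def build_tree(direct, root="Thing"):
    """Group individuals under their direct parent; parentless ones go under root."""
    children = defaultdict(list)
    for node in sorted(set(direct) | set(direct.values())):
        children[direct.get(node, root)].append(node)
    return dict(children)


def render(children, node="Thing", depth=0):
    """Return the tree as indented lines, one individual per line."""
    lines = ["  " * depth + node]
    for child in children.get(node, []):
        lines.extend(render(children, child, depth + 1))
    return lines


part_of = infer_part_of(PART_OF_DIRECTLY)
tree = build_tree(PART_OF_DIRECTLY)
print("\n".join(render(tree)))
```

   Only the direct assertions are stored and used to draw the tree, with the
placeholder root Thing added on top; the transitive pairs are derived on demand, which
keeps the displayed hierarchy free of redundant "grandparent" links.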
   The editor of individuals and properties is placed on the
right part of the interface. Using this editor, an ontology
maintainer can create new individuals related to an existing
one by means of a partOf or superType property (as shown in
Figure 5), remove individuals, edit the label of an individual
(the displayed name) and edit the related properties.
   The properties of each class are defined in the T-Box.
Two types of properties are distinguished: an object property
is a binary relation between two individuals, and a datatype
property is a binary relation between an individual and a
literal (a primitive type, like string or number). Properties
can also have cardinality and range restrictions. For example,
the class TypologyOfArchaeologicalObject has the property
buildOf. This property has no cardinality restriction (so it
can have zero, one or more values) but Material is specified
as its range (co-domain). For instance, Sword is an instance of
TypologyOfArchaeologicalObject and has the property buildOf
Metal, where Metal is an instance of Material.
   There are four property editors defined in the framework:
   • single datatype allows editing a single literal value,
      displayed as a single-line input box;
   • multiple datatype allows editing multiple literal values,
      adding and removing values;
   • single object allows defining a relation with a single
      individual, presenting the user a tree for selecting the
      value; the individuals displayed in the tree are only those
      that are valid for the property range;
   • multiple object allows defining relations with multiple
      individuals; it is similar to the single object editor but
      allows adding and removing individuals, rather than
      selecting only one.

B. Publication Description Interface
   The Publication Description Interface allows the content
editors to associate an ontology-based description with the
publications.
   The publication descriptions are statements (i.e. subject-
predicate-object triples) that associate a topic defined in the
ontology with a publication. The predicate of a statement (also
called property) defines the relation between the publication
(subject) and the topic (object). Examples of properties are
hasTopicCulture and hasTopicHistoricalPeriod. These properties
are defined in the ontology T-Box and every property
is a sub-property of the generic topic property hasTopic
(e.g. hasTopicCulture is a sub-property of hasTopic). Range
restriction is used to specify the valid values for the property
(e.g. hasTopicCulture has Culture as range). The Publication
Description Interface takes the range restriction into account,
allowing only valid individuals to be selected as values of each
property. For example, the property hasTopicGeographics accepts
only instances of GeographicsPlace as object, so, as shown in
Figure 6, the interface only allows selecting instances of this
class.

Fig. 6.   A screenshot of the Publication Description Interface

C. End-User Interface
   The End-User Interface is composed of three submodules: the
A-Box Viewer, the Semantic Query Interface and the Semantic
Navigation Interface. Not all the submodules are fully
implemented yet.
   The A-Box Viewer is directly derived from the A-Box
Editor. Through this module users can view ontology individuals
and their properties, browse properties via hyperlinks
and access related publications thanks to their description.
Browsing the ontology is essential for the user to explore
the available information and it also helps non-expert users
to refine their search requirements, should they start with no
specific requirement in mind [14].
   The Semantic Query Interface is in an early stage of
development. Currently it only allows searching for papers
characterized by a specific topic. The interface, as shown


[Figure 5 graph: Italy, North Italy and New Instance are instances of
Geographics Place; New Instance is linked to North Italy and North Italy
to Italy by partOf_directly, and the graph also shows a partOf (inferred)
link.]

Fig. 5. Dialog box to create a new individual under North Italy and graphical
representation of the graph after the creation of the new individual

Fig. 7.   A screenshot of the Semantic Query Interface

[Figure 8 diagram. Presentation layer: A-Box Editor, Publication
Description Interface, End-User Interface (Web Interface). Business logic
layer: Ontology API, Jena Adapter, DB Adapter. Persistence layer: Jena
(Ontology, Publications Descriptions), MySQL (Bibliography).]

Fig. 8.   Overview of the framework three-tier architecture

[Figure 9 diagram (UML), interfaces and members:
   LiResource <<interface>>
   LiLiteral <<interface>>
      value : String
   LiClass <<interface>>
      label : String
      localName : String
      URI : String
   LiProperty <<interface>>
      label : String
      localName : String
      URI : String
      dataTypeProperty : boolean
      functionalProperty : boolean
      objectProperty : boolean
      literal : boolean
      range : LiClass[]
   LiInstance <<interface>>
      label : String
      localName : String
      URI : String
      triplesWithSubject() : LiTriple[]
      triplesWithObject() : LiTriple[]
      superType() : LiInstance[]
      superTypeInverse() : LiInstance[]
      partOf() : LiInstance[]
      partOfInverse() : LiInstance[]
      hasParent() : boolean
      hasChild() : boolean
      classesOf() : LiClass[]
      addSubResource(resource)
      removeResource()
      setPropertyValue(prop,value)
      removePropertyValue(prop,value)
      getPropertyValues(prop) : LiResource[]
      properties() : LiProperty[]
   LiTriple <<interface>>
      subject : LiInstace
      property : LiProperty
      object : LiResource
   LiAdapter <<interface>>
      addInstance(class,uri) : LiInstance
      classOf(instance) : LiClass
      getInstance(uri) : LiInstance
      getProperty(uri) : LiProperty
      newLiClass(uri) : LiClass
      newLiInstance(uri) : LiInstance
      newLiLiteral(value) : LiLiteral
      newLiProperty(uri) : LiProperty
      newLiTriple(s,p,o) : LiTriple
      removeProperyValue(p)
      removeResource(uri)
      setPropertyValue(p,o)
      tripleByObject(uri) : LiTriple
      tripleBySubject(uri) : LiTriple]


Fig. 9.   UML diagram of the Framework API

in Figure 7, allows selecting the requested topic from the
ontology individuals tree. The current implementation retrieves
only the publications that satisfy all the specified criteria.
A future extension may relax this constraint, especially with
reference to the number of retrieved publications (adapting the
query to the results).
   The Semantic Navigation Interface will support users in
the e-Library navigation. This system will suggest publications
to the users, considering multiple strategies for making
recommendations (e.g. similar treated topics, recently visited
documents, user interests, access frequency). This module is not
currently implemented in the prototype system and will be
the object of future work.

                 V. FRAMEWORK ARCHITECTURE
   In this section, we introduce a high-level overview of the
implemented prototype system. The framework was developed
according to a three-tier architectural approach, as shown in
Figure 8.
   The presentation layer is a Web-based user interface. The
business logic layer consists of a platform that implements
the e-Library main services, accessible through a set of APIs
independent from the underlying storage systems. The aim
of the persistence layer is to store the topics ontology, the
publication descriptions and bibliographic data.
A. Business Logic Layer
   The business logic layer consists of a platform that
implements the e-Library main services, accessible through a set
of APIs. The main purposes of these APIs are to support the
manipulation and querying of the ontology and the publication
descriptions without requiring a detailed understanding of the
specific internal storage facility.
   The business logic layer interacts with the underlying layer
through a set of adapters: this plug-in interface makes the
application independent from the specific implementations. We
defined a common API for the adapters: the current
implementations of this API are the Jena Adapter and the DB Adapter.
The first one is a wrapper for the semantic framework Jena,
the other one for the relational database MySQL.
   We use Hivemind16 to develop an open framework that
can be easily integrated with new adapters. Hivemind is a
framework that supports the configuration of different services,
their lifecycle, and their combination. It is inspired by the
Service-Oriented Architecture, an approach to the design of
software architectures adopting loosely coupled services.
   In the framework, the ontology language OWL is used to
define a set of concepts and relations between them and to
use these definitions to describe the contents of the e-Library
publications. OWL defines an information model that can be
represented as a directed graph, in which the nodes represent
resources and the arcs the properties. The implemented API
supports the manipulation and querying of these graphs in two
different ways: frame-centric and statement-centric.
   The frame-centric view is similar to the object-oriented
paradigm. Every resource is viewed as an object and its
properties as attributes. This view is used for ontology navigation
and resource manipulation. The statement-centric view is a
lower-level view in which the graph is represented as a set of
triples. Each triple contains three components: subject,
predicate and object. This kind of representation is used to
obtain query results.
   Figure 9 shows a UML diagram of the framework
API. All the information provided to the upper level is
modeled using these interfaces. The interfaces LiClass and
LiProperty correspond to OWL Class and Property, LiResource
represents a generic RDF17 (Resource Description Framework)
Resource, LiInstance a class instance (an individual), LiLiteral
a literal and LiTriple a statement (an assertion), constituted by
a subject, a predicate and an object. LiAdapter is the interface
of each adapter (i.e. Jena Adapter and DB Adapter).

  16 http://jakarta.apache.org/hivemind/
  17 http://www.w3.org/RDF/
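
   The difference between the frame-centric and the statement-centric
views can be illustrated with a minimal Java sketch, given under
simplifying assumptions: resources and properties are plain strings, the
graph is an in-memory list, and the class GraphViewsSketch is a
hypothetical name that is not part of the framework. The method names
triplesWithSubject and getPropertyValues only echo those listed in
Figure 9; their signatures here are simplified for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: two views over the same graph of triples. */
class GraphViewsSketch {

    /** Statement-centric unit: one subject-predicate-object triple. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + predicate + ", " + object + ")";
        }
    }

    // Assumed storage: a flat list standing in for the real persistence layer.
    private final List<Triple> triples = new ArrayList<>();

    /** Adds one statement to the graph. */
    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    /** Statement-centric view: every triple that has the given subject. */
    List<Triple> triplesWithSubject(String subject) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(subject)) {
                result.add(t);
            }
        }
        return result;
    }

    /** Frame-centric view: read a property of a resource like an attribute. */
    List<String> getPropertyValues(String resource, String property) {
        List<String> values = new ArrayList<>();
        for (Triple t : triplesWithSubject(resource)) {
            if (t.predicate.equals(property)) {
                values.add(t.object);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        GraphViewsSketch graph = new GraphViewsSketch();
        graph.add("Sword", "buildOf", "Metal");
        graph.add("Sword", "label", "Sword");
        // Frame-centric: the resource is treated as an object with attributes.
        System.out.println(graph.getPropertyValues("Sword", "buildOf"));
        // Statement-centric: the same data returned as a list of triples.
        System.out.println(graph.triplesWithSubject("Sword"));
    }
}
```

   In this sketch the frame-centric read is built on top of the
statement-centric one, so both views stay consistent by construction; an
adapter for a specific storage system could answer the two kinds of
request in whatever way suits that system.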
[Figure: entity-relationship diagram of the bibliographic database.
Entities and attributes: Publication (Publication id, Title), Author
(Author id, First name, Last name), Journal (Journal id, Name).
Relationships: written by and published on, with (0,n) cardinalities.]

[Figure: interaction diagram involving JavaScript, DWR, the Java Server
and the Kernel. Steps: 1) expand node; 2) request subitems; 3) invoke
kernel method; 4) response: subnodes list.]


                                                                                                                                                                      v               c           c           c


                                                                                                                                                                                                                                                                                                                                                                                                                                      s                                                       X                                       M                                               L


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      i           o


                                                                                                                                                                                                                                                                                                                                                                                                  a


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          t


                                           (0,1)
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  a


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              F                                                       e                       s


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          a                       c


Electronic edition                                                                            Homepage
                                                                                                                                                                                                      >                   d               d               d


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      i           c


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      l


                                                                                                                                                                                          e               e           e


                                                 (0,n)
                                                                                                                                                                      >


                           Publication
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  p


            ISBN
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      A           p


                                                                                                                                                                                                                                                                                                                                                                                                                                          i               s                           p                                       l
                                                                                                                                                                                                                                                                                                                                                                  )
                                                                                                                                                                                                                                                                                                                                      5


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              a                       y
                                                                                                                                                                                                                                                                                                                                                                                                      d


                                                                          (0,n)               Class id                                                                                                                                                                                                                                                        t               h                                       e                                           s                                       u                                   b


                                                (0,1)                              Class
                                                                                                                                                                                                                                              W                           e   b       p                           e


                                                            has class
                                                                                                                                                                                                                                                                                                  a       g


                                      (0,1)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       W       e       b                           I                               t               e                       r               f                               e


                         (0,n)
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      n                                                                                       a           c


          Volume                                                                                                                                                                                                                                                                                                                              n
                                                                                                                                                                                                                                                                                                                                                                          o


                                                                                                                                                                                                                                                                                                                                                                                                          d
                                                                                                                                                                                                                                                                                                                                                                                                                                              e                           s                                                       o


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  n


                                                                                              Description                                                                                                                         W                           e       b           B       r   o       w       s       e   r
                                                                                                                                                                                                                                                                                                                                  i


                                                                                                                                                                                                                                                                                                                                                  n
                                                                                                                                                                                                                                                                                                                                                                                          t                       h                               e                                               t                   r                           e                           e


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          A                               p                       p                       l       i                               t           i   o                           e               r               e       r


                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          c           a                                   n           S                                   v


             Year

                                                                          (0,n)               Publisher id
                                                           published by           Publisher
                                   has
                                                                                               Name
Fig. 10.  ER diagram for bibliographic data and publication descriptions.
[Further legible Fig. 10 labels: subject, (1,1), Inferred, Description, Predicate, Object.]

Fig. 11.  Interaction between the navigation tree and the server components.

B. Jena

   Our choice for OWL ontology storage, manipulation and querying is Jena18, an
open-source Semantic Web toolkit developed by HP Labs19. Its aim is to support
the development of applications that use the Semantic Web information models
and languages [15]. We adopted this framework because it matched our
requirements, is widely used within the Semantic Web research community and is
well documented. The core of the toolkit is the RDF API, which supports the
manipulation and querying of RDF graphs (an OWL graph can be viewed as a
specialization of an RDF graph, so the Jena API also supports OWL graphs). Jena
supports several storage technologies for ontology persistence. The simplest is
to load axioms and individuals directly from an OWL file, but this approach
requires the document to be parsed each time the framework starts up and to be
written back after every modification, which can be a source of significant
overhead. To avoid this problem, we used the relational database persistent
storage strategy. This approach also enables faster retrieval and insertion of
the ontology elements20. To import the OWL ontology created with Protégé into
the database, we used the Jena OWL readers (Jena has readers and writers for
the different languages that can be used to represent RDF graphs and OWL).

C. Persistence Layer

   The topics ontology, the publication descriptions and the bibliographic data
are stored in the relational DBMS MySQL21.
   Jena stores the ontology in a statements table and in additional tables
(e.g. for reification statements); these tables are not intended for direct
access by other applications. Publication descriptions and bibliographic data
are described in the ontology, but they are stored separately for performance
reasons. Generally speaking, bibliographic data and descriptions can grow very
quickly22 and can occupy far more memory than the ontology. If publications
were annotated, the number of descriptions could be very high. A semantic
framework like Jena, which uses a memory-based reasoner, is not suitable for
managing this amount of data (a performance evaluation of several frameworks
suitable for large OWL ontologies is presented in [10]).
   The bibliographic data and the descriptions are stored in the database,
whose schema is shown in Figure 10. The most notable element is the publication
description table. This table holds the publication descriptions as
subject-predicate-object triples: the subject is a publication identifier, the
predicate defines the type of the topic (e.g. hasHistoricalPeriodTopic,
hasCultureTopic) and the object is the topic of the document. Examples of such
triples are: publication001 hasHistoricalPeriodTopic bronzeAge, publication001
hasCultureTopic Etruschi. According to the defined domain ontology, every
publication can have zero, one or more topics, including several of the same
topic type (e.g. a publication can be related to both the historical periods
Middle Bronze Age and Late Bronze Age).

D. Presentation Layer

   The main technology used to develop the user interface, described in
Section IV, is JSF23 (JavaServer Faces). We chose this technology mainly
because JavaServer Faces defines a clear separation between application and
presentation logic and supports the connection of the presentation layer to the
application code. JSF defines a set of APIs for representing user interface
components, managing their state, handling events, validating input and
defining page navigation.
   Another adopted technology is AJAX; it is not a technology in itself, but a
term that refers to the combined use of a group of technologies (JavaScript,
DHTML (Dynamic HTML)24, XML and Remote Scripting) [16]. In particular, we use
AJAX for the dynamic tree component, which is used as navigation tree,
properties editor tree and semantic query topics tree. We use AJAX because it
makes it possible to display new content in a Web page without completely
reloading it. As shown in Figure 11, the tree elements can be loaded
dynamically when they are required. This feature allows the tree to handle
large amounts of data: this is a very important aspect because the tree could
be very large and it is unnecessary to load all the elements every time.
   The AJAX tree is integrated with the rest of the framework by DWR25 (Direct
Web Remoting). This technology allows JavaScript code in the client Web browser
to communicate with the framework running on the server.

   18 http://jena.sourceforge.net/
   19 http://www.hpl.hp.com/
   20 For more information, see: Jena Fastpath Query Processing -
http://jena.sourceforge.net/DB/fastpath.html
   21 http://www.mysql.org/
   22 For example, DBLP (Digital Bibliography & Library Project), the Computer
Science Bibliography of the University of Trier, indexes more than 800000
publications.
   23 http://java.sun.com/javaee/javaserverfaces/
   24 http://www.w3.org/DOM/faq.html#DHTML–DOM, http://www.w3schools.com/dhtml/

of results, the query could be extended to also select publications dealing
with topics related to those explicitly requested.
   Finally, future work will focus on the development and testing of the
Semantic Navigation Interface, which will support users in the e-Library
navigation. This system will make recommendations considering multiple
strategies: e.g. correlation, recently visited documents, user interests,
access frequency. This interface will also capture the cumulative effect of an
entire user navigation session in order to generate semantic queries. A
description of a work based on this approach can be found in [17].
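
   The on-demand loading of tree elements described in Section V-D (Figure 11)
can be illustrated with a minimal sketch. It is not the framework's code, which
is built on JSF, JavaScript and DWR; Python is used here only for brevity. The
names TopicNode, TreeService and LazyTreeClient and the small topic hierarchy
are hypothetical, introduced for illustration. The server-side object returns
only the direct children of a node, and the client-side object asks for them
the first time a node is expanded.

```python
from dataclasses import dataclass, field


@dataclass
class TopicNode:
    """One node of the navigation tree (hypothetical structure)."""
    node_id: str
    label: str
    child_ids: list = field(default_factory=list)


class TreeService:
    """Stand-in for the server component that the AJAX tree calls remotely."""

    def __init__(self, nodes):
        self._nodes = {n.node_id: n for n in nodes}
        self.requests = 0  # number of child requests received from the client

    def get_children(self, node_id):
        """Return only the direct children of node_id, never the whole subtree."""
        self.requests += 1
        parent = self._nodes[node_id]
        return [
            {
                "id": child_id,
                "label": self._nodes[child_id].label,
                "has_children": bool(self._nodes[child_id].child_ids),
            }
            for child_id in parent.child_ids
        ]


class LazyTreeClient:
    """Stand-in for the browser-side tree: it holds only expanded nodes."""

    def __init__(self, service):
        self._service = service
        self.loaded = {}  # node_id -> list of child descriptors already fetched

    def expand(self, node_id):
        """Fetch children on the first expansion; reuse them afterwards."""
        if node_id not in self.loaded:
            self.loaded[node_id] = self._service.get_children(node_id)
        return self.loaded[node_id]


# Assumed topic hierarchy, loosely based on the topic names used in the text.
nodes = [
    TopicNode("root", "Topics", ["period", "culture"]),
    TopicNode("period", "Historical period", ["bronzeAge"]),
    TopicNode("bronzeAge", "Bronze Age", ["middleBronzeAge", "lateBronzeAge"]),
    TopicNode("middleBronzeAge", "Middle Bronze Age"),
    TopicNode("lateBronzeAge", "Late Bronze Age"),
    TopicNode("culture", "Culture", ["etruschi"]),
    TopicNode("etruschi", "Etruschi"),
]

service = TreeService(nodes)
client = LazyTreeClient(service)
top_level = client.expand("root")  # one request, two nodes transferred
```

   In this sketch, expanding the root transfers two node descriptors instead of
the six descendants in the hierarchy; deeper levels are requested only if the
user opens them, which is the property that keeps a large tree manageable.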

        VI. CONCLUSIONS AND FUTURE DEVELOPMENTS

   In this paper, we presented a prototype of a semantic-based e-Library. This
application allows users to search a collection of semantically described
publications. Moreover, it lets content editors autonomously manage the
assertional component of the domain ontology, the publication descriptions and
the bibliographic data. To describe the topics of the publications, the
e-Library exploits an ontology expressed in OWL. A campaign of tests with
students of Archaeology, aimed at evaluating the effectiveness of the
publication description approach and the usability of the user interface, is
under way. The tests are focused on the A-Box Editor and the Publication
Description Interfaces because these modules are at a more advanced stage of
development.

                              REFERENCES

 [1] C. A. Goble, “Using the Semantic Web for e-Science: Inspiration,
     Incubation, Irritation.” in International Semantic Web Conference, 2005,
     pp. 1–3.
 [2] M. Ley and P. Reuther, “Maintaining an Online Bibliographical Database:
     the Problem of Data Quality.” in EGC, ser. Revue des Nouvelles
     Technologies de l’Information, vol. RNTI-E-6. Cépaduès-Éditions, 2006,
     pp. 5–10.
 [3] M. E. Maron and J. L. Kuhns, “On Relevance, Probabilistic Indexing and
     Information Retrieval.” J. ACM, vol. 7, no. 3, pp. 216–244, 1960.
 [4] M. Ley, “The DBLP Computer Science Bibliography: Evolution, Research
     Issues, Perspectives.” in SPIRE, Lecture Notes in Computer Science,
     vol. 2476. Springer, 2002, pp. 1–10.
 [5] C. L. Giles, “Citeseer: Past, Present, and Future.” in AWIC, Lecture
     Notes in Computer Science, vol. 3034. Springer, 2004, p. 2.
 [6] P. Haase, J. Broekstra, M. Ehrig, M. Menken, P. Mika, M. Olko,
     M. Plechawski, P. Pyszlak, B. Schnizler, R. Siebes, S. Staab, and
     C. Tempich, “Bibster - a Semantics-based Bibliographic Peer-to-Peer
     System.” in International Semantic Web Conference, Lecture Notes in
                                                                           Computer Science, vol. 3298. Springer, 2004, pp. 122–136.
   Preliminary results of this tests showed that the proposed          [7] Y. Sure, S. Bloehdorn, P. Haase, J. Hartmann, and D. Oberle, “The
ontology visualization is useful for the users as a guide to               SWRC Ontology - Semantic Web for Research Communities.” in EPIA,
                                                                           Lecture Notes in Computer Science, vol. 3808. Springer, 2005, pp.
describe the contents of publications. It helps users with no              218–231.
knowledge about ontologies to understand the relationship              [8] P. Spyns, D. Oberle, R. Volz, J. Zheng, M. Jarrar, Y. Sure, R. Studer,
between the different topics and between the topics and the                and R. Meersman, “Ontoweb - a Semantic Web Community Portal.” in
                                                                           PAKM, Lecture Notes in Computer Science, vol. 2569. Springer, 2002,
publications. Moreover new required features were expressed                pp. 189–200.
after the tests. In particular, the users required the possibility     [9] J. Broekstra, A. Kampman, and F. van Harmelen, “Sesame: An Archi-
to choose the property on which each tree is built on. For                 tecture for Storing and Querying RDF Data and Schema Information.”
                                                                           in Spinning the Semantic Web, MIT Press, 2003, pp. 197–222.
example, the users found useful the findings tree build on the        [10] Y. Guo, Z. Pan, and J. Heflin, “An Evaluation of Knowledge Base
“superType” property (e.g. “Sword has super type weapon”,                  Systems for Large OWL Datasets.” in International Semantic Web
“weapon as super type handwork”), but they can also make                   Conference, Lecture Notes in Computer Science, vol. 3298. Springer,
                                                                           2004, pp. 274–288.
use on a tree build on the “hasMaterial” property. Another            [11] H. Knublauch, M. A. Musen, and A. L. Rector, “Editing Description
required feature is the ability to sort the tree items according to        Logic Ontologies with the Protégé OWL Plugin.” in Description Logics,
a given property. Currently, the items are sorted alphabetically,          CEUR Workshop, vol. 104. 2004.
                                                                      [12] V. Haarslev, Y. Lu, and N. Shiri, “Ontoxpl - Intelligent Exploration of
whereas for some concepts, like the historical periods, this               OWL Ontologies.” in Web Intelligence. IEEE CS, 2004, pp. 624–627.
choice is not sensible. For example, the historical periods           [13] R. Alan, W. Chris, N. Natasha, and W. Evan, “Simple Part-Whole
are better ordered by an explicit “isPrecedent/isSuccessive”               Relations in OWL Ontologies,” Aug 2005. [Online]. Available:
                                                                           http://www.w3.org/2001/sw/BestPractices/OEP/SimplePartWhole/
property.                                                             [14] S. Ram and G. Shankaranarayanan, “Modeling and Navigation of Large
   The tests also considered the Semantic Query Interface,                 Information Spaces: a Semantics Based Approach.” in HICSS, 1999.
which is at an early stage of development. Currently it only          [15] B. McBride, “Jena: A Semantic Web Toolkit.” IEEE Internet Computing,
                                                                           vol. 6, no. 6, pp. 55–59, 2002.
allows searching for papers characterized by specific topics.         [16] G. Jesse James, “Ajax: a New Approach to Web Applications.”
The interface allows selecting the topics from the ontology                [Online]. Available: http://www.adaptivepath.com/publications/essays/
individuals tree and retrieves the publications related with               archives/000385.php
                                                                      [17] N. Athanasis, V. Christophides, and D. Kotzinos, “Generating on the fly
all the selected topics. From the test experience, it might be             Queries for the Semantic Web: The ICS-Forth Graphical RQL Interface
useful to relax these constraints especially with reference to             (GRQL).” in International Semantic Web Conference, Lecture Notes in
the number of retrieved publications, adapting the query to the            Computer Science, vol. 3298. Springer, 2004, pp. 486–501.
                                                                      [18] S. A. McIlraith, D. Plexousakis, and F. van Harmelen, Eds., The Seman-
results. For example, if a query selects only a small number               tic Web - ISWC 2004: Third International Semantic Web Conference.
                                                                           Proceedings, Lecture Notes in Computer Science, vol. 3298. Springer,
  25 http://getahead.ltd.uk/dwr/                                           2004.