Vol-596
                                              urn:nbn:de:0074-596-3
                                              Copyright © 2010 for the individual
                                              papers by the papers' authors.
                                              Copying permitted only for private and
                                              academic purposes. This volume is
                                              published and copyrighted by its
                                              editors.


ORES-2010
Ontology Repositories and Editors for
the Semantic Web

Proceedings of the 1st Workshop on Ontology Repositories and
Editors for the Semantic Web

Hersonissos, Crete, Greece, May 31st, 2010.


Edited by

Mathieu d'Aquin, The Open University, UK
Alexander García Castro, Universität Bremen, Germany
Christoph Lange, Jacobs University Bremen, Germany
Kim Viljanen, Aalto University, Helsinki, Finland


10-Jun-2010: submitted by Christoph Lange
11-Jun-2010: published on CEUR-WS.org


                Ontology Repository for User Interaction

                                          Martins Zviedris

         Institute of Mathematics and Computer Science, University of Latvia, Raina bulv. 29,
                                      Riga LV-1459, Latvia
                                     Martins.Zviedris@Lumii.lv


         Abstract. The systematization of ontologies and the interaction with them are
         problems that deserve more attention. A key point is that ontology repositories
         should serve not only as collections of ontology schemas but also as the first
         step in a data interaction process for end-users. We propose a novel
         systematization in which similar domain ontologies are described by a
         high-level abstraction domain ontology that can be used as a domain ontology
         repository and as an access point for gathering instance data.

         Keywords: Ontology Systematization, Ontology Interaction


1 Introduction

Data systematization, representation and accessibility are key factors for data
usage. In the Semantic Web, the state of the art for data representation is the
ontology, and ontology repositories are currently used to store collections of
ontology schemas. Their main disadvantage is that they give access only to the
ontology schema. As a result, ontologies are developed in isolation, because there is
no access to the real data that would motivate interlinking them.
   First, we propose that an ontology repository should contain links to instance data.
Second, we propose organizing a repository around a specific domain and grouping
domain ontologies by additional domain-specific meta-information. The added
meta-information should be organized by a high-level abstraction domain ontology,
described in more detail in Section 3. This makes it easier to find similar
ontologies and to define ways to merge them. Third, domain repositories would also
serve as the first step in data interaction for a domain expert user or an
intelligent agent. The user would use the repository to select the ontologies that
contain the data of interest and then use the selected ontologies to construct
relevant data queries. The repository thus becomes a bridge to the real data.


2 Practical Experience and Proposed Solution

   In practice, we encountered a problem in which we had to develop an ontology
for medical researchers that describes data from different disease registries [1].
Since the instance data was originally stored in SQL databases, we had to work with
SQL-like ontologies; we present simple ontology examples below. The role of the
ontologies was to integrate eleven disease registries and to let medical researchers
use the ontologies as an access point to the real data. Thus there was no need for
elaborate ontology mechanisms. The only goal was to use ontologies to enable
medical researchers to sort through vast amounts of registry data without a
programmer’s assistance.
   A naïve approach is to merge all the registries into a single, large ontology that
can be stored in the data access point. We have successfully implemented the naïve
approach. However, medical researchers can hardly comprehend the resulting, very
complex ontology. The naïve approach needs improvement, because the
understandability of the ontology is crucial if medical researchers are to use it for
actual data selection.
   Based on the medical domain, we describe a more elaborate solution. The
solution is an interaction in two steps. In the first step, a medical researcher
chooses the registries that interest him. In the second step, he selects and obtains
data from the chosen registries. Since each registry can also be perceived as an
ontology, we need to develop a disease domain repository from which the medical
researcher selects the ontologies that interest him. So, in the first step the
medical researcher uses a high-level abstraction disease ontology to select
registries (ontologies), which are then merged into a single ontology. In the second
step, the end-user can interact with the data more easily, because the ontology
consists of a smaller number of classes and most of them are of interest to the
end-user. The main idea of this approach is to develop a high-level abstraction
ontology that represents features from all registries. As a result, the end-user can
select registries (ontologies) more conveniently in the first step and can then work
with a single ontology containing the needed data.


3 A Detailed Example of Medical Domain

We start with a requirement to integrate different medical disease registries into a
single integrated registry. Each registry contains patient data. We propose that all
registries should be stored in a medical domain repository. It is also required that
ontology records contain links to instance data. Since medical researchers often need
to work with several ontologies at a time, we also propose that these ontologies be
organized in a more elaborate way using a high-level abstraction ontology of medical
disease registries.
   To better explain the high-level abstraction ontology, we build it from medical
domain examples. We need to integrate two simplified registries: the diabetes
registry depicted in Fig. 1 and the cancer registry depicted in Fig. 2. In practice
these registries also contained other information connected to the classes of the
simplified solution and consisted of about 10 classes and about 20 enumerated
classes used for classification.
   By analyzing the depicted registries, we can identify a similar structure in them.
Each registry contains general concepts such as person, disease information, disease
details and disease cure. The question arises whether these similarities can be used
to develop a high-level abstraction ontology. Since the high-level ontology is used
in a repository for ontology selection, we need to consider only those concepts that
ease the selection process. The concept “person” does not contain information useful
for ontology selection, so it should not be included in the high-level ontology.
Still, it is useful to mark “person” as a concept that can be used to merge
registries, so that ontology merging can be done at least semi-automatically.


Fig. 1 Simplified diabetes registry


Fig. 2 Simplified cancer registry

  We can identify a pattern: a disease has a treatment and an examination. The
pattern is general and can be considered a high-level abstraction ontology. It is
depicted as an ontology in Fig. 3 (additional information about the treatment and
the examination may be added).


Fig. 3 Common disease ontology
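
   As a rough illustration, the pattern of Fig. 3 can be written down as three plain
data classes. This is a minimal sketch in Python, not the ontology used in the
project: the class names follow the text, while the attribute names and the sample
values (for example "Chemotherapy" and "Biopsy") are assumptions made only for
illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Treatment:
    name: str  # hypothetical attribute, not taken from the registries


@dataclass(frozen=True)
class Examination:
    name: str  # hypothetical attribute, not taken from the registries


@dataclass
class Disease:
    """High-level pattern: a disease has treatments and examinations."""
    name: str
    treatments: List[Treatment] = field(default_factory=list)
    examinations: List[Examination] = field(default_factory=list)


# Two registries described as instances of the same pattern (sample values assumed).
diabetes = Disease("Diabetes",
                   [Treatment("Insulin therapy")],
                   [Examination("Blood glucose test")])
cancer = Disease("Cancer",
                 [Treatment("Chemotherapy")],
                 [Examination("Biopsy")])

for disease in (diabetes, cancer):
    print(disease.name, [t.name for t in disease.treatments])
```

   Both registries fit the same three-class shape, which is what makes the pattern
usable as a selection layer on top of the individual registry ontologies.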

   This pattern was found in the eleven registry ontologies that were developed for
medical registries. The pattern ontology is very simple and easy for the end-user
to grasp. In addition, most medical disease ontologies can be described as an
instance of this ontology. Still, the pattern ontology lacks meta-information about
the actual registries and about where the instance data can be found, and thus
cannot link to it. Moreover, data may be stored in more than one place; for example,
each clinic could have its own cancer registry and specific disease details. We
therefore add links between the instances of the high-level abstraction ontology
and the corresponding ontologies. Links are also added to the real data and carry
information about, for example, how long the data has been gathered; the links thus
allow additional selection possibilities. The structure is depicted schematically
in Fig. 4.


Fig. 4 Connection between ontology levels
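
   The links between the levels can be sketched as records that carry the
meta-information used for selection. The following Python fragment is only an
assumed representation: the record type RegistryLink, its field names, the
example.org addresses and the year counts are all hypothetical, and the registry
names are borrowed from the example in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RegistryLink:
    disease: str          # name of the high-level Disease instance
    registry: str         # registry that holds the instance data
    ontology_uri: str     # hypothetical location of the registry ontology
    data_endpoint: str    # hypothetical access point to the real data
    years_collected: int  # meta-information: how long data has been gathered


# All addresses and year counts below are made-up placeholders.
LINKS = [
    RegistryLink("Cancer", "Baltic cancer registry",
                 "http://example.org/onto/baltic-cancer",
                 "http://example.org/data/baltic-cancer", 15),
    RegistryLink("Cancer", "England cancer registry",
                 "http://example.org/onto/england-cancer",
                 "http://example.org/data/england-cancer", 12),
    RegistryLink("Cancer", "Clinic cancer registry",
                 "http://example.org/onto/clinic-cancer",
                 "http://example.org/data/clinic-cancer", 4),
    RegistryLink("Diabetes", "Diabetes registry",
                 "http://example.org/onto/diabetes",
                 "http://example.org/data/diabetes", 20),
]


def select_registries(links: List[RegistryLink], disease: str,
                      min_years: int = 0) -> List[RegistryLink]:
    """Return the links for one disease whose data covers at least min_years."""
    return [link for link in links
            if link.disease == disease and link.years_collected >= min_years]


selected = select_registries(LINKS, "Cancer", min_years=10)
print([link.registry for link in selected])
```

   Keeping the meta-information on the link, rather than inside each registry
ontology, lets the repository answer selection questions without opening the
registries themselves.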

   We now sketch how a medical researcher could gather relevant data about cancer
and its treatment possibilities using the proposed solution. We do not describe a
specific way to query the ontology, as this can be done through SPARQL or, preferably,
through a graphical query language [2, 3]. First, the medical researcher connects to
a disease domain repository and queries the high-level ontology depicted in Fig. 3.
He specifies that he is interested in the disease with name = “Cancer” and all
corresponding treatments. As a result, he gets instance pairs of cancer and the
corresponding cancer treatment, together with link information to the ontologies
that contain instance data. At this point the medical researcher can further
restrict the registries that interest him; for example, he could be interested only
in registries that have gathered data for at least 10 years. If he selects at least
two registries, for example the Baltic cancer registry and the England cancer
registry, then the ontologies from both registries are merged into a single ontology.
This can be achieved using the information that both ontologies contain the similar
concept “person” and that both contain an abstract Cancer class as the superclass of
the specific Cancer classes that contain data. The merged ontology is presented to
the medical researcher, who can then gather clinical data for further analysis.
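
   The merging step can be illustrated with a small Python sketch. The text does not
prescribe a merge algorithm, so this is only one simple possibility under stated
assumptions: each ontology is reduced to a mapping from class names to attribute
names, the classes marked as shared ("Person" and "Cancer") are unified, and all
other classes are kept apart by prefixing them with the registry name. The function
merge_ontologies and all class and attribute names are hypothetical.

```python
from typing import Dict, Set

Ontology = Dict[str, Set[str]]  # class name -> attribute names (simplified view)


def merge_ontologies(ontologies: Dict[str, Ontology], shared: Set[str]) -> Ontology:
    """Merge registry ontologies, unifying only the classes listed in `shared`."""
    merged: Ontology = {}
    for registry, classes in ontologies.items():
        for class_name, attributes in classes.items():
            # Shared concepts keep their name; other classes are prefixed
            # with the registry name so that they cannot collide.
            key = class_name if class_name in shared else f"{registry}:{class_name}"
            merged.setdefault(key, set()).update(attributes)
    return merged


# Made-up class and attribute names for two selected registries.
baltic: Ontology = {
    "Person": {"name", "birth_date"},
    "Cancer": {"diagnosis_date"},
    "LungCancer": {"stage"},
}
england: Ontology = {
    "Person": {"name", "postcode"},
    "Cancer": {"diagnosis_date", "icd_code"},
    "BreastCancer": {"grade"},
}

merged = merge_ontologies({"baltic": baltic, "england": england},
                          shared={"Person", "Cancer"})
for class_name in sorted(merged):
    print(class_name, sorted(merged[class_name]))
```

   The sketch ignores the subclass relation between the abstract Cancer class and
the specific Cancer classes; a real merge would have to preserve it.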


4 Related Work

From a technical point of view, OWL 2 allows punning, where the same name can be
used both for a class and for an individual. Still, adding metadata to an ontology
does not solve the problem of how to group similar domain ontologies together for
further interaction. Most existing ontology repositories just collect ontologies and
allow them to be reused. For example, BioPortal [4] does not offer the further
interaction with the collected ontology data that common users need, nor does it
collect links to existing data.


5 Results and Future Work

In practice, we have designed and implemented the naïve approach, with six different
disease registries [1] integrated into a single ontology. To allow medical researchers
to query the data, we have developed and implemented a graphical query language [2].
After implementing the prototype, we gathered user feedback to evaluate our work.
The most valuable feedback was that the ontology was too complex for medical
researchers. At the same time, it was relatively easy for a researcher to produce
queries in the part of the ontology with which he was familiar. Other valuable
feedback was that medical researchers are interested in meta-information about the
registries, for example, how long the data has been gathered.
   It is important to implement and test in practice a domain-specific repository
with access to ontologies that contain real data. The proposed approach needs to be
developed in more detail to interlink the ontology repository with the ontologies
that contain real data. Another interesting problem is whether there should be
predefined ontologies for specific diseases that could be configured for each
registry containing data on that disease, and whether such an ontology could be used
for mash-ups. As we have developed the theoretical approach only for the medical
domain, it is important to go further into other domains and assess the feasibility
of the high-level abstraction ontology approach there. We should mention that such
domain repositories would also be useful for intelligent agents, as they could find
links to similar data in one place.


Acknowledgements

I would like to thank Prof. Guntis Barzdins for valuable discussions and Prof. Karlis
Podnieks for useful ideas. I would also like to thank Arturs Sprogis and Renars
Liepins for valuable assistance.


References

1. Barzdins, G., Liepins, E., Veilande, M., Zviedris, M.: Ontology Enabled Graphical
   Database Query Tool for End-Users. In Hele-Mai Haav (Eds.), Selected papers from
   DB&IS'2008. Frontiers in Artificial Intelligence and Applications, vol. 187,
   pp. 105--116. IOS Press (2009)
2. Barzdins, G., Rikacovs, R., Zviedris, M.: Graphical Query Language as SPARQL Frontend.
   In Grundspenkis, J., Kirikova, et al. (Eds.), Local Proceedings of 13th East-European
   Conference (ADBIS 2009), pp. 93--107. Riga Technical University, Riga (2009)
3. Chen, H., Wang, Y., Wang, H., Mao, Y., Tang, J., Zhou, C., Yin, A., Wu, Z.: Towards a
   semantic web of relational databases: A practical semantic toolkit and an in-use case from
   traditional Chinese medicine. In Cruz, I.F., et al. (Eds.), 5th International Semantic Web
   Conference. LNCS, vol. 4273, pp. 750--763. Springer (2006)
4. Noy, N.F., Shah, N.H., Whetzel, P.L., et al.: BioPortal: ontologies and integrated data
   resources at the click of a mouse. Nucleic Acids Research (2009)