<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Toward cross-granular querying over modularized ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>C. Maria Keet</string-name>
          <email>keet@inf.unibz.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Computer Science, Free University of Bozen-Bolzano</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>To address both the structured coordination of linked, modularised ontologies and the querying of a large, dynamic ontology system, we propose a basic granularity framework and a set of functions to query such a granulated system. The granularity framework enforces a constrained and structured modularization. This facilitates automating both the division of a large body of represented information and the relinking of the pieces. The functions enable basic cross-granular querying in a transparent and scalable way, as they rely on the unambiguous management layer provided by the granularity framework, and are reusable for ontologies represented and stored in different formats.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Modularization of ontologies and modularization of the subject domain to have
different ontologies require, to some extent, different methodologies and
technologies. Tried and tested approaches are, for instance,
manual modularization of conceptual data models, as is customary in software
development with UML Packages [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and the inverse, where one has a large
ontology that has to be split up into smaller chunks [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. The former tends
to be used to make the 'cognitive overload' manageable, whereas the latter has
more to do with a system's performance and scalability. Moreover,
modularization of the subject domain can avoid the problem of having to modularize a large
ontology, but it raises other questions, among others: 1) how to modularize in an
optimal way, 2) which way is optimal, 3) how to keep the modules coordinated to
avoid subject domain overlap and the corresponding range of ontology matching issues,
4) how to repeatedly (re-)connect those ontology modules, statically or
dynamically on demand, when a user poses a query across the modules, and 5) whether one
could keep such querying local to a module by good design of the
modularization. The most active domain of ontology development, bio-ontologies,
faces both issues: having to deal with very large ontologies that can benefit from
modularization, as well as having to manage multiple smaller ontologies that
need to be connected. One direction bio-ontologists have taken is coordination
of the 'modules' along levels of granularity, i.e., a specific constrained
modularization that has flavours of subject domain-motivated modularization, combined with the
desire to manage this computationally. A formal approach to such endeavours is
lacking, however. Research into computational implementations of granularity is
not new, and is applied mainly in GIS, data warehousing, and time granularity,
but its use to coordinate ontologies, and its application in knowledge representation
in general, is. Here it will be demonstrated that with a formally defined
granularity framework as additional knowledge management layer, the modularization
and connectivity is not just any mapping, but can be semantically enriched with
why and how the linking between the modules has been performed. This also
greatly simplifies posing granular queries over such a modularized system. The
connected ontologies remain separable, yet can be linked transparently into one
coherent system within such a domain granularity framework.
      </p>
      <p>In the remainder of the paper, we first introduce a motivating example in
section 2, after which we introduce a basic granularity framework in section 3. In
section 4 we introduce the set of functions to perform granular queries over the
loosely linked and coordinated ontologies at different levels of granularity.
We close with conclusions and future work in section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Motivating example</title>
      <p>
        Early attempts to group and coordinate bio-ontologies and related medical
information systems focussed primarily on identifying which perspectives one can
take on the chosen subject domain, which levels of granularity one can identify,
and how many of them would be practically useful [4–6]. The largest and most
ambitious effort to date is OBO Foundry's approach to coordinated evolution
of ontologies [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and local-as-view (LAV) integration of biological databases. OBO Foundry's
overview of both linking and modularizing bio-ontologies at different levels of
granularity is depicted in Fig.1-A, which lists 10 of the 16 Foundry ontologies
out of the 64 that are indexed [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] (as of 3-3-2008). For instance, at the "Organ
and Organism" level, the NCBI taxonomy may be linked to both the
Foundational Model of Anatomy [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], which covers human anatomy only but in great
detail for supra-cellular levels, and CARO [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which also considers the anatomy of
other types of organisms but contains only generic universals for anatomy. How
all this coordination among the ontologies and between sections of ontologies
is to be implemented is not yet clear, and far from trivial. The ontologies are
developed in a distributed fashion and owned by different organizations,
represented in different representation languages, of widely varying size, and are
intended to be complementary in coverage. The latter has as major advantage
that the linking of ontologies is not expected to face serious difficulties like those of
traditional ontology matching [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]. Instead, connection points between the
ontologies and sections thereof have to be clearly defined and re-computed each
time one of the participating ontologies is updated (which is a frequent
event). A first step is enabling computational management of those different
perspectives and levels, which includes both a formal representation of
granularity as well as (cross-)granular querying for basic information retrieval from
such a large domain that no person comprehends in full. The second step is
cross-granular querying, for instance:
      </p>
      <p>Fig. 1. (A) Overview of OBO Foundry ontologies arranged by level of granularity; (B) granular perspectives π with their granular levels λ.</p>
      <p>
Q1: "What are the cell components of blood?", which can be decomposed into
smaller tasks, where Blood resides in a different level than the cells it has as parts.
Q2: "Which organs have macrophages?", i.e., for each macrophage (and its
subclasses in the Cell-level), which organ (or its subclasses in the Organ-level)
is it part of?
Q3: "Which hormones are located in the kidney, and where in the kidney?" This
query uses three different perspectives: a structural one for Kidney (an
organ) with its parts, one for location (a spatial perspective), and a functional
one for Hormone (a molecule with a specific function).</p>
      <p>That is, we need ways to select and retrieve entities, levels, and perspectives,
and to combine such queries into more complex ones.</p>
    </sec>
    <sec>
      <title>Basic granularity framework for relating the ontologies</title>
      <p>
        To enhance coordination of the modules that contain the ontologies or sections
of an ontology at different levels of granularity, we introduce a simplified, yet
effective, granularity framework. A comprehensive theory of granularity with
definitions, constraints, and proofs (i.e., with model-theoretic semantics) is
presented in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], but abridged here due to space limitations. The main components
are given in the following definition. Note that, given that it is to be applied to
(type-level) ontologies, we would be in second order, but the components could be used in
a first order language or a description logic language with nominalization, added
outside the ontology itself in the application layer, or added to an ontology that
is stored in a database.
      </p>
      <p>
        Definition 1 (Granularity framework G). A granularity framework is a
tuple G = {Δ, π, λ, σ, τ, ρ, R<sub>E</sub>, R<sub>L</sub>, R<sub>C</sub>, R<sub>G</sub>, uses} where
– Δ is the domain, which can be divided into a particular subject domain Δ<sup>s</sup>
and the encompassing granularity frame Δ<sup>f</sup> that contains the other elements
of the granularity framework;
– π denotes granular perspective (granulation hierarchy), where its instances
are denoted with π<sub>1</sub>, …, π<sub>n</sub>;
– λ denotes granular level, where its instances are denoted with λ<sub>1</sub>, …, λ<sub>n</sub>;
– σ is the granulation criterion (a combination of at least two properties) by
which one granulates a particular perspective, where its instances are denoted
with σ<sub>1</sub>, …, σ<sub>m</sub>;
– τ is a type of granularity from the taxonomy of types of granularity [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ];
– ρ is a granulation relation between entities residing in adjacent levels;
– R<sub>E</sub> is a binary parthood relation constrained to relating two framework
components, being either a level and a perspective or a perspective and a domain;
– R<sub>L</sub> is a binary parthood relation constrained to relating two adjacent finer-
and coarser-grained levels that reside in the same perspective;
– R<sub>C</sub> is a binary relation associating a granulation criterion to a perspective;
– R<sub>G</sub> is a binary relation relating a perspective or level to the type of granularity
it adheres to;
– uses is a binary relation between τ and ρ.
      </p>
      <p>
        Salient constraints are that each perspective must have at least two levels, a
level must be contained in exactly one perspective (∀x(λ(x) → ∃!y R<sub>E</sub>(x, y))),
that the multiplicity (cardinality) for R<sub>L</sub> is 1:1, and that a perspective can be
identified by the combination of the criterion and type of granularity it adheres to
(∀x(π(x) → ∃!y, φ (R<sub>C</sub>(x, y) ∧ R<sub>G</sub>(x, φ))), where φ is a shorthand for any of the
eight types of granularity). The components of G enable one to represent
explicitly the distinction between, e.g., an is-a taxonomy of structural body parts
versus a partonomy of body parts ('is-a vs. part-of' is dealt with by τ and ρ,
and the 'human structural anatomy' by σ and π), and to manage various
distinct perspectives within one domain. There are three perspectives depicted in
Fig.1-B, each with at least three levels. As levels for the partonomic granular
perspective for humans, we have in Fig.1-B, e.g., λ<sub>2</sub> = Organ in π<sub>1</sub>, where its
contents are provided by the FMA by means of selecting the Organ entity type as
root and recursively querying for the taxonomic subtypes of organ (about 3000
entity types). The molecule levels in the three perspectives, on the other hand,
are not well covered by the FMA; therefore this gap is filled by other ontologies,
such as PRO and ChEBI in λ<sub>5</sub> in Fig.1-B. That is, the contents at each level
are intended to be (or assumed to be) complementary to one another, where
G provides the machinery for structured coordination and linking; hence,
one knows what is linked where and how, which then is, at least in theory, easy
to generate again when an ontology is developed independently and is updated
and re-connected at certain time intervals. Likewise, adding a new level filled with
another ontology can be done transparently and without disrupting the other
perspectives and levels. Thus, we have four principal components for a granulated
information system: the types of granularity [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], which link through τ to the second component, the granularity
framework as theory (Definition 1); an instantiation (model) of this
theory for a specific subject domain, such as the π<sub>i</sub>, λ<sub>i</sub> etc. shown in Fig.1-B;
and the data sources that are granulated (the GO, FMA, etc.).
      </p>
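      <p>The salient constraints above can be checked mechanically once a domain granularity framework is written down as data. The following is a minimal sketch in Python under stated assumptions: the class Perspective, the function check_framework, the level labels, and the value "type-x" are hypothetical names chosen for illustration, and only the Organ, Cell, and Molecule levels and the nrG type are taken from the text.</p>
<preformat>
```python
from dataclasses import dataclass, field


@dataclass
class Perspective:
    """One granular perspective with its identifying properties and ordered levels."""
    name: str
    criterion: str          # granulation criterion, linked via RC
    granularity_type: str   # type of granularity, linked via RG
    levels: list = field(default_factory=list)  # ordered from coarse to fine (RL)


def check_framework(perspectives):
    """Return a list of violated constraints; an empty list means none were found."""
    problems = []
    level_owner = {}   # level name to the perspective that contains it (RE)
    identities = {}    # (criterion, type) pair to the perspective it identifies
    for p in perspectives:
        # Each perspective must have at least two levels.
        if len(p.levels) in (0, 1):
            problems.append(f"{p.name}: fewer than two levels")
        # A level must be contained in exactly one perspective.
        for level in p.levels:
            if level in level_owner:
                problems.append(f"{level}: in {level_owner[level]} and {p.name}")
            level_owner[level] = p.name
        # Criterion plus type of granularity identify a perspective.
        key = (p.criterion, p.granularity_type)
        if key in identities:
            problems.append(f"{p.name}: same criterion and type as {identities[key]}")
        identities[key] = p.name
    return problems


# Hypothetical toy framework; "level1", "level4" and "type-x" are placeholders.
anatomy = Perspective("pi1", "human structural anatomy", "nrG",
                      ["pi1/level1", "pi1/Organ", "pi1/Cell", "pi1/level4", "pi1/Molecule"])
function = Perspective("pi2", "function", "type-x", ["pi2/level1", "pi2/level2"])
print(check_framework([anatomy, function]))  # prints []
```
</preformat>
      <p>Keeping the levels as an ordered list per perspective is one simple way to respect the 1:1 cardinality of the relation between adjacent levels; other encodings are equally possible.</p>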
    </sec>
    <sec id="sec-3">
      <title>Granular queries</title>
      <p>
        The next step is cross-ontology reasoning over such ontologies linked by
granularity, which already has been noted as a requirement [
        <xref ref-type="bibr" rid="ref15 ref7">7, 15</xref>
        ]. A major advantage
of having a granularity framework for coordinating the different (sections of)
ontologies is that one can reuse the underlying idea of querying conceptual models
[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], but where in this case it is not the conceptual model but the granularity
framework G (illustrated in Fig.1-B) that is used to structure and simplify the
granular queries. They share the notion that the goal of the query can remain
the same while its implementation differs, be it in different SQL versions [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] or
a Semantic Web ontology language. Given that ontologies can be represented
and saved in different languages and software, we define the types of granular
queries in a purpose-oriented way to ensure portability. The first group deals with
querying for perspectives and levels, the second with retrieving a level's contents,
and the third with conditional cross-granular queries and other auxiliary
queries to retrieve additional information. The functions are discussed in the
remainder of this section.
      </p>
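      <table-wrap>
        <table>
          <thead>
            <tr><th>Symbol</th><th>Meaning</th></tr>
          </thead>
          <tbody>
            <tr><td>P</td><td>Set of perspectives in a domain granularity framework, i.e., π<sub>1</sub>, …, π<sub>n</sub> ∈ P</td></tr>
            <tr><td>DL</td><td>Set of levels in a domain granularity framework, i.e., λ<sub>1</sub>, …, λ<sub>m</sub> ∈ DL</td></tr>
            <tr><td>L</td><td>Set of levels λ<sub>i</sub>, …, λ<sub>k</sub> (i, k ≤ m, i ≠ k) of a particular π<sub>i</sub></td></tr>
            <tr><td>l<sub>i</sub></td><td>Selected level λ<sub>i</sub>, output of the function selectL</td></tr>
            <tr><td>ls<sub>i</sub></td><td>Set of selected levels, output of the function selectLs</td></tr>
            <tr><td>lss<sub>i</sub></td><td>Set of selected levels, output of the function selectDL</td></tr>
            <tr><td>E</td><td>Collection of universals residing in a particular level λ<sub>i</sub></td></tr>
            <tr><td>I</td><td>Intersection of the contents of two levels λ<sub>i</sub> and λ<sub>j</sub> (with i ≠ j)</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Selection of one level. Selecting a level within a perspective is a four-step
process: retrieve the perspectives, select the desired perspective, retrieve the levels
in the selected perspective, and then select the desired level; subscripts denote
what is selected.</p>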
      <p>F1. Goal: retrieve all granular perspectives π<sub>1</sub>, …, π<sub>n</sub> contained in the domain
granularity framework d<sup>f</sup>. Input is a d<sup>f</sup> and output is an unordered set of
perspectives of that domain, P. Specification: getP : Δ<sup>f</sup> ↦ P (F.1)</p>
      <p>F2. Goal: select a granular perspective π<sub>i</sub> from the perspectives retrieved with
getP. Input is the set of perspectives of the domain, P, output is a set with one
perspective π<sub>i</sub> ∈ P. Specification: selectP = π<sub>i</sub> : P ↦ P (F.2)</p>
      <p>F3. Goal: retrieve all granular levels λ<sub>1</sub>, …, λ<sub>n</sub> contained in the selected
perspective π<sub>i</sub>. Input is a π<sub>i</sub> ∈ P and output is an ordered set of levels of that
perspective, L. Specification: getL : P ↦ L (F.3)</p>
      <p>F4. Goal: select a particular granular level λ<sub>i</sub> from the levels retrieved with the
getL function. Input is the set of levels of the perspective, L, output is a set, l<sub>i</sub>,
with a single element from L. Specification: selectL = λ<sub>i</sub> : L ↦ L (F.4)</p>
      <p>Although one could choose to add a function to select a level from DL (all
specified levels) only, this is prone to user mistakes; moreover, the above four functions can
more easily be reused for other selection and retrieval functions, and they can
be abstracted into one compound user-interface operation anyway.</p>
      <p>To select the Cell-level as declared in Fig.1-B for query Q2, one uses getP
to retrieve {π<sub>1</sub>, π<sub>2</sub>, π<sub>3</sub>}, selectP = π<sub>1</sub> to obtain the human structural anatomy
perspective, then getL to retrieve the levels of that perspective ({λ<sub>1</sub>, …, λ<sub>5</sub>}), and,
last, selectL = λ<sub>3</sub> for the Cell-level. The same procedure can be used for selecting
the Organ-level, but with the last step being selectL = λ<sub>2</sub>.</p>
      <p>Selection of multiple levels. Two options to select multiple levels are
distinguished: 1) selecting several levels to subsequently retrieve and combine their
contents for further processing, and 2) conditional selection that considers the
contents as well. Option 1 can be subdivided into two similar operations:
selecting more than one level from one perspective and selecting levels from different
perspectives. Both can be accomplished with a sequence of sub-functions. In
addition to those introduced in the previous section, we have:</p>
      <p>F5. Goal: select ≥ 1 granular levels contained in one granular perspective. Input
is the set of levels L retrieved with getL, output is a set of selected levels ls<sub>i</sub> from
L such that ls<sub>i</sub> ⊆ L holds. Specification: selectLs<sub>⋀λi</sub> : L ↦ L (F.5)</p>
      <p>F6. Goal: perform a selection of ≥ 1 levels from ≥ 1 perspectives. Input is the
set of perspectives, P, and for each perspective the set of levels, L, from which
one selects, and output is a set of levels contained in d<sup>f</sup>, DL, denoted with lss<sub>i</sub>,
where lss<sub>i</sub> ⊆ DL. Specification: selectDL<sub>⋀πi ⋀λi</sub> : P × L ↦ DL (F.6)</p>
      <p>Thus, selectL is extended for multi-level selection within one perspective, using
the binary operator ∧ to select multiple levels, which has been written in
shorthand notation, ⋀, such that ⋀λ<sub>i</sub> = λ<sub>1</sub> ∧ … ∧ λ<sub>k</sub> where π<sub>i</sub> contains j levels, k ≤ j,
and R<sub>E</sub>(λ<sub>i</sub>, π<sub>i</sub>). Note that ls<sub>i</sub> is not necessarily a proper subset of L, because
a user may want to select all levels in the chosen perspective. The one-level
selection is a special case of the multi-level one, but labelled differently to avoid
overloading terms.</p>
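      <p>Multi-level selection (F.5 and F.6) can be sketched in the same style. The sketch below assumes the same kind of toy dictionary as input and returns (perspective, level) pairs for F.6 so that levels from different perspectives stay distinguishable; the names select_ls and select_dl and the placeholder level labels are hypothetical.</p>
<preformat>
```python
# Toy framework; only Organ, Cell and Molecule are level names taken from the text.
FRAMEWORK = {
    "pi1": ["level1", "Organ", "Cell", "level4", "Molecule"],
    "pi2": ["levelA", "levelB", "levelC"],
}


def select_ls(levels, wanted):
    """F5: select one or more levels from the levels of one perspective."""
    missing = [w for w in wanted if w not in levels]
    if missing or not wanted:
        raise ValueError(f"invalid selection: {wanted}")
    # Keep the order of the perspective; selecting all levels is allowed.
    return [level for level in levels if level in wanted]


def select_dl(framework, choices):
    """F6: loop over the chosen perspectives and select levels in each of them."""
    selected = []
    for perspective, wanted in choices.items():
        for level in select_ls(framework[perspective], wanted):
            # Pair each level with its perspective to keep them apart.
            selected.append((perspective, level))
    return selected


print(select_ls(FRAMEWORK["pi1"], ["Cell", "Organ"]))
# prints ['Organ', 'Cell']
print(select_dl(FRAMEWORK, {"pi1": ["Molecule"], "pi2": ["levelC"]}))
# prints [('pi1', 'Molecule'), ('pi2', 'levelC')]
```
</preformat>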
      <p>
        For selection of more than one level from more than one perspective, selectLs
cannot simply be extended, because one
has to nest the selection of at least one other perspective and have some way to
distinguish levels belonging to different perspectives. A mapping of (F.6) to a
formal notation and implementation algorithm can be achieved with a loop in
two near-equivalent ways: either select all desired perspectives and
subsequently one or more levels for each selected perspective, or repeat the two-step
process of selecting a perspective and retrieving its levels, and then selecting levels.
      </p>
      <p>
        Retrieving the contents of a level. Retrieving the contents of a
particular granular level is conceptually straightforward with a getC function, but
this hides many details, in particular the need to use the structure of the level's
contents given the different types of granularity. This is elaborated on afterward.
      </p>
      <p>F7. Goal: retrieve the contents, i.e., entity types and their relations, of a selected
granular level λ<sub>i</sub>. Input is the selected level, where λ<sub>i</sub> ∈ L, and output is a set of
predicates, E ∈ ℰ, that takes into account the structure of the contents in the
level. Specification: getC : L ↦ ℰ (F.7)</p>
      <p>F8. Goal: intersect the retrieved contents of selected granular levels λ<sub>i</sub>, λ<sub>j</sub> (with
R<sub>E</sub>(λ<sub>i</sub>, π<sub>i</sub>), R<sub>E</sub>(λ<sub>j</sub>, π<sub>j</sub>), and i ≠ j). Input is the contents of the selected levels,
E<sub>i</sub> and E<sub>j</sub> (obtained with getC for each level), and output is the intersection,
a set I ∈ ℐ, where I ⊆ E. Specification: intersect : L × L ↦ ℐ (F.8)</p>
      <p>
        getC takes a particular granular level as argument and returns the contents of
that granular level, irrespective of how the contents themselves may be structured.
Without having represented explicitly which type of granularity is used for the
perspective and levels, one cannot know this structure other than by manually hardcoding
this information, which is laborious and error-prone. However, if we have, for
instance, two granular perspectives in which entity types
in finer-grained levels are always part of those residing in coarser levels, and for
both hierarchies the contents of a level is a taxonomy, such as types of organs and types
of molecular processes, then the mechanism of granulation is the same (just
applied to different subject domain information). One can exploit this sameness
in approach of granulation by identifying each principal granulation mechanism
and relating the mechanism to the levels and perspectives, so that one has to
define only once how the contents of a level should be retrieved. For the types
of organs and types of molecular processes, this amounts to a straightforward
recursive query. To achieve this efficiency, we can reuse the taxonomy of types
of granularity [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], i.e., a set of granulation mechanisms, and relate that to the
perspectives and levels. This is achieved by reusing R<sub>G</sub> and transforming it
into a function, tgL, which abbreviates "type of granularity that the
level adheres to"; thus tgL : L ↦ τ<sub>i</sub>, obtained from R<sub>G</sub>(x, φ), where R<sub>G</sub> has been typed
already such that λ(x) and φ → τ (φ is syntactic sugar for the eight leaf types
in the taxonomy of types of granularity). Thus, to retrieve a level's contents
with getC, the nested function tgL has to be used to query for the type of
granularity that a level adheres to (details can be found in section 4.2 of [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]).
In an implementation, this may well be an intermediate query or database view in
the ontology-stored-in-a-database, a method in a C++ program, and
so forth. For instance, to retrieve Macrophage (in Q2), we first consult the type
of granularity that λ<sub>3</sub> adheres to, i.e., tgL(λ<sub>3</sub>), which returns nrG, by which we
also obtain the type of granulation relation used, which is proper parthood (see
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] for details). Subsequently, the contents of level λ<sub>3</sub> are retrieved with getC(λ<sub>3</sub>)
and, by knowing the type of granularity (hence, also the mechanism required
for retrieval), it is ensured that a taxonomy is returned as query answer (in casu,
a taxonomy of cell types); to select the entity type of interest, Macrophage, the
selectE function will be introduced below.
      </p>
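      <p>The idea of defining the retrieval mechanism once per type of granularity, and letting getC look it up through tgL, can be sketched as a small dispatch table. The following Python sketch is illustrative only: the toy taxonomy, the dictionaries LEVEL_ROOT, LEVEL_TYPE and RETRIEVERS, and the function names are assumptions, and only the nrG case with a recursive taxonomy query is covered.</p>
<preformat>
```python
# Hypothetical toy data: direct taxonomic subtypes of each entity type.
SUBTYPES = {
    "Cell": ["Leukocyte", "Erythrocyte"],
    "Leukocyte": ["Macrophage"],
}
LEVEL_ROOT = {"lambda3": "Cell"}   # root entity type of each level's contents
LEVEL_TYPE = {"lambda3": "nrG"}    # type of granularity each level adheres to


def tg_l(level):
    """tgL: the type of granularity that the level adheres to."""
    return LEVEL_TYPE[level]


def taxonomy_below(root):
    """Recursively collect the root and all of its taxonomic subtypes."""
    found = [root]
    for child in SUBTYPES.get(root, []):
        found.extend(taxonomy_below(child))
    return found


# One retrieval mechanism per type of granularity; only one is sketched here.
RETRIEVERS = {"nrG": taxonomy_below}


def get_c(level):
    """getC: retrieve a level's contents with the mechanism chosen via tgL."""
    retrieve = RETRIEVERS[tg_l(level)]
    return retrieve(LEVEL_ROOT[level])


print(get_c("lambda3"))
# prints ['Cell', 'Leukocyte', 'Macrophage', 'Erythrocyte']
```
</preformat>
      <p>Adding another type of granularity then amounts to registering one more retrieval function, without touching get_c or the levels that already use an existing mechanism.</p>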
      <p>Once the content is retrieved, it can be used for further processing, such
as intersecting the contents of two levels. To this end, the intersect function (F.8)
is introduced, which first retrieves the contents of each level with getC and
subsequently intersects them. For instance, to answer a query like "retrieve those
molecules that are also hormones" we have, among others, the molecule Insulin in
the molecule-level λ<sub>5</sub> of π<sub>1</sub> in Fig.1-B but also in λ<sub>3</sub> of the function-perspective
π<sub>2</sub>, where it is categorised as a hormone. Obviously, this can be scaled up to the
intersection of more than two levels.</p>
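      <p>A minimal sketch of intersect (F.8), including the scaling to more than two levels, could look as follows. The level contents are deliberately tiny, incomplete toy data, and get_c is a stand-in lookup rather than the structured retrieval described above; apart from Insulin and the two level positions, all names are hypothetical.</p>
<preformat>
```python
# Toy contents per (perspective, level); incomplete on purpose.
CONTENTS = {
    ("pi1", "lambda5"): {"Insulin", "Glucose", "Water"},   # molecule level
    ("pi2", "lambda3"): {"Insulin", "Thyroxine"},          # hormone level
}


def get_c(level):
    """Stand-in for getC: look up the contents of one level."""
    return set(CONTENTS[level])


def intersect(first, *others):
    """F8: retrieve the contents of each level, then intersect them."""
    result = get_c(first)
    for level in others:
        result = result.intersection(get_c(level))
    return result


print(sorted(intersect(("pi1", "lambda5"), ("pi2", "lambda3"))))
# prints ['Insulin']
```
</preformat>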
      <p>
        Type selection to retrieve its levels. We introduce five functions to achieve
unambiguous selection of types (universal/class/concept) from an ontology and
to query for the level(s) a type resides in, which simplifies scalability and reusability.
      </p>
      <p>F9. Goal: select a type from the ontology (subject domain). Input is a type,
C ∈ d<sup>s</sup>, and the output of the selection ensures that the selected type resides in
some level λ<sub>i</sub> already, hence C ∈ E. Specification: selectE : Δ<sup>s</sup> ↦ E (F.9)</p>
      <p>F10. Goal: given a selected type C, retrieve the level λ<sub>i</sub> that C resides in. Input
is a type C ∈ d<sup>s</sup> and output of the function is a level that is a subset of L.
Specification: grain : Δ<sup>s</sup> ↦ L (F.10)</p>
      <p>F11. Goal: given a selected type C, retrieve all the levels λ<sub>1</sub>, …, λ<sub>j</sub> (in different
perspectives π<sub>1</sub>, …, π<sub>k</sub>, k = j) that C resides in. Input is a type C ∈ d<sup>s</sup>, and
output of the function is a set of levels that is a subset of DL. Specification:
grains : Δ<sup>s</sup> ↦ DL (F.11)</p>
      <p>F12. Goal: given multiple selected types C<sub>1</sub>, …, C<sub>n</sub>, retrieve their levels λ<sub>1</sub>, …,
λ<sub>j</sub> within one perspective π<sub>i</sub>. Input is the types C<sub>1</sub>, …, C<sub>n</sub> ∈ d<sup>s</sup>, the function uses grain as
nested function, and output of the function is a set of levels within one
perspective (a subset of L). Specification: grainM : Δ<sup>s</sup> ↦ L (F.12)</p>
      <p>F13. Goal: given multiple selected types C<sub>1</sub>, …, C<sub>n</sub>, retrieve their levels λ<sub>1</sub>, …,
λ<sub>j</sub> among multiple perspectives π<sub>1</sub>, …, π<sub>k</sub>. Input is the types C<sub>1</sub>, …, C<sub>n</sub> ∈ d<sup>s</sup>, the function uses
grains as nested function, and output of the function is a set of levels as subset
of DL. Specification: grainsM : Δ<sup>s</sup> ↦ DL (F.13)</p>
      <p>
        The basic function to retrieve the level an entity resides in is grain. Its neat
simplicity, however, does not suffice [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The main limitations are that it is
perspective-unaware and is based on the assumption that any entity type C<sub>i</sub> can
be categorised in only one level in the whole granularity framework. Although it
is possible that a particular granularity framework contains only one perspective, or that one constrains
it to that [
        <xref ref-type="bibr" rid="ref17 ref4">17, 4</xref>
        ], it is more realistic that multiple perspectives have been declared
and that C<sub>i</sub> is located in more than one granular level across perspectives [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
To remedy this, two scalable and reusable options are at our disposal. First, one
can restrict usage to one limited case: if one knows which perspective to search,
one can construct a rule like "if π<sub>i</sub> then do grain(x) = λ<sub>i</sub>" or precede it with
the selection operator selectP. The same approach can be used to decompose
the retrieval of multiple levels into sequential steps of the grain function, but
this requires additional process management. Second, one can define a new function
that retrieves all levels the selected type resides in, i.e., grains (F.11).
      </p>
      <p>
        In a more complex subject domain than the OBO Foundry setting depicted
in Fig.1, such as human infectious diseases that could rely on the granulations of
the OBO Foundry, we can then retrieve all levels of, say, Cholera toxin, with the
grains function and obtain {π<sub>2</sub>λ<sub>9</sub>, π<sub>6</sub>λ<sub>2</sub>}; that is, using the granulation from
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the Cholera toxin resides in the Molecule-level (from the structural component
perspective π<sub>2</sub>) as well as in the Inhibitor-level (from π<sub>6</sub>, a function the molecule
has in the Second Messenger System). The four functions (F.10-F.13) address all
permitted options to find the level(s) of entities (types). Other combinations,
such as one type in one perspective residing in multiple levels, violate the
constraints of granularity framework G. For instance, if a user had allocated Nisin
(a type of bacteriocin) both in the Peptide (λ<sub>k</sub>) and in the Quaternary protein
structures (λ<sub>i</sub>, where i &lt; k) levels within the same perspective, then something
is wrong about the knowledge of what Nisin is, the user is confused, the domain
granularity framework has been ill-designed, or all of them, because Nisin
cannot be both a peptide and a complex protein. The constraint to have an entity
in no more than one level within the same perspective should have prevented
this double allocation, or have returned an error upon checking the granulated
system for consistency.
      </p>
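      <p>The difference between grain and grains, and the consistency check that should catch a double allocation such as the Nisin case, can be sketched as follows. The allocation table is hypothetical toy data loosely based on the examples above (the perspective chosen for Nisin is an assumption), and the function names are illustrative.</p>
<preformat>
```python
# Hypothetical allocations of types to (perspective, level) pairs.
ALLOCATION = {
    "Cholera toxin": [("pi2", "Molecule"), ("pi6", "Inhibitor")],
    "Nisin": [("pi2", "Peptide"), ("pi2", "Quaternary protein structure")],
}


def grains(entity):
    """F11: all levels, across perspectives, that the type resides in."""
    return list(ALLOCATION.get(entity, []))


def grain(entity, perspective):
    """F10 made perspective-aware: the level of the type in one perspective."""
    found = [level for p, level in grains(entity) if p == perspective]
    if len(found) != 1:
        raise ValueError(f"{entity}: {len(found)} levels in {perspective}")
    return found[0]


def double_allocations(allocation):
    """Report types placed in more than one level of the same perspective."""
    errors = {}
    for entity, placements in allocation.items():
        seen = set()
        for perspective, _level in placements:
            if perspective in seen:
                errors.setdefault(entity, set()).add(perspective)
            seen.add(perspective)
    return errors


print(grains("Cholera toxin"))
# prints [('pi2', 'Molecule'), ('pi6', 'Inhibitor')]
print(grain("Cholera toxin", "pi6"))   # prints Inhibitor
print(double_allocations(ALLOCATION))  # prints {'Nisin': {'pi2'}}
```
</preformat>
      <p>Running such a check when the granulated system is loaded or updated is one way to return the error mentioned above; enforcing the constraint at insertion time would be an alternative design.</p>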
      <p>
        Cross-granular queries and other functions. The functions (F.1-F.13)
introduced in the preceding subsections enable formulation of more complex
queries, such as Q1-Q3 in section 2. It is possible to define many more
functions for a granularity framework, also because, in principle, the same approach
could be taken for knowledge bases and, among others, ontology-enhanced Data
WareHouses (DWH) or modularized ontology-driven information systems. DWH
implementations in particular focus on querying the information system with
advanced queries. Functions for such queries can be easily added. For instance,
with P still the set of perspectives, and adding C as the set of criteria,
crit : P ↦ C enables us to retrieve the criterion of a granular perspective and
tgP : P ↦ τ the type of granularity of a perspective, and plain
functions to query the entity types and relations of the ontology proper are also
still possible [
        <xref ref-type="bibr" rid="ref13 ref18">18, 13</xref>
        ], either independently or together with the granularity
framework by using a granulation relation ρ. Moreover, one can define functions
to retrieve each component of the granularity framework G, and the more
comprehensive the representation of granularity components, the more versatile and
well-founded the queries one can pose over a G. There are limitations,
however, in particular if the ontology is represented in a common DL-based
Semantic Web ontology language. Most notably, we can then neither represent and
query over the weak entity type that is λ nor combine querying for τ and ρ in
one go, due to known undecidability results with the role composition operator
[
        <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
        ]. The same holds if one would want to introduce a newly defined path has<sub>ρ</sub>,
composed from R<sub>G</sub> and uses (has<sub>ρ</sub> = R<sub>G</sub> ∘ uses), to retrieve the granulation relation of the
entity types residing in the levels in a particular perspective (e.g., π<sub>1</sub> in Fig.1-B
uses proper parthood as ρ). Implementing the coordinated modularization with
granularity and such advanced querying for ontologies stored in a database is, of
course, an option. On the other hand, given the current state of granulation in
bio-ontologies, these issues are not of great concern yet. It will be the
modeling exercise of representing a particular granularity framework meaningfully
and in a structured way that will be of most use to organise the relatively large
number of bio-ontologies at this stage. In addition, this eases transparent
coordination of the distributed development of the yet-to-be-developed ontologies that
fill the 'gaps' in the granulation hierarchies, as well as maintenance of the existing
ontologies (often also developed in a distributed fashion) on a long-term basis.
      </p>
      <p>
        Looking at prospects for immediate use, the functions obviously could be
hard-coded and pre-computed, as is customary in bioinformatics, be it as
successive steps or also with additional compound queries. However, with the
granularity framework, one has a management structure in place so that (i) queries
can be executed on demand to suit the user, as opposed to being limited to
the imagination of the software developer, and (ii) F.1-F.13 are defined in an
implementation-independent and reusable way, so that the practicalities of a
software system can be hidden from the domain experts. This avoids burdening
them with learning yet another query language, re-writing the whole query for each
permutation (e.g., macrophages in tissues), and keeping in mind which levels
there are, as that knowledge can now be added to the information system. At present,
Q1-Q3 can be performed by standard relational database management systems
(RDBMS), such as the FMA in PostgreSQL [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] with queries in StruQL. This
requires more effort for OWL ontologies due to traversal over a path of an
arbitrary but finite number of DL-roles for Q2 [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
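      <p>
        As a rough illustration of the kind of traversal over a path of arbitrary but finite
length mentioned above, the following minimal sketch follows a parthood relation upward in a
relational store. It is not the StruQL/PostgreSQL setup of the cited work: it swaps in a
recursive common table expression in SQLite through Python's standard sqlite3 module. The
table name part_of, the function ancestors_via_part_of, and the toy rows are all hypothetical
and are not taken from the FMA or any other ontology.
      </p>
      <preformat>
```python
import sqlite3


def ancestors_via_part_of(rows, start):
    """Return, sorted, every entity reachable from `start` via part_of edges.

    `rows` is a list of (part, whole) pairs; the schema is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part_of (part TEXT, whole TEXT)")
    conn.executemany("INSERT INTO part_of VALUES (?, ?)", rows)
    # UNION (not UNION ALL) discards rows already found, so the recursion
    # also terminates if the stored relation accidentally contains a cycle.
    query = """
        WITH RECURSIVE reach(entity) AS (
            SELECT whole FROM part_of WHERE part = ?
            UNION
            SELECT p.whole FROM part_of AS p JOIN reach AS r ON p.part = r.entity
        )
        SELECT entity FROM reach ORDER BY entity
    """
    result = [row[0] for row in conn.execute(query, (start,))]
    conn.close()
    return result


# Toy data for illustration only (an assumption, not real ontology content).
TOY_PART_OF = [
    ("nucleus", "macrophage"),
    ("macrophage", "tissue"),
    ("tissue", "organ"),
    ("organ", "organism"),
]

print(ancestors_via_part_of(TOY_PART_OF, "macrophage"))
```
      </preformat>
      <p>
        The path length is not fixed in the query text, which is the point of the sketch: the
same statement serves whether the stored hierarchy has two levels or twenty.
      </p>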
    </sec>
    <sec id="sec-4">
      <title>Conclusions</title>
      <p>
        To address the problems of both structured coordination of linked and
modularised ontologies and querying a large dynamic ontology system such as that proposed
by the OBO Foundry, we proposed a basic granularity framework and a set of
functions to query such a very large granulated system. The granularity
framework enforces a constrained modularization so that dividing a large body of
represented information as well as re-linking the pieces is amenable to
automation. The proposed functions enhance cross-granular querying in a transparent and
scalable way, because they rely on the unambiguous management layer provided
by the granularity framework, and are reusable for ontologies represented and
stored in different formats. We are currently working on transforming the theory
of granularity [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] to a practically usable OWL version to ease experimentation
in a real-life setting with the many available bio-ontologies.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. OMG:
          <article-title>Superstructure specification</article-title>
          .
          <source>Standard 2.1.2</source>
          , Object Management Group (
          <year>2007</year>
          ) http://www.omg.org/spec/UML/2.1.2/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Cuenca</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Sirin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Kalyanpur</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          :
          <article-title>Modularity and web ontologies</article-title>
          .
          <source>In: Proc. of KR'06</source>
          . (
          <year>2006</year>
          )
          <article-title>Lake District</article-title>
          , UK.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Cuenca</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            ,
            <surname>Kazakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Sattler</surname>
          </string-name>
          ,
          <string-name>
            <surname>U.</surname>
          </string-name>
          :
          <article-title>Just the right amount: Extracting modules from ontologies</article-title>
          .
          <source>In: Proc. of WWW-2007</source>
          .
          <article-title>(2007) Canada</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Novotny</surname>
          </string-name>
          , D.D.:
          <article-title>Biomedical informatics and granularity</article-title>
          .
          <source>Comparative and Functional Genomics</source>
          <volume>5</volume>
          (
          <issue>6</issue>
          -7) (
          <year>2005</year>
          )
          <volume>501</volume>
          -
          <fpage>508</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Keet</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Applying partitions to infectious diseases</article-title>
          . In Engelbrecht, R.,
          <string-name>
            <surname>Geissbuhler</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lovis</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mihalas</surname>
          </string-name>
          , G., eds.:
          <article-title>Connecting Medical Informatics and bio-informatics (MIE2005)</article-title>
          , Amsterdam: IOS Press (
          <year>2005</year>
          )
          <volume>1236</volume>
          -1241 Geneva, Switzerland,
          <fpage>28</fpage>
          -
          <issue>31</issue>
          <year>August</year>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Tange</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schouten</surname>
            ,
            <given-names>H.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kester</surname>
            ,
            <given-names>A.D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The granularity of medical narratives and its effect on the speed and completeness of information retrieval</article-title>
          .
          <source>J. Am. Med</source>
          . Inf. Assoc.
          <volume>5</volume>
          (
          <issue>6</issue>
          ) (
          <year>1998</year>
          )
          <volume>571</volume>
          -
          <fpage>582</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bard</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bug</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldberg</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eilbeck</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ireland</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , OBI Consortium,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Leontis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Rocca-Serra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Ruttenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Sansone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.A.</given-names>
            ,
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Whetzel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <surname>S.:</surname>
          </string-name>
          <article-title>The OBO Foundry: Coordinated evolution of ontologies to support biomedical data integration</article-title>
          .
          <source>Nature Biotechnology</source>
          <volume>25</volume>
          (
          <issue>11</issue>
          ) (
          <year>2007</year>
          )
          <volume>1251</volume>
          -
          <fpage>1255</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8. : OBO Foundry (
          <year>2006</year>
          ) http://obofoundry.org.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mejino</surname>
            <given-names>Jr</given-names>
          </string-name>
          ,
          <string-name>
            <surname>J.V.</surname>
          </string-name>
          :
          <article-title>A reference ontology for biomedical informatics: the foundational model of anatomy</article-title>
          .
          <source>J. of Biomedical Informatics</source>
          <volume>36</volume>
          (
          <issue>6</issue>
          ) (
          <year>2003</year>
          )
          <volume>478</volume>
          -
          <fpage>500</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. :
          <article-title>Common anatomy reference ontology (</article-title>
          <year>2006</year>
          ) http://www.bioontology.org/wiki/index.php/CARO:Main_Page.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bouquet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Euzenat</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franconi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serafini</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamou</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tessaris</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Specification of a common framework for characterizing alignment</article-title>
          .
          <source>Technical report, KnowledgeWeb Deliverable D2.2.1, v1.2</source>
          ,
          <issue>3</issue>
          -
          <fpage>8</fpage>
          -
          <fpage>2004</fpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>Combining and relating ontologies: Problems and solutions</article-title>
          . In: IJCAI Workshop on Ontologies. (
          <year>2001</year>
          ) Seattle, USA.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Keet</surname>
            ,
            <given-names>C.M.:</given-names>
          </string-name>
          <article-title>A Formal Theory of Granularity</article-title>
          .
          <source>PhD thesis</source>
          , KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano,
          <source>Italy</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Keet</surname>
            ,
            <given-names>C.M.:</given-names>
          </string-name>
          <article-title>A taxonomy of types of granularity</article-title>
          .
          <source>In: Proc. of GrC2006</source>
          . IEEE Computer Society (
          <year>2006</year>
          )
          <volume>106</volume>
          -111 Atlanta, USA, May
          <volume>10</volume>
          -12
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Keet</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roos</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marshall</surname>
            ,
            <given-names>M.S.:</given-names>
          </string-name>
          <article-title>A survey of requirements for automated reasoning services for bio-ontologies in OWL</article-title>
          .
          <source>In: Proc. of OWLED2007</source>
          . Volume
          <volume>258</volume>
          of CEUR-WS. (
          <year>2007</year>
          )
          <fpage>6</fpage>
          -
          <lpage>7</lpage>
          June 2007, Innsbruck, Austria.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Bloesch</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halpin</surname>
            ,
            <given-names>T.A.</given-names>
          </string-name>
          :
          <article-title>Conceptual Queries using ConQuer-II</article-title>
          .
          <source>In: Proc. of ER'97</source>
          . Volume 1331 of LNCS., Springer (
          <year>1997</year>
          )
          <volume>113</volume>
          -
          <fpage>126</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Bittner</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A Theory of Granular Partitions</article-title>
          .
          <source>In: Foundations of Geographic Information Science</source>
          . London: Taylor &amp; Francis
          <string-name>
            <surname>Books</surname>
          </string-name>
          (
          <year>2003</year>
          )
          <volume>117</volume>
          -
          <fpage>151</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Keet</surname>
            ,
            <given-names>C.M.:</given-names>
          </string-name>
          <article-title>Enhancing comprehension of ontologies and conceptual models through abstractions</article-title>
          . In Basili, R.,
          <string-name>
            <surname>Pazienza</surname>
          </string-name>
          , M., eds.
          <source>: Proc. of AI*IA 2007</source>
          .
          <article-title>Volume 4733 of LNAI</article-title>
          ., Springer Verlag (
          <year>2007</year>
          )
          <volume>814</volume>
          -822 Rome, September 10-
          <issue>13</issue>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Schmidt-Schauss</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>Subsumption in KL-ONE is undecidable</article-title>
          .
          <source>In: Proc. of KR'89</source>
          . (
          <year>1989</year>
          )
          <volume>421</volume>
          -
          <fpage>431</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Wessel</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>Obstacles on the way to qualitative spatial reasoning with description logics: some undecidability results</article-title>
          . In Goble,
          <string-name>
            <given-names>C.A.</given-names>
            ,
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.L.</given-names>
            , Moller, R.,
            <surname>Patel-Schneider</surname>
          </string-name>
          , P.F., eds.
          <source>: Proc. of DL'01. Volume 49 of CEUR WS</source>
          . (
          <year>2001</year>
          ) Stanford, CA, USA,
          <year>August</year>
          1-
          <issue>3</issue>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Mork</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brinkley</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>OQAFMA querying agent for the foundational model of anatomy: a prototype for providing flexible and efficient access to large semantic networks</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>36</volume>
          (
          <issue>6</issue>
          ) (
          <year>2003</year>
          )
          <volume>501</volume>
          -
          <fpage>517</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Keet</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          :
          <article-title>Granular information retrieval from the Gene Ontology and from the Foundational Model of Anatomy with OQAFMA</article-title>
          .
          <source>KRDB Research Centre Technical Report KRDB06-1</source>
          , Free University of Bozen-Bolzano,
          <source>Italy</source>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>