=Paper= {{Paper |id=Vol-3603/Paper8 |storemode=property |title=Concrete Names for Complex Expressions in Ontologies: A Survey of Biomedical Ontologies |pdfUrl=https://ceur-ws.org/Vol-3603/Paper8.pdf |volume=Vol-3603 |authors=Christian Kindermann,Martin Georg Skjæveland |dblpUrl=https://dblp.org/rec/conf/icbo/KindermannS23 }} ==Concrete Names for Complex Expressions in Ontologies: A Survey of Biomedical Ontologies== https://ceur-ws.org/Vol-3603/Paper8.pdf
                                Concrete Names for Complex Expressions in
                                Ontologies: A Survey of Biomedical Ontologies
                                Christian Kindermann¹, Martin Georg Skjæveland²
                                ¹ Stanford University, 450 Serra Mall, Stanford, USA
                                ² University of Oslo, Problemveien 7, 0315 Oslo, Norway


                                                                         Abstract
                                                                         The representation of an entity in an ontology may require complex expressions to capture all of its
                                                                         relevant characteristics. If an entity can be defined based on its characteristics, then its definition can
                                                                         be explicitly stated in most knowledge representation languages, such as the Web Ontology Language
                                                                         (OWL). Specifically, a domain-specific entity can be identified by a name in an ontology, which can be
                                                                         declared to be logically equivalent to a more complex expression. This not only fixes the meaning of the
                                                                         entity in the ontology but also allows its name to replace the more complex expression throughout the
                                                                         ontology. Consistently using concise and informative names for domain-specific entities in an ontology
                                                                         can arguably enhance ontology comprehension, maintenance, and usability in practice. This raises the
                                                                         question of the extent to which entities represented in ontologies are associated with concrete names
                                                                         and how such names are used.
                                                                             In this paper, we analyze how often named classes in OWL ontologies are defined as logically
                                                                         equivalent to complex expressions. We investigate whether such named classes are consistently used
                                                                         whenever possible and whether they are associated with labels intended for human understanding. Our
                                                                         findings indicate that complex class expressions are frequently declared to be equivalent to named classes
                                                                         in ontologies, and that such named classes are linked to human readable labels. While there seems to be
                                                                         a tendency to encourage the reuse of these names, we also observe a notable number of instances where
                                                                         such named classes are not consistently reused despite being defined.

                                                                         Keywords
                                                                         Ontology Engineering, Biomedical Ontology, Web Ontology Language, OWL




                                1. Introduction
                                The representation of an entity in an ontology typically involves statements about the entity’s
                                characteristics. When an entity can be defined based on its characteristics, the definition may
                                include an informative name by which the entity can be referred to. Specifically, an entity’s
                                name may be used instead of its more complex definitional description. Despite the potential
                                benefits of consistently using concise and informative names whenever possible, it has been
                                observed that this practice is not always followed in published ontologies. To illustrate this, we
                                revisit a concrete example taken from the Galen ontology, which was originally presented by
                                Nikitina and Koopmann [1]. Here, the medical concept Clotting is represented as follows:


                                Proceedings of the International Conference on Biomedical Ontologies 2023, August 28th-September 1st, 2023, Brasilia, Brazil
                                EMAIL: christian.kindermann@stanford.edu (C. Kindermann); martige@ifi.uio.no (M. G. Skjæveland)
                                ORCID: 0000-0001-7818-3466 (C. Kindermann); 0000-0002-9736-8316 (M. G. Skjæveland)
                                                                       © 2023 Copyright for this paper by its authors.
                                                                       Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073




         Clotting ≡ ∃ actsSpecificallyOn.(Blood
                         ⊓ ∃ hasPhysicalState.(PhysicalState ⊓ ∃ hasState.Liquid))
                    ⊓ ∃ hasOutcome.SolidBlood
  This axiom is arguably complex due to both its size and the nesting of expressions. However,
Galen also contains the following axioms:

                   LiquidBlood ≡ Blood ⊓ ∃ hasPhysicalState.LiquidState
                   LiquidState ≡ PhysicalState ⊓ ∃ hasState.Liquid

Given these equivalences, the named concept LiquidBlood can be used to simplify the
representation of Clotting to

                         Clotting ≡ ∃ actsSpecificallyOn.LiquidBlood
                                    ⊓ ∃ hasOutcome.SolidBlood

   The latter representation of Clotting is arguably easier to read, comprehend, and maintain.
This observation raises questions about the frequency of defining concrete names for complex
expressions, the consistency of using such names throughout an ontology, and to what extent the
use of names simplifies the definition of more complex concepts. The contributions presented in
this paper are as follows: (i) we propose an approach for identifying named classes with logical
definitions in ontologies, (ii) we develop techniques for quantifying the use and lack of reuse of
such named classes, and (iii) we use these techniques to conduct an empirical investigation on a
large and complex corpus of ontologies in the biomedical domain to shed light on the use of
such names in real-world ontologies.
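
   The simplification of Clotting shown above can be mimicked by a purely syntactic substitution.
The following is a minimal sketch and not the method used in this paper: the tuple encoding of
class expressions and the helper name `rewrite` are assumptions made for illustration, and the
matching is literal, so it does not recognise expressions that are only logically equivalent
(for example, conjuncts in a different order).

```python
# Assumed encoding (not from the paper):
#   "Blood"                    a named class
#   ("and", e1, e2, ...)       intersection of class expressions
#   ("some", "prop", filler)   existential restriction

def rewrite(expr, definitions):
    """Replace, bottom-up, every subexpression that equals a defining expression by its name."""
    if isinstance(expr, tuple):
        if expr[0] == "and":
            expr = ("and",) + tuple(rewrite(e, definitions) for e in expr[1:])
        elif expr[0] == "some":
            expr = ("some", expr[1], rewrite(expr[2], definitions))
    # After the children have been rewritten, try to replace the expression itself.
    return definitions.get(expr, expr)

# Defining expressions of LiquidState and LiquidBlood, mapped to their names.
definitions = {
    ("and", "PhysicalState", ("some", "hasState", "Liquid")): "LiquidState",
    ("and", "Blood", ("some", "hasPhysicalState", "LiquidState")): "LiquidBlood",
}

# Right-hand side of the original Clotting axiom.
clotting = (
    "and",
    ("some", "actsSpecificallyOn",
        ("and", "Blood",
            ("some", "hasPhysicalState",
                ("and", "PhysicalState", ("some", "hasState", "Liquid"))))),
    ("some", "hasOutcome", "SolidBlood"),
)

simplified = rewrite(clotting, definitions)
print(simplified)
```

Because the substitution proceeds bottom-up, LiquidState is introduced first, which in turn
makes the defining expression of LiquidBlood match.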


2. Preliminaries
We assume the reader to be familiar with OWL [2] and only fix some terminology. Let 𝑁𝐶 , 𝑁𝐼 ,
and 𝑁𝑃 be sets of class names, individual names, and property names. A class is either a class
name or a complex class built using OWL class constructors. We will use ⊤ and ⊥ to denote
owl:Thing and owl:Nothing respectively. We use both OWL Functional Style Syntax [3] and
Manchester Syntax [4] to write OWL axioms. An ontology is a set of axioms and we write
𝒪 |= 𝛼 to denote that the ontology 𝒪 entails the axiom 𝛼. An axiom 𝛼 is explicit in 𝒪 if
𝛼 ∈ 𝒪, and implicit if 𝛼 ∉ 𝒪 but 𝒪 |= 𝛼. An OWL expression 𝑒 occurs in 𝒪 if 𝑒 is used as a
subexpression within an explicit axiom in 𝒪.


3. Abbreviations in Ontologies
The Oxford English Dictionary defines the word abbreviation to denote “[t]he result of shortening
something; an abbreviated or condensed form, esp. of a text; a summary, an abridgement” [5].
So, we define an abbreviation for a complex OWL expression in terms of an equivalent named
class. More formally, let A be a named class and C be a complex class expression. Then A is an




          𝛼1 = SpicyPizza EquivalentTo Pizza and
                             (hasTopping some (PizzaTopping and (hasSpiciness some Hot)))
          𝛼2 = SpicyTopping EquivalentTo PizzaTopping and (hasSpiciness some Hot)
          𝛼3 = SpicyTopping EquivalentTo HotTopping
          𝛼4 = DiavolaPizza SubClassOf SpicyPizza
          𝛼5 = DiavolaPizza SubClassOf Pizza and hasCountryOfOrigin value Italy
          𝛼6 = NapoletanaPizza SubClassOf Pizza and hasCountryOfOrigin value Italy

Figure 1: Example of abbreviations and synonyms in a sample ontology. The named class SpicyPizza is
an abbreviation. The classes SpicyTopping and HotTopping are synonyms.


abbreviation for C in an ontology 𝒪, if 𝒪 |= EquivalentClasses(A, C). We will refer to the
EquivalentClasses axiom as the definition of the abbreviation A.
   A complex OWL expression can be equivalent to more than just one named class. We refer
to equivalent named classes as synonyms.¹ In particular, a synonym for a named class N in
an ontology 𝒪 is a named class S s.t. 𝒪 |= EquivalentClasses(S, N) and we will refer to the
EquivalentClasses axiom as the synonym’s definition. Please note that synonyms are not
necessarily abbreviations. However, a synonym for an abbreviation is also an abbreviation (due
to transitivity of EquivalentClasses).
   Both abbreviations and synonyms are notions based on entailment, i.e., on the entailment of an
EquivalentClasses axiom with exactly two arguments. However, OWL specifies EquivalentClasses
as an 𝑛-ary constructor. So, for the purpose of analyzing how abbreviations and synonyms are
specified in ontologies, we introduce the notion of syntactic definition types for both
abbreviations and synonyms. In particular, axioms of the form EquivalentClasses(A, C) and
EquivalentClasses(S, A) will be referred to as simple definitions for the abbreviation A and
the synonym S respectively. An axiom of the form EquivalentClasses(A, C1, ..., C𝑛), where
C1, ..., C𝑛 are complex class expressions, is an ambiguous definition of A. An axiom of the form
EquivalentClasses(S1, ..., S𝑚) is an enumerative definition for the synonyms S1, ..., S𝑚.
And lastly, an axiom of the form EquivalentClasses(S1, ..., S𝑚, C1, ..., C𝑛) will be referred
to as a compound definition for S1, ..., S𝑚, which are both synonyms and abbreviations.
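
   The definition types can be told apart by counting the named and the complex arguments of an
EquivalentClasses axiom. The sketch below illustrates this under stated assumptions: named
classes are encoded as strings and complex class expressions as tuples, the function name
`definition_type` is hypothetical, and the boundary cases (at least three names for an
enumerative definition, at least two names and one complex expression for a compound
definition) are one reading of the definitions above rather than something the text fixes.

```python
def definition_type(args):
    """Classify the argument list of a single EquivalentClasses axiom.

    Assumed encoding: a named class is a str, a complex class expression is a tuple.
    """
    named = [a for a in args if isinstance(a, str)]
    complex_exprs = [a for a in args if not isinstance(a, str)]

    if len(named) == 1 and len(complex_exprs) == 1:
        return "simple abbreviation"      # EquivalentClasses(A, C)
    if len(named) == 2 and not complex_exprs:
        return "simple synonym"           # EquivalentClasses(S, A)
    if len(named) == 1 and len(complex_exprs) >= 2:
        return "ambiguous abbreviation"   # EquivalentClasses(A, C1, ..., Cn)
    if len(named) >= 3 and not complex_exprs:
        return "enumerative synonym"      # EquivalentClasses(S1, ..., Sm)
    if len(named) >= 2 and complex_exprs:
        return "compound"                 # EquivalentClasses(S1, ..., Sm, C1, ..., Cn)
    return "other"                        # e.g. no named class among the arguments


# Axioms 𝛼2 and 𝛼3 of Figure 1.
spicy = ("and", "PizzaTopping", ("some", "hasSpiciness", "Hot"))
print(definition_type(["SpicyTopping", spicy]))
print(definition_type(["SpicyTopping", "HotTopping"]))
```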
   With this notion of definition types, we can quantify how abbreviations and synonyms are
specified explicitly in an ontology. However, counting implicit definitions of abbreviations
and synonyms is not as straightforward, as extracting finite sets of entailments is a non-trivial
matter [7]. We will delve into the determination and counting of implicit abbreviations and
synonyms in more detail in Section 4. Before that, though, we address the more obvious question
of how abbreviations and synonyms are used in an ontology.
   Consider the example ontology 𝒪𝐸𝑥 shown in Figure 1. Here, the abbreviation SpicyPizza is
specified via a simple definition in axiom 𝛼1 and occurs on the right-hand side of 𝛼4 . So, we say
an abbreviation is used if it occurs in an OWL axiom that is not its definition. In addition to
the use of an abbreviation, we can also determine if an abbreviation is not used even though its
use would be possible. We refer to such a case as an abbreviation’s possible use. For example,

¹ The Oxford English Dictionary defines the word synonym to denote “Strictly, a word having the same sense as
    another (in the same language); [. . .]” [6].




consider axiom 𝛼1 ∈ 𝒪𝐸𝑥 . Here, the abbreviation SpicyTopping (and its synonym HotTopping)
has possible uses since the complex OWL expression PizzaTopping and (hasSpiciness some Hot)
could be replaced by either SpicyTopping or HotTopping.
  With the notions of an abbreviation’s use and possible use, we can quantify the impact of
abbreviations in an ontology. Before we do so, we come back to the topic of determining both
explicit and implicit definitions of abbreviations and synonyms in an ontology.
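
   The notions of use and possible use can be illustrated on the ontology of Figure 1. The
following is a minimal sketch under assumptions: axioms and class expressions are encoded as
tuples, the helper names `subexpressions` and `occurrences` are hypothetical, and a possible
use is detected only when the defining expression occurs verbatim.

```python
def subexpressions(expr):
    """Yield expr and every class expression nested inside it."""
    yield expr
    if isinstance(expr, tuple):
        if expr[0] == "and":
            for e in expr[1:]:
                yield from subexpressions(e)
        elif expr[0] == "some":
            yield from subexpressions(expr[2])
        # ("value", prop, individual) contains no nested class expression

SPICY_TOPPING_BODY = ("and", "PizzaTopping", ("some", "hasSpiciness", "Hot"))
ITALIAN_PIZZA = ("and", "Pizza", ("value", "hasCountryOfOrigin", "Italy"))

# Axioms a1..a6 stand for 𝛼1..𝛼6 of Figure 1, encoded as (axiom type, left side, right side).
ontology = {
    "a1": ("equiv", "SpicyPizza", ("and", "Pizza", ("some", "hasTopping", SPICY_TOPPING_BODY))),
    "a2": ("equiv", "SpicyTopping", SPICY_TOPPING_BODY),
    "a3": ("equiv", "SpicyTopping", "HotTopping"),
    "a4": ("sub", "DiavolaPizza", "SpicyPizza"),
    "a5": ("sub", "DiavolaPizza", ITALIAN_PIZZA),
    "a6": ("sub", "NapoletanaPizza", ITALIAN_PIZZA),
}

def occurrences(target, excluded):
    """Ids of axioms, other than those in `excluded`, in which `target` occurs."""
    return sorted(
        ax_id
        for ax_id, (_, lhs, rhs) in ontology.items()
        if ax_id not in excluded
        and any(target == e for side in (lhs, rhs) for e in subexpressions(side))
    )

# Use: the name SpicyPizza occurs outside its definition a1.
print(occurrences("SpicyPizza", {"a1"}))
# Possible use: the defining expression of SpicyTopping occurs outside its definition a2.
print(occurrences(SPICY_TOPPING_BODY, {"a2"}))
```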


4. Determining Abbreviations and Synonyms
Explicit definitions for abbreviations can be easily determined by checking the syntactic shape
of all axioms in a given ontology. Similarly, implicit definitions can be determined by checking
𝒪 |= EquivalentClasses(A, C) for all pairs of named classes and complex classes occurring in
an ontology. However, this becomes impractical for large ontologies with numerous named and
complex classes.
   Instead, to determine implicit abbreviations, we build upon highly optimized implementations
of the standard reasoning service classification, i.e., computing all entailed SubClassOf and
EquivalentClasses axioms between named classes in an ontology [8, 9, 10]. We will refer to
this set as the inferred class hierarchy (ICH). The idea is to introduce an abbreviation for every
complex class expression that occurs in a given ontology, then to compute the ICH of the
ontology with these newly added abbreviations, and finally to read off all implicit abbreviations
from the ICH.
   More formally, for a given ontology 𝒪, we create the abbreviation ontology
         𝒪^𝐴 = 𝒪 ∪ {EquivalentClasses(A𝑖 , C𝑖 ) | C𝑖 occurs in 𝒪, A𝑖 does not occur in 𝒪}
and compute ICH(𝒪^𝐴). Since the ICH captures all SubClassOf and EquivalentClasses
relationships between named classes in an ontology, it is straightforward to identify all named
classes in 𝒪 that are equivalent to a newly introduced abbreviation A𝑖 in 𝒪^𝐴.
   We will demonstrate this procedure by way of example. Consider the ontology 𝒪𝐸𝑥 shown
in Figure 1. This ontology contains complex class expressions C1 , . . . , C6 as shown in Figure 2a.
Classifying the abbreviation ontology of 𝒪𝐸𝑥 and inspecting the ICH (see Figure 2b) reveals that,
for example, SpicyTopping is equivalent to A2 , which in turn is equivalent to C2 by construction.
So, SpicyTopping is an abbreviation for C2 in 𝒪𝐸𝑥 .
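
   The three steps of this procedure can be sketched for the running example as follows. This is
an illustration only: the tuple encoding is an assumption, no reasoner is invoked, and the
equivalence classes of the inferred class hierarchy are copied by hand from Figure 2b instead
of being computed. In the study itself, this classification step is performed by a reasoner.

```python
# Assumed encoding (not from the paper): named classes are strings,
# complex class expressions are tuples. The keys are the fresh names A1..A6
# introduced for the expressions C1..C6 of Figure 2a.
HOT = ("some", "hasSpiciness", "Hot")
C = {
    "A1": HOT,
    "A2": ("and", "PizzaTopping", HOT),
    "A3": ("some", "hasTopping", ("and", "PizzaTopping", HOT)),
    "A4": ("and", "Pizza", ("some", "hasTopping", ("and", "PizzaTopping", HOT))),
    "A5": ("value", "hasCountryOfOrigin", "Italy"),
    "A6": ("and", "Pizza", ("value", "hasCountryOfOrigin", "Italy")),
}

# Step 1: the axioms added to the example ontology to obtain its abbreviation ontology.
added_axioms = [("EquivalentClasses", name, expr) for name, expr in C.items()]

# Step 2 (not performed here): classify the abbreviation ontology with a reasoner.
# The equivalence classes below are transcribed from Figure 2b, not computed.
ich_equivalence_classes = [
    {"SpicyPizza", "A4"},
    {"HotTopping", "SpicyTopping", "A2"},
    {"A1"}, {"A3"}, {"A5"}, {"A6"},
]

# Step 3: read off abbreviations. Every original named class that shares an
# equivalence class with a fresh name Ai abbreviates the expression Ci.
abbreviations = {
    (cls, fresh)
    for group in ich_equivalence_classes
    for fresh in group if fresh in C
    for cls in group if cls not in C
}

for cls, fresh in sorted(abbreviations):
    print(cls, "abbreviates", C[fresh])
```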


5. Study Design and Materials
Before we investigate to what extent abbreviations and synonyms are defined, used, and not
used even though this would be possible, we first establish a baseline. This baseline aims to
determine whether named classes in ontologies are associated with concrete domain-specific
terms that are intended for human interpretation. Specifically, we assess the association of
named classes in ontologies with human-readable annotations specified via rdfs:label² and
obo:definition.³ Similarly, we establish a baseline for human-readable synonyms specified
² https://www.w3.org/TR/rdf-schema/#ch_label
³ http://purl.obolibrary.org/obo/IAO_0000115




      C1 = hasSpiciness some Hot
      C2 = PizzaTopping and (hasSpiciness some Hot)
      C3 = hasTopping some (PizzaTopping and (hasSpiciness some Hot))
      C4 = Pizza and (hasTopping some (PizzaTopping and (hasSpiciness some Hot)))
      C5 = hasCountryOfOrigin value Italy
      C6 = Pizza and hasCountryOfOrigin value Italy

                                    (a) Complex class expressions in 𝒪𝐸𝑥 .


      ⊤
      Hot    A3    Pizza    A5    PizzaTopping    A1
      SpicyPizza, A4    A6    HotTopping, SpicyTopping, A2
      DiavolaPizza    NapoletanaPizza

      (b) Visualisation of the ICH of the abbreviation ontology of 𝒪𝐸𝑥, without ⊥ (nodes listed level by level, top to bottom).

Figure 2: Determining implicit abbreviations in 𝒪𝐸𝑥 via the inferred class hierarchy of its abbreviation ontology.


via skos:altLabel,⁴ oio:hasExactSynonym, oio:hasNarrowSynonym, oio:hasBroadSynonym,
oio:hasRelatedSynonym,⁵ or obo:alternativeLabel.⁶
   We conduct our empirical investigation using ontologies indexed in BioPortal as of February
2023.⁷ The data set is created following the same approach as described by Matentzoglu and
Parsia [11] and includes a total of 785 ontologies. For orchestrating the empirical investigation,
we use the OWL API (v.5.1.15). We exclude ontologies that cannot be processed with the
OWL API. Additionally, we exclude ontologies that do not contain any class expression axioms
since such ontologies cannot contain abbreviations or synonyms. As a result of this procedure,
our study corpus consists of 744 ontologies.
   We group ontologies into three disjoint categories. First, ontologies that consist of atomic
axioms only, i.e., SubClassOf and EquivalentClasses axioms that have only named classes
as arguments. Second, ontologies expressible in ℰℒ++ , and third, ontologies not expressible
in ℰℒ++ . We refer to these three kinds of ontologies as atomic, ℰℒ++ , and rich ontologies
⁴ https://www.w3.org/2012/09/odrl/semantic/draft/doco/skos_altLabel.html
⁵ https://raw.githubusercontent.com/geneontology/go-ontology/master/contrib/oboInOwl#{hasExactSynonym, hasNarrowSynonym, hasBroadSynonym, hasRelatedSynonym}.
⁶ http://purl.obolibrary.org/obo/IAO_0000118
⁷ https://bioportal.bioontology.org/




[Figure 3 plot: number of axioms (log scale, 1 to 1 × 10^7) per ontology index (0 to 700; grouped by category Atomic, ℰℒ++, Rich and ordered by TBox size); series A and B.]

Figure 3: Size of (A) an ontology’s TBox and (B) the set of an ontology’s class expression axioms.


respectively. The study corpus contains 91 atomic ontologies, 88 ℰℒ++ ontologies, and 565 rich
ontologies. We order ontologies within a category by the size of their TBoxes and assign each
ontology an index in ascending order starting with atomic ontologies, then ℰℒ++ ontologies,
and finally rich ontologies. Figure 3 illustrates this indexing by showing a comparison between
the size of (A) an ontology’s TBox and (B) the subset of its class expression axioms.
   Using the reasoner Konclude (v0.7.0-1138), we successfully classified 714 ontologies (for the
purpose of determining implicit synonyms) and 656 abbreviation ontologies (for the purpose of
determining implicit abbreviations).


6. Results
Before presenting our results on abbreviations and synonyms (see Section 3) we report on the
use of annotation properties for specifying human-readable labels, definitions, and synonyms.
Table 1 illustrates the number of ontologies that offer human-readable annotations for varying
percentages of named classes.
   We find that rdfs:labels are available in many ontologies for large proportions of named
classes. For instance, 51 + 51 + 232 = 334 ontologies provide rdfs:labels for all named classes.
An additional 19 + 16 + 130 = 165 ontologies provide rdfs:labels for at least 90% of named
classes (but not 100%), so that (334 + 165)/744 ≈ 67% of ontologies provide rdfs:labels for at
least 90% of named classes. This provides strong evidence of the importance of human-readable
rdfs:labels for named classes representing domain-specific concepts in biomedical ontologies.
   We also find that obo:definitions are used in many ontologies. For example, 226 ontologies,
i.e., 226/744 ≈ 30%, provide obo:definitions for at least half of all named classes. While these
proportions are smaller compared to rdfs:labels, they are non-trivial and provide evidence
that obo:definitions play an important role in many biomedical ontologies.
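
   The figures quoted above follow directly from the rows of Table 1. The short computation
below reproduces them as a worked check. It assumes that the rows of the table are disjoint
bins, which is how the text above reads them ("at least 90% of named classes (but not 100%)"),
and the variable names are chosen for illustration only.

```python
# Rows of Table 1 as (atomic, EL++, rich) counts; only the rows needed here are listed.
rdfs_label = {"100": (51, 51, 232), ">=90": (19, 16, 130)}
obo_def = {"100": (4, 9, 11), ">=90": (5, 22, 57), ">=50": (4, 8, 106)}
CORPUS_SIZE = 744  # number of ontologies in the study corpus

# rdfs:label for all named classes, and for at least 90% but not all of them.
all_labelled = sum(rdfs_label["100"])
mostly_labelled = sum(rdfs_label[">=90"])
share_labelled = (all_labelled + mostly_labelled) / CORPUS_SIZE

# obo:definition for at least half of all named classes (bins 100, >=90, >=50).
half_defined = sum(sum(row) for row in obo_def.values())
share_defined = half_defined / CORPUS_SIZE

print(all_labelled, mostly_labelled, f"{share_labelled:.1%}")
print(half_defined, f"{share_defined:.1%}")
```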
   However, human-readable annotations for synonyms appear to be less common in biomedical
ontologies compared to rdfs:labels and obo:definitions. While there are a few ontologies




[Figure 4 plot: number of axioms (log scale, 1 to 1 × 10^6) per ontology index (0 to 700; grouped by category Atomic, ℰℒ++, Rich and ordered by size); series A to E.]

Figure 4: Number of (A) simple abbreviation definitions, (B) simple synonym definitions, (C) ambiguous
abbreviation definitions, (D) enumerative synonym definitions, and (E) compound definitions.


Table 1
Number of atomic (A), ℰℒ++ (E), and rich (R) ontologies that provide human-readable annotations for
different proportions of named classes.

                                                  Number of Ontologies
        rdfs:label     obo:def        obo:alt        skos:alt       oio:exact      oio:narrow     oio:broad      oio:related
   %     A   E    R     A   E    R     A   E    R     A   E    R     A   E    R     A   E    R     A   E    R     A   E    R
 100    51  51  232     4   9   11     -   -    -     1   1    -     2   -    1     -   -    -     -   -    -     -   -    -
≥ 90    19  16  130     5  22   57     -   -    1     5   1    4     -   -    1     -   -    -     -   -    1     -   -    1
≥ 50     4   5   61     4   8  106     -   -    5     1   1    7     3   3   30     -   -    1     -   -    -     2   4    6
≥ 20     -   2   19     3   2   31     -   -   23     -   5   13     5   7   67     -   -    5     -   -    1     2  12   42
 > 0     1   1   30     4   9   41     1   1   96     4   -   27    10  27   69     5   9   96     4  10   87    12  21  103
   0    16  13   93    71  38  319    90  87  440    80  80  514    71  51  397    86  79  463    87  78  476    75  51  413



that annotate at least 90% of named classes with synonyms, e.g., 12 in the case of skos:alt,
many ontologies do not provide such synonym annotations for named classes at all (see
last row in Table 1). This suggests that although annotations for synonyms are used in some
biomedical ontologies, they do not seem to hold the same level of importance as rdfs:labels
and obo:definitions for the most part.
   The last observation can also be made w.r.t. the logical notions of abbreviations and
synonyms. Figure 4 shows how many EquivalentClasses axioms fall under each syntactic definition
type for abbreviations or synonyms (see Section 3 for definition types). It becomes evident that
there are about twice as many ontologies in which abbreviations are (explicitly) defined compared
with ontologies in which synonyms are (explicitly) defined, namely 309 and 136 respectively.
We also note that abbreviations and synonyms are specified only via simple definitions.
   The difference between abbreviations and synonyms is not only evident in the number of
ontologies in which they are defined but also in the number of definitions within ontologies.
We find that the definitions for abbreviations are more numerous compared to definitions for
synonyms. Specifically, there are 105 ontologies that define more than a hundred abbreviations,
whereas only eight ontologies have more than a hundred definitions for synonyms. Given these
observations, we will focus on abbreviations rather than synonyms in the remainder of this
paper and will start with a discussion of explicitly defined abbreviations and then proceed with
implicitly defined ones.




    Figure 5 shows how many abbreviations are defined in ontologies. We observe that explicitly
defined abbreviations (represented by green dots in Subfigure 5a) can be found in 309/744 ≈
42% of ontologies. Furthermore, a considerable number of these ontologies contain numerous
abbreviations, with 48 of them having at least a thousand explicitly defined abbreviations.
Additionally, we notice that each explicitly defined abbreviation tends to be used (as shown
by the blue triangles, indicating the number of used abbreviations, on top of the green dots,
indicating the number of defined abbreviations in Subfigure 5a). Specifically, in 248 out of the
309 ontologies with explicitly defined abbreviations, all abbreviations are also used.
    In contrast, the number of explicitly defined abbreviations with possible uses tends to
be considerably smaller than the overall number of defined abbreviations (indicated
by the yellow crosses in Subfigure 5a). Only about one-third of these ontologies, specifically
101 out of 309 (101/309 ≈ 33%), contain explicitly defined abbreviations with possible uses.
Additionally, no ontology exists in which all explicitly defined abbreviations have
possible uses. These observations suggest that explicitly defined abbreviations are generally
used whenever possible, but there are a few exceptions in which over a thousand explicitly
defined abbreviations have possible uses.
    Regarding implicit abbreviations, there are 231 ontologies that define at least one abbreviation
implicitly. It seems that ontologies generally have fewer implicitly defined abbreviations
compared to explicitly defined ones. For instance, there are only 36 ontologies with more
than a hundred implicitly defined abbreviations. However, it is important to note that implicit
abbreviations could not be computed for 88 ontologies, and this group includes many larger
ontologies.
    Before presenting the number of uses and possible uses for implicitly defined abbreviations, we
remind the reader of the definition of an abbreviation’s use in an ontology. An abbreviation’s use
is considered as its occurrence outside of its definition. Now, an implicitly defined abbreviation
necessarily occurs in the ontology but there is no explicit definition. So, any implicitly defined
abbreviation is also used (the only exceptions being owl:Thing and owl:Nothing). Regarding
possible uses, we find that almost all implicitly defined abbreviations come with possible uses.
In other words, most implicitly defined abbreviations are not used in at least one case where it
would be possible to use them.
    Besides counting how many abbreviations are defined, used, or not used, we are also interested
in the question of how often abbreviations are used or could possibly be used. Since reporting
these numbers for all abbreviations defined in all ontologies would be impractical (considering
that some ontologies contain several thousand abbreviations), we focus on reporting the data
for each ontology regarding the abbreviations with the most uses and most possible uses. Figure 6
depicts these numbers for both explicit and implicit abbreviations. We find that the number of
uses of the most-used abbreviation, for both explicitly and implicitly defined abbreviations
(represented with a purple square and blue triangle respectively in Figure 6), tends to fall
between 10 and 100 in most ontologies. However, in some large ontologies, this number can be
much higher, reaching several thousand.
    In the case of most possible uses, we find that the numbers for implicit abbreviations
(represented with a yellow cross in Figure 6) are larger than the numbers for explicit ones
(represented with a green dot). This provides additional evidence that explicit abbreviations
tend to be used where possible, whereas implicit abbreviations are not consistently reused.




[Figure 5 plots: number of classes (log scale, 1 to 1 × 10^6) per ontology index (0 to 700; grouped by category Atomic, ℰℒ++, Rich and ordered by size); series A to D. Panels: (a) Explicitly defined abbreviations. (b) Implicitly defined abbreviations.]

Figure 5: Number of (A) named classes, (B) classes defined as abbreviations, (C) used abbreviations, (D)
abbreviations with possible uses.


7. Related Work
Logically equivalent rewritings for ontologies are usually motivated for the purpose of improved
reasoning performance [12] or ontology-based data access [13]. However, the idea of rewriting
axioms to improve ontology comprehension has also been discussed. Existing work in this
direction focuses on rewritings that are minimal in size because large expressions are arguably
hard to read and comprehend [14, 15, 1]. Yet, it is debatable whether the smallest possible logical
rewriting of an axiom is indeed most suitable for human interpretation.
   In the work presented in this paper, the focus is not on rewritings that are minimal in size.
Rather, we study to what extent domain-specific vocabulary defined in an ontology can be
reused to simplify otherwise complex expressions. The main argument is that a meaningful
name is more readily understood by domain experts than a more complex expression in
OWL. It is important to note that the associated reduction in size is secondary in this context.

[Figure 6: one plot. Legend: Atomic, ℰℒ++, Rich (ontology categories); A, B, C, D (series). x-axis: Ontology indices (grouped by category and ordered by size), 0 to 700. y-axis: Number of Classes (logarithmic, 1 to 1 × 10^6).]
Figure 6: Number of maximal (A) explicit abbreviation use, (B) explicit abbreviation possible use, (C)
implicit abbreviation use, (D) implicit abbreviation possible use.

   The task of determining abbreviations in an ontology (cf. Section 4) can be interpreted as
concept definability, i.e., the problem of finding a definition for a concept name in an ontology [16].
However, we restrict the problem to finding definitions for concepts in terms of complex class
expressions that already occur, syntactically speaking, in an ontology. Nevertheless, advances
in research on concept definability may provide useful insights, e.g., knowing under what
conditions implicitly defined concepts can also be defined explicitly.


8. Discussion & Future Work
The OBO Foundry considers naming conventions important for ontology comprehension,
readability, navigability, alignment, and integration [17] and recommends that the majority of
classes in an ontology should have textual definitions [18]. While not all biomedical ontologies
conform to the principles put forward by the OBO Foundry, there is no question that human-
readable names and definitions are used in many ontologies (see Table 1 in Section 6).
   The use of human-readable names is important, because technical terms in a domain-specific
vocabulary tend to be defined in terms of already defined terms. For example, a ‘blood assay
datum’ is defined as ‘A data item that is the specified output of a blood assay’. This definition
only makes sense if the notion of a ‘blood assay’ is already defined. So, if textual definitions
make use of already defined terms, and textual definitions should match logical definitions, as
the OBO Foundry advocates, then possible uses of explicitly defined abbreviations (cf. Section 3)
should not occur.8

8 See https://obofoundry.org/principles/fp-006-textual-definitions.html.

   However, our results suggest that even though the reuse of already defined concepts seems
to be preferred, there is a non-trivial number of cases in which a complex class expression
could be replaced by an existing equivalent named class. It would be interesting to consult
the ontology developers to determine whether such cases are intended.
Likewise, it would be interesting to find out whether classes with implicit logical definitions are
intentional and should be made explicit, and whether they should be reused whenever possible.
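
As a rough illustration of how such cases could be listed for review by ontology developers, the following Python sketch searches subclass axioms for places where the definition of a named class is spelled out instead of the name being used. It is only a syntactic approximation and not the procedure used in this survey: expressions are modelled as nested tuples, no reasoner is involved, and all class and property names (BloodAssay, evaluates, GlucoseReading, and so on) are invented for the example rather than taken from any ontology.

```python
# Hypothetical toy input. Complex expressions are nested tuples of the form
# (constructor, argument, ...); atomic class names are plain strings.
DEFINITIONS = {  # stands for EquivalentClasses(name, expression)
    "BloodAssay": ("and", "Assay", ("some", "evaluates", "Blood")),
}
SUBCLASS_AXIOMS = [
    # Uses the existing name.
    ("BloodAssayDatum",
     ("and", "DataItem", ("some", "isSpecifiedOutputOf", "BloodAssay"))),
    # Spells out the definition of BloodAssay instead of using the name.
    ("GlucoseReading",
     ("and", "DataItem",
      ("some", "isSpecifiedOutputOf",
       ("and", "Assay", ("some", "evaluates", "Blood"))))),
]


def occurs_in(needle, expression):
    """Return True if `needle` is `expression` or one of its sub-expressions."""
    if expression == needle:
        return True
    return isinstance(expression, tuple) and any(
        occurs_in(needle, argument) for argument in expression[1:]
    )


def possible_uses(definitions, axioms):
    """Map each defined name to the subclasses whose superclass expression
    contains the defining expression verbatim."""
    return {
        name: [sub for sub, sup in axioms if occurs_in(defining, sup)]
        for name, defining in definitions.items()
    }


if __name__ == "__main__":
    for name, subclasses in possible_uses(DEFINITIONS, SUBCLASS_AXIOMS).items():
        print(name, "could be used in the axioms for:", subclasses)
```

Because the comparison is purely structural, expressions that are logically equivalent but written differently (for instance with conjuncts in another order) are not detected by this sketch.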
   In addition to the question of when to use an abbreviation, there is the question of when to
introduce a new one. In particular, if a complex class expression occurs often in an ontology,
one may want to consider whether it can be given a meaningful name that should then be
used in its place.
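
One simple way to find candidates for new abbreviations is to count how often each complex class expression occurs. The Python sketch below does this for a small set of toy axioms. It is an assumption-laden illustration rather than part of the survey tooling: expressions are nested tuples, counting is purely syntactic, and the function names and the threshold parameter `min_occurrences` are hypothetical.

```python
from collections import Counter

# Toy SubClassOf axioms: (subclass name, superclass expression).
# Complex expressions are nested tuples; atomic names are plain strings.
AXIOMS = [
    ("Napoletana", ("and", "Pizza", ("value", "hasCountryOfOrigin", "Italy"))),
    ("Diavola", ("and", "Pizza", ("value", "hasCountryOfOrigin", "Italy"))),
    ("Hawaiian", ("and", "Pizza", ("value", "hasCountryOfOrigin", "Canada"))),
]


def complex_subexpressions(expression):
    """Yield `expression` and every nested sub-expression that is complex."""
    if isinstance(expression, tuple):
        yield expression
        for argument in expression[1:]:
            yield from complex_subexpressions(argument)


def naming_candidates(axioms, min_occurrences=2):
    """Return (expression, count) pairs for complex expressions that occur
    at least `min_occurrences` times across all superclass expressions."""
    counts = Counter(
        sub
        for _, superclass in axioms
        for sub in complex_subexpressions(superclass)
    )
    return [(e, n) for e, n in counts.most_common() if n >= min_occurrences]


if __name__ == "__main__":
    for expression, count in naming_candidates(AXIOMS):
        print(count, expression)
```

Whether a frequent expression deserves a name remains a modelling decision for domain experts; the count only points to where such a decision might be worth making.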
   However, it needs to be highlighted that the introduction of an abbreviation, as defined in
this work, changes the meaning of an ontology. Consider an ontology 𝒪 and 𝒪𝐴 = 𝒪 ∪ {𝛼},
where 𝛼 = EquivalentClasses(A, C) is a definition for an abbreviation A. If A does not occur
in 𝒪, then 𝒪 ≢ 𝒪𝐴 because 𝒪𝐴 ⊨ 𝛼 but 𝒪 ⊭ 𝛼. This change in meaning can be avoided by
encoding abbreviations using a meta-language, e.g., OTTR [19], on top of OWL. As an example,
consider the ontology

   𝒪 = { Napoletana         SubClassOf Pizza and hasCountryOfOrigin value Italy,
         Diavola            SubClassOf Pizza and hasCountryOfOrigin value Italy,
         Hawaiian           SubClassOf Pizza and hasCountryOfOrigin value Canada              }.

With OTTR, a mapping ItalianPizza ↦ Pizza and hasCountryOfOrigin value Italy can be
defined, so that 𝒪 can be encoded as

  𝒪𝑇 = { Napoletana          SubClassOf ItalianPizza,
         Diavola             SubClassOf ItalianPizza,
         Hawaiian            SubClassOf Pizza and hasCountryOfOrigin value Canada              }.

Note that ItalianPizza is not an OWL class but an expression in OTTR. In particular, the
ontology 𝒪 is semantically equivalent to 𝒪𝑇 because the OTTR expression ItalianPizza
is indistinguishable from Pizza and hasCountryOfOrigin value Italy on the level of OWL.
The use of a meta-level language also opens up possibilities to capture definitions at a higher
level of abstraction than OWL. In the case of the example ontology 𝒪, the representation
of a pizza’s country of origin could be captured by a parameterized OTTR expression
PizzaWithOrigin(𝑥) ↦ Pizza and hasCountryOfOrigin value 𝑥. With this, all three pizzas
in 𝒪 can be encoded in a uniform manner, giving rise to the following even more meaningful
definitions:

            𝒪𝑃 = { Napoletana          SubClassOf PizzaWithOrigin(Italy),
                   Diavola             SubClassOf PizzaWithOrigin(Italy),
                   Hawaiian            SubClassOf PizzaWithOrigin(Canada)           }.
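
The idea that 𝒪𝑇 and 𝒪𝑃 are merely different encodings of 𝒪 can be made concrete with a small macro-expansion sketch in Python. This is not OTTR syntax and does not use any OTTR tooling; it is a minimal stand-in under the assumption that class expressions are plain strings and that templates are expanded textually. The names `TEMPLATES`, `expand`, and `expand_ontology` are hypothetical.

```python
import re

# Hypothetical template table: name -> (parameter names, body).
TEMPLATES = {
    "ItalianPizza": ((), "Pizza and hasCountryOfOrigin value Italy"),
    "PizzaWithOrigin": (("x",), "Pizza and hasCountryOfOrigin value {x}"),
}

# A word, optionally followed by a parenthesised argument list.
CALL = re.compile(r"\b(\w+)(?:\(([^()]*)\))?")


def expand(expression):
    """Replace every template call in `expression` by its instantiated body."""
    def substitute(match):
        name, raw_args = match.group(1), match.group(2)
        if name not in TEMPLATES:
            return match.group(0)  # ordinary OWL vocabulary, keep as is
        params, body = TEMPLATES[name]
        args = [a.strip() for a in raw_args.split(",")] if raw_args else []
        if len(args) != len(params):
            raise ValueError(f"{name} expects {len(params)} argument(s)")
        return body.format(**dict(zip(params, args)))
    return CALL.sub(substitute, expression)


def expand_ontology(axioms):
    """Expand the superclass expression of each (subclass, superclass) pair."""
    return [(sub, expand(sup)) for sub, sup in axioms]


O = [
    ("Napoletana", "Pizza and hasCountryOfOrigin value Italy"),
    ("Diavola", "Pizza and hasCountryOfOrigin value Italy"),
    ("Hawaiian", "Pizza and hasCountryOfOrigin value Canada"),
]
O_T = [
    ("Napoletana", "ItalianPizza"),
    ("Diavola", "ItalianPizza"),
    ("Hawaiian", "Pizza and hasCountryOfOrigin value Canada"),
]
O_P = [
    ("Napoletana", "PizzaWithOrigin(Italy)"),
    ("Diavola", "PizzaWithOrigin(Italy)"),
    ("Hawaiian", "PizzaWithOrigin(Canada)"),
]

if __name__ == "__main__":
    print(expand_ontology(O_T) == O and expand_ontology(O_P) == O)
```

In this sketch, expanding either encoding yields exactly the axioms of 𝒪, and the template names never appear in the expanded result, which mirrors the point that they are not OWL classes.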

9. Conclusion
In this paper, we proposed an approach for analyzing and quantifying the use of logical
abbreviations, i.e., named classes that are defined to be logically equivalent to complex class
expressions. We used this approach to survey biomedical ontologies indexed in BioPortal and
found that abbreviations are highly prevalent. Although there are some exceptions, explicitly
defined abbreviations tend to be used whenever possible. However, implicitly defined
abbreviations often come with many possible uses, which raises the question of whether this
is intentional or undesirable.


References
 [1] N. Nikitina, P. Koopmann, Small Is Beautiful: Computing Minimal Equivalent EL Concepts, in:
     AAAI, AAAI Press, 2017, pp. 1206–1212.
 [2] B. Cuenca Grau, I. Horrocks, B. Motik, B. Parsia, P. F. Patel-Schneider, U. Sattler, OWL 2: The next
     step for OWL, Journal of Web Semantics 6 (2008) 309–322.
 [3] B. Motik, P. Patel-Schneider, B. Parsia, OWL 2 Web Ontology Language. Structural Specification
     and Functional-Style Syntax (Second Edition) (2012). URL: http://www.w3.org/TR/owl2-syntax/.
 [4] M. Horridge, P. Patel-Schneider, OWL 2 Web Ontology Language Manchester Syntax (Second
     Edition) (2012). URL: https://www.w3.org/TR/owl2-manchester-syntax/.
 [5] abbreviation, n., in: OED Online, Oxford University Press, 2022. URL: https://www.oed.com/view/
     Entry/180?redirectedFrom=abbreviation&, accessed July 05, 2022.
 [6] synonym, n., in: OED Online, Oxford University Press, 2022. URL: https://www.oed.com/view/
     Entry/196522?result=1&rskey=EVWsim&, accessed July 05, 2022.
 [7] S. Bail, B. Parsia, U. Sattler, Extracting Finite Sets of Entailments from OWL Ontologies, in:
     Description Logics, volume 745 of CEUR Workshop Proceedings, CEUR-WS.org, 2011.
 [8] F. Baader, B. Hollunder, B. Nebel, H. Profitlich, E. Franconi, An Empirical Analysis of Optimization
     Techniques for Terminological Representation Systems, or Making KRIS Get a Move on, in: KR,
     Morgan Kaufmann, 1992, pp. 270–281.
 [9] B. Glimm, I. Horrocks, B. Motik, R. D. C. Shearer, G. Stoilos, A novel approach to ontology
     classification, J. Web Semant. 14 (2012) 84–101.
[10] Y. Kazakov, M. Krötzsch, F. Simancik, The Incredible ELK - From Polynomial Procedures to Efficient
     Reasoning with ℰℒ Ontologies, J. Autom. Reason. 53 (2014) 1–61.
[11] N. Matentzoglu, B. Parsia, BioPortal Snapshot 30.03.2017, 2017. URL: https://doi.org/10.5281/zenodo.439510.
     doi:10.5281/zenodo.439510.
[12] D. Carral, C. Feier, B. C. Grau, P. Hitzler, I. Horrocks, EL-ifying Ontologies, in: IJCAR, volume 8562
     of Lecture Notes in Computer Science, Springer, 2014, pp. 464–479.
[13] M. Imprialou, G. Stoilos, B. C. Grau, Benchmarking Ontology-Based Query Rewriting Systems, in:
     AAAI, AAAI Press, 2012.
[14] F. Baader, R. Küsters, R. Molitor, Rewriting Concepts Using Terminologies, in: KR, Morgan
     Kaufmann, 2000, pp. 297–308.
[15] M. Horridge, B. Parsia, U. Sattler, Laconic and Precise Justifications in OWL, in: ISWC, volume
     5318 of Lecture Notes in Computer Science, Springer, 2008, pp. 323–338.
[16] B. ten Cate, E. Franconi, I. Seylan, Beth Definability in Expressive Description Logics, J. Artif. Intell.
     Res. 48 (2013) 347–414.
[17] D. Schober, B. Smith, S. E. Lewis, W. Kusnierczyk, J. Lomax, C. Mungall, C. F. Taylor,
     P. Rocca-Serra, S. Sansone, Survey-based naming conventions for use in OBO Foundry ontology
     development, BMC Bioinform. 10 (2009). URL: https://doi.org/10.1186/1471-2105-10-125.
     doi:10.1186/1471-2105-10-125.
[18] R. C. Jackson, N. Matentzoglu, J. A. Overton, R. Vita, J. P. Balhoff, P. L. Buttigieg, S. Carbon,
     M. Courtot, A. D. Diehl, D. M. Dooley, W. D. Duncan, N. L. Harris, M. A. Haendel, S. E. Lewis, D. A.
     Natale, D. Osumi-Sutherland, A. Ruttenberg, L. M. Schriml, B. Smith, C. J. S. Jr., N. A. Vasilevsky,
     R. L. Walls, J. Zheng, C. J. Mungall, B. Peters, OBO Foundry in 2021: operationalizing open
     data principles to evaluate ontologies, Database J. Biol. Databases Curation 2021 (2021). URL:
     https://doi.org/10.1093/database/baab069. doi:10.1093/database/baab069.
[19] M. G. Skjæveland, D. P. Lupp, L. H. Karlsen, H. Forssell, Practical Ontology Pattern Instantiation,
     Discovery, and Maintenance with Reasonable Ontology Templates, in: ISWC (1), volume 11136 of
     Lecture Notes in Computer Science, Springer, 2018, pp. 477–494.



