<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Testing the Adequacy of Automated Explanations of EL Subsumptions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marvin Schiller</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Florian Schiller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Birte Glimm</string-name>
          <email>birte.glimm@uni-ulm.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Justus-Liebig-Universität Gießen</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Ulm</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Ontology verbalization techniques have been introduced to automatically translate description logic (DL) axioms and derivations to natural-language texts. This way, non-expert users can be offered explanations for subsumptions derived by systems using ontologies for knowledge representation. We address the question of the readability and understandability of explanations generated from longer chains of inference steps, as they occur with non-trivial ontologies. An experimental design is presented to assess readers' understanding, the readability and the quality of the generated texts. The experiment tests verbalizations of derivations of different lengths, and assesses the effect of a strategy proposed to shorten explanations while retaining an adequate level of informativeness.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Ontologies serve to organize concepts, terminology and relationships in a domain
of interest, such as biology or medicine. Furthermore, logical consequences of this
knowledge can be derived using automated ontology reasoners. To make this
knowledge available to users that are not familiar with the employed formalisms
(for instance, ontology languages such as OWL), verbalization techniques have
been developed to automatically translate axioms and derivations to
natural-language statements and explanations.</p>
      <p>Example 1. As the running example throughout this paper, consider the
subsumption EsophagealPathology ⊑ DigestiveSystemPathology,
which can be derived from the following axioms:</p>
      <sec id="sec-1-1">
        <title>EsophagealPathology</title>
        <p>(PathologicalCondition ⊓</p>
      </sec>
      <sec id="sec-1-2">
        <title>DigestiveSystemPathology</title>
        <p>(PathologicalCondition ⊓</p>
        <p>The verbalization approach presented in this paper constructs a step-wise
argument for this derivation in natural language, in this case:</p>
        <p>An esophageal pathology is defined as a pathological condition that is located
in the esophagus. The esophagus is a part of the gastrointestinal tract, thus an
esophageal pathology is located in a part of the gastrointestinal tract.
Furthermore, since an esophageal pathology is a pathological condition, an esophageal
pathology is a pathological condition that is located in a part of the
gastrointestinal tract. A digestive system pathology is defined as a pathological condition that
is located in a part of the gastrointestinal tract. Thus, an esophageal pathology
is a digestive system pathology.</p>
        <p>Such an explanation combines the information from the relevant axioms in a
step-wise fashion and does not require readers to be familiar with the syntax of
ontology languages or description logics. However, depending on the granularity
at which knowledge is modeled in an ontology, the possible derivations, and
consequently the generated explanations for these derivations, can grow very long.
The state of the art in ontology verbalization has so far concentrated on short
inference problems (one or two inference steps), but has not addressed the problems
that arise when verbalizing more complex derivations (more than two inference
steps). Ontologies do not need to be very expressive for long and sufficiently
complex derivations to occur. All the derivations considered in the following
remain within the OWL2 EL profile.</p>
        <p>This paper is organized as follows. Building on a short introduction to the
considered language fragment in Section 2, we present related work on
verbalization in Section 3. In Section 4 we present our own approach, which extends
previous approaches by focusing on the aspect of conciseness of the generated
explanations. Section 5 presents an experiment that compares explanations generated
by our setup in two different variants, leading to the conclusions in Section 6.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Preliminaries</title>
      <p>This work remains within the DL fragment EL with some extensions that are
common features of the OWL2 EL profile (which is based on the DL EL++). As
usual, class names are denoted with capital letters A, B, C, ..., role names with
small letters r, s, ..., individuals with small letters a, b, ... and the universal concept
with ⊤. Complex class expressions are formed by using conjunction (C1 ⊓ C2) and
existential restriction (∃r.C). Axioms that specify the subclass relationship
between two class expressions C1 and C2, also known as subsumptions, are denoted
as C1 ⊑ C2. Besides these pure EL constructors, we consider further constructors
that are common in the OWL2 EL profile. This includes the unsatisfiable concept
⊥, nominals {a}, which are concepts consisting of a single individual a, domain
axioms dom(r, C), which are a shorthand for ∃r.⊤ ⊑ C, equivalences between
concepts (mutual subclass relationship), denoted as C1 ≡ C2, disjointness axioms
specifying that two class expressions C1 and C2 are disjoint, denoted as disj(C1, C2),
and role inclusion axioms r ⊑ s. We further include role inclusions that use role
composition r1 ∘ ... ∘ rk ⊑ s (called property chains in OWL2 EL).</p>
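      <p>To make the syntax concrete, the constructors above can be mirrored by a small data structure. The following Python sketch is only an illustration and not part of the presented system: the class names Name, And, Exists and Axiom and the function show are hypothetical, and the class name Esophagus is assumed from the wording of the running example.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Name:
    """Atomic class name, e.g. PathologicalCondition."""
    label: str


@dataclass(frozen=True)
class And:
    """Conjunction C1 ⊓ ... ⊓ Cn."""
    conjuncts: Tuple["Concept", ...]


@dataclass(frozen=True)
class Exists:
    """Existential restriction ∃r.C."""
    role: str
    filler: "Concept"


Concept = Union[Name, And, Exists]


@dataclass(frozen=True)
class Axiom:
    """Subsumption (kind "⊑") or equivalence (kind "≡") between two concepts."""
    left: Concept
    kind: str
    right: Concept


def show(x):
    """Render a concept or axiom in DL syntax."""
    if isinstance(x, Name):
        return x.label
    if isinstance(x, Exists):
        return "∃" + x.role + "." + show(x.filler)
    if isinstance(x, And):
        return "(" + " ⊓ ".join(show(c) for c in x.conjuncts) + ")"
    if isinstance(x, Axiom):
        return show(x.left) + " " + x.kind + " " + show(x.right)
    raise TypeError("not a concept or axiom")


# Definition from the running example; "Esophagus" is an assumed class name.
definition = Axiom(
    Name("EsophagealPathology"),
    "≡",
    And((Name("PathologicalCondition"),
         Exists("locativeAttribute", Name("Esophagus")))),
)
print(show(definition))
```
      </preformat>
      <p>Frozen dataclasses are used so that concepts can be compared and hashed, which is convenient when sets of axioms are manipulated.</p>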
      <p>
        A more comprehensive introduction to EL++ is provided by Baader et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Despite its limited expressiveness, a number of practically relevant ontologies in
numerous application domains fall into this language fragment. This includes, for
instance, large biomedical ontologies such as SNOMED CT,<sup>1</sup> the Gene Ontology
(GO),<sup>2</sup> and large portions of the NCI Thesaurus<sup>3</sup> and the Galen ontology.<sup>4</sup>
      </p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>
        A correspondence between formal expressions in ontologies and natural-language
has been proposed in the form of controlled languages (cf. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]), for instance
OWL Simplified English (OSE, [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]), Attempto Controlled English (ACE, [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]),
Sydney OWL Syntax (SOS, [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]), CLOnE [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and Rabbit [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Controlled languages
define a (usually very restricted) subset of natural language that unambiguously
corresponds to DL constructors and expressions. For example, the subsumption
C1 ⊑ C2 is represented in OSE as “A [C1] is a [C2]”, where [C1] and [C2] are
text strings that represent the concept descriptions for C1 and C2, respectively,
as in “A city is a place”. Whereas controlled languages remain closely tied to
the corresponding formalism, some approaches have focused on text quality and
support for different (natural) languages. This includes a tool developed by the
SWAT project [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], the OntoVerbal verbalizer [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and NaturalOWL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Whereas
these approaches verbalize axioms in a knowledge base, some approaches have
considered explanations generated from derivations. These include the Classic
system [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], the “tracing” facility of the ELK reasoner [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and the approaches
of Borgida et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and Nguyen et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Whereas the explanation facilities of
ELK and Classic do not use natural language, Borgida et al. use text patterns
for inference rules, but retain formula language for axioms. The approach of
Nguyen et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] is similar to ours, since it employs rule-based proofs and
patterns to produce natural-language explanations, such as, for example:
(a) Every A is a B.
(b) Every B is a C.
      </p>
      <p>→ (c) Every A is a C.</p>
      <p>
        The generated explanation combines such patterns using the text pattern
“Statement (c) is implied because (a) ... and (b) ...”. Thus, the structure
remains quite close to how proofs are presented, but the formulae are replaced
by more commonly understandable text patterns. To test the understandability
of these patterns, an experiment was conducted that tested whether participants
accepted them. Different rules and corresponding text patterns were found
to vary greatly in whether they were accepted as correct by experiment
participants (cf. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]). The understandability of the individual verbalized inference
rules was used to predict the understandability of verbalized two-rule inference
problems, and was indeed found to be correlated with the empirically measured
understandability [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p><sup>1</sup> http://www.snomed.org/snomed-ct
<sup>2</sup> http://www.geneontology.org
<sup>3</sup> https://ncit.nci.nih.gov/ncitbrowser/
<sup>4</sup> http://www.opengalen.org/</p>
    </sec>
    <sec id="sec-4">
      <title>Generating Verbalized Explanations for Derivations</title>
      <p>The generation of explanations is based on two main components: a
consequence-based reasoning system for generating step-wise derivations and a
natural-language generation component to transform these formal derivations to text. In
the following, these two components are introduced briefly. Based on this, we
address the lack of conciseness of some of the generated explanations
by introducing techniques and heuristics that shorten the explanations. The
presented approach has been implemented as a prototype system and is available
as a plugin<sup>5</sup> for the ontology editor Protégé.<sup>6</sup></p>
      <p>
        Reasoning. Derivations are constructed using a rule-based inference system with a custom
set of inference rules. Using a custom system allows us to include inference rules
that are logically redundant, but which help to obtain shorter derivations, and
thus shorter explanations. The current implementation includes and modifies
rules proposed by Nguyen et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and the rules employed in the ELK
system [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and incorporates a few additional inference rules. Fig. 1 shows some
of the inference rules relevant for the remainder of this paper (together with
verbalization patterns, as discussed further below). Note that the introduced
modifications and additions do not impact the formal properties of the original
rule systems; instead, they introduce shortcuts (e.g. R⊑≡) and n-ary versions
of originally binary rules (e.g. R⊓+/R5). The full ruleset for the DL fragment
considered in this paper is shown in [17, Appendix A].
      </p>
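      <p>How a rule-based system of this kind arrives at a derivation can be sketched with a naive forward-chaining loop. The Python code below is a simplified illustration under stated assumptions, not the authors' implementation: it covers only four rules in simplified form, the rule tags ("equiv-conjunct", "exists-mono", "and-intro", "sub-equiv") and the function saturate are hypothetical, equivalences are treated as symmetric, and the input uses the abbreviations of the running example (EP, PC, E, GTP, DSP, loc).</p>
      <preformat>
```python
def is_and(c):
    return isinstance(c, tuple) and c[0] == "and"


def is_some(c):
    return isinstance(c, tuple) and c[0] == "some"


def saturate(axioms, max_rounds=20):
    """Forward-chain four simplified rules; map each derived axiom to (rule, premises)."""
    facts = set(axioms)
    derived = {}
    # Only conjunctions that occur in the input are ever built.
    targets = {side for ax in axioms for side in ax[1:] if is_and(side)}
    for _ in range(max_rounds):
        new = {}
        subs = [f for f in facts if f[0] == "sub"]
        equivs = [f for f in facts if f[0] == "equiv"]
        # A concept defined as a conjunction is subsumed by each conjunct.
        for eq in equivs:
            for name, body in ((eq[1], eq[2]), (eq[2], eq[1])):
                if is_and(body):
                    for conjunct in body[1:]:
                        new.setdefault(("sub", name, conjunct), ("equiv-conjunct", (eq,)))
        # C1 ⊑ ∃r.C2 and C2 ⊑ C3 give C1 ⊑ ∃r.C3.
        for s1 in subs:
            if is_some(s1[2]):
                for s2 in subs:
                    if s2[1] == s1[2][2]:
                        concl = ("sub", s1[1], ("some", s1[2][1], s2[2]))
                        new.setdefault(concl, ("exists-mono", (s1, s2)))
        # n-ary conjunction introduction.
        for target in targets:
            for left in {s[1] for s in subs}:
                premises = tuple(("sub", left, c) for c in target[1:])
                if all(p in facts for p in premises):
                    new.setdefault(("sub", left, target), ("and-intro", premises))
        # Shortcut: C1 ⊑ C2 plus an equivalence between C2 and C3 gives C1 ⊑ C3.
        for s in subs:
            for eq in equivs:
                for here, there in ((eq[1], eq[2]), (eq[2], eq[1])):
                    if s[2] == here and s[1] != there:
                        new.setdefault(("sub", s[1], there), ("sub-equiv", (s, eq)))
        fresh = {k: v for k, v in new.items() if k not in facts}
        if not fresh:
            break
        derived.update(fresh)
        facts.update(fresh)
    return derived


K1 = ("and", "PC", ("some", "loc", "E"))
K2 = ("and", "PC", ("some", "loc", "GTP"))
justification = [("equiv", "EP", K1), ("sub", "E", "GTP"), ("equiv", "DSP", K2)]
derived = saturate(justification)
rule, premises = derived[("sub", "EP", "DSP")]
print(rule)
```
      </preformat>
      <p>Restricting conjunction introduction to conjunctions that already occur in the input keeps this sketch from enumerating arbitrary conjunctions; the bound max_rounds is a safeguard and is not needed for the running example.</p>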
      <p>
        Since the current implementation of the consequence-based reasoning
procedure does not perform as well on large ontologies as well-optimized tableau-based
reasoners, proof search is not performed on an entire ontology. Rather, in a
preprocessing step, a justification [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] (a minimal set of axioms required to prove a
derived axiom) is obtained using an off-the-shelf tableau-based reasoner (such
as FaCT++<sup>7</sup> or HermiT<sup>8</sup>). Then proof search is performed only on the set
of relevant axioms. Such preprocessing is also used by related work [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In the
example from the introduction, the following proof tree is obtained (with
abbreviations EP: EsophagealPathology, DSP: DigestiveSystemPathology, GTP:
GastrointestinalTractBodyPart, PC: PathologicalCondition, loc: locativeAttribute):
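      </p>
      <p>(1) R≡ (top left): from EP ≡ PC ⊓ ∃loc.E, infer EP ⊑ PC.</p>
      <p>(2) R≡ (top center): from EP ≡ PC ⊓ ∃loc.E, infer EP ⊑ ∃loc.E.</p>
      <p>(3) R∃/R15: from EP ⊑ ∃loc.E and E ⊑ GTP, infer EP ⊑ ∃loc.GTP.</p>
      <p>(4) R⊓+/R5: from EP ⊑ PC and EP ⊑ ∃loc.GTP, infer EP ⊑ PC ⊓ ∃loc.GTP.</p>
      <p>(5) R⊑≡ (root): from EP ⊑ PC ⊓ ∃loc.GTP and DSP ≡ PC ⊓ ∃loc.GTP, infer EP ⊑ DSP.</p>
      <p><sup>5</sup> https://verbalizer.github.io/
<sup>6</sup> http://protege.stanford.edu/
<sup>7</sup> http://owl.man.ac.uk/factplusplus/
<sup>8</sup> http://www.hermit-reasoner.com/</p>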
      <p>Fig. 1. Selected inference rules with their verbalization patterns. v(·) denotes the verbalization of its argument. The label in front of each pattern lists the premises that the pattern states explicitly; “none” marks the pattern that states no premise.</p>
      <p>R1. Premise: C1 ≡ C2. Conclusion: C1 ⊑ C2.
none: According to its definition, v(C1 ⊑ C2).</p>
      <p>R⊓−/R2. Premise: C1 ⊑ C2 ⊓ ... ⊓ Cn, with 2 ≤ i ≤ n. Conclusion: C1 ⊑ Ci.
none: Hence, v(C1 ⊑ Ci).</p>
      <p>R⊓+/R5. Premises: (1) C1 ⊑ C2, ..., (n) C1 ⊑ Cn+1. Conclusion: C1 ⊑ C2 ⊓ ... ⊓ Cn+1.
(1) ... (n): Since v(C1 ⊑ C2) and ... and v(C1 ⊑ Cn+1), v(C1 ⊑ C2 ⊓ ... ⊓ Cn+1).
(i), (j), ...: Furthermore, since v(C1 ⊑ Ci) and v(C1 ⊑ Cj) and ..., v(C1 ⊑ C2 ⊓ ... ⊓ Cn+1).
none: Therefore v(C1 ⊑ C2 ⊓ ... ⊓ Cn+1).</p>
      <p>R⊑/R12. Premises: (1) C1 ⊑ C2, (2) C2 ⊑ C3. Conclusion: C1 ⊑ C3.
(1)&amp;(2): Since v(C1 ⊑ C2) and v(C2 ⊑ C3) it follows that v(C1 ⊑ C3).
(1): v(C1 ⊑ C2), therefore being v(C3).
(2): Given that v(C2 ⊑ C3), v(C1 ⊑ C3).
none: Thus, we have established that v(C1 ⊑ C3).</p>
      <p>R∃/R15. Premises: (1) C1 ⊑ ∃r.C2, (2) C2 ⊑ C3. Conclusion: C1 ⊑ ∃r.C3.
(1)&amp;(2): v(C1 ⊑ ∃r.C2) which [is] v(C3). Therefore, v(C1 ⊑ ∃r.C3).
(1): v(C1 ⊑ ∃r.C2), thus v(C1 ⊑ ∃r.C3).
(2): v(C2 ⊑ C3), thus v(C1 ⊑ ∃r.C3).
none: Therefore, v(C1 ⊑ ∃r.C3).</p>
      <p>R⊑≡. Premises: (1) C1 ⊑ C2, (2) C2 ≡ C3. Conclusion: C1 ⊑ C3.
(1)&amp;(2): v(C2) is defined as v(C3). Thus, v(C1 ⊑ C3).
none: Thus, v(C1 ⊑ C3) according to the definition of v(C2).</p>
      <p>R≡. Premise: C1 ≡ C2 ⊓ ... ⊓ Cn, with 2 ≤ i ≤ n. Conclusion: C1 ⊑ Ci.
none: v(C1) is defined as v(Ci ⊓ ... ⊓ Cn).</p>
      <p>Verbalization. First, the inference steps are ordered in a linear sequence so that they can be output as
text. For this, a post-order traversal of the proof tree (as seen from the root of
the tree, which contains the conclusion of the derivation) is performed. For each
inference rule it is specified in which order its children (i.e., premises) are
output, which corresponds to the order in which they are indicated in Fig. 1.
Then, the text patterns in Fig. 1 are applied to transform the derivation into
text. For each rule, the exact pattern that is applied depends on whether one or
several of its premises have been presented immediately before in the generated
text (for example, as a conclusion of a previous step), in which case they should
not be repeated.</p>
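      <p>The two stages described above, post-order linearization followed by a pattern choice that skips premises stated immediately before, can be sketched as follows. This Python code is a minimal illustration and not the authors' implementation: the classes Leaf and Step and the functions linearize and verbalize are hypothetical, and a single generic sentence pattern stands in for the rule-specific patterns of Fig. 1.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    """An axiom from the justification, already verbalized."""
    text: str


@dataclass
class Step:
    """One rule application: its premises and its verbalized conclusion."""
    rule: str
    premises: List[Union["Leaf", "Step"]]
    text: str


def linearize(root):
    """Post-order traversal: every sub-derivation precedes the step that uses it."""
    order = []

    def visit(node):
        if isinstance(node, Step):
            for premise in node.premises:
                visit(premise)
            order.append(node)

    visit(root)
    return order


def verbalize(root):
    """Emit one sentence per step, omitting a premise that was just concluded."""
    sentences = []
    previous = None  # statement concluded by the sentence emitted last
    for step in linearize(root):
        to_state = [p.text for p in step.premises if p.text != previous]
        if to_state:
            sentences.append("Since " + " and ".join(to_state) + ", " + step.text + ".")
        else:
            sentences.append("Thus, " + step.text + ".")
        previous = step.text
    return sentences


# A two-step chain over placeholder classes A, B, C, D.
inner = Step("transitivity",
             [Leaf("every A is a B"), Leaf("every B is a C")],
             "every A is a C")
root = Step("transitivity", [inner, Leaf("every C is a D")], "every A is a D")
for sentence in verbalize(root):
    print(sentence)
```
      </preformat>
      <p>In the second sentence the premise “every A is a C” is not repeated because it is the conclusion of the sentence emitted directly before it.</p>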
      <p>In the running example, the first step to be output is the application of
R≡ (top left in the proof tree), with its template producing by default: “An
esophageal pathology is defined as a pathological condition that is located in
the esophagus”. The second application of R≡ (top center in the proof tree)
produces no output, because it is detected as identical to the previous output. The
next rule application to be output is R∃/R15. Since the first premise is counted
as being “covered” by the previous output, only the second premise (marked
as (2)) is output, together with the conclusion, i.e. the pattern labeled (2) is
chosen: “The esophagus is a part of the gastrointestinal tract, thus an esophageal
pathology is located in a part of the gastrointestinal tract.” As can be seen in
this example, some intermediate conclusions remain implicit (e.g. EP ⊑ PC and
EP ⊑ ∃loc.E as conclusions of R≡), a strategy we discuss below.</p>
      <p>Techniques and Heuristics for Improving Text Quality. The aim of conciseness of the verbalized derivations is pursued at three levels.
First, at the level of the generated proofs, inference rules are used that correspond
to two or more applications of simpler rules and provide a kind of shortcut.
In the running example, one application of R⊑≡ (the last step) replaces the
application of R1 and R⊑/R12.</p>
      <p>Secondly, some inference rules are considered to be trivial and their
application is simply omitted altogether in the text output. Among the rules in Fig. 1,
R⊓−/R2 represents such an inference rule. Furthermore, for some of the extra
inference rules, the conclusion is not output (again, because it is considered
obvious). As illustrated above, this is the case for R≡.</p>
      <p>
        Finally, at the level of individual statements, the text patterns are designed
such that unnecessary repetitions are avoided. For example, the “middle term”
in R⊑/R12 only needs to be mentioned once (in contrast to the pattern used
by [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and shown in Section 3). Furthermore, the verbalization mechanism uses
annotations to replace class and role names with more readable names, where
provided. This was used in the running example to supply the concept
originally named NAMEDGITractBodyPart with a more readable label “part of the
gastrointestinal tract”.
      </p>
      <p>When using more “complex” inference rules and hiding inference rule
applications (referred to as “shortening” in the following), the question is whether
the understandability of the resulting explanations is retained. This prompted
the investigation presented in the following section.</p>
    </sec>
    <sec id="sec-5">
      <title>Experiment</title>
      <p>To test the understandability and text quality of generated explanations for
derivations in ontologies, a questionnaire-based experiment was devised. This
experiment allowed for a comparison between explanations in their unshortened
form and their shortened form according to the presented heuristics. Since no
such experiment has been conducted before, we explored with a small number
of participants whether its design is suited as an instrument for assessing
differences between shortened and unshortened explanations. We consider the results
informative as a preparation for larger studies and also for the further
development of the presented verbalization techniques.</p>
      <p>Procedure. Participants were randomly assigned to two groups. Eight
explanations were shown to each participant. The first group received four
explanations in their unshortened version and four in their shortened version. The
second group received the corresponding shortened and unshortened alternatives of
these explanations. The shortened version of an explanation uses logically
redundant rules (R⊑≡ and R≡ in the running example), whereas the unshortened
version uses only the most basic rules. In the unshortened case, verbalizations of
R⊓−/R2 are omitted. For comparison, Fig. 2 shows both versions for the running
example used in the experiment.</p>
      <p>As an objective test of participants' careful reading and understanding,
participants were asked to indicate for each explanation whether it is logically
correct. Two out of the eight explanations were manipulated to be erroneous by
replacing one occurrence of a class name by a different one that was not part of
the initial axioms. This manipulation was designed to ensure that participants
read the text properly, but not to test their formal reasoning skills.<sup>9</sup> Note also
that the two last sentences of Fig. 2 (b) are generated from one rule application
of R⊑≡. The employed pattern was different from the one presented in Fig. 1, in
that it lacked the phrase “defined as”, which makes clear that the statement refers
to an equivalence and not a subsumption. This deficiency was detected during the
experiment and corrected in the verbalization system.</p>
      <p><sup>9</sup> It is well known that untrained participants in reasoning experiments do not always
apply classical logical reasoning. For example, consider the Wason selection task [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].</p>
      <p>(a) Unshortened explanation:</p>
      <p>An esophageal pathology is defined as a pathological condition that is located in the esophagus.
Hence, an esophageal pathology is a pathological condition.</p>
      <p>Additionally, an esophageal pathology is located in the esophagus.</p>
      <p>The esophagus is a part of the gastrointestinal tract, thus an esophageal pathology is located in a
part of the gastrointestinal tract.</p>
      <p>Furthermore, since an esophageal pathology is a pathological condition, an esophageal pathology is
a pathological condition that is located in a part of the gastrointestinal tract.</p>
      <p>According to the definition of a digestive system pathology, a pathological condition that is located
in a part of the gastrointestinal tract is a digestive system pathology.</p>
      <p>Thus, we have established that an esophageal pathology is a digestive system pathology.</p>
      <p>(b) Shortened explanation:</p>
      <p>An esophageal pathology is defined as a pathological condition that is located in the esophagus.
The esophagus is a part of the gastrointestinal tract, thus an esophageal pathology is located in a
part of the gastrointestinal tract.</p>
      <p>Furthermore, since an esophageal pathology is a pathological condition, an esophageal pathology is
a pathological condition that is located in a part of the gastrointestinal tract.</p>
      <p>A digestive system pathology is a pathological condition that is located in a part of the
gastrointestinal tract. Thus, an esophageal pathology is a digestive system pathology.</p>
      <p>Fig. 2. The explanation for the running example as employed in the experiment, in unshortened and shortened form.</p>
      <p>Understandability and readability of the explanations were assessed using a
questionnaire to be answered on a 7-point scale. Participants were asked for
ratings pertaining to key aspects of the text quality of the presented explanations,
namely understanding/comprehension, conciseness and appreciation. The topic
domains for which the explanations were generated were chosen to be relatively
abstract and unfamiliar to most participants. Therefore, when judging
readability one has to take into account that the domain itself may be challenging, an
aspect for which questions were included under appreciation. The items are:</p>
      <sec id="sec-5-1">
        <title>Questionnaire Items</title>
        <p>Understandability</p>
        <p>– I can follow the reasoning steps presented in the explanation. (Question 1)</p>
        <p>Conciseness</p>
        <p>– I find that some steps in the explanation are so obvious that they should be skipped. (Question 2)</p>
        <p>– The explanation conveys less information than I need to fully understand it. (Question 3)</p>
        <p>– I find that the explanation should be made more concise. (Question 4)</p>
        <p>Appreciation</p>
        <p>– The text of the explanation is well-formed (according to writing conventions). (Question 5)</p>
        <p>– The sentences are arranged such that they fit together well. (Question 6)</p>
        <p>– I find the text easy to read. (Question 7)</p>
        <p>– I am familiar with the technical terms in this text. (Question 8)</p>
        <p>– The technical terms make it difficult for me to follow the text. (Question 9)</p>
        <p>– The topic of the text makes it difficult for me to read the text. (Question 10)</p>
        <p>At the beginning of the experiment, participants were informed that they
would be asked to provide judgments for automatically generated explanations.
An example was shown together with the correct answer and an explanation
(reproduced in [17, Appendix B]). Then participants were asked for demographic
data and prior experience with the following fields of science: computer science,
artificial intelligence, mathematics/formal logic, philosophy, linguistics, physics,
biology, medicine, chemistry. Participants received the eight explanations in
random order, each with the same set of questions. The presentation of each
explanation was preceded by a presentation of the axioms that were assumed to hold
(in verbalized form) and the conclusion derived from them (also verbalized). A
screenshot showing the running example together with the associated
questionnaire is reproduced in [17, Appendix B]. After the experiment, participants were
offered a free-text field for any comments on the explanations and were thanked
for their participation. The questionnaire was administered using LimeSurvey.<sup>10</sup></p>
        <p>Participants. Seven current and former members of Ulm University took part
in the experiment. None of them was involved in the development of the
presented verbalization techniques or the experiment. Participants included one
female and six males and were aged between 20 and 34. Four reported being
fluent in English, two reported good English proficiency and one intermediate
proficiency. All participants reported experience in computer science, four in
artificial intelligence, and three in mathematics/formal logic.</p>
        <p>Materials. To generate a pool of verbalized derivations, the verbalization tool
was run on the ontologies in the TONES repository.<sup>11</sup> To enable the retrieval of
subcorpora of verbalizations according to a set of criteria (e.g. number of
inference steps, employed rules), explanations and their properties were stored in a
MySQL<sup>12</sup> database. The TONES ontology repository was chosen because it offers
a good level of diversity regarding the included ontologies' domains and
complexities. Since TONES includes ontologies of different levels of expressivity, only
derivations that fell into the language fragment handled by the verbalization tool
were generated. Still, a sufficiently large and diverse corpus of explanations in
various domains (anatomy, chemistry, geology, physics, but also the “pizza”
tutorial domain<sup>13</sup>) was obtained. Most importantly, this set represents “ordinary”
ontologies that were not designed with verbalization in mind, but which are likely
to be encountered by potential users of the verbalization tool. To restrict the
scope of the experiment in a sensible way, the pool of considered explanations
was further narrowed down according to the following criteria:</p>
        <p>– The explanations are of medium length (3–5 inference steps when shortened).</p>
        <p>– The verbalizations make use of additional rules or the skipping of rules in the
presentation, as discussed in this paper, such that the effect of shortening
can be studied.</p>
        <p>– The axioms from which the verbalizations are generated are plausible
according to common sense. Some of the included ontologies were found to contain
unusual or erroneous formalizations of their domain. For example, some
ontologies include axioms for a concept, which in the real world is known
to have instances, such that the concept becomes unsatisfiable (probably
unintentionally) and therefore the concept can be shown to be a subconcept
of any concept. Some ontologies also contain equivalences with apparently
unintended consequences.</p>
        <p>– Verbalizations that rely on equivalences between several concepts with the
same class name in different ontologies (with different URLs) were excluded.
Such equivalences result in statements such as “since a person is a person...”.</p>
        <p>– Explanations with excessively long concept names were also excluded.
However, since long concept names are quite common in the investigated
ontologies, compound concept names of up to three words were accepted.</p>
        <p>In order to provide a realistic selection of explanations for the experiment
(instead of hand-picking some “nice” explanations), and thus a realistic evaluation
of the verbalization tool, the presented explanations were selected at random
from the pool that fulfilled the criteria stipulated above. By default, the
verbalization tool provides a simple highlighting of concept descriptions by displaying
them in a different color (blue) than the surrounding text (black). Without such
highlighting, the reading of the often long sentences containing long compound
concept names is made unnecessarily tedious.</p>
        <p><sup>10</sup> https://www.limesurvey.org/de/
<sup>12</sup> https://www.mysql.com/de/
<sup>13</sup> http://mowl-power.cs.man.ac.uk/protegeowltutorial/resources/ProtegeOWLTutorialP4_v1_3.pdf</p>
        <p>Fig. 3. Averaged scores on Questions 1–10 for long and short explanations (plot not reproduced; legend: long explanation, short explanation; horizontal axis: Question).</p>
        <p>Results. The two manipulated explanations were correctly identified as
wrong by the participants in all but one case. The non-manipulated explanations
were mostly judged as correct, as predicted. One participant noticed the problem
with the verbalization of R⊑≡.</p>
        <p>Figure 3 shows the subjects' averaged scores on Questions 1–10 for long and
short explanations. The understandability of the explanations was judged
favorably by the participants (Question 1), but not always found to be ideal.
Participants' opinions on conciseness (Questions 2–4) turned out to be mixed and not
too strong. Text quality (Questions 5–10) was also judged favorably. The answers
of the participants to most questions are concentrated within relatively narrow
ranges and do not occupy the endpoints of the scale. Inter-rater reliability was
assessed using Kendall's coefficient of concordance (corrected for ties) for the seven
subjects' averaged scores on Questions 1–10 for long and short explanations (as
shown in Fig. 3) and was found to be good (W = 0.617, χ²(19) = 82.1, p &lt; 0.001).
Concordance for individual answers, measured separately for both
experimental groups (with three and four participants, respectively), was also good (W = 0.759,
χ²(79) = 180, p &lt; 0.001 and W = 0.561, χ²(79) = 177, p &lt; 0.001).</p>
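        <p>For readers who wish to retrace these figures, W and its test statistic are related by the usual large-sample approximation χ² = m(n − 1)W with n − 1 degrees of freedom, where m is the number of raters and n the number of rated items. The Python sketch below is not the analysis script used in the study; the function names average_ranks, kendalls_w and chi_square are illustrative, and the item counts (20 averaged scores, 80 individual answers) are inferred from the reported degrees of freedom.</p>
        <preformat>
```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start != len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 != len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W with tie correction; ratings[j][i] is rater j's score for item i."""
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[i] for row in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    # Tie term: sum of t**3 - t over every group of t tied scores of each rater.
    ties = 0
    for row in ratings:
        for v in set(row):
            t = row.count(v)
            ties += t ** 3 - t
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


def chi_square(w, m, n):
    """Large-sample test statistic for W, with n - 1 degrees of freedom."""
    return m * (n - 1) * w


# Three raters in perfect agreement on four items.
print(kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]))
# Seven raters, 20 averaged scores, W = 0.617 as reported above.
print(round(chi_square(0.617, 7, 20), 1))
```
        </preformat>
        <p>With the reported values of W, this relation reproduces the three test statistics given above (82.1, 180 and 177 after rounding).</p>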
        <p>This is a favorable outcome with respect to the question whether this
experimental setup can be used as an instrument to detect differences in the scores
related to the experimental manipulation, i.e. the shortening of the explanations,
provided that a larger number of participants is recruited to ensure sufficient statistical power.
For example, if a larger number of participants leads to a statistically significant
difference in responses to Question 4 (the question whether the explanation
should be made more concise), this would provide evidence for a positive effect
of shortening on perceived conciseness of the explanations.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>This work has, for the first time, investigated the generation of verbalized
explanations for non-trivial derivations consisting of several inference steps. In the
evaluation, verbalizations for up to seven inference steps (e.g. the running
example in its unshortened form) were considered. To make such explanations more
concise, we propose the use of an extended set of inference rules and the hiding
of inference steps in the presentation. The presented experimental design can be
used to test whether such adjustments impact the understandability and quality
of the generated explanations. Our small-sample study provides a first indication
of how these measures turn out, which will be helpful for conducting a larger
follow-up study. Whereas the TONES repository was considered an
instrumental choice for generating explanation material in a first step, more up-to-date
corpora of ontologies should be considered in future studies. Furthermore, the
experiment provided some first evidence supporting the understandability of
such explanations in general, in spite of the considerable length of the
derivations and the technical jargon used in the considered domains. To what extent this is
also the case for users with less background in technical domains than the
participants of this study (in this case, all had a background in computer science)
is to be investigated as part of future work.</p>
      <p>Acknowledgments. We acknowledge the support of the Transregional
Collaborative Research Center SFB/TRR 62 "A Companion-Technology for Cognitive
Technical Systems" funded by the German Research Foundation (DFG). We
thank the three helpful reviewers and all experiment participants.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Androutsopoulos</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lampouras</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Galanis</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Generating natural language descriptions from OWL ontologies: The NaturalOWL system</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          <volume>48</volume>
          ,
          <fpage>671</fpage>
          –
          <lpage>715</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brandt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Pushing the EL envelope further</article-title>
          . In: Clark,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Patel-Schneider</surname>
          </string-name>
          ,
          <string-name>
            <surname>P</surname>
          </string-name>
          . (eds.)
          <source>Proc. of the OWLED 2008 DC Workshop on OWL: Experiences and Directions. CEUR Workshop Proceedings</source>
          , vol.
          <volume>496</volume>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Borgida</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franconi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Explaining ALC subsumption</article-title>
          . In: Horn,
          <string-name>
            <surname>W</surname>
          </string-name>
          . (ed.)
          <source>Proc. of the 14th European Conf. on Artificial Intelligence</source>
          , pp.
          <fpage>209</fpage>
          –
          <lpage>213</lpage>
          . IOS Press (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cregan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwitter</surname>
          </string-name>
          , R., Meyer, T., et al.:
          <article-title>Sydney OWL syntax – Towards a controlled natural language syntax for OWL 1.1</article-title>
          . In: Golbreich,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Kalyanpur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          <source>(eds.) Proc. of the OWLED 2007 Workshop on OWL: Experiences and directions</source>
          , vol.
          <volume>258</volume>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Engelbrecht</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hart</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dolbear</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Talking rabbit: A user evaluation of sentence production</article-title>
          . In: Fuchs,
          <string-name>
            <surname>N.E</surname>
          </string-name>
          . (ed.)
          <source>Controlled Natural Language</source>
          , pp.
          <fpage>56</fpage>
          –
          <lpage>64</lpage>
          . LNAI 5972, Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Funk</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tablan</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cunningham</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Handschuh</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>CLOnE: Controlled language for ontology editing</article-title>
          . In: Aberer,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.S.</given-names>
            ,
            <surname>Noy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Allemang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.I.</given-names>
            ,
            <surname>Nixon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            ,
            <surname>Golbeck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            ,
            <surname>Mika</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Maynard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Mizoguchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            ,
            <surname>Schreiber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Cudré-Mauroux</surname>
          </string-name>
          , P. (eds.)
          <source>The Semantic Web (Proc. ISWC/ASWC 2007)</source>
          , pp.
          <fpage>142</fpage>
          –
          <lpage>155</lpage>
          . LNCS 4825, Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Horridge</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Justification based explanations in ontologies</article-title>
          .
          <source>Ph.D. thesis</source>
          , University of Manchester (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kaljurand</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuchs</surname>
            ,
            <given-names>N.E.</given-names>
          </string-name>
          :
          <article-title>Verbalizing OWL in Attempto Controlled English</article-title>
          . In: Golbreich,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Kalyanpur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <surname>B.</surname>
          </string-name>
          <source>(eds.) Proc. of the OWLED 2007 Workshop on OWL: Experiences and directions</source>
          , vol.
          <volume>258</volume>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klinov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Goal-directed tracing of inferences in EL ontologies</article-title>
          . In: Mika,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Tudorache</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Bernstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Welty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Knoblock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Vrandečić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            ,
            <surname>Groth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Noy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            ,
            <surname>Janowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            ,
            <surname>Goble</surname>
          </string-name>
          , C. (eds.)
          <source>The Semantic Web – ISWC 2014</source>
          , LNCS, vol.
          <volume>8797</volume>
          , pp.
          <fpage>196</fpage>
          –
          <lpage>211</lpage>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , Krotzsch,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Simanc k</surname>
          </string-name>
          , F.:
          <article-title>The incredible ELK</article-title>
          .
          <source>Journal of Automated Reasoning</source>
          <volume>53</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          –
          <lpage>61</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Kuhn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>The understandability of OWL statements in controlled English</article-title>
          .
          <source>Semantic Web</source>
          <volume>4</volume>
          (
          <issue>1</issue>
          ),
          <fpage>101</fpage>
          –
          <lpage>115</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>S.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scott</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rector</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Ontoverbal: A generic tool and practical application to SNOMED CT</article-title>
          .
          <source>International Journal of Advanced Computer Science and Applications (IJACSA)</source>
          <volume>4</volume>
          ,
          <fpage>227</fpage>
          –
          <lpage>239</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          :
          <article-title>Explaining Reasoning in Description Logics</article-title>
          .
          <source>Ph.D. thesis</source>
          , Rutgers University (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piwek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Measuring the understandability of deduction rules for OWL</article-title>
          . In: Lambrix,
          <string-name>
            <given-names>P.</given-names>
            ,
            <surname>Qi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Horridge</surname>
          </string-name>
          , M. (eds.)
          <source>First International Workshop on Debugging Ontologies and Ontology Mappings (WoDOOM12)</source>
          , pp.
          <fpage>1</fpage>
          –
          <lpage>12</lpage>
          . Linkoping University Electronic Press (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>T.A.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piwek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Predicting the understandability of OWL inferences</article-title>
          .
          <source>In: The Semantic Web: Semantics and Big Data</source>
          , pp.
          <fpage>109</fpage>
          –
          <lpage>123</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>OWL simplified English: A finite-state language for ontology editing</article-title>
          . In: Kuhn,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Fuchs</surname>
          </string-name>
          , N.E. (eds.)
          <source>Controlled Natural Language (Proc. CNL 2012)</source>
          , LNAI, vol.
          <volume>7427</volume>
          , pp.
          <fpage>44</fpage>
          –
          <lpage>60</lpage>
          . Springer (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Schiller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schiller</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Glimm</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Testing the adequacy of automated explanations of EL subsumptions</article-title>
          .
          <source>Tech. rep., Universitat Ulm</source>
          (
          <year>2017</year>
          ), https://www.uni-ulm.de/fileadmin/website_uni_ulm/iui.inst.090/Publikationen/2017/SSG17DL.pdf
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malone</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Third</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Automating generation of textual class definitions from OWL to English</article-title>
          .
          <source>J. Biomedical Semantics</source>
          <volume>2</volume>
          (
          <issue>Suppl</issue>
          . 2),
          <source>S5</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Wason</surname>
            ,
            <given-names>P.C.</given-names>
          </string-name>
          :
          <article-title>Reasoning about a rule</article-title>
          .
          <source>The Quarterly Journal of Experimental Psychology</source>
          <volume>20</volume>
          (
          <issue>3</issue>
          ),
          <fpage>273</fpage>
          –
          <lpage>281</lpage>
          (
          <year>1968</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>