<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Finding fault: Detecting issues in a versioned ontology</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science, University of Manchester</institution>
          ,
          <addr-line>Manchester</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <fpage>9</fpage>
      <lpage>20</lpage>
      <abstract>
<p>Understanding ontology evolution is becoming an active topic of interest to ontology engineers: we have large, collaboratively developed ontologies but, unlike in software engineering, comparatively little is understood about the dynamics of historical changes, especially at a fine level of granularity. Only recently has there been a systematic analysis of changes across ontology versions, and still only at a coarse-grained level. The National Cancer Institute (NCI) Thesaurus (NCIt) is a large, collaboratively developed ontology, used for various Web and research-related purposes, e.g., as a medical research controlled vocabulary. The NCI has published ten years' worth of monthly versions of the NCIt as Web Ontology Language (OWL) documents, and has also published reports on the content of, development methodology for, and applications of the NCIt. In this paper, we carry out a fine-grained analysis of the asserted axiom dynamics throughout the evolution of the NCIt from 2003 to 2012. From this, we are able to identify axiomatic editing patterns that suggest significant regression editing events in the development history of the NCIt.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        This paper is part of a series of analyses of the NCIt corpus [
        <xref ref-type="bibr" rid="ref1 ref2">1,2</xref>
        ], the earlier of which
focus on changes to the asserted and inferred axioms. The current analysis extends
previous work by tracing editing events at the individual axiom level, as opposed to the
ontology level. That is, instead of analysing the total number of axioms added or
removed between versions, we also track the appearance and disappearance of individual
axioms across the corpus. As a result, we are able to positively identify a number of
regressions (i.e., inadvertent introduction of an error) which occur over the last ten years
of the development of the NCIt ontology, as well as a number of event sequences that,
while not necessarily introducing errors, indicate issues with the editing process. We
are able to do this analytically from the editing patterns alone.
We assume that the reader is familiar with OWL 2 [
        <xref ref-type="bibr" rid="ref3">3</xref>
], at least from a modeller's
perspective. An ontology O is a set of axioms, comprising logical and non-logical (e.g.,
annotation) axioms. The latter are analogous to comments in conventional
programming languages, while the former describe entities (classes or individuals) and the
relations between these entities via properties. The signature of an ontology O (the set of
individual, class, and property names in O) is denoted Õ.
      </p>
<p>We use the standard notion of entailment; an axiom α entailed by an ontology O is
denoted by O ⊨ α. We look at entailments of the form A ⊑ B where A and B are class
names, i.e., atomic subsumptions. This is the type of entailment generated by
the classification reasoning task, a standard reasoning task that forms the basis of the
‘inferred subsumption hierarchy’.</p>
<p>Finally, we use the notions of effectual and ineffectual changes as follows:
Definition 1. Let Oi and Oi+1 be two consecutive versions of an ontology O.
An axiom α is an addition (removal) if α ∈ Oi+1 \ Oi (α ∈ Oi \ Oi+1).
An addition α is effectual if Oi ⊭ α (written as α ∈ EffAdd(Oi, Oi+1)), and
ineffectual otherwise (written as α ∈ IneffAdd(Oi, Oi+1)).</p>
<p>
        A removal α is effectual if Oi+1 ⊭ α (written as α ∈ EffRem(Oi, Oi+1)), and ineffectual
otherwise (written as α ∈ IneffRem(Oi, Oi+1)) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
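Definition 1 can be sketched operationally as follows. This is a minimal illustration, assuming ontology versions are represented as sets of (hashable) axioms and that `entails` is a hypothetical stand-in for a reasoner's entailment check; none of these names come from the paper.

```python
# Sketch of Definition 1. Versions are sets of axioms; `entails` is a
# hypothetical oracle standing in for a reasoner's entailment check.
def diff_changes(o_i, o_next, entails):
    additions = o_next - o_i   # axioms in O(i+1) but not O(i)
    removals = o_i - o_next    # axioms in O(i) but not O(i+1)
    # An addition is effectual iff the old version did not already entail it.
    eff_add = {a for a in additions if not entails(o_i, a)}
    ineff_add = additions - eff_add
    # A removal is effectual iff the new version no longer entails it.
    eff_rem = {a for a in removals if not entails(o_next, a)}
    ineff_rem = removals - eff_rem
    return eff_add, ineff_add, eff_rem, ineff_rem
```

With a purely syntactic (membership-only) oracle, every change comes out effectual; in practice the oracle would be an OWL reasoner's entailment check.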
    </sec>
    <sec id="sec-2">
      <title>Conceptual Foundations</title>
<p>Prior to the study of fault detection techniques, we establish a clear notion of the type
of faults we are trying to isolate. In all cases, we define a fault as a deviation from the
required behaviour. In Software Engineering, software faults are commonly divided into
functional and non-functional, depending on whether the fault is in the required
functional behaviour (e.g., whether the system is acting correctly with respect to its inputs,
behaviour, and outputs) or in the expected service the system needs
to provide (i.e., whether the (correct) behaviour is performed well). Functional and
non-functional faults can be further subdivided based on their impact on the system and/or
the requirements specifications. For example, functional faults can be divided into
fatal and non-fatal errors depending on whether the fault crashes the system. Generally,
crashing behaviour is always a fatal fault; however, it might be preferable to encounter
a system crash instead of a non-fatal fault manifested in some other, harder to detect,
manner. Faults that impact the requirements may be implicit, indeterminate (i.e., the
behaviour might be underspecified), or shifting. A shifting specification can render
previously correct behaviour faulty (or the reverse), as faults are defined as deviations from
the “governing” specification. For convenience, we presume throughout this study that
the specification is stable over the lifetime of the examined ontology, i.e., we expect the
notion of ‘acceptable model’ or ‘acceptable entailment’ to be stable throughout the
lifetime of the ontology.</p>
      <p>We also restrict our attention to the logical behaviour of the ontology, and we
approximate this by sets of desired entailments. This restriction might not reflect the full
behaviour of an ontology in some application as 1) many entailments might be
irrelevant to the application (e.g., non-atomic subsumptions for a terminologically oriented
application) or 2) the application might be highly sensitive to other aspects of the
ontology, including, but not limited to, annotations, axiom shape, and naming patterns.
However, these other aspects are less standardised from application to application, so
are rather more difficult to study externally to a given project. Furthermore, faults in the
logical portion of an ontology can both be rather difficult to deal with and affect these
other aspects. With this in mind, we define a logical bug as follows:
Definition 2. An ontology O contains a (logical) bug if O ⊨ α and α is not a
desired entailment, or O ⊭ α and α is a desired entailment.</p>
<p>Of course, whether a (non)entailment is desired or not is not determinable by a
reasoner — a reasoner can only confirm that some axiom is or is not an entailment.
Generally, though, certain classes of (non)entailments are always regarded as errors. In analogy
to crashing bugs in Software Engineering, in particular, the following are all standard
errors:
1. O is inconsistent, i.e., O ⊨ ⊤ ⊑ ⊥
2. A ∈ Õ is unsatisfiable in O, i.e., O ⊨ A ⊑ ⊥
3. A ∈ Õ is tautological in O, i.e., O ⊨ ⊤ ⊑ A
In each of these cases, the “worthlessness” of the entailment is straightforward1 and
we will not justify it further here. That these entailments are bugs in and of themselves
makes them easy to detect, so the entire challenge of coping with them lies in explaining
and repairing them.</p>
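As an illustration, the three standard errors can be phrased against the same kind of hypothetical entailment oracle as above. Axioms here are simple (subclass, superclass) pairs, and the names below are ours, not the paper's.

```python
# Hypothetical sketch of the three standard errors. An axiom is a
# (subclass, superclass) pair; `entails` stands in for a reasoner.
TOP, BOT = "owl:Thing", "owl:Nothing"

def standard_errors(ontology, signature, entails):
    errors = []
    if entails(ontology, (TOP, BOT)):       # 1. O is inconsistent
        errors.append("inconsistent ontology")
    for name in sorted(signature):
        if entails(ontology, (name, BOT)):  # 2. name is unsatisfiable
            errors.append("unsatisfiable: " + name)
        if entails(ontology, (TOP, name)):  # 3. name is tautological
            errors.append("tautological: " + name)
    return errors
```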
<p>Of course, not all errors will be of these forms. For example, in most cases, the
subsumption Tree ⊑ Animal would be an undesired entailment. Detecting this requires
domain knowledge: specifically, the denotations of Tree and Animal, the relation
between them, and the intent of the ontology. If there is an explicit specification, e.g.,
a finite list of desired entailments, then checking the correctness of the ontology would
be straightforward. Typically, however, the specification is implicit and, indeed, may be
inchoate, only emerging via the ontology development process. Consequently, it would
seem that automatic detection of such faults is impossible.</p>
<p>This is certainly true when considering a single version of an ontology. The case
is different when multiple versions are compared. Crucially, if an entailment fluctuates
between versions, that is, if it is the case that Oi ⊨ α and Oj ⊭ α where i &lt; j,
then we can conclude that one of those cases is erroneous. However, the converse pattern,
Oi ⊭ α but Oj ⊨ α, need not be erroneous, as the non-entailment in Oi might just
indicate that the “functionality” had not been introduced yet. In what follows, we consider
a sequence Oi, ..., Om of ontologies, and use i, j, k, ... as indices for these ontologies with
i &lt; j &lt; k &lt; ... With this in mind, we can conservatively determine whether there are logical
faults in the corpus using the following definition.</p>
<p>Definition 3. Given two ontologies Oi, Oj where i &lt; j, the set of changes
{α | α ∈ EffAdd(Oi, Oi+1) ∩ EffRem(Oj, Oj+1)} is a fault indicating set of
changes, written as FiSoC(i, j).</p>
<p>Note that if α ∈ FiSoC(i, j), either the entailment Oi+1 ⊨ α (thus α ∈ Oi+1) or the
non-entailment Oj+1 ⊭ α may be the bug in question, and FiSoC(i, j) does not identify
which is the bug. Instead, the fault indicating set tells us that one of the changes
introduces a bug. As mentioned earlier, the set shows the existence of a bug assuming a
stable specification. Any subsequent occurrence of the same α in a fault indicating set
indicates a content regression. It is not surprising to find reoccurring content regressions
given the absence of content regression testing.</p>
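A sketch of how fault indicating sets might be collected from per-version effectual change sets; the dictionary layout below is illustrative and is not the paper's database schema.

```python
from collections import defaultdict

# eff_add[i] holds EffAdd(Oi, Oi+1); eff_rem[j] holds EffRem(Oj, Oj+1).
def fisoc_pairs(eff_add, eff_rem):
    hits = defaultdict(list)  # axiom -> list of flagging (i, j) pairs
    for i, added in sorted(eff_add.items()):
        for j, removed in sorted(eff_rem.items()):
            if j > i:  # Definition 3 requires i to precede j
                for axiom in added.intersection(removed):
                    hits[axiom].append((i, j))
    return dict(hits)
```

An axiom flagged by more than one (i, j) pair is exactly the reoccurring content regression described above.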
<p>
        We can have a similar set of changes wherein the removal is ineffectual, i.e., α ∈ Oi,
α ∉ Oi+1, but Oi+1 ⊨ α. Since the functionality of the ontology is not changed by
an ineffectual removal, such a set does not indicate regression in the ontology. Indeed,
such a set is consistent with a refactoring of the axiom, that is, syntactic changes to the
axiom that result in the axiom being strengthened or weakened based on the
effectuality of the change [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Of course, if the added axiom is the bug, then the ineffectual
removal would be a failed attempt to remove the bug. Without
access to developer intentions or other external information, we cannot distinguish
between these two situations. However, we can conclude that an iterated pattern of
ineffectual changes is problematic. That is, even if the set of changes EffAdd(Oi, Oi+1) ∩
IneffRem(Oj, Oj+1) is a refactoring, a subsequent ineffectual addition, IneffAdd(Ok, Ok+1),
would indicate a sort of thrashing: if the original refactoring was correct, then
“refactoring back” is a mistake (and if the “refactoring back” is correct, then the original
refactoring is a mistake).
1 There is, at least in the OWL community, reasonable consensus that these are all bugs in the
sort of ontologies we build for the infrastructure we use.
      </p>
<p>Definition 4. Given two ontologies Oi, Oj where i &lt; j, any of the following sets
of changes for α:
F1SSoC. {EffAdd(Oi, Oi+1) ∩ IneffRem(Oj, Oj+1)}
F2SSoC. {IneffAdd(Oi, Oi+1) ∩ IneffRem(Oj, Oj+1)}
F3SSoC. {IneffRem(Oi, Oi+1) ∩ IneffAdd(Oj, Oj+1)}
is a fault suggesting set of changes, written as FSSoC(i, j).</p>
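Under the same illustrative per-version dictionaries used earlier, Definition 4 amounts to three set intersections; the function and dictionary names are ours, not the paper's.

```python
# Sketch of Definition 4 for a fixed version pair (i, j), i preceding j.
# Each dict maps a version index to that step's change set.
def fssoc(eff_add, ineff_add, ineff_rem, i, j):
    assert j > i
    empty = set()
    return {
        "F1SSoC": eff_add.get(i, empty).intersection(ineff_rem.get(j, empty)),
        "F2SSoC": ineff_add.get(i, empty).intersection(ineff_rem.get(j, empty)),
        "F3SSoC": ineff_rem.get(i, empty).intersection(ineff_add.get(j, empty)),
    }
```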
<p>There is a large gap in the strength of suggestiveness between sets of kind
F1SSoC and sets of kinds F2SSoC and F3SSoC. Sets of kind F1SSoC can be
completely benign, indicating only that additional information has been added to the axiom
(e.g., that the axiom was strengthened), whereas there is no sensible scenario for the
occurrence of sets of kinds F2SSoC and F3SSoC. In all cases, much depends on whether
the ineffectuality of the change is known to the ontology modeller. For instance, if a
set of type F1SSoC(i, j) was an attempt to repair α, then α is a logical bug: if α is an
undesired entailment that was meant to have been repaired in Oj, then this repair failed.</p>
<p>All these suggestive sets may be embedded in larger sets. Consider the set where
α is in (1) EffAdd(Oi, Oi+1), (2) IneffRem(Oj, Oj+1), (3) IneffAdd(Ok, Ok+1), and
(4) EffRem(Ol, Ol+1). From this we have an indicative fault in the set &lt;(1),(4)&gt;
and two suggestive faults in the sets &lt;(1),(2)&gt; and &lt;(2),(3)&gt;. The latter two seem to
be subsumed by the encompassing former. The analysis presented here does not, at this
time, cover all paired possibilities. This is partly because some are impossible
on their own (e.g., two additions or two removals in a row) and partly because
some are subsumed by others.</p>
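The pairing just described can be sketched as a scan over an axiom's chronological change events; the event labels are shorthand for the change kinds defined above, and the code is illustrative only.

```python
# Which ordered pairs of change events indicate or suggest a fault.
INDICATING = {("EffAdd", "EffRem")}        # Definition 3
SUGGESTING = {("EffAdd", "IneffRem"),      # F1SSoC
              ("IneffAdd", "IneffRem"),    # F2SSoC
              ("IneffRem", "IneffAdd")}    # F3SSoC

def flag_pairs(events):
    """events: chronological list of (kind, version) for one axiom."""
    flags = []
    for a in range(len(events)):
        for b in range(a + 1, len(events)):
            (ka, va), (kb, vb) = events[a], events[b]
            if (ka, kb) in INDICATING:
                flags.append(("FiSoC", va, vb))
            elif (ka, kb) in SUGGESTING:
                flags.append(("FSSoC", va, vb))
    return flags
```

For the four-event example above, this yields the indicative pair between (1) and (4), and the two suggestive pairs between (1),(2) and between (2),(3).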
      <p>Of course, as we noted, all these observations only hold if the requirements have
been stable over the examined period. If requirements fluctuate over a set of changes,
then the changes might just track the requirements and the ontology might never be in
a pathological state.</p>
    </sec>
    <sec id="sec-3">
      <title>Methods and Materials</title>
      <p>
        The verification of the concepts and definitions proposed in Section 3 is carried out by
conducting a detailed analysis of The National Cancer Institute Thesaurus (NCIt)
ontology. The National Cancer Institute (NCI) is a U.S. government-funded organisation for
research into the causes, treatment, and prevention of cancer [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The NCIt is an ontology
written in the Web Ontology Language (OWL) which supports the development and
maintenance of a controlled vocabulary about cancer research. Reports on the
collaboration process between the NCI and its contributors were published in 2005 and
2009 (see [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5,6,7</xref>
        ]), which provide a view of the procedural practices adopted to support
domain experts and users in the introduction of new concepts into the ontology. These
publications, together with the publicly available monthly releases and concept change
logs, are the basis for the corpus used in this study.
      </p>
      <p>
        We gathered 105 versions of the NCIt (release 02.00 (October 2003) through to
12.08d (August 2012)) from the public website.2 Two versions are unparseable using
the OWL API [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and were discarded, leaving 103 versions. The ontologies were parsed
and individual axioms and terms were extracted and inserted into a MySQL v5.1.63
database. The database stores the following data for each NCIt release Oi (where i is
the version identifier):
1. Ontology Oi: each ontology’s NCI identifier is stored in a table “Ontology”
with a generated integer identifier i.
2. Axioms αj ∈ Oi: each structurally distinct axiom αj is stored in an “Axioms”
table with identifier j, and a tuple (j, i) is stored in a table “Is In” (that is, axiom j
is asserted in ontology i).
3. Classes Cj ∈ Oi: each class name Cj is stored in a table “Classes” with an
identifier j, followed by the tuple (j, i) in table “Class In”.
4. Usage of class Cj in Oi: each class Cj that is used (mentioned) in an axiom
αk ∈ Oi is stored in table “Used In” as a triple (j, k, i).
5. Effectual changes: each added (removed) axiom αj ∈ EffAdd(Oi, Oi+1) (αj ∈
EffRem(Oi, Oi+1)), with identifier j, is stored in table “Effectual Additions”
(“Effectual Removals”) as a tuple (j, i + 1).
6. Ineffectual changes: each added (removed) axiom αj ∈ IneffAdd(Oi, Oi+1)
(αj ∈ IneffRem(Oi, Oi+1)), with identifier j, is stored in table “Ineffectual
Additions” (“Ineffectual Removals”) as a tuple (j, i).
      </p>
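The schema described above might be sketched as follows, using sqlite3 in place of MySQL. Table names with spaces are written with underscores here, and the column layout is our reading of the description, not the authors' actual DDL.

```python
import sqlite3

# Minimal sketch of the described schema (sqlite3 standing in for MySQL).
DDL = """
CREATE TABLE Ontology (i INTEGER PRIMARY KEY, nci_id TEXT);
CREATE TABLE Axioms (j INTEGER PRIMARY KEY, axiom TEXT UNIQUE);
CREATE TABLE Is_In (j INTEGER, i INTEGER);
CREATE TABLE Classes (j INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE Class_In (j INTEGER, i INTEGER);
CREATE TABLE Used_In (class_j INTEGER, axiom_k INTEGER, i INTEGER);
CREATE TABLE Effectual_Additions (j INTEGER, i INTEGER);
CREATE TABLE Effectual_Removals (j INTEGER, i INTEGER);
CREATE TABLE Ineffectual_Additions (j INTEGER, i INTEGER);
CREATE TABLE Ineffectual_Removals (j INTEGER, i INTEGER);
"""

def build_db(path=":memory:"):
    con = sqlite3.connect(path)
    con.executescript(DDL)
    return con
```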
<p>The data and SQL queries used to produce this study are available online.3</p>
<p>All subsequent analyses are performed by means of SQL queries against this database
to determine suitable test areas and to perform fault detection analysis. For test area identification,
we select test sets based on the outcome of 1) frequency distribution analysis of the
set of asserted axioms (i.e., in how many versions each axiom appears), and
2) consecutivity analysis of asserted axioms (whether an axiom’s occurrence pattern has
“gaps”). For fault detection, we conduct SQL-driven data count analysis between the
selected test cases and the effectual and ineffectual database tables to categorise logical
bugs as FiSoCs or FSSoCs.
2 ftp://ftp1.nci.nih.gov/pub/cacore/EVS/NCI_Thesaurus/archive/.
3 http://owl.cs.manchester.ac.uk/research/topics/ncit/
regression-analysis/</p>
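The two selection analyses can be sketched directly over the set of version numbers in which an axiom is asserted; this is an illustrative reading of the description, not the study's SQL.

```python
# Frequency: in how many versions an axiom appears.
# Consecutivity: whether those versions form a gap-free run.
def frequency(versions):
    return len(versions)

def is_consecutive(versions):
    vs = sorted(versions)
    return vs[-1] - vs[0] + 1 == len(vs)
```

For example, an axiom asserted in versions 20 to 23 and again in version 45 has frequency 5 but is not consecutive, the kind of “gap” pattern the fault detection analysis targets.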
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <sec id="sec-4-1">
        <title>Test Areas Selection</title>
<p>The test area selection for this study is determined by conducting analyses of axioms’
frequency distribution and consecutivity. Frequency distribution analysis
calculates the number of versions in which an axiom is present in the NCIt. From this
we identify each axiom’s consecutivity based on the type of occurrence in the corpus:
continual occurrence, interrupted occurrence, or single occurrence. The
analysis of axioms with continual occurrence provides knowledge about the stability of the
ontology, since it helps with the identification of axioms that, due to their consistent
presence throughout the ontology’s versions, can be associated with the ‘core’ of the
represented knowledge. As described in Section 3, axioms’ presence can be
correlated with FiSoCs or FSSoCs depending on the effectuality of their changes.</p>
<p>In the analysis, we found that the highest number of asserted axioms, 20,520,
corresponds to frequency 11. This means that 20,520 axioms appear in the NCIt ontology
for exactly 11 versions. Of these, 20,453 asserted axioms (99.67%)
appear in 11 consecutive versions. The distribution of these axioms across the corpus
is concentrated between versions 6 and 16 with 19,384 asserted axioms (the majority of
these additions took place in version 6, with 13,715 added axioms), between versions
1 and 52 with 593 asserted axioms, and 187 asserted axioms for the remaining versions.
These numbers do not account for the 358 new asserted axioms added in version 93
that are still in the corpus at version 103 with 11 occurrences, but which may
remain in the corpus in future versions.</p>
<p>The next highest frequency is 5, with 14,586 asserted axioms, 14,585 of which
occur consecutively. The one exception, the axiom Extravasation ⊑ BiologicalProcess, is
present from version 20 to 23, is removed in version 24, and re-enters in version 45
before being removed in version 46.</p>
        <p>The next two rows in Table 1 show the results for frequency distribution 2 and 3 with
13,680 and 12,806 asserted axioms respectively. For frequency distribution 2, there are
10,506 asserted axioms with consecutive occurrence. Of these axioms, 445 entered the
corpus in version 102 and remain in the corpus until version 103. The total number of
axioms with non-consecutive occurrences is 3,174 asserted axioms. However, only 8
axioms are not included in the set of axioms that are part of the modification event
taking place between versions 93 and 94. In this event 3,166 axioms with non-consecutive
occurrences were added in version 93, removed (or possibly refactored) in version 94,
and re-entered the ontology in version 103. This editing event is discussed in Section
6. Of the 12,806 asserted axioms with frequency distribution 3, 12,804 asserted axioms
occur in consecutive versions (99.98%) and 644 asserted axioms are present in the last
studied version of the corpus.</p>
<p>Our results show that three high frequency distributions are observed in the top
ten distributions, with axioms occurring in 87, 79, and 103 versions. There are 12,689
asserted axioms present in 87 versions, with 99.86% of them occurring
consecutively. Of these axioms, 12,669 appear in the last version of the
ontology, with 12,651 added in version 17 and remaining consecutively
until version 103. For frequency distribution 79, there exist 10,910 asserted axioms
that appear in 79 versions, with 10,866 still present in version 103. Of these 10,866
asserted axioms, 10,861 were added in version 25 and remain consecutively until
version 103. Finally, there are 8,933 asserted axioms that appear in all 103 studied
versions of the NCIt; that is, they were added in the first studied version and remain
until the last. Of the 132,784 asserted axioms present in version 103, these
axioms amount to 6.73% and represent a stable backbone of asserted axioms
present in all versions of the NCIt.</p>
<p>As seen in Table 1, 12,219 asserted axioms occur in only 1 version of the NCIt. Of
these asserted axioms, 2,084 appear in version 103 and may remain in future
versions. Taking this into account, we observe that a total of 10,135 asserted
axioms with single occurrences are spread over the remaining 102 versions. Of the 103
studied versions, 98 versions have asserted axioms that appear only in those versions,
while versions 45, 54, 88, 92, and 100 do not. A detailed representation of this
distribution across the NCIt corpus demonstrates that the first three years of the studied NCIt
versions show the highest rate of single occurrences in the corpus, with three identifiable
peaks of single occurrences around versions 1 to 5, versions 16 to 18, and versions
21 to 25.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Fault Detection Analysis</title>
<p>In this study, we limit fault detection analysis to the finite set of asserted axioms with
non-consecutive occurrence for the top ten frequencies identified in the previous
section. It is important to note at this point that this study does not examine the set of
all FiSoCs and FSSoCs for the collected versions of the NCIt. Instead, we focus our
attention on the identified 53 asserted axioms that occur in non-consecutive versions for
the top ten distributions, excluding all axioms that were part of the renaming events
identified between versions 91 to 103 of the NCIt. Of these 53 examined axioms, 32
asserted axioms have logical bugs of type FiSoC. Further examination of the change
sets of these FiSoCs indicates that 27 axioms conform directly with Definition 3,
because all of their additions and removals are effectual; that is, the set of changes is
{EffAdd(Oi, Oi+1) ∩ EffRem(Oj, Oj+1)}. The remaining 5 axioms have change
sets of type {EffAdd(Oi, Oi+1) ∩ IneffRem(Oj, Oj+1) ∩ EffRem(Ok, Ok+1)}.
Although in this set there is an ineffectual removal prior to the effectual removal, from
this change set we may conclude that the ineffectual removal is “fixed” when the effectual
removal takes place.</p>
<p>We also identified the asserted axiom
Benign Peritoneal Neoplasm ⊑ Disease Has Primary Anatomic Site
only Peritoneum, with axiom id 159025, as a logical bug of type FiSoC for the first
removal {EffAdd(O20, O21) ∩ EffRem(O21, O22)}, and a second logical bug of type
FSSoC for the second removal {EffAdd(O26, O27) ∩ IneffRem(O28, O29)}. The
presence of both kinds of logical bug, FiSoC and FSSoC, in this axiom suggests that the
reintroduction of the axiom to the ontology in version 27, after being removed in version
22, may correspond to content regression, and the second, ineffectual removal in version
29 to refactoring.</p>
        <p>
The remaining 21 asserted axioms have logical bugs of type FSSoC. Seventeen of
these axioms conform with the F1SSoC set, thus suggesting possible refactoring. To
confirm these refactoring events, additional axiomatic difference analysis needs to be
carried out on these axioms, as suggested in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Four axioms (axiom ids 110594, 153578,
157661, and 127241) have change sets of kind F2SSoC. Two of these axioms
(axiom ids 157661 and 127241) suggest refactoring for the first change set (the set is
of type F1SSoC), and are later re-introduced into the ontology with logical bugs of type
FiSoC.
        </p>
        <p>
As mentioned earlier, the analysis conducted in this section excludes fault detection
for the set of axioms affected by the renaming event that took place between versions
91 and 103. We provide more information about this renaming event and its impact
on our results in Section 6. However, it is important to mention that our analysis is
sensitive to cosmetic changes to axioms, e.g., axiom renaming, and does not treat them
as logical bugs, due to the superfluous nature of these changes.
[Table: for each frequency rate, the non-consecutive axiom IDs with their version
intervals for &lt;Eff. Add., Eff. Rem.&gt;, &lt;Eff. Add.&gt;, &lt;Eff. Add., Ineff. Rem., Eff.
Rem.&gt;, and &lt;Ineff. Add.&gt;, plus the first and last NCIt versions in which each appears.
Recoverable rows at frequency rate 11: axioms 57506 and 58364 (&lt;4,5&gt;, &lt;7,17&gt;;
versions 4–16), 103206 and 105069 (&lt;7,17,26&gt;; versions 7–25), and 210295
(&lt;40,47&gt;, &lt;51,55&gt;; versions 40–54).]
The analysed versions of the NCIt, from 2003 to 2012, show that the ontology is
consistently active, and that the evolution
management process in place for the NCIt’s maintenance (as described in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ])
may be a positive contributor to the overall steady growth of the NCIt ontology.
        </p>
<p>The growth of the ontology is mostly driven by the asserted ontology, where high
levels of editing activity took place in the first three years of the analysed period.
The change dynamics observed in this period suggest a trial-and-error phase, in which
editing and modelling activities took place until a level of stability was reached, possibly
related to reaching maturity, for the remainder of the observed versions.</p>
        <p>Although the chronological analysis primarily points to the first three years as a
phase of rapid change, a more in-depth study of the diachronic data set revealed that
content regression takes place throughout all versions of the NCIt. A detailed study of
the ‘life of axioms’ in the ontology from the Frequency Distribution Analysis shows that
the evolution of the NCIt is marked by logical bugs of either FiSoC and/or FSSoC types.</p>
<p>As a result, we found that asserted axioms with logical bugs enter the ontology in one
version, are removed in a later version, and later re-enter the ontology unchanged.
Only 6.73% of the asserted axioms in version 103 correspond to axioms that have been
present unchanged from the first analysed version until this last version.</p>
<p>Our study revealed that most asserted axioms appear in two versions of the ontology.
However, in this finding we identified 125,294 axioms affected by the renaming
event that took place between versions 93 and 94. In a preliminary study conducted for
this paper, we found that these asserted axioms first appear in version 93, are removed in
version 94, and then re-enter the NCIt unchanged in version 103. We have confirmed
with the NCI that this editing event corresponds to the renaming of terms that took
place in version 93, where every term name was replaced by its NCIt code. This
renaming event also affects the set of asserted axioms with
frequency distribution 11. The non-consecutive version occurrences of 1,186 axioms
show that they first occur consecutively in versions 91 and 92, are removed in version
93, and then re-enter the ontology in version 94. These axioms remain consecutively
until version 102, before being removed again in version 103. This
renaming event does not affect the information content dynamics of the ontology;
however, it does affect the overall change dynamics. The renaming event is important
to our analysis because it shows that major editing periods are still part of the NCIt.</p>
<p>Taking these renaming events into account, the study found that the overall NCIt
‘survival’ rate for asserted axioms is 5 versions. Axioms with non-consecutive presence
in the ontology are directly linked to logical bugs that either indicate content regressions
or suggest axiom refactoring. Information content is not as permanent as the managerial
and maintenance processes indicate, and logical bugs for unmodified axioms are more
predominant than expected. The analysis conducted in this paper identifies specific sets
of axioms that are part of this group of regression cycles, and it is able to provide in
detail the types of faulty editing patterns for these axioms and the location of these errors.
We argue that the identification of axioms with re-occurring logical bugs is a crucial step
towards the identification of test cases and test areas that can be used systematically in
Ontology Regression Testing.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Limitations</title>
<p>This study is subject to the following limitations: (i) The NCIt
evolution analysis and asserted axiom dynamics correspond to the publicly available OWL
versions of the NCIt from release 02.00 (October 2003) to 12.08d (August 2012).
Historical records of the NCIt prior to OWL are not taken into consideration. (ii)
The presented results and analysis are limited in scope to the set of asserted axioms only.
Entailment analysis is conducted only for the computation of
logical differences, to categorise the asserted axioms’ regression events into logical bugs
of types FiSoC or FSSoC. (iii) The test area for the set of axioms present in
non-consecutive versions is derived by selecting all axioms with non-consecutive
presence based on their ranking in the high frequency analysis for all asserted axioms. The
selected test area should be viewed as a snapshot of the whole population of axioms
with non-consecutive presence, since the set of 53 analysed axioms corresponds only
to the top 10 high frequency distributions, as described in Section 5.1. Analysis of the
whole corpus is planned for future research. (iv) This study primarily corresponds to
functional requirement test impact analysis, since it deals directly with the ontology.
Non-functional requirements are linked to entailment analyses, such as subsumption
hierarchy studies, which are excluded from this work.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
<p>Large collaborative ontologies such as the NCIt need robust change analysis in
conjunction with maintenance processes in order to continue to be effectively supported.
The work presented in this paper shows that a detailed study of axioms with logical bugs
needs to be part of ontology evaluation and evolution analysis techniques, due to its
significant contribution to regression testing in ontologies. Although the study presented
here is limited in that it only evaluates unchanged asserted axioms, it still shows that
a great portion of the editing effort taking place in the NCIt concerns unmodified
content. Regression analysis of this unmodified content can target specific changes in the
modelling and representation approaches, which can potentially save effort and increase
productivity in the maintenance of the ontology.</p>
      <p>Regression testing in Ontology Engineering is still a growing area of research, and
the work presented here shows that a step towards regression analysis in
ontologies is to provide quantitative measurements of axiom change dynamics,
identification of logical bugs, and the study of ontology evolutionary trends, all of which
can be extracted efficiently by examining versions of an ontology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Gonçalves</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Analysing the evolution of the NCI thesaurus</article-title>
          .
          <source>In: Proc. of CBMS-11</source>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Gonçalves</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Analysing multiple versions of an ontology: A study of the NCI Thesaurus</article-title>
          .
          <source>In: Proc. of DL-11</source>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cuenca Grau</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>OWL 2: The next step for OWL</article-title>
          .
          <source>J. of Web Semantics</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>de Coronado</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haber</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sioutos</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tuttle</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wright</surname>
            ,
            <given-names>L.W.</given-names>
          </string-name>
          :
          <article-title>NCI Thesaurus: Using science-based terminology to integrate cancer research results</article-title>
          .
          <source>Studies in Health Technology and Informatics</source>
          <volume>107</volume>
          (
          <issue>1</issue>
          ) (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Hartel</surname>
            ,
            <given-names>F.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Coronado</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dionne</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fragoso</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Golbeck</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Modeling a description logic vocabulary for cancer research</article-title>
          .
          <source>J. of Biomedical Informatics</source>
          <volume>38</volume>
          (
          <issue>2</issue>
          ) (
          <year>2005</year>
          )
          <fpage>114</fpage>
          -
          <lpage>129</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>NCI Thesaurus - Apelon TDE Editing Procedures and Style Guide</article-title>
          .
          <source>National Cancer Institute</source>
          . (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Coronado</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Solbrig</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fragoso</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartel</surname>
            ,
            <given-names>F.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musen</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Representing the NCI Thesaurus in OWL: Modeling tools help modeling languages</article-title>
          .
          <source>Applied Ontology</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ) (
          <year>2008</year>
          )
          <fpage>173</fpage>
          -
          <lpage>190</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Horridge</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bechhofer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The OWL API: A Java API for working with OWL 2 ontologies</article-title>
          .
          <source>In: Proc. of OWLED-09</source>
          . (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Gonçalves</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parsia</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Categorising logical differences between OWL ontologies</article-title>
          .
          <source>In: Proc. of CIKM-11</source>
          . (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>de Coronado</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wright</surname>
            ,
            <given-names>L.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fragoso</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haber</surname>
            ,
            <given-names>M.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hahn-Dantona</surname>
            ,
            <given-names>E.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hartel</surname>
            ,
            <given-names>F.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Quan</surname>
            ,
            <given-names>S.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Safran</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomas</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whiteman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>The NCI Thesaurus quality assurance life cycle</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>42</volume>
          (
          <issue>3</issue>
          ) (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>