<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Learning classification taxonomies from a classification knowledge based system</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hendra Suryanto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Compton</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Laboratory, School of Computer Science and Engineering, University of New South Wales</institution>
        </aff>
      </contrib-group>
      <abstract>
<p>Knowledge-based systems (KBS) are not necessarily based on well-defined ontologies. In particular, it is possible to build KBS for classification problems where there is little constraint on how classes are organised and a class is expressed by the expert as a free-text conclusion to a rule. This paper investigates how relations between such 'classes' may be discovered from existing knowledge bases, and then how to construct a model of these classes (an ontology) based on user-selected patterns in the class relations. We have applied our approach to KBS built with Ripple Down Rules (RDR) [1]. RDR is a knowledge acquisition and knowledge maintenance methodology which allows KBS to be built very rapidly and simply, but does not require a strong ontology. Our experimental results are based on a large real-world medical RDR KBS. The motivation for our work is to allow an ontology in a KBS to 'emerge' during development, rather than requiring the ontology to be established prior to the development of the KBS.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Most knowledge acquisition methodologies first build a
model of the domain knowledge before using this to build a
particular problem solver, e.g. KADS and CommonKADS
[2], Protégé-2000 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Although this approach facilitates
reuse, it does not overcome the knowledge acquisition and
maintenance bottleneck, and these problems are present
both in the development of the ontology and of the
consequent problem solver.
      </p>
      <p>
        The RDR approach starts knowledge acquisition (KA) to
build the problem solver immediately without any
modelling apart from a simple attribute-value data
representation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Even the attribute-value representation
can be developed while KA is in progress. The focus of the
approach is to make the addition of each incrementally
added piece of knowledge as simple and as reliable as
possible. Although this approach facilitates KA and
maintenance [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], it does not facilitate re-use, because of the
lack of an ontology.
      </p>
      <p>In this learning problem we are dealing with rules rather
than raw cases; the relevant attributes have already been
identified and extracted. Our aim is to discover the
appropriate ontology given that the relevant attributes (and
values) are already well identified. A second aspect of the
problem is that in a real-world system, attributes are
multivalued rather than boolean. Conditions in rules can then
subsume each other, be disjoint, etc. For example, age &gt; 10
subsumes age = 50, whereas age &gt; 40 and age &lt; 10 are clearly
disjoint. Our method needs not only to combine
information about classes from across the knowledge base,
but also to address the way in which conditions based on
multivalued attributes interrelate. Figure 1 shows some rules for
the class Satisfactory lipid profile
previous raised LDL noted. In the first rule there
is a condition Max(LDL) &gt; 3.4 and in the second rule
there is a condition Max(LDL) is HIGH, where HIGH is a
range between two real numbers.</p>
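      <p>As an illustration of how such condition comparisons can be
mechanised (a minimal sketch; the interval representation and the
names used are our own, not part of any existing RDR system),
numeric conditions can be treated as intervals:</p>
      <preformat>
```python
# Sketch (our assumption of a minimal representation): a numeric rule
# condition such as "age > 10" is modelled as an interval (lo, hi);
# "age = 50" becomes the degenerate interval (50, 50).
INF = float("inf")

def subsumes(a, b):
    """True if every value satisfying condition b also satisfies a."""
    return b[0] >= a[0] and a[1] >= b[1]

def disjoint(a, b):
    """True if no value can satisfy both conditions."""
    return b[0] > a[1] or a[0] > b[1]

age_gt_10 = (10, INF)   # age > 10 (open/closed bounds ignored here)
age_eq_50 = (50, 50)    # age = 50
age_gt_40 = (40, INF)   # age > 40
age_lt_10 = (-INF, 10)  # age below 10

print(subsumes(age_gt_10, age_eq_50))  # age > 10 subsumes age = 50
print(disjoint(age_gt_40, age_lt_10))  # the two ranges cannot overlap
```
      </preformat>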
      <sec id="sec-1-3">
        <title>Figure 1. Rules for the class Satisfactory lipid profile previous raised LDL noted</title>
        <p>Satisfactory lipid profile previous raised LDL noted &lt;-
((LDL &lt;= 3.4) AND (Triglyceride is NORMAL) AND (Max(LDL) &gt; 3.4))
OR
((LDL is NORMAL) AND (Triglyceride is NORMAL) AND (Max(LDL) is HIGH))</p>
        <p>Finally, the problem is compounded by the way in which
the expert adds conclusions. When the Multiple
Classification RDR (MCRDR) KB makes an error, the task
of the expert is to specify the correct conclusion and to
identify the attributes and values that justify this
conclusion. In adding the conclusion, the expert can select
from a list of pre-existing conclusions organised into broad
categories, but can also simply type in a new conclusion.
In medical pathology result interpretation, the evaluation
domain here, the conclusions added by the pathologist may
provide advice to the referring clinician: on patient
diagnosis, management, how treatment is progressing,
whether the tests ordered were appropriate, what tests
might still be necessary, or any combination of the above. It
is quite clear to both the expert and the receiver of the
advice what information is being provided in the free-text
interpretation, but these interpretations are a long way from
the well-defined classes of a formal ontology. A task
analysis would assess this domain as a classification
problem, but this does not imply well-defined classes.
Hence the problem is not only that disjuncts for a class
(separate rule paths) may be scattered across the KB, but
that the same class may be represented by different text
strings. Such text strings may cover a combination of
different classes. Some examples are given in Table 1.</p>
        <p>Figure 2. An example MCRDR exception structure (tree edges as
read from the extracted layout):
Rule 1: if true then ...
    except Rule 2: if sky=SUNNY then Go Swimming
        except Rule 3: if ultraViolet=VERY HIGH then Swimming at
        indoor swimming pool
        except Rule 4: if wave=LOW then Swimming in the beach
            except Rule 6: if wind &gt; 40 km/h then Play Chess
    except Rule 5: if ultraViolet=MEDIUM and wind&lt;=30 then
    Swimming in the beach
        except Rule 7: if sky=CLOUDY then Swimming at indoor
        swimming pool
        except Rule 8: if wind &gt; 30 km/h then Play chess
    except Rule 9: if ultraViolet=LOW then Go Swimming
        except Rule 10: if sky=RAINY then Play chess</p>
        <p>
          The question arises of whether it would be better to start
with a better-developed ontology, as suggested by most
KA researchers. However, in practice RDR systems allow
experts to very rapidly and easily build large knowledge
bases [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], and recent commercial RDR systems have
confirmed these advantages outside the research
environment (Pacific Knowledge Systems (PKS), personal
communication). The aim of the present work, then, is to
preserve the ease and speed of development provided by
RDR systems, but to overcome their lack of an initial strong
ontology by discovering the ontologies implicit in these
incrementally developed systems. This may give us the
best of both worlds.
        </p>
        <p>
          The RDR exception structure provides a compact representation
of knowledge [
          <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref6 ref7 ref8 ref9">6-13</xref>
          ]. Gaines [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] has also generalized
RDR to Exception Directed Acyclic Graphs (EDAGs). He
argues that EDAGs are more compact and readable than
MCRDR because they use a graph rather than a tree structure.
Initial RDR development was concerned with classification
tasks, first single and later multiple classification. RDR has
since been extended to configuration [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], heuristic search
[
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], document retrieval [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] and a more general RDR
system for construction tasks has been proposed [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
Figure 2 shows a simple example of the exception structure
used in MCRDR. All pathways are evaluated. Evaluation
on any pathway stops when a leaf node is reached or no
child rule is satisfied. A conclusion is provided by the last
satisfied rule in each refinement pathway. When an expert
identifies that an erroneous conclusion has been given, they
enter a new rule, which the system adds as a refinement.
The expert is assisted in providing an appropriate rule by
the system requiring that cases that correctly satisfy the
parent rule should not satisfy the child rule.
        </p>
        <p>In this study we have used five different pathology knowledge
bases provided by PKS ranging in size from 25 to 320
rules. We also have associated sets of case data ranging in
number from 453 to 2218 cases. The largest PKS
knowledge base is over 7000 rules, but this is the subject of
other related studies.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Ontology learning overview</title>
      <p>First, we discover class relations between rules. We
consider three basic relations: subsumption/intersection,
mutual exclusivity and similarity. Second, using these three
basic relations, we specify some compound relations which
appear interesting. We then extract the instances of
these compound relations, or patterns, and assemble them
into a class model.</p>
      <p>We define that class-A subsumes class-B with subsumption
value 1.0 if we always have class-A when we have class-B,
but not the other way around. We decide that class-A
subsumes class-B from the conditions used in the rules for
class-A and class-B. This syntactic subsumption needs to
be evaluated by the expert, to check whether class-A semantically
subsumes class-B as well. If the subsumption value is less
than 1.0 we use the term intersection rather than
subsumption. If class-A subsumes class-B with value 0, it
means we have no information about the
subsumption/intersection relation for class-A and class-B.
The mutual exclusivity measure of class-A and class-B is
1.0 if class-A and class-B never occur together. In RDR we
evaluate this measure by checking whether there is a
condition in a rule for class-A that is mutually exclusive with any
condition in a rule for class-B (for example age &gt; 50 and
age &lt; 30). In RDR knowledge bases, parent rules are
always mutually exclusive with their child rules.</p>
      <p>Class-A is similar to class-B with measure 1.0 if both
classes have exactly the same conditions. We use the term
same rather than similar if the similarity is 1.0.</p>
      <p>A class in MCRDR is the set of disjunct rule paths giving
the same conclusion. A rule path consists of all conditions
from all predecessor rules plus conditions of this particular
node’s rule. For example (see Figure 2):
rule path 6: class Play Chess &lt;- wind &gt; 40, wave=LOW, sky=SUNNY.</p>
      <p>
The central idea of the technique is to group all rules for
each class and compute a quantitative measurement (from 0
to 1) for each relation (subsumption, mutual-exclusivity,
similarity) between every pair of classes. We use this
quantitative measure as an informal confidence measure as
to whether these relations exist. The algorithm will be
discussed in detail below, but when applied to the example
in Figure 2 it gives: class Go Swimming subsumes class
Swimming in the beach with degree of confidence 0.83;
class Play_Chess and class Go_Swimming are mutually
exclusive with degree of confidence 0.17; class Go Swimming
is similar to class Play Chess with degree of similarity
0.50. This quantitative measure enables us to group different
examples of the class and provides information on whether,
across these examples, a class tends to subsume another class
(for example, class Go Swimming subsumes class Swimming
in the beach with degree of confidence 0.83). Using this
quantitative measure we can say class Go Swimming tends to
subsume or almost subsumes class Swimming in the beach,
rather than simply saying class Go Swimming subsumes class
Swimming in the beach or class Go Swimming does not
subsume class Swimming in the beach.</p>
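      <p>The grouping of conditions into rule paths can be sketched as
follows (a minimal illustration: the rule numbers and conditions
follow Figure 2, while the data structure and names are our own):</p>
      <preformat>
```python
# Sketch: a rule path collects the conditions of a node and all its
# predecessors in the MCRDR exception tree. Rules 2, 4 and 6 are taken
# from Figure 2; "GT_40" stands for the condition "wind > 40 km/h".
RULES = {
    2: (1, {"sky": "SUNNY"}),   # (parent rule, this rule's conditions)
    4: (2, {"wave": "LOW"}),
    6: (4, {"wind": "GT_40"}),
}

def rule_path(rule_id):
    """Union of conditions from the rule and all its predecessors."""
    conds = {}
    while rule_id in RULES:
        parent, c = RULES[rule_id]
        conds.update(c)
        rule_id = parent
    return conds

print(rule_path(6))  # rule path 6: wind GT_40, wave LOW, sky SUNNY
```
      </preformat>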
      <p>These measures become interesting when applied to real
examples such as: class [Euthyroid levels] subsumes
class [Levels consistent with adequate
replacement] with degree of confidence 0.866.
(Medically, ‘adequate’ thyroid hormone replacement brings
thyroid hormone levels to approximately normal levels.)
Boolean values are obviously inappropriate for subsumption
/intersection, mutual-exclusivity, and similarity relations in
real domains. We found that in the Iron knowledge base,
there were only 16 subsumption relations with degree of
confidence 1.0; 8 mutually-exclusive relations with degree of
confidence 1.0 and no similarity relations with degree of
confidence 1.0.</p>
      <p>We refine the subsumption/intersection measure by not only
considering the conditions in the rule path but also the ratio of
number of cases handled by parent and child rules in a
rulepath. This allows us to deal with the situation where a rule is
a gross over-generalisation and the child rule is added as a
correction to deal with most of the cases the parent rule would
fire on. We consider that there is little value to the
subsumption relation in this circumstance. For example (see
figure 2) rulepath-2 has two cases, rulepath-3 has
two cases, rulepath-4 has 3 cases and rulepath-6 has
3 cases. We therefore calculate the quality of rulepath-2
is 2/(2+2+3+3) = 0.2, quality of rulepath-4 is 3/(3+3) =
0.5, quality of rulepath-3 is 2/2 = 1.0 and quality of
rulepath-6 is 3/3 = 1.0.</p>
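      <p>The quality calculation above can be sketched as follows (case
counts and tree edges as in the worked example for Figure 2; the
representation is our own):</p>
      <preformat>
```python
# Sketch of the rule-quality measure: Q(rule) = cases at the rule
# divided by (cases at the rule + cases at all its descendants).
CHILDREN = {2: [3, 4], 4: [6]}    # exception-tree edges (Figure 2)
CASES = {2: 2, 3: 2, 4: 3, 6: 3}  # cases handled by each rule path

def descendants(rule_id):
    """All rules below rule_id in the exception tree."""
    out = []
    for child in CHILDREN.get(rule_id, []):
        out.append(child)
        out.extend(descendants(child))
    return out

def quality(rule_id):
    below = sum(CASES[d] for d in descendants(rule_id))
    return CASES[rule_id] / (CASES[rule_id] + below)

print(quality(2))  # 2 / (2 + 2 + 3 + 3) = 0.2
print(quality(4))  # 3 / (3 + 3) = 0.5
print(quality(3))  # 2 / 2 = 1.0
```
      </preformat>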
      <p>We do not need a rule quality measure if we use a flat rule
system rather than the refinements of a rule path. For example,
a flat rule for rule_2 is:
class Go Swimming &lt;- sky=SUNNY, not(ultraViolet=VERY
HIGH), not(wave=LOW).</p>
      <p>Flat rules for rule_3 and rule_6 are the same as their rule paths
since the rule path extends to the leaves and so no negation of
child conditions is required. Rule_1 has three children with
one, one and two conditions, and therefore we get (1 x 1 x 2)
flat rules. By converting this RDR knowledge base to flat
rules we get an equivalent knowledge base, but this is
generally not feasible with real-world knowledge bases. In
the five knowledge bases considered here, some rules have
many children with several conditions. There is one rule with
10 children where every child has 5 conditions, which converts to
(5^10) flat rules.</p>
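      <p>The conversion to flat rules can be sketched as follows (a
minimal illustration: each flat rule keeps the rule's own conditions
and negates one condition from each child, so the number of flat
rules is the product of the children's condition counts; the
condition strings are illustrative):</p>
      <preformat>
```python
from itertools import product

# Sketch of the flat-rule conversion: for a rule whose children hold
# c1, c2, ... conditions, each flat rule adds one negated condition
# per child, giving c1 * c2 * ... flat rules in total.
def flat_rules(own_conds, child_cond_lists):
    out = []
    for picks in product(*child_cond_lists):
        negs = ["not(" + c + ")" for c in picks]
        out.append(own_conds + negs)
    return out

# rule_2 from Figure 2: two children, each with one condition
rules = flat_rules(["sky=SUNNY"],
                   [["ultraViolet=VERY HIGH"], ["wave=LOW"]])
print(len(rules))  # 1 * 1 = 1 flat rule
print(rules[0])    # sky=SUNNY plus both negated child conditions

# a rule with three children holding 1, 1 and 2 conditions: 1*1*2 = 2
print(len(flat_rules(["true"], [["a"], ["b"], ["c1", "c2"]])))  # 2
```
      </preformat>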
      <p>
        One of the advantages of learning from rules is that we can
assume that irrelevant attributes have already been discarded.
This is significant as in our application domain there are
hundreds of attributes. Gaines [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] argues that a rule in a
knowledge base is worth many cases for learning. We adopt
the same viewpoint and note that although there is research on
combining KA and machine learning and using background
knowledge in machine learning, there seems little research so
far in learning from a KBS rather than from cases [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
The immediate precursor of this work [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] applied formal
concept analysis to ontology discovery in knowledge bases.
This provided a useful way to explore concepts in a
knowledge base, but because of the complexity of the
conceptual lattice it was more useful to consider sub-sections
of the lattice, selected by the user or by a simple nearest
neighbour algorithm [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. The critical difference from the
work here is that in formal concept analysis the difference
between different concepts is emphasised. Here we attempt to
combine all the concepts that represent a class and consider
relations between classes rather than between concepts.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. The class relations model</title>
      <p>
        The class relations model shows the relations subsumption,
mutual-exclusivity and similarity between classes and the
degree of confidence that the particular relation exists [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
We note that the measures we derive are strictly heuristic.
Other, perhaps better-founded, measures may
be possible. The results here represent simply a first attempt
at carrying out this type of analysis. The second point to note
is that these relations have to deal with non-boolean as well as
boolean data.
      </p>
      <p>
        Let X be a class in the MCRDR framework. {X0…Xm} is the
set of rule paths which have class X as their conclusion. {Xi0 …
Xin} is the set of conditions for rule path Xi, where i = 0…m; n is the
number of distinct conditions in the rule path and m is the number of
rule paths for class X. In the MCRDR framework the class is
given as a disjunction of rule paths [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>The similarity measure</title>
        <p>Then
class X = ∨_{i=0..m} ( ∧_{j=0..n} Xij )
That is, Xij stands for an individual condition in a rule path for
the class X. If X is a class and Y is also a class, we can
define a similarity measure as follows:
Sim(Xij, Yij) = 0 if Xij and Yij are different
Sim(Xij, Yij) = 1 if Xij and Yij are the same
If α is the set of distinct attributes in rule path Xi, β is the set of
distinct attributes in rule path Yi, n = |α ∪ β|, and (Xij, Yij) are
pairs over the same attribute, possibly with different conditions
(e.g. Age&gt;50, Age=60), then we can define:
Similar(Xi, Yi) = ( Σ_{j=0..n} Sim(Xij, Yij) / |α ∪ β| ) * Q(Xi) * Q(Yi)
where Q(Xi) = number of cases for Xi / (number of cases for
descendants of Xi + number of cases for Xi). Function Q
measures the quality of a rule in RDR.</p>
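        <p>A sketch of the Similar() measure under this formula
(conditions are represented as attribute-value pairs, an assumption
for illustration; the rule qualities default to 1.0):</p>
        <preformat>
```python
# Sketch of the similarity measure between two rule paths: the count
# of identical conditions over shared attributes, divided by the size
# of the attribute union, scaled by the two rule qualities.
def similar(x_conds, y_conds, qx=1.0, qy=1.0):
    attrs = set(x_conds) | set(y_conds)
    same = sum(1 for a in attrs
               if a in x_conds and a in y_conds
               and x_conds[a] == y_conds[a])
    return same / len(attrs) * qx * qy

# rulepath-2 and rulepath-6 from Figure 2 (qualities taken as 1.0);
# "GT_40" stands for the condition "wind > 40 km/h"
rp2 = {"sky": "SUNNY"}
rp6 = {"wind": "GT_40", "wave": "LOW", "sky": "SUNNY"}
print(similar(rp2, rp6))  # 1/3: one matching condition, three attributes
```
        </preformat>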
        <p>If the quality of a rule is close to 100%, nearly
all cases reaching the rule are processed by the rule itself,
and its child rules are rare exceptions. On the other hand, if
the quality of a rule is 10%, 90% of the cases that satisfy
the rule are passed on to its children. That is, the rule is too
general and can be regarded as not being a particularly good
rule, and so it should not be given strong consideration in
developing the relations in the system.</p>
      </sec>
      <sec id="sec-3-2">
        <title>For example</title>
        <p>Similar(rulepath-2, rulepath-6)
= 1/3 * Q(rulepath-2) * Q(rulepath-6),
Similar(rulepath-8, rulepath-9)
= 1/2 * Q(rulepath-8) * Q(rulepath-9),
Similar(rulepath-9, rulepath-10)
= 2/3 * Q(rulepath-9) * Q(rulepath-10).</p>
        <p>Function Similar() measures the similarity between two nodes
(each node contains a rule path).</p>
        <p>[Figures 4-6. Bipartite graphs between the rule-path nodes of
Class X (nodes 1, 2 and 3) and the rule-path nodes of Class Y
(nodes 4 and 5); the edge values v1…v6 are the pairwise relation
measures between the connected nodes.]</p>
        <p>If α is the set of distinct attributes in rule path Xi and β is the set
of distinct attributes in rule path Yi, then we can define:
Subsume(Xi, Yi) = (number of conditions in Xi that are the same as
or more general than the corresponding conditions in Yi) / |α ∪ β|
* Q(Xi) * Q(Yi)
where Q(Xi) = number of cases for Xi / (number of cases for
descendants of Xi + number of cases for Xi).
Function Subsume() measures the degree of confidence that the
first rule path subsumes the second rule path. For example,
subsume(rulepath-2, rulepath-4) = 2/2 *
Q(rulepath-2) * Q(rulepath-4).</p>
        <p>Conditions in rulepath-2 are sky=SUNNY;
conditions in rulepath-4 are sky=SUNNY, wave=LOW;
and |α ∪ β| = 2, that is α ∪ β = {sky, wave}.</p>
        <p>Since rulepath-2 does not have the attribute wave, we
consider rulepath-2 to be more general than rulepath-4
with respect to the attribute wave. Therefore there are 2 conditions in
rulepath-2 which are the same as, or more general than, those in
rulepath-4. Similarly,
subsume(rulepath-2, rulepath-5) = 2/3 *
Q(rulepath-2) * Q(rulepath-5).</p>
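        <p>The Subsume() measure for this example can be sketched as
follows (a simplification we adopt for illustration: a missing
attribute counts as more general, and only identical conditions count
as equal, whereas the full method also compares ranges such as the
LDL cut-offs above):</p>
        <preformat>
```python
# Sketch of the rule-path subsumption measure. For each attribute in
# the union, rule path X counts as "same or more general" when it
# either places no condition on that attribute or has the identical
# condition (range comparisons are omitted in this sketch).
def subsume(x_conds, y_conds, qx=1.0, qy=1.0):
    attrs = set(x_conds) | set(y_conds)
    general = sum(1 for a in attrs
                  if a not in x_conds
                  or x_conds.get(a) == y_conds.get(a))
    return general / len(attrs) * qx * qy

# rule paths from Figure 2 (qualities taken as 1.0);
# "LE_30" stands for the condition on wind in rule 5
rp2 = {"sky": "SUNNY"}
rp4 = {"sky": "SUNNY", "wave": "LOW"}
rp5 = {"ultraViolet": "MEDIUM", "wind": "LE_30"}
print(subsume(rp2, rp4))  # 2/2 = 1.0
print(subsume(rp2, rp5))  # 2/3
```
        </preformat>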
        <p>We compute ClassSubsume(X, Y) = (v1 + v3) / 2. We choose
the edges such that all nodes of class Y are covered by at least one
edge and the sum of the edge values (e.g. v1 + v3) is maximal. For example,
ClassSubsume(Go swimming, Swimming in the beach) =
(1 + 0.667) / 2. We assume the quality of all rule paths is 1.0 to
simplify this example.</p>
        <p>We can define a mutually exclusive measure as follows with
Xij and Yij standing for individual conditions in rule paths
for the relevant classes as above.</p>
        <p>Mut (Xij ,Yij) = 0 if Xij and Yij are not mutually exclusive.
Mut (Xij ,Yij) = 1 if Xij and Yij are mutually exclusive (for
example A&gt;5 and A&lt;2)
MutualEx(Xi, Yi) = 1 , if at least one of Mut(Xij ,Yij)=1
MutualEx(Xi, Yi) = 1 , if rule Xi and rule Yi are parent and
child.</p>
        <p>MutualEx(Xi, Yi) = 0, otherwise.
Function MutualEx() measures the degree of confidence that the
first rule path and the second rule path are mutually exclusive.
Since the quality of a rule does not affect mutual exclusivity as
much as it affects similarity and subsumption/intersection, we do
not apply the quality measure to mutual exclusivity. For example,
MutualEx(rulepath-2, rulepath-10) = 1.0.
Figure 6 suggests how we can find a mutual exclusivity
measure between 2 classes. It shows that Class X is a
disjunction of nodes 1,2 and 3 and Class Y is a disjunction of
nodes 4 and 5. We compute ClassMutualEx(X,Y) = (v1 + v2
+v3 +v4 + v5 + v6) / 6. X and Y are mutually exclusive if
and only if all nodes of X and Y are mutually exclusive with
respect to each other (see Figure 6). Therefore ClassMutualEx
(Go swimming, Play chess) = 1/6, since Go
Swimming has 2 rulepaths, Play chess has 3 rulepaths
and all MutualEx() between those rulepaths are each 0.0,
except for MutualEx(rulepath-2, rulepath-10) = 1.0.</p>
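        <p>The class-level aggregation can be sketched as follows (rule-path
conditions are simplified to the conditions that distinguish each
rule, an assumption for illustration; with these, only the sky clash
between rulepath-2 and rulepath-10 registers, giving 1/6 as above):</p>
        <preformat>
```python
# Sketch of the class-level mutual-exclusivity measure: average the
# pairwise MutualEx over every (rule path of X, rule path of Y) pair.
# Two categorical conditions are taken as mutually exclusive when
# they fix the same attribute to different values (numeric ranges
# would need an interval test instead).
def mutual_ex(x_conds, y_conds):
    clash = any(a in y_conds and x_conds[a] != y_conds[a]
                for a in x_conds)
    return 1.0 if clash else 0.0

def class_mutual_ex(x_paths, y_paths):
    pairs = [(x, y) for x in x_paths for y in y_paths]
    return sum(mutual_ex(x, y) for x, y in pairs) / len(pairs)

# simplified rule paths for the two classes in Figure 2
go_swimming = [{"sky": "SUNNY"}, {"ultraViolet": "LOW"}]
play_chess = [{"sky": "SUNNY", "wave": "LOW", "wind": "GT_40"},
              {"wind": "GT_30"},
              {"ultraViolet": "LOW", "sky": "RAINY"}]
print(class_mutual_ex(go_swimming, play_chess))  # 1/6
```
        </preformat>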
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental results</title>
      <p>Hormone knowledge base: pairs with similarity-value &gt; 0.66
[5] interpretation [Satisfactory prolactin level.]
with [12] Male interpretation [Satisfactory prolactin level.]
[14] interpretation [Consistent with premature ovarian failure.]
with [28] interpretation [Consistent with premature ovarian failure.
Suggest repeat FSH and oestradiol in 2-3 months to confirm.]</p>
      <p>Hormone knowledge base: pairs with subsumption-value = 1
[6] interpretation [Elevated prolactin persists. Suggest TSH]
with [33] interpretation [Elevated prolactin persists. Primary
hypothyroidism has been excluded. IV sampling through an
in-dwelling cannula can help exclude stress-related elevations of
prolactin. Pituitary imaging may be required.]
and [36] interpretation [Elevated prolactin persists. Await TSH.]
[7] interpretation [Raised prolactin in females is commonly due to
medication, stress or lactation. Suggest TSH to exclude
hypothyroidism and repeat prolactin after 30 minutes rest.]
with [23] Male interpretation [Raised prolactin in men is commonly
due to medications and occasionally hypothyroidism. Suggest TSH and
repeat prolactin after 30 minutes rest.]
and [30] interpretation [Raised prolactin in females is commonly due
to medications, stress or lactation. Hypothyroidism has been
excluded. Suggest review medications and repeat prolactin after 30
minutes rest.]</p>
      <p>Hormone knowledge base: pairs with mutual-exclusivity-value = 1
[4] interpretation [Consistent with perimenopause]
with [9] interpretation [No evidence of perimenopause, unless
patient is on oestrogen therapy.]
[5] interpretation [Satisfactory prolactin level.]
with [6] interpretation [Elevated prolactin persists. Suggest TSH]</p>
      <p>We therefore extract specific patterns which seem likely to be
components of a meaningful taxonomy. For example:</p>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>It is beyond the scope of this paper (and the authors) to
provide a detailed medical analysis of the ontologies produced
by these techniques; however, it is worth noting some lay
observations.</p>
      <p>The mutually-exclusive classes in Table 1 seem reasonable.
However, we also have some cases where very similar
conclusions are identified as mutually exclusive. This occurs
when experts make up rules that include different values for
the same attribute. For example some mutually exclusive
conclusions seem to give the same clinical advice but
specifically refer to a patient being male or female. This may
or may not be of clinical importance, but there is an obvious
opportunity to have a further superclass.</p>
      <p>The most interesting issues arise with the nature of
subsumption. A superclass subclass relation may arise where
one rule specifies a value for an attribute and another does
not. For example a key factor in comments 6, 7, 23, 30, 33
and 36 is whether or not a TSH measurement should be
ordered and whether or not primary hypothyroidism has been
excluded. TSH results are important in the diagnosis of
primary hypothyroidism. The generic comment suggesting a
TSH measurement is given when there is no TSH result
available. The comment also suggests the clinical cause of
the high prolactin level remains unknown. When there is a
TSH result available this provides some evidence to confirm
or exclude one of the causes of the high prolactin. These
relations appear to us to be ontologically reasonable. However,
the wording of the actual comments does not readily indicate
such relationships. It would be interesting to know how the
expert would react and how comments might be worded if this
ontological information was available as rules were being
developed.</p>
      <p>A more general example of this pattern is the comment
[0]: “patient has ovulated”, which is at the top level of the
taxonomy in Fig 7. This subsumes a whole range of more
specific comments related to other attributes. Again it would
be interesting to see if this taxonomic information influenced
the expert’s wording.</p>
      <sec id="sec-5-1">
        <title>5.1. Further work</title>
        <p>At this stage we have only conducted a preliminary
examination of some of the relations in the knowledge bases.
A detailed examination by domain experts will be required to
establish the utility of this approach. This examination may
suggest that measures other than those proposed here may be
useful, and patterns other than those used in Fig 7 may be of
interest for browsing the relations. At this stage we make no
claim about the particular heuristic measures we have used,
except that, broadly, measures of this kind seem useful in
discovering and exploring implicit ontologies.</p>
        <p>We are also investigating the possibility of learning
interesting patterns from a coloured graph of class relations.
The frequency of isomorphic sub-graphs may be interesting.
If the graph is large, then we could scale up the algorithm by
applying data mining techniques (e.g. the Apriori algorithm
[
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]).</p>
        <p>Finally, the present technique only considers the conditions in
rule paths and the proportion of cases handled by rules in
determining the relations. It does not consider any other
information about the classes themselves. The refinement
structure of RDR does not indicate any ontological
refinement; however, it does indicate that an expert thought a
conclusion was inappropriate and so should be replaced by
another. We are looking at the possible relations between
conclusions that this would allow, which could then be related
to the idea of relations presented here.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The work we have presented is an attempt to develop
techniques to discover the ontologies implicit in knowledge
bases. We believe it will be of increasing importance to carry
out this particular kind of knowledge discovery as larger and
larger knowledge bases come into use and we seek to exploit
the knowledge in these knowledge bases in different ways. We
do not make any particular claim for the techniques we have
developed to date, except that they suggest that such ontology
discovery is possible. The key idea in the techniques we have
developed is that it seems reasonable to use heuristic
quantitative measures to group classes and class relations.
This then enables possible ontologies to be explored on a
reasonable scale.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Jansen</surname>
          </string-name>
          .,
          <article-title>A philosophical basis for knowledge acquisition</article-title>
          .
          <source>Knowledge Acquisition</source>
          ,
          <year>1990</year>
          . 2: p.
          <fpage>241</fpage>
          -
          <lpage>257</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Schreiber</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Wielinga</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Breuker</surname>
          </string-name>
          ,
          <article-title>KADS A Principled Approach to Knowledge-Based System Development</article-title>
          .
          <year>1993</year>
          : Academic Press.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Grosso</surname>
            ,
            <given-names>W.E.</given-names>
          </string-name>
          , et al.
          <article-title>Knowledge Modeling at the Millennium (The Design and Evolution of Protégé-2000)</article-title>
          .
          <source>in Twelfth Workshop on Knowledge Acquisition, Modeling and Management</source>
          , Voyager Inn, Banff, Alberta, Canada.
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Richards</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Compton</surname>
          </string-name>
          .
          <article-title>Knowledge Acquisition First, Modelling Later</article-title>
          .
          <source>in 10th European Knowledge Acquisition Workshop</source>
          .
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Edwards</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , et al.,
          <article-title>PEIRS: a pathologist maintained expert system for the interpretation of chemical pathology reports</article-title>
          .
          <source>Pathology</source>
          ,
          <year>1993</year>
          .
          <volume>25</volume>
          : p.
          <fpage>27</fpage>
          -
          <lpage>34</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Catlett</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          .
          <article-title>Ripple Down Rules as a Mediating Representation in Interactive Induction</article-title>
          .
          <source>in Proceedings of the Second Japanese Knowledge Acquisition for Knowledge Based Systems Workshop</source>
          .
          <year>1992</year>
          . Kobe, Japan.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Gaines</surname>
            ,
            <given-names>B.R.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>P.J.</given-names>
            <surname>Compton</surname>
          </string-name>
          .
          <article-title>Induction of Ripple Down Rules</article-title>
          .
          <source>in Fifth Australian Conference on Artificial Intelligence</source>
          .
          <year>1992</year>
          . Hobart.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Kivinen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Mannila</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Ukkonen</surname>
          </string-name>
          .
          <article-title>Learning Rules with Local Exceptions</article-title>
          .
          <source>in European Conference on Computational Learning Theory</source>
          .
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Scheffer</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          .
          <article-title>Algebraic foundations and improved methods of induction of ripple-down rules</article-title>
          .
          <source>in 2nd Pacific Rim Knowledge Acquisition Workshop</source>
          .
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Siromoney</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Siromoney</surname>
          </string-name>
          ,
          <article-title>Variations and Local Exceptions in Inductive Logic Programming</article-title>
          ,
          <source>in Machine Intelligence - Applied Machine Intelligence</source>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Furukawa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Michie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Muggleton</surname>
          </string-name>
          , Editors.
          <year>1993</year>
          . p.
          <fpage>213</fpage>
          -
          <lpage>234</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Preston</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Kang</surname>
          </string-name>
          .
          <article-title>The Use of Simulated Experts in Evaluating Knowledge Acquisition</article-title>
          .
          <source>in 9th AAAI-sponsored Banff Knowledge Acquisition for Knowledge-Based Systems Workshop</source>
          .
          <year>1995</year>
          . Canada.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Compton</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Preston</surname>
          </string-name>
          .
          <article-title>Simulated Expert Evaluation of Multiple Classification Ripple Down Rules</article-title>
          .
          <source>in 11th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop</source>
          .
          <year>1998</year>
          . Banff: SRDG Publications, University of Calgary.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Suryanto</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Richards</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Compton</surname>
          </string-name>
          .
          <article-title>The automatic compression of Multiple Classification Ripple Down Rule Knowledge Based Systems: Preliminary Experiments</article-title>
          .
          <source>in Knowledge-Based Intelligent Information Engineering Systems</source>
          .
          <year>1999</year>
          . Adelaide, South Australia: IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Gaines</surname>
            ,
            <given-names>B.R.</given-names>
          </string-name>
          ,
          <source>Transforming Rules and Trees into Comprehensible Knowledge Structures</source>
          .
          <year>1995</year>
          : AAAI/MIT Press.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>Ramadan</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , et al.
          <article-title>From Multiple Classification RDR to Configuration RDR</article-title>
          .
          <source>in 11th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop</source>
          .
          <year>1998</year>
          . Canada.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Beydoun</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          .
          <article-title>NRDR for the Acquisition of Search Knowledge</article-title>
          .
          <source>in 10th Australian Conference on Artificial Intelligence</source>
          .
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Kang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , et al.,
          <article-title>A help desk system with intelligent interface</article-title>
          .
          <source>Applied Artificial Intelligence</source>
          ,
          <year>1997</year>
          .
          <volume>11</volume>
          (
          <issue>7-8</issue>
          ): p.
          <fpage>611</fpage>
          -
          <lpage>631</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <surname>Compton</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Richards</surname>
          </string-name>
          .
          <article-title>Extending Ripple Down Rules</article-title>
          .
          <source>in Australian Knowledge Acquisition Workshop 99</source>
          .
          <year>1999</year>
          . UNSW.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Gaines</surname>
            ,
            <given-names>B. R.</given-names>
          </string-name>
          ,
          <article-title>An Ounce of Knowledge is Worth a Ton of Data: Quantitative Studies of the Trade-off between Expertise and Data based on Statistically Well-founded Empirical Induction</article-title>
          . San Mateo, California: Morgan Kaufmann,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Richards</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Compton</surname>
          </string-name>
          .
          <article-title>Uncovering the conceptual models in RDR KBS</article-title>
          .
          <source>in International Conference on Conceptual Structures ICCS'97</source>
          .
          <year>1997</year>
          . Seattle: Springer Verlag.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Richards</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <source>The Reuse of Knowledge in Ripple Down Rule Knowledge Based Systems, in Artificial Intelligence Department</source>
          .
          <year>1998</year>
          , University of New South Wales: Sydney. p.
          <fpage>335</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Suryanto</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Compton</surname>
          </string-name>
          .
          <article-title>Discovery of class relations in exception structured knowledge bases</article-title>
          .
          <source>in The International Conference on Conceptual Structures</source>
          .
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Perez</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          <article-title>Evaluation of Taxonomic Knowledge in Ontologies and Knowledge Bases</article-title>
          .
          <source>in KAW'99</source>
          , Twelfth Workshop on Knowledge Acquisition, Modeling and Management.
          <year>1999</year>
          . Voyager Inn, Banff, Alberta, Canada.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <string-name>
            <surname>Agrawal</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , et al.,
          <article-title>Fast Discovery of Association Rules</article-title>
          , in
          <source>Advances in Knowledge Discovery and Data Mining</source>
          , ed.
          <string-name>
            <given-names>U.M.</given-names>
            <surname>Fayyad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Piatetsky-Shapiro</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          .
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>