Diagrammatic Ontology Engineering
                                        Gem STAPLETON a,1
                                      a Visual Modelling Group

                                       University of Brighton,
                                           Brighton, UK

             Abstract. Ontology engineering involves defining axioms to capture required con-
             straints when modelling a domain of interest. Ontologies arise in many areas, with
             potentially a diverse range of end users involved in their creation. This leads to
             the requirement for accessible approaches to ontology engineering, as some stake-
             holders need not be fluent or trained in symbolic notations such as OWL. This pa-
             per summarises concept diagrams and property diagrams which are designed to be
             an accessible alternative to OWL. The paper reports on two empirical studies, the
             first of which focuses on how to choose effective concept diagrams for express-
             ing simple OWL axioms. The second study compares these effective diagrams to
             both OWL and DL, demonstrating that they can bring about significant improve-
             ments in task performance for novice users. These results support the incorporation
             of concept diagrams into ontology engineering tools, such as Protégé or WebPro-
             tege. This is an exciting prospect, allowing more stakeholders to fully engage with
             the ontology engineering process, leading to more efficiently produced and robust
             ontologies in the future.

             Keywords. ontology engineering, visualization, concept diagrams


1. Introduction

Ontology engineers are often faced with the challenging problem of axiomatising com-
plex systems using formal, logical notations such as OWL [1]. To support the use of
OWL in ontology engineering, various tools have been developed with Protégé being a
prominent example [2]. However, the symbolic nature of OWL can pose a barrier
to entry for those who are not mathematically trained. This barrier is potentially
problematic: often, producing a set of accurate axioms requires input from a variety of
people, reflecting the need for ontologies in diverse domains such as privacy engineer-
ing and biomedical sciences. In contrast to symbolic approaches, visual (diagrammatic)
approaches are often seen as user-friendly notations that are accessible to a broad
user base. A major goal is to make ontology engineering more broadly accessible by
providing a fully formal, empirically supported, diagrammatic logic. This paper briefly
summarises concept diagrams [7] and the newly designed property diagrams, along with
early-stage empirical studies to demonstrate their efficacy [3,4]. It goes on to summarise
some key challenges that must be addressed to make concept diagrams and property
diagrams widely usable in practice.
   1 Corresponding Author: Gem Stapleton, University of Brighton, Brighton, UK; E-mail:
g.e.stapleton@brighton.ac.uk.

2. Concept Diagrams and Property Diagrams

Concept diagrams and property diagrams are formed from Euler diagrams augmented
with additional syntax to give a highly expressive logic. Concept diagrams are geared
towards making OWL 2.0 class expression axioms, although they are also capable of
making assertions about properties such as their domain and range. Likewise, property
diagrams are aimed at making OWL 2.0 object property expression axioms, primarily
to describe property hierarchies and disjointness. For both concept diagrams and prop-
erty diagrams there is not an exact equivalence in expressive power with the implied
OWL 2.0 fragment: each diagrammatic notation is able to express information that is not
expressible in the implied OWL fragment and vice versa.
     A full formalization of concept diagrams can be found in [11]. A brief introduction
to their core syntax and semantics is given here via examples in which the distinction be-
tween syntax and semantics is sometimes blurred. As with Euler diagrams, closed curves
represent sets (called concepts in description logic and classes in OWL). Properties or
roles (i.e. binary relations) are represented by arrows. Individuals are represented by dots
or, more generally, trees.


                                 Figure 1. Concept diagrams.

    Suppose that we have some information about the individual Helen that we wish to
axiomatize:
    1. Helen is a person.
    2. Helen is married to the person Poly, and nothing else.
    3. Helen owns exactly two pets, both of which are dogs.
    4. One of Helen’s pets is a terrier called Lily.
The left-hand diagram in figure 1 expresses this information, along with other things. We
start by noting that there are three classes, Person, Dog and Terrier. These three classes
are represented using the three labelled closed curves. Notice that the curves for Per-
son and Dog do not overlap; this expresses that Person and Dog are disjoint classes. In
addition, the enclosure of Terrier by Dog asserts subsumption in the obvious way. We
also need to represent the individuals Helen, Poly and Lily. Each of them is represented
by a labelled dot in the desired location; for example, Lily is located inside the curve
labelled Terrier. What remains is to represent the information about properties. Focusing
first on the property married, the arrow in the diagram connecting Helen to Poly asserts
that Helen is married to Poly and only Poly. The other arrow sourced on Helen targets an
unlabelled curve, analogous to an unnamed class. This curve represents the set of things
that are pets owned by Helen and, since it is drawn inside the Dog curve, these pets are
all dogs. The curve includes Lily and one other, unnamed, individual represented by the
two-node tree (so Helen owns two pets, including Lily). We do not know whether the
unnamed individual is a terrier. This uncertainty is captured by use of two nodes, one
inside both the Dog and Terrier curves and the other inside the Dog curve but outside the
Terrier curve. Shading is used to express that the only dogs owned by Helen are repre-
sented by the trees. In general, in a shaded region, all individuals must be represented by
nodes or trees.
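
The set-theoretic reading of the left-hand diagram can be checked mechanically. The following is a minimal sketch in Python, not part of the formalization in [11]: it hard-codes one finite interpretation and tests the constraints described above. The property name owns, the individual name Rex (standing in for the unnamed second pet) and the helper image are assumptions introduced purely for illustration.

```python
# One hypothetical interpretation; "Rex" stands in for the unnamed second pet.
Person = {"Helen", "Poly"}
Dog = {"Lily", "Rex"}
Terrier = {"Lily"}

# Properties are sets of (source, target) pairs; "owns" is an assumed name.
married = {("Helen", "Poly")}
owns = {("Helen", "Lily"), ("Helen", "Rex")}


def image(prop, individual):
    """Return the set of things that `individual` is related to under `prop`."""
    return {target for (source, target) in prop if source == individual}


def satisfies_diagram():
    """Check the constraints read off the left-hand diagram of figure 1."""
    return all([
        Person.isdisjoint(Dog),                # non-overlapping curves
        Terrier <= Dog,                        # Terrier drawn inside Dog
        "Helen" in Person and "Poly" in Person,
        "Lily" in Terrier,
        image(married, "Helen") == {"Poly"},   # solid arrow: Poly and only Poly
        image(owns, "Helen") <= Dog,           # target curve lies inside Dog
        len(image(owns, "Helen")) == 2,        # shading: exactly the two trees
        "Lily" in image(owns, "Helen"),
    ])


print(satisfies_diagram())  # True
```

Adding Rex to Terrier would leave the check satisfied, which is the uncertainty that the two-node tree is intended to express.
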
     As well as using solid arrows to represent restrictions on properties, concept dia-
grams also use dashed arrows. These dashed arrows are used when we do not wish to
express complete information about the image of a property under the domain restriction
imposed by the arrow’s source. For example, we may wish to express that Helen loves
some person, without identifying the set of things that Helen loves (i.e. the image of the
property loves when its domain is restricted to Helen). A concept diagram expressing this
is in the middle diagram of figure 1. The arrow connects diagrammatic syntax placed in
different boxes to ensure that we have not asserted that the person Helen loves is different
from Helen.
     The right-hand diagram of figure 1 expresses that every book is read by only people.
The quantification expression written outside of the rectangles tells us that the diagram
is making an assertion about all books. Lastly, we note that concept diagrams can also
make assertions involving inverse relations, by annotating arrow labels using the symbol
−, and negation, by labelling a bounding box with ‘Not’.
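
For readers more familiar with description logic, the middle and right-hand diagrams of figure 1 correspond, under the reading given above, to axioms of roughly the following form. This is a hedged sketch rather than a translation taken from [11]; the role name isReadBy is an assumption, since the arrow label in the figure is not reproduced in the text.

```latex
\begin{align*}
\{\mathit{Helen}\} &\sqsubseteq \exists\, \mathit{loves}.\mathit{Person}
  && \text{(Helen loves some person)}\\
\mathit{Book} &\sqsubseteq \forall\, \mathit{isReadBy}.\mathit{Person}
  && \text{(every book is read by only people)}
\end{align*}
```
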


              Figure 2. Expressing property subsumption, disjointness and equivalence.


     The property diagram in figure 2 illustrates how to express property subsumption,
disjointness and equivalence diagrammatically. A key difference, compared to earlier
examples, is the use of ∗ as an arrow source. This syntactic device acts as a universal
quantifier over elements of the domain. For instance, this property diagram tells us that
for each thing t, the set of things that t assesses is a subset of the set of things that t
teaches. Therefore, the property assesses is subsumed by teaches. Two of the arrows,
labelled assesses and grades, both target the same curve, thus asserting that assesses and
grades are equivalent. Through the use of curves with disjoint interiors, the diagram also
tells us that teaches and owns are disjoint properties. In addition, we can also see that
assesses and grades are disjoint from owns.
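
The reading of ∗ as a universal quantifier suggests a direct way of checking these relationships over a finite interpretation. The sketch below is illustrative only and is not drawn from the formal semantics of property diagrams: properties are modelled as sets of pairs, and the individuals and the helper names image, subsumed_by, equivalent and disjoint are all hypothetical.

```python
def image(prop, thing):
    """Set of things that `thing` is related to under the property `prop`."""
    return {target for (source, target) in prop if source == thing}


def subsumed_by(p, q, domain):
    """For each t, the image of t under p lies within its image under q."""
    return all(image(p, t) <= image(q, t) for t in domain)


def equivalent(p, q, domain):
    """Each t has the same image under both properties."""
    return all(image(p, t) == image(q, t) for t in domain)


def disjoint(p, q, domain):
    """For each t, the images under p and q share no element."""
    return all(image(p, t).isdisjoint(image(q, t)) for t in domain)


# Hypothetical finite interpretation (all individuals are invented).
domain = {"ann", "bob", "maths", "logic", "car"}
teaches = {("ann", "maths"), ("ann", "logic"), ("bob", "logic")}
assesses = {("ann", "maths"), ("bob", "logic")}
grades = set(assesses)
owns = {("ann", "car")}

print(subsumed_by(assesses, teaches, domain))  # True
print(equivalent(assesses, grades, domain))    # True
print(disjoint(teaches, owns, domain))         # True
print(subsumed_by(teaches, assesses, domain))  # False
```

The last line shows that subsumption is directional: in this interpretation ann teaches logic without assessing it.
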
     Figure 3 extends figure 2, expressing that the property researches is subsumed by
interests. By exploiting the additional rectangle, we have not made any assertion about
the relationship between these two properties (researches and interests) and teaches, as-
sesses, grades or owns. In addition, the inclusion of the class Topic indicates that the
range of both researches and interests is Topic. Domain information can be expressed by
property diagrams, using inverses.
                          Figure 3. Using multiple boundary rectangles.


3. Choosing Effective Concept Diagrams for Common Axioms


Common ontology axioms include class subsumption and disjointness constraints (see
figures 4 and 5), along with All Values From, Some Values From, Domain, and Range
restrictions. As with any logic, concept diagrams offer a variety of ways to assert these
kinds of relationships. To best support ontology engineers, it is important to understand
the relative impact of different choices of axiomatization on task performance. To this
end, the authors of [3] set out to identify features of concept diagrams that support better
user task performance, measuring time and accuracy. They proposed three different kinds
of diagrammatic patterns for defining axioms:

    1. purely diagrammatic (called unquantified in [3]), in which no explicit use of
       logical operators (e.g. Not) or quantifiers (e.g. For all) is permitted,
    2. quantified diagrams using solid arrows, and
    3. quantified diagrams using dashed arrows.

Examples of the purely diagrammatic versions of the patterns are shown in figures 4 to 9.
     The three types of pattern were evaluated by conducting an empirical study, col-
lecting performance data using multiple choice questions. The error rates were as fol-
lows: purely diagrammatic, 23.44%; quantified with solid arrow, 27.81%; quantified with
dashed arrow, 79.62%. The mean times taken, in seconds, to provide a correct answer
were: purely diagrammatic, 18.40; quantified with solid arrow, 20.89; quantified with
dashed arrow, 29.05. The statistical analysis of their data indicated that avoiding explicit
quantification, and representing the information purely diagrammatically, best supports
task performance. Thus, this study guides ontology engineers who are using concept
diagrams towards avoiding explicit quantification where possible.
       Figure 4. Subsumption.        Figure 5. Disjointness.     Figure 6. All values from.


 Figure 7. Some values from.           Figure 8. Domain.                 Figure 9. Range.
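
For reference, the six axiom types named in figures 4 to 9 have the following standard description logic forms, where C and D are arbitrary concepts and R is a role. These are the usual textbook renderings, given as an aid to the reader; they are not reproduced from the study materials in [3].

```latex
\begin{align*}
\text{Subsumption:}      && C &\sqsubseteq D\\
\text{Disjointness:}     && C \sqcap D &\sqsubseteq \bot\\
\text{All values from:}  && C &\sqsubseteq \forall R.D\\
\text{Some values from:} && C &\sqsubseteq \exists R.D\\
\text{Domain:}           && \exists R.\top &\sqsubseteq C\\
\text{Range:}            && \top &\sqsubseteq \forall R.C
\end{align*}
```
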


4. Comparing with OWL and DL: A Test of Efficacy

Having identified effective diagrammatic patterns for common ontology axioms, it was
felt important to determine whether there really is an advantage in using diagrammatic
patterns over standard notations in ontology engineering. An empirical evaluation com-
pared the six diagrammatic patterns illustrated in figures 4 to 9 with equivalent axioms
expressed in OWL (as displayed in the stylized form of the Protégé version 4.3 interface)
and description logic [4]. Participants were asked to select the meaning of the diagram
or statement from a choice of four options. Concept diagrams were found to support sig-
nificantly better task performance than both OWL and DL. As an indication of the scale
of benefit, we include here the error rates and mean times taken to provide a correct an-
swer to questions. Regarding errors, participants exposed to concept diagrams returned
an error rate of 7.59%, which increased to 27.03% when using OWL and 33.58% when
using description logic. The mean times taken were 13.51 seconds for concept diagrams,
17.25 seconds for OWL and 22.99 seconds for description logic. These data suggest that,
as well as providing a statistically significant benefit, the scale of this benefit is likely to
be of real practical relevance.
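
To make the scale of benefit concrete, the reported figures can be expressed as ratios against concept diagrams. The short calculation below uses only the error rates and mean times quoted in this section; the function name relative_to_diagrams is hypothetical, and the ratios are simple descriptive comparisons, not a substitute for the statistical analysis in [4].

```python
# Figures reported in this section: error rate (%) and mean time (s).
results = {
    "concept diagrams": (7.59, 13.51),
    "OWL": (27.03, 17.25),
    "DL": (33.58, 22.99),
}

base_error, base_time = results["concept diagrams"]


def relative_to_diagrams(notation):
    """Return (error ratio, time ratio) of `notation` against concept diagrams."""
    error, time = results[notation]
    return error / base_error, time / base_time


for notation in ("OWL", "DL"):
    error_ratio, time_ratio = relative_to_diagrams(notation)
    print(f"{notation}: {error_ratio:.2f}x the errors, {time_ratio:.2f}x the time")
# OWL: 3.56x the errors, 1.28x the time
# DL: 4.42x the errors, 1.70x the time
```
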
      This second study raises an obvious question: why are diagrams more effective? The
theory of well-matchedness, introduced by Gurr, can help to explain this phenomenon: a
notation is well-matched to its meaning if its semantic relationships are matched by, or
mirrored by, its syntactic relationships [5]. In the case of concept diagrams, specifically
the underlying Euler diagrams, the use of spatial relationships between closed curves is
well-matched. For instance, to express subsumption, that is, C2 is a subset of (i.e.
contained by) C1, the curve C2 is contained by the curve C1. Here, as with disjointness, the
semantics are clearly mirrored by the diagram used for the axiom in question. Generally
speaking, diagrams are often well-matched, unlike symbolic and textual notations. This
property of diagrams is one reason why it is often believed that they lead to improved
task performance.
      The concept diagrams for subsumption, disjointness, all values from and range ax-
ioms are particularly well-matched to meaning. In the study, participants performed well
with them (the raw data can be found at
https://sites.google.com/site/eisamalharbi/understandingontologies). The highest error
rate for these four axiom types was 6.67% when using concept diagrams, with just one
error for each of all values from and range; by contrast, the lowest error rates for OWL
and DL were 14.44% and 22.22%, respectively, both for subsumption.
     However, it could be argued that the other two axiom types, some values from and
domain, do not have such well-matched diagrams. In particular, the cognitively more
difficult inverse properties used in these axioms are not well-matched to meaning. The study
yielded, for these two axiom types, the highest error rates for concept diagrams (21.11%
and 8.88% respectively). In the case of some values from, though, it is notable that both
OWL and DL have particularly high error rates too (40.00% and 38.88%), indicating that
this kind of axiom is cognitively difficult. It is certainly possible, given these data, that
inverse properties bring with them cognitive difficulty, but the main burden may lie in
understanding some values from, an insight supported by other research [6,8,9,10,13].


5. Conclusions and Future Challenges

Concept diagrams and property diagrams, designed for ontology engineering, have real
potential as a formal alternative to OWL, supported by initial empirical evidence. How-
ever, much remains to be done in order to realise this potential and to convincingly
demonstrate their utility to ontology engineers. Firstly, empirical studies are required that
focus on interpreting more complex concept diagrams, such as that in figure 10, as well
as property diagrams. Some work has already begun in this regard, where participants
were asked to identify classes and properties that are necessarily empty (i.e. they are
unsatisfiable); again, concept diagrams supported significantly better task performance
than OWL [12]. Alongside this, we need to understand whether it is easier for ontology
engineers to construct correct ontology axioms using concept diagrams rather than the
standard approach of using OWL.

     [Figure 10: curves labelled Spirit, Fairy, Pixie, Brave, Elf, Halfling, Goblin,
     Hobgoblin, Giant, Dwarf and Gnome, connected by arrows labelled guides, chases,
     frightens, likes, tracks, helps, dislikes and scares.]

                              Figure 10. A more complex concept diagram.

    Beyond empirical research, it is important to develop tools to support the use of
these diagrams that are fully integrated with existing software. To this end, work has
begun on implementing a plug-in for WebProtege that allows users to draw concept and
property diagrams, automatically converting them to OWL and displaying the OWL ax-
ioms in the standard WebProtege interface. It will be particularly difficult to implement
translations from OWL to diagrams, however. A major challenge is to define efficient
translation algorithms that produce effective diagrams, rather than arbitrary diagrams, which
are semantically equivalent to the given OWL specification.
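
The easier direction, from diagram to OWL, can be illustrated for the Euler diagram core alone. The following is a minimal sketch under stated assumptions and is not the algorithm used by the WebProtege plug-in: the dictionary curves is a hypothetical abstraction in which each labelled curve is mapped to the set of zones it contains, and the function to_owl_axioms emits axioms in OWL 2 functional-style syntax.

```python
from itertools import combinations

# Hypothetical abstract description of an Euler-style diagram: each labelled
# curve maps to the set of zones (minimal regions) that it contains.
curves = {
    "Person": {"z1"},
    "Dog": {"z2", "z3"},
    "Terrier": {"z3"},
}


def to_owl_axioms(curves):
    """Emit OWL 2 functional-style axioms for containment and disjointness."""
    axioms = []
    for a, b in combinations(sorted(curves), 2):
        za, zb = curves[a], curves[b]
        if za.isdisjoint(zb):
            axioms.append(f"DisjointClasses(:{a} :{b})")
        elif za < zb:
            axioms.append(f"SubClassOf(:{a} :{b})")
        elif zb < za:
            axioms.append(f"SubClassOf(:{b} :{a})")
        elif za == zb:
            axioms.append(f"EquivalentClasses(:{a} :{b})")
    return axioms


for axiom in to_owl_axioms(curves):
    print(axiom)
# DisjointClasses(:Dog :Person)
# SubClassOf(:Terrier :Dog)
# DisjointClasses(:Person :Terrier)
```

The third axiom is already entailed by the first two, so even this toy translation produces redundant output. Choosing which of several semantically equivalent results to present is the kind of decision that makes the reverse direction, from OWL to an effective diagram, difficult.
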


Acknowledgements

This paper summarises research that has been undertaken by members of the Visual Mod-
elling Group at the University of Brighton, in particular by the author in collaboration
with John Howse, Eisa Alharbi, Jim Burton, Aidan Delaney, and Ali Hamie alongside
Peter Chapman at Edinburgh Napier University. Michael Compton (formerly at CSIRO)
is also acknowledged for his contribution to the design of property diagrams and leading
the implementation of the WebProtege plug-in mentioned in section 5.


References

 [1] OWL. http://www.w3.org/TR/owl2-overview/. Accessed April 2014.
 [2] Protégé. http://protege.stanford.edu/. Accessed April 2014.
 [3] E. Alharbi, J. Howse, G. Stapleton, and A. Hamie. Evaluating diagrammatic patterns for ontology
     engineering. In Diagrams 2016 (accepted). Springer, 2016.
 [4] E. Alharbi, J. Howse, G. Stapleton, and A. Hamie. Helping people understand ontologies. In
     preparation, 2016.
 [5] C. Gurr. Effective diagrammatic communication: Syntactic, semantic and pragmatic issues. Journal of
     Visual Languages and Computing, 10(4):317–342, 1999.
 [6] M. Horridge, N. Drummond, J. Goodwin, A. Rector, R. Stevens, and H. Wang. The Manchester OWL
     syntax. In OWLed, volume 216, 2006.
 [7] J. Howse, G. Stapleton, K. Taylor, and P. Chapman. Visualizing ontologies: A case study. In Interna-
     tional Semantic Web Conference, pages 257–272. Springer, 2011.
 [8] A. Rector, N. Drummond, M. Horridge, J. Rogers, H. Knublauch, R. Stevens, H. Wang, and C. Wroe.
     Designing user interfaces to minimise common errors in ontology development: The CO-ODE and Hy-
     OntUse projects. In Proceedings of the UK e-Science All Hands Meeting, volume 2004, pages 493–499,
     2004.
 [9] A. Rector, N. Drummond, M. Horridge, J. Rogers, H. Knublauch, R. Stevens, H. Wang, and C. Wroe.
     OWL pizzas: Practical experience of teaching OWL-DL: Common errors & common patterns. In Engi-
     neering Knowledge in the Age of the Semantic Web, pages 63–81. Springer, 2004.
[10] R. Schwitter and M. Tilbrook. Controlled natural language meets the semantic web. In Australasian
     Language Technology Workshop, volume 2004, pages 55–62, 2004.
[11] G. Stapleton, J. Howse, P. Chapman, A. Delaney, J. Burton, and I. Oliver. Formalizing concept diagrams.
     In 19th International Conference on Distributed Multimedia Systems, pages 182–187, 2013.
[12] T. Hou, P. Chapman, and A. Blake. Antipattern comprehension: An empirical evaluation. In 9th Con-
     ference on Formal Ontology in Information Systems. IOS Press, 2016.
[13] P. Warren, P. Mulholland, T. Collins, and E. Motta. The Usability of Description Logics. In The Semantic
     Web: Trends and Challenges, volume 8465 of LNCS, pages 550–564. Springer, 2014.