Visualizing Populated Ontologies with OntoTrix

         Benjamin Bach (1,3), Gennady Legostaev (2,1), and Emmanuel Pietriga (1)

         (1) INRIA and LRI (Université Paris-Sud & CNRS), Orsay, France
             benjamin.bach@lri.fr, emmanuel.pietriga@inria.fr
         (2) St Petersburg State University, Saint Petersburg, Russia
             glegostaev@gmail.com
         (3) Dresden University of Technology, Germany


       Abstract. Most tools for visualizing Semantic Web data structure the represen-
       tation according to the concept definitions and interrelations that constitute the
       ontology’s vocabulary. Instances are often treated as somewhat peripheral infor-
       mation, when considered at all. The visualization of instance-level data poses
       different but significant challenges as instances will often be orders of magnitude
       more numerous than the concept definitions that give them machine-processable
       meaning. We present a visualization technique designed to visualize large in-
       stance sets and the relations that connect them. This visualization uses both node-
       link and adjacency matrix representations of graphs to visualize different parts of
       the data depending on their semantic and local structural properties, exploiting
       ontological knowledge to drive the layout of, and navigation in, the visualization.


1   Introduction

By making use of the rich capabilities of graphical representations and by abstracting
from the complex syntactic details of textual formats, visual tools aim at providing bet-
ter cognitive support to users, from knowledge engineers to domain-expert end-users.
They provide users with interactive representations of the data based upon state-of-the-art
information visualization techniques, better supporting tasks such as ontology
understanding, discovery, search, comparison and mapping [3].
    Most tools structure the visualization according to the concept definitions and in-
terrelations that constitute the ontology’s vocabulary. While many of them do support
the visualization of instance data, instances are often treated as somewhat peripheral
information. Employing Description Logics terminology, the visualization is mainly
structured according to the TBox, instances that constitute the ABox being treated as
leaf nodes in this tree or graph structure. Exceptions to this general observation exist,
but either give a limited view of the ABox [1,6] or use conventional node-link diagram
representations that hardly scale beyond a few hundred nodes at best [4].
    Instances represent an essential part of the overall knowledge base. Compared to
the definition of concepts based on OWL constructs, instance data are at a lower level
of abstraction. But more importantly, instance datasets are often orders of magnitude
larger (see, e.g., many of the datasets currently part of the Linking Open Data graph).
As such, the visualization of instance-level data poses different but real challenges that
remain to be addressed. We present OntoTrix, a visualization technique designed to
enable users to visualize, and navigate in, large instance sets and their relations.


              Fig. 1. Overview of SWEET’s units.owl ontology with OntoTrix.


2   OntoTrix

Ontology graphs contain many nodes and edges, and are often non-planar. Two main
issues with node-link diagram representations of such graphs are their inefficient use of
screen real-estate and edge crossings that make dense regions difficult to read, both
eventually causing scalability problems. A well-known alternative to node-link diagrams
for graph visualization is the adjacency matrix. Nodes are represented as rows and
columns, and edges as filled cells at the intersection of connected rows and columns.
While node-link diagrams are good at showing the structure of relatively small and
sparse graphs, adjacency matrices are very effective at showing large (better use of
screen real-estate) and dense (no edge crossing) graphs. However, adjacency matrix
representations are much less familiar to users than node-link diagrams, and make tasks
that involve following paths in the graph more difficult [2].
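The adjacency matrix encoding described above can be illustrated with a minimal sketch. It is not taken from the OntoTrix implementation: the function names, the instance names and the edges are hypothetical, and edges are assumed to be directed, as object property statements are.

```python
def adjacency_matrix(nodes, edges):
    """Build a square 0/1 matrix: cell [i][j] is 1 when an edge goes from nodes[i] to nodes[j]."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix


def density(matrix):
    """Fraction of off-diagonal cells that are filled (directed graph, no self-loops counted)."""
    n = len(matrix)
    if n < 2:
        return 0.0
    filled = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return filled / (n * (n - 1))


# Hypothetical instance nodes and relations, for illustration only.
nodes = ["hour", "minute", "second"]
edges = [("hour", "minute"), ("minute", "second"), ("hour", "second")]

matrix = adjacency_matrix(nodes, edges)

# Text rendering: one row per node, '#' for a filled cell, '.' for an empty one.
for node, row in zip(nodes, matrix):
    print(f"{node:>8} " + " ".join("#" if cell else "." for cell in row))
```

Each edge occupies exactly one cell, so no two edges can cross, and the space needed depends only on the number of nodes. Following a path, however, means jumping from a column to the row of the same node, which is why path-related tasks are harder in this representation.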
    OntoTrix is inspired by a recent hybrid network visualization technique, NodeTrix
[2], which uses both node-link and adjacency matrix representations to visualize different
parts of the data depending on their semantic and structural properties (Figure 1).
The technique has proven successful at handling large networks, being very efficient
at visualizing locally dense but globally sparse networks. It displays the overall struc-
ture of the network using a node-link diagram, and the dense subgraphs that represent
communities using matrices. While the graph structure of ontologies might not always
share the small-world characteristics of social networks, such a hybrid representation,
combined with appropriate interaction techniques, can be an efficient means to perform
exploratory visualization of large ontology instance sets.


Fig. 2. OntoTrix Interface Overview: (A) Main NodeTrix view, (B) Bird’s eye view, (C) Class
hierarchy view, (D) Property hierarchy view.

    NodeTrix was originally devised for undirected social network structures that only
feature one type of node and one type of relation. We extend NodeTrix to handle the
much richer and more complex graph structure of populated ontologies, exploiting ontological
(TBox) knowledge to drive the layout of, and navigation in, the representation.
Embedded in a smooth zoomable interface [5], and coupled with visual representations
of the class and property hierarchies that enable interactive navigation and filtering of
instance data, this visualization produces more compact and legible representations than
node-link diagram approaches, thus better scaling to large instance sets.
    The OntoTrix environment features four main views (Figure 2). The main view (A)
contains the OntoTrix representation of the instance set. (B) provides an interactive
bird’s eye view of (A). The class hierarchy is visualized in (C), the property hierarchy
in (D). These views are tightly synchronized. For instance, hovering over a node in the class
hierarchy view highlights all corresponding instances in the OntoTrix view. Hovering
over a node in the property hierarchy view highlights all corresponding elements in matrices,
as well as corresponding edges between matrices. Classes declared as being in the
domain or range of the property are highlighted in the class hierarchy, and conversely.
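The lookups behind this kind of linked highlighting can be sketched as follows. This is an illustration under stated assumptions, not the OntoTrix code: all function and variable names are hypothetical, each instance is assumed to have a single asserted class, and hovering over a class is assumed to also select instances of its subclasses.

```python
def instances_to_highlight(hovered_class, subclasses, class_of):
    """Instances whose class is the hovered class or one of its descendants (assumption)."""
    selected, stack = set(), [hovered_class]
    while stack:
        cls = stack.pop()
        if cls not in selected:
            selected.add(cls)
            stack.extend(subclasses.get(cls, []))
    return {inst for inst, cls in class_of.items() if cls in selected}


def edges_to_highlight(hovered_property, triples):
    """(subject, object) pairs of all statements that use the hovered property."""
    return [(s, o) for s, p, o in triples if p == hovered_property]


def classes_to_highlight(hovered_property, domain, range_):
    """Classes declared as domain or range of the hovered property."""
    return set(domain.get(hovered_property, [])) | set(range_.get(hovered_property, []))


# Hypothetical TBox and ABox fragments, for illustration only.
subclasses = {"Unit": ["TimeUnit", "EnergyUnit"]}
class_of = {"second": "TimeUnit", "joule": "EnergyUnit", "paris": "City"}
triples = [("hour", "multipleOf", "second"), ("joule", "relatedTo", "second")]
domain = {"multipleOf": ["TimeUnit"]}
range_ = {"multipleOf": ["TimeUnit"]}

print(sorted(instances_to_highlight("Unit", subclasses, class_of)))
print(edges_to_highlight("multipleOf", triples))
print(classes_to_highlight("multipleOf", domain, range_))
```

Whether a highlighted statement appears as a matrix cell or as an edge between matrices depends on whether its subject and object fall in the same matrix; that decision is left out of the sketch.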

     Matrices in NodeTrix correspond to highly-connected groups of actors,
i.e., dense subgraphs that represent social communities. In OntoTrix, we propose four
methods for grouping instances into matrices, yielding very different perspectives
on the dataset. The first method clusters instances into matrices based on edge density,
taking into account all types of relations between instance nodes (Figure 1). A second method
groups instances into matrices according to class membership. A third method represents
a tradeoff between the previous two. Starting from an initial grouping based on density,
the nodes within each matrix are grouped according to class membership as described
above (by reordering columns and rows). Each of the original matrices can then be
split into smaller matrices corresponding to class membership groups. The last method
groups all instances involved in statements based on selected object property type(s).
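Three of these grouping methods can be sketched in a few lines; the density-based clustering itself is taken as given input here, since it requires a graph clustering algorithm. The sketch is illustrative only: function names and data are hypothetical, and each instance is assumed to belong to exactly one class, which is a simplification.

```python
from collections import defaultdict


def group_by_class(class_of):
    """Second method: one group (matrix) per class."""
    groups = defaultdict(list)
    for instance, cls in class_of.items():
        groups[cls].append(instance)
    return dict(groups)


def split_clusters_by_class(clusters, class_of):
    """Third method: within each density-based cluster, group members by class.

    Each returned dict can be read either as a row/column ordering of the
    original matrix or as the set of smaller matrices it is split into.
    """
    result = []
    for cluster in clusters:
        groups = defaultdict(list)
        for instance in cluster:
            groups[class_of[instance]].append(instance)
        result.append(dict(groups))
    return result


def group_by_properties(triples, selected_properties):
    """Last method: instances involved in statements that use a selected property."""
    members, seen = [], set()
    for subject, prop, obj in triples:
        if prop in selected_properties:
            for node in (subject, obj):
                if node not in seen:
                    seen.add(node)
                    members.append(node)
    return members


# Hypothetical data; 'clusters' stands for the output of a density-based clustering step.
class_of = {"second": "TimeUnit", "hour": "TimeUnit",
            "joule": "EnergyUnit", "watt_hour": "EnergyUnit"}
clusters = [["second", "hour", "watt_hour"], ["joule"]]
triples = [("hour", "multipleOf", "second"), ("watt_hour", "derivedFrom", "hour")]

print(group_by_class(class_of))
print(split_clusters_by_class(clusters, class_of))
print(group_by_properties(triples, {"multipleOf"}))
```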
     Beyond improvements in terms of readability of a graph’s structure, NodeTrix fea-
tures an interesting property that is related to the above view of matrices as aggregates
of instance nodes for the layout process. Often, these aggregates will by themselves
represent interesting entities, not explicitly represented in the ontology, but that bear
semantics. When grouping by class membership in OntoTrix, matrices represent
groups of similar instances. We thus label each matrix with the associated class
name. When grouping by other methods, matrices cannot easily be tagged as there is
no explicit information about the grouping that stems from the purely structural clus-
tering of the graph. Matrices might still represent interesting entities, but will have to
be labeled manually and are currently assigned a random identifier. For instance, the
matrices in Figure 1 illustrate interesting grouping patterns: matrix [0] contains mostly
time-related units, matrix [1] mostly energy-related units, matrix [6] units related to
measuring light, etc. These groupings are very different from what is obtained when
grouping by class membership, and give an interesting perspective on the dataset. By
supporting dynamic, smoothly-animated transitions between grouping methods, On-
toTrix allows for rapid switching between these perspectives.
     OntoTrix is implemented in Java, using the ZVTM zoomable user interface toolkit,
LinLogLayout for layout and clustering, Jena 2 for parsing ontologies, and the TDB
backend for storage. Jena’s OWL transitive reasoner provides a complete classification
of the ontology. Additional reasoning can be performed using other reasoners.


References
1. Fluit, C., van Harmelen, F., Sabou, M.: Supporting user tasks through visualisation of light-
   weight ontologies. In: Handbook on Ontologies in Info. Systems. Springer-Verlag (2003)
2. Henry, N., Fekete, J.D., McGuffin, M.J.: NodeTrix: a hybrid visualization of social networks.
   IEEE Transactions on Visualization and Computer Graphics 13(6), 1302–1309 (2007)
3. Katifori, A., Halatsis, C., Lepouras, G., Vassilakis, C., Giannopoulou, E.: Ontology visualiza-
   tion methods—a survey. ACM Computing Surveys 39(4), 10:1–10:42 (2007)
4. Noppens, O., Liebig, T.: Interactive Visualization of Large OWL Instance Sets. In: Interna-
   tional Workshop on the Semantic Web and User Interaction (SWUI) (2006)
5. Pietriga, E.: A toolkit for addressing HCI issues in visual language environments. In: Symp. on
   Visual Languages and Human-Centric Computing (VL/HCC’05). pp. 145–152. IEEE (2005)
6. Tu, K., Xiong, M., Zhang, L., Zhu, H., Zhang, J., Yu, Y.: Towards imaging large-scale ontolo-
   gies for quick understanding and analysis. In: ISWC. pp. 702–715. Springer-Verlag (2005)