=Paper=
{{Paper
|id=Vol-555/paper-2
|storemode=property
|title=Towards an Effective Methodology for Rapidly Developing Component-Based Domain Ontologies
|pdfUrl=https://ceur-ws.org/Vol-555/paper2.pdf
|volume=Vol-555
}}
==Towards an Effective Methodology for Rapidly Developing Component-Based Domain Ontologies==
Dave Kolas, BBN Technologies, 1300 N. 17th Street, STE 400, Arlington, VA 22209, Email: dkolas@bbn.com
Troy Self, BBN Technologies, 1300 N. 17th Street, STE 400, Arlington, VA 22209, Email: tself@bbn.com
Abstract—As the Intelligence Community migrates from a paradigm of using disjoint data and application stovepipes to a paradigm of shared knowledge and networked component services, the cost of developing appropriate domain ontologies becomes a concern. In this paper, we present a methodology for rapidly developing composite domain ontologies by linking and reusing existing ontologies. We will compare composite domain ontologies to component-based software engineering, present a metric for measuring the compositeness of an ontology, and describe a composite domain ontology developed for real-world use.
I. INTRODUCTION
While a significant amount of research effort has been devoted to creating ontologies[1], very little has been focused on the high-level engineering of ontologies. Current research details interesting ways to extract knowledge from subject matter experts[2], ways to align and interoperate between existing ontologies[3], and ways to extract information from traditional relational databases into ontologies[4]. Current software products provide tools for building ontologies, and thus make the process less painful, but fail to explain how one should know what to build.
Many individuals and institutions in the Semantic Web research area, particularly those who started in knowledge representation work long before the “Semantic Web” idea existed, think of the ultimate solution of knowledge representation in the form of one grand ontology that can cumulatively reason about anything. The relationships in this ontology will be objectively true in all cases, resulting in an ontology that can be used for any purpose at any time and in any context. This position fails to account for the “Web” portion of the Semantic Web; many knowledge representations will be developed by many individuals for many purposes, and they may not always be exactly compatible. While the one true ontology may one day exist, until that time, a different type of engineering is required.
It is our goal in this document to describe a methodology for engineering ontologies in the real world, where ontologies are expected to interoperate. We will focus on designing ontologies that will be used for a purpose, particularly in concert with computer application software. The recommendations in this document are derived from research done on the GARCON-F program as well as the collective experience of our research group in building ontology-based applications. We hope to espouse a set of ontology engineering principles that closely parallel those of component-based software design[5]. From an abstract engineering point of view, ontologies and software components have many common characteristics:
• Both ontologies and software components will likely be used in a multitude of application situations, not all of which are anticipated by
the designer.
• Both ontologies and software components will likely evolve after their initial creation, due to changing requirements, additional use cases, etc.
• Unlike hardware, software components and ontologies can be easily replicated once developed.
• Reusing either ontologies or software components has the potential to save a considerable amount of engineering resources.
• Care must be taken to make both ontologies and software components specific enough to be useful, but general enough to be reusable.
It is this set of common characteristics that leads us to approach ontology engineering in much the same way as component-based software engineering[6]. When viewed abstractly, ontology design and software design are merely two parts of the larger task of designing an application: building an intelligent data representation, and building components that can act on that representation. Thus we will define a composite domain ontology as an ontology built from logical ontology components, much the same way as a large piece of software is built from software components.
For the remainder of this document, when we refer to an ontology, we will generally mean an OWL Web Ontology Language[7] ontology. The knowledge representation language, OWL, has several qualities that make it particularly well suited for building the types of purposeful ontologies we have discussed. First, it has a natural ability to link ontologies together. This is critical to achieving the goals of reuse. Second, it strikes a delicate balance between expressivity and computational tractability, yielding ontologies that are practical. Third, the large and growing number of existing tools makes interaction with software components significantly easier. That said, we hope that much of what is described here is applicable to engineering other knowledge representations, particularly those with similar properties to OWL.
II. ONTOLOGY DESIGN GOALS
In this section, we will describe the goals of designing a composite ontology. The subsections will ideally both propose general guidelines by which ontologies should be designed and justify these guidelines. To discuss the following guidelines, a working definition of compositeness will be required. In this section, we present an intuitive, loose definition of compositeness; a more formal, measurable definition of compositeness will be presented later.
The compositeness of an ontology is defined as the degree to which its content is made up primarily of other ontologies, and the degree to which these ontologies are made up of other ontologies, etc. With that definition established, we examine why and when composite ontologies should be created.
A. Time Savings
The most immediate benefit achieved by building composite ontologies is elimination of much of the time required to design and build them. Extracting knowledge from subject matter experts can be a difficult and expensive process, and encoding that knowledge into a structured form can be quite difficult as well. By building an ontology from the building blocks of other ontologies, it is possible to significantly reduce the amount of time required to build the ontology.
Just as in software, the time savings gained by reusing another ontology are not totally cost-free; incorporating another ontology subjects the new ontology’s designer to the design decisions of the component ontologies. However, as with software, the time spent on tuning the incorporation of an ontology into a larger ontology is most often vastly less than the time spent developing a new ontology from scratch.
B. Interaction with Software
As the number and scope of Semantic Web applications increase, so too will the symbiosis between these applications and their related ontologies. For instance, in the GARCON-F program, the client UI runs within ESRI’s ArcMap software, providing the capacity to display semantic
annotations in a geospatial context. This software expects specific geospatial and temporal ontologies: GeoRSS for geometries and OWL-Time for temporal information. Any additional ontological information is processed dynamically by the software.
Consider the case of an ontology designer creating a new ontology to be used with this system. It is likely that whatever domain he or she is attempting to represent requires geospatial and temporal information; otherwise, the software would not be applicable. The ontology designer can choose between creating their own definitions for space and time, or merely adopting GeoRSS and OWL-Time. The former is possible by adding ontology translations to the system, but the latter makes the client software instantly able to process data from the new ontology with no software changes required.
Many applications that deal with Semantic Web data function this way. A few ontologies are treated specially, being used for a particular purpose. Other ontologies provide extra information or linkage to other applications. In this paradigm, reusing ontology components provides significantly greater tool interoperability.
C. Reasoning and Translation
Even after an ontology has been created, additional layers of reasoning are often added. These might come in the form of additional SWRL rules, translations, or linkages between ontologies. For instance, there may be existing translations from GeoRSS to another spatial representation. These translations then inherently work on any ontology that incorporates GeoRSS for spatial representation. This is very much like the automatic interoperability of software components described above. In a sense, this type of reuse driven by composite ontologies multiplies the effectiveness of software reuse with composite ontologies. If there are three tools that use three different ontologies for one particular purpose, and there are existing mappings between these ontologies, then a designer of a new ontology can choose any of the three and get the automatic benefit of the translations and the software.
D. Scope and Applicability
It is not our intention to suggest that all ontologies should be highly composite. We generally expect the compositeness of an ontology to vary in accordance with the simplicity or abstractness of the task for which the ontology is designed. The more abstract an associated task is, the higher the level of appropriate compositeness. Consider the GeoRSS ontology mentioned previously. This ontology has a very particular, focused purpose. The goal for the GeoRSS ontology is to attach point and polygon spatial data to other ontological entities. Because this purpose is narrow in scope and widely applicable, it makes sense for the GeoRSS ontology to be simple and not composite.
On the other hand, consider an ontology for air defense, as developed earlier in the GARCON-F program. The goal for an air defense ontology might be to annotate rich data about air defense scenarios, to reason about these annotations, and to generate reports about them. This ontology must be large in scope but only narrowly applicable. This is the scenario in which it makes sense for an ontology to be very highly composite.
Obviously there is a large amount of space between these two extremes, and finding the appropriate level of compositeness for an ontology is part of the process of creating it. We will attempt to give guidance on this as well in the following sections.
III. MEASURING COMPOSITENESS IN ONTOLOGIES
A. Relationship to Software Reuse Metrics
Software metrics are effectively used to alleviate quality and performance concerns during development[8]. We believe that analogous metrics can be developed to measure quality, reuse, and performance in ontologies. These metrics could eventually be applied to measure these characteristics in existing ontologies as well as to guide the planning and development of new ontologies.
In software, reuse and compositeness are measured by quantifying abstract constructs of the software development language, such as lines of code and the number of references between components[9]. While we envision an eventual set of metrics for measuring quality and reuse of ontologies, we begin in this paper by defining a single metric for
measuring the compositeness of an ontology.
The goal of a compositeness metric for an ontology is to have a quick estimate of the ontology’s modularity. The more composite the ontology is, the more likely it is that one could repurpose significant parts of the ontology and thus minimize further development.
B. Definitions
• The compositeness of a given ontology X is the function C(X).
• The size of a given ontology X is the function S(X).
• An ontology X importing an ontology Y is defined by the relation I(X, Y).
• The set of all ontologies imported by X is I(X).
C. Desirable Properties of a Compositeness Metric
If an ontology imports another ontology, its compositeness is greater than that of the imported ontology.
I(A, B) → C(A) > C(B)
If two ontologies import the same ontology, and one of the importing ontologies is smaller than the other, that ontology has higher compositeness.
I(A, B) ∧ I(C, B) ∧ S(A) < S(C) → C(A) > C(C)
If two ontologies that are otherwise identical import different ontologies, the one that imports the more composite ontology will have higher compositeness.
I(A, B) ∧ I(A′, C) ∧ C(B) > C(C) → C(A) > C(A′)
If one ontology that is otherwise identical to a second ontology imports an ontology that the second does not, that ontology will have higher compositeness.
(I(A, B) ∧ I(A′, B) ∧ I(A′, C) ∧ ∀X : (I(A, X) ∧ X ≠ C)) → C(A′) > C(A)
D. A Compositeness Metric
To meet the desired properties, we define a compositeness metric as illustrated in Equation 1.
$$C(X) = \sum_{y \in I(X)} C(y) + \frac{\sum_{y \in I(X)} S(y)}{S(X)} \qquad (1)$$
IV. BUILDING COMPOSITE ONTOLOGIES
This section will define a process for creating effective composite ontologies. By following this process, ontology engineers should create an ontology with maximum portability and reusability.
A. Start With the Application Domain
The most important aspect of creating a useful ontology is starting with a particular application in mind. By starting with an application, the knowledge engineer immediately creates a scope that the ontology will need to fulfill, and thus prevents the ontology from slowly expanding to represent unnecessary concepts and relationships. This process is analogous to software requirements gathering. In order to build an effective ontology, the knowledge engineer must answer the following questions:
• What questions/queries should the user or software client be able to ask of the ontology?
• What should the instance data look like?
• What types of inference will the ontology need to provide?
If a concept or relationship is not part of any of these three sets, then it need not be included in the ontology.
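The compositeness metric of Equation 1 (Section III-D) can be made concrete with a small recursive computation. The following is only a sketch under stated assumptions: the ontology names and sizes are hypothetical and are not the values from the Raid Mission example, I(X) is taken to be the set of directly imported ontologies, S(X) is taken to be an ontology's own size (for example, a count of its axioms) excluding its imports, and the helper name `compositeness` is introduced here purely for illustration. Rejecting circular imports is also an assumption, since Equation 1 does not say how they should be scored.

```python
def compositeness(name, imports, sizes):
    """Return C(name) = sum of C(y) over direct imports y, plus (sum of S(y)) / S(name).

    imports maps an ontology name to the names it directly imports, I(X).
    sizes maps an ontology name to its size, S(X).
    """
    cache = {}

    def c(x, path):
        if x in path:
            # Assumption: Equation 1 is undefined for circular imports, so reject them.
            raise ValueError("circular import involving " + x)
        if x not in cache:
            direct = imports.get(x, ())
            # First term of Equation 1: compositeness inherited from the imports.
            total = sum(c(y, path + (x,)) for y in direct)
            # Second term: imported size relative to own size (0 when nothing is imported).
            total += sum(sizes[y] for y in direct) / sizes[x]
            cache[x] = total
        return cache[x]

    return c(name, ())


# Hypothetical import graph and sizes, for illustration only.
imports = {"raid": ["observation"], "observation": ["time", "space"]}
sizes = {"time": 50, "space": 40, "observation": 30, "raid": 60}

for ontology in ("time", "space", "observation", "raid"):
    print(ontology, compositeness(ontology, imports, sizes))
```

With these made-up numbers the two leaf ontologies score 0, `observation` scores 90/30 = 3.0, and `raid` scores 3.0 + 30/60 = 3.5, so each importing ontology scores higher than the ontology it imports, as the first desirable property requires.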
B. Divide the Goals
Once the overall scope of the ontology has been established, the next step in the process is to divide the scoped relationships and concepts into logical subcomponents. These components could be either aspects of the data (geolocation, temporal information, provenance information) or dividable subparts of the overall domain (vehicles, ground systems, etc. for air defense). When attempting to partition the ontology, the following questions should be asked:
• Could this partition be successfully reused without the other partitions?
• Would making this partition allow a piece of software to work without understanding the other partitions?
• Are there known existing ontologies that fulfill part of the overall scope?
• Can some part of the overall ontology be viewed as a specialization of another part?
By answering these questions, the knowledge engineer should be able to tentatively create partitions of the ontology, and create a dependency graph between the parts.
C. Identify Reusable Ontologies
Once partitions have been identified, all attempts should be made to fill them in with suitable existing ontologies. The applicability of a given ontology component should be evaluated by how well it fits the partition, how widely it is used, and the quality of its construction.
It is entirely possible that an ontology that does not perfectly fit a partition may be desirable nonetheless, especially if it is already in wide use. The ontology component breakdown may be revisited during this step to accommodate existing ontologies with slightly larger or smaller scope.
D. Develop New Component Ontologies
Once existing ontologies have been worked into the overall ontology, the remaining pieces must be created. Other work addresses this part of the process; here we only advise careful attention to the scope of the ontology created in the first step. If a particular concept will never be directly inserted as data, queried for, or inferred over, it should not be included.
E. Link Component Ontologies Together
The final step is linking the ontology components together. This is accomplished by importing ontology components from the other components, mirroring the dependency graph created. As in software, care should be taken to avoid circular dependencies. While a circular dependency will not prevent compilation as in software, it will significantly reduce reusability of any of the components involved.
When appropriate, ontology components should link directly into the components they inherit from via subclass, subproperty, or restrictions. This provides the cohesion between parts necessary for effective use of the overall ontology. Occasionally, if multiple components are fulfilled by existing ontologies, glue components will be required to contain these relationships between parts. It is preferable to create a new linkage component rather than directly changing an existing ontology.
V. EXAMPLE: BUILDING A RAID MISSION PLANNING ONTOLOGY
A. Geospatial Semantic Annotation Tool
The Geospatial Semantic Annotation Tool (gSAT) is a platform for capturing imagery annotations using ontologies[10]. Previously, gSAT had been used for annotating imagery intelligence in the Air Defense domain. The purpose of this exercise was to rapidly develop a new domain ontology that could be used within gSAT without changing any of the software. The new domain was Raid Mission Planning as defined by the United States Marine Corps for Counter-Insurgency (CoIn) operations. A raid mission is a military mission where forces quickly advance on a chosen target and then leave. Examples of a raid can include attacking a known enemy location to eliminate its threat capability or evacuation procedures, such as removing non-essential staff from an embassy.
B. Raid Ontology
The concepts and relationships defined for this ontology are based on discussions with United
States Marine Corps imagery analysts. The raid ontology needed to include concepts and relationships necessary to represent an observation of a physical feature at some geospatial location at a particular time. The representation for an observation consisting of a what, where, and when was reused from gSAT. The concepts to be annotated included routes, landing zones, assault support vehicles, enemy facility types, and dangerous locations, such as potential IED areas and potential sniper positions.
Fig. 1. The Raid Mission ontology is comprised of multiple ontologies for representing time, space, and reusable features, such as vehicles and routes. The arrows indicate that one ontology imports the other.
Figure 1 shows the various ontology components in the Raid Mission ontology and their dependencies. Each circle represents an ontology and includes its compositeness score according to the metric defined in Equation 1. The ontologies that show C(X) = 0 do not import any other ontologies. As the diagram shows, the Raid ontology is the most abstract, and directly or indirectly imports all of the other ontologies. The most foundational ontologies are at the bottom, and are imported into the Raid ontology along two different paths. Most of the components in the ontology are reused.
Reusing the ontologies as shown above allows the gSAT annotation system to create spatiotemporal semantic annotations in the Raid domain without changing any aspect of the software. This is because the software only needs to directly understand the parts of the ontology that were reused: the foundational, temporal, and spatial portions. Thus in this case the reuse of ontologies has led to 100 percent reuse of the software.
VI. FURTHER RESEARCH
This document describes the initial research into a formal methodology for developing composite domain ontologies. The compositeness metric defined in this document requires further testing against a larger reference set of ontologies to validate its correctness. Further exploration into other metrics of ontology reuse is necessary. It is important to incorporate metrics that consider the ontology’s internal complexity, cohesion, and other measures[11]. Since component-based software has multiple metrics for measuring quality, it is expected that component-based ontologies should also have multiple metrics. The compositeness metric described here only considers the size of ontologies and the single relationship of imports between them. Future metrics must consider the amount of linking and semantic complexity between linked ontologies.
VII. CONCLUSION
In this document we have likened the process of engineering ontologies to component-based software engineering. We have demonstrated the benefits of designing ontologies this way, and defined a process for creating such ontologies. We have also defined a metric for determining how composite a particular ontology is. Our hope is that this analysis will help others create more effective ontologies in the future.
REFERENCES
[1] D. Bianchini, V. De Antonellis, and M. Melchiori, “Domain ontologies for knowledge sharing and service composition in virtual districts,” in Database and Expert Systems Applications, 2003. Proceedings. 14th International Workshop on, Sept. 2003, pp. 589–594.
[2] X. Wang, X. Wang, and F. Wang, “How to use class axioms to model ontology effectively in owl,” in Knowledge Acquisition and Modeling Workshop, 2008. KAM Workshop 2008. IEEE International Symposium on, Dec. 2008, pp. 601–604.
[3] J. Sampson, M. Lanzenberger, and C. Veres, “Facilitating interoperability in semantic web applications using ontologies,” in Complex, Intelligent and Software Intensive Systems, 2008. CISIS 2008. International Conference on, March 2008, pp. 233–239.
[4] Z. Qu and S. Tang, “Research on transforming relational database into enriched ontology,” in Advanced Computer Theory and Engineering, 2008. ICACTE ’08. International Conference on, Dec. 2008, pp. 749–753.
[5] L. Etzkorn and H. Delugach, “Towards a semantic metrics suite for object-oriented design,” in Technology of Object-Oriented Languages and Systems, 2000. TOOLS 34. Proceedings. 34th International Conference on, 2000, pp. 71–80.
[6] A. Farooq, A. Shah, and K. Asif, “Design of ontology in semantic web engineering process,” in High Capacity Optical Networks and Enabling Technologies, 2007. HONET 2007. International Symposium on, Nov. 2007, pp. 1–6.
[7] M. Dean and G. Schreiber, Eds., OWL Web Ontology Language Reference. W3C Recommendation, February 2004, http://www.w3.org/TR/2004/REC-owl-ref-20040210/.
[8] S. Sedigh-Ali, A. Ghafoor, and R. Paul, “Metrics and models for cost and quality of component-based software,” in Object-Oriented Real-Time Distributed Computing, 2003. Sixth IEEE International Symposium on, May 2003, pp. 149–155.
[9] W. Frakes and C. Terry, “Software reuse: metrics and models,” ACM Comput. Surv., vol. 28, no. 2, pp. 415–435, 1996.
[10] T. Self, D. Kolas, and M. Dean, “Ontology-driven imagery analysis,” in Proceedings of the Second International Ontology for the Intelligence Community Conference OIC-2007, 2007.
[11] Y. Ma, X. Ma, S. Liu, and B. Jin, “A proposal for stable semantic metrics based on evolving ontologies,” in Artificial Intelligence, 2009. JCAI ’09. International Joint Conference on, April 2009, pp. 136–139.