An Instance-based Approach for Identifying Candidate Ontology Relations within a Multi-Agent System

Andrew Brent Williams1 and Costas Tsatsoulis2

1 Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA 52242, email: abwill@eng.uiowa.edu
2 Information and Telecommunication Center, Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, email: tsatsoul@ittc.ukans.edu

Abstract. Discovering related concepts in a multi-agent system among agents with diverse ontologies is difficult using existing knowledge representation languages and approaches. We describe an approach for identifying candidate relations between expressive, diverse ontologies using concept cluster integration. We evaluate the feasibility of this approach using lightweight ontologies. These lightweight ontologies are constructed from the Magellan search Web site and consist of Web page categories, or concepts, and their corresponding instances.

1 INTRODUCTION

In order to facilitate knowledge sharing between a group of interacting information agents (i.e. a multi-agent system), a common ontology should be shared. However, we recognize that agents often do not commit a priori to a common, pre-defined global ontology. Our ongoing research investigates approaches for agents with diverse ontologies to share knowledge by automated learning methods and agent communication strategies [15]. When agents have diverse ontologies there are many challenges for knowledge sharing and communication. One of these challenges is for agents to automatically learn representations for diverse ontologies from categorized Web pages and to identify the relationships between two agents' ontologies. In this paper, we demonstrate the feasibility of identifying a candidate 1:N relation between two different agents' ontologies. These ontologies represent natural human categorizations of Web page bookmarks into concepts, or sets of corresponding instances.

Definitions of an ontology range from simply what is known to exist in an agent's world, or the categories in a search engine index, to more rigorous definitions which lend themselves to the construction of formal ontologies using a description logic. For example, in the classic AI blocks world domain, the ontology consists only of a table surface and three blocks labeled A, B, and C. The Yahoo! search engine has an ontology which consists of a taxonomy of its Web page categories and is referred to as a lightweight ontology [13]. An example of a more extensive formalized ontology is the Cyc Ontology [7].

We recognize that research using formal knowledge representation languages to create formal ontologies for agent knowledge sharing has made significant strides. These approaches, however, must place some limits on the expressiveness of the vocabulary in order to facilitate the use of inference mechanisms for deducing the entailments of a set of sentences in the description logic. MacGregor [8], who worked on the LOOM knowledge representation language, noted the trade-off between the tractability of this type of language and its expressiveness. Also, the essential ontological promiscuity of artificial intelligence implies that any agent can create and accommodate an ontology based on its usefulness for the task at hand [4]. Therefore, a group of agents with individualized ontologies may wish to share knowledge but find it difficult to understand the relationships between their concepts.

This situation of agents with diverse ontologies may exist in the World Wide Web domain. Web users may construct simple ontologies while searching and "surfing" the Web. These users search for information using Web browsers and search engines. Once a person finds a Web page of interest, they will often bookmark it for later reference. These bookmarks can be grouped into categories, or concepts, of similar Web pages to form a taxonomy, or lightweight ontology.

We believe that information agents may benefit from using these ontologies to search for related concepts in a multi-agent system. These agents may understand different concepts that are not exactly the same but may be related. As an example, there may be an agent that understands the concept "NBA". Another agent may know the concept "College Hoops". Although these concepts are not exactly the same, they are both clearly related to the concept "Basketball". Agents that do not know the relationships of their concepts to each other need to be able to teach each other these relationships. If the agents are able to discover these concept relations, it will aid them as a group in sharing knowledge even though they have diverse ontologies. Information agents acting on behalf of a diverse group of users need a way of discovering relationships between the users' individualized ontologies. The agents can then use these discovered relationships to help their users find information related to their topic, or concept, of interest. This paper describes a possible approach for discovering these relationships while allowing for maximum expressiveness in the agents' vocabularies.

The rest of this paper first discusses related work in Section 2, describes our approach in Section 3, and then describes our implementation in Section 4. Section 5 describes how we evaluated our system and presents the results. Section 6 presents our conclusions and describes future work.

2 RELATED WORK

Manually constructing ontologies by combining different ontologies on the same or similar subject into one is called merging [11]. Differentiated ontologies, whose terms are formally defined as concepts and which have local concepts that are shared, have also been addressed [14]. That work uses description compatibility measures based on comparing ontology structures represented as graphs and on identifying similarities as mappings between elements of the graphs. The relations found between concepts rest on the assumption that local concepts inherit from concepts that are shared. Their system was evaluated by generating description logic ontologies in artificial worlds. In our approach, we do not assume that the ontologies share commonly labeled concepts, but rather a distributed collective memory of objects that can be selectively categorized into each agent's ontology. Our system also differs in that it uses Web page text as the instances that describe examples of an agent's concepts.

Machine learning algorithms have been used to learn how to extract information from Web pages [3]. That approach uses manually constructed ontologies with their classes, relations, and training data. The objective of the work is to construct a knowledge base from the World Wide Web, not to find relationships between concepts in a multi-agent system.

Several information agent systems attempt to deal with some issues in using ontologies to find information. IICA, or Intelligent Information Collector and Analyzer, is an ontology-based Internet navigation system [5]. IICA gathers, classifies, and reorganizes information from the Internet. It uses a common ontology to allow IICA to make inexact matches between users' requests and the candidate locations. The authors define their ontologies as weakly structured; they are built from existing thesauruses and technical books and consist of about 4,500 terms. This system is based on using a common ontology rather than diverse ontologies. For text categorization or classification it uses the information retrieval vector space model. Information agents that can update models of available information sources using inductive concept learning, applied to static, relational databases using a formal description logic, have also been demonstrated [6, 1]. That system embedded the concept semantics in the initial ontology and in its query reformulation operators. Since it uses a description logic, the expressiveness of its vocabulary is limited and would be hampered by the high degree of language expressiveness in the World Wide Web domain. The InfoSleuth Project [2] uses multiple representations of ontologies to help in semantic brokering. Its agents advertise their capabilities in terms of more than one ontology in order to increase the chances of finding a semantic match of concepts in the distributed information system. The InfoSleuth system, however, does not attempt to discover relationships between concepts in the different ontologies.

3 APPROACH

We discuss how our agents represent, learn, share, and interpret concepts using ontologies constructed from Web page bookmark hierarchies. In particular, we show how we use DOGGIE agents to discover candidate relations between different ontologies. The relations are assumed to be general is-a relations.

3.1 Concept Representation and Learning

A semantic concept comprises a group of semantic objects, i.e. Web pages, that describe that concept. The semantic object representations we use define each token, i.e. each word and HTML tag from the Web page, as a boolean feature. The entire collection of Web pages, or semantic objects, that were categorized by a user's bookmark hierarchy is tokenized to find a vocabulary of unique tokens. This vocabulary is used to represent a Web page by a vector of ones and zeroes corresponding to the presence or absence of each token in the Web page. This combination of a unique vocabulary and a vector of corresponding ones and zeroes makes up a concept vector. A concept vector represents a specific Web page, and the actual semantic concept is represented by a group of concept vectors judged to be similar by the user.

Our agents use supervised inductive learning to learn their individual ontologies. The output of this ontology learning is a set of semantic concept descriptions (SCD) in the form of interpretation rules. For example, the following is the SCD for the concept at the ontology location Arts/Book/Talk/Reviews, using a CLIPS rule representation:

1. (defrule Rule35 (danny 1)
     =>
     (assert (CONCEPT Arts Book Talk Reviews)))
2. (defrule Rule34 (ink 1)
     =>
     (assert (CONCEPT Arts Book Talk Reviews)))

Each Web page bookmark folder label represents a semantic concept name. A Web page bookmark folder can contain bookmarks, or URLs, pointing to a semantic concept object, or Web page. A bookmark folder can also contain additional folders. Each set of bookmarks in a folder is used as training instances for the semantic concept learner. The semantic concept learner learns a set of interpretation rules for all of the agent's known semantic concept objects (Figure 1).

Figure 1. Supervised inductive learning produces ontology rules

For each of these semantic concept description rules, an associated certainty value is determined during the learning process. This certainty value is used later during the interpretation process. It is equal to the percentage of the training set these rules interpreted successfully, minus an error prediction rate calculated for our particular semantic concept learner [12].

3.2 Concept-based Queries

DOGGIE agents use concept-based queries (CBQ) to communicate their requests for concepts related to the query.
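To make the query structure concrete, the following sketch shows one way a CBQ's contents (a concept name, addresses of example instances, and service flags) might be represented. The class and field names are our own illustration, not DOGGIE's actual KQML message format.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptBasedQuery:
    """Hypothetical sketch of a concept-based query (CBQ).

    Per the paper, a CBQ carries a concept name, the addresses (URLs) of
    example instances of the concept, and flags indicating the requested
    service. The field names here are assumptions for illustration only.
    """
    concept_name: str                                   # e.g. a bookmark-folder label
    example_urls: list = field(default_factory=list)    # addresses of concept instances
    service_flags: dict = field(default_factory=dict)   # requested service type

# A querying agent might construct and send a CBQ like this one:
query = ConceptBasedQuery(
    concept_name="Sports/Basketball/NCAA",
    example_urls=["http://example.org/page1", "http://example.org/page2"],
    service_flags={"find_related": True},
)
print(query.concept_name)  # Sports/Basketball/NCAA
```

In DOGGIE itself such a request would be wrapped in a KQML message and sent over CORBA; this sketch only captures the payload described in the text.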
A CBQ occurs when one agent sends example concepts to other neighboring agents, determines from the agents' responses who knows related concepts, and learns new knowledge. This new knowledge can take the form of new semantic concepts or of knowledge regarding another agent's ontology. The actual CBQ consists of the concept name, the addresses of examples of the concept (i.e. URLs), and flags indicating what type of service the user requests. For this concept-based query scenario, an acquaintance agent is any other agent that the querying agent knows how to locate and communicate with.

3.3 Concept Interpretation and Verification

The querying (Q) agent sends out a CBQ to its acquaintances. The responding (R) agents use their semantic concept interpreters to determine whether they think they know related concepts, and send their responses back to the Q agent. A semantic concept interpreter is a knowledge-based component that can classify concept instances according to an agent's local ontology. Each Q and R agent has its own local ontology which represents how it has individually conceptualized its view of the world. An R agent's response may be a positive (K), neutral (M), or negative (D) interpretation, along with the concept name and type. A positive response corresponds to an interpretation value above the positive interpretation threshold, and a negative response to a value below the negative interpretation threshold. A neutral response corresponds to a value falling between these two thresholds. The positive interpretation threshold is equal to the percentage accuracy value calculated for a particular concept during the ontology learning process, minus an error prediction rate value for the particular concept interpreter. The negative interpretation threshold is the lower boundary interpretation value that indicates the concept is not known. The concept name corresponds to the bookmark folder the Web pages belong to. The concept type indicates whether the answer to the query is a similar or a related concept. If an R agent has a positive response to the CBQ, it will request permission to send examples of its similar or related concept back to the Q agent. The Q agent can then verify whether the R agent actually knows a similar or related concept by using its own concept interpreter on the examples R sends to it.

3.4 Concept Cluster Integration

In the case of multiple M regions in a concept query response, DOGGIE can apply the concept cluster integration (CCI) algorithm to look for candidate relations between ontologies. As previously described, the Q agent sends out a CBQ that is received by R agents. The R agents send back the results of their interpretation process. Included in this response are the name of the concept(s) the R agent has interpreted the original concept to be, its interpretation region (K, M, or D), the stored interpretation threshold, the resulting interpretation value, and some examples of the R agent's concept. Since the agents in our multi-agent system are willing to perform only minimal work for each other, the actual concept cluster integration algorithm is performed by the original Q agent instead of an R agent. After the interpretation results have been sent back to the Q agent from the R agent, the Q agent must do several things to complete concept cluster integration. First, it must gather all of the returned examples from each of the returned M region concepts. Then it must combine these into a new directory named after a combination of these concept names. For example, if the M region concepts returned were Sports and NBA, the new concept cluster directory would be called Sports + NBA. A new ontology category is created with this label, and the returned examples are combined into this category as the instances that make up the concept. Then the agent must relearn the ontology rules, or semantic concept descriptions (SCD), using its semantic concept learner. Next, the original CBQ sent out by the Q agent is interpreted according to these new semantic concept descriptions to see if the agent now knows the CBQ concept as the combination of the returned M region concepts. This CCI process integrates the R agent's M region concepts into the Q agent's own ontology, since the examples are input into the ontology under a new concept name. This new concept name represents a compound concept.

If there is a match between the original CBQ and the new compound cluster, then new group knowledge can be learned which describes a relationship between a Q agent's concept and more than one of an R agent's concepts. This group knowledge, or CCI rule, is stored in the form of a concept relation, or compound cluster translation rule. The Q agent takes the following steps to perform CCI:

1. From the R agent response, determine the names of the concepts to cluster.
2. Create a new compound concept using the above names.
3. Create a new ontology category by combining the instances associated with the compound concept.
4. Re-learn the ontology rules.
5. Re-interpret the CBQ using the new ontology rules, including the new concept cluster description rules.
6. If the concept is verified, store the new concept relation rule.

4 IMPLEMENTATION

In this section, we describe how we implemented our multi-agent system, the Distributed Ontology Gathering Group Integration Environment (DOGGIE). DOGGIE was used for our investigation into knowledge sharing and learning among agents with diverse ontologies. DOGGIE agents are multithreaded Java applications with a Swing GUI (Figure 2).

Figure 2. Example DOGGIE Agent GUI with KQML Messages

The underlying multi-agent communication architecture for the DOGGIE system is designed around the Common Object Request Broker Architecture (CORBA). CORBA is used as the underlying communication mechanism between the DOGGIE agents, which can be located anywhere on the Internet. The messages between agents are formatted and sent using the Knowledge Query and Manipulation Language (KQML). Each agent in DOGGIE is actually composed of both a CORBA client and a CORBA server process running simultaneously, so that it can both send and receive queries. A single agent sends its concept-based queries (CBQ) using the CORBA client and receives concept-based queries through its CORBA server component. The CORBA server and CORBA client are the main communication components for the Agent Engine and Agent Control. Each single DOGGIE agent is made up of five major components: Agent Control, Agent Engine, Agent UI, Semantic Concept Learner, and Semantic Concept Interpreter (Figure 3).

Figure 3. Architecture of a single DOGGIE Agent

5 EVALUATION

We show that it may be feasible for agents to discover relationships between diverse ontologies by testing the concept cluster integration algorithm using ontologies constructed from the Magellan [10, 9] search engine ontology.

5.1 Data Set

A manually constructed subject ontology from the Magellan site was used [16]. This ontology was grabbed from the Magellan site by a spider. The spider started from the Magellan homepage and recursively followed the links to grab both topics and the Web pages listed under each topic. This approach assumed that the Web pages listed under a topic were semantically related to the topic. We used an existing open Magellan ontology to objectively measure which Web page instances belonged to particular ontology concepts. The data used consisted of 50 random concept categories taken from the Magellan search Web site. The Magellan ontology consisted of 4,385 nodes, or concept categories. Each of the concept categories used had 20 Web pages in it. Each DOGGIE agent was assigned 5 to 12 concepts from the Magellan ontology; some agent ontologies were deliberately built as "narrow" ontologies that included only closely related concepts.

Table 1. Some example Magellan concepts used for ontology

Number  ID    Concept
1       3     Arts/Architecture Firms
15      170   Business/Companies/Agriculture and Fisheries
17      1012  Computing/Hardware/LAN Hardware
31      2090  Health and Medicine/Mental Health/Resources
33      2120  Hobbies/Arts and Crafts/Knitting and Stitching
37      3504  Regional/Travel/Travel Agencies/P through Z
39      3535  Science/Astronomy and Space/Resources
46      4030  Shopping/Prized Possessions/Collectibles
49      4115  Sports/Basketball/NCAA
50      4127  Sports/College/School Home Pages

5.2 Experiments

In this section, we describe how we evaluated the feasibility of finding candidate ontology relations using the concept cluster integration algorithm in the context of a multi-agent system.

5.2.1 Hypothesis

It is feasible for agents with diverse ontologies to discover concept relations using concept cluster integration.

5.2.2 Method

We selected a sample of ten queries which produced two M region responses and set up the DOGGIE agents to communicate between the respective Q and R agents. We selected the concept cluster integration option on the DOGGIE agent user interface, then sent and processed the queries one at a time.

5.2.3 Prediction

We expected that the CCI algorithm would produce at least some verified concept cluster relations.

5.2.4 Results

The results of our experiments are located in Table 2 below. Only 20% of our queries produced verified concept cluster relations. Table 2 shows the experiment number, the original queried concept, and the concept responses. It also shows the region type for the responses; the delta, or the difference between the stored interpretation threshold and the actual interpretation value; the name of the newly created concept cluster; and the result of the concept cluster verification.
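The region and delta columns reported in Table 2 can be illustrated with a small sketch. The threshold values and the exact sign convention for delta below are our reading of Sections 3.3 and 5.2.4 (delta as the signed difference between an R agent's interpretation value and its stored positive threshold), not code from DOGGIE itself.

```python
def classify(value, pos_threshold, neg_threshold):
    """Map an interpretation value to a region per Section 3.3:
    'K' (known) above the positive threshold, 'D' (not known) below
    the negative threshold, 'M' (neutral/maybe) in between."""
    if value > pos_threshold:
        return "K"
    if value < neg_threshold:
        return "D"
    return "M"

def delta(value, pos_threshold):
    """Signed difference between the interpretation value and the stored
    threshold -- our assumed reading of Table 2's delta column."""
    return value - pos_threshold

# Consistent with Table 2's pattern: K responses carry positive deltas,
# M responses carry negative ones (illustrative numbers only).
print(classify(0.84, pos_threshold=0.80, neg_threshold=0.30))  # K
print(classify(0.74, pos_threshold=0.80, neg_threshold=0.30))  # M
print(round(delta(0.84, 0.80), 2))                             # 0.04
```

Under this reading, row 1a of Table 2 (region K, delta 0.04) would correspond to an interpretation value 0.04 above the concept's stored threshold.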
The concept cluster integration experiments were run in single-agent-to-single-agent configurations. The concepts used were each assigned a unique identification (ID) number; some examples are listed in Table 1. These were the concepts from which we built the individual agent ontologies: for some agent ontologies we chose the concepts randomly, while in others we hand-selected them.

5.2.5 Discussion

Our DOGGIE agents discovered two concept relations out of the ten attempts. The resulting concept relation rules are below:

K(A1, C2090, K(A4, C5+42))
K(A1, C3504, K(A4, C2024+3504))

Table 2. Concept Cluster Integration Experiments Summary

#    Query  Reply       Region  Delta   Cluster     Verified
1a   2090   5+42        K        0.04   5+42        Y
1b   2090   2090        M       -0.06   5+42        N
2    3504   2024+3504   K        0.07   2024+3504   Y
3    3562   5+3561      M       -0.26   5+3561      N
4    4030   4030        K        0.24   4004+1014   N
5a   135    3505+1014   M       -0.36   3505+1014   N
5b   135    135         M       -0.06   3505+1014   N
6    59     59          M       -0.08   3504+1014   N
7    170    170         M       -0.50   57+2409     N
8    4002   57+3505     D       -0.46   57+3505     N
9    4002   4002        K        0.15   57+3505     N
10   3561   3561        -       -       57+1027     N

The first rule can be read as "Agent 1's concept 2090 is related to Agent 4's concept cluster 5+42". From Table 1 we note that concept 2090 is located in the Magellan ontology as Health and Medicine/Mental Health/Resources. Concepts 5 and 42 are Arts/Architecture/Resources and Professional Organizations and Arts/Books/Genres/Non-Fiction, respectively. Intuitively, it would be difficult to determine such a relationship between these concepts. However, if we are using DOGGIE for AI-assisted Web browsing, this is a relationship that the user may wish to explore.

Similarly, the second rule can be read as "Agent 1's concept 3504 is related to Agent 4's concept cluster 2024+3504". This concept relation rule shows an interesting situation in which one agent's concept 3504 is related to another agent's concept 3504 combined with concept 2024. From Table 1 we see that concept 3504 is Regional/Travel/Travel Agencies/P through Z. Concept 2024 is Health and Medicine/Medicine/Clinics/University Medical Centers. Again, this newly created concept cluster could be worth exploring by the user.

6 CONCLUSION AND FUTURE WORK

Our results have demonstrated that our instance-based approach for discovering candidate relations between ontologies using concept cluster integration is feasible. We believe that further research is required. Our approach does not attempt to identify the specific type of relationship (e.g. part-of) in the ontologies, but assumes they consist of general is-a relations. For future experiments, the number of instances included in each concept should be increased to ensure that the machine learning algorithm has sufficient training examples. Also, a different experiment design should be used to verify that we can take an existing concept, divide it into two concepts, and determine whether DOGGIE can discover the relations between them. We hope that eventually this multi-agent system approach to finding relations can be used in conjunction with formal ontologies constructed using a description logic.

ACKNOWLEDGEMENTS

We would like to thank the referees for their comments, which helped improve this paper.

REFERENCES

[1] J. Ambite and C. Knoblock, 'Reconciling distributed information systems', AAAI 1995 Spring Symposium on Information Gathering from Distributed, Heterogeneous Environments, (1995).
[2] R. Bayardo, W. Bohrer, R. Brice, A. Cichocki, J. Fowler, A. Helal, V. Kashyap, T. Ksiezyk, G. Martin, M. Nodine, M. Rashid, M. Rusinkiewicz, R. Shea, C. Unnikrishnan, A. Unruh, and D. Woelk, InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments, 205-216, Morgan Kaufmann, San Francisco, 1998.
[3] M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam, and S. Slattery, 'Learning to extract symbolic knowledge from the world wide web', in Proceedings of the 15th National Conference on Artificial Intelligence (AAAI-98), (1998).
[4] M. Genesereth and N. Nilsson, Logical Foundations of Artificial Intelligence, Morgan Kaufmann, Palo Alto, CA, 1987.
[5] M. Iwasume, K. Shirakami, H. Takeda, and T. Nishida, 'IICA: An ontology-based internet navigation system', AAAI-96 Workshop on Internet-based Information Systems, (1996).
[6] C. Knoblock, Y. Arens, and C. Hsu, 'Cooperating agents for information retrieval', in Proceedings of the 2nd International Conference on Cooperative Information Systems, Toronto, Canada, University of Toronto Press, (1994).
[7] D. Lenat and R.V. Guha, Building Large Knowledge-Based Systems, Addison-Wesley, Reading, Mass., 1990.
[8] R. MacGregor, 'The evolving technology of classification-based knowledge representation systems', in Principles of Semantic Networks: Explorations in the Representation of Knowledge, Morgan Kaufmann, 1991.
[9] Magellan. http://www.lib.ua.edu/maghelp.htm, 1998.
[10] Magellan. http://magellan.mckinley.com, 1999.
[11] H.S. Pinto and J.P. Martins, 'Reusing ontologies', AAAI 2000 Spring Symposium on Bringing Knowledge to Business Processes, 77-84, (2000).
[12] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.
[13] S. Staab, J. Angele, S. Decker, M. Erdmann, A. Hotho, A. Maedche, H. Schnurr, R. Studer, and Y. Sure, 'Semantic Community Web Portals', 1-13, Computer Networks (Special Issue: WWW9 - Proceedings of the 9th International World Wide Web Conference), Elsevier, Amsterdam, 2000.
[14] P. Weinstein and W. Birmingham, 'Agent communication with differentiated ontologies: eight new measures of description compatibility', Technical report, Department of Electrical Engineering and Computer Science, University of Michigan, (1999).
[15] A.B. Williams and C. Tsatsoulis, 'Diverse web ontologies: What intelligent agents must teach to each other', AAAI 1999 Spring Symposium on Intelligent Agents in Cyberspace, 115-120, (1999).
[16] X. Zhu, Incorporating Quality Metrics in Agent-Based Centralized/Decentralized Information Retrieval on the World Wide Web, Ph.D. dissertation, University of Kansas, 1999.