Rule Induction for Concept Hierarchy Alignment
                        ICHISE Ryutaro, TAKEDA Hideaki. and HONIDEN Shinichi
                                  Intelligent Systems Research Division,
                                     National Institute of Informatics,
                                     2-1-2 Hitotsubashi, Chiyoda-ku,
                                          Tokyo, 101-8430, Japan
                                    fichise,takeda,honideng@nii.ac.jp

                         Abstract                                work in regard to HICAL and present the conclusions of this
                                                                 study.
     To manage information like ontology, we usually
     use categorization with concept hierarchy. Such
     concept hierarchies are managed individual for              2 Concept Hierarchy Model
     each system due to the many differences in concept          In this section, we describe a model of the nature of con-
     hierarchies. Consequently, it is difficult to reuse         cept hierarchies. Many information management systems for
     information in computer-based systems. Here, we             use with conceptual information like ontologies and class li-
     propose a new concept alignment method for con-             braries are managed via a system of hierarchical categoriza-
     cept hierarchies as a solution to this problem, and         tion. A concept hierarchy contains only one conceptual cate-
     construct a system to evaluate the performance of           gorization criterion. We can consider the diagram on the left
     our method. The results of this experiment re-              side of Figure 1 to be a representation of a single concept
     veal that the proposed method can be used to in-            hierarchy.
     duce appropriate align rules for concept hierarchies
     and classify information into appropriate categories
                                                                              C1                                  C2
     within another concept hierarchy.


1 Introduction
The rapid advances in computer technology have allowed
us to archive much more information than ever before. To
maintain such archives, hierarchical categorization is often                                    I1
used, by which information is categorized based on a con-
                                                                                                I2
cept hierarchy. Ontologies and class libraries are examples
of such systems. The hierarchical categorization is usually                                     I3
maintained by a single organization or individual for con-
sistency. In practice, a concept hierarchy is not suitable for                Figure 1: Concept Hierarchy Model
multiple purposes, requiring that concept hierarchies be con-
structed for individual application domains. In other words,
concept hierarchies are maintained in each domain separately.       In the figure there are two different concept hierarchies (C1
This means that it is not convenient to reuse information be-    and C2 ) and three different information instances (I1 , I2 and
cause the concept categorization differs between concept hi-     I3 ). Some instances are shared between the two concept hi-
erarchies. To solve this problem, we propose a new method        erarchies and some are not. It is important to keep in mind
that allows a concept in one concept hierarchy to be aligned     that these concept hierarchies are not the same. The next step
with another concept in another concept hierarchy. In this       is to consider an appropriate way to transfer an instance from
paper, we describe a machine-learning method for aligning        C1 into C2 . In the example shown in Figure 1, C2 does not
multiple concept hierarchies.                                    contain I2 . If I2 can be placed in the concept hierarchy of
   This paper is organized as follows: A concept hierarchy       C2 , the user can then use I2 with concept hierarchy C2 . In
model is defined, followed by the proposal for the new ma-       the next section, we propose a method of defining a rule for
chine learning method used to align concepts. In the experi-     conversion from the concept hierarchy of C1 to that of C2 , so
ment section, the performance of our system, HICAL (HIer-        that an instance that has already been categorized in C1 can
archical Concept ALignment system), which is based on the        subsequently be categorized in C2 . The most important point
proposed method, is compared in a variety of settings, fol-      of this approach is that the concept hierarchy of C2 does not
lowed by a discussion of the results. We then discuss related    need to be adjusted to fit the concept hierarchy of C1 . Thus, a
                                       Category S1                  statistic” method. When a similar pair is not generated, the al-
                                   belong not belong                gorithm outputs the alignment rule between the two concept
                                    N11          N12

                                                                                                                                  
       Category       belong                                        hierarchies. We can then apply this rule to deciding whether
           S2       not belong      N21          N22                a particular instance in C1 fits the concept hierarchy in C2 .
   Table 1: Classification of Instances by Two Categories
                                                                      Input:       N10 ,     // Top category in C1
                                                                                   N20 ,     // Top category in C2
user can apply our method while continuing to use whichever                        P;        // threshold for  statistic
concept hierarchy they are accustomed to.                             Output:      R;        // Rule Set
                                                                      begin
3 Concept Alignment                                                      /* make pair for candidate */
                                                                         /* using child node or parent node */
In our proposed method of constructing an alignment rule, we               X1 := make combination(N10 ; N20 );
must first find categories that are similar to each other (“simi-          t := 1;
lar categories”); the instance will then be aligned based on the           R := ;
similarity between categories. To find similar categories, our             while Xt 6= 
algorithm starts by comparing the most general categories of                   while Xt 6= 
the two concept hierarchies. For each pair of categories, we                       I := element in Xt ;
can determine similarity based on the instances categorized in                     N1 ; N2 := two node in I ;
the two categories. For each category, we can decide whether                       /* calculate  statistic */
a particular instance belongs to that category. If the catego-                     if (N1 ; N2 )  P
rization methods of two categories are similar, then the sys-                          Xt+1 := make combination(N1 ; N2 );
tem can generate an aligning rule for them. For example, if                            R := R + I ;
one category contains 100 instances and a category in another                     fi;
concept hierarchy contains the same 100 instances, then the                       Xt := Xt        I;
algorithm can generate an aligning rule for these categories,                  end;
because they can be considered to have the same categoriza-                    t := t + 1;
tion criteria. It should be noted that, because concept hier-            end;
archies are structured as trees, we can easily categorize ac-            return R;


                                                                                                                                  
cording to a nodal structure, such that lower (more specific)         end;
categories are included in higher (more general) categories.
   To find similar categories, we used a statistical method for
determining the degree of similarity between two categoriza-
tion criteria. The “ statistic” method [Fleiss, 1973] is an                          Figure 2: Alignment Algorithm
established method of evaluating similarity between two cri-
teria. We explain this method briefly. Let us suppose that
there are two categorization criteria, S1 and S2 . As men-
tioned earlier, we can decide whether a particular informa-         4 Experimental Evaluation
tion instance belongs to a particular category or not. Con-         4.1    Data and Settings
sequently, instances are divided into four classes shown in
Table 1. Symbols N11 ; N12 ; N21 ; N22 denote numbers of in-        In order to evaluate this algorithm, we conducted three ex-
stances for each class. For example, N11 denotes the number         periments using the Yahoo! Japan [Yahoo! Japan, 2000] and
of instances which belong to both the category S1 and the cat-      LYCOS Japan [LYCOS Japan, 2000] directories as concept
egory S2 . We may logically suppose that if category S1 and         hierarchies, and the links (URLs) in each directory as infor-
S2 have the same criterion of categorization, then N12 and          mation instances. The Yahoo! directory contains approxi-
N21 become close to zero and if the two categories have a dif-      mately 41,000 categories and 224,000 URLs. LYCOS con-
ferent criterion of categorization, then N11 and N22 become         tains approximately 5,700 categories and 48,000 URLs. Ap-
close to zero. The “ statistic” method utilizes this principle     proximately 25,000 URLs are common to both Yahoo! and
to determine the similarity of categorization criteria.             LYCOS. Generally speaking, as a concept hierarchy, Yahoo!
   The relationship between the two categorization criteria is      contains more knowledge than LYCOS, however half of the
examined from “top” to “bottom”. The alignment algorithm            URLs in LYCOS are not contained in Yahoo!. This demon-
is shown in Figure 2. First, the most general categories in the     strates that even a concept hierarchy that contains an enor-
two concept hierarchies are compared using the “ statistic”.       mous amount of information does not cover all information.
If the comparison confirms that the two categories are simi-           In this study, we used the three category pairs (and sub-
lar, then the algorithm outputs an alignment rule for them. At      categories) as our two concept hierarchies. The location in
the same time, the algorithm pairs one of these two similar         Yahoo! and LYCOS are as follows:
categories with a “child” category in the other similar cate-          Yahoo! : Arts / Humanities / Literature
gory. This new pair is then evaluated recursively using the “            LYCOS : Arts / Literature
   Yahoo! : Business and Economy / Companies                                                                                                 1
                                                                                                                                                                 Yahoo to Lycos

                                                                                                                                                                           Exact Rules
                                                                                                                                                                                                            1
                                                                                                                                                                                                                               Yahoo to Lycos

                                                                                                                                                                                                                                         Exact Rules
              LYCOS : Business Industry / Company                                                                                                                         Parent Rules                                                  Parent Rules


   Yahoo! : Recreation                                                                                                                      0.8                                                           0.8


              LYCOS : Hobby Sports
                                                                                                                                             0.6                                                           0.6
   We conducted 10-fold cross validation for shared in-


                                                                                                                                  Accuracy


                                                                                                                                                                                                Accuracy
stances. Shared instances were divided into 10 data sets; 9
of these sets were used for training and the remaining set was                                                                               0.4                                                           0.4


used for testing. Ten experiments were conducted for each
data set. The parameter of significance level for the “ statis-                                                                             0.2                                                           0.2

tic” was set at 5%.
                                                                                                                                              0                                                             0
4.2             Results                                                                                                                            Criterion 1                    Criterion 2                    Criterion 1                    Criterion 2


The results of the experiments are shown in Figure 3, 4 and
                                                                                                                                                    Figure 4: Result for Company Domain
5. The vertical axis is the average accuracy of the test data.
“Exact rules” represent values of accuracy for a system that
only uses the alignment rules for the category to which the                                                                                                      Yahoo to Lycos                                                Yahoo to Lycos
                                                                                                                                              1                                                             1
instance belongs. “Parent rules” indicates that if the system                                                                                                              Exact Rules
                                                                                                                                                                          Parent Rules
                                                                                                                                                                                                                                         Exact Rules
                                                                                                                                                                                                                                        Parent Rules

does not generate an alignment rule for a category to which
the instance belongs, it will use the rule generated for the                                                                                 0.8                                                           0.8


parent category instead. “Criterion 1” indicates that the in-
stance is categorized in the same category as the test data and                                                                              0.6                                                           0.6


                                                                                                                                  Accuracy


                                                                                                                                                                                                Accuracy
“Criterion 2” indicates that the instance is categorized in the
same category or parent category as the test data. “Criterion                                                                                0.4                                                           0.4

1” is very strict criterion because the target concept hierar-
chy should have enough intermediate categories in compar-                                                                                    0.2                                                           0.2

ison with the source concept hierarchy, while “Criterion 2”
is more general and more realistic because it does not matter                                                                                 0                                                             0
which concept hierarchies are rich in categorization. Two di-                                                                                      Criterion 1                    Criterion 2                    Criterion 1                    Criterion 2


rections of alignment are presented; from Yahoo! to Lycos,
and from Lycos to Yahoo!.                                                                                                                          Figure 5: Result for Recreation Domain

                                  Yahoo to Lycos                                                Yahoo to Lycos
               1                                                             1
                                            Exact Rules                                                   Exact Rules
                                           Parent Rules                                                  Parent Rules
                                                                                                                               involve transferring instances from relatively complex cate-
              0.8                                                           0.8
                                                                                                                               gories to relatively general categories. Such a trend is likely
                                                                                                                               to result in relatively accurate rules. For example, suppose
                                                                                                                               concept hierarchy A contains category S within concept X ,
                                                                                                                               however concept hierarchy B does not divide X into cate-
              0.6                                                           0.6
   Accuracy


                                                                 Accuracy


                                                                                                                               gories. In such a case, it would be much easier to learn a
              0.4                                                           0.4
                                                                                                                               rule for “A:/X/S -> B:/X” than to learn a rule for “B:/X ->
                                                                                                                               A:/X/S”, because “B:/X” contains instances that belong in S
              0.2                                                           0.2                                                and instances that do not. The results shown in figures reflect
                                                                                                                               this. Nevertheless, in this situation our method works prop-
               0
                    Criterion 1                    Criterion 2
                                                                             0
                                                                                  Criterion 1                    Criterion 2
                                                                                                                               erly. When regarding parent categories as correct answers
                                                                                                                               (“Criterion 2”), both alignment directions exhibited almost
                                                                                                                               the same results, i.e., “B:/X -> A:/X” is learned instead of
                     Figure 3: Result for Literature Domain
                                                                                                                               “B:/X -> A:/X/S”.
                                                                                                                                  The limitation of this method is that it does not work in
                                                                                                                               cases in which the more general and more specific categories
4.3             Discussion                                                                                                     within a concept hierarchy are independent. For example,
More than 80% of the instances used in our experiments, with                                                                   let’s consider two concept hierarchies; one that classifies a
the exception of the company domain, were categorized cor-                                                                     food-related instance under foodstuff as the more general cat-
rectly by HICAL. In the company domain, more than 60%                                                                          egory and then by country of origin as the more specific cat-
of the instances were categorize appropriately. The data in                                                                    egory, and another that classifies a food-related instance by
Figure 3 5 imply that Yahoo-to-Lycos alignment was more                                                                       country of origin as the more general category and by food-
accurate than the inverse operation. As can be seen from                                                                       stuff as the more specific category. Both types of concept
the total number of categories, the categorical hierarchy in                                                                   hierarchy can be found on the concept hierarchy. In such a
Yahoo! is more complex than that of LYCOS. Therefore,                                                                          case, our current system HICAL would not work, because
for Yahoo-to-Lycos alignment, the learned rules are likely to                                                                  comparison between the two categories proceeds from gen-
eral to specific (top to bottom). To solve this problem, we           advantage of using our method is that it allows users to use
are planning to combine our current system with a bottom-up           their own concept hierarchy for categorizing all information,
method.                                                               and may serve as a powerful tool for aligning concept hierar-
                                                                      chies.
5 Related Work                                                           With these encouraging results, several research possibili-
                                                                      ties present themselves for future development of alignment
One of the systems related to HICAL is the ontology merg-             strategies. Our alignment method is based on a top-down ap-
ing/alignment system. In the merging process for ontology, a          proach. We should combine a bottom-up approach to increase
process such as our system is necessary due the requirements          accuracy. In addition, there may exist other possibilities for
of concept hierarchy management. Chimaera [McGuinness et              alignment. Extending the proposal to applying to more than
al., 2000] and PROMPT [Noy and Musen, 2000] are exam-                 three concept hierarchies needs to be investigated. In such a
ples of such systems, assisting in the combination of different       case, despite confliction between several concept hierarchies,
ontologies. However, such systems require human interac-              more hints can be obtained from other concept hierarchies.
tion for merging or alignment. In addition to this require-
ment, they are based on similarity between words, which in-           References
troduces instability. Word similarity is often biased by the
dictionaries used. In contrast, our system does not use word          [Blink, 2000] Blink. http://www.blink.com/, 2000.
similarity, instead using syntactics alone. Hence, our system         [Fleiss, 1973] Joseph L. Fleiss. Statistical Methods for Rates
has the ability to find identical concepts regardless of the cat-        and Proportions. John Wiley & Sons, 1973.
egory name or word. For example in the experiment con-                [LYCOS Japan, 2000] LYCOS                               Japan.
ducted in this study, in the literature domain, HICAL found
                                                                         http://www.lycos.co.jp/, 2000.
the relationship between the “Genji-monogatari” (a famous
Japanese story written by “Murasakishikibu”) category in LY-          [McGuinness et al., 2000] Deborah L. McGuinness, Richard
COS and the “Murasakishikibu” category in Yahoo!1. In LY-                Fikes, James Rice and Stive Wilder. An Environment
COS, classical literature is classified by title (concept cate-          for Merging and Testing Large Ontologies In Proceed-
gory), whereas in Yahoo!, poetry masters are categorized by              ings of the seventh International Conference on Principles
author. As the dictionaries commonly used do not contain                 of Knowledge Representation and Reasoning(KR2000),
such information, word-based systems would not be capable                Morgan Kaufman Publishers, 2000.
of finding title/author relationships.                                [Mori and Yamada, 1999] Mikihiko Mori and Seiji Yamada.
   The bookmark-sharing systems of Siteseer [Rucker, 1997]               Bookmark-Agent: Information Sharing of URLs In Poster
and Blink [Blink, 2000] are also similar to HICAL. The main              Proceedings of the 8th International World Wide Web
difference is the use of hierarchies for categorization. The             Conference(WWW-10), 1999.
Siteseer and Blink system only considers the number of URLs
                                                                      [Noy and Musen, 2000] Natalya F. Noy and Mark A. Musen.
(instances) in a given category, whereas our method uses hi-
                                                                         PROMPT: Algorithm and Tool for Automated Ontology
erarchical structures. One of the merits of our approach is
                                                                         Merging and Alignment. In Proceedings of the 17th Na-
that if there is no exact category into which a given URL (in-
                                                                         tional Conference on Artificial Intelligence, pages 450–
stance) fits, then the URL (instance) is mapped into the parent
                                                                         455, Austin, Texas, July–August 2000. American Asso-
category. kMedia [Takeda et al.2000 ] is another bookmark-
                                                                         ciation for Artificial Intelligence.
sharing system that uses hierarchical structures explicitly but
is dependent on similarity of words in pages. Bookmark-               [Rucker, 1997] James Rucker and Marcos J. Polanco. Site-
Agent [Mori and Yamada, 1999] uses another approach, uti-                seer: Personalized Navigation for the Web Communica-
lizing bookmarks based on keywords. As mentioned above,                  tions of the ACM, 40(3):73–75, 1997.
HICAL only uses syntactical information, not words as are             [Takeda et al.2000 ] Hideaki Takeda, Takeshi Matsuzuka and
used by a bookmark agent. HICAL is therefore capable of                  Yuichiro Taniguchi. Discovery of Shared Topics Networks
correctly categorizing different words under the same con-               among People - A Simple Approach to Find Community
cept.                                                                    Knowledge from WWW Bookmarks In Proceedings of
                                                                         the 6th Pacific Rim International Conference on Artificial
6 Conclusion                                                             Intelligence(PRICAI-2000), pages 668–678, Melbourne,
                                                                         Australia, August 28 - September 1, Springer, 2000.
In this paper, we propose a new method for aligning concept
hierarchies as a new approach to utilizing information in mul-        [Yahoo! Japan, 2000] Yahoo!                             Japan.
tiple concept hierarchies, based on statistical methods. To test         http://www.yahoo.co.jp/, 2000.
our ideas, we conducted experiments using the Yahoo! and
LYCOS categories. Our experimental results show that the
alignment rules learned by HICAL yield reliable alignments,
allowing information in one concept hierarchy to be aligned
to an appropriate position in another concept hierarchy. The
   1
     similar to the relationship between “Sherlock Holmes” and “Co-
nan Doyle”