=Paper= {{Paper |id=None |storemode=property |title=Learning of Ontologies from the Web: the Analysis of Existent Approaches |pdfUrl=https://ceur-ws.org/Vol-701/paper3.pdf |volume=Vol-701 |dblpUrl=https://dblp.org/rec/conf/icdt/Omelayenko01 }} ==Learning of Ontologies from the Web: the Analysis of Existent Approaches== https://ceur-ws.org/Vol-701/paper3.pdf
        Learning of Ontologies for the Web: the Analysis of Existent
                                Approaches
                                                      Borys Omelayenko

                                              Vrije Universiteit Amsterdam,
                                     Division of Mathematics and Computer Science,
                               De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
                              Email: borys@cs.vu.nl, URL: www.cs.vu.nl/~borys


In: Proceedings of the International Workshop on Web Dynamics, held in conj.
with the 8th International Conference on Database Theory (ICDT’01), London,
UK, 3 January 2001

                                  Abstract
  The next generation of the Web, called the Semantic Web, has to improve the
  Web with semantic (ontological) page annotations to enable knowledge-level
  querying and searches. Manual construction of these ontologies will require
  tremendous effort, which forces the future integration of machine learning
  with knowledge acquisition to enable highly automated ontology learning. In
  this paper we present the state of the art in the field of ontology learning
  from the Web to see how it can contribute to the task of semantic Web
  querying. We consider three components of the query processing system:
  natural language ontologies, domain ontologies and ontology instances. We
  discuss the requirements for machine learning algorithms to be applied for
  the learning of the ontologies of each type from the Web documents, and
  survey the existing ontology learning and other closely related approaches.

                                Introduction
Nowadays the Internet contains a huge collection of data stored in billions of
pages and it is used for the worldwide exchange of information. The pages
represent mainly textual data and have no semantic annotation. Thus, query
processing based mostly on inefficient keyword-matching techniques becomes a
bottleneck of the Web.
  Tim Berners-Lee coined the vision of the next version of the Web, called the
Semantic Web [Berners-Lee&Fischetti, 1999], that would provide much more
automated services based on machine-processable semantics of data and
heuristics that make use of these metadata. The explicit representation of the
semantics of data accompanied by domain theories (i.e. ontologies) will enable
a Knowledge Web that provides a qualitatively new level of service. It will
weave together a net linking incredibly large segments of human knowledge and
complement it with machine processability.
   This will require enrichment of the entire Web with many ontologies that
capture the domain theories. Their manual construction will require enormous
human effort, thus ontology acquisition becomes a bottleneck of the Semantic
Web.
   Recently ontologies have become a hot topic in the areas of knowledge
engineering, intelligent information integration, knowledge management, and
electronic commerce [Fensel, 2000]. Ontologies are knowledge bodies that
provide a formal representation of a shared conceptualization of a particular
domain. Current research focuses on Web-based ontology representation
languages based on XML and RDF standards and further application of ontologies
on the Web (see [Decker et al., 2000]). Ontology learning (OL) is an emerging
field aimed at assisting a knowledge engineer in ontology construction and
semantic page annotation with the help of machine learning (ML) techniques.
   In the next section of the paper we discuss the general scheme for semantic
querying of the Web with three ontological components required; the subsequent
sections discuss OL tasks and available ML techniques. The survey section
describes the applications of ML techniques for the learning of different
ontology types, and we conclude with a comparison of the approaches.

                        Semantic Querying of the Web
In this section we discuss the general scheme for semantic querying of the
Web, the types of




ontologies involved in the query process, and basic ML algorithms available
for learning the ontologies.

The General Scheme
The general scheme of the querying process is presented in Figure 1. First,
the user formulates the query in natural language. Then the query is
transformed into a formal query with the help of the natural language ontology
and the domain ontology. The Web pages are (possibly incomplete) instances of
some domain ontologies, and they will contain pieces of data semantically
marked up according to the underlying domain ontology. The query processor has
to find the mapping between the concepts of the initial query, the domain
model used to expand the query, and the ontology instances on the Web. This
mapping will be non-trivial and will require inference over domain ontologies.

Ontological Components
There are a number of domains where ontologies have been applied successfully.
The three ontologies that are important for querying the Web (see Figure 1)
are:
   Natural Language Ontologies (NLO) that contain lexical relations between
the language concepts; they are large in size and do not require frequent
updates. Usually they represent the background knowledge of the system and are
used to expand the user’s queries. These ontologies belong to the so-called
‘horizontal’ ontologies that try to capture all possible concepts, but they do
not provide a detailed description of each of the concepts.
   Domain ontologies capture knowledge of one particular domain, e.g. a
pharmacological ontology or a printer ontology. These ontologies provide a
detailed description of the domain concepts from a restricted domain
(so-called ‘vertical’ ontologies). Usually they are constructed manually but
different learning techniques can assist the (especially inexperienced)
knowledge engineer.
   Ontology instances represent the main piece of knowledge presented in the
future Semantic Web. As today’s Web is full of HTML documents of different
layout, the future Web will be full of instances of different domain
ontologies. The ontology instances will serve as the Web pages and will
contain the links to other instances (similar to the links to other Web
pages). They can be generated automatically and frequently updated (e.g. a
company profile from the Yellow Pages catalogue will be updated frequently
while the ontology of the catalogue will remain the same).
   The Semantic Web will require creation and maintenance of a huge number of
the ontologies of all three types, and the following ontology learning tasks
will become important.

                          Ontology Learning Tasks
Previous research in the area of ontology acquisition proposed many guidelines
for manual ontology development (see [Lopez, 1999] for an overview) that
organize the work of the knowledge engineer, but they pay no attention to the
process by which humans acquire the ontology. The human experts have to evolve
the best knowledge acquisition process themselves from their past experience
acquired by passing through numerous case studies. Thus, we have to separate
several tasks in OL on our own:
   Ontology creation from scratch by the knowledge engineer. In this task ML
assists the knowledge engineer by suggesting the most important relations in
the field or checking and verifying the constructed knowledge bases.
   Ontology schema extraction from Web documents. In this task ML systems take
the data and meta-knowledge (like a meta-ontology) as input and generate the
ready-to-use ontology as output with the possible help of the knowledge
engineer.
   Extraction of ontology instances populates given ontology schemas and
extracts the instances of the ontology presented in the Web documents. This
task is similar to information extraction and page annotation and can apply
the techniques developed in these areas.
   Ontology integration and navigation deals with reconstructing and
navigating in large and possibly machine-learned knowledge bases. For example,
the task can be to change the propositional-level knowledge base of the
machine learner into a first-order knowledge base.
   The ontology update task updates some parts of the ontology that are
designed to be updated (like formatting tags that have to track the changes
made in the page layout).
   Ontology enrichment (or ontology tuning) includes automated modification of
minor relations in an existing ontology. This does not change major concepts
and structures but makes the ontology more precise. Unlike ontology update,
this task deals with




the relations that are not specially designed to be updated.
   The first three tasks relate to ontology acquisition tasks in knowledge
engineering, and the next three to ontology maintenance tasks. In this paper
we do not consider ontology integration and ontology update tasks.

                        Machine Learning Techniques
The main requirement for ontology representation is that ontologies must be
symbolic, human-readable and understandable. This forces us to deal only with
symbolic learning algorithms that make generalizations and skip other methods,
like neural networks, genetic algorithms and the family of 'lazy learners'
(see [Mitchell, 1997] for an introduction to ML and the algorithms mentioned
below). The foreseen potentially applicable ML algorithms include:
   Propositional rule learning algorithms that learn association rules, or
other attribute-value rules. The algorithms are generally based on a greedy
search for the attribute-value tests that can be added to the rule while
preserving its consistency with the set of training instances. Decision tree
learning algorithms, mostly represented by the C4.5 algorithm and its
modifications, are used quite often to produce high-quality
propositional-level rules. The algorithm uses statistical heuristics over the
training instances, like entropy, that guide hill-climbing search of the
decision trees. Learned decision trees are equivalent to the sets of
propositional-level classification rules that are conjunctions of
attribute-value tests.
   Bayesian learning is mostly represented by the Naive Bayes classifier. It
is based on Bayes' theorem and generates probabilistic attribute-value rules
based on the assumption of conditional independence between the attributes of
the training instances.
   First-order logic rule learning induces the rules that contain variables,
called first-order Horn clauses. The algorithms usually belong to the FOIL
family of algorithms and perform general-to-specific hill-climbing search for
the rules that cover all available positive training instances. With each
iteration the algorithm adds one more literal to specialize the rule until it
avoids all negative instances.
   Clustering algorithms group the instances together based on the similarity
or distance measures between a pair of instances defined in terms of their
attribute values. Various search strategies can guide the clustering process.
Iterative application of the algorithm will produce hierarchical structures of
the concepts.
   The knowledge bases built by ML techniques substantially differ from the
knowledge bases that we call ontologies. The differences stem from the fact
that ontologies are constructed to be used by humans, while ML knowledge bases
are only used automatically. This leads to several differences listed in
Table 1.
   To enable automatic OL we must adapt ML techniques so that they can
automatically construct ontologies with the properties of manually constructed
ontologies. Thus, OL techniques have to possess the following properties,
which we trace in the survey:
- ability to interact with a human to acquire his

[Figure 1 diagram; labels: Natural Language Query, Natural Language Ontology,
Domain Ontologies, Formal Semantic Query to the Web, Web pages (http://www.cs…)
with ontology instances, Instance-of links]

           Figure 1. Semantic querying of the Web




    knowledge and to assist him; this requires readability of internal and
    external results of the learner;
- ability to use complex modelling primitives;
- ability to deal with complex solution space, including composed solutions.
  Each ontology type has special requirements for ML algorithms applied for
learning these types of ontologies.

  Table 1. Manual and machine representations

  Machine-learned knowledge bases      | Manually constructed ontologies
  -------------------------------------+--------------------------------------
  Modelling primitives
  Simple and limited. For example,     | Rich set of modelling primitives
  decision tree learning algorithms    | (frames, subclass relation, rules
  generate the rules in the form of    | with rich set of operations,
  conjunctions over attribute-value    | functions, etc.).
  tests.                               |
  -------------------------------------+--------------------------------------
  Knowledge base structure
  Flat and homogeneous.                | Hierarchical, consists of various
                                       | components with subclass-of, part-of
                                       | and other relations.
  -------------------------------------+--------------------------------------
  Tasks
  Classification and clustering        | Classification task requires mapping
  that map the objects described by    | of objects into a tree of structured
  the attribute-value pairs into a     | classes. It can require construction
  limited and unstructured set of      | of class descriptions instead of
  class or cluster labels.             | selection.
  -------------------------------------+--------------------------------------
  Problem-solving methods
  Very primitive, based on simple      | Complicated, require inference over
  search strategies, like              | a knowledge base with a rich
  hill-climbing in decision tree       | structure, often domain-specific and
  learning.                            | application-specific.
  -------------------------------------+--------------------------------------
  Solution space
  The non-extensible, fixed set of     | Extensible set of primitive and
  class labels.                        | compound solutions.
  -------------------------------------+--------------------------------------
  Readability of the knowledge bases to a human
  Not required. They can be used only  | Required. They may be (at least
  automatically and only in special    | potentially) used by humans.
  domains.                             |

   NLOs contain hierarchical clustering of the language concepts (words and
their senses). The set of relations (slots) used in the representation is
limited. The main relations between the concepts are: ‘synonyms’, ‘antonyms’,
‘is-a’, ‘part-of’. The verbs can contain several additional relations to
describe the actions. Concept features are usually represented by adjectives
or adjective nouns (like ‘strong-strength’). Thus the ontology can be
represented by frames with a limited structure.
   NLOs define the first and basic interpretation of the user’s query, and
they must link the query to specific terminology and a specific domain
ontology. General language knowledge contained in a general-purpose NLO like
WordNet [Fellbaum, 1998] is not sufficient for such a purpose. In order to
achieve this, much research effort has focused on NLO enrichment. NLO
enrichment from domain texts is a suitable task for ML algorithms, because it
provides a good set of training data for the learner (the corpus).
   NLOs do not require either frequent or automatic updates. They are updated
from time to time with intensive cooperation from a human, thus ML algorithms
for NLO learning are not required to be fast.
   Domain ontologies use the whole set of modelling primitives, like
(multiple) inheritance, numerous slots and relations, etc. They are complex in
structure and are usually constructed manually. Domain ontology learning
concentrates on discovering statistically valid patterns in the data in order
to suggest them to the knowledge engineer who guides the ontology acquisition
process. In the future we would like to see an ML system that guides this
process and asks the human to validate pieces of the constructed ontology.
   ML will be used to predict the changes made by the human to reduce the
number of interactions. The input of this learner will consist of the ontology
being constructed, human suggestions and domain knowledge.
   Domain ontologies require more frequent updates than NLOs (just as new
technical objects appear before the community has agreed about the surrounding
terminology); their updates are done manually and ML algorithms that assist
this process are also not required to be fast.
   Ontology instances are contained in the Web pages marked up with the
concepts of the underlying domain ontology with information extraction or
annotation rules. The instances will require more frequent updates than domain
ontologies or NLOs (e.g. a company profile in a catalogue will be updated
faster than the ontology of a company catalogue).
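
The frame-like NLO representation described above (concepts with a small,
fixed set of relation slots) can be illustrated with a minimal sketch. Only
the four slot names come from the text; the class name ConceptFrame, the
helpers ancestors and build_toy_lexicon, and the toy data are assumptions made
for illustration and do not reproduce the schema of WordNet or any other NLO.

```python
from dataclasses import dataclass, field

# The four slot names come from the text; the class, helper names and toy data
# are hypothetical and do not mirror WordNet or any other real NLO.
RELATIONS = ("synonyms", "antonyms", "is-a", "part-of")


@dataclass
class ConceptFrame:
    """One language concept (word sense) with a fixed, limited set of slots."""
    name: str
    slots: dict = field(default_factory=lambda: {r: set() for r in RELATIONS})

    def relate(self, relation: str, target: str) -> None:
        """Fill a slot; relations outside the fixed set are rejected."""
        if relation not in self.slots:
            raise ValueError(f"unsupported relation: {relation}")
        self.slots[relation].add(target)


def ancestors(lexicon: dict, name: str) -> set:
    """Return every concept reachable from `name` by following 'is-a' links."""
    found = set()
    pending = [name]
    while pending:
        frame = lexicon.get(pending.pop())
        if frame is None:
            continue  # concept named in a slot but not defined as a frame
        for parent in frame.slots["is-a"]:
            if parent not in found:
                found.add(parent)
                pending.append(parent)
    return found


def build_toy_lexicon() -> dict:
    """Assemble a four-concept toy lexicon (illustrative data only)."""
    lexicon = {n: ConceptFrame(n) for n in ("wheel", "car", "vehicle", "artifact")}
    lexicon["wheel"].relate("part-of", "car")
    lexicon["car"].relate("synonyms", "automobile")
    lexicon["car"].relate("is-a", "vehicle")
    lexicon["vehicle"].relate("is-a", "artifact")
    return lexicon


if __name__ == "__main__":
    toy = build_toy_lexicon()
    print(sorted(ancestors(toy, "car")))
```

Restricting the slots to a fixed tuple mirrors the limited relation set of an
NLO; a domain ontology, with its richer modelling primitives, would not fit
into such a fixed frame.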




                                 The Survey
This section presents the survey of existing techniques related to the
learning and enriching of the NLO from the Web, Web-based support for domain
ontology construction, and extraction of ontology instances. These approaches
cover various issues in the field and show different applications of ML
techniques.

Learning of NLO

Many conceptual clustering methods can be used for ontology construction but
no methodology or tool has been developed to support the elaboration of
conceptual clustering methods that build task-specific ontologies. The Mo'K
tool [Bisson et al., 2000] supports development of conceptual clustering
methods for ontology building. The paper focuses on elaboration of the
clustering methods to perform human-assisted learning of conceptual
hierarchies from corpora. The input for the clustering methods is represented
by the classes (nouns) and their attributes (grammatical relations) received
after syntactical analysis of the corpora, which are in turn characterized by
the frequency with which they occur in the corpora.
   The algorithm uses bottom-up clustering to group 'similar' objects to
create the classes and to subsequently group 'similar' classes to form the
hierarchy. The user may adjust several parameters of the process to improve
performance: select input examples and their attributes, level of pruning, and
distance evaluation functions. The paper presents an experimental study that
illustrates how learning quality depends on the different combinations of
parameters.
   While the system allows the user to tune its parameters, it performs no
interactions during clustering. It builds the hierarchy of the frames that
contain lexical knowledge about the concepts. The input corpora can be
naturally found on the Web, and the next paper presents a way of integrating
NLO enrichment with the Web search of the relevant texts.
   The system [Agirre et al., 2000] exploits the text from the Web to enrich
the concepts in the WordNet [Fellbaum, 1998] ontology. The proposed method
constructs lists of topically related words for each concept in WordNet, where
each word sense has one associated list of related words. For example, the
word ‘waiter’ has two senses: the waiter in the restaurant (related words:
waiter–restaurant, menu, dinner); and a person who waits (related words:
waiter–station, airport, hospital). The system queries the Web for the
documents related to each concept from WordNet and then builds a list of words
associated with the topic. The lists are called topic signatures and contain
the weight (called strength) of each word. The documents are retrieved by
querying the Web with the AltaVista search engine by asking for the documents
that contain the words related to a particular sense and do not contain the
words related to the other senses of the word. A typical query may look
something like ‘waiter AND (restaurant OR menu) AND NOT (station OR airport)’
to get the documents that correspond to the ‘waiter, server’ concept.
   NLOs, like EuroWordNet or WordNet, help in the understanding of natural
language queries and in bringing semantics to the Web. But in specific domains
general language knowledge becomes insufficient and that requires creation of
domain-specific NLOs. Early attempts to create such domain ontologies to
perform information extraction from texts failed because the experts tended to
create the ontologies using a lot of a priori information that was not
reflected in the texts. The paper [Faure&Poibeau, 2000] suggests improving NLO
by unsupervised domain-specific clustering of texts from corpora. The system
Asium described in the paper cooperatively learns semantic knowledge from
texts which are syntactically parsed, without previous manual processing. It
uses the syntactic parser Sylex to generate the syntactical structure of the
texts. Asium uses only head nouns of complements and links to verbs and
performs bottom-up breadth-first conceptual clustering of the corpora to form
the concepts of ontology level. On each level it allows the expert to validate
and/or label the concepts. The system generalizes the concepts that occur in
the same role in the texts and uses generalized concepts to represent the
verbs.
   Thus, the state of the art in NLO learning looks quite optimistic: not only
does a stable general-purpose NLO exist but so do techniques for automatically
or semiautomatically constructing and enriching domain-specific NLOs.

Learning of Domain Ontologies
  Domain-specific NLO significantly improves semantic Web querying but in
specific domains general language knowledge becomes insufficient




and query processing requires special domain                   in terms of the common understanding of the
ontologies.                                                    domain, i.e. in the terms of the domain ontology.
   The paper [Maedche&Staab, 2000] presents an                    The system for ontology-based induction of high-
algorithm for semiautomatic ontology learning from             level classification rules [Taylor et al., 1997] goes
texts. The learner uses a kind of algorithm for                further and uses ontologies not only to explain the
discovering generalized association rules. The input           discovered rules for a user, but also to guide learning
data for the learner is a set of transactions, each of         algorithms. The algorithm successively generates
which consists of a set of items that appear together          queries for an external learner ParkaDB, that uses
in the transaction. The algorithm extracts association         the domain ontology and the input data to check
rules represented by sets of items that occur together         consistency of the query, and consistent queries
sufficiently often and presents the rules to the               become classification rules. The query generation
knowledge engineer. For example, shopping                      process continues until the set of queries covers the
transactions may include the items purchased                   whole data set. Currently the domain ontologies used
together. The association rule may say that ‘snacks            there are restricted to simple concept hierarchies
are purchased together with drinks’ rather than                where each attribute has its own hierarchy of
‘crisps are purchased with beer’. The algorithm uses           concepts. On the bottom level the hierarchy contains
two parameters: support and confidence for a rule.             attribute values present in the data, the next level
Support is the percentage of transactions that                 contains a generalization about these attribute
contain all the items mentioned in the rule, and               values. This forms one-dimensional concepts, and a
confidence for the rule X → Y is the conditional               domain ontology of a very specialized type.
percentage of transactions where Y is seen, given                 The approach uses a knowledge-base system and
that X also appeared in the transaction. The ontology          its inference engine to validate classification rules. It
learner [Maedche&Staab, 2000] applies this method              generates the rules in terms of the underlying
straightforwardly for ontology learning from texts to          ontology, where the ontology still has a very
support the knowledge engineer in the ontology                 restricted type.
acquisition environment.                                          The paper [Webb, Wells, Zheng, 1999]
   The main problem in applying ML algorithms for              experimentally demonstrates how the integration of
OL is that the knowledge bases constructed by the              machine learning techniques with knowledge
ML algorithms have a flat homogeneous structure,               acquisition from experts can both improve the
and very often have a propositional-level                      accuracy of the developed domain ontology and
representation (see Table 1). Thus several efforts             reduce development time. The paper analyses three
focus on improving ML algorithms in terms of                   types of knowledge acquisition system: the systems
ability to work with complicated structures.                   for manual knowledge acquisition from experts, ML
   The first step in applying ML techniques to                 systems and the integrated systems built for two
discover hierarchical relations between textually              domains. The knowledge bases were developed by
described classes is taken with the help of Ripple-            experienced computer users who were novices in
Down Rules [Suryanto&Compton, 2000]. The                       knowledge engineering.
authors start with the discovery of the class relations           The knowledge representation scheme was
between classification rules. Three basic relations            restricted to flat attribute-value classification rules
are considered: intersection (called subsumption in            and the knowledge base was restricted to a set of
marginal cases) of classes, mutual-exclusivity, and            production rules. The rationale behind this
similarity. For each possible relation they define a           restriction was based on the difficulties that novice
measure to evaluate the degree of subsumption,                 users experience when working with first-order
mutual exclusivity, and similarity between the                 representations. The ML system used the C4.5
classes. For input, the measures use the attributes of         decision tree learning algorithm to support the
the rules that lead to the classes. After the measures         knowledge engineer and to construct the knowledge
between all classes have been discovered, simple               bases automatically.
techniques can be used to create the hierarchical                 The use of machine learning with knowledge
(taxonomic) relations between the classes.                     acquisition by experts led to the production of more
   Knowledge extraction from the Web (data mining              accurate rules in significantly less time than
from the Web) uses domain ontologies to represent              knowledge acquisition alone (up to eight times less).
the extracted knowledge to the user of the knowledge           The complexity of the constructed knowledge bases




was mostly the same for all systems. The questionnaire presented in the
paper showed that the users found the ML facilities useful and thought that
they made the knowledge acquisition process easier.
   The prospects for future research listed in [Webb, Wells, Zheng, 1999]
point to ‘a more ambitious extension of this type of study that would
examine larger scale tasks that included the formulation of appropriate
ontologies’.
   Learning of the domain ontologies is far less developed than NLO
improvement. The acquisition of the domain ontologies is still guided by a
human knowledge engineer, and automated learning techniques play a minor
role in knowledge acquisition. They have to find statistically valid
dependencies in the domain texts and suggest them to the knowledge engineer.

Learning of Ontology Instances
In this subsection we survey several methods for learning of the ontology
instances.
   The traditional propositional-level ML approach represents knowledge
about the individuals as a list of attributes, with each individual being
represented by a set of attribute-value pairs. The structure of ontology
instances is too rich to be adequately captured by such a representation.
The paper [Bowers et al., 2000] uses a typed, higher-order logic to
represent the knowledge about the individuals.
   In a classical setting the algorithm C4.5 takes the instances described
by attribute-value pairs and produces a tree whose nodes are attribute-value
tests. The authors propose replacing the attribute-value dictionary with a
more expressive one that consists of simple data types, tuples, sets, and
graphs. The method [Bowers et al., 2000] uses a modified C4.5 learner to
generate a classification tree that consists of tests on these structures,
as opposed to attribute-value tests in a classical setting. Experiments
showed that on the data sets with structured instances the performance of
this algorithm is comparable to standard C4.5, but task-oriented
modifications of C4.5 perform much better.
   The system CRYSTAL [Soderland et al., 1995] extends the ideas of the
previous system AutoSlog, which showed a great performance increase (about
200 times better than the manual system) on the creation of concept node
definitions for a terrorism domain. CRYSTAL uses an even richer set of
modelling primitives and creates the text extraction and mark-up rules, with
a given domain model as input, by generalizing the semantic mark-up of
manually marked-up training corpora. The manually created mark-up is
automatically converted into a set of case frames called ‘concept nodes’
using a dictionary of rules that can be present in the concept node. The
concept nodes represent the ontology instances, and the domain-specific
dictionary of rules defines the list of allowable slots in the ontology
instance.
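   As a rough illustration of the concept-node representation just
described, the following minimal Python sketch models a concept node as a
set of slot-value pairs checked against a dictionary of allowable slots. It
also forms a parent node from the attributes that two nodes have in common,
which is the pairwise generalization step this survey attributes to CRYSTAL.
The slot names, the example values, and the function names
`make_concept_node` and `generalize` are hypothetical and are not taken from
[Soderland et al., 1995]; the real system works on a much richer
representation.

```python
from typing import Dict, FrozenSet

# Hypothetical slot dictionary: the slots an ontology instance may carry.
ALLOWED_SLOTS: FrozenSet[str] = frozenset({"event", "agent", "target", "location"})

# A concept node is modelled here as a plain mapping from slot name to value.
ConceptNode = Dict[str, str]


def make_concept_node(**slots: str) -> ConceptNode:
    """Build a concept node, rejecting slots that the dictionary does not allow."""
    unknown = set(slots) - ALLOWED_SLOTS
    if unknown:
        raise ValueError(f"slots not in dictionary: {sorted(unknown)}")
    return dict(slots)


def generalize(a: ConceptNode, b: ConceptNode) -> ConceptNode:
    """Return a parent node holding only the slot-value pairs shared by a and b."""
    return {slot: value for slot, value in a.items() if b.get(slot) == value}


# Two made-up instances that agree on two slots and differ on the rest.
node_a = make_concept_node(event="bombing", agent="guerrillas", target="embassy")
node_b = make_concept_node(event="bombing", agent="guerrillas", location="capital")

parent = generalize(node_a, node_b)
print(parent)  # {'event': 'bombing', 'agent': 'guerrillas'}
```

   Keeping only the shared slot-value pairs is the simplest reading of
‘the attributes that both classes have in common’; a richer representation
language would allow many more ways to relax a pair of nodes, which is where
a large branching factor in the search would come from.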

    Table 2. Comparison of the ontology learning approaches

                              Type         OL Task            ML technique        Modifications of ML techniques
                                                                                  Human        Complex modelling    Complex solution
  Approach                    NLO  DO  OI  Cre  Ext  IE  Enr  Prop  Bay  FO  Clu  interaction  primitives           space
  [Bisson et al., 2000]       X                 X                            X    Partial      No                   Concept hierarchy
  [Faure&Poibeau, 2000]       X                          X                   X    Yes          Simplified frames    Simplified frames
  [Agirre et al., 2000]       X                          X                   X    No           No                   No
  [Junker et al., 1999]                X             X                   X        No           Several predicates   No
  [Craven et al., 2000]                X             X              X    X        No           No                   No
  [Bowers et al., 2000]                X             X        X                   No           Yes, rich structure  Yes, rich structure
  [Taylor et al., 1997]            X            X             X                   No           Yes, but restricted  No
  [Webb, Wells, Zheng, 1999]       X       X                  X                   Yes          No                   No
  [Soderland et al., 1995]         X            X    X        X              X    No           Yes                  Yes
  [Maedche&Staab, 2000]            X            X             X                   No           No                   No

  Type: NLO; DO = Domain Ontologies; OI = Ontology Instances.
  OL Task: Cre = Creation; Ext = Extraction; IE = Instance Extraction; Enr = Enrichment.
  ML technique: Prop = Propositional learn.; Bay = Bayesian learning; FO = First-Order Rule learn.; Clu = Clustering.




   After formalizing the instance level of the hierarchy, CRYSTAL performs
a search-based generalization of the concept nodes. A pair of nodes is
generalized by creating a parent class with the attributes that both
classes have in common.
   The knowledge representation language for the concept nodes is very
expressive, which leads to an enormous branching factor for the search
performed during the generalization. The system stores the concept nodes in
a way that best suits the distance measure function, and therefore performs
reasonably efficiently. Experiments on a medical domain showed that the
number of positive training instances required for a good recall was
limited: beyond one to two thousand instances, the recall measure no longer
grows significantly.
   The system performs two stages necessary for OL: it formalizes ontology
instances from the text and generates a concept hierarchy from these
instances.
   A systematic study of the extraction of ontology instances from Web
documents was carried out in the project Web-KB [Craven et al., 2000]. In
their paper the authors took the ontology of an academic web site and
populated it with actual instances and relations from the web sites of CS
departments. The paper targets three learning tasks:
   (1) recognizing class instances from the hypertext documents guided by
the ontology;
   (2) recognizing relation instances from the chains of hyperlinks;
   (3) recognizing class and relation instances from the pieces of
hypertext.
   The tasks are dealt with using two supervised learning approaches: the
Naive Bayes algorithm and a first-order rule learner (modified FOIL).
   The system automatically creates a mapping between the manually
constructed domain ontology and the Web pages by generalizing from the
training instances. The system performance was surprisingly good for the
restricted domain of a CS website where it was tested.
   The major ML techniques applied to text categorization reached a certain
degree of effectiveness [Junker et al., 1999], but going beyond that level
appeared difficult and was only possible in a small number of isolated
cases with substantial heuristic modification of the learners. This shows
the need for combining these modifications in a single framework based on
first-order rule learning.
   The paper [Junker et al., 1999] defines three basic types (one for text,
one for word, and one for text position) and three predicates governing
these types, so that text categorization rules can be treated as logical
programs and first-order rule learning algorithms can be applied. The rules
learned are derived from five basic constructs of a logical pattern
language used in the framework to define the ontologies. The learned rules
are directly exploited in automated annotation of the documents, which then
become the ontology instances.
   The task of learning of the ontology instances fits nicely into an ML
framework, and there are several successful applications of ML algorithms
for this. However, these applications are either strictly dependent on the
domain ontology or populate the mark-up without relating to any domain
theory. A general-purpose technique for extracting ontology instances from
texts given the domain ontology as input has still not been developed.

                  Conclusions
The above case study is summarized in Table 2. The first column specifies
the approach; the next columns represent the ontological component of the
Web query system, the OL tasks, and the relevant ML technique respectively.
The last three columns describe the degree to which the system interacts
with the user and the properties of the knowledge representation scheme.
   From the table we see that a number of systems related to the
natural-language domain deal with domain-specific tuning and enrichment of
the NLOs with various clustering techniques.
   Learning of the domain ontologies has so far been done only at the
propositional level, and first-order representations are used only in the
extraction of ontology instances (see Table 2).
   There are several approaches in the field of domain ontology extraction,
but the systems used there are variants of propositional-level ML
algorithms.
   The OL papers seldom modify the applied ML algorithm to handle human
interaction, complex modelling primitives and a complex solution space
together. Only one paper [Faure&Poibeau, 2000] makes all three
modifications of the ML algorithm for NLO learning, as also shown in the
table.
   Research in OL mostly proceeds by straightforward application of the ML
algorithms. This was a successful strategy for a start, but substantial
modifications of the ML algorithms will be needed for OL tasks.




                Acknowledgements
  The author would like to thank Dieter Fensel for helpful discussions and
comments, and Heiner Stuckenschmidt and four anonymous reviewers for their
comments.

                   References

[Agirre et al., 2000] E. Agirre, O. Ansa, E. Hovy, D. Martinez: Enriching
very large ontologies using the WWW. In: S. Staab, A. Maedche, C. Nedellec,
P. Wiemer-Hastings (eds.), Proceedings of the Workshop on Ontology
Learning, 14th European Conference on Artificial Intelligence ECAI'00,
Berlin, Germany, August 20-25, 2000.

[Berners-Lee&Fischetti, 1999] T. Berners-Lee, M. Fischetti. Weaving the
Web. Harper, San Francisco, 1999.

[Bisson et al., 2000] G. Bisson, C. Nedellec, D. Canamero: Designing
Clustering Methods for Ontology Building - The Mo'K Workbench. In: S.
Staab, A. Maedche, C. Nedellec, P. Wiemer-Hastings (eds.), Proceedings of
the Workshop on Ontology Learning, 14th European Conference on Artificial
Intelligence ECAI'00, Berlin, Germany, August 20-25, 2000.

[Bowers et al., 2000] A. Bowers, C. Giraud-Carrier, J. Lloyd:
Classification of Individuals with Complex Structure. In: Proceedings of
the Seventeenth International Conference on Machine Learning (ICML'2000),
Stanford, US, June 29-July 2, 2000, pp. 81-88.

[Craven et al., 2000] M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T.
Mitchell, K. Nigam, S. Slattery: Learning to construct knowledge bases from
the World Wide Web. Artificial Intelligence, 118: 69-113, 2000.

[Decker et al., 2000] S. Decker, D. Fensel, F. van Harmelen, I. Horrocks,
S. Melnik, M. Klein, J. Broekstra: Knowledge Representation on the Web. In:
Proceedings of the 2000 International Workshop on Description Logics
(DL2000), Aachen, Germany, August, 2000.

[Evett, 1994] M. Evett. PARKA: A System for Massively Parallel Knowledge
Representation. Computer Science Department, University of Maryland at
College Park, 1994.

[Faure&Poibeau, 2000] D. Faure, T. Poibeau: First experiments of using
semantic knowledge learned by ASIUM for information extraction task using
INTEX. In: S. Staab, A. Maedche, C. Nedellec, P. Wiemer-Hastings (eds.),
Proceedings of the Workshop on Ontology Learning, 14th European Conference
on Artificial Intelligence ECAI'00, Berlin, Germany, August 20-25, 2000.

[Fellbaum, 1998] C. Fellbaum. WordNet: An Electronic Lexical Database. The
MIT Press, 1998.

[Fensel, 2000] D. Fensel. Ontologies: Silver Bullet for Knowledge
Management and Electronic Commerce. Springer-Verlag, Berlin, 2000.

[Junker et al., 1999] M. Junker, M. Sintek, M. Rinck: Learning for Text
Categorization and Information Extraction with ILP. In: J. Cussens (ed.),
Proceedings of the 1st Workshop on Learning Language in Logic, Bled,
Slovenia, June 1999, pp. 84-93.

[Lopez, 1999] F. Lopez: Overview of Methodologies for Building Ontologies.
In: Proceedings of the IJCAI-99 workshop on Ontologies and Problem-Solving
Methods, Stockholm, Sweden, August 2, 1999.

[Maedche&Staab, 2000] A. Maedche, S. Staab: Semi-automatic Engineering of
Ontologies from Text. In: Proceedings of the Twelfth International
Conference on Software Engineering and Knowledge Engineering, Chicago,
2000.

[Mitchell, 1997] T. Mitchell. Machine Learning. McGraw Hill, 1997.

[Soderland et al., 1995] S. Soderland, D. Fisher, J. Aseltine, W. Lehnert:
Issues in Inductive Learning of Domain-Specific Text Extraction Rules. In:
Proceedings of the Workshop on New Approaches to Learning for Natural
Language Processing at IJCAI'95, Montreal, Quebec, Canada, 1995.

[Studer et al., 1998] R. Studer, R. Benjamins, D. Fensel: Knowledge
Engineering: Principles and methods. Data and Knowledge Engineering, 25:
161-197, 1998.

[Suryanto&Compton, 2000] H. Suryanto, P. Compton: Learning Classification
taxonomies from a classification knowledge based system. In: S. Staab, A.
Maedche, C. Nedellec, P. Wiemer-Hastings (eds.), Proceedings of the
Workshop on Ontology Learning, 14th European Conference on Artificial
Intelligence ECAI'00, Berlin, Germany, August 20-25, 2000.

[Taylor et al., 1997] M. Taylor, K. Stoffel, J. Hendler: Ontology-based
Induction of High Level Classification Rules. In: Proceedings of the SIGMOD
Data Mining and Knowledge Discovery workshop, Tucson, Arizona, 1997.

[Webb, Wells, Zheng, 1999] G. Webb, J. Wells, Z. Zheng: An Experimental
Evaluation of Integrating Machine Learning with Knowledge Acquisition.
Machine Learning, 31(1): 5-23, 1999.



