Learning term to concept mapping through verbs: a case study

Valentina Ceausu
CRIP 5 - Paris V University
45, Rue des Saints Pères, 75006 Paris, France
ceausu@math-info.univ-paris5.fr

Sylvie Desprès
LIPN UMR CNRS 7030 - University of Paris 13
99 avenue Jean Baptiste Clément, 93430 Villetaneuse, France
sylvie.despres@lipn.univ-paris13.fr

ABSTRACT

We propose in this paper an approach to learn term to concept mappings through the joint use of verb relations and an existing ontology. It is an unsupervised solution that can be applied to any field for which an ontology modeling verbs as relations between concepts has already been created. Conceptual graphs representing sets of verb relations are learned from a natural language corpus using part-of-speech information and statistical measures. Labeling strategies are then proposed to assign terms of the corpus to concepts of the ontology, taking into account both the structure of the ontology and the extracted conceptual graphs. The resulting assignments can be used to automatically create semantic annotations of documents. A first experiment in the field of accidentology was carried out and its results are also presented.

Categories and Subject Descriptors

D.3.3: term to concept mapping, semantic annotation.

General Terms

Experimentation.

Keywords

Ontology, verb relation, concept learning.

1. INTRODUCTION

The rapid growth in the production of natural language documents calls for efficient automated approaches to finding relevant information in those documents. This paper presents an approach that uses verb relations and a domain ontology to assign terms of a given corpus to concepts of the field. These assignments can then be used in various exploitation scenarios: semantic annotation of documents, estimation of similarities between documents, and so on.

The approach is entirely automatic and unsupervised, apart from the use of a domain ontology to support the process. The task can be described as follows: let o be a domain ontology and c a collection of domain-specific texts. For this work, we assume that the ontology takes the linguistic level of its entities into account. Concepts and roles are thus labeled by terms, which are the linguistic manifestations of ontology entities in a specific language (French, English, etc.). The ontology considered here therefore has two levels: a conceptual level, describing domain-specific entities (concepts and roles), and a linguistic level, providing expressions of those entities in a given language.

The goal of the approach is to identify within c the terms representing linguistic expressions of concepts of the ontology o, so that terms identified in the corpus can be labeled by concepts of the ontology. We propose a three-step approach to carry out this labeling process:

(1) In a first stage, verb relations are extracted from the corpus. Each verb relation is composed of a verb, be it a general or a field-specific one, and a pair of terms connected by this verb.

(2) In a second stage, statistical processing is performed to structure the verb relations as conceptual graphs. As the verb is considered to be the key element of a verb relation, it is placed at the top of the conceptual graph. Terms occurring as arguments of the verb are connected to it through links representing their syntactic function, which can be subject or object.

(3) The last stage is based on the assumption that the domain ontology models the verbs of the field as relations holding between concepts. If this is the case, labeling strategies use both the ontology and the extracted conceptual graphs to assign field-specific terms to field-specific concepts.

We approach this topic by answering a number of questions: which method should be used to extract verb relations from the corpus? How can conceptual graphs be learned from the extracted verb relations? These questions are analyzed in sections 2 and 3. Given a domain ontology and a set of conceptual graphs, which strategies can be used to assign terms to concepts? This is discussed in section 4. A first experiment in the field of accidentology is described and its results are presented in section 5. Related work is presented in section 6. Conclusions and perspectives end the paper.
2. EXTRACTING VERB RELATIONS FROM THE CORPUS

To extract verb relations from the corpus, we adopted an approach based on pattern recognition. It uses part-of-speech information and consists in searching the corpus for particular associations of lexical categories. Such an association is a lexical pattern; for example, (Verb, Noun) or (Verb, Preposition, Noun) are lexical patterns.

We manually crafted a set of lexical patterns, each including a verb among other categories. Associations of words matching the patterns of this set are identified by the pattern recognition algorithm described in [4] (a minimal sketch of this matching step is given at the end of this section). The algorithm takes as input the corpus tagged by TreeTagger, see [19], and the set of lexical patterns including verbs. It is applied at sentence level and automatically generates a set of word regroupings matching those patterns (the examples in this paper are translated into English, although they come from a French corpus), such as: Verb, Preposition: diriger vers (direct to); Verb, Preposition, Noun: diriger vers place (direct to square).

The obtained word regroupings can be:

- a verb relation highlighting a domain relation, such as: véhicule diriger vers bretelle (vehicle direct to slip road);

- an incomplete verb relation, such as: piéton traverser (pedestrian crossing) or diriger vers l'opéra (direct to opera);

- or a meaningless word regrouping, such as: c, véhicule (c, vehicle) or venir de i (come from i).
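To make the pattern matching step concrete, the following minimal Python sketch (not the implementation of [4]) matches a few lexical patterns over one POS-tagged sentence. The coarse tag names (NOM, VER, PRP) and the (token, tag) input format are simplifying assumptions about TreeTagger's output.

# Minimal sketch of pattern-based extraction over a POS-tagged sentence.
# Tag names and input format are assumptions, not the exact output of
# TreeTagger or the algorithm of [4].

PATTERNS = [
    ("VER", "PRP"),                 # e.g. diriger vers
    ("VER", "PRP", "NOM"),          # e.g. diriger vers place
    ("NOM", "VER", "PRP", "NOM"),   # e.g. véhicule diriger vers bretelle
]

def coarse(tag):
    """Reduce a fine-grained tag such as 'VER:infi' to its coarse category."""
    return tag.split(":")[0].upper()

def match_patterns(tagged_sentence, patterns=PATTERNS):
    """Return the word regroupings whose tag sequences match a lexical pattern."""
    words = [w for w, _ in tagged_sentence]
    tags = [coarse(t) for _, t in tagged_sentence]
    matches = []
    for pattern in patterns:
        n = len(pattern)
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == pattern:
                matches.append(" ".join(words[i:i + n]))
    return matches

# One sentence of the lemmatized corpus, already tagged:
sentence = [("véhicule", "NOM"), ("diriger", "VER:infi"),
            ("vers", "PRP"), ("bretelle", "NOM")]
print(match_patterns(sentence))
# ['diriger vers', 'diriger vers bretelle', 'véhicule diriger vers bretelle']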
3. LEARNING CONCEPTUAL GRAPHS

The goal of this phase is to learn conceptual graphs from the results of the pattern recognition algorithm. A conceptual graph represents a hierarchy having a verb as its top and, on a second layer, arguments connected to the verb by their grammatical function, subject or object. We use the term conceptual graph as it was introduced by [18]. As many terms can be the subject or object of the same verb, a conceptual graph corresponds to the set of verb relations generated by the same verb. To learn conceptual graphs, the chain of treatments based on lexical similarity measures presented below is performed.

3.1 Lexical similarities and lexical distances

A lexical similarity measure associates a real number r with a pair of strings (s, t); high values of r indicate a strong similarity between s and t. A lexical distance measure also associates a real number with a pair of strings, but its interpretation is the opposite: high values of r indicate a weak similarity between s and t.

Many coefficients have been proposed to calculate similarities or distances between strings; a number of them are presented in [5]. For this work, we implemented the Jaccard, Jaro, Jaro-Winkler and Monge-Elkan coefficients.

The Jaccard coefficient calculates the similarity between two strings s and t by viewing each string as composed of several sub-strings. It is given by:

$Jaccard(s, t) = \frac{|s \cap t|}{|s \cup t|}$

This measure takes into account the number of sub-strings common to s and t and the total number of sub-strings of s and t. If we consider characters as sub-strings, the coefficient expresses similarity by counting only the characters common to s and t.

The Jaro and Jaro-Winkler coefficients, introduced below, express the distance between two strings s and t by taking into account the number and position of the characters shared by s and t. Let $s = s_1 \ldots s_m$ and $t = t_1 \ldots t_n$ be two strings. A character $s_i$ of s is considered common to both strings if there exists $t_j$ in t such that $s_i = t_j$ and $i - h \le j \le i + h$, where $h = \min(|s|, |t|)/2$. Let $s^1 = s^1_1 \ldots s^1_k$ be the characters of s common to t and $t^1 = t^1_1 \ldots t^1_k$ the characters of t common to s. We define a transposition between s and t as an index i such that $s^1_i \ne t^1_i$. If $T_{s,t}$ is the number of transpositions, the Jaro coefficient is computed as follows:

$Jaro(s, t) = \frac{1}{3}\left(\frac{|s^1|}{|s|} + \frac{|t^1|}{|t|} + \frac{|s^1| - T_{s,t}}{|s^1|}\right)$

[13] proposes a variant of the Jaro coefficient using p, the length of the longest prefix common to both strings:

$JaroWinkler(s, t) = Jaro(s, t) + \frac{p}{10}\,(1 - Jaro(s, t))$

The coefficients presented so far calculate lexical similarity or distance iteratively and consider strings as blocks. There are also hybrid approaches that calculate similarities recursively, by analyzing sub-strings of the initial strings. The Monge-Elkan coefficient calculates the lexical similarity between s and t in two steps: the two strings are first divided into sub-strings $s = s_1 \ldots s_k$ and $t = t_1 \ldots t_l$; the similarity is then given by:

$MongeElkan(s, t) = \frac{1}{k}\sum_{i=1}^{k}\max_{j=1}^{l} sim(s_i, t_j)$

where $sim(s_i, t_j)$ is given by some similarity function, for instance one of those presented above. Such a function is called a level 2 function.
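The following Python sketch implements the four coefficients as formulated above, taking characters as sub-strings for Jaccard and Jaro, and words as sub-strings for Monge-Elkan. It is a minimal illustration rather than the implementation used in the experiments; in particular, the four-character prefix cap in the Jaro-Winkler variant is a common safeguard we add, not part of the definition given here.

# Minimal sketch of the coefficients of this section; an illustration of the
# formulas above, not the code used in the experiments.

def jaccard(s, t):
    """|s ∩ t| / |s ∪ t| over the sets of characters of s and t."""
    a, b = set(s), set(t)
    return len(a & b) / len(a | b) if (a or b) else 1.0

def _common(s, t, h):
    """Characters of s that also occur in t within a window of h positions."""
    used = [False] * len(t)
    out = []
    for i, c in enumerate(s):
        for j in range(max(0, i - h), min(len(t), i + h + 1)):
            if not used[j] and t[j] == c:
                used[j] = True
                out.append(c)
                break
    return out

def jaro(s, t):
    """Jaro coefficient following the formulation given in this section."""
    h = min(len(s), len(t)) // 2
    s1, t1 = _common(s, t, h), _common(t, s, h)
    if not s1 or not t1:
        return 0.0
    transpositions = sum(a != b for a, b in zip(s1, t1))
    return (len(s1) / len(s) + len(t1) / len(t)
            + (len(s1) - transpositions) / len(s1)) / 3

def jaro_winkler(s, t, max_prefix=4):
    """Jaro plus a bonus for a common prefix of length p (capped here at 4)."""
    p = 0
    while p < min(len(s), len(t), max_prefix) and s[p] == t[p]:
        p += 1
    j = jaro(s, t)
    return j + (p / 10) * (1 - j)

def monge_elkan(s, t, sim=jaro_winkler):
    """Average over the sub-strings of s of their best match in t (level 2 sim)."""
    s_parts, t_parts = s.split(), t.split()
    return sum(max(sim(a, b) for b in t_parts) for a in s_parts) / len(s_parts)

print(monge_elkan("rétroviseur extérieur", "rétroviseur"))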
3.2 An iterative approach to learning conceptual graphs

Conceptual graphs are learned from the set of lexical pattern instances extracted as described in section 2. An iterative solution is proposed, in which each step adds a new layer to the graphs.

(1) The first step identifies verb classes, each representing the set of verb relations generated by the same verb, see Table 1.

Table 1. Extract from the diriger (to direct) class
diriger vers (direct towards)
diriger vers lieu (direct towards place)
véhicule diriger vers (vehicle direct towards)
automobile diriger vers esplanade (car direct towards esplanade)

For each verb class, instances of the patterns "Verb" and "Verb, Preposition" are added to the set of roots. We argue that for verbs accepting prepositions, each "Verb, Preposition" pattern accepts specific arguments, and for this reason a conceptual graph is created for each instance of those patterns. This step creates a number of one-level conceptual graphs, consisting only of the root (see Figure 1).

(2) For each root, its arguments are identified: the terms that occur as its subjects and objects. As each relation accepts many terms as subject or object, lists of arguments are obtained. This step adds a second layer to each conceptual graph.

(3) We observe that, for a given verb, arguments can have different levels of granularity, as shown in Table 2:

Table 2. Granularity of arguments
partie (side)
partie gauche (left side)
partie droite (right side)
rétroviseur (rear view mirror)
rétroviseur extérieur (external rear view mirror)

Hence, a new layer can be added to each conceptual graph by clustering these arguments. A cluster is a group of similar terms, having a central term c called the centroid and its k nearest neighbors. Based on the heuristic that the more words a word regrouping contains, the more specific its meaning is, the following algorithm is proposed to cluster the arguments of verb relations (a code sketch is given at the end of this section):

(1) for each list of arguments, create the list L of centroids, composed of all one-word arguments;

(2) for each centroid c, calculate the lexical similarity with the other terms of the list using the Monge-Elkan coefficient;

(3) add to the cluster of c the terms whose similarity value is greater than a given threshold. An expert intervention allows us to choose the value of this threshold.

At this stage, the Monge-Elkan function is used because it carries out recursive comparisons between sub-strings. Consequently, it has the capacity to agglomerate around a word (as cluster centroids are one-word terms, that is, single words) the terms derived from this word. We chose one-word terms as centroids because they have the most general meaning and are therefore able to attract into a cluster terms that are lexically similar to them and have more specific meanings.

Figures 1 and 2 show the iterative construction of conceptual graphs: one-level conceptual graphs learned from the diriger (to direct) class and two-level conceptual graphs learned from the circuler (to circulate) class.

Figure 1. Conceptual graph modeling circuler avec (circulate with)

Figure 2. Conceptual graph modeling diriger (direct to)
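The clustering step referenced above can be sketched as follows, assuming a level 2 similarity function such as the monge_elkan function of the previous sketch. Attaching each multi-word term only to its best-scoring centroid is a simplification of step (3), and the threshold value is illustrative; in the experiments it is chosen by an expert.

# Minimal sketch of the argument-clustering step; a simplification, not the
# authors' implementation.

def cluster_arguments(arguments, sim, threshold=0.8):
    """Group the arguments of one verb relation around one-word centroids.

    arguments -- terms occurring as subject (or object) of the verb
    sim       -- level 2 lexical similarity, e.g. the Monge-Elkan coefficient
    threshold -- expert-chosen similarity threshold (illustrative default)
    """
    # step (1): centroids are the one-word arguments
    centroids = [a for a in arguments if len(a.split()) == 1]
    clusters = {c: [] for c in centroids}
    for term in arguments:
        if term in clusters:
            continue                       # centroids are not re-attached
        # step (2): similarity of the term with every centroid
        scored = [(sim(c, term), c) for c in centroids]
        if not scored:
            continue
        score, best = max(scored)
        # step (3): attach the term if it passes the threshold
        if score >= threshold:
            clusters[best].append(term)
    return clusters

# With the arguments of Table 2 and monge_elkan as sim, one would expect:
# {"partie": ["partie gauche", "partie droite"],
#  "rétroviseur": ["rétroviseur extérieur"]}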
4. TERM TO CONCEPT MAPPING USING THE ONTOLOGY

At this stage, the arguments of verb relations can be assigned to concepts of the domain by using the conceptual graphs built above and a domain ontology. We make the assumption that, for a given conceptual graph, the verb R representing its root node is already modeled by the ontology. If this is the case, let r be the corresponding relation and Domain_r and Range_r the concepts of the ontology connected by r. These concepts and their descendants are used to label the arguments of the verb. As arguments are connected to the verb by links corresponding to their syntactic function, Domain_r is used to label subject arguments, while Range_r is used to label object arguments. The assignment of terms to concepts is performed by one of the labeling strategies described below.

A first strategy ignores the hierarchical organization of the arguments. The similarities between each argument and the terms naming concepts of the ontology are calculated using one of the similarity measures presented above. The argument is assigned to the concept maximizing this similarity, provided the similarity value is greater than a pre-defined threshold. If all similarity values are below the threshold, the term is labeled inconnu (unknown). This is a non-oriented strategy, because all arguments are considered at the same level.

The next two strategies take into account the hierarchical structure of the arguments. Each cluster of arguments is considered as a hierarchy having on its first level the centroid and on its second level the terms that are specializations of the centroid.

The second strategy is a top-down strategy (a code sketch is given at the end of this section). In a first phase, it identifies the concepts of the ontology which label the centroid of the cluster. If the centroid of a cluster is labeled as unknown, the same label is assigned to each term of the cluster. If the centroid of a cluster is labeled by a concept c of the ontology, labels for the other terms of the cluster are searched only in the set of sub-concepts of c. In this way, the top-down labeling strategy reduces the search space.

A third strategy is based on a bottom-up approach. For each cluster, the similarities between its terms and the concepts of the ontology are calculated using one of the presented coefficients. If the similarity value is higher than a threshold, the concept labels the term; otherwise, the term is labeled inconnu (unknown). Based on the assignments of each term of the cluster to ontology concepts, the similarity between the centroid and a concept of the ontology is given by:

$sim(Centroid, c) = \frac{1}{k}\sum_{i=1}^{k} sim(t_i, c)$

where $t_i$ is a term of the cluster, c is a concept of the ontology, $sim(t_i, c)$ is the similarity between $t_i$ and c, and k is the number of terms labeled by c.
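As referenced above, the top-down strategy can be illustrated by the following sketch. It makes simplifying assumptions about how the ontology is exposed: labels maps each concept to the terms of its linguistic level, sub_concepts maps a concept to its descendants, and candidates holds the concepts allowed for the cluster (for subject arguments, Domain_r and its descendants; Range_r for object arguments). The similarity function is one of the coefficients of section 3.1 and the threshold is illustrative; this is a sketch of the strategy, not the authors' implementation.

# Minimal sketch of the top-down labelling strategy; the ontology access
# (labels, sub_concepts) and the helper names are assumptions for illustration.

UNKNOWN = "inconnu"

def best_concept(term, candidates, labels, sim, threshold):
    """Return the candidate concept whose label term is most similar to term."""
    best, best_score = UNKNOWN, threshold
    for concept in candidates:
        for label in labels.get(concept, []):
            score = sim(term, label)
            if score >= best_score:
                best, best_score = concept, score
    return best

def top_down_labelling(centroid, cluster, candidates, labels, sub_concepts,
                       sim, threshold=0.8):
    """Label the centroid first, then restrict the search space for its terms."""
    assignment = {}
    c = best_concept(centroid, candidates, labels, sim, threshold)
    assignment[centroid] = c
    if c == UNKNOWN:
        # an unknown centroid propagates its label to the whole cluster
        for term in cluster:
            assignment[term] = UNKNOWN
        return assignment
    # otherwise search only among c and its sub-concepts (reduced search space)
    search_space = [c] + list(sub_concepts.get(c, []))
    for term in cluster:
        assignment[term] = best_concept(term, search_space, labels,
                                        sim, threshold)
    return assignment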
These three labeling strategies are used in a first experiment in the field of accidentology, which is described in the next section.

5. EXPERIMENTATION IN ACCIDENTOLOGY AND FIRST RESULTS

The results of our approach can be affected by different parameters: the corpus (its size and its nature, that is, a domain-specific or a general corpus) and the ontology. Our first experiment was performed in accidentology and aimed to point out how different ontologies affect the outcome. For this experiment, we used a single corpus and aimed to assign the terms extracted from it to the concepts of two different ontologies. These resources are described hereafter.

The corpus is composed of about 250 reports of accidents which occurred in and around the Lille region (130 KB, 205,000 words). Accident reports are documents created by the police describing road accidents. They are written by policemen, according to the declarations of the people involved in the accident and the testimonies of witnesses.

A first case study was done using an ontology O1 created from accident reports, see [4]. The ontology was created with Terminae, see [3], and is expressed in OWL, see [6] and [21]. It models the domain of accidentology as it appears through documents created by the police. In this case study, the ontology and the corpus are created by the same community.

Our second case study was done using an ontology O2 created from accident scenarios, see [7]. Accident scenarios are documents created by researchers in road safety which describe prototypes of road accidents. The ontology was created with Protégé, see [16], and is expressed in OWL. It models the domain of accidentology as it appears through documents created by road safety researchers. In this second case study, the corpus and the ontology are created by two different communities.

Each ontology models concepts (see Figure 3) and roles; roles are designated by domain-specific verbs, see Figure 4. As the community of road safety researchers is smaller, the number of entities of O2 is smaller, see Table 3.

Figure 3. The concept Véhicule (Vehicle)

Figure 4. Roles of the concept Véhicule (Vehicle)

Table 3. O1 and O2: number of entities

        Concepts   Roles
O1      450        320
O2      130        70

The analysis of the results is done using a new measure which we defined, called the assignation degree. It is given by:

$D(Corpus, O) = \frac{T_a}{T_{total}} \cdot \frac{C_a}{C_{total}}$

where $T_a$ is the number of arguments of verb relations assigned to concepts of the ontology; $T_{total}$ is the number of terms extracted from the corpus (arguments of verb relations); $C_a$ is the number of concepts to which arguments of verb relations are assigned; and $C_{total}$ is the number of concepts of the ontology. This definition is based on relative measures, which enables us to compare results obtained with different corpora and ontologies. Values of the assignation degree range from 0 (all terms extracted from the corpus are labeled as unknown) to 1 (each extracted term is assigned to a concept and each concept of the ontology labels at least one term).

For each case study, terms are assigned to concepts using each labeling strategy. The results obtained are presented below:

Table 4. Assignation degree: non-oriented strategy

Case study   Corpus             Ontology   Assignation degree
1            accident reports   O1         70.5
2            accident reports   O2         30.5

Table 5. Assignation degree: top-down strategy

Case study   Corpus             Ontology   Assignation degree
1            accident reports   O1         68.5
2            accident reports   O2         25.5

Table 6. Assignation degree: bottom-up strategy

Case study   Corpus             Ontology   Assignation degree
1            accident reports   O1         68.5
2            accident reports   O2         25.5
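For illustration, the assignation degree can be computed from the output of a labeling strategy as in the sketch below; the counts in the example are invented and are not those of Tables 4-6.

# Minimal sketch of the assignation degree D(Corpus, O); the example values
# are illustrative only.

def assignation_degree(assignments, n_terms, n_concepts):
    """(Ta / Ttotal) * (Ca / Ctotal).

    assignments -- mapping from each extracted term to a concept or "inconnu"
    n_terms     -- Ttotal, number of arguments extracted from the corpus
    n_concepts  -- Ctotal, number of concepts of the ontology
    """
    labelled = {t: c for t, c in assignments.items() if c != "inconnu"}
    ta = len(labelled)                 # terms assigned to some concept
    ca = len(set(labelled.values()))   # distinct concepts actually used
    return (ta / n_terms) * (ca / n_concepts)

# Invented example: 3 of 4 terms labelled, using 2 of 130 concepts
example = {"véhicule": "Véhicule", "automobile": "Véhicule",
           "bretelle": "Bretelle", "piéton": "inconnu"}
print(assignation_degree(example, n_terms=4, n_concepts=130))   # about 0.0115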
The analysis of the results is twofold: for the same labeling strategy, we compare the results obtained in each case study; and for the same case study (that is, the same ontology), we compare the results provided by each labeling strategy.

As the tables above show, the assignation degree takes higher values when the corpus and the ontology come from the same community. This can be explained by the similarity between the linguistic level of the ontology (the terms designating its entities) and the corpus. When the corpus and the ontology belong to different communities, the assignation degree decreases drastically. To overcome this problem, we could use lexical resources such as WordNet, see [14], allowing us to take synonymy between terms into account when estimating their similarity.

Among the labeling strategies, the bottom-up one shows low values of the assignation degree in both case studies. This is because most of the clusters we obtained contain fewer than 10 words, and this strategy fails on small clusters. The non-oriented strategy and the top-down strategy provide similar values of the assignation degree. Nevertheless, the top-down strategy performs faster, as it reduces the search space.

6. RELATED WORK

Approaches proposed in different application fields, such as ontology learning and word-sense disambiguation, are at the origin of this work.

Among them, [10] propose Asium, a machine learning system which acquires subcategorization frames of verbs from syntactic input. Asium hierarchically clusters nouns based on the verbs that they are syntactically related with, and vice versa.

The work of [24] concerns the identification of the meaning of unknown verbs using the context of occurrence of the verb. The Camille system uses WordNet, see [14], as background knowledge and generates assumptions concerning the meaning of verbs. The assumptions are formulated according to linguistic criteria.

[13] use a principle from information theory to model selectional preferences for verbs; several classes may be appropriate for modeling selectional preferences.

[20] propose RelExt, a system capable of automatically identifying highly relevant triples (pairs of concepts connected by a relation). RelExt extracts relevant terms and verbs from a given text collection and estimates relations between them through a combination of linguistic and statistical processing. The extracted triples can be integrated into an already existing ontology.

[18] propose a system with a multi-layered architecture aiming to extract information from genetic interaction data. The system uses verb patterns modeled as conceptual sub-graphs to characterize unknown terms in sentences. The goal is to enrich an existing ontology by integrating the discovered concepts.

Our approach is based on the previous work presented in [18], whose major drawback is the impossibility of assigning terms composed of several words (multi-word terms) to concepts of the ontology. To overcome this limitation, our approach takes into account arguments of verb relations that have different levels of granularity: we represent verb relations by conceptual graphs having three levels, the verb (first level), one-word arguments (second level) and multi-word arguments (third level).

7. CONCLUSION AND FUTURE WORK

We have presented an unsupervised approach developed to automatically assign the terms of a corpus to the concepts of an ontology. The approach jointly uses verb relations and a domain ontology. The results it provides could be used to semantically annotate or index documents.

A first experiment in the accidentology domain was carried out in order to point out how different ontologies affect the outcome. In order to evaluate its results, we defined a new measure, called the assignation degree. This evaluation shows that the approach provides better results when the corpus and the ontology belong to the same community; when they belong to different communities, the values of the assignation degree decrease. The experiment thus shows that our approach is sensitive to the lexical level: changing the vocabulary by passing from one community to another affects the values of the assignation degree.

As future work, new evaluation scenarios have to be proposed in order to study how other factors (namely the corpus, its size and its nature) affect the results. Another perspective concerns the exploitation of lexical resources such as WordNet, in order to take synonymy between terms into account; this will allow us to overcome the problem of lexical variation between different communities. As a continuation of this work, a feedback loop could be added in order to enrich the domain ontology by integrating new concepts.

8. REFERENCES

[1] Alfonseca, E., Manandhar, S.: Improving an ontology refinement method with hyponymy patterns. In Proceedings of the Third International Conference on Language Resources and Evaluation, 2001.

[2] Aussenac-Gilles, N., Seguela, P.: Les relations sémantiques : du linguistique au formel. Cahiers de grammaire 25 (175), 2000.
[3] Biébow, B., Szulman, S.: A linguistic-based tool for the building of a domain ontology. In Proceedings of the International Conference on Knowledge Engineering and Knowledge Management, 1999.

[4] Ceausu, V., Desprès, S.: Towards a text mining driven approach for terminology construction. In Proceedings of the 7th International Conference on Terminology and Knowledge Engineering, TKE 2005, 2005.

[5] Cohen, W., Ravikumar, P., Fienberg, S.: A comparison of string distance metrics for name-matching tasks. In Proceedings of the IJCAI-2003 Workshop on Information Integration on the Web, 2003.

[6] Dean, M., Schreiber, G., Patel-Schneider, P., Hayes, P., Horrocks, I.: OWL Web Ontology Language reference. Technical report, W3C Proposed Recommendation, 2004.

[7] Després, S.: Contribution à la conception de méthodes et d'outils pour la gestion des connaissances. Habilitation à diriger des recherches, Université René Descartes, 2002.

[8] Euzenat, J., Valtchev, P.: An integrative proximity measure for ontology alignment. In Proceedings of the ISWC-2003 Workshop on Semantic Information Integration, 2003.

[9] Faatz, A., Steinmetz, R.: Ontology enrichment with texts from the WWW. In Proceedings of the Second Semantic Web Mining Workshop at ECML/PKDD, 2002.

[10] Faure, D., Nedellec, C.: Asium, learning subcategorization frames and restrictions of selection. In Proceedings of the 10th European Conference on Machine Learning, Workshop on Text Mining, Chemnitz, Germany, 1998.

[11] Gagliardi, H., Haemmerlé, O., Pernelle, N., Saïs, F.: An automatic ontology-based approach to enrich tables semantically. In Proceedings of the First International Workshop on Context and Ontologies: Theory, Practice and Applications, 2005.

[12] Hearst, M.: Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th International Conference on Computational Linguistics, 1992.

[13] Li, H., Abe, N.: Generalizing case frames using a thesaurus and the MDL principle. Computational Linguistics 24, 217–244, 1998.

[14] Miller, G.: WordNet: A lexical database for English. Communications of the ACM 38, 39–41, 1995.

[15] Monge, A., Elkan, C.: The field-matching problem: algorithms and applications. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996.

[16] Noy, N., Fergerson, R. W., Musen, M. A.: The knowledge model of Protégé-2000: Combining interoperability and flexibility. In Proceedings of the International Conference on Knowledge Engineering and Knowledge Management, 2000.

[17] Parekh, V., Jack, P. G., Finin, T.: Mining domain specific texts and glossaries to evaluate and enrich domain ontologies. In Proceedings of the International Conference on Information and Knowledge Engineering, 2004.

[18] Roux, C., Proux, D., Rechenmann, F., Julliard, L.: An ontology enrichment method for a pragmatic information extraction system gathering data on genetic interactions. In Proceedings of the Ontology Learning Workshop at ECAI, 2000.

[19] Schmid, H.: Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing, 1994.

[20] Schutz, A., Buitelaar, P.: RelExt: A tool for relation extraction from text in ontology extension. In Proceedings of the International Semantic Web Conference, 593–606, 2005.
[21] Szulman, S., Biébow, B.: OWL et Terminae. In Actes de la 14ème Journée Francophone d'Ingénierie des Connaissances, 2004.

[22] Valarakos, A., Paliouras, G., Karkaletsis, V., Vouros, G.: A name matching algorithm for supporting ontology enrichment. In Proceedings of the 3rd Hellenic Conference on Artificial Intelligence, 2004.

[23] Ville-Ometz, F., Royauté, J., Zasadzinski, A.: Filtrage semi-automatique des variantes de termes dans un processus d'indexation contrôlée. In Actes du Colloque International sur la Fouille de Textes, 2004.

[24] Wiemer-Hastings, P., Graesser, A., Wiemer-Hastings, K.: Inferring the meaning of verbs from context. In Proceedings of the Twentieth Annual Conference of the Cognitive Science Society, 1998.

[25] Su, X.: Semantic Enrichment for Ontology Mapping. PhD thesis, Norwegian University of Science and Technology, 2004.

[26] Warin, M., Oxhammer, H., Volk, M.: Enriching an ontology with WordNet based on similarity measures. In MEANING-2005 Workshop, 2005.

[27] Widdows, D.: Unsupervised methods for developing taxonomies by combining syntactic and statistical information. In Proceedings of the Human Language Technology Conference, HLT-NAACL, 2003.