<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Corpus-Driven Contextualized Categorization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tony Veale</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yanfen Hao</string-name>
          <email>yanfen.haog@ucd.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science and Informatics, University College Dublin</institution>
        </aff>
      </contrib-group>
      <abstract>
<p>Ontologies strive to offer an interconnected, hierarchical system of categories to guide our actions in a complex world. But the boundaries of these categories are highly context-dependent, and what constitutes a prototypical category member in one context may be atypical or unrepresentative in another. In this paper we outline a dynamic, trainable, bottom-up view of category structure based on context-sensitive corpus analysis. By learning from corpora how people actually use categories in different contexts, we can train our ontologies to adapt themselves creatively to those contexts.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>An ontology is a system of inter-connected categories that
collectively provide a structured representation of a given domain. As such,
an ontology serves as the conceptual bedrock against which domain
meanings are constructed, manipulated and interpreted. However,
this fundamental role of the ontology should not blind us to the fact
that much of what an ontology attempts to model, via its category
structure, is not static but dynamic, making the use of these
categories highly sensitive to context. Consider that many categories in
a language-oriented ontology, like Genius, Fool, Hero, Villain,
Expert, Hunter, and so on, possess subjective membership criteria that
change from user to user, and from context to context. Are
politicians fools, villains or schemers? Are firemen heroes or workmen?
Are scientists experts or geniuses?</p>
      <p>
        Since top-down definitions of membership criteria will always
seem brittle or inadequate in some contexts, it seems best to allow
contexts to define their own criteria, bottom-up. In other words, we
need a category structure based on the notion of a contextual ontology [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], one that not only preserves the common view of concepts but
also keeps the local perspective of individual domains. For language-oriented
ontologies, like WordNet [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] (a flawed, lightweight ontology to be
sure, but an ontology nonetheless), HowNet [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and, to some
extent, Cyc [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the context of usage can conveniently be captured via
a large corpus of representative texts. A corpus-based approach to
determining category membership allows us to structure the middle
and lower layers of an ontology according to how words and
concepts are actually used in a particular domain. In short, a
corpus-based approach supports an extremely flexible, non-classical view of
category structure, one that views category membership as a graded
rather than binary notion [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and one in which concepts can fluidly
move (via metaphor) from one category to another [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In this
current work, we use the ability to support metaphoric reasoning as the
yardstick against which ontological flexibility should be measured.
      </p>
      <p>
        Of course, this fluidity does not sit well with conventional
perspectives on ontological structure, as represented by the ontologies
of [
        <xref ref-type="bibr" rid="ref1 ref5 ref6">1,5,6</xref>
        ]. In this paper we look at one conventional ontology, the
HowNet system of [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which is a large-scale bilingual lexical
ontology for words and their meanings in both Chinese and English.
In many respects, HowNet is similar to the WordNet lexical
ontology for English [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], though in contrast to WordNet, HowNet
provides an explicit, if sparse, propositional semantics for each of the
word-concepts it defines. Complementing this frame-like semantics,
in which concepts are defined in terms of actions, case-roles and
fillers, is a taxonomic backbone that seems rather impoverished when
compared to that of WordNet. HowNet is essentially an ontology of
"Being" rather than an ontology of "Doing", which is to say that it
defines concepts according to conventional kinds (human,
animal, tool and so on) rather than according to how specific concepts
actually behave in context. However, we describe in section 2 how
HowNet’s propositional semantics can be used to automatically
derive an ontology of ”Doing” to replace HowNet’s rather shallow
taxonomy of conventional categories [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Once in place, we demonstrate
how this new system of derived categories can be made contextually
sensitive by defining their membership criteria in statistical,
corpus-based terms, to create a fluid system of membership akin to the
Slipnets of Hofstadter [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Once sensitized in this way, the ontology can
be moved with ease from one context to another simply by replacing
the underlying corpus.
      </p>
    </sec>
    <sec id="sec-2">
<title>ONTOLOGIES OF "BEING" AND "DOING"</title>
      <p>
        HowNet and WordNet each reflect a different view of semantic
organization. WordNet [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is differential in nature: rather than
attempting to express the meaning of a word explicitly, WordNet
instead differentiates words with different meanings by placing them
in different synonym sets, or synsets, and further differentiates these
synsets from one another by assigning them to different positions
of a taxonomy. In contrast, HowNet is constructive in nature. It
does not provide a human-oriented textual gloss for each lexical
concept, but instead composes sememes from a less discriminating
taxonomy to provide a semantic representation for each word sense.
For example, HowNet defines the lexical concept surgeon| as
follows:

(1) surgeon|
    {human|: HostOf={Occupation|: domain={medical|}},
     {doctor|: agent={~}}}

which can be glossed thus: "a surgeon is a human, with an
occupation in the medical domain, who acts as an agent of a doctoring
activity" (the {~} here serves to indicate the placement of the
concept within its associated propositional structure). We see a
similar structure employed by HowNet for the lexical concept
repairman|:

(2) repairman|
    {human|: HostOf={Occupation|},
     {repair|: agent={~}}}

Note that the impoverished nature of HowNet's taxonomy means
that over 3000 different concepts are forced to share the immediate
hypernym human|. However, human| merely states, very
generally, what a repairman is, rather than what a repairman does.
Fortunately, HowNet also organizes its verb entries taxonomically,
and so we find the verbs doctor| and repair| organized
under the hypernym resume| (the logic being, one supposes,
that "doctoring" and "repairing" both involve a resumption of an
earlier, better state). This similarity of verbs, combined with an
identicality of case-roles (both surgeon and repairman are agents of
their respective activities), allows us to abstract out a new taxonomy
based on the behaviour, rather than the general type, of these entities.</p>
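The abstraction step described here, grouping word-concepts by the case-role they fill and by the hypernym of the verb they fill it for, can be sketched in a few lines of Python. The dictionaries `definitions` and `verb_hypernym` below are invented stand-ins for HowNet's actual entries; only the grouping logic matters.

```python
# Sketch: deriving an ontology of "Doing" from HowNet-style definitions.
# Each word-concept lists the (verb, case-role) pairs from its definition;
# the data are illustrative, not HowNet's real entries.
definitions = {
    "surgeon":   [("doctor", "agent")],
    "repairman": [("repair", "agent")],
    "assassin":  [("kill", "agent")],
}

# HowNet's verb taxonomy, flattened to verb -> hypernym (illustrative).
verb_hypernym = {"doctor": "resume", "repair": "resume", "kill": "MakeBad"}

def derive_doing_taxonomy(defs, hypernyms):
    """Group word-concepts under verb-role categories, at both the
    specific level (doctor-agent) and the abstracted one (resume-agent)."""
    taxonomy = {}
    for concept, roles in defs.items():
        for verb, role in roles:
            for v in {verb, hypernyms.get(verb, verb)}:
                taxonomy.setdefault(f"{v}-{role}", set()).add(concept)
    return taxonomy

tax = derive_doing_taxonomy(definitions, verb_hypernym)
# surgeon and repairman now share the behavioural category resume-agent,
# despite sharing only the uninformative hypernym human| in HowNet itself.
print(sorted(tax["resume-agent"]))  # ['repairman', 'surgeon']
```

Categories such as resume-agent are exactly the derived nodes that section 3 then makes graded and context-sensitive.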
      <p>
        Of course, this Aristotelian view of metaphor as an abstract
”carrying-over” (the etymological origin of the word ”metaphor”)
can only be valid if concepts are ontologized by what they do, rather
than by what they are (as is typically the case, in both WordNet
and HowNet, and even Cyc [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]). Otherwise, metaphor could never
operate between semantically distant concepts, which it plainly does.
For instance, figure 2 illustrates the derived taxonomy for HowNet
concepts that are defined as agents of the verbs ”kill”, ”damage”
and ”attack”, each a specialization of the abstract verb MakeBad in
HowNet. We see in this taxonomy the potential for famines to be
metaphorically viewed as butchers and assassins, and for viruses to
be seen as deadly intruders, or even man-eaters.
      </p>
    </sec>
    <sec id="sec-3">
      <title>DERIVING FLUID CATEGORY STRUCTURES</title>
<p>An ontology of "doing" raises a number of obvious questions about
the nature of categorization. For instance, is every concept that
kills an equally representative member of the category kill-agent?
Is movement always allowed between any two categories that share
a common abstraction like MakeBad-agent, or is movement limited
to certain members only, and in certain directions? When a concept
moves from its conventional category to another, how is its degree of
membership in this new category to be assessed? In this section we
address this key issue of obtaining fluid category structure.</p>
      <p>
        There are two major approaches to the automatic
acquisition of taxonomies. The first is based on the
distributional hypothesis of Harris [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], which holds that
terms are similar if they occur in similar linguistic contexts. For instance,
Hindle [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] clusters nouns according to their contextual attributes,
such as the co-occurrence of nouns with verbs as subjects or objects.
Cimiano, Hotho and Staab [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] likewise extract contextual information (verb/subject
dependencies, verb/object dependencies, and so on) about a given term from a
corpus and apply Formal Concept Analysis to generate a lattice
that is finally transformed into a partial order resembling a concept
hierarchy. The second major approach mines a corpus for explicit
ontological relations, such as the is-a and part-of relations;
Hearst [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] is representative of this field. However, these approaches
still yield binary and static taxonomies, because they all apply
a threshold to determine whether or not a word-concept belongs
to a given category.
In our approach, we also follow the distributional hypothesis of Harris [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] in investigating contextual attributes, particularly the behaviour
of nouns. The difference is that we apply the category theory of Lakoff [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to assign graded membership to nouns within a category, rather
than simply grouping them into classes according to their contextual
attributes or ontological relations.
      </p>
      <p>
        Following Lakoff [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], every category will possess a prototype, a
member that is highly representative of the category as a whole. Such
prototypes are often lexicalized in simple terms; for instance, "killer"
will be a highly representative member of kill-agent, while its Chinese
translation is a composition of "killing" and "expert".
However, many categories like damage-agent have no obvious
lexicalized prototype, so we need a more generic means of identifying
the prototypical member of a category. Lakoff [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] suggests that the
prototype will occupy a central position in the category’s structure,
with other members organized in a radial fashion, at a distance from
the centre that is inversely proportional to their similarity to the
prototype. If we assume that the prototype will be that member that is
most evocative of a category, we should first measure the evocation
strength of each concept for a given category. This can be done by
determining the frequency of occurrence of each concept within the
category, and this, in turn, can be estimated by looking to a large
corpus to see how each concept is actually employed by language users.
Once the most evocative example is found for each category,
membership scores can be assigned based on the strength of evocation.
The corpus we use must be large and reasonably authoritative, yet
it must use words both literally and figuratively. For reasons outlined
in section 5, we use as our corpus the complete text of the
open-source encyclopaedia Wikipedia [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Thus, to estimate the membership level of the word-concept
butcherj in the category kill-agent, we first determine the
corpus-frequency of the phrase ”butcher who kills/killed”. In
general, for estimating the membership of the concept C in the
category V-agent, we use the query form "C who|which|that V";
for categories of the form V-instrument, we use the query ”V with
C”, and so on. Of course, some verbs are more vague than others,
and can have much higher corpus frequencies. We therefore need
to normalize raw corpus-frequencies to obtain a truer picture of
evocation power. If f_raw(V-role:C) denotes the corpus frequency
of concept C when considered as a member of the category V-role,
where V is a verb like ”kill” and role is one of agent, instrument,
etc., then the adjusted frequency, a measure of true evocation, is
estimated by:
f_adj(V-role:C) = ln(f_raw(V-role:C)) × ln(Σ_x f_raw(V-role:x))^−1   (1)
Now, the prototype will be that member of a category with the
strongest evocation:</p>
<p>Prototype(V-role) = argmax_C f_adj(V-role:C)   (2)
The degree of membership of C in the category V-role is relative to
the prototype:
Membership(V-role:C) = f_adj(V-role:C) × f_adj(V-role:Prototype(V-role))^−1   (3)
This ensures that the prototypical member has a membership score
of 1, while all other members of a category will have a score in the
range 0 to 1. A concept can metaphorically be moved from a category
in which it is conventionally a member to any other category in which
it is considered to have a non-zero membership score.</p>
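Equations (1)-(3) are simple to implement once per-category pattern frequencies are available. A minimal sketch, with invented raw counts standing in for Wikipedia frequencies of patterns like "butcher who kills":

```python
import math

# Invented f_raw(kill-agent:C) counts; real values would come from
# corpus queries such as "butcher who kills/killed".
raw = {"killer": 120, "assassin": 45, "butcher": 18, "virus": 6}

total = sum(raw.values())  # sum over x of f_raw(V-role:x)

def f_adj(c):
    # Equation (1): ln(f_raw) * ln(total)^-1, damping the advantage
    # of vague, high-frequency verbs.
    return math.log(raw[c]) / math.log(total)

# Equation (2): the prototype is the member with the strongest evocation.
prototype = max(raw, key=f_adj)

def membership(c):
    # Equation (3): membership relative to the prototype, so that the
    # prototype scores exactly 1 and all other members fall below 1.
    return f_adj(c) / f_adj(prototype)

print(prototype)  # killer
print(membership("killer"), membership("virus") < membership("butcher"))  # 1.0 True
```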
    </sec>
    <sec id="sec-5">
      <title>CLUSTERS AND GROUP-TERMS</title>
      <p>For ontological purposes, a category is essentially a cluster of
concepts that allows one to conveniently infer similarity (the
possession of common properties and shared behaviour) from the simple
act of co-categorization. That these clusters often have a
heterogeneous roster of members (e.g., as illustrated in Figure 2) is testament
both to the prevalence of metaphor and to the necessity of viewing
ontological categories as categories of ”doing” rather than of
”being”. Of course, the converse is also true: we can infer the contextual
behaviour of a concept from how that concept is explicitly clustered
with others. And one common way of signalling the appropriate
cluster for a concept is through an evocative group word, like ”army”,
”mob”, ”tribe” or ”coven”. For instance, when one uses the phrase
”an army of robots”, one is conveying a soldier-like perspective on
the concept Robot, signalling that in this context, Robot should be
viewed more as an attacking agent than as a utensil.</p>
      <p>Group terms like ”army”, ”family” and ”swarm” are highly
suggestive of particular behaviours. For instance, the corpus techniques
of section 3 reveal that, in the context of Wikipedia, a ”swarm” has
two dominant behaviours, biting and attacking, while an ”army” has
three, defeating, fighting and attacking. To use the phrase ”swarm
of X” or ”army of X” is to suggest that X also exhibits these
behaviours, and furthermore, that X is similar in behaviour to other
concepts that comfortably fit these templates. This intuition is easily
contextualized, since the relative frequency of these phrases in a
context’s corpus will reveal the extent to which different concepts belong
to different group-based categories.</p>
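This intuition is easy to operationalize. A sketch, with an invented table of "GROUP of X" phrase counts standing in for real corpus frequencies:

```python
# Hypothetical frequencies of the phrase "GROUP of X" in a context corpus.
group_counts = {
    ("swarm", "bee"): 40, ("swarm", "robot"): 5,
    ("army", "robot"): 63, ("army", "soldier"): 122,
    ("huddle", "lawyer"): 64, ("army", "lawyer"): 32,
}

def groups_of(noun):
    """All group terms a noun occurs with, best-attested first."""
    pairs = [(g, n) for (g, n) in group_counts if n == noun]
    return sorted((g for g, _ in pairs),
                  key=lambda g: -group_counts[(g, noun)])

def shared_groups(a, b):
    """Shared group terms: a rough measure of how apt a comparison is."""
    return set(groups_of(a)) & set(groups_of(b))

print(groups_of("lawyer"))                # ['huddle', 'army']
print(shared_groups("robot", "soldier"))  # {'army'}
```

Swapping in counts from a different context's corpus changes both the dominant group for each noun and the overlaps, which is exactly the contextualization at issue.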
      <p>As a corpus, Wikipedia is biased toward popular culture and
genres such as science fiction. This lack of neutrality makes the
Wikipedia corpus an excellent example of a context, more so than
traditional language corpora. Consider the population of the category
Army-member as derived from Wikipedia:
mercenary(238), clone(132), soldier(122), volunteer(72),
monster(70), robot(63), minion(60), warrior(60), frog(58), knight(50),
slave(48), demon(46), clansman(46), monkey(46), crusader(44),
gladiator(38), ant(37), lawyer(32), contributor(28), mutant(27), ...</p>
      <p>Note the prominent presence of the genre elements ”clone”,
”robot” and ”minion”, as well as examples like ”lawyer” for which
”army” has a metaphoric meaning. This grouping suggests that
lawyers may be seen, alternately, as mercenaries, warriors and even
clones, while the extent to which these comparisons are apt in a
particular context is a function of how many different groups can
contextually claim both as members. For instance, ”lawyer” and ”warrior”
are used with seven different group terms in the Wikipedia corpus
(society, family, cadre, team, army, class and squad), while "lawyer"
and "mercenary" share just three groupings (team, army and squad).
Interestingly, the most common group term for ”lawyer” in Wikipedia
is ”huddle” (the phrase ”huddle of lawyers” occurs 64 times, twice
as often as ”army of lawyers”), which suggests that, in this context,
lawyers are more likely to be categorized as players than as warriors,
mercenaries, clones or robots.</p>
    </sec>
    <sec id="sec-6">
      <title>PRELIMINARY EMPIRICAL EVALUATION</title>
<p>The choice of corpus is clearly key to the quality of
category-membership statistics that can be derived using the methods of
sections 3 and 4. This corpus must be large, it must be representative of
language use in general, and it should offer a means of search that is
robust in the face of noise. At first blush, then, the world-wide-web
seems an ideal candidate: in size it is unmatched, and various APIs
are available to access powerful search engines like Google.
Unfortunately, such APIs rarely provide enough control over the query or
the archive to ensure that noise can be eliminated, since these
engines typically perform their own stemming and stop-word
elimination, putting truly strict matching beyond our reach. This means
that common noun-noun collocations, like ”fossil record” and ”share
issue”, are easily confused for infrequent or nonsensical noun-verb
collocations like ”fossils that record” and ”shares that issue”.</p>
      <p>
        To ensure strict matching with controlled morphology, we require
a local text corpus that we can index and search directly, and even
subject to part-of-speech tagging. For this reason we choose the
collected text of the open-source encyclopaedia Wikipedia [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ],
which is available to download in XML form. Wikipedia has several
obvious benefits as a text corpus: each document is explicitly tagged
with a subject-label, since each article defines a specific headword;
documents exist in a rich web of interconnections; and documents
strive to be authoritative on their subjects. Consider the range of
subjects that are found in Wikipedia for the verb ”to infect” (with
frequencies shown in parentheses):
virus(46), worm(12), retrovirus(7), strain(6), disease(6),
bureaucrat(6), poison(4), ally(4), fungus(4), dust(3), smut(2), bacterium(2),
physiologist(2), blood(2), plague(2), war(2), substance(2), germ(1),
application(1), species(1)
Now consider the range of verbs that can be used with the
subject ”virus”:
We see from this snapshot that Wikipedia contains enough
diversity to capture the dominant application of each verb, and the
dominant behaviour of each subject noun. Furthermore, Wikipedia
contains enough diversity to reveal creative uses of these nouns and
verbs; these snapshots reveal, for instance, that "smut" can "infect" (2
uses) and that a ”virus” can ”eat”, ”escape” and even ”steal”.
      </p>
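The strict matching required here can be sketched with ordinary regular expressions over locally indexed text. The mini-corpus below is invented; the point is that the noun-noun collocation "fossil record" does not match the noun-verb pattern, while "fossils that record" does:

```python
import re

# Sketch: strict pattern matching over a local corpus, avoiding the
# stemming and stop-word normalization of web search APIs.
corpus = [
    "a smut that infects cereal crops",
    "the fossil record of early vertebrates",
    "fossils that record ancient climates",
]

def count_agent_pattern(noun, verb, texts):
    """Count 'NOUN who|which|that VERB(s)' with exact surface forms."""
    pattern = re.compile(
        rf"\b{noun}s?\s+(?:who|which|that)\s+{verb}s?\b", re.IGNORECASE)
    return sum(len(pattern.findall(t)) for t in texts)

print(count_agent_pattern("smut", "infect", corpus))    # 1
print(count_agent_pattern("fossil", "record", corpus))  # 1
# Only "fossils that record" matches; the noun-noun collocation
# "the fossil record" is correctly ignored.
```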
      <p>One can ask how well these corpus-derived category structures
compare with the hand-crafted category structures of HowNet, since
one can reasonably expect human-assigned category memberships
to be a gold standard for this task. We find that in 69% of cases,
the HowNet-assigned category for a given word-concept is also the
dominant corpus-derived category, and that in 76% of cases, a
word-concept has a statistical membership in the HowNet-assigned
category that is greater than the median membership score for that
category.</p>
<p>In fact, these results suggest that HowNet is far from being a
gold standard for category membership. In many cases, a HowNet
category is either poorly named or dangerously misleading.
For instance, the primary sense of the verb "doctor" in English is
not "heal" but "fiddle" (as in "to doctor one's résumé"). Likewise,
HowNet assigns the name ”resume” to the super-category of ”repair”
and ”doctor”, when the verb ”restore” is more appropriate in
English. In many other cases, the HowNet assigned category is only one
of several that seem intuitively appropriate. For instance, the word
”knight” is assigned the dominant category protect-agent (based on
12 occurrences of the pattern ”knight who protects”) while HowNet
assigns it to the category defend-agent (which is the second-most
popular corpus assignment, based on 10 occurrences of ”knight who
defends"). Viewed from this perspective, the corpus-based and
hand-crafted approaches to category assignment are complementary, not
conflicting: each can serve to validate and enrich the other.</p>
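The two measures reported above (dominant-category agreement, and membership above the category median) can be sketched as follows; the membership scores and the HowNet assignment are invented stand-ins for the real evaluation data:

```python
from statistics import median

# Invented corpus-derived membership scores (per section 3) and an
# invented HowNet assignment for the word-concept "knight".
corpus_members = {
    "protect-agent": {"knight": 1.0, "guard": 0.9},
    "defend-agent":  {"knight": 0.8, "soldier": 1.0,
                      "lawyer": 0.3, "minion": 0.2},
}
hownet_category = {"knight": "defend-agent"}

def dominant_category(concept, members):
    """The corpus-derived category in which the concept scores highest."""
    scored = {cat: m[concept] for cat, m in members.items() if concept in m}
    return max(scored, key=scored.get) if scored else None

def above_median(concept, cat, members):
    """Does the concept beat the median membership of the category?"""
    scores = members.get(cat, {})
    return concept in scores and scores[concept] > median(scores.values())

# knight's dominant corpus category disagrees with the HowNet assignment,
# yet knight still scores above the median of the HowNet-assigned
# category: the second, weaker form of agreement.
print(dominant_category("knight", corpus_members))  # protect-agent
print(above_median("knight", hownet_category["knight"], corpus_members))  # True
```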
    </sec>
    <sec id="sec-7">
      <title>CONCLUSION</title>
      <p>The results of our experiments with Wikipedia are promisingly
suggestive about the possibility of contextualizing ontological category
structures via corpus-derived statistics. For example, the Wikipedia
corpus reveals that the most common verb for the subject noun
”vampire” is ”hunt” (where the phrase ”vampires who hunt” occurs 4
times), indicating that in this pop-culture/fantasy-oriented context,
a vampire is to be seen predominantly as a member of the category
hunt-agent, or hunter. While one is unlikely to find such a
categorization in an ontology like WordNet, or even Cyc, this is the most
appropriate categorization in this context. Nonetheless, these results are
hardly conclusive, for although large, Wikipedia is simply not large
enough to provide the diversity of evidence needed to reliably derive
a heterogeneous category membership. If a resource like Wikipedia
lacks the necessary scale, surely this speaks to the futility of defining
a context via a corpus?</p>
      <p>We believe the answer to this dilemma lies not in ever-larger
corpora (which may be too large to preserve the distinctive biases of a
given context), but in the combination of different perspectives
offered by the same corpus. We have described two different
perspectives in this paper: the perspective of behaviour (captured via verb
collocations) described in section 3, and the perspective of
clustering (captured via group-word collocations) described in section 4.
For instance, we know that Robot is the most representative member
of the category army-agent in Wikipedia (with 63 examples), while
army is itself a highly representative member of the category
attack-agent. This suggests that Robot should also be a strong member of
the category attack-agent. While Wikipedia records no uses of the
collocation "robot who|which|that attacks", this joint perspective is
sufficient evidence to support going to the web for this collocation.
That is, the intuition that Robot is an attack-agent is consistent with
the corpus, and thus the context, so the precise membership score can
be determined using the larger context of the web.</p>
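The bootstrapping inference described here can be sketched directly: chain a concept's membership in a group-derived category with that group term's own membership in a behavioural category. The scores below are illustrative, not actual Wikipedia figures:

```python
# Illustrative membership scores; keys are (category, member) pairs.
memberships = {
    ("army-agent", "robot"): 0.9,    # Robot is a strong army-agent
    ("attack-agent", "army"): 0.8,   # army itself is a strong attack-agent
}

def inferred_membership(category, concept, scores):
    """Direct membership if attested; otherwise chain through a group
    category (e.g. robot -> army-agent, army -> attack-agent)."""
    direct = scores.get((category, concept), 0.0)
    if direct > 0.0:
        return direct
    best = 0.0
    for (cat, member), s in scores.items():
        group = cat.split("-")[0]   # army-agent -> army
        if member == concept and (category, group) in scores:
            best = max(best, s * scores[(category, group)])
    return best

# No direct evidence for "robot who attacks", but the joint perspective
# licenses a provisional score that web counts could then refine.
print(round(inferred_membership("attack-agent", "robot", memberships), 2))  # 0.72
```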
      <p>Bootstrapping techniques like this should allow us to grow more
heterogeneous category structures while respecting the ontological
biases of the specific context. Once the deficiencies of relatively
small corpora are addressed via such techniques, we expect to be
better poised to fully explore the ramifications and opportunities of
corpus-trained contextual ontologies.</p>
    </sec>
    <sec id="sec-8">
      <title>ACKNOWLEDGEMENTS</title>
      <p>We would like to thank Enterprise Ireland for supporting this
research through a grant from the Commercialization Fund.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <source>HowNet and the Computation of Meaning</source>
          , World Scientific, Singapore,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Glucksberg</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Keysar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
<article-title>How Metaphors Work</article-title>
          ,
          <source>Metaphor and Thought (2nd edition)</source>
          , A. Ortony (Ed.), Cambridge University Press,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Hofstadter</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <source>Fluid Concepts and Creative Analogies</source>
          , Basic Books,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Lakoff</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <source>Women, Fire and Dangerous Things</source>
          , Chicago University Press,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Lenat</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>R. V.</given-names>
          </string-name>
          ,
<source>Building Large Knowledge-Based Systems</source>
          , Addison Wesley,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>G. A.</given-names>
          </string-name>
          ,
          <article-title>WordNet: A Lexical Database for English</article-title>
          ,
          <source>Communications of the ACM</source>
          , Vol.
          <volume>38</volume>
          No.
          <issue>11</issue>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Searle</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <article-title>Metaphor</article-title>
          ,
          <source>Metaphor and Thought (2nd edition)</source>
          , A. Ortony (Ed.), Cambridge University Press,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Veale</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
,
          <article-title>Analogy Generation in HowNet</article-title>
          ,
          <source>The proceedings of IJCAI'2005, the International Joint Conference on Artificial Intelligence</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <source>Wikipedia, the open-source encyclopaedia</source>
          : www.wikipedia.org.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>van Harmelen</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serafini</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Stuckenschmidt</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <article-title>C-OWL: Contextualizing ontologies</article-title>
          ,
          <source>Proceedings of the 2nd International Semantic Web Conference</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <source>Mathematical Structures of Language</source>
, Wiley,
          <year>1968</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Hindle</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <article-title>Noun classification from predicate-argument structures</article-title>
          ,
          <source>Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL)</source>
          , pp.
          <fpage>268</fpage>
          -
          <lpage>275</lpage>
          ,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hotho</surname>
          </string-name>
          ,
<string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <article-title>Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis</article-title>
          .
          <source>Journal of AI Research</source>
          , Volume
          <volume>24</volume>
          :
          <fpage>305</fpage>
          -
          <lpage>339</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Hearst</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <article-title>Automatic acquisition of hyponyms from large text corpora</article-title>
          .
          <source>Proceedings of the 14th International Conference on Computational Linguistics (COLING)</source>
          , pp.
          <fpage>539</fpage>
          -
          <lpage>545</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>