=Paper=
{{Paper
|id=Vol-1347/paper10
|storemode=property
|title=What can distributional semantic models tell us about part-of relations?
|pdfUrl=https://ceur-ws.org/Vol-1347/paper10.pdf
|volume=Vol-1347
|dblpUrl=https://dblp.org/rec/conf/networds/Morlane-Hondere15
}}
==What can distributional semantic models tell us about part-of relations?==
<pdf width="1500px">https://ceur-ws.org/Vol-1347/paper10.pdf</pdf>
<pre>
 What can distributional semantic models tell us about part-of relations?

                                   François Morlane-Hondère
                                   LIMSI-CNRS, Orsay, France
                            francois.morlane-hondere@limsi.fr


1   Introduction

The term distributional semantic models (DSMs) refers to a family of unsupervised corpus-based
approaches to semantic similarity computation. These models rely on the distributional
hypothesis (Harris, 1954), which states that semantically related words tend to share many of
their contexts. So, by collecting information about the contexts in which words are used in a
corpus, DSMs are able to measure the distributional similarity of two words, which
theoretically translates into a semantic one.

In recent years, these models have become very popular in a wide range of NLP tasks (Weeds,
2003; Baroni and Lenci, 2010), mainly because of the ever-increasing availability of textual
data. Regardless of their use in NLP applications, distributional data provide precious
information about words' behaviour and their tendency to appear in the same contexts. Yet,
linguists have shown little interest in DSMs (Sahlgren, 2008). We believe that this kind of
information can be relied on to empirically assess the validity of linguistic theories.
Conversely, by shedding light on underlying linguistic factors that influence distributional
behaviours, linguistic studies can contribute to improving our understanding of the results
provided by DSMs.

This paper illustrates such a qualitative linguistic approach by investigating the presence of
part-of relations among distributionally similar French words. We compare distributional data
and a set of part-of relations provided by humans in a lexical network. In order to assess the
nature of the part-of word pairs which can – or cannot – be found in DSMs, these words were
sense-tagged using WordNet supersenses. Our results show considerable discrepancies between
the representations of the different part-of sense pairs in distributional data.

2   Part-of relation and DSMs

As its name suggests, the part-of relation – or meronymy [1] – holds between a part – the
meronym – and its whole – the holonym –, as in bed/pillow, armor/steel or ostrich/feather. It
is one of the central relations used in knowledge representation.

Automatic extraction of part-of relations has been addressed using many approaches, most of
which are pattern-based (Berland and Charniak, 1999; Girju et al., 2006; Pantel and
Pennacchiotti, 2006). However, the unsupervised nature of the distributional approach makes it
an attractive alternative.

Studies were conducted to assess the nature of the semantic relations extracted by
distributional models – using human judges (Kuroda et al., 2010), thesauri (Morlane-Hondère,
2013; Ferret, 2015) or ad hoc datasets (Baroni and Lenci, 2011). They showed that part-of
relations are present in varying proportions among distributionally similar words. This very
presence is interesting in that, unlike synonymy, hypernymy or co-hyponymy, meronymy is not a
similarity relation (Resnik, 1993; Budanitsky and Hirst, 2006): an ostrich is not the same
kind of thing as a feather, nor is an armor the same kind of thing as steel. Following the
distributional hypothesis, these kinds of meronyms are not expected to share a lot of
contexts.

It appears, though, that a certain proportion of them tends to do so. For example, in Baroni
and Lenci (2010)'s DSM, player, pianist and musician are among the ten words most
distributionally similar to orchestra. In the remainder of this study, we compare the semantic
properties of the meronyms which can be extracted using a distributional approach and the
properties of the meronyms which cannot.

[1] Some authors make a distinction between part-of relation and meronymy (Cruse and Croft,
    2004).


          Copyright © by the paper’s authors. Copying permitted for private and academic purposes.
In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final
                          Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org
3   Methodology and data

3.1 The part-of dataset

The first step consists in gathering a set of meronyms. Although efforts have been made to
provide expert-built lexical semantic resources for French (Fišer and Sagot, 2008; Pradet et
al., 2014), there is currently no freely-available equivalent – in terms of quality and
coverage – to WordNet (Fellbaum, 1998) or the Moby thesaurus (Ward, 2002) for French. So, we
use the JeuxDeMots (JDM) lexical network (Lafourcade, 2007), which is built through a GWAP
(Game With A Purpose) in which players are asked to provide words which can be in a given
relation with a given word [2].

Although collaboratively-built lexical semantic resources have proven to be valuable (Gurevych
and Wolf, 2010) and although a relation in JDM must be provided by two different players to be
added to the network, a certain proportion of part-of relations in JDM are actually hypernyms
(sucette/bonbon 'lollipop/candy'), synonyms (chef/patron 'chief/boss') or thematic
associations (océanographie/eau 'oceanography/water'). Two possible explanations for these
confusions are the lack of linguistic expertise of the players or a misunderstanding of the
instructions. Erroneous relations were manually removed from the set.

One interesting characteristic of JDM part-of relations is that a considerable number of them
do not fit into traditional typologies of meronymy relations. For example, topological
inclusions (cell/prisoner), attachment relations (ear/earring) or ownership
(millionaire/money) are very common among JDM part-of pairs although they are considered to be
non-meronymic relations (Winston et al., 1987).

After filtering out the pairs whose members do not appear in our DSM and removing most of the
erroneous relations, there were 24 089 part-of pairs left in our dataset.

3.2 Sense tagging

In a previous study (Morlane-Hondère and Fabre, 2012), we manually annotated the different
meronymic sub-relations – following Winston et al. (1987)'s typology – in a dataset like the
one described above. The idea was to test whether there is a correlation between the nature of
the relation between two words and their probability of being extracted in a DSM. However, the
typology proved to be inadequate, so we chose to annotate the words instead of their relation.
This is also what we do in this study. This approach is inspired by the idea that the
difference between the meronymic sub-relations is due to the semantic nature of the words
involved (Murphy, 2003).

The above-mentioned lack of freely-available thesauri for French led us to use WordNet to
perform this task. Words of our dataset were 1) translated to English, 2) mapped to WordNet
synsets and 3) linked to their translation's supersense(s). Supersenses – or lexicographer
classes – are a set of 44 coarse semantic categories used to classify WordNet's noun, verb and
adjective entries [3]. Examples of the 25 noun supersenses include GROUP, LOCATION and FOOD.
Supersenses were then manually disambiguated (drawer can belong to both the PERSON and
ARTIFACT supersenses, but only the latter fits in the pair cabinet/drawer).

3.3 The distributional model

We use a DSM [4] generated from the frWaC corpus (Baroni et al., 2009) – a 1.6-billion-word
corpus of French web pages.

Words in the DSM appear at least 20 times in the corpus and in at least 5 different contexts.

Syntactic dependencies, extracted with the Talismane parser (Urieli, 2013), were used as
contexts. Relations taken into account in the context vectors are the subject, object and
modifier relations. Prepositions and coordinating conjunctions are also included as relations
(the label of the relation being the preposition or the coordinating conjunction).

Contexts were weighted using pointwise mutual information, and the cosine measure was used to
compute the similarity between the context vectors. The minimum similarity threshold was set
to 0.02. The total number of word pairs in the DSM is 3 674 254.

[2] http://www.jeuxdemots.org/
[3] http://wordnet.princeton.edu/man/lexnames.5WN.html
[4] Provided by Franck Sajous from the CLLE-ERSS laboratory.

4   Results and discussion

We then measure the proportion of semantically-annotated part-of pairs – sense pairs – in our
set which are present in the DSM. Sense pairs which occur fewer than 100 times in the dataset
are discarded. Table 1 provides the list of the 22 remaining sense pairs and, for each one,
the ratio of part-of pairs present in the DSM. In this section, we describe the homogeneous
sense pairs – whose semantic classes are identical – and the heterogeneous ones, then we
provide a detailed analysis of some of the PERSON/BODY meronyms which have been extracted by
the DSM.

holonym/meronym                 % in DSM
TIME/TIME                       84
LOCATION/LOCATION               78.3
SUBSTANCE/SUBSTANCE             62.4
OBJECT/OBJECT                   61
COMMUNICATION/COMMUNICATION     53.8
GROUP/PERSON                    52.8
LOCATION/ARTIFACT               46.8
BODY/BODY                       40.5
ANIMAL/ANIMAL                   41
ARTIFACT/COMMUNICATION          39.9
ACT/ARTIFACT                    35.8
ARTIFACT/PERSON                 32.6
ARTIFACT/ARTIFACT               31.4
ARTIFACT/LOCATION               24.8
ARTIFACT/PLANT                  22.8
ARTIFACT/SUBSTANCE              20.4
OBJECT/ANIMAL                   19.8
PLANT/PLANT                     19.7
GROUP/ANIMAL                    17.1
PERSON/ARTIFACT                 16.5
ANIMAL/BODY                     9.4
PERSON/BODY                     5.5

Table 1: Part-of sense pairs and their presence in the DSM.

4.1 Homogeneous sense pairs

As expected, part-of relations composed of two words of the same class are the most
represented in the DSM. 84 % of the TIME/TIME part-of pairs were extracted by the DSM. This
can be explained by the fact that the members of pairs like mois/jour 'month/day' both appear
in contexts involving temporal prepositions like venir IL Y A 'to come AGO', se dérouler
DURANT 'to take place DURING' or scrutin AVANT 'election BEFORE'.

Likewise, the spatial dimension plays a crucial role in the extraction of meronyms (78.3 % of
LOCATION/LOCATION pairs are extracted). This is due to the fact that, as for time, spatial
information can be conveyed by specific prepositions. Thus, LOCATION/LOCATION meronyms' shared
contexts massively involve the DANS 'IN' relation.

SUBSTANCE pairs are the third best-extracted kind of pairs. The reason why 37.6 % of them have
not been extracted can be illustrated by the comparison of acier 'steel' and two of its
meronyms, namely fer 'iron' – which was extracted in the DSM – and carbone 'carbon' – which
was not extracted:

  1. acier and fer both appear in contexts like grille EN 'grille COMP', forgé MOD
     'forged MOD' or lame DE 'blade COMP'. Thus, they appear as materials and, moreover, as
     materials which are used to build the same kind of things;

  2. although it is a material as well, carbone does not appear as such in the corpus. Rather,
     its contexts are chemical compounds like monoxyde DE 'monoxide COMP'. It is also modified
     by adjectives like inorganique MOD 'inorganic MOD', which describe chemical properties of
     carbone. These two kinds of contexts are not found among acier's.

So, we can see that there is a discrepancy between the contexts in which acier appears in the
corpus and the ones in which carbone appears: whereas acier – as well as fer – is used as a
material, the representation of carbone that emerges from the corpus is that of a chemical
element.

4.2 Heterogeneous sense pairs

At the other end of the scale, part-of relations composed of two words of different classes
are – also logically – the least represented in the DSM.

Part-of pairs composed of words that refer to human beings or to animals and their body parts
are barely present in the DSM (although they are the most frequent sense pairs in our
dataset). In frWaC, PERSON words appear as subjects of action verbs (prendre 'to take', dire
'to say') or cognitive verbs (vouloir 'to want', savoir 'to know'). They are frequently
modified by nationality adjectives. Body parts do not appear in such contexts. The class of
body parts was actually found to be quite heterogeneous, in that body parts' distributions in
the corpus differ from persons', but not in the same way:

  • organ nouns mostly appear in noun compounds to indicate the location of medical
    interventions (radiographie DE 'x-ray MOD') or ailments (cancer DE 'cancer COMP' or
    lésion DE 'injury COMP');

  • limb nouns are modified by adjectives related to location and are objects of verbs like
    lever 'to raise' or étendre 'to stretch'.

All these contexts are obviously incompatible with PERSON words.

A similar distributional discrepancy can be observed with the ANIMAL/BODY sense pair, except
that animal nouns tend to appear in contexts like élevage DE 'farming COMP' or espèce DE
'species COMP'. They are also modified by size adjectives. It is interesting to note that many
animal body parts like tête DE 'head COMP', peau DE 'skin COMP' or queue DE 'tail COMP' do
appear among the closest contexts of animal nouns. This means that the meronymic relation
between nouns referring to animals and their body parts is a syntagmatic rather than a
paradigmatic one. Thus, it is reasonable to say that, in order to extract this particular
relation, the use of syntagmatic patterns would be a better strategy than the use of a
paradigmatic DSM.

The sense pair GROUP/PERSON also presents an interesting situation. Of all the heterogeneous
sense pairs, meronymic relations belonging to this one are the most likely to be extracted by
the distributional method. This can be explained by a tendency to use the GROUP entities in a
metonymic way: although an army is not the same kind of thing as a soldier, both words share
contexts like tirer SUJ 'to shoot SUBJ' or tué PAR 'killed BY'. Another reason is the
transitivity of properties like nationality: armée 'army' and soldat 'soldier' are both
modified by nationality adjectives because, usually, members of the armed forces of a nation
have to be citizens of this nation.

In Section 2, we mentioned the fact that three meronyms of orchestra were present among its
ten most distributionally similar words in Baroni and Lenci (2010)'s DSM. In our data, the
pair orchestre/musicien 'orchestra/musician' has also been extracted: as for army and soldier,
these words share semantic features. They are related to the kind of music a musician and an
orchestra can play (classique MOD 'classical MOD', traditionnel MOD 'traditional MOD' or
jazz DE 'jazz MOD'), the kind of actions they perform (interprété PAR 'performed BY',
accompagné PAR 'accompanied BY') or their nationality.

4.3 Focus on the PERSON/BODY sense pair

In the previous subsection, we saw that meronyms belonging to the PERSON/BODY sense pair are
the least likely to be extracted with the distributional approach. In this subsection, we
provide further insight into this result by examining the nature of the few PERSON/BODY
meronymic pairs that were successfully extracted.

The examination of the 5.5 % of PERSON/BODY meronymic pairs that were successfully extracted
is disappointing: the vast majority of the contexts shared by the meronym and the holonym are
quite random. For example, the meronyms homme/main 'man/hand' share contexts like nu MOD
'bare MOD' or dos DE 'back COMP', which are not very informative about their relation. On the
other hand (!), some shared contexts like doigt DE 'finger COMP' and saisir SUJ 'to grab SUBJ'
are more informative. The fact that these specific features are shared by the meronyms
indicates some kind of similarity between them: when a man grabs a rock, it is actually his
hand that completes the action of grabbing, just as a man's fingers are also his hand's
fingers.

The meronyms enfant/oeil 'child/eye' also share some interesting contexts: both the meronym
and the holonym are subjects of verbs of visual perception like regarder 'to look', percevoir
'to perceive' or observer 'to observe'. The metonymic interpretation is quite straightforward:
although the eye is the child's part that allows him to look/perceive/observe, this ability is
extended to the whole child.

This phenomenon partially explains why such meronyms share semantic – thus distributional –
features and are more likely to be extracted with a DSM.

5   Conclusion

The main goal of this study is to shed light on the linguistic phenomena at work in DSMs. By
comparing a set of sense-tagged part-of relations and a distributional model, we show that the
semantic class of the meronyms has a dramatic influence on their probability of being
extracted by a DSM. We also highlight the – positive – influence of metonymy on the extraction
of heterogeneous meronyms.

These results show that the part-of relation is not a monolithic entity but a collection of
different kinds of relations between different kinds of words which may or may not be
distributionally similar.

Acknowledgments

This work was partially supported by the ANSM (French National Agency for Medicines and Health
Products Safety) through the Vigi4MED project under grant #2013S060.

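The weighting and similarity steps described in Section 3.3 can be illustrated with a minimal
sketch. It is not the code used to build the DSM: the function names, the toy counts, the
context labels such as "grille_EN" and the base-2 logarithm are assumptions made for the
example, and a word's corpus frequency is approximated by the sum of its context counts.

```python
import math
from collections import Counter
from itertools import combinations


def filter_words(cooc, min_freq=20, min_contexts=5):
    """Keep words seen at least min_freq times and in at least min_contexts contexts.

    Assumption: frequency is approximated by the sum of the word's context counts.
    """
    return {w: c for w, c in cooc.items()
            if sum(c.values()) >= min_freq and len(c) >= min_contexts}


def pmi_vectors(cooc):
    """Turn raw (word, context) counts into pointwise mutual information weights."""
    total = sum(sum(c.values()) for c in cooc.values())
    word_tot = {w: sum(c.values()) for w, c in cooc.items()}
    ctx_tot = Counter()
    for c in cooc.values():
        ctx_tot.update(c)
    return {w: {ctx: math.log2(n * total / (word_tot[w] * ctx_tot[ctx]))
                for ctx, n in c.items()}
            for w, c in cooc.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v[ctx] for ctx, x in u.items() if ctx in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def similar_pairs(vectors, threshold=0.02):
    """Return the word pairs whose cosine similarity reaches the threshold."""
    pairs = {}
    for a, b in combinations(sorted(vectors), 2):
        sim = cosine(vectors[a], vectors[b])
        if sim >= threshold:
            pairs[(a, b)] = sim
    return pairs


# Toy counts invented for illustration (not taken from frWaC).
toy = {
    "acier": Counter({"grille_EN": 2, "lame_DE": 2}),
    "fer": Counter({"grille_EN": 2, "lame_DE": 2}),
    "carbone": Counter({"monoxyde_DE": 4}),
}
# The paper's thresholds (20 occurrences, 5 contexts) are lowered for the toy data.
vectors = pmi_vectors(filter_words(toy, min_freq=4, min_contexts=1))
pairs = similar_pairs(vectors)
```

With these toy counts, acier and fer share all of their contexts and are kept as a pair,
whereas carbone shares none with acier and falls below the 0.02 threshold.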

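The per-sense-pair ratios reported in Table 1 could be computed along the following lines.
This is a sketch under stated assumptions: the function name and the data layout are
hypothetical, DSM pairs are treated as unordered because the cosine measure is symmetric, and
the toy input only reuses word pairs quoted in the text, so its percentages are not the
paper's results.

```python
from collections import defaultdict


def coverage_by_sense_pair(part_of_pairs, dsm_pairs, min_count=100):
    """Percentage of part-of pairs found in the DSM, per (holonym class, meronym class).

    part_of_pairs: iterable of (holonym, meronym, holonym_class, meronym_class).
    dsm_pairs: set of frozensets, one per unordered word pair present in the DSM.
    Sense pairs with fewer than min_count part-of pairs are discarded.
    """
    totals = defaultdict(int)
    found = defaultdict(int)
    for holonym, meronym, hol_class, mer_class in part_of_pairs:
        key = (hol_class, mer_class)
        totals[key] += 1
        if frozenset((holonym, meronym)) in dsm_pairs:
            found[key] += 1
    return {key: round(100 * found[key] / n, 1)
            for key, n in totals.items() if n >= min_count}


# Toy input for illustration only.
part_of = [
    ("acier", "fer", "SUBSTANCE", "SUBSTANCE"),
    ("acier", "carbone", "SUBSTANCE", "SUBSTANCE"),
    ("armée", "soldat", "GROUP", "PERSON"),
    ("orchestre", "musicien", "GROUP", "PERSON"),
    ("mois", "jour", "TIME", "TIME"),
]
dsm = {
    frozenset(("acier", "fer")),
    frozenset(("armée", "soldat")),
    frozenset(("orchestre", "musicien")),
    frozenset(("mois", "jour")),
}
# min_count is lowered from the paper's 100 so that the toy data is not all discarded.
coverage = coverage_by_sense_pair(part_of, dsm, min_count=2)
```

In this toy run the TIME/TIME sense pair is dropped because it has a single part-of pair, which
mirrors the effect of the 100-occurrence cut-off on rare sense pairs.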
References

Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. The WaCky wide web: a
  collection of very large linguistically processed web-crawled corpora. Language Resources
  and Evaluation, 43(3):209–226, 2009.

Marco Baroni and Alessandro Lenci. Distributional memory: A general framework for corpus-based
  semantics. Computational Linguistics, 36(4):673–721, 2010.

Marco Baroni and Alessandro Lenci. How we BLESSed distributional semantic evaluation. GEMS
  2011, pages 1–10, 2011.

Matthew Berland and Eugene Charniak. Finding parts in very large corpora. In Proceedings of
  the 37th Annual Meeting of the Association for Computational Linguistics on Computational
  Linguistics, ACL '99, pages 57–64, Stroudsburg, PA, USA, 1999. Association for Computational
  Linguistics.

Alexander Budanitsky and Graeme Hirst. Evaluating WordNet-based Measures of Lexical Semantic
  Relatedness. Computational Linguistics, 32(1):13–47, March 2006.

D. Alan Cruse and William Croft. Cognitive linguistics. Cambridge: Cambridge University Press,
  2004.

Christiane Fellbaum, editor. WordNet: An Electronic Lexical Database. The MIT Press,
  Cambridge, MA; London, May 1998.

Olivier Ferret. Typing relations in distributional thesauri. In Núria Gala, Reinhard Rapp, and
  Gemma Bel-Enguix, editors, Language Production, Cognition, and the Lexicon, volume 48 of
  Text, Speech and Language Technology, pages 113–134. Springer International Publishing,
  2015.

Darja Fišer and Benoît Sagot. Combining multiple resources to build reliable wordnets. In TSD
  2008 - Text Speech and Dialogue, Brno, Czech Republic, 2008.

Roxana Girju, Adriana Badulescu, and Dan Moldovan. Automatic discovery of part-whole
  relations. Computational Linguistics, 32(1):83–135, March 2006.

Iryna Gurevych and Elisabeth Wolf. Expert-Built and Collaboratively Constructed Lexical
  Semantic Resources. Language and Linguistics Compass, 11(4):1074–1090, 2010.

Zellig Harris. Distributional structure. Word, 10(23):146–162, 1954.

Kow Kuroda, Jun'ichi Kazama, and Kentaro Torisawa. A look inside the distributionally similar
  terms. In Proceedings of the Second Workshop on NLP Challenges in the Information Explosion
  Era (NLPIX 2010), pages 40–49, Beijing, China, August 2010. Coling 2010 Organizing
  Committee.

Mathieu Lafourcade. Making people play for Lexical Acquisition with the JeuxDeMots prototype.
  In SNLP'07: 7th International Symposium on Natural Language Processing, page 7, Pattaya,
  Chonburi, Thailand, December 2007.

François Morlane-Hondère. Une approche linguistique de l'évaluation des ressources extraites
  par analyse distributionnelle automatique. PhD thesis, Université de Toulouse II le Mirail,
  2013.

François Morlane-Hondère and Cécile Fabre. Étude des manifestations de la relation de
  méronymie dans une ressource distributionnelle. In Proceedings of TALN 2012, Grenoble,
  France, June 2012.

Lynne Murphy. Semantic Relations and the Lexicon. Cambridge University Press, New York, 2003.

Patrick Pantel and Marco Pennacchiotti. Espresso: Leveraging generic patterns for
  automatically harvesting semantic relations. In Proceedings of the 21st International
  Conference on Computational Linguistics and the 44th Annual Meeting of the Association for
  Computational Linguistics, ACL-44, pages 113–120, Stroudsburg, PA, USA, 2006. Association
  for Computational Linguistics.

Quentin Pradet, Gaël de Chalendar and Jeanne Baguenier Desormeaux. WoNeF, an improved,
  expanded and evaluated automatic French translation of WordNet. In GWC 2014, Tartu, Estonia,
  2014.

Philip Resnik. Selection and Information: a Class-Based Approach to Lexical Relationships. PhD
  thesis, The Institute For Research In Cognitive Science, University of Pennsylvania, 1993.

Magnus Sahlgren. The distributional hypothesis. Rivista di Linguistica, 20(1):33–53, 2008.

Assaf Urieli. Robust French syntax analysis: reconciling statistical methods and linguistic
  knowledge in the Talismane toolkit. PhD thesis, Université de Toulouse II le Mirail, 2013.

Grady Ward. Moby Thesaurus List (English). 2002.

Julie Weeds. Measures and Applications of Lexical Distributional Similarity. PhD thesis,
  Department of Informatics, University of Sussex, 2003.

M. E. Winston, R. Chaffin, and D. Herrmann. A taxonomy of part-whole relations. Cognitive
  Science, 11(4):417–444, December 1987.



</pre>