Taming Sense Sparsity: a Common-Sense Approach
Antonio Lieto, Enrico Mensa and Daniele P. Radicioni
Dipartimento di Informatica
Università degli Studi di Torino
Corso Svizzera 185, 10149 – Torino ITALY
{lieto,mensa,radicion}@di.unito.it


Abstract

English. We present a novel algorithm and a linguistic resource named ClOSeSt after ‘Common SEnse STrainer’. The resource contains a list of the main senses associated with a given term, and it was obtained by applying a simple set of pruning heuristics to the senses provided in the NASARI vectors for the set of 15K most frequent English terms. The preliminary experimentation provided encouraging results.

Italiano (English translation). In this work we present an algorithm and a linguistic resource, ClOSeSt, which contains the most relevant senses for the 15K most frequent terms of the English dictionary. The implemented algorithm uses an existing resource that encodes encyclopedic knowledge, and relies on the notion of common sense to filter the possible senses associated with each term. The preliminary evaluation provided encouraging results regarding the quality of the extracted senses.

1   Introduction

Many NLP tasks involve word sense disambiguation (WSD) and word sense induction (WSI), and require using lexical resources such as WordNet (Miller, 1995) and BabelNet (Navigli and Ponzetto, 2010) that provide a rich mapping of terms (or word forms) onto the corresponding senses (word meanings). These widely used resources in fact provide subtle distinctions between the possible senses of a term. It is largely acknowledged that while fine-grained sense distinctions are necessary for some precise tasks (such as machine translation), for other sorts of applications (such as text categorization and information extraction) coarse-grained sense inventories are preferable (Palmer et al., 2004). In these cases, fine-grained distinctions may be unnecessary and even detrimental to WSD and WSI, so that in the last few years many efforts concentrated on clustering senses. Most works focused on producing coarser-grained sense inventories, with the aim of grouping together the closest (partially overlapping) senses of a word; to this end, various techniques have been proposed, which are briefly surveyed in Section 2.

Unlike existing approaches, we propose a simple yet effective method that relies on recently developed resources that are assumed to also grasp common-sense knowledge (Camacho-Collados et al., 2015; Lieto et al., 2016a), which is assumed to be both widely accessible and elementary knowledge (Minsky, 2000), and to reflect typicality traits encoded as prototypical knowledge (Rosch, 1975). The research question presently addressed is thus: To what extent can we identify a few principal (common-sense) senses for a term, and to what extent is it possible to approximate human performance? Although it is known that even human annotators provide quite different responses when annotating text with senses (Palmer et al., 2004), we presently explore the hypothesis that wide-coverage resources are sufficient to identify the main senses associated with English terms.

2   Related Work

In order to attain coarse-grained senses, different approaches have been proposed, based on some sort of semantic underspecification (Buitelaar, 2000; Ng et al., 2003; Palmer et al., 2007), on existing dictionaries and on exploiting hand-crafted sense hierarchies (Navigli, 2006), on syntactic and semantic properties (such as selectional restrictions on verb arguments) (Artale et al., 1998; Palmer et al., 2004), on linguistically motivated heuristics (Mihalcea and Moldovan,
2001), or on distributional similarity among word senses (Agirre and De Lacalle, 2003). Further approaches have been proposed that rely on an adjustable nearest neighbour schema for clustering senses according to the sense granularity actually required by the application at hand (McCarthy, 2006). A popular testbed for experimenting with these and other approaches is represented by the sense-annotated corpora Senseval-2 and 3 (Edmonds and Cotton, 2001; Mihalcea and Edmonds, 2004).

The problem of annotating a term with the appropriate sense is a challenging one, to such an extent that by no means “two lexicographers working independently are guaranteed to derive the same set of distinctions for a given word” (Palmer et al., 2004). It has been argued that this issue can be overcome to some extent by adopting a more flexible annotation schema, where senses are described in a graded fashion: in this way, the applicability of a sense can be assessed on an ordinal scale, rather than in ‘crisp’ fashion. This sort of annotation would allow a better interpretation of human annotations, in particular for coarse-grained groups (Erk et al., 2013). A related and complementary issue is that of clusterability, which measures to what extent word meanings can be partitioned. In this setting, whereas highly clusterable lemmas can be grouped based on traditional clustering techniques, less clusterable lemmas require more sophisticated soft-clustering algorithms from computational systems, and more time and expertise from human annotators (McCarthy et al., 2016).

This work is framed in the context of a long-term project aimed at investigating conceptual categorization (Lieto et al., 2015; Lieto et al., 2016b) based on a hybrid strategy (Evans and Frankish, 2009) complementing formal ontologies with the geometrical framework of Conceptual Spaces (CS) (Gärdenfors, 2014). In particular, we are building a knowledge base to collect conceptual information encoded in a CS-based representational format to provide a uniform interface between the linguistic and the conceptual level, where CS representations are fully endowed with BabelNet identifiers (Lieto et al., 2016a).¹ This trait will make it possible to link the present work to existing initiatives like Senso Comune (Oltramari and Vetere, 2008; Chiari et al., 2010), which provides about 2000 fundamental Italian terms (De Mauro, 1999) with an ontological description.

¹ The integration of different semantic models such as CSs and the distributional semantics underlying NASARI is still an open issue; we provided an initial solution to this problem in (Lieto et al., 2016a).

3   The ClOSeSt Algorithm

The rationale underlying the ClOSeSt algorithm is that the main (most frequent) senses have gained more room than marginal senses in our lexical and conceptual system and, in general, in our utterances. This phenomenon determines the availability and saliency of words and phrases (Vossen and Fellbaum, 2009), which are arguably grasped by encyclopedic resources as well. Herein, more central senses are typically characterized by richer (i.e., longer vectors) and less specific information, richer semantic connections with other concepts, and heavier feature weights. Although it may happen that some sense spans over (or even subsumes) another one, we are not primarily trying to cluster senses in agglomerative fashion, e.g., by resorting to some superclass of the considered concept; rather, we select the most relevant ones (a term is seldom associated with more than a few, say three or four, senses) and we discard the other ones.

The ClOSeSt algorithm takes a term $t$ as input and provides a set of possibly related senses.² The algorithm first retrieves the set of senses $S = \{s_1, s_2, \ldots, s_n\}$ that are possibly associated with $t$: such a set is obtained by directly querying NASARI. The output of the algorithm is a result set $S' \subseteq S$. In order to attain $S'$ we devised a process of incremental filtering, which is arranged into two main phases:

    1. LS-Pruning. Pruning of less salient senses: senses with poor associated information are eliminated. Sense salience is determined both in absolute terms and in relation to the most salient sense.
    2. OL-Pruning. Pruning of overlapping senses: if senses with significant overlap are found, the less salient sense is pruned.

² The present investigation is restricted to nouns, but no theoretical limitation prevents us from extending the approach to verbs and adjectives.

Senses are represented as NASARI vectors, which are the vectorial counterpart of BabelNet synsets; concepts (basically, WordNet synsets and Wikipedia pages) are described through vector representations, whose features are synset IDs themselves. Feature weights are computed
through the metrics of lexical specificity, by exploiting a semantics-based dimensionality reduction (Camacho-Collados et al., 2015). Each sense is associated with exactly one NASARI vector, so that pruning a sense amounts to pruning a vector.

LS-Pruning. To analyze the senses in $S$, we inspect each vector $\vec{v}_t^s$ related to sense $s$ for the term $t$. The first pruning occurs when not enough information is found, that is, when $\vec{v}_t^s$ contains fewer than a fixed number of elements (Table 1). Then, in order to determine the next vectors to be pruned, we compute the weight of each vector ($W(\vec{v}_t^s)$), the longest vector and the heaviest one among those associated with $t$ ($L(\vec{v}_t)$ and $H(\vec{v}_t)$, respectively). The weight of a NASARI vector $W(\vec{v}_t^s)$ is computed by averaging the weight of the features (i.e., the synsets) contained therein. The definitions for these measures are illustrated in Equations 1–3.

    L(\vec{v}_t) = \arg\max_{s \in S} \; \mathrm{len}(\vec{v}_t^s)                        (1)

    W(\vec{v}_t^s) = \frac{1}{\mathrm{len}(\vec{v}_t^s)} \cdot \sum_{j} w_j^s             (2)

    H(\vec{v}_t) = \arg\max_{s \in S} \; W(\vec{v}_t^s)                                   (3)

The decision on whether or not to prune a vector is based on a simple criterion: $\vec{v}_t^s \in S$ is pruned if both its length is below a given fraction of the length of the longest one, $L(\vec{v}_t)$, and its weight is lower than a given fraction of the weight of the heaviest one, $H(\vec{v}_t)$. The parameter settings adopted in the present work are illustrated in Table 1.

OL-Pruning. The second phase of the algorithm aims at detecting overlapping senses. The overlap between vectors that survived the LS-Pruning is computed thanks to the information provided in NASARI. The heuristic used in this phase is as follows: the overlap between two vectors $Ovl(\vec{v}_t^i, \vec{v}_t^j)$ is computed as a fraction of the length of the shorter of the two vectors considered, as indicated in Equation 4.

    Ovl(\vec{v}_t^i, \vec{v}_t^j) = \frac{\vec{v}_t^i \cap \vec{v}_t^j}{\mathrm{len}(\mathrm{shortest}(\vec{v}_t^i, \vec{v}_t^j))}    (4)

The overlap is checked for every pair $\langle \vec{v}_i, \vec{v}_j \rangle$ (with $i \neq j$), and when an overlap higher than a fixed threshold is detected (see Table 1), the shorter of the two vectors is pruned.

At the end of this phase, we have the set $S'$ where only the most salient vectors survived and where, among overlapping vectors, the most salient one has been retained.

3.1   Building the ClOSeSt resource

Overall the system handled about 2.69M NASARI vectors. Some 207K vectors associated with Named Entities were discarded, as not directly related to common-sense concepts; the remaining vectors contained overall 6.9M unique words.

The top (most frequent) 15K nouns were extracted from the Corpus of Contemporary American English (COCA), which has been built from composite and balanced sources, including spoken, fiction, magazine, newspaper and academic text.³ Over 6K terms were discarded, since they are associated in NASARI either with 1 sense (about 1K terms) or with no sense at all (over 5K terms), which reduced the input size to about 8.7K terms; overall 32.6K senses were retrieved for such input terms (on average, 3.7 senses per term).

³ http://corpus.byu.edu/full-text/.

The figures for the processing phases are reported in Table 1: over 4K senses were filtered in the first step of the LS-Pruning phase, based on the length of the vector $\vec{v}_t^s$, and 7.4K senses were further discarded in the second step. Finally, in the OL-Pruning phase, 5.6K vectors were removed on the basis of overlap, thus overall yielding 17.5K deleted and 15.1K surviving vectors.⁴ The polysemy rate was reduced from the 3.74 senses per term initially found in NASARI down to 1.73 senses per term, which is in line with the degree of polysemy detected in the Collins English Dictionary for English nouns by WordNet authors (Fellbaum, 1990).

⁴ ClOSeSt is available at http://goo.gl/7B61Oz.

4   Evaluation

A preliminary experimentation has been devised to assess the correctness and completeness of the extracted senses: that is, the question addressed was whether i) all senses extracted for the input term are salient (and actually judged as the main senses), and ii) all the relevant senses were preserved in ClOSeSt. To these ends, 15 volunteers were recruited and interviewed through an on-line questionnaire to evaluate, on a human common-sense judgement basis, the set of senses extracted by the system for 20 terms.
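The two pruning phases of Section 3 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes each sense vector is a plain dictionary mapping feature IDs to weights, it uses the threshold values listed in Table 1, and the names `ls_pruning`, `ol_pruning`, `closest` and the `toy` data are hypothetical. Two details that the text leaves open are fixed here by assumption: the longest and heaviest vectors are taken among the survivors of the first step, and on equal length the later vector of a pair is pruned.

```python
from itertools import combinations

# Threshold values as listed in Table 1.
ALPHA, BETA, GAMMA, DELTA = 5, 0.40, 0.40, 0.20


def weight(vec):
    """W(v): average feature weight of a vector (Equation 2)."""
    return sum(vec.values()) / len(vec)


def overlap(u, v):
    """Ovl(u, v): shared features over the length of the shorter vector (Equation 4)."""
    return len(u.keys() & v.keys()) / min(len(u), len(v))


def ls_pruning(senses):
    # Step 1: drop vectors carrying too little information (len <= alpha).
    kept = {s: v for s, v in senses.items() if len(v) > ALPHA}
    if not kept:
        return kept
    # Step 2: drop vectors that are both short and light with respect to the
    # longest and heaviest vectors (assumption: taken among step-1 survivors).
    longest = max(len(v) for v in kept.values())
    heaviest = max(weight(v) for v in kept.values())
    return {s: v for s, v in kept.items()
            if not (len(v) / longest < BETA and weight(v) / heaviest < GAMMA)}


def ol_pruning(senses):
    pruned = set()
    for a, b in combinations(senses, 2):
        if a in pruned or b in pruned:
            continue  # assumption: an already pruned vector prunes nothing else
        if overlap(senses[a], senses[b]) >= DELTA:
            # Prune the shorter vector; on equal length, the later one (assumption).
            pruned.add(a if len(senses[a]) < len(senses[b]) else b)
    return {s: v for s, v in senses.items() if s not in pruned}


def closest(senses):
    """Apply LS-Pruning, then OL-Pruning, to the senses of one term."""
    return ol_pruning(ls_pruning(senses))


# Made-up vectors for a single term (not NASARI data).
toy = {
    "main":  {f"f{i}": 1.0 for i in range(20)},  # long and heavy
    "tiny":  {f"g{i}": 1.0 for i in range(3)},   # too short: removed in step 1
    "weak":  {f"h{i}": 0.1 for i in range(6)},   # short and light: removed in step 2
    "dup":   {f"f{i}": 1.0 for i in range(9)},   # fully overlaps with "main"
    "other": {f"k{i}": 0.9 for i in range(10)},  # distinct sense
}
print(sorted(closest(toy)))
```

In the toy data only "main" and "other" survive: "tiny" and "weak" fall in the two LS-Pruning steps, and "dup" is removed by OL-Pruning because it is the shorter member of an overlapping pair.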
                condition                                                                                         threshold values   pruned senses   pruning phase
prune $\vec{v}_t^s$ IF
                $\mathrm{len}(\vec{v}_t^s) \leq \alpha$                                                           $\alpha = 5$             4,389      LS-Pruning
                $\mathrm{len}(\vec{v}_t^s) / L(\vec{v}_t) < \beta$ AND $W(\vec{v}_t^s) / W(H(\vec{v}_t)) < \gamma$   $\beta, \gamma = .40$    7,460      LS-Pruning
                $Ovl(\vec{v}_t^s, \vec{v}_t^u) \geq \delta$                                                       $\delta = .20$           5,676      OL-Pruning

                filtered out senses   17,525
                retained senses       15,134

   Table 1: Pruning of senses in the three steps, along with the number of senses pruned at each step.
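The counts in Table 1 can be cross-checked against the approximate figures quoted in Section 3.1 with a little arithmetic. The sketch below is only a consistency check; the number of input terms is given in the text only as "about 8.7K", so the value `approx_terms = 8700` is an assumption and the resulting polysemy rates are approximate.

```python
# Exact counts taken from Table 1.
pruned_per_step = {
    "LS-Pruning (length)": 4389,
    "LS-Pruning (relative length and weight)": 7460,
    "OL-Pruning": 5676,
}
retained = 15134

# Assumption: rounded number of input terms ("about 8.7K" in Section 3.1).
approx_terms = 8700

filtered_out = sum(pruned_per_step.values())
total_senses = filtered_out + retained
rate_before = total_senses / approx_terms
rate_after = retained / approx_terms

print(filtered_out, total_senses)
print(round(rate_before, 2), round(rate_after, 2))
```

The three pruning counts add up to the 17,525 filtered senses, and together with the 15,134 retained senses they give 32,659 senses overall, matching the "32.6K" of the text. With the rounded term count the polysemy rates come out within about 0.02 of the reported 3.74 and 1.73 senses per term.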


Stimuli. The list of 20 terms was algorithmically selected from the aforementioned COCA corpus (see footnote 3) by picking the terms at index 1, 51, 101, and so forth. In this way we selected highly frequent terms that are expected to be part of common sense for those who participated in our experimentation.⁵

⁵ The full list of the considered terms includes: time, side, education, type, officer, ability, network, shoulder, threat, investigation, gold, claim, learning, session, aid, emergency, bowl, pepper, milk, resistance. The printed version of the on-line questionnaire is available at the URL http://goo.gl/w9TNQT.

Experimental design and procedure. The participants were asked a) to assess each and every sense extracted by the system and associated with each input term by indicating whether it was acceptable as one of the principal senses for the term at hand. Additionally, they were requested b) to indicate any further sense they deemed essential in order to complete the common-sense pool of senses for the given term.

Results. Overall 42 senses (corresponding to the 20 mentioned terms) were assessed through the experimentation: each sense was rated 15 times, thus resulting in 630 judgements. 24% of senses were not found appropriate, according to a common-sense judgement, thereby determining a 76% accuracy as regards question a). However, if we consider senses refused by at least 10 participants, only 5 senses were refused (12%), which actually correspond to very specific senses (e.g., the sense ‘Net (textile)’ for the term ‘network’; ‘Session (Presbyterianism)’ and ‘session house’ for the term ‘session’).

As regards question b), results are more difficult to interpret, due to the sparsity of the answers: out of the 59 added senses, only in 8 cases was the added sense indicated by two or three participants (and never more). In such cases it emerged, for example, that the sense ‘manners’ was relevant (and missing in the ClOSeSt resource) for the input term ‘education’; the sense ‘social network’ is relevant for the term ‘network’; and ‘meeting’ for ‘session’.

However, although encouraging results emerged from the experimentation, further experiments are needed to assess the ClOSeSt resource in a more extensive and principled way, also in consideration of the many factors that were presently neglected, such as the age, education, occupation and native language of the participants.

5   Conclusions

In this paper we have illustrated the ClOSeSt algorithm to extract the most salient (under the common-sense perspective) senses associated with a given term; also, we have introduced the ClOSeSt resource, which has been built by starting from the 15K top frequency English terms. The resource currently provides senses in a flat manner, but, if required, senses can be organized in a sorted fashion by extending the metrics used for filtering. Our work relies on a recently developed resource, NASARI, that is multilingual in nature.⁶ Consequently, unlike most previous approaches, ClOSeSt can be linked to various existing resources aimed at grasping common sense to complete the ideal chain connecting lexicon, semantics and formal (ontological) description. The experimentation revealed a reasonable agreement with human responses, and pointed out some difficulties in fully assessing this sort of resource. These issues, along with improvements to the heuristics implemented by the algorithm and a different evaluation based on a shared NLP task, will be addressed in future work.

⁶ An interesting question may be raised on this point, about the conceptual alignment in an inter-linguistic perspective, which is a well-known issue, e.g., for applications in the legal field (Ajani et al., 2010).

References

Eneko Agirre and Oier Lopez De Lacalle. 2003. Clustering WordNet Word Senses. In RANLP, volume
  260, pages 121–130.
Gianmaria Ajani, Guido Boella, Leonardo Lesmo, Marco Martin, Alessandro Mazzei, Daniele P Radicioni,
  and Piercarlo Rossi. 2010. Multilevel legal ontologies. In Semantic Processing of Legal Texts,
  pages 136–154. Springer.
Alessandro Artale, Anna Goy, Bernardo Magnini, Emanuele Pianta, and Carlo Strapparava. 1998.
  Coping with WordNet Sense Proliferation. In First International Conference on Language
  Resources & Evaluation.
Paul Buitelaar. 2000. Reducing Lexical Semantic Complexity with Systematic Polysemous Classes
  and Underspecification. In NAACL-ANLP 2000 Workshop: Syntactic and Semantic Complexity in
  Natural Language Processing Systems, pages 14–19. Association for Computational Linguistics.
José Camacho-Collados, Mohammad Taher Pilehvar, and Roberto Navigli. 2015. NASARI: a Novel
  Approach to a Semantically-Aware Representation of Items. In Proceedings of NAACL,
  pages 567–577.
Isabella Chiari, Alessandro Oltramari, and Guido Vetere. 2010. Di Cosa Parliamo quando Parliamo
  Fondamentale? Lessemi, Accezioni, Sensi e Ontologie. In Lessico e Lessicologia. Atti del
  Convegno della Società di Linguistica Italiana, pages 177–194, Roma, September. Bulzoni.
Tullio De Mauro. 1999. Grande Dizionario Italiano dell’Uso. UTET, Turin, Italy.
Philip Edmonds and Scott Cotton. 2001. SENSEVAL-2: Overview. In Proceedings of SENSEVAL-2
  Second International Workshop on Evaluating Word Sense Disambiguation Systems, pages 1–5,
  Toulouse, France, July. Association for Computational Linguistics.
Katrin Erk, Diana McCarthy, and Nicholas Gaylord. 2013. Measuring word meaning in context.
  Computational Linguistics, 39(3):511–554.
Jonathan St BT Evans and Keith Ed Frankish. 2009. In Two Minds: Dual Processes and Beyond.
  Oxford University Press.
Christiane Fellbaum. 1990. English Verbs as a Semantic Net. International Journal of
  Lexicography, 3(4):278–301.
Peter Gärdenfors. 2014. The Geometry of Meaning: Semantics Based on Conceptual Spaces.
  MIT Press.
Antonio Lieto, Daniele P. Radicioni, and Valentina Rho. 2015. A Common-Sense Conceptual
  Categorization System Integrating Heterogeneous Proxytypes and the Dual Process of Reasoning.
  In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI),
  pages 875–881, Buenos Aires, July. AAAI Press.
Antonio Lieto, Enrico Mensa, and Daniele P. Radicioni. 2016a. A Resource-Driven Approach for
  Anchoring Linguistic Resources to Conceptual Spaces. In Proceedings of the 15th International
  Conference of the Italian Association for Artificial Intelligence, Genoa, Italy, December.
  Springer.
Antonio Lieto, Daniele P Radicioni, and Valentina Rho. 2016b. Dual PECCS: a Cognitive System
  for Conceptual Representation and Categorization. Journal of Experimental & Theoretical
  Artificial Intelligence, pages 1–20.
Diana McCarthy, Marianna Apidianaki, and Katrin Erk. 2016. Word Sense Clustering and
  Clusterability. Computational Linguistics.
Diana McCarthy. 2006. Relating WordNet Senses for Word Sense Disambiguation. Making Sense of
  Sense: Bringing Psycholinguistics and Computational Linguistics Together, 17.
Rada Mihalcea and Phil Edmonds. 2004. SENSEVAL-3: Overview. In Proceedings Senseval-3 3rd
  International Workshop on Evaluating Word Sense Disambiguation Systems. ACL, Barcelona, Spain.
Rada Mihalcea and Dan I Moldovan. 2001. Automatic Generation of a Coarse Grained WordNet. In
  Proceedings of the NAACL Workshop on WordNet and Other Lexical Resources.
George A Miller. 1995. WordNet: a Lexical Database for English. Communications of the ACM,
  38(11):39–41.
Marvin Minsky. 2000. Commonsense-based interfaces. Communications of the ACM, 43(8):66–73.
Roberto Navigli and Simone Paolo Ponzetto. 2010. BabelNet: Building a Very Large Multilingual
  Semantic Network. In Proceedings of the 48th Annual Meeting of the Association for
  Computational Linguistics, pages 216–225. Association for Computational Linguistics.
Roberto Navigli. 2006. Meaningful Clustering of Senses Helps Boost Word Sense Disambiguation
  Performance. In Proceedings of the 21st International Conference on Computational Linguistics
  and the 44th annual meeting of the Association for Computational Linguistics, pages 105–112.
  Association for Computational Linguistics.
Hwee Tou Ng, Bin Wang, and Yee Seng Chan. 2003. Exploiting Parallel Texts for Word Sense
  Disambiguation: An Empirical Study. In Proceedings of the 41st Annual Meeting on Association
  for Computational Linguistics-Volume 1, pages 455–462. Association for Computational
  Linguistics.
Alessandro Oltramari and Guido Vetere. 2008. Lexi-
  con and Ontology Interplay in Senso Comune. On-
  toLex 2008 Programme, page 24.
Martha Palmer, Olga Babko-Malaya, and Hoa Trang
 Dang. 2004. Different Sense Granularities for Dif-
 ferent Applications. In Proceedings of Workshop on
 Scalable Natural Language Understanding.
Martha Palmer, Hoa Trang Dang, and Christiane Fell-
 baum. 2007. Making Fine-Grained and Coarse-
 Grained Sense Distinctions, both Manually and
 Automatically. Natural Language Engineering,
 13(02):137–163.
Eleanor Rosch. 1975. Cognitive Representations of
  Semantic Categories. Journal of Experimental Psy-
  chology: General, 104(3):192–233.
Piek Vossen and Christiane Fellbaum, 2009. Multi-
   lingual FrameNets in Computational Lexicography:
   Methods and Applications, chapter Universals and
   idiosyncrasies in multilingual WordNets. Trends in
   linguistics / Studies and monographs: Studies and
   monographs. Mouton de Gruyter.