Using a distributional neighbourhood graph to enrich semantic frames in the field of the environment

Gabriel Bernier-Colborne
Marie-Claude L'Homme
Observatoire de linguistique Sens-Texte (OLST)
Université de Montréal
C.P. 6128, succ. Centre-Ville
Montréal (QC) Canada, H3C 3J7
{gabriel.bernier-colborne|mc.lhomme}@umontreal.ca

Abstract

This paper presents a semi-automatic method for identifying terms that evoke semantic frames (Fillmore, 1982). The method is tested as a means of identifying lexical units that can be added to existing frames or to new, related frames, using a large corpus on the environment. It is hypothesized that a method based on distributional semantics, which exploits the assumption that words that appear in similar contexts have similar meanings, can help unveil lexical units that evoke the same frame or related frames. The method employs a distributional neighbourhood graph, in which each word is connected to its nearest neighbours according to a distributional semantic model. Results show that most lexical units identified using this method can in fact be assigned to frames related to the field of the environment.

1 Introduction [1]

Recent work has shown that Frame Semantics (Fillmore, 1982; Fillmore and Baker, 2010) is an extremely useful framework for accounting for the lexical structure of specialized fields of knowledge (Dolbey et al., 2006; Faber et al., 2006; Schmidt, 2009; L'Homme et al., 2014). It is especially attractive in terminology since it provides an apparatus for connecting the linguistic properties of terms to a more abstract conceptual representation level.

[1] The work reported in this paper is carried out within a larger project entitled "Understanding the environment linguistically and textually", whose objective is to develop methods for characterizing the contents of texts on two different levels: 1. textual (using methods and techniques derived from corpus linguistics and text mining); and 2. linguistic (based on lexical semantic models).

Frame Semantics has proved especially useful for representing predicative units (verbs such as deforest, recycle, warm; predicative nouns such as impact, pollution, salinization; adjectives such as clean, green, sustainable), units that are often ignored in terminological resources. L'Homme et al. (2014) showed that the framework, and more specifically the methodology devised within the FrameNet Project (Ruppenhofer et al., 2010), could be used to represent various lexico-semantic properties of predicative terms (in English and in French). L'Homme and Robichaud (2014) showed that frames could be connected via a series of relations, and that these relations help us understand how terms are used to express environmental knowledge. However, as will be seen below, the work that led to the definition of the frames and frame relations mentioned above was done manually and turns out to be quite time-consuming. In this paper, we explore the potential of a semi-automatic, graph-based method for discovering frame-relevant lexical units based on corpus evidence.

This paper is structured as follows. Section 2 explains how semantic frames help reveal part of the lexical structure of a specialized field of knowledge. Section 3 describes the graph-based method used to identify frame-relevant lexical units. Section 4 discusses how the model used in the manual evaluation of this method was selected. Section 5 presents the evaluation methodology and the results of the evaluation.

2 Frame Semantics applied to the field of the environment
In a specialized field such as the environment, many concepts correspond to processes, events and properties, which are typically expressed linguistically by predicative terms (verbs, predicative nouns and adjectives). However, traditional terminological models (and even less traditional ones, such as ontologies) are not properly equipped to describe the terms that denote these concepts and to account for their specific linguistic properties, namely the fact that they require arguments (X changes Y; impact of X on Y). Frame Semantics (Fillmore, 1982; Fillmore and Baker, 2010) presents itself as a suitable alternative to these models, since it is designed to connect linguistic properties to an abstract conceptual structure. In addition, it is well equipped to represent predicative lexical units and their argument structure.

2.1 Discovering frames in the field of the environment

L'Homme et al. (2014) describe a method to discover semantic frames based on an existing terminological resource called DiCoEnviro [2], which contains English and French terms related to the field of the environment. Each entry in DiCoEnviro is devoted to a lexical unit (LU), i.e. a lexical item that conveys a specific meaning, and states the argument structure of the LU, as in the following examples:

• warm1a, vi: climate[Patient] warms
• warm1b, vt: gas[Agent] or change[Cause] warms climate[Patient]
• warm, adj.: warm climate[Patient]

[2] See http://olst.ling.umontreal.ca/cgi-bin/dicoenviro/search_enviro.cgi.

Argument structures state the number of obligatory participants, and two different systems are used to label them: the first accounts for the semantic roles of the arguments (Agent, Patient, Cause); the second gives a typical term, i.e. a term that is representative of what can appear in that position.

Many entries – especially entries that describe predicative terms – come with annotated contexts that show how arguments [3] are realized in sentences extracted from an environmental corpus. For example, annotated contexts for warm1b are shown in Table 1.

[3] Non-obligatory participants are also annotated, as shown in the last sentence in Table 1, in which the phrase since 1750, which expresses Time, is annotated.

The primary radiative effect of CO2 and water vapour[CAUSE] is to WARM the surface climate[PATIENT] but cool the stratosphere.

As increases in other greenhouse gases[CAUSE] WARM the atmosphere and surface[PATIENT], the amount of water vapour also increases, amplifying the initial warming effect of the other greenhouse gases.

The simulations of this assessment report (for example, Figure 5) indicate that the estimated net effect of these perturbations[CAUSE] is to HAVE WARMED the global climate[PATIENT] since 1750[TIME].

Table 1: Annotated contexts for warm1b

Argument structures and annotations were used to discover frames using two different methods. A semantic frame is a knowledge structure that represents specific situations (e.g. a teaching situation, a selling situation, a driving situation). A frame includes participants (called frame elements or FEs), some of which are obligatory (core FEs) and some of which are optional (non-core FEs). For instance, the Operate vehicle frame describes a situation in which a Vehicle is set in motion by a Driver and includes the following core FEs: Area, Driver, Goal, Path, Source, and Vehicle. Lexical units such as cycle, cruise, drive, pedal, and ride evoke this frame (FrameNet, 2015). In this previous work, it was assumed that terms that share similarities with regard to their argument structures (number and semantic roles of arguments), and with regard to the non-obligatory participants annotated in contexts, are likely to evoke the same frame.
The first method consisted in comparing the argument structures and non-obligatory participants of terms already encoded in the terminological resource. This method shows that the verbs cool1a and warm1a and the nouns cooling1 and warming1 share many features. They all have a single argument (a Patient) and share some non-obligatory participants (Degree, Duration, Location).

The second method – which was applied only to the English terms – consisted in comparing the contents of the terminological resource to that of FrameNet [4]. Relevant data were extracted from the FrameNet database for terms that were recorded in the terminological resource, as shown in Figure 1. This figure shows an example in which a correspondence between FrameNet and the terminological database could be established. However, in many instances, matches could not be made as nicely. In various cases, specific frames needed to be defined for the environmental terms (for instance, a new frame was created to capture adjectives such as clean, environmental and green, whose meaning can be loosely described as "that does not harm the environment"). In other cases, existing frames in FrameNet needed to be adapted to the data extracted from the terminological database for different reasons (slightly more specific meanings, a different number of arguments, etc.).

[4] The FrameNet team releases an XML version of the database (Baker and Hong, 2010).

Figure 1: Comparison between the terminological database and FrameNet
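The comparisons underlying the first method were carried out manually, but the kind of check involved can be illustrated with a short sketch. The entry format, role labels and helper names below are ours and purely illustrative; they do not reproduce the actual DiCoEnviro data.

import itertools

# Illustrative entries: each LU is described by its core semantic roles and the
# non-obligatory participants observed in its annotated contexts (sample data only).
entries = {
    "cool1a":   {"core": ("Patient",), "non_core": {"Degree", "Duration", "Location"}},
    "warm1a":   {"core": ("Patient",), "non_core": {"Degree", "Duration", "Location"}},
    "cooling1": {"core": ("Patient",), "non_core": {"Degree", "Location"}},
    "warming1": {"core": ("Patient",), "non_core": {"Degree", "Duration"}},
    "warm1b":   {"core": ("Cause", "Patient"), "non_core": {"Degree", "Time"}},
}

def may_share_frame(lu1, lu2, min_overlap=1):
    """Heuristic check: identical core roles plus overlapping non-obligatory
    participants suggest that two LUs may evoke the same frame."""
    e1, e2 = entries[lu1], entries[lu2]
    return e1["core"] == e2["core"] and len(e1["non_core"] & e2["non_core"]) >= min_overlap

for lu1, lu2 in itertools.combinations(entries, 2):
    if may_share_frame(lu1, lu2):
        print(lu1, "/", lu2, ": compatible argument structures")

With the sample entries above, the single-argument verbs and nouns of the Change of temperature group come out as compatible with one another, while warm1b, which has a different argument structure, does not.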
2.2 A "framed" representation of the terminology of the environment

It soon became obvious that some of the frames identified using the methods described in Section 2.1 could be linked. For instance, all processes related to changes affecting the environment appeared to be somehow related.

Again, using FrameNet (2015) as a reference, relations were established between some of the frames defined for environmental terms. Two relations not found in FrameNet (2015) were added (Is opposed to and Is a property of). This work led to the development of a resource called the Framed DiCoEnviro [5], in which users can navigate through frames and relations between frames, and access the terms that evoke these frames along with their annotations. Figure 2 shows some of the relations identified between the frame Change of temperature (COT) – which contains verbs such as cool1a and warm1a and the nouns cooling1 and warming1 – and other frames.

[5] See http://olst.ling.umontreal.ca/dicoenviro/framed/index.php (in development).

Figure 2: Change of temperature and related frames

3 Method for discovering related LUs

The methods described above allowed us to define a first subset of frames that are relevant to the field of the environment, to link some of these frames, and to assign lexical units (LUs) to them. Based on these preliminary data, we explored the potential of a semi-automatic method for enriching our resource by adding new LUs to existing frames or by discovering new frames. This method exploited distributional information obtained from a much larger corpus than the one used in the work described above.

The method we tested to discover related LUs is based on the neighbourhood graph induced by a distributional semantic model. Distributional semantic models are commonly used to estimate semantic similarity, the underlying hypothesis being that words that appear in similar contexts tend to be semantically related (Harris, 1954). The usual way of querying a distributional model is simply to compute, for a given word, a sorted list of similar words. This method has several drawbacks, as has been pointed out recently by Gyllensten and Sahlgren (2015), who use a relative neighbourhood graph to query distributional models in a way that accounts for the fact that the query can have multiple senses. The method used here is similar in that it exploits a distributional neighbourhood graph. This allows us to take a list of terms and visualize their semantic neighbourhood, in order to identify related terms that can be encoded as frame-evoking LUs, either in existing frames or in new ones.

Various kinds of graphs could be used to compute and visualize the distributional neighbourhood of a particular word or set of words. We use a k-nearest-neighbour (k-NN) graph, two examples of which are the symmetric k-NN graph and the mutual k-NN graph (Maier et al., 2007). In a symmetric k-NN graph, two words w_i and w_j are connected if w_i is among the k nearest neighbours (NNs) of w_j or if w_j is among the k NNs of w_i. In a mutual k-NN graph, the two words are connected only if both conditions are true: w_i is among the k NNs of w_j and w_j is among the k NNs of w_i.

In this work, we chose to use a mutual k-NN graph [6], the intuition behind this decision being that if two words are mutual NNs, there is a better chance that they actually do have similar meanings. This principle has been exploited elsewhere (Ferret, 2012; Claveau et al., 2014).

[6] We also tested the symmetric k-NN graph, but we only report results obtained with the mutual graph. We achieved higher F-scores using the mutual graph.

The graph construction procedure can be summarized as follows. Given a distributional semantic model, we compute the pairwise similarity between all words. For each word, we compute its k NNs by sorting all other words in decreasing order of similarity to that word and keeping the k most similar. Then, for each word w_i and each neighbour w_j in the k NNs of w_i, we add an edge in the graph between w_i and w_j if w_i is also among the k NNs of w_j. The resulting graph can be used to visualize the distributional neighbourhood of a term or set of terms.
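As a rough illustration of this procedure, the following sketch builds a mutual k-NN graph from a matrix of word vectors with NumPy. The function and variable names are ours, and the brute-force similarity computation is only intended for vocabularies of the size used here (about 10,000 words).

import numpy as np

def mutual_knn_graph(vectors, words, k=10):
    """Build a mutual k-NN graph from word vectors (one row per word).
    Returns an adjacency dict mapping each word to the set of its neighbours."""
    # Cosine similarity = dot product of L2-normalized rows.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)                 # a word is not its own neighbour
    # Indices of the k most similar words for each word.
    knn = [set(np.argsort(-sim[i])[:k]) for i in range(len(words))]
    graph = {w: set() for w in words}
    for i in range(len(words)):
        for j in knn[i]:
            if i in knn[j]:                        # mutual nearest-neighbour condition
                graph[words[i]].add(words[j])
                graph[words[j]].add(words[i])
    return graph

A symmetric k-NN graph is obtained by simply dropping the mutuality test. With a 10,000-word vocabulary the full similarity matrix fits comfortably in memory; larger vocabularies would call for an approximate nearest-neighbour index.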
4 Model selection

Any model that allows us to estimate the semantic similarity of two words can be used to build a semantic neighbourhood graph such as the one described in Section 3. We tested two different distributional semantic models for this purpose. Both models have several parameters which must be set and which can have a significant impact on the accuracy of the model in a given application. We therefore used an automatic evaluation procedure to tune the models' parameters and select a model for manual evaluation.

4.1 Corpus and reference data

The corpus used to build the models is the PANACEA Environment English monolingual corpus (Catalog Reference ELRA-W0063), a corpus containing 28,071 web pages related to the environment (approximately 50 million tokens). The corpus was compiled automatically using a focused web crawler developed within the PANACEA project, and is freely distributed by ELDA for research purposes [7]. The corpus was converted from XML to raw text and lemmatized using TreeTagger (Schmid, 1994).

[7] See http://catalog.elra.info/product_info.php?products_id=1184.

Reference data were extracted from the Framed DiCoEnviro [8]. The reference data are sets of LUs that evoke the same semantic frame. The list of English LUs was extracted from each of the frames included in the Change of temperature (COT) scenario [9] (cf. Figure 2). Two LUs (thawing and thinning) were excluded because they were not in the vocabulary used to construct the models, which contains the 10,000 most frequent lemmatized words in the corpus, excluding stop words. We obtained 13 sets containing a total of 53 LUs, each frame containing between 2 and 7 LUs. The number of unique LUs is 45, several LUs evoking more than one frame [10].

[8] Data extracted on 2015-05-22. Data has been added since then, as the resource is in development.
[9] Frames related to the scenario only through a See also relation were excluded.
[10] Polysemous LUs evoke different frames. For instance, warm1a (intransitive verb) evokes the Change of temperature frame; warm1b (transitive verb) evokes the Cause temperature change frame; and warm2 (adjective) evokes the Ambient temperature frame.

4.2 Models tested

Two different distributional semantic models were tested. The first is a bag-of-words (BOW) model (Schütze, 1992; Lund et al., 1995), which is based on a word-word cooccurrence matrix computed using a sliding context window. The second is word2vec (Mikolov et al., 2013a; Mikolov et al., 2013b), a neural language model that has been used in many NLP applications in the past few years. Word2vec (W2V) learns distributed word representations that can be used in the same way as BOW vectors to estimate semantic similarity.

The models' parameters were tuned by testing various combinations of parameter values, building neighbourhood graphs from each resulting model, and computing evaluation metrics on these graphs based on the reference data described in Section 4.1.

Some of the main choices that must be made when training a model using word2vec pertain to the architecture of the model (continuous skip-gram or continuous bag-of-words), the training algorithm (hierarchical softmax or negative sampling), the use of subsampling of frequent words, the size (dimensionality) of the word vectors, and the size of the context window. We tested various values for each of these parameters, including the recommended values [11] when available. A total of 160 models were tested. In the case of the BOW model, important parameters [12] include the type, shape and size of the context window, the weighting scheme applied to the cooccurrence frequencies, and the use of dimensionality reduction. Again, we tested different values for these parameters. Each model was tested with and without dimensionality reduction, for which we used singular value decomposition (SVD). A total of 320 BOW models were built and evaluated (160 unreduced and 160 reduced using SVD).

[11] See https://code.google.com/p/word2vec/#Performance.
[12] Several studies have assessed the influence of this model's parameters. The relative importance of several parameters was quantified using analysis of variance by Lapesa et al. (2014).

For both models, we used the cosine similarity to estimate the similarity between words.
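To make the BOW setting concrete, here is a minimal sketch of one possible configuration: symmetric sliding-window counts over the lemmatized corpus, PPMI weighting (used here merely as an example of a weighting scheme), and optional SVD reduction. The function names are ours; the word2vec models would instead be trained with an off-the-shelf implementation on the same lemmatized corpus and queried with the same cosine measure.

import numpy as np

def bow_vectors(sentences, vocab, window=3, svd_dims=None):
    """Sketch of a BOW distributional model: symmetric sliding-window
    cooccurrence counts over lemmatized sentences, PPMI weighting,
    and optional dimensionality reduction with SVD."""
    idx = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in sentences:                               # each sentence: a list of lemmas
        for i, w in enumerate(sent):
            if w not in idx:
                continue
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i and sent[j] in idx:
                    counts[idx[w], idx[sent[j]]] += 1
    # PPMI weighting of the raw cooccurrence frequencies.
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    if svd_dims:                                         # reduced variant of the model
        u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
        return u[:, :svd_dims] * s[:svd_dims]
    return ppmi

The rows of the returned matrix can be passed directly to the mutual_knn_graph sketch above, which already measures similarity with the cosine.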
4.3 Evaluation metrics for model selection

For each model tested, we constructed multiple k-NN graphs, using different values of k. For each of these graphs, we computed evaluation metrics using the reference data described in Section 4.1. We used precision and recall to check to what extent LUs belonging to the same frame were connected in the graph. These metrics are computed for each of the 45 unique LUs in the reference data. Let w_i be an LU, R(w_i) the set of related LUs that evoke at least one of the frames evoked by w_i, and NN(w_i) the set of words that are adjacent to w_i in the graph. Furthermore, let TP_i (true positives) be the number of words in NN(w_i) that are one of the related LUs in R(w_i), FP_i (false positives) the number of words in NN(w_i) that are not in R(w_i), and FN_i (false negatives) the number of words in R(w_i) that are not in NN(w_i). The evaluation metrics are then calculated as usual:

precision_i = TP_i / (TP_i + FP_i)
recall_i = TP_i / (TP_i + FN_i)
F-score_i = (2 × precision_i × recall_i) / (precision_i + recall_i)

The average precision, recall and F-score for a particular graph are then computed by taking the mean scores over all LUs in the reference data.
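The computation can be sketched as follows, reusing the adjacency-dict representation from the graph-construction sketch; `frames` is assumed to map each frame of the COT scenario to the set of LUs that evoke it, and the names are ours.

def evaluate_graph(graph, frames):
    """Mean precision, recall and F-score over the reference LUs.
    graph  : adjacency dict (word -> set of neighbouring words).
    frames : dict mapping each frame name to the set of LUs that evoke it."""
    scores = []
    for lu in set().union(*frames.values()):
        if lu not in graph:                # LUs outside the model vocabulary are skipped
            continue
        related = {o for f in frames.values() if lu in f for o in f} - {lu}   # R(w_i)
        neighbours = graph[lu]             # NN(w_i)
        tp = len(neighbours & related)
        fp = len(neighbours - related)
        fn = len(related - neighbours)
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores.append((p, r, f))
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))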
4.4 Results

Table 2 shows how precision, recall and F-score vary with respect to k. As the density of the graph increases, recall increases and precision decreases, the average F-score peaking around k = 10. The table also shows the number of nodes in the subgraph corresponding to the 45 LUs and their adjacent nodes in the graph. Based on these results, we selected 10 as an appropriate value of k.

k    nb nodes   precision   recall   F1
5    125        0.2120      0.2125   0.1915
10   206        0.1681      0.3005   0.1971
15   284        0.1429      0.3560   0.1858
20   359        0.1253      0.3999   0.1730
25   431        0.1108      0.4339   0.1594

Table 2: Evaluation metrics and number of nodes in the subgraph with respect to k (averaged over all models)

Table 3 shows the average and maximum scores of each model (BOW, BOW reduced using SVD, and W2V) with k = 10. These results suggest that the BOW model performs best for this application.

Model   Avg prec. (max)    Avg rec. (max)     Avg F1 (max)
BOW     0.1960 (0.2775)    0.3153 (0.4268)    0.2184 (0.3016)
SVD     0.1567 (0.2007)    0.2987 (0.3830)    0.1903 (0.2412)
W2V     0.1517 (0.2245)    0.2875 (0.4206)    0.1826 (0.2727)

Table 3: Evaluation metrics with respect to model (with k = 10)

By analyzing how precision and recall varied with respect to the BOW model's parameters, we determined the optimal parameter values for this application. For example, the optimal window size was determined to be 3 words. The corresponding graph was then evaluated manually.

5 Evaluation

Once the model had been selected, the corresponding neighbourhood graph was evaluated manually. The evaluation was carried out by one of the co-authors of this paper, who is responsible for the development of the Framed DiCoEnviro. The 45 unique LUs in the reference data had 137 unique neighbours (adjacent nodes in the graph). These 137 words were evaluated manually in order to determine to what extent the graph can serve to discover frame-evoking LUs that can be added to the database.

The evaluation was carried out one frame at a time by observing the subgraph corresponding to that frame's LUs and their neighbours (adjacent nodes in the neighbourhood graph). For example, the subgraph for the frame Cause change of impact is shown in Figure 3. In each subgraph, the LUs already encoded in that frame were highlighted in green, and those encoded in other frames in the COT scenario were highlighted in blue. One or more numbers were appended to the label of each LU to indicate which frame(s) it evokes.

Figure 3: Subgraph for the frame Cause change of impact

For each word that was not already encoded as an LU in the COT scenario (i.e. for each white node), the evaluator was asked to choose one of the following categories:

1. The word should be encoded as an LU in the COT scenario:
   (a) in an existing frame;
   (b) in a new frame.
2. The word should be encoded as an LU in another scenario:
   (a) in an existing frame that is related to the COT scenario (by a See also relation);
   (b) in an existing frame that is not related to the COT scenario;
   (c) in a new frame.
3. The word should not be encoded as an LU in the database, but it is the realization of a core FE of one of the frames in the COT scenario.
4. The word should not be encoded as an LU in the database, nor is it the realization of a core FE of one of the frames in the COT scenario.

Table 4 shows the results of this evaluation. As these results show, most lexical items identified by the method (105 out of 137) can be encoded in a relevant frame in the field of the environment and should be added to our resource. Among these, 88 would be frame-evoking LUs (categories 1 and 2) and 17 would be encoded as realizations of FEs (category 3). Interestingly, 48 lexical items are related to the COT scenario (categories 1a and 1b). The method allowed us to identify: 1. new frame-evoking LUs (such as amplification, drop, and scarcity) that had not been encoded in existing frames (category 1a); 2. LUs (such as alteration and eliminate) that evoke frames that had not yet been created (category 1b); and 3. variants (such as cooler for cool and stabilise for stabilize).

Category    Nb cases
1a          39
1b          9
1 (total)   48
2a          7
2b          3
2c          30
2 (total)   40
3           17
4           32
Total       137

Table 4: Summary of results

The method also identified 40 items that would be encoded in environmentally relevant frames, but in a different scenario (category 2). It is worth pointing out that among these, 7 items correspond to LUs that would evoke a frame that is linked to the COT scenario (category 2a).

Finally, although 32 lexical items identified by the method would not be encoded in the resource and are thus considered false positives from the point of view of our application, further explanation is required. Some lexical items could evoke more general frames: for instance, rapid and slow would appear in the same frame if the general lexicon were considered. Other items identified are acronyms. GW, for instance, is the acronym for global warming. Technically, it could be defined as an LU evoking the COT frame, but multi-word terms and acronyms are not considered in the resource.
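The per-frame subgraphs used in this evaluation can be assembled directly from the adjacency dict built earlier; the following sketch also reproduces the colour coding described above. The function name and the representation of frames are ours, and the actual drawing is left to any graph-visualization tool.

def frame_subgraph(graph, frame_lus, scenario_lus):
    """Collect a frame's LUs and their neighbours, and label each node the way
    it was highlighted during the manual evaluation."""
    nodes = set(frame_lus)
    for lu in frame_lus:
        nodes |= graph.get(lu, set())
    edges = {(a, b) for a in nodes for b in graph.get(a, set()) if b in nodes and a < b}
    colours = {}
    for n in nodes:
        if n in frame_lus:
            colours[n] = "green"      # LU already encoded in this frame
        elif n in scenario_lus:
            colours[n] = "blue"       # LU encoded in another frame of the COT scenario
        else:
            colours[n] = "white"      # candidate node to be evaluated manually
    return nodes, edges, colours

The white nodes returned here are the candidates that were then assigned by the evaluator to one of the four categories above.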
6 Concluding remarks

All in all, the results obtained are quite interesting and show that the method can be used to assist lexicographers when defining frames and their lexical content, as the distributional neighbourhood of frame-evoking LUs often contains LUs that evoke the same frame or related frames. Distributional neighbourhood graphs provide information about the content of a specialized corpus that would be impossible to extract manually from such a large corpus. They are a very useful complement to other corpus tools, such as term extractors and concordancers, as they help lexicographers save time and locate relevant lexical units (near synonyms, variants) that they would otherwise miss.

In future work, we plan to integrate this methodology to assist lexicographers when defining new frames related to the field of the environment. It could be particularly useful for obtaining a view on corpora that deal with new or more specific topics and for unveiling the lexical units used to convey the knowledge related to these topics. It would also be interesting to test the potential of the method in other fields of knowledge. Extensions of this work could also involve using a graph-based clustering method to discover sets of lexical units that evoke the same frame without relying on existing frames.

Acknowledgments

This work was supported by the Social Sciences and Humanities Research Council (SSHRC) of Canada.

References

Collin Baker and Jisup Hong. 2010. Release 1.5 of the FrameNet data. International Computer Science Institute, Berkeley.

Vincent Claveau, Ewa Kijak, and Olivier Ferret. 2014. Explorer le graphe de voisinage pour améliorer les thésaurus distributionnels. In Actes de la 21e conférence sur le traitement automatique des langues naturelles (TALN), p. 220–231.

Andrew Dolbey, Michael Ellsworth, and Jan Scheffczyk. 2006. BioFrameNet: A domain-specific FrameNet extension with links to biomedical ontologies. In Proceedings of KR-MED 2006: Biomedical Ontology in Action, Baltimore, Maryland.

Pamela Faber et al. 2006. Process-oriented terminology management in the domain of Coastal Engineering. Terminology 12(2): 189–213.

Olivier Ferret. 2012. Combining bootstrapping and feature selection for improving a distributional thesaurus. In Proceedings of the 20th European Conference on Artificial Intelligence (ECAI), p. 336–341.

Charles J. Fillmore. 1982. Frame Semantics. In Linguistics in the Morning Calm, p. 111–137. Seoul: Hanshin Publishing Co.

Charles J. Fillmore and Collin Baker. 2010. A frames approach to semantic analysis. In B. Heine and H. Narrog (eds.), The Oxford Handbook of Linguistic Analysis, p. 313–339. Oxford: Oxford University Press.

FrameNet. 2015. https://framenet.icsi.berkeley.edu/fndrupal/. Accessed: 2015-09-24.

Amaru Cuba Gyllensten and Magnus Sahlgren. 2015. Navigating the semantic horizon using relative neighborhood graphs. CoRR, abs/1501.02670.

Zellig S. Harris. 1954. Distributional structure. Word, 10(2–3): 146–162.
Gabriella Lapesa, Stefan Evert, and Sabine Schulte im Walde. 2014. Contrasting syntagmatic and paradigmatic relations: Insights from distributional semantic models. In Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014), p. 160–170.

Marie-Claude L'Homme and Benoît Robichaud. 2014. Frames and terminology: Representing predicative units in the field of the environment. In Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex-IV), p. 186–197.

Marie-Claude L'Homme, Benoît Robichaud, and Carlos Subirats. 2014. Discovering frames in specialized domains. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), p. 1364–1371.

Kevin Lund, Curt Burgess, and Ruth Ann Atchley. 1995. Semantic and associative priming in high-dimensional semantic space. In Proceedings of the 17th Annual Conference of the Cognitive Science Society, p. 660–665.

Markus Maier, Matthias Hein, and Ulrike von Luxburg. 2007. Cluster identification in nearest-neighbor graphs. In Algorithmic Learning Theory, p. 196–210. Springer.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. In Proceedings of ICLR.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013b. Distributed representations of words and phrases and their compositionality. In Proceedings of NIPS, p. 3111–3119.

Josef Ruppenhofer, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson, and Jan Scheffczyk. 2010. FrameNet II: Extended theory and practice. http://framenet2.icsi.berkeley.edu/docs/r1.5/book.pdf. Accessed: 2015-09-24.

Helmut Schmid. 1994. Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing.

Thomas Schmidt. 2009. The Kicktionary: A multilingual lexical resource of football language. In H.C. Boas (ed.), Multilingual FrameNets in Computational Lexicography: Methods and Applications, p. 101–134. Berlin/New York: Mouton de Gruyter.

Hinrich Schütze. 1992. Dimensions of meaning. In Proceedings of the 1992 ACM/IEEE Conference on Supercomputing (Supercomputing '92), p. 787–796.