From Sartre to Frege in Three Steps: A* Search for Enriching Semantic Text Similarity Measures

Davide Colla, Marco Leontino, Enrico Mensa, Daniele P. Radicioni
University of Turin, Computer Science Department
davide.colla@unito.it, marco.leontino@unito.it, enrico.mensa@unito.it, daniele.radicioni@unito.it

Abstract

English. In this paper we illustrate a preliminary investigation on semantic text similarity. In particular, the proposed approach is aimed at complementing and enriching the categorization results obtained by employing standard distributional resources. We found that the paths connecting entities and concepts from the documents at stake provide interesting information on the connections between document pairs. Such a semantic browsing device enables further semantic processing, aimed at unveiling contexts and hidden connections (possibly not explicitly mentioned in the documents) between text documents.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

In the last few years many efforts have been spent to extract information contained in text documents, and a large number of resources have been developed that allow exploring domain-based knowledge, defining a rich set of specific semantic relationships between nodes (Vrandecic and Krötzsch, 2014; Auer et al., 2007; Navigli and Ponzetto, 2012). Being able to extract and to make available the semantic content of documents is a challenging task, with beneficial impact on different applications, such as document categorisation (Carducci et al., 2019), keyword extraction (Colla et al., 2017), question answering, text summarisation, semantic text comparison, building explanations/justifications for similarity judgements (Colla et al., 2018), and more. In this paper we present an approach aimed at extracting meaningful information contained in text documents, also based on background information contained in an encyclopedic resource such as Wikidata (Vrandecic and Krötzsch, 2014).

Although our approach has been devised on a specific application domain (PhD theses in philosophy), we argue that it can be easily extended to further application settings. The approach focuses on the ability to extract relevant pieces of information from text documents, and to map them onto the nodes of a knowledge graph, obtained from semantic networks representing encyclopedic and lexicographic knowledge. In this way it is possible to compare different documents based on their graph-based description, which has a direct anchoring to their semantic content.

We propose a system to assess the similarity between textual documents, hybridising the propositional approach (such as traditional statements expressed through RDF triples) with a distributional description (Harris, 1954) of the nodes contained in the knowledge graph, which are represented with word embeddings (Mikolov et al., 2013; Camacho-Collados et al., 2015; Speer et al., 2017). This step allows us to obtain similarity measures (based on vector descriptions and on path-finding algorithms) and explanations (represented as paths over a semantic network) that are more focused on the semantic definition of the concepts and entities involved in the analysis.

2 Related Work

Surveying the existing approaches requires briefly introducing the most widely used resources along with their main features.
Resources. BabelNet (BN) is a wide-coverage multilingual semantic network, originally built by integrating WordNet (Miller, 1995) and Wikipedia (Navigli and Ponzetto, 2010). NASARI is a vectorial resource whose senses are represented as vectors associated to BabelNet synsets (Camacho-Collados et al., 2015). Wikidata is a knowledge graph based on Wikipedia, whose goal is to overcome problems related to information access by creating new ways for Wikipedia to manage its data on a global scale (Vrandecic and Krötzsch, 2014).

2.1 Approaches to semantic text similarity

Most literature on computing semantic similarity between documents can be arranged into three main classes.

Word-based similarity. Word-based metrics are used to compute the similarity between documents based on their terms; examples of the features analysed are common morphological structures (Islam and Inkpen, 2008) and word overlap (Huang et al., 2011) between the texts. In one of the most popular theories on similarity (Tversky's contrast model) the similarity of a word pair is defined as a direct function of their common traits (Tversky, 1977). This notion of similarity has recently been adjusted to model human similarity judgments for short texts, yielding the Symmetrical Tversky Ratio Model (Jimenez et al., 2013), and employed to compute semantic similarity between word and sense pairs (Mensa et al., 2017; Mensa et al., 2018).

Corpus-based similarity. Corpus-based measures try to identify the degree of similarity between words using information derived from large corpora (Mihalcea et al., 2006; Gomaa and Fahmy, 2013).

Knowledge-based similarity. Knowledge-based measures try to estimate the degree of semantic similarity between documents by using information drawn from semantic networks (Mihalcea et al., 2006). In most cases only the hierarchical structure of the information contained in the network is considered, without considering the relation types between nodes (Jiang and Conrath, 1997; Richardson et al., 1994); some authors consider the "is-a" relation (Resnik, 1995), but leave the more domain-dependent ones unexploited. Moreover, only concepts are usually considered, omitting Named Entities.

An emerging paradigm is that of knowledge graphs. Knowledge graph extraction is a challenging task, particularly popular in recent years (Schuhmacher and Ponzetto, 2014). Several approaches have been developed, e.g., aimed at extracting knowledge graphs from textual corpora, attaining a network focused on the type of documents at hand (Pujara et al., 2013). Such approaches may be affected by scalability and generalisation issues. In the last years many resources representing knowledge in a structured form have been proposed that build on encyclopedic resources (Auer et al., 2007; Suchanek et al., 2007; Vrandecic and Krötzsch, 2014).

As regards semantic similarity, a framework has been proposed based on entity extraction from documents, providing mappings to knowledge graphs in order to compute semantic similarities between documents (Paul et al., 2016). Their similarity measures are mostly based on the network structure, without introducing other instruments such as embeddings, which are largely acknowledged as relevant in semantic similarity. Hecht et al. (2012) propose a framework endowed with explanatory capabilities, building explanations on top of similarity measures based on relations between Wikipedia pages.

3 The System

In this Section we illustrate the generation process of the knowledge graph from Wikidata, which will be instrumental to build paths across documents. Such paths are then used, at a later time, to enrich the similarity scores computed during the classification.
3.1 Knowledge Graph Extraction

The first step consists of the extraction of a knowledge graph related to the given reference domain. Wikidata is searched for concepts and entities related to the domain being analysed. Starting from the extracted elements, which constitute the basic nodes of the knowledge graph, we again consider Wikidata and look for relevant semantic relationships towards other nodes, not necessarily extracted in the previous step. The types of relevant relationships depend on the treated domain. Considering the philosophical domain, we selected a set of 30 relations relevant to compare the documents. For example, we considered the relation movement, representing the literary, artistic, scientific or philosophical movement; the relation studentOf, representing the person who has taught the considered philosopher; and the relation influencedBy, representing the person or idea from which the considered philosopher's ideas have been influenced. In this way we obtain a graph where each node is a concept or entity extracted from Wikidata; such nodes are connected with edges labeled with specific semantic relations.

The obtained graph is then mapped onto BabelNet. At the end of the first stage, the knowledge graph represents the relevant domain knowledge (Figure 1), encoded through BabelNet nodes that are connected through the rich set of relations available in Wikidata. Each text document can be linked to the knowledge graph, thereby allowing us to make semantic comparisons by analysing the possible paths connecting document pairs.

Figure 1: A small portion of the knowledge graph extracted from Wikidata, related to the philosophical domain; nodes represent BabelNet synsets (concepts or NEs), rectangles represent documents.

Without loss of generality, we considered the philosophical domain, and extracted a knowledge graph containing 22,672 nodes and 135,910 typed edges; Wikidata entities were mapped onto BabelNet in approximately 90% of the cases.
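The harvesting machinery itself is not detailed in the paper; what follows is a minimal sketch, under stated assumptions, of how typed edges of the kind listed above could be collected from the public Wikidata SPARQL endpoint. The property identifiers (P737 for "influenced by", P135 for "movement", P1066 for "student of") and the example QID for Immanuel Kant are assumptions to be double-checked against Wikidata; the authors' actual pipeline may differ.

```python
# Minimal sketch (not the authors' pipeline): harvest typed edges from the
# public Wikidata SPARQL endpoint for a handful of relations of the kind
# described above. Property IDs are assumptions: P737 = "influenced by",
# P135 = "movement", P1066 = "student of".
import requests

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"
RELATIONS = {"influencedBy": "P737", "hasMovement": "P135", "studentOf": "P1066"}

def harvest_edges(seed_qids, relation_pid, limit=500):
    """Return (subject, object) QID pairs for one Wikidata property,
    restricted to a set of seed entities extracted in the first step."""
    values = " ".join(f"wd:{qid}" for qid in seed_qids)
    query = f"""
    SELECT ?s ?o WHERE {{
      VALUES ?s {{ {values} }}
      ?s wdt:{relation_pid} ?o .
    }} LIMIT {limit}
    """
    resp = requests.get(WIKIDATA_SPARQL,
                        params={"query": query, "format": "json"},
                        headers={"User-Agent": "kg-extraction-sketch/0.1"})
    resp.raise_for_status()
    rows = resp.json()["results"]["bindings"]
    # Entity URIs look like http://www.wikidata.org/entity/Q9312: keep the QID only.
    return [(r["s"]["value"].rsplit("/", 1)[-1],
             r["o"]["value"].rsplit("/", 1)[-1]) for r in rows]

# Example (assumed QID for Immanuel Kant): edges leaving Kant via "influenced by".
# edges = harvest_edges(["Q9312"], RELATIONS["influencedBy"])
```

Each returned pair can then be added to the graph as a typed edge and, in a second pass, mapped onto the corresponding BabelNet synsets.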
3.2 Information extraction and semantic similarity

The second step consists in connecting the documents to the obtained knowledge graph. We harvested a set of 475,383 UK doctoral theses in several disciplines through the Electronic Theses Online Service (EThOS) of the British Library (https://ethos.bl.uk). At first, concepts and entities related to the reference domain were extracted from the considered documents, with a special focus on two different types of information: concepts and Named Entities. Concepts are keywords or multi-word expressions representing meaningful items related to the domain (such as 'philosophy-of-mind', 'Rationalism', etc.), while Named Entities are persons, places or organisations (mostly universities, in the present setting) strongly related to the considered domain. Named Entities are extracted using the Stanford CoreNLP NER module (Manning et al., 2014), improved with extraction rules based on morphological and syntactic patterns, considering for example sequences of words starting with a capital letter or associated to a particular Part-Of-Speech pattern. Similarly, we extract relevant concepts based on particular PoS patterns (such as NOUN-PREPOSITION-NOUN, thereby recognizing, for example, philosophy of mind).

We are aware that we are not considering the problem of word sense disambiguation (Navigli, 2009; Tripodi and Pelillo, 2017). The underlying assumption is that, as long as we are concerned with a narrow domain, this is a less severe problem: e.g., if we recognise the person Kant in a document related to philosophy, the person cited is probably the philosopher named Immanuel Kant (please refer to Figure 1), rather than the less philosophical Gujarati poet, playwright and essayist Kavi Kant (https://tinyurl.com/y3s9lsp7).

By mapping the concepts and Named Entities found in a document onto the graph, we gain a set of access points to the knowledge graph. Once the access points have been acquired for a pair of documents, we can compute the semantic similarity between the documents by analysing the paths that connect them.
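The paper relies on the Stanford CoreNLP pipeline plus hand-crafted rules; purely as an illustration of the pattern-based extraction described above, the sketch below uses spaCy's rule-based Matcher instead. spaCy, the specific patterns and the entity labels are substitutions of mine, not the authors' implementation.

```python
# Illustrative sketch only: the paper uses Stanford CoreNLP NER plus custom
# rules; here spaCy's rule-based Matcher stands in for that pipeline, showing
# the NOUN-PREPOSITION-NOUN concept pattern and a naive capitalised-sequence rule.
import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")          # requires the small English model
matcher = Matcher(nlp.vocab)
# NOUN-PREPOSITION-NOUN, e.g. "philosophy of mind" (ADP is spaCy's preposition tag).
matcher.add("CONCEPT_N_P_N", [[{"POS": "NOUN"}, {"POS": "ADP"}, {"POS": "NOUN"}]])
# One or more capitalised tokens, a rough stand-in for the capital-letter rule.
matcher.add("CAPITALISED_SEQ", [[{"IS_TITLE": True, "OP": "+"}]])

def extract_candidates(text):
    doc = nlp(text)
    concepts = {doc[start:end].text for _, start, end in matcher(doc)}
    # Named Entities (persons, organisations, places) come from the NER layer.
    entities = {ent.text for ent in doc.ents if ent.label_ in {"PERSON", "ORG", "GPE"}}
    return concepts, entities

concepts, entities = extract_candidates(
    "The thesis discusses the philosophy of mind after Immanuel Kant.")
print(concepts, entities)
```

Candidates produced this way would still need to be filtered against the domain vocabulary before being used as access points to the knowledge graph.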
3.3 Building Paths across Documents

The developed framework is used to compute paths between pairs of senses and/or entities featured in two given documents. Each edge in the knowledge graph has an associated semantic relation type (e.g., "hasAuthor", "influencedBy", "hasMovement"). Each path intervening between two documents is of the form

DOC1 --access--> Saul Kripke --influencedBy--> Ludwig Wittgenstein --influencedBy--> Bertrand Russell --influencedBy--> Baruch De Spinoza <--access-- DOC2

In this case we can argue in favor of the relatedness of the two documents based on the chain of relationships illustrating that Saul Kripke (from document d1) has been influenced by Ludwig Wittgenstein, who has been influenced by Bertrand Russell, who in turn has been influenced by Baruch De Spinoza, mentioned in d2. The whole set of paths connecting elements of a document d1 to a document d2 can be thought of as a form of evidence of the closeness of the two documents: documents with numerous shorter paths connecting them are intuitively more related. Importantly enough, such paths over the knowledge graph do not contain general information (e.g., Kant was a man); rather, they are highly domain-specific (e.g., Oskar Becker had as doctoral student Jürgen Habermas).

A* Search. The computation of the paths is performed via a modified version of the A* algorithm (Hart et al., 1968). In particular, paths among access nodes are returned in order, from the shortest to the longest one. Given the huge dimension of the network, and since we are guaranteed to retrieve shortest paths first, we stop the search after one second of computation time.
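The authors' modification of A* is not spelled out in the paper; the sketch below only reproduces the behaviour described above (paths between two access nodes enumerated from shortest to longest, with a one-second budget) using networkx's shortest_simple_paths. The toy graph mirrors the Kripke example and is, of course, illustrative.

```python
# Sketch of the described behaviour, not the authors' modified A*: enumerate
# paths between two access nodes from shortest to longest and stop once a
# one-second budget is exhausted. networkx's shortest_simple_paths already
# yields simple paths in order of increasing length.
import time
import networkx as nx

def paths_within_budget(graph, source, target, budget_s=1.0, max_paths=50):
    found, start = [], time.monotonic()
    try:
        for path in nx.shortest_simple_paths(graph, source, target):
            found.append(path)
            if len(found) >= max_paths or time.monotonic() - start > budget_s:
                break
    except nx.NetworkXNoPath:
        pass
    return found

# Toy graph mirroring the example above; relation types kept as edge attributes.
G = nx.DiGraph()
G.add_edge("Saul Kripke", "Ludwig Wittgenstein", rel="influencedBy")
G.add_edge("Ludwig Wittgenstein", "Bertrand Russell", rel="influencedBy")
G.add_edge("Bertrand Russell", "Baruch De Spinoza", rel="influencedBy")
print(paths_within_budget(G, "Saul Kripke", "Baruch De Spinoza"))
```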
4 Experimentation

In this Section we report the results of a preliminary experimentation: given a dataset of PhD theses, we first explore the effectiveness of standard distributional approaches to compute the semantic similarity between document pairs; we then elaborate on how such results can be complemented and enriched through the computation of paths between the entities therein.

Experimental setting. We extracted 4 classes of documents (100 for each class) from the EThOS dataset. For each record we retrieved the title and abstract fields, which were used for subsequent processing. We selected documents containing 'Antibiotics', 'Molecular', 'Hegel' or 'Ethics' either in their title (in 15 documents per class) or in their abstract (15 documents per class). Each class features on average 163.5 tokens (standard deviation σ = 39.3), including both title and abstract. The underlying rationale has been that of selecting documents from two broad areas, each one composed of two different sets of data, having to do with medical disciplines and molecular biology in the former case, and with Hegelianism and the broad theme of ethics in the latter case. Intra-domain classes (that is, both 'Antibiotics'-'Molecular' and 'Hegel'-'Ethics') are not supposed to be linearly separable, as mostly occurs in real problems. Of course, this feature makes the categorization problem more interesting. The dataset was used to compute some descriptive statistics (such as inverse document frequency) characterizing the whole collection of considered documents.

From the aforementioned set of 400 documents we randomly chose a subset of 20 documents, 5 documents for each of the 4 classes, from those containing the terms (either 'Antibiotics', 'Molecular', 'Hegel' or 'Ethics') in the title. This selection strategy was aimed at selecting more clearly individuated documents, exhibiting a higher similarity degree within classes than across classes (in future work we will verify such assumptions by involving domain experts, in order to validate and/or refine the heuristics employed in the document selection).

4.1 Investigation on Text Similarity with Standard Distributional Approaches

GloVe and Word Embedding Similarity. The similarity scores were computed for each document pair with a Word Embedding Similarity approach (Agirre et al., 2016). In particular, each document d has been provided with a vector description averaging the GloVe embeddings (Pennington et al., 2014) of all terms t_i in the title and abstract:

  \vec{N}_d = \frac{1}{|T_d|} \sum_{t_i \in T_d} \vec{t}_i ,   (1)

where each \vec{t}_i is the GloVe vector for the term t_i. Considering two documents d_1 and d_2, each one associated to a particular vector \vec{N}_{d_i}, we compare them using the cosine similarity metric:

  sim(\vec{N}_{d_1}, \vec{N}_{d_2}) = \frac{\vec{N}_{d_1} \cdot \vec{N}_{d_2}}{\|\vec{N}_{d_1}\| \, \|\vec{N}_{d_2}\|} .   (2)

The obtained similarities between each document pair are reported in Figure 2(a); the plot was computed using the corrplot package in R.
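As a compact reference for Equations (1) and (2), the following sketch computes the averaged document vectors and their cosine similarity with numpy; the glove lookup table is a hypothetical token-to-vector dictionary (e.g., loaded from the published GloVe files), and the tokenisation is deliberately naive.

```python
# Minimal sketch of Equations (1) and (2): average the GloVe vectors of the
# tokens in title+abstract, then compare two documents via cosine similarity.
# `glove` is a hypothetical {token: np.ndarray} lookup prepared beforehand.
import numpy as np

def doc_vector(text, glove, dim=300):
    tokens = [t for t in text.lower().split() if t in glove]   # naive tokenisation
    if not tokens:
        return np.zeros(dim)
    return np.mean([glove[t] for t in tokens], axis=0)          # Eq. (1)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0               # Eq. (2)

# sim = cosine(doc_vector(title1 + " " + abstract1, glove),
#              doc_vector(title2 + " " + abstract2, glove))
```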
Figure 2: Comparison between correlation scores for (a) GloVe embeddings, (b) a one-hot vector baseline, (c) NASARI embeddings, and (d) NASARI embeddings with connectivity and idf. Documents have a scientific subject ('A' for 'Antibiotics', 'M' for 'Molecular' biology) or a philosophical subject ('E' for 'Ethics', 'H' for 'Hegel').

The computed distances show that overall this approach is sufficient to discriminate the scientific doctoral theses from the philosophical ones. In particular, the top green triangle shows the correlation scores among antibiotics documents, while the bottom triangle reports the correlation scores among philosophical documents. The red square graphically illustrates the poor correlation between the two classes of documents. On the other hand, the subclasses (Hegel-Ethics and Antibiotics-Molecular) could not be separated. Given that word embeddings are known to conflate all senses in the description of each term (Camacho-Collados and Pilehvar, 2018), this approach performed surprisingly well in comparison to a baseline based on a one-hot vector representation, only dealing with term-based features (Figure 2(b)).

NASARI and Sense Embedding Similarity. We then explored the hypothesis that semantic knowledge can be beneficial for better separating documents: after performing word sense disambiguation (the BabelFy service was employed (Moro et al., 2014)), we used the embedded version of NASARI to compute the vector \vec{N}_d as the average of all vectors associated to the senses contained in S_d, basically employing the same formula as in Equation 1. We then computed the similarity matrix, displayed in Figure 2(c). It clearly emerges that NASARI, too, is well suited to solve a classification task when domains are well separated. However, also in this case the adopted approach does not seem to discriminate well within the two main classes: for instance, the square with vertices E1-H1, E5-H1, E5-H5 and E1-H5 should be reddish, indicating a lower average similarity between documents pertaining to the Hegel and Ethics classes. We experimented with a wide set of conditions and parameters, obtaining slightly better similarity scores by weighting NASARI vectors with the senses' idf and the senses' connectivity c, obtained from BabelNet:

  \vec{N}_d = \frac{1}{|S_d|} \sum_{s_i \in S_d} \vec{s}_i \cdot \log\left(\frac{|S_d|}{H(s_i)}\right) \cdot \left(1 - \frac{1}{c}\right) ,   (3)

where H(s_i) is the number of documents containing the sense s_i. The resulting similarity scores are provided in Figure 2(d).
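A small sketch of the weighting in Equation (3) follows; sense_vecs (NASARI embeddings per BabelNet sense), doc_freq (the H(s) counts) and connectivity (the c values from BabelNet) are hypothetical inputs prepared elsewhere, and the average is taken over the senses for which a vector is available.

```python
# Sketch of the weighting in Equation (3). `sense_vecs` maps a BabelNet sense
# to its NASARI embedding, `doc_freq[s]` is H(s) (number of documents that
# contain sense s), and `connectivity[s]` is the sense's connectivity c taken
# from BabelNet; all three are hypothetical inputs prepared elsewhere.
import numpy as np

def weighted_doc_vector(doc_senses, sense_vecs, doc_freq, connectivity, dim=300):
    vecs = []
    n_senses = len(doc_senses)                      # |S_d|
    for s in doc_senses:
        if s not in sense_vecs:
            continue
        idf_like = np.log(n_senses / doc_freq[s])   # log(|S_d| / H(s_i))
        hub_term = 1.0 - 1.0 / connectivity[s]      # (1 - 1/c)
        vecs.append(sense_vecs[s] * idf_like * hub_term)
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)
```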
Documents are in fact too close, and presumably the adopted representation (merging all senses in each document) is not as precise as needed. In this setting, we tried to investigate the similarity of documents based on the connections between their underlying sets of senses. Such connections were computed on the aforementioned graph.

4.2 Enriching Text Similarity with Paths across Documents

In order to examine the connections between the considered documents we focused on the philosophical portion of our dataset, and exploited the knowledge graph described in Section 3. The computed paths are not presently used to refine the similarity scores, but only as a suggestion to characterize possible connections between document pairs. The extracted paths contain precious information that can be easily integrated in downstream applications, by providing specific information that can be helpful for domain experts to achieve their objectives (e.g., in semantically browsing text documents, in order to find influence relations across different philosophical schools).

As anticipated, building paths among the fundamental concepts of the documents allows grasping important ties between the documents' topics. For instance, one of the extracted paths (between the author 'Hegel' and the work 'Sense and Reference' (Frege, 1948)) shows the connections between the entities at stake as follows: G.W.F. Hegel hasMovement Continental Philosophy, which is in turn the movementOf H.L. Bergson, who has been influencedBy G. Frege, who finally hasNotableWork Sense and Reference. The semantic specificity of this information provides precious insights that allow for a proper consideration of the relevance of the second document w.r.t. the first one. It is worth noting that the fact that Hegel is a continental philosopher is trivial (tacit knowledge) for philosophers, and was most probably left implicit in the thesis abstract, while it can be a relevant piece of information for a system requested to assess the similarity of two philosophical documents. Also, this sort of path over the extracted knowledge graph enables a form of semantic browsing that benefits from the rich set of Wikidata relations, paired with the valuable coverage ensured by BabelNet on domain-specific concepts and entities.

The illustrated approach allows the uncovering of insightful and specific connections between document pairs. However, this preliminary study also pointed out some issues. One key problem is the amount of named entities contained in the considered documents (e.g., E5 only has one access point, while E3 has none). Another issue has to do with the inherently high connectivity of some nodes of the knowledge graph (hubness). For instance, the nodes Philosophy, Plato and Aristotle are very connected, which results in the extraction of some trivial and uninteresting paths among the specific documents. The first issue could be tackled by also considering the main concepts of a document if no entity can be found, whilst the second one could be mitigated by taking into account the connectivity of the nodes as a negative parameter while computing the paths.
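One possible realisation of the latter mitigation, offered here as an assumption rather than as the authors' implementation, is to price edges by the degree of their endpoints, so that a weighted version of the path search of Section 3.3 tends to avoid hubs such as Philosophy, Plato or Aristotle.

```python
# One possible realisation of the hubness mitigation mentioned above (an
# assumption, not the authors' implementation): make edges incident to highly
# connected nodes more expensive, so that a weighted path search avoids hubs.
import math
import networkx as nx

def add_hub_penalty(graph):
    for u, v in graph.edges():
        # Cost grows with the (log-)degree of the endpoints, so paths through
        # hubs such as Philosophy, Plato or Aristotle become more expensive.
        graph[u][v]["cost"] = (1.0 + math.log1p(graph.degree(u))
                                   + math.log1p(graph.degree(v)))

# After the weights are in place, the search of Section 3.3 can rank paths by
# total cost instead of hop count, e.g.:
# paths = nx.shortest_simple_paths(G, source, target, weight="cost")
```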
5 Conclusions

In this paper we have investigated the possibility of enriching semantic text similarity measures via symbolic and human-readable knowledge. We have shown that distributional approaches allow for a satisfactory classification of documents belonging to different topics; however, our preliminary experimentation showed that they are not able to capture the subtle aspects characterizing documents in close areas. As we have argued, exploiting paths over graphs to explore connections between document pairs may be beneficial in making explicit domain-specific links between documents. As future work, we could refine the methodology related to the extraction of the concepts in the Knowledge Graph, defining approaches based on specific domain-related ontologies. Two relevant works, to these ends, are the PhilOnto ontology, which represents the structure of philosophical literature (Grenon and Smith, 2011), and the InPhO taxonomy (Buckner et al., 2007), which combines automated information retrieval methods with knowledge from domain experts. Both resources will be employed in order to extract a more concise, meaningful and discriminative Knowledge Graph.

Acknowledgments

The authors are grateful to the EThOS staff for their prompt and kind support. Marco Leontino has been supported by the REPOSUM project, BONG CRT 17 01, funded by Fondazione CRT.

References

Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Rada Mihalcea, German Rigau, and Janyce Wiebe. 2016. SemEval-2016 Task 1: Semantic textual similarity, monolingual and cross-lingual evaluation. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pages 497–511.

Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. 2007. DBpedia: A nucleus for a web of open data. In The Semantic Web, pages 722–735. Springer.

Cameron Buckner, Mathias Niepert, and Colin Allen. 2007. InPhO: The Indiana Philosophy Ontology. APA Newsletters: Newsletter on Philosophy and Computers, 7(1):26–28.

Jose Camacho-Collados and Mohammad Taher Pilehvar. 2018. From word to sense embeddings: A survey on vector representations of meaning. Journal of Artificial Intelligence Research, 63:743–788.

José Camacho-Collados, Mohammad Taher Pilehvar, and Roberto Navigli. 2015. NASARI: A novel approach to a semantically-aware representation of items. In Proceedings of NAACL, pages 567–577.

Giulio Carducci, Marco Leontino, Daniele P. Radicioni, Guido Bonino, Enrico Pasini, and Paolo Tripodi. 2019. Semantically aware text categorisation for metadata annotation. In Italian Research Conference on Digital Libraries, pages 315–330. Springer.

Davide Colla, Enrico Mensa, and Daniele P. Radicioni. 2017. Semantic measures for keywords extraction. In Conference of the Italian Association for Artificial Intelligence, pages 128–140. Springer.

Davide Colla, Enrico Mensa, Daniele P. Radicioni, and Antonio Lieto. 2018. Tell me why: Computational explanation of conceptual similarity judgments. In Proceedings of the 17th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU), Special Session on Advances on Explainable Artificial Intelligence, Communications in Computer and Information Science (CCIS), Cham. Springer International Publishing.

Gottlob Frege. 1948. Sense and reference. The Philosophical Review, 57(3):209–230.

Wael H. Gomaa and Aly A. Fahmy. 2013. A survey of text similarity approaches. International Journal of Computer Applications, 68(13):13–18.

Pierre Grenon and Barry Smith. 2011. Foundations of an ontology of philosophy. Synthese, 182(2):185–204.

Zellig S. Harris. 1954. Distributional structure. Word, 10(2-3):146–162.

Peter E. Hart, Nils J. Nilsson, and Bertram Raphael. 1968. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, SSC-4(2):100–107.

Brent Hecht, Samuel H. Carton, Mahmood Quaderi, Johannes Schöning, Martin Raubal, Darren Gergle, and Doug Downey. 2012. Explanatory semantic relatedness and explicit spatialization for exploratory search. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 415–424. ACM.

Cheng-Hui Huang, Jian Yin, and Fang Hou. 2011. A text similarity measurement combining word semantic information with TF-IDF method. Jisuanji Xuebao (Chinese Journal of Computers), 34(5):856–864.

Aminul Islam and Diana Inkpen. 2008. Semantic text similarity using corpus-based word similarity and string similarity. ACM Transactions on Knowledge Discovery from Data (TKDD), 2(2):10.

Jay J. Jiang and David W. Conrath. 1997. Semantic similarity based on corpus statistics and lexical taxonomy. arXiv preprint cmp-lg/9709008.

Sergio Jimenez, Claudia Becerra, and Alexander Gelbukh. 2013. Softcardinality-core: Improving text overlap with distributional measures for semantic textual similarity. In Proceedings of *SEM 2013, volume 1, pages 194–201.

Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 55–60.
Enrico Mensa, Daniele P. Radicioni, and Antonio Lieto. 2017. MERALI at SemEval-2017 Task 2 Subtask 1: A cognitively inspired approach. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 236–240, Vancouver, Canada. Association for Computational Linguistics.

Enrico Mensa, Daniele P. Radicioni, and Antonio Lieto. 2018. COVER: A linguistic resource combining common sense and lexicographic information. Language Resources and Evaluation, 52(4):921–948.

Rada Mihalcea, Courtney Corley, Carlo Strapparava, et al. 2006. Corpus-based and knowledge-based measures of text semantic similarity. In AAAI, volume 6, pages 775–780.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119.

George A. Miller. 1995. WordNet: A lexical database for English. Communications of the ACM, 38(11):39–41.

Andrea Moro, Alessandro Raganato, and Roberto Navigli. 2014. Entity linking meets word sense disambiguation: A unified approach. Transactions of the Association for Computational Linguistics, 2:231–244.

Roberto Navigli. 2009. Word sense disambiguation: A survey. ACM Computing Surveys (CSUR), 41(2):10.

Roberto Navigli and Simone Paolo Ponzetto. 2010. BabelNet: Building a very large multilingual semantic network. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 216–225. Association for Computational Linguistics.

Roberto Navigli and Simone Paolo Ponzetto. 2012. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artificial Intelligence, 193:217–250.

Christian Paul, Achim Rettinger, Aditya Mogadala, Craig A. Knoblock, and Pedro Szekely. 2016. Efficient graph-based document similarity. In European Semantic Web Conference, pages 334–349. Springer.

Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.

Jay Pujara, Hui Miao, Lise Getoor, and William Cohen. 2013. Knowledge graph identification. In International Semantic Web Conference, pages 542–557. Springer.

Philip Resnik. 1995. Using information content to evaluate semantic similarity in a taxonomy. arXiv preprint cmp-lg/9511007.

Ray Richardson, A. Smeaton, and John Murphy. 1994. Using WordNet as a knowledge base for measuring semantic similarity between words.

Michael Schuhmacher and Simone Paolo Ponzetto. 2014. Knowledge-based graph document modeling. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining, pages 543–552. ACM.

Robert Speer, Joshua Chin, and Catherine Havasi. 2017. ConceptNet 5.5: An open multilingual graph of general knowledge. In AAAI, pages 4444–4451.

Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2007. YAGO: A core of semantic knowledge. In Proceedings of the 16th International Conference on World Wide Web, pages 697–706. ACM.

Rocco Tripodi and Marcello Pelillo. 2017. A game-theoretic approach to word sense disambiguation. Computational Linguistics, 43(1):31–70.

Amos Tversky. 1977. Features of similarity. Psychological Review, 84(4):327.

Denny Vrandecic and Markus Krötzsch. 2014. Wikidata: A free collaborative knowledge base. Communications of the ACM, 57(10).