Reconciling Event-Based Knowledge Through RDF2VEC

Mehwish Alam1, Diego Reforgiato Recupero2,3, Misael Mongiovi2, Aldo Gangemi1,2, Petar Ristoski4

1. LIPN, Université Paris 13, France
2. ISTC-CNR, Rome, Catania, Italy
3. University of Cagliari, Italy
4. University of Mannheim, Germany

Abstract. Reconciled knowledge graphs are typically used for multi-document summarization, or to detect knowledge evolution across document series. This paper focuses on reconciling knowledge graphs generated from two text documents about similar events described differently. Our approach employs and extends MERGILO, a tool for reconciling knowledge graphs extracted from text, using word similarity and graph alignment. Complete semantic representations of events are generated using FRED, a semantic web machine reader, jointly with Framester, a linguistic linked data hub represented using a novel formal semantics for frames. Event reconciliation is mainly performed via similarities based on the graph structure of frames using RDF2Vec graph embeddings, and the subsumption hierarchy of semantic roles as defined in Framester. Our approach is evaluated over a coreference resolution task.

Keywords: Knowledge Reconciliation, Event Reconciliation, Frame Embeddings, Frame Similarity, Role Similarity, Role Embeddings, Framester.

1 Introduction

This study targets the problem of knowledge reconciliation (KR) [18] from the perspective of events. KR is useful for combining multiple graphs generated from multiple texts describing the same event. The merged graph provides a graph-based summary of the texts that is more easily comprehensible by users and machines, and usable by algorithms providing interactive exploration of graphs and text analytics through visualization methods.
MERGILO [18] is a tool for reconciling knowledge graphs extracted from text: it first computes the word similarity between the node labels and then performs graph alignment over the complete graphs. When different verbs denote similar events and different agents play slightly different roles, the string matching techniques used in MERGILO might not be appropriate in the KR process. To overcome this limitation we use Frame Semantics, which describes a situation in the text with the help of frames and roles. For identifying frames and semantic roles of entities in a text we use FRED [12], a machine reader, which generates event-centered knowledge graphs from two different texts. Then, the similarity between these events is computed by calculating the similarity between the corresponding FrameNet [2] frames and semantic roles (frame elements). We adapt WordNet [10] similarity measures [4] to frames and roles, and use vector-based similarities computed over the FrameNet graph and the subsumption hierarchy of roles as defined in Framester [11]. We follow the RDF2Vec approach [23] to generate graph-based frame embeddings, used to calculate the semantic similarity between frames. RDF2Vec uses graph mining algorithms such as graph walks and graph kernels to traverse the graph and generate sequences, which are then fed to a neural model that produces vector representations. An evaluation on cross-document coreference resolution shows an improvement over the baseline on all considered metrics.

The rest of this paper is structured as follows. Section 2 lists the data sources, resources and tools adopted in our methodology. Section 3 reviews the state of the art. Section 4 gives some details of MERGILO and its functionalities and explains how frame semantics has been employed to improve MERGILO. Section 5 shows a precision-recall analysis for the presented approach on the EECB dataset.
Finally, Section 6 concludes the paper with discussions and remarks, and highlights some future directions.

2 Role Oriented Resources

FrameNet [2] contains frames, which describe a situation, state or action. Each frame has frame elements, also known as semantic roles, usually including agent, patient, time and location. Each frame can be evoked by Lexical Units (LUs) belonging to different parts of speech. These LUs can be nouns, verbs, adjectives and adverbs representing closely related sets of meanings. For example, in the frame Conquering the argument of the role Conqueror overtakes the argument of the role Theme, where the theme loses its autonomy. Such constructs describing the situation of conquering or invasion are referred to as frame elements, and LUs such as conquer, overtake, etc. are example words typically used to denote conquering situations in text. In the example below, The Spaniards is the argument of the role Conqueror, the Incas is the argument of the role Theme, and conquered is the LU evoking the frame.

[The Spaniards]Conqueror [conquered]Lexical Unit [the Incas]Theme. (1)

Framester [11] is a large RDF1 knowledge graph (currently including about 30 million RDF triples) acting as a hub between FrameNet, WordNet, VerbNet [14], BabelNet [19], Predicate Matrix [6], etc. At its core, Framester uses a mapping between WordNet, BabelNet, VerbNet and FrameNet built with a detour-based approach, and expands it transitively to other linguistic resources. It further links these resources to important ontological and linked data resources such as DBpedia, YAGO, DOLCE-Zero [20], schema.org, etc. Framester keeps the original FrameNet graph, where the nodes represent the FrameNet frames and the edges represent different semantic relations between the frames, i.e., Inheritance, SubFrame, CausativeOf, etc. Figure 1 shows a part of the FrameNet graph. Framester also contains a new subsumption hierarchy of semantic roles (i.e., frame elements) and adds generic roles on top of the frame-specific roles. Figure 2 shows a part of the Framester role hierarchy associated with the Framester role Agent.2

1 https://www.w3.org/TR/rdf11-primer/
2 The prefixes for http://www.ontologydesignpatterns.org/ont/framenet/abox/gfe/ and http://www.ontologydesignpatterns.org/ont/framester/data/framesterrole.ttl# are gfe: and framesterrole: respectively.

[Figure 1: graph over the frames Event, Event initial state, Event end state, Objective influence, Motion, Transitive action, Control, Intentionally affect, Mass motion, Motion Noise, Invasion Scenario, Attack, Invading, Conquering, Besieging and Repel.]
Fig. 1: A part of FrameNet graph. “prec.” represents the relation “precedes”, dotted lines represent the “SubFrame” relation and solid lines represent the “Inheritance” relation as defined in FrameNet.

[Figure 2: hierarchy rooted at framesterrole:Agent, with gfe:Invader and gfe:Agent above the frame-specific roles Invader.Invasion Scenario, Agent.Attending, Agent.Intentionally Affect, Agent.Adjusting, Assailant.Attack, Agent.Arranging, Conqueror.Conquering, Enemy.Repel, Assailant.Besieging and Assailant.Defend.]
Fig. 2: A part of Subsumption Hierarchy with FrameNet and Framester Roles.

[Figure 3]
Fig. 3: FRED Knowledge Graph for Example 1

FRED [12]3 is a machine reader which generates ontological structures from natural language text using Discourse Representation Theory (DRT), frame semantics and Ontology Design Patterns. FRED uses Boxer,4 an open source tool for deep parsing of natural language using Combinatory Categorial Grammar (CCG), which produces event-based semantic representations of natural language. The Discourse Representation Structures (DRS) produced by Boxer use VerbNet thematic roles. These functionalities implemented in FRED support the event detection task in our method.

3 State of the Art

Approaches for integrating knowledge include cross-document coreference resolution (when knowledge is represented as text documents) and ontology matching (when knowledge is in a machine-readable form).
Cross-document coreference resolution aims at associating mentions of the same entity (object, person, concept, etc.) across different texts [8]. When the extracted entities are events, the problem becomes the resolution of event coreference across documents [3]. The authors in [16] jointly model named entities and events. Clusters of entity and event mentions are constructed and merged according to a similarity threshold based on linear regression. Then, information flows between entity and event clusters through features that model semantic role dependencies. The system handles nominal and verbal events as well as entities, and the joint formulation allows information from event coreference to help entity coreference, and vice versa.

A rich overview of ontology matching methods is provided by [9]. Relevant work includes [24], which leverages the interplay between schema and instance matching. Similarly, [15] shows a greedy iterative algorithm for aligning knowledge bases with millions of entities and facts. These approaches perform best on large ontologies/datasets, which are rarely (probably never) derived from text sources. MERGILO, like other knowledge integration tools [15], employs graph alignment, a more general and widely studied problem [26]. Note that all these approaches are connected and related to the classical graph matching problem [22]. We address this problem from the perspective of events, by taking advantage of frame embeddings, i.e., the vector representations of linguistic frames and semantic roles.

3 http://wit.istc.cnr.it/stlab-tools/fred
4 https://github.com/valeriobasile/candcapi

Recently, word embeddings have been used in a variety of Information Retrieval and Natural Language Processing applications.
One recent application, SensEmbed [13], generates vector representations of BabelNet word senses and uses them to improve the results of word similarity and word analogy tasks. [5] apply Frame Semantics and Distributional Semantics for slot filling in Spoken Dialogue Systems. In [27], the authors use Word and Frame Embeddings for generating categories of annoying behaviors, where each category contains a set of words specific to that category. The frame embeddings are generated using 3.8 million tweets tagged with FrameNet frames using SEMAFOR. By contrast, in this study we use graph-based Frame and Role Embeddings.

4 Event-Based Knowledge Reconciliation

Consider the two sentences:

Sent1: “The Spaniards conquered the Incas.”
Sent2: “The Incas were attacked by the Spaniards.”

They describe the same past event using different words, i.e., an attack on or an invasion of the Incas by the Spaniards. Figure 3 shows the FRED graph of Sent1. Given two such knowledge graphs, MERGILO first performs graph compression by merging the nodes in the same graph. The two compressed graphs are then aligned by establishing a 1-1 correspondence between the nodes of the two graphs that maximizes a score function, which combines the similarity between aligned nodes and the similarity between aligned edges. In such a case, the similarity between “conquered” and “attacked” is not effective since the word similarity is low, although in this context such words describe the same event.

For computing the similarity between two nodes containing verb senses, the verb senses are first mapped to frames using Framester mappings. For example, in Figure 3 we have $s_1 = \texttt{vn.data:Conquer\_42030000}$, and for Sent2 we have $s_2 = \texttt{vn.data:Attack\_33000000}$. According to the Framester mappings, we obtain $s_1 \to \{\mathit{Conquering}\}$ and $s_2 \to \{\mathit{Attack}\}$. These nodes are replaced by their corresponding frames.
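As a rough illustration of this step, the sketch below replaces the two verb-sense nodes with their frames and then compares the frames over an inheritance hierarchy using path and Wu-Palmer similarity. The mapping table, the toy child-to-parent hierarchy and all function names are assumptions made for this example: the hierarchy is not the actual FrameNet inheritance graph, and the formulas are the standard WordNet-style definitions rather than code taken from MERGILO or Framester.

```python
from typing import Dict, List, Optional

# Hypothetical verb-sense -> frame table, mirroring the two mappings in the text.
SENSE_TO_FRAME: Dict[str, str] = {
    "vn.data:Conquer_42030000": "Conquering",
    "vn.data:Attack_33000000": "Attack",
}

# Toy inheritance hierarchy (child -> parent). Illustrative only: it is an
# assumption for this sketch, not the real FrameNet inheritance graph.
PARENT: Dict[str, Optional[str]] = {
    "Event": None,
    "Intentionally_affect": "Event",
    "Attack": "Intentionally_affect",
    "Conquering": "Intentionally_affect",
}


def ancestors(frame: str) -> List[str]:
    """Chain from `frame` up to the root, starting with `frame` itself."""
    chain: List[str] = []
    node: Optional[str] = frame
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(frame: str) -> int:
    """Number of nodes on the path to the root (the root has depth 1)."""
    return len(ancestors(frame))


def lowest_common_subsumer(a: str, b: str) -> str:
    """Deepest frame that is an ancestor of (or equal to) both frames."""
    seen = set(ancestors(a))
    for node in ancestors(b):  # walks upward from b, so the first hit is deepest
        if node in seen:
            return node
    raise ValueError("frames share no common ancestor")


def path_similarity(a: str, b: str) -> float:
    """1 / (1 + number of edges on the path between a and b through the LCS)."""
    lcs_depth = depth(lowest_common_subsumer(a, b))
    edges = (depth(a) - lcs_depth) + (depth(b) - lcs_depth)
    return 1.0 / (1.0 + edges)


def wu_palmer(a: str, b: str) -> float:
    """2 * depth(LCS) / (depth(a) + depth(b))."""
    lcs_depth = depth(lowest_common_subsumer(a, b))
    return 2.0 * lcs_depth / (depth(a) + depth(b))


# Replace the verb-sense nodes by their frames, then compare the frames.
f1 = SENSE_TO_FRAME["vn.data:Conquer_42030000"]
f2 = SENSE_TO_FRAME["vn.data:Attack_33000000"]
print(f1, f2, path_similarity(f1, f2), wu_palmer(f1, f2))
```

In this toy hierarchy both frames sit at depth 3 under a shared parent at depth 2, so the path similarity is 1/3 and the Wu-Palmer similarity is 2/3, whereas a string comparison of “conquered” and “attacked” would find little in common.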
The edges containing the VN-roles are mapped to FN-roles. For example, in Figure 3, the verb sense5 vn.data:Conquer_42030000 evokes the roles vn.role:Agent and vn.role:Patient, which are mapped to fe:Conqueror.conquering and fe:Theme.conquering respectively. In Sent2, the roles evoked by the verb sense vndata:Attack_33000000 are vndata:Agent and vndata:Theme. The Framester mappings contain the following records for these roles:

vndata:Agent.conquer_42030000 skos:closeMatch fe:Conqueror.conquering .
vndata:Patient.conquer_42030000 skos:closeMatch fe:Theme.conquering .
vndata:Agent.attack_33000000 skos:closeMatch fe:Assailant.attack .
vndata:Theme.attack_33000000 skos:closeMatch fe:Victim.attack .

5 prefix vn.data: http://www.ontologydesignpatterns.org/ont/vn/vn31/data/

Then the similarities are computed in two ways: (i) by considering the taxonomical structure imposed by the “inheritance” relation6, represented as fnschema:inheritsFrom in Framester, using Path Similarity, Wu-Palmer Similarity and Leacock-Chodorow Similarity; (ii) using Frame Embeddings.

Frame Embeddings using RDF2Vec: To learn latent numerical representations of the frames and roles in the FrameNet graph, we follow the RDF2Vec approach. First we transform the graph into a set of sequences of entities, which is then fed into a neural language model, resulting in vector representations of all the nodes in the graph in a latent feature space. To convert the graph into a set of sequences of entities we use two approaches, i.e., graph walks and Weisfeiler-Lehman Subtree RDF Graph Kernels.

(i) Graph Walks: given a graph $G = (V, E)$, for each vertex $v \in V$ we generate all graph walks $P_v$ of depth $d$ rooted in vertex $v$. To generate the walks, we use the breadth-first algorithm. In the first iteration, the algorithm generates paths by exploring the direct outgoing edges of the root node $v_r$.
In the second iteration, for each of the previously explored edges, the algorithm visits the connected vertices. The final set of sequences for the given graph $G$ is the union of the sequences of all the vertices, $P_G = \bigcup_{v \in V} P_v$.

(ii) Graph Kernels: this approach computes the number of sub-trees shared between two or more graphs by using the Weisfeiler-Lehman [7] test of graph isomorphism. The algorithm creates labels representing subtrees.

Once the set of sequences of entities is extracted, we build a word2vec model. Word2vec is a computationally efficient two-layer neural network model for learning word embeddings from raw text. There are two different algorithms, the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model. The CBOW model predicts target words from context words within a given window, while the Skip-Gram model does the inverse. Once the training is finished, the cosine similarity is computed between the vectors of two frames or of two roles.

6 prefix fnschema: http://www.ontologydesignpatterns.org/ont/framenet/tbox/

5 Evaluation

The experiments were conducted for the task of Cross-document Coreference Resolution (CCR) on RDF graphs, which focuses on associating RDF nodes about the same entity (object, person, concept, etc.) across different RDF graphs generated from text. The data set used for the experimentation was obtained from the EECB data set, which specifies coreferent mentions (text fragments). Our dataset was obtained by generating RDF graphs using FRED and associating text mentions to graph nodes by manual annotation. The framework is built on top of the original MERGILO code, which was released as a Python tool.7 IBM ILOG CPLEX 12.6.1 was used for solving the Integer Linear Program and the experiments were conducted on a MacOS server with a 6-Core Intel Xeon E5 3.50GHz and 64GB of RAM.
We used the following metrics for evaluation:

(i) MUC [25]: link-based metric that quantifies the number of merges necessary to cover predicted and gold clusters;
(ii) $B^3$ [1]: mention-based metric that quantifies the overlap between predicted and gold clusters for a given mention;
(iii) CEAFM (Constrained Entity Aligned F-measure Mention-based) [17]: mention-based metric based on a one-to-one alignment between gold and predicted clusters;
(iv) CEAFE (Constrained Entity Aligned F-measure Entity-based) [17]: entity-based metric based on a one-to-one alignment between gold and predicted clusters;
(v) BLANC (Bilateral Assessment of Noun-Phrase Coreference) [21]: Rand-index-based metric that considers both coreference and non-coreference links.

For the current evaluation, MERGILO was considered as the baseline. Table 1 shows the results for the baseline method, the Wu-Palmer similarity, the Path similarity, the Leacock-Chodorow similarity, and the cosine similarity using (i) graph walks and (ii) graph kernels with FrameNet roles, respectively. Here Frame2Vec refers to the vector representations generated for FrameNet frames and Role2Vec refers to the vector representations generated for frame elements, i.e., semantic roles. For the first approach with graph walks, for each entity in the FrameNet graph 200 and 500 random walks were generated, each of depth 4 and 8. For each entity in the subsumption hierarchy of roles we generate 400 random walks with depth 4. For the Weisfeiler-Lehman algorithm, we use $h = 2$ iterations and subgraph depth $d = 2$, and after each iteration of the algorithm we extract all walks for each entity with the same depth. We use these sequences to build both CBOW and Skip-Gram models with the following parameters: window size = 5; number of iterations = 10; negative sampling for optimization; negative samples = 25; with average input vector for CBOW. We experiment with 200 and 500 dimensions for the entities' vectors.
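The walk-based setup above can be sketched in a few lines. The following is a minimal illustration, not the RDF2Vec implementation: the toy graph, the function names, the fixed seed and the convention that depth counts traversed edges are all assumptions made for this example. It draws a fixed number of random walks per vertex, collects them into one corpus of sequences, and shows the cosine similarity that would later be applied to the learned vectors. Training the CBOW or Skip-Gram model on the sequences is left out, since it would require a word2vec implementation.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

# vertex -> list of (edge label, target vertex); a toy graph assumed for the sketch
Graph = Dict[str, List[Tuple[str, str]]]

TOY_GRAPH: Graph = {
    "Attack": [("inheritsFrom", "Intentionally_affect")],
    "Conquering": [("inheritsFrom", "Intentionally_affect")],
    "Intentionally_affect": [("inheritsFrom", "Event")],
    "Event": [],
}


def random_walks(graph: Graph, root: str, num_walks: int, depth: int,
                 rng: random.Random) -> List[List[str]]:
    """Draw `num_walks` random walks from `root`, each following at most `depth` edges."""
    walks: List[List[str]] = []
    for _ in range(num_walks):
        walk = [root]
        current = root
        for _ in range(depth):
            edges = graph.get(current, [])
            if not edges:  # dead end: the walk stops early
                break
            label, target = rng.choice(edges)
            walk.extend([label, target])
            current = target
        walks.append(walk)
    return walks


def build_corpus(graph: Graph, num_walks: int, depth: int, seed: int = 0) -> List[List[str]]:
    """Union of the walks of all vertices, i.e. the sequences fed to word2vec."""
    rng = random.Random(seed)
    corpus: List[List[str]] = []
    for vertex in graph:
        corpus.extend(random_walks(graph, vertex, num_walks, depth, rng))
    return corpus


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


sequences = build_corpus(TOY_GRAPH, num_walks=3, depth=4)
print(len(sequences), sequences[0])
```

With the settings reported above, `num_walks` would be 200 or 500 and `depth` 4 or 8 for the FrameNet graph, and 400 with depth 4 for the role hierarchy. Each resulting sequence plays the role of a sentence when training the embedding model.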
The results indicate that each model used for graph walks and graph kernels performs better than the MERGILO baseline on all the considered metrics, showing a clear advantage of using the proposed frame similarities for reconciling knowledge graphs. The Wu-Palmer, Path and Leacock-Chodorow measures use the inheritance relations only, whereas Frame2Vec employs either graph walks or graph kernels over the FrameNet frame graph as well as the subsumption hierarchy of FrameNet roles, using either only FrameNet roles or the improved subsumption hierarchy of FrameNet roles as introduced in Framester. Based on these settings, vector representations are generated, which are further used for computing the cosine similarity. In general, Frame2Vec, by its intrinsic construction, exploits more semantics than the other similarity measures (Wu-Palmer, Path and Leacock-Chodorow); for this reason, Frame2Vec provides the highest results for almost every evaluation measure except BLANC.

7 http://wit.istc.cnr.it/stlab-tools/mergilo

Method                            muc     bcub    ceafm   blanc   ceafe
MERGILO Baseline                  24.05   17.36   28.61   10.70   26.20

Similarity Measures               muc     bcub    ceafm   blanc   ceafe
Wu-Palmer                         27.14   19.91   31.91   12.81   29.41
Path                              27.16   19.93   31.85   12.73   29.38
Leacock-Chodorow                  27.04   19.80   31.74   12.77   29.21

Graph walks
Frame2Vec     Role2Vec            muc     bcub    ceafm   blanc   ceafe
CBOW 200      CBOW 200            27.34   19.99   32.15   12.66   29.82
CBOW 200      SG 800              27.38   19.97   32.29   12.69   29.98
CBOW 200      SG 500              27.28   19.95   31.99   12.69   29.54
CBOW 200      CBOW 500            27.09   19.03   29.95   11.91   28.97
CBOW 500      SG 500              26.90   19.68   31.58   12.60   29.08
SG 200        SG 500              26.87   19.57   31.33   12.10   29.01
SG 500        SG 500              26.85   19.45   31.12   12.08   28.98

Graph kernels
Frame2Vec     Role2Vec            muc     bcub    ceafm   blanc   ceafe
CBOW 200      CBOW 200            26.76   19.57   31.50   12.45   29.06
CBOW 200      CBOW 500            26.76   19.57   31.50   12.45   29.06
CBOW 200      SG 200              26.70   19.52   31.45   12.40   28.99
CBOW 200      SG 500              26.70   19.52   31.45   12.40   28.99
CBOW 500      CBOW 200            26.76   19.51   31.45   12.45   28.96
SG 200        CBOW 200            26.86   19.62   31.67   12.48   29.18
SG 500        CBOW 200            26.90   19.68   31.58   12.60   29.08

Table 1: Event-Based Knowledge Reconciliation Results. The best results are marked in bold.

BLANC is more sensitive to wrong assignments when clusters of mentions are larger, since a wrong assignment leads to a higher number of wrong non-coreference links. Therefore, although BLANC is case-by-case coherent with the other measures (when BLANC is low, the other measures are low and vice versa), in the few cases when Frame2Vec is outperformed by other measures (Wu-Palmer, Path and Leacock-Chodorow), the BLANC measure, and in particular the contribution given by non-coreference links, gives a much smaller score. These cases influence the overall average, and for this reason in Table 1 BLANC seems to behave differently from the other measures.

The generated models, i.e., vector representations of FrameNet frames generated using the FrameNet graph and the subsumption hierarchy of FrameNet roles using RDF2Vec, are freely available online.8

8 http://lipn.univ-paris13.fr/~alam/Frame2Vec/

6 Conclusions and Discussion

This paper presents a way to perform event reconciliation for merging multiple event-oriented knowledge graphs originating from multiple texts. It uses the existing tool MERGILO, which reconciles knowledge graphs using word similarity and graph alignment. The current study exploits several path-based similarity measures for frames and semantic roles and, following the RDF2Vec approach, graph-based frame embeddings. The evaluation shows that the introduced approach improves over the baseline. Ongoing work concentrates on practical applications of frame embeddings in real systems, such as news series integration, knowledge graph evolution with robust event reconciliation (e.g.
in streaming of texts where we expect relatedness or updates), or conflict detection across texts describing similar facts with different narratives or perspectives.

References

1. Amit Bagga and Breck Baldwin. Algorithms for scoring coreference chains. In The first international conference on language resources and evaluation workshop on linguistics coreference, volume 1, pages 563–566. Citeseer, 1998.
2. Collin F Baker, Charles J Fillmore, and John B Lowe. The Berkeley FrameNet project. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics-Volume 1, pages 86–90, 1998.
3. Cosmin Bejan and Sanda Harabagiu. Unsupervised event coreference resolution. Comput. Linguist., 40(2):311–347, June 2014.
4. Alexander Budanitsky and Graeme Hirst. Evaluating WordNet-based measures of lexical semantic relatedness. Computational Linguistics, 32(1):13–47, 2006.
5. Yun-Nung Chen, William Yang Wang, and Alexander I Rudnicky. Jointly modeling inter-slot relations by random walk on knowledge graphs for unsupervised spoken language understanding. In HLT-NAACL, pages 619–629, 2015.
6. Maddalen Lopez De Lacalle, Egoitz Laparra, and German Rigau. Predicate Matrix: extending SemLink through WordNet mappings. In LREC, pages 903–909, 2014.
7. Gerben Klaas Dirk de Vries and Steven de Rooij. Substructure counting graph kernels for machine learning from RDF data. Web Semantics: Science, Services and Agents on the World Wide Web, 35:71–84, 2015.
8. Sourav Dutta and Gerhard Weikum. Cross-document co-reference resolution using sample-based clustering with knowledge enrichment. Transactions of the Association of Computational Linguistics, 3(1):15–28, 2015.
9. Jérôme Euzenat and Pavel Shvaiko. Ontology matching. Springer-Verlag, Heidelberg, 2nd edition, 2013.
10. Christiane Fellbaum, editor. WordNet: an electronic lexical database. MIT Press, 1998.
11. Aldo Gangemi, Mehwish Alam, Luigi Asprino, Valentina Presutti, and Diego Reforgiato Recupero. Framester: a wide coverage linguistic linked data hub. In Knowledge Engineering and Knowledge Management: 20th International Conference, Bologna, Italy, pages 239–254, 2016.
12. Aldo Gangemi, Valentina Presutti, Diego Reforgiato Recupero, Andrea Giovanni Nuzzolese, Francesco Draicchio, and Misael Mongiovì. Semantic Web Machine Reading with FRED. Semantic Web Journal, 2016.
13. Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. SensEmbed: Learning sense embeddings for word and relational similarity. In ACL (1), pages 95–105, 2015.
14. Karin Kipper Schuler. VerbNet: A Broad-coverage, Comprehensive Verb Lexicon. PhD thesis, Philadelphia, PA, USA, 2005. AAI3179808.
15. Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, and Zoubin Ghahramani. SiGMa: Simple greedy matching for aligning large knowledge bases. In KDD2013, pages 572–580, New York, USA, 2013. ACM.
16. Heeyoung Lee, Marta Recasens, Angel Chang, Mihai Surdeanu, and Dan Jurafsky. Joint entity and event coreference resolution across documents. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 489–500, 2012.
17. Xiaoqiang Luo. On coreference resolution performance metrics. In Proc. of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 25–32. Association for Computational Linguistics, 2005.
18. Misael Mongiovì, Diego Reforgiato Recupero, Aldo Gangemi, Valentina Presutti, and Sergio Consoli. Merging open knowledge extracted from text with MERGILO. Knowl.-Based Syst., 108:155–167, 2016.
19. Roberto Navigli and Simone Paolo Ponzetto. BabelNet: The Automatic Construction, Evaluation and Application of a Wide-Coverage Multilingual Semantic Network. Artificial Intelligence, 193:217–250, 2012.
20. A. G. Nuzzolese, A. Gangemi, V. Presutti, P. Ciancarini, and A. Musetti. Automatic Typing of DBpedia Entities. In Proc. of the International Semantic Web Conference (ISWC), Boston, MA, US, 2012.
21. Marta Recasens and Eduard Hovy. BLANC: Implementing the Rand index for coreference evaluation. Natural Language Engineering, 17(04):485–510, 2011.
22. Diego Reforgiato Recupero. Efficient graph matching. Encyclopedia of Data Warehousing and Mining, pages 736–743, 2009.
23. Petar Ristoski and Heiko Paulheim. RDF2Vec: RDF graph embeddings for data mining. In ISWC, pages 498–514, 2016.
24. Fabian M. Suchanek, Serge Abiteboul, and Pierre Senellart. PARIS: Probabilistic alignment of relations, instances, and schema. In Proceedings of the VLDB Endowment, volume 5, pages 157–168. VLDB Endowment, 2011.
25. Marc Vilain, John Burger, John Aberdeen, Dennis Connolly, and Lynette Hirschman. A model-theoretic coreference scoring scheme. In Proceedings of the 6th conference on Message understanding, pages 45–52, 1995.
26. Joshua T Vogelstein, John M Conroy, Vince Lyzinski, Louis J Podrazik, Steven G Kratzer, Eric T Harley, Donniell E Fishkind, R Jacob Vogelstein, and Carey E Priebe. Fast approximate quadratic programming for graph matching. PLoS One, 10(4):e0121002, 2015. PMID: 25886624.
27. William Yang Wang and Diyi Yang. That’s so annoying!!!: A lexical and frame-semantic embedding based data augmentation approach to automatic categorization of annoying behaviors using #petpeeve tweets. In EMNLP, pages 2557–2563, 2015.