<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Strategies for creating knowledge graphs to depict a multi-perspective Queer communities representation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Louann Coste</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Flora Helmers</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hammamache Kheddouci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Léo Le Nestour</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahsa Niazi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Genoveva Vargas-Solar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CNRS</institution>
          ,
          <addr-line>Univ Lyon, INSA Lyon, UCBL, LIRIS, UMR5205, F-69221</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ecole Normale Supérieure de Lyon</institution>
          ,
          <addr-line>15 parvis René Descartes, 69342, Lyon</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>This paper introduces an experimental study for building knowledge graphs about queer communities from different perspectives. The paper describes a set of pipelines implementing several knowledge graph construction strategies that lead to a complementary understanding of the queer communities. We implemented a library with the knowledge graph construction pipelines and conducted experiments to evaluate execution cost against the comprehensiveness of the content. The library includes visualization tools for observing the knowledge graphs at several granularities. Finally, we use storytelling to provide a first interpretation of our results regarding the profiling of queer communities.</p>
      </abstract>
      <kwd-group>
        <kwd>Data science pipelines</kwd>
        <kwd>graph analytics</kwd>
        <kwd>knowledge graphs</kwd>
        <kwd>queer history</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Heterogeneous datasets can be structured into networks. The notion of "graph" is a powerful mathematical concept for representing these networks. Graphs can be exploited through workflows that apply algorithms to solve data science problems. When the graphs become too large, the processes used to explore and analyse them become costly. Therefore, deployment strategies on target architectures can fulfil the resource-consumption requirements in an adaptable way [1].</p>
      <p>This paper introduces an experimental approach for building knowledge graphs to reveal several perspectives of Queer History. Our work models the pipelines to explore the Wikidata knowledge graph and extract subgraphs about queer identities, combining different variables such as age, artistic production, and geographical position. Extracted graphs are studied, classified, and profiled thoroughly to estimate the allocation of resources. The pipelines include performing unions, merges, and community discovery to build several perspectives of Queer History.</p>
      <p>The remainder of the paper is organised as follows. Section 2 introduces work addressing the construction and exploitation of knowledge graphs. Section 3 introduces our approach based on different pipelines for building queer history knowledge graphs. Section 4 reports our experiments. Finally, Section 5 concludes the paper and discusses future work.</p>
      <p>Published in the Workshop Proceedings of the EDBT/ICDT 2023 Joint Conference (March 28-March 31, 2023, Ioannina, Greece). * Corresponding author: Genoveva Vargas-Solar. † Authors contributed equally and are enumerated in alphabetical order. louann.coste@ens-lyon.fr (L. Coste); flora.helmers@ens-lyon.fr (F. Helmers); hamamache.kheddouci@univ-lyon1.fr (H. Kheddouci); leo.le_nestour@ens-lyon.fr (L. Le Nestour); mahsa.niazi@ens-lyon.fr (M. Niazi); genoveva.vargas-solar@cnrs.fr (G. Vargas-Solar).</p>
    </sec>
    <sec id="sec-related">
      <title>2. Related Work</title>
      <p>Knowledge graphs construction and exploration. A knowledge graph is a directed graph G = (N, R) whose nodes N represent entities and literal values (literals), and whose edges R represent relations between these entities. The graph can be divided into two parts: the data and the knowledge. The knowledge can come from several sources that describe entities (i.e., concepts) and from relations among entities that can be explicitly stated or discovered [2]. Knowledge graphs defined using the RDF model can be queried using the SPARQL query language. Knowledge graphs are used to integrate bibliographical data and to exhibit references and paper co-authorship. Examples of solutions are DIG [3], Knowledge Vault [4], CiteSeerX [5], Pujara et al. [6], Knowledge Graph Identification [7] and NELL [8]. The authors of [9] propose a framework for incrementally building a knowledge graph of artists, used to link the data from different museums. These approaches rely on a predefined schema describing the data used to build the graph. Other approaches adopt the opposite strategy and do not use a schema for building knowledge graphs [10, 11, 12]. Knowledge graphs can also be built by consolidating data from different sources, as in LDIF [13], YAGO and YAGO2 [14, 15], and Freebase [16].</p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).</p>
      <p>Modelling Queer History with graphs. Queer history was not recorded in a structured way for a long time. The most prominent project willing to integrate the history of LGBTQ+ communities is the WikiProject LGBT 1, for developing LGBT-themed content in Wikidata and advocating for the community of contributors developing this content. Wikidata contains self-organized archives starting in the 1970s. In Munich, the Forum Queeres Archiv has been collecting, researching, and publishing for 20 years: shelf metres full of bequests, files, books, journals, and objects. The project Queer Data 2 transforms analogue queer history into linked open data and incorporates it into an open, freely accessible, public data infrastructure. The knowledge graph contains 7415 entities from the data sources of the Forum Queeres Archiv München, with more than 38,000 connections.</p>
      <p>Discussion. The challenge in modelling Queer history is to use different properties to define identities and relations that can be profoundly diverse, and to provide a multi-perspective model. Exhibiting these perspectives can propose a more representative queer history knowledge graph. This paper shows different perspectives of queer communities by building and integrating graphs.</p>
      <p>The resulting .csv file has subject, predicate, and object columns. It can be transformed using the Pandas library into a DataFrame, or into a networkx Graph using the networkx.from_pandas_dataframe function. The interest of producing a networkx graph is that the library provides additional methods for graph analysis. In the example below, a graph is constructed with all triples where the subject is a non-heterosexual or non-cisgender individual. This activity corresponds to the data retrieval phase of the pipeline, which interacts with the Wikidata endpoint server (see Figure 1).</p>
      <p>CONSTRUCT {
  ?person ?predicate ?object .
}
WHERE {
  {
    ?person wdt:P31 wd:Q5 .        # ?person is a human
    ?person ?predicate ?object .
    {</p>
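The DataFrame-to-graph conversion described above can be sketched as follows. Current NetworkX releases expose from_pandas_edgelist (the from_pandas_dataframe name mentioned in the text is an older spelling of the same step), and the triples below are toy stand-ins for the .csv downloaded from the query service, not the paper's data.

```python
import pandas as pd
import networkx as nx

# Toy stand-in for the CONSTRUCT result: a triple table with
# subject / predicate / object columns, as downloaded as .csv.
triples = pd.DataFrame(
    [
        ("wd:Q3304418", "wdt:P106", "wd:Q36180"),  # person -> occupation
        ("wd:Q3304418", "wdt:P27", "wd:Q142"),     # person -> country
        ("wd:Q230527", "wdt:P27", "wd:Q142"),
    ],
    columns=["subject", "predicate", "object"],
)

# A MultiDiGraph keeps parallel edges, each labelled by its predicate.
graph = nx.from_pandas_edgelist(
    triples,
    source="subject",
    target="object",
    edge_attr="predicate",
    create_using=nx.MultiDiGraph,
)
```

From here, all of the library's analysis methods (degrees, centrality, community detection) apply directly to the triple data.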
    </sec>
    <sec id="sec-2">
      <title>3. Building Knowledge Graph Pipelines</title>
      <p>The main contribution of our work is the design of three pipelines to build knowledge graphs using different techniques. The purpose is to propose a pipeline that: - balances the coverage of the resulting knowledge graph, that is, its scope concerning the queer communities (i.e., are all potential queer individuals part of the graph, and are their connections able to exhibit different communities); - reasonably consumes resources (CPU and memory) to explore, analyse and build dense graphs. The second contribution is the implementation of these pipelines within a library 3 that allows testing them by calibrating specific parameters such as the number of iterations, the number of nodes to visit, etc. We also provide a ready-to-use notebook to run the experiments we have conducted (see Section 4).</p>
      <p>Pipeline 1: Pure SPARQL-based knowledge graph. Figure 1 illustrates the pipeline that builds a pure SPARQL-based knowledge graph. We propose the construction of two initial graphs by querying Wikidata to retrieve individuals identified as non-cisgender and non-heterosexual, and their connections.</p>
      <p>    }
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" .
  }
}</p>
      <p>We create a second graph where the person is not the subject but the object.</p>
      <p>With these two graph views, we have all the connections to the elements. We retrieve the labels of the entities using the Wikibase label service. Retrieving the whole graph is costly (the evaluation of this query doubles the time of the first one), so we set a timeout and limit the size of the graph.</p>
      <p>The Wikidata service offers the WikiData Query Service endpoint, which allows users to run SPARQL queries on the WikiData database. SPARQL allows creating a graph with the CONSTRUCT keyword, and the resulting graph can be downloaded as a .csv file.</p>
      <p>We note that we have to clean the data because it contains duplicates and "empty" nodes. Indeed, some elements have different Internationalized Resource Identifiers (IRIs) but the same meaning: for instance, http://www.wikidata.org/prop/direct/P161 and http://www.wikidata.org/prop/statement/P161 give access to the same property, "cast member". However, since they have different terms, they are expressed as if they were different. We also get empty nodes which do not have labels, such as entities Q19151093 4 and Q19218452 5. Both have different IRIs and Wikipedia identifiers but potentially the same meaning, because their properties are identical. Because we only get access to the elements located at a distance of 1 from the people, we cannot correct these elements using only what we have gathered. Besides, we lose the added value of the knowledge graph, which is to automatically deduce information by using the varied distant relationships of the initial node. The principle of pipeline 1 is sound, but the execution cost is high according to the resources we have access to. Therefore, we propose alternative pipelines described in the following sections.</p>
      <p>1https://www.wikidata.org/wiki/Wikidata:WikiProject_LGBT 2https://queerdata.forummuenchen.org/en/ 3https://github.com/FloraHelmers/QueerHistoryProject</p>
      <p>      ?person wdt:P21 ?sexorgender .   # ?person has a sex or gender
      # ?sexorgender is not male, female, cisgender male,
      # cisgender female, or cisgender person
      FILTER (?sexorgender NOT IN (wd:Q6581097, wd:Q6581072,
                                   wd:Q15145778, wd:Q15145779,
                                   wd:Q1093205)) .
    } UNION {
      ?person wdt:P91 ?sexualorientation .   # ?person has a sexual orientation
      FILTER (?sexualorientation != wd:Q1035954) .   # not heterosexual</p>
      <p>Pipeline 2: Merging Stars. The construction pipeline of the knowledge graph about queer communities from Wikidata using merging stars can be summarised as follows: merge a list of graphs retrieved from Wikidata as .nt files that are akin to stars around a central entity node (see Figure 2). As shown in Figure 2, this pipeline starts with a SPARQL query to Wikidata that returns a list of Wikidata item IDs related to the queer community. This list, originally in JSON, is then used to extract the relevant information necessary to query the graphs associated with each entity on Wikidata (IDs and URLs). We iteratively parse new information by using these graphs, basically creating a merged RDF graph using the RDF data of the nodes from the Wikidata entity URLs in n-triples format (given in an .nt file format).</p>
      <p>The merging of two knowledge graphs in .nt file format is a process in which the data from two separate RDF graphs is combined into a single graph. The process is straightforward, as both graphs are in a standard format that can be easily integrated. The only subtleties that arise when merging two RDF graphs are related to the processing of shared blank nodes. Shared blank nodes are nodes that appear in both RDF graphs and are used to represent anonymous resources. These shared blank nodes must be merged in an integrated RDF graph to represent the same resource. The rules for merging shared blank nodes are defined in the W3C RDF 1.1 Semantics specification 6. The merging process must ensure that the integrated graph preserves the intended meaning of the original RDF graphs, including the relationships between the nodes in the graph. Therefore, the process must consider the context in which each shared blank node is used in both graphs and combine them to preserve the RDF data's intended meaning. Our final integrated graph is then converted to a NetworkX multidigraph, which proved to be the most helpful format for our analysis.</p>
      <p>Once the graph has been created, it is necessary to prune it in various ways: (1) removing "dead-end" nodes (i.e., nodes that have an in-degree of less than 2); "dead-end" nodes are often isolated and do not provide much information about the community; (2) removing duplicates (i.e., nodes that represent the same object literally) and isolated nodes (i.e., nodes that have both in-degree and out-degree of 0); these nodes are not connected to any other nodes in the graph and do not provide information about the community. Depending on the application, one can also specify more complex pruning processes with rule-based deletion before each merge. The pipeline is executed with two parameters: n, the size of the list of</p>
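A minimal sketch of the two pruning steps above, using NetworkX; the prune helper name and the toy graph are ours, not the paper's library or data.

```python
import networkx as nx

def prune(graph):
    # Sketch of the pruning policy described in the text:
    # (1) remove "dead-end" nodes, i.e. nodes whose in-degree is 0 or 1;
    # (2) remove isolated nodes, i.e. in-degree and out-degree both 0.
    g = graph.copy()
    g.remove_nodes_from([n for n in list(g) if g.in_degree(n) in (0, 1)])
    g.remove_nodes_from([n for n in list(g) if g.degree(n) == 0])
    return g

# Toy star-merge result: "a" and "b" are dead ends; "x" and "y" survive.
toy = nx.MultiDiGraph([("a", "x"), ("b", "y"), ("x", "y"), ("y", "x")])
pruned = prune(toy)
```

A prune_policy dictionary, as described above, would simply select which of these removal passes to run before each merge.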
      <sec id="sec-3-1">
        <title>4http://www.wikidata.org/entity/Q19151093</title>
        <p>5http://www.wikidata.org/entity/Q19218452</p>
      </sec>
      <sec id="sec-3-2">
        <title>6https://www.w3.org/TR/rdf11-mt/#shared-blank-nodes-unions-and-merges</title>
        <p>people used as a starting point, and a prune_policy dictionary describing which process for removing the nodes should be used.</p>
        <p>Pipeline 3: Crawler method. The crawler-based pipeline starts with a small number of nodes. It runs an iterative process to extract critical nodes representing properties of interest (potential common points) and uses those to explore and discover more queer people and communities. Figure 3 illustrates the main activities of the pipeline. The process begins by selecting a subset of nodes using a SPARQL query (see 1 in Figure 3). The query selects distinct people, and their labels, who have a specific property of interest and do not have other properties that would exclude them from being considered queer. The query result is a limited number of nodes, which is used as the starting point for the iterative process. The pipeline runs the PageRank algorithm on the initial set of nodes, ranking nodes based on the number and quality of incoming links (see 2 in Figure 3). The algorithm allows the identification of the most critical nodes in the graph, as they are more likely to represent properties of interest. The PageRank algorithm is run multiple times with different parameters, such as the damping factor (alpha) and the number of iterations (see 3 in Figure 3). After the PageRank algorithm has been executed, the pipeline selects a certain number of the most critical nodes (k_prop) and uses them to explore further (see 4 in Figure 3). Specifically, it runs the same SPARQL query as before, using the selected nodes as the property of interest (see 5 in Figure 3). The result is a new set of nodes connected to the previously selected nodes through the property of interest. These new nodes are added to the graph.</p>
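The crawl loop just described can be sketched as below. The power-iteration PageRank is a dependency-free stand-in for networkx.pagerank, and fetch_edges stands in for the SPARQL exploration step; both names are our assumptions, not the paper's API.

```python
def pagerank(edges, alpha=0.85, n_iter=50):
    # Minimal power-iteration PageRank (stand-in for networkx.pagerank).
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(n_iter):
        nxt = {n: (1.0 - alpha) / len(nodes) for n in nodes}
        for src in nodes:
            targets = out[src] or nodes   # dangling mass spread everywhere
            for dst in targets:
                nxt[dst] += alpha * rank[src] / len(targets)
        rank = nxt
    return rank

def crawl(edges, fetch_edges, n_rounds=1, k_prop=1):
    # One crawler round: rank nodes, keep the k_prop most critical ones,
    # fetch new edges around them, and grow the edge list.
    edges = list(edges)
    for _ in range(n_rounds):
        ranks = pagerank(edges)
        top = sorted(ranks, key=ranks.get, reverse=True)[:k_prop]
        for node in top:
            edges += fetch_edges(node)
    return edges

# Toy run: two people share the property "prop"; the fake fetch
# discovers one new neighbour per explored node.
grown = crawl([("p1", "prop"), ("p2", "prop")],
              lambda node: [("new_" + node, node)])
```

In the real pipeline, fetch_edges would issue the SPARQL query of step 5 and the loop would run for n_iter rounds with varying alpha.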
      <p>Next, the pipeline runs the PageRank algorithm on the original graph, and the process starts over again in n iterations (see 6 in Figure 3). The crawler-based pipeline continues to run the PageRank algorithm and explore new nodes until a certain number of iterations has been completed (n_iter). The resulting graph is a composite of all the nodes and edges discovered during the process; it is more densely connected and contains a higher number of queer people than the initial set of nodes.</p>
      <p>This fitting was appropriate, which allowed us to use the log-normal distribution to model the distribution of the number of out-edges in our simulation experiments. This strategy allowed us to have a more accurate simulation of the real-world construction of knowledge graphs and to assess the performance of different construction methods more accurately. Because not all properties constitute common points among people, we only take a fraction of this number of out-edges to obtain the distribution for our models. By running tests on Wikidata, we found that approximately 1/50 of these out-edges created an entity-linking edge. This gives us a distribution with the probability density function (PDF): f(x) = 1 / (x σ √(2π)) · exp(−(ln x − μ)² / (2σ²))   (1)</p>
      <p>3.1. Visualising graph structures. Visualising a knowledge graph is a crucial step in understanding the structure and content of the data. To better understand the structure of the knowledge graphs generated in our study, we implemented custom visualisation techniques, including different graph layouts and edge bundling.</p>
      <p>Nodes Layout. The force-directed layout, described in [17], minimises overlaps in the graph, evenly distributes nodes and links, and organises items so that links are of a similar length, with as few crossing edges as possible.</p>
        <p>This process is done by assigning forces among the set
of edges and the set of nodes based on their relative
positions and then using these forces to simulate the
edges and nodes’ motion or minimise their energy. In the
case of our pipelines based on .nt "stars" associated with
entities, the force-directed layout is particularly relevant
to distinguish the underlying structure obtained.</p>
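The layout step can be reproduced with NetworkX's built-in force-directed implementation (the family of algorithms surveyed in [17]); the three-node graph below is a toy example, not the paper's data.

```python
import networkx as nx

# Force-directed ("spring") layout: simulates attractive forces along
# edges and repulsive forces between nodes, returning node -> (x, y)
# positions that can be fed to any plotting backend.
g = nx.Graph([("entity", "prop1"), ("entity", "prop2"), ("prop1", "prop2")])
pos = nx.spring_layout(g, seed=7)   # seed fixes the random initial state
```

For the .nt "stars" of pipeline 2, each central entity ends up surrounded by its satellite nodes, which makes the star structure visible.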
        <p>Edge-Bundling. We use edge bundling, a method that
allows edges to curve and then groups nearby ones
together to help convey structure. We use the function
"Datashader.hammer_bundle" (a variant of [18]) for the
edge bundling.</p>
        <p>with σ ≈ 0.495 and μ ≈ 5.</p>
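Under this fit, the simulated number of entity-linking out-edges per node can be drawn with the standard library as sketched below; the assignment of the two printed values to σ and μ, and the helper name, are our assumptions.

```python
import random

MU, SIGMA = 5.0, 0.495     # fitted log-normal parameters (assignment assumed)
LINK_FRACTION = 1 / 50     # share of out-edges that create entity links

def simulate_linking_out_degrees(rng, n_nodes):
    # Draw a property count per node from the fitted log-normal and keep
    # roughly 1/50 of it as entity-linking out-edges (at least one).
    return [max(1, round(rng.lognormvariate(MU, SIGMA) * LINK_FRACTION))
            for _ in range(n_nodes)]

degrees = simulate_linking_out_degrees(random.Random(42), 1000)
```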
        <p>Understanding the queer concepts. We defined an ontology
to organise the critical elements of the Queer Community.</p>
        <p>We used preexisting standard ontologies, such as FOAF to describe the people and GeoNames to make connections between spaces. Our ontology focuses on the European Queer scene of the 1980s. We added the correlation between gender and sex without it being stated explicitly.</p>
        <p>We based our strategy on the idea of Katharina Brunner in the "Remove NA" project 8. We used pyramidal relations to describe sexual orientations. Our ontology also focuses on art and collaborations, including the notion of avatars, which is essential in drag culture. The concept of influence is also present, considering elements such as songs played in clubs, characters in films or books, and access to art pieces. This version focuses on the representation of LGBTQ+ people in the culture of the 1980s. Still, it could benefit from exploring the rights of LGBTQ+ individuals from the past from a more global and non-Eurocentric perspective. One possible approach would be to examine the various terms used to define gender across different cultures, such as the third gender in the Bugis society, and to include references to the laws and changes in each country. The first we can get using the Wikidata Query Service.</p>
        <p>4. Experiments and Results. Implementation. We have developed a library for handling and building knowledge graphs with real-world data, for a comfortable experimental setting. This library, written in Python, uses the NetworkX, RDFLib and Datashader libraries and allows evaluating the performance of different knowledge graph construction methods.</p>
        <p>Queer datasets. In the conducted experiments, we use data from Wikidata, as this platform provides structured data that is well-suited for building knowledge graphs 7. We used "real" data to make our simulation experiments realistic. For example, we plotted the histogram of the number of properties each node from the SPARQL query had, and found that the distribution resembled a log-normal distribution. We fitted the data to a log-normal distribution and used the Kolmogorov-Smirnov test to evaluate the fitting. The test showed that the fitting was appropriate.</p>
        <p>Experiments objective. With the experiments, we aimed to directly compare the performance of the three pipelines in terms of time efficiency, size, structure, and content. To accomplish this, we measured the time each pipeline takes to create the knowledge graph, and the number of nodes and edges in each built graph. Furthermore, we analysed the maximal graphs that each method/pipeline can create and compared them.</p>
        <p>Note that the experiments were performed under the constraints of a regular account on a free Google service. We used the Google Colab environment to run our experiments, with an allocated virtual machine with one single-core hyperthreaded Xeon processor @2.3GHz (i.e., 1 core, 2 threads); 12.7 GB of RAM (of which 0.8 GB is already taken); and a storage space of 108 GB, of which only 77 GB is available to the user. Thus, the results may differ if performed under different conditions. However, this experimental setting was chosen to simulate a realistic scenario for a typical user.</p>
        <p>7https://ai.stanford.edu/blog/introduction-to-knowledge-graphs/</p>
        <p>8https://queerdata.forummuenchen.org/en/</p>
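The per-pipeline measurements (build time, node count, edge count) can be collected with a small harness like the sketch below; profile_pipeline and the toy builder are our names, not the paper's library API.

```python
import time
import networkx as nx

def profile_pipeline(build, name):
    # Measure wall-clock build time plus the size of the produced graph.
    start = time.perf_counter()
    graph = build()
    return {"pipeline": name,
            "seconds": time.perf_counter() - start,
            "nodes": graph.number_of_nodes(),
            "edges": graph.number_of_edges()}

# Stand-in builder; in the experiments this would be one of the three
# pipelines with its own parameters (n, prune_policy, n_iter, k_prop).
row = profile_pipeline(lambda: nx.path_graph(4), "toy pipeline")
```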
      </sec>
      <sec id="sec-3-3">
        <title>4.1. Structure and communities</title>
        <p>In our experiments, we found that the structure of the created knowledge graphs varied depending on the construction method and the distribution used. Some methods resulted in graphs with a high degree of connectivity and a dense network of edges, like the pure SPARQL pipeline, while others produced sparser graphs with fewer connections between nodes. We also observed that some methods tended to create graphs with a more hierarchical structure, especially in trials with the crawler-based pipeline; in contrast, others generated graphs with a flatter structure.</p>
        <p>To better understand the structure of the graphs produced through our pipelines, we used community detection algorithms to identify and analyse the communities within the graphs. In particular, we used the modularity maximisation algorithm and the Girvan-Newman algorithm [19] to identify communities based on the density of connections within communities and the sparsity of relations between communities. Considering the size of the graphs, the community algorithms were run on either small portions of the graphs or simplified versions where we linked entities according to their properties in common and then removed the redundant nodes. We could then create several dendrograms to be interpreted and used for statistical analysis. Crucially, we found that the identified communities often corresponded to semantically meaningful groups of nodes, such as nodes representing people from the same country or nodes representing concepts in the same domain.</p>
        <p>4.2. Results. The initial "raw" knowledge graph without edges built from Wikidata consists of 8171 references to queer people, 4459 predicates, and 32651 objects. The graph with "in" edges, where the LGBTQ+ people are the object of the RDF triples, contains 7281 queer people and 7281 nodes with "in" edges. We get 521 different predicates and 168 923 subjects. In the predicates, when comparing the IRIs, there are interesting distinctions.</p>
        <p>Structural comparison of the resulting graphs. Visual analysis on dense graphs with an actual number of nodes can be laborious, hence the need for some graphical tools. We observe structural differences in the generated graphs, with examples given in Figures 4, 5 and 6. As stated in Section 3, the pure SPARQL pipeline (pipeline 1) had to include data-cleaning activities to understand the queer communities better. Still, as shown in Figure 4, the raw graph built from exploring Wikidata with SPARQL queries, without dead ends, is very dense.</p>
        <p>Figure 4: Graph generated by Pure SPARQL, after removing dead ends.</p>
        <p>Pipeline 2, adopting the merging stars, leads to a clean and representative view of the queer communities (in an integrated vision). As described in Section 3, in the merging stars pipeline we defined a raw graph and then pruned it to eliminate "dead nodes", then duplicates and isolated nodes, finally obtaining a wholly cleaned knowledge graph, thoroughly pruned, as shown in Figure 5.</p>
        <p>Figure 5: Graph generated by Merging stars.</p>
        <p>The graph produced through the crawler method pipeline exploits the notion of centrality to identify the nodes representing influential queer people, and then explores their connections to determine whether or not they belong to a queer community. The resulting graph is shown in Figure 6. The graph exhibits highly connected nodes, with communities of nodes agglutinated around them.</p>
        <p>Overall, our experiments suggest that the structure of the resulting graphs depends strongly on the construction pipeline used. Table 2 shows the results.</p>
        <p>Table 2: comparison of the graphs built by Pipeline 1, Pipeline 2 and Pipeline 3.</p>
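The community-detection step above can be illustrated with NetworkX's implementation of the Girvan-Newman algorithm [19]; the barbell graph is our toy stand-in for a real subgraph with two dense communities.

```python
import networkx as nx
from networkx.algorithms import community

# Two 5-cliques joined by a single bridge edge; the first Girvan-Newman
# split removes the highest-betweenness edge (the bridge) and recovers
# the two dense communities.
g = nx.barbell_graph(5, 0)
first_split = next(community.girvan_newman(g))
groups = sorted(sorted(c) for c in first_split)
```

Iterating the same generator further yields the finer splits from which the dendrograms mentioned above can be built.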
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusions and Future Work</title>
      <p>Additionally, we compared the maximal knowledge graph that each pipeline can create, where we do not impose any limit and try to get as many people as possible included.</p>
      <p>Through knowledge graphs, this paper introduced our experimental approach to building queer communities' history from different perspectives. We propose and exhibit the pipelines for building the knowledge graphs, including the sets of SPARQL queries for exploring Wikidata. Our experiments have shown that we can use these models to choose the method, reduce the time it takes to build the graph, and improve the quality of the resulting graph. These findings have important implications for constructing and using knowledge graphs in various applications.</p>
      <p>There are two future directions regarding these tasks. The first is deciding which are the most suitable graph operations and analytics algorithms for answering particular research questions. The second is interpreting results. Therefore, we will apply storytelling techniques to produce dashboards that provide plots and multimedia content to propose an interpretation of the represented knowledge.</p>
      <p>Acknowledgments. This project has been partially funded by the project GALILEAN of the intergroup collaboration program of the LIRIS lab.</p>
      <p>References</p>
      <p>[1] G. Vargas-Solar, M. S. Hassan, A. Akoglu, JITA4DS: disaggregated execution of data science pipelines between the edge and the data centre, J. Web Eng. 21 (2022). URL: https://doi.org/10.13052/jwe1540-9589.2111. doi:10.13052/jwe1540-9589.2111.</p>
      <p>[2] M. Kejriwal, C. A. Knoblock, P. Szekely, Knowledge graphs: Fundamentals, techniques, and applications, MIT Press, 2021.</p>
      <p>[3] P. Szekely, C. A. Knoblock, J. Slepicka, A. Philpot, A. Singh, C. Yin, D. Kapoor, P. Natarajan, D. Marcu, K. Knight, et al., Building and using a knowledge graph to combat human trafficking, in: International Semantic Web Conference, Springer, 2015, pp. 205-221.</p>
      <p>[4] X. Dong, E. Gabrilovich, G. Heitz, W. Horn, N. Lao, K. Murphy, T. Strohmann, S. Sun, W. Zhang, Knowledge vault: A web-scale approach to probabilistic knowledge fusion, in: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 601-610.</p>
      <p>[5] A. G. Ororbia II, J. Wu, C. L. Giles, CiteSeerX: Intelligent information extraction and knowledge creation from web-based data (????).</p>
      <p>[6] J. Pujara, L. Getoor, Building dynamic knowledge graphs, in: NIPS Workshop on Automated Knowledge Base Construction, volume 9, 2014.</p>
      <p>[7] J. Pujara, H. Miao, L. Getoor, W. Cohen, Knowledge graph identification, in: International semantic web conference, Springer, 2013, pp. 542-557.</p>
      <p>[8] A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. Hruschka, T. M. Mitchell, Toward an architecture for never-ending language learning, in: Twenty-Fourth AAAI conference on artificial intelligence, 2010.</p>
      <p>[9] G. Gawriljuk, A. Harth, C. A. Knoblock, P. Szekely, A scalable approach to incrementally building knowledge graphs, in: International conference on theory and practice of digital libraries, Springer, 2016, pp. 188-199.</p>
      <p>[10] A. Fader, S. Soderland, O. Etzioni, Identifying relations for open information extraction, in: Proceedings of the 2011 conference on empirical methods in natural language processing, 2011, pp. 1535-1545.</p>
      <p>[11] O. Etzioni, R. E. Bart, M. D. Schmitz, S. G. Soderland, et al., Open language learning for information extraction, 2014. US Patent App. 14/083,261.</p>
      <p>[12] J. Fan, D. Ferrucci, D. Gondek, A. Kalyanpur, Prismatic: Inducing knowledge from a large scale lexicalized relation resource, in: Proceedings of the NAACL HLT 2010 first international workshop on formalisms and methodology for learning by reading, Association for Computational Linguistics, 2010, pp. 122-127.</p>
      <p>[13] C. Bizer, C. Becker, P. N. Mendes, R. Isele, A. Matteini, A. Schultz, LDIF - a framework for large-scale linked data integration (2012).</p>
      <p>[14] F. M. Suchanek, G. Kasneci, G. Weikum, Yago: a core of semantic knowledge, in: Proceedings of the 16th international conference on World Wide Web, 2007, pp. 697-706.</p>
      <p>[15] J. Hoffart, F. M. Suchanek, K. Berberich, G. Weikum, YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia, Artificial Intelligence 194 (2013) 28-61.</p>
      <p>[16] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, J. Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, 2008, pp. 1247-1250.</p>
      <p>[17] S. G. Kobourov, Spring embedders and force directed graph drawing algorithms, CoRR abs/1201.3011 (2012). URL: http://arxiv.org/abs/1201.3011. arXiv:1201.3011.</p>
      <p>[18] C. Hurter, O. Ersoy, A. Telea, Graph bundling by kernel density estimation, in: Computer Graphics Forum, volume 31, Wiley Online Library, 2012, pp. 865-874.</p>
      <p>[19] M. Girvan, M. E. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences 99 (2002) 7821-7826.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>