<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Strategies for creating knowledge graphs to depict a multi-perspective Queer communities representation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Louann Coste</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Flora Helmers</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hammamache Kheddouci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Léo Le Nestour</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mahsa Niazi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Genoveva Vargas-Solar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CNRS</institution>
          ,
          <addr-line>Univ Lyon, INSA Lyon, UCBL, LIRIS, UMR5205, F-69221</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ecole Normale Supérieure de Lyon</institution>
          ,
          <addr-line>15 parvis René Descartes, 69342, Lyon</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>This paper introduces an experimental study for building knowledge graphs about queer communities from different perspectives. The paper describes a set of pipelines implementing several knowledge graph construction strategies that lead to a complementary understanding of the queer communities. We implemented a library with the knowledge graph construction pipelines and conducted experiments to evaluate execution cost against the comprehensiveness of the content. The library includes visualization tools for observing the knowledge graphs at several granularities. Finally, we use storytelling to provide a first interpretation of our results regarding the profiling of queer communities.</p>
      </abstract>
      <kwd-group>
        <kwd>Data science pipelines</kwd>
        <kwd>graph analytics</kwd>
        <kwd>knowledge graphs</kwd>
        <kwd>queer history</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Heterogeneous datasets can be structured into networks. The notion of "graph" is a powerful mathematical concept for representing these networks. Graphs can be exploited through workflows that apply algorithms to solve data science problems. When the graphs become too large, the processes used to explore and analyse them become costly. Therefore, deployment strategies on target architectures can fulfil the resource-consumption requirements in an adaptable way [1].</p>
      <p>This paper introduces an experimental approach for building knowledge graphs to reveal several perspectives of Queer History. Our work models the pipelines to explore the Wikidata knowledge graph and extract subgraphs about queer identities, combining different variables such as age, artistic production, and geographical position. Extracted graphs are studied, classified, and profiled thoroughly to estimate the allocation of resources. The pipelines include performing unions, merges, and community discovery to build several perspectives of Queer History.</p>
      <p>The remainder of the paper is organised as follows. Section 2 introduces work addressing the construction and exploitation of knowledge graphs. Section 3 introduces our approach based on different pipelines for building queer history knowledge graphs. Section 4 reports our experiments. Finally, Section 5 concludes the paper and discusses future work.</p>
      <p>Published in the Workshop Proceedings of the EDBT/ICDT 2023 Joint Conference (March 28-March 31, 2023, Ioannina, Greece). * Corresponding author: Genoveva Vargas-Solar. † Authors contributed equally and are enumerated in alphabetical order. louann.coste@ens-lyon.fr (L. Coste); flora.helmers@ens-lyon.fr (F. Helmers); hamamache.kheddouci@univ-lyon1.fr (H. Kheddouci); leo.le_nestour@ens-lyon.fr (L. Le Nestour); mahsa.niazi@ens-lyon.fr (M. Niazi); genoveva.vargas-solar@cnrs.fr (G. Vargas-Solar).</p>
    </sec>
    <sec id="sec-related">
      <title>2. Related Work</title>
      <p>Knowledge graphs construction and exploration. A knowledge graph is a directed graph G = (N, R) whose nodes N represent entities and literal values (literals), and whose edges R represent relations between these entities. The graph can be divided into two parts: the data and the knowledge. The knowledge can come from several sources that describe entities (i.e., concepts) and from relations among entities that can be explicitly stated or discovered [2]. Knowledge graphs defined using the RDF model can be queried using the SPARQL query language. Knowledge graphs are used to integrate bibliographical data and to exhibit references and paper co-authorship. Examples of solutions are DIG [3], Knowledge Vault [4], CiteSeerX [5], Pujara et al. [6], Knowledge Graph Identification [7] and NELL [8]. The authors of [9] propose a framework for incrementally building a knowledge graph of artists, used to link the data from different museums. These approaches rely on a predefined schema describing the data used to build the graph. Other approaches adopt the opposite strategy and do not use a schema for building knowledge graphs [10, 11, 12]. Knowledge graphs can also be built by consolidating data from different sources, as in LDIF [13], YAGO and YAGO2 [14, 15], and Freebase [16].</p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org).</p>
      <p>Modelling Queer History with graphs. Queer history was not recorded in a structured way for a long time. The most prominent project willing to integrate the history of LGBTQ+ communities is the WikiProject LGBT 1, for developing LGBT-themed content in Wikidata and advocating for the community of contributors developing this content. Wikidata contains self-organized archives starting in the 1970s. In Munich, the Forum Queeres Archiv has been collecting, researching, and publishing for 20 years: shelf metres full of bequests, files, books, journals, and objects. The project Queer Data 2 transforms analogue queer history into linked open data and incorporates it into an open, freely accessible, public data infrastructure. The knowledge graph contains 7415 entities from the data sources of the Forum Queeres Archiv München, with more than 38,000 connections.</p>
      <p>Discussion. The challenge in modelling Queer history is to use different properties to define identities and relations that can be profoundly diverse, and to provide a multi-perspective model. Exhibiting these perspectives can propose a more representative queer history knowledge graph. This paper shows different perspectives of queer communities by building and integrating graphs.</p>
      <p>The resulting .csv file has subject, predicate, and object columns. It can be transformed using the Pandas library into a DataFrame, or into a networkx Graph using the networkx.from_pandas_dataframe function. The interest of producing a networkx graph is that the library provides additional methods for graph analysis. In the example below, a graph is constructed with all triples where the subject is a non-heterosexual or non-cisgender individual. This activity corresponds to the data retrieval phase of the pipeline, which interacts with the Wikidata endpoint server (see Figure 1).</p>
      <p>CONSTRUCT {
  ?person ?predicate ?object .
}
WHERE {
  {
    ?person wdt:P31 wd:Q5 .        # ?person is a human
    ?person ?predicate ?object .
    {</p>
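The DataFrame-to-graph conversion described above can be sketched as follows. Current NetworkX releases expose from_pandas_edgelist (the from_pandas_dataframe name mentioned in the text is an older spelling of the same step), and the triples below are toy stand-ins for the .csv downloaded from the query service, not the paper's data.

```python
import pandas as pd
import networkx as nx

# Toy stand-in for the CONSTRUCT result: a triple table with
# subject / predicate / object columns, as downloaded as .csv.
triples = pd.DataFrame(
    [
        ("wd:Q3304418", "wdt:P106", "wd:Q36180"),  # person -> occupation
        ("wd:Q3304418", "wdt:P27", "wd:Q142"),     # person -> country
        ("wd:Q230527", "wdt:P27", "wd:Q142"),
    ],
    columns=["subject", "predicate", "object"],
)

# A MultiDiGraph keeps parallel edges, each labelled by its predicate.
graph = nx.from_pandas_edgelist(
    triples,
    source="subject",
    target="object",
    edge_attr="predicate",
    create_using=nx.MultiDiGraph,
)
```

From here, all of the library's analysis methods (degrees, centrality, community detection) apply directly to the triple data.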
    </sec>
    <sec id="sec-2">
      <title>3. Building Knowledge Graph Pipelines</title>
      <p>The main contribution of our work is the design of three pipelines to build knowledge graphs using different techniques. The purpose is to propose a pipeline that: - balances the coverage of the resulting knowledge graph, that is, its scope concerning the queer communities (i.e., are all potential queer individuals part of the graph, and are their connections able to exhibit different communities); - reasonably consumes resources (CPU and memory) to explore, analyse and build dense graphs. The second contribution is the implementation of these pipelines within a library 3 that allows testing them by calibrating specific parameters such as the number of iterations, the number of nodes to visit, etc. We also provide a ready-to-use notebook to run the experiments we have conducted (see Section 4).</p>
      <p>Pipeline 1: Pure SPARQL-based knowledge graph. Figure 1 illustrates the pipeline that builds a pure SPARQL-based knowledge graph. We propose the construction of two initial graphs by querying Wikidata to retrieve individuals identified as non-cisgender and non-heterosexual, and their connections.</p>
      <p>    }
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE]" .
  }
}</p>
      <p>We create a second graph where the person is not the subject but the object.</p>
      <p>With these two graph views, we have all the connections to the elements. We retrieve the labels of the entities using the Wikibase label service. Retrieving the whole graph is costly (the evaluation of this query doubles the time of the first one), so we set a timeout and limit the size of the graph.</p>
      <p>The Wikidata service offers the WikiData Query Service endpoint, which allows users to run SPARQL queries on the WikiData database. SPARQL allows creating a graph with the CONSTRUCT keyword, and the resulting graph can be downloaded as a .csv file.</p>
      <p>We note that we have to clean the data because it contains duplicates and "empty" nodes. Indeed, some elements have different Internationalized Resource Identifiers (IRIs) but the same meaning: for instance, http://www.wikidata.org/prop/direct/P161 and http://www.wikidata.org/prop/statement/P161 give access to the same property, "cast member". However, since they have different terms, they are expressed as if they were different. We also get empty nodes which do not have labels, such as entities Q19151093 4 and Q19218452 5. Both have different IRIs and Wikipedia identifiers but potentially the same meaning, because their properties are identical. Because we only get access to the elements located at a distance of 1 from the people, we cannot correct these elements using only what we have gathered. Besides, we lose the added value of the knowledge graph, which is to automatically deduce information by using the varied distant relationships of the initial node. The principle of pipeline 1 is sound, but the execution cost is high according to the resources we have access to. Therefore, we propose alternative pipelines described in the following sections.</p>
      <p>1https://www.wikidata.org/wiki/Wikidata:WikiProject_LGBT 2https://queerdata.forummuenchen.org/en/ 3https://github.com/FloraHelmers/QueerHistoryProject</p>
      <p>      ?person wdt:P21 ?sexorgender .   # ?person has a sex or gender
      # ?sexorgender is not male, female, cisgender male,
      # cisgender female, or cisgender person
      FILTER (?sexorgender NOT IN (wd:Q6581097, wd:Q6581072,
                                   wd:Q15145778, wd:Q15145779,
                                   wd:Q1093205)) .
    } UNION {
      ?person wdt:P91 ?sexualorientation .   # ?person has a sexual orientation
      FILTER (?sexualorientation != wd:Q1035954) .   # not heterosexual</p>
      <p>Pipeline 2: Merging Stars. The construction pipeline of the knowledge graph about queer communities from Wikidata using merging stars can be summarised as follows: merge a list of graphs retrieved from Wikidata as .nt files that are akin to stars around a central entity node (see Figure 2). As shown in Figure 2, this pipeline starts with a SPARQL query to Wikidata that returns a list of Wikidata item IDs related to the queer community. This list, originally in JSON, is then used to extract the relevant information necessary to query the graphs associated with each entity on Wikidata (IDs and URLs). We iteratively parse new information by using these graphs, basically creating a merged RDF graph using the RDF data of the nodes from the Wikidata entity URLs in n-triples format (given in an .nt file format).</p>
      <p>The merging of two knowledge graphs in .nt file format is a process in which the data from two separate RDF graphs is combined into a single graph. The process is straightforward, as both graphs are in a standard format that can be easily integrated. The only subtleties that arise when merging two RDF graphs are related to the processing of shared blank nodes. Shared blank nodes are nodes that appear in both RDF graphs and are used to represent anonymous resources. These shared blank nodes must be merged in an integrated RDF graph to represent the same resource. The rules for merging shared blank nodes are defined in the W3C RDF 1.1 Semantics specification 6. The merging process must ensure that the integrated graph preserves the intended meaning of the original RDF graphs, including the relationships between the nodes in the graph. Therefore, the process must consider the context in which each shared blank node is used in both graphs and combine them to preserve the RDF data's intended meaning. Our final integrated graph is then converted to a NetworkX multidigraph, which proved to be the most helpful format for our analysis.</p>
      <p>Once the graph has been created, it is necessary to prune it in various ways: (1) removing "dead-end" nodes (i.e., nodes that have an in-degree of less than 2); "dead-end" nodes are often isolated and do not provide much information about the community; (2) removing duplicates (i.e., nodes that represent the same object literally) and isolated nodes (i.e., nodes that have both in-degree and out-degree of 0); these nodes are not connected to any other nodes in the graph and do not provide information about the community. Depending on the application, one can also specify more complex pruning processes with rule-based deletion before each merge. The pipeline is executed with two parameters: n, the size of the list of</p>
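A minimal sketch of the two pruning steps above, using NetworkX; the prune helper name and the toy graph are ours, not the paper's library or data.

```python
import networkx as nx

def prune(graph):
    # Sketch of the pruning policy described in the text:
    # (1) remove "dead-end" nodes, i.e. nodes whose in-degree is 0 or 1;
    # (2) remove isolated nodes, i.e. in-degree and out-degree both 0.
    g = graph.copy()
    g.remove_nodes_from([n for n in list(g) if g.in_degree(n) in (0, 1)])
    g.remove_nodes_from([n for n in list(g) if g.degree(n) == 0])
    return g

# Toy star-merge result: "a" and "b" are dead ends; "x" and "y" survive.
toy = nx.MultiDiGraph([("a", "x"), ("b", "y"), ("x", "y"), ("y", "x")])
pruned = prune(toy)
```

A prune_policy dictionary, as described above, would simply select which of these removal passes to run before each merge.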
      <sec id="sec-3-1">
        <title>4http://www.wikidata.org/entity/Q19151093</title>
        <p>5http://www.wikidata.org/entity/Q19218452</p>
      </sec>
      <sec id="sec-3-2">
        <title>6https://www.w3.org/TR/rdf11-mt/#shared-blank-nodes-unions-and-merges</title>
        <p>people used as a starting point, and a prune_policy dictionary describing which process for removing the nodes should be used.</p>
        <p>Pipeline 3: Crawler method. The crawler-based pipeline starts with a small number of nodes. It runs an iterative process to extract critical nodes representing properties of interest (potential common points) and uses those to explore and discover more queer people and communities. Figure 3 illustrates the main activities of the pipeline. The process begins by selecting a subset of nodes using a SPARQL query (see 1 in Figure 3). The query selects distinct people, and their labels, who have a specific property of interest and do not have other properties that would exclude them from being considered queer. The query result is a limited number of nodes, which is used as the starting point for the iterative process. The pipeline runs the PageRank algorithm on the initial set of nodes, ranking nodes based on the number and quality of incoming links (see 2 in Figure 3). The algorithm allows the identification of the most critical nodes in the graph, as they are more likely to represent properties of interest. The PageRank algorithm is run multiple times with different parameters, such as the damping factor (alpha) and the number of iterations (see 3 in Figure 3). After the PageRank algorithm has been executed, the pipeline selects a certain number of the most critical nodes (k_prop) and uses them to explore further (see 4 in Figure 3). Specifically, it runs the same SPARQL query as before, using the selected nodes as the property of interest (see 5 in Figure 3). The result is a new set of nodes connected to the previously selected nodes through the property of interest. These new nodes are added to the graph.</p>
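The crawl loop just described can be sketched as below. The power-iteration PageRank is a dependency-free stand-in for networkx.pagerank, and fetch_edges stands in for the SPARQL exploration step; both names are our assumptions, not the paper's API.

```python
def pagerank(edges, alpha=0.85, n_iter=50):
    # Minimal power-iteration PageRank (stand-in for networkx.pagerank).
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(n_iter):
        nxt = {n: (1.0 - alpha) / len(nodes) for n in nodes}
        for src in nodes:
            targets = out[src] or nodes   # dangling mass spread everywhere
            for dst in targets:
                nxt[dst] += alpha * rank[src] / len(targets)
        rank = nxt
    return rank

def crawl(edges, fetch_edges, n_rounds=1, k_prop=1):
    # One crawler round: rank nodes, keep the k_prop most critical ones,
    # fetch new edges around them, and grow the edge list.
    edges = list(edges)
    for _ in range(n_rounds):
        ranks = pagerank(edges)
        top = sorted(ranks, key=ranks.get, reverse=True)[:k_prop]
        for node in top:
            edges += fetch_edges(node)
    return edges

# Toy run: two people share the property "prop"; the fake fetch
# discovers one new neighbour per explored node.
grown = crawl([("p1", "prop"), ("p2", "prop")],
              lambda node: [("new_" + node, node)])
```

In the real pipeline, fetch_edges would issue the SPARQL query of step 5 and the loop would run for n_iter rounds with varying alpha.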
      <p>Next, the pipeline runs the PageRank algorithm on the original graph, and the process starts over again in n iterations (see 6 in Figure 3). The crawler-based pipeline continues to run the PageRank algorithm and explore new nodes until a certain number of iterations has been completed (n_iter). The resulting graph is a composite of all the nodes and edges discovered during the process; it is more densely connected and contains a higher number of queer people than the initial set of nodes.</p>
      <p>This fitting was appropriate, which allowed us to use the log-normal distribution to model the distribution of the number of out-edges in our simulation experiments. This strategy allowed us to have a more accurate simulation of the real-world construction of knowledge graphs and to assess the performance of different construction methods more accurately. Because not all properties constitute common points among people, we only take a fraction of this number of out-edges to obtain the distribution for our models. By running tests on Wikidata, we found that approximately 1/50 of these out-edges created an entity-linking edge. This gives us a distribution with the probability density function (PDF): f(x) = 1 / (x σ √(2π)) · exp(−(ln x − μ)² / (2σ²))   (1)</p>
      <p>3.1. Visualising graph structures. Visualising a knowledge graph is a crucial step in understanding the structure and content of the data. To better understand the structure of the knowledge graphs generated in our study, we implemented custom visualisation techniques, including different graph layouts and edge bundling.</p>
      <p>Nodes Layout. The force-directed layout, described in [17], minimises overlaps in the graph, evenly distributes nodes and links, and organises items so that links are of a similar length, with as few crossing edges as possible.</p>
        <p>This process is done by assigning forces among the set
of edges and the set of nodes based on their relative
positions and then using these forces to simulate the
edges and nodes’ motion or minimise their energy. In the
case of our pipelines based on .nt "stars" associated with
entities, the force-directed layout is particularly relevant
to distinguish the underlying structure obtained.</p>
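The layout step can be reproduced with NetworkX's built-in force-directed implementation (the family of algorithms surveyed in [17]); the three-node graph below is a toy example, not the paper's data.

```python
import networkx as nx

# Force-directed ("spring") layout: simulates attractive forces along
# edges and repulsive forces between nodes, returning node -> (x, y)
# positions that can be fed to any plotting backend.
g = nx.Graph([("entity", "prop1"), ("entity", "prop2"), ("prop1", "prop2")])
pos = nx.spring_layout(g, seed=7)   # seed fixes the random initial state
```

For the .nt "stars" of pipeline 2, each central entity ends up surrounded by its satellite nodes, which makes the star structure visible.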
        <p>Edge-Bundling. We use edge bundling, a method that
allows edges to curve and then groups nearby ones
together to help convey structure. We use the function
"Datashader.hammer_bundle" (a variant of [18]) for the
edge bundling.</p>
        <p>with σ ≈ 0.495 and μ ≈ 5.</p>
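Under this fit, the simulated number of entity-linking out-edges per node can be drawn with the standard library as sketched below; the assignment of the two printed values to σ and μ, and the helper name, are our assumptions.

```python
import random

MU, SIGMA = 5.0, 0.495     # fitted log-normal parameters (assignment assumed)
LINK_FRACTION = 1 / 50     # share of out-edges that create entity links

def simulate_linking_out_degrees(rng, n_nodes):
    # Draw a property count per node from the fitted log-normal and keep
    # roughly 1/50 of it as entity-linking out-edges (at least one).
    return [max(1, round(rng.lognormvariate(MU, SIGMA) * LINK_FRACTION))
            for _ in range(n_nodes)]

degrees = simulate_linking_out_degrees(random.Random(42), 1000)
```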
        <p>Understanding the queer concepts. We defined an ontology
to organise the critical elements of the Queer Community.</p>
        <p>We used preexisting standard ontologies, such as FOAF to describe the people and GeoNames to make connections between spaces. Our ontology focuses on the European Queer scene of the 1980s. We added the correlation between gender and sex without it being stated explicitly.</p>
        <p>We based our strategy on the idea of Katharina Brunner in the "Remove NA" project 8. We used pyramidal relations to describe sexual orientations. Our ontology also focuses on art and collaborations, including the notion of avatars, which is essential in drag culture. The concept of influence is also present, considering elements such as songs played in clubs, characters in films or books, and access to art pieces. This version focuses on the representation of LGBTQ+ people in the culture of the 1980s. Still, it could benefit from exploring the rights of LGBTQ+ individuals from the past from a more global and non-Eurocentric perspective. One possible approach would be to examine the various terms used to define gender across different cultures, such as the third gender in the Bugis society, and to include references to the laws and changes in each country. The first we can get using the Wikidata Query Service.</p>
        <p>4. Experiments and Results. Implementation. We have developed a library for handling and building knowledge graphs with real-world data, for a comfortable experimental setting. This library, written in Python, uses the NetworkX, RDFLib and Datashader libraries and allows evaluating the performance of different knowledge graph construction methods.</p>
        <p>Queer datasets. In the conducted experiments, we use data from Wikidata, as this platform provides structured data that is well-suited for building knowledge graphs 7. We used "real" data to make our simulation experiments realistic. For example, we plotted the histogram of the number of properties each node from the SPARQL query had, and found that the distribution resembled a log-normal distribution. We fitted the data to a log-normal distribution and used the Kolmogorov-Smirnov test to evaluate the fitting. The test showed that the fitting was appropriate.</p>
        <p>Experiments objective. With the experiments, we aimed to directly compare the performance of the three pipelines in terms of time efficiency, size, structure, and content. To accomplish this, we measured the time each pipeline takes to create the knowledge graph, and the number of nodes and edges in each built graph. Furthermore, we analysed the maximal graphs that each method/pipeline can create and compared them.</p>
        <p>Note that the experiments were performed under the constraints of a regular account on a free Google service. We used the Google Colab environment to run our experiments, with an allocated virtual machine with one single-core hyperthreaded Xeon processor @2.3GHz (i.e., 1 core, 2 threads); 12.7 GB of RAM (of which 0.8 GB is already taken); and a storage space of 108 GB, of which only 77 GB is available to the user. Thus, the results may differ if performed under different conditions. However, this experimental setting was chosen to simulate a realistic scenario for a typical user.</p>
        <p>7https://ai.stanford.edu/blog/introduction-to-knowledge-graphs/</p>
        <p>8https://queerdata.forummuenchen.org/en/</p>
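The per-pipeline measurements (build time, node count, edge count) can be collected with a small harness like the sketch below; profile_pipeline and the toy builder are our names, not the paper's library API.

```python
import time
import networkx as nx

def profile_pipeline(build, name):
    # Measure wall-clock build time plus the size of the produced graph.
    start = time.perf_counter()
    graph = build()
    return {"pipeline": name,
            "seconds": time.perf_counter() - start,
            "nodes": graph.number_of_nodes(),
            "edges": graph.number_of_edges()}

# Stand-in builder; in the experiments this would be one of the three
# pipelines with its own parameters (n, prune_policy, n_iter, k_prop).
row = profile_pipeline(lambda: nx.path_graph(4), "toy pipeline")
```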
      </sec>
      <sec id="sec-3-3">
        <title>4.1. Structure and communities</title>
        <p>In our experiments, we found that the structure of the created knowledge graphs varied depending on the construction method and the distribution used. Some methods resulted in graphs with a high degree of connectivity and a dense network of edges, like the pure SPARQL pipeline, while others produced sparser graphs with fewer connections between nodes. We also observed that some methods tended to create graphs with a more hierarchical structure, especially in trials with the crawler-based pipeline; in contrast, others generated graphs with a flatter structure.</p>
        <p>To better understand the structure of the graphs produced through our pipelines, we used community detection algorithms to identify and analyse the communities within the graphs. In particular, we used the modularity maximisation algorithm and the Girvan-Newman algorithm [19] to identify communities based on the density of connections within communities and the sparsity of relations between communities. Considering the size of the graphs, the community algorithms were run on either small portions of the graphs or simplified versions where we linked entities according to their properties in common and then removed the redundant nodes. We could then create several dendrograms to be interpreted and used for statistical analysis. Crucially, we found that the identified communities often corresponded to semantically meaningful groups of nodes, such as nodes representing people from the same country or nodes representing concepts in the same domain.</p>
        <p>4.2. Results. The initial "raw" knowledge graph without edges built from Wikidata consists of 8171 references to queer people, 4459 predicates, and 32651 objects. The graph with "in" edges, where the LGBTQ+ people are the object of the RDF triples, contains 7281 queer people and 7281 nodes with "in" edges. We get 521 different predicates and 168 923 subjects. In the predicates, when comparing the IRIs, there are interesting distinctions.</p>
        <p>Structural comparison of the resulting graphs. Visual analysis on dense graphs with an actual number of nodes can be laborious, hence the need for some graphical tools. We observe structural differences in the generated graphs, with examples given in Figures 4, 5 and 6. As stated in Section 3, the pure SPARQL pipeline (pipeline 1) had to include data-cleaning activities to understand the queer communities better. Still, as shown in Figure 4, the raw graph built from exploring Wikidata with SPARQL queries, without dead ends, is very dense.</p>
        <p>Figure 4: Graph generated by Pure SPARQL, after removing dead ends.</p>
        <p>Pipeline 2, adopting the merging stars, leads to a clean and representative view of the queer communities (in an integrated vision). As described in Section 3, in the merging stars pipeline we defined a raw graph and then pruned it to eliminate "dead nodes", then duplicates and isolated nodes, finally obtaining a wholly cleaned knowledge graph, thoroughly pruned, as shown in Figure 5.</p>
        <p>Figure 5: Graph generated by Merging stars.</p>
        <p>The graph produced through the crawler method pipeline exploits the notion of centrality to identify the nodes representing influential queer people, and then explores their connections to determine whether or not they belong to a queer community. The resulting graph is shown in Figure 6. The graph exhibits highly connected nodes, with communities of nodes agglutinated around them.</p>
        <p>Overall, our experiments suggest that the structure of the resulting graphs depends strongly on the construction pipeline used. Table 2 shows the results.</p>
        <p>Table 2: comparison of the graphs built by Pipeline 1, Pipeline 2 and Pipeline 3.</p>
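The community-detection step above can be illustrated with NetworkX's implementation of the Girvan-Newman algorithm [19]; the barbell graph is our toy stand-in for a real subgraph with two dense communities.

```python
import networkx as nx
from networkx.algorithms import community

# Two 5-cliques joined by a single bridge edge; the first Girvan-Newman
# split removes the highest-betweenness edge (the bridge) and recovers
# the two dense communities.
g = nx.barbell_graph(5, 0)
first_split = next(community.girvan_newman(g))
groups = sorted(sorted(c) for c in first_split)
```

Iterating the same generator further yields the finer splits from which the dendrograms mentioned above can be built.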
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusions and Future Work</title>
      <p>Additionally, we compared the maximal knowledge graph that each pipeline can create, where we do not impose any limit and try to get as many people as possible included.</p>
      <p>Through knowledge graphs, this paper introduced our experimental approach to building queer communities' history from different perspectives. We propose and exhibit the pipelines for building the knowledge graphs, including the sets of SPARQL queries for exploring Wikidata. Our experiments have shown that we can use these models to choose the method, reduce the time it takes to build the graph, and improve the quality of the resulting graph. These findings have important implications for constructing and using knowledge graphs in various applications.</p>
      <p>There are two future directions regarding these tasks. The first is deciding which are the most suitable graph operations and analytics algorithms for answering particular research questions. The second is interpreting results. Therefore, we will apply storytelling techniques to produce dashboards that provide plots and multimedia content to propose an interpretation of the represented knowledge.</p>
      <p>Acknowledgments. This project has been partially funded by the project GALILEAN of the intergroup collaboration program of the LIRIS lab.</p>
      <p>References</p>
      <p>[1] G. Vargas-Solar, M. S. Hassan, A. Akoglu, JITA4DS: disaggregated execution of data science pipelines between the edge and the data centre, J. Web Eng. 21 (2022). URL: https://doi.org/10.13052/jwe1540-9589.2111. doi:10.13052/jwe1540-9589.2111.</p>
      <p>[2] M. Kejriwal, C. A. Knoblock, P. Szekely, Knowledge graphs: Fundamentals, techniques, and applications, MIT Press, 2021.</p>
      <p>[3] P. Szekely, C. A. Knoblock, J. Slepicka, A. Philpot, A. Singh, C. Yin, D. Kapoor, P. Natarajan, D. Marcu, K. Knight, et al., Building and using a knowledge graph to combat human trafficking, in: International Semantic Web Conference, Springer, 2015, pp. 205-221.</p>
      <p>[4] X. Dong, E. Gabrilovich, G. Heitz, W. Horn, N. Lao, K. Murphy, T. Strohmann, S. Sun, W. Zhang, Knowledge vault: A web-scale approach to probabilistic knowledge fusion, in: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 601-610.</p>
      <p>[5] A. G. Ororbia II, J. Wu, C. L. Giles, CiteSeerX: Intelligent information extraction and knowledge creation from web-based data (????).</p>
      <p>[6] J. Pujara, L. Getoor, Building dynamic knowledge graphs, in: NIPS Workshop on Automated Knowledge Base Construction, volume 9, 2014.</p>
      <p>[7] J. Pujara, H. Miao, L. Getoor, W. Cohen, Knowledge graph identification, in: International semantic web conference, Springer, 2013, pp. 542-557.</p>
      <p>[8] A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E. R. Hruschka, T. M. Mitchell, Toward an architecture for never-ending language learning, in: Twenty-Fourth AAAI conference on artificial intelligence, 2010.</p>
      <p>[9] G. Gawriljuk, A. Harth, C. A. Knoblock, P. Szekely, A scalable approach to incrementally building knowledge graphs, in: International conference on theory and practice of digital libraries, Springer, 2016, pp. 188-199.</p>
      <p>[10] A. Fader, S. Soderland, O. Etzioni, Identifying relations for open information extraction, in: Proceedings of the 2011 conference on empirical methods in natural language processing, 2011, pp. 1535-1545.</p>
      <p>[11] O. Etzioni, R. E. Bart, M. D. Schmitz, S. G. Soderland, et al., Open language learning for information extraction, 2014. US Patent App. 14/083,261.</p>
      <p>[12] J. Fan, D. Ferrucci, D. Gondek, A. Kalyanpur, Prismatic: Inducing knowledge from a large scale lexicalized relation resource, in: Proceedings of the NAACL HLT 2010 first international workshop on formalisms and methodology for learning by reading, Association for Computational Linguistics, 2010, pp. 122-127.</p>
      <p>[13] C. Bizer, C. Becker, P. N. Mendes, R. Isele, A. Matteini, A. Schultz, LDIF - a framework for large-scale linked data integration (2012).</p>
      <p>[14] F. M. Suchanek, G. Kasneci, G. Weikum, Yago: a core of semantic knowledge, in: Proceedings of the 16th international conference on World Wide Web, 2007, pp. 697-706.</p>
      <p>[15] J. Hoffart, F. M. Suchanek, K. Berberich, G. Weikum, YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia, Artificial Intelligence 194 (2013) 28-61.</p>
      <p>[16] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, J. Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, 2008, pp. 1247-1250.</p>
      <p>[17] S. G. Kobourov, Spring embedders and force directed graph drawing algorithms, CoRR abs/1201.3011 (2012). URL: http://arxiv.org/abs/1201.3011. arXiv:1201.3011.</p>
      <p>[18] C. Hurter, O. Ersoy, A. Telea, Graph bundling by kernel density estimation, in: Computer Graphics Forum, volume 31, Wiley Online Library, 2012, pp. 865-874.</p>
      <p>[19] M. Girvan, M. E. Newman, Community structure in social and biological networks, Proceedings of the National Academy of Sciences 99 (2002) 7821-7826.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>