<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Generating Knowledge Graphs from Scientific Literature of Degenerative Diseases</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anderson Rossanez</string-name>
          <email>anderson.rossanez@ic.unicamp.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julio Cesar dos Reis</string-name>
          <email>jreis@ic.unicamp.br</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computing, University of Campinas</institution>
          ,
          <addr-line>Campinas - SP</addr-line>
          ,
          <country country="BR">Brazil</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Degenerative diseases, such as Alzheimer's Disease, can be very serious and life-threatening. As the scientific community strives to fully understand their exact root causes and advance research in the domain, a massive amount of knowledge is generated. To represent and link all this knowledge, we propose the generation of knowledge graphs from the scientific literature. We aim to give researchers the ability to relate their new discoveries to current knowledge and possibly formulate new hypotheses to further advance the research. In this paper, we describe a method to extract information from scientific literature for generating a knowledge graph reusing existing domain ontologies. We demonstrate the effectiveness of our method by generating knowledge graphs from a set of abstracts of scientific papers on Alzheimer's Disease.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Graphs</kwd>
        <kwd>RDF triples</kwd>
        <kwd>Ontologies</kwd>
        <kwd>Information Extraction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Degenerative diseases affect the function and structure of cells, tissues, and
organs, becoming worse over time [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. In the case of degenerative nerve diseases,
or neurodegenerative diseases, the brain cells, called neurons, are affected.
Neurons do not normally reproduce, so they are not replaced by the body when they
die or become damaged. Thus, neurodegenerative diseases can be very serious
and life-threatening, affecting balance, movement, talking, breathing, and
heart function [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        One example of such neurodegenerative diseases is Alzheimer's Disease
(AD). It is the sixth leading cause of death in the United States, and the fifth
leading cause among individuals aged 65 and older [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The disease is estimated
to begin with small changes in the brains of affected individuals at least 20
years before the symptoms are noticeable [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The affected individuals then
start experiencing memory loss and language problems, due to neurons involved
in cognitive functions being either damaged or destroyed. Over time, the
symptoms increase and start interfering with the individual's ability to perform daily
activities, at which point the individual is said to have dementia due to AD [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>AD is still not curable. The current treatments for the disease aim at slowing
its progress in the affected individuals, for which an early diagnosis is of
extreme importance. The scientific community works on better understanding
the disease to find new methods of early diagnosis, treatment, and ultimately, a
cure. This research on AD continuously generates new knowledge. Integrating
available data and properly representing the domain knowledge could bring great
benefits. It could, for instance, give researchers the ability to visualize
how the known concepts and their discoveries may relate to each other, as well
as correlate their findings with discoveries from other researchers. By observing
such relations, researchers might be able to formulate new hypotheses and, in
this way, advance the state of the art in the domain.</p>
      <p>
        Knowledge Graphs (KGs) define the interrelations of real-world entities in
facts represented as a graph [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. KGs model knowledge using the Resource
Description Framework (RDF)<sup>1</sup> representation. The formal computational
representation and explicit description of disease information via KGs can play a key
role in the analysis and understanding of the disease.
      </p>
      <p>In this article, we investigate the generation of KGs as automatically as
possible from the scientific literature on AD. Several research challenges remain open
in our study context. The scientific literature has a specific, yet not uniform,
writing style, which poses several issues to information extraction. Sentences
are often very long and contain several abbreviations and a very
specific set of terms that are only known by domain specialists. Natural
Language Processing (NLP) tools involved in the information extraction techniques
are usually not trained for such vocabulary. For those reasons,
building a completely automatic system to generate KGs from scientific texts is a hard
task.</p>
      <p>Our goal is to define a (semi-)automatic method to generate KGs from
unstructured text obtained from scientific papers on the AD
domain. In our method, a KG is generated by identifying and extracting
information from unstructured text using NLP techniques. The extracted
information is stored in the form of RDF triples. Then, we identify the concepts and
relations present in the text using a knowledge base mapped to a single domain
ontology that is recommended by the NCBO<sup>2</sup> BioPortal. Via the
implementation of the KGen software tool<sup>3</sup>, we evaluate the effectiveness of our approach to
generate KGs from scientific papers on AD.</p>
      <p>The remainder of this paper is organized as follows: Section 2 discusses the
related work; Section 3 presents our method, along with a running example to
illustrate the process; Section 4 shows the evaluation with KGs generated from a
set of scientific papers; Section 5 discusses our findings. Finally, Section
6 closes the paper, presenting the conclusions and future work.</p>
    </sec>
    <sec id="sec-2">
      <title>1 https://www.w3.org/TR/WD-rdf-syntax-971002/</title>
    </sec>
    <sec id="sec-3">
      <title>2 https://www.bioontology.org/</title>
    </sec>
    <sec id="sec-4">
      <title>3 https://github.com/rossanez/kgen</title>
      <sec id="sec-4-1">
        <title>Background</title>
        <p>
          The generation of KGs from unstructured texts has been studied in recent
years for the purposes of knowledge representation and reasoning. The
knowledge extraction of RDF triples was addressed via the use of open information
extraction systems, such as ReVerb [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and OLLIE [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. Such systems extract
triples from sentences using purely syntactical and lexical patterns, without
considering entities in the text.
        </p>
        <p>
          In contrast, Exner and Nugues [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] considered entity recognition, and also
used Semantic Role Labeling (SRL) to extract triples from text. SRL helps
identify the Agent and the Patient of a verb. Those elements are then mapped
to the triple's elements by consulting resources from VerbNet [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] and
FrameNet [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. This is very helpful in passive voice sentences, where the subject
and the object may have their orders changed in a triple.
        </p>
        <p>
          Martinez-Rodrigues et al. [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] combined open information extraction systems
and SRL to extract triples. Their work introduced a technique that considers
noun phrases in the identification of entities. The identified entities are mapped
to multiple knowledge bases, such as DBpedia [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], Babelfy [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], and TagMe [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
Exner and Nugues [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] interconnected the extracted information to DBpedia [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ],
using a rule-based approach. In such investigations, if there is not an exact
match for any of the triple's constituents in the knowledge bases, then they are
left unmapped.
        </p>
        <p>
          Similarly, the T2KG tool [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] uses a hybrid of a rule-based approach and a
vector-based similarity metric to identify similar mappings to DBpedia [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] in case of
a missing exact match. On the other hand, the FRED tool [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] generates its own
ontology from a text, mapping existing entities and concepts to other existing
ontologies/knowledge bases, such as DBpedia [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>
          Other software tools have been proposed for the purpose of KG building. For
instance, IBM provides a tool for information extraction from plain text
to ultimately build a KG integrating input documents [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ]. The tool integrates
a set of their services (e.g., Watson<sup>4</sup> and Cloud<sup>5</sup>).
        </p>
        <p>
          Concerning KGs and AD, Lam et al. [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] converted information from
different neuroscience sources to RDF format, making it available as an ontology.
AlzPharm [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] used RDF to build a framework that integrates neuroscience
information, including information on AD, obtained from multiple domains. The
goal was to unify the neuroscientists' queries into a single ontology.
        </p>
        <p>
          The National Center for Biomedical Ontology [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] (NCBO) provides an
endpoint to access multiple ontologies from the biomedical domain, including
Alzheimer's. It provides an annotator for natural language sentences, helping to
identify mappings from concepts and entities to the ontologies.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4 https://www.ibm.com/watson</title>
    </sec>
    <sec id="sec-6">
      <title>5 https://www.ibm.com/cloud</title>
      <p>The vast majority of the investigations dealing with RDF in the biomedical
domain focus on the ontologies, by creating, finding, or unifying them.
In our case, we focus on generating a KG from a text. In this process, entities
are then mapped to already existing ontologies. We seek to match instances
of concepts and entities from ontologies in a text, which may even describe
information that is not yet captured in an existing ontology, due to its
novelty. The works presenting techniques to generate KGs are not applied to the
biomedical or scientific domains, and we seek to employ similar techniques to
address the generation of KGs from scientific texts in this domain.</p>
      <sec id="sec-6-1">
        <title>KGen: Knowledge Graph Generation</title>
        <p>
          We describe the proposed KGen method developed to generate KGs from
unstructured text. KGs rely on RDF and Linked Data principles [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. In RDF,
entities are represented as resources, which, in turn, are referenced by Uniform
Resource Identifiers<sup>6</sup> (URIs).
        </p>
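        <p>Formally, a Knowledge Graph <inline-formula><tex-math>KG = (V, E)</tex-math></inline-formula> is represented as a regular graph
with a set of vertices <inline-formula><tex-math>V</tex-math></inline-formula> and edges <inline-formula><tex-math>E</tex-math></inline-formula>. Whereas the vertices express entities or
concepts, the edges express the relations between them. An RDF triple refers to
a data entity composed of subject, predicate and object, defined as <inline-formula><tex-math>t = (s, p, o)</tex-math></inline-formula>.
In KGs, the relations are predicates (<inline-formula><tex-math>p</tex-math></inline-formula>), such that <inline-formula><tex-math>E = \{p_0, p_1, \ldots, p_n\}</tex-math></inline-formula>, i.e., the
edges in KGs are a set of predicates. The predicates are formally defined in
ontologies.</p>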
        <p>
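          An ontology <inline-formula><tex-math>O</tex-math></inline-formula> describes a domain in terms of concepts, attributes and
relationships [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Formally, an ontology <inline-formula><tex-math>O = (C_O, R_O, A_O)</tex-math></inline-formula> consists of a set of
classes <inline-formula><tex-math>C_O</tex-math></inline-formula> interrelated by directed relations <inline-formula><tex-math>R_O</tex-math></inline-formula>, and a set of attributes <inline-formula><tex-math>A_O</tex-math></inline-formula>. In
this sense, a predicate <inline-formula><tex-math>p \in R_O</tex-math></inline-formula>.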
        </p>
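        <p>Also in KGs, the entities or concepts are either subjects (<inline-formula><tex-math>s</tex-math></inline-formula>) or objects
(<inline-formula><tex-math>o</tex-math></inline-formula>), considering that the vertices are a set of subjects and objects, such that
<inline-formula><tex-math>V = \{s_0, s_1, \ldots, s_n, o_0, o_1, \ldots, o_n\}</tex-math></inline-formula>. In this context, <inline-formula><tex-math>o_i \in C_O</tex-math></inline-formula>. We may also say
that a KG is a set of RDF triples, such that <inline-formula><tex-math>KG = \{t_0, t_1, \ldots, t_n\}</tex-math></inline-formula>, where
<inline-formula><tex-math>t_0 = (s_0, p_0, o_0),\ t_1 = (s_1, p_1, o_1),\ \ldots,\ t_n = (s_n, p_n, o_n)</tex-math></inline-formula>.</p>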
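        <p>The definitions above can be illustrated with a minimal Python sketch. It treats a KG as a set of triples and derives the vertex set (all subjects and objects) and the edge set (all predicates) from it. The Triple type and the function name are hypothetical names chosen for illustration; they are not part of KGen.</p>
        <preformat><![CDATA[
```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style triple t = (s, p, o); hypothetical helper type."""
    subject: str
    predicate: str
    obj: str


def vertices_and_edges(kg):
    """Return (V, E): V holds every subject and object, E every predicate."""
    vertices = set()
    edges = set()
    for t in kg:
        vertices.add(t.subject)
        vertices.add(t.obj)
        edges.add(t.predicate)
    return vertices, edges


# A KG is a set of triples.
kg = {
    Triple("Mitophagy", "inhibits", "amyloid-beta"),
    Triple("Mitophagy", "inhibits", "tau pathology"),
}
V, E = vertices_and_edges(kg)
```
]]></preformat>
        <p>In this example the two triples share a subject and a predicate, so the sketch yields three vertices and a single edge label.</p>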
        <p>Figure 1 presents the key elements involved in our method. First, the
input text is preprocessed to identify all the sentences available.
From such sentences, our method of KG generation extracts triples by identifying
the subject, predicate and object. Afterwards, it identifies
entities, concepts and properties in the sentences to obtain links from our graph
to an ontology. By combining the output of both steps, i.e., the triples and
the ontology links, the method reaches a new set of linked triples. Finally, the
generated graph is represented in RDF Turtle<sup>7</sup> format.</p>
        <p>Preprocessor. The preprocessing step (cf. 1 in Figure 1) receives as input
an unstructured text file in plain-text format (e.g., a *.txt file). This
preprocessing step submits the text through some sub-steps. The first sub-step identifies
the sentences from the raw text using a sentence splitter. This NLP tool
outputs one identified sentence per line. Afterwards, the next sub-step resolves
co-references by using an NLP technique which identifies pronoun references across
different sentences (e.g., in "John lives next door. He works on Sundays.", "He" refers
to John). Once the references are identified, the pronouns are replaced by the actual
referents (e.g., "John lives next door. John works on Sundays."). This is important
to keep coherence when generating triples, maintaining the actual entities in
the subjects and objects instead of the pronouns.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6 https://www.w3.org/wiki/URI</title>
    </sec>
    <sec id="sec-8">
      <title>7 https://www.w3.org/TR/turtle/</title>
      <p>The next preprocessing sub-step is an abbreviation resolver. Scientific
writing commonly introduces an abbreviation when a term is
first mentioned in the text, and uses the abbreviation from that
point on. For instance: "Alzheimer's Disease (AD) can be very serious and
life-threatening. AD is the sixth leading cause of death in the United States." In
this case, Alzheimer's Disease is replaced in the remainder of the text by AD.
The sub-step replaces the abbreviated form by the original one in every identified
instance to generate coherent triples.</p>
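      <p>The idea of the abbreviation resolver can be sketched as follows. This is a deliberately simplified, regex-based illustration and not the KGen implementation, which relies on a tokenizer, PoS tagger and parser. The pattern and the function name are assumptions: a definition is taken to be a run of capitalised words immediately followed by an upper-case abbreviation in parentheses.</p>
      <preformat><![CDATA[
```python
import re

# Assumed pattern: capitalised multi-word term followed by "(ABBR)".
DEFINITION = re.compile(
    r"((?:[A-Z][\w'-]*\s+)+[A-Z][\w'-]*)\s*\(([A-Z]{2,})\)"
)


def resolve_abbreviations(text):
    """Expand every abbreviation defined in the text to its long form."""
    # Collect abbreviation -> long form from definitions such as "Term (AB)".
    mapping = {abbr: term for term, abbr in DEFINITION.findall(text)}
    # Drop the parenthetical part of each definition.
    resolved = DEFINITION.sub(lambda m: m.group(1), text)
    # Replace each remaining stand-alone abbreviation by its long form.
    for abbr, term in mapping.items():
        pattern = r"\b" + re.escape(abbr) + r"\b"
        resolved = re.sub(pattern, lambda m, t=term: t, resolved)
    return resolved


example = (
    "Alzheimer's Disease (AD) can be very serious. "
    "AD is the sixth leading cause of death."
)
expanded = resolve_abbreviations(example)
```
]]></preformat>
      <p>A pattern like this over-captures when the definition follows another capitalised word (for example, a sentence-initial article), which is one reason a parser-based approach is preferable in practice.</p>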
      <p>The last preprocessing sub-step aims at simplifying sentences. Complex sentences
bound by conjunctions are also common in scientific writing
(e.g., "Mitophagy inhibits amyloid-beta and tau
pathology"). In this case, it would be preferable to have two distinct sentences (e.g.,
"Mitophagy inhibits amyloid-beta" and "Mitophagy inhibits tau pathology") to
generate coherent triples. The overall output of the preprocessing step is a text
containing simplified sentences with co-references and abbreviations resolved.</p>
      <p>Figure 2 presents a running example to illustrate our defined process. It
presents the unstructured input text and its corresponding preprocessed output.</p>
      <p>Extractor of triples. The next step is the extractor of triples (cf. 2 in
Figure 1), which takes as input the preprocessed text. In this step, each sentence
is processed to identify the candidate predicate, the subject and the object. Our
proposal explores a Semantic Role Labeling (SRL) technique to perform this
identification as the first sub-step. SRL identifies the verbs in a sentence,
along with their agents, patients and other semantic roles (e.g., theme, topic, etc.).</p>
      <p>
        Once agent and patient are identified, the second sub-step explores an
algorithm to identify the triple's constituents. According to the triple definition,
<inline-formula><tex-math>t = (s, p, o)</tex-math></inline-formula>, the method needs to identify the subject, predicate and object from
the SRL output. The predicate is naturally mapped to the verb. As for the
subject and object, if there is an agent and a patient linked to the verb, the subject
maps to the agent and the object to the patient. It is important to mention that,
in case of passive-voice sentences, the patient and agent may be out of order,
but SRL already assigns them correctly. If there is an agent, but no patient, the
subject maps to the agent, and the object maps to the closest semantic role to
patient. The same applies in the case we have a patient (mapped to the object),
but no agent (subject maps to the closest semantic role). Finally, if either the
object or the subject cannot be mapped to a role, we discard the sentence,
as, by definition, a triple must have its three constituents. This algorithm has
been adapted from a similar algorithm defined by Martinez-Rodrigues et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ],
dealing with the outputs of the SRL method.
      </p>
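      <p>The role-to-constituent mapping described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the KGen code: each SRL frame is assumed to be a dictionary using PropBank-style labels ("V" for the verb, "A0" for the agent, "A1" for the patient), and the "closest semantic role" fallback is simplified to the first remaining role in the frame. Frames with neither agent nor patient are discarded here, which is also an assumption.</p>
      <preformat><![CDATA[
```python
def frame_to_triple(frame):
    """Map one SRL frame (dict of role label -> text) to (s, p, o) or None.

    Assumed labels: "V" = verb, "A0" = agent, "A1" = patient.
    Any other label is treated as a candidate fallback role.
    """
    verb = frame.get("V")
    if verb is None:
        return None

    agent = frame.get("A0")
    patient = frame.get("A1")
    if agent is None and patient is None:
        # Assumption: without agent and patient the sentence is discarded.
        return None

    # Simplification: "closest role" is the first other role in the frame.
    others = [text for role, text in frame.items()
              if role not in ("V", "A0", "A1")]
    fallback = others[0] if others else None

    subject = agent if agent is not None else fallback
    obj = patient if patient is not None else fallback
    if subject is None or obj is None:
        # A triple must have all three constituents.
        return None
    return (subject, verb, obj)


full = frame_to_triple(
    {"A0": "Mitophagy", "V": "inhibits", "A1": "amyloid-beta"}
)
no_patient = frame_to_triple(
    {"A0": "John", "V": "works", "AM-TMP": "on Sundays"}
)
discarded = frame_to_triple({"A0": "John", "V": "sleeps"})
```
]]></preformat>
      <p>Because the labeller assigns roles rather than positions, the same function handles passive-voice frames without any reordering step.</p>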
      <p>Ontology linker. This step (cf. 3 in Figure 1) takes as input the
preprocessed text. The first sub-step performs a tokenization to split the sentences
into tokens and applies a Part of Speech (PoS) tagger to tag the
tokens (e.g., as nouns, verbs, adjectives, etc.). Then, a parse tree is obtained
for each sentence. By doing so, the technique enables the identification
of verbs, considered as predicate candidates, and noun phrases, considered as
subject/object candidates. Such candidates are matched against an ontology to
find correspondences with the ontology's concepts and attributes.</p>
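      <p>Formalizing this process, a sentence <inline-formula><tex-math>S = \{t_0, t_1, \ldots, t_n\}</tex-math></inline-formula> is a set of terms (or
tokens) <inline-formula><tex-math>t_i</tex-math></inline-formula>. Each term gets an associated PoS <inline-formula><tex-math>p_i</tex-math></inline-formula>, forming the pair <inline-formula><tex-math>(t_i, p_i)</tex-math></inline-formula>. A predicate candidate
<inline-formula><tex-math>pc = t_i \mid p_i = \text{``VB''}</tex-math></inline-formula> is a term whose PoS is a verb. A subject/object candidate
<inline-formula><tex-math>soc = \{t_i\}</tex-math></inline-formula> is a set of terms whose parents in the parse tree are noun phrases
(<inline-formula><tex-math>NP</tex-math></inline-formula>). Each candidate is then associated to ontology elements, i.e., <inline-formula><tex-math>pc \Rightarrow R_O</tex-math></inline-formula>
and <inline-formula><tex-math>soc \Rightarrow C_O</tex-math></inline-formula>.</p>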
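      <p>As a hedged illustration of this formalization, the sketch below selects predicate candidates from PoS-tagged tokens and subject/object candidates from a parse tree. The input structures are assumptions made for the example: tagged tokens are (term, tag) pairs with Penn Treebank tags, and the parse tree is a nested tuple (label, child, ...) with plain strings as leaves. For simplicity, every token under an NP node is collected, including tokens of nested phrases.</p>
      <preformat><![CDATA[
```python
def predicate_candidates(tagged):
    """Return the terms whose PoS tag marks a verb (tags starting with VB)."""
    return [term for term, pos in tagged if pos.startswith("VB")]


def leaves(node):
    """Return the tokens under a parse-tree node, left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1:] for leaf in leaves(child)]


def subject_object_candidates(tree):
    """Return one candidate string per NP node found in the parse tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            return
        if node[0] == "NP":
            found.append(" ".join(leaves(node)))
        for child in node[1:]:
            walk(child)

    walk(tree)
    return found


tagged = [("Mitophagy", "NN"), ("inhibits", "VBZ"), ("amyloid-beta", "NN")]
tree = (
    "S",
    ("NP", ("NN", "Mitophagy")),
    ("VP", ("VBZ", "inhibits"), ("NP", ("NN", "amyloid-beta"))),
)
pcs = predicate_candidates(tagged)
socs = subject_object_candidates(tree)
```
]]></preformat>
      <p>Each candidate returned by such functions would then be submitted to an ontology annotator to look for a matching relation or class.</p>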
      <p>Graph builder. The graph builder (cf. 4 in Figure 1) takes as input the
triples and the ontology links. In this step, the method first creates a local
resource for each of the triple's constituents, binding them to resources obtained
from the ontology links. This results in Turtle content describing the KG. We
convert the Turtle content to a set of vertices and edges, feeding them to a graph
generator system, which outputs the graph.</p>
      <p>[Figure caption, beginning lost in extraction] ... (rdf:Property). The subject and object types are classes (rdf:Class). The subject
is linked to an ontology resource (nci:C2866), whereas the object is not.</p>
      <p>
        Implementation aspects. KGen has been implemented in
Python. In the preprocessing, we used Stanford's CoreNLP [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] toolkit, which
provides a sentence splitter and a coreference resolver, as well as a tokenizer, PoS tagger,
and parser, which were used to implement the abbreviation resolver. The sentence
simplifier used iSimp [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], a sentence simplification system trained for biomedical
texts. The extractor of triples used SENNA [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] to perform the SRL. SENNA has
been chosen as it shows good accuracy in texts from the biomedical domain [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        The ontology linker uses mainly Stanford's CoreNLP [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] toolkit to identify
verbs and noun phrases, especially using the tokenizer, PoS tagger, and parser.
To obtain the ontology links, we explored the National Center for Biomedical
Ontology (NCBO) annotator<sup>8</sup>, using its REST API. The links are retrieved from
the returned annotations. The conversion of the Turtle content to graph edges
and vertices is performed using Raptor<sup>9</sup>.
      </p>
      <sec id="sec-8-1">
        <title>Evaluation</title>
        <p>The goal of the evaluation of our method is to understand the quality of the
generated KGs. For this purpose, we used as input for the method abstracts of
scientific papers dealing with AD, obtained from PubMed<sup>10</sup>. The KGs generated,
along with their intermediary artifacts, are available in the KGen project
repository<sup>11</sup>. The linked ontology that best suited the abstracts was the National
Cancer Institute Thesaurus<sup>12</sup> (NCIT). It was the ontology returned by NCBO's
recommender endpoint<sup>13</sup> for all the abstracts.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>8 https://bioportal.bioontology.org/annotator</title>
    </sec>
    <sec id="sec-10">
      <title>9 http://librdf.org/raptor/ 10 https://www.ncbi.nlm.nih.gov/pubmed/ 11 https://github.com/rossanez/KGen</title>
      <p>The subjects and objects of the triples are most of the time composed of
more than a single entity (or noun phrase). To capture this characteristic, the
local partOf property (local:partOf) links the composing entities belonging to a
subject or object. This property, in turn, is linked to NCIT's Part Of
property (nci:C43743), as illustrated in Figure 4. This sub-graph is present in all
graphs generated for this evaluation, as they all have composing entities (e.g.,
"series" and "rutacecarpine derivatives" are composing entities of "A series of
rutacecarpine derivatives").</p>
      <p>The snippet of the KG presented in Figure 5 has been generated for the specific
triple t = (A series of rutacecarpine derivatives, identify, novel ligands), obtained
from one of the abstracts<sup>14</sup>. In this graph, the original triple is represented by
linking the predicate to the subject and object through the rdf:subject and the
rdf:object properties. This form of reification was chosen to allow the
representation of the constituent parts (local:partOf), and links to the ontology concepts
and attributes.</p>
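      <p>To make the reified form concrete, the following Python sketch serialises one triple as Turtle text, attaching the subject and object to the predicate resource through rdf:subject and rdf:object, with rdfs:label carrying the original text. It only illustrates the description above and is not taken from KGen: the local namespace URI and the way local names are derived from phrases are assumptions, and labels are assumed to contain no double quotes.</p>
      <preformat><![CDATA[
```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
LOCAL_NS = "http://example.org/kgen/local#"  # hypothetical namespace


def local_name(text):
    """Build a prefixed local resource name from a phrase (assumed scheme)."""
    return "local:" + "_".join(text.split())


def reify(subject, predicate, obj):
    """Serialise one triple in the reified form described above, as Turtle."""
    s = local_name(subject)
    p = local_name(predicate)
    o = local_name(obj)
    lines = [
        "@prefix rdf: <%s> ." % RDF_NS,
        "@prefix rdfs: <%s> ." % RDFS_NS,
        "@prefix local: <%s> ." % LOCAL_NS,
        "",
        # The predicate resource points at its subject and object.
        "%s rdf:subject %s ;" % (p, s),
        "    rdf:object %s ;" % o,
        '    rdfs:label "%s" .' % predicate,
        # Original text of the subject and object, kept as literals.
        '%s rdfs:label "%s" .' % (s, subject),
        '%s rdfs:label "%s" .' % (o, obj),
    ]
    return "\n".join(lines)


turtle = reify(
    "A series of rutacecarpine derivatives", "identify", "novel ligands"
)
print(turtle)
```
]]></preformat>
      <p>Links to ontology resources (through rdf:type) and constituent parts (through local:partOf) would be added as further statements on the same local resources.</p>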
      <p>The local entities and properties are represented preceded by the local prefix
(e.g., local:identify), and their original text is mapped through the rdfs:label (e.g.,
identify). As literals, those values are represented inside rectangular nodes. The
other nodes, as resources, are represented inside elliptical nodes (cf. Figure 6).</p>
      <p>The links to the concepts and attributes of the ontology are achieved through
the rdf:type property. They are represented by their ontology prefixes and code
property (e.g., nci:C25737). To enhance readability, we present their
preferred names through the rdfs:label property (e.g., Identification). In case no link
was retrieved in the ontology for an entity, its local resource is not bound to
an ontology resource in the graph, as illustrated in Figures 5 and 6.</p>
      <p>12 https://ncit.nci.nih.gov/ncitbrowser/</p>
      <p>13 https://bioportal.bioontology.org/recommender</p>
      <p>14 https://www.ncbi.nlm.nih.gov/pubmed/31136894</p>
      <p>This investigation aimed to generate KGs from the scientific literature on
degenerative diseases. Our method linked the concepts, entities and properties from
the graph to classes and attributes from an existing ontology in the biomedical
domain. The combination of extracted triples with ontology linkage, both
from this specific domain, is the key originality of this investigation. Our
findings indicate success in generating KGs for unstructured text from abstracts
of scientific papers.</p>
      <p>The language employed in scientific papers, especially those in the
degenerative diseases domain, poses a great difficulty for the techniques and tools involved
in the method. For this reason, a fully automated method is still a challenge.
Although our method is able to run to completion without human intervention,
it allows a domain specialist to review and manually change the
intermediate artifacts, i.e., the preprocessed text, triples, ontology links, and the
RDF representation of the KG. In the KGen tool, such intermediary artifacts
are represented by text files. When they are manually changed, the tool is able
to reconsider those intermediary files and update the graphs.</p>
      <p>Some aspects of our work demand further improvements. The triples
generated from the text sentences capture the most important aspects dealt with in
the text, but secondary information is usually left out of them. Open
Information Extraction systems and Semantic Role Labeling focus mainly on
verbal relations. Secondary information, not directly related to the main verb,
is therefore not captured in the KG. We believe that exploring the output of a
dependency parser could bring such missed information into the graph.</p>
      <p>The linking of concepts and properties to an ontology requires additional
improvements. We could explore alternatives to find a link when there is not an
exact match. In order to minimize the cases where no link is assigned to a local
concept, we plan to investigate SPARQL queries to obtain more generic
concepts within an ontology, or to search for a match in another ontology and then
seek a mapping between these two ontologies.</p>
      <p>We plan to investigate alternatives to address these issues and refine our method
to generate an ontology-linked KG from scientific documents. Domain specialists
will be invited to assess the obtained KGs.</p>
      <sec id="sec-10-1">
        <title>Conclusion</title>
        <p>The creation of KGs from scientific literature on degenerative diseases can help
researchers investigate how their discoveries relate to the existing domain, and
to other researchers' discoveries. However, the automatic generation of KGs is
an open research challenge. In this article, we proposed a method to generate
ontology-linked KGs from scientific papers on degenerative diseases. Our method
is suited to extract triples and connect them with existing ontologies. The
conducted evaluation used abstracts obtained from scientific papers. We showed that
the KGs were successfully generated from them. Future work involves
generating KGs linked to different ontologies, as well as studies comparing temporal
texts through their generated KGs.</p>
      </sec>
      <sec id="sec-10-2">
        <title>Acknowledgment</title>
        <p>This work is supported by the São Paulo Research Foundation (FAPESP) (Grant
#2017/02325-5)<sup>15</sup>.</p>
        <p>15 The opinions expressed in this work do not necessarily reflect those of the funding
agencies.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Alzheimer's</given-names>
            <surname>Association</surname>
          </string-name>
          <article-title>: 2019 Alzheimer's disease facts and figures</article-title>
          .
          <source>Alzheimer's &amp; Dementia</source>
          <volume>15</volume>
          (
          <issue>3</issue>
          ),
          <fpage>321</fpage>
          –
          <lpage>387</lpage>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kobilarov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cyganiak</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ives</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>DBpedia: A nucleus for a web of open data</article-title>
          .
          <source>In: Proceedings of the 2nd Asian Conference on Semantic Web</source>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>C.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>J.B.</given-names>
          </string-name>
          :
          <article-title>The Berkeley FrameNet project</article-title>
          .
          <source>In: Proc. of the 17th International Conference on Computational Linguistics - Vol. 1</source>
          . pp.
          <fpage>86</fpage>
          –
          <lpage>90</lpage>
          .
          Association for Computational Linguistics, Stroudsburg, PA, USA (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Barnickel</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weston</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Collobert</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mewes</surname>
            ,
            <given-names>H.W.</given-names>
          </string-name>
          , Stump en, V.:
          <article-title>Large scale application of neural network based semantic role labeling for automated relation extraction from biomedical texts</article-title>
          . In:
          <source>PLoS ONE</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>The emerging web of linked data</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          <volume>24</volume>
          (
          <issue>5</issue>
          ),
          <fpage>87</fpage>
          –
          <lpage>92</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Braak</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thal</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghebremedhin</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Del Tredici</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Stages of the Pathologic Process in Alzheimer Disease: Age Categories From 1 to 100 Years</article-title>
          .
          <source>Journal of Neuropathology &amp; Experimental Neurology</source>
          <volume>70</volume>
          (
          <issue>11</issue>
          ),
          <fpage>960</fpage>
          –
          <lpage>969</lpage>
          (11
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Collobert</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weston</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bottou</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karlen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kavukcuoglu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuksa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Natural language processing (almost) from scratch</article-title>
          .
          <source>J. Mach. Learn. Res</source>
          .
          <volume>12</volume>
          ,
          <fpage>2493</fpage>
          –
          <lpage>2537</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Ehrlinger</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , Wo , W.:
          <article-title>Towards a de nition of knowledge graphs</article-title>
          .
          <source>In: 12th International Conference on Semantic Systems (SEMANTiCS2016)</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Evans</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Funkenstein</surname>
            ,
            <given-names>H.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Albert</surname>
            ,
            <given-names>M.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scherr</surname>
            ,
            <given-names>P.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cook</surname>
            ,
            <given-names>N.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chown</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hebert</surname>
            ,
            <given-names>L.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hennekens</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>J.O.</given-names>
          </string-name>
          :
          <article-title>Prevalence of Alzheimer's Disease in a Community Population of Older Persons: Higher Than Previously Reported</article-title>
          .
          <source>JAMA</source>
          <volume>262</volume>
          (
          <issue>18</issue>
          ),
          <fpage>2551</fpage>
          –
          <lpage>2556</lpage>
          (
          <year>1989</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Exner</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nugues</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Entity extraction: From unstructured text to DBpedia RDF triples</article-title>
          . In: WoLE@ISWC (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Fader</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soderland</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Identifying relations for open information extraction</article-title>
          .
          <source>In: Proceedings of the Conference of Empirical Methods in Natural Language Processing (EMNLP '11)</source>
          . Edinburgh, Scotland, UK (July 27–31,
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Ferragina</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scaiella</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>TAGME: On-the-fly annotation of short text fragments (by Wikipedia entities)</article-title>
          .
          <source>In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management</source>
          . pp.
          <fpage>1625</fpage>
          –
          <lpage>1628</lpage>
          . CIKM '10,
          <publisher-name>ACM</publisher-name>
          , New York, NY, USA (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Presutti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Recupero</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nuzzolese</surname>
            ,
            <given-names>A.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Draicchio</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mongiovì</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Semantic Web Machine Reading with FRED</article-title>
          .
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <issue>6</issue>
          ),
          <fpage>873</fpage>
          –
          <lpage>893</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Gitler</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dhillon</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shorter</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Neurodegenerative disease: models, mechanisms, and a new hope</article-title>
          .
          <source>Disease Models &amp; Mechanisms</source>
          <volume>10</volume>
          (
          <issue>5</issue>
          ),
          <fpage>499</fpage>
          –
          <lpage>502</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Gruber</surname>
            ,
            <given-names>T.R.</given-names>
          </string-name>
          :
          <article-title>Toward principles for the design of ontologies used for knowledge sharing</article-title>
          .
          <source>International Journal of Human-Computer Studies</source>
          <volume>43</volume>
          ,
          <fpage>907</fpage>
          –
          <lpage>928</lpage>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Kertkeidkachorn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ichise</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>T2KG: An end-to-end system for creating knowledge graph from unstructured text</article-title>
          .
          <source>In: AAAI Workshops</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Martinez-Rodriguez</surname>
            ,
            <given-names>J.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopez-Arevalo</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rios-Alvarado</surname>
            ,
            <given-names>A.B.</given-names>
          </string-name>
          :
          <article-title>OpenIE-based approach for knowledge graph construction from text</article-title>
          .
          <source>Expert Systems with Applications</source>
          <volume>113</volume>
          (07
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Whetzel</surname>
            ,
            <given-names>P.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Noy</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alexander</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nyulas</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tudorache</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Musen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>BioPortal: Enhanced functionality via new web services from the National Center for Biomedical Ontology to access and use ontologies in software applications</article-title>
          .
          <source>Nucleic Acids Research</source>
          <volume>39</volume>
          ,
          <fpage>W541</fpage>
          –
          <lpage>W545</lpage>
          (06
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>H.Y.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marenco</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kinoshita</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shepherd</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crasto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morse</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stephens</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheung</surname>
            ,
            <given-names>K.H.</given-names>
          </string-name>
          :
          <article-title>Semantic web meets e-neuroscience: An RDF use case</article-title>
          .
          <source>In: ASWC International Workshop on Semantic e-Science</source>
          . pp.
          <fpage>158</fpage>
          –
          <lpage>170</lpage>
          . University Press (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>H.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marenco</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gao</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kinoshita</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shepherd</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>G.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crasto</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morse</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stephens</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheung</surname>
            ,
            <given-names>K.H.</given-names>
          </string-name>
          :
          <article-title>AlzPharm: integration of neurodegeneration data using RDF</article-title>
          .
          <source>BMC Bioinformatics</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ), S4 (May
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Surdeanu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bauer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Finkel</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bethard</surname>
            ,
            <given-names>S.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McClosky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>The Stanford CoreNLP natural language processing toolkit</article-title>
          . In:
          <source>Association for Computational Linguistics (ACL) System Demonstrations</source>
          . pp.
          <fpage>55</fpage>
          –
          <lpage>60</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Mausam</surname>
          </string-name>
          ,
          <string-name>
            <surname>Schmitz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soderland</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bart</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Open language learning for information extraction</article-title>
          .
          <source>In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning</source>
          . pp.
          <fpage>523</fpage>
          –
          <lpage>534</lpage>
          .
          <publisher-name>Association for Computational Linguistics</publisher-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Moro</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raganato</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Navigli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Entity Linking meets Word Sense Disambiguation: a Unified Approach</article-title>
          .
          <source>Transactions of the Association for Computational Linguistics (TACL)</source>
          <volume>2</volume>
          ,
          <fpage>231</fpage>
          –
          <lpage>244</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tudor</surname>
            ,
            <given-names>C.O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Torii</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vijay-Shanker</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system</article-title>
          .
          <source>Database</source>
          <volume>2014</volume>
          (05
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Ropper</surname>
            ,
            <given-names>A.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Samuels</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prasad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Adams and Victor's Principles of Neurology, chap. 38: Degenerative Diseases of the Nervous System</article-title>
          , p.
          <fpage>1645</fpage>
          .
          <publisher-name>McGraw-Hill Incorporated</publisher-name>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Schuler</surname>
            ,
            <given-names>K.K.</given-names>
          </string-name>
          :
          <article-title>VerbNet: A Broad-coverage, Comprehensive Verb Lexicon</article-title>
          .
          <source>Ph.D. thesis</source>
          , University of Pennsylvania, Philadelphia, PA, USA (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Setia</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chahal</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hosurmath</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Build a knowledge graph from documents</article-title>
          (
          <year>2018</year>
          ),
          <uri>https://developer.ibm.com/patterns/build-a-domain-specific-knowledge-graph-from-given-set-of-documents</uri>
          , [Accessed on 2019-06-25].
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>