<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards Causal Knowledge Graphs - Position Paper</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eva Blomqvist</string-name>
          <email>eva.blomqvist@liu.se</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marjan Alirezaie</string-name>
          <email>marjan.alirezaie@oru.se</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marina Santini</string-name>
          <email>marina.santini@ri.se</email>
        </contrib>
      </contrib-group>
      <abstract>
        <p>In this position paper, we highlight that being able to analyse cause-effect relationships, in order to determine the causal status among a set of events, is an essential requirement in many contexts, and we argue that it cannot be overlooked when building systems targeting real-world use cases. This is especially true for medical contexts, where understanding the cause(s) of a symptom or observation is of vital importance. However, most approaches based purely on Machine Learning (ML) do not explicitly represent and reason with causal relations, and may therefore mistake correlation for causation. We therefore argue for an approach that extracts causal relations from text and represents them in the form of Knowledge Graphs (KGs), to empower downstream ML applications, or AI systems in general, with the ability to distinguish correlation from causation and to reason with causality in an explicit manner. So far, the bottlenecks in KG creation have been the scalability and accuracy of automated methods. Hence, we argue that two novel features are required from methods addressing these challenges: (i) the use of Knowledge Patterns to guide the KG generation process towards a certain resulting knowledge structure, and (ii) the use of a semantic referee to automatically curate the extracted knowledge. We claim that this will be an important step forward for supporting interpretable AI systems and for integrating ML and knowledge representation approaches, such as KGs, and that the approach should also generalise well to other types of relations, apart from causality.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Knowledge Graphs (KGs) have emerged in the past decade as a
prominent form of knowledge representation, frequently used by
large enterprises such as Google, Facebook, Amazon, Siemens, and
many more [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. A KG is simply a graph representing some set of
data, usually coupled with a way to explicitly represent the
meaning of the data, e.g. an ontology. This can be seen as a revival of
graph-based knowledge representation, with roots in the early 1970s
(for instance, the term knowledge graph was used as early as 1972
by [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]), but with recent advances mainly related to the Semantic
Web, such as Linked Data on the Web, and Semantic Web
ontologies. This renewed popularity has been accelerated by two main
realisations regarding Machine Learning (ML), including Deep
Learning (DL) models: Although outperforming humans on many specific
tasks, ML/DL methods (i) are often unable to determine the
semantics of the correlations found in the data, and (ii) lack the ability to
transparently explain a prediction. A particularly challenging
example is the case of causal relations. As pointed out by [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the future
development of AI depends on building systems that incorporate the
notion of causality, e.g. to allow the system to reason about situations
that have not been previously encountered, based on general
principles. There is an active field of research developing specific ML/DL
algorithms targeting causal learning and reasoning. However, only
targeting ML/DL-based causal reasoning does not necessarily
improve interpretability; hence, there is a need to also develop methods
for producing and utilising interpretable causal models, as we shall
discuss further in Section 3.
      </p>
      <p>
        KGs, being symbolic models, make it possible to define the semantics of
relations in data at the level of formalisation necessary for an
intended task, e.g., through ontologies if needed, and, when integrated
with ML/DL methods, they support the interpretability of predictions.
Hence, KGs can be used to address both the main shortcomings of
ML/DL mentioned earlier, but the construction of KGs is a major
bottleneck in their adoption, just as was the case with knowledge
representation in general, in early AI systems. Outside large companies,
such as Google, and huge crowdsourcing initiatives, such as Wikidata
[
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], it is usually infeasible to construct large-scale KGs “manually”.
Rather, they have to be bootstrapped from existing sources, such as
semi-structured data or text. Current KG generation algorithms,
however, either do not take into account the desired formalisation of the
KG at all, or they hard-code it into the extraction algorithm. An
example of the latter is DBpedia [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], which is specific to a Wiki source
and expresses its results using a fixed ontology, which means that the
method does not generalise to new settings or other input structures.
Additionally, the quality of the generated KGs is usually poor [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ],
requiring manual curation. Furthermore, no automated approach so far
targets complex relations, e.g. causality. Therefore, our goal is to
develop new methods and algorithms for KG generation
from text, which (a) explicitly take KG requirements into account,
e.g. by allowing the required schema of the output graph to be
specified flexibly, and (b) automate the curation process, to radically improve
the quality of resulting KGs. In order to fulfil a specific set of KG
requirements, as well as to achieve a sufficient level of accuracy, we
propose to use the notion of Knowledge Patterns (KPs) [?] as
formalisations of KG requirements. A KP represents both a linguistic
frame that can be detected in text [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and the representation of
that frame in the desired KG output formalism, similar to the
notion of Ontology Design Patterns (ODP) [
        <xref ref-type="bibr" rid="ref4 ref5">5, 4</xref>
        ]. In order to tackle a
particularly important obstacle to the future development of the AI
field, given the importance of causal models and
reasoning, we intend to focus specifically on KPs and KGs that capture complex
causal relations.
      </p>
    </sec>
    <sec id="sec-2">
      <title>ML - Causality and Interpretability</title>
      <p>
        While ML methods perform very well in learning complex
connections between large amounts of input and output data, there is no
guarantee that they capture causation (cause and effect relations).
This shortcoming stems in part from the fact that data-driven
methods lack the reasoning techniques that humans apply
effortlessly. Consider two imaginary groups of people:
Group A: 100 asthmatic people with a death rate of 40%, and Group
B: 100 asthmatic people who also suffer from pneumonia, with a
death rate of 35%. An ML method fed solely with these data can only
learn the nonsensical result that asthmatics with pneumonia have a better
chance of survival [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The learning method has perhaps learned the
associations (or correlations) among the variables in the data correctly.
However, due to the absence of context and common sense
knowledge, and also the lack of reasoning abilities, the method has not been
able to explicitly and correctly capture the cause-effect relations.
That is why the outcome of the example above is not only
counterintuitive, but also misleading. By context, we refer to any information
that may not be represented in the observed data directly, and may
include the actual causes behind the observations, e.g., some set of
background information about the setting. In the given example,
people in Group B are at higher risk than those in Group A. The
lower death rate of people in Group B can have different reasons. For
instance, due to their high-risk status they may be more likely to be taken
to the intensive care unit (ICU), or they may be taking more effective
medicines, all of which are factors (or features) not considered by the
learning model [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. Additionally, some common sense knowledge,
such as the fact that additional diseases generally increase mortality rather than
decreasing it, could also have supported a system in avoiding the
erroneous conclusion, e.g., through using knowledge representations
as a referee for the learned model [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], as we will discuss further later
on in this paper.
      </p>
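      <p>The effect of such an unobserved factor can be illustrated with a small worked example. The Python sketch below keeps the overall death rates from the text (40% for Group A, 35% for Group B) but splits each group by ICU admission. The split itself, i.e. every count in the table, is invented for illustration and is not taken from any data set. Under these assumed counts, Group B has the higher death rate within each stratum, even though its overall rate is lower.</p>
      <preformat><![CDATA[
```python
from fractions import Fraction

# (group, admitted_to_icu) -> (deaths, patients).
# Only the group totals (40/100 and 35/100) come from the text;
# the split by ICU admission is a hypothetical illustration.
counts = {
    ("A", True): (2, 10),
    ("A", False): (38, 90),
    ("B", True): (20, 80),
    ("B", False): (15, 20),
}


def death_rate(group, icu=None):
    """Death rate of a group, optionally restricted to one ICU stratum."""
    deaths = patients = 0
    for (g, in_icu), (d, n) in counts.items():
        if g == group and (icu is None or in_icu == icu):
            deaths += d
            patients += n
    return Fraction(deaths, patients)


overall = {g: death_rate(g) for g in "AB"}
by_icu = {icu: {g: death_rate(g, icu) for g in "AB"} for icu in (True, False)}

for g in "AB":
    print(f"Group {g}: overall {float(overall[g]):.0%}, "
          f"ICU {float(by_icu[True][g]):.0%}, "
          f"no ICU {float(by_icu[False][g]):.0%}")
```
]]></preformat>
      <p>A model that only sees group membership and outcome observes the overall rates, whereas a model that also knows about ICU admission can recover the expected direction of the effect within each stratum.</p>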
      <p>To provide sufficient support for a reliable and precise prediction
or diagnosis process, every prediction made by a system needs to
be perfectly transparent and interpretable by the user. This is
necessary for any autonomous system to act as support for humans in
making decisions, and is even legally and ethically required in many
domains, including the medical domain. Although ML should
definitely be a part of the solution, what is predicted needs to be
interpretable, so that any conclusion based on that knowledge can be
explained in detail, most often including some notion of reliability or
confidence. A solution to this shortcoming of ML methods is to
integrate them with explicitly represented knowledge, such as, in the case
of causality, a formal causal model that reflects all the possible and
existing relations, including cause-effect ones, among the concepts
of a given domain.</p>
    </sec>
    <sec id="sec-3">
      <title>Causal Models</title>
      <p>
        By causal model, we refer to a parametric model that represents a set
of probability densities over variables including concepts defined in
a system (e.g., diseases and symptoms in the context of medicine),
together with the plausible causal relationships between them [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ].
Once available, integration of causal information (inferred from a
causal model) with the training (observational) data, can enable a
ML/DL method to also learn the causes behind its mistakes (i.e.,
misclassification) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and consequently improve its performance. In this
paper we specifically target causal relations, i.e. the focus is not on
determining the probability distributions but rather on the underlying
knowledge representation.
      </p>
      <p>
        Although recent research reflects the considerable impact of causal
inference in different domains, such as public health [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] or earth
science and climate change [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ], it is still challenging to involve
causal models within a learning process. One of the hindering
factors is, in fact, the lack of available domain-related causal models
compatible with the data used for learning [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], which leads to the
need to create such models manually for each use case. However,
for many domains nowadays, such as e-health and patient monitoring
through smart homes, both the set of potential outcomes and the set
of variables are extremely large. Therefore, manually constructing
and maintaining causal models requires a huge effort, and the resulting models cannot be
easily adapted to a new domain. Furthermore, manual construction of
models representing all the environmental features and relations may
not even be practically feasible, due to the changing nature of the
environment. This has already shifted the focus of research towards
automatically generating causal models [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], which is a line of research
we are also contributing to.
      </p>
      <p>
        Furthermore, causal relations are usually not as simple as one
explicit link between two well-defined (cause and effect) concepts.
Depending on the context and the conditions, we may, for instance, end
up with a set of causations with different certainty values. The
appropriate modelling of the causal relations also depends heavily on
the use cases of the resulting model, e.g., the kind of reasoning and
prediction tasks that it should support. For instance, reasoning on
potential guideline and treatment interactions in an individual patient
context, e.g., the target use case of [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], requires a highly complex
causal model, while in other cases a simpler one might suffice.
In Fig. 1, we illustrate this through two examples. On the right (b) is a
highly complex conceptual model (inspired by the model in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ])
representing the belief that a causal relation exists, with some frequency
and strength. On the left (a) is also a causal relation, but represented
as a much simpler conceptual model.
      </p>
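      <p>As a rough illustration of the difference between the two modelling styles, the following Python sketch contrasts a single direct link between two situation types with a reified causation belief that carries a frequency and a strength. The class and relation names are loosely borrowed from the labels in Fig. 1. How they are arranged here, the value types, and the example values are assumptions made for illustration and do not reproduce the figure.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SituationType:
    label: str


@dataclass(frozen=True)
class SimpleCausation:
    """Style (a): one direct 'causes' link between two situation types."""
    cause: SituationType
    effect: SituationType

    def triples(self):
        return [(self.cause.label, "causes", self.effect.label)]


@dataclass(frozen=True)
class CausationBelief:
    """Style (b): the causal relation is itself a node with attributes."""
    identifier: str
    has_as_cause: SituationType
    has_as_effect: SituationType
    frequency: str  # assumed to be a qualitative value
    strength: str   # assumed to be a qualitative value

    def triples(self):
        return [
            (self.identifier, "hasAsCause", self.has_as_cause.label),
            (self.identifier, "hasAsEffect", self.has_as_effect.label),
            (self.identifier, "frequency", self.frequency),
            (self.identifier, "strength", self.strength),
        ]


# Hypothetical example values
infection = SituationType("CoronavirusInfection")
cough = SituationType("ContinuousCough")
simple = SimpleCausation(infection, cough)
belief = CausationBelief("belief1", infection, cough,
                         frequency="often", strength="high")
```
]]></preformat>
      <p>The simple style yields one triple per causal statement, while the reified style yields several triples per statement but leaves room for qualifying the relation, which is what more demanding reasoning tasks may require.</p>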
      <p>Our proposed method intends to address the lack of causal models
by automating the generation of highly accurate causal KGs from
text. We intend to cater for the differing requirements of specific use
cases by using Knowledge Patterns (KPs), similar to the conceptual
models in Fig. 1 coupled with linguistic frames, to represent
requirements that ensure the resulting causal model enables the required
type of reasoning or predictions.</p>
    </sec>
    <sec id="sec-4">
      <title>Proposed Approach: Generating Causal KGs from Text</title>
      <p>The overarching goal of our research is to support the integration
of ML/DL and Knowledge Representation, for improving both
accuracy and interpretability of downstream AI applications. As
discussed previously, we believe that KGs can play a crucial role in
this integration, but then the KG construction bottleneck needs to be
resolved. Therefore, we propose to develop new methods and
algorithms for KG generation from text, which (a) explicitly take KG
requirements into account, e.g. by allowing the
required schema of the output graph to be specified flexibly, and (b) automate the KG curation
process to radically improve the quality of resulting KGs with
minimal human effort. In order to fulfil a specific set of KG requirements,
as well as to achieve a sufficient level of accuracy, we argue that the
notion of Knowledge Patterns (KPs) [?], as formalisations of KG
requirements, is a crucial concept. Here, we specifically focus on KPs
and KGs targeting causal relations, since causal models and causal
reasoning are among the main challenges for ML approaches today.</p>
      <p>However, the approach we outline is generic, and by exchanging the
KPs used, it can be used to target any type of complex relation that
can be expressed in natural language. The proposed approach is a
novel combination of methods from ML/DL for NLP with recent
advances in Knowledge Representation, such as KGs and KPs.</p>
      <p>As can be seen in Fig. 2, we propose a continuous process that
iteratively improves its ML/DL models based on feedback from a
curation step. As initial input (1), the process needs a set of KPs
rep[...]</p>
      <p>[Figure 1: Two conceptual models of a causal relation: (a) a simple model and (b) a more complex model. Labels recoverable from the figure: Situation Type, Event Type, Transition Type, Action Type, Causation Belief (frequency, strength); causes, incompatibleWith, hasPreSituation, hasPostSituation, hasAsCause, hasAsEffect, similarTo, opposedTo.]</p>
      <p>[Example text excerpt: “Coronavirus (COVID-19) ... If you have symptoms of coronavirus (a high temperature or a new, continuous cough)...”]</p>
      <p>[Footnote 4: https://www.w3.org/RDF/]</p>
      <p>[Footnote 5: The notation is again informal, but the symbol := is here used to indicate a causal relation, and f() represents a function.]</p>
      <p>
        [...]sions, which might be a valuable addition in our proposed curation
and feedback step. Earlier work on frame detection in text [
        <xref ref-type="bibr" rid="ref12 ref14">14, 12</xref>
        ],
and generation of KGs from this, may also be relevant for
comparison, especially since [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] also applied the notion of KPs related to
the frames detected; however, they did not allow for the frames to be
preselected as the KG requirements, or exchanged.
      </p>
      <p>
        Further, NELL [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] targets the learning of common facts,
extracted from natural language texts. Although their approach does
not target a specific output structure or relation, i.e., specific KPs, the
continuous improvement process is similar to our proposal. In other
recent studies, such as [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], KGs are also generated from
natural language text, but they do not target complex relations such as
causality, and the approaches use a fixed output schema.
      </p>
      <p>
        Very little research exists on extracting more complex relations,
i.e. relations that cannot be expressed as single facts (triples), and in
particular causal relations, directly from text. One study that
generated causal KGs from text is [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. The main differences from our envisioned
approach are the types of input data and the fact that [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]
targets one fixed logical structure of the output, i.e., a single fixed KP
expressing simple direct relations between diagnoses and symptoms.
      </p>
      <p>
        To learn a more complex formalisation of causality, we may also
need more complex learning, such as that suggested by [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], who
proposed a method for extracting a relation graph directly from natural
language, where the relations express entailment rules rather than
simple facts (triples).
      </p>
      <p>
        Another area where NLP has been widely used is KG
completion, e.g., link and relation prediction in an existing KG. Although
we intend to generate a KG “from scratch”, the KG generation from
instantiated KPs, as well as the subsequent curation process, have
some similarities with link and relation prediction. Hence,
inspiration may come from work such as [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ], who propose to use
pretrained language models for knowledge graph completion, scoring
candidate triples for addition through their KG-BERT model. This
is similar to how we envision assessing potential links between the
instantiated KPs, when generating the overall KG. Another approach
was recently proposed by [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], where language models such as GPT-2
are combined with a seed KG, allowing the learning of its structure
and relations, whereafter the language model can generate new nodes
and edges. However, our KPs are abstract and do not contain concrete
facts, which is a main difference from the seed KGs they used.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Knowledge Patterns</title>
      <p>
        The use of patterns in developing knowledge representation models
has a long tradition in AI, starting from the idea of Minsky in his
proposal of frames [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], and continued towards the notion of ontology
design patterns (ODP) in modern ontology engineering [
        <xref ref-type="bibr" rid="ref4 ref5">5, 4</xref>
        ]. ODPs
have also been generalised into KPs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], where a KP
represents both a linguistic frame that can be detected in text [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and
the representation of that frame in a desired output formalism.
However, in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], KPs are described and defined informally, and there is
currently no concrete formalism for representing and applying KPs
specifically for KGs.
      </p>
      <p>In order to capture specific types of knowledge from text,
supporting a specific task, such as medical decision support, the knowledge
extraction process needs to be carefully guided by the requirements
of the intended task of the resulting KG. Tasks may include different
types of queries, prediction, applying specific graph pattern matching
algorithms, or reasoning. To address this challenge, we argue for
applying KPs both as a representation of the KG requirements and
as a “schema” for the resulting KG. In short, as shown in
Fig. 2, we propose to tune the language models to detect the
specific KPs required, and further generate a KG from the instantiated
KPs.</p>
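      <p>To make the idea of a KP acting as both requirement and schema more concrete, the sketch below treats a KP as a named set of slots plus a triple template, and turns one detected frame instance into KG triples. Since no concrete KP formalism exists yet, everything here is a hypothetical illustration rather than the proposed formalism: the KnowledgePattern class, the slot names, the template layout, and the example frame instance are all assumptions.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgePattern:
    name: str
    slots: tuple     # roles that a detected frame instance must fill
    template: tuple  # triples over slot names, acting as the output schema

    def instantiate(self, fillers):
        """Turn one frame instance (slot -> text span) into KG triples."""
        missing = [s for s in self.slots if s not in fillers]
        if missing:
            raise ValueError(f"unfilled slots: {missing}")
        # Replace slot names by their fillers; other terms are kept as-is
        return [tuple(fillers.get(term, term) for term in triple)
                for triple in self.template]


causation_kp = KnowledgePattern(
    name="SimpleCausation",
    slots=("Cause", "Effect"),
    template=(("Cause", "causes", "Effect"),),
)

# Hypothetical output of a frame detector for one sentence
frame_instance = {"Cause": "coronavirus", "Effect": "continuous cough"}
triples = causation_kp.instantiate(frame_instance)
print(triples)
```
]]></preformat>
      <p>Exchanging the KP, for example for one with additional slots and a larger template, changes the shape of the generated graph without changing the extraction code, which is the kind of flexibility argued for above.</p>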
      <p>
        Using KPs to guide the learning process makes it possible to
capture different possible contextual situations separately, and target
different causal models, each focused on a certain specific downstream
task. Depending on the relations that are found in the text, KPs will
also allow us to calculate more precise certainty values for each
captured cause, similar to how we have used knowledge representations
as a referee for ML methods in our previous work [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This also
allows us to filter out extracted knowledge that does not make sense,
or is otherwise of questionable quality.
      </p>
      <p>
        However, this also introduces new challenges, because although
KPs have been studied to some extent for ontologies and the
Semantic Web, there is so far no formal definition of a KP that can be used
operationally (technically) by a system, in particular for KGs. For
this purpose we need to operationalise the definition in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], by
expanding on the connection between linguistic frames and ODPs, for
use within our KG extraction framework.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Semantic Referee</title>
      <p>
        Related to the integration of ML/DL and symbolic models, and
using knowledge representation to verify and repair results of ML/DL
algorithms, we rely on the idea of a semantic referee introduced in
our previous work [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In that work, we demonstrated the benefit of
a semantic referee applied upon a causal model in the form of an
ontology (OntoCity) for improving a satellite imagery data classifier.
In particular, the ontology together with a reasoning process acted
as a semantic referee to guide the ML method (i.e., the classifier).
Using causal information represented in the ontology, the semantic
referee was able to explain the causes behind errors, and send the
explanations as feedback to the classifier. In this way, the ML method
is able to know the causes behind its mistakes and therefore better
learn from them [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We argue that this previous work will be highly
useful when integrated as step (5) in our KG extraction framework,
illustrated earlier in Fig. 2.
      </p>
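      <p>The feedback idea can be sketched in a few lines of Python. This is not the ontology-based referee described above, which relies on an ontology and a reasoning process. It is a minimal stand-in in which a hand-written set of background triples and a table of opposite relations, both assumptions, play that role. The function name referee and the example triples are hypothetical.</p>
      <preformat><![CDATA[
```python
# Relations assumed to contradict each other
OPPOSITE = {"increases": "decreases", "decreases": "increases"}

# Hypothetical background knowledge standing in for an ontology
background = {("pneumonia", "increases", "mortality")}


def referee(candidates, background):
    """Split candidate triples into accepted ones and explained rejections."""
    accepted, feedback = [], []
    for s, p, o in candidates:
        # The triple that would contradict this candidate, if any
        conflict = (s, OPPOSITE.get(p), o)
        if conflict in background:
            reason = f"contradicts background knowledge: {conflict}"
            feedback.append(((s, p, o), reason))
        else:
            accepted.append((s, p, o))
    return accepted, feedback


candidates = [
    ("pneumonia", "decreases", "mortality"),  # misleading learned association
    ("coronavirus", "causes", "continuous cough"),
]
accepted, feedback = referee(candidates, background)
for triple, reason in feedback:
    print(triple, "rejected:", reason)
```
]]></preformat>
      <p>The rejected triples, together with their explanations, are what would be sent back to the learning component as feedback, while the accepted triples would be added to the KG.</p>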
    </sec>
    <sec id="sec-7">
      <title>Conclusion</title>
      <p>In this paper, we propose a possible approach to capturing causal
knowledge, in a scalable fashion, and representing it as a shared KG.
We argue that the advantage of constructing causal KGs is the
integration of causality in reasoning and prediction processes, such as the
medical diagnosis process, to improve the accuracy and reliability of
existing ML/DL-based diagnosis methods, by producing transparent
justifications and explanations of the output.</p>
      <p>More specifically, we focus on KGs as a means for providing
background knowledge and reasoning capabilities to ML/DL-based AI
systems, and target the KG creation bottleneck. In particular, we
recognise the challenge related to causal relations, where the
capability of performing causal reasoning is often lacking in pure ML-based
systems. Therefore, we propose to generate causal KGs from textual
information, to then be used as the basis for causal models. Our novel
framework is based on using a set of formal KPs as input, acting both
as the requirements of the KG and as the means for formalising
the extracted knowledge and curating it through logical reasoning.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Marjan</given-names>
            <surname>Alirezaie</surname>
          </string-name>
          , Martin Längkvist, Michael Sioutis, and Amy Loutfi, '
          <article-title>Semantic referee: A neural-symbolic framework for enhancing geospatial semantic segmentation</article-title>
          ',
          <volume>10</volume>
          ,
          <fpage>863</fpage>
          -
          <lpage>880</lpage>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Collin</surname>
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Baker</surname>
            ,
            <given-names>Charles J.</given-names>
          </string-name>
          <string-name>
            <surname>Fillmore</surname>
          </string-name>
          , and John B. Lowe, '
          <article-title>The Berkeley FrameNet project'</article-title>
          ,
          <source>in Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1, ACL '98/COLING '98</source>
          , p.
          <fpage>86</fpage>
          -
          <lpage>90</lpage>
          , USA, (
          <year>1998</year>
          ).
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Collin</surname>
            <given-names>F Baker</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Charles J Fillmore</surname>
          </string-name>
          , and John B Lowe, '
          <article-title>The Berkeley FrameNet project'</article-title>
          ,
          <source>in Proceedings of the 17th international conference on Computational linguistics-Volume</source>
          <volume>1</volume>
          , pp.
          <fpage>86</fpage>
          -
          <lpage>90</lpage>
          . Association for Computational Linguistics, (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Eva</given-names>
            <surname>Blomqvist</surname>
          </string-name>
          , Karl Hammar, and Valentina Presutti, '
          <article-title>Engineering ontologies with patterns - the extreme design methodology', in Ontology Engineering with Ontology Design Patterns - Foundations</article-title>
          and Applications, eds.,
          <string-name>
            <surname>Pascal</surname>
            <given-names>Hitzler</given-names>
          </string-name>
          , Aldo Gangemi, Krzysztof Janowicz,
          <source>Adila Krisnadhi, and Valentina Presutti</source>
          , volume
          <volume>25</volume>
          <source>of Studies on the Semantic Web</source>
          ,
          <fpage>23</fpage>
          -
          <lpage>50</lpage>
          , IOS Press, (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Eva</given-names>
            <surname>Blomqvist</surname>
          </string-name>
          and Kurt Sandkuhl, '
          <article-title>Patterns in ontology engineering: Classification of ontology patterns'</article-title>
          ,
          <source>in ICEIS 2005, Proceedings of the Seventh International Conference on Enterprise Information Systems</source>
          , Miami, USA, May
          <volume>25</volume>
          -28,
          <year>2005</year>
          , eds.,
          <string-name>
            <surname>Chin-Sheng</surname>
            <given-names>Chen</given-names>
          </string-name>
          , Joaquim Filipe, Isabel Seruca, and José Cordeiro, pp.
          <fpage>413</fpage>
          -
          <lpage>416</lpage>
          , (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Antoine</given-names>
            <surname>Bosselut</surname>
          </string-name>
          , Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz, and Yejin Choi, 'COMET:
          <article-title>Commonsense transformers for automatic knowledge graph construction'</article-title>
          , arXiv preprint arXiv:
          <year>1906</year>
          .
          <volume>05317</volume>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>V.</given-names>
            <surname>Carretta</surname>
          </string-name>
          <string-name>
            <surname>Zamborlini</surname>
          </string-name>
          ,
          <article-title>Knowledge Representation for Clinical Guidelines: with applications to Multimorbidity Analysis and Literature Search</article-title>
          ,
          <source>Ph.D. dissertation, Vrije Universiteit Amsterdam</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Rich</given-names>
            <surname>Caruana</surname>
          </string-name>
          , Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad, '
          <article-title>Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission'</article-title>
          ,
          <source>in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '15</source>
          , p.
          <fpage>1721</fpage>
          -
          <lpage>1730</lpage>
          , New York, NY, USA, (
          <year>2015</year>
          ).
          <article-title>Association for Computing Machinery</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Tirthankar</given-names>
            <surname>Dasgupta</surname>
          </string-name>
          , Rupsa Saha, Lipika Dey, and Abir Naskar, '
          <article-title>Automatic extraction of causal relations from text using linguistically informed deep neural networks'</article-title>
          ,
          <source>in Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue</source>
          , pp.
          <fpage>306</fpage>
          -
          <lpage>316</lpage>
          , Melbourne, Australia, (July
          <year>2018</year>
          ).
          <publisher-name>Association for Computational Linguistics</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , and Kristina Toutanova, '
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding'</article-title>
          ,
          <source>in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers), pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Färber</surname>
          </string-name>
          , Frederic Bartscherer, Carsten Menne, and Achim Rettinger, '
          <article-title>Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO'</article-title>
          ,
          <source>Semantic Web</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ),
          <fpage>77</fpage>
          -
          <lpage>129</lpage>
          , (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Marco</given-names>
            <surname>Fossati</surname>
          </string-name>
          , Emilio Dorigatti, and Claudio Giuliano, '
          <article-title>N-ary relation extraction for simultaneous t-box and a-box knowledge base augmentation'</article-title>
          ,
          <source>Semantic Web</source>
          ,
          <volume>9</volume>
          (
          <issue>4</issue>
          ),
          <fpage>413</fpage>
          -
          <lpage>439</lpage>
          , (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Aldo</given-names>
            <surname>Gangemi</surname>
          </string-name>
          and Valentina Presutti, '
          <article-title>Towards a pattern science for the semantic web'</article-title>
          ,
          <source>Semantic Web</source>
          ,
          <volume>1</volume>
          (
          <issue>1-2</issue>
          ),
          <fpage>61</fpage>
          -
          <lpage>68</lpage>
          , (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Aldo</given-names>
            <surname>Gangemi</surname>
          </string-name>
          , Valentina Presutti, Diego Reforgiato Recupero, Andrea Giovanni Nuzzolese, Francesco Draicchio, and Misael Mongiovì, '
          <article-title>Semantic web machine reading with FRED'</article-title>
          ,
          <source>Semantic Web</source>
          ,
          <volume>8</volume>
          (
          <issue>6</issue>
          ),
          <fpage>873</fpage>
          -
          <lpage>893</lpage>
          , (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Thomas A.</given-names>
            <surname>Glass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Steven N.</given-names>
            <surname>Goodman</surname>
          </string-name>
          , Miguel A. Hernán, and
          <string-name>
            <given-names>Jonathan M.</given-names>
            <surname>Samet</surname>
          </string-name>
          , '
          <article-title>Causal inference in public health'</article-title>
          , (March
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Aidan</given-names>
            <surname>Hogan</surname>
          </string-name>
          , Eva Blomqvist, Michael Cochez, Claudia d'Amato, Gerard de Melo, Claudio Gutierrez, José Emilio Labra Gayo, Sabrina Kirrane,
          <string-name>
            <given-names>Sebastian</given-names>
            <surname>Neumaier</surname>
          </string-name>
          , Axel Polleres, Roberto Navigli,
          <string-name>
            <given-names>Axel-Cyrille</given-names>
            <surname>Ngonga Ngomo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sabbir M.</given-names>
            <surname>Rashid</surname>
          </string-name>
          , Anisa Rula, Lukas Schmelzeisen, Juan Sequeda,
          <string-name>
            <given-names>Steffen</given-names>
            <surname>Staab</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Antoine</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          .
          <source>Knowledge graphs</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Pearl</surname>
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Mackenzie</surname>
            <given-names>D.</given-names>
          </string-name>
          ,
          <article-title>The book of why: the new science of cause and effect</article-title>
          ,
          <source>Basic Books</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Mohammad Javad</given-names>
            <surname>Hosseini</surname>
          </string-name>
          , Nathanael Chambers, Siva Reddy, Xavier R Holt, Shay B Cohen, Mark Johnson, and Mark Steedman, '
          <article-title>Learning typed entailment graphs with global soft constraints'</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          ,
          <volume>6</volume>
          ,
          <fpage>703</fpage>
          -
          <lpage>717</lpage>
          , (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Christopher S. G.</given-names>
            <surname>Khoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Syin</given-names>
            <surname>Chan</surname>
          </string-name>
          , and Yun Niu, '
          <article-title>Extracting causal knowledge from a medical database using graphical patterns'</article-title>
          ,
          <source>in Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics</source>
          , pp.
          <fpage>336</fpage>
          -
          <lpage>343</lpage>
          , Hong Kong, (October
          <year>2000</year>
          ).
          <publisher-name>Association for Computational Linguistics</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Jens</given-names>
            <surname>Lehmann</surname>
          </string-name>
          , Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas,
          <string-name>
            <given-names>Pablo N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          , Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, and Christian Bizer, '
          <article-title>DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia'</article-title>
          ,
          <source>Semantic Web</source>
          ,
          <volume>6</volume>
          (
          <issue>2</issue>
          ),
          <fpage>167</fpage>
          -
          <lpage>195</lpage>
          , (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Marvin</given-names>
            <surname>Minsky</surname>
          </string-name>
          , '
          <article-title>A framework for representing knowledge'</article-title>
          ,
          <source>MIT-AI Laboratory Memo 306</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Riccardo</given-names>
            <surname>Miotto</surname>
          </string-name>
          , Fei Wang,
          <string-name>
            <given-names>Shuang</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xiaoqian</given-names>
            <surname>Jiang</surname>
          </string-name>
          , and Joel Dudley, '
          <article-title>Deep learning for healthcare: review, opportunities and challenges'</article-title>
          ,
          <source>Briefings in Bioinformatics</source>
          ,
          <volume>19</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1236</fpage>
          -
          <lpage>1246</lpage>
          , (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23] Tom Mitchell, William Cohen, Estevam Hruschka, Partha Talukdar, Bishan Yang, Justin Betteridge, Andrew Carlson, Bhavana Dalvi, Matt Gardner,
          <string-name>
            <given-names>Bryan</given-names>
            <surname>Kisiel</surname>
          </string-name>
          , et al., '
          <article-title>Never-ending learning'</article-title>
          ,
          <source>Communications of the ACM</source>
          ,
          <volume>61</volume>
          (
          <issue>5</issue>
          ),
          <fpage>103</fpage>
          -
          <lpage>115</lpage>
          , (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Judea</given-names>
            <surname>Pearl</surname>
          </string-name>
          , '
          <article-title>Causal diagrams for empirical research'</article-title>
          ,
          <source>Biometrika</source>
          ,
          <volume>82</volume>
          (
          <issue>4</issue>
          ),
          <fpage>669</fpage>
          -
          <lpage>688</lpage>
          , (December
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Anderson</given-names>
            <surname>Rossanez</surname>
          </string-name>
          and Julio Cesar dos Reis, '
          <article-title>Generating knowledge graphs from scientific literature of degenerative diseases'</article-title>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>Maya</given-names>
            <surname>Rotmensch</surname>
          </string-name>
          , Yoni Halpern, Abdulhakim Tlimat, Steven Horng, and David Sontag, '
          <article-title>Learning a health knowledge graph from electronic medical records'</article-title>
          ,
          <source>Scientific reports</source>
          ,
          <volume>7</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          , (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>J.</given-names>
            <surname>Runge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bathiany</surname>
          </string-name>
          , E. Bollt,
          <string-name>
            <given-names>G.</given-names>
            <surname>Camps-Valls</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Coumou</surname>
          </string-name>
          , E. Deyle,
          <string-name>
            <given-names>C.</given-names>
            <surname>Glymour</surname>
          </string-name>
          , M. Kretschmer,
          <string-name>
            <given-names>M.D.</given-names>
            <surname>Mahecha</surname>
          </string-name>
          , E.H. van Nes
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Peters</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Quax</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Reichstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scheffer</surname>
          </string-name>
          , B. Schölkopf, P. Spirtes, G. Sugihara,
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          , K. Zhang, and
          <string-name>
            <given-names>J.</given-names>
            <surname>Zscheischler</surname>
          </string-name>
          , '
          <article-title>Inferring causation from time series with perspectives in earth system sciences'</article-title>
          ,
          <source>Nature Communications</source>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Edward W.</given-names>
            <surname>Schneider</surname>
          </string-name>
          , '
          <article-title>Course modularization applied: The interface system and its implications for sequence control and data analysis', (</article-title>
          <year>1973</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Schulam</surname>
          </string-name>
          and Suchi Saria, '
          <article-title>Reliable decision support using counterfactual models'</article-title>
          ,
          <source>in Advances in Neural Information Processing Systems</source>
          <volume>30</volume>
          , eds., I. Guyon,
          <string-name>
            <given-names>U. V.</given-names>
            <surname>Luxburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fergus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vishwanathan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Garnett</surname>
          </string-name>
          , pp.
          <fpage>1697</fpage>
          -
          <lpage>1708</lpage>
          , Curran Associates, Inc., (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>Livio</given-names>
            <surname>Baldini Soares</surname>
          </string-name>
          , Nicholas FitzGerald, Jeffrey Ling, and Tom Kwiatkowski, '
          <article-title>Matching the blanks: Distributional similarity for relation learning'</article-title>
          ,
          <source>in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</source>
          , pp.
          <fpage>2895</fpage>
          -
          <lpage>2905</lpage>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>Peter</given-names>
            <surname>Spirtes</surname>
          </string-name>
          , '
          <article-title>Introduction to causal inference'</article-title>
          ,
          <source>J. Mach. Learn. Res.</source>
          ,
          <volume>11</volume>
          ,
          <fpage>1643</fpage>
          -
          <lpage>1662</lpage>
          , (August
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>Denny</given-names>
            <surname>Vrandecic</surname>
          </string-name>
          and Markus Krötzsch, '
          <article-title>Wikidata: a free collaborative knowledgebase'</article-title>
          ,
          <source>Commun. ACM</source>
          ,
          <volume>57</volume>
          (
          <issue>10</issue>
          ),
          <fpage>78</fpage>
          -
          <lpage>85</lpage>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>Liang</given-names>
            <surname>Yao</surname>
          </string-name>
          , Chengsheng Mao, and Yuan Luo, '
          <article-title>KG-BERT: BERT for knowledge graph completion'</article-title>
          , arXiv preprint arXiv:1909.03193, (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>