<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>NeSy</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Continual Reasoning: Non-monotonic Reasoning in Neurosymbolic AI using Continual Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sofoklis Kyriakopoulos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Artur S. d'Avila Garcez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science</institution>
          ,
          <institution>City, University of London</institution>
          ,
          <addr-line>London</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>17</volume>
      <fpage>3</fpage>
      <lpage>5</lpage>
      <abstract>
        <p>Despite the extensive investment and impressive recent progress at reasoning by similarity, deep learning continues to struggle with more complex forms of reasoning such as non-monotonic and commonsense reasoning. Non-monotonicity is a property of non-classical reasoning typically seen in commonsense reasoning, whereby a reasoning system is allowed (differently from classical logic) to jump to conclusions which may be retracted later, when new information becomes available. Neural-symbolic systems such as Logic Tensor Networks (LTN) have been shown to be effective at enabling deep neural networks to achieve reasoning capabilities. In this paper, we show that by combining a neural-symbolic system with methods from continual learning, LTN can obtain a higher level of accuracy when addressing non-monotonic reasoning tasks. Continual learning is added to LTNs by adopting a curriculum of learning from knowledge and data with recall. We call this process Continual Reasoning, a new methodology for the application of neural-symbolic systems to reasoning tasks. Continual Reasoning is applied to a prototypical non-monotonic reasoning problem as well as other reasoning examples. Experimentation is conducted to compare and analyze the effects that different curriculum choices may have on overall learning and reasoning results. Results indicate significant improvement on the prototypical non-monotonic reasoning problem and a promising outlook for the proposed approach on statistical relational learning examples.</p>
      </abstract>
      <kwd-group>
        <kwd>Neural-Symbolic Systems</kwd>
        <kwd>Continual Learning</kwd>
        <kwd>Non-monotonic Reasoning</kwd>
        <kwd>Logic Tensor Networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The combination of machine learning and symbolic reasoning, now embodied by the area known
as neurosymbolic AI, has been a developing field of research since the early days of AI. Recent
advances in deep learning have fueled a surge of interest in this type of model.
Many variations of neural-symbolic (NeSy) models have surfaced in the past few years, showing
the advantages of NeSy systems at reasoning and learning with increased explainability, data
efficiency and generalization in comparison with other deep learning models [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1, 2, 3, 4</xref>
        ].
      </p>
      <p>
        In this paper we propose Continual Reasoning, a new paradigm of learning for NeSy
models to achieve non-monotonic reasoning (NMR). The core principle of Continual Reasoning
states that reasoning tasks, especially those of a non-monotonic nature, should be addressed by
learning from data and knowledge in a multi-stage curriculum of training. We illustrate this
learning paradigm using a combination of Logic Tensor Networks (LTN) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], a NeSy framework
capable of simulating First-Order Logic (FOL), and methodologies borrowed from Continual
Learning (CL) for deep learning [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. LTN is chosen for its ability to constrain the loss calculations
of a deep learning system based on symbolic knowledge defined in FOL and its effectiveness in
dealing with both typical deep learning and reasoning tasks [
        <xref ref-type="bibr" rid="ref5 ref7 ref8">5, 7, 8</xref>
        ]. CL, that is, the sequential
learning of knowledge, without forgetting, from data that may no longer be available, will
be shown to implement non-monotonicity in LTNs efficiently when adopting an appropriate
learning curriculum. Continual Reasoning, combining LTN and CL, aims to address the difficulties
that many NeSy models have when dealing with non-monotonic tasks.
      </p>
      <p>
        We apply and evaluate Continual Reasoning on an exemplar NMR task (the birds and penguins
example), on the Smokers and Friends statistical relational reasoning task [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and on a Natural
Language Understanding (NLU) task that contains NMR (from the bAbI dataset [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]). Results
indicate that a considerable increase in accuracy can be achieved in comparison with a
single-stage curriculum of learning.
      </p>
      <p>The remainder of this paper is organised as follows. In Section 2, we discuss the challenges
faced by previous approaches to NMR. In Section 3, we introduce the Continual Reasoning
methodology and two general approaches to curriculum design. In Section 4, we analyze the
experimental results. Section 5 concludes the paper and discusses directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        A common scenario to explain NMR is the Penguin Exception Task (PET) [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ], which can be
defined in simple terms as: In a group of animals, there exist birds and non-birds. It is known that
normally all birds fly, and that all non-birds do not fly. However, it is also known that penguins are
animals that are birds, but do not fly. In First-Order Logic (FOL), the PET can be defined using
axioms such as ∀x (is_bird(x) → can_fly(x)) and ∃x (is_penguin(x) ∧ is_bird(x)), etc.
The idea is that, in the absence of further information, it is reasonable to assume that all birds
can fly. However, when faced with information about penguins as an exception to the rule,
one would like to retract the previous conclusion. In monotonic FOL, retracting a
conclusion is not possible. Thus, in classical logic, the PET becomes unsolvable due to the
contradiction that arises between can_fly(penguin) and ¬can_fly(penguin). The PET is also unsolvable in
traditional logic programming languages, such as PROLOG [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In order to address the problem,
many non-monotonic approaches have been developed, including Moore's Autoepistemic Logic,
McCarthy's Circumscription, Reiter's Default Logic, and logic programming with negation
by failure [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In autoepistemic logic, certain rules can be adjusted to include an exception:
∀x (is_bird(x) ∧ ¬is_penguin(x) → can_fly(x)). However, the need to be explicit in
including all exceptions makes this approach computationally expensive (considering that
there are other birds that do not fly, e.g. ostriches). Circumscription and logic programming
with negation by failure, on the other hand, find a solution to the problem by introducing the
predicate abnormal to indicate an exceptional case. The above rule would be re-written as
∀x (is_bird(x) ∧ ¬abnormal(x) → can_fly(x)) along with a rule to state that penguins
are abnormal birds. Other exceptions would then be added as needed without changing the
original rule. Unfortunately, this approach does not adapt well to exceptions to the exceptions,
such as an abnormal penguin (a hypothetical super-penguin that is capable of flying).
      </p>
      <p>
        At present, there is a tension between the above attempts to formalize non-monotonicity
and large-scale data-driven approaches based on neural networks and natural language, which are
efficient but lack any formalization. In this paper, we seek to investigate approaches to solving the
PET and other simple examples that can be formalized but that work using the same tools as the
large-scale network models. Work has been conducted to formalize NMR in neural networks,
starting with the Connectionist Inductive Learning and Logic Programming System (CILP) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ],
later developed into a system for statistical relational learning. More recently, the Differentiable
Inductive Logic Programming (ILP) approach [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] was proposed, addressing cycles through
negation. Probabilistic approaches have also been developed which can implement a form of
non-monotonicity, or at least avoid the problems of classical logic, by assigning probabilities to beliefs
expressed as Horn clauses, e.g. DeepProbLog [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. In this paper, rather than mapping symbolic
representations into neural networks and vice-versa, we are interested in the interplay between
learning and reasoning as part of a curriculum. We focus on the Logic Tensor Network (LTN) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
because it is a highly modular NeSy framework applicable in principle to any underlying neural
network model and based on the canonical, highly expressive FOL language. Additionally, the
LTN has shown promise for learning in continual mode [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>
        The LTN relies on two main ideas, the grounding of predicates and logical axioms into vectors
and Real Logic which maps the satisfiability of the logical axioms to a real number in the interval
{0,1} thus enabling viewing satisfiability as optimization. Given a knowledge base of FOL axioms
, the LTN grounds every variable  to a vector representation ( ) = ⟨1...⟩ ∈  1 ,
and every predicate  to a neural network () → [
        <xref ref-type="bibr" rid="ref1">0, 1</xref>
        ].2 The application of Real Logic uses
diferentiable fuzzy logic to calculate the truth value of any LTN rule in the usual way. The
satisfiability ( ), i.e. the aggregated truth value of the knowledge base, is then used in the loss
function, with  = 1 − . For further details, we point the reader to [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Note 1: The value n is a hyperparameter of the framework, and is defined by the developer. For the
experiments below, the values ⟨v1, ..., vn⟩ are initialized randomly and trained along with the predicate
neural network.
      </p>
      <p>
        Note 2: The LTN framework treats FOL axioms in a slightly different way than logic programming.
A grounding creates a direct connection with data, mapping a variable to a specific partition of the data. For
this reason, we use the term rules instead of axioms when referring to the FOL knowledge base defined in LTN.
The FOL axiom ∀x is_bird(x) → can_fly(x) is defined in LTN as the rule ∀animals is_bird(animals) ⇒
can_fly(animals), where Animals is the set of vector groundings for all animals in the data. This makes LTN
a typed FOL language. If we wish to declare rules that only apply to a subset of Animals, we can do this in
LTN using e.g. ∀norm_birds can_fly(norm_birds), where Norm_Birds consists only of the vector
representations for birds, which is a subset of Animals. This excludes other subsets of animals, e.g. Penguins
or Cows. For the definition of the PET used in LTN, see Appendix A.
      </p>
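      <p>As a minimal, self-contained sketch of these ideas (ours, not code from the LTN library; the network shapes, the product-style implication, and the mean aggregator are illustrative assumptions), the following shows how a predicate grounded as a neural network turns the satisfiability of a single rule into a loss:</p>
      <preformat>
import torch

# Hypothetical grounding: ten constants, each a trainable 4-dimensional vector.
animals = torch.nn.Parameter(torch.randn(10, 4))

def make_predicate(n=4):
    # A predicate is grounded as a small neural network mapping R^n to [0, 1].
    return torch.nn.Sequential(
        torch.nn.Linear(n, 16), torch.nn.ReLU(),
        torch.nn.Linear(16, 1), torch.nn.Sigmoid())

is_bird, can_fly = make_predicate(), make_predicate()

def forall(truth_values):
    # One common fuzzy aggregator for the universal quantifier: the mean.
    return truth_values.mean()

def implies(a, b):
    # Reichenbach fuzzy implication: 1 - a + a*b.
    return 1.0 - a + a * b

# Satisfiability of the rule "forall animals: is_bird(animals) => can_fly(animals)".
sat = forall(implies(is_bird(animals).squeeze(-1), can_fly(animals).squeeze(-1)))
loss = 1.0 - sat  # Loss = 1 - sat(K), as defined above
loss.backward()   # gradients flow into both predicate networks and the groundings
</preformat>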
    </sec>
    <sec id="sec-3">
      <title>3. Method</title>
      <p>
        Continual Reasoning is proposed as a novel methodology, addressing reasoning tasks with a
combination of NeSy models and a curriculum of training. In CL, a multi-task dataset is split
along the different tasks, so that the model can be trained on each subset of data at each stage
of the curriculum, with the aim to learn new tasks without forgetting old ones. In the context
of NeSy models where tasks and knowledge are mostly represented at the symbolic level, we
treat the aforementioned splitting of data as a division of the symbolic knowledge along a series
of stages, which constitutes our curriculum of learning. In doing so, we rely on the neural
networks of the NeSy models to learn new knowledge without forgetting previously learned
knowledge, adjusting their beliefs about previously learned knowledge to allow for the new
knowledge to be mapped to true without creating an inconsistency. Specifically, when using
the LTN as our NeSy model, a knowledge base (KB) of FOL rules is separated into multiple
stages for learning. For example, consider a KB consisting of facts A(x), B(x), C(x), and rules
A(x) ⇒ D(x) and B(x) ∧ C(x) ⇒ D(x). A split into three stages might be: (1) train on the
facts; (2) train on A(x) ⇒ D(x) and recall fact A(x); (3) train on B(x) ∧ C(x) ⇒ D(x). All
facts and rules are assumed to be universally quantified. Our experiments will show, as one
would expect, that the choice of curriculum, i.e. the specific sequence in which the rules are
learned and the facts are recalled, can affect the outcome. It becomes apparent that while in
traditional machine learning all data is treated equally as being i.i.d. (although recent work
around out-of-distribution (OOD) learning has started to question this assumption [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]), in
reasoning tasks, especially NMR, the order in which knowledge is learned matters (in addition
to the data split already identified as important in OOD learning).
      </p>
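      <p>A minimal sketch (ours) of the staged training loop just described: each stage optimizes Loss = 1 − sat over its own rules plus a recalled sample of earlier ones. The sat_of function stands for the Real Logic aggregation of a set of rules, in the spirit of the sketch in Section 2; rule objects are left opaque.</p>
      <preformat>
import random
import torch

def train_curriculum(stages, sat_of, params, epochs=200, lr=0.01, n_recall=2):
    # stages: list of stages, each a list of rules.
    # sat_of(rules): aggregated truth value of the given rules, a scalar in [0, 1].
    opt = torch.optim.Adam(params, lr=lr)
    learned = []  # rules from earlier stages, available for recall
    for stage_rules in stages:
        for _ in range(epochs):
            recall = random.sample(learned, min(n_recall, len(learned)))
            opt.zero_grad()
            loss = 1.0 - sat_of(stage_rules + recall)  # Loss = 1 - sat of stage KB
            loss.backward()
            opt.step()
        learned.extend(stage_rules)
    return params
</preformat>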
      <p>
        Thus, we focus on two core requirements for the choice of curriculum. The first relies on
the approach commonly applied in CL where data is split into separate tasks [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This can be
applied in Continual Reasoning by treating each predicate as an individual task and training any
rule aimed at learning about said predicate in a single stage of the curriculum. We call this Task
Separation. In our previous example, we would split the KB into four stages: (1) learn A(x); (2)
B(x); (3) C(x); and (4) learn about D(x), training on both rules. The second requirement takes
inspiration from work conducted with knowledge graphs and lifelong learning projects such
as NELL [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], in which we aim to "build up" from atomic knowledge (i.e. facts) and augment
knowledge by abiding by new rules. In Continual Reasoning, we can accomplish this by giving
priority to learning propositional rules and rules that are directly tied to labelled data. Following
this, we aim to use rules that extend the learned domain beyond what is available to more
abstract concepts. This is known as Knowledge Completion. Using again our previous example,
to satisfy both requirements we would split the KB into two stages: (1) train A(x), B(x) and
C(x); (2) learn A(x) ⇒ D(x) and B(x) ∧ C(x) ⇒ D(x).
      </p>
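      <p>For the running example, the two paradigms differ only in how the KB is binned into stages; a sketch (ours) with rules kept as opaque labels, suitable for a loop such as train_curriculum above:</p>
      <preformat>
facts = ["A(x)", "B(x)", "C(x)"]
rules = ["A(x) ⇒ D(x)", "B(x) ∧ C(x) ⇒ D(x)"]

# Task Separation: one predicate (task) per stage, then the rules about D.
task_separation = [["A(x)"], ["B(x)"], ["C(x)"], rules]

# Knowledge Completion: all ground facts first, then the rules extending them.
knowledge_completion = [facts, rules]
</preformat>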
      <p>
        To be able to do the above using neural networks, we must address the core issue found in
CL, often referred to as catastrophic forgetting, i.e. when the process of gradient descent leads
the neural network to forget previously learned data by conforming entirely to newly provided
data. To address this problem, we apply a common CL technique of rehearsal [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Rehearsal
is the process by which previously seen data is sampled and recalled in the current stage of
learning. For Continual Reasoning, since our knowledge is represented in FOL, in each stage of
learning, we recall a random set of previously learned knowledge, such as A(x) earlier, to be
learned along with the set of FOL rules.
      </p>
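      <p>In code, the recall step is just a fresh random sample from the already-learned portion of the KB, mixed into the current stage's rules (a one-function sketch, ours):</p>
      <preformat>
import random

def recalled_batch(current_rules, learned_rules, n_recall=2):
    # Rehearsal: augment the current stage's rules with a random sample of
    # previously learned FOL knowledge, e.g. the fact A(x) from earlier.
    sample = random.sample(learned_rules, min(n_recall, len(learned_rules)))
    return current_rules + sample
</preformat>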
      <p>
        For our analysis, we compare the task separation and knowledge completion curricula to a
Baseline, where all knowledge is learned in a single stage, and a Random curriculum, where the
KB split is randomly selected for each stage. To allow for effective comparison, all curricula,
apart from the baseline, are composed of three stages. These comparisons are applied to the
PET as a prototypical NMR task to show their benefits. In addition, to show the effectiveness of
Continual Reasoning on other types of reasoning problems, we apply it to the Smokers and
Friends task [
        <xref ref-type="bibr" rid="ref5 ref9 ref20">5, 9, 20</xref>
        ] and to Task 1 of the bAbI dataset [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] in what follows.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>Penguin Exception Task (PET): For the PET, we examine the behaviour of the LTN
model throughout the curriculum of training, paying particular attention to three
distinct types of reasoning that are necessary for success. First, we have knowledge that
can be learned through induction with one-hop reasoning, such as determining that all
normal birds fly, ∀norm_birds can_fly(norm_birds), and that all penguins are birds,
∀penguins is_bird(penguins). Second, we have two-hop reasoning when determining that
all penguins should be able to fly, ∀penguins can_fly(penguins), because they are birds. This
is an instance of jumping to a conclusion in the absence of further information. Lastly, we
contradict this conclusion with our final learning stage, for which we expect to conclude
non-monotonically that penguins in fact do not fly, ∀penguins ¬can_fly(penguins). We use these
four FOL statements as queries in the analysis of our curricula of learning by measuring their
LTN satisfiability over time (Table 1).</p>
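      <p>A small sketch of how such query tracking can be implemented (our illustration; the rule objects and the sat_of aggregator are the hypothetical ones from the sketches in Section 3):</p>
      <preformat>
def track_satisfiability(queries, sat_of, history):
    # queries: mapping from a readable FOL string to its rule object, e.g.
    #   "forall norm_birds: can_fly(norm_birds)" or
    #   "forall penguins: not can_fly(penguins)".
    # Called after every training step; history accumulates one curve per query,
    # yielding the per-stage satisfiability trajectories summarized in Table 1.
    for name, rule in queries.items():
        history.setdefault(name, []).append(float(sat_of([rule])))
</preformat>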
      <p>The results indicate that the task separation curriculum performs better than the other
curricula, with the LTN able to correctly distinguish between all types of animals, as well as
learn that normal birds can fly, while penguins, although still classified as birds, do not fly. The
knowledge completion curriculum also achieves high satisfiability for each of the queries.</p>
      <p>However, in comparison with task separation, the knowledge completion curriculum is less
robust, and in our experimentation led to one failure case, in which penguins were misclassified
as normal birds, and therefore could fly.</p>
      <p>When analyzing the queries throughout the training stages, we can identify changes that
show that the LTN has the desired behaviour, including jumping to conclusions and belief
revision. Specifically, in the second stage of both curricula, the LTN is trained to infer that
penguins are birds, as well as that all birds can fly. Until told otherwise, the LTN jumps to
the conclusion that penguins should be able to fly. In the third stage, however, the LTN is
trained on the rule that penguins cannot fly. Given this knowledge, can_fly(penguins) and
can_fly(norm_birds) take an initial plunge (clearly shown in Figure 1). This, of course,
makes sense, as the LTN does not yet have any reason to distinguish between penguins and
normal birds, and thus once again jumps to the conclusion that since penguins cannot fly,
then normal birds should not fly either. However, we see that the process of recall makes
can_fly(norm_birds) regain satisfiability, while the satisfiability of can_fly(penguins)
decreases towards zero. It is interesting to note that in stage 3 the apparent contradiction does not
lead to a convergence around an uninformative satisfiability of 0.5. With a random curriculum,
we see more variance in the final results, which is to be expected given the random choice of
rules, but overall, on average, this curriculum performs slightly better than the baseline. This
shows that even without the benefit of curriculum design, the method of Continual Reasoning
leads to better results than attempting to learn the full knowledge base in a single stage. By further
analysing the experiments in which the random curricula perform optimally, we see that task
separation and knowledge completion curricula are not the only viable options for success (see
Appendix A).</p>
      <p>
        Smokers and Friends Task (S&amp;F) : The S&amp;F problem consists of a statistical relational
reasoning task. We define the knowledge base in accordance with [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and compare a baseline
curriculum to curricula belonging to the knowledge completion and task separation paradigms. The
satisfiability of each rule throughout the stages shows that a knowledge completion curriculum
outperforms the baseline and task separation on identifying that smoking causes cancer (97.8%
to 71.5% and 80.6%, respectively). Overall, the knowledge completion curriculum leads to the
LTN reaching higher satisfiability in five of the nine FOL rules, in comparison with the baseline
which beats the other curriculum in only three of the nine rules (see Appendix B for a table
detailing the satisfiability of rules per stage of each curriculum).
      </p>
      <p>
        In addition to the comparison between curricula, we compare the outcome of Continual Reasoning
with two other NeSy models that have been applied to S&amp;F, the Logical Neural Network (LNN)
and the Markov Logic Network (MLN). The LNN allows for a lower and upper bound truth value,
which signifies the lowest possible and highest possible truth value for a given FOL axiom, such
that the whole knowledge base holds true. The MLN derives axiom log-probability weights
which signify the probability of the axiom’s mapping to true compared to the probability of it
mapping to false. In Table 4, we see the results of these models per FOL rule used for training
in our experiments; for the LNN, for example, the bounds include [0.83, 0.98] for ¬friends(x, x),
[0.97, 1.00] for friends(x, y) ⇒ friends(y, x), [1.00, 1.00] for ∃y friends(x, y), [0.65, 1.00] for
friends(x, y) ∧ smokes(x) ⇒ smokes(y), and [0.58, 1.00] for smokes(x) ⇒ cancer(x). It is
important to note that a precise comparison is not possible, as each
model defines the set of FOL rules slightly differently in training. However, we see that the
application of Continual Reasoning on LTNs for the S&amp;F task performs comparably to other
NeSy approaches.
      </p>
      <p>
        bAbI - Task 1: Task 1 of the bAbI dataset contains story lines of given facts and questions
about those facts. For example, one instance will provide the sentences "Mary went to the office.
Jack travelled to the garden." and ask "Where is Mary?". In order to address such a task with
the proposed approach of Continual Reasoning using LTNs, we transform natural language
sentences into FOL rules using GPT-3 [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] (see note 3). As the task already consists of stories told in stages,
separated by questions, for curriculum design we simply separate the FOL rules along the same
stages in the dataset. The reasoning here can be said to be non-monotonic over time in that,
later in the story, truth values may change, e.g. Mary may no longer be in the office. Initial
experimentation showed that by applying Continual Reasoning, an LTN model achieves 96.9%
accuracy on the testing set of bAbI-Task 1, surpassing the 95% threshold for success. Further
experimentation is ongoing.
      </p>
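      <p>As an illustration of this transformation step, the sketch below (ours) uses a few-shot prompt with the legacy GPT-3 Completions API; the exact prompt, the model name, and the at(...) predicate are our assumptions, not details given in the paper:</p>
      <preformat>
import openai  # legacy Completions API for GPT-3

PROMPT_TEMPLATE = """Translate each sentence into a first-order logic fact.
Sentence: John moved to the hallway. FOL: at(John, hallway)
Sentence: {sentence} FOL:"""

def to_fol(sentence: str) -> str:
    # e.g. to_fol("Mary went to the office.") is expected to yield "at(Mary, office)"
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=PROMPT_TEMPLATE.format(sentence=sentence),
        max_tokens=20,
        temperature=0,
    )
    return response["choices"][0]["text"].strip()
</preformat>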
    </sec>
    <sec id="sec-5">
      <title>5. Discussion, Conclusions and Future Work</title>
      <p>
        We have introduced a novel methodology that integrates neurosymbolic AI and continual
learning techniques in order to achieve non-monotonic reasoning. We call this Continual
Reasoning, and we showed that by using Logic Tensor Networks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] as our neural-symbolic
framework, and training the knowledge base of First-Order Logic rules in a curriculum of
multiple stages, we can improve on the traditional approach of learning all rules together.
Additionally, we have analysed multiple types of curricula, proposing two general paradigms
for curriculum design, and showed that while even a random curriculum performs better on
average than the baseline, a specific design choice can allow the model to appropriately jump
to conclusions and revise its beliefs more effectively.
      </p>
      <p>
        Experimentation conducted for this paper showed that Continual Reasoning also performs
comparably to a baseline curriculum and to other NeSy models on statistical relational reasoning
tasks. Continuation of this work could apply Continual Reasoning to larger datasets, such as
the dataset used in RuleTaker [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], visual relational question-answering datasets, such as
CLEVR [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], and the remaining tasks in the bAbI dataset [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        Furthermore, there still remain open questions concerning Continual Reasoning, such as
how it might perform in extended non-monotonic reasoning tasks that occur when addressing
lifelong learning. Rudimentary exploration of extending the PET to learn about a
"super-penguin" which could fly resulted in the LTN mostly failing to learn the exception to the
exception. We believe, however, that utilising more advanced continual learning techniques,
such as structural choices for neural network architecture, as well as more sophisticated recall
methods like active learning, as suggested in [
        <xref ref-type="bibr" rid="ref17 ref6">6, 17</xref>
        ], would allow the Continual Reasoning
methodology to succeed. This is to be investigated. Additionally, while LTNs proved to be a
straightforward NeSy model to apply Continual Reasoning on, it should be possible to apply
our methodology to other NeSy models, such as LNNs. Integration with a very recent software
framework called PyReason [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] could provide an efficient way to do this.
      </p>
      <p>
        Note 3: This approach is inspired by that used in [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], although FOL parsing of natural language is an evolving field of
research which continues to face challenges [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>A. Penguin Exception Task - Extra Material</title>
      <p>PET - LTN Rule and curriculum definition: Let us assume the variables norm_birds,
cows, penguins, and animals, which represent groups of normal birds, cows, penguins, and
the union of all groups of animals, respectively. Therefore, we define below a knowledge base of
FOL rules that reflect the PET as a prototypical non-monotonic reasoning task.
1. ∀norm_birds is_bird(norm_birds) (normal birds are birds)
2. ∀cows ¬is_bird(cows) (cows are not birds)
3. ∀animals is_bird(animals) ⇒ can_fly(animals) (birds can fly)
4. ∀animals ¬is_bird(animals) ⇒ ¬can_fly(animals) (non-birds cannot fly)
5. ∀penguins is_penguin(penguins) (penguins are penguins)
6. ∀non_penguins ¬is_penguin(non_penguins) (non-penguins are not penguins)
7. ∀animals is_penguin(animals) ⇒ is_bird(animals) (penguins are birds)
8. ∀animals is_penguin(animals) ⇒ ¬can_fly(animals) (penguins do not fly)
It is important to note that these rules are defined taking an open-world assumption, hence the
need for declaring negations in rules 2 and 6. Additionally, we recognize that the same knowledge
task could be defined using other forms of the same rules, to the same end. For example, rules
7 and 8 could be combined into one: ∀animals is_penguin(animals) ⇒ is_bird(animals) ∧
¬can_fly(animals). However, for the purposes of this paper, we limit the rules to their
simplest forms.
[Residue of a table assigning the eight rules above to the stages of each curriculum; the stage
boundaries are not recoverable here.]</p>
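      <p>For concreteness, a sketch (ours, reusing the forall/implies operators and predicate networks from the sketch in Section 2; is_penguin and the group tensors norm_birds, cows, penguins, non_penguins, animals are assumed to be defined analogously) of how this KB can be written down programmatically:</p>
      <preformat>
def not_(a):
    # Fuzzy negation: 1 - a, applied elementwise to truth values in [0, 1].
    return 1.0 - a

pet_rules = {
    "normal birds are birds":  lambda: forall(is_bird(norm_birds)),
    "cows are not birds":      lambda: forall(not_(is_bird(cows))),
    "birds can fly":           lambda: forall(implies(is_bird(animals), can_fly(animals))),
    "non-birds cannot fly":    lambda: forall(implies(not_(is_bird(animals)),
                                                      not_(can_fly(animals)))),
    "penguins are penguins":   lambda: forall(is_penguin(penguins)),
    "non-penguins are not penguins":
                               lambda: forall(not_(is_penguin(non_penguins))),
    "penguins are birds":      lambda: forall(implies(is_penguin(animals), is_bird(animals))),
    "penguins do not fly":     lambda: forall(implies(is_penguin(animals),
                                                      not_(can_fly(animals)))),
}
# Each entry's satisfiability can then be aggregated into sat(K) for training,
# or evaluated individually as a query.
</preformat>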
    </sec>
    <sec id="sec-7">
      <title>B. Smokers and Friends Task - Extra Material</title>
      <p>The knowledge base for the S&amp;F task consists of the following FOL rules, listed with
their intended meanings:
1. friends(x, y) (identify known friendships)
2. smokes(x) (identify known smokers)
3. cancer(x) (identify known cancer)
4. ¬friends(x, x) (friendship is antireflexive)
5. friends(x, y) ⇒ friends(y, x) (friendship is symmetric)
6. ∃y friends(x, y) (everyone has a friend)
7. friends(x, y) ∧ smokes(x) ⇒ smokes(y) (friends of smokers smoke)
8. smokes(x) ⇒ cancer(x) (smoking causes cancer)
9. ¬smokes(x) ⇒ ¬cancer(x) (non-smokers do not get cancer)
[Residue of a table detailing the satisfiability (SAT) of each rule and of the whole KB per stage
of each curriculum; the per-stage values are not recoverable here.]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] A.
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>L. C.</given-names>
          </string-name>
          <string-name>
            <surname>Lamb</surname>
          </string-name>
          ,
          <article-title>Neurosymbolic ai: The 3rd wave (</article-title>
          <year>2020</year>
          ). URL: http://arxiv.org/abs/
          <year>2012</year>
          .05876.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <article-title>Neural, symbolic and neural-symbolic reasoning on knowledge graphs</article-title>
          , arXiv:
          <year>2010</year>
          .05446 [cs] (
          <year>2021</year>
          ). URL: http://arxiv.org/abs/
          <year>2010</year>
          .05446.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Besold</surname>
          </string-name>
          , A.
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Bader</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Bowman</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Domingos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Hitzler</surname>
            , K.-U. Kuehnberger,
            <given-names>L. C.</given-names>
          </string-name>
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Lowd</surname>
            ,
            <given-names>P. M. V.</given-names>
          </string-name>
          <string-name>
            <surname>Lima</surname>
            , L. de Penning, G. Pinkas,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Poon</surname>
          </string-name>
          , G. Zaverucha,
          <article-title>Neural-symbolic learning and reasoning: A survey and interpretation</article-title>
          , arXiv:
          <fpage>1711</fpage>
          .03902 [cs] (
          <year>2017</year>
          ). URL: http://arxiv.org/abs/1711.03902.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kohli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>The neuro-symbolic concept learner: Interpreting scenes, words, and sentences from natural supervision</article-title>
          ,
          <source>ICLR</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Badreddine</surname>
          </string-name>
          , A.
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Serafini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Sparanger</surname>
          </string-name>
          , Logic tensor networks (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Mundt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. W.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Pliushch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <article-title>A wholistic view of continual learning with deep neural networks: Forgotten lessons and the bridge to active and open world learning</article-title>
          , arXiv:
          <year>2009</year>
          .01797 [cs, stat] (
          <year>2020</year>
          ). URL: http://arxiv.org/abs/
          <year>2009</year>
          .01797.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Serafini</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          <article-title>d'Avila Garcez, Learning and reasoning with logic tensor networks</article-title>
          ,
          <source>Proc. Ai*AI</source>
          (
          <year>2016</year>
          )
          <fpage>334</fpage>
          -
          <lpage>348</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Donadello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Serafini</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          <article-title>d'Avila Garcez, Logic tensor networks for semantic image interpretation</article-title>
          ,
          <source>IJCAI-17</source>
          (
          <year>2017</year>
          )
          <fpage>1596</fpage>
          -
          <lpage>1602</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Richardson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          <article-title>Markov logic networks</article-title>
          ,
          <source>Machine Learning</source>
          <volume>62</volume>
          (
          <year>2006</year>
          )
          <fpage>107</fpage>
          -
          <lpage>136</lpage>
          . doi:
          <volume>10</volume>
          .1007/s10994-006-5833-1.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bordes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chopra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Rush</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. van Merriënboer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Joulin</surname>
          </string-name>
          , T. Mikolov,
          <article-title>Towards ai-complete question answering: A set of prerequisite toy tasks</article-title>
          ,
          <year>2015</year>
          . arXiv:
          <volume>1502</volume>
          .
          <fpage>05698</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>A. d'Avila Garcez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>D. M.</given-names>
          </string-name>
          <string-name>
            <surname>Gabbay</surname>
          </string-name>
          ,
          <article-title>Neural-symbolic cognitive reasoning</article-title>
          ,
          <source>in: Cognitive Technologies</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Antoniou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <article-title>Nonmonotonic reasoning / Grigoris Antoniou ; with contributions by Mary-Anne Williams</article-title>
          , MIT Press Cambridge, Mass,
          <year>1997</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Covington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bagnara</surname>
          </string-name>
          , R. A.
          <string-name>
            <surname>O'Keefe</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Wielemaker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Price</surname>
          </string-name>
          , Coding guidelines for prolog,
          <year>2009</year>
          . URL: https://arxiv.org/abs/0911.2899. doi:
          <volume>10</volume>
          .48550/ARXIV.0911.2899.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>A. d'Avila Garcez</surname>
            ,
            <given-names>G. Zaverucha,</given-names>
          </string-name>
          <article-title>The connectionist inductive learning and logic programming system</article-title>
          ,
          <source>Appl. Intell</source>
          .
          <volume>11</volume>
          (
          <year>1999</year>
          )
          <fpage>59</fpage>
          -
          <lpage>77</lpage>
          . doi:
          <volume>10</volume>
          .1023/A:
          <fpage>1008328630915</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Evans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Grefenstette</surname>
          </string-name>
          ,
          <article-title>Learning explanatory rules from noisy data</article-title>
          ,
          <year>2017</year>
          . URL: https://arxiv.org/abs/1711.04574. doi:
          <volume>10</volume>
          .48550/ARXIV.1711.04574.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>R.</given-names>
            <surname>Manhaeve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dumančić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kimmig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Demeester</surname>
          </string-name>
          , L. De Raedt,
          <source>Deepproblog: Neural probabilistic logic programming</source>
          ,
          <year>2018</year>
          . URL: https://arxiv.org/abs/
          <year>1805</year>
          .10872. doi:
          <volume>10</volume>
          . 48550/ARXIV.
          <year>1805</year>
          .
          <volume>10872</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>B.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. S.</surname>
          </string-name>
          <article-title>d'Avila Garcez, Neural-symbolic integration for fairness in ai, in: AAAI 2021 Spring Symposium on Combining Machine Learning and Knowledge Engineering (AAAI-MAKE</article-title>
          <year>2021</year>
          ), volume
          <volume>2846</volume>
          ,
          <year>2021</year>
          . URL: https://openaccess.city.ac.uk/id/eprint/ 26151/.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Słowik</surname>
          </string-name>
          , L. Bottou,
          <article-title>Algorithmic bias and data bias: Understanding the relation between distributionally robust optimization and data curation</article-title>
          ,
          <year>2021</year>
          . arXiv:
          <volume>2106</volume>
          .
          <fpage>09467</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hruschka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Talukdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Betteridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dalvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gardner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kisiel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Krishnamurthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mazaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nakashole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Platanios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ritter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Samadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Settles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wijaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saparov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Greaves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <article-title>Never-ending learning</article-title>
          ,
          <source>in: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15)</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Riegel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Luus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Makondo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. Y.</given-names>
            <surname>Akhalwaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Barahona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ikbal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Karanam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Neelam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Likhyani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <article-title>Logical neural networks (</article-title>
          <year>2020</year>
          ). URL: http://arxiv.org/abs/
          <year>2006</year>
          .13155.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>T. B. Brown</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Mann</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Ryder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Subbiah</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Dhariwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Neelakantan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Shyam</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Sastry</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Askell</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Agarwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Herbert-Voss</surname>
            , G. Krueger,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Henighan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Child</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Ramesh</surname>
            ,
            <given-names>D. M.</given-names>
          </string-name>
          <string-name>
            <surname>Ziegler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Winter</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Hesse</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Chen</surname>
            , E. Sigler,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Litwin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Chess</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Berner</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>McCandlish</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Radford</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Amodei</surname>
          </string-name>
          ,
          <article-title>Language models are few-shot learners</article-title>
          , CoRR abs/
          <year>2005</year>
          .14165 (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/
          <year>2005</year>
          .14165. arXiv:
          <year>2005</year>
          .14165.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tessler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Lake</surname>
          </string-name>
          ,
          <article-title>Improving coherence and consistency in neural sequence models with dual-system, neuro-symbolic reasoning</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>34</volume>
          ,
          Curran Associates, Inc.,
          <year>2021</year>
          , pp.
          <fpage>25192</fpage>
          -
          <lpage>25204</lpage>
          . URL: https://proceedings.neurips.cc/paper/2021/file/d3e2e8f631bd9336ed25b8162aef8782-Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>H.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Krishnamurthy</surname>
          </string-name>
          ,
          <article-title>Exploring neural models for parsing natural language into first-order logic</article-title>
          ,
          <source>CoRR</source>
          (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/
          <year>2002</year>
          .06544.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>P.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Tafjord</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Richardson</surname>
          </string-name>
          ,
          <article-title>Transformers as soft reasoners over language</article-title>
          ,
          <year>2020</year>
          . URL: https://arxiv.org/abs/
          <year>2002</year>
          .05867. doi:
          <volume>10</volume>
          .48550/ARXIV.
          <year>2002</year>
          .
          <volume>05867</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hariharan</surname>
          </string-name>
          , L. van der Maaten, L.
          <string-name>
            <surname>Fei-Fei</surname>
            ,
            <given-names>C. L.</given-names>
          </string-name>
          <string-name>
            <surname>Zitnick</surname>
            ,
            <given-names>R. B.</given-names>
          </string-name>
          <string-name>
            <surname>Girshick</surname>
          </string-name>
          ,
          <article-title>CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning</article-title>
          ,
          <source>CoRR abs/1612</source>
          .06890 (
          <year>2016</year>
          ). URL: http://arxiv.org/abs/1612.06890.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>D.</given-names>
            <surname>Aditya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mukherji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Balasubramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Shakarian</surname>
          </string-name>
          , Pyreason: Software for open world temporal logic,
          <year>2023</year>
          . URL: https://arxiv.org/abs/2302.13482. doi:
          <volume>10</volume>
          .48550/ARXIV.2302.13482.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>