<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Reasoning with Deep Learning: an Open Challenge</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Università degli Studi di Modena e Reggio Emilia</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Building machines capable of performing automated reasoning is one of the most complex but fascinating challenges in AI. In particular, providing an effective integration of learning and reasoning mechanisms is a long-standing research problem at the intersection of many different areas, such as machine learning, cognitive neuroscience, psychology, linguistics, and logic. The recent breakthrough achieved by deep learning methods in a variety of AI-related domains has opened novel research lines attempting to solve this complex and challenging task.</p>
      </abstract>
      <kwd-group>
        <kwd>Marco Lippi</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation</title>
      <p>
        In the last decade, deep learning has brought a real revolution in the area of
artificial intelligence (AI) and in many of its related fields, producing stunning results
in a variety of different application domains. In computer vision, image
classification and object detection systems can now be trained to recognize thousands of
different semantic categories [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which are sometimes difficult to distinguish even
for humans. Speech recognition and music retrieval can be performed with an
accuracy that was hard to imagine only one decade ago [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For natural language
processing and understanding, tasks such as machine translation or sentiment
analysis have moved huge steps forward with respect to earlier state-of-the-art
systems [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In addition, in many of these contexts, such successful applications
have also had a tremendous technological impact, with
all major ICT companies in the world (Google, Facebook, Microsoft, IBM, etc.)
now working in the field of AI more actively than ever before, aiming to
continuously develop more efficient and more accurate systems. Whereas these are
indeed impressive advancements, there is no doubt that many of the problems
that are really at the core of AI are far from being solved. This is particularly true
for those tasks that have to deal with reasoning operations, such as induction,
deduction, abduction, probabilistic inference, spatial or temporal reasoning, and
especially combinations of those. We can now build machines that can easily
and accurately translate a text between languages, that can spot whether an
object appears in an image or in a video, that are capable of recognizing spoken
language at very high accuracy levels, but which cannot yet answer higher-level
questions related to the content they have just processed. Building a machine
that can read any kind of short story, or watch a movie of any genre, and that
can answer simple questions about the plot and the characters, questions that a
child would certainly be able to answer, still remains a dream. Clearly, these are
extremely complex tasks, that humans learn to perform during the first years
of life, and that involve learning to analyze large amounts of information, to
extract and somehow store some form of knowledge from such information, and
finally to digest this information and reason about it.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methods</title>
      <p>
        Historically, there has always been a dichotomy between symbolic and
subsymbolic (often named connectionist) frameworks to model reasoning [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The
symbolic approach has its roots in the study of logic and philosophy, and it
sees reasoning as the capability of deriving additional information from that
already encoded in a collection of given symbols, by performing elaboration and
manipulation on the given structured representations. From the perspective of
connectionism, reasoning is instead the result or derivation of multiple,
interconnected, simple processing devices, one major example being neural networks. The
main motivation behind connectionism comes from cognitive neuroscience, since
the human neural circuitry is clearly capable of storing and retrieving
knowledge organized in short- and long-term memory, by continuously analyzing and
processing new, complex information, and reasoning upon it.
      </p>
      <sec id="sec-2-1">
        <title>Pioneering approaches</title>
        <p>
          Throughout the years, there have been many attempts to combine learning and
reasoning processes by integrating connectionist and symbolic paradigms.
Between the 1980s and the 1990s, a significant number of pioneering works started to
circulate, such as connectionist approaches to encode semantic networks [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], or
knowledge-based artificial neural networks, named KBANNs [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Within this
context, research has been mainly directed along distinct but strongly intertwined
directions: (i) inserting background knowledge into the structure of neural
networks, (ii) refining sets of rules via neural networks, (iii) extracting rules or
classification patterns from trained neural networks.
        </p>
        <p>
          The main idea behind KBANNs is that of considering input-to-output paths
in a neural network as sub-symbolic realizations of some symbolic rules given in
advance: output units can be thought of as the final conclusions of the rules, input
units are supporting facts and hidden units represent intermediate conclusions.
Standard backpropagation can be applied to tune the weights of the network, by
employing a training set, as for standard neural networks. This framework can be
adopted both to initialize the structure of a neural network with background
knowledge, and to extract a set of refined rules from the final learned network,
thus addressing the long-standing problem of neural network interpretability.
Similar approaches had been proposed for Recurrent Neural Networks to handle
sequential data [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Despite having shown promising results in computational
biology tasks [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], KBANNs have found applications only in small domains
and with simple rules. One of the main limitations of this model was in fact
the difficulty, in the 1990s, of training deep neural networks whose structure
was induced by complex rules. In this direction, the recent advancements of deep
learning could certainly offer a valuable contribution.
        </p>
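        <p>The correspondence between rules and network units described for KBANNs can be illustrated with a minimal sketch. It is not the original KBANN implementation: the two propositional rules, the weight magnitude OMEGA and the bias convention are assumptions chosen for illustration, and the subsequent refinement of the weights by backpropagation is omitted.</p>
        <preformat preformat-type="code">
```python
import math

OMEGA = 4.0  # assumed weight magnitude for links derived from rules


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical rules, listed so that every antecedent is defined before use:
# head -> (positive antecedents, negated antecedents)
RULES = {
    "intermediate": (["a", "b"], []),          # intermediate :- a, b
    "conclusion": (["intermediate"], ["c"]),   # conclusion :- intermediate, not c
}


def rule_to_unit(positives, negatives):
    """Encode one conjunctive rule as the weights and bias of a sigmoid unit."""
    weights = {name: OMEGA for name in positives}
    weights.update({name: -OMEGA for name in negatives})
    # The unit exceeds 0.5 only when all positive antecedents are active
    # and no negated antecedent is active.
    bias = -(len(positives) - 0.5) * OMEGA
    return weights, bias


def forward(facts):
    """Propagate input facts (0.0 or 1.0) through the rule-derived units."""
    activations = dict(facts)
    for head, (positives, negatives) in RULES.items():
        weights, bias = rule_to_unit(positives, negatives)
        total = bias + sum(w * activations[n] for n, w in weights.items())
        activations[head] = sigmoid(total)
    return activations


result = forward({"a": 1.0, "b": 1.0, "c": 0.0})
print(result["conclusion"] > 0.5)
```
        </preformat>
        <p>In this sketch the input units are the facts a, b and c, the hidden unit is the intermediate conclusion, and the output unit is the final conclusion, so each input-to-output path mirrors one chain of rules.</p>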
      </sec>
      <sec id="sec-2-2">
        <title>Combining symbolic and sub-symbolic methods</title>
        <p>
          More recent attempts to combine symbolic and sub-symbolic techniques for
reasoning include the research lines carried out by the so-called neural-symbolic
community [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Several theoretical results have been successfully achieved in this
area. Many studies have been conducted on the neural binding problem [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], which
aims to explain how connections between different brain regions are coordinated,
so as to retrieve and manipulate information, activate distant neural circuits, and
finally perform reasoning. Other research has focused on the analysis of the
capability of neural networks to represent modal and temporal logics [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] as well as
fragments of first-order logic [
          <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
          ]. Despite being successfully applied in some
proof-of-concept settings, the existing neural-symbolic approaches still lack a
thorough application on large-scale, real-world problems.
        </p>
        <p>
          Starting from a slightly different perspective, the area of statistical relational
learning [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] (also known as probabilistic inductive logic programming) was born
at the end of the 90s with similar goals. Statistical relational learning aims to
combine the expressive power of logic representations with models handling
uncertainty in data, such as statistical learning approaches and graphical models.
Few attempts have been made in the direction of employing neural networks
within this context. An example is given by ground-specific Markov logic
networks [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], which allow neural networks to be embedded within the Markov logic
framework, by learning the weights of the probabilistic logic clauses. The method
has been successfully applied to bioinformatics and time-series forecasting, for
problems where there is a crucial need to model background knowledge, handle
structured data, and perform probabilistic inference. Yet, it was never used to
handle reasoning tasks.
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Recent advances: deep learning</title>
        <p>
          In recent years, the task of reasoning with (deep) connectionist models has
attracted enormous interest, as evidenced by the approaches
proposed by some of the big companies that are currently investing in deep
learning. This is the case of Neural Turing Machines by Google DeepMind [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ],
Memory Networks developed at Facebook AI Research [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], Dynamic Memory
Networks proposed by MetaMind [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], the Neural Reasoner [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] by Huawei
Technologies, and the Watson system developed by IBM [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. Additional methods
that are worth mentioning in this context are Wolfram Alpha, a computational
knowledge engine which is capable of handling and manipulating encyclopedic
knowledge to perform question-answering, and the GeoS system developed by
the Allen Institute [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], which can solve geometry problems from Scholastic Aptitude Tests at
the level of the average US student.
        </p>
        <p>Many of these methods employ a purely sub-symbolic framework, relying on
supervised datasets to train a deep architecture from collections of examples.
Most of these approaches share the common idea that a connectionist model
aiming to perform reasoning has to maintain some memory that has to be
efficiently organized and queried in order to retrieve the information necessary to
provide solutions for the desired tasks. Memory Networks, for example, use a
dedicated neural network for each step in the process of retrieving the correct
answer to a given question: (i) computing feature representations for the input,
(ii) updating memory, (iii) combining input and memory to compute the output,
(iv) translating the output into an interpretable answer. In its original
implementation, this model is presented in a purely supervised fashion, but extensions
to semi-supervised settings are considered as well.</p>
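        <p>The four steps listed for Memory Networks can be sketched as a toy pipeline. This is only an illustration of the control flow: in the actual model each step is a learned neural component, whereas here every function is a hand-written stand-in based on word overlap, and the function names, the story and the answer vocabulary are all hypothetical.</p>
        <preformat preformat-type="code">
```python
from collections import Counter


def featurize(text):
    """(i) Compute a feature representation: here, a bag of words."""
    cleaned = text.lower().replace("?", " ").replace(".", " ")
    return Counter(cleaned.split())


def update_memory(memory, features):
    """(ii) Update memory: here, append the new representation to a list."""
    memory.append(features)
    return memory


def compute_output(memory, question):
    """(iii) Combine input and memory: pick the slot sharing most words."""
    def overlap(slot):
        return sum(min(count, slot[word]) for word, count in question.items())
    return max(memory, key=overlap)


def respond(selected, question, vocabulary):
    """(iv) Translate the output into a single-word answer."""
    candidates = [w for w in selected if w in vocabulary and w not in question]
    return candidates[0] if candidates else None


# Hypothetical story and answer vocabulary, used only for illustration.
story = ["Julie went to the park.", "Bob went to the cinema."]
places = {"park", "cinema", "school"}

memory = []
for sentence in story:
    update_memory(memory, featurize(sentence))

question = featurize("Where did Bob go?")
answer = respond(compute_output(memory, question), question, places)
print(answer)
```
        </preformat>
        <p>The sketch retrieves the sentence that best matches the question and reads the answer off it. It performs matching and retrieval over a memory, with no deduction or temporal reasoning involved.</p>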
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Discussion</title>
      <p>
        Although producing remarkable advancements, recent approaches to reasoning
with deep networks do not properly address the task of symbolic reasoning,
thus leaving the problem of neural network interpretability unsolved. Most of
the effort is in fact devoted to an efficient management of the memory of
the network, and to fast matching and retrieval algorithms. Some of the
existing approaches have been compared on a collection of benchmarks, called
bAbI tasks [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], developed at Facebook AI Research. Such tasks include
simple question answering problems that typically require performing some kind of
reasoning and answer with a single word. The following is an example:
In the afternoon Julie went to the park. Yesterday Julie was at school. Julie
went to the cinema this evening. Where did Julie go after the park? Cinema
To answer such questions, the system needs to perform many advanced
operations. First, it has to process the text and store the information in some form
of memory, since even a short story like the one in the above example contains
plenty of information. Then, it has to understand which pieces of knowledge
are relevant to a given question, in order to finally formulate some
hypothesis and provide the correct answer. These final steps include complex reasoning
mechanisms, such as deduction and uncertainty handling, as well as temporal
reasoning. Such skills are completely different from the technology that is present
in existing sophisticated question answering systems, which mainly exploit
encyclopedic background knowledge and answer highly specific questions.
      </p>
      <p>
        Big data. The recent, impressive success of deep learning across several
different areas of AI is certainly strongly related to the availability of huge datasets,
which nowadays can be easily collected from various heterogeneous data
sources over the Web, and also to the advancements in computer hardware
performance, which have dramatically reduced computation times. From
a theoretical point of view, models that are currently employed in many
systems were already known decades ago, but efficient techniques for training them
successfully have been proposed only in recent years [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. This is the case, for
example, of Convolutional and Recurrent Neural Networks, now representing the
state-of-the-art in a wide number of tasks. The injection of background
knowledge in the structure of such networks is yet to be investigated.
      </p>
      <p>
        Unsupervised learning. Among the open challenges, a crucial point is to
automatically extract knowledge from data, and to encode it into a neural network
model, rather than employing expert-given knowledge. Clearly, most of the
existing methods for information extraction and knowledge representation employ
supervised or at least semi-supervised data. In the future, however, we expect that a
key contribution will come from unsupervised learning approaches, including for the
extraction of commonsense knowledge. The advantages of using unsupervised data are
undeniable: generating labeled corpora is in fact an extremely complex,
time-consuming and costly operation, whereas unsupervised data are everywhere,
available in a variety of different domains (text, video, audio, etc.).
Unsupervised learning algorithms could be employed to extract relevant features and
patterns from data. Although some algorithms for unsupervised learning have
played a crucial role for the development of the whole deep learning area, it is
widely recognized that a proper use of unsupervised data is still missing [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
      </p>
      <p>
        Incremental learning. Humans naturally implement a lifelong learning scheme,
continuously acquiring knowledge. Such a feature seems to be a crucial element
for the development of reasoning skills and thus it is likely that future attempts
at this task will need to implement a dynamic, on-line mechanism that
incrementally acquires knowledge, possibly by also changing the network topology.
      </p>
      <p>
        Beyond the Turing test? Reasoning tasks could certainly be employed in an
advanced version of the Turing test. Recently, the computer vision community
has proposed the Visual Turing Challenge [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] where automated vision systems
have to answer questions regarding the content of some images or videos, thus
requiring both visual and linguistic skills. The bAbI tasks [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] mentioned
above are another example of a benchmark that could in the future be
integrated with an advanced Turing test.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Krizhevsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G.E.</given-names>
          </string-name>
          :
          <article-title>ImageNet classification with deep convolutional neural networks</article-title>
          .
          <source>In: Advances in NIPS</source>
          . (
          <year>2012</year>
          )
          <fpage>1097</fpage>
          –
          <lpage>1105</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pham</surname>
            ,
            <given-names>P.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Largman</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          :
          <article-title>Unsupervised feature learning for audio classification using convolutional deep belief networks</article-title>
          .
          <source>In: Advances in NIPS</source>
          . (
          <year>2009</year>
          )
          <fpage>1096</fpage>
          –
          <lpage>1104</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Merrienboer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gulcehre</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bougares</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwenk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Learning phrase representations using RNN encoder-decoder for statistical machine translation</article-title>
          .
          <source>In: Proceedings of EMNLP</source>
          . (
          <year>2014</year>
          )
          <fpage>1724</fpage>
          –
          <lpage>1734</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Dinsmore</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>The symbolic and connectionist paradigms: closing the gap</article-title>
          .
          <source>Lawrence Erlbaum</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Shastri</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A connectionist approach to knowledge representation and limited inference</article-title>
          .
          <source>Cognitive Science</source>
          <volume>12</volume>
          (
          <year>1988</year>
          )
          <fpage>331</fpage>
          –
          <lpage>392</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Towell</surname>
            ,
            <given-names>G.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shavlik</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          :
          <article-title>Knowledge-based artificial neural networks</article-title>
          .
          <source>Artificial Intelligence</source>
          <volume>70</volume>
          (
          <year>1994</year>
          )
          <fpage>119</fpage>
          –
          <lpage>165</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Frasconi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gori</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maggini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soda</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Unified integration of explicit knowledge and learning by example in recurrent networks</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>7</volume>
          (
          <year>1995</year>
          )
          <fpage>340</fpage>
          –
          <lpage>346</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>d'Avila Garcez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gori</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          :
          <article-title>Neural-symbolic learning and reasoning (Dagstuhl Seminar 14381)</article-title>
          .
          <source>Dagstuhl Reports</source>
          <volume>4</volume>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Feldman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>The neural binding problem(s)</article-title>
          .
          <source>Cognitive Neurodynamics</source>
          <volume>7</volume>
          (
          <year>2013</year>
          )
          <fpage>1</fpage>
          –
          <lpage>11</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Garcez</surname>
            ,
            <given-names>A.S.d.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          :
          <article-title>A connectionist computational model for epistemic and temporal reasoning</article-title>
          .
          <source>Neural Computation</source>
          <volume>18</volume>
          (
          <year>2006</year>
          )
          <fpage>1711</fpage>
          –
          <lpage>1738</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Bader</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hitzler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Holldobler, S.:
          <article-title>Connectionist model generation: A rst-order approach</article-title>
          .
          <source>Neurocomputing</source>
          <volume>71</volume>
          (
          <year>2008</year>
          )
          <volume>2420</volume>
          {
          <fpage>2432</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Garcez</surname>
            ,
            <given-names>A.S.d.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gabbay</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          :
          <article-title>Neural-symbolic cognitive reasoning</article-title>
          .
          <publisher-name>Springer Science &amp; Business Media</publisher-name>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Getoor</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taskar</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning)</article-title>
          . The MIT Press (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Lippi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frasconi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Prediction of protein β-residue contacts by Markov logic networks with grounding-specific weights</article-title>
          .
          <source>Bioinformatics</source>
          <volume>25</volume>
          (
          <year>2009</year>
          )
          <fpage>2326</fpage>
          –
          <lpage>2333</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Graves</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wayne</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Danihelka</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Neural Turing machines</article-title>
          .
          <source>arXiv preprint arXiv:1410.5401</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Sukhbaatar</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weston</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fergus</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , et al.:
          <article-title>End-to-end memory networks</article-title>
          .
          <source>In: Advances in neural information processing systems</source>
          . (
          <year>2015</year>
          )
          <fpage>2440</fpage>
          –
          <lpage>2448</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manning</surname>
            ,
            <given-names>C.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Reasoning with neural tensor networks for knowledge base completion</article-title>
          .
          <source>In: NIPS</source>
          . (
          <year>2013</year>
          )
          <fpage>926</fpage>
          –
          <lpage>934</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Peng</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>K.F.</given-names>
          </string-name>
          :
          <article-title>Towards neural network-based reasoning</article-title>
          .
          <source>arXiv preprint arXiv:1508.05508</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Gliozzo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Biran</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patwardhan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McKeown</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Semantic technologies in IBM Watson™</article-title>
          .
          <source>ACL 2013</source>
          (
          <year>2013</year>
          )
          <fpage>85</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Seo</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hajishirzi</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farhadi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etzioni</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malcolm</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Solving geometry problems: Combining text and diagram interpretation</article-title>
          .
          <source>In: Proceedings of EMNLP</source>
          . (
          <year>2015</year>
          )
          <fpage>17</fpage>
          –
          <lpage>21</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Weston</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bordes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chopra</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rush</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
          , van Merrienboer,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Joulin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>Towards AI-complete question answering: A set of prerequisite toy tasks</article-title>
          .
          <source>arXiv preprint arXiv:1502.05698</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>LeCun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Deep learning</article-title>
          .
          <source>Nature</source>
          <volume>521</volume>
          (
          <year>2015</year>
          )
          <fpage>436</fpage>
          –
          <lpage>444</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Serre</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kouh</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cadieu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knoblich</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kreiman</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poggio</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex</article-title>
          .
          <source>Technical report</source>
          , MIT (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Krawczyk</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McClelland</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Donovan</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          :
          <article-title>A hierarchy for relational reasoning in the prefrontal cortex</article-title>
          .
          <source>Cortex</source>
          <volume>47</volume>
          (
          <year>2011</year>
          )
          <fpage>588</fpage>
          –
          <lpage>597</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Shan</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Adams</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Curless</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Furukawa</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seitz</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          :
          <article-title>The visual Turing test for scene reconstruction</article-title>
          .
          <source>In: 2013 International Conference on 3D Vision (3DV 2013)</source>
          , IEEE (
          <year>2013</year>
          )
          <fpage>25</fpage>
          –
          <lpage>32</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>