<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
<article-title>Solving Raven's Progressive Matrices via a Neuro-vector-symbolic Architecture</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Michael Hersche</string-name>
          <email>her@zurich.ibm.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mustafa Zeqiri</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Benini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abu Sebastian</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abbas Rahimi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ETH Zürich</institution>
          ,
          <addr-line>Rämistrasse 101, 8092 Zürich</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IBM Research-Zurich</institution>
          ,
          <addr-line>Säumerstrasse 4, 8803 Rüschlikon</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Human fluid intelligence is the ability to think and reason abstractly, and to make inferences in a novel domain. The Raven's progressive matrices (RPM) test has been a widely used assessment of fluid intelligence and visual abstract reasoning. Neuro-symbolic AI approaches display both perception and reasoning capabilities, but inherit the limitations of their individual deep learning and symbolic AI components, namely the so-called neural binding problem and exhaustive symbolic searches explained in the following.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>VSA representations can be composed, decomposed, probed, and transformed in various ways using a set of well-defined operations, including binding, unbinding, bundling, permutations, and associative memory.</p>
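      <p>These operations can be made concrete with a minimal sketch using bipolar (MAP-style) vectors; the dimensionality, codebooks, and exact VSA model used in the paper may differ, and this only illustrates the algebra of binding, unbinding, and bundling:</p>

```python
# Minimal sketch of core VSA operations with bipolar vectors (illustrative,
# not the exact VSA model of the paper).
import numpy as np

rng = np.random.default_rng(0)
d = 10_000                               # high-dimensional representation

def rand_vec():
    return rng.choice([-1, 1], size=d)   # random bipolar codevector

a, b, c = rand_vec(), rand_vec(), rand_vec()

bound = a * b                            # binding: elementwise product
bundle = np.sign(a + b + c)              # bundling: elementwise majority vote

def sim(x, y):                           # normalized similarity in [-1, 1]
    return float(x @ y) / d

# Binding with bipolar vectors is self-inverse: unbinding with b recovers a.
assert np.array_equal(bound * b, a)
# The bundle stays similar to each component but not to a fresh random vector.
print(sim(bundle, a), sim(bundle, rand_vec()))
```

      <p>The bound vector is quasi-orthogonal to both of its factors, while the bundle remains similar (similarity near 0.5 here) to every component it superposes; this pairing of a lossless, invertible binding with a lossy, set-like bundling is what the composition and probing below rely on.</p>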
      <p>
        In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we propose a neuro-vector-symbolic architecture (NVSA) consisting of a visual
perception frontend and a probabilistic reasoning backend, both tapping into the rich resources of
VSA as a general computing framework (see Fig. 1). The resulting NVSA frontend addresses the
binding problem in neural networks, especially the superposition catastrophe, by effectively
mapping the raw image of multiple objects to structural VSA representations that still
maintain the perceptual uncertainty. The NVSA backend maps the inferred probability mass
functions into another vector space of VSA such that the exhaustive probability computations
and searches can be substituted by algebraic operations in that vector space. The VSA
operations offer distributivity and computing-in-superposition, which significantly reduce the
computational costs, thus performing probabilistic abduction and execution in real time.
      </p>
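      <p>How algebraic operations can replace exhaustive probability computations can be sketched as follows. Assuming complex phasor (FHRR-style) codevectors, a simplification relative to the paper's actual codebooks, a probability mass function becomes a weighted superposition of codes, and a single binding of two encoded PMFs computes their entire convolution (as needed, e.g., for an arithmetic rule) in one operation instead of a loop over all value pairs:</p>

```python
# Hypothetical sketch: PMFs encoded in superposition with phasor codes, so
# that one binding computes a full convolution of two distributions.
import numpy as np

rng = np.random.default_rng(1)
d = 4096
base = np.exp(1j * rng.uniform(0, 2 * np.pi, d))  # random unit phasors

def code(k):
    # Elementwise powers satisfy code(j) * code(k) == code(j + k).
    return base ** k

def encode(pmf):
    # PMF -> weighted superposition of value codes.
    return sum(p * code(k) for k, p in enumerate(pmf))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.6, 0.4])

# One binding: distributivity turns the product of superpositions into a
# superposition over all sums j + k, weighted by p[j] * q[k].
joint = encode(p) * encode(q)

def decode(vec, m):
    # Probe the coefficient of code(m) via the normalized inner product.
    return (vec * np.conj(code(m))).mean().real

print([decode(joint, m) for m in range(4)])  # close to np.convolve(p, q)
```

      <p>The decoded coefficients approximate <monospace>np.convolve(p, q)</monospace> up to crosstalk noise that shrinks with the dimensionality; the exhaustive double loop over value pairs has been replaced by one elementwise product.</p>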
      <p>
        Results. Compared to the state-of-the-art deep neural network [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and neuro-symbolic approaches [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], end-to-end training (NVSA e2e tr.) of NVSA achieves a new record of 88.1% on the
I-RAVEN dataset (see Table 1). In a fully supervised setting in which the labels of the visual
attributes are given (NVSA attr. tr.), the NVSA frontend can be trained independently of the
backend with a novel additive cross-entropy loss, yielding the highest accuracy of 99.0%.
Moreover, compared to the symbolic reasoning within the state-of-the-art neuro-symbolic
approaches, the probabilistic reasoning of NVSA, with less expensive operations on the
distributed yet transparent representations, is two orders of magnitude faster.
      </p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Average accuracy (%) on the I-RAVEN.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Method</th><th>Avg</th><th>C</th><th>2x2</th><th>3x3</th><th>L-R</th><th>U-D</th><th>O-IC</th><th>O-IG</th></tr>
          </thead>
          <tbody>
            <tr><td>SCL [<xref ref-type="bibr" rid="ref3">3</xref>]</td><td>84.3</td><td>99.9</td><td>68.9</td><td>43.0</td><td>98.5</td><td>99.1</td><td>97.7</td><td>82.6</td></tr>
            <tr><td>PrAE [<xref ref-type="bibr" rid="ref2">2</xref>]</td><td>71.1</td><td>83.8</td><td>82.9</td><td>47.4</td><td>94.8</td><td>94.8</td><td>56.6</td><td>37.4</td></tr>
            <tr><td>NVSA (e2e tr.)</td><td>88.1</td><td>99.8</td><td>96.2</td><td>54.3</td><td>100</td><td>99.9</td><td>99.6</td><td>67.1</td></tr>
            <tr><td>NVSA (attr. tr.)</td><td>99.0</td><td>100</td><td>99.5</td><td>97.1</td><td>100</td><td>100</td><td>100</td><td>96.4</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>Generalization. Further, we analyze the generalization of the NVSA frontend to unseen
combinations of attribute values in novel objects. We observe that the frontend with the
multiplicative binding cannot generalize to unseen combinations of the attribute values; hence,
we enhance it with a multiplicative-additive encoding that can generalize up to 72%. The
multiplicative binding-based encoding, however, generalizes well to unseen combinations of
multiple objects. We also evaluate the out-of-distribution generalizability of the NVSA frontend
and backend with respect to unseen attribute–rule pairs. NVSA outperforms the deep learning
baselines by a large margin on all unseen attribute–rule pairs.</p>
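      <p>The gap between the two encodings can be illustrated with a toy sketch (the attribute codebooks below are hypothetical and stand in for the trained frontend): a purely multiplicative code of an unseen color-shape combination is quasi-orthogonal to both of its attribute vectors, whereas an additive component keeps the code similar to each attribute it contains, which is what an unseen combination can still be decoded from:</p>

```python
# Toy contrast of multiplicative vs. additive attribute encoding
# (hypothetical codebooks, illustrative only).
import numpy as np

rng = np.random.default_rng(2)
d = 10_000

def rv():
    return rng.choice([-1, 1], size=d)

color = {"red": rv(), "blue": rv()}      # hypothetical attribute codebooks
shape = {"square": rv(), "circle": rv()}

def mult_code(c, s):                     # multiplicative binding of attributes
    return color[c] * shape[s]

def add_code(c, s):                      # additive bundling of attributes
    return color[c] + shape[s]

def sim(x, y):
    return float(x @ y) / d

unseen = ("red", "circle")               # a combination never seen in training
print(sim(mult_code(*unseen), color["red"]))  # near 0: quasi-orthogonal
print(sim(add_code(*unseen), color["red"]))   # near 1: attribute recoverable
```

      <p>Each attribute remains individually readable from the additive code of a novel combination, while the multiplicative code of that combination resembles nothing seen during training; combining both terms is one way to keep the invertibility of binding while retaining per-attribute similarity.</p>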
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hersche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zeqiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Benini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sebastian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rahimi</surname>
          </string-name>
          ,
          <article-title>A neuro-vector-symbolic architecture for solving Raven's progressive matrices</article-title>
          ,
          <source>Nature Machine Intelligence</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-C.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Abstract spatial-temporal reasoning via probabilistic abduction and execution</article-title>
          , in: IEEE CVPR,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Grosse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>The scattering compositional learner: Discovering objects, attributes, relationships in analogical reasoning</article-title>
          , arXiv preprint arXiv:2007.04212 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>