<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tommaso Carraro</string-name>
          <email>tcarraro@fbk.eu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data and Knowledge Management Unit, Fondazione Bruno Kessler</institution>
          ,
          <addr-line>Via Sommarive, 18, 38123, Povo (TN)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mathematics, University of Padova</institution>
          ,
          <addr-line>Via Trieste, 63, 35121, Padova (PD)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Despite being studied for over twenty years, Recommender Systems (RSs) still suffer from important issues that limit their applicability in real-world scenarios. Data sparsity, cold start, and explainability are some of the most impactful problems. Intuitively, these historical limitations can be mitigated by injecting prior knowledge into recommendation models. Neuro-Symbolic (NeSy) approaches are suitable candidates for achieving this goal. However, the application of such systems to RSs is still in its early stages, and most of the proposed architectures do not really exploit the advantages of a NeSy approach. To this end, we conducted preliminary experiments with a Logic Tensor Network (LTN), a novel NeSy framework. We used the LTN to train a vanilla Matrix Factorization model using a First-Order Logic knowledge base as an objective. In particular, we encoded facts to enable the regularization of the latent factors using content information, obtaining promising results. In this paper, we show our preliminary results with the LTN and propose interesting future works in this novel research area.</p>
      </abstract>
      <kwd-group>
        <kwd>recommender systems</kwd>
        <kwd>neuro-symbolic integration</kwd>
        <kwd>logic tensor networks</kwd>
        <kwd>first-order logic</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>while neural networks struggle. Then, (ii) symbolic methods are usually explainable by design, while neural networks are black boxes. Finally, (iii) symbolic methods manage to work in the absence of data (i.e., zero-shot learning), while this is impossible for neural networks. Ideally, by integrating these two paradigms, it could be possible to obtain a recommendation engine that deals well with sparsity, addresses cold-start cases, and makes its predictions less opaque.</p>
      <p>Many Neuro-Symbolic recommenders have been proposed recently [4, 5, 6]. However, these methods do not fully exploit the advantages of a NeSy system, or they have not been applied to overcome the aforementioned limitations, to which we believe they are particularly suited. For example, they implement the symbolic part using neural networks, or they use a logic that is too simple (e.g., propositional logic) to be expressive enough to model real-world problems [4]. To this end, in a preliminary experiment, we used a Logic Tensor Network (LTN) [7, 8] to encode logical axioms enabling the regularization of a vanilla Matrix Factorization (MF) model based on logical reasoning. We obtained promising results, showing that the benefits of the encoded knowledge increase with the sparsity of the user-item ratings.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Preliminary experiments</title>
      <p>In our preliminary experiments on Neuro-Symbolic Integration, we selected Logic Tensor Networks (LTN) [7] as the NeSy framework to perform recommendation. LTN allows learning a neural model using the satisfaction of a FOL knowledge base as an objective. Specifically, it defines a FOL language, called Real Logic, that allows mapping every symbolic expression to the domain of real numbers. By doing so, the logical formulas in the knowledge base form a computational graph that can be used for gradient-based optimization. To this end, Real Logic defines the grounding function 𝒢, which defines the mapping between the symbolic and real domains. Specifically, individuals are mapped to tensors of real values, variable symbols to sequences of individuals, functional symbols to real functions, and predicate symbols to real functions with output in [0, 1]. Then, connectives (i.e., ∧, ∨, ¬, ⟹) are mapped to fuzzy semantics [9], while quantifiers (i.e., ∀, ∃) are mapped to special aggregation functions (e.g., generalized means).</p>
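      <p>As a framework-agnostic sketch of these fuzzy semantics, the connectives and quantifiers can be implemented directly over truth values in [0, 1]. The operator choices below (product t-norm, probabilistic sum, Reichenbach implication, generalized means for the quantifiers) are one common configuration, not the only possible one, and the function names are ours:</p>

```python
import math

# Fuzzy connectives over truth values in [0, 1] (product semantics).
def f_not(a):         # negation: ¬a
    return 1.0 - a

def f_and(a, b):      # conjunction: product t-norm
    return a * b

def f_or(a, b):       # disjunction: probabilistic sum
    return a + b - a * b

def f_implies(a, b):  # implication: Reichenbach
    return 1.0 - a + a * b

# Quantifiers as generalized means over sequences of truth values.
def forall(values, p=2):
    # p-mean error: close to 1 only when all truth values are high
    return 1.0 - (sum((1.0 - v) ** p for v in values) / len(values)) ** (1.0 / p)

def exists(values, p=2):
    # p-mean: rewards the presence of at least one high truth value
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)
```

      <p>For example, forall([1.0, 1.0, 1.0]) evaluates to 1.0, while a single low truth value lowers the aggregate more than a plain average would as p grows, which makes the ∀ operator sensitive to outliers that violate a formula.</p>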
      <p>Intuitively, functional and predicate symbols can be represented as neural networks parameterized by θ. We refer to 𝒢(⋅|θ) as a parametric grounding, meaning the symbol depends on some parameters θ that can be learned. LTN allows learning parametric groundings by maximally satisfying a specified knowledge base 𝒦 = {φ₁, …, φₙ}, where φ₁, …, φₙ are closed formulas. More formally, the objective of the LTN is θ* = argmax_θ SatAgg_{φ∈𝒦} 𝒢(φ|θ), where 𝒢(φ|θ) means formula φ includes some functional or predicate symbols parameterized by θ. SatAgg : [0, 1]* ↦ [0, 1] is a formula aggregation operator, usually defined with the p-mean error ME_p [7], namely the fuzzy operator that represents ∀.</p>
      <p>In what follows, we refer to Likes(u, i) as a binary predicate returning whether a user u likes an item i. Note that 𝒢(Likes|θ) can be any recommendation model returning the prediction for a user-item pair in the dataset. In our experiments, we implemented 𝒢(Likes|θ) as a Matrix Factorization model. Specifically, 𝒢(Likes|θ) : u, i ↦ σ(U_u ⋅ I_i^⊤ + b^u_u + b^i_i), where U ∈ ℝ^{n×k}, I ∈ ℝ^{m×k}, b^u ∈ ℝ^n, and b^i ∈ ℝ^m are the users' and items' latent factors, and the users' and items' biases, respectively. n denotes the number of users, m the number of items, and k the number of latent factors. σ is the logistic function; it allows Likes to be interpreted as a fuzzy predicate.</p>
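      <p>A minimal, dependency-free sketch of this grounding follows; the class and variable names are ours, not from the original implementation, and in practice the parameters would be PyTorch tensors trained by gradient descent:</p>

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class MFLikes:
    """Grounding of the fuzzy predicate Likes(u, i) as Matrix Factorization:
    Likes(u, i) = sigmoid(U_u . I_i + b_u + b_i), so the output lies in (0, 1)."""

    def __init__(self, n_users, n_items, k, seed=0):
        rng = random.Random(seed)
        # latent factors, initialized with small Gaussian noise
        self.U = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
        self.I = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
        self.bu = [0.0] * n_users  # user biases
        self.bi = [0.0] * n_items  # item biases

    def __call__(self, u, i):
        dot = sum(a * b for a, b in zip(self.U[u], self.I[i]))
        return sigmoid(dot + self.bu[u] + self.bi[i])
```

      <p>The sketch only illustrates how the sigmoid turns the raw MF score into a fuzzy truth value that LTN can plug into logical formulas.</p>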
      <sec id="sec-3-1">
        <title>2.1. Recommendation loss function definition</title>
        <p>LTN allows defining the recommendation objective as a set of logical axioms. Thanks to the expressiveness of FOL, one can express fine-grained constraints that can represent complex loss functions. For example, the loss function for training an MF model could be given by the following axioms.</p>
        <p>∀(u⁺, i⁺) Likes(u⁺, i⁺)   (1)</p>
        <p>∀(u⁻, i⁻) ¬Likes(u⁻, i⁻)   (2)</p>
        <p>Intuitively, u⁺ and i⁺ are variable symbols denoting positive user-item pairs, while u⁻ and i⁻ denote negative user-item pairs. Axiom (1) states that for each positive user-item pair, the prediction of the MF model should be a positive truth value (i.e., the user should like the item). In contrast, Axiom (2) states that for each negative user-item pair, the prediction of the MF model should be a negative truth value, since the loss imposes maximizing the negation of Likes. In other words, by satisfying this knowledge base, the LTN learns how to train the MF to factorize the user-item matrix using the ground truth (i.e., the target ratings).</p>
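        <p>Under the fuzzy semantics above, satisfying this two-axiom knowledge base corresponds to minimizing 1 − SatAgg. A toy sketch of this loss (using the p-mean error both as the ∀ operator and as SatAgg; function names are ours) could look like this:</p>

```python
def p_mean_error(values, p=2):
    # Fuzzy 'forall' aggregator: 1 - (mean of (1 - v)^p)^(1/p)
    return 1.0 - (sum((1.0 - v) ** p for v in values) / len(values)) ** (1.0 / p)

def knowledge_base_loss(likes, pos_pairs, neg_pairs, p=2):
    """Loss = 1 - SatAgg(Axiom 1, Axiom 2) for a fuzzy predicate `likes`."""
    ax1 = p_mean_error([likes(u, i) for u, i in pos_pairs], p)        # forall Likes(u+, i+)
    ax2 = p_mean_error([1.0 - likes(u, i) for u, i in neg_pairs], p)  # forall not Likes(u-, i-)
    sat = p_mean_error([ax1, ax2], p)  # SatAgg over the two axioms
    return 1.0 - sat
```

      <p>A model that predicts high truth values on positive pairs and low truth values on negative pairs drives this loss toward 0, which is exactly the factorization behavior described above.</p>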
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Model regularization by logical reasoning</title>
        <p>Encoding additional information to regularize the recommendation model is straightforward. This can be done by encoding additional axioms that enable logical reasoning based on side or content information. Following this intuition, we conducted a preliminary experiment [6] to see whether LTN could enable an underlying MF model to reason about additional content information. In particular, an experiment that drastically reduces the density of the user-item ratings showed that the benefits of the encoded knowledge increase with the sparsity of the dataset, showing that our NeSy approach is beneficial in dealing with sparsity. Specifically, we added the following axiom to the previous knowledge base. Note the experiment was conducted on MindReader (https://mindreader.tech/dataset/), a movie recommendation dataset providing ratings for both movies and movie genres.</p>
        <p>∀(u?, i?) (∃g (¬LikesGenre(u?, g) ∧ HasGenre(i?, g))) ⟹ ¬Likes(u?, i?)   (3)</p>
        <p>In the formalization, u? and i? are variable symbols denoting user-movie pairs for which the rating is unknown, while g is a variable symbol denoting the movie genres of the movies in the dataset. Then, LikesGenre is a fixed (i.e., not learnable) binary predicate returning one if user u likes genre g, zero otherwise. Similarly, HasGenre is a fixed binary predicate returning one if movie i belongs to genre g, zero otherwise. Note these two predicates can be easily implemented as lookup tables filled with data from the dataset.</p>
        <p>Intuitively, Axiom (3) states that every time there is a user-movie pair for which we have no information (i.e., the rating is missing), if we know that user u? does not like some genre g of the associated movie i?, then u? should not like i?. This formula enables the underlying MF model to reason about relationships between users, movies, and movie genres. In this sense, it acts as a kind of logical regularization of the latent factors of the MF. In particular, we designed this axiom based on the idea that when no ratings are available, knowing something about movie genres is better than knowing nothing. This intuition is supported by our results [6], which show that the addition of the formula is crucial for dealing with sparsity.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Proposed directions</title>
      <p>This section proposes possible extensions of our NeSy model to solve different recommendation tasks.</p>
      <sec id="sec-4-1">
        <title>3.1. Hybrid recommendation</title>
        <p>In Section 2.1, we showed how LTN can be used to implement a recommendation model based on Collaborative Filtering (CF). It is well known that CF cannot deal with cold-start cases, as collaborative information for newly added users and items is unavailable. For this reason, hybrid recommendation models have been proposed. The idea is to merge CF with content-based recommendation to deal with cold start while maintaining all the advantages of CF. Implementing this idea in LTN is straightforward: it involves adding the following axiom to the knowledge base.</p>
        <p>∀(u, i, i′) (Likes(u, i) ∧ Sim(i, i′)) ⟹ Likes(u, i′)   (4)</p>
        <p>Intuitively, Axiom (4) states that for each triple (u, i, i′), where u is a user, i is an item, and i′ a cold-start item, if u likes i, and i and i′ are similar, then u should like i′ too. Note the predicate Sim can be implemented as a similarity measure based on content (e.g., movie genres in common) or latent (i.e., pre-trained embeddings) information. The only constraint is that its output has to be in the range [0, 1], as it has to be interpreted as a logical predicate by LTN. Clearly, this formula helps the model deal with cold-start item cases: when no information is available about some ratings, content information compensates for it. The LTN performs hybrid recommendation by finding a trade-off among Axiom (4), Axiom (1), and Axiom (2) in the objective. Note that a similar idea can also be used to deal with cold-start users.</p>
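        <p>As an illustration of the only constraint on Sim (output in [0, 1]), a simple content-based grounding is the Jaccard overlap of genre sets. The helper names below are ours, and the implication uses the same product/Reichenbach semantics sketched earlier:</p>

```python
def jaccard_sim(genres_a, genres_b):
    """Content-based grounding of Sim(i, i'): genre-set overlap, always in [0, 1]."""
    a, b = set(genres_a), set(genres_b)
    if not a and not b:
        return 0.0
    return len(a.intersection(b)) / len(a.union(b))

def axiom4_truth(likes_ui, sim_ii, likes_ui_cold):
    """Truth of (Likes(u, i) and Sim(i, i')) => Likes(u, i'), with product
    conjunction and the Reichenbach implication; all inputs are in [0, 1]."""
    antecedent = likes_ui * sim_ii
    return 1.0 - antecedent + antecedent * likes_ui_cold
```

      <p>When u likes i and the two items are very similar, the axiom is satisfied only if the model also scores the cold-start item i′ highly, which is how content information propagates to items without ratings.</p>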
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Cross-domain recommendation</title>
        <p>Cross-domain recommendation aims at mitigating data sparsity by transferring knowledge acquired from other domains (e.g., books, songs) to the target domain (e.g., movies). The source domain is usually denser than the target domain, as the objective is to compensate for sparsity in the latter. To this end, LTN can be used as an interface for transferring knowledge between domains. To do that, the following axiom can be added to the knowledge base.</p>
        <p>∀(u, b, m) (Likes_source(u, b) ∧ Sim(b, m)) ⟹ Likes_target(u, m)   (5)</p>
        <p>In the formalization, u, b, and m are variable symbols denoting users, books, and movies, respectively. Then, Likes_source is a recommendation model pre-trained on book ratings (i.e., the source domain), while Likes_target represents the recommendation model we want to train on the target domain (i.e., movie ratings in this example). Again, Sim is a predicate that can be implemented using a similarity measure based on content information (e.g., storyline) or latent structure. Intuitively, Axiom (5) states that every time a user u likes a book b in the source domain, if the book is similar to a movie m in the target domain, then u should like m. Clearly, this axiom allows the transfer of information between domains by logical reasoning. We conducted some preliminary experiments based on this idea and obtained promising results. Note one could apply Axiom (5) only when user-movie ratings are missing in the target domain. In such cases, knowing something about the source domain acts as a kind of data augmentation for the target domain and, hence, helps mitigate data sparsity.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Explainable recommendation</title>
        <p>LTN has not been designed just for learning. In particular, it also provides the possibility to query the knowledge base after the training phase. Querying is the process of grounding the knowledge base with novel data and checking the satisfaction level of its formulas. We believe this feature can be used to provide explanations for recommendations. For example, after we train a model with Axiom (1), Axiom (2), and Axiom (3), we could check the satisfaction level of the following formula for a test user u* to whom we recommended the movie i*.</p>
        <p>∃g (LikesGenre(u*, g) ∧ HasGenre(i*, g))</p>
        <p>Intuitively, if the formula is evaluated with a high truth value, it means our test user likes at least one movie genre of i*. Note that one can go through the computational graph of the formula to understand precisely which genre it is and provide this finding as an explanation.</p>
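        <p>A sketch of such a query, using max as the ∃ aggregator so that the best witness genre can be read off directly (a simplification of the p-mean used during training; the function name is ours):</p>

```python
def explain_recommendation(likes_genre, has_genre, u, i, genre_names):
    """Evaluate 'exists g: LikesGenre(u, g) and HasGenre(i, g)' with product
    conjunction and max as 'exists', returning the truth value and the genre
    that maximizes the conjunct (the witness offered as an explanation)."""
    truths = [(likes_genre[u][g] * has_genre[i][g], g)
              for g in range(len(genre_names))]
    best_truth, best_g = max(truths)
    return best_truth, genre_names[best_g]
```

      <p>If the returned truth value is high, the witness genre can be surfaced to the user, e.g., "recommended because you like comedy".</p>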
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusions</title>
      <p>In this paper, we proposed different ways to use a Neuro-Symbolic approach to mitigate important limitations of recommender systems and to solve interesting recommendation tasks. In particular, we showed the flexibility of LTN in designing heterogeneous recommendation objectives.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Acknowledgments</title>
      <p>I would like to thank my supervisors, Luciano Serafini and Fabio Aiolli, for proposing that I work in this stimulating and ambitious research area. I also thank my colleague Alessandro Daniele, who provided additional supervision during my Ph.D. journey. Furthermore, I am grateful to the University of Padova for delivering this interesting doctoral program and for the excellent training provided during my journey. Finally, I thank Fondazione Bruno Kessler, which sponsored my Ph.D. travels worldwide and supported my research career.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Y. Hu, Y. Koren, C. Volinsky, Collaborative filtering for implicit feedback datasets, in: 2008 Eighth IEEE International Conference on Data Mining, 2008, pp. 263-272. doi:10.1109/ICDM.2008.22.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] T. Carraro, M. Polato, L. Bergamin, F. Aiolli, Conditioned variational autoencoder for top-n item recommendation, in: E. Pimenidis, P. Angelov, C. Jayne, A. Papaleonidas, M. Aydin (Eds.), Artificial Neural Networks and Machine Learning - ICANN 2022, Springer Nature Switzerland, Cham, 2022, pp. 785-796. doi:10.1007/978-3-031-15931-2_64.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] A. S. d'Avila Garcez, K. Broda, D. M. Gabbay, Neural-symbolic learning systems - foundations and applications, in: Perspectives in Neural Computing, 2012. doi:10.1007/978-1-4471-0211-3.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] H. Chen, S. Shi, Y. Li, Y. Zhang, Neural collaborative reasoning, in: Proceedings of the Web Conference 2021, WWW '21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 1516-1527. doi:10.1145/3442381.3449973.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] G. Spillo, C. Musto, M. De Gemmis, P. Lops, G. Semeraro, Knowledge-aware recommendations based on neuro-symbolic graph embeddings and first-order logical rules, in: Proceedings of the 16th ACM Conference on Recommender Systems, RecSys '22, Association for Computing Machinery, New York, NY, USA, 2022, pp. 616-621. doi:10.1145/3523227.3551484.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] T. Carraro, A. Daniele, F. Aiolli, L. Serafini, Logic tensor networks for top-n recommendation, in: AIxIA 2022 - Advances in Artificial Intelligence: XXIst International Conference of the Italian Association for Artificial Intelligence, AIxIA 2022, Udine, Italy, November 28 - December 2, 2022, Proceedings, Springer-Verlag, Berlin, Heidelberg, 2023, pp. 110-123. doi:10.1007/978-3-031-27181-6_8.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] S. Badreddine, A. d'Avila Garcez, L. Serafini, M. Spranger, Logic tensor networks, Artificial Intelligence 303 (2022) 103649. doi:10.1016/j.artint.2021.103649.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] T. Carraro, LTNtorch: PyTorch implementation of Logic Tensor Networks, 2023. doi:10.5281/zenodo.7778157.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] E. van Krieken, E. Acar, F. van Harmelen, Analyzing differentiable fuzzy logic operators, Artificial Intelligence 302 (2022) 103602. doi:10.1016/j.artint.2021.103602.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>