<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tommaso Carraro</string-name>
          <email>tcarraro@fbk.eu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Data and Knowledge Management Unit, Fondazione Bruno Kessler</institution>
          ,
          <addr-line>Via Sommarive, 18, 38123, Povo (TN)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Mathematics, University of Padova</institution>
          ,
          <addr-line>Via Trieste, 63, 35121, Padova (PD)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Despite being studied for over twenty years, Recommender Systems (RSs) still suffer from important issues that limit their applicability in real-world scenarios. Data sparsity, cold start, and explainability are some of the most impactful problems. Intuitively, these historical limitations can be mitigated by injecting prior knowledge into recommendation models. Neuro-Symbolic (NeSy) approaches are suitable candidates for achieving this goal. However, the application of such systems to RSs is still in its early stages, and most of the proposed architectures do not really exploit the advantages of a NeSy approach. To this end, we conducted preliminary experiments with a Logic Tensor Network (LTN), a novel NeSy framework. We used the LTN to train a vanilla Matrix Factorization model using a First-Order Logic knowledge base as an objective. In particular, we encoded facts to enable the regularization of the latent factors using content information, obtaining promising results. In this paper, we show our preliminary results with the LTN and propose interesting future works in this novel research area.</p>
      </abstract>
      <kwd-group>
        <kwd>recommender systems</kwd>
        <kwd>neuro-symbolic integration</kwd>
        <kwd>logic tensor networks</kwd>
        <kwd>first-order logic</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>while neural networks struggle. Then, (ii) symbolic methods are usually explainable by design, while neural networks are black boxes. Finally, (iii) symbolic methods manage to work in the absence of data (i.e., zero-shot learning), while this is impossible for neural networks. Ideally, by integrating these two paradigms, it could be possible to obtain a recommendation engine that deals well with sparsity, addresses cold-start cases, and makes its predictions less opaque.</p>
      <p>Many Neuro-Symbolic recommenders have been proposed recently [4, 5, 6]. However, these methods do not fully exploit the advantages of a NeSy system, or they have not been applied to overcome the aforementioned limitations, to which we believe they are particularly suited. For example, they implement the symbolic part using neural networks, or they use a logic that is too simple (e.g., propositional logic) to be expressive enough to model real-world problems [4]. To this end, in a preliminary experiment, we used a Logic Tensor Network (LTN) [7, 8] to encode logical axioms enabling the regularization of a vanilla Matrix Factorization (MF) model based on logical reasoning. We obtained promising results, showing that the benefits of the encoded knowledge increase with the sparsity of the user-item ratings.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Preliminary experiments</title>
      <p>In our preliminary experiments on Neuro-Symbolic Integration, we selected Logic Tensor Networks (LTN) [7] as the NeSy framework to perform recommendation. LTN allows learning a neural model using the satisfaction of a FOL knowledge base as an objective. Specifically, it defines a FOL language, called Real Logic, that allows mapping every symbolic expression to the domain of real numbers. By doing so, the logical formulas in the knowledge base form a computational graph that can be used for gradient-based optimization. To this end, Real Logic defines the grounding function 𝒢, which defines the mapping between the symbolic and real domains. Specifically, individuals are mapped to tensors of real values, variable symbols to sequences of individuals, functional symbols to real functions, and predicate symbols to real functions with output in [0, 1]. Then, connectives (i.e., ∧, ∨, ¬, ⟹) are mapped to fuzzy semantics [9], while quantifiers (i.e., ∀, ∃) are mapped to special aggregation functions (e.g., generalized means).</p>
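      <p>As a framework-agnostic sketch of these fuzzy semantics, the connectives and quantifiers can be implemented directly over truth values in [0, 1]. The operator choices below (product t-norm, probabilistic sum, Reichenbach implication, generalized means for the quantifiers) are one common configuration, not the only possible one, and the function names are ours:</p>

```python
import math

# Fuzzy connectives over truth values in [0, 1] (product semantics).
def f_not(a):         # negation: ¬a
    return 1.0 - a

def f_and(a, b):      # conjunction: product t-norm
    return a * b

def f_or(a, b):       # disjunction: probabilistic sum
    return a + b - a * b

def f_implies(a, b):  # implication: Reichenbach
    return 1.0 - a + a * b

# Quantifiers as generalized means over sequences of truth values.
def forall(values, p=2):
    # p-mean error: close to 1 only when all truth values are high
    return 1.0 - (sum((1.0 - v) ** p for v in values) / len(values)) ** (1.0 / p)

def exists(values, p=2):
    # p-mean: rewards the presence of at least one high truth value
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)
```

      <p>For example, forall([1.0, 1.0, 1.0]) evaluates to 1.0, while a single low truth value lowers the aggregate more than a plain average would as p grows, which makes the ∀ operator sensitive to outliers that violate a formula.</p>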
      <p>Intuitively, functional and predicate symbols can be represented as neural networks parameterized by θ. We refer to 𝒢(⋅|θ) as a parametric grounding, meaning the symbol depends on some parameters θ that can be learned. LTN allows learning parametric groundings by maximally satisfying a specified knowledge base 𝒦 = {φ₁, …, φₙ}, where φ₁, …, φₙ are closed formulas. More formally, the objective of the LTN is θ* = argmax_θ SatAgg_{φ∈𝒦} 𝒢(φ|θ), where 𝒢(φ|θ) means formula φ includes some functional or predicate symbols parameterized by θ. SatAgg : [0, 1]* ↦ [0, 1] is a formula aggregation operator, usually defined with the p-mean error ME_p [7], namely the fuzzy operator that represents ∀.</p>
      <p>In what follows, we refer to Likes(u, i) as a binary predicate returning whether a user u likes an item i. Note that 𝒢(Likes|θ) can be any recommendation model returning the prediction for a user-item pair in the dataset. In our experiments, we implemented 𝒢(Likes|θ) as a Matrix Factorization model. Specifically, 𝒢(Likes|θ) : u, i ↦ σ(U_u ⋅ I_i^⊤ + b^u_u + b^i_i), where U ∈ ℝ^{n×k}, I ∈ ℝ^{m×k}, b^u ∈ ℝ^n, and b^i ∈ ℝ^m are the users' and items' latent factors, and the users' and items' biases, respectively. n denotes the number of users, m the number of items, and k the number of latent factors. σ is the logistic function; it allows Likes to be interpreted as a fuzzy predicate.</p>
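      <p>A minimal, dependency-free sketch of this grounding follows; the class and variable names are ours, not from the original implementation, and in practice the parameters would be PyTorch tensors trained by gradient descent:</p>

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class MFLikes:
    """Grounding of the fuzzy predicate Likes(u, i) as Matrix Factorization:
    Likes(u, i) = sigmoid(U_u . I_i + b_u + b_i), so the output lies in (0, 1)."""

    def __init__(self, n_users, n_items, k, seed=0):
        rng = random.Random(seed)
        # latent factors, initialized with small Gaussian noise
        self.U = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
        self.I = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
        self.bu = [0.0] * n_users  # user biases
        self.bi = [0.0] * n_items  # item biases

    def __call__(self, u, i):
        dot = sum(a * b for a, b in zip(self.U[u], self.I[i]))
        return sigmoid(dot + self.bu[u] + self.bi[i])
```

      <p>The sketch only illustrates how the sigmoid turns the raw MF score into a fuzzy truth value that LTN can plug into logical formulas.</p>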
      <sec id="sec-3-1">
        <title>2.1. Recommendation loss function definition</title>
        <p>LTN allows defining the recommendation objective as a set of logical axioms. Thanks to the expressiveness of FOL, one can express fine-grained constraints that can represent complex loss functions. For example, the loss function for training an MF model could be given by the following axioms.</p>
        <p>∀(u⁺, i⁺) Likes(u⁺, i⁺)   (1)</p>
        <p>∀(u⁻, i⁻) ¬Likes(u⁻, i⁻)   (2)</p>
        <p>Intuitively, u⁺ and i⁺ are variable symbols denoting positive user-item pairs, while u⁻ and i⁻ denote negative user-item pairs. Axiom (1) states that for each positive user-item pair, the prediction of the MF model should be a positive truth value (i.e., the user should like the item). In contrast, Axiom (2) states that for each negative user-item pair, the prediction of the MF model should be a negative truth value, since the loss imposes maximizing the negation of Likes. In other words, by satisfying this knowledge base, the LTN learns how to train the MF to factorize the user-item matrix using the ground truth (i.e., the target ratings).</p>
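        <p>Under the fuzzy semantics above, satisfying this two-axiom knowledge base corresponds to minimizing 1 − SatAgg. A toy sketch of this loss (using the p-mean error both as the ∀ operator and as SatAgg; function names are ours) could look like this:</p>

```python
def p_mean_error(values, p=2):
    # Fuzzy 'forall' aggregator: 1 - (mean of (1 - v)^p)^(1/p)
    return 1.0 - (sum((1.0 - v) ** p for v in values) / len(values)) ** (1.0 / p)

def knowledge_base_loss(likes, pos_pairs, neg_pairs, p=2):
    """Loss = 1 - SatAgg(Axiom 1, Axiom 2) for a fuzzy predicate `likes`."""
    ax1 = p_mean_error([likes(u, i) for u, i in pos_pairs], p)        # forall Likes(u+, i+)
    ax2 = p_mean_error([1.0 - likes(u, i) for u, i in neg_pairs], p)  # forall not Likes(u-, i-)
    sat = p_mean_error([ax1, ax2], p)  # SatAgg over the two axioms
    return 1.0 - sat
```

      <p>A model that predicts high truth values on positive pairs and low truth values on negative pairs drives this loss toward 0, which is exactly the factorization behavior described above.</p>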
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Model regularization by logical reasoning</title>
        <p>Encoding additional information to regularize the recommendation model is straightforward. This can be done by encoding additional axioms that enable logical reasoning based on side or content information. Following this intuition, we conducted a preliminary experiment [6] to see whether LTN could enable an underlying MF model to reason about additional content information. In particular, an experiment that drastically reduces the density of the user-item ratings showed that the benefits of the encoded knowledge increase with the sparsity of the dataset, showing that our NeSy approach is beneficial in dealing with sparsity. Specifically, we added the following axiom to the previous knowledge base. Note the experiment was conducted on MindReader (https://mindreader.tech/dataset/), a movie recommendation dataset providing ratings for both movies and movie genres.</p>
        <p>∀(u?, i?) (∃g (¬LikesGenre(u?, g) ∧ HasGenre(i?, g))) ⟹ ¬Likes(u?, i?)   (3)</p>
        <p>In the formalization, u? and i? are variable symbols denoting user-movie pairs for which the rating is unknown, while g is a variable symbol denoting the movie genres of the movies in the dataset. Then, LikesGenre is a fixed (i.e., not learnable) binary predicate returning one if user u likes genre g, zero otherwise. Similarly, HasGenre is a fixed binary predicate returning one if movie i belongs to genre g, zero otherwise. Note these two predicates can be easily implemented as lookup tables filled with data from the dataset.</p>
        <p>Intuitively, Axiom (3) states that every time there is a user-movie pair for which we have no information (i.e., the rating is missing), if we know that user u? does not like some genre g of the associated movie i?, then u? should not like i?. This formula enables the underlying MF model to reason about relationships between users, movies, and movie genres. In this sense, it acts as a kind of logical regularization of the latent factors of the MF. In particular, we designed this axiom based on the idea that when no ratings are available, knowing something about movie genres is better than knowing nothing. This intuition is supported by our results [6], which show that the addition of the formula is crucial for dealing with sparsity.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Proposed directions</title>
      <p>This section proposes possible extensions of our NeSy model to solve different recommendation tasks.</p>
      <sec id="sec-4-1">
        <title>3.1. Hybrid recommendation</title>
        <p>In Section 2.1, we showed how LTN can be used to implement a recommendation model based on Collaborative Filtering (CF). It is well known that CF cannot deal with cold-start cases, as collaborative information for newly added users and items is unavailable. For this reason, hybrid recommendation models have been proposed. The idea is to merge CF with content-based recommendation to deal with cold start while maintaining all the advantages of CF. Implementing this idea in LTN is straightforward: it involves adding the following axiom to the knowledge base.</p>
        <p>∀(u, i, i′) (Likes(u, i) ∧ Sim(i, i′)) ⟹ Likes(u, i′)   (4)</p>
        <p>Intuitively, Axiom (4) states that for each triple (u, i, i′), where u is a user, i is an item, and i′ a cold-start item, if u likes i, and i and i′ are similar, then u should like i′ too. Note the predicate Sim can be implemented as a similarity measure based on content (e.g., movie genres in common) or latent (i.e., pre-trained embeddings) information. The only constraint is that its output has to be in the range [0, 1], as it has to be interpreted as a logical predicate by LTN. Clearly, this formula helps the model deal with cold-start item cases: when no information is available about some ratings, content information compensates for it. The LTN performs hybrid recommendation by finding a trade-off among Axiom (4), Axiom (1), and Axiom (2) in the objective. Note that a similar idea can also be used to deal with cold-start users.</p>
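        <p>As an illustration of the only constraint on Sim (output in [0, 1]), a simple content-based grounding is the Jaccard overlap of genre sets. The helper names below are ours, and the implication uses the same product/Reichenbach semantics sketched earlier:</p>

```python
def jaccard_sim(genres_a, genres_b):
    """Content-based grounding of Sim(i, i'): genre-set overlap, always in [0, 1]."""
    a, b = set(genres_a), set(genres_b)
    if not a and not b:
        return 0.0
    return len(a.intersection(b)) / len(a.union(b))

def axiom4_truth(likes_ui, sim_ii, likes_ui_cold):
    """Truth of (Likes(u, i) and Sim(i, i')) => Likes(u, i'), with product
    conjunction and the Reichenbach implication; all inputs are in [0, 1]."""
    antecedent = likes_ui * sim_ii
    return 1.0 - antecedent + antecedent * likes_ui_cold
```

      <p>When u likes i and the two items are very similar, the axiom is satisfied only if the model also scores the cold-start item i′ highly, which is how content information propagates to items without ratings.</p>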
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Cross-domain recommendation</title>
        <p>Cross-domain recommendation aims at mitigating data sparsity by transferring knowledge acquired from other domains (e.g., books, songs) to the target domain (e.g., movies). The source domain is usually denser than the target domain, as the objective is to compensate for sparsity in the latter. To this end, LTN can be used as an interface for transferring knowledge between domains. To do that, the following axiom can be added to the knowledge base.</p>
        <p>∀(u, b, m) (Likes_source(u, b) ∧ Sim(b, m)) ⟹ Likes_target(u, m)   (5)</p>
        <p>In the formalization, u, b, and m are variable symbols denoting users, books, and movies, respectively. Then, Likes_source is a recommendation model pre-trained on book ratings (i.e., the source domain), while Likes_target represents the recommendation model we want to train on the target domain (i.e., movie ratings in this example). Again, Sim is a predicate that can be implemented using a similarity measure based on content information (e.g., storyline) or latent structure. Intuitively, Axiom (5) states that every time a user u likes a book b in the source domain, if the book is similar to a movie m in the target domain, then u should like m. Clearly, this axiom allows the transfer of information between domains by logical reasoning. We conducted some preliminary experiments based on this idea and obtained promising results. Note one could apply Axiom (5) only when user-movie ratings are missing in the target domain. In such cases, knowing something about the source domain acts as a kind of data augmentation for the target domain and, hence, helps mitigate data sparsity.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Explainable recommendation</title>
        <p>LTN has not been designed just for learning. In particular, it also provides the possibility to query the knowledge base after the training phase. Querying is the process of grounding the knowledge base with novel data and checking the satisfaction level of its formulas. We believe this feature can be used to provide explanations for recommendations. For example, after we train a model with Axiom (1), Axiom (2), and Axiom (3), we could check the satisfaction level of the following formula for a test user u* to whom we recommended the movie i*.</p>
        <p>∃g (LikesGenre(u*, g) ∧ HasGenre(i*, g))</p>
        <p>Intuitively, if the formula is evaluated with a high truth value, it means our test user likes at least one movie genre of i*. Note that one can go through the computational graph of the formula to understand precisely which genre it is and provide this finding as an explanation.</p>
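        <p>A sketch of such a query, using max as the ∃ aggregator so that the best witness genre can be read off directly (a simplification of the p-mean used during training; the function name is ours):</p>

```python
def explain_recommendation(likes_genre, has_genre, u, i, genre_names):
    """Evaluate 'exists g: LikesGenre(u, g) and HasGenre(i, g)' with product
    conjunction and max as 'exists', returning the truth value and the genre
    that maximizes the conjunct (the witness offered as an explanation)."""
    truths = [(likes_genre[u][g] * has_genre[i][g], g)
              for g in range(len(genre_names))]
    best_truth, best_g = max(truths)
    return best_truth, genre_names[best_g]
```

      <p>If the returned truth value is high, the witness genre can be surfaced to the user, e.g., "recommended because you like comedy".</p>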
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusions</title>
      <p>In this paper, we proposed different ways to use a Neuro-Symbolic approach to mitigate important limitations of recommender systems and to solve interesting recommendation tasks. In particular, we showed the flexibility of LTN in designing heterogeneous recommendation objectives.</p>
    </sec>
    <sec id="sec-6">
      <title>5. Acknowledgments</title>
      <p>I would like to thank my supervisors, Luciano Serafini and Fabio Aiolli, for proposing that I work in this stimulating and ambitious research area. I also thank my colleague Alessandro Daniele, who provided additional supervision during my Ph.D. journey. Furthermore, I am grateful to the University of Padova for delivering this interesting doctoral program and for the excellent training provided during my journey. Finally, I thank Fondazione Bruno Kessler, which sponsored my Ph.D. travels worldwide and supported my research career.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Y. Hu, Y. Koren, C. Volinsky, Collaborative filtering for implicit feedback datasets, in: 2008 Eighth IEEE International Conference on Data Mining, 2008, pp. 263-272. doi:10.1109/ICDM.2008.22.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] T. Carraro, M. Polato, L. Bergamin, F. Aiolli, Conditioned variational autoencoder for top-n item recommendation, in: E. Pimenidis, P. Angelov, C. Jayne, A. Papaleonidas, M. Aydin (Eds.), Artificial Neural Networks and Machine Learning - ICANN 2022, Springer Nature Switzerland, Cham, 2022, pp. 785-796. doi:10.1007/978-3-031-15931-2_64.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] A. S. d'Avila Garcez, K. Broda, D. M. Gabbay, Neural-symbolic learning systems - foundations and applications, in: Perspectives in Neural Computing, 2012. doi:10.1007/978-1-4471-0211-3.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] H. Chen, S. Shi, Y. Li, Y. Zhang, Neural collaborative reasoning, in: Proceedings of the Web Conference 2021, WWW '21, Association for Computing Machinery, New York, NY, USA, 2021, pp. 1516-1527. doi:10.1145/3442381.3449973.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] G. Spillo, C. Musto, M. De Gemmis, P. Lops, G. Semeraro, Knowledge-aware recommendations based on neuro-symbolic graph embeddings and first-order logical rules, in: Proceedings of the 16th ACM Conference on Recommender Systems, RecSys '22, Association for Computing Machinery, New York, NY, USA, 2022, pp. 616-621. doi:10.1145/3523227.3551484.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] T. Carraro, A. Daniele, F. Aiolli, L. Serafini, Logic tensor networks for top-n recommendation, in: AIxIA 2022 - Advances in Artificial Intelligence: XXIst International Conference of the Italian Association for Artificial Intelligence, AIxIA 2022, Udine, Italy, November 28 - December 2, 2022, Proceedings, Springer-Verlag, Berlin, Heidelberg, 2023, pp. 110-123. doi:10.1007/978-3-031-27181-6_8.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] S. Badreddine, A. d'Avila Garcez, L. Serafini, M. Spranger, Logic tensor networks, Artificial Intelligence 303 (2022) 103649. doi:10.1016/j.artint.2021.103649.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] T. Carraro, LTNtorch: PyTorch implementation of Logic Tensor Networks, 2023. doi:10.5281/zenodo.7778157.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] E. van Krieken, E. Acar, F. van Harmelen, Analyzing differentiable fuzzy logic operators, Artificial Intelligence 302 (2022) 103602. doi:10.1016/j.artint.2021.103602.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>