AIRec: Attentive Intersection Model for
             Tag-Aware Recommendation

                 Bo Chen¹, Dong Wang¹, Yue Ding¹, and Xin Xin²
      ¹ Shanghai Jiao Tong University {chenbo.31, wangdong, dingyue}@sjtu.edu.cn
                 ² University of Glasgow x.xin.1@research.gla.ac.uk


         Abstract. Tag-aware recommender systems (TRS) utilize rich tagging
         information to better depict user portraits and item features. Existing
         methods fail to capture multi-aspect user preferences and do not
         explore the intersection of user and item tags. In this work, we propose
         an attentive intersection model (AIRec) to address these issues. User
         representations are constructed via a hierarchical attention network,
         where the item-level attention differentiates the contributions of
         interacted items and the preference-level attention weighs the relative
         salience of explicit and implicit preferences. In addition, the tags
         intersection is exploited to enhance the learning of conjunct features.
         Finally, we combine factorization machines (FM) with Bayesian
         personalized ranking (BPR) for score prediction. Experiments on two
         real-world datasets demonstrate significant improvements of AIRec over
         state-of-the-art methods for tag-aware top-n recommendation.


1      Introduction

Social tagging systems, also known as folksonomies, are widely used in various
websites, where users can freely annotate online resources (e.g., movies, artists)
with arbitrary tags. These tags consist of concise words or phrases, which
not only indicate user preferences but also summarize the features of items.
Consequently, user-defined tags can be introduced into recommender systems
for alleviating the cold-start problem and improving recommendation quality.
    To address the sparsity, ambiguity and redundancy of the tag space,
several neural network-based methods, such as CFA [5], DSPR-NS [3] and
TRSDL [1], map the tag space into a dense latent space. Although
these models have made progress, two weaknesses limit their performance.
First, they construct user representations from either explicit tagging
behaviors (e.g., DSPR-NS) or implicitly interacted items (e.g., TRSDL), which is
inadequate to capture multi-aspect user preferences. Second, the intersection of
user and item tags reflects the diverse focuses of different users and is a key
driver of user-item transactions, yet few studies have explored it.
     Dong Wang is the corresponding author. Copyright © 2019 for this paper by its
     authors. Use permitted under Creative Commons License Attribution 4.0
     International (CC BY 4.0).

    In this paper, we address the drawbacks mentioned above and propose an
Attentive Intersection Recommendation model (AIRec) for TRS. Compared with
previous models, our method not only takes both explicit and implicit
preferences into consideration, capturing a more accurate user portrait via a
hierarchical attention network, but also makes full use of the tags
intersection to improve performance.


2    The AIRec Model
In this section, we will present the architecture of our proposed AIRec model and
explain the training procedure. Figure 1 illustrates the structure of our model.
[Figure 1: architecture diagram. Input layer: item tags y_i, intersection module i_iu = y_i ∩ x_u, user tags x_u, and historical items ỹ_1, ..., ỹ_T. Hidden layers: MLPs with shared parameters. Hybrid user model: item-level attention and preference-level attention. Prediction layer: concatenation followed by factorization machines, producing ŷ_ui. Training: BPR.]

                                        Fig. 1. The structure of the AIRec model.

Input Layer and Hidden Layers The user feature vector is constructed as
$x_u = (p_1^u, p_2^u, \ldots, p_V^u)$, where $V$ is the size of the tag set and
$p_j^u = |\{(u, i, t_j) \in A \mid i \in I\}|$ is the number of times that user $u$
annotates items with tag $t_j$ ($A$ being the set of user-item-tag assignments and
$I$ the item set). Similarly, the item feature vector can be represented as
$y_i = (q_1^i, q_2^i, \ldots, q_V^i)$.
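
As an illustration of how these count vectors can be built, the following is a minimal sketch in plain Python. The toy assignment list, the tag vocabulary and the helper names `user_vector` and `item_vector` are hypothetical and are not taken from the paper.

```python
from collections import Counter

def user_vector(assignments, user, tags):
    """Count how often `user` applied each tag (one entry per tag in `tags`)."""
    counts = Counter(t for (u, _item, t) in assignments if u == user)
    return [counts[t] for t in tags]

def item_vector(assignments, item, tags):
    """Count how often `item` received each tag (one entry per tag in `tags`)."""
    counts = Counter(t for (_user, i, t) in assignments if i == item)
    return [counts[t] for t in tags]

# Hypothetical toy assignments of the form (user, item, tag).
A = [
    ("u1", "i1", "rock"),
    ("u1", "i2", "rock"),
    ("u1", "i2", "live"),
    ("u2", "i1", "rock"),
    ("u2", "i1", "indie"),
]
tags = ["rock", "live", "indie"]  # fixed tag order, V = 3

x_u1 = user_vector(A, "u1", tags)  # [2, 1, 0]
y_i1 = item_vector(A, "i1", tags)  # [2, 0, 1]
```

Both vectors share the same tag order, which is what later allows them to be compared position by position.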
    To cope with the sparsity and high dimensionality of these vectors, $x_u$ and
$y_i$ are fed into multi-layer perceptrons (MLPs) with shared parameters. Sharing
parameters not only improves generalization and reduces computational overhead,
but also forces the networks to describe users and items in the same feature space.
The latent representations of user and item are $\tilde{x}_u^1 = h(x_u)$ and
$\tilde{y}_i = h(y_i)$, where $h$ denotes the shared MLP.
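
Parameter sharing can be sketched as a single encoder function applied to both inputs. The layer sizes, the ReLU activation and the random initialization below are assumptions made for illustration; the paper does not specify them here.

```python
import random

def make_mlp(sizes, seed=0):
    """Create one (weights, biases) pair per layer; sizes are hypothetical."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def h(params, x):
    """Forward pass of the shared encoder (ReLU after every layer, assumed)."""
    for W, b in params:
        x = [max(0.0, sum(w * v for w, v in zip(row, x)) + bias)
             for row, bias in zip(W, b)]
    return x

params = make_mlp([3, 4, 2])      # V = 3 tags -> 4 hidden units -> 2 latent dims

# The same `params` encode the user vector and the item vector,
# so both land in the same latent space.
x_tilde_1 = h(params, [2, 1, 0])  # explicit user preferences
y_tilde = h(params, [2, 0, 1])    # item representation
```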

Hybrid User Model To capture multi-aspect user preferences, we should consider
not only the explicit preferences $\tilde{x}_u^1$ reflected by the user's own tagging
behaviors, but also the implicit preferences $\tilde{x}_u^2$ conveyed by the historically
interacted items. To this end, we design a hybrid user model with a hierarchical
attention network. The item-level attention derives the implicit preferences
$\tilde{x}_u^2$ by differentiating the contributions of historical items, while the
preference-level attention dynamically weighs explicit tagging behaviors against
implicit preferences to obtain the hybrid user representation $\tilde{x}_u^H$.
    In the item-level attention, we leverage an additive attention network to
differentiate the contributions of items by measuring the similarity between each
item representation and the explicit preferences $\tilde{x}_u^1$. Let $I_u$ denote the
set of historical items of user $u$, and let $\tilde{y}_k$ be the representation of the
$k$th item $i_k \in I_u$. The attention weight $\alpha(u, k)$ can be interpreted as the
contribution of the $k$th item to the implicit preferences:

$$\alpha(u, k) = \mathrm{softmax}\left(v_1^{\top}\, \mathrm{ReLU}\left(W^0 \tilde{x}_u^1 + W^1 \tilde{y}_k + b^1\right)\right), \tag{1}$$

where the matrices $W^0$, $W^1$ and the vectors $b^1$, $v_1$ are trainable parameters,
and the softmax normalizes over the items in $I_u$. Finally, the implicit preferences
are computed as $\tilde{x}_u^2 = \sum_{i_k \in I_u} \alpha(u, k)\, \tilde{y}_k$.
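
The item-level attention of Eq. (1) can be sketched in plain Python as follows. The two-dimensional toy vectors and the identity weight matrices are hypothetical values chosen so the result is easy to check by hand; in the model these parameters are learned.

```python
import math

def matvec(W, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    """Softmax with max-subtraction for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def item_level_attention(x1, items, W0, W1, b1, v1):
    """Return attention weights over `items` and their weighted sum."""
    user_part = matvec(W0, x1)  # same for every item, so computed once
    scores = []
    for y in items:
        hidden = [max(0.0, a + c + d)
                  for a, c, d in zip(user_part, matvec(W1, y), b1)]
        scores.append(sum(v * hc for v, hc in zip(v1, hidden)))
    alpha = softmax(scores)     # normalized over the user's historical items
    dim = len(items[0])
    x2 = [sum(a * y[d] for a, y in zip(alpha, items)) for d in range(dim)]
    return alpha, x2

# Hypothetical toy values.
identity = [[1.0, 0.0], [0.0, 1.0]]
x1 = [1.0, 0.0]                       # explicit preferences
items = [[1.0, 0.0], [0.0, 1.0]]      # two historical item representations
alpha, x2 = item_level_attention(x1, items, identity, identity,
                                 b1=[0.0, 0.0], v1=[1.0, 0.0])
```

With these values the first item, which matches the explicit preferences, receives the larger weight.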
    The hybrid user representation is obtained by fusing $\tilde{x}_u^1$ with
$\tilde{x}_u^2$. Rather than manually setting a single hyper-parameter $\beta$ for all
users to control the trade-off, we design a self-attentive fusion mechanism that
adapts to individual users. Similarly, the attention weight $\beta(u, k)$ of the $k$th
part ($k \in \{1, 2\}$) is:

$$\beta(u, k) = \mathrm{softmax}\left(v_2^{\top}\, \mathrm{ReLU}\left(W^2 \tilde{x}_u^k + b^2\right)\right), \tag{2}$$

and the hybrid user representation is formulated as
$\tilde{x}_u^H = \beta(u, 1)\, \tilde{x}_u^1 + \beta(u, 2)\, \tilde{x}_u^2$.
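
A minimal sketch of the preference-level fusion in Eq. (2) is given below, assuming the softmax is taken over the two parts (explicit and implicit). The weights and input vectors are hypothetical toy values, not learned parameters.

```python
import math

def preference_level_fusion(x1, x2, W2, b2, v2):
    """Fuse explicit (x1) and implicit (x2) preferences with learned weights."""
    def score(x):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(W2, b2)]
        return sum(v * hc for v, hc in zip(v2, hidden))

    s1, s2 = score(x1), score(x2)
    m = max(s1, s2)                      # subtract the max for stability
    e1, e2 = math.exp(s1 - m), math.exp(s2 - m)
    beta1, beta2 = e1 / (e1 + e2), e2 / (e1 + e2)
    x_h = [beta1 * a + beta2 * c for a, c in zip(x1, x2)]
    return (beta1, beta2), x_h

# Hypothetical toy values.
identity = [[1.0, 0.0], [0.0, 1.0]]
(beta1, beta2), x_h = preference_level_fusion(
    x1=[1.0, 0.0], x2=[0.0, 1.0],
    W2=identity, b2=[0.0, 0.0], v2=[2.0, 1.0],
)
```

Because the weights depend on each user's own vectors, the balance between explicit and implicit preferences can differ from user to user, unlike a global trade-off constant.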


Intersection Module Item features are multi-dimensional and attract different
users for different reasons. The intersection of user and item tags reveals why
the user focuses on the item and which dimensions are vital when modeling this
transaction. Motivated by this observation, we design an intersection module that
extracts this intersection to further enhance recommendation performance. First,
we calculate the tags intersection as
$i_{iu} = y_i \cap x_u = (r_1^{iu}, r_2^{iu}, \ldots, r_V^{iu})$, where
$r_j^{iu} = \min(q_j^i, p_j^u)$ is the smaller of the two occurrence counts of tag $t_j$
in the item and user vectors. Then $i_{iu}$ is fed into an MLP that shares parameters
with the previous MLPs. Finally, the latent representation $\tilde{i}_{iu}$ is added to
the user and item representations, that is, $\tilde{y}_i = \tilde{y}_i \oplus \tilde{i}_{iu}$
and $\tilde{x}_u^H = \tilde{x}_u^H \oplus \tilde{i}_{iu}$, where $\oplus$ denotes
element-wise addition.
    Because the parameters are shared, the intersection module constrains the MLPs
to focus on the conjunct features, yielding more concrete user and item
representations for a given user-item transaction.
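
The two steps of the intersection module, the element-wise minimum and the element-wise addition, can be sketched as follows. The count vectors and latent vectors are hypothetical; in the model the latent intersection vector would come from the shared MLP rather than being written by hand.

```python
def tag_intersection(y_i, x_u):
    """Element-wise minimum of item and user tag counts."""
    return [min(q, p) for q, p in zip(y_i, x_u)]

def add_intersection(y_tilde, x_h_tilde, i_tilde):
    """Add the latent intersection vector to item and user representations."""
    y_new = [a + c for a, c in zip(y_tilde, i_tilde)]
    x_new = [a + c for a, c in zip(x_h_tilde, i_tilde)]
    return y_new, x_new

# Raw count vectors over the same tag order (hypothetical values).
y_i = [2, 0, 1]
x_u = [2, 1, 0]
i_iu = tag_intersection(y_i, x_u)  # [2, 0, 0]: only the first tag is shared

# Hypothetical latent vectors standing in for shared-MLP outputs.
y_new, x_new = add_intersection(
    y_tilde=[0.5, 0.25], x_h_tilde=[0.25, 1.0], i_tilde=[0.125, 0.5]
)
```

A tag contributes to the intersection only if both the user and the item carry it, so the resulting vector depends on the specific user-item pair.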

Training Details At the prediction stage, the feature vectors $\tilde{y}_i$ and
$\tilde{x}_u^H$ are concatenated into a single vector $z = [\tilde{y}_i, \tilde{x}_u^H]$
and passed through a prediction layer consisting of a factorization machine [2],
which captures second-order feature interactions in a fine-grained manner, i.e.,
$\hat{y}_{ui} = \mathrm{FM}(z)$.
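
For reference, a second-order factorization machine in the general form of [2] can be sketched as below, using the usual linear-time rewriting of the pairwise term. Whether AIRec keeps the bias and linear terms is not stated in the text, so their presence here is an assumption, and all parameter values are hypothetical.

```python
def fm_score(z, w0, w, V):
    """Second-order FM: bias + linear terms + factorized pairwise terms.

    V[i] is the latent factor vector of feature i. The pairwise sum over
    i < j of <V[i], V[j]> * z[i] * z[j] is computed per factor f as
    0.5 * ((sum_i V[i][f] * z[i])**2 - sum_i (V[i][f] * z[i])**2).
    """
    linear = w0 + sum(wi * zi for wi, zi in zip(w, z))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * z[i] for i in range(len(z)))
        s_sq = sum((V[i][f] * z[i]) ** 2 for i in range(len(z)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

# Hypothetical toy input: z stands for the concatenated item/user vector.
z = [1.0, 2.0, 3.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score = fm_score(z, w0, w, V)  # linear part 1.9 + pairwise part 9.0
```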
    We optimize the model with the BPR framework, using the loss function
$L_\Theta = \sum_{\langle u, i^+, i^- \rangle} -\ln \sigma(\hat{y}_{ui^+} - \hat{y}_{ui^-})$,
where $\sigma$ is the sigmoid function and $i^+$ and $i^-$ are positive and negative
items of user $u$, respectively. The negative items are sampled uniformly at
random. In addition, dropout is used to prevent overfitting.
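
The BPR objective and uniform negative sampling can be sketched as follows. The score table, the item list and the choice to reject already-interacted items when sampling are assumptions made for illustration.

```python
import math
import random

def neg_log_sigmoid(d):
    """Numerically stable -ln(sigmoid(d)) = ln(1 + exp(-d))."""
    if d >= 0:
        return math.log1p(math.exp(-d))
    return -d + math.log1p(math.exp(d))

def bpr_loss(triples, score):
    """Sum of -ln sigmoid(score(u, i+) - score(u, i-)) over all triples."""
    return sum(neg_log_sigmoid(score(u, i_pos) - score(u, i_neg))
               for u, i_pos, i_neg in triples)

def sample_negative(all_items, positives, rng):
    """Draw uniformly until an item outside `positives` is found.

    Assumes at least one such item exists; otherwise this would not stop.
    """
    while True:
        candidate = rng.choice(all_items)
        if candidate not in positives:
            return candidate

# Hypothetical predicted scores for one user.
scores = {("u1", "i1"): 2.0, ("u1", "i2"): 0.0, ("u1", "i3"): 2.0}

def score(u, i):
    return scores[(u, i)]

rng = random.Random(0)
neg = sample_negative(["i1", "i2", "i3"], positives={"i1"}, rng=rng)
loss = bpr_loss([("u1", "i1", neg)], score)
```

The loss shrinks as the positive item is scored further above the sampled negative, and equals ln 2 when the two scores tie.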

3    Experiments
We conduct experiments on two public datasets, Last.Fm and Delicious, and
adopt the same preprocessing as [5, 3, 4] to remove infrequent tags. For each
dataset, we randomly select 80% of the assignments as the training set and the
remaining 20% as the test set. The training set is used to construct tag-based
user and item profiles.
    We compare the performance of AIRec with FM [2], CFA [5], DSPR-NS [3] and
HDLPR [4]. Precision (P), Recall (R), F1-score (F) and Mean Reciprocal Rank
(MRR) are used to evaluate the results. Table 1 reports the top-n recommendation
performance. AIRec achieves the best result on every metric on both datasets;
for example, P@10 rises from 0.1693 for the strongest baseline (DSPR-NS) to
0.3074 on Last.Fm, and from 0.3656 to 0.4052 on Delicious.
                 Table 1. Comparison between different models.
    Last.Fm     P@10     P@20     R@10     R@20     F@10     F@20     MRR
    FM          0.1470   0.1237   0.0945   0.1410   0.1151   0.1318   0.0306
    CFA         0.1389   0.1055   0.0970   0.1349   0.1142   0.1184   0.0287
    DSPR-NS     0.1693   0.1340   0.1667   0.2234   0.1680   0.1675   0.0362
    HDLPR       0.1641   0.1328   0.1483   0.1984   0.1558   0.1591   0.0357
    AIRec       0.3074   0.2417   0.2670   0.3437   0.2857   0.2838   0.0651

    Delicious   P@10     P@20     R@10     R@20     F@10     F@20     MRR
    FM          0.0369   0.0352   0.0103   0.0172   0.0161   0.0231   0.0058
    CFA         0.0168   0.0110   0.0098   0.0125   0.0124   0.0117   0.0031
    DSPR-NS     0.3656   0.3196   0.0897   0.1437   0.1441   0.1982   0.0423
    HDLPR       0.2546   0.2148   0.0554   0.0885   0.0910   0.1254   0.0301
    AIRec       0.4052   0.3505   0.1165   0.1838   0.1810   0.2417   0.0480
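
For readers unfamiliar with the cutoff metrics in Table 1, the sketch below computes precision, recall and F1 at cutoff k for a single user using the standard definitions. The ranked list and relevant set are hypothetical, and the exact averaging protocol behind the reported numbers is not stated in the text, so this is not a reproduction of the evaluation.

```python
def precision_recall_f1_at_k(ranked, relevant, k):
    """Standard top-k metrics for one user's ranked recommendation list."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    # With no hits both precision and recall are zero, so F1 is defined as 0.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Hypothetical example: one of the top-2 items is relevant, out of 3 relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d", "e"}
p, r, f = precision_recall_f1_at_k(ranked, relevant, k=2)
```

Here precision is 1/2, recall is 1/3, and F1 is their harmonic mean, 0.4.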

4    Conclusion
In this work, we propose AIRec, a novel tag-aware top-n recommendation model.
We design a hybrid user model with a hierarchical attention network for better
user modeling and leverage the tags intersection to constrain the neural networks
to focus on the conjunct features. Experiments on two real-world datasets show
that AIRec significantly outperforms the state-of-the-art baselines.

References
1. Liang, N., Zheng, H.T., Chen, J.Y., Sangaiah, A.K., Zhao, C.Z.: TRSDL: Tag-aware
   recommender system based on deep learning–intelligent computing systems. Applied
   Sciences 8(5), 799 (2018)
2. Rendle, S.: Factorization machines. In: ICDM. pp. 995–1000. IEEE (2010)
3. Xu, Z., Chen, C., Lukasiewicz, T., Miao, Y., Meng, X.: Tag-aware personalized
   recommendation using a deep-semantic similarity model with negative sampling.
   In: CIKM. pp. 1921–1924. ACM (2016)
4. Xu, Z., Lukasiewicz, T., Chen, C., Miao, Y., Meng, X.: Tag-aware personalized
   recommendation using a hybrid deep model. In: IJCAI. pp. 3196–3202 (2017)
5. Zuo, Y., Zeng, J., Gong, M., Jiao, L.: Tag-aware recommender systems based on
   deep neural networks. Neurocomputing 204, 51–60 (2016)