=Paper=
{{Paper
|id=Vol-3411/INRA-paper1
|storemode=property
|title=Relevancy and Diversity in News Recommendations
|pdfUrl=https://ceur-ws.org/Vol-3411/INRA-paper1.pdf
|volume=Vol-3411
|authors=Shaina Raza,Chen Ding
|dblpUrl=https://dblp.org/rec/conf/sigir/RazaD22
}}
==Relevancy and Diversity in News Recommendations==
<pdf width="1500px">https://ceur-ws.org/Vol-3411/INRA-paper1.pdf</pdf>
<pre>
Relevancy and Diversity in News Recommendations⋆
Shaina Raza1,∗ , Chen Ding2
1
    Toronto Metropolitan University, ON, Canada
2
    Toronto Metropolitan University, ON, Canada


                                         Abstract
                                         News recommendation systems face unique challenges, including the dynamic nature of user preferences
                                         and the need for diversity in recommended news articles. To address these challenges, we propose
                                         a deep neural network architecture that learns representations for both news items and users. Our
                                         approach uses an enhanced vector for each query and news item to facilitate information interaction
                                         between these entities. To overcome selection bias in implicit user feedback, we employ negative
                                         sampling. We also promote diversity in recommended news by aligning the uneven news category
                                         representations of items in a loss function. Experimental results on a benchmark dataset demonstrate
                                         the superiority of our proposed architecture over baselines, achieving both relevancy and diversity in
                                         the news recommendations.

                                         Keywords
                                         News Recommender Systems, Relevancy, Diversity, Retrieval, Deep Neural Networks


1. Introduction
Prominent news organizations such as Yahoo!, BBC, NYTimes, and CNN have introduced
online news portals, which users can access from any location to peruse a wide range of news
categories and stay up-to-date. However, with the abundance of information available on the
internet, locating pertinent news has become a challenging and time-consuming task. News
recommender systems (NRS) address the issue of information overload by presenting users with
personalized and intriguing recommendations, chosen from an extensive pool of available news
articles [1].
   An NRS must balance between maintaining relevance to the user’s interests and introducing
enough diversity to keep the user engaged. If the NRS recommends news articles that are too
diverse or unrelated to the user’s interests, the user may lose interest and stop using the system.
Conversely, if the NRS recommends only news articles that are closely related to the user’s
interests, it may miss out on opportunities to introduce the user to new topics and categories.
Therefore, finding the optimal balance between relevance and diversity is crucial for the success
of an NRS.
   In general, users’ preferences can be either long-term or short-term, with short-term
preferences defining their current interests. For example, a user may have a long-standing
interest in a particular topic, such as climate change, and read numerous articles related to this
topic over the course of their lifetime [2]. However, if the user has been reading only news
articles related to climate change for an extended period, the NRS may consider suggesting
news articles on related topics, such as sustainable living or renewable energy, to introduce
some diversity into the recommendations. By doing so, the NRS can help the user explore
related topics and expand their knowledge in the field, while also keeping them engaged with
new and exciting news articles.

Joint Proceedings of 10th International Workshop on News Recommendation and Analytics (INRA’22) and the Third
International Workshop on Investigating Learning During Web Search (IWILDS’22) co-located with the 45th International
ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’22), July 15, 2022, Madrid, Spain
∗ Corresponding author.
Email: shaina.raza@torontomu.ca (S. Raza)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

   Short-term interests can also be identified by analyzing the user’s recent behavior on the
NRS. For instance, if a user has been reading numerous articles on a particular topic over the
past few days, the NRS can infer that the user is currently interested in that topic and adjust its
recommendations accordingly. By incorporating short-term interests into the recommendation
process, the NRS can provide more personalized and relevant recommendations to the user.
   This paper presents an NRS that prioritizes both relevancy and diversity in the
recommendation process, achieving a more balanced and engaging user experience. The specific
contributions of this paper are:

    • The proposed deep neural network is based on a query-candidate architecture [3][4],
      featuring a query (user) model that retrieves relevant news items and an item (candidate
      news) model that ranks them based on user actions. The model introduces a similarity
      score between the user feedback and item representations, enabling it to recommend a
      more diverse set of news articles.
    • The proposed model takes into account the uneven distribution of news articles across
      different categories, which is prevalent in news data, and ensures that recommendations
      include a balance of news items from all categories.
    • The proposed model also offers a negative sampling approach to tackle the selection bias
      of implicit user feedback by bringing in news samples from the entire news corpus.

   Extensive experiments on a news dataset show that our proposed approach can provide both
relevant and diversified news recommendations in an NRS.


2. Proposed Approach
Next, we discuss our approach.

2.1. Problem Formulation
The objective of news recommendation is to select relevant candidate news items from a news
corpus, given a set of queries. The item set is represented as $\{v_i\}_{i=1}^{N}$ and the query set as
$\{u_j\}_{j=1}^{M}$. The recommendation problem is learned from the query-item feedback
represented by a matrix $R \in \mathbb{R}^{N \times M}$. Each query is considered as feedback provided by the user.
If query $j$ gives positive feedback on news item $i$, then $R_{ij} = 1$; otherwise, it is considered
non-positive feedback.
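
   As an illustrative sketch of this formulation, the feedback matrix can be built from a list of
positive (item, query) index pairs. The function name, the sizes, and the pairs below are
hypothetical toy values, not data from the paper.

```python
def build_feedback_matrix(num_items, num_queries, positive_pairs):
    """Return an N x M matrix R with R[i][j] = 1 for positive feedback."""
    R = [[0] * num_queries for _ in range(num_items)]
    for i, j in positive_pairs:
        R[i][j] = 1  # query j gave positive feedback on news item i
    return R


# Toy example (assumed values): 3 news items, 2 queries.
R = build_feedback_matrix(3, 2, [(0, 0), (2, 1)])
```

Every entry left at 0 is non-positive feedback, which is not the same as an explicit dislike.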
Figure 1: Proposed architecture.


2.2. Query-Candidate Model
Figure 1 shows our proposed architecture, which we explain next.
   The proposed model is a query-candidate model that consists of an embedding layer, an
augmented layer, and two models (query and news item generation). The model takes into
account different content features related to news items, such as news ID, title, body, and
category, as well as contextual information.
   The embedding layer is represented by an embedding matrix $E \in \mathbb{R}^{K \times D}$, which maps each
piece of information (e.g., news item or user ID) in $u_j$ and $v_i$ to a low-dimensional dense vector
$e_j \in \mathbb{R}^{K}$.
   The augmented layer creates two augmented vectors $a_u$ and $a_v$ by concatenating the IDs
corresponding to the two input feature vectors $f_u$ and $f_v$, respectively. Each augmented vector
is then concatenated with its original feature vector ($f_u$ or $f_v$). Fully connected layers with
the ReLU activation function are applied to these concatenated vectors, and their output goes
through an $L_2$ normalization layer to obtain the augmented representations of the query, $p_u$,
and the news item, $p_v$.
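
   The forward pass just described (concatenate, fully connected layer with ReLU, $L_2$
normalization, then an inner product of the two outputs as the score) can be sketched in plain
Python. This is a minimal single-layer illustration under assumed toy weights and inputs; the
helper names and all numeric values are hypothetical, and the paper's actual layer sizes are not
reproduced here.

```python
import math


def relu(x):
    return [max(0.0, v) for v in x]


def dense(x, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def l2_normalize(x, eps=1e-12):
    norm = math.sqrt(sum(v * v for v in x))
    return [v / max(norm, eps) for v in x]


def tower(features, augmented, weights, bias):
    """Concatenate features with the augmented vector, then FC + ReLU + L2 norm."""
    z = features + augmented  # list concatenation
    return l2_normalize(relu(dense(z, weights, bias)))


def score(p_u, p_v):
    """Inner product of the query and news item representations."""
    return sum(a * b for a, b in zip(p_u, p_v))


# Assumed toy weights: each tower has its own parameters.
W_u = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
W_v = [[0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
b = [0.0, 0.0]

p_u = tower([3.0, 1.0], [1.0, 2.0], W_u, b)  # query tower
p_v = tower([0.0, 5.0], [3.0, 1.0], W_v, b)  # news item tower
s = score(p_u, p_v)
```

Because both outputs are unit-length, the score is a cosine similarity, which keeps scores on a
comparable scale across queries.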
   The loss functions $L_u$ and $L_v$ for the augmented vectors are defined as the mean squared
error between the augmented vectors $a_u$ and $a_v$ and the query/item embeddings, computed over
the samples whose label equals 1. The augmented vectors $a_u$ and $a_v$ are thus used to fit all
positive interactions in the model belonging to the corresponding query or item. The
stop-gradient strategy is applied to stop the gradients of $L_u$ and $L_v$ from flowing back into
$p_u$ and $p_v$, respectively. The output of the model is the inner product of the query embedding
and the news item embedding.
   To improve the generalization ability of the model and enable it to learn more transferable
features across different news categories, we introduce an additional loss function during
the training phase. This loss function aims to minimize the distance between the feature
representations of news items from different categories. To achieve this, we first select the
largest news category as the reference category. We then calculate the squared Euclidean
distance between the feature representations of news items in this category and the feature
representations of news items in the other categories. This loss function is added to the overall
objective function, with a regularization parameter to control its relative importance.
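
   One possible reading of this category-aware loss is sketched below. The text does not say
whether distances are taken between individual items or between category averages; this sketch
assumes the latter, comparing the mean representation of the largest category with the mean of
every other category. The function name and toy vectors are hypothetical.

```python
from collections import defaultdict


def category_alignment_loss(representations, categories):
    """Sum of squared Euclidean distances between the mean representation of
    the largest category and the mean representation of each other category."""
    groups = defaultdict(list)
    for vec, cat in zip(representations, categories):
        groups[cat].append(vec)

    def mean(vectors):
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]

    # Assumption: the reference category is the one with the most items.
    reference = max(groups, key=lambda c: len(groups[c]))
    ref_mean = mean(groups[reference])

    loss = 0.0
    for cat, vectors in groups.items():
        if cat == reference:
            continue
        cat_mean = mean(vectors)
        loss += sum((a - b) ** 2 for a, b in zip(ref_mean, cat_mean))
    return loss


# Toy example (assumed values): "sports" is the largest category.
reps = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
cats = ["sports", "sports", "finance"]
loss = category_alignment_loss(reps, cats)  # means [2, 0] and [0, 2]
```

In training, this value would be scaled by the regularization parameter before being added to
the overall objective.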
   The final loss function is calculated as the sum of the binary cross-entropy loss, the loss
functions for query and news item representations, and the category-aware loss function.
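
   Written out, the sum above might take the following form. The pairing of the weights
$\gamma_1$, $\gamma_2$, and $\gamma_3$ (reported in Section 3) with individual terms is an
assumption, since the text does not state it; $L_{\text{BCE}}$ and $L_{\text{cat}}$ are labels
introduced here for the binary cross-entropy and category-aware losses.

```latex
\mathcal{L} = L_{\text{BCE}} + \gamma_1 L_u + \gamma_2 L_v + \gamma_3 L_{\text{cat}}
```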

2.3. Training
To train the model for news recommendation, we treat the problem as a binary classification
task. We use a random negative sampling technique, where for each positive query-item pair
(the label = 1), we randomly select a set of N news items from the news corpus to create negative
query-item pairs (the label = 0) for that query. This process results in a dataset with a balance
of positive and negative samples. We then use binary cross-entropy loss to train the model by
minimizing the error between the predicted scores and the ground truth labels.
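
   The sampling step can be sketched as follows. This is a minimal illustration, not the
authors' implementation: the function name and IDs are hypothetical, and skipping items the
query has already clicked is an added assumption that the text does not specify.

```python
import random


def sample_negatives(positive_pairs, corpus_item_ids, num_negatives, seed=0):
    """For each positive (query, item) pair, draw random items from the whole
    corpus as negatives (label 0), skipping items the query already clicked."""
    rng = random.Random(seed)
    clicked = {}
    for query, item in positive_pairs:
        clicked.setdefault(query, set()).add(item)

    samples = []
    for query, item in positive_pairs:
        samples.append((query, item, 1))
        candidates = [i for i in corpus_item_ids if i not in clicked[query]]
        k = min(num_negatives, len(candidates))
        for negative in rng.sample(candidates, k):
            samples.append((query, negative, 0))
    return samples


# Toy example (assumed IDs): two positives, two negatives drawn per positive.
positives = [("u1", "n1"), ("u2", "n3")]
corpus = ["n1", "n2", "n3", "n4", "n5"]
samples = sample_negatives(positives, corpus, num_negatives=2)
```

Drawing from the full corpus rather than only from displayed items is what lets rarely shown
articles appear in training.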


3. Experimental Settings
We use the benchmark dataset MIND-small [5], which was collected over 6 weeks (Oct. 12, 2019
to Nov. 22, 2019) and contains 50k users, 161,013 news articles, and 156,925 clicks.
   Following the standard evaluation methodology in NRS [5], [6], we apply a time-based
split and use the following metrics.
   Relevancy metrics: We use Mean Reciprocal Rank (MRR) and F1-score (harmonic mean of
precision and recall) to evaluate the relevancy of news recommendations. MRR is the average
of the reciprocal rank of the first relevant item in the recommendation list for a set of queries.
   To compute MRR, we first calculate the reciprocal rank for each query:

   $$\text{Reciprocal Rank} = \frac{1}{\text{rank of first relevant item}} \qquad (1)$$

   Then, MRR is calculated as the mean of the reciprocal ranks for all queries:

   $$\text{MRR} = \frac{\sum_{i=1}^{|Q|} \text{Reciprocal Rank}_i}{|Q|} \qquad (2)$$

   where $|Q|$ is the number of queries.
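
   Equations (1) and (2) can be computed as in the following sketch. The rankings are toy
values, and assigning a reciprocal rank of 0 to a query with no relevant item in its list is a
common convention assumed here, not something the text states.

```python
def reciprocal_rank(ranked_items, relevant_items):
    """1 / rank of the first relevant item (ranks start at 1)."""
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant_items:
            return 1.0 / rank
    return 0.0  # assumption: no relevant item in the list contributes 0


def mean_reciprocal_rank(rankings, relevants):
    """Average the reciprocal ranks over all queries."""
    ranks = [reciprocal_rank(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(ranks) / len(ranks)


# Toy example: first relevant item at rank 2 and at rank 1.
rankings = [["a", "b", "c"], ["d", "e", "f"]]
relevants = [{"b"}, {"d"}]
mrr = mean_reciprocal_rank(rankings, relevants)  # (1/2 + 1) / 2
```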
   Diversity metric: In addition to the relevancy metrics of MRR and F1-score, we also use the
GINI index to evaluate the diversity of news recommendations. GINI is a diversity metric that
measures the inequality of item distribution across different categories or clusters [7]. In the
context of news recommendation systems, GINI reflects the extent to which recommended
news articles cover a variety of topics and viewpoints. To calculate GINI, we first group the
recommended news articles into different categories based on their content. Then, we compute
the proportion of news articles in each category and use this information to calculate the GINI
index.
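
   The text does not give the exact formula. One form that matches the "higher is more
diverse" reading used in this paper is the Gini impurity, $1 - \sum_c p_c^2$, where $p_c$ is the
proportion of recommended articles in category $c$. The sketch below assumes that form; the
function name and category labels are hypothetical.

```python
from collections import Counter


def gini_diversity(categories):
    """Gini impurity of a recommendation list's category distribution.
    0.0 means every article is from one category; larger means more spread."""
    counts = Counter(categories)
    total = sum(counts.values())
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Toy example: proportions 0.5, 0.25, 0.25.
diversity = gini_diversity(["sports", "sports", "finance", "health"])
```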
   Tradeoff metric: In the context of news recommendation systems, trade-offs are necessary
to balance multiple objectives, such as relevancy and diversity. Striking an optimal balance
between these two aspects enhances user satisfaction and engagement. We use the trade-off
metric, which is the product of the MRR and GINI scores, divided by their sum, multiplied by 2,
to measure the balance between relevancy and diversity.

   $$\text{tradeoff} = 2 \cdot \frac{\text{MRR} \cdot \text{GINI}}{\text{MRR} + \text{GINI}} \qquad (3)$$
   The trade-off metric ranges from 0 to 1, with higher values indicating a better balance between
relevancy and diversity. By evaluating both relevancy and diversity, we can ensure that our
news recommendation system provides users with personalized, engaging, and informative
news articles.
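
   Equation (3) is the harmonic mean of the two scores, as in this short sketch. The input
values are illustrative and not taken from the results table; returning 0 when both scores are 0
is an assumed convention to avoid division by zero.

```python
def tradeoff(mrr, gini):
    """Harmonic mean of the relevancy (MRR) and diversity (GINI) scores."""
    if mrr + gini == 0:
        return 0.0  # assumption: define the score as 0 when both inputs are 0
    return 2.0 * mrr * gini / (mrr + gini)


balanced = tradeoff(0.5, 0.5)
lopsided = tradeoff(0.9, 0.1)
```

The harmonic mean penalizes lopsided pairs: the second call scores lower than the first even
though both pairs have the same arithmetic mean.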
   We use the following baseline methods for evaluation.
   Wide & Deep [8], a hybrid model that combines deep neural networks with linear models for
recommendation. It is designed to provide both memorization and generalization capabilities.
   DKN [6], a knowledge-aware news recommendation method that incorporates knowledge
graphs for news representation and recommendation.
   LightGCN [9], a simplified Graph Convolutional Network (GCN) for collaborative filtering,
aiming to reduce complexity while maintaining performance.
   SASRec [10], a self-attentive sequential recommendation model that captures long-range
dependencies in user sequences for personalized recommendation.
   NeuMF [11], a hybrid model that combines the strengths of Generalized Matrix Factorization
(GMF) and Multi-Layer Perceptron (MLP) for better recommendation performance.
   We implemented these models in TensorFlow. The embedding dimension and batch size were
fixed to 32 and 256, respectively. We use the Adam optimizer for training. Other
hyperparameters of all models were individually tuned to ensure a fair comparison. The
dimensions of the augmented vectors were both set to $d = 32$, the tuning parameters $\gamma_1$ and
$\gamma_2$ were set to 0.5, and $\gamma_3$ was set to 1. We set $k$ in top@$k$ to 5 and 10, as it is normally
good practice to retrieve a relatively large number of candidate news items to rank.


4. Results and Discussion
4.1. Performance Evaluation
We evaluate the results for both relevancy and diversity and mainly evaluate the model
performance based on the tradeoff score, as it is the harmonic mean of relevancy and
diversity. We expect a good tradeoff score to be above 50% as it is a balancing score between two
different metrics [2][12]. The results are shown in Table 1.

    Method             MRR     F1-score    GINI    Tradeoff    Precision@5      Precision@10
    Wide & Deep        0.38      0.50      0.42      0.44          0.55              0.53
    DKN                0.40      0.52      0.45      0.46          0.57              0.55
    LightGCN           0.42      0.54      0.44      0.48          0.59              0.57
    SASRec             0.41      0.53      0.47      0.47          0.58              0.56
    NeuMF              0.39      0.51      0.43      0.45          0.56              0.54
    Proposed Model     0.48      0.60      0.52      0.54          0.65              0.63
Table 1: Performance of the baseline methods and the proposed model on the evaluation metrics,
including Precision@5 and Precision@10.

   Overall, as Table 1 shows, our proposed approach outperforms all the other methods in
terms of relevancy (MRR, F1, precision), diversity (GINI), and trade-off scores on the dataset. This
indicates that the current model is capable of providing personalized news recommendations
that are not only relevant to individual users but also diverse enough to expose them to a variety
of content. The relevancy scores may not be optimal, but we achieve a balanced trade-off score.
   The improvements in relevancy, as demonstrated by the higher MRR and F1-score, suggest
that the current model can better capture users’ preferences and deliver news articles that cater
to their interests. This is particularly important in the news recommendation domain, where
user engagement largely depends on the presentation of content that aligns with their interests
and preferences.
   The higher diversity, as represented by the GINI score, shows that the current model is able
to recommend a more diverse set of news articles, helping users discover new and unexpected
content. This is an essential aspect of a news recommender system, as it encourages users to
explore different perspectives and broadens their understanding of various topics.
   Moreover, the better trade-off score signifies that the current model strikes a better
balance between relevancy and diversity than the baselines, ensuring that users receive a well-rounded set of
recommendations. This balance is crucial in maintaining user satisfaction and engagement,
as it prevents filter bubbles and echo chambers from forming while still providing users with
content that matches their interests.
   These results also suggest that negative sampling reduces the selection bias
[13]. This is reflected in the relatively higher diversity score of our model compared to the other
methods: because all news items in the corpus get a chance to serve as negatives, the model
retrieves more diversified and long-tail items.
Figure 2: Relevancy and Diversity trade-off of our model


4.2. Relation Between Relevancy and Diversity
Next, we showcase the relevancy-diversity trade-off achieved by our model in Figure 2.
   In Figure 2, we observe several trends and relationships between relevancy, diversity, and
the tradeoff as the number of recommendations increases. The relevancy scores decrease
as the number of recommendations increases. This observation suggests that it becomes
more challenging for the recommender system to maintain high relevancy for all items as the
recommendation list size grows. This challenge is a common trade-off in recommender systems,
as providing more recommendations can increase the likelihood of including diverse items at
the cost of potentially lower relevancy for some items.
   Conversely, diversity scores increase with the number of recommendations. This trend
indicates that the recommender system is capable of providing more diverse recommendations
as the list size grows. Including a larger variety of items in the recommendation list can enhance
the user experience, as it exposes users to a broader range of content that may match their
interests.
   The tradeoff scores, which measure the balance between relevancy and diversity, remain
relatively stable across different recommendation list sizes. This stability suggests that the
recommender system is maintaining a reasonable balance between providing relevant and
diverse recommendations. In the given example, the tradeoff scores show a slight decrease
as the number of recommendations increases, indicating a minor compromise in balancing
relevancy and diversity for larger recommendation lists.
   This analysis highlights the challenge of balancing relevancy and diversity in recommender
systems as the number of recommendations increases. It is crucial to find a suitable balance to
ensure an optimal user experience, providing recommendations that are both relevant to the
user’s preferences and diverse enough to expose them to a variety of content.

4.3. Negative Sampling
To analyze the impact of negative sampling on our news recommendation system, we compared
the performance of the model with and without negative sampling using precision and recall
metrics at a recommendation list size of 10. As illustrated in Figure 3, employing negative
sampling significantly improves both precision and recall scores. This finding highlights the
importance of incorporating negative samples during the model training process, as it enables
the system to provide more accurate and diverse recommendations for users.
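
   For reference, precision and recall at a list size of $k$ can be computed as in the sketch
below. The ranked list and relevant set are toy values, and the function name is hypothetical.

```python
def precision_recall_at_k(ranked_items, relevant_items, k):
    """Precision@k and recall@k for one ranked recommendation list."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    precision = hits / k
    recall = hits / len(relevant_items) if relevant_items else 0.0
    return precision, recall


# Toy example: one of the top 2 items is relevant, out of 3 relevant items.
p_at_2, r_at_2 = precision_recall_at_k(["a", "b", "c", "d"], {"a", "d", "z"}, k=2)
```

The scores for a whole test set are the averages of these per-query values.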


Figure 3: Model Performance with and without Negative Sampling (Top@10)


5. Conclusion
In this paper, we present a deep neural network-based architecture designed to model the
information interaction between query and news items. Our approach incorporates a variety of
features in both the news item and query representations. Additionally, we introduce a loss
function that selects distinctive news items across different news categories. We briefly discuss
selection bias and demonstrate how using negative sampling can mitigate this bias by including
random negatives from the news corpus.
   Extensive experiments on a benchmark dataset showcase the superior performance of our
proposed method in achieving a balance between accuracy and diversity. For future work, we
plan to conduct experiments on additional real-world news datasets and explore the potential
of deeper neural networks. We also intend to incorporate more evaluation metrics to assess
relevancy, diversity, and novelty in the recommendation results. Furthermore, we aim to address
challenges such as mitigating biases [14] and combating fake news [15] in news recommendation
systems by employing more advanced deep neural networks.


References
 [1] S. Raza, C. Ding, News recommender system: a review of recent progress, challenges, and
     opportunities, Artificial Intelligence Review (2021) 1–52.
 [2] S. Raza, C. Ding, A regularized model to trade-off between accuracy and diversity in a
     news recommender system, in: 2020 IEEE International Conference on Big Data (Big
     Data), IEEE, 2020, pp. 551–560.
 [3] R. Wang, Z. Zhao, X. Yi, J. Yang, D. Z. Cheng, L. Hong, S. Tjoa, J. Kang, E. Ettinger, H. Chi,
     Improving relevance prediction with transfer learning in large-scale retrieval systems, in:
     Proceedings of the 1st Adaptive & Multitask Learning Workshop, 2019.
 [4] J. Yang, X. Yi, D. Zhiyuan Cheng, L. Hong, Y. Li, S. Xiaoming Wang, T. Xu, E. H. Chi,
     Mixed negative sampling for learning two-tower neural networks in recommendations,
     in: Companion Proceedings of the Web Conference 2020, 2020, pp. 441–447.
 [5] F. Wu, Y. Qiao, J.-H. Chen, C. Wu, T. Qi, J. Lian, D. Liu, X. Xie, J. Gao, W. Wu, et al., Mind: A
     large-scale dataset for news recommendation, in: Proceedings of the 58th Annual Meeting
     of the Association for Computational Linguistics, 2020, pp. 3597–3606.
 [6] H. Wang, F. Zhang, X. Xie, M. Guo, Dkn: Deep knowledge-aware network for news
     recommendation, in: Proceedings of the 2018 world wide web conference, 2018, pp.
     1835–1844.
 [7] F. Sun, J. Liu, J. Wu, C. Pei, X. Lin, W. Ou, P. Jiang, BERT4Rec: Sequential Recommendation
     with Bidirectional Encoder Representations from Transformer, in: Proceedings of the 28th
     ACM International Conference on Information and Knowledge Management, CIKM ’19,
     Association for Computing Machinery, 2019, pp. 1441–1450. doi:10.1145/3357384.3357895.
 [8] H.-T. Cheng, L. Koc, J. Harmsen, T. Shaked, T. Chandra, H. Aradhye, L. Anderson, M. Pham,
     P. Ravichander, J. Pennington, et al., Wide & deep learning for recommender systems, in:
     Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, 2016, pp.
     7–10.
 [9] X. He, K. Deng, X. Wang, Y. Li, Y. Zhang, M. Wang, Lightgcn: Simplifying and powering
     graph convolution network for recommendation, in: Proceedings of the 43rd International
     ACM SIGIR Conference on Research and Development in Information Retrieval, ACM,
     2020.
[10] W.-C. Kang, J. McAuley, Self-attentive sequential recommendation, in: Proceedings of the
     2018 IEEE International Conference on Data Mining (ICDM), IEEE, 2018, pp. 197–206.
[11] X. He, L. Liao, H. Zhang, L. Nie, X. Hu, T.-S. Chua, Neural collaborative filtering, in:
     Proceedings of the 26th International Conference on World Wide Web, International World
     Wide Web Conferences Steering Committee, 2017, pp. 173–182.
[12] S. Raza, C. Ding, Deep Neural Network to Tradeoff between Accuracy and Diversity in a
     News Recommender System, in: 2021 IEEE International Conference on Big Data (Big
     Data), IEEE, 2021, pp. 5246–5256.
[13] S. Caton, C. Haas, Fairness in machine learning: A survey, arXiv preprint arXiv:2010.04053
     (2020).
[14] S. Raza, J. Reji, C. Ding, Dbias: Detecting biases and ensuring fairness in news articles,
     International Journal of Data Science and Analytics (2022).
[15] S. Raza, C. Ding, Fake news detection based on news content and social contexts: a
     transformer-based approach, International Journal of Data Science and Analytics 13 (2022)
     335–362.

</pre>