<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>P. Kumari, G. Kaur, P. Singh, A. Kumar, Movie Recommendation System for Cold-Start
Problem Using User's Demographic Data, Proceedings of the International Conference on
Electrical, Computer and Energy Technologies (ICECET)</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">0973-6107</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1007/s11257-018-9215-8</article-id>
      <title-group>
        <article-title>Recommendation System Based on a Compact Hybrid User Model Using Fuzzy Logic Algorithms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nina Khairova</string-name>
          <email>khairova.nina@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nataliia Sharonova</string-name>
          <email>nvsharonova@ukr.net</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dmytro Sytnikov</string-name>
          <email>dmytro.sytnikov@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mykyta Hrebeniuk</string-name>
          <email>mykyta.hrebeniuk@nure.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Polina Sytnikova</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kharkiv National University of Radio Electronics</institution>
          ,
          <addr-line>Nauky Ave. 14, Kharkiv, City, 61166</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Technical University “Kharkiv Polytechnic Institute”</institution>
          ,
          <addr-line>Kyrpychova str. 2, Kharkiv, 61002</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>29</volume>
      <issue>9</issue>
      <fpage>105</fpage>
      <lpage>113</lpage>
      <abstract>
        <p>The paper presents an algorithm designed to address the challenges of traditional collaborative filtering methods by integrating a compact hybrid user model. This model incorporates hybrid features, demographic information, and fuzzy logic principles to improve recommendation accuracy. A key contribution of this work is the development of an innovative approach for calculating user similarity using fuzzy logic algorithms. By considering fuzzy concepts, the proposed approach effectively captures the inherent uncertainty and imprecision in user preferences, leading to more nuanced and accurate recommendations. Experimental evaluations conducted on the widely used MovieLens dataset provide insights into the performance of the proposed algorithm compared to traditional collaborative filtering techniques such as Pearson correlation and cosine similarity. The dataset, which contains both user ratings and demographic details, serves as a comprehensive testbed for assessing recommendation systems. The results of the experiments demonstrate the superiority of the proposed approach in capturing user similarities and enhancing recommendation accuracy. This paper contributes to the ongoing progress in recommendation systems by proposing a solution that addresses the challenges associated with traditional collaborative filtering methods. Through the integration of hybrid user models, demographic data, and fuzzy logic principles, the proposed algorithm offers a promising approach for enhancing recommendation accuracy across diverse application domains.</p>
        <p>Keywords: recommendation system, fuzzy logic, hybrid feature, compact hybrid user model, fuzzy user model, similarity, fuzzy distance</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>With the proliferation of data-driven technologies and the increasing volume of information
across various domains, the demand for efficient recommendation systems has grown
substantially. These systems play a vital role in assisting users in discovering relevant items or
content based on their preferences and behavior. Collaborative filtering, a widely used approach
in recommendation systems, leverages user interactions and similarities to generate
personalized recommendations. However, traditional collaborative filtering methods often face
challenges related to scalability and accuracy, particularly when dealing with sparse or
incomplete data.</p>
      <p>To address these challenges, hybrid recommendation systems have emerged as a promising
solution by integrating multiple recommendation techniques, such as collaborative filtering and
demographic filtering. By combining collaborative data with demographic information, hybrid
systems aim to enhance recommendation accuracy and mitigate sensitivity to data sparsity.
Additionally, the incorporation of fuzzy logic allows for a more nuanced representation of user
preferences, accommodating the inherently fuzzy nature of human decision-making.</p>
      <p>ORCID: 0000-0002-9826-0286 (N. Khairova); 0000-0002-8161-552X (N. Sharonova); 0000-0003-1240-7900 (D. Sytnikov); 0009-0008-0989-7957 (M. Hrebeniuk); 0000-0002-6688-4641 (P. Sytnikova)</p>
      <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>This paper explores the development and evaluation of a hybrid recommendation system that
integrates collaborative filtering with demographic information and fuzzy logic. The system
utilizes a compact user model, incorporating genre interest indicators (GII) derived from user
ratings and demographic attributes such as age, gender, and profession. By hybridizing at both
the feature and model levels, the system aims to improve recommendation accuracy while
maintaining scalability.</p>
      <p>Experimental evaluations are conducted using the MovieLens dataset, comprising user ratings
for movies along with demographic information. The performance of the proposed hybrid
recommendation system is compared against traditional collaborative filtering methods,
including Pearson correlation and cosine similarity. The experiments aim to assess the
effectiveness of the fuzzy distance function in capturing user similarities and improving
recommendation accuracy.</p>
      <p>Through this research, insights into the efficacy of hybrid recommendation systems and the
impact of incorporating fuzzy logic are gained. The findings contribute to advancing the
understanding of recommendation system design and optimization, with implications for
enhancing user experiences across various applications and domains.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        The landscape of recommender systems has been shaped by extensive research efforts over the
past decade, with a focus on improving recommendation accuracy and addressing various
challenges [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Collaborative filtering (CF) has remained a cornerstone in recommendation
system research, owing to its ability to provide personalized recommendations based on
user-item interactions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Hybrid recommendation systems, which integrate multiple recommendation techniques, have
emerged as a promising approach to overcome the limitations of individual methods. [
        <xref ref-type="bibr" rid="ref3 ref4">3,4</xref>
        ]
explore the landscape of hybrid recommender systems, offering insights into different
hybridization techniques and identifying trends in hybrid recommender system research. Paper
[5] describes a hybrid music recommendation system that combines collaborative filtering with
content-based filtering techniques. By integrating user preferences with item features such as
genre and artist, the system aims to provide more diverse and personalized music
recommendations for automated playlist continuation. Another area of exploration involves
weighting strategies [6] for recommender systems that cluster items based on genres. By
assigning weights to items within each cluster, these systems aim to enhance recommendation
accuracy by giving more importance to relevant genres in the user's preferences. [7] introduces
the concept of the genre interest indicator (GII), a hybrid feature that combines user ratings
and movie genres and represents user preferences at the model level.
      </p>
      <p>Similarity computation is another important part of a recommendation system. The study in [8]
evaluates different similarity measures used in collaborative filtering-based
recommender systems. Traditional measures such as Pearson correlation and cosine similarity,
as well as more advanced techniques such as adjusted cosine similarity and the Jaccard coefficient, are
reviewed and compared experimentally to evaluate their performance in recommendation
tasks.</p>
      <p>Additionally, the integration of demographic information into recommendation systems has
shown promise. Addressing the cold-start problem in recommender systems, another study
leverages user demographic attributes to provide personalized recommendations to new users
with limited interaction history. By incorporating demographic information such as age, gender,
and location into recommendation algorithms, the system aims to mitigate the cold-start problem
and offer relevant recommendations [9,10]. Specifically tailored for movie recommendations, a
demographic collaborative recommender system leverages demographic information such as age
and gender to enhance collaborative filtering algorithms, offering accurate and personalized
movie recommendations to users [10].</p>
      <p>The incorporation of fuzzy logic principles into recommendation algorithms has garnered
attention for its capacity to model the inherent uncertainty in user preferences. A paper [11]
provides an overview of fuzzy logic techniques applied in recommender systems, discussing how
fuzzy logic can address uncertainty and imprecision in user preferences and item attributes.
Various fuzzy logic-based recommendation approaches are reviewed, along with their
effectiveness in improving recommendation accuracy. By integrating fuzzy logic with CF
techniques, as demonstrated in [12,13], recommendation accuracy can be improved, highlighting
the importance of incorporating fuzzy concepts in user modeling. [14] applies the
fuzzy C-means method and compares its performance against other clustering techniques
used in user-based collaborative filtering recommendation systems. By incorporating fuzzy logic
principles, the system offers a more nuanced understanding of user preferences, leading to more
accurate recommendations.</p>
      <p>In summary, the integration of collaborative filtering, hybrid recommendation techniques, and
fuzzy logic principles has led to significant advancements in recommendation system
effectiveness.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods and materials</title>
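      <p>Formally, we have <inline-formula><tex-math>M</tex-math></inline-formula> users, <inline-formula><tex-math>U = \{u_1, \ldots, u_M\}</tex-math></inline-formula>, who rate, explicitly or implicitly, <inline-formula><tex-math>K</tex-math></inline-formula> items, <inline-formula><tex-math>S = \{s_1, \ldots, s_K\}</tex-math></inline-formula>, such as news, web pages, books, games, or movies. The spaces <inline-formula><tex-math>U</tex-math></inline-formula> and <inline-formula><tex-math>S</tex-math></inline-formula> are large and
can be very large in some cases. Each user <inline-formula><tex-math>u_i</tex-math></inline-formula>, where <inline-formula><tex-math>i = 1, \ldots, M</tex-math></inline-formula>, has rated a subset of items <inline-formula><tex-math>S_i</tex-math></inline-formula>. The
rating of user <inline-formula><tex-math>u_i</tex-math></inline-formula> for item <inline-formula><tex-math>s_k</tex-math></inline-formula>, where <inline-formula><tex-math>k = 1, \ldots, K</tex-math></inline-formula>, is denoted as <inline-formula><tex-math>r_{i,k}</tex-math></inline-formula>. All available ratings are
collected in an <inline-formula><tex-math>M \times K</tex-math></inline-formula> user-item matrix denoted as <inline-formula><tex-math>R</tex-math></inline-formula>. The architecture of different
recommendation systems can be centralized or distributed. In this work, we assume a centralized
architecture, in which the recommendation system resides in one specific place.</p>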
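      <p>During the development of a recommendation system, the following five phases can be
identified:</p>
      <list list-type="order">
        <list-item><p>Data collection</p></list-item>
        <list-item><p>User model formation</p></list-item>
        <list-item><p>Similarity computation</p></list-item>
        <list-item><p>Neighbor selection</p></list-item>
        <list-item><p>Predictions and recommendations</p></list-item>
      </list>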
      <sec id="sec-3-1">
        <title>3.1 Data collection</title>
        <p>The recommendation system should have as much information as possible about users to provide
them with satisfactory results from the very beginning. This information includes user interests,
origin, habits, personal data, and other details. Typically, three types of data can be collected from
users in addition to product descriptions, namely demographic information during registration,
explicit ratings (expressing users' opinions on items) for a subset of available items, and implicit
data from user behavior on the service. Implicit ratings relate to the interpretation of user
behavior or choice for assigning a rating or preference based on viewing data, purchase history,
or other types of information access models. Additionally, the recommendation system should
have access to a database of items being evaluated (in our case, movies).</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 User model formation</title>
        <p>
          Memory-based collaborative filtering is more accurate, but it scales poorly compared to
model-based recommendation systems. In addition, actual user preferences may not always be
captured solely through ratings, and therefore, some item content descriptions are needed. This
can be achieved if we build a hybrid user model [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] that integrates user ratings with some item
content descriptions.
        </p>
        <p>The user model in user-based, memory-based collaborative filtering consists of a vector of
ratings whose length grows as the user interacts with the system over time. This huge amount of data
requires a very large space and extremely long processing time: searching the entire database at
query time to find the best set of neighbors is computationally expensive. On the other hand,
model-based collaborative filtering derives a model from a group of users, which may be far from
the actual preferences of individual users. Memory-based collaborative filtering is simple,
provides recommendations with high accuracy, and allows new data to be added easily, but it
becomes computationally expensive as the size of the input data set increases. Ultimately, the user
may leave the website before processing is complete. Applying only model-based
collaborative filtering to such sparse data reduces the cost of online processing but often
comes at the price of recommendation accuracy. One of the common threads in current
recommendation system research is therefore the need to combine recommendation methods to mitigate
sparsity and scalability problems. However, most common hybridization methods create two separate
models and run an online process for each filtering technique separately; the results are then
merged. What if we build a user model according to one filtering
technique and then apply another filtering method to the resulting model? In that case, only one online
filtering process (collaborative filtering) is needed, while the other filtering method
(content-based filtering) is used to densify the data. To achieve this, the use of hybrid
features is proposed.</p>
        <p>In our methodology, we incorporate the concept of Genre Interest Indicator (GII) [6] to
enhance the user model formation process. The GII is a measure of a user's interest in specific
genres of items, such as movies. It is calculated based on explicit ratings provided by users for
items belonging to different genres. This approach allows us to capture nuanced user preferences
beyond simple ratings, enabling the recommendation system to better understand user tastes.</p>
        <p>To implement the GII, we utilize a hybrid approach that combines collaborative filtering with
demographic data. Specifically, we leverage explicit user ratings to link users to genres, while also
incorporating demographic information such as age, gender, and profession. This hybrid user
model provides a more accurate representation of user preferences by considering both explicit
ratings and demographic factors.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.2.1 Combining collaborative and demographic data</title>
        <p>The compact user model described above uses genre interest indicator (GII) to build a model by
linking explicit ratings to genres. However, the assertion that two people are similar is based not
only on whether they have similar thoughts on a particular topic but also on other factors, such
as their background and personal data. In many cases, the ratings claimed by some users are not
sufficient to describe them adequately. Therefore, a hybrid user model with age, gender, and
professions as demographic information, in addition to GII, may be a good choice for creating
more accurate and individual recommendations.</p>
        <p>Combining features of collaborative and demographic filtering allows for considering explicit
ratings without relying solely on them, thus reducing the sensitivity of collaborative filtering to
the number of ratings [8]. Conversely, it enables having demographic information about users
that would otherwise be unavailable. Moreover, most current hybrid recommendation systems
are weighted systems, where the online process is realized for each filter separately and then
some merging is used to obtain the final result. In this work, we will attempt to introduce
hybridization at two different levels, namely at the model level and at the approach level. Figure
1 illustrates the hybrid user model that we introduce to obtain hybrid collaborative/demographic
filtering.</p>
        <p>Accordingly, the hybrid user model consists of age, gender, and profession as demographic
information, and GII for genres, as shown in Table 1.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.2.2 Fuzzy user model</title>
        <p>[Table 1: hybrid user model, with columns for age, gender, profession, and the GII of each genre (e.g., Thriller).]</p>
        <p>
Fuzzy sets were introduced as a generalization of classical crisp sets in order to deal with fuzzy
concepts such as "young," "rich," "tall," etc. Instead of the rigid membership of elements in a crisp
set (1 if an element belongs to the set and 0 otherwise), a fuzzy set allows elements to have a
partial degree of membership, i.e., any value in the interval [0, 1]. In the theory of fuzzy sets, a
fuzzy subset A of the universe of discourse U is described by a membership function
<inline-formula><tex-math>\mu_A(x): U \to [0, 1]</tex-math></inline-formula>, which represents the degree of membership of
<inline-formula><tex-math>x \in U</tex-math></inline-formula> in the set A. Fuzzy logic refers to all
theories and technologies that use fuzzy sets. For recommendation systems, most user
preferences are fuzzy, so fuzzy logic is an appropriate tool for representing these preferences. In
these methods, each object is represented by a set of primitive propositions, whose truth is
determined in the object space by a value in the interval [0, 1]. For example, a proposition could
be "This movie is a comedy." The value associated with this proposition indicates the degree to
which this movie is a comedy.
        </p>
        <p>The crisp description of age and GII in the hybrid user model (Table 1) does not reflect the real
case of human decisions. For example, the distance between two users aged 15 and 19 is 4, while
both users belong to the same age group, namely teenagers. These fuzzy characteristics need to
be taken into account when comparing users. Below we will discuss how to implement the hybrid
user model and introduce a fuzzy distance function for finding the nearest neighbors.</p>
        <p>The fuzzy user model will help create a set of neighbors as close as possible to the active
user. However, to build a fuzzy model, it is first necessary to label the features of the user model.
First of all, age is divided into three fuzzy sets: young, adult, and old (Figure 2), with the
following membership functions:</p>
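        <p>The values of gender and profession are considered as fuzzy points with a membership value
of one. Finally, GII is divided into six fuzzy sets, very bad (VB), bad (B), average (AV), good (G),
very good (VG), and excellent (E), with the following membership functions (Figure 3):
          <disp-formula><tex-math><![CDATA[\mu_{VB}(x) = \begin{cases} 1 - x, & x \le 1 \\ 0, & x > 1 \end{cases}]]></tex-math></disp-formula>
          <disp-formula><tex-math><![CDATA[\mu_{A(i)}(x) = \begin{cases} 0, & x \le i - 2 \ \text{or}\ x > i \\ x - i + 2, & i - 2 < x \le i - 1 \\ i - x, & i - 1 < x \le i \end{cases} \qquad i = 2, 3, 4, 5]]></tex-math></disp-formula>
Here, <inline-formula><tex-math>A(i) = B, AV, G, VG</tex-math></inline-formula> for <inline-formula><tex-math>i = 2, 3, 4, 5</tex-math></inline-formula>, respectively.
          <disp-formula><tex-math><![CDATA[\mu_{E}(x) = \begin{cases} 0, & x \le 4 \\ x - 4, & 4 < x \le 5 \end{cases}]]></tex-math></disp-formula>
        </p>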
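        <p>As an illustration, the GII membership functions above can be sketched in Python. This is a minimal sketch, assuming that GII values lie in [0, 5] and that the triangular sets B, AV, G, and VG correspond to i = 2, ..., 5; the function name <monospace>gii_memberships</monospace> is hypothetical and not taken from the paper.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict


def gii_memberships(x: float) -> Dict[str, float]:
    """Membership degree of a GII value x in each of the six fuzzy sets."""
    # Assumption: GII values are restricted to the interval [0, 5].
    if not 0.0 <= x <= 5.0:
        raise ValueError("GII value is assumed to lie in [0, 5]")
    mu: Dict[str, float] = {}
    # Very bad (VB): decreasing ramp, 1 - x for x <= 1 and 0 otherwise.
    mu["VB"] = 1.0 - x if x <= 1.0 else 0.0
    # B, AV, G, VG: triangles indexed by i = 2..5, each peaking at i - 1.
    for i, label in zip(range(2, 6), ["B", "AV", "G", "VG"]):
        if i - 2 < x <= i - 1:
            mu[label] = x - i + 2
        elif i - 1 < x <= i:
            mu[label] = i - x
        else:
            mu[label] = 0.0
    # Excellent (E): increasing ramp, x - 4 for 4 < x <= 5 and 0 otherwise.
    mu["E"] = x - 4.0 if x > 4.0 else 0.0
    return mu


if __name__ == "__main__":
    print(gii_memberships(2.5))
```
]]></preformat>
        <p>Under these assumptions, a GII of 2.5 belongs to AV and G with degree 0.5 each and to no other set.</p>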
      </sec>
      <sec id="sec-3-5">
        <title>3.3 Similarity computation</title>
        <p>After building a user model, a recommendation system compares the active user to the available
database according to the corresponding similarity function. Based on the calculated similarity
values, a connection is established between the active user and other users, allowing the
recommendation system to form a set of neighbors for the active user. The choice of similarity
function depends on the application and is based on the nature of the user model's features. Some
similarity function modifiers have been introduced to refine or enhance the recommendation
system's ability to find close neighbors. It should be noted that similarity calculations for
collaborative filtering can be performed between items instead of users. This work only discusses
user-based similarity methods (user-to-user similarity), as they are the most popular.</p>
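        <p>The similarity between two users is a measure of how similar they are to each other. Formally,
the similarity function <inline-formula><tex-math>\mathrm{sim}(u_x, u_y)</tex-math></inline-formula> for a set of users <inline-formula><tex-math>U</tex-math></inline-formula> is a function with non-negative values,
<inline-formula><tex-math>\mathrm{sim}: U \times U \to \mathbb{R}^{+} \cup \{0\}</tex-math></inline-formula>. Here, we distinguish between <inline-formula><tex-math>u_x</tex-math></inline-formula> and <inline-formula><tex-math>\mathbf{u}_x</tex-math></inline-formula> based on their context:
<inline-formula><tex-math>u_x</tex-math></inline-formula> refers to user x, while <inline-formula><tex-math>\mathbf{u}_x</tex-math></inline-formula> represents the feature vector of the user-x model. The
similarity function may have some of the following properties:</p>
        <list list-type="simple">
          <list-item><p>(P1) Identity: <inline-formula><tex-math>\forall u_x \in U,\ \mathrm{sim}(u_x, u_x) > 0</tex-math></inline-formula></p></list-item>
          <list-item><p>(P2) Positivity: <inline-formula><tex-math>\forall u_y (\ne u_x) \in U,\ \mathrm{sim}(u_x, u_y) \ge 0</tex-math></inline-formula></p></list-item>
          <list-item><p>(P3) Symmetry: <inline-formula><tex-math>\forall u_x, u_y \in U,\ \mathrm{sim}(u_x, u_y) = \mathrm{sim}(u_y, u_x)</tex-math></inline-formula></p></list-item>
        </list>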
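        <p>Any function that satisfies (P1) is a similarity function. Although symmetry is a convenient
property, it is not satisfied in all applications. Non-negativity is not satisfied for two standard
examples: correlation coefficients and scalar products. Different similarity functions <inline-formula><tex-math>\mathrm{sim}(u_x, u_y)</tex-math></inline-formula>
have been used in the study of collaborative filtering between users <inline-formula><tex-math>u_x</tex-math></inline-formula> and <inline-formula><tex-math>u_y</tex-math></inline-formula>. The most popular
similarity function for memory-based collaborative filtering is the Pearson correlation coefficient
[7], where the similarity between two users is based only on their commonly rated items <inline-formula><tex-math>S_{xy}</tex-math></inline-formula>. In its standard form, the
Pearson correlation coefficient is
          <disp-formula><label>(1)</label><tex-math>\mathrm{sim}(u_x, u_y) = \frac{\sum_{s_k \in S_{xy}} (r_{x,k} - \bar{r}_x)(r_{y,k} - \bar{r}_y)}{\sqrt{\sum_{s_k \in S_{xy}} (r_{x,k} - \bar{r}_x)^2}\ \sqrt{\sum_{s_k \in S_{xy}} (r_{y,k} - \bar{r}_y)^2}}</tex-math></disp-formula>
where <inline-formula><tex-math>\bar{r}_x</tex-math></inline-formula> is the average rating of user <inline-formula><tex-math>u_x</tex-math></inline-formula>.</p>
        <p>Another similarity function is the cosine similarity function [7], which considers two users as
two vectors in an <inline-formula><tex-math>|S_{xy}|</tex-math></inline-formula>-dimensional space. In its standard form:
          <disp-formula><label>(2)</label><tex-math>\mathrm{sim}(u_x, u_y) = \frac{\sum_{s_k \in S_{xy}} r_{x,k}\, r_{y,k}}{\sqrt{\sum_{s_k \in S_{xy}} r_{x,k}^2}\ \sqrt{\sum_{s_k \in S_{xy}} r_{y,k}^2}}</tex-math></disp-formula>
        </p>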
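        <p>The two classical measures can be sketched in Python in their standard textbook forms. This is a minimal sketch, assuming ratings are stored as dictionaries from item id to rating and that both measures (including the means in the Pearson coefficient) are computed over co-rated items only; the names <monospace>pearson</monospace>, <monospace>cosine</monospace>, and <monospace>Ratings</monospace> are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import Dict

# Hypothetical representation: item id -> rating given by one user.
Ratings = Dict[str, float]


def pearson(rx: Ratings, ry: Ratings) -> float:
    """Pearson correlation over the items rated by both users."""
    common = sorted(set(rx) & set(ry))
    if len(common) < 2:
        return 0.0  # assumption: too little overlap counts as no correlation
    mean_x = sum(rx[s] for s in common) / len(common)
    mean_y = sum(ry[s] for s in common) / len(common)
    num = sum((rx[s] - mean_x) * (ry[s] - mean_y) for s in common)
    den_x = math.sqrt(sum((rx[s] - mean_x) ** 2 for s in common))
    den_y = math.sqrt(sum((ry[s] - mean_y) ** 2 for s in common))
    den = den_x * den_y
    return num / den if den else 0.0


def cosine(rx: Ratings, ry: Ratings) -> float:
    """Cosine of the angle between the two co-rated rating vectors."""
    common = sorted(set(rx) & set(ry))
    num = sum(rx[s] * ry[s] for s in common)
    den_x = math.sqrt(sum(rx[s] ** 2 for s in common))
    den_y = math.sqrt(sum(ry[s] ** 2 for s in common))
    den = den_x * den_y
    return num / den if den else 0.0


if __name__ == "__main__":
    a = {"m1": 1.0, "m2": 2.0, "m3": 3.0}
    b = {"m1": 3.0, "m2": 2.0, "m3": 1.0, "m4": 5.0}
    print(pearson(a, b), cosine(a, b))
```
]]></preformat>
        <p>For two users whose co-rated items carry exactly reversed ratings, the Pearson sketch returns -1 (up to rounding), which illustrates why non-negativity does not hold for correlation coefficients.</p>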
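        <p>On the other hand, dissimilarity is the opposite of similarity and is related to the concept of
distance; the two terms are used interchangeably: small distances mean small differences, and
large distances mean large differences. Formally, the distance function <inline-formula><tex-math>d(u_x, u_y)</tex-math></inline-formula> for a set of
users <inline-formula><tex-math>U</tex-math></inline-formula> is a function <inline-formula><tex-math>d: U \times U \to \mathbb{R}^{+} \cup \{0\}</tex-math></inline-formula>. The function <inline-formula><tex-math>d</tex-math></inline-formula> may have some of the following
properties:</p>
        <list list-type="simple">
          <list-item><p>(P1) Identity or reflexivity: <inline-formula><tex-math>\forall u_x \in U,\ d(u_x, u_x) = 0</tex-math></inline-formula></p></list-item>
          <list-item><p>(P2) Positivity: <inline-formula><tex-math>\forall u_y (\ne u_x) \in U,\ d(u_x, u_y) > 0</tex-math></inline-formula></p></list-item>
          <list-item><p>(P3) Symmetry: <inline-formula><tex-math>\forall u_x, u_y \in U,\ d(u_x, u_y) = d(u_y, u_x)</tex-math></inline-formula></p></list-item>
          <list-item><p>(P4) Uniqueness or definiteness: <inline-formula><tex-math>d(u_x, u_y) = 0 \Rightarrow u_x = u_y</tex-math></inline-formula></p></list-item>
          <list-item><p>(P5) Triangle inequality: <inline-formula><tex-math>\forall u_x, u_y, u_z \in U,\ d(u_x, u_z) \le d(u_x, u_y) + d(u_y, u_z)</tex-math></inline-formula></p></list-item>
        </list>
        <p>Generally, identity and positivity are crucial for determining the correct distance function.</p>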
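        <p>Obviously, formulas (1) and (2) are not suitable if the model includes diverse features, because
these formulas consider only the items rated by both users. The Euclidean
distance function provides another way of computing differences for recommendation systems,
one that takes numerical features into account:
          <disp-formula><tex-math>d(u_x, u_y) = \sqrt{\sum_{j=1}^{N} (x_j - y_j)^2}</tex-math></disp-formula>
where <inline-formula><tex-math>x_j</tex-math></inline-formula> is the j-th feature of <inline-formula><tex-math>u_x</tex-math></inline-formula> (and <inline-formula><tex-math>y_j</tex-math></inline-formula> that of <inline-formula><tex-math>u_y</tex-math></inline-formula>), and <inline-formula><tex-math>N</tex-math></inline-formula> is the number of features.</p>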
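        <sec id="sec-3-5-1">
          <title>3.3.1 Fuzzy distance function</title>
          <p>Using a fuzzy user model has many advantages, but how can we compare two user models that
have many fuzzy features? In general, each feature has many fuzzy sets. The choice of
distance function is an important issue for the system and depends largely on the problem itself.
For the hybrid user model in Figure 1, a vector with <inline-formula><tex-math>N</tex-math></inline-formula> features represents the user, and therefore
a local fuzzy distance should be found for each feature. Thus, for each pair of users, we have
<inline-formula><tex-math>N</tex-math></inline-formula> local fuzzy distances. The global fuzzy distance can be obtained by two methods. The first
method uses the fuzzy IF-THEN rule: IF (<inline-formula><tex-math>x_1</tex-math></inline-formula> is close to <inline-formula><tex-math>y_1</tex-math></inline-formula>) and (<inline-formula><tex-math>x_2</tex-math></inline-formula> is close to <inline-formula><tex-math>y_2</tex-math></inline-formula>) ... and (<inline-formula><tex-math>x_N</tex-math></inline-formula> is close
to <inline-formula><tex-math>y_N</tex-math></inline-formula>) THEN (<inline-formula><tex-math>u_x</tex-math></inline-formula> is similar to <inline-formula><tex-math>u_y</tex-math></inline-formula>). In this case, the global fuzzy distance is defined as:
            <disp-formula><label>(10)</label><tex-math>\mathrm{fd}(u_x, u_y) = \min_{1 \le i \le N} \mathrm{fd}(x_i, y_i)</tex-math></disp-formula>
          </p>
          <p>The second method considers each local fuzzy distance as an opinion. The global fuzzy distance
is the global opinion of all. An aggregation operator is needed for this in fuzzy logic. The
aggregation operator can be the average of the <inline-formula><tex-math>N</tex-math></inline-formula> local fuzzy distances:
            <disp-formula><label>(11)</label><tex-math>\mathrm{fd}(u_x, u_y) = \frac{1}{N} \sum_{i=1}^{N} \mathrm{fd}(x_i, y_i)</tex-math></disp-formula>
          </p>
          <p>Formula (10) works poorly for the hybrid user model because it considers only the feature
with the minimum distance and ignores other features. According to fuzzy set and concept
distances, we need a local fuzzy distance metric, <inline-formula><tex-math>\mathrm{fd}(x_i, y_i)</tex-math></inline-formula>, which satisfies the following
conditions:</p>
          <list list-type="simple">
            <list-item><p>A. Zero value for the same feature values.</p></list-item>
            <list-item><p>B. Zero value for different feature values in the same fuzzy set with the same membership values.</p></list-item>
            <list-item><p>C. Minimized distance between any two feature values that belong to the same fuzzy set and have close membership values.</p></list-item>
            <list-item><p>D. Maximized distance between any two feature values that belong to two different fuzzy sets.</p></list-item>
          </list>
          <p>Condition (A) is a fundamental requirement for any distance function. To clarify condition (B),
let's assume that we have two users who are 40 and 35 years old, respectively. Both users have a
membership value of 1 for the “adult” category (Figure 2). The distance between them is 5, but
they are similar users in terms of fuzzy sets. To make the distance between two users zero, we
need another term that gives zero value for this and similar cases. What really makes these two
users similar is their equal membership values in one fuzzy set. We then define a corresponding
fuzzy distance function that satisfies all four above-mentioned conditions.</p>
          <p>Let <inline-formula><tex-math>\mathbf{a}</tex-math></inline-formula> and <inline-formula><tex-math>\mathbf{b}</tex-math></inline-formula> be the membership vectors corresponding to two crisp values <inline-formula><tex-math>a</tex-math></inline-formula> and <inline-formula><tex-math>b</tex-math></inline-formula> for a given
feature with <inline-formula><tex-math>l</tex-math></inline-formula> fuzzy sets. The fuzzy distance between <inline-formula><tex-math>a</tex-math></inline-formula> and <inline-formula><tex-math>b</tex-math></inline-formula> is defined as
            <disp-formula><label>(12)</label><tex-math>\mathrm{fd}(a, b) = \mathrm{dis}(a, b) \times d(\mathbf{a}, \mathbf{b})</tex-math></disp-formula>
where <inline-formula><tex-math>\mathrm{dis}(a, b)</tex-math></inline-formula> is simply the difference operator, <inline-formula><tex-math>\mathbf{a}</tex-math></inline-formula> and <inline-formula><tex-math>\mathbf{b}</tex-math></inline-formula> are vectors of size <inline-formula><tex-math>l</tex-math></inline-formula>, and
<inline-formula><tex-math>d(\mathbf{a}, \mathbf{b})</tex-math></inline-formula> is any vector metric distance. In this work, the Euclidean distance is used to calculate <inline-formula><tex-math>d(\mathbf{a}, \mathbf{b})</tex-math></inline-formula>:
            <disp-formula><label>(13)</label><tex-math>d(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_{j=1}^{l} (a_j - b_j)^2}</tex-math></disp-formula>
where <inline-formula><tex-math>a_j</tex-math></inline-formula> is the membership value of feature value <inline-formula><tex-math>a</tex-math></inline-formula> for its fuzzy set <inline-formula><tex-math>j</tex-math></inline-formula>.</p>
        </sec>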
      </sec>
      <sec id="sec-3-6">
        <title>Example</title>
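        <p>Let's assume we need to calculate the fuzzy distance between two users who have ages
(a) 35 and 40, (b) 45 and 60, and (c) 18 and 23.</p>
        <p>Case a: <inline-formula><tex-math>\mathbf{a} = \langle 0, 1, 0 \rangle</tex-math></inline-formula>, <inline-formula><tex-math>\mathbf{b} = \langle 0, 1, 0 \rangle</tex-math></inline-formula>. Then
<inline-formula><tex-math>d(\mathbf{a}, \mathbf{b}) = \sqrt{(0 - 0)^2 + (1 - 1)^2 + (0 - 0)^2} = 0</tex-math></inline-formula>,
<inline-formula><tex-math>\mathrm{dis}(35, 40) = 40 - 35 = 5</tex-math></inline-formula>, and
<inline-formula><tex-math>\mathrm{fd}(35, 40) = 0 \times 5 = 0</tex-math></inline-formula> (similar users).</p>
        <p>Case b: <inline-formula><tex-math>\mathbf{a} = \langle 0, 1, 0 \rangle</tex-math></inline-formula>, <inline-formula><tex-math>\mathbf{b} = \langle 0, 0, 1 \rangle</tex-math></inline-formula>. Then
<inline-formula><tex-math>d(\mathbf{a}, \mathbf{b}) = \sqrt{(0 - 0)^2 + (1 - 0)^2 + (0 - 1)^2} = \sqrt{2}</tex-math></inline-formula>,
<inline-formula><tex-math>\mathrm{dis}(60, 45) = 60 - 45 = 15</tex-math></inline-formula>, and
<inline-formula><tex-math>\mathrm{fd}(60, 45) = \sqrt{2} \times 15 \approx 21.2</tex-math></inline-formula> (opposite users).</p>
        <p>Case c: <inline-formula><tex-math>\mathbf{a} = \langle 1, 0, 0 \rangle</tex-math></inline-formula>, <inline-formula><tex-math>\mathbf{b} = \langle 0.8, 0.2, 0 \rangle</tex-math></inline-formula>. Then
<inline-formula><tex-math>d(\mathbf{a}, \mathbf{b}) = \sqrt{(0.8 - 1)^2 + (0.2 - 0)^2 + (0 - 0)^2} = 0.283</tex-math></inline-formula>,
<inline-formula><tex-math>\mathrm{dis}(23, 18) = 23 - 18 = 5</tex-math></inline-formula>, and
<inline-formula><tex-math>\mathrm{fd}(23, 18) = 0.283 \times 5 \approx 1.41</tex-math></inline-formula> (close users).</p>
        <p>The example results show that formula (12) satisfies all four conditions required of a local
fuzzy distance function. Therefore, for the fuzzy approach, the fuzzy distance function
between two users can be aggregated using formula (11).</p>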
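        <p>The local fuzzy distance of formula (12), with the Euclidean distance between membership vectors, can be sketched in Python and checked against the three cases of the example. This is a minimal sketch, assuming that the difference operator is the absolute difference of the crisp values, that the membership vectors are supplied by the caller, and that the global distance is the average of the local ones; all function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math
from typing import Sequence


def membership_distance(a_mu: Sequence[float], b_mu: Sequence[float]) -> float:
    """Euclidean distance between two membership vectors of equal size."""
    if len(a_mu) != len(b_mu):
        raise ValueError("membership vectors must have the same size")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a_mu, b_mu)))


def local_fuzzy_distance(a: float, b: float,
                         a_mu: Sequence[float], b_mu: Sequence[float]) -> float:
    """Crisp difference multiplied by the distance between membership vectors."""
    # Assumption: the difference operator is taken as |a - b|.
    return abs(a - b) * membership_distance(a_mu, b_mu)


def global_fuzzy_distance(local_distances: Sequence[float]) -> float:
    """Average of the local fuzzy distances over all features."""
    if not local_distances:
        raise ValueError("at least one local distance is required")
    return sum(local_distances) / len(local_distances)


if __name__ == "__main__":
    # Membership vectors <young, adult, old> are copied from the example.
    print(local_fuzzy_distance(35, 40, [0, 1, 0], [0, 1, 0]))      # case a
    print(local_fuzzy_distance(45, 60, [0, 1, 0], [0, 0, 1]))      # case b
    print(local_fuzzy_distance(18, 23, [1, 0, 0], [0.8, 0.2, 0]))  # case c
```
]]></preformat>
        <p>Passing the membership vectors explicitly keeps the sketch independent of any particular choice of age membership functions.</p>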
      </sec>
      <sec id="sec-3-7">
        <title>3.4 Neighbor selection</title>
        <p>After calculating similarity values, the system ranks users according to their similarity to the
active user to obtain that user's set of neighbors. The size of the neighbor set can be fixed by
choosing the first N users or variable by selecting users whose similarity exceeds a certain
threshold. This work distinguishes between the set of neighbors and the set of actual
recommendations. The output of the neighbor set is the same as mentioned earlier (distance
function), and a priority set is used to refine it.</p>
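        <p>Both selection strategies can be sketched in Python. This is a minimal sketch, assuming that candidates are ranked by a distance (so smaller is closer and the threshold is an upper bound rather than a lower bound on similarity); the function name <monospace>select_neighbors</monospace> and its parameters are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, List, Optional


def select_neighbors(distances: Dict[str, float],
                     k: Optional[int] = None,
                     max_distance: Optional[float] = None) -> List[str]:
    """Rank candidate users by distance to the active user and keep the closest.

    distances: hypothetical mapping from candidate user id to its distance
               from the active user.
    k: keep at most the k closest users (fixed-size neighbor set).
    max_distance: keep only users within this distance (variable-size set).
    """
    # Sort by distance; ties are broken by user id to keep the result stable.
    ranked = sorted(distances, key=lambda u: (distances[u], u))
    if max_distance is not None:
        ranked = [u for u in ranked if distances[u] <= max_distance]
    if k is not None:
        ranked = ranked[:k]
    return ranked


if __name__ == "__main__":
    d = {"u1": 0.5, "u2": 0.1, "u3": 2.0}
    print(select_neighbors(d, k=2))
    print(select_neighbors(d, max_distance=1.0))
```
]]></preformat>
        <p>The two criteria can also be combined, for example by keeping at most 30 users that all lie within a given distance.</p>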
      </sec>
      <sec id="sec-3-8">
        <title>3.5 Predictions and recommendations</title>
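        <p>At this stage, the recommendation system assigns a predicted rating to every item that has
been seen by the set of neighbors but not by the active user. The predicted rating <inline-formula><tex-math>P_{a,k}</tex-math></inline-formula> indicates the expected
interest of user <inline-formula><tex-math>u_a</tex-math></inline-formula> in item <inline-formula><tex-math>s_k</tex-math></inline-formula> and is usually computed as the sum of the ratings given to the
same item <inline-formula><tex-math>s_k</tex-math></inline-formula> by the neighborhood of <inline-formula><tex-math>u_a</tex-math></inline-formula>:
          <disp-formula><tex-math>P_{a,k} = \sum_{u_i \in \hat{U}_a} r_{i,k}</tex-math></disp-formula>
where <inline-formula><tex-math>\hat{U}_a</tex-math></inline-formula> denotes the set of neighbors of <inline-formula><tex-math>u_a</tex-math></inline-formula> who rated item <inline-formula><tex-math>s_k</tex-math></inline-formula>.</p>
        <p>Some examples of aggregation functions:
          <disp-formula><label>(14)</label><tex-math>P_{a,k} = \frac{1}{|\hat{U}_a|} \sum_{u_i \in \hat{U}_a} r_{i,k}</tex-math></disp-formula>
          <disp-formula><label>(15)</label><tex-math>P_{a,k} = \kappa \sum_{u_i \in \hat{U}_a} \mathrm{sim}(u_a, u_i) \times r_{i,k}</tex-math></disp-formula>
          <disp-formula><label>(16)</label><tex-math>P_{a,k} = \bar{r}_a + \kappa \sum_{u_i \in \hat{U}_a} \mathrm{sim}(u_a, u_i) \times (r_{i,k} - \bar{r}_i)</tex-math></disp-formula>
where <inline-formula><tex-math>\bar{r}_a</tex-math></inline-formula> is the average rating of user <inline-formula><tex-math>u_a</tex-math></inline-formula>. The multiplier <inline-formula><tex-math>\kappa</tex-math></inline-formula> serves as a normalizing coefficient and is typically chosen as
          <disp-formula><tex-math>\kappa = \frac{1}{\sum_{u_i \in \hat{U}_a} |\mathrm{sim}(u_a, u_i)|}</tex-math></disp-formula>
        </p>
        <p>The weighted sum (16) is the most commonly used aggregation function for predicting ratings.
Since users typically use rating scales differently, this prediction formula compensates for
variations in the rating scale. This keeps the predicted ratings for a given user close
to the average rating of this active user.</p>
        <p>Based on the predicted ratings of the items that have been seen by the neighbors of the active
user but not yet rated by the active user, the recommendation system sorts these items in descending order of
predicted rating to form a prediction list for the active user <inline-formula><tex-math>u_a</tex-math></inline-formula>:
          <disp-formula><tex-math>\mathrm{Pred}(u_a) = \{ s_k \mid s_k \in S_i \subset S,\ s_k \notin S_a \}</tex-math></disp-formula>
where <inline-formula><tex-math>S_i</tex-math></inline-formula> is the set of items rated by a neighbor <inline-formula><tex-math>u_i</tex-math></inline-formula> and <inline-formula><tex-math>S_a</tex-math></inline-formula> is the set of items rated by the active user.</p>
        <p>The rank of item <inline-formula><tex-math>s_k</tex-math></inline-formula> in the prediction list, <inline-formula><tex-math>\mathrm{rank}(s_k)</tex-math></inline-formula>, is the position of item <inline-formula><tex-math>s_k</tex-math></inline-formula> in the
prediction list for active user <inline-formula><tex-math>u_a</tex-math></inline-formula>. Accordingly, we can define the recommendation list
<inline-formula><tex-math>\mathrm{Rec}(u_a)</tex-math></inline-formula> for the active user <inline-formula><tex-math>u_a</tex-math></inline-formula> as the set of items with the highest predicted ratings in <inline-formula><tex-math>\mathrm{Pred}(u_a)</tex-math></inline-formula>.</p>
        <p>Items with the highest predicted ratings are expected to be the most relevant, so the user is
likely to explore the ordered list starting from the top, hoping to find interesting
items.</p>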
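        <p>The mean-centered weighted sum can be sketched in Python as follows. This is a minimal sketch, assuming that a similarity weight between the active user and each neighbor is already available (how fuzzy distances are converted into weights is not specified here) and that the normalizing coefficient is the reciprocal of the sum of absolute weights; the names <monospace>Neighbor</monospace> and <monospace>predict_mean_centered</monospace> are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Optional, Tuple

# Hypothetical record for one neighbor who rated the target item:
# (similarity to the active user, neighbor's rating of the item,
#  neighbor's average rating).
Neighbor = Tuple[float, float, float]


def predict_mean_centered(active_mean: float,
                          neighbors: List[Neighbor]) -> Optional[float]:
    """Predict a rating as the active user's mean plus a weighted offset."""
    norm = sum(abs(sim) for sim, _, _ in neighbors)
    if norm == 0:
        return None  # no usable neighbor, so no prediction is made
    kappa = 1.0 / norm  # normalizing coefficient
    offset = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    return active_mean + kappa * offset


if __name__ == "__main__":
    # Two equally similar neighbors; one rated the item above its own mean.
    print(predict_mean_centered(3.0, [(0.5, 5.0, 4.0), (0.5, 4.0, 4.0)]))
```
]]></preformat>
        <p>Because each neighbor contributes only its deviation from its own average, a neighbor who rates everything high does not inflate the prediction for the active user.</p>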
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>• Split the users into two groups:
a) Group 1 (20 movies for building the user model).
b) Group 2 (40 movies for testing).
• Resulted in 497 eligible users providing 84,596 ratings.</p>
      <p>2. Random split generation:
• Created five random splits of training and active users.
• Each split involved selecting 50 active users and utilizing the remaining 447 users
as training users.
• These splits were labeled as split-1, split-2, ..., split-5 for subsequent
cross-validation.</p>
      <p>3. Cross-validation procedure:
• Conducted five-fold cross-validation, repeating experiments five times, once with
each split.
• Each split served as a distinct training and testing dataset.</p>
      <p>4. Training and testing phases:
• Training phase - utilized the set of training users (447 users) to find neighbors for
the active user.
• Testing phase - divided ratings of each active user randomly into two sets:
a) Training ratings (34%)
b) Test ratings (66%)
Training ratings were used to model the user, while test ratings remained unseen for prediction
evaluation.</p>
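        <p>The split generation and the per-user rating split described above could be reproduced along the following lines. This is a sketch under assumptions: the function names and seeds are hypothetical, each of the five splits is drawn independently (so active users may repeat across splits), and the 34% share is rounded to the nearest whole rating.</p>
        <preformat><![CDATA[
```python
import random
from typing import Dict, List, Tuple


def make_splits(user_ids: List[int], n_splits: int = 5,
                n_active: int = 50, seed: int = 0
                ) -> List[Tuple[List[int], List[int]]]:
    """Draw n_splits random (active users, training users) partitions."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_splits):
        active = set(rng.sample(user_ids, n_active))
        training = [u for u in user_ids if u not in active]
        splits.append((sorted(active), training))
    return splits


def split_ratings(ratings: Dict[str, float], train_fraction: float = 0.34,
                  seed: int = 0) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Randomly divide one active user's ratings into training and test sets."""
    rng = random.Random(seed)
    items = list(ratings)
    rng.shuffle(items)
    cut = round(len(items) * train_fraction)
    train = {i: ratings[i] for i in items[:cut]}
    test = {i: ratings[i] for i in items[cut:]}
    return train, test
```
]]></preformat>
        <p>With 497 eligible users, each call to <monospace>make_splits</monospace> yields five splits of 50 active and 447 training users, matching the counts reported above.</p>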
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>In this experiment, we run a recommendation system using fuzzy distance and compare
its results with classical systems (Pearson correlation and cosine similarity). The size of the
neighbor set is kept at 30 for all experiments, and the system is run over
the entire database of training users, even though this takes a long time. The difference between gender
(profession) values is 0 if both users have the same gender (profession) and 1 otherwise.
This is consistent with our reasoning for setting opposing values as far apart as possible. In
addition, age values are normalized so that they fall within the same
range as the GII values, i.e., [0, 5]: each age value is multiplied by (5/MAX), where MAX is the age of the oldest user in the
system and is no less than 60. The system selects movies from the set of test ratings
of the active user one by one and predicts a rating for each of them based on all
neighbors who rated the same movie. Once the predicted ratings are obtained, the system
compares them with the actual ratings provided by the active user. Figures 4-8 show the
percentage of correct predictions obtained for the fifty active users. Each graph shows the
number of ratings that the system predicted correctly as a percentage of the total number of available
test ratings for the active user.</p>
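        <p>The handling of the demographic features described above can be illustrated with a short sketch. The function names are hypothetical, and reading "no younger than 60" as a lower bound of 60 on MAX is an assumption about the intended meaning.</p>
        <preformat><![CDATA[
```python
def normalize_age(age: float, oldest_age: float) -> float:
    """Scale an age into [0, 5] by multiplying it by 5 / MAX.

    Assumption: MAX is the age of the oldest user, but never below 60.
    """
    max_age = max(oldest_age, 60.0)
    return age * 5.0 / max_age


def categorical_difference(a: str, b: str) -> int:
    """Difference for gender or profession: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1
```
]]></preformat>
        <p>For instance, if the oldest user is 73, a 73-year-old maps to 5.0, the top of the range, while equal professions contribute a difference of 0.</p>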
    </sec>
    <sec id="sec-6">
      <title>6. Discussions</title>
      <p>The results of the experiment conducted for each random split of 50 active users and 447 training
users are shown in Figures 4-8 and Tables 2-11, which present:
• a table with the results of calculating the Mean Absolute Error (MAE), which is a measure
of the accuracy of the recommendation system, and Coverage [14], which is a measure of
the percentage of items for which the recommendation system can provide predictions;
• a table comparing the developed algorithm with classical algorithms (Pearson correlation
and cosine similarity), where the comparison measure is the percentage of correctly
predicted ratings for movies from the test dataset for each of the 50 active users. The table
shows the number of users in each group: the Greater group contains users for whom the developed
algorithm has a higher percentage of correctly predicted ratings than the classical methods, the
Same group those with the same percentage, and the Smaller group those with a lower percentage;
• a graph showing the percentage of correctly predicted ratings using the implemented
algorithms for each user.</p>
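        <p>The three measures listed above can be computed per active user roughly as follows. The sketch assumes that a prediction counts as correct when it rounds to the actual rating, which the text does not state explicitly, and that items without a prediction are marked with <monospace>None</monospace>; the function name is hypothetical.</p>
        <preformat><![CDATA[
```python
from typing import Dict, Optional, Tuple


def evaluate(actual: Dict[str, float],
             predicted: Dict[str, Optional[float]]
             ) -> Tuple[Optional[float], float, float]:
    """Return (MAE, coverage %, correct predictions %) for one active user."""
    if not actual:
        raise ValueError("no test ratings to evaluate")
    covered = [i for i in actual if predicted.get(i) is not None]
    # Coverage: share of test items for which a prediction exists.
    coverage = 100.0 * len(covered) / len(actual)
    # MAE: mean absolute deviation over the covered items only.
    mae = None
    if covered:
        mae = sum(abs(predicted[i] - actual[i]) for i in covered) / len(covered)
    # Assumption: "correct" means the rounded prediction equals the rating.
    correct = sum(1 for i in covered if round(predicted[i]) == actual[i])
    pct_correct = 100.0 * correct / len(actual)
    return mae, coverage, pct_correct
```
]]></preformat>
        <p>The percentage of correct predictions is taken over all available test ratings, so an item without a prediction lowers it as well as the coverage.</p>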
      <p>Based on the obtained data, the following conclusions can be drawn:
• the Mean Absolute Error (MAE) of the developed algorithm is lower than that of the
classical approaches, i.e., the predictions generated by the recommendation system
deviate less from the true ratings given by the active user;
• the percentage of items for which the recommendation system can provide predictions
(Coverage) [15] remained at the same level, and in some cases even increased;
• in all 5 runs, the percentage of correctly predicted ratings was higher for most users.</p>
      <p>The higher percentage of correct predictions indicates that a better set of neighboring
users has been found for the active user, which increases the accuracy of the recommendation system.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions</title>
      <p>In this study, we proposed and evaluated a recommendation system that leverages fuzzy logic for
both user model formation and similarity computation. The research outcomes indicate several
key findings:</p>
      <p>Enhanced user modeling: By incorporating fuzzy logic into the user modeling process, we
were able to capture nuanced user preferences beyond simple ratings. The integration of fuzzy
sets for demographic attributes and genre interest indicator resulted in a more accurate
representation of user tastes.</p>
      <p>Improved similarity computation: The introduction of fuzzy distance metrics facilitated a
more robust comparison between user models, considering the partial degree of membership in
fuzzy sets. This approach addressed the limitations of traditional distance functions, particularly
in handling diverse and imprecise user features.</p>
      <p>Superior recommendation accuracy: Experimental results demonstrated that the
recommendation system utilizing fuzzy logic outperformed classical approaches, such as Pearson
correlation and cosine similarity. The system achieved lower Mean Absolute Error (MAE) and
higher prediction accuracy, indicating its effectiveness in providing personalized
recommendations.</p>
      <p>Stable coverage: Despite the introduction of fuzzy logic, the recommendation system
maintained stable coverage, ensuring that a wide range of items could be recommended to users.
This suggests that the proposed approach strikes a balance between accuracy and coverage,
essential for practical recommendation systems.</p>
      <p>In summary, the integration of fuzzy logic in user modeling and distance computation proved
to be a promising approach for enhancing recommendation systems' performance.</p>
    </sec>
    <sec id="sec-8">
      <title>8. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dutta</surname>
          </string-name>
          ,
          <article-title>A systematic review and research perspective on recommender systems</article-title>
          ,
          <source>Journal of Big Data</source>
          , Vol.
          <volume>9</volume>
          , Issue 59,
          <year>2022</year>
          . doi:
          <pub-id pub-id-type="doi">10.1186/s40537-022-00592-5</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Dahiya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Duhan</surname>
          </string-name>
          ,
          <article-title>Comparative Analysis of Various Collaborative Filtering Algorithms</article-title>
          ,
          <source>International Journal of Computer Sciences and Engineering</source>
          , Vol.
          <volume>7</volume>
          , Issue 8 (
          <year>2019</year>
          )
          <fpage>347</fpage>
          -
          <lpage>351</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.26438/ijcse/v7i8.347351</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Çano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Morisio</surname>
          </string-name>
          ,
          <article-title>Hybrid Recommender Systems: A Systematic Literature Review</article-title>
          ,
          <source>Intelligent Data Analysis</source>
          , Vol.
          <volume>21</volume>
          , Issue 6,
          <year>2017</year>
          , pp.
          <fpage>1487</fpage>
          -
          <lpage>1524</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.3233/IDA-163209</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fauzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Putra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Stephanie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Edbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Suhartono</surname>
          </string-name>
          ,
          <article-title>Hybrid Approaches for Customer Segmentation and Product Recommendation</article-title>
          ,
          <source>Proceedings of the International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>324</fpage>
          -
          <lpage>329</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1109/ICIMCIS60089.2023.10348619</pub-id>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>