RecTour 2019, September 19th, 2019, Copenhagen, Denmark.

A Framework for Recommender Systems Based on a Finite Multidimensional Model Space

Leonhard Seyfang, Research Unit of E-Commerce, TU Wien, Vienna, Austria, seyfang@ec.tuwien.ac.at
Julia Neidhardt, Research Unit of E-Commerce, TU Wien, Vienna, Austria, neidhardt@ec.tuwien.ac.at

ABSTRACT
In this conceptual paper we suggest a framework for flexible and efficient recommender systems. It is based on a unified finite multivariate model space for both users and products. Association functions map each entity to each model dimension fuzzily. Finally, distance and learning operations allow efficient operation. The main differences to existing approaches are the reduced model space and the fuzzy location of entities. The reduced model space is most advantageous where item features are inconsistently structured or sparse. The association function allows expressing a distribution of agreement, not just a single location.

CCS CONCEPTS
• Information systems → Personalization; Recommender systems; Collaborative search; Similarity measures.

KEYWORDS
recommendation, personalization, feature-based recommendation, similarity measurement, fuzzy mapping

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION
Tourism is for many reasons an interesting and challenging field for recommender systems: Travel experiences are complex and include various physical and mental aspects. Decisions are mainly based on subconscious, abstract ideas and the emotions attached to them. At the same time, hard constraints, such as the available time frame and budget, have to be met. Also, multiple persons are usually involved in the decision-finding process. Products are very diverse; they are often inconsistently and incompletely documented. More often than not, products themselves do not satisfy the tourist's need directly, but are prerequisites for the tourist's dreams to be fulfilled. With all these challenges in mind, we reach for a flexible, generic solution.

Generally, recommender systems aim to provide useful suggestions to their users. They use any combination of user, item, and context information. We suggest a recommendation framework that:
• Reduces the feature space to few interpretable (user-related) and manageable dimensions.
• Maps users, products, and other entities of interest to the model space.
• Treats the entity-dimension relationship fuzzily.
• Provides a heuristic to efficiently compute distances between entities.
• Provides self-learning procedures in near real-time.
• Is memory-efficient (is specified by as few parameters as possible).

2 CORE CONCEPTS
In this section we introduce the essential concepts in theory. Practical aspects are treated in sections 3 and 4.

2.1 Model Space
In this framework we use a multidimensional, finite model space. All entities, i.e., users, products, or whatever abstract or actual items are of interest, are fuzzily located in the very same model space. In most cases the number and interpretation of the dimensions will be defined domain-specifically. This can be done through domain knowledge or by dimension-reduction techniques such as factor analysis (see [3] for a related approach). The latter of course requires a suitable data corpus. For tourism, seven factors have already been identified [5], [4].

Alternatively, a generic, user-oriented data model can be used to obtain a cross-domain recommender system. For example, the Big Five personality traits [2] could be used straightforwardly as dimensions. For a comprehensive work on cross-domain recommendations see [1], and for thoughts on personality and recommender systems see [6].

2.2 Association Function
Association functions express the degree of accordance between entities and model dimensions. They are most comparable to membership functions in fuzzy logic but should not be confused with probability density functions. Dimensions are treated independently, so each entity has a separate association function for each dimension.

In our model space, we think of each dimension as a closed interval between 0 and 1. We believe that placing an entity on a single point on each dimension is an oversimplification. Instead it should be possible to express the spread of conformity over an adjustable range. Hence we were looking for a function that:
• Is defined on the closed interval [0, 1];
• Takes values between 0 and 1;
• Is continuous (sufficiently small changes in x result in arbitrarily small changes in f(x));
• Allows specifying location and dispersion independently of each other, hence takes (at least) two parameters.

We found that the association function defined in equation (1) fulfills all requirements above:

    f_{a,b}(x) = 1                                                    if a = b = 0
    f_{a,b}(x) = x^a (1 − x)^b / [ (a/(a+b))^a (1 − a/(a+b))^b ]      otherwise     (1)

f is fully specified by two real parameters a ≥ 0 and b ≥ 0. An alternative, more intuitively comprehensible parametrization is given by the location parameter µ ∈ [0, 1] and the precision parameter ρ ≥ 0. Both parametrizations can easily be converted into each other via (2), (3), (4), and (5). Examples of f are shown in figure 1.

    µ = a / (a + b),  for a + b > 0     (2)
    ρ = a + b                           (3)
    a = µρ                              (4)
    b = (1 − µ)ρ                        (5)

Figure 1: Two examples of association functions. Solid line: µ = 0.2, ρ = 10, a = 2, b = 8; dashed line: µ = 0.4, ρ = 4, a = 1.6, b = 2.4.

The value of f_{a,b}(x) is in [0, 1] for all valid a, b, and x ∈ [0, 1]. If a = 0 and b = 0, f(x) is constant 1. We call f_{0,0} the non-informative case. µ is not defined in the non-informative case and is not needed either. Note: f_{a,b} is proportional to the density of the beta distribution Beta(a + 1, b + 1), but density functions are scaled to an area of 1 while the association function is scaled to the range [0, 1]. Further, Beta(0.5, 0.5) is called the non-informative prior in the context of Bernoulli trials in Bayesian statistics. Our case f_{0,0} is not intended to possess the same non-informativeness and should not be confused with it. Realistically, ρ should not be too small, since f gets increasingly vague as ρ approaches 0. On the other hand, ρ should not be too large either, as that would suggest a non-existing precision.

There are several ways an entity can get its association functions:
(1) Per mapping algorithm: For products, or whatever entities are considered for recommendations, mapping functions can be defined. A mapping function translates the available feature description into an association function. Mapping algorithms can also be used for users: in [5] users are mapped according to pictures they have selected. A mapping based on demographic features is also possible.
(2) Manually: The graph of f can be used to set up an easy-to-use human interface. Using two sliders, one for the mode and one for the precision, one could alter the association function until the desired properties are reached. This option is favorable if no mapping algorithm exists. In cases where the recommendation is in the foreground, it might be attractive to offer a tool for user self-classification.
(3) Self-learning: Entities, typically users, can learn their position in the model space based on interaction with other entities, typically products, that have already been classified (see 2.4 for details).

The association function can also be used to retrieve item properties, particularly after a self-learning phase.

2.3 Distance
We define the distance d between two association functions as

    d(f_{a1,b1}, f_{a2,b2}) = 0                   if ρ1 = 0 or ρ2 = 0
    d(f_{a1,b1}, f_{a2,b2}) = 1 − f_{a1,b1}(x)    otherwise     (6)

where x is uniquely defined by the two properties (without loss of generality we assume from now on that µ1 ≤ µ2):

    µ1 ≤ x ≤ µ2                        (7)
    f_{a1,b1}(x) = f_{a2,b2}(x)        (8)

In words: x is the place between both modes where the two association functions intersect. d is 1 minus the value of f at x. The basic idea of d is illustrated in figure 2.

Figure 2: In the left example the association functions are very dissimilar, hence the distance d is large. In the right example the association functions are somewhat similar, so the distance is rather small.

Determining x requires numerical optimization, but a good approximation is given by d̂:

    d̂(f_{a1,b1}, f_{a2,b2}) = 0                                             if ρ1 = 0 or ρ2 = 0
    d̂(f_{a1,b1}, f_{a2,b2}) = 1 − [f_{a1,b1}(x̂) + f_{a2,b2}(x̂)] / 2        otherwise     (9)

with

    x̂ = (µ1 w1 + µ2 w2) / (w1 + w2)     (10)

and

    w1 = 1 + 0.4 (1 + s1) √ρ1     (11)
    w2 = 1 + 0.4 (1 − s2) √ρ2     (12)

where s (skewness) is defined as

    s = 2 (b − a) √(a + b + 3) / [ (a + b + 4) √((a + 1)(b + 1)) ]     (13)

The closed solution d̂ is easy to compute, and the deviation |d − d̂| is limited for a given range of ρ, e.g. |d − d̂| ≤ 0.039 under the reasonable assumption 0.5 ≤ ρ ≤ 10 (without proof).
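As an illustration, the association function (1), the parameter conversions (2)-(5), and the approximate distance (9)-(13) can be sketched in a few lines of Python. This is a minimal sketch; the function names are ours and are not part of the framework specification.

```python
import math

def assoc(a, b, x):
    """Association function f_{a,b}(x), equation (1)."""
    if a == 0 and b == 0:
        return 1.0                      # non-informative case f_{0,0}
    mode = a / (a + b)                  # mode of f, equal to mu
    return (x ** a * (1 - x) ** b) / (mode ** a * (1 - mode) ** b)

def to_mu_rho(a, b):
    """(a, b) -> (mu, rho), equations (2) and (3); requires a + b > 0."""
    return a / (a + b), a + b

def from_mu_rho(mu, rho):
    """(mu, rho) -> (a, b), equations (4) and (5)."""
    return mu * rho, (1 - mu) * rho

def skewness(a, b):
    """Skewness s of Beta(a + 1, b + 1), equation (13)."""
    return (2 * (b - a) * math.sqrt(a + b + 3)) / (
        (a + b + 4) * math.sqrt((a + 1) * (b + 1)))

def dist_hat(a1, b1, a2, b2):
    """Approximate distance d-hat, equations (9)-(12)."""
    rho1, rho2 = a1 + b1, a2 + b2
    if rho1 == 0 or rho2 == 0:
        return 0.0
    mu1, mu2 = a1 / rho1, a2 / rho2
    if mu1 > mu2:                       # enforce mu1 <= mu2 (w.l.o.g.)
        return dist_hat(a2, b2, a1, b1)
    w1 = 1 + 0.4 * (1 + skewness(a1, b1)) * math.sqrt(rho1)
    w2 = 1 + 0.4 * (1 - skewness(a2, b2)) * math.sqrt(rho2)
    x_hat = (mu1 * w1 + mu2 * w2) / (w1 + w2)
    return 1 - (assoc(a1, b1, x_hat) + assoc(a2, b2, x_hat)) / 2
```

For the two functions of figure 1 (a = 2, b = 8 and a = 1.6, b = 2.4), `dist_hat` yields a value strictly between 0 and 1; the exact d of equation (6) would additionally require numerically locating the intersection point x.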
Obvious properties of d are (also without proof):

    µ1 = µ2  ⇒  d(f_{a1,b1}, f_{a2,b2}) = 0                                (14)
    d(f_{µ1,ρ1}, f_{µ2,ρ2}) < d(f_{µ1,ρ1}, f_{µ2+ϵ,ρ2}),   ϵ > 0           (15)
    d(f_{µ1,ρ1}, f_{µ2,ρ2}) > d(f_{µ1,ρ1}, f_{µ2,ρ2+ϵ}),   ϵ > 0, µ1 ≠ µ2  (16)

The overall distance D between two entities is the weighted mean of the distances over all k dimensions:

    D = Σ_{i=1..k} d_i v_i     (17)

The weights v are chosen proportional to the importance of the corresponding dimension.

2.4 Learning Procedure
The learning procedure allows entities (usually users) to adapt their location in the model space according to their interaction with other entities (usually products). It is based on the merge operation.

The merge operation m translates an ordered set of association functions F into a single association function:

    F  →m  f_{a_new, b_new}     (18)

We assume that no element of F is the non-informative function (otherwise those elements are simply removed, as they do not hold information anyway). The cardinality of F (the number of elements in F) is denoted by n. The new parameter a_new is defined as

    a_new = 0                                   if n = 0
    a_new = a_1                                 if n = 1
    a_new = g(h(F)) · Σ_{i=1..n} (a_i w_i)      if n > 1     (19)

and b_new is defined accordingly.

Here w is a vector of weights associated with the elements of F, with Σ_{i=1..n} w_i = 1. h is a function that represents the dissimilarity of F. We currently use the weighted mean of all pairwise distances within F for h (see equation 20), but other definitions are certainly possible.

    h(F) = [ 1 / Σ_{i=1..n−1} Σ_{j=i+1..n} w_i w_j ] · Σ_{i=1..n−1} Σ_{j=i+1..n} d(f_i, f_j) w_i w_j     (20)

The function g transforms the result of h into a reasonable shrinking factor, such as

    g = (1 − h(F))^λ     (21)

where λ ≥ 0 is a tuning parameter. For larger lambdas the penalty for dissimilarity increases. If λ = 0 there is no shrinking at all. In this case a_new and b_new are simply the weighted averages of the input parameters (figure 3, left side). With a sufficient shrinkage factor, on the other hand, m acts more like a union operation (figure 3, right side). Note that shrinking refers to a and b and consequently to the precision ρ, whereas the spread of f works in the opposite direction. The merge operation is commutative but generally not associative:

    m(f_{a1,b1}, f_{a2,b2}) = m(f_{a2,b2}, f_{a1,b1})                              (22)
    m(m(f_{a1,b1}, f_{a2,b2}), f_{a3,b3}) ≠ m(f_{a1,b1}, m(f_{a2,b2}, f_{a3,b3}))  (23)

Figure 3: The tuning parameter λ controls the extent to which the dissimilarity h diminishes a_new and b_new. On the left, λ = 0 and the precision parameter of the resulting function is simply the average of the precision parameters of the input functions (dashed line). On the right, λ = 6 and the resulting function covers roughly the same area that is covered by the two input functions in conjunction.

3 USAGE
A standard application works as follows: The model space (the number and interpretation of the dimensions) would be determined based on expert knowledge or dimensionality-reduction methods or both. As mentioned earlier, seven factors have already been determined for the scope of tourism [5], [4].

Once the model space is specified, mappings from item descriptions to the model dimensions must be implemented (see section 2.2). In tourism, items are very diverse, including travel packages, hotels, flights, events, sights, natural phenomena, destinations, cities, forms of sport, and many others. Some of them are real products, meaning bookable; others are not. The latter are still important for recommender systems, as they serve as connections to actual products. Sometimes strong intangible aspects such as culture-dependent attributions or emotional concepts are involved. (The decision process might roughly be: honeymoon + love + Europe → city of love → Paris → hotel → room/suite, not right away the hotel room.)

Users obtain their profile in a self-learning way as they interact with items (or even other users). Depending on the particular domain and application, interactions can include book, buy, like, rate, comment, view, listen-to, read, search, compare, and other actions. Using the learning procedure from section 2.4, defined interactions modify the user's profile towards the items interacted with. Defining relevant interactions can be straightforward in some cases and sophisticated in others.

The initial association functions might be: the non-informative association function, the (dimension-wise) grand mean, or a contextual a-priori association function (for example based on known or estimated demographic characteristics).

The recommendation service itself calculates distances between users and products, sorts the results, and holds a list of the most appropriate items ready. Computations can be done on demand or in advance. Filters might be implemented additionally to meet the user's constraints. Implementing stochastic components can increase serendipity and diversity but destroys predictability and reproducibility.
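The learning procedure of section 2.4 can be sketched as follows. This is a minimal sketch, not the reference implementation: the function and parameter names are ours, and for brevity the pairwise distance is passed in as a callable, with a crude stand-in based on |µ1 − µ2| used in place of the distance d of section 2.3.

```python
def merge(funcs, weights, dist, lam=0.0):
    """Merge operation m, equations (19)-(21).

    funcs   : list of (a, b) parameter pairs, none of them (0, 0)
    weights : weights w_i summing to 1
    dist    : distance function on two (a, b) pairs
    lam     : tuning parameter lambda >= 0
    """
    n = len(funcs)
    if n == 0:
        return (0.0, 0.0)          # no information at all
    if n == 1:
        return funcs[0]            # nothing to merge
    # h(F): weighted mean of all pairwise distances, equation (20)
    num = den = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            wij = weights[i] * weights[j]
            num += dist(funcs[i], funcs[j]) * wij
            den += wij
    h = num / den
    g = (1.0 - h) ** lam           # shrinking factor g, equation (21)
    a_new = g * sum(a * w for (a, _), w in zip(funcs, weights))
    b_new = g * sum(b * w for (_, b), w in zip(funcs, weights))
    return (a_new, b_new)

def mu_distance(f1, f2):
    """Stand-in distance: absolute difference of the location parameters."""
    (a1, b1), (a2, b2) = f1, f2
    return abs(a1 / (a1 + b1) - a2 / (a2 + b2))

# lam = 0: plain weighted average of the parameters; lam > 0: dissimilar
# inputs shrink a and b, lowering the precision rho = a + b and thereby
# widening the merged association function.
merged = merge([(2, 8), (1.6, 2.4)], [0.5, 0.5], mu_distance, lam=6)
```

Any implementation of d, exact or approximate, can be plugged in for `dist`; the shrinking behavior controlled by λ is independent of that choice.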
4 WORKED EXAMPLE
For a simple example we assume that we have a travel recommendation system with two dimensions, Action and Culture, both equally important, meaning equally weighted.

Our user is inclined towards exciting activities as long as they are not too extreme (figure 4, left column). The user is not really interested in culture (figure 4, right column).

We have three items to suggest: a skydiving holiday, a city trip to Rome, and a sailboat cruise in the Mediterranean. The skydiving holiday is about as exciting as it gets, with virtually no cultural options (figure 4, first row). The city trip to Rome offers ample cultural sights, but besides that it is not terribly exciting (figure 4, second row). Finally, the sailboat cruise is exciting at times (although not as thrilling as skydiving), and the old Mediterranean cities also provide the opportunity to get in touch with old cultures (figure 4, bottom row).

Figure 4: Example with two dimensions (columns) and three products (rows). The filled shapes display the user's preferences, the dashed lines indicate the product properties. (The per-dimension distances shown in the figure are d = 0.35 and 0.36 for the skydiving holiday, d = 0.37 and 0.54 for the city trip to Rome, and d = 0.01 and 0.29 for the sailboat cruise.)

In this toy example, the Mediterranean sailboat cruise would clearly be the best recommendation according to our distance measure D (see equation 17), followed by the skydiving holiday. However, if we had used the location parameter µ in conjunction with the Euclidean distance or the Manhattan distance, the skydiving holiday would have appeared to be the closest to the user. The reason for this divergence is the different spread of the associations.

In table 1 all user-item distances are presented, according to Euclidean, Manhattan, and D distance. Figure 5 illustrates the locations of all items in R².

Table 1: Euclidean, Manhattan, and D distance for all items.

    Item             Euclidean   Manhattan   D
    Skydiving        0.35        0.50        0.36
    Rome             0.86        1.20        0.45
    Sailboat Cruise  0.49        0.55        0.15

Figure 5: All item locations in R² (Action axis from Relaxation to Excitement, Culture axis from Little to Rich). The dotted lines are drawn approximately at f = 0.75 to indicate the spread along both dimensions.

5 DISCUSSION
The framework presented here offers interesting possibilities, as it is flexible, possibly cross-domain, and self-learning, and the entity-dimension membership relation is easy to understand. It has no cold-start problem with new items, and it is not necessary to match a user to other similar users. It can serve as a basis for multivariate outlier detection and for cluster analysis. Deviations in the product and user distributions can be revealed as a side effect.

However, this approach comes with two downsides: Firstly, the dimensions of the model space must be defined in advance and are hard to modify in a running system. Hence setting up the model space is the crucial task. Secondly, the mapping from the original feature space to the model dimensions must be implemented. Manual input is simple but time-consuming, thus expensive with large quantities. The next steps will be the utilization in an operating recommender system and measuring and reporting the performance, ideally in comparison with an established system.

REFERENCES
[1] Iván Cantador, Ignacio Fernández-Tobías, Shlomo Berkovsky, and Paolo Cremonesi. 2015. Recommender Systems Handbook (2 ed.). Springer, New York Heidelberg Dordrecht London, Chapter 27, 919–959.
[2] Oliver P. John and Sanjay Srivastava. 2008. Handbook of Personality: Theory and Research (3 ed.). The Guilford Press, New York, Chapter 4, 114–158.
[3] Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 42 (08 2009), 30–37.
[4] Julia Neidhardt, Rainer Schuster, Leonhard Seyfang, and Hannes Werthner. 2014. Eliciting the Users' Unknown Preferences. In Proceedings of the 8th ACM Conference on Recommender Systems (RecSys '14). ACM, New York, NY, USA, 309–312.
[5] Julia Neidhardt, Leonhard Seyfang, Rainer Schuster, and Hannes Werthner. 2015. A picture-based approach to recommender systems. Information Technology & Tourism 15-1 (2015), 49–69.
[6] Marko Tkalcic and Li Chen. 2015. Recommender Systems Handbook (2 ed.). Springer, New York Heidelberg Dordrecht London, Chapter 21, 715–739.