=Paper=
{{Paper
|id=Vol-2665/paper12
|storemode=property
|title=A method of user preference elicitation by pairwise comparisons
|pdfUrl=https://ceur-ws.org/Vol-2665/paper12.pdf
|volume=Vol-2665
|authors=Alexander Borodinov,Vladislav Myasnikov
}}
==A method of user preference elicitation by pairwise comparisons ==
A Method of User Preference Elicitation by Pairwise Comparisons Alexander Borodinov Vladislav Myasnikov Geoinformatics and Information Security Department, Geoinformatics and Information Security Department, Samara National Research University, Samara National Research University, Samara, Russia Samara, Russia aaborodinov@yandex.ru vmyas@geosamara.ru Abstract—In this paper, we consider the problem of II. RELATED WORK reconstructing functions defined implicitly by the results of pairwise comparisons. In the proposed approach, we apply an There is a large research community focused on adaptive transformation to the high-dimensional space. Then we recommendation systems with a wide range of tasks. classify the comparisons using linear or non-linear classifiers. In Historically, most approaches are based on collaborative this work, we consider linear regression and random forest as filtering approaches. For example, forecasting ratings for classification algorithms. In experimental analysis, we compare streaming services such as Netflix [8]. Pairwise methods for different methods of transformation to the high-dimensional comparing user preferences are most often used in search space and investigate the effectiveness of the proposed method. engines [9]. Recommender systems based on information about transitions between sites and products in online stores are Keywords—utility function, preference function, another large area of research [10,11]. One of the new preferences elicitation, pairwise comparisons, machine approaches was the use of neural networks to improve the learning accuracy of recommendations [12]. One of the young and I. INTRODUCTION underdeveloped areas are transport recommender systems [13]. In our work, we consider the method of pairwise comparisons, The method of pairwise comparisons is one of the methods which has not been used before for the construction of used in recommendation systems. Analyzing pairwise transport recommender systems. comparisons, we try to determine some pattern in the choice of the preferred option. The method of pairwise comparisons uses III. PROBLEM STATEMENT information about comparing pairs of objects, in contrast to the classical methods of machine learning, which use data about a Let the objects set j j J have an order « » and/or specific object [1-4]. The task of providing recommendations a strict partial order « ». The equivalent notation is i j for a particular user is the task of preference elicitation. and j i . In the case i j j i , the objects are Three main types of tasks are specified according to indistinguishable and 𝜔𝑖 ≪ 𝜔𝑗 . Absolute preference is different types of objects and classes [3,5]: characterized by utility function u : R , and relative - label ranking – search for preferred ordering among labels preference is described by preference function p : R . for any example. The traditional classification problem can be For utility function u i u j denote as i j , generalized as part of the label ranking problem when the classification result of the example is a label of the highest u i u j i j and 𝑢(𝜔𝑖 ) = 𝑢(𝜔𝑗 ) ⇔ 𝜔𝑖 ≪ 𝜔𝑗 . rank; For preference function p i , j 0 denote as i j and - instance ranking – ranking a set of examples for a fixed label order; 𝑝(𝜔𝑖 , 𝜔𝑗 ) = 0 ⇔ 𝜔𝑖 ≪ 𝜔𝑗 . The preference function has restrictions based on the properties of the corresponding order - object ranking – similar to ranking examples, however, relations such as asymmetry in argument, transitivity, etc. labels are not associated with examples. A preference function can be defined through a utility In this paper, we consider the task of ranking objects where the objects may be the transport routes proposed by the function p j , i u j u i and recommender system [6,7], and the preferences are the routes selected by the user. In the second section of the paper, we u j p j , * u p , 0 . * * * briefly describe the existing approaches to the construction of recommender systems. In the third section, we give the Objects are defined by the feature vector x x X of problem formulation and problem statement. The fourth N-dimensional space. The utility and preference function will section describes the method of pairwise comparisons. The be written as p x , x j , u x . Let x j x j and fifth section shows the results of experimental studies. At the end of the work, conclusions and possible directions for further p ij p i , j , u j u j to shorten the record. research are presented. Information about pairwise comparisons can be presented in Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0) Image Processing and Earth Remote Sensing the form of values of the preference function p j , i or in where s is a numerical non-negative parameter. Thus, we have the following estimate of the preferences between the objects: the form of a symbolic representation: с1 0 с1 0 1, p j , i 0, 1 0 ˆ s ln ln 1 . с 1 0 с 0 1 с1 0 с 0 1 z ij z j , i 0 , p j , i 0, . The analytic hierarchy process (AHP) is used for the 1, p j , i 0. multicriteria ranking of objects that are defined by features. In this case, at the initial step matrices m ijn i , j J , n 0 , N 1 The choice of a specific route from the route list proposed by the system is an example of information on paired are calculated. Each element of the matrix is the result of a user comparisons in a transport recommendation system. response regarding the preferences of the i-th object over the j- th objects according to the n-th criterion. The resulting utility The number of incorrectly reconstructed relationships, the N 1 Kendall distance for pairwise comparisons, is a criterion for the of the objects as a scalar product u j w n v nj , where reconstruction quality of the preference and utility function: n0 v , T n n n (i, j ) : z , z x , x , (i, j ) I , is the right eigenvector of the preference v , v J 1 d i j i j 0 matrix, and w – eigenvector of the matrix of alternatives. The This value in the normalized form is an estimate of the main problem of this method is a large number of pairwise 1 comparisons. Therefore, in practice, they often use a model corresponding relation errors probability d d I . N 1 u x w n v n x n , based on a generalized additive IV. METHOD n0 model. A. Pairwise Comparison Method Pair comparison methods were initially used to range B. Proposed method description objects that cannot be described by a feature vector. Each The following features should be considered when element of the matrix c ij is the absolute frequency of the i-th reconstructing the utility function and preference function: object over j-th [14]. To analyze such data, the Thurstone - reconstruction of functions is practically impossible with model was proposed [15], in which it is assumed that the utility a small amount of information or in its absence, as in the case of an object is determined by a normally distributed random of a system cold start; variable. Thus, for objects 0 ,1 we get - it must be able to automatically transform to nonlinear fu u j j , j models. Classes will be separable almost surely when using 2 , which with transformation the original features space to a new feature u 1 u 0 1 0 , 10 , 10 1 0 2 10 1 0 2 2 2 2 space Y with a higher dimension; and the Laplace function we get: - the regression task of reconstructing the utility function can be reduced to the classification problem by reconstructing 1 0 the symbolic representation. P 1 0 P u 1 u 0 0 . 10 The method of function reconstruction by their symbolic representation contains the following steps: In the numerical estimation of the probability (5) as the relative frequency of the corresponding preferences calculated – feature values normalization in the range [0,1]; using the matrix c ij , we have the following estimation: – selection of a new feature space (basis) Y; – transformation of the original feature vector x into the с1 0 1 0 ˆ 1 0 1 . new feature space Y with a higher dimension K=dim(Y) N; с1 0 с 0 1 – building a linear or nonlinear classifier in the feature The simplified Thurstone model assumes the absence of space Y; correlation and equal variances in the utility function, which – quality assessment of the building classifier on the test can be represented as: 12 02 0 .5 , 1 0 0 , 120 1 . dataset. Another featureless method is the Bradley-Terry model. In the case when the evaluation of the preference function [1]. Estimating the probability (5) in the following form: is unsatisfactory, go to the selection of a new basis and transformation of the feature space. 1 P 1 0 , j exp j s , The described steps are presented as a diagram in Fig. 1. 1 0 VI International Conference on "Information Technology and Nanotechnology" (ITNT-2020) 51 Image Processing and Earth Remote Sensing N 1 K K n K n K 0 K 0 N d im Y d im X N , Simulation Proposed method n0 N 1 synthesis y k ( x n ), ( xn ) 2 n ( 2 x i), k j bas is Bas es r epository n0 i 0 , 1, ..., 2 1, j 0 , 1, ..., lo g 2 N 1, j analysis KS Object synthesis basis coe ffic ients N 1 synthesis Transformation n 0 , N 1; k K 0 kn. n model dimension KA =N KA:=KA+∆ to a new analysis space n0 objects and relations D. Machine Learning Methods Initial data: Machine learning In this paper, we use logistic regression and random forest methods objects and relations when testing the proposed approach. UF PF test train tuned Logistic regression solves the binary classification problem method using a linear dividing hyperplane: The estimate of the analysis d x w x wN . T relations model quality Classification Regression methods: methods: frequency accuracy (error) Classifier parameters for a particular training set x j , r j j J estimate are determined from the condition: Is the error no J w ln 1 e x p r j d x j m in , j J satisfactory? where r j 1,1 – is a random variable of the correct classification that determines the true class of the Results corresponding j-th object. Fig. 1. The scheme of the proposed approach. A random forest is a voting method implementation of C. Bases Repository several tree classifiers. A random forest avoids retraining, unlike a decision tree. Each tree is built independently of the In this paper, we consider the following bases for rest on a random subset of the training set. The components of transformations : X Y : the feature vector are selected from a random subset of features x yx for each partition when learning trees. – Original basis: User decisions may be erroneous, especially with a small K d im Y d im X N , yn xn , n 0, N 1 ; difference in the proposed alternatives. Therefore, in this work we use the Thurstone’s model with the probability estimation – Polynomial basis: to add errors in the ideal preferences. For the case j i : N 1 K K n K n K 0 K 0 N d im Y d im X N , z , , j i rnd P u j u i 0 , n0 z ij z j , i , o th e r w is e . N 1 N 1 yk xnn , k K 0 kn; k n 0 , N 1; n n0 n0 where rnd