<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PrefWork - a framework for testing methods for user preference learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alan Eckhardt</string-name>
          <email>eckhardt@ksi.mff.cuni.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Software Engineering, Charles University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Computer Science, Czech Academy of Sciences</institution>
          ,
          <addr-line>Prague</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2009</year>
      </pub-date>
      <fpage>7</fpage>
      <lpage>13</lpage>
      <abstract>
        <p>PrefWork is a framework for testing methods of induction of user preferences, and it is thoroughly described in this paper. A reader willing to use PrefWork finds here all the necessary information - sample code, configuration files and results of the testing are presented. Related approaches to data mining testing are compared to our approach. To the best of our knowledge, there is no software available specifically for testing methods for preference learning.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Related work</title>
      <p>The most popular tool related to PrefWork is the open source project Weka [1]. Weka has been in development for many years and has become the most widely used tool for data mining. It offers many classifiers, regression methods, clustering, data preprocessing, etc. However, this variability is also its weakness - it can be used for any given task, but it has to be customised, and the developer has to choose from a very wide range of possibilities. For our case, Weka is too strong.</p>
      <p>We also need some implementations of user preference learning algorithms that are publicly available, in order to be able to compare various methods among themselves. This is a strength of PrefWork - any existing method that works with ratings can be integrated into PrefWork using a special adaptor for each tool (see Section 4.3). There is a somewhat old implementation of collaborative filtering, Cofi [10], and a brand new one (released 7.4.2009), Mahout [11], developed by the Apache Lucene project. Cofi uses the Taste framework [12], which became a part of Mahout. The expectation is that Taste in Mahout will perform better than Cofi, so we will try to migrate our PrefWork adaptor for Cofi to Mahout.</p>
      <p>IGAP [13] is a tool for learning of fuzzy logic programs in the form of rules, which correspond to user preferences. Unfortunately, IGAP is not yet publicly available for download.</p>
      <p>We did not find any other mining algorithm specialised on user preferences available for free download, but we often use the already mentioned Weka. It is a powerful tool that can be more or less easily integrated into our framework and provides a reasonable comparison of a non-specialised data mining algorithm to other methods that are specialised for preference learning.</p>
    </sec>
    <sec id="sec-2">
      <title>User model</title>
      <p>To make this article self-contained, we briefly describe our user model in this section, following [14]. The model is based on a scoring function that assigns a score to every object. A user rating of objects is a fuzzy subset of X (the set of all objects), i.e. a function R(o) : X → [0, 1], where 0 means the least preferred and 1 the most preferred object. Our scoring function is divided into two steps.</p>
      <p>Local preferences. In the first step, which we call local preferences, all attribute values of object o are normalised using fuzzy sets fi : DAi → [0, 1]. These fuzzy sets are also called objectives or preferences over attributes. With this transformation, the original space of objects' attributes X = DA1 × ... × DAN is transformed into X' = [0, 1]^N. Moreover, we know that the object o ∈ X' with transformed attribute values equal to [1, ..., 1] is the most preferred object; it probably does not exist in the real world, though. On the other side, the object with values [0, ..., 0] is the least preferred, which is more probable to be found in reality.</p>
      <p>Global preferences. In the second step, called global preferences, the normalised attribute values are aggregated into the overall score of the object using an aggregation function @ : [0, 1]^N → [0, 1]. The aggregation function is also often called a utility function. It may have different forms; one of the most common is a weighted average, as in the following formula:
@(o) = (2 ∗ fPrice(o) + 1 ∗ fDisplay(o) + 3 ∗ fHDD(o) + 1 ∗ fRAM(o)) / 7,
where fA is the fuzzy set for the normalisation of attribute A.</p>
      <p>Another, totally different, approach was proposed in [15]. It uses the training dataset as a partitioning of the normalised space X'. For example, if we have an object with normalised values [0.4, 0.2, 0.5] with rating 3, any object with better attribute values (e.g. [0.5, 0.4, 0.7]) is supposed to have a rating of at least 3. In this way, we can find the highest lower bound on the rating of any object with an unknown rating. In [15], a method was also proposed for the interpolation of ratings between the objects with known ratings, even using the ideal (non-existent) virtual object with normalised values [1, ..., 1] with rating 6.</p>
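      <p>To make the two-step model concrete, here is a minimal sketch in Java. It is our illustrative code, not part of PrefWork; the concrete fuzzy sets and their thresholds are made-up assumptions, and the aggregation mirrors the weighted-average formula above:</p>
      <p>public class TwoStepScore {
    // Local preferences: fuzzy sets normalising each attribute into [0, 1].
    static double fPrice(double czk)    { return clip((60000 - czk) / 60000); } // cheaper is better
    static double fDisplay(double inch) { return clip((inch - 10) / 7); }       // larger is better
    static double fHDD(double gb)       { return clip(gb / 500); }
    static double fRAM(double mb)       { return clip(mb / 4096); }
    static double clip(double x)        { return Math.max(0, Math.min(1, x)); }

    // Global preferences: the weighted-average aggregation @ : [0,1]^N -> [0,1].
    static double score(double price, double display, double hdd, double ram) {
        return (2 * fPrice(price) + 1 * fDisplay(display) + 3 * fHDD(hdd) + 1 * fRAM(ram)) / 7;
    }

    public static void main(String[] args) {
        // The overall score of one notebook.
        System.out.println(score(25000, 15.4, 250, 2048));
    }
}</p>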
    </sec>
    <sec id="sec-3">
      <title>PrefWork</title>
      <p>Our tool PrefWork was initially developed as the master thesis of Tomáš Dvořák [16], who implemented it in Python. In this initial implementation, only Id3 decision trees and collaborative filtering were implemented. For better ease of use, and also to make the integration of other methods possible, PrefWork was later rewritten in Java by the author. Many more possibilities have been added since, up to its present state. In the following sections, the components of PrefWork are described.</p>
      <p>Most of the components can be configured by XML configurations. Samples of these configurations and of the Java interfaces will be provided for each component. We omit the configuration methods from the Java interfaces, such as configTest(configuration, section), which configures a test using a given section of an XML configuration file. The data types of function arguments are also omitted for brevity.</p>
      <sec id="sec-3-1">
        <title>The workflow</title>
        <p>In this section, a sample workflow with PrefWork is described.</p>
        <p>The structure of PrefWork is shown in Figure 1. There are four different configuration files - one for database access configuration (confDbs), one for datasources (confDatasources), one for methods (confMethods) and finally one for PrefWork runs (confRuns). A run consists of three components - a set of methods, a set of datasets and a set of ways to test the method. Every method is tested on every dataset using every way to test. For each case, the results of the testing are written into a csv file.</p>
        <p>Fig. 1. The structure of PrefWork. The configuration files confDbs, confDatasources, confMethods and confRuns define a datasource (backed by a database or a csv file), which divides the data into train data and test data for the inductive method; the predicted ratings pass through the test and the results interpreter into a results CSV file.</p>
        <p>A typical situation a researcher working with PrefWork finds himself in is: "I have a new idea X. I am really interested in how it performs on that dataset Y." The first thing is to create a corresponding Java class X that implements interface InductiveMethod (see 4.3) and to add a section X to confMethods.xml (a sketch of such an entry closes this subsection). Then copy an existing entry defining a run (e.g. IFSA, see 4.5) and add method X to its section methods. Run ConfigurationParser and correct all errors in the new class (and there will be some, for sure). After the run has finished correctly, process the csv file with the results to see how X performed in comparison with the other methods.</p>
        <p>A similar case is introducing a new dataset into PrefWork - confDatasets.xml and confDBs.xml have to be edited if the data are in an SQL database or in a csv file. Otherwise, a new Java class (see 4.2) able to handle the new type of data has to be created. For example, we still have not implemented a class for handling arff files - these files carry the definition of attributes in themselves, so the configuration in confDatasets.xml would be much simpler (see Section 4.2 for an example of a configuration of a datasource with its attributes).</p>
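        <p>For illustration, a new entry in confMethods.xml might look as follows. This is a hypothetical sketch only, modelled on the Statistical configuration from Section 4.3 - every element besides class is method-specific:</p>
        <p>&lt;X&gt;
    &lt;class&gt;X&lt;/class&gt;
    &lt;!-- method-specific settings, e.g. normalizers, go here --&gt;
&lt;/X&gt;</p>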
      </sec>
      <sec id="sec-3-2">
        <title>Datasource</title>
        <p>Datasource is, as the name hints, the source of data for the inductive methods. Currently, we are working only with ratings of objects. Data are vectors, where the first three attributes typically are the user id, the object id and the rating of the object; the attributes of the object follow. There is a special column that contains a random number associated to each rating - its purpose will be described later.</p>
        <p>Every datasource has to implement the following methods:
interface BasicDataSource {
    boolean hasNextRecord();
    void setFixedUserId(value);
    List&lt;Object&gt; getRecord();
    Attribute[] getAttributes();
    Integer getUserId();
    void setLimit(from, to, recordsFromRange);
    void restart();
    void restartUserId();
}</p>
        <p>There are two main attributes of a datasource - a list of all users and a list of ratings of the current user. getUserId returns the id of the current user. The most important function is getRecord, which returns a vector containing the rating of the object and its attributes. Successive calls of getRecord return all objects rated by the current user. A typical sequence is:
int userId = data.getUserId();
data.setFixedUserId(userId);
data.restart();
while (data.hasNextRecord()) {
    List&lt;Object&gt; record = data.getRecord();
    // Work with the record
    ...
}</p>
        <p>Another important function is setLimit, which limits the data using the given boundaries from and to. The random number associated to each vector returned by getRecord has to fit into this interval. If recordsFromRange is false, the random number has to lie outside of the given interval instead. This method is used when dividing the data into training and testing sets. For example, let us divide the data into an 80% training set and a 20% testing set. First, we call setLimit(0.0, 0.8, true) and let the method train on these data. Then, setLimit(0.0, 0.8, false) is executed and the vectors returned by the datasource are used for the testing of the method.</p>
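        <p>Combining setLimit with the record loop above, one pass of training and testing over a user's ratings might look like this - an illustrative sketch against the interface above; the inductive method itself is described in Section 4.3:</p>
        <p>// Sketch: 80% of the records for training, the remaining 20% for testing.
data.setFixedUserId(userId);
data.setLimit(0.0, 0.8, true);  // records whose random number falls into [0, 0.8]
data.restart();
method.buildModel(data, userId);

data.setLimit(0.0, 0.8, false); // the complementary 20% of the records
data.restart();
while (data.hasNextRecord()) {
    List&lt;Object&gt; record = data.getRecord();
    // ask the method for a predicted rating and store it for the interpreter
}</p>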
        <p>Let us show a sample configuration of a datasource that returns data about notebooks:</p>
        <p>&lt;NotebooksIFSA&gt;
    &lt;attributes&gt;
        &lt;attribute&gt;&lt;name&gt;userid&lt;/name&gt;&lt;type&gt;numerical&lt;/type&gt;&lt;/attribute&gt;
        &lt;attribute&gt;&lt;name&gt;notebookid&lt;/name&gt;&lt;type&gt;numerical&lt;/type&gt;&lt;/attribute&gt;
        &lt;attribute&gt;&lt;name&gt;rating&lt;/name&gt;&lt;type&gt;numerical&lt;/type&gt;&lt;/attribute&gt;
        &lt;attribute&gt;&lt;name&gt;price&lt;/name&gt;&lt;type&gt;numerical&lt;/type&gt;&lt;/attribute&gt;
        &lt;attribute&gt;&lt;name&gt;producer&lt;/name&gt;&lt;type&gt;nominal&lt;/type&gt;&lt;/attribute&gt;
        &lt;attribute&gt;&lt;name&gt;ram&lt;/name&gt;&lt;type&gt;numerical&lt;/type&gt;&lt;/attribute&gt;
        &lt;attribute&gt;&lt;name&gt;hdd&lt;/name&gt;&lt;type&gt;numerical&lt;/type&gt;&lt;/attribute&gt;
    &lt;/attributes&gt;
    &lt;recordsTable&gt;note_ifsa&lt;/recordsTable&gt;
    &lt;randomColumn&gt;randomize&lt;/randomColumn&gt;
    &lt;userID&gt;userid&lt;/userID&gt;
    &lt;usersSelect&gt;select distinct userid from note_ifsa&lt;/usersSelect&gt;
&lt;/NotebooksIFSA&gt;</p>
        <p>First, a set of attributes is defined. Every attribute has a name and a type - numerical, nominal or list.</p>
        <p>An example of a list attribute is the actors of a film; this attribute can be found in the IMDb dataset [17].</p>
        <p>Let us also note the select for obtaining the user ids (section usersSelect) and the name of the column that contains the random number used in setLimit (randomColumn).</p>
        <p>Other types of user preferences. PrefWork as it is now supports only ratings of objects. There are many more types of data containing user preferences - user clickstream, user profile, filtering of the result set etc. PrefWork also does not work with any information about the user, either demographic (like age, sex, place of birth, occupation etc.) or his behaviour. These types of information may bring a large improvement in prediction accuracy, but they are typically not present - users do not want to share personal information for the sole purpose of a better recommendation. Another issue is the complexity of such user information; semantic processing would have to be used.</p>
      </sec>
      <sec id="sec-3-3">
        <title>4.3 Inductive method</title>
        <p>InductiveMethod is the most important interface - it is what we want to evaluate. An inductive method has two main methods:
interface InductiveMethod {
    int buildModel(trainingDataset, userId);
    Double classifyRecord(record, targetAttribute);
}</p>
        <p>buildModel uses the training dataset and the userId for the construction of a user preference model. After the model has been constructed, the method is tested - it is given records via the method classifyRecord and is supposed to evaluate them.</p>
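        <p>As a concrete illustration, a trivial inductive method in the spirit of the Mean method mentioned below could look as follows. This is a sketch only - the real PrefWork signatures, with their full types and configuration methods, are abbreviated here as in the rest of the paper, and the assumed column layout follows Section 4.2:</p>
        <p>// Sketch: predicts the user's average training rating for every record.
public class MeanMethod implements InductiveMethod {
    private double mean = 0;

    public int buildModel(BasicDataSource trainingDataset, Integer userId) {
        double sum = 0;
        int count = 0;
        trainingDataset.setFixedUserId(userId);
        trainingDataset.restart();
        while (trainingDataset.hasNextRecord()) {
            List&lt;Object&gt; record = trainingDataset.getRecord();
            sum += ((Number) record.get(2)).doubleValue(); // the rating is the third column
            count++;
        }
        mean = (count &gt; 0) ? sum / count : 0;
        return count; // the number of training records used
    }

    public Double classifyRecord(List&lt;Object&gt; record, Attribute targetAttribute) {
        return mean; // the same prediction for every object
    }
}</p>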
        <p>Various inductive methods have been implemented. Among the most interesting are our methods Statistical ([18, 15]) and Instances ([15]); WekaBridge, which allows using any method from Weka (such as a support vector machine); and ILPBridge, which transforms the data into a Prolog program and then uses Progol [19] to create the user model. CofiBridge allows using Cofi as a PrefWork InductiveMethod.</p>
        <p>A sample configuration of method Statistical is:
&lt;Statistical&gt;
    &lt;class&gt;Statistical&lt;/class&gt;
    &lt;rater&gt;
        &lt;class&gt;WeightAverage&lt;/class&gt;
        &lt;weights&gt;VARIANCE&lt;/weights&gt;
    &lt;/rater&gt;
    &lt;representant&gt;
        &lt;class&gt;AvgRepresentant&lt;/class&gt;
    &lt;/representant&gt;
    &lt;numericalNormalizer&gt;Linear&lt;/numericalNormalizer&gt;
    &lt;nominalNormalizer&gt;RepresentantNormalizer&lt;/nominalNormalizer&gt;
    &lt;listNormalizer&gt;ListNormalizer&lt;/listNormalizer&gt;
&lt;/Statistical&gt;</p>
        <p>Every method requires a different configuration; only the name of the class is obligatory. Note that the methods based on our two-step user model (Statistical and Instances for now) can be easily configured to test different heuristics for the processing of different types of attributes. The configuration contains three sections - numericalNormalizer, nominalNormalizer and listNormalizer - for the specification of the method for the particular type of attribute. Also see Section 4.5 for an example of this configuration.</p>
        <p>4.4 Ways of the testing of the method</p>
        <p>Several possible ways for the testing of methods can be defined; the division into training and testing sets is the most typically used. The method is trained on the training set (using buildModel) and then tested on the testing set (using classifyRecord). Another typical method is k-fold cross validation, which divides the data into k sets; in each of the k runs, one set is used as the testing set and the rest as the training set (a sketch follows at the end of this subsection).
interface Test {
    void test(method, trainDataSource, testDataSource);
}</p>
        <p>When the method is tested, the results in the form userid, objectid, predictedRating, realUserRating have to be processed. The interpretation is done by a TestResultsInterpreter. The most common is DataMiningStatistics, which computes such measures as correlation, RMSE, weighted RMSE, MAE, Kendall rank tau coefficient, etc. Others are still waiting to be implemented - ROC curves or precision-recall statistics.
abstract class TestInterpreter {
    abstract void writeTestResults(testResults);
}</p>
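        <p>As an illustration of k-fold cross validation built on the datasource interface from Section 4.2, consider the following sketch. It is our illustrative code, not PrefWork's actual Test implementation; method, data, userId and ratingAttribute are assumed to be set up as in the previous sections:</p>
        <p>// Sketch: k-fold cross validation on top of setLimit.
int k = 5;
for (int i = 0; i &lt; k; i++) {
    double from = (double) i / k;
    double to = (double) (i + 1) / k;
    data.setLimit(from, to, false); // train on records outside the fold
    data.restart();
    method.buildModel(data, userId);
    data.setLimit(from, to, true);  // test on the fold itself
    data.restart();
    while (data.hasNextRecord()) {
        Double predicted = method.classifyRecord(data.getRecord(), ratingAttribute);
        // compare predicted with the real rating and pass both to the interpreter
    }
}</p>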
        <p>4.5 Configuration parser</p>
        <p>The main class is called ConfigurationParser. The definition of one test follows:
&lt;IFSA&gt;
    &lt;methods&gt;
        &lt;method&gt;
            &lt;name&gt;Statistical&lt;/name&gt;
            &lt;numericalNormalizer&gt;Standard2CPNormalizer&lt;/numericalNormalizer&gt;
        &lt;/method&gt;
        &lt;method&gt;&lt;name&gt;Statistical&lt;/name&gt;&lt;/method&gt;
        &lt;method&gt;&lt;name&gt;Mean&lt;/name&gt;&lt;/method&gt;
        &lt;method&gt;&lt;name&gt;SVM&lt;/name&gt;&lt;/method&gt;
    &lt;/methods&gt;
    &lt;dbs&gt;
        &lt;db&gt;
            &lt;name&gt;MySQL&lt;/name&gt;
            &lt;datasources&gt;NotebooksIFSA&lt;/datasources&gt;
        &lt;/db&gt;
    &lt;/dbs&gt;
    &lt;tests&gt;
        &lt;test&gt;
            &lt;class&gt;TestTrain&lt;/class&gt;
            &lt;ratio&gt;0.05&lt;/ratio&gt;
            &lt;path&gt;resultsIFSA&lt;/path&gt;
            &lt;testInterpreter&gt;
                &lt;class&gt;DataMiningStatistics&lt;/class&gt;
            &lt;/testInterpreter&gt;
        &lt;/test&gt;
        &lt;test&gt;
            &lt;class&gt;TestTrain&lt;/class&gt;
            &lt;ratio&gt;0.1&lt;/ratio&gt;
            &lt;path&gt;resultsIFSA&lt;/path&gt;
            &lt;testInterpreter&gt;
                &lt;class&gt;DataMiningStatistics&lt;/class&gt;
            &lt;/testInterpreter&gt;
        &lt;/test&gt;
    &lt;/tests&gt;
&lt;/IFSA&gt;</p>
        <p>First, we have specified which methods are to be tested - in our case two variants of Statistical, then Mean and SVM. Note that some attributes of Statistical, which was defined in confMethods, can be "overridden" here; the basic configuration of Statistical is in Section 4.3. Then the datasource for the testing of the methods is specified - we are using a MySql database with the datasource NotebooksIFSA. Several datasources or databases can be specified here. Finally, the ways of the testing and the interpretation are given in section tests. TestTrain requires the ratio of the training and the testing sets, the path where the results are to be written, and the interpretation of the test results.</p>
        <p>date;Ratio;dataset;method;userId;mae;rmse;weightedRmse;monotonicity;tau;weightedTau;correlation;buildTime;testTime;countTrain;countTest;countUnableToPredict
28.4.2009 12:18;0,05;NotebooksIFSA;Statistical,StandardNorm2CP;1;0,855;0,081;1,323;1,442;0,443;0,358;0,535;94;47;10;188;0
28.4.2009 12:18;0,05;NotebooksIFSA;Statistical,StandardNorm2CP;1;0,868;0,078;1,216;1,456;0,323;0,138;0,501;32;0;13;185;0
28.4.2009 12:18;0,05;NotebooksIFSA;Statistical,StandardNorm2CP;1;0,934;0,083;1,058;1,873;0,067;0,404;0,128;31;16;12;186;0
28.4.2009 12:31;0,025;NotebooksIFSA;Statistical,Peak;1;0,946;0,081;1,161;1,750;0,124;0,016;0,074;15;16;4;194;0
28.4.2009 12:31;0,025;NotebooksIFSA;Statistical,Peak;1;0,844;0,076;1,218;1,591;0,224;0,215;0,433;0;16;6;192;0
28.4.2009 12:31;0,025;NotebooksIFSA;Statistical,Peak;1;1,426;0,123;1,407;1,886;0,024;0,208;-0,063;16;0;4;194;0</p>
        <p>Fig. 2. A sample of the resulting csv file.</p>
        <p>The definitions of runs are in confRuns.xml in section runs. The specification of the run to be executed is in section run of the same file. A sample of the resulting csv file is shown in Figure 2. In our example, there are three runs of method Statistical with the normaliser StandardNorm2CP and three runs with the normaliser Peak. The runs were performed with different settings of the training and the testing sets, so the results differ even for the same method.</p>
        <p>The results contain all the information required for the generation of a graph or a table. The csv format was chosen for its simplicity and wide acceptance, so almost any other software can handle it. We are currently using Microsoft Excel and its pivot table, which allows the aggregation of results by different criteria. Among the other possibilities is also the already mentioned R [3].</p>
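        <p>The results file can also be processed programmatically. The following is a small illustrative sketch (our own example, not a PrefWork component) that averages the tau column per method in plain Java; note the semicolon separator and the decimal commas visible in Figure 2, and the assumed file name resultsIFSA.csv:</p>
        <p>import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

public class TauSummary {
    public static void main(String[] args) throws Exception {
        // method name -> { sum of tau values, number of runs }
        Map&lt;String, double[]&gt; sums = new HashMap&lt;String, double[]&gt;();
        BufferedReader in = new BufferedReader(new FileReader("resultsIFSA.csv"));
        in.readLine(); // skip the header line
        String line;
        while ((line = in.readLine()) != null) {
            String[] cols = line.split(";");
            if (cols.length &lt; 10) continue; // skip malformed lines
            String method = cols[3];
            double tau = Double.parseDouble(cols[9].replace(',', '.')); // decimal comma
            double[] s = sums.get(method);
            if (s == null) { s = new double[2]; sums.put(method, s); }
            s[0] += tau;
            s[1]++;
        }
        in.close();
        for (Map.Entry&lt;String, double[]&gt; e : sums.entrySet())
            System.out.println(e.getKey() + ": " + e.getValue()[0] / e.getValue()[1]);
    }
}</p>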
        <p>Example figures of the output of PrefWork are in Figures 3 and 4. The lines represent the different methods, the X axis represents the size of the training set, and the Y axis the value of the error function. In Figure 3 the error function is the Kendall rank tau coefficient (the higher, the better) and in Figure 4 it is RMSE weighted by the original rating (the lower, the better).</p>
        <p>The error function can be chosen, as described in Section 4.4.</p>
        <p>It is impossible to compare PrefWork to another framework in general; a simple comparison to other such systems is in Section 2. This can be done only qualitatively - there is no attribute of such frameworks that can be quantified. Users have to choose among them the one that suits their needs the most.</p>
      </sec>
      <sec id="sec-3-4">
        <title>External dependencies</title>
        <p>PrefWork depends on some external libraries. Two of them are sources of inductive methods - Weka [1] and Cofi [10]. Cofi also requires taste.jar.</p>
        <p>Fig. 3. Average of Tau coefficient over the training set size (2, 5, 10, 15, 20, 40, 75) for the methods weka,SVM; Mean; Statistical, Linear regression; Statistical,2CP-regression; weka,MultilayerPerceptron.</p>
        <p>Fig. 4. Average of Weighted RMSE over the training set size (2, 5, 10, 15, 20, 40, 75) for the methods weka,SVM; Mean; Statistical, Linear regression; Statistical,2CP-regression; weka,MultilayerPerceptron.</p>
        <p>PrefWork requires the following jars to function correctly:
Weka: weka.jar
Cofi: cofi.jar, taste.jar
Logging: log4j.jar
CSV parsing: opencsv-1.8.jar
Configuration: commons-configuration-1.5.jar, commons-lang-2.4.jar
MySql: mysql-connector-java-5.1.5-bin.jar
Oracle: ojdbc14-10.2.0.3.jar</p>
        <p>Tab. 1. Libraries required by PrefWork.</p>
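        <p>For completeness, launching a run from the command line could look roughly as follows. This is an illustrative sketch - the distribution is an Eclipse project, so the compiled-classes directory (bin) and the unqualified main-class name on the classpath are assumptions:</p>
        <p>java -cp bin:weka.jar:cofi.jar:taste.jar:log4j.jar:opencsv-1.8.jar:commons-configuration-1.5.jar:commons-lang-2.4.jar:mysql-connector-java-5.1.5-bin.jar ConfigurationParser</p>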
        <p>5 Conclusion</p>
        <p>PrefWork has been presented in this paper with a thorough explanation and description of every component. An interested reader should now be able to install PrefWork, run it, and implement a new inductive method or a new datasource.</p>
        <p>The software can be downloaded at http://www.ksi.mff.cuni.cz/~eckhardt/PrefWork.zip as an Eclipse project containing all Java sources and all required libraries, or it can be obtained as an SVN checkout at [20]. The SVN archive contains the Java sources and sample configuration files.</p>
        <p>5.1 Future work</p>
        <p>We plan to introduce the time dimension to PrefWork. The Netflix [21] dataset uses a timestamp for each rating. This will enable studying the evolution of preferences in time, which is a challenging problem. However, the integration of the time dimension into PrefWork can be done in several ways and the right one is yet to be chosen.</p>
        <p>Allowing other sources of data apart from ratings is a major issue. Clickthrough data can be collected without any effort from the user and can be substantially larger in volume than the number of ratings. But their integration into PrefWork would require a large reorganisation of the existing methods.</p>
        <p>The work on this paper was supported by Czech projects MSM 0021620838, 1ET 100300517 and GACR 201/09/H057.</p>
        <p>References</p>
        <p>In Eliassi-Rad, T., eds.: KDD'06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, New York, NY, USA, ACM (August 2006), 935–940.
3. R-project. http://www.r-project.org/.
4. SAS Enterprise Miner. http://www.sas.com/.
5. SPSS Clementine. http://www.spss.com/software/modeling/modeler/.
6. Š. Pero, T. Horváth: Winston: A data mining assistant. In: To appear in proceedings of RDM 2009, 2009.
7. P. Viappiani, B. Faltings: Implementing example-based tools for preference-based search. In: ICWE'06: Proceedings of the 6th international conference on Web engineering, New York, NY, USA, ACM, 2006, 89–90.
8. P. Viappiani, P. Pu, B. Faltings: Preference-based search with adaptive recommendations. AI Commun. 21, 2-3, 2008, 155–175.
9. S. Holland, M. Ester, W. Kiessling: Preference mining: A novel approach on mining user preferences for personalized applications. In: Knowledge Discovery in Databases: PKDD 2003, Springer Berlin / Heidelberg, 2003, 204–216.
10. Cofi: A Java-Based Collaborative Filtering Library. http://www.nongnu.org/cofi/.
11. Apache Mahout project. http://lucene.apache.org/mahout/.
12. Taste project. http://taste.sourceforge.net/old.html.
13. T. Horváth, P. Vojtáš: Induction of fuzzy and annotated logic programs. In Muggleton, S., Tamaddoni-Nezhad, A., Otero, R., eds.: ILP06 - Revised Selected Papers on Inductive Logic Programming. Number 4455 in Lecture Notes in Computer Science, Springer Verlag, 2007, 260–274.
14. A. Eckhardt: Various aspects of user preference learning and recommender systems. In Richta, K., Pokorný, J., Snášel, V., eds.: DATESO 2009. CEUR Workshop Proceedings, Česká technika - nakladatelství ČVUT, 2009, 56–67.
15. A. Eckhardt, P. Vojtáš: Considering data-mining techniques in user preference learning. In: 2008 International Workshop on Web Information Retrieval Support Systems, 2008, 33–36.
16. T. Dvořák: Induction of user preferences in semantic web, in Czech. Master thesis, Charles University, Czech Republic, 2008.
17. The Internet Movie Database. http://www.imdb.com/.
18. A. Eckhardt: Inductive models of user preferences for semantic web. In Pokorný, J., Snášel, V., Richta, K., eds.: DATESO 2007. Volume 235 of CEUR Workshop Proceedings, Matfyz Press, Praha, 2007, 108–119.
19. S. Muggleton: Learning from positive data. 1997, 358–376.
20. PrefWork - a framework for testing methods for user preference learning. http://code.google.com/p/prefwork/.
21. Netflix dataset. http://www.netflixprize.com.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>