<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Addressing the New User Problem with a Personality Based User Similarity Measure</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Marko</forename><surname>Tkalčič</surname></persName>
							<email>marko.tkalcic@fe.uni-lj.si</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of electrical engineering</orgName>
								<orgName type="institution">University of Ljubljana</orgName>
								<address>
									<addrLine>Tržaška 25</addrLine>
									<postCode>1000</postCode>
									<settlement>Ljubljana</settlement>
									<country>Slovenia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Matevž</forename><surname>Kunaver</surname></persName>
							<email>matevz.kunaver@fe.uni-lj.si</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of electrical engineering</orgName>
								<orgName type="institution">University of Ljubljana</orgName>
								<address>
									<addrLine>Tržaška 25</addrLine>
									<postCode>1000</postCode>
									<settlement>Ljubljana</settlement>
									<country>Slovenia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Andrej</forename><surname>Košir</surname></persName>
							<email>andrej.kosir@fe.uni-lj.si</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of electrical engineering</orgName>
								<orgName type="institution">University of Ljubljana</orgName>
								<address>
									<addrLine>Tržaška 25</addrLine>
									<postCode>1000</postCode>
									<settlement>Ljubljana</settlement>
									<country>Slovenia</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jurij</forename><surname>Tasič</surname></persName>
							<email>jurij.tasic@fe.uni-lj.si</email>
							<affiliation key="aff0">
								<orgName type="department">Faculty of electrical engineering</orgName>
								<orgName type="institution">University of Ljubljana</orgName>
								<address>
									<addrLine>Tržaška 25</addrLine>
									<postCode>1000</postCode>
									<settlement>Ljubljana</settlement>
									<country>Slovenia</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Addressing the New User Problem with a Personality Based User Similarity Measure</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">D30CE45D9F488396C4F7F805CC0C3970</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T05:13+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>memory based collaborative recommender system</term>
					<term>new user problem</term>
					<term>personality based user similarity measure</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The new user problem is a recurring problem in memory based collaborative recommender systems (MBCR). It occurs when a new user is added to the system and there is not enough information to make a good selection of the user's neighbours. As a consequence, the recommended items correlate poorly with the user's interests. We addressed the new user problem by observing the user similarity measure (USM). In this paper we present two novelties that address the new user problem: (i) the usage of a personality based USM to alleviate the new user problem and (ii) a method for establishing the boundary of the cold start period. We successfully used a personality based USM that yielded significantly better recommender performance in the period where the new user problem occurs. Furthermore, we present a new methodology for assessing the boundary of the period where the new user problem occurs.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>The new user problem is an important issue in memory based collaborative recommender systems <ref type="bibr" target="#b0">[Adomavicius and Tuzhilin, 2005]</ref>. It occurs when a new user joins the system and there are no (or too few) overlapping ratings to calculate good estimates of user similarities with rating-based user similarity measures (USM). We will denote this initial period as the cold start period (CSP). The consequences of being in the CSP are poor rating predictions for unseen items and thus poor quality of the recommender system. Usually, the new user problem (NUP) has been addressed by introducing content-based approaches, which resulted in hybrid systems <ref type="bibr" target="#b0">[Adomavicius and Tuzhilin, 2005</ref><ref type="bibr" target="#b1">, Ahn, 2008]</ref>. Once the system has enough overlapping items it is no longer in the CSP and a rating based USM can be used.</p><p>We introduced a personality-based USM using the five factor model (FFM) in <ref type="bibr" target="#b7">Tkalčič et al. [2009]</ref>. The same approach was later used by <ref type="bibr" target="#b2">Hu and Pu [2010]</ref> for the NUP in a music recommender system. In this paper we present (i) the results of the proposed USM in a CF recommender system for images and (ii) a methodology for assessing the boundary of the cold start period.</p><p>The proposed approach of using a personality-based USM for the NUP allows us to calculate user similarities immediately, without waiting for the user to rate several items. The underlying assumption for choosing personality as the basis for the proposed user similarity measure is that people with similar personalities have similar tastes for products. 
In psychology, personality is described as a set of factors that account for the majority of between-user variance in emotive, interpersonal, experiential and attitudinal styles <ref type="bibr" target="#b3">[John and Srivastava, 1999]</ref>.</p><p>The second novelty is a statistical method for determining the point at which the new user problem stops occurring. A review of the literature showed that authors either (i) did not set limits for the CSP <ref type="bibr" target="#b6">[Schein et al., 2002]</ref> or (ii) provided limits without further argumentation, e.g. <ref type="bibr" target="#b5">Massa and Bhattacharjee [2004]</ref> defined cold start users as the users who have expressed fewer than 5 ratings. We propose to determine the boundary of the CSP with a statistical approach, as the number of ratings at which the recommender's performance stops being significantly lower than the performance with a higher number of ratings given by the user.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">The new user problem</head><p>The new user problem in collaborative filtering recommenders is described as the period from the moment when a user joins the system to the moment when there are enough ratings to yield a stable list of neighbours (i.e. users with similar preferences). We synthesized this description from various sources <ref type="bibr" target="#b0">[Adomavicius and Tuzhilin, 2005</ref><ref type="bibr" target="#b6">, Schein et al., 2002</ref><ref type="bibr" target="#b1">, Ahn, 2008]</ref>. To the best of the authors' knowledge no formal definition of the new user problem period is available.</p><p>In this section we define the boundary of the CSP. Consider a user u joining the system. The user starts using the system and gives ratings r(u, h) to items h ∈ H where H ⊂ {h_1, …, h_J}, a set of J items. At any given moment the user has given n ratings to n different items, which yields the set</p><formula xml:id="formula_0">R_u^n = \{ r_u^1, \ldots, r_u^n \} \quad (1)</formula><p>The boundary of the new user problem period (the CSP) for the selected user is the number of ratings N_u^{CS} after which the system starts to yield stable sets of users. The consequence of a stable set of users is a stable confusion matrix of recommended items. We define the confusion matrix as stable if a sequence of F-measure values has statistically equivalent means at different n.</p><p>We choose the F measure as a scalar measure of the confusion matrix. We denote the F measure when n ratings have been used to calculate neighbours as F_n. We define the CSP boundary as the point N where the means of F values of the sets</p><formula xml:id="formula_1">R_u^{NJ} = \{ F_N, \ldots, F_J \}, \qquad R_u^{(N-1)J} = \{ F_{N-1}, \ldots, F_J \} \quad (2)</formula><p>are significantly different.</p><p>In <ref type="bibr" target="#b7">Tkalčič et al. [2009]</ref> we presented a user similarity measure that takes two vectors b_i = (b_{i1}, …, b_{i5}) and b_j = (b_{j1}, …, b_{j5}) containing the personality values of two users u_i and u_j and yields the scalar similarity value</p><formula xml:id="formula_2">d_W(b_i, b_j) = \sum_{l=1}^{5} w_l \, (b_{il} - b_{jl})^2 \quad (3)</formula><p>We use the proposed user similarity measure to calculate similarities in the CSP.</p></div>
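<div xmlns="http://www.tei-c.org/ns/1.0"><p>The personality based measure of Eq. 3 can be illustrated with a minimal Python sketch. It follows the equation as it appears in this text (a weighted sum of squared differences over the five factors). The function names, the user identifiers and the neighbour-ranking helper are assumptions made for illustration only, and the weights are left as a caller-supplied parameter because their values are not given here.</p>
<p><code lang="python"><![CDATA[
```python
from typing import Dict, List, Sequence


def personality_distance(b_i: Sequence[float],
                         b_j: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Weighted sum of squared differences between two personality vectors (Eq. 3)."""
    if not (len(b_i) == len(b_j) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, b_i, b_j))


def nearest_neighbours(target: Sequence[float],
                       others: Dict[str, Sequence[float]],
                       weights: Sequence[float],
                       k: int) -> List[str]:
    """Hypothetical helper: ids of the k users with the smallest distance to the target."""
    ranked = sorted(
        others,
        key=lambda uid: personality_distance(target, others[uid], weights),
    )
    return ranked[:k]
```
]]></code></p>
<p>Because the measure depends only on the personality vectors, the neighbour list in this sketch can be computed before the new user has rated a single item, which is the property exploited in the CSP.</p></div>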
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Experiment</head><p>The goal of the experiment was to (i) assess the CSP boundary and (ii) determine whether the personality-based USM performs better in the CSP. We conducted two experiments with the CF recommender system: (i) one with the personality based USM and (ii) one with the rating based USM.</p><p>We used the LDOS-PerAff-1 <ref type="bibr" target="#b8">[Tkalčič et al., 2010]</ref> dataset, which contained all data necessary to carry out our experiments. The dataset provided the usage history (i.e. the log of users' interactions) of 52 users consuming 70 content items (a subset of images from the IAPS<ref type="foot" target="#foot_0">1</ref> dataset) and giving explicit ratings to each item. The users' task was to assess the images as candidates for their computer's wallpaper. The users' personality vectors b were assessed using the IPIP50<ref type="foot" target="#foot_1">2</ref> questionnaire. We used the personality based USM as defined in Eq. 3 to calculate the distances between the users. We calculated the predicted ratings based on the neighbours' ratings using the adjusted Pearson's coefficient as defined in <ref type="bibr" target="#b4">Kunaver et al. [2007]</ref>. We then compared the predicted ratings with the ground truth ratings, which yielded the confusion matrix.</p><p>We calculated the rating based USM d(u_i, u_j) between two arbitrary users u_i and u_j based on their respective ratings e(u, h) of the overlapping items h_m, where m is the index of the items that both users have rated</p><formula xml:id="formula_3">d(u_i, u_j) = \sum_{m} \left( e(u_i, h_m) - e(u_j, h_m) \right)^2 \quad (4)</formula><p>The dataset used in our experiments had a full ratings-items table without missing values with I users and K ratings per user. To simulate the CSP we determined a usage history path in the form of a random sequence of ratings, for each user separately. 
We iterated through cold start stages s from one (the user has given only one rating) to K (the user has rated all items) for each user separately. At each stage 1 ≤ s ≤ K we performed the recommender procedure and calculated the confusion matrix for the observed user u at the observed stage s. We chose the F measure as the performance measure of the recommender system. The experimental procedure thus yielded a table of F values at different stages s ∈ {1, …, K} and for each user u ∈ U.</p></div>
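<div xmlns="http://www.tei-c.org/ns/1.0"><p>The rating based measure of Eq. 4 and the simulated usage history path can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in the experiment: ratings are assumed to be stored as dictionaries keyed by item id, the function names are hypothetical, and the random path is drawn with a seeded generator so that a run can be repeated.</p>
<p><code lang="python"><![CDATA[
```python
import random
from typing import Dict, Iterator, Optional, Tuple


def rating_distance(ratings_i: Dict[str, float],
                    ratings_j: Dict[str, float]) -> Optional[float]:
    """Sum of squared rating differences over the items both users have rated (Eq. 4).

    Returns None when there is no overlap, which is the situation that
    defines the new user problem for rating based measures.
    """
    overlap = ratings_i.keys() & ratings_j.keys()
    if not overlap:
        return None
    return sum((ratings_i[h] - ratings_j[h]) ** 2 for h in overlap)


def cold_start_stages(full_ratings: Dict[str, float],
                      seed: int = 0) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Yield (s, ratings known at stage s) along one random usage history path.

    Stage s exposes the first s ratings of the shuffled path, from s = 1
    (one rating) up to s = K (all items rated).
    """
    rng = random.Random(seed)      # assumption: seeded for repeatability
    path = list(full_ratings)      # item ids in insertion order
    rng.shuffle(path)              # random sequence of ratings for this user
    for s in range(1, len(path) + 1):
        yield s, {h: full_ratings[h] for h in path[:s]}
```
]]></code></p>
<p>At each stage the partial rating dictionary of the observed user would be compared against the full ratings of the other users with rating_distance, and the resulting neighbours would feed the rating prediction step described above.</p></div>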
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Evaluation methodology</head><p>We compared the performance of the rating based USM and the personality based USM by testing the hypothesis H_0: µ_R = µ_B5 at different cold start stages using the t-test. The value µ_R represents the mean F value using the rating based USM and µ_B5 represents the mean F value using the personality based USM.</p><p>We determined the position of the CSP boundary by testing the hypothesis H_0: µ_s = µ_{s−K}, where µ_s represents the mean F value at stage s and µ_{s−K} represents the mean F value from stages s+1 to K, where K is the last observed cold start stage.</p></div>
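<div xmlns="http://www.tei-c.org/ns/1.0"><p>The boundary test described above can be sketched in Python with the standard library only. The text does not state which variant of the t-test was used, so the sketch assumes Welch's unequal-variance form. The function names and the layout of the F table (one row per user, one column per stage) are likewise assumptions. The sketch returns the test statistic and the degrees of freedom; converting them to a p value requires a t-distribution, for example via scipy.stats.ttest_ind with equal_var=False.</p>
<p><code lang="python"><![CDATA[
```python
from math import sqrt
from statistics import mean, variance
from typing import Dict, List, Sequence, Tuple


def welch_t(sample_a: Sequence[float],
            sample_b: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and degrees of freedom for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    q_a = variance(sample_a) / n_a   # squared standard error of mean a
    q_b = variance(sample_b) / n_b   # squared standard error of mean b
    if q_a + q_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(q_a + q_b)
    df = (q_a + q_b) ** 2 / (q_a ** 2 / (n_a - 1) + q_b ** 2 / (n_b - 1))
    return t, df


def boundary_statistics(f_table: List[List[float]]) -> Dict[int, Tuple[float, float]]:
    """Compare F values at stage s with F values at stages s+1..K, for s = 1..K-1.

    f_table[u][s - 1] is assumed to hold the F value of user u at stage s.
    """
    k = len(f_table[0])
    results = {}
    for s in range(1, k):
        at_stage = [row[s - 1] for row in f_table]
        later = [value for row in f_table for value in row[s:]]
        results[s] = welch_t(at_stage, later)
    return results
```
]]></code></p>
<p>Under this reading, the CSP boundary is the smallest stage s from which the null hypothesis of equal means is no longer rejected at the chosen significance level.</p></div>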
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Results</head><p>When searching for the CSP boundary we calculated the p values, which are shown in Fig. <ref type="figure" target="#fig_0">1</ref>. On the dataset used we observed that p &lt; 0.05 occurs when the cold start stage is s &lt; 6.</p><p>We analyzed the CSP by graphing the quality rate of the recommender (F) versus the number of ratings used. At each cold start stage s the F measures for each user were calculated.</p><p>The results of the t-test showed that, on the dataset used, the personality based USM yields a significantly higher mean of F values than the rating based USM when the number of ratings taken into account for the calculation of the neighbours is lower than 50 (see Fig. <ref type="figure">2</ref>). When the number of ratings is higher than 50 the means of F values for both similarity measures are not significantly different at α = 0.05.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Discussion and conclusion</head><p>Experimental results showed that the personality based USM performs significantly better than the rating based USM in cold start conditions. A further positive outcome is that, when more than 50 ratings are available, the personality based USM is statistically equivalent to the rating based USM, which makes it a good candidate for a complete replacement of the rating based USM.</p><p>However, the results presented here were verified only on the specific dataset and we have no grounds to conclude that the presented approach is also useful in other domains. We do speculate that hedonistic-content domains would benefit from the presented approach but this should be verified in future work.</p><p>The main drawback of the personality based USM is the difficulty of acquiring end users' personality parameters. There are two main obstacles: (i) it is annoying for the end user to fill in questionnaires and (ii) the acquisition of personality data raises ethical and privacy issues that need to be addressed first. The progress beyond the state of the art here is the knowledge that personality does account for between-user variance in entertainment applications.</p><p>In the absence of existing methodologies for assessing the boundaries of the new user problem we chose a statistical approach. We acknowledge that further investigations should be conducted to determine how to test for the CSP boundary and that these investigations might conclude that a different approach is more suitable.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Fig. 1.</head><label>1</label><figDesc>Fig. 1. p values of the t-test for the CSP boundary. On the dataset used the CSP occurs when s &lt; 6.</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://csea.phhp.ufl.edu/media.html</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://ipip.ori.org/New_IPIP-50-item-scale.htm</note>
		</body>
		<back>
			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We provided a methodology for the assessment of the new user boundary. The results presented should not be taken for granted and several repetitions of the procedure should be carried out on different datasets.</p><p>In this paper we have evaluated a personality based USM under cold start conditions. The results showed that the personality based USM performed significantly better than the rating based USM. Furthermore we described a methodology for the assessment of the CSP border. Both novelties are important in the field of memory based collaborative filtering recommender systems and should be further explored.</p></div>			</div>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions</title>
		<author>
			<persName><forename type="first">G</forename><surname>Adomavicius</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tuzhilin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Knowledge and Data Engineering</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="734" to="749" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">J</forename><surname>Ahn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Sciences</title>
		<imprint>
			<biblScope unit="volume">178</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="37" to="51" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Using Personality Information in Collaborative Filtering for New Users</title>
		<author>
			<persName><forename type="first">R</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Recommender Systems and the Social Web</title>
				<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page">17</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The big five trait taxonomy: History, measurement, and theoretical perspectives</title>
		<author>
			<persName><forename type="first">Oliver</forename><forename type="middle">P</forename><surname>John</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sanjay</forename><surname>Srivastava</surname></persName>
		</author>
		<ptr target="http://www.uoregon.edu/~sanjay/pubs/bigfive.pdf" />
	</analytic>
	<monogr>
		<title level="m">Handbook of Personality: Theory and Research</title>
				<editor>
			<persName><forename type="first">Lawrence</forename><forename type="middle">A</forename><surname>Pervin</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Oliver</forename><forename type="middle">P</forename><surname>John</surname></persName>
		</editor>
		<meeting><address><addrLine>New York</addrLine></address></meeting>
		<imprint>
			<publisher>Guilford Press</publisher>
			<date type="published" when="1999">1999</date>
			<biblScope unit="page" from="102" to="138" />
		</imprint>
	</monogr>
	<note>second edition</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Optimisation of combined collaborative recommender systems</title>
		<author>
			<persName><forename type="first">Matevž</forename><surname>Kunaver</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Tomaž</forename><surname>Požrl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Matevž</forename><surname>Pogačnik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jurij</forename><surname>Tasič</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Electronic Communications</title>
		<imprint>
			<biblScope unit="volume">61</biblScope>
			<biblScope unit="page" from="433" to="443" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Using Trust in Recommender Systems: An Experimental Analysis</title>
		<author>
			<persName><forename type="first">P</forename><surname>Massa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bhattacharjee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Trust management: second international conference, iTrust 2004</title>
				<meeting><address><addrLine>Oxford, UK</addrLine></address></meeting>
		<imprint>
			<publisher>Springer-Verlag New York Inc</publisher>
			<date type="published" when="2004-04-01">March 29-April 1, 2004</date>
			<biblScope unit="page">221</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Methods and metrics for cold-start recommendations</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">I</forename><surname>Schein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Popescul</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">H</forename><surname>Ungar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Pennock</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval</title>
				<meeting>the 25th annual international ACM SIGIR conference on Research and development in information retrieval</meeting>
		<imprint>
			<publisher>ACM</publisher>
			<date type="published" when="2002">2002</date>
			<biblScope unit="page" from="253" to="260" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Personality based user similarity measure for a collaborative recommender system</title>
		<author>
			<persName><forename type="first">Marko</forename><surname>Tkalčič</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Matevž</forename><surname>Kunaver</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jurij</forename><surname>Tasič</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrej</forename><surname>Košir</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th Workshop on Emotion in Human-Computer Interaction -Real world challenges</title>
				<editor>
			<persName><forename type="first">C</forename><surname>Peter</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Crane</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Axelrod</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Agius</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Afzal</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Balaam</surname></persName>
		</editor>
		<meeting>the 5th Workshop on Emotion in Human-Computer Interaction -Real world challenges</meeting>
		<imprint>
			<publisher>Fraunhofer Verlag</publisher>
			<date type="published" when="2009-09">September 2009</date>
			<biblScope unit="page" from="30" to="37" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">The LDOS-PerAff-1 Corpus of Face Video Clips with Affective and Personality Metadata</title>
		<author>
			<persName><forename type="first">Marko</forename><surname>Tkalčič</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Jurij</forename><surname>Tasič</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Andrej</forename><surname>Košir</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the LREC 2010 Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality</title>
				<editor>
			<persName><forename type="first">Michael</forename><surname>Kipp</surname></persName>
		</editor>
		<meeting>the LREC 2010 Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality</meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
