Classifeye: Classification of Personal Characteristics Based on Eye Tracking Data in a Recommender System Interface

Martijn Millecamp (a), Cristina Conati (b) and Katrien Verbert (a)
(a) Department of Computer Science, KU Leuven, Celestijnenlaan 200A bus 2402, Leuven, Belgium
(b) Department of Computer Science, ICICS/CS 107, 2366 Main Mall, Vancouver, BC, Canada

HUMANIZE: Joint Proceedings of the ACM IUI 2021 Workshops, April 13–17, 2021, College Station, USA
Email: martijn.millecamp@kuleuven.be (M. Millecamp); conati@cs.ubc.ca (C. Conati); katrien.verbert@kuleuven.be (K. Verbert)
ORCID: 0000-0002-5542-0067 (M. Millecamp); 0000-0002-8434-9335 (C. Conati); 0000-0001-6699-7710 (K. Verbert)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

Abstract
Due to the increasing importance of recommender systems in our lives, the call to make these systems more transparent becomes louder. However, providing explanations is not as easy as it seems, as research has shown that different users have varying reactions to explanations. So not only the recommendations, but also the explanations should be personalised. As a first step towards these personalised explanations, we explore the possibility to classify users based on their gaze pattern during the interaction with a music recommender system. More specifically, we classify three personal characteristics that have been shown to play a role in the interaction with music recommendations: need for cognition, openness and musical sophistication. Our results show that classification based on eye tracking has potential for need for cognition and openness, as we are able to do better than random, but not for musical sophistication, as no classifier did better than a uniform random baseline.

Keywords
eye tracking, classification, recommender system, openness, need for cognition, musical sophistication

1. Introduction

In the field of recommender systems (RS), researchers are increasingly aware that optimizing accuracy is not enough for these systems to reach their full potential [1, 2]. For example, users will not choose a recommended item unless they have trust in the system [3]. One possible way to increase this trust is providing explanations which reveal (a part of) the internal reasoning of the RS to the user [4, 5]. Especially the combination of these explanations with control can help users not only to understand the RS, but also to steer the RS with input and feedback [5]. Despite the increased interest in explanations for RS, it is still not clear how to implement explanations in practice, as users have varying reactions to them, which shows the need to personalize explanations to the user [6].

However, before the system can adapt explanations to personal characteristics (PCs), it needs to be aware of the PCs of the user. A possible way to obtain these characteristics is by explicitly asking the users to fill in questionnaires [7] or by implicitly inferring PCs through an analysis of the social media of the user [8]. Nonetheless, asking users to fill in questionnaires or to give access to their social media is often not desirable. Moreover, to personalize explanations it is not necessary to obtain a fine-grained result; a classification into two categories suffices [9].

For this reason, we explore in this paper whether it is possible to classify users' personal characteristics during the interaction with a music RS with explanations by analyzing their gaze. We will focus on three different PCs: openness, need for cognition (NFC) and musical sophistication (MS) [9, 10]. These PCs will be explained in detail in Section 2.

Openness is one of the Big Five personality traits and measures how open a person is to new experiences. Millecamp et al. [9] showed that there was a significant difference in the gaze pattern between low and high openness users. This is the reason we hypothesize that classifying openness based on gaze might be possible.

Similarly, we hypothesize that inferring MS, which is a measure of domain knowledge in the music domain, from gaze data might be possible, as the study of Millecamp et al. [9] also found significant differences in gaze pattern between low and high MS.

NFC is a cognitive style which influences the way a person prefers to process information and thus looks at information. Previous studies already showed that NFC moderates the perception of explanations in a music recommender system, which was the motivation to explore whether inferring NFC from gaze would be possible.

Next to exploring the general accuracy, we also want to explore how much data we need to infer these PCs.

The contribution of this paper is twofold. First, to our knowledge, we are the first to explore whether it is possible to infer PCs during the interaction with a RS in the presence of explanations. Second, we make the gathered dataset publicly available to support the research in this area. This dataset is unique because it provides both gaze data and data about PCs.

2. Related work

With the increasing role of RS in our daily lives, the call for explainable, transparent RS also becomes louder so that users can make better informed decisions whether or not to follow the recommendations [11, 6]. In combination with controls, this transparency also enables users to correct the RS whenever they feel it makes wrong assumptions [5]. However, research has shown that different users have different reactions to explanations [6, 12, 13]. In the field of music RS, recent research has shown that there are three PCs that could influence the way users perceive explanations: openness, NFC and MS [10, 9].

Openness is one of the five factors of the Five Factor Model, also known as the Big 5 model [14]. This model describes personality in five different traits and it has been used in several studies which showed the positive impact of considering personality in RS [15]. The factor openness describes the breadth, depth and complexity of an individual's mental and experiential life [16]. It has been shown that openness is related to the preferred amount of diversity in RS and to the willingness to use a system with explanations [17, 18, 9].

Need for cognition has been shown to influence the success of a RS [13, 12, 19, 20] and is defined as "a measure of the tendency for an individual to engage in, and enjoy, effortful cognitive activities" [21]. NFC has been shown to have an impact on the willingness of users to rely on a RS [12], on the confidence in a playlist created in a music RS with explanations [10], on preference matching [22], on the style of explanations they prefer [13] and on the reason why users need a transparent RS [23].

Musical sophistication is defined by Müllensiefen et al. [24] as a concept to describe the multi-faceted nature of musical expertise. In the music domain, Millecamp et al. [9] showed that users with high MS feel more supported to make a decision in a RS interface that provides explanations than in an interface without such explanations, while this made no difference for users with low MS. Another study showed that users with high domain experience perceive a higher diversity in a scatter plot than in a simpler bubble chart [25].

To acquire the PCs of users, the most common way is to ask users to fill in validated questionnaires [7], but there also exist other approaches such as inferring PCs by analyzing the social media of the user [26, 8], by analyzing a conversation with a chatbot [27] or by analyzing physical signals such as brain activity [28] and gaze data [7].

The previously mentioned works rely on fine-grained personality scores. In contrast, in our work we focus on adapting interfaces to users, for which we only need a classification in two groups. We aim to base this classification on the gaze pattern during the interaction with a music RS interface instead of asking users to watch carefully selected stimuli, to fill in questionnaires or to share their social media profile. Previous studies which classified users based on their gaze pattern during normal activities are almost all focused only on cognitive abilities and visualization experience [29, 30, 31, 32]. One exception is the study of Hoppe et al. [33], which inferred the Big Five personality traits by studying the gaze during a walk through a campus. This study is different from our work, as we investigate if it is possible to infer PCs while interacting with a music RS and also focus on different PCs.

3. Data

The gaze data that is used in this study was generated in a user study by Millecamp et al. [9]. We will provide a brief summary of this experiment, but a more elaborate description can be found in [9]. As mentioned in Section 2, we focus in this study on openness, NFC and MS, as previous research has found that these PCs could affect the perception of explanations in a music RS [9, 10] and the study of Millecamp et al. [9] already showed that openness and MS change the gaze pattern between an interface in the presence or absence of explanations. To measure these three characteristics, users were asked to fill out three questionnaires before the experiment started. To measure openness, we used the 44-item Big Five Inventory [34] and selected afterwards the questions related to openness. For NFC, we used the 18-item questionnaire of Cacioppo et al. [21] and for MS the Goldsmiths Musical Sophistication Index (https://www.gold.ac.uk/music-mind-brain/gold-msi/, May 2020) was used. The dataset we used in this study consists of the gaze data of 30 participants (21 male). For the three PCs, the participants were divided into a high and a low group based on a median split. This resulted in equally sized groups for MS and NFC and almost equal groups for openness (16 in the low and 14 in the high openness group). A short overview of the characteristics of the participants can be found in Table 1.

Table 1: An overview of personal characteristics measured, together with their highest and lowest possible scores and summary statistics for the scores of the participants.
PC | Possible Range | Median Score
Age | 18-65 | 24
MS | 18-126 | 64
NFC | 0-100 | 68.75
Openness | 0-100 | 55

The gaze data was recorded with a Tobii 4C remote eye tracker at a sampling rate of 90 Hz. Each sample contained information about the focus point on the screen, denoted as an x and y coordinate, the distance between the participant and the screen, and the validity of these measures. To calibrate the eye tracker, the experiment started with a standard calibration procedure provided by Tobii Core Software. After the calibration, users were asked to explore the interface of a music RS in the presence of feature-based explanations until they understood all functionalities. A screenshot of the interface is shown in Figure 1.

Figure 1: The interface with the different parts highlighted in orange. A: Searchbox, B: Artist, C: Attributes of the artist, D: Preference of the user, E: Task, F: Recommendations, G: Cover of a song, H: Explanations, I: (dis)like buttons, J: Play button, K: List of (dis)liked songs.

As shown in Part A of this figure, users can first search for an artist they like through a search bar in the top left corner. When they add the artist, this artist is shown in Part B. Based on this artist, the system starts to generate recommendations, which are listed in a two-column format as shown in Part F. When users hover over the cover picture of a recommended song, they can click a play button to listen to a 30 s preview of the song. On the right side of each explanation, they can click on the thumb-up icon to add the song to their playlist. Through the sliders shown in Part D of Figure 1, users can modify several audio features (https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/) such as popularity, energy and danceability, which are also taken into account in the recommendation process. To help users steer these sliders, the minimum and the maximum for each audio feature are shown for each artist.

After the user explored all the options of the interface, the recording of the gaze started. As shown in Part E of Figure 1, users were asked to create a playlist of five songs. To create this playlist, they could use all functionalities without any restriction. When they added the fifth song to their playlist, we stopped the recording of the gaze. On average, users took 4 minutes 26 seconds to complete their playlist. As part of this paper's contribution, this data is publicly available (augment.cs.kuleuven.be/datasets/classifeye).

4. Classifiers

4.1. Features

The Tobii 4C does not come with software to detect fixations and saccades, so we identified fixations and saccades using an implementation of the I-DT (dispersion-threshold identification) algorithm [35] with a dispersion threshold of one degree and a duration threshold of 100 ms [35]. This means that in this study a fixation is identified as a circle on the screen in which the user keeps focusing for at least 100 ms without moving their eyes more than one degree. All other movements are then identified as saccades, i.e. quick movements of gaze from one fixation to another [30].

Based on these saccades and fixations, we generated a set of eye-tracking features as listed in Table 2. Most of these features are selected because they are widely used in previous eye tracking studies [7, 30, 36]. In addition to these features, we included Most frequent saccade direction and fixations in a 4x4 heatmap, as the study of Hoppe et al. [33] indicated that these features are important in the extraction of personality. We did not include features that contain explicit information about the content of the interface, so-called areas of interest (AOI), even though previous work has shown that these features could have more predictive power [30]. The reason for this is that this information is already partially captured in a more general way by Most frequent saccade direction and fixations in a 4x4 heatmap. Thus, at this stage we chose to investigate how far we can go with display-independent features, which also have the advantage of possibly being more generalizable to other interfaces.

Table 2: Description of eye tracking features
Feature | Description
Saccade rate | Number of saccades divided by segment duration
Avg. saccade length | Average distance between the two fixations delimiting the saccade
Avg. saccade amplitude | Average size of saccade in degrees of visual angle
Avg. saccade velocity | Average velocity (saccade amplitude / saccade duration) of saccades
Peak saccade velocity | Maximum saccade velocity in segment
Most frequent saccade direction | Most frequent saccade direction (segments of 45°)
Fixation rate | Number of fixations divided by segment duration
Avg. fixation duration | Average duration of fixation in ms
Ratio Fixations/Saccades | Total number of fixations divided by total number of saccades
4x4 Heatmap | Percentage of fixations in 16 raster areas
Avg. pupil size | Average pupil size of both eyes

4.2. Data windows

To explore whether classification of the three PCs would be possible with only a partial amount of data, we generated three different data windows to simulate partial observations of gaze data during the task, similar to Steichen et al. [30] and Conati et al. [31]. Each window consists of a partial observation of each participant based on relative duration: the first window consisted of the first 30% of data, the second window of the first 60% and the last window of the first 90% of data. Despite the fact that this approach requires a task to be fully completed to determine what 100% of the data constitutes, it still provides valuable insights into trends and patterns about inferring PCs from gaze data [30]. Each of these windows consists of three different measurements and for each of these measurements the data was divided in ten different segments of equal length. For each of these segments, we generated the mentioned set of eye-tracking features, resulting in a feature vector of 260 features for each measurement.

The reasoning behind creating these different datasets is to verify whether we would be able to adapt the RS interface to the needs of the user during the task. As such, we did not include a window with 100% of the data, as the adaptation would be too late. Additionally, previous research [30] already showed that after a certain amount of data the accuracy starts to converge or even decreases. In this study, we want to explore whether we would notice similar trends for different PCs.

4.3. Classification methods

To classify users in a low and a high category, we used scikit-learn to train five different classifiers and a baseline [37]. To evaluate the performance of the classifiers, we applied a leave-one-out methodology. Because of this evaluation methodology and the uniform groups, we could not use the most common majority class baseline, which predicts the most likely class [30, 31, 33]. When the groups are equally sized, removing one participant leaves that participant's class in the minority of the training set, so the majority class prediction is wrong for every held-out participant (this would lead to 0% accuracy). As a consequence, we chose a random uniform baseline, which has a theoretical accuracy of 50%. To classify the characteristics, we trained Logistic Regression, Random Forest, Gaussian Naive Bayes, Linear Support Vector Machines and Gradient Boosting. The reasoning behind the implementation of all these classifiers is that in previous research there is no consensus about which classifiers work the best. Steichen et al. [30] found that Logistic Regression performed better than Decision Trees, Support Vector Machines and Neural Networks. Lallé et al. [38] and Hoppe et al. [33] found that Random Forests worked the best. However, Berkovsky et al. [7] conclude that Naive Bayes and Support Vector Machines are the best. Additionally, Gradient Boosting performed well in the study of Barral et al. [39]. Because of the small sample size, we chose not to use deep learning methods. For each of these classifiers we tried to optimize the accuracy. The resulting parameters can be found in Table 3.

Table 3: Description of parameters of the different classifiers
Classifier | Parameter
Baseline | strategy: uniform
Logistic Regression | solver: liblinear
Random Forest | estimators: 100
Gaussian Naive Bayes | na
Linear Support Vector Machines | gamma: scale; probability: True
Gradient Boosting | maximum depth: 4

To strengthen the stability of the results, we ran this evaluation 10 times with different random seeds. We calculated the average accuracy over all participants and all runs to measure the performance of the classifier.

5. Results

To examine whether it is possible to classify users in the correct personality group and whether this classification works better on specific windows, we ran for each PC a two-way repeated measures ANOVA with accuracy as the dependent variable and both classifier and window as independent variables. As we run multiple ANOVAs and pairwise comparisons, the reported p-values are adjusted using the Benjamini and Hochberg procedure [40] to control the false discovery rate. The main results of this analysis are shown in Figure 2 and we will report the results for each of the PCs in detail in the next paragraphs.

Need for cognition. The results of the two-way repeated measures ANOVA revealed a significant main effect of classifier on accuracy (F(7,14) = 18.8, p<.001). To investigate this main effect, we ran post-hoc pairwise comparisons, which showed that the mean accuracy of the Logistic Regression classifier (0.59) was significantly higher than that of the baseline (p=.0491), as shown in Figure 2a. This figure also shows the accuracy in the three different windows and that the peak accuracy (0.67) is reached in the last window.

Musical sophistication. The results of the two-way repeated measures ANOVA revealed that no classifier could outperform the baseline and that most of the classifiers performed even worse.

Openness. The results of the two-way repeated measures ANOVA revealed a significant interaction effect of classifier with window on accuracy (F(14,28)=4.88, p<.001). An analysis of the effect of classifier showed a significant effect for the classifiers trained on the first window (F(7,16)=4.512, p=.006) and a post-hoc test revealed that in this window Gradient Boosting performed significantly better than the baseline (p=.020). The analysis of the effect of window showed a significant effect for the Gradient Boosting classifier (F(2,6)=8.12, p=.020) and a post-hoc analysis showed that the Gradient Boosting classifier performed significantly better in the first window than in the second (p=.028) and the third window (p=.029). Figure 2b shows that the highest accuracy of Gradient Boosting is reached in the first window (0.66). This accuracy is significantly higher than the accuracy of the baseline and the accuracy of Gradient Boosting in the other windows.

Figure 2: Accuracy of classifiers that perform significantly better than the baseline. (a) Accuracy of Logistic Regression for NFC. (b) Accuracy of Gradient Boosting for openness. Both panels plot accuracy per data window (First, Second, Third) for the classifier and the baseline.

6. Discussion

Our results show that we have a higher accuracy than the random baseline for NFC and for openness in the first window, but that we were not able to beat the random baseline classifier for MS.

For the classification of openness, it is interesting that we are able to outperform the baseline, while openness was one of the few traits for which Hoppe et al. [33] could not outperform the baseline. This might be due to a different classification technique, as Hoppe et al. only used a Random Forest classifier while we outperformed the baseline with a Gradient Boosting classifier. Another possible reason could be that we trained the classifiers on different data windows: our results show that the performance to classify openness is only significantly better than the baseline in the first window. As far as we know, no other studies formally showed that classifying PCs on early stages of the task can outperform classification on more data. However, other studies such as the study of Steichen et al. [30] already discussed this trend for perceptual speed, verbal working memory and visual working memory. They argued that these characteristics most strongly affect the gaze pattern of the user during the initial phase of a task and that other factors dilute the gaze pattern as the task continues. This is probably also the reason why we are only able to classify openness in the beginning of the task. However, this is not necessarily a problem, as we want to adapt an interface to the openness of a user as early on as possible. Nevertheless, the obtained accuracy is still too low to be used to adapt the explanations. Also, more research is needed to verify whether openness always affects the gaze during the beginning of a task or only when users see a new interface.

To classify NFC, our results show a significant main effect of Logistic Regression on accuracy. The reason that we do not see a significant difference between the windows could be that NFC is correlated with decision-making processes [12] and creating a playlist in a music RS constantly involves making decisions. Despite the significant main effect, the accuracy to classify NFC does not seem high enough to adapt the interface, especially not in the first two windows. As a consequence, further research needs to focus on reaching a higher accuracy in the beginning of the interaction to be able to adapt explanations early on in the process, or on adapting the interface if the user re-visits the application. Additionally, further research should investigate why Logistic Regression performed the best to classify NFC. This is similar to previous studies in which Logistic Regression performed well to classify PCs, but we do not have an explanation why Logistic Regression outperforms other algorithms [30, 41].

As a previous study in the field of music RS showed that MS influences the way users look at a music RS interface, and previous studies in the field of information retrieval also showed the potential of predicting domain knowledge based on eye tracking [42, 43, 9], we expected to be able to classify MS based on gaze data. However, our results show that we could not outperform the baseline. A possible reason for this could be that we did not include AOI related features, which were included in the above-mentioned studies. An interesting further line of research is to verify whether including these AOI features can improve accuracy.

7. Conclusion

In this paper, we explored whether it would be possible to adapt the explanations in a music RS interface based on personal characteristics. To do so, we investigated whether a classification of personal characteristics could be inferred by studying the gaze pattern during the creation of a playlist in this system. More concretely, we classified musical sophistication, need for cognition and openness because these characteristics have been shown to impact the user experience of explanations in a RS [9]. We trained the classifiers on different windows to detect whether the classification would already work with only a partial observation of the creation of a playlist.

Our results show that, even though our accuracy is not yet high enough for practical use, we are able to outperform a baseline to classify need for cognition with Logistic Regression. If we only consider the first third of the data, our results show that the classification of openness with Gradient Boosting beats the baseline. Despite the limitations in terms of accuracy, this finding is important because it shows the potential to adapt explanations during the interaction with a music RS interface. In a next step, we want to increase the accuracy of the classifiers, particularly in the beginning of the interaction, which we plan to do by gathering more training data and by using different features such as AOI related features. Additionally, more research is needed to verify whether the results of this study can be generalized to different tasks and interfaces, which we also plan to address in future research.

Acknowledgments

Part of this research has been supported by the KU Leuven Research Council (grant agreement C24/16/017) and the Research Foundation Flanders (FWO).

References

[1] R. R. Sinha, K. Swearingen, et al., Comparing recommendations made by online systems and friends, DELOS 106 (2001).
[2] C. He, D. Parra, K. Verbert, Interactive recommender systems: A survey of the state of the art and future research challenges and opportunities, Expert Systems with Applications 56 (2016) 9–27.
[3] J. Kunkel, T. Donkers, L. Michael, C.-M. Barbu, J. Ziegler, Let me explain: Impact of personal and impersonal explanations on trust in recommender systems, in: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019, pp. 1–12.
[4] A. Springer, S. Whittaker, Making transparency clear, in: Algorithmic Transparency for Emerging Technologies Workshop, 2019, p. 5.
[5] N. Tintarev, J. Masthoff, A survey of explanations in recommender systems, in: 2007 IEEE 23rd international conference on data engineering workshop, IEEE, 2007, pp. 801–810.
[6] A. Springer, S. Whittaker, Progressive disclosure: empirically motivated approaches to designing effective transparency, in: Proceedings of the 24th International Conference on Intelligent User Interfaces, 2019, pp. 107–120.
[7] S. Berkovsky, R. Taib, I. Koprinska, E. Wang, Y. Zeng, J. Li, S. Kleitman, Detecting personality traits using eye-tracking data, in: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019, pp. 1–12.
[8] G. Park, H. A. Schwartz, J. C. Eichstaedt, M. L. Kern, M. Kosinski, D. J. Stillwell, L. H. Ungar, M. E. Seligman, Automatic personality assessment through social media language, Journal of personality and social psychology 108 (2015) 934.
[9] M. Millecamp, N. N. Htun, C. Conati, K. Verbert, What's in a user? towards personalising transparency for music recommender interfaces, in: Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, 2020, pp. 173–182.
[10] M. Millecamp, N. N. Htun, C. Conati, K. Verbert, To explain or not to explain: the effects of personal characteristics when explaining music recommendations, in: Proceedings of the 24th International Conference on Intelligent User Interfaces, 2019, pp. 397–407.
[11] M. Naiseh, N. Jiang, J. Ma, R. Ali, Explainable recommendations in intelligent systems: delivery methods, modalities and risks, in: International Conference on Research Challenges in Information Science, Springer, 2020, pp. 212–228.
[12] S. T. Tong, E. F. Corriero, R. G. Matheny, J. T. Hancock, Online daters' willingness to use recommender technology for mate selection decisions, in: IntRS@RecSys, 2018, pp. 45–52.
[13] S. Naveed, T. Donkers, J. Ziegler, Argumentation-based explanations in recommender systems: Conceptual framework and empirical results, in: Adjunct Publication of the 26th Conference on User Modeling, Adaptation and Personalization, 2018, pp. 293–298.
[14] L. R. Goldberg, The structure of phenotypic personality traits, American psychologist 48 (1993) 26.
[15] R. Hu, P. Pu, Enhancing collaborative filtering systems with personality information, in: Proceedings of the fifth ACM conference on Recommender systems, 2011, pp. 197–204.
[16] V. Benet-Martinez, O. P. John, Los cinco grandes across cultures and ethnic groups: Multitrait-multimethod analyses of the big five in spanish and english, Journal of personality and social psychology 75 (1998) 729.
[17] N. Tintarev, M. Dennis, J. Masthoff, Adapting recommendation diversity to openness to experience: A study of human behaviour, in: International Conference on User Modeling, Adaptation, and Personalization, Springer, 2013, pp. 190–202.
[18] L. Chen, W. Wu, L. He, How personality influences users' needs for recommendation diversity?, in: CHI'13 extended abstracts on human factors in computing systems, 2013, pp. 829–834.
[19] U. Gretzel, D. R. Fesenmaier, Persuasion in recommender systems, International Journal of Electronic Commerce 11 (2006) 81–100.
[20] A. Felfernig, R. Burke, Constraint-based recommender systems: technologies and research issues, in: Proceedings of the 10th international conference on Electronic commerce, 2008, pp. 1–10.
[21] J. T. Cacioppo, R. E. Petty, C. Feng Kao, The efficient assessment of need for cognition, Journal of personality assessment 48 (1984) 306–307.
[22] K. Y. Tam, S. Y. Ho, Web personalization as a persuasion strategy: An elaboration likelihood model perspective, Information systems research 16 (2005) 271–291.
[23] M. Millecamp, R. Haveneers, K. Verbert, Cogito ergo quid? the effect of cognitive style in a transparent mobile music recommender system, in: Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, 2020, pp. 323–327.
[24] D. Müllensiefen, B. Gingras, J. Musil, L. Stewart, The musicality of non-musicians: an index for assessing musical sophistication in the general population, PloS one 9 (2014) e89642.
[25] Y. Jin, N. Tintarev, K. Verbert, Effects of individual traits on diversity-aware music recommender user interfaces, in: Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization, 2018, pp. 291–299.
[26] J. Golbeck, C. Robles, M. Edmondson, K. Turner, Predicting personality from twitter, in: 2011 IEEE third international conference on privacy, security, risk and trust and 2011 IEEE third international conference on social computing, IEEE, 2011, pp. 149–156.
[27] M. X. Zhou, G. Mark, J. Li, H. Yang, Trusting virtual agents: the effect of personality, ACM Transactions on Interactive Intelligent Systems (TiiS) 9 (2019) 1–36.
[28] J. Wache, R. Subramanian, M. K. Abadi, R.-L. Vieriu, N. Sebe, S. Winkler, Implicit user-centric personality recognition based on physiological responses to emotional videos, in: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, 2015, pp. 239–246.
[29] D. Toker, C. Conati, B. Steichen, G. Carenini, Individual user characteristics and information visualization: connecting the dots through eye tracking, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2013, pp. 295–304.
[30] B. Steichen, C. Conati, G. Carenini, Inferring visualization task properties, user performance, and user cognitive abilities from eye gaze data, ACM Transactions on Interactive Intelligent Systems (TiiS) 4 (2014) 1–29.
[31] C. Conati, S. Lallé, A. Rahman, D. Toker, Further results on predicting cognitive abilities for adaptive visualizations, IJCAI International Joint Conference on Artificial Intelligence (2017) 1568–1574. doi:10.24963/ijcai.2017/217.
[32] M. Gingerich, C. Conati, Constructing models of user and task characteristics from eye gaze data for user-adaptive information highlighting, in: Proceedings of the AAAI Conference on Artificial Intelligence, 1, 2015.
[33] S. Hoppe, T. Loetscher, S. A. Morey, A. Bulling, Eye movements during everyday behavior predict personality traits, Frontiers in Human Neuroscience 12 (2018) 1–8. doi:10.3389/fnhum.2018.00105.
[34] O. P. John, E. M. Donahue, R. L. Kentle, The big five inventory: versions 4a and 54, 1991.
[35] D. D. Salvucci, J. H. Goldberg, Identifying fixations and saccades in eye-tracking protocols, in: Proceedings of the 2000 symposium on Eye tracking research & applications, ACM, 2000, pp. 71–78.
[36] J. H. Goldberg, X. P. Kotval, Computer interface evaluation using eye movements: methods and constructs, International journal of industrial ergonomics 24 (1999) 631–645.
[37] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, E. Duchesnay, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12 (2011) 2825–2830.
[38] S. Lallé, C. Conati, G. Carenini, Prediction of individual learning curves across information visualizations, User Modeling and User-Adapted Interaction 26 (2016) 307–345.
[39] O. Barral, S. Lallé, G. Guz, A. Iranpour, C. Conati, Eye-tracking to predict user cognitive abilities and performance for user-adaptive narrative visualizations, in: Proceedings of the 2020 International Conference on Multimodal Interaction, 2020, pp. 163–173.
[40] Y. Benjamini, Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal statistical society: series B (Methodological) 57 (1995) 289–300.
[41] S. Kardan, C. Conati, Exploring gaze data for determining user learning with an interactive simulation, in: International Conference on User Modeling, Adaptation, and Personalization, Springer, 2012, pp. 126–138.
[42] M. J. Cole, J. Gwizdka, C. Liu, N. J. Belkin, X. Zhang, Inferring user knowledge level from eye movement patterns, Information Processing & Management 49 (2013) 1075–1091.
[43] X. Zhang, M. Cole, N. Belkin, Predicting users' domain knowledge from search behaviors, in: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, 2011, pp. 1225–1226.