Combination of User and Venue Personality with Topic Modelling in Restaurant Recommender Systems

Evripides Christodoulou1, Andreas Gregoriades1, Herodotos Herodotou1 and Maria Pampaka2
1 Cyprus University of Technology, Limassol, Cyprus
2 The University of Manchester, Manchester, UK

Abstract
Recommender systems are popular information systems used to help decision makers cope with information overload. However, despite their success in simple problems, such as music recommendation, they have been criticized for insufficient performance in highly complex domains characterized by many parameters, such as restaurant recommendation. Recent research has acknowledged the importance of personality in influencing consumers’ choice, but recommendation methodologies do not exploit this in the restaurant recommendation problem. Hence, this work seeks to analyze the contribution of personality in combination with extracted topics from consumers’ electronic word of mouth (eWOM) to restaurant recommender systems. The paper utilizes a bi-directional transformer approach with a feed-forward classification layer for personality prediction, due to its improved performance in similar problems over other machine learning models. One issue with this approach is the handling of long text, such as narratives written by people of different personality types (labels). Thus, different long-text management methods are evaluated to find the one with the best personality prediction performance. Two personality models are evaluated, namely the Myers-Briggs and Big Five, based on two labelled datasets that are utilized to generate two personality classifiers. In addition to customer personality, this work investigates the concept of venue personality estimated from personalities of users that visited a venue and liked it.
Finally, the customer and venue personalities are used together with the topics discussed by customers to form the input to the extreme gradient boosting (XGBoost) models for predicting user ratings of restaurants. The performance of these models is compared to traditional collaborative filtering methods using various prediction metrics.

Keywords
Personality Prediction, Recommender System, Topic Modelling

RecSys Workshop on Recommenders in Tourism (RecTour 2022), September 22nd, 2022, co-located with the 16th ACM Conference on Recommender Systems, Seattle, WA, USA
Email: ep.xristodoulou@edu.cut.ac.cy (E. Christodoulou); andreas.gregoriades@cut.ac.cy (A. Gregoriades); herodotos.herodotou@cut.ac.cy (H. Herodotou); maria.pampaka@manchester.ac.uk (M. Pampaka)
ORCID: 0000-0002-7422-1514 (A. Gregoriades); 0000-0002-8717-1691 (H. Herodotou); 0000-0001-5481-1560 (M. Pampaka)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

1. Introduction
Recommender systems have been developed to support consumers’ decision-making process by addressing information overload [1]. In tourism, they aim to enhance the tourists’ experience by better satisfying tourists’ needs and wants. Traditional recommendation approaches base their recommendations on user preferences extracted from users’ historical records, such as ratings, reviews, or purchases. Popular techniques include the collaborative and content-based filtering approaches. Recently, there is strong interest in the utilization of users’ personality, since it is linked to perception, motivation, and preference and is known to remain stable during adulthood [2].
Personality is directly associated with consumer emotions and has a strong impact on satisfaction, with theory indicating that people with similar personality traits have similar preferences and needs [3]. The application of users’ personality has enhanced the performance of recommenders in the tourism domain, with personality obtained from questionnaire data or automated personality recognition found to improve point-of-interest and destination recommendations. In the same vein, venue personality is directly related to branding and satisfaction but has not been examined in restaurant recommendation. Restaurants have their brands, which in turn have personalities, just like people, and attract people with similar personalities [4]. Consumers that relate to a restaurant’s concept and personality are more likely to visit it than those who do not. Brand personality has its roots in personality psychology [5] and can be defined as the human characteristics associated with a brand. Given that consumers prefer brands aligned with their own personality, the personality of a venue can be inferred from the consumers that visited the venue and liked it. This is the approach employed in this study. Despite the importance of user and brand personality, they have not been utilized for restaurant recommendations, which makes this work novel. The proposed approach utilizes consumers’ and venues’ personalities along with perceptions about venues from eWOM to recommend the most suitable restaurants to tourists. We hypothesize that venue personality can improve recommendation due to the link that exists between users’ and venues’ personalities. eWOM represents consumer opinions about products and services and has been used extensively in identifying consumers’ preferences. This paper evaluates two popular personality models, the Myers-Briggs Type Indicator (MBTI) [6] and the Big Five [7], in combination with preferences of users expressed in eWOM themes.
The identification of the best combination of users’ eWOM topics and personality labelling is made by training and comparing two Extreme Gradient Boosting (XGBoost) models, each using as features: (1) the users’ personalities identified from eWOM using one of two candidate personality models (MBTI and BIG 5), (2) the users’ perceptions of the venues from eWOM topics, and (3) the personality of the venue based on personalities of users that visited the venue and liked it. An XGBoost model is generated for this purpose due to its good performance in similar recommendation problems.

The research question addressed in this work is how to best combine user and venue personality models with topic features inferred from eWOM to produce the best recommendation, in contrast to popular model-based collaborative filtering (CF) techniques. This is a continuation of our previous work [8, 9] that examined the use of personality and emotion in recommender systems. The contributions of this work are threefold. First, it identifies which long-text management approach produces the best personality prediction. Second, it investigates the concept of venue personality in restaurant recommendation and evaluates its impact. A restaurant’s personality is estimated from personalities of users that visited a venue and liked it. Finally, it shows which combination of personality and eWOM topics improves restaurant recommendation.

The paper is organized as follows. The next section introduces background knowledge on restaurant recommender systems and techniques for extracting personality from text. The following section describes techniques for identifying topics discussed in consumers’ eWOM and personality prediction using deep neural networks. Subsequent sections elaborate on the method followed and the results obtained. The paper concludes with the discussion of findings and future directions.

2. Existing Knowledge
This section provides a review of recommendation techniques, the concept of personality, and elaborates on how it has been used in recommender systems so far.

2.1. Restaurant Recommender Systems
Recommender systems aim to predict the satisfaction of a consumer with an item (product/service) he/she has not bought yet [10]. This is part of one-to-one marketing that seeks to match items to consumers’ preferences, in contrast to mass marketing aiming to satisfy a target market segment [11]. Popular approaches focus on consumers’ past experiences (ratings) for the creation of a user-item matrix and, based on that, predict what is most appropriate for a user depending on the similarity between either users or items (products, services) [10]. The relationship between consumers or between products can be found using similarity metrics, a method known as Collaborative Filtering (CF) [12]. This has been successfully applied in tourism recommendation problems such as hotels or points of interest, and is considered one of the most popular techniques [13]. Another popular technique is content-based filtering, which attempts to guess what a user may like based on items’ features rather than their ratings [12]. A hybrid approach takes advantage of both content-based filtering and collaborative filtering [10]. CF techniques, however, suffer from the cold start problem that occurs when very little or no data is available about a user, which makes it impossible to identify similar consumers [14]. In addition, data sparsity exacerbates the problem when there are a lot of unrated items in the user-item matrix. This occurs when there is not enough data to populate the user-item matrix and make reliable inferences [15]. In tourism, the collection of data is difficult and time-consuming due to the limited time that tourists spend at a destination.
The cold start problem appears with first-time users (tourists) since there are no records of their purchasing activity at a specific destination. To address these CF problems, recent methods utilise machine learning techniques such as matrix factorization to approximate the user-item matrix content using latent variables that emerge from the initial data. The singular value decomposition (SVD), optimized SVD (SVD++), and non-negative matrix factorization (NMF) models factorize the user-item matrix and predict the satisfaction of users for products that are unknown [16]. Alternatively, content-based approaches utilize metadata about new products to address the cold start problem. A useful source for obtaining these metadata is textual information from eWOM and its analysis using text analytics [17]. An example application includes work by Sun et al. [18] that improved CF performance by analysing restaurants eWOM to define numerical features corresponding to consumers satisfaction through sentiment analysis. In the same vein, topic modelling techniques have been used with CF to assist in estimating the similarity between consumers or items [19]. Finally, Zhang et al. in 2018 [20] used consumers or items characteristics to identify clusters, and then find correlations between clusters to address the data sparsity problem. Recently, a strong interest emerged in using the personalities of consumers in an effort to better understand and match their needs, as “personality” relates to the perceptions, feelings, motivations and preferences of individuals [21]. The application of user personality has improved the performance of recommendations in the tourism domain for points of interest compared to traditional methods [22]. Personality-based recommendations have also been shown to greatly reduce the cold start and data sparsity problems, and improved the performance of recommendations in areas such as online advertising, social media, books and music [23]. 
However, these approaches do not take advantage of eWOM data from users on the web to extract their preferences and their personalities. They focus mainly on the extraction of user data from specialized questionnaires to collect consumers’ behaviours and personalities. Such approaches fail to continuously update the system because the time-consuming use of questionnaires limits how often the recommender system’s knowledge can be updated. This led to the need for automated means for extracting consumers’ personality, discussed next.

2.2. Personality Extraction from Text
Personality is a set of characteristics and behaviours of an individual. Over the years, it has been shown that the personality traits of a person influence many areas of his/her life, such as motivations and preferences, as well as consumer behaviour [22]. Applications of automated personality prediction have been seen with data from various social networks, such as Facebook and Twitter, to explore correlations between personalities and different user activities, purchasing behaviours and liking of foods from specific cuisines [24]. The two most popular text-based personality classification methods are based on the Myers-Briggs Type Indicator (MBTI) and the Big Five personality traits, due to the availability of labelled data on these models. The MBTI focuses on four dimensions that refer to eight key types of people characteristics/behaviours: Extraversion or Introversion, Sensing or Intuition, Thinking or Feeling, and Judging or Perceiving. The combination of characteristics can shape 16 different personality types and classify people into the proper personality cluster [25]. The Big 5 personality model expresses personality in the following five dimensions, each expressed in binary states of high/low: Agreeableness, Extraversion, Openness to Experience, Conscientiousness, and Neuroticism.
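As an illustration of how the four MBTI dichotomies give rise to 16 types, the following minimal Python sketch enumerates the types and maps four per-dimension probabilities to a type label. The function names, the 0.5 threshold, and the convention that each probability refers to the first letter of a pair are assumptions made for illustration; they are not taken from the classifiers used in this work.

```python
from itertools import product

# The four MBTI dichotomies. Treating the first letter of each pair as the
# "positive" class of a binary classifier is an illustrative convention.
MBTI_DIMENSIONS = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]


def all_mbti_types():
    """Enumerate every combination of one letter per dichotomy."""
    return ["".join(letters) for letters in product(*MBTI_DIMENSIONS)]


def mbti_type(probabilities, threshold=0.5):
    """Turn four per-dimension probabilities into a four-letter type.

    probabilities[i] is the (hypothetical) probability of the first letter
    of MBTI_DIMENSIONS[i]; the 0.5 cut-off is an assumption.
    """
    if len(probabilities) != len(MBTI_DIMENSIONS):
        raise ValueError("expected one probability per MBTI dimension")
    return "".join(
        first if p >= threshold else second
        for (first, second), p in zip(MBTI_DIMENSIONS, probabilities)
    )


types = all_mbti_types()
print(len(types))                       # 16
print(mbti_type([0.2, 0.3, 0.9, 0.1]))  # INTP
```

The same construction applies to the Big 5 model, where five high/low dimensions would yield 32 possible combinations.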
Such taxonomies are recognized as valid mechanisms for defining the most essential aspects of personality that describes people characteristics which reflect their behaviour [26]. Personality prediction is an important phase of personality-aware recommender systems, and the two main methods for doing so is through questionnaires and automated means [27]. Generally, questionnaires are more accurate in assessing personality; however, the process is tedious while the automated approach is easier to conduct, by utilising user’s existing data which can be either text, images, videos, likes (behavioural data) etc. [28]. Predicting personality from text is a popular automated approach that is based on personality theory claiming that words can reveal some psychological states and personality of the author of the text. There are two main categories of techniques, the feature-based and the deep learning: the former uses unigrams/n-grams (open vocabulary approach) or lexicons (closed vocabulary) of features relevant to personality, and the latter utilizes text embeddings learned from large corpus of text in an unsupervised manner (language models). Popular feature-based methods utilize the Mairesse [29] and linguistic inquiry and word count techniques [30]. Features from these techniques are fed into different machine learning classifiers (e.g., Naïve Bayes, support vector machines) to make predictions. Obtaining such features however is a costly process and cannot effectively represent the original text semantics. To avoid feature engineering, deep neural models and language models are employed to learn text representations, which currently result in improved accuracy. Deep models focus on the context of the text and not just a static representation for a word or a sentence. 
Those kinds of deep learning techniques use an attention mechanism [31] that assigns weights to words based on how they are used in a text, enabling them to also capture the semantic content [32]. A popular architecture is the Bidirectional Encoder Representations from Transformers (BERT) that utilizes transformer neural networks. Attention-based transformers have shown that capturing the semantics of a text improves the performance and prediction accuracy of ML personality models [33]. Given this, the method proposed in this paper utilizes attention-based personality prediction. Most personality prediction approaches that utilize the Big 5 and MBTI use binary classifiers for each dimension of the personality model, for instance, a classifier for extraversion-introversion in Big 5. Such methods require data pre-labelled with the personality class. The first step in the process is the vectorization of the text into a form that can be processed by ML algorithms [34]. This can be done using open/closed lexicons or sentence embeddings in the case of deep learning methods (BERT). The vectorized data is used to train a classifier using the data label or to fine-tune a pretrained model as in the case of BERT. The trained and validated model can then be used to make predictions on unseen data.

3. Technical Background
The proposed method utilises automated eWOM topic modelling for the identification of themes discussed in the reviews’ text, BERT-based personality classification for customers and venues, and ensemble tree-based regression for the prediction of user ratings of restaurants.

3.1. Topic Modelling
Topic modelling is a popular tool for extracting information from unstructured data and is used in this work to identify themes discussed by consumers in eWOM. Topic models generally involve a statistical model aiming at finding topics that occur in a collection of documents [35].
Two of the most popular techniques for topic analysis are the Latent Dirichlet Allocation and the Structural Topic Model (STM). In this study, the STM approach [36] is used to develop a topic model using collected reviews from TripAdvisor. Each topic in STM represents a set of words that occur frequently together in a corpus and each document is associated with a probability distribution of topics per document. The process for learning the topic model initiates with data preprocessing that includes removal of common and custom stop-words and irrelevant information (punctuation), followed by tokenization (breaking sentences into word tokens), and stemming (converting words to their root form). Initially, common stop-words were considered and gradually with the refinement of the model, additional stop-words that were irrelevant to our goal were added to the list of custom stop-words such as, names of people, restaurants, cities, etc. The optimum number of topics that best fits the dataset is identified through an iterative process examining different values for the number of topics (K) and inspecting the semantic coherence, held out likelihood or exclusivity of the model at each iteration until a satisfactory model is produced [36]. Coherence measures the degree of semantic similarity between high scoring words in the topic. Held out likelihood tests a trained topic model using a test set that contains previously unseen documents. Exclusivity measures the extent to which top words in one topic are not top words in other topics. The naming of the topics was performed manually based on domain knowledge and the most prevalent words that characterize each topic. 3.2. BERT Personality Classification Recent benefits of the “attention” mechanism in deep learning models have demonstrated state-of-the-art performance in numerous text analysis tasks such as classification [31]. 
BERT uses a multi-layer bidirectional transformer encoder and is inspired by the concept of knowledge transfer, since in many problems it is difficult to access a sufficiently large volume of labelled data to train deep models. In transfer learning, a pre-trained model is learned from massive unlabelled datasets that do not represent the target problem but allow the learning of general knowledge. BERT-like approaches provide pretrained models whose embedded knowledge can be transferred to a target domain where labelled data is limited. Fine-tuning such models is performed using a labelled dataset representing the actual problem; this tunes the model to the task at hand. Fine-tuning adds a feedforward layer on top of the pre-trained BERT. Previous work has demonstrated that this pre-training and fine-tuning approach outperforms existing text classification approaches. In our case, fine-tuning the BERT model was performed using publicly available personality labelled data. Despite their good results, BERT-based approaches have been criticized because their best performance is reported with short texts. Long text refers to text with more than 512 tokens. Such text is computationally expensive to process, so most transformer models limit the number of tokens they can process simultaneously. In our case, most reviews produced by consumers exceeded the 512-token limit and thus the prediction of personality was considered as a long text classification problem. Different methods exist for dealing with this issue, including the naïve head-only and tail-only approaches and the semi-naïve approaches, which use, respectively, the first words of the text, the last words of the text, or a combination of top/bottom/important words in the text. Such approaches lose information but have a minimum computational cost.
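The naïve and semi-naïve treatments described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: whitespace-separated words stand in for the subword tokens a real BERT tokenizer would produce, the function names are hypothetical, and the head/tail split used in `head_tail` is an arbitrary choice rather than a setting reported in this paper.

```python
def head_only(tokens, max_len=512):
    """Naive head-only: keep the first max_len tokens."""
    return tokens[:max_len]


def tail_only(tokens, max_len=512):
    """Naive tail-only: keep the last max_len tokens."""
    return tokens[-max_len:] if max_len > 0 else []


def head_tail(tokens, max_len=512, head_len=128):
    """Semi-naive: keep the first head_len tokens plus the last
    (max_len - head_len) tokens, dropping the middle of the text."""
    if not 0 < head_len < max_len:
        raise ValueError("head_len must be between 1 and max_len - 1")
    if len(tokens) <= max_len:
        return list(tokens)
    tail_len = max_len - head_len
    return tokens[:head_len] + tokens[-tail_len:]


def chunk(tokens, size=128):
    """Split the text into consecutive chunks of at most `size` tokens,
    e.g. so that each chunk can be embedded separately and combined."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


# A synthetic 1000-"token" review; a real pipeline would use the model's tokenizer.
review = [f"w{i}" for i in range(1000)]
print(len(head_only(review)), len(tail_only(review)), len(head_tail(review)))
print(len(chunk(review)))
```

Note that a real BERT input also reserves positions for special tokens, so the usable length per text is slightly below the nominal limit.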
Recent work has sought to alleviate the computational cost constraint by applying more sophisticated models to longer text instances such as dividing the long text into chunks and combining the embeddings of the chunks. However, Sun et al. [37] who investigated different long-text treatment methods for consumer reviews, showed that the best classification performance is achieved with naïve methods such as using only the head or tail tokens of the text while dropping all other content. In this work, we explore the naïve and semi naïve methods to identify the one with the best personality classification performance prior to labelling users with their personality. The results, described in a subsequent section, show that the naïve approach yielded the best performance, which is in line with [37]. 3.3. Ensemble Tree-based Regression - XGBoost XGBoost regression is used in this study as it produces good results in similar problems. It is an ensemble method; hence multiple trees are constructed with the training of each tree depending on errors from previous trees’ predictions. Gradient descent is used to generate new trees based on all previous trees while optimizing for loss and regularization. XGBoost regularization component balances complexity of the learned model against predictability. In this work, XGBoost is used to predict the rating (between 1 and 5) given by a user to a restaurant to fill a user-item matrix, as explained in Section 4. XGBoost optimization is required to minimize model overfitting and treating data imbalance, by tuning multiple hyperparameters. The optimal values of hyperparameters can be determined with different techniques such as the exhaustive (grid search), Bayesian, or random. 
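The exhaustive (grid search) option can be sketched as follows. This is a minimal, library-free illustration: the parameter names mirror common XGBoost hyperparameters, but the candidate values and the `toy_validation_error` function are hypothetical stand-ins for training a regressor and measuring its validation error.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the one with
    the lowest score (lower is better, e.g. a validation error)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params):
    """Hypothetical stand-in for 'train a model, return validation error'."""
    return (
        (params["learning_rate"] - 0.1) ** 2
        + (params["subsample"] - 0.8) ** 2
        + 0.01 * params["gamma"]
    )


# Assumed candidate values; 3 x 3 x 3 = 27 combinations are evaluated.
grid = {
    "learning_rate": [0.01, 0.1, 0.3],
    "gamma": [0, 1, 5],
    "subsample": [0.6, 0.8, 1.0],
}
best_params, best_score = grid_search(grid, toy_validation_error)
print(best_params, best_score)
```

The cost grows multiplicatively with the number of candidate values per hyperparameter, which is why exhaustive search is the slowest of the three options.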
The grid search method combines all possible values of each parameter to obtain the model with the best performance, while the Bayesian method utilizes results from previous optimization cycles to identify hyperparameter values with a higher probability of improving the model’s performance. Grid search is exhaustive but slower, while the Bayesian approach is faster but not as accurate. In this work, the grid search approach is adopted to maximize the predictive performance of the XGBoost models.

4. Methodology
The methodology employed to address our research question is overviewed in Figure 1 and is implemented via the following steps.

1. Collection of restaurant reviews from TripAdvisor and extraction of consumers’ eWOM using a dedicated web crawler;
2. Preprocessing of the data and preparation for subsequent analyses (topic modelling, personality classification). The preprocessing procedures include elimination of punctuation and URLs, lowercasing of text, stop-word removal, tokenization, lemmatization, expansion of contractions (e.g., don’t to do not), and text normalization. During this step, the user-item matrix is generated with rows corresponding to consumers and columns to restaurants. The cells of the matrix contain ratings only where these are available, since tourists did not visit all restaurants;
3. Development of a topic model using as corpus the eWOM’s text to identify consumers’ opinions and how these are associated with each review. Two metrics are extracted from eWOM, the preferences of each user and the topics that characterize each venue. Restaurant’s topics are generated by averaging the topics’ theta values associated with each restaurant. This represents common consumer opinions per restaurant;
4. Assessment of customers’ personality from eWOM using two personality classification models (BERT) and two long-text handling techniques. The best long-text handling technique was employed during BERT classifier training. Two BERT models were developed by fine-tuning the language model on two personality datasets (MBTI personality café [38] and Big 5 [39]) with labelled data;
5. The two personality classifiers that emerged are used separately to label the personality of each reviewer and the personality of the restaurant by averaging the personalities (probabilities of belonging to each of the binary dimensions of the two personality models) of users that visited the venue and liked it;
6. The features that emerge from the personality labelling and topic modelling are used collectively to enhance the original user-item matrix with additional information. This information is subsequently used to train two XGBoost regressor models, one for each personality model, using as output variable the user rating between 1 and 5. The XGBoost models are trained using the enhanced user-item matrix. Each XGBoost model is optimized using hyperparameter tuning and validated using a train/test data split (70/30) with stratified sampling based on user ratings. The trained models are used to predict user ratings for restaurants that users have never visited;
7. The performance of the two XGBoost models is compared against that of three popular baseline models, namely SVD, SVD++, and NMF, trained using the initial user-item matrix. The performance of the models is assessed using popular recommender systems evaluation metrics.

Figure 1: Overview of the approach and its evaluation

5. Results
The data utilized refer to 105k reviews (English language) from customers who visited restaurants in Cyprus between 2010 and 2020 and posted their opinions about their experience on TripAdvisor (publicly available). The total number of unique users was 56800 and the number of restaurants was 650. Figure 2 depicts descriptive statistics of reviews’ ratings per year. For this study, only users with at least 5 reviews and only restaurants with at least 50 reviews are considered, yielding 1535 unique users and 437 venues.
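The venue personality computation in step 5 of the methodology can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and data layout are hypothetical, and a rating of 4 or higher is assumed to mean that a user "liked" the venue, a threshold that the text does not specify.

```python
from collections import defaultdict


def venue_personality(reviews, user_personality, liked_threshold=4):
    """Average the personality vectors of users who liked each venue.

    reviews: iterable of (user_id, venue_id, rating) tuples.
    user_personality: dict mapping user_id to a list of probabilities,
        one per binary personality dimension.
    liked_threshold: assumed minimum rating that counts as "liked".
    """
    sums = {}
    counts = defaultdict(int)
    for user, venue, rating in reviews:
        if rating < liked_threshold or user not in user_personality:
            continue
        vector = user_personality[user]
        if venue not in sums:
            sums[venue] = [0.0] * len(vector)
        sums[venue] = [s + v for s, v in zip(sums[venue], vector)]
        counts[venue] += 1
    return {
        venue: [s / counts[venue] for s in total]
        for venue, total in sums.items()
    }


# Toy data: four probabilities per user, e.g. one per MBTI dimension.
users = {
    "u1": [0.75, 0.25, 0.5, 0.5],
    "u2": [0.25, 0.75, 0.5, 1.0],
    "u3": [0.0, 1.0, 0.25, 0.25],
}
reviews = [("u1", "r1", 5), ("u2", "r1", 4), ("u3", "r1", 2), ("u3", "r2", 5)]
venues = venue_personality(reviews, users)
print(venues["r1"])  # [0.5, 0.5, 0.5, 0.75]
```

In this sketch a user who reviews the same venue several times contributes once per qualifying review; deduplicating by user would be an equally plausible design.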
Figure 2: Percentage of restaurant review ratings [1-5] per year from 2010 to 2020

5.1. Learned topic model
To extract consumers’ discussed themes from eWOM, an STM topic model was developed using the estimated optimum number of topics (K = 19), based on the model’s performance metrics in Figure 3, with focus on high coherence, high held-out likelihood, low residuals, and high lower bound scores. The naming of the topics in Table 1 was based on domain knowledge, words with the highest probability in each topic, and words with a high lift score; lift gives higher weight to words that appear less frequently in other topics. The probability distribution of topics per review denotes the probability of each topic being discussed in a review, and the sum of all topics’ probabilities in each review totals 1. The trained STM model’s theta values per review refer to the probability that a topic is associated with each review. These theta values, presented in Figure 4, were used as features during the training of the XGBoost model along with other features.

Figure 3: Topic performance measures for identifying the optimum number of topics. The red circle indicates the K number of topics selected

Figure 4: Average theta values per topic

5.2. Personality Labelling
To identify the BERT long-text approach with the best classification performance, two techniques were examined, namely the naïve and semi-naïve approaches, and the one with the best performance was used in the workflow. For the naïve approach, we used the head only, with sequence lengths of 256 and 512 words, and for the semi-naïve approach, we chunked the text into segments of 128 words and combined their embeddings. The area under the curve (AUC) and accuracy scores in Table 2 from this process showed that the 512-naïve approach outperformed the other approaches and thus it was employed in users’ personality classification.
Results from these BERT BIG 5 models outperformed models trained on the same dataset using convolutional neural networks, SVM, or MLP with linguistic cues as extra features, which obtained an average accuracy of 57% [40].

Table 1
Specified names for the topics that emerged from STM analysis

| Topic name | Words with high probability and lift score |
| Asian cuisine | chinese, restaurants, indian, far, tried |
| Downtown restaurant | town, souvlaki, ice, cream, island, chefs |
| Quality of restaurant | excellent, time, amazing, quality, went, every |
| Italian cuisine | italian, pizza, pasta, next, love, year, ate |
| Cleanliness | clean, authentic, kids, toilets, lovely, party |
| Wines | wine, local, bottle, wonderful, house, red |
| Music bar | music, bar, drinks, evening, friends, night |
| Seafood | fresh, fish, cooked, dinner, many, seafood |
| Time to serve food | order, time, minutes, table, get |
| Food Taste | steak, taste, meal, experience, special |
| Quality of food ingredients | chicken, delicious, sauce, yam, prawns, curry |
| Price | prices, reasonable, selection, excellent, quality |
| Location | location, place, nice, great, sea, beach |
| Staff | great, staff, friendly, atmosphere, really, lovely |
| Value for money | value, money, variety, price, quality, excellent |
| Classy/Style | class, style, terrace, skylights, little, nice, cozy |
| Buffet | buffet, even, place, amazing, food |
| Traditional tavern | meze, traditional, cypriot, family, tavern |
| For lunch | lunch, busy, popular, day, weekends, Sunday |

Table 2
Performance results per long text treatment per model

| Long text treatment | Personality model | AUC | ACC |
| Naïve - head 512 tokens | MBTI | 0.878 | 0.839 |
| | BIG 5 | 0.679 | 0.698 |
| Naïve - head 256 tokens | MBTI | 0.784 | 0.759 |
| | BIG 5 | 0.645 | 0.686 |
| Semi-naïve - sliced text 128 tokens | MBTI | 0.653 | 0.662 |
| | BIG 5 | 0.605 | 0.755 |
Figure 5 shows descriptive statistics regarding the personalities of users, as detected by the two BERT classifiers fine-tuned using two labelled datasets (MBTI and BIG 5) and treating long text using the naïve-head approach with 512 tokens. The acronyms refer to combinations of the dimensions of each of the MBTI and BIG 5 models; only combinations detected in the dataset are shown. The trained BERT models predict, for each dimension of the personality model, the probability that a user belongs to any of the personality traits (i.e., probability for extraversion-introversion [E/I], neuroticism-calm [N/C], agreeableness-competitive [A/C], conscientiousness-inattentive [C/I] and openness-closeness [O/C] in BIG 5, and Extraversion-Introversion [E/I], Sensing-Intuition [S/N], Thinking-Feeling [T/F], and Judging-Perceiving [J/P] for MBTI). The two BERT models use five binary classifiers in the case of Big 5, one for each dimension of the model, and similarly four binary classifiers for MBTI. The classifiers predict the label of each state of the output variable. Combinations of personality labels create the acronyms depicted in Figure 5, such as INTP in MBTI, which defines individuals characterized as introverted (I), intuitive (N), thinking (T), and perceiving (P). Similarly, Big 5 acronyms refer to combinations of each of the Big 5’s dimensions. The MBTI classifier’s average performance is 88% AUC, while that of the BIG 5 classifier is 68%, which is better than the results reported in [40] that use convolutional neural nets.

Figure 5: Distribution of Big 5 (left) and MBTI (right) personality traits using each dimension’s acronyms

5.3. Training and Evaluating the XGBoost Models
The enhanced user-item matrix that emerged from the two personality models and the topic associations per review was used to train two XGBoost regression models, one for each personality modelling approach. The two XGBoost models underwent hyperparameter tuning prior to training, covering the models’ learning rate, gamma, subsample and regularization options using grid search. The two models were compared based on the following performance metrics: the Mean Absolute Error (MAE), which represents the average of the absolute difference between the real and predicted values, the Mean Squared Error (MSE), and the Root Mean Squared Error (RMSE), which is the square root of MSE. Comparison of the two models against traditional recommendation techniques, namely SVD, SVD++, and NMF, revealed an improved performance of the personality-based approaches over these baseline models. The traditional techniques were also optimized by tuning two hyperparameters, the number of factors and the regularization value.

Table 3
Performance results per model incorporating all features

| Evaluation metric (lower is better) | MBTI-XGB | BIG5-XGB | MBTI-XGB without venue personality | BIG5-XGB without venue personality | SVD | SVD++ | NMF |
| MAE | 0.59 | 0.6 | 0.61 | 0.62 | 0.65 | 0.68 | 0.82 |
| MSE | 0.71 | 0.72 | 0.73 | 0.75 | 0.87 | 0.89 | 1.22 |
| RMSE | 0.84 | 0.85 | 0.88 | 0.9 | 0.93 | 0.94 | 1.1 |

In the experiments conducted using the aforementioned restaurant reviews, the data was initially split into training and test sets (70/30) using stratified sampling to guarantee that all user ratings are sufficiently represented in the training and test samples. The models were tuned, trained, and tested using the same samples.
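The three evaluation metrics follow directly from their definitions and can be sketched as below. The ratings in the example are made-up values for illustration only and are unrelated to the results reported in Table 3.

```python
import math


def _check(y_true, y_pred):
    """Both sequences must be non-empty and of equal length."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute difference."""
    _check(y_true, y_pred)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average squared difference."""
    _check(y_true, y_pred)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical actual and predicted ratings on the 1-5 scale.
actual = [5, 4, 3, 5]
predicted = [4.5, 4.0, 4.0, 3.0]
print(mae(actual, predicted))   # 0.875
print(mse(actual, predicted))   # 1.3125
print(rmse(actual, predicted))
```

Because squaring penalizes large errors more heavily, MSE and RMSE are more sensitive than MAE to a few badly mispredicted ratings, as the last pair in the example shows.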
The aforementioned metrics were computed, and the results (Table 3) show (1) that the combination of user and venue personality improves the results, confirming the hypothesis that consumers prefer to visit restaurants with personalities similar to their own; and (2) that the MBTI XGBoost model produced the best performance among all models. Both personality-based models outperformed the traditional approaches, which indicates that the use of personality and eWOM-extracted topics improved the recommendations.

6. Conclusions

This study proposes a personality-based restaurant recommendation approach and constitutes one of the first studies that use customer and venue personality in the restaurant recommendation problem. It focuses on evaluating two popular personality models, namely MBTI and Big 5, to enhance the restaurant recommendation process. Personality is identified from tourists' eWOM using two BERT classification models that are fine-tuned on labelled datasets. Because the training texts are long, the best-performing long-text handling approach (naïve head with 512 tokens) was employed when fine-tuning the BERT models. The method also uses, as additional features, the discussion themes of users and restaurants extracted from the eWOM text through topic modelling. All aforementioned features are used collectively to train two XGBoost regressors (one for each personality model) to predict consumers' satisfaction with unvisited restaurants. The results show, firstly, that venue and user personality can improve recommendation. Secondly, the MBTI model in combination with topics from eWOM outperforms both the Big 5 model and model-based collaborative filtering techniques. Both results offer a first indication that the consideration of personality in restaurant recommendation can have valuable implications.
Future work will focus on evaluating other long-text handling techniques for fine-tuning the BERT classifiers and on combining their results with traditional machine learning models in an ensemble to further improve the performance of personality classification, given that personality is a valuable feature that enhances restaurant recommendation.

References

[1] M. del Carmen Rodríguez-Hernández, S. Ilarri, AI-based mobile context-aware recommender systems from an information management perspective: Progress and directions, Knowledge-Based Systems 215 (2021) 106740. URL: https://doi.org/10.1016/j.knosys.2021.106740. doi:10.1016/j.knosys.2021.106740.
[2] A. Tommasel, A. Corbellini, D. Godoy, S. Schiaffino, Personality-aware followee recommendation algorithms: An empirical analysis, Engineering Applications of Artificial Intelligence 51 (2016) 24–36. URL: https://www.sciencedirect.com/science/article/pii/S0952197616000208. doi:10.1016/j.engappai.2016.01.016.
[3] J. Gountas, S. Gountas, Personality orientations, emotional states, customer satisfaction, and intention to repurchase, Journal of Business Research 60 (2007) 72–75. doi:10.1016/j.jbusres.2006.08.007.
[4] J. L. Aaker, Dimensions of Brand Personality, Journal of Marketing Research 34 (1997) 347–356. URL: http://www.jstor.org/stable/3151897. doi:10.2307/3151897.
[5] D. Kim, V. P. Magnini, M. Singal, The effects of customers' perceptions of brand personality in casual theme restaurants, International Journal of Hospitality Management 30 (2011) 448–458. URL: https://www.sciencedirect.com/science/article/pii/S027843191000109X. doi:10.1016/j.ijhm.2010.09.008.
[6] G. J. Boyle, Myers-Briggs Type Indicator (MBTI): Some Psychometric Limitations, Australian Psychologist 30 (1995) 71–74. URL: https://aps.onlinelibrary.wiley.com/doi/abs/10.1111/j.1742-9544.1995.tb01750.x. doi:10.1111/j.1742-9544.1995.tb01750.x.
[7] R. R. McCrae, O. P. John, An introduction to the five-factor model and its applications, Journal of Personality 60 (1992) 175–215. doi:10.1111/j.1467-6494.1992.tb00970.x.
[8] A. Gregoriades, M. Pampaka, M. Georgiades, A Holistic Approach to Requirements Elicitation for Mobile Tourist Recommendation Systems, in: K. Arai, R. Bhatia (Eds.), Future of Information and Communication Conference, Springer International Publishing, Cham, 2020, pp. 857–873.
[9] E. Christodoulou, A. Gregoriades, M. Pampaka, H. Herodotou, Personality-Informed Restaurant Recommendation, in: A. Rocha, H. Adeli, G. Dzemyda, F. Moreira (Eds.), World Conference on Information Systems and Technologies, Springer International Publishing, Cham, 2022, pp. 13–21.
[10] S. Malik, A. Rana, M. Bansal, A Survey of Recommendation Systems, Information Resources Management Journal 33 (2020) 53–73. URL: http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/IRMJ.2020100104. doi:10.4018/IRMJ.2020100104.
[11] A. Ansari, S. Essegaier, R. Kohli, Internet Recommendation Systems, Journal of Marketing Research 37 (2000) 363–375. URL: http://journals.sagepub.com/doi/10.1509/jmkr.37.3.363.18779. doi:10.1509/jmkr.37.3.363.18779.
[12] S. B. Aher, L. Lobo, Applicability of data mining algorithms for recommendation system in e-learning, ACM International Conference Proceeding Series (2012) 1034–1040. doi:10.1145/2345396.2345562.
[13] M. Nilashi, O. bin Ibrahim, N. Ithnin, N. H. Sarmin, A multi-criteria collaborative filtering recommender system for the tourism domain using Expectation Maximization (EM) and PCA–ANFIS, Electronic Commerce Research and Applications 14 (2015) 542–562. URL: http://dx.doi.org/10.1016/j.elerap.2015.08.004 https://linkinghub.elsevier.com/retrieve/pii/S1567422315000599. doi:10.1016/j.elerap.2015.08.004.
[14] N. Silva, D. Carvalho, A. C. Pereira, F. Mourão, L. Rocha, The Pure Cold-Start Problem: A deep study about how to conquer first-time users in recommendations domains, Information Systems 80 (2019) 1–12. doi:10.1016/j.is.2018.09.001.
[15] S. Natarajan, S. Vairavasundaram, S. Natarajan, A. H. Gandomi, Resolving data sparsity and cold start problem in collaborative filtering recommender system using Linked Open Data, Expert Systems with Applications 149 (2020). doi:10.1016/j.eswa.2020.113248.
[16] Y. Koren, R. Bell, C. Volinsky, Matrix Factorization Techniques for Recommender Systems, Computer 42 (2009) 30–37. doi:10.1109/MC.2009.263.
[17] Y. Fan, Y. Shen, J. Mai, Study of the Model of E-commerce Personalized Recommendation System Based on Data Mining, in: 2008 International Symposium on Electronic Commerce and Security, IEEE, 2008, pp. 647–651. URL: http://ieeexplore.ieee.org/document/4606146/. doi:10.1109/ISECS.2008.106.
[18] L. Sun, J. Guo, Y. Zhu, Applying uncertainty theory into the restaurant recommender system based on sentiment analysis of online Chinese reviews, World Wide Web 22 (2019) 83–100. doi:10.1007/s11280-018-0533-x.
[19] G. B. Herwanto, A. M. Ningtyas, Recommendation system for web article based on association rules and topic modelling, Bulletin of Social Informatics Theory and Application 1 (2017) 26–33. doi:10.31763/businta.v1i1.36.
[20] C. Zhang, H. Zhang, J. Wang, Personalized restaurant recommendation method combining group correlations and customer preferences, Information Sciences 454-455 (2018) 128–143. doi:10.1016/j.ins.2018.04.061.
[21] W.-Z. Su, P.-H. Lin, A Study of Relationship Between Personality and Product Identity, in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9741, 2016, pp. 266–274. URL: http://link.springer.com/10.1007/978-3-319-40093-8_27. doi:10.1007/978-3-319-40093-8_27.
[22] H. Wang, Y. Zuo, H. Li, J. Wu, Cross-domain recommendation with user personality, Knowledge-Based Systems 213 (2021) 106664. URL: https://doi.org/10.1016/j.knosys.2020.106664. doi:10.1016/j.knosys.2020.106664.
[23] W. Wu, L. Chen, Y. Zhao, Personalizing recommendation diversity based on user personality, User Modeling and User-Adapted Interaction 28 (2018) 237–276. URL: https://doi.org/10.1007/s11257-018-9205-x http://link.springer.com/10.1007/s11257-018-9205-x. doi:10.1007/s11257-018-9205-x.
[24] R. P. Karumur, T. T. Nguyen, J. A. Konstan, Personality, User Preferences and Behavior in Recommender systems, Information Systems Frontiers 20 (2018) 1241–1265. URL: http://link.springer.com/10.1007/s10796-017-9800-0. doi:10.1007/s10796-017-9800-0.
[25] M. H. Amirhosseini, H. Kazemian, Machine Learning Approach to Personality Type Prediction Based on the Myers–Briggs Type Indicator®, Multimodal Technologies and Interaction 4 (2020) 9. URL: www.theijm.com https://www.mdpi.com/2414-4088/4/1/9. doi:10.3390/mti4010009.
[26] S. V. Paunonen, Big Five Factors of Personality and Replicated Predictions of Behavior, Journal of Personality and Social Psychology 84 (2003) 411–424. doi:10.1037/0022-3514.84.2.411.
[27] M. Tkalcic, L. Chen, Personality and Recommender Systems, in: F. Ricci, L. Rokach, B. Shapira (Eds.), Recommender systems handbook, Springer US, Boston, MA, 2015, pp. 715–739. URL: https://doi.org/10.1007/978-1-4899-7637-6_21. doi:10.1007/978-1-4899-7637-6_21.
[28] S. Dhelim, N. Aung, M. A. Bouras, H. Ning, E. Cambria, A Survey on Personality-Aware Recommendation Systems, Artif. Intell. Rev. 55 (2022) 2409–2454. doi:10.1007/s10462-021-10063-7.
[29] F. Mairesse, M. A. Walker, M. R. Mehl, R. K. Moore, Using linguistic cues for the automatic recognition of personality in conversation and text, Journal of Artificial Intelligence Research 30 (2007) 457–500. doi:10.1613/jair.2349.
[30] J. W. Pennebaker, M. E. Francis, R. J. Booth, Linguistic inquiry and word count: LIWC 2001, Mahway: Lawrence Erlbaum Associates 71 (2001) 2001.
[31] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is All you Need, in: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), Advances in Neural Information Processing Systems, volume 30, Curran Associates, Inc., 2017. URL: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.
[32] S. Kardakis, I. Perikos, F. Grivokostopoulou, I. Hatzilygeroudis, Examining attention mechanisms in deep learning models for sentiment analysis, Applied Sciences (Switzerland) 11 (2021). doi:10.3390/app11093883.
[33] H. Jun, L. Peng, J. Changhui, L. Pengzheng, W. Shenke, Z. Kejia, Personality Classification Based on Bert Model, Proceedings of 2021 IEEE International Conference on Emergency Science and Information Technology, ICESIT 2021 (2021) 150–152. doi:10.1109/ICESIT53460.2021.9697048.
[34] S. Stajner, S. Yenikent, A Survey of Automatic Personality Detection from Texts, in: Proceedings of the 28th International Conference on Computational Linguistics, 2020, pp. 6284–6295. doi:10.18653/v1/2020.coling-main.553.
[35] S. I. Nikolenko, S. Koltcov, O. Koltsova, Topic modelling for qualitative studies, Journal of Information Science 43 (2017) 88–102. URL: http://journals.sagepub.com/doi/10.1177/0165551515617393. doi:10.1177/0165551515617393.
[36] M. E. Roberts, B. M. Stewart, D. Tingley, C. Lucas, J. Leder-Luis, S. K. Gadarian, B. Albertson, D. G. Rand, Structural topic models for open-ended survey responses, American Journal of Political Science (2014). doi:10.1111/ajps.12103.
[37] C. Sun, X. Qiu, Y. Xu, X. Huang, How to Fine-Tune BERT for Text Classification?, 2019. doi:10.48550/ARXIV.1905.05583.
[38] Kaggle, (MBTI) Myers-Briggs Personality Type Dataset, 2017. URL: https://www.kaggle.com/datasets/datasnaek/mbti-type.
[39] J. W. Pennebaker, L. A. King, Linguistic styles: Language use as an individual difference, 1999. doi:10.1037/0022-3514.77.6.1296.
[40] N. Majumder, S. Poria, A. Gelbukh, E. Cambria, Deep Learning-Based Document Modeling for Personality Detection from Text, IEEE Intelligent Systems 32 (2017) 74–79. doi:10.1109/MIS.2017.23.