Extracting User Preferences and Personality from Text for Restaurant Recommendation

Evripides Christodoulou1, Andreas Gregoriades1, Herodotos Herodotou1 and Maria Pampaka2

1 Cyprus University of Technology, Limassol, Cyprus
2 The University of Manchester, Manchester, UK

ORSUM@ACM RecSys 2022: 5th Workshop on Online Recommender Systems and User Modeling, jointly with the 16th ACM Conference on Recommender Systems, September 23rd, 2022, Seattle, WA, USA.
Contact: ep.xristodoulou@edu.cut.ac.cy (E. Christodoulou); andreas.gregoriades@cut.ac.cy (A. Gregoriades); herodotos.herodotou@cut.ac.cy (H. Herodotou); maria.pampaka@manchester.ac.uk (M. Pampaka)
ORCID: 0000-0002-7422-1514 (A. Gregoriades); 0000-0002-8717-1691 (H. Herodotou); 0000-0001-5481-1560 (M. Pampaka)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.

Abstract
Restaurant recommender systems are designed to support restaurant selection by assisting consumers with the information overload problem. However, despite their promise, they have been criticized for insufficient performance. Recent research in recommender systems has acknowledged the importance of personality in improving recommendation; however, limited work has exploited this aspect in the restaurant domain. Similarly, user food preferences are known to improve recommendation, but most systems explicitly ask users for this information. In this paper, we explore the influence of personality and user preference by utilizing the text in consumers' electronic word of mouth (eWOM) to predict the probability of a user enjoying a restaurant he/she has not visited before. Food preferences are extracted through a named-entity recognizer trained on a labelled dataset of foods, generated using a rule-based approach. User personality is predicted with a bi-directional transformer coupled with a feed-forward classification layer, due to its improved performance over other machine learning models in similar problems. The personality classification model utilizes the textual information of reviews and predicts the personality of the author. Topic modelling is used to identify additional features that characterize users' preferences and restaurants' properties. All aforementioned features are used collectively to train an extreme gradient boosting tree model, which outputs the predicted user rating of restaurants. The trained model is compared against popular recommendation techniques such as non-negative matrix factorization and singular value decomposition.

Keywords
Consumer Personality, Food Preference Extraction, Recommender System, Topic Modelling

1. Introduction

Dining is one of the top five tourist activities during a leisure trip and plays a central role in the travel experience. Recently, interest in the food experience has been growing [1], with businesses in the hospitality sector seeking insights into the dining behaviours and preferences of customers to improve decision making in areas such as marketing [2] and recommendation [3]. Past research that utilizes food in recommender systems, such as [4], employs simple techniques such as frequencies of food vocabularies in a bag-of-words representation. However, such techniques require a complete lexicon of foods, which is usually not available for different cuisines and countries.

In this paper, we utilize implicit and explicit information in consumers' eWOM to improve restaurant recommendation. Implicit information refers to textual comments in reviews that can be used to estimate consumers' personality and preferences, while explicit features refer to restaurants' ratings, their estimated value, price, and cuisine offered.

There are several ways to extract user preferences for restaurant recommendation [5]. The simplest is through explicit queries, asking users to define their preferences. This, however, has disadvantages, as food preferences might not be covered by the questions asked. Alternative methods utilize user ratings to find similarities between users and restaurants (e.g., collaborative filtering). Another method for preference extraction is user opinion analysis using natural language processing.
Traditional recommendation approaches base their recommendations on user preferences extracted from users' historical records, such as ratings, reviews, or purchases. Popular techniques include the collaborative and content-based filtering approaches. Recently, there has been strong interest in the utilization of users' personality, since it is linked to perception, motivation, and preference, and is known to remain stable during adulthood. Personality is directly associated with consumer emotions and has a strong impact on satisfaction, with theory indicating that people with the same personality have similar preferences and needs [6]. The application of users' personality has enhanced the performance of recommender systems in the tourism domain, with results improving point-of-interest and destination recommendations, utilizing either questionnaires or automated personality recognition.

This paper illustrates the utilization of consumers' personality and user preferences extracted from the textual part of electronic word of mouth (eWOM) to improve recommendation. EWOM represents consumer opinions about products and services and has been used extensively in identifying consumers' preferences. Recommendations are made by training an Extreme Gradient Boosting (XGBoost) prediction model using as features the users' personality and the users' preferences (e.g., food). XGBoost is used due to its good performance in similar recommendation problems [7]. The research question addressed in this paper focuses on whether the integration of personality with other features inferred from structured and unstructured parts of online reviews improves restaurant recommendation, in contrast to popular model-based collaborative filtering (CF) techniques such as non-negative matrix factorization (NMF) and singular value decomposition (SVD).
The proposed approach utilizes consumers' food preferences and personalities, along with perceptions about venues from eWOM, to recommend the most suitable restaurants to tourists. Labelled personality data is utilized to train a BERT (Bidirectional Encoder Representations from Transformers) classifier using the Myers-Briggs Type Indicator (MBTI) personality model, due to its good results in previous studies [8]. User preferences are extracted through topic modelling and a trained food named-entity recognizer. An XGBoost model is generated to predict the probability of a user liking an unvisited restaurant based on the user's personality and preferences and the themes that characterize the venue.

The research question addressed in this work is how to best combine user preference and personality models with topic features inferred from eWOM to produce the best recommendation, in contrast to popular model-based collaborative filtering (CF) techniques. This is a continuation of our previous work in [9, 10], which examines the use of personality and emotion in recommender systems. The contribution of this work lies in the automated detection of food preferences from eWOM and its combination with user personality and topic modelling for restaurant recommendation.

The paper is organized as follows. The next section introduces background knowledge on restaurant recommender systems and techniques for extracting food preferences and personality from text. The section that follows describes techniques for identifying topics discussed in consumers' eWOM and for personality prediction using deep neural networks. Subsequent sections elaborate on the methodology followed and the results obtained. The paper concludes with the discussion and future directions.

2. Existing Knowledge

This section provides a review of recommendation techniques and the concept of personality, and elaborates on how personality has been used in recommender systems so far.

2.1. Restaurant Recommender Systems

Recommender systems aim to predict the satisfaction of a consumer with an item (product/service) he/she has not bought yet [11]. This is part of one-to-one marketing, which seeks to match items to consumers' preferences, in contrast to mass marketing, which aims to satisfy a target market segment [12]. Popular approaches focus on consumers' past experiences (ratings) for the creation of a user-item matrix and, based on that, predict what is most appropriate for a user depending on the similarity between either users or items (products, services) [11]. The relationship between consumers or between products can be found using similarity metrics, and this method is known as Collaborative Filtering (CF) [13]. CF has been successfully applied in tourism recommendation problems such as hotels or points of interest and is considered one of the most popular techniques [14]. Another popular technique is content-based filtering, which attempts to guess what a user may like based on items' features rather than their ratings [13]. A hybrid approach takes advantage of both content-based filtering and collaborative filtering [11].

CF techniques, however, suffer from the cold start problem, which occurs when very little or no data is available about a user, making it impossible to identify similar consumers [15]. In addition, data sparsity exacerbates the problem when there are many unrated items in the user-item matrix. This occurs when there is not enough data to populate the user-item matrix based on which to make reliable inferences [16]. In tourism, the collection of data is difficult and time-consuming due to the limited time that tourists spend at a destination. The cold start problem appears with first-time users (tourists), since there are no records of their purchasing activity at a specific destination. To address these CF problems, recent methods utilise machine learning techniques such as matrix factorisation to approximate the user-item matrix content using latent variables that emerge from the initial data. The singular value decomposition (SVD), optimized SVD (SVD++), and non-negative matrix factorization (NMF) models factorize the user-item matrix and predict the satisfaction of users with products that are unknown to them [17].
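To make these baselines concrete, the following sketch factorizes a toy rating matrix with SVD and NMF using the surprise library (the same library the authors name in Section 5.3). The DataFrame layout and column names are illustrative assumptions, not the paper's actual pipeline.

```python
# Sketch: model-based CF baselines (SVD and NMF) on a user-item rating matrix.
# The toy data and column names are assumptions for illustration.
import pandas as pd
from surprise import NMF, SVD, Dataset, Reader
from surprise.model_selection import cross_validate

ratings = pd.DataFrame({  # stand-in for (user, restaurant, rating) triples
    "user_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "restaurant_id": [10, 20, 10, 30, 20, 30, 10, 20],
    "rating": [5, 3, 4, 2, 5, 4, 1, 2],
})
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(ratings[["user_id", "restaurant_id", "rating"]], reader)

# Approximate the sparse user-item matrix with latent factors and report errors.
for algo in (SVD(n_factors=10), NMF(n_factors=10)):
    cross_validate(algo, data, measures=["MAE", "RMSE"], cv=2, verbose=True)
```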
Alternatively, content-based approaches utilize metadata about new products to address the cold start problem. A useful source for obtaining such metadata is textual information from eWOM and its analysis using text analytics [18]. An example application is the work by Sun et al. [19], which improved CF performance by analysing restaurant eWOM to define numerical features corresponding to consumer satisfaction through sentiment analysis. In the same vein, topic modelling techniques have been used with CF to assist in estimating the similarity between consumers or items [20]. Finally, work by Zhang et al. [21] used consumer or item characteristics to cluster them into groups and then found correlations between clusters to address the data sparsity problem.

Recently, a strong interest has emerged in using the personalities of consumers in an effort to better understand and match their needs, as "personality" relates to the perceptions, feelings, motivations, and preferences of individuals [22]. The application of user personality has improved the performance of recommendations in the tourism domain for points of interest compared to traditional methods [23]. Personality-based recommendations have also been shown to greatly reduce the cold start and data sparsity problems and have improved the performance of recommendations in areas such as online advertising, social media, books, and music [24]. However, these approaches do not take advantage of eWOM data from users on the web to extract their preferences and personalities. They focus mainly on the extraction of user data from specialized questionnaires designed to collect consumers' behaviours and personalities. Such approaches fail to continuously update the system, because the time-consuming use of questionnaires leads to the loss of automation and to update limitations.
2.2. Personality Extraction from Text

Personality is a set of characteristics and behaviours of an individual that influence many areas of his/her life, such as motivations and preferences, including consumer preferences and behaviour [23]. Automated personality prediction has been applied by researchers on data from various social networks, such as Facebook and Twitter, to explore correlations between personalities and different user activities, purchasing behaviours, and liking of foods from specific cuisines [25].

The two most popular text-based personality classification methods are based on the Myers-Briggs Type Indicator (MBTI) [26] and the Big Five [27] personality traits, due to the availability of labelled data for these models. The classifiers with the best performance usually employ the MBTI personality model, which focuses on 8 key characteristics, paired into four dimensions: Extraversion or Introversion, Sensing or Intuition, Thinking or Feeling, and Judging or Perceiving. The combination of these characteristics shapes 16 different personality types and classifies people to the proper personality cluster [13]. The Big Five personality model expresses personality in the following 5 dimensions: Agreeableness, Extraversion, Openness to Experience, Conscientiousness, and Neuroticism. Such taxonomies are recognized as a valid mechanism for defining the most essential aspects of personality that describe people's characteristics and create and reflect their behaviour [28].

Personality prediction is an important phase of personality-aware recommender systems, and the two main methods for doing so are questionnaires and automated means. Generally, questionnaires are more accurate in assessing personality; however, the process is tedious, while the automated approach is easier to conduct by utilising users' existing data, which can be text, images, videos, likes (behavioural data), etc. [18]. Predicting personality from text is a popular automated approach based on personality theory, which claims that words can reveal psychological states and the personality of the author of the text. There are two main categories of techniques, the feature-based and the deep learning one: the former uses unigrams/n-grams (open vocabulary approach) or lexicons (closed vocabulary) of features relevant to personality, while the latter uses text embeddings learned from a large corpus of text in an unsupervised manner (language models). Popular feature-based methods utilize the Mairesse [29] and the Linguistic Inquiry and Word Count techniques [30]. Features from these are fed into different machine learning classifiers (e.g., Naïve Bayes, support vector machines) to make predictions. Obtaining such features, however, is a costly process and cannot effectively represent the original text semantics. To avoid feature engineering, deep neural models and language models are employed to learn text representations, currently resulting in improved accuracy. Deep models focus on the context of the text and not just a static representation of a word or a sentence. These deep learning techniques use an attention mechanism that assigns weights to words based on how they are used in a text, giving the ability to capture the semantic content [31]. A popular architecture is BERT (Bidirectional Encoder Representations from Transformers), which utilizes the transformer neural network architecture. Attention-based transformers have shown that capturing the semantics of a text improves the performance and prediction accuracy of machine learning personality models [32]. Given this, the method proposed in this paper utilizes attention-based personality prediction.

Most approaches use a binary classifier for each of the personality traits (MBTI), such as a classifier for extraversion-introversion, etc. Such methods require data pre-labelled with the personality class. The first step in the process is the vectorization of the text into a form that can be processed by ML algorithms [33]. This can be done using open/closed lexicons or, in the case of deep learning methods (BERT), sentence embeddings. The vectorized data is used to train a classifier using the data labels or, in the case of BERT, to fine-tune a pretrained model. The trained and validated model can then be used to predict unseen data. Recent personality classification techniques that utilize deep learning for Big Five personality prediction, such as DeepPerson [34], demonstrate classification performance (AUC score) of around 70% per personality dimension using different training datasets, which is much lower compared to classifiers that use MBTI data.
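Since each MBTI dimension is handled by its own binary classifier, a labelled four-letter type is typically decomposed into four binary targets before training. A minimal sketch of this decomposition follows; the column names are illustrative, not taken from the paper.

```python
# Sketch: turn four-letter MBTI labels into four binary targets, one per
# dimension (E/I, S/N, T/F, J/P), as consumed by per-trait binary classifiers.
import pandas as pd

labelled = pd.DataFrame({
    "type": ["INTJ", "ESFP", "ENFJ"],          # MBTI questionnaire labels
    "posts": ["text a", "text b", "text c"],   # concatenated user posts
})

# Letter position i encodes dimension i; 1 marks the first trait of the pair.
for i, (dim, first) in enumerate([("E_I", "E"), ("S_N", "S"),
                                  ("T_F", "T"), ("J_P", "J")]):
    labelled[dim] = (labelled["type"].str[i] == first).astype(int)

print(labelled[["type", "E_I", "S_N", "T_F", "J_P"]])
```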
3. Technical Background

The proposed method utilises named entity recognition to extract food preferences, automated eWOM topic modelling to identify the themes discussed in reviews' text, BERT-based personality classification, and an ensemble tree-based regression for the prediction of consumers' restaurant ratings.

3.1. User's Food Preference Extraction

Named entity recognition (NER) is utilized to extract the food preferences of customers. NER is a major component in NLP systems for extracting information from unstructured text. An entity can be any word or phrase that refers to the concept in question. There are two main approaches for the creation of a NER: the model-based and the rule-based approach. The latter focuses on grammatical rules and linguistic terms to extract entities. The model-based approach generates machine learning models using text with pre-labelled entities. Most food NER models, as reported in [35], are trained on data that do not include Cypriot dishes, and thus their predictions were insufficient for our case. There is a wide range of generic libraries suitable for NER, such as NLTK, spaCy, Stanford NER, Stanza, and Flair, but none was able to provide appropriate food recognition in text.

To extract food preferences, the spaCy library was utilized, and several rules were specified that enabled the extraction of sentences mentioning food consumption, such as "I ate ...", "I had ... for dinner", etc. To generate a sufficient training set, a local and international food recipe dataset was used. The returned sentences were annotated automatically based on the position in the sentence where the food entity occurred (the rule-based extraction step is sketched below).
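The paper does not list its exact rules; the sketch below shows how such food-consumption patterns can be expressed with spaCy's Matcher. The patterns and the sample sentence are illustrative stand-ins.

```python
# Sketch: rule-based harvesting of food-consumption sentences with spaCy's
# Matcher. The patterns are illustrative stand-ins for the rules in the text.
import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
# Match constructions such as "I ate ...", "I had ...", "I ordered ...".
matcher.add("FOOD_MENTION", [
    [{"LOWER": "i"}, {"LEMMA": {"IN": ["eat", "have", "order", "try"]}}],
])

doc = nlp("The view was great. I ordered souvlaki and had halloumi for dinner.")
sentences = {doc[start].sent.text for _, start, _ in matcher(doc)}
print(sentences)  # candidate sentences to annotate with food spans
```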
This automatic annotation was necessary in order to create a training dataset labelled with food names and their start/end positions in the sentence, based on which the food NER training was performed. The trained NER achieved an overall accuracy of 81% (70/30 train-test split) and was applied on the restaurant reviews to extract the foods associated with each review. Due to the large number of food entities generated, there were many repetitions caused by different spellings. Thus, to reduce the dimensionality of the dataset, a feature selection process was performed using a random forest machine learning model to identify the most important food names, using the review ratings as the target variable. The process yielded the optimum number of features (220) that resulted in the best model performance. The selected food features were then one-hot encoded for each consumer review. To identify the food preferences of each user, reviews were grouped by user, and the most frequent food entities in each user's reviews were considered as food preferences. This process assumes that when customers visit different restaurants and write comments about the food they ordered, this constitutes a food preference of the user, irrespective of the food's quality and the review rating.

3.2. EWOM Topic Modelling

Topic modelling is a popular tool for extracting information from unstructured data and is used in this work to identify themes discussed by consumers in eWOM. Topic models generally involve a statistical model aiming at finding topics that occur in a collection of documents [36]. Two of the most popular techniques for topic analysis are Latent Dirichlet Allocation and the Structural Topic Model (STM). In this study, the STM approach [37] is used to develop a topic model due to its ability to incorporate reviews' metadata, such as sentiment (rating > 3), that helps with interpreting and naming the identified topics. Each topic in STM represents a set of words that occur frequently together in a corpus, and each document is associated with a probability distribution over topics. The process of learning the topic model initiates with data preprocessing, which includes the removal of common and custom stop-words and irrelevant information (punctuation), followed by tokenization (breaking sentences into word tokens) and stemming (converting words to their root form). Initially, common stop-words were considered and, gradually, with the refinement of the model, additional stop-words irrelevant to our goal were added to the list of custom stop-words, such as names of people, restaurants, cities, etc. The optimum number of topics that best fits the dataset is identified through an iterative process, examining different values for the number of topics (K) and inspecting the semantic coherence, held-out likelihood, and exclusivity of the model at each iteration until a satisfactory model is produced [37]. Coherence measures the degree of semantic similarity between high-scoring words in a topic. Held-out likelihood tests a trained topic model on a test set that contains previously unseen documents. Exclusivity measures the extent to which the top words in one topic are not top words in other topics. The naming of the topics was performed manually, based on domain knowledge and the most prevalent words that characterize each topic.

Figure 1: Overview of the approach and its evaluation

3.3. BERT Personality Classification

Recent benefits of the "attention" mechanism in deep learning models have demonstrated state-of-the-art performance in numerous text analysis tasks such as classification. BERT uses a multi-layer bidirectional transformer encoder and is inspired by the concept of knowledge transfer, since in many problems it is difficult to access a sufficiently large volume of labelled data to train deep models. In transfer learning, a pre-trained model is learned from massive unlabelled datasets that do not represent the target problem but allow the learning of general knowledge. BERT-like approaches provide pretrained models whose embedded knowledge can be transferred to a target domain where labelled data is limited. Fine-tuning such models is performed using a labelled dataset representing the actual problem; this tunes the model to the task at hand. Fine-tuning adds a feedforward layer on top of the pre-trained BERT. Previous work has demonstrated that this pre-training and fine-tuning approach outperforms existing text classification approaches.
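A minimal sketch of this fine-tuning setup for one MBTI dimension follows, assuming the Hugging Face transformers library (the paper does not name its implementation); the model name, toy data, and training settings are illustrative.

```python
# Sketch: fine-tune a pre-trained BERT with a feed-forward classification head
# for one MBTI dimension (here E/I). Hugging Face transformers is assumed;
# the toy dataset stands in for the labelled personality corpus.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

train_ds = Dataset.from_dict({  # 1 = Extraversion, 0 = Introversion
    "text": ["I love meeting new people at parties",
             "I prefer quiet evenings alone"],
    "label": [1, 0],
})

def encode(batch):
    # Naive head-only treatment: keep at most the first 512 tokens.
    return tokenizer(batch["text"], truncation=True, max_length=512,
                     padding="max_length")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert_mbti_EI", num_train_epochs=3),
    train_dataset=train_ds.map(encode, batched=True),
)
trainer.train()
```

Four such classifiers, one per dimension, would be trained the same way on the corresponding binary targets.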
In approaches lose information but have a minimum compu- our case, fine-tuning the BERT model was performed tational cost. Recent works have sought to alleviate the using publicly available personality labelled data. BERT computational cost constraint by applying more sophis- matrix is generated with rows corresponding to ticated models to longer text instances such as dividing consumers and columns to restaurants. The cells the long text into chunks and combining the embeddings of the matrix contain ratings when these are avail- of the chunks. However, work by Sun et al. [40] that able since customers did not visit all restaurants; investigated different long-text treatment methods for 3. Development of a topic model using as corpus consumer reviews, showed that the best classification the eWOM (reviews) to identify consumers’ opin- performance is achieved using naïve methods such as ions and how these are associated with each re- using only the head or tail tokens of the text while drop- view. Restaurant’s topics are generated by aver- ping all other content. In this work, we explore the naïve aging the topics theta values associated with each and semi naïve methods to find the one with the best restaurant. This represents common consumer personality classification performance prior to labelling opinions per restaurant; users with their personality. The results, described in a 4. Assessment of customers’ personality from subsequent section, show that the naïve approach yielded eWOM is achieved using the MBTI BERT per- the best performance, which is in line with [40]. sonality classification model; 5. Food preferences of users are extracted from 3.4. XGBoost Regression eWOM’s text using a custom NER model; 6. The explicit information from each restaurant is XGBoost regression is used in this study due to its ability combined with implicit information that emerges of producing good results in similar problems. It is an from personality analysis, food preference, and ensemble method; hence multiple trees are constructed topic modelling. These features are used collec- with the training of each tree depending on errors from tively to enhance the user-item matrix and are previous trees’ predictions. Gradient descent is used used to train an Extreme Gradient Boosting (XG- to generate new trees based on all previous trees while boost) regressor model using as output variable optimizing for loss and regularization. XGBoost regular- the user rating of restaurants and taking values ization component balances complexity of the learned in the range [1-5]. The XGBoost is optimized us- model against predictability. XGBoost optimization is ing hyperparameter tuning and validated using required to minimize model overfitting and treating data train/test data split (70/30). The trained model is imbalance, by tuning multiple hyperparameters. The op- used to predict user ratings for restaurant users timal values of hyperparameters can be determined with have never visited; different techniques such as the exhaustive (grid search), Bayesian, or random. The grid search method combines 7. The performance of the XGBoost model is com- all possible values of each parameter, to obtain the model pared against that of three popular model-based with best performance, while the Bayesian utilizes results CF techniques, namely SVD, SVD++, and NMF. 
4. Methodology

The methodology employed to address our research question is presented in Figure 1 and is implemented via the following steps:

1. Collection of restaurant reviews from TripAdvisor and extraction of consumers' eWOM and additional explicit information about restaurants, such as cuisine type, price range, and value for money;
2. Preprocessing of the data and preparation for the subsequent analyses (topic modelling, personality classification). Preprocessing includes punctuation and URL elimination, lowercasing of text, stop-word removal, tokenization, stemming, and lemmatization. During this step, the user-item matrix is generated, with rows corresponding to consumers and columns to restaurants. The cells of the matrix contain ratings where these are available, since customers did not visit all restaurants;
3. Development of a topic model using the eWOM (reviews) as corpus to identify consumers' opinions and how these are associated with each review. Restaurant topics are generated by averaging the topic theta values associated with each restaurant. This represents common consumer opinions per restaurant;
4. Assessment of customers' personality from eWOM using the MBTI BERT personality classification model;
5. Extraction of users' food preferences from eWOM text using a custom NER model;
6. Combination of the explicit information about each restaurant with the implicit information that emerges from the personality analysis, food preferences, and topic modelling. These features are used collectively to enhance the user-item matrix and to train an Extreme Gradient Boosting (XGBoost) regressor model, using as output variable the user rating of restaurants, taking values in the range [1-5]. The XGBoost model is optimized using hyperparameter tuning and validated using a train/test data split (70/30). The trained model is used to predict user ratings for restaurants users have never visited;
7. Comparison of the performance of the XGBoost model against that of three popular model-based CF techniques, namely SVD, SVD++, and NMF. The comparison models are trained using the initial user-item matrix, while the XGBoost model uses the enhanced user-item matrix that includes explicit and implicit information. The performance of the models is assessed using popular evaluation metrics such as the mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE).

5. Results

The data collected includes 105k reviews written in English by tourists who visited Cyprus between 2010 and 2020 and posted reviews about their experience with restaurants in Cyprus (publicly available). The total number of unique users was 56,800 and the number of restaurants was 650. Figure 2 depicts descriptive statistics of review ratings per year. For this study, only users with at least 20 reviews and only restaurants with at least 50 reviews were considered, yielding 93 unique users and 410 venues; this filtering is sketched below.

Figure 2: Percentage of restaurant review ratings [1-5] per year from 2010 to 2020
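A sketch of the review-count filtering just described, under assumed column names; "cyprus_reviews.csv" is a hypothetical export of the collected eWOM.

```python
# Sketch: keep users with >= 20 reviews and restaurants with >= 50 reviews.
# The file name and columns (user_id, restaurant_id) are assumptions.
import pandas as pd

reviews = pd.read_csv("cyprus_reviews.csv")

user_counts = reviews.groupby("user_id").size()
venue_counts = reviews.groupby("restaurant_id").size()

filtered = reviews[
    reviews["user_id"].isin(user_counts[user_counts >= 20].index)
    & reviews["restaurant_id"].isin(venue_counts[venue_counts >= 50].index)
]
print(filtered["user_id"].nunique(),
      filtered["restaurant_id"].nunique())  # 93 users, 410 venues in the paper
```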
5.1. Learned Topic Model

To extract the themes discussed by consumers in eWOM, an STM topic model was developed using the estimated optimum number of topics (K = 30), based on the model's performance metrics in Figure 3, with focus on high coherence, high held-out likelihood, low residuals, and high lower-bound scores.

Figure 3: Topic performance measures for identifying the optimum number of topics. The red circle indicates the selected number of topics (K)

The naming of the topics in Table 1 was based on domain knowledge, the words with the highest probability in each topic, and the words with high Lift scores. Lift gives higher weight to words that appear less frequently in other topics. The probability distribution of topics per review denotes the probability of each topic being discussed in a review, and the probabilities of all topics in each review sum to 1. Reviews are thus associated with a distribution of topic prevalence. The trained STM model's theta values per review refer to the probability that a topic is associated with that review. These theta values, shown in Figure 4, were used as features during the training of the XGBoost model along with the other features.

Figure 4: Average theta values per topic

Table 1
Specified names for the topics that emerged from the STM analysis (words with high probability and lift scores, followed by the assigned topic name)

Topic1 – Entertainment Atmosphere: great, really, music, live, day, atmosphere, enchiladas, music, really, live, pub
Topic2 – Family Restaurant: nice, prices, atmosphere, reasonable, big, family, polite, quick, nice, relaxing, families, cafe
Topic3 – Special Occasion: time, excellent, went, night, amazing, first, every, occasions, stay, went, amaze, week
Topic4 – New Place: eat, new, end, found, places, second, thai, always, second, none
Topic5 – Party Place: lovely, recommend, highly, enjoyed, beautiful, setting, party, setting, party, hosts, fabulous, thoroughly, absolutely
Topic6 – Lunch: well, lunch, local, attentive, wonderful, presented, chose, breaks, attentive, presented, chose
Topic7 – Evening/Bar: evening, bar, friends, though, group, customers, quiet, whiskies, though
Topic8 – Location: restaurant, location, must, beach, view, right, perfect, definitely, must, far
Topic9 – Worth Visiting: visit, will, back, really, worth, definitely, going, visit, called
Topic10 – Wedding Place: wedding, amazing, even, similar, impression, organize, guest, events, beyond, pleasure
Topic11 – Celebration parties: many, birthday, soon, booked, kitchen, also, october, good, love see, year this, flight, travel, celebration
Topic12 – Not Worth: experience, nothing, special, whole, maybe, dining, perfection, fiancée, maybe
Topic13 – Out of town: restaurant, probably, also, mountains, open, available, well, best, more, owner, managers, troodos
Topic14 – Summer location: summer, use, even, range, late, cool, evenings, use, dine, cozy
Topic15 – Fabulous Place: always, can, class, owners, restaurant, number, first, classy, number, varied, feeling, interesting, hidden
Topic16 – Outside eating: two, outside, can, inside, sit, world, get, disappointed, aircon, magic, noise, traffic, heat
Topic17 – Asian Cuisine: thai, tourist, across, gem, trying, partner, avoid, duck, again, overall, always, bespoke, gimmicks, hardcore
Topic18 – Average Place: bit, little, better, average, like, quite, expensive, much, however, criticisms, average
Topic19 – Breakfast: different, small, cheese, also, breakfast, euros, greek, platter, platter, options, vegetarian, bacon, eggs
Topic20 – Disappointment: wife, return, disappointed, restaurant, reviews, favourite, holiday, isn't, trip, done
Topic21 – Stop during trips: old, cypriot, road, stop, village, along, street, waitresses, road, walk
Topic22 – Busy Place: busy, get, people, table, lot, need, without, joyful, early, book
Topic23 – Visit over years: years, restaurant, made, visiting, coming, several, since, ago, forward, visits
Topic24 – Welcoming Staff: staff, friendly, always, see, come, welcoming, feel, chat, truly, smile, come
Topic25 – Good portions: chips, served, priced, set, large, course, portion, adults, chips, portion, reasonably
Topic26 – Traditional foods: best, cyprus, don't, ever, never, restaurants, know, traditional, meze
Topic27 – Value for money: value, money, recommended, excellent, variety, high, meals, bringing, best, cyprus, eaten
Topic28 – Fresh ingredients: couldn't, enough, friend, eat, fresh, away, wow, take, out, basilica, excellent, lovely, rice, more, time, highly
Topic29 – Bad service: ordered, came, table, order, waiter, asked, arrived, minutes, waited, waitress, left, seated, orders, bill
Topic30 – Nothing Special: just, restaurant, basic, way, like, standard, much, full, unacceptable, much, restaurant

5.2. Personality Labelling

The training of the binary classifiers was performed using the Personality Cafe MBTI dataset, which consists of concatenated user posts from a social network, labelled with the personality type determined by the MBTI questionnaire. The dataset is publicly available on Kaggle [41]. To identify the BERT long-text approach with the best classification performance, two techniques were examined, namely the naïve and the semi-naïve approach, and the one with the best performance was used in the workflow. For the naïve approach, we used the head-only method with sentence lengths of 256 and 512 tokens, while for the semi-naïve approach, we chunked the text into 128-word slices and combined their embeddings. The results from this process, presented in Table 2, showed that the 512-token naïve-head approach outperformed the other approaches, and thus it was employed in users' personality classification. The BERT model outperformed personality models trained using the same dataset, which improved our confidence in the personality prediction of each user.

Table 2
Performance results per long-text treatment

Long-text treatment for MBTI BERT      AUC    ACC
Naïve - head, 512 tokens               0.878  0.839
Naïve - head, 256 tokens               0.784  0.759
Semi-naïve - sliced text, 128 tokens   0.653  0.662

The personality distributions in Figure 5 show descriptive statistics of the users' personalities according to the MBTI BERT classifier, fine-tuned using the labelled dataset and treating long text with the naïve-head approach (512 tokens). The acronyms refer to combinations of the dimensions of the MBTI model. The trained BERT model predicts, for each dimension of the personality model, the probability that a user belongs to each of the personality traits (i.e., the probability for Extraversion – Introversion (E/I), Sensing – Intuition (S/N), Thinking – Feeling (T/F), and Judging – Perceiving (J/P)). Combinations of letters from each category generate 16 four-letter personality types: ISFJ, INFP, INFJ, ISTP, ISTJ, ISFP, INTP, INTJ, ENTP, ESFP, ENFP, ESFJ, ESTP, ESTJ, ENFJ, and ENTJ, the distribution of which is depicted in Figure 5. The BERT model comprises 4 classifiers, one for each of the dimensions above. The classifiers' average area under the curve (AUC) performance is 87%. This is an improved performance compared to alternative personality classification techniques that utilize deep learning and the Big Five personality model, such as DeepPerson [34], which achieved an AUC of around 70%.

Figure 5: Distribution of MBTI personality traits using each dimension's acronyms
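To connect the labelling step to the recommendation features, the sketch below scores one user's concatenated reviews with four trait probabilities using the naïve-head 512-token treatment. The saved model directories are hypothetical names for the four fine-tuned dimension classifiers.

```python
# Sketch: label a user with four MBTI trait probabilities using fine-tuned
# per-dimension classifiers and naive head-512 truncation. The directories
# "bert_mbti_EI", etc., are hypothetical saved-model paths.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
user_text = " ".join(["Great meze and a lovely view.",
                      "We waited forty minutes for the bill."])  # one user's reviews

inputs = tokenizer(user_text, truncation=True, max_length=512,
                   return_tensors="pt")

traits = {}
for dim in ("EI", "SN", "TF", "JP"):
    model = AutoModelForSequenceClassification.from_pretrained(f"bert_mbti_{dim}")
    with torch.no_grad():
        logits = model(**inputs).logits
    traits[dim] = torch.softmax(logits, dim=-1)[0, 1].item()  # P(first trait)

print(traits)  # four probabilities fed into the enhanced user-item matrix
```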
5.3. Training and Evaluating the XGBoost Models

The enhanced user-item matrix that emerged from the personality model, the food preferences, and the topic associations per user and venue was used to train an XGBoost regressor model. The XGBoost model underwent hyperparameter tuning prior to training, by tuning the model's learning rate, gamma, subsample, and regularization options using grid search. The traditional recommendation models, namely SVD, SVD++, and NMF, were generated using the surprise Python library. The models were compared based on the following performance metrics: the mean absolute error (MAE), which represents the average of the absolute differences between the real and predicted values; the mean squared error (MSE); and the root mean squared error (RMSE), which is the square root of the MSE. Comparison of the personality-based models against the traditional recommendation techniques revealed an improved performance over these baseline models. The traditional techniques were also optimized by tuning two hyperparameters, the number of factors and the regularization value.

In the experiments conducted using the aforementioned restaurant reviews, the data was initially split into test and training sets (70/30) using stratified sampling, to guarantee that all user ratings are sufficiently represented in both samples. The models were hyper-tuned, trained, and tested using the same samples. The aforementioned metrics were computed, and the results (see Table 3) show that the MBTI XGBoost model produced the best performance among all models. Both personality-based models outperformed the traditional approaches, which indicates that the use of personality and eWOM-extracted topics improved the recommendations.

Table 3
Performance results per model incorporating all features (lower is better)

Performance metric               SVD   SVD++  NMF   XGB
Mean Absolute Error (MAE)        0.65  0.68   0.82  0.40
Mean Squared Error (MSE)         0.87  0.89   1.22  0.24
Root Mean Squared Error (RMSE)   0.93  0.94   1.10  0.49

6. Conclusions

This study proposes a restaurant recommendation approach that combines user preferences with user personality and constitutes one of the first studies to use customer preferences along with personality in the restaurant recommendation problem. It utilizes a popular personality model (MBTI) to enhance the restaurant recommendation process by fine-tuning a BERT classification model on a personality-labelled dataset. Due to the length of the training texts, the best long-text handling approach (naïve-head, 512 tokens) was employed during BERT model tuning. EWOM themes are extracted from eWOM text through topic modelling and are used as additional features of restaurants and users, referring to the implicit preferences of users and the properties of restaurants. All aforementioned features are used collectively to train an XGBoost regressor to predict consumers' satisfaction (i.e., rating) with unvisited restaurants. The results show that the MBTI model, in combination with topics from eWOM, outperforms the model-based collaborative filtering techniques, offering a first indication that the application of personality and food preferences to restaurant recommendation can produce valuable results. Future work will focus on evaluating additional long-text handling techniques and on combining the results of the learned classifiers with other traditional machine learning models in an ensemble manner, to further improve the performance of personality classification, given that personality is a valuable feature that enhances restaurant recommendation.

References

[1] Y. G. Kim, A. Eves, C. Scarles, Building a model of local food consumption on trips and holidays: A grounded theory approach, International Journal of Hospitality Management 28 (2009) 423–431. doi:10.1016/j.ijhm.2008.11.005.
[2] K.-H. Min, T. J. Lee, Customer Satisfaction with Korean Restaurants in Australia and Their Role as Ambassadors for Tourism Marketing, Journal of Travel & Tourism Marketing 31 (2014) 493–506. doi:10.1080/10548408.2013.877412.
[3] C. Anderson, A survey of food recommenders, CoRR abs/1809.02862 (2018). URL: http://arxiv.org/abs/1809.02862.
[4] S. B. Hegde, S. Satyappanavar, S. Setty, Sentiment based Food Classification for Restaurant Business, in: 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2018, pp. 1455–1462. doi:10.1109/ICACCI.2018.8554794.
[5] E. Asani, H. Vahdat-Nejad, J. Sadri, Restaurant recommender system based on sentiment analysis, Machine Learning with Applications 6 (2021) 100114. doi:10.1016/j.mlwa.2021.100114.
[6] J. Gountas, S. Gountas, Personality orientations, emotional states, customer satisfaction, and intention to repurchase, Journal of Business Research 60 (2007) 72–75. doi:10.1016/j.jbusres.2006.08.007.
[7] Z. Shahbazi, Y.-C. Byun, Product Recommendation Based on Content-based Filtering Using XGBoost Classifier, International Journal of Advanced Science and Technology 29 (2020) 6979–6988. URL: https://www.researchgate.net/publication/342864588.
[8] F. Celli, B. Lepri, Is big five better than MBTI? A personality computing challenge using Twitter data, CEUR Workshop Proceedings 2253 (2018).
[9] E. Christodoulou, A. Gregoriades, M. Pampaka, H. Herodotou, Personality-Informed Restaurant Recommendation, in: A. Rocha, H. Adeli, G. Dzemyda, F. Moreira (Eds.), World Conference on Information Systems and Technologies, Springer International Publishing, Cham, 2022, pp. 13–21.
[10] A. Gregoriades, M. Pampaka, M. Georgiades, A Holistic Approach to Requirements Elicitation for Mobile Tourist Recommendation Systems, in: K. Arai, R. Bhatia (Eds.), Future of Information and Communication Conference, Springer International Publishing, Cham, 2020, pp. 857–873.
[11] S. Malik, A. Rana, M. Bansal, A Survey of Recommendation Systems, Information Resources Management Journal 33 (2020) 53–73. doi:10.4018/IRMJ.2020100104.
[12] A. Ansari, S. Essegaier, R. Kohli, Internet Recommendation Systems, Journal of Marketing Research 37 (2000) 363–375. doi:10.1509/jmkr.37.3.363.18779.
[13] M. H. Amirhosseini, H. Kazemian, Machine learning approach to personality type prediction based on the Myers–Briggs type indicator®, Multimodal Technologies and Interaction 4 (2020) 9. doi:10.3390/mti4010009.
[14] M. Nilashi, O. bin Ibrahim, N. Ithnin, N. H. Sarmin, A multi-criteria collaborative filtering recommender system for the tourism domain using Expectation Maximization (EM) and PCA–ANFIS, Electronic Commerce Research and Applications 14 (2015) 542–562. doi:10.1016/j.elerap.2015.08.004.
[15] N. Silva, D. Carvalho, A. C. Pereira, F. Mourão, L. Rocha, The Pure Cold-Start Problem: A deep study about how to conquer first-time users in recommendations domains, Information Systems 80 (2019) 1–12. doi:10.1016/j.is.2018.09.001.
[16] S. Natarajan, S. Vairavasundaram, S. Natarajan, A. H. Gandomi, Resolving data sparsity and cold start problem in collaborative filtering recommender system using Linked Open Data, Expert Systems with Applications 149 (2020). doi:10.1016/j.eswa.2020.113248.
[17] Y. Koren, R. Bell, C. Volinsky, Matrix Factorization Techniques for Recommender Systems, Computer 42 (2009) 30–37. doi:10.1109/MC.2009.263.
[18] S. Dhelim, N. Aung, M. A. Bouras, H. Ning, E. Cambria, A Survey on Personality-Aware Recommendation Systems, Artificial Intelligence Review 55 (2022) 2409–2454. doi:10.1007/s10462-021-10063-7.
[19] L. Sun, J. Guo, Y. Zhu, Applying uncertainty theory into the restaurant recommender system based on sentiment analysis of online Chinese reviews, World Wide Web 22 (2019) 83–100. doi:10.1007/s11280-018-0533-x.
[20] G. B. Herwanto, A. M. Ningtyas, Recommendation system for web article based on association rules and topic modelling, Bulletin of Social Informatics Theory and Application 1 (2017) 26–33. doi:10.31763/businta.v1i1.36.
[21] C. Zhang, H. Zhang, J. Wang, Personalized restaurant recommendation method combining group correlations and customer preferences, Information Sciences 454-455 (2018) 128–143. doi:10.1016/j.ins.2018.04.061.
[22] W.-Z. Su, P.-H. Lin, A Study of Relationship Between Personality and Product Identity, in: Lecture Notes in Computer Science, volume 9741, 2016, pp. 266–274. doi:10.1007/978-3-319-40093-8_27.
[23] H. Wang, Y. Zuo, H. Li, J. Wu, Cross-domain recommendation with user personality, Knowledge-Based Systems 213 (2021) 106664. doi:10.1016/j.knosys.2020.106664.
[24] W. Wu, L. Chen, Y. Zhao, Personalizing recommendation diversity based on user personality, User Modeling and User-Adapted Interaction 28 (2018) 237–276. doi:10.1007/s11257-018-9205-x.
[25] R. P. Karumur, T. T. Nguyen, J. A. Konstan, Personality, User Preferences and Behavior in Recommender systems, Information Systems Frontiers 20 (2018) 1241–1265. doi:10.1007/s10796-017-9800-0.
[26] G. J. Boyle, Myers-Briggs Type Indicator (MBTI): Some Psychometric Limitations, Australian Psychologist 30 (1995) 71–74. doi:10.1111/j.1742-9544.1995.tb01750.x.
[27] J. F. Salgado, The Big Five personality dimensions and counterproductive behaviors, International Journal of Selection and Assessment 10 (2002) 117–125. doi:10.1111/1468-2389.00198.
[28] S. V. Paunonen, Big Five Factors of Personality and Replicated Predictions of Behavior, Journal of Personality and Social Psychology 84 (2003) 411–424. doi:10.1037/0022-3514.84.2.411.
[29] F. Mairesse, M. A. Walker, M. R. Mehl, R. K. Moore, Using linguistic cues for the automatic recognition of personality in conversation and text, Journal of Artificial Intelligence Research 30 (2007) 457–500. doi:10.1613/jair.2349.
[30] J. W. Pennebaker, M. E. Francis, R. J. Booth, Linguistic inquiry and word count: LIWC 2001, Mahway: Lawrence Erlbaum Associates 71 (2001) 2001.
[31] S. Kardakis, I. Perikos, F. Grivokostopoulou, I. Hatzilygeroudis, Examining attention mechanisms in deep learning models for sentiment analysis, Applied Sciences (Switzerland) 11 (2021). doi:10.3390/app11093883.
[32] H. Jun, L. Peng, J. Changhui, L. Pengzheng, W. Shenke, Z. Kejia, Personality Classification Based on Bert Model, in: Proceedings of 2021 IEEE International Conference on Emergency Science and Information Technology, ICESIT 2021, 2021, pp. 150–152. doi:10.1109/ICESIT53460.2021.9697048.
[33] S. Stajner, S. Yenikent, A Survey of Automatic Personality Detection from Texts, in: Proceedings of the 28th International Conference on Computational Linguistics, 2020, pp. 6284–6295. doi:10.18653/v1/2020.coling-main.553.
[34] K. Yang, R. Y. Lau, A. Abbasi, Getting Personal: A Deep Learning Artifact for Text-Based Measurement of Personality, Information Systems Research (2022).
[35] G. Popovski, B. K. Seljak, T. Eftimov, A Survey of Named-Entity Recognition Methods for Food Information Extraction, IEEE Access 8 (2020) 31586–31594. doi:10.1109/ACCESS.2020.2973502.
[36] S. I. Nikolenko, S. Koltcov, O. Koltsova, Topic modelling for qualitative studies, Journal of Information Science 43 (2017) 88–102. doi:10.1177/0165551515617393.
[37] M. E. Roberts, B. M. Stewart, D. Tingley, C. Lucas, J. Leder-Luis, S. K. Gadarian, B. Albertson, D. G. Rand, Structural topic models for open-ended survey responses, American Journal of Political Science (2014). doi:10.1111/ajps.12103.
[38] S. S. Keh, I.-T. Cheng, Myers-Briggs Personality Classification and Personality-Specific Language Generation Using Pre-trained Language Models, 2019. URL: https://arxiv.org/abs/1907.06333. doi:10.48550/ARXIV.1907.06333.
[39] N. Majumder, S. Poria, A. Gelbukh, E. Cambria, Deep Learning-Based Document Modeling for Personality Detection from Text, IEEE Intelligent Systems 32 (2017) 74–79. doi:10.1109/MIS.2017.23.
[40] C. Sun, X. Qiu, Y. Xu, X. Huang, How to Fine-Tune BERT for Text Classification?, 2019. doi:10.48550/ARXIV.1905.05583.
[41] Kaggle, (MBTI) Myers-Briggs Personality Type Dataset, 2017. URL: https://www.kaggle.com/datasets/datasnaek/mbti-type.