Using Sentiment Text Analysis of User Reviews in Social Media for E-Tourism Mobile Recommender Systems

Olga Artemenko¹, Volodymyr Pasichnyk², Nataliia Kunanets², Khrystyna Shunevych²
¹ PHEI “Bukovinian University”, Chernivtsi, Ukraine
² Lviv Polytechnic National University, Lviv, Ukraine
olga.hapon@gmail.com, vpasichnyk@gmail.com, nek.lviv@gmail.com, krishirak@gmail.com

Abstract. This paper describes the main current trends in the design and development of e-tourism recommender systems that apply sentiment analysis to user-generated content in social media. The main goal is to systematize and summarize knowledge about the possibilities of using tourists' reviews in social media as a type of e-tourism big data for mobile e-tourism recommender systems, and in particular to analyze the sources and types of tourist feedback data, messages and comments that a tourist generates with his or her gadget and that can be related to e-tourism big data. Developing efficient tools for analyzing e-tourism user comments and feedback in social media, combining the advantages of big data technologies, NLP and smartphone services, can provide e-tourism recommender systems with new and better ways of creating more personalized recommendations.

Keywords: e-tourism, mobile recommender systems, trip support, content analysis, sentiment text analysis

1 Introduction

There is increasing interest in the use of content created by consumers of hospitality and tourism services, in particular on social networks and video hosting platforms. With this content, the structure and dynamics of tourists' preferences can be tracked and analyzed, and information can be obtained about the image and reputation of a tourist product as well as about the behavior of tourists while traveling. The feedback received from tourists is not only useful for business but can also be used by recommender applications as one of the sources for evaluating alternative items.
Two popular e-advice websites, Booking and TripAdvisor, have hosted users' opinions for decades, but these opinions are heavily moderated. Moreover, not every user leaves feedback on tourism-related review platforms, whereas most users have a profile on one or more social networks, where they publish different aspects of their lives, tourist experience included.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

2 E-Tourism Recommender Systems

Recommender systems are a class of intelligent information retrieval systems designed to select, from an abundance of information resources, exactly those data items that best meet the interests of a particular user [1]. Diverse e-tourism recommender systems are developing intensively and are very popular among users, but the problem of obtaining better, faster and more personalized recommendations remains open. One resource for improvement is the use of tourists' reviews and comments in social media as an additional source for recommendations.

E-tourism recommender systems can be classified according to different characteristics, such as architecture, information technology platform, target audience, methods used for generating recommendations, main tasks to be solved, etc. [2].

Fig. 1. General classification of e-tourism recommender systems (classification branches: platform, knowledge base, users, methodology, function)

Successful operation of mobile e-tourism recommender systems rests on complete, timely and correct data processing.
Input and output data have certain peculiarities that should be taken into account when designing tourist recommender systems, in particular mobile ones. There are three main sources of input data for a mobile e-tourism recommender system:
1) The user as an information source: the user generates queries, leaves feedback and disseminates messages about himself or herself on social networks. All smart tourism technologies nowadays follow the paradigms "tourist as a sensor" and "every tourist is an expert".
2) The user's gadget itself: information about the tourist's surroundings, contextual data, the appearance or disappearance of various obstacles in the operational space, etc.
3) Internet content and the Internet of Things: data from reference resources, both tourist and external, such as work schedules, lists of tourist places and establishments and public transport timetables, as well as web search data, the user's browsing history, online booking data, and more.

Fig. 2. Processing user generated big data in mobile e-tourism recommender systems (inputs: GPS data, roaming data, Bluetooth and other sensor data, the user info profile including feedback, reviews and comments, comments and messages from other users, tourist services, Internet of Things, transport services, meteorological data and other background information; stages: data collection and consolidation technologies, data preparation and modification technologies, data analysis technologies including sentiment text analysis; output: recommendations)

3 Analysis of User Generated Content

The gadget user is a powerful source of information for e-tourism software products, and the data he or she produces belongs to the e-tourism big data category [3]. The development of technology and the phenomenon of social networks have led to the emergence of new concepts, ways, rules and habits of disseminating information in the digital world.
Hashtags, emoji, geo-positioning, photo and video content, live streams and pages have complemented the classic textual content, which used to be the main source of tourist feedback [4]. In particular, the text message that a gadget user leaves as a review of a consumed tourism product has changed [5-8].
1. The user response has become shorter. First, some systems, such as Twitter, limit the number of characters in a single message; second, the smartphone screen is small, and there is an unspoken rule that "what does not fit on one screen will not be read" [9].
2. The space for posting reviews has also changed. Traditionally, users left posts on specialized sites, travel forums, travel agency blogs, and so on [10]. To do this, the user either logged in or left an anonymous comment. Over the last decade, however, a tourist with a gadget has been able to leave a comment anywhere in the social media space [11].
3. The structure of the comment has also changed: text is now supplemented or even partially replaced by graphic, audio and video content. Emoji, stickers and animated elements convey the emotional tinge of user feedback. Video stories and live streams may contain text captions and subtitles that make the response more informative [12-14].
4. The user may not plan or prepare the response text in advance: the story may be devoted to a completely different topic, and the user's impression of the consumed tourist product may "slip through" among other things. Such reviews are the hardest to track, but they also shape the reputation of the tourist product [15].
5. Using a hashtag for text and geo-positioning for images and videos makes it possible to identify the tourism product uniquely [16], which makes the data easier to find and analyze.
6. Personalized feedback to the author of the review has become possible. From the official profile of the tourist product, its owner can reply to the tourist's post with thanks for positive feedback or an apology in case of complaints [17-18].
In this way, thanks to the social media space, the seller of the product can reach the customer's information territory and attract the attention of the customer and of his or her social circle. The user profile in the recommender application can also be supplemented with new facts from reviews.
7. The language used by the tourist has changed. On forums and on the official pages of a tourist attraction (the classic space for reviews), one language is used as a rule, or, in the case of regional information resources, two: English and the language of the region [19]. The social media space lets users express their thoughts in whichever language suits them best [20]. A tourism product offered in one country is therefore likely to have reviews in five, ten or more languages, which complicates the analysis of the text.

Therefore, travel product reviews need to be collected not only on specialized resources but also, increasingly, in the social media space. Analyzing the sentiment of such reviews is complicated, first, by multilingualism and, second, by the presence of graphic elements such as emoji and stickers. Consumer feedback now has to be followed on the users' own territory, that is, on social media. Accordingly, the sentiment analysis of tourist feedback on tourist products is not only a source of data for mobile e-tourism recommender systems; it also turns from a classic text-mining task into a big data analysis task that is not limited to text [21].

Finding and retrieving useful information from user reviews of a tourism product in the social media space poses a number of challenges to developers of recommender applications. In particular:
1. How should the sentiment of an emoji in a review be interpreted? Does negative content relate to the user's mood, the weather, the day in general, or the quality of the tourist product consumed that day?
Could it instead be a comparison with, or a description of, an earlier bad experience with another product? Should emoji be treated as equivalent to keywords in reviews?
2. How can situational and implicit reviews (with no hashtag or location) be tracked and consolidated reliably?
3. How can text content be extracted effectively from photos, videos and audio messages?
4. Should the publication of a tourist photo or video that relates to the tourist product but has no accompanying text be considered feedback, and if so, how should it be classified: as positive, negative or neutral?

These and other problems need to be solved in order to create efficient sentiment analysis technologies for mobile e-tourism recommender systems.

4 Using Sentiment Text Analysis for E-Tourism Technologies

Natural Language Processing (NLP) is a field of computer science that studies automatic ways of processing natural language. Sentiment text analysis is a fast-growing part of NLP [22]. Automatic processing of e-tourism text data is becoming more complicated because of the large amount of content that users generate every minute. The key to solving this problem lies in combining the advantages of big data technologies, NLP and smartphone services [23].

Fig. 3. Using web search data and social media feedback texts to predict tourists' preferences

Different domains and types of texts have different information extraction requirements and thus require different NLP tasks and tools [24]. Developing efficient tools for sentiment analysis of one specific type of text, namely e-tourism user comments and feedback in social media, can provide e-tourism recommender systems with new and better ways of creating more personalized recommendations.

There is ongoing research and discussion on the mechanisms behind tourists' reviewing behavior, on its connection with the online reputation of sites, hotels, attractions and destinations, and on how this affects tourist behavior and purchasing decisions.
Social media feedback data bring new context and new challenges to this topic, but they also bring new perspectives and resources.

5 Sentiment Text Analysis of E-Tourism User Reviews

In the first stage, a list of the key components of a review was compiled (keywords, hashtags and emoji), and rules for interpreting punctuation were drawn up. The keywords were divided into sentiment classes and by language (Ukrainian, English and Romanian), since tourists use these three languages to give feedback on Bukovina tourist services, as shown in Tables 1, 2 and 3.

Table 1. Fragment of Keywords list (Ukrainian)
Дуже позитивні | Позитивні | Нейтральні | Негативні | Дуже негативні
x1 дуже сильно сподобалось | y1 прикольно | z1 аби як | a1 не сподобалось | b1 нічого не сподобалось
x2 надзвичайно сподобалось | y2 круто | z2 50/50 | a2 звичайні відчуття | b2 зовсім не сподобалось
x3 дуже романтичні відчуття | y3 сподобалось | z3 непогані відчуття | a3 не романтичні відчуття | b3 погані відчуття
x4 надзвичайно гарно | y4 нормальні відчуття | z4 може бути | a4 не гарно | b4 зовсім не гарно
x5 дуже гарно | y5 все сподобалось | z5 не зовсім сплановано | a5 не сплановано | b5 нічого не гарно
x6 дуже-дуже гарно | y6 романтичні відчуття | z6 досить цікаво | a6 ніяково | b6 зовсім не зручно

Table 2. Fragment of Keywords list (Romanian)
Comentariile utilizatorilor
Foarte pozitive | Pozitive | Neutru | Negative | Foarte negative
mi-a plăcut foarte mult | mi-a plăcut | 50/50 | nu mi-a plăcut | nu mi-a plăcut nimic
sentimente foarte romantice | senzații normale | nu-i rău | senzații obișnuite | nu mi-a plăcut deloc
foarte frumos | totul a plăcut | poate | nu e bine | Senzație de rău
foarte, foarte frumos | sentimente romantice | nu chiar planificat | nu este planificat | nu e frumos
bine planificat | frumos | fără precedent | incomod | nimic nu este bun
extrem de convenabil | senzații bune | nu chiar confortabil | nu mă interesează | nu este convenabil
foarte interesant | destul de frumos | destul de gândit | nici o impresie | nimic nu este gândit
incredibil de interesant | destul de bine | nu am regretat. | așteptările nu s-au îndeplinit | nemulțumiți

Table 3. Fragment of Keywords list (English)
User feedback
Very positive | Positive | Neutral | Negative | Very negative
I liked it very much | cool | to how | not like | I didn't like anything
I really liked it | hard-boiled | 50/50 | normal feelings | I didn't like it at all
very romantic feelings | liked | good feeling | not romantic feelings | bad feelings
Very beautiful | sensations are normal | maybe | not like | not pretty at all
very, very beautiful | romantic feeling | reasonably interesting | confusedly | not at all convenient
well-planned | nicely | not very convenient | uncomfortable | not at all interesting
extremely convenient | quite beautiful | elaborate | discomfort | no way

Since hashtags cannot be written in Protégé with "#", we wrote them with the letter "h" (the prefix "h_"). The hashtags were divided into "Very Positive", "Positive" and "Neutral" and by language (Ukrainian, English and Romanian), as shown in Tables 4, 5 and 6.

Table 4. Fragment of hashtags list (Ukrainian)
Хештеги користувачів
Дуже позитивні | Позитивні | Нейтральні | Ім’я змінної
h_дуже | h_щастя | h_мандрівка | h.x1
h_дужесмачно | h_щастяє | h_мандруй | h.x2
h_дужевесело | h_цікавімісця | h_мандриукраїною | h.x3
h_дужедешево | h_щастявдрібницях | h_мандрівки | h.x4
h_дужекруто | h_щастяпоруч | h_мандрівник | h.x5
h_дужекрасиво | h_щастя_є | h_мандрівниця | h.x6
h_дужегарно | h_щасливі | h_мандрівники | h.x7
h_дужедуже | h_щасливийдень | h_мандруйдешевше | h.x8
h_дужегарнемісто | h_щастявпростихречах | h_мандруй_сміливо | h.x9
h_веселуха | h_щасття | h_мандруємоукраїною | h.x10
h_супер | h_весело | h_мандрівникиукраїною | h.x11
h_божественно | h_цікаво | h_мандруюукраїною | h.x13
h_чудовийдень | h_цікаваукраїна | h_мандриподорожі | h.x14
h_чудово | h_цікавімісцяукраїни | h_мандруватилегко | h.x15
h_чудовийнастрій | | h_мандруй_з_нами | h.x16
h_чудовийранок | | h_мандруй_активно | h.x17
h_чудовий_день | | h_подорож | h.x18
h_чудовийвечір | | h_подорожі | h.x19
h_класно | | h_подорожіукраїною | h.x20
 | | h_подорожуйукраїною | h.x21
 | | h_подорожі_україною | h.x22
 | | h_подорожуємо | h.x23
 | | h_подорожуйзнами | h.x24
 | | h_подорожувати | h.x25
 | | h_подорожуєморазом | h.x26
 | | h_подорожуй_україною | h.x27
 | | h_подорожуючиукраїною | h.x28

Table 5. Fragment of hashtags list (Romanian)
Hashtag-urile utilizatorului
Foarte pozitive | Pozitive | Neutru
h_foartegustos | h_fericire | h_calator
h_foartevesel | h_fericirea | h_calatoreste
h_foarteieftin | h_fericit | h_calatorestecudrag
h_foarte_frumos | h_fericireaexista | h_calatorii
h_foarte_tare | h_interesant | h_calatori
h_foarte_amuzant | h_interesantelocuri | h_calatorie
h_foartefoarte | h_fericireainlucrurisimple | h_calatorintaramea
h_vesel | h_fericireainlucrurimici | h_calatorinromania
h_divin | h_fericireainlucrurimarunte | h_calatoriicugust
h_perfect | | h_calator_in_romania
h_perfectazi | | h_calator_in_tara_mea
h_perfectadimineata | | h_calator_prin_lume
h_perfecta_zi | | h_calator_prin_romania
h_perfectaseara | | h_calatori_in_viata
h_perfecta_dimineata | | h_calatori_prin_lume
h_pefecta_seara | | h_calatorii_cu_zambet
 | | h_itur

Table 6. Fragment of hashtags list (English)
User hashtags
Very positive | Positive | Neutral
h_very | h_happy | h_journey
h_very_delicious | h_happytime | h_travel
h_very_fun | h_happiness | h_traveling
h_very_cheap | h_happiness_in_the_little_things | h_travelling
h_very_good | h_happiness_nearby | h_travels
h_very_goood | h_happiness_exists | h_traveller
h_super | h_happy_moments | h_traveler
h_wonderful | h_happy_day | h_travel_drops
h_wonderful_location | h_happy_night | h_travelbodldy
h_wonderful_vacations | h_happy_morning | h_travel_drops_
h_wonderful_day | h_fun | h_travel_capture
h_wonderful_night | h_interesting | h_travel_europe
h_wonderful_morning | h_interesting_places | h_tarvel_captures
h_wonderful_mood | | h_travel_
h_wonderfulvacations | | h_travel_tourist
h_wonderfulday | | h_travel_life
h_wonderfulnight | | h_lifesjourney
h_wonderfulmorning | | h_thejourney
h_wonderfulmood | | h_journeys
h_very_nice | | h_travel_wonderful
h_very_beautiful | | h_travel_world
h_very_delicious_food | | h_travel_magic
h_cool | | h_travel_love
 | | h_travel_time
 | | h_travel_is_life
 | | h_travellife
 | | h_travelgoals

Since we cannot add emoji to Protégé, we
wrote them with the letter "e" (the prefix "e_"). Emoji were divided into "Very Positive", "Positive", "Neutral", "Negative" and "Very Negative" (Table 7).

Table 7. Fragment of Emoji list
Transcription | Variable name
e_дуже сильно сподобалось | e.x1
e_надзвичайно гарно | e.x2
e_дуже задоволені | e.x3
e_дуже романтичні відчуття | e.x4
e_найкращі емоції | e.x5
e_дуже весело | e.x6
e_на високому рівні | e.x7
e_розкішно | e.x8

Punctuation marks express divisions of written language that cannot be conveyed either by morphological means or by word order.

An exclamation mark (!) is a punctuation mark placed at the end of a sentence to express outrage, an appeal, strong feelings, anxiety, and more. It can also be doubled, tripled or repeated many times, against the rules of grammar, to convey greater expressiveness and emotion.

A question mark (?) is a punctuation mark usually placed at the end of a sentence to express a question or doubt.

In user reviews, question marks and exclamation marks are very common. They can occur in positive as well as negative feedback; everything depends on whether the words found before the punctuation marks are positive or negative keywords. Users use punctuation to express satisfaction or displeasure with tourist services.

If a positive keyword is followed by three exclamation marks, the keyword is assigned to very positive feedback, but if a positive keyword is followed by three question marks, it is assigned to very negative feedback. A single exclamation mark after a neutral keyword means that the keyword is assigned to positive feedback, while a single question mark after a negative keyword means that the keyword is assigned to neutral feedback.

For example, if a user posts the comment "Like it!!!", the keyword is not positive but very positive, because it is followed by three exclamation marks. If the user leaves the comment "Dear!!!" ("dear" in the sense of "expensive"), the keyword is assigned not to negative but to very negative feedback. If the user leaves the comment "Why is it so expensive?", the keyword is "expensive", and since it is followed by one question mark, the review is assigned to neutral feedback.

Based on the tables of keywords, hashtags and emoji, a hierarchy of ontology classes and subclasses was built in the Protégé software. Classes in Protégé are displayed in the Class Hierarchy view. First, the base classes were created according to the hierarchy; then instances were created for each class, as shown in Fig. 4.

Fig. 4. Instances (Ukrainian, Romanian, English)

Ontology properties were then created, with domains and ranges corresponding to the hierarchical ontology (Fig. 5).

Fig. 5. Specific relations between classes (Ukrainian, Romanian, English)

6 Conclusions

This study is an attempt to systematize and summarize knowledge about the possibilities of using tourists' reviews in social media as a type of e-tourism big data for mobile e-tourism recommender systems, and in particular to analyze the sources and types of tourist feedback data, messages and comments that a tourist generates with his or her gadget and that can be related to e-tourism big data.

References

1. Ricci, F., Rokach, L., & Shapira, B.: Recommender Systems Handbook: Second Edition. Springer Science+Business Media New York, p. 1003. (2015)
2. Artemenko, O., Kunanets, O., & Pasichnyk, V.: E-tourism recommender systems: a survey and development perspectives. Econtechmod: an international quarterly journal, Vol. 6, No. 2, 91–95. (2017)
3. Li, J., Xu, L., Tang, L., Wang, S., & Li, L.: Big data in tourism research: A literature review. Tourism Management, 68, 301–323. (2018). doi:10.1016/j.tourman.2018.03.009
4. Su, K.-W., Liu, C.-L., & Wang, Y.-W.: A principle of designing infographic for visualization representation of tourism social big data. Journal of Ambient Intelligence and Humanized Computing. (2018). doi:10.1007/s12652-018-1104-9
5.
Guan, D., & Du, J.: Cross-Media Big Data Tourism Perception Research Based on Multi-Agent. Lecture Notes in Electrical Engineering, 353–360. (2015). doi:10.1007/978-3-662-48386-2_37
6. Delic, A., Neidhardt, J., Nguyen, T. N., & Ricci, F.: An observational user study for group recommender systems in the tourism domain. Information Technology & Tourism, 19(1-4), 87–116. (2018). doi:10.1007/s40558-018-0106-y
7. Huertas, A.: How live videos and stories in social media influence tourist opinions and behaviour. Information Technology & Tourism, 19(1-4), 1–28. (2018). doi:10.1007/s40558-018-0112-0
8. Sertkan, M., Neidhardt, J., & Werthner, H.: What is the “Personality” of a tourism destination? Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0135-6
9. Iinuma, S., Nanba, H., & Takezawa, T.: Investigating the effectiveness of computer-produced summaries obtained from multiple travel blog entries. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0132-9
10. Lalicic, L., Huertas, A., Moreno, A., & Jabreel, M.: Which emotional brand values do my followers want to hear about? An investigation of popular European tourist destinations. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0134-7
11. Höpken, W., Eberle, T., Fuchs, M., & Lexhagen, M.: Google Trends data for analysing tourists’ online search behaviour and improving demand forecasting: the case of Åre, Sweden. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0129-4
12. Scaglione, M., Johnson, C., & Favre, P.: As time goes by: last minute momentum booking and the planned vacation process. Information Technology & Tourism, 21(1), 9–22. (2018). doi:10.1007/s40558-018-0133-8
13. Leung, D.: Denis Trček: Trust and reputation management systems: an e-business perspective. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0118-7
14. Rossetti, M., Stella, F., & Zanker, M.: Analyzing user reviews in tourism with topic models. Information Technology & Tourism, 16(1), 5–21. (2015). doi:10.1007/s40558-015-0035-y
15. Krawczyk, M., & Xiang, Z.: Perceptual mapping of hotel brands using online reviews: a text analytics approach. Information Technology & Tourism, 16(1), 23–43. (2015). doi:10.1007/s40558-015-0033-0
16. García-Pablos, A., Cuadros, M., & Linaza, M. T.: Automatic analysis of textual hotel reviews. Information Technology & Tourism, 16(1), 45–69. (2015). doi:10.1007/s40558-015-0047-7
17. Francalanci, C., & Hussain, A.: Discovering social influencers with network visualization: evidence from the tourism domain. Information Technology & Tourism, 16(1), 103–125. (2015). doi:10.1007/s40558-015-0030-3
18. Tao, M., Nawaz, M. Z., Nawaz, S., Butt, A. H., & Ahmad, H.: Users’ acceptance of innovative mobile hotel booking trends: UK vs. PRC. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0123-x
19. Fazzolari, M., & Petrocchi, M.: A study on online travel reviews through intelligent data analysis. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0121-z
20. Zhang, W., & Fesenmaier, D. R.: Assessing emotions in online stories: comparing self-report and text-based approaches. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0122-y
21. Ghahramani, L., Khalilzadeh, J., & KC, B.: Tour guides’ communication ecosystems: an inferential social network analysis approach. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0114-y
22. Van der Zee, E., & Bertocchi, D.: Finding patterns in urban tourist behaviour: a social network analysis approach based on TripAdvisor reviews. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0128-5
23. Al-Ghossein, M., Abdessalem, T., & Barré, A.: Open data in the hotel industry: leveraging forthcoming events for hotel recommendation. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0119-6
24. Qi, S., Wong, C. U. I., Chen, N., Rong, J., & Du, J.: Profiling Macau cultural tourists by using user-generated content from online social media. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0120-0
25. Chan, I. C. C., & Law, R.: Tanja Schneider, Karin Eli, Catherine Dolan, and Stanley Ulijaszek (editors): Digital food activism. Information Technology & Tourism. (2018). doi:10.1007/s40558-018-0117-8
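
Illustrative sketch. The punctuation rules stated in Section 5 can be expressed as a small lookup table. The Python sketch below is only an illustration of those rules, not the implementation used in the study, which relied on an ontology built in Protégé. The three-entry lexicon, the function names classify and encode_hashtag, and the decision to fall back to the keyword's base class when the punctuation is not covered by a stated rule are all assumptions made for this example.

```python
import re

# Hypothetical mini-lexicon; the keyword tables in the paper are far larger
# and cover Ukrainian, Romanian and English.
BASE_CLASS = {
    "like it": "positive",
    "expensive": "negative",
    "maybe": "neutral",
}

# (base class of keyword, punctuation following it) -> adjusted class.
# These pairs follow the rules and examples given in Section 5.
PUNCTUATION_RULES = {
    ("positive", "!!!"): "very positive",
    ("positive", "???"): "very negative",
    ("neutral", "!"): "positive",
    ("negative", "?"): "neutral",
    ("negative", "!!!"): "very negative",
}


def encode_hashtag(tag):
    """Rewrite '#tag' as 'h_tag', mirroring the 'h' prefix used in the tables."""
    return "h_" + tag.lstrip("#").lower()


def classify(review):
    """Return the sentiment class of the first lexicon keyword found, or None.

    The run of '!' or '?' directly after the keyword (optionally separated by
    whitespace) selects a rule; without a matching rule the keyword keeps its
    base class (an assumption of this sketch).
    """
    text = review.lower()
    for keyword, base in BASE_CLASS.items():
        match = re.search(re.escape(keyword) + r"\s*([!?]*)", text)
        if match:
            return PUNCTUATION_RULES.get((base, match.group(1)), base)
    return None


if __name__ == "__main__":
    for sample in ["Like it!!!", "Why is it so expensive?", "Expensive !!!"]:
        print(sample, "->", classify(sample))
    print(encode_hashtag("#Travel"))
```

A lookup keyed on the pair (base class, punctuation run) keeps each stated rule visible as one table entry, so rules for other punctuation counts could be added without changing the matching code.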