=Paper=
{{Paper
|id=Vol-2631/paper34
|storemode=property
|title=Analysis and Estimation of Popular Places in Online Tourism Based on Machine Learning Technology
|pdfUrl=https://ceur-ws.org/Vol-2631/paper34.pdf
|volume=Vol-2631
|authors=Yurii Tverdokhlib,Vasyl Andrunyk,Liliya Chyrun,Lyubomyr Chyrun,Nataliya Antonyuk,Ivan Dyyak,Oleh Naum,Dmytro Uhryn,Vitor Basto-Fernandes
|dblpUrl=https://dblp.org/rec/conf/momlet/TverdokhlibACCA20
}}
==Analysis and Estimation of Popular Places in Online Tourism Based on Machine Learning Technology==
Yurii Tverdokhlib1, Vasyl Andrunyk2[0000-0003-0697-7384], Liliya Chyrun3[0000-0003-4040-7588], Lyubomyr Chyrun4[0000-0002-9448-1751], Nataliya Antonyuk5[0000-0002-6297-0737], Ivan Dyyak6[0000-0001-5841-2604], Oleh Naum7[0000-0001-8700-6998], Dmytro Uhryn8[0000-0003-4858-4511], Vitor Basto-Fernandes9[0000-0003-4269-5114]

1-3Lviv Polytechnic National University, Lviv, Ukraine; 4-6Ivan Franko National University of Lviv, Lviv, Ukraine; 5University of Opole, Opole, Poland; 7Drohobych Ivan Franko State Pedagogical University, Drohobych, Ukraine; 8Chernivtsi Philosophical and Legal Lyceum, Chernivtsi, Ukraine; 9University Institute of Lisbon, Lisbon, Portugal

yurii.tverdokhlib.sa.2017@lpnu.ua1, vasyl.a.andrunyk@lpnu.ua2, Lyubomyr.Chyrun@lnu.edu.ua4, nantonyk@yahoo.com5, ivan.dyyak@lnu.edu.ua6, oleh.naum@gmail.com7, ugrund38@gmail.com8

Abstract. This article discusses and compares some machine-learning regression methods for developing a prognostic model that predicts the daily number of visitors in different areas (tourist places) of India. Visitor reviews from holidayiq.com are used as data. The main features of the selected data set are described.

Keywords: Online Tourism, Popular Places, Machine Learning.

===1 Introduction===

This article is based on a data set derived from user reviews posted on Holidayiq.com about different types of attractions in India [1-2]. The data set was populated from destination reviews published by 249 Holidayiq.com reviewers up to March 2020. This paper discusses and compares some machine-learning regression methods for developing a prognostic model that predicts the daily number of visitors in different areas (tourist places) of India. Implementation of strategic projects will allow for appropriate restructuring of the tourism industry in relation to the socio-economic life of the state [3-9].
It focuses on the population, government, management, and business structures, and on a comprehensive approach to the effective use of the benefits and opportunities that climatic conditions and historical features give the domestic tourism sector [1-16], taking into account the requirements of environmental protection and the preservation and enrichment of its heritage [17-25].

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

===2 Main Part===

Tourism in India is important for the country's economy and is growing rapidly [1]. The World Travel and Tourism Council estimated that tourism in 2018 generated $240 billion, or 9.2% of India's GDP, and supported 42.673 million jobs, 8.1% of total employment [1]. The sector is projected to grow at an annual rate of 6.9%, reaching 32.05 lakh crore rupees by 2028 [1]. Tourism is one of the largest earners of foreign currency. The importance of tourism as a tool for economic development and employment, especially in remote and backward areas, is well recognized around the world. The benefits of tourism can be increased either by increasing the number of tourists or by increasing the length of their stay in the country. Data on length of stay by nationality are important and useful for the targeted promotion of tourism in source markets.

This paper uses a set of user data (feedback) on different types of attractions in India. The dataset contains 1743 data points collected from Holidayiq.com [2], which is consistent with 249 reviewers and seven attributes per reviewer. The main features of the selected data set are:

1. User ID;
2. Number of reviews of stadiums and sports complexes;
3. Number of reviews of religious institutions;
4. Number of reviews of beaches, lakes, rivers, etc.;
5. Number of reviews of theaters and exhibitions;
6. Number of reviews of shopping centers and shopping malls;
7. Number of reviews of parks, picnic areas, etc.

Fig. 1. Data set.
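The seven attributes listed above map naturally onto a table with one row per reviewer. The following is a minimal sketch of reading such a table with Python's standard csv module. The column names and the three sample rows are hypothetical and chosen only for illustration, since the source does not show the file header or its loading code.

```python
import csv
import io

# Hypothetical column names for the seven attributes listed above;
# the header of the real source file may differ.
COLUMNS = ["user_id", "sports", "religious", "nature", "theatre", "shopping", "picnic"]

# Made-up sample rows for illustration only (not taken from the real data set).
SAMPLE_CSV = """user_id,sports,religious,nature,theatre,shopping,picnic
User 1,3,10,20,15,8,12
User 2,5,7,18,22,9,30
User 3,4,12,25,11,14,21
"""


def load_reviews(text):
    """Parse CSV text into a list of dicts with integer review counts."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        row = {"user_id": record["user_id"]}
        for name in COLUMNS[1:]:
            row[name] = int(record[name])
        rows.append(row)
    return rows


def column_summary(rows, name):
    """Return (minimum, maximum, mean) of one review-count column."""
    values = [row[name] for row in rows]
    return min(values), max(values), sum(values) / len(values)


rows = load_reviews(SAMPLE_CSV)
print(len(rows), "users,", len(COLUMNS), "attributes each")
print("picnic (min, max, mean):", column_summary(rows, "picnic"))
```

A per-column summary of this kind is one simple way to judge whether the attributes share a comparable range before deciding on normalization.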
The method implemented in this work is the classification tree.

To implement this course work, we chose the Python programming language. Python is an easy-to-use yet full-fledged programming language that provides many tools for structuring and supporting large applications [26-32]. The large selection of libraries is one of the main reasons that Python is the most popular programming language used for ML [33-45]. A library is a module or group of modules, published through sources such as PyPI, that contains pre-written code allowing users to achieve certain functionality or perform various actions [46-51]. Python libraries provide basic building blocks, so they do not need to be coded from scratch. ML requires constant data processing, and Python libraries allow you to access, process, and transform data [52-61].

Fig. 2. PyCharm interface.

First, we import the libraries whose capabilities we will use later. We then load the data, read the data set, and build graphs. The next chart visualizes the data on user feedback about picnics and parks. It has 2 local maxima and 4 local minima. The same can be said for the next 2 graphs. These graphs help decide whether the data are suitable to work with and whether they need to be normalized.

Fig. 3. Bar chart of reviews of parks and picnic areas.

Fig. 4. Bar chart of reviews of stadiums.

Fig. 5. Bar chart of reviews of religious places.

The following graph is also a chart, but it shows how feedback after visiting the theater affects the feedback on beaches and lakes; the points range from 0 to 10.

Fig. 6. Bar chart of reviews of beaches, lakes, and rivers.

The following graph is a scatter plot that shows the distribution of one attribute relative to the distribution of another attribute. We need this data in order to cluster it in the future.

Fig. 7.
Scatter plot of theater visits in relation to shopping centers.

Fig. 8. Scatter plot of park visits in relation to shopping centers.

===3 Assigning Target and Feature Variables===

Feature selection is one of the core concepts that affect the performance of the model. In the piece of code shown above, we have assigned the feature and target variables.

===4 Splitting the Dataset into Training Set and Testing Set===

We generally split the available data into a training set and a testing set. The model learns from the training set, and we use the testing set to measure how accurate the model is. Here we divide the data into 70% for training and 30% for testing.

===5 Accuracy for the Training Data===

===6 Accuracy of the Testing Data===

===7 Data Visualization===

The entropy for each node in the decision tree is calculated and shown in Fig. 9.

Fig. 9. Scatter plot of park visits in relation to shopping centers.

===8 Perceptron===

In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function that can determine whether an input represented by a vector of numbers belongs to a particular class.

===9 Logistic Regression===

===10 Results===

Decision tree: 44%. Perceptron: 10%. Logistic regression: 8%.

Note: Since the dataset is small, we are getting low accuracy. From the above results we can conclude that the decision tree is the best method for this dataset, with an accuracy of 44%.

===11 Conclusions===

This article discusses and compares some machine-learning regression methods for developing a prognostic model that predicts the daily number of visitors in different areas (tourist places) of India. Visitor reviews from holidayiq.com are used as data. The main features of the selected data set are described. Next, methods and means of implementation are described. To implement this course work, we chose the Python programming language. We then loaded the data, read the data set, and built graphs.
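The 70%/30% split described in Section 4, the per-node entropy mentioned in Section 7, and the accuracy figures reported in Section 10 can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the fixed random seed, and the use of base-2 logarithms are choices made here, and the source does not say which library performed these steps.

```python
import math
import random
from collections import Counter


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into training and testing parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed is an assumption, for repeatability
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def entropy(labels):
    """Shannon entropy, in bits, of the class labels that reach a tree node."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


train, test = train_test_split(list(range(10)))
print(len(train), "training samples,", len(test), "testing samples")
print("entropy of a balanced two-class node:", entropy(["a", "a", "b", "b"]))
```

A node whose samples all share one class has entropy 0, while an evenly mixed two-class node has entropy 1 bit; a tree that splits on entropy prefers the splits that reduce this value the most.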
This article presents six graphs: four bar charts and two scatter plots. Three machine-learning methods are used. The first, the decision tree, showed the best result of 44%. The second is the perceptron with 10%, and the third is logistic regression with 8%. In general, the algorithms are not very successful at their task, which we attribute to the small size of the data sample.

===References===

1. Tourism survey in the State of Delhi, http://tourism.gov.in/sites/default/files/Other/Delhi_0.pdf.
2. BuddyMove Data set, https://archive.ics.uci.edu/ml/datasets/BuddyMove+Data+Set.
3. Su, J., Sachenko, A., Lytvyn, V., Vysotska, V., Dosyn, D.: Model of Touristic Information Resources Integration According to User Needs. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 113-116. (2018)
4. Weske, M.: Business Process Management. Springer-Verlag, Berlin-Heidelberg. (2008)
5. Varetskyy, Y., Rusyn, B., Molga, A., Ignatovych, A.: A new method of fingerprint key protection of grid credential. In: Advances in Intelligent and Soft Computing, 84, 99-103. (2010)
6. Woolf, B.: Exploring IBM SOA. Technology & Practice. Maximum Press. (2009)
7. IBM Redbook. Patterns: Implementing an SOA Using an Enterprise Service Bus, http://www.ibm.com/Index.cfm?ArticleID=22084, last accessed 2020/05/01.
8. Kravets, P.: Game method for coalitions formation in multi-agent systems. In: International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT, 1-4. (2018)
9. IBM Redbook Patterns: Service-Oriented Architecture and Web Services, http://www.redbooks.ibm.com/abstracts/sg246303.html, last accessed 2020/05/01.
10. Berko, A.Y., Aliekseyeva, K.A.: Quality evaluation of information resources in web-projects. In: Actual Problems of Economics 136(10), 226-234. (2012)
11.
SOA Development Using the IBM Rational Software Development Platform, http://www07.ibm.com/sg/soa/downloads/SOA_Development_using_Rational_Software, last accessed 2020/05/01.
12. Lytvyn, V., Vysotska, V., Burov, Y., Demchuk, A.: Architectural ontology designed for intellectual analysis of e-tourism resources. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 335-338. (2018)
13. Antonyuk, N., Vysotsky, A., Vysotska, V., Lytvyn, V., Burov, Y., Demchuk, A., Lyudkevych, I., Chyrun, L., Chyrun, S., Bobyk, I.: Consolidated Information Web Resource for Online Tourism Based on Data Integration and Geolocation. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 15-20. (2019)
14. Vysotsky, A., Lytvyn, V., Vysotska, V., Dosyn, D., Lyudkevych, I., Antonyuk, N., Naum, O., Vysotskyi, A., Chyrun, L., Slyusarchuk, O.: Online Tourism System for Proposals Formation to User Based on Data Integration from Various Sources. In: International Conference on Computer Sciences and Information Technologies, CSIT, 92-97. (2019)
15. Shakhovska, N., Shakhovska, K., Fedushko, S.: Some Aspects of the Method for Tourist Route Creation. In: Advances in Artificial Systems for Medicine and Education II, 902, 527-537. (2019)
16. Antonyuk, N., Medykovskyy, M., Chyrun, L., Dverii, M., Oborska, O., Krylyshyn, M., Vysotsky, A., Tsiura, N., Naum, O.: Online Tourism System Development for Searching and Planning Trips with User's Requirements. In: Advances in Intelligent Systems and Computing IV, Springer Nature Switzerland AG 2020, 1080, 831-863. (2020)
17. Lozynska, O., Savchuk, V., Pasichnyk, V.: Individual Sign Translator Component of Tourist Information System. In: Advances in Intelligent Systems and Computing IV, Springer Nature Switzerland AG 2020, Springer, Cham, 1080, 593-601. (2020)
18.
Savchuk, V., Lozynska, O., Pasichnyk, V.: Architecture of the Subsystem of the Tourist Profile Formation. In: Advances in Intelligent Systems and Computing, 561-570. (2019)
19. Zhezhnych, P., Markiv, O.: Recognition of tourism documentation fragments from web-page posts. In: 14th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering, TCSET, 948-951. (2018)
20. Zhezhnych, P., Markiv, O.: Linguistic comparison quality evaluation of web-site content with tourism documentation objects. In: Advances in Intelligent Systems and Computing 689, 656-667. (2018)
21. Zhezhnych, P., Markiv, O.: A linguistic method of web-site content comparison with tourism documentation objects. In: International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT, 340-343. (2017)
22. Babichev, S.: An Evaluation of the Information Technology of Gene Expression Profiles Processing Stability for Different Levels of Noise Components. In: Data, 3 (4), 48. (2018)
23. Babichev, S., Durnyak, B., Pikh, I., Senkivskyy, V.: An Evaluation of the Objective Clustering Inductive Technology Effectiveness Implemented Using Density-Based and Agglomerative Hierarchical Clustering Algorithms. In: Advances in Intelligent Systems and Computing, 1020, 532-553. (2020)
24. Lytvyn, V., Vysotska, V.: Designing architecture of electronic content commerce system. In: Computer Science and Information Technologies. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 115-119. (2015)
25. Rusyn, B., Vysotska, V., Pohreliuk, L.: Model and architecture for virtual library information system. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 37-41. (2018)
26.
Lytvyn, V., Kuchkovskiy, V., Vysotska, V., Markiv, O., Pabyrivskyy, V.: Architecture of system for content integration and formation based on cryptographic consumer needs. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 391-395. (2018)
27. Lytvyn, V., Vysotska, V., Demchuk, A., Demkiv, I., Ukhanska, O., Hladun, V., Kovalchuk, R., Petruchenko, O., Dzyubyk, L., Sokulska, N.: Design of the architecture of an intelligent system for distributing commercial content in the internet space based on SEO-technologies, neural networks, and Machine Learning. In: Eastern-European Journal of Enterprise Technologies, 2(2-98), 15-34. (2019)
28. Rzheuskyi, A., Kutyuk, O., Vysotska, V., Burov, Y., Lytvyn, V., Chyrun, L.: The Architecture of Distant Competencies Analyzing System for IT Recruitment. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 254-261. (2019)
29. Kazarian, A., Kunanets, N., Pasichnyk, V., Veretennikova, N., Rzheuskyi, A., Leheza, A., Kunanets, O.: Complex Information E-Science System Architecture based on Cloud Computing Model. In: CEUR Workshop Proceedings, Vol-2362, 366-377. (2019)
30. Lytvyn, V., Dosyn, D., Emmerich, M., Yevseyeva, I.: Content formation method in the web systems. In: CEUR Workshop Proceedings, 2136, 42-61. (2018)
31. Lytvyn, V., Rybchak, Z.: Design of airport service automation system. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 195-197. (2015)
32. Lytvyn, V.V.: An approach to intelligent agent construction for determining the group of bank risk basing on ontology. In: Actual Problems of Economics (7), 314-320. (2011)
33. Dosyn, D., Lytvyn, V., Kovalevych, V., Oborska, O., Holoshchuk, R.: Knowledge discovery as planning development in knowledgebase framework. In: Modern Problems of Radio Engineering, Telecommunications and Computer Science, TCSET, 449-451.
(2016)
34. Lypak, O.H., Lytvyn, V., Lozynska, O., (...), Rzheuskyi, A., Dosyn, D.: Formation of Efficient Pipeline Operation Procedures Based on Ontological Approach. In: Advances in Intelligent Systems and Computing, 871, 571-581. (2019)
35. Peleshchak, R., Lytvyn, V., Peleshchak, I., Olyvko, R., Korniak, J.: Decision making model based on neural network with diagonalized synaptic connections. In: Advances in Intelligent Systems and Computing, 853, 321-329. (2019)
36. Pasichnyk, V., Lytvyn, V., Kunanets, N., (...), Bolyubash, Y., Rzheuskyi, A.: Ontological approach in the formation of effective pipeline operation procedures. In: 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT, 80-83. (2018)
37. Lytvyn, V., Uhryn, D., Fityo, A.: Modeling of territorial community formation as a graph partitioning problem. In: Eastern-European Journal of Enterprise Technologies, 1(4), 47-52. (2016)
38. Lytvyn, V.: The similarity metric of scientific papers summaries on the basis of adaptive ontologies. In: Proceedings of 7th International Conference on Perspective Technologies and Methods in MEMS Design, MEMSTECH, 162. (2011)
39. Lytvyn, V.V., Tsmots, O.I.: The process of managerial decision making support within the early warning system. In: Actual Problems of Economics, 149(11), 222-229. (2013)
40. Kravets, P.: The control agent with fuzzy logic. In: Perspective Technologies and Methods in MEMS Design, MEMSTECH, 40-41. (2010)
41. Kravets, P.: The game method for orthonormal systems construction. In: The Experience of Designing and Application of CAD Systems in Microelectronics, 296-298. (2007)
42. Kravets, P., Kyrkalo, R.: Fuzzy logic controller for embedded systems. In: International Conference on Perspective Technologies and Methods in MEMS Design, 58-59. (2009)
43. Kravets, P., Prodanyuk, O.: Game task of resource allocation.
In: Experience of Designing and Application of CAD Systems in Microelectronics, CADSM, 437-438. (2009)
44. Kravets, P.: Game methods of the stochastic boundary problem solution. In: Perspective Technologies and Methods in MEMS Design, MEMSTECH, 71-74. (2007)
45. Kravets, P.: Adaptive method of pursuit game problem solution. In: Modern Problems of Radio Engineering, Telecommunications and Computer Science Proceedings of International Conference, TCSET, 62-65. (2006)
46. Kravets, P.: Game methods of construction of adaptive grid areas. In: The Experience of Designing and Application of CAD Systems in Microelectronics, 513-516. (2003)
47. Shakhovska, N., Veres, O., Bolubash, Y., Bychkovska-Lipinska, L.: Data space architecture for Big Data managering. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 184-187. (2015)
48. Vysotska, V., Hasko, R., Kuchkovskiy, V.: Process analysis in electronic content commerce system. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 120-123. (2015)
49. Lytvyn, V., Vysotska, V., Veres, O., Rishnyak, I., Rishnyak, H.: The Risk Management Modelling in Multi Project Environment. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 32-35. (2017)
50. Kanishcheva, O., Vysotska, V., Chyrun, L., Gozhyj, A.: Method of Integration and Content Management of the Information Resources Network. In: Advances in Intelligent Systems and Computing, 689, Springer, 204-216. (2018)
51. Vysotska, V., Fernandes, V.B., Emmerich, M.: Web content support method in electronic business systems. In: CEUR Workshop Proceedings, Vol-2136, 20-41. (2018)
52. Vysotska, V., Lytvyn, V., Burov, Y., Gozhyj, A., Makara, S.: The consolidated information web-resource about pharmacy networks in city. In: CEUR Workshop Proceedings, 239-255. (2018)
53.
Gozhyj, A., Kalinina, I., Vysotska, V., Gozhyj, V.: The method of web-resources management under conditions of uncertainty based on fuzzy logic. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 343-346. (2018)
54. Gozhyj, A., Vysotska, V., Yevseyeva, I., Kalinina, I., Gozhyj, V.: Web Resources Management Method Based on Intelligent Technologies. In: Advances in Intelligent Systems and Computing, 871, 206-221. (2019)
55. Rzheuskyi, A., Kutyuk, O., Voloshyn, O., Kowalska-Styczen, A., Voloshyn, V., Chyrun, L., Chyrun, S., Peleshko, D., Rak, T.: The Intellectual System Development of Distant Competencies Analyzing for IT Recruitment. In: Advances in Intelligent Systems and Computing IV, Springer, Cham, 1080, 696-720. (2020)
56. Rusyn, B., Pohreliuk, L., Rzheuskyi, A., Kubik, R., Ryshkovets Y., Chyrun, L., Chyrun, S., Vysotskyi, A., Fernandes, V. B.: The Mobile Application Development Based on Online Music Library for Socializing in the World of Bard Songs and Scouts' Bonfires. In: Advances in Intelligent Systems and Computing IV, Springer, 1080, 734-756. (2020)
57. Vysotska, V., Rishnyak, I., Chyrun L.: Analysis and evaluation of risks in electronic commerce. In: CAD Systems in Microelectronics, International Conference, 332-333. (2007)
58. Vysotska, V., Chyrun, L.: Analysis features of information resources processing. In: Proceedings of the International Conference on Computer Sciences and Information Technologies, CSIT, 124-128. (2015)
59. Andrunyk, V., Chyrun, L., Vysotska, V.: Electronic content commerce system development. In: Proceedings of 13th International Conference: The Experience of Designing and Application of CAD Systems in Microelectronics, CADSM 2015-February. (2015)
60. Alieksieieva, K., Berko, A., Vysotska, V.: Technology of commercial web-resource processing.
In: Proceedings of 13th International Conference: The Experience of Designing and Application of CAD Systems in Microelectronics, CADSM 2015-February. (2015)
61. Vysotska, V., Chyrun, L.: Methods of information resources processing in electronic content commerce systems. In: Proceedings of 13th International Conference: The Experience of Designing and Application of CAD Systems in Microelectronics, CADSM. (2015)