=Paper= {{Paper |id=Vol-2631/paper34 |storemode=property |title=Analysis and Estimation of Popular Places in Online Tourism Based on Machine Learning Technology |pdfUrl=https://ceur-ws.org/Vol-2631/paper34.pdf |volume=Vol-2631 |authors=Yurii Tverdokhlib,Vasyl Andrunyk,Liliya Chyrun,Lyubomyr Chyrun,Nataliya Antonyuk,Ivan Dyyak,Oleh Naum,Dmytro Uhryn,Vitor Basto-Fernandes |dblpUrl=https://dblp.org/rec/conf/momlet/TverdokhlibACCA20 }} ==Analysis and Estimation of Popular Places in Online Tourism Based on Machine Learning Technology== https://ceur-ws.org/Vol-2631/paper34.pdf
       Analysis and Estimation of Popular Places in Online
        Tourism Based on Machine Learning Technology

  Yurii Tverdokhlib1, Vasyl Andrunyk2[0000-0003-0697-7384], Liliya Chyrun3[0000-0003-4040-7588],
  Lyubomyr Chyrun4[0000-0002-9448-1751], Nataliya Antonyuk5[0000-0002-6297-0737],
  Ivan Dyyak6[0000-0001-5841-2604], Oleh Naum7[0000-0001-8700-6998],
  Dmytro Uhryn8[0000-0003-4858-4511], Vitor Basto-Fernandes9[0000-0003-4269-5114]
                         1-3Lviv Polytechnic National University, Lviv, Ukraine
                       4-6Ivan Franko National University of Lviv, Lviv, Ukraine
                                  5University of Opole, Opole, Poland
              7Drohobych Ivan Franko State Pedagogical University, Drohobych, Ukraine
                   8Chernivtsi Philosophical and Legal Lyceum, Chernivtsi, Ukraine
                 9University Institute of Lisbon, Lisbon, Portugal

  yurii.tverdokhlib.sa.2017@lpnu.ua1, vasyl.a.andrunyk@lpnu.ua2,
          Lyubomyr.Chyrun@lnu.edu.ua4, nantonyk@yahoo.com5,
 ivan.dyyak@lnu.edu.ua6, oleh.naum@gmail.com7, ugrund38@gmail.com8



            Abstract. This article discusses and compares some machine-learning regression
            methods for developing a prognostic model that predicts the daily number
            of visitors in different areas (tourist places) of India. Visitor reviews from
            holidayiq.com are used as data. The main features of the selected data set are
            described.

            Keywords: Online Tourism, Popular Places, Machine Learning.


1           Introduction

This article is based on a data set derived from user reviews posted on
Holidayiq.com about different types of attractions in India [1-2]. The data set is
populated from destination reviews published by 249 Holidayiq.com reviewers up to
March 2020.
   This paper discusses and compares some machine-learning regression methods for
developing a prognostic model that predicts the daily number of visitors in different
areas (tourist places) of India.
   Implementation of strategic projects will allow for appropriate restructuring of the
tourism industry in relation to the socio-economic life of the state [3-9]. It focuses on
the population, government, management and business structures, and on a comprehensive
approach to ensuring the effective use of the benefits and opportunities of the domestic
tourism sector arising from climatic conditions and historical features [1-16], taking
into account the requirements of environmental protection and the preservation and
enrichment of its heritage [17-25].
Copyright © 2020 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2       Main Part

Tourism in India is important for the country's economy and is growing rapidly [1]. The
World Travel and Tourism Council estimated that tourism generated $240 billion, or
9.2% of India's GDP, in 2018 and supported 42.673 million jobs, 8.1% of total
employment [1]. The sector is projected to grow at an annual rate of 6.9%, reaching
32.05 lakh crore rupees by 2028 [1]. Tourism is one of the largest earners of foreign
currency. The importance of tourism as a tool for economic development and employment,
especially in remote and backward areas, is well recognized around the world. The
benefits of tourism can be increased either by increasing the number of tourists or by
increasing the length of their stay in the country. Data on length of stay by nationality
are very important and useful for the purposeful promotion of tourism in the source
markets.
   This paper uses a set of user data (feedback) on different types of attractions in
India. The dataset contains 1743 data points collected from Holidayiq.com [2].
   The main features of the selected data set:
        1. User ID;
        2. Number of reviews of stadiums, sports complexes;
        3. Number of reviews of religious institutions;
        4. Number of reviews of beaches, lakes, rivers, etc.;
        5. Number of reviews of theaters, exhibitions;
        6. Number of reviews of shopping centers, shopping malls;
        7. Number of reviews of parks, picnic areas, etc.




Fig. 1. Data set.
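To make the layout of one record concrete, the sketch below models a single row of the data set as a Python dataclass. The class name, the field names, and the sample values are illustrative assumptions and are not taken from the source data; only the seven attributes themselves come from the list above. Note that 249 reviewers with 7 attributes each give 249 x 7 = 1743 values, which is consistent with the stated number of data points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewerRecord:
    """One row of the data set: review counts per category for one user.

    Field names are hypothetical; they mirror the seven listed attributes.
    """
    user_id: str
    sports: int      # stadiums, sports complexes
    religious: int   # religious institutions
    nature: int      # beaches, lakes, rivers
    theatre: int     # theaters, exhibitions
    shopping: int    # shopping centers, shopping malls
    picnic: int      # parks, picnic areas

    def total_reviews(self) -> int:
        """Sum of the six review counts (the user ID is not a count)."""
        return (self.sports + self.religious + self.nature
                + self.theatre + self.shopping + self.picnic)


# Made-up example values, not a row from the real data set.
example = ReviewerRecord("User 1", 3, 10, 12, 8, 7, 15)
total = example.total_reviews()

# Consistency check on the stated size of the data set.
data_points = 249 * 7
```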

The method that will be implemented in this work is the classification tree.
   To implement this work, we chose the Python programming language. Python is an
easy-to-use yet full-fledged programming language that provides many tools for
structuring and supporting large applications [26-32].
   The large selection of libraries is one of the main reasons that Python is the most
popular programming language used for ML [33-45]. A library is a module or group
of modules, published through sources such as PyPI, that contains pre-written
code allowing users to achieve certain functionality or perform various
actions [46-51]. Python libraries provide basic building blocks, so you do not need to
code them from scratch. ML requires constant data processing, and Python
libraries allow you to access, process, and transform data [52-61].




Fig. 2. PyCharm interface

First, we need to import the libraries whose capabilities we will use later.




After loading the data, we read the data set and build graphs. The first chart
visualizes the data on user reviews of parks and picnic areas. As we can see, it has
2 local highs and 4 local lows. The same can be said for the next 2 graphs. These
graphs help to decide whether the data are suitable to work with and whether they
need to be normalized.
Fig. 3. Bar chart of reviews of parks and picnic areas.




Fig. 4. Bar chart of feedback on stadiums.
Fig. 5. Bar chart of reviews of religious places.
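The bar charts above can be read as counts of users per range of review counts. As a rough, library-free illustration (the plotting code used for the figures is not reproduced in this text), the sketch below groups hypothetical review counts into bins of equal width. The function name, the bin width, and the sample values are assumptions.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Count how many values fall into each bin [k * w, (k + 1) * w).

    Returns a dict that maps the lower edge of each non-empty bin
    to the number of values inside it, ordered by bin edge.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Made-up review counts for parks and picnic areas (one value per user).
picnic_reviews = [52, 58, 61, 75, 77, 90, 95, 95, 110, 118]
histogram = bin_counts(picnic_reviews, 20)
```

Each key of `histogram` is the left edge of a bar and each value is its height, which is the information a bar chart of this kind displays.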

The following graph is also a bar chart, but it shows how feedback after visiting the
theater relates to the feedback on beaches and lakes; we can see values from 0 to 10 here.




Fig. 6. Bar chart of reviews of beaches, lakes, rivers.

The following graph is a scatter plot that shows the distribution of one attribute
relative to the distribution of another attribute. We need this data in order to
cluster it in the future.
Fig. 7. Scatter plot of visits to theaters in relation to shopping centers.




Fig. 8. Scatter plot of visits to parks in relation to shopping centers.
3      Assigning Target and Feature Variables




Feature selection is one of the core concepts that affect the performance of the
model. In this step, we assign the feature and target variables.
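The original listing for this step is not reproduced in this text, so the following is only a minimal sketch of what assigning feature and target variables can look like. The choice of the last column (parks and picnic areas) as the target, the column order, and the sample rows are assumptions made for illustration.

```python
# Hypothetical rows in the order:
# (user_id, sports, religious, nature, theatre, shopping, picnic)
rows = [
    ("User 1", 3, 10, 12, 8, 7, 15),
    ("User 2", 5, 9, 14, 6, 11, 13),
    ("User 3", 2, 12, 10, 9, 8, 15),
]

FEATURE_COLUMNS = (1, 2, 3, 4, 5)  # assumption: the first five review counts
TARGET_COLUMN = 6                  # assumption: reviews of parks and picnic areas


def split_features_target(rows, feature_columns, target_column):
    """Return (X, y): a list of feature vectors and the list of target values."""
    X = [[row[i] for i in feature_columns] for row in rows]
    y = [row[target_column] for row in rows]
    return X, y


X, y = split_features_target(rows, FEATURE_COLUMNS, TARGET_COLUMN)
```

The user ID is left out of the features on purpose, since an identifier carries no information about review behavior.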


4      Splitting the Dataset into Training Set and Testing Set

We split the available data into a training set and a testing set. The model learns
on the training data, and we use the test data to check how accurate the model is.



Here we divide our data into 70% training and 30% testing.
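A 70/30 split of this kind can be sketched with the standard library alone. The function name, the fixed random seed, and the toy data below are assumptions; the code actually used for the split is not shown in this text.

```python
import random


def split_train_test(samples, labels, train_fraction=0.7, seed=42):
    """Shuffle the indices reproducibly and cut them into train and test parts."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)   # local RNG, global state untouched
    cut = round(train_fraction * len(indices))
    train_idx, test_idx = indices[:cut], indices[cut:]
    X_train = [samples[i] for i in train_idx]
    X_test = [samples[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy data: ten samples with two features each.
samples = [[i, i * 2] for i in range(10)]
labels = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = split_train_test(samples, labels)
```

Applied to 249 records, the same rule yields 174 training rows and 75 testing rows.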


5      Accuracy for the Training Data




6      Accuracy of the Testing Data
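Accuracy is understood here as the share of samples whose predicted class equals the true class; the same measure applies to the training and the testing data. The sketch below is a minimal assumed implementation with made-up labels, not the code used to obtain the reported results.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: two of the four predictions are correct.
example_accuracy = accuracy([1, 0, 1, 1], [1, 1, 1, 0])
```

A training accuracy that is much higher than the testing accuracy usually indicates overfitting, which is why both values are worth reporting.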
7       Data Visualization




The entropy for each node in the decision tree is calculated and shown in Fig. 9.




Fig. 9. Decision tree with the entropy of each node.
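The entropy of a node measures how mixed the class labels of the samples in that node are: H = sum over classes of p_c * log2(1 / p_c), where p_c is the share of class c in the node. The sketch below computes this quantity for a list of labels; the function name and the example labels are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the class labels that reach a tree node."""
    n = len(labels)
    if n == 0:
        return 0.0
    shares = [count / n for count in Counter(labels).values()]
    return sum(p * math.log2(1 / p) for p in shares)


mixed_node = entropy([0, 0, 1, 1])   # two classes in equal shares
pure_node = entropy([1, 1, 1])       # a single class only
```

A pure node has entropy 0, and a node split evenly between two classes has entropy 1 bit; an entropy-based tree prefers splits that lower this value in the child nodes.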


8       Perceptron

In machine learning, the perceptron is an algorithm for the supervised learning of binary
classifiers. A binary classifier is a function that can determine whether an input
represented by a vector of numbers belongs to a particular class.
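As an illustration of this definition, the sketch below implements the classic perceptron learning rule on a tiny, linearly separable toy problem (logical AND). The function names, the learning rate, the number of epochs, and the data are assumptions; this is not the configuration used for the experiments reported here.

```python
def predict(weights, bias, x):
    """Return class 1 if the weighted sum reaches the threshold, else class 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation >= 0 else 0


def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Perceptron rule: after each mistake, move the weights toward the target."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)   # -1, 0 or +1
            if error != 0:
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
    return weights, bias


# Toy problem: logical AND of two binary inputs.
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

The rule is guaranteed to converge only when the two classes are linearly separable, which may be one reason for weak results on real review data.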
9   Logistic Regression
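Logistic regression models the probability of a class as the sigmoid of a weighted sum of the features and fits the weights by minimizing the log loss. The following is a minimal sketch of the binary case trained with plain batch gradient descent; the function names, hyperparameters, and the one-dimensional toy data are assumptions and do not reproduce the experiment reported here.

```python
import math


def sigmoid(z):
    """Map a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Estimated probability that sample x belongs to class 1."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic_regression(samples, labels, epochs=2000, learning_rate=0.1):
    """Batch gradient descent on the mean log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            diff = predict_proba(weights, bias, x) - y   # d(loss)/d(logit)
            for j, xj in enumerate(x):
                grad_w[j] += diff * xj
            grad_b += diff
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Toy data: small feature values belong to class 0, large ones to class 1.
samples = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]
weights, bias = train_logistic_regression(samples, labels)
predictions = [1 if predict_proba(weights, bias, x) >= 0.5 else 0
               for x in samples]
```

Thresholding the probability at 0.5 turns the model into a classifier whose accuracy can be compared with that of the decision tree and the perceptron.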
10    Results

Decision tree: 44%. Perceptron: 10%. Logistic regression: 8%. Note: since the dataset
is small, the accuracy is low. From the above results, we can conclude that the
decision tree is the best method for this dataset, with an accuracy of 44%.
11     Conclusions

This article discusses and compares some machine-learning regression methods for
developing a prognostic model that predicts the daily number of visitors in different
areas (tourist places) of India. Visitor reviews from holidayiq.com are used as data.
The main features of the selected data set are described, followed by the methods and
means of implementation. We chose the Python programming language for the
implementation. After loading the data, we read the data set and built graphs. The
article presents 6 graphs, namely 4 bar charts and 2 scatter plots. Three
machine-learning methods are used. The first is the decision tree, which showed the
best result, 44%. The second is the perceptron, with 10%, and the third is logistic
regression, with 8%. In general, the algorithms are not very successful in their task,
which we attribute to the small size of the data sample.


References
 1. Tourism survey in the State of Delhi,
    http://tourism.gov.in/sites/default/files/Other/Delhi_0.pdf.
 2. BuddyMove Data set, https://archive.ics.uci.edu/ml/datasets/BuddyMove+Data+Set.
 3. Su, J., Sachenko, A., Lytvyn, V., Vysotska, V., Dosyn, D.: Model of Touristic Information
    Resources Integration According to User Needs. In: Proceedings of the International Con-
    ference on Computer Sciences and Information Technologies, CSIT, 113-116. (2018)
 4. Mathias Weske. Business Process Management. Springer-Verlag, Berlin-Heidelberg,
    (2008).
 5. Varetskyy, Y., Rusyn, B., Molga, A., Ignatovych, A.: A new method of fingerprint key
    protection of grid credential. In: Advances in Intelligent and Soft Computing, 84, 99-103.
    (2010)
 6. Bobby Woolf Exploring IBM SOA. Technology&Practice. Maximum press (2009).
 7. IBM Redbook. Patterns: Implementing an SOA Using an Enterprise Service Bus,
    http://www.ibm.com/Index.cfm?ArticleID=22084, last accessed 2020/05/01.
 8. Kravets, P.: Game method for coalitions formation in multi-agent systems. In: Internation-
    al Scientific and Technical Conference on Computer Sciences and Information Technolo-
    gies, CSIT, 1-4. (2018)
 9. IBM Redbook Patterns: Service-Oriented Architecture and Web Services,
    http://www.redbooks.ibm.com/abstracts/sg246303.html, last accessed 2020/05/01.
10. Berko, A.Y., Aliekseyeva, K.A.: Quality evaluation of information resources in web-
    projects. In: Actual Problems of Economics 136(10), 226-234. (2012)
11. SOA Development Using the IBM Rational Software Development Platform,
    http://www07.ibm.com/sg/soa/downloads/SOA_Development_using_Rational_Software,
    last accessed 2020/05/01.
12. Lytvyn, V., Vysotska, V., Burov, Y., Demchuk, A.: Architectural ontology designed for
    intellectual analysis of e-tourism resources. In: Proceedings of the International Confer-
    ence on Computer Sciences and Information Technologies, CSIT, 335-338. (2018)
13. Antonyuk, N., Vysotsky, A., Vysotska, V., Lytvyn, V., Burov, Y., Demchuk, A.,
    Lyudkevych, I., Chyrun, L., Chyrun, S., Bobyk, І.: Consolidated Information Web Re-
    source for Online Tourism Based on Data Integration and Geolocation. In: Proceedings of
    the International Conference on Computer Sciences and Information Technologies, CSIT,
    15-20. (2019)
14. Vysotsky, A., Lytvyn, V., Vysotska, V., Dosyn, D., Lyudkevych, I., Antonyuk, N., Naum,
    O., Vysotskyi, A., Chyrun, L., Slyusarchuk, O.: Online Tourism System for Proposals
    Formation to User Based on Data Integration from Various Sources. In: International Con-
    ference on Computer Sciences and Information Technologies, CSIT, 92-97. (2019)
15. Shakhovska, N., Shakhovska, K., Fedushko, S.: Some Aspects of the Method for Tourist
    Route Creation. In: Advances in Artificial Systems for Medicine and Education II, 902,
    527-537. (2019)
16. Antonyuk, N., Medykovskyy, M., Chyrun, L., Dverii, M., Oborska, O., Krylyshyn, M.,
    Vysotsky, A., Tsiura, N., Naum, O.: Online Tourism System Development for Searching
    and Planning Trips with User’s Requirements. In: Advances in Intelligent Systems and
    Computing IV, Springer Nature Switzerland AG 2020, 1080, 831-863. (2020)
17. Lozynska, O., Savchuk, V., Pasichnyk, V.: Individual Sign Translator Component of Tour-
    ist Information System. In: Advances in Intelligent Systems and Computing IV, Springer
    Nature Switzerland AG 2020, Springer, Cham, 1080, 593-601. (2020)
18. Savchuk, V., Lozynska, O., Pasichnyk, V.: Architecture of the Subsystem of the Tourist
    Profile Formation. In: Advances in Intelligent Systems and Computing, 561-570. (2019)
19. Zhezhnych, P., Markiv, O.: Recognition of tourism documentation fragments from web-
    page posts. In: 14th International Conference on Advanced Trends in Radioelectronics,
    Telecommunications and Computer Engineering, TCSET, 948-951. (2018)
20. Zhezhnych, P., Markiv, O.: Linguistic comparison quality evaluation of web-site content
    with tourism documentation objects. In: Advances in Intelligent Systems and Computing
    689, 656-667. (2018)
21. Zhezhnych, P., Markiv, O.: A linguistic method of web-site content comparison with tour-
    ism documentation objects. In: International Scientific and Technical Conference on Com-
    puter Sciences and Information Technologies, CSIT, 340-343. (2017)
22. Babichev, S.: An Evaluation of the Information Technology of Gene Expression Profiles
    Processing Stability for Different Levels of Noise Components. In: Data, 3 (4), 48. (2018)
23. Babichev, S., Durnyak, B., Pikh, I., Senkivskyy, V.: An Evaluation of the Objective Clus-
    tering Inductive Technology Effectiveness Implemented Using Density-Based and Ag-
    glomerative Hierarchical Clustering Algorithms. In: Advances in Intelligent Systems and
    Computing, 1020, 532-553. (2020)
24. Lytvyn, V., Vysotska, V.: Designing architecture of electronic content commerce system.
    In: Computer Science and Information Technologies. In: Proceedings of the International
    Conference on Computer Sciences and Information Technologies, CSIT, 115-119. (2015)
25. Rusyn, B., Vysotska, V., Pohreliuk, L.: Model and architecture for virtual library infor-
    mation system. In: Proceedings of the International Conference on Computer Sciences and
    Information Technologies, CSIT, 37-41. (2018)
26. Lytvyn, V., Kuchkovskiy, V., Vysotska, V., Markiv, O., Pabyrivskyy, V.: Architecture of
    system for content integration and formation based on cryptographic consumer needs. In:
    Proceedings of the International Conference on Computer Sciences and Information Tech-
    nologies, CSIT, 391-395. (2018)
27. Lytvyn, V., Vysotska, V., Demchuk, A., Demkiv, I., Ukhanska, O., Hladun, V., Koval-
    chuk, R., Petruchenko, O., Dzyubyk, L., Sokulska, N.: Design of the architecture of an in-
    telligent system for distributing commercial content in the internet space based on SEO-
    technologies, neural networks, and Machine Learning. In: Eastern-European Journal of En-
    terprise Technologies, 2(2-98), 15-34. (2019)
28. Rzheuskyi, A., Kutyuk, O., Vysotska, V., Burov, Y., Lytvyn, V., Chyrun, L.: The Archi-
    tecture of Distant Competencies Analyzing System for IT Recruitment. In: Proceedings of
    the International Conference on Computer Sciences and Information Technologies, CSIT,
    254-261. (2019)
29. Kazarian, A., Kunanets, N., Pasichnyk, V., Veretennikova, N., Rzheuskyi, A., Leheza, A.,
    Kunanets, O.: Complex Information E-Science System Architecture based on Cloud Com-
    puting Model. In: CEUR Workshop Proceedings, Vol-2362, 366-377. (2019)
30. Lytvyn, V., Dosyn, D., Emmerich, M., Yevseyeva, I.: Content formation method in the
    web systems. In: CEUR Workshop Proceedings, 2136, 42-61. (2018)
31. Lytvyn, V., Rybchak, Z.: Design of airport service automation system. In: Proceedings of
    the International Conference on Computer Sciences and Information Technologies, CSIT,
    195-197. (2015)
32. Lytvyn, V.V.: An approach to intelligent agent construction for determining the group of
    bank risk basing on ontology. In: Actual Problems of Economics (7), 314-320. (2011)
33. Dosyn, D., Lytvyn, V., Kovalevych, V., Oborska, O., Holoshchuk, R.: Knowledge discov-
    ery as planning development in knowledgebase framework. In: Modern Problems of Radio
    Engineering, Telecommunications and Computer Science, TCSET, 449-451. (2016)
34. Lypak, O.H., Lytvyn, V., Lozynska, O., (...), Rzheuskyi, A., Dosyn, D.: Formation of Effi-
    cient Pipeline Operation Procedures Based on Ontological Approach. In: Advances in In-
    telligent Systems and Computing, 871, 571-581. (2019)
35. Peleshchak, R., Lytvyn, V., Peleshchak, I., Olyvko, R., Korniak, J.: Decision making mod-
    el based on neural network with diagonalized synaptic connections. In: Advances in Intel-
    ligent Systems and Computing, 853, 321-329. (2019)
36. Pasichnyk, V., Lytvyn, V., Kunanets, N., (...), Bolyubash, Y., Rzheuskyi, A.: Ontological
    approach in the formation of effective pipeline operation procedures. In: 13th International
    Scientific and Technical Conference on Computer Sciences and Information Technologies,
    CSIT, 80-83. (2018)
37. Lytvyn, V., Uhryn, D., Fityo, A.: Modeling of territorial community formation as a graph
    partitioning problem. In: Eastern-European Journal of Enterprise Technologies, 1(4), 47-
    52. (2016)
38. Lytvyn, V.: The similarity metric of scientific papers summaries on the basis of adaptive
    ontologies. In: Proceedings of 7th International Conference on Perspective Technologies
    and Methods in MEMS Design, MEMSTECH, 162. (2011)
39. Lytvyn, V.V., Tsmots, O.I.: The process of managerial decision making support within the
    early warning system. In: Actual Problems of Economics, 149(11), 222-229. (2013)
40. Kravets, P.: The control agent with fuzzy logic. In: Perspective Technologies and Methods
    in MEMS Design, MEMSTECH, 40-41. (2010)
41. Kravets, P.: The game method for orthonormal systems construction. In: The Experience
    of Designing and Application of CAD Systems in Microelectronics, 296-298. (2007)
42. Kravets, P., Kyrkalo, R.: Fuzzy logic controller for embedded systems. In: International
    Conference on Perspective Technologies and Methods in MEMS Design, 58-59. (2009)
43. Kravets, P., Prodanyuk, O.: Game task of resource allocation. In: Experience of Designing
    and Application of CAD Systems in Microelectronics, CADSM, 437-438. (2009)
44. Kravets, P.: Game methods of the stochastic boundary problem solution. In: Perspective
    Technologies and Methods in MEMS Design, MEMSTECH, 71-74. (2007)
45. Kravets, P.: Adaptive method of pursuit game problem solution. In: Modern Problems of
    Radio Engineering, Telecommunications and Computer Science Proceedings of Interna-
    tional Conference, TCSET, 62-65. (2006)
46. Kravets, P.: Game methods of construction of adaptive grid areas. In: The Experience of
    Designing and Application of CAD Systems in Microelectronics, 513-516. (2003)
47. Shakhovska, N., Veres, O., Bolubash, Y., Bychkovska-Lipinska, L.: Data space architec-
    ture for Big Data managering. In: Proceedings of the International Conference on Comput-
    er Sciences and Information Technologies, CSIT, 184-187. (2015)
48. Vysotska, V., Hasko, R., Kuchkovskiy, V.: Process analysis in electronic content com-
    merce system. In: Proceedings of the International Conference on Computer Sciences and
    Information Technologies, CSIT, 120-123. (2015)
49. Lytvyn, V., Vysotska, V., Veres, O., Rishnyak, I., Rishnyak, H.: The Risk Management
    Modelling in Multi Project Environment. In: Proceedings of the International Conference
    on Computer Sciences and Information Technologies, CSIT, 32-35. (2017)
50. Kanishcheva, O., Vysotska, V., Chyrun, L., Gozhyj, A.: Method of Integration and Con-
    tent Management of the Information Resources Network. In: Advances in Intelligent Sys-
    tems and Computing, 689, Springer, 204-216. (2018)
51. Vysotska, V., Fernandes, V.B., Emmerich, M.: Web content support method in electronic
    business systems. In: CEUR Workshop Proceedings, Vol-2136, 20-41. (2018)
52. Vysotska, V., Lytvyn, V., Burov, Y., Gozhyj, A., Makara, S.: The consolidated infor-
    mation web-resource about pharmacy networks in city. In: CEUR Workshop Proceedings,
    239-255. (2018)
53. Gozhyj, A., Kalinina, I., Vysotska, V., Gozhyj, V.: The method of web-resources man-
    agement under conditions of uncertainty based on fuzzy logic. In: Proceedings of the In-
    ternational Conference on Computer Sciences and Information Technologies, CSIT, 343-
    346. (2018)
54. Gozhyj, A., Vysotska, V., Yevseyeva, I., Kalinina, I., Gozhyj, V.: Web Resources Man-
    agement Method Based on Intelligent Technologies. In: Advances in Intelligent Systems
    and Computing, 871, 206-221. (2019)
55. Rzheuskyi, A., Kutyuk, O., Voloshyn, O., Kowalska-Styczen, A., Voloshyn, V., Chyrun,
    L., Chyrun, S., Peleshko, D., Rak, T.: The Intellectual System Development of Distant
    Competencies Analyzing for IT Recruitment. In: Advances in Intelligent Systems and
    Computing IV, Springer, Cham, 1080, 696-720. (2020)
56. Rusyn, B., Pohreliuk, L., Rzheuskyi, A., Kubik, R., Ryshkovets Y., Chyrun, L., Chyrun,
    S., Vysotskyi, A., Fernandes, V. B.: The Mobile Application Development Based on
    Online Music Library for Socializing in the World of Bard Songs and Scouts’ Bonfires. In:
    Advances in Intelligent Systems and Computing IV, Springer, 1080, 734-756. (2020)
57. Vysotska, V., Rishnyak, I., Chyrun L.: Analysis and evaluation of risks in electronic com-
    merce. In: CAD Systems in Microelectronics, International Conference, 332-333. (2007)
58. Vysotska, V., Chyrun, L.: Analysis features of information resources processing. In: Pro-
    ceedings of the International Conference on Computer Sciences and Information Technol-
    ogies, CSIT, 124-128. (2015)
59. Andrunyk, V., Chyrun, L., Vysotska, V.: Electronic content commerce system develop-
    ment. In: Proceedings of 13th International Conference: The Experience of Designing and
    Application of CAD Systems in Microelectronics, CADSM 2015-February. (2015)
60. Alieksieieva, K., Berko, A., Vysotska, V.: Technology of commercial web-resource pro-
    cessing. In: Proceedings of 13th International Conference: The Experience of Designing
    and Application of CAD Systems in Microelectronics, CADSM 2015-February. (2015)
61. Vysotska, V., Chyrun, L.: Methods of information resources processing in electronic con-
    tent commerce systems. In: Proceedings of 13th International Conference: The Experience
    of Designing and Application of CAD Systems in Microelectronics, CADSM. (2015)