<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">How to use Instagram to Travel the World? An Approach to Discovering Relevant Insights from Tourist Media Content</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Angel</forename><surname>Fiallos</surname></persName>
							<email>afialloso@ecotec.edu.ec</email>
							<affiliation key="aff0">
								<orgName type="institution">Universidad Ecotec</orgName>
								<address>
									<settlement>Samborondón</settlement>
									<country key="EC">Ecuador</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">How to use Instagram to Travel the World? An Approach to Discovering Relevant Insights from Tourist Media Content</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">D80212D22A8D8D2596E08A28E9533786</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T21:35+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Natural Language Processing</term>
					<term>Data Mining</term>
					<term>Computer Vision</term>
					<term>Instagram</term>
					<term>Tourism</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This work aims to detect content themes, locations, sentiment, and demographic information on Instagram or similar platforms in a way that supports business decision-making and marketing strategies in the tourism or travel industries. For this purpose, we propose an original combination of NLP methodology and computer vision to be applied to the content of posts associated with a specific hashtag. To demonstrate this, we collected and processed 30,122 images and texts of Instagram posts related to the hashtag #traveltheworld, showing the results of the most relevant user interests, places, emotions, and other detected features.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Social media is essential because social networks have made everybody a potential author, so the language is now closer to the user than to any prescribed norm. In this way, users share information about events, activities, services, opinions, and experiences on social media channels.</p><p>Instagram is a social network that has experienced a rapid increase in users and picture uploads since it was launched in October 2010. However, few research works have been developed around it, in contrast to other social networks like Twitter, where text is analyzed as the main element of posts.</p><p>Ninety million photos are shared every day through Instagram. Furthermore, users add other features such as hashtags, locations, and text to photos through the platform. These media elements communicate the user's intention behind posting an image but do not necessarily describe the published image <ref type="bibr" target="#b0">[1]</ref>. Also, concerning hashtags, several researchers suggest they carry emotional information which is not directly related to the context in which they appear <ref type="bibr" target="#b1">[2]</ref>.</p><p>Hashtags are single words or unbroken strings of words preceded by the # symbol. Businesses also use them to create searchable content categories and to gain followers by attracting the attention of public users. Instagram encourages users to make hashtags both specific and relevant, rather than tagging generic words, to make photographs stand out and to attract like-minded Instagram users.</p><p>Obtaining all possible information from these Instagram posts is essential for gaining user insights, measuring brand reputation, and other important aspects of digital market research in several industries, such as tourism, travel, hospitality, and customer services, among others. 
It also helps businesses evaluate campaigns, understand users' social behavior, and avoid costly direct surveys.</p><p>The main contribution of this work is a methodology to identify the relevant topics, locations, sentiments, and features from a combination of text and pictures associated with a particular hashtag, by combining text mining techniques, sentiment analysis, natural language processing, and computer vision tools. The methodology was applied to a dataset of Instagram photos associated with the hashtag #traveltheworld. This popular hashtag refers to more than 15 million posts and is used by travelers to discover new destinations, swap travel tips, and share their experiences.</p><p>The rest of this work is structured as follows: Section 2 describes the related work, Section 3 presents the proposed methodology, Section 4 describes the results of the case study analysis, and Section 5 presents the conclusions and future work.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>A few researchers have investigated different ways to detect relevant content topics from Instagram pictures. Hu et al. <ref type="bibr" target="#b2">[3]</ref> analyzed the photos of a random sample of users by considering the users' text. Then, the similarity between pictures was calculated as the Euclidean distance between their codebook vectors, and k-means was applied to obtain clusters of photos. This work identified eight popular picture categories (friends, food, gadgets, captioned photos, pets, activities, selfies, fashion) and five distinct types of users in terms of their posted pictures.</p><p>Jang et al. <ref type="bibr" target="#b3">[4]</ref> analyzed the relationship between LDA-based topics and Likes on test datasets of 20 million users and their 2 billion LDA-based topics. This work applies a Latent Dirichlet Allocation model over the description text and hashtags written by users. As a result, they identified 20 latent topics prevalent among hashtags added to pictures and presented the top 5 topics.</p><p>Amanatidis et al. <ref type="bibr" target="#b4">[5]</ref> performed a picture analysis and categorization of the personal experiences of users before, during, and after the COVID-19 vaccination process. For this purpose, they used convolutional neural network models for computer vision and datasets from ImageNet.</p><p>Manikonda <ref type="bibr" target="#b5">[6]</ref> concluded that Twitter content is mostly informational, while Instagram content is more personal and social. To reach this conclusion, the researchers performed a textual and visual analysis of the media content posted on these two platforms by the same set of users. 
Our paper differs from those mentioned because it uses a multidisciplinary combination of techniques (computer vision and natural language processing) and validates which one provides better results depending on the objectives to be achieved, in this case focused on tourist experiences and user demographics. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>First, the topics, locations, sentiments, and demographic information were detected following the steps shown in Figure <ref type="figure" target="#fig_1">1</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data Collection</head><p>A scraping process developed in Python with the BeautifulSoup and Selenium libraries was applied to collect a dataset of 50,510 publications related to the hashtag #traveltheworld from the Instagram platform. These data include the following features per publication: image file, post id, user id, hashtags, upload date, post text, locations, and likes count.</p><p>Then, a sample of 30,122 photos was selected from user accounts with an average of at least 150 likes and 100 followers, to avoid downloading photos that belong to fake accounts. The hashtags and the text were taken as post descriptions for this work. The Microsoft Cognitive Services API also recognizes natural and man-made landmarks worldwide by comparing them to a library of known places. Figure <ref type="figure">4</ref> shows an example of a response once the recognition process is applied over a landmark photo. </p></div>
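<div xmlns="http://www.tei-c.org/ns/1.0"><p>The account-filtering step above can be sketched as follows. This is a minimal illustration; the field names avg_likes and followers are assumptions, not the authors' actual schema.</p>

```python
# Sketch of the sampling step: keep only posts from accounts that meet
# the like and follower thresholds described above (illustrative;
# "avg_likes" and "followers" are assumed field names).
def select_posts(posts, min_avg_likes=150, min_followers=100):
    """Keep only posts from accounts unlikely to be fake."""
    return [
        p for p in posts
        if p["avg_likes"] >= min_avg_likes and p["followers"] >= min_followers
    ]

posts = [
    {"post_id": 1, "avg_likes": 300, "followers": 1200},
    {"post_id": 2, "avg_likes": 20,  "followers": 50},   # likely fake, dropped
    {"post_id": 3, "avg_likes": 180, "followers": 400},
]
sample = select_posts(posts)
```

<p>In practice this filter would run over the scraped dataset before any image download, so that only the 30,122-photo sample proceeds to the analysis stages.</p></div>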
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Terms Detection</head><p>Some text-mining processes were applied to the documents to determine the most relevant topics. First, a data preprocessing <ref type="bibr" target="#b6">[7]</ref> step was executed separately for the post descriptions and the visual description files, with the following steps:</p><p>• Each document was transformed into words (lexical analysis).</p><p>• Empty words (articles, prepositions, marks, conjunctions, numbers, punctuation, and other words that did not semantically describe the content) were deleted.</p><p>• A stemming process was executed, in which non-essential parts of terms, such as suffixes and prefixes, were eliminated to keep the essential part (stem) of each term.</p><p>Second, the TF-IDF (Term Frequency-Inverse Document Frequency) model <ref type="bibr" target="#b7">[8]</ref> was applied to evaluate the key terms in the documents. TF-IDF measures the weight of a term based on the term frequency (TF) and the inverse document frequency (IDF). Then, a document-term matrix was created with the TF-IDF weights, and the dispersed terms were deleted to conserve the most relevant terms.</p></div>
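<div xmlns="http://www.tei-c.org/ns/1.0"><p>The preprocessing and TF-IDF weighting steps above can be sketched in plain Python. This is a toy illustration with a made-up stop-word list; the stemming step is omitted, and in practice a library such as NLTK or scikit-learn would be used.</p>

```python
import math
from collections import Counter

# Toy stop-word list standing in for the "empty words" described above.
STOPWORDS = {"the", "a", "of", "and", "in", "on"}

def tokenize(doc):
    # Lexical analysis plus empty-word removal (no stemming in this sketch).
    return [w for w in doc.lower().split() if w.isalpha() and w not in STOPWORDS]

def tfidf(docs):
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        # TF-IDF weight = term frequency * log(inverse document frequency).
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

docs = ["the mountain and the hill", "a boat on the lake", "mountain lake view"]
w = tfidf(docs)
```

<p>Each entry of `w` is one row of the document-term matrix; dropping terms whose weight falls below a threshold corresponds to deleting the dispersed terms.</p></div>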
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Topic Modeling</head><p>Topic modeling is a text mining technique that employs unsupervised and supervised statistical machine-learning techniques to identify patterns in a corpus or a large amount of unstructured text. It can take a vast collection of documents and group the words into clusters, identifying topics through a process of similarity.</p><p>We applied Non-negative Matrix Factorization (NMF) to determine the relevant topics in both document corpora. NMF <ref type="bibr" target="#b8">[9]</ref> is a linear-algebra optimization algorithm that extracts meaningful information about topics by decomposing the document-term matrix A into two k-dimensional factors: W (document-topic matrix) and H (topic-term matrix).</p></div>
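<div xmlns="http://www.tei-c.org/ns/1.0"><p>The factorization idea can be sketched with multiplicative updates on a toy document-term matrix. The paper does not specify a solver; in practice a library implementation such as scikit-learn's NMF would be used, and the matrix below is made up for illustration.</p>

```python
import numpy as np

# Toy NMF by multiplicative updates: A (docs x terms) is factored into
# W (docs x topics) and H (topics x terms), both kept non-negative.
def nmf(A, k, iters=500, eps=1e-9):
    rng = np.random.default_rng(0)
    n_docs, n_terms = A.shape
    W = rng.random((n_docs, k)) + eps
    H = rng.random((k, n_terms)) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Tiny document-term matrix: two clear "topics" (terms 0-1 vs. terms 2-3).
A = np.array([[3.0, 2.0, 0.0, 0.0],
              [6.0, 4.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 3.0],
              [0.0, 0.0, 4.0, 6.0]])
W, H = nmf(A, k=2)
```

<p>Rows of H give the term weights per topic (the topic-term matrix), and rows of W give each document's mixture of topics.</p></div>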
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Sentiment Analysis</head><p>Sentiment analysis is a technique that uses natural language processing to identify, extract, quantify, and explore affective states and subjective information from text. Generally, sentiment analysis uses a text classification approach based on machine learning.</p><p>Text classification assumes that each sample is assigned to one and only one label. On the other hand, multi-label classification assigns to each sample a set of target labels that are not mutually exclusive. However, many text multi-label classification methods ignore the word order, opting to use bag-of-words models or TF-IDF weighting to create document vectors.</p><p>Convolutional neural networks (CNN) utilize layers with convolving filters that are applied to local features <ref type="bibr" target="#b9">[10]</ref>. Initially invented for computer vision, CNN models are adequate for NLP and have achieved excellent results in semantic parsing. Kim and Berger <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12]</ref> demonstrated that CNN models using semantic word embeddings such as Word2Vec <ref type="bibr" target="#b12">[13]</ref> significantly outperform the Binary Relevance method with bag-of-words features on large-scale multi-label classification.</p><p>We designed a simple CNN composed of an input layer with five different n-gram window sizes and one convolution layer on top of word vectors obtained from the Word2Vec unsupervised neural language model. These vector representations are essentially feature extractors that encode words' semantic features in their dimensions. To conduct the experiment, we first trained on a dataset provided by FigureEight<ref type="foot" target="#foot_1">2</ref>, which contains approximately 19,000 tweets labeled with neutral, positive, and negative sentiment categories.</p></div>
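<div xmlns="http://www.tei-c.org/ns/1.0"><p>The convolution-over-n-grams idea can be illustrated with a toy numpy sketch: 1-D convolutions with five window sizes over word embeddings, each followed by max-over-time pooling. This is illustrative only; a real model would learn the filters (e.g. in a deep-learning framework), whereas here they are random.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim, sent_len, n_filters = 8, 10, 4
embeddings = rng.standard_normal((sent_len, embed_dim))  # one toy sentence

features = []
for window in (2, 3, 4, 5, 6):                 # five n-gram window sizes
    # Random filters standing in for learned convolution weights.
    filters = rng.standard_normal((n_filters, window * embed_dim))
    pooled = np.full(n_filters, -np.inf)
    for i in range(sent_len - window + 1):     # slide the window over words
        ngram = embeddings[i:i + window].ravel()
        pooled = np.maximum(pooled, filters @ ngram)  # max-over-time pooling
    features.append(pooled)

feature_vector = np.concatenate(features)      # input to the classifier head
```

<p>The concatenated pooled features would then feed a small dense layer with a softmax over the neutral, positive, and negative classes.</p></div>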
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Emotion Recognition</head><p>The Face API allows the detection of human faces together with facial attributes that contain predictions of facial features based on machine learning. The available facial attributes include age, emotion, gender, and pose, among others. The API also integrates emotion recognition and returns a degree of confidence for a set of emotions for each detected face. The process is applied to the set of Instagram photos that, during the image recognition process, refer to some of the values "man", "men", "woman", or "women". For each photo, the emotional response with the highest score is compared to the emotion classified manually by observers (ground truth). Figure <ref type="figure" target="#fig_5">5</ref> shows a response from the Face API.</p></div>
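<div xmlns="http://www.tei-c.org/ns/1.0"><p>Selecting the highest-scoring emotion from a Face API-style response can be sketched as follows. The JSON below is a hand-made example shaped like an Azure Face detection result, not a real API response.</p>

```python
import json

# Hand-made sample response (assumed shape, for illustration only).
response = json.loads("""
[{"faceAttributes": {"age": 29.0, "gender": "female",
  "emotion": {"anger": 0.0, "contempt": 0.0, "disgust": 0.0, "fear": 0.0,
              "happiness": 0.92, "neutral": 0.06, "sadness": 0.01,
              "surprise": 0.01}}}]
""")

def dominant_emotion(face):
    """Return the emotion with the highest confidence score."""
    scores = face["faceAttributes"]["emotion"]
    return max(scores, key=scores.get)

top = dominant_emotion(response[0])
```

<p>The per-photo value of `top` is what gets compared against the manually labeled ground-truth emotion.</p></div>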
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Relevant Terms and Topics</head><p>The Computer Vision API was applied over 30,122 images and detected 1,816 unique terms related to the images' visual contents. After the preprocessing routines, 1,801 terms (99.12%) were conserved for the following analysis. Figure <ref type="figure">6</ref> shows a word cloud with the most relevant terms related to the visual content of the images. Figure <ref type="figure">7</ref> shows the most frequent terms with the highest TF-IDF weights. They are building, groups, people, water, person, mountain, woman, cities, beaches, and streets, among others. These terms have a TF-IDF weight greater than 12,500 and suggest that most of the pictures are related to building structures, people, urban cities, sports activities, and natural tourism attractions. Table <ref type="table" target="#tab_0">1</ref> presents the terms most associated with the key terms "mountain", "woman", "water", "building", "people", and "city"; for example, the term "mountain" is related to hills, nature, background, view, and field. The key terms were set considering the most frequent terms illustrated in Figure <ref type="figure">7</ref>. Associated terms have a correlation, a quantitative measure between 0 and 1 of the co-occurrence of words across documents. In this respect, if two terms always appear together, the calculated correlation is 1.</p><p>Using NMF, we detected the most relevant topics of the visual descriptions. They are shown in Table <ref type="table" target="#tab_1">2</ref> and refer to natural landscapes, people's actions, cities and buildings, the sea and related activities, food, and other outdoor photos. Next, a corpus of 24,719 documents and 21,972 terms was created from the Instagram posts. After preprocessing, 18,810 terms (85.61%) were conserved for the topic modeling. The relevant topics of the user descriptions are shown in Figure <ref type="figure" target="#fig_7">8</ref>. 
These topics refer to events, exclamations of admiration, visits to specific tourist sites, emotions, and engagements. In order to ensure coherent content and to eliminate redundancy in the topic terms, we reduced the hashtags related to travel. Figure <ref type="figure" target="#fig_7">8</ref> shows the most frequent terms, with TF-IDF weights greater than 800.</p><p>On the other hand, Table <ref type="table" target="#tab_2">3</ref> shows the topics of the text content written by users. These topics refer to travelers' stories, expressions of admiration, and social media engagements. The average cosine distance between the topics mined from the users' descriptions and those from the visual descriptions was 0.290, which means there is low similarity between both corpora. Therefore, the user descriptions do not allow us to identify the features and elements of the images in a specific way, because they refer to narrations of events, situations, or opinions related to the photos.</p></div>
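<div xmlns="http://www.tei-c.org/ns/1.0"><p>The topic-comparison step can be sketched as cosine similarity over term-weight vectors. The weights below are toy values, not the paper's data, and distance is taken as one minus similarity.</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two topics given as term-weight dictionaries."""
    dot = sum(a[t] * b[t] for t in set(a).intersection(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

# Toy topics: one from visual descriptions, one from user-written text.
visual_topic = {"city": 0.9, "building": 0.8, "street": 0.6}
text_topic   = {"view": 0.9, "city": 0.4, "beautiful": 0.7}
distance = 1.0 - cosine_similarity(visual_topic, text_topic)
```

<p>Averaging this distance over all topic pairs gives a single figure comparable to the 0.290 value reported above.</p></div>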
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Locations and Landmarks</head><p>Geolocations were added by users in 19,782 (65.69%) of the Instagram posts, so the locations for the remaining photos were detected using the Computer Vision API landmark properties. A total of 2.26% of the pictures were retrieved by this method. Table <ref type="table" target="#tab_3">4</ref> shows the identified places with the highest counts in the Instagram photos.</p><p>The identified locations include famous monuments and buildings, such as the Eiffel Tower, the Sagrada Familia, the Pantheon in Rome, Grand Central Terminal, the Brooklyn Bridge, and the Trevi Fountain, among others, which were mapped to their specific city or country through the GeoPy tool<ref type="foot" target="#foot_2">3</ref>. These values can be contrasted with information from TripAdvisor, the largest travel website in the world, where Paris, New York, London, Rome, Barcelona, Bali, and Prague, among others, are mentioned as the most popular locations in the world. Therefore, the results presented in Table <ref type="table" target="#tab_3">4</ref> could be a good reference for worldwide tourism statistics.</p></div>
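<div xmlns="http://www.tei-c.org/ns/1.0"><p>Building a ranking like Table 4 from the collected location strings can be sketched with a simple counter. The data below are toy values; the landmark-to-city normalization (done with GeoPy in this work) is assumed to have already happened.</p>

```python
from collections import Counter

# Toy list of normalized "City, Country" strings, one per geolocated post.
locations = ["Paris, France", "Paris, France", "Rome, Italy",
             "Paris, France", "Rome, Italy", "Bali, Indonesia"]
total_posts = 10   # includes posts without a detected location

# Rank places by count and express each count as a share of all posts.
ranking = [
    (place, count, 100.0 * count / total_posts)
    for place, count in Counter(locations).most_common()
]
```

<p>Summing the percentage column over the listed places gives the overall coverage figure reported at the bottom of Table 4.</p></div>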
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">User Demographics</head><p>We used a scraping process to retrieve a total of 17,752 unique user profile photos from the Instagram posts. The Face API process was applied to the profile photo collection to recognize facial properties. Once the process finished, we selected the photos with an exposure value greater than 0.5 for which the gender and age properties could be detected: in total, 5,560 (31.32%).</p><p>The rest of the user profile photos, among other reasons, did not show the face of the user, belonged to business profiles, or had low quality that did not allow identification of the gender and age properties. Table <ref type="table" target="#tab_4">5</ref> shows the percentages of the user gender groups, and Table <ref type="table">6</ref> shows the percentages of the user groups by age range: </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Emotion Recognition and Text Sentiment Analysis</head><p>An ideal visual experience on the Instagram social network happens when the sentiment and emotions transmitted by the text and the photo(s) or video(s) are similar. Classifying emotions in publications requires a lot of effort and manual work from experienced teams. Therefore, emotion recognition and text sentiment analysis can help predict the emotions of a social media post.</p><p>A sample of 114 photos referring to a person with a visible face was taken. The feelings expressed in the images were automatically classified using the Face API into the following categories: anger, disgust, fear, joy, sadness, and surprise. In addition, we used our Word2Vec model to classify the sentiment found in the text of the users' IG publications. Figure <ref type="figure" target="#fig_8">9</ref> shows the sentiment and emotion percentages, where joy is the most frequent emotion in people's photos, and neutral is the most frequent sentiment in the text content. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions</head><p>The proposed methodology allows us to obtain useful inferred information from any collection of publications associated with a particular hashtag on Instagram or other social networks, at low cost and effort.</p><p>There is low similarity between the topics mined from the content written by users (usually tourists) and the visual descriptions obtained from the photos, because users generally refer to situations or opinions regarding the photos. In contrast, the visual analysis produces tags more related to the actual content of the images. We can also determine that the emotions transmitted in Instagram posts are better predicted using photos instead of the text written by users, but only when a quality image containing a face detected with high confidence is available.</p><p>The results of the most frequent worldwide photo locations are similar to the most popular places on TripAdvisor. For this reason, the methodology of this work can be helpful in areas such as digital marketing, market research, opinion polls, social studies, and other fields. Also, the findings can be valuable for decision-making, creating new marketing strategies, and other studies such as consumer profile analysis, as well as being complementary to textual content from social network reports and third-party social listening platforms.</p><p>In future work, we will consider exploring the visual content of Stories and Reels, and the text comments on user descriptions, to evaluate whether they improve the prediction values obtained using text.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Proposed methodology.</figDesc><graphic coords="3,151.80,147.54,291.68,283.43" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Instagram post sample.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>Figure 3 shows how computer vision API returns information about the visual content of an image.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 3 : Figure 4 :</head><label>34</label><figDesc>Figure 3: Visual content response from an IG photo.</figDesc><graphic coords="4,110.13,430.10,375.01,213.66" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Face API Response.</figDesc><graphic coords="7,130.96,84.19,333.36,192.77" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 6 : Figure 7 :</head><label>67</label><figDesc>Figure 6: Wordcloud of relevant terms.</figDesc><graphic coords="7,151.80,422.88,291.68,219.05" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Most frequent terms of visual descriptions.</figDesc><graphic coords="9,110.13,406.64,375.01,237.06" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: Sentiment and Emotion Percentages.</figDesc><graphic coords="12,89.29,84.19,416.69,173.52" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="5,110.13,84.19,375.01,229.76" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0"><head></head><label></label><figDesc></figDesc><graphic coords="8,110.13,84.19,375.01,237.06" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Correlation Terms detected in datasets.</figDesc><table><row><cell>Terms</cell><cell></cell><cell cols="2">Correlation Terms</cell><cell></cell></row><row><cell>Mountain</cell><cell>Hill</cell><cell>Nature</cell><cell>Background</cell><cell>Rocky</cell></row><row><cell></cell><cell>0.58</cell><cell>0.53</cell><cell>0.44</cell><cell>0.38</cell></row><row><cell>Woman</cell><cell>Young</cell><cell>Person</cell><cell>Girl</cell><cell>Wearing</cell></row><row><cell></cell><cell>0.68</cell><cell>0.66</cell><cell>0.55</cell><cell>0.42</cell></row><row><cell>Water</cell><cell>Boat</cell><cell>Ocean</cell><cell>River</cell><cell>Lake</cell></row><row><cell></cell><cell>0.62</cell><cell>0.57</cell><cell>0.53</cell><cell>0.52</cell></row><row><cell>Building</cell><cell>Old</cell><cell>Clock</cell><cell>Street</cell><cell>Tower</cell></row><row><cell></cell><cell>0.46</cell><cell>0.45</cell><cell>0.43</cell><cell>0.38</cell></row><row><cell>People</cell><cell>Group</cell><cell>Walking</cell><cell>Crowd</cell><cell>Man</cell></row><row><cell></cell><cell>0.70</cell><cell>0.30</cell><cell>0.26</cell><cell>0.24</cell></row><row><cell>City</cell><cell>Street</cell><cell>Traffic</cell><cell>Tall</cell><cell>Clock</cell></row><row><cell></cell><cell>0.55</cell><cell>0.50</cell><cell>0.41</cell><cell>0.34</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Topics of visual descriptions from IG Photos.</figDesc><table><row><cell>Topic</cell><cell>Terms</cell></row><row><cell>1</cell><cell>city, building, street, front, clock, tower, tall, old, large, sign.</cell></row><row><cell>2</cell><cell>body, water, boat, ocean, beach, lake, doc, river, large, sunset.</cell></row><row><cell>3</cell><cell>person, woman, young, hold, wear, pose, man, front, girl, standing.</cell></row><row><cell>4</cell><cell>mountain, field, hill, grass, green, tree.</cell></row><row><cell>5</cell><cell>table, sit, food, plate, room, close, wooden, white, indoor, cake.</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Topics of text content written by users from IG Posts.</figDesc><table><row><cell>Topic</cell><cell>Terms</cell></row><row><cell>1</cell><cell>view, top, stunning, enjoy, city, nice, room, beautiful, sea, hotel, climb.</cell></row><row><cell>2</cell><cell>tag, follow, friend, like, comment, someone, photo, credit, share, picture.</cell></row><row><cell>3</cell><cell>travel, world, happy, destination, capture, blog, escape, explore, live, inspiration.</cell></row><row><cell>4</cell><cell>day, beautiful, love, place, time, good, life, see, back, world, sunset.</cell></row><row><cell>5</cell><cell>love, fall, madly, city, hubby, guess, place, live want, someone, people.</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 4</head><label>4</label><figDesc>Popular Places Ranking from IG posts linked to hashtag #traveltheworld.</figDesc><table><row><cell>Location</cell><cell>Count</cell><cell>Percentage</cell></row><row><cell>Paris, France</cell><cell>214</cell><cell>1.08%</cell></row><row><cell>New York, New York</cell><cell>172</cell><cell>0.87%</cell></row><row><cell>London, United Kingdom</cell><cell>122</cell><cell>0.61%</cell></row><row><cell>Rome, Italy</cell><cell>111</cell><cell>0.56%</cell></row><row><cell>Barcelona, Spain</cell><cell>110</cell><cell>0.55%</cell></row><row><cell>Bali, Indonesia</cell><cell>109</cell><cell>0.55%</cell></row><row><cell>Prague, Czech Republic</cell><cell>96</cell><cell>0.48%</cell></row><row><cell>Amsterdam, Netherlands</cell><cell>95</cell><cell>0.48%</cell></row><row><cell>Iceland</cell><cell>69</cell><cell>0.35%</cell></row><row><cell>Venice, Italy</cell><cell>68</cell><cell>0.34%</cell></row><row><cell>Los Angeles, California</cell><cell>66</cell><cell>0.33%</cell></row><row><cell>Lisbon, Portugal</cell><cell>62</cell><cell>0.31%</cell></row><row><cell>Istanbul, Turkey</cell><cell>57</cell><cell>0.29%</cell></row><row><cell>San Francisco, California</cell><cell>54</cell><cell>0.27%</cell></row><row><cell>Berlin, Germany</cell><cell>51</cell><cell>0.26%</cell></row><row><cell>Total</cell><cell></cell><cell>7.33%</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 5</head><label>5</label><figDesc>Gender Percentages.</figDesc><table><row><cell>Gender</cell><cell>Count</cell><cell>Percentage</cell></row><row><cell>Female</cell><cell>3575</cell><cell>64.30%</cell></row><row><cell>Male</cell><cell>1985</cell><cell>35.70%</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Microsoft Cognitive Services https://azure.microsoft.com/es-es/services/cognitive-services/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">FigureEight https://www.figure-eight.com/wp-content/uploads/2016/07/text_emotion.csv</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">GeoPy https://geopy.readthedocs.io/en/latest/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Using hashtags to capture fine emotion categories from tweets</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Mohammad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kiritchenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Intelligence</title>
		<imprint>
			<biblScope unit="volume">31</biblScope>
			<biblScope unit="page" from="301" to="326" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">F</forename><surname>Kunneman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Liebrecht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Van Den</surname></persName>
		</author>
		<author>
			<persName><surname>Bosch</surname></persName>
		</author>
		<title level="m">The (un) predictability of emotional hashtags in twitter</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Manikonda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kambhampati</surname></persName>
		</author>
		<title level="m">What we instagram: A first analysis of instagram photo content and user types</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">No reciprocity in&quot; liking&quot; photos: analyzing like activities in instagram</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">Y</forename><surname>Jang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th ACM conference on hypertext &amp; social media</title>
				<meeting>the 26th ACM conference on hypertext &amp; social media</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="273" to="282" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Mining textual and imagery Instagram data during the COVID-19 pandemic</title>
		<author>
			<persName><forename type="first">D</forename><surname>Amanatidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Mylona</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kamenidou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mamalis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Stavrianea</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">4281</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Manikonda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">V</forename><surname>Meduri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kambhampati</surname></persName>
		</author>
		<title level="m">Tweeting the mind and Instagramming the heart: Exploring differentiated content sharing on social media</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Data mining and knowledge discovery</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">J</forename><surname>Cios</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Pedrycz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">W</forename><surname>Swiniarski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Data mining methods for knowledge discovery</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="1998">1998</date>
			<biblScope unit="page" from="1" to="26" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jurafsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">H</forename><surname>Martin</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2000">2000</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Non-negative matrix factorization with sparseness constraints</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">O</forename><surname>Hoyer</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">5</biblScope>
			<date type="published" when="2004">2004</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Gradient-based learning applied to document recognition</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Lecun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Bottou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bengio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Haffner</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the IEEE</title>
		<imprint>
			<date type="published" when="1998">1998</date>
			<biblScope unit="volume">86</biblScope>
			<biblScope unit="page" from="2278" to="2324" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Convolutional neural networks for sentence classification</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Kim</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<title level="m" type="main">Large scale multi-label text classification with semantic word vectors</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Berger</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
		<respStmt>
			<orgName>Stanford University</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Technical report</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Comparison of term frequency and document frequency based feature selection metrics in text categorization</title>
		<author>
			<persName><forename type="first">N</forename><surname>Azam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Yao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Expert Systems with Applications</title>
		<imprint>
			<biblScope unit="volume">39</biblScope>
			<biblScope unit="page" from="4760" to="4768" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
