=Paper=
{{Paper
|id=Vol-3282/icaiw_aiesd_6
|storemode=property
|title=Public Safety Perception in Ecuador: An Approach from Social Networks over Data Analytics
|pdfUrl=https://ceur-ws.org/Vol-3282/icaiw_aiesd_6.pdf
|volume=Vol-3282
|authors=Maria de Lourdes Díaz,Jorge Berrezueta,Gonzalo Albán Molestina,Andres Ortega
|dblpUrl=https://dblp.org/rec/conf/icai2/DiazBMO22
}}
==Public Safety Perception in Ecuador: An Approach from Social Networks over Data Analytics==
Maria de Lourdes Díaz, Jorge Berrezueta, Gonzalo Albán Molestina and
Andres Ortega*
Universidad Ecotec, Samborondón, Ecuador
Abstract
In Ecuador, insecurity and crime are topics that generate a great deal of commotion on social networks.
The data provided by national governments is not contrasted with what happens in public
opinion. Today, information is highly sensitive because of the use of social networks, so a data
analytics tool is sought to measure citizens' perception of insecurity. Based on the metadata
offered by the Twitter API, we collect this information with an algorithm based on natural language
processing (NLP) written in Python, and we generate a statistical report to understand the context of
citizen perception. The results show a high correlation among security factors such as theft,
corruption and crime at the regional and territorial level, which affect the cultural, political and
economic development of cities and countries.
Keywords
Social Networks, Data Analytics, Homicides, Public policies, NLP
1. Introduction
Social media allows users to create a profile, navigate, connect, and communicate with other
users through private or public messaging [1]. Social media was originally designed as a space
for gathering thoughts. These platforms were built to be entertaining, with algorithms
fine-tuned to display relevant content to each user based on their previous history and
interactions with content from other users. Over time, social media experienced a growing shift
toward other uses, gaining trust and market share as the primary news source for many users,
even though content is usually not filtered and does not exclude other users' subjective
thoughts on a given topic or situation [2].
Information on the social network Twitter is abundant and available, and it gives rise
to fresh opinions on current contexts [3]. On this platform users not only
post tweets, but can also receive responses or interactions. "Retweeting"
is a way of spreading a message, regardless of its veracity, that users can use as a method of
interaction. Frequently, this interaction is found in messages of social and political information,
ICAIW 2022: Workshops at the 5th International Conference on Applied Informatics 2022, October 27–29, 2022, Arequipa,
Peru
* Corresponding author
Email: mariadiaz@est.ecotec.edu.ec (M. d. L. Díaz); joberrezueta@est.ecotec.edu.ec (J. Berrezueta);
galban@ecotec.edu.ec (G. Albán Molestina); aortegao@ecotec.edu.ec (A. Ortega)
ORCID: 0000-0002-9141-2048 (A. Ortega)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
Maria de Lourdes Díaz et al. CEUR Workshop Proceedings 114–124
Figure 1: Semi-annual homicides in Ecuador (number of homicides per month, January to June, for the years 2018 to 2022)
news, opinions and very commonly controversial topics to generate a representation of the
current perception of people [4].
This research starts by treating crime, violence and delinquency as synonymous terms
that affect citizen security, terms that indicate a deviation in the behavior of individuals within
a society as well as the violation of established rules and codes [5]. Latin America is considered
the most violent region in the world, which may be related to factors such as the rapid
conversion from rural to urban life, producing a standard of living that requires large investments,
inefficient public social services and evident economic inequality [6].
Insecurity in Ecuador is a social problem that afflicts all its citizens. There are different forms
of citizen insecurity that surround the country. Criminal events are internalized by not obeying
legal and moral norms. Exposure to the use of violence generates an expansion of criminal acts
as a way of solving the problems that society may be going through.
In the last 5 years, official reports have focused mainly on the number of robberies and homicides.
The Ministry of Government publishes on its information portal the six-month statistics
for the last 5 years. Figure 1 shows the evolution of these incidents, which indicates that
between 2021 and 2022 there has been a greater number of intentional homicides. However, the
reported data is usually not up to date. For example, the State Attorney General's Office, which
receives complaints about events of this type, filed its last report of homicides and robberies
for the period "January-November 2021".
The impact of the insecurity crisis is felt in the deterioration of the quality of life. These
difficulties give victims a new perception of the current state of security, as they
change their habits and daily routines in order to avoid being involved in a
dangerous situation [7].
Insecurity is associated with fear and concern, mainly affecting the calm a citizen needs to
function smoothly within a society. Fear of crime and feelings of insecurity regarding one's
environment have an impact on the quality of life of citizens, especially when feelings of fear
become excessive, with consequences that can lead to health problems, as it is argued that
general anxiety is significantly related to fear of crime [8]. These factors lead citizens
to limit their decision-making in multiple areas such as consumption, investment
and mobility, directly affecting the social and economic development of our country [9].
The perception of security is a variable that subjectively indicates concern among citizens,
since it seeks to measure the fear of danger that Ecuadorians feel day to day. But beyond
perceptions, there is no clarity on the existence of tools or strategies to reduce
and control insecurity. This data can give citizens a certain notion of security about what is
happening in the country [10].
However, this does not invalidate the evident increase in the number of crimes, which
increases the perception of insecurity with respect to victimization. Ecuador
carried out a single national survey on the perception of insecurity, in 2011. Because this data
has not been updated, there are numerous unknowns about the current levels of victimization, which
generates distrust in the public institutions in charge of security, such as the Ecuadorian police [11].
A study in Mexico analyzed the relationships between victimization,
perception of insecurity and changes in routines through an adaptation of the National Survey
of Victimization and Public Security. It provided important data on the
subject, for example, that women and men who were victims of crime reported restrictions in
daily life [7].
How can we measure the perception of security in the country with available resources?
Currently there is no national methodology for measuring the perception of insecurity
through information on social networks. The content that travels on social networks
directly or indirectly affects the perception that people have of a wide variety of topics.
Although this information is usually not verified, it can change that perception
through opinions. The intangible social construction created through the use of
social networks defines keywords about the collective experience, in this case around numerous
violent acts, which citizens describe in text according to their own
perception. Social networks are a space where human behavior and free will take place
in real time, which makes them a rather convenient place to talk about concerns [12]. A study
carried out in Colombia on the perception of insecurity focused on the emotional response of a
person experiencing a crime situation. In that case surveys were not considered a convenient
tool, and attention turned to results based on information from social networks,
specifically Twitter [13].
Twitter is a social network where users share subjective content and convey human
habits through the opinions they post and read. This information can be extracted through the
Twitter API data access service to perform data analysis [14]. Another study, conducted in London,
used Twitter to explore and analyze patterns of reactions to homicides. This made it possible to
quantify information from people indirectly affected by the events and, based on location,
to analyze how fast the news spread [12].
Due to the need for a tool capable of measuring the effects of concern on citizens, this study
proposes the extraction and analysis of citizen perception variables using the Twitter social
network as the information base. Daily tweets are analyzed after 11 days of collection, with
appropriate filtering of the data based on insecurity keywords such as robbery and homicide,
which are contrasted with a free-expression survey on insecurity. These data are
processed and analyzed statistically, at the national and regional levels with the most important
cities in Ecuador (Guayaquil and Quito). A correlation analysis was carried out at the regional
level with the most important cities in the country, and social network activity showed high
levels of concern about crime in Guayaquil and about corruption in Quito and Guayaquil.
2. Materials and Methods
Initially, a survey of 200 respondents was conducted to determine the keywords that are most
influential in colloquial language, with the aim of measuring the respondents' concept of insecurity.
The words obtained, shown in Table 1, determine the search
criteria within the algorithm implemented in Python.
For the collection of the Twitter data, the words found most frequently in the
data set resulting from the survey were evaluated, as shown in Figure 2. Because the survey had
an open text field for the keywords, a tokenization process involving NLP (Natural
Language Processing) techniques was carried out, in which each survey response passed through
a text processing pipeline and lemmatization (obtaining its canonical form or lemma) using
the spaCy library together with the spaCy-Stanza module [15]; in this way we obtain the words with
the highest incidence. Because the format of the respondents' answers varies
constantly, it was necessary to implement tokenization rules through Algorithm 1.
This procedure first separates the words of each answer: by line if there are line breaks;
by comma if there are commas; or by spaces if they only contain spaces. Once the responses
Table 1
Most frequent terms related to unsafety: Ecuadorian slang
Survey Terms     Twitter Terms
robo             corrupción
asesinato        delincuencia
sicariato        robo
choro            robar
asalto           miedo
secuestro        muerte
violación        ladrón
ladrón           narcotráfico
delincuencia     droga
muerte           delito
droga            agresión
corrupción       peligro
extorsión        sicariato
miedo            asesinato
delito           asalto
peligro          extorsión
maltrato         secuestro
narcotráfico     violación
agresión         maltrato
robar            choro
Figure 2: Data collection related to insecurity words
(a) Survey Report (count): robo 53, sicariato 31, asesinato 29, choro 24, asalto 19, secuestro 17,
violacion 17, ladron 15, delincuencia 14, muerte 14, droga 13, corrupcion 12, extorsion 12, miedo 12,
delito 10, peligro 10, maltrato 9, narcotrafico 9, agresion 6, robar 6
(b) Twitter Report (number of tweets): corrupcion 4321, delincuencia 3649, robo 3232, robar 3212,
miedo 2712, muerte 1964, ladron 1353, narcotrafico 1340, droga 1220, delito 915, agresion 781,
peligro 652, sicariato 507, asesinato 444, asalto 442, extorsion 264, secuestro 227, violacion 204,
maltrato 170, choro 102
are processed, Workers are used, which rely on the "web" module of the Python
Pattern library [16] and apply search filters and the geographic location of the tweets. These workers
allow parallel interaction with the Twitter API, maintaining the original structure of the tweets.
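The parallel interaction of the workers can be illustrated with a minimal sketch. The paper does not state how the parallelism is implemented, so the thread pool below is an assumption, and `fetch_tweets` is a hypothetical stand-in that returns canned data instead of calling the Twitter API or the Pattern library.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_tweets(keyword, place):
    """Hypothetical stand-in for one search request to the Twitter API.

    A real worker would query the API here; this stub returns canned data
    so that the sketch runs offline.
    """
    return [{"text": f"ejemplo sobre {keyword}", "keyword": keyword, "place": place}]


def run_workers(keywords, places, max_workers=4):
    """Run one search per (keyword, place) pair in parallel threads."""
    jobs = [(keyword, place) for keyword in keywords for place in places]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the submitted jobs.
        batches = pool.map(lambda job: fetch_tweets(*job), jobs)
        return [tweet for batch in batches for tweet in batch]
```

Calling `run_workers(["robo", "miedo"], ["Quito", "Guayaquil"])` would issue four searches, one per keyword and city, and return the tweets unchanged, in line with keeping their original structure.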
For the storage phase, MongoDB was chosen. It is a NoSQL (non-relational)
database management system that stores data in the form of JSON (JavaScript Object Notation)
documents instead of tables of rows and columns [17]. This document-oriented NoSQL database
management system was selected for its speed compared to relational database managers, since it
has been shown that its storage and insertion speeds are greater than those of relational systems [18].
The responses generated by the Twitter API are used as input to be stored in a MongoDB
database. This data goes through a normalization process that separates it into different
collections (the MongoDB counterpart of tables) by user, content of the tweet and metadata of the
search carried out by the Workers. Before the tweets are stored, they are subjected to a
sanitization process, in which
Algorithm 1: Pipeline Steps
Step 1 Input Survey Data;
Step 2 Separate words by available separator;
Step 3 Convert the words to lower case;
Step 4 Remove spaces and punctuation marks;
Step 5 Lemmatize words;
Step 6 Remove accent marks;
Step 7 Output Tokens;
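The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper lemmatizes with spaCy and spaCy-Stanza, whereas the `LEMMAS` lookup here is a hypothetical stand-in that only marks where that step fits, and Step 4 is interpreted as trimming the spaces and punctuation around each word.

```python
import re
import unicodedata
from collections import Counter

# Hypothetical stand-in for the lemmatizer used in the paper
# (spaCy with spaCy-Stanza); it only shows where Step 5 happens.
LEMMAS = {"robos": "robo", "ladrones": "ladrón", "asaltos": "asalto"}


def split_response(response):
    """Step 2: split by line breaks, else by commas, else by spaces."""
    if "\n" in response:
        return response.split("\n")
    if "," in response:
        return response.split(",")
    return response.split(" ")


def strip_accents(word):
    """Step 6: drop combining marks (corrupción -> corrupcion)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def tokenize(response):
    """Steps 2 to 7 for a single survey response."""
    tokens = []
    for part in split_response(response):
        word = part.lower()                            # Step 3
        word = re.sub(r"^[\W_]+|[\W_]+$", "", word)    # Step 4: trim spaces, punctuation
        if not word:
            continue
        word = LEMMAS.get(word, word)                  # Step 5 (stand-in)
        tokens.append(strip_accents(word))             # Step 6
    return tokens                                      # Step 7


def keyword_counts(responses):
    """Count how often each token appears across all responses."""
    counts = Counter()
    for response in responses:
        counts.update(tokenize(response))
    return counts
```

Applying `keyword_counts` to the open-text answers yields a frequency table of the kind reported in Figure 2 (a); the sample lemmas above are illustrative only.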
symbols, labels, links and #hashtags are removed from the original content of the tweet, and the
result is stored in an additional field so as not to alter the word quantification process. The storage
process prevents overwriting of data so that a historical evaluation of each tweet can be carried out.
For this, a web service was developed that serves as a bridge between the collection workers
and the MongoDB database system. Figure 3 presents an overview of the system architecture, along
Figure 3: Architecture of the environment (components: Survey, NLP, Workers, Twitter API, Web Service, MongoDB, Data Analytics)
with the data flow of each component. All components of the system have a unidirectional flow,
except for the MongoDB database and the web service, which both send and
receive data. For the analytical process of this research, the web service was used as the data
source for the analysis graphs, since it acts as a mirror of the Twitter API and holds only a subset
of relevant data.
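The sanitization and normalization steps described in this section can be sketched as follows. This is a rough illustration, not the authors' code: the field names (`user`, `id`, `text`, `clean_text`) are assumptions, "labels" is interpreted here as @mentions, and plain dictionaries stand in for the MongoDB documents of each collection.

```python
import re


def sanitize(text):
    """Strip links, @mentions, #hashtags and leftover symbols from a tweet."""
    text = re.sub(r"https?://\S+", " ", text)   # links
    text = re.sub(r"[@#]\w+", " ", text)        # mentions and hashtags
    text = re.sub(r"[^\w\s]", " ", text)        # remaining symbols
    return " ".join(text.split())               # collapse whitespace


def normalize(raw_tweet, search):
    """Split one API response into user, tweet and search documents."""
    user_doc = {
        "user_id": raw_tweet["user"]["id"],
        "name": raw_tweet["user"]["name"],
    }
    tweet_doc = {
        "tweet_id": raw_tweet["id"],
        "user_id": raw_tweet["user"]["id"],
        "text": raw_tweet["text"],                    # original content, untouched
        "clean_text": sanitize(raw_tweet["text"]),    # additional sanitized field
    }
    search_doc = {"tweet_id": raw_tweet["id"], **search}
    return user_doc, tweet_doc, search_doc
```

Keeping the original text next to the sanitized copy mirrors the idea of storing the cleaned content in an additional field, so word counts can use `clean_text` while the raw tweet stays available.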
3. Results and Analysis
To understand the levels of security in a country and its main cities, it is not
enough to take the data provided by the national government as a reference; it is also important
to understand the feelings of the people and public opinion, since many events related
to insecurity remain hidden due to fear of repression, especially in a Latin American context
where culture and violence have a singular connotation. Figure 1, whose data is taken from
the national government, only reports homicides and deaths over a six-month period,
whereas insecurity has a deeper meaning: it could perhaps be measured through the perception
of feelings that generate panic or fear in citizens, and even through its impact on the economic
model of a country's productive matrix. Based on the words with the
highest number of occurrences in Figure 2, both in the survey report and in the Twitter report,
the number of tweets generated with the highest-incidence words was explored at the national and
regional level, over a weekly period due to the limitations of the Twitter API service,
as shown in Figure 4.
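The daily series plotted in Figure 4 amount to counting, for each keyword, the tweets that mention it on each date. A minimal sketch of that aggregation is given below; the `date` and `clean_text` field names are assumptions made for illustration, and the matching rule (whole-word match on accent-free text) is a simplification rather than the authors' exact filter.

```python
from collections import Counter, defaultdict

KEYWORDS = ("robo", "delincuencia", "muerte", "corrupcion", "miedo")


def daily_counts(tweets, keywords=KEYWORDS):
    """Count, per keyword, how many tweets mention it on each day.

    Each tweet is a dict with a 'date' (e.g. '08-14') and a 'clean_text'
    field; both field names are assumptions made for this sketch.
    """
    counts = defaultdict(Counter)
    for tweet in tweets:
        words = set(tweet["clean_text"].lower().split())
        for keyword in keywords:
            if keyword in words:
                counts[keyword][tweet["date"]] += 1
    return counts
```

Running the same function on tweets filtered by location would give the regional series for Guayaquil and Quito.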
When we analyze the social and political crisis in Ecuador, people identify robo (robbery),
delincuencia (crime), muerte (death), corrupción (corruption) and miedo (fear) as the words most
linked to insecurity. These curves depend on the events with the greatest impact on social
networks that arise over time. On August 14 there are approximately 600 tweets about a news item
that caused panic in the city of Guayaquil [19], which generated commotion among citizens
on social networks. This perception can be correlated to understand what happens in the
most important cities in the national territory, where we have obtained some relevant data
shown in Table 2.
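The paper does not state which correlation coefficient was used or exactly which series were paired. As a hedged illustration only, the sketch below assumes the Pearson coefficient computed between two daily tweet-count series for the same word (for example, the national series and a city series).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

A value close to 1 would mean that the two series rise and fall together from day to day, a value close to -1 that they move in opposite directions, and a value near 0 that there is no linear relationship.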
We further observe in Figure 4 (a) that crime and corruption are the words of greatest concern.
In other words, corruption is a factor that constantly affects our environment and can create a
perception of insecurity for investment in our national territory.
Guayaquil, Figure 4 (b), one of the most dangerous cities in the national territory,
Table 2
Correlation of terms related to unsafety
Words            Ecuador    Quito    Guayaquil
robo             -0.619     0.544     0.108
delincuencia     -0.715     0.171    -0.305
muerte           -0.578     0.202     0.017
corrupción        0.191     0.670     0.627
miedo            -0.133     0.609     0.279
Figure 4: Data Analytics of Unsafety Perception. Panels (a) Ecuador, (b) Guayaquil and (c) Quito plot the number of tweets per date (08-13 to 08-24) for the words robo, delincuencia, muerte, corrupcion and miedo.
maintains uniformity among all the selected words over the 10 days from August 14 to 24.
In the city of Quito, tweet activity is more marked, with the focus on corruption, as it
is a city surrounded by politics, followed by robbery and then crime. An increase in tweet
activity also coincides with the event of August 14. This suggests
that each city is a distinct reality due to its cultural and urban
context. On August 21, an act of aggression at a sporting event in the city of Quito went viral on
social networks, and the resulting increase in tweets is reported in
Figure 5. This word is clearly not the most common on social networks, but its peak should be
contrasted with the proportional increase in the number of tweets for the words most related to
security, which is visible from August 21 to 23 in Figures 4 (a), (b) and (c).
4. Conclusions
Through this study, the influence of real events on public perception and opinion in spaces of
open discussion is analyzed from the content of social networks. In addition, the topics with
the greatest social impact in terms of citizen insecurity in a given time have been identified
Figure 5: Analysis of aggression at national level (daily count of tweets for the word "agresion", 08-14 to 08-24, with a peak of 520 tweets)
through the quantification technique, by extracting tweets.
This data can support socioeconomic, geopolitical and productive studies of the cultural
change of the masses. The increase in criminal events within the country leads to the need to
implement public policies based on perception results.
The statistical analysis shows frequent fluctuations that depend heavily
on daily events. These fluctuations can also vary between the different subregions of the
country from day to day. The most important correlations in this study were for crime between
Ecuador and Guayaquil, and for corruption among Ecuador, Quito and Guayaquil. It was also observed
that each event on social networks can have an impact on all the words linked to insecurity.
Predicting a crime rate is very complex, since various criminal analyses have already confirmed
that crimes are unequally distributed in place, time and context. Such situations are
strongly driven by the environment, inequality, and lifestyle of Ecuadorian citizens, producing
a high rate of victimization and negative effects for the people whose lifestyles force them to
expose themselves to a higher level of risk.
The information that circulates on social networks reflects the perception and opinion
of people on various topics. Security is no exception: given the indicators of violence
and robbery in Ecuador, this issue generates spaces for citizen opinion.
In this research, the influence of networks on the perception of insecurity is verified, which
contributes to a diagnosis that allows strategies for its control to be built.
This tool could also allow an indicator to be determined
for measuring its evolution. The stochastic error that could be generated when fake
accounts seek, without basis, to alter this perception in order to generate chaos or stability should
also be taken into consideration prior to any conclusion; hence the importance of filtering
as thoroughly as possible so that the information processed comes mostly from real accounts.
References
[1] D. M. Boyd, N. B. Ellison, Social network sites: Definition, history, and scholarship, Journal
of Computer-Mediated Communication 13 (2007) 210–230. doi:10.1111/j.1083-6101.2007.00393.x.
[2] H. Gil de Zúñiga, N. Jung, S. Valenzuela, Social media use for news and individuals'
social capital, civic engagement and political participation, Journal of Computer-Mediated
Communication 17 (2012) 319–336. doi:10.1111/j.1083-6101.2012.01574.x.
[3] X. Wang, D. E. Brown, M. S. Gerber, Spatio-temporal modeling of criminal incidents using
geographic, demographic, and twitter-derived information, in: 2012 IEEE International
Conference on Intelligence and Security Informatics, IEEE, 2012, pp. 36–41.
doi:10.1109/ISI.2012.6284088.
[4] C. S. Park, B. K. Kaye, Expanding visibility on twitter: Author and message characteristics
and retweeting, Social Media+ Society 5 (2019) 1–10. doi:10.1177/2056305119834595.
[5] A. Alvarado, La sociología del crimen y la violencia en américa latina. un campo fragmentado,
Tempo Social 32 (2020) 67–107. doi:10.11606/0103-2070.ts.2020.175010.
[6] J. Albarracín, N. Barnes, Criminal violence in latin america, Latin American Research
Review 55 (2020) 397–406. doi:10.25222/larr.975.
[7] M. E. Ávila, B. Martínez-Ferrer, A. Vera, A. Bahena, G. Musitu, Victimization, perception
of insecurity, and changes in daily routines in mexico, Revista de Saúde Pública 50 (2016).
doi:10.1590/S1518-8787.2016050006098.
[8] I. D. Reid, S. Appleby-Arnold, N. Brockdorff, I. Jakovljev, S. Zdravković, Developing a model
of perceptions of security and insecurity in the context of crime, Psychiatry, psychology
and law 27 (2020) 620–636. doi:10.1080/13218719.2020.1742235.
[9] K. M. Ortega, S. L. Pino, Impacto social y económico de los factores de riesgo que
afectan la seguridad ciudadana en ecuador, Espacios 42 (2021) 52–70.
doi:10.48082/espacios-a21v42n21p04.
[10] M. Córdova Montúfar, Percepción de inseguridad: una aproximación transversal, Ciudad
Segura 15 (2007) 4–9.
[11] Instituto Nacional de Estadistica y Censos, Encuesta de victimización
y percepción de inseguridad 2011,
https://www.ecuadorencifras.gob.ec/encuesta-de-victimizacion-y-percepcion-de-inseguridad-2011/, 2011.
[12] O. Kounadi, T. J. Lampoltshammer, E. Groff, I. Sitko, M. Leitner, Exploring twitter to
analyze the public’s reaction patterns to recently reported homicides in london, PLoS ONE
10 (2015). doi:10.1371/journal.pone.0121848.
[13] L. F. Chaparro, C. Pulido, J. Rudas, J. Victorino, A. M. Reyes, C. Estrada, L. A. Narvaez,
F. Gómez, Quantifying perception of security through social media and its relationship
with crime, IEEE Access 9 (2021) 139201–139213. doi:10.1109/ACCESS.2021.3114675.
[14] L. M. Gómez, C. García Torres, Twitter, Revista Colombiana de Anestesiología 38 (2010)
539–540.
[15] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, C. D. Manning, Stanza: A Python natural language
processing toolkit for many human languages, in: Proceedings of the 58th Annual Meeting
of the Association for Computational Linguistics: System Demonstrations, 2020, pp. 1–8.
[16] T. De Smedt, W. Daelemans, Pattern for python, J. Mach. Learn. Res. 13 (2012) 2063–2067.
[17] M. Polo-Usaola, MongoDB: gestión, administración y desarrollo de aplicaciones, Macario
Polo Usaola, 2015.
[18] F. Rubio, P. Vega, R. P. Reyes, Nosql vs. sql in big data management: An empirical study,
KnE Engineering 5 (2020) 40–49. doi:10.18502/keg.v5i1.5917.
[19] La Hora, Explosión en «cristo del consuelo»,
https://www.lahora.com.ec/pais/explosion-en-cristo-del-consuelo/, 2022.