<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Overview of MEX-A3T at IberLEF 2019: Authorship and aggressiveness analysis in Mexican Spanish tweets</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mario Ezra Aragón</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miguel Á. Álvarez-Carmona</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manuel Montes-y-Gómez</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hugo Jair Escalante</string-name>
          <email>hugojair@inaoep.mx</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luis Villaseñor-Pineda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniela Moctezuma</string-name>
          <email>dmoctezuma@centrogeo.edu.mx</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre de Recherche en Linguistique Française GRAMMATICA (EA 4521)</institution>
          ,
          <addr-line>Université d'Artois</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Centro de Investigación en Ciencias de Información Geoespacial A.C.</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Consejo Nacional de Ciencia y Tecnología (CONACYT)</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Laboratorio de Tecnologías del Lenguaje, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE)</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Unidad de Transferencia Tecnológica, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE-UT3)</institution>
          ,
          <country country="MX">Mexico</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>478</fpage>
      <lpage>494</lpage>
      <abstract>
        <p>This paper presents the framework and results of the MEX-A3T track at IberLEF 2019. The track comprises two tasks, author profiling and aggressiveness detection, both using Mexican Spanish tweets. The author profiling task consists of determining the gender, occupation, and place of residence of users from their tweets. As a novelty in this year's edition, it considers the use of text and images as information sources, with the aim of studying the relevance and complementarity of multimodal data for profiling social media users. The aggressiveness detection task follows the same design as the previous edition; it aims to discriminate between aggressive and non-aggressive tweets. For both tasks, we have built new corpora from tweets by Mexican Twitter users. This paper compares and discusses the results of the participants.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The Twitter platform is constantly growing thanks to the information generated by
a massive community of active users. The analysis of this shared information has
become highly relevant for applications in marketing, security, and forensics,
among others.</p>
      <p>
        One essential task for social media analysis is author profiling (AP), which
consists in predicting general or demographic attributes of authors by examining
the content of their posts [
        <xref ref-type="bibr" rid="ref2 ref4">2, 4</xref>
        ]. On the other hand, detecting aggressive content
targeted at specific people or vulnerable groups is also a highly relevant task
for preventing potentially viral destructive behaviors on social networks.
      </p>
      <p>The objective of MEX-A3T is to encourage research on the analysis of
social media content in Mexican Spanish. Mainly, it aims to push research into
the treatment of a variety of Spanish whose cultural traits make it
significantly different from peninsular Spanish. Accordingly, the 2019 edition
of MEX-A3T considers two main tasks: author profiling, whose aim was to
develop methods for profiling users according to non-standard dimensions (gender,
occupation, and place of residence), and aggressiveness detection in tweets.
The main novelty of this edition is the use of multimodal data (text and
images) for AP, with the aim of exploring the relevance of multimodal
information for profiling social media users.</p>
      <p>To evaluate these tasks, we built two ad hoc collections. The first one is a
multimodal author profiling corpus consisting of 5 thousand Mexican users, each
one with eleven images: the profile image and ten randomly selected
pictures. This corpus is labeled for the subtasks of gender, occupation, and place
of residence identification. The second corpus is oriented to
aggressiveness detection and contains more than 11 thousand tweets. In this case, each
tweet is labeled as aggressive or non-aggressive.</p>
      <p>The remainder of this paper is organized as follows: Section 2 gives a brief
description of the first edition of MEX-A3T; Section 3 presents the evaluation
framework used at MEX-A3T 2019; Section 4 gives an overview of the
participating approaches; Section 5 reports and analyzes the results obtained by the
participating teams; finally, Section 6 draws the conclusions of this evaluation
exercise.</p>
    </sec>
    <sec id="sec-mex-a3t-2018">
      <title>2 MEX-A3T 2018</title>
      <p>
        Last year, the first edition of the MEX-A3T shared task was carried out [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This
represented the first attempt at organizing an evaluation forum for the analysis
of social media content in Mexican Spanish. Participants proposed a variety of
methods, comprising content-based features (bag of words, word n-grams, term
vectors, dictionary words, and so on), style-based features (frequencies,
punctuation, POS, Twitter-specific elements, slang words, and so forth), and
approaches based on neural networks (CNN, LSTM, and others). In both
tasks, author profiling and aggressiveness identification, most participants
outperformed the baseline results.
      </p>
      <p>
        For author profiling, the approach proposed by the MXAA team [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] obtained
the best results with an approach that emphasizes the value of personal
information when building the text representation. For
aggressiveness identification, the top-ranked team was INGEOTEC [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which proposed an
approach based on MicroTC and EvoMSA. MicroTC is a minimalistic text
classifier that is independent of domain and language. EvoMSA is another text classifier,
which combines models (such as MicroTC) using Genetic Programming.
      </p>
    </sec>
    <sec id="sec-2">
      <title>3 Evaluation framework</title>
      <p>This section outlines the construction of the two corpora used, highlighting
their particular properties, challenges, and novelties. It also presents the evaluation measures
used for both tasks.</p>
      <sec id="sec-2-1">
        <title>3.1 A multimodal Mexican corpus for author profiling</title>
        <p>
          This new corpus is based on the previous year’s collection. For MEX-A3T 2018,
we labeled 5 thousand Twitter users with the occupation and place of residence traits, divided
into 3500 users for training and 1500 for testing. For the occupation label,
we considered the following eight classes: arts, student, social, sciences, sports,
administrative, health, and others. For the place of residence trait, we considered
the following six classes: north, northwest, northeast, center, west, and southeast.
For more details, we recommend consulting [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>For this year’s edition, we added the gender trait to the corpus, so that
each user is characterized by three labels: gender, occupation, and location.
Another important novel aspect of this new corpus is the addition of 11 images
per user. We selected the profile picture of each user as well as 10 randomly
selected images from their tweets<sup>6</sup>.</p>
        <p>Table 1 shows the distribution of the corpus according to the gender trait. For
this corpus, the gender trait is balanced. Table 2 shows the distribution of
the corpus according to the place of residence trait. As can be observed,
the distributions of the training and test sets are very similar. The majority class
corresponds to the center region, with more than 36% of the profiles, whereas
the minority class is the north region, with only 3% of the instances.
Table 3 shows the distribution of the occupation trait. It also shows similar
distributions in the training and test partitions. The majority class is student,
with almost 50% of the profiles, whereas sports is the minority class,
with approximately 1% of the instances.</p>
        <p>
          In Tables 2 and 3, the class imbalance was calculated as proposed in
[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The place of residence trait shows a value of 396.1, whereas the occupation
trait has a value of 502.4. Considering that zero represents a perfect balance,
these numbers indicate that the imbalance is larger for the occupation trait
and, therefore, that it could be harder to predict than the place of
residence.
        </p>
        <p>Finally, Table 4 presents some additional statistics for the author profiling
corpus. To compute these numbers, we considered words, numbers,
punctuation marks, and emoticons as terms. We also applied a normalization over user
mentions, hashtags, and URLs. It can be observed that the lexical
diversity is very similar for the training and test partitions, as is
the average number of tweets per profile. Nevertheless, the standard deviation in training
and test is quite large, implying that the length of the profiles is highly variable.
Finally, the last row in the table shows the number of images in the corpus.<fn><label>6</label><p>For most users we collected 11 images, although a small number of users
have fewer because they shared fewer than 10 images in total. In these cases, we took
all available images.</p></fn></p>
      </sec>
      <sec id="sec-2-2">
        <title>3.2 A Mexican corpus for aggressiveness identification</title>
        <p>As with the author profiling corpus, for the previous edition of MEX-A3T we also
built a corpus of tweets for the task of aggressiveness detection. To build this
corpus, we used rude words and controversial hashtags to narrow the search.
The hashtags were related to politics, sexism, homophobia, and
discrimination. The collected tweets were labeled by two people. In the end, each
tweet in the corpus was labeled as aggressive or non-aggressive. Table 5 shows
some examples labeled as aggressive and non-aggressive. As these examples suggest, the
task of labeling aggressiveness is challenging, especially because in most
cases it is necessary to interpret the message in a given context.</p>
        <p>
          The collected corpus consists of 11 thousand tweets. For the evaluation
exercise, the corpus was divided into two parts, one for training and the other for
testing. Table 6 shows the distribution of this corpus. Note that the
non-aggressive class is the majority class in both partitions. For more details on the
labeling methodology, please consult [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>Author profiling. For the author profiling task, we used as the final score the
average of the macro F1 measures for the gender, place of residence, and occupation
traits, as shown in Formula 1.</p>
        <p><disp-formula><label>(1)</label><tex-math>F_{average} = \frac{F_{macro}(C_{gender}) + F_{macro}(C_{location}) + F_{macro}(C_{occupation})}{3}</tex-math></disp-formula></p>
        <p>The F<sub>macro</sub> measures were computed using Formula 2, where C denotes
the set of classes for a given trait<sup>7</sup>, and F<sub>1</sub>(c) is the F1-measure of each of the
categories of that trait.</p>
        <p><disp-formula><label>(2)</label><tex-math>F_{macro}(C) = \frac{1}{|C|} \sum_{c \in C} F_{1}(c)</tex-math></disp-formula></p>
        <p>Aggressiveness identification. For this task, the final score corresponds to
the F1-measure for the aggressive class.</p>
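        <p>As an illustration, Formulas 1 and 2 and the aggressive-class F1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the official scoring code: the function names (f1_per_class, f_macro, f_average) are hypothetical, labels are plain strings, and the set of classes C is taken from the gold labels, which assumes that every class of a trait occurs at least once in the gold standard.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def f1_per_class(gold, pred):
    """Return {class: F1} for every class that appears in the gold labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed
    scores = {}
    for c in set(gold):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


def f_macro(gold, pred):
    """Formula 2: unweighted mean of the per-class F1 scores of one trait."""
    scores = f1_per_class(gold, pred)
    return sum(scores.values()) / len(scores)


def f_average(gold_by_trait, pred_by_trait):
    """Formula 1: mean of the macro F1 over the traits (gender, location, occupation)."""
    traits = sorted(gold_by_trait)
    total = sum(f_macro(gold_by_trait[t], pred_by_trait[t]) for t in traits)
    return total / len(traits)


# Toy data, invented for illustration only.
toy_gold = ["male", "female", "female", "male"]
toy_pred = ["male", "female", "male", "male"]
toy_macro = f_macro(toy_gold, toy_pred)

# Aggressiveness task: the score is the F1 of the aggressive class alone.
agg_gold = ["aggressive", "non-aggressive", "aggressive"]
agg_pred = ["aggressive", "aggressive", "non-aggressive"]
agg_f1 = f1_per_class(agg_gold, agg_pred)["aggressive"]
```
]]></preformat>
        <p>In the toy gender example, the per-class F1 values are 0.8 (male) and 2/3 (female), so the macro F1 is their unweighted mean; the majority class therefore has no more influence on the score than the minority class.</p>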
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4 Overview of the Submitted Approaches</title>
      <p>For this study, 9 teams submitted one or more solutions, of which 2
participated in the author profiling task and 7 participated in the aggressiveness
identification task. Based on what they explained in their notebook papers, this section
presents a summary of their approaches regarding preprocessing steps, features,
and classification algorithms.</p>
      <p>
        The participating methods are listed below:
        <fn><label>7</label><p>C<sub>gender</sub> = {male, female}, C<sub>location</sub> = {north, northwest, northeast, center, west, southeast}, and C<sub>occupation</sub> = {arts, student, social, sciences, sports, administrative, health, others}</p></fn>
      </p>
      <list list-type="bullet">
        <list-item>
          <p>CerpamidUA at MexA3T 2019: Transition Point Proposal [<xref ref-type="bibr" rid="ref6">6</xref>]</p>
          <p>Task: Author Profiling</p>
          <p>Team name: Cerpamid</p>
          <p>Features: Bag of Words.</p>
          <p>Classification: Support Vector Machine.</p>
          <p>Summary: In this paper, the authors proposed an approach that
follows the traditional pipeline of a non-thematic text classification system,
employing a BOW representation and an SVM classifier. The
authors focused on determining a reduced subset of features that
represent frequent words for each profile, and proposed using the theory of
Transition Points to select these features.</p>
        </list-item>
        <list-item>
          <p>Author profiling from images using 3D Convolutional Neural Networks [<xref ref-type="bibr" rid="ref16">16</xref>]</p>
          <p>Task: Author Profiling</p>
          <p>Team name: CIC-VCR</p>
          <p>Features: Hierarchical features obtained with CNN.</p>
          <p>Classification: CNN.</p>
          <p>Summary: In this paper, the authors focused on determining the
profile of an author using images only. They proposed a 3D Convolutional
Neural Network for extracting features from the images and classifying
them into the different classes. They concluded that profiling
a Twitter user using only images is a difficult task because images on
this platform serve very general purposes.</p>
        </list-item>
        <list-item>
          <p>Aggressive analysis in Twitter using a combination of models [<xref ref-type="bibr" rid="ref13">13</xref>]</p>
          <p>Task: Aggressiveness Detection</p>
          <p>Team name: PRHLT</p>
          <p>Features: Bag of Words with TF-IDF weights, hierarchical features
obtained with CNN.</p>
          <p>Classification: CNN, LSTM and Multi-layer Perceptron.</p>
          <p>Summary: In this work, the authors proposed a method that combines
different classification strategies: a Convolutional Neural Network
whose outputs feed an LSTM neural network; a pre-trained Universal
Sentence Encoder for encoding sentences into embedding vectors; and a
simple Multi-layer Perceptron that receives the TF-IDF representation of
the tweet. The best results were obtained with the simplest model (the
multi-layer perceptron), which can be explained by the lack of data to
train deep learning models.</p>
        </list-item>
        <list-item>
          <p>Aggressiveness Identification in Twitter at IberLEF2019: Frequency Analysis
Interpolation for Aggressiveness Identification [<xref ref-type="bibr" rid="ref12">12</xref>]</p>
          <p>Task: Aggressiveness Detection</p>
          <p>Team name: OscarGaribo</p>
          <p>Features: Statistical descriptors.</p>
          <p>Classification: Support Vector Machine.</p>
          <p>Summary: In this paper, the authors proposed a new text
representation that reduces the dimensionality of the information for each author
or text to 6 characteristics per class. The proposed representation aims
to capture the level of association of each word with each of the classes
and, therefore, to model the probability distribution of the presence or
evidence of each class in the texts. This representation, named
Frequency Analysis Interpolation, is used to encode the texts of each user,
and the encoded information is then fed to a Support Vector
Machine classifier.</p>
        </list-item>
        <list-item>
          <p>Attribute selection techniques for classification of aggressive tweets.
LyRUAMC participation at MexA3T 2019 Task [<xref ref-type="bibr" rid="ref14">14</xref>]</p>
          <p>Task: Aggressiveness Detection</p>
          <p>Team name: LyR</p>
          <p>Features: Document frequency, mutual information, and lexical
availability.</p>
          <p>Classification: Naïve Bayes.</p>
          <p>Summary: In this work, the authors proposed an approach that
follows the traditional pipeline of a non-thematic text classification system.
They employed a BOW representation and evaluated the impact of
distinct feature selection strategies. Their goal was to test whether a condensed
set of words can be indicative of the aggressiveness of a tweet. They
proposed a new criterion for selecting relevant words, the lexical availability,
and reached the following conclusion: different feature selection techniques
favor different aspects of the aggressiveness in a short text.</p>
        </list-item>
        <list-item>
          <p>Detection of Aggressive Tweets in Mexican Spanish Using Multiple Features
with Parameter Optimization [<xref ref-type="bibr" rid="ref11">11</xref>]</p>
          <p>Task: Aggressiveness Detection</p>
          <p>Team name: mineriaUNAM</p>
          <p>Features: Linguistically motivated features and different types of n-grams.</p>
          <p>Classification: Support Vector Machine.</p>
          <p>Summary: In this work, the authors approached the problem using
linguistically motivated features and several types of n-grams (words,
characters, functional words, punctuation symbols, among others). They
trained a Support Vector Machine using a combinatorial framework that
optimizes the results of the classifier.</p>
        </list-item>
        <list-item>
          <p>UACh at MEX-A3T 2019: Preliminary results on detecting aggressive tweets
by adding author information via an unsupervised strategy [<xref ref-type="bibr" rid="ref5">5</xref>]</p>
          <p>Task: Aggressiveness Detection</p>
          <p>Team name: UACh</p>
          <p>Features: Character n-grams and word embeddings.</p>
          <p>Classification: Support Vector Machine and a multilayer perceptron.</p>
          <p>Summary: In this paper, the authors considered the application of a
traditional classification method to the problem of aggressiveness
detection in Spanish tweets. They used two main kinds of features: character
n-grams and word embeddings. They then employed two different classifiers,
an SVM and a multilayer perceptron. The main idea of their participation
was to include features that give context to the text messages
and to explore whether people verbally attack differently depending on their traits
and overall environment. The obtained results indicated that adding
context features produces almost unnoticeable changes in performance.</p>
        </list-item>
        <list-item>
          <p>Aggressiveness Detection through Deep Learning Approaches [<xref ref-type="bibr" rid="ref9">9</xref>]</p>
          <p>Task: Aggressiveness Detection</p>
          <p>Team name: VRAIN</p>
          <p>Features: Hierarchical features obtained by a CNN.</p>
          <p>Classification: CNN, LSTM, GRU.</p>
          <p>Summary: In this paper, the authors explored three deep learning
approaches to the task: a convolutional network, a recurrent network, and
a self-attention network. They did not obtain good results on the test
set. They assumed that this was because the content of the
test data is too different from that of the training set.</p>
        </list-item>
        <list-item>
          <p>Ensemble learning to detect aggressiveness in Mexican Spanish tweets [<xref ref-type="bibr" rid="ref8">8</xref>]</p>
          <p>Task: Aggressiveness Detection</p>
          <p>Team name: CEATIC</p>
          <p>Features: Bag of words with term frequency.</p>
          <p>Classification: Support Vector Machine, Logistic Regression,
Multinomial Naïve Bayes.</p>
          <p>Summary: In this work, the authors used a traditional BOW
representation, considering unigrams and bigrams, and applied a TF weighting.
They evaluated multiple classification algorithms, among them Logistic
Regression, Multinomial NB and SVM, and proposed an ensemble
classifier combining the three best individual algorithms by majority vote.</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-4">
      <title>5 Experimental evaluation and analysis of results</title>
      <p>This section summarizes the results obtained by the participants, comparing and
analyzing in detail the performance of their submitted solutions. For the final
phase of the challenge, participants sent their predictions for the test partition, and
the performance on this data was used to rank them. The macro average F1 was
used as the main evaluation measure.</p>
      <p>[Table 7: participating approaches by type. Preprocessing: lowercase, tweet normalization. Representation (features): character n-grams, word n-grams, aggressive words, word embeddings, statistical descriptors, LIWC, hierarchical (over texts), hierarchical (over images). Classification: logistic regression, Naïve Bayes, SVM, deep learning, model selection/ensembles.]</p>
      <sec id="sec-4-1">
        <p>
          For computing the evaluation scores we relied on the EvALL platform [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
EvALL is an online evaluation service targeting information retrieval and natural
language processing tasks. It is a complete evaluation framework that receives
as input the ground truth and the predictive outputs of systems and returns a
complete performance evaluation. In the following, we report the results obtained
by participants as evaluated by EvALL.
        </p>
        <p>
          As baseline methods, we implemented two popular approaches that have
proven hard to beat in both tasks: i) a classification model trained on the
bag of words (BoW) representation, and ii) a classifier trained on a character
3-grams representation. We also compared the systems’ results against the best
results for both tasks in the previous year's edition. For author profiling we
consider the results from the MXAA approach [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], and for aggressiveness detection
we use the results from the INGEOTEC system [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
        <p>For the BOW approach, the whole corpus vocabulary was used, except that stopwords
and special characters were removed. The size of the representation of each text
was 14,913. For the 3-grams representation, all 3-grams were used. As in BOW,
stopwords and special characters were removed. The size of the representation
of each text was 5,212. An SVM with a linear kernel and C = 1 was applied for
classification in both tasks.</p>
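        <p>To make the two baseline representations concrete, the following Python fragment gives a minimal sketch of how bag-of-words and character 3-gram count vectors could be built. It is an illustration under stated assumptions, not the code used for the baselines: the stopword list is a tiny hypothetical one, lowercasing is an assumption, the function names are invented for this example, and the classification step (a linear SVM with C = 1) is omitted.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from collections import Counter

# Hypothetical, tiny stopword list used only for illustration.
STOPWORDS = {"de", "la", "el", "que", "y", "a", "en"}


def tokenize(text):
    """Lowercase (an assumption), keep word characters only, drop stopwords."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def bow_features(text):
    """Bag-of-words counts over the cleaned tokens."""
    return Counter(tokenize(text))


def char_trigram_features(text):
    """Counts of character 3-grams over the cleaned, space-joined tokens."""
    cleaned = " ".join(tokenize(text))
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))


def vectorize(counts, vocabulary):
    """Map a Counter onto a fixed, ordered vocabulary (one dimension per term)."""
    return [counts[term] for term in vocabulary]


# Toy usage with an invented sentence.
example_bow = bow_features("El gato y el gato")
example_vector = vectorize(example_bow, ["gato", "perro"])
example_trigrams = char_trigram_features("el sol")
```
]]></preformat>
        <p>In this sketch the vocabulary passed to vectorize plays the role of the fixed feature space, so the length of every vector equals the vocabulary size, in the same way that the representation sizes reported above are fixed per representation.</p>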
        <sec id="sec-4-1-1">
          <title>5.1 Author profiling results</title>
          <p>Table 8 shows a summary of the results obtained by each team in the three AP
subtasks as well as their average performance. The average macro F1 was used
to rank participants. The approach of the Cerpamid team (run 1) obtained
the best performance. Nevertheless, this system does not outperform all baselines. In
particular, it is noticeable that for the two traits considered in the 2018 edition,
i.e., occupation and place of residence, neither of the two participating teams was able
to improve on the results of the winning team (MXAA) of last year’s edition.</p>
          <p>Tables 9 and 10 show the results obtained by each team for the location
and occupation traits, respectively. Although we used the macro average of F1
to rank the participants, we also show the accuracy results as well as the F1 for
each class. For these two traits, the two participating systems could not
outperform any of the proposed baselines.</p>
          <p>From an overall analysis, it was possible to notice that for all traits the
best results corresponded to text-based solutions. Despite this general
behaviour, we could identify 110 users out of the 1500 in the test set that
were correctly classified only by the image-based systems (runs 1 and 2 from
the CIC-VCR team). We hypothesize that this result could be caused by the
lower number of tweets from these users in comparison to the rest. They have on
average 1218 tweets, whereas the average for the complete test set is 1353
tweets per user.</p>
          <p>To analyze the complementarity of the predictions by the two participants, we
built a theoretically perfect ensemble from their four runs. That is, we considered
that a test instance was correctly classified if at least one of the participating
teams (i.e., one of their runs) classified it correctly. Additionally, we considered
a majority vote approach; for this we chose the class with the greatest number
of predictions among the four runs.</p>
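          <p>The two combination schemes just described can be sketched in Python as follows. This is a minimal sketch with hypothetical function names and invented toy labels; in particular, the tie-breaking rule of the majority vote (ties go to the label of the earliest run) is an assumption, since it is not specified here.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def perfect_ensemble_hits(gold, runs):
    """True for each instance that at least one run classified correctly."""
    return [any(run[i] == g for run in runs) for i, g in enumerate(gold)]


def majority_vote(runs):
    """Most frequent label per instance; ties go to the earliest run's label."""
    voted = []
    for labels in zip(*runs):
        counts = Counter(labels)
        best = max(counts.values())
        # First label (in run order) that reaches the highest count.
        voted.append(next(label for label in labels if counts[label] == best))
    return voted


# Toy data, invented for illustration: four runs over four test instances.
gold = ["a", "b", "a", "b"]
runs = [
    ["a", "a", "b", "b"],
    ["a", "b", "b", "a"],
    ["b", "a", "b", "b"],
    ["a", "a", "b", "a"],
]
hits = perfect_ensemble_hits(gold, runs)
voted = majority_vote(runs)
vote_accuracy = sum(v == g for v, g in zip(voted, gold)) / len(gold)
```
]]></preformat>
          <p>In this toy example the perfect ensemble covers three of the four instances, while the majority vote gets only two right: the second instance is classified correctly by a single run, which is outvoted by the other three. This is the same gap between complementarity and exploitable agreement that the two schemes are meant to expose.</p>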
          <p>Table 11 shows the results of the perfect ensemble and the majority vote
approach, and compares them with the best result obtained for each trait by a
single participating system. From these results, it can be observed that the
perfect ensemble performs considerably better than the best approach for
the three traits, suggesting that the two participating systems are complementary
to each other. Nevertheless, the poor results of the majority vote approach
indicate that the intersection of the instances correctly classified by the two
systems is quite small and, therefore, that automatically taking advantage of this
complementarity is a complex task.</p>
          <p>Table 12 shows the results obtained by the participating teams in the
aggressiveness detection task. For this task, we sort the teams by their F1 results in
the aggressive class, but for completeness we also report the accuracy, the macro
F1, and the F1 in the non-aggressive class. The approach submitted by the
University of Chihuahua (UACh) obtained the best performance, outperforming all
proposed baselines except the results from INGEOTEC, which was the winning team
in the 2018 edition. Nevertheless, it is important to point out that the UACh
approach is considerably simpler than the one from INGEOTEC.</p>
          <p>As previously done with the profiling task, we also built a theoretically perfect
ensemble and a majority vote approach from all submissions to the
aggressiveness task. Table 13 shows these results. Again, the perfect ensemble
shows a very good result, in this case F1 = 0.99, but the majority vote
approach obtained low performance, indicating that for this task too it is
difficult to find a way to merge the information from the different approaches,
even when they are complementary to each other.</p>
          <p>From the perfect theoretical ensemble, it was possible to identify
the errors common to all systems. In fact, there are only 9 tweets that
no system could classify correctly. All of them are aggressive tweets that were
classified as non-aggressive. Below we show some of these tweets, in which we can
identify ironic comments, the use of words outside the training vocabulary, such
as some named entities, as well as offenses with no vulgar or profane words.</p>
          <list list-type="bullet">
            <list-item><p>Y hablando de cosas feas, ¿cómo está tu novia?</p></list-item>
            <list-item><p>A mí más real se me hace Carla Morrison porque está super gorda</p></list-item>
            <list-item><p>Ponete a correr gorda, está bien que las puertas del gimnasio de abren</p></list-item>
            <list-item><p>pero va a llegar el momento en que no vas a pasar el mejor proff????</p></list-item>
            <list-item><p>Tu siempre tan tonta Viviana ????</p></list-item>
          </list>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6 Conclusions</title>
      <p>This paper described the design and results of the MEX-A3T shared task
co-located with IberLEF 2019. MEX-A3T stands for Authorship and Aggressiveness
Analysis in Mexican Spanish Tweets. Two tasks were proposed, one targeting
author profiling and the other focused on aggressiveness detection. Given
a set of tweets in Mexican Spanish, the participants had to identify the gender,
location, and occupation of their authors as well as the aggressive messages. For
these tasks we employed the same data sets as in the previous MEX-A3T
edition, but we extended the author profiling collection by including eleven
images for each user, with the aim of evaluating multimodal profiling approaches.
The shared task lasted more than two months and attracted the participation
of nine teams from three different countries: Mexico, Spain, and Cuba.</p>
      <p>A variety of methodologies were proposed by the participants, from
traditional supervised methods to deep learning approaches. For author profiling, the
Cerpamid team obtained the best results with an
approach based on dimensionality reduction in text. However, their results did
not surpass the best results from the previous year's edition. For aggressiveness
identification, the top-ranked team was UACh, with an approach based on two
main kinds of features: character n-grams and word embeddings. Their results
equaled those of the previous year's winner while employing a simpler approach.</p>
      <p>In general terms, the competition was a success: the solutions proposed by the
nine participants were diverse in methodology and performance, and
new insights on how to deal with tweets in Mexican Spanish were obtained.
Among the most interesting findings was the complementarity of the predictions
from the different participants, a phenomenon that was also observed in the
previous edition. This opens the possibility of studying how to take advantage of
the different information extracted by the teams so that results approach
those of the perfect ensemble.</p>
      <p>Acknowledgements. Our special thanks go to all of MEX-A3T’s participants.
We would like to thank CONACyT for partially supporting this work under
grants CB-2015-01-257383, FC-2016-2410 and the Thematic Networks program
(Language Technologies Thematic Network). The first author acknowledges doctoral
scholarship CONACyT-Mexico 654803, and the second author doctoral scholarship
CONACyT-Mexico 401887.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Álvarez-Carmona</surname>
            ,
            <given-names>M.Á.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guzmán-Falcón</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Escalante</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villaseñor-Pineda</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reyes-Meza</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rico-Sulayes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of mex-a3t at ibereval 2018: Authorship and aggressiveness analysis in mexican spanish tweets</article-title>
          .
          <source>In: Notebook Papers of 3rd SEPLN Workshop on Evaluation of Human Language Technologies for Iberian Languages (IBEREVAL)</source>
          , Seville, Spain, September (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Álvarez-Carmona</surname>
            ,
            <given-names>M.Á.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>López-Monroy</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villaseñor-Pineda</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meza</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Evaluating topic-based representations for author profiling in social media</article-title>
          .
          <source>In: Ibero-American Conference on Artificial Intelligence</source>
          . pp.
          <fpage>151</fpage>
          -
          <lpage>162</lpage>
          . Springer (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Amigó</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carrillo-de-Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Almagro-Cádiz</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez-Vidal</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verdejo</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Evall: Open access evaluation for information access systems</article-title>
          .
          <source>In: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          . pp.
          <fpage>1301</fpage>
          -
          <lpage>1304</lpage>
          . ACM (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Argamon</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koppel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fine</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shimoni</surname>
            ,
            <given-names>A.R.</given-names>
          </string-name>
          :
          <article-title>Gender, genre, and writing style in formal written texts</article-title>
          .
          <source>Text</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ),
          <fpage>321</fpage>
          -
          <lpage>346</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Casavantes</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>López</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>González</surname>
            ,
            <given-names>L.C.</given-names>
          </string-name>
          :
          <article-title>Uach at mex-a3t 2019: Preliminary results on detecting aggressive tweets by adding author information via an unsupervised strategy</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Castro Castro</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Artigas Herold</surname>
            ,
            <given-names>M.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ortega Bueno</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muñoz</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Cerpamidua at mexa3t 2019: Transition point proposal</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Graf</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miranda-Jiménez</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tellez</surname>
            ,
            <given-names>E.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moctezuma</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Salgado</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ortiz-Bejar</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sánchez</surname>
            ,
            <given-names>C.N.</given-names>
          </string-name>
          :
          <article-title>Ingeotec at mex-a3t: Author profiling and aggressiveness analysis in twitter using tc and evomsa</article-title>
          .
          <source>In: Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), CEUR WS Proceedings</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Molina-González</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plaza-del-Arco</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martín-Valdivia</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ureña López</surname>
            ,
            <given-names>L.A.</given-names>
          </string-name>
          :
          <article-title>Ensemble learning to detect aggressiveness in mexican spanish tweets</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Nina-Alcocer</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>González</surname>
            ,
            <given-names>J.Á.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hurtado</surname>
            ,
            <given-names>L.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pla</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Aggressiveness detection through deep learning approaches</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Ortega-Mendoza</surname>
            ,
            <given-names>R.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>López-Monroy</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          :
          <article-title>The winning approach for author profiling of mexican users in twitter at mex-a3t@ibereval-2018</article-title>
          .
          <source>In: Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), CEUR WS Proceedings</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Ortiz</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez-Adorno</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reyes-Magaña</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bel-Enguix</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sierra</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Detection of aggressive tweets in mexican spanish using multiple features with parameter optimization</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Garibo i Orts</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Aggressiveness identification in twitter at iberlef2019: Frequency analysis interpolation for aggressiveness identification</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>De la Peña Sarracén</surname>
            ,
            <given-names>G.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Aggressive analysis in twitter using a combination of models</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Ramírez-de-la-Rosa</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villatoro-Tello</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiménez-Salazar</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Attribute selection techniques for classification of aggressive tweets lyr-uamc participation at mexa3t 2019 task</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Tellez</surname>
            ,
            <given-names>F.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pinto</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cardiff</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Defining and evaluating blog characteristics</article-title>
          .
          <source>In: Artificial Intelligence, 2009. MICAI 2009. Eighth Mexican International Conference on</source>
          . pp.
          <fpage>97</fpage>
          -
          <lpage>102</lpage>
          . IEEE (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Valdez-Rodríguez</surname>
            ,
            <given-names>J.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Felipe-Riverón</surname>
            ,
            <given-names>E.M.</given-names>
          </string-name>
          :
          <article-title>Author profiling from images using 3d convolutional neural networks</article-title>
          .
          <source>In: Proceedings of the First Workshop for Iberian Languages Evaluation Forum (IberLEF 2019), CEUR WS Proceedings</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>