<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">UMUTeam at IROSTEREO: Profiling Irony and Stereotype spreaders on Twitter combining Linguistic Features with Transformers</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">José</forename><forename type="middle">Antonio</forename><surname>García-Díaz</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Facultad de Informática</orgName>
								<orgName type="institution">Universidad de Murcia</orgName>
								<address>
									<addrLine>Campus de Espinardo</addrLine>
									<postCode>30100</postCode>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Miguel</forename><forename type="middle">Ángel</forename><surname>Rodríguez-García</surname></persName>
							<affiliation key="aff1">
								<orgName type="department">Departamento de Ciencias de la Computación</orgName>
								<orgName type="institution">Universidad Rey Juan Carlos</orgName>
								<address>
									<postCode>28933</postCode>
									<settlement>Madrid</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Francisco</forename><surname>García-Sánchez</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Facultad de Informática</orgName>
								<orgName type="institution">Universidad de Murcia</orgName>
								<address>
									<addrLine>Campus de Espinardo</addrLine>
									<postCode>30100</postCode>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Rafael</forename><surname>Valencia-García</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Facultad de Informática</orgName>
								<orgName type="institution">Universidad de Murcia</orgName>
								<address>
									<addrLine>Campus de Espinardo</addrLine>
									<postCode>30100</postCode>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">UMUTeam at IROSTEREO: Profiling Irony and Stereotype spreaders on Twitter combining Linguistic Features with Transformers</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">3542F98315EF33321543370C5F2F17CB</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:28+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Author Profiling</term>
					<term>Irony and Stereotypes</term>
					<term>Stance detection</term>
					<term>Feature Engineering</term>
					<term>Deep Learning</term>
					<term>Transformers</term>
					<term>Natural Language Processing</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Irony is a curious mode of communication in which speakers say something that they want the audience to interpret in the opposite sense. Its automatic detection is a very challenging task due to its complex interpretation, and it has significant potential for various applications in text mining. Social media platforms like Twitter offer a vital opportunity to analyze this literary technique, since users frequently employ it to give their opinions. In this working note, we describe the contribution designed for PAN's shared author profiling task and its subtask concerning Stereotype Stance Detection. The former consists of determining whether authors spread irony and stereotypes, and the latter is focused on identifying stereotypes that can hurt vulnerable groups. The organizers provide a dataset compiled from Twitter to carry out the task. In particular, we propose a supervised learning pipeline consisting of a combination of Deep Learning techniques that uses contextual and non-contextual embeddings to address the binary classification. The resulting system reaches promising results, achieving the fifth-best score in the main task with an accuracy of 96.67%.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>With the proliferation of social media, irony has become one of the literary devices most frequently employed in this communication medium <ref type="bibr" target="#b0">[1]</ref>. Several definitions of irony have been provided in the literature, but they concur on the same binary classification: verbal and situational irony. The former has been conceived as the act of using words that mean the opposite of what you think, particularly for humorous effect <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>. The latter has been defined as a strange or funny situation in which things happen in a way that seems to be the opposite of what you expected <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref>. Both definitions highlight one of the primary features of this rhetorical device: making something understandable by expressing the opposite <ref type="bibr" target="#b2">[3]</ref>. Such rhetorical complexity produces dialogue that is occasionally arduous for humans to comprehend <ref type="bibr" target="#b5">[6]</ref>. This challenge has attracted the research community's attention, and in recent years several approaches have been published addressing the detection of irony in natural language text obtained from different social media sources. In particular, we focus on identifying whether an author spreads Irony and Stereotypes for the PAN shared challenge <ref type="bibr" target="#b6">[7]</ref>.</p><p>Irony and Stereotype identification is an essential task in social media applications since it enables the identification of online abuse and harassment <ref type="bibr" target="#b7">[8]</ref>. Automatic detection in written discourse is a complex task in which traditional text mining methods cannot be applied successfully <ref type="bibr" target="#b8">[9]</ref>. 
The drawback of these conventional methods is that identification requires semantic information that cannot be inferred from the word counts computed in document analysis <ref type="bibr" target="#b9">[10]</ref>. To overcome this deficiency, more sophisticated Machine Learning methods have been applied to the problem; although the results obtained are quite competitive, there is still scope for improvement <ref type="bibr" target="#b10">[11]</ref>.</p><p>In this working note, we face the proposed author profiling challenge by constructing a supervised classification pipeline. The method comprises four stages: a pre-processing stage to clean the provided dataset; a feature-collection stage, in which contextual and non-contextual embeddings are compiled; a training stage, in which several machine learning models are built; and, finally, an evaluation stage, in which the designed models are assessed.</p><p>The remainder of this working note is organized as follows: Section 2 provides a brief review of the related work, examining distinct approaches proposed in the literature that address this challenge. Section 3 specifies the methods developed to address the challenge. In Section 4 the results achieved in the challenge are presented. In addition, we report separately our participation in a subtask concerning Stereotype Stance Detection in Section 5. Finally, Section 6 summarizes the findings of this work and outlines some future lines to explore.</p></div>
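The four stages above can be sketched as a minimal Python pipeline. All names (`preprocess`, `extract_features`, the cleaning rules) are illustrative stand-ins, not the authors' actual implementation:

```python
import re

def preprocess(tweet: str) -> str:
    """Stage 1 (sketch): lowercase and strip mentions, hyperlinks, digits."""
    tweet = tweet.lower()
    tweet = re.sub(r"@\w+|https?://\S+|\d+", " ", tweet)
    return re.sub(r"\s+", " ", tweet).strip()

def run_pipeline(tweets, labels, extract_features, model):
    """Stages 2-4 (sketch): collect features, train, evaluate on a
    held-out split (here simply the last 20% of the data)."""
    texts = [preprocess(t) for t in tweets]
    X = [extract_features(t) for t in texts]
    split = int(0.8 * len(X))
    model.fit(X[:split], labels[:split])
    return model.score(X[split:], labels[split:])
```

Any estimator exposing `fit`/`score` can be plugged in as `model`, so each stage can be swapped out independently.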
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>Due to the complexity of recognizing verbal irony in natural language text, we can find different approaches, ranging from simple strategies to more complex ones. Barbieri and Saggion in <ref type="bibr" target="#b11">[12]</ref> proposed two tree-based classifiers, Random Forest and Decision Tree. They represent each tweet using the following seven groups of features: i) frequency, to analyze the gap between the rare and common words used; ii) written-spoken, to capture the users' style; iii) intensity, to measure the strength of adverbs and adjectives; iv) structure, which analyzes length, punctuation, and emoticons; v) sentiments, which uses SentiWordNet to measure the gap between positive and negative terms; vi) synonyms, comparing common vs. rare synonyms; and, finally, vii) ambiguity, to analyze possible ambiguities. Furthermore, this approach explores the use of a bag-of-words representation based on frequency analysis. Anchiêta et al. in <ref type="bibr" target="#b12">[13]</ref> proposed two more complex, differentiated strategies. First, they combined Term Frequency-Inverse Document Frequency (TF-IDF) and a linear Support Vector Machine (SVM): the former was used to extract the features from the datasets, and the latter was the classifier for the identification task, trained using the Stochastic Gradient Descent (SGD) technique. Second, they combined embeddings created with the Distributed Bag of Words Paragraph Vector model and a Multi-Layer Perceptron (MLP) to tackle the classification task. At a different level of complexity, Wu et al. in <ref type="bibr" target="#b13">[14]</ref> proposed the Dense-LSTM model, based on a densely connected LSTM network with a multi-task learning strategy. 
It comprises an embedding layer to convert the input tweets into a sequence of dense vectors and four concatenated Bi-LSTM layers with 200-dimensional hidden states to learn different levels of information simultaneously. Furthermore, they combine two different pre-trained word embeddings, which are concatenated and used as input.</p></div>
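The TF-IDF plus linear SVM strategy described above can be reproduced in a few lines with scikit-learn, where `SGDClassifier` with hinge loss corresponds to a linear SVM trained with Stochastic Gradient Descent. The toy tweets and labels below are invented for illustration and are not from the paper's dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy corpus (hypothetical): 1 = ironic, 0 = non-ironic.
texts = ["oh great, another monday", "what a lovely surprise",
         "sure, because that always works", "thanks for the help"]
labels = [1, 0, 1, 0]

# TF-IDF feature extraction followed by a linear SVM trained with SGD.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    SGDClassifier(loss="hinge", random_state=0))
clf.fit(texts, labels)
pred = clf.predict(["sure, what a lovely monday"])
```

With real data, the same pipeline object can be cross-validated or grid-searched as a single estimator.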
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head><p>The IROSTEREO challenge consists of a binary classification task from an author profiling perspective. The dataset proposed for this task is compiled from Twitter. The training dataset comprises a total of 420 different users, grouped into those who are irony and stereotype spreaders (I) and those who are not (NI). For each user, there are 200 of their tweets written in English <ref type="bibr" target="#b14">[15]</ref>. We separate a small subset from the training dataset to perform a custom validation. The statistics of the dataset are depicted in Table <ref type="table" target="#tab_0">1</ref>. We followed a typical supervised classification pipeline to solve the proposed task: we start by applying a pre-processing stage to the dataset; then, we compile the feature sets, train several machine learning models, and evaluate them using a custom validation split.</p><p>The pre-processing stage consists of creating an alternative version of the documents by encoding them in lowercase and removing mentions, hyperlinks, digits, punctuation, and expressive lengthening. Besides, we expand texting language and fix misspellings. The alternative version is used to extract the majority of the feature sets based on sentence embeddings and linguistic features.</p><p>The feature sets involved in our experimentation consist of linguistic features from UMUTextStats (LF) <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b16">17]</ref> and three sentence embeddings: non-contextual sentence embeddings from FastText (SE) <ref type="bibr" target="#b17">[18]</ref>, and two contextual embeddings from BERT (BF) <ref type="bibr" target="#b18">[19]</ref> and RoBERTa (RF) <ref type="bibr" target="#b19">[20]</ref>. These feature sets were used separately and combined using two approaches: one based on knowledge integration and another based on ensemble learning. 
For the ensemble learning, we evaluate four strategies: i) soft voting, ii) hard voting, iii) average probabilities, and iv) highest probability. The hard voting strategy is a weighted mode, with the weights based on the F1-score results on the custom validation split.</p><p>As we deal with author analysis, the results are reported at the author level. Nevertheless, some of the described stages of our pipeline are performed at the document level. For example, the features are compiled at the document level and then combined for each user to produce a unique vector per user.</p><p>To extract the contextual sentence embeddings from BERT and RoBERTa, we fine-tune the models on the IROSTEREO dataset and then obtain the value of the [CLS] token <ref type="bibr" target="#b20">[21]</ref>. In order to find the best hyperparameters, we trained ten models for BERT and ten for RoBERTa. The hyperparameters are i) the weight decay, ii) the batch size, iii) the warm-up speed, iv) the number of epochs, and v) the learning rate. This step is performed using the Tree of Parzen Estimators (TPE) <ref type="bibr" target="#b21">[22]</ref>, a method for choosing hyperparameters based on Bayesian reasoning and expected improvement.</p><p>Next, we train several neural networks for each feature set and for the combination of all feature sets using a knowledge integration strategy. The tuned hyperparameters include the shape of the network, the dropout mechanism, the learning rate, and the activation function. Table <ref type="table" target="#tab_1">2</ref> depicts the best hyperparameters for this task. It can be observed that the majority of the best results are obtained with shallow neural networks with two hidden layers but a large number of neurons. The only exception is SE, which achieved its best result with 7 hidden layers and 27 neurons in a long funnel shape. 
Besides, all experiments achieved better results with high dropout rates and a learning rate of 0.010 with no activation function (linear). The exception, again, is SE, which uses a smaller learning rate, a smaller dropout ratio, and elu as the activation function. </p></div>
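The four ensemble strategies described above can be sketched as follows for a single instance's class-probability matrix. The paper only specifies that hard voting is weighted by validation F1-scores; the exact definitions of the other strategies below are our assumptions:

```python
import numpy as np

def ensemble(probs, strategy, weights=None):
    """Combine per-model class probabilities for one instance.

    probs: array of shape (n_models, n_classes); weights: one per model
    (used for soft and hard voting; defaults to uniform weights).
    Returns the index of the winning class.
    """
    probs = np.asarray(probs, dtype=float)
    w = np.ones(len(probs)) if weights is None else np.asarray(weights, float)
    if strategy == "soft":       # weighted sum of probability vectors
        return int(np.argmax(w @ probs))
    if strategy == "hard":       # weighted majority vote over predictions
        votes = np.argmax(probs, axis=1)
        counts = np.bincount(votes, weights=w, minlength=probs.shape[1])
        return int(np.argmax(counts))
    if strategy == "average":    # unweighted mean probability
        return int(np.argmax(probs.mean(axis=0)))
    if strategy == "highest":    # label of the single most confident model
        return int(np.unravel_index(np.argmax(probs), probs.shape)[1])
    raise ValueError(strategy)
```

Note that the strategies can disagree: one very confident model can dominate "highest" while being outvoted under "hard".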
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results and analysis</head><p>First, we report the results achieved on our custom validation split. These results include each label's precision, recall, and F1-score, and the macro and weighted averages for the whole task. We report the results of each feature set trained separately in Tables <ref type="table" target="#tab_2">3, 4</ref>, 5, and 6 for LF, SE, BF, and RF, respectively; the results for the KI strategy in Table <ref type="table">7</ref>; and the results for the four ensemble learning strategies in Tables <ref type="table" target="#tab_3">8, 9</ref>, 10, and 11 for hard voting, soft voting, averaging probabilities, and highest probability, respectively. From the results achieved on the custom validation split, which are reported at the user level, we can assume that determining whether a user is an irony and stereotype spreader is a somewhat trivial task. It is worth mentioning that results at the document level would be more limited. Among the feature sets trained separately, the best results are achieved with BERT. However, the limited results achieved with RoBERTa draw our attention (see Table <ref type="table">6</ref>). We observed that all the incorrect predictions are instances of the I label that the model predicts as NI. We also compared the predictions of BF and RF and observed that the BF model outputs probabilities near 100%, whereas RF is less accurate.</p><p>We can observe that the features based on pure linguistics also achieve results similar to those obtained with state-of-the-art embeddings. The LF features include features related to stylometry, lexis, social media jargon, and Part-of-Speech features. 
In order to gain insight into the interpretability of the features, we calculate the Information Gain of the linguistic features and normalize the ten that achieved the highest coefficients for the I and NI labels (see Figure <ref type="figure" target="#fig_0">1</ref>). It can be observed that the majority of the most discriminating features are related to stylometry, including the number of words, the number of words per sentence, the usage of full stops, and some readability formulas. There are two linguistic features concerning morphology: the usage of interjections and the usage of singular words. In view of these results, we submitted one run for this shared task based on the Knowledge Integration strategy, achieving the fifth-best result, with an accuracy of 96.67%, out of a total of 65 participants. We selected the Knowledge Integration strategy over the two ensemble learning strategies that achieved the same results (hard and soft voting) because Knowledge Integration has reported better results in other shared tasks in the past. Table <ref type="table" target="#tab_4">12</ref> contains the best results along with the baselines proposed by the organizers. It is worth mentioning that these results were yielded by TIRA <ref type="bibr" target="#b22">[23]</ref>, an Integrated Research Architecture used by the IROSTEREO organizers to manage the execution of the participants' algorithms.</p></div>
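The information-gain ranking described above can be approximated with scikit-learn's `mutual_info_classif` (mutual information is the information-gain criterion for feature scoring). The feature matrix below is random stand-in data, not the actual UMUTextStats features:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((80, 20))              # 80 authors x 20 linguistic features
y = rng.integers(0, 2, size=80)       # I (1) vs NI (0) labels (toy)

# Score each feature by mutual information with the label, then keep
# the indices of the ten highest-scoring features.
scores = mutual_info_classif(X, y, random_state=0)
top_ten = np.argsort(scores)[::-1][:10]
```

In practice the resulting indices would be mapped back to feature names (word counts, readability formulas, etc.) before plotting.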
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Stereotype Stance Detection subtask</head><p>The organizers of the IROSTEREO shared task proposed a minor challenge consisting of determining whether stereotypes are used in favor of or against the target. For this, they released a training dataset in which 94 authors were tagged as against and 46 authors were tagged as in favor.</p><p>To solve this challenge, we used the same pipeline described for the main challenge. Our results on our custom validation split are promising. We report the Knowledge Integration strategy and the four ensemble learning strategies in Table <ref type="table" target="#tab_5">13</ref>. We achieved a macro F1-score of 82.8753% with the Knowledge Integration strategy and a macro F1-score of 78.5714% with the ensemble learning based on soft voting.</p><p>However, our results on the official leaderboard were limited. We achieved a macro F1-score of 53.12% (an F1 of 25% for the In Favor label and an F1 of 81.25% for the Against label).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusions and future work</head><p>This working note describes the participation of the UMUTeam in the IROSTEREO shared task concerning author profiling. This is a binary classification task in which the participants are challenged to identify which Twitter profiles are spreaders of Irony and Stereotypes. Our proposal is grounded on the combination of several feature sets based on linguistic features and sentence embeddings. We achieved promising results on our custom validation split and a final accuracy of 96.67% on the official leaderboard. One of the limitations of our work is the results achieved with RoBERTa (RF). Although we searched for common errors in our pipeline, we could not identify the reason for the limited results. To address this issue, we suggest combining document-level analysis with tools such as SHAP <ref type="bibr" target="#b23">[24]</ref> in order to find the reason for the wrong predictions. Besides, we obtained wrong predictions with the highest probability strategy (see Table <ref type="table">11</ref>), as this ensemble always outputs the I label (100% of the predictions). We suspect this issue is related to an error in the code that generates the final report.</p><p>As future work, we will incorporate cross-validation techniques into our pipeline and data-augmentation techniques to increase our models' generalization.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Information gain of the ten features with the highest information gain</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 IROSTEREO</head><label>1</label><figDesc></figDesc><table><row><cell>dataset's users</cell><cell></cell></row><row><cell></cell><cell>train val total</cell></row><row><cell>I</cell><cell>166 44 210</cell></row><row><cell>NI</cell><cell>176 34 210</cell></row><row><cell cols="2">TOTAL 342 78 420</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Best hyper-parameters for each feature set trained separately and combined using knowledge integration.</figDesc><table><row><cell cols="2">Feature set shape</cell><cell cols="3">hidden layers neurons dropout</cell><cell>lr activation</cell></row><row><cell>LF</cell><cell>brick</cell><cell>2</cell><cell>128</cell><cell cols="2">.3 0.010 linear</cell></row><row><cell>SE</cell><cell>long funnel</cell><cell>7</cell><cell>27</cell><cell cols="2">.1 0.001 elu</cell></row><row><cell>BF</cell><cell>brick</cell><cell>2</cell><cell>512</cell><cell cols="2">.3 0.010 linear</cell></row><row><cell>RF</cell><cell>brick</cell><cell>2</cell><cell>512</cell><cell cols="2">.3 0.010 linear</cell></row><row><cell>KI</cell><cell>brick</cell><cell>2</cell><cell>512</cell><cell cols="2">.3 0.010 linear</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Classification report for LF.</figDesc><table><row><cell></cell><cell></cell><cell>Table 4</cell><cell></cell></row><row><cell></cell><cell></cell><cell cols="2">Classification report for SE.</cell></row><row><cell></cell><cell>precision recall f1-score</cell><cell></cell><cell>precision</cell><cell>recall f1-score</cell></row><row><cell>I</cell><cell>94.737 97.297 96.000</cell><cell>I</cell><cell cols="2">100.000 94.595 97.222</cell></row><row><cell>NI</cell><cell>97.826 95.745 96.774</cell><cell>NI</cell><cell cols="2">95.918 100.000 97.917</cell></row><row><cell>macro avg</cell><cell>96.281 96.521 96.387</cell><cell>macro avg</cell><cell cols="2">97.959 97.297 97.569</cell></row><row><cell>weighted avg</cell><cell>96.465 96.429 96.433</cell><cell>weighted avg</cell><cell cols="2">97.716 97.619 97.611</cell></row><row><cell>Table 5</cell><cell></cell><cell>Table 6</cell><cell></cell></row><row><cell cols="2">Classification report for BF.</cell><cell cols="2">Classification report for RF.</cell></row><row><cell></cell><cell>precision recall f1-score</cell><cell></cell><cell cols="2">precision recall f1-score</cell></row><row><cell>I</cell><cell>97.297 97.297 97.297</cell><cell>I</cell><cell cols="2">64.286 48.649 55.385</cell></row><row><cell>NI</cell><cell>97.872 97.872 97.872</cell><cell>NI</cell><cell cols="2">66.071 78.723 71.845</cell></row><row><cell>macro avg</cell><cell>97.585 97.585 97.585</cell><cell>macro avg</cell><cell cols="2">65.179 63.686 63.615</cell></row><row><cell>weighted avg</cell><cell>97.619 97.619 97.619</cell><cell>weighted avg</cell><cell cols="2">65.285 65.476 64.594</cell></row><row><cell>Table 7</cell><cell></cell><cell></cell><cell></cell></row><row><cell cols="2">Classification report for KI.</cell><cell></cell><cell></cell></row><row><cell></cell><cell>precision recall f1-score</cell><cell></cell><cell></cell></row><row><cell>I</cell><cell>97.297 97.297 97.297</cell><cell></cell><cell></cell></row><row><cell>NI</cell><cell>97.872 97.872 97.872</cell><cell></cell><cell></cell></row><row><cell>macro avg</cell><cell>97.585 97.585 97.585</cell><cell></cell><cell></cell></row><row><cell>weighted avg</cell><cell>97.619 97.619 97.619</cell><cell></cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 8</head><label>8</label><figDesc>Classification report for EL (hard-voting).</figDesc><table><row><cell></cell><cell>precision recall f1-score</cell></row><row><cell>I</cell><cell>97.297 97.297 97.297</cell></row><row><cell>NI</cell><cell>97.872 97.872 97.872</cell></row><row><cell>macro avg</cell><cell>97.585 97.585 97.585</cell></row><row><cell>weighted avg</cell><cell>97.619 97.619 97.619</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 12</head><label>12</label><figDesc>Top results and baselines from the official leader board for the IROSTEREO 2022 shared task, ranked by accuracy</figDesc><table><row><cell>POS Team</cell><cell>Accuracy</cell></row><row><cell>1 wentaoyu</cell><cell>0.9944</cell></row><row><cell>2 harshv</cell><cell>0.9778</cell></row><row><cell>3 edapal</cell><cell>0.9722</cell></row><row><cell>3 ikae</cell><cell>0.9722</cell></row><row><cell>5 UMUTEAM</cell><cell>0.9667</cell></row><row><cell>5 Enrub</cell><cell>0.9667</cell></row><row><cell>LDSE</cell><cell>0.9389</cell></row><row><cell>RF + char 2-ngrams</cell><cell>0.8610</cell></row><row><cell>LR + word 1-ngrams</cell><cell>0.8490</cell></row><row><cell>LSTM+Bert-encoding</cell><cell>0.6940</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 13</head><label>13</label><figDesc>Macro precision, recall and F1-score for the Stance detection subtask using the custom validation split. KI stands for Knowledge Integration and EL for Ensemble Learning</figDesc><table><row><cell></cell><cell>precision recall f1-score</cell></row><row><cell>KI</cell><cell>93.478 78.571 82.875</cell></row><row><cell>EL -soft-voting</cell><cell>83.182 76.071 78.571</cell></row><row><cell>EL -hard-voting</cell><cell>91.667 71.429 75.455</cell></row><row><cell>EL -average probabilities</cell><cell>78.804 68.929 71.459</cell></row><row><cell>EL -highest probability</cell><cell>37.037 50.000 42.553</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work is part of the research project LaTe4PSP (PID2019-107652RB-I00) funded by MCIN/ AEI/10.13039/501100011033. This work is also part of the research project PDC2021-121112-I00 funded by MCIN/AEI/10.13039/501100011033, by the European Union NextGenerationEU/PRTR, and by "Programa para la Recualificación del Sistema Universitario Español 2021-2023". In addition, José Antonio García-Díaz is supported by Banco Santander and the University of Murcia through the Doctorado Industrial programme.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">An impact analysis of features in a classification approach to irony detection in product reviews</title>
		<author>
			<persName><forename type="first">K</forename><surname>Buschmeier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Klinger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 5th workshop on computational approaches to subjectivity, sentiment and social media analysis</title>
				<meeting>the 5th workshop on computational approaches to subjectivity, sentiment and social media analysis</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="42" to="49" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Compilation and evaluation of the spanish saticorpus 2021 for satire identification using linguistic features and transformers</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>García-Díaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Valencia-García</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Complex &amp; Intelligent Systems</title>
		<imprint>
			<biblScope unit="page" from="1" to="14" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Irony</title>
		<author>
			<persName><forename type="first">J</forename><surname>Garmendia</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2018">2018</date>
			<publisher>Cambridge University Press</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Hunter</surname></persName>
		</author>
		<ptr target="https://books.google.es/books?id=w7sYBAAAQBAJ" />
		<title level="m">Evaluating the Circumstances, John P. Hunter III</title>
				<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Preparation for Critical Instruction: How to Explain Subject Matter While Teaching All Learners to Think, Read, and Write Critically</title>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">P</forename><surname>Maiorana</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<publisher>Rowman &amp; Littlefield</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">A Deep Learning Model for Detecting Sarcasm in Written Product Reviews</title>
		<author>
			<persName><forename type="first">N</forename><surname>Schwarz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Interactive Media; FH Oberösterreich -Fakultät für informatik</title>
				<imprint>
			<publisher>Kommunikation und Medien</publisher>
			<biblScope unit="page">4232</biblScope>
		</imprint>
	</monogr>
	<note>Master&apos;s thesis</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Overview of pan 2022: Authorship verification, profiling irony and stereotype spreaders, style change detection, and trigger detection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bevendorff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Heini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kestemont</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kredens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mayerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ortega-Bueno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pęzik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Information Retrieval</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="331" to="338" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<title level="m" type="main">What a sunny day: toward emoji-sensitive irony detection</title>
		<author>
			<persName><forename type="first">A</forename><surname>Chaudhary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">A</forename><surname>Hayati</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Otani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">W</forename><surname>Black</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page">212</biblScope>
			<pubPlace>W-NUT</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Irony detection on microposts with limited set of features</title>
		<author>
			<persName><forename type="first">H</forename><surname>Taslioglu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Karagoz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Symposium on Applied Computing</title>
				<meeting>the Symposium on Applied Computing</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1076" to="1081" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Computational irony: A survey and new perspectives</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C</forename><surname>Wallace</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence Review</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="page" from="467" to="483" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Masking and BERT-based models for stereotype identification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Sánchez-Junquera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Montes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procesamiento del Lenguaje Natural</title>
		<imprint>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="page" from="83" to="94" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Modelling irony in Twitter</title>
		<author>
			<persName><forename type="first">F</forename><surname>Barbieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Saggion</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics</title>
				<meeting>the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="56" to="64" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">PiLN IDPT 2021: Irony detection in Portuguese texts with superficial features and embeddings</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">T</forename><surname>Anchiêta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">A R</forename><surname>Neto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Marinho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">V</forename><surname>Do Nascimento</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Moura</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2021), XXXVII International Conference of the Spanish Society for Natural Language Processing</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting>the Iberian Languages Evaluation Forum (IberLEF 2021) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2021), XXXVII International Conference of the Spanish Society for Natural Language Processing<address><addrLine>Málaga, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021-09">September 2021</date>
			<biblScope unit="volume">2943</biblScope>
			<biblScope unit="page" from="917" to="924" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">THU_NGN at SemEval-2018 task 3: Tweet irony detection with densely connected LSTM and multi-task learning</title>
		<author>
			<persName><forename type="first">C</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Yuan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of The 12th International Workshop on Semantic Evaluation</title>
				<meeting>The 12th International Workshop on Semantic Evaluation</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="51" to="56" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Profiling Irony and Stereotype Spreaders on Twitter (IROSTEREO) at PAN 2022</title>
		<author>
			<persName><forename type="first">R</forename><surname>Ortega-Bueno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<ptr target="CEUR-WS.org" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2022 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Psychographic traits identification based on political ideology: An author analysis study on spanish politicians&apos; tweets posted in 2020</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>García-Díaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Colomo-Palacios</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Valencia-García</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Future Generation Computer Systems</title>
		<imprint>
			<biblScope unit="volume">130</biblScope>
			<biblScope unit="page" from="59" to="74" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Evaluating feature combination strategies for hate-speech detection in spanish using linguistic features and transformers</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">A</forename><surname>García-Díaz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Jiménez-Zafra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>García-Cumbreras</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Valencia-García</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Complex &amp; Intelligent Systems</title>
		<imprint>
			<biblScope unit="page" from="1" to="22" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<title level="m" type="main">Learning word vectors for 157 languages</title>
		<author>
			<persName><forename type="first">E</forename><surname>Grave</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Bojanowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Gupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joulin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Mikolov</surname></persName>
		</author>
		<idno>CoRR abs/1802.06893</idno>
		<ptr target="http://arxiv.org/abs/1802.06893" />
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of NAACL-HLT</title>
				<meeting>NAACL-HLT</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno>CoRR abs/1907.11692</idno>
		<ptr target="http://arxiv.org/abs/1907.11692" />
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Sentence-BERT: Sentence embeddings using siamese BERT-networks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Reimers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
		<ptr target="https://arxiv.org/abs/1908.10084" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</title>
				<meeting>the 2019 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bergstra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Yamins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cox</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International conference on machine learning</title>
				<meeting><address><addrLine>PMLR</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="115" to="123" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">TIRA Integrated Research Architecture</title>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gollub</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-030-22948-1_5</idno>
	</analytic>
	<monogr>
		<title level="m">Information Retrieval Evaluation in a Changing World, The Information Retrieval Series</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Peters</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A unified approach to interpreting model predictions</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Lundberg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S.-I</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in neural information processing systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
