<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">An Ensemble Machine Learning Classifier for Profiling Irony and Stereotype Spreaders on Twitter Notebook for PAN at CLEF 2022</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Zengyao</forename><surname>Li</surname></persName>
						</author>
						<author>
							<persName><forename type="first">Zhongyuan</forename><surname>Han</surname></persName>
							<email>hanzhongyuan@gmail.com</email>
						</author>
						<author>
							<persName><forename type="first">Mingjie</forename><surname>Huang</surname></persName>
							<email>mingjiehuang007@163.com</email>
						</author>
						<author>
							<persName><forename type="first">Leilei</forename><surname>Kong</surname></persName>
							<email>kongleilei@fosu.edu.cn</email>
						</author>
						<author>
							<affiliation key="aff0">
								<orgName type="institution">Foshan University</orgName>
								<address>
									<settlement>Foshan</settlement>
									<country key="CN">China</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Evaluation Forum</orgName>
								<address>
									<addrLine>September 5-8</addrLine>
									<postCode>2022</postCode>
									<settlement>Bologna</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">An Ensemble Machine Learning Classifier for Profiling Irony and Stereotype Spreaders on Twitter Notebook for PAN at CLEF 2022</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">1D7D83068C91A9B378595201F9877620</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T03:31+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Pre-trained model, Classification, Irony and Stereotype Spreaders 0000-0001-8472-4150</term>
					<term>(Z. Li)0000-0001-8960-9872 (Z. Han)</term>
					<term>0000-0002-0889-5027 (M. Huang)</term>
					<term>0000-0002-4636-3507(L. Kong)</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Profiling Irony and Stereotype Spreaders on Twitter (profiling IROSTEREO) task is to judge which author can be considered ironic based on the author's comments. We treat this task as a text binary classification task. This paper proposes a feature extraction method based on a pre-trained language model and a classifier based on an ensemble machine learning model. Our proposed method and model achieve 0.9222 accuracy on the test set for this task.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>With irony, language is employed in a figurative and subtle way to mean the opposite of what is literally stated. In the case of sarcasm, a more aggressive type of irony, the intent is to mock or scorn a victim without excluding the possibility of being hurt <ref type="bibr" target="#b0">[1]</ref>. Irony as a literary technique is widely used in online texts such as Twitter tweets.</p><p>The task of Profiling Irony and Stereotype Spreaders on Twitter (profiling IROSTEREO <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>) is presented in this background. Typically, in author analysis tasks ,such as profiling Hate Speech Spreaders(HSSs) on Twitter task <ref type="bibr" target="#b2">[3]</ref>, the context representing the positive or negative intent of the text is consistent. However, in the profiling IROSTEREO task, a text's ironic intent is defined by its context incongruity. For example, in the phrase "I love being ignored", irony is defined by the incongruity between the positive word "love" and the negative context of "being ignored" <ref type="bibr" target="#b3">[4]</ref>. This task aims to find those authors that can be considered ironic through their tweets on the Twitter author. We use the pre-trained language model BERT <ref type="bibr" target="#b4">[5]</ref> for this task to extract the features and propose a classifier model based on an ensemble machine learning model.</p><p>The rest of this paper is organized as follows. The related work and methodology are discussed in Section 2 and 3. The experimental setup and results are subsequently reported in Section 4 and eventually concludes with summarize in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>The profiling Hate Speech Spreaders on Twitter task <ref type="bibr" target="#b2">[3]</ref> proposed by Pan last year is similar to this task. We explore it as a text classification task. A previous method for profiling HSSs, like classifying an author as HSS (Hate Speech Spreader) or not , takes advantage of a CNN based on a single convolutional layer <ref type="bibr" target="#b5">[6]</ref>. In addition, hate speech spreader detection using n-grams and voting classifier also achieved good results <ref type="bibr" target="#b6">[7]</ref>. And it also works well for deep modeling of latent representations based on Transformer <ref type="bibr" target="#b7">[8]</ref> model for profiling HSSs task <ref type="bibr" target="#b8">[9]</ref>.</p><p>Furthermore, we will introduce some state-of-the-art methods used in the paper for profiling IROSTEREO tasks. In the paper by Shiwei Zhang, the authors formulate irony detection instead as a transfer learning task where supervised learning on irony labeled text is enriched with knowledge transferred from external sentiment analysis resources. Importantly, they focus on identifying the hidden, implicit incongruity without relying on explicit incongruity expressions <ref type="bibr" target="#b3">[4]</ref>. In the ref <ref type="bibr" target="#b9">[10]</ref>, the authors use two different interpretable methods to identify stereotypes about immigration: Transformer-based deep learning models and text masking techniques .</p><p>Presently, pre-trained language models are the mainstream models, and they have been tested to outperform other models in evaluation metrics on most tasks. At the same time, traditional machine learning classifiers are simple and effective for binary classification tasks. In the last year, participants in the profiling HSSs task chose only one of the models to complete the task. So, is it feasible to combine pre-trained languages and traditional machine learning models? In addition to this, we also looked up some related research on text classification. Hybrid methods exist in the literature, such as CNNs for extracting text features and SVMs for performing classification and prediction <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b10">11]</ref>. Finally, similar results can be obtained with CNN <ref type="bibr" target="#b5">[6]</ref>. So, combining pre-trained language models and machine learning classifiers to train and predict data is worth trying.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Method</head><p>This section gives a brief overview of our model, training process, and prediction process. Our proposed model mainly consists of the following two parts: 1. Fine-tuning Bert for feature extraction 2. Ensemble Machine Learning Classifier</p><p>We describe these two parts in Sections 3.1 and 3.2. Sections 3.3 and 3.4 will explain our training and prediction process in detail. The raw text is preprocessed before training and prediction, as detailed below. The preprocessed text will be given the same label as the author of the text, and then fed into the Bert model one by one for fine-tuning. When the accuracy rate on the validation set no longer increased within two epochs, the training was stopped, and the model with the highest accuracy rate was saved. In the feature extraction stage, we can extract 768-dimensional cls tokens through the trained model. The structure of the fine-tune Bert model is shown in Figure <ref type="figure" target="#fig_0">1</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Fine-tuning Bert</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Ensemble Machine Learning Classifier (EMLC)</head><p>We build Ensemble Machine Learning Classifier (EMLC), which integrates LR, RF, and SVM models, and the specific structure is shown in the red box in Figure <ref type="figure" target="#fig_1">2</ref>. In Figure <ref type="figure" target="#fig_1">2</ref>, xi represents the input text, and the fine-tuned Bert is used to extract its feature representation and train it as the input of the EMLC. Then the probabilities output by the LR, RF, and SVM models are averaged, and the class with the highest probability is taken as the final prediction result yi.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiment and Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Dataset</head><p>The English dataset provided by the task organizer consists of two parts: a training set and a test set. The training set consists of 840,000 tweets: the training set has 420 authors, each author assigns a label, and each author has 200 tweets. The test set consists of 360,000 tweets: the test set contains 180 authors with 200 tweets per author. Table <ref type="table" target="#tab_0">1</ref> shows the data analysis of the dataset. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Text Preprocessing</head><p>For each author's 200 tweets, we first remove some unusual symbols and strings, then convert all text to lowercase, and merge every eight tweets into a new tweet in sequence so that An author gets 25 new tweets. In addition, we also divided the training set into five parts according to the idea of 5fold, named Data1~5 respectively (as shown in Figure <ref type="figure" target="#fig_3">4</ref>). Each data is independently trained to obtain an independent Bert model. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Experimental setting</head><p>In this work, the pre-trained language model chosen for the first part of our model is Bertbase<ref type="foot" target="#foot_0">1</ref> (L=12, H=768, A=12, Total Parameters=110M). Specifically, the implementation of HuggingFace<ref type="foot" target="#foot_1">2</ref> called BertForSequenceClassification is used. During the fine-tuning stage of the pretrained model, we set batch_size=25 and used cross-entropy as Bert's loss function. As the optimizer, we choose AdamW, and the learning rate is set to 1e-5. For the second part of the model, we employ an integrated machine learning classifier of LR, RF, and SVM. For these three machine learning models, we chose to use the default settings and set the "voting" parameter of the VotingClassifier to "soft". We use the PyTorch framework for the whole model to conduct our experiments. Our source code is publicly available at https://github.com/Zero-Lzy/Pan_2022_Twitter.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Model Prediction Process</head><p>When making predictions, we first preprocess the text of the dataset, use five fine-tuned Bert models to extract features from the text, and then input the extracted features into the corresponding EMLC. Five EMLCs will get five labels about the text and cast hard votes<ref type="foot" target="#foot_2">1</ref> on these five labels to get the final label of the text. The acquisition process of text labels is consistent with the training process, as shown on the right side of Figure <ref type="figure" target="#fig_2">3</ref>, but at this time, xi represents the preprocessed text in the test set. And for the task, what we need to predict is the author label. After the dataset is preprocessed, one author will have 25 texts. That is, 25 text labels will be predicted. Based on these 25 text labels, we choose the classification with the most votes as the final author label. The prediction process is shown in Figure <ref type="figure" target="#fig_4">5</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.5.">Result</head><p>We conducted experiments with 5-fold cross-validation on the training set. The dataset was folded five times as described in subsection 4.2. Table <ref type="table" target="#tab_1">2</ref> reports the accuracy obtained on the validation set used at each fold, along with the arithmetic mean and standard deviation. As can be seen from Table <ref type="table" target="#tab_1">2</ref>, our cross-validation experiments achieved an average accuracy of 0.9976. We believe that our model is relatively reliable. Finally, we compress the results predicted on the test set and upload them to TIRA <ref type="bibr" target="#b11">[12]</ref>. As reported on the PAN website, in the test set given by the organizer, our evaluation metrics accuracy can reach 0.9222 as shown in Table <ref type="table">3</ref>. The accuracy differs from the result on the validation set by 0.0754. It may be because the validation set is too small or the correlation of the split data is high, resulting in the accuracy of the validation set being too high. At the same time, there may be some unsolvable overfitting problems in the model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>To address the profiling IROSTEREO task proposed by PAN2022, we offer a feature extraction method based on pre-trained language models and a classifier based on ensemble machine learning models in this paper. At the same time, to solve the problem of data underfitting, we constructed multiple datasets for multiple training and voting. To solve the problem of data overfitting, we use early stopping. In the end, we reached 90% on the accuracy score. Therefore, our proposed method is still effective for this task.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: In the structure of the fine-tuned Bert(inside the red frame), we will use the hidden layer vector output from the last layer of Bert as a feature.The raw text is preprocessed before training and prediction, as detailed below. The preprocessed text will be given the same label as the author of the text, and then fed into the Bert model one by one for fine-tuning. When the accuracy rate on the validation set no longer increased within two epochs, the training was stopped, and the model with the highest accuracy rate was saved. In the feature extraction stage, we can extract 768-dimensional cls tokens through the trained model. The structure of the fine-tune Bert model is shown in Figure1.</figDesc><graphic coords="2,141.60,465.72,311.04,193.20" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The structure of the EMLC. The training process of our model is mainly divided into two parts: training of Bert model and training of EMLC. We feed the split 5 data (see section 4 for details of data) into five initial Bert models for training and end up with five fine-tuned Bert models, the first part of training. In the second part of the training process, we use the total training set to input these 5 fine-tuned Bert models for feature extraction, resulting in 5 feature representations. These feature representations are then fed into five EMLCs for training, which finally completes all training of the model. The training process of EMLC is shown in Figure 3. The xi on the right side of the figure represents the text in the total training set.</figDesc><graphic coords="3,204.60,166.32,185.76,179.04" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: The left side of the figure is the Bert k pre-trained based on Data k. The right side of the figure shows the training process of EMLC. The features of x i are extracted by the fine-tuned Bert k model and then trained in the corresponding EMLC k (k = 1, 2, 3, 4, 5).</figDesc><graphic coords="3,124.80,460.92,345.00,192.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Construction diagram of data1~5. T stands for the training set, and V stands for the validation set.</figDesc><graphic coords="4,133.80,321.00,327.36,168.36" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Predict author label based on author's preprocessed tweets.</figDesc><graphic coords="5,159.60,185.76,275.04,246.24" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1 The</head><label>1</label><figDesc></figDesc><table><row><cell cols="2">detail of profiling IROSTEREO datasets</cell><cell></cell><cell></cell></row><row><cell>Dataset</cell><cell>Author</cell><cell>Tweets</cell><cell>Labels</cell></row><row><cell>Train dataset</cell><cell>420</cell><cell>84000</cell><cell>420</cell></row><row><cell>Test dataset</cell><cell>180</cell><cell>36000</cell><cell>None</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Results were achieved by the model on a 5-fold cross-validation on the complete training set.</figDesc><table><row><cell></cell><cell></cell><cell></cell><cell>Fold</cell><cell></cell><cell></cell><cell></cell></row><row><cell>Accuracy</cell><cell>1 0.988</cell><cell>2 1.000</cell><cell>3 1.000</cell><cell>4 1.000</cell><cell>5 1.000</cell><cell>Avg. 0.9976</cell></row><row><cell>Table 3</cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell cols="3">Results achieved by our model on the test set</cell><cell></cell><cell></cell><cell></cell><cell></cell></row><row><cell cols="2">Rank</cell><cell></cell><cell></cell><cell>Accuracy</cell><cell></cell><cell></cell></row><row><cell>40</cell><cell></cell><cell></cell><cell></cell><cell>0.9222</cell><cell></cell><cell></cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://github.com/google-research/bert</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://huggingface.co/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_2">minority obeys the principle of majority. For example, three labels out of 5 labels are 0,</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_3">labels are 1, and The final label is 0</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Acknowledgments</head><p>This work is supported by the Natural Science Foundation of Guangdong Province, China (No. 2022A1515011544).</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Profiling Irony and Stereotype Spreaders on Twitter (IROSTEREO) at PAN 2022</title>
		<author>
			<persName><forename type="first">R</forename><surname>Ortega-Bueno</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<ptr target=".org" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2022 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of PAN 2022: Authorship Verification, Profiling Irony and Stereotype Spreaders, and Style Change Detection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bevendorff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<meeting>the Thirteenth International Conference of the CLEF Association (CLEF 2022)</meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">13390</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Profiling Hate Speech Spreaders on Twitter Task at PAN</title>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">L D L P</forename><surname>Sarracén</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fersini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<ptr target=".org" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2021 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">J M M F P</forename><surname>Guglielmo Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Nicola</forename><surname>Ferro</surname></persName>
		</editor>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2021">2021. 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Irony Detection via Sentiment-based Transfer Learning</title>
		<author>
			<persName><forename type="first">S</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="1633" to="1644" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Detection of Hate Speech Spreaders using Convolutional Neural Networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Siino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">D</forename><surname>Nuovo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Tinnirello</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Cascia</surname></persName>
		</author>
		<ptr target=".org" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2021 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Maistro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Piroi</surname></persName>
		</editor>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">HSSD: Hate Speech Spreader Detection using N-grams and Voting Classifier</title>
		<author>
			<persName><forename type="first">F</forename><surname>Balouchzahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">H L</forename></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sidorov</surname></persName>
		</author>
		<ptr target=".org" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2021 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Maistro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Piroi</surname></persName>
		</editor>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">31st Conference on Neural Information Processing Systems (NIPS 2017)</title>
				<meeting><address><addrLine>Long Beach, CA, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="6000" to="6010" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Deep Modeling of Latent Representations for Twitter Profiles on Hate Speech Spreaders Identification. Notebook for PAN at CLEF</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">L</forename><surname>Tamayo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Castro-Castro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">O</forename><surname>Bueno</surname></persName>
		</author>
		<ptr target=".org" />
	</analytic>
	<monogr>
		<title level="m">CLEF 2021 Labs and Workshops</title>
		<title level="s">Notebook Papers</title>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Joly</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Maistro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Piroi</surname></persName>
		</editor>
		<imprint>
			<publisher>CEUR-WS</publisher>
			<date type="published" when="2021">2021. 2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Masking and BERT-based Models for Stereotype Identification</title>
		<author>
			<persName><forename type="first">J</forename><surname>Sánchez-Junquera</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Montes-Y-Gómez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Procesamiento del Lenguaje Natural (SEPLN)</title>
				<meeting>esamiento del Lenguaje Natural (SEPLN)</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">67</biblScope>
			<biblScope unit="page" from="83" to="94" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Research on web text classification algorithm based on improved cnn and svm</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Qu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2017 IEEE 17th International Conference on Communication Technology (ICCT), IEEE</title>
				<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="1958" to="1961" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">TIRA Integrated Research Architecture</title>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gollub</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information Retrieval Evaluation in a Changing World, The Information Retrieval Series</title>
				<editor>
			<persName><forename type="first">Nicola</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Carol</forename><surname>Peters</surname></persName>
		</editor>
		<meeting><address><addrLine>Berlin Heidelberg New York</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019-09">September 2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
