<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Personalized Models Resistant to Malicious Attacks for Human-centered Trusted AI</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Teddy</forename><surname>Ferdinan</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Artificial Intelligence</orgName>
								<orgName type="institution">Wrocław University of Science and Technology</orgName>
								<address>
									<settlement>Wrocław</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jan</forename><surname>Kocoń</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of Artificial Intelligence</orgName>
								<orgName type="institution">Wrocław University of Science and Technology</orgName>
								<address>
									<settlement>Wrocław</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Personalized Models Resistant to Malicious Attacks for Human-centered Trusted AI</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">CBD90155095A7993ECF388C1A0CF9F46</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-04-29T06:37+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>personalized NLP</term>
					<term>poisoning attack</term>
					<term>adversarial machine learning</term>
					<term>learning human representation</term>
					<term>cybersecurity</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Researchers in Natural Language Processing (NLP) and recommendation systems typically train machine learning models on large corpora. In many cases, the corpus is constructed using annotations from a third-party, such as crowd-sourced workers, volunteers, or real users of the social networking services. This opens the possibility of malicious agents providing harmful data into the corpus to introduce unwanted behavior into the model's performance. Existing methods to mitigate the existence of such data are often not applicable or considerably costly. In our paper, we propose personalized solutions for building trusted AI models that possess some inherent resistance against malicious annotations. The personalized human-centered model is trained on textual content and learns representations of users providing their annotations for that content. We compare the predictive performance of such models and a non-personalized baseline on multivariate regression tasks at various levels of simulated malicious annotations. Our results show that the personalized model outperforms the baseline consistently at any malicious annotation level. This makes AI models adapt to the needs of specific users and thus protect them from the effect of potential poisonous attacks.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>It is common in recommender systems for some users to run fake profiles to create biased ratings for content in the system <ref type="bibr" target="#b0">[1]</ref>. This malicious behavior is known as poisonous, shilling, or profile injection attacks <ref type="bibr" target="#b1">[2]</ref>. They can be motivated by unfair competition in the market for products and services and the likes or dislikes of music and video creators. One of the more controversial uses of such attacks is politically or ideologically motivated <ref type="bibr" target="#b2">[3]</ref>, when a group of users agree against a certain person or topic and, for example, maliciously report content about the chosen topic as offensive. Some systems have built-in mechanisms to learn what content to show people based on such reports <ref type="bibr" target="#b3">[4]</ref>. A bigger challenge seems to be using this type of data to train general-purpose classifiers to filter unwanted content, such as hate speech <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref>.</p><p>Today, increasing interest in NLP is directed toward personalized models for subjective tasks <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b7">8,</ref><ref type="bibr" target="#b8">9]</ref>. Such tasks are those for which it is difficult to obtain high agreement between annotators and include recognizing emotions, hate speech, or humor in a text. Naturally, content reception will not be the same for everyone reading a text. However, creating datasets annotated by many people from different backgrounds and cultural circles</p><p>The AAAI-23 Workshop on Artificial Intelligence Safety (SafeAI 2023), <ref type="bibr">February 13-14, 2023</ref>, Washington, D.C., US teddy.ferdinan@pwr.edu.pl (T. Ferdinan); jan.kocon@pwr.edu.pl (J. <ref type="bibr">Kocoń)</ref> 0000-0003-3701-3502 (T. Ferdinan); 0000-0002-7665-6896 (J. <ref type="bibr">Kocoń)</ref> is very expensive. Often, the problem of differences in decisions toward the same object is overlooked in favor of majority voting or creating guidelines to train a group of annotators to get high agreement on their ratings <ref type="bibr" target="#b9">[10]</ref>.</p><p>On the other hand, the use of crowdsourcing platforms is becoming increasingly popular. The cost of obtaining information is lower than hiring annotators, and more diverse content evaluations can be obtained. In addition, in many social media, the text is an important content medium, subject to evaluation by millions of users, making it possible for owners of such platforms to use such data to create filters for unwanted content. New personalized models, in particular, use both the similarity of a person's behavior to other users, as well as their individual content preferences, to make inferences <ref type="bibr" target="#b6">[7]</ref>.</p><p>In this work, we tested how well the best-personalized architectures for inferring textual content are robust to poisonous attacks. For the study, we used the GoEmotions dataset containing nearly 60k texts from Reddit annotated by a large group of people with 28 emotion categories <ref type="bibr" target="#b10">[11]</ref>. Using selected keywords, we simulated the poisonous attack of a group of people on annotated texts (training data). We tested how their attack affects the decision of a system trained on such data on a group of normal users. 
We compared the non-personalized baseline SOTA in NLP (finetuned transformer) with two personalized transformer-based models: HuBi-Medium and User-ID <ref type="bibr" target="#b11">[12]</ref>. The results show that the personalized models are significantly more resistant to poisonous attacks than the baseline models. The larger the group of attackers, the greater the differences in favor of the personalized models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>There have been some efforts to taxonomize attack methods against machine learning models. In general, attack types can be distinguished into poisoning attack, and evasion attack <ref type="bibr" target="#b12">[13]</ref>. A poisoning attack aims to alter the training data to affect the training process, whereas an evasion attack aims to exploit weaknesses in the model without affecting the training process.</p><p>Poisoning attacks can be performed with various techniques. In image recognition, backdooring poisoning attack is popular <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15]</ref>. In this case, a backdoor is a perturbation inserted into an image that triggers misclassification to a label selected by the attacker. Another technique is clean-label poisoning <ref type="bibr" target="#b13">[14]</ref>, in which additional data is embedded into the image without changing the label. In NLP, a similar approach to backdooring poisoning attacks has been investigated. This approach relies on a trigger inserted into the training data to cause misclassification. The trigger may be an uncommon word or a sequence of characters in the example text <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b16">17]</ref>, but it can also be a carefully crafted malicious word embedding <ref type="bibr" target="#b17">[18]</ref>. In the recommendation systems, poisoning is often performed in the form of shilling attack <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b0">1]</ref>,</p><p>where specific examples are crafted with fake user profiles and are inserted into the target system to generate recommendations toward specific items selected by the attacker for the target users.</p><p>Some proposed defense mechanisms to protect machine learning models include comparing the model's performance periodically against a clean baseline <ref type="bibr" target="#b19">[20]</ref>, adding noise to the example, entropy analysis <ref type="bibr" target="#b20">[21]</ref>, early stopping of the training, perplexity analysis, embedding distance analysis <ref type="bibr" target="#b16">[17]</ref>, and rating time series analysis <ref type="bibr" target="#b1">[2]</ref>. However, these options are costly, not always applicable, or unreliable. In this paper, we propose a model with inherent resistance against malicious annotations. Notably, our model does not aim to replace existing defense propositions. Instead, it may complement existing defense methods to improve the system further.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Dataset</head><p>We used GoEmotions <ref type="bibr" target="#b10">[11]</ref> to create datasets for our experiments. It contains 211,225 annotations from 82 unique annotators working on 58,011 unique texts curated from Reddit. Up to five unique annotators rated a given text. Each annotation consists of 28 emotional class labels. The annotators could assign more than one label to a given text. Also, the annotators may not assign any emotional class label and mark the text as unclear.</p><p>There is a striking class imbalance in GoEmotions, as shown in Figure <ref type="figure" target="#fig_1">1</ref>. Some classes, such as Neutral, Approval, and Admiration have very high occurrences,   Therefore, instead of predicting specific emotions, we try to predict the sentiments in the annotations. This allows us to group the emotional class labels by following the result of the sentiment analysis performed by the authors of GoEmotions, as shown in Table <ref type="table" target="#tab_0">1</ref>. Although there is still some class imbalance when using sentimental class labels, it is less substantial.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Experiment 1: Attack Simulation with Compromise Probability</head><p>For our first experiment, we prepared a list of keywords that was used to simulate malicious annotations. Then, we filtered out from GoEmotions only texts that contain at least one keyword. The resulting dataset consists of 18,326 annotations. The sentiment distribution in the  dataset for the first experiment is shown in Figure <ref type="figure" target="#fig_2">2</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Experiment 2: Attack Simulation with</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ratio of Malicious Users</head><p>For our second experiment, we created a dataset consisting of 50% texts containing at least one keyword and 50% texts without any keyword. We also want the dataset to possess roughly equal sentiment distribution. We do this by first dropping annotations with all zeroes in all sentiments and texts that fewer than three annotators rate. Then, we filter only texts that contain at least one keyword, resulting in 18,198 annotations. After that, from an initial sentiment distribution analysis, we found that the sentiment Positive is the most prominent in the picked annotations, followed by Negative, Neutral, and Ambiguous. So, we randomly pick more annotations for the same total number of annotations, but by giving a greater portion for Ambiguous sentiment, followed by Neutral, Negative, and Positive. The final dataset consists of 36,396 annotations. The sentiment distribution in the final dataset for the second experiment is shown in Figure <ref type="figure" target="#fig_3">3</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Poisoning Strategy</head><p>In our experiments, we assume a scenario where the texts are annotated by users whose genuineness cannot always be guaranteed. These users know that the annotations will be used to train a machine-learning model, but they do not know or care about its architecture. Some of these users may provide malicious annotations. However, in individual perspectives modeling, it is important to distinguish the concept of malicious annotation from subjective judgment because they both may appear as statistical outliers. By the term malicious, we mean that the user does not annotate the given text based on any personal value or moral justification. Instead, they annotate to introduce unwanted behavior into the resulting model or at least degrade the performance of the resulting model. We also assume that the users do not have direct access to the environment where the model is trained, and they do not possess high technical capabilities. Therefore, the only way for the users to affect the resulting model is through the annotations.</p><p>To simulate such malicious annotators in our experiments, we deploy a poisoning strategy similar to the trigger-based poisoning attack technique commonly discussed in the literature <ref type="bibr" target="#b15">[16,</ref><ref type="bibr" target="#b16">17]</ref>. We define a list of keywords that will act as triggers to change the annotations' values. These keywords are selected from the top 500 most frequent words in GoEmotions. However, the difference from the common poisoning strategy is that we only change the annotations from users of the Experimental group. In contrast, the annotations from users of the Control group remain untouched. Table <ref type="table" target="#tab_1">2</ref> contains the list of keywords. Finally, testing is performed only on the Control group users to see how malicious annotations during training affect test predictions for genuine users.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Experiment 1: Attack Simulation with</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Compromise Probability</head><p>In the first experiment, we introduce a compromise probability to determine whether the annotations should be changed. This is to simulate the behavior of a malicious user who may not always annotate maliciously. We use compromise probability values of 0.0, 0.125, 0.25, 0.375, and 0.5. All malicious users are taken from the entire Experimental group, which has a fixed size, so the ratio of malicious users to all users remains the same for each compromise probability value. Malicious annotations are created by changing the Negative label to 1 and the other labels to 0. The main goal of the attack is to associate the keywords with the Negative sentiment, although such Change Negative label to 1 and the other labels to 0 The ratio of texts containing a trigger to all texts, first experiment 100% The ratio of malicious users to all users, first experiment 0.5 Compromise probability, first experiment 0, 0.125, 0.25, 0.375, and 0.5 The ratio of texts containing a trigger to all texts, second experiment 50% The ratio of malicious users to all users, second experiment 0.0, 0.   The malicious users are randomly picked from the Experimental group. For example, if there are 82 users in total, then a 10% ratio of malicious users to all users is equal to 8 users. Those eight users are randomly picked from the Experimental group.</p><p>an attack may also affect the predictive performance of other sentiments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Experiment 2: Attack Simulation with</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ratio of Malicious Users</head><p>In the second experiment, we investigate the effects of different sizes of the malicious user group. We do not use the compromise probability, meaning that annotations from users belonging to the malicious user group are always changed. Malicious users are randomly picked from the pool of users in the Experimental group. First, we start with a 0.0 ratio of malicious users to all users, followed by 0.1, 0.2, 0.3, 0.4, and 0.5. Figure <ref type="figure" target="#fig_5">4</ref> shows how we prepare the dataset copies with different malicious annotator levels. Like in the first experiment, malicious annotations are created by changing the Negative label to 1 and the other labels to 0.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Dataset Splitting</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Experiment 1: Attack Simulation with Compromise Probability</head><p>Our dataset splitting strategy for the first experiment can be seen in Figure <ref type="figure" target="#fig_6">5</ref>. First, we randomly choose 50% of all annotators to be put in the Experimental group, whose annotations may be tweaked to simulate malicious annotations. The remaining annotators are put in the Control group, whose annotations are unchanged. Then, we divide the dataset into train, val, and test splits with the ratio 70:20:10, and with the condition that the train and val splits have to contain annotations from both genuine users (Control group) and malicious users (Experimental group). During testing, only predictions for genuine users are compared against the real annotations to compute the result.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Experiment 2: Attack Simulation with</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ratio of Malicious Users</head><p>The dataset splitting strategy for our second experiment is depicted in Figure <ref type="figure" target="#fig_7">6</ref>. It is adapted from <ref type="bibr" target="#b21">[22]</ref>. The division of texts into past, present, future1, and future2 partitions is to simulate available data in a working prediction system. The past partition represents initial annotations made by users when they start using the system. The present partition is analogous to annotations generated by the system's operation. The Future1 and Future2 partitions are meant for validation and test purposes, respectively. Meanwhile, the user-based split follows the 10-fold cross-validation schema. Similar to the first experiment, the train and val splits contain both genuine and malicious user annotations. During testing, only predictions for genuine users are compared against the real annotations to compute the result.   </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Models</head><p>For the sentiment prediction task based on individual perspectives, we take advantage of the following sources of information: text embeddings, user IDs, user embeddings, and word biases. Text embeddings are acquired from the pre-trained language model. The Baseline model is trained with text embeddings without any user tion. On the other hand, the personalized User-ID model is trained with text embeddings and user IDs. Meanwhile, the personalized HuBi-Medium model is trained with text embeddings, user embeddings, and word biases. In personalized models, we assume minimal user knowledge in the form of several texts annotated by the user in the training set, as in <ref type="bibr" target="#b22">[23]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.">Baseline</head><p>We feed text embeddings acquired from the pre-trained language model into the Baseline model and train it on each user's annotation. This is based on the common  approach in NLP where, on a given text, the predictive model provides one unified prediction output for any user. In other words, the Baseline model is trained to produce prediction outputs that are general enough to suit most users, similar to <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b24">25,</ref><ref type="bibr" target="#b25">26,</ref><ref type="bibr" target="#b26">27,</ref><ref type="bibr" target="#b27">28,</ref><ref type="bibr" target="#b28">29,</ref><ref type="bibr" target="#b29">30,</ref><ref type="bibr" target="#b30">31,</ref><ref type="bibr" target="#b31">32,</ref><ref type="bibr" target="#b32">33,</ref><ref type="bibr" target="#b33">34,</ref><ref type="bibr" target="#b34">35]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.">User-ID</head><p>The User-ID model is a personalized model proposed in <ref type="bibr" target="#b5">[6,</ref><ref type="bibr" target="#b8">9]</ref>. To achieve personalization, the user ID of the annotator providing the annotation is added to the text embedding as a special token. Notably, in BERT-based models, special tokens receive their unique embeddings. Then, we feed text embeddings containing user information into the User-ID model and train it on each user's annotation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.">HuBi-Medium</head><p>The HuBi-Medium model is introduced in <ref type="bibr" target="#b6">[7]</ref>. It achieves personalization by optimizing a multi-dimensional latent vector representing the users. This model is based on the Neural Collaborative Filtering (NFC) technique commonly implemented in recommendation systems. However, NFC cannot be applied directly for individual perspective modeling due to the cold start problem. Constructing a decent user representation from scratch is difficult when most texts in the dataset do not receive many annotations. HuBi-Medium overcomes the cold start problem by initializing the latent vector randomly and optimizing the latent vector via backpropagation. The relationship between the user and the given text is signified by the element-wise multiplication between the user embedding and the text embedding, as shown in Figure <ref type="figure" target="#fig_8">7</ref>. The result goes into a fully connected layer and gets summed with word biases to output the prediction. The prediction output is mathematically defined as:</p><formula xml:id="formula_0">𝑦(𝑡, 𝑢) = 𝑊𝑇 𝑈 (𝑎(𝑊𝑇 𝑥𝑡) ⊗ 𝑎(𝑊𝑈 𝑥𝑢)) + ∑︁ 𝑤𝑜𝑟𝑑∈𝑡 𝑏 𝑤𝑜𝑟𝑑</formula><p>where 𝑡 and 𝑢: evaluated text and user; 𝑏: a vector of biases indexed with words; 𝑥𝑡: embedding of the text 𝑡; 𝑥𝑢: embedding of the user 𝑢; 𝑊𝑇 𝑈 , 𝑊𝑇 , 𝑊𝑈 : weights of the fully-connected layers; 𝑎: the activation function.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Experimental Setup</head><p>We design each experiment as a multivariate regression. The task is to simultaneously predict sentiment perception for a given text and a given user in four sentimental labels. The output for each sentimental label is a continuous value in the interval [0,1] that can be interpreted as the probability for the user to label the given text with the associated sentimental label. We use the 𝑅 2 metric to evaluate the models. This measure gives us information on how close the model is to the correct decision.</p><p>The first experiment is repeated through 5 iterations. In each iteration, the average 𝑅 2 value of each configuration is calculated from its 𝑅 2 values from all labels. At the end of the experiment, we analyze the best result from each configuration. Meanwhile, the second experiment deploys a 10-fold cross-validation to evaluate the models over 10 different user-based subsets of equal size. Then, we calculate the average 𝑅 2 value from each label of each configuration</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.1.">Language Model</head><p>For our experiments, we use DistilBERT <ref type="bibr" target="#b35">[36]</ref>, a Transformer-based language model. It is a distilled version of BERT <ref type="bibr" target="#b36">[37]</ref>. We choose DistilBERT because it is significantly faster to train while having almost similar language understanding proficiency as the original BERT. We perform both experiments with fine-tuned models. In fine-tuning, all layers of the pre-trained models are unfrozen. This allows pre-trained weights to be updated via backpropagation during training.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.2.">Hyperparameter Settings</head><p>We utilize Mean Squared Error (MSE) for the loss function and the Adam optimizer. The optimal hyperparameter settings for each model are investigated individually, where it is found that all models perform best with a learning rate of 5e-5. All models are trained for three epochs. In the case of the User-ID model, the size of the text embedding needs to be adjusted due to the additional special tokens. Meanwhile, in the case of the HuBi-Medium model, we need to set several additional hyperparameter settings. The user embedding size is set to 82, equal to the total number of annotators in the dataset. The hidden size for the last fully connected layer is set to 20. The dropout layer above the user embedding is given a rate of 0.2 to prevent overfitting.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.3.">Statistical Testing</head><p>We perform statistical tests to ensure the significance of the differences between the models. First, we check the distribution normality with Q-Q plots and the Shapiro-Wilk test, where the significance level 𝛼 is set to 0.05. We also check the variance homogeneity with the Levene test. We assume that the groups in the data are independent because the results come from different models that do not affect each other. The experiments are performed in isolated environments. Finally, we perform independent samples t-test on the results with 𝛼 = 0.05. We accept the null hypothesis if 𝑝_𝑣𝑎𝑙𝑢𝑒 &gt; 𝛼, meaning there is no significant difference between the two models. We reject the null hypothesis if 𝑝_𝑣𝑎𝑙𝑢𝑒 ≤ 𝛼, meaning there is a significant difference between the two models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.">Results</head><p>In the first experiment, we only used the User-ID model here to be compared against the Baseline model because it is simple to implement without requiring any extension. Figure <ref type="figure" target="#fig_9">8</ref> presents the result from the first experiment. In the second experiment, we compare User-ID and HuBi-Medium personalized models against the Baseline model. Figure <ref type="figure">9</ref> presents the aggregated result from this experiment, while Figure <ref type="figure" target="#fig_1">10</ref> shows the results in each sentiment category.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.1.">Experiment 1: Attack Simulation with</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Compromise Probability</head><p>The User-ID model obtains the best result, with a consistent advantage over the Baseline model at any compromise probability level. Even in the clean dataset setting without malicious annotation, User-ID can achieve an 𝑅 2 score of 28.22%, which is 3.35 percentage points (pp.) higher than the Baseline model. On the other hand, the Baseline model can only achieve an 𝑅 2 score of 24.87% in the clean dataset setting. This shows that using a personalized model can improve the system's predictive performance even when we are certain that the dataset does not contain malicious annotation. Personalization enriches the model to make more accurate decisions in the context of a specific user about whom the model has minimal knowledge, as shown in <ref type="bibr" target="#b6">[7,</ref><ref type="bibr" target="#b5">6,</ref><ref type="bibr" target="#b11">12]</ref>.</p><p>As the compromise probability level increases, the predictive performance of the Baseline model steadily decreases. In general, every time the compromise probability is increased by 0.125, the 𝑅 2 score of the Baseline model drops by roughly 1.73 pp. The exception is when the compromise probability is increased from 0.375 to 0.5, where the 𝑅 2 score dramatically drops by 6.12 pp. from 19.68% to 13.56%. This suggests that the Baseline model cannot converge properly when the frequency of malicious annotations is high.</p><p>Meanwhile, the User-ID model exhibits a more stable performance. With each 0.125 increase of the compromise probability, the 𝑅 2 score changes by only about 0.35 to 0.93 pp. Even when the compromise probability is increased from 0.375 to 0.5, the 𝑅 2 score only decreases by 0.77 pp. from 27.50% to 26.73%. In addition, the statistical tests show that the differences between User-ID and Baseline across the compromise probability values are significant with 95% confidence.</p><p>Our result shows that the higher the compromise probability, the greater the advantage offered by the User-ID model over the Baseline model. This is due to the ability of User-ID to learn about the users that make the annotations. By providing information about the user as an additional special token, the User-ID model can make personalized predictions, where harmful predictions are more likely to be made on users that make malicious annotations and less likely on users making genuine annotations.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8.2.">Experiment 2: Attack Simulation with</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Ratio of Malicious Users</head><p>The models do not give any significant difference up to the 30% malicious annotator level (MAL). At 30% MAL, both User-ID and HuBi-Medium start to outperform the Baseline model, but the differences are still insignificant. However, at 40% MAL, both User-ID and HuBi-Medium perform similarly with a dramatic advantage over the Baseline model, with 95% confidence. At 50% MAL, HuBi-Medium can maintain a stable performance, significantly outperforming both User-ID and the Baseline model. In contrast, the User-ID model fails to gain a significant difference from the Baseline model. Notably, all models perform similarly in the Ambiguous category. User-ID outperforms HuBi-Medium and the Baseline model in the Ambiguous category at 40% MAL. However, all models again perform similarly when there is a 50% MAL. This is because Ambiguous is a difficult category to predict. Unlike Positive and Negative sentiments, which very often can be indicated by the presence of nuanced words in the texts, the Ambiguous sentiment often requires additional knowledge that cannot be easily represented in the language modeling, such as the text's context in the Reddit thread or cultural circle of the user.</p><p>At 10% and 20% MAL, the Baseline seems to outperform all personalized models. However, the statistical tests indicate that these levels' differences are insignif-icant. Nevertheless, the high 𝑅 2 mean of the Baseline model at these levels can be explained, which is due to abnormal behavior in the Neutral category and the Positive category. In the Neutral category, the Baseline model delivers a sharp increase in the 𝑅 2 score at 10% MAL. This is caused by the poisoning strategy, where the annotation for the Neutral category is always changed to zero in the presence of a trigger in the given text. It just happens that the small number of changed Neutral annotations conform to the majority of the genuine Neutral annotations on the affected texts. A similar phenomenon happens in the Positive category. Later, when the MAL is increased from 10% to 20%, the 𝑅 2 score in the Neutral category immediately drops, indicating that the malicious annotations start to contrast and overwhelm the genuine annotations on the affected texts. Meanwhile, the 𝑅 2 score of the Baseline model in the Positive category starts to drop when the MAL is greater than 20%.</p><p>The User-ID model starts gaining an advantage over the Baseline model at 30% MAL, but it only becomes significant at 40% MAL. At 40% MAL, User-ID is significantly better than the Baseline model in Ambiguous, Neutral, and Negative categories, as well as the overall mean.</p><p>The User-ID model loses its significant advantage at 50% MAL. Due to the low exposure of texts to users in the dataset, User-ID tends to put greater importance to the text embeddings than the user ID special tokens. The great number of malicious annotations affects the finetuning process on the text embedding layer significantly. To counter this effect, User-ID requires each text to be annotated by more users to put greater importance to the user ID special tokens. Unfortunately, such a condition cannot be obtained using GoEmotions, so we will need to investigate the phenomenon further in the future with a different dataset.</p><p>In the Positive category, the User-ID model has worse performance than both the Baseline and the HuBi-Medium model. 
Considering that people tend to have high agreement on the Positive sentiment, it appears that predicting this category based on aggregated data alone (the Baseline) may deliver accurate results more often than predicting the individuals (the User-ID model). However, the Baseline suffers from the poisoning attack significantly at MAL &gt;30%.</p><p>HuBi-Medium seems to be the best solution for the problem. In the Positive category, it performs similarly to the Baseline at 0 -30% MAL, and it outperforms the Baseline at MAL &gt;30%. This is because the HuBi-Medium model considers the word biases, which are the main reason for the high agreement in the Positive category. The HuBi-Medium model still offers the benefit of personalization in increasing resistance against malicious annotations, as seen in the minimal drops of predictive performance at 40% MAL and 50% MAL, due to having the user embeddings. The HuBi-Medium model is generally the bestperforming model due to its stability. HuBi-Medium experiences minimal drops in the overall predictive performance at 10% -30% MAL, where a 10% increase in the ratio of malicious annotators to all annotators only reduces the 𝑅 2 mean by about 1.05 pp. When the MAL is increased from 30% to 40%, the 𝑅 2 mean only decreases by 2.4 pp. When the MAL is further increased from 40% to 50%, the 𝑅 2 mean only decreases by 3.12 pp. The drops are much smaller than the drops the other models experienced. Also, HuBi-Medium is the best-performing model at 40% and 50% MAL.</p><p>HuBi-Medium can maintain a stable performance because it extends the basic BERT architecture with user embeddings and word biases. During fine-tuning, the user embeddings can be optimized more precisely than only individual user ID tokens. Meanwhile, the word biases help to prevent dramatic changes in the weights of the text embeddings when malicious annotations are present. A potential drawback of using HuBi-Medium is that the training process tends to be longer due to having more trainable parameters. However, in our experiments with small datasets, the differences in training time are negligible.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="9.">Conclusions and Future Work</head><p>This work is part of a larger research investigating personalized transformer models' resistance against malicious annotations. Our results show that such personalized models are promising solutions for a human-centered trusted AI. In the scenario where attackers do not always perform malicious annotations, the personalized model consistently outperforms the baseline model with minimal decreases in average predictive performance. In a bigger scenario that includes untriggered texts, the ef- fects of the poisoning attack become significant when the ratio of malicious annotators to all annotators is greater than 30%. At that point, the personalized models User-ID and HuBi-Medium show higher predictive performance than the baseline model.</p><p>We must thoroughly examine the limits of the resistance offered by personalized transformer models. In addition, the personalized models need to be evaluated in other machine learning tasks with different datasets and tested against more sophisticated attack methods. We would also like to study possible extensions to the personalized models to increase the resistance against malicious annotations further.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Emotion distribution in GoEmotions dataset. The Y-axis values show the annotation count, while the X-axis values show the emotional class labels.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Sentiment distribution in the dataset for the first experiment. There are 18,326 annotations in total.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Sentiment distribution in the dataset for the second experiment. There are 36,396 annotations in total.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: The poisoning strategy in the second experiment.The malicious users are randomly picked from the Experimental group. For example, if there are 82 users in total, then a 10% ratio of malicious users to all users is equal to 8 users. Those eight users are randomly picked from the Experimental group.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: Dataset splitting in the first experiment. Only predictions for genuine users (the Control group) are considered during testing.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 6 :</head><label>6</label><figDesc>Figure 6: Dataset splitting in the second experiment. Only predictions for genuine users (the Control group) are considered during testing.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Figure 7 :</head><label>7</label><figDesc>Figure 7: The HuBi-Medium model architecture.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 8 :</head><label>8</label><figDesc>Figure 8: Average 𝑅 2 on the test split in the first experiment. baseline_sgl: the Baseline model, personalized_user_id: the User-ID model.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_10"><head>Figure 9 :Figure 10 :</head><label>910</label><figDesc>Figure 9: Average 𝑅 2 on the test split in the second experiment, calculated from the mean of all classes. baseline_sgl: the Baseline model, personalized_user_id: the User-ID model, personalized_hubi_medium: the HuBi-Medium model.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Grouping of Emotions into Sentiments in GoEmotions</figDesc><table><row><cell>Sentiment</cell><cell>Emotions</cell></row><row><cell>Positive</cell><cell>admiration, amusement, approval,</cell></row><row><cell></cell><cell>desire, excitement, gratitude,</cell></row><row><cell></cell><cell>love, optimism, pride,</cell></row><row><cell></cell><cell>caring, joy, relief</cell></row><row><cell>Negative</cell><cell>anger, annoyance, disappointment,</cell></row><row><cell></cell><cell>disgust, embarrassment, fear,</cell></row><row><cell></cell><cell>nervousness, remorse, sadness,</cell></row><row><cell></cell><cell>disapproval, grief</cell></row><row><cell cols="2">Ambiguous confusion, curiosity, realization,</cell></row><row><cell></cell><cell>surprise</cell></row><row><cell>Neutral</cell><cell>neutral</cell></row><row><cell cols="2">while other classes, such as Pride, Relief, and Grief are</cell></row><row><cell cols="2">very rare. The class imbalance is problematic because</cell></row><row><cell cols="2">it creates difficulties in interpreting the results of the</cell></row><row><cell>experiments.</cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc></figDesc><table><row><cell>Poisoning Strategy Parameters</cell><cell></cell></row><row><cell>Keywords</cell><cell>man, guy, fuck, shit, fucking, guys, hell, reddit,</cell></row><row><cell></cell><cell>men, god, religion, dumb, government,</cell></row><row><cell></cell><cell>racist, subreddit</cell></row><row><cell>Malicious annotations</cell><cell></cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work was financed by <ref type="bibr" target="#b0">(1)</ref> the National Science Centre, Poland, project no. 2019/33/B/HS2/02814 and 2021/41/B/ST6/04471; (2) the Polish Ministry of Education and Science, CLARIN-PL; (3) the European Regional Development Fund as a part of the 2014-2020 Smart Growth Operational Programme, CLARIN -Common Language Resources and Technology Infrastructure, project no. POIR.04.02.00-00C002/19; (4) the statutory funds of the Department of Artificial Intelligence, Wroclaw University of Science and Technology.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Practical data poisoning attack against next-item recommendation</title>
		<author>
			<persName><forename type="first">H</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ding</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gao</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of The Web Conference 2020</title>
				<meeting>The Web Conference 2020<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2458" to="2464" />
		</imprint>
	</monogr>
	<note>WWW &apos;20</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Shilling attack detection for recommender systems based on credibility of group users and rating time series</title>
		<author>
			<persName><forename type="first">W</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Qu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Cheng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PLOS ONE</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="1" to="17" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Using machine learning to examine cyberattack motivations on web defacement data</title>
		<author>
			<persName><forename type="first">S</forename><surname>Banerjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Swearingen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Shillair</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Bauer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Holt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ross</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Social Science Computer Review</title>
		<imprint>
			<biblScope unit="volume">40</biblScope>
			<biblScope unit="page" from="914" to="932" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">What is a flag for? social media reporting tools and the vocabulary of complaint</title>
		<author>
			<persName><forename type="first">K</forename><surname>Crawford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gillespie</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">New Media &amp; Society</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page" from="410" to="428" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Vulnerable community identification using hate speech detection on social media</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Mossie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-H</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">57</biblScope>
			<biblScope unit="page">102087</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Offensive, aggressive, and hate speech analysis: From data-centric to humancentered approach</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Figas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Puchalska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kajdanowicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Inf. Process. Manage</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Learning personal human biases and representations for subjective tasks in natural language processing</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bielaniewicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Grimling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kanclerz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Data Mining (ICDM)</title>
				<imprint>
			<date type="published" when="2021">2021. 2021</date>
			<biblScope unit="page" from="1168" to="1173" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Multitask personalized recognition of emotions evoked by textual content</title>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Saganowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Piasecki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2022">2022. 2022</date>
			<biblScope unit="page" from="347" to="352" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">What if ground truth is subjective? personalized deep neural hate speech detection</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kanclerz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Karanowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bielaniewicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st Workshop on Perspectivist Approaches to NLP@ LREC2022</title>
				<meeting>the 1st Workshop on Perspectivist Approaches to NLP@ LREC2022</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="37" to="45" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The origin and value of disagreement among data labelers: A case study of individual differences in hate speech annotation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Sang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Stanton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Information</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="425" to="444" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">GoEmotions: A dataset of fine-grained emotions</title>
		<author>
			<persName><forename type="first">D</forename><surname>Demszky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Movshovitz-Attias</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cowen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nemade</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ravi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</title>
				<meeting>the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="4040" to="4054" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Studemo: A non-aggregated review dataset for personalized emotion recognition</title>
		<author>
			<persName><forename type="first">A</forename><surname>Ngo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Candri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Ferdinan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Korczynski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 1st Workshop on Perspectivist Approaches to NLP@ LREC2022</title>
				<meeting>the 1st Workshop on Perspectivist Approaches to NLP@ LREC2022</meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="46" to="55" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">A taxonomy and survey of attacks against machine learning</title>
		<author>
			<persName><forename type="first">N</forename><surname>Pitropakis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Panaousis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Giannetsos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Anastasiadis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Loukas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computer Science Review</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page">100199</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Backdooring and poisoning neural networks with image-scaling attacks</title>
		<author>
			<persName><forename type="first">E</forename><surname>Quiring</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Rieck</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Security and Privacy Workshops (SPW)</title>
				<imprint>
			<date type="published" when="2020">2020. 2020</date>
			<biblScope unit="page" from="41" to="47" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Systematic evaluation of backdoor data poisoning attacks on image classifiers</title>
		<author>
			<persName><forename type="first">L</forename><surname>Truong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Hutchinson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>August</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Praggastis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Jasper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Nichols</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Tuor</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</title>
				<imprint>
			<date type="published" when="2020">2020. 2020</date>
			<biblScope unit="page" from="3422" to="3431" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Exploring the impact of data poisoning attacks on machine learning model reliability</title>
		<author>
			<persName><forename type="first">L</forename><surname>Verde</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Marulli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Marrone</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Knowledge-Based and Intelligent Information &amp; Engineering Systems: Proceedings of the 25th International Conference KES2021</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">192</biblScope>
			<biblScope unit="page" from="2624" to="2632" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Concealed data poisoning attacks on NLP models</title>
		<author>
			<persName><forename type="first">E</forename><surname>Wallace</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Singh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="139" to="150" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Be careful about poisoned word embeddings: Exploring the vulnerability of the embedding layers in NLP models</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2048" to="2058" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">Z</forename><surname>Gong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Xu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2101.02644</idno>
		<title level="m">Data poisoning attacks to deep learning based recommender systems</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Systematic poisoning attacks on and defenses for machine learning in healthcare</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mozaffari-Kermani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Sur-Kolay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Raghunathan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">K</forename><surname>Jha</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Journal of Biomedical and Health Informatics</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="1893" to="1905" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Salem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Backes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2111.04394</idno>
		<title level="m">Get a model! model hijacking attack against machine learning models</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">Personal bias in prediction of emotions elicited by textual opinions</title>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kanclerz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Grimling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop</title>
				<meeting>the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="248" to="259" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Controversy and conformity: from generalized to personalized aggressiveness detection</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kanclerz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Figas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kajdanowicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Puchalska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</title>
		<title level="s">Long Papers</title>
		<meeting>the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="5915" to="5926" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Classifier-based polarity propagation in a wordnet</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Janz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Piasecki</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</title>
				<meeting>the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Multilevel analysis and recognition of the text sentiment on the example of consumer opinions</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zaśko-Zielińska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International Conference on Recent Advances in Natural Language Processing</title>
				<meeting>the International Conference on Recent Advances in Natural Language Processing</meeting>
		<imprint>
			<publisher>RANLP</publisher>
			<date type="published" when="2019">2019. 2019</date>
			<biblScope unit="page" from="559" to="567" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Recognition of emotions, valence and arousal in large-scale multidomain text reviews</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Janz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Riegel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wierzba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Marchewka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Czoska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Grimling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Konat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Juszczyk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">9th Language &amp; Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">Multilevel sentiment analysis of polemo 2.0: Extended corpus of multi-domain consumer reviews</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zaśko-Zielińska</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)</title>
				<meeting>the 23rd Conference on Computational Natural Language Learning (CoNLL)</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="980" to="991" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Cross-lingual deep neural transfer learning in sentiment analysis</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kanclerz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procedia Computer Science</title>
		<imprint>
			<biblScope unit="volume">176</biblScope>
			<biblScope unit="page" from="128" to="137" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Multiemo: Multilingual, multilevel, multidomain sentiment analysis corpus of consumer reviews</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kanclerz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Computational Science</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="297" to="312" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Mapping wordnet onto human brain connectome in emotion processing and semantic similarity recognition</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Maziarz</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Processing &amp; Management</title>
		<imprint>
			<biblScope unit="volume">58</biblScope>
			<biblScope unit="page">102530</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Aspectemo: multi-domain corpus of consumer reviews for aspect-based sentiment analysis</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Radom</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kaczmarz-Wawryk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Wabnic</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zajączkowska</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Zaśko-Zielińska</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2021 International Conference on Data Mining Workshops (ICDMW), IEEE</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="166" to="173" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<analytic>
		<title level="a" type="main">Deep neural language-agnostic multi-task text classifier</title>
		<author>
			<persName><forename type="first">K</forename><surname>Gawron</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Pogoda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Ropiak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Swędrowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2021 International Conference on Data Mining Workshops (ICDMW), IEEE</title>
				<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="136" to="142" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<analytic>
		<title level="a" type="main">Neuro-symbolic models for sentiment analysis</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Baran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Janz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kajstura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Korczyński</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Piasecki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Szołomicka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Computational Science</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="667" to="681" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<analytic>
		<title level="a" type="main">Multilingual and language-agnostic recognition of emotions, valence and arousal in large-scale multi-domain text reviews</title>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wierzba</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Konat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Klessa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Janz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Riegel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Juszczyk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Grimling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Marchewka</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Language and Technology Conference</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="214" to="231" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b34">
	<analytic>
		<title level="a" type="main">Multi-model analysis of language-agnostic sentiment classification on multiemo data</title>
		<author>
			<persName><forename type="first">P</forename><surname>Miłkowski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Gruza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Kazienko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Szołomicka</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Woźniak</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kocoń</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Conference on Computational Collective Intelligence Technologies and Applications</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="163" to="175" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1910.01108</idno>
		<title level="m">Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
