<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Capturing Human Perspectives in NLP: Questionnaires, Annotations, and Biases⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Wiktoria Mieleszczenko-Kowszewicz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kamil Kanclerz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Julita Bielaniewicz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcin Oleksy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcin Gruza</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stanisław Woźniak</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ewa Dzięcioł</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Przemysław Kazienko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jan Kocoń</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Artificial Intelligence, Wrocław University of Science and Technology</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>This article compiles research on the extraction of human characteristics using three different methods: questionnaires, annotations, and biases. We analyzed how the personalized perception of texts is affected by the individual human profile and bias. To acquire comprehensive knowledge about individual user preferences, we gathered 40 users who annotated 1000 texts in 26 subjective tasks grouped into three categories: positive affect, negative affect, and rational affect. The results revealed that the annotation categories were correlated with psychological dimensions, e.g., agreeableness and conscientiousness, which are traits related to positive affect dimension biases. We observed two clearly defined groups of annotators with respect to humor: those who confidently share their perspectives on what they find funny and those who tend to rate humor levels within a narrow range. Moreover, we analyzed intra-annotator agreement to show that people tend to change their ratings over time. The ranking correlation between annotations and the agreement calculated on binarized annotations are both higher than the absolute agreement calculated on full annotations, which implies that the 10-point annotation scale might be a significant factor in annotator disagreement.</p>
      </abstract>
      <kwd-group>
        <kwd>natural language processing</kwd>
        <kwd>personalization</kwd>
        <kwd>subjectivity</kwd>
        <kwd>annotator bias</kwd>
        <kwd>annotator representation</kwd>
        <kwd>data acquisition</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Resolving natural language processing (NLP) tasks, such as offensiveness detection, humor recognition, or emotion recognition, requires annotators to label the large datasets used to train machine learning models. Although people differ from one another in everyday life, the final label of an annotated instance is the decision of the majority of annotators, called the gold standard. The assumption underlying this process is that most people will perceive texts similarly [1]. Annotations not aligned with the majority vote are not included in the final model. As a result, much information about humans is not used. Moreover, annotators’ personalities are flattened and generalized, affecting the model’s accuracy. Despite existing research [2, 3], there is still little work on measuring how the individual characteristics of a text’s audience influence its perception.</p>
      <p>This article aims to answer the following research questions:</p>
      <list list-type="order">
        <list-item><p>What is the impact of annotators’ individual characteristics on their text perception?</p></list-item>
        <list-item><p>How does the evaluation of texts change over time and what are the crucial factors of such an intra-annotator change of the user?</p></list-item>
        <list-item><p>What are the main differences between methods for capturing human perspectives?</p></list-item>
        <list-item><p>What is the impact of the annotator’s sense of humor on the funny content perceived by themselves and other people?</p></list-item>
        <list-item><p>What are the ranking dependencies of annotations and absolute agreement between annotators?</p></list-item>
      </list>
      <p>2nd Workshop on Perspectivist Approaches to NLP. * Corresponding author. † These authors contributed equally. wiktoria.mieleszczenko-kowszewicz@pwr.edu.pl (W. Mieleszczenko-Kowszewicz). © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
      <sec id="sec-related-work">
        <title>2. Related Work</title>
        <p>Research from recent years has shown that people vary strongly in their perception of text depending on the characteristics they possess. These include cognitive skills [4], personality traits [5], and even the emotions they have experienced [6]. This noticeable diversity between people is reflected in the multiple perspectives present in the annotations. Basile et al. [7] state that the perspectivist approach should be taken into account when determining the gold standard. This implies the need to tailor the standard to each person individually, understanding that the ground truth is subjective. As the differences in user reception of the same text inevitably become apparent, it is crucial to examine them using appropriate measures [8]. The stability of a user’s annotations is an interesting aspect; however, we decided to focus on deviation from the majority. For this reason, we utilized measures such as Personal Emotional Bias [9] and Human Bias [10]. The first metric calculates the degree to which a user differs from the average emotional perception of a given text, while the second compares the bias of an annotator and its similarity to the majority of users. Applying these measures in experiments on natural language processing tasks [11, 12, 13, 14] has confirmed their effectiveness and strongly improved the understanding of the individuality of a user. Furthermore, it has been shown that, compared to standard methods derived from psychology, NLP models are even better at identifying the Big Five personality traits [15]. With that in mind, we decided to assess the results from a collection of different questionnaires, as well as to investigate the annotations of users.</p>
      </sec>
      <sec id="sec-capturing">
        <title>3. Capturing Human Perspectives</title>
        <table-wrap id="tab-1">
          <label>Table 1</label>
          <caption>
            <p>Annotation dimensions categorized depending on the affect and rational nature.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Category</th>
                <th>Dimensions</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Positive affect</td>
                <td>(1) calm, (2) compassion, (3) delight, (4) inspiration, (5) joy, (6) positive, (7) surprise</td>
              </tr>
              <tr>
                <td>Negative affect</td>
                <td>(8) anger, (9) disgust, (10) fear, (11) negative, (12) sadness</td>
              </tr>
              <tr>
                <td>Rational (no affect)</td>
                <td>(13) agreement, (14) embarrassing, (15) funny to me, (16) funny to someone, (17) incomprehensible, (18) interesting, (19) ironic, (20) offensive to me, (21) offensive to someone, (22) political, (23) sympathy, (24) trust, (25) understandable, (26) vulgar</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-1-1">
        <title>3.1. Text Selection Procedure</title>
        <p>To acquire comprehensive knowledge about individual
user preferences, our annotation process consisted of
three major steps: (1) annotation of the large collection
of texts done by a small group of annotators (6 people),
(2) measuring the controversy of the annotated texts with
three methods, and (3) selection of texts for annotation
involving a large group of users (40 people). In the first
step, a small group of experienced annotators annotated
a large collection of comments in Polish. They were
acquired from various Internet forums regarding news,
sport, and lifestyle topics. Then, we measured the
controversy [12] of texts in 3 variants: (1) average controversy
for all dimensions, (2) average controversy of the top
five most controversial dimensions for the specific text,
and (3) highest controversy value of all dimensions for
a certain text. Finally, we separately selected 1/3 of the
texts for annotation with each variant of the controversy.
Furthermore, the texts selected by a specific variant
consisted of 2/3 of the texts with the highest controversy and
1/3 of the texts with the lowest controversy measured by that
variant. In this way, the final dataset obtained in
step (3) comprised texts with diverse controversy, which
enabled the extraction of various user perspectives.</p>
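        <p>The three controversy variants can be illustrated with a minimal sketch. It assumes that a controversy score per dimension is already available for a text; how that score is computed (the measure of [12]) is outside the sketch, and the function name, dictionary layout, and example values are hypothetical.</p>
        <preformat>
```python
from statistics import mean


def controversy_variants(scores, top_k=5):
    """Aggregate the per-dimension controversy scores of one text in three ways.

    `scores` maps a dimension name to its controversy value. The controversy
    measure itself is not implemented here; the values are taken as given.
    """
    values = sorted(scores.values(), reverse=True)
    return {
        "mean_all": mean(values),            # variant (1): all dimensions
        "mean_top_k": mean(values[:top_k]),  # variant (2): top five dimensions
        "max": values[0],                    # variant (3): single highest value
    }


# Hypothetical scores for one text (illustrative numbers only).
example = {"anger": 0.9, "joy": 0.1, "funny to me": 0.5,
           "vulgar": 0.7, "political": 0.3, "trust": 0.2}
variants = controversy_variants(example)
```
        </preformat>
        <p>Ranking all texts by one of the three returned values and taking texts from both ends of the ranking corresponds to the selection described above.</p>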
      </sec>
      <sec id="sec-1-2">
        <title>3.2. Dataset</title>
        <p>Forty annotators participated in the study, with 77.5%
of them being women and 22.5% being men. Their age
ranged from 19 to 56 years (M = 39.9, SD = 10.1).</p>
        <p>The dataset we used is one of the iterations of the
Doccano 1.0 project, which aims to capture subjective
impressions elicited by textual content. It contains 1000
annotated texts, each no longer than 132 words
(M = 24.5, SD = 16.2). On average, each person
annotated around 790 texts and each text was annotated
by around 32 annotators, which gives a little under
31,700 annotations in total. Each annotation consists of
26 independent dimensions (see Tab. 1). For each
dimension, the annotator chose a value from 0 to 10, where 0
means that the annotator did not react and 10 means that
the reaction was strong. Giving no decision is also acceptable
and indicates that the person does not know what value to give.
On average, 62% of the labels in a dimension are zero
(standard deviation of 22%), while 4% are empty
(standard deviation of 8%). The distributions of the remaining
values, which provide us with information about the actual
reactions of the annotators, are shown in Fig. 1.</p>
        <p>The dimensions are divided into three groups:
positive affect, negative affect, and rational (no affect). This
approach is inspired by multiple works [16, 17, 18].</p>
      </sec>
      <sec id="sec-1-3">
        <title>3.3. Measuring Annotator Profile: Questionnaires</title>
        <p>[Figure not recovered. Surviving panel labels: (2) compassion, (3) delight, (15) funny to me, (16) funny to someone, (17) incomprehensible, (18) interesting.]</p>
        <p>Big Five personality traits (Mini-IPIP) [19] is a 20-item questionnaire that measures the factors of the Big Five personality model: extraversion, agreeableness, conscientiousness, neuroticism, and intellect/imagination. Each dimension is measured by four questions, with answers given on a 5-point scale from 1 = very inaccurate to 5 = very accurate. Agreeableness is considered a social trait that aims to maintain positive relationships with others. People who score high on this trait tend to interpret a situation as less controversial and choose the more constructive form of conflict resolution [20]. Extraversion describes people who are active and social; it is also widely known for its association with positive affect. Conscientiousness is a personality characteristic that describes the tendency to be organized, prepared, and hard working, and to maintain a high quality of work [21, 22]. Neuroticism refers to the tendency to experience negative emotions such as anxiety, worry, fear, and sadness [23]. Intellect describes the willingness to seek new experiences, investigate new ideas, experience new tastes, and visit new places [24].</p>
        <p>Humor Styles Questionnaire (HSQ) [25] is a 32-item questionnaire that evaluates four styles of humor applied by a person: (1) self-enhancing, (2) affiliative, (3) aggressive, and (4) self-defeating. The two positive styles indicate (1) the empowerment of self through the use of humor and (2) the willingness to bond with others (mostly the recipients of the texts). The two negative styles refer to inflicting a verbal attack (3) on other people or (4) on oneself through the use of deprecating humor. The value of each style is calculated from the answers to the 32 questions regarding an individual’s sense of humor, with 8 questions per style. Answers are given on a 7-point scale from 1 = totally disagree to 7 = totally agree.</p>
        <p>The regulating emotion systems in everyday life (RESS-EMA) scale [26] evaluates how people regulate their emotions in daily life. The questionnaire consists of 12 items measuring 6 emotion regulation strategies (2 items per subscale). Each item was rated on a scale from 0 = totally disagree to 100 = totally agree, and the respondents ticked off which emotion management strategies they had used in the past month. The subscales are: relaxation (dampening of autonomic arousal), engagement (active expression of emotions), rumination (sustained attention), reappraisal (cognitive reframing), distraction (diverting attention), and suppression (inhibition of emotional expression).</p>
        <p>The Physical Health Questionnaire (PHQ) [27] is a 14-item questionnaire that evaluates four dimensions of somatic health (sleep disturbances, headaches, gastrointestinal problems, and respiratory infections). Items were rated on a 7-point frequency scale.</p>
        <p>Patient Health Questionnaire-9 (PHQ-9) [28] is a questionnaire consisting of 9 questions about the symptoms of depression, which the user rates on a scale of 0 to 3. Depression is one of the most common mental disorders. The core questions of the PHQ-9 address the symptoms of depression included in the DSM-IV diagnostic criteria: the higher the score, the more severe the depression. In the PHQ and PHQ-9 questionnaires, the lowest scores correspond to the absence of symptoms, while higher scores proportionally represent their more frequent occurrence.</p>
        <p>Alexithymia was measured with the PAQ questionnaire [29], which contains 7 questions on a 7-point Likert scale [30] ranging from 1 = strongly disagree to 7 = strongly agree. Alexithymia is a trait that impedes identifying and describing one’s own feelings and is marked by an externally oriented thinking style, manifesting in unintentionally ignoring others’ emotions.</p>
        <p>Perceived Stress Scale [31] measures stress with 10 items on a 5-point scale with answers from 0 = never to 4</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Analytical Results</title>
      <p>We used the Human Bias measure HB(u, d) [14] to capture the diversity between the preferences of a user and those of the others. Its value for a user u within dimension d is a Z-score-based measure that describes how far the user’s annotations a<sub>u,t,d</sub> of all texts t ∈ T deviate from the mean μ<sub>t,d</sub> and standard deviation σ<sub>t,d</sub> of the annotations provided by all users in dimension d, as follows:</p>
      <disp-formula id="eq-1">
        <label>(1)</label>
        <tex-math>\mathrm{HB}(u, d) = \frac{\sum_{t \in T} \frac{a_{u,t,d} - \mu_{t,d}}{\sigma_{t,d}}}{|T|}</tex-math>
      </disp-formula>
      <sec id="sec-2-bs">
        <title>3.5. Back Saturation</title>
        <p>For the purposes of the study, we introduced a measure called Back Saturation (BS). It can be calculated for each text t within a particular dimension as follows:</p>
        <disp-formula id="eq-2">
          <label>(2)</label>
          <tex-math>\mathrm{BS}_{t} = 3\,r_{t-1} + 2\,r_{t-2} + r_{t-3} + r_{t-4} + r_{t-5}</tex-math>
        </disp-formula>
        <p>where r is the rating for the negative dimension, r<sub>t−1</sub> refers to the text one position back, r<sub>t−2</sub> to the text two positions back, etc. For example, if subsequent texts received the following negativity ratings: T1: 3, T2: 5, T3: 3, T4: 2, T5: 7, then the BS for text T6 is:</p>
        <disp-formula id="eq-3">
          <label>(3)</label>
          <tex-math>\mathrm{BS}_{6} = 7 \cdot 3 + 2 \cdot 2 + 3 + 5 + 3 = 36</tex-math>
        </disp-formula>
      </sec>
      <sec id="sec-2-intraaa">
        <title>3.6. Intra-Annotator Agreement</title>
        <p>Inspired by recent work [8], we randomly selected 3 annotators for a very detailed analysis. Its purpose was not only to examine the consistency of the annotations, but also to try to determine the influence of various factors on the change of their decisions. The annotation process was planned in such a way that some texts appeared at least twice (hereinafter ‘duplicates’). It was then possible to calculate the consistency of the annotations of these texts made by a single annotator (hereinafter ‘intra-annotator agreement’ or IntraAA). For some purposes we also introduced soft IntraAA, where annotations that differ by only one point (on a scale of 1-10) are also considered consistent.</p>
      </sec>
      <sec id="sec-2-bias">
        <title>4.2. Bias and Human Characteristics</title>
        <p>Personality. Appendix B.1 shows the correlations between the Big Five traits and the annotations. The results reveal that agreeableness and conscientiousness are traits that are strongly related to positive affect biases. A slightly weaker tendency is observed for negative affect biases. Moreover, these two traits are also moderately correlated with each other, which strengthens the above observation.</p>
        <p>Styles of humor and subjectivity. Although every human understands the concept of humor, each person has their own, distinct sense of it. We can analyze the similarity between people, aggregate the annotation scores into groups, and eventually find the humor scores of the majority of annotators, but there is a very low chance of encountering people with an identical set of humor-related scores. Even then, identical scores in this particular research would not imply that the annotators possess exactly the same sense of humor. This indicates that humor is a highly subjective task, so we need to take into account the perspective of the individual user when assessing their results. As humor in natural language processing is itself a highly personalized task, identifying and categorizing texts with different types of humor may shed some light on the details of a person’s sense of humor. The categorization derived from the Humor Styles Questionnaire in Sec. 3.3 provides a set of humor types that are widely used in humor research, not only in natural language processing but also in psychology [34, 35]. Once acquired, the four measures of different types of humor indicate the intensity of experiencing humor, but, interestingly, Fig. 2 shows that these values actually focus on the external perspective of an individual’s funniness. This is clearly visible in the correlation values between the humor style parameters and the dimensions funny to me and funny to someone, as presented in Fig. 3. Apart from these results, the HSQ metrics seem to be separated from the standard funniness values, as the correlation is much lower than among the HSQ values themselves. As for the individualism of a user, the subjective experience of humor is based on the emotionality of the user. We noticed two distinct groups of annotators with regard to the humor dimension: people who feel free to express their views of funniness, and individuals who hardly exceed small values in both funniness and unfunniness. As shown in Fig. 3, people from the expressive group, such as User 38, have a relatively high correlation when it comes to content being funny to others or to themselves, whereas more reserved people, such as User 39, tend to be mild in their expression of emotions and feelings. This observation extends the area of subjectivity in humor in NLP and emphasizes that not only the experience should be analyzed through personalization, but the expression must also be noticed and thoroughly examined. Detailed correlations between humor and annotation dimensions are presented in Appendix B.2.</p>
        <p>[Figures not recovered. Surviving labels: a colour scale from 1.00 to -1.00; a bar chart titled “User 39 Biases” with series Positive (bias), Funny to me (bias), Funny to someone (bias), Negative (bias), and Mean (bias).]</p>
        <p>Emotion regulation and subjectivity. The relationships between the regulation of emotions and subjectivity are described in Appendix B.3. The use of distraction as a strategy exhibits the most positive relationship with the positive affect dimension and the selected rational biases. In contrast, the relaxation strategy shows an inverse relationship with negative rational biases.</p>
        <p>Health and subjectivity. The relationships are described in Appendix B.4. Depression and gastrointestinal problems are the health characteristics that are more correlated with the negative affect dimension. A similar relationship exists for the vulgar and embarrassing biases. Also, compassion is positively related to health problems. There is a general tendency for people who report health problems to perceive text as less understandable.</p>
        <p>Bias and stress with emotions. The relationships are presented in Appendix B.5. Stress is related to positive affect biases and, slightly more weakly, to negative affect biases. On the other hand, there is a negative relationship between experiencing positive affect and rational biases. Negative affect is related to the negative affect and rational dimensions. Satisfaction with life is weakly negatively related to rational dimension biases.</p>
      </sec>
      <sec id="sec-2-time">
        <title>4.3. Intra-Annotator Agreement over Time</title>
        <p>The sample results (calculated for one annotator; for the complete results for all 3 annotators, see Appendix C) are shown in Fig. 4. Interestingly, IntraAA reaches a level that could be considered very good, or even satisfactory, only in a few cases. The situation is even worse if we exclude the cases of agreement on null marks, especially when they account for a large percentage of decisions (e.g., for the presented user the score ranges from 0.08 to 0.54, and the average is 0.21). However, the soft IntraAA, which also counts answers that differ by only one point as congruent, shows that the differences between the annotations are most often not large: the IntraAA increases significantly (55% on average for the analyzed users). This shows that the analyzed annotators were characterized by relatively high stability. Smaller differences between strict and soft IntraAA indicate the dimensions for which annotators are particularly stable. Such dimensions include joy, inspiration, embarrassing, vulgar, and offensive to me.</p>
        <p>We also investigated this phenomenon by trying to determine the impact of the negativity of previously annotated texts. For the purposes of the study, we used a measure called Back Saturation (BS, see Section 3.5).</p>
        <p>After assigning the appropriate BS value to each text, we compared the BS of each text when it appears for the first time and for the second time. The results were combined with the changes in the annotator’s decision (see Fig. 6). As it turns out, the analyzed annotators changed their decisions without a clear effect of back saturation. However, we observe an imbalance in the proportion in the case where the evaluation of a text changes to a more negative one by one point. Indeed, we note relatively more cases in which such a decision change is associated with the occurrence of a duplicate after more negative texts.</p>
        <p>We believe that a number of factors can affect the change in rating. The basis for the more detailed analysis was the labels within the negative dimension, primarily because this is the dimension for which relatively most labels other than “zero” appear and because it has relatively low concordance scores. Among other things, the analysis looked at the impact of time. It turned out that the tendency to change the decision increased when the text to be annotated was repeated on a different day (some duplicates appeared on the same day). Interestingly, for decisions made on two different days, the proportion of changes from a more to a less negative label increased (at the expense of cases of maintaining the assessment; see Fig. 5).</p>
        <p>Figure 6: The correlation between BS and changes in the decision.</p>
      </sec>
      <sec id="sec-2-relation">
        <title>4.4. Relation between Annotations, Questionnaires and Biases</title>
        <p>To gather holistic knowledge about the user, we decided to include text annotations and questionnaires in the data collection process. Then, we used the acquired annotations to calculate the biases that describe the peculiarity of user preferences relative to others. Each of the human data acquisition methods is described in Tab. 2. To measure the similarity of the knowledge obtained by each of these methods, we used the Pearson correlation coefficient [36]. The results are presented in Fig. 7. The highest correlation values were observed between text annotations and user biases. Lower correlation values appeared between questionnaire answers and user biases. The relation between questionnaires and text annotations is described by the least significant correlation values.</p>
      </sec>
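      <p>The two measures defined in Eq. (1) and Eq. (2) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors’ implementation: the nested dictionary layout and the function names are hypothetical, the Human Bias Z-score is assumed to use the per-text mean and population standard deviation over all users who rated that text, and texts on which all users agree are skipped to avoid division by zero.</p>
      <preformat>
```python
from statistics import mean, pstdev


def human_bias(annotations, user, dimension):
    """Mean Z-score of `user`'s ratings in `dimension` (sketch of Eq. 1).

    `annotations` is a hypothetical mapping:
    text id -> user id -> dimension name -> rating.
    Assumption: per-text mean and population standard deviation.
    """
    z_scores = []
    for ratings_by_user in annotations.values():
        if user not in ratings_by_user:
            continue  # this user did not annotate the text
        values = [r[dimension] for r in ratings_by_user.values()]
        sigma = pstdev(values)
        if sigma == 0:
            continue  # all users agree, so the Z-score is undefined
        z_scores.append((ratings_by_user[user][dimension] - mean(values)) / sigma)
    return mean(z_scores) if z_scores else 0.0


def back_saturation(previous_ratings):
    """Back Saturation (Eq. 2) from the ratings of preceding texts, oldest first."""
    weights = (3, 2, 1, 1, 1)  # weights for 1, 2, 3, 4 and 5 texts back
    recent_first = list(reversed(previous_ratings))[:5]
    return sum(w * r for w, r in zip(weights, recent_first))


# Worked example from Section 3.5: ratings of T1..T5 determine BS for T6.
bs_t6 = back_saturation([3, 5, 3, 2, 7])

# Made-up annotations of two texts by two users on one dimension.
toy = {
    "t1": {"a": {"joy": 2}, "b": {"joy": 4}},
    "t2": {"a": {"joy": 0}, "b": {"joy": 10}},
}
hb_a = human_bias(toy, "a", "joy")
hb_b = human_bias(toy, "b", "joy")
```
      </preformat>
      <p>In the toy data, user a rates every text one standard deviation below the per-text mean and user b one above, so their biases have equal magnitude and opposite sign.</p>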
      <p>[Figure 7 not recovered. Surviving panel labels: quest, quest-anno, quest-bias, anno-quest, anno, anno-bias, bias-quest, bias-anno, bias.]</p>
      <p>The 10-point scale for each dimension makes it difficult to achieve exact agreement between annotators. Therefore, to better understand the phenomenon, we used three different agreement metrics:</p>
      <list list-type="order">
        <list-item><p>Cohen’s kappa on raw annotations.</p></list-item>
        <list-item><p>Cohen’s kappa on binarized annotations, where all nonzero annotations (1-10) were converted to ones (1).</p></list-item>
        <list-item><p>Kendall Tau rank correlation coefficient that measures the ranking agreement between annotators.</p></list-item>
      </list>
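      <p>The first two metrics listed above can be illustrated with a short, self-contained sketch. The function names and the ratings are made up for illustration, empty annotations are assumed to have been removed beforehand, and the standard two-rater Cohen’s kappa is implemented directly so that the example needs no external library.</p>
      <preformat>
```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of labels."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the two annotators' label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def binarize(labels):
    """Map every nonzero rating (1-10) to 1 and keep 0 as 0."""
    return [1 if value != 0 else 0 for value in labels]


# Made-up ratings of six texts by two annotators on one dimension.
annotator_a = [0, 0, 3, 7, 0, 5]
annotator_b = [0, 0, 4, 9, 0, 0]
kappa_raw = cohen_kappa(annotator_a, annotator_b)
kappa_binary = cohen_kappa(binarize(annotator_a), binarize(annotator_b))
```
      </preformat>
      <p>In this toy case the two annotators mostly agree on whether a reaction occurred but not on its strength, so the binarized kappa is higher than the raw one.</p>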
      <p>In the case of the Kendall rank correlation metric, all empty
annotations were removed from the calculations, as they
cannot be ordered. As expected, the average Kappa
agreement scores in most dimensions are very low, with a
minimum for surprise (0.025) and a maximum for political
(0.267). In the case of binarized annotations, the Kappa
agreement increases significantly (between 0.052 for
surprise and 0.513 for political). The Tau coefficient ranges
between 0.081 for surprise and 0.589 for political. The
results also reveal a positive correlation between the
percentage of zero annotations and annotator agreement for a
given dimension (Pearson correlation coefficient of 0.485
for the mean kappa and 0.396 for the mean binarized kappa).</p>
      <p>We also checked the correlation between the mean Tau
coefficient and the absolute differences in the biases of
the annotators. Annotators with high bias are more likely
to rate texts above average, and annotators with low bias
are more likely to rate texts below average. Therefore,
the difference of biases on a given dimension can be
interpreted as the distance between the annotators’ sensitivity
on this dimension. As Tab. 3 shows, these correlations
are mostly negative but very weak. This means that there
is no clear relationship between the annotator ranking
agreement and the difference in their sensitivity.</p>
    </sec>
    <sec id="sec-3">
      <title>5. Discussion</title>
      <p>who are in a positive emotional state are more likely to perceive and interpret stimuli in a positive light. Individuals who have higher levels of life satisfaction may have a generally positive outlook, influencing their perception and interpretation of stimuli as more positive. Generally, according to questionnaire data, positive affect dimensions tend to be affected by the level of health (both mental and physical). Interestingly, people with health problems evaluate text as arousing more compassion.</p>
      <p>Individuals who do not use relaxation strategies as a coping mechanism for stress tend to exhibit a negative affect dimension bias. This suggests that the absence of relaxation techniques may contribute to a tendency to perceive and interpret stimuli in a negative light when experiencing stress. Individuals who employ affiliative humor are less likely to present biases toward negative affect dimensions. There is a positive relationship between agreeableness and differentiation in negative dimension biases, slightly weaker than for positive dimension biases. This implies that individuals with higher levels of agreeableness may display more nuanced biases when it comes to perceiving negative affect. Higher scores in alexithymia are associated with a greater propensity to negative bias. This suggests that individuals who struggle with identifying and expressing their own emotions may be more inclined toward negative biases in their perception and interpretation of stimuli. When individuals experience negative emotions, they are more susceptible to perceiving text through a negative dimension bias. This implies that the emotional state of negativity can influence how individuals interpret and evaluate stimuli, leading to a bias towards negative affect dimensions. Individuals who report lower levels of life satisfaction tend to mark text as more negatively biased. This finding suggests that lower life satisfaction may contribute to perceiving and evaluating stimuli in a negative light, influencing negative dimension biases in the interpretation of the text. People with health problems are more prone to negative dimension biases. Surprisingly, affective biases are less noticeable when people experience stress. Vulgar and embarrassing biases co-occur with each other. People who score higher in neuroticism, experience stress, feel negative emotions, and are less satisfied with life are more prone to perceive texts as more controversial in those two biases. Depression and general health problems can reinforce these biases, as can rumination and distraction as emotion regulation strategies. An inverse relationship with positive emotions confirms the tendency to perceive text as less controversial while experiencing similar emotions. A similar tendency is noticed for people who score higher in intellect and use relaxation and engagement as strategies of emotion regulation. Individuals who are more likely to view a text as offensive or funny tend to experience higher levels of stress, negative emotions, and difficulty in identifying and understanding their own emotions. Additionally, the presence of positive affect appears to have a mitigating effect on this tendency, indicating that higher levels of positive emotions are associated with a reduced likelihood of perceiving the text as offensive or funny. There is a difference between the personality traits that have an impact on the offensive to me and offensive to someone biases. Individuals who score higher in agreeableness and conscientiousness have a tendency to perceive the text as more offensive to them; surprisingly, the tendency is inverse for offensive to someone (only for agreeableness). In other words, individuals high in agreeableness and offensiveness may be more sensitive to personal criticism or offensive remarks directed toward them, but
they may be less sensitive or more understanding when ods implies the necessity to include all of them in the data
it comes to ofensive language or content directed toward acquisition process in order to capture the most relevant
others. There is also a positive relationship between per- representations of various human perspectives.
ceiving text as ofensive to someone and funny (to me or The analysis of the stability of the ratings showed
sevsomeone) with the rumination and suppression strategy. eral important issues. The diference in the evaluation of
Interestingly, no significant relationship was observed duplicates made on a diferent day than the annotation
between the rumination strategy and the perception of of the first occurrence of the text may indicate a gradual
text as ofensive to oneself, suggesting that this particular resilience to the content presented since users rather
lowstrategy may not significantly influence one’s sensitivity ered the score for negativity than upheld their judgment.
to personal ofense. The use of distraction as a coping The introduction of a new measure to determine the
negamechanism has an impact on perceiving content as ofen- tivity of the context in the form of preceding texts ( )
sive and finding humor in it. The inverse relationship is revealed that there is an impact of the negativity of texts
observed for conscientiousness. Individuals with higher previously rated by the annotator – if the context for the
levels of conscientiousness may be more sensitive to po- duplicate is more negative than for the first occurrence
tential threats or negative implications in communication, of the text ( is higher), the annotators tend to assign
leading them to perceive text as ofensive to them more a more negative rating to the duplicate than they did for
frequently. Individuals with higher levels of intellect are the first appearance.
less likely to interpret text as personally ofensive . In
other words, intellectual individuals tend to be more
objective and less sensitive to potentially ofensive content 6. Conclusions and Future Work
directed at themselves. Political bias is higher for
people who score higher in intellect. The same tendency is Our results demonstrated that people vary between
themfor neuroticism. Individuals who are more agreeable are selves in terms of psychological characteristics, which
likely to be a more open-minded and tolerant approach was also reflected in the diversified annotation results.
Rewhen it comes to political beliefs, leading to lower lev- lationships between questionnaire results and biases lead
els of political bias. Also, health problems can influence to several conclusions. First, there is a common tendency
the perception of text as understandable. However, to that specific psychological characteristics are related to
generalize such conclusions, we should conduct more similar dimensions inside the group. e.g., agreeableness
complex studies that consider the use of more specialized with positive afect. It is a question of future research
equipment. The fact that the ranking agreement (Tau to investigate why certain dimensions (e.g., calm with
coeficient) and agreement calculated on binarized anno- agreeableness) did not correspond to the group
tendentations (Kappa binarized) are significantly higher than cies. Second, it is possible to evaluate the intensity of
the agreement calculated on raw annotations suggests psychological characteristics based on the annotation of
that the 10 point annotation scale may be problematic texts. Future studies could further explore this issue by
for annotators. They generally agreed on the presence of selecting the type of text to annotate and developing
popa given dimension in the text, but difered in determining ulation norms. The main conclusion that can be drawn
its exact intensity. Nevertheless, the values of the Tau is that psychological characteristics influence multiple
coeficient are high for most of the tasks, which means perspectives on text perception. Our research also shows
that the annotators generally agreed on the ranking of that it may be worth including information about
annothe dimension intensity of texts. tator characteristics in machine learning solutions. We</p>
      <p>Higher correlation values between text annotations have shown that people tend to change their ratings over
and user biases compared to their relationship with ques- time, and in many cases, the diferences in annotations
tionnaires may be related to the text dependency of those (and therefore intra-annotator agreement) are very high.
methods. On the other hand, more significant positive Undoubtedly, this depends on many factors. One of them
and negative correlations between questionnaires and may be the influence of previously annotated texts. We
biases compared to correlations between questionnaires presented a study conducted by us on a selected sample of
and annotations may be caused by the aggregative na- annotators. Our future work in this regard would involve
ture of biases. They aim to distill user annotations to increasing the scope of this work to more dimensions
emphasize the main diferences between user preferences and a larger number of annotators. The limitations of the
compared to others. Furthermore, the highest number of present studies naturally include the unbalanced gender
negative correlation values was observed between ques- and age group. Another limitation concerns insuficient
tionnaires and biases. This outlines the diferent types sample size to generealize our findings. The source code
of text-agnostic knowledge about the user that can be used during research is publicly available2.
obtained with this method in comparison to annotations 2https://github.com/CLARIN-PL/capturing-human-perspectives/
and biases. Therefore, the distinct nature of those meth- tree/main
</p>
      <p>This work was financed by (1) the National Science Centre, Poland, project no. 2021/41/B/ST6/04471; (2) Contribution to the European Research Infrastructure ‘CLARIN ERIC - European Research Infrastructure Consortium: Common Language Resources and Technology Infrastructure’, 2022-23 (CLARIN Q); (3) the Polish Ministry of Education and Science, CLARIN-PL; (4) the European Regional Development Fund as a part of the 2014-2020 Smart Growth Operational Programme, projects no. POIR.04.02.00-00C002/19, POIR.01.01.01-00-0288/22 and POIR.01.01.01-00-0923/20; (5) the statutory funds of the Department of Artificial Intelligence, Wrocław University of Science and Technology; (6) the Polish Ministry of Education and Science within the programme “International Projects Co-Funded”; (7) the European Union under the Horizon Europe, grant no. 101086321 (OMINO). However, the views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Executive Agency. Neither the European Union nor the European Research Executive Agency can be held responsible for them.</p>
      <p>[11] J. Bielaniewicz, K. Kanclerz, P. Miłkowski, M. Gruza,
[1] D. Hovy, S. Prabhumoye, Five sources of bias in nat- K. Karanowski, P. Kazienko, J. Kocoń,
Deepural language processing, Language and Linguistics sheep: Sense of humor extraction from
embedCompass 15 (2021) e12432. dings in the personalized context, in: 2022 IEEE
[2] K. Kenyon-Dean, E. Ahmed, S. Fujimoto, J. Georges- International Conference on Data Mining
WorkFilteau, C. Glasz, B. Kaur, A. Lalande, S. Bhanderi, shops (ICDMW), 2022, pp. 967–974. doi:10.1109/
R. Belfer, N. Kanagasabai, et al., Sentiment analy- ICDMW58026.2022.00125.
sis: It’s complicated!, in: Proceedings of the 2018 [12] K. Kanclerz, A. Figas, M. Gruza, T. Kajdanowicz,
Conference of the North American Chapter of the J. Kocon, D. Puchalska, P. Kazienko, Controversy
Association for Computational Linguistics: Human and conformity: from generalized to personalized
Language Technologies, Volume 1 (Long Papers), aggressiveness detection, in: Proceedings of the
2018, pp. 1886–1895. 59th Annual Meeting of the Association for
Com[3] A. M. Davani, M. Díaz, V. Prabhakaran, Dealing putational Linguistics and the 11th International
with disagreements: Looking beyond the majority Joint Conference on Natural Language Processing
vote in subjective annotations, Transactions of the (Volume 1: Long Papers), Association for
ComputaAssociation for Computational Linguistics 10 (2022) tional Linguistics, Online, 2021, pp. 5915–5926. URL:
92–110. https://aclanthology.org/2021.acl-long.460. doi:10.
[4] A. Tourimpampa, A. Drigas, A. Economou, P. Rous- 18653/v1/2021.acl-long.460.
sos, Perception and text comprehension. it’sa mat- [13] K. Kanclerz, M. Gruza, K. Karanowski,
ter of perception!, International Journal of Emerg- J. Bielaniewicz, P. Miłkowski, J. Kocoń, P. Kazienko,
ing Technologies in Learning (Online) 13 (2018) 228. What if ground truth is subjective? personalized
[5] M. M. Nitzschner, U. K. Nagler, J. F. Rauthmann, deep neural hate speech detection, in: Proceedings
A. Steger, M. R. Furtner, The role of personality of the 1st Workshop on Perspectivist Approaches
in advertising perception: An eye tracking study, to NLP@ LREC2022, 2022, pp. 37–45.</p>
      <p>Psychologie des Alltagshandelns 8 (2015) 10–17. [14] J. Kocoń, M. Gruza, J. Bielaniewicz, D. Grimling,
[6] X. Sun, X. Zhou, Q. Wang, S. Sharples, Investigating K. Kanclerz, P. Miłkowski, P. Kazienko, Learning
the impact of emotions on perceiving serendipitous personal human biases and representations for
subinformation encountering, Journal of the Associ- jective tasks in natural language processing, in:
2021 IEEE International Conference on Data
Mining (ICDM), IEEE, 2021, pp. 1168–1173. lan, The psychometric assessment of alexithymia:
[15] A. Cutler, D. M. Condon, Deep lexical hypothe- Development and validation of the perth
alexsis: Identifying personality structure in natural lan- ithymia questionnaire, Personality and Individual
guage., Journal of Personality and Social Psychol- Diferences 132 (2018) 32–44.</p>
      <p>ogy (2022). [30] R. Likert, A technique for the measurement of
[16] D. Demszky, D. Movshovitz-Attias, J. Ko, A. Cowen, attitudes., Archives of psychology (1932).</p>
      <p>G. Nemade, S. Ravi, Goemotions: A dataset of fine- [31] S. Cohen, R. C. Kessler, L. U. Gordon, Measuring
grained emotions, arXiv preprint arXiv:2005.00547 stress: A guide for health and social scientists,
Ox(2020). ford University Press on Demand, 1997.
[17] L. Feldman Barrett, J. A. Russell, Independence and [32] E. Diener, D. Wirtz, W. Tov, C. Kim-Prieto, D.-w.
bipolarity in the structure of current afect., Journal Choi, S. Oishi, R. Biswas-Diener, New well-being
of personality and social psychology 74 (1998) 967. measures: Short scales to assess flourishing and
[18] J. B. Nezlek, P. Kuppens, Regulating positive and positive and negative feelings, Social indicators
negative emotions in daily life, Journal of personal- research 97 (2010) 143–156.</p>
      <p>ity 76 (2008) 561–580. [33] E. Diener, R. A. Emmons, R. J. Larsen, S. Grifin, The
[19] M. B. Donnellan, F. L. Oswald, B. M. Baird, R. E. satisfaction with life scale, Journal of personality
Lucas, The mini-ipip scales: tiny-yet-efective mea- assessment 49 (1985) 71–75.
sures of the big five factors of personality., Psycho- [34] K. Förster, P. Kanske, Upregulating positive afect
logical assessment 18 (2006) 192. through compassion: Psychological and
physiolog[20] L. A. Jensen-Campbell, W. G. Graziano, Agreeable- ical evidence, International Journal of
Psychophysness as a moderator of interpersonal conflict, Jour- iology 176 (2022) 100–107.</p>
      <p>nal of personality 69 (2001) 323–362. [35] G. Haydon, J. Reis, L. Bowen, The use of humour
[21] B. W. Roberts, C. Lejuez, R. F. Krueger, J. M. in nursing education: An integrative review of
reRichards, P. L. Hill, What is conscientiousness and search literature, Nurse Education Today (2023)
how can it be assessed?, Developmental psychology 105827.</p>
      <p>50 (2014) 1315. [36] K. Pearson, Vii. note on regression and inheritance
[22] L. D. Smillie, C. G. DeYoung, P. J. Hall, Clarifying the in the case of two parents, proceedings of the royal
relation between extraversion and positive afect, society of London 58 (1895) 240–242.</p>
      <p>Journal of personality 83 (2015) 564–574.
[23] S. Balta, E. Emirtekin, K. Kircaburun, M. D. Grifiths,</p>
      <p>A. Annotator Profiles</p>
      <p>Annotator profiles, comprising the results of the questionnaires described in Section 3.3, are presented in Fig. 8.</p>
      <p>Personality: Agreeableness: The average score of 15.6 suggests that people tend to be moderately cooperative and compassionate towards others (with a standard deviation of 2.2). Extraversion: The average score of 12.1 indicates that, on average, individuals tend to have a moderate level of sociability and assertiveness (with a standard deviation of 4.3). Conscientiousness: With an average score of 15.1, individuals, on average, exhibit a moderate level of organization and responsibility (with a standard deviation of 2.7). Neuroticism: The average score of 12.8 implies that, on average, individuals tend to have a moderate level of emotional stability and experience negative emotions (with a standard deviation of 3.6). Intellect: The average score of 15.1 suggests that, on average, individuals tend to exhibit a moderate level of intellectual curiosity and openness to new ideas (with a standard deviation of 2.5).</p>
      <p>Humor Style: Affiliative humor: The average score of 29.4 indicates that, on average, people tend to use humor extensively to strengthen social bonds and improve relationships (with a standard deviation of 5.5). Self-enhancing humor: The average score of 25.2 suggests that, on average, individuals tend to use humor extensively as a coping mechanism to maintain a positive outlook during stressful situations (with a standard deviation of 6.7). Aggressive humor: With an average score of 19.5, individuals, on average, exhibit a moderate tendency to use humor as a means of teasing or mocking others (with a standard deviation of 4.8). Self-defeating humor: The average score of 19.1 implies that, on average, individuals tend to moderately engage in self-disparaging humor and put themselves down (with a standard deviation of 5.4).</p>
      <p>Stress and Emotions: Alexithymia: The average score of 14.8 indicates that, on average, individuals tend to have a low level of difficulty in identifying and expressing emotions (with a standard deviation of 6.4). Stress: With an average score of 14.5, individuals, on average, perceive a low level of stress in their lives (with a standard deviation of 6.8). Positive affect: The average score of 22.4 suggests that, on average, individuals experience a moderate level of positive emotions (with a standard deviation of 4.7). Negative affect: The average score of 17.5 implies that, on average, individuals experience a moderate level of negative emotions (with a standard deviation of 5.6). Satisfaction with life: The average score of 21.6 indicates that, on average, individuals have a low level of satisfaction and happiness with their lives (with a standard deviation of 5.7).</p>
      <p>Emotion Regulation: Relaxation: The average score of 94.4 suggests that, on average, individuals engage in relaxation techniques to manage their emotions to a moderate extent (with a standard deviation of 61.7). Engagement: With an average score of 125.5, individuals, on average, exhibit a moderate level of involvement and immersion in activities as a means of emotion regulation (with a standard deviation of 62). Rumination: The mean score is 93.7, reflecting a moderate tendency to ruminate or dwell on negative thoughts or emotions (with a standard deviation of 65.3). Reappraisal: The mean score is 115.5, indicating a moderate tendency to reinterpret situations to regulate emotions (with a standard deviation of 58.1). Distraction: The mean score is 89.6, reflecting a moderate preference for using distractions as an emotion regulation strategy (with a standard deviation of 63.4). Suppression: The mean score is 46.5, indicating a relatively lower tendency to suppress or hide emotions (with a standard deviation of 50.4).</p>
      <p>Health: Depression: The average score of 6.1 indicates that, on average, individuals report a relatively low level of depression (with a standard deviation of 5.6). Sleep disturbance: The average score of 10.7 suggests that, on average, individuals experience a low level of sleep disturbance (with a standard deviation of 6.0). Headaches: The average score of 6.7 indicates that, on average, individuals report a low level of headaches (with a standard deviation of 4.3). Gastrointestinal problems: The average score of 7.4 suggests that, on average, individuals report a low level of gastrointestinal problems (with a standard deviation of 4.3). Respiratory infections: The average score of 3.5 indicates that, on average, individuals report a relatively low level of respiratory infections (with a standard deviation of 2.7).</p>
      <p>B. Heatmaps</p>
      <p>Heatmaps may vary in the number of dimensions displayed in the scope of biases and the results of the questionnaire. Only dimensions that exhibit a correlation value of 0.1 or higher are displayed.</p>
      <p>B.1. Personality Traits</p>
      <p>In Fig. 9, agreeableness is moderately correlated with the positive affect dimensions (positive, delight, inspiration, surprise and compassion), whereas it is weakly correlated with joy. The relationship with negative affect dimensions is slightly weaker. Data analysis revealed a weak positive relationship between extraversion and selected positive emotion biases. There is a weak (positive, surprise and compassion) and moderate (delight, inspiration, joy) relationship between the conscientiousness trait and a few positive affect biases, and a weak negative relationship with negative emotion biases (negative, sadness, and anger). Two rational biases (offensive to me and funny to me) are related to conscientiousness. There is a weak negative correlation between neuroticism and positive affect biases (joy, delight, inspiration and compassion). A similar relationship is observed for some negative affect biases (negative, fear) and the opposite for anger. This trait is weakly positively correlated with rational biases (embarrassing, vulgar, political, understandable, offensive to someone). The inverse relationship is observed for anger and offensive to me biases. Intellect is negatively related to three rational biases (offensive to me, vulgar and embarrassing) and positively to two (political and understandable). There is a weak negative association between intellect and positive affect biases (compassion, surprise, calm, inspiration and delight).</p>
      <p>B.2. Humor</p>
      <p>In Fig. 10, affiliative humor has a weak positive correlation with rational dimension biases (understandable) and a negative correlation with embarrassing and interesting. For the negative affect dimension biases, a weak positive correlation can be seen for negative, fear, sadness, disgust and anger bias. Self-enhancing humor is negatively correlated with rational dimension biases (funny to me, offensive to someone, understandable, interesting, political, embarrassing). Aggressive humor is positively correlated with rational dimensions (funny to someone, offensive to someone, understandable, interesting and political), positive dimension biases (positive, joy, delight, surprise) and a negative dimension bias (disgust). Self-defeating humor is correlated with positive affect dimensions (positive, joy, delight, inspiration, surprise, compassion), negative affect dimensions (negative, fear, sadness, disgust and anger) and rational dimensions (embarrassing, interesting), and negatively with political and understandable.</p>
      <p>B.3. Emotion Regulation</p>
      <p>In Fig. 11, engagement has only weak negative correlations with the rational dimensions (ironic, embarrassing, vulgar, understandable, offensive to someone, funny to someone). For the positive affect dimensions, a weak positive correlation can be seen for positive and calm bias. There is also a weak positive correlation for negative bias and a weak negative correlation for disgust bias.</p>
      <p>There is a moderate positive relationship between offensive to someone bias and rumination. For the other rational dimensions (funny to someone, funny to me) a weak positive correlation occurs. The correlations for the positive affect dimensions are similarly distributed: moderate for surprise bias and weak positive for compassion, positive, calm and inspiration bias.</p>
      <p>Reappraisal is weakly correlated with positive dimensions (surprise, positive, compassion) and negative dimensions (negative, sadness). The same absolute value occurs for ironic and funny to me bias, except that the former shows a weak positive correlation and the latter a weak negative correlation.</p>
      <p>There is a moderate relationship between distraction and four positive dimensions (positive, inspiration, joy, compassion). A similar correlation occurs for the two rational dimensions (offensive to someone, funny to someone). Other rational (offensive to me, funny to me, ironic, vulgar) and positive (delight, surprise) dimensions show a weak positive correlation.</p>
      <p>Suppression is moderately correlated with two rational dimensions (funny to someone, funny to me). Other rational dimensions (vulgar, embarrassing, offensive to someone, ironic) have a weak positive relationship. A similar correlation is found with compassion bias.</p>
      <p>B.4. Health</p>
      <p>In Fig. 12, there is a weak relationship between depression and disgust bias. At the same time, there is a negative correlation with positive affects (joy, positive). Among positive affects, only compassion is related to depression, with a weak positive correlation. There are also relationships with rational dimensions: weak positive (ironic, offensive to someone), moderate positive (vulgar, embarrassing), and moderate negative (understandable).</p>
      <p>A positive affect (compassion) is weakly positively related to sleep disturbance. There is a similar but negative correlation with joy bias and rational dimensions (understandable, interesting, incomprehensible). There are also weak positive correlations with rational dimensions (embarrassing, vulgar). The ironic bias has a weak positive correlation. There is no relationship between sleep disturbance and positive bias.</p>
      <p>The item most highly correlated is understandable bias, with a moderate positive correlation with headaches. Other rational dimensions show weak positive correlations (ironic, interesting, embarrassing, vulgar). There is a weak positive association between headache and negative affect (anger). There is a weak negative correlation with incomprehensible bias.</p>
      <p>Gastrointestinal problems are mostly correlated with rational dimensions: moderate positive correlations (ironic, vulgar, offensive to me), weak positive correlations (embarrassing, offensive to someone), and negative correlations, both moderate (understandable) and weak (incomprehensible). There are as many weak positive relationships with positive affects (compassion, positive) as with negative ones (disgust, fear).</p>
      <p>For respiratory infections, the strongest correlations are with rational dimensions, both weakly positive (ironic, interesting) and weakly negative (incomprehensible, offensive to me). Negative affect (anger) has a weak negative correlation. For positive bias, there is a weak positive correlation.</p>
      <p>A moderate or weak negative correlation can be observed between perceiving a text as understandable and headaches, gastrointestinal problems, depression, and sleep disturbance. The same somatic health dimensions have a positive correlation with interpreting a text as vulgar or embarrassing. There is a positive correlation with the ironic bias in all physical health dimensions studied.</p>
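      <p>The heatmaps discussed in this appendix pair questionnaire scores with annotator biases and display only correlations of 0.1 or higher. The following Python sketch illustrates one way such cells could be selected. It is a minimal illustration, not the released implementation: the function names, the toy data, the use of Pearson's coefficient, and the reading of the 0.1 threshold as an absolute value are all assumptions.</p>

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def heatmap_cells(questionnaire, biases, threshold=0.1):
    """Return {(trait, dimension): r} for pairs whose |r| reaches the threshold."""
    cells = {}
    for trait, scores in questionnaire.items():
        for dimension, bias_values in biases.items():
            r = pearson(scores, bias_values)
            if abs(r) >= threshold:  # assumption: threshold applies to |r|
                cells[(trait, dimension)] = round(r, 2)
    return cells


# Hypothetical toy data: one value per annotator.
questionnaire = {"agreeableness": [12, 15, 17, 19, 14]}
biases = {
    "positive": [0.1, 0.3, 0.5, 0.8, 0.2],
    "vulgar": [3.0, 1.0, 1.0, 3.0, 2.0],
}
cells = heatmap_cells(questionnaire, biases)
print(cells)
```

      <p>On this toy data only the (agreeableness, positive) cell is kept, because the correlation with the hypothetical vulgar bias stays below the threshold in absolute value.</p>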
      <sec id="sec-3-1">
        <title>B.5. Stress and Emotions</title>
        <p>In Fig. 13, experiencing stress is moderately (vulgar, embarrassing) and weakly (offensive to someone, funny to someone) positively related to rational biases. A negative relationship is observed only with the understandable rational bias. Positive affect (positive, inspiration) and negative affect (negative, fear, sadness) biases are weakly negatively related to stress. Experienced positive affect is moderately negatively related to rational biases (funny to someone, offensive to someone, political, vulgar and embarrassing). Among the positive affect dimensions, only positive bias is positively correlated. Among the negative affect dimensions, only disgust bias is weakly negatively correlated. Both the negative affect dimensions (negative, fear, sadness, disgust) and the rational dimensions (embarrassing, political, offensive to someone and funny to someone) are weakly related to negative affect. Only the vulgar bias is moderately related to negative affect. There is a weak positive relationship between positive dimension biases (positive, inspiration and surprise) and satisfaction with life. There is a weak inverse relationship between negative dimension biases (fear, sadness) and satisfaction with life. Rational dimension biases (embarrassing, vulgar, political, offensive to someone) are weakly negatively related to satisfaction with life. A positive correlation is observed only for the understandable bias.</p>
        <p>C. Intra-Annotator Agreement</p>
        <p>C.1. Intra-Annotator Agreement for Affective Dimensions</p>
        <p>[Figure: intra-annotator agreement for the affective dimensions. The plotted values and axis labels were not recoverable from the extracted text.]</p>
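        <p>To illustrate how intra-annotator agreement on raw ratings can be much lower than agreement on binarized ratings or on rankings, the sketch below compares one annotator's first rating of each text with the rating of its duplicate. It is a minimal illustration under stated assumptions: the variable names and toy ratings are hypothetical, binarization treats any rating above zero as presence of the dimension, and Cohen's kappa and Kendall's tau-a are used as stand-ins because the exact variants are not specified here.</p>

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long label sequences."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    expected = sum((a.count(label) / n) * (b.count(label) / n)
                   for label in set(a) | set(b))
    if expected == 1:
        return 1.0  # both sequences consist of one identical label
    return (observed - expected) / (1 - expected)


def kendall_tau_a(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            concordant += 1
        elif product != 0:
            discordant += 1  # tied pairs (product == 0) are not counted
    n_pairs = len(a) * (len(a) - 1) / 2
    return (concordant - discordant) / n_pairs


def binarize(ratings):
    """Assumption: a dimension counts as present when its rating is above zero."""
    return [1 if r > 0 else 0 for r in ratings]


# Hypothetical ratings of six texts by one annotator: first pass and duplicates.
first_pass = [0, 2, 4, 0, 8, 6]
duplicate = [0, 3, 5, 0, 9, 7]

kappa_raw = cohen_kappa(first_pass, duplicate)
kappa_binarized = cohen_kappa(binarize(first_pass), binarize(duplicate))
tau = kendall_tau_a(first_pass, duplicate)
print(kappa_raw, kappa_binarized, tau)
```

        <p>On this toy data the raw kappa is 0.25, whereas the binarized kappa is 1.0 and tau is about 0.93: the annotator agrees with themselves on which texts express the dimension and on their ordering, but not on the exact intensity.</p>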
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>