<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>EmoGraphArt: A Knowledge Graph for Representing and Explaining Emotions in Abstract Paintings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rafael Berlanga</string-name>
          <email>berlanga@uji.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dolores-María Llidó</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>María-José Aramburu</string-name>
          <email>aramburu@uji.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dpto. Ciencias de la Computación, Universitat Jaume 1 de Castellón</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dpto. Lenguajes y Sistemas Informáticos, Universitat Jaume 1 de Castellón</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we propose a representation for explaining the emotions evoked by abstract images. This representation is based on a logical model for explaining emotions derived from visual stimuli, and a knowledge graph aimed at connecting elements from the stimuli, the appraisal of the annotator, and the final evoked emotion. With this approach, we bring together in the same space all the components required for explaining emotions in this field. The adopted model relies on appraisal theory in Psychology, mainly Lazarus' theory, which states that emotions derive from a rationalization process following the stimuli. The proposed knowledge graph provides the different perspectives and abstraction levels needed to analyze the connections between the main components involved in the rationalization of an emotion, namely: the person, the stimuli, and the emotion itself. In this paper, we mainly focus on the opposites of emotions, that is, how emotions are contrasted in the context of rationalizing the stimuli coming from abstract images. We then compare the opposites found with other emotion models that define different axes for complex emotions. The experiments were carried out with the recently published FeelingBlue collection, which associates abstract paintings with basic emotions by means of rationales. We demonstrate that it is possible to extract useful patterns from the rationales by querying the generated knowledge graph.</p>
      </abstract>
      <kwd-group>
        <kwd>Affective XAI</kwd>
        <kwd>Emotion Representation</kwd>
        <kwd>Knowledge Graphs</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Explaining the emotions evoked by abstract images is a challenging task that opens new opportunities
for studying key aspects in creativity and education in arts. Gathering rationales (explanations) together
with the images (stimuli) allows us to better analyze how emotions are built from stimuli through an
appraisal process. In this way, we can achieve a better understanding of how emotional expression
is conveyed by artists.</p>
      <p>
        Recent research in affective explainable AI (XAI) has mainly focused on labeling large datasets of
images with short captions and the evoked emotion. However, these datasets lack enough information
to capture the relationship between the stimuli and the emotion. Moreover, these models are usually
limited to Ekman’s model of basic emotions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Even so, there is no consensus on the set of
labels annotators must use to assign emotions to images.
      </p>
      <p>Rationales have recently been used to explain decisions in text classification. Several approaches
extract the text segments that best align with the rationale and adjust the
classification algorithm to improve its performance. This technique has been successfully applied to the
sentiment analysis task.</p>
      <p>
        Recently, rationales have been used for annotating abstract paintings [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In this case, annotators freely
express in natural language the emotion an image evokes with respect to a reference basic emotion.
Unlike traditional labeling of images, open rationales can provide more details about the evoked emotion
as well as its connection to the mentioned stimuli from the image. However, dealing with rationales
is far more complex, requiring an extraction and normalization process similar to those applied in
ontology engineering.
      </p>
      <p>In this paper, we propose a knowledge graph to formalize the relationships between stimuli and
emotions. We formalize the affective space that results from the appraisal process, connecting the
stimuli to the basic emotional reactions or feelings. Afterwards, we extract from the rationales all the
elements that can be mapped to this space. To the best of our knowledge, this is the first attempt to formalize an
affective space based on appraisal theory for abstract art.</p>
    </sec>
    <sec id="sec-2">
      <title>2. An Explainable Model of Emotions</title>
      <p>
        Several models have been proposed in the literature to represent emotions in metric and qualitative
spaces. The simplest is that of Ekman [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which consists of the five basic reactions that people
usually express in their faces, namely: anger, joy, sadness, fear, and disgust. Arousal-valence maps
are very popular in the community, as they provide a means to map EEG
(electroencephalography) signals to points in such a map [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. These maps inspired other ways of organizing emotions
along axes other than valence and arousal. One of the most elaborate is
the HourGlass (HG) of emotions presented in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Unfortunately, none of these models includes elements
of explainability; that is, they do not explain the origin and the process that produces a particular
emotion. An exception is the OCC model (proposed by Ortony, Clore, and Collins) [5], which organizes
emotions according to different appraisals, providing in this way an abstract explanation of each
emotion. However, this model is limited by the small number of complex emotions it can represent.
      </p>
      <sec id="sec-2-1">
        <title>2.1. Space of emotions</title>
        <p>
          In our approach, we combine some key concepts of the HG and OCC models. From the HG model we adopt
the axes for representing reactions or feelings, where reactions can be composed
to represent more complex feelings. From the OCC model we adopt the way a reaction is
explained in terms of the appraisal of an event or situation. The HG model thus provides
the axes along which emotions can be organized. More specifically, we adapt the revised HG model
[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], where 26 emotion categories are arranged into five axes, namely: Temper, Attention, Sensitivity,
Introspection, and a neutral axis named Control with two categories: anticipation and surprise.
        </p>
        <p>Table 1 shows the main emotion categories associated with the adapted axes. Ekman’s core emotions
are marked in uppercase. Notice that there are no basic emotions associated to ∅, +,  + and +
in Ekman’s core.</p>
        <p>
          For the sake of simplicity, we will not include any expression of intensity in the model; intensities could
easily be included by modifying the polarities with a number in a continuous range like [−1, +1]. For
example, annoyance could be denoted with  −0.3, anger with  −0.5 and rage with  −0.9. Estimating
intensities is part of a calculus that is out of the scope of this work [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
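        <p>The intensity modification described above can be sketched in a few lines of Python. This is our own illustration using the example values from the text (annoyance −0.3, anger −0.5, rage −0.9), not code from the paper:</p>
        <preformat>
```python
# Illustrative sketch: lexical intensity modifiers map an emotion word to a
# signed polarity value in the continuous range [-1, +1]. The three sample
# values come from the text; any other emotion word defaults to neutral.
INTENSITY = {
    "annoyance": -0.3,  # mild negative Temper
    "anger":     -0.5,
    "rage":      -0.9,  # strong negative Temper
}

def scaled_polarity(emotion):
    """Return the signed intensity for an emotion word, 0.0 if unknown/neutral."""
    return INTENSITY.get(emotion, 0.0)

assert scaled_polarity("rage") == -0.9
```
        </preformat>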
        <p>
          We express the composition of emotions using the operator ⊕. For example, love is considered
in the HG model a mixture of acceptance and joy (+ ⊕ +). Composition can combine different
polarities from different dimensions, and the final polarity of the reaction depends on the intensity of
each of the components [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Although in some examples the HG model presents contradictory polarities
within the same dimension, this is very unlikely and such cases should be revised.
        </p>
        <p>From a logical point of view, the composition ⊕ must be interpreted as an “and” operator. Thus, reactions
can be organized into a hierarchy. However, as previously mentioned, inference of intensities
lies outside the logical framework and must be defined through a calculus. For this reason, we omit
intensities from now on.</p>
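        <p>The composition operator and the resulting hierarchy can be modeled as plain set operations. This is a minimal sketch of our own (not the paper's implementation); the axis assignments of joy and acceptance are illustrative:</p>
        <preformat>
```python
# A reaction is a set of (axis, polarity) components. Composition (the ⊕
# operator) is set union, which behaves as a logical "and"; a reaction with
# fewer components is more general, yielding a hierarchy of reactions.
def compose(*reactions):
    """⊕: the composed reaction conjoins the components of all arguments."""
    out = frozenset()
    for r in reactions:
        out = out.union(r)
    return out

def subsumes(general, specific):
    """True if `general` sits above `specific` in the reaction hierarchy."""
    return general.issubset(specific)

joy = frozenset({("Introspection", "+")})
acceptance = frozenset({("Attention", "+")})   # axis chosen for illustration
love = compose(acceptance, joy)                # HG model: love = acceptance ⊕ joy

assert subsumes(joy, love) and subsumes(acceptance, love)
```
        </preformat>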
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Explaining emotional reactions</title>
        <p>According to Appraisal Theory [6], the evaluation of a situation or of an event’s consequences produces
an emotional reaction. After this reaction, one has to cope with it according to one's abilities to handle,
accept or change the situation. Appraisal is a complex process that evaluates relevance, resources, and
options in the context of the goals of the person who judges the future expectancy as favorable or unfavorable.
Reasoning about and understanding the emotional reaction is also important for future appraisals
(learning).</p>
        <p>In our model, we express the main aspects in the appraisal process of an event or a situation similarly
to the OCC model. We denote any event/situation as  , which has the structure shown in Figure 1.
Finally, to put together feelings and explanations, we define causality rules. Thus, a (named) emotion is
defined as a reaction caused by the appraisal of an event or situation. We use the causal operator ▷ to
relate the reaction with the explanation as follows:</p>
        <p>≡  ▷  [, , ]</p>
        <p>Now, some complex emotions can be expressed with this language. For example “jealousy” can be
represented by the following expression:</p>
        <p>Jealousy ≡  − ⊕ − ▷  − [(, ℎ), (), ()]</p>
        <p>This expression means that one feels rejection and anger towards both an intimate and another
person because of a negative appraisal due to a presumed relationship between them.</p>
        <p>We must point out that, in many cases, the reason for an emotion will be just an appraisal with the
same polarity as the reaction.</p>
        <p>In the abstract art domain,  will represent any visual stimulus whose appraisal evokes a particular
reaction in the observer. This will considerably reduce the complexity of the explanations as they usually
do not involve complex social relationships. For this domain, we will introduce specific targets/properties
for the situations, which are based on visual stimuli like colors, shapes, etc. Notice that targets can be
persons, objects or actions.</p>
        <p>The type of the event can be internal, sensory, social, etc.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Logical representation</title>
        <p>The introduced model for defining emotions can easily be expressed within a logical framework such as
description logics (DL). Reactions and events/situations are represented with different classes.
We must define all the nominals and properties necessary to represent the different elements of the
reactions and the events. Disjointness axioms are then introduced over all the contradictory nominals,
such as basic emotions within the same axis with different polarities. Contradicting properties of
events/situations are likewise expressed with disjointness axioms. Finally, the causality operator ▷ must be interpreted
as an “and” operator between reaction and appraisal rather than an implication, because the
appraisal of the same event can lead to different reactions and therefore different emotions.</p>
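        <p>The effect of these disjointness axioms can be sketched in pure Python (our illustration; the actual KG encodes them as DL/OWL axioms): opposite polarities on the same axis are contradictory and may not co-occur within one reaction.</p>
        <preformat>
```python
# Each disjoint pair couples the two polarities of one axis; a reaction
# (a set of (axis, polarity) components) is consistent unless it contains
# some disjoint pair in full.
AXES = ("Temper", "Attention", "Sensitivity", "Introspection")
DISJOINT_PAIRS = [frozenset({(axis, "+"), (axis, "-")}) for axis in AXES]

def consistent(reaction):
    """True unless the reaction asserts both polarities of the same axis."""
    return all(not pair.issubset(reaction) for pair in DISJOINT_PAIRS)

assert consistent({("Introspection", "+"), ("Temper", "-")})        # mixed axes: fine
assert not consistent({("Introspection", "+"), ("Introspection", "-")})  # joy and sadness
```
        </preformat>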
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Representing emotion rationales in a KG</title>
      <p>Knowledge graphs (KGs) are powerful frameworks for representing connected knowledge and data,
providing different degrees of quality by means of ontological axioms [7]. In our proposal, KGs are used
to represent the emotional rationales along with the relevant entities involved in them. The KG is built
according to the following principles:
• It must contain all the concepts involved in the logical model described in Section 2.
• It must map the different elements mentioned in the rationales to concepts in the KG.
• It should link through properties the elements that take part in the generation of a rationale,
mainly: stimuli, reactions and appraisals.
• It should include external concepts to enrich the concept hierarchies and allow the abstraction
of rationales in order to find useful patterns.</p>
      <p>As a result, we have defined two ontologies to support our KG, namely:  and .
The ontology  contains all concepts related to the emotions organized as proposed in the
previous section. The ontology  comprises the lemmas and compound words hierarchically
organized into abstract semantic categories such as objects, actions, parts of body, colors, and so on.</p>
      <sec id="sec-3-1">
        <title>3.1. Extracting concepts from texts and corpora</title>
        <p>Following the methodology and tools proposed by the authors of the FeelingBlue dataset, we
apply lemmatization to extract all the concepts from the rationales. These lemmas are then mapped to
different semantic resources such as WordNet and WikiData to enrich them with useful categories like
emotions, colors, shapes, etc. These categories are especially useful for finding abstract patterns in the
rationales.</p>
        <p>All the metadata concerning the stimuli (images) and the rationales (basic reactions) are also included
in the KG. For example, we include in the KG all the labels assigned by experts in the Wikiart collection
and the labels assigned to the FeelingBlue collection. A normalization process was necessary to
harmonize all these labels according to the proposed XAI emotion model.</p>
        <p>For multimodal XAI, we have also included the palette of colors extracted from the images. This
information is useful to compare the mentioned colors in the rationale with the true colors seen in the
paintings. Incoherent rationales are then revised to check whether they are due to some error in the
normalization process of colors.</p>
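        <p>The paper does not name the tool used to extract the palettes, so the following is a hedged sketch of how such a dominant palette could be obtained with the Pillow library:</p>
        <preformat>
```python
# Extract a small dominant palette from a painting image via Pillow's
# adaptive quantization. Returns (R, G, B) tuples, most frequent first.
from PIL import Image

def dominant_palette(path_or_image, n_colors=5):
    im = path_or_image if isinstance(path_or_image, Image.Image) else Image.open(path_or_image)
    im = im.convert("RGB").quantize(colors=n_colors)
    palette = im.getpalette()                      # flat [r0, g0, b0, r1, g1, b1, ...]
    counts = sorted(im.getcolors(), reverse=True)  # [(pixel count, palette index), ...]
    return [tuple(palette[3 * idx: 3 * idx + 3]) for _, idx in counts]

# e.g. a solid red canvas yields a single red palette entry
red = Image.new("RGB", (32, 32), (200, 30, 30))
assert dominant_palette(red)[0] == (200, 30, 30)
```
        </preformat>
        <p>The extracted tuples can then be normalized to the color names mentioned in the rationales before comparing them.</p>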
      </sec>
      <sec id="sec-3-2">
        <title>3.2. FeelingBlue dataset</title>
        <p>
          FeelingBlue [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] has been constructed from a subset of the WikiArt dataset [8]. WikiArt contains images
from The Visual Art Encyclopedia annotated with 20 emotions. The collection was devised to analyze
human annotations made with and without the title of the images.
        </p>
        <p>FeelingBlue covers only the abstract images of the WikiArt collection, with the goal of analyzing
emotions from purely visual stimuli like color, shapes and textures. Emotions in FeelingBlue have been
reduced to five emotions closely resembling Ekman’s core ones. FeelingBlue contains a total
of 912 images, which have been structured into 1,000 annotation tasks. Each task consists of 4 images
and a reference emotion (out of five). Each task is then performed by several human annotators. Each
annotator writes an explanation for the image with the maximum emotion intensity (MAX) and another one for
the image with the minimum emotion intensity (MIN). As a result, we have multiple annotations for
the same image, with expressions of MAX and MIN over the reference emotions. These annotations
were included as concepts in our KG.</p>
        <p>To complete the KG, we translate the whole FeelingBlue collection by structuring the explicit
MIN-MAX rationales on abstract images. Table 2 shows an example of MIN-MAX explanations and their
corresponding expressions in the proposed model of emotions. Basically, we link the images to their
annotations as well as to the elements participating in the logical expressions of the
stimuli-appraisal-reaction triad involved in each rationale. The final statistics of the resulting KG for the FeelingBlue dataset are
summarized in Table 3. The table does not include the inferred statements, which number around
22,700. Figure 2 shows a fragment of the graph for one painting after loading the KG into the
GraphDB tool [9].</p>
        <p>To allow sentiment analysis, we use the spaCy Python library, so that we can associate a global
polarity to each annotation rationale. Finally, for each image we summarize the polarity of the sentiment
of all its rationales.</p>
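        <p>The per-image summarization step can be sketched as follows. This is our own minimal illustration (taking the mean of the per-rationale polarities), not the paper's code:</p>
        <preformat>
```python
# Summarize the sentiment of all rationales attached to each image as the
# mean of their individual polarities.
from collections import defaultdict
from statistics import mean

def summarize_polarity(annotations):
    """annotations: iterable of (image_id, polarity) pairs.
    Returns {image_id: mean polarity over that image's rationales}."""
    by_image = defaultdict(list)
    for image_id, polarity in annotations:
        by_image[image_id].append(polarity)
    return {img: mean(vals) for img, vals in by_image.items()}

ann = [("img1", -1.0), ("img1", 1.0), ("img2", 0.7)]
assert summarize_polarity(ann) == {"img1": 0.0, "img2": 0.7}
```
        </preformat>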
        <p>In Figure 2, we show the annotated emotions in WikiArt (). Only emotions annotated
by more than 10% of the annotators were included in the KG. In this case, the image was labelled in
WikiArt with the emotions , , ,  and . Regarding the
FeelingBlue annotations, the image has been explained through 16 tasks related to the emotions
 and .</p>
        <p>The  task has been annotated nine times as  . with text polarity −1.0 and eight
times as  . with polarity 1.04. This indicates that the image is controversial, as it has a similar
number of annotations expressing opposite emotions.</p>
        <p>We can now analyze the divergences of emotions between WikiArt and FeelingBlue by retrieving
the polarities implied by the labeled emotions. We can do this through a SPARQL query on the KG as
follows.</p>
        <p>Query 3 selects all the data about images that have been assigned emotions with different
polarities by the two sets of annotators, WikiArt (?emoA) and FeelingBlue (?emoF), where ?emoPA
and ?emoPF represent their respective polarities and ?emoAvalue and ?emoFvalue represent the
annotated values.</p>
        <preformat>SELECT ?term ?emoA ?emoAvalue ?emoPA ?emoF ?emoFvalue ?emoPF
WHERE {
  ?term rdf:type foaf:Image .
  ?term base:EmotionA ?emoAnnoA .
  ?emoAnnoA rdf:value ?emoAvalue .
  ?emoAnnoA rdf:type ?emoA .
  ?emoA base:emoOnto ?emoAnnoAA .
  ?emoAnnoAA base:polarity ?emoPA .
  ?term base:EmotionF ?emoAnnoF .
  ?emoAnnoF rdfs:label ?emoFvalue .
  ?emoAnnoF rdf:type ?emoF .
  ?emoF base:emoOnto ?emoAnnoFF .
  ?emoAnnoFF base:polarity ?emoPF .
  FILTER (?emoPF != ?emoPA)
} GROUP BY ?term ?emoA ?emoAvalue ?emoPA ?emoF ?emoFvalue ?emoPF</preformat>
        <p>From the results of the query, we inspect the case of task "anger.218", where rationales showed
some contradictions. Table 4 shows that the image has been associated with the anger emotion with a
preference for MAX by 9 annotators. However, the overall rationale polarity is positive. In this case, the
text polarity has been wrongly calculated. One of the annotators explains that the image is  .
because “bright red evokes a sense of fury". The positive polarity comes from the word “bright", with
polarity +0.7, whereas “fury" has not been regarded as a negative word by the sentiment analysis tool.
On the other hand, the image has also been annotated as  . with overall positive polarity.
In this case, the rationale associated with the "minimum disgusting image" task was “bright color so better
feel". Notice that both rationales point to the same stimulus (bright color/red), producing opposite
appraisals and reactions: "evokes fury" vs. "better feel".</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Extracting patterns from the knowledge graph</title>
      <p>The resulting EmoGraphArt KG allows us to test hypotheses about the stimuli and the emotions. For
example, an interesting question is how emotions are negated when annotators contrast MIN and MAX
annotations. Other relevant questions are: Are the MIN-MAX oppositions related to the
model axes? Is the annotator prone to select an image with a switched polarity on the same axis as the
reference emotion?</p>
      <p>As described in Section 2, we deal with four axes, namely: Attention (), Introspection (), Temper
( ), and Sensitivity (). We ignore the Control axis because it cannot be placed on this map, and it is
poorly represented in the dataset. The selected axes can be visualized in the arousal-valence map as
Figure 4 shows.</p>
      <p>We analyze the MIN rationales to estimate the likelihood of each emotion being contrasted with
the reference. Table 5 shows the statistics comparing the rationale emotion with the reference emotion
for MIN annotations.</p>
      <p>We can see that many of the MIN explanations fall into the neutral emotion, which means that the
annotator often does not have any feeling with respect to the expected one. This happens especially
for the  . task (− ), whose opposite is the Pleasant emotion (+), which is not the most
frequently mentioned emotion. Notice that for  . the most likely non-neutral evoked emotions
are calm ( +) and joy (+), both positive as expected, but not on the same axis as disgust. Calm ( +) is
a frequent non-neutral emotion for negative emotions, except sadness. This is because anger, disgust,
and fear have high arousal while sadness has low arousal. Thus, the opposites of these emotions are positive
emotions with low arousal, like calm. Finally, the sadness-joy contrast is quite clear in the results. This means
that when the reference emotion is either sadness (− ) or happiness (+), the annotator looks for an
image with the opposite feeling on the same axis.</p>
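      <p>The likelihood estimates behind a table like Table 5 reduce to relative frequencies over (reference, rationale) pairs. A small sketch with toy data (our illustration, not the FeelingBlue statistics):</p>
      <preformat>
```python
# Estimate how often each rationale emotion appears as the MIN contrast
# for a given reference emotion, as relative frequencies per reference.
from collections import Counter

def contrast_table(min_annotations):
    """min_annotations: iterable of (reference_emotion, rationale_emotion).
    Returns {reference: {rationale: relative frequency}}."""
    counts = {}
    for ref, rat in min_annotations:
        counts.setdefault(ref, Counter())[rat] += 1
    return {ref: {r: c / sum(cnt.values()) for r, c in cnt.items()}
            for ref, cnt in counts.items()}

pairs = [("sadness", "joy"), ("sadness", "joy"), ("sadness", "neutral"),
         ("anger", "calm")]
table = contrast_table(pairs)
assert table["sadness"]["joy"] == 2 / 3
```
      </preformat>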
      <p>We have also included some expressions like “less happy” and “less sad” present in some rationales,
showing that for some tasks the annotator looks for a less intense feeling rather than the opposite one.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>Representing explanations of abstract art requires combining visual stimuli with the appraisals
that come from the subjective interpretation of the observer. Due to the complexity of this information,
we propose to represent it in a knowledge graph governed by a logical model of emotions able
to connect the elements necessary for the explanations. With this KG, we are able to find useful
correlations between stimuli and emotions, contextualized to specific images and observers.</p>
      <p>In preliminary experiments, we have shown how this graph can be applied to find contradictions
among the different sources of annotations, as well as to understand how emotions are contrasted for
different images. This is a first step toward studying the dynamics of emotions, that is, how emotions can
lead to further emotions through the course of a series of stimuli. Future work will go further in this line,
as we plan to extract patterns from images to find connections between these stimuli and the emotions
evoked in the rationales.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research has been partially funded by the Spanish Ministry of Science under grant
PID2021-123152OB-C22, funded by MCIN/AEI/10.13039/501100011033, and by the European Union and
FEDER/ERDF (European Regional Development Funds).</p>
      <p>[5] G. Clore, A. Ortony, Psychological construction in the OCC model of emotion, Emotion Review 5
(2013) 335–343.
[6] R. Lazarus, Progress on a cognitive-motivational-relational theory of emotion, American
Psychologist 46 (1991) 819–834.
[7] A. Hogan, et al., Knowledge graphs, ACM Computing Surveys 54 (2021) 1–37.
[8] S. Mohammad, S. Kiritchenko, WikiArt Emotions: An annotated dataset of emotions evoked by art,
in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation
(LREC 2018), 2018.
[9] B. Bishop, S. Bojanov, Implementing OWL 2 RL and OWL 2 QL rule-sets for OWLIM, in: CEUR Workshop
Proceedings, volume 796, 2011.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ekman</surname>
          </string-name>
          , W. Friesen, Unmasking the Face, Prentice-Hall, Englewood Cliffs, NJ,
          <year>1975</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ananthram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Winn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Muresan</surname>
          </string-name>
          ,
          <article-title>Feelingblue: A corpus for understanding the emotional connotation of color in context</article-title>
          ,
          <source>Transactions of the Association for Computational Linguistics</source>
          <volume>11</volume>
          (
          <year>2023</year>
          )
          <fpage>176</fpage>
          -
          <lpage>190</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Russell</surname>
          </string-name>
          , L. Barrett,
          <article-title>Core affect, prototypical emotional episodes, and other things called emotion: dissecting the elephant</article-title>
          ,
          <source>Journal of Personality and Social Psychology</source>
          <volume>76</volume>
          (
          <year>1999</year>
          )
          <fpage>805</fpage>
          -
          <lpage>819</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Susanto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Livingstone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <article-title>The hourglass model revisited</article-title>
          ,
          <source>IEEE Intelligent Systems</source>
          <volume>35</volume>
          (
          <year>2020</year>
          )
          <fpage>96</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>