<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>arXiv</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Exploring Neuro-Symbolic AI for Facial Emotion Recognition</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jens Gebele</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anne Vetter</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Philipp Brune</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank Schwab</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian von Mammen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Neu-Ulm University of Applied Sciences</institution>
          ,
          <addr-line>Wileystraße 1, 89231 Neu-Ulm</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Würzburg</institution>
          ,
          <addr-line>Am Hubland, 97074 Würzburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Facial Emotion Recognition (FER) aims at interpreting emotional states from facial behaviors. Deep Learning (DL) models have achieved notable successes in FER through inductive pattern recognition, yet their real-world effectiveness remains limited. This limitation stems from difficulties in capturing the multifaceted and nuanced range of facial behavior in both theoretical models and datasets. To address these issues, this paper proposes adopting Neuro-Symbolic AI (N-SAI), i.e., approaches that combine the rule-based strengths of symbolic AI with the numerical power of DL. We explore various N-SAI strategies, with a particular focus on abductive learning, which interprets sub-symbolic data into logical facts and uses logical abduction to correct misconceptions. This approach not only improves the adaptability of FER systems, but also fosters new insights into the relationship between facial behaviors and emotional states, substantially enhancing the practical utility and effectiveness of FER technologies. Additionally, the analysis relates N-SAI reasoning to human cognition, deduction, induction, and abduction, as a conceptual lens on hybrid intelligence.</p>
      </abstract>
      <kwd-group>
        <kwd>Neuro-Symbolic AI</kwd>
        <kwd>Abductive Learning</kwd>
        <kwd>Hybrid Intelligence</kwd>
        <kwd>Facial Emotion Recognition</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Facial Emotion Recognition (FER) entails the analysis and
interpretation of emotional states based on observable
facial behavior. This technology enhances human-machine
interactions [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ], and plays a crucial role in supporting
interpersonal understanding [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. The current
state-of-the-art in FER predominantly relies on Deep Learning (DL)
techniques such as Convolutional Neural Networks (CNNs),
Recurrent Neural Networks, and Generative Adversarial
Networks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Despite these advances, FER systems face multiple
challenges. Key issues include the complex and variable
relationship between facial expressions and emotional states [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the
scarcity of diverse training data [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], and different
environmental factors such as lighting, background and occlusions
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Inconsistencies in data annotations further complicate
matters [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Moreover, the field also contends with different
emotional theories that propose either a continuous or a
discrete model of emotional states [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], such as the
well-known theory of seven universal basic emotional states
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The ongoing debate about the universal applicability
of facial expressions to infer emotional states highlights
the variability and complexity inherent in human
expressions [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Consequently, while FER systems exhibit robust
performance on controlled datasets, their effectiveness in
real-world scenarios remains limited [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ].
      </p>
      <p>
        To overcome these limitations, we propose
Neuro-Symbolic AI (N-SAI) architectures for FER. N-SAI merges
the rule-based precision of symbolic AI with the adaptive,
data-driven capabilities of neural approaches,
encompassing both symbolic reasoning and neural inductive reasoning
[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This innovative approach is designed to address both
data-related and theoretical challenges in emotion
recognition. This paper conceptually explores various N-SAI
architectures for FER, with a particular focus on
abductive learning, to reveal new insights into the correlations
between facial expressions and emotional states across
different contexts and conditions. The structure of this paper
is as follows: Section 2 reviews current FER research; Section 3
examines six N-SAI design patterns; and Section 4 highlights
the most promising N-SAI design for
FER, focusing on abductive learning and drawing conceptual
parallels to human cognitive processes, namely, deduction,
induction, and abduction.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Facial Emotion Recognition</title>
      <p>
        In the field of FER, researchers pursue two main
approaches. The first approach involves directly
recognizing an emotional state from facial expressions in a
single step. This method typically utilizes DL models trained
on a large corpus of annotated data [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In contrast, the
second approach starts with the detection of facial
movements, which are categorized as Action Units (AUs) based
on the Facial Action Coding System (FACS) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] standard.
These AUs are then mapped to corresponding emotional
states [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Ultimately, both approaches rely on annotations
of emotional states grounded in emotional theories, using
either discrete [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or continuous label spaces [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ]. In
particular, the discrete model of the seven universal basic
emotions is predominantly used.
      </p>
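      <p>To make the two pipelines concrete, the following minimal Python sketch contrasts them. All components are illustrative stand-ins rather than parts of any actual FER system; the AU-emotion mappings are the simplified FACS-based examples used later in this paper.</p>
      <preformat>
# Illustrative sketch of the two FER approaches; the "models" are
# hypothetical stand-ins, not a real implementation.

AU_TO_EMOTION = {frozenset({6, 12}): "happiness",      # AU6 + AU12
                 frozenset({1, 4, 15}): "sadness"}     # AU1 + AU4 + AU15

def direct_fer(image) -> str:
    """Approach 1: a DL model maps the image straight to an emotion."""
    return "happiness"                                 # stand-in for a CNN

def au_based_fer(image) -> str:
    """Approach 2: detect FACS Action Units, then map them to an emotion."""
    aus = frozenset({6, 12})                           # stand-in AU detector
    return AU_TO_EMOTION.get(aus, "unknown")
      </preformat>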
      <p>
        A significant challenge across both approaches is the
accurate mapping of facial behavior to emotional states,
especially as the relationship between these behaviors and
states is recognized to be more complex than previously
understood, thus complicating the fidelity of data
annotations based on discrete and continuous emotional models
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. This complexity remains whether the mapping is
performed directly through features learned by DL models or
indirectly through the detection of Action Units by certified
FACS coders [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] or semi-automated tools [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In all cases,
a relationship to emotional states must be established at the
end.
      </p>
      <p>
        Given this complexity, it is not surprising that while DL
models achieve impressive results in controlled
environments, their performance in real-world settings often falls
short [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This limitation is primarily due to the models’
dependence on approximating the distribution of the training
data. Capturing the nuanced and multifaceted nature of
facial expressions of emotions remains a significant challenge,
necessitating considerable efforts to ensure datasets are
diverse and of high quality [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Despite growing awareness
of the foundational issues related to emotion theory
concepts, validated approaches that address these underlying
problems are still relatively rare.
      </p>
      <p>
        The AI community is exploring solutions through training
on data of higher quantity and quality, and by employing
more complex DL strategies, such as multitask networks
and transfer learning [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However, even without the data
quantity challenge, the effectiveness of purely inductive
DL approaches remains questionable. The complexity and
variability of the relationship between facial expressions
and emotional states, further influenced by cultural and
age-related differences, make this task particularly challenging.
Consequently, the mapping of facial behavior to emotional
states, a critical foundation for annotating data of facial
expressions, continues to pose a major issue.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Neuro-Symbolic AI (N-SAI)</title>
      <p>Given the complex challenges identified in the field of FER,
particularly in accurately mapping facial behaviors to
emotional states across different contexts and demographics, we
see N-SAI as a promising solution. N-SAI combines the
rule-based strengths of symbolic AI (symbolism), characterized
by deductive reasoning, with the numerical power of
sub-symbolic DL, known for its inductive learning capabilities.
This hybrid approach is designed to leverage the precision
of symbolic rules and the adaptability of DL (connectionism)
to effectively tackle the nuanced complexity of FER.</p>
      <p>
        In this context, a clear definition of symbols, which are
fundamental to symbolic AI, is helpful. There is an ongoing
philosophical debate about symbolic versus non-symbolic
or sub-symbolic data [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. To simplify this issue, we adopt
Berkeley’s interpretation [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], which defines a symbol as
meeting the following criteria:
1. It represents an object, category, or relationship.
2. It can be either simple or composite, consisting of
other symbols.
3. It requires a defined process for creating new
symbols from existing ones, ensuring that each new
symbol also represents something.
      </p>
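      <p>As a minimal illustration of these three criteria, consider the following Python sketch; the Symbol class and compose function are hypothetical constructs introduced purely for exposition.</p>
      <preformat>
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Symbol:
    """Criterion 1: a symbol denotes an object, category, or relationship.
    Criterion 2: it may be simple (no parts) or composite."""
    denotes: str
    parts: Tuple["Symbol", ...] = ()

def compose(denotes: str, *parts: Symbol) -> Symbol:
    """Criterion 3: a defined process that creates a new symbol from
    existing ones, so that the result again denotes something."""
    return Symbol(denotes, parts)

au6 = Symbol("AU6 (Cheek Raiser)")            # simple symbol
au12 = Symbol("AU12 (Lip Corner Puller)")     # simple symbol
happiness = compose("happiness", au6, au12)   # composite symbol
      </preformat>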
      <p>
        Recent research interest in N-SAI has surged, reflecting
a growing recognition of its potential. However, the
existing literature remains diverse and predominantly empirical,
making it challenging to navigate. To provide clarity, we
draw on the seminal works of ten Teije and van Harmelen
[
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] and Kautz [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], which offer complementary
perspectives on N-SAI architecture designs. While ten Teije and van
Harmelen focus on detailed architectural analysis, Kautz
provides high-level design patterns that serve as the
foundation for our systematic overview. In this paper, we adopt
Kautz’s classification of six key N-SAI design patterns,
illustrating each with concrete examples, advantages, and
disadvantages, before demonstrating which of these
frameworks address key challenges in FER.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Symbolic-Neuro-Symbolic</title>
        <p>
          The Symbolic-Neuro-Symbolic pattern describes an approach
where both the input and output are symbolic, while the
processing in between is handled numerically by a neural
architecture. For example, words can be converted to
numerical vectors, e.g. by means of GloVe [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ], which are
processed by DL models to yield outputs in the form of
sequences or categories represented again as symbols [
          <xref ref-type="bibr" rid="ref20 ref22">20, 22</xref>
          ].
This architecture offers the advantage of inductive learning
from large amounts of data, enabling the system to discover
patterns and generalize effectively. The lack of
explainability is a key drawback, as the neural component operates as
a black box, making it difficult to interpret decisions. The
system is also highly dependent on data quality and
quantity. For FER, this design pattern is ineffective because it
requires processing sub-symbolic data, like image pixels,
which symbolic rules alone cannot adequately represent.
        </p>
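        <p>A toy numerical sketch of this pattern, with random arrays standing in for GloVe embeddings and trained classifier weights, might look as follows (all names and values are illustrative only):</p>
        <preformat>
import numpy as np

# Toy Symbolic-Neuro-Symbolic pipeline: symbolic input, numerical
# processing in between, symbolic output.

rng = np.random.default_rng(0)
vocab = {"happy": 0, "sad": 1}              # symbolic input space
embed = rng.normal(size=(2, 8))             # stand-in for GloVe vectors
weights = rng.normal(size=(8, 2))           # stand-in for a trained model
labels = ["positive", "negative"]           # symbolic output space

def classify(word: str) -> str:
    vec = embed[vocab[word]]                # symbol to vector
    logits = vec @ weights                  # neural (numerical) processing
    return labels[int(np.argmax(logits))]   # vector back to symbol

print(classify("happy"))
        </preformat>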
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Symbolic[Neuro]</title>
        <p>
          The Symbolic[Neuro] pattern focuses on a symbolic
problem-solving method, enhanced by a neural network acting as a
pattern recognition subroutine. In this setup, the neural
network’s numerical capabilities support the decision-making
of the symbolic core. A notable example is AlphaGo [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ],
where the symbolic problem-solving core is implemented
using Monte Carlo Tree Search [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ], guided by a neural
network as the evaluation subroutine. This architecture is
particularly effective in scenarios requiring complex
decision-making, such as autonomous driving [
          <xref ref-type="bibr" rid="ref20 ref22">20, 22</xref>
          ].
        </p>
        <p>
          A key advantage of this pattern is its effective search
process, where the symbolic system explores possible
decisions guided by inductively learned representations from
the neural network. This combination provides strong
generalizability, allowing it to adapt to similar tasks, and easy
transferability to other structured environments (e.g.,
different games) without requiring extensive domain-specific
knowledge. However, the pattern has limitations. Its
explainability is reduced when sub-symbolic (e.g., pixel-based)
input is fed into the symbolic module without interpretable
representations. Additionally, its transferability is limited in
tasks lacking well-defined rules, particularly in settings
characterized by continuous or ambiguous inputs. Moreover,
the lack of abstract reasoning capabilities can hinder
performance in tasks requiring logical generalization beyond
learned patterns [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. In the context of FER, this design
pattern leverages the numerical power of DL and symbolic
logic, but its one-sided flow of information, from the neural
network to the symbolic system, prevents the emergence of
higher reasoning capabilities through dynamic interaction.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Neuro | Symbolic</title>
        <p>
          In the Neuro | Symbolic design, a neural network takes
sub-symbolic (or non-symbolic) inputs, such as image pixels,
and converts them into a format that a symbolic
reasoning system can understand and process. An example of
this is the Neuro-Symbolic Concept Learner [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ], which
integrates object-based scene representations and symbolic
program execution to perform tasks like visual question
answering and semantic parsing without direct supervision.
In the Symbolic[Neuro] design, the neural module serves as
a secondary subroutine, whereas in the Neuro | Symbolic
approach it functions as a co-routine, working in parallel
with the symbolic system [
          <xref ref-type="bibr" rid="ref20 ref22">20, 22</xref>
          ].
        </p>
        <p>
          This design offers several advantages. It requires only
weak data supervision, eliminating the need for pixel-level
annotations. The architecture can learn novel visual and
linguistic concepts, enabling strong generalization across tasks.
Additionally, the symbolic reasoning component improves
interpretability compared to purely neural approaches by
offering logical, structured explanations of decisions.
However, the design faces challenges. It has a high dependency
on the sub-symbolic perception module, as the symbolic
system cannot correct errors made by the neural classifier. This
limits performance in real-world scenarios with ambiguous
object boundaries or new, unseen categories. Furthermore,
the lack of end-to-end differentiability complicates training,
as symbolic reasoning does not support backpropagation of
gradients. The system also struggles with abstract
reasoning, especially when dealing with high-level concepts that
are not directly captured by visual perception [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ].
        </p>
        <p>For FER, this pattern shows potential due to its ability
to process sub-symbolic data and provide interpretable
outputs, but its inability to correct neural classifier errors makes
it poorly suited to facial behavior, which can be highly
ambiguous. Additionally, as the knowledge describing
AU-emotion relationships is often debated in emotion
psychology, this pattern does not support the necessary refinement
or adaptation of symbolic knowledge.</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Neuro: Symbolic Neuro</title>
        <p>
          The Neuro: Symbolic Neuro approach adheres to the
architecture outlined in 3.1, with the distinction that training
leverages symbolic rules instead of textual data. A notable
implementation is the 2020 study by Lample and Charton
[
          <xref ref-type="bibr" rid="ref27">27</xref>
          ], focusing on symbolic mathematics. They developed a
transformer model trained to simplify mathematical
expressions from one form (A) to another (B). Post-training, the
model demonstrated the ability to accurately simplify new,
previously unseen expressions, providing correct solutions
directly without step-by-step derivations [
          <xref ref-type="bibr" rid="ref20 ref22">20, 22</xref>
          ].
        </p>
        <p>
          This approach has several advantages. It exhibits high
generalization power, allowing it to solve new, unseen
problems. The end-to-end nature of training and inference
eliminates the need for symbolic reasoning during prediction,
making it computationally efficient. Studies by Lample and
Charton [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] show that such models can outperform
traditional symbolic systems, and the architecture is transferable
to other symbolic tasks by adjusting the training data.
However, there are challenges. The lack of explainability is a
key drawback, as the system does not produce step-by-step
derivations, making it difficult to trace how solutions are
reached. The model is highly dependent on the quality and
quantity of training data, and new symbolic expressions
require retraining with updated data due to its inductive
reasoning-only approach. Additionally, this method does
not validate predictions, potentially reducing reliability. For
FER, this pattern is similarly ineffective, like 3.1, because it
solely relies on symbolic rules without the ability to process
sub-symbolic data such as facial images.
        </p>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Neuro_{Symbolic}</title>
        <p>
          The Neuro_{Symbolic} architecture incorporates symbolic
representations (rules) as templates to structure neural
networks. Logic tensor networks [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] and tensor product
representations [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] have successfully integrated hierarchical
and abstract concepts within these neural networks. By
encoding disjunctive rules (OR), this approach could facilitate
combinatorial reasoning, enabling the system to manage
multiple scenarios simultaneously [
          <xref ref-type="bibr" rid="ref20 ref22">20, 22</xref>
          ].
        </p>
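        <p>To sketch how a symbolic rule can act as a structural template or constraint, the following toy example encodes a disjunctive implication as a differentiable penalty in the spirit of logic tensor networks; it does not use the actual LTN framework, and the rule itself is only illustrative:</p>
        <preformat>
# Toy soft-logic constraint: happiness implies (AU6 OR AU12).
# Truth values are probabilities in [0, 1]; this is a hand-rolled
# illustration, not the logic tensor network API.

def soft_or(a: float, b: float) -> float:
    return a + b - a * b                    # probabilistic sum for OR

def rule_penalty(p_happy: float, p_au6: float, p_au12: float) -> float:
    """0 when the rule holds; grows as 'happiness' is asserted
    without either supporting AU being detected."""
    implication = soft_or(1.0 - p_happy, soft_or(p_au6, p_au12))
    return 1.0 - implication

print(rule_penalty(0.9, 0.1, 0.2))          # ~0.65: rule violated
print(rule_penalty(0.9, 0.9, 0.2))          # ~0.07: rule largely satisfied
        </preformat>
        <p>Added to a network's training loss, such a penalty would act as the kind of logical guard rail discussed in the following paragraph.</p>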
        <p>
          This architecture offers the advantage of combining
inductive DL with symbolic reasoning, allowing systems to
benefit from both data-driven pattern discovery and
structured logic. The integration of symbolic logic enhances
robustness, as it provides logical constraints or guard rails
that can enforce requirements, such as those mandated by
regulations. This dual nature improves generalization in
tasks where structured reasoning and pattern recognition
are both required. These advantages are accompanied by
notable challenges. Both training and inference become
more complex. The performance is highly dependent on
the quality of the symbolic component; poorly defined
symbolic rules can limit generalization and lead to errors when
examples fall outside predefined logic. Additionally,
encoding symbolic knowledge into neural networks is nontrivial
and can require significant domain expertise. Explainability
can be an advantage or disadvantage: well-defined
symbolic knowledge enhances transparency, but complex
encodings can reduce interpretability [
          <xref ref-type="bibr" rid="ref28 ref29">28, 29</xref>
          ]. For FER, this
architecture is problematic because it relies on symbolic
representations (concepts) embedded in neural networks,
which are heavily debated in emotion psychology, making
it unsuitable for this domain.
        </p>
      </sec>
      <sec id="sec-3-6">
        <title>3.6. Neuro[Symbolic]</title>
        <p>
          The Neuro[Symbolic] architecture combines symbolic
reasoning and neural processing by embedding a symbolic
engine within a neural engine to enhance “superneuro” and
combinatorial reasoning capabilities. Inspired by Daniel
Kahneman’s dual-process theory from his seminal work,
“Thinking, Fast and Slow” [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], this architecture harmonizes
the rapid, intuitive operations of neural networks (System
1) with the methodical, thoughtful processes of symbolic
reasoning (System 2). This configuration allows for both
fast pattern recognition and thoughtful, detailed analysis
within the same AI system [
          <xref ref-type="bibr" rid="ref20 ref22">20, 22</xref>
          ]. A key feature of this
architecture is the dynamic interaction between the two
systems, where one subsystem can activate the other,
enabling a bidirectional flow of information. Insights from
symbolic reasoning (System 2) can refine and enhance the
pattern recognition capabilities of the neural network
(System 1), while System 1 can provide data-driven insights that
improve symbolic reasoning.
        </p>
        <p>
          In our view, a concrete example of this principle is offered
by research on abductive learning, a topic notably absent
from the foundational works of ten Teije and van Harmelen
[
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] as well as Kautz [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. Abductive learning provides a
powerful framework that integrates Machine Learning (ML)
with logical reasoning [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. It leverages abductive
reasoning, a cognitive process central to hypothesis generation,
creative problem-solving, and the generation of plausible
explanations. As such, it constitutes a crucial component
within the broader Neuro[Symbolic] pattern [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ].
        </p>
        <p>
          The Neuro[Symbolic] architecture offers numerous
advantages. It provides increased explainability due to the
symbolic reasoning component. The architecture enables
reasoning capabilities that go beyond the inductive reasoning
of DL and the deductive reasoning of symbolic logic,
supporting more abstract and flexible forms of reasoning.
Furthermore, the system benefits from improved generalization
through logical constraints, which can help to reduce
overfitting. The bidirectional flow of information between the
neural and symbolic components enables the system to
validate predictions, correct errors, and detect instances of new
and unseen classes, thereby improving learning and
adaptability over time. Most importantly, support for abductive
reasoning equips the system with the ability to generate
hypotheses and explanations based on incomplete information,
making it effective in tasks involving uncertainty [
          <xref ref-type="bibr" rid="ref33 ref34">33, 34</xref>
          ].
        </p>
        <p>
          These benefits come with certain challenges. Integrating
the two systems is complex, particularly when resolving
conflicts between the neural and symbolic parts to ensure
consistency. The symbolic reasoning component can also
limit the system’s transferability to other domains,
particularly to those requiring learning or tasks outside the defined
symbolic knowledge base. Additionally, the architecture
does not fully support end-to-end training using
backpropagation, as the symbolic reasoning component introduces
non-differentiable operations that complicate optimization
[
          <xref ref-type="bibr" rid="ref31">31</xref>
          ].
        </p>
        <p>In contrast to the previously discussed N-SAI
architectures, the Neuro[Symbolic] architecture enables mutual
improvement between the neural and symbolic components.
This integration provides reasoning capabilities beyond
traditional inductive and deductive approaches. Such advanced
reasoning capabilities are particularly well-suited to the
domain of FER, where the mapping between facial expressions
and emotional states is often ambiguous, context-dependent,
and highly variable. Given these strengths, we consider the
Neuro[Symbolic] pattern to be the most promising
architectural approach for addressing the challenges inherent in
FER.</p>
        <p>A particularly compelling instantiation of this pattern is
abductive learning, which we examine in greater detail in
the following section. To provide a structured comparison,
Table 1 summarizes the key advantages and disadvantages
of the six N-SAI patterns, highlighting their respective
trade-offs and underscoring the strengths of the Neuro[Symbolic]
approach.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. The Role of Abductive Learning in FER</title>
      <p>
        Building upon the limitations of existing FER approaches
and the capabilities of various N-SAI design patterns
outlined in Section 3, we propose a novel conceptual integration
of abductive learning into FER. Unlike existing models that
rely mostly on inductive learning via deep neural networks,
our approach extends them via abductive reasoning, a
cognitive process central to hypothesis generation and creative
problem solving. This integration draws parallels to
human cognitive reasoning processes, specifically, deduction,
induction, and abduction. Deductive reasoning involves
deriving specific conclusions from general principles;
inductive reasoning entails identifying general patterns based
on specific observations; and abductive reasoning involves
hypothesizing plausible explanations based on incomplete
or ambiguous information [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ].
      </p>
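      <p>The following toy Python sketch contrasts the three inference modes on a single, deliberately simplified AU-emotion rule set (a stand-in for the FACS-based mappings discussed below):</p>
      <preformat>
# Toy contrast of deduction, induction, and abduction.

RULES = {"happiness": {6, 12}, "sadness": {1, 4, 15}}

def deduce(emotion: str) -> set:
    """Deduction: from the general rule, derive the specific AUs."""
    return RULES[emotion]

def induce(observations) -> dict:
    """Induction: generalize rules from labeled (emotion, AUs) pairs."""
    rules = {}
    for emotion, aus in observations:
        rules.setdefault(emotion, set()).update(aus)
    return rules

def abduce(observed_aus: set) -> str:
    """Abduction: hypothesize the emotion that best explains the AUs."""
    return max(RULES, key=lambda e: len(RULES[e].intersection(observed_aus)))

print(deduce("happiness"))       # {12, 6}
print(abduce({1, 4, 15}))        # sadness
      </preformat>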
      <p>
        AI’s historical evolution began with the symbolic era,
noted for its deductive reasoning capabilities. This phase
was followed by the sub-symbolic era, which highlighted
inductive learning through sophisticated DL models [
        <xref ref-type="bibr" rid="ref13 ref35">13, 35</xref>
        ].
Building on this evolution, we propose that a promising
next step in AI development may lie in the incorporation
of abductive learning, which enables AI to formulate
hypotheses and insights beyond the limitations of existing
symbolic and sub-symbolic systems. This transition is
essential for expanding AI’s problem-solving capabilities beyond
traditional deductive and inductive reasoning frameworks.
Incorporating abductive learning aligns seamlessly with
the Neuro[Symbolic] design pattern described in subsection
3.6. Additionally, we advocate for the synthesis of symbolic
and sub-symbolic AI components in such a way that they
mutually enhance the capabilities of each other.
      </p>
      <p>Depending on the specific operational task or field, the
choice of the most suitable N-SAI design pattern can vary.
Given the unique challenges of FER, particularly the
ambiguity and variability in the relationship between facial
expressions and emotional states, we find Neuro[Symbolic]
architectures to be particularly relevant. Unlike other designs,
Neuro[Symbolic] integrates symbolic and sub-symbolic
systems as mutually interacting routines. This capability allows
the FER system to generate new hypotheses and insights
beyond the available data and symbolic rules, effectively
“thinking” outside the conventional datasets. This approach
is essential because not all facial expression-emotion
relationships are universally applicable or adequately
represented by data alone, making Neuro[Symbolic] uniquely
equipped to navigate these complexities.</p>
      <fig id="fig-1">
        <caption>
          <p>Figure 1: The abductive learning loop for FER. A DL model predicts a pseudo-emotion (P_t) from input images with AU annotations; a knowledge graph of AU-emotion relationships deduces the AUs implied by the prediction, which are validated against the annotated AUs. A mismatch triggers revision of the pseudo-emotion (P_t+1, ..., P_t+n) or adjustment of the knowledge graph until the prediction matches the ground-truth emotion.</p>
        </caption>
      </fig>
      <p>
        To this end, the research on abductive learning by Dai et al.
[
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] seems highly relevant, demonstrating how ML models
can detect basic logical facts and use symbolic reasoning to
correct errors and refine predictions. This method is
exemplified in their work on decoding Mayan hieroglyphs, which
involves recognizing numbers visually from the glyphs and
using knowledge of mathematics and calendars to interpret
them symbolically. Follow-up research has shown how
symbolic knowledge can be refined if it is incomplete (i.e. new
concepts can be detected) or inaccurate [
        <xref ref-type="bibr" rid="ref33 ref34">33, 34</xref>
        ].
      </p>
      <p>For FER, this approach is particularly advantageous
because it aligns DL model predictions with theoretical
knowledge about the relationship between emotional states and
facial behavior. The key feature of this approach is its
reliance on high-quality AU annotations, enabling the DL
model and symbolic reasoning system to mutually refine
and enhance each other’s outputs. Practically, this involves
a (pre-trained) DL model (perception model) predicting an
emotional state (a pseudo-label) from a facial input
image. Additionally, the facial images include AU annotations,
ideally from certified FACS coders. A knowledge graph
(reasoning), built on expert knowledge of AU-emotion
relationships, is used to deduce the AUs associated with the
predicted emotion.</p>
      <p>The symbolic reasoning part then validates whether the
data-driven prediction, complemented by the
corresponding AUs deduced from the knowledge graph, matches the
original AU annotations from the facial images. If both the
perception model and the reasoning part concur, the output
of the DL model (the predicted label) is likely correct.
However, if they diverge, the inconsistency may stem from
either the DL model’s prediction errors or an incomplete
or inaccurate knowledge graph. This discrepancy can be
resolved either by retraining the DL model with a revised
prediction or by updating the knowledge graph with refined
AU-emotion mappings and retesting it against the model.</p>
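      <p>A heavily simplified Python sketch of this validation-and-revision step is given below; a dictionary stands in for the knowledge graph, and a single abductive step stands in for the retraining and graph-update mechanisms just described:</p>
      <preformat>
# Hedged sketch of the consistency check described above.

KNOWLEDGE_GRAPH = {"happiness": {6, 12}, "sadness": {1, 4, 15}}

def deduce_aus(emotion: str) -> set:
    """Deduce the AUs the knowledge graph associates with an emotion."""
    return KNOWLEDGE_GRAPH[emotion]

def abductive_step(predicted_emotion: str, annotated_aus: set) -> str:
    """Validate a DL prediction against FACS annotations; on mismatch,
    abduce the emotion whose AUs best explain the annotations."""
    if deduce_aus(predicted_emotion) == annotated_aus:
        return predicted_emotion        # perception and reasoning concur
    # Discrepancy: revise the pseudo-emotion. (Alternatively, the graph
    # itself could be updated with refined AU-emotion mappings.)
    return max(KNOWLEDGE_GRAPH,
               key=lambda e: len(KNOWLEDGE_GRAPH[e].intersection(annotated_aus)))
      </preformat>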
      <p>
        In the example illustrated in Figure 1, the process
begins with a perception model (e.g. a pre-trained DL model)
analyzing an image of a sad-looking child. Despite the
visual cues, the model initially incorrectly predicts a
pseudo-emotion label “happiness”. The image also contains AU
annotations, manually provided by FACS experts, which
in this case likely include AU 1 (Inner Brow Raiser), AU 4
(Brow Lowerer), and AU 15 (Lip Corner Depressor),
typically indicative of “sadness”. The predicted pseudo-emotion
is then passed to the reasoning module, which infers the
corresponding AUs for “happiness” using a knowledge graph.
In this context, the graph suggests AU 6 (Cheek Raiser) and
AU 12 (Lip Corner Puller), based on established mappings
from Ekman’s FACS Investigator Guide [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ].
      </p>
      <p>These inferred AUs are compared against the
expert-annotated AUs present in the image. The mismatch
between the deduced and observed AUs initiates a revision
process: the system reconsiders the initial pseudo-emotion
and iteratively refines its prediction, potentially through
classifier retraining with the revised prediction. Ultimately,
the process converges on “Sadness” as the final emotion
label, whose knowledge-graph derived AUs align with those
annotated in the image.</p>
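      <p>Run through the sketch above, the Figure 1 example plays out as follows:</p>
      <preformat>
# Figure 1 example, using abductive_step from the sketch above: the
# model predicts "happiness", but the annotated AUs {1, 4, 15}
# contradict the deduced AUs {6, 12}, so the label is revised.
print(abductive_step("happiness", {1, 4, 15}))   # sadness
print(abductive_step("happiness", {6, 12}))      # happiness (consistent)
      </preformat>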
      <p>
        This adaptive feedback loop promotes coherence across
three layers: the model’s perceptual prediction, symbolic
reasoning based on AU-emotion relationships, and
empirical AU observations. The system will iteratively refine its
predictions and symbolic knowledge, leading to consistency
between the DL and symbolic reasoning outputs. This process
effectively manages ambiguity and variability in facial
expressions. It enables the generation of new hypotheses such
as novel labels or AU-emotion combinations. Additionally,
human expert feedback can be incorporated into this
mechanism [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ].
      </p>
      <p>
        A critical component of abductive learning in FER is the
consistency optimization between the perception model
and the symbolic reasoning module. This involves
dynamically adjusting pseudo-labels predicted by an undertrained
DL model, often inaccurate in the early training phases,
so that they align with domain knowledge encoded in a
symbolic system. While Dai et al. [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] rely on
derivative-free optimization (RACOS) [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ], alternative strategies such
as evolutionary algorithms [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] or LLM-based optimizers
[
        <xref ref-type="bibr" rid="ref39">39</xref>
        ] may offer greater flexibility, particularly for complex
or semantically ambiguous correction tasks. This iterative
process refines both the symbolic knowledge graphs and
the neural model predictions, promoting coherence across
perception and reasoning. Over successive cycles, relational
features extracted from consistent hypotheses serve as
feedback, enabling the DL model to generalize better, distinguish
ambiguous features, and ultimately converge toward
symbolically grounded, high-fidelity outputs.
      </p>
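      <p>As a toy surrogate for this optimization step, the sketch below revises pseudo-labels by a per-sample search instead of RACOS; it illustrates only the consistency objective, not the derivative-free machinery of Dai et al. [31]:</p>
      <preformat>
# Toy consistency optimization: revise pseudo-labels to maximize
# agreement between deduced and annotated AUs (a simplified stand-in
# for derivative-free optimization such as RACOS).

KNOWLEDGE_GRAPH = {"happiness": {6, 12}, "sadness": {1, 4, 15}}

def consistency(labels, annotations) -> float:
    """Fraction of samples whose deduced AUs match the annotations."""
    hits = sum(KNOWLEDGE_GRAPH[l] == a for l, a in zip(labels, annotations))
    return hits / len(labels)

def revise(annotations) -> list:
    """Per sample, pick the label whose deduced AUs fit the annotations best."""
    return [max(KNOWLEDGE_GRAPH,
                key=lambda e: len(KNOWLEDGE_GRAPH[e].intersection(a)))
            for a in annotations]

pseudo = ["happiness", "happiness"]       # early, partly wrong predictions
aus = [{6, 12}, {1, 4, 15}]               # expert FACS annotations
print(consistency(pseudo, aus))           # 0.5
print(consistency(revise(aus), aus))      # 1.0 after revision
      </preformat>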
      <p>We anticipate that abductive learning will offer significant
advantages for FER. By integrating both sub-symbolic data
and expert knowledge, this approach is expected to handle
the inherent ambiguity and variability in facial expressions
more effectively than conventional methods. It also offers
enhanced adaptability through the continuous refinement
of DL predictions and symbolic knowledge, enabling the
system to apply learned patterns to novel, unseen
scenarios, such as new AU-emotion combinations. Its capacity to
reason under uncertainty makes abductive learning
well-suited for robust, accurate emotion recognition, even with
sparse or noisy data. Additionally, this reasoning process is
expected to yield deeper insights into AU-emotion
relationships by refining the underlying symbolic knowledge base
and enabling a more systematic evaluation of the quality
and consistency of research data, both of which are crucial
for advancing emotion psychology.</p>
      <p>This approach allows FER systems to not only detect
patterns in facial expressions but also to evaluate these patterns
against symbolic expert knowledge, such as the theory of
seven universal emotional states. Furthermore, abductive
learning makes it possible to refine the symbolic
knowledge base, ensuring that FER systems adapt to new findings
and remain effective in different contexts. This adaptability
is particularly valuable in environments where expressions
may vary significantly, helping to mitigate biases and
enhance the reliability of emotion recognition technologies. In
sum, our proposed abductive learning framework within the
Neuro[Symbolic] pattern offers a novel, cognitively inspired,
and technically robust path forward for FER.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this work, we have examined the fundamental challenges
faced by current FER systems and systematically evaluated
six N-SAI architectures as potential solutions. Among these,
we propose the Neuro[Symbolic] design pattern as the most
suitable framework for addressing the ambiguity, variability,
and contextual dependence inherent in facial emotional
expression.</p>
      <p>As a key contribution, we identify abductive learning
as a novel and underexplored instantiation within the
Neuro[Symbolic] paradigm, uniquely capable of integrating
symbolic (deductive) and sub-symbolic (inductive)
reasoning. Inspired by human cognitive processes, particularly
the ability to generate plausible explanations from
incomplete observations, abductive learning enables FER systems
to not only align data-driven predictions with expert
emotional theories, but also to iteratively resolve inconsistencies
between neural outputs and symbolic knowledge. This
integration promotes a more adaptable, interpretable, and
conceptually grounded approach to emotion recognition.</p>
      <p>We believe this framework opens promising directions
for FER research by enabling systems to hypothesize, adapt
to uncertainty, and support a deeper understanding of
emotional behavior across diverse real-world contexts. Future
work should focus on empirically validating the proposed
architecture, exploring advanced consistency optimization
techniques, and incorporating human-in-the-loop feedback
to enhance interpretability and ensure the ethical and
trustworthy deployment of affective technologies.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used
ChatGPT-based models to assist with grammar and spelling
checks. After using these models, the authors reviewed and
edited the content as needed and take full responsibility for
the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] J. Deng, G. Pang, Z. Zhang, Z. Pang, H. Yang, G. Yang, cGAN Based Facial Expression Recognition for Human-Robot Interaction, IEEE Access 7 (2019) 9848-9859. doi:10.1109/ACCESS.2019.2891668.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] S. Kamal, F. Sayeed, M. Rafeeq, Facial emotion recognition for Human-Computer Interactions using hybrid feature extraction technique, in: 2016 International Conference on Data Mining and Advanced Computing (SAPIENCE), 2016, pp. 180-184. doi:10.1109/SAPIENCE.2016.7684129.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] S. Jia, S. Wang, C. Hu, P. J. Webster, X. Li, Detection of Genuine and Posed Facial Expressions of Emotion: Databases and Methods, Frontiers in Psychology 11 (2021). doi:10.3389/fpsyg.2020.580287.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] P. Werner, A. Al-Hamadi, R. Niese, S. Walter, S. Gruss, H. Traue, Automatic Pain Recognition from Video and Biomedical Signals, 2014. doi:10.1109/ICPR.2014.784.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] S. Li, W. Deng, Deep Facial Expression Recognition: A Survey, IEEE Transactions on Affective Computing 13 (2022) 1195-1215. doi:10.1109/TAFFC.2020.2981446.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] L. F. Barrett, R. Adolphs, S. Marsella, A. M. Martinez, S. D. Pollak, Emotional Expressions Reconsidered: Challenges to Inferring Emotion From Human Facial Movements, Psychological Science in the Public Interest (2019). doi:10.1177/1529100619832930.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] M. Pantic, M. Valstar, R. Rademaker, L. Maat, Web-Based Database for Facial Expression Analysis, in: 2005 IEEE International Conference on Multimedia and Expo, IEEE, Amsterdam, The Netherlands, 2005, pp. 317-321. doi:10.1109/ICME.2005.1521424.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] J. Gebele, P. Brune, S. Faußer, Face Value: On the Impact of Annotation (In-)Consistencies and Label Ambiguity in Facial Data on Emotion Recognition, in: 2022 26th International Conference on Pattern Recognition (ICPR), 2022, pp. 2597-2604. doi:10.1109/ICPR56361.2022.9956230.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] P. Ekman, Basic emotions, in: Handbook of cognition and emotion, Wiley, New York, 1999, pp. 301-320.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] L. F. Barrett, Discrete Emotions or Dimensions? The Role of Valence Focus and Arousal Focus, Cognition and Emotion 12 (1998) 579-599. doi:10.1080/026999398379574.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] K. Yang, C. Wang, Z. Sarsenbayeva, B. Tag, T. Dingler, G. Wadley, J. Goncalves, Benchmarking commercial emotion detection systems using realistic distortions of facial image datasets, The Visual Computer 37 (2021) 1447-1466. doi:10.1007/s00371-020-01881-x.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] N. Samadiani, G. Huang, B. Cai, W. Luo, C.-H. Chi, Y. Xiang, J. He, A Review on Automatic Facial Expression Recognition Systems Assisted by Multimodal Sensor Data, Sensors 19 (2019) 1863. doi:10.3390/s19081863.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] P. Hitzler, M. K. Sarker, A. Eberhart (Eds.), Compendium of Neurosymbolic Artificial Intelligence, volume 369 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2023. doi:10.3233/FAIA369.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] D. Seuss, T. Hassan, A. Dieckmann, M. Unfried, K. R. Scherer, M. Mortillaro, J. Garbas, Automatic Estimation of Action Unit Intensities and Inference of Emotional Appraisals, IEEE Transactions on Affective Computing 14 (2023) 1188-1200. doi:10.1109/TAFFC.2021.3077590.</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] P. Ekman, W. V. Friesen, J. C. Hager, Facial Action Coding System, A Human Face, Salt Lake City, Utah, 2002.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gunes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schuller</surname>
          </string-name>
          ,
          <article-title>Categorical and dimensional affect analysis in continuous input: Current trends and future directions</article-title>,
          <source>Image and Vision Computing</source>
          <volume>31</volume>
          (<year>2013</year>)
          <fpage>120</fpage>-<lpage>136</lpage>. doi:10.1016/j.imavis.2012.06.016.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <article-title>A circumplex model of affect</article-title>,
          <source>Journal of Personality and Social Psychology</source>
          <volume>39</volume>
          (<year>1980</year>)
          <fpage>1161</fpage>-<lpage>1178</lpage>. doi:10.1037/h0077714.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name><given-names>M. M.</given-names> <surname>Adnan</surname></string-name>,
          <string-name><given-names>M. S. M.</given-names> <surname>Rahim</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Rehman</surname></string-name>,
          <string-name><given-names>Z.</given-names> <surname>Mehmood</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Saba</surname></string-name>,
          <string-name><given-names>R. A.</given-names> <surname>Naqvi</surname></string-name>,
          <article-title>Automatic Image Annotation Based on Deep Learning Models: A Systematic Review and Future Challenges</article-title>,
          <source>IEEE Access</source>
          <volume>9</volume>
          (<year>2021</year>)
          <fpage>50253</fpage>-<lpage>50264</lpage>. doi:10.1109/ACCESS.2021.3068897.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name><given-names>J.</given-names> <surname>Gebele</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Brune</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Schwab</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>von Mammen</surname></string-name>,
          <source>Assessing Sequential Databases for Spontaneous and Posed Facial Expression Recognition</source>,
          <year>2025</year>. arXiv:10125/108902.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name><given-names>A.</given-names> <surname>ten Teije</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>van Harmelen</surname></string-name>,
          <article-title>Architectural patterns for neuro-symbolic AI</article-title>,
          in: <string-name><given-names>P.</given-names> <surname>Hitzler</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Eberhart</surname></string-name>,
          <string-name><given-names>M. K.</given-names> <surname>Sarker</surname></string-name>
          (Eds.),
          <source>Compendium of Neurosymbolic Artificial Intelligence, Frontiers in Artificial Intelligence and Applications</source>,
          IOS Press, <year>2023</year>, pp.
          <fpage>64</fpage>-<lpage>76</lpage>. doi:10.3233/FAIA230135.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name><given-names>I. S. N.</given-names> <surname>Berkeley</surname></string-name>,
          <article-title>What the &lt;0.70, 1.17, 0.99, 1.07&gt; is a Symbol?</article-title>,
          <source>Minds and Machines</source>
          <volume>18</volume>
          (<year>2008</year>)
          <fpage>93</fpage>-<lpage>105</lpage>. doi:10.1007/s11023-007-9086-y.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>H.</given-names>
            <surname>Kautz</surname>
          </string-name>
          ,
          <article-title>The Third AI Summer: AAAI Robert S. Engelmore Memorial Lecture</article-title>,
          <source>AI Magazine</source>
          <volume>43</volume>
          (<year>2022</year>)
          <fpage>105</fpage>-<lpage>125</lpage>. doi:10.1002/aaai.12036.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <article-title>GloVe: Global Vectors for Word Representation</article-title>,
          in: <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>,
          Association for Computational Linguistics, Doha, Qatar,
          <year>2014</year>, pp.
          <fpage>1532</fpage>-<lpage>1543</lpage>. doi:10.3115/v1/D14-1162.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>D.</given-names>
            <surname>Silver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Maddison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Guez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sifre</surname>
          </string-name>
          , G. van den Driessche, J. Schrittwieser,
          <string-name>
            <given-names>I.</given-names>
            <surname>Antonoglou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Panneershelvam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanctot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dieleman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Grewe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Nham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kalchbrenner</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lillicrap</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kavukcuoglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Graepel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hassabis</surname>
          </string-name>
          ,
          <article-title>Mastering the game of Go with deep neural networks and tree search</article-title>
          ,
          <source>Nature</source>
          <volume>529</volume>
          (
          <year>2016</year>
          )
          <fpage>484</fpage>
          -
          <lpage>489</lpage>
          . doi:10.1038/nature16961.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>G.</given-names>
            <surname>Chaslot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bakkes</surname>
          </string-name>
          ,
          <string-name><given-names>I.</given-names> <surname>Szita</surname></string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Spronck</surname>
          </string-name>
          ,
          <article-title>Monte-Carlo Tree Search: A New Framework for Game AI</article-title>
          ,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment</source>
          <volume>4</volume>
          (
          <year>2008</year>
          )
          <fpage>216</fpage>
          -
          <lpage>217</lpage>
          . doi:10.1609/aiide.v4i1.18700.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kohli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          ,
          <string-name><given-names>J.</given-names> <surname>Wu</surname></string-name>,
          <source>The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision</source>,
          <year>2019</year>. doi:10.48550/arXiv.1904.12584. arXiv:1904.12584.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Charton</surname>
          </string-name>
          ,
          <source>Deep Learning for Symbolic Mathematics</source>
          ,
          <year>2019</year>
          . doi:10.48550/arXiv.1912.01412. arXiv:1912.01412.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>I.</given-names>
            <surname>Donadello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Serafini</surname>
          </string-name>
          ,
          <string-name><given-names>A.</given-names> <surname>d'Avila Garcez</surname></string-name>,
          <source>Logic Tensor Networks for Semantic Image Interpretation</source>,
          <year>2017</year>. doi:10.48550/arXiv.1705.08968. arXiv:1705.08968.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>P.</given-names>
            <surname>Smolensky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>He</surname>
          </string-name>
          , W.-t. Yih,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <source>Basic Reasoning with Tensor Product Representations</source>
          ,
          <year>2016</year>
          . doi:10.48550/arXiv.1601.02745. arXiv:1601.02745.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kahneman</surname>
          </string-name>
          , <source>Thinking, Fast and Slow</source>, Macmillan,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name><given-names>W.-Z.</given-names> <surname>Dai</surname></string-name>,
          <string-name><given-names>Q.</given-names> <surname>Xu</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Yu</surname></string-name>,
          <string-name><given-names>Z.-H.</given-names> <surname>Zhou</surname></string-name>,
          <article-title>Bridging Machine Learning and Logical Reasoning by Abductive Learning</article-title>,
          in: <source>Advances in Neural Information Processing Systems</source>,
          volume <volume>32</volume>,
          Curran Associates, Inc.,
          <year>2019</year>.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kapitan</surname>
          </string-name>
          ,
          <article-title>Peirce and the autonomy of abductive reasoning</article-title>
          ,
          <source>Erkenntnis</source>
          <volume>37</volume>
          (
          <year>1992</year>
          )
          <fpage>1</fpage>
          -
          <lpage>26</lpage>
          . doi:10.1007/BF00220630.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>X.-W.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-J.</given-names>
            <surname>Shao</surname>
          </string-name>
          , W.-W. Tu,
          <string-name>
            <given-names>Y.-F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.-H.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Safe Abductive Learning in the Presence of Inaccurate Rules</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>38</volume>
          ,
          <year>2024</year>
          , pp.
          <fpage>16361</fpage>
          -
          <lpage>16369</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>Y.-X.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.-H.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Enabling knowledge refinement upon new concepts in abductive learning</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>37</volume>
          ,
          <year>2023</year>
          , pp.
          <fpage>7928</fpage>
          -
          <lpage>7935</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hitzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bianchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ebrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Sarker</surname>
          </string-name>
          ,
          <article-title>Neural-symbolic integration and the Semantic Web</article-title>,
          <source>Semantic Web</source>
          <volume>11</volume>
          (<year>2020</year>)
          <fpage>3</fpage>-<lpage>11</lpage>. doi:10.3233/SW-190368.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ekman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. V.</given-names>
            <surname>Friesen</surname>
          </string-name>
          ,
          <source>Facial Action Coding System: Investigator's Guide</source>,
          Consulting Psychologists Press,
          <year>1978</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-Q.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <article-title>Derivative-Free Optimization via Classification</article-title>,
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          <volume>30</volume>
          (<year>2016</year>). doi:10.1609/aaai.v30i1.10289.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>P.</given-names>
            <surname>Rakshit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Konar</surname>
          </string-name>
          ,
          <article-title>Foundation in Evolutionary Optimization</article-title>,
          in: <string-name><given-names>P.</given-names> <surname>Rakshit</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Konar</surname></string-name>
          (Eds.),
          <source>Principles in Noisy Optimization: Applied to Multi-agent Coordination</source>,
          Springer, Singapore,
          <year>2018</year>, pp.
          <fpage>1</fpage>-<lpage>56</lpage>. doi:10.1007/978-981-10-8642-7_1.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>C.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q. V.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name><given-names>D.</given-names> <surname>Zhou</surname></string-name>,
          <string-name><given-names>X.</given-names> <surname>Chen</surname></string-name>,
          <source>Large Language Models as Optimizers</source>,
          <year>2023</year>. doi:10.48550/arXiv.2309.03409. arXiv:2309.03409.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>