<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Active Inference for AI</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Raffa</string-name>
          <email>maria.raffa@studenti.iulm.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IULM University</institution>
          ,
          <addr-line>via Carlo Bo 1, Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The aim of this short paper is to trace the connection between the cognitive frameworks of predictive processing (PP) and active inference and their existing implementations in AI tools, in order to investigate whether these models of cognition can be used to design explainable and sustainable AI architectures. PP is related to formal models of mind, i.e., Karl Friston's free energy principle (FEP) and active inference (AIF), which share their mathematical foundations with generative AI. It is therefore worth exploring this relationship to enrich the debate about whether current AI architectures can benefit from frameworks established in neuroscience and philosophy of mind.</p>
      </abstract>
      <kwd-group>
        <kwd>AI</kwd>
        <kwd>predictive processing</kwd>
        <kwd>active inference</kwd>
        <kwd>Bayesian networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Predictive Processing, Free Energy Principle, Active Inference</title>
      <p>
        PP, or predictive coding, refers to the hierarchical model theorized by neuroscientist Karl Friston. PP
is regarded as a unified explanation of perception, action and cognition: it sees the brain as a
predictive machine that tries to anticipate its next states on the basis of information gained through previous
interactions with the environment [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This means that the brain constantly tries to minimize
prediction errors and to avoid states with a high degree of surprise. This is also the core idea
of Bayesian inference, which reappears in the context of FEP. FEP, also introduced by Friston,
has been used in different contexts and with different objectives, and has been regarded as a more or
less mysterious unification principle that can be applied to almost any field of knowledge: from
thermodynamics, to biology, to the study of the mind. Friston emphasizes that FEP is not a framework
or a theory, but should be understood as a tool or, in other words, a methodology. In its original
formulation, FEP is a mathematical statement that agents exist because they can persist,
maintaining their equilibrium by minimizing free energy. This minimization is achieved through
AIF, which again means changes in state and the suppression of prediction errors based on information
gained from the history of previous interactions with the environment [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. AIF can be intuitively
described as «feeling our way in darkness: we anticipate what we might touch next and then try to
confirm those expectations» [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. More generally, AIF provides a comprehensive framework to describe,
explain, simulate and understand the mechanisms underlying decision-making, perception, and action
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ][
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
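      <p>
        The notions of surprise and prediction-error minimization above can be made concrete with a minimal numerical sketch (an illustrative toy with invented numbers, not drawn from the cited formal treatments): surprise is the negative log-probability of an observation under the agent's generative model, and a Bayesian update of beliefs is what reduces it for recurring observations.
      </p>
      <preformat>
```python
import math

# Toy generative model: prior belief over two hidden causes (invented numbers).
prior = {"light": 0.5, "dark": 0.5}
# Likelihood of observing a shadow under each hidden cause.
likelihood = {"light": 0.2, "dark": 0.9}

# Surprise (self-information) of the observation: -log p(o),
# with p(o) = sum over causes s of p(o|s) p(s).
p_obs = sum(likelihood[s] * prior[s] for s in prior)
surprise = -math.log(p_obs)

# Bayesian update: posterior p(s|o) proportional to p(o|s) p(s).
posterior = {s: likelihood[s] * prior[s] / p_obs for s in prior}

print(round(surprise, 3))           # surprise of the observation
print(round(posterior["dark"], 3))  # updated belief in the "dark" cause
```
      </preformat>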
      <p>
        AIF modeling is very attractive for several reasons. For example, it has been used in the context of
scientific research on introspection, self-modeling and self-access, which has led to several theories of
consciousness [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]: introspection, i.e., the ability to access and evaluate one’s own mental states, thoughts
and experiences, plays a critical role in self-awareness, learning and decision-making, and is a pillar of
human consciousness [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Mahault Albarracin and colleagues state that self-modeling and self-access
can be defined as «interconnected processes that contribute to the development of self-awareness and
to the capacity for introspection» [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Moreover, «self-modeling involves the creation of internal
representations of oneself, while self-access refers to the ability to access and engage with these
representations for self-improvement and learning […] These processes, in conjunction with
introspection, form a complex dynamic system that enriches our understanding of consciousness and
the self – and indeed, may arguably form the causal basis of our capacity to understand ourselves and
others» [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The starting idea for a model of consciousness based on AIF is that for a system to be able
to report and evaluate its own conclusions, it must be able to implement some form of self-access, i.e.,
some parts of the system can use the output of other parts as their own input, for further processing.
This is related to the idea that some cognitive processes are “transparent”. Like a clean window, they
allow us to access other things without being perceptible themselves. Other cognitive processes are
“opaque”, meaning that they can be evaluated per se, such as introspective self-awareness (e.g., being
aware that you are looking at a tree rather than seeing a tree). The idea, then, is that introspective
processes make other cognitive processes accessible to the system as such, thus rendering them opaque.
AIF has been used by Albarracin and colleagues to develop a three-level generative model of
self-access and introspective abilities, in terms of processes that govern transparency and opacity at the
phenomenal level of description and attentional selection at the psychological level, respectively [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ][
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>AIF, FEP and PP are not only used for modelling consciousness in the science of mind, but are also
of interest in many other fields, such as thermodynamics and biology, as they provide a very general and
comprehensive framework for optimizing resources: FEP and AIF can be applied to almost all types of
organisms and organizations that are able to sustain themselves with the least waste of resources,
through the strategy of minimizing prediction errors. As mentioned earlier, this is the same concept as
Bayesian inference, and Bayesian structures are also used in the field of generative AI.</p>
      <p>On these grounds, the next section addresses the connection between AIF and AI models based on
Bayesian networks.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Predictive Processing, Active Inference, AI</title>
      <p>
        In the field of AI, the AIF framework, which is essentially Bayesian inference, is used in a wide variety of
ways. One of the most popular applications concerns generative models, which are used by neural
networks to generate images and text. The idea behind generative models is the same as in predictive
perception based on Bayesian inference, namely the minimization of prediction errors. Indeed, these algorithms
can generate sensory signals that match predicted causes, as they learn from small data sets and
generalize to new situations [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Bayesian networks are a type of generative model in which all
processes can be easily tracked and understood: the propagation of events evolves through
multiple ambiguous points, where the event diverges probabilistically between paths, and at any given
point in the network, the probability of a node being visited depends on the joint probability of the
preceding nodes. Therefore, Bayesian
networks resemble explicit reasoning, although they are probabilistic. For this reason, they are more
reliable in terms of explainability than other models, such as neural models based on feedforward
architectures trained with backpropagation: the latter are very powerful in producing fast results,
but they often lead to black-box situations that are difficult to explain, since the inputs are known but
the path to the outputs is less clear [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
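      <p>
        As a minimal illustration of this traceability (a hypothetical two-node network with invented probabilities, not drawn from the cited works), every intermediate quantity in a Bayesian network can be written down and inspected, which is precisely what black-box feedforward models lack:
      </p>
      <preformat>
```python
# Hypothetical two-node Bayesian network: Rain -> WetGrass.
# Every intermediate quantity is explicit and inspectable.

p_rain = 0.3                       # P(Rain = true)
p_wet_given = {True: 0.9,          # P(Wet | Rain = true)
               False: 0.1}         # P(Wet | Rain = false)

# Marginal probability of wet grass: sum over the parent's states.
p_wet = sum(p_wet_given[r] * (p_rain if r else 1 - p_rain)
            for r in (True, False))

# Diagnostic reasoning stays traceable too: P(Rain | Wet) by Bayes' rule.
p_rain_given_wet = p_wet_given[True] * p_rain / p_wet

print(round(p_wet, 3))             # 0.34
print(round(p_rain_given_wet, 3))  # 0.794
```
      </preformat>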
      <p>
        As mentioned earlier, generative models based on Bayesian networks are used in AI for generating
images and text. The most popular image-generation AIs (e.g., DALL-E and Midjourney) are deep generative
networks, i.e., a combination of a generative model and a deep neural network: a
basic neural network that has been augmented in both size and training data, both of which are required
for good performance [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. This is combined with a generative language model, i.e., a Generative Pretrained Transformer
(GPT) network, whose task is to process the user's text input in order to generate the images.
A transformer network is a deep learning model that applies the mechanism of self-attention,
differentially weighting the importance of each part of the input data [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. These AI algorithms have
access to billions of images on the Internet tagged with words. GPT can read this image data, extract
some basic features (e.g., the style of a particular artist, or facial shapes), and thereby
assess how coherent the generated images are with the inputs submitted by the user. This means that,
while producing new images, the generative networks proceed gradually, adding new features
built on the features that they have already seen.
      </p>
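      <p>
        The self-attention mechanism mentioned above can be sketched in a few lines of plain Python (a scaled dot-product toy with invented token vectors and no learned projections, far simpler than a production transformer): each position's output is a weighted combination of all positions, with weights given by similarity.
      </p>
      <preformat>
```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Toy scaled dot-product self-attention: queries, keys and
    values are the input vectors themselves (no learned weights)."""
    d = len(x[0])
    out = []
    for q in x:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in x]
        weights = softmax(scores)  # differential weighting of each part
        # weighted combination of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, x))
                    for j in range(d)])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(self_attention(tokens))
```
      </preformat>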
      <p>
        As for a more direct connection between PP, AIF and AI, these cognitive systems are also
interesting for robotics. In particular, for robotics applications where the dynamic of the robot or the
task are uncertain: for estimation, adaptive control, fault-tolerant control, prospective planning and
complex cognition skills (human-robot collaboration, discrimination between self and others) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
Pablo Lanillos and Gordon Cheng [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] have developed a computational model, based on PP, that allows a robot to
infer its own body configuration. More precisely, PP is used for a computational
perception model that enables any multisensory humanoid robot to learn, infer and update its body
configuration. This model supports generic multisensory integration, combining different sources of
information (tactile, visual and proprioceptive): the robot estimates its body configuration and adjusts
it on the basis of sensory information alone. Before Lanillos and Cheng, Jun Tani [16] and Leo Pio-Lopez
and colleagues [17] had already studied similar AIF-based models for real robots. Similar approaches have also
been used for industrial manipulators [18], and for implementing active vision in robots in simulation
environments [19]. Toon Van de Maele and his team showed in a simulation environment that a robot
agent, moving without knowledge of the workspace, chooses the next visual position by evaluating the
expected free energy. This means that it is possible to use the active inference paradigm as a «natural
solution for active vision in complex tasks in which the distribution over the environment is not defined
upfront» [19]. With an appropriate amount of GPU memory, a more complex generative model could
also be trained to compute the expected free energy for all potential viewpoints of an agent, not just
for a limited set of considered ones: this would allow extending the use of the neural network to
real-time control of physical robot manipulators.
      </p>
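      <p>
        The viewpoint-selection idea can be caricatured in a few lines (a toy sketch under our own simplifying assumptions, not Van de Maele and colleagues' implementation): the agent scores each candidate viewpoint by its expected information gain, the epistemic term of expected free energy, and moves to the most informative one.
      </p>
      <preformat>
```python
import math

def entropy(p):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0)

# Current belief over 3 possible workspace configurations (uniform).
belief = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical likelihoods p(observation | configuration) per viewpoint:
# viewpoint "A" barely discriminates the configurations, "B" does.
viewpoints = {
    "A": [[0.5, 0.5], [0.5, 0.5], [0.4, 0.6]],
    "B": [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]],
}

def expected_info_gain(belief, lik):
    """Epistemic value: prior entropy minus expected posterior entropy."""
    gain = entropy(belief)
    for o in range(len(lik[0])):
        p_o = sum(lik[s][o] * belief[s] for s in range(len(belief)))
        post = [lik[s][o] * belief[s] / p_o for s in range(len(belief))]
        gain -= p_o * entropy(post)
    return gain

best = max(viewpoints, key=lambda v: expected_info_gain(belief, viewpoints[v]))
print(best)  # the more informative viewpoint is selected
```
      </preformat>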
      <p>All the generative models based on Bayesian networks, which also rely on FEP and AIF, are worth
investigating also because they could be considered more sustainable than more complex models
(e.g., neural models based on feedforward architectures) in terms of fairness and explainability [20],
since Bayesian networks, as already noted, involve processes that can be tracked and understood.
They also refer to FEP, which provides a very general framework for resource optimisation. On the other
hand, it may be risky to consider this type of algorithm as a model for building AI cognition, since it
relies on basic decision theory, which assumes an optimal decision maker able to calculate and choose
the move that maximizes the utility function at each stage of problem solving. These theories of
maximising expected utility have been criticised as computationally intractable [21][22]: they
are computationally demanding for systems with a large number of random variables, and
therefore lead to an exponential increase in computational effort and energy consumption as the
number of states increases.</p>
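      <p>
        The intractability worry is easy to make concrete (a toy count, assuming binary random variables): the number of joint states, and hence the worst-case cost of naive exact inference, doubles with every added variable.
      </p>
      <preformat>
```python
import itertools

def joint_states(n):
    """All joint assignments of n binary random variables."""
    return list(itertools.product([0, 1], repeat=n))

# The table that naive exact inference must sum over doubles
# with every variable added: exponential growth in n.
for n in (4, 8, 16):
    print(n, len(joint_states(n)))  # 16, 256, 65536
```
      </preformat>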
      <p>Hence, the take-away message of this section is that AIF, PP and FEP are not only abstract
models of cognition, mind or resource optimization, but also have very concrete applications in
AI. For this reason, research should delve deeper into the implications of this relationship. On these grounds,
the next section discusses what has been done along this path so far and what might be investigated in
the future.</p>
    </sec>
    <sec id="sec-4">
      <title>4. A few steps forward</title>
      <p>The previous sections have described the cognitive frameworks of PP and AIF and
have shown how they are employed in AI implementations. The relevance of this link has been noted by
neuroscientists, who are studying the implications of these tools for designing new kinds of AI, characterized
by top-down systems that prioritize high-level planning and decision-making, rather than bottom-up
systems driven by masses of training data [23].</p>
      <p>
        Moreover, Albarracin and colleagues [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] have argued that incorporating the design principles of
AIF into AI systems can lead to better explainability for two reasons: the first is that «by deploying an
explicit generative model, AI systems premised on AIF are designed to be interpreted by a user or a
stakeholder, that can be fluent in the operation of such models». The second is that an architecture
inspired by AIF models of introspection is useful to «build systems that are able to access and report on
the reasons for their decisions, and their state of mind when reaching these decisions». Specifically, they
propose an architecture including components that update and maintain an internal model of the system's own
state, beliefs and goals: this capacity for self-access enables the AI system to optimize and report on its
decision-making processes, fostering introspection and enhanced explainability. This is done by a soft
attention mechanism, which uses a weighted combination of hierarchical generative model components
to focus on relevant information: the attention weights are computed based on the input data and the
AI system’s internal state, allowing the system to adaptively focus on different aspects of the hierarchical
generative model [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
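      <p>
        A minimal sketch of such a soft-attention combination (our own illustrative reading of the proposal, with invented components, weights and a hypothetical relevance function, not Albarracin and colleagues' code): attention weights computed from the input and the internal state select among model components, and the weights themselves remain reportable, which is what supports introspective explanation.
      </p>
      <preformat>
```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical hierarchical components, each scoring the current input
# from a different level of abstraction (all names and numbers invented).
components = {
    "sensory":    lambda x, state: x * 1.0,
    "contextual": lambda x, state: x * state["context_gain"],
    "goal":       lambda x, state: state["goal_value"],
}

def soft_attention_output(x, state, relevance):
    """Weighted combination of component outputs; the weights depend on
    the input and the system's internal state, and can be reported."""
    names = list(components)
    weights = softmax([relevance(n, x, state) for n in names])
    output = sum(w * components[n](x, state)
                 for w, n in zip(weights, names))
    return output, dict(zip(names, weights))

state = {"context_gain": 0.5, "goal_value": 2.0}
# Toy relevance: prefer the goal-level component when the input is weak.
relevance = lambda name, x, state: (1.0 if name == "goal" else x)

out, weights = soft_attention_output(0.2, state, relevance)
print(round(out, 3), {k: round(v, 2) for k, v in weights.items()})
```
      </preformat>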
      <p>All that considered, Albarracin and colleagues indicate several research paths for future
development. It is worth exploring, among other things, more advanced data fusion techniques, such as
deep learning-based fusion or probabilistic fusion, to improve the AI system's ability to combine and
process multimodal data effectively. The explainability and transparency of these kinds of algorithms have
been a significant topic in recent years, especially in decision-making scenarios.</p>
      <p>
        In this sense, an interesting path might be found in research on Bayesian networks for
decision-making, specifically in a specific subset of a Bayesian network, i.e., the Markov blanket
(MB): «the Markov blanket of a variable X is the set consisting of the parents of X, the children of X, and
the variables sharing a child with X» [24]. That is, in a Bayesian network, the nodes forming the MB of X
greatly simplify the computation of the value of a variable, since «if X is embedded in a graph with a
hundred variables, but its MB consists of five variables, one can safely ignore ninety-five variables in the
computation» [25]. Hence, in the framework of FEP and AIF, Friston introduced the statistical concept
of MB, advocating it as a tool to describe a specific form of conditional independence between a
dynamical system and its environment, and treating MBs as the real boundaries of living organisms
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Indeed, MBs are associated with the physical boundaries surrounding cells (i.e., their membranes),
through which all influences between the intracellular and extracellular spaces are mediated. More
broadly, MBs represent a separation of the world into different sets of states, which interact only via the
MB. Hence, although living systems, as open systems, need to interact with the environment,
they also need to distinguish themselves from it: in this context, conditional
independence occurs. When the state on the organism/environment boundary, i.e., the MB, is fixed, what
happens on one side of the boundary does not influence what happens on the other side. This means
that all the information necessary to explain the behaviour of the internal system is given by the
state of the blanket [26][27].
      </p>
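      <p>
        The definition quoted above translates directly into code (a generic helper over a hypothetical DAG, given purely for illustration): the MB of X is the union of its parents, its children, and its children's other parents, and it is typically far smaller than the whole network.
      </p>
      <preformat>
```python
# A hypothetical DAG, encoded as: node -> list of its parents.
dag = {
    "A": [],
    "B": ["A"],
    "X": ["A"],
    "C": ["X", "B"],   # B shares child C with X
    "D": ["C"],
    "E": [],
}

def markov_blanket(node, parents_of):
    """Parents of node, children of node, and co-parents of its children."""
    parents = set(parents_of[node])
    children = {c for c, ps in parents_of.items() if node in ps}
    co_parents = {p for c in children for p in parents_of[c]} - {node}
    return parents | children | co_parents

# Conditional on its MB, X is independent of every other node (here D, E).
print(sorted(markov_blanket("X", dag)))  # ['A', 'B', 'C']
```
      </preformat>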
      <p>Beyond the theoretical framework, MBs are employed, jointly with AIF, in AI implementations, for
instance in music generation [28] and in robotics. If we consider again one of the examples seen in the
previous section, i.e., Van de Maele's simulation of an AIF-based robotic agent moving in a space,
MB structures also occur. Indeed, in that context the robotic agent was designed as separated from the
world state through an MB, meaning that the agent can only update its knowledge about the surrounding
space by interacting with the world through its chosen actions and its observed sensory information
[19].</p>
      <p>Hence, when establishing research on MBs, one should be very careful about the basic
assumptions being made: that is, on the one hand, MBs can be used instrumentally in their original
statistical form, i.e., as a formal mathematical construct for inference on a generative model, e.g., a
Bayesian network. On the other hand, this usage cannot come only from the theoretical framework
established by Friston and cannot be justified by traditional statistics alone. However, grasping the potential
of this tool and delving into its possible implications might be worthwhile for further research on using AIF
for AI architectures.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>So far, this short paper has described the theoretical frameworks of PP and AIF, showing their linkages
to FEP, and how all of them are employed in the field of AI. The paper has reported the research
conducted by Mahault Albarracin and colleagues on the possibility of using AIF for implementing
explainable and transparent AI architecture. Specifically, this work’s final proposal is to follow this path
by focusing on the potential of the statistical structures of MBs, which might be helpful for the
development of more sustainable and transparent decision-making processes. This follows from the
observation that the investigation of the relationship between AI and the cognitive frameworks
provided by the philosophy of mind and neuroscience has not yet received sufficient attention in the
literature and should be taken into account for future AI development; vice versa, delving into AI
implementation might help shed some light on these cognitive frameworks.
</p>
      <p>[16] J. Tani, Learning to generate articulated behavior through the bottom-up and the top-down
interaction processes, Neural Networks 16 (1) (2003) 11–23.
[17] L. Pio-Lopez, A. Nizard, K. Friston, G. Pezzulo, Active inference and robot control: a case study,
Journal of The Royal Society Interface 13 (122) (2016).
[18] C. Pezzato, M. Baioumy, C. H. Corbato, N. Hawes, M. Wisse, R. Ferrari, Active inference for fault
tolerant control of robot manipulators with sensory faults, International Workshop on Active
Inference, Springer, (2020) 20–27.
[19] T. Van de Maele, T. Verbelen, O. Çatal, Active Vision for Robot Manipulators Using the Free Energy
Principle, Frontiers in Neurorobotics 15 (2021). doi: 10.3389/fnbot.2021.642780.
[20] K. Imai, Z. Jiang, Principal Fairness for Human and Algorithmic Decision-Making, ArXiv (2022)
arXiv:2005.10400.
[21] J. Kwisthout, I. van Rooij, Computational resource demands of a predictive Bayesian brain,
Comput Brain Behav 3 (2020) 174–188. doi: 10.1007/s42113-019-00032-3.
[22] A. Lieto, Cognitive design for artificial minds, Routledge, New York, 2021.
[23] M. Cullen, B. Davey, K. J. Friston, R. J. Moran, Active Inference in OpenAI Gym: A Paradigm for
Computational Investigations Into Psychiatric Illness, Biol Psychiatry Cogn Neurosci
Neuroimaging 3 (9) (2018). doi: 10.1016/j.bpsc.2018.06.010.
[24] T. Koski, J. M. Noble, Bayesian Networks: An Introduction, Wiley and Sons, Chichester, 2009.
[25] M. Facchin, Extended predictive minds: Do Markov blankets matter?, Review of Philosophy and
Psychology (2021) 1–30. doi: 10.1007/S13164-021-00607-9.
[26] J. Hohwy, Quick'n'Lean or Slow and Rich? Andy Clark on predictive processing and embodied
cognition, in: M. Colombo, E. Irvine, M. Stapleton (Eds.), Andy Clark and His Critics, Oxford
University Press, New York, 2019, pp. 191–205.
[27] E. R. Palacios, A. Razi, T. Parr, On Markov blankets and hierarchical self-organization, Journal of
Theoretical Biology 486 (110089) (2020).
[28] J. Cruz, Deep learning vs Markov model in music generation, Honors College Theses, graduate thesis
(2019).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Friston</surname>
          </string-name>
          ,
          <article-title>The free energy principle: A unified brain theory?</article-title>
          ,
          <source>Nature Reviews Neuroscience</source>
          <volume>11</volume>
          (2)
          (
          <year>2010</year>
          )
          <fpage>127</fpage>
          -
          <lpage>138</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Friston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mattout</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kilner</surname>
          </string-name>
          ,
          <article-title>Action understanding and active inference</article-title>
          ,
          <source>Biological Cybernetics</source>
          <volume>104</volume>
          (1)
          (
          <year>2011</year>
          )
          <fpage>137</fpage>
          -
          <lpage>160</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kirchhoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Parr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Palacios</surname>
          </string-name>
          ,
          <article-title>The Markov blankets of life: Autonomy, active inference and the free energy principle</article-title>
          ,
          <source>Journal of the Royal Society Interface</source>
          <volume>15</volume>
          (138) (
          <year>2018</year>
          ). doi: 10.1098/rsif.2017.0792.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Constant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J. D.</given-names>
            <surname>Ramstead</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P. L.</given-names>
            <surname>Veissière</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Friston</surname>
          </string-name>
          ,
          <article-title>Regimes of expectations: An active inference model of social conformity and human decision making</article-title>
          ,
          <source>Frontiers in Psychology</source>
          <volume>10</volume>
          (
          <year>2019</year>
          ) p.
          <fpage>679</fpage>
          . doi: 10.3389/fpsyg.2019.00679.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L. Da</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Friston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Heins</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Pavliotis</surname>
          </string-name>
          ,
          <article-title>Bayesian mechanics for stationary processes</article-title>
          ,
          <source>Proceedings of the Royal Society, A</source>
          <volume>477</volume>
          (2256) (
          <year>2021</year>
          ). doi: 10.1098/rspa.2021.0518.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ramstead</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Albarracin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kiefer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Klein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fields</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Friston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Safron</surname>
          </string-name>
          ,
          <article-title>The inner screen model of consciousness: applying the free energy principle directly to the study of conscious experience</article-title>
          , (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Limanowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Friston</surname>
          </string-name>
          ,
          <article-title>'Seeing the dark': Grounding phenomenal transparency and opacity in precision estimation for active inference</article-title>
          ,
          <source>Frontiers in Psychology</source>
          ,
          <volume>9</volume>
          (
          <year>2018</year>
          ), p.
          <fpage>643</fpage>
          . doi: 10.3389/fpsyg.2018.00643.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Albarracin</surname>
          </string-name>
          , I. Hipolito,
          <string-name>
            <given-names>S.E.</given-names>
            <surname>Tremblay</surname>
          </string-name>
          ,
          <article-title>Designing explainable artificial intelligence with active inference: A framework for transparent introspection and decision-making</article-title>
          ,
          <source>ArXiv</source>
          (
          <year>2023</year>
          ). arXiv:2306.04025.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Sandved-Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hesp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mattout</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Friston</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lutz</surname>
          </string-name>
          , and
          <string-name>
            <surname>M. J. D. Ramstead</surname>
          </string-name>
          ,
          <article-title>Towards a computational phenomenology of mental action: Modelling meta-awareness and attentional control with deep parametric active inference</article-title>
          ,
          <source>Neuroscience of Consciousness</source>
          ,
          <volume>2021</volume>
          (
          <issue>1</issue>
          ) (
          <year>2021</year>
          ). doi: 10.1093/nc/niab018.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Seth</surname>
          </string-name>
          ,
          <article-title>The brain as a prediction machine</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Mendonça</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Curado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Gouveia</surname>
          </string-name>
          (Eds.),
          <source>The Philosophy and Science of Predictive Processing</source>
          , Bloomsbury, London,
          <year>2020</year>
          , pp.
          <fpage>XIV</fpage>
          -
          <lpage>XVII</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>I. P.</given-names>
            <surname>Derks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>de Waal</surname>
          </string-name>
          ,
          <article-title>A taxonomy of explainable Bayesian networks</article-title>
          , in: A.
          <string-name>
            <surname>Gerber</surname>
          </string-name>
          (Ed.),
          <source>Artificial Intelligence Research</source>
          , CCIS, Springer, Cham,
          <year>2020</year>
          . https://doi.org/10.48550/arXiv.2101.11844.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kaplan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>McCandlish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Henighan</surname>
          </string-name>
          ,
          <article-title>Scaling Laws for Neural Language Models</article-title>
          ,
          <source>ArXiv</source>
          (
          <year>2020</year>
          ). arXiv:2001.08361v1.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <article-title>Attention Is All You Need</article-title>
          ,
          <source>ArXiv</source>
          (
          <year>2017</year>
          ). arXiv:1706.03762.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lanillos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Meo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Pezzato</surname>
          </string-name>
          ,
          <article-title>Active inference in robotics and artificial agents: survey and challenges</article-title>
          ,
          <source>ArXiv</source>
          (
          <year>2021</year>
          ). arXiv:2112.01871.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lanillos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <article-title>Adaptive robot body learning and estimation through predictive coding</article-title>
          ,
          <source>IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</source>
          ,
          IEEE
          (
          <year>2018</year>
          )
          <fpage>4083</fpage>
          -
          <lpage>4090</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>