<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Human-Under-Test and Continual Bidirectional Assessment for Co-development of Human-AI Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Roberto Casadei</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Delnevo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Silvia Mirri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Alma Mater Studiorum-Università di Bologna</institution>
          ,
          <addr-line>Cesena</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <fpage>23</fpage>
      <lpage>28</lpage>
      <abstract>
<p>Recent developments in artificial intelligence (AI) and large language models (LLMs) promote collaboration between humans and AI-based agents. However, the use of AI has risks, e.g., related to over-reliance and possibly unintended consequences stemming from structural issues and mistakes on both sides. Given that AI is a tool with intrinsic strengths and weaknesses, there are also responsibilities on the human side regarding how the tool is used. For the human-AI system to be effective, both actors should understand the limitations and risks of both players and adopt strategies to mitigate them. Therefore, in this position paper, we propose a model and process for continual bidirectional assessment and co-development of human-AI systems. Though research has mostly focussed on the evaluation of AI agents, we especially focus on the human. Through an analogy with software testing, we propose a “human-under-test” schema, where the AI agent proactively inspects the human user to identify potential issues (e.g., in knowledge, expectations, or process consistency) that might negatively affect the collaboration.</p>
      </abstract>
      <kwd-group>
<kwd>human-AI collaboration</kwd>
        <kwd>LLM</kwd>
        <kwd>AI agents</kwd>
        <kwd>workflows</kwd>
        <kwd>hybrid intelligence</kwd>
        <kwd>co-development</kwd>
        <kwd>human-AI interaction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
Context. The recent developments in artificial intelligence (AI), generative AI (GenAI), and large
language models (LLMs) are revolutionising human-AI collaboration, creating new opportunities and
challenges for the implementation of systems and processes fostering hybrid/collective intelligence [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
Our focus is on human-AI collaboration in “projects”, loosely defined as multi-step activities aimed at
solving information-intensive tasks and producing deliverables (e.g., software projects).
Problem, state of the art, and gap. Limitations in AI tools, in user knowledge and expectations,
and in human-AI interaction (HAI) imply risks that may undermine performance and give way to
unintended consequences. AI maturity models have been proposed to comprehensively assess
an organisation’s ability to leverage AI, suggesting properties that humans and AI should possess for
mature HAI and ecosystems. Several studies have been carried out on human-AI teams [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6 ref7">3, 4, 5, 6, 7</xref>
        ],
supporting collaborative problem solving up to human-AI co-evolution [
        <xref ref-type="bibr" rid="ref8 ref9">8, 9</xref>
        ]. Crucial concepts include
feedback loops [
        <xref ref-type="bibr" rid="ref10 ref8">8, 10</xref>
        ] for tuning and incremental co-development, shared mental models [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], to properly
frame expectations and promote effective interaction through human-AI mutual understanding, and
meta-cognitive scaffolding [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ], supporting user reasoning through thinking assistants [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Though
proposals for co-evolution and reciprocal learning exist [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], we observe limited contributions on
integrated end-to-end frameworks for human-AI teams with bidirectional assessment of mental-model
and knowledge gaps. This view is also shared by other researchers [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], mentioning “investigation of
bidirectional causality” and “modelling of the feedback loop” as open challenges in human-AI systems.
Contribution. In this position paper, we review contributions on human-AI collaboration, and
propose and discuss two ideas to foster structured research on effective human-AI collaboration. The
general idea is to consider an integrated, end-to-end process framework for bidirectional scaffolding,
aimed at proactively reducing risks by assessing mental models and knowledge about several entities
(knowledge, behaviour, and configuration of the human and AI actors, goals, context, process outputs
and history). The specific idea, covering one direction of the overall schema, is to consider
humans-under-test, namely the human actors as “test subjects”—following a software testing analogy. This also
suggests an alternative to the common “human-in-the-(AI-)loop” viewpoint, which could be referred to
as “AI-in-the-human-loop”.
      </p>
      <p>Paper structure. Section 2 covers background and related work. Section 3 presents the contribution.
Section 4 provides a discussion with pointers to future research.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and Related Work</title>
      <p>For proper human-AI collaboration, human users and AI tools should interact (and possibly collaborate)
in order to understand, assess, and support each other. We review works related to different research
directions contributing to this view.</p>
      <sec id="sec-2-1">
        <title>2.1. AI Maturity Models</title>
        <p>
          In general, maturity models are models aimed at evaluating and supporting the optimisation of processes
by humans or organisations [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. AI maturity models specifically aim to evaluate an organisation’s
readiness to take advantage of AI, by using or developing it. Sometimes, the proposed maturity
models have a different focus, e.g., on the human organisation adopting AI, or on the AI system itself,
considered as an entity with different levels of maturity.
        </p>
        <p>
          Surveys on AI maturity models. A number of surveys provide insights on the AI maturity models
themselves: their focus, goals, and development methods. In 2023, Akbarighatar et al. [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] reviewed
AI maturity models, with a focus on responsible development and use of AI, under the lens of
a sociotechnical perspective. The extracted capabilities for responsible AI include: (i) continuous
evaluation of the effects of AI decisions, (ii) employees’ awareness of ethical issues in AI, (iii) security
and privacy, (iv) fairness evaluation, (v) transparency/understandability of AI models, (vi) accountability.
A 2021 systematic literature review on AI maturity models is also provided by Sadiq et al. [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. They
adopt a taxonomy based on: research objectives (development, validation, or application of AI maturity
models), scope (domain generality, analysis scope), method (what analytical and empirical methods),
design approach (top-down vs. bottom-up), architecture (stage-based vs. continuous), purpose of use
(descriptive vs. prescriptive vs. comparative), typology (maturity grids, structured models, Likert-like
questionnaires, or hybrid models), and maturity model components (levels and elements). They extract
seven critical dimensions: (i) data, (ii) analytics, (iii) technology and tools, (iv) intelligent automation,
(v) governance, (vi) people, and (vii) organisation.
        </p>
        <p>
          Examples of recent AI maturity model proposals. Hartikainen et al. [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] propose a (preliminary)
maturity model to guide the development of human-centred AI systems (HCAI-MM), based on six
building blocks: (i) working with AI uncertainty, (ii) collaboration and human control, (iii) accountability,
(iv) fairness, (v) transparency, and (vi) explainability. We share similar motivations, though we emphasise
the uncertainty related to human actors. Fukas et al. [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] provide an AI maturity model tailored to the
auditing domain (A-AIMM). The A-AIMM consists of five levels for eight dimensions (technologies, data,
people and competences, organisation and processes, strategy and management, budget, products and
services, ethics and regulation). At the most advanced level, the organisation (i) leads the development
of AI technologies, (ii) audit is data-driven, (iii) people have leading AI competences, (iv) processes are
AI-enabled and -driven, (v) the AI strategy is decided, (vi) AI has dedicated budget, (vii) AI supports
products and services, and (viii) AI is trustworthy and explainable. Hansen et al. [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] also propose an AI
capability maturity model to understand and guide adoption of AI in organisations. Their framework
uses two technological (data, infrastructure), three organisational (strategy, people, culture), and two
external dimensions (ethics &amp; regulations, and pressures &amp; motivation). At the sixth, top-most level, AI
is a core integrated part of the organisation’s business model and culture.
        </p>
        <p>Relationship with our work. We share the vision on the critical aspects of AI use and development.
We focus on the people, culture, and human-AI collaboration aspects.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Human-AI Interaction and Collaboration: Mental Models, Teaming, Co-evolution</title>
        <p>Another thread of research focuses on the interaction between humans and AI. Research can be
distinguished mainly in terms of the scope of analysis: co-evolution is more long-term and broad in scope,
whereas human-AI teaming tends to focus more on short-term decision-making or specific aspects
(e.g., mental models, trust). Specifically, mirroring mental models and scaffolding user understanding
and reasoning are two key goals and means for human-AI collaboration. In mirroring, the AI reflects
back cognitive and affective states to users. In (metacognitive) scaffolding, the AI supports users in
self-regulation and self-reasoning during interaction with AI.</p>
        <p>
Human-AI teams. Endsley [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] discusses aspects affecting human-AI team performance, including
decision making, coordination, interaction methods, team training, trust, transparency, explainability,
and the role of bias. The topic is vast and there are no broad surveys yet, besides a scoping review on
human-centred human-AI teaming by Berretta et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] and a conceptual outlook from Lou et al. [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
Contributions on this topic are indeed quite diverse. For instance, Lancaster et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] investigate human-AI
team training, identifying that users tend to value cross-training (understanding more about other roles)
and adaptive roles of AI. To help humans understand AI systems, Cabrera et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] propose to use AI
behaviour (pattern) descriptions for sub-groups of problem instances.
        </p>
        <p>
          Mental models, mirroring, and meta-cognitive scaffolding. Andrews et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] review the role of shared mental models in human-AI teams, based on the intuition that when teammates’ mental models
align, then the team will perform better due to improved prediction accuracy backed by reciprocal
understanding. Bansal et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] focus on AI-advised human decision making in high-stakes domains.
They show that humans with accurate mental models of AI systems (e.g., their error boundary) improve
team performance and, more interestingly, that updating AI to increase accuracy, at the expense
of compatibility (coherence with mental model based on previous experience), may degrade team
performance. Te’eni et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] propose the notion of reciprocal human-machine learning (RHML)
in which both humans and machines iteratively update internal representations through cycles of
feedback. Their Fusion system exemplifies this principle by supporting experts’ message classification
tasks through cognitive mirroring and mutual adaptation.
        </p>
        <p>
          In a broader critique, Lewis and Sarkadi [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] argue that most current AI systems lack true reflective capacity, a core
cognitive faculty in humans, thus failing to manage ambiguity, context, and emergent meaning. They
propose a reflective AI architecture grounded in complex systems and cognitive science to address these
shortcomings. Tankelevitch et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and Lim [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] extend these ideas into practical design strategies
for scaffolding user metacognition when interacting with GenAI. These include explainability features,
adaptive prompting, and bias-aware nudges, as seen in the DeBiasMe platform. Levin et al. [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] introduces
the concept of Meta-AI skills, a new class of metacognitive competencies essential for co-reasoning with
GenAI. These skills include reflective prompt engineering, multimodal synthesis, and tacit knowledge
articulation, which together enable learners to engage AI not just as a tool, but as a cognitive partner.
An interesting concept relevant in this context is that of thinking assistants [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], namely AI agents that
help users think by fostering self-reflection in the user.
        </p>
        <p>
Human-AI coevolution. Defined by Pedreschi et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] as “a process in which humans and AI
algorithms continuously influence each other”, it affects human-AI ecosystems and the broader society.
It is based on a feedback loop where, iteratively, the human user feeds data into AI recommenders or
assistants which re-train/tune and then provide a suggestion influencing the user for the next iteration.
Among the open challenges, the authors mention the “investigation of bidirectional causality” and
the “modelling of the feedback loop”, which align with the focus of this paper. Some studies focus on
particular directions of the team relationships: e.g., Schut et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] investigate human learning from AI,
even as a means for advancing human knowledge.
        </p>
        <p>
          Relationship with our work. This paper focusses on human-AI teaming, especially on collaboration
aimed at identifying issues with mental models and knowledge gaps through bidirectional assessment.
Though extensive research has been carried out on mental models [
          <xref ref-type="bibr" rid="ref11 ref4 ref7">4, 11, 7</xref>
          ], to the best of our knowledge,
limited contributions exist on bidirectional knowledge gap assessment, which is related but not the
same as reciprocal learning as in [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Bidirectional knowledge gap assessment is a diagnostic method
for identifying missing information, whereas reciprocal learning is a collaborative teaching strategy
that fosters mutual knowledge transfer. We explore how metacognitive scafolding is appropriated
and transformed by users over time, and how mirroring strategies can serve not only as feedback
mechanisms but as foundations for shared cognitive ground.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. A Model for End-to-End Integrated Bidirectional Assessment and Scaffolding for Human-AI Teams</title>
      <p>This section introduces a model (Section 3.1) and process framework (Section 3.2) for bidirectional
assessment and scaffolding in human-AI systems.</p>
      <sec id="sec-3-1">
        <title>3.1. Model</title>
        <p>With no loss of generality, we consider a system involving a team consisting of a single human user
and a single AI agent. The ground-truth system state S = (U, Ag, G, Ctx, P) (see Figure 1) is given by:
• user state U = (U_K, U_B, U_C), including its knowledge U_K, behaviour U_B, and configuration
U_C (which can be interpreted as a set of explicit predictors or factors affecting the behaviour);
• AI agent state Ag = (Ag_K, Ag_B, Ag_C), including its knowledge Ag_K, behaviour Ag_B, and
configuration Ag_C;
• goals G, modelling what the system is trying to achieve;
• context Ctx, modelling all relevant information that can be exploited to achieve the goals;
• process P = (L, O, W), modelling the entire process history in terms of a log object L, as well as
the produced output O (cf. project deliverables), and the plan (or workflow) W of future activities.</p>
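        <p>For concreteness, the state tuple can be rendered as a simple data structure. The following Python
sketch is purely illustrative: the field names and types are our assumptions, not part of the model.</p>
        <preformat>
from dataclasses import dataclass

# Minimal sketch of the system state S = (U, Ag, G, Ctx, P) defined above.
# Field names and types are illustrative assumptions, not a normative schema.

@dataclass
class ActorState:
    knowledge: dict        # K: what the actor knows (facts, skills, beliefs)
    behaviour: list        # B: observed or expected actions
    configuration: dict    # C: explicit predictors/factors affecting behaviour

@dataclass
class Process:
    log: list              # L: the entire process history
    output: dict           # O: the produced output (cf. project deliverables)
    plan: list             # W: the plan (or workflow) of future activities

@dataclass
class SystemState:
    user: ActorState       # U
    agent: ActorState      # Ag
    goals: list            # G: what the system is trying to achieve
    context: dict          # Ctx: information exploitable to achieve the goals
    process: Process       # P

# Each actor can additionally hold its own, possibly misaligned, estimate of S.
@dataclass
class ActorView:
    owner: str             # "user" or "agent"
    estimate: SystemState  # the actor's estimate of the ground-truth state
        </preformat>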
        <p>Such state S is of course dynamic, meaning that all its components can evolve in time: from the
initial state S_{t_0} to S_t (the state at the current time t).</p>
        <p>The system includes two actors: the user agent U and the AI agent Ag. Crucially, each actor has its
own view of the system and its components, view_U(S) = S^U and view_Ag(S) = S^Ag, which may not be
aligned with the ground truth and with each other’s view. Notice that, while certain elements such as the
project outputs can be directly inspected by the actors, other elements are directly inaccessible and can
only be estimated in terms of one’s knowledge and context information.</p>
      </sec>
      <sec id="sec-4-1">
        <title>3.2. Process framework</title>
        <p>The proposed framework, summarised in Figure 2, envisions an end-to-end, bidirectional process in
which humans and AI systems co-develop understanding, strategies, and alignment. This process is
grounded in a dynamic, developmental view of human-AI interaction and is operationalised through a
set of interacting architectural components that adapt over time to user behaviour and task complexity.
Grounding and Calibration. At the outset, the system initiates a grounding and calibration phase.
The Expectation Calibration Layer aligns the user’s mental model with the actual capabilities, limitations,
and intended epistemic role of the AI system. This step ensures a realistic understanding of how the AI
will support cognitive work. The Task Ontology Layer decomposes the user’s activity into hierarchical
structures comprising phases (e.g., planning, execution), subtasks, and the cognitive goals driving
them. The User Task Maturity Model estimates the user’s proficiency and fluency across subtasks,
while the User-AI Maturity Model assesses the user’s awareness, adaptability, and strategic competence
in collaborating with AI. These models inform both the human-side scaffolding strategy and the
initialisation of the AI Maturity Model, which characterises the AI’s current ability to adaptively assist
across task contexts.</p>
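        <p>To fix ideas, the following Python sketch shows how the grounding and calibration phase could
initialise the user models; the scoring heuristics and thresholds are our assumptions, for illustration only.</p>
        <preformat>
def ground_and_calibrate(subtasks, past_success):
    """subtasks: subtask names from the Task Ontology Layer.
    past_success: dict mapping subtask name to success rate in [0, 1]."""
    # User Task Maturity Model: per-subtask proficiency estimate.
    task_maturity = {s: past_success.get(s, 0.0) for s in subtasks}
    # User-AI Maturity Model: one score for AI-collaboration competence;
    # here a crude average of task proficiency (a placeholder heuristic).
    user_ai_maturity = sum(task_maturity.values()) / max(len(task_maturity), 1)
    # Expectation Calibration Layer: choose an initial scaffolding intensity
    # per subtask (more support where the estimated maturity is low).
    scaffolding = {s: "high" if m &lt; 0.4 else "low"
                   for s, m in task_maturity.items()}
    return task_maturity, user_ai_maturity, scaffolding
        </preformat>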
        <p>Tracking Cognition. As the interaction unfolds, the user engages through the Human Cognitive
Interface to express decisions, strategies, hesitations, and reasoning outputs. These expressions
materialise as Human Actions (observable physical or cognitive activities) and are captured as structured data
in the Log Object. Simultaneously, the user’s inner reasoning processes are modelled via the Thought
Trace, which captures the fine-grained structure of cognitive operations, including inferred intentions,
logical steps, and reflective loops.</p>
        <p>This cognitive and behavioural data is monitored and interpreted by the Cognitive Mirror, the
AI function that acts as a reflective twin of the user’s reasoning. It monitors, maps, and surfaces
latent cognitive patterns, mirroring back the inferred logic and prompting metacognitive reflection.
These insights are shared with three other core entities. When deviations, misconceptions, or epistemic
breakdowns are detected, the Feedback/Nudging Layer is activated. This layer does not merely provide
corrective responses; it generates metacognitive scaffolds in the form of warnings, reframing questions,
strategic hints, or reflective prompts. The Simulation Engine of User Reasoning generates
hypothetical reasoning trajectories based on the current task state and historical behaviour. It detects
discrepancies, biases, or blind spots and supports adaptive diagnostics. This engine functions as both
a diagnostic tool and an anticipatory agent, supporting proactive, personalised scaffolding. The User
Testing module is responsible for actively probing the user’s capabilities and understanding via
targeted assessments and strategic interruptions. These interventions help dynamically update the user
models and inform future support. The intensity and type of feedback are calibrated based on the joint
analysis from the User Task Maturity Model and the User-AI Maturity Model, ensuring that support is neither
redundant nor cognitively intrusive.</p>
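        <p>As an illustration of such calibration, a possible (hypothetical) rule makes feedback intensity grow
with the observed deviation and shrink as the joint maturity grows, so that support is neither redundant
nor intrusive; the functional form and thresholds below are assumptions.</p>
        <preformat>
def feedback_intensity(task_maturity, user_ai_maturity, deviation):
    """All arguments in [0, 1]; deviation measures how far observed behaviour
    departs from the trajectories of the Simulation Engine of User Reasoning."""
    joint = (task_maturity + user_ai_maturity) / 2
    urgency = deviation * (1 - joint)
    if urgency &gt; 0.6:
        return "warning"             # explicit corrective scaffold
    if urgency &gt; 0.3:
        return "reframing question"  # prompt metacognitive reflection
    if urgency &gt; 0.1:
        return "strategic hint"
    return "none"                    # stay silent to avoid intrusiveness
        </preformat>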
        <p>Feedback Loop. As interaction progresses, both the human and the AI system evolve. The
Co-Reasoning Loop embodies the long-term synergy between the two agents, allowing for the mutual
development of cognitive models, strategy refinement, and trust calibration. The human’s models (User
Task Maturity Model and User-AI Maturity Model) are dynamically updated based on performance
and adaptation, while the AI Maturity Model evolves to fine-tune its scaffolding behaviours and
metacognitive prompts, reinforcing the shift from static assistance to collaborative intelligence.</p>
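        <p>One iteration of such a loop can be sketched as follows; the exponential update rule and its
learning rate are assumptions chosen only to illustrate the mutual, incremental character of the updates.</p>
        <preformat>
def co_reasoning_step(user_maturity, ai_maturity, outcome, lr=0.1):
    """user_maturity/ai_maturity in [0, 1]; outcome is a dict with
    'user_success' and 'scaffold_helpful' values in {0.0, 1.0}."""
    # Human side: maturity drifts toward the observed task performance.
    user_maturity += lr * (outcome["user_success"] - user_maturity)
    # AI side: scaffolding maturity drifts toward observed helpfulness.
    ai_maturity += lr * (outcome["scaffold_helpful"] - ai_maturity)
    return user_maturity, ai_maturity
        </preformat>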
        <p>This bidirectional process not only supports problem-solving but enables a deeper form of
metacognitive scaffolding, wherein the human gradually becomes more reflective, strategic, and autonomous,
while the AI becomes more context-aware, sensitive to individual variation, and educationally aligned.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Discussion and Research Perspectives</title>
      <p>
        “Humans-under-test” and the software testing analogy. Software testing is the overall process of
preparing and executing various kinds of tests to verify (w.r.t. requirements) and validate (w.r.t. intended
usage) a software system, to promote quality and reduce risks [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. A test case gives the procedure,
conditions, and expected results for the assessment of a subject-under-test (SUT) by a tester.
      </p>
      <p>The idea can be extended to human-machine systems and specifically to humans¹. In the latter case,
we may talk of a “human-under-test (HUT)”, namely the human subject that is “tested” by an AI agent. The
goal is to help the human improve their understanding and behaviour (quality), and to reduce risks in
human-AI collaboration. In this context, validation could refer to the assessment of compliance w.r.t.
the AI agent’s mental model, whereas verification could refer to the assessment of the human user’s
mental model and behaviour w.r.t. shared requirements and factual knowledge. The AI agent should
plan what tests are to be executed, when, and how, to cover the most significant risks (also checking for
regressions) according to available resources and the shared testing strategy. See Table 1 for a summary
of elements of the analogy.</p>
      <p>¹This is just an analogy meant to convey the similarity of certain goals and procedures between the
different domains, in order to provide a starting point for further reflections, and should not be interpreted
as a form of “dehumanisation”, which would be ethically deplorable. Other ethical issues exist: for instance,
terms like “defects” and “test failure” are problematic, and methods for locating failures might be considered
“aggressive” or “manipulative”. It should be remarked that the goal is to identify risks and promote quality
(cf. improvement and learning).</p>
      <p>Table 1: Summary of the software testing analogy.
• Test → Human test: a test executed by the AI aimed at verifying or validating some aspect of a
human’s knowledge and behaviour.
• Subject-under-test (SUT) → Human-under-Test (HUT).
• Test failure → The HUT response “significantly” differs from expectation (more generally, it seems
that pass/failure is a limited dichotomy for human test outcomes).
• Fault/defect → The cause for failure (e.g., lack of knowledge, lack of consistency w.r.t. the plan, or
a mistake).
• Coverage → The amount of “relevant” HUT behaviour and knowledge that has been assessed by a
collection of tests.
• Unit test → Assessment of a minimal unit of the HUT’s knowledge/behaviour.
• Integration test → Assessment of how the HUT interacts with other technical tools or the AI.
• Regression test → A test aimed at monitoring that the human maintains the learned competences
over time.
• Black-box testing → The AI only observes input-output behaviour.
• White-box testing → The AI aims to assess the human’s mental model (e.g., beliefs and intentions).
• Test planning → The process of planning when human tests are executed.
• Test-driven development → Gap-directed human-AI co-evolution.</p>
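      <p>To make the analogy concrete, a human test could be encoded as follows; the structure, the
similarity-based oracle, and the 0.5 threshold are illustrative assumptions rather than a prescribed design.</p>
      <preformat>
from dataclasses import dataclass

@dataclass
class HumanTest:
    target: str    # the unit of HUT knowledge/behaviour being assessed
    kind: str      # "unit", "integration", "regression", ...
    probe: str     # the question or task posed to the human
    expected: str  # the expected response (the "test oracle")

def run_human_test(test, response, similarity):
    """similarity: callable returning agreement in [0, 1]. Since pass/failure
    is a limited dichotomy for humans, a graded outcome is reported, and a
    low score only flags a candidate issue whose cause remains to diagnose."""
    score = similarity(response, test.expected)
    return {"target": test.target, "kind": test.kind,
            "score": score, "flagged": score &lt; 0.5}
      </preformat>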
      <p>This interpretation raises interesting questions that should be addressed by further research:
• What kinds of “tests” are most suitable for testing different aspects of the HUT? (cf. white- vs.
black-box testing, unit vs. integration tests)
• How to identify and generate the relevant tests depending on the context?
• How to define the expectations for the responses of the HUT? How to quantify the compliance
or deviance? How to integrate confidence levels and testing outcomes? (cf. construction of test
oracles)
• How to estimate the coverage of relevant knowledge and behaviour?
• How to deal with the non-determinism that characterises (most of) human activity?
• How could tests be planned to avoid bothering the human user?
• How to design a privacy- and ethics-aware human testing² procedure?
• Who evaluates the evaluators? [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] How to measure if the human testing activity carried out by
the AI is useful or effective?
      </p>
      <p>²Notice that, in the literature, “human testing” refers to “testing carried out by humans” and not our
acceptation (where the human is the subject of tests).</p>
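      <p>Regarding test planning, one simple (hypothetical) heuristic is to probe the largest uncovered risks
under an interruption budget, so that the human user is not bothered too often:</p>
      <preformat>
def plan_human_tests(candidates, risk, covered, budget):
    """candidates: iterable of test ids; risk: dict id to risk weight;
    covered: set of already-assessed ids; budget: max interruptions allowed."""
    pending = [t for t in candidates if t not in covered]
    pending.sort(key=lambda t: risk.get(t, 0.0), reverse=True)
    return pending[:budget]  # the few most risk-relevant tests to schedule
      </preformat>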
      <p>
        From “human-in-the-(AI-)loop” to (the complementary) “AI-in-the-human-loop”. As AI
becomes increasingly embedded in decision-making, learning, and creative workflows, its role is
undergoing a fundamental transformation. Traditionally, AI systems have operated as autonomous agents
embedded within human-centred workflows, a paradigm commonly referred to as
human-in-the-loop (HITL) [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. In this configuration, AI outputs are subject to human validation or override, and
humans remain the locus of reasoning, with AI serving as a subordinate tool. This unidirectional model
of interaction emphasises human oversight and correction of AI-generated content, largely focusing on
output quality rather than on mutual understanding or reflective improvement.
      </p>
      <p>In contrast, emerging paradigms, particularly those involving GenAI and LLMs, redefine the nature of
human-AI collaboration. In the framework (Section 3), we introduce a notion of AI-in-the-human-loop,
where the AI is positioned not as a subordinate, but as a reflective partner in cognition. Rather than
being evaluated solely by the human, the AI continuously evaluates, adapts to, and scaffolds the user’s
reasoning. It does so by embedding itself within the human’s cognitive loop: observing, simulating,
and mirroring human mental processes to foster metacognitive awareness, strategic refinement, and
epistemic development. The result is a shift from task completion to co-reasoning, where the AI acts as
a cognitive twin that supports reflective learning and knowledge articulation through adaptive feedback
and dialogic interaction. Key differences are highlighted in Table 2.</p>
      <p>This shift motivates a broader rethinking of how AI systems can be designed to not only support
decision-making, but also promote human growth, self-regulation, and sense-making. Traditional
approaches to AI design prioritise performance metrics such as accuracy and efficiency; however, such
metrics are insufficient when AI is integrated into tasks that involve ambiguity, judgment, or iterative
understanding. Instead, we argue that AI systems—particularly those built on LLMs—should be capable
of mirroring users’ reasoning, surfacing cognitive biases, and scaffolding metacognitive processes.
Doing so requires both theoretical reconfiguration and practical innovation. We draw from cognitive
science, educational theory, and reflective interaction design to propose new foundations for human-AI
collaboration. Our goal is to lay the groundwork for AI systems that can not only perform tasks with
users, but also help users better understand themselves through interaction with AI.</p>
      <p>Final remarks. Our conclusion is that an end-to-end integrated framework for proactive bidirectional
assessment and scaffolding is needed to continuously identify risks, monitor processes, and foster mutual
learning and reasoning (co-evolution). Further research is needed to realise this vision. The
“human-under-test” analogy and the “AI-in-the-human-loop” concept might represent guiding metaphors for
suggesting specific research questions and directions.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT (GPT-4) in order to: grammar and spell
check, paraphrase and reword. After using this tool, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>M. M. M. Peeters</surname>
            ,
            <given-names>J. van Diggelen</given-names>
          </string-name>
          , K. van den Bosch, A. Bronkhorst,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Neerincx</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Schraagen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Raaijmakers</surname>
          </string-name>
          ,
          <article-title>Hybrid collective intelligence in a human-AI society</article-title>
          ,
          <source>AI &amp; SOCIETY</source>
          <volume>36</volume>
          (
          <year>2020</year>
          )
          <fpage>217</fpage>
          -
          <lpage>238</lpage>
          . doi:
          <volume>10</volume>
          .1007/s00146-020-01005-y.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Casadei</surname>
          </string-name>
          ,
          <article-title>Artificial collective intelligence engineering: A survey of concepts and perspectives</article-title>
          ,
          <source>Artif. Life</source>
          <volume>29</volume>
          (
          <year>2023</year>
          )
          <fpage>433</fpage>
          -
          <lpage>467</lpage>
          . doi:
          <volume>10</volume>
          .1162/ARTL_A_
          <fpage>00408</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Lou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Raghu</surname>
          </string-name>
          ,
          <string-name>
            <surname>Y. Zhang,</surname>
          </string-name>
          <article-title>Unraveling human-AI teaming: A review and outlook</article-title>
          ,
          <source>CoRR abs/2504</source>
          .05755 (
          <year>2025</year>
          ). doi:
          <volume>10</volume>
          .48550/ARXIV.2504.05755. arXiv:
          <volume>2504</volume>
          .
          <fpage>05755</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Berretta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tausch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ontrup</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gilles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Peifer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kluge</surname>
          </string-name>
          ,
          <article-title>Defining human-ai teaming the human-centered way: a scoping review and network analysis</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          <volume>6</volume>
          (
          <year>2023</year>
          ). doi:
          <volume>10</volume>
          .3389/frai.
          <year>2023</year>
          .
          <volume>1250725</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Cabrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Perer</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. I. Hong</surname>
          </string-name>
          ,
          <article-title>Improving human-ai collaboration with descriptions of AI behavior</article-title>
          ,
          <source>Proceedings of the ACM on Human-Computer Interaction</source>
          <volume>7</volume>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          . doi:
          <volume>10</volume>
          .1145/ 3579612.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Lancaster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mallick</surname>
          </string-name>
          ,
          <string-name>
            <surname>N. J. McNeese</surname>
          </string-name>
          ,
          <article-title>Human-centered team training for human-ai teams: From training with AI tools to training for AI teammates</article-title>
          ,
          <source>Proceedings of the ACM on Human-Computer Interaction</source>
          <volume>9</volume>
          (
          <year>2025</year>
          )
          <fpage>1</fpage>
          -
          <lpage>38</lpage>
          . doi:
          <volume>10</volume>
          .1145/3710998.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>G.</given-names>
            <surname>Bansal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Nushi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kamar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Weld</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. S.</given-names>
            <surname>Lasecki</surname>
          </string-name>
          , E. Horvitz,
          <article-title>Updates in human-ai teams: Understanding and addressing the performance/compatibility tradeof</article-title>
          ,
          <source>in: 33rd AAAI Conference on Artificial Intelligence</source>
          ,
          <string-name>
            <surname>AAAI</surname>
          </string-name>
          , AAAI Press,
          <year>2019</year>
          , pp.
          <fpage>2429</fpage>
          -
          <lpage>2437</lpage>
          . doi:
          <volume>10</volume>
          .1609/AAAI.V33I01. 33012429.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pappalardo</surname>
          </string-name>
          , E. Ferragina,
          <string-name>
            <given-names>R.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-L.</given-names>
            <surname>Barabási</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dignum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dignum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eliassi-Rad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kertész</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Knott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lukowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passarella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Pentland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shawe-Taylor</surname>
          </string-name>
          , A. Vespignani,
          <article-title>Human-ai coevolution</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>339</volume>
          (
          <year>2025</year>
          )
          <article-title>104244</article-title>
          . doi:
          <volume>10</volume>
          .1016/j.artint.
          <year>2024</year>
          .
          <volume>104244</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Schut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tomašev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>McGrath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hassabis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Paquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Bridging the human-ai knowledge gap through concept discovery and transfer in alphazero</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>122</volume>
          (
          <year>2025</year>
          ). doi:
          <volume>10</volume>
          .1073/pnas.2406675122.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D.</given-names>
            <surname>Te</surname>
          </string-name>
          <article-title>'eni, I.</article-title>
          <string-name>
            <surname>Yahav</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Zagalsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Schwartz</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Silverman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Mann</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Lewinsky</surname>
          </string-name>
          ,
          <article-title>Reciprocal human-machine learning: A theory and an instantiation for the case of message classification, Management Science (</article-title>
          <year>2023</year>
          ). doi:
          <volume>10</volume>
          .1287/mnsc.
          <year>2022</year>
          .
          <volume>03518</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R. W.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Lilly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <surname>K. M. Feigh</surname>
          </string-name>
          ,
          <article-title>The role of shared mental models in human-ai teams: a theoretical review</article-title>
          ,
          <source>Theoretical Issues in Ergonomics Science</source>
          <volume>24</volume>
          (
          <year>2022</year>
          )
          <fpage>129</fpage>
          -
          <lpage>175</lpage>
          . doi:
          <volume>10</volume>
          .1080/1463922x.
          <year>2022</year>
          .
          <volume>2061080</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>L.</given-names>
            <surname>Tankelevitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kewenig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Simkute</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Scott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sarkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sellen</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. Rintel,</surname>
          </string-name>
          <article-title>The metacognitive demands and opportunities of generative ai</article-title>
          ,
          <source>in: Proceedings of the CHI Conference on Human Factors in Computing Systems, CHI '24</source>
          ,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          ,
          <year>2024</year>
          , p.
          <fpage>1</fpage>
          -
          <lpage>24</lpage>
          . doi:
          <volume>10</volume>
          .1145/3613904.3642902.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lim</surname>
          </string-name>
          , DeBiasMe:
          <article-title>De-biasing human-AI interactions with metacognitive AIED (AI in Education) interventions (</article-title>
          <year>2025</year>
          ). doi:
          <volume>10</volume>
          .48550/ARXIV.2504.16770.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Park</surname>
          </string-name>
          , C. Kulkarni,
          <article-title>Thinking assistants: LLM-based conversational assistants that help users think by asking rather than answering</article-title>
          ,
          <source>CoRR abs/2312</source>
          .06024 (
          <year>2023</year>
          ). doi:
          <volume>10</volume>
          .48550/ARXIV.2312. 06024. arXiv:
          <volume>2312</volume>
          .
          <fpage>06024</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>W. S.</given-names>
            <surname>Humphrey</surname>
          </string-name>
          ,
          <article-title>Managing the software process, The SEI series in software engineering</article-title>
          ,
          <source>AddisonWesley</source>
          ,
          <year>1989</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Akbarighatar</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Pappas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vassilakopoulou</surname>
          </string-name>
          ,
          <article-title>A sociotechnical perspective for responsible AI maturity models: Findings from a mixed-method literature review</article-title>
          ,
          <source>International Journal of Information Management Data Insights</source>
          <volume>3</volume>
          (
          <year>2023</year>
          )
          <article-title>100193</article-title>
          . doi:
          <volume>10</volume>
          .1016/j.jjimei.
          <year>2023</year>
          .
          <volume>100193</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>R. B. Sadiq</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Safie</surname>
            ,
            <given-names>A. H.</given-names>
          </string-name>
          <string-name>
            <surname>Abd Rahman</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Goudarzi</surname>
          </string-name>
          ,
          <article-title>Artificial intelligence maturity model: a systematic literature review</article-title>
          ,
          <source>PeerJ Computer Science</source>
          <volume>7</volume>
          (
          <year>2021</year>
          )
          <article-title>e661</article-title>
          . doi:
          <volume>10</volume>
          .7717/peerj-cs.
          <volume>661</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hartikainen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Väänänen</surname>
          </string-name>
          , T. Olsson,
          <article-title>Towards a human-centred artificial intelligence maturity model</article-title>
          ,
          <source>in: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing 1145/3544549</source>
          .3585752.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>Fukas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rebstadt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Remark</surname>
          </string-name>
          , O. Thomas,
          <article-title>Developing an artificial intelligence maturity model for auditing</article-title>
          ,
          <source>in: 29th European Conference on Information Systems - Human Values Crisis in a Digitizing World, ECIS</source>
          <year>2021</year>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H. F.</given-names>
            <surname>Hansen</surname>
          </string-name>
          , E. Lillesund,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mikalef</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Altwaijry</surname>
          </string-name>
          ,
          <article-title>Understanding artificial intelligence difusion through an AI capability maturity model</article-title>
          ,
          <source>Information Systems Frontiers</source>
          <volume>26</volume>
          (
          <year>2024</year>
          )
          <fpage>2147</fpage>
          -
          <lpage>2163</lpage>
          . doi:
          <volume>10</volume>
          .1007/s10796-024-10528-4.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Endsley</surname>
          </string-name>
          ,
          <article-title>Supporting human-AI teams: Transparency, explainability, and situation awareness</article-title>
          ,
          <source>Computers in Human Behavior</source>
          <volume>140</volume>
          (
          <year>2023</year>
          )
          <article-title>107574</article-title>
          . doi:
          <volume>10</volume>
          .1016/j.chb.
          <year>2022</year>
          .
          <volume>107574</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>P. R.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sarkadi</surname>
          </string-name>
          , Reflective artificial intelligence,
          <source>Minds and Machines</source>
          <volume>34</volume>
          (
          <year>2024</year>
          ).
          <source>doi: 10. 1007/s11023-024-09664-2.</source>
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>I.</given-names>
            <surname>Levin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marom</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kojukhov</surname>
          </string-name>
          ,
          <article-title>Rethinking AI in education: Highlighting the metacognitive challenge</article-title>
          ,
          <source>BRAIN. Broad Research in Artificial Intelligence and Neuroscience</source>
          <volume>16</volume>
          (
          <year>2025</year>
          )
          <article-title>250</article-title>
          . doi:
          <volume>10</volume>
          .70594/brain/16.s1/21.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ammann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ofutt</surname>
          </string-name>
          , Introduction to Software Testing, 2 ed., Cambridge University Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>P.</given-names>
            <surname>Liguori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Improta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Natella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cukic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cotroneo</surname>
          </string-name>
          ,
          <article-title>Who evaluates the evaluators? on automatic metrics for assessing ai-based ofensive code generators</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>225</volume>
          (
          <year>2023</year>
          )
          <article-title>120073</article-title>
          . doi:
          <volume>10</volume>
          .1016/j.eswa.
          <year>2023</year>
          .
          <volume>120073</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>X.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , T. Ma,
          <string-name>
            <given-names>L.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <article-title>A survey of human-in-the-loop for machine learning</article-title>
          ,
          <source>Future Generation Computer Systems</source>
          <volume>135</volume>
          (
          <year>2022</year>
          )
          <fpage>364</fpage>
          -
          <lpage>381</lpage>
          . doi:
          <volume>10</volume>
          .1016/j.future.
          <year>2022</year>
          .
          <volume>05</volume>
          . 014.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>