<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Generative AI for Critical Analysis: Practical Tools, Cognitive Offloading and Human Agency</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Simon Buckingham Shum</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Connected Intelligence Centre, University of Technology Sydney</institution>
          ,
          <addr-line>NSW</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Generative artificial intelligence (GenAI) is now capable of performing tasks that we have considered intellectually demanding. There are justified concerns that this will undermine the agency of both educators and students if tools are poorly designed, poorly used, or imposed, with consequences for education and the future of work. This short paper contributes practical examples pointing to the potential of GenAI to promote critical analysis, as part of intellectually demanding tasks, by both students and educators. However, this depends on appropriate usage. The paper then briefly discusses how we may balance the benefits and risks of human cognitive offloading to AI, as a perspective on human agency.</p>
      </abstract>
      <kwd-group>
        <kwd>Generative AI</kwd>
        <kwd>critical thinking</kwd>
        <kwd>agency</kwd>
        <kwd>cognitive offloading</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Critical thinking through argument analysis</title>
      <p>2 https://twitter.com/ylecun/status/1659332688786882560</p>
      <sec id="sec-2-1">
        <title>2.2. Example 2: Critiquing an argument by analogy on social media</title>
        <p>
          Social media platforms such as Twitter have established themselves as influential channels for public
discourse and opinion, although the quality of conversation is of course highly variable with platform
and community. In a tweet, a well-known AI researcher argued that “AI doomers”, who are proponents
of strong AI regulation, would also have called for the banning of pens and pencils. This is an argument
by analogy [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Bing Chat (a version of GPT-4 integrated into the Microsoft Bing search engine) was
able to critique this claimed analogy effectively (Figure 2).
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Example 3: Analysing an extended argument to create an argument map</title>
        <p>The examples so far have been very short: the arguments have made a single ‘move’, which GPT
could recognise and comment on. Let us now consider a more complex case. In March 2023, a large
number of eminent thinkers wrote an open letter calling for a pause in building large language models.3
Achieving widespread media coverage, this provoked extensive debate, including a letter of rebuttal
from another set of academics and industry researchers.4 This seemed an authentically rich argument to
test GenAI.</p>
        <p>I asked Bing Chat (now Copilot) to access the letter online and identify the key claim and arguments.
It provided a reasonable textual summary, output as a set of bullet points summarising key arguments.
However, it is well established that students struggle to critique arguments, and that rendering them
visually as an argument map can help them understand the key elements of the argument (this is a form
of concept map tuned specifically to show multiple perspectives, and the key features of arguments
such as supporting/challenging claims/evidence). I asked it to generate a map, but it could not. However,
when asked, it confirmed that it understood Argdown, which is a markdown notation for argument
maps. It generated this in a code window, which I pasted into the Argdown web app,5 resulting in a map
(Figure 3).</p>
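        <p>To give a flavour of the notation, a minimal Argdown fragment of a claim with one supporting and one attacking statement might look like the following (an illustrative sketch, not the map the model actually generated; statements sit in square brackets, with + marking support and - marking attack):</p>

```argdown
[Pause]: Labs should pause giant AI experiments.
  + [Risk]: Advanced AI may pose serious societal risks that current governance cannot manage.
  - [Rebuttal]: Framing AI around hypothetical future risks distracts from the concrete harms of systems already deployed.
```

        <p>Pasting such a fragment into the Argdown web app renders it as a graphical argument map, with support and attack edges distinguished visually.</p>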
        <p>Examination of the argument map reveals to what extent this was a rigorous analysis, but also
illustrates ‘hallucination in argument mapping’ (Figure 4). Hallucinations of two types were found.
Firstly, the red underline signals incorrect classification of a premise using incorrect, or indeed
madeup argument schemes. There is to my knowledge no such argument type as Argument from
responsibility, or Argument from precaution. Argument from omission seems to be a jumbling of
Fallacy of omission and Argument from ignorance.</p>
        <p>Secondly, there were hallucinated summaries. This node apparently reads well as a summary, but
the authors do not talk about researchers at all.</p>
        <p>Asking students to perform critical evaluations of
AI-generated argument maps should serve as assurance of learning
about the subject matter, but can also provide important insights
for them into the limitations of AI, if students are equipped and
empowered to see through hallucinations.
3 https://futureoflife.org/open-letter/pause-giant-ai-experiments
4 https://www.dair-institute.org/blog/letter-statement-March2023
5 https://argdown.org</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.4. Conversing about argument analysis</title>
        <p>Conversational agents are exciting for education since they are, by definition, premised on learning
through dialogue — hardly a novel concept. But consider this illustration of GPT’s capabilities (Figure
5).</p>
        <p>Bing Chat’s Argdown code can also be rendered as a textual outline. Figure 7 shows the addition of
the critical questions, with placeholders substituted for students to complete.</p>
        <p>An important educational question arises as we see this kind of performance: will students
engage in excessive cognitive offloading, and fail to learn how to do this themselves? We
return to this in the discussion of user agency.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. CILObot: analysis and summarisation of learning outcomes</title>
      <p>
        Thus far, we have focused on critical thinking and reflection around arguments, primarily with students
in mind, but equally, these are tools for any professional to test their thinking. In the next example, we
focus on a specifically instructional task, which harnesses the generative capability of LLMs more fully
to distill complex text into key themes. The text in this case is a specific ‘genre’ of writing, the Course
Intended Learning Outcome (CILO). CILOs define what students know and can do on successful
completion of the course. In a well-designed curriculum, each part of a course – its subjects,
modules and assessments – should respond to the CILOs. Effective implementation of CILOs
requires both the subject matter expertise of academics and the pedagogical knowledge of learning
designers (LDs). Indeed, recent evidence points to the benefits that academics gain from working with
LDs on their online teaching, and how this transfers to their in-person teaching [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>One specific element of this task that academics can struggle with is articulating good LOs,
which typically vary widely in quality and quantity between academics. At UTS, we are
working towards summarising all courses consistently using approximately six CILOs, to achieve a
better user experience as students make enrolment decisions, and to assist teaching teams in their course
design and reviews. However, it is an intellectually and linguistically demanding task to distill a list of
20-30 CILOs (which is not uncommon) down to six well-designed CILOs, and the university needs to
implement this summarisation for its entire program.</p>
      <p>It is here that we anticipated that LLMs could assist. GenAI intranets now provide universities with
authenticated, secure, private services, integrated with other internal services, and tuned to support
business processes.6 In a 2-day hackathon, iterative prompt engineering informed by feedback from
academics and learning designers led to the refinement of a system prompt that configured ‘CILObot’,
a custom ChatGPT assistant to aid in drafting these new CILOs. The system prompt incorporates widely recognised
design principles (e.g. open each CILO with a verb from Bloom’s Taxonomy), with the addition of
internal requirements (e.g., UTS Indigenous-CILOs), and the chatbot is grounded in a corpus of
documents about CILO design.</p>
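      <p>As a sketch of how such a configuration works (the prompt wording, rules and helper function below are illustrative assumptions, not the actual CILObot system prompt), a system prompt of this kind is paired with the academic's raw CILO list in a standard chat-completion request:</p>

```python
# Illustrative sketch only: the prompt text and rules below are assumptions,
# not the actual CILObot system prompt.

CILOBOT_SYSTEM_PROMPT = """\
You help academics distill a long list of Course Intended Learning Outcomes
(CILOs) into approximately six well-designed CILOs.
Rules:
- Open each CILO with an action verb drawn from Bloom's Taxonomy.
- Preserve coverage of the intent of every original CILO.
- Flag any outcome relating to Indigenous graduate attributes.
"""

def build_messages(raw_cilos: list[str]) -> list[dict]:
    """Assemble the chat messages pairing the system prompt with raw CILOs."""
    user_text = "Distill these CILOs into approximately six:\n" + "\n".join(
        f"- {c}" for c in raw_cilos
    )
    return [
        {"role": "system", "content": CILOBOT_SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]

# The request itself would go to an authenticated Azure OpenAI deployment,
# e.g. (credentials and deployment name are placeholders):
#   from openai import AzureOpenAI
#   client = AzureOpenAI(azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
#                        api_key="...", api_version="2024-02-01")
#   reply = client.chat.completions.create(
#       model="YOUR-GPT4-DEPLOYMENT", messages=build_messages(raw_cilos))
```

      <p>Grounding the chatbot in a corpus of CILO design documents (retrieval over internal guidance) then constrains its drafts beyond what the system prompt alone can enforce.</p>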
      <p>The prototype is showing promise, and after a day’s intensive work using the Azure ‘Chat
Playground’ (the ChatGPT design environment), the results for several programs in our Health faculty
were validated by disciplinary experts (e.g., Figure 8). CILObot generates a coherent first draft in about
30 seconds, which can of course then be refined through further conversation with it, and edited by the
teaching team. We estimate that agreeing on how to distill 20-30 CILOs into 6 would normally be a
minimum of 3 hours’ meeting between the Course Director and the program’s lead academics, which
represents an impressive return on investment. Next steps will test CILObot with other degree courses.
6 cf. Ithaka SR project: Making AI Generative for Higher Education:</p>
      <p>https://sr.ithaka.org/blog/making-ai-generative-for-higher-education-2/</p>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion: cognitive offloading and human agency</title>
      <p>These capabilities are, in my view, impressive. If students were to produce argumentative reasoning as
presented above, we would surely conclude that they were thinking critically, and had mastered some
argumentation principles. Similarly, if an academic proposed a set of six distinctive, well expressed
CILOs with complete coverage of the original CILOs, we would regard that as exactly the kind of task
senior academics should be capable of. The difference, of course, is that these tasks are performed in
under a minute, producing coherent drafts.</p>
      <p>We do not need to believe that agents have the same kind of understanding as people to appreciate
the value of AI being able to communicate with this fluency and precision in order to provoke critical
human reflection. GenAI performs these tasks in seconds, and can iterate its analysis as often as
requested. In principle7, therefore, GenAI can be used to:
• offer students, academics or any other kind of analysts instant, formative feedback on draft
arguments, for instance by identifying points that could be attacked;
• analyse a written corpus to give insights into the quantity and quality of argumentation, which
could inform LA researchers, practitioners and educators;
• analyse a written corpus in order to derive a representative set of summary themes (noting that
AI cannot ‘read between the lines’ as a human qualitative analyst does).</p>
      <p>7 These are in-principle capabilities for GenAI argument analysis and feedback, since to the best of my
knowledge this has not yet been tested empirically with students.</p>
      <p>The pivotal question — whether we are envisioning the future of learning among students or
professionals in the workplace — is the “allocation of function” between human and machine, to use
the original term from ergonomics. Questions of cognitive offloading and human agency now arise, as
we consider different scenarios.</p>
      <p>If AI improves short-term productivity (e.g., faster syntheses of complex information; more creative
ideas; more incisive reasoning), we might anticipate (and indeed we are already seeing in certain
professions) that AI apps will become embedded in professional work practices. Professionals are qualified to
‘drive’ such intellectual power-tools (in contrast to students, their qualifications should enable them to
recognise poor AI output); they will welcome cognitive offloading in their busy lives; and if they do
not use AI they may find themselves unable to compete with those who do. We might see this as empowering
professionals — and yet we might also see a loss of agency as they are essentially forced to use AI in
order to compete. Time will tell if the long-term use of AI leads to the degrading of important human
capabilities, just as GPS satellite route navigation has for many young people obviated the need, and
hence ability, to navigate via printed maps.</p>
      <p>
        In sharp contrast, for education the story is very different. “Productivity gains” need to be judged by
a different yardstick, since while an essay written solely by GenAI in 2 minutes is a “productivity gain”
in terms of artifacts/minute, the absence of the student’s cognitive engagement fails other “KPIs” for
meaningful education. Students must build their foundational knowledge, skills and dispositions, in
order to function as citizens and professionals in the myriad contexts in which they cannot call on AI,
but must think on their feet and demonstrate diverse intelligences [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>
        Consequently, as emphasised in a recent national report for the higher education sector, assessment
must be reformed for the age of AI [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Cognitive offloading takes on special importance in assessment
design [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], since it forces us to ask what exactly we deem important to assess in the age of AI. It is
beyond the scope of this short paper to expand on this issue further, but a fitting conclusion is to return
to AIED research 30 years ago, and remember a distinction made by Roy Pea (emphasis added):
“Pedagogic systems focus on cognitive self-sufficiency, much like existing educational programs,
in contrast to pragmatic systems, which allow for precocious intellectual performances of which the
child may be incapable without the system's support. We thus need to distinguish between systems
in which the child uses tools provided by the computer system to solve problems that he or she
cannot solve alone and systems in which the system establishes that the child understands the
problem-solving processes thereby achieved. We can call the first kind of system pragmatic and the
second pedagogic. Pragmatic systems may have the peripheral consequence of pedagogical effects,
that is, they may contribute to understanding but not necessarily. The aim of pedagogic systems is
to facilitate, through interaction, the development of the human intelligent system. While there is a
grey area in between and some systems may serve both functions, clear cases of each can be
defined.” [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
      </p>
      <p>
        GenAI forces us to ask when we are — or should be, as the boundary shifts — assessing joint
human+AI system performance, versus capability without AI. A consequence of this distinction is that
we must cultivate “mindful engagement”, not “mindless engagement” [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In the intense debates about
whether the human (student, academic or professional) remains sufficiently in the loop, these concepts
from the era of symbolic AI, when researchers could barely glimpse what is now possible, remain as important
as ever.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>CILObot would not have been possible without the joint expertise of my colleagues Sharon Coutts, Ann
Wilson, Michaela Zappia (Institute for Interactive Media and Learning), Susan Gibson &amp; Carl Young
(Data Analytics and Insights Unit), and Miguel Ramal &amp; Olaf Reger (IT Unit).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Gardner</surname>
          </string-name>
          ,
          <article-title>Five Minds for the Future</article-title>
          . Harvard Business Review Press,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Thompson</surname>
          </string-name>
          ,
          <article-title>"Marks Should Not Be the Focus of Assessment - But How Can Change Be Achieved?,"</article-title>
          <source>Journal of Learning Analytics</source>
          , vol.
          <volume>3</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>193</fpage>
          -
          <lpage>212</lpage>
          ,
          <year>2016</year>
          , doi: 10.18608/jla.2016.32.9.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Markauskaite</surname>
          </string-name>
          et al.,
          <article-title>"Rethinking the entwinement between artificial intelligence and human learning: What capabilities do learners need for a world with AI?,"</article-title>
          <source>Computers and Education: Artificial Intelligence</source>
          , vol.
          <volume>3</volume>
          , p.
          <fpage>100056</fpage>
          ,
          <year>2022</year>
          , doi: 10.1016/j.caeai.2022.100056.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Buckingham Shum</surname>
          </string-name>
          ,
          <article-title>"The Roots of Computer-Supported Argument Visualization,"</article-title>
          in
          <source>Visualizing Argumentation: Software Tools for Collaborative and Educational Sense-Making</source>
          , P. A. Kirschner, S. Buckingham Shum, and C. Carr, Eds. London: Springer-Verlag,
          <year>2003</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Buckingham Shum</surname>
          </string-name>
          ,
          <article-title>"Sensemaking on the Pragmatic Web: A Hypermedia Discourse Perspective,"</article-title>
          presented at the
          <source>1st International Conference on the Pragmatic Web</source>
          ,
          <fpage>21</fpage>
          -
          <lpage>22</lpage>
          Sept 2006, Stuttgart,
          <year>2006</year>
          . [Online]. Available: Open Access Eprint: http://oro.open.ac.uk/6442.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Buckingham Shum</surname>
          </string-name>
          ,
          <article-title>"Cohere: Towards Web 2.0 Argumentation,"</article-title>
          presented at the
          <source>2nd International Conference on Computational Models of Argument</source>
          ,
          <fpage>28</fpage>
          -
          <lpage>30</lpage>
          May
          <year>2008</year>
          , Toulouse, France,
          <year>2008</year>
          . [Online]. Available: http://oro.open.ac.uk/10421.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.</given-names>
            <surname>Walton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Reed</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Macagno</surname>
          </string-name>
          , Argumentation Schemes. Cambridge: Cambridge University Press,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Joyner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rusch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Duncan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wojcik</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Popescu</surname>
          </string-name>
          ,
          <article-title>"Teaching at Scale and Back Again: The Impact of Instructors' Participation in At-Scale Education Initiatives on Traditional Instruction,"</article-title>
          presented at the
          <source>Proceedings of the Tenth ACM Conference on Learning @ Scale</source>
          , Copenhagen, Denmark,
          <year>2023</year>
          . [Online]. Available: https://doi.org/10.1145/3573051.3593389.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Lodge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Howard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bearman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dawson</surname>
          </string-name>
          , and Associates,
          <article-title>"Assessment reform for the age of Artificial Intelligence," Tertiary Education Quality &amp; Standards Agency (TEQSA), Australian Government</article-title>
          , Canberra, AUS,
          <year>2023</year>
          . [Online]. Available: https://www.teqsa.gov.au/guides-resources/resources/corporate-publications/assessment-reform-age-artificial-intelligence
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Dawson</surname>
          </string-name>
          ,
          <article-title>"Cognitive Offloading and Assessment,"</article-title>
          in Re-imagining University Assessment in a Digital World,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bearman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dawson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ajjawi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tai</surname>
          </string-name>
          , and D. Boud Eds. Cham: Springer International Publishing,
          <year>2020</year>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>48</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>R. D.</given-names>
            <surname>Pea</surname>
          </string-name>
          ,
          <article-title>"Integrating Human and Computer Intelligence," in Children and Computers: Directions for Child Development</article-title>
          (No. 28), E. L. Klein Ed. San Francisco: Jossey Bass,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Salomon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. N.</given-names>
            <surname>Perkins</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Globerson</surname>
          </string-name>
          ,
          <article-title>"Partners in cognition: extending human intelligence with intelligent technologies,"</article-title>
          <source>Educational Researcher</source>
          , vol.
          <volume>20</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>2</fpage>
          -
          <lpage>9</lpage>
          ,
          <year>1991</year>
          , doi: 10.3102/0013189X020003002.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>