<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Intelligent Texts in a Computer Science Classroom: Findings from an iTELL Deployment</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Scott Crossley</string-name>
          <email>scott.crossley@vanderbilt.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joon Suh Choi</string-name>
          <email>joon.suh.choi@vanderbilt.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wesley Morris</string-name>
          <email>wesley.morris@vanderbilt.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Langdon Holmes</string-name>
          <email>langd.holmes@vanderbilt.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Joyner</string-name>
          <email>david.joyner@gatech.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vaibhav Gupta</string-name>
          <email>vaibhav.gupta@vanderbilt.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CSEDM'24: 8th Educational Data Mining in Computer Science Education Workshop</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>College of Computing, Georgia Institute of Technology</institution>
          ,
          <addr-line>Atlanta, Georgia</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Data Science Institute, Vanderbilt University</institution>
          ,
          <addr-line>Nashville, Tennessee</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Peabody College, Vanderbilt University</institution>
          ,
          <addr-line>Nashville, Tennessee</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This study assesses the efficacy of intelligent texts to help students in a computer science class understand and process information about computational thinking and programming. The intelligent texts used in this study were taken from an introductory programming textbook. The texts were ingested into an intelligent text format using the Intelligent Texts for Enhanced Language Learning (iTELL) framework, which converts any type of machine-readable text into an interactive, intelligent text. iTELL asks students to complete constructed response items and summaries, which are scored automatically by large language models (LLMs) specifically trained to generate scores that inform qualitative feedback to students. Survey results indicated that students responded positively to the constructed response and summary items and felt both items helped them learn. An analysis of delta value gain scores between pre-tests and post-tests for students who used iTELL and those who did not indicated that iTELL students showed increased learning gains. Regression analyses showed that delta scores for the iTELL students were predicted by the number of scrolls, Wording scores on summaries, and pre-test proficiency level (low/high). The results indicate that intelligent texts may help computer science students learn material better than traditional texts.</p>
      </abstract>
      <kwd-group>
        <kwd>Intelligent texts</kwd>
        <kwd>natural language processing</kwd>
        <kwd>reading assessment</kwd>
        <kwd>computer science</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Computational thinking is a critical 21st century skill that
can help students navigate an increasingly digital world. It
describes the ability to express a problem in terms of steps,
such that they could be written out as an algorithm.
Teaching computational thinking allows students to
explore knowledge in concrete ways, and asking students
to code computational thinking into a computer program
can provide students with a quick method to check the
validity of the knowledge [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        However, coding is a complex skill
that requires sustained effort, a specialized approach, and
a diverse skill set. Developing these skills is an iterative
process that requires persistence and knowledge well
beyond simple syntax [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Becoming a proficient
programmer requires a combination of various abilities,
and merely knowing programming syntax is just the initial
step in the challenging process of creating effective
programs.
      </p>
      <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        The complexity of computer programming and
the dedication required to succeed as a programmer means
that computer science classes suffer from high failure and
dropout rates [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] which has led the computer science
community, especially those interested in education, to
develop numerous tools and supports to help facilitate
student success. The majority of these tools focus on
assessing the correctness of assignments in object-oriented
programming languages. Typically, these tools use
dynamic techniques to provide grades and feedback to
students. Some tools use static analysis techniques to
compare a student's submission with a reference solution
or a set of correct student submissions [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        Students in computer science classes are also expected
to learn about computational thinking and programming
approaches, using reading materials. However, printed
books are generally considered ineffective at teaching
computational thinking and the dynamic nature of
programming because they are bound by the static
confines of the text [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]-[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Specifically, studies indicate
that students may fail to comprehend programming
dynamics when explained through static pedagogical
materials [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>The goal of this study is to assess the efficacy of
interactive intelligent texts in a computer science class to
help students understand and process information about
computational thinking and programming. The intelligent
texts used in this study are taken from an introduction to
computing textbook. They were ingested into an intelligent
text format using the Intelligent Texts for Enhanced
Language Learning (iTELL) framework. iTELL is a
computational framework that converts any type of
machine-readable text into an interactive, intelligent text
within a web-app. iTELL is based on theories of reading
comprehension and provides opportunities for users to
generate knowledge about what they read and watch
using constructed responses and summary writing. The
constructed responses and summaries are scored
automatically by large language models (LLMs) specifically
trained to generate scores which inform qualitative
feedback to students. The feedback from these AI
integrations is used in a number of different ways,
including to guide learning, correct misconceptions, review
missed topics, prepare for upcoming materials, make links
between the texts and the real world, and help elaborate on
what users have learned. iTELL represents an advanced
learning technology based on AI which can be used to
improve student learning outcomes in computer science
classes.
</p>
      <sec id="sec-1-1">
        <title>1.1. Intelligent Texts</title>
        <p>
          Intelligent textbooks have become more popular as
advancements in Natural Language Processing (NLP) have
made human-machine interaction more accessible [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
Early digital textbooks offer several advantages over
traditional print textbooks, such as videos and hyperlinks,
but studies found no significant difference in learning
outcomes between digital and print textbooks [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
However, a recent meta-analysis indicates that the
interactive features of intelligent textbooks can moderately
improve reading performance [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Moreover, college
students tend to prefer digital textbooks due to their lower
cost and ease of use.
        </p>
        <p>
          Digital textbooks have been in production for over 30
years with initial texts using principles of knowledge
engineering wherein domain experts designed and
produced the textbooks [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Early work included the
development of hypertext technology, which allowed
students to navigate the textbooks more efficiently [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
One of the first web-based interactive textbooks was
ELM-ART, an intelligent and interactive textbook introduced in
1996 to teach computer programming [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
        </p>
        <p>
          Research on intelligent textbooks has increased in the
past decade as computational tools have become more
advanced and accessible [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Studies have focused on
analyzing student behaviors in intelligent textbooks to
provide personalized learning experiences. For instance,
researchers have developed algorithms that use previous
assessment results to recommend optimal learning
activities for each student in textbooks [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Student
behaviors, such as struggling to answer comprehension
questions, can also be used to adaptively modify intelligent
textbook content and provide remedial materials [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ].
Furthermore, experts can extract concepts from intelligent
texts that can train machine learning algorithms to
personalize learning [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] or part-of-speech taggers can be
used to automatically generate comprehension questions
[
          <xref ref-type="bibr" rid="ref17">17</xref>
          ].
        </p>
        <p>The advent of large language models (LLMs) will
further change the developmental landscape for intelligent
textbooks. LLMs allow intelligent textbooks to be more
interactive and afford more accurate, real-time feedback on
student generated responses used to assess text
understanding. Additionally, LLMs allow for the
integration of AI chatbots that can guide readers through
the text and potentially help with any misunderstandings
or difficulties the readers may encounter. In short, LLMs
enable the creation of intelligent textbooks that can
automatically generate content as well as prompt and
evaluate reader responses allowing for the automation of
scoring and feedback generation.
</p>
      </sec>
      <sec id="sec-1-1a">
        <title>1.2. iTELL</title>
        <p>iTELL is a framework that simplifies the creation and
deployment of intelligent texts with integrated
interactive features. iTELL includes automated pipelines
that leverage Large Language Models (LLMs) with human
oversight to generate participatory content like
constructed response items and summaries. Additionally, it
includes scoring APIs for constructed responses and
summaries. iTELL is a domain-agnostic framework that
utilizes multiple highly transferable generative LLMs to
transform static texts into interactive, intelligent textbooks.</p>
        <p>iTELL generates rich clickstream data, allowing for the
analysis of user behaviors, particularly related to reading.
It uses JavaScript's intersection observer API to determine
whether a specific text section is within the user's viewport
and logs the observation time for different parts of the text.
iTELL also logs events within the system, including scrolling
and page clicks.</p>
        <p>
          Most importantly, iTELL includes read-to-write tasks
that engage the reader in learning. Read-to-write tasks
require readers to extract and integrate information from
the text into their writing allowing them to construct
knowledge as they read [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]-[
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. Read-to-write tasks have
been shown to be effective learning tools. For example,
asking readers to summarize what they have read
results in strong learning gains [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]-[
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. Additionally,
constructed responses, where students provide short
written answers, can improve learning comprehension [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ].
In iTELL, readers are required to complete at least one
constructed response item and write one summary per
page. iTELL utilizes multiple fine-tuned and out-of-the-box
Large Language Models (LLMs) to support these tasks.
Specifically, iTELL uses LLMs to generate short
questions and to evaluate readers’ constructed responses
to those questions. Additionally, iTELL requires readers to
submit a summary of each page and uses LLMs to score
those summaries and provide feedback to readers to use
when revising summaries.
        </p>
      </sec>
      <sec id="sec-1-2">
        <title>1.3. Current Study</title>
        <p>The current study examines an iTELL volume deployed in
an Introduction to Computing class. A single volume of
iTELL was developed that covered a textbook chapter on
control structures (Chapter 3). Students within the
classroom were given extra credit to use the iTELL volume
of which about 25% did. The remaining students depended
on a static, digital version of the textbook. At the end of
each chapter, the students were given a test to assess their
knowledge. The research questions that guide this study
are the following:</p>
        <list list-type="order">
          <list-item>
            <p>Do students think that iTELL is easy to interact with, is understandable, helps them learn, and provides accurate feedback?</p>
          </list-item>
          <list-item>
            <p>Do students who completed the iTELL volume show gains from the test on chapter 2 (no iTELL volume) to the test on chapter 3 (iTELL volume) compared to students who did not complete the iTELL volume?</p>
          </list-item>
          <list-item>
            <p>Are data points collected from the iTELL volume (click-stream data, focus time, and summary scores) related to differences in test scores from chapter 2 to chapter 3?</p>
          </list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Method</title>
      <sec id="sec-2-1">
        <title>2.1. Course</title>
        <p>Data for this research is based on an Introduction to
Computing class that was taught in the spring of 2024 at a
large technology university in the southeastern United
States. The course is one of three courses that can fulfill the
computer science requirement for all students at the
university and is taken by over two thousand
students every academic year in one of two variations:
in-person and online. The course covers the basics of
computing, presupposing no prior programming ability: it
begins with the basics of procedural programming, moves
through control structures and data structures, and
concludes with brief units on object-oriented programming
and algorithms. Throughout the course, students complete
several hundred small programming problems through the
homework assignments, as well as some live coding
problems during four timed, proctored tests. Tests and
quizzes comprise 52% of students' grades in the class.
</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Textbook</title>
        <p>
          The textbook used in the course is called Introduction to
Computing, first edition [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ]. The textbook is published by
McGraw Hill Education and is available in digital format.
There are five units in the textbook with each unit
comprising between 2 and 5 chapters. The five units are
Computing, Procedural Programming, Control Structures, Data
Structures, and Object-Oriented Programming.
        </p>
        <p>For this study, we ingested the third unit covering
Control Structures into an iTELL volume using iTELL’s
content management system. The volume comprised an
overview page that introduced iTELL and 5 additional
pages with each page referencing a chapter from the unit.
Each page followed the structure of the chapter in the book
including learning objectives, key terms, the chapter prose,
and any figures, graphs or tables. However, screenshots of
integrated development environments (IDEs) in the
textbook that demonstrated Python code and the code
output were replaced with a Python interactive sandbox.
The sandbox allowed students to enter in their own code
and run the code within iTELL. The pages were separated
into chunks (i.e., all content under a unique sub-heading)
based on related content as selected by the page designer.
On average, each page had around 6.6 chunks
(SD=1.14). Only the first chunk on a page was visible to a
user at the beginning with all subsequent chunks being
blurred. Users were required to click on a “chunk reveal”
button to unblur the next chunk.</p>
        <p>The content management system automatically
generates constructed response questions and answers for
each chunk with human-in-the-loop review. Prior to publishing,
the page designer ensured that the questions and answers
were accurate. Each chunk had an accompanying
constructed response item. There was a 1/3 chance of a
constructed response item being presented to a user for
each chunk.
</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Participants and Procedure</title>
        <p>A total of 476 students were enrolled in the class. Of
those enrolled, 121 students elected to use the iTELL
version of the textbook and 356 did not. Students were
given 1% extra credit (added to their overall course grade)
for participating in the study.</p>
        <p>These 121 students first provided consent for their data
to be used. If the student did not provide consent or was
under the age of 18, they were sent directly to the iTELL
volume, but no data was collected. If they provided
consent, they then completed an intake survey that
collected demographic information and individual
difference data including age, sex, race/ethnicity, first
language, and reading habits and technology use. They
were then sent to the iTELL volume. If the student finished
the five pages in the volume, participants were asked to
complete an outtake survey to describe their experience of
working with the iTELL volume. The outtake survey
focused on students’ perceptions of the digital text’s
layout, organization, annotation features, and the
effectiveness of the summary and short answer tasks.</p>
        <p>Of these 121 students, 101 consented to having their
data used for analysis. Of those 101 students, 82 completed
iTELL including the outtake survey. However, of the 82
students that completed iTELL, 79 reported test scores for
units 2 and 3. Of the 356 students that did not use iTELL,
277 reported test scores for units 2 and 3.
</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Surveys</title>
        <sec id="sec-2-4-1">
          <title>2.4.1. Intake Survey</title>
          <p>Before interacting with iTELL, students completed a short
intake survey to collect demographic data such as age,
gender, race or ethnicity, and first language background. In
this intake survey, students were also asked to provide
information about their interactions with technology and
they provided input about their reading habits on
electronic and traditional texts. Finally, students provided
information about the types of features that they have used
before in intelligent texts.
</p>
        </sec>
        <sec id="sec-2-4-2">
          <title>2.4.2. Outtake Survey</title>
          <p>Upon completion of the intelligent texts, users completed
an outtake survey which included user feedback questions
on each feature of iTELL including annotation and
notetaking, the section summary tasks, constructed
response items, and overall feedback about the layout and
organization of the intelligent text. This survey allowed us
to collect data on users’ perceptions of how well each of the
features stayed relevant to the text, worked correctly, was
easy to interact with, and helped improve the users’
learning. Users were also prompted to provide short text
feedback about each of the features.
</p>
        </sec>
        <sec id="sec-2-5a">
          <title>2.5. iTELL Data Extraction</title>
          <p>For this analysis, we extracted data related to participant
focus time, click-stream events, constructed responses, and
summaries.</p>
        </sec>
        <sec id="sec-2-4-3">
          <title>2.5.1. Focus Time</title>
          <p>For each page, focus time was extracted in two different
ways. First, focus time was recorded by subtracting the
time the user opened the page from the time the user
moved on to the next page. This focus time included all the
time spent on constructed responses and on summary
scoring. Second, focus time was recorded by how long each
chunk in a page was viewed. From this, a total time for all
chunks per page and an average time for all the chunks on
a page were derived.</p>
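<p>As a minimal illustration (our sketch, not iTELL's actual implementation; the function names are ours), the two extraction methods can be expressed as:</p>

```python
from datetime import datetime

def page_focus_time(page_opened, next_page_opened):
    """Method 1: page-level focus time, the interval between opening a
    page and moving on to the next page (this span includes time spent
    on constructed responses and on summary scoring)."""
    return (next_page_opened - page_opened).total_seconds()

def chunk_focus_times(viewport_log):
    """Method 2: chunk-level focus time from (chunk_id, seconds_in_view)
    records; returns the total and the per-chunk average for a page."""
    total = sum(seconds for _, seconds in viewport_log)
    return total, total / len(viewport_log)
```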
        </sec>
        <sec id="sec-2-4-4">
          <title>2.5.2. Events</title>
          <p>A number of events are calculated in iTELL that can be
instrumented into predictive variables. These include
chunk reveal events (for chunks with and without
constructed response items), general clicks on items within
the system, periods of time when learners are focusing on
the page, and scrolling events. The count of chunk reveals without
constructed responses is not the complement of the chunk
reveals after constructed responses because many students
reread chapters, either by choice or because they needed to
reset the chapter due to a bug in the iTELL
system. Around 40% of chapters had some type of
rereading (i.e., scrolling upward more than 3% of the
page content) on the part of students.</p>
        </sec>
        <sec id="sec-2-4-5">
          <title>2.5.3. Constructed Responses</title>
          <p>
            As part of the iTELL integration, an accompanying
constructed response item is generated for each chunk
using GPT-3.5-turbo with human-in-the-loop. End users do
not see all constructed response items when reading an
iTELL volume; instead, each chunk has a 1/3 chance of
spawning an accompanying constructed response item,
with a minimum of one constructed response item per
page. Users are required to submit at least one response to
a spawned item before proceeding to the next chunk.
Readers’ constructed responses are scored for correctness
using two separate fine-tuned LLMs, Bilingual Evaluation
Under-study with Representations from Transformers
(BLEURT) [
            <xref ref-type="bibr" rid="ref26">26</xref>
            ] and Masked and Permutated Language
Modeling (MPNet) [
            <xref ref-type="bibr" rid="ref27">27</xref>
            ], both of which report an accuracy
of ~ .80 [
            <xref ref-type="bibr" rid="ref28">28</xref>
            ] on question/answer pairs in the Multi-Sentence
Reading Comprehension (MultiRC) dataset [
            <xref ref-type="bibr" rid="ref29">29</xref>
            ]. The same
BLEURT and MPNet models are used to provide feedback
to readers who are given the opportunity to revise their
constructed responses if needed.
          </p>
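<p>A minimal sketch of the spawning rule described above, assuming only the 1/3 chance per chunk and the one-item-per-page minimum (the function name is ours, not iTELL's API):</p>

```python
import random

def spawn_constructed_responses(n_chunks, p=1/3, rng=random):
    """Each chunk independently has probability p (1/3 in iTELL) of
    spawning a constructed response item; if no chunk spawns one, a
    single chunk is chosen at random so that every page presents at
    least one item."""
    spawns = [rng.random() < p for _ in range(n_chunks)]
    if not any(spawns):
        spawns[rng.randrange(n_chunks)] = True  # enforce per-page minimum
    return spawns
```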
          <p>
            For each participant, we calculated the number of
constructed responses they produced and the average score
they received for the constructed responses on a scale of
1 to 3, with a 1 representing when the two LLMs agreed the
answer was incorrect, a 2 representing when one of the
LLMs classified the answer as incorrect and the other
classified as correct, and a 3 representing when the two
LLMs agreed the answer was correct. Figure 1
demonstrates the interface and the feedback returned for a
constructed response scored a 2 by the model. Because of a
bug in the code connecting iTELL to its database, a number
of constructed responses were not logged and were omitted
at random. Of the 395 pages completed by the 79 iTELL
participants, 130 of those pages did not have constructed
response data logged.</p>
        </sec>
        <sec id="sec-2-4-5a">
          <title>2.5.4. Summaries</title>
          <p>After reading each page, students were prompted to write
a summary of what they read. Algorithmic filters in iTELL
ensure that the summaries are between 50 and 200 words
long, are written in English, and do not include inappropriate
language. The iTELL interface does not allow copying and
pasting directly from the text. When a student submits a
summary, they receive a score on Language Borrowing by
calculating the proportion of overlapping trigrams between
the summary and the source [
            <xref ref-type="bibr" rid="ref30">30</xref>
            ]. They also receive a score
on Relevance using cosine similarity between the text
embedding of the summary and the text embedding of the
source. If they pass these tests, they are scored by two
encoder LLMs introduced by Morris et al. [<xref ref-type="bibr" rid="ref31">31</xref>]-[
            <xref ref-type="bibr" rid="ref32">32</xref>
            ] on
Content (i.e. does the summary reproduce the content of
the source) and Wording (i.e. does the summary use correct
grammar/syntax and paraphrasing).
          </p>
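<p>The screening scores can be sketched as follows. This is our illustration only: the word-count bounds and trigram proportion follow the description above, while the bag-of-words cosine is a stand-in for the neural text embeddings iTELL actually uses.</p>

```python
import math
from collections import Counter

def trigram_counts(text):
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:], tokens[2:]))

def language_borrowing(summary, source):
    """Language Borrowing: proportion of the summary's trigrams that
    also occur in the source."""
    summ, src = trigram_counts(summary), trigram_counts(source)
    if not summ:
        return 0.0
    return sum(n for g, n in summ.items() if g in src) / sum(summ.values())

def relevance(summary, source):
    """Relevance: cosine similarity between vectors of the two texts
    (bag-of-words here; iTELL uses neural text embeddings)."""
    a, b = Counter(summary.lower().split()), Counter(source.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def length_ok(summary, lo=50, hi=200):
    """Length gate: summaries must be between 50 and 200 words long."""
    return lo <= len(summary.split()) <= hi
```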
          <p>
            These models, based on the Longformer pretrained
model [
            <xref ref-type="bibr" rid="ref33">33</xref>
            ], were finetuned on a large dataset of sources
and summaries that were scored on a six-criteria analytic
rubric by expert raters. The six criteria were distilled into
two principal components [
            <xref ref-type="bibr" rid="ref32">32</xref>
            ]. The Content PCA score
includes how well the summary reproduced the main idea
and details of the text, how well the summary was
organized, and how well the summary used objective voice.
The Wording PCA score includes grammar/syntax and how
well the summary paraphrased the source using original
language. The score predictions are normalized so that 0
represents the mean of the scores in the original training
set, with a standard deviation of 1. In a held-out test set
of sources that the models had not encountered during
training, they reported R2 values of 0.82 for Content and 0.7
for Wording [
            <xref ref-type="bibr" rid="ref32">32</xref>
            ]. An example of the summary interface is
provided in Figure 2 for a summary that passed all scoring
metrics.
          </p>
          <p>For each participant, we calculated the number of
summaries they produced and the average score they
received on each summary for Content, Wording,
Language Borrowing, and Relevance.
</p>
        </sec>
        <sec id="sec-2-4-6">
          <title>2.5.5. Python IDE</title>
          <p>The iTELL volume of the Introduction to Computing
textbook included an integrated development environment
(IDE) sandbox that exhibited Python code and allowed
students to enter and run their own code. However, logging
features for the sandbox data were not implemented at the
time of data collection.
</p>
        </sec>
      </sec>
      <sec id="sec-2-5">
        <title>2.6. Analyses</title>
        <p>We ran three different analyses to address our research
questions. For the first research question related to
whether users think iTELL is easy to interact with, is
understandable, helps them learn, and provides accurate
feedback, we ran simple descriptive statistics and graphed
out the results. For the second research question related to
whether students who completed the iTELL volume show
gains from chapter 2 to chapter 3 test scores compared to
students who did not use the iTELL, we ran statistical tests
to assess differences between the two groups’ delta values.
Our main statistical metrics are a p value to indicate if an
effect exists and a Cliff’s Delta value to examine the
strength of the effect. For the third research question
related to whether data points collected from the iTELL
volume related to click-stream data, focus time, and
summary scores are predictive of delta values, we
conducted a stepwise linear regression using iTELL data as
predictors of the delta values. We included a categorical
performance variable on the chapter 2 test based on
whether students scored above the mean (high) or below
the mean (low). This variable was included as a predictor
and as a possible interaction to see if there was an effect of
iTELL on lower or higher-performing students.</p>
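<p>For illustration, the delta values and the Cliff's Delta effect size can be computed as below. This is our pure-Python sketch; in practice the Mann-Whitney U test and the stepwise regression would come from a statistics package.</p>

```python
def delta_scores(test2, test3):
    """Per-student gain from the chapter 2 test to the chapter 3 test."""
    return [t3 - t2 for t2, t3 in zip(test2, test3)]

def cliffs_delta(xs, ys):
    """Cliff's Delta: P(x > y) - P(x < y) over all cross-group pairs,
    a nonparametric effect size ranging from -1 to 1."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))
```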
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results</title>
      <sec id="sec-3-1">
        <title>3.1. User Survey Data</title>
        <p>For the user survey data, we were most interested in
student responses to the summarization task, the
constructed response items, and their overall satisfaction
with iTELL. To provide a simpler representation for survey
item visualizations, we combined scores of 4 and 5 into a
single category (agree) and all scores of 1 and 2 into a single
category of disagree. Scores of 3 were labeled neutral. We
conducted follow up ANOVAs to examine if any differences
were noted across survey responses by ethnicity or reading
frequency.</p>
        <p>For the summary task, the mean responses were
generally positive (M &gt; 4). The lowest responses were for
the accuracy of feedback (M = 4.18) while the highest
responses were for ease of understanding (M = 4.29).
Students felt that the summary tasks helped them learn
(M = 4.18). There were no significant differences noted
across survey items by race or ethnicity or by reading
frequency. Data for this analysis are presented in Figure 3.</p>
        <p>For our test score difference analysis, we examined the
delta score between test 2 and test 3 for students who used
iTELL and those who did not, to see whether students in the
iTELL condition improved their test scores after using iTELL
more than students who did not use iTELL. Descriptive
statistics indicated greater gains between tests 2 and 3 for
the students who used iTELL (M = .032, SD = .304) than for
the non-iTELL students (M = -.018, SD = .248). Visual
examination of the data indicated that it was not normally
distributed (see the Figure 6 histogram). Thus, we conducted
a Mann-Whitney U test on delta scores across conditions
(iTELL vs. non-iTELL). The difference approached significance
(p = 0.067; U = 12,386), suggesting a likely effect. Cliff's
Delta (.132, 95% CI [-0.029, 0.286]) indicated a small
effect, suggesting that students in the iTELL condition
scored higher on test 3 (after using iTELL) than they did on
test 2 (no iTELL use).</p>
        <p>Three variables were significant predictors in our
regression model: number of scrolls, Wording scores on
summaries, and the categorical variable indicating whether
the student scored high or low on test 2. The linear model
reported r = .501, R2 = .251, F(3, 75) = 8.362, p &lt; .001 (see
model parameters summarized in Table 1; * p &lt; .05,
** p &lt; .01). The coefficients indicated that higher delta
scores between tests 2 and 3 were predicted by fewer scrolls,
lower Wording scores, and performance on test 2, with lower
performers on test 2 who used iTELL showing greater gains
between tests 2 and 3.</p>
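        <p>As a sketch of the nonparametric comparison reported above (a minimal illustration with synthetic data; the group sizes and simulated score distributions are our assumptions, not the study's records), the Mann-Whitney U test and Cliff's Delta can be computed as follows:</p>

```python
import numpy as np
from scipy.stats import mannwhitneyu

def cliffs_delta(x, y):
    """Cliff's Delta: mean of sign(x_i - y_j) over all pairs,
    i.e. P(x > y) - P(x < y)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sign(x[:, None] - y[None, :]).mean())

# Synthetic delta scores (test 3 minus test 2); the means/SDs echo
# the descriptives in the text, but the samples are simulated.
rng = np.random.default_rng(42)
itell = rng.normal(0.032, 0.304, size=80)
non_itell = rng.normal(-0.018, 0.248, size=300)

u, p = mannwhitneyu(itell, non_itell, alternative="two-sided")
print(f"U = {u:.0f}, p = {p:.3f}, "
      f"Cliff's Delta = {cliffs_delta(itell, non_itell):.3f}")
```

        <p>Positive Cliff's Delta values indicate that the first group tends to score higher; magnitudes around .15 are conventionally read as small effects.</p>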
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and conclusion</title>
      <p>
        In the modern economy, computing skills are increasingly
important for students to acquire. Understanding
computational thinking and computer programming enables
students to solve important, real-world problems in complex
ways across many domains. However, learning computing skills
is difficult: it requires sustained effort, specialized
teaching environments, and diverse skills. As a result, many
computer science curricula have suffered
from high failure and dropout rates [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. While many
supports have been developed to help students succeed,
there is some consensus that traditional textbooks are less
than effective for acquiring computing skills because
computing is a dynamic process that is not captured well
in static texts [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The goal of this study was to assess the
efficacy of interactive intelligent texts in a computer
science class to help students understand and process
information about computational thinking and
programming. Specifically, we assessed an iTELL volume of
an introduction to computing textbook that focused on a
section related to control structures.
      </p>
      <p>Our assessment of the iTELL volume was an A/B test
where about 25% of the class volunteered to use the iTELL
volume (for extra credit) and the remainder of the class
used a plain digital version of the text. We used test scores
from the previous chapter where all students used the
digital textbook as a baseline measure of student skills. We
then examined students’ survey data to better understand
their experiences with the iTELL volume. Additionally, we
compared delta scores calculated as the difference between
the scores on the control structures test and scores on the
baseline test to assess potential gains for students who
used iTELL. Lastly, we ran a regression model to better
understand what features explained gains by students who
used the iTELL volume.</p>
      <p>The survey results indicated that students’ experiences
with the AI tools within iTELL were positive. Overall,
students felt that the constructed response items and
summary tasks were easy to work with and helped them
improve their learning. Students also felt the AI feedback
was accurate. Student surveys also indicated that the
students were satisfied with the iTELL volume overall.</p>
      <p>
        In terms of learning differences between the iTELL and
non-iTELL students, a Mann-Whitney U test approached
significance, and the effect size indicated a meaningful,
but small, difference in score gains between the baseline
test and the test on the control structures chapter.
A number of students showed
ceiling effects across both tests and removing these
students led to similar results. While a p value can indicate
whether an effect exists, the Cliff's Delta effect size shows the
magnitude of the differences between the iTELL and
non-iTELL groups (a small but meaningful effect) and is the
main quantitative consideration of the study [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]. The
mean score differences indicated that students who used
iTELL showed gains of ~5% versus the students who did not
use iTELL, and the Cliff's Delta indicated that this
difference was meaningful. The standard deviation was
quite high for both groups, though.
      </p>
      <p>The regression analysis of the iTELL student data
indicated that students who showed a greater number of
scrolls showed lower delta scores. This may indicate that
students who are non-linear readers or students who scroll
in smaller increments performed worse. However, much
deeper analysis of scrolling needs to be performed to
support any meaningful conclusions. Additionally, the
regression model indicated that students who scored
higher in Wording for their summaries showed lower delta
score gains. This may indicate that students who focused
on Wording in their summaries at the expense of content
may have performed worse. It may also be that students
who are better writers gain less from using iTELL than
students who are weaker writers. This may be supported by
the final feature of the regression model which was testing
level. The coefficients for testing level indicated that
students that performed lower on the baseline test showed
greater gains when using iTELL. This indicates that iTELL
may work better for low level students, but, again, much
more fine-grained testing is needed to support this notion.</p>
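      <p>The three-predictor linear model discussed above can be sketched with synthetic data (the predictor values and coefficients here are illustrative assumptions, not the study's data; the sample size n = 79 follows from the reported F(3, 75)):</p>

```python
import numpy as np

# Synthetic stand-ins for the three predictors named in the text.
rng = np.random.default_rng(7)
n = 79  # F(3, 75) implies n - 4 = 75 residual df, so n = 79
scrolls = rng.poisson(120, n).astype(float)
wording = rng.normal(3.0, 0.8, n)
low_on_test2 = rng.integers(0, 2, n).astype(float)

# Illustrative delta scores: fewer scrolls, lower Wording scores,
# and low baseline performance associated with larger gains.
delta = (-0.001 * scrolls - 0.05 * wording + 0.15 * low_on_test2
         + rng.normal(0, 0.2, n))

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), scrolls, wording, low_on_test2])
coef, *_ = np.linalg.lstsq(X, delta, rcond=None)

pred = X @ coef
r2 = 1 - ((delta - pred) ** 2).sum() / ((delta - delta.mean()) ** 2).sum()
print("coefficients:", np.round(coef, 4), "R^2 =", round(r2, 3))
```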
      <p>
        Overall, this study finds evidence that intelligent
textbooks are an advanced learning technology that can
use AI to improve student learning outcomes in an
introduction to computing class. This finding builds on
previous studies that have argued that traditional, static
textbooks may be ineffective in computer science
instruction [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. However, intelligent texts that are
interactive and allow for dynamic assessments and greater
student engagement may be effective, especially intelligent
texts that integrate read-to-write tasks known to lead to
increased learning gains [
        <xref ref-type="bibr" rid="ref22 ref23 ref24">22-24</xref>
        ] based on generation
effects [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ].
      </p>
      <p>While the learning gains are small (see reported effect
sizes), it is unlikely that a single text-based intervention
would lead to stronger gains in a computer science class.
However, in combination with other AI-powered tools that help
computer science students effectively use debuggers and
compilers or that guide students through complex, multi-step,
open-ended problems, intelligent texts may lead to greater
gains.</p>
      <p>
        There were a number of limitations to the analyses
conducted. First, we were looking at a convenience sample
and one in which students were rewarded for using iTELL,
so there is likely a self-selection bias [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. While it is
difficult to know the type of students that volunteered for
the iTELL condition, Figure 4 indicates that many more
students that were at ceiling on scores for Tests 2 and 3 did
not volunteer to use iTELL. So, it is likely that students who
needed extra credit volunteered rather than students who were
highly motivated in general. Regardless, a randomized
control trial is needed to truly assess iTELL effects in the
computer science classroom. There were also problems in
iTELL with data documentation. A bug in the constructed
response items led to data for over half of the items not
being logged. Additionally, no data from the Python
sandbox was logged. Thus, it is difficult to disaggregate the
effects of these tools on learning in the iTELL environment.
We also had a relatively small sample size for the iTELL
condition compared to the non-iTELL condition, which
makes it difficult to generalize the findings to different
populations. Additionally, the standard deviation in scores
was quite high for both groups indicating much variation
in learning gains. Lastly, the students in the class all came
from a technical background (i.e., they were studying at a
technical university), which may have affected the
outcomes.
      </p>
      <p>Overall, though, the study provides a promising first
step in understanding how the use of intelligent texts may
lead to learning gains for computer science students.
Knowing the importance of computer science skills in the
modern economy and the difficulty in obtaining those
skills, a variety of educational tools will be needed to
address potential learning deficits in the computer science
classroom. Intelligent texts may prove to be one of those
tools.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>This material is based upon work supported by the
National Science Foundation under Grant 2112532. Any
opinions, findings, and conclusions or recommendations
expressed in this material are those of the author(s) and do
not necessarily reflect the views of the National Science
Foundation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Webb</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bell</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Katz</surname>
            ,
            <given-names>Y. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reynolds</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chambers</surname>
            ,
            <given-names>D. P.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Mori</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Computer science in the school curriculum: Issues and challenges</article-title>
          .
          <source>In Tomorrow's Learning: Involving Everyone. Learning with and about Technologies and Computing: 11th IFIP TC 3 World Conference on Computers in Education, WCCE</source>
          <year>2017</year>
          , Dublin, Ireland,
          <source>July 3-6</source>
          ,
          <year>2017</year>
          ,
          <source>Revised Selected Papers</source>
          <volume>11</volume>
          (pp.
          <fpage>421</fpage>
          -
          <lpage>431</lpage>
          ). Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Jiau</surname>
            ,
            <given-names>H. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J. C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ssu</surname>
            ,
            <given-names>K.-F.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Enhancing self-motivation in learning programming using game-based simulation and metrics</article-title>
          .
          <source>IEEE Transactions on Education</source>
          ,
          <volume>52</volume>
          (
          <issue>4</issue>
          ),
          <fpage>555</fpage>
          -
          <lpage>562</lpage>
          . https://doi.org/10.1109/TE.2008.2010983
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Robins</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rountree</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Rountree</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Learning and Teaching Programming: A Review and Discussion</article-title>
          .
          <source>Computer Science Education</source>
          ,
          <volume>13</volume>
          (
          <issue>2</issue>
          ),
          <fpage>137</fpage>
          -
          <lpage>172</lpage>
          . https://doi.org/10.1076/csed.13.2.137.14200
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Messer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>N. C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kölling</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Shi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Automated grading and feedback tools for programming education: A systematic review</article-title>
          .
          <source>ACM Transactions on Computing Education</source>
          ,
          <volume>24</volume>
          (
          <issue>1</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Bennedsen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Caspersen</surname>
            ,
            <given-names>M. E.</given-names>
          </string-name>
          (
          <year>2005</year>
          ).
          <article-title>Revealing the programming process</article-title>
          . Paper presented at the
          <source>ACM SIGCSE Bulletin</source>
          . https://doi.org/10.1145/1047344.1047413
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stafford</surname>
            ,
            <given-names>T. F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Teaching introductory programming to IS students: The impact of teaching approaches on learning performance</article-title>
          .
          <source>Journal of Information Systems Education</source>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ), 6. Retrieved from http://jise.org/Volume24/n2/JISEv24n2p147.pdf
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Gomes</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>A. J.</given-names>
          </string-name>
          (
          <year>2007</year>
          , September).
          <article-title>Learning to program-difficulties and solutions</article-title>
          .
          <source>In International Conference on Engineering Education-ICEE</source>
          (Vol.
          <volume>7</volume>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Sosnovsky</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brusilovsky</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Lan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2022</year>
          , July).
          <article-title>Intelligent textbooks: themes and topics</article-title>
          .
          <source>In International Conference on Artificial Intelligence in Education</source>
          (pp.
          <fpage>111</fpage>
          -
          <lpage>114</lpage>
          ). Cham: Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Rockinson-Szapkiw</surname>
            ,
            <given-names>A. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courduff</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carter</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Bennett</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Electronic versus traditional print textbooks: A comparison study on the influence of university students' learning</article-title>
          .
          <source>Computers &amp; Education</source>
          ,
          <volume>63</volume>
          ,
          <fpage>259</fpage>
          -
          <lpage>266</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Clinton-Lisell</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seipel</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gilpin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Litzinger</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Interactive features of e-texts' effects on learning: A systematic review and meta-analysis</article-title>
          .
          <source>Interactive Learning Environments</source>
          ,
          <volume>31</volume>
          (
          <issue>6</issue>
          ),
          <fpage>3728</fpage>
          -
          <lpage>3743</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Brusilovsky</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sosnovsky</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Thaker</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>The return of intelligent textbooks</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>43</volume>
          (
          <issue>3</issue>
          ),
          <fpage>337</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Bareiss</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Osgood</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>1993</year>
          , December).
          <article-title>Applying AI models to the design of exploratory hypermedia systems</article-title>
          .
          <source>In Proceedings of the fifth ACM conference on Hypertext</source>
          (pp.
          <fpage>94</fpage>
          -
          <lpage>105</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Weber</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Brusilovsky</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <article-title>Elm-art-an interactive and intelligent web-based electronic textbook</article-title>
          .
          <source>International Journal of Artificial Intelligence in Education</source>
          ,
          <volume>26</volume>
          ,
          <fpage>72</fpage>
          -
          <lpage>81</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Lan</surname>
            ,
            <given-names>A. S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Baraniuk</surname>
            ,
            <given-names>R. G.</given-names>
          </string-name>
          (
          <year>2016</year>
          , June).
          <article-title>A Contextual Bandits Framework for Personalized Learning Action Selection</article-title>
          .
          <source>In Educational Data Mining</source>
          (pp.
          <fpage>424</fpage>
          -
          <lpage>429</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Thaker</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Brusilovsky</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Recommending Remedial Readings Using Student Knowledge State</article-title>
          .
          <source>International Educational Data Mining Society.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamb</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saveliev</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cameron</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaykov</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hernandez-Lobato</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2021</year>
          , August).
          <article-title>Results and insights from diagnostic questions: The neurips 2020 education challenge</article-title>
          .
          <source>In NeurIPS 2020 Competition and Demonstration Track</source>
          (pp.
          <fpage>191</fpage>
          -
          <lpage>205</lpage>
          ). PMLR.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Banchs</surname>
            ,
            <given-names>R. E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>D'Haro</surname>
            ,
            <given-names>L. F.</given-names>
          </string-name>
          (
          <year>2015</year>
          , June).
          <article-title>RevUP: Automatic gap-fill question generation from educational texts</article-title>
          .
          <source>In Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications</source>
          (pp.
          <fpage>154</fpage>
          -
          <lpage>161</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Delaney</surname>
            ,
            <given-names>Y. A.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Investigating the reading-to-write construct</article-title>
          .
          <source>Journal of English for Academic Purposes</source>
          ,
          <volume>7</volume>
          (
          <issue>3</issue>
          ),
          <fpage>140</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Grabe</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Stoller</surname>
            ,
            <given-names>F. L.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Teaching and researching reading</article-title>
          . Routledge.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Nelson</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Calfee</surname>
            ,
            <given-names>R. C.</given-names>
          </string-name>
          (
          <year>1998</year>
          ).
          <article-title>Chapter I: The Reading-Writing Connection Viewed Historically</article-title>
          . Teachers College Record,
          <volume>99</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Nelson</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>King</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Discourse synthesis: Textual transformations in writing from sources</article-title>
          .
          <source>Reading and Writing</source>
          ,
          <volume>36</volume>
          (
          <issue>4</issue>
          ),
          <fpage>769</fpage>
          -
          <lpage>808</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Limongi</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>Writing to learn increases long-term memory consolidation: A mental-chronometry and computational-modeling study of “Epistemic writing”</article-title>
          .
          <source>Journal of Writing Research</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>211</fpage>
          -
          <lpage>243</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>A. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campione</surname>
            ,
            <given-names>J. C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Day</surname>
            ,
            <given-names>J. D.</given-names>
          </string-name>
          (
          <year>1981</year>
          ).
          <article-title>Learning to learn: On training students to learn from texts</article-title>
          .
          <source>Educational researcher</source>
          ,
          <volume>10</volume>
          (
          <issue>2</issue>
          ),
          <fpage>14</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <surname>Bensoussan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kreindler</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (
          <year>1990</year>
          ).
          <article-title>Improving advanced reading comprehension in a foreign language: summaries vs. short‐answer questions</article-title>
          .
          <source>Journal of Research in Reading</source>
          ,
          <volume>13</volume>
          (
          <issue>1</issue>
          ),
          <fpage>55</fpage>
          -
          <lpage>68</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <surname>Joyner</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2016</year>
          ).
          <source>Introduction to Computing</source>
          . McGraw-Hill Education LLC.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <surname>Sellam</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Das</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Parikh</surname>
            ,
            <given-names>A. P.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>BLEURT: Learning robust metrics for text generation</article-title>
          . arXiv preprint arXiv:2004.04696.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>T. Y.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>MPNet: Masked and permuted pre-training for language understanding</article-title>
          .
          <source>Advances in neural information processing systems</source>
          ,
          <volume>33</volume>
          ,
          <fpage>16857</fpage>
          -
          <lpage>16867</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <surname>Morris</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Choi</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gupta</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Crossley</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          (in press).
          <article-title>Automatic Question Generation and Constructed Response Scoring in Intelligent Texts</article-title>
          .
          <source>Proceedings of the 17th International Conference on Educational Data Mining</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Khashabi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaturvedi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Upadhyay</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Roth</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences</article-title>
          .
          <source>Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long Papers) (New Orleans, Louisiana, 2018),
          <fpage>252</fpage>
          -
          <lpage>262</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>Broder</surname>
            ,
            <given-names>A. Z.</given-names>
          </string-name>
          (
          <year>1998</year>
          ).
          <article-title>On the resemblance and containment of documents</article-title>
          .
          <source>Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No. 97TB100171)</source>
          ,
          <fpage>21</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <surname>Morris</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crossley</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Trumbore</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2023</year>
          , March).
          <article-title>Using transformer language models to validate peer-assigned essay scores in massive open online courses (MOOCs)</article-title>
          . In
          <source>LAK23: 13th International Learning Analytics and Knowledge Conference</source>
          (pp.
          <fpage>315</fpage>
          -
          <lpage>323</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <surname>Morris</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Crossley</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ou</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dascalu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>McNamara</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Formative Feedback on Student-Authored Summaries in Intelligent Textbooks Using Large Language Models</article-title>
          .
          <source>International Journal of Artificial Intelligence in Education</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <surname>Beltagy</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>M. E.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Cohan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>Longformer: The long-document transformer</article-title>
          . arXiv preprint arXiv:2004.05150.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <surname>Sullivan</surname>
            ,
            <given-names>G. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Feinn</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Using effect size – or why the P value is not enough</article-title>
          .
          <source>Journal of Graduate Medical Education</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ),
          <fpage>279</fpage>
          -
          <lpage>282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <surname>Bertsch</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pesta</surname>
            ,
            <given-names>B.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wiscott</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>McDaniel</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          (
          <year>2007</year>
          ).
          <article-title>The generation effect: A meta-analytic review</article-title>
          .
          <source>Memory &amp; Cognition</source>
          ,
          <volume>35</volume>
          (
          <issue>2</issue>
          ),
          <fpage>201</fpage>
          -
          <lpage>210</lpage>
          . DOI: https://doi.org/10.3758/BF03193441.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <surname>Tripepi</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jager</surname>
            ,
            <given-names>K. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dekker</surname>
            ,
            <given-names>F. W.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Zoccali</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Selection bias and information bias in clinical research</article-title>
          .
          <source>Nephron Clinical Practice</source>
          ,
          <volume>115</volume>
          (
          <issue>2</issue>
          ),
          <fpage>c94</fpage>
          -
          <lpage>c99</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>