Developing and Evaluating an Interactive Reading Tool with
                         Teachers in the Loop: Action Research Approach
                         Mihwa Lee
                         Hector Research Institute of Education Sciences and Psychology, University of Tübingen, Walter-Simon-Str. 12, 72072 Tübingen, Germany
                         LEAD Graduate School & Research Network, University of Tübingen, Europastr. 6, 72072 Tübingen, Germany


                                        Abstract
                                        Reading is an essential life skill and crucial for students’ academic success. In particular, students increasingly need
                                        to read in English as a second language (L2) due to its global importance. However, teachers in schools often face challenges in providing
                                        interactive L2 reading experiences for a large number of students due to limited time and highly heterogeneous students, leading to L2
                                        readers having few opportunities for meaningful, interactive reading practice with instant support. The rapid advancements of artificial
                                        intelligence (AI) in education have given rise to a number of opportunities for interactive and adaptive learning. Despite significant
                                        advancements in AI-powered educational tools, many language educators continue to view them with skepticism. This may stem from
                                        a perceived misalignment between teaching methods that educators find effective and the features or approaches offered by these
                                        technologies. As a result, the gap between educators’ expectations and the capabilities of AI-driven solutions remains a point of concern.
                                        It is crucial to ensure that educational systems align with established theories and pedagogical insights, and to investigate them from
                                        multiple perspectives, including perceptions of the system, learning outcomes, motivation, and learning behavior to better design
                                        educational products. This article introduces a pedagogically grounded web-based intelligent computer-assisted language learning
                                        (ICALL) system, designed to enhance L2 reading experiences, developed using the Action Research design with teachers in the loop.
                                        The article details the system’s development and provides an overview of ongoing and planned studies, which focus on different aspects
                                        of the ICALL system, examining learners’ behaviors through interaction logs to further L2 learning research and improve educational
                                        tools.

                                        Keywords
                                        Reading comprehension, Language learning, Intelligent computer-assisted language learning (ICALL), Process data


                         1. Introduction                                                                                              To address this gap between research on language ed-
                                                                                                                                   ucation, foreign language teaching insights, and real-life
                         In today’s increasingly globalized world, the growing ne-                                                 classroom usage, an ICALL system that systematically and
                         cessity for students to read in English as an L2 underscores                                              automatically provides various interactive support for L2
                         the importance of proficient L2 reading skills. Learning to                                               reading has been designed and developed, targeting learners
                         read in L2 is complex, as learners must grasp literacy in an                                              of English as a foreign language (EFL). The design and de-
                         unfamiliar language [1]. Thus, there is an urgent need to                                                 velopment of the system is grounded in theories in Second
                         support L2 reading from the early school years. However,                                                  Language Acquisition (SLA), educational sciences, and ped-
                         teachers face challenges in providing interactive and adap-                                               agogical insights from school practitioners, and leverages
                         tive learning experiences for a large number of students                                                  the affordances of Natural Language Processing (NLP)
                         with limited time. Digital environments, such as ICALL sys-                                               tools and a Large Language Model (LLM). In this article, we
                         tems, offer unique opportunities for new ways of learning                                                 introduce the design rationale of the system and present the
                         and teaching [2]. These systems have been shown to en-                                                    plans and status of studies assessing the effectiveness of the
                         hance learning engagement [3] and achieve better language                                                 system in promoting L2 reading from various dimensions
                         acquisition [4] through features such as automatic feedback                                               and examining learners’ learning behaviors using interac-
                         [5], intelligent tutoring [6], and personalized support [7].                                              tion logs stored in the system. Specifically, the first goal
                         Despite these advancements, there remains a significant gap                                               is to design and develop an ICALL system that supports
                         in the use of such tools in school settings, possibly because                                             and enhances L2 reading comprehension based on the SLA
                         of the skepticism among practitioners [8] due to not only                                                 theories and teachers’ insights. The second goal is to ex-
                         people’s lack of knowledge of the field and its capabilities                                              amine the effectiveness of the ICALL system in promoting
                         but also the fact that a lot of AI-based education applications                                           students’ learning outcomes and motivation compared to
                         do not meet educators’ expectations of how effective lan-                                                 traditional online reading practice. Lastly, the third goal
                         guage teaching and learning should be conducted [9]. Given                                                is to investigate learners’ self-regulated learning behavior
                         the complexity of challenges in AI in education (AIED) and                                                by combining interaction logs with self-report data.
                         the field’s traditional emphasis on technical aspects, many                                               By exploring these dimensions, we aim to advance L2 learn-
                         AI-driven educational tools and studies struggle to align                                                 ing research and refine educational tools to better support
                         with the most recent advancements in learning theories,                                                   reading development in school contexts.
                         empirical research findings, and pedagogical insights [9].

                         Proceedings of the Doctoral Consortium of the 19th European Conference                                    2. Background
                         on Technology Enhanced Learning, 16th September 2024, Krems an der
                         Donau, Austria
                         Envelope-Open mihwa.lee@uni-tuebingen.de (M. Lee)
                                                                                                                                   2.1. Linguistic knowledge in reading comprehension
                         https://uni-tuebingen.de/fakultaeten/wirtschafts-und-sozialwissensch
                         aftliche-fakultaet/faecher/fachbereich-sozialwissenschaften/hector-ins                                    Reading is a complex cognitive task that necessitates the
                         titut-fuer-empirische-bildungsforschung/institut/personen/lee-mihwa/                                      integration of textual information with prior knowledge.
                         (M. Lee)                                                                                                  Effective comprehension relies on the reader’s ability to effi-
                         Orcid 0009-0000-0101-3549 (M. Lee)                                                                        ciently process the visual information presented in the text
                                    © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License
                                    Attribution 4.0 International (CC BY 4.0).


CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
[1]. Current theories of reading comprehension generally        like the desire to learn [18, 21, 22]. [22] emphasizes that
depict it as involving multiple interconnected layers of con-   effective feedback should guide students to consider both
ceptual representation. These layers include a lower-level      cognitive and motivational aspects in their learning process,
representation that draws on text-based elements such as        particularly when using computer-assisted learning tools.
vocabulary and grammar, and a higher-level representation       Therefore, one can assume that providing adaptive and scaf-
where the textual content is incorporated into the reader’s     folding feedback for learners potentially triggers changes
broader conceptual framework (e.g., combining informa-          of learners’ attitudes (motivational component) and read-
tion across sentences) [10, 11, 12]. The reader’s vocabulary    ing strategies (cognitive component), which consequently
and grammar knowledge significantly shape the formation         improves reading comprehension [21].
of these semantic structures throughout the reading pro-           Despite the significant role feedback plays in reading com-
cess [10]. Specifically, vocabulary and grammar steer the       prehension, teachers are usually the only reliable source of
parsing process, which builds meaning from local text seg-      feedback for learners in real-life classroom settings. How-
ments. If the local-level representations are inaccurate or     ever, the amount of time they can spend with each
incomplete, overall text comprehension can be significantly     student in class is very limited, resulting in few opportu-
hindered [10, 13]. Lexical-syntactic knowledge is essential     nities for learners to receive individual formative feedback.
for constructing the local-level representation, which forms    This is especially important given the substantial individual
the foundation for higher-level text comprehension [11, 14].    differences in aptitude and proficiency [23]. Another issue
Thus, vocabulary and grammar facilitate the building of         is the lack of research on how feedback enhances L2 reading
text-based propositions and contribute to deeper compre-        comprehension. A recent meta-analysis [21] indicates that
hension.                                                        research on feedback has predominantly concentrated on
   SLA researchers have also focused on the role of vocabu-     learning outcomes related to reading comprehension, whereas
lary and grammar knowledge in understanding L2 reading          few studies have explored the cognitive and affective pro-
comprehension. Numerous studies have explored how these         cesses triggered by feedback aimed at text comprehension.
factors influence L2 reading comprehension, with findings       Furthermore, most of this research has focused on the effec-
consistently underscoring the importance of morphosyn-          tiveness of feedback in reading comprehension in the first
tactic knowledge [14]. Recent meta-analyses on L2 read-         language (L1), with relatively little attention given to its
ing comprehension [14, 15] highlight that vocabulary and        impact on L2 learners’ reading comprehension. Therefore,
grammar knowledge are the two strongest predictors of           further empirical research is needed to investigate if and
L2 reading comprehension. Consequently, vocabulary and          how such feedback, especially in the context of a computer-
grammar knowledge have a significant impact—whether             assisted learning environment, can enhance learners’ learn-
directly or indirectly—on reading comprehension.                ing processes and, consequently, influence their learning
   From the instructional perspective, however, it is almost    behavior and overall reading comprehension.
impossible for teachers to pinpoint vocabulary and gram-
matical knowledge that each learner does not understand         2.3. Self-regulated learning (SRL) in a
while students are reading. One way to support teachers              computer-based learning environment
in this process is by utilizing supportive computer envi-
ronments. Despite the importance of such fundamental            Self-regulated learning (SRL) broadly refers to an educa-
linguistic skills in reading comprehension, there are           tional process in which learners proactively engage in aca-
relatively few technological applications, and even             demic tasks [24, 25]. [24] provides a widely accepted defi-
these (e.g., [16, 17]) involve only minimal use of              nition: “active, constructive process whereby students set
technology. Therefore, it is crucial for researchers to ac-     goals for their learning and then attempt to monitor, regu-
count for both vocabulary and grammar when developing           late, and control their cognition, motivation, and behavior,
language learning applications. Additionally, further re-       guided and constrained by their goals and the contextual
search is needed to explore whether and how support for         features of their environment” (p. 453). In academic litera-
these aspects can enhance learners’ L2 reading processes,       ture, there is a consensus that SRL is essential for students’
potentially shaping their learning behaviors and improving      reading development. Proficient readers are typically highly
overall reading comprehension.                                  motivated self-regulated learners who use various reading
                                                                strategies effectively [26]. Motivation drives learners to use
2.2. Feedback in reading comprehension                          learning strategies, which helps regulate learning behav-
                                                                iors and improve outcomes [25, 27]. SRL strategies include
Feedback is information communicated to learners to mod-        planning, critical thinking, peer learning, effort regulation,
ify their thinking or behavior to close the gap between their   and goal orientation. Classroom-based research indicates
actual performance and target performance [18], thus aim-       that SRL strategies lead to higher learning performance
ing to improve learning [19], as well as enhance emotions       [24, 25, 27]. Therefore, supporting students’ motivation
and motivation during a learning situation [20]. In the field   through, for instance, help options and feedback is cru-
of both SLA and educational sciences, feedback is recognized    cial for promoting their use of learning strategies, which
as an important factor in supporting learning, particularly     eventually enhances learning outcomes [28, 29]. Digital
when it helps overcome insufficient or false hypotheses         technologies have the potential to directly influence learn-
[18]. Feedback serves a cognitive function by informing         ers’ motivation, strategy use, and outcomes by providing
readers of misunderstandings, filling gaps, and increasing      interactive and adaptive learning environments that cater
awareness of their understanding [21]. This awareness of        to individual needs [30]. Hence, investigating the effects of
one’s understanding level is crucial for teaching students      these technologies on motivation, learning strategies, and
to self-regulate their learning from texts, which involves      outcomes is urgent.
both (meta)cognitive strategies, such as making inferences         Previous studies indicated that SRL is a crucial skill for
and monitoring comprehension, and motivational processes,
success in computer-based learning environments as well               1. What are the characteristics that are considered ben-
[31]. However, learners cannot always regulate themselves                eficial for supporting and enhancing L2 reading com-
successfully because of reasons such as lack of good strategy            prehension? What are the students’ and teachers’
use, lack of metacognitive knowledge, failure to control                 perceptions of our L2 reading system?
metacognitive processes, or lack of experience in learning            2. To what extent is the ICALL system for reading effec-
environments with multiple representations. Thus, how to                 tive in promoting students’ learning outcomes and
foster SRL ability has become a central issue in the field of            motivation compared to traditional online reading
education research and practice. In order to support SRL in              practice?
computer-mediated learning environments, instruments                  3. What insights do the interaction logs reveal about
that capture students’ self-regulation are critical.                     the learners’ usage of the system and their SRL be-
                                                                         havior? How do students with high and low
2.4. Trace-based measurement of SRL                                      self-regulation behave differently?
Offline instruments like self-reported questionnaires and           The following sections present the plans and status of
semi-structured interviews have long been used to measure         the studies addressing those research questions in detail. At
students’ SRL processes in both educational sciences and          the time of submission, the first research question is being
SLA. However, these traditional methods face criticism due        addressed in a study that is currently taking place.
to their subjectivity, obtrusiveness, and limited ability to
capture the dynamic nature of learning [32]. Such instru-         4. Methodology
ments are often unable to reflect all the elements learners
attend to during their learning processes [33]. To address        Involving teachers or stakeholders in education research
these limitations, researchers advocate for the integration       whose results will be used in schools is considered very
of multiple data types, such as digital-trace data, which         important because schools and teachers should not only
includes real-time interaction log data [34, 35]. This digital-   be treated as consumers of the research results [42]: suc-
trace data offers a more granular and continuous insight into     cessful research that has a practical impact in schools is
SRL, allowing both researchers and practitioners to mon-          always the outcome of bi-directional efforts. This bi-
itor students’ learning behaviors and strategic decisions         directional effort will not be a one-off process, but a process
in online environments with remarkable detail and in real         that will involve multiple iterations of interactions between
time [36]. These online measures are particularly valuable        the research team and the teachers. Consequently, a multi-
because they capture cognitive processes as they unfold           cycle Action Research paradigm was chosen to guide the
during learning, offering a temporal perspective on cogni-        research process. The Action Research Model (see Figure 1)
tive change and presenting a moment-by-moment view of             is a systematic, collective, collaborative, and self-reflective
students’ processing behaviors [37].                              scientific inquiry aimed at improving educational practices
   One challenge that researchers often address is the            and addressing the practical concerns of teachers [43, 44],
importance of aligning data collection with an SRL model          where a key characteristic of action research is the involve-
[32, 38]. To this end, researchers have often utilised theory-    ment of stakeholders, including teachers, students, and re-
aligned coding schemes that define SRL processes at differ-       searchers. Throughout the project, we adhere to this ap-
ent levels of granularity (e.g., [39]).                           proach in the process of development, testing, and imple-
Based on these schemes, previous research has primar-             mentation of the system. Figure 2 illustrates the overview
ily relied on clickstream data from Learning Management           of the research design based on the Action Research Model.
Systems (LMS) to measure SRL behaviors related to time
management, a crucial sub-construct of SRL [40]. These
studies consistently demonstrate that clickstream-based
measures of time management predict student performance
in online learning environments.
   However, despite the promise of this microanalytic
method and its availability due to recent technological ad-
vances, its application remains limited in the field of SLA and
language learning studies [41]. Furthermore, most research
has focused exclusively on time management, leaving other
critical SRL sub-constructs largely unexplored in the con-
text of digital-trace data collection [40]. This gap highlights   Figure 1: Action Research Model, adapted from [43]
the need for future studies to broaden their focus to include
other dimensions of SRL to fully leverage the potential of        RQ1: To answer the first research question, we first con-
interaction log data in understanding the complexities of         ducted an intensive literature review about L2 reading com-
student learning.                                                 prehension in order to decide on the characteristics to be
                                                                  implemented in an ICALL system for L2 reading. The aim
                                                                  was to understand the key factors that contribute to effective
3. Research questions                                             L2 reading comprehension and how these can be supported
Driven by the objective of advancing L2 learning research         in a digital learning environment. As discussed in the Back-
and refining educational tools to better support reading          ground section (see section 2.1 and section 2.2), reading
development in school contexts, this project focuses on           comprehension involves integrating text information, heav-
the following research questions:                                 ily relying on vocabulary and grammar. Feedback plays a
                                                                  crucial role in improving comprehension by helping learn-
                                                                  ers bridge understanding gaps and enhance self-regulation.
              Figure 2: Overview of methodology based on Action Research Model


The first prototype of our ICALL system, called ARES, in-
cludes features to support these aspects (more discussion of
the technical side of the system development can be found
in Lee et al. (2024)). Following the Action Research Model,
multiple consultations with English practitioners and teach-
ers from German secondary schools (“Gymnasiums”) were
conducted. This collaborative co-design approach ensured
that the system’s features aligned not only with SLA theories
but also with practical classroom needs and pedagogical insights.
Figures 3–8 illustrate some features of the first prototype of
the system, developed after the initial “Plan” phase of
the first iteration cycle of the Action Research Model. Key
features on the learner side, built using NLP tools, include:
         • On-demand interactive lookup on language means:
           learners can access detailed explanations and exam-
           ples of language means directly within the reading
           text, adaptively helping them understand grammar
           rules in context according to their need (see Figure
           3).
         • On-demand interactive vocabulary lookup: learners
           can access detailed explanations and examples of
           vocabulary in terms of its form, meaning, and use         Figure 3: Lookup of language means
           directly within the reading text, adaptively helping
           them understand vocabulary in context according
           to their need (see Figure 4).
         • Elaborated feedback: learners receive detailed, per-
           sonalized feedback on their reading and comprehen-
           sion activities, highlighting areas of strength and
           providing targeted suggestions for improvement (see
           Figure 5).
   In addition to the features that support learners, educa-
tional systems should also support teachers so that they can
be used in real-life classroom contexts. At the same time,
however, such systems should not replace teachers. Rather, they
should help them. Therefore, using an LLM (ChatGPT4o¹), the
system includes features and resources that empower teach-
ers to effectively support L2 reading development in their
classrooms, while at the same time it is designed in a way
that teachers’ expertise is always involved in the process
(more discussion of the technical side of the system develop-
ment can be found in Lee et al. (2024)). Teachers can post-edit
suggestions by the LLM, confirm them, or add their own
questions manually. In this way, teachers make the ultimate
decision about what to show the students. Key features on
the teacher side include:                                            Figure 4: Vocabulary lookup
         • Customization of annotations on language means:
           teachers can customize which annotations on lan-
           guage means are shown to students to align with
           their instructional goals and the specific needs of
           their students (see Figure 6).
1 https://chatgpt.com/
Figure 5: Feedback for student’s response


      • Question generation: the system generates reading          Figure 7: Question generation for a reading text
        comprehension questions (factual and inferential)
        based on the reading text, helping teachers provide
        questions tailored to the text (see Figure 7).
      • Feedback generation: the system creates personal-
        ized feedback for students based on their perfor-
        mance, helping teachers provide individualized sup-
        port (see Figure 8).
      • Evaluation: the system evaluates student responses
        to comprehension questions, providing immediate
        grading, which reduces the grading burden on teach-
        ers.
      • Minimalistic analytics: the system provides simple
        analytics on student performance and engagement,
        offering teachers quick insights without overwhelm-
        ing them with data.
      • Text bank and uploading: the system not only in-
        cludes a library of reading texts on a variety of topics
        but also lets teachers upload texts, allowing them
        to tailor the reading materials to their curriculum
        and students’ interests.
                                                                   Figure 8: Grading of individual submission


                                                                   design. In order to enable Learning Analytics, all user ac-
                                                                   tivities such as button clicks, lookups of language means,
                                                                   reading comprehension question attempts, assignment sub-
                                                                   missions, viewing of specific feedback messages, and any
                                                                   other relevant user actions are logged through xAPI⁶, an
                                                                   interoperability specification for learning technology, and
                                                                   stored in a Learning Record Store (LRS) in the database.
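
As a minimal sketch of what one such logged action might look like, the Python snippet below builds an xAPI-style statement (actor, verb, object, timestamp) for a lookup of a language mean. The IRIs under https://example.org/ares, the verb name, the user identifier, and the helper build_statement are hypothetical placeholders chosen for illustration; they are not the vocabulary or code actually used in ARES, whose backend is written in Java.

```python
import json
from datetime import datetime, timezone

# Hypothetical base IRI for illustration only; the actual ARES
# vocabulary is not specified in the text.
BASE = "https://example.org/ares"


def build_statement(user_id, verb, activity_id, activity_name, timestamp=None):
    """Return a dict shaped like an xAPI statement (actor, verb, object, timestamp)."""
    ts = timestamp or datetime.now(timezone.utc)
    return {
        # The actor is identified by an account on the (assumed) home page.
        "actor": {
            "objectType": "Agent",
            "account": {"homePage": BASE, "name": user_id},
        },
        # The verb is an IRI plus a human-readable display map.
        "verb": {
            "id": f"{BASE}/verbs/{verb}",
            "display": {"en-US": verb},
        },
        # The object is the activity the learner interacted with.
        "object": {
            "objectType": "Activity",
            "id": f"{BASE}/activities/{activity_id}",
            "definition": {"name": {"en-US": activity_name}},
        },
        "timestamp": ts.isoformat(),
    }


statement = build_statement(
    user_id="student-042",
    verb="looked-up",
    activity_id="text-7/language-means/past-simple",
    activity_name="Lookup of language means: past simple",
    timestamp=datetime(2024, 9, 16, 10, 30, tzinfo=timezone.utc),
)
print(json.dumps(statement, indent=2))
```

In a deployment, a statement like this would be sent to the LRS rather than printed. Keeping actor, verb, and object as separate fields is what allows later analyses to filter the logs by learner, action type, or reading text.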
                                                                      Now that the first version of the system has been deployed, a study
                                                                   investigating teachers’ and students’ perceptions of the
                                                                   system is currently taking place in two intact English
                                                                   classes at secondary schools (students around age 13-14)
                                                                   in southwest Germany with the purpose of evaluating
                                                                   the system’s usability and overall task and system design.
Figure 6: Selection of annotations of language means               These mixed-gender classes are part of the academic
                                                                   track of the German education system. The curriculum
                                                                   at this grade level is equivalent to A2-B1 levels on the
   In terms of the technical aspect, ARES is built with Java       Common European Framework of Reference for Languages
at the backend, with a Jetty² server and a Docker³ con-            (CEFR) [46], representing the students’ fourth year of
tainer. The database is PostgreSQL⁴, and the frontend is           EFL instruction in school. Over an eight-week period,
based on a popular JavaScript framework, HTML, and Boot-           students read two texts weekly using ARES as part of their
strap⁵ that provides a highly extensible component-based           homework assigned by teachers. A mixed-method approach
2
  https://jetty.org/index.html                                     with quantitative data from self-reports and qualitative
3
  https://www.docker.com/                                          data from semi-structured interviews is employed. System
4
  https://www.postgresql.org/
5                                                                  6
  https://getbootstrap.com/                                            https://xapi.com/
perceptions are assessed through a self-report questionnaire for the comprehensive evaluation of
educational technology adapted from [47], which contains closed-ended items in eight evaluation
categories, such as Usability, Design, and Learning Motivation, rated on a 7-point Likert scale.
Additional open-ended items asking what students and teachers liked or disliked, and what they
wish for in the system, are included as well. To support the analysis of learning behavior from
logs, students’ self-reported SRL skills in online learning (Online Self-Regulated Learning
Questionnaire, OSLQ, adapted from [48]) are also collected. After filling in the questionnaires,
teachers and several students will be invited to a follow-up semi-structured interview to explore
their perceptions of the system in depth, following the guidelines suggested by [49]. For the
quantitative analysis of the self-reports, the mean and standard deviation of each closed-ended
item and category will be calculated. For the qualitative analysis of the semi-structured
interviews, a reflexive thematic analysis [50] will be conducted. The results will be discussed
with the English teachers at the participating schools to refine the system’s usability and task
design.

RQ2: To answer the second research question, a study investigating the effectiveness of the
system is planned to take place this school year in English classes (students aged around 13-14)
in secondary schools in southwest Germany. The study will be administered via the ARES system and
use a pretest-posttest design consisting of a battery of tests and questionnaires. After parental
consent has been provided, participants will be introduced to the ARES system and complete the
pre-tests and pre-questionnaires. The teachers will be asked to assign at least two reading
assignments per week over an eight-week period via the ARES interface. Based on the methodology
of [23], rotational within-class randomization will be used to assign students to conditions. In
the first four weeks, half of each class will serve as the intervention group, using the system
with the lookup feature and feedback on comprehension questions, while the other half will read
plain texts without such aids. In the second four weeks, this will be reversed. After eight weeks,
participants will be instructed to complete the post-tests, post-questionnaires, and background
questionnaire.
   Learning outcomes will be measured by pre- and post-tests that measure students’ English
vocabulary knowledge (Updated Vocabulary Levels Test, [51]), English reading comprehension
(Reading section of TOEFL® Primary™ Step 2), general English proficiency (Elicited Imitation
test, [52]), and L2 reading motivation (Reading Motivation Questionnaire, adapted from [53]). The
OSLQ [48] will also be used to measure the students’ self-reported SRL skills. During the
eight-week period, participants’ interaction with the system will be tracked. We plan to compare
pre- and post-test scores and pre- and post-questionnaire responses across groups, and we expect
improvements on the measures for which participants had access to the aids while learning.

RQ3: To answer the third research question, the subset of log data stored as students interact
with the system in the aforementioned studies addressing RQ1 and RQ2 will be used. Students’
behavioral data will first be collected as learning logs from the ARES system by extracting
students’ interactions from the LRS in the form of xAPI statements stored in the system database.
As noted in the Background section (see section 2.4), it is critical to align the data collection
with an SRL model [32, 38]. Consequently, the analysis will be guided by the framework proposed by
[32], which defines three macro-level [54] SRL processes: Planning, Engagement, and Evaluation and
Reflection. Each process phase is further divided into several micro-level SRL processes in order
to define fine-grained SRL processes. Details about this theoretical framework and the SRL
processes it encompasses are provided in Table 1. Next, to extract the SRL behavior implied by the
actions, the actions will be aggregated into a common xAPI statement structure in line with the
theoretical framework of SRL processes proposed by [32]. Among the seven micro-level SRL processes
proposed by [32], five are identified according to the functionality of the system where the
actions are taken. Table 2 summarizes the actions in ARES mapped to each proposed macro-level and
micro-level SRL process.
   To explore how students with varying levels of self-regulation approach their learning, we will
compare their self-reported SRL skills with the behavioral patterns recorded in the system.
Behavioral variables will be tracked for each student and assignment to provide a detailed profile
of their learning behaviors. K-means cluster analysis will be employed to group students based on
(1) their self-reported SRL skills and (2) their behavioral patterns as reflected in the trace
data, in order to identify how well their self-perceptions align with their actual learning
behaviors. The resulting clusters will then be compared to examine correlations between subjective
and objective measures of SRL. This will help reveal whether students with strong self-reported
SRL skills also demonstrate strong behavioral evidence of self-regulation, or whether there are
discrepancies between the two, providing insights into the alignment (or misalignment) between
students’ perceived and actual learning strategies.

5. Conclusion and contribution

Driven by the need to bridge the gap between research on language education, foreign language
teaching insights, and real-life classroom usage, this article presented the development of a
pedagogically grounded ICALL system that provides various learning supports for L2 reading
comprehension, together with an overview of ongoing and planned studies that focus on different
aspects of the ICALL system, including the examination of learners’ behaviors through interaction
logs. The results of this project will provide AIED researchers and language educators with an
interdisciplinary perspective and further insights into the feasibility and capabilities of using
current NLP and AI (LLM) tools in language learning applications, and will inform system and task
design decisions for enhancing learning outcomes. Apart from the research plans and studies
outlined in this article, future directions include examining the feasibility of leveraging an LLM
to generate short-answer questions and feedback, the classification accuracy of annotations on
language means, and the efficacy of different feedback types for students with different levels
of SRL skills.
 Macro-level SRL process   Micro-level SRL process     Description
 Planning                  Task Analysis               To get familiar with the learning context and the definition
                                                       and requirements of a (learning) task at hand
                           Goal Setting                To explicitly set, define, or update learning goals
                           Making Personal Plans       To create plans and select strategies for achieving a set
                                                       learning goal
 Engagement                Working on the Task         To consistently engage with a learning task, using tactics
                                                       and strategies
                           Applying Strategy Changes   To revise learning strategies, or apply a change in tactics
 Evaluation & Reflection   Evaluation                  Evaluating one’s learning process and comparing one’s work
                                                       with the goal
                           Reflection                  Reflecting on individual learning and sharing learning
                                                       experiences
    Table 1
    Theoretical framework guiding trace-based measurement of SRL processes, articulated in [32]
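
To make the idea of trace-based measurement more concrete, the following minimal sketch counts logged actions per student and micro-level SRL process. The records are simplified, xAPI-like dictionaries (actor, verb, object, timestamp). The verb and object names, the `ACTION_TO_PROCESS` mapping, and the deadline rule are hypothetical illustrations and are not taken from the ARES implementation.

```python
from collections import Counter
from datetime import datetime

# Hypothetical mapping from (verb, object) to a micro-level SRL process.
# Visits to an assignment are handled separately because they depend on the deadline.
ACTION_TO_PROCESS = {
    ("viewed", "assignment-overview"): "Task Analysis",
    ("viewed", "tutorial-video"): "Task Analysis",
    ("accessed", "vocab-help"): "Applying Strategy Changes",
    ("accessed", "language-means-help"): "Applying Strategy Changes",
}


def classify(statement, deadline):
    """Return the micro-level SRL process implied by one logged action, or None."""
    key = (statement["verb"], statement["object"])
    if key == ("visited", "assignment"):
        # Assumption: visits up to the deadline count as engagement,
        # revisits after the deadline count as evaluation.
        when = datetime.fromisoformat(statement["timestamp"])
        return "Working on the Task" if when <= deadline else "Evaluation"
    return ACTION_TO_PROCESS.get(key)  # None for actions outside the mapping


def count_processes(statements, deadline):
    """Count actions per (student, micro-level SRL process)."""
    counts = Counter()
    for statement in statements:
        process = classify(statement, deadline)
        if process is not None:
            counts[(statement["actor"], process)] += 1
    return counts


# Made-up log entries for two students and one assignment deadline.
deadline = datetime.fromisoformat("2024-11-08T23:59:00")
log = [
    {"actor": "s01", "verb": "viewed", "object": "assignment-overview", "timestamp": "2024-11-04T16:00:00"},
    {"actor": "s01", "verb": "visited", "object": "assignment", "timestamp": "2024-11-05T17:30:00"},
    {"actor": "s01", "verb": "accessed", "object": "vocab-help", "timestamp": "2024-11-05T17:35:00"},
    {"actor": "s01", "verb": "visited", "object": "assignment", "timestamp": "2024-11-10T09:00:00"},
    {"actor": "s02", "verb": "visited", "object": "assignment", "timestamp": "2024-11-08T20:00:00"},
    {"actor": "s02", "verb": "logged-in", "object": "system", "timestamp": "2024-11-08T19:59:00"},
]
counts = count_processes(log, deadline)
for (student, process), n in sorted(counts.items()):
    print(student, process, n)
```

Keeping the mapping in a plain dictionary separates the theoretical decision (which action signals which process) from the counting logic, so the mapping can be revised with teachers or researchers without touching the code that reads the logs.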

 Macro-level SRL process   Micro-level SRL process     Behavioral variables
 Planning                  Task Analysis               Total number of visits to an assignment overview; Total
                                                       number of visits to a tutorial video; Sum of total time
                                                       spent on a tutorial video
 Engagement                Working on the Task         Total number of visits to assignments before deadline; Sum
                                                       of total time spent on assignments before deadline; Total
                                                       number of visits to reading questions before deadline;
                                                       Total number of assignments completed; Total number of
                                                       correct responses; Total number of incorrect responses
                           Applying Strategy Changes   Total number of accesses to vocab help; Total number of
                                                       accesses to language means help
 Evaluation & Reflection   Evaluation                  Total number of visits to assignments after deadline; Sum
                                                       of total time spent on assignments after deadline; Total
                                                       number of visits to feedback on correct responses after
                                                       deadline; Total number of visits to feedback on incorrect
                                                       responses after deadline; Total number of visits to target
                                                       answers after deadline
                           Reflection                  Total number of visits to class average score after
                                                       deadline; Total number of visits to own score after
                                                       deadline
    Table 2
    Matching map between the SRL processes and learning behavior data in the system
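
The planned comparison of self-report clusters and behavior clusters can be illustrated with a small, dependency-free sketch. It standardizes the features, runs plain K-means (Lloyd's algorithm) separately on a self-report feature and on two behavioral features, and cross-tabulates the two cluster assignments. The student data, the choice of features, k = 2, and the deterministic initialization with the first k points are all illustrative assumptions; an actual analysis would more likely rely on an established library implementation with several random restarts.

```python
import math
import statistics
from collections import Counter


def standardize(rows):
    """Scale every column to mean 0 and population SD 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    sds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard against SD = 0
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; the first k points are the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [statistics.fmean(dim) for dim in zip(*members)]
    return labels, centroids


# Made-up data: (mean OSLQ score, visits before deadline, help accesses).
students = {
    "s01": (6.1, 14, 9),
    "s02": (5.8, 12, 8),
    "s03": (2.9, 3, 1),
    "s04": (3.2, 4, 0),
    "s05": (6.0, 5, 1),
}
ids = list(students)
self_report = standardize([[v[0]] for v in students.values()])
behavior = standardize([list(v[1:]) for v in students.values()])

self_labels, _ = kmeans(self_report, k=2)
behavior_labels, _ = kmeans(behavior, k=2)

# Cross-tabulate: off-diagonal cells point to a possible mismatch between
# perceived and observed self-regulation.
crosstab = Counter(zip(self_labels, behavior_labels))
for sid, a, b in zip(ids, self_labels, behavior_labels):
    print(sid, "self-report cluster:", a, "behavior cluster:", b)
```

In this toy data, student s05 lands in the same self-report cluster as s01 and s02 but in the same behavior cluster as s03 and s04, which is the kind of discrepancy the analysis is meant to surface. Cluster indices are arbitrary, so in general the two solutions should be matched by inspecting their centroids rather than by comparing label numbers.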


Acknowledgments

This PhD project is funded by the German Ministry of Education and Science (BMBF) under the
funding number 01IS22076.

References

 [1] L. Verhoeven, Second language reading acquisition, in: M. L. Kamil, P. D. Pearson, E. B. Moje,
     P. Afflerbach (Eds.), Handbook of reading research, volume 4, Routledge, New York, NY, 2011,
     pp. 661–683. doi:10.4324/9780203840412.ch27.
 [2] L. A. Amaral, D. Meurers, On using intelligent computer-assisted language learning in
     real-life foreign language teaching and learning, ReCALL 23 (2011) 4–24.
     doi:10.1017/S0958344010000261.
 [3] C.-C. Liu, P.-C. Wang, S.-J. D. Tai, An analysis of student engagement patterns in language
     learning facilitated by Web 2.0 technologies, ReCALL 28 (2016) 104–122.
     doi:10.1017/s095834401600001x.
 [4] A. Oberg, P. Daniels, Analysis of the effect a student-centred mobile learning instructional
     method has on language acquisition, Computer Assisted Language Learning 26 (2013) 177–196.
     doi:10.1080/09588221.2011.649484.
 [5] H. Ai, Providing graduated corrective feedback in an intelligent computer-assisted language
     learning environment, ReCALL 29 (2017) 313–334. doi:10.1017/s095834401700012x.
 [6] I.-C. Choi, Efficacy of an ICALL tutoring system and process-oriented corrective feedback,
     Computer Assisted Language Learning 29 (2016) 334–364. doi:10.1080/09588221.2014.960941.
 [7] M. Heilman, K. Collins-Thompson, J. Callan, M. Eskenazi, A. Juffs, L. Wilson, Personalization
     of reading passages improves vocabulary acquisition, International Journal of Artificial
     Intelligence in Education 20 (2010) 73–98. doi:10.3233/JAI-2010-0003.
 [8] A. Kate, Three reasons to be skeptical of artificial intelligence in schools, 2020. URL:
     https://www.edweek.org/technology/three-reasons-to-be-skeptical-of-artificial-intelligence-in-schools/2020/02.
 [9] X. Chen, E. Bear, B. Hui, H. Santhi-Ponnusamy, D. Meurers, Education theories and AI
     affordances: Design and implementation of an intelligent computer-assisted language learning
     system, in: M. Rodrigo, N. Matsuda, A. Cristea, V. Dimitrova (Eds.), Artificial Intelligence
     in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and
     Innovation Tracks, Practitioners’ and Doctoral Consortium. AIED 2022., Lecture Notes in
     Computer Science, Springer, Cham, 2022, pp. 582–585. doi:10.1007/978-3-031-11647-6_120.
[10] J. Jung, Second language reading and the role of grammar, Working Papers in TESOL and Applied
     Linguistics 9 (2009) 29–48. doi:10.7916/D88915FW.
[11] W. Kintsch, The role of knowledge in discourse comprehension: A construction-integration
     model, Psychological Review 95 (1988) 163–182. doi:10.1037/0033-295x.95.2.163.
[12] W. Kintsch, T. A. van Dijk, Toward a model of text comprehension and production,
     Psychological Review 85 (1978) 363–394. doi:10.1037/0033-295X.85.5.363.
[13] K. Koda, Reading and language learning: Crosslinguistic constraints on second language
     reading development, Language Learning 57 (2007) 1–44. doi:10.1111/j.1467-9922.2007.00411.x.
[14] Y. Choi, D. Zhang, The relative role of vocabulary and grammatical knowledge in L2 reading
     comprehension: A systematic review of literature, International Review of Applied Linguistics
     in Language Teaching 59 (2021) 1–30. doi:10.1515/iral-2017-0033.
[15] H. Chen, H. Mei, How vocabulary knowledge and grammar knowledge influence L2 reading
     comprehension: a finer-grained perspective, European Journal of Psychology of Education
     (2024) 1–23. doi:10.1007/s10212-024-00793-x.
[16] D. Meurers, R. Ziai, L. Amaral, A. Boyd, A. Dimitrov, V. Metcalf, N. Ott, Enhancing authentic
     web pages for language learners, in: Proceedings of the NAACL HLT 2010 Fifth Workshop on
     Innovative Use of NLP for Building Educational Applications, BEA 2010, Association for
     Computational Linguistics, Los Angeles, California, 2010, pp. 10–18.
[17] S. Walker, P. Schloss, C. R. Fletcher, C. A. Vogel, R. C. Walker, Visual-syntactic text
     formatting: A new method to enhance online reading, Reading Online 8 (2005).
[18] J. Hattie, H. Timperley, The power of feedback, Review of Educational Research 77 (2007)
     81–112. doi:10.3102/003465430298487.
[19] V. J. Shute, Focus on formative feedback, Review of Educational Research 78 (2008) 153–189.
     doi:10.3102/0034654307313795.
[20] C. J. Fong, E. A. Patall, A. C. Vasquez, S. Stautberg, A meta-analysis of negative feedback
     on intrinsic motivation, Educational Psychology Review 31 (2019) 121–162.
     doi:10.1007/s10648-018-9446-6.
[21] E. K. Swart, T. M. Nielen, M. T. S. de Jong, Does feedback targeting text comprehension
     trigger the use of reading strategies or changes in readers’ attitudes? A meta-analysis,
     Journal of Research in Reading 45 (2022) 171–188. doi:10.1111/1467-9817.12389.
[22] M. ter Beek, L. Brummer, A. S. Donker, M.-C. J. Opdenakker, Supporting secondary school
     students’ reading comprehension in computer environments: A systematic review, Journal of
     Computer Assisted Learning 34 (2018) 557–566. doi:10.1111/jcal.12260.
[23] D. Meurers, K. D. Kuthy, F. Nuxoll, B. Rudzewitz, R. Ziai, Scaling up intervention studies to
     investigate real-life foreign language learning in school, Annual Review of Applied
     Linguistics 39 (2019) 161–188. doi:10.1017/S0267190519000126.
[24] P. R. Pintrich, The role of goal orientation in self-regulated learning, in: M. Boekaerts,
     P. R. Pintrich, M. Zeidner (Eds.), Handbook of self-regulation, volume 4, Academic Press,
     San Diego, 2000, pp. 451–502. doi:10.1016/B978-012109890-2/50043-3.
[25] B. J. Zimmerman, Becoming a self-regulated learner: Which are the key subprocesses?,
     Contemporary Educational Psychology 11 (1986) 307–313. doi:10.1016/0361-476X(86)90027-5.
[26] H. Li, Z. Gan, Reading motivation, self-regulated reading strategies and English vocabulary
     knowledge: Which most predicted students’ English reading comprehension?, Frontiers in
     Psychology 13 (2022) 1041870. doi:10.3389/fpsyg.2022.1041870.
[27] D. H. Schunk, B. J. Zimmerman, Motivation and self-regulated learning: Theory, research, and
     applications, Lawrence Erlbaum Associates Publishers, Mahwah, 2008.
[28] R. S. Jansen, A. van Leeuwen, J. Janssen, R. Conijn, L. Kester, Supporting learners’
     self-regulated learning in massive open online courses, Computers & Education 146 (2020)
     103771. doi:10.1016/j.compedu.2019.103771.
[29] S.-H. Jin, K. Im, M. Yoo, I. Roll, K. Seo, Supporting students’ self-regulated learning in
     online learning using artificial intelligence applications, International Journal of
     Educational Technology in Higher Education 20 (2023) 37. doi:10.1186/s41239-023-00406-5.
[30] M. L. Bernacki, J. A. Greene, H. Crompton, Mobile technology, learning, and achievement:
     Advances in understanding and measuring the role of mobile technology in education,
     Contemporary Educational Psychology 60 (2019) 101827. doi:10.1016/j.cedpsych.2019.101827.
[31] T. Adeyinka, S. Mutula, A proposed model for evaluating the success of WebCT course content
     management system, Computers in Human Behavior 26 (2010) 1795–1805.
[32] M. Siadaty, D. Gašević, M. Hatala, Trace-based micro-analytic measurement of self-regulated
     learning processes, Journal of Learning Analytics 3 (2016) 183–214.
     doi:10.18608/jla.2016.31.11.
[33] P. Robinson, A. Mackey, S. Gass, R. Schmidt, Attention and awareness in second language
     acquisition, in: S. Gass, A. Mackey (Eds.), The Routledge handbook of second language
     acquisition, Routledge, New York, NY, 2012, pp. 247–267.
[34] D. Gašević, S. Dawson, G. Siemens, Let’s not forget: Learning analytics are about learning,
     TechTrends 59 (2015) 64–71.
[35] G. Siemens, R. Baker, Learning analytics and educational data mining: Towards communication
     and collaboration, in: Proceedings of the 2nd international conference on learning analytics
     and knowledge, Association for Computing Machinery, New York, NY, 2012, pp. 252–254.
     doi:10.1145/2330601.2330661.
[36] G. Siemens, R. Baker, Student learning in higher education: A commentary, Educational
     Psychology Review 29 (2017) 353–362. doi:10.1007/s10648-017-9410-x.
[37] I. Roll, P. H. Winne, Understanding, evaluating, and supporting self-regulated learning using
     learning analytics, Journal of Learning Analytics 2 (2015) 7–12.
     doi:10.1007/s10648-017-9410-x.
[38] Q. Yu, Y. Zhao, The value and practice of learning analytics in computer assisted language
     learning, Studies in Literature and Language 10 (2015) 90.
[39] M. Raković, S. Iqbal, T. Li, Y. Fan, S. Singh, S. Surendrannair, J. Kilgour, J. van der
     Graaf, L. Lim, I. Molenaar, M. Bannert, J. Moore, D. Gašević, Harnessing the potential of
     trace data and linguistic analysis to predict learner performance in a multi-text writing
     task, Journal of Computer Assisted Learning 39 (2023) 703–718.
[40] Q. Li, R. Baker, M. Warschauer, Using clickstream data to measure, understand, and support
      self-regulated learning in online courses, The Internet and Higher Education 45 (2020)
      100727.
[41] Q. Yu, Learning analytics: The next frontier for computer assisted language learning in big
      data age, in: SHS Web of Conferences, volume 17, EDP Sciences, 2015, p. 02013.
      doi:10.1051/shsconf/20151702013.
[42] E. Farley-Ripple, H. May, A. Karpyn, K. Tilley, K. McDonough, Rethinking connections between
      research and practice in education: A conceptual framework, Educational Researcher 47
      (2018) 235–245. doi:10.3102/0013189X18761042.
[43] S. Kemmis, R. McTaggart, The Action Research Planner, 3rd ed., Deakin University Press,
      Geelong, 1988.
[44] R. N. Rapoport, Three dilemmas of action research, Human Relations 23 (1970) 499–513.
      doi:10.1177/001872677002300601.
[45] M. Lee, B. Rudzewitz, X. Chen, Developing a pedagogically oriented interactive reading tool
      with teachers in the loop, in: Proceedings of the 13th Workshop on Natural Language
      Processing for Computer Assisted Language Learning (NLP4CALL 2024), 2024. In press.
[46] Council of Europe, Common European Framework of Reference for Languages: Learning, teaching,
      assessment – Companion Volume, Cambridge University Press, Cambridge, 2020.
[47] J. W. M. Lai, J. D. Nobile, M. Bower, Y. Breyer, Comprehensive evaluation of the use of
      technology in education – validation with a cohort of global open online learners,
      Education and Information Technologies 27 (2022) 9877–9911.
      doi:10.1007/s10639-022-10986-w.
[48] L. Barnard, Y. W. Lan, M. Y. To, V. O. Paton, S.-L. Lai, Measuring self-regulation in online
      and blended learning environments, The Internet and Higher Education 12 (2009) 1–6.
[49] R. Edwards, J. Holland, What is qualitative interviewing?, Bloomsbury Academic, London,
      2013.
[50] V. Braun, V. Clarke, Thematic analysis: A practical
      guide, Sage Publications, Los Angeles, 2022.
[51] S. Webb, Y. Sasao, O. Ballance, The updated Vocabulary Levels Test: Developing and
      validating two new forms of the VLT, ITL - International Journal of Applied Linguistics
      168 (2017) 34–70. doi:10.1075/itl.168.1.02web.
[52] A. Godfroid, K. M. Kim, The contributions of implicit-statistical learning aptitude to
      implicit second-language knowledge, Studies in Second Language Acquisition 43 (2021)
      606–634. doi:10.1017/S0272263121000085.
[53] W. Wang, Z. Gan, Development and validation of the reading motivation questionnaire in an
      English as a foreign language context, Psychology in the Schools 58 (2021) 1151–1168.
      doi:10.1002/pits.22494.
[54] U. Kroehne, F. Goldhammer, How to conceptualize, represent, and analyze log data from
      technology-based assessments? A generic framework and an application to questionnaire
      items, Behaviormetrika 45 (2018) 527–563. doi:10.1007/s41237-018-0063-y.