<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>PASTEL: Evidence-based learning engineering method to create intelligent online textbook at scale</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Noboru Matsuda</string-name>
          <email>Noboru.Matsuda@ncsu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Machi Shimmei</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Educational Informatics Department of Computer Science North Carolina State University</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>An extension of online courseware with macro-adaptive scaffolding, called CyberBook, is proposed. The macro-adaptive scaffolding includes (1) a dynamic control for the amount of formative assessments, (2) a just-in-time navigation to a direct instruction for formative assessment items that a student failed to answer correctly, and (3) embedded cognitive tutors that provide individual practice on solving problems. The paper also proposes two learning-engineering methods to effectively create a CyberBook: a web-browser-based authoring tool for cognitive tutors and a text-mining application for automated skill-model discovery and annotation. A classroom evaluation study to measure the effectiveness of CyberBook was conducted for two subjects: middle school science (Newton's Law) and high school math (Coordinate Geometry). The results show that students who used the fully functional Science CyberBook outperformed those who used a version of CyberBook with the macro-adaptive scaffolding turned off. However, the same effect was not observed for the Math CyberBook. For both subjects, students using CyberBook with the macro-adaptive scaffolding answered fewer formative assessments due to the dynamic control. Further data analysis revealed that those who asked for more hints on the formative assessments achieved higher scores on the post-test than students who asked for fewer hints. The effect of hint usage was more prominent for students with low prior competency.</p>
      </abstract>
      <kwd-group>
        <kwd>Online Courseware</kwd>
        <kwd>Macro-adaptive Scaffolding</kwd>
        <kwd>Learning Engineering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        One of the challenges in current online textbooks is a lack of the individual support
that students need to learn effectively. For example, students benefit from scaffolding
tailored to their competency. Without an embedded student model, however, online
textbooks impose excessive training—i.e., all students are exposed to a fixed amount of
assessments regardless of their competencies, which
severely impacts students’ learning [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Studies show that excessive training
decreases students’ motivation and causes early course termination [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6">3-6</xref>
        ].
      </p>
      <p>A technological innovation that drives individualized scaffolding in large-scale online
textbooks, and that can be plugged in to existing online-course platforms, is therefore
critically needed. Without such technology, online textbooks will not realize their full
potential to impact a large body of students’ learning.</p>
      <p>
        We hypothesize that the lessons learned from long-standing research on intelligent
tutoring systems, in particular the skill-model-based pedagogy, will apply to the
large-scale online textbook. The skill-model-based pedagogy requires a skill model (aka a
knowledge-component model) that consists of skills, each representing a piece of
knowledge that students ought to learn. Given that each individual instructional content
is tagged with a skill, the system computes a proficiency for each skill for each
individual student to decide an appropriate pedagogical action. Cognitive tutors, for
example, deploy the model-tracing and knowledge-tracing techniques upon a given skill
model to drive micro (flagged feedback and just-in-time hints) and macro (problem
selection) adaptive instruction [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
      </p>
      <p>A major technical challenge in this line of research concerns the scalability of
existing techniques for creating a skill model. A transformative technique to fully
automatically create a skill model and annotate instructional materials on actual online
courseware is desired.</p>
      <p>A primary goal of the current paper is to introduce a platform-agnostic suite of
learning-engineering methods (called PASTEL; Pragmatic methods to develop Adaptive
and Scalable Technologies for next generation E-Learning) that allows courseware
developers to efficiently create a particular type of online courseware called CyberBook.
The CyberBook is an intelligent textbook that provides students with macro-adaptive
pedagogy driven by an embedded skill model and cognitive tutors. As a proof of
concept, we have made two instances of CyberBook, one for middle school science and
another for high school math, and tested their effectiveness with actual students.</p>
    </sec>
    <sec id="sec-2">
      <title>Solutions: CyberBook and PASTEL</title>
      <sec id="sec-2-1">
        <title>CyberBook</title>
        <p>CyberBook is a structured sequence of instructional activities organized into multiple
chapters, sections, and units. CyberBook contains three types of instructional elements:
(1) direct instructions that convey subject matter (e.g., skills and concepts), usually
with written paragraphs and videos; (2) formative assessments, typically in the form of
multiple-choice or fill-in-the-blank questions; and (3) cognitive tutors that provide
mastery practice on solving a particular type of problem. These cognitive tutors are
equipped with hint messages, which are also considered instructional elements.</p>
        <p>On CyberBook, the three types of learning activities may be placed on multiple
pages that compose a “unit.” Multiple units form a “section,” and a collection of
sections forms a “chapter.” Each page has forward/backward navigation,
but students may freely visit any page in any order through a table of contents that is
also available from any page.</p>
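        <p>As an illustration of this hierarchy, the chapter/section/unit/page structure can be sketched as simple nested data types. The field names and example content below are our own assumptions for illustration, not the actual platform schema.</p>
        <preformat>
```python
# Sketch of CyberBook's content hierarchy (chapter -> section -> unit -> page),
# with the three kinds of instructional elements attached to pages.
# Field names and sample content are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Element:
    kind: str      # "direct-instruction" | "formative-assessment" | "cognitive-tutor"
    skill: str     # skill tag assigned by SMART
    content: str

@dataclass
class Page:
    elements: List[Element] = field(default_factory=list)

@dataclass
class Unit:
    pages: List[Page] = field(default_factory=list)

@dataclass
class Section:
    units: List[Unit] = field(default_factory=list)

@dataclass
class Chapter:
    sections: List[Section] = field(default_factory=list)

# A one-chapter example with a single direct-instruction element.
chapter = Chapter(sections=[Section(units=[Unit(pages=[Page(elements=[
    Element("direct-instruction", "newtons-second-law", "F = ma ...")])])])])
```
        </preformat>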
        <p>The current version of CyberBook provides students with two types of
macro-adaptive scaffolding: (a) a dynamic control for the amount of formative assessments
and cognitive tutors, and (b) a just-in-time navigation to the corresponding direct instruction for a
formative assessment that a student failed to answer correctly.</p>
        <p>
          The first type of adaptive scaffolding, the dynamic control for the amount of
assessments and practice, is to determine which formative assessment items and cognitive
tutors should be given to individual students based on their competency. This dynamic
control may reduce the number of unnecessary assessments, i.e., the excessive training.
To judge if an assessment item (either a cognitive tutor or a formative assessment) is
beneficial to a particular student, the system applies the knowledge-tracing technique
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] to compute the probability of an individual student answering the next formative
assessment item correctly. Based on the student model, those assessment items and
cognitive tutors that the student would highly likely (&gt; 0.85) answer/perform correctly
are automatically hidden from the students’ view.
        </p>
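        <p>For illustration, the dynamic control described above can be sketched with a standard Bayesian knowledge-tracing update. The slip, guess, and learn parameter values below are illustrative assumptions; only the 0.85 threshold comes from the description above.</p>
        <preformat>
```python
# Bayesian knowledge tracing (BKT): update P(known) after each observed
# answer, then predict the probability of answering the next item correctly.
# The slip/guess/learn parameter values are illustrative assumptions.

def bkt_update(p_known, correct, slip=0.1, guess=0.2, learn=0.15):
    """One BKT step: posterior mastery given an observation, plus learning."""
    if correct:
        evidence = p_known * (1 - slip) + (1 - p_known) * guess
        posterior = p_known * (1 - slip) / evidence
    else:
        evidence = p_known * slip + (1 - p_known) * (1 - guess)
        posterior = p_known * slip / evidence
    return posterior + (1 - posterior) * learn

def p_correct(p_known, slip=0.1, guess=0.2):
    """Predicted probability of a correct answer on the next item."""
    return p_known * (1 - slip) + (1 - p_known) * guess

# An item is hidden when the predicted probability of success exceeds 0.85.
p = 0.3  # assumed prior mastery
for obs in [True, True, True, True]:  # four correct answers in a row
    p = bkt_update(p, obs)
hide_item = p_correct(p) > 0.85
```
        </preformat>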
        <p>The second type of adaptive scaffolding, the just-in-time navigation, is to provide
students with a link (called a dynamic link) to a corresponding direct instruction for
(and only for) formative assessments and cognitive tutors that they failed to answer
correctly.</p>
        <p>To provide this macro-adaptive scaffolding, all the written instructional elements,
i.e., text paragraphs, assessment items, and cognitive tutors are tagged with skills. This
skill tagging is automatically done by the SMART method introduced in the next section.</p>
        <p>The concept of CyberBook is platform independent; hence, it can be implemented on
any online learning platform. Currently, as a proof of concept, we have prototyped
CyberBook on Open edX (edx.org) and the Open Learning Initiative (Carnegie Mellon
University).</p>
      </sec>
      <sec id="sec-2-2">
        <title>PASTEL</title>
        <p>PASTEL is a collection of learning-engineering methods to efficiently build online
courseware with an embedded skill model and cognitive tutors. In this paper, we
describe two PASTEL methods that were used for the current study: a text-mining
application for automated skill-model discovery (SMART) and a web-browser-based
cognitive tutor authoring tool (WATSON).</p>
        <p>SMART: Skill Model mining with Automated detection of Resemblance among Texts</p>
        <p>SMART is a method for automatic discovery of a skill model from a given set of
instruction texts. The unit of analysis is a “text,” which is either a written paragraph,
question sentences for a single assessment item, or hint messages for a single cognitive
tutor.</p>
        <p>
          SMART first applies the k-means text clustering technique [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] to divide assessment
items into clusters with similar semantic meanings. Prior to clustering and keyword
extraction, all “texts” are distilled by removing punctuation and stopwords—words that
carry little grammatical value (e.g., articles, conjunctions, and prepositions). Distilled
“texts” are split into words (aka tokens). Each tokenized “text” is then converted into a
Term Frequency (TF) vector over the set of all tokens appearing in the given texts
(called the token space). For example, the i-th element of the TF vector for a “text”
shows the frequency with which the i-th token in the token space appears in that “text”
(or zero if the “text” does not contain that token).
        </p>
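        <p>The distillation and TF-vectorization steps above can be sketched as follows; the stopword list and sample “texts” are illustrative assumptions, not the actual preprocessing used by SMART.</p>
        <preformat>
```python
# Minimal sketch of SMART's text distillation and TF-vector construction.
# The stopword list and example "texts" are illustrative assumptions.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "in", "on", "to", "is"}

def distill(text):
    """Lowercase, strip punctuation, drop stopwords, and tokenize."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def tf_vectors(texts):
    """Map each text to a term-frequency vector over the shared token space."""
    tokenized = [distill(t) for t in texts]
    token_space = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[tok] for tok in token_space])
    return token_space, vectors

space, vecs = tf_vectors([
    "A force changes the motion of an object.",
    "What force acts on the object?",
])
```
        </preformat>
        <p>The resulting vectors are what the k-means clustering operates on.</p>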
        <p>
          Our naïve assumption is that if a set of “texts” are all about the same latent skill (e.g.,
paragraphs explaining a concept X and assessment items asking about concept X),
the latent skill can be identified from that set of “texts.” We therefore hypothesize that
assessment items in a particular cluster assess the same particular skill.
Each cluster of assessments is then given a label, which becomes a skill name, by
applying the TextRank keyword-extraction technique [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Finally, each written paragraph
and each hint message (of a cognitive tutor) is paired with the closest cluster, i.e., a skill,
using cosine similarity [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. As a result, the instructional elements on CyberBook
are fully automatically tagged with skills, and a three-way skill mapping among written
paragraphs, assessment items, and cognitive tutors (through their hint messages) is
formed.
        </p>
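        <p>The final pairing step can be sketched as follows; the cluster centroids, skill labels, and TF vectors below are toy values for illustration only.</p>
        <preformat>
```python
# Toy sketch of pairing a "text" with its closest skill cluster by cosine
# similarity between the text's TF vector and each cluster centroid.
# The centroids, labels, and vectors are illustrative assumptions.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if degenerate)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def closest_skill(text_vec, centroids):
    """Return the skill label whose cluster centroid is most similar."""
    return max(centroids, key=lambda skill: cosine(text_vec, centroids[skill]))

centroids = {
    "newtons-second-law": [2.0, 1.0, 0.0],
    "coordinate-plane":   [0.0, 1.0, 2.0],
}
paragraph_vec = [1.0, 1.0, 0.0]  # TF vector of a written paragraph
skill = closest_skill(paragraph_vec, centroids)
```
        </preformat>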
        <p>WATSON: Web-based Authoring Technique for adaptive tutoring System on Online
courseware</p>
        <p>
          WATSON is a web-browser based authoring tool to create cognitive tutors by
demonstration. Fig. 1 shows an example screenshot of WATSON. Cognitive tutors allow
students to practice solving problems while providing the double-looped micro-adaptive
scaffolding—scaffolding between problems (aka, the outer-loop) and within a problem
(aka, the inner-loop) [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. The tutor continuously provides students with problems, while
the students solve a given problem step by step, until they show mastery in solving the
problems. The outer-loop scaffolding uses domain pedagogy to pose a problem to be
solved next that maximizes students’ likelihood of achieving mastery. The
inner-loop scaffolding uses domain knowledge that consists of the immediate flagged
feedback on the correctness of steps performed, and the just-in-time, on-demand hints on
how to perform the next step.
        </p>
        <p>
          The outer-loop is driven by the knowledge-tracing technique [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], whereas the
inner-loop is driven by the model-tracing technique [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Both techniques are task
independent and rely only on a given domain expert model written as a set of production
rules, each of which represents a skill that students ought to learn. This implies
that creating a cognitive tutor reduces to creating a domain expert model, a set of
problems used for tutoring, and a tutoring interface.
        </p>
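        <p>As an illustration of how a production-rule expert model supports model tracing, consider the following sketch for a hypothetical equation-solving task. The rule representation here is our own simplification; the actual system uses Jess production rules learned by SimStudent.</p>
        <preformat>
```python
# Minimal model-tracing sketch: a student step is flagged correct if some
# production rule in the expert model generates the same step from the
# current state. The rule encoding is an illustrative assumption.

def rule_add_constant(state):
    """Skill: move the constant to the other side, e.g. 2x + 3 = 7 -> 2x = 4."""
    x_coef, constant, rhs = state
    return (x_coef, 0, rhs - constant)

def rule_divide_coef(state):
    """Skill: divide both sides by the coefficient, e.g. 2x = 4 -> x = 2."""
    x_coef, constant, rhs = state
    return (1, constant, rhs / x_coef) if x_coef not in (0, 1) else None

EXPERT_MODEL = {"add-constant": rule_add_constant, "divide-coef": rule_divide_coef}

def model_trace(state, student_step):
    """Return the name of a rule that reproduces the student's step, else None."""
    for name, rule in EXPERT_MODEL.items():
        if rule(state) == student_step:
            return name
    return None

# State encodes a*x + b = c as (a, b, c); the student solves 2x + 3 = 7.
matched = model_trace((2, 3, 7), (2, 0, 4))  # flagged correct via "add-constant"
```
        </preformat>
        <p>The matched rule name is the skill credited (or blamed) by the knowledge-tracing outer loop.</p>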
        <p>
          WATSON is built on a third party Cognitive Tutor Authoring Tool (CTAT) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. To
build a cognitive tutor using WATSON, an author first uses CTAT to create a tutoring
interface directly on a web browser. CTAT outputs HTML5 code for the tutoring
interface. WATSON renders the HTML5-based tutoring interface on a web browser with
an additional graphical user interface for the author to interactively create a domain expert
model (Fig. 1-a). To create the domain expert model, the author interactively tutors a
machine learning agent, called SimStudent [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], through the tutoring interface. The
author poses a problem for SimStudent and asks SimStudent to solve the problem (Fig.
1d). SimStudent may attempt to solve the problem by performing one step at a time
(which in this example, corresponds to entering a value in a text box shown on the
tutoring interface). When SimStudent performs a step, it asks the author to provide
feedback on the correctness (Fig. 1-e). The author responds by clicking the [yes/no]
button. When SimStudent gets stuck on performing a step, it asks the author to
demonstrate the next step. The author then performs that step on the tutoring interface.
        </p>
        <p>Through the interactive tutoring, SimStudent produces a set of production rules, each
of which corresponds to a single step reified on the tutoring interface. Each production
rule therefore represents a particular skill that is sufficient to perform a particular step.
The author provides a name of each production rule while tutoring. Those production
rules will be used as a domain expert model for the cognitive tutor with the exact
HTML5 tutoring interface used to tutor SimStudent.</p>
        <p>
          While authoring, a list of skills and problems tutored are displayed (Fig. 1-b and c
respectively). When the author clicks on a skill name on the graph (Fig. 1-b), a current
production rule written in the Jess language [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] is shown on a separate browser tab.
When the author mouses over a name listed in the problem bank (Fig. 1-c), the names of the
skills used to solve the corresponding problem are shown in a pop-up dialogue.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Evaluation Study</title>
      <p>To measure the effectiveness of CyberBook, we conducted an evaluation study at our
partner schools using the instances of CyberBook for middle school science and high
school math. The Science CyberBook had 11 sections (40 units) with 17 videos and 83
formative assessments. There were no cognitive tutors embedded for the Science
CyberBook due to a constraint on the timing of the study and the development cycle.
All of the adaptive scaffolding functionalities mentioned earlier were available. The
Math CyberBook, on the other hand, had 23 sections (26 units) with 179 formative
assessments and 14 cognitive tutors. No video was used for the Math CyberBook,
partly because the in-service math teacher who led the curriculum design believed that
videos would not be necessary if the curriculum had robust and rich instructions and
graphics.</p>
      <sec id="sec-3-1">
        <title>Method</title>
        <p>The school study was a stratified randomized controlled trial with two treatment
conditions—fully functional CyberBook (the Adaptive condition, hereafter) vs. a version
of CyberBook without the macro-adaptive scaffolding (the Non-Adaptive condition).
For Science, two public middle schools in Texas, USA participated with 131 and 34
students in 6 and 2 science classes, respectively. For Math, 143 and 25 students in 5 and
2 math classes from two public high schools in Texas, USA participated. The
study was conducted in their usual science and math class periods as a part of their
business-as-usual classroom activities.</p>
        <p>The school study sessions involved 5 days, one classroom period per day. On Day 1,
all students took a pre-test. For Science, the test lasted for 20 minutes with 18
multiple-choice questions. For Math, the test lasted for 30 minutes with 21 multiple-choice
questions. For both subjects, the test was printed on paper, but students were asked to enter
their answers through an online form.</p>
        <p>After taking the pre-test, students were randomly assigned to one of two conditions
using the stratified randomization based on the pre-test score—i.e., the difference in the
mean pre-test scores between the two conditions was minimized; for
Science, MAdaptive = 0.68±0.23 vs. MNon-Adaptive = 0.70±0.25, t(140) = 0.76, p = 0.45; for Math
MAdaptive = 0.36±0.19 vs. MNon-Adaptive = 0.36±0.19, t(132) = -0.47, p = 0.64.</p>
        <p>On Day 2 through Day 4, students used their assigned version of CyberBook. During
this phase, students worked on CyberBook at their own pace while they were
encouraged to ask questions to teachers, if necessary. Equally, teachers were encouraged to
interact with their students in the same way as they usually do in their classrooms.</p>
        <p>On Day 5, students took the post-test that was isomorphic to the pre-test, i.e., the
same number and types of problems that can be solved by applying the same
knowledge. For post-test items, the difference is in their cover stories and quantities
used.</p>
        <p>Two researchers attended each of the classroom sessions to take field observation
notes and help students overcome any technical issues. Those researchers did not
provide students with any instructional scaffolding (but only encouraged students to ask
their teachers for assistance when needed).</p>
      </sec>
      <sec id="sec-3-2">
        <title>Results</title>
        <p>There were no particular exclusion criteria for participants during the study—all
students were welcome to participate in any part of the study. In the following analysis,
however, we include only those students who took both pre- and post-tests and
attended all three days of intervention. Table 1 shows the number of participants who
took the pre- and post-tests, respectively, and those who attended all three days of
intervention (i.e., Day 2 through Day 4). The table also shows the number of participants who
meet the inclusion criteria. [Table 1 counts, total (Adaptive/Non-Adaptive): 155
(77/78), 148 (76/72), 154 (79/75), 153 (76/77).]</p>
        <p>Test Scores: Table 2 shows mean test scores comparing the two conditions for both
Science and Math. To see if there was an effect of the macro-adaptive scaffolding on
students’ learning, a repeated-measures ANOVA was conducted for each subject
independently, with test score as the dependent variable and test-time (pre vs. post)
and condition (Adaptive vs. Non-Adaptive) as fixed factors.</p>
        <p>For Science, there was an interaction between condition and test-time; F(1,142) =
5.61, p &lt; 0.05. A post-hoc analysis revealed that only the Adaptive condition showed
an increase in the test score from pre- to post-test; for the Adaptive condition:
paired-t(72) = -5.18, p &lt; 0.001, d = 0.46; for the Non-Adaptive condition: paired-t(70) = -1.52, p
= 0.13, d = 0.13. In the science classes, students who used a version of CyberBook with
the macro-adaptive scaffolding outperformed students without adaptive scaffolding on
the post-test.</p>
        <p>For Math, there was a main effect of test-time (F(1,132) = 61.57, p &lt; 0.001), but
condition was not a main effect (F(1,132) = 0.23, p = 0.63). In the math classes,
students’ scores on the test increased from pre- to post-tests equally regardless of whether
the macro-adaptive scaffolding was available.</p>
        <p>Behavior Analysis: To understand why only the Science CyberBook showed the effect
of the macro-adaptive scaffolding, we analyzed the process data showing detailed
interactions between students and the system while they were working on the CyberBook.
The process data contain the clickstream data (including the information about the
assessment items such as problem ID and the skills associated with each problem) and
the correctness of the students’ answers.</p>
        <p>
          We first hypothesized that there was a condition difference in the way students
watched videos vs. answered formative assessments on the Science CyberBook
(there was no video on Math CyberBook). Not surprisingly, there was a notable
condition difference in the number of formative assessments students answered in Science
CyberBook; MAdaptive = 62.2±16.93 vs. MNon-Adaptive = 74.7±12.69, t(133) = -5.04, p &lt;
0.001, d = 0.84. The dynamic control for the amount of problems effectively reduced
the number of formative assessments for the Adaptive students. There was, however, no
statistically reliable relationship between the number of formative assessments
answered and the post-test score when the pre-test was entered as the primary factor to a
regression model; F(1,141) = 3.08, p = 0.08. There was no condition difference in the
number of videos watched either; MAdaptive = 25.6±26.23 vs. MNon-Adaptive = 25.0±28.92,
t(128) = 0.145, p = 0.89. The doer/non-doer effect that predicts that learning by doing
(i.e., working on formative assessments) better facilitates learning than by watching
videos [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] was not present in the current study on the Science CyberBook. As a side
note, for Math, there was no condition difference in the number of formative
assessments students answered; MAdaptive = 55.5±12.79, MNon-Adaptive = 51.0±15.65, p = 0.07, d
= 0.32.
        </p>
        <p>Second, we hypothesized the dynamic link, which was only available for Adaptive
students, effectively facilitated learning on Science CyberBook. This hypothesis was
not supported. To our surprise, the average number of dynamic-link clicks was quite
low; M = 0.4±1.8. It turned out that, in the instances of CyberBook used in the current
study, most of the linked contents were placed on the same page as the assessment item,
at a relatively close distance. The field observation notes collected during classroom
sessions mentioned that students noticed that they could simply scroll up the
page to review related content instead of clicking on the dynamic link.</p>
        <p>
          Third, we explored students’ engagement in learning by doing—i.e., how seriously
students worked on formative assessments. In particular, we investigated whether
Adaptive students worked on multiple-choice questions more seriously than
Non-Adaptive students. On Science CyberBook, about 2/3 of the formative assessments are
multiple-choice questions. The degree of engagement on the multiple-choice question
might have had a significant impact on students’ learning [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. We operationalized the
“seriousness” as the number of choice items submitted before making a correct answer.
Since CyberBook provides the immediate feedback on an answer submission, when
students were not engaged in learning, they might merely try choice items one-by-one
until they see affirmative feedback. We hypothesized that the ratio of choice items
(RCI) submitted before submitting a correct answer on multiple-choice questions is
lower among Adaptive than Non-Adaptive students. This hypothesis was not
supported. There was no condition difference in the average RCI per student; for Science,
MAdaptive = 0.48±0.06 vs. MNon-Adaptive = 0.47±0.07; t(140) = 0.78, p = 0.44. The same
trend was observed for Math; MAdaptive = 0.52±0.08 vs. MNon-adaptive = 0.50±0.07; t(129)
= 0.78, p = 0.20.
        </p>
        <p>Fourth, we investigated the difference in the hint usage between Adaptive and
Non-Adaptive students. In particular, we hypothesized that Adaptive students used hints
more frequently when they failed to answer a formative assessment item correctly. We
operationalize the hint usage on failed assessment items (per student) as the ratio of
assessment items on which a student failed to answer correctly and asked for a hint to
the total number of assessment items that the student failed to answer
correctly—denoted as Hint on Failure Ratio (HFR). This hypothesis was supported only for Science.
When aggregated across all students within each condition, there was a condition
difference on HFR for Science; MAdaptive = 0.32±0.23 vs. MNon-Adaptive = 0.23±0.19; t(138)
= 2.40, p &lt; 0.05, d = 0.40. For Math, the condition difference was weaker; MAdaptive =
0.31±0.26; MNon-Adaptive = 0.24±0.24; t(130) = 1.71; p = 0.09; d = 0.30. However, a
regression analysis did not confirm a correlation between HFR and post-test score when
pre-test was entered to the model as the primary factor; for Science, pre-test was a
significant predictor, F(1,141) = 177.9, p &lt; 0.0001; HFR was not, F(1,141) = 1.28, p =
0.26. The same trend was observed for Math; pre-test F(1,130) = 167.4, p &lt; 0.0001;
HFR F(1,130) = 3.17, p = 0.08.</p>
        <p>A further analysis revealed that the correlation between HFR and post-test score was
negative; for Science, r(142) = -0.34, p &lt; 0.001; for Math, r(131) = -0.39, p &lt; 0.001.
Though this finding was counterintuitive at first, we hypothesized that (1)
students with a low prior competency (measured as pre-test score) needed more hints on
failed assessments—i.e., HFR and pre-test score were negatively correlated, and
(2) pre-test and post-test scores were highly positively correlated as is almost always
the case in school evaluation studies.</p>
        <p>If this hypothesis is true, then we should see a more evident condition difference in
HFR among low-prior students than among high-prior students. This hypothesis was supported,
as shown in Table 3. It was only for Science that condition (Adaptive vs. Non-Adaptive)
was a main effect on HFR for low-prior students. Understanding the exact
reason why Adaptive low-prior students used more hints than Non-Adaptive low-prior
students requires further investigation. We suspect that the dynamic link
(available only for Adaptive students) being located physically close to the hint button might
have had a positive influence on students’ hint usage. Interestingly, Table 3 shows
similar HFR values for low-prior students in Science and Math. Yet, the
lack of statistical significance for Math is arguably due to a larger variance.
a: t(69) = 2.57, p &lt; 0.05; b: t(65) = 1.69, p = 0.10.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Table 3 fragment (Math)</title>
        <p>[Recovered Table 3 values, HFR mean (SD) for Math: Adaptive 0.39 (0.28)b and
0.23 (0.21); Non-Adaptive 0.27 (0.28)b and 0.20 (0.17).]</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>
        It is not entirely clear why the doer/non-doer effect was not confirmed in the current
study; this needs more investigation. A previous study [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] reported that learning by
doing (i.e., answering formative assessments and receiving feedback/hints) is six times
more effective than watching videos and reading texts. One potential hypothesis for why the
doer effect was not shown in the current study is that almost all students might
have worked on a sufficient number of formative assessments, and hence they were all
rather equally doers (hence no correlation between the number of assessments and test
score was observed). The students’ competency might be another factor. In the current
study, the number of assessments is determined based on the student’s competency—
those who have lower competency received more assessments. Therefore, it might not
be surprising to see a negative correlation between the number of formative assessments
answered and learning outcome (pre-test is strongly correlated with post-test after all).
      </p>
      <p>The dynamic link apparently was not functioning as expected in the current study,
anecdotally because students noticed that related contents were just a scroll away.
Unfortunately, there was no logging made for this type of behavior (i.e., scrolling through
pages and reviewing related contents). Therefore, it is technically challenging to
comprehensively evaluate the effect of the dynamic link in the current data. The
improvement of the logging function to track the precise usage of dynamic link is one of the
subjects for future system improvement.</p>
      <p>The effect of the proposed macro-adaptive scaffolding was not replicated between
the Science and Math courseware. The current study is somewhat confounded by
the difference in the availability of videos (available only for Science) and cognitive tutors
(available only for Math). A further, more thorough study is needed to understand when and
how the macro-adaptive scaffolding facilitates students’ learning.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We found that online courseware with macro-adaptive scaffolding, including a
dynamic control of the amount of formative assessments and cognitive tutors,
amplified students’ learning in a middle school science course, but the effect was not
replicated on a high school geometry course. The major differences between these two
instances of courseware include that: (1) only the science courseware contained 17
videos, and (2) only the math courseware contained 14 cognitive tutors. The current data
suggest that it was the use of hints on formative assessment items that students had
failed to answer correctly that correlated with learning outcome, and this effect was
present only among those students who had a low prior competency, measured as the
pre-test score. This effect was only observed for the science course. Understanding why
the same was not the case for the math course requires a further analysis.</p>
      <p>Creating effective online courseware at scale is one of the most pressing challenges
in the current cyberlearning era. The current paper demonstrated the fidelity of
implementation for two learning-engineering methods to build practical online courseware
with the macro-adaptive scaffolding. More studies are needed to understand what
exactly is needed to develop practical learning-engineering methods with a firm impact
on students’ learning with a diverse population.</p>
    </sec>
  </body>
</article>