<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Corpus Methods and Textual Visualization To Enhance Learning in Core Writing Courses</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>David Kaufer</string-name>
          <email>kaufer@andrew.cmu.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Suguru Ishizaki</string-name>
          <email>suguru@cmu.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Carnegie Mellon University</institution>
          ,
          <addr-line>5000 Forbes Ave., Pittsburgh, PA 15213, +1 412-268-1074</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Carnegie Mellon University</institution>
          ,
          <addr-line>5000 Forbes Ave., Pittsburgh, PA 15213, +1 412-268-4013</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Writing tasks require countless composing decisions that are typically beyond the conscious grasp of writers. Much of the skill of being “text-aware” inheres in understanding that texts produced from classroom assignments are composed not just of words and sentences but of highly structured and often highly predictive composing decisions. However, the decision-making underlying writing is an extremely abstract idea that is hard to make tangible for students. Although a significant number of pedagogical approaches have been investigated in the past three decades, the means to help students acquire a more tangible understanding and control of their composing decisions have not been addressed. We propose to address this gap by developing a corpus-based learning tool that helps students notice and reflect on composition decisions in their writing and, as a result, become more self-aware and reflective writers. This approach builds on an existing corpus-based text analysis tool called DocuScope, which was used successfully for these purposes in a graduate pilot course for over a decade. The goal of this project is to extend this approach to support the core writing courses at our university.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Writing tasks require countless composing decisions that are
typically beyond the conscious grasp of writers. Much of the skill
of being “text-aware” inheres in understanding that texts produced
from classroom assignments are not just composed of words and
sentences but of highly structured and often highly predictive
composing decisions. A fundamental goal of Carnegie Mellon’s
core writing courses is to help students develop this textual
awareness so that they are able to make appropriate compositional
decisions for different text types. Unfortunately, the
decision-making underlying writing is an extremely abstract notion and
hard to make tangible for students. Although various pedagogical
approaches have been investigated over the past 30+ years,
none has succeeded in making that decision-making
tangible.</p>
      <p>The goal of our project is to develop a suite of corpus-based
learning tools that will help students notice hidden structures and
composing decisions in writing, and become more self-aware and
reflective writers.</p>
    </sec>
    <sec id="sec-2">
      <title>2. OUR APPROACH</title>
      <p>
        Our approach builds on a graduate-level writing course developed
and taught by Kaufer over a decade, in collaboration with
Ishizaki. In the course, students used DocuScope [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]—a
dictionary-based tool for rhetorical text analysis with a suite of
tools for interactive visualization—that allowed students to
visualize differences in the rhetorical strategies underlying their
drafts and across the different genres they were assigned to write.
      </p>
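      <p>As a rough illustration of what dictionary-based tagging involves, the following Python sketch matches word strings against a small dictionary using longest-match-first lookup and then counts categories per text. The dictionary entries, the category names, and the function names are hypothetical placeholders chosen for this example; they are not DocuScope's actual dictionaries or matching algorithm.</p>
      <preformat>
```python
from collections import Counter

# Hypothetical toy dictionary: word strings (as tuples) mapped to categories.
# These entries are illustrative only and are not taken from DocuScope.
TOY_DICTIONARY = {
    ("suggested",): "Facilitating",
    ("i", "believe"): "FirstPersonStance",
    ("in", "the", "past"): "NarrativeTime",
    ("however",): "Contrast",
}

# Length of the longest word string in the dictionary.
MAX_LEN = max(len(key) for key in TOY_DICTIONARY)


def tokenize(text):
    """Lowercase the text and split it into words, stripping punctuation."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w]


def tag(text):
    """Return (word string, category) pairs using longest-match-first lookup."""
    tokens = tokenize(text)
    tagged = []
    i = 0
    while i != len(tokens):
        match = None
        # Try the longest candidate string first, then shorter ones.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in TOY_DICTIONARY:
                match = (" ".join(candidate), TOY_DICTIONARY[candidate])
                i += n
                break
        if match is None:
            match = (tokens[i], None)  # word not covered by the dictionary
            i += 1
        tagged.append(match)
    return tagged


def category_counts(text):
    """Count how often each category occurs in the text."""
    return Counter(cat for _, cat in tag(text) if cat is not None)
```
      </preformat>
      <p>Calling category_counts on each draft would yield one count profile per text; profiles of this kind are the sort of data that a visualization could then compare across drafts or genres.</p>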
      <p>DocuScope transformed the writing classroom into a design
studio–like environment for writing, where—unlike a typical
writing course—students could compare their writing at a glance
as if they were comparing posters on a wall (Figure 1).</p>
      <p>DocuScope then allowed students to select a specific piece of writing
and view how certain rhetorical strategies are implemented in terms
of composing decisions (Figure 2).</p>
      <p>We informally observed that the visualizations helped enhance
students’ awareness of (a) their composing decisions and (b) the
relationship of their decision-making to their writing context and
the genre of text they were seeking to produce. Although we have
no definitive understanding of how this works, we suspect that
allowing students to see their composing decisions visualized
after the fact creates grounded evidence for claiming ownership of
those decisions and using those decisions to explain their situated
goals of composing with sharpened clarity.</p>
      <p>In our current project, our goal is to extend the use of DocuScope
to a much larger scale by embedding it in a freshman-writing
course and a popular professional writing course. Each student
will receive feedback based on a text analysis that compares his or her
writing with, and situates it against, historical student data.</p>
      <p>Students of any cohort on any assignment will be able to compare
their writing against a historical cohort writing on the same
assignment.</p>
      <p>More specifically, we are developing a tool for automatically
generating visual reports that highlight salient structures and
composition decisions in the students’ own writing in relation to
the historical data as well as writing by other students in class. We
hypothesize that enhancing students’ awareness of their low-level
composition choices can enhance their overall metacognitive
awareness as writers.</p>
      <p>Figure caption, right panel: Single-Text Visualization (STV). In this screenshot, we see how a student writer or teacher can drill down from MTV and
see how DocuScope categories tag individual words and word strings. A number of categories are highlighted. Notice how the word
"suggested" is tied to the facilitating category through color-coding. To suggest something is to help another facilitate action.</p>
    </sec>
    <sec id="sec-3">
      <title>3. CHALLENGES</title>
      <p>
        While the course taught by Kaufer was successful [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], the text
analysis tool was not fully automated. Running DocuScope
therefore required a manual process that had to be handled by the
instructor (Kaufer). This original context worked as well as it did
because (1) the instructor was extremely familiar with the tool and
(2) he was able to assist students in interpreting the analysis.
In order to scale the use of this environment to core writing
courses with many sections and different instructors, we must
make it highly user-friendly and capable of presenting results
clearly to readers who are not writing experts, i.e., students. Accordingly, we are
currently addressing the following specific research questions.
      </p>
      <sec id="sec-3-1">
        <title>What are optimal ways to integrate automated reporting into undergraduate writing instruction?</title>
        <p>We are exploring how these reports can be integrated
meaningfully for students in our core writing classes.
We are also examining the extent to which these reports
can improve students’ understanding of
structures and composition decisions in their own
writing.</p>
      </sec>
      <sec id="sec-3-2">
        <title>What are the optimal statistical methods for uncovering the most salient composing choices from data generated by DocuScope?</title>
        <p>In order to fully automate the
analysis and report generation, we are exploring
statistical methods for uncovering salient features in a
student’s writing.</p>
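        <p>One simple way to think about salience, offered here only as an illustrative assumption and not as the method this project has adopted, is to standardize a student's per-category rate against the mean and standard deviation of the same rate in a historical cohort, and to flag categories with large standardized differences. In the sketch below, the function name, the data layout, and the cutoff of 2.0 are all hypothetical.</p>
        <preformat>
```python
from statistics import mean, stdev


def salient_categories(student_rates, cohort_rates, threshold=2.0):
    """Flag categories where one text deviates strongly from a cohort.

    student_rates: dict mapping category to a rate (e.g. per 100 words).
    cohort_rates: list of such dicts, one per historical text.
    threshold: hypothetical cutoff on the absolute z-score.
    """
    flagged = {}
    for category, rate in student_rates.items():
        # Texts that never use a category contribute a rate of zero.
        history = [text.get(category, 0.0) for text in cohort_rates]
        # A sample standard deviation needs at least two texts, and a
        # z-score is undefined when the cohort shows no variation.
        if len(history) >= 2 and stdev(history) > 0:
            z = (rate - mean(history)) / stdev(history)
            if abs(z) >= threshold:
                flagged[category] = round(z, 2)
    # Sort so that the largest deviations come first.
    return dict(sorted(flagged.items(), key=lambda kv: -abs(kv[1])))
```
        </preformat>
        <p>A per-category z-score treats categories independently and assumes roughly symmetric distributions; whether that is adequate for rhetorical category data is exactly the kind of question a more careful statistical analysis would need to settle.</p>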
      </sec>
      <sec id="sec-3-3">
        <title>What are optimal ways to visualize the results of statistical analysis?</title>
        <p>We are exploring optimal ways in which students’ composing decisions can be visualized.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. DEMO</title>
      <p>In this demonstration, we will provide an overview of the
technology we have developed so far, including the tool for mining
the corpus and the visualizations (i.e., reports) we are experimenting
with to provide feedback to students.</p>
      <p>We are currently working with a team of statistics professors and
students to help us answer some of these research questions. By the time of
the workshop, we should have more concrete results about which kinds of
visual feedback help students. We will also discuss our pedagogical
philosophy for the way students can productively use this
feedback, as well as some of the challenges of getting this
ambitious project off the ground.</p>
    </sec>
    <sec id="sec-5">
      <title>5. ACKNOWLEDGMENTS</title>
      <p>Our thanks to Danielle Wetzel, Necia Werner, Xizhen Cai, Ann
Lee, Joel Greenhouse, Arianna Garofalo, Chushan Chen and
Binghui Ouyang for vital help on this project.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Ishizaki</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kaufer</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Computer-aided rhetorical analysis</article-title>
          . In P. McCarthy &amp;
          <string-name>
            <given-names>C.</given-names>
            <surname>Boonthum</surname>
          </string-name>
          (Eds.),
          <source>Applied Natural Language Processing and Content Analysis: Advances in Identification, Investigation, and Resolution</source>
          . Hershey, PA: IGI Global.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Kaufer</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geisler</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vlachos</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ishizaki</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2006</year>
          ).
          <article-title>Mining textual knowledge for writing education and research</article-title>
          . In L. Van Waes,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leijten</surname>
          </string-name>
          , &amp; C. Neuwirth (Eds.),
          <source>Writing and Digital Media</source>
          (pp.
          <fpage>115</fpage>
          -
          <lpage>130</lpage>
          ). Oxford, UK: Elsevier Science.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>David</given-names>
            <surname>Kaufer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Suguru</given-names>
            <surname>Ishizaki</surname>
          </string-name>
          , Jeff Collins, and Pantelis Vlachos (
          <year>2004</year>
          ).
          <article-title>Teaching Language Awareness in Rhetorical Choice Using IText and Visualization in Classroom Genre Assignments</article-title>
          .
          <source>Journal of Business and Technical Communication</source>
          ,
          <volume>18</volume>
          :3
          <fpage>361</fpage>
          -
          <lpage>40</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>