Writing Analytics, Data Mining, and Writing Studies

Norbert Elliot
New Jersey Institute of Technology
323 Dr. Martin Luther King Jr. Blvd
Newark, NJ 07102
1-856-952-7680
elliot@njit.edu

Katie Walkup
University of South Florida
4202 E. Fowler Ave
Tampa, FL 33620
1-813-922-1492
kwalkup@mail.usf.edu

Joseph Moxley
University of South Florida
4202 E. Fowler Ave
Tampa, FL 33620
1-813-974-2421
mox@usf.edu

ABSTRACT
The primary goal of this workshop is to facilitate a research community around the topic of large-scale data analysis with a particular focus on writing studies, data mining, and analytics. The workshop aims to generate cross-disciplinary research among writing program directors and faculty, computational linguists, and educational measurement specialists.

Keywords
Feedback; foundational measurement issues; My Reviewers; STEM; visual mapping

1. INTRODUCTION
The following passage from Victor Hugo’s Notre-Dame de Paris is quoted in David Jay Bolter’s Writing Space: The Computer, Hypertext, and the History of Writing [1].

Opening the window of his cell, he pointed to the immense church of Notre Dame, which, with its twin towers, stone walls, and monstrous cupola forming a black silhouette against the starry sky, resembled an enormous two-headed sphinx seated in the middle of the city.

The archdeacon pondered the giant edifice for a few moments in silence, then with a sigh he stretched his right hand toward the printed book that lay open on his table and his left hand toward Notre Dame and turned a sad eye from the book to the church. “Alas!” he said, “This will destroy that.”

Bolter begins his seminal work Writing Space: The Computer, Hypertext, and the History of Writing with the above epigraph about the book replacing the church [1]. Bolter uses this idea to parallel the replacement of the printed book with hypertext. As Bolter explains, “The idea and the ideal of the book will change: print will no longer define the organization and presentation of knowledge.”

This workshop fully realizes Bolter's idea that “ceci tuera cela,” or “this will destroy that.” When this idea is applied to Writing Analytics (WA), researchers and practitioners stand at a pivotal point of change. Writing Analytics is going to redefine the teaching and learning space by replacing feedback as teachers and students have always delivered it. The affordances of digital tools mean that machines can process and present knowledge to an extent unimaginable to Bolter, opening a future that seems to have few limits.

Analytic methods such as Latent Semantic Analysis (LSA) and Natural Language Processing (NLP) have enabled researchers in WA to present studies that portend a complex future for the discipline of Writing Studies: a discipline in which humanities scholars collaborate with mathematicians on predictive algorithms, with corpus linguists on linguistic patterns discerned from big data, and with computer scientists on intelligent tutoring systems. WA may eventually replace grading as we know it, but the research area is controversial, especially for researchers in the humanities.

To understand these concerns, it is important to recognize the history of providing machine feedback. While many humanities researchers reject the idea of WA and the use of corpus methods to refine feedback, Tackitt et al. point out that feedback and grading have always been controversial practices [2]. Many have investigated the reliability of instructor evaluation of writing within and without the disciplines [2]. Before these investigations, however, Tackitt et al. write that “the evaluation of student learning through student writing is a modern model made possible through modern means and methods.” With this brief reminder that technologies replace technologies, the workshop leaders can look beyond the controversies of WA.

This workshop will then seek to extend and surmount the current boundaries of WA [3] by attending to the following:

1. Structuring opportunities for students to learn
2. Understanding the cognitive, interpersonal, and intrapersonal constructs, as they emerge within sociocognitive and sociocultural settings, that enable students to recognize and respond to feedback
3. Gaining actionable information about what practices will help students to become better writers in academic and workplace settings

When WA is reconfigured to embrace student learning, we can see that the efforts of researchers and practitioners change the learning space. With interdisciplinary collaboration, we can mediate the constructs that underlie WA as a field of research.

2. PRESENTATIONS
This workshop centers on mapping writing analytics from an interdisciplinary, student-centered perspective. As researchers point out, discussing WA from a disciplinary perspective can distract researchers and practitioners from completing actionable research.

The workshop begins with an activity in mapping writing analytics, led by Joseph Moxley (University of South Florida). The ensuing presentation extends foundational perspectives on the definition of Writing Analytics (WA) to further conceptualize the field. The authors use the metaphor of mapping to understand the tensions and successes navigated by researchers and practitioners and to chart new ways in which this field can benefit the domains of academia, business, and culture.

This interdisciplinary approach allows the audience to reconceptualize the field. From there, Alex Rudniy (Fairleigh Dickinson University) and Norbert Elliot (New Jersey Institute of Technology) explore the use of n-grams in analyzing student and instructor comments within My Reviewers¹, a web-based learning environment. Shown to be informative in a wide variety of applications, n-gram analysis is of interest in determining concept proliferation in topics, purposes, terminologies, and rubrics used in writing courses. As the present study demonstrates, unigram, bigram, trigram, four-gram, and five-gram analytic methods reveal important information about instructor and student use of concepts. This analysis holds the potential to lead to precise and actionable revision behaviors.

David Kaufer and Suguru Ishizaki (Carnegie Mellon University) introduce the concept of textual visualization to enhance learning in core writing courses. These authors use corpus methods to show that writing tasks require countless composing decisions that are typically beyond the conscious grasp of writers. Much of the skill of being “text aware” is to understand that texts produced from classroom assignments are not just composed of words and sentences but of highly structured and often highly predictive composing decisions. However, the decision-making underlying writing is an extremely abstract idea that is hard to make tangible for students. Although a significant number of pedagogical approaches have been investigated in the past three decades, the means to help students acquire a more tangible understanding and control of their composing decisions have not been addressed. The authors propose to address this gap by developing a corpus-based learning tool to help students notice and reflect on composition decisions in their writing and to become more self-aware, reflective writers.

Valerie Ross, Mark Liberman, Lan Ngo, and Rodger LeGrand (University of Pennsylvania) address another kind of reflective writing: peer feedback. The Critical Writing Program at Penn began working with My Reviewers in the Fall of 2013, collaborating with the My Reviewers team at the University of South Florida to develop a portfolio solution. Since then, students have evaluated their peers' portfolios. In turn, instructors use eportfolio tools to evaluate mid-semester and end-of-semester portfolios. As a result, Penn has developed a large corpus of peer reviews and eportfolio reviews. In this study, Ross et al. use a weighted log-odds-ratio, informative Dirichlet prior method (a “bag of words” approach) to analyze student comments and scores posted to My Reviewers, which is designed to collect student writing as well as peers' comments and scores on those drafts. This preliminary study suggests that lower-performing writers might be receiving kinds of feedback generally viewed as counterproductive in the field of writing studies.

After examining the effectiveness of feedback on revision, attitudes toward writing, and student and instructor training and motivation, participants in this workshop will then begin to understand how big data researchers approach corpora of student revisions. Chris Holcomb and Duncan Buell (University of South Carolina) approach First Year Composition as a big data phenomenon by prototyping software to study revision in a large corpus of student papers. The authors address a question central to scholarship in Composition and Rhetoric: What role does revision play in students' writing processes?

Denise Comer (Duke University) closes the day's presentations by recasting the framework for big data and WA research. Comer uses the frame of writing transfer to explore how researchers can transfer strategies, approaches, and knowledge about writing gained from big-data writing analytics to other writing pedagogy contexts. The author will share methods and results from four big-data research projects, which stem from research conducted in a writing-based Massive Open Online Course. Comer will present findings on big data and writing assessment, writing and peer-to-peer interactions, writing and negativity, and peer review and transfer.

This workshop closes on a final collaborative activity, as the participants are asked once again to return to a mind-map of Writing Analytics. Using lessons learned from corpus methods and big data techniques, participants will reconceptualize the field from an interdisciplinary, actionable perspective.

¹ Dr. Joseph Moxley wishes to disclose a potential conflict of interest: while the My Reviewers software is not commercially available, it may become commercially available in the future. Because the data collection methods used in this study demonstrate the viability of My Reviewers, this research study may enhance the commercial value of My Reviewers. Ultimately, USF owns My Reviewers; however, Dr. Moxley possesses the rights to license My Reviewers. Given this potential conflict, Dr. Moxley has filed the necessary USF conflict of interest paperwork. The Conflict of Interest Committee at USF has developed a management plan with which Dr. Moxley has complied prior to submitting this and similar research.

3. REFERENCES
[1] Bolter, D. J. 1991. Writing Space: The Computer, Hypertext, and the History of Writing. Lawrence Erlbaum, Hillsdale, NJ.
[2] Tackitt, A., Moxley, J., and Eubanks, D. 2015. Signifying scores: Instructor rating as an assessment measure. Manuscript submitted for publication.
[3] Buckingham Shum, S., Knight, S., McNamara, D., Allen, L., Bektik, D., and Crossley, S. 2016. Critical perspectives on writing analytics. In Proceedings of the Sixth International Conference on Learning Analytics & Knowledge (LAK '16). ACM, New York, NY, USA, 481-483. DOI=http://dx.doi.org/10.1145/2883851.2883854
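The presentations summarized in Section 2 name two comment-analysis techniques: n-gram counting and a weighted log-odds ratio with an informative Dirichlet prior. The Python sketch below illustrates both under stated assumptions. It is not the presenters' code. The tokenizer, the function names, the example comments, and the choice of the pooled counts as the prior are all hypothetical, and the log-odds statistic follows a standard published formulation (the one popularized by Monroe, Colaresi, and Quinn) rather than any implementation described in this paper.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase a comment and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(comments, n):
    """Count n-grams over a collection of comments (n=1 gives a bag of words)."""
    counts = Counter()
    for comment in comments:
        counts.update(ngrams(tokenize(comment), n))
    return counts


def weighted_log_odds(counts_a, counts_b, prior_counts):
    """Z-scored log-odds ratio of each item in group A versus group B.

    prior_counts supplies the Dirichlet pseudo-counts. Every scored item needs a
    positive prior, and the vocabulary is assumed to hold at least two items.
    """
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    prior_total = sum(prior_counts.values())
    scores = {}
    for item, alpha in prior_counts.items():
        y_a = counts_a.get(item, 0)
        y_b = counts_b.get(item, 0)
        # Smoothed log-odds of the item within each group.
        odds_a = math.log((y_a + alpha) / (n_a + prior_total - y_a - alpha))
        odds_b = math.log((y_b + alpha) / (n_b + prior_total - y_b - alpha))
        # Approximate variance of the difference, used to standardize it.
        variance = 1.0 / (y_a + alpha) + 1.0 / (y_b + alpha)
        scores[item] = (odds_a - odds_b) / math.sqrt(variance)
    return scores


# Invented comments standing in for feedback given to two groups of writers.
higher_scored = ["clear thesis and strong evidence", "strong thesis"]
lower_scored = ["fix grammar", "grammar errors and weak thesis"]

unigrams_a = count_ngrams(higher_scored, 1)
unigrams_b = count_ngrams(lower_scored, 1)
# Assumption: the pooled counts of both groups serve as the informative prior.
scores = weighted_log_odds(unigrams_a, unigrams_b, unigrams_a + unigrams_b)

for item, z in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(" ".join(item), round(z, 2))
```

Positive scores mark items relatively more frequent in the first group, and negative scores mark items more frequent in the second. Passing a larger n to count_ngrams scores bigrams through five-grams in the same way.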