                                Text revision in Scientific Writing Assistance: An Overview
                                Léane Jourdan1 , Florian Boudin1 , Richard Dufour1 and Nicolas Hernandez1
                                1
                                    Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France


                                                                         Abstract
                                                                         Writing a scientific article is a challenging task as it is a highly codified genre. Good writing skills are
                                                                         essential to properly convey ideas and results of research work. Since the majority of scientific articles
                                                                         are currently written in English, this exercise is all the more difficult for non-native English speakers as
                                                                         they additionally have to face language issues. This article aims to provide an overview of text revision
                                                                         in writing assistance in the scientific domain. We will examine the specificities of scientific writing,
                                                                         including the format and conventions commonly used in research articles. Additionally, this overview
                                                                         will explore the various types of writing assistance tools available for text revision. Despite the evolution
                                                                         of the technology behind these tools through the years, from rule-based approaches to deep neural-based
                                                                         ones, challenges still exist (tools’ accessibility, limited consideration of the context, inexplicit use of
                                                                         discursive information, etc.).

                                                                         Keywords
                                                                         NLP, text revision, scientific writing assistance, academic writing, grammar error correction, moves




                                1. Introduction
                                The process of writing a scientific article can be complex and challenging, especially for junior
                                researchers who often have to learn the conventions of scientific writing. This is even more true
                                for researchers who are not native English speakers (ESL (English as a Second Language) and
                                EFL (English as a Foreign Language) learners), as strong writing skills are essential for effectively
                                conveying ideas to the reader. More generally, whether researchers are junior or senior, they
                                must pay attention to the quality of writing in order to ensure that their work is correctly shared
                                and understood by their audience.
                                   To address these needs, scientific writing assistance (SWA) has received increasing attention in
                                recent years. In particular, a growing number of tools, language resources and events have
                                emerged, aiming at helping scholars address these writing challenges. SWA encompasses tools
                                that address a range of different tasks, such as bibliographic management, text revision,
                                spelling error correction or citation recommendation. Considering that the field of SWA is vast,
                                this paper focuses on summarizing approaches and tools for scientific text revision, defined as


                                BIR 2023: 13th International Workshop on Bibliometric-enhanced Information Retrieval at ECIR 2023, April 2, 2023
                                $ leane.jourdan@univ-nantes.fr (L. Jourdan); Florian.Boudin@univ-nantes.fr (F. Boudin);
                                Richard.Dufour@univ-nantes.fr (R. Dufour); Nicolas.Hernandez@univ-nantes.fr (N. Hernandez)
                                                                       © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).



improving a draft, in terms of content and phrasing, to obtain the intended text in a scientific
style [1, 2].
   More specifically, we will cover only Natural Language Processing (NLP) tools, not language
resources. However, the datasets used to train these tools will be mentioned. By conducting this
overview, we hope to gain a better understanding of how these tools support scientific writers
in effectively communicating their ideas and arguments, and which approaches they use to do so.
Even with all the tools currently available, text revision in SWA is still an open field and our
article tries to identify the challenges and future directions of research.
   The rest of this paper is structured as follows: In Section 2, we will first provide a definition
of the research article genre and its characteristics. We will then define in Section 3 the task
of text revision in scientific writing assistance. Following this, we will present an overview of
the current tools that utilize NLP for this purpose. Finally, in Section 4, we will address the
challenges that may be encountered in future research on this topic.


2. Scientific writing
Scientific writing, also known as research writing, is a subgenre of academic writing. Among
the literature, the definition of academic writing is ambiguous. It can be defined as a genre that
encompasses all pieces of writing produced by students and researchers for academic work
purposes in a university setting (essay, thesis, syllabus, etc.) [3]. However, scientific writing
differs from academic writing in several aspects and has its own specificities and challenges.
   Scientific writing is produced by researchers for other researchers, often in the form of
research articles published in journals or conferences. It is expected to be concise, precise, and clear,
and to follow a highly codified structure, tense and pronoun usage, and terminology [4, 5]. A specific
format style may be required by the targeted journal (for example the IEEE style1 ).
   Researchers, especially junior ones, often face particular difficulties when writing their first
research articles, as they may lack experience with the codes, methodology, and techniques
required in the scientific genre [5]. This is a concern for researchers across all domains, and it
is especially relevant for those working outside their own discipline or in a multidisciplinary
environment. Additionally, the majority of research articles today are written in English: for
this reason, it is often ESL and EFL researchers who face the greatest difficulties in this regard
as they need to learn the specificities of a foreign language at the same time.
   In this section, we will first present the structure followed by the majority of scientific
articles, then describe the writing process of an article, and finally, show how the argumentative
structure can be formalized.

2.1. The structure of scientific articles
The structure of a research article varies depending on the discipline and type of article (for
example in the NLP domain: literature review, presentation of a corpus, creation of a new model,
etc.). However, the commonly accepted structure for research articles is the IMRaD (or IMRD)
model: Introduction, Methods, Results, and Discussion. This structure was gradually adopted by

1
    https://www.ieee.org/content/dam/ieee-org/ieee/web/org/conferences/style_references_manual.pdf



the scientific community and has been the most widely used pattern since the 1970s [6, 7]. This
model is the most popular and easiest to generalize across domains. To this model can be added
common sections like Literature review/Related work and Conclusion.
   The IMRaD format typically follows an hourglass pattern, beginning with a broad overview
in the introduction and narrowing down to a specific focus on the motivation and goals of the
research. The focus remains centered on this particular viewpoint throughout the related work,
methods, and results sections. Finally, the paper broadens its scope again in the discussion and
conclusion, considering the potential future directions and wider implications of the specific
findings [6].
   Overall, the structure of a research article is designed to clearly and concisely present the
research work. This structure helps to organize the information and makes it easy for readers to
understand and evaluate the research. Each section has its own purposes that we describe here:
    • Introduction: The purposes of this section are to give context for the research, provide
      background information on the topic being studied and state the research question or
      hypothesis. The goal is to know: What question was studied? and why?
    • Literature review/Related work: This section discusses previous research on the topic
      or related domains. The goals are to know: What is the current state of the art? What are
      the gaps in the existing literature?
    • Methods: This section describes the research design, including the participants, materials,
      and procedures used in the study. Its purpose is to answer the question: How was the
      problem studied?
    • Results: This section presents the results of experiments, a score of a model on a task,
      etc. typically in the form of tables and figures. It answers the question: What were the
      findings of the study?
    • Discussion: This section interprets the results, discusses their implications, and suggests
      directions for future research. The purpose is to answer the questions: What do these
      findings mean? What do they imply?
    • Conclusion: This section summarizes the key findings of the study and their significance.
      The conclusion must answer the question: What are the key elements the reader needs to
      remember from the paper?

2.2. The writing process of scientific articles
Writing a research article is a process that has been studied in the learning analytics community,
usually in the broader context of academic writing (including essays).
   Different processes have been proposed throughout the years. [8] established a process for
writing a scientific article and [9, 10] proposed processes for academic writing. All these
proposals share similar steps, which we summarize in our considered process as Prewriting,
Drafting, Revision and Proofreading.
   [11] proposed adding an iterative aspect and repetition of steps to this process, as illustrated
in Figure 1. This iterative notion can also be found in [1], where they focus on the iterative
aspect of the revision step.
   Here is the writing process we will be considering, summarizing previous references and
following the iterative pattern between steps from Figure 1:



Figure 1: An example of a writing process proposed in [11], cycling through the Planning, Drafting, Revision, and Editing stages from process activation to process termination.


    • Step 1: Prewriting
         – Collect and organize ideas
         – Write the outline
    • Step 2: Drafting
         – Write full sentences from notes
         – Focus on content rather than form and structure
         – Start with the body, in no particular order, then write the introduction and conclusion
    • Step 3: Revision
         – Change the structure of paragraphs and the content of sentences
         – Focus on conciseness, clarity, connecting elements, and simplifying the text
         – Make substantive rather than minor changes
         – Revise iteratively until the structure and phrasing are satisfying
         – Correct grammar errors
    • Step 4: Editing
         – Proofread: spelling error correction, minor changes, etc.
         – Edit figures and tables
         – Edit iteratively until no error is left

   This process is also supported by research in the psycho-linguistic domain on expert writing,
summarized by [2] (p. 374) as: “four cognitive processes support expert writing:
planning processes that set rhetorical goals, which guide the generation and organization
of ideas; translating processes that convert ideas into linguistic forms; transcription
processes that draw on spelling and handwriting (or typing) to externalize language in the form
of written text; and revising processes that monitor, evaluate, and change the intended and
the actual written text”. These definitions of planning and revising tally with the previously
described processes. The translating and transcription processes can be linked to the drafting
step.
   In our considered writing process, we are interested in the revision step. In this step, text
coherence is particularly important. For this reason, it is essential to formalize the argumentative
structure of scientific articles.



2.3. Modelling the argumentative structure
In the discourse analysis and English for Academic Purposes research areas, efforts have been
made to model the argumentative structure of scientific articles formally. The argumentative
structure shows the roles that argumentative discourse units (usually sentences) play in the
overall argumentation [12, 13].
   [6] worked on the genre analysis of scientific articles and proposed the Creating a Research
Space (CARS) model to describe the argumentative structure of the introduction of a research
article. It is composed of four moves and eleven steps as illustrated in Figure 2.
   An argumentative move is defined as a “recurring and regularized communicative event”.
It is a segment of text such as a phrase, sentence, or paragraph serving a specific purpose in
discourse, such as “Indication of a gap” in previous research [14] (p. 111) [6]. Each move is realized
through a series of functional strategies referred to as steps [15].




Figure 2: A CARS model for article introduction [6].


   Several annotation schemes have been proposed, derived from the CARS model and motivated
both by the desire to analyze all sections of articles and by the lack of resources on argumentative
structure.
   One of them is the Argumentative Zoning (AZ) model (or its improved version AZ-II [16]),
based on the CARS model of [6], which provides an analysis of the argumentative and rhetorical
structure of a scientific paper [16]. It is a sentence-level scheme used to classify sentences
by their argumentative role within a scientific paper [17, 18, 19]. The categories, referred to as
argumentative zones, are specific to the text type, in this case, research articles.



  Other efforts have been made to propose different annotation schemes with their own focus.
Where AZ focuses on how references are cited and for which purpose, schemes like Core-SC
(Core Scientific Concepts), specifically designed for chemistry research articles, focus on being
easily understandable by humans [20].


3. Text revision in scientific writing assistance
Scientific writing assistance (SWA) encompasses a large range of tasks such as reference man-
agement, citation recommendation, grammatical error correction or sentence revision. In this
section, we are interested in the task of text revision and describe NLP tools that can be
helpful for that task.

3.1. Definition of the text revision task
Text revision is the task occurring at the revision step of the writing process. There is no single
definition for what text revision in SWA is, but [1] (p.1) defined it as: “identifying discrepancies
between intended and instantiated text, deciding what edits to make, and how to make those desired
edits”. In another work [21] (p.58), it is defined as “a series of text generation tasks including but
not limited to style transfer, text simplification, counterfactual debiasing, grammar error correction
and argument reframing”. Text revision is the transformation of an input text into an improved
version fitting a desired attribute (formality, clarity, etc.), closer to the intended text. In recent
work, the text revision task is often limited to sentences. The sentence revision task, also called
SentRev, is defined by [22] (p.41) as “revising and editing incomplete draft sentences to create final
versions”.
   No events have been organized specifically on text revision but the field of SWA has seen a
growing interest in recent years. For example, the Helping Our Own (HOO) shared task, held
in 2011, aimed to develop and evaluate automated tools and techniques for error correction
that could assist authors in their writing, with a focus on the NLP community [23]. Years later, the
Intelligent and Interactive Writing Assistant (In2Writing2 ) workshop began in 2022 with the
goal to “facilitate discussion around writing assistants, thereby enhancing our understanding of
their usage in the writing process and predicting the consequences”. These examples highlight the
research efforts and the resurging interest in the field of writing assistance.
   Below are the text revision tools that will be considered in our study. They are grouped
into three general categories depending on whether they suggest modifications, correct only
grammar and spelling errors, or only annotate the text to visualize its structure.
        • Sentence revision tools: These tools provide automatic suggestions for the revision
          step of the writing process (changing the structure of a sentence, rephrasing for clarity,
          etc.) at sentence level. Sentence revision is iterative [1] and 1-to-N [22] as a sentence can
          have several correct revisions.
        • Grammar checkers: These tools tackle grammar error correction (GEC) and spelling
          error correction (SEC) tasks as they can detect grammar, spelling, and punctuation errors
          in a document and propose a correction automatically. Some can be built directly into
2
    https://in2writing.glitch.me/



                      word processing software. They are considered a part of the text revision step as making
                      a grammar change can substantively modify the sentence.
                     • Move annotators: These tools are dedicated to academic writing. They highlight the
                       moves (from the CARS model or another framework of moves), for example by color-coding
                       them. This makes the argumentative structure visible in order to help writers revise
                       and correct their drafts. These tools can also give suggestions on the order of the moves
                       or point out missing ones.

             3.2. Currently available tools
              There are a number of writing assistance tools acting on text revision. In this section, we will
              present a selection of pertinent tools that can be applied to scientific writing.
              Table 1 summarizes their main characteristics.
               With the exception of ChatGPT, the tools presented in this section are currently limited to
             the English language.

Category                   Tool                Year3   Domain               Approach                 Task                                         Availability
Sentence revision tools    Langsmith [24]      2020    Scientific           Transformer-based        SentRev, text completion, GEC and SEC        Free and paid plans
                           R3 [1]              2022    General/Scientific   Transformer-based        SentRev                                      Open source
                           ChatGPT [25]        2023    General              GPT-3.5                  Text generation                              Free and paid plans
Grammar checkers           Grammarly4          2023    General              Transformer-based        Correctness, clarity, engagement, delivery   Free and paid plans
                           LinggleWrite [26]   2020    Academic             LSTM/Bi-LSTM             Suggestions, essay scoring                   Free to use
Move annotators            Mover [27]          2016    Academic             Naive Bayes classifier   Moves analysis                               Free to use
                           RWT [28]            -       Academic             Probabilistic models     Moves analysis                               Limited access
                           AcaWriter [29]      2022    Academic             Rule-based               Moves analysis                               Open source
     Table 1
     Description of the tools currently available for text revision.



             3.2.1. Sentence revision tools
             In this section, we will present Langsmith and R3, two sentence revision tools trained on
              scientific articles. We will also discuss ChatGPT, one of the most advanced question-answering
              NLP tools, which can be used for text revision.

             Langsmith is an interactive academic sentence revision system developed by a team of
             researchers from Tohoku University, Edge Intelligence Systems Inc. and RIKEN [24]. The


             3
                 The year corresponds to the last known update.
             4
                 https://www.grammarly.com



system was released in 2020 and is available online through the Langsmith editor5 with free or
paid plan options. Langsmith is designed to be used on NLP research papers, enabling domain-specific
revisions such as correcting technical terms, and is mainly targeted at inexperienced and
non-native researchers. The system’s main feature is its revision function, but it also includes a
text completion and an error correction feature, two tasks that can be considered part of the
revision step in our writing process. The revision function can suggest fluent, academic-style
sentences to writers based on their rough, incomplete phrases or sentences.
  Their system for revision uses an encoder-decoder with a convolution module and is trained
on synthetic data they created by altering sentences from research articles. The text completion
feature is specialized in academic writing and leverages a GPT-2 small model fine-tuned on
papers from the ACL Anthology.
  With Langsmith, users can request a specific revision or select one from several candidates.
This work pointed out the importance of considering the 1-to-N nature of the revision task, as
one sentence can have several correct revised versions.
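
   To make this kind of synthetic data creation concrete, here is a minimal Python sketch (an illustration under our own assumptions, not Langsmith’s actual corruption rules, which are detailed in [24]): clean sentences are turned into rough drafts by randomly dropping and swapping words, yielding (draft, target) pairs that could train a revision model.

    import random

    def corrupt(sentence, drop_prob=0.1, swap_prob=0.1, seed=None):
        """Turn a clean sentence into a rough 'draft' by dropping and swapping words."""
        rng = random.Random(seed)
        tokens = sentence.split()
        kept = [t for t in tokens if rng.random() > drop_prob] or tokens  # never drop everything
        for i in range(len(kept) - 1):
            if rng.random() < swap_prob:
                kept[i], kept[i + 1] = kept[i + 1], kept[i]  # swap adjacent words
        return " ".join(kept)

    clean = "We propose an encoder-decoder model for sentence revision."
    pairs = [(corrupt(clean, seed=s), clean) for s in range(3)]  # (draft, target) training pairs
    for draft, target in pairs:
        print(draft, "->", target)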

R3 (Read, Revise, and Repeat) is a human-in-the-loop model for iterative text revision proposed
in 2022 [1]. The code is available on Github6 . It is composed of a fine-tuned RoBERTa-large and
a fine-tuned PEGASUS-LARGE [1]. The training data for R3 (dataset IteraTeR) was collected
from text revision data across three domains: ArXiv, Wikipedia, and Wikinews. The model is
designed to be a general revision model, but as it has been trained on research articles, it can be
considered a scientific writing assistant.
   R3 is an interface where the writer can upload a document and then iteratively accept or
refuse sets of revisions, as illustrated in this video7 . The main advantage of R3 is that it offers a
new way of thinking about the task of text revision by introducing the importance of both the
iterative aspect of the process and of keeping a human in the loop. However, even though it is
presented as a text revision system, it actually processes sentences one by one independently,
making it a model for the SentRev task.
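
   As an illustration of this iterative, human-in-the-loop workflow, here is a minimal sketch under our own assumptions (not R3’s actual implementation); the revise() function is a hypothetical placeholder standing in for the fine-tuned models of [1].

    def revise(sentence):
        """Hypothetical placeholder for a sentence-revision model."""
        return sentence  # a real model would return an edited version of the sentence

    def iterative_revision(sentences, max_rounds=3):
        """Propose revisions sentence by sentence, keep those the user accepts,
        and repeat until no proposed edit is accepted."""
        for _ in range(max_rounds):
            accepted_any = False
            for i, sentence in enumerate(sentences):
                proposal = revise(sentence)
                if proposal != sentence and input(f"Accept '{proposal}'? [y/n] ").strip() == "y":
                    sentences[i] = proposal
                    accepted_any = True
            if not accepted_any:  # convergence: no accepted edits left in this round
                break
        return sentences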

ChatGPT is a large-scale generative language model developed by OpenAI and launched on
November 30, 2022. It is based on GPT-3.5 and has been fine-tuned for dialogue, allowing it to
interact in a conversational manner. The first version of the chatbot was trained on a dataset of
human-human conversations, where one human plays the role of the chatbot. Then, the model
was fine-tuned through reinforcement learning, with human AI trainers ranking samples
of the chatbot’s responses in simulated conversations.
   ChatGPT can be accessed through a web application, where users can type in their questions
or requests for assistance in a chat interface. The model is not specifically tailored for scientific
or academic writing. However, it can be applied to a variety of tasks related to scientific
writing, such as text revision, simplification and restructuring, translation, grammatical error
correction (GEC), etc. Additionally, it is available in a range of different languages.


5
  https://editor.langsmith.co.jp/
6
  https://github.com/vipulraheja/IteraTeR/
7
  https://youtu.be/lK08tIpEoaE



  It is a new actor in writing assistance and can be used for scientific writing by asking specific
queries. Here are some ideas of prompts to use it as an assistant in your writing:

       • Can you revise and correct this in an academic style: “<your text>”
       • Can you revise the abstract I wrote for my paper on <topic>: “<your abstract>”
       • Can you translate this paragraph where I talk about <topic>: “<your paragraph>”
       • Can you rephrase this paragraph making it more <desired attribute>: “<your paragraph>”

All these prompts follow the same pattern, a request with a varying intent followed by the text
to process: <request>: “<your text>”. Giving more context in your prompt will often
lead to better results.
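
   As a sketch of how such a prompt can be sent programmatically (our own assumption, not part of the tools surveyed here: it relies on the openai Python package in its pre-1.0 interface and the gpt-3.5-turbo model name, both of which may change), one could write:

    # Hedged sketch: programmatic revision request to ChatGPT.
    # Assumes the pre-1.0 `openai` package; newer versions expose a different client API.
    import openai

    openai.api_key = "YOUR_API_KEY"  # hypothetical placeholder

    def revise_academic(text):
        """Ask the model to revise a passage in an academic style."""
        prompt = f'Can you revise and correct this text in an academic style: "{text}"'
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        return response["choices"][0]["message"]["content"]

    print(revise_academic("our method beat the other ones by a lot on this dataset"))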
   ChatGPT is currently in beta and free to use, but this may change in the future with the
creation of a paid plan. However, it should be noted that it still has some limitations, including
sensitivity to phrasing in the prompt and the potential for plausible-sounding but incorrect or
nonsensical answers.
   Moreover, since its launch, ChatGPT has been highly criticized as its use raises questions
regarding ethics and plagiarism. In January, ACL 2023 released a post on their blog8 regarding
their policy on AI Writing Assistance focusing on the use of ChatGPT. In this post, they
discourage its use to produce new ideas and text. They ask authors to acknowledge the use
of writing assistance tools, except when these are used purely to assist with the language of the paper.

3.2.2. Grammar checkers
In this section, various error-checking tools will be discussed. These tools are commonly utilized
for general writing and include Grammarly, LanguageTool, Ginger, etc. Each tool offers a range
of functionalities, including correcting spelling and grammar errors, detecting paraphrasing,
essay scoring, etc. As an extensive number of these tools exist, only two will be presented here.
It should be noted that these tools do not take into account the argumentative structure of the
text nor a large context.
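
   For a quick, scriptable illustration of this kind of checking, here is a minimal sketch using the open-source LanguageTool through the third-party language_tool_python wrapper (our own example, not part of the tools described below; the wrapper downloads a local Java-based LanguageTool server on first use):

    # Minimal grammar and spelling check with LanguageTool (pip install language_tool_python).
    import language_tool_python

    tool = language_tool_python.LanguageTool("en-US")
    draft = "This methods have been apply to several corpus of scientific article."

    for match in tool.check(draft):
        # Each match reports the rule, a message and candidate replacements.
        print(match.ruleId, ":", match.message, match.replacements[:3])

    print(tool.correct(draft))  # apply the top suggestion for every detected error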

Grammarly is one of the most widely used and easiest-to-use error-checking tools. It was developed
by Grammarly Inc. in 2009 and is currently available as an online text editor, as well as a
mobile app and browser extension. The browser extension also allows for integration with
other popular text editors, such as Overleaf and Google Docs.
   The free version offers suggestions in correctness (spelling, grammar, or punctuation) and
clarity, and their online editor allows one to set specific goals for the writing in terms of targeted
audience, formality, and intent. The premium offer extends these features with suggestions in
engagement and delivery and an additional goal to specify the domain (where “academic” is
one of the available options).

8
    https://2023.aclweb.org/blog/ACL-2023-policy/



LinggleWrite is a writing coach for essay writing, targeted at English learners, that provides
writing suggestions on an input text. It is derived from Linggle, a language reference search
engine, and was released in 2020 by NLPLab [26]. Unlike some other tools, it is only available
as a web application and there is no existing browser extension.
   The tool is specifically designed for essay writing in an academic setting and was trained
using the EF-Cambridge Open Language Database and the First Certificate in English dataset.
The system behind it consists of four components: writing suggestions, essay scoring, GEC,
and corrective feedback [26]. The writing suggestions are based on a dictionary of grammatical
patterns, hand-built or extracted from a corpus. The model behind LinggleWrite utilizes a
combination of LSTM with an attention layer for essay scoring and BiLSTM-CRF, BERT, and
Flair embeddings for GEC.

3.2.3. Move annotators
Mover, Research Writing Tutor, and AcaWriter are tools from the writing analytics domain, which
is a sub-domain of learning analytics [15]. These tools are rhetorical moves annotators and
writing feedback tools. Their goal is to provide formative feedback to students by automatically
identifying the argumentative structure of their text. The purpose of move annotators is to improve
students’ writing capacities and help them revise their drafts through the visualization of the
structure. These tools are more extensively described in [15].
   Move annotators are usually academic tools and currently present two limitations. First,
they are often linked to universities and their access is secured by a password. Secondly, in the
literature on argumentative structure and moves, the introduction and the abstract have been
the central focus, as they contain discourse information about the subject and content of the
article. These parts carry much of the rhetorical information but are among the last to be written
in the writing process. There is a lack of research and of move frameworks formalizing the
other parts of research articles.

Mover is a text structure (moves) analysis software developed by Laurence Anthony and
George V. Lashkia in 2003. This software is intended to assist students in evaluating and revising
their drafts [15]. Mover employs a Naive Bayes classifier trained in a supervised fashion on a
corpus of 100 research abstracts in the field of information technology [27, 15]. The moves
model utilized in the annotation process is the “Modified (CARS) Model” by [30]. The software
can be downloaded from Laurence Anthony’s website9 .
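
   As a toy illustration of this kind of approach (our own sketch with invented example sentences and labels, not Mover’s actual implementation or training corpus), a bag-of-words Naive Bayes move classifier can be built in a few lines with scikit-learn:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Invented (sentence, move) training examples for illustration only.
    train = [
        ("Recent studies have investigated neural writing assistants.", "establishing_territory"),
        ("However, little attention has been paid to text revision.", "indicating_gap"),
        ("In this paper, we present a new model for sentence revision.", "occupying_niche"),
        ("Our experiments show an improvement over the baseline.", "occupying_niche"),
    ]
    sentences, moves = zip(*train)

    classifier = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
    classifier.fit(sentences, moves)

    print(classifier.predict(["However, existing tools ignore the document-level context."]))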

Research Writing Tutor (RWT) is a web-based application developed by Elena Cotos and
Stephen Gilbert for academic writing assistance. It is composed of three modules, one of which
provides feedback and analysis of written text. This module identifies the rhetorical structure
and employs an extended move/step framework of the CARS model to color-code the structure
for better visualization [28] across all sections of the IMRaD structure, comprising a total of 61
steps distributed across 14 moves.

9
    https://www.laurenceanthony.net/software/antmover/



   Based on this structural analysis, RWT will provide feedback [15] on the use of moves,
comparing the draft’s moves distribution to a goal distribution extracted from articles in the
student’s discipline domain [28]. Additionally, it will analyze the use of steps in the form of
comments and clarifying questions about the rhetorical intent of a given sentence [28]. The
distributions are extracted from 900 introductions in 30 disciplines [31].
   For this supervised classification task, RWT uses probabilistic language models [32].
Currently, access to RWT is restricted to Iowa State University; however, guest accounts can be
created upon request [28].
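
   The distribution-comparison idea can be sketched as follows (an assumption about the general mechanism, not RWT’s actual computation; the goal proportions below are invented for illustration): count the predicted moves in a draft, normalize them, and compare them to a target distribution derived from published articles.

    from collections import Counter

    def move_distribution(move_labels):
        """Normalize a list of predicted move labels into proportions."""
        counts = Counter(move_labels)
        total = sum(counts.values())
        return {move: count / total for move, count in counts.items()}

    def compare_to_goal(draft_moves, goal_distribution):
        """For each expected move, report how far the draft deviates from the goal."""
        observed = move_distribution(draft_moves)
        return {move: round(observed.get(move, 0.0) - goal, 2)
                for move, goal in goal_distribution.items()}

    goal = {"establishing_territory": 0.5, "indicating_gap": 0.2, "occupying_niche": 0.3}
    draft = ["establishing_territory"] * 6 + ["occupying_niche"] * 2
    print(compare_to_goal(draft, goal))  # negative values flag under-represented moves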

AcaWriter is a web-based application that is part of the Academic Writing Analytics (AWA)
project by the University of Technology Sydney (UTS). AcaWriter uses rhetorically salient
sentences as a moves framework and labels the sentences using a rule-based system [15, 29]. When
used for abstracts and introductions, it provides feedback on the order of the moves or points
out missing ones.
   AcaWriter is accessible to UTS staff and students and a demo version is available for external
users. They also propose an open-source platform for institutions that wish to host their own
version of AcaWriter 10 .
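
   A rule-based labeler of this kind can be sketched with a handful of lexical patterns (illustrative regular expressions of our own, not AcaWriter’s actual rules, which rely on a richer rhetorical parsing pipeline [29]):

    import re

    # Illustrative surface patterns mapped to rhetorical move labels.
    RULES = [
        (re.compile(r"\b(however|little attention|remains unclear)\b", re.I), "gap"),
        (re.compile(r"\b(in this paper|we propose|we present)\b", re.I), "contribution"),
        (re.compile(r"\b(previous work|prior studies|has been studied)\b", re.I), "background"),
    ]

    def label_sentence(sentence):
        """Return the first matching move label, or None if no rule fires."""
        for pattern, move in RULES:
            if pattern.search(sentence):
                return move
        return None

    for s in ["Previous work has focused on grammar checking.",
              "However, little attention has been paid to revision.",
              "In this paper, we present an overview of revision tools."]:
        print(label_sentence(s), ":", s)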


4. Conclusion and future directions
In this work, we highlighted the unique characteristics of scientific article writing as a highly
codified genre. The type of assistance needed for writing an article varies depending on the
task. We were particularly interested in the text revision task. We presented an overview of
currently available writing assistance tools for this task and how they address it. However, not all
existing writing assistance tools have been covered, as they are too numerous; only a
few representative ones have been selected.
   From our analysis of existing tools, we identify seven major challenges to face in future work:
       1. Benchmarking performance: While our article presented a selection of tools, proposed
          a classification of these writing assistants and identified their available features, comparing
      the tools’ performances is still an open issue. Indeed, few evaluations accompany the
      developed tools, and it is currently impossible to compare their performances.
       2. Considering a larger context: Currently, sentences are treated independently. Con-
          sidering a larger context (for example revision at paragraph level) could be beneficial
          for the text revision task. One potential solution for this is to look into other domains,
          such as machine translation, to see how current models are trained to consider larger
          contexts [33, 34, 35, 36].
       3. Taking discursive analysis into consideration: Discourse analysis allows annotating
          the organization of discourse linking information in the text [37, 38]. Considering it
          would permit catching long-distance dependencies in the text which is essential for a
          good organization of scientific documents.


10
     https://cic.uts.edu.au/tools/awa/



  4. Including the argumentative structure: Current tools usually only label each sentence
     to highlight the structure and give some feedback. However, there is a lack of guidance
     on how to effectively structure an argument and present evidence to support claims. One
     research direction would be to use argument mining techniques to study the relationship
     between arguments [18, 19].
  5. Lack of available resources: Existing corpora for the revision task, such as
     IteraTeR [1] and arXivEdits [39], are composed of final articles and their versions
     before revisions, collected from arXiv. However, an issue with arXiv as a source of data
     is that the first versions of submitted articles have already been proofread, sometimes
     even revised by peers. Although there is no simple solution to that data problem, one
     possibility would be to ask researchers to contribute towards building such resources by
     providing early drafts of accepted papers.
  6. Improving accessibility and transparency: Some writing tools are not publicly acces-
     sible while others are not properly described in the research literature. The accessibility
      issue appears mostly with academic tools, some of which exist inside universities with access
      limited (non-commercial use) to students and staff [3]. The transparency issue occurs
      mainly with commercial tools proposed by private companies, as they do not always publish
      a paper on how they build and train their models, nor share their training data.
  7. Emergence of ethical issues: The use of text revision tools raises some ethical issues
     about plagiarism, intellectual property and the potential impact on the quality and in-
     tegrity of research. Additionally, it is worth examining the impact of these tools on
     English learners, as they may facilitate the writing process to such an extent that proper
     writing skills and language knowledge are not developed [40, 41, 42].


References
[1] W. Du, Z. M. Kim, V. Raheja, D. Kumar, D. Kang, Read, revise, repeat: A system
    demonstration for human-in-the-loop iterative text revision, in: Proceedings of the
    First Workshop on Intelligent and Interactive Writing Assistants (In2Writing 2022), As-
    sociation for Computational Linguistics, Dublin, Ireland, 2022, pp. 96–108. URL: https:
    //aclanthology.org/2022.in2writing-1.14. doi:10.18653/v1/2022.in2writing-1.14.
[2] R. A. Alves, T. Limpo, Progress in written language bursts, pauses, transcription, and
    written composition across schooling, Scientific Studies of Reading 19 (2015) 374–391.
[3] C. Strobl, E. Ailhaud, K. Benetos, A. Devitt, O. Kruse, A. Proske, C. Rapp, Digital support
    for academic writing: A review of technologies and pedagogies, Computers & education
    131 (2019) 33–48.
[4] E. D. Kallestinova, How to write your first research paper, The Yale journal of biology and
    medicine 84 (2011) 181.
[5] S. Bourekkache, English for specific purposes: writing scientific research papers. Case
    study: PhD students in the computer science department, Master’s thesis, University of
    Biskra, Algeria, 2022.
[6] J. M. Swales, Genre Analysis: English in academic and research settings, The Cambridge
    applied linguistics series, The press syndicate of the University of Cambridge, 1990.



 [7] L. B. Sollaci, M. G. Pereira, The introduction, methods, results, and discussion (imrad)
     structure: a fifty-year survey, Journal of the medical library association 92 (2004) 364.
 [8] E. A. Silveira, A. M. de Sousa Romeiro, M. Noll, Guide for scientific writing: how to avoid
     common mistakes in a scientific article, Journal of Human Growth and Development 32
     (2022) 341–352.
 [9] E. D. Laksmi, “Scaffolding” students’ writing in EFL class: Implementing process approach,
     TEFLIN Journal 17 (2006) 144–156.
[10] S. Bailey, Academic writing: A handbook for international students, Routledge, 2014.
[11] A. Seow, The writing process and process writing, Methodology in language teaching: An
     anthology of current practice 315 (2002) 320.
[12] J. W. G. Putra, S. Teufel, T. Tokunaga, Annotating argumentative structure in english-
     as-a-foreign-language learner essays, Natural Language Engineering 28 (2022) 797–823.
     doi:10.1017/S1351324921000218.
[13] J. W. G. Putra, K. Matsumura, S. Teufel, T. Tokunaga, Tiara 2.0: an interactive tool for
     annotating discourse structure and text improvement, Language Resources and Evaluation
     (2021) 1–25.
[14] S. Teufel, J. Carletta, M. Moens, An annotation scheme for discourse-level argumentation
     in research articles, in: Ninth Conference of the European Chapter of the Association for
     Computational Linguistics, 1999, pp. 110–117.
[15] S. Knight, S. Abel, A. Shibani, Y. K. Goh, R. Conijn, A. Gibson, S. Vajjala, E. Cotos, Á. Sándor,
     S. B. Shum, Are you being rhetorical? a description of rhetorical move annotation tools
     and open corpus of sample machine-annotated rhetorical moves, Journal of Learning
     Analytics 7 (2020) 138–154.
[16] S. Teufel, A. Siddharthan, C. Batchelor, Towards domain-independent argumentative
     zoning: Evidence from chemistry and computational linguistics, in: Proceedings of the
     2009 conference on empirical methods in natural language processing, 2009, pp. 1493–1502.
[17] S. Teufel, et al., Argumentative zoning: Information extraction from scientific text, Ph.D.
     thesis, Citeseer, 1999.
[18] B. Liu, V. Schlegel, R. T. Batista-Navarro, S. Ananiadou, Incorporating zoning information
     into argument mining from biomedical literature, in: Proceedings of the Thirteenth
     Language Resources and Evaluation Conference, 2022, pp. 6162–6169.
[19] J. Lawrence, C. Reed, Argument mining: A survey, Computational Linguistics 45 (2020)
     765–818.
[20] M. Liakata, S. Teufel, A. Siddharthan, C. Batchelor, Corpora for the conceptualisation and
     zoning of scientific papers (2010).
[21] J. Li, Z. Li, T. Ge, I. King, M. R. Lyu, Text revision by on-the-fly representation optimization,
     arXiv preprint arXiv:2204.07359 (2022).
[22] T. Ito, T. Kuribayashi, H. Kobayashi, A. Brassard, M. Hagiwara, J. Suzuki, K. Inui, Diamonds
     in the rough: Generating fluent sentences from early-stage drafts for academic writing
     assistance, in: Proceedings of the 12th International Conference on Natural Language
     Generation, Association for Computational Linguistics, Tokyo, Japan, 2019, pp. 40–53.
     URL: https://aclanthology.org/W19-8606. doi:10.18653/v1/W19-8606.
[23] R. Dale, A. Kilgarriff, Helping our own: The HOO 2011 pilot shared task, in: Pro-
     ceedings of the 13th European Workshop on Natural Language Generation, Associ-



     ation for Computational Linguistics, Nancy, France, 2011, pp. 242–249. URL: https:
     //aclanthology.org/W11-2838.
[24] T. Ito, T. Kuribayashi, M. Hidaka, J. Suzuki, K. Inui, Langsmith: An interactive academic
     text revision system, in: Proceedings of the 2020 Conference on Empirical Methods in
     Natural Language Processing: System Demonstrations, Association for Computational
     Linguistics, Online, 2020, pp. 216–226. URL: https://aclanthology.org/2020.emnlp-demos.28.
     doi:10.18653/v1/2020.emnlp-demos.28.
[25] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P. Mishkin, C. Zhang, S. Agarwal,
     K. Slama, A. Ray, et al., Training language models to follow instructions with human
     feedback, arXiv preprint arXiv:2203.02155 (2022).
[26] C.-T. Tsai, J.-J. Chen, C.-Y. Yang, J. S. Chang, LinggleWrite: a coaching system for essay
     writing, in: Proceedings of the 58th Annual Meeting of the Association for Computational
     Linguistics: System Demonstrations, Association for Computational Linguistics, Online,
     2020, pp. 127–133. URL: https://aclanthology.org/2020.acl-demos.17. doi:10.18653/v1/
     2020.acl-demos.17.
[27] L. Anthony, G. V. Lashkia, Mover: A machine learning tool to assist in the reading and
     writing of technical papers, IEEE transactions on professional communication 46 (2003)
     185–193.
[28] E. Cotos, Computer-assisted research writing in the disciplines, in: Adaptive educational
     technologies for literacy instruction, Routledge, 2016, pp. 225–242.
[29] S. Knight, A. Shibani, S. Abel, A. Gibson, P. Ryan, Acawriter: A learning analytics tool for
     formative feedback on academic writing, Journal of Writing Research (2020).
[30] L. Anthony, Writing research article introductions in software engineering: How accurate
     is a standard model?, IEEE transactions on Professional Communication 42 (1999) 38–46.
[31] E. Cotos, S. Huffman, S. Link, Understanding graduate writers’ interaction with and
     impact of the research writing tutor during revision, Journal of Writing Research 12 (2020)
     187–232.
[32] E. Cotos, S. Huffman, S. Link, Furthering and applying move/step constructs: Technology-
     driven marshalling of swalesian genre theory for eap pedagogy, Journal of English for
     Academic Purposes 19 (2015) 52–72.
[33] S. Majumder, S. Lauly, M. Nadejde, M. Federico, G. Dinu, A baseline revisited:
     Pushing the limits of multi-segment models for context-aware translation, CoRR
     abs/2210.10906 (2022). URL: https://doi.org/10.48550/arXiv.2210.10906. doi:10.48550/
     arXiv.2210.10906. arXiv:2210.10906.
[34] B. Li, H. Liu, Z. Wang, Y. Jiang, T. Xiao, J. Zhu, T. Liu, C. Li, Does multi-encoder help? a case
     study on context-aware neural machine translation, in: Proceedings of the 58th Annual
     Meeting of the Association for Computational Linguistics, Association for Computational
     Linguistics, Online, 2020, pp. 3512–3518. URL: https://aclanthology.org/2020.acl-main.322.
     doi:10.18653/v1/2020.acl-main.322.
[35] Y. Feng, F. Li, Z. Song, B. Zheng, P. Koehn, Learn to remember: Transformer with recurrent
     memory for document-level machine translation, in: Findings of the Association for Com-
     putational Linguistics: NAACL 2022, Association for Computational Linguistics, Seattle,
     United States, 2022, pp. 1409–1420. URL: https://aclanthology.org/2022.findings-naacl.105.
     doi:10.18653/v1/2022.findings-naacl.105.



[36] J. Chen, X. Li, J. Zhang, C. Zhou, J. Cui, B. Wang, J. Su, Modeling discourse structure for
     document-level neural machine translation, in: Proceedings of the First Workshop on
     Automatic Simultaneous Translation, Association for Computational Linguistics, Seattle,
     Washington, 2020, pp. 30–36. URL: https://aclanthology.org/2020.autosimtrans-1.5. doi:10.
     18653/v1/2020.autosimtrans-1.5.
[37] M. Taboada, W. C. Mann, Rhetorical structure theory: Looking back and moving ahead,
     Discourse studies 8 (2006) 423–459.
[38] L. Danlos, Analyse discursive et informations de factivité (discursive analysis and in-
     formation factivity), in: Actes de la 18e conférence sur le Traitement Automatique des
     Langues Naturelles. Articles longs, ATALA, Montpellier, France, 2011, pp. 364–375. URL:
     https://aclanthology.org/2011.jeptalnrecital-long.32.
[39] C. Jiang, W. Xu, S. Stevens, arXivEdits: Understanding the human revision process in
     scientific writing, in: Proceedings of EMNLP 2022, 2022.
[40] N. Shintani, R. Ellis, The comparative effect of direct written corrective feedback and
     metalinguistic explanation on learners’ explicit and implicit knowledge of the english
     indefinite article, Journal of Second Language Writing 22 (2013) 286–306. URL: https:
     //www.sciencedirect.com/science/article/pii/S1060374313000271. doi:https://doi.org/
     10.1016/j.jslw.2013.03.011.
[41] A. Sampson, “coded and uncoded error feedback: Effects on error frequencies
     in adult colombian efl learners’ writing”, System 40 (2012) 494–504. URL: https://
     www.sciencedirect.com/science/article/pii/S0346251X12000772. doi:https://doi.org/
     10.1016/j.system.2012.10.001.
[42] S. Chen, H. Nassaji, Q. Liu, Efl learners’ perceptions and preferences of written corrective
     feedback: a case study of university students from mainland china, Asian-Pacific journal
     of second and foreign language education 1 (2016) 1–17.



