Text revision in Scientific Writing Assistance: An Overview

Léane Jourdan, Florian Boudin, Richard Dufour and Nicolas Hernandez
Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
leane.jourdan@univ-nantes.fr (L. Jourdan); Florian.Boudin@univ-nantes.fr (F. Boudin); Richard.Dufour@univ-nantes.fr (R. Dufour); Nicolas.Hernandez@univ-nantes.fr (N. Hernandez)

BIR 2023: 13th International Workshop on Bibliometric-enhanced Information Retrieval at ECIR 2023, April 2, 2023

Abstract
Writing a scientific article is a challenging task, as it is a highly codified genre. Good writing skills are essential to properly convey the ideas and results of research work. Since the majority of scientific articles are currently written in English, this exercise is all the more difficult for non-native English speakers, as they additionally have to face language issues. This article aims to provide an overview of text revision in writing assistance in the scientific domain. We will examine the specificities of scientific writing, including the format and conventions commonly used in research articles. Additionally, this overview will explore the various types of writing assistance tools available for text revision. Despite the evolution of the technology behind these tools through the years, from rule-based approaches to deep neural ones, challenges still exist (tools' accessibility, limited consideration of the context, inexplicit use of discursive information, etc.).

Keywords
NLP, text revision, scientific writing assistance, academic writing, grammar error correction, moves

1. Introduction

The process of writing a scientific article can be complex and challenging, especially for junior researchers who often have to learn the conventions of scientific writing. This is even more true for researchers who are not native English speakers (English as a Second Language (ESL) and English as a Foreign Language (EFL) learners), as strong writing skills are essential for effectively conveying ideas to the reader. More generally, whether researchers are junior or senior, they must pay attention to the quality of their writing in order to ensure that their work is correctly shared and understood by their audience.

To address these needs, scientific writing assistance (SWA) has received more attention in recent years. In particular, a growing number of tools, language resources and events have emerged, aiming to help scholars address these writing challenges. SWA encompasses tools that address a range of different tasks, such as bibliographic management, text revision, spelling error correction or citation recommendation. Considering that the field of SWA is vast, this paper focuses on summarizing approaches and tools for scientific text revision, defined as improving a draft in terms of its content and phrasing to obtain the correct intended text in scientific style [1, 2]. More specifically, we will cover only Natural Language Processing (NLP) tools, not language resources. However, the datasets used to train these tools will be mentioned.
By conducting this overview, we hope to gain a better understanding of how these tools support scientific writers in effectively communicating their ideas and arguments, and of which approaches they use to do so. Even with all the tools currently available, text revision in SWA is still an open field, and our article tries to identify the challenges and future directions of research.

The rest of this paper is structured as follows: in Section 2, we will first provide a definition of the research article genre and its characteristics. We will then define in Section 3 the task of text revision in scientific writing assistance. Following this, we will present an overview of the current tools that utilize NLP for this purpose. Finally, in Section 4, we will address the challenges that may be encountered in future research on this topic.

2. Scientific writing

Scientific writing, also known as research writing, is a subgenre of academic writing. In the literature, the definition of academic writing is ambiguous. It can be defined as a genre that encompasses all pieces of writing produced by students and researchers for academic work purposes in a university setting (essay, thesis, syllabus, etc.) [3]. However, scientific writing differs from academic writing in several aspects and has its own specificities and challenges. Scientific writing is produced by researchers for other researchers, often in the form of research articles published in journals or conferences. It is expected to be concise, precise and clear, and to follow a highly codified structure, tense/pronoun usage, and terminology [4, 5]. A specific format can be required by the targeted journal (for example, the IEEE style1).

1 https://www.ieee.org/content/dam/ieee-org/ieee/web/org/conferences/style_references_manual.pdf

Researchers, especially junior ones, often face particular difficulties when writing their first research articles, as they may lack experience with the codes, methodology, and techniques required in the scientific genre [5]. This is a concern for researchers across all domains, and it is especially relevant for those working outside their own discipline or in a multidisciplinary environment. Additionally, the majority of research articles today are written in English: for this reason, it is often ESL and EFL researchers who face the greatest difficulties in this regard, as they need to learn the specificities of a foreign language at the same time.

In this section, we will first present the structure followed by the majority of scientific articles, then describe the writing process of an article, and finally show how the argumentative structure can be formalized.

2.1. The structure of scientific articles

The structure of a research article varies depending on the discipline and the type of article (for example, in the NLP domain: literature review, presentation of a corpus, creation of a new model, etc.). However, the commonly accepted structure for research articles is the IMRaD (or IMRD) model: Introduction, Methods, Results, and Discussion. This structure was gradually adopted by the scientific community and has been the most widely used pattern since the 1970s [6, 7]. This model is the most popular and the easiest to generalize across domains. Common sections such as Literature review/Related work and Conclusion can be added to this model.
The IMRaD format typically follows an hourglass pattern, beginning with a broad overview in the introduction and narrowing down to a specific focus on the motivation and goals of the research. The focus remains centered on this particular viewpoint throughout the related work, methods, and results sections. Finally, the paper broadens its scope again in the discussion and conclusion, considering the potential future directions and wider implications of the specific findings [6].

Overall, the structure of a research article is designed to clearly and concisely present the research work. This structure helps to organize the information and makes it easy for readers to understand and evaluate the research. Each section has its own purposes, which we describe here:

• Introduction: The purposes of this section are to give context for the research, provide background information on the topic being studied, and state the research question or hypothesis. The goal is to know: What question was studied? And why?
• Literature review/Related work: This section discusses previous research on the topic or related domains. The goals are to know: What is the current state of the art? What are the gaps in the existing literature?
• Methods: This section describes the research design, including the participants, materials, and procedures used in the study. Its purpose is to answer the question: How was the problem studied?
• Results: This section presents the results of experiments, the score of a model on a task, etc., typically in the form of tables and figures. It answers the question: What were the findings of the study?
• Discussion: This section interprets the results, discusses their implications, and suggests directions for future research. The purpose is to answer the questions: What do these findings mean? What do they imply?
• Conclusion: This section summarizes the key findings of the study and their significance. The conclusion must answer the question: What are the key elements the reader needs to remember from the paper?

2.2. The writing process of scientific articles

Writing a research article is a process that has been studied in the learning analytics community, usually by broadening it to academic writing (including essays). Different processes have been proposed throughout the years. [8] established a process for writing a scientific article, and [9, 10] proposed processes for academic writing. All these proposals share similar steps, which we summarize in our considered process as Prewriting, Drafting, Revision and Editing. [11] proposed to add an iterative aspect and repetition of steps to this process, as illustrated in Figure 1. This iterative notion can also be found in [1], which focuses on the iterative aspect of the revision step.

Figure 1: An example of a writing process proposed in [11], iterating between the Planning, Drafting, Revision and Editing stages.

Here is the writing process we will be considering, summarizing the previous references and following the iterative pattern between steps from Figure 1:

• Step 1: Prewriting
  – Collect and organize ideas
  – Write the outline
• Step 2: Drafting
  – Write full sentences from notes
  – Focus on content rather than form and structure
  – Start with the body, in no particular order, then the introduction and conclusion
• Step 3: Revision
  – Change the structure of paragraphs and the content of sentences
  – Focus on conciseness, clarity, connecting elements, and simplifying the text
  – Make substantive rather than minor changes
  – Revise iteratively until the structure and phrasing are satisfying
  – Correct grammar errors
• Step 4: Editing
  – Proofreading: spelling error correction, minor changes, etc.
  – Edit figures and tables
  – Edit iteratively until no error is left

This process is also supported by research on expert writing conducted in the psycholinguistic domain, summarized by [2] (p. 374) as: "four cognitive processes support expert writing: planning processes that set rhetorical goals, which guide the generation and organization of ideas; translating processes that convert ideas into linguistic forms; transcription processes that draw on spelling and handwriting (or typing) to externalize language in the form of written text; and revising processes that monitor, evaluate, and change the intended and the actual written text". These definitions of planning and revising tally with the previously described processes. The translating and transcription processes can be linked to the drafting step.

In our considered writing process, we are interested in the revision step. In this step, text coherence is particularly important. For this reason, it is essential to formalize the argumentative structure of scientific articles.

2.3. Modelling the argumentative structure

In the discourse analysis and English for Academic Purposes research areas, efforts have been made to formally model the argumentative structure of scientific articles. The argumentative structure shows the roles that argumentative discourse units (usually sentences) play in the overall argumentation [12, 13]. [6] worked on the genre analysis of scientific articles and proposed the Creating a Research Space (CARS) model to describe the argumentative structure of the introduction of a research article. It is composed of four moves and eleven steps, as illustrated in Figure 2. An argumentative move is defined as a "recurring and regularized communicative event": a segment of text such as a phrase, sentence, or paragraph serving a specific purpose in the discourse, such as an "Indication of a gap" in previous research [14] (p. 111) [6]. Each move is a series of functional strategies referred to as steps [15].

Figure 2: A CARS model for article introductions [6].

Derived from the CARS model, and motivated both by the desire to propose a solution for analyzing all sections of articles and by the lack of resources on argumentative structure, several annotation schemes have been proposed. One of them is the Argumentative Zoning (AZ) model (or its improved version AZ-II [16]), based on the CARS model of [6], which provides an analysis of the argumentative and rhetorical structure of a scientific paper [16]. It is a sentence-level scheme used to classify sentences by their argumentative role within a scientific paper [17, 18, 19]. The categories, referred to as argumentative zones, are specific to the text type, in this case research articles.

Other efforts have been made to propose different annotation schemes, each with its own focus. Where AZ focuses on how references are cited and for which purpose, schemes like CoreSC (Core Scientific Concepts), specifically designed for chemistry research articles, focus on being easily understandable by humans [20].
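To make the sentence-level nature of these schemes concrete, the sketch below frames argumentative zone (or move) labelling as ordinary supervised text classification. It is a minimal illustration under our own assumptions: the zone labels, the training sentences and the TF-IDF/Naive Bayes pipeline are ours and do not reproduce the actual AZ or AZ-II schemes, nor the classifiers reported in [16, 17].

```python
# Minimal sketch of sentence-level argumentative zoning as text classification.
# The zone labels and training sentences are illustrative assumptions, not the
# real AZ / AZ-II annotation scheme or data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_sentences = [
    "Previous work has mainly focused on sentence-level revision.",
    "However, little attention has been paid to paragraph-level context.",
    "We propose a model that revises drafts iteratively.",
    "Our approach outperforms the baseline on the benchmark corpus.",
]
train_zones = ["BACKGROUND", "GAP", "OWN_METHOD", "OWN_RESULT"]

# TF-IDF features + Naive Bayes (the family of classifiers used by Mover [27]).
zoner = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
zoner.fit(train_sentences, train_zones)

draft = [
    "Existing tools treat each sentence independently.",
    "We introduce a revision system that uses document-level context.",
]
for sentence, zone in zip(draft, zoner.predict(draft)):
    print(f"{zone:12s} {sentence}")
```

Real annotators such as those described in Section 3.2.3 rely on much larger annotated corpora and richer features, but the input/output contract is the same: one argumentative label per sentence.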
3. Text revision in scientific writing assistance

Scientific writing assistance (SWA) encompasses a large range of tasks, such as reference management, citation recommendation, grammatical error correction or sentence revision. In this section, we are interested in the task of text revision and in describing NLP tools that can be helpful for that task.

3.1. Definition of the text revision task

Text revision is the task occurring at the revision step of the writing process. There is no single definition of what text revision in SWA is, but [1] (p. 1) defined it as "identifying discrepancies between intended and instantiated text, deciding what edits to make, and how to make those desired edits". In another work [21] (p. 58), it is defined as "a series of text generation tasks including but not limited to style transfer, text simplification, counterfactual debiasing, grammar error correction and argument reframing". Text revision is thus the transformation of an input text into an improved version that fits a desired attribute (formality, clarity, etc.) and is closer to the intended text. In recent work, the text revision task is often limited to sentences. The sentence revision task, also called SentRev, is defined by [22] (p. 41) as "revising and editing incomplete draft sentences to create final versions".

No events have been organized specifically on text revision, but the field of SWA has seen growing interest in recent years. For example, the Helping Our Own (HOO) shared task, held in 2011, aimed to develop and evaluate automated tools and techniques for error correction that can assist authors in their writing, with a focus on the NLP community [23]. Years later, the Intelligent and Interactive Writing Assistants (In2Writing2) workshop began in 2022 with the stated goal to "facilitate discussion around writing assistants, thereby enhancing our understanding of their usage in the writing process and predicting the consequences". These examples highlight the research efforts and the renewed interest in the field of writing assistance.

2 https://in2writing.glitch.me/

Below are the text revision tools that will be considered in our study. They are grouped into three general categories, depending on whether they suggest modifications, correct only grammar and spelling errors, or only annotate the text to visualize its structure.

• Sentence revision tools: These tools provide automatic suggestions for the revision step of the writing process (changing the structure of a sentence, rephrasing for clarity, etc.) at the sentence level. Sentence revision is iterative [1] and 1-to-N [22], as a sentence can have several correct revisions.
• Grammar checkers: These tools tackle the grammar error correction (GEC) and spelling error correction (SEC) tasks, as they can detect grammar, spelling, and punctuation errors in a document and propose corrections automatically. Some can be built directly into word processing software. They are considered part of the text revision step, as making a grammar change can substantively modify a sentence.
• Move annotators: These tools are dedicated to academic writing. They highlight the moves (from the CARS model or another framework of moves), for example by color-coding them. This makes the argumentative structure visible in order to help the writer revise and correct their draft. These tools can also give suggestions on the order of the moves or point out missing ones.

3.2. Currently available tools

A number of writing assistance tools act on text revision. In this section, we will present a selection of relevant tools that can be applied to scientific writing. Table 1 summarizes their main characteristics. With the exception of ChatGPT, the tools presented in this section are currently limited to the English language.
Table 1: Description of the tools currently available for text revision.

Category | Tool | Year3 | Domain | Approach | Task | Availability
Sentence revision tools | Langsmith [24] | 2020 | Scientific | Transformer-based | SentRev, text completion, GEC and SEC | Free and paid plans
Sentence revision tools | R3 [1] | 2022 | General/Scientific | Transformer-based | SentRev | Open source
Sentence revision tools | ChatGPT [25] | 2023 | General | GPT-3.5 | Text generation | Free and paid plans
Grammar checkers | Grammarly4 | 2023 | General | Transformer-based | Correctness, clarity, engagement, delivery | Free and paid plans
Grammar checkers | LinggleWrite [26] | 2020 | Academic | LSTM/Bi-LSTM | Suggestions, essay scoring | Free to use
Move annotators | Mover [27] | 2016 | Academic | Naive Bayes classifier | Moves analysis | Free to use
Move annotators | RWT [28] | - | Academic | Probabilistic models | Moves analysis | Limited access
Move annotators | AcaWriter [29] | 2022 | Academic | Rule-based | Moves analysis | Open source

3 The year corresponds to the last known update.
4 https://www.grammarly.com

3.2.1. Sentence revision tools

In this section, we will present Langsmith and R3, two sentence revision tools trained on scientific articles. We will also discuss ChatGPT, one of the most advanced question-answering NLP tools, which can be used for text revision.

Langsmith is an interactive academic sentence revision system developed by a team of researchers from Tohoku University, Edge Intelligence Systems Inc. and RIKEN [24]. The system was released in 2020 and is available online through the Langsmith editor5, with free or paid plan options. Langsmith is designed to be used on NLP research papers, enabling domain-specific revisions such as correcting technical terms, and is mainly targeted at inexperienced and non-native researchers. The system's main feature is its revision function, but it also includes a text completion and an error correction feature, two tasks that can be considered part of the revision step in our writing process. The revision function can suggest fluent, academic-style sentences to writers based on their rough, incomplete phrases or sentences. Their system for revision uses an encoder-decoder with a convolution module and is trained on synthetic data created by altering sentences from research articles. The text completion feature is specialized in academic writing and leverages a GPT-2 small model fine-tuned on papers from the ACL Anthology. With Langsmith, users can request a specific revision or select one from several candidates. This work pointed out the importance of considering the 1-to-N nature of the revision task, as one sentence can have several correct revised versions.

5 https://editor.langsmith.co.jp/

R3 (Read, Revise, and Repeat) is a human-in-the-loop model for iterative text revision proposed in 2022 [1]. The code is available on GitHub6. It is composed of a fine-tuned RoBERTa-large and a fine-tuned PEGASUS-large [1]. The training data for R3 (the IteraTeR dataset) was collected from text revision data across three domains: arXiv, Wikipedia, and Wikinews. The model is designed to be a general revision model, but as it has been trained on research articles, it can be considered a scientific writing assistant. R3 provides an interface where the writer can upload a document and then iteratively accept or refuse sets of revisions, as illustrated in a demonstration video7. The main advantage of R3 is that it offers a new way of thinking about the task of text revision, emphasizing both the iterative aspect of the process and the presence of a human in the loop. However, even though it is presented as text revision, it actually processes sentences one by one, independently, making it a model for the SentRev task.

6 https://github.com/vipulraheja/IteraTeR/
7 https://youtu.be/lK08tIpEoaE
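To illustrate how such sentence-level revision systems are typically driven, the sketch below feeds a rough draft sentence to a generic sequence-to-sequence model through the Hugging Face transformers library and decodes several candidate revisions, reflecting the 1-to-N nature of the task. The checkpoint name is a placeholder of our own; this is not the released Langsmith or R3 model.

```python
# Minimal sketch of sentence-level revision (SentRev) with a seq2seq model.
# "some-org/sentence-revision-model" is a hypothetical checkpoint name standing
# in for a system such as Langsmith or R3, not one of their released models.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "some-org/sentence-revision-model"  # placeholder, not a real checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

draft = "this method obtain better result than baseline on most of dataset ."
inputs = tokenizer(draft, return_tensors="pt")

# Several candidates are generated because one draft sentence can have
# multiple correct revisions (the 1-to-N aspect of the task).
outputs = model.generate(**inputs, num_beams=5, num_return_sequences=3, max_new_tokens=64)
for candidate in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(candidate)
```

An interactive tool would then let the writer pick one of the candidates (as Langsmith does) or accept and reject batches of such edits iteratively (as R3 does).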
ChatGPT is a large-scale generative language model developed by OpenAI and launched on November 30, 2022. It is based on GPT-3.5 and has been fine-tuned for dialogue, allowing it to interact in a conversational manner. The first version of the chatbot was trained on a dataset of human-human conversations, where one human plays the role of the chatbot. The model was then fine-tuned through reinforcement learning, with human AI trainers ranking samples of the chatbot's responses in simulated conversations. ChatGPT can be accessed through a web application, where users can type in their questions or requests for assistance in a chat interface. The model is not specifically tailored for scientific or academic writing. However, it can be applied to a variety of tasks related to scientific writing, such as text revision, simplification and restructuring, translation, grammatical error correction (GEC), etc. Additionally, it is available in a range of different languages.

ChatGPT is a new actor in writing assistance and can be used for scientific writing by asking specific queries. Here are some ideas of prompts to use it as an assistant in your writing:

• Can you revise and correct this in an academic style? : "<text>"
• Can you revise the abstract I wrote for my paper on <topic>: "<text>"
• Can you translate this paragraph where I talk about <topic>: "<text>"
• Can you rephrase this paragraph making it more <attribute>: "<text>"

All these prompts follow the same pattern with a different intent each time: <intent>: "<text>". Giving more context in your prompt will often lead to better results.

ChatGPT is currently in beta and free to use, but this may change in the future with the creation of a paid plan. However, it should be noted that it still has some limitations, including sensitivity to the phrasing of the prompt and the potential for plausible-sounding but incorrect or nonsensical answers. Moreover, since its launch, ChatGPT has been highly criticized, as its use raises questions regarding ethics and plagiarism. In January 2023, the ACL 2023 organizers released a post on their blog8 regarding their policy on AI writing assistance, focusing on the use of ChatGPT. In this post, they discourage its use to produce new ideas and text, and ask authors to acknowledge the use of writing assistance tools except when they are used purely to assist with the language of the paper.

8 https://2023.aclweb.org/blog/ACL-2023-policy/
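For completeness, the same prompt pattern can also be sent programmatically rather than through the chat interface. The sketch below is a minimal example assuming the official openai Python package and an API key; the model name and prompt wording are illustrative choices of ours, not a workflow prescribed by the tools surveyed here.

```python
# Minimal sketch of the revision prompt pattern via the OpenAI API.
# Assumed setup: the `openai` package and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

draft = "this method obtain better result than baseline on most of dataset ."
prompt = f'Can you revise and correct this in an academic style? : "{draft}"'

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the GPT-3.5 family behind ChatGPT
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Any output used in a manuscript in this way falls under the acknowledgement policy discussed above.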
3.2.2. Grammar checkers

In this section, various error-checking tools will be discussed. These tools are commonly utilized for general writing and include Grammarly, LanguageTool, Ginger, etc. Each tool offers a range of functionalities, including correcting spelling and grammar errors, detecting paraphrasing, essay scoring, etc. As an extensive number of these tools exist, only two will be presented here. It should be noted that these tools take into account neither the argumentative structure of the text nor a large context.

Grammarly is one of the most widely used and easiest-to-use error-checking tools. It was developed by Grammarly Inc. in 2009 and is currently available as an online text editor, as well as a mobile app and browser extension. The browser extension also allows for integration with other popular text editors, such as Overleaf and Google Docs. The free version offers suggestions on correctness (spelling, grammar, or punctuation) and clarity, and the online editor allows one to set specific goals for the writing in terms of targeted audience, formality, and intent. The premium offer extends these features with suggestions on engagement and delivery, and an additional goal to specify the domain (where "academic" is one of the available options).

LinggleWrite is a writing coach for essay writing targeted at English learners, providing writing suggestions on an input text. It is derived from Linggle, a language reference search engine, and was released in 2020 by NLPLab [26]. Unlike some other tools, it is only available as a web application and there is no browser extension. The tool is specifically designed for essay writing in an academic setting and was trained using the EF-Cambridge Open Language Database and the First Certificate in English dataset. The system behind it consists of four components: writing suggestions, essay scoring, GEC, and corrective feedback [26]. The writing suggestions are based on a dictionary of grammatical patterns, hand-built or extracted from a corpus. The model behind LinggleWrite utilizes a combination of an LSTM with an attention layer for essay scoring and BiLSTM-CRF, BERT, and Flair embeddings for GEC.
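Among the checkers mentioned above, LanguageTool is one that can also be queried programmatically, which makes the GEC/SEC part of the revision step easy to script. The sketch below uses the third-party language_tool_python wrapper as an assumed setup; it is given only as an illustration and is not one of the systems evaluated in this overview.

```python
# Minimal sketch of programmatic grammar/spelling checking with LanguageTool,
# via the third-party language_tool_python wrapper (pip install language-tool-python;
# it downloads and runs a local LanguageTool server on first use).
import language_tool_python

tool = language_tool_python.LanguageTool("en-US")

draft = "This sentence have several error , including grammar and punctuations ."
for match in tool.check(draft):
    # Each match carries the triggered rule, an explanation and suggested replacements.
    print(match.ruleId, "-", match.message, "->", match.replacements[:3])

print(tool.correct(draft))  # apply the top suggestion for every match
```

Unlike the move annotators discussed next, such a checker only sees local errors and has no notion of the document's argumentative structure.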
3.2.3. Move annotators

Mover, Research Writing Tutor, and AcaWriter are tools from the writing analytics domain, which is a sub-domain of learning analytics [15]. These tools are rhetorical move annotators and writing feedback tools. Their goal is to provide formative feedback to students by automatically identifying the argumentative structure of their text. The purpose of move annotators is to improve students' writing capacities and to help them revise their drafts through the visualization of the structure. These tools are more extensively described in [15].

Move annotators are usually academic tools and currently present two limitations. First, they are often linked to universities and their access is secured by a password. Secondly, in the literature on argumentative structure and moves, the introduction and the abstract have been the central focus, as they contain discourse information about the subject and content of the article. These parts contain much more rhetorical information than the others, but they are among the last to be written in the writing process. There is a lack of research and of proposed move frameworks formalizing the other parts of research articles.

Mover is a text structure (moves) analysis software developed by Laurence Anthony and George V. Lashkia in 2003. This software is intended to assist students in evaluating and revising their drafts [15]. The algorithm employed by Mover is a Naive Bayes classifier, trained in a supervised fashion on a corpus of 100 research abstracts in the field of information technology [27, 15]. The moves model utilized in the annotation process is the "Modified (CARS) Model" by [30]. The software can be downloaded from Laurence Anthony's website9.

9 https://www.laurenceanthony.net/software/antmover/

Research Writing Tutor (RWT) is a web-based application developed by Elena Cotos and Stephen Gilbert for academic writing assistance. It is composed of three modules, one of which provides feedback and analysis of written text. This module identifies the rhetorical structure and employs an extended move/step framework of the CARS model to color-code the structure for better visualization [28] across all sections of the IMRaD structure, comprising a total of 61 steps distributed over 14 moves. Based on this structural analysis, RWT provides feedback [15] on the use of moves, comparing the draft's move distribution to a goal distribution extracted from articles in the student's discipline [28]. Additionally, it analyzes the use of steps in the form of comments and clarifying questions about the rhetorical intent of a given sentence [28]. The distributions are extracted from 900 introductions in 30 disciplines [31]. For this supervised classification task, RWT uses probabilistic language models [32]. Currently, access to RWT is restricted to Iowa State University; however, guest accounts can be created upon request [28].

AcaWriter is a web-based application that is part of the Academic Writing Analytics (AWA) project by the University of Technology Sydney (UTS). AcaWriter uses rhetorically salient sentences as its moves framework and labels sentences using a rule-based system [15, 29]. When used for abstracts and introductions, it provides feedback on the order of the moves and on any missing ones. AcaWriter is accessible to UTS staff and students, and a demo version is available for external users. The authors also propose an open-source platform for institutions that wish to host their own version of AcaWriter10.

10 https://cic.uts.edu.au/tools/awa/

4. Conclusion and future directions

In this work, we highlighted the unique characteristics of scientific article writing as a highly codified genre. The type of assistance needed for writing an article varies depending on the task; we were particularly interested in the text revision task. We presented an overview of currently available writing assistance tools for this task and of how they address it. However, not all existing writing assistance tools have been covered, as they are too numerous; only a few representative ones have been selected. From our analysis of existing tools, we identify seven major challenges for future work:

1. Benchmarking performance: While our article presented a selection of tools, proposed a classification of these writing assistants and identified their available features, comparing the tools' performances is still an open issue. In fact, few evaluations accompany the tools developed, and it is currently impossible to compare their performances.
2. Considering a larger context: Currently, sentences are treated independently. Considering a larger context (for example, revision at the paragraph level) could be beneficial for the text revision task. One potential solution is to look into other domains, such as machine translation, to see how current models are trained to consider larger contexts [33, 34, 35, 36].
3. Taking discursive analysis into consideration: Discourse analysis makes it possible to annotate the organization of discourse and the links between pieces of information in the text [37, 38]. Taking it into account would allow capturing long-distance dependencies in the text, which is essential for a good organization of scientific documents.
4. Including the argumentative structure: Current tools usually only label each sentence to highlight the structure and give some feedback. However, there is a lack of guidance on how to effectively structure an argument and present evidence to support claims. One research direction would be to use argument mining techniques to study the relationships between arguments [18, 19].
5. Lack of available resources: Existing corpora for the revision task, such as IteraTeR [1] and arXivEdits [39], are composed of final articles and their versions before revision, collected from arXiv. However, an issue with arXiv as a source of data is that the first versions of submitted articles have already been proofread, and sometimes even revised by peers. Although there is no simple solution to this data problem, one possibility would be to ask researchers to contribute towards building such resources by providing early drafts of accepted papers.
6. Improving accessibility and transparency: Some writing tools are not publicly accessible, while others are not properly described in the research literature. The accessibility issue appears mostly with academic tools, some of which exist inside universities with access limited (for non-commercial use) to students and staff [3]. The transparency issue occurs mainly with commercial tools proposed by private companies, as they do not always publish papers on how they build and train their models, nor share their training data.
7. Emergence of ethical issues: The use of text revision tools raises ethical issues about plagiarism, intellectual property and the potential impact on the quality and integrity of research. Additionally, it is worth examining the impact of these tools on English learners, as they may facilitate the writing process to such an extent that proper writing skills and language knowledge are not developed [40, 41, 42].

References

[1] W. Du, Z. M. Kim, V. Raheja, D. Kumar, D. Kang, Read, revise, repeat: A system demonstration for human-in-the-loop iterative text revision, in: Proceedings of the First Workshop on Intelligent and Interactive Writing Assistants (In2Writing 2022), Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 96–108. URL: https://aclanthology.org/2022.in2writing-1.14. doi:10.18653/v1/2022.in2writing-1.14.
[2] R. A. Alves, T. Limpo, Progress in written language bursts, pauses, transcription, and written composition across schooling, Scientific Studies of Reading 19 (2015) 374–391.
[3] C. Strobl, E. Ailhaud, K. Benetos, A. Devitt, O. Kruse, A. Proske, C. Rapp, Digital support for academic writing: A review of technologies and pedagogies, Computers & Education 131 (2019) 33–48.
[4] E. D. Kallestinova, How to write your first research paper, The Yale Journal of Biology and Medicine 84 (2011) 181.
[5] S. Bourekkache, English for specific purposes: Writing scientific research papers. Case study: PhD students in the computer science department, Master's thesis, University of Biskra, Algeria, 2022.
[6] J. M. Swales, Genre Analysis: English in academic and research settings, The Cambridge Applied Linguistics Series, Cambridge University Press, 1990.
[7] L. B. Sollaci, M. G. Pereira, The introduction, methods, results, and discussion (IMRaD) structure: A fifty-year survey, Journal of the Medical Library Association 92 (2004) 364.
[8] E. A. Silveira, A. M. de Sousa Romeiro, M. Noll, Guide for scientific writing: How to avoid common mistakes in a scientific article, Journal of Human Growth and Development 32 (2022) 341–352.
[9] E. D. Laksmi, "Scaffolding" students' writing in EFL class: Implementing process approach, TEFLIN Journal 17 (2006) 144–156.
[10] S. Bailey, Academic writing: A handbook for international students, Routledge, 2014.
[11] A. Seow, The writing process and process writing, Methodology in Language Teaching: An Anthology of Current Practice (2002) 315–320.
[12] J. W. G. Putra, S. Teufel, T. Tokunaga, Annotating argumentative structure in English-as-a-foreign-language learner essays, Natural Language Engineering 28 (2022) 797–823. doi:10.1017/S1351324921000218.
[13] J. W. G. Putra, K. Matsumura, S. Teufel, T. Tokunaga, TIARA 2.0: An interactive tool for annotating discourse structure and text improvement, Language Resources and Evaluation (2021) 1–25.
[14] S. Teufel, J. Carletta, M. Moens, An annotation scheme for discourse-level argumentation in research articles, in: Ninth Conference of the European Chapter of the Association for Computational Linguistics, 1999, pp. 110–117.
[15] S. Knight, S. Abel, A. Shibani, Y. K. Goh, R. Conijn, A. Gibson, S. Vajjala, E. Cotos, Á. Sándor, S. B. Shum, Are you being rhetorical? A description of rhetorical move annotation tools and open corpus of sample machine-annotated rhetorical moves, Journal of Learning Analytics 7 (2020) 138–154.
[16] S. Teufel, A. Siddharthan, C. Batchelor, Towards domain-independent argumentative zoning: Evidence from chemistry and computational linguistics, in: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009, pp. 1493–1502.
[17] S. Teufel, Argumentative zoning: Information extraction from scientific text, Ph.D. thesis, University of Edinburgh, 1999.
[18] B. Liu, V. Schlegel, R. T. Batista-Navarro, S. Ananiadou, Incorporating zoning information into argument mining from biomedical literature, in: Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022, pp. 6162–6169.
[19] J. Lawrence, C. Reed, Argument mining: A survey, Computational Linguistics 45 (2020) 765–818.
[20] M. Liakata, S. Teufel, A. Siddharthan, C. Batchelor, Corpora for the conceptualisation and zoning of scientific papers (2010).
[21] J. Li, Z. Li, T. Ge, I. King, M. R. Lyu, Text revision by on-the-fly representation optimization, arXiv preprint arXiv:2204.07359 (2022).
[22] T. Ito, T. Kuribayashi, H. Kobayashi, A. Brassard, M. Hagiwara, J. Suzuki, K. Inui, Diamonds in the rough: Generating fluent sentences from early-stage drafts for academic writing assistance, in: Proceedings of the 12th International Conference on Natural Language Generation, Association for Computational Linguistics, Tokyo, Japan, 2019, pp. 40–53. URL: https://aclanthology.org/W19-8606. doi:10.18653/v1/W19-8606.
[23] R. Dale, A. Kilgarriff, Helping our own: The HOO 2011 pilot shared task, in: Proceedings of the 13th European Workshop on Natural Language Generation, Association for Computational Linguistics, Nancy, France, 2011, pp. 242–249. URL: https://aclanthology.org/W11-2838.
[24] T. Ito, T. Kuribayashi, M. Hidaka, J. Suzuki, K. Inui, Langsmith: An interactive academic text revision system, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Association for Computational Linguistics, Online, 2020, pp. 216–226. URL: https://aclanthology.org/2020.emnlp-demos.28. doi:10.18653/v1/2020.emnlp-demos.28.
[25] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, et al., Training language models to follow instructions with human feedback, arXiv preprint arXiv:2203.02155 (2022).
[26] C.-T. Tsai, J.-J. Chen, C.-Y. Yang, J. S. Chang, LinggleWrite: A coaching system for essay writing, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Online, 2020, pp. 127–133. URL: https://aclanthology.org/2020.acl-demos.17. doi:10.18653/v1/2020.acl-demos.17.
[27] L. Anthony, G. V. Lashkia, Mover: A machine learning tool to assist in the reading and writing of technical papers, IEEE Transactions on Professional Communication 46 (2003) 185–193.
[28] E. Cotos, Computer-assisted research writing in the disciplines, in: Adaptive Educational Technologies for Literacy Instruction, Routledge, 2016, pp. 225–242.
[29] S. Knight, A. Shibani, S. Abel, A. Gibson, P. Ryan, AcaWriter: A learning analytics tool for formative feedback on academic writing, Journal of Writing Research (2020).
[30] L. Anthony, Writing research article introductions in software engineering: How accurate is a standard model?, IEEE Transactions on Professional Communication 42 (1999) 38–46.
[31] E. Cotos, S. Huffman, S. Link, Understanding graduate writers' interaction with and impact of the research writing tutor during revision, Journal of Writing Research 12 (2020) 187–232.
[32] E. Cotos, S. Huffman, S. Link, Furthering and applying move/step constructs: Technology-driven marshalling of Swalesian genre theory for EAP pedagogy, Journal of English for Academic Purposes 19 (2015) 52–72.
[33] S. Majumder, S. Lauly, M. Nadejde, M. Federico, G. Dinu, A baseline revisited: Pushing the limits of multi-segment models for context-aware translation, CoRR abs/2210.10906 (2022). URL: https://doi.org/10.48550/arXiv.2210.10906. doi:10.48550/arXiv.2210.10906. arXiv:2210.10906.
[34] B. Li, H. Liu, Z. Wang, Y. Jiang, T. Xiao, J. Zhu, T. Liu, C. Li, Does multi-encoder help? A case study on context-aware neural machine translation, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 3512–3518. URL: https://aclanthology.org/2020.acl-main.322. doi:10.18653/v1/2020.acl-main.322.
[35] Y. Feng, F. Li, Z. Song, B. Zheng, P. Koehn, Learn to remember: Transformer with recurrent memory for document-level machine translation, in: Findings of the Association for Computational Linguistics: NAACL 2022, Association for Computational Linguistics, Seattle, United States, 2022, pp. 1409–1420. URL: https://aclanthology.org/2022.findings-naacl.105. doi:10.18653/v1/2022.findings-naacl.105.
[36] J. Chen, X. Li, J. Zhang, C. Zhou, J. Cui, B. Wang, J. Su, Modeling discourse structure for document-level neural machine translation, in: Proceedings of the First Workshop on Automatic Simultaneous Translation, Association for Computational Linguistics, Seattle, Washington, 2020, pp. 30–36. URL: https://aclanthology.org/2020.autosimtrans-1.5. doi:10.18653/v1/2020.autosimtrans-1.5.
[37] M. Taboada, W. C. Mann, Rhetorical structure theory: Looking back and moving ahead, Discourse Studies 8 (2006) 423–459.
[38] L. Danlos, Analyse discursive et informations de factivité (Discursive analysis and factivity information), in: Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, ATALA, Montpellier, France, 2011, pp. 364–375. URL: https://aclanthology.org/2011.jeptalnrecital-long.32.
[39] C. Jiang, W. Xu, S. Stevens, arXivEdits: Understanding the human revision process in scientific writing, in: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022.
[40] N. Shintani, R. Ellis, The comparative effect of direct written corrective feedback and metalinguistic explanation on learners' explicit and implicit knowledge of the English indefinite article, Journal of Second Language Writing 22 (2013) 286–306. URL: https://www.sciencedirect.com/science/article/pii/S1060374313000271. doi:10.1016/j.jslw.2013.03.011.
[41] A. Sampson, Coded and uncoded error feedback: Effects on error frequencies in adult Colombian EFL learners' writing, System 40 (2012) 494–504. URL: https://www.sciencedirect.com/science/article/pii/S0346251X12000772. doi:10.1016/j.system.2012.10.001.
[42] S. Chen, H. Nassaji, Q. Liu, EFL learners' perceptions and preferences of written corrective feedback: A case study of university students from mainland China, Asian-Pacific Journal of Second and Foreign Language Education 1 (2016) 1–17.