<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023), CEUR.org, Parma, Italy</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.18653/v1</article-id>
      <title-group>
        <article-title>LangLearn: Language Development Assessment Model based on Sequential Information Attention Mechanism</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hongyan Wu</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nankai Lin</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shengyi Jiang</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lixian Xiao</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Language Development Assessment, Sequential Information Attention Mechanism, BERT,</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Asian Languages and Cultures, Guangdong University of Foreign Studies</institution>
          ,
          <addr-line>Guangzhou, Guangdong</addr-line>
          ,
          <country country="CN">PR China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science and Technology, Guangdong University of Technology</institution>
          ,
          <addr-line>Guangzhou, Guangdong</addr-line>
          ,
          <country country="CN">PR China</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Information Science and Technology, Guangdong University of Foreign Studies</institution>
          ,
          <addr-line>Guangzhou, Guangdong</addr-line>
          ,
          <country country="CN">PR China</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>cessing and Speech Tools for Italian</institution>
          ,
          <addr-line>Sep 7 - 8, Parma, IT</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>1</volume>
      <fpage>49</fpage>
      <lpage>58</lpage>
      <abstract>
        <p>In recent years, investigations into language acquisition have greatly benefited from the utilization of natural language processing technologies, particularly in analyzing extensive corpora of authentic texts produced by learners across the realms of first and second language acquisition. A crucial task in this domain is the assessment of language learners' language ability development. The “Language Learning Development” task featured in EVALITA 2023 [1] marks a significant milestone as the inaugural shared task focused on automated language development assessment, which entails predicting the relative order of two essays written by the same student. We introduce a novel attention mechanism, namely the sequential information attention mechanism, with the primary objective of exploiting the information interaction between sequential texts. Experimental results on the COWS dataset show the effectiveness of our proposed sequential information attention mechanism, showcasing its substantial impact on model performance during the final evaluation phase.</p>
      </abstract>
      <kwd-group>
        <kwd>Language Development Assessment</kwd>
        <kwd>Sequential Information Attention Mechanism</kwd>
        <kwd>BERT</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Mechanism</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Recently, there has been a surge of interest in harnessing the potential of natural language processing (NLP) tools and machine learning techniques to explore the realm of language development, both in first (L1) and second language (L2) acquisition scenarios. The primary focus lies in investigating the linguistic attributes of learners and the dynamic evolution of their language ability across different modalities and stages of acquisition. The utilization of learner corpora and the enhanced dependability of linguistic features extracted through computational tools and machine learning techniques have significantly advanced our comprehension of the linguistic properties exhibited by language learners. The empirical evidence has shed light on the temporal dynamics and the evolution of these language properties as learners progress in language ability [<xref ref-type="bibr" rid="ref2">2</xref>].</p>
      <p>A significant focus of scholarly inquiry has been directed towards the exploration of various avenues for advancing the field of language development research, such as tracking the progression of language acquisition.</p>
      <p>The “Language Learning Development” task featured in EVALITA 2023 is concerned with predicting the chronological sequence of essays produced by the same student over different periods. We introduce a novel attention mechanism, namely the sequential information attention mechanism (SIAM), intending to exploit the information interaction between sequential texts. We submitted three results in total, namely the fine-tuned BERT model (Run 2), the fine-tuned BERT model with SIAM (Run 3), and the fusion of the results of the previous two models (Run 1). Experimental results demonstrate that our proposed sequential information attention mechanism has a remarkable impact on model performance during the final evaluation phase.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related</title>
    </sec>
    <sec id="sec-4">
      <title>Work</title>
      <p>The existing research on the language development assessment task is mainly divided into two types: one focuses on the construction of language ability development assessment models based on linguistic features, while the other is concerned with the construction of language ability development assessment models based on neural networks.</p>
      <p>Given the inherent challenge of establishing a unique indicator of linguistic complexity within the domain of second language (L2) development, a diverse range of features spanning various linguistic levels have been employed as inputs for supervised classification systems. These systems are trained on genuine learner data pertaining to different L2 languages. Notable examples include the works of Hancke and Meurers [<xref ref-type="bibr" rid="ref6">6</xref>] as well as Vajjala and Lõo [<xref ref-type="bibr" rid="ref7">7</xref>], which respectively investigated L2 German and L2 Estonian. Pilán and Volodina [8] provided a comprehensive analysis of predictive features extracted from both receptive and productive texts within the context of Swedish L2 acquisition. Miaschi et al. [9] used various linguistic features automatically extracted from students’ written expressions to track the evolution of written language abilities of second-language Spanish learners. Furthermore, Miaschi et al. [<xref ref-type="bibr" rid="ref5">5</xref>] proposed a natural language processing-based stylometric measure to track the evolution of Italian L1 learners’ written language competence, which relied on capturing a range of linguistically motivated features in terms of text style. In a study conducted by Bulté and Housen [10], the objective was to determine the nature and extent of English L2 writing proficiency development among 45 adult ESL learners throughout the duration of an intensive short-term academic English language program. The investigation employed quantitative measures that specifically targeted various aspects of lexical and syntactic complexity exhibited in the learners’ writing performance. Additionally, the study aimed to establish a comparison between the scores obtained from these measures and the subjective ratings provided for the overall writing quality of the learners.</p>
      <p>Recent work on the application of neural networks to language modeling has shown that models based on certain neural architectures can capture syntactic information from utterances and sentences even without explicit syntactic goals. Sagae [11] conducted a study to determine whether a fully data-driven model of language development, utilizing a recurrent neural network encoder to encode utterances, could track changes in children’s language over the course of their language development in a manner comparable to expertly established language assessment metrics that leverage language-specific information.</p>
      <p>The untapped potential of neural networks in language development assessment tasks necessitates further exploration, as the application of pre-trained models in this context has not been investigated.</p>
    </sec>
    <sec id="sec-5">
      <title>3. Method</title>
      <sec id="sec-5-1">
        <title>3.1. Overview</title>
        <p>This section provides an overview of our methodology. Initially, we concatenate the historical text and the current text, and utilize the pre-trained model BERT to encode them. Subsequently, we employ the sequential information attention mechanism to capture the interaction of information within the sequence of texts, thereby updating the representation of the historical text to obtain an improved global representation. Ultimately, we combine the enhanced global representation of the historical text with the original sentence representation for the final assessment of language development.</p>
      </sec>
      <sec id="sec-5-2">
        <title>3.2. Text Representation</title>
        <p>Aiming to effectively capture the intricate semantic information embedded within the text, we employ the non-autoregressive pre-trained model BERT [12], renowned for its remarkable performance in generating text-based semantic representations for sentence encoding. BERT possesses abundant linguistic, syntactic, and lexical knowledge, which is acquired through unsupervised training on a substantial corpus during the pre-training phase. The fundamental architecture of the model encompasses a multi-layer bidirectional Transformer encoder [13], facilitating global information processing and extraction. Given a historical text $A = \{a_1, a_2, a_3, \dots, a_n\}$ and a current text $B = \{b_1, b_2, b_3, \dots, b_m\}$, the two special tokens [CLS] and [SEP] of BERT are utilized to stitch them together, forming the text input $X = \{[\mathrm{CLS}], a_1, a_2, a_3, \dots, a_n, [\mathrm{SEP}], b_1, b_2, b_3, \dots, b_m, [\mathrm{SEP}]\}$, where $n$ and $m$ denote the lengths of the two texts respectively. Encoding $X$ with BERT yields the semantic representation $H = \{h_{[\mathrm{CLS}]}, h_1, h_2, h_3, \dots, h_n, h_{[\mathrm{SEP}]}, h'_1, h'_2, h'_3, \dots, h'_m, h'_{[\mathrm{SEP}]}\} \in \mathbb{R}^{(n+m+3) \cdot d}$, where $d$ denotes the dimension of the semantic representation. The semantic representations of text $A$ and text $B$ are then, respectively:</p>
        <p>$H_A = \{h_1, h_2, h_3, \dots, h_n\}$</p>
        <p>$H_B = \{h'_1, h'_2, h'_3, \dots, h'_m\}$</p>
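        <p>To make the encoding step concrete, the following is a minimal sketch of how the text pair can be stitched together and encoded with the Hugging Face transformers library; the checkpoint matches the one named in Section 4.1, while the example texts and variable names are our own illustration.</p>
        <preformat>
# A minimal sketch of the text-pair encoding of Section 3.2, assuming the
# dbmdz Italian BERT checkpoint named in Section 4.1.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-italian-uncased")
encoder = AutoModel.from_pretrained("dbmdz/bert-base-italian-uncased")

historical_text = "Primo tema dello studente ..."    # text A (illustrative)
current_text = "Tema successivo dello studente ..."  # text B (illustrative)

# Passing the two texts as a pair builds [CLS] A [SEP] B [SEP] automatically.
inputs = tokenizer(historical_text, current_text,
                   truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # H, shape (1, seq_len, d)

# token_type_ids mark segment A with 0 and segment B with 1, so H_A and H_B
# can be recovered by masking (special tokens kept here for brevity).
segment = inputs["token_type_ids"][0]
h_a = hidden[0][segment.eq(0)]  # token representations over segment A
h_b = hidden[0][segment.eq(1)]  # token representations over segment B
        </preformat>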
      </sec>
      <sec id="sec-5-3">
        <title>3.4. Language Development Assessment</title>
        <p>ℎ
[]
We concatenate the enhanced global representation ℎ of
to obtain a text representation for classification:
ℎ = (ℎ
[]
, ℎ )
 =  ( )
..., ℎ , ℎ[]</p>
        <p>} ∈  (+)⋅
of semantic representation.
  is respectively:
  = {ℎ1, ℎ2, ℎ3, ..., ℎ }</p>
        <p>= {ℎ1, ℎ2, ℎ3, ..., ℎ }</p>
      </sec>
      <sec id="sec-5-4">
        <title>3.3. Sequential Information Attention</title>
      </sec>
      <sec id="sec-5-5">
        <title>Mechanism</title>
        <p>We present an innovative attention mechanism known as
the sequential information attention mechanism (SIAM),
specifically designed to exploit information interaction</p>
        <p>Then the semantic representation of text   and text
(4)
(5)
(6)
it as ( 2, 3,1). Then we construct sample 3 based on
associated with the token “[CLS]” within the given sen- the above two samples. In terms of sample 3, essay 1
tence. The representation ℎ is utilized as the sentence’s
overall feature representation, which is subsequently fed
into a linear classifier with a softmax function. The
preappears before essay 3, which is defined as (  1, 3,1). In
addition, we expand the negative samples based on the
above positive samples, namely ( 2, 1,0), ( 3, 2,0), and
dicted probabilities language development assessment of ( 3, 1,0), where ’0’ represents the the negative sample.
the text   .</p>
        <p>where  is the one-hot encoding of the text’s actual
expected value. When  = 1, 0 means that the writing
time of the text   is before the text   ; otherwise, when
 = 0, 1 means that the writing time of the text   is after</p>
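        <p>The classification head can be sketched as follows; the module name and the BERT-base hidden size of 768 are illustrative assumptions, and the softmax is folded into the cross-entropy loss, as is idiomatic in PyTorch.</p>
        <preformat>
# A sketch of the head of Section 3.4: concatenate h_[CLS] with the SIAM
# output h_A, apply a linear classifier, and train with cross-entropy
# (which fuses the softmax). Names and sizes are illustrative.
import torch
import torch.nn as nn

class OrderClassifier(nn.Module):
    def __init__(self, hidden_size: int = 768, num_labels: int = 2):
        super().__init__()
        self.linear = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, h_cls: torch.Tensor, h_a: torch.Tensor) -> torch.Tensor:
        h = torch.cat([h_cls, h_a], dim=-1)  # h = (h_[CLS], h_A)
        return self.linear(h)                # logits over the two order labels

loss_fn = nn.CrossEntropyLoss()  # cross-entropy over y = (1, 0) / (0, 1)
        </preformat>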
      </sec>
    </sec>
    <sec id="sec-6">
      <title>4. Experiments</title>
      <sec id="sec-6-1">
        <title>4.1. Experimental Setup</title>
        <p>All experimental procedures are conducted on an NVIDIA A30 24-GB GPU. We utilize PyTorch [14] and Transformers [15] to build our models. Considering the similarity between the two languages, we only use the Italian BERT model (dbmdz/bert-base-italian-uncased), as we think it also contains a small amount of Spanish information. The feed-forward layer is initialized using weights drawn from a truncated normal distribution with a standard deviation of 2e-2, while the bias is initialized to zero. A fixed initial learning rate of 5e-5 is applied consistently across all experiments. The maximum sequence length is set to 512, representing the prescribed constraint on the number of tokens within a sentence. To optimize training, a warmup proportion of 1e-3 is implemented. Training spans 10 epochs with a batch size of 4.</p>
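        <p>For reference, the setup above translates into a configuration along the following lines; the AdamW optimizer and the linear warmup schedule are plausible defaults rather than details stated explicitly here, and only the numeric values are taken from this section.</p>
        <preformat>
# Hyperparameters reported in Section 4.1, wired into a plausible training
# configuration. Optimizer and scheduler choices are assumptions.
import torch
from transformers import get_linear_schedule_with_warmup

MAX_SEQ_LENGTH = 512   # prescribed token limit per input
BATCH_SIZE = 4
EPOCHS = 10
LEARNING_RATE = 5e-5   # fixed initial learning rate
WARMUP_PROPORTION = 1e-3

def init_feed_forward(layer: torch.nn.Linear) -> None:
    # Truncated normal weights with std 2e-2; zero bias, as described above.
    torch.nn.init.trunc_normal_(layer.weight, std=2e-2)
    torch.nn.init.zeros_(layer.bias)

def make_optimizer_and_scheduler(model, num_training_steps: int):
    optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=int(WARMUP_PROPORTION * num_training_steps),
        num_training_steps=num_training_steps,
    )
    return optimizer, scheduler
        </preformat>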
      </sec>
      <sec id="sec-6-2">
        <title>4.2. Datasets</title>
        <p>The datasets provided by the EVALITA 2023 “Language Learning Development” task come from two sources, CItA [16] and COWS-L2H [9], where the numbers of training samples are 2394 and 1009 respectively. We perform data augmentation based on the datasets. Specifically, if essay 1 in sample 1 appears before essay 2, we describe it as $(e_1, e_2, 1)$, where ‘1’ denotes the positive sample. Likewise, when essay 2 in sample 2 appears before essay 3, we describe it as $(e_2, e_3, 1)$. Then we construct sample 3 based on the above two samples: in terms of sample 3, essay 1 appears before essay 3, which is defined as $(e_1, e_3, 1)$. In addition, we expand the negative samples based on the above positive samples, namely $(e_2, e_1, 0)$, $(e_3, e_2, 0)$, and $(e_3, e_1, 0)$, where ‘0’ represents the negative sample. The scales of the augmented datasets for CItA and COWS-L2H are 5056 and 2042, respectively. In the training set, each positive sample can match a corresponding negative sample, so the numbers of positive and negative samples in the dataset are consistent. Ultimately, the two datasets are combined to get a new training set.</p>
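        <p>The pair-expansion rule can be sketched as follows; essays are assumed to be given in chronological order for each student, and the helper name is ours.</p>
        <preformat>
# Data augmentation of Section 4.2: every ordered pair of essays by the
# same student becomes a positive sample (label 1) and its reversal a
# negative sample (label 0). Assumes chronologically sorted input.
from itertools import combinations

def augment(essays):
    """essays: list of texts by one student, sorted by writing time."""
    samples = []
    for earlier, later in combinations(essays, 2):
        samples.append((earlier, later, 1))  # positive: correct order
        samples.append((later, earlier, 0))  # negative: reversed order
    return samples

# Three essays yield (e1, e2, 1), (e2, e3, 1), (e1, e3, 1) plus the three
# reversed negative samples, matching the description above.
        </preformat>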
        <p>To ensure a sound evaluation of our strategies, we employ a 5-fold cross-validation methodology, which involves dividing the datasets into five distinct subsets to construct an ensemble model that exhibits enhanced generalization capabilities. More precisely, four of these subsets are assigned for training purposes, while the remaining subset is utilized for validation. The evaluation results of our strategies are derived by averaging the outcomes obtained from the five models.</p>
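        <p>A minimal sketch of this protocol is shown below; the use of scikit-learn's KFold and the fixed random seed are our assumptions.</p>
        <preformat>
# The 5-fold protocol of Section 4.2: train on four subsets, validate on
# the fifth, and average the outcomes of the five models.
import numpy as np
from sklearn.model_selection import KFold

def cross_validate(samples, train_and_eval):
    """train_and_eval(train_idx, val_idx) returns one validation score."""
    kfold = KFold(n_splits=5, shuffle=True, random_state=42)
    scores = [train_and_eval(tr, va) for tr, va in kfold.split(samples)]
    return float(np.mean(scores))  # averaged outcome over the five folds
        </preformat>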
      </sec>
      <sec id="sec-6-3">
        <title>4.3. Submission</title>
        <p>We submit three results in total, namely the fine-tuned BERT model (Run 2), the fine-tuned BERT model with SIAM (Run 3), and the merge method (Run 1). The fine-tuned BERT model is obtained by fine-tuning BERT directly on the dataset; concretely, it is the model of Section 3 with the sequential information attention mechanism removed. The merge method is the fusion of the output probabilities of the fine-tuned BERT model (Run 2) and the fine-tuned BERT model with SIAM (Run 3).</p>
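        <p>The merge method can be read as probability-level fusion; since the exact combination rule is not spelled out here, the equal-weight average below is an assumption.</p>
        <preformat>
# A sketch of the Run 1 merge method: fuse the output probabilities of the
# fine-tuned BERT model and the BERT model with SIAM. An equal-weight
# average followed by argmax is assumed.
import numpy as np

def merge_predictions(probs_bert, probs_siam):
    """Each argument: array of shape (num_samples, 2) of softmax outputs."""
    fused = (np.asarray(probs_bert) + np.asarray(probs_siam)) / 2.0
    return fused.argmax(axis=-1)  # predicted order label per sample
        </preformat>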
      </sec>
      <sec id="sec-6-4">
        <title>4.4. Experimental results</title>
        <p>Experimental results in the evaluation phase are shown
in Table 1.</p>
        <p>It can be seen that on the CItA test set, the BERT model achieves the best performance, with scores of 0.9338, 0.9315 and 0.9316 on the three evaluation metrics respectively, while the BERT model with SIAM has slightly declined. We deem that this decline can be attributed to our method being trained on two corpora simultaneously: to some extent, the information of the two corpora affects each other, sacrificing performance on the CItA dataset in exchange for the improvement on the COWS dataset. Concerning the merge method, regardless of whether the CItA test set, the COWS test set or the combined test set is considered, the strategy of model fusion is powerless and has not brought effective improvement.</p>
        <p>Our proposed sequential information attention mechanism has demonstrated substantial improvements on both the COWS test set and the combined test set. Specifically, on the COWS test set, the BERT model with SIAM outperforms the BERT model by 0.0427, 0.0300, and 0.0313 on the three evaluation indicators, respectively. Likewise, on the combined test set, the BERT model with SIAM gains consistent improvements of 0.0143, 0.0126, and 0.0128 in the three metrics over the BERT model.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>5. Conclusion</title>
      <p>The “Language Learning Development” task revolves around accurately predicting the sequential order of two essays authored by a single student. In this study, we make a first attempt to tackle the task by leveraging a high-performing pre-trained language model, demonstrating the strong potential of pre-trained language models to solve the language development assessment task. Moreover, we present a novel attention mechanism, known as sequential information attention, designed to effectively capture and leverage the interaction of information within sequential texts. In the final evaluation stage, experimental results reveal the effectiveness of our proposed method, substantiating that sequential information attention contributes to tracking the evolution of language competence.</p>
      <p>In the future, we will further explore neural networks that extract language features suited to language development assessment tasks, so as to further improve the performance of the model and drive advancements in the field of language development assessment.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work was supported by the Guangdong Philosophy
and Social Science Foundation (No. GD20CWY10), the
National Social Science Fund of China (No. 22BTQ045),
and the Science and Technology Program of Guangzhou
(No.202002030227).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Menini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Polignano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Venturi</surname>
          </string-name>
          ,
          <article-title>EVALITA 2023: Overview of the 8th evaluation campaign of natural language processing and speech tools for Italian</article-title>
          , in:
          <source>Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)</source>
          , CEUR.org, Parma, Italy,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Crossley</surname>
          </string-name>
          ,
          <article-title>Linguistic features in writing quality and development: An overview</article-title>
          ,
          <source>Journal of Writing Research</source>
          <volume>11</volume>
          (
          <year>2020</year>
          )
          <fpage>415</fpage>
          -
          <lpage>443</lpage>
          . URL: https://jowr.org/index.php/jowr/article/view/582. doi:10.17239/jowr-2020.11.03.01.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sagae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lavie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>MacWhinney</surname>
          </string-name>
          ,
          <article-title>Automatic measurement of syntactic development in child language, in: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), Association for Computational Linguistics</article-title>
          , Ann Arbor, Michigan,
          <year>2005</year>
          , pp.
          <fpage>197</fpage>
          -
          <lpage>204</lpage>
          . URL: https://aclanthology.org/P05-1025. doi:10.3115/1219840.1219865.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>X.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Automatic measurement of syntactic complexity in child language acquisition</article-title>
          ,
          <source>International Journal of Corpus Linguistics</source>
          <volume>14</volume>
          (
          <year>2009</year>
          )
          <fpage>3</fpage>
          -
          <lpage>28</lpage>
          . doi:10.1075/ijcl.14.1.02lu.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Miaschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Brunato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dell'Orletta</surname>
          </string-name>
          ,
          <article-title>An NLP-based stylometric approach for tracking the evolution of L1 written language competence</article-title>
          ,
          <source>Journal of Writing Research</source>
          <volume>13</volume>
          (
          <year>2021</year>
          )
          <fpage>71</fpage>
          -
          <lpage>105</lpage>
          . URL: https://www.jowr.org/index.php/jowr/article/view/778. doi:10.17239/jowr-2021.13.01.03.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hancke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Meurers</surname>
          </string-name>
          ,
          <article-title>Exploring CEFR classification for German based on rich linguistic modeling</article-title>
          ,
          <year>2013</year>
          , pp.
          <fpage>54</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vajjala</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lõo</surname>
          </string-name>
          ,
          <article-title>Automatic CEFR level prediction for Estonian learner text, in: Proceedings of the third workshop on NLP for computer-assisted language learning</article-title>
          , LiU Electronic Press, Uppsala, Sweden.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>