Dialogue in Abridged Literary Texts in Japanese: Observations and Automatic Reproducibility

Iglika Nikolova-Stoupak, Sens Texte Informatique Histoire, Sorbonne Université, Paris, France

Eva Schaeffer-Lacroix, Sens Texte Informatique Histoire, Sorbonne Université, Paris, France

Gaël Lejeune (gael.lejeune@sorbonne-universite.fr), Sens Texte Informatique Histoire, Sorbonne Université, Paris, France

ORCID: 0000-0001-5230-3458 (I. Nikolova-Stoupak), 0000-0002-6260-9095 (E. Schaeffer-Lacroix), 0000-0002-4795-2362 (G. Lejeune)

ISSN 1613-0073

Keywords: Japanese dialogue, dialogue in literature, literary abridgement

This study investigates the language utilised in dialogues in abridged literary texts in Japanese that are commonly appreciated by language learners. The focus on dialogue is essential, as it uniquely illustrates the dynamics between interlocutors whilst highlighting cultural and linguistic norms. The study explores phenomena typical of the Japanese language, such as registers of formality, the use of end-of-sentence particles and explicit subject markers, as well as universal linguistic features commonly used in readability formulas and considered a proxy for textual complexity. Three world-famous literary texts as abridged by popular publishers (e.g. 'Aoitori Bunko') are considered, and insight is sought concerning how the dialogues within them compare to the texts in their entirety as well as to the dialogues in unabridged versions of the same texts. Ultimately, an experiment is proposed in order to test the overall validity of the identified linguistic patterns, in which the state-of-the-art GPT-4o model is provided with relevant guidelines and asked to simplify literary dialogue.

Introduction

Advanced dialogue systems such as OpenAI's popular chatbot ChatGPT and a growing fascination with Japanese-language cultural artefacts like animations and comic books coincidentally mark the current global landscape, particularly in relation to a young audience. A natural intersection point of the two phenomena is the educational sphere, where AI-based tools are increasingly employed as learning support and the number of learners of Japanese as a foreign language is increasing perceptibly [1]. As original Japanese materials targeted at learners of the language are scarce, learners often resort to closely compatible alternative sources, such as the book series 'Aoitori Bunko' [2] and 'Tsubasa Bunko' [3], which offer adaptations of popular texts targeted at young Japanese readers. Exploiting the qualities of these materials for the automatic or partially automatic generation of additional ones is a route worth exploring, as it would benefit learners in terms of material suitability and availability, and professional educators and writers in terms of time and effort.

The current work aims to provide quantifiable insights into the characteristics of abridged texts in the Japanese language for future use in the context of automatic language generation. Specifically, the focus is initially placed on the rendition of dialogues, as they concentrate an array of specificities of Japanese culture, including interpersonal dynamics and levels of politeness, which in turn are expressed through language in standard ways, such as the chosen register (long vs short verb forms). The dialogues in three adapted literary texts (Alice's Adventures in Wonderland, Les Miserables and Tom Sawyer) are examined with a focus on typical features of the Japanese language pertaining to dialogue (e.g. the use of end-of-sentence particles) as well as readability-based characteristics (e.g. type-to-token ratio). Comparisons are made, on the one hand, between the characteristics of the dialogue portions of the texts and the texts taken as a whole and, on the other hand, between dialogue in the abridged texts and in their unabridged counterparts. Based on these comparisons, guidelines for the simplification of literary dialogue in Japanese are established and, by extension, conclusions can be drawn about the features of such dialogues that render them easy or difficult for readers. In a related experiment, the GPT-4o model (as accessed through ChatGPT) is asked to provide a simplified version of a literary dialogue in Japanese in three discrete settings: zero-shot, one-shot (in which a reference pair of unabridged and abridged versions of a dialogue-filled text is provided) and chain-of-thought prompt engineering that incorporates the quantified norms established by the current work.

Background

Automatic Dialogue Generation

Work on automatic dialogue generation distinguishes between task-oriented and open-domain systems. The former are associated with a specific domain and, typically, specific knowledge and databases. In contrast, open-domain systems are broader in scope and may be generative, retrieval-based or ensemble. Dialogue systems, like the majority of NLP-related technology, have developed over the past few decades from rule-based through machine-learning-driven to neural approaches. In the last couple of years, Large Language Models (LLMs) such as GPT have been particularly revolutionary in the field.

The evaluation of automatic dialogue is a challenging task due to dialogue's multiple and not readily quantifiable characteristics. Whilst task-oriented or retrieval-based open-domain dialogue systems might benefit from the application of BLEU scores, which compare their responses to human-made ones, the same cannot be said of systems that support a freer textual format. In their survey on the topic, [4] mention the relevance of multiple evaluation criteria, including interactiveness, outside knowledge, coherence, speaker consistency, empathy and the system's ability to keep the conversation going.

Dialogue systems have been applied in foreign language education since the 1960s [5]. Their advantages include ease of access, unlimited time of use, and the opportunity for students to practice their target language in a stress-free environment. In their review of 28 articles on the topic of AI dialogue systems used in language education, [5] point out that an optimal system provides appropriate, conventional and efficient dialogue, and that it is especially challenging but important for it to also guarantee sufficient humour and cross-cultural engagement. Whilst most automatic dialogue systems conceived for educational purposes are focused on the English language, exceptions do exist, such as [6]'s Subarashii (2000), which employed automatic speech recognition and rule-based logic to aid students of Japanese as a foreign language in beginner conversation. Today, chatbots like ChatGPT can engage in high-level conversation in multiple languages, including Japanese.

Dialogue in Literary Texts

A dialogue is an exchange between two or more parties that is associated with a specific setting and semantics. It is important to underline that the dialogues present in literary texts are by no means identical to those that occur in real life. [7] points out that real-life dialogues last until a constraint (e.g. time) has been imposed or a conclusion has been reached. Simple repetitions or utterances lacking information commonly make up a significant part of an exchange, and so-called 'dialogue-of-the-deaf' or 'pseudo-dialogue' may occur due to excessive agreement or, conversely, a lack thereof. In contrast, literary dialogues have been pre-elaborated by the author and necessarily serve narrative purposes, such as advancement of the plot, demonstration of character qualities or the creation of a specific mood. [8] goes so far as to make the following drastic distinction between the two types of dialogue: "it is not the purpose of dialogue to reproduce conversation naturalistically but rather, in the guise of conversation, to supply conversation's deficiencies - to be amusing where conversation is dull, to be economic where conversation is wasteful, to be articulate and lucid where conversation is mumbling or obscure" (p.7).

Dialogue in the Japanese Language

Japan is known for its specific (and often perplexing for foreigners) social norms, especially when it comes to formal communication. Naturally, these norms are to a great extent reflected verbally in interpersonal dialogue in the Japanese language.

To start with, several registers, each associated with specific verb conjugation, occur in Japanese depending on the interlocutor's perceived 'status'. A very simplified rule of thumb would require that one utilise short forms (e.g. da) with close friends and family, -masu forms (e.g. desu) in the majority of circumstances, 'humble' forms (e.g. de gozaimasu) when referring to oneself in front of a superior and 'honorific' forms (e.g. de irasshaimasu) when addressing or referring to a superior interlocutor. However, as [9] notes, additional criteria of use are also at play, such as the participants' mood and dynamics or the intention to use a construction that is only compatible with certain verb forms.

Examples of constructions that exert significant influence on their context are end-of-sentence particles. They are specific to and almost indispensable in Japanese dialogue. The particles zo, ze, sa and yo are associated with insistence, whilst others like ne and na denote a request for confirmation. However, practice once again shows that the use of these particles is not so easily generalisable. In fact, multiple and sometimes conflicting theories have been brought forward concerning the nature of individual particles and the differentiation between them. For instance, the speakers' 'territory of information' is often invoked [9]. In contrast, other theories underline the role of particles in dialogue coordination and persuasion dynamics [10].

Unlike English and other Western languages, Japanese introduces subject and object roles within a sentence via specific case markers that precede the associated verb. In addition, a 'topic' marker (wa) is also commonly used, whose function, albeit similar to that of the subject marker, is different in that it is not directly associated with syntactic roles. The absence of such explicit markers, which is common in dialogue, is a case of anaphora referred to as 'zero pronoun'. Its function is seen as comparable to that of personal pronouns in English [11].

Methods

Texts

The primary literary texts investigated are abridged versions of Mark Twain's The Adventures of Tom Sawyer, Victor Hugo's Les Miserables and Lewis Carroll's Alice's Adventures in Wonderland. The first two books belong to the series 'Aoitori Bunko' by the publisher Kodansha [2], and the third one is from the series 'Popura Poketto' by Popura [12].

Both series, as well as others like 'Tsubasa Bunko' [3], seek to offer young Japanese readers access to Japanese and world classics in simple language and a child-friendly format. 'Aoitori Bunko' books, officially targeted at elementary school children, are commonly perceived as especially easy to read, as they contain a number of illustrations, use simple grammatical constructions, and provide the reading of all kanji [13]. According to the blog 'Japanese Book Club', aimed at learners of Japanese as a foreign language, a typical book of the series contains grammar suitable for the N3 and N2 levels of competence as per the Japanese certificate exam JLPT [14].

In order for conclusions to be drawn that are specifically valid for and set apart abridged Japanese texts of an affirmed quality, each of the abridged texts is coupled with a non-abridged Japanese translation of the same original text.

Textual Processing

All texts are converted to plain text format, any metadata (including titles, character information and prefaces) is removed and basic textual cleaning is applied, including the removal of non-alphanumeric symbols resulting from textual conversion. Tabs and new lines are removed to allow for further processing of the texts. For tokenisation of Japanese sentences into words, Python's janome library is utilised. For conversion of kanji into kana, the pykakasi library is used. Dialogue is defined as all text enclosed between the '「' and '」' signs that ends in punctuation.
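As an illustration of the dialogue definition above, the following minimal sketch extracts quoted spans with Python's standard `re` module and computes the share of dialogue as a proportion of characters. The set of accepted closing punctuation marks and the function names are assumptions made for this example; they are not taken from the study's own code.

```python
import re
from typing import List

# Punctuation accepted as closing a dialogue span. This set is an assumption:
# the text only states that the quoted span must end in punctuation.
END_PUNCT = "。！？!?…"

# Text between Japanese corner brackets, without nested brackets.
QUOTE_RE = re.compile(r"「([^「」]*)」")


def extract_dialogue(text: str) -> List[str]:
    """Return quoted spans whose content ends in sentence-final punctuation."""
    spans = []
    for match in QUOTE_RE.finditer(text):
        content = match.group(1).strip()
        if content and content[-1] in END_PUNCT:
            spans.append(content)
    return spans


def dialogue_share(text: str) -> float:
    """Proportion of characters in dialogue over characters in the whole text."""
    if not text:
        return 0.0
    return sum(len(span) for span in extract_dialogue(text)) / len(text)


# Hypothetical toy input: the quoted noun 「時計」 is not counted as dialogue.
sample = "アリスは言いました。「どうしてですか？」うさぎは「時計」を見ました。「遅刻だ！」"
print(extract_dialogue(sample))
print(round(dialogue_share(sample), 2))
```

Filtering on the final character keeps quoted titles or single quoted words out of the dialogue count, at the cost of missing utterances that close without punctuation.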

Linguistic Characteristics

Two sets of linguistic characteristics are measured in relation to dialogue instances in the selected texts. When applicable, the dialogue in the texts is compared to the texts taken as a whole and/or to the dialogue in the abridged texts' non-abridged counterparts. The first set is conceived to represent the specificity of Japanese dialogue and consists of the percentage of sentences without a topic or subject marker (wa/ga), ending in da, ending in desu, and ending in an end-of-sentence particle.

[...] discovered in relation to the professional texts but not exhibited by the current output. For the full generated texts, please refer to Appendix A; for the prompts employed in the three scenarios, please refer to Appendix C.

The generated texts are then evaluated qualitatively via a survey administered to seven speakers of Japanese as a foreign language. There are two versions of the survey, each of which includes two of the automatically generated texts (see footnote 16). The two versions share a common text to facilitate the calculation of agreement between participants; the one-shot text is chosen for this purpose, as the other two texts are very similar to each other and their co-occurrence may cause confusion. The evaluation protocol is adapted from [16] and covers the following categories: (i) understandability, (ii) correctness, (iii) consistency, (iv) textual coherence and (v) aesthetic appeal. Four possible answers exist per category: 'yes', 'mostly yes', 'mostly no' and 'no'. Comments in free text are also encouraged. The full protocol can be found in Appendix B.

Results

General Observations

Detailed results are shown in Table 1 (Alice in Wonderland), Table 2 (The Adventures of Tom Sawyer) and Table 3 (Les Miserables). The length of each abridged and unabridged text was calculated, as was the ratio between the lengths in each pair; the ratios vary between 0.07 (for the longest original text, Les Miserables) and 0.87 (for Alice in Wonderland).

The percentage of dialogue was found to be consistently higher in the abridged texts, with abridged-to-unabridged ratios ranging from 1.14 for Tom Sawyer to 1.38 for Les Miserables. The number of words per sentence is lower for dialogue than for the text taken as a whole in all six texts. Given this clear trend, the ratios between the values of this feature in dialogue and in the full text were also calculated for abridged and unabridged texts, in an attempt to discover a further trend that characterises abridged texts. Indeed, these ratios are generally higher for the abridged versions (or identical in the case of Alice in Wonderland; see footnote 17).
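The dialogue-to-full-text ratio for words per sentence can be checked against the Table 1 values for Alice in Wonderland. The sketch below is purely illustrative: the helper names are hypothetical, and sentences are assumed to have been tokenised beforehand (for instance with janome).

```python
def mean_words_per_sentence(tokenised_sentences):
    """Average number of tokens per sentence; input is a list of token lists."""
    if not tokenised_sentences:
        return 0.0
    return sum(len(s) for s in tokenised_sentences) / len(tokenised_sentences)


def dialogue_to_full_ratio(dialogue_value, full_value):
    """Ratio between a feature's value in dialogue and in the full text."""
    return dialogue_value / full_value


# Average words per sentence as (dialogue, full text), taken from Table 1.
ALICE_WORDS_PER_SENTENCE = {
    "abridged": (13.28, 26.62),
    "unabridged": (9.62, 19.10),
}

alice_ratios = {
    version: round(dialogue_to_full_ratio(dialogue, full), 2)
    for version, (dialogue, full) in ALICE_WORDS_PER_SENTENCE.items()
}
print(alice_ratios)  # both versions round to 0.5
```

Both ratios round to 0.5, which corresponds to the "identical" case mentioned for Alice in Wonderland.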

No clear trends were discovered in relation to the number of letters per word (e.g. it is higher for dialogue in both versions of Alice in Wonderland, but only in the abridged version of Tom Sawyer). In contrast, TTR does exhibit a trend, as it is higher for dialogue in all texts (thereby indicating higher lexical variety). Therefore, the ratios of the values in the three abridged-unabridged pairs were also calculated; however, no further trends pertaining to abridged texts were discovered (see footnote 18). The percentages of words that fall outside the utilised frequency list were higher for the texts taken as a whole than for the dialogue within them (the sole exception being Tom Sawyer's unabridged version). No further trends are noticeable in relation to a text's 'unabridged' versus 'abridged' status (see footnote 19).
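For reference, the two lexical measures discussed here can be sketched as follows. This is a minimal illustration under stated assumptions: tokens are assumed to be already available as a list of strings, the out-of-list percentage is computed over tokens rather than types (the text does not specify which), and all names and toy data are hypothetical.

```python
def type_token_ratio(tokens):
    """Number of distinct tokens divided by the total number of tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def pct_outside_list(tokens, frequent_words):
    """Percentage of tokens that do not appear in a frequency list."""
    if not tokens:
        return 0.0
    outside = sum(1 for token in tokens if token not in frequent_words)
    return 100.0 * outside / len(tokens)


# Hypothetical toy data standing in for tokeniser output and a frequency list.
toy_tokens = ["猫", "は", "魚", "を", "食べ", "た", "猫", "は", "寝", "た"]
toy_frequent = {"猫", "は", "を", "た"}

print(type_token_ratio(toy_tokens))
print(pct_outside_list(toy_tokens, toy_frequent))
```

Because TTR falls as texts get longer, values for a short dialogue subset and for a full text are not directly comparable without keeping length in mind.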

The Japanese-specific feature 'percentage of sentences without topic or subject marker' reveals a general trend pointing to higher values in dialogue parts. Calculation of the values' ratios within a textual pair shows a further trend of higher ratios in unabridged as compared to abridged texts.

Footnotes:
16 version 1: zero-shot and one-shot; version 2: one-shot and chain-of-thought
17 0.5 vs 0.5 for Alice in Wonderland, 0.67 vs 0.64 for Tom Sawyer, 0.61 vs 0.59 for Les Miserables

Notes to the tables:
1 and the related forms datta, dewa/ja nai, and dewa/ja nakatta
2 and the related forms deshita and dewa/ja arimasen
3 yo, ne, na, sa, zo, ze, and wa
4 the proportion of number of characters in dialogue over the number of characters in the whole text
5 based on the multilingual Open Subtitles corpus and available at: https://github.com/hermitdave/FrequencyWord
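The 'percentage of sentences without topic or subject marker' feature could be computed along the following lines. This is only a sketch: it assumes each sentence is available as a list of (surface, part-of-speech) pairs with a coarse tag "particle", which is a simplification of what a morphological analyser returns, and the function names and toy sentences are hypothetical. Note that が can also act as a conjunction, which this simple check does not distinguish.

```python
TOPIC_SUBJECT_MARKERS = {"は", "が"}


def lacks_marker(tagged_sentence):
    """True if no token is a particle whose surface form is は or が."""
    return not any(
        pos == "particle" and surface in TOPIC_SUBJECT_MARKERS
        for surface, pos in tagged_sentence
    )


def pct_without_marker(tagged_sentences):
    """Percentage of sentences without an explicit topic or subject marker."""
    if not tagged_sentences:
        return 0.0
    hits = sum(1 for sentence in tagged_sentences if lacks_marker(sentence))
    return 100.0 * hits / len(tagged_sentences)


# Hypothetical, hand-tagged toy sentences.
toy_sentences = [
    [("甥", "noun"), ("は", "particle"), ("叫び", "verb"), ("まし", "aux"), ("た", "aux")],
    [("恋", "noun"), ("に", "particle"), ("落ち", "verb"), ("た", "aux"),
     ("から", "particle"), ("です", "aux")],
]
print(pct_without_marker(toy_sentences))
```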

Although the percentages of short and long verb forms within the different texts vary significantly as a consequence of the nature of the characters and their interactions (e.g. there is much higher formality in Les Miserables than in Alice in Wonderland), there is overall a higher number of short forms in all six texts. A calculation of ratios between the two types of verb forms in textual pairs was also conducted; however, it revealed no further trend.
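A rough way to estimate the share of sentences ending in short (da) versus long (desu) forms is plain suffix matching on the forms listed in the table notes. The sketch below is a heuristic under stated assumptions, not the study's procedure: the kana spellings of the forms, the set of trailing characters to strip and the function names are choices made for this example, and a bare suffix test will also match past-tense verbs such as 読んだ.

```python
# Characters ignored at the end of a sentence before testing the ending.
TRAILING = "。！？!?…」 "

# Kana spellings of da/desu and the related forms named in the table notes.
SHORT_FORMS = ("だ", "だった", "ではない", "じゃない", "ではなかった", "じゃなかった")
LONG_FORMS = ("です", "でした", "ではありません", "じゃありません")


def ends_with_any(sentence, endings):
    """True if the sentence, stripped of closing punctuation, ends in one of the forms."""
    return sentence.rstrip(TRAILING).endswith(endings)


def pct_ending_in(sentences, endings):
    """Percentage of sentences that end in any of the given forms."""
    if not sentences:
        return 0.0
    hits = sum(1 for sentence in sentences if ends_with_any(sentence, endings))
    return 100.0 * hits / len(sentences)


# Hypothetical toy sentences in the style of the examined dialogues.
toy_dialogue = [
    "お前が結婚したからだ。",
    "恋に落ちたからです。",
    "明日は夕食を食べませんか？",
    "それは本当でした。",
]
print(pct_ending_in(toy_dialogue, SHORT_FORMS))
print(pct_ending_in(toy_dialogue, LONG_FORMS))
```

A tokeniser-based check that requires the final token to be a copula would be more precise, at the cost of depending on an external library.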

With the exception of Les Miserables, the abridged texts are associated with a lower percentage of end-of-sentence particles. Understandably, the choice of particles used is highly text-specific; for instance, Alice in Wonderland contains many instances of the particle wa, which is used almost exclusively by female speakers. The most common particles in both types of texts are yo, ne, and na, whilst the particle ze is generally avoided, in particular in abridged texts (even when used abundantly in the unabridged counterpart, as in the case of Tom Sawyer).
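The per-particle percentages can be approximated by inspecting the last kana of each dialogue sentence. The following sketch assumes that a final よ, ね, な, さ, ぞ, ぜ or わ is an end-of-sentence particle, which is a simplification (a final な or さ may belong to another word); the names and the toy sentences are hypothetical.

```python
from collections import Counter

# Characters ignored at the end of a sentence before testing the ending.
TRAILING = "。！？!?…」、 "

# Kana for the seven particles considered: yo, ne, na, sa, zo, ze, wa.
PARTICLES = {"よ": "yo", "ね": "ne", "な": "na", "さ": "sa",
             "ぞ": "zo", "ぜ": "ze", "わ": "wa"}


def particle_profile(sentences):
    """Percentage of sentences ending in each particle, keyed by romanised name."""
    if not sentences:
        return {}
    counts = Counter()
    for sentence in sentences:
        core = sentence.rstrip(TRAILING)
        if core and core[-1] in PARTICLES:
            counts[PARTICLES[core[-1]]] += 1
    return {name: 100.0 * n / len(sentences) for name, n in counts.items()}


# Hypothetical toy sentences.
toy_lines = ["怒らないでくださいよ。", "いい天気ですね。", "行くぞ！", "恋に落ちたからです。"]
profile = particle_profile(toy_lines)
print(profile)
print(sum(profile.values()))  # overall percentage of sentences ending in a particle
```

Reporting the profile per particle, and not only the total, makes it possible to see cases where a single particle accounts for most of the endings.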

Generation Experiment

ChatGPT Output

Table 4 offers a detailed juxtaposition of the investigated features within the textual extract provided to ChatGPT versus its zero-shot output. The overall length of the two texts is, respectively, 2331 and 1457 characters, making the abridged version 0.63 of the length of the original one. This proportion falls neatly within the range exhibited by the previously examined professional texts. The proportion of dialogue is, naturally, high as the extract has been specifically selected to include extensive dialogue: 0.7 for the abridged vs 0.57 for the unabridged text. The abridged text contains a higher proportion of dialogue, thus matching the trend discovered in relation to the professional texts. TTR is higher for the automatically abridged text, which is also in line with the discovered trend.
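The figures in this paragraph can be verified with a few lines of arithmetic. The sketch below uses only values reported in the text (the two lengths, the two dialogue proportions and the 0.07 to 0.87 range of length ratios found for the professional texts); the variable names are hypothetical.

```python
# Character counts reported for the original extract and the zero-shot output.
original_chars = 2331
zero_shot_chars = 1457

length_ratio = round(zero_shot_chars / original_chars, 2)

# Range of abridged/unabridged length ratios observed for the professional texts.
PROFESSIONAL_RANGE = (0.07, 0.87)
within_range = PROFESSIONAL_RANGE[0] <= length_ratio <= PROFESSIONAL_RANGE[1]

# Proportion of dialogue in the abridged output vs the original extract.
dialogue_ratio = round(0.70 / 0.57, 2)

print(length_ratio, within_range)
print(dialogue_ratio)  # a value above 1 means more dialogue in the abridged text
```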

Similarly, the number of words outside the frequency list is indeed higher in the full text than in its dialogue portions. However, when it comes to the number of words per sentence, the trend of there being fewer words in dialogue parts is not exhibited by ChatGPT's output. The percentage of sentences without subject or topic marker is higher in dialogue parts than in the full text both within the original extract and ChatGPT's output. To go further, the ratio between the two values is higher for the unabridged text, thus keeping up with both trends related to the textual feature. Interestingly, even though the percentages of sentences ending in da vs desu go against the discovered trend in the originally provided extract (the latter being more common), ChatGPT's output does in fact match the trend. Finally, a higher percentage of sentences end in a particle in the abridged text, going against the relevant trend. More importantly, there is very little variety in the included particles, yo single-handedly accounting for their vast majority.

Based on this evaluation, ChatGPT was asked to make two modifications in the chain-of-thought setting: (i) reduce the number of words in dialogue sentences and (ii) add a larger variety of end-of-sentence particles, such as the commonly used ne and na. Whilst the requested particles were indeed added, the overall length of sentences in dialogue was not decreased. Still, upon reflection, no further prompts were provided at this point in order for the model's unedited attempt at meeting the requirements to be evaluated in the ensuing survey.

Human Evaluation

The survey's 'version 1' (which consisted of the zero-shot and one-shot texts) was answered by three respondents and 'version 2' (one-shot and chain-of-thought) by five. Inter-rater agreement as per Fleiss's Kappa was low for all three texts. The highest agreement was associated with 'correctness' in relation to the one-shot text and with 'textual coherence' in relation to the chain-of-thought one. Agreement was stable between categories for the zero-shot text.
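For readers unfamiliar with the measure, Fleiss's Kappa can be computed as in the minimal sketch below. It follows the standard textbook definition and assumes that every item (here, an evaluation category) is rated by the same number of raters; the function name and the toy ratings are hypothetical and do not reproduce the survey data.

```python
def fleiss_kappa(ratings):
    """Fleiss's Kappa for a table of per-item category counts.

    ratings[i][j] is the number of raters who assigned item i to category j.
    Every row must sum to the same number of raters (at least two).
    """
    if not ratings:
        raise ValueError("at least one rated item is required")
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in ratings):
        raise ValueError("each item needs the same number (>= 2) of raters")
    n_categories = len(ratings[0])
    total = n_items * n_raters

    # Overall proportion of assignments that fall into each category.
    p_j = [sum(row[j] for row in ratings) / total for j in range(n_categories)]
    # Extent of agreement among raters for each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        return float("nan")             # undefined when only one category is used
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy table: 5 categories rated by 3 raters on the four-point
# scale ('yes', 'mostly yes', 'mostly no', 'no').
toy_ratings = [
    [2, 1, 0, 0],
    [3, 0, 0, 0],
    [1, 1, 1, 0],
    [2, 1, 0, 0],
    [1, 2, 0, 0],
]
print(fleiss_kappa(toy_ratings))
```

With very few raters and answers concentrated on 'yes' and 'mostly yes', chance agreement is high, so low Kappa values can coexist with largely positive ratings.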

The chain-of-thought text received the highest percentage of 'yes' answers, 76%; followed by 24% 'mostly yes'. The one-shot text received 70% 'yes', 20% 'mostly yes' and 10% 'mostly no'. Finally, the zero-shot text received 66.7% 'yes', 26.7% 'mostly yes' and 6.7% 'mostly no' answers. No text was given less than 50% of 'yes' answers for a given category.

The one-shot text was met with a higher number of negative free-text comments. Grammar was said to be complex, and the lack of kanji representations to be confusing. It was sometimes difficult to tell who was speaking, and the lack of explicit subject or owner rendered some sentences hard to understand. On the positive side, the text was said to be 'very flowing and natural'. Furthermore, the dialogue attributed to the characters was perceived as vivid and judged to match their personal traits ("Scrooge's dismissive attitude and the cheerful responses of his nephew").

The zero-shot text was said to be written in standard and fitting 'conversational Japanese', and it was pointed out that repetitions within dialogue were effective. The chain-of-thought text, in which ChatGPT was explicitly asked to replicate the trends discovered in this study, received mostly positive comments. Anaphora was said to be more readily resolvable (e.g. the pronoun kare was employed several times). Compared with its one-shot counterpart, the text was said to be 'better and smoother' and to contain simpler grammar. A specific sentence was said to be more 'aesthetically appealing' than the parallel one used in the one-shot text. According to the negative comments, complex kanji were sometimes used, verb tense was inconsistent on one occasion and, importantly, the dialogue styles attributable to the two main characters were said to differ less strikingly.

Discussion

An automatic evaluation of both universal and language-specific features as present in dialogue in abridged (and related unabridged) literary texts in Japanese led to the discovery of several distinct trends. Some of them simply pertain to dialogue in Japanese literature, whilst others go further to reveal specific traits of the dialogue present in abridged texts. The latter trends help shed light on these texts' features that render them suitable for an audience of language learners and/or younger readers.

Abridged texts are associated with a higher percentage of dialogue and a lower percentage of end-of-sentence particles. There tends to be a higher number of infrequent words in the text taken as a whole than in its dialogue portions. TTR is higher in dialogue compared to the literary texts taken as a whole, and there tend to be more short than long verb forms in dialogue. The number of words per sentence is lower in dialogue than in full texts, and the ratio between the two values is higher for abridged texts. Sentences without marked topic or subject are more frequent in dialogue, and the associated ratio is higher for unabridged texts.

The chain-of-thought setting of automatic abridgement, in which ChatGPT was asked to modify its initial output text so that it would align with the mentioned trends, was met with significant encouraging feedback, such as the clear use of anaphora, which also characterises the zero-shot text. However, an important piece of criticism was the perceived lack of distinguishing features in the characters' language. There are indeed instances of dialogue in the chain-of-thought text that go against the grain of Scrooge being rather informal and patronising in contrast with his nephew being cheerful and always respectful. An example is the nephew informally skipping the question marker ka, e.g. "どうしてそんなことを言うんです？" ("Why would you say this?"). However, upon verification, these instances are identical in the zero-shot text. Therefore, one can safely conclude that it was not the fact of asking the model to make a couple of specific adjustments that caused a compromise in terms of creativity and characterisation. Rather, it was the one-shot setting that helped enhance creativity. For instance, there, the mentioned question is rendered in a very formal manner and does end in a question particle: "どうしてそんなことを言うんですか？"

Interestingly, increased use of kanji was seen as an advantage by some participants in the survey and as a drawback by others, leading to the conclusion that generalisation of the texts' qualities in relation to different levels of reader proficiency is not optimal. Based on the respondents' feedback, the one-shot text seems to be suitable for more experienced readers of Japanese compared to the other two.

Table 5: Juxtaposition of a part of the original extract of A Christmas Carol and the output provided by ChatGPT in the different examined scenarios. The major differences between the texts are underlined.

Version: Text

English: "Don't be angry, uncle. Come! Dine with us to-morrow." Scrooge said that he would see him - Yes, indeed he did. He went the whole length of the expression, and said that he would see him in that extremity first. "But why?" cried Scrooge's nephew. "Why?" "Why did you get married?" said Scrooge. "Because I fell in love." "Because you fell in love!" growled Scrooge, as if that were the only one thing in the world more ridiculous than a merry Christmas. "Good afternoon!"

unabridged: 「おこらないでください、おじさん。さあ明日は僕らと一緒に夕食をとってください」スクルージは、おまえが墓場に、確かにそう、まったくこの通り口にしたのだった、おまえが墓場に落ちるところをみたいものだなと。「どうしてなんです?」甥は叫びました。「いったいどうして」「どうしておまえは結婚したんだ?」「恋に落ちたからです」「恋に落ちたからとはな!」スクルージはまるでその言葉が、メリークリスマスより腹立たしい唯一の言葉であるかのように吐き捨てた。「ごきげんよう」

zero-shot: 「そんなに怒らないでくださいよ、おじさん。明日は僕たちと一緒に夕食を食べませんか？」スクルージは冷たく言いました。「お前が墓場に落ちるところを見たいものだ」「どうしてそんなことを言うんです？」と甥は叫びました。「どうしてですか？」「お前が結婚したからだ」「恋に落ちたからです」と甥は答えました。「恋に落ちたから？そんな馬鹿な！ごきげんよう」

chain-of-thought: 「そんなに怒らないでくださいよ、おじさん。明日は僕たちと一緒に夕食を食べませんか？」スクルージは冷たく言いました。「お前が墓場に落ちるところを見たいものだ、ね」「どうしてそんなことを言うんです？」と甥は叫びました。「どうしてですか、ね？」「お前が結婚したからだ」「恋に落ちたからです」と甥は答えました。「恋に落ちたから？そんな馬鹿な！ごきげんよう」

one-shot: 「怒らないでくださいよ、おじさん。明日は僕たちと一緒に夕食をとりましょう」スクルージは、「お前が墓場に落ちるところを見たいものだ」と言いました。「どうしてですか？」おいは叫びました。「どうしてそんなことを言うんですか？」「どうしてお前は結婚したんだ？」「恋に落ちたからです」「恋に落ちたから？くだらない」とスクルージは言い捨てました。「さようなら」

Table 5 presents parallel texts resulting from the discrete prompting scenarios experimented with. The text's full version, together with the original English, is also included for comparison. In all three scenarios, the second paragraph is turned into dialogue. Other common characteristics include simplification of grammar; e.g. the exclamation "さあ" is excluded and simple constructions are used to express the nephew's invitation: "ません"/"ましょう". In all texts, Scrooge makes use of short verb forms. The zero-shot and chain-of-thought versions, which differ only in their use of end-of-sentence particles, tend to clarify the referents of anaphora, such as with the phrase "と甥は答えました"; in the one-shot text, however, the reader is left to guess who the speaker is. The emphasised use of end-of-sentence particles in the chain-of-thought text reads naturally and enhances the dialogue's dynamics, such as in the nephew's question "どうしてですか、ね?", where the particle marks the question as rhetorical. In fact, the one-shot version also makes effective use of particles, such as in the sentence "怒らないでくださいよ". An important quality of the one-shot text that is not found in the other abridged texts, however, is a simplification of vocabulary. While the zero-shot and chain-of-thought versions retain Scrooge's original "ごきげんよう", a greeting likely to be challenging for non-native readers, the one-shot version uses a much better known synonym, "さようなら".

It is likely that the one-shot text's higher overall complexity is dependent upon the sample extract that was provided to the model and not an intrinsic feature of the generation scenario -a hypothesis that should be tested through the inclusion of multiple texts of different proficiency levels. High potential can therefore be expected from a future combination of the one-shot technique's natural tendency to provide variety and lexical simplification and the chain-of-thought addition's incorporation of specific instructions.

Conclusion and Future Work

Importantly, the current study's observations are general in nature as they are based on a very limited number of texts. A deeper examination of the discovered trends is planned to involve a substantial increase in corpus size. At best, works of a variety of genres, years, and original languages of writing will be added. A large corpus size could allow for not only trends but also reliable value ranges to be defined concerning the examined textual features (with the caveat that some features, such as TTR, are highly dependent on textual length).

Based on the conducted evaluation, it can be concluded that automatic generation may benefit from longer chains of thought and from a combination of the one-shot and chain-of-thought generation methods. Subsequently, the newly derived texts should ideally be evaluated by a larger sample of readers of defined Japanese proficiency levels.

Comparison of the current experiment with similar ones that focus on different languages may provide interesting insight about the acquisition processes as associated with the languages at hand; in particular, in relation to literary dialogue. For instance, the use of anaphora in the face of the Japanese zero pronoun may be compared to the use of personal pronouns in English or other European languages.

Limitations

The limited size of the literary corpus, caused by restrictions of availability and of a financial nature, makes observations less generalisable as the author's or translator's choices play a significant role in the texts' characteristics. Some of the examined automatic characteristics are not exhaustive in nature and are meant to help discover overall tendencies rather than specific values or frequencies. For instance, solely the verb desu is used to differentiate between the use of short and -masu forms, and the cases when the verb may not be positioned at the end of the sentence are not accounted for.

Finally, as the number of participants in the conducted survey is limited and heterogeneous in terms of proficiency levels, it is challenging to conclude on the quality of the three generated texts solely on its basis.

A. Texts Used and Generated in the Automatic Abridgement Experiment

Original

B. Evaluation Protocol

Thank you for taking part in this evaluation, which is inscribed within the project "Dialogue in Abridged Literary Texts in Japanese: Observations and Automatic Reproducibility."

You have been assigned two texts to evaluate based on the provided criteria. The texts are automatically abridged versions of the same extract from Charles Dickens' novel A Christmas Carol. Please fill in the two tables independently by marking your answers with a tick or cross (or by circling the relevant answer). The section Aesthetic Appeal is highly subjective; please provide your personal opinion. You are welcome to write additional comments in the rightmost field.

This protocol is adapted from

Yes Mostly Yes Mostly No No

The text is easy to understand. It is free from overly complex words and sentence constructions.

Correctness

Does the text exhibit correct vocabulary, grammar, punctuation, and structure?

Yes Mostly Yes Mostly No No

There are no non-existent or irrelevant words, grammatical mistakes, mistakes in punctuation, or non-optimal paragraph/sentence structure.

Consistency

Is the text consistent in terms of register, writing style, and the information presented?

Yes Mostly Yes Mostly No No

The text maintains the same genre/level of formality, verb tenses, character traits, and knowledge. There are no contradictions in the presented information.

Textual Coherence

Are textual transitions present and natural? Is anaphora clearly resolvable? Is the text free of needless repetition?

Yes Mostly Yes Mostly No No

The text flows fluently between sentences (paragraphs). There are no unclear anaphora referents. Information (incl. separate words) is not overly repeated.

Aesthetic Appeal

Does the text have appropriate length, pacing, style, and register? Is the text engaging?

Yes Mostly Yes Mostly No No

The text feels neither too long nor too short. Appropriate textual length is accorded to plot segments. Formality levels are natural.

C. Prompts Used

Zero-Shot Setting

Here is an extract from the novel A Christmas Carol by Charles Dickens, as translated into Japanese. Please generate an abridged version of this extract that is suitable for an audience of

Table 1: Analysis of dialogue-centered textual characteristics in abridged and unabridged versions of Alice's Adventures in Wonderland (registers of politeness and end-of-sentence particles are only associated with communication; therefore, features pertaining to them are only calculated for dialogue). Details about the characteristics are given in Section 3.3.

| Characteristics | Abridged: Dialogue | Abridged: Full text | Unabridged: Dialogue | Unabridged: Full text |
|---|---|---|---|---|
| Set 1: Characteristics typical to Japanese dialogue/text | | | | |
| % s-ces w/o wa/ga | 20.41 | 21.40 | 66.60 | 34.85 |
| % s-ces ending in da ¹ | 8.26 | N/A | 3.00 | N/A |
| % s-ces ending in desu ² | 0.70 | N/A | 0.94 | N/A |
| % s-ces ending in particles ³ | 15.82 | N/A | 18.57 | N/A |
| % s-ces ending in yo | 3.69 | N/A | 5.63 | N/A |
| % s-ces ending in ne | 4.39 | N/A | 5.63 | N/A |
| % s-ces ending in na | 2.11 | N/A | 3.19 | N/A |
| % s-ces ending in sa | 0.53 | N/A | 0.19 | N/A |
| % s-ces ending in zo | 1.58 | N/A | 1.5 | N/A |
| % s-ces ending in ze | 0 | N/A | 0 | N/A |
| % s-ces ending in wa | 3.51 | N/A | 2.44 | N/A |
| Set 2: Universal readability-based characteristics | | | | |
| length (characters) | | 469642 | | 80055 |
| % dialogue | | 14.31 | | 12.56 |
| average # words/s-ce | 13.28 | 26.62 | 9.62 | 19.10 |
| average # letters (kana)/word | 1.98 | 1.94 | 2.10 | 1.99 |
| TTR | 0.24 | 0.10 | 0.26 | 0.10 |
| % words outside a 30k freq. list ⁵ | 58.44 | 60.19 | 58.41 | 60.35 |

[...] 5 for Alice in Wonderland; 0.67 vs 0.64 for Tom Sawyer; 0.61 vs 0.59 for Les Miserables

18 2.4 vs 2.6 for Alice in Wonderland; 1.6 vs 1.38 for Tom Sawyer; 1.6 vs 2.66 for Les Miserables

19 ratios: constant 1 for Alice in Wonderland; 0.95 vs 1.15 for Tom Sawyer; 0.85 vs 0.82 for Les Miserables

20 Alice in Wonderland's abridged version is an exception, showing a similar but slightly higher value in relation to the entire text

21 0.95 vs 1.9 for Alice in Wonderland; 1.37 vs 1.62 for Tom Sawyer

Table 2: Analysis of dialogue-centered textual characteristics for The Adventures of Tom Sawyer.

| Characteristics | Abridged: Dialogue | Abridged: Full text | Unabridged: Dialogue | Unabridged: Full text |
|---|---|---|---|---|
| Set 1: Characteristics typical to Japanese dialogue/text | | | | |
| % s-ces w/o wa/ga | 50.38 | 36.82 | 51.14 | 31.46 |
| % s-ces ending in da | 19.64 | N/A | 11.86 | N/A |
| % s-ces ending in desu | 3.04 | N/A | 0.54 | N/A |
| % s-ces ending in particles | 17.79 | N/A | 33.79 | N/A |
| % s-ces ending in yo | 6.52 | N/A | 13.24 | N/A |
| % s-ces ending in ne | 1.93 | N/A | 3.26 | N/A |
| % s-ces ending in na | 5.34 | N/A | 3.96 | N/A |
| % s-ces ending in sa | 1.78 | N/A | 4.74 | N/A |
| % s-ces ending in zo | 1.19 | N/A | 1.69 | N/A |
| % s-ces ending in ze | 0.96 | N/A | 5.36 | N/A |
| % s-ces ending in wa | 0.07 | N/A | 1.53 | N/A |
| Set 2: Universal readability-based characteristics | | | | |
| length (characters) | | 103758 | | 229024 |
| % dialogue | | 26.56 | | 24.32 |
| average # words/s-ce | 13.22 | 19.62 | 15.17 | 23.60 |
| average # letters/word | 1.97 | 1.95 | 1.92 | 1.93 |
| TTR | 0.16 | 0.10 | 0.11 | 0.08 |
| % words outside a 30k freq. list | 61.45 | 65.85 | 64.54 | 57.20 |

Table 3: Analysis of dialogue-centered textual characteristics for Les Miserables.

| Characteristics | Abridged: Dialogue | Abridged: Full text | Unabridged: Dialogue | Unabridged: Full text |
|---|---|---|---|---|
| Set 1: Characteristics typical to Japanese dialogue/text | | | | |
| % s-ces w/o wa/ga | 56.04 | 33.37 | 45.91 | 21.62 |
| % s-ces ending in da | 10.48 | N/A | 14.10 | N/A |
| % s-ces ending in desu | 8.60 | N/A | 7.98 | N/A |
| % s-ces ending in particles | 19.65 | N/A | 13.65 | N/A |
| % s-ces ending in yo | 6.94 | N/A | 4.75 | N/A |
| % s-ces ending in ne | 5.92 | N/A | 3.6 | N/A |
| % s-ces ending in na | 2.31 | N/A | 2.17 | N/A |
| % s-ces ending in sa | 1.01 | N/A | 0.84 | N/A |
| % s-ces ending in zo | 1.45 | N/A | 0.46 | N/A |
| % s-ces ending in ze | 0.36 | N/A | 0.32 | N/A |
| % s-ces ending in wa | 1.66 | N/A | 1.51 | N/A |
| Set 2: Universal readability-based characteristics | | | | |
| length (characters) | | 106044 | | 1522950 |
| % dialogue | | 22.91 | | 16.63 |
| average # words/s-ce | 9.93 | 16.28 | 12.77 | 21.30 |
| average # letters/word | 1.97 | 1.97 | 2.03 | 2.08 |
| TTR | 0.16 | 0.10 | 0.08 | 0.03 |
| % words outside a 30k freq. list | 56.30 | 58.71 | 66.31 | 71.45 |

Table 4: Analysis of dialogue-centered textual characteristics in an extract from the unabridged novel A Christmas Carol and in its automatically abridged version.

| Characteristics | Abridged: Dialogue | Abridged: Full text | Unabridged: Dialogue | Unabridged: Full text |
|---|---|---|---|---|
| Set 1: Characteristics typical to Japanese dialogue/text | | | | |
| % s-ces w/o wa/ga | 34.48 | 23.08 | 17.65 | 9.09 |
| % s-ces ending in da | 13.33 | N/A | 4.92 | N/A |
| % s-ces ending in desu | 11.67 | N/A | 11.48 | N/A |
| % s-ces ending in particles | 11.67 | N/A | 19.67 | N/A |
| % s-ces ending in yo | 8.33 | N/A | 16.39 | N/A |
| % s-ces ending in ne | 0 | N/A | 0 | N/A |
| % s-ces ending in na | 1.67 | N/A | 1.64 | N/A |
| % s-ces ending in sa | 0 | N/A | 0 | N/A |
| % s-ces ending in zo | 1.67 | N/A | 1.64 | N/A |
| % s-ces ending in ze | 0 | N/A | 0 | N/A |
| % s-ces ending in wa | 0 | N/A | 0 | N/A |
| Set 2: Universal readability-based characteristics | | | | |
| length (characters) | | 1457 | | 2331 |
| % dialogue | | 69.80 | | 57.02 |
| average # words/s-ce | 14.76 | 13.23 | 36.47 | 24.96 |
| average # letters/word | 2.23 | 2.11 | 2.07 | 2.04 |
| TTR | 0.37 | 0.3 | 0.38 | 0.32 |
| % words outside a 30k freq. list | 49.37 | 53.33 | 49.02 | 53.93 |
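The universal readability-based characteristics reported in the tables (average words per sentence, average letters per word, TTR, and percentage of words outside a frequency list) can be illustrated with the minimal Python sketch below. It is a hedged illustration rather than the study's code. It assumes that the text has already been segmented into sentences and words by some external tokeniser, uses character count as a stand-in for the letters-per-word measure, and takes the frequency list as a plain set of words; the function names and the choice of denominator for the frequency-list percentage are assumptions. Following the paper's notes on the frequency list, words are treated as types and words written entirely in hiragana are left out of that percentage.

```python
def is_all_hiragana(word):
    """True if every character lies in the hiragana Unicode block."""
    return bool(word) and all("\u3041" <= ch <= "\u309f" for ch in word)


def readability_profile(sentences, frequent_words):
    """Compute simple readability figures.

    sentences: list of sentences, each a list of word tokens (pre-tokenised).
    frequent_words: set of words taken from a frequency list (assumed format).
    """
    tokens = [word for sentence in sentences for word in sentence]
    if not tokens:
        raise ValueError("at least one token is required")
    types = set(tokens)
    # Types written only in hiragana are excluded from the frequency-list check.
    candidates = {t for t in types if not is_all_hiragana(t)}
    outside = [t for t in candidates if t not in frequent_words]
    return {
        "avg_words_per_sentence": len(tokens) / len(sentences),
        "avg_chars_per_word": sum(len(w) for w in tokens) / len(tokens),
        "ttr": len(types) / len(tokens),
        "pct_outside_freq_list": (
            100 * len(outside) / len(candidates) if candidates else 0.0
        ),
    }


example = [["猫", "は", "魚", "を", "食べる"], ["猫", "は", "寝る"]]
print(readability_profile(example, frequent_words={"猫", "食べる"}))
```

Because TTR falls as a text grows longer, values computed this way are only comparable between texts of similar length, which is worth keeping in mind when reading dialogue and full-text columns side by side.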

Mousavi et al. (2022). If you have any questions, please address them to iglika.nikolova.stoupak@gmail.com

Dimension/Question | Answer Options | Definition | Comment (optional)

Understandability

Is the text generally understandable (in terms of storyline, vocabulary, and syntax)?

to be

Twain, M., Nishi, K., Iijima, A.: Tomy Souyaa no bouken. Kodan: Aoitori Bunko (2012).

Hugo, V., Katayama, W., Tsukahara, R.: Re mizeraburu. Kodan: Aoitori Bunko (2012).

Carroll, L., Nishida, Y.: Fushigi no kuni no arisu. Popura: Poketto Bunko (2015).

logographic Chinese-based characters

corresponding, respectively, to the B1 and B2 levels of the CEFR framework

ne: 11.86%; na: 8.47%; yo: 6.78%; zo: 1.69%

in fact, it was slightly increased to 14.82 due to the particles that were added simultaneously

the possible answers ('yes', 'mostly yes', 'mostly no', and 'no') were taken as categorical at the calculation of agreement

he

Dickens, C.: A Christmas Carol. Chapman & Hall (1843).

"the nephew answered"

「でも、どこかしらにはいきたいの」アリスはいい足しました。「たっぷりあるけば、どこかしらに着く」と、チェシャーネコその通りね、とアリスは思ったので、こんどはべつの質問をしてみました。「ここらへんには、どんな人が住んでるの

ークリスマスだって？クリスマスは、金もないのに支はらいをしなきゃいけない時だ。また一年、年を取るが、少しも金持ちになっていないことを知る時だ。もしわしの思い通りになるなら」とスクルージは怒って言いました。「『メリークリスマス』なんて言うばか者は、プディングと一緒に煮こんでやりたい」「おじさんったら」とおいは言いました。「お前はお前のやり方でクリスマスを祝えばいい。わしはわしのやり方があるから、ほっといてくれ」「やり方ですって？おじさんは何もしないじゃないですか」「ほっといてくれ、それでいいだろう。クリスマスはたのしいんだろう？今までもいいことがたくさんあったんだろうし」「はい、たくさんありました。お金はえなかったけど、クリスマスはいつもたのしいです。神さまの名前やクリスマスの意味を考えなくても、クリスマスは親切で、たのしい時だと思います。人々が心を開いて、他の人々を友達のように感じる時です。クリスマスが僕にお金をくれたことはないけど、それでもクリスマスは素晴らしいと思います。だから、神さまが守ってくださいますように」事務員が、思わず拍手をしましたが、すぐに火をかき回して暖かさを消してしまいました。「もう一度音を立ててみろ」スクルージは叫びました。「クビにするぞ」とおいの方を向いて、「国会議員にでもなった方がいいんじゃないか」と言いました。「怒らないでくださいよ、おじさん。明日は僕たちと一緒に夕食をとりましょう」スクルージは、「お前が墓場に落ちるところを見たいものだ」と言いました。「どうしてですか？」おいは叫びました。「どうしてそんなことを言うんですか？」「どうしてお前は結婚したんだ？」「恋に落ちたからです」「恋に落ちたから？くだらない」とスクルージは言い捨てました。「さようなら」「でも、おじさん、結婚する前だって来てくれなかったじゃないですか。どうして結婚が理由になるんですか」「さようなら」「べつにおじさんにどうこうしてもらおうとは思っていません。でも、仲良くできないんですか？」「さようなら」「おじさんがそんなに頑固なのは残念です。でも、クリスマスのき持ちを忘れずに、メリークリスマス、おじさん」「さようなら」「よいお年を」「さようなら」それでも、おいは怒らず、部屋を出ました。外に出る前に、事務員にクリスマスの挨拶をしました。事務員も、スクルージよりは心が温かかったので、心をこめて挨拶を返しました。

learners of Japanese as a foreign language at an intermediate level.

{text}

One-Shot Setting

{the prompt for the zero-shot setting} Please see below an example of an original and a well-abridged version of an extract from a novel.

{unabridged example} {abridged example}

Chain-of-Thought Setting

Here is an extract from the novel A Christmas Carol by Charles Dickens, as translated into Japanese. {text} You previously provided me with the following abridged version of it: {zero-shot output} Could you please make the following small modifications in the text, whilst trying to alter it as little as possible:

- Make the sentences in the dialogue parts a little shorter (in terms of number of words).

- Include a larger variety of end-of-sentence particles in the dialogue, such as ne and na.
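The three prompt settings in this appendix can be assembled programmatically, as in the minimal Python sketch below. The prompt wording is taken from the appendix; everything else (the constant and function names, the placement of line breaks between prompt parts, and the absence of any model call) is an assumption made for illustration, and the sketch does not claim to reproduce how the prompts were actually submitted to the model.

```python
ZERO_SHOT_TEMPLATE = (
    "Here is an extract from the novel A Christmas Carol by Charles Dickens, "
    "as translated into Japanese. Please generate an abridged version of this "
    "extract that is suitable for an audience of learners of Japanese as a "
    "foreign language at an intermediate level.\n\n{text}"
)

# Appended to the full zero-shot prompt in the one-shot setting.
ONE_SHOT_SUFFIX = (
    "\n\nPlease see below an example of an original and a well-abridged "
    "version of an extract from a novel.\n\n"
    "{unabridged_example}\n\n{abridged_example}"
)

# Follow-up request that refers back to the zero-shot output.
CHAIN_OF_THOUGHT_TEMPLATE = (
    "Here is an extract from the novel A Christmas Carol by Charles Dickens, "
    "as translated into Japanese.\n\n{text}\n\n"
    "You previously provided me with the following abridged version of it:\n\n"
    "{zero_shot_output}\n\n"
    "Could you please make the following small modifications in the text, "
    "whilst trying to alter it as little as possible:\n"
    "- Make the sentences in the dialogue parts a little shorter "
    "(in terms of number of words).\n"
    "- Include a larger variety of end-of-sentence particles in the dialogue, "
    "such as ne and na."
)


def build_prompts(text, unabridged_example, abridged_example, zero_shot_output):
    """Return the three prompts as plain strings, keyed by setting name."""
    zero_shot = ZERO_SHOT_TEMPLATE.format(text=text)
    one_shot = zero_shot + ONE_SHOT_SUFFIX.format(
        unabridged_example=unabridged_example,
        abridged_example=abridged_example,
    )
    chain_of_thought = CHAIN_OF_THOUGHT_TEMPLATE.format(
        text=text, zero_shot_output=zero_shot_output
    )
    return {
        "zero_shot": zero_shot,
        "one_shot": one_shot,
        "chain_of_thought": chain_of_thought,
    }


prompts = build_prompts("<extract>", "<original example>", "<abridged example>", "<model output>")
print(prompts["one_shot"])
```

Note that the chain-of-thought prompt depends on the zero-shot output, so in practice the zero-shot prompt has to be run first and its result passed in as `zero_shot_output`.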

2. ending with the short verb form da and the related forms datta nai and dewa/ja nakatta

3. ending with the full verb form desu and the related forms deshita and dewa/ja arimasen

4. ending with the particles yo, ne, na, sa, zo, ze and wa.

The second set consists of atomic readability characteristics that are not specific to the Japanese language and are commonly considered as representative of textual complexity: (i) the text's total length (in characters), (ii) the percentage of dialogue in the full text, (iii) the average number of words per sentence, (iv) the average number of letters per word, (v) the type-to-token ratio (TTR) and (vi) the percentage of words that are not found in a frequency list ¹⁴ [15]. For a detailed description of the investigated characteristics, please refer to section 4.1.

Automatic Text Generation and Adaptation

Following the measurement of the characteristics in abridged and unabridged versions of Japanese literary texts as detailed in section 3.3, an experiment in automatic generation of abridged text is conducted. For this purpose, ChatGPT is employed in its recent GPT-4o version. The model is asked to provide an abridged version of an extract from Charles Dickens' A Christmas Carol ¹⁵ as translated into Japanese, which contains a large amount of dialogue. Three discrete generation scenarios are tested, based on the nature of the provided prompts: zero-shot (in which only instructions are provided), one-shot (in which the model is given an example pair of unabridged and abridged extracts, taken from the respective professional versions of Alice in Wonderland) and chain-of-thought (which includes additional prompting following the model's generation). In the chain-of-thought setting, the model is asked to explicitly alter the text output in the zero-shot setting based on the trends that are

Twain, M., Yoshida, K.: Tomy Souyaa no bouken. Kaisei (1975).

Hugo, V., Yoshio, T.: Re mizeraburu. Kodan: Aozora Bunko (1987).

Carroll, L., Yamagata, H.: Fushigi no kuni no arisu. Asahi (2000).

The utilised frequency list is based on the top 30k words of the Open Subtitles multilingual corpus.

Words are considered as types rather than tokens. Due to the nature of the frequency list, words consisting entirely of hiragana characters are not considered.

Dickens, C., Katokt: Kurisumasu Kyaroru. Tatsu-Zine (2003).

Statistics of JLPT takers. JLPT Japanese-Language Proficiency Test (2023).

Aoitori Bunko Books. https://cocreco.kodansha.co.jp/aoitori. Japan (2023).

Tsubasa Bunko Books. https://tsubasabunko.jp. Japan (2023).

Ni, J., Young, T., Pandelea, V.: Recent advances in deep learning based dialogue systems: a systematic survey. Artificial Intelligence Review 56 (2023). doi:10.1007/s10462-022-10248-8

Zhai, C., Wibowo, S.: A systematic review on artificial intelligence dialogue systems for enhancing English as foreign language students' interactional competence in the university. Computers and Education: Artificial Intelligence 4, 100134 (2023). doi:10.1016/j.caeai.2023.100134

Ehsani, F., Bernstein, J., Najmi, A.: An interactive dialog system for learning Japanese. Speech Communication 30 (2000). doi:10.1016/S0167-6393(99)00042-4

Fishelov, D.: Dialogues with/and great books: With some serious reflections on Robinson Crusoe. New Literary History 39 (2008). Project MUSE. doi:10.1353/nlh.0.0022

Morgan, C.: Dialogue in novels and plays. Études Anglaises 6, 97 (1953).

Janes, A.: The interaction of style-shift and particle use in Japanese dialogue. Journal of Pragmatics 32 (2000). doi:10.1016/S0378-2166(99)00118-6

Katagiri, Y.: Dialogue functions of Japanese sentence-final particles 'yo' and 'ne'. Journal of Pragmatics 39 (2007). doi:10.1016/j.pragma.2007.02.013

Yoshimoto, K.: Identifying zero pronouns in Japanese dialogue. In: Coling Budapest 1988, International Conference on Computational Linguistics, vol. 2 (1988).

Popura poketto bunko books (2023).

What are aoi tori bunko books? Japanese Book Club Cafe (2023).

Japanese novel review: 霧のむこうのふしぎな町 the marvelous village veiled in mist by Kashiwaba Sachiko. Japanese Book Club Cafe (2023).

DuBay, W.H.: The Classic Readability Studies. Technical Report, ERIC Clearinghouse (2007). doi:10.1109/TPC.2008.2007872

Mousavi, S.M., Roccabruna, G., Lorandi, M., Caldarella, S., Riccardi, G.: Evaluation of response generation models: Shouldn't it be shareable and replicable? In: Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (Hybrid) (2022).