Exploring the Use of Cohesive Devices in Dementia within an Elderly Italian Semi-spontaneous Speech Corpus

Giorgia Albertin*,†, Elena Martinelli†

Alma Mater Studiorum - University of Bologna, Department of Classical Philology and Italian Studies, 32 Zamboni Street, 40126 Bologna, Italy

CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04-06, 2024, Pisa, Italy
* Corresponding author.
† The contribution of each author to the paper is specified in the CRediT authorship statement declaration.
Email: giorgia.albertin3@unibo.it (G. Albertin); elena.martinelli12@unibo.it (E. Martinelli)
URL: https://www.unibo.it/sitoweb/giorgia.albertin3 (G. Albertin); https://www.unibo.it/sitoweb/elena.martinelli12/ (E. Martinelli)
ORCID: 0000-0002-5728-3473 (G. Albertin); 0009-0007-4399-6951 (E. Martinelli)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

Abstract
The study of language disruption in dementia, aimed at identifying which features correlate with cognitive impairment, is a growing area in computational linguistic research. Still, further development is needed in the analysis of discourse phenomena that also undergo deterioration and that can help expand our understanding of dementia-related speech and refine automatic tools. This paper explores the discourse property of cohesion by investigating three types of cohesive devices: reference, lexical iteration, and connectives. Ten features related to these categories have been defined and automatically extracted from an Italian corpus of semi-spontaneous speech collected from dementia patients and healthy controls. Some of the designed features have proven significant for the binary classification of the two groups, and further quantitative analysis highlights interesting differences in the use of cohesive devices that seem to be associated with cognitive decline.

Keywords
Cohesion, Cohesive devices, Dementia, Cognitive Impairment, Semi-spontaneous speech

1. Introduction

Linguistic deficits commonly characterize neurodegenerative diseases from their onset. In Dementia, or Major Neurocognitive Disorder (DSM-5 [1]), a syndrome of acquired and progressive impairment in cognitive function that interferes with independence in everyday life, language deterioration manifests itself within a broader framework of cognitive impairment, which can affect memory, visuo-spatial skills, executive functions and reasoning. Deficits in both verbal production and comprehension have been observed, despite the specificity of the different etiological subtypes of Dementia, among which the most common is Alzheimer's Disease (AD), characterized by a primary impairment in episodic memory. In AD, for example, the well-established linguistic deficits include word-finding problems (anomia, the production of semantic paraphasias [2, 3] and the "tip-of-the-tongue" experience [4]), low speech rate, poor word comprehension [5] and, as the disease worsens, a generalized simplification of syntax [6]. The discourse and pragmatic levels are also affected by cognitive decline. Errors in referential cohesion have been reported, in particular regarding the ambiguous use of pronouns [7]. Coherence is compromised, especially in spontaneous speech: the discourse shows an abundance of irrelevant details and an overt difficulty in mentioning the key concept or referring to the topic, resulting in a lack of informativeness in communication [8, 9, 10].

In recent years, speech analysis in cognitive decline has gained increasing importance in the development of low-cost and portable tools for dementia screening, also supported by the remarkable advancements in Natural Language Processing (NLP) and Machine Learning (ML) technologies [11]. The refinement of classification systems goes hand in hand with the operationalization of linguistic features computed from oral productions, which need to be adapted to different languages. Regarding Italian, the OPLON (OPportunities for active and healthy LONgevity) project (2014-2016) was devoted to the automatic extraction of an extensive group of linguistic features at the acoustic, rhythmic, readability, lexical, morpho-syntactic and syntactic levels from a speech corpus of cognitively impaired patients and healthy peers [12, 13]. Analysis of the significance of the features highlighted that the acoustic ones largely correlated with the cognitive state of the subjects [14].

Expanding the list of language levels covered to include speech properties would enrich the features used for classification and, in addition, could broaden our understanding of how cognitive decline manifests itself in verbal competence. Nevertheless, defining specific features of higher-level and complex phenomena is not trivial. Drawing inspiration from works that propose a "stratified" approach to discourse analysis, which individually considers macro-phenomena that intersect with one another [15, 16], this paper examines cohesion, the property of the superficial form of the text to reflect its internal unity [17]. Cohesion ensures continuity in discourse through a network of cohesive devices, mainly words or morphemes, that contribute to maintaining the semantic relations occurring in the text [17]. Therefore, we propose a method to design and formalize a set of cohesion features, with the aim of observing whether they contribute to discriminating the speech of individuals with dementia from that of healthy peers. Specifically, three types of elements, which Halliday & Hasan [18] indicate among the major contributors to cohesion, were taken into consideration: reference, lexical iteration and connectives. The implementation of measures based on cohesive devices is the first step towards including discourse properties in the automatic analysis of language in cognitive decline. The study of their interaction with features of other linguistic levels is crucial to observe whether they have a positive impact on the discrimination between dementia subjects and healthy subjects. The work presented in this paper, therefore, is intended as a preliminary analysis that will serve to pursue more sophisticated ML classification in the future.

2. Corpus Description

In this study, we used the corpus collected within the project "Linguistic characteristics of the speech of elderly subjects with dementia" [20, 21], approved by the Bioethics Committee of the University of Bologna (Prot. N. 0072032/2022). The corpus consists of the oral linguistic production of 40 Italian-speaking individuals living in Basilicata, forming two groups balanced by sex and age. Although the initial objective was to balance the cohorts on education level as well, this was not possible because the information was missing from some patients' medical records. From a sociolinguistic perspective, it is important to note in advance that some participants, albeit Italian-speaking, were also exposed to dialect systems in their lives. This aspect explains the frequent occurrence of substandard linguistic expressions in the collected speech, and will be discussed in Section 4 in relation to the results of the analysis.

The Pathological Group (PG) consists of 20 patients suffering from different forms of dementia (9 cases of Alzheimer's Disease, 2 of Mixed Dementia, 5 of unspecified Dementia, 3 of Vascular Dementia, 1 of Frontotemporal Dementia), recruited at the "Universo Salute - Opera Don Uva (PZ)" rest home, and the Control Group (CG) consists of 20 subjects with neurotypical cognitive aging. Informed consent was obtained from all participants (in the case of patients, from their family members, caregivers, or legal guardians). As a first step, the recruited subjects underwent an evaluation of their cognitive status through the administration of the following four neuropsychological tests: Mini-Mental State Examination (MMSE [22]), Montreal Cognitive Assessment (MoCA [23]), and Verbal Fluency Test, both Phonemic [24, 25, 26, 27] and Semantic [28]. Table 1 summarizes the recruitment criteria and the demographics of the study participants.

Table 1
Recruitment Criteria (age; language exposure; neurological status or diagnosis; cognitive scores: MMSE, MoCA, phonemic (PF) and semantic (SF) fluency) and Demographics (age and sex).

                        Control Group                               Pathological Group
Recruitment criteria    Age > 60 years                              Age > 60 years
                        Monolingual                                 Monolingual
                        Italian L1                                  Italian L1
                        Absence of neurological/sensory deficits    Clinical diagnosis of dementia
                        MMSE ≥ 22                                   MMSE < 22
                        MoCA > 19.262                               MoCA ≤ 19.262
                        PF ≥ 17.35                                  PF < 17.35
                        SF ≥ 7.25                                   SF < 7.25
Age                     81 ± 6.3 (range: 63-91)                     81 ± 6.9 (range: 63-92)
Sex                     12F, 8M                                     12F, 8M

Then, two narrative tasks (the story of a journey and the story of Christmas holiday traditions) and one picture description task (using the stimulus figure in "Language Examination II" [19], see Figure 1) were administered to collect semi-spontaneous speech, elicited with the following stimulus sentences: 1) "Do you want to tell me about a trip you took?"; 2) "How do you usually spend Christmas day?"; 3) "Could you describe this figure to me?".

Figure 1: Esame del Linguaggio II [19], stimulus figure used in the picture description task.

This protocol allowed the collection of approximately 9 hours of audio (i.e., 8 hours for the recruited groups and 1 hour for the interviewer), subsequently annotated at various linguistic levels. Using the ELAN software [29], the corpus was manually transcribed at the orthographic level, segmented into utterances (i.e., the reference unit of discursive analysis [30]), and annotated at the prosodic level (theoretical framework: the Language into Act Theory - L-AcT [31]). Table 2 summarizes the size of the corpus and the average material (audio/tokens) collected for each patient and control subject. The total number of tokens was calculated on the orthographic transcription of the corpus (cleaned of annotation tags), and amounts to 49,263 tokens (i.e., 23,518 for PG and 25,745 for CG). Finally, using the Gagliardi & Tamburini pipeline [32], tokenization, lemmatization, part-of-speech tagging, and syntactic parsing were automatically performed for the entire corpus.

Table 2
Corpus Size. Audio duration and number of tokens (of the transcriptions) are reported with respect to the groups (Gr. durat. and Gr. count), to the single subject (Subj. avg. (sd)) and to the whole corpus.

                      Audio                                Tokens
                      Gr. durat.   Subj. avg. (sd)         Gr. count   Subj. avg. (sd)
Pathological group    04:25:26     00:12:00 (00:08:00)     23,518      1,176 (1,218)
Control group         03:23:17     00:10:00 (00:05:00)     25,745      1,287 (710)
Total                 07:48:43     -                       49,263      -

3. Cohesive Devices' Features

Ten features that quantify the use of cohesive devices by the speakers were designed and formalised. The features were computed for each subject, thus referring to the amount of speech produced by the single individual in the three tasks. To comprehensively address the categories of cohesive devices considered, we used the .conll files resulting from the data annotation as the input for our analysis. The features were automatically extracted via Python scripts. The methodology used is described in detail in the following sections.

3.1. Reference

Reference is involved when an expression that requires interpretation by referring to something else occurs in the discourse [18]. This mechanism can be employed in both anaphoric and cataphoric uses, to refer respectively to something already known in the text or to anticipate it. Reference functions either by repetition, which can be partial (e.g., through a synonym) or total, by semantic contiguity, or by substitution with pronouns or other elements [17]. It is this last, substitution-based type of referential expression, closely linked to the textual dimension, that is investigated through the features, thus focusing on the occurrence of anaphora and cataphora.

An extensive literature review was necessary to select a relevant group of these expressions in the Italian language (see [33, 34, 35]). The group of elements collected includes pronouns (personal, e.g., io, tu, lei, lui; demonstrative, e.g., questo, quello; indefinite, e.g., alcuni, tutti; and possessive), possessive adjectives (e.g., mio, tuo), and deictics (e.g., fuori, sopra, avanti, qua, qui, dentro, dietro, giù, indietro, su, lì, oltre, ci). The occurrences of these groups were counted and divided by the total number of tokens per subject (COE_REF). Additionally, the pronoun density (COE_PRON_DENS), defined as the ratio between pronouns and nouns uttered [36], was computed for each subject.
Figure 2: Example of .conll annotation. Occurrences of automatically extracted cohesive devices are framed: lui as a referential expression (note the specification PronType:Prs in the FEAT column), the repetition of word forms and lemma of a verb (parlava - parlare) and the connectives e and quando.

3.2. Lexical iteration

According to Halliday and Hasan [18], the iteration of a lexical item is a specific use of the repetition-type referential mechanism, which acquires cohesive force on its own because it is typically used when the referent is farther away in the text. This set of features focuses on the repetition of three main open-class categories, namely nouns, (main) verbs, and adjectives. The use of words from these classes affects the richness of vocabulary, reflecting the speaker's tendency toward lexical variation. Word-finding problems occurring in cognitive decline often manifest as difficulties in retrieving forms from the lexicon. The repetition of the same words can then occur as a sort of repair mechanism, resulting in semantically impoverished speech. Conversely, the use of some types of closed-class particles, such as prepositions and auxiliaries, is bound to the syntactic structure.

Lexical iteration features were computed by separately considering word forms and lemmas of nouns, verbs, and adjectives. These features include the repetitions of elements divided by the total number of words (COE_RIP_LEM, COE_RIP_WORD), the average number of repetitions for repeated elements (COE_MEDRIP_LEM, COE_MEDRIP_WORD), and the maximum number of repetitions over the total number of iterations (COE_MAXRIP_LEM, COE_MAXRIP_WORD).
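The three families of lexical iteration features can be sketched as follows. The text does not spell out how a "repetition" is counted, so this sketch makes an assumption: every occurrence of a noun, verb or adjective after its first one counts as one repetition. The tag names (NOUN, VERB, ADJ), the function name and the sample tokens are likewise hypothetical.

```python
from collections import Counter

# Assumption: open-class categories are identified by UD-style UPOS tags.
OPEN_CLASSES = {"NOUN", "VERB", "ADJ"}


def iteration_features(tokens, key):
    """Compute RIP, MEDRIP and MAXRIP on word forms (key="form") or lemmas (key="lemma")."""
    n_tokens = len(tokens)
    counts = Counter((t[key], t["upos"]) for t in tokens if t["upos"] in OPEN_CLASSES)
    # Assumption: each occurrence after the first counts as one repetition.
    extra = [c - 1 for c in counts.values() if c > 1]
    total = sum(extra)
    return {
        # Repetitions over the total number of words.
        "RIP": total / n_tokens if n_tokens else 0.0,
        # Average number of repetitions per repeated element.
        "MEDRIP": total / len(extra) if extra else 0.0,
        # Largest repetition count over the total number of iterations.
        "MAXRIP": max(extra) / total if total else 0.0,
    }


# Hypothetical eight-token sample.
TOKENS = [
    {"form": "parlava", "lemma": "parlare", "upos": "VERB"},
    {"form": "lui", "lemma": "lui", "upos": "PRON"},
    {"form": "parlava", "lemma": "parlare", "upos": "VERB"},
    {"form": "e", "lemma": "e", "upos": "CCONJ"},
    {"form": "parlo", "lemma": "parlare", "upos": "VERB"},
    {"form": "casa", "lemma": "casa", "upos": "NOUN"},
    {"form": "casa", "lemma": "casa", "upos": "NOUN"},
    {"form": "bella", "lemma": "bello", "upos": "ADJ"},
]
lemma_feats = iteration_features(TOKENS, "lemma")
word_feats = iteration_features(TOKENS, "form")
print(lemma_feats, word_feats)
```

In this sample the lemma parlare is repeated under two different word forms, so the lemma-based counts exceed the form-based ones. Other readings of "repetition" (for instance, counting all occurrences of a repeated element) would change the values but not the structure of the computation.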
3.3. Connectives

As defined by Ferrari [37], connectives are morphologically invariable forms (e.g., conjunctions or locutions) that explicitly indicate logical relations between parts of the text and pertain to the logical level. Elements from different grammatical classes can be used as connectives and are classified based on their function, which usually reflects their meaning (e.g., temporal, causal, additive).

To compile an extensive list of connectives, we rely on the Lexicon of Italian Connectives - LICO [38, 39] (http://connective-lex.info/). LICO contains 173 entries, including single words (e.g., e, se, ma, infatti, quando, quindi), complex expressions (e.g., a causa di, da allora), and correlatives (e.g., da un lato ... dall'altro). Connectives are reported along with their lexical or orthographic variants, part-of-speech category, the semantic relations conveyed according to the Penn Discourse Treebank 3.0 schema [40], examples of usage, and alignments with connectives from other languages. One feature computes the occurrences of connectives relative to the total number of tokens per subject (COE_TC).

Finally, the last feature was designed as an attempt to capture the overall impact of the classes of cohesive devices studied in this paper in the two cohorts of corpus speakers. The role of cohesion elements was therefore comprehensively measured in COE_TOT by summing referential-substitute expressions, lexical iteration items and connectives, and dividing by the total number of words. Figure 2 shows, as an example, an excerpt from the annotation in .conll format, in which some of the linguistic elements considered are highlighted.

Table 3
Results of the Kolmogorov-Smirnov test. The cohesive devices' features are reported along with their p-value. Significant p-values are marked with *. The p-values of features that were significant in the Kolmogorov-Smirnov test but not after Bonferroni's correction are marked with †.

Features            p-value
COE_TC              0.33 †
COE_REF             1
COE_REF_DENS        1
COE_RIP_LEM         0.04 *
COE_RIP_WORD        1
COE_MEDRIP_LEM      0.81
COE_MEDRIP_WORD     0.33
COE_MAXRIP_LEM      1
COE_MAXRIP_WORD     1 †
COE_TOT             0.04 *

Table 4
Frequencies of cohesive devices by subject. The average number of occurrences of substitution-type reference items, iterations of lemmas and of word forms (of nouns, adjectives and verbs) and connectives for each subject in PG and CG is reported, along with the standard deviation in parentheses.

Cohesive devices    PG                CG
Reference           146.5 (152.23)    161 (90.93)
Iter. lemma         68.9 (68.00)      87.05 (42.25)
Iter. word form     74.15 (74.38)     87.8 (49.25)
Connectives         23.8 (35.15)      36.65 (26.68)
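The connective count and the comprehensive feature described in Section 3.3 could be sketched along the following lines. This is an illustration under stated assumptions, not the authors' implementation: the connective list is a small hand-picked subset of the LICO examples quoted in the text (the real lexicon has 173 entries), multiword connectives are matched greedily with the longest entry first, correlatives are ignored, and no disambiguation of connective versus non-connective uses is attempted. Function names and sample data are hypothetical.

```python
# Hypothetical subset of LICO entries, taken from the examples quoted in the text.
CONNECTIVES = ["e", "se", "ma", "infatti", "quando", "quindi", "a causa di", "da allora"]


def count_connectives(forms, connectives=CONNECTIVES):
    """Count connective occurrences in a list of lowercased word forms."""
    # Longest patterns first, so "a causa di" wins over any shorter overlap.
    patterns = sorted((tuple(c.split()) for c in connectives), key=len, reverse=True)
    count = 0
    i = 0
    while i < len(forms):
        for pattern in patterns:
            if tuple(forms[i:i + len(pattern)]) == pattern:
                count += 1
                i += len(pattern)
                break
        else:
            i += 1  # No connective starts at this position.
    return count


def coe_tc(forms):
    """Connective occurrences relative to the number of tokens."""
    return count_connectives(forms) / len(forms) if forms else 0.0


def coe_tot(n_reference, n_iteration, n_connectives, n_tokens):
    """Sum of the three device counts divided by the number of words."""
    return (n_reference + n_iteration + n_connectives) / n_tokens if n_tokens else 0.0


# Hypothetical thirteen-token sample.
FORMS = "lui parlava e quando pioveva restava a casa a causa di la pioggia".split()
n_conn = count_connectives(FORMS)
print(n_conn, coe_tc(FORMS))
```

Here "e", "quando" and "a causa di" are matched, while the "a" of "a casa" is not. Which iteration count (lemma-based or form-based) enters COE_TOT is not stated in the text, so `coe_tot` leaves that choice to the caller.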
4. Results

The statistical significance of the cohesion features for the binary discrimination of the PG and CG cohorts was calculated using the non-parametric Kolmogorov-Smirnov test, due to the limited sample size of the corpus. Given the number of comparisons performed, we adjusted the results with Bonferroni correction to control for Type I error. This approach involves adjusting the significance level by dividing the conventional alpha value (0.05) by the total number of comparisons made. The results of the test, reported in Table 3, show that two of the designed features significantly contribute to differentiating the two groups: a feature related to the iteration of lemmas (COE_RIP_LEM) and the comprehensive feature of cohesive devices (COE_TOT). The distribution of these features is reported in Figure 3.

Figure 3: Distribution plots of significantly discriminative features. COE_RIP_LEM indicates the repetitions of lemmas of nouns, adjectives and verbs, and COE_TOT is a comprehensive feature of all the classes of cohesive devices considered.

After the application of Bonferroni's correction, two initially significant features, namely COE_TC and COE_MAXRIP_WORD, no longer reached significance. Given the exploratory nature of the experiment, which involves the formalisation of new features in order to discriminate subjects with cognitive impairment from healthy controls in Italian, we have nevertheless chosen to highlight the p-values of these features in Table 3.

We can observe that, compared with the control group, the speech of dementia subjects is characterized by fewer repetitions of the same noun, verb and adjective lemmas out of the total number of words uttered, as captured by COE_RIP_LEM. Thus, in this dataset, the PG appears less prone to lexical iteration of lemmas than the CG. However, if we look at the distributions of occurrences of the cohesive elements considered, reported in Table 4, interesting trends can be noticed. Indeed, the quantitative analysis of lexical repetitions revealed a disparity between repeated lemmas and repeated word forms of the same grammatical categories (nouns, adjectives and verbs) between the two groups. Specifically, despite the high variability due to subjective differences, in PG the average repetition of forms (mean=74.15) is higher than the repetition of lemmas (mean=68.9), while the two values are very similar in CG (lemmas: mean=87.05, words: mean=87.8). This imbalance in favor of forms in the dementia patients appears to uncover lexical impoverishment compared to healthy subjects. Indeed, in CG, although a higher overall number of repetitions is registered, it is combined with a more balanced distribution between lemmas and forms, suggesting greater lexical variety.

The opposing trend observed between lemmas and forms could also be explained by the sociolinguistic profile of the data, related to the diatopic variation of the Italian language [41]. Indeed, speakers from both groups show an extensive use of dialectal terms and structures characteristic of the Italian variety spoken in the Lucanian Apennine area. As reported in Section 2, the annotation was conducted automatically using the pipeline developed by Gagliardi & Tamburini [32], which is designed to analyze standard Italian. Therefore, it is likely that the system struggled to handle some substandard expressions, which often diverge orthographically from the other words in the transcription, as can be observed in this example from a PG subject:

gemm' a trua' [=andammo a fare visita] a mia suocera, ca [=che] mio suocero è morto (...).
(English: "we went to visit my mother-in-law, since my father-in-law has died".)

It cannot be excluded that the presence of dialect may also have influenced the automatic extraction of other cohesive devices. Indeed, the higher frequency in CG of substitution-type reference items (mean=161) and connectives (mean=36.65) compared to PG (ref. mean=146.5, conn. mean=23.8) contrasts with what has been observed in the oral production of narrative discourse in cohorts of dementia subjects and healthy controls [8]. Therefore, we consider it possible that automatic feature extraction performed on manually checked annotation may yield different results from those obtained here.

Nevertheless, the significance of the comprehensive feature (COE_TOT) indicates that the use of the cohesive devices investigated in this paper plays a role in distinguishing dementia subjects from healthy controls. In Figure 3 it can be noted that COE_TOT shows, on average, lower values for the PG compared to the CG. This result suggests that the linguistic processing of some phenomena related to cohesion (i.e., substitution-type reference elements, lexical iteration items, and connectives) is generally affected by cognitive decline in semi-spontaneous speech. Thus, the analysis of discourse properties seems to be a promising path for studying the linguistic characterisation of neurodegenerative disorders. We therefore hope that our approach could in the future be applied to phenomena closely related to cohesion (first of all, coherence) or extended to other domains, such as pragmatics, that may mask subtle clues of cognitive frailty.
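The testing procedure described at the start of Section 4 (a two-sample Kolmogorov-Smirnov comparison per feature, followed by a Bonferroni-adjusted significance level) can be sketched with the standard library alone. This is a hedged illustration, not the authors' code: the p-value is approximated by a permutation scheme rather than by the exact or asymptotic distribution that a library routine such as `scipy.stats.ks_2samp` would use, and all feature values below are invented for the example.

```python
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value equal to x (handles ties).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Approximate the KS p-value by reshuffling group labels (an assumption of this sketch)."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def bonferroni_threshold(alpha, n_comparisons):
    """Significance level after dividing alpha by the number of comparisons."""
    return alpha / n_comparisons


# Invented per-subject values of one feature for two small groups.
pg_values = [0.10, 0.12, 0.09, 0.11, 0.13, 0.08, 0.10, 0.12, 0.07, 0.11]
cg_values = [0.15, 0.17, 0.14, 0.16, 0.18, 0.19, 0.15, 0.16, 0.20, 0.17]
p_value = permutation_pvalue(pg_values, cg_values)
threshold = bonferroni_threshold(0.05, 10)  # ten features, as in Section 3
print(p_value, threshold, p_value <= threshold)
```

Dividing alpha by the number of comparisons is equivalent to multiplying each p-value by that number (capped at 1) and comparing the result with the original alpha; the sketch uses the first form because it is the one described in the text.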
5. Conclusion

In this work, we present a methodology for delineating linguistic features of cohesion to track and study changes in discourse properties in the speech of individuals with cognitive impairment compared to healthy peers. The research focused on three types of cohesive devices, i.e., reference, lexical iteration, and connectives, that were automatically extracted from an Italian corpus of semi-spontaneous speech from dementia subjects and controls, collected in Basilicata. Statistical significance for binary discrimination was computed by applying the Kolmogorov-Smirnov test and then adjusting the results with Bonferroni's method. The test shows that a feature based on the repetition of lemmas and the one related to the set of cohesive devices jointly considered contribute to distinguishing the two groups. Moreover, the quantitative distribution of the cohesive devices reveals differences in the use of elements within the considered categories between PG and CG, which seem to highlight a general deterioration in discursive competencies associated with dementia. The results obtained provide a preliminary basis for further study of discourse properties in cognitive decline, with the aim of expanding the set of linguistic features that can be automatically extracted to other levels of language. This expansion is intended to refine digital systems that could be employed as support for the early diagnosis and monitoring of neurodegenerative diseases, potentially improving timely interventions for patients and their caregivers.

CRediT authorship statement declaration

GA: Conceptualization, Methodology, Software (i.e. features formalization), Formal analysis, Writing (§ 1, 3, 4, 5).
EM: Resources (i.e. data collection), Data curation (i.e. manual transcription), Writing (§ 2).

References

[1] American Psychiatric Association, Diagnostic and statistical manual of mental disorders: DSM-5, 5th ed., American Psychiatric Association, Washington, DC, 2013.
[2] E. Catricalà, P. A. Della Rosa, V. Plebani, D. Perani, P. Garrard, S. F. Cappa, Semantic feature degradation and naming performance. Evidence from neurodegenerative disorders, Brain and Language 147 (2015) 58–65.
[3] V. Taler, N. A. Phillips, Language performance in Alzheimer's disease and mild cognitive impairment: a comparative review, Journal of Clinical and Experimental Neuropsychology 30 (2008) 501–556.
[4] E. A. Stamatakis, M. A. Shafto, G. Williams, P. Tam, L. K. Tyler, White matter changes and word finding failures with increasing age, PloS One 6 (2011) e14496.
[5] A. E. Budson, N. W. Kowall, The handbook of Alzheimer's disease and other dementias, John Wiley & Sons, 2011.
[6] S. O. Orimaye, J. S.-M. Wong, K. J. Golden, Learning predictive linguistic features for Alzheimer's disease and related dementias using verbal utterances, in: Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From linguistic signal to clinical reality, 2014, pp. 78–87.
[7] S. Carlomagno, A. Santoro, A. Menditti, M. Pandolfi, A. Marini, Referential communication in Alzheimer's type dementia, Cortex 41 (2005) 520–534.
[8] C. Drummond, G. Coutinho, R. P. Fonseca, N. Assunção, A. Teldeschi, R. de Oliveira-Souza, J. Moll, F. Tovar-Moll, P. Mattos, Deficits in narrative discourse elicited by visual stimuli are already present in patients with mild cognitive impairment, Frontiers in Aging Neuroscience 7 (2015) 96.
[9] S. Ahmed, A.-M. F. Haigh, C. A. de Jager, P. Garrard, Connected speech as a marker of disease progression in autopsy-proven Alzheimer's disease, Brain 136 (2013) 3727–3737.
[10] T. Bschor, K.-P. Kühl, F. M. Reischies, Spontaneous speech of patients with dementia of the Alzheimer type and mild cognitive impairment, International Psychogeriatrics 13 (2001) 289–298.
[11] S. De la Fuente Garcia, C. W. Ritchie, S. Luz, Artificial intelligence, speech, and language processing approaches to monitoring Alzheimer's disease: a systematic review, Journal of Alzheimer's Disease 78 (2020) 1547–1574.
[12] L. Calzà, G. Gagliardi, R. R. Favretti, F. Tamburini, Linguistic features and automatic classifiers for identifying mild cognitive impairment and dementia, Computer Speech & Language 65 (2021) 101113.
[13] D. Beltrami, G. Gagliardi, R. Rossini Favretti, E. Ghidoni, F. Tamburini, L. Calzà, Speech analysis by natural language processing techniques: a possible tool for very early detection of cognitive decline?, Frontiers in Aging Neuroscience 10 (2018) 369.
[14] G. Gagliardi, F. Tamburini, Linguistic biomarkers for the detection of mild cognitive impairment, Lingue e linguaggio 20 (2021) 3–31.
[15] B. S. Kim, Y. B. Kim, H. Kim, Discourse measures to differentiate between mild cognitive impairment and healthy aging, Frontiers in Aging Neuroscience 11 (2019) 221.
[16] J. Kim, J. Shim, J. H. Yoon, Subjective rating scale for discourse: Evidence from the efficacy of subjective rating scale in amnestic mild cognitive impairments, Medicine 98 (2019) e14041.
[17] A. Ferrari, Linguistica del testo. Principi, fenomeni, strutture, Carocci, Roma, 2014.
[18] M. A. K. Halliday, R. Hasan, Cohesion in English, Routledge, 2014.
[19] P. Ciurli, P. Marangolo, A. Basso, Esame del Linguaggio II. Manuale e materiale d'esame, Giunti, Firenze, 1996.
[20] E. Martinelli, V. Garrammone, F. Mori, I. Nolè, F. Cameriero, M. Martino, G. Di Bello, G. Gagliardi, DemCorpus-basilicata: Dementia corpus, 2022. URL: http://hdl.handle.net/20.500.11752/OPEN-989, ILC-CNR for CLARIN-IT repository hosted at Institute for Computational Linguistics "A. Zampolli", National Research Council, in Pisa.
[21] E. Martinelli, G. Gagliardi, Compromissioni semantico-lessicali nei pazienti italofoni affetti da demenza: un'analisi corpus-based, Italiano LinguaDue 15 (2023) 711–732. doi:10.54103/2037-3597/21986.
[22] E. Magni, G. Binetti, A. Bianchetti, R. Rozzini, M. Trabucchi, Mini-mental state examination: a normative study in Italian elderly population, European Journal of Neurology 3 (1996). URL: https://api.semanticscholar.org/CorpusID:24843663.
[23] S. Conti, S. Bonazzi, M. Laiacona, M. Masina, M. V. Coralli, Montreal Cognitive Assessment (MoCA)-Italian version: regression based norms and equivalent scores, Neurological Sciences 36 (2015) 209–214. URL: https://api.semanticscholar.org/CorpusID:3026657.
[24] C. Caltagirone, G. Gainotti, G. Carlesimo, L. Parnetti, L. Fadda, R. Gallassi, et al., Batteria per la valutazione del deterioramento mentale (parte I): descrizione di uno strumento di diagnosi neuropsicologica, Archivio di Psicologia, Neurologia e Psichiatria 56 (1995) 461–470.
[25] G. A. Carlesimo, C. Caltagirone, G. Gainotti, et al., Batteria per la valutazione del deterioramento mentale (parte II): standardizzazione e affidabilità diagnostica nell'identificazione di pazienti affetti da sindrome demenziale, Archivio di Psicologia, Neurologia e Psichiatria 56 (1995) 471–488.
[26] G. Carlesimo, C. Caltagirone, L. Fadda, et al., Batteria per la valutazione del deterioramento mentale (parte III): analisi dei profili qualitativi di compromissione cognitiva, Archivio di Psicologia, Neurologia e Psichiatria 56 (1995) 489–502.
[27] G. A. Carlesimo, C. Caltagirone, G. Gainotti, et al., The mental deterioration battery: Normative data, diagnostic reliability and qualitative analyses of cognitive impairment, European Neurology 36 (1996) 378–384.
[28] H. Spinnler, G. Tognoni, Standardizzazione e taratura italiana di test neuropsicologici: gruppo italiano per lo studio neuropsicologico dell'invecchiamento, Masson Italia periodici, Milano, 1987. Supplementum 8 - Italian Journal of Neurological Sciences.
[29] ELAN (version 6.2) [computer software], 2021. URL: https://archive.mpi.nl/tla/elan.
[30] J. L. Austin, How to do things with words, Clarendon Press, Oxford, 1962.
[31] E. Cresti, M. Moneglia, The illocutionary basis of information structure: The Language into Act Theory (L-AcT), in: E. Adamou, et al. (Eds.), Information Structure in Lesser-described Languages: Studies in prosody and syntax, John Benjamins Publishing Company, Amsterdam, 2018, pp. 360–402.
[32] G. Gagliardi, F. Tamburini, The automatic extraction of linguistic biomarkers as a viable solution for the early diagnosis of mental disorders, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, S. Piperidis (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2022, pp. 5234–5242. URL: https://aclanthology.org/2022.lrec-1.561.
[33] M. Prandi, C. De Santis, Le regole e le scelte. Introduzione alla grammatica italiana, UTET, Torino, 2006.
[34] C. Andorno, Linguistica testuale. Un'introduzione, Carocci, 2003.
[35] A. Ferrari, L. Zampese, Dalla frase al testo: una grammatica per l'italiano, Zanichelli, 2000.
[36] M. M. Louwerse, P. M. McCarthy, D. S. McNamara, A. C. Graesser, Variation in language and cohesion across written and spoken registers, in: Proceedings of the Annual Meeting of the Cognitive Science Society, volume 26, 2004.
[37] A. Ferrari, Connettivi, Enciclopedia dell'italiano (2010).
[38] A. Feltracco, E. Ježek, B. Magnini, Enriching a lexicon of discourse connectives with corpus-based data, in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018.
[39] A. Feltracco, E. Ježek, B. Magnini, M. Stede, LICO: A lexicon of Italian connectives, CLiC-it (2016) 141.
[40] B. Webber, R. Prasad, A. Lee, A. Joshi, A discourse-annotated corpus of conjoined VPs, in: Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016), 2016, pp. 22–31.
[41] G. Berruto, Sociolinguistica dell'italiano contemporaneo, Carocci, Roma, 2021.