A NLP-based Analysis of Reflective Writings by Italian Teachers

Giulia Chiriatti (1), Valentina Della Gala (2), Felice Dell’Orletta (3), Simonetta Montemagni (3), Maria Chiara Pettenati (2), Maria Teresa Sagri (2), Giulia Venturi (3)

(1) Università di Pisa, giuliachiriatti@gmail.com
(2) Istituto Nazionale Documentazione, Innovazione, Ricerca Educativa (INDIRE), {v.dellagala,mc.pettenati,t.sagri}@indire.it
(3) Istituto di Linguistica Computazionale “Antonio Zampolli” (ILC-CNR) - ItaliaNLP Lab, {nome.cognome}@ilc.cnr.it

Abstract

English. This paper reports the first results of a wider study that explores the potential of an NLP-based approach to the analysis of a corpus of reflective writings on teaching activities. We investigate how a wide set of linguistic features allows us to reconstruct the linguistic profile of the texts written by Italian teachers and to predict whether they are reflective.

Italian (translated). This article describes the first results of a broader study that employs tools and methods for automatic text analysis and classification to describe the linguistic characteristics of a corpus of documents written by newly hired teachers in Italian schools who reflect on a specific teaching experience.

1 Introduction

Since 2014, the “National Institute for Documentation, Innovation and Educational Research” (INDIRE) has managed, on behalf of the Ministry of Education (MIUR), the induction program for Italian Newly Qualified Teachers (NQTs), i.e. the induction phase of teachers’ professional development that aims to support teachers in their transition from initial teacher education into working life in schools. First piloted in 2014, the program became effective in 2015 with the DM 850/2015 [1]. It involves all newly hired teachers from primary to secondary school, for a total of 130,000 NQTs over the last 3 years. The underlying theoretical framework, developed by INDIRE, MIUR and the University of Macerata, is based on the alternation of laboratory and traditional classroom activities with documentation and reflection activities. The purpose is “to influence practices through a process that alternates between moments of immersion and distancing, which are actualised in When I teach and When I reconsider my teaching to think of what happened” (Magnoler et al., 2016). An on-line environment developed and managed by INDIRE [2] was set up to help teachers reflect on and document their educational and professional activities (see Figure 1) during the induction program. All evidence of the instructional tasks (surveys, writing tasks, lesson plans, instructional materials, etc.) is collected in the e-portfolio and printed by the teachers for the final exam. A yearly monitoring of teachers’ activities is carried out by INDIRE to assess the effectiveness of the whole induction program, as well as of the single instructional tasks. Its aim is to modify the program, whenever needed, in order to improve the scaffolding that stakeholders offer to the newly qualified teachers and, ultimately, teachers’ professional development.

Figure 1: The on-line environment collecting the e-portfolio of the newly qualified teachers.

[1] http://neoassunti.indire.it/2018/files/DM 850 27 10 2015.pdf
[2] The e-portfolio is available at http://neoassunti.indire.it/2018/

In this paper, we report the first results of an ongoing study that investigates the potential of Natural Language Processing methods and tools for the analysis of the NQTs’ e-portfolio. We consider in particular the documents written by the 26,526 teachers hired in the 2016/17 school year. Many protocols (or models) have been proposed to assess reflection in teachers’ writing, e.g. (Sparks-Langer et al., 1990; Hatton and Smith, 1995; Kember et al., 2008; Larrivee, 2008; Harland and Wondra, 2011). These models rely on features that suggest either different levels of reflection (i.e. they focus on the depth of reflection) or the content of reflection (i.e. they focus on the breadth of reflection), and they have usually been found to mix features of both classes (depth and breadth) (Ullmann, 2015). We focus instead on the analysis of form, to study which linguistic phenomena mainly distinguish reflective from non reflective writings. Specifically, we devised a methodology to investigate whether, and to what extent, a wide set of linguistic features automatically extracted from texts can be exploited to characterize NQTs’ reflective writings.

Our contribution: i) we collect a corpus of reflective writings manually annotated by experts in the learning science domain and classified with respect to different types of reflectivity; ii) we detect a wide set of linguistic phenomena characterizing the collected writings; iii) we report the first results of an automatic classification experiment to assess which features contribute most to the automatic prediction of reflectivity.

2 Defining reflection

Within the teaching and teacher education domain, a very large number of studies have been dedicated to the conceptualization and analysis of teachers’ reflection and teachers’ reflective practice. Dewey (1933), Van Manen (1977), Schon (1984, 1987, 1991) and Mezirow (1990) are among the main references. Attention to reflective thinking in the teacher education field has increased since the 80s as a reaction to the overly technical view of teaching. Scholars have intensely studied reflection as a concept: they have identified several levels and types of reflection, how it works during and after teachers’ professional practice, its role and purpose in teachers’ professional development, how it can be embedded in the curriculum of teacher preparation or professional development, and which techniques may be used to promote it (discussion groups, readings, oral interviews, action research projects, writing tasks, etc.). In his seminal work “How we think”, Dewey provides the most widely shared definition of reflective thinking as applied in the educational field: reflection may be seen as an “active, persistent, and careful consideration of any belief or supposed form of knowledge in the light of the grounds that support it and the further conclusions to which it tends”. Hence, reflection is a systematic process of thinking that happens only if related to actual experiences, and it includes observation of conditions and references to different pieces of knowledge (i.e. references to previous experiences, domain knowledge, common sense knowledge, etc.), in order to respond to a dilemma (Mezirow, 1990). Teacher educators have extensively employed writing tasks, such as structured or unstructured journals, portfolios, essays, blogs and open-ended questions, to foster reflection in both pre-service and experienced teachers. Operational definitions of reflectivity proposed to develop schemes for assessing it focus on identifying the presence of “reflective content” in teachers’ writing, or on how deep the reflection is.

Based on these premises, we are currently developing a reflection assessment schema suited to describing the peculiarities of the Italian teachers’ reflective writings produced in the framework of the 2016/17 induction program. The schema designed so far, reported in Table 1, was devised according to the following criteria: a writing is reflective if it i) makes direct references to experienced teaching activity, ii) involves several topics (content/pedagogical knowledge) and references to previous experiences, classroom management, learners’ needs, iii) includes an analysis of premises (theoretical, context-related, personal), iv) debates a problem (a dilemma) or a doubt, v) has an output: it sums up what was learned, sketches future plans, gives a new insight and understanding for immediate or future actions.

| Type of reflectivity | Description | Example (translated from Italian) |
|---|---|---|
| No reflection | Simple writing that merely describes what happened during the teaching activity; no doubts or clues of an inquiry attitude are shown. | The contents presented were acquired, and the pupils interviewed proved satisfied with the activity and with the personal opinion they were able to express on the topic under discussion. |
| General considerations and understanding | Writing shows weak links to the actual teaching experience; it is conducted at a distance from the phenomena of interest. It can include general thoughts and considerations. | To answer the question about the possibility of improving the activity carried out, I will say first of all that I believe it is always possible to improve one's performance. I am convinced that experience is a great ally and that, over time, one grows, is enriched and improves. |
| Descriptive reflection | Writing includes considerations on actual classroom actions/events and some kind of knowledge base, but doesn’t clearly refer to any “problems”, doubt or dilemma. | I believe the most effective choice was peer assessment. In particular, during the award ceremony of the poetry competition, one pupil per class went to the other school and gave an introductory speech for the ceremony, as well as running it autonomously. This, in my opinion, made the students feel like the true protagonists of their work and fostered motivation, both intrinsic and extrinsic. Instructions were always given clearly, but they needed to be repeated several times to be assimilated. |
| Reflection | Writing discusses problems and doubts and refers to some kind of action. It may report a reflective practice. There could be evidence of a change in teachers’ attitude or of new insights acquired due to the problems faced. | In fact, I realised that only a few of them were able to give an adequate explanation (also from a formal point of view) and, above all, they could not find fitting examples without the help of the textbook. This review phase took almost twice the time I had planned, but it was nonetheless very useful for speeding up their research task during the analysis of the new text proposed. I encouraged them to clarify every doubt and, thanks also to their questions, I believe the topics were truly learned by all students, including those who usually have more difficulty or normally participate less. It was a lesson that engaged them a great deal even though it was a rather “traditional” lesson, because they told me it would also help them in studying other subjects and, above all, in view of the exam. |

Table 1: Annotation schema of reflectivity.

3 The Corpus

The corpus of NQTs’ reflective writings is part of the wider collection of documents written by the 26,526 teachers engaged in the 2016/17 INDIRE induction program. The whole corpus includes all texts written in two of the seven activities of the e-portfolio, Didactic Activity 1 and 2 (DA), for a total of 265,200 texts. During these two activities, teachers were supported by guiding questions designed by INDIRE experts to help them understand the consistency between the planned and the enacted teaching activities.
For DA 1 and 2 the teachers wrote 5 short texts as answers to 5 different groups of questions. The first 4 groups guide teachers to write general reflections only on the design of their teaching activity; the fifth group is meant to guide NQTs towards an overall reflection on their whole teaching experience, i.e. both the design and the real teaching activity, also including classroom assessment techniques.

We focused here on the answers to this latter group of questions, which were devised to encourage teachers to reflect on the following issues: i) differences and similarities between the designed and achieved activities, ii) the most effective choices adopted, also including classroom assessment techniques, iii) how the activity could be improved, iv) the role played by the tutor and by documentation practices. We considered in particular a subset of this group of answers, annotated by 3 experts in the learning science domain according to the reflectivity annotation scheme described in Section 2 (see Table 2). The agreement between the three annotators was calculated using Fleiss’ kappa, and we obtained k=0.66, i.e. substantial agreement.

| Reflectivity | n. answers | n. sent. | n. tokens |
|---|---|---|---|
| No reflection | 185 | 348 | 9,784 |
| Rhetoric | 35 | 91 | 3,140 |
| Reflection | 217 | 609 | 21,686 |
| Radical reflection | 36 | 149 | 5,326 |
| TOTAL | 473 | 1,197 | 39,936 |

Table 2: Corpus of NQTs reflective writings annotated for different types of reflectivity.

4 Linguistic Features and Reflectivity

The annotated corpus was tagged by the part-of-speech tagger described in Dell’Orletta (2009) and dependency-parsed by the DeSR parser (Attardi et al., 2009). This allowed us to extract a wide set of multilevel features, i.e. raw text, lexical, morpho-syntactic and syntactic, fully described by Dell’Orletta et al. (2013).
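To make the feature families more concrete, the following minimal sketch computes one lexical feature (type/token ratio) and three syntactic ones (parse tree depth, verbal arity, longest dependency link) on a toy dependency-parsed sentence. It is an illustration only: the `Token` structure, the tag names and the exact counting conventions (for example, depth counted in links, auxiliaries counted as dependents of the verb) are assumptions made here, not the definitions used by the tagger, the parser or the feature extractor cited above.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int    # 1-based position in the sentence
    form: str   # word form
    pos: str    # coarse part-of-speech tag (hypothetical tagset)
    head: int   # position of the syntactic head, 0 for the root


def type_token_ratio(forms, window=100):
    """Distinct forms divided by total forms over the first `window` tokens."""
    sample = [f.lower() for f in forms[:window]]
    return len(set(sample)) / len(sample) if sample else 0.0


def parse_tree_depth(tokens):
    """Longest path, counted in dependency links, from any token up to the root."""
    heads = {t.idx: t.head for t in tokens}
    deepest = 0
    for t in tokens:
        steps, node = 0, t.idx
        while heads[node] != 0:
            node = heads[node]
            steps += 1
        deepest = max(deepest, steps)
    return deepest


def avg_verbal_arity(tokens):
    """Average number of dependency links sharing the same verbal head."""
    dependents = {t.idx: 0 for t in tokens if t.pos == "VERB"}
    for t in tokens:
        if t.head in dependents:
            dependents[t.head] += 1
    return sum(dependents.values()) / len(dependents) if dependents else 0.0


def max_link_length(tokens):
    """Longest head-dependent distance, measured in token positions."""
    return max((abs(t.idx - t.head) for t in tokens if t.head != 0), default=0)


# Toy sentence (invented, not from the corpus):
# "Gli studenti hanno capito la lezione" = "The students understood the lesson"
SENTENCE = [
    Token(1, "Gli", "DET", 2),
    Token(2, "studenti", "NOUN", 4),
    Token(3, "hanno", "AUX", 4),
    Token(4, "capito", "VERB", 0),
    Token(5, "la", "DET", 6),
    Token(6, "lezione", "NOUN", 4),
]

FEATURES = {
    "type_token_ratio": type_token_ratio([t.form for t in SENTENCE]),
    "parse_tree_depth": parse_tree_depth(SENTENCE),
    "avg_verbal_arity": avg_verbal_arity(SENTENCE),
    "max_link_length": max_link_length(SENTENCE),
}
print(FEATURES)
```

In a real setting the same functions would be applied to every sentence of an answer and averaged, which is how per-text values of the kind reported in the feature tables could be obtained.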
These features were used to reconstruct the linguistic profile of reflective writings and to carry out a first classification experiment aimed at predicting whether a text is reflective.

4.1 Distribution of Linguistic Features

Table 3 shows a selection of the features that vary significantly i) between reflective and non-reflective answers (column Reflectivity) and ii) among the different types of reflectivity we considered (column Types of Reflectivity) [3]. The significance of the variation was computed in the first case using the Wilcoxon Rank-sum test for paired samples, while in the second case we used the Kruskal-Wallis test, since we aimed to assess the different distribution of features in the 4 classes.

[3] The full list of ranked features is contained in the Appendix.

In both cases, features from all levels of analysis proved to be significant. If we consider the ten most discriminative features, reflective writings are longer in terms of number of words and sentences, and they are characterized by longer sentences and by a lower Type/Token Ratio; they contain a higher number of verbal heads and of embedded complement ‘chains’ (governed by a nominal head). Interestingly, they mostly contain linguistic phenomena typically related to syntactic complexity. For example, they are characterized by i) a higher use of verbal modification (e.g. higher % of adverbs, of auxiliary and modal verbs), ii) more complex verbal predicate structures (e.g. higher average verbal arity, calculated as the number of instantiated dependency links sharing the same verbal head), iii) more extensive use of subordination (e.g. higher % of subordinate clauses, also embedded in deep chains), iv) features related to a non canonical word order (e.g. higher % of pre-verbal objects and post-verbal subjects), v) longer dependency links and deeper parse trees, two features related to sentence length. On the contrary, non reflective NQTs’ answers show a higher level of lexical complexity: they have a higher Type/Token Ratio, a lower percentage of “Fundamental words”, i.e. very frequent words according to the classification proposed by De Mauro (2000) in the Basic Italian Vocabulary (BIV), and a higher percentage of “High usage words”.

If we focus on the linguistic profile of the different types of reflective writings, we can observe that answers annotated as Reflection and Radical reflection are mostly characterized by features typically related to structural complexity. This is particularly the case for Radical reflection answers, which are longer in terms of number of sentences and words; they have more complex verbal predicates (e.g. a higher % of adverbs and of an implicit mood such as the gerundive, which can be more ambiguous with respect to the referential subject), a more complex use of subordination (e.g. average length of ‘chains’ of embedded subordinate clauses), long distance constructions (length of dependency links) and non canonical constructions (post-verbal subject). The higher % of demonstrative pronouns and determiners can be related to one of the most representative characteristics of reflection, i.e. the direct reference to real life. On the contrary, they show a simpler use of the lexicon, e.g. a lower Type/Token ratio and a higher percentage of “Fundamental words”.

| Feature | Rank: Reflectivity | Rank: Types of Reflectivity | No reflection | Rhetoric | Reflection | Radical reflection |
|---|---|---|---|---|---|---|
| Raw text features: | | | | | | |
| Avg sentence length | 10 | 11 | 27.97 | 35.9 | 38.6 | 38.2 |
| Avg number of sentences | 9 | 7 | 1.88 | 2.6 | 2.81 | 4.14 |
| Avg number of words | 1 | 1 | 52.89 | 89.71 | 99.94 | 147.94 |
| Lexical features: | | | | | | |
| Type/token ratio (100 token) | 8 | 9 | 0.78 | 0.71 | 0.7 | 0.69 |
| % of “Fundamental words” of BIV | 62 | 86 | 74.15 | 75.57 | 77.01 | 77.92 |
| % of “High usage words” of BIV | 92 | 38 | 19.35 | 15.79 | 15.71 | 14.92 |
| % of “High availability words” of BIV | 58 | 68 | 9.72 | 12.8 | 10.78 | 10.69 |
| Morpho-syntactic features: | | | | | | |
| % of adjectives | 71 | 87 | 7.29 | 9.16 | 7.72 | 7.93 |
| % of possessive adjectives | 67 | 43 | 1.08 | 2 | 0.97 | 0.93 |
| % of adverbs | 42 | 46 | 3.95 | 3.93 | 4.82 | 5.29 |
| % of prepositions | 51 | 82 | 15.11 | 17.08 | 16.61 | 16.05 |
| % of demonstrative pronouns | 36 | 34 | 0.43 | 0.65 | 0.58 | 0.78 |
| % of demonstrative determiners | 35 | 30 | 0.35 | 0.66 | 0.42 | 0.6 |
| % of determinative articles | 30 | 41 | 8.29 | 6.89 | 6.81 | 7.07 |
| % of subordinative conjunctions | 69 | 63 | 0.94 | 0.68 | 0.98 | 1.27 |
| % of sentence boundary punctuation | 12 | 12 | 4.17 | 2.99 | 2.86 | 2.92 |
| % of auxiliary verbs | 25 | 27 | 6.66 | 4.01 | 4.92 | 4.48 |
| % of modal verbs | 40 | 40 | 0.69 | 1.06 | 0.78 | 0.97 |
| % of verbs - subjunctive mood | 72 | 39 | 1.16 | 1.29 | 2.55 | 1.53 |
| % of verbs - infinitive mood | 28 | 36 | 19.11 | 27.48 | 25.03 | 25.75 |
| % of verbs - gerundive mood | 37 | 45 | 5.54 | 6.06 | 6.51 | 6.73 |
| % of verbs - indicative mood | 38 | 58 | 10.46 | 14.76 | 11.74 | 12.91 |
| % of verbs - third person singular | 20 | 15 | 8.2 | 18.76 | 14.92 | 19.3 |
| % of verbs - third person plural | 80 | 91 | 6.14 | 10.83 | 8.04 | 7.67 |
| % of verbs - imperfect tense | 78 | 35 | 7.18 | 1.55 | 9.72 | 13.75 |
| Syntactic features: | | | | | | |
| % of dependency types - auxiliary | 24 | 25 | 6.65 | 3.98 | 4.88 | 4.41 |
| % of dependency types - object | 44 | 59 | 4.22 | 4.7 | 5.06 | 5.6 |
| % of dependency types - preposition | 55 | 81 | 15.15 | 17.33 | 16.6 | 16.09 |
| % of dependency types - subordinate clause | 60 | 62 | 0.99 | 0.78 | 1.03 | 1.22 |
| % of dependency types - subject | 46 | 83 | 4.62 | 3.62 | 3.77 | 3.74 |
| Avg number of verbal heads | 2 | 3 | 52.89 | 89.71 | 99.94 | 147.94 |
| Avg number of embedded complement chains | 4 | 4 | 9.72 | 12.8 | 10.78 | 10.69 |
| Length of ‘chains’ of embedded subordinate clauses (avg) | 19 | 21 | 0.48 | 0.69 | 0.86 | 0.95 |
| Maximum length of dependency links (avg) | 16 | 19 | 10.26 | 12.71 | 14.16 | 14.8 |
| Parse tree depth (avg) | 21 | 24 | 7.86 | 9.73 | 9.56 | 9.65 |
| Arity of verbal predicates (avg) | 13 | 13 | 3.62 | 4.46 | 4.89 | 4.74 |
| % of pre-verbal objects | 52 | 42 | 4.84 | 9.71 | 7.59 | 4.81 |
| % of post-verbal subject | 86 | 84 | 10.65 | 11.17 | 10.64 | 17.07 |
| % of subordinate clauses in post-verbal position | 23 | 16 | 52.21 | 76.57 | 78.97 | 97.71 |

Table 3: Feature ranking position characterizing i) reflective vs. non reflective texts and ii) different types of reflective texts and average value of feature distribution in the different types of reflective texts. Ranking positions with p <0.001 are marked in italics and with p <0.05 in boldface.

4.2 Prediction of Reflectivity

Table 4 reports the results of the automatic classification experiment we devised in order to predict whether a text is reflective. We built a classifier based on the LIBLINEAR machine learning library (Fan et al., 2008), trained using the LIBLINEAR L2-regularized L2-loss support vector classification function. We followed a 5-fold cross-validation process and relied on a training set of 370 answers balanced between reflective and non reflective texts, since the undersampling technique has been shown to improve classification performance on unbalanced datasets (Qazi and Raza, 2012). Performance was calculated in terms of F-score in the correct classification of non reflective (0 in the table) or of reflective (1) writings. We used different classification models: the Raw text one uses only raw text features; the Lexical one uses the distribution of the lexicon belonging to the Basic Italian Vocabulary and up to bi-grams of words; the Morpho-syntactic one uses unigrams of part-of-speech and verbal morphology features; the All features model uses all the considered features, including the syntactic ones. A very competitive baseline was also computed: it exploits the distribution of unigrams of words (Unigrams). As can be seen, the model that uses all the considered features proved to be the best one. On the contrary, the model relying on very simple types of features (raw text features), which capture how much teachers have written, achieves the worst results (F1 of 58.4 on non reflective and 69.86 on reflective writings), well below the Lexical (78.58 and 77.53) and Morpho-syntactic (74.87 and 75.18) models.
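The evaluation protocol of this section can be sketched as follows. This is a minimal illustration rather than the authors' code: the classifier itself is left out (no LIBLINEAR call is shown), the function names are hypothetical, and treating the three non-"No reflection" categories of Table 2 as the reflective class is an assumption, although it is consistent with the 370 balanced answers mentioned above (2 x 185). The sketch covers random undersampling, per-class F1, and the unweighted mean of the two per-class scores, which matches the Tot F1 column of Table 4 up to rounding (for Raw text, (58.4 + 69.86) / 2 = 64.13).

```python
import random


def undersample(labels, seed=0):
    """Indices of a class-balanced subset, obtained by randomly
    downsampling every class to the size of the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n = min(len(idxs) for idxs in by_class.values())
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n))
    return sorted(keep)


def f1_for_class(gold, pred, cls):
    """Harmonic mean of precision and recall for one class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_average(scores):
    """Unweighted mean of per-class scores."""
    return sum(scores) / len(scores)


# Class sizes from Table 2: 185 "No reflection" answers (label 0) and
# 35 + 217 + 36 = 288 answers in the other categories (label 1, assumed reflective).
LABELS = [0] * 185 + [1] * 288
BALANCED = undersample(LABELS)

# Invented gold/predicted labels, only to exercise the metric.
GOLD = [0, 0, 1, 1, 1]
PRED = [0, 1, 1, 1, 0]
F1_NON_REFLECTIVE = f1_for_class(GOLD, PRED, 0)
F1_REFLECTIVE = f1_for_class(GOLD, PRED, 1)
TOT_F1 = macro_average([F1_NON_REFLECTIVE, F1_REFLECTIVE])

print(len(BALANCED), F1_NON_REFLECTIVE, F1_REFLECTIVE, TOT_F1)
```

In a full experiment, undersampling would be followed by a 5-fold split of the balanced indices, with the metric computed on each held-out fold and then averaged.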
| Features | F1 0 | F1 1 | Tot F1 |
|---|---|---|---|
| Raw text | 58.4 | 69.86 | 64.13 |
| Lexical | 78.58 | 77.53 | 78.05 |
| Morpho-syntactic | 74.87 | 75.18 | 75.02 |
| All features | 79.31 | 79.01 | 79.16 |
| Baseline (unigrams) | 75.16 | 74.84 | 75.00 |

Table 4: Classification of reflective vs. non reflective writings using different models of features.

We also carried out a very preliminary experiment to classify the three different types of reflective writings, but it produced unsatisfactory results due to the unbalanced distribution of answers in the reflective classes. As expected, a balanced experiment yielded very low accuracy, since very little data was available.

5 Conclusions and current developments

We reported the first results of an ongoing study that aims to reconstruct the linguistic profile of a corpus of reflective writings by Italian newly recruited teachers, which we collected for the specific purpose of this paper. We are currently enlarging the corpus with new manually annotated data to improve the accuracy of the automatic classification of different types of reflectivity.

References

G. Attardi, F. Dell’Orletta, M. Simi and J. Turian. 2009. Accurate dependency parsing with a stacked multilayer perceptron. Proceedings of Evalita’09, Evaluation of NLP and Speech Tools for Italian, Reggio Emilia, December.

D. Boud and D. Walker. 2013. Reflection: Turning Experience into Learning. RoutledgeFalmer.

C.C. Chang and C.J. Lin. 2001. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

F. Dell’Orletta. 2009. Ensemble system for Part-of-Speech tagging. Proceedings of Evalita’09, Evaluation of NLP and Speech Tools for Italian, Reggio Emilia, December.

F. Dell’Orletta, S. Montemagni and G. Venturi. 2013. Linguistic profiling of texts across textual genre and readability level. An exploratory study on Italian fictional prose. Proceedings of the Recent Advances in Natural Language Processing Conference (RANLP-2013).

T. De Mauro. 2000. Grande dizionario italiano dell’uso (GRADIT). Torino, UTET.

J. Dewey. 1933. How we think: a restatement of the relation of reflective thinking to the educative process. D.C. Heath and company.

R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X. Wang, and C.-J. Lin. 2008. LIBLINEAR: A Library for Large Linear Classification. Journal of Machine Learning Research, 9:1871-1874.

DJ. Harland and JD. Wondra. 2011. Preservice Teachers’ Reflection on Clinical Experiences: A Comparison of Blog and Final Paper Assignments. Journal of Digital Learning in Teacher Education, Vol. 27(4).

N. Hatton and D. Smith. 1995. Reflection in teacher education: Towards definition and implementation. Teaching and Teacher Education, Vol. 11(1).

D. Kember, J. McKay, K. Sinclair, FKY Wong. 2008. A four category scheme for coding and assessing the level of reflection in written work. Assessment and Evaluation in Higher Education, Vol. 25(4).

B. Larrivee. 2008. Development of a tool to assess teachers’ level of reflective practice. Reflective Practice, Vol. 9(3).

P. Magnoler, GR. Mangione, MC. Pettenati, A. Rosa, PG. Rossi. 2016. Induction models and teachers professional development. Journal of e-Learning and Knowledge Society, Vol. 12(3).

J. Mezirow. 1990. Fostering critical reflection in adulthood: a guide to transformative and emancipatory learning. Jossey-Bass Publishers.

N. Qazi and K. Raza. 2012. Effect of Feature Selection, SMOTE and under Sampling on Class Imbalance Classification. Proceedings of the 2012 UKSim 14th International Conference on Modelling and Simulation, pp. 145-150.

D.A. Schon. 1984. The Reflective Practitioner: How Professionals Think In Action. Basic Books.

D.A. Schon. 1987. Educating the Reflective Practitioner. Jossey-Bass.

D.A. Schon. 1991. The reflective turn: Case studies in and on educational practice. Teachers College Press.

GM. Sparks-Langer, GM. Simmons, M. Pasch, A. Colton, A. Starko. 1990. Reflective pedagogical thinking: How can we promote it and measure it? Journal of Teacher Education, Vol. 41(5).

T. D. Ullmann. 2015. Automated detection of reflection in texts. A machine learning based approach. The Open University.

T. D. Ullmann. 2015. Keywords of written reflection - a comparison between reflective and descriptive datasets. Proceedings of the 5th Workshop on Awareness and Reflection in Technology Enhanced Learning.

M. Van Manen. 1977. Linking Ways of Knowing with Ways of Being Practical. Curriculum Inquiry, Vol. 6(3).

H.C. Waxman et al. 1987. Images of Reflection in Teacher Education. Summaries of papers presented at a National Conference on Reflective Inquiry in Teacher Education, Houston.

| Feature | Rank: Reflectivity | Rank: Types of Reflectivity | No reflection | Rhetoric | Reflection | Radical reflection |
|---|---|---|---|---|---|---|
| Raw text features: | | | | | | |
| Avg sentence length | 10 | 11 | 27.97 | 35.9 | 38.6 | 38.2 |
| Avg number of sentences | 9 | 7 | 1.88 | 2.6 | 2.81 | 4.14 |
| Avg number of tokens | 1 | 1 | 52.89 | 89.71 | 99.94 | 147.94 |
| Lexical features: | | | | | | |
| Type/token ratio (first 100 lemma) | 8 | 9 | 0.78 | 0.71 | 0.7 | 0.69 |
| Type/token ratio (first 200 lemma) | 6 | 6 | 0.77 | 0.68 | 0.67 | 0.64 |
| % of “Fundamental words” of BIV | 62 | 86 | 74.15 | 75.57 | 77.01 | 77.92 |
| % of “High usage words” of BIV | 92 | 38 | 19.35 | 15.79 | 15.71 | 14.92 |
| % of “High availability words” of BIV | 58 | 68 | 9.72 | 12.8 | 10.78 | 10.69 |
| Morpho-syntactic features: | | | | | | |
| Lexical density | 64 | 96 | 0.54 | 0.55 | 0.55 | 0.56 |
| % of adjectives | 71 | 87 | 7.29 | 9.16 | 7.72 | 7.93 |
| % of possessive adjectives | 67 | 43 | 1.08 | 2 | 0.97 | 0.93 |
| % of adverbs | 42 | 46 | 3.95 | 3.93 | 4.82 | 5.29 |
| % of negative adverbs | 54 | 53 | 0.64 | 0.38 | 0.64 | 0.65 |
| % of determiners | 63 | 88 | 1.19 | 1.19 | 1.28 | 1.43 |
| % of demonstrative determiners | 35 | 30 | 0.35 | 0.66 | 0.42 | 0.6 |
| % of indefinite determiners | 74 | 71 | 0.8 | 0.47 | 0.83 | 0.8 |
| % of prepositions | 51 | 82 | 15.11 | 17.08 | 16.61 | 16.05 |
| % of articles | 93 | none | 9.36 | 8.34 | 8.38 | 8.64 |
| % of demonstrative pronouns | 36 | 34 | 0.43 | 0.65 | 0.58 | 0.78 |
| % of personal pronouns | 89 | 99 | 0.29 | 0.39 | 0.32 | 0.24 |
| % of relative pronouns | 39 | 56 | 1.17 | 1.16 | 1.48 | 1.55 |
| % of determinative articles | 30 | 41 | 8.29 | 6.89 | 6.81 | 7.07 |
| % of subordinative conjunctions | 69 | 63 | 0.94 | 0.68 | 0.98 | 1.27 |
| % of single commas or hyphens | 27 | 33 | 3.55 | 4.7 | 4.67 | 5.26 |
| % of numbers | 87 | 67 | 0.22 | 0.19 | 0.4 | 0.29 |
| % of sentence boundary punctuation | 12 | 12 | 4.17 | 2.99 | 2.86 | 2.92 |
| % of verbs | 48 | 70 | 20.51 | 17.71 | 18.52 | 17.91 |
| % of auxiliary verbs | 25 | 27 | 6.66 | 4.01 | 4.92 | 4.48 |
| % of modal verbs | 40 | 40 | 0.69 | 1.06 | 0.78 | 0.97 |
| % of verbs - subjunctive mood | 72 | 39 | 1.16 | 1.29 | 2.55 | 1.53 |
| % of verbs - infinitive mood | 28 | 36 | 19.11 | 27.48 | 25.03 | 25.75 |
| % of verbs - gerundive mood | 37 | 45 | 5.54 | 6.06 | 6.51 | 6.73 |
| % of verbs - indicative mood | 38 | 58 | 10.46 | 14.76 | 11.74 | 12.91 |
| % of verbs - third person singular | 20 | 15 | 8.2 | 18.76 | 14.92 | 19.3 |
| % of verbs - third person plural | 80 | 91 | 6.14 | 10.83 | 8.04 | 7.67 |
| % of verbs - imperfect tense | 78 | 35 | 7.18 | 1.55 | 9.72 | 13.75 |
| Syntactic features: | | | | | | |
| % of syntactic roots | 14 | 14 | 4.57 | 3.06 | 3.36 | 3.21 |
| % of dep-auxiliary | 24 | 25 | 6.65 | 3.98 | 4.88 | 4.41 |
| % of dep-nominal/clausal argument | 61 | 98 | 2.36 | 3.08 | 2.8 | 2.41 |
| % of dep-indirect complement | 66 | 61 | 0.46 | 0.62 | 0.5 | 0.48 |
| % of dep-locative complement | 47 | 31 | 0.07 | 0.21 | 0.34 | 0.14 |
| % of dep-temporal complement | 41 | 28 | 0.16 | 0.3 | 0.28 | 0.41 |
| % of dep-nominal/clausal modifier | 45 | 73 | 15.88 | 17.25 | 17.07 | 17.7 |
| % of dep-relative modifier | 32 | 32 | 1.18 | 1.1 | 1.46 | 1.8 |
| % of dep-object | 44 | 59 | 4.22 | 4.7 | 5.06 | 5.6 |
| % of dep-preposition | 55 | 81 | 15.15 | 17.33 | 16.6 | 16.09 |
| % of dep-subordinate clause | 60 | 62 | 0.99 | 0.78 | 1.03 | 1.22 |
| % of dep-subject | 46 | 83 | 4.62 | 3.62 | 3.77 | 3.74 |
| Avg number of verbal heads | 2 | 3 | 52.89 | 89.71 | 99.94 | 147.94 |
| Avg number of embedded complement chains | 4 | 4 | 9.72 | 12.8 | 10.78 | 10.69 |
| Length of ‘chains’ of embedded subordinate clauses (avg) | 19 | 21 | 0.48 | 0.69 | 0.86 | 0.95 |
| Length of dependency links (avg) | 15 | 18 | 2.09 | 2.3 | 2.4 | 2.42 |
| Maximum length of dependency links (avg) | 16 | 19 | 10.26 | 12.71 | 14.16 | 14.8 |
| Parse tree depth (avg) | 21 | 24 | 7.86 | 9.73 | 9.56 | 9.65 |
| Arity of verbal predicates (avg) | 13 | 13 | 3.62 | 4.46 | 4.89 | 4.74 |
| % of verbal roots | 57 | 29 | 0.96 | 0.95 | 0.9 | 0.84 |
| % of verbal roots with explicit subj | 70 | 65 | 67.92 | 73.76 | 59.05 | 60.79 |
| % of finite complement clauses | 83 | 95 | 19.85 | 17.19 | 23.08 | 27.64 |
| % of infinite complement clauses | | | | | | |
| % of pre-verbal objects | 52 | 42 | 4.84 | 9.71 | 7.59 | 4.81 |
| % of post-verbal subject | 86 | 84 | 10.65 | 11.17 | 10.64 | 17.07 |
| % of subordinate clauses in post-verbal position | 23 | 16 | 52.21 | 76.57 | 78.97 | 97.71 |

Table 5: Appendix A: Full list of feature ranking positions characterizing i) reflective vs. non reflective texts and ii) different types of reflective texts and average value of feature distribution in the different types of reflective texts. Ranking positions with p <0.001 are marked in italics and with p <0.05 in boldface. Features which were not selected during ranking have no rank.
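The four-class ranking in the table above relies on the Kruskal-Wallis test (Section 4.1). As a rough illustration of how such a ranking could be produced, the sketch below computes the tie-corrected Kruskal-Wallis H statistic in plain Python and orders features by it. Several points are assumptions: the paper does not state that features are ordered by the test statistic, the feature names and values are invented for the example and are not corpus data, and the p-value is omitted because it requires the chi-square distribution with k - 1 degrees of freedom (a statistics package such as SciPy, via `scipy.stats.kruskal`, returns both the statistic and the p-value).

```python
def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied groups.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0


# Invented per-answer values for two hypothetical features, grouped by the
# four annotation classes (three answers per class, illustration only).
TOY_FEATURES = {
    "avg_words": [[40, 55, 60], [80, 95, 90], [100, 98, 110], [150, 140, 160]],
    "lexical_density": [[0.54, 0.55, 0.53], [0.55, 0.54, 0.56],
                        [0.55, 0.53, 0.56], [0.56, 0.54, 0.55]],
}

H_SCORES = {name: kruskal_wallis_h(groups) for name, groups in TOY_FEATURES.items()}
RANKING = sorted(H_SCORES, key=H_SCORES.get, reverse=True)
print(RANKING, H_SCORES)
```

On this toy data the length feature separates the classes perfectly and therefore ranks first, while the density feature, whose values overlap across classes, receives a much smaller statistic.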