Published in: CEUR Workshop Proceedings, Vol-2535, paper 9. PDF: https://ceur-ws.org/Vol-2535/paper_9.pdf
 A data-driven platform for creating educational
         content in language learning*

        Konstantin Schulz, Andrea Beyer, Malte Dreyer, and Stefan Kipf

                       Humboldt-Universität zu Berlin, Germany



        Abstract. In times of increasingly personalized educational content, de-
        signing a data-driven platform which offers the opportunity to create
        content for different use cases is arguably the only solution to handle the
        massive amount of information. Therefore, we developed the software
        "Machina Callida" (MC) in our project CALLIDUS (Computer-Aided
        Language Learning: Vocabulary Acquisition in Latin using Corpus-based
        Methods).
        The main focus of this research project is to optimize vocabulary
        acquisition in Latin by using a data-driven language learning approach
        for creating exercises. To achieve that goal, we were facing problems con-
        cerning the quality of externally curated research data (e.g. annotated
        text corpora) while curating educational materials ourselves (e.g. prede-
        fined sequences of exercises). Besides, we needed to build a user-friendly
        interface for both teachers and students. While teachers would like to
        create an exercise or test and use it in class (even as a printed copy),
        students would like to learn on the fly and right away.
        As a result, we offer a repository, a file exporter for various formats and,
        above all, interactive exercises so that learners are actively engaged in
        the learning process. In this paper we show the workflow of our software
        and explain the architecture focusing on the integration of Artificial In-
        telligence (AI) and data curation. Ideally, we want to use AI technology
        to facilitate the process and increase the quality of content creation,
        dissemination and personalization for our end users.

        Keywords: Educational content · Language learning · Data-driven ·
        Exercise repository


1     Curating language exercises: the user’s point of view

In German high schools, Latin is to this day the third most important foreign
language, especially in grades 7 to 10. For that reason, educational publishing
companies are investing in teaching materials for Latin classes, but all these
materials pose certain challenges for educational stakeholders: they are proprietary, hardly
*
    This project is funded by the German Research Foundation (project number
    316618374) and led by Malte Dreyer, Stefan Kipf and Anke Lüdeling. Copyright ©
    2020 for this paper by its authors. Use permitted under Creative Commons License
    Attribution 4.0 International (CC BY 4.0).

adaptable (or not even digital) for teachers and split into a vast amount of dif-
ferent items like textbook, exercise book, vocabulary book etc. that all learners
have to buy separately, if needed [7, p. 194f.]. On top of that, most of the teach-
ing materials only refer to the initial stage of language acquisition, in which
Latin original texts do not yet matter [21, p. 133]. Although the companies are
also providing teachers with reading books for intermediate learners containing
sections of selected Latin original texts, teachers are still in continuous need of
adaptable texts and exercises for these advanced stages. In addition, although
the curricula define a standardized canon of Latin authors [20, p. 45], this
canon still covers a wide range of texts relative to the time available in Latin
classes. What is more, teachers prefer to use texts whose vocabulary is covered
as much as possible by the basic vocabulary already acquired by the students,
since the comprehensibility of the text can be considerably limited if less than
95% of words are known [25, p. 352].
    As a consequence, teachers may choose texts from a large pool of Latin au-
thors, but without supporting material they rarely do, because they lack the
time to prepare texts and exercises independently. Instead, they often fall back
on ready-made materials that are quality-tested but rarely fit the needs of the
learning group. This situation results in a kind of dilemma: Many teachers would
like to enrich their lessons with further authors and support their students indi-
vidually in their language acquisition with (personalized) exercises, but they do
not feel up to the challenge of selecting and adapting materials to their students’
needs [24, p. 115/117].
    This brief outline of the problem shows the need to develop a platform that
allows teachers (and students) to create needs-based exercises for authentic Latin
texts. Furthermore, a good user experience requires that the generation process
is fast and easy to handle, that the generated exercises are ready to use (both
on paper and digitally) or to share, and that they are well curated for later
reuse. These requirements are illustrated in three exemplary use cases which
have been modeled loosely following the guidelines of Cockburn [9].

Use Case 1: The teacher needs exercises based on authentic Latin texts
    Primary actor:      Teacher
    Stakeholders:       Teacher, students
    Scope:              An easy-to-handle exercise generator
    User story:         As a teacher, I want to select a section of the work to
                        be read. I want to compare this section to the used core
                        vocabulary for getting an overview of the amount of
                        unknown words. Then, I want to set the parameters of the
                        intended exercise: type of exercise and linguistic focus
                        (specific lemmata, syntactic structures, morphology,
                        context-based meaning, word equivalents). After getting
                        a preview, all selections can be easily changed, if I
                        think that, e.g., the exercise is too difficult.
    Level:              Repetition and deepening of vocabulary knowledge in context
    Precondition:       Teachers are presented with an option to generate new
                        exercises.
    Minimal guarantees: The generated exercise can be exported.
    Success guarantees: The generated exercise can be shared and is stored in a
                        database that is easily accessible to end users.
    Trigger:            The teacher invokes the exercise generation setup.
    Basic flow:
      1. The teacher picks the option of generating a new exercise.
      2. The teacher chooses a text passage from a wide range of Latin authors.
      3. The teacher compares the words of the text with the used core
         vocabulary and changes the section accordingly (go to step 2) or
         proceeds to set the parameters of the exercise.
      4. The teacher decides on the exercise format, the linguistic focus and
         the instruction statement.
      5. The system presents a preview. The teacher either exports the exercise
         to a printable format, shares it digitally, tries other parameters (go
         to step 4) or even changes the section (go to step 2).

Use Case 2: The teacher does not have enough time to prepare an exercise manually
    Primary actor:      Teacher
    Stakeholders:       Teacher, students
    Scope:              A database with well-curated exercises of different types
    User story:         As a teacher, I search the repository for at least one
                        matching exercise. I want to combine different search
                        terms in an extended search, e.g. Latin text passage,
                        exercise type, linguistic focus, popular exercises,
                        vocabulary. Then, I want to use the exercise in class
                        (with smartphones, tablets or an interactive
                        whiteboard), to embed it in a learning platform for
                        later use or to send it to the students for their
                        homework.
    Level:              Repetition and deepening of vocabulary knowledge
                        (individually)
    Precondition:       Teachers are presented with an option to browse
                        exercises from an existing database.
    Minimal guarantees: The database contains exercises and can be searched.
    Success guarantees: The search for a matching exercise is supported by
                        advanced filtering. Popular and well-curated exercises
                        are marked.
    Trigger:            The teacher decides to use a ready-made exercise.
    Basic flow:
      1. The teacher picks the option of searching the database.
      2. The teacher selects a single filter or multiple filters, or uses the
         extended search option.
      3. The teacher evaluates the results. Depending on the results, the
         teacher changes the search terms / filters (go to step 2) or decides
         to use one of the given exercises.
      4. The teacher uses the exercise in class or disseminates it using a
         link, so that students may use their own mobile devices.

Use Case 3: The teacher wants to support his/her students in a personalized way
to enable individual learning
    Primary actor:      Teacher
    Stakeholders:       Teacher, students
    Scope:              Learning Analytics and recommendations for future
                        exercises
    User story:         As a teacher, I want an overview of how my students
                        perform in an exercise. I want to be able to see at a
                        glance what mistakes are made most often so that I know
                        what to focus on when creating the next exercise. I
                        would also like a recommendation as to which exercise to
                        select next if there already is a suitable exercise in
                        the database.
    Level:              Zone of proximal (linguistic) development of each student
    Precondition:       Students generate data about their individual progress.
                        The data can be tracked and analyzed automatically.
    Minimal guarantees: Many students have completed the same (or similar)
                        exercises.
    Success guarantees: Teachers receive helpful suggestions for choosing the
                        next exercise.
    Trigger:            Students have just completed an exercise and now should
                        attempt another one.
    Basic flow:
      1. The students register with the software and go through the given
         exercise.
      2. The teacher receives an evaluation about the performance (percentage,
         error types) of each student.
      3. The teacher also gets a recommendation which parameters to set for the
         next exercise or which exercise to select from the database.
      4. The students get their new exercise and work on it (go to step 1).

                                 Table 1: Use Cases

2     Automatic parsing and evaluation: the developer’s
      point of view

In order to help teachers create high-quality educational content, we provide
support for each of the necessary steps in our software at
https://korpling.org/mc.


2.1   Selection of text (Use Case 1)

Many Latin text editions are proprietary and thus do not comply with the FAIR
data principles [32]. Additionally, such resources are not compatible with the
requirements for projects funded by the German Research Foundation, which
need to prefer open licenses to closed ones [16]. To solve this problem, we
decided to rely solely on text editions in the public domain. This choice also
considerably narrowed the range of suitable text repositories. In the end, we
settled on the Perseus Library [3] because it has a well-defined API (Canonical
Text Services [30]) and a standardized citation model (URN [8]) for ancient
text passages, works and authors. This repository, however, offers a vast
amount of texts: several hundred works by dozens of authors can be explored, so
our users need a way to prioritize them according to their specific needs. Currently,
we support this by offering a vocabulary filter and measures for text complexity.
     The vocabulary filter has to be targeted at one of several reference vocabu-
laries. These are essentially lemmatized word frequency lists derived from text-
books [5], treebanks [4] or materials created by publishing houses [31]. The ref-
erence vocabularies can be used to estimate the students’ previous knowledge by
specifying that, e.g., they should know the 500 most frequent words from that
list. This subset of words is then compared to the lemmata occurring in a given
corpus. Thus, if teachers specify a large corpus and the desired size of the final
text passage, the software will rank all possible subsets of the corpus according
to their congruence with the reference vocabulary. The boundaries for each sub-
set are chosen intelligently in order to maximize the number of known words.
This enables teachers to always choose a text that supports their students’ zone
of proximal development [27, p. 238].
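The ranking step can be sketched as follows. This is a simplified illustration, not the actual implementation: the function name and the sliding-window strategy are assumptions, since Machina Callida chooses subset boundaries more intelligently than a fixed-length scan.

```python
# Illustrative sketch (names and windowing are assumptions): score every
# candidate window of the desired length by its share of lemmata that are
# covered by the reference vocabulary, best coverage first.

def rank_passages(lemmata, known_vocab, passage_len):
    """Return (coverage, start_index) pairs, sorted by coverage descending."""
    scored = []
    for start in range(len(lemmata) - passage_len + 1):
        window = lemmata[start:start + passage_len]
        known = sum(1 for lemma in window if lemma in known_vocab)
        scored.append((known / passage_len, start))
    return sorted(scored, reverse=True)

# toy lemmatized corpus and a (tiny) reference vocabulary
corpus = ["puella", "rosa", "amo", "et", "poeta", "puella", "laudo"]
reference = {"puella", "rosa", "amo", "et"}
best = rank_passages(corpus, reference, 3)
```

A teacher would then pick a passage from the top of `best`, trading off coverage against content.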
     Text complexity, on the other hand, does not directly relate to a student’s pre-
vious knowledge, but to an intrinsic comparison between multiple Latin texts.
In our case, it is a combination of well-known operationalizations of the pre-
sumed degree of difficulty that readers may face when approaching a text, e.g.
lexical density [19, p. 61]. This helps teachers to determine the suitability of a
given text passage (or corpus) with regard to their students’ linguistic compe-
tence. The major strength of such measures does not reside in their inherently
flawed approximation of actual complexity, but in enabling a formalized linguis-
tic comparison that goes beyond mere counting of words and integrates syntax,
morphology and semantics [11, p. 607]. By combining information about vocabu-
lary and text complexity, teachers can significantly accelerate and improve their
choice of texts, thus curating better educational content for their students.
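Two of the common operationalizations mentioned above can be sketched briefly; the weighting in the last line is invented for illustration and does not reflect the project's actual combination of measures.

```python
# Illustrative sketch, not the project's formula: lexical density (share of
# content words) and type-token ratio (lexical diversity), naively combined.

def lexical_density(tokens, content_pos={"NOUN", "VERB", "ADJ", "ADV"}):
    """Share of content words among all (word, pos) pairs."""
    content = sum(1 for _, pos in tokens if pos in content_pos)
    return content / len(tokens)

def type_token_ratio(tokens):
    """Distinct word forms divided by total token count."""
    words = [word.lower() for word, _ in tokens]
    return len(set(words)) / len(words)

tagged = [("Gallia", "NOUN"), ("est", "AUX"), ("omnis", "ADJ"),
          ("divisa", "VERB"), ("in", "ADP"), ("partes", "NOUN"),
          ("tres", "NUM")]
# equal weights are an arbitrary placeholder
complexity = 0.5 * lexical_density(tagged) + 0.5 * type_token_ratio(tagged)
```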

2.2   Focus on specific linguistic phenomena (Use Case 1)

Once teachers have committed themselves to a suitable text passage, they may
still not know the exact target of a potential exercise. Therefore, we offer a key-
word in context (KWIC) view to explore collocations and the specific usage of
a particular word [18, p. 97]. The superficial token-based display is enriched by
morpho-syntactic information, e.g. part of speech and dependency links. There-
fore, teachers can qualitatively inspect usage patterns on multiple linguistic levels
as needed.
     A major problem in this approach is that most Latin texts are not curated
as treebanks with scientific annotations, but rather just as plain text. In other
words, we lack the key prerequisite to provide a rich KWIC view. To compensate
for this shortcoming, we use an AI-driven dependency parser [29] to process plain
Latin text in a fully automatic manner. It was trained as a multi-task classifier
using representation learning on existing curated treebanks [28, p. 4291]. This
is very reliable for basic tasks like tokenization, segmentation, lemmatization
and part-of-speech tagging (>95% accuracy), but is rather error-prone (∼80%
accuracy) for dependency links. Thus, the syntactic visualization in the KWIC
view may not always be entirely correct, but the basic concordance function
and the information about parts of speech are highly accurate, thereby enabling
teachers to create educational content in a much more well-informed manner.
Besides, the lack of performance on the syntactic level may be alleviated by
accessing and linking further resources to the existing parser output [22, p. 75].
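A KWIC line enriched with parser output can be derived roughly as below. The CoNLL-U fragment is hand-written for illustration rather than produced by UDPipe, and only three of the ten CoNLL-U columns are used.

```python
# Sketch of building an enriched KWIC hit from dependency-parser output
# in CoNLL-U format (columns: ID, FORM, LEMMA, UPOS, ...).

CONLLU = (
    "1\tpuella\tpuella\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "2\trosam\trosa\tNOUN\t_\t_\t3\tobj\t_\t_\n"
    "3\tamat\tamo\tVERB\t_\t_\t0\troot\t_\t_\n"
)

def kwic(conllu, keyword_lemma, window=1):
    """Return (left context, keyword/POS, right context) for each match."""
    rows = [line.split("\t") for line in conllu.strip().splitlines()]
    tokens = [(r[1], r[2], r[3]) for r in rows]  # form, lemma, UPOS
    hits = []
    for i, (form, lemma, pos) in enumerate(tokens):
        if lemma == keyword_lemma:
            left = " ".join(f for f, _, _ in tokens[max(0, i - window):i])
            right = " ".join(f for f, _, _ in tokens[i + 1:i + 1 + window])
            hits.append((left, f"{form}/{pos}", right))
    return hits
```

Searching for the lemma `rosa` finds the inflected form `rosam` together with its part of speech and surrounding context.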


2.3   Design of interaction / learning setting (Use Case 1)

Now that the basic content (i.e. texts and phenomena) of a new exercise has
been established, it is time to look at the layout. Depending on the chosen
phenomenon, but also on a student's personal preferences, certain types of
interaction may be more appropriate than others in order to reach a specific
educational goal (see Fig. 1). In general, a systematic variation of interaction
types can support more learning styles [26, p. 169], make the learning process
more multifaceted [17, p. 1] and lead to a higher degree of motivation [17,
p. 5] and engagement [26, p. 165]. On the other hand, the exclusive usage of
ready-made exercises in various formats can also cause mental overload for
students [26, p. 161]. Therefore, we offer teachers the possibility to choose
from a range of existing exercises with the same type of interaction, so it is
easier for them to maintain a certain level of consistency, even in longer
learning sequences. Furthermore, some of the exercise formats may be considered
part of the same line of progression, e.g. clozes can be solved with a visible
pool of boxes using Drag and Drop (easy, see Fig. 2) or by typing characters
into blank text fields (more difficult). Besides, the same basic technology and
layout can be used to produce different exercises, e.g. Drag and Drop works for
both the cloze and the matching format. In this regard, the usage of a large
common framework (H5P [2]) allows for a diverse, but consistent learning
experience. As an inspiration for longer sequences of exercises, we offer the
so-called Vocabulary Unit, which roughly corresponds to the length of an
average lesson in school (about 45 minutes).

Fig. 1. Setting parameters for a new exercise
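The data behind a Drag-and-Drop cloze of the kind shown in Fig. 2 can be modeled as follows; this is a hypothetical sketch and does not reproduce H5P's actual JSON content format.

```python
# Sketch: blank out chosen tokens of a sentence, collect them in a
# shuffled answer pool (the visible pool of boxes) and keep the solution.

import random

def make_cloze(tokens, target_indices, seed=0):
    gaps, pool, text = [], [], []
    for i, token in enumerate(tokens):
        if i in target_indices:
            text.append("____")
            pool.append(token)
            gaps.append((len(text) - 1, token))  # gap position -> solution
        else:
            text.append(token)
    random.Random(seed).shuffle(pool)  # pool order must not leak the solution
    return " ".join(text), pool, gaps

sentence = ["Gallia", "est", "omnis", "divisa", "in", "partes", "tres"]
cloze, pool, solutions = make_cloze(sentence, {1, 3})
```

The "more difficult" variant mentioned above would simply render the same `gaps` as free text fields and omit the pool.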




  Fig. 2. Drag-and-Drop-based cloze exercise with visible pool and binary feedback

2.4   Dissemination (Use Case 2)

When teachers are satisfied with their created content, they typically want to
distribute it to their students to employ it in a didactic context. To that end,
every exercise is labeled with a unique identifier, so it can be saved in a
database and shared via a deep link to the software server, e.g.
https://korpling.org/mc/exercise?eid={EXERCISE_ID}. When creating an exercise as well as at any
later point in time, users may also export a given exercise to specific file formats:
PDF and DOCX for printing, XML for integration into a learning management
system. That way, teachers and students are able to build their own collections
of useful exercises over time and, in the case of XML, derive additional benefit
from the features offered by Learning Management Systems like Moodle [1]:
structured online courses, user management, learning analytics and so on. If, on
the other hand, teachers do not have the time to curate their own content, we
provide access to public exercises that can be filtered and searched for using an
extensive metadata schema, including the author, work, text passage, interaction
type, popularity, vocabulary and text complexity (see Fig. 3).
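The dissemination step can be sketched as below. The field names mirror the metadata schema described above, but their exact shape, and the use of UUIDs for identifiers, are assumptions for illustration.

```python
# Sketch: assign an identifier, build the deep link in the pattern quoted
# above, and attach searchable metadata to the exercise record.

import uuid
from urllib.parse import urlencode

BASE_URL = "https://korpling.org/mc/exercise"

def publish(author, work, passage, interaction_type, vocabulary):
    eid = str(uuid.uuid4())
    return {
        "eid": eid,
        "author": author,
        "work": work,
        "passage": passage,
        "interaction_type": interaction_type,
        "vocabulary": vocabulary,
        "link": f"{BASE_URL}?{urlencode({'eid': eid})}",
    }

rec = publish("Caesar", "De Bello Gallico", "1.1", "cloze", ["Gallia"])
```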




    Fig. 3. Exercise Repository with keyword search and options for sorting/filtering

2.5     Evaluation (Use Case 3)
Moodle already offers summative evaluation for created exercises, but teachers
usually refrain from using it because they have not been trained [6, p. 160] to deal
with the technological complexity during setup, maintenance and everyday us-
age [10, p. 342]. This also applies to digital media in general [14, p. 18]. Therefore,
in the long run, we need to provide such evaluation ourselves. A basic prototype
that goes beyond the single-exercise binary feedback (correct/incorrect) has been
implemented in our Vocabulary Unit. It shows the overall performance for the
given exercises, the student’s development from beginning to end and how many
words from the target vocabulary are already known (see Fig. 4). In the future,
we would like to add further analyses pertaining to the preferred type of inter-
action, problematic performance on certain linguistic phenomena and the speed
of problem solving. These goals are in line with the recent trend of focusing on
the learner’s perspective in computer-assisted evaluation [15, p. 313]: Where are
my strengths and weaknesses? How did I develop during the last weeks? What
can I do to improve specific skills?




    Fig. 4. Summative evaluation of a student’s performance in the Vocabulary Unit
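A minimal sketch of the three figures reported in this summative view, assuming per-exercise results with a normalized score and the set of correctly answered target words (the input shape is an assumption):

```python
# Sketch: overall performance, development from first to last exercise,
# and the share of target vocabulary answered correctly at least once.

def summarize(results, target_vocab):
    """results: list of dicts {'score': float in [0, 1], 'correct_words': set}."""
    overall = sum(r["score"] for r in results) / len(results)
    development = results[-1]["score"] - results[0]["score"]
    known = set().union(*(r["correct_words"] for r in results))
    coverage = len(known & target_vocab) / len(target_vocab)
    return overall, development, coverage

runs = [{"score": 0.5, "correct_words": {"rosa"}},
        {"score": 0.8, "correct_words": {"rosa", "puella"}}]
summary = summarize(runs, {"rosa", "puella", "amare", "poeta"})
```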

   However, user-specific quantitative evaluation is not enough. In order to in-
crease students’ learning success, they also need adaptive qualitative feedback.
A prerequisite for that is the detection and classification of errors: the integrated

binary evaluation of H5P can be used as a basis to categorize various error types,
e.g.: Did the student fail to give any answer at all? Did the student actually pro-
vide the correct answer, but with minor typing mistakes? Did the student make
obvious grammatical mistakes? If so, are they related to morphology, vocabulary
or syntax? Depending on the specific type of error, suitable feedback needs to be
generated. Our main objective here is to provide deeper support for teachers and
students in order to optimize the learning progress towards a specific goal, e.g.
being able to read texts from a specific corpus. A good approach in that case
may be to create exercises for this corpus and use the students’ performance
as an objective for reinforcement learning [13, p. 2094]. The AI model should
then learn to utilize suitable pedagogical actions (e.g. distributing exercises for
learning) to maximize a student’s performance on the test exercise dataset for a
corpus.
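The error taxonomy sketched above could be layered on top of H5P's binary signal roughly like this; the similarity threshold is an invented placeholder, and splitting "grammatical" errors into morphology, vocabulary and syntax would require the parser output.

```python
# Sketch: classify a student's answer against the expected solution,
# starting from the binary correct/incorrect evaluation.

from difflib import SequenceMatcher

def classify_error(given, expected):
    if not given.strip():
        return "no_answer"
    if given == expected:
        return "correct"
    # near-identical strings are treated as typing mistakes (assumed threshold)
    if SequenceMatcher(None, given.lower(), expected.lower()).ratio() > 0.8:
        return "typo"
    return "grammatical"  # to be refined into morphology/vocabulary/syntax
```

Feedback generation would then branch on the returned category, e.g. pointing out the near-miss spelling for a `typo` instead of simply marking the answer wrong.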

3   Next steps: Learning Analytics and semantic analysis
For the future integration of Learning Analytics in our software, we have already
built a prototype that evaluates a learner group’s performance across multiple
dimensions, e.g. working speed, interaction type, accuracy and performance gain
over time. A large part of this analysis is most meaningful at the group level,
which makes it particularly useful for teachers. Individuals, on the other hand,
would need a stronger emphasis on their development over time, which is harder to
track because it would require them to use the software as their main source of
language learning. Therefore, specific milestones are to be reached in the next
months:
 – summarize group performances as an indicator that helps teachers to read-
   just their general didactic strategy, e.g. by focusing more heavily on certain
   linguistic phenomena
 – analyze results for individual students over time and suggest the most suit-
   able exercises for them considering their personal characteristics, i.e. learning
   style, thematic priority and particular weaknesses
Apart from improving the quality of the existing workflow, we also consider
extending its scope, e.g. by adding new linguistic phenomena: Semantics is
currently underrepresented in our automatic analyses, which makes it hard for
teachers to group their educational content around a certain topic. This could
be alleviated by integrating representation learning as an independent feature:
Unsupervised machine learning, in the form of Contextual Word Embeddings
like those provided by BERT [12], may be used to distinguish different usages of
the same word in different sentences, thereby highlighting fine-grained semantic
differences between authors or even within the same work. While we already
used Word2Vec [23] to perform simple vector-based analyses on existing Latin
treebanks, it still remains a challenge to generalize the calculation, visualization
and interpretation in this workflow while maintaining a sufficient level of quality.
A well-founded evaluation of representation learning for the purposes of language
acquisition is arguably the most important goal in this respect.
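The core of the vector-based analyses mentioned above is a similarity comparison between word vectors; the toy example below uses made-up vectors rather than real Word2Vec or BERT embeddings.

```python
# Toy illustration: cosine similarity between two vectors standing in for
# contextual embeddings of the same word in two different sentences.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# hypothetical vectors for "acies" ("battle line" vs. "sharp edge")
acies_military = [0.9, 0.1, 0.2]
acies_edge = [0.1, 0.8, 0.3]
similarity = cosine(acies_military, acies_edge)
# a low similarity would flag two distinct usages worth separate treatment
```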

References

 1. Moodle: A learning platform designed to provide educators, administrators and
    learners with a single robust, secure and integrated system to create personalised
    learning environments. Moodle Pty Ltd
 2. H5P. Create, share and reuse interactive HTML5 content in your browser. Joubel
    AS (Jun 2018)
 3. Almas, B., Babeu, A., Krohn, A.: Linked Data in the Perseus Digital Library.
    ISAW Papers 7(3) (2014)
 4. Bamman, D., Crane, G.: The Ancient Greek and Latin Dependency Treebanks
    [AGLDT]. In: Language Technology for Cultural Heritage, pp. 79–98. Springer
    (2011)
 5. Bartoszek, V., Datené, V., Lösch, S., Mosebach-Kaufmann, I., Nagengast, G.,
    Schöffel, C., Scholz, B., Schröttel, W.: VIVA 1 Lehrerband, vol. 1. Vandenhoeck &
    Ruprecht (2013)
 6. Bäsler, S.A.: Lernen und Lehren mit Medien und über Medien. Ph.D. thesis,
    Technische Universität Berlin (2019). https://doi.org/10.14279/depositonce-7833
 7. Beyer, A.: Das Lateinlehrbuch Aus Fachdidaktischer Perspektive: Theorie - Anal-
    yse - Konzeption. Universitätsverlag Winter GmbH, Heidelberg (2018)
 8. Blackwell, C., Smith, N.: The Canonical Text Services URN Specification, Version
    2.0.rc.1 [CITE / URN] (2015)
 9. Cockburn, A.: Writing Effective Use Cases. The Agile Software Development Series,
    Addison-Wesley, Boston, 16. print edn. (2006)
10. Costa, C., Alvelos, H., Teixeira, L.: The use of Moodle e-learning platform: A study
    in a Portuguese University. Procedia Technology 5, 334–343 (2012)
11. Dascalu, M., Gutu, G., Ruseti, S., Paraschiv, I.C., Dessus, P., McNamara,
    D.S., Crossley, S.A., Trausan-Matu, S.: ReaderBench: A Multi-lingual
    Framework for Analyzing Text Complexity. In: Lavoué, É., Drachsler, H.,
    Verbert, K., Broisin, J., Pérez-Sanagustín, M. (eds.) Data Driven Approaches
    in Digital Education: 12th European Conference on Technology Enhanced
    Learning, EC-TEL 2017, Tallinn, Estonia, September 12–15, 2017, Proceedings.
    pp. 606–609. Springer (2017)
12. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep
    bidirectional transformers for language understanding. arXiv preprint
    arXiv:1810.04805 (2018)
13. Dorça, F.A., Lima, L.V., Fernandes, M.A., Lopes, C.R.: Comparing strate-
    gies for modeling students learning styles through reinforcement learn-
    ing in adaptive and intelligent educational systems: An experimental anal-
    ysis. Expert Systems with Applications 40(6), 2092–2101 (May 2013).
    https://doi.org/10.1016/j.eswa.2012.10.014
14. Eickelmann, B., Bos, W., Labusch, A.: Die Studie ICILS 2018 im Überblick –
    Zentrale Ergebnisse und mögliche Entwicklungsperspektiven. In: Gerick, J., Gold-
    hammer, F., Schaumburg, H., Schwippert, K., Senkbeil, M., Vahrenhold, J., Eickel-
    mann, B., Bos, W. (eds.) ICILS 2018 #Deutschland. Computer- und informations-
    bezogene Kompetenzen von Schülerinnen und Schülern im zweiten internationalen
    Vergleich und Kompetenzen im Bereich Computational Thinking, pp. 7–31. Wax-
    mann (2019). OCLC: 1124310958

15. Ferguson, R.: Learning analytics: Drivers, developments and challenges. Interna-
    tional Journal of Technology Enhanced Learning 4(5/6), 304–317 (2012)
16. Forschungsgemeinschaft, D.: Appell zur Nutzung offener Lizenzen in der Wis-
    senschaft. Tech. Rep. 68, Deutsche Forschungsgemeinschaft (Nov 2014)
17. Harecker, G., Lehner-Wieternik, A.: Computer-based Language Learning with In-
    teractive Web Exercises. ICT for Language Learning pp. 1–5 (2011)
18. Helm, F.: Language and culture in an online context: What can learner diaries tell
    us about intercultural competence? Language and Intercultural Communication
    9(2), 91–104 (May 2009). https://doi.org/10.1080/14708470802140260
19. Johansson, V.: Lexical diversity and lexical density in speech and writing: A de-
    velopmental perspective. Working Papers 53, 61–79 (2008)
20. Kipf, S.: Geschichte des altsprachlichen Literaturunterrichts. In: Lütge, C. (ed.)
    Grundthemen Der Literaturwissenschaft, pp. 15–46. De Gruyter, Berlin and Boston
    (2019)
21. König, J.: Die Lektürephase. In: Janka, M. (ed.) Lateindidaktik, pp. 133–155. Cor-
    nelsen Scriptor, Berlin (2017)
22. Mambrini, F., Passarotti, M.: Linked Open Treebanks. Interlinking Syntactically
    Annotated Corpora in the LiLa Knowledge Base of Linguistic Resources for Latin.
    In: Proceedings of the 18th International Workshop on Treebanks and Linguistic
    Theories (TLT, SyntaxFest 2019). pp. 74–81. Paris, France (Aug 2019)
23. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word repre-
    sentations in vector space. arXiv preprint arXiv:1301.3781 (2013)
24. Munser-Kiefer, M., Martschinke, S., Hartinger, A.: Subjektive Arbeitsbelastung
    von Lehrkräften in jahrgangsgemischten dritten und vierten Klassen. In: Miller, S.,
    Holler-Nowitzki, B., Kottmann, B., Lesemann, S., Letmathe-Henkel, B., Meyer, N.,
    Schroeder, R., Velten, K. (eds.) Profession und Disziplin : Grundschulpädagogik
    im Diskurs, pp. 114–120. Jahrbuch Grundschulforschung, Springer Fachmedien,
    Wiesbaden (2018)
25. Nation, I.S.: Learning Vocabulary in Another Language. Cambridge University
    Press, 2 edn. (2013)
26. Schmid, E.C.: Developing competencies for using the interactive whiteboard to
    implement communicative language teaching in the English as a Foreign Language
    classroom. Technology, Pedagogy and Education 19(2), 159–172 (2010)
27. Shabani, K., Khatib, M., Ebadi, S.: Vygotsky’s Zone of Proximal Development: In-
    structional Implications and Teachers’ Professional Development. English language
    teaching 3(4), 237–248 (2010)
28. Straka, M., Hajic, J., Straková, J.: UDPipe: Trainable Pipeline for Processing
    CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging
    and Parsing. In: LREC. pp. 4290–4297 (2016)
29. Straka, M., Straková, J.: UDPipe. A LINDAT/CLARIN project
30. Tiepmar, J., Teichmann, C., Heyer, G., Berti, M., Crane, G.: A new implemen-
    tation for canonical text services [CTS]. In: Proceedings of the 8th Workshop on
    Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaT-
    eCH). pp. 1–8 (2014)
31. Utz, C.: Mutter Latein und unsere Schüler — Überlegungen zu Umfang und Aufbau
    des Wortschatzes [BWS]. Antike Literatur–Mensch, Sprache, Welt. Dialog Schule
    und Wissenschaft 34, 146–172 (2000)
32. Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J., Appleton, G., Axton, M.,
    Baak, A., Blomberg, N., Boiten, J.W., da Silva Santos, L.B., Bourne, P.E., Bouw-
    man, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S.,

     Evelo, C.T., Finkers, R., Gonzalez-Beltran, A., Gray, A.J.G., Groth, P., Goble,
     C., Grethe, J.S., Heringa, J., ’t Hoen, P.A.C., Hooft, R., Kuhn, T., Kok, R., Kok,
     J., Lusher, S.J., Martone, M.E., Mons, A., Packer, A.L., Persson, B., Rocca-Serra,
     P., Roos, M., van Schaik, R., Sansone, S.A., Schultes, E., Sengstag, T., Slater,
     T., Strawn, G., Swertz, M.A., Thompson, M., van der Lei, J., van Mulligen, E.,
     Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K., Zhao, J., Mons,
     B.: The FAIR Guiding Principles for scientific data management and stewardship.
     Scientific Data 3, 160018 (Mar 2016). https://doi.org/10.1038/sdata.2016.18