Teaching Computational Aspects in the Digital Humanities Program at University of Stuttgart – Intentions and Experiences

Nils Reiter¹, Sarah Schulz¹, Gerhard Kremer¹, Roman Klinger¹, Gabriel Viehhauser² and Jonas Kuhn¹
¹ Institut für Maschinelle Sprachverarbeitung
² Institut für Literaturwissenschaft
Universität Stuttgart
firstname.lastname@ims.uni-stuttgart.de
viehhauser@ilw.uni-stuttgart.de

Abstract

The structure of the Digital Humanities master's program at University of Stuttgart is characterized by a large proportion of classes related to natural language processing. In this paper, we discuss the motivation for this design and the associated challenges students and teachers are faced with. To provide background information, we also sum up our underlying perspective on Digital Humanities. Our discussion is driven by a qualitative analysis of a survey handed to the students of the program.

1 Introduction

The importance of computer-assisted methods is increasing in various research fields, for instance in Biology (Bioinformatics and Computational Biology), Media Sciences (Mediainformatics), or Geography (Geoinformatics). More recently, the broad fields of Humanities and Social Sciences adopted the use of computational methods, which are often referred to as Digital Humanities (Jannidis et al., 2017). However, in contrast to preceding research domains and sciences, the use of quantitative and statistical methods in this area is less popular, which poses additional challenges to the introduction of formal methods to the field.

The University of Stuttgart introduced a master's program for Digital Humanities (DH) in 2015. While other universities have been offering DH programs in various forms, one key characteristic of the DH program in Stuttgart is the strong influence of Computational Linguistics (CL) on the program, both on the design and planning of the program and on the actual courses.

The program consists of three main areas: (1) a specific discipline in Humanities, in which each student deepens their knowledge in the field they studied in a previous undergraduate program, (2) Digital Humanities, and (3) Computer Sciences (CS). While different computer science institutes offer courses in this program, a majority of the courses, both elective and compulsory, are offered by the Institut für Maschinelle Sprachverarbeitung (Institute for Natural Language Processing).

In this paper, we present the intentions behind the study program and report on results of a survey conducted among the first two cohorts of students.

2 Digital Humanities and Computational Linguistics

Digital Humanities is a new and diverse field, and pinpointing and defining its actual novelty has been a hot topic in the past years (Presner and Johanson, 2009; Berry, 2011; Gibbs, 2011; Svensson, 2012; Kuhn and Reiter, 2015; Dunst, 2017; Thaller, 2017). While differing views are plausible and valid, we believe that formalization is one key aspect of the field's novelty, applied to both the research questions and the analysis objects. The formal definition of – in principle – quantifiable properties is a fundamental step when switching the focus from particular, incomparable pieces of art to comparing, counting and categorizing objects. Only properly formalized concepts can be reliably applied to different objects of interest, and only then can these objects be compared or viewed quantitatively in the first place (for instance, the comparison of syntactic profiles for different authors relies on the proper formalization of syntax).

Formalization, in this view, does not necessarily imply the implementation of such approaches in a computer. There are formalized approaches to Humanities research questions or objects that are non- or pre-digital, e.g., John Snow's map of a London cholera outbreak in 1854 (which enabled a visual detection of the outbreak center), or the configuration analysis of 19th century traveling theaters (which enabled a quick overview of the required number of actors to perform a play). As the examples show, formalized approaches do not imply 'big data' or large-scale analyses.

Applications that have been popular in the Digital Humanities (e.g., network analysis or stylometry) are all built on this formalization: Independent of the visualization, a network is a formal model, in which data properties are represented by nodes and edges between them. Stylometric analysis, e.g., implies a formalized notion of what tokens are, and how they are counted and compared.

Given the fact that text is a frequently used medium in many Humanities disciplines (on the object and/or meta-level), Computational Linguistics plays a crucial role in two – complementary – ways: (i) On the operationalization level, formalizations of, for instance, literary concepts can be built upon linguistic structures (for which operationalizations do exist). In many cases, this requires tested and proven annotation guidelines as well as implementations of tools for the automatic discovery of such structures – (computational) linguistic structures can therefore form the basis of more complex and abstract formalizations (e.g., narrative categories defined on the phrase-level). (ii) On the methodological level, CL has established a number of best practices for creating such formalizations, which can be put to use on non-linguistic phenomena. The most prominent example is the annotation workflow, including measuring inter-annotator agreement as a metric for annotation guideline quality (Hovy and Lavid, 2010) or the use of shared tasks to foster tool or corpus creation (Reiter et al., 2017).

3 Structure of the DH Master's Program at University of Stuttgart

Given the above, the DH master's program in Stuttgart aims at teaching both conceptual understanding and practical experience, while at the same time deepening students' Humanities backgrounds and interdisciplinary skills. This is achieved through a combination of theoretical lectures and practical exercises, programming courses, and group projects.

The program is open for undergraduate students of a Humanities discipline that is also taught in Stuttgart (e.g., Literary Studies, History, Philosophy, or Art History). Interested undergraduates may apply once a year and start each year in October. The program is designed to be completed within four semesters. Courses are split into three categories, although not all classes can be clearly assigned: Humanities, Digital Humanities, and Computer Sciences. The structure of the program is illustrated in Figure 1.

[Figure 1: Structure of the master's program Digital Humanities at University of Stuttgart. The diagram arranges the modules of the three areas (In-Depth Humanities, Digital Humanities, Computer Science) across four semesters of 30 CP each. Modules shown, with the numbers given in the figure: Electives Humanities* (12-18); DH in the Humanities I and II (lecture series, 6 each); Theoretical and informatics basics for the DH (lecture + practice, 9); Methods of DH (seminar, 6, listed twice); Project work (9); Research Coll. (6); Computational Linguistics Methods for the DH (9); Programming (3); Electives Computer Science** (12-18); Master thesis (30). * import module from the Humanities; ** import module from Computer Science; *** DH course in the Humanities, offered by a Humanities subject.]

In the set of Humanities courses, students take classes in the discipline of their undergraduate program, where they are joined by their non-DH fellow students (e.g., master students of German studies).

In contrast, Digital Humanities classes are specific to the DH students. After a compulsory introductory lecture (6 hours/week, lecture and exercises), students take part in a group project in the second semester, where 'real-world' research tasks of delimited scope are tackled. Emphasis is put on teamwork and on the independent development of research strategies, two competences we regard as crucial and also characteristic for research in the DH. Thus, students learn to split up a research problem into smaller parts and establish data models that serve as the base for the application of formalized computational methods. Those who choose a CL-oriented project are advised by teachers from Computational Linguistics. Other courses in the DH area are seminars to familiarize students with the most recent research in preparation of their master's theses.

The third area covers Computer Sciences and includes the Computational Linguistics courses. In total, these courses cover roughly one third of the credit points each student has to achieve (excluding the master's thesis). Two courses from this area are compulsory, both in the first semester: Computational Linguistics Methods for Digital Humanities (6 hours/week; lecture and exercises) and Programming (2 hours/week; lecture and exercises). These compulsory courses are designed for and offered specifically to the DH students and are only taken by DH students. Content-wise, Computational Linguistics Methods for Digital Humanities resembles introductory courses for students in the computational linguistics programs. In addition, the use of Natural Language Processing (NLP) tools and/or workflows for addressing non-linguistic research questions is covered. In Programming, no prior knowledge at all is assumed, treating every student as a first-time programmer. Some emphasis is on the fact that many programming concepts exist in many programming languages, although we use Python (version 3) throughout as our programming language in teaching. The main reason for this is that Python is widely used in the DH community. Many exercises in the programming course cover algorithms and ideas that have been discussed in the NLP-methods course (e.g., to implement functions that measure precision and recall). In general, we aim at performing exercises that students perceive as being related to (Digital) Humanities. Apart from these two compulsory courses, students are free to choose from a selection of courses that are offered in the Computational Linguistics and Computer Science bachelor's and master's programs, in which they share courses with the students from the respective programs (e.g., data visualization).

It is a deliberate choice that DH students take courses that are also offered in the CS and CL programs. This way, students are exposed to different disciplinary styles and cultures, reflecting the 'in between worlds'-nature of DH in general. In addition, many of the courses that feature exercises foster group exercises in order to strengthen team-skills (which are crucial when working across disciplines).

Strong interdisciplinary ties are also present among the teachers involved in the program, who all are experienced in working in mixed teams with members from different disciplines.

4 Evaluation of DH Students' Appraisal

4.1 Methodology

To get an impression of how the conceptual course design decisions are reflected and perceived by the students, we created an online questionnaire and distributed this survey among both cohorts currently enrolled, first and second year students, by the end of the teaching term. Since there were slight adjustments to the courses after the first year, such as an emphasis on independent learning and changes to the programming course, which was adjusted to the needs of Digital Humanities students, we analyze their feedback separately.

The questionnaire covers topics with respect to the students' overall satisfaction with their choice of study, the differences they perceive between their humanities discipline and the Digital Humanities context, but especially the integration of NLP courses in their curriculum. We asked about their personal attitude towards the practical courses, their assessment of the difficulty of the offered courses, and their opinion about the necessity of the acquisition of NLP-related knowledge and skills for their understanding of Digital Humanities. Appendix A contains the complete questionnaire content.

We distributed 34 questionnaires, out of which 15 were returned completely filled out. Since the entire study program has a small number of students and a return of 15 does not allow for a reliable statistical analysis, we aim to capture the general mood of approval of the program's structure rather than to provide a full-fledged evaluation.

The questionnaire comprised a few free-text answers, but mostly, participants were asked to mark their personal view of adequateness for given statements on a 6-point Likert scale.

4.2 Results

Firstly, students stated that they enjoy their DH studies, and both cohorts presume that their future career will profit from their education in Digital Humanities, whereas the second year cohort agrees more strongly with both statements.

Since all of the students hold a bachelor's degree in a humanities discipline, they emphasize the shift to a more practical, computational training as a clear difference from what they were used to. However, in the first cohort most students stress the addition of CS as a difference, whereas in the second cohort the focus on practical courses/sessions accompanying a theoretical course is mainly mentioned as a difference from earlier studies. A few students point out that sometimes basic knowledge is taken for granted, which leads to excessive demands. These experiences highlight the difficult balance of overload and underload resulting from a very heterogeneous group of students.

This major shift from 'theory' (or more abstract humanist approaches) towards 'practice' is also reflected in the students' expectations of how CL-methods should be taught: Both cohorts agree that practical exercises are a very important aspect (for the second cohort it has highest importance for everyone answering the questionnaire) and some of the students even wish to have more practical training.

However, it seems that the practical exercises should be based on a solid theoretical ground: Students in both cohorts tend to prefer a teaching approach in which theoretical knowledge serves as the basis to these practical sessions rather than an approach in which one is introduced to a topic in a practical manner and later on provided with the theoretical background.

Regarding self-perception and acceptance of CL-skills, our results seem to indicate a characteristic difference between the two cohorts: In both groups, students feel capable of coping with the DH courses in general. But, regarding CL courses, most students in the first cohort feel overwhelmed, whereas the majority of the second cohort does not. Instead, they generally feel as capable as in the DH courses. Students who feel overwhelmed often emphasize the newness of the methods as a reason. Among the students in the first cohort there are also more persons who are not very confident in their ability to familiarize themselves with a new topic on their own. At the same time, divergent opinions also exist with respect to the question whether they attach importance to a deeper understanding of NLP tools. Even though students seem to agree that an understanding contributes to their abilities in DH, the second cohort in particular tends to find it more essential. The same trend can be observed in their appraisal of the necessity to possess programming skills. The second cohort clearly agrees that programming should be part of their skill set as Digital Humanists, whereas the first cohort has more divided views.

Thus, it seems that a higher confidence in CL-skills also fosters the acceptance of these methods. But, admittedly, it might alternatively just show the inherent difference between the two cohorts.

Being asked about suggestions for improvements for the program, the students wish for even more practical exercises, concrete preparation for their professional life and more diversity with respect to application examples.

In summary, we attribute the differences between both cohorts to the changes we made after the feedback at the end of the program's first year. As an overall reflection of the affinity towards programming, independent learning, and a preference for practical courses, the second cohort has a higher self-perception of skills and also feels more confident to autonomously carry out a project with a topic in Digital Humanities. We interpret this as a sign that our program structure with a focus on practical sessions is succeeding.

5 Conclusion and Discussion

Communication difficulties are an often discussed problem of interdisciplinary collaborations between humanists and computer scientists; they can lead to all kinds of misunderstandings, loss of valuable time and frustration on both sides. These issues are rooted in the differences of research traditions and the often opposed ways of tackling research objectives. By familiarizing students with both fields and making them aware of these differences, we aim at opening doors to even more fruitful collaborations in the future.

In this study, we recognize a general difficulty in estimating specific needs and issues of Digital Humanities students. The survey that was designed to develop an understanding of particularities of this group revealed that, partially, its characteristics are not different from what one might expect from other discipline switches – we presume that a student changing from a Humanities program to an engineering field would find similar aspects to be eye-catching (for instance, the combination of lectures and exercises). This might indicate that the difficulties lie not necessarily in the program itself, but in the special combination of Humanities with a formal and more technical research area.

In comparing our teaching experiences in Computer Science/Computational Linguistics and Digital Humanities, another aspect surfaces: CS/CL students are typically confronted with problems new to them (and the accompanying solutions), which is a straightforward way of teaching (from the teacher's perspective). In contrast, students of DH have a background in a Humanities discipline and thus already have been confronted with a number of research questions and possible solution methods. Naturally, they are expecting relatively concrete new solution methods to these diverse, pre-existing questions. This makes DH a more application-oriented subject than many CS disciplines.

References

David Berry. 2011. The computational turn: Thinking about the Digital Humanities. Culture Machine.

Alexander Dunst. 2017. Digital American studies: An introduction and rationale. Amerikastudien, 61(3):381–395.

Fred Gibbs. 2011. Critical discourse in Digital Humanities. Journal of Digital Humanities, 1(1).

Eduard Hovy and Julia Lavid. 2010. Towards a 'science' of corpus annotation: A new methodological challenge for Corpus Linguistics. International Journal of Translation Studies, 22(1), January.

Fotis Jannidis, Hubertus Kohle, and Malte Rehbein, editors. 2017. Digital Humanities – Eine Einführung. J.B. Metzler.

Jonas Kuhn and Nils Reiter. 2015. A plea for a method-driven agenda in the Digital Humanities. In Proceedings of Digital Humanities 2015, Sydney, Australia, June.

Todd Presner and Chris Johanson. 2009. The promise of Digital Humanities. Available as white paper online from http://humanitiesblast.com/Promise of Digital Humanities.pdf.

Nils Reiter, Evelyn Gius, Jannik Strötgen, and Marcus Willand. 2017. A shared task for a shared goal – Systematic annotation of literary texts. In Digital Humanities 2017: Conference Abstracts, Montreal, Canada.

Patrik Svensson. 2012. Envisioning the Digital Humanities. Digital Humanities Quarterly, 6(1).

Manfred Thaller. 2017. Digital Humanities als Wissenschaft. In Jannidis et al. (Jannidis et al., 2017).

Appendix A Questionnaire

Below we provide the English translation of the student survey questionnaire. Horizontal rules designate the space for free-text answers. In most cases, students were asked to mark the appropriateness of every statement as shown here:

I disagree □ □ □ □ □ □ I agree

1. I like studying Digital Humanities.

2. My Humanities area:

3. The Digital Humanities study program will be helpful for my deliberate professional future (if assessable).

4. This DH study program differs from my bachelor's study program.
   If you (rather) agree, please explain how it differs:

5. I feel overwhelmed in DH courses.
   If you (rather) agree, please explain why:
   □ I'm lacking basic knowledge.
   □ Pace of the course is too fast.
   □ The structure of the course is not intuitive for me.
   □ Other reasons:

6. I feel overwhelmed in CL courses.
   If you (rather) agree, please explain why:
   □ I'm lacking basic knowledge.
   □ Pace of the course is too fast.
   □ The structure of the course is not intuitive for me.
   □ Other reasons:

7. I can contribute with my skills during CL courses.

8. Programming skills are important to me.

9. It is important to me to understand the internal functional principle of computational linguistics tools.

10. Practical modules (like exercises offered additionally to lectures) are important to me.

11. I am confident successfully conducting a hands-on DH project with my skills.

12. I prefer learning the theoretical background before applying it.

13. I prefer learning about hands-on applications before addressing the theoretical background.

14. I can perfectly familiarize myself with a topic on my own.

15. My suggestions to improve the DH study program:
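
Section 3 mentions that exercises in the programming course include implementing functions that measure precision and recall. As an illustration only, such an exercise might look roughly as follows in Python 3. This sketch is not taken from the course materials: the function name `precision_recall`, the set-based formulation, and the example data (character names found in a play) are all assumptions made for the example.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of items.

    Hypothetical helper for illustration: both inputs are treated as sets,
    so duplicates are ignored. Empty inputs yield 0.0 instead of raising
    a ZeroDivisionError.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)  # items found by the system that are also in the gold standard
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up example data: names a tool extracted vs. a manually created gold list.
system_output = ["Faust", "Mephisto", "Wagner", "Helena"]
gold_standard = ["Faust", "Mephisto", "Gretchen", "Wagner", "Marthe"]

precision, recall = precision_recall(system_output, gold_standard)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.75 recall=0.60
```

Three of the four extracted names are correct (precision 3/4), and three of the five gold names are found (recall 3/5). Treating the inputs as sets keeps the example short; an exercise on token-level annotations would need to compare positions rather than bare strings.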