Tracing changes in thematic structure of holiday picture postcards from 1950s to 2010s

Kyoko Sugisaki, Nicolas Wiedmer, Marcel Naef, Heiko Hausendorf
German department, University of Zurich, Switzerland
{sugisaki,nicolas.wiedmer,marcel.naef,heiko.hausendorf}@ds.uzh.ch

Proceedings of the Workshop on Computational Methods in the Humanities 2018 (COMHUM 2018)

Abstract

In this paper, we present our study of the changes of thematic structures in holiday picture postcards from the 1950s to the 2010s. We use over 1,000 cards that we annotated manually with thematic information and apply a clustering method (principal component analysis, PCA) to analyse the thematic structure. The primary objective of our study is to group cards with similar thematic structure and to analyse changes of themes over the decades. Our PCA analyses indicate that holiday postcards have changed in terms of (1) thematic structure, (2) function of text, and (3) language patterns of the speech act ‘greeting’.

1 Introduction

In this paper, we introduce a novel approach to the frame-semantic annotation of thematic structures from the point of view of text linguistics and provide a data-driven analysis of the development of holiday picture postcards over decades.

So far, the annotation of theme in corpora has been carried out mainly on the basis of information structure, as in the Prague Treebank (Hajič, 1998) and the Potsdam Commentary Corpus (Stede and Mamprin, 2016). In our work, the term theme is distinguished from the notion of topic in information structure, that is, ‘aboutness’ and ‘old/given entity’. In information structure, the topic is determined mainly by its syntactic position in a sentence and by the salience of its discourse in relation to the entities mentioned in the previous sentence(s). In contrast, in our work theme is rather a semantic frame that constitutes the thematic coherence of a certain genre of text (holiday picture postcards).

We use the term semantic frame in the sense of Busse (2012, p. 563), who defines the frame as a structure of knowledge, in which the core of a frame (theme) is connected to the constituents of knowledge. Depending on the context of a concrete situation, possible constituents vary. These constituents define the conditions of the realisation of textual phrases. In the case of holiday postcards, the theme of the frame is to be on holiday, and the constituents of knowledge (i.e., slots) are possible ways to be filled with actual text (i.e., fillers) according to the concrete situation of writing a holiday postcard. In this work, we define a set of slots for postcards that report vacation experiences (cf. Section 3.1).

We will first describe the corpus of postcards (Section 2) and then characterise the texts with regard to thematic structure before presenting our annotation schema and discussing the annotation process (Section 3). Finally, we use a clustering method to analyse our annotated postcards and present our results (Section 4).

2 Data source: Postcard corpus

The holiday picture postcard corpus ANKO (Ansichtskartenkorpus ‘picture postcard corpus’) consists of 12,337 cards written in Standard German (95%, 11,760 cards, 582,675 tokens) and in Swiss German (5%, 577 cards) from 1898 to the present day. They were collected in Zurich, Switzerland from 2009 to 2017. The corpus includes only cards that were sent from vacation. The postcards were sent by private individuals by post to Switzerland, mainly from Switzerland, Italy, Germany and other European countries. In the corpus, paragraphs, sentences and tokens are segmented in an XML representation (cf. Sugisaki et al. (2018)).

3 Theme annotation

We manually annotated the core thematic structures in the postcards. The text of the postcards was generally structured as follows: 1) a preface that contains the date, sometimes the location (e.g., Laax, 20/12/1977); 2) a salutation (e.g., Dear Mr. & Mrs. Smith); 3) the message; 4) greetings (e.g., greetings from Paris); 5) the signature of sender(s). The thematic annotations concerned only 1), 3) and 4). No thematic information was found in the salutations 2) or the signatures 5).

In the following section, we describe in detail the subcategorization and annotation process.

3.1 Developing the annotation scheme

The primary goal of the annotation presented in this paper is to find the core thematic structures of the postcards and their development over time. In reporting holiday experiences, postcards exhibit a handful of thematic patterns. These thematic patterns have formed and remained over time because the postcards fulfilled the main purpose of this type of text, which is the function of contact (in the terms of Hausendorf and Kesselheim (2008, p. 154ff) or Brinker et al. (2014, p. 118ff)). Specifically, this function is to maintain personal contact during holidays. In other words, the thematic structures of postcards were conventionalised and standardised over time by fulfilling the communicative needs of holidayers. Of course, some variations were caused by social changes and by the use of postcards as a means of communication. Therefore, we consider that the categories of the thematic structures that we annotated in this study could be super themes that might remain consistent over time (cf. Hausendorf and Kesselheim (2008, p. 103), Hausendorf (2008, p. 333), Hausendorf (2009, p. 13)).

To develop an annotation schema of the thematic structures in postcards, we first determined a set of main thematic categories based on the observation of hundreds of postcards. We then tested this initial schema with 14 test participants in order to refine and extend it and then to produce the final annotation schema. In the following sections, these two steps are described in detail.

3.1.1 Defining core thematic frames

To identify the categories of semantic frames in postcards, we first defined a set of questions that can be answered in text (Ziem, 2008, p. 94f). In other words, we assume that every sentence of a postcard can be read as an answer to at least one question shown in Table 1.

# | Question | Class | Example of a possible answer in text
A | Is anything except holiday thematised? | Extra-diegetic | Thank you for your card.
B | How did I travel to the holiday location? How do I travel home? | Outward and return journey | The flight to Frankfurt was terrible.
C | How is the weather? | Weather | The weather is fabulous!
D | How/where do I stay? | Accommodation | We stay in a camping place near Amsterdam.
E | What/where do I eat or drink? | Eating and drinking | Martin eats pizza every day!
F | Who did I meet on holiday? | Meeting new people | We met some Italian guys in Rome and we hang around a lot.
G | What do I do? | Activity | Yesterday, we visited a lot of churches in Florence.
H | Did something unexpected happen? | Happenings | Unfortunately, we had a car accident in Spain, so that we gave up our road trip.
I | Where am I on my holiday? | Location | We are now in Ibiza with the kids.
J | What do I know about the holiday place? | Knowledge | The Romans built this city about 2000 years ago.
K | What kind of holiday do I take? | Type | Greetings from our hiking-trip!
L | What do I want to achieve on my holiday? | Reason | I always wanted to learn Italian, so now I’m taking a course in Rome.
M | How do I feel? | Feeling | We really enjoy our vacation in Italy :)
N | What can I see/hear in my holidays? | General | There are so many lavender fields here.

Table 1: Annotation scheme of thematic structures in postcards

We divided the frame categories into two classes: 1) about the holiday (B-N in Table 1) and 2) not about the holiday (A in Table 1). The first class is subcategorised into semantics-oriented themes. The thematic categories do not refer to individual topic entities (e.g. snow, rain and wind) but to the super categories of such entities (e.g. weather). The super category weather, for example, was frequently thematised in postcards. Therefore, we concluded that the weather is an important element in the frame of being on holidays. Similarly, eating and drinking, meeting new people and accommodation belong to this category of relevance. Furthermore, we observed that the postcards reported what the writer would do, was doing or did on the holiday (the category of activity). While this category includes comments on events carried out intentionally by holidayers (e.g., hiking, skiing and dancing), the category of happenings refers to unexpected and unintended events (e.g., car accidents, illness and lost baggage). In addition, the postcards often began with descriptions of where the writers were (the category of location), with or without explanatory comments on holiday places (the category of knowledge), and why they were in that holiday location (the categories of type and reason). For example, the type of holiday could be a school trip, a shopping trip or a ski vacation, all of which are holiday prototypes. In contrast, the category of reason concerns what the writer wants to achieve on holiday. Treatment in a sanatorium (body fitness as the goal) and language holidays abroad (language learning as the goal) are prototypical in this category. Moreover, the postcard writers described their holiday either with an emphasis on their emotional state (the category of feeling) or, without any reference to emotion, with a focus on what they saw and heard on their holiday (the category of general). Finally, we created the extra category of outward and return journey, which refers to the journey to and from the holiday location. This category includes events that were not directly related to the holiday location but were part of the holiday experience.

In our annotation scheme, the thematic unit is a sentence. Compared to words, phrases and paragraphs, sentences are ideal units for thematic analysis because each question to be answered in the text contains a proposition. However, a sentence can contain more than one proposition because of coordination (cf. sentence 1) and the inherent semantic property of categories (cf. sentence 2). Therefore, each sentence is annotated as belonging to one or more thematic categories.

(1) Frame category of activity and eating and drinking:
    Jetzt gehen wir Ski fahren und nachher Appenzeller Fondue essen.
    ‘Now we are going to ski and then we will eat cheese fondue à la Appenzell’.

(2) Frame category of accommodation and location:
    Unser Hotel ist in der Nähe vom Genfersee.
    ‘Our hotel is near Lake Geneva’.

3.1.2 Testing the core thematic categories

In order to test the robustness and the comprehensibility of our thematic annotation scheme, we conducted a study at the University of Zurich in which 14 linguistics students individually annotated 12 cards. After a 45-minute briefing session about the annotation scheme, they were provided with an MS Excel sheet in which each sentence was displayed in a cell. The students then assigned the categories shown in Table 1 to the sentences. The category of happenings (H) was not part of the tag set at that moment because it is a result of this study. Furthermore, the extra category ‘X’ was provided in case the students did not find any of the categories suitable for a unit. We then compared the thematic categories assigned by the students to a gold standard that was created by our four internal annotators.

The students’ overall annotation precision ranged from 83.89% to 98.58% (average: 93.30%), with recall between 85.93% and 97.20% (average: 92.61%). The students’ overall scores were satisfying considering the short instruction time. However, there were remarkable differences with regard to precision and recall in some categories. We summarised the results as shown in Table 2. As the balanced score, we used the Matthews correlation coefficient (MCC) instead of the kappa coefficient to account for the differences in frequency between the categories (cf. Powers (2012)). Table 2 shows the problematic thematic categories of accommodation, location, knowledge, type, reason, feeling and general.

# | Class | Pre | Rec | MCC
A | Extra-diegetic | 88.61 | 93.23 | 89.59
B | Journey | 92.86 | 100.00 | 96.30
C | Weather | 97.76 | 90.34 | 93.40
D | Accommodation | 88.89 | 84.21 | 86.19
E | Eating and drinking | 88.89 | 94.12 | 91.17
F | Meeting new people | 84.62 | 82.50 | 83.13
G | Activity | 96.92 | 92.99 | 93.91
H | Happenings | – | – | –
I | Location | 96.24 | 86.72 | 88.79
J | Knowledge | 73.21 | 98.80 | 84.13
K | Type | 38.89 | 50.00 | 43.53
L | Reason | 73.33 | 78.57 | 75.68
M | Feeling | 81.03 | 97.92 | 87.90
N | General | 75.29 | 81.01 | 76.90
O | X | 92.93 | 98.90 | 93.60

Table 2: Precision, recall and MCC (in percent) for the preliminary study with students

We discussed the results with the students and came to the conclusion that the relatively low recall of the categories accommodation and location could be explained by misunderstandings in the briefing session. For example, the students often did not assign the category location if the location was not mentioned in the message but in the preface (e.g., ‘Paris, 07.08.1966’). Based on the discussion, we created an annotation guideline with definitions of the categories and examples of contentious cases.

3.2 Manual annotation

Based on the study described in the previous section, we carried out a sentence-based multiple-class thematic annotation with 14 categories (cf. Table 1) and 1,120 postcards. The cards were selected from our Standard German postcard corpus using a random sampling strategy with a fixed number (160) of cards for each of the 7 decades between the 1950s and 2010s. This sampling method ensured that the range of cards would include those from less frequent decades in the corpus.

The manual annotation was carried out in three steps. First, two linguistics students (annotators A and B) were asked to assign one or more predefined thematic categories to each sentence displayed on a Microsoft Excel sheet similar to the one used in the preliminary study described in the previous section. In addition to the questions shown in Table 1, the annotation guideline summarised in Section 3.1.1 was handed to them. Each postcard was annotated by one of the two annotators, who noted a comment if they were not sure. After this first round of annotation, two additional annotators (annotators C and D) discussed the sentences that had proved problematic in the annotation process and jointly decided which thematic categories were to be chosen.

However, these two steps were not sufficient to obtain a highly consistent annotation. The problem was that some categories were not clearly distinguishable from others, with the result that the first two annotators (annotators A and B) often did not agree on the categories of general, feeling and knowledge. Our approach was to assign the thematic categories that best answered the questions (cf. Table 1), which, however, left room for interpretation by the annotators. For example, the sentence ‘The beach is really wonderful.’ was seen as an answer to the question ‘How did they feel on their holiday?’ (the category of feeling) by annotator A, and as an answer to the question ‘What was the holiday place like?’ (the category of general) by annotator B. For this reason, we defined a set of lexical items for the categories of feeling and knowledge. For example, geniessen (‘enjoy’), gut (‘good’), schlecht (‘bad’) and wunderbar (‘wonderful’) are lexical items that express feeling. They express the opinion, evaluation, and emotional state of the writer. In contrast, man, alle, hier (‘one, all, here’) are lexical items for the category of knowledge. They demonstrate the general knowledge, including stereotypical prejudices, of the writer. However, the occurrence of a lexical item is not a definite criterion for placing a sentence in a certain category. Thus, these two categories were revised by examining the lexical items and their adequacy in the context of every single sentence. For the category of general, we did not define a set of lexical items. Instead, it was chosen whenever the writer gave a clear description of something he or she could see or hear (e.g., ‘The beach is really dirty.’). After having defined lexical items for the categories of feeling and knowledge and having determined the new criteria for the category of general, annotators A and C examined all the instances and revised the annotation jointly in the third step using these new annotation criteria.

4 Data analysis

Based on the manual annotation of over 1,000 cards (49,261 tokens and 6,713 sentences), we analyse our annotated texts with principal component analysis (PCA). PCA is a dimension reduction method that locates underlying latent dimensions of a collection of text by ‘eliminating the covariance while preserving most of the variance in data’ (Moisl, 2015). In corpus linguistics, PCA has been used as an explanatory method, in particular for authorship attribution or stylistics (e.g., Baayen et al. (1996)), and factor analysis has been used for register analysis (e.g., Biber (1995)).

For the PCA analysis, we aggregated our thematic classes reason and purpose to a new class why, and meeting-new-people and eating and drinking to activity, to get a better result. We counted the frequency of each of the 10 semantic frames (without class X) in a text, normalised the count per 1,000 words and log-transformed it.

PCA identified four principal components that account for 62.67% of the variance of 10 variables [1]. The loading is shown in Table 3. Figure 1 (a) and (b) illustrate all the cards in our data set and the directions of the variables. In the following paragraphs, we go through each of these four components in detail.

[1] Kaiser-Meyer-Olkin (KMO) factor adequacy was .6, which indicates that the sampling adequacy was acceptable. The KMO values of the items range from .49 to .67, which were by and large above the acceptance value (.5). Bartlett’s test of sphericity was significant (χ²(55) = 383.42, p < .001). All four principal components had eigenvalue above 5.69, which was well above the acceptance value (1).

Variable: loadings, left to right across Comp 1 to Comp 4 (empty cells not shown)
acc(ommodation): −0.140
act(ivity): −0.595, −0.173, 0.307, 0.482
ext(ra-diegetic): −0.123, 0.962
fee(lings): −0.574, 0.136
gen(eral): −0.126
hap(penings): (no loading shown)
kno(wledge): 0.130
loc(ation): −0.167, 0.719, −0.644
out(journey): (no loading shown)
wea(ther): −0.465, −0.136, −0.587, −0.523
why: 0.116, 0.221

Table 3: PCA loading – component (Comp) 1, 2, 3, and 4

Figure 1: (a) Component 1 and 2 (b) Component 3 and 4

Component 1 In the first component (20.78% of variance), PCA indicates that activity, feelings and weather are highly correlated. The card with the highest score was (A), while that with the lowest score was (B). In card (A), only the purpose of the holiday is mentioned, whereas the major semantic frames (weather, activity, feeling, general, knowledge, accommodation, location) are mentioned in card (B).

(A) The highest score:
    Ausflug 19.6.93.
    ‘excursion 19.6.93.’

(B) The lowest score:
    Zinal, 21.7.64. \\ Unsere Lieben, nach den paar strengen Stunden der Expo – es war vom Sonntag mörderisch heiss es glich einem langsamen Selbstmord – sind wir nun doch noch für ein paar Tage hier im wirklich malerischen & sehr ruhigen Bergdörfchen Zinal (Walliser Hochthal) gelandet. Im Grande Hotel des Diablons (ganz alt, aber [unclear] mit [unclear] franz. Küche) sind wir in allen Teilen gut aufgehoben. Wir ruhen uns Beide gut aus, mein [unclear] Mann hat dies nach seiner Bruchoperation nötig und ich laufe auch schon lange zum [unclear] Jetzt bricht eben ein Sturmwind mit Gewitter los, hoffentl. kommt kein Dauerregen. – Frau [NN] hat sich dann wieder fest gemeldet \\ Liebe Grüsse, Ihre H. [unclear] [NN]
    ‘Zinal, 21.7.64. \\ Our beloved ones, after a few hours of rigor at the Expo – it was murderously hot from Sunday it was like a slow suicide – we eventually ended up here for a few days in the really picturesque and very quiet mountain village Zinal (Valais High Valley). At the Grande Hotel des Diablons (very old, but [unclear] with [unclear] French cuisine) we are in good hands in every way. We both rest well, my [unclear] husband needs this after his fracture surgery, and I have been walking for a long time to [unclear] Now a storm wind is breaking loose with thunderstorms, hopefully it will not rain constantly. – Incidentally, Mrs [NN] has again made a firm commitment \\ Best regards, yours H. [unclear] [NN]’ [2]

[2] [NN] stands for family name, and [unclear] for unreadable passages.

We interpret the component as the standardisation of prototype themes in postcards. In Figure 2(a), we observe that postcards have gradually evolved from rather scattered and sparse semantic-thematic contents towards prototypical ones. Furthermore, the analysis indicates that holidayers more often wrote about why they were on holiday before the emergence of mass tourism.

Component 2 In the second component (15.61% of variance), our cards are clearly grouped into two clusters, mainly depending on the occurrence of the class extra-diegetic. Activity and weather are negatively correlated with extra-diegetic. The card with the highest score (A below) was about weather, location, and activity, while that with the lowest score (B below) was all about the addressee.

(A) The highest score:
    Aus unseren bisher sehr sonnigen und warmen Wanderferien im Berner Oberland senden wir Euch herzliche Grüsse\\ Gret + Ralph [NN] Sandra + Katja
    ‘From our so far very sunny and warm hiking holidays in the Bernese Oberland we send you affectionate regards \\ Gret + Ralph [NN] Sandra + Katja’

(B) The lowest score:
    Meine liebe große Dame! Besten Dank für Ihre beiden Karten & guten Wünsche zum Geburtstag, worüber ich mich sehr freue. Sind Sie immer noch in Kreuzingen? Ich glaubte Sie wären ab 4.II. wieder im Clubhaus. – Wie geht es Ihnen – hatten Sie mit der Kur Erfolg? Ich komme Ende dieser oder Anfang nächsten Monats wieder zurück & hoffe dann auf ein Wiedersehen. Bis dahin noch weiter hin recht gute Wünsche & Liebe Grüsse auch an Herrn Sohn von Ihrer kleinen Dame.
    ‘My dear Grand Lady! Many thanks for your two cards & the good wishes for my birthday, that makes me very happy. Are you still in Kreuzingen? I thought you were back in the clubhouse from 4.II. – How are you – did you have success with the cure? I’ll be back at the end of this or early next month and hope to see you then. In the meantime, I send you very good wishes and best regards, also to your Mr son from your Little Lady.’

We interpret the component as changes of the main text function of postcards. In Figure 2(a), we observe a decrease of recipient-orientation over the decades. It seems that the main function of postcards has gradually shifted from a correspondence (‘how are you? I am thinking of you during my holidays’; text function of contact) to a holiday report (‘how do I spend my holidays?’), whose text function is description.

Component 3 In the third component (14.32% of variance), PCA indicates that location and activity (and, slightly, knowledge and why) are correlated. They are negatively correlated with weather. The card with the highest score was mainly about locations (indicating where the writers are on holiday) and activities (what they did there), while that with the lowest score was about weather.

(A) The highest score:
    17.05.07 \\ Lieber Coni \\ Bernhard + ich befinden uns auf einer Reise von Silvanien durch Nord- griechenland, Mazedonien, Bulgarien. Wir haben die Vikos-schlucht durchwandert und am Ochridsee die Eichen- wälder an den Hängen. Gestern waren wir auf dem Ohrid-See, ein Relikt aus der Eiszeit, an der mazedonisch-albanischen Grenze.- Ganz herzliche Dank für den feinen Alpkäse u. die Güegi. Beides war über die Oster- tage bei dem vielen Besuch hoch willkommen. Mit Gruss \\ Bernhard \\ Barbara
    ‘17.05.07 \\ Dear Coni \\ Bernhard + I are on a journey from Silvania through Northern Greece, Macedonia, Bulgaria. We have hiked through the Vikos Gorge and at the Lake Ochrid through the oak forests on the slopes. Yesterday we were on Lake Ohrid, a relic from the Ice Age, on the Macedonian-Albanian border. Thank you very much for the delicate Alpine cheese and the Güegi. Both were very welcome over the Easter days due to the many visitors. With regards \\ Bernhard \\ Barbara’

(B) The lowest score:
    Von den viel zu kurzen, aber zu [unclear] Ferien-Tage ganz herzliche Grüsse. Leider spielt das Wetter nicht mit. Sehr kalt und Regen. \\ A. [NN]
    ‘From the far too short, but too [unclear] holidays very affectionate regards. Unfortunately the weather is not on our side. Very cold and rain. \\ A. [NN]’

We interpret the component as changes of social and cultural aspects in postcards. In Figure 2(b), we observe a decrease in the 1970s, a low point in the 1980s and a slight increase in the 1990s. In that period, holidayers extensively reported on weather. In the 2000s and 2010s, holidayers reported with an emphasis on ‘Where are you? How many places do you visit? Why are you there (what is special about it)? What do you do there?’. Our interpretation is that holidayers wrote about the most general topic, weather, from the 1970s to the 1990s. Holidayers expect to have holiday weather on their vacation. Mass tourism might lead to the feeling of being ‘one of many’ who are at the mercy of the weather on holiday. Since 2000, holidayers have tended to report on activity- and knowledge-oriented vacations and round trips. Individuality and originality of a trip and travel experiences might have become important for the identity of holidayers in the performance/achievement-oriented society.

Component 4 In the fourth component (11.95% of variance), PCA indicates that location and weather are highly correlated. They are negatively correlated with activity (slightly also with why and feelings). The card with the highest score was a greeting combined with activity, while that with the lowest score consists of a sentence of greeting combined with weather and location. We observed this pattern in the top 10 cards in this component.

(A) The highest score:
    Lieber Pius, \\ von den schönen, aber sehr sportlichen Skiferien die herzlichsten Purzelbaumgrüsse \\ [unclear] \\ Chlaus \\ Berthe [NN] Dieter
    ‘Dear Pius, \\ from the beautiful, but very sporty ski holidays the most affectionate somersault regards \\ [unclear] \\ Chlaus \\ Berthe [NN] Dieter’

(B) The lowest score:
    Lieben Dank für Fredis Karte und herzlichen Grüsse aus dem sonnigen Spanien, wo ich zwei Ferienwochen verbringe, \\ Eure Hanni
    ‘Thank you very much for Fredi’s card and affectionate regards from sunny Spain, where I spend two weeks of vacation, \\ Yours Hanni’

We interpret the component as a change of language patterns in the speech act ‘greeting’. In the 1970s and 1980s, the greeting form combined with weather and location was more common than in other decades. Component 4 shows that weather and activity are both prototypical for greetings, but competitive semantic frames have been recurrently evoked in greetings. Again, their mention depends on time, the society and the culture.

Figure 2: Trajectory over time: (a) Component 1 and 2 (b) Component 3 and 4

5 Concluding remarks

In our study we showed that PCA is a suitable way to model thematic changes in holiday picture postcards over time, and also that our annotation scheme provides an adequate basis for data-driven analysis. The PCA indicated four aspects of change in postcards. Firstly, postcards have been gradually standardised with regard to themes, and evolved towards prototypical themes such as activity, weather and feelings (evaluation). Secondly, the PCA demonstrated that the main function of the text type ‘holiday picture postcards’ progressively shifted from the obvious function of contact to more that of a description of holidays. Thirdly, we showed that postcards have evolved in ways that can only be interpreted by further investigations into the social and cultural backgrounds at that period in time. Lastly, PCA also identified language patterns of a prototypical speech act ‘greeting’ in postcards. We observed two patterns of greeting: weather and location on the one hand, and activity on the other, are evoked prototypically in greetings. In particular, greeting with the mention of weather and location was common in the 1970s and 1980s. In future, we plan to investigate the changes of narrative structures in holiday picture postcards.

Acknowledgments

This work has been funded under SNSF grant no. 160238. We thank Maaike Kellenberger and David Koch for the annotation, Josephine Obert and Jan Langenhorst for the assistance during the annotation process, and Joachim Scharloth and Noah Bubenhofer for valuable discussions.

References

Baayen, Harald, Hans van Halteren, and Fiona Tweedie (1996). Outside the cave of shadows: using syntactic annotation to enhance authorship attribution. Literary and Linguistic Computing, 11(3):121–132.

Biber, Douglas (1995). Dimensions of register variation: a cross-linguistic comparison. Cambridge: Cambridge University Press.

Brinker, Klaus, Hermann Cölfen, and Steffen Pappert (2014). Linguistische Textanalyse: eine Einführung in Grundbegriffe und Methoden. Berlin: Erich Schmidt Verlag, 8., neu bearb. und erw. Aufl.

Busse, Dietrich (2012). Frame-Semantik. Ein Kompendium. Berlin, Boston: De Gruyter.

Hajič, Jan (1998). Building a Syntactically Annotated Corpus: The Prague Dependency Treebank. In E. Hajičová, ed., Issues of Valency and Meaning. Studies in Honour of Jarmila Panevová, pages 106–132. Karolinum, Charles University Press, Prague, Czech Republic.

Hausendorf, Heiko (2008). Zwischen Linguistik und Literaturwissenschaft: Textualität revisited. Mit Illustrationen aus der Welt der Urlaubsansichtkarte. Zeitschrift für germanistische Linguistik (ZGL), 36(3):319–342.

Hausendorf, Heiko (2009). Kleine Texte. Über Randerscheinungen von Textualität. Germanistik in der Schweiz. Online-Zeitschrift der Schweizer Akademischen Gesellschaft für Germanistik, 6.

Hausendorf, Heiko and Wolfgang Kesselheim (2008). Textlinguistik fürs Examen. Göttingen, Germany: Vandenhoeck & Ruprecht.

Moisl, Hermann (2015). Cluster analysis for corpus linguistics. Quantitative linguistics. Berlin: de Gruyter Mouton.

Powers, David M. W. (2012). The problem with kappa. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 345–355.

Stede, Manfred and Sara Mamprin (2016). Information structure in the Potsdam Commentary Corpus: Topics. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2016).

Sugisaki, Kyoko, Nicolas Wiedmer, and Heiko Hausendorf (2018). Building a corpus from handwritten picture postcards: Transcription, annotation and part-of-speech tagging. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC’18), pages 255–259.

Ziem, Alexander (2008). Frame-Semantik und Diskursanalyse – Skizze einer kognitionswissenschaftlich inspirierten Methode zur Analyse gesellschaftlichen Wissens. In Ingo Warnke and Jürgen Spitzmüller, eds., Diskurslinguistik nach Foucault. Methoden, pages 89–117. Berlin, New York: De Gruyter.
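
As a supplementary illustration of the per-category scores reported for the preliminary study (Section 3.1.2), the following minimal Python sketch computes precision, recall and the Matthews correlation coefficient for one thematic category in a multi-label, sentence-based annotation. It is not the evaluation script used in the study: the function name `category_scores`, the representation of annotations as sets of category letters, and the toy sentences are assumptions made for illustration, and undefined ratios are set to 0.0 by convention.

```python
import math


def category_scores(gold, predicted, category):
    """Precision, recall and MCC for one category.

    gold / predicted: lists with one set of category labels per sentence
    (hypothetical representation; a sentence may carry several labels).
    """
    tp = fp = fn = tn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        in_gold = category in gold_labels
        in_pred = category in pred_labels
        if in_gold and in_pred:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_gold:
            fn += 1
        else:
            tn += 1
    # Assumption: undefined ratios (zero denominator) are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Toy data (invented): six sentences, labels follow the letters of Table 1.
gold = [{"C"}, {"G", "E"}, {"I", "D"}, {"M"}, {"C", "M"}, {"A"}]
student = [{"C"}, {"G"}, {"I"}, {"M", "C"}, {"C", "M"}, {"A"}]

p, r, mcc = category_scores(gold, student, "C")
print(f"Weather: precision={p:.3f} recall={r:.3f} MCC={mcc:.3f}")
# prints "Weather: precision=0.667 recall=1.000 MCC=0.707"
```

Because the true negatives enter the MCC formula, a rare category such as type is not rewarded merely for being absent from most sentences, which is the property that motivates its use over raw agreement for categories of very different frequency.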
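
The sampling strategy of Section 3.2 (a fixed number of cards per decade) can be sketched as follows. This is an illustrative sketch only: the function `sample_per_decade`, the `(card_id, year)` input format, the fixed random seed and the toy corpus are all hypothetical and are not taken from the ANKO tooling.

```python
import random
from collections import defaultdict


def sample_per_decade(cards, per_decade, seed=0):
    """Draw the same number of cards at random from every decade.

    cards: iterable of (card_id, year) pairs (hypothetical format).
    Decades holding fewer than `per_decade` cards are skipped here;
    how such decades should be handled is a design decision.
    """
    by_decade = defaultdict(list)
    for card_id, year in cards:
        by_decade[year // 10 * 10].append(card_id)
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {
        decade: sorted(rng.sample(ids, per_decade))
        for decade, ids in sorted(by_decade.items())
        if len(ids) >= per_decade
    }


# Toy corpus (invented): 700 cards spread unevenly over 1950-2019.
toy_cards = [(f"card{i}", 1950 + (i * 7) % 70) for i in range(700)]
toy_sample = sample_per_decade(toy_cards, per_decade=5)
for decade, ids in toy_sample.items():
    print(decade, len(ids))
```

With 160 cards per decade and seven decades, the same procedure yields the 1,120 cards mentioned in Section 3.2; fixing the quota per decade keeps sparsely represented decades from being swamped by the well-represented ones.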
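
The preprocessing and analysis steps of Section 4 (frame counts per card, normalisation per 1,000 words, log transformation, PCA) can be sketched with NumPy as below. Several points are assumptions rather than details given in the text: the `log(1 + x)` offset that keeps zero counts finite, the standardisation of each variable before the decomposition (i.e. PCA on the correlation matrix), the function names `frame_features` and `pca`, and the invented toy counts. The sketch is meant to show the shape of the computation, not to reproduce the reported loadings.

```python
import numpy as np


def frame_features(counts, n_tokens):
    """Frame counts per card -> rate per 1,000 tokens -> log(1 + rate)."""
    counts = np.asarray(counts, dtype=float)
    n_tokens = np.asarray(n_tokens, dtype=float)
    rate = counts / n_tokens[:, None] * 1000.0
    # Assumption: +1 offset so that frames absent from a card map to 0.
    return np.log1p(rate)


def pca(features, n_components):
    """PCA via SVD of the standardised data matrix (cards x frames)."""
    centred = features - features.mean(axis=0)
    std = centred.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled
    z = centred / std
    _, singular, vt = np.linalg.svd(z, full_matrices=False)
    explained = singular**2 / np.sum(singular**2)
    loadings = vt[:n_components].T        # one column per component
    scores = z @ loadings                 # position of each card
    return loadings, scores, explained[:n_components]


# Toy data (invented): 6 cards, 3 frames (e.g. activity, extra-diegetic, weather).
toy_counts = [[2, 0, 1], [0, 3, 0], [1, 1, 1], [4, 0, 2], [0, 2, 0], [3, 1, 1]]
toy_tokens = [40, 55, 30, 80, 25, 60]

X = frame_features(toy_counts, toy_tokens)
loadings, scores, explained = pca(X, n_components=2)
print("explained variance ratio:", np.round(explained, 3))
print("card with highest score on component 1:", int(np.argmax(scores[:, 0])))
```

The cards with the highest and lowest entries in a column of `scores` correspond to the (A) and (B) examples quoted for each component, and averaging the scores by decade gives the kind of trajectory summarised in Figure 2. Note that the sign of a component is arbitrary, so "highest" and "lowest" may swap between implementations.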