Digitizing a 19th-Century Music Theory Debate for Computational Analysis

Fabian C. Moss¹, Maik Köster², Melinda Femminis³, Coline Métrailler³ and François Bavaud³

¹ Digital and Cognitive Musicology Lab, Digital Humanities Institute, École Polytechnique Fédérale de Lausanne
² Musikwissenschaftliches Institut, Universität zu Köln
³ Section des Sciences du Langage et de l’Information, Faculté des Lettres, Université de Lausanne

Abstract
We report on the progress of the ongoing project “Digitizing the Dualism Debate: a case study in the computational analysis of historical music theory sources”. First, we give a brief introduction to the dualism debate, a central discussion in 19th-century German music theory. We then describe the transcription pipeline with which we process the digitized sources in order to arrive at a corpus of computationally tractable representations, and discuss a number of challenges we encountered, e.g. the assignment of structural types and the handling of idiosyncratic symbols. Employing text similarity measures and topic modeling, we present some preliminary analyses. Future steps include text annotation, music encoding, and the presentation of the corpus through an online interface.

Keywords
digital musicology, music theory, dualism debate, corpus study, computational humanities

1. Introduction
We present the ongoing project “Digitizing the Dualism Debate: a case study in the computational analysis of historical music theory sources”, which strives to reconstruct and critically evaluate the discursive relations within this debate by harnessing the combined power of qualitative-historical and quantitative-numerical methods. The “dualism debate”, a hot topic in 19th-century German music theory [30, 17], is concerned with the mutual relationship of major and minor triads. Specifically, the discussion revolves around whether the minor triad is a mere derivative of the major triad (the monist position, e.g.
by lowering its third by a semitone) or whether it can be derived from first principles in its own right (the dualist position, e.g. by postulating the existence of an undertone series) [15, 35, 6, 20, 12, 37]. By negotiating the relationship between the two most important chord qualities in Western music, and by extension their scales and tonalities, the debate concerns the most fundamental level of how harmony is conceptualized theoretically. Authors thus put forth their ideas in thorough and at times passionate ways, drawing on different scholarly backgrounds (e.g. acoustics, physiology, practical harmony, or philosophy). Although the historical debate has essentially been settled [26, 5, 16], it still resonates in more recent approaches to harmony [10, 8].

CHR 2021: Computational Humanities Research Conference, November 17–19, 2021, Amsterdam, The Netherlands
fabian.moss@epfl.ch (F.C. Moss); mkoest14@uni-koeln.de (M. Köster); melinda.femminis@unil.ch (M. Femminis); coline.metrailler@unil.ch (C. Métrailler); francois.bavaud@unil.ch (F. Bavaud)
ORCID: 0000-0001-9377-2066 (F.C. Moss); 0000-0002-3196-481X (C. Métrailler); 0000-0002-4565-0715 (F. Bavaud)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

Table 1: Overview of the transcribed sources (see References for full titles). Note: the total number of types is not the sum of the numbers of types of the individual texts, since the vocabularies overlap.

Author                    Year  Pages  Types   Tokens   ID       Ref.
Moritz Hauptmann          1853  394    7’511   33’938   HAU1853  [11]
Ernst Naumann             1858  52     2’662   7’252    NAU1858  [23]
Carl Friedrich Weitzmann  1860  63     1’788   4’853    WEI1860  [40]
Carl Friedrich Weitzmann  1861  28     1’677   3’050    WEI1861  [39]
Franz Joseph Kunkel       1863  59     5’193   13’408   KUN1863  [18]
Arthur v. Oettingen       1866  294    6’463   25’045   OET1866  [25]
Adolf Thürlings           1877  51     3’279   6’454    THU1877  [38]
Hugo Riemann              1905  36     2’269   4’136    RIE1905  [31]
Georg Capellen            1905  88     4’753   12’373   CAP1905  [4]
Total                           1’065  20’436  110’509

We address our project aims by creating a corpus for computational analysis from relevant sources. The current state of the project and the following descriptions and analyses are based on a sub-corpus of nine selected core texts. The selection comprises eight authors and more than a thousand pages of text (see Table 1). The texts from von Oettingen’s “Harmoniesystem in dualer Entwicklung” (1866) onwards have dualism as their primary focus, whereas the earlier texts feature and develop such ideas in a more implicit manner, i.e. without using the term “dual”. Containing texts published between 1853 and 1905, the corpus may give some indication of how the discourse changed over time and eventually became a debate. Some texts also respond directly to others in the corpus, thus forming meaningful connections within it. Further full texts as well as particularly relevant excerpts from other sources will follow in due course. Scans of the chosen works were either available from various online resources or requested from libraries.

In this paper, we provide an overview of the transcription pipeline, consisting of segmentation, OCR transcription, corrections, and export using the transcription tool Transkribus [22]. We describe a number of challenges particular to our corpus and report some initial computational analyses, namely text similarity and topic modeling. Finally, we discuss future steps, such as annotation, music encoding, and presentation of the project online.
By creating a machine-readable resource of historical texts and applying methods from the digital humanities, our project aims to bridge the gap between the humanities and the sciences, in particular between music theory and corpus studies [24], and to provide a case study of how computational analysis can be fruitfully employed in musicology.

2. The transcription pipeline

2.1. Segmentation
During the segmentation process, text regions and other elements on the page were identified and labeled according to their content type and function within the text. Since we are interested in representing the logical structure of the text, and not the physical book, it was important to distinguish whether a paragraph or graphic is shown completely on a page or is continued on a different page. This allows us to reconstruct the precise beginnings and endings of units, regardless of page breaks. Because errors on the baseline level may result from the splitting of text regions, it was crucial to examine the lines after the region segmentation to ensure that all lines within text regions were represented correctly and in the right order.

Figure 1: Sample lines from each of the documents in Table 1 (in order), illustrating the variety in font styles and scan quality present in the corpus.

2.2. OCR transcription
Subsequently, the entire document was transcribed using the AI-powered Optical Character Recognition (OCR) built into Transkribus. Several OCR models are available, each more or less specialized for specific scripts, languages, and source types. Our corpus is quite diverse in terms of font face, style, and scan quality (see Figure 1), but reasonably homogeneous from a linguistic perspective: all texts are written in German and stem from a narrow period of roughly 50 years. The following setting has proven to be a good choice for automated transcription of all documents transcribed thus far: CITlab HTR: ONB_Newseye_GT_M1+, Dictionary: trainDataLanguageModel.
This model was trained in the NewsEye project [7] on German-language newspapers from the late 18th to the mid-20th century, taken from the ANNO collection of the Austrian National Library.¹ These newspapers are set mostly in black letter (Fraktur) but also in Roman fonts and thus, by and large, resemble our corpus historically and typographically.

¹ https://anno.onb.ac.at/

2.3. Corrections and export
The text produced by OCR was then carefully reviewed by a native German speaker with a good understanding of music theory. For normal sentences, relatively few mistakes had to be corrected overall. The OCR performance on Roman fonts was slightly worse than on black letter, but still highly satisfactory for an initial transcription. However, specialized expressions such as note names or harmonic analyses had to be reproduced by hand, all emphases had to be added manually, and some aspects of the transcribed texts had to be edited in accordance with our editorial guidelines,² which are updated and refined throughout the project. For most aspects (i.e. spelling, modes of emphasis), the guidelines aim to reproduce the text as it appears in the original source. However, some aspects, such as font face, quotation marks, and hyphens, were unified for practical reasons. Finally, the transcriptions were exported to the XML format of the Text Encoding Initiative (TEI) [13] as well as to plain text for further processing and computational analysis.

Figure 2: Screenshot from a source page. The text flows around the chord spellings, which are set to vertically align with each other.

3. Challenges

3.1. Distinguishing between graphic and text
Music-theoretical treatises present a challenge to corpus-linguistic approaches because the authors do not only express themselves in natural language, but also use abstract, often highly idiosyncratic symbols to designate musical concepts and relationships.
The first problem this poses is that symbols may not be easily represented using the means available in Transkribus. In some instances, such as for lines written above or below note names, alternative notations had to be invented. Analytical expressions may also exist on a gradient between text and graphical content: whereas a simple chord spelling like C - e - G can be understood as regular text, the addition of brackets, arrows, or specific alignments may turn it into a graphic. Some music theorists move quite flexibly between natural language and symbolic illustrations, for example by introducing specific alignments into their text, which blurs the line between graphical and non-graphical use. This results in formations that a human reader might understand quite readily, but that are difficult to represent and label appropriately in a digital format. An example of this is shown in Figure 2. Another frequent occurrence is that graphical illustrations, although spatially separated from the text, are still embedded in the same sentence structure. This means that the sentence becomes incomplete on the main text level if the words in the graphic are excluded.

3.2. Unrepresentable means of highlighting
By far the most common mode of emphasis in the sources is letter-spacing, followed by bold and italics. These are easily reproduced using Transkribus’ interface. This does not apply to other modifications, namely text alignment and font changes. Centering may lead to a mathematical or analytical expression being interpreted as a floating element. For a centered sentence, however, it is not desirable to label it as a float and thereby exclude it from the text proper, so we treat it as a normal line within the paragraph, which removes some of the emphasis but keeps the text intact.
Especially in texts set in black letter, authors sometimes chose to set foreign terms or note names in Roman font, distinguishing them from their surroundings. Because these changes are subtle, we ultimately decided not to take font changes into account.

² https://github.com/DCMLab/ddd/wiki/Editorial-Guidelines

3.3. Other issues
1) The editions we used for KUN1863 and HAU1853 feature errata pages. Although this project does not aim to produce scholarly editions of the texts, we chose to correct simple spelling or grammar mistakes. Corrections within graphics pose a problem, as we intend to take graphics directly from the source. It was also not practically feasible to implement corrections referring to general oversights by the publishers.
2) Hauptmann makes extensive use of musical notations that contain only rhythmic information, but no lines indicating pitch. We eventually decided to limit the label ‘music’ to examples that are written on a staff with five lines. The rhythmic notation in HAU1853 is thus labeled as a graphic instead.
3) Kunkel uses two levels of footnotes. Since this is unique to this text, the hierarchical relationship is represented by the symbols used, not by a special structure type.
4) WEI1861 contains dashes that do not appear to possess any syntactical significance. They have been left in the text because their meaning is as yet unclear, with one hypothesis being that they represent an omission, e.g. due to censorship or editorial discretion.

4. Preliminary analyses
The 9 texts by 8 authors comprise 1’065 pages, 110’509 word tokens, and 20’436 distinct word types in total (without stop words; the per-text type counts in Table 1 sum to 35’595 because the vocabularies overlap). Besides text, they together contain 402 music examples (384 floating, 18 inline), 829 graphical elements (720 floating, 138 inline), and 22 tables.
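The difference between counting types per text and counting them over the whole corpus (cf. the note to Table 1) can be illustrated with a minimal sketch. It is not the project's actual code: the function name `corpus_stats`, the toy input, and the choice to drop stop words before counting both tokens and types are assumptions made for illustration only.

```python
def corpus_stats(texts, stop_words=frozenset()):
    """Count tokens and types per text and for the corpus as a whole.

    `texts` maps a text ID to its list of word tokens (hypothetical input;
    tokenisation is assumed to have happened upstream).
    """
    per_text = {}
    vocabulary = set()
    for text_id, tokens in texts.items():
        # Assumption: stop words are removed before any counting.
        kept = [t for t in tokens if t not in stop_words]
        types = set(kept)
        per_text[text_id] = {"tokens": len(kept), "types": len(types)}
        vocabulary |= types  # shared words are added only once
    totals = {
        "tokens": sum(s["tokens"] for s in per_text.values()),
        "types": len(vocabulary),  # distinct types in the corpus
        "sum_of_types": sum(s["types"] for s in per_text.values()),
    }
    return per_text, totals


# Toy data, not taken from the corpus.
toy = {"X": ["Ton", "Terz", "Ton", "und"], "Y": ["Ton", "Quint", "und"]}
per_text, totals = corpus_stats(toy, stop_words={"und"})
print(per_text)
print(totals)
```

Because ‘Ton’ occurs in both toy texts, the corpus-level type count is smaller than the sum of the per-text type counts, while token counts simply add up.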
We used the spaCy library for basic text processing.³ Overall, the ten most frequent nouns are ‘Töne’ (675), ‘Ton’ (450), ‘Terz’ (446), ‘Bedeutung’ (420), ‘Folge’ (410), ‘Tonart’ (364), ‘Grundton’ (355), ‘Accorde’ (289), ‘Dissonanz’ (272), and ‘Dreiklang’ (249), clearly reflecting the music-theoretical focus of the texts. In this early phase of the project, we have not yet applied more refined NLP strategies such as lemmatization (e.g. merging singular and plural forms of nouns) but focus on two computational analyses of the ‘raw’ word counts, namely text similarity and topic modeling, which will form the basis for more extensive textual explorations in future research.

4.1. Lexical text similarity
To assess the lexical similarity of the texts, we compute vector representations of all texts by weighting the respective word counts with term frequency-inverse document frequency (TF-IDF) [33] after removing custom stop words, and use Principal Component Analysis (PCA) [14] for a reduction to two dimensions (left panel of Figure 3); we rely on the algorithms provided by the library scikit-learn.⁴ One can observe that the texts group into two clusters that are separated by the first principal component: 1) WEI1860, WEI1861, and KUN1863, and 2) HAU1853, NAU1858, and OET1866, as well as THU1877, RIE1905, and CAP1905. Moreover, within the second cluster, one can observe a diachronic trajectory along which the second principal component distinguishes earlier from later publications.
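The two steps just described, TF-IDF weighting followed by a PCA reduction to two dimensions, can be sketched without external dependencies. This is a stand-in for illustration, not the scikit-learn pipeline used in the project: the function names, the toy documents, the smoothed IDF variant with L2 normalisation, and the use of power iteration on the document-by-document Gram matrix are all assumptions, since the exact settings are not specified in the text.

```python
import math
from collections import Counter


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, v):
    return [dot(row, v) for row in matrix]


def unit(v):
    """Scale v to length 1; return None for a (numerically) zero vector."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 1e-12 else None


def tfidf_vectors(docs, stop_words=frozenset()):
    """Return (vocabulary, {doc_id: L2-normalised TF-IDF vector})."""
    counts = {k: Counter(t for t in toks if t not in stop_words)
              for k, toks in docs.items()}
    vocab = sorted(set().union(*counts.values()))
    n = len(counts)
    # Assumed weighting: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
    idf = [math.log((1 + n) / (1 + sum(w in c for c in counts.values()))) + 1
           for w in vocab]
    vectors = {}
    for k, c in counts.items():
        v = [c[w] * weight for w, weight in zip(vocab, idf)]
        vectors[k] = unit(v) or v
    return vocab, vectors


def pca_2d(vectors, iterations=500):
    """Project vectors onto their first two principal components.

    Uses the Gram matrix of the centred data and power iteration with
    deflation. Returns ({doc_id: (pc1, pc2)}, [explained variance ratios]).
    """
    ids = list(vectors)
    rows = [vectors[k] for k in ids]
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - mean[j] for j in range(d)] for r in rows]
    gram = [[dot(r1, r2) for r2 in centred] for r1 in centred]
    total = sum(gram[i][i] for i in range(n)) or 1.0  # total variance (up to 1/n)
    coords = {k: [] for k in ids}
    ratios = []
    for _ in range(2):
        v = unit([1.0 / (i + 1) for i in range(n)])  # fixed, non-constant start
        for _ in range(iterations):
            w = unit(matvec(gram, v))
            if w is None:
                break
            v = w
        lam = max(dot(v, matvec(gram, v)), 0.0)  # eigenvalue estimate
        ratios.append(lam / total)
        for k, x in zip(ids, v):
            coords[k].append(math.sqrt(lam) * x)
        # Deflation: remove the component just found.
        gram = [[g - lam * v[i] * v[j] for j, g in enumerate(row)]
                for i, row in enumerate(gram)]
    return {k: tuple(c) for k, c in coords.items()}, ratios


# Toy documents (hypothetical, not from the corpus).
docs = {
    "A": ["terz", "quint", "terz", "dreiklang"],
    "B": ["terz", "quint", "terz", "dreiklang"],
    "C": ["metrum", "ordnung", "metrum"],
    "D": ["terz", "metrum"],
}
vocab, vectors = tfidf_vectors(docs)
coords, ratios = pca_2d(vectors)
print(coords)
print(ratios)
```

In this toy example, the two identical documents receive identical coordinates, and the explained variance ratios play the same role as the percentages reported for the first two principal components of the corpus. The signs of the components are arbitrary, as in any PCA.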
However, rather than representing a historically changing language style, the clusters reflect kinship in content: Naumann was a student of Hauptmann and dedicates his work to his teacher; both Hauptmann’s and Naumann’s texts are referred to in the introduction of Oettingen’s book, and he considers his achievement to be the conjunction of Hauptmann’s and Helmholtz’s teachings.⁵ Thürlings’ work features extensive reviews of Hauptmann, Helmholtz, and Oettingen, while preparing some of the ideas later discussed between Riemann and Capellen. It thus connects these two groups of works. However, textual similarity as expressed by the TF-IDF vectors does not always correspond to positive affinities: KUN1863’s proximity to Weitzmann’s texts is explained by strong and direct opposition rather than by positive reference; e.g. his Kritische Beleuchtung accuses Weitzmann of “error, ignorance, or intentional disregard” of prior literature [18, subtitle]. Likewise, RIE1905 and CAP1905, the two most recent texts in our corpus, are relatively similar in terms of their term frequencies, but could not be more contrasting in terms of their intention. Riemann’s text is decidedly ‘dualist’ and meant as a concise summary of his earlier extensive writings on the topic, whereas Capellen’s direct reply vehemently defends ‘monism’. On the other hand, WEI1860 and WEI1861 are relatively distant, although they have the same author and topic and were published in consecutive years. The first two principal components explain 29.9% and 15.5% of the variance in the data, respectively, together accounting for less than half of the total variance. Further analyses and pre-processing steps are thus required for a deeper understanding of the textual similarities in our corpus.

³ https://spacy.io/
⁴ https://scikit-learn.org/stable/

4.2. Topic modeling
Topic modeling, in particular with Latent Dirichlet Allocation (LDA) [1], is a widely adopted technique in the digital humanities [2, 28, 21].
Here, we rely on the implementation of tomotopy⁶ and retrieve the 5 most likely topics for our corpus. Table 2 lists the 15 most common words per topic along with their TF-IDF-weighted frequencies, Table 3 shows the distribution over topics for each text, and the right panel of Figure 3 shows a PCA reduction of the topic vectors, where circle size is inversely related to topic coherence [36].

Topic 1 features chord-related terms such as ‘Folge’ (sequence), ‘Terz’, ‘Quint’, and ‘Grundton’ (third, fifth, and root; the constituent tones of triads), as well as ‘Dreiklang/Dreiklänge’ (triads), and we designate this topic as “chords”. Topic 2 likewise contains the notions of ‘Terz’ and ‘Quinte’, but the presence of ‘Octave’ as well as of ‘Intervalle’, ‘Obertöne’, and ‘Schwingungszahlen’ (intervals, overtones, and frequencies), and of ‘Helmholtz’, indicates that they signify intervals in the acoustic sense instead of chord tones. We thus call this topic “acoustics”. Topic 3 prominently features all notes of the C-major scale (plus ‘b’ and ‘fis’, the German names for B♭ and F♯), and could be termed the “tones” topic. Topic 4 appears particularly Hauptmannian: unity (‘Einheit’) is one of the core notions of his dialectic/dualistic theory, and half of his book is concerned with meter, reflected in terms such as ‘Metrum’, ‘Ordnung’, ‘metrische(n)’, and ‘Form’ (meter, order, metrical, and form); this topic can thus be called “meter”. Finally, Topic 5 appears mixed and less coherent than the others. It features music-theoretical vocabulary (‘Tonart’, ‘Accorde’, ‘Harmonie’, ‘Theorie’, ‘Musik’) as well as book-related words (‘Verfasser’, ‘Beispiele’, ‘Erklärung’). For lack of a better term, we designate it with the very general label “music theory”.

The topic distributions in Table 3 indicate a relatively strong correlation between certain texts and topics: all texts feature one or two particularly strong topics.
This might be partially due to different conventions in notation (e.g. of tones and chords) and terminology, but it also reflects some of the thematic differences discussed above (e.g. Hauptmann’s stronger metrical focus, Kunkel’s direct reference to Weitzmann, and Capellen’s reply to Riemann).

⁵ The latter are not part of the present version of the corpus.
⁶ https://pypi.org/project/tomotopy/

Figure 3: Similarities of texts and topics. Left: PCA reduction of TF-IDF vectors of the nine sources; circle size is proportional to text length. Right: PCA reduction of five topics; circle size is inversely proportional to topic coherence (less coherent topics are displayed larger).

Table 2: 15 most common words and their TF-IDF-weighted frequencies (in %) for all 9 texts and 5 topics.

Topic 1 (“chords”)  Topic 2 (“acoustics”)     Topic 3 (“tones”)      Topic 4 (“meter”)    Topic 5 (“music theory”)
Folge (1.08)        Töne (1.29)               c (2.76)               Einheit (1.28)       Verfasser (0.51)
Terz (1.08)         Intervalle (0.7)          g (2.1)                Bestimmung (1.14)    Theorie (0.49)
Quint (1.06)        Terz (0.59)               e (1.8)                Ordnung (0.8)        Tonart (0.47)
C (1.03)            reinen (0.52)             d (1.57)               Metrum (0.78)        Harmonie (0.45)
Grundton (1.01)     Musik (0.51)              a (1.44)               Bedeutung (0.76)     Musik (0.39)
Tonart (0.95)       Ton (0.48)                f (1.4)                metrischen (0.71)    Töne (0.38)
Dreiklang (0.89)    Obertöne (0.45)           h (1.06)               Glied (0.7)          Accorde (0.37)
Auflösung (0.83)    Octave (0.42)             C (0.72)               metrische (0.6)      Tonarten (0.35)
tonischen (0.82)    Helmholtz (0.4)           Klänge (0.7)           Form (0.55)          Beispiele (0.33)
Ton (0.82)          Schwingungszahlen (0.38)  Verwandtschaft (0.6)   Bestimmungen (0.54)  Lehre (0.33)
Accorde (0.76)      nämlich (0.38)            phon (0.6)             Quint (0.5)          alten (0.33)
Töne (0.75)         Quinte (0.37)             phonischen (0.54)      Folge (0.49)         lassen (0.3)
G (0.74)            Reihe (0.37)              b (0.51)               Momente (0.49)       Quinten (0.3)
Dreiklänge (0.73)   musikalischen (0.36)      Ton (0.49)             Formation (0.49)     Harmoniesystem (0.29)
Bedeutung (0.72)    Konsonanz (0.36)          fis (0.48)             lassen (0.49)        Erklärung (0.28)
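The observation that every text features one or two particularly strong topics can be checked directly against the values reported in Table 3. The following sketch copies those values and applies the 25% threshold used for highlighting in that table; the names `TOPIC_SHARES`, `TOPIC_LABELS`, and `strong_topics` are hypothetical and chosen for illustration only.

```python
# Topic shares in % per text, copied from Table 3 (columns: Topic 1 to Topic 5).
TOPIC_SHARES = {
    "HAU1853": [48.36, 4.93, 0.93, 41.65, 4.13],
    "NAU1858": [20.31, 60.12, 3.83, 2.92, 12.81],
    "WEI1860": [47.76, 3.54, 3.23, 1.23, 44.25],
    "WEI1861": [17.32, 6.33, 8.52, 3.02, 64.81],
    "KUN1863": [13.43, 10.92, 4.53, 3.32, 67.80],
    "OET1866": [11.83, 17.71, 55.53, 3.62, 11.32],
    "THU1877": [9.83, 52.24, 16.00, 8.71, 13.21],
    "RIE1905": [5.04, 58.53, 16.40, 4.42, 15.61],
    "CAP1905": [7.14, 27.79, 50.24, 2.62, 12.21],
}

# Labels assigned to the five topics in the discussion of Table 2.
TOPIC_LABELS = ["chords", "acoustics", "tones", "meter", "music theory"]


def strong_topics(shares, threshold=25.0):
    """Return, per text, the labels of all topics whose share exceeds `threshold` %."""
    return {
        text_id: [TOPIC_LABELS[i] for i, s in enumerate(row) if s > threshold]
        for text_id, row in shares.items()
    }


strong = strong_topics(TOPIC_SHARES)
for text_id, labels in strong.items():
    print(text_id, labels)
```

With this threshold, each of the nine texts is assigned either one or two topics, which is the pattern described above.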
A factor that needs to be addressed in our future analyses is the impact of text length: Hauptmann’s and Oettingen’s texts are substantially longer than the others, which is likely to skew the results. Relying on relative frequencies or employing appropriate sampling techniques might resolve this issue.

Table 3: Distributions of the 5 topics in all 9 texts of the corpus (in %). Topics accounting for more than 25% are marked with an asterisk.

ID       Topic 1  Topic 2  Topic 3  Topic 4  Topic 5
HAU1853  48.36*   4.93     0.93     41.65*   4.13
NAU1858  20.31    60.12*   3.83     2.92     12.81
WEI1860  47.76*   3.54     3.23     1.23     44.25*
WEI1861  17.32    6.33     8.52     3.02     64.81*
KUN1863  13.43    10.92    4.53     3.32     67.80*
OET1866  11.83    17.71    55.53*   3.62     11.32
THU1877  9.83     52.24*   16.00    8.71     13.21
RIE1905  5.04     58.53*   16.40    4.42     15.61
CAP1905  7.14     27.79*   50.24*   2.62     12.21

5. Future steps

5.1. Annotation
In the next project phase, the transcribed texts will be annotated. In particular, we will add labels for named entities (e.g. persons, work titles) as well as a tag set specifically devised for the genre of our texts. The labels thus obtained will constitute a minimal ontology tailored to our purpose, identifying concrete musical objects (e.g. ‘interval’, ‘chord’) as well as musical and scientific concepts. They will allow us to later analyze the music-theoretical vocabulary in more detail, and moreover permit the analysis of conceptual networks between the texts and those terms in order to draw inferences about the discourse surrounding the debate on harmonic dualism. Due to the nature of our corpus, we expect that the first set of named-entity labels will be relatively small, whereas the second set of annotations will provide a valuable resource for digital musicology. However, close reading of the sources and initial annotations already suggest that the texts contain sufficient references to names of authors and composers and to titles of works (texts or compositions) to be used for network analysis.

5.2. Music encoding
Our sources contain more than four hundred music examples, either specifically constructed for demonstration or taken from a piece of music as illustration. They were structurally marked up in the segmentation phase, enabling automated extraction. Although Optical Music Recognition (OMR) has improved in recent years, it still faces major challenges and currently lacks the reliability of state-of-the-art OCR frameworks [3, 32, 34]. We plan to manually transcribe the examples to the **kern format, and convert them to the XML format of the Music Encoding Initiative (MEI) [9] as well as to SVG using the engraving software Verovio [29]. Figure 4 (left) shows an example cadence taken from [4]. Other, more complex examples involve annotations or graphical elements, which render symbolic music encoding difficult. We will thus transcribe only purely musical examples. We hope that the resulting pairs of scanned and rendered score examples may serve as a ground-truth data set to aid the further improvement of OMR.

5.3. Presentation
The scope of the project, a list of relevant sources, and the project team are presented on our GitHub page (https://dcmlab.github.io/ddd/). The corpus itself will be included on the website using TEI Publisher,⁷ which allows researchers to access the text and search for phrases. We are currently working on integrating the Verovio viewer into this framework in order to show the rendered scores (see Figure 4) instead of the scanned images, which will also enable users to play the musical examples. In future stages of the project, this website will also feature a summary of our main results and links to the relevant publications.

⁷ https://teipublisher.com/

Figure 4: Score example from CAP1905 [4, p. 80, Fig. 4]. Left: Image extracted from Transkribus. Right: Modern rendering of **kern transcription with Verovio [29].

6. Conclusion
Drawing on a corpus of 19th-century German music theory texts, our project “Digitizing the Dualism Debate: a case study in the computational analysis of historical music theory sources” makes equal use of computational distant-reading and manual close-reading techniques [27], thus falling under a mixed-methods research paradigm [19]. By providing a machine-readable corpus of historical music theory texts and symbolic encodings of music, which enable computational analyses such as topic modeling, we hope that this project may serve as a proof of concept upon which similar subsequent research projects in digital musicology and music theory can build.

Acknowledgments
This research is funded by the EPFL-UNIL funding scheme CROSS - Collaborative Research on Science and Society within the project “Digitizing the Dualism Debate: A Case Study in the Computational Analysis of Historical Music Sources”.

References
[1] D. M. Blei, A. Y. Ng, and M. I. Jordan. “Latent Dirichlet Allocation”. In: Journal of Machine Learning Research 3 (2003), pp. 993–1022.
[2] D. M. Blei. “Topic Modeling and Digital Humanities”. In: Journal of Digital Humanities 2.1 (2012), pp. 8–11.
[3] J. Calvo-Zaragoza, J. Hajič Jr., and A. Pacha. “Understanding Optical Music Recognition”. In: arXiv:1908.03608 [cs, eess] (2019). arXiv: 1908.03608 [cs, eess].
[4] G. Capellen. Die Zukunft der Musiktheorie (Dualismus oder “Monismus”?) und ihre Einwirkung auf die Praxis. Leipzig: C. F. Kahnt, 1905.
[5] S. Clark. “Seduced by Notation: Oettingen’s Topography of the Major-Minor System”. In: Music Theory and Natural Order from the Renaissance to the Early Twentieth Century. Ed. by S. Clark and A. Rehding. Cambridge: Cambridge University Press, 1999, pp. 161–181.
[6] H. Danuser. “Von unten und von oben? Hugo Riemanns reflexive Theorie in der Moderne”.
In: Zeitschrift der Gesellschaft für Musiktheorie [Journal of the German-speaking Society of Music Theory] 7.Sonderausgabe [Special Issue] (2010), pp. 99–116. doi: 10.31751/564.
[7] A. Doucet, M. Gasteiner, M. Granroth-Wilding, M. Kaiser, M. Kaukonen, R. Labahn, J.-P. Moreux, G. Muehlberger, E. Pfanzelter, M.-E. Therenty, H. Toivonen, and M. Tolonen. “NewsEye: A Digital Investigator for Historical Newspapers”. In: Digital Humanities 2020, DH 2020, Book of Abstracts. Alliance of Digital Humanities Organizations (ADHO), 2020, pp. 1–3. doi: 10.5281/zenodo.3895269. url: https://doi.org/10.5281/zenodo.3895269.
[8] D. Haller. “Negative Harmony: The Shadow of Harmonic Polarity on Contemporary Composition Techniques”. MA Thesis. Belmont University, 2020.
[9] A. Hankinson, P. Roland, and I. Fujinaga. “The Music Encoding Initiative as a Document-Encoding Framework”. In: 12th International Society for Music Information Retrieval Conference (ISMIR 2011). 2011, pp. 293–298.
[10] D. Harrison. Harmonic Function in Chromatic Music: A Renewed Dualist Theory and an Account of its Precedents. Chicago and London: University of Chicago Press, 1994.
[11] M. Hauptmann. Die Natur der Harmonik und der Metrik. Leipzig: Breitkopf und Härtel, 1853.
[12] L. Holtmeier. “The Reception of Hugo Riemann’s Music Theory”. In: The Oxford Handbook of Neo-Riemannian Theories. Ed. by E. Gollin and A. Rehding. Oxford: Oxford University Press, 2011, pp. 3–54.
[13] N. M. Ide and C. M. Sperberg-McQueen. “The TEI: History, Goals, and Future”. In: Computers and the Humanities 29.1 (1995), pp. 5–15. doi: 10.1007/bf01830313.
[14] I. T. Jolliffe. Principal Component Analysis. 2nd ed. Springer Series in Statistics. New York: Springer-Verlag, 2002. doi: 10.1007/b98835.
[15] D. Jorgenson. “A Résumé of Harmonic Dualism”. In: Music & Letters 44.1 (1963), pp. 31–42.
[16] H. Klumpenhouwer. “Dualist Tonal Space and Transformation in Nineteenth-Century Musical Thought”. In: The Cambridge History of Western Music Theory. Ed. by T. Christensen. Cambridge, UK: Cambridge University Press, 2002, pp. 456–476.
[17] H. Klumpenhouwer. “Harmonic Dualism as Historical and Structural Imperative”. In: The Oxford Handbook of Neo-Riemannian Theories. Ed. by E. Gollin and A. Rehding. Oxford: Oxford University Press, 2011, pp. 192–217.
[18] F. J. Kunkel. Kritische Beleuchtung des C. F. Weitzmann’schen Harmoniesystems. Frankfurt a. M.: Franz Benjamin Auffahrth, 1863.
[19] N. L. Leech and A. J. Onwuegbuzie. “A Typology of Mixed Methods Research Designs”. In: Quality & Quantity 43.2 (2009), pp. 265–275. doi: 10.1007/s11135-007-9105-3.
[20] F. C. Moss. “‘Theorie der Tonfelder’ nach Simon und ‘Neo-Riemannian Theory’: Systematik, historische Bezüge und analytische Praxis im Vergleich”. MA Thesis. Hochschule für Musik und Tanz Köln, 2012. doi: 10.5281/zenodo.4748512.
[21] F. C. Moss. “Transitions of Tonality: A Model-Based Corpus Study”. Doctoral Dissertation. Lausanne, Switzerland: École Polytechnique Fédérale de Lausanne, 2019. doi: 10.5075/epfl-thesis-9808.
[22] G. Muehlberger, L. Seaward, M. Terras, S. Ares Oliveira, V. Bosch, M. Bryan, S. Colutto, H. Déjean, M. Diem, S. Fiel, B. Gatos, A. Greinoecker, T. Grüning, G. Hackl, V. Haukkovaara, G. Heyer, L. Hirvonen, T. Hodel, M. Jokinen, P. Kahle, M. Kallio, F. Kaplan, F. Kleber, R. Labahn, E. M. Lang, S. Laube, G. Leifert, G. Louloudis, R. McNicholl, J.-L. Meunier, J. Michael, E. Mühlbauer, N. Philipp, I. Pratikakis, J. Puigcerver Pérez, H. Putz, G. Retsinas, V. Romero, R. Sablatnig, J. A. Sánchez, P. Schofield, G. Sfikas, C. Sieber, N. Stamatopoulos, T. Strauß, T. Terbul, A. H. Toselli, B. Ulreich, M. Villegas, E. Vidal, J. Walcher, M. Weidemann, H. Wurster, and K. Zagoris. “Transforming Scholarship in the Archives through Handwritten Text Recognition: Transkribus as a Case Study”. In: Journal of Documentation 75.5 (2019), pp. 954–976. doi: 10.1108/jd-07-2018-0114.
[23] E. Naumann. Über die verschiedenen Bestimmungen der Tonverhältnisse und die Bedeutung des Pythagoreischen oder reinen Quinten-Systems. Leipzig, 1858.
[24] M. Neuwirth and M. Rohrmeier. “Wie wissenschaftlich muss Musiktheorie sein? Chancen und Herausforderungen musikalischer Korpusforschung”. In: Zeitschrift der Gesellschaft für Musiktheorie [Journal of the German-Speaking Society of Music Theory] 13.2 (2016), pp. 171–193. doi: 10.31751/915.
[25] A. v. Oettingen. Harmoniesystem in dualer Entwicklung. Dorpat und Leipzig: W. Gläser Verlag, 1866.
[26] O. Ortmann. “The Fallacy of Harmonic Dualism”. In: The Musical Quarterly 10.3 (1924), pp. 369–383.
[27] M. Piotrowski and M. Neuwirth. “Prospects for Computational Hermeneutics”. In: Proceedings of the 9th Annual Conference of the AIUCD. Ed. by C. Marras and M. Passarotti. Milan: Associazione per l’Informatica Umanistica e la Cultura Digitale, 2020, pp. 204–209.
[28] A. Piper. Enumerations: Data and Literary Study. Chicago and London: University of Chicago Press, 2018.
[29] L. Pugin, R. Zitellini, and P. Roland. “Verovio: A Library for Engraving MEI Music Notation into SVG”. In: Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014). Taipei, Taiwan, 2014, pp. 107–112. doi: 10.5281/zenodo.1417589.
[30] A. Rehding. Hugo Riemann and the Birth of Modern Musical Thought. Cambridge, UK: Cambridge University Press, 2003.
[31] H. Riemann. Das Problem des harmonischen Dualismus: Ein Beitrag zur Ästhetik der Musik. Leipzig: C. F. Kahnt, 1905.
[32] A. Ríos-Vila, J. Calvo-Zaragoza, and D. Rizo. “Evaluating Simultaneous Recognition and Encoding for Optical Music Recognition”. In: 7th International Conference on Digital Libraries for Musicology. DLfM 2020. New York, NY, USA: Association for Computing Machinery, 2020, pp. 10–17. doi: 10.1145/3424911.3425512.
[33] G. Salton and C. Buckley. “Term-Weighting Approaches in Automatic Text Retrieval”. In: Information Processing & Management 24.5 (1988), pp. 513–523. doi: 10.1016/0306-4573(88)90021-0.
[34] E. Shatri and G. Fazekas. “Optical Music Recognition: State of the Art and Major Challenges”. In: arXiv:2006.07885 [cs, eess] (2020). arXiv: 2006.07885.
[35] J. L. Snyder. “Harmonic Dualism and the Origin of the Minor Triad”. In: Indiana Theory Review 4.1 (1980), pp. 45–78.
[36] K. Stevens, P. Kegelmeyer, D. Andrzejewski, and D. Buttler. “Exploring Topic Coherence over many Models and many Topics”. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Jeju Island, Korea, 2012, pp. 952–961.
[37] D. Tan. “‘Dynamic Dualism’: Kurth and Riemann on Music Theory and the Mind”. In: Music Theory Spectrum 42.1 (2020), pp. 105–121. doi: 10.1093/mts/mtz017.
[38] A. Thürlings. Die beiden Tongeschlechter und die neuere musikalische Theorie. Berlin: Leo Liepmannssohn, 1877.
[39] C. F. Weitzmann. Die Neue Harmonielehre im Streit mit der alten. Leipzig: C. F. Kahnt, 1861.
[40] C. F. Weitzmann. Harmoniesystem. Leipzig: C. F. Kahnt, 1860.