The Music Note Ontology

Andrea Poltronieri1 [0000-0003-3848-7574] and Aldo Gangemi2 [0000-0001-5568-2684]

1 LILEC, University of Bologna, Bologna, Italy
2 FICLIT, University of Bologna, Bologna, Italy

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 101004746. Corresponding author: andrea.poltronieri2@unibo.it

Abstract. In this paper we propose the Music Note Ontology, an ontology for modelling music notes and their realisation. The ontology addresses the relation between a note represented in a symbolic representation system and its realisation, i.e. a musical performance. This work therefore aims to solve the modelling and representation issues that arise when analysing the relationships between abstract symbolic features and the corresponding physical features of an audio signal. The ontology is composed of three different Ontology Design Patterns (ODPs), which model the structure of the score (Score Part Pattern), the note in the symbolic notation (Music Note Pattern) and its realisation (Musical Object Pattern).

Keywords: Ontology Design Patterns · Computational Musicology · Computer Music · Music Information Retrieval.

1 Introduction

A music note is defined as "the marks or signs by which music is put on paper. Hence the word is used for the sounds represented by the notes" [10]. However, musical notation provides only general indications on how to play a certain note. These indications are then enriched in the context of a performance by a large amount of information deriving, for example, from the musician's sensitivity or the conductor's instructions. Historically, music notation has been a crucial innovation that allowed the study of music and hence the rise of musicology. Music scores were initially introduced primarily to record music, giving other musicians the possibility of playing, and thus reproducing, the same piece. However, musical notation by definition entails expressive constraints on the description of musical content.

Representing music involves a number of issues that are closely related to the complexity of the musical content. These issues arise in every system of music representation, from music scores to their various forms of digital encoding, usually referred to as symbolic representation systems. As Cook observed, music scores symbolise rather than represent music: people do not play musical rhythms as written, and often they do not play the pitches as written, because the notation is only an approximation [7]. To underline this concept, it is useful to draw a distinction between a score and a musical object [26]. While the former can be thought of as a set of instructions that the musician or a computer system uses to realise a piece of music, a musical object consists of the result of this realisation process. In other words, musical notation precedes realisation or interpretation and defines the musical object only partially [18].

Music notation is not just a mechanical transformation of performance information. Performance nuances are lost when moving from performance to notation, and symbolic structure is partly lost in the translation from notation to performance [8]. Moreover, in music notation some features of musical content are completely overlooked.
For example, timbre information is generally ignored and is at best provided through generic information about the instrument that is supposed to play the part. Many of these problems have not been overcome by the numerous systems of symbolic representation proposed in recent decades. Music representation systems have been evaluated by Wiggins et al. [26] using a framework based on two orthogonal dimensions, named expressive completeness and structural generality. Expressive completeness is described as the extent to which the original musical content may be retrieved and/or recreated from its representation [25]. Structural generality, instead, refers to the range of high-level structures that can be represented and manipulated [26]. For example, raw audio is very expressive, since it contains a great deal of information about a certain music performance; a MIDI representation contains much less information, e.g. timbral features are not represented in this format. Conversely, when evaluating structural generality, raw audio performs poorly, owing to the difficulty of extracting structured information (e.g. tempo, chords, notes) from such a format.

Another problem in representing music is linked to the twofold nature of musical content, which contains information that can be reduced to mathematical functions, but also information related to the emotional spectrum and a wide range of psychoacoustic nuances. In fact, music is distinguished by the presence of many relationships that can be treated mathematically, including rhythm and harmony. However, elements such as tension, expectancy, and emotion are less prone to mathematical treatment [8].

The limits of symbolic representations are also accentuated when the performance is considered in relation to the concept of interpretation. Each composition may correspond to several performances, each of which constitutes a different interpretation of the musical piece. These interpretations can vary greatly from one another from the agogic (note accentuation [3]), timbral and dynamic viewpoints.

Therefore, this work aims to represent musical notes both symbolically and as a realisation of the note itself. The alignment between these two representation systems can be relevant on several grounds. For example, it can allow musical analysis using a structured representation (i.e. a symbolic representation) while simultaneously taking into account all the information that is only contained in the signal (e.g. timbre information). This would allow a different level of analysis, taking into account information that is usually overlooked in music scores and symbolic representations. On the other hand, this type of representation would allow structured analysis of the music signal, by having the signal information encoded in close relation to the music score and the multiple hierarchies that the score entails. Furthermore, this type of representation allows the analysis of different realisations of the same note, also providing information on how the same score is performed differently across performances.

2 Physical Features and Perceptual Features

The features that need to be taken into account when representing music can be reduced to four main categories: tempo, pitch, timbre and dynamics. All of these concepts, however, refer to music perception; symbolic representations only capture abstractions of these features.
In contrast, a musical performance, as analysed in recorded form (i.e. a signal representation), contains information that refers to physical characteristics. Through the analysis of these characteristics it is in turn possible to abstract the perceptual characteristics aroused by a particular sound. However, this work does not aim to analyse the perceptual aspects of music, but rather to formalise a representation that expresses a musical note from the point of view of both the score and the musical object.

As far as time is concerned, the main distinction is that between the quantised time of a symbolic representation and real time, which is expressed in seconds. A symbolic representation describes temporal information using predetermined durations for each note (e.g. quarter notes, eighth notes, sixteenth notes). These durations are performed in relation to a time signature and a tempo, the latter indicated either by a tempo marking (e.g. slow, moderate, allegro, vivace) or by a metronome mark, which indicates the tempo measured in beats per minute (bpm). Generally, symbolic notations (e.g. MusicXML) use the same expedients as classical musical notation. However, some representation systems rely on real time, such as the MIDI standard [2].

When considering frequency, the representation issue is more challenging. A sound, e.g. a musical note, is usually associated with a pitch. However, although pitch is associated with frequency [16], it is a psychological percept that does not map straightforwardly onto physical properties of sound [15]. The American National Standards Institute (ANSI) defines pitch as the "auditory attribute of sound according to which sounds can be ordered on a scale from low to high" [11]. In fact, in many languages pitch is described using terms with a spatial connotation, such as "high" and "low" [21]. However, the spatial arrangement of sounds has been shown to be influenced by the listener's musical training [23] and their ethnic group [1]. The relation between pitch and frequency is evident in the case of pure tones [17]. For example, a sinusoid having a frequency of 440 Hz corresponds to the pitch A4. However, real sounds are not composed of a single pure tone with a unique, well-defined frequency. Playing a single note on an instrument may result in a complex sound that contains a mixture of different frequencies changing over time. Intuitively, such a musical tone can be described as a superposition of pure tones or sinusoids, each with its own frequency of vibration, amplitude, and phase. A partial is any of the sinusoids by which a musical tone is described [17]. The frequency of the lowest partial is defined as the fundamental frequency. Most instruments produce harmonic sounds, i.e. sounds whose partials lie close to integer multiples of the fundamental frequency. However, some instruments such as xylophones, bells, and gongs produce inharmonic sounds, i.e. sounds composed of frequencies that are not multiples of the fundamental. As a result, the pitch perception of these sounds is distorted for a listener who is unfamiliar with these instruments [15]. Several approaches have been proposed to determine the perceived pitch of these sounds, based both on the periodicity of the frequencies that compose the sound [22] and on spectral cues [5]. However, recent research suggests that pitch perception involves learning to recognise a sound timbre over a range of frequencies, and then associating changes in frequency with visuospatial and kinesthetic dimensions of an instrument [15].
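To make the pitch/frequency abstraction discussed here concrete, the correspondence between a fundamental frequency f and a symbolic pitch number p under twelve-tone equal temperament (the convention also underlying MIDI pitch numbers, and a standard relation rather than something defined by this ontology) can be written as

  p(f) = 69 + 12 \log_2\!\left(\frac{f}{440\,\mathrm{Hz}}\right), \qquad f(p) = 440 \cdot 2^{(p-69)/12}\ \mathrm{Hz},

so that f = 440 Hz yields p = 69 (A4), while p = 60 (middle C) corresponds to f ≈ 261.63 Hz. A real instrument tone, however, carries the fundamental together with its partials, which is precisely what makes mapping a measured signal back to a single symbolic pitch non-trivial.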
On the other hand, timbre is a sound property that is even more difficult to grasp. ANSI defines timbre as the "attribute of auditory sensation, in terms of which a subject can judge that two sounds similarly presented and having the same loudness and pitch are dissimilar" [11]. However, even today the definition of timbre is rather debated. For example, it has been defined as a "misleadingly simple and vague word encompassing a very complex set of auditory attributes, as well as a plethora of psychological and musical issues" [14]. The main problem in defining timbre is that there is no precise correspondence with one or more physical aspects. Timbre is also challenging to abstract, as it is usually described using adjectives such as light, dark, bright, rough, violin-like, etc. Numerous studies have shown that timbre depends on a large number of factors and physical characteristics. Some research has tried to represent the perception of timbre using multidimensional scaling (e.g. [9]). Other studies in this area focus on the grouping of timbre into a "Timbre Space" [14]. While those studies represented timbre in terms of perceptual dimensions, others have represented timbre in terms of physical dimensions, such as the set of parameters needed for a particular synthesis algorithm [8]. These physical characteristics concern the time axis (e.g. attack time, vibrato and amplitude variation), spectral features (e.g. the spectral centroid), and harmonic features (e.g. the odd-to-even ratio and inharmonicity). An approximation of these features is provided by a model called ADSR, which consists of the analysis of the envelope of the sound wave and the measurement of attack (A), decay (D), sustain (S) and release (R). The envelope can be defined as a smooth curve outlining the extremes in the amplitude of a waveform. The ADSR model, however, is a strong simplification and only yields a meaningful approximation for amplitude envelopes of tones generated by specific instruments [17].

The same distinction drawn between frequency and pitch can be made for loudness and sound intensity. Loudness is a perceptual property whereby sounds are ordered on a scale from quiet to loud; as a perceptual property, it is by its nature subjective. Sound intensity, instead, is generally measured in dB and expresses the sound power (expressed in watts) in relation to the physical space in which the sound propagates (expressed in square metres) [4]; it is closely related to loudness. However, loudness also depends on other sound characteristics such as duration or frequency [17]. Indeed, sound duration influences perception, since the human auditory system averages the effect of sound intensity over an interval of up to a second.
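For reference, the sound intensity level in decibels is defined with respect to a reference intensity at the threshold of hearing (a textbook acoustics definition, not specific to this paper):

  L_I = 10 \log_{10}\!\left(\frac{I}{I_0}\right)\ \mathrm{dB}, \qquad I_0 = 10^{-12}\ \mathrm{W/m^2}.

Loudness, by contrast, has no comparable closed-form physical definition, which is one reason why the perceptual side is only touched upon informally here.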
Formalizing the aforementioned perceptual concepts is beyond the scope of this paper. Instead, the purpose of this contribution is to define (some of) the attributes that can provide helpful information to describe sound material. The importance of this analysis is to highlight the inner complexity of the sound material, underlining the connections that can be drawn between physical aspects and perceptual components, as well as the relationships among these aspects.

3 Related Work

Several Semantic Web ontologies have been proposed for modelling musical data. The Music Theory Ontology (MTO) [19] aims at modelling music theoretical concepts. This ontology focuses on the rudiments that are necessary to understand and analyse music. However, it is limited to the description of note attributes (dynamics, duration, articulation, etc.) at the level of detail of a note set. This peculiarity considerably reduces the expressiveness of the model, especially in relation to polyphonic and multi-voice music.

The Music Score Ontology (Music OWL) [12] aims to represent concepts similar to those described in the Music Theory Ontology. However, it focuses on music notation, and the classes it is composed of are all aimed at representing music in a notated form, hence related to music sheets.

The Music Notation Ontology [6] is an ontology of music notation content, focused on the core "semantic" information present in a score and subject to the analytic process. Moreover, it models some high-level analytic concepts (e.g. dissonances) and associates them with fragments of the score.

However, all these ontologies only consider the characteristics of the musical material from the point of view of either the symbolic representation or the music score. The aim of this paper is instead to propose a model that is as general as possible and that can therefore represent musical notation expressed in different formats, considering scores and symbolic notations at the same time. This would allow interoperability between different notation formats and the standardisation of musical notation in a format that aims to be as expressive as possible.

Moreover, to the best of our knowledge there is no work that relates the musical note (understood as part of the symbolic notation) to the corresponding musical object (understood as the realisation of a musical note). In addition, none of the proposed ontologies seems to model the physical information related to the audio signal.

Therefore, this paper proposes an ontology that can represent a note both from a symbolic and from a performance point of view. To do this, it is necessary to take into account both the symbolic characteristics (the signs and attributes typical of musical notation) and the physical characteristics of the note as reproduced by a musical instrument. However, this work is not intended to go into the perceptual characteristics of music.

The problem of modelling information with respect to its realisation has already been analysed in the past. For example, the Ontology Design Pattern (ODP) Information Realization3 (extracted from the DOLCE+DnS Ultralite Ontology4 [13]) helps ontology designers to model this type of scenario. This pattern allows designers to model information objects and their realisations, making it possible to reason about physical objects and the information they realise while keeping the two distinguished.

3 http://www.ontologydesignpatterns.org/cp/owl/informationrealization.owl
4 http://ontologydesignpatterns.org/wiki/Ontology:DOLCE+DnS_Ultralite

The same kind of relationship can be found in the FRBR5 [24] conceptual entity-relationship model, which groups entities into four different levels (work, expression, manifestation and item). The relationship between expression, defined as the "specific intellectual or artistic form that a work takes each time it is realized", and manifestation, defined as "the physical embodiment of an expression of a work", can be traced back to the modelling problem discussed in this paper.

5 https://www.ifla.org/publications/functional-requirements-for-bibliographic-records
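As a minimal illustration of how the Information Realization pattern keeps the two levels apart, the following Turtle sketch describes a score and one recording that realises it. The class and property names are those commonly published with the pattern (InformationObject, InformationRealization, realizes); the musical individuals and the ex: namespace are invented for illustration.

  @prefix ir: <http://www.ontologydesignpatterns.org/cp/owl/informationrealization.owl#> .
  @prefix ex: <http://example.org/music/> .

  # A score (information object) and a recorded performance that realises it.
  ex:moonlightSonataScore     a ir:InformationObject .
  ex:moonlightSonataRecording a ir:InformationRealization ;
      ir:realizes ex:moonlightSonataScore .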
However, these examples do not solve the modelling problems that have been advanced in the previous sections. In particular, the relationship between the approximate features represented by the symbolic notation and their realisation remains a challenge in the field of ontology design. To address this challenge, this research proposes a new ontology to represent symbolic musical notation and its realisation, considering the relationships between the abstraction of musical features expressed in the symbolic notation and the physical characteristics of the audio signal. Furthermore, this paper adopts a pattern-based approach, proposing an ontology composed of modular and reusable elements for the representation of both audio and symbolic musical content.

4 The Music Note Ontology

The Music Note Ontology6 (see Figure 1) models a musical note both as a constituent element of the score and as a musical object, i.e. as a realisation of the event represented by a score or symbolic notation. To do this, the proposed ontology represents both the elements and the typical hierarchies of a score, describing all its attributes. In addition, the ontology models the realisation of a note, describing its physical characteristics in a given performance.

6 The OWL file of the ontology can be found at: https://purl.org/andreapoltronieri/notepattern

The proposed ontology is composed of three distinct Ontology Design Patterns, which are presented in this paper and have been submitted to the ODP portal at the following links:

– The Score Part Pattern: http://ontologydesignpatterns.org/wiki/Submissions:Scorepart
– The Musical Object Pattern: http://ontologydesignpatterns.org/wiki/Submissions:Musicalobject
– The Music Note Pattern: http://ontologydesignpatterns.org/wiki/Submissions:Notepattern

Table 1. List of competency questions for the Music Note Ontology.

ID   Competency Question
CQ1  What is the name of a note?
CQ2  What part of the score does a note belong to?
CQ3  What are the dynamic indications referring to a note in the score?
CQ4  What is the fundamental frequency of a note?
CQ5  What are the different frequencies that make up the spectrum of a note?
CQ6  What is the duration in seconds of a note, in a given performance?
CQ7  How is the envelope of a note shaped?

The note as represented in the score is described by the Music Note Pattern (in green in the figure), which in turn imports the Score Part Pattern (in orange in the figure) and the Musical Object Pattern (in yellow in the figure). The former import describes the relationships between the different components of a score, while the latter models the musical object from the point of view of its physical features.
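In OWL terms, this modular structure corresponds to import declarations along the following lines (a sketch based on the purl IRIs given in the footnotes; the actual ontology headers in the published OWL files may differ):

  @prefix owl: <http://www.w3.org/2002/07/owl#> .

  # The Music Note Pattern composes the other two ODPs via owl:imports.
  <https://purl.org/andreapoltronieri/notepattern> a owl:Ontology ;
      owl:imports <https://purl.org/andreapoltronieri/scorepart> ,
                  <https://purl.org/andreapoltronieri/musicalobject> .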
A selection of competency questions that the ontology must be able to answer is listed in Table 1; the complete list of competency questions is available in the repository of the project7. Some of the classes and properties of the ontology have also been aligned with currently available ontologies, and the alignments are available in a separate file8. Alignments with other ontologies were mainly made on the Score Part Pattern and the Music Note Pattern. Although some of the available ontologies (see Section 3) define some of the classes and properties present in these patterns, we needed to re-model the score and symbolic elements. In fact, the objective of this work is, among others, to abstract from a specific representation system, combining the attributes of the musical score with other elements of the MIDI representation (as the most widely used symbolic representation), and, to the best of our knowledge, there are no ontologies available with these characteristics.

7 The project repository is available at: https://github.com/andreamust/music_note_pattern/
8 Alignments to the Music Note Ontology are available at the URI: purl.org/andreapoltronieri/notepattern-aligns

Fig. 1. The Music Note Ontology in Graffoo notation. Colours define the different ODPs that constitute it.

4.1 The Score Part Pattern

The hierarchy (i.e. the mereological structure) of a music score is described by the Score Part Pattern9. The Part class describes the instrumental part of a score, which refers to a specific staff of the musical notation that is associated with an instrument. The instrument playing the part is modelled by the class Instrument, which is associated by means of a datatype property with the MIDI program assigned to that instrument (hasMidiProgram). Each part of the score is also divided into sections (class Section), homogeneous music fragments (class HomogeneousFragment) and voices (class Voice). A voice is defined as one of several parts that can be performed within the same staff (e.g. alto and soprano). A section is instead a part of the music score defined by the repetition signs. A fragment, on the other hand, is a grouping of notes that share the same metric, tempo and clef, described by the object properties hasMetric, hasTempo and hasClef.

9 Score Part Pattern URI: https://purl.org/andreapoltronieri/scorepart

4.2 The Musical Object Pattern

The Musical Object Pattern10 models the physical features of a musical note object, i.e. the execution of a note. Specifically, this pattern models the physical characteristics that can be extracted from the sound wave produced by an instrument playing a musical note. The MusicalObject class is connected to four classes that describe these physical characteristics, namely duration, sound intensity, frequency and envelope. The MusicObjectDuration class expresses the duration in seconds of the musical object, by means of the object property hasDurationInSeconds. In the same way, the sound intensity is modelled via the SoundIntensity class. Frequency is modelled by means of the class Frequency and its subclasses FundamentalFrequency and PartialFrequency; for each expressed frequency, the magnitude of the frequency is also indicated using the hasFrequencyMagnitude datatype property. Finally, the Envelope class is connected to four datatype properties that describe the envelope of the waveform according to the ADSR model, namely hasAttack, hasSustain, hasDecay and hasRelease.

10 Musical Object Pattern URI: https://purl.org/andreapoltronieri/musicalobject
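To illustrate how these classes fit together, the following Turtle sketch describes a single performed note, roughly an A4 lasting about half a second. The class names and the datatype properties (hasDurationInSeconds, hasFrequencyMagnitude, hasAttack, hasDecay, hasSustain, hasRelease) are those introduced above; the linking properties (hasDuration, hasSoundIntensity, hasFrequency, hasEnvelope), the hasFrequencyValue property, the exp: namespace and all literal values are hypothetical and only meant to convey the shape of the pattern.

  @prefix mo:  <https://purl.org/andreapoltronieri/musicalobject#> .
  @prefix exp: <http://example.org/performance/> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  # One performed note, described through its physical features.
  exp:noteObject1 a mo:MusicalObject ;
      mo:hasDuration       exp:dur1 ;
      mo:hasSoundIntensity exp:int1 ;
      mo:hasFrequency      exp:f0 , exp:partial1 ;
      mo:hasEnvelope       exp:env1 .

  exp:dur1 a mo:MusicObjectDuration ;
      mo:hasDurationInSeconds "0.52"^^xsd:decimal .

  exp:int1 a mo:SoundIntensity .                 # intensity value property not named in the paper

  exp:f0 a mo:FundamentalFrequency ;
      mo:hasFrequencyValue     "440.0"^^xsd:float ;
      mo:hasFrequencyMagnitude "0.80"^^xsd:float .

  exp:partial1 a mo:PartialFrequency ;
      mo:hasFrequencyValue     "880.0"^^xsd:float ;
      mo:hasFrequencyMagnitude "0.35"^^xsd:float .

  exp:env1 a mo:Envelope ;                       # ADSR description of the waveform envelope
      mo:hasAttack  "0.02"^^xsd:decimal ;
      mo:hasDecay   "0.10"^^xsd:decimal ;
      mo:hasSustain "0.65"^^xsd:decimal ;
      mo:hasRelease "0.20"^^xsd:decimal .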
4.3 The Music Note Pattern

The Music Note Pattern11 models the symbolic note (as represented in a score or in most symbolic representation systems) and its attributes. Moreover, this ODP aims to abstract over different music representation systems, which means that music originally encoded in other formats can also be represented with this pattern. To do so, the pattern describes each symbolic note both by means of standard music score notation and by using the MIDI reference for each of the note's attributes. This would allow conversion between different representation formats.

11 Music Note Pattern URI: https://purl.org/andreapoltronieri/notepattern

SymbolicNote is the central class, representing the note as written in the score. This class has seven subclasses, one for each of the seven notes of the tempered system (primitive notes), plus 35 subclasses defined as combinations of primitives and accidentals. The class Accidental (which in turn has five subclasses: Flat, Sharp, Natural, DoubleFlat, and DoubleSharp) allows the representation of altered notes. This makes it possible to represent every note of the Western notation system, taking into account its function with respect to the tonality of the piece (e.g. differentiating between the notes A# and Bb). Notice that such differentiation has an important cognitive function for the music reader (distinguishing enharmonics according to the tonality), a function that is mostly irrelevant for a computer reading a MIDI file, but very relevant for an AI system learning to generate music from notation.

Furthermore, the SymbolicNote class is linked to several classes and data properties describing the note's attributes. The class Position defines the position of the note both in relation to the score (datatype property hasToMeasure) and within the measure (datatype property hasPositionInMeasure). The class NoteDuration describes the symbolic duration of the note, as expressed in the music score. The NoteDynamic class, instead, describes the dynamics of the note with reference both to musical notation (datatype property hasLiteralDynamic, e.g. crescendo, vibrato, etc.) and to the MIDI standard (datatype property hasMidiVelocity). Similarly, the NotePitch class defines both the octave in which the note is located (datatype property isDefinedByOctave) and the MIDI pitch (datatype property hasMidiPitch).

In addition, the SymbolicNote class is linked to the other two patterns that make up the Music Note Ontology: a SymbolicNote belongs to a Section and to a Voice, while it has its realisation in a MusicalObject. Finally, the relationships between the attributes of the SymbolicNote and those of the MusicalObject are expressed. The NotePitch of a symbolic note is represented as the abstraction of the musical object's frequency (expressed by the object property hasFrequencyAbstraction), while NoteDynamic and NoteDuration are described as the abstractions of SoundIntensity and MusicObjectDuration, respectively (object properties hasDynamicAbstraction and hasDurationAbstraction).

In order to test the ontology, a Knowledge Base (KB)12 containing a single note and a single musical object was created. SPARQL queries were also created for each of the competency questions defined in Table 1. The KB can be tested against the SPARQL queries in order to verify the expressiveness of the ontology with respect to the competency questions. The complete list of SPARQL queries is available in the repository of the project.

12 https://purl.org/andreapoltronieri/notepatterndata

The ontology can also be used to enrich the description of music content that is already annotated using other ontological models. For example, the classes and properties describing the structure of musical notation can be aligned to the widely used Music Ontology [20]. Similarly, the same classes can be aligned with the aforementioned Music OWL and Music Score Ontology.
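Putting the three patterns together, the single-note test KB mentioned above might look roughly as follows; the sketch reuses the musical object from the previous listing. Class names and the datatype properties follow the text above, while the np:/sp:/ex: namespaces, the subclass name ASharp, the linking properties (hasPosition, hasNoteDuration, hasNoteDynamic, hasNotePitch, isInVoice, isRealisedBy), the direction of hasFrequencyAbstraction and all literal values are assumptions made for illustration and should be checked against the published ontology and KB.

  @prefix np:  <https://purl.org/andreapoltronieri/notepattern#> .
  @prefix sp:  <https://purl.org/andreapoltronieri/scorepart#> .
  @prefix ex:  <http://example.org/score/> .
  @prefix exp: <http://example.org/performance/> .
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

  # An A-sharp in the fourth octave, placed in a voice of the score and realised
  # by the musical object sketched in the previous section.
  ex:note1 a np:ASharp ;                          # one of the 42 SymbolicNote subclasses
      np:hasPosition     ex:pos1 ;
      np:hasNoteDuration ex:noteDur1 ;
      np:hasNoteDynamic  ex:dyn1 ;
      np:hasNotePitch    ex:pitch1 ;
      np:isInVoice       ex:voice1 ;
      np:isRealisedBy    exp:noteObject1 .

  ex:pos1 a np:Position ;
      np:hasPositionInMeasure "3"^^xsd:int .

  ex:noteDur1 a np:NoteDuration .                 # symbolic duration, e.g. a quarter note

  ex:dyn1 a np:NoteDynamic ;
      np:hasLiteralDynamic "forte" ;
      np:hasMidiVelocity   "92"^^xsd:int .

  ex:pitch1 a np:NotePitch ;
      np:isDefinedByOctave "4"^^xsd:int ;
      np:hasMidiPitch      "70"^^xsd:int ;        # MIDI 70 corresponds to A#4/Bb4
      np:hasFrequencyAbstraction exp:f0 .         # abstraction link to the realised frequency

  ex:voice1 a sp:Voice .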
5 Conclusions and Future Perspectives

In this paper we have proposed the Music Note Ontology, an ontology for modelling musical content both from a symbolic point of view and in terms of its realisation. To do this, three different Ontology Design Patterns have been designed to model the internal relations of the score, the note from the symbolic point of view, and the note as a musical object. The relationships between the different levels of representation have also been considered, in particular between the symbolic abstraction and the features that can be extracted from the audio signal of a performance.

In our future work we aim at modelling the perception of the different components of the audio signal, describing the different perceptual levels at which it is possible to experience music. We also aim to formalise the musical perception of different features from a subjective point of view, e.g. in individuals with different musical skills or from different musical cultures. Furthermore, since the model proposed at the moment can only express notes related to the traditional notation system (and thus to the tempered system of the Western musical tradition), we plan to expand the ontology with notes from other temperaments and musical traditions, as well as unpitched notes and microtonal variations.

References

1. Antovic, M.: Musical metaphors in Serbian and Romani children: An empirical study. Metaphor and Symbol 24(3), 184–202 (2009). https://doi.org/10.1080/10926480903028136
2. Association, M.M.: The complete MIDI 1.0 detailed specification: Incorporating all recommended practices (1996), https://books.google.it/books?id=cPpcPQAACAAJ
3. Brown, C.: Historical performance, metronome marks and tempo in Beethoven's symphonies. Early Music XIX(2), 247–260 (1991). https://doi.org/10.1093/earlyj/XIX.2.247
4. Bruneau, M., Scelo, T., d'Acoustique, S.: Fundamentals of Acoustics. ISTE, Wiley (2013), https://books.google.it/books?id=1_y6gKuGSb0C
5. Carlyon, R.P., Shackleton, T.M.: Comparing the fundamental frequencies of resolved and unresolved harmonics: Evidence for two pitch mechanisms? The Journal of the Acoustical Society of America 95(6), 3541–3554 (1994). https://doi.org/10.1121/1.409971
6. Cherfi, S.S.s., Guillotel, C., Hamdi, F., Rigaux, P., Travers, N.: Ontology-based annotation of music scores. In: Proceedings of the Knowledge Capture Conference. K-CAP 2017, Association for Computing Machinery, New York, NY, USA (2017). https://doi.org/10.1145/3148011.3148038
7. Cook, N.: Towards the complete musicologist. In: Proceedings of the 6th International Conference on Music Information Retrieval (2006)
8. Dannenberg, R.B.: Music representation issues, techniques, and systems. Computer Music Journal 17(3), 20–30 (1993), http://www.jstor.org/stable/3680940
9. Grey, J.M.: Multidimensional perceptual scaling of musical timbres. The Journal of the Acoustical Society of America 61(5), 1270–1277 (1977). https://doi.org/10.1121/1.381428
10. Grove, G., Fuller-Maitland, J.: Grove's Dictionary of Music and Musicians. No. v. 4 in Grove's Dictionary of Music and Musicians, Macmillan (1908), https://books.google.it/books?id=FBsPAAAAYAAJ
11. Institute, A.N.S., Sonn, M., of America, A.S.: American National Standard Psychoacoustical Terminology. ANSI, American National Standards Institute (1973), https://books.google.it/books?id=5Fo0GQAACAAJ
12. Jones, J., de Siqueira Braga, D., Tertuliano, K., Kauppinen, T.: MusicOWL: The music score ontology. In: Proceedings of the International Conference on Web Intelligence. pp. 1222–1229. WI '17, Association for Computing Machinery, New York, NY, USA (2017). https://doi.org/10.1145/3106426.3110325
13. Mascardi, V., Cordì, V., Rosso, P.: A comparison of upper ontologies. In: WOA. pp. 55–64 (2007)
14. McAdams, S., Giordano, B.L.: The perception of musical timbre. pp. 113–124. Oxford Library of Psychology, Oxford University Press, New York, NY, US (2016)
15. McLachlan, N.M.: Timbre, Pitch, and Music. Oxford University Press (2016). https://doi.org/10.1093/oxfordhb/9780199935345.013.44
16. Moore, B.C.J.: An Introduction to the Psychology of Hearing, 5th edn. Academic Press, San Diego, CA, US (2003)
17. Müller, M.: Fundamentals of Music Processing: Audio, Analysis, Algorithms, Applications. Springer Publishing Company, Incorporated, 1st edn. (2015)
18. Nattiez, J.J., Dunsby, J.M.: Fondements d'une semiologie de la musique. Perspectives of New Music 15(2), 226–233 (1977). https://doi.org/10.2307/832821
19. Rashid, S.M., De Roure, D., McGuinness, D.L.: A music theory ontology. In: Proceedings of the 1st International Workshop on Semantic Applications for Audio and Music. pp. 6–14. SAAM '18, Association for Computing Machinery, New York, NY, USA (2018). https://doi.org/10.1145/3243907.3243913
20. Raymond, Y., Abdallah, S., Sandler, M., Giasson, F.: The Music Ontology. In: Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR 2007). pp. 417–422. Vienna, Austria (Sep 2007)
21. Rusconi, E., Kwan, B., Giordano, B., Umilta, C., Butterworth, B.: Spatial representation of pitch height: the SMARC effect. Cognition 99(2), 113–129 (2006). https://doi.org/10.1016/j.cognition.2005.01.004
22. Schouten, J.: The Perception of Subjective Tones. Separaat / Laboratoria N.V. Philips Gloeilampenfabrieken (1938), https://books.google.it/books?id=d9vjzQEACAAJ
23. Stewart, L., Walsh, V., Frith, U.: Reading music modifies spatial mapping in pianists. Perception & Psychophysics 66(2), 183–195 (2004). https://doi.org/10.3758/bf03194871
24. Weiss, P.J., Shadle, S.: FRBR in the real world. The Serials Librarian 52(1–2), 93–104 (2007). https://doi.org/10.1300/J123v52n01_09
25. Wiggins, G.: Computer-representation of music in the research environment. Modern Methods for Musicology: Prospects, Proposals, and Realities, pp. 7–22 (2009), https://doi.org/10.4324/9781315595894
26. Wiggins, G., Miranda, E., Smaill, A., Harris, M.: A framework for the evaluation of music representation systems. Computer Music Journal 17(3), 31–42 (1993), http://www.jstor.org/stable/3680941