=Paper=
{{Paper
|id=Vol-1436/Paper12
|storemode=property
|title=The C@merata Task at MediaEval 2015: Natural Language Queries on Classical Music Scores
|pdfUrl=https://ceur-ws.org/Vol-1436/Paper12.pdf
|volume=Vol-1436
|dblpUrl=https://dblp.org/rec/conf/mediaeval/SutcliffeFRHL15
}}
==The C@merata Task at MediaEval 2015: Natural Language Queries on Classical Music Scores==
* Richard Sutcliffe, School of CSEE, University of Essex, Colchester, UK, rsutcl@essex.ac.uk
* Chris Fox, School of CSEE, University of Essex, Colchester, UK, foxcj@essex.ac.uk
* Deane L. Root, Department of Music, University of Pittsburgh, Pittsburgh, PA, USA, dlr@pitt.edu
* Eduard Hovy, Language Technologies Institute, Carnegie-Mellon University, Pittsburgh, PA, USA, hovy@cmu.edu
* Richard Lewis, Department of Computing, Goldsmiths, University of London, London, UK, richard.lewis@gold.ac.uk

Copyright is held by the author/owner(s). MediaEval 2015 Workshop, September 14-15, 2015, Wurzen, Germany.

===ABSTRACT===
This was the second year of the C@merata task [16,1] which relates natural language processing to music information retrieval. Participants each build a system which takes as input a query and a music score and produces as output one or more matching passages in the score. This year, questions were more difficult and scores were more complex. Participants were the same as last year and once again CLAS was the best with a Beat F-Score of 0.620.

===INTRODUCTION===
The C@merata task is a kind of Question Answering [13,17,2,12,18] combined with Music Information Retrieval [5,6]. The input is a phrase such as ‘dotted minim F#4’ together with a score in MusicXML [11] and the output is a list of one or more passages in the score each containing such a minim.

There are three main applications for C@merata-type systems. First, we have observed in Grove and elsewhere [7,14,3,8,10] that musicological analyses make references to musical passages. For example, consider ‘cellos and basses lead us into the shadows while the upper strings accompany with gently throbbing harmonies’ [8, p17]. This refers to a passage in Beethoven’s First Symphony, but where exactly?

Second, experts may wish to find a specific passage based on a possibly vague description, for example, ‘the Wagner coda from the 7th symphony of Bruckner’.

Third, students of music who are unsure what an ‘interrupted cadence’ is could benefit from a system which could find examples such as ‘The trumpet shall sound’ from Handel’s Messiah. These three applications motivate our work.

{| class="wikitable"
|+ Figure 1. Extract from Bach BWV 1047 Andante with sample questions and answers
! Question !! Answer
|-
| dotted minim F#4 || [ 3/4, 1, 65:1-65:3 ]
|-
| F4 crotchet in the oboe || [ 3/4, 2, 64:3-64:4 ]
|-
| minim A2 in 3/4 time || [ 3/4, 1, 62:2-62:3 ], [ 3/4, 1, 64:2-64:3 ]
|-
| chord D2 E5 G5 in bars 54-58 || [ 3/4, 2, 57:1-57:1 ]
|-
| quavers F3 A3 followed by crotchet A4 in the violin || [ 3/4, 1, 57:2-57:3 ]
|-
| four quavers in the violin against a minim in the bass clef || [ 3/4, 1, 62:2-62:3 ], [ 3/4, 1, 64:2-64:3 ]
|}

===1. APPROACH===
====1.1 The C@merata Task====
Participants are given 200 questions and twenty scores in MusicXML, ten questions on each score. The task is to find one or more answer passages for each question. Suppose the query is ‘dotted minim F#4’ against the Andante of BWV 1047 (Figure 1). An answer passage is [ 3/4, 1, 65:1-65:3 ]. This means time signature 3/4, measuring in crotchets, passage starts before the first crotchet in bar 65 and ends after the third crotchet.

The twenty scores were chosen from Baroque, Classical and Romantic composers. They ranged in complexity from one stave up to nineteen staves and from a few bars up to a hundred or more. Query types were different from 2014 (Table 1) and consisted of eight base types which could have certain qualifications. Some were similar to last year (‘D4 minim’) while others were more complex (‘quavers F4 E4 in the oboe followed by quavers E2 G#2 in the bass clef’).

====1.2 Evaluation Metrics====
A passage is beat-correct if it starts in the correct bar at the correct beat and it ends at the correct bar at the correct beat. Beat Precision (BP) is the number of beat-correct passages returned by a system, in answer to a question, divided by the number of passages (correct or incorrect) returned. Similarly, Beat Recall (BR) is the number of beat-correct passages returned by a system divided by the total number of answer passages known to exist. Beat F-Score (BF) is the harmonic mean of BP and BR.

A passage is measure-correct if it starts in the correct bar, not necessarily at the correct beat, and it ends at the correct bar, not necessarily at the correct beat. Measure Precision (MP) is the number of measure-correct passages returned by a system divided by the number of passages (correct or incorrect) returned. Measure Recall (MR) is the number of measure-correct passages returned by a system divided by the total number of answer passages known to exist. Measure F-Score (MF) is the harmonic mean of MP and MR.

{| class="wikitable"
|+ Table 1. Distribution of Query Types with Examples
! Type !! No !! Example
|-
| 1_melod || 40 || D4 minim; eighth note in measure 9
|-
| 1_melod qualified by perf, instr, clef, time, key || 40 || trill on a quaver A; G# in the Cello part in measures 29-39; sixteenth note C# in the left hand; half note E3 in 2/2; sixteenth note G in G minor in measures 1-5
|-
| n_melod || 20 || F# E G F# A; Do Mi Do Sol Do Mi Sol Do in bars 1-20; twenty semiquavers; five note melody in bars 1-10
|-
| n_melod qualified by perf, instr, clef, time, key || 20 || two staccato quarter notes in the Violin 1; crotchet, crotchet rest, crotchet rest, crotchet, crotchet rest, crotchet, crotchet, crotchet, crotchet, crotchet in the Timpani; melodic octave leap in the bass clef in measures 70-80; G4 B4 E5 in 3/4; rising G minor arpeggio
|-
| 1_harm possibly qualified by perf, instr, clef, time, key || 20 || eighth note chord Bb, C, E; chord of D minor in measures 109-110; harmonic minor sixth in the Violas; dotted minim chord in the left hand
|-
| texture || 6 || monophonic passage; homophony in measures 1-14; polyphony in measures 10-14; Alberti bass in measures 0-4
|-
| follow possibly qualified on either or both sides by perf, instr, clef, time, key || 40 || quavers F4 E4 in the oboe followed by quavers E2 G#2 in the bass clef; quarter note minor third followed by eighth note unison; C followed by mordent Bb; chord C4 G4 C5 E5 then a quaver; three eighth notes in the Violin I followed by twelve sixteenth notes in the Violin II in measures 87-92
|-
| synch possibly qualified in either or both parts by perf, instr, clef, time, key || 14 || four eighth notes against a half note; crotchet D3 on the word “je” against a minim D2; four staccato quavers in the Violoncello against a minim chord Ab3 C4 F4 in the Harpsichord
|-
| All || 200 ||
|}

====1.3 Gold Standard Queries====
200 questions were prepared according to a carefully crafted distribution of query types (Table 1). Answers were identified in the scores and checked by two further experts. The data was used to create the Gold Standard for evaluating results automatically.

===2. RESULTS AND DISCUSSION===
Five groups from four countries participated, exactly the same as in 2014 (Table 2). The results are shown in Table 3. These were lower than last year but once again CLAS was the best with BF 0.620. This was a great achievement as the questions were generally much harder this year and there were fewer ‘easy’ questions such as ‘crotchet F’ to boost the figures.

{| class="wikitable"
|+ Table 2. C@merata Participants
! Runtag !! Leader !! Affiliation !! Country
|-
| CLAS || Stephen Wan || CSIRO || Australia
|-
| DMUN || Tom Collins || De Montfort University || England
|-
| OMDN || Donncha Ó Maidín || University of Limerick || Ireland
|-
| TNKG || Nikhil Kini || Thane NK Group || India
|-
| UNLP || Kartik Asooja || NUI Galway || Ireland
|}

{| class="wikitable"
|+ Table 3. Results by Participant
! Run !! BP !! BR !! BF !! MP !! MR !! MF
|-
| CLAS01 || 0.604 || 0.636 || 0.620 || 0.639 || 0.673 || 0.656
|-
| DMUN01 || 0.311 || 0.739 || 0.438 || 0.332 || 0.788 || 0.467
|-
| DMUN02 || 0.242 || 0.739 || 0.365 || 0.265 || 0.809 || 0.399
|-
| DMUN03 || 0.294 || 0.739 || 0.421 || 0.316 || 0.794 || 0.452
|-
| OMDN01 || 0.817 || 0.175 || 0.288 || 0.817 || 0.175 || 0.288
|-
| TNKG01 || 0.061 || 0.488 || 0.108 || 0.073 || 0.586 || 0.129
|-
| UNLP01 || 0.126 || 0.430 || 0.195 || 0.149 || 0.508 || 0.230
|-
| Maximum || 0.817 || 0.739 || 0.620 || 0.817 || 0.809 || 0.656
|-
| Minimum || 0.061 || 0.175 || 0.108 || 0.073 || 0.175 || 0.129
|-
| Average || 0.351 || 0.564 || 0.348 || 0.370 || 0.619 || 0.375
|}

Participants generally updated and adapted their 2014 systems. Almost all worked in Python using music21 [4] and parts of the Baseline System from last year [15]. DMUN converted scores from MusicXML [11] to Kern [9] in order to use their pre-existing tools in Lisp. OMDN used their own tools in C++. Only basic NLP was used. Typically, a query was first scanned looking for terms (down bow → down_bow). Some adopted a QA approach and assigned each query to a pre-defined type, each with its method of solution. Others parsed the concepts and converted them to a structured representation. Some varied the representation of the score according to the question, e.g. using music21 chordify for cadence questions. As the amount of data to be searched per query was not large (just one score) no one used any inverted indexing of the music data.

===3. CONCLUSIONS===
This was the second year, and much was learned by participants and organisers alike. All were once again able to produce a working system. Questions were more complex this year and results were lower in consequence. Future campaigns may bring us closer to the examples given in the introduction.

===4. REFERENCES===
* [1] C@merata (2015). http://csee.essex.ac.uk/camerata/
* [2] CLEF (2015). http://www.clef-initiative.eu/.
* [3] Cooke, D. (1995). Bruckner, (Joseph) Anton. In S. Sadie (ed), New Grove Dictionary of Music and Musicians, Volume 3, Section 7. Music (p362-366). London, UK: Macmillan.
* [4] Cuthbert, M. S., & Ariza C. (2010). music21: a toolkit for computer-aided musicology and symbolic music data. Proc. International Symposium on Music Information Retrieval (Utrecht, The Netherlands, August 09-13, 2010), p637-642.
* [5] Futrelle, J., & Downie, J. S. (2003). Interdisciplinary Research Issues in Music Information Retrieval: ISMIR 2000–2002. Journal of New Music Research (32:2), 121-131.
* [6] Ganseman, J., Scheunders, P., & D'haes, W. (2008). Using XQuery on MusicXML databases for musicological analysis. Proc. International Symposium on Music Information Retrieval, p433-438.
* [7] Grove Music Online (2015). http://www.oxfordmusiconline.com/public/
* [8] Hopkins, A. (1982). The Nine Symphonies of Beethoven. London: Pan Books.
* [9] Huron, D. (1997). Humdrum and Kern: Selective Feature Encoding. In ‘Beyond MIDI’, ed. E. Selfridge-Field (p375-401). Cambridge, MA: MIT Press.
* [10] Kirkpatrick, R. (1953). Domenico Scarlatti. Princeton, NJ: Princeton University Press.
* [11] MusicXML (2015). http://www.musicxml.com/.
* [12] NTCIR (2015). http://research.nii.ac.jp/ntcir/index-en.html.
* [13] Peñas, A., Magnini, B., Forner, P., Sutcliffe, R., Rodrigo, A., & Giampiccolo, D. (2012). Question Answering at the Cross-Language Evaluation Forum 2003-2010. Language Resources and Evaluation Journal, 46(2), 177-217.
* [14] Sadie, S. (ed) (1995). The New Grove Dictionary of Music and Musicians. London, UK: Macmillan.
* [15] Sutcliffe, R. F. E. (2014). A Description of the C@merata Baseline System in Python 2.7 for Answering Natural Language Queries on MusicXML Scores. University of Essex Technical Report, 21st May, 2014.
* [16] Sutcliffe, R. F. E., Crawford, T., Fox, C., Root, D. L., & Hovy, E. (2014). The C@merata Task at MediaEval 2014: Natural language queries on classical music scores. In Proc. MediaEval 2014 Workshop, Barcelona, Spain, October 16-17 2014. http://ceur-ws.org/Vol-1263/.
* [17] Sutcliffe, R., Peñas, A., Hovy, E., Forner, P., Rodrigo, A., Forascu, C., Benajiba, Y., Osenova, P. (2013). Overview of QA4MRE Main Task at CLEF 2013. Proc. QA4MRE-2013.
* [18] TREC (2015). http://trec.nist.gov/.
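
===Illustrative sketch: evaluation metrics===
The beat-level and measure-level scores defined under Evaluation Metrics can be illustrated with a short Python sketch for a single question. It is not the organisers' evaluation script. The names <code>Passage</code>, <code>parse_passage</code>, <code>beat_correct</code>, <code>measure_correct</code> and <code>score</code> are hypothetical, as is the regular expression for the answer notation. Two further assumptions are made here: each gold passage may be matched by at most one returned passage, and a beat-correct passage must use the same counting unit (the second field of the answer) as the gold passage. The returned passages in the usage part are invented for illustration; the gold passages are the answers to ‘minim A2 in 3/4 time’ in Figure 1.

```python
import re
from typing import Callable, List, NamedTuple, Tuple


class Passage(NamedTuple):
    divisions: int   # second field of the answer; the text says 1 means crotchets
    start_bar: int
    start_beat: int
    end_bar: int
    end_beat: int


# Hypothetical pattern for the notation "[ 3/4, 1, 65:1-65:3 ]".
# The time signature is matched but not stored.
PASSAGE_RE = re.compile(
    r"\[\s*\d+/\d+\s*,\s*(\d+)\s*,\s*(\d+):(\d+)-(\d+):(\d+)\s*\]"
)


def parse_passage(text: str) -> Passage:
    """Turn one bracketed answer string into a Passage."""
    match = PASSAGE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a passage: {text!r}")
    return Passage(*(int(group) for group in match.groups()))


def beat_correct(returned: Passage, gold: Passage) -> bool:
    """Same start bar and beat, same end bar and beat (and same counting unit)."""
    return returned == gold


def measure_correct(returned: Passage, gold: Passage) -> bool:
    """Same start bar and same end bar; beats are ignored."""
    return returned.start_bar == gold.start_bar and returned.end_bar == gold.end_bar


def score(
    returned: List[Passage],
    gold: List[Passage],
    is_correct: Callable[[Passage, Passage], bool],
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) for one question."""
    unmatched = list(gold)  # assumption: each gold passage is matched at most once
    hits = 0
    for passage in returned:
        for candidate in unmatched:
            if is_correct(passage, candidate):
                unmatched.remove(candidate)
                hits += 1
                break
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0  # harmonic mean
    return precision, recall, f_score


# Gold answers for 'minim A2 in 3/4 time' (Figure 1).
gold = [parse_passage(s) for s in ("[ 3/4, 1, 62:2-62:3 ]", "[ 3/4, 1, 64:2-64:3 ]")]
# Invented system output: one exact match, one right bar but wrong beat, one miss.
returned = [
    parse_passage(s)
    for s in ("[ 3/4, 1, 62:2-62:3 ]", "[ 3/4, 1, 64:1-64:3 ]", "[ 3/4, 1, 70:1-70:2 ]")
]

bp, br, bf = score(returned, gold, beat_correct)
mp, mr, mf = score(returned, gold, measure_correct)
print(f"BP={bp:.3f} BR={br:.3f} BF={bf:.3f}")
print(f"MP={mp:.3f} MR={mr:.3f} MF={mf:.3f}")
```

In this invented case the second returned passage counts towards the measure-level scores but not the beat-level ones, which is why MP, MR and MF come out higher than BP, BR and BF. How per-question counts are combined into the campaign-wide figures of Table 3 is not stated in the text, so the sketch stops at a single question.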