DMUN at the MediaEval 2015 C@merata Task: the Stravinsqi Algorithm

Andreas Katsiavalos and Tom Collins
Faculty of Technology
De Montfort University
Leicester, UK
+44 116 207 6192
tom.collins@dmu.ac.uk
ABSTRACT
This paper describes the Stravinsqi-Jun2015 algorithm, and evaluates its performance on the MediaEval 2015 C@merata task. Stravinsqi stands for STaff Representation Analysed VIa Natural language String Query Input. The algorithm parses a query string that consists of a natural language expression concerning a symbolically represented piece of music (which the algorithm parses also), and then identifies where in the music the event(s) specified by the query occur. For a given query, the output is a list of time windows specifying the locations of the relevant events. Time windows output by the algorithm can be compared with time windows specified by music experts for the same query-piece combinations. Across a collection of twenty pieces and 200 questions, Stravinsqi-Jun2015 had recall .794 and precision .316 at the measure level, and recall .739 and precision .294 at the beat level. The paper undertakes a preliminary analysis of where Stravinsqi might be improved, identifies applications of the C@merata task within the contexts of music education and music listening more generally, and provides a constructive critique of some of the question categories that are new this year.
1. INTRODUCTION
The premise of the C@merata task [1] is that it is interesting and worthwhile to develop algorithms that can (1) parse a natural language query about a notated piece of music, and (2) retrieve relevant time windows from the piece where events/concepts mentioned in the query occur. The premise is strong, if we consider that each year in the U.S. alone over 200,000 freshman students declare music their intended major [2, 3], and that there is a line connecting the types of queries being set in the C@merata task and the questions these students are taught (or, by college, have already been taught) to answer [4]. The C@merata task, apart from posing an interesting research problem at the intersection of music theory, music psychology, music computing, and natural language processing (NLP), could lead to new applications that assist students, and music lovers more generally, in gaining music appraisal skills. Other applications of research motivated by the C@merata task include supporting work in musicology [5], and informing solutions to various music informatics tasks, such as generation of music in an intended style [6] or expressive rendering of staff notation [7], where systems for either task may benefit from being able to automatically extract, say, cadence locations and/or changes in texture.

Figure 1. Flow diagram for the Stravinsqi-Jun2015 algorithm. [Boxes, from Start to End: 1. get string and division; 2. check bar range; 3. split synchronous questions; 4. split asynchronous questions; 5. load point-set representations; 6. get elemental answers (chord-time-intervals, harmonic-interval-of-a..., melodic-interval-of-a..., duration&pitch-class-time-intervals, pitch-class-time-intervals, duration-time-intervals, ...); 7. check asynchronous answers; 8. check synchronous answers; 9. convert to required time format.]

2. APPROACH
2.1 Overview
The Stravinsqi-Jun2015 algorithm (hereafter, Stravinsqi), which was entered into the C@merata task, is part of a Common Lisp package called MCStylistic-Jun2015 that has been under development since 2008 [8]. The MCStylistic package, free and cross-platform, supports research into music theory, music cognition, and stylistic composition, with new versions released on an approximately annual basis.¹ In addition to Stravinsqi, MCStylistic includes implementations of other algorithms from the fields of music information retrieval and music psychology, for tonal and metric analysis [e.g., 9], and for the discovery of repeated patterns (e.g., motifs, themes, sequences) [10].
A flow diagram outlining the Stravinsqi algorithm is given in Figure 1. The following is a succinct overview, focusing on the differences between this year’s (Stravinsqi-Jun2015) and last year’s submission (Stravinsqi-Jun2014) [11].²

Copyright is held by the author/owner(s).
MediaEval 2015 Workshop, September 14-15, 2015, Wurzen, Germany.
¹ http://www.tomcollinsresearch.net
² For more details, please see the six-page version of this paper.
Step 1 of Stravinsqi involves extracting the question string and divisions value from the question file. Step 2 parses the question string for mention of bar restrictions (“minim in measures 6-10”), stores this for subsequent processing (as (6 10), say), removes the restriction from the question string (“minim”), and passes it to step 3.
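To make step 2 concrete, the following minimal Python sketch extracts a bar-number restriction of the kind described above and returns the remaining query together with the range. It is illustrative only: Stravinsqi itself is implemented in Common Lisp within MCStylistic, and the function name and the phrasings handled here are assumptions rather than the actual implementation.

import re

def split_bar_restriction(question):
    # Illustrative step-2 behaviour: pull a bar/measure range out of a query
    # string, returning (remaining_query, (first_bar, last_bar)), or
    # (question, None) when no restriction is present.
    pattern = re.compile(
        r"\s+in\s+(?:bars?|measures?)\s+(\d+)(?:\s*(?:-|to)\s*(\d+))?",
        re.IGNORECASE)
    match = pattern.search(question)
    if match is None:
        return question, None
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    remaining = (question[:match.start()] + question[match.end():]).strip()
    return remaining, (first, last)

print(split_bar_restriction("minim in measures 6-10"))  # ('minim', (6, 10))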

Prompted by one of the questions from the task description [12, p. 9], Stravinsqi splits queries by synchronous commands first (step 3) and then further by asynchronous commands (step 4). For example, “D followed by A against F followed by F” would emerge as (“D followed by A” “F followed by F”) from step 3, and as ((“D” “A”) (“F” “F”)) from step 4. In general, a question string emerges from step 4 as some nested list of strings ((s1,1 s1,2 ... s1,n(1)) (s2,1 s2,2 ... s2,n(2)) ... (sm,1 sm,2 ... sm,n(m))), where each si,j is a query element. Examples of query elements include “D”, “A♭4 eighth note”, “perfect fifth”, “melodic interval of a 2nd”, etc.
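The splitting performed in steps 3 and 4 can be pictured as two nested splits, first on the synchronous connective and then on the asynchronous one. The Python sketch below reproduces only the worked example; the connective phrases shown (“against”, “followed by”, “then”) are assumptions on our part, and Stravinsqi’s own parsing handles a richer set of phrasings.

import re

def split_query(question):
    # Illustrative steps 3-4: split on a synchronous connective first
    # ("against"), then on asynchronous connectives ("followed by", "then"),
    # yielding a nested list of query elements.
    synchronous_parts = re.split(r"\s+against\s+", question)
    return [re.split(r"\s+(?:followed by|then)\s+", part)
            for part in synchronous_parts]

print(split_query("D followed by A against F followed by F"))
# [['D', 'A'], ['F', 'F']]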
In step 5, point-set representations of the relevant piece are loaded and possibly restricted to those points that belong to a certain bar-number range. The xml2hum script is used to convert each piece from its MusicXML format to kern format [13].³
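Assuming the xml2hum binary from Humdrum Extras [13] is on the PATH and writes kern to standard output when given a MusicXML file (an assumption about its command-line behaviour), the conversion in step 5 could be scripted as in the following sketch; file names are illustrative.

import subprocess
from pathlib import Path

def musicxml_to_kern(xml_path, kern_path):
    # Run xml2hum on a MusicXML file and save the **kern it prints to
    # standard output (assumed behaviour; see Humdrum Extras [13]).
    result = subprocess.run(["xml2hum", str(xml_path)],
                            capture_output=True, text=True, check=True)
    Path(kern_path).write_text(result.stdout)

# musicxml_to_kern("piece.xml", "piece.krn")  # file names illustrative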
Temporarily, in step 6 of Figure 1, each question element si,j from step 4 is treated independently. A query element si,j is passed to seventeen music-analytic sub-functions, each of which tests whether si,j is a relevant query for that function, and, if so, searches for instances of the query in the piece of music. If the query is irrelevant to a sub-function, that function returns nil.
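The dispatch in step 6 can be sketched as follows. The two sub-functions shown are simplified stand-ins for two of the seventeen music-analytic sub-functions (their names echo labels in Figure 1), operating on a toy point set of (onset, pitch name, duration) triples; none of this is the actual MCStylistic code.

def pitch_class_time_intervals(query_element, point_set):
    # Stand-in sub-function: answers bare pitch-class queries such as "D".
    if query_element not in {"A", "B", "C", "D", "E", "F", "G"}:
        return None  # irrelevant query -> nil
    return [(onset, onset + duration)
            for onset, pitch_name, duration in point_set
            if pitch_name[0] == query_element]

def duration_time_intervals(query_element, point_set):
    # Stand-in sub-function: answers bare duration queries such as "minim".
    durations = {"minim": 2.0, "crotchet": 1.0, "quaver": 0.5}
    if query_element not in durations:
        return None
    return [(onset, onset + duration)
            for onset, _, duration in point_set
            if duration == durations[query_element]]

SUB_FUNCTIONS = [pitch_class_time_intervals, duration_time_intervals]  # ... 17 in all

def elemental_answers(query_element, point_set):
    # Step 6 (sketch): collect time intervals from every sub-function that
    # finds the query element relevant; irrelevant sub-functions return None.
    intervals = []
    for sub_function in SUB_FUNCTIONS:
        answer = sub_function(query_element, point_set)
        if answer is not None:
            intervals.extend(answer)
    return intervals

# Toy point set: (onset in crotchet beats, pitch name, duration in crotchets).
points = [(0.0, "D4", 2.0), (2.0, "A4", 1.0), (3.0, "F4", 1.0)]
print(elemental_answers("D", points))      # [(0.0, 2.0)]
print(elemental_answers("minim", points))  # [(0.0, 2.0)]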
The output of step 6 is a nested list of time-interval sets, ((T1,1 T1,2 ... T1,n(1)) (T2,1 T2,2 ... T2,n(2)) ... (Tm,1 Tm,2 ... Tm,n(m))), one for each query element si,j, some of which may be empty. The purpose of steps 7 and 8 is to determine whether any combination v of these time intervals satisfies the constraints imposed by the synchronous and asynchronous parts of the question string (there may be one such v, several, or none). The final step of Stravinsqi, labeled Step 9 in Figure 1, comprises the conversion of the time intervals v1, v2, ..., vr into the XML format required by the task.
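A minimal sketch of steps 7 and 8 is given below, under two simplifying assumptions that are ours rather than the paper’s: “followed by” is taken to mean that the next interval starts no earlier than the previous one ends, and “against” is taken to mean that the grouped passages overlap in time.

from itertools import product

def asynchronous_chains(candidate_lists):
    # Step 7 (sketch): choose one interval per query element so that each
    # chosen interval starts no earlier than the previous one ends; return
    # the spanning (start, end) of every valid chain.
    chains = []
    for combo in product(*candidate_lists):
        if all(combo[k + 1][0] >= combo[k][1] for k in range(len(combo) - 1)):
            chains.append((combo[0][0], combo[-1][1]))
    return chains

def synchronous_answers(groups):
    # Step 8 (sketch): combine one chain per group so that all chains overlap
    # in time; return the overall (start, end) of every such combination.
    per_group_chains = [asynchronous_chains(group) for group in groups]
    answers = []
    for combo in product(*per_group_chains):
        latest_start = max(start for start, _ in combo)
        earliest_end = min(end for _, end in combo)
        if latest_start < earliest_end:  # the chains genuinely overlap
            answers.append((min(start for start, _ in combo),
                            max(end for _, end in combo)))
    return answers

# Candidate intervals for (("D" "A") ("F" "F")), one list per query element.
groups = [[[(0.0, 2.0)], [(2.0, 3.0)]],   # "D" followed by "A"
          [[(0.0, 1.0)], [(2.5, 3.0)]]]   # "F" followed by "F"
print(synchronous_answers(groups))        # [(0.0, 3.0)]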
3. RESULTS AND DISCUSSION
Figure 2. Results of the Stravinsqi-Jun2015 algorithm on the MediaEval 2015 C@merata task. Overall results are indicated by the Mean label, and followed by results for eleven question categories. [Radar plot; series: Measure Recall, Measure Precision, Beat Recall, Beat Precision.]
Figure 2 contains a summary of results for the Stravinsqi algorithm across various question categories.⁴ The mean measure recall across all 200 questions, indicated by the black line next to the “Mean” label, is .794, and the mean measure precision, indicated by the blue line, is .316. The mean beat recall (green line) and beat precision (red line) are both slightly lower than their measure counterparts (.739 and .294 respectively), but in general it can be assumed that if Stravinsqi returned the correct beginning/ending measure number pairs for a question, then it was also able to identify the relevant beats. Stravinsqi had the highest measure and beat recall of any algorithm submitted to the 2015 C@merata task [1] and the third highest measure and beat F1 score (F1 = 2PR/(P + R), where P is precision and R is recall).
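As a quick worked check, plugging the reported precision and recall values into this formula gives approximately .452 at the measure level and .421 at the beat level:

def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.316, 0.794), 3))  # measure-level F1: 0.452
print(round(f1(0.294, 0.739), 3))  # beat-level F1: 0.421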
Across eight of the eleven question categories shown in Figure 2 (Melody 1, Melody n, Harmony, Articulation, Instrument, Clef, Follow, and Synch), Stravinsqi achieves consistently high recall of approximately .75. For the remaining three categories (Time Sig., Key Sig., and Texture) it is less successful. Overall, the results suggest the need to investigate why Stravinsqi’s precision is lower than its recall.
We have not yet incorporated in Stravinsqi restrictions to notes occurring after particular clef, time signature, or key signature changes. Currently, a query such as “G4 in the key of G major” would be parsed as though it were “G4”. Therefore, the recall of Stravinsqi remains high for such questions, but the precision will be negatively impacted. The design of Stravinsqi is motivated more by music-perceptual than typographical concerns, based on the premise that music is primarily an auditory-cognitive phenomenon, and a visuo-cognitive phenomenon secondarily. When music perception and music theory collide, as they do occasionally in the C@merata task and beyond [14], Stravinsqi’s precision can be adversely affected. For example, unlike the task description (consecutive elements “must both be on the same stave” [12, p. 7]), Stravinsqi does not require consecutive question elements to be on the same staff, because a staff swap has little (or sometimes no) effect on how the music sounds. Stravinsqi tends to find the correct answers according to the task description, but also some extra answers that involve elements on different staves, which has a detrimental effect on its precision.

4. CONCLUSION
We have provided an overview of the Stravinsqi-Jun2015 algorithm, and described its performance on the 2015 C@merata task. Stravinsqi achieved high recall (approximately .75) in eight of the eleven question categories, and had the highest measure and beat recall of any algorithm submitted to the task [1]. Further analysis of the results is required to determine whether Stravinsqi’s precision can be improved while adhering to our general design principle of favouring music-perceptual over typographical concerns. In the introduction, it was remarked that there is a line connecting the types of queries being set in the C@merata task and the examination questions that students of the Western classical tradition are taught to answer. This year’s C@merata test set lacked cadence and functional harmony queries, which was indicative of a general tendency to replace musically interesting questions (e.g., concerning cadence, triad, hemiola, ostinato, sequence, etc.) with questions that were linguistically challenging to parse but of less musical relevance (e.g., Question 130, “fourteen sixteenth notes against a whole note chord in the bass”). Next year, we would welcome the reintroduction of more musically interesting (if complex) question categories, to re-establish and strengthen the line that connects C@merata queries with concepts that are relevant for music students and enthusiasts.

³ http://extras.humdrum.org/bin/osxintel64/
⁴ Please see [1] for definitions of the various metrics.
5. REFERENCES
[1] Sutcliffe, R. F. E., Fox, C., Root, D. L., Hovy, E., and Lewis, R. 2015. The C@merata Task at MediaEval 2015: Natural language queries on classical music scores. In MediaEval 2015 Workshop, Wurzen, Germany, September 14-15, 2015. http://ceur-ws.org.
[2] Kena, G., et al. 2015. The Condition of Education 2015. U.S. Department of Education, Washington, DC. Retrieved June 4, 2015 from http://nces.ed.gov/pubsearch.
[3] Eagan, K., et al. 2014. The American Freshman: National Norms Fall 2014. Higher Education Research Institute, UCLA, Los Angeles, CA. Retrieved March 15, 2015 from http://www.heri.ucla.edu.
[4] AQA. 2012. Music in Context: Past Paper for A-level Music Unit 4. AQA, Manchester, UK. Retrieved January 13, 2014 from http://www.aqa.org.uk.
[5] Cuthbert, M. S., and Ariza, C. 2010. music21: a toolkit for computer-aided musicology and symbolic music data. In Proceedings of the International Symposium on Music Information Retrieval (Utrecht, The Netherlands, August 9-13, 2010). 637-642.
[6] Collins, T., Laney, R., Willis, A., and Garthwaite, P. H. 2015. Developing and evaluating computational models of musical style. Artificial Intelligence for Engineering Design, Analysis and Manufacturing. DOI: 10.1017/S0890060414000687.
[7] van Herwaarden, S., Grachten, M., and de Haas, W. B. 2014. Predicting expressive dynamics in piano performances using neural networks. In Proceedings of the International Symposium on Music Information Retrieval (Taipei, Taiwan, October 27-31, 2014). 47-52.
[8] Collins, T. 2011. Improved Methods for Pattern Discovery in Music, with Applications in Automated Stylistic Composition. Doctoral Thesis. Faculty of Mathematics, Computing and Technology, The Open University.
[9] Volk, A. 2008. The study of syncopation using inner metric analysis: linking theoretical and experimental analysis of metre in music. J. New Music Res. 37, 4, 259-273.
[10] Meredith, D., Lemström, K., and Wiggins, G. A. 2002. Algorithms for discovering repeated patterns in multidimensional representations of polyphonic music. J. New Music Res. 31, 4, 321-345.
[11] Collins, T. 2014. Stravinsqi/De Montfort University at the C@merata 2014 task. In Proceedings of the C@merata Task at MediaEval 2014.
[12] Sutcliffe, R., Fox, C., Root, D. L., Hovy, E., and Lewis, R. 2015. Task Description v2: C@merata 15: Question Answering on Classical Music Scores. Retrieved June 4, 2015 from http://csee.essex.ac.uk/camerata.
[13] Sapp, C. S. 2013. Humdrum Extras. Retrieved March 3, 2014 from http://wiki.ccarh.org/wiki/Humdrum_Extras.
[14] Cook, N. 1994. Perception: a perspective from music theory. In Musical Perceptions, R. Aiello and J. A. Sloboda, Eds. Oxford University Press, Oxford, UK, 64-95.