=Paper=
{{Paper
|id=Vol-1263/paper50
|storemode=property
|title=Stravinsqi/De Montfort University at the MediaEval 2014 C@merata Task
|pdfUrl=https://ceur-ws.org/Vol-1263/mediaeval2014_submission_50.pdf
|volume=Vol-1263
|dblpUrl=https://dblp.org/rec/conf/mediaeval/Collins14
}}
==Stravinsqi/De Montfort University at the MediaEval 2014 C@merata Task==
<pdf width="1500px">https://ceur-ws.org/Vol-1263/mediaeval2014_submission_50.pdf</pdf>
<pre>
                               Stravinsqi/De Montfort University
                            at the MediaEval 2014 C@merata Task
                                                                  Tom Collins
                                                             Faculty of Technology
                                                             De Montfort University
                                                                Leicester, UK
                                                              +44 116 207 6192
                                                          tom.collins@dmu.ac.uk
ABSTRACT
An overview is provided of the Stravinsqi-Jun2014 algorithm and
its performance on the MediaEval 2014 C@merata Task.
Stravinsqi stands for STaff Representation Analysed VIa Natural
language String Query Input. The algorithm parses a symbolic
representation of a piece of music as well as a query string
consisting of a natural language expression, and identifies where
event(s) specified by the query occur in the music. The output for
any given query is a list of time windows corresponding to the
locations of relevant events. To evaluate the algorithm, its output
time windows are compared with those specified by music experts
for the same query-piece combinations. In an evaluation consisting
of twenty pieces and 200 questions, Stravinsqi-Jun2014 had
recall .91 and precision .46 at the measure level, and recall .87 and
precision .44 at the beat level. Important potential applications of
this work in music-educational software and musicological
research are discussed.

1. INTRODUCTION
Given a natural language query and a piece of music in digital
staff notation representation, the C@merata task [6] evaluates an
algorithm’s ability to identify where one or more events specified
by the query occur in the music. It is the latest example of a long-
standing interest in querying music represented as (or derived
from) staff notation. The C@merata task challenges researchers to
extend current knowledge in two respects:

     1.   Accepting a music-analytic query in the form of a natural-
          language string, such as “perfect fifth followed by a D4”;
     2.   Reliably retrieving instances of higher-level music-
          theoretic concepts from staff notation, such as functional
          harmonies (e.g., “Ib”) or cadences (e.g., “interrupted
          cadence”).

Figure 1. Prelude to Te Deum H146 by Marc-Antoine Charpentier
(1643-1704), annotated with a bar number error (tick and cross),
intervals of a harmonic second (arrows), functional harmonies below
each staff, and three perfect cadences (black boxes).

One application of an algorithm that performs well on the
C@merata task would be within music notation software, so that
students could query and hear/see results for the pieces with
which they are working, in order to develop their understanding of
various music-theoretic terms.

2. APPROACH
2.1 Overview
The Stravinsqi-Jun2014 algorithm that was entered in the
C@merata task is embedded in a Common Lisp package called
MCStylistic-Jun2014 (hereafter, MCStylistic), which has been
under development since 2008 [1].¹ MCStylistic includes
implementations of algorithms from the fields of music information
retrieval (MIR) and music psychology [2-5].

From a natural language perspective, there are two types of
queries: compound queries such as “a Bb followed a bar later by a
C followed by a tonic triad”, and ordinary queries such as “perfect
cadence”. Stravinsqi checks the query string for compound
queries and splits it into N query elements if necessary, e.g., “a
Bb” and “a bar later by a C” and “tonic triad”.

The piece is converted from its MusicXML format to kern format
using the xml2hum script.² The kern file is parsed by import
functions in MCStylistic to give the following representations,
which are referred to as point sets: (1) instrument/staff and clef
names at the beginning of each staff; (2) bar numbers where time
signatures are specified, together with the number of beats per bar,
the type of beat, and the corresponding ontime (incrementing time
in staff notation); (3) a point-set representation of the piece, where
each point represents a note. The five-dimensional point consists
of the ontime of the note, its MIDI note number, its morphetic
pitch number [3], its duration in crotchet beats, and its numeric
staff number; (4) a point-set representation of the piece with three
extra dimensions, one each for articulation, dynamics, and lyrics
information; (5) a point-set representation of the piece, where
each point represents a notated rest.

Each query element is passed to several sub-functions (e.g.,
harmonic-interval-of-a, duration&pitch-time-intervals,
rest-duration-time-intervals, etc.), along with the appropriate
point set(s). For example, the function
rest-duration-time-intervals takes a query element, the point set
of notated rests, and the point set of instrument/staff and clef
names as its arguments, because these three information sources are
sufficient for locating rests of specific duration. If a query string
is ordinary (contains one element only), then the time windows in
the first nonempty sub-function’s output are passed to a final
function that converts the time windows into the XML format
required by the task. For compound queries, plausible sequences of
time windows for the component query elements are merged before
passing to the final syntax-conversion function.

2.2 Example Output for Three Sub-Functions
Figure 1 contains example output for the sub-functions
harmonic-interval-of-a (arrows indicate retrieved harmonic
seconds), HarmAn->Roman (functional harmonic labels below each
staff), and cadence-time-intervals (three perfect cadences
surrounded by black boxes). All three functions involve
implementing and extending MIR/music-psychology algorithms to
achieve promising results, especially for the higher-level
music-theoretical concepts such as functional harmonies and
cadences.

3. RESULTS AND DISCUSSION
Figure 2 shows recall and precision results for the Stravinsqi
algorithm on the 2014 C@merata task. The measure metrics
reward an algorithm’s output if it is in the same bar/measure as a
ground-truth item, whereas the beat metrics require an algorithm’s
output to be in the same bar and on the same beat as a ground-
truth item. The mean category in Figure 2 shows the overall
results, with Stravinsqi having recall .91 and precision .46 at the
measure level, and recall .87 and precision .44 at the beat level.³
Stravinsqi’s strong performance on the first eight of twelve
categories (pitch, duration,…, melodic interval) is encouraging, as
is the small decrease in recall (.91 to .87) and precision (.46 to
.44) with the change from measure- to beat-level granularity. The
drop in precision for compound queries is due to over-lenient
criteria used to select and combine time intervals for the different
elements that comprise a compound query. This can be fixed in
future work. For triad labelling, Stravinsqi suffered from an
under-labelling issue in two instances, missing two first-inversion
triads because the same triad in root position preceded them, and
the two triads got one root-position label. The triad and texture
categories are somewhat underrepresented in the training and test
data, and so more attention ought to be given to these categories
in future. Less-than-perfect performance on melodic and harmonic
interval questions can be attributed to inconsistencies between the
task description/training collection, and test collection.

Figure 2. Results of the Stravinsqi-Jun2014 algorithm on the
MediaEval 2014 C@merata task. Overall results are indicated by the
mean label, and followed by results for twelve question categories.

4. CONCLUSION
Algorithms that perform strongly on the C@merata task open up
new, interesting potential applications in music education and
musicological research. The Stravinsqi algorithm described above
is one such strong performer, and has effectively solved seven of
the twelve C@merata task categories shown in Figure 2 (pitch,
duration, pitch and duration, articulation, voice specific, lyrics,
and melodic interval). As for the remaining five categories, future
work will involve bug fixes, resolving task inconsistencies, and
acquiring more data for cadence and texture query categories. It
may also be helpful to have two experts provide annotations for
the higher-level music-theoretic concepts. The addition of new,
higher-level music-theoretic query categories would be welcome
in future iterations of C@merata as well, in order to keep the task
at the forefront of research in music computing.

5. REFERENCES
[1] Collins, T. 2011. Improved Methods for Pattern Discovery in
    Music, with Applications in Automated Stylistic Composition.
    Doctoral Thesis. Faculty of Mathematics, Computing and
    Technology, The Open University.
[2] Krumhansl, C. L. 1990. Cognitive Foundations of Musical
    Pitch. Oxford University Press, New York, NY.
[3] Meredith, D., Lemström, K., and Wiggins, G. A. 2002.
    Algorithms for discovering repeated patterns in
    multidimensional representations of polyphonic music.
    J. New Music Res. 31, 4, 321-345.
[4] Pardo, B., and Birmingham, W. P. 2002. Algorithms for
    chordal analysis. Comput. Music J. 26, 2, 27-49.
[5] Sapp, C. S. 2005. Visual hierarchical key analysis. ACM
    Computers in Entertainment. 3, 4, 1-19.
[6] Sutcliffe, R., Crawford, T., Fox, C., Root, D. L., and Hovy, E.
    2014. The C@merata task at MediaEval 2014: natural
    language queries on classical music scores. In MediaEval
    2014 Workshop, Barcelona, Spain, October 16-17 2014.

¹ http://www.tomcollinsresearch.net
² http://extras.humdrum.org/bin/osxintel64/
³ Stravinsqi is labelled DMUN03 in the task overview paper [6].
  The other runs DMUN01 and DMUN02 are not remarkable:
  incorrect bar numbers in four pieces (see, for instance, the cross
  and correction in Figure 1) and xml2hum conversion caused
  issues in DMUN01 and DMUN02.

Copyright is held by the author/owner(s).
MediaEval 2014 Workshop, October 16-17, 2014, Barcelona, Spain.
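
The query handling described in Section 2.1 can be illustrated with a
small Python sketch. It is not the MCStylistic code, which is written in
Common Lisp. The splitting rule (break on the word "followed"), the two
toy sub-functions, the name tables, and the example notes are all
assumptions made for illustration; only the five-dimensional note point
and the "first nonempty sub-function output" rule come from the text.
Note that this naive split keeps the article in "a tonic triad", whereas
the paper's example lists the element as "tonic triad".

```python
import re
from typing import Callable, NamedTuple


class Note(NamedTuple):
    """One point of the note point set (five dimensions, as in Section 2.1)."""
    ontime: float     # onset, counted in crotchet beats from the start
    midi: int         # MIDI note number
    morphetic: int    # morphetic pitch number (values below are illustrative)
    duration: float   # duration in crotchet beats
    staff: int        # numeric staff number


TimeWindow = tuple[float, float]
SubFunction = Callable[[str, list[Note]], list[TimeWindow]]

# Hypothetical lookup tables covering only what the example needs.
PITCH_TO_MIDI = {"C4": 60, "D4": 62, "E4": 64}
DURATION_WORDS = {"quaver": 0.5, "crotchet": 1.0, "minim": 2.0}


def split_compound_query(query: str) -> list[str]:
    """Split a query on the word 'followed' (assumed rule, not the paper's)."""
    parts = re.split(r"\s+followed\s+", query.strip())
    # Drop the 'by' that introduces an element, as in 'followed by a C'.
    return [re.sub(r"^by\s+", "", part) for part in parts]


def duration_time_intervals(element: str, notes: list[Note]) -> list[TimeWindow]:
    """Toy sub-function: windows of notes whose duration is named in the element."""
    for word, beats in DURATION_WORDS.items():
        if word in element:
            return [(n.ontime, n.ontime + n.duration)
                    for n in notes if n.duration == beats]
    return []


def pitch_time_intervals(element: str, notes: list[Note]) -> list[TimeWindow]:
    """Toy sub-function: windows of notes whose pitch is named in the element."""
    for name, midi in PITCH_TO_MIDI.items():
        if name in element:
            return [(n.ontime, n.ontime + n.duration)
                    for n in notes if n.midi == midi]
    return []


def answer_ordinary_query(element: str, notes: list[Note],
                          sub_functions: list[SubFunction]) -> list[TimeWindow]:
    """Return the output of the first sub-function that finds anything."""
    for sub_function in sub_functions:
        windows = sub_function(element, notes)
        if windows:
            return windows
    return []


NOTES = [
    Note(0.0, 60, 60, 1.0, 0),
    Note(1.0, 62, 61, 2.0, 0),
    Note(3.0, 62, 61, 1.0, 1),
]
SUB_FUNCTIONS = [duration_time_intervals, pitch_time_intervals]

if __name__ == "__main__":
    print(split_compound_query(
        "a Bb followed a bar later by a C followed by a tonic triad"))
    print(answer_ordinary_query("D4", NOTES, SUB_FUNCTIONS))
```

Trying sub-functions in a fixed order and stopping at the first nonempty
result keeps ordinary queries cheap, but it also means the order of the
list decides which interpretation of an ambiguous query wins.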

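The difference between the measure and beat metrics discussed in
Section 3 can be sketched in Python as well. This is a simplified
reading, not the official scoring procedure, which is defined in the
task overview [6]: each answer is reduced here to a hypothetical
(bar, beat) pair, and the example answers are invented for
illustration.

```python
Answer = tuple[int, int]  # (bar number, beat within the bar)


def precision_recall(system: list[Answer], truth: list[Answer],
                     level: str = "beat") -> tuple[float, float]:
    """Precision and recall at 'measure' or 'beat' granularity.

    At measure level an answer matches if it falls in the same bar as a
    ground-truth item; at beat level the bar and the beat must both agree.
    """
    if level not in ("measure", "beat"):
        raise ValueError("level must be 'measure' or 'beat'")

    def key(answer: Answer):
        # Compare whole (bar, beat) pairs, or bars only.
        return answer if level == "beat" else answer[0]

    truth_keys = {key(t) for t in truth}
    system_keys = {key(s) for s in system}
    # Precision: share of system answers that hit some ground-truth item.
    correct = sum(1 for s in system if key(s) in truth_keys)
    # Recall: share of ground-truth items hit by some system answer.
    found = sum(1 for t in truth if key(t) in system_keys)
    precision = correct / len(system) if system else 0.0
    recall = found / len(truth) if truth else 0.0
    return precision, recall


# Invented example: two expert answers, four system answers.
TRUTH = [(1, 1), (4, 3)]
SYSTEM = [(1, 1), (4, 1), (7, 2), (9, 1)]

if __name__ == "__main__":
    for level in ("measure", "beat"):
        print(level, precision_recall(SYSTEM, TRUTH, level))
```

In this toy case the answer in bar 4 counts at the measure level but not
at the beat level, which is why both figures drop when moving to the
stricter granularity.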
</pre>