=Paper=
{{Paper
|id=Vol-1739/MediaEval_2016_paper_56
|storemode=property
|title=DMUN: A Textual Interface for Content-Based Music Information Retrieval in the C@merata task for MediaEval 2016
|pdfUrl=https://ceur-ws.org/Vol-1739/MediaEval_2016_paper_56.pdf
|volume=Vol-1739
|dblpUrl=https://dblp.org/rec/conf/mediaeval/Katsiavalos16
}}
==DMUN: A Textual Interface for Content-Based Music Information Retrieval in the C@merata task for MediaEval 2016==
DMUN: A Textual Interface for Content-Based Music Information Retrieval in the C@merata task for MediaEval 2016

Andreas Katsiavalos
De Montfort University
Leicester, UK
andreas.katsiavalos@gmail.com

Copyright is held by the author/owner(s). MediaEval 2016 Workshop, October 20-21, 2016, Amsterdam.

ABSTRACT

This paper describes a text-based Question-Answering (QA) system for content-based music information retrieval (MIR) according to the C@merata task description [12, 13].

1. INTRODUCTION

Content-based search of music information is an active research area [4] with applications in education and in general musicological tasks. Apart from collections of music data such as KernScores [11], even traditional library catalogue servers can be searched based on their content [7]. To access these data and extract content-based information, we developed a text query parser that, given a sentence such as a C@merata question, generates a script for music operations. The script contains the music concepts and their relations as described in the query, but in a structured form, in such a way that workflows of specific music data operations are formed. A parser then reads the script and calls the corresponding functions from a framework we created on top of music21 [6]. The questions tested are a subset of 28 questions selected at random from the complete question set.

An overview of the query system is given in Section 2, with more detailed descriptions of important concepts and procedures. In Section 3 we present the results of the algorithm with a detailed description and discussion. Last, the conclusions are presented in Section 4.

2. APPROACH

2.1 Overview

Query parsing and music content operations are kept separate, and the only connection between them is through an intermediate layer. The three major components of this approach are:

• a query interpreter,
• the script language,
• a music information workflow interpreter.

The query interpreter resolves the query text into a script that describes a music information workflow. This is a layered process that requires hard-coded knowledge about valid query terms and types (see 2.2).

The script language consists of "information request" statements that are formed by the clauses "select", "from" and "where", with functionality similar to that of the Structured Query Language (SQL) (see 2.3).

The music information workflow interpreter connects the script with a set of music-related functions that are built on top of the music21 framework (see 2.4).

2.2 The Query Interpreter

The query interpreter is a class that is initialised with a "language" file that contains information about valid terms, their types, composite types, and composite type relations. Composite types are music concepts and will be referred to as entities. This file stores generic terms, but some values, e.g. the names of the parts, are extracted from the music data.

The terms of a query phrase can be:

• values,
• music concept/entity keywords, E,
• music concept/entity relation keywords, R.

For example, "dotted quarter note dominant 7th" is a chord entity. Entities are further categorized into "content" and "context" types. Although in the question set we tested the context entities are the parts and measures and the content entities are note, rest, chord and simultaneity, it is the relation keywords that define what is the search context and what is the target content. Relations enable the transformation of the query into a structured request by defining the context-content relation. The conditions are simply the entity attributes.

Some of the relation types that were identified in the tested question set are shown below (the "< >" symbol means any type of entity):

< > (" ") < >, <(duration, pitch, note, chord)>
< > ("followed by") < >, <(duration, pitch, note, chord)>
< > ("in", "in the") < >  (contextual and conditional)
< > ("of", "of a") < >
< > ("parallel")
< > ("repeated") < > ("time", "times")
< > ("between", "between the") < > ("and") < >
< > ("against", "only against") < >
…

The terms of the query phrase are processed in layers, starting by identifying the type of each one. Next, composite types and words are grouped into entities. After all the types are matched, the entity relations are identified. Last, the query is converted into an information request using "select-from-where" statements. The overall procedure is as follows (a sketch of the first two passes is given after the list):

1. Load the language file
2. Parse the query
   2.1 First pass: terms to types
   2.2 Second pass: type groups and relations
   2.3 Third pass: content and context identification
   2.4 Fourth pass: make information request
3. Run the information request script with the music framework
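To make the layered analysis concrete, the following is a minimal sketch of the first two passes, assuming a hypothetical, much-reduced "language" dictionary; the term lists, type names and the grouping rule are illustrative stand-ins, not the actual DMUN language file or parser.

import re

# Hypothetical, much-reduced "language" dictionary (illustration only).
LANGUAGE = {
    "duration": {"dotted", "quarter", "half", "eighth", "semibreve", "minim", "crotchet"},
    "pitch": re.compile(r"^[A-G][#b]?[0-9]$"),            # e.g. "Bb3"
    "entity": {"note", "rest", "chord", "measure", "measures", "part"},
    "relation": {"in", "of", "followed", "parallel", "repeated", "between", "against", "tied"},
}

def first_pass(query):
    """First pass: map each term of the query phrase to a primary type."""
    typed = []
    for term in query.split():
        if term in LANGUAGE["duration"]:
            typed.append((term, "duration"))
        elif LANGUAGE["pitch"].match(term):
            typed.append((term, "pitch"))
        elif term in LANGUAGE["entity"]:
            typed.append((term, "entity"))
        elif term in LANGUAGE["relation"]:
            typed.append((term, "relation"))
        else:
            typed.append((term, "value"))                 # fall back to a plain value
    return typed

def second_pass(typed):
    """Second pass: group adjacent non-relation terms into composite entities."""
    groups, current = [], []
    for term, ttype in typed:
        if ttype == "relation":                           # relation keywords split entity groups
            if current:
                groups.append(current)
            groups.append([(term, ttype)])
            current = []
        else:
            current.append((term, ttype))
    if current:
        groups.append(current)
    return groups

# "dotted quarter note" becomes one entity group, separated by the relation "in".
print(second_pass(first_pass("dotted quarter note in the harpsichord")))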
2.3 Information Request using a Script

After the query phrase analysis, a script that contains a structured information request is generated by converting the identified entities and their relations into a sequence of "select-from-where" statements.

#9 parallel thirds in measures 15-18

FROM CONTEXT:
  SELECT measures
  FROM parts.all WHERE 15 <= measure.number <= 18
SELECT CONTENT
  SELECT chords
  FROM CONTEXT WHERE chord.type IS third
WHERE (RELATION)
  parallel

#14 seven-note chord in the harpsichord

FROM CONTEXT:
  SELECT parts
  FROM parts.all WHERE part.name == "harpsichord"
SELECT CONTENT
  SELECT chords FROM CONTEXT
  WHERE chord.cardinality = 7

By ordering and nesting such statements, all the queries that were tested were successfully converted into this workflow representation (a sketch of one possible encoding is given at the end of this subsection).

The use of a "language" file is a way to pass knowledge to the system about how to parse phrases. It contains:

• value collections grouped in primary types
  ◦ e.g. 15-18 is of type range.int
• primitive types grouped in music concepts/entities
  ◦ "dotted quarter" is a duration entity
  ◦ "first inversion of a triad" is a chord entity
• relation definitions
  ◦ groups of entities
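The sketch below shows one possible encoding of such a nested information request, for query #14 above, as plain Python data plus a small driver. The dictionary keys, the request layout and the framework object are hypothetical illustrations of the select-from-where workflow, not the actual DMUN script format.

# Query #14, "seven-note chord in the harpsichord", as a nested request
# (hypothetical encoding of the select-from-where statements shown above).
request_14 = {
    "context": {                                  # FROM CONTEXT: narrow the search scope
        "select": "parts",
        "from": "parts.all",
        "where": {"part.name": "harpsichord"},
    },
    "content": {                                  # SELECT CONTENT: the target elements
        "select": "chords",
        "from": "context",
        "where": {"chord.cardinality": 7},
    },
    "relation": None,                             # e.g. "parallel" for query #9
}

def run_request(request, framework):
    """Resolve the context first, search it for the content, then apply the
    relation constraint (if any). `framework` stands for the music information
    workflow interpreter and is an assumption, not the real DMUN API."""
    scope = framework.select(request["context"])
    matches = framework.select(request["content"], scope=scope)
    if request["relation"] is not None:
        matches = framework.apply_relation(request["relation"], matches)
    return matches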
2.4 Music Content Extraction

The structured information request described in the previous section is parsed by a music information retrieval interpreter that compiles an executable music21 script, using music21 functions such as getElementsByClass() and a plethora of features of music21 elements to compare against. Operating within the music21 ontology, we can perform conditional part selection, measure selection based on range, and retrieve attribute values for basic elements such as note, rest and chord type.

One way to avoid over-analyzing the query into complicated information requests is to use more complex representations, such as note sequences (VIS) [1] or Directed Interval Classes [3], and to bypass low-level relations by transferring them to the representation.
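As an illustration of this compilation step, the sketch below executes query #14 directly against music21 (a real toolkit, so getElementsByClass() and the other calls exist as used); the score file name and the part-name test are assumptions made for the example.

from music21 import converter

score = converter.parse("piece.xml")              # placeholder score file

# Conditional part selection: FROM parts.all WHERE part.name == "harpsichord"
context = [p for p in score.parts
           if p.partName and "harpsichord" in p.partName.lower()]

# Measure selection based on range would use e.g. part.measures(15, 18),
# as in the context of query #9.

# Content selection: chords whose cardinality (number of pitches) is 7.
for part in context:
    for ch in part.recurse().getElementsByClass("Chord"):
        if len(ch.pitches) == 7:
            print(ch.measureNumber, ch.offset, ch)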
3. RESULTS AND DISCUSSION

These are preliminary results and the approach is under development. In the rest of this section we discuss how queries are resolved into information requests and the difficulties in the process.

#14 seven-note chord in the harpsichord
context : parts, condition : instrument
get type : chord
condition : cardinality value

Figure 1. Example query analysis

#3 octave leap in violin I
context : part, instrument type and number
get type : melodic interval, keyword "leap"
condition : interval value

#5 Bb3, A3, G3, F3, E3
note, con:seq:comma, note, con:seq:comma, note, con:seq:comma, note, con:seq:comma, note
context : complete piece ? separate parts ?
get type : pitch sequence

#9 parallel thirds in measures 15-18
con:relation, interval_type, con:where:in, key, int:comp:range
context : measures
get type : chords : condition : thirds parallel
condition : parallel

#10 authentic cadence in measures 14-18
cadence_type, key, con:where:in, key, int:comp:range
context : measures
get type : cadence
condition : cadence type

Figure 2. Text parsing examples of function calls

#18 consecutive sixths between the Altos and Basses in measures 73-80
con:temp_relation, num:position, con:selection:between_the, term, con:and, term, con:where:in, key, num:comp:range
context : measures, int-range
relation : between X and Y
X type : part
Y type : part
content : melodic sequence
condition : interval type

#22 flute dotted half note only against strings
term, duration:exp, duration, key, ?, con:temp_relation:against, ? (find the string parts?) general_polyphony, pitch, con:where:in_the, term, con:where:in, key, num:int, rule:direction
context : parts, instrument
type : duration, composite
relation : only_against
term : part group conditions > not empty ?

#29 flute, oboe and bassoon in unison in measures 1-56
term, term, con:and, term, connection:where-condition:in, interval_type, context:where:in, int:comp:range
context : measures
context : parts, the instruments
type : notes
condition : same notes

#33 semibreve tied to a minim in the Bass clef
duration, con:notation:tied:tied_to_a, duration, con:where:in_the, term, key=type
context : parts ? or measures ?
relation : tied_to
a type : duration
b type : duration

#44 four eighth notes in the bottom part
context : part, relative position
relation : sequence : number
type : note, conditions : duration

#63 C D E F D E C in semiquavers repeated after a semiquaver
context : all
relation : X repeated after Y
X type : sequence, type : pitch-class
X cond : duration
Y type : duration

#77 harmonic octave in the bass clef
context : measures, clef
type : harmonic interval

Notice the assumption made in defining the context: a bass clef can appear anywhere in the score, so it does not necessarily denote a complete part.

#86 whole-note unison E2 E3 E4
context : all parts
type : chord, from notes in all parts
condition : pitch content
condition : duration

#94 crotchet tied to crotchet
context : single parts
relation : X "tied to" Y
X type : duration
Y type : duration

#186 whole-note chord
context : single part ? all parts ?
type : chord

4. CONCLUSION

The C@merata task became very demanding this year; however, this approach seems promising. The use of the intermediate information level created space for interpretations and generally allowed operations aimed at language understanding. Natural language was avoided, but this approach seems to resemble natural language query patterns. Even if the query language stays within a limited dictionary and syntax, as long as it serves its purpose as an interface for information retrieval, it is worth attention.

The "segment ontology" [9] is an interesting idea. That work addresses a large part of the current approach's need for an ontology, and it provides implementations in the RDF/OWL language for knowledge representation.

5. REFERENCES

[1] Antila, C., & Cumming, J. (2014). The VIS framework: Analyzing counterpoint in large datasets. In Proceedings of the International Society for Music Information Retrieval Conference. Taipei, Taiwan.
[2] Arzt, A., Böck, S., & Widmer, G. (2012). Fast identification of piece and score position via symbolic fingerprinting. In Proceedings of the 13th International Society for Music Information Retrieval Conference (pp. 433–438). Porto, Portugal.
[3] Cambouropoulos, E., Katsiavalos, A., & Tsougras, C. (2013). Idiom-independent harmonic pattern recognition based on a novel chord transition representation. In Proceedings of the 3rd International Workshop on Folk Music Analysis. Amsterdam, Netherlands.
[4] Casey, M., Veltkamp, R., Goto, M., Leman, M., Rhodes, C., & Slaney, M. (2008). Content-based music information retrieval: Current directions and future challenges. Proceedings of the IEEE, 96(4), 668–696.
[5] Collins, T. (2014). Stravinsqi/De Montfort University at the C@merata 2014 task. In Proceedings of the C@merata Task at MediaEval 2014.
[6] Cuthbert, M. S., & Ariza, C. (2010). music21: A toolkit for computer-aided musicology and symbolic music data. In Proceedings of the International Society for Music Information Retrieval Conference (pp. 637–642). Utrecht, The Netherlands.
[7] Dovey, M. J. (2001). Adding content-based searching to a traditional music library catalogue server. In Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries (pp. 249–250). ACM.
[8] Downie, J. S., & Cunningham, S. J. (2002). Toward a theory of music information retrieval queries: System design implications.
[9] Fields, B., Page, K., De Roure, D., & Crawford, T. (2011). The segment ontology: Bridging music-generic and domain-specific. In 2011 IEEE International Conference on Multimedia and Expo (ICME) (pp. 1–6). IEEE.
[10] Lewis, D., Woodley, R., Forth, J., Rhodes, C., Wiggins, G., & others. (2011). Tools for music scholarship and their interactions: A case study.
[11] Sapp, C. S. (2005). Online database of scores in the Humdrum file format. In Proceedings of ISMIR (pp. 664–665).
[12] Sutcliffe, R. F. E., Fox, C., Root, D. L., Hovy, E., & Lewis, R. (2015). The C@merata task at MediaEval 2015: Natural language queries on classical music scores. In Proceedings of the MediaEval 2015 Workshop, Wurzen, Germany, September 14-15, 2015.
[13] Sutcliffe, R. F. E., Fox, C., Root, D. L., Hovy, E., & Lewis, R. (2015). Second shared evaluation of natural language queries against classical music scores: A full description of the C@merata 2015 task. In Proceedings of the C@merata Task at MediaEval 2015. http://csee.essex.ac.uk/camerata/