=Paper=
{{Paper
|id=Vol-1739/MediaEval_2016_paper_57
|storemode=property
|title=The KIAM System in the C@merata Task at MediaEval 2016
|pdfUrl=https://ceur-ws.org/Vol-1739/MediaEval_2016_paper_57.pdf
|volume=Vol-1739
|dblpUrl=https://dblp.org/rec/conf/mediaeval/Mytrova16
}}
==The KIAM System in the C@merata Task at MediaEval 2016==
The KIAM System in the C@merata Task at MediaEval 2016 Marina Mytrova Keldysh Institute of Applied Mathematics Russian Academy of Sciences Moscow, Russia mytrova@keldysh.ru ABSTRACT The KIAM system is an attempt to solve the C@merata task at MediaEval 2016 [4] using regular expressions. The aim of the KIAM system is to try the suitability of regular expressions for text recognition in a limited dictionary. The system answers natural language queries over musical scores. The C@merata project has existed since 2014 and there are 200 questions each year which a participant's system has to answer automatically. A question consists of a short noun phrase in English referring to a musical feature in the music score like "three consecutive thirds in the left hand in measures 22-30" together with a classical music score in MusicXML [2]. The required answer to this question is a set of one or more passages specifying exact places in the score relevant with input string. 1. INTRODUCTION The key aim of the C@merata evaluations is to apply Question Answering to Music Information Retrieval. The input is a short natural language query but it is asked against a music score rather than a document. Queries may be composed of • melodic elements (e.g. note, number of notes, note sequence or melodic interval), • harmonic elements (harmonic interval, chord or triad), The phrase in thetag is a natural language query. • performance styles, Attributes of the tag are the answer. It shows the • qualifications by instrument/clef/time signature/key, location in the given MusicXML score of the requested musical • qualifications by section/bar/beat/number. feature. Elements can follow one another or be synchronous with one 2. APPROACH another. Also, queries may contain information about texture, cadence and counterpoint. 2.1 Description The required answer (passage) is the location in the score of The vocabulary of musical terms is limited and the the requested musical feature. A passage consists of a start point specification of queries in the 2016 task contains many examples and an end point in the score associated with the question. The of its components. It can be used to construct a large amount of passage is specified as follows: regular expressions. So, regular expressions were chosen for query recognition, following the model of the UNLP work [1] in • a start time signature, the 2015 task [5]. The approach consists of two parts: query • an end time signature, recognition and music passage search. This is similar to UNLP • a start divisions value, but the KIAM system contains different regular expression for • an end divisions value, each example of query components. • a start beat, The diagram below summarizes the approach which can be • an end beat. summarised as follows: The XML file containing the C@merata task is processed by XML-processor to get the question string; The XML formats for the training data and run submissions The question string goes to input of the query recognizer; There it for C@merata 2016 task were as shown below: passes through a large number of regular expressions; If no match is found, the operation is stopped and KIAM system does not create a tag; Otherwise the request is sent to the music Copyright is held by the author/owner(s). passage search engine. MediaEval 2016 Workshop, October 20-21, 2016, Amsterdam. • time signature, • number, • section, • clef, • beat. This paper will consider only the first two checks because these worked the best on the C@merata 2016 task data. To simplify the problem it is considered that this data (instrument and measures) can only occur at the end of string. It is also considered that only one instance of this data can occur in the string. All found data are stored in the resulting object. It is an object that stores all the data needed to find music entities in the MusicXML file. Each regular expression of the query recognizer corresponds The remaining part of the string is checked for compliance to a different function in the music passage searcher. So, with one of the two patterns: depending on the matched regular expression, it calls the appropriate function. This function takes as input a given • string contains 'followed by', MusicXML file and searches for the appearance of a certain tag • elements in the string are divided by commas. sequence. The music passage searcher returns the location in the score of the requested musical feature. Finally, this location is This information is also stored in the resulting object. converted into tag attributes. Melodic elements, coming one after the other, are considered Here is a diagram of query recognizer: separately in the next step. Originally, the KIAM system should have also been checking synchronous elements but this work was not completed in time for this year’s task. The last action is recognition of a note or rest. A note’s attributes, such as pitch, duration and accidentals, are saved in the resulting object. At the end, the resulting object sends the entity sequences to the music passage searcher. The music passage searcher processes a given MusicXML file. In the file it searches for occurrences of data received from the query recognizer. These occurrences pass through the XML-processor to get an answer XML file. 2.2 Regular Expressions Regular expressions used for question recognition are listed below. They can be divided into eight types of recognition: • instrument • measures • sequence ('followed by') • repeats • type of note/rest • dotted note definition • rest recognition • accidentals, pitch and octave Regular expressions are used in the order as listed above. If a regex matches the query, data received from the regular expression are saved into the resulting object. Then the matching string is removed from the query. Regular expressions are used in a specific sequence. So, there cannot be a situation where multiple regular expressions match a particular query. The regexes were derived manually from the C@merata 2016 task description. First of all it gets an input string from the XML-processor. The following examples describe what KIAM's function Then the end of the string is checked for the presence of one of returns using regular expressions in specific cases. the following data: • instrument, • measures, Instrument 11) '/(one|1|two|2|three|3|four|4|five|5|six|6|seven|7|eight|8|nine|9| 1) '/(?: in the| in) ten|10) ([A-Ga-g]#)s/' (first|1st|second|2nd|third|3rd|fourth|4th|fifth|5th|sixth|6th|seventh| 7th|eighth|8th|ninth|9th|tenth|10th) ([A-Za-z]+)$/' 12) '/(one|1|two|2|three|3|four|4|five|5|six|6|seven|7|eight|8|nine|9| ten|10) ([A-Ga-g]b)s/' 2) '/(?: in the| in) ([A-Za-z]+) part$/' 13) '/(one|1|two|2|three|3|four|4|five|5|six|6|seven|7|eight|8|nine|9| These regular expressions recognize such descriptions of musical ten|10) ([A-Ga-g])s/' instruments as: These regular expressions recognize repeats: • G followed by Eb in the viola part • (returns 'viola') quarter-note A repeated four times • dotted quarter note D6 in the first violin (returns an array of string 'quarter-note A' repeated four times), • (returns 'violin I') 5 B4s Measures (returns an array of string 'B4' repeated five times). 3) '/(?: in the| in) (?:measures|bars) ([0-9]+)[\s]*-[\s]*([0-9]+)$/' Type of note/rest 4) '/(?: in the| in) (?:measure|bar) ([0-9]+)$/' 14) '/(whole-note|whole note|whole|half-note|half note|half| quarter-note|quarter note|quarter|crotchet|eight-note|eight These regular expressions recognize information about measures note|eight)/' such as: This regex recognizes type of note or rest: • quarter-note rest in measures 1-5 • (returns an array of integers from 1 to 5) quarter-note rest • A#1 in bars 44-59 (returns 'quarter'), • (returns an array of integers from 44 to 59) dotted quarter note D6 Sequence (returns 'quarter'), 5) '/(.+) (?:followed by a|followed by) (.+)/' crotchet D6 (returns 'quarter'). 6) '/(.+),[\s]*(.+)/' Dotted note definition The first regex recognizes the string 'followed by', as in the following examples: 15) '/dotted/' G followed by Eb This regex checks if note is dotted: (returns an array of strings 'G' and 'Eb'), dotted quarter note D6 5 B4s followed by a C5 (returns true). (returns an array of strings '5 B4s' and 'C5'). Rest recognition The second regular expression recognizes a sequence of strings 16) '/rest/' separated by commas, as in the following examples: This regex recognizes a rest: Bb3, A3, G3, F3, E3 (returns an array of strings 'Bb3', 'A3', 'G3', 'F3' and 'E3'). quarter-note rest (returns true). Repeats 7) '/repeated (one|1|two|2|three|3|four|4|five|5|six|6|seven|7|eight Accidentals, pitch and octave |8|nine|9|ten|10) times/' 17) '/([A-Ga-g])#([1-9])/' 8) '/(one|1|two|2|three|3|four|4|five|5|six|6|seven|7|eight|8|nine|9|ten| 18) '/([A-Ga-g])b([1-9])/' 10) ([A-Ga-g]#[1-9])s/' 19) '/([A-Ga-g])([1-9])/' 9) '/(one|1|two|2|three|3|four|4|five|5|six|6|seven|7|eight|8|nine|9|ten| 10) ([A-Ga-g]b[1-9])s/' 20) '/([A-Ga-g])#/' 10) '/(one|1|two|2|three|3|four|4|five|5|six|6|seven|7|eight|8|nine|9| 21) '/([A-Ga-g])b/' ten|10) ([A-Ga-g][1-9])s/' 22) '/([A-Ga-g])/' These regular expressions recognize properties of notes: String 'Eb' matches regular expression #21: Eb '/([A-Ga-g])b/' (returns an array ['pitch' => 'E', 'octave' => 1, 'flat' => true]), So, the result object includes the the following information: Bb3 (returns an array ['pitch' => 'B', 'octave' => 3, 'flat' => true]), instrument — 'viola' measures — all A#1 first note — G (returns an array ['pitch' => 'A', 'octave' => 1, 'sharp' => true]). second note — E, flat The music passage searcher starts by searching the different 2.3 Search Functions parts of the score. After finding the viola part other parts are not Search functions correspond to the regular expressions considered. outlined above. First, an executed function searches an The searcher then goes through all the notes of the viola part. appropriate part. It is a part of an instrument received from the A note’s position is stored when the current pitch matches 'G'. query recognizer. If next note matches 'Eb', its position is stored as the end Next, a function searches for the correct measures of the part. point in the score associated with the question. Then, the music They are used in the next iteration. passage searcher returns the location in the score of the requested The search function then compares each note in the measures musical feature. In order to do this, the location is first converted with the first note of the list returned by the query recognizer. It is into tag attributes, according to the rules of the repeated until a match is found, whereupon, the corresponding C@merata task. note’s position is stored. It is considered to be a starting point in Otherwise, the search continues again from note 'G'. This is the score associated with the question. repeated until either a correct answer is found or the parts end. The following note of the list is then verified in the same way. Notes found in this way should follow directly after each 2.5 Implementation of the System other. The PHP programming language was used to implement the When a match is found for the last note in the list, the note’s system. It supports regular expressions without the use of position is again stored. It is the end point in the score associated additional libraries and also PHP has a built-in functionality to with the question. parse XML. In future, PHP will allow us to create a Web-interface The search continues until all passages associated with the for the KIAM system. question are found. 3. RESULTS AND DISCUSSION 2.4 Example The KIAM system was designed to try the suitability of Here is a worked example based on the regular expressions regular expressions for text recognition, working in a domain and general search algorithm outlined above. Consider the input witha limited vocabulary. question: The system was able to recognize pitch, octave and accidentals for each note. These cases were handled correctly if 'G followed by Eb in the viola part' references to notes in a query followed one after the other separated by commas or by 'followed by'. The end of the string matches regular expression #2: KIAM was also able to determine musical instruments and measures. '/(?: in the| in) ([A-Za-z]+) part$/' Regular expressions were created only for the simplest queries. So, KIAM recognized correctly only seven requests such Instrument 'viola' is saved in the result object. The question is as: now reduced to: • 'G followed by Eb in the viola part', 'G followed by Eb' • 'quarter-note rest in measures 1-5', • 'Bb3, A3, G3, F3, E3', None of the 'measures'-type regex matches the question string. • 'dotted quarter note D6 in the first violin', However, the string matches regular expression #5: • 'quarter-note A repeated four times', • '5 B4s followed by a C5', '/(.+) (?:followed by a|followed by) (.+)/' • 'A#1 in bars 44-59'. This regex divides the query string into two parts: Unfortunately, the C@merata questions were very difficult this year, so the results are not as good as we had hoped. The 'G', actual scores for the KIAM system for all questions presented in 'Eb' the table below: String 'G' matches regular expression #22: BP BR BF MP MR MF 0.194 0.011 0.021 0.613 0.035 0.066 '/([A-Ga-g])/' The best system this year was DMUN which score BF 0.070 - a very low figure reflecting the difficulty of the task. The second system was KIAM with BF 0.021 as shown in the table above. As would be expected, MF for KIAM was higher at 0.066 - generally it is easier to determine the correct measure than the exact beat in the measure. It is interesting that MP was 0.613 which means that for some queries, the correct measure was identified quite accurately. Considering the KIAM performance across different question types, the results for 1_melod, n_melod and follow queries are shown below. BP BR BF MP MR MF 1_melod 0.273 0.044 0.076 0.727 0.118 0.203 n_melod 0.000 0.000 0.000 0.333 0.021 0.040 follow 0.600 0.140 0.227 1.000 0.233 0.378 We can clearly see that 1_melod questions showed the best performance with BF 0.076. The MP was also the highest for these, showing that KIAM can find the measure for a note but not its exact beat. MP overall is high because it is high here, and in addition, KIAM did not answer many of the queries overall. For n_melod queries, the correct beat was not recognised for any query (BF 0) but the correct measure was in some cases (MF 0.040). Queries of type follow were complex in this task, but KIAM was able to tackle some of these, resulting in relatively high BF 0.227 and MF 0.378. 4. CONCLUSION This was our first year of particating at C@merata; it is an extremely complex task and it took as quite a while to get a basic system running, especially as we chose to work in php and not to use the Python Baseline System provided by the organisers. Our approach was based on regular expressions and this proved surprisingly succesful for the simpler queries. Next steps will include refining our regular expressions via a detailed analysis of the Gold Standard data, and combining this approach with others more suited to the more complex query types. 5. REFERENCES [1] Asooja, K., Ernala, S. K., & Buitelaar P. (2015). UNLP at the MediaEval 2015 C@merata Task. In Proceedings of the MediaEval 2015 Workshop, Dresden, Germany, September 14-15 2015. http://ceur-ws.org/Vol-1436/Paper86.pdf. [2] MusicXML http://www.musicxml.com/ [3] Mytrova, M. (2013). Music Information Retrieval Based On Automated Reasoning Methods. RuSSIR Young Scientist Conference, Kazan, Russia, September 16-20, 2013. [4] Sutcliffe, R. F. E., Collins, T., Hovy, E., Lewis, R., Fox, C., & Root, D. L. (2016). The C@merata task at MediaEval 2016: Natural Language Queries Derived from Exam Papers, Articles and Other Sources against Classical Music Scores in MusicXML. In Proceedings of the MediaEval 2016 Workshop, Amsterdam, The Netherlands, October 20-21 2016. http://ceur-ws.org. [5] Sutcliffe, R. F. E., Fox, C., Root, D. L., Hovy, E., & Lewis, R. (2015). The C@merata Task at MediaEval 2015: Natural language queries on classical music scores. In Proceedings of the MediaEval 2015 Workshop, Dresden, Germany, September 14-15 2015. http://ceur-ws.org/Vol- 1436/Paper12.pdf. [6] Tabolin, A., Mytrova, M., Korukhova, Y. (2012). Music Theory Ontology. Reasoning Web Summer School, Vienna University of Technology, Austria, September 3-8, 2012.