<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">TCSL at the MediaEval 2014 C@merata Task</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Nikhil</forename><surname>Kini</surname></persName>
							<email>nikhil.kini@tcs.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Tata Consultancy Services Ltd. Innovation Labs</orgName>
								<address>
									<settlement>Thane</settlement>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">TCSL at the MediaEval 2014 C@merata Task</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">72329D0119945F9D2CA694A50C9A5F83</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T16:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>We describe a system that addresses the MediaEval 2014 C@merata task of answering natural language queries on classical music scores. Our system first tokenizes the question, tagging its musically relevant features using pattern matching; at this stage, words in the question are also replaced according to a list of synonyms. From the tokenized sentence we infer the question type using a set of handwritten rules, and then search the input music score, according to that question type, for the requested musical features. MIT's music21 library [2] is used for indexing, accessing and traversing the score.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>For those who study music scores, especially in the field of musicology, it is often necessary to search for, or refer to, particular sections of a score that contain relevant musical features, much as a data scientist might look for interesting patterns in an abundance of data. Going through a score manually is inefficient, prone to oversight, and requires the expert knowledge needed to actually read the score. This paper develops specifications for tools that automate the search and retrieval of musical passages from a score using natural language queries. The problem may be defined as follows: given a computer representation of music as a score in a particular format (in our case, MusicXML), and given a short English noun phrase referring to musical features in the score, find and list the locations of all occurrences of those musical features in the score. A complete description of the task can be found in <ref type="bibr" target="#b1">[1]</ref>.</p><p>While a large body of work exists on natural language understanding, as well as on searching through sheet music scores, we did not come across any work that combines the two. A survey of natural language understanding systems may be found in <ref type="bibr" target="#b4">[3]</ref>, and work on non-trivial search over sheet music scores in <ref type="bibr" target="#b5">[4]</ref>, <ref type="bibr" target="#b6">[5]</ref>, <ref type="bibr">[6]</ref>, <ref type="bibr" target="#b8">[7]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">APPROACH</head><p>Figure <ref type="figure" target="#fig_0">1</ref> presents the main modules of our system. Since we treat the problem as one of natural language understanding (of the question) followed by search (through the MusicXML), we define a set of question classes based on the searchable musical features and propose a specific search method for each class. The main operations performed by our system are as follows:</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Identifying tokens in the question</head><p>In the tokenizing step, words representing musically important features are marked with tokens. We use 3- or 4-letter markers for the token classes. After tokenization, the sentence contains (token, value) pairs: each pair is enclosed in parentheses, with the token and its value separated by a comma. For example, "quarter note then half note then quarter note in the tenor voice" is output as "(DUR, quarter note) (SEQ, then) (DUR, half note) (SEQ, then) (DUR, quarter note) in the (PRT, tenor voice)". As another example, "melodic octave" becomes "(HRML, melodic) (INT, octave)".</p></div>
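The tokenizing step can be sketched as a pass of pattern substitutions. The marker names below follow the examples in the text, but the pattern list itself is illustrative, not the system's full set:

```python
import re

# Illustrative (not exhaustive) patterns for a few token classes; the
# marker names follow the examples in the text.
TOKEN_PATTERNS = [
    ("DUR", r"(quarter|half|whole|eighth) note"),
    ("SEQ", r"then|followed by"),
    ("PRT", r"(soprano|alto|tenor|bass) voice"),
    ("HRML", r"melodic|harmonic"),
    ("INT", r"octave|third|fifth"),
]

def tokenize(question):
    """Replace each recognised phrase with a (MARKER, value) pair."""
    for marker, pattern in TOKEN_PATTERNS:
        question = re.sub(pattern, lambda m: f"({marker}, {m.group(0)})", question)
    return question

print(tokenize("melodic octave"))
# (HRML, melodic) (INT, octave)
```

Applying the patterns in a fixed order keeps longer multi-word phrases (such as durations) from being broken up by shorter ones.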
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Synonyms List</head><p>A list of synonyms is consulted during tokenizing to substitute words that refer to the same feature. This serves two purposes: 1) covering the different ways of asking for the same feature, and 2) standardizing those different ways so that specifying the subsequent modules becomes simpler. The synonyms list can be updated as new ways of asking for a feature are discovered once users actually query the system.</p></div>
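As a sketch, the substitution can be a table lookup applied before tokenizing. The entries below (British note-duration names) illustrate the kind of synonyms covered; they are not the actual list:

```python
# Example synonym entries (British note-duration names); the real list is
# larger and grows as new phrasings are observed in user queries.
SYNONYMS = {
    "crotchet": "quarter note",
    "minim": "half note",
    "quaver": "eighth note",
    "semibreve": "whole note",
}

def normalise(question):
    """Rewrite every known variant to its canonical form."""
    for variant, canonical in SYNONYMS.items():
        question = question.replace(variant, canonical)
    return question

print(normalise("crotchet then minim"))
# quarter note then half note
```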
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Inferring the question type</head><p>The tokenized output (with synonym substitutions applied) is the input to the module that infers the question type. A handcrafted set of rules is used to guess which type of question is being asked, based on the constituent tokens (see section 2.4). Looking at all the questions available to us so far (task description, training set, test set), we specify the following question types: simple note, note with expression, interval (harmonic), interval (melodic), lyrics, extrema (highest or lowest note), time signature, key signature, cadence, triads, texture, bar with dynamics, consecutive notes, and combinations of the above.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Question rules</head><p>Based on the tokens present in the question phrase, we can write rules to guess the type of the question. For simple phrases containing only one elementary question type, this is straightforward. For phrases that combine several elementary question types, some parsing capability may be necessary; we will address this in future work.</p></div>
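A minimal sketch of such rules, assuming the token markers of section 2.1 (the rule set shown is hypothetical; the system's handwritten rules are more numerous):

```python
# Hypothetical rules: each maps a required set of token markers to a
# question type; the first rule whose markers are all present wins, so
# more specific rules are listed before more general ones.
RULES = [
    ({"HRML", "INT"}, "interval"),
    ({"DUR", "SEQ"}, "consecutive notes"),
    ({"DUR"}, "simple note"),
]

def classify(markers):
    """Guess the question type from the markers found in the phrase."""
    present = set(markers)
    for required, qtype in RULES:
        if required.issubset(present):
            return qtype
    return "unclassified"

print(classify(["DUR", "SEQ", "DUR"]))
# consecutive notes
```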
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5">Search scope</head><p>An important part of getting the right answer is limiting the search scope. For example, in the question "A sharp in the Treble clef", we are not looking for just any A#, but specifically one in the treble clef. Our PRT and CLF tokens can be used to scope the search: we either search only within the indicated parts, or keep only those search results that fall within the scope.</p></div>
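The filtering variant can be sketched as a post-filter on search hits; the record shape used here is an assumption for illustration:

```python
# Keep only hits whose part matches the scope extracted from a PRT/CLF
# token; with no scope given, all hits pass through. The dict layout of
# a hit is assumed for this sketch.
def filter_by_scope(hits, part=None):
    if part is None:
        return hits
    return [h for h in hits if h["part"] == part]

hits = [
    {"pitch": "A#", "part": "Treble"},
    {"pitch": "A#", "part": "Bass"},
]
print(filter_by_scope(hits, part="Treble"))
# [{'pitch': 'A#', 'part': 'Treble'}]
```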
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.6">Searching for the answer</head><p>The last step is searching the MusicXML score for the identified token or token combination. This step is still a work in progress. We make extensive use of music21's capabilities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.7">Score index</head><p>This is a list of all the notes in the score, stored with the following information for each note: note name, note letter, accidental, pitch class, note octave, bar, offset, note length, part number, part id, and whether the entry is a rest or a note. (The terminology is as defined in music21.)</p></div>
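A record in this index might look as follows. This is a standalone sketch: in the actual system the fields are populated from music21's note and pitch attributes, whereas here one entry is filled in by hand:

```python
from dataclasses import dataclass

# One index entry per note, mirroring the fields listed above.
@dataclass
class NoteRecord:
    name: str          # e.g. "C#4"
    letter: str        # e.g. "C"
    accidental: str    # e.g. "sharp", or "" for none
    pitch_class: int   # 0-11
    octave: int
    bar: int
    offset: float      # position in the bar, in quarter lengths
    length: float      # duration in quarter lengths
    part_number: int
    part_id: str
    is_rest: bool

# A single hand-filled entry standing in for a full score index.
index = [NoteRecord("C#4", "C", "sharp", 1, 4, 1, 0.0, 1.0, 0, "Soprano", False)]

# Finding every C sharp then reduces to a scan over the index.
matches = [r for r in index if r.letter == "C" and r.accidental == "sharp"]
print(len(matches))
# 1
```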
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">RESULTS AND DISCUSSION</head><p>Upon release of the results, we saw that the organizers had also used a classification scheme for the questions. Reconciling the organizers' question types with ours, we found that, as far as the test questions go, we had all the possibilities covered. Our classes corresponding to the organizers' question types in Table <ref type="table" target="#tab_0">1</ref> are: 1, 2, 3: simple note; 4: simple note with expression; 5: simple note with a staff scope; 6: simple note with a lyrics scope; 7: consecutive notes; 8: interval (melodic); 9: interval (harmonic); 10: cadence; 11: triad; 12: texture.</p><p>Table <ref type="table" target="#tab_0">1</ref> shows beat and measure precision and recall scores produced by our system on the test set. The strongest performance is seen in the 'simple notes' category (simple pitch, simple length, pitch and length); this is no surprise, as these question phrases are the easiest to handle. Perf_spec questions are of the simple note with expression type (involving, for example, a mordent or a trill). Word_spec questions ask for a pitch/length occurring over a certain word in the lyrics. Although these two types were not handled by our implementation, some results were returned because the system fell back to the simple note type, which explains the non-zero precision and recall: for example, "F trill" returns all F notes. Followed_by is equivalent to our consecutive notes type, and melodic_interval is likewise a type in our system; the system performs decently on both.</p><p>Although search was not implemented for harmonic_interval, cadence_spec, triad_spec and texture_spec, nearly all questions of these types were correctly classified by our system. No answers were returned for them, which results in the zero scores seen in the table. Only 8 questions in the test data remained unclassified.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">CONCLUSION</head><p>The system implemented from the specifications in this paper performs decently on the retrieval of single musical features. A study of the errors in this implementation may even take the precision and recall for such simple types to 1, and this will be the aim of the next development cycle.</p><p>While our system performs well on simple question phrases, the more complex question phrases still need work. As questions grow more complicated and come to include multiple musical features, we will need to evolve a more sophisticated parsing strategy to identify them, and the system specification may need to be revisited to take all the possibilities into account. The scope of the specification is limited mainly to what we have observed in the task description and the training set, which are by no means exhaustive of the types of queries that can be asked.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1.</head><label>1</label><figDesc>Figure 1. System for natural language sheet music querying</figDesc><graphic coords="1,317.90,202.30,255.10,174.00" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1. Test set results</head><label>1</label><figDesc></figDesc><table><row><cell></cell><cell></cell><cell cols="2">Beat</cell><cell cols="2">Measure</cell></row><row><cell>#</cell><cell>Question type</cell><cell>P</cell><cell>R</cell><cell>P</cell><cell>R</cell></row><row><cell>1</cell><cell>simple_length</cell><cell>0.979</cell><cell>0.988</cell><cell>0.991</cell><cell>1</cell></row><row><cell>2</cell><cell>simple_pitch</cell><cell>0.959</cell><cell>0.963</cell><cell>0.982</cell><cell>0.986</cell></row><row><cell>3</cell><cell>pitch_and_length</cell><cell>0.723</cell><cell>0.892</cell><cell>0.754</cell><cell>0.93</cell></row><row><cell>4</cell><cell>stave_spec</cell><cell>0.661</cell><cell>0.987</cell><cell>0.661</cell><cell>0.987</cell></row><row><cell>5</cell><cell>melodic_interval</cell><cell>0.894</cell><cell>0.683</cell><cell>0.904</cell><cell>0.691</cell></row><row><cell>6</cell><cell>followed_by</cell><cell>0.733</cell><cell>0.688</cell><cell>0.842</cell><cell>0.789</cell></row><row><cell>7</cell><cell>word_spec</cell><cell>0.261</cell><cell>1</cell><cell>0.261</cell><cell>1</cell></row><row><cell>8</cell><cell>perf_spec</cell><cell>0.066</cell><cell>0.897</cell><cell>0.066</cell><cell>0.897</cell></row><row><cell>9</cell><cell>harmonic_interval</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>0</cell></row><row><cell>10</cell><cell>cadence_spec</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>0</cell></row><row><cell>11</cell><cell>triad_spec</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>0</cell></row><row><cell>12</cell><cell>texture_spec</cell><cell>0</cell><cell>0</cell><cell>0</cell><cell>0</cell></row><row><cell>13</cell><cell>all</cell><cell>0.633</cell><cell>0.821</cell><cell>0.652</cell><cell>0.845</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">ACKNOWLEDGMENTS</head><p>Many thanks to Dr. Sunil Kumar Kopparapu, my supervisor, for his help in shaping this paper.</p></div>
			</div>

			<div type="references">

				<listBibl>


<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The C@merata Task at MediaEval 2014: Natural language queries on classical music scores</title>
		<author>
			<persName><forename type="first">R</forename><surname>Sutcliffe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Crawford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Fox</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">L</forename><surname>Root</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hovy</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">MediaEval 2014 Workshop</title>
				<meeting><address><addrLine>Barcelona, Spain</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">October 16-17, 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">music21: A toolkit for computer-aided musicology and symbolic music data</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Cuthbert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ariza</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISMIR 2010</title>
				<meeting><address><addrLine>Utrecht, Netherlands</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010-08-09">August 9-13, 2010</date>
			<biblScope unit="page" from="637" to="642" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The Question Answering Systems: A Survey</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M N</forename><surname>Allam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">H</forename><surname>Haggag</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Research and Reviews in Information Sciences (IJRRIS)</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">3</biblScope>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Large data sets &amp; recommender systems: A feasible approach to learning music</title>
		<author>
			<persName><forename type="first">J</forename><surname>Gabriel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Sound and Music Computing Conference 2013</title>
				<meeting>the Sound and Music Computing Conference 2013<address><addrLine>SMC; Stockholm, Sweden</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="701" to="706" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S</forename><surname>Downie</surname></persName>
		</author>
		<title level="m">Evaluating a simple approach to music information retrieval: Conceiving melodic n-grams as text</title>
				<imprint>
			<date type="published" when="1999">1999</date>
		</imprint>
		<respStmt>
			<orgName>The University of Western Ontario</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">Doctoral dissertation</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Peachnote: Music Score Search and Analysis Platform</title>
		<author>
			<persName><forename type="first">V</forename><surname>Viro</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISMIR 2011</title>
				<meeting><address><addrLine>Miami, Florida, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2011-10-24">October 24-28, 2011</date>
			<biblScope unit="page" from="359" to="362" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Using XQuery on MusicXML Databases for Musicological Analysis</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ganseman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Scheunders</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>D'haes</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ISMIR 2008</title>
				<meeting><address><addrLine>Philadelphia, Pennsylvania, USA</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2008-09-14">September 14-18, 2008</date>
			<biblScope unit="page" from="433" to="438" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
