<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">LIA @ MediaEval 2013 MusiClef Task: A Combined Thematic and Acoustic Approach</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Mohamed</forename><surname>Morchid</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">LIA - University of Avignon, Avignon</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Richard</forename><surname>Dufour</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">LIA - University of Avignon, Avignon</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mohamed</forename><surname>Bouallegue</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">LIA - University of Avignon, Avignon</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Georges</forename><surname>Linarès</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">LIA - University of Avignon, Avignon</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Driss</forename><surname>Matrouf</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">LIA - University of Avignon, Avignon</orgName>
								<address>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">LIA @ MediaEval 2013 MusiClef Task: A Combined Thematic and Acoustic Approach</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">C702EDCFBF0F38EACB93D7308103B392</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-19T17:59+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper, we describe the LIA system proposed for the MediaEval 2013 Soundtrack task. The aim is to predict the most suitable soundtrack for a given TV commercial from a list of candidate songs. The organizers provide a development dataset including multimedia features. The initial assumption of the proposed system is that commercials which sell the same type of product also share the same music rhythm. A two-step system is proposed to select music for a commercial: first, find commercials with close subjects in order to determine the mean rhythm of this subset; then, extract from the candidate songs the ones that best correspond to this mean rhythm.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">INTRODUCTION</head><p>The success of a product or a service depends largely on the way it is presented. Companies therefore pay close attention to choosing the advertisement most likely to influence the customer's choice. Advertisers can use different media, such as newspapers, radio, TV or the Internet. In this context, they can exploit the audio channel by pairing the commercial with a song that attracts listeners. The choice of an appropriate song is therefore crucial and can determine the success of a product <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b1">2]</ref>.</p><p>For these reasons, the MediaEval 2013 Soundtrack task for commercials is both challenging and useful <ref type="bibr" target="#b2">[3]</ref>. The MusiClef task seeks to automate this process by taking into account both context- and content-based information about the video, the brand, and the music. The main difficulty of this task is to find the set of relevant features that best describes the most appropriate song for a video. We propose a hybrid approach that uses a set of features from textual and audio media.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">PROPOSED APPROACH</head><p>The proposed hybrid system is composed of two processes. The first one projects a TV commercial into a topic space to find a set of other commercials sharing close topics. A TV commercial from the test set is thus linked to the TV commercial from the development set sharing the closest topics.</p><p>As a result, each TV commercial from the test set is associated with a song extracted from the development data.</p><p>The second process uses audio features to find, in a list of candidate songs, the songs most similar to the one associated during the first step (see figure <ref type="figure">1</ref>). In detail, the development set D is composed of TV commercials C_d, each with a soundtrack S_d and a vector representation V_d, where d indexes the d-th TV commercial. In the same manner, the test set T is composed of TV commercials C_t, each with a vector representation V_t and a soundtrack S_t to predict among the candidate songs S_t^k:</p><formula xml:id="formula_0">D = \{C_d, V_d, S_d\}_{d=1,\dots,D}, \qquad T = \{C_t, V_t, S_t^k\}_{t=1,\dots,T}^{k=1,\dots,5000}<label>(1)</label></formula><p>A similarity score α_{d,t} (d = 1, ..., D; t = 1, ..., T) is then computed for each commercial C_d of the development set given a commercial C_t from the test set.</p><p>In the next sections, the topic space representation and the mapping of a commercial into this topic representation are described. Then, the computed similarity score is detailed. Finally, the soundtrack prediction process for a TV commercial is explained.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Topic representation of a TV Commercial</head><p>Let us consider a corpus D from the development set of TV commercials with a word vocabulary V = {w_1, ..., w_N} of size N. This corpus contains 10,724 Web pages related to the brands of the commercials contained in D. It is composed of 44,229,747 words for a vocabulary of 4,476,153 unique words. The topic representation is obtained using a Latent Dirichlet Allocation (LDA) <ref type="bibr" target="#b0">[1]</ref> approach. At the end of the LDA analysis, a topic space m of n topics is obtained with, for each topic z, the probability P(w|z) of each word w of V given z and, for the entire model m, the probability P(z|m) of each topic z given the model m. Each TV commercial from both the development and test sets is mapped into the topic space (see figure <ref type="figure" target="#fig_2">2</ref>).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Similarity measure</head><p>Each commercial has been mapped into the topic space to produce its vector representation. Then, commercials from the test set T that deal with the same subjects as commercials from the development set D are clustered. The cosine is used as a similarity measure:</p><formula xml:id="formula_2">\mathrm{cosine}(V_d, V_t) = \alpha_{d,t} = \frac{\sum_{i=1}^{n} V_d[i] \times V_t[i]}{\sqrt{\sum_{i=1}^{n} V_d[i]^2} \sqrt{\sum_{i=1}^{n} V_t[i]^2}}<label>(2)</label></formula></div>
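<div xmlns="http://www.tei-c.org/ns/1.0"><p>The cosine measure of equation (2) can be illustrated with a minimal Python sketch. The function name cosine_similarity, the handling of all-zero vectors and the toy topic vectors are assumptions made for illustration; they are not taken from the system described in this paper.</p>

```python
import math


def cosine_similarity(v_d, v_t):
    """Cosine between two topic vectors of equal length n, as in equation (2)."""
    if len(v_d) != len(v_t):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(v_d, v_t))
    norm_d = math.sqrt(sum(a * a for a in v_d))
    norm_t = math.sqrt(sum(b * b for b in v_t))
    if norm_d == 0.0 or norm_t == 0.0:
        # Assumption: an all-zero vector is treated as similar to nothing.
        return 0.0
    return dot / (norm_d * norm_t)


# Hypothetical topic distributions over n = 3 topics.
v_dev = [0.7, 0.2, 0.1]
v_test = [0.6, 0.3, 0.1]
alpha = cosine_similarity(v_dev, v_test)
print(alpha)
```

<p>Because topic vectors have non-negative components, the score lies between 0 and 1, and a larger value of alpha indicates closer topics.</p></div>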
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Rhythm pattern</head><p>The cosine measure, presented in the previous section, is also used to evaluate the similarity between a mean rhythm pattern vector S and all the candidate songs S_t^k of the test set.</p><p>In detail, each commercial from D is associated with a soundtrack that is represented by a rhythm pattern vector. In our experiments, the 10 rhythm features of the song are used (speed, percussion, periodicity, rhythm pattern, etc.). As a result, each commercial is represented by a rhythm pattern vector of size 58. From the subset of soundtracks of the l nearest commercials from D, a mean rhythm vector S is computed as:</p><formula xml:id="formula_3">S = \frac{1}{l} \sum_{d \in l} S_d</formula><p>Finally, the cosine measure between this mean rhythm vector S of the l nearest commercials from D and each candidate song S_t^k of the test set T, cosine(S, S_t^k), is used to select, among all the candidates, the 5 songs having the closest rhythm pattern.</p></div>
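<div xmlns="http://www.tei-c.org/ns/1.0"><p>The rhythm-based selection described in this section can be sketched as follows. This is only an illustration under stated assumptions: the names cosine, mean_vector and rank_candidates, the song identifiers and the 3-dimensional toy vectors are hypothetical (the paper uses rhythm vectors of size 58 and 5,000 candidate songs), and the way the l nearest commercials are chosen is left outside the sketch.</p>

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of the rhythm vectors of the l nearest commercials."""
    if not vectors:
        raise ValueError("at least one vector is required")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def rank_candidates(mean_rhythm, candidates, top_k=5):
    """Return the ids of the top_k candidate songs closest to mean_rhythm."""
    scored = sorted(
        candidates.items(),
        key=lambda item: cosine(mean_rhythm, item[1]),
        reverse=True,
    )
    return [song_id for song_id, _ in scored[:top_k]]


# Hypothetical rhythm vectors of the l = 2 nearest development commercials.
nearest = [[1.0, 0.0, 1.0], [3.0, 0.0, 1.0]]
# Hypothetical candidate songs of the test set.
candidates = {
    "song_a": [2.0, 0.0, 1.0],
    "song_b": [0.0, 1.0, 0.0],
    "song_c": [1.0, 1.0, 1.0],
}
mean_rhythm = mean_vector(nearest)
best = rank_candidates(mean_rhythm, candidates, top_k=2)
print(mean_rhythm, best)
```

<p>With top_k set to 5 and the full candidate list, the returned identifiers would form a ranked list of 5 songs for one test commercial.</p></div>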
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">EXPERIMENTS AND RESULTS</head><p>The proposed system is evaluated in the MediaEval 2013 MusiClef benchmark <ref type="bibr" target="#b3">[4]</ref>. The aim of this task is to predict, for each video in the test set, the most suitable soundtrack from 5,000 candidate songs. The dataset is split into 3 sets. The development set contains multimodal information on 392 commercials (various metadata, YouTube uploader comments, various audio features, video features, web pages and text features). The test set is a set of 55 videos, each of which should be associated with a song from the recommendation set of 5,000 soundtracks (30-second excerpts).</p><p>For each video in the test set, a ranked list of 5 candidate songs is proposed. The song predictions are evaluated manually using the Amazon Mechanical Turk platform. Three scores have been computed from our system output <ref type="bibr" target="#b3">[4]</ref>:</p><p>• First rank average score: 2.16</p><p>• Top 5 average score (arithmetic mean): 2.24</p><p>• Top 5 average score (harmonic mean, taking rank into account): 2.22</p><p>Considering that human judges rate the predicted songs from 1 (very poor) to 4 (very good), we can consider that our system is slightly better than the mean evaluation score (2), whatever the metric considered.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">CONCLUSION</head><p>In this paper, an automatic system to assign a soundtrack to a TV commercial has been proposed. This system combines two media: textual commercial content and audio rhythm pattern.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Global architecture of the proposed system.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head></head><label></label><figDesc>Let's consider a corpus D from the development set of TV commercials with a word vocabulary V = {w1, . . . , wN } of</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Mapping of a TV commercial in the topic space.</figDesc></figure>
		</body>
		<back>

			<div type="funding">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This work was funded by the SUMACC project supported by the French National Research Agency (ANR) under contract ANR-10-CORD-007.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Latent dirichlet allocation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Blei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="993" to="1022" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The effectiveness of music in television commercials</title>
		<author>
			<persName><forename type="first">C</forename><surname>Bullerjahn</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Food Preferences and Taste: Continuity and Change</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page">207</biblScope>
			<date type="published" when="1997">1997</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<title level="m" type="main">Music and advertising: The effect of music in television commercials on consumer attitudes</title>
		<author>
			<persName><forename type="first">N</forename><surname>Hoeberichts</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
	<note type="report_type">Bachelor Thesis</note>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Soundtrack Selection for Commercials</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">C S</forename><surname>Liem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Orio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Peeters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schedl</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2013">2013</date>
			<publisher>MediaEval</publisher>
		</imprint>
	</monogr>
	<note>MusiClef</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Consumer response to television commercials: The impact of involvement and background music on brand attitude formation</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">W</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Young</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Marketing Research</title>
		<imprint>
			<biblScope unit="page" from="11" to="24" />
			<date type="published" when="1986">1986</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
