<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Predicting Affect in Music Using Regression Methods on Low Level Features</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rahul Gupta</string-name>
          <email>guptarah@usc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shrikanth Narayanan</string-name>
          <email>shri@sipi.usc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Signal Analysis and Interpretation Lab (SAIL), University of Southern California</institution>
          ,
          <addr-line>Los Angeles, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>14</fpage>
      <lpage>15</lpage>
      <abstract>
        <p>Music has been shown to impact the affective states of the listener. The emotion in music task at the MediaEval 2015 challenge focuses on predicting the affective dimensions of valence and arousal in music using low level features. In particular, this edition of the challenge involves prediction on full length songs given a training set containing shorter 30 second clips. We approach the problem as a regression task and test several regression algorithms. We had proposed these regression methods on the dataset from the previous edition of the same task (MediaEval 2014), which involved prediction on 30 second clips instead of full length songs. Through evaluation on the 2015 data set, we obtain a point of reference for the models' performance on longer song clips. Whereas our models perform relatively well in predicting arousal (root mean square error: .24), we do not obtain good results for valence prediction (root mean square error: .35). We analyze the results and the experimental setup and discuss plausible solutions for better prediction.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Music is an important part of media, and considerable
research has gone into understanding and indexing the music
signal [<xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>]. Music has been shown to impact the affective
states of listeners, and in-depth analysis of the relation
between music and affect can impact both the understanding
and the design of music. Over the past few years, the emotion
in music task at various MediaEval challenges [<xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>] has
provided a unified platform for understanding the affective
characteristics of music signals. The emotion in music task
at MediaEval 2015 [<xref ref-type="bibr" rid="ref5">5</xref>] provides a training set which is a
subset of the 2014 challenge data, with valence and arousal
annotations over 30 second clips. This subset was chosen for its better
quality annotations, as described in the overview paper [<xref ref-type="bibr" rid="ref5">5</xref>].
However, this edition is also unique in that the prediction
has to be made on a test set containing full length songs.
This poses the challenge of generalizing models trained on
shorter music segments for prediction on longer segments.
      </p>
      <p>
        In this work, we present results on affect prediction
in music using our previous models developed on the 2014
challenge data set. We tested multiple regression models
followed by a smoothing operation in last year's challenge [<xref ref-type="bibr" rid="ref6">6</xref>],
and more recently developed a Boosted Ensemble of Single
feature Filters (BESiF) algorithm [<xref ref-type="bibr" rid="ref7">7</xref>] for affect prediction
in music. In general, affective signals evolve smoothly
over time and do not undergo abrupt changes. Our models
take this factor into account by learning the mapping from
features to the affective dimensions while also accounting
for the smooth temporal evolution of affect. In the 2015
emotion in music task, our best models obtain root mean
square error values of .35 and .24 for valence and arousal
prediction, respectively. In the next section, we describe our
methodology in detail.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. METHODOLOGY</title>
      <p>
        The 2015 challenge task provides a development set
consisting of 30 second clips from 431 songs, annotated at a rate
of 2 frames per second. The baseline feature set is extracted
using openSMILE [<xref ref-type="bibr" rid="ref8">8</xref>] and contains 260 features. The test set
contains 58 full length songs annotated at the same frame
rate as the development set. We use three different
regression methods to predict the affective dimensions of valence
and arousal from the 260 baseline features. We describe
these methods below.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Linear Regression + Smoothing (LR+S)</title>
      <p>In this model, we use the 260 features and learn
separate linear regression models to predict arousal and valence.
After obtaining the decisions, we perform a smoothing
operation by low pass filtering the frame-wise arousal and
valence values. We use a moving average filter as the low pass
filter, with the filter length tuned using three fold inner cross
validation on the train set (arousal filter length = 13;
valence filter length = 38). The smoothing operation not only
removes high frequency noise, but also takes the local context
into account when making the decision for a frame: the decision
for a frame is given as an unweighted combination of the frame
values in a window centered around that frame, thereby
incorporating local context.</p>
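      <p>
        The following is a minimal sketch of the LR+S pipeline for one affective dimension, assuming frame-level feature matrices X and per-frame label vectors y; NumPy and scikit-learn here stand in for whatever toolkit was actually used.
      </p>
      <preformat>
import numpy as np
from sklearn.linear_model import LinearRegression

def predict_smoothed(X_train, y_train, X_test, filt_len):
    """Fit a linear regression, then low pass filter its frame-wise output."""
    model = LinearRegression().fit(X_train, y_train)
    raw = model.predict(X_test)
    kernel = np.ones(filt_len) / filt_len         # unweighted moving average
    return np.convolve(raw, kernel, mode="same")  # centered window = local context

# Tuned filter lengths from the text: 13 for arousal, 38 for valence.
# arousal_hat = predict_smoothed(X_tr, y_arousal_tr, X_te, filt_len=13)
# valence_hat = predict_smoothed(X_tr, y_valence_tr, X_te, filt_len=38)
      </preformat>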
    </sec>
    <sec id="sec-4">
      <title>Least Squares Boosting +</title>
    </sec>
    <sec id="sec-5">
      <title>Smoothing (LSB+S)</title>
      <p>
        Least squares boosting [<xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>] is another regression
algorithm, trained using gradient boosting [<xref ref-type="bibr" rid="ref9">9</xref>]. We use the
"fitensemble" function in Matlab to train a least squares
boosting model for predicting valence and arousal. The base
learners used for least squares boosting are regression trees
[<xref ref-type="bibr" rid="ref11">11</xref>]. The number of regression trees in the ensemble is tuned
using 3 fold cross-validation on the train set. After
obtaining the frame-wise decisions from the least squares boosting
algorithm, we perform a smoothing operation as explained
in Section 2.1.
      </p>
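      <p>
        A scikit-learn sketch of this step is shown below; GradientBoostingRegressor with squared-error loss is an open-source analogue of Matlab's "fitensemble" least squares boosting, not the code we actually ran, and the candidate tree counts are an assumed grid.
      </p>
      <preformat>
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

def fit_lsb(X_train, y_train):
    """Tune the number of regression trees with 3 fold cross-validation."""
    search = GridSearchCV(
        GradientBoostingRegressor(loss="squared_error"),  # least squares boosting
        param_grid={"n_estimators": [50, 100, 200, 400]}, # assumed candidate grid
        cv=3,
        scoring="neg_root_mean_squared_error",
    )
    return search.fit(X_train, y_train).best_estimator_

# Frame-wise predictions from the fitted model are then smoothed
# with the same moving average filter as in Section 2.1.
      </preformat>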
    </sec>
    <sec id="sec-5">
      <title>2.3 Boosted Ensemble of Single feature Filters (BESiF)</title>
      <p>
        We proposed another gradient boosting based algorithm
on the 2014 emotion in music data set [<xref ref-type="bibr" rid="ref7">7</xref>]. In this
algorithm, the base learners are filters (analogous to the
regression trees used in the LSB+S algorithm). The
motivation behind this algorithm is to perform a joint learning
of regression and smoothing, unlike the previous two methods.
The filters not only learn the mapping between the low level
features and the affective dimensions, but also perform
temporal smoothing. A detailed description of the training
algorithm can be found in [<xref ref-type="bibr" rid="ref7">7</xref>].
      </p>
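      <p>
        A conceptual sketch of one plausible BESiF-style training loop follows: each boosting round fits a least squares FIR filter from a single feature to the current residual and adds the best such filter to the ensemble. The filter fitting, the shrinkage factor, and the fixed filter length are assumptions made for illustration; the exact procedure is specified in [<xref ref-type="bibr" rid="ref7">7</xref>].
      </p>
      <preformat>
import numpy as np

def fit_fir(x, residual, L):
    """Least squares FIR filter of length L mapping feature x to the residual."""
    T = len(x)
    lagged = np.column_stack(
        [np.concatenate([np.zeros(k), x[:T - k]]) for k in range(L)]
    )  # causal lagged copies of the feature
    h, *_ = np.linalg.lstsq(lagged, residual, rcond=None)
    return h, lagged @ h

def besif(features, y, n_rounds=50, L=13, lr=0.1):
    """features: (T, D) frame-level matrix; y: (T,) affective annotation."""
    pred = np.zeros_like(y, dtype=float)
    ensemble = []
    for _ in range(n_rounds):
        residual = y - pred
        fits = [fit_fir(f, residual, L) for f in features.T]
        sse = [np.sum((residual - out) ** 2) for _, out in fits]
        best = int(np.argmin(sse))  # single feature whose filter helps most
        h, out = fits[best]
        ensemble.append((best, h))
        pred = pred + lr * out      # shrunken boosting update
    return ensemble, pred
      </preformat>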
    </sec>
    <sec id="sec-6">
      <title>2.4 Unweighted combination of LR+S, LSB+S and BESiF algorithms</title>
      <p>
        Our final model is an unweighted combination of the
previous three models. Unweighted combinations of models
have been shown to help prediction if and when the models
capture complementary information from the features [<xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>].
In the next section, we present our results and analysis.
      </p>
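      <p>
        In code, the fusion is just a frame-wise average of the three systems' outputs (the array names below are hypothetical):
      </p>
      <preformat>
# Unweighted fusion of the three systems' frame-wise predictions.
final_pred = (lr_s_pred + lsb_s_pred + besif_pred) / 3.0
      </preformat>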
    </sec>
    <sec id="sec-7">
      <title>3. RESULTS AND DISCUSSION</title>
      <p>
        We show the results from the four models presented above
in Table 1. From the results, we observe that our regression
approach fails for valence prediction, with close to no
correlation with the ground truth. As this was not the case
for at least the LR+S system in the previous edition of the
challenge (MediaEval 2014 [<xref ref-type="bibr" rid="ref4">4</xref>]), we suspect that there are
inherent differences between the data sets from MediaEval 2014
and 2015. As previously pointed out, this year's challenge
involved prediction over full length songs after training
on 30 second clips. This poses a data mismatch problem,
particularly with respect to our BESiF algorithm: the filters
in the algorithm are optimized over shorter time series,
whereas test set prediction is over longer time series.
      </p>
      <p>In the case of arousal, our systems perform relatively well,
with the linear regression system performing best. The BESiF
algorithm again fails to outperform the other algorithms,
primarily because of the data mismatch problem: when trained
on shorter duration annotation time series, the filters in the
BESiF algorithm may not capture the dynamics that can exist
over longer duration annotations. The success of linear
regression in arousal prediction offers some promise for
problems involving such temporal mismatch between the train
and test sets. In the next section, we discuss modifications
to our current approach to improve the results.</p>
    </sec>
    <sec id="sec-8">
      <title>4. FUTURE WORK</title>
      <p>
        Given that our systems do not perform well for valence
prediction, we aim to perform a detailed analysis to
understand the reasons behind the poor performance. Despite the
presence of features correlated with valence in the train set
and our success in the last edition of the challenge,
performance on valence prediction is low, which poses the
challenge of understanding prediction over longer song
segments. We suspect that providing annotators with short song
segments versus longer segments may have an impact on the
annotation itself: listening to longer clips may alter affective
perception and introduce other annotator biases. In
particular, we aim to investigate the performance of our BESiF
algorithm and modify it for the given problem setting. This
may involve including adaptation schemes [<xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>] to model
differences in annotation over the train and the test sets and
other mismatches that may exist.
      </p>
      <p>
        Also, several previous works have reported differences in
performance between arousal and valence prediction using
acoustic features similar to the ones used in this work [<xref ref-type="bibr" rid="ref16 ref17 ref18">16, 17, 18</xref>].
This is worth investigating, as it may imply that valence
prediction involves other features not considered in the
baseline feature set. In the case of continuous emotion
tracking involving human interaction, the video modality
has been shown to add complementary information and even
outperform audio signals [<xref ref-type="bibr" rid="ref18 ref19 ref20">18, 19, 20</xref>]. This poses a very interesting
problem for valence prediction in music, as the emotion
annotations are made using the music audio only. Whereas videos
can exist for certain songs, it has not been investigated
whether videos can be associated with, and even alter, the
perceived affective evolution of a song. Along similar lines,
several works propose the use of song lyrics in predicting affect [<xref ref-type="bibr" rid="ref21 ref22">21,
22</xref>]. Hence, the textual content of a song can also be
incorporated towards the development of an enhanced multi-modal
affect prediction system.
      </p>
    </sec>
    <sec id="sec-9">
      <title>5. CONCLUSION</title>
      <p>In this work, we apply several previously proposed
regression methods to the emotion in music task at the
MediaEval 2015 challenge. We note that despite our success in
the previous edition of the challenge, our methods fail,
particularly for valence prediction. Our methods perform
relatively well for arousal prediction; however, the trends in
performance across models are not as expected. We suspect that
there could be several reasons for the unexpected results.
Primarily, the difference in clip lengths between the train and
test sets could lead to a mismatched model for test set
prediction. We also suspect that it may cause differences in
the perception of affect in music, leading to differences in
affect annotation.</p>
      <p>Instead of providing answers about the relation between
low level features and affective dimensions, our work in this
paper opens up more questions regarding the affective evolution
of the music signal. With regards to future work, our initial
steps will be to study differences in the perception of short
versus longer clips of a music signal, differences between the
affective dimensions of valence and arousal with regards to
model development, and alternative algorithmic designs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Mira</given-names>
            <surname>Balaban</surname>
          </string-name>
          , Kemal Ebcioglu, and Otto E Laske.
          <article-title>Understanding music with ai: perspectives on music cognition</article-title>
          .
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Tanner</surname>
          </string-name>
          and
          <string-name>
            <given-names>Malcolm</given-names>
            <surname>Budd</surname>
          </string-name>
          .
          <article-title>Understanding music</article-title>
          .
          <source>Proceedings of the Aristotelian Society, Supplementary Volumes</source>
          , pages
          <volume>215</volume>
          {
          <fpage>248</fpage>
          ,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Mohammad</given-names>
            <surname>Soleymani</surname>
          </string-name>
          , Michael Caro, Erik Schmidt, and
          <string-name>
            <surname>Yi-Hsuan Yang</surname>
          </string-name>
          .
          <article-title>The mediaeval 2013 brave new task: Emotion in music</article-title>
          . In MediaEval 2013 Workshop, Barcelona, Spain,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Anna</given-names>
            <surname>Aljanaki</surname>
          </string-name>
          ,
          <string-name>
            <surname>Yi-Hsuan Yang</surname>
            , and
            <given-names>Mohammad</given-names>
          </string-name>
          <string-name>
            <surname>Soleymani</surname>
          </string-name>
          .
          <article-title>Emotion in music task at mediaeval 2014</article-title>
          . In MediaEval 2014 Workshop, Barcelona, Spain,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Anna</given-names>
            <surname>Aljanaki</surname>
          </string-name>
          ,
          <string-name>
            <surname>Yi-Hsuan Yang</surname>
            , and
            <given-names>Mohammad</given-names>
          </string-name>
          <string-name>
            <surname>Soleymani</surname>
          </string-name>
          .
          <article-title>Emotion in music task at mediaeval 2015</article-title>
          . In MediaEval 2015 Workshop, Wurzen, Germany,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Naveen</given-names>
            <surname>Kumar</surname>
          </string-name>
          , Rahul Gupta, Tanaya Guha, Colin Vaz, Maarten Van Segbroeck,
          <string-name>
            <surname>Jangwon Kim</surname>
          </string-name>
          , and
          <string-name>
            <surname>Shrikanth S Narayanan</surname>
          </string-name>
          .
          <article-title>A ective feature design and predicting continuous a ective dimensions from music</article-title>
          .
          <source>In MediaEval Workshop</source>
          , Barcelona, Spain,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Rahul</given-names>
            <surname>Gupta</surname>
          </string-name>
          , Naveen Kumar, and
          <string-name>
            <given-names>Shrikanth</given-names>
            <surname>Narayanan</surname>
          </string-name>
          .
          <article-title>A ect prediction in music using boosted ensemble of lters</article-title>
          .
          <source>In The 2015 European Signal Processing Conference</source>
          , Nice, France,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Florian</given-names>
            <surname>Eyben</surname>
          </string-name>
          ,
          <article-title>Martin Wollmer, and Bjorn Schuller. Opensmile: the munich versatile and fast open-source audio feature extractor</article-title>
          .
          <source>In Proceedings of the international conference on Multimedia</source>
          , pages
          <volume>1459</volume>
          {
          <fpage>1462</fpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Jerome</surname>
            <given-names>H</given-names>
          </string-name>
          <string-name>
            <surname>Friedman</surname>
          </string-name>
          .
          <article-title>Stochastic gradient boosting</article-title>
          .
          <source>Computational Statistics &amp; Data Analysis</source>
          ,
          <volume>38</volume>
          (
          <issue>4</issue>
          ):
          <volume>367</volume>
          {
          <fpage>378</fpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Gene</surname>
            <given-names>H</given-names>
          </string-name>
          <string-name>
            <surname>Golub and Charles F Van Loan</surname>
          </string-name>
          .
          <article-title>An analysis of the total least squares problem</article-title>
          .
          <source>SIAM Journal on Numerical Analysis</source>
          ,
          <volume>17</volume>
          (
          <issue>6</issue>
          ):
          <volume>883</volume>
          {
          <fpage>893</fpage>
          ,
          <year>1980</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Jane</surname>
            <given-names>Elith</given-names>
          </string-name>
          , John R Leathwick, and
          <string-name>
            <given-names>Trevor</given-names>
            <surname>Hastie</surname>
          </string-name>
          .
          <article-title>A working guide to boosted regression trees</article-title>
          .
          <source>Journal of Animal Ecology</source>
          ,
          <volume>77</volume>
          (
          <issue>4</issue>
          ):
          <volume>802</volume>
          {
          <fpage>813</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Thomas</surname>
            <given-names>G</given-names>
          </string-name>
          <string-name>
            <surname>Dietterich</surname>
          </string-name>
          .
          <article-title>Ensemble methods in machine learning</article-title>
          .
          <source>In Multiple classi er systems</source>
          , pages
          <fpage>1</fpage>
          <lpage>{</lpage>
          15. Springer,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Leo</given-names>
            <surname>Breiman</surname>
          </string-name>
          .
          <article-title>Bagging predictors</article-title>
          .
          <source>Machine learning</source>
          ,
          <volume>24</volume>
          (
          <issue>2</issue>
          ):
          <volume>123</volume>
          {
          <fpage>140</fpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>George</given-names>
            <surname>Foster</surname>
          </string-name>
          and
          <string-name>
            <given-names>Roland</given-names>
            <surname>Kuhn</surname>
          </string-name>
          .
          <article-title>Mixture-model adaptation for smt</article-title>
          .
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>WA</given-names>
            <surname>Ainsworth</surname>
          </string-name>
          .
          <article-title>Mechanisms of selective feature adaptation</article-title>
          .
          <source>Perception &amp; Psychophysics</source>
          ,
          <volume>21</volume>
          (
          <issue>4</issue>
          ):
          <volume>365</volume>
          {
          <fpage>370</fpage>
          ,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Mihalis</surname>
            <given-names>Nicolaou</given-names>
          </string-name>
          , Hatice Gunes,
          <string-name>
            <given-names>Maja</given-names>
            <surname>Pantic</surname>
          </string-name>
          , et al.
          <article-title>Continuous prediction of spontaneous a ect from multiple cues and modalities in valence-arousal space. A ective Computing</article-title>
          , IEEE Transactions on,
          <volume>2</volume>
          (
          <issue>2</issue>
          ):
          <volume>92</volume>
          {
          <fpage>105</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Angeliki</surname>
            <given-names>Metallinou</given-names>
          </string-name>
          , Athanasios Katsamanis, and
          <string-name>
            <given-names>Shrikanth</given-names>
            <surname>Narayanan</surname>
          </string-name>
          .
          <article-title>Tracking continuous emotional trends of participants during a ective dyadic interactions using body language and speech information</article-title>
          .
          <source>Image and Vision Computing</source>
          ,
          <volume>31</volume>
          (
          <issue>2</issue>
          ):
          <volume>137</volume>
          {
          <fpage>152</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Rahul</surname>
            <given-names>Gupta</given-names>
          </string-name>
          , Nikolaos Malandrakis, Bo Xiao, Tanaya Guha, Maarten Van Segbroeck,
          <string-name>
            <surname>Matthew Black</surname>
            , Alexandros Potamianos, and
            <given-names>Shrikanth</given-names>
          </string-name>
          <string-name>
            <surname>Narayanan</surname>
          </string-name>
          .
          <article-title>Multimodal prediction of a ective dimensions and depression in human-computer interactions</article-title>
          .
          <source>In Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge</source>
          , pages
          <volume>33</volume>
          {
          <fpage>40</fpage>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Michel</surname>
            <given-names>Valstar</given-names>
          </string-name>
          , Bjorn Schuller, Kirsty Smith,
          <string-name>
            <given-names>Timur</given-names>
            <surname>Almaev</surname>
          </string-name>
          , Florian Eyben, Jarek Krajewski, Roddy Cowie, and
          <string-name>
            <given-names>Maja</given-names>
            <surname>Pantic</surname>
          </string-name>
          .
          <source>Avec</source>
          <year>2014</year>
          :
          <article-title>3d dimensional a ect and depression recognition challenge</article-title>
          .
          <source>In Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge</source>
          , pages
          <volume>3</volume>
          {
          <fpage>10</fpage>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Vikramjit</surname>
            <given-names>Mitra</given-names>
          </string-name>
          , Elizabeth Shriberg, Mitchell McLaren,
          <string-name>
            <surname>Andreas Kathol</surname>
            , Colleen Richey, Dimitra Vergyri, and
            <given-names>Martin</given-names>
          </string-name>
          <string-name>
            <surname>Graciarena</surname>
          </string-name>
          .
          <article-title>The sri avec-2014 evaluation system</article-title>
          .
          <source>In Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge</source>
          , pages
          <volume>93</volume>
          {
          <fpage>101</fpage>
          . ACM,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S Omar</given-names>
            <surname>Ali and Zehra F Peynircioglu</surname>
          </string-name>
          .
          <article-title>Songs and emotions: are lyrics and melodies equal partners?</article-title>
          <source>Psychology of Music</source>
          ,
          <volume>34</volume>
          (
          <issue>4</issue>
          ):
          <volume>511</volume>
          {
          <fpage>534</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Youngmoo</surname>
            <given-names>E Kim</given-names>
          </string-name>
          , Erik M Schmidt,
          <string-name>
            <given-names>Raymond</given-names>
            <surname>Migneco</surname>
          </string-name>
          , Brandon G Morton,
          <article-title>Patrick Richardson, Je rey Scott, Jacquelin A Speck,</article-title>
          and
          <string-name>
            <given-names>Douglas</given-names>
            <surname>Turnbull</surname>
          </string-name>
          .
          <article-title>Music emotion recognition: A state of the art review</article-title>
          .
          <source>In Proc. ISMIR</source>
          , pages
          <volume>255</volume>
          {
          <fpage>266</fpage>
          .
          <string-name>
            <surname>Citeseer</surname>
          </string-name>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>