<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>mender Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gregor Meehan</string-name>
          <email>gregor.meehan@qmul.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johan Pauwels</string-name>
          <email>j.pauwels@qmul.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Queen Mary University of London</institution>,
          <addr-line>327 Mile End Rd, London E1 4NS</addr-line>,
          <country country="UK">United Kingdom</country>
        </aff>
      </contrib-group>
      <kwd-group>
        <kwd>Music Recommender Systems</kwd>
        <kwd>Artist Fairness</kwd>
        <kwd>Recommender System Evaluation</kwd>
      </kwd-group>
      <abstract>
        <p>Many modern research works in the field of music recommender systems (MRSs) evaluate performance by song ranking accuracy in offline data splits. Although this paradigm matches widely adopted methodology in the broader recommender systems research community, in this work we argue that it neglects key considerations relating to musical artists. In particular, we show that there are significant differences in the ability of MRSs to successfully predict songs by artists which are known to the user and to predict those by artists which the user has never listened to before. Through analysis of content-based, collaborative filtering, and hybrid MRSs in warm and cold settings, we illustrate that this discrepancy can lead to misleading conclusions about model capability, especially given that successful recommendations of new artists are particularly valuable to the MRS user experience and to producing fairer outcomes for less popular artists. To highlight this issue, we demonstrate that a simple heuristic method based only on personalized artist filtering can achieve the strongest performance according to standard evaluation protocol. We then describe a novel MRS evaluation scheme which accounts for a user's artist interaction history, allowing for more nuanced analysis of MRS predictive capability. Finally, we provide an illustrative example of how this method can be applied to MRSs which incorporate artist metadata.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        As the providers of musical content, artists are key stakeholders in digital music platforms [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. While
most public music interaction datasets include artist metadata (e.g. Last.fm listening histories [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]
or streaming platform playlist data [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]), there is no single strategy in existing music recommender
system (MRS) research for incorporating this information into model design and evaluation. Some
studies focus on artist-level recommendation [
        <xref ref-type="bibr" rid="ref10 ref11 ref7 ref8 ref9">7, 8, 9, 10, 11</xref>
        ] or similarity [
        <xref ref-type="bibr" rid="ref12 ref13 ref14 ref15 ref16">12, 13, 14, 15, 16</xref>
        ] as their
primary MRS task. However, most MRS works address song-level recommendation: while some leverage
artist metadata to learn improved representations of musical audio for content-based MRS [
        <xref ref-type="bibr" rid="ref17 ref18 ref19">17, 18, 19</xref>
        ],
enrich song representations [20, 21], or add artist nodes to MRS knowledge graphs [22, 23, 24], many
(e.g. [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]) do not consider artist information at all.
      </p>
      <p>Furthermore, even the song-level MRS studies that integrate artist data into their model design
or training regime rarely consider it explicitly in their evaluation protocols. In MRS research, as in
recommender systems literature more generally [36], the prevalent paradigm for model evaluation is
measuring accuracy in offline scenarios; other than a handful of works which conduct user studies
[27, 29], this is true for all of the papers referenced above. This method is widely adopted because
it facilitates easier comparison of different MRS methods, especially in academic research settings
where access to resources or large user populations for user studies or online testing may be limited.
In song-level MRSs, offline evaluation involves splitting either songs or user-song interactions into
disjoint training, validation, and test sets. After training, models are evaluated by their ability to
rank songs which a user might be interested in, with metrics such as recall and precision calculated
based on the held-out interactions from the validation or test set. This data splitting is typically done
entirely at random, although some works take a temporal approach [37], e.g. using the most recent
10% of interaction data as a test set [31].</p>
      <p>
        Artist-related data leakage during this process has long been
recognized as a potential pitfall in MRS [
        <xref ref-type="bibr" rid="ref7 ref38">7, 38</xref>
        ] as well as in other music information retrieval tasks such
as genre classification [39, 40]. However, with a few exceptions [
        <xref ref-type="bibr" rid="ref8 ref41">8, 41</xref>
        ], song-level MRS studies do not
take artist information into account when creating these data splits or in calculating ranking accuracy.
      </p>
      <p>
        In this paper, we argue that this artist-agnostic approach to MRS evaluation is flawed and has
significant implications for measurement of MRS performance. Our motivation for this work stems
from an unexpected finding in our recent study [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]: in that paper, we evaluate several contrastive
pre-training regimes for music representation learning with weak supervision from metadata in the
Melon Playlist Dataset [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. When evaluating downstream MRS performance via nearest neighbor-based
playlist continuation, we find that models trained using artist co-occurrence (Artist CO) consistently
achieve the highest recommendation accuracy. In particular, they surpass competing models trained
directly using playlist co-occurrence (Playlist CO), where positive contrastive pairs consist of songs
which share a playlist. This result contradicts the naive expectation that Playlist CO models would be
better at playlist continuation because their training regime aligns most closely with this task,1 pointing
instead to a wider issue in the evaluation setup.
      </p>
      <p>In this study, we generalize these results on the Music4All-Onion dataset [42], expanding our analysis
to include collaborative filtering and hybrid MRSs. We show that, in the standard evaluation scenario,
the vast majority of the correct suggestions by these MRSs come from artists that the user has already
listened to. This can lead to misleading conclusions on the system’s effectiveness, especially given
the value of novelty and diversity in MRSs [43, 44, 45]. In particular, as noted by van den Oord et al.
in a seminal early work in deep content-based MRS [25], ‘recommending songs by artists that the user
is known to enjoy is not particularly useful’.2 To emphasize the severity of the issue, we show that a
simple heuristic approach based on personalized artist filtering outperforms state-of-the-art methods
in standard accuracy metrics in both cold and warm scenarios. We discuss the implications of these
findings on MRS fairness from the artist perspective, and propose a novel MRS evaluation framework
to ensure robustness against this issue. Finally, we show the value of this framework by illustrating
how it aids in evaluation of an MRS which directly incorporates artist metadata.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <sec id="sec-2-1">
        <title>2.1. Problem Statement and Notation</title>
        <p>In this paper, we focus on song-level MRSs, where the system is trained with interactions between a set
of users U and a set of songs S. The task of a song-level MRS is to predict preference scores P_{u,s} of user
u ∈ U for each song s ∈ S, with higher scores indicating higher interest of u in s. Each song s has an
associated artist set A_s, with A the set of all artists. We assume that S has been divided into warm and
cold subsets S_w and S_c, with only the warm songs appearing in the training data. Each u ∈ U has a set
of historical interacted items I_u ⊂ S and interacted artists A_u = ⋃_{s ∈ I_u} A_s. We refer to artists
a ∈ A_u as u’s known artists, while all other artists in A ∖ A_u are unknown artists for u. While in this
work we let I_u be all songs a user has interacted with, our insights also hold for other forms of
interaction data, such as individual listening sessions or playlists.</p>
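        <p>To make this notation concrete, the following minimal sketch (illustrative only; the variable names are ours and not from any released codebase) derives each user's interacted items I_u, known artists A_u, and the known/unknown-artist status of a candidate song:</p>

```python
from collections import defaultdict

# Hypothetical toy interaction log: (user, song) pairs, plus song -> artist set metadata.
interactions = [("u1", "s1"), ("u1", "s2"), ("u2", "s2")]
song_artists = {"s1": {"a1"}, "s2": {"a1", "a2"}, "s3": {"a3"}}

# I_u: all songs the user has interacted with.
items = defaultdict(set)
for user, song in interactions:
    items[user].add(song)

# A_u: the union of artist sets over the user's interacted songs.
known_artists = {
    user: set().union(*(song_artists[s] for s in songs))
    for user, songs in items.items()
}

def is_known_artist_song(user, song):
    """A song s is a known-artist song for user u iff A_s ∩ A_u ≠ ∅."""
    return bool(song_artists[song] & known_artists.get(user, set()))
```

        <p>For example, is_known_artist_song("u1", "s3") is False here, since u1 has never listened to artist a3.</p>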
        <p>We define the popularity pop(s) of song s as the number of users that have listened to s. We measure
the affinity af_u(a) of user u for artist a as the number of songs by a which u has listened to.</p>
        <p>
          ¹One possible explanation for this finding is that, in contrast to songs by the same artist, the audio content of playlists is too
diverse for the model to learn effective representations; however, the fact that Playlist CO and Artist CO achieve similarly
strong performance in downstream tagging [
          <xref ref-type="bibr" rid="ref17 ref19">17, 19</xref>
          ] casts doubt on this hypothesis.
        </p>
        <p>²We acknowledge that there are clear exceptions to this statement, especially given that repeat listens are common in music
consumption [46]. In particular, it is less applicable in online evaluation [47] or in offline sequential or contextual MRSs. In
these settings, the aim is often to suggest ‘the right music at the right time’ [48] based on both current session activity and
long-term user preferences, which means that recommending a known song or artist can be much more valuable.</p>
      </sec>
      <sec id="sec-2-data">
        <title>2.2. Data</title>
        <p>
We choose Music4All-Onion [42] (M4A-Onion) as our primary source of interaction data for this work,
as it contains both Last.fm listening histories [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] and corresponding audio clips from Music4All [49] for
generating audio content representations. Although M4A-Onion contains song features in many other
data modes, we choose to only consider a single modality (namely, audio) to simplify the implementation
of our chosen content-based and hybrid MRSs. The audio representation models considered are
self-supervised method MusicFM (MFM) [50] and the Playlist CO (PCO) and Artist CO (ACO) models [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]
trained using the Melon Playlist Dataset [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] as in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] with the short-chunk CNN backbone [51].
        </p>
        <p>To filter the interaction data, we first exclude skipped songs, where, as in [33], we consider a song
skipped if the next song starts within 30 seconds. Then, following [32], we only consider interactions
from 2018 and users from age 10 to 80. We then remove user-song interactions that only occur once
and perform 5-core filtering on users and songs.</p>
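        <p>As an illustration only (the paper does not publish this pipeline, and the field names below are hypothetical), the skip-removal and 5-core filtering steps could be sketched as:</p>

```python
def remove_skips(plays):
    """Drop plays where the same user's next play starts within 30 seconds."""
    plays = sorted(plays, key=lambda p: (p["user"], p["ts"]))
    kept = []
    for cur, nxt in zip(plays, plays[1:] + [None]):
        skipped = (
            nxt is not None
            and nxt["user"] == cur["user"]
            and nxt["ts"] - cur["ts"] < 30
        )
        if not skipped:
            kept.append(cur)
    return kept

def k_core(pairs, k=5):
    """Iteratively drop users and songs with fewer than k distinct interactions."""
    pairs = set(pairs)
    while True:
        users, songs = {}, {}
        for u, s in pairs:
            users[u] = users.get(u, 0) + 1
            songs[s] = songs.get(s, 0) + 1
        filtered = {(u, s) for u, s in pairs if users[u] >= k and songs[s] >= k}
        if filtered == pairs:
            return pairs
        pairs = filtered
```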
        <sec id="sec-2-1-1">
          <title>2.2.1. Data Splitting</title>
          <p>We follow a standard fully random splitting procedure in our main analysis, and discuss potential
artist-related modifications in Section 4.1. We also split the data to facilitate analysis of model performance in
both cold and warm item scenarios. Following existing works which also tackle both problems [52], we
first divide the songs S in an 80:20 ratio into warm (S_w) and cold (S_c) subsets, then further divide S_c
50:50 into cold validation and cold test songs. The interactions corresponding to these songs make up
the cold validation and test sets. All user-song interactions relating to the songs in S_w are then split
80:10:10 into warm training, validation, and test sets. Statistics of these splits are in Table 1.</p>
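          <p>Under the stated ratios, a minimal sketch of this splitting procedure (illustrative only; function and variable names are ours) might read:</p>

```python
import random

def split_songs(songs, seed=0):
    """80:20 warm/cold song split, then cold songs split 50:50 into validation/test."""
    rng = random.Random(seed)
    songs = sorted(songs)
    rng.shuffle(songs)
    n_warm = int(0.8 * len(songs))
    warm, cold = songs[:n_warm], songs[n_warm:]
    mid = len(cold) // 2
    return warm, cold[:mid], cold[mid:]

def split_interactions(interactions, warm_songs, seed=0):
    """Warm-song interactions split 80:10:10 into train/validation/test."""
    rng = random.Random(seed)
    warm_set = set(warm_songs)
    warm_inter = [i for i in interactions if i[1] in warm_set]
    rng.shuffle(warm_inter)
    n = len(warm_inter)
    n_train, n_valid = int(0.8 * n), int(0.1 * n)
    return (
        warm_inter[:n_train],
        warm_inter[n_train : n_train + n_valid],
        warm_inter[n_train + n_valid :],
    )
```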
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Models</title>
        <p>
          To demonstrate the wide applicability of our concerns, we consider content-based, collaborative filtering,
and hybrid models in our analysis. Our content-based method is ItemKNN [53], which calculates user
preferences based on similarity in our audio content vectors. Our collaborative filtering models are
BPR-based matrix factorization (BPR-MF) [54] and XSimGCL [55], a state-of-the-art graph-based
approach. For our hybrid model, we adopt CLCRec [56], which addresses both the warm and cold
scenario and has been used in several recent MRS studies [
          <xref ref-type="bibr" rid="ref18">18, 32</xref>
          ]. For ItemKNN, we test performance
with the content vectors output from each audio representation model mentioned in Section 2.2; in some
contexts, we refer to the ItemKNN results by the names of these audio representations for brevity (MFM,
PCO, ACO). We implement CLCRec with MusicFM [50] feature inputs. We conduct hyperparameter
tuning based on the ranges described in the original papers, and set the embedding dimension at 64 for
all models.
        </p>
        <sec id="sec-2-2-1">
          <title>2.3.1. Personalized Artist-Based Filtering</title>
          <p>[Figure 1: number of hits (Num. Hits) per model: ACO, BPR-MF, XSimGCL, CLCRec, PAF.]</p>
          <p>Aside from the above methods, we also introduce a Personalized Artist-Based Filtering (PAF) heuristic
for song-level recommendation based only on user-artist interaction history and item popularity. Given
a user u and song s, PAF preference scores are defined as:</p>
          <p>⎧0,
PP,AF = ⎪⎨log(pop()) ⋅ m∈ ax af  (),
⎪max af  () + rand(),
⎩ ∈ 
if   ∩   = ∅
if   ∩   ≠ ∅ and  ∈  
if   ∩   ≠ ∅ and  ∈  
(1)
For warm items, the preference score is the song’s log-popularity multiplied by the user’s afinity for
the artist. For cold songs, popularity is not available, so we use a uniformly sampled random number
between 0 and 1 as a ranking tie-breaker for songs with the same afinity. We note that PAF preference
scores are zero for any song where the user has not listened to its artist, i.e. this approach will only
suggest songs by artists that the user has already listened to.</p>
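          <p>A direct sketch of Eq. (1) in code (the container names are hypothetical; only the scoring rule itself comes from the definition above):</p>

```python
import math
import random

def paf_score(user, song, known_artists, song_artists, affinity, pop, warm_songs):
    """PAF preference score for (user, song), following Eq. (1).

    affinity[user][artist] counts songs by that artist the user has listened to.
    """
    if not (song_artists[song] & known_artists[user]):
        return 0.0  # no overlap with the user's known artists: never recommended
    best = max(affinity[user].get(a, 0) for a in song_artists[song])
    if song in warm_songs:
        # Warm song: log-popularity weighted by the strongest artist affinity.
        return math.log(pop[song]) * best
    # Cold song: popularity unavailable, add a uniform tie-breaker in [0, 1).
    return best + random.random()
```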
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Analysis</title>
      <p>In this section, we examine the performance and behavior of our chosen models after overlaying artist
information, and discuss the implications of our findings on MRS evaluation.</p>
      <sec id="sec-3-0">
        <title>3.1. Hits</title>
        <p>We first investigate overall model performance in Figure 1, displaying the total number of successful
song recommendations (i.e., hits) split by whether the user has previously listened to that song’s artist.
We include full ranking metrics (namely, NDCG and Recall) below in Section 4.</p>
        <p>
          We observe that, in both the warm and cold scenario and across all models, the vast majority of hits
come from artists which are known to the user, i.e. songs s where A_s ∩ A_u ≠ ∅. We can now explain
how Artist CO outperforms Playlist CO in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]: its additional hits come entirely in the known artist
category, matching with its training objective of aligning audio representations of songs by the same
artist. Meanwhile, Playlist CO is equal or superior for unknown artists, as would be expected given
that its positive contrastive pairs are much more diverse.
        </p>
        <p>There is an analogous trend in the collaborative filtering methods: XSimGCL achieves significantly
more hits than BPR-MF, but almost all of the difference comes from known artists. Graph-based methods
like XSimGCL are widely adopted due to their ability to exploit collaborative data more effectively, but
these results suggest that, in a music context, this gain may come primarily from successfully suggesting
songs by artists the user has previously interacted with. An investigation of the mechanism by which
graph convolution amplifies these artist preferences is an avenue for future work.</p>
        <p>Finally, we note that our purely metadata-based PAF method achieves the highest number of hits in
both warm and cold contexts. However, by definition, none of these songs are by artists new to the
user. We discuss the implications of this finding in further detail in Section 3.3 below.</p>
        <p>[Figure 2: proportions of hits and predictions from known artists, per model, in the warm and cold test sets.]</p>
      </sec>
      <sec id="sec-3-1">
        <title>3.2. Model Behavior</title>
        <p>In Figure 2 we provide further insight into the artist-related predictive behavior of our models. In
visualizing the proportion of hits and predictions by known artists, we can see that the known artist hit
proportions for all models are above the proportion of known artists in the warm and cold test sets.
In other words, songs by known artists are consistently over-represented in the model’s successful
recommendations. However, with the exception of PAF, the opposite is true of the model’s predictions,
which contain significantly lower proportions of songs by known artists.</p>
        <p>The disparity between known artist proportions of predictions and hits is particularly prevalent in the
ItemKNN models. This suggests that similarity in the chosen audio features cannot effectively capture
user preferences, with successful predictions instead appearing to rely primarily on artist sonic
coherence or other artifacts (e.g. similarities in mixing) which lead to high intra-artist similarity. This
‘artist effect’ is well-established in audio similarity applications [38], although its impact is exacerbated
by the Artist CO training process.</p>
        <p>The gap between known artist predictions and hits is less severe for the trained models, but still
significant. XSimGCL’s graph-based training also appears to increase its tendency to predict known
artists, going some way to explain the finding in Section 3.1. In the cold data, CLCRec has similar
prediction proportions to the ItemKNN models, but a lower proportion of hits by known artists. This
suggests that the combination of collaborative training and content information can help to increase
performance for unknown artists.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.3. Discussion</title>
        <p>The above results illustrate that there is a significant difference in the quality of recommendations
between songs by artists known to the user and unknown to the user. There is some analogy between
our scenario and analysis of popularity bias, where there are also two categories of items (namely,
long-tail and popular items) with drastically different levels of performance. In such contexts, bias can
be amplified by the fact that long-tail and popular items are recommended in a single top-k ranked list,
as long-tail items are blocked from top ranking spots by popular items [57], i.e. they are underexposed
to users [58]. However, the results in Figure 2 suggest that exposure is not the primary issue for songs
by unknown artists, as these make up the majority of model predictions in most cases.</p>
        <p>Instead, our findings reflect the intuition that correctly recommending songs by unknown artists
is simply a much harder task than suggesting songs by artists a user is familiar with. As shown in
Figure 2, over 65% of targets in the warm and cold test sets come from songs by known artists, but the
precision of such recommendations will be significantly higher because the candidate pool is much
smaller. This is clearly demonstrated by the fact that, by focusing only on songs by known artists,
our simple PAF method has the most hits of any model by a wide margin. As noted in Section 1 and
in [25], such metadata-based suggestions lack novelty and will potentially be less valuable to users;
yet, according to standard evaluation protocol (cf. Table 2), PAF achieves the best results, showing that
evaluating song-level MRSs by overall performance alone obfuscates important information about which
recommendations are being made successfully. In Section 4 below we describe a simple adjustment to
ranking-based evaluation which disentangles known and unknown artist performance, allowing for
more nuanced analysis of MRS performance.</p>
        <sec id="sec-3-2-1">
          <title>3.3.1. Artist Fairness</title>
          <p>Aside from potentially degrading overall recommendation quality, MRS hits being dominated by known
artists can also lead to inequitable outcomes for less popular artists. In previous interview-based
studies [59, 60], artists describe the difficulties of reaching their intended audience via algorithmic
suggestions. This may be due to lack of exposure, which previous works show is a key challenge for
less popular artists in collaborative filtering contexts [61, 62]. However, a further concern is raised
by our analysis above: if an MRS has poor accuracy for artists that a user has not listened to, then its
ability to produce successful recommendations for artists with small listener bases will likely be limited.</p>
          <p>Even if exposure is not an issue, and an artist’s songs are recommended to many users, their audience
will not grow unless those users are likely to enjoy their music. Determining an MRS’s ability to
successfully recommend songs by unknown artists is therefore an important step in ensuring that its
suggestions are more fair. However, if model evaluation is based primarily on overall recommendation
accuracy, then the ‘best-performing’ model may be one which mainly succeeds at recommending known
artists, thereby harming less popular artists and reducing overall fairness.3 Given that users are more
satisfied with recommendations when they believe them to be fair [63] and that increasing fairness via
bias mitigation methods can improve perceived recommendation quality [45], these concerns further
motivate an alternative evaluation protocol which treats known and unknown artists separately.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Artist-Based Evaluation Protocol</title>
      <sec id="sec-4-1">
        <title>4.1. Strategy</title>
        <p>Based on the above insights, we now elucidate an evaluation framework which provides clearer
understanding of the relative strengths and weaknesses of MRSs from the artist perspective. Given a user u,
the standard evaluation protocol involves calculating accuracy metrics on a single top-k list, where songs
s are ranked by their predicted preference scores P_{u,s}. We propose to split this list into songs by artists
known to u and artists unknown to u, and calculate ranking metrics on each list separately. This method
is somewhat similar to evaluating model performance separately on warm and cold songs; however,
these are global classifications, while known artists are specific to each user. By separating these two
categories, we ensure that our ability to evaluate model capacity for unknown artist recommendations
is not drowned out by the more straightforward recommendation of known artists.</p>
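        <p>The proposed protocol can be sketched as follows, here using recall@k for brevity (function and variable names are ours; NDCG would be handled analogously):</p>

```python
def split_topk_eval(ranked_songs, targets, known_artists_u, song_artists, k=10):
    """Split a user's ranked list by known/unknown artist, then score each
    sublist against the correspondingly split held-out targets (recall@k)."""
    def known(song):
        return bool(song_artists[song] & known_artists_u)

    known_rank = [s for s in ranked_songs if known(s)]
    unknown_rank = [s for s in ranked_songs if not known(s)]
    known_targets = {s for s in targets if known(s)}
    unknown_targets = {s for s in targets if not known(s)}

    def recall_at_k(ranking, tgt):
        if not tgt:
            return None  # category has no held-out targets for this user
        return len(set(ranking[:k]) & tgt) / len(tgt)

    return recall_at_k(known_rank, known_targets), recall_at_k(unknown_rank, unknown_targets)
```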
        <p>We note that this evaluation strategy can be applied more effectively if it is taken into account at
the data splitting stage, as this will ensure that there are sufficient examples in both categories for all
users. Furthermore, if an overall evaluation approach is still required for the application at hand, this
adjustment to data splitting will also ensure that unknown artist performance has a bigger influence
on overall ranking metrics. We chose not to take this approach in this work so that the analysis in
Section 3 would more accurately reflect the current standard methodology. As shown in Figure 2, about
a third of holdout targets are from unknown artists, so even without this step there is still a statistically
significant sample in this category. In some contexts, it may also be useful to consider a third category
of artists which are unknown to all users, so that performance or fairness for brand new artists can also
be evaluated. We leave this challenge to future work.</p>
        <p>³We note that improving net unknown artist performance is not a sufficient condition to improve artist fairness; it may be
that the model only improves in suggesting popular artists to users who have not listened to them yet, and still neglects less
popular artists. However, these insights provide another dimension by which artist fairness may be analyzed in future work.</p>
        <p>[Table 2: ranking metrics on the warm and cold evaluation sets. The best result in each metric is bolded, and the second-best is underlined.]</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Results</title>
        <p>[Table 2: Overall, Known Artist, and Unknown Artist ranking metrics for each model on the warm and cold evaluation sets.]</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Example Use Case</title>
        <p>
          Suppose we want to enhance our ItemKNN method for cold songs by leveraging artist metadata.
Following previous works in artist similarity [
          <xref ref-type="bibr" rid="ref12 ref13 ref16">12, 13, 16</xref>
          ], we can generate an artist’s content representation by
averaging the feature vectors of their songs. Then for any song s we have its feature vector x_s and the
corresponding content vector x_{A_s} for its artist(s). Standard ItemKNN uses cosine similarity sim(x_s, x_t)
between the feature vectors to measure the similarity m(s, t) between two songs s and t; we can augment
this method with artist information by calculating m(s, t) = sim(x_s, x_t) + α ⋅ sim(x_{A_s}, x_{A_t}),
where α is a balance weight controlling the emphasis placed on artist similarity.
        </p>
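        <p>A minimal sketch of this augmented similarity (assuming, for simplicity, a single artist per song; the names are ours):</p>

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def artist_vector(artist_songs, song_vecs):
    """Artist content vector: mean of the artist's song feature vectors."""
    return np.mean([song_vecs[s] for s in artist_songs], axis=0)

def augmented_sim(s, t, song_vecs, artist_vecs, song_artist, alpha=0.5):
    """m(s, t) = sim(x_s, x_t) + alpha * sim(x_{A_s}, x_{A_t})."""
    base = cosine(song_vecs[s], song_vecs[t])
    artist = cosine(artist_vecs[song_artist[s]], artist_vecs[song_artist[t]])
    return base + alpha * artist
```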
        <p>We plot the impact of varying this artist weight α in Figure 3. If we only consider overall results,
adding artist similarity appears to significantly improve model performance. However, we can see
from the other metrics that this benefit is limited to known artist predictions, with ranking quality
for unknown artists degrading for higher values of α. This aligns with our discussion in Section 3.2 of
the artist effect, i.e. that similarity in audio content vectors is a poor predictor of inter-artist similarity
for MRS applications. Furthermore, this method increases the proportion of known artists in overall
ranking lists; as illustrated by PAF in Section 3, this can artificially increase overall performance without
providing significant additional value to the user. This example shows how measuring known and
unknown artist performance separately can allow for more informed evaluation of novel MRS methods.</p>
        <p>[Figure 3: unknown-artist ranking accuracy and the proportion of known-artist predictions as a function of the artist weight (α), for the ACO, MFM, and PCO content representations.]</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this paper, we demonstrate that standard methods for offline recommender system evaluation have
notable limitations in MRS applications. We show that MRSs across content-based, collaborative
filtering, and hybrid paradigms exhibit significant differences in performance when recommending
songs by known artists versus those by artists that a user has never previously listened to. Although this
disparity is not surprising in itself, it brings into question the value of standard accuracy measurements
dominated by much ‘easier’ known artist predictions, which may be less valuable to the user due to lack
of novelty. To emphasize this point, we show that our PAF heuristic method achieves the best results
according to standard evaluation procedure, despite only recommending a user’s known artists. To
address this issue, we propose a novel MRS evaluation strategy, where ranking metrics are calculated
separately for each user on their target lists of songs by known and unknown artists. We show that this
approach allows for more nuanced interpretation of model performance in warm and cold settings, and
that it is particularly useful when integrating artist metadata into model design.</p>
      <p>In future work we plan to build on these insights and explore how MRSs can be designed specifically
to improve unknown artist performance. This may also allow for improved personalization of model
suggestions: one possible approach, similar to calibration methods in popularity bias mitigation [64],
would use the top candidate songs by known and unknown artists to produce a combined list of
suggestions where the proportion of unknown artists matches a user’s historical appetite for new artist
discovery. We also acknowledge that our method is perhaps overly simplistic in its classification of
artists as ‘known’ or ‘unknown’ to a user, and that the interaction characteristics of a user’s
most-listened artist and an artist they have heard only once may be very different. Future research will
explore what further insight may be gained by investigating the relationship between user-artist affinity
and model evaluation at different levels of familiarity.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was funded by UK Research and Innovation as part of the UKRI CDT in Artificial Intelligence
and Music [grant number EP/S022694/1].</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-8">
      <title>References</title>
      <p>[20] Q. Yang, S. Wang, D. Guo, D. Yu, Q. Xiao, D. Wang, C. Luo, Cascading multimodal feature enhanced
contrast learning for music recommendation, in: 2024 IEEE International Conference on Data
Mining (ICDM), IEEE, 2024, pp. 905–910.
[21] B. L. Pereira, P. D. V. Chaves, R. L. Santos, Efficient exploration and exploitation for sequential
music recommendation, ACM Transactions on Recommender Systems 2 (2024) 1–23.
[22] H. Weng, J. Chen, D. Wang, X. Zhang, D. Yu, Graph-based attentive sequential model with metadata
for music recommendation, IEEE Access 10 (2022) 108226–108240. doi:10.1109/ACCESS.2022.
3213812.
[23] X. Cui, X. Qu, D. Li, Y. Yang, Y. Li, X. Zhang, Mkgcn: Multi-modal knowledge graph convolutional
network for music recommender systems, Electronics 12 (2023). URL: https://www.mdpi.com/
2079-9292/12/12/2688. doi:10.3390/electronics12122688.
[24] D. Wang, X. Zhang, Y. Yin, D. Yu, G. Xu, S. Deng, Multi-view enhanced graph attention network
for session-based music recommendation, ACM Transactions on Information Systems 42 (2023)
1–30.
[25] A. Van den Oord, S. Dieleman, B. Schrauwen, Deep content-based music recommendation,</p>
      <p>Advances in neural information processing systems 26 (2013).
[26] A. Ferraro, X. Favory, K. Drossos, Y. Kim, D. Bogdanov, Enriched music representations with
multiple cross-modal contrastive learning, IEEE Signal Processing Letters 28 (2021) 733–737.
[27] M. Pulis, J. Bajada, Siamese neural networks for content-based cold-start music recommendation.,
in: Proceedings of the 15th ACM conference on recommender systems, 2021, pp. 719–723.
[28] A. Saravanou, F. Tomasi, R. Mehrotra, M. Lalmas, Multi-task learning of graph-based inductive
representations of music content., in: ISMIR, 2021, pp. 602–609.
[29] M. Park, K. Lee, Exploiting negative preference in content-based music recommendation with
contrastive learning, in: Proceedings of the 16th ACM Conference on Recommender Systems,
2022, pp. 229–236.
[30] P. Magron, C. Févotte, Neural content-aware collaborative filtering for cold-start music
recommendation, Data Mining and Knowledge Discovery 36 (2022) 1971–2005. URL: https://doi.org/10.1007/s10618-022-00859-8. doi:10.1007/s10618-022-00859-8.
[31] R. Borges, M. Queiroz, Audio-based sequential music recommendation, in: 2023 31st European
Signal Processing Conference (EUSIPCO), IEEE, 2023, pp. 421–425.
[32] C. Ganhör, M. Moscati, A. Hausberger, S. Nawaz, M. Schedl, A multimodal single-branch embedding
network for recommendation in cold-start and missing modality scenarios, in: Proceedings of the
18th ACM Conference on Recommender Systems, 2024, pp. 380–390.
[33] P. Seshadri, S. Shashaani, P. Knees, Enhancing sequential music recommendation with
negative feedback-informed contrastive learning, in: Proceedings of the 18th ACM Conference on
Recommender Systems, 2024, pp. 1028–1032.
[34] Y.-M. Tamm, A. Aljanaki, Comparative analysis of pretrained audio representations in music
recommender systems, in: Proceedings of the 18th ACM Conference on Recommender Systems,
2024, pp. 934–938.
[35] M. Bevec, M. Tkalčič, M. Pesek, Hybrid music recommendation with graph neural networks, User
Modeling and User-Adapted Interaction (2024) 1–38.
[36] E. Zangerle, C. Bauer, Evaluating recommender systems: survey and framework, ACM computing
surveys 55 (2022) 1–38.
[37] R. Burke, Evaluating the dynamic properties of recommendation algorithms, in: Proceedings of
the fourth ACM conference on Recommender systems, 2010, pp. 225–228.
[38] A. Flexer, D. Schnitzer, Effects of album and artist filters in audio similarity computed for very
large music databases, Computer Music Journal 34 (2010) 20–28.
[39] E. Pampalk, A. Flexer, G. Widmer, et al., Improvements of audio-based music similarity and genre
classification., in: ISMIR, volume 5, London, UK, 2005, pp. 634–637.
[40] I. Vatolkin, G. Rudolph, C. Weihs, Evaluation of album effect for feature selection in music genre
recognition., in: ISMIR, 2015, pp. 169–175.
[41] A. Vall, M. Dorfer, H. Eghbal-Zadeh, M. Schedl, K. Burjorjee, G. Widmer, Feature-combination
hybrid recommender systems for automated music playlist continuation, User Modeling and
User-Adapted Interaction 29 (2019) 527–572.
[42] M. Moscati, E. Parada-Cabaleiro, Y. Deldjoo, E. Zangerle, M. Schedl, Music4all-onion – a large-scale
multi-faceted content-centric music recommendation dataset, in: Proceedings of the 31st ACM
International Conference on Information &amp; Knowledge Management, CIKM ’22, Association for
Computing Machinery, New York, NY, USA, 2022, pp. 4339–4343. doi:10.1145/3511808.3557656.
[43] Y. Deldjoo, M. Schedl, P. Knees, Content-driven music recommendation: Evolution, state of the
art, and challenges, Computer Science Review 51 (2024) 100618. URL: https://www.sciencedirect.com/science/article/pii/S1574013724000029. doi:10.1016/j.cosrev.2024.100618.
[44] Y. Ping, Y. Li, J. Zhu, Beyond accuracy measures: the effect of diversity, novelty and serendipity in
recommender systems on user engagement, Electronic Commerce Research (2024) 1–28.
[45] R. Ungruh, K. Dinnissen, A. Volk, M. S. Pera, H. Hauptmann, Putting popularity bias mitigation
to the test: A user-centric evaluation in music recommenders, in: Proceedings of the 18th ACM
Conference on Recommender Systems, 2024, pp. 169–178.
[46] M. Schedl, H. Zamani, C.-W. Chen, Y. Deldjoo, M. Elahi, Current challenges and visions in music
recommender systems research, International Journal of Multimedia Information Retrieval 7 (2018)
95–116.
[47] W. Bendada, G. Salha-Galvan, T. Bouabça, T. Cazenave, A scalable framework for automatic
playlist continuation on music streaming services, in: Proceedings of the 46th International ACM
SIGIR Conference on Research and Development in Information Retrieval, 2023, pp. 464–474.
[48] E. Liebman, M. Saar-Tsechansky, P. Stone, The Right Music at the Right Time: Adaptive
Personalized Playlists Based on Sequence Modeling, MIS Quarterly 43 (2019) 765–786.
[49] I. A. P. Santana, F. Pinhelli, J. Donini, L. Catharin, R. B. Mangolin, V. D. Feltrim, M. A. Domingues,
et al., Music4all: A new music database and its applications, in: 2020 International Conference on
Systems, Signals and Image Processing (IWSSIP), IEEE, 2020, pp. 399–404.
[50] M. Won, Y.-N. Hung, D. Le, A foundation model for music informatics, in: ICASSP 2024-2024 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2024, pp.
1226–1230.
[51] M. Won, S. Oramas, O. Nieto, F. Gouyon, X. Serra, Multimodal metric learning for tag-based music
retrieval, in: Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2021, pp.
591–595.
[52] F. Huang, Z. Wang, X. Huang, Y. Qian, Z. Li, H. Chen, Aligning Distillation For Cold-start Item
Recommendation, in: Proceedings of the 46th International ACM SIGIR Conference on Research
and Development in Information Retrieval, SIGIR ’23, Association for Computing Machinery, New
York, NY, USA, 2023, pp. 1147–1157. doi:10.1145/3539618.3591732.
[53] B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-based collaborative filtering recommendation
algorithms, in: Proceedings of the 10th international conf. on World Wide Web, 2001, pp. 285–295.
[54] S. Rendle, C. Freudenthaler, Z. Gantner, L. Schmidt-Thieme, Bpr: Bayesian personalized ranking
from implicit feedback, in: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial
Intelligence, UAI ’09, AUAI Press, Arlington, Virginia, USA, 2009, pp. 452–461.
[55] J. Yu, X. Xia, T. Chen, L. Cui, N. Q. V. Hung, H. Yin, Xsimgcl: Towards extremely simple graph
contrastive learning for recommendation, IEEE Transactions on Knowledge and Data Engineering
36 (2023) 913–926.
[56] Y. Wei, X. Wang, Q. Li, L. Nie, Y. Li, X. Li, T.-S. Chua, Contrastive learning for cold-start
recommendation, in: Proceedings of the 29th ACM International Conference on Multimedia, 2021, pp.
5382–5390.
[57] Z. Zhu, Y. He, X. Zhao, Y. Zhang, J. Wang, J. Caverlee, Popularity-opportunity bias in collaborative
filtering, in: Proceedings of the 14th ACM international conference on web search and data mining,
2021, pp. 85–93.
[58] M. Mansoury, H. Abdollahpouri, M. Pechenizkiy, B. Mobasher, R. Burke, Feedback loop and
bias amplification in recommender systems, in: Proc. of the 29th ACM International Conf. on
Information &amp; Knowledge Management, 2020, pp. 2145–2148.
[59] A. Ferraro, X. Serra, C. Bauer, What is fair? exploring the artists’ perspective on the fairness of
music streaming platforms, in: IFIP conference on human-computer interaction, Springer, 2021,
pp. 562–584.
[60] K. Dinnissen, C. Bauer, Amplifying artists’ voices: Item provider perspectives on influence and
fairness of music streaming platforms, in: Proceedings of the 31st ACM Conference on User
Modeling, Adaptation and Personalization, 2023, pp. 238–249.
[61] A. Ferraro, D. Bogdanov, X. Serra, J. Yoon, Artist and style exposure bias in collaborative filtering
based music recommendations, arXiv preprint arXiv:1911.04827 (2019).
[62] H. Abdollahpouri, R. Burke, M. Mansoury, Unfair exposure of artists in music recommendation,
arXiv preprint arXiv:2003.11634 (2020).
[63] B. Ferwerda, E. Ingesson, M. Berndl, M. Schedl, I don’t care how popular you are! investigating
popularity bias in music recommendations from a user’s perspective, in: Proceedings of the 2023
conference on human information interaction and retrieval, 2023, pp. 357–361.
[64] H. Abdollahpouri, M. Mansoury, R. Burke, B. Mobasher, E. Malthouse, User-centered evaluation
of popularity bias in recommender systems, in: Proceedings of the 29th ACM conference on user
modeling, adaptation and personalization, 2021, pp. 119–129.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Bauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kholodylo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Strauss</surname>
          </string-name>
          ,
          <article-title>Music recommender systems challenges and opportunities for non-superstar artists</article-title>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O.</given-names>
            <surname>Celma</surname>
          </string-name>
          ,
          <article-title>Music recommendation, in: Music recommendation and discovery: The long tail, long fail, and long play in the digital music space</article-title>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>43</fpage>
          -
          <lpage>85</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schedl</surname>
          </string-name>
          ,
          <article-title>The lfm-1b dataset for music retrieval and recommendation</article-title>
          ,
          <source>in: Proceedings of the 2016 ACM on international conference on multimedia retrieval</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>103</fpage>
          -
          <lpage>110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Schedl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lesota</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Parada-Cabaleiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Penz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rekabsaz</surname>
          </string-name>
          , Lfm-2b:
          <article-title>A dataset of enriched music listening events for recommender systems research and fairness analysis</article-title>
          ,
          <source>in: Proceedings of the 2022 Conf. on Human Information Interaction and Retrieval</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>337</fpage>
          -
          <lpage>341</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.-W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lamere</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schedl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zamani</surname>
          </string-name>
          , Recsys challenge
          <year>2018</year>
          :
          <article-title>Automatic music playlist continuation</article-title>
          ,
          <source>in: Proceedings of the 12th ACM Conf. on Recommender Systems</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>527</fpage>
          -
          <lpage>528</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Jo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Serra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bogdanov</surname>
          </string-name>
          ,
          <article-title>Melon playlist dataset: a public dataset for audio-based playlist generation and music tagging</article-title>
          ,
          <source>CoRR abs/2102.00201</source>
          (
          <year>2021</year>
          ). URL: https://arxiv.org/abs/2102.00201. arXiv:2102.00201.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>McFee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Barrington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lanckriet</surname>
          </string-name>
          ,
          <article-title>Learning content similarity for music recommendation</article-title>
          ,
          <source>IEEE transactions on audio, speech, and language processing 20</source>
          (
          <year>2012</year>
          )
          <fpage>2207</fpage>
          -
          <lpage>2218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Oramas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Nieto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sordo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Serra</surname>
          </string-name>
          ,
          <article-title>A deep multimodal approach for cold-start music recommendation</article-title>
          ,
          <source>in: Proceedings of the 2nd Workshop on Deep Learning for Recommender Systems, DLRS</source>
          <year>2017</year>
          ,
          Association for Computing Machinery
          , New York, NY, USA,
          <year>2017</year>
          , pp.
          <fpage>32</fpage>
          -
          <lpage>37</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kowald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schedl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lex</surname>
          </string-name>
          ,
          <article-title>The unfairness of popularity bias in music recommendation: A reproducibility study</article-title>
          ,
          <source>in: European conference on information retrieval</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>35</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Trainor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Turnbull</surname>
          </string-name>
          ,
          <article-title>Popularity degradation bias in local music recommendation</article-title>
          ,
          <source>arXiv preprint arXiv:2309.11671</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Bertram</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dunkel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hermoso</surname>
          </string-name>
          ,
          <article-title>I am all ears: Using open data and knowledge graph embeddings for music recommendations</article-title>
          ,
          <source>Expert Systems with Applications</source>
          <volume>229</volume>
          (
          <year>2023</year>
          )
          <fpage>120347</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F.</given-names>
            <surname>Korzeniowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Oramas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gouyon</surname>
          </string-name>
          ,
          <article-title>Artist similarity with graph neural networks</article-title>
          ,
          <source>arXiv preprint arXiv:2107.14541</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>F.</given-names>
            <surname>Korzeniowski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Oramas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gouyon</surname>
          </string-name>
          ,
          <article-title>Artist similarity for everyone: A graph neural network approach</article-title>
          ,
          <source>Transactions of the International Society for Music Information Retrieval</source>
          (
          <year>2022</year>
          ). doi:10.5334/tismir.143.
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Oramas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferraro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sarasua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gouyon</surname>
          </string-name>
          ,
          <article-title>Talking to your recs: Multimodal embeddings for recommendation and retrieval</article-title>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>F.</given-names>
            <surname>Grötschla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Strässle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Lanzendörfer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wattenhofer</surname>
          </string-name>
          ,
          <article-title>Towards leveraging contrastively pretrained neural audio embeddings for recommender tasks</article-title>
          ,
          <source>arXiv preprint arXiv:2409.09026</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A. C. M.</given-names>
            <surname>da Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Marcacini</surname>
          </string-name>
          ,
          <article-title>Artist similarity based on heterogeneous graph neural networks</article-title>
          ,
          <source>IEEE/ACM Transactions on Audio, Speech, and Language Processing</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Alonso-Jiménez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Favory</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Foroughmand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Bourdalas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Serra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lidy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bogdanov</surname>
          </string-name>
          ,
          <article-title>Pre-Training Strategies Using Contrastive Learning and Playlist Information for Music Classification and Similarity</article-title>
          ,
          <year>2023</year>
          . URL: http://arxiv.org/abs/2304.12257.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Salganik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.-S.</given-names>
            <surname>Chua</surname>
          </string-name>
          ,
          <article-title>Larp: Language audio relational pre-training for cold-start playlist continuation</article-title>
          ,
          <source>arXiv preprint arXiv:2406.14333</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>G.</given-names>
            <surname>Meehan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pauwels</surname>
          </string-name>
          ,
          <article-title>Evaluating contrastive methodologies for music representation learning using playlist data</article-title>
          ,
          <source>in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</source>
          ,
          <year>2025</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          . doi:10.1109/ICASSP49660.2025.10888157.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>