Greatest Hits Versus Deep Cuts: Exploring Variety in Set-lists Across Artists and Musical Genres

Edward Abel¹,∗, Andrew Goddard²
¹ University of Southern Denmark, Denmark
² Independent Researcher

Abstract

Live music concert analysis provides an opportunity to explore cultural and historical trends. The art of set-list construction, of which songs to play, has many considerations for an artist, and the notion of how much variety different artists play is an interesting topic. Online communities provide rich crowd-sourced encyclopaedic data repositories of live concert set-list data, facilitating the potential for quantitative analysis of live music concerts. In this paper, we explore data acquisition and processing of musical artists’ tour histories and propose an approach to analyse and explore the notion of variety, at individual tour level, at artist career level, and for comparisons between a corpus of artists from different musical genres. We propose notions of a shelf and a tail as a means to help explore tour variety and explore how they can be utilised to help define a single metric of variety at tour level, and artist level. Our analysis highlights the wide diversity among artists in terms of their inclinations toward variety, whilst correlation analysis demonstrates how our measure of variety remains robust across differing artist attributes, such as the number of tours and show lengths.

Keywords: computational musicology, statistical music analysis, set-list composition, music information retrieval

1. Introduction

Live music experiences offer a unique glimpse into society and have significant cultural impact [6]. They have also become a crucial source of revenue for artists in the streaming era [17]. Constructing live music set-lists involves several considerations for an artist, such as catering to different types of fans with varying expectations and managing the trade-offs between these expectations [26].
Artists must also consider how performing specific songs or covers could attract more media attention than the concert might otherwise receive [16, 24]. The implications of a set-list now extend beyond the audience in attendance at the venue, with artists like Bruce Springsteen offering the ability to buy and stream every single show from current and previous tours [15], while bands like Metallica¹ and Pearl Jam² provide numerous official live recordings of their performances. In addition, live shows’ set-lists have been shown to potentially impact and influence the listening behaviour of an artist’s fans regarding non-live material [30].

CHR 2024: Computational Humanities Research Conference, December 4 – 6, 2024, Aarhus, Denmark.
∗ Corresponding author.
Email: edabelcs@gmail.com (E. Abel); bossfansheff@gmail.com (A. Goddard)
Web: www.edabel.co.uk (E. Abel)
ORCID: 0000-0002-3694-5116 (E. Abel); 0000-0001-7384-0252 (A. Goddard)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.
¹ https://www.livemetallica.com/
² https://pearljam.com/music/bootlegs

The process of set-list creation varies, with some artists meticulously planning their sets in advance, while others prefer to make more spontaneous decisions based on the energy and reactions of the crowd during the show. Getting the right balance of set-list variety is crucial. Artists may face backlash if their set-lists are perceived as too formulaic and lacking in variety. For example, during Bruce Springsteen’s 2023 tour, fans criticised performances for being too similar night after night, leading band members to defend their choices on social media [25].
Conversely, there is a risk of alienating fans by including too many unexpected songs at the expense of beloved greatest hits, as seen in some fans’ reactions to recent Bob Dylan shows [13], or by altering hit songs too much from their album versions, as seen in fan frustration at recent Arctic Monkeys shows [7]. Constructing a set-list involves navigating diverse expectations, and the degree of variety in an artist’s performances is an intriguing topic with broader applications, such as historical live performance recommender systems [1].

In this paper, we explore the data acquisition and processing of musical artists’ tour histories, and propose an approach to explore variety at individual tour level, at artist career level, and for comparisons between a corpus of artists. For this, we propose notions of a shelf and a tail as a means to investigate tours’ variety and properties, and explore how they can be used to quantify variety at tour level and artist level. Additionally, our approach explores the impact of cover songs upon variety, and explores variety comparisons across musical genres. Through analysing many artists, we highlight the wide range of artist variety levels, from those who tend to focus more on greatest hits to those who perform more diverse shows featuring deep cuts, and everything in between. Additional correlation analysis explores how our measure of variety remains robust across artists with differing characteristics, such as the number of tours or average show lengths. For more information about the project and its data, see the project’s GitHub repository,³ and to interactively explore the data and the proposed approach, see the project’s Interactive Web App.⁴

The rest of the paper is structured as follows: Section 2 covers background literature, Section 3 details our approach, and Section 4 provides conclusions.

³ https://github.com/EdAbel/setlist-variety
⁴ www.edabel.co.uk/setlist-variety

2. Background

Prior work has examined live music performance from the perspective of performer or audience psychology, touching on how set-lists affect it, such as in terms of presence and representation within shows and set-lists [27], through measuring the value of live music with a motivation scale [23], and by exploring how the set-list and beyond has a bearing on an artist’s impact as part of their stage success [10]. Other work has explored live music performances from the perspective of music theory, such as work defining performance parameters in relation to how performances bring compositions to life through variations in timing and dynamics and their impact on listener perception [19], and work exploring how composers’ choices in tones, intervals, and harmonies influence stylistic music changes over time [22].

Several more quantitative studies of live music performance have focused on an in-depth analysis of a single artist. Work has explored the band The Grateful Dead, examining the band’s live recordings over three decades to analyse the performances in relation to cultural trends in music and to investigate how the performances of the band change over time [33]. Others have analysed The Grateful Dead’s live concerts from 1972 to 1995 in comparison to listening habits outside of concerts [30], highlighting correlations between live set-lists and home listening. Work focusing on Bruce Springsteen has explored his live performances and set-lists to examine how set-list analysis can provide indications of tensions between the commercial considerations of playing new album material and playing expected but older hits [2].
Bob Dylan is another artist that work has focused upon, such as a study of his set-lists from the 1960s to the 2020s that investigates his approach to performing, exploring how he curates a show and how this in turn creates a meta-narrative [8].

Limited studies have looked to compare tours and set-lists for multiple artists. The music and entertainment publication Consequence set out and discussed the "25 Best Rock Acts with the most Unique set-lists" [31]. Although such articles imply a focus, here on the acts whose shows define the word unique, the methodology used to compile the list is not disclosed, and so the list is difficult to assess.

Some recent discourse has explored how the make-up of the fan communities of artists such as Taylor Swift and The Grateful Dead may share surprising similarities, despite their different musical styles, and the impact this can have on further similarities between performance set-lists [5]. The comparison between Taylor Swift and The Grateful Dead performances has been further explored, along with a limited number of other artists, in terms of unique songs [11]. Focusing on the notion of special songs, that analysis examines set-lists from individual tours of different artists to explore the prominence of unique (and quite unique) song occurrence rates. Comparisons highlight how The Grateful Dead is seen as a very varied artist compared to Taylor Swift, when considering variety only from the perspective of unique special "surprise" songs. In this analysis, only individual tours are considered, which may result in an unrepresentative view of an artist’s overall career. Additionally, the methodology favours artists with longer tours and longer sets.

The Consecutive Set Similarity (CSS) metric has been proposed to measure variety for an artist [20].
Here, an artist’s career of set-lists is arranged in sequence and each set-list is compared to the previous one in terms of the number of different songs, resulting in a value of 1 if the set-list is identical to the previous set and -1 if it is completely different. From this, an overall average is derived for each artist, from which comparisons and clustering of different artists can be performed [28]. The measure has a very narrow focus because it only considers two shows at a time, and so it will treat oscillating changes of songs that are played frequently, but not every night, as signalling high variety. Moreover, it does not consider information such as when one tour ends and another begins (likely to signal a significant change in set-lists, which may in turn favour artists with many small tours over a long period).

Some coverage has highlighted how artists themselves approach the art of set-list curation, and the considerations they have for variety. During a tour that contained over 800 shows, the band Radiohead made a conscious choice to ensure every show was unique and never to repeat a set-list. The band would curate the set-lists daily and emphasised the importance of variety, contrasting their approach with other artists that play identical sets nightly [21]. More recently, it has been claimed that the band Metallica customise concert set-lists based on local Spotify data (and local radio trends), to cater to local fans’ preferences and to help increase show diversity [29].

3. Our Approach

Figure 1: Data Pipeline Stages of our Approach

In our approach, raw artist tour data is first collected and processed, then utilized in various stages of analysis, as illustrated in Figure 1. Communities such as MusicBrainz⁵ and Setlist.fm⁶ provide extensive crowd-sourced encyclopaedic data on musical artists and live concert set-lists.
The following sub-section outlines our data acquisition process, detailing how these sources were leveraged to obtain the data used for our analysis.

3.1. Data Acquisition

Given a set of music artist names, we acquire tour details and data for each tour of each artist, and store the data as depicted in the data model in Figure 2. Each artist name can be uniquely identified via a MusicBrainz Identifier (MBID), a universal, unambiguous standard for artist identification.⁷ Through calls to the MusicBrainz API,⁸ the MBID for each artist name is acquired, along with additional MusicBrainz artist information including their Gender (one of Male, Female, or Group), and their Start Date and EndDate.⁹ Additionally, a single musical genre considered most representative was curated for each artist using MusicBrainz’s genre information.

Table 1: Genre Mappings

| General Genre | (Specific) Genre |
|---|---|
| Rock | Rock, Hard Rock, Pop Rock, Southern Rock, Gothic Rock, Comedy Rock, Blues Rock, Rap Rock |
| Alternative | Alternative Rock, Indie Rock, Post-Rock, Dance-Punk, Garage Rock, Indie Pop, Britpop, Emo, Grunge, Alternative Hip Hop |
| Metal | Heavy Metal, Thrash Metal, Power Metal, Gothic Metal, Melodic Death Metal, Progressive Metal, Nu Metal, Symphonic Metal, Alternative Metal, Groove Metal, Metalcore, Death Metal, Glam Metal |
| Punk | Pop Punk, Punk Rock, Hardcore Punk, Post-Hardcore, Gypsy Punk |
| ElectronicAndDance | Synthpop, Electronic, Industrial Metal, Industrial Rock, Dance |
| Folk | Folk Rock, Indie Folk, Folk Punk |
| Progressive | Progressive Rock, Progressive Metal, Experimental Rock, Post, Psychedelic Rock |
| Pop | Pop, Pop Rock |

⁵ https://musicbrainz.org/
⁶ https://www.setlist.fm
⁷ https://musicbrainz.org/doc/MusicBrainz_Identifier
⁸ https://musicbrainz.org/doc/MusicBrainz_API (utilising the musicbrainz API wrapper R package - https://github.com/dmi3kno/musicbrainz)
As this data often assigns multiple genre tags to an artist, a manual selection of a single tag was performed where necessary. Next, using each artist’s MBID, the corresponding setlist.fm ID (a separate setlist.fm unique identifier for each artist) is obtained via calls to the setlist.fm API.¹⁰ This data is stored as depicted in the Artist table in Figure 2.

The single genres determined for each artist result in a large number of different genre values, many representing similar sub-genres of a more general genre. To make genre analysis more tractable, the set of genres can be mapped onto a smaller set of generalized genres. Such a mapping of dozens of genres onto a set of generalized genres is shown in Table 1. This data is stored as depicted in the Genre table in Figure 2.¹¹

Given a set of artist setlist.fm IDs, a list of each artist’s tours can be obtained, from which data associated with each tour can subsequently be acquired. For each tour, overall information of the tour name, the total number of shows on the tour, and date ranges is acquired and stored as depicted in the Tour table in Figure 2. For each tour’s songs, the list of songs played on the tour, along with each song’s number of plays on the tour, is acquired. Additionally, for each song, whether the song is denoted as a cover song (with respect to the artist the tour is for) is recorded.¹² The song information is stored as depicted in the Song table in Figure 2. From this, full artist tour history data for over 200 artists was acquired, chosen for their prominence within popular music history and culture. This number is progressively expanding, and up-to-date figures, and access to the data, can be found at the project’s GitHub repository.¹³ All of the data acquired is stored as raw data, as depicted in Figure 1, to then be utilised within the analysis stage.

Figure 2: Data Model of our Approach

⁹ For groups this represents their formation and disbandment dates; for solo artists it holds their birth dates and retirement or death dates. For ongoing artists EndDate will be "present".
¹⁰ https://api.setlist.fm/docs/1.0/index.html (utilising the SetListR wrapper R package - https://github.com/fusionet24/SetListR)
¹¹ The assignment of a single genre to each artist, followed by the mapping of these genres to a broader set of generalized categories, has been curated as a proof of concept. However, genre classification is a complex and expansive subject in its own right, with numerous studies addressing the challenges associated with genre categorisation [9] and the phenomenon of genre crossover [32]. Given such complexities, the automation of genre classification represents a compelling area for future exploration.
¹² Additionally, tours that are empty (made up of only empty shows) are removed, as are any artists for which all tours are empty. Further, tours identified by their name as a set of promotional publicity media/private shows are removed.
¹³ https://github.com/EdAbel/setlist-variety

3.2. Data Pre-Processing

Before beginning analysis, pre-processing of the raw dataset is performed. Within our analysis, we are interested in artists who have sufficiently substantial touring histories. Therefore, we define thresholds to utilise only artists that have a minimum number of tours and a minimum overall number of shows, and only keep tours that have a minimum number of shows, a minimum number of unique songs played on the tour, and a minimum average show length of the shows from the tour.¹⁴ These threshold values are inherently subjective and context dependent; therefore, we conduct the analysis on the raw data, preserving its integrity so that alternative thresholds can be applied if needed. Within the analysis that follows, the threshold parameter values utilised are shown in Table 2. Following the pre-processing stage, we begin the analysis at individual tour level.

Table 2: Raw Data Threshold Parameters

| Variable | Value |
|---|---|
| Artist Minimum No. of Tours | 5 |
| Artist Minimum Total No. of shows | 200 |
| Tour Minimum No. of shows | 20 |
| Tour Minimum No. of songs | 10 |
| Tour Minimum Average Show Length | 10 |

3.3. Tour Analysis

For a tour, each (unique) song has a Play Count ($PC$), denoting how many times it was played on that tour. Tours can vary in how many shows they are made up of; therefore, for comparisons between tours, a tour’s (absolute) $PC$ values can be normalised with respect to the number of shows in the tour. For a tour, a Relative Play Count ($RPC$) for each song played on the tour can be computed via:

$$ RPC_i = \left( \frac{PC_i}{tN} \right) \times 100 \tag{1} $$

where $PC_i$ is the $PC$ value of the $i$-th song and $tN$ is the number of shows in the tour. An $RPC$ value of 100 represents a song being played at every single show of a tour, whilst a value of 50 represents a song being played at exactly half of the tour’s shows.

We can visualise a tour and its songs in terms of their $RPC$ values, where the y-axis denotes $RPC$ value and the x-axis denotes song number, with songs sorted by $RPC$ value from high to low. For example, Bruce Springsteen (and the E-Street Band’s) 2023 tour is shown in Figure 3a, and Coldplay’s Music of the Spheres 2023 Tour is shown in Figure 3b. Such visualisations highlight a general tendency for tours to trace an S-shaped, sigmoid-like curve. From a tour’s dataset, notions of Shelf, Tail, 100%’ers, Uniques and Covers can be calculated and subsequently highlighted within such visualisations. Each of these notions is defined and explained next.
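As a minimal sketch of Equation 1, the snippet below computes Relative Play Counts for one tour and sorts them from high to low, which gives the curve plotted in the tour visualisations. The function name `relative_play_counts` and the toy play counts are hypothetical illustrations, not part of the project's code or data, and Python is used here purely for illustration.

```python
from typing import Dict, List, Tuple


def relative_play_counts(play_counts: Dict[str, int],
                         num_shows: int) -> List[Tuple[str, float]]:
    """Return (song, RPC) pairs sorted from high to low RPC (Equation 1)."""
    if num_shows <= 0:
        raise ValueError("num_shows must be positive")
    # RPC_i = (PC_i / tN) * 100, written as PC_i * 100 / tN.
    rpc = [(song, pc * 100.0 / num_shows) for song, pc in play_counts.items()]
    # Sort high to low, matching the song ordering on the x-axis of the plots.
    return sorted(rpc, key=lambda pair: pair[1], reverse=True)


# Hypothetical 20-show tour (invented numbers, not taken from the dataset).
toy_tour = {"Song B": 10, "Song C": 1, "Song A": 20}
curve = relative_play_counts(toy_tour, num_shows=20)
for song, value in curve:
    print(f"{song}: {value:.1f}")
```

A song played at all 20 shows gets an RPC of 100, and a song played once gets an RPC of 5, so tours of different lengths become directly comparable.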
Shelf - The notion of a tour’s Shelf is a measure of the extent to which a tour has a set of songs that are played at most of the tour’s shows, outlining a shelf-like shape in the top left of the plots in Figure 3. Given a Shelf Size ($SS$) value, denoting what top percentile of tour songs are to be considered part of the shelf, a Shelf Value $S$ can be calculated as the ratio of a tour’s songs that are in its shelf. Given the tour’s set $X$ of $n$ songs, sorted as $x_1$ to $x_n$ from high to low with respect to their $RPC$ values:

$$ X = \{x_1, x_2, \ldots, x_n\} \quad \text{where } x_1 \geq x_2 \geq \cdots \geq x_n \tag{2} $$

The Shelf Songs are selected as the top $SS$ percentile of $X$:

$$ \text{Shelf Songs} = \text{Top}_{SS}\ \text{Percentile} = \{x \in X \mid x \geq P_{SS}(X)\} \tag{3} $$

and $S$ is calculated via:

$$ S = \frac{|\text{Shelf Songs}|}{n} \tag{4} $$

An $SS$ value of 10% would select all the songs that have been played at 90% or more of a tour’s shows, and the corresponding $S$ value represents the ratio of the tour’s songs that are played at 90% or more of the tour’s shows. So, an $S$ value of 0.25 would represent that 25% of the tour’s songs are played at 90% or more of its shows.

Figure 3: Tours Visualisation Just Data Points. (a) Bruce Springsteen (and The E-Street Band) - 2023 Tour (Just Data Points). (b) Coldplay - Music of the Spheres 2023 Tour (Just Data Points). Axes: Song No. (x), Relative Play Count (y).

¹⁴ Such thresholds are beneficial for identifying and removing tours with missing data issues, such as tours for which only a few of the shows have been added, or tours that have many shows added but lack song information for many of the shows. Such tours, if left in the data, can unduly impact and bias analysis.

Tail - The notion of a tour’s Tail is a measure of the extent to which a tour has a set of songs that are played only rarely on the tour, outlining a tail-like shape in the bottom right of the plots in Figure 3.
Given a Tail Size ($TS$) value, denoting what bottom percentile of tour songs are to be considered a part of the tail, a Tail Value $T$ can be calculated as the ratio of a tour’s songs that are in its tail. The Tail Songs are selected as the bottom $TS$ percentile of $X$:

$$ \text{Tail Songs} = \text{Bottom}_{TS}\ \text{Percentile} = \{x \in X \mid x \leq P_{TS}(X)\} \tag{5} $$

and $T$ is calculated via:

$$ T = \frac{|\text{Tail Songs}|}{n} \tag{6} $$

A $TS$ value of 10% would select the songs that are played at 10% or less of a tour’s shows, and the corresponding $T$ value represents the ratio of the tour’s songs that are played at 10% or less of the tour’s shows. So, a $T$ value of 0.4 would represent that 40% of the tour’s songs are played at 10% or less of its shows.

100%’ers - In the set of songs making up the tour’s shelf, there exists a subset of 0 or more songs that are played at 100% of the tour’s shows. This subset of songs (100%’er Songs) can be identified as the set of songs that have $RPC = 100$. A 100%’ers Value $H$ is calculated as the ratio of a tour’s songs that are in this subset:

$$ H = \frac{|\text{100\%'er Songs}|}{n} \tag{7} $$

Uniques - In the set of songs making up a tour’s tail, there exists a subset of 0 or more songs that are played only once during the whole tour. This subset of songs (Unique Songs) can be identified as the set of songs that have $PC = 1$. A Uniques Value $U$ is calculated as the ratio of a tour’s songs that are in this subset:

$$ U = \frac{|\text{Unique Songs}|}{n} \tag{8} $$

Covers - For each song played on the tour we have information denoting which are cover songs (with respect to the artist), from which the shelf songs that are cover songs, and the tail songs that are cover songs, can be determined. From this, the set of shelf songs minus those that are covers can be determined and a Shelf Minus Covers Ratio Value $SMC$ calculated. Similarly, the set of tail songs minus those that are covers can be determined and a Tail Minus Covers Ratio Value $TMC$ calculated.
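The tour-level notions defined above can be sketched in a few lines. This is an illustrative reading rather than the project's implementation. It assumes, following the worked examples in the text, that the shelf is the set of songs with RPC of at least 100 minus SS and the tail is the set with RPC of at most TS. It also assumes that SMC and TMC are ratios over all n songs of the tour, which the text does not state explicitly. The `Song` class, the `tour_metrics` function and the toy tour are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Song:
    title: str
    play_count: int          # PC: number of shows on the tour featuring the song
    is_cover: bool = False   # cover with respect to the touring artist


def tour_metrics(songs: List[Song], num_shows: int,
                 ss: float = 10.0, ts: float = 10.0) -> Dict[str, float]:
    """Return S, T, H, U, SMC and TMC for one tour, each as a ratio of n songs."""
    n = len(songs)
    if n == 0 or num_shows <= 0:
        raise ValueError("a tour needs at least one song and one show")

    def rpc(song: Song) -> float:
        return song.play_count * 100.0 / num_shows  # Equation 1

    # Assumption: thresholds follow the worked examples in the text
    # (SS = 10 selects RPC >= 90, TS = 10 selects RPC <= 10).
    shelf = [s for s in songs if rpc(s) >= 100.0 - ss]
    tail = [s for s in songs if rpc(s) <= ts]
    hundreds = [s for s in songs if s.play_count == num_shows]  # RPC = 100
    uniques = [s for s in songs if s.play_count == 1]           # PC = 1

    return {
        "S": len(shelf) / n,
        "T": len(tail) / n,
        "H": len(hundreds) / n,
        "U": len(uniques) / n,
        # Assumption: covers are removed from the numerator only.
        "SMC": len([s for s in shelf if not s.is_cover]) / n,
        "TMC": len([s for s in tail if not s.is_cover]) / n,
    }


# Hypothetical 20-show tour with four songs (invented numbers).
toy_tour = [
    Song("Opener", 20),
    Song("Hit", 18),
    Song("Rotating Track", 10),
    Song("One-off Cover", 1, is_cover=True),
]
metrics = tour_metrics(toy_tour, num_shows=20)
print(metrics)
```

In this toy tour, half of the songs sit in the shelf and a quarter sit in the tail. Because the only tail song is a cover, removing covers empties the tail while leaving the shelf unchanged.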
These calculated notions of Shelf, Tail, 100%’ers, Uniques and Covers can be highlighted visually within our tour plots, as shown for the tours introduced earlier: Bruce Springsteen (and the E-Street Band’s) 2023 tour in Figure 4a, and Coldplay’s Music of the Spheres 2023 Tour in Figure 4b. In these plots, the shelf’s lower edge is denoted by a dotted green line and the 100%’ers are those songs that sit on the solid green line. The tail’s upper edge is denoted by a dotted orange line and the uniques are those songs that sit on the solid orange line. Cover songs are denoted by filled red data points, and the shelf and tail covers can be identified as those filled data points above the shelf’s lower edge and below the tail’s upper edge respectively. Here, and within subsequent analysis, $SS$ and $TS$ parameters of 10% are utilised. However, these parameters can be altered to any desired values. For experimentation exploring sensitivity analysis and the impacts of these parameters, see Appendix A.

Such analysis can be utilised to explore and compare different tours from the same artist and to compare tours of different artists. For example, regarding other Bruce Springsteen tours, the plot for the Wrecking Ball tour is shown in Figure 4c and the plot for the Bruce Springsteen on Broadway tour is shown in Figure 4d. These plots highlight how, as in the Wrecking Ball tour’s case, a tour can have quite a small and sharp shelf, or, as in the Bruce Springsteen on Broadway tour, a tour can alternatively be predominantly made up of a shelf. Taylor Swift’s Speak Now World Tour is shown in Figure 4e, highlighting a tour with a long tail made up of many uniques that are cover songs. Pink Floyd’s The Wall tour is shown in Figure 4f, highlighting a tour where all of the tour’s songs are contained in the shelf and are in fact all 100%’ers. Building on this analysis of single tours, we next explore analysis of an artist’s whole career of tours.

Figure 4: Tour Visualisations with Shelf, Tail, and Covers Identified. (a) Bruce Springsteen - 2023 Tour. (b) Coldplay - Music of the Spheres Tour. (c) Bruce Springsteen - Wrecking Ball Tour. (d) Bruce Springsteen - Broadway Tour. (e) Taylor Swift - Speak Now World Tour. (f) Pink Floyd - The Wall Tour. Axes: Song No. (x), Relative Play Count (y).

3.4. Artist Career of Tours Analysis

For a single tour, its Shelf Value $S$ and Tail Value $T$ can be calculated. With such a pair of values calculated for every tour of an artist’s career, the whole career can be visualised, in chronological order, in a bar plot. Such a visualisation of every Bruce Springsteen tour is shown in Figure 5a. Here, for each tour, tail values are shown as negative blue bars and the corresponding shelf values are shown as green bars. Additionally, for each tour, its 100%’ers Value $H$ and Uniques Value $U$ can be calculated. These values are equal to or less than the tour’s $S$ and $T$ values respectively. Therefore, an artist’s whole career can be visualised in a bar plot that highlights how much of each tour’s tail is made up of uniques and how much of each tour’s shelf is made up of 100%’ers. For Bruce Springsteen, every tour with this information is shown in Figure 5b. Moreover, for each tour, the amount of its shelf that is made up of covers, and the amount of its tail that is made up of covers, can be calculated.
Then, an artist’s whole career can be visualised with the amount of each tour’s tail that is made up of covers, and the amount of each tour’s shelf that is made up of covers, highlighted. Every tour for Bruce Springsteen with this information is shown in Figure 5c. Alternatively, we could visualise the impact of cover songs on each tour’s shelf and tail by calculating each tour’s Shelf Minus Covers Value $SMC$ and Tail Minus Covers Value $TMC$. Then, an artist’s whole career can be visualised, highlighting how shelves or tails that are made up of a substantial amount of cover songs become smaller. Every tour for Bruce Springsteen with this information is shown in Figure 5d.

Such analysis can be utilised to explore and compare the careers of different artists. For example, Figure 6a shows the career of Iron Maiden, and Figure 6b shows the career of Slipknot, each showing Shelf and Tail values and how much of them are taken up by 100%’ers and Uniques. These plots highlight how these artists lean clearly towards more uniform, greatest-hits-like sets, and how this has only become more pronounced over their careers. The career of Taylor Swift, showing each tour’s Shelf and Tail values and how much of them are taken up by covers, is shown in Figure 6c. Here, after her first tour, shelves and tails of a similar size are observed. However, whereas the Speak Now World Tour’s tail is made up almost entirely of cover songs, we observe the inverse for the tail of the most recent Eras Tour. The range of different shelf and tail values for the tours of Pink Floyd is shown in Figure 6d. From this plot we observe stark differences between early tours, which have little or no shelf, and later tours, coinciding with their commercial peak, which have little or no tail and large shelves. Building on this analysis of the whole touring career of a single artist, we next explore comparison analysis between a corpus of artists.

Figure 5: All Tours Analysis - Bruce Springsteen. (a) All Tours Shelf and Tail. (b) All Tours Shelf & 100%’ers and Tail & Uniques. (c) All Tours Shelf & Covers and Tail & Covers. (d) All Tours Shelf And Tail Without Covers. Axes: Relative Size (x), Tour Name (y).

3.5. Comparing Artists

For an artist, Shelf $S$ and Tail $T$ values for each tour can be calculated, denoting the size of the shelf and tail for each tour. From the set of shelf values, a single average shelf value $\bar{S}$ can be calculated via:

$$ \bar{S} = \frac{1}{n} \sum_{i=1}^{n} S_i \tag{9} $$

where $\bar{S}$ is the average of the individual Shelf values, $S_i$ represents the Shelf value for Tour $i$, and $n$ is the number of Tours for the artist. Similarly, an average tail value $\bar{T}$ for the artist can be calculated via:

$$ \bar{T} = \frac{1}{n} \sum_{i=1}^{n} T_i \tag{10} $$

where $\bar{T}$ is the average of the individual Tail values, $T_i$ represents the Tail value for Tour $i$, and $n$ is the number of Tours for the artist.

From these calculations, a pair of $\bar{S}$ and $\bar{T}$ values can be calculated for every artist, and the whole set of artists can be shown within a single scatter plot, as shown in Figure 7. Here, the x-axis denotes mean tail values ($\bar{T}$) and the y-axis denotes mean shelf values ($\bar{S}$), with the shelf axis scale inverted to highlight how, in variety terms, the larger the shelf the less variety, and the larger the tail the more variety. Each data point in the plot represents an artist, coloured with respect to their generalized genre value.
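Equations 9 and 10 amount to taking unweighted means of the per-tour Shelf and Tail values. A minimal sketch follows, in which the function name `artist_means` and the three-tour career are hypothetical and each tour contributes equally regardless of its length, as the equations imply.

```python
from typing import List, Tuple


def artist_means(tours: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (mean shelf, mean tail) over an artist's tours (Equations 9 and 10).

    Each element of `tours` is an (S, T) pair for one tour.
    """
    if not tours:
        raise ValueError("an artist needs at least one tour")
    n = len(tours)
    mean_shelf = sum(s for s, _ in tours) / n
    mean_tail = sum(t for _, t in tours) / n
    return mean_shelf, mean_tail


# Hypothetical career of three tours (invented S and T values).
toy_career = [(0.50, 0.25), (0.25, 0.50), (0.75, 0.00)]
mean_s, mean_t = artist_means(toy_career)
print(mean_s, mean_t)
```

Each resulting pair of means corresponds to one artist's point in the scatter plot of mean tail against mean shelf.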
The solid blue diagonal line, running from the bottom left to the top right, signifies the vector of values where the sum of $\bar{S}$ and $\bar{T}$ is 100, which would signify that 100% of songs (in all the artist’s tours) are contained within shelves and tails. Therefore, each artist’s distance to this line reflects their average combined shelf and tail size: the closer an artist sits to the line, the larger the share of their songs that falls within shelves and tails. The solid red diagonal line, running from the top left to the bottom right of the plot, signifies the set of pairs of equal $\bar{S}$ and $\bar{T}$ values. Artists sitting on this line have equally sized average shelf ($\bar{S}$) and tail ($\bar{T}$) values; artists that sit above the line have a greater average tail than shelf, suggesting more variety; and artists that sit below the line have a greater average shelf than tail, suggesting less variety. The distance each artist is from this line represents the strength of this property. Artists further towards the bottom left of the plot represent those with much larger shelves and smaller tails on average, suggesting they are the artists with the least variety. Artists further towards the top right of the plot represent those with smaller shelves and larger tails on average, suggesting they are the artists with the most variety. Such a visualisation, which preserves the dimensions of the shelf and the tail separately, enables nuanced comparisons within this multi-dimensional space [3]. This facilitates highlighting, for example, differences between artists who are equidistant from the solid red diagonal line but vary in their distance from the solid blue line.

[Figure 6: All Tours Analysis - Various Artists. Panels: (a) Iron Maiden – All Tours Shelf & 100%’ers and Tail & Uniques; (b) Slipknot – All Tours Shelf & 100%’ers and Tail & Uniques; (c) Taylor Swift – All Tours Shelf & Covers and Tail & Covers; (d) Pink Floyd – All Tours Shelf & 100%’ers and Tail & Uniques. Each panel plots the tours, listed by Tour Name, against Relative Size (−100 to 100).]
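As a minimal sketch of the averages in Eqs. (9) and (10), together with each artist's position relative to the blue and red reference lines, the snippet below works from a list of per-tour (shelf, tail) percentages. The function name, the input layout, and the example values are illustrative assumptions, not the authors' implementation, and the distances are ordinary perpendicular distances in the (mean tail, mean shelf) plane.

```python
from math import sqrt
from statistics import mean


def artist_summary(tours):
    """Summarise one artist from per-tour (shelf, tail) percentages (0 to 100)."""
    mean_shelf = mean(shelf for shelf, _ in tours)  # Eq. (9)
    mean_tail = mean(tail for _, tail in tours)     # Eq. (10)
    # Perpendicular distance to the blue line, where mean_shelf + mean_tail = 100.
    dist_to_blue = abs(mean_shelf + mean_tail - 100) / sqrt(2)
    # Signed perpendicular distance to the red line, where mean_shelf = mean_tail;
    # positive values mean the average tail exceeds the average shelf.
    signed_dist_to_red = (mean_tail - mean_shelf) / sqrt(2)
    return mean_shelf, mean_tail, dist_to_blue, signed_dist_to_red


# Hypothetical (shelf, tail) values for three tours; not taken from the dataset.
example_tours = [(40.0, 20.0), (60.0, 10.0), (50.0, 30.0)]
mean_shelf, mean_tail, dist_to_blue, signed_dist_to_red = artist_summary(example_tours)
print(mean_shelf, mean_tail)  # 50.0 20.0
```

Two artists with the same signed distance to the red line can still differ in their distance to the blue line, which is the distinction the two-dimensional plot is intended to preserve.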
[Figure 7: All Artists Comparisons - Average Tail Vs Average Shelf. Scatter plot with one labelled point per artist; x-axis: Mean Tail (0 to 100); y-axis: Mean Shelf (0 to −100); colour: GeneralizedGenre (Alternative, ElectronicAndDance, Folk, Metal, Pop, Progressive, Punk, Rock).]

Further analysis can consider shelves and tails with any cover songs they contain excluded, by computing average shelf and tail values for each artist on this basis and creating a scatter plot of the results, as shown in Figure 8.
In this plot, we see how some artists who play a lot of covers within their tail move further away from the top right of the plot, highlighting the importance, for some artists, of playing cover songs as part of their attainment of variety.¹⁵

¹⁵ Similarly, we could explore utilising data pertaining to the number of uniques and 100%’ers within shelves and tails to, for example, use a weighting system to give these songs more impact.

[Figure 8: All Artists Comparisons - Average Tail Minus Tail Covers Vs Average Shelf Minus Shelf Covers. Scatter plot with one labelled point per artist; x-axis: Mean Tail Minus Covers (0 to 100); y-axis: Mean Shelf Minus Covers (0 to −100); colour: GeneralizedGenre (Alternative, ElectronicAndDance, Folk, Metal, Pop, Progressive, Punk, Rock).]

From the data in Figures 7 and 8, additional analysis can compute overall averages for each generalized genre. Calculated genre averages for Tail and Shelf values are shown in Figure 9a. The plot highlights how some genres, such as Electronic and Dance, and Pop, on average exhibit less variety than other genres, such as Folk, Alternative, and Punk. Calculated genre averages when shelf and tail cover songs are not considered are shown in Figure 9b. Here, we observe the impact that removing covers has on the genre averages, and how the impact is greater for some genres, such as Rock and Alternative, than for others, such as Punk.

Further analysis of the data in Figures 7 and 8 can, through division of the 2-dimensional plot space, cluster the artists into a set of ordinal clusters, from very high variety to very low variety. For discussions and results from such clustering analysis, see Appendix B.

Finally, in the pursuit of a single measure of variety for each artist, the shelf and tail values of a tour are combined to derive a single measure of Variety $V$ for each tour. For tour $i$, its Variety measure $V_i$ can be calculated via:

$$V_i = T_i - S_i \tag{11}$$

where $T_i$ is the Tail value of tour $i$ and $S_i$ is the Shelf value of tour $i$. A positive value represents a tour with a tail larger than its shelf, suggesting more variety, and a negative value represents a tour with a tail smaller than its shelf, suggesting less variety.
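The per-tour measure in Eq. (11), and a simple average of it over an artist's tours, can be sketched as follows. The function names and the example (shelf, tail) values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def tour_variety(shelf, tail):
    """Eq. (11): variety of one tour, its tail value minus its shelf value."""
    return tail - shelf


def artist_variety(tours):
    """Mean of the per-tour variety values over an artist's (shelf, tail) pairs."""
    return mean(tour_variety(shelf, tail) for shelf, tail in tours)


# Hypothetical (shelf, tail) percentages for three tours of one artist.
tours = [(40.0, 20.0), (60.0, 10.0), (30.0, 70.0)]
per_tour = [tour_variety(shelf, tail) for shelf, tail in tours]
print(per_tour)               # [-20.0, -50.0, 40.0]
print(artist_variety(tours))  # -10.0
```

Because averaging is linear, the mean of the per-tour values equals the artist's mean tail minus their mean shelf, so the single measure can also be read directly off the mean shelf and mean tail pair.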
From the per-tour values $V_i$, an average overall variety value $\bar{V}$ of an artist’s tours can be calculated via:

$$\bar{V} = \frac{1}{n} \sum_{i=1}^{n} V_i \tag{12}$$

where $\bar{V}$ is the average of the individual tour $V$ values, $V_i$ represents the Variety value for Tour $i$, and $n$ is the number of Tours for the artist. The set of all artists and their $\bar{V}$ values, ordered with respect to $\bar{V}$, and coloured with respect to generalized genre, is shown in Figure 10, providing an overall visualisation of a corpus of artists with respect to variety.

[Figure 9: Genre Averages Analysis. Panels: (a) All Artists - Tail and Shelf analysis with Genre Averages; (b) All Artists - Tail without covers and Shelf without covers analysis with Genre Averages. x-axis: Mean Tail (0 to 100); y-axis: Mean Shelf (0 to −100); colour: GeneralizedGenre (Alternative, ElectronicAndDance, Folk, Metal, Pop, Progressive, Punk, Rock).]

3.6. Correlation Analysis of Variety with Other Features

To examine the robustness of our notion of variety for comparing different artists, despite their varying characteristics, such as the number of tours, the length of performances, and activity during different time periods, we conducted a correlation analysis between our $\bar{V}$ Variety measure and such properties. Table 3 shows the correlation results for seven artist properties, along with definitions of each property. The Correlation values are the correlation levels found between each of the seven properties and our $\bar{V}$ measure. Here, correlation is calculated as the Pearson Correlation Coefficient. For fuller descriptions and discussions of each of these properties see Appendix C, which also contains scatter plots of each property against our $\bar{V}$ measure.
Table 3 highlights how our measure has only very weak correlation to these properties,¹⁶ suggesting our measure is robust for analysis between artists, and will not be unduly biased by, for example, different artists having more tours or longer shows.

¹⁶ Where the semantics of correlation strength can be classified as - Very Weak Correlation: $|r| < 0.2$, Weak Correlation: $0.2 \le |r| < 0.4$, Moderate Correlation: $0.4 \le |r| < 0.6$, Strong Correlation: $0.6 \le |r| < 0.8$, Very Strong Correlation: $|r| \ge 0.8$ [12].

Table 3: Correlation Analysis

| Property Name | Description | Correlation |
|---|---|---|
| Number of Tours | The total number of tours | -0.1786 |
| Total Number of Shows | The total number of shows from all tours | -0.0725 |
| Length of Tours | The average number of shows per tour | 0.0908 |
| Average Show Length | The average show length in terms of number of songs | 0.0952 |
| H-Index | The career H-Index, where an artist has an h-index of h if they have played h songs at least h times each | 0.0630 |
| Artist Start Date (Groups Only) | The formation (incarnation) date of the artist (for groups only) | 0.0933 |
| Amount Time Period (Groups Only) | The amount of time active, in years (for groups only) | -0.0952 |

[Figure 10: All Artists, Variety analysis, coloured by Generalized Genre. Artists listed by Artist Name and ordered by Average Variety (x-axis, −100 to 100); colour: GeneralizedGenre (Alternative, ElectronicAndDance, Folk, Metal, Pop, Progressive, Punk, Rock).]

4. Conclusions

In this paper, we explored data acquisition and processing of musical artists’ touring histories, and proposed an approach to explore set-list variety, at tour level, artist career level, and for comparisons between artists. Our approach proposed the notions of a shelf and a tail, to aid explorations of, and to quantify, variety at tour level and artist level. Furthermore, the approach explores the impact of cover songs on these notions of variety, and explores variety comparisons between different musical genres.
The analysis of variety highlighted the diversity among artists, in terms of a tendency to lean towards playing more uniform or more diverse shows. Additional correlation analysis explored the robustness of the proposed notion of variety, with respect to differing artist properties, such as the number of tours or the average lengths of shows.

From our data processing, some data quality issues were uncovered, such as incomplete or empty data; such instances can be flagged and filtering thresholds applied. Generally, we found more setlist.fm data issues for older tours and shows, and for less popular artists. Additionally, setlist.fm provides set-list data without details on other potential set-list semantics, such as variations in how a song is performed or the inclusion of special elements like artist monologues or other forms of communication during a show. Therefore, future work will explore integrating additional data sources, such as artist fan community databases, to enrich our dataset and model, offering potential for incremental improvements. Additionally, future work will investigate incorporating our analysis into live music recommender systems, which suggest items based on user preferences [4]. Given that factors such as variety and diversity have become increasingly important in this field [18], our analysis may provide valuable insights.

References

[1] E. Abel and A. Goddard. “A Live Concert Performance Recommender System Utilizing User Ideal and Antithesis Ideal Setlist Preferences”. In: 14th International Conference on Smart Computing and Artificial Intelligence. IEEE Computer Society Press, 2023, pp. 330–335.
[2] E. Abel and A. Goddard. The Art Behind Bruce Springsteen’s Setlist Composition as Part of His Stagecraft. 2024.
[3] E. Abel, L. Mikhailov, and J. Keane. “Inconsistency Reduction in decision making via Multi-objective Optimisation”. In: European Journal of Operational Research (2017).
doi: 10.1016/j.ejor.2017.11.044. url: http://linkinghub.elsevier.com/retrieve/pii/S037722171731055X.
[4] M. Aljukhadar, S. Senecal, and C.-E. Daoust. “Using Recommendation Agents to Cope with Information Overload”. In: International Journal of Electronic Commerce 17.2 (2012), pp. 41–70. url: http://www.jstor.org/stable/41739511.
[5] S. Ante. Why Taylor Swift is the new Grateful Dead. 2023. url: https://www.fastcompany.com/90901513/taylor-swift-grateful-dead-cult-brands.
[6] N. Baxter-Moore and T. M. Kitts. “The Live Concert Experience: An Introduction”. In: Rock Music Studies 3.1 (2016), pp. 1–4. url: https://doi.org/10.1080/19401159.2015.1131923.
[7] A. Bullard. Arctic Monkeys blasted for ‘changing lyrics and rhythm of best known songs’ leaving fans unable to singalong - MyLondon. 2023. url: https://www.mylondon.news/whats-on/music-nightlife-news/arctic-monkeys-blasted-changing-lyrics-27104114.
[8] E. C. Callahan and C. Carney. The politics and power of Bob Dylan’s live performances: play a song for me. 2024, p. 229. url: https://www.routledge.com/The-Politics-and-Power-of-Bob-Dylans-Live-Performances-Play-a-Song-for-Me/Callahan-Carney/p/book/9781032315416.
[9] J. R. Castillo and M. J. Flores. “Web-Based Music Genre Classification for Timeline Song Visualization and Analysis”. In: IEEE Access 9 (2021), pp. 18801–18816. doi: 10.1109/access.2021.3053864.
[10] P. Chianca. “Springsteen’s stage success”. In: Bruce Springsteen and Popular Music. Routledge, 2018, pp. 178–188. doi: 10.4324/9781315672144-15/springsteen-stage-success-peter-chianca.
[11] C. Dalla Riva. Swifties vs. Deadheads: A Meditation on Live Music. 2023. url: https://chrisdallariva.substack.com/p/swifties-vs-deadheads-a-meditation?utm%5C%5Fsource=publication-search.
[12] J. D. Evans. Straightforward Statistics for the Behavioral Sciences. Brooks/Cole Publishing Company, 1996.
[13] E. Gleadow.
Bob Dylan divides fans by ‘doing whatever he wants’ and snubbing hit songs at shows - Mirror Online. 2024. url: https://www.mirror.co.uk/3am/celebrity-news/bob-dylan-divides-fans-doing-33095766.
[14] J. E. Hirsch. “An index to quantify an individual’s scientific research output”. In: Proceedings of the National Academy of Sciences of the United States of America 102.46 (2005), p. 16569. doi: 10.1073/pnas.0507655102. url: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1283832/.
[15] R. Johnston. How to Stream Bruce Springsteen 2024 Tour Online. 2024. url: https://www.billboard.com/culture/product-recommendations/watch-bruce-springsteen-tour-online-streaming-1235669883/.
[16] D. Kreps. Bruce Springsteen’s Poignant Cover of Prince’s ‘Purple Rain’. 2016. url: https://www.rollingstone.com/music/music-news/see-bruce-springsteens-poignant-cover-of-princes-purple-rain-171985/.
[17] A. B. Krueger. Land of Hope and Dreams: Rock and Roll, Economics and Rebuilding the Middle Class. Tech. rep. Obama White House Archives, 2013.
[18] M. Kunaver and T. Požrl. “Diversity in recommender systems – A survey”. In: Knowledge-Based Systems 123 (2017), pp. 154–162. doi: 10.1016/j.knosys.2017.02.009.
[19] A. Lerch, C. Arthur, A. Pati, and S. Gururani. “An Interdisciplinary Review of Music Performance Analysis”. In: Transactions of the International Society for Music Information Retrieval 3.1 (2020), pp. 221–245. doi: 10.5334/tismir.53. url: https://transactions.ismir.net/articles/10.5334/tismir.53.
[20] C. Love. On Repeat. Are artists trotting out the same old set lists gig after gig? Tech. rep. Medium, 2018. url: https://databeats.medium.com/on-repeat-70aba1cdc5f8.
[21] B. Mathis-Lilley. “Secrets of the Radiohead Set List”. In: New York Magazine (2006). url: https://nymag.com/arts/all/process/17306/.
[22] F. C. Moss, R. Lieck, and M. Rohrmeier. “Computational modeling of interval distributions in tonal space reveals paradigmatic stylistic changes in Western music history”.
In: Humanities and Social Sciences Communications 11.1 (2024), p. 684. url: https://doi.org/10.1057/s41599-024-03168-1.
[23] M. Mulder and E. Hitters. “Visiting pop concerts and festivals: measuring the value of an integrated live music motivation scale”. In: Cultural Trends 30.4 (2021), pp. 355–375. url: https://doi.org/10.1080/09548963.2021.1916738.
[24] T. Murray. Taylor Swift unexpectedly covers Calvin Harris, Rihanna hit during Liverpool show. 2024. url: https://www.independent.co.uk/arts-entertainment/music/news/taylor-swift-this-is-what-you-came-for-eras-tour-b2563075.html.
[25] MusicNews. Steve Van Zandt Defends Static Bruce Springsteen Setlists. 2023. url: https://vermilioncountyfirst.com/2023/03/29/steve-van-zandt-defends-static-bruce-springsteen-setlists/.
[26] M. Pandey. Crowd-pleasers: The art of choosing the perfect setlist. 2024. url: https://www.bbc.com/news/articles/c4nn9expp04o.
[27] D. Pattie. Rock Music in performance. Palgrave Macmillan, 2007, pp. 1–188. doi: 10.1057/9780230593305/cover.
[28] R. Radburn and C. Love. Digging into concert setlist data: Which artists play the same songs over and over? Tech. rep. Tableau, 2018. url: https://www.tableau.com/blog/data-music-which-artists-use-same-old-setlists-gig-after-gig.
[29] A. Rodriguez. Metallica bases its setlist on what fans listen to on Spotify. 2018. url: https://qz.com/1340887/metallica-bases-its-setlist-on-what-fans-listen-to-on-spotify.
[30] M. Rodriguez, V. Gintautas, and A. Pepe. “A Grateful Dead Analysis: The Relationship Between Concert and Listening Behavior”. In: First Monday 14 (2008). doi: 10.5210/fm.v14i1.2273.
[31] M. Roffman. The 25 Best Rock Acts with Unique Setlists. 2015. url: https://consequence.net/2016/08/the-25-best-rock-acts-with-unique-setlists/.
[32] D. Silver, M. Lee, and C. C. Childress. “Genre Complexes in Popular Music”. In: Plos One 11.5 (2016), e0155471. url: https://doi.org/10.1371/journal.pone.0155471.
[33] F. Thalmann, E. Nakamura, and K.
Yoshii. “Tracking The Evolution Of A Band’s Live Performances Over Decades”. In: Proc. of the 23rd Int. Society for Music Information Retrieval Conf. 2022, pp. 850–857. url: https://madmom.readthedocs.io.

A. Exploration of Shelf and Tail Parameter Values

To aid selection of pertinent $SS$ and $TS$ parameters, and to aid understanding of our dataset, a sensitivity analysis was performed, exploring the average percentage of tour songs that are contained within different combined shelf and tail size parameter values. The impact of experimentation with different sized shelf and tail values is shown in Figure 11. The x-axis denotes different combined sizes of shelf and tail values (so 20 represents where the shelf and tail values are both 10) and the y-axis denotes the overall average percentage of songs that are contained within the combined shelf and tail. Figure 11 highlights how, due to the general trend observed for tours to map out an S-shaped, sigmoid-like function, the percentage of songs contained within the shelf and tail is greater than the percentile values denoting the shelf and tail size. For example, a shelf and tail percentile size both of 10% (20% combined value) results in over 50% on average of tours’ songs being contained within this 20% percentile space. Moreover, these values denote a point in the plot where the decrease of the gradient of the line is levelling off, suggesting their suitability as shelf and tail percentile sizes. Further analysis could explore additional experimentation, such as breaking the results down to see the separate contributions of the shelf and tail to the total, using unequal shelf and tail size values, and exploring differences within the results when the data is subsetted for features such as genre.

[Figure 11: Shelf and Tail Parameter Size Impact on Overall Percentage of Songs Included. x-axis: Combined Shelf and Tail Values (0 to 100); y-axis: Mean Tail And Shelf Percent (0 to 100). Point labels, from lowest to highest: 0, 12.13, 16.32, 26.44, 40.57, 46.79, 52.99, 61.39, 65.08, 68.67, 74.76, 80.29, 90.53, 100.]

B. Ordinal Clustering Analysis

From calculations of a pair of $\bar{S}$ and $\bar{T}$ values for every artist, the whole set of artists can be visualised within a single scatter plot, as shown in Figure 7, and from calculations of shelves and tails not including cover songs, the set of artists can be visualised within a single scatter plot, as shown in Figure 8. Further analysis of the data in Figures 7 and 8 can cluster artists into ordinal groups by dividing the 2-dimensional plot space. Variety can be viewed as a combination of levels of shelves and tails, with the plot space divided accordingly, as shown in Figure 12a for the average shelf and tail values of each artist. In this plot, the dotted red lines divide the space into 8 levels of variety, representing 8 ordinal clusters. Each artist belongs to only one cluster, as shown by the data point colours in Figure 12a. The membership constraints of each of the 8 clusters can be defined in terms of mean shelf ($\bar{S}$) and mean tail ($\bar{T}$) value ranges, and assigned semantic ordinal names such as:

1. Very High Variety: Where the difference between the mean tail and mean shelf value is greater than or equal to 75.
2. High Variety: Where the difference between the mean tail and mean shelf value is greater than or equal to 50 and less than 75.
3. Moderate Variety: Where the difference between the mean tail and mean shelf value is greater than or equal to 25 and less than 50.
4. Low Variety: Where the difference between the mean tail and mean shelf value is greater than or equal to 0 and less than 25.
5. Low Uniformity: Where the difference between the mean tail and mean shelf value is greater than or equal to -25 and less than 0.
6. Moderate Uniformity: Where the difference between the mean tail and mean shelf value is greater than or equal to -50 and less than -25.
7. High Uniformity: Where the difference between the mean tail and mean shelf value is greater than or equal to -75 and less than -50.
8. Very High Uniformity: Where the difference between the mean tail and mean shelf value is less than -75.

Similar analysis can be conducted for calculations of shelves and tails that exclude any cover songs, as shown in Figure 12b. Here, we observe that when covers are excluded, some artists shift in the plot to the extent that they belong to a different cluster, invariably one with less variety. Within such cluster analysis, the percentage of artists in each cluster can be computed. The breakdown of the percentage of artists in each cluster, both for data including cover songs and for data not considering cover songs, is shown in Table 4. The table highlights how, when cover songs are not considered, cluster memberships exhibit a general trend for the distribution to move from variety to uniformity.
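The eight membership rules listed above amount to binning the difference between mean tail and mean shelf into bands of width 25. A minimal sketch follows; the names `CLUSTER_BANDS` and `variety_cluster` are hypothetical, and the inputs are assumed to be percentages between 0 and 100.

```python
# Lower bounds (inclusive) on the difference mean_tail - mean_shelf, highest first.
CLUSTER_BANDS = [
    (75, "Very High Variety"),
    (50, "High Variety"),
    (25, "Moderate Variety"),
    (0, "Low Variety"),
    (-25, "Low Uniformity"),
    (-50, "Moderate Uniformity"),
    (-75, "High Uniformity"),
]


def variety_cluster(mean_shelf, mean_tail):
    """Return the ordinal cluster name for one artist's mean shelf and mean tail."""
    difference = mean_tail - mean_shelf
    for lower_bound, name in CLUSTER_BANDS:
        if difference >= lower_bound:
            return name
    # Anything below -75 falls into the final band.
    return "Very High Uniformity"


print(variety_cluster(mean_shelf=50, mean_tail=50))  # Low Variety
print(variety_cluster(mean_shelf=95, mean_tail=10))  # Very High Uniformity
```

Checking the bands from the highest lower bound downwards means each artist is assigned to exactly one cluster, matching the constraint that cluster membership is exclusive.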
[Figure 12: Cluster Membership Analysis. Panels: (a) All Artists Average Tail Vs Average Shelf Cluster Membership; (b) All Artists Average Tail Without Covers Vs Average Shelf Without Covers Cluster Membership. x-axis: Mean Tail (0 to 100); y-axis: Mean Shelf (0 to −100); colour: Cluster (High Variety, Moderate Variety, Low Variety, Low Uniformity, Moderate Uniformity, High Uniformity, Very High Uniformity).]

Table 4: Cluster Membership Breakdown Percentages

| Cluster Name | Artist % (Including Covers) | Artist % (Not Including Covers) |
|---|---|---|
| Very High Variety | 0.00 | 0.00 |
| High Variety | 1.85 | 0.62 |
| Moderate Variety | 12.96 | 2.47 |
| Low Variety | 19.14 | 25.31 |
| Low Uniformity | 40.74 | 43.21 |
| Moderate Uniformity | 17.90 | 21.60 |
| High Uniformity | 5.56 | 6.17 |
| Very High Uniformity | 1.85 | 0.62 |

C. Correlation Investigation and Discussions of Variety

Our analysis of artist variety explores comparisons of shelves and tails, and our single artist level measure of variety, for comparisons between artists. The robustness of our notion of variety can be explored for its capability to compare different tours and different artists, even though different artists have different career characteristics, in terms of properties such as the number of tours, tour show count sizes, show lengths, being active and touring within different time periods and more. We explore the presence of correlation between our $\bar{V}$ variety measure and such properties, with the Pearson Correlation Coefficient calculated via:

$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \tag{13}$$

The explored properties and results are shown in Table 5, and visualisation scatter plots of each property against our $\bar{V}$ measure are shown in Figures 13 – 16. Next, each property is outlined and discussed.

Number of Tours: Our dataset includes artists with varying numbers of tours. Figure 13a compares each artist’s tour count with their $\bar{V}$, showing no strong relationship between the variables. The Pearson Correlation Coefficient value for these two variables is -0.1786.

Total Number of Shows: For the artists in our dataset, the overall number of shows varies. Comparisons between artists’ total show count and their $\bar{V}$ is shown in Figure 13b. Here, the plot highlights there is no strong relationship between these variables; the Pearson Correlation Coefficient value for these two variables is -0.0725.

Length of Tours: Within our dataset, the number of shows within our artists’ tours differs.
A comparison between each artist’s average tour show count and their $\bar{V}$ is shown in Figure 14a, and the correlation value between these variables is 0.0908. The outlier in the plot is due to the fact that Bob Dylan’s “Never Ending Tour” is officially billed as a single continuous tour spanning from 1988 to the present day and contains around 4000 shows. A plot with Bob Dylan excluded to aid readability is shown in Figure 14b.

Average Show Length: For our set of artists, their average show length, in terms of the number of songs in the set-lists, varies. A comparison of each artist’s average show length and their $\bar{V}$ is shown in Figure 15a, and the correlation value between these variables is 0.0952.

H-Index: The H-index is a scientific output metric that quantifies both the productivity and citation impact of a researcher’s publications, where a scholar has an index of h if h of their papers have been cited at least h times each [14]. We could consider this notion within the set-list domain as a measure for an artist (instead of a researcher) and songs (instead of papers), where an artist has an h-index of h if they have played h songs at least h times each. To calculate this for each artist, the set of tours for an artist is combined and overall song counts are calculated, from which the H-Index can then be derived. The plot comparing artists’ H-Index against $\bar{V}$ is shown in Figure 15b, and the correlation value between these variables is 0.0630.

Artist Start Dates and Amount of Active Years (Groups Only): For the groups within our artist dataset, information pertaining to each group’s formation date and, if applicable, disbandment date (denoted as present day for groups that are still active) is retrieved from MusicBrainz. A comparison of each group’s start date and their $\bar{V}$ is shown in Figure 16a, and the correlation value between these variables is 0.0933.
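The set-list H-Index described earlier in this appendix can be sketched as follows, assuming an artist's shows from all tours are pooled into a list of set-lists and that every occurrence of a song title counts as one play. The function name and the example data are hypothetical.

```python
from collections import Counter


def setlist_h_index(shows):
    """Largest h such that h distinct songs were each played at least h times.

    `shows` is an iterable of set-lists, each a list of song titles.
    """
    play_counts = Counter(song for setlist in shows for song in setlist)
    h = 0
    ranked = sorted(play_counts.values(), reverse=True)
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical pooled shows: song A is played 4 times, B 3 times, C, D and E once.
shows = [["A", "B", "C"], ["A", "B", "D"], ["A", "B"], ["A", "E"]]
print(setlist_h_index(shows))  # 2
```

Sorting the play counts in descending order lets the loop stop at the first rank whose count falls below it, which is the same procedure used for the bibliometric index.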
From this timeline data, the length of time each group was, or has been, active can be calculated (with groups that are still active measured up to the present day). A comparison of each group's years active and their $\bar{V}$ is shown in Figure 16b, and the correlation value between these variables is -0.0952.

The set of correlation values for these properties against $\bar{V}$ is shown in Table 5. Correlation strength can be classified as follows [12]: Very Weak Correlation, $|r| < 0.2$; Weak Correlation, $0.2 \le |r| < 0.4$; Moderate Correlation, $0.4 \le |r| < 0.6$; Strong Correlation, $0.6 \le |r| < 0.8$; Very Strong Correlation, $|r| \ge 0.8$. Under this classification, Table 5 shows that our $\bar{V}$ variety measure has only a very weak correlation with every one of these properties. This suggests the measure is robust across the varying properties of our artists, and that comparisons between artists are not unduly affected by these variations.

Table 5: Correlation Data Analysis

Property Name                        Correlation   Min Value    Max Value    Standard Deviation
Number of Tours                      -0.1786       5            38           6.2613
Total Number of Shows                -0.0725       209          3541         518.9224
Length of Tours                       0.0908       23.87        393.4        35.7931
Average Show Length                   0.0952       12.87        57.52        5.7834
H-Index                               0.0630       28           126          15.6733
Artist Start Date (Groups Only)       0.0933       1962-01-01   2011-01-01   –
Active Time in years (Groups Only)   -0.0952       5.999        62.52        11.6033
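As an illustration of the correlation check described above, the following minimal Python sketch computes the Pearson coefficient in the raw-sum form of Equation 13 and maps a value onto the strength bands quoted from [12]. The function names `pearson_r` and `correlation_strength` are hypothetical and chosen only for this example; this is not the code used to produce Table 5, and the only data it touches are the correlation values already reported there.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, using the raw-sum form of Equation 13."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must be the same length, with at least 2 pairs")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator


def correlation_strength(r):
    """Label |r| with the strength bands listed in the text."""
    magnitude = abs(r)
    if magnitude < 0.2:
        return "Very Weak"
    if magnitude < 0.4:
        return "Weak"
    if magnitude < 0.6:
        return "Moderate"
    if magnitude < 0.8:
        return "Strong"
    return "Very Strong"


if __name__ == "__main__":
    # Correlation values as reported in Table 5.
    reported = {
        "Number of Tours": -0.1786,
        "Total Number of Shows": -0.0725,
        "Length of Tours": 0.0908,
        "Average Show Length": 0.0952,
        "H-Index": 0.0630,
        "Artist Start Date (Groups Only)": 0.0933,
        "Active Time in years (Groups Only)": -0.0952,
    }
    for name, r in reported.items():
        print(f"{name}: r = {r:+.4f} ({correlation_strength(r)})")
```

In practice, `pearson_r` would be called with one list holding a property value per artist and a second list holding the matching $\bar{V}$ values. The raw-sum form mirrors Equation 13 directly; a mean-centred formulation would be numerically safer for large values, which is a design choice left open here.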
[Figure 13: Number of Tours and Total Number of Shows Correlation Analysis. Scatter plots of artists with Variety ($\bar{V}$, from −100 to 100) on the horizontal axis: (a) Number of Tours Vs $\bar{V}$; (b) Total Number of Shows Vs $\bar{V}$.]

[Figure 14: Length of Tours Correlation Analysis. Scatter plots of artists with Variety ($\bar{V}$, from −100 to 100) on the horizontal axis and Length of Tours (Average Shows Per Tour) on the vertical axis: (a) Length of Tours Vs $\bar{V}$; (b) Length of Tours Vs $\bar{V}$ (sans Bob Dylan).]

[Figure 15: Average Show Length and H-Index Correlation Analysis. Scatter plots of artists with Variety ($\bar{V}$, from −100 to 100) on the horizontal axis: (a) Average Show Length Vs $\bar{V}$; (b) H-Index (Overall Career H-Index) Vs $\bar{V}$.]

[Figure 16: Start Date and Active Years Correlation Analysis (Groups Only). Scatter plots of groups with Variety ($\bar{V}$, from −100 to 100) on the horizontal axis: (a) Artist Start Date (Groups Only) Vs $\bar{V}$; (b) Artist Active Years (Groups Only) Vs $\bar{V}$.]
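The set-list H-index used as one of the compared properties above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `setlist_h_index` and the nested-list input layout (tours containing shows containing song titles) are hypothetical, and every occurrence of a song in a set-list is assumed to count as one play.

```python
from collections import Counter


def setlist_h_index(tours):
    """Return h such that h songs have each been played at least h times.

    `tours` is assumed to be a list of tours, each tour a list of shows,
    and each show a list of song titles (a hypothetical layout).
    """
    # Combine all tours and count how often each song was played overall.
    play_counts = Counter(
        song for tour in tours for show in tour for song in show
    )
    # Rank songs from most played to least played.
    ranked_counts = sorted(play_counts.values(), reverse=True)
    h = 0
    for rank, count in enumerate(ranked_counts, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Toy data for illustration only; song names are placeholders.
example_tours = [
    [["A", "B", "C"], ["A", "B"]],  # first tour, two shows
    [["A", "C", "D"]],              # second tour, one show
]
example_h = setlist_h_index(example_tours)
```

With the toy data, song A is played three times, B and C twice each, and D once, so two songs have at least two plays while no three songs have at least three.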