How Last.fm Illustrates the Musical World: User Behavior and Relevant User-Generated Content

Ya-Xi Chen, Sebastian Boring, Andreas Butz
Media Informatics, University of Munich
yaxi.chen@ifi.lmu.de, sebastian.boring@ifi.lmu.de, andreas.butz@ifi.lmu.de

Workshop on Visual Interfaces to the Social and Semantic Web (VISSW2010), IUI2010, Feb 7, 2010, Hong Kong, China. Copyright is held by the author/owner(s).

ABSTRACT
Over the last few years, online multimedia exchange platforms have experienced rapid growth. They allow users to share their own content and access that of others in turn, and hence form very large public collections of User-Generated Content. While research mostly looks at photo sharing platforms, such as Flickr, much less is known about online music communities. In this paper we present the results of an observational user study followed by a large-scale online survey, which investigated the behavior of the users of Last.fm, one of the most popular music communities, and the relevant content they generate. Based on the analysis of the results, we present implications for the usage of User-Generated Content in online music communities. We then developed a first prototype based on these implications for improving the semantic understanding of collaborative tags. We believe our study gives insights for developing information visualization and recommender systems for online music communities.

Author Keywords
Online music community, User-Generated Content, user behavior, Last.fm.

ACM Classification Keywords
H5.2. Information interfaces and presentation (e.g., HCI): User Interfaces.

INTRODUCTION
Most of the current research on public multimedia exchange platforms focuses on behavior around photos in online communities, such as searching, tagging and sharing. Much less is known about how people define their musical taste and how User-Generated Content (UGC) helps online music communities to make more sense of music. We believe that an investigation of online music communities could lead to a better understanding of people’s behavior surrounding music in general and bring valuable insights on how to successfully harness the metadata contributed by the users of these music communities.

There are several online music communities. Similar to the artist map proposed by Gulik and Vignoli [7], Musicovery (http://www.musicovery.com/) is an interactive radio station for which the user can define the current mood, time range, desired tempo and genre. Live365 (http://www.live365.com) is a radio network in which the user can generate a personalized radio station. The recommendations are organized and characterized by genre. Similar radio functionalities are also provided by Jamendo (http://www.jamendo.com). Imeem (http://www.imeem.com/) is a social media community offering a variety of media types, such as music, video, photos and blogs.

Last.fm (http://www.last.fm) is one of the largest and most popular online music communities, with a large user group and abundant services. According to Wikipedia (http://en.wikipedia.org/wiki/Last.fm), Last.fm has over 30 million active users spread over 200 countries. As Last.fm claims, they focus on playing the right songs to the right people. Its functionality can be extended through a released API, and a series of applications have already been proposed. However, there is little research focusing on user behavior and relevant UGC in such music communities. To obtain implications for better use of UGC, such as providing personalized recommendations and facilitating the discovery of new music, we chose Last.fm as our experimental platform and conducted a user study based on it.

RELATED WORK
There are studies about users’ behavior with music, for example, searching, sharing and tagging. Some research also focuses on music recommendations. All these studies reveal the nature of our experience with music and help to understand the users’ desires regarding music-related technologies.

Searching
People often do not explicitly search in media collections. Rather, they look for something that satisfies certain (possibly vague) criteria instead of one specific item. Others follow a different strategy by first picking some candidates and then making a final decision among these pre-selections. Vignoli [24] claimed that non-expert users have great difficulty expressing their musical preferences in a formal way, and that they often change their minds during the search process.

Kim et al. [14] investigated people’s perception of music and observed that, both in description and in searching, users tend to combine music with events and emotions. Similar implications were derived in [5] based on the analysis of respective requests posted to a music-related newsgroup.

Collaborative tagging
With the rapid growth of the next-generation Web, many websites allow users to make contributions by tagging digital items. This collaborative tagging has become a fashion on many websites. The user-contributed tags are not only an effective way to facilitate personal organization, but also provide a possibility for users to search for information or discover new things.

A TagCloud (see Figure 3) is a visual presentation of the most popular tags, in which tags are usually displayed in alphabetical order and text attributes, such as font size, weight or color, are used to represent features (e.g., font size for prevalence and color brightness for recency). As a result of collaborative tagging, TagClouds have a more accurate meaning than tags assigned by a single person, and reflect the general interests of a broad demographic [9, 23]. Due to their easy understandability and aesthetic presentation, TagClouds have become a fashion on many websites. However, they still have some intrinsic disadvantages, and many researchers have worked on improving their aesthetic presentation [1, 13, 20] or semantic understanding [8, 15].

Sharing
One important activity around music is sharing, which facilitates social communication and information exchange, but also helps to maintain personal images in front of others. One of the few detailed investigations [2] compared music sharing behavior with offline and online sharing systems such as Napster, and then explored in detail a system named Music Buddy for browsing other people’s music collections. The study showed that music sharing is tightly bound to social activities, and it suggested that music should be shared in a more collaborative and community-related environment. Voida et al. [25] explored practices surrounding the iTunes music sharing functionality and made several suggestions for improvement.

Transparency of recommender systems
Many online music communities, such as Pandora.com, iTunes Genius and Amazon, offer music recommendations, and the mechanisms behind them vary from content analysis to the users’ listening or purchasing patterns.

Transparency is a crucial issue in recommender systems. Herlocker et al. [10] suggested that explanations of recommendations can make the system more understandable and involve the user more in it, and thus improve the user’s satisfaction. In contrast to previous research focusing on the statistical accuracy of the algorithm, Swearingen and Sinha [22] emphasized interface issues from the user perspective. They claimed that users like and feel more confident about transparent recommendations, especially for new items. SIMAC [11] is one of the few existing systems that address the issue of transparency. In SIMAC, six semantic descriptors were designed in order to bridge the semantic gap. The weights of all descriptors were visualized in a radial graph in which the radial distance represents the value of the weight. The user can change a weight by moving the descriptor manually.

USER-GENERATED CONTENT IN LAST.FM
In Last.fm, each user has a personal profile integrated with a library and playlists, charts of listened music, and social networks such as friends and groups. Users can listen to music online, receive recommendations from the system and from other users, and tag all music items. This music-related behavior generates abundant user data, such as personal listening histories, tags and social networks, which serve as the foundation of the Last.fm services for personal charts, system recommendations and tag-based search.

Listening history
The listening history is automatically recorded when the user listens to Last.fm music. It serves as the statistical basis of Last.fm’s main functionalities of charts and system recommendations.

Figure 1. Personal chart for top artists.

Charts are statistical presentations of the listening history. Personal charts are displayed as a list of recently played music, ordered by play count. Figure 1 is an example chart for top artists. Similarly, there are public charts calculated based on all users’ listening histories.

Based on the aggregation of all users’ listening histories, the system recommends similar artists for each artist, as well as neighbors who share a similar musical taste with the user. If the user further browses a neighbor’s profile, the similarity of musical taste between the two users is represented as a bar slider called musical compatibility (see Figure 2).

Figure 2. System recommendation of neighbors.

Tags
Last.fm allows users to tag each track, album and artist with free-form text, which can then be used for tag-based visualizations and search.

Last.fm offers a TagCloud visualization of the top tags generated by users. As shown in Figure 3, most of the popular tags are genre-related.

Figure 3. TagCloud for top tags in Last.fm.

Based on these user-generated tags, the user can conduct tag-based searching, and Last.fm will return a page for the respective tag, in which related tags and the top artists for this tag are displayed. Figure 4 shows the retrieval results for the tag “rock”.

Figure 4. Retrieval results of the tag “rock”.

Social network
The user can add other users as friends and join groups, in which people with common interests gather. Similar to the personal profile, Last.fm generates a profile for each group. A group radio is created based on the overall listening history of the whole group.

Besides system recommendations, the user can also recommend music to other users by sending an internal text message, which is called “sharing” in Last.fm.

INTERVIEW
As already discussed, UGC forms the fundamental basis for Last.fm. In order to gain more insight into the effective use of metadata contributed by the users, the following essential issues need to be explored: the performance of system recommendations based on the users’ listening histories, other useful information which can be extracted from the listening history, the features and benefits of music-related tags, and the user’s social network activities.

In order to answer these questions, we first conducted interviews with Last.fm users.

Participants
We recruited 13 participants in the Last.fm online forum, 3 female and 10 male. Their age ranged from 18 to 26 with an average age of 23 years. Most of the participants were students and all of them had common knowledge about computers and the Internet. The participants were all music amateurs and rated themselves as experienced Last.fm users with an average score of 4.2 (5 for very experienced).

Settings and procedure
During the interview, the participants were equipped with a PC, keyboard and mouse. They could freely browse the Last.fm website and relevant applications, such as the desktop radio. One visualization tool for listening histories was installed beforehand.

First, the participants were asked to fill out a pre-questionnaire about their personal information and general experience with music. Then they took part in an interview about their personal experience with Last.fm, which mainly covered the issues of system recommendation, personal profile, tagging and searching behavior, and social network. Participants could freely browse their personal profiles and other services of Last.fm. On average the user study lasted about 1 hour per participant. It was conducted in English and recorded on video. The Think-Aloud protocol was applied.

The questions were grouped into four categories. To learn about the participant’s general experience with Last.fm, we asked about the services that were considered most useful, the main source for discovering new music and the quality of the system recommendations. Example questions were: “How often do you visit the Last.fm website?”, “Do you also use other desktop or portable applications?”, “Which functionalities do you think are most useful?”, “How do you discover new music?” and “What do you think of the system recommendation of artists and neighbors?”.

In the next step, participants answered questions related to their personal profiles, which helped to understand their musical tastes. Example questions were: “How would you describe your musical taste?”, “Do you think it is hard to express musical taste verbally?”, “How well does your Last.fm library present your musical taste?” and “Do you mind your personal profile being public in Last.fm?”.

Another key issue we explored was the tagging and searching behavior and the relevant user-generated tags. Example questions for searching were: “How often do you search for music in Last.fm?”, “How often do you use tags for searching?” and “What do you think about TagClouds of Last.fm?”. For the tagging behavior, some example questions were: “How often do you tag music in Last.fm?”, “Which kind of tags do you use for tagging?” and “Do you think tagging music is difficult?”.

Since Last.fm offers functionalities for social networking, such as friends and groups, we also discussed those with the participants. Some example questions were: “How many of your Last.fm friends are also friends in your daily life?”, “How do you find new friends and groups?”, “How often do you receive music recommendations from other users?” and “How often do you recommend music to other users?”

Results
Based on the analysis of the questionnaire and the recorded video, we obtained the following results:

Personal music experience
All participants own portable music devices, normally holding more than 500 songs. When asked about the general sources for discovering new music, all of them chose Last.fm as the main online source, other sources being music services such as Napster, Amazon, iTunes and YouTube. 9 out of 13 receive recommendations from friends and only 4 mentioned conventional means, such as CD stores, TV programs or newspapers.

Regarding devices for listening to music, the PC seems to be the dominant device. Most of the participants listen through the PC much longer (4.9 hours/day) than through portable devices (1.8 hours/day), such as an MP3 player or mobile phone. Regarding the listening situations, the four equally mentioned main situations are background music for working, during the commute, social events such as parties, and pure enjoyment.

General experience with Last.fm
Besides frequently visiting the website, participants also use other Last.fm applications. Eight of them are regular users of AudioScrobbler, a plugin for desktop music players which automatically transfers statistics of the user’s listening history to the personal charts in Last.fm. The two participants who own an iPhone or iPod Touch also use the Last.fm mobile applications. Regarding useful functionalities in Last.fm, the top three are AudioScrobbler, personal charts and the system recommendation of similar artists and neighbors.

Since the system recommendations and the discovery of new music are remarkably important for the participants, we discussed these two issues in more detail. All of the participants mainly discover new music through the system recommendation of similar artists. The other means are recommendations by social contacts, such as friends or groups, and browsing neighbors’ profiles. Only one participant uses the search functionality to find music of a certain genre. Generally, all participants appreciated the system recommendations and scored the recommendation of similar artists (M=4.33, SD=0.65) higher than that of neighbors (M=3.66, SD=0.49). There were two main reasons for the lower score of the neighbor recommendation. First, besides a list of neighbors with the relevant shared artists, the participants would have liked an additional detailed description of the neighbors’ musical preferences. Second, the current recommendation is based on the latest weekly listening history, so the user might get different neighbors if the weekly interests change. Although this reflects the continuously changing nature of musical taste, some participants still expressed the wish to get neighbors with an overall similar taste.

User 4: The biggest part of my music is funk, others are electronic and classical. However, I only get funk neighbors.

User 13: My girlfriend and I intentionally listen to similar music but our weekly musical compatibility is unstable, maybe because of the different listening sequences.

Personal profile in Last.fm
When asked to describe their personal musical taste in free text, all participants came up with short descriptions, most of which were genre-related. Most of the participants have a relatively stable preference. When asked how hard it was to express musical taste verbally, 8 out of 13 gave a score higher than 3 (5 for very difficult).

Although the participants were not concerned about their profile being public, some of them still applied different strategies to maintain their personal images. For example, one participant has two players, one for free personal usage with his whole collection, the other with representative music and a plugged-in scrobbler which automatically transfers the listening history of these songs to his Last.fm personal charts.

Since the personal listening history is essential for both the user and the system, some applications have been developed for the visualization of personal listening histories. Most of them use a flow metaphor to represent how the personal musical taste changes over time. Extra Stats (http://build.last.fm/item/34) is an application which visualizes the top artists as colored waves on a timeline (see Figure 5). Each wave represents one artist and its width represents the play count of this artist in each time period. Other similar visualizations can be found in LastGraph (lastgraph3.aeracode.org) and Last.fm Spiral (http://build.last.fm/item/377). During the interview, the participants were asked to observe the visualization results of their own listening history and that of another participant. A consistent pattern appeared in all the visualization results: there were always bursts when the user found new artists and listened to them very often in a short time period. After a while, these discoveries fell into the normal flows.

Figure 5. Extra Stats: flow visualization of the personal listening history.

All participants thought the visualization was useful and they also learnt additional information from it. For example, they noticed break periods in their usage of Last.fm, and also gained new insights into their own listening behavior and others’ musical taste:

• Recall of relevant social activities:
User 1: (pointing at one peak) I just returned from vacation and I met a girl there. I listened a lot to the music she liked.

• Re-discovery of forgotten music:
User 3: There was a band I once liked very much but they never came again. Maybe I should listen to them again.

• Understanding of personal listening behavior:
User 8: Drops down in August, maybe I was not so often at home in summer.

• Understanding of others’ musical taste:
User 5: He likes rock and pop music. I don’t think he sticks to any specific artists.

These comments show that Last.fm helps to discover new music and that the listening history contains rich information. It also works as a means of self-reflection and helps to understand others’ musical taste.

Searching and Tagging
Most of the participants use the search functionality frequently, with the exception of one, who finds music by browsing the charts for popular artists. Compared with standard keywords such as the name of the artist, album or song, tags are used less for searching, and the scores for their usage frequency were rather low (M=2.18, SD=1.08, on a 5-point Likert scale where 1 stands for “never”). The top three types of tags used for searching are genre, mood and artist biography. The aspects covered by tags are diverse, but currently the user cannot combine multiple tags in Last.fm for a specific search.

User 3: It is a pity that I cannot use more than 1 tag as keywords, for example, to find a tiny part between punk and indie electronic.

All participants felt that overly general tags might cause the user to get lost among abundant results and thus find nothing specific.

User 5: Tags are too subjective and heavily depend on the personal musical taste. For example, for your favorite song, others might think it is awful. It is not suitable to describe the essence of music.

User 12: “seen live” doesn’t help me at all. It’s like asking for the way to the Eiffel Tower and someone tells you “in Europe”.

When asked to comment on the top tags shown in Figure 3, one prominent comment concerned redundancy, for example “favorite” and “favourite”. Since music is difficult to express verbally, and there is no standard categorization of genres, people have different definitions of genres and even different understandings of the same genre, which leads to remarkable redundancy and even errors in genre-related tags.

User 4: I noticed that some people think IDM (Intelligent Dance Music) and electronic are the same so they always appear in a pair. But actually they are different.

The participants do not tag very often and the average tagging frequency is 1.09 (SD=0.83). Similar to the descriptions of personal musical taste and the tags used for searching, most of the tags they generated were related to genre, mood and artist biography. Some participants also use personalized tags for quick relocation, such as “listen again” and “Sunday morning”. The majority of participants thought that tagging music is hard.

User 1: Talking about music is just like dancing with a poem. It is hard to describe music with words.

Social network in Last.fm
Besides music, Last.fm also offers functionalities for social networking, such as friends and groups. Most of the participants use Last.fm only for music, since they already have other social networks. Whether users are added as friends actively or passively is determined by the social contact with them. Users with whom the participants have no daily contact are mostly added on their request. The participants’ friend lists showed that most of their Last.fm friends are real friends.

Compared with friends, group-related activity is less popular. Generally the themes of the groups are related to a location (affiliation, city, country) or genre. Which group to join and how to find a suitable group is determined by personal music experience or influenced by friends and by geographic and cultural factors.

User 5: Groups are very useful because my musical taste is special and in daily life I don’t know too many people sharing the same taste.

Although Last.fm offers functionality for recommending music by sending a message, it is seldom used and participants rarely recommend music explicitly. Only 2 participants had ever received recommendations from others and only 2 occasionally send recommendations.

ONLINE SURVEY
In order to verify the results of the interview, we conducted an online survey in English which lasted for two months. The questions asked in the survey were consistent with the interview and mainly covered demographic information, general experience with Last.fm, system recommendations, searching and tagging behavior, and social network.

In total we received 228 complete questionnaires, 93 from female and 133 from male participants (two gender identifiers were left blank). Their age ranged from 16 to 36 with an average age of 22 years. Most of the participants were students and employees from North America and Europe. Participants rated themselves as experienced Last.fm users with an average score of 3.8 (5 for very experienced).

Results
In general, the results of the online survey are consistent with those derived from the interview.

Personal music experience
Among the general sources for discovering new music, online sources were very popular (M=4.47, SD=0.93, on a 5-point Likert scale where 1 stands for “daily”) and the most often mentioned websites were Last.fm, iTunes and YouTube. The other two main sources were recommendations from others (M=3.69, SD=1.08) and traditional sources (M=2.65, SD=1.18).

The most often used devices for playing music were the PC (M=4.75, SD=0.55), portable digital players (M=3.96, SD=1.33) and mobile phones (M=2.23, SD=1.44). The main listening situations were consistent with the answers in the interviews.

General experience with Last.fm
Besides the Last.fm website, other frequently used applications were AudioScrobbler (M=4.18, SD=1.45), the desktop radio station (M=1.99, SD=1.25) and MobileScrobbler (M=1.65, SD=1.31).

The main means of discovering new music were system recommendations (M=3.69, SD=1.30), browsing friends’ profiles (M=3.67, SD=1.28), recommendations from friends (M=3.20, SD=1.46), browsing neighbors’ profiles (M=2.96, SD=1.49) and recommendations from groups (M=2.53, SD=1.43). The system recommendations were appreciated, and the recommendation of similar artists (M=4.11, SD=1.07) scored higher than that of neighbors (M=3.28, SD=1.18).

Personal profile in Last.fm
Participants believed that their libraries represented their tastes well (M=4.25, SD=0.75). For the description of personal taste, 173 out of 228 participants proposed genre-related texts. The general attitude toward the public nature of the personal profile was rather neutral (M=2.95, SD=1.32). Concerning the listening behavior, they always play music from their own library (M=3.6, SD=1.30) and a repetitive listening pattern was revealed: they tend to repeatedly listen to certain artists, albums and songs.

The visualization of the personal listening history in Extra Stats was considered useful for understanding taste changes over time, re-discovering artists and reflecting on listening patterns.

Searching and Tagging
Participants look for music in Last.fm very frequently (M=3.99, SD=1.15, on a 5-point Likert scale where 1 stands for “daily”), but they are more likely to browse with no clear goal than to search for something specific. Unlike the participants in the interview, they conducted keyword-based search less often (M=1.80, SD=1.10). Participants mostly search for music-related information such as artist, album and song (M=4.04, SD=1.31), and less for social aspects such as group, user or event.

The Last.fm TagCloud was considered useful for gaining an overall impression of the most popular items, but similar linguistic problems were also noticed. The majority of participants seldom tag. They mainly tag music in their own libraries and most of the tags they generated were genre-related. Unlike the participants in the interview, they consider tagging rather easy (M=2.22, SD=0.09, 5 for very difficult). The top motivations for tagging were facilitating browsing and searching, facilitating personal organization, and helping others to understand music.

Social network in Last.fm
Last.fm was considered more of a music website (M=4.59, SD=0.69, 5 for highly agree) than a social network (M=3.44, SD=1.14), and the most popular social networks among the participants were Facebook, MySpace and Twitter. The number of friends varied from 0 to 322 with an average of 32 (SD=41.40). Unlike in the interview, far fewer of the Last.fm friends were also known in daily life (M=6, SD=9.05). Most of the friends were added on their request. The number of groups also varied a lot, from 0 to 60 (M=28, SD=66.50). Compared with friends, group-related activities were less popular. The popular group themes were genre, artist, geo-location and events. The functionality of recommending music to others was used less.

IMPLICATIONS
Based on the results of the interview and online survey, some implications about the user’s behavior surrounding online music and relevant UGC were revealed:

General experience with music
The PC dominates as the main music device, while portable devices show noticeable potential when people are “on the way”, and thus relevant applications should receive more attention. A smart music recommendation system should recognize the context and choose and switch songs smoothly, for example, as Cunningham et al. mentioned in [4], by shuffling by genre, which might be more appealing than the existing random shuffle mode.

System recommendation
The current system recommendation of similar artists is generally appealing and could be further improved, for example, by taking the recency factor into account.

Last.fm recommendations of neighbors are based on the latest weekly charts. When the user has an unstable musical taste, especially when discovering new bursts and sticking to them for a while, the neighbors keep changing. Although the system offers a list of neighbors with a high musical compatibility score, a more detailed explanation is expected, which would also help to build self-reflection and to understand others’ musical taste. When the user wants a neighbor recommendation based on his or her overall musical taste, the system should offer a more flexible and smart recommendation scheme, in which the user’s requirements could be dynamically integrated. The system could, for example, let the user choose a time period or select some of the neighbors as examples, which would help to discover new matching neighbors.

Listening history
The personal listening history is the key element of Last.fm; it forms the basis of the charts and system recommendations. As the title of [4] puts it, music is more of an art than science, which illustrates that musical taste is hard to express efficiently by purely statistical methods. Compared with statistical charts, a graphical visualization of the listening history offers a better understanding of how the musical taste changed over time. Users can get abundant information from the visualization, which helps to discover personal listening behavior, re-discover forgotten music and understand others’ musical tastes. Since some users might have a long history, the visualization should offer a better overview while helping to construct a complete mental model conveniently. Although existing visualization tools receive positive feedback, more interactions should be introduced to enhance understandability. Most of the current tools only target single users, and it might be appealing to offer users an intuitive way to browse and compare multiple users’ listening histories, which in turn could improve system transparency.

Tags and relevant tagging behavior
People do not tag music very often and they tag for different reasons. Some people take music very seriously and want others to know more about their favorite music through tags. Some users annotate music with special tags for personal use. Others simply make a contribution or offer knowledge by tagging.

In Last.fm, most of the top tags are related to genre, mood or artist biography. There is less chance for users to be ‘educated’, since the personal understanding of genre and emotion is subjective and, depending on their musical experience, users might come up with different tags for the same music. Therefore, searching by tags is not common in Last.fm, because freely generated tags are normally too general to help users narrow down the results. Neater and better organized tags with less redundancy would be more useful, and the option of combining multiple tags in the search process might help the user to steer the search direction.

Social network
Most of the participants use Last.fm only for music, and the social activities are mainly passive, such as receiving recommendations from others, adding friends or joining groups. Active music recommendation is not popular in Last.fm, even though the system offers a sharing functionality. Although the personal profile being public is not a big issue, some users still want to maintain personal images, for example, by keeping the Last.fm library or charts in a representative and neat way.

EXPERIMENT BASED ON IMPLICATIONS
Based on the implications derived from our user study, applications for information visualization and recommender systems can be built: for example, illustrating worldwide musical trends, improving the semantic understanding of tags, and facilitating the discovery of new music and of people sharing similar tastes.

As the results of the user study showed, TagClouds contain redundancies and errors caused by freely generated tags and cannot support a semantic understanding of the relationships among tags. Therefore, we developed an aggregation of TagClouds named TagClusters (see Figure 6).

The hierarchical structure and the positions of tags are derived from a semantic analysis. Text analysis is first applied to produce a semantic clustering of similar tags: after removal of separators such as “_” and “&”, the Porter algorithm [19] is applied to detect the stem of each tag. Tags with the same stem words are clustered in the same group. For example, metal-related tags such as “heavy metal”, “gothic metal” and “melodic death metal” are grouped into one metal cluster. After the semantic grouping of similar tags into genre clusters, the hierarchical structure within each cluster is determined based on tag length. This relies on a characteristic feature of genre-related tags: a tag on a lower semantic level always contains the tag on the higher level, so the length of a tag grows with the depth of its semantic level, as with “death metal” and “brutal death metal”.

The location of each tag is determined by the semantic similarity (see Equation 1). It equals the ratio between the number of resources in which a pair of tags A and B co-occur and the number of resources in which either of these two tags appears.

$$\mathrm{Sim}(A, B) = \frac{|A \cap B|}{|A \cup B|} \qquad (1)$$

After this semantic analysis, semantically similar tags are clustered into groups and their visual distance represents their semantic similarity; thus the visualization offers a better hierarchical understanding of collaborative tags.

[...] both systems. The analysis of both quantitative and qualitative data indicated that TagClusters performed better overall and has advantages in supporting semantic understanding, impression formation and matching. In our future work, we will explore using TagClusters to support tag recommendation and multiple-tag-based searching.

CONCLUSION AND FUTURE WORK
In this paper we conducted a preliminary user study with Last.fm, an online music community. We investigated key issues about User-Generated Content, such as listening history, tags and social network, based on which Last.fm offers services such as charts and system recommendations of similar artists and neighbors. Based on an analysis of relevant user behavior and the generated data, implications for the usage of UGC were derived. We developed our first prototype for improving the semantic understanding of tags. We believe our user study could bring insights for better usage of UGC and help users to gain a better understanding of the Last.fm musical world. In our future work, we plan to develop prototypes based on the derived implications, mainly in the realm of information visualization and recommender systems. Based on the accumulated experience with the prototype development, we expect to obtain general design guidelines for UGC in online music communities.

ACKNOWLEDGMENTS
This research was funded by the China Scholarship Council (CSC) and by the German state of Bavaria. We would like to thank the participants of our study.

REFERENCES
1. Ahlberg C., Shneiderman B.
Visual information seek- ing: tight coupling of dynamic query filters with star- field displays. In Proc. CHI 1994, ACM Press (1994), 313-317. 2. Brown B., Geelhoed E., Sellen A. Music sharing as a computer supported collaborative application. In Proc. ECSCW 2001, ACM Press (2001), 179-198. 3. Chen Y.-X., Santamaria R., Butz A., Theron R. Tag- Clusters: Semantic Aggregation of Collaborative Tags beyond TagClouds. In Proc. SG 2009, Springer Press (2009), 56-67. Figure 6. TagCluster: Aggregation of TagClouds [3]. 4. Cunningham S., Bainbridge D., Falconer A. ‘More of an art than a science’: supporting the creation of playlists A comparative evaluation was conducted with TagClouds and mixes. In Proc. ISMIR 2006. and TagClusters based on the same Last.fm tag collection. 5. Downie J. S., Cunningham S. Toward a theory of music 12 participants were recruited and were required to conduct information retrieval system design implications. In 6 tasks (each task is consisted of two similar sub-tasks): Proc. ISMIR 2002. locating one single item, sorting tags by popularity, group- ing similar tags, driving group structure, finding relation 6. Graham, A., Garcia-Molina, H., Paepcke, A., Winograd, between tags and judging their similarity. The complete T.: Time as essence for photo browsing through per- time and the answer precision were measured. After com- sonal digital libraries. In Proc. JCDL 2002, ACM Press pleted each task, the participants were asked to score the (2002), 326-335. easiness of each task and the usefulness of both systems. 7. Gulik, R van, Vignoli, F. Visual Playlist Generation on After completing all the tasks, the participants filled out a the Artist Map. In Proc. ISMIR 2005. post-questionnaire which concerns the overall impression of 8. Hassan M. Y., Herrero S. V. Improving tag-clouds as 17. Pampalk E., Goto M. MusicSun: a new approach to art- visual Music Retrieval interfaces. In Proc. INSCIT ist recommendation. In Proc. ISMIR 2007. 2006. 18. 
Platt, J., Czerwinski, M., Field, B.: Phototoc: automatic 9. Hearst M. A. What’s up with Tag Clouds? clustering for browsing personal photographs. Microsoft http://perceptualedge.com/articles/guests/whats_up_with Research Technical Report MSR-TR-2002-17, 2002. _tag_clouds.pdf. Accessed December 30, 2008. 19. Porter, M.F.: An algorithm for suffix stripping. Program 10. Herlocker, L., Konstan J., Riedl J. Explaining Collabo- 14(3), 130–137 (1980). rative Filtering Recommendations. In Proc. CSCW 20. Seifert C., Kump B. Kienreich W. On the beauty and 2000, ACM Press (2000), 241-250. usability of tag clouds. In Proc. IV 2008, IEEE Press 11. Herrera P., Bello J., Widmer G., Sandler M., Celma O, (2008), 17-25. Vignoli F., Pampalk E., Cano P., Pauws S., Serra X. 21. Shneiderman, B., Kang, H. Direct annotation: a drag- SIMAC: semantic interaction with music audio contents. and-drop strategy for labeling photos. In Proc. IV 2000, Journal of Intelligent Information Systems, 2005. IEEE Press (2000), 88-98. 12. Huynh, D., Drucker, S., Baudisch, P., Wong, C.: Time 22. Swearingen K., Sinha R. Beyond algorithms: An HCI quilt: Scaling up zomable photo browsers for large, un- perspective on recommender systems. In ACM SIGIR structured photo collections. Ext. Abstracts CHI 2005, 2001 Workshop on Recommender Systems, ACM Press ACM Press (2005), 1937-1940. (2001), 1-11. 13. Kaser O., Lemire D. Tag-cloud drawing: algorithms for 23. Viégas F. B., Wattenberg M. Tag Clouds and the Case cloud visualization. In Proc. WWW 2007, ACM Press for Vernacular Visualization. Interactions. 15(4), 49-52, (2007). 2008. 14. Kim J-Y., Belkin J. Categories of Music Description and 24. Vignoli F. Digital Music Interaction concepts: a user Search Terms and Phrases Used by Non-Music Experts. study. In Proc. ISMIR 2004. In Proc. ISMIR 2002. 25. Voida, A. Grinter, R. E. Ducheneaut, N. Edwards, W. K. 15. Li R., Bao S., Yu Y., Fei B., Su Z. Towards effective Newman, M. W. 
Listening In: Practices Surrounding browsing of large scale social annotations. In Proc. iTunes Music Sharing. In Proc. CHI 2005, ACM Press WWW 2007, ACM Press (2007), 943-952. (2005), 191-200. 16. Pampalk E., Goto M. MusicRainbow: a new user inter- face to discover artists using audio-based similarity and web-based labeling. In Proc. ISMIR 2006. 9
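
The tag-processing steps described under EXPERIMENT BASED ON IMPLICATIONS (separator removal, stemming, grouping tags that share a stem, a length-based hierarchy inside each cluster, and the similarity of Equation 1) can be illustrated with a minimal Python sketch. It is not the TagClusters implementation. Several parts are assumptions made only for illustration: `crude_stem` is a simplified stand-in for the Porter algorithm [19], separators are replaced by spaces, tags are grouped by the stem of their last word, and a tag's parent is taken to be the longest shorter tag that forms its ending. All function names are hypothetical.

```python
import re
from collections import defaultdict


def normalize(tag):
    """Lower-case a tag and replace the separators "_" and "&" with spaces."""
    return " ".join(re.sub(r"[_&]+", " ", tag.lower()).split())


def crude_stem(word):
    """Simplified stand-in for the Porter stemmer: strips a few suffixes only."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def cluster_tags(tags):
    """Group tags by the stem of their last word (assumed genre head word)."""
    clusters = defaultdict(set)
    for tag in tags:
        words = normalize(tag).split()
        if words:
            clusters[crude_stem(words[-1])].add(" ".join(words))
    return dict(clusters)


def build_hierarchy(cluster):
    """Map each tag to its parent: the longest shorter tag that ends it."""
    parents = {}
    for tag in cluster:
        words = tag.split()
        candidates = [
            other for other in cluster
            if len(other.split()) < len(words)
            and words[-len(other.split()):] == other.split()
        ]
        parents[tag] = max(candidates, key=len, default=None)
    return parents


def jaccard(resources_a, resources_b):
    """Equation 1: shared resources divided by all resources of either tag."""
    union = resources_a | resources_b
    return len(resources_a & resources_b) / len(union) if union else 0.0


if __name__ == "__main__":
    tags = ["heavy_metal", "Death Metal", "brutal death metal", "metal", "indie rock"]
    clusters = cluster_tags(tags)
    print(build_hierarchy(clusters["metal"]))
    # Two tags sharing 2 of 4 distinct songs have similarity 0.5.
    print(jaccard({"song1", "song2", "song3"}, {"song2", "song3", "song4"}))
```

In this sketch “brutal death metal” is placed under “death metal”, which in turn is placed under “metal”, mirroring the containment property of genre tags described in the text. A real system would swap in a full Porter stemmer and compute `jaccard` over the actual sets of tagged resources.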