Music Composition as a Lens for Understanding Human-AI Collaboration

Eric Tron Gianet¹, Luigi Di Caro¹ and Amon Rapp¹

¹ University of Turin, Computer Science Department, Torino, Italy

Abstract
As generative artificial intelligence (GenAI) systems gain human-like capabilities in creative tasks, they seem to blur the line between machines and users, prompting questions about how to design systems where AI and humans collaborate. Music composition with AI may offer a lens to explore the nuances of human-AI collaboration. We review recent literature on music generation with AI, highlighting key challenges like the need for user control and context awareness, and noting a potential shift in the user’s role towards curation or co-production when using AI tools. However, much of the existing research evaluates the impact of current AI tools rather than engaging in fieldwork to investigate music composition “in practice” within specific socio-cultural contexts. We then propose an ethnographic study to understand music composition as a situated practice, considering composers’ personal motivations, artistic sensibilities, and the broader socio-cultural context. Preliminary findings highlight the importance of creative intentionality and meaning-making in driving compositional choices. Furthermore, music creation often involves collaboration between various human actors, raising questions about whether AI should facilitate this already present collaboration or disrupt existing dynamics.

Keywords
Human-AI Collaboration, Music Composition, Generative AI, Human-AI Co-creation

1. Introduction
The rapid evolution of Artificial Intelligence (AI), and particularly GenAI, is reshaping how humans interact with technology. We are moving beyond more traditional interaction models towards scenarios of collaboration between humans and AI, which raises critical questions about how we ought to design the interaction with systems that promise to leverage the strengths of both.
GenAI systems can not only perform classification tasks but also create artifacts like music, images, and text, blurring the line between traditional tools and active collaborators with creative abilities. In the broad research area of Human-Centered AI (HCAI) [1, 2, 3], a perspective has emerged that seeks to leverage the strengths of both humans and AI, creating synergistic systems that surpass their individual capabilities [3, 4, 5]. This concept of human-AI collaboration arguably represents a more human-centered approach compared to human-in-the-loop models, as it makes the “best use of both human and AI capabilities, rather than the human simply being called upon to do what the AI cannot yet manage in an AI led project” [3]. However, this emphasis on AI agency, which Sarkar [6] terms an “agentistic turn”, necessitates critical consideration. While it aligns with Bruno Latour’s notion of agency [7], which extends agency across a network of human and non-human actors, showing how intricate the relationships between humans and their surroundings are, it could obscure the vast amount of “ghost work” [8] that powers AI, frequently conducted in the Global South at low wages (e.g., by data labelers) [6].

Proceedings of the 1st International Workshop on Designing and Building Hybrid Human–AI Systems (SYNERGY 2024), Arenzano (Genoa), Italy, June 03, 2024.
Email: erictrngnt@gmail.com (E. Tron Gianet); luigi.dicaro@unito.it (L. Di Caro); amon.rapp@unito.it (A. Rapp)
ORCID: 0009-0009-3092-7089 (E. Tron Gianet); 0000-0002-7570-637X (L. Di Caro); 0000-0003-3855-9961 (A. Rapp)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
The concept of “collaboration” in human-AI interaction is certainly multifaceted, but, while some interactions with non-generative AI might exhibit collaborative aspects already [9], the emergence of GenAI presents unique challenges. These systems, capable of human-like outputs, blur the lines between tools and collaborators. This necessitates a deeper understanding of how human roles are redefined and how decision-making, creative processes, and information handling are reshaped. To explore this further, we focus on a specific domain – music composition with AI – as an illustrative case study, to understand how collaboration and co-creation occur in practice, taking into account not only the user’s objectives, needs, and motivations, but also the social and cultural context within which humans and GenAI systems may collaborate to achieve situated goals. The collaborative process of composing music raises intriguing questions about ownership, control, and intentions, all of which are central themes in the broader discussion of human-AI collaboration. We will begin by reviewing recent literature on human-AI music composition and then propose a study investigating the situated practices [10] of music composition.

2. AI in Music Composition
The interest in computer-based music composition has steadily increased since the 1980s, with various techniques of Algorithmic Composition like Markov Models, Generative Grammars, and Genetic Algorithms. Later, as Neural Networks became more prominent, the advancements in Deep Learning led to the adoption of established architectures from fields like Computer Vision and Natural Language Processing in music generation as well [11].
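To make one of the Algorithmic Composition techniques named above more concrete, the following is a minimal sketch of a first-order Markov model for melody generation. It is an illustration only and is not taken from any of the cited works: the toy melody, the note names, and the function names `train_markov` and `generate` are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def train_markov(melody):
    """Record, for each note, every note that follows it in the training melody."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return dict(transitions)


def generate(transitions, start, length, seed=None):
    """Walk the transition table to produce a phrase of `length` notes."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    note = start
    phrase = [note]
    while len(phrase) < length:
        candidates = transitions.get(note)
        if candidates:
            # Repeated entries in the list make frequent transitions more likely.
            note = rng.choice(candidates)
        else:
            # Dead end (note never seen with a successor): restart from `start`.
            note = start
        phrase.append(note)
    return phrase


# Hypothetical toy training melody (assumption, for illustration only).
TOY_MELODY = ["C4", "D4", "E4", "C4", "E4", "G4", "E4", "D4", "C4"]

model = train_markov(TOY_MELODY)
phrase = generate(model, start="C4", length=8, seed=0)
print(phrase)
```

Such a model only reproduces local note-to-note statistics of its training material; it has no notion of the composer's intention or context, which is part of what the studies discussed next examine.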
While research on interactive musical systems had already highlighted the importance of applying user-centered approaches to support composers’ processes of creation, exploration, and learning [12, 13, 14], more recent studies have delved into the specific challenges and strategies of co-creating music with AI. Newman et al. [15] investigated through interviews how current AI tools concretely influence musical creativity. They then proposed a new model for developing ethical and productive collaborative AI tools for music. This model emphasizes the importance of clearly defined roles for AI and pays attention to how control is distributed. Their research reveals that users view positively those AI use cases in which they can maintain control, agency, intention, and choice throughout iterative cycles of generation and evaluation [15]. Huang et al. [16] studied the challenges and strategies of co-creating music with AI through a survey conducted during an AI song contest. The results highlight the importance of context awareness and user control, advising that future AI systems be designed to adapt to composers’ existing practices rather than forcing new AI-determined workflows. In another study investigating the use of “steering tools” for collaborative music creation, Louie et al. [17] identified two challenges in using GenAI: information overload and non-deterministic output. Their findings suggest that steering tools can enhance the user’s sense of control, trust, and understanding of the AI system, resulting in a greater feeling of involvement in the creative process. The authors also comment that users often have pre-existing mental models about music composition and use them for tackling problems. These preconceptions should be considered so that both the AI models and the interfaces can be designed to be more intuitive, requiring less cognitive effort, and ultimately increasing the user’s sense of agency. Louie et al.
[17] also argue that the AI’s role should adapt to the user’s needs and the creative context: while ceding control is welcomed during exploratory phases, where the user is in search of unexpected inspiration, maintaining control over specific details becomes critical during production. Context thus plays a crucial role in shaping the dynamics of human-AI collaboration, influencing how control and agency are perceived. Suh et al. [18] explored how AI systems can act as a “social glue” to support human-human collaboration in compositional practices. Their findings suggest that AI can facilitate the exchange of ideas and group cohesion, potentially reducing tensions typically associated with collaboration. They advocate for an intentional design of AI to further strengthen social collaboration. However, while AI can act as a support system, they also observed a potential shift in roles: participants reported feeling more like curators or co-producers, focusing on evaluating AI-generated material rather than actively developing ideas, leading to a weaker sense of creative involvement. This observation aligns with the findings of Civit et al. [19], who noted that “the composer became more of an arranger of different melodies”, comparing their role to that of a producer managing misbehaving musicians. Although they, by contrast, view this shift as a “very creative, fruitful process”, it still highlights the need for future AI systems to be adaptive to the creative context, user needs, and the composer’s specific intentions. That said, despite the growing interest in human-AI collaboration for music composition, current research seems to have limited scope: much of the focus is on evaluating the impact of existing generative systems, exploring strategies for composers to navigate their challenges, and integrating steering tools into current interfaces.
Efforts to bridge the gap between AI music generation and the social and cultural complexities of music composition often lack engagement with field studies on compositional practices. Unlike humans, who are influenced by personal and social motivations and cultural context, AI systems currently operate solely on algorithms and predictive models, potentially limiting their effectiveness and failing to fully capture the nuances of human creativity [20]. The existing research gap highlights the importance of investigating current practices adopted by music composers, as well as their personal motivations, artistic sensibilities, the broader cultural and social context that influences their work, and the specific characteristics of various musical genres. This knowledge will be crucial for designing human-AI systems and creative workflows that effectively complement human strengths and intentions.

3. Music Composition as a Situated Practice
The existing literature emphasizes the need to better understand the complexities of music composition in order to design human-AI collaborative music systems that account for both the individual and the socio-cultural context. While efforts like that of Hernandez-Olivan and Beltrán [11] to create generalized models of music composition are valuable for highlighting core principles, such rigid frameworks can hardly capture the dynamic and diverse nature of music creation, which is constantly evolving, shaped by genre, stylistic trends, personal choices, improvisation, and the unique socio-cultural setting where musicians operate. To address this complexity, we propose an ethnographic approach that foregrounds the situated nature of music composition. This approach will delve into both the composers’ personal motivations (e.g., creative aspirations and career goals) and the socio-cultural context that influences their work.
Ethnography allows us to explore these aspects of music creation and the lived experiences of composers. It has already been applied and discussed as a method for AI research [21, 22, 23, 24, 25], feeding into a broader discussion about the need to integrate the social sciences into AI research. Such integration would temper the exclusive reliance on quantitative methods, which lack a socio-technical perspective and whose uncritical, positivist use often leads to ignoring the context and causes that produce a given outcome and the ways in which it occurs [26, 21, 27]. By employing ethnography, we look at users not simply as passive recipients of technology, but as active agents who shape its context, meanings, and consequences [28]. This approach emphasizes the context-dependent nature of music composition, laying the groundwork for designing synergistic human-AI systems that are sensitive to the nuances of human creativity and to the specific settings where music is created. Our research aims to answer the following provisional questions:
• Can the use of Generative AI systems actually be considered a collaboration?
• How does the specific context of music creation influence decision-making, workflows, and creative choices when working with AI?
• How might collaboration with Generative AI systems redefine the roles of composers in the creative process of composition?
• How do composers and AI negotiate creative control and authorship within this collaboration?
To answer these questions, we propose an ethnographic approach combining semi-structured interviews and participant observation.
We will interview 18 musicians with experience in composing for diverse genres, exploring topics like:
• Motivations, aspirations, creative sensibilities and personal workflow
• The role of context in their music creation process
• Experiences with AI tools and their perception of AI’s role
• How users perceive their own role evolving in collaboration with AI
Following the interviews, we will conduct at least 60 hours of participant observation, to immerse ourselves in the composers’ practice. In summary, we aim to shed light on the situated practice of composing music and on how the broader context influences the collaborative process. We do this to inform the design of human-AI collaborative systems that support and empower, not replace, musicians in their creative practices. Additionally, this study can offer insights into human-AI collaboration beyond music, helping to uncover design patterns for systems that are synergistic with human capabilities and responsive to the specific context in which they are used.

3.1. Preliminary findings
Our analysis of initial interviews reveals some preliminary findings:
(a) Intentionality Shapes and Gives Meaning to Music: Composers’ creative intentions significantly impact their approach to composition. For example, whether they start with a melody, harmony, or specific sound (timbre) often depends on what they want to convey. While the existing literature acknowledges the link between intention, control, and creative agency, our study highlights the specific link between intention and a deeper meaning-making process, where composers strive to construct a “coherent discourse” through their music. How could we design human-AI collaborative systems that support and adapt to user intentions?
(b) Music is already collaborative: Music composition and production are often already collaborative processes.
From bandmates to collaborators, clients, and sound engineers, various stakeholders contribute to and have an interest in the final product. This raises the question of whether AI systems should be designed to enhance these existing human-human interactions, or whether they themselves should become an additional collaborator within a system in which creative control is already dynamically distributed and negotiated.

More findings will be shared during the workshop.

References
[1] W. Xu, Toward human-centered AI: A perspective from human-computer interaction, Interactions 26 (2019) 42–46. doi:10.1145/3328485.
[2] B. Shneiderman, Human-Centered AI, Oxford University Press, Oxford, 2022.
[3] T. Capel, M. Brereton, What is Human-Centered about Human-Centered AI? A Map of the Research Landscape, in: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, CHI ’23, Association for Computing Machinery, Hamburg, Germany, 2023, p. 23. doi:10.1145/3544548.3580959.
[4] L. G. Terveen, Overview of human-computer collaboration, Knowledge-Based Systems 8 (1995) 67–81. doi:10.1016/0950-7051(95)98369-H.
[5] Z. Wu, D. Ji, K. Yu, X. Zeng, D. Wu, M. Shidujaman, AI Creativity and the Human-AI Co-creation Model, in: M. Kurosu (Ed.), Human-Computer Interaction. Theory, Methods and Tools, volume 12762 of Lecture Notes in Computer Science, Springer International Publishing, Cham, 2021, pp. 171–190. doi:10.1007/978-3-030-78462-1_13.
[6] A. Sarkar, Enough With “Human-AI Collaboration”, in: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, CHI EA ’23, Association for Computing Machinery, Hamburg, Germany, 2023, p. 8. doi:10.1145/3544549.3582735.
[7] B. Latour, Reassembling the Social: An Introduction to Actor-Network-Theory, Clarendon Lectures in Management Studies, Oxford University Press, Oxford, 2005.
[8] M. L. Gray, S.
Suri, Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass, Houghton Mifflin Harcourt, Boston, 2019.
[9] A. Rapp, A. Boldi, L. Curti, A. Perrucci, R. Simeoni, Collaborating with a Text-Based Chatbot: An Exploration of Real-World Collaboration Strategies Enacted during Human-Chatbot Interactions, in: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, CHI ’23, Association for Computing Machinery, New York, NY, USA, 2023, pp. 1–17. doi:10.1145/3544548.3580995.
[10] L. A. Suchman, Plans and Situated Actions: The Problem of Human-Machine Communication, volume 103, Cambridge University Press, USA, 1987.
[11] C. Hernandez-Olivan, J. R. Beltrán, Music Composition with Deep Learning: A Review, in: A. Biswas, E. Wennekes, A. Wieczorkowska, R. H. Laskar (Eds.), Advances in Speech and Music Technology: Computational Aspects and Applications, Signals and Communication Technology, Springer International Publishing, Cham, 2023, pp. 25–50. doi:10.1007/978-3-031-18444-4_2.
[12] R. Fiebrink, D. Trueman, C. Britt, M. Nagai, K. Kaczmarek, M. Early, MR. Daniel, A. Hege, P. Cook, Toward Understanding Human-Computer Interaction In Composing The Instrument, in: Proceedings of the International Computer Music Conference, International Computer Music Association, New York, 2010, pp. 135–142.
[13] Y. Wu, N. Bryan-Kinns, Supporting Non-Musicians? Creative Engagement with Musical Interfaces, in: Proceedings of the 2017 ACM SIGCHI Conference on Creativity and Cognition, C&C ’17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 275–286. doi:10.1145/3059454.3059457.
[14] H. Scurto, F. Bevilacqua, Appropriating Music Computing Practices Through Human-AI Collaboration, in: Journées d’Informatique Musicale (JIM 2018), Amiens, France, 2018.
[15] M. Newman, L. Morris, J. H. Lee, Human-AI Music Creation: Understanding the Perceptions and Experiences of Music Creators for Ethical and Productive Collaboration, in: Proc.
of the 24th Int. Society for Music Information Retrieval Conf., International Society for Music Information Retrieval, Milan, Italy, 2023, pp. 80–88.
[16] C.-Z. A. Huang, H. V. Koops, E. Newton-Rex, AI Song Contest: Human-AI Co-creation in Songwriting, in: Proc. of the 21st Int. Society for Music Information Retrieval Conf., International Society for Music Information Retrieval, Montréal, Canada, 2020, pp. 708–716.
[17] R. Louie, A. Coenen, C. Z. Huang, M. Terry, C. J. Cai, Novice-AI Music Co-Creation via AI-Steering Tools for Deep Generative Models, in: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, Association for Computing Machinery, Honolulu, HI, USA, 2020, pp. 1–13. doi:10.1145/3313831.3376739.
[18] M. M. Suh, E. Youngblom, M. Terry, C. J. Cai, AI as Social Glue: Uncovering the Roles of Deep Generative AI during Social Music Composition, in: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, CHI ’21, ACM, Yokohama, Japan, 2021, p. 11. doi:10.1145/3411764.3445219.
[19] M. Civit, J. Civit-Masot, F. Cuadrado, M. J. Escalona, A systematic review of artificial intelligence-based music generation: Scope, applications, and future trends, Expert Systems with Applications 209 (2022) 118–190. doi:10.1016/j.eswa.2022.118190.
[20] O. Bown, Sociocultural and Design Perspectives on AI-Based Music Production: Why Do We Make Music and What Changes if AI Makes It for Us?, in: E. R. Miranda (Ed.), Handbook of Artificial Intelligence for Music, Springer International Publishing, Cham, 2021, pp. 1–20. doi:10.1007/978-3-030-72116-9_1.
[21] V. Marda, S. Narayan, On the importance of ethnographic methods in AI research, Nature Machine Intelligence 3 (2021) 187–189. doi:10.1038/s42256-021-00323-0.
[22] A. Christin, The ethnographer and the algorithm: Beyond the black box, Theory and Society 49 (2020) 897–918. doi:10.1007/s11186-020-09411-3.
[23] A. F.
Blackwell, Ethnographic artificial intelligence, Interdisciplinary Science Reviews 46 (2021) 198–211. doi:10.1080/03080188.2020.1840226.
[24] R. Van Voorst, T. Ahlin, Key points for an ethnography of AI: An approach towards crucial data, Humanities and Social Sciences Communications 11 (2024) 337. doi:10.1057/s41599-024-02854-4.
[25] N. Seaver, Algorithms as culture: Some tactics for the ethnography of algorithmic systems, Big Data & Society 4 (2017) 205395171773810. doi:10.1177/2053951717738104.
[26] M. Sloane, E. Moss, AI’s social sciences deficit, Nature Machine Intelligence 1 (2019) 330–331. doi:10.1038/s42256-019-0084-6.
[27] E. Dahlin, Mind the gap! On the future of AI research, Humanities and Social Sciences Communications 8 (2021) 1–4. doi:10.1057/s41599-021-00750-9.
[28] P. Dourish, Implications for design (2006) 541–550. doi:10.1145/1124772.1124855.