=Paper= {{Paper |id=Vol-3701/paper6 |storemode=property |title=Music Composition as a Lens for Understanding Human-AI Collaboration (short paper) |pdfUrl=https://ceur-ws.org/Vol-3701/paper6.pdf |volume=Vol-3701 |authors=Eric Tron Gianet,Luigi Di Caro,Amon Rapp |dblpUrl=https://dblp.org/rec/conf/synergy/GianetCR24 }} ==Music Composition as a Lens for Understanding Human-AI Collaboration (short paper)== https://ceur-ws.org/Vol-3701/paper6.pdf
                                Music Composition as a Lens for Understanding
                                Human-AI Collaboration
                                Eric Tron Gianet¹, Luigi Di Caro¹ and Amon Rapp¹
                                ¹ University of Turin, Computer Science Department, Torino, Italy


                                                                         Abstract
                                                                         As generative artificial intelligence (GenAI) systems gain human-like capabilities in creative tasks, they
                                                                         seem to blur the line between machines and users, prompting questions about how to design systems
                                                                         where AI and humans collaborate. Music composition with AI may offer a lens to explore the nuances
                                                                         of human-AI collaboration. We review recent literature on music generation with AI, highlighting key
                                                                         challenges like the need for user control and context awareness, and noting a potential shift in the
                                                                         user’s role towards curation or co-production when using AI tools. However, much of the existing
                                                                         research evaluates the impact of current AI tools rather than engaging in fieldwork to investigate music
                                                                         composition “in practice” within specific socio-cultural contexts. We then propose an ethnographic study
                                                                         to understand music composition as a situated practice, considering composers’ personal motivations,
                                                                         artistic sensibilities, and the broader socio-cultural context. Preliminary findings highlight the importance
                                                                         of creative intentionality and meaning-making in driving compositional choices. Furthermore, music
                                                                         creation often involves collaboration between various human actors, raising questions about whether AI
                                                                         should facilitate this already present collaboration or disrupt existing dynamics.

                                                                         Keywords
                                                                         Human-AI Collaboration, Music Composition, Generative AI, Human-AI Co-creation




                                1. Introduction
                                The rapid evolution of Artificial Intelligence (AI), and particularly GenAI, is reshaping how
                                humans interact with technology. We are moving beyond more traditional interaction models
                                towards scenarios of collaboration between humans and AI, which raises critical questions about
                                how we ought to design the interaction with systems that promise to leverage the strengths
                                of both. GenAI systems can not only perform classification tasks but also create artifacts like
                                music, images, and text, blurring the line between traditional tools and active collaborators
                                with creative abilities.
                                   In the broad research area of Human-Centered AI (HCAI) [1, 2, 3], a perspective has emerged
                                that seeks to leverage the strengths of both humans and AI, creating synergistic systems that
                                surpass their individual capabilities [3, 4, 5]. This concept of human-AI collaboration arguably
                                represents a more human-centered approach compared to human-in-the-loop models, as it
                                makes the “best use of both human and AI capabilities, rather than the human simply being called

                                Proceedings of the 1st International Workshop on Designing and Building Hybrid Human–AI Systems (SYNERGY 2024),
                                Arenzano (Genoa), Italy, June 03, 2024.
                                Email: erictrngnt@gmail.com (E. Tron Gianet); luigi.dicaro@unito.it (L. Di Caro); amon.rapp@unito.it (A. Rapp)
                                ORCID: 0009-0009-3092-7089 (E. Tron Gianet); 0000-0002-7570-637X (L. Di Caro); 0000-0003-3855-9961 (A. Rapp)
                                                                       © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
                                                                       CEUR Workshop Proceedings (CEUR-WS.org)




upon to do what the AI cannot yet manage in an AI led project” [3]. However, this emphasis on AI
agency, which Sarkar [6] terms an “agentistic turn”, necessitates critical consideration. While
it aligns with Bruno Latour’s notion of agency [7], which extends agency across a network of
human and non-human actors, showing how intricate the relationships between humans and
their surroundings are, it could obscure the vast amount of “ghost work” [8] that powers AI,
frequently conducted in the Global South at low wages (e.g., by data labelers) [6]. The concept of
“collaboration” in human-AI interaction is certainly multifaceted, but, while some interactions
with non-generative AI might exhibit collaborative aspects already [9], the emergence of GenAI
presents unique challenges. These systems, capable of human-like outputs, blur the lines
between tools and collaborators. This necessitates a deeper understanding of how human
roles are redefined and how decision-making, creative processes, and information handling are
reshaped.
   To explore this further, we focus on a specific domain – music composition with AI – as
an illustrative case study, to understand how collaboration and co-creation occur in practice,
taking into account not only the user’s objectives, needs, and motivations, but also the social
and cultural context within which humans and GenAI systems may collaborate to achieve
situated goals. The collaborative process of composing music raises intriguing questions about
ownership, control, and intentions, all of which are central themes in the broader discussion of
human-AI collaboration.
   We will begin by reviewing recent literature on human-AI music composition and then
propose a study investigating the situated practices [10] of music composition.


2. AI in Music Composition
The interest in computer-based music composition has steadily increased since the 1980s, with
various techniques of Algorithmic Composition like Markov Models, Generative Grammars,
and Genetic Algorithms. Later, as Neural Networks became more prominent, the advancements
in Deep Learning led to the adoption of established architectures from fields like Computer
Vision and Natural Language Processing in music generation as well [11].
   While research on interactive musical systems had already highlighted the importance of
applying user-centered approaches to support composers’ processes of creation, exploration,
and learning [12, 13, 14], more recent studies have delved into the specific challenges and
strategies of co-creating music with AI.
   Newman et al. [15] investigated through interviews how current AI tools concretely influence
musical creativity. They then proposed a new model for developing ethical and productive
collaborative AI tools for music. This model emphasizes the importance of clearly defined
roles for AI and pays attention to how control is distributed. Their research reveals that users
view AI use cases positively when they can maintain control, agency, intention, and
choice throughout iterative cycles of generation and evaluation [15].
   Huang et al. [16] studied the challenges and strategies of co-creating music with AI through
a survey conducted during an AI song contest. The results highlight the importance of context
awareness and user control, advising future AI systems to be designed to adapt to existing
composers’ practices rather than forcing new AI-determined workflows.
   In another study investigating the use of “steering tools” for collaborative music creation,
Louie et al. [17] identified two challenges in using GenAI: information overload and non-
deterministic output. Their findings suggest that steering tools can enhance the user’s sense of
control, trust, and understanding of the AI system, resulting in a greater feeling of involvement
in the creative process. The authors also comment that users often have pre-existing mental
models about music composition and use them for tackling problems. These preconceptions
should be considered so that both the AI models and the interfaces can be designed to be more
intuitive, requiring less cognitive effort, and ultimately increasing the user’s sense of agency.
Louie et al. [17] also argue that the AI’s role should adapt to the user’s needs and the creative
context: while ceding control is welcomed during exploratory phases, where the user is in
search of unexpected inspiration, maintaining control over specific details becomes critical
during production. Context thus plays a crucial role in shaping the dynamics of human-AI
collaboration, influencing how control and agency are perceived.
   Suh et al. [18] explored how AI systems can act as a “social glue” to support human-human
collaboration in compositional practices. Their findings suggest that AI can facilitate the
exchange of ideas and group cohesion, potentially reducing tensions typically associated with
collaboration. They advocate for an intentional design of AI to further strengthen social
collaboration. However, while AI can act as a support system, they also observed a potential
shift in roles: participants reported feeling more like curators or co-producers, focusing on
evaluating AI-generated material rather than actively developing ideas, leading to a weaker
sense of creative involvement. This observation aligns with the findings of Civit et al. [19], who
noted that “the composer became more of an arranger of different melodies”, comparing their role
to that of a producer managing misbehaving musicians. Although they instead view this shift as a “very
creative, fruitful process”, it still highlights the need for future AI systems to be
adaptive to the creative context, user needs, and the composer’s specific intentions.
   That said, despite the growing interest in human-AI collaboration for music composition,
current research seems to have a limited scope: much of the focus is on evaluating the impact of
existing generative systems, exploring strategies for composers to navigate their challenges,
and integrating steering tools into current interfaces. Efforts to bridge the gap between AI
music generation and the social and cultural complexities of music composition often lack
engagement with field studies on compositional practices. Unlike humans, who are influenced
by personal and social motivations and cultural context, AI systems currently operate solely
on algorithms and predictive models, potentially limiting their effectiveness and failing to
fully capture the nuances of human creativity [20]. The existing research gap highlights the
importance of investigating current practices adopted by music composers, as well as their
personal motivations, artistic sensibilities, the broader cultural and social context that influences
their work, and the specific characteristics of various musical genres. This knowledge will be
crucial for designing human-AI systems and creative workflows that effectively complement
human strengths and intentions.


3. Music Composition as a Situated Practice
The existing literature emphasizes the need to better understand the complexities of music
composition in order to design human-AI collaborative music systems that account for both
the individual and the socio-cultural context. While efforts like that of Hernandez-Olivan and
Beltrán [11] to create generalized models of music composition are valuable for highlighting
core principles, such rigid frameworks can hardly capture the dynamic and diverse nature of
music creation, which is constantly evolving, shaped by genre, stylistic trends, personal choices,
improvisation, and the unique socio-cultural setting where musicians operate.
   To address this complexity, we propose an ethnographic approach that foregrounds the
situated nature of music composition. This approach will delve into both the composers’
personal motivations (e.g., creative aspirations and career goals) and the socio-cultural context
that influences their work. Ethnography allows us to explore these aspects of music creation and
the lived experiences of composers. It has already been applied and discussed as a method for
AI research [21, 22, 23, 24, 25], as part of a broader discussion about the need to integrate the
social sciences into AI research. An exclusive reliance on quantitative methods lacks a
socio-technical perspective, and the uncritical, positivist use of such methods often ignores the
context and causes that lead to a certain outcome, as well as the ways in which it occurs
[26, 21, 27]. By employing ethnography, we look at users not simply as passive
recipients of technology, but as active agents who shape its context, meanings, and consequences
[28]. This approach emphasizes the context-dependent nature of music composition, laying the
groundwork for designing synergistic human-AI systems that are sensitive to the nuances of
human creativity and to the specific settings where music is created.
   Our research aims to answer the following provisional questions:
    • Can the use of Generative AI systems actually be considered a collaboration?
    • How does the specific context of music creation influence decision-making, workflows,
      and creative choices when working with AI?
    • How might collaboration with Generative AI systems redefine the roles of composers in
      the creative process of composition?
    • How do composers and AI negotiate creative control and authorship within this collabo-
      ration?
   To answer these questions, we propose an ethnographic approach combining semi-structured
interviews and participant observation. We will interview 18 musicians with experience in
composing for diverse genres, exploring topics like:
    • Motivations, aspirations, creative sensibilities and personal workflow
    • The role of context in their music creation process
    • Experiences with AI tools and their perception of AI’s role
    • How they perceive their own role evolving in collaboration with AI
  Following the interviews, we will conduct at least 60 hours of participant observation, to
immerse ourselves in the composers’ practice.
  In summary, we aim to shed light on the situated practice of composing music and on how
the broader context influences the collaborative process. We do this to inform the design of
human-AI collaborative systems that support and empower, not replace, musicians in their
creative practices. Additionally, this study can offer insights into human-AI collaboration
beyond music, helping to uncover design patterns for systems that are synergistic with
human capabilities and responsive to the specific context in which they are used.

3.1. Preliminary findings
Our analysis of initial interviews reveals some preliminary findings:
   (a) Intentionality Shapes and Gives Meaning to Music: Composers’ creative intentions
significantly impact their approach to composition. For example, whether they start with a
melody, harmony, or specific sound (timbre) often depends on what they want to convey. While
the existing literature acknowledges the link between intention, control, and creative agency,
our study highlights the specific link between intention and a deeper meaning-making process,
where composers strive to construct a “coherent discourse” through their music. How could we
design human-AI collaborative systems that support and adapt to user intentions?
   (b) Music Is Already Collaborative: Music composition and production are often already
collaborative processes. From bandmates to collaborators, clients, and sound engineers, various
stakeholders contribute to and have an interest in the final product. This raises the question of
whether AI systems should be designed to enhance these existing human-human interactions,
or whether they themselves should become an additional collaborator within a system in which
creative control is already dynamically distributed and negotiated.
   More findings will be shared during the workshop.


References
 [1] W. Xu, Toward human-centered AI: A perspective from human-computer interaction,
     Interactions 26 (2019) 42–46. doi:10.1145/3328485.
 [2] B. Shneiderman, Human-Centered AI, Oxford University Press, Oxford, 2022.
 [3] T. Capel, M. Brereton, What is Human-Centered about Human-Centered AI? A Map of the
     Research Landscape, in: Proceedings of the 2023 CHI Conference on Human Factors in
     Computing Systems, CHI ’23, Association for Computing Machinery, Hamburg, Germany,
     2023, p. 23. doi:10.1145/3544548.3580959.
 [4] L. G. Terveen, Overview of human-computer collaboration, Knowledge-Based Systems 8
     (1995) 67–81. doi:10.1016/0950-7051(95)98369-H.
 [5] Z. Wu, D. Ji, K. Yu, X. Zeng, D. Wu, M. Shidujaman, AI Creativity and the Human-AI
     Co-creation Model, in: M. Kurosu (Ed.), Human-Computer Interaction. Theory, Methods
     and Tools, volume 12762 of Lecture Notes in Computer Science, Springer International
     Publishing, Cham, 2021, pp. 171–190. doi:10.1007/978-3-030-78462-1_13.
 [6] A. Sarkar, Enough With “Human-AI Collaboration”, in: Extended Abstracts of the 2023
     CHI Conference on Human Factors in Computing Systems, CHI EA ’23, Association for
     Computing Machinery, Hamburg, Germany, 2023, p. 8. doi:10.1145/3544549.3582735.
 [7] B. Latour, Reassembling the Social: An Introduction to Actor-Network-Theory, Clarendon
     Lectures in Management Studies, Oxford University Press, Oxford, 2005.
 [8] M. L. Gray, S. Suri, Ghost Work: How to Stop Silicon Valley from Building a New Global
     Underclass, Houghton Mifflin Harcourt, Boston, 2019.
 [9] A. Rapp, A. Boldi, L. Curti, A. Perrucci, R. Simeoni, Collaborating with a Text-Based
     Chatbot: An Exploration of Real-World Collaboration Strategies Enacted during Human-
     Chatbot Interactions, in: Proceedings of the 2023 CHI Conference on Human Factors in
     Computing Systems, CHI ’23, Association for Computing Machinery, New York, NY, USA,
     2023, pp. 1–17. doi:10.1145/3544548.3580995.
[10] L. A. Suchman, Plans and Situated Actions: The Problem of Human-Machine Communica-
     tion, volume 103, Cambridge University Press, USA, 1987.
[11] C. Hernandez-Olivan, J. R. Beltrán, Music Composition with Deep Learning: A Re-
     view, in: A. Biswas, E. Wennekes, A. Wieczorkowska, R. H. Laskar (Eds.), Advances in
     Speech and Music Technology: Computational Aspects and Applications, Signals and
     Communication Technology, Springer International Publishing, Cham, 2023, pp. 25–50.
     doi:10.1007/978-3-031-18444-4_2.
[12] R. Fiebrink, D. Trueman, C. Britt, M. Nagai, K. Kaczmarek, M. Early, M. R. Daniel, A. Hege,
     P. Cook, Toward Understanding Human-Computer Interaction In Composing The Instru-
     ment, in: Proceedings of the International Computer Music Conference, International
     Computer Music Association, New York, 2010, pp. 135–142.
[13] Y. Wu, N. Bryan-Kinns, Supporting Non-Musicians’ Creative Engagement with Musical
     Interfaces, in: Proceedings of the 2017 ACM SIGCHI Conference on Creativity and
     Cognition, C&C ’17, Association for Computing Machinery, New York, NY, USA, 2017, pp.
     275–286. doi:10.1145/3059454.3059457.
[14] H. Scurto, F. Bevilacqua, Appropriating Music Computing Practices Through Human-AI
     Collaboration, in: Journées d’Informatique Musicale (JIM 2018), Amiens, France, 2018.
[15] M. Newman, L. Morris, J. H. Lee, Human-AI Music Creation: Understanding the Perceptions
     and Experiences of Music Creators for Ethical and Productive Collaboration, in: Proc. of
     the 24th Int. Society for Music Information Retrieval Conf., International Society for Music
     Information Retrieval, Milan, Italy, 2023, pp. 80–88.
[16] C.-Z. A. Huang, H. V. Koops, E. Newton-Rex, AI Song Contest: Human-AI Co-creation
     in Songwriting, in: Proc. of the 21st Int. Society for Music Information Retrieval Conf.,
     International Society for Music Information Retrieval, Montréal, Canada, 2020, pp. 708–716.
[17] R. Louie, A. Coenen, C. Z. Huang, M. Terry, C. J. Cai, Novice-AI Music Co-Creation via
     AI-Steering Tools for Deep Generative Models, in: Proceedings of the 2020 CHI Conference
     on Human Factors in Computing Systems, CHI ’20, Association for Computing Machinery,
     Honolulu, HI, USA, 2020, pp. 1–13. doi:10.1145/3313831.3376739.
[18] M. M. Suh, E. Youngblom, M. Terry, C. J. Cai, AI as Social Glue: Uncovering the Roles of
     Deep Generative AI during Social Music Composition, in: Proceedings of the 2021 CHI
     Conference on Human Factors in Computing Systems, CHI ’21, ACM, Yokohama, Japan,
     2021, p. 11. doi:10.1145/3411764.3445219.
[19] M. Civit, J. Civit-Masot, F. Cuadrado, M. J. Escalona, A systematic review of artificial
     intelligence-based music generation: Scope, applications, and future trends, Expert Systems
     with Applications 209 (2022) 118190. doi:10.1016/j.eswa.2022.118190.
[20] O. Bown, Sociocultural and Design Perspectives on AI-Based Music Production: Why
     Do We Make Music and What Changes if AI Makes It for Us?, in: E. R. Miranda (Ed.),
     Handbook of Artificial Intelligence for Music, Springer International Publishing, Cham,
     2021, pp. 1–20. doi:10.1007/978-3-030-72116-9_1.
[21] V. Marda, S. Narayan, On the importance of ethnographic methods in AI research, Nature
     Machine Intelligence 3 (2021) 187–189. doi:10.1038/s42256-021-00323-0.
[22] A. Christin, The ethnographer and the algorithm: Beyond the black box, Theory and
     Society 49 (2020) 897–918. doi:10.1007/s11186-020-09411-3.
[23] A. F. Blackwell, Ethnographic artificial intelligence, Interdisciplinary Science Reviews 46
     (2021) 198–211. doi:10.1080/03080188.2020.1840226.
[24] R. Van Voorst, T. Ahlin, Key points for an ethnography of AI: An approach towards
     crucial data, Humanities and Social Sciences Communications 11 (2024) 337.
     doi:10.1057/s41599-024-02854-4.
[25] N. Seaver, Algorithms as culture: Some tactics for the ethnography of algorithmic systems,
     Big Data & Society 4 (2017) 205395171773810. doi:10.1177/2053951717738104.
[26] M. Sloane, E. Moss, AI’s social sciences deficit, Nature Machine Intelligence 1 (2019)
     330–331. doi:10.1038/s42256-019-0084-6.
[27] E. Dahlin, Mind the gap! On the future of AI research, Humanities and Social Sciences
     Communications 8 (2021) 1–4. doi:10.1057/s41599-021-00750-9.
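[28] P. Dourish, Implications for design, in: Proceedings of the SIGCHI Conference on Human
     Factors in Computing Systems, CHI ’06, Association for Computing Machinery, New York,
     NY, USA, 2006, pp. 541–550. doi:10.1145/1124772.1124855.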