Downsides, Pitfalls, and Socio-Cultural Shortcomings of Human-AI Music Co-Creation

António Correia

antonio.g.correia@jyu.fi, University of Jyväskylä, Faculty of Information Technology, P.O. Box 35, FI-40014 Jyväskylä, Finland

Research has shown a misalignment between artificial intelligence (AI)-steered music creation tools and the cultural aspects encountered in diverse musical experiences. This lack of cultural sensitivity in AI development raises several concerns, particularly when new AI-steered music creation tools are introduced without a clear understanding of their risks and impacts. While AI offers unprecedented opportunities for enhancing creativity and streamlining production in artistic activities, it can also reinforce cultural biases and exacerbate the marginalization of underrepresented communities. In this sense, the socio-cultural implications of AI-steered music co-creation must be framed within a culturally inclusive and context-aware design strategy that respects diverse musical traditions, identities, and composition practices. Such a strategy considers AI not merely as a tool but as a co-creative partner shaped by (and responsive to) the musician's socio-cultural background and needs. By discussing the role of culturally sensitive AI design in music composition, this study contributes to the ongoing discourse on equitable AI in the creative arts and its potential pitfalls in relation to the situated nature of musicians' everyday compositions.

creativity, culturally sensitive AI design, human-AI music co-creation, inclusivity, multicultural settings, music, socio-cultural harms

1. Introductory Remarks

With the dramatic increase in the number of artificial intelligence (AI)-based products and services, there is a great deal of untapped potential for music creativity across genres and styles. Nowadays, generative AI plays a central role in compositional sample-based music remixing and transformation due to its ability to interact with humans while generating novel content and turning an original version of a song into a different version that suits particular needs and preferences. As AI-driven tools mature, musicians can now inexpensively generate professional-quality instrumentals, song lyrics, and recordings through steering interfaces developed to support creativity and composition activities [1]. This synthetic method of generating musical content offers unprecedented opportunities by casting the AI in roles such as that of a co-creative dance partner [2] or songwriting assistant [3]. In spite of this, engaging with AI-generated material entails an added cognitive demand [4], potential biases [5], and societal harms such as dehumanization, stereotyping, and systemic erasure [6] that tend to cause damage in under-resourced and underserved communities. As a situated practice [7], AI-mediated music creation must be contextualized within the individual and socio-cultural settings in which each musician operates, including their creative and artistic sensibilities.

Among many known sociotechnical harms, Shelby and co-authors [6] stressed the importance of mitigating cultural harms that appear throughout socio-algorithmic experiences. Besides common problems like the lack of ownership and control over AI-generated creations, the development of algorithmic systems for music creation can also contribute to social exclusion and segregation when there is a lack of careful consideration and understanding of how these algorithms actually work and how they are “influenced” to perform unintended behaviors such as generating harmful lyrics. Therefore, there is a need to scrutinize AI music composition beyond the algorithms to mitigate the individual and sociotechnical harms associated with inappropriate AI implementations, as well as to explore novel approaches to interacting with AI systems during music co-creative processes. This paper contributes to this debate by looking at the current shortcomings in the quest to design AI co-creativity tools for music composition that are culturally sensitive and inclusive in their capacity to learn from (and adapt to) each musician’s unique characteristics.

2. Socio-Cultural Harms and Vulnerabilities in Human-AI Music Composition

AI-enabled music composition typically involves a range of activities, including melody-to-lyric and lyric-to-melody generation, singing voice synthesis, music style/emotion modeling, timbre rendering, and sound mixing. From an anthropomorphic view, individuals tend to prefer products created by humans over those generated by AI technology [8]. The main reason for this phenomenon can be attributed to the cultural familiarity and proximity that people feel toward human creations when compared to AI-produced creative products. As cultural harms are likely to be reproduced through the use of AI systems, designing for cultural diversity assumes particular relevance in creating more culturally aware human-AI music composition interactions in line with user expectations, goals, and cultural differences [9]. In particular, culture is dynamically formed and collectively reproduced as a social activity that develops in an ongoing fashion [10]. Drawing from this generative view of culture, it becomes imperative to integrate the viewpoints of culturally diverse groups into the design of AI-based technologies [11]. With this in mind, AI system developers can contribute to preserving cultural stability and identity while mitigating harmful cultural beliefs (e.g., propagating erroneous perceptions regarding particular cultural groups) [6], algorithmic unfairness [12], and adverse cultural impacts (e.g., cultural divides) [13]. However, despite the optimism surrounding culturally sensitive design [14], the literature on the cultural aspects underlying human-AI music co-creation is scant [9], especially regarding the nuanced interplay between traditional musical practices and culturally contextualized enactments embodied in generative AI models.

As a multilayered creative process based on cultural productions and artistic expressions, music composition plays a crucial role in shaping the complex and intricate tapestry of human expression and cultural identity. Research has demonstrated that the mental models users develop toward AI technology are established at a very early stage and can have lasting detrimental effects [15]. If incorrectly formed, mental models can gradually diminish users’ trust in AI over time. This calls for an alignment between each musician’s expectations of AI-driven tools and their actual uses in practice to avoid erroneous assumptions and thereby ensure appropriate reliance on AI. Given that these mental models are derived from each individual’s authentic experiences, along with their personal and socio-cultural backgrounds [16], it is crucial for designers and developers to create culturally sensitive interactive experiences in the AI music composition space.

3. Setting the Scene for a Culturally Sensitive Design in AI-steered Music Co-Creation

Designing for culturally sensitive AI in music composition involves a thorough consideration of a diverse set of cultural backgrounds, musical traditions, styles, identities, and preferences that are shaped throughout the use of algorithmic systems. However, incorporating culturally congruent aspects into generative AI models is inherently challenging. According to Seaver [17], algorithms can be seen “as culture” themselves since they reflect norms, values, or even socio-technical vulnerabilities regarding the environment where they are embedded. The author goes even further by claiming that “algorithms are [culturally] enacted by practices which do not heed a strong distinction between technical and non-technical concerns, but rather blend them together”. That is, algorithms are dynamic entities influenced by collective human practice and are thus subject to its harms. Despite the recent stir around the use of AI-steered tools, algorithms trained on biased data can exacerbate societal prejudices. For instance, generative AI models like DALL·E 3 incorporate cultural attributes and traits from the array of training data available online [18]. This may lead to inadequate actions and misrepresentations, as culturally insensitive AI systems may unintentionally exclude or marginalize certain groups, ultimately triggering a digital divide [19]. Elaborating on this, developing AI-driven tools that are aware of cultural values and contingencies is crucial for reducing the risk of unfair outcomes and harmful expressions that may or may not be present in multicultural settings.

3.1. How Culture Shapes Musicians’ Work with AI-steered Tools

Overall, culture can have a substantial impact on individuals’ expectations and preferences regarding their interaction with AI systems [20]. However, some studies indicate a tendency, in matters of AI, to favor individuals from Western, educated, industrialized, rich, and democratic (WEIRD) countries to the detriment of the remaining 88% of the global population [21]. Therefore, it can be contended that there is a need to advocate for increased inclusivity, fairness, plurality, and equity in AI technology design [22]. Solving the everyday challenges of excluded groups by accommodating different cultures through inclusive design strategies is thus of critical importance in today’s AI-driven socio-technical landscape. Given all these potential perils and pitfalls, AI developers need to learn from the diversity that surrounds musicians to effectively capture what is acceptable and what is not from a cultural and behavioral view. That is, a human-centered design standpoint requires an emancipatory AI approach that gives artistic autonomy to the musician and takes into account their unique creative individuality and socio-cultural context.

Nowadays, AI can provide suggestions tailored to each music composer. For example, a horror movie soundtrack artist may be interested in knowing more about the process that led other artists in the past to create specific pieces (e.g., Dario Argento’s Suspiria), while a hip-hop producer may wish to discover more soundtracks from the 70s and 80s that could provide the right sample for their next track. AI’s growing capability to produce synthetic data is thus of interest in many composition activities. It nonetheless has potential drawbacks in the creation process, since most of these systems are trained with data from Western cultures and therefore reflect highly specific norms that do not fit other cultures and contexts. If we consider an Italian musician from Sicily, their cultural background and music preferences are shaped by a rich tapestry of regional traditions, historical influences, and socio-political contexts unique to the island's Mediterranean heritage. On the other hand, a Japanese musician may draw upon a distinct set of cultural narratives and musical idioms rooted in their specific regional and historical context, which can lead to different aesthetic priorities and compositional outcomes. However, research on large language models (LLMs) and other AI-driven solutions has highlighted social identity issues [23] and the cultural impacts arising from the underrepresentation of certain populations in favor of more developed countries [24]. If AI systems do not recognize these differences and adequately support individual work preferences in a culturally sensitive manner, they may compromise the entire creation process.
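To make the concern about training data skewed toward Western cultures more concrete, the following minimal sketch shows one way a developer might audit how musical regions are represented in a corpus before training. It is only an illustration under stated assumptions: the `region` metadata field, the toy corpus, and the 12% threshold are hypothetical and do not come from any of the cited studies or systems.

```python
from collections import Counter


def region_shares(tracks):
    """Return the fraction of tracks per region label, largest share first."""
    counts = Counter(track["region"] for track in tracks)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: n / total for region, n in counts.most_common()}


def underrepresented(shares, threshold=0.12):
    """List regions whose share falls below the (hypothetical) threshold."""
    return sorted(region for region, share in shares.items() if share < threshold)


# Toy metadata standing in for a training corpus (assumed, not real data).
corpus = (
    [{"region": "Western Europe"}] * 8
    + [{"region": "North America"}] * 6
    + [{"region": "East Asia"}] * 3
    + [{"region": "West Africa"}] * 2
    + [{"region": "South Asia"}] * 1
)

shares = region_shares(corpus)
flagged = underrepresented(shares)
print(shares)
print(flagged)
```

A count of this kind only surfaces an imbalance. Deciding which traditions matter, how they should be labeled, and what counts as adequate representation remains a socio-cultural question that metadata alone cannot settle.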

3.2. Algorithmic Contingencies in Human-AI Music Compositional Activities

Algorithmic contingencies refer to the unpredictable, non-linear, and variable ways algorithms influence (or respond to) human behavior, systems, or environments. The downsides of an algorithmic culture that fosters discriminatory outcomes and poses threats to social inclusion and justice are already recognized in the literature [25]. Embedding diverse perspectives throughout a socio-algorithmic interaction design strategy comprising cultural sensitivity and transparency (e.g., social blackboxing) can reduce inequities and other detrimental consequences in the long run. Interculturality is an integral aspect of music composition, and AI models must demonstrate cultural understanding to enable musicians to communicate effectively through music. Musicians form social bonds based on a shared identity that is commonly built around aspects like language, traditions, cultural heritage, and sense of belonging to a community (e.g., metalheads). In line with this, imprecision can be leveraged as a way of opening up new avenues for creating culturally embedded artefacts that accommodate diverse interpretations and practices [26]. Equity is a long and winding road in every activity involving AI. At the same time, promoting equity “can conflict with the maximization of individual liberty” [27]. Therefore, an understanding of how misaligned AI technology can exacerbate abusive behaviors, unbalanced content, harmful stereotypes, and subtler biases that result from inadequate representation of protected groups can help stakeholders to better support marginalized and vulnerable communities through tangible interventions.

Cross-cultural studies are needed, and ethnography can play a crucial role in this context by enabling researchers to immerse themselves in the social and cultural contexts of musical co-creation while yielding valuable insights into the in situ compositional experiences and sociotechnical needs of musicians [28, 29]. When implemented within a socio-technical context, AI can be examined as a cultural artifact with implications for real-world artistic practices that result from collective culture-imbued content diffused and reshaped across generations [26]. Beyond authenticity and ownership, interventions can include training AI to recognize and extract culturally significant features such as rhythmic patterns, tonal systems like the raga in Indian classical music, or traditional instruments like the kora in West African music in order to ground its generative outputs in culturally rooted elements. AI can also be used to analyze lyrical contents, song structures, and live performance acts to uncover the underlying narratives and social themes embedded in musical traditions, such as the storytelling aspects of folk music or the protest motifs prevalent in specific subgenres such as underground hip-hop. Guided by cultural context, generative AI models can produce music that not only mirrors the stylistic nuances of a region but also respects its aesthetic and historical sensibilities. Furthermore, techniques like style transfer (cultural export) enable the blending of musical characteristics across cultures. Creating personalized compositional experiences by matching the AI-steered tool with the musician’s style or suggesting culturally relevant variations during the co-creation process is also a possible approach. In this sense, artists play a critical role by mediating between cultural expression and algorithmic function, creating a bridge that frames AI not just as a tool of technical utility but as a medium for intercultural dialogue and creative exploration.
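As a rough illustration of what extracting rhythmic and tonal features might look like at the simplest level, the sketch below computes a pitch-class histogram and the inter-onset intervals of a short symbolic phrase. It assumes a hypothetical representation of notes as `(onset_in_beats, midi_pitch)` pairs. The 12-tone pitch classes used here are themselves a Western convention and would likely miss the microtonal inflections central to traditions such as the raga, which is exactly the kind of limitation a culturally sensitive design would need to address.

```python
from collections import Counter


def pitch_class_histogram(notes):
    """Relative frequency of each 12-tone pitch class (0 = C, ..., 11 = B)."""
    if not notes:
        return {}
    counts = Counter(pitch % 12 for _, pitch in notes)
    total = sum(counts.values())
    return {pc: counts[pc] / total for pc in sorted(counts)}


def inter_onset_intervals(notes):
    """Time gaps (in beats) between consecutive note onsets."""
    onsets = sorted(onset for onset, _ in notes)
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


# Hypothetical five-note phrase: (onset in beats, MIDI pitch number).
phrase = [(0.0, 60), (1.0, 62), (1.5, 64), (2.0, 67), (3.0, 72)]

histogram = pitch_class_histogram(phrase)
intervals = inter_onset_intervals(phrase)
print(histogram)
print(intervals)
```

Descriptors like these could serve as inputs to a model or as a way to compare generated output against a reference repertoire, but choosing which features are culturally significant would still depend on the ethnographic work described above.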

4. Final Remarks

This study is guided by the premise that AI developers can facilitate a more intuitive and inclusive user experience in music composition by aligning the interface design and system components with each musician’s cultural frames of reference, ultimately encouraging their broader adoption and appropriate usage. While generative AI models have demonstrated remarkable capabilities in alleviating musicians’ creative blocks by supporting content ideation, they simultaneously raise concerns surrounding originality, authenticity, and human agency. These challenges underscore the urgent need for human-centered AI design strategies that prioritize both artistic and non-artistic cultural dimensions. In the context of AI-based music generation, it is essential to consider the cultural diversity of stakeholders, especially those from marginalized or vulnerable communities who are directly impacted by these technologies. By embedding culturally sensitive approaches into the co-creative process, including genre-specific nuances and intercultural dialogues, AI systems can foster inclusivity and therefore support societal integration while enhancing the expressive capacity of artists by augmenting, for example, their lyrical palette and musical expertise. As both public and private sectors invest in the creative industry on an ongoing basis, embedding cultural projections and “dimensionalizing” (sub)cultures within generative AI models becomes a critical endeavor for ensuring inclusive implementations.

Declaration of Generative AI

The author has not employed any Generative AI tools.

References

[1] Louie, Engel, C. Z. A. Huang, Expressive communication: Evaluating developments in generative models and steering interfaces for music creation, in: Proceedings of the ACM International Conference on Intelligent User Interfaces, 2022, pp. 405–417.

[2] Winston, Magerko, Turn-taking with improvisational co-creative agents, in: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2017, pp. 129–135.

[3] C. Z. A. Huang, H. V. Koops, E. Newton-Rex, M. Dinculescu, C. J. Cai, Human-AI co-creation in songwriting, in: Proceedings of the International Society for Music Information Retrieval Conference, 2020, pp. 708–716.

[4] L. Tankelevitch, V. Kewenig, A. Simkute, A. E. Scott, A. Sarkar, A. Sellen, S. Rintel, The metacognitive demands and opportunities of generative AI, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–24.

[5] R. B. Tchemeube, J. Ens, C. Plut, P. Pasquier, M. Safi, Y. Grabit, J. B. Rolland, Evaluating human-AI interaction via usability, user experience and acceptance measures for MMM-C: A creative AI system for music composition, in: Proceedings of the International Joint Conference on Artificial Intelligence, 2023, pp. 5769–5778.

[6] R. Shelby, S. Rismani, K. Henne, A. Moon, N. Rostamzadeh, P. Nicholas, N’M. Yilla-Akbari, J. Gallegos, A. Smart, E. García, G. Virk, Sociotechnical harms of algorithmic systems: Scoping a taxonomy for harm reduction, in: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2023, pp. 723–741.

[7] E. T. Gianet, L. Di Caro, A. Rapp, Music composition as a lens for understanding human-AI collaboration, in: Proceedings of the International Workshop on Designing and Building Hybrid Human-AI Systems, 2024, pp. 1–7.

[8] A. Tubadji, H. Huang, D. J. Webber, Cultural proximity bias in AI-acceptability: The importance of being human, Technological Forecasting and Social Change, 173 (2021) 121100.

[9] A. Correia, On the human-AI metaphorical interplay for culturally sensitive generative AI design in music co-creation, in: Joint Proceedings of the ACM International Conference on Intelligent User Interfaces, 2024, pp. 1–11.

[10] L. Irani, J. Vertesi, P. Dourish, K. Philip, R. E. Grinter, Postcolonial computing: A lens on design and development, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2010, pp. 1311–1320.

[11] X. Ge, C. Xu, D. Misaki, H. R. Markus, J. L. Tsai, How culture shapes what people want from AI, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–15.

[12] N. Grgić-Hlača, G. Lima, A. Weller, E. M. Redmiles, Dimensions of diversity in human perceptions of algorithmic fairness, in: Proceedings of the ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization, 2022, pp. 1–12.

[13] A. DeVos, A. Dhabalia, H. Shen, K. Holstein, M. Eslami, Toward user-driven algorithm auditing: Investigating users’ strategies for uncovering harmful algorithmic behavior, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2022, pp. 1–19.

[14] A. Sabiescu, A. de Moor, N. Memarovic, Opening up the culture black box in community technology design, AI & SOCIETY, 34 (2019) 393–402.

[15] P. Pataranutaporn, R. Liu, E. Finn, P. Maes, Influencing human–AI interaction by priming beliefs about AI can increase perceived trustworthiness, empathy and effectiveness, Nature Machine Intelligence, 5 (2023) 1076–1086.

[16] T. Kulesza, S. Stumpf, M. Burnett, S. Yang, I. Kwan, W. K. Wong, Too much, too little, or just right? Ways explanations impact end users’ mental models, in: Proceedings of the IEEE Symposium on Visual Languages and Human Centric Computing, 2013, pp. 3–10.

[17] N. Seaver, Algorithms as culture: Some tactics for the ethnography of algorithmic systems, Big Data & Society, 4 (2017) 2053951717738104.

[18] L. Struppek, D. Hintersdorf, F. Friedrich, P. Schramowski, K. Kersting, Exploiting cultural biases via homoglyphs in text-to-image synthesis, Journal of Artificial Intelligence Research, 78 (2023) 1017–1068.

[19] Y. K. Dwivedi et al., “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy, International Journal of Information Management, 71 (2023) 102642.

[20] U. Peters, M. Carman, Cultural bias in explainable AI research: A systematic analysis, Journal of Artificial Intelligence Research, 79 (2024) 971–1000.

[21] K. Seaborn, G. Barbareschi, S. Chandra, Not only WEIRD but “uncanny”? A systematic review of diversity in human–robot interaction research, International Journal of Social Robotics, 15 (2023) 1841–1870.

[22] S. Linxen, C. Sturm, F. Brühlmann, V. Cassau, K. Opwis, K. Reinecke, How weird is CHI?, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2021, pp. 1–14.

[23] T. Hu, Y. Kyrychenko, S. Rathje, N. Collier, S. van der Linden, J. Roozenbeek, Generative language models exhibit social identity biases, Nature Computational Science, 5 (2025) 65–75.

[24] S. Dudy, T. Tholeti, R. Ramachandranpillai, M. Ali, T. J. J. Li, R. Baeza-Yates, Unequal opportunities: Examining the bias in geographical recommendations by large language models, in: Proceedings of the ACM International Conference on Intelligent User Interfaces, 2025, pp. 1499–1516.

[25] S. Moussawi, X. Deng, K. D. Joshi, AI and discrimination: Sources of algorithmic biases, ACM SIGMIS Database, 55 (2024) 6–11.

[26] B. Caramiaux, S. Fdili Alaoui, “Explorers of unknown planets”: Practices and politics of artificial intelligence in visual arts, Proceedings of the ACM on Human-Computer Interaction, 6, CSCW2 (2022) 1–24.

[27] S. Tolmeijer, M. Christen, S. Kandul, M. Kneer, A. Bernstein, Capable but amoral? Comparing AI and human expert collaboration in ethical decision making, in: Proceedings of the CHI Conference on Human Factors in Computing Systems, 2022, pp. 1–17.

[28] A. Correia, D. Schneider, B. Fonseca, H. Mohseni, T. Kujala, T. Kärkkäinen, And justice for art(ists): Metaphorical design as a method for creating culturally diverse human-AI music composition experiences, in: Proceedings of the IEEE International Congress on Human-Computer Interaction, Optimization and Robotic Applications, 2024, pp. 1–4.

[29] T. G. Eric, L. Di Caro, A. Rapp, Human-AI collaboration insights from music composition, in: Generative AI and HCI Workshop Proceedings, 2024, pp. 1–5.