CEUR Workshop Proceedings Vol-3918, paper 350. PDF: https://ceur-ws.org/Vol-3918/paper350.pdf. DBLP: https://dblp.org/rec/conf/aredu/BobarchukHHZ24
                         Oleksandr A. Bobarchuk et al. CEUR Workshop Proceedings                                                                                               319–328


                         Theoretical and practical aspects of using artificial
                         intelligence technologies in the field of sound design
                         Oleksandr A. Bobarchuk, Svitlana M. Halchenko, Serhii O. Hnidenko and
                         Ivan P. Zavadetskyi
                         State Non-Commercial Company “State University “Kyiv Aviation Institute”, 1 Liubomyra Huzara Ave., Kyiv, 03058, Ukraine


                                     Abstract
                                     The theoretical and practical aspects of using artificial intelligence technologies in the field of sound design are
                                     considered. An analysis of modern technologies, their capabilities and limitations is conducted, the advantages
                                     and risks are examined, and the prospects for development in this field are outlined. The results of the research
                                     are aimed at increasing the understanding of the potential of AI in working with sound and determining ways to
                                     effectively implement these technologies in the creative process.

                                     Keywords
                                     sound design, artificial intelligence, sound creation for music, Suno AI, sound plugins, visual novels, AudioGen




                         1. Introduction
                         Modern artificial intelligence (AI) technologies are becoming an integral part of many spheres of human
                         activity, including creative industries [1, 2]. One such area where AI demonstrates significant potential
                         is sound design. This discipline combines art and technology to create sound compositions used in
                         cinema, video games, advertising, music and other media. The use of AI in sound design opens up new
                         possibilities for process automation, sound generation and interactive sound accompaniment, changing
                         traditional approaches to working with sound.
                            Despite significant interest in the use of artificial intelligence in creative industries, the topic of
                         AI application in sound design is not yet fully explored in modern scientific literature. Most studies
                         focus on specific aspects such as sound synthesis, audio signal processing or adaptive sound systems
                         for interactive environments. However, a holistic analysis of the theoretical foundations, practical
                         applications, and the impact of these technologies on the industry as a whole remains fragmented.
                            Some works highlight the technical aspects, describing the algorithms and methods used to generate
                         or process sound. Others focus on applied cases, such as the integration of AI in the production of
                         music or sound effects for cinema and games. Meanwhile, a comprehensive approach that would take
                         into account both creative and technical challenges, ethical aspects and development prospects is still
                         lacking.
                            This indicates the need for deeper research that would create a general concept of using AI in sound
                         design. This article attempts to fill this gap by analysing not only existing technologies, but also their
                         impact on the process of sound creation, as well as outlining future prospects for this field.


                         2. Transformation of modern sound design
                         In the classical sense, sound design is the process of obtaining (generating), editing, and implementing
                         sound elements (samples) in a multimedia composition [3]. It covers a wide range of applications,



                          AREdu 2024: 7th International Workshop on Augmented Reality in Education, May 14, 2024, Kryvyi Rih, Ukraine
                          " a.bobarchuk@interactiveklass.com (O. A. Bobarchuk); smgalchenko@gmail.com (S. M. Halchenko);
                          serhii.hnidenko@npp.nau.edu.ua (S. O. Hnidenko); ivan.zavadetskyi@npp.nau.edu.ua (I. P. Zavadetskyi)
                           0000-0003-3176-7231 (O. A. Bobarchuk); 0000-0003-0531-1572 (S. M. Halchenko); 0009-0002-3215-8577 (S. O. Hnidenko);
                          0000-0002-6854-3971 (I. P. Zavadetskyi)
                                     © 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073



including cinema, theatre, video games, advertising, the music industry and even architectural design
of sound environments. The main principles of classical sound design include the following aspects [4]:

    • Realism and authenticity. The main principle underlying the classical approach is the creation
      of realistic sounds that correspond to the visual or dramatic context.
    • Technical skill. Sound designers rely on traditional methods of recording sound using micro-
      phones, field recorders, analogue and digital processing tools.
    • Foley art. A special place in classical sound design is occupied by the art of creating sound effects
      manually, using real objects and materials to imitate various sounds. Real sounds are recorded
      and processed by appropriate means of artistic sound processing, such as reverberation, echo,
      chorus, etc., to achieve the desired result.
    • Composition and editing. The sound designer combines sounds into a sound composition,
      using editing to achieve the desired rhythm, harmony and dramatic impact. That is, a synergistic
      combination of sound and dynamic change of images is performed.

   The classical approach laid the fundamental principles of sound design, which still remain relevant.
However, changes in the digital landscape pose new challenges. Classical methods of sound design
have their limitations (for example, lack of adaptability, instrumental and technical limitations, time
and resource costs).
   The gradual development of digital technologies has also changed the principles and means of sound
design. With the advent of VST (Virtual Studio Technology) and AU (Audio Units), sound designers
gained access to thousands of digital instruments, simulators of analogue and digital synthesisers,
classical musical instruments and effects [5, 6]. This significantly reduced equipment costs and expanded
the possibilities for experimentation. The development of the gaming industry stimulated the emergence
of adaptive audio systems, where sound changes depending on the player’s actions or environment.
Wwise (figure 1) and FMOD technologies have become the standard in the field of interactive sound
design [7].
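The adaptive principle behind such middleware can be sketched in a few lines. The health parameter and the two-layer crossfade below are invented for illustration; they are not Wwise or FMOD API calls:

```python
# Sketch of adaptive audio: a game-state parameter drives a crossfade
# between a "calm" and a "tense" music layer. Parameter names and the
# linear mapping are illustrative assumptions, not middleware APIs.
def adaptive_mix(health):
    """Return (calm_gain, tense_gain) for a health value in [0, 100]."""
    tension = 1.0 - max(0.0, min(100.0, health)) / 100.0
    return round(1.0 - tension, 2), round(tension, 2)

print(adaptive_mix(100))  # full health: only the calm layer plays
print(adaptive_mix(25))   # low health: mostly the tense layer
```

Real middleware exposes far richer event and state systems, but the core idea is the same: game state in, layer gains out.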
   Digital sound libraries gradually began to appear – sets of pre-recorded samples that can be quickly
applied to one’s own projects. The use of ready-made sounds was not a new practice in itself – the 1950s-
60s saw the creation of the first commercial libraries storing sounds of gunshots, natural phenomena,




Figure 1: Wwise Software – a tool for creating sound for interactive media and video games.






transport, etc. [8]. They were recorded on analogue media (e.g. magnetic tape) and used in cinema and
television.
   Modern sound design has reached a level where technologies allow the creation of high-quality sound
for various types of media – from cinema and video games to advertising and virtual reality. Audio
processing tools have become more powerful, and access to large sound libraries, virtual instruments and modern technical means for recording and processing sound has greatly simplified the process of creating sound content. But the demand for constant improvement remains unchanged. This raises the question: how can the capabilities of artificial intelligence, which continues to develop rapidly, be adapted and used in the sound design industry, and how expedient is such use?


3. Artificial intelligence in sound design: main directions
The use of artificial intelligence for sound generation has become one of the most promising areas in
the field of sound design. At a basic level, this process is based on the ability of algorithms to learn from
large volumes of audio data, analyse them and create new sound textures that can be used in music,
cinema, video games and virtual reality [9].
   In the early stages of development, artificial intelligence algorithms worked primarily with existing
sounds. They could restore audio, remove noise or imitate the sound character of specific instruments.
However, with the development of machine learning [10] and neural networks [11], AI has become
capable of creating completely new soundscapes that did not exist before. For example, generative
adversarial networks (GANs) allow systems to synthesise sounds that have a natural timbre, and
recurrent neural networks (RNNs) learn to predict the next sound segments, creating a continuous
audio stream.
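The predictive idea behind such continuous audio generation can be illustrated with a much simpler stand-in than an RNN: a linear autoregressive model fitted by least squares, which likewise continues an audio stream one sample at a time. This is only a sketch of the principle, not any of the networks mentioned above:

```python
import numpy as np

# Toy stand-in for RNN-style next-segment prediction: a linear
# autoregressive model learns to predict each sample from the previous
# `order` samples, then continues the stream from there.
t = np.arange(2000)
signal = np.sin(2 * np.pi * 440 * t / 16000)  # a 440 Hz tone at 16 kHz

order = 8
# Training pairs: (previous `order` samples, next sample).
X = np.stack([signal[i:i + order] for i in range(len(signal) - order)])
y = signal[order:]
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

# Generate a continuous audio stream one predicted sample at a time.
context = list(signal[-order:])
generated = []
for _ in range(200):
    nxt = float(np.dot(coeffs, context))
    generated.append(nxt)
    context = context[1:] + [nxt]

# A sinusoid is exactly linearly predictable, so the continuation
# stays bounded like the original tone.
print(round(max(abs(s) for s in generated), 3))
```

A trained neural network replaces the linear predictor with a nonlinear one and real audio replaces the test tone, but the generate-one-step-and-feed-back loop is the same.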
   One of the most striking examples of AI applications is the creation of sounds for music. Algorithms
analyse thousands of music tracks, extracting patterns and harmonies, and then generate melodies
or rhythms. Programs like AIVA (Artificial Intelligence Virtual Artist) are capable of creating entire
compositions in various genres, providing composers with a foundation for further work [12]. In the
field of electronic music, AI is often used to create unique samples or synthetic textures that can be
integrated into compositions.
At the beginning of 2024, the Suno AI service gained widespread popularity [13]. Suno AI is an innovative platform that uses artificial intelligence to generate music from text prompts. The user enters a description of the desired song, specifying the style, genre or theme, and the system creates a corresponding composition. This process takes about two minutes, after which the user receives two versions of the track: one with vocals, the other instrumental.
   Suno AI technology is based on artificial intelligence models such as Bark and Chirp, which are
capable of generating not only instrumental music but also adding vocal parts to songs. The algorithm
analyses the entered text, determines its rhythmic and semantic features, and then synthesises a melody
and harmony that match the given description. The vocals are synthesised taking into account the
rhythm and intonation of the text, giving the song a natural sound.
   The system uses an approach similar to large language models, such as ChatGPT [14]: it splits the
text into individual segments (tokens), studies millions of usage variants, styles and structures, and
then reconstructs them on request. However, creating audio, especially music, is a more complex task,
as it requires taking into account many parameters such as melody, harmony, rhythm and timbre.
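As an illustration of this tokenisation idea (not of Bark or Chirp themselves, whose tokenisers are learned and proprietary), the classic mu-law codec shows how continuous audio can be mapped to a small discrete vocabulary and back:

```python
import numpy as np

# Sketch of audio tokenisation: mu-law companding (as in classic
# telephony codecs) maps samples in [-1, 1] to a 256-symbol vocabulary
# and back, so audio can be treated as a sequence of tokens.
MU = 255

def mulaw_encode(x, mu=MU):
    """Samples in [-1, 1] -> integer tokens in [0, mu]."""
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    return ((y + 1) / 2 * mu).astype(np.int64)

def mulaw_decode(tokens, mu=MU):
    """Integer tokens -> samples in [-1, 1]."""
    y = 2 * tokens.astype(np.float64) / mu - 1
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

audio = np.sin(np.linspace(0, 20 * np.pi, 1000))
tokens = mulaw_encode(audio)
restored = mulaw_decode(tokens)
print(tokens.min(), tokens.max())  # tokens stay inside the vocabulary
```

A learned tokeniser replaces this fixed mapping with codes optimised on real audio, but the sequence-of-tokens interface a language-model-style generator consumes is the same.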
   Of course, the tracks generated by Suno AI and other platforms and models have noticeable subjective flaws, which most often manifest as distortions in vocal parts, abrupt changes in volume, or outright misinterpretation or neglect of prompts. Artificial intelligence works much more accurately with small sounds. Classic elements of sound design include various sound transitions, for example a gradually increasing sound (rise), an impact sound (hit), an air-cutting sound (whoosh), etc.
   Artificial intelligence is able to generate and process such short sound effects with high accuracy due
to its ability to analyse thousands of samples and extract key sound characteristics. These effects have a





clear structure and predictable dynamics, making them ideal material for algorithm work. AI models
can create variations of hits, rises or noises based on text descriptions or user settings, providing precise
adjustment of the duration, frequency spectrum and amplitude of each sound.
   In addition, thanks to machine learning technologies, artificial intelligence can automatically select
sounds for different scenes, creating smooth transitions and adapting them to the visual content. For
example, AI can generate a whoosh sound of varying intensity depending on the speed of an object in
the frame or synchronise impact effects with moments of climax.
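A hand-written sketch of such a parametric short effect, with the speed-to-intensity mapping invented purely for illustration (an AI model would learn it from samples instead):

```python
import numpy as np

# Sketch of a parametric "whoosh": white noise under a bell-shaped
# amplitude envelope, with peak level driven by a hypothetical object
# speed. The speed-to-gain mapping is an illustrative assumption.
def make_whoosh(speed, duration=0.5, sr=16000, seed=0):
    rng = np.random.default_rng(seed)
    n = int(duration * sr)
    noise = rng.standard_normal(n)
    t = np.linspace(0.0, 1.0, n)
    envelope = np.sin(np.pi * t) ** 2  # silent edges, loudest centre
    gain = min(1.0, speed / 100.0)     # faster object -> louder whoosh
    return gain * envelope * noise

slow = make_whoosh(speed=20)
fast = make_whoosh(speed=90)
print(np.max(np.abs(slow)) < np.max(np.abs(fast)))  # True
```

The clear structure of such effects (envelope, duration, spectrum) is exactly what makes them tractable for algorithmic generation.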
   There are already services that provide the ability to create cinematic sounds with a text prompt. But
such sounds are only a small part of sound design. For us, sound design is primarily a complex sound
landscape, an immersive global environment. Is artificial intelligence capable of forming something like
this?
   Immersive environment sound design requires not only layering sounds on top of each other, but
also fine-tuning spatial acoustics, dynamics and the emotional content of each layer. In the real world,
sounds interact with each other in unpredictable ways – echoes in space, gradual fading or swelling, the
influence of textures of materials and objects that create or reflect sound. It is difficult for algorithms to
reproduce this chaos and versatility of the sound environment in the same way as human hearing and
perception.
   Currently, artificial intelligence does an excellent job of reconstructing real environments through
recordings and spatial analysis, but creating completely fictional sound landscapes that have no ana-
logues in reality requires creative intuition. A human sound designer works not only with sounds as
such, but with a concept – they create a story through sound, using audio as a tool to evoke emotions
and build atmosphere.
   However, there are also positive aspects. Artificial intelligence algorithms are becoming increasingly
effective in creating procedural sound landscapes. They are able to analyse visual sequences or text
descriptions and generate corresponding sound environments, automatically adding necessary elements:
the sound of wind, raindrops, city bustle or any other simple ambient.
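The principle of such procedural ambient layers can be sketched very simply, here with randomly timed decaying clicks standing in for raindrops; density, decay and level are illustrative choices:

```python
import numpy as np

# Sketch of a procedural ambient layer: raindrops as randomly timed
# decaying clicks over one second of audio.
rng = np.random.default_rng(42)
sr = 16000
ambient = np.zeros(sr)
drop = np.exp(-np.arange(200) / 40.0)  # ~12 ms decaying click
for start in rng.integers(0, sr - 200, size=50):  # 50 drops per second
    ambient[start:start + 200] += 0.3 * drop
print(len(ambient), float(ambient.max()))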
   Artificial intelligence cannot fully construct a multi-layered sound environment. But it can be used as
a tool that provides a certain foundation to work with. For example, it is possible to create simple patterns
of classical instruments in a given key and rhythm, perform their gradual processing using classical
means in any sound editing environment, mix the tracks, supplement them with various generated
sounds, and further integrate the created composition into complex multimedia environments.
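The layering step of this workflow can be sketched as a plain gain mix with clip-protection normalisation; the layer contents and levels below are placeholders, not real stems:

```python
import numpy as np

# The layering workflow as a plain gain mix: several layers are summed
# at individual levels and the sum is normalised only if it would clip.
rng = np.random.default_rng(1)
n = 16000
layers = {
    "pattern": np.sin(2 * np.pi * 110 * np.arange(n) / n),  # tonal pattern
    "texture": 0.5 * rng.standard_normal(n),                # generated noise bed
    "effects": np.zeros(n),                                 # silent placeholder
}
levels = {"pattern": 0.8, "texture": 0.3, "effects": 0.6}

mix = sum(levels[name] * sig for name, sig in layers.items())
mix = mix / max(1.0, np.max(np.abs(mix)))  # clip-protection normalisation
print(float(np.max(np.abs(mix))))  # never exceeds 1.0
```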
   Another equally important aspect is the integration of artificial intelligence technologies into various plugins for working with sound. These are typically highly varied tools for different tasks, and a great many of them have appeared recently. AI assistants are used to perform general mastering (such as the built-in assistant in iZotope Ozone 10/11) and for individual tasks such as compression, limiting, saturation,
equalisation, etc. AI plugins are trained on large volumes of audio data. Developers train the neural
network based on various recordings manually processed by professional sound engineers. The model
analyses how classical tools for saturation, compression or limiting work, and learns patterns of effect
application depending on the type of sound, genre or processing style.
   When the user loads the plugin, the algorithm performs a multivariate analysis of the audio signal –
analyses the frequency spectrum, dynamics, harmonics and noise level. AI models are able to detect
problem areas or potentially weak zones and suggest processing parameters. Practical experience shows
that these parameters are often not optimal, but can be used as a basis for further work with sound.
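The analysis step described here can be sketched with a plain FFT: locate the dominant band and flag it as an EQ candidate. This only illustrates the signal-analysis skeleton; the 10x threshold below is an arbitrary stand-in for what a trained model inside a plugin such as iZotope Ozone would decide:

```python
import numpy as np

# Skeleton of the plugin's analysis step: compute the spectrum, find
# the dominant band, flag it when it towers over the average level.
sr = 8000
t = np.arange(sr) / sr
# Test signal: quiet 200 Hz tone plus a much louder 1000 Hz tone.
audio = 0.1 * np.sin(2 * np.pi * 200 * t) + 0.8 * np.sin(2 * np.pi * 1000 * t)

spectrum = np.abs(np.fft.rfft(audio))
freqs = np.fft.rfftfreq(len(audio), d=1 / sr)
peak_hz = freqs[np.argmax(spectrum)]

suggestion = (f"cut around {peak_hz:.0f} Hz"
              if spectrum.max() > 10 * spectrum.mean() else "no change")
print(suggestion)  # the loud 1000 Hz band is flagged
```

As the text notes, such automatic suggestions are a starting point for the engineer, not a finished result.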
   Much more interesting from the point of view of sound design are plugins such as Synplant 2. The
Genopatch technology (figure 2) built into the plugin allows generating a variety of new sounds based
on a single loaded sample [15]. The unique capabilities and interface of Synplant 2 encourage experimentation in intuitive sound design, allowing users to explore how non-standard methods of interacting with technology can influence the creative process.
   Considering all of the above, we can note that artificial intelligence is already transforming the field
of sound design today, opening up new possibilities for creativity and automation. Despite significant
achievements, AI technologies in sound design face certain significant challenges: limited emotional
depth of generated sounds, complexity of creating complex immersive environments, and various







Figure 2: Working in the Genopatch editor.


technical defects that can distort the perception of the overall picture. However, these limitations
stimulate the development of the industry and create space for improving algorithms, integrating new
approaches and synergy with human creative ideas. AI does not replace sound engineers, but becomes
a powerful tool that helps accelerate the workflow and expand creative horizons.
   Now, let’s demonstrate the possibilities and ways of applying artificial intelligence technologies in
sound design through practical experience.


4. Practical aspects of using artificial intelligence technologies in
   sound design
In this part, as an example of the possibilities of using AI in sound design, we will form a simple sound
design for several scenes that are planned to be used in a visual novel. Visual novels (VN) are a genre of





interactive games where the main emphasis is on the plot and characters, and sound plays an important
role in creating an emotional response. The scenes themselves were created using DALL-E 3 and refined
in Adobe Photoshop (figure 3).




Figure 3: Prepared scenes.


  To begin with, we break down the scenes into components and determine the overall mood and
which sounds we need (figure 4).




Figure 4: Determining sounds for the scene (dark ambient; low-frequency noise; wind noise; melancholic classical instruments; humming of wires; cracking of branches).


   Now let’s determine the necessary artificial intelligence tools. For forming the general landscape, the aforementioned Suno AI is well suited. Using the following prompt, we generate two compositions for download: Violin and piano, melancholic style, slow tempo, dark ambient. We transfer the downloaded result to the FL Studio environment and apply the following sequential processing: slowing down, equalisation, and adding reverb and echo effects using the Crystalize granular generator from the developer SoundToys (figure 5).
   It is worth noting that the recording generated in Suno AI without further processing would not fully
correspond to the general concept of sound design for this project. As already mentioned above, Suno
AI often does not interpret prompts very accurately, which leads to problems with generating music in
less well-known genres. However, artistic processing tools make it possible to significantly change and improve the character of the input sound and adapt it to the needs of the project.
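One of the processing stages mentioned above, the echo, is easy to sketch outside FL Studio as a feedback delay; the delay time, feedback and repeat count below are arbitrary illustrative values:

```python
import numpy as np

# Feedback-delay echo of the kind applied to the generated track,
# sketched in plain NumPy.
def add_echo(signal, sr=16000, delay_s=0.25, feedback=0.5, repeats=3):
    d = int(delay_s * sr)
    out = np.zeros(len(signal) + d * repeats)
    out[:len(signal)] += signal
    gain = 1.0
    for i in range(1, repeats + 1):
        gain *= feedback  # each repeat is quieter than the last
        out[i * d:i * d + len(signal)] += gain * signal
    return out

dry = np.zeros(16000)
dry[0] = 1.0  # a single click makes the echoes easy to inspect
wet = add_echo(dry)
print(wet[0], wet[4000], wet[8000], wet[12000])  # 1.0 0.5 0.25 0.125
```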
   Unlike Suno AI, the AudioGen tool handled its prompt more accurately, generating the distant humming of electric wires and the cracking of branches from a rather short request (figure 6).







Figure 5: Work in the FL Studio environment.




Figure 6: Work with the AudioGen tool.


   To create various variations of the sample generated using AudioGen, we will use Synplant2 and
its Genopatch technology. We load the sample into the plugin environment, after which Synplant2
automatically generates new sound samples based on the provided one (figure 7).
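A rough analogue of deriving variations from a single sample is plain resampling, which shifts pitch and length together; this is only a stand-in for Genopatch's proprietary synthesis-parameter inference:

```python
import numpy as np

# Sample variation by resampling: >1 sounds higher and shorter,
# <1 lower and longer. Linear interpolation keeps it self-contained.
def pitch_variant(sample, factor):
    n_out = int(len(sample) / factor)
    positions = np.arange(n_out) * factor
    return np.interp(positions, np.arange(len(sample)), sample)

base = np.sin(2 * np.pi * 220 * np.arange(8000) / 8000)  # 220 cycles / 8000 samples
variants = [pitch_variant(base, f) for f in (0.5, 1.5, 2.0)]
print([len(v) for v in variants])  # [16000, 5333, 4000]
```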
   After combining all the generated sounds, the result is a simple but, subjectively, quite high-quality sound design sample for the visual novel scenes. Thus, based on practical experience,







Figure 7: Generation of new sounds using Synplant2.


the feasibility of using artificial intelligence technologies as a tool for quickly obtaining the necessary
sound samples for their further processing and combining into a coherent composition was confirmed.


5. Conclusions
The article provides a thorough analysis of the theoretical and practical aspects of using artificial
intelligence technologies in the field of sound design. The current state of the industry is highlighted,
considering its technical capabilities and creative challenges. A study of classical and modern tools for
forming sound environments is conducted.
   It has been proven that artificial intelligence significantly changes the traditional approach to sound
creation, providing process automation, time savings, and expanded possibilities for experimentation.
For the first time, a detailed analysis of the main trends and directions of the impact of artificial
intelligence on sound design is presented. Special attention is paid to tools such as Suno AI, AudioGen,
Synplant2, which demonstrate significant potential for generating sound textures and integration into
creative projects.
   The practical aspect of the research is based on the example of creating a sound accompaniment
for visual novels, where artificial intelligence was used to generate musical compositions and sound
effects. These materials, after further processing, can become the basis for high-quality full-fledged
sound design. It is important to emphasise that although AI tools provide speed and adaptability in





working with sound, their results often require refinement to match creative ideas.
   In this article, for the first time, an integrated approach to the use of various artificial intelligence
tools for creating sound design is proposed. This approach takes into account both technical capabilities
and creative needs. The study outlines the advantages of modern AI algorithms, such as efficiency
in creating short sound effects, as well as their limitations, including difficulties in forming complex
immersive sound landscapes.
   Also, for the first time, the article presents a methodology for selecting artificial intelligence tools for
specific tasks in the context of sound design for multimedia projects. For example, it is determined that
tools such as Suno AI are appropriate for creating music and musical effects, AudioGen for generating
sounds of certain environments, and Synplant2 for editing sounds. This methodology is formed on the
basis of practical work with these tools and subjective evaluation of the generation results.
   The overall results and prospects for further development of this problem can be defined as follows:

    • A study of the main directions of using artificial intelligence in sound creation has been conducted,
      including the generation of musical compositions, short sound effects, and procedural sound
      landscapes. It is shown that tools like Suno AI, AudioGen, and Synplant2 are able to effectively
      perform sound generation and processing tasks, which greatly simplifies the complex process of
      creating sound design;
    • The article presents an example of creating sound design for visual novels, which illustrates the
      capabilities of AI for quickly obtaining basic sound textures. It is shown that artificial intelligence
      can be used to automate sound creation with further processing and refinement, which allows
      achieving high-quality final results;
    • An integration approach is proposed, which consists in using various artificial intelligence tools
      for different tasks that may include short musical compositions, simple sound landscapes, and
      short sounds. Subjective evaluation of the quality of the created samples shows that they are
      quite suitable for use in various multimedia projects.

   The practical significance of the obtained results lies in increasing the efficiency of sound design
creation processes through the integration of artificial intelligence technologies. In particular, the
proposed methods allow automating routine tasks, such as generating basic sound textures and creating
simple sound or musical effects. This reduces the time and resources required for work and allows
designers to focus on the creative aspects of projects.
   Prospects for further research are primarily related to improving algorithms for creating immersive
sound environments, deepening the synergy of AI and human creativity, and more active integration
of generated sounds into multimedia projects. These prospects demonstrate the potential for further
transformation of the sound design industry, expanding the capabilities of creative professionals and
stimulating the development of innovations in the use of artificial intelligence.
   The study confirmed the practical value of artificial intelligence in transforming sound design,
expanding the toolkit for creating sound compositions and opening up new horizons in creative
industries.
Declaration on Generative AI: The authors have not employed any Generative AI tools.


References
 [1] M. V. Marienko, S. O. Semerikov, O. M. Markova, Artificial intelligence literacy in secondary
     education: methodological approaches and challenges, CEUR Workshop Proceedings 3679 (2024)
     87–97.
 [2] I. Mintii, S. Semerikov, Optimizing Teacher Training and Retraining for the Age of AI-Powered
     Personalized Learning: A Bibliometric Analysis, in: E. Faure, Y. Tryus, T. Vartiainen, O. Danchenko,
     M. Bondarenko, C. Bazilo, G. Zaspa (Eds.), Information Technology for Education, Science, and
     Technics, volume 222 of Lecture Notes on Data Engineering and Communications Technologies,
     Springer Nature Switzerland, Cham, 2024, pp. 339–357. doi:10.1007/978-3-031-71804-5_23.





 [3] K. Zizza, Sound Design, in: Game Audio Fundamentals: An Introduction to the Theory, Planning,
     and Practice of Soundscape Creation for Games, Focal Press, London, 2023, pp. 142–163. doi:10.
     4324/9781003218821-11.
 [4] E. Miranda, Computer sound synthesis fundamentals, in: Computer Sound Design: Synthe-
     sis techniques and programming, 2 ed., Routledge, New York, 2012, pp. 19–36. doi:10.4324/
     9780080490755-7.
 [5] A. Rosiński, The Use of Virtual Musical Instruments in Timbre Recognition Training, International
     Journal of Learning and Teaching 9 (2023) 256–257. doi:10.18178/ijlt.9.3.256-260.
 [6] T. Suzuki, A. Nakabayashi, Virtual Studio, The Journal of The Institute of Image Information and
     Television Engineers 61 (2007) 657–659. doi:10.3169/itej.61.657.
 [7] A. Zecevic, G. Durity, Handbook of Game Audio Using Wwise, Taylor & Francis Group, 2020.
 [8] M. Katz, Music in 1s and 0s: The Art and Politics of Digital Sampling, in: Capturing Sound: How Technology has Changed Music, University of California Press, Berkeley, 2004, pp. 137–157. URL: https://ia600409.us.archive.org/29/items/mat-bib_201710/Capturing-sound-how-technology-has-changed-music.pdf.
 [9] K. Saraf, M. D. Amritphale, H. Akhand, K. Vijayvargiya, Music AI, International Research Journal
     of Modernization in Engineering Technology and Science 6 (2024) 11174–11177. doi:10.56726/
     irjmets54679.
[10] P. V. Zahorodko, S. O. Semerikov, V. N. Soloviev, A. M. Striuk, M. I. Striuk, H. M. Shalatska, Com-
     parisons of performance between quantum-enhanced and classical machine learning algorithms
     on the IBM Quantum Experience, Journal of Physics: Conference Series 1840 (2021) 012021.
     doi:10.1088/1742-6596/1840/1/012021.
[11] S. Semerikov, H. Kucherova, V. Los, D. Ocheretin, Neural network analytics and forecasting the
     country’s business climate in conditions of the coronavirus disease (COVID-19), CEUR Workshop
     Proceedings 2845 (2021) 22–32.
[12] Aiva Technologies SARL, AIVA, the AI Music Generation Assistant, 2025. URL: https://www.aiva.
     ai/.
[13] Suno, Inc., Suno, 2025. URL: https://suno.com/.
[14] R. Liashenko, S. Semerikov, The Determination and Visualisation of Key Concepts Related to the
     Training of Chatbots, in: E. Faure, Y. Tryus, T. Vartiainen, O. Danchenko, M. Bondarenko, C. Bazilo,
     G. Zaspa (Eds.), Information Technology for Education, Science, and Technics, volume 222 of
     Lecture Notes on Data Engineering and Communications Technologies, Springer Nature Switzerland,
     Cham, 2024, pp. 111–126. doi:10.1007/978-3-031-71804-5_8.
[15] NuEdge Development, Sonic Charge - Synplant, 2025. URL: https://soniccharge.com/synplant.



