Composing with Generative Systems in the Digital Audio Workstation

Ian Clester, Jason Freeman
Georgia Institute of Technology, Atlanta, GA, USA
ijc@gatech.edu (I. Clester); jason.freeman@gatech.edu (J. Freeman)
https://ijc8.me/ (I. Clester); https://distributedmusic.gatech.edu/ (J. Freeman)

Abstract

Generative systems present new opportunities for composers, but it can be unclear how to integrate such systems into creative workflows. We put forward a vision for a generative audio workstation, in which the composer can work with generative expressions much like ordinary audio or MIDI items, seamlessly mixing static and dynamic musical material. We present our research prototype in this direction, LambDAW, which takes the form of an extension to REAPER (a popular digital audio workstation) that executes Python expressions directly in the timeline, and we discuss possibilities for integration with generative models and machine learning libraries.

Keywords

generative music, end-user programming, music composition, digital audio workstation

1. Vision

Generative musical systems have a long history [3], from Hiller and Isaacson's Illiac Suite in 1957, to David Cope's Experiments in Musical Intelligence in the 1980s [4], to deep neural nets such as OpenAI's MuseNet or Magenta's Music Transformer today. In all these cases, the computer is relied upon entirely to generate the final musical material. Human input ends after constructing the program and possibly providing a prompt.

This workflow contrasts with the more hands-on approach enabled by the digital audio workstation (DAW), in which the composer can directly manipulate and arrange items along a timeline, making it easy to import and work with disparate materials. However, this approach affords limited support for generative music. To incorporate generative systems into a piece, the composer has two options.¹ They can abandon the DAW in favor of a computer music environment (e.g. Csound, SuperCollider, Max) or a general-purpose language that allows programming the system directly. Or they can run the generative system separately, generate some audio or MIDI output, and then import that into the DAW, going back and forth each time they want to tweak the system or generate a different output.

¹ They can also put the generative system in a plugin, but this is hidden in the FX chain, exists for all time, and has limited ability to interact with the DAW.

We put forward a vision for a generative audio workstation: an environment as adept with generative audio as DAWs are with (static) digital audio. In our vision, code is not something separate to be executed outside of the DAW, nor is it timeless in the FX chain. Instead, it is right in the timeline, alongside the other musical materials. Furthermore, it can connect to other materials by reference, enabling the composer to create meaningful links through the piece. In this model, the composer need not give up all control to generative systems, nor reject them entirely. Instead, they retain ultimate control of the piece and can bring in generative systems as they see fit, like any other source of material.

We take inspiration from end-user programming software, including the classic example of spreadsheets and more recent work from Ink & Switch [5, 6]. We also draw on computational notebooks such as Jupyter [7], which promote interactive programming and allow code to generate pieces of the document it is embedded in. Our work is also related to Manhattan [8] (which brings code fragments into a music tracker), Ossia Score [9] (an "intermedia sequencer" which can be scripted via JavaScript), and Computational Counterpoint Marks [10] (which brings code into Western music notation).
2. Prototype

Our research prototype takes the form of an extension to REAPER called LambDAW ("lambda" + "DAW").² With LambDAW, the composer can write Python expressions in the timeline to generate audio or MIDI output directly in the DAW, as shown in Fig. 1. These expressions are stored in take names, so that both the code and its generated output are visible in the timeline. Like a spreadsheet formula, a name begins with = to indicate that it contains an expression to be evaluated. (The = may be further prefixed to give the expression a name for later reference, as in foo=bar().) When LambDAW detects a new or updated expression, it automatically evaluates it and puts the generated output in the associated item. The user can also re-evaluate expressions on demand, e.g. to get different outputs from a model.

² LambDAW is free software, available at https://github.com/ijc8/lambdaw.

Figure 1: LambDAW allows the user to embed Python expressions that generate audio or MIDI directly in the DAW timeline. In this screenshot, tracks 1 & 2 show examples of expression items that programmatically generate audio and MIDI, while tracks 3 & 4 show examples that invoke ML models (MusicVAE [1] for melody generation and RAVE [2] for timbre transfer).

Expressions can refer to other items in the timeline as variables. For example, if there is a MIDI item in the timeline named my_cool_riff, an expression item can refer to it with an expression like =transpose(my_cool_riff, 5). This feature allows the composer to establish connections between musical material; if they later modify the original riff, the derived item can be updated simply by re-evaluating it (unlike a transformed copy, which "forgets" its relationship to the original material). Referring to items in expressions also facilitates the transformation of recorded material with code. (Expression items can also reference other expression items, as in the example with carp in track 1 of Fig. 1.)

Linking expressions by reference enables the user to divide complexity between expressions, providing one way to manage expression length; the user project module provides another.³ LambDAW loads a user-defined module for each project, in which the user can define functions, import useful libraries, load resources, etc., for use in timeline-embedded expressions. The user can also customize how LambDAW converts DAW items to and from Python objects. The user project module thus supports concision and customizability, enabling the user to choose their own set of abstractions for composing and to maintain the brevity of expressions in the timeline.

³ An expression in the timeline can be arbitrarily complex, but it is advisable to keep it brief so that the whole thing can be seen at a glance without excessive zooming.
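To make this concrete, the following is a minimal sketch of what a user project module might contain. It is illustrative only: the function names and the note representation (a list of (pitch, start, length) tuples) are assumptions made for the sake of example, not LambDAW's actual conventions, which the user can customize as described above.

    # Hypothetical user project module. The note representation and
    # names here are illustrative assumptions, not LambDAW's actual API.
    import random

    def transpose(item, semitones):
        # Treat a MIDI item as a list of (pitch, start, length) tuples
        # and shift every pitch by the given number of semitones.
        return [(pitch + semitones, start, length)
                for (pitch, start, length) in item]

    def drunk_walk(n=16, start=60, step=2):
        # Generate n quarter notes whose pitches follow a random walk.
        notes, pitch = [], start
        for beat in range(n):
            notes.append((pitch, beat, 1))
            pitch += random.randint(-step, step)
        return notes

With definitions like these in the project module, timeline expressions such as =transpose(my_cool_riff, 5) or =drunk_walk(32) remain short enough to read at a glance.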
3. Conclusion & Future Work

We believe LambDAW offers a useful way to compose with AI and ML by bringing code into the familiar interface of the DAW. Because LambDAW allows arbitrary Python expressions in the timeline and provides the project module as a place to import libraries and load assets, it enables the user to take advantage of Python's rich ecosystem. Expression items can serve as user-specified "blanks" for an AI to fill in, which the user can re-evaluate to generate new output. The ability of LambDAW expressions to refer to other items in the timeline provides a convenient way to work with models that transform or continue input material, and for the user to pre- or post-process model data, as the sketch below illustrates.
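For instance, a project-module helper might pre-process an item into a model's input format, invoke the model, and post-process the result back into note data. The following sketch is purely illustrative: the model call is replaced by a stand-in (sampling from the input's own pitches) so the example runs as written, and continue_melody is a hypothetical name rather than part of LambDAW or any model library.

    import random

    def continue_melody(item, beats=8):
        # Pre-process: extract pitches and find where the input ends
        # (using the illustrative (pitch, start, length) tuples above).
        pitches = [pitch for (pitch, start, length) in item]
        end = max(start + length for (pitch, start, length) in item)
        # Stand-in for the model: in practice, this is where one would
        # invoke e.g. a MusicVAE checkpoint to continue the melody.
        new_pitches = [random.choice(pitches) for _ in range(beats)]
        # Post-process: pack the output into notes placed after the input.
        return [(pitch, end + i, 1) for i, pitch in enumerate(new_pitches)]

A timeline item named =continue_melody(my_cool_riff) then acts as a blank for the model to fill in; re-evaluating the item requests a fresh continuation.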
Looking ahead, we plan to further explore integrations with generative libraries and models to find out how they can fit into compositional workflows. Some initial work in this direction is depicted in Fig. 1, which features expressions that invoke MusicVAE [1] and RAVE [2] to generate symbolic data and audio, respectively. We also aim to make architectural changes to improve LambDAW's usability (especially with ML libraries), such as performing evaluation in a separate process and allowing the user to interrupt long-running operations.⁴ Finally, we hope to present our system to composers to get their feedback about the quality of the integration, the degree to which it facilitates their use of generative systems, and how it affects their creative process.

⁴ While building our prototype, we encountered technical issues with REAPER's embedded Python support when using libraries such as NumPy, TensorFlow, and PyTorch, which required workarounds. Moving evaluation out to a separate process (as in Jupyter kernels) would avoid these issues, enable interrupting/killing the interpreter without restarting the DAW, and prevent evaluation from blocking the UI thread.

References

[1] A. Roberts, J. Engel, C. Raffel, C. Hawthorne, D. Eck, A hierarchical latent vector model for learning long-term structure in music, in: International Conference on Machine Learning (ICML), 2018. URL: http://proceedings.mlr.press/v80/roberts18a.html.
[2] A. Caillon, P. Esling, RAVE: A variational autoencoder for fast and high-quality neural audio synthesis, 2021. URL: https://arxiv.org/abs/2111.05011. doi:10.48550/ARXIV.2111.05011.
[3] K. Essl, Algorithmic composition, in: N. Collins, J. d'Escrivan (Eds.), The Cambridge Companion to Electronic Music, Cambridge Companions to Music, Cambridge University Press, 2007, pp. 107–125. doi:10.1017/CCOL9780521868617.008.
[4] D. Cope, Experiments in musical intelligence (EMI): Non-linear linguistic-based composition, Interface 18 (1989) 117–139. URL: https://doi.org/10.1080/09298218908570541. doi:10.1080/09298218908570541.
[5] G. Litt, M. Schoening, P. Shen, P. Sonnentag, Potluck: Dynamic documents as personal software, 2022. URL: https://www.inkandswitch.com/potluck/.
[6] J. Lindenbaum, S. Kaliski, J. Horowitz, Inkbase: Programmable ink, 2022. URL: https://www.inkandswitch.com/inkbase/.
[7] T. Kluyver, B. Ragan-Kelley, F. Pérez, B. Granger, M. Bussonnier, J. Frederic, K. Kelley, J. Hamrick, J. Grout, S. Corlay, P. Ivanov, D. Avila, S. Abdalla, C. Willing, Jupyter development team, Jupyter notebooks - a publishing format for reproducible computational workflows, in: F. Loizides, B. Schmidt (Eds.), Positioning and Power in Academic Publishing: Players, Agents and Agendas, IOS Press, Netherlands, 2016, pp. 87–90. URL: https://eprints.soton.ac.uk/403913/.
[8] C. Nash, Manhattan: End-user programming for music, in: Proceedings of the 14th International Conference on New Interfaces for Musical Expression, 2014, pp. 221–226. URL: https://www.nime.org/proceedings/2014/nime2014_371.pdf.
[9] J.-M. Celerier, P. Baltazar, C. Bossut, N. Vuaille, J.-M. Couturier, M. Desainte-Catherine, OSSIA: Towards a unified interface for scoring time and interaction, in: TENOR 2015 - First International Conference on Technologies for Music Notation and Representation, 2015. URL: http://tenor2015.tenor-conference.org/papers/13-Celerier-OSSIA.pdf.
[10] J. C. Martinez, Extending music notation as a programming language for interactive music, in: ACM International Conference on Interactive Media Experiences, ACM, 2021, pp. 28–36. URL: https://doi.org/10.1145/3452918.3458807.