Composing with Generative Systems in the Digital Audio Workstation

Ian Clester, Jason Freeman
Georgia Institute of Technology, Atlanta, GA, USA
ijc@gatech.edu (I. Clester); jason.freeman@gatech.edu (J. Freeman)
https://ijc8.me/ (I. Clester); https://distributedmusic.gatech.edu/ (J. Freeman)

Abstract

Generative systems present new opportunities for composers, but it can be unclear how to integrate such systems into creative workflows. We put forward a vision for a generative audio workstation, in which the composer can work with generative expressions much like ordinary audio or MIDI items, seamlessly mixing static and dynamic musical material. We present our research prototype in this direction, LambDAW, which takes the form of an extension to REAPER (a popular digital audio workstation) that executes Python expressions directly in the timeline, and we discuss possibilities for integration with generative models and machine learning libraries.

Keywords

generative music, end-user programming, music composition, digital audio workstation

1. Vision

Generative musical systems have a long history [3], from Hiller and Isaacson's Illiac Suite in 1957, to David Cope's Experiments in Musical Intelligence in the 1980s [4], to deep neural nets such as OpenAI's MuseNet or Magenta's Music Transformer today. In all these cases, the computer is relied upon entirely to generate the final musical material. Human input ends after constructing the program and possibly providing a prompt.

This workflow contrasts with the more hands-on approach enabled by the digital audio workstation (DAW), in which the composer can directly manipulate and arrange items along a timeline, making it easy to import and work with disparate materials. However, this approach affords limited support for generative music. To incorporate generative systems into a piece, the composer has two options.¹ They can abandon the DAW in favor of a computer music environment (e.g. Csound, SuperCollider, Max) or a general-purpose language that allows programming the system directly. Or they can run the generative system separately, generate some audio or MIDI output, and then import that into the DAW, going back and forth each time they want to tweak the system or generate a different output.

¹ They can also put the generative system in a plugin, but this is hidden in the FX chain, exists for all time, and has limited ability to interact with the DAW.

We put forward a vision for a generative audio workstation: an environment as adept with generative audio as DAWs are with (static) digital audio. In our vision, code is not something separate to be executed outside of the DAW, nor is it timeless in the FX chain. Instead, it is right in the timeline, alongside the other musical materials. Furthermore, it can connect to other materials by reference, enabling the composer to create meaningful links through the piece. In this model, the composer need not give up all control to generative systems, nor reject them entirely. Instead, they retain ultimate control of the piece and can bring in generative systems as they see fit, like any other source of material.

We take inspiration from end-user programming software, including the classic example of spreadsheets and more recent work from Ink & Switch [5, 6]. We also draw on computational notebooks such as Jupyter [7], which promote interactive programming and allow code to generate pieces of the document it is embedded in. Our work is also related to Manhattan [8] (which brings code fragments into a music tracker), Ossia Score [9] (an "intermedia sequencer" which can be scripted via JavaScript), and Computational Counterpoint Marks [10] (which brings code into Western music notation).
2. Prototype

Our research prototype takes the form of an extension to REAPER called LambDAW ("lambda" + "DAW").² With LambDAW, the composer can write Python expressions in the timeline to generate audio or MIDI output directly in the DAW, as shown in Fig. 1. These expressions are stored in take names, so that both the code and its generated output are visible in the timeline. Like a spreadsheet formula, a name begins with = to indicate that it contains an expression to be evaluated. (The = may be further prefixed to give the expression a name for later reference, as in foo=bar().) When LambDAW detects a new or updated expression, it automatically evaluates it and puts the generated output in the associated item. The user can also re-evaluate expressions on demand, e.g. to get different outputs from a model.

² LambDAW is free software, available at https://github.com/ijc8/lambdaw.

Figure 1: LambDAW allows the user to embed Python expressions that generate audio or MIDI directly in the DAW timeline. In this screenshot, tracks 1 & 2 show examples of expression items that programmatically generate audio and MIDI, while tracks 3 & 4 show examples that invoke ML models (MusicVAE [1] for melody generation and RAVE [2] for timbre transfer).

Expressions can refer to other items in the timeline as variables. For example, if there is a MIDI item in the timeline named my_cool_riff, an expression item can refer to it with an expression like =transpose(my_cool_riff, 5). This feature allows the composer to establish connections between musical material; if they later modify the original riff, the derived item can be updated simply by re-evaluating it (unlike a transformed copy, which "forgets" its relationship to the original material). Referring to items in expressions also facilitates the transformation of recorded material with code. (Expression items can also reference other expression items, as in the example with carp in track 1 of Fig. 1.)

Linking expressions by reference enables the user to divide complexity between expressions, providing one way to manage expression length; the user project module provides another.³ LambDAW loads a user-defined module for each project, in which the user can define functions, import useful libraries, load resources, etc., for use in timeline-embedded expressions. The user can also customize how LambDAW converts DAW items to and from Python objects. The user project module thus supports concision and customizability, enabling the user to choose their own set of abstractions for composing and to maintain the brevity of expressions in the timeline.

³ An expression in the timeline can be arbitrarily complex, but it is advisable to keep it brief so that the whole thing can be seen at a glance without excessive zooming.
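To make this concrete, the following is a minimal sketch of what a user project module might contain. It is illustrative only: the function names and the note representation (a list of (pitch, start, length) tuples) are assumptions made for the sake of example, not LambDAW's actual conventions, which the user can customize as described above.

    # Hypothetical user project module. The note representation and
    # names here are illustrative assumptions, not LambDAW's actual API.
    import random

    def transpose(item, semitones):
        # Treat a MIDI item as a list of (pitch, start, length) tuples
        # and shift every pitch by the given number of semitones.
        return [(pitch + semitones, start, length)
                for (pitch, start, length) in item]

    def drunk_walk(n=16, start=60, step=2):
        # Generate n quarter notes whose pitches follow a random walk.
        notes, pitch = [], start
        for beat in range(n):
            notes.append((pitch, beat, 1))
            pitch += random.randint(-step, step)
        return notes

With definitions like these in the project module, timeline expressions such as =transpose(my_cool_riff, 5) or =drunk_walk(32) remain short enough to read at a glance.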
3. Conclusion & Future Work

We believe LambDAW offers a useful way to compose with AI and ML by bringing code into the familiar interface of the DAW. Because LambDAW allows arbitrary Python expressions in the timeline and provides the project module as a place to import libraries and load assets, it enables the user to take advantage of Python's rich ecosystem. Expression items can serve as user-specified "blanks" for an AI to fill in, which the user can re-evaluate to generate new output. The ability of LambDAW expressions to refer to other items in the timeline provides a convenient way to work with models that transform or continue input material, and for the user to pre- or post-process model data, as the sketch below illustrates.
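For instance, a project-module helper might pre-process an item into a model's input format, invoke the model, and post-process the result back into note data. The following sketch is purely illustrative: the model call is replaced by a stand-in (sampling from the input's own pitches) so the example runs as written, and continue_melody is a hypothetical name rather than part of LambDAW or any model library.

    import random

    def continue_melody(item, beats=8):
        # Pre-process: extract pitches and find where the input ends
        # (using the illustrative (pitch, start, length) tuples above).
        pitches = [pitch for (pitch, start, length) in item]
        end = max(start + length for (pitch, start, length) in item)
        # Stand-in for the model: in practice, this is where one would
        # invoke e.g. a MusicVAE checkpoint to continue the melody.
        new_pitches = [random.choice(pitches) for _ in range(beats)]
        # Post-process: pack the output into notes placed after the input.
        return [(pitch, end + i, 1) for i, pitch in enumerate(new_pitches)]

A timeline item named =continue_melody(my_cool_riff) then acts as a blank for the model to fill in; re-evaluating the item requests a fresh continuation.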
Looking ahead, we plan to further explore integrations with generative libraries and models to find out how they can fit into compositional workflows. Some initial work in this direction is depicted in Fig. 1, which features expressions that invoke MusicVAE [1] and RAVE [2] to generate symbolic data and audio, respectively. We also aim to make architectural changes to improve LambDAW's usability (especially with ML libraries), such as performing evaluation in a separate process and allowing the user to interrupt long-running operations.⁴ Finally, we hope to present our system to composers to get their feedback about the quality of the integration, the degree to which it facilitates their use of generative systems, and how it affects their creative process.

⁴ While building our prototype, we encountered technical issues with REAPER's embedded Python support when using libraries such as NumPy, TensorFlow, and PyTorch, which required workarounds. Moving evaluation out to a separate process (as in Jupyter kernels) would avoid these issues, enable interrupting/killing the interpreter without restarting the DAW, and prevent evaluation from blocking the UI thread.

References

[1] A. Roberts, J. Engel, C. Raffel, C. Hawthorne, D. Eck, A hierarchical latent vector model for learning long-term structure in music, in: International Conference on Machine Learning (ICML), 2018. URL: http://proceedings.mlr.press/v80/roberts18a.html.
[2] A. Caillon, P. Esling, RAVE: A variational autoencoder for fast and high-quality neural audio synthesis, 2021. URL: https://arxiv.org/abs/2111.05011. doi:10.48550/ARXIV.2111.05011.
[3] K. Essl, Algorithmic composition, in: N. Collins, J. d'Escrivan (Eds.), The Cambridge Companion to Electronic Music, Cambridge Companions to Music, Cambridge University Press, 2007, pp. 107–125. doi:10.1017/CCOL9780521868617.008.
[4] D. Cope, Experiments in musical intelligence (EMI): Non-linear linguistic-based composition, Interface 18 (1989) 117–139. URL: https://doi.org/10.1080/09298218908570541. doi:10.1080/09298218908570541.
[5] G. Litt, M. Schoening, P. Shen, P. Sonnentag, Potluck: Dynamic documents as personal software, 2022. URL: https://www.inkandswitch.com/potluck/.
[6] J. Lindenbaum, S. Kaliski, J. Horowitz, Inkbase: Programmable ink, 2022. URL: https://www.inkandswitch.com/inkbase/.
[7] T. Kluyver, B. Ragan-Kelley, F. Pérez, B. Granger, M. Bussonnier, J. Frederic, K. Kelley, J. Hamrick, J. Grout, S. Corlay, P. Ivanov, D. Avila, S. Abdalla, C. Willing, Jupyter development team, Jupyter notebooks - a publishing format for reproducible computational workflows, in: F. Loizides, B. Schmidt (Eds.), Positioning and Power in Academic Publishing: Players, Agents and Agendas, IOS Press, Netherlands, 2016, pp. 87–90. URL: https://eprints.soton.ac.uk/403913/.
[8] C. Nash, Manhattan: End-user programming for music, in: Proceedings of the 14th International Conference on New Interfaces for Musical Expression, 2014, pp. 221–226. URL: https://www.nime.org/proceedings/2014/nime2014_371.pdf.
[9] J.-M. Celerier, P. Baltazar, C. Bossut, N. Vuaille, J.-M. Couturier, M. Desainte-Catherine, OSSIA: Towards a unified interface for scoring time and interaction, in: TENOR 2015 - First International Conference on Technologies for Music Notation and Representation, 2015. URL: http://tenor2015.tenor-conference.org/papers/13-Celerier-OSSIA.pdf.
[10] J. C. Martinez, Extending music notation as a programming language for interactive music, in: ACM International Conference on Interactive Media Experiences, ACM, 2021, pp. 28–36. URL: https://doi.org/10.1145/3452918.3458807.