=Paper=
{{Paper
|id=Vol-2327/MILC6
|storemode=property
|title=A Minimal Template for Interactive Web-based Demonstrations of Musical Machine Learning
|pdfUrl=https://ceur-ws.org/Vol-2327/IUI19WS-MILC-6.pdf
|volume=Vol-2327
|authors=Vibert Thio,Hao-Min Liu,Yin-Cheng Yeh,Yi-Hsuan Yang
|dblpUrl=https://dblp.org/rec/conf/iui/ThioLYY19
}}
==A Minimal Template for Interactive Web-based Demonstrations of Musical Machine Learning==
Vibert Thio, Hao-Min Liu, Yin-Cheng Yeh, and Yi-Hsuan Yang
Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
{vibertthio, paul115236, ycyeh, yang}@citi.sinica.edu.tw

ABSTRACT
New machine learning algorithms are being developed to solve problems in different areas, including music. Intuitive, accessible, and understandable demonstrations of the newly built models could help attract the attention of people from different disciplines and evoke discussions. However, we notice that it has not been a common practice for researchers working on musical machine learning to demonstrate their models in an interactive way. To address this issue, we present in this paper a template that is specifically designed to demonstrate symbolic musical machine learning models on the web. The template comes with a small codebase, is open source, and is meant to be easy for any practitioner to use to implement their own demonstrations. Moreover, its modular design facilitates the reuse of the musical components and accelerates the implementation. We use the template to build interactive demonstrations of four exemplary music generation models. We show that the built-in interactivity and real-time audio rendering of the browser make the demonstrations easier to understand and to play with. It also helps researchers to gain insights into different models and to A/B test them.

ACM Classification Keywords
D.2.2 Design Tools and Techniques: Modules and interfaces; H.5.2 User Interfaces: Prototyping; H.5.5 Sound and Music Computing: Systems

Author Keywords
Musical interface; web; latent space; deep learning

ACM Reference Format
Vibert Thio, Hao-Min Liu, Yin-Cheng Yeh, and Yi-Hsuan Yang. 2019. A Minimal Template for Interactive Web-based Demonstrations of Musical Machine Learning. In Joint Proceedings of the ACM IUI 2019 Workshops, Los Angeles, USA, March 20, 2019, 6 pages.

Copyright © 2019 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

INTRODUCTION
Recent years have witnessed great progress in applying machine learning (ML) to music-related problems, such as thumbnailing [10], music generation [5, 7, 21], and style transfer [15]. To demonstrate the results of such musical machine learning models, researchers usually put the audio output on the accompanying project websites. This method worked well in the early days. However, as the ML models themselves are getting more complicated, some concepts of the algorithms may not be clearly expressed with only static sounds.

In the neighboring field of computer vision, many interactive demonstrations of ML models have been developed recently. Famous examples include DeepDream [18], image/video style transfer [8, 23], and DCGAN [19]. These interactive demos provoke active discussions and positive anticipation about the technology. Nevertheless, demonstrating musical machine learning models is not as easy as in the case of computer vision, because it involves audio rendering (i.e., we cannot simply use images for demonstration). The Web Audio API, a high-level JavaScript API for processing and synthesizing audio in web applications, was published only in 2011, which is relatively recent compared to WebGL and other features of the browser. Furthermore, interactivity is needed to improve understandability and create engaging experiences.

Musical machine learning is gaining increasing attention. We believe that if more people from other fields, such as art and music, start to appreciate the new models of musical machine learning, it will be easier to create an active community and to stimulate new ideas to improve the technology.

The goal of this paper is to fulfill this need by building and sharing with the community a template that is designed to demonstrate ML models for symbolic-domain music processing and generation in an interactive way. The template is open source and available on GitHub (https://github.com/vibertthio/musical-ml-web-demo-minimal-template).

RELATED WORKS

Audio Rendering in Python
When it comes to testing or interacting with musical machine learning models, the output of the models must be rendered as audio files or streams to be listened to by humans. Most researchers in the field nowadays use Python as the programming language for model implementation because of the powerful ML and statistical packages built around it. For example, librosa [17] is a Python package often used for audio and signal processing. It includes functions for spectral analysis, display, tempo detection, structural analysis, and output. Many interactive demonstrations are built with librosa on the Jupyter Notebook. However, a major drawback of this approach is that the audio files have to be sent over the Internet for demonstration, which can be slow depending on the network connection bandwidth.

Another widely used Python package is pretty_midi [20], which is designed for the manipulation of symbolic-domain data such as Musical Instrument Digital Interface (MIDI) data. It could be used as a tool to render the symbolic output of a musical machine learning model, such as a melody generation model [24]. The problem is that after getting the result as a MIDI file, the user still has to put it into a digital audio workstation (DAW) to synthesize the audio waveform from the MIDI. For a better listening experience, the researcher still has to synthesize the audio files offline and then send the audio files over the Internet for demonstration.
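To make this offline workflow concrete, the short sketch below (illustrative only, not code from the paper) assumes a melody model that outputs (pitch, start, end) tuples; pretty_midi writes them to a MIDI file, and the audio still has to be synthesized separately, for example with pretty_midi's FluidSynth binding, before it can be put online.

```python
# Illustrative sketch only (not from the paper): turning symbolic model output
# into a MIDI file, and optionally audio, with pretty_midi.
import pretty_midi

# Hypothetical model output: (pitch, start_sec, end_sec) triples of a melody.
notes = [(60, 0.0, 0.5), (62, 0.5, 1.0), (64, 1.0, 2.0)]

pm = pretty_midi.PrettyMIDI()
piano = pretty_midi.Instrument(program=0)  # acoustic grand piano
for pitch, start, end in notes:
    piano.notes.append(
        pretty_midi.Note(velocity=100, pitch=pitch, start=start, end=end))
pm.instruments.append(piano)

pm.write('melody.mid')           # symbolic output: still needs synthesis
audio = pm.fluidsynth(fs=44100)  # offline rendering; needs FluidSynth installed
```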
Different from the prior works, we propose to use Tone.js, a JavaScript framework, for rendering MIDI files into audio directly in the browser on the client side. This turns out to be a much more efficient way to demonstrate a symbolic musical ML model. It also helps build an interactive demo.

Interactive Musical Machine Learning
Similar to our work, Vogl et al. [7] introduced an interactive app for drum pattern generation based on ML. They used a generative adversarial network (GAN) [9] as the generative model, which is trained on a MIDI dataset. The user interface consists of a classic sequencer with an x/y pad that controls the complexity and the loudness of the generated drum pattern. Additionally, controls for genre and swing are said to be provided. However, neither the demo nor its source code can be found online currently. It is not clear whether the app is built for iOS, Android, or the Web.

Closely related to our project is the MusicVAE model [21] presented by Magenta, Google Brain Team. MusicVAE is a generative recurrent variational autoencoder (VAE) model that can generate melodies and drum beats. Importantly, the authors also released a JavaScript package called Magenta.js (https://github.com/tensorflow/magenta-js/) [22] to make their models more accessible. They also provide some pre-trained models of MusicVAE along with other ones. There are several interactive demos using the package, as can be found on their website [16]. Most of them are well designed, user-friendly, and extremely helpful for understanding the models. Yet, the major drawback is that the codebase of the project is a monolithic one [11] and is therefore quite big (see footnote 1). Users may not easily modify the code for customization. For example, because Magenta uses TensorFlow as the backbone deep learning framework, it is hard for PyTorch users to use Magenta.js.

Footnote 1: A monolithic repo is defined in [11] as "a model of source code organization where engineers have broad access to source code, a shared set of tooling, and a single set of common dependencies. This standardization and level of access are enabled by having a single, shared repo that stores the source code for all the projects in an organization." This is the case of the Google Magenta project.

TEMPLATE DESIGN
In this paper, we present a minimal template, which is simple, flexible, and designed for interactive demonstration of symbolic musical machine learning models on the web.

We choose the web as the platform for the demonstrations for several reasons. First, it is convenient, as the user only has to open a browser or click a hyperlink to play with the models. Second, it is inherently interactive. The system can utilize the many forms of interaction available in the browser to create the specific user experience.

Requirements
In the design process, we have prioritized some crucial qualities. First, we made the structure of the design as simple as possible. In most cases, the demo is for a proof of concept rather than to showcase a ready-to-sell product. Hence, we desire that a person with basic knowledge of Python and JavaScript could understand our template within a short period of time, so that the template can serve as a minimal starting point.

Second, the codebase should be small, so that transplanting a new model into the template is easier. Moreover, a small codebase also makes it easier to debug.

Third, the audio rendering must be interactive and real-time. The demonstrations must be responsive to inputs from the user, so that the user can understand the model by observing how it works in several different ways. As for researchers, if the result can be rendered instantly, it is easier to A/B test different designs of models or parameters.

Finally, we want the components of the template to be modular, so that they can be reused and recombined easily. Such components may include, e.g., chord progression, pianoroll, drum pattern, and sliders. Practitioners can build their own demonstrations based on these components.

System Architecture
As shown in Figure 1, the system consists of three parts: a musical machine learning model, a server, and a client. When a user opens the URL of the demonstration site, the client program is loaded into the browser and renders the basic interface. The client program then sends a request to the server to fetch the data. The server program parses the request, calls the corresponding function of the model to produce the output, and sends it back to the client, which renders the audio.

Figure 1. The schematic diagram of the proposed template.

Server
We used Flask (http://flask.pocoo.org/), a lightweight web application framework, to build the server. Flask only provides the essential core functions for building a web server, such as handling representational state transfer (REST) requests from the client. Therefore, we can build the server without any redundant elements and focus on the function of the model. As a result, the server template code has only about 150 lines, excluding the model implementation part.
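The paper does not list the template's actual routes, but a minimal Flask server in this spirit could look like the sketch below; the endpoint name and the decode_drum_pattern stub are hypothetical placeholders for the model function, and the server simply returns the symbolic result as JSON for the client to render.

```python
# Minimal sketch of a Flask server in the spirit of the template.
# The route name and the model interface are illustrative placeholders.
from flask import Flask, jsonify, request

app = Flask(__name__)

def decode_drum_pattern(latent):
    """Placeholder for the actual ML model: latent vector -> pianoroll."""
    return [[0] * 9 for _ in range(96)]  # 96 time steps x 9 drums, all rests

@app.route('/api/drum_pattern', methods=['POST'])
def drum_pattern():
    payload = request.get_json() or {}
    pattern = decode_drum_pattern(payload.get('latent', []))
    # Send the symbolic result back; the client renders it as audio (e.g. with Tone.js).
    return jsonify({'pattern': pattern})

if __name__ == '__main__':
    app.run(port=5000)
```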
Client
Several technologies have been developed to render real-time audio output since the Web Audio API was released [2]. Tone.js (https://tonejs.github.io/) is such a framework for creating interactive music in the browser. It provides simple workflows for synthesizing audio from oscillators, organizing audio samples, a musician-friendly API, and timeline scheduling. It makes the development of real-time rendering from the output of the ML model much easier. Moreover, the HTML canvas element, added in the HTML5 standard, can be used to draw dynamic graphics with JavaScript [1]. Thus, we use the JavaScript canvas together with Tone.js to create the audio and visual experience coherently.

The modularization is taken care of in the design of the interface (see Figures 2–5). The layout of the user interfaces is implemented as a grid system. This speeds up the design process because it simplifies the choices for the positions of the elements and the margins between them. The recurring elements, such as the pianoroll and the drum pattern display, are implemented based on object-oriented principles, so they can be reused easily. See Table 1 for a summary.

Table 1. Some of the modules are reused by more than one demonstration. For example, three of the demonstrations use the "editable pianoroll" module. In our implementation, we reuse the modules to reduce the development effort. Therefore, it is useful and convenient to develop new demos with these modules. ** indicates a feature that has not been implemented yet.

Latent Inspector: audio rendering (sample); editable pianoroll (drum); editable latent vector (circular); **radio panel (genre selection)
Song Mixer: audio rendering (synthesize); editable pianoroll (melody) × 3; chord visualization (text); radio panel (interpolation selection)
Comp It: audio rendering (synthesize); editable pianoroll (melody); chord visualization (text, function, circle of fifths)
Tuning Turing: audio rendering (sample); waveform visualization

DEMONSTRATIONS DESIGN
We built four different demonstrations based on the proposed template. We call them 'Latent Inspector,' 'Song Mixer,' 'Comp It,' and 'Tuning Turing.' Each of them was designed to serve one exact purpose and to demonstrate a single idea based on the underlying musical machine learning model. The classes of the models are not limited to certain ones. For instance, the first two demonstrate musical machine learning models based on the VAE [13], whereas the last two are mainly based on a recurrent neural network (RNN). The types of instruments are also different: the first one is about percussion and the other three are about melody. This is designed deliberately to show the general-purpose nature of the template. We aim to make the models more understandable and interesting by adding interactivity, interface design, and visual effects.

Latent Inspector with DrumVAE
DrumVAE is an original work. It uses a VAE for generating one-bar drum patterns. Drum patterns are represented using the pianoroll format [6] with 96 time steps per bar. The model compresses (or encodes) the drum patterns into a latent space via a bidirectional gated recurrent unit (BGRU) neural network [4]. The outputs of the BGRUs are used as the mean and variance of a Gaussian distribution, and a latent vector is sampled from this Gaussian distribution. We apply a similar but reversed structure in the decoder and pass the latent vector into it to reconstruct the drum patterns. The model is trained on one-bar drum patterns collected from the Lakh Pianoroll Dataset [5], considering the following nine drums: kick drum, snare drum, closed/open hi-hat, low/mid/high toms, crash cymbal, and ride cymbal.
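For readers unfamiliar with the VAE machinery, the sampling step described above can be written in a few lines; the sketch below uses NumPy only, with the BGRU encoder and decoder stubbed out, and the shapes follow the text (a 96 x 9 one-bar drum pianoroll and a latent vector of dimension N = 32).

```python
# Sketch of the VAE sampling step described above (NumPy only; the BGRU
# encoder/decoder are stubs). Shapes follow the text: 96 time steps x 9 drums,
# latent dimension N = 32.
import numpy as np

N_LATENT = 32
drum_pattern = np.zeros((96, 9))  # one-bar binary pianoroll

def encode(pattern):
    """Stub for the BGRU encoder: returns mean and log-variance vectors."""
    mu = np.zeros(N_LATENT)
    log_var = np.zeros(N_LATENT)
    return mu, log_var

mu, log_var = encode(drum_pattern)
# Reparameterization: sample z ~ N(mu, sigma^2) via z = mu + sigma * eps.
eps = np.random.randn(N_LATENT)
z = mu + np.exp(0.5 * log_var) * eps  # the latent vector visualized in Figure 2

# A decoder with the reverse structure would map z back to a 96 x 9 pattern.
```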
Figure 2. The Latent Inspector. Left: a snapshot of the demonstration. The upper half is an editable drum pattern display, whereas the bottom half is a graph showing the N-dimensional latent vector (here N = 32) of the VAE model. Each vertex of the graph can be adjusted independently, to change the latent vector and accordingly the drum pattern. This is exemplified by the snapshot shown on the right. Although not shown in the figure, we can also click on the drum pattern display to modify the drum pattern directly, which would also change the latent vector accordingly.

The Latent Inspector, shown in Figure 2, lets the user modify the latent vector of a drum pattern displayed in the browser to find out how the drum pattern will change correspondingly. Conversely, the user can also modify the drum pattern to observe the changes in the latent vector.

X/Y pads are used in other works to explore the latent space. Yet, the dimension of the latent vectors used in practice is usually larger than two. As a result, we designed a circular diagram that can represent high-dimensional data. As shown in Figure 2, the latent vector of our DrumVAE model has dimension N = 32. Since the effect of every dimension should be symmetrical in the latent vector of DrumVAE, using a circular diagram eliminates the end points of a line chart.

It is possible to further improve the UI by adding conditional functionalities, to give each vertex some musical or semantic meaning. While this can be a future direction, we argue that the current design is also interesting: for musicians, it is sometimes more interesting to have a bunch of knobs of unknown functionality to play with.

The demo website of Latent Inspector can be found at http://vibertthio.com/drum-vae-client/public/.

Song Mixer with LeadSheetVAE
LeadSheetVAE is another model we recently developed [14]. It is also based on a VAE, but it is designed to deal with lead sheets instead of drum patterns. A lead sheet is composed of a melody line and a sequence of chord labels [14]. We consider four-bar lead sheets here. Melody lines and chord sequences are represented using one-hot vectors and chroma vectors, respectively. The model resembles the structure of DrumVAE, but the main difference is that at the end of the encoder the outputs of the two BGRUs (one for melody and one for chords) are concatenated and passed through a few dense layers to calculate the mean and variance of the Gaussian distribution. In the decoder, we apply two unidirectional GRUs to reconstruct the melody lines and chord sequences. The model is trained on the TheoryTab dataset [14] with 31,229 four-bar segments of lead sheets featuring different genres. LeadSheetVAE can generate new lead sheets from scratch, but we use it for generating interpolations here.

Figure 3. The Song Mixer. The top and bottom panels display the melody and the chords of the first and the second song. In the middle is the interpolation between the two songs. We add visual aids to guide the user to interact with the app. For example, the top panel is highlighted in this figure to invite the user to listen to the first song, before the second song and then the interpolations.

The Song Mixer, shown in Figure 3, takes two existing lead sheets as input and shows the interpolations between them generated by LeadSheetVAE. Similarly, a user can modify the melody or chords using the upper panel, or choose other lead sheets from our dataset, to see how this affects the interpolation.

The aim of this demo is to make the interpolation understandable. Therefore, we build interactive guidance with visual cues throughout the process to make sure the user grasps the idea of lead sheet interpolation. The demo website of Song Mixer can be found at http://vibertthio.com/leadsheet-vae-client/.

Evaluating the quality of interpolations generated by VAE models in general (not limited to music-related ones), and by many other generative models, has been known to be difficult. A core reason is that there is no ground truth for such interpolations. Song Mixer makes it easy to assess the result of musical interpolations. Moreover, with the proposed template, it is easy to extend Song Mixer to show the interpolations produced by two different models side by side and in sync in the middle of the UI. This facilitates A/B testing the two models with a user study.
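Such interpolations are commonly produced by blending the two latent codes and decoding each intermediate point. The sketch below illustrates the idea with plain NumPy and stubbed encode/decode functions; it is not the actual LeadSheetVAE implementation (see [14] for the model details).

```python
# Sketch of latent-space interpolation between two lead sheets (NumPy only;
# encode/decode stand in for the trained LeadSheetVAE encoder and decoder).
import numpy as np

def encode(lead_sheet):
    """Stub: lead sheet -> latent mean vector."""
    return np.random.randn(32)

def decode(z):
    """Stub: latent vector -> lead sheet (melody + chords)."""
    return {'melody': [], 'chords': []}

z_a, z_b = encode('song_a'), encode('song_b')
steps = 8
interpolations = [
    decode((1 - t) * z_a + t * z_b)  # linear blend of the two latent codes
    for t in np.linspace(0.0, 1.0, steps)
]
# The Song Mixer UI plays such intermediate lead sheets in its middle panel.
```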
Comp It & Tuning Turing with MTRNNHarmonizer
Finally, MTRNNHarmonizer is another new model that we recently developed (footnote 2). It is an RNN-based model for adding chords to harmonize a given melody. In other words, given a melody line, the model produces a chord sequence to make it a lead sheet. The model is special in that it takes a multi-task learning framework to predict not only the chord label but also the chord's functional harmony, for a given segment of melody (half a bar in our implementation). Taking the functional harmony into account makes the model less sensitive to the imbalance of different chords in the training data. Furthermore, the resulting chord progression can have a phrasing that better matches the given melody line.

Footnote 2: More details of the model will be provided in a forthcoming paper.
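The multi-task idea can be pictured as one recurrent network with two output heads trained with a joint loss over chord labels and chord functions. The following PyTorch sketch uses made-up dimensions, class counts, and names, and is not the authors' MTRNNHarmonizer; it is only meant to make that structure concrete.

```python
# Illustrative multi-task harmonizer skeleton (PyTorch; sizes and names are
# made up, not the authors' MTRNNHarmonizer). One GRU reads the melody per
# half-bar segment; two heads predict the chord label and its harmonic function.
import torch
import torch.nn as nn

class TwoHeadHarmonizer(nn.Module):
    def __init__(self, melody_dim=128, hidden=256, n_chords=48, n_functions=3):
        super().__init__()
        self.rnn = nn.GRU(melody_dim, hidden, batch_first=True)
        self.chord_head = nn.Linear(hidden, n_chords)        # chord label
        self.function_head = nn.Linear(hidden, n_functions)  # functional-harmony class

    def forward(self, melody_segments):
        h, _ = self.rnn(melody_segments)  # (batch, segments, hidden)
        return self.chord_head(h), self.function_head(h)

# Joint loss: weighted sum of the two classification losses.
model = TwoHeadHarmonizer()
x = torch.randn(2, 8, 128)                 # 2 songs, 8 half-bar segments each
chord_logits, func_logits = model(x)
chord_target = torch.randint(0, 48, (2, 8))
func_target = torch.randint(0, 3, (2, 8))
ce = nn.CrossEntropyLoss()
loss = ce(chord_logits.transpose(1, 2), chord_target) + \
       0.5 * ce(func_logits.transpose(1, 2), func_target)
```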
Figure 4. Comp It. The upper half is the editable melody and the chords predicted by the underlying melody harmonization model. In the lower left is a graph showing which class the current chord belongs to. In the lower right is a graph showing the position of the current chord on the so-called circle of fifths. The current melody note and chord being played are marked in red.

Similar to the two aforementioned demos, Comp It allows a user to modify the melody displayed in the browser to find out how this alters the chord progression correspondingly. Furthermore, as shown in Figure 4, we add a triangular graph and an animated circle-of-fifths [12] graph to visualize the changes between different chord classes. The triangular graph displays the class of the chord being played, covering tonic, dominant, and sub-dominant. The circle-of-fifths graph, on the other hand, organizes the chords in a way that reflects the "harmonic distance" between chords [3]. These two graphs make it easier to study the chord progression generated by the melody harmonization model, which is MTRNNHarmonizer here but could be another model in other implementations.

Figure 5. Tuning Turing. Two different kinds of harmonization for a single melody are rendered on the page. The player has to pick out the one generated by the algorithm and submit the result after choosing with the mouse.

Furthermore, we made a simple Turing game for the model, called "Tuning Turing." As shown in Figure 5, the player has to pick out the harmonization generated by the model from two music clips. There are both a "practice mode" and a "challenge mode." The former has six fixed levels; in the latter, the player can keep playing until giving three wrong answers.

The demo websites of Comp It and Tuning Turing can be found at http://vibertthio.com/m2c-client/ and http://vibertthio.com/tuning-turing/ respectively.

AVAILABILITY
Supplementary resources, including the open source code, will be available at the GitHub repos (https://github.com/vibertthio), including the template (https://github.com/vibertthio/musical-ml-web-demo-minimal-template), the interfaces, and the ML models.

CONCLUSION
This paper presents an open-source template for creating interactive demonstrations of musical machine learning on the web, along with four exemplary demonstrations. The architecture of the template is meant to be simple and the codebase small, so that other practitioners can implement their models with it within a short time. The modular design makes the musical components reusable. The interactivity and real-time audio rendering of the browser make the demonstrations easier to understand and to play with. However, we have so far only elaborated the qualitative aspects of the project, without quantitative analysis. For future work, we will run user studies to validate the effectiveness of these projects. With more intuitive, accessible, and understandable demonstrations of the new models, we hope new people might be brought together to form a larger community and to stimulate new ideas.

REFERENCES
1. 2006. Canvas API. MDN Web Docs. https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API.
2. 2011. Web Audio API. W3C. https://www.w3.org/TR/2011/WD-webaudio-20111215/.
3. Juan Bello and Jeremy Pickens. 2005. A robust mid-level representation for harmonic content in music signals. In Proc. Int. Soc. Music Information Retrieval Conf.
4. Kyunghyun Cho and others. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. CoRR abs/1406.1078. http://arxiv.org/abs/1406.1078.
5. Hao-Wen Dong and others. 2018a. MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In Proc. AAAI Conf. Artificial Intelligence.
6. Hao-Wen Dong, Wen-Yi Hsiao, and Yi-Hsuan Yang. 2018b. Pypianoroll: Open source Python package for handling multitrack pianoroll. In Proc. Int. Soc. Music Information Retrieval Conf., Late-breaking paper. https://github.com/salu133445/pypianoroll.
7. Hamid Eghbal-zadeh and others. 2018. A GAN based drum pattern generation UI prototype. In Proc. Int. Soc. Music Information Retrieval Conf., Late Breaking and Demo Papers.
8. Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. 2015. A neural algorithm of artistic style. https://arxiv.org/abs/1508.06576.
9. Ian J. Goodfellow and others. 2014. Generative adversarial nets. In Proc. Advances in Neural Information Processing Systems, 2672–2680.
10. Yu-Siang Huang, Szu-Yu Chou, and Yi-Hsuan Yang. 2018. Pop music highlighter: Marking the emotion keypoints. Transactions of the International Society for Music Information Retrieval 1, 1 (2018), 68–78.
11. Ciera Jaspan and others. 2018. Advantages and disadvantages of a monolithic codebase. In Proc. Int. Conf. Software Engineering.
12. Claudia R. Jensen. 1992. A theoretical work of late seventeenth-century Muscovy: Nikolai Diletskii's "Grammatika" and the earliest circle of fifths. J. American Musicological Society 45, 2 (1992), 305–331.
13. Diederik P. Kingma and Max Welling. 2014. Auto-encoding variational Bayes. In Proc. Int. Conf. Learning Representations.
14. Hao-Min Liu, Meng-Hsuan Wu, and Yi-Hsuan Yang. 2018. Lead sheet generation and arrangement via a hybrid generative model. In Proc. Int. Soc. Music Information Retrieval Conf., Late Breaking and Demo Papers.
15. Chien-Yu Lu and others. 2019. Play as You Like: Timbre-enhanced multi-modal music style transfer. In Proc. AAAI Conf. Artificial Intelligence.
16. Google Brain Magenta. 2018. Demos. Magenta Blog. https://magenta.tensorflow.org/demos.
17. Brian McFee and others. 2015. librosa: Audio and music signal analysis in Python. In Proc. 14th Python in Science Conf., 18–25.
18. Alexander Mordvintsev, Christopher Olah, and Mike Tyka. 2015. Inceptionism: Going deeper into neural networks. Google AI Blog. https://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html.
19. Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. https://arxiv.org/abs/1511.06434.
20. Colin Raffel and Daniel P. W. Ellis. 2014. Intuitive analysis, creation and manipulation of MIDI data with pretty_midi. In Proc. Int. Soc. Music Information Retrieval Conf., Late Breaking and Demo Papers.
21. Adam Roberts and others. 2018a. A hierarchical latent vector model for learning long-term structure in music. https://arxiv.org/abs/1803.05428.
22. Adam Roberts, Curtis Hawthorne, and Ian Simon. 2018b. Magenta.js: A JavaScript API for Augmenting Creativity with Deep Learning. https://ai.google/research/pubs/pub47115.
23. Manuel Ruder, Alexey Dosovitskiy, and Thomas Brox. 2018. Artistic style transfer for videos and spherical images. Int. J. Computer Vision (2018). http://lmb.informatik.uni-freiburg.de/Publications/2018/RDB18.
24. Ian Simon and Sageev Oore. 2017. Performance RNN: Generating music with expressive timing and dynamics. Magenta Blog. https://magenta.tensorflow.org/performance-rnn.