A Multi-modal Perspective for the Artistic Evaluation of Robotic Dance Performances

Luca Giuliani, Allegra De Filippo, Andrea Borghesi, Paola Mello and Michela Milano
Department of Computer Science and Engineering, University of Bologna

Abstract
In recent years, the interplay between creativity and Artificial Intelligence (AI) has been gaining more and more attention from the research community. Within this vast area, humanoid robots have been successfully used in artistic research, and many works have studied and implemented systems for robotic dance. However, only a few works take into account the human evaluation of these artistic outputs. To this aim, we start from a recent work focused on defining criteria for the evaluation of robotic dance performances, and we analyze a further crucial aspect: the need for a multi-modal perspective, in which the musical element blends seamlessly and organically with the choreography, in accordance with the judgments expressed by human evaluators. Based on the analysis of results, musical elements emerged as having a large impact on the artistic evaluation. For this reason, the final purpose of this work is twofold: we would like both to explore new creative paths where human ideas can be broadened by the use of AI software, and to help both human choreographers and AI algorithms to create dance performances and music with a greater impact on the audience.

Keywords
Artistic Evaluation, Robotic Dance Choreography, Multi-Modal Creativity, Educational AI

CREAI 2022, Workshop on Artificial Intelligence and Creativity, Nov. 28 – Dec. 02, 2022, Udine, Italy
Contact: luca.giuliani13@unibo.it (L. Giuliani)

1. Introduction
The adoption of humanoid robots in artistic research has shown several successful applications, ranging from the artificial generation of music and images to even tentative literature [1, 2]. Dance choreography is an area where the potential application of artificial intelligence has raised interest in recent years [3, 4], both for human performers and for robotic ones, which have also been shown to be well suited to educational purposes [5]. The creation of artificial art is a new and expanding field where multiple challenges remain wide open, ranging from the actual acceptance of artificial creations as artistic products [6] to the evaluation of the quality of the generated works of art [7]. For instance, it is usually the case that humans are not kept in the creative loop, especially when it comes to the evaluation of these artistic performances. We tackle this challenge and address the complex task of artistic evaluation by trying to extract the criteria humans employ when evaluating robotic dance performances. This work is the continuation of previous research on the artistic evaluation of humanoid robotic dances [8]. Differently from the majority of studies in the field, which focus on the automation of robotic dance performances [9, 10, 11], we want to involve humans in both the creation and evaluation of the artistic composition (i.e., the robotic dance choreographies).
To this end, in [8] we organized a competition aimed at the development of robotic choreographies, held within a Master's Degree course on Fundamentals of Artificial Intelligence (https://www.unibo.it/en/teaching/course-unit-catalogue/course-unit/2022/446566) at the University of Bologna (Italy). On a voluntary basis, students could decide to participate in the competition by developing their own algorithm for building the choreography; participants were also asked to fill in a questionnaire to evaluate each of the proposed performances, and a winner was chosen based on the responses. In addition, the answers to the questionnaire were used to build a public dataset mapping specific features of each choreography to the judgement provided by the audience. We then trained different Machine Learning (ML) models to learn the relationship between choreography features and evaluation scores; we also inspected these models to rank the importance of each feature, helping both human choreographers and AI algorithms understand which aspects of a performance are more likely to increase its appreciation by the audience. This analysis revealed that one of the aspects with the greatest impact on human evaluation is the musical element incorporated into the dance. Hence, we decided to explore new directions involving multi-modal creativity. The idea is to propose that the creative process of developing a robotic choreography during the challenge should be effectively paired with music generation techniques, based on the feature importance analysis. In this way, both human artists and AI algorithms would be able to better integrate the music within the performance, thus achieving more impact on the audience.

2. Related Work

2.1. Humanoid Robotic Dance Automation
Humanoid robots have been employed in a variety of social tasks including, but not limited to, healthcare [12] and education [13, 14]. Likewise, dance is considered to be an important part of our social activities; therefore, many research works on robotic dance automation have emerged, aiming at both entertainment and human-machine interaction. Thanks to their ability to move in a physical environment and to mimic human-like actions, humanoid robots such as NAO (https://www.aldebaran.com/en/nao) are perfect candidates for these tasks. However, among the collection of works that have been proposed about robotic dances [10, 11, 15, 16, 17], the majority focused solely on the implementation of balanced and coordinated choreographies, without taking into account how a human audience would eventually evaluate them.

2.2. Music Generation Techniques
As revealed by [8], the music element has a great impact on choreography appreciation, suggesting that creating a musical score in accordance with the robotic dance is crucial. However, to the best of our knowledge, no work has explicitly addressed this challenge. Moreover, automatic music generation is still in its infancy, as it is a very complex task which involves multiple experts with different roles, starting from an initial phase of composition up to the final mixing and mastering step. See, for instance, Ji et al. [18], who propose a taxonomy based on three incremental levels: score generation, performance generation, and audio generation. A wide variety of techniques, ranging from Genetic Algorithms [19, 20] to Swarm Intelligence [21], Markov Chains [22], and Artificial Neural Networks [23, 24], have been successfully applied over the past years, both to the most common unconstrained generative task and to some of its variations – e.g., melody generation based on a given harmonic progression [25, 26] and vice versa [27], but also real-time improvisation alongside a human performer [28]. However, mainly due to the higher complexity of audio signals with respect to scores, as well as the lack of proper audio generation datasets, most research has focused on the first (score) level only rather than developing a full, end-to-end approach.
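To make the score-generation level concrete, the following is a minimal sketch of a first-order Markov chain melody generator in the spirit of [22]; the note set and transition probabilities are illustrative choices for this example, not those of any cited system.

```python
import random

# Hypothetical first-order transition table over a C-major pentatonic scale:
# each pitch maps to candidate successors with illustrative weights.
TRANSITIONS = {
    "C4": (["D4", "E4", "G4"], [0.4, 0.4, 0.2]),
    "D4": (["C4", "E4", "G4"], [0.3, 0.5, 0.2]),
    "E4": (["D4", "G4", "A4"], [0.3, 0.4, 0.3]),
    "G4": (["E4", "A4", "C5"], [0.4, 0.3, 0.3]),
    "A4": (["G4", "C5"],       [0.6, 0.4]),
    "C5": (["A4", "G4"],       [0.5, 0.5]),
}

def generate_melody(start="C4", length=16, seed=42):
    """Sample a melody by walking the Markov chain from a start note."""
    rng = random.Random(seed)
    melody = [start]
    for _ in range(length - 1):
        successors, weights = TRANSITIONS[melody[-1]]
        melody.append(rng.choices(successors, weights=weights, k=1)[0])
    return melody

print(" ".join(generate_melody()))
```

Rendering such symbolic output as expressive audio corresponds to the performance and audio levels of the taxonomy above, which is precisely where current approaches remain weakest.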
2.3. Artistic Evaluation of Performances
Due to its subjective nature, the evaluation of art works and performances is known to be a very complex task. As pointed out in [29], since there is no objective methodology to compare the results of models involved in the production of new art pieces, the vast majority of approaches are based on surveying human responses with ad-hoc questions and scales. As a matter of fact, various measurement tools have been proposed in the last decades to assess the quality of dance performances [30, 31]; still, most of them are specifically designed to evaluate human skills rather than mechanical subjects. An exception can be found in recent works by Gemeinboeck [32, 33, 34], which specifically focus on the perception of the robotic body and its movements; similarly, the Likert-like questionnaire proposed in [35] intentionally addresses the evaluation of robotic dances. This last questionnaire in particular, integrated with additional questions regarding the use of the surrounding space and the overall theatricality of the choreography, as well as its level of human reproducibility, was used in our previous work [8] to assess how various aspects of a robotic dance performance contribute to its success among social groups coming from different academic backgrounds.
Similarly, as regards evaluation techniques, the field of automatic music generation is dominated by listening tests followed by Likert-like questionnaires about the pleasantness and naturalness of the proposed audio track [36, 37, 38]. An interesting usage of this kind of evaluation can be found in [29], where the authors gathered human annotations of more than a thousand computer-generated records on a 1-to-5 scale and eventually used this data to build a classifier aimed at helping the generative model filter out unpleasant tracks before returning them to the users. For a broader perspective on machine-generated music evaluation, [18] provides an extensive review of the most important techniques.
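A schematic sketch of that filtering idea – train a classifier on the human ratings, then screen generated candidates – assuming hypothetical audio descriptors and a generic scikit-learn model rather than the actual architecture of [29]:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def extract_features(track):
    """Hypothetical audio descriptors; a real system would use richer ones."""
    return np.array([track.mean(), track.std(), np.abs(np.diff(track)).mean()])

def fit_filter(tracks, ratings):
    """Fit a pleasantness filter from 1-to-5 human annotations of past tracks."""
    X = np.stack([extract_features(t) for t in tracks])
    y = np.array(ratings) >= 3  # binarize: "pleasant enough" or not
    return RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def keep_pleasant(model, candidates):
    """Return only the generated candidates the filter predicts as pleasant."""
    return [c for c in candidates if model.predict(extract_features(c)[None, :])[0]]
```

In our setting, a similar filter could pre-screen generated background tracks before students pair them with a choreography.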
3. Robotic Choreography Creation
As illustrated in [8], in recent years we have proposed the idea of a challenge aimed at the creation of robotic choreographies. This was motivated by the belief that a synergy between technical and creative areas, such as those of AI and dance performance respectively, would increase students' participation and provide them with new perspectives and transversal skills. The competition was based on voluntary participation, with students covering both the roles of choreography creators and audience; the winning team was elected during a final voting day. Participants of the challenge are asked to work in groups on the development of a choreography within a simulated NAO robot environment. To encourage diversification among groups, no limitation on viable techniques is imposed, although most performances were built upon those seen during the course, i.e., Planning, Search Strategies, and Constraints. The whole set of competition rules and guidelines is described in [8]; in any case, what is expected of students is to implement a system which, given a fixed initial state, a goal, and a set of requirements – e.g., the duration of the choreography, mandatory positions, etc. – is able to return a sequence of actions and transitions that respects the balance and coordination constraints of the robot, thus avoiding any possible inconsistency between subsequent positions.
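A minimal sketch of such a system, cast as a depth-first search over a hypothetical move graph with a duration requirement; the move names, durations, and compatibility relation are illustrative, not the actual challenge assets.

```python
# Hypothetical move catalogue: name -> (duration in seconds, allowed successors).
# Successor lists stand in for the robot's balance/coordination constraints.
MOVES = {
    "stand":      (1.0, ["raise_arms", "step_left"]),
    "raise_arms": (2.0, ["wave", "stand"]),
    "step_left":  (1.5, ["spin", "stand"]),
    "wave":       (2.0, ["stand"]),
    "spin":       (3.0, ["stand"]),
}

def build_choreography(start, goal, max_duration):
    """Depth-first search for a move sequence from start to goal that
    fits the time budget while only using legal transitions."""
    def dfs(state, path, elapsed):
        if state == goal and len(path) > 1:
            return path
        for nxt in MOVES[state][1]:
            cost = MOVES[nxt][0]
            if elapsed + cost <= max_duration and nxt not in path[-3:]:
                result = dfs(nxt, path + [nxt], elapsed + cost)
                if result is not None:
                    return result
        return None
    return dfs(start, [start], MOVES[start][0])

print(build_choreography("stand", "spin", max_duration=10.0))
# -> ['stand', 'step_left', 'spin']
```

Planning- and constraint-based formulations used by the participants follow the same contract: a state space of poses, legal transitions, and a set of requirements acting as constraints on the returned sequence.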
3.1. Multi-modal Creation Perspective
So far, the challenge has focused exclusively on its dance component. Indeed, no requirement on the background music was given, leaving each group the freedom to select any piece of music suitable for the choreography – provided, of course, that it respected the total time limit. However, we plan to introduce a new modality, namely asking teams to generate their background music using state-of-the-art AI-based tools. This has the dual aim of exploring multi-modal art creation – thus increasing the students' levels of expressive freedom – while, simultaneously, allowing team members to freely synchronize their choreography with the underlying music beat, an aspect which has been found to be positively correlated with a higher aesthetic appeal of gymnastics and acrobatics performances [39, 40, 41].
Thanks to the emergence of recent neural architectures based on improved attention mechanisms, multi-modal inputs have started to be employed in all kinds of generative tasks [42], including robotic dance. [43] integrates visual and non-visual information to increase robots' self-awareness of aesthetic movements, similarly to what a human would do in front of a mirror; [44] exploits an integrated input source of past motion and music to better condition the generative process; [45] focuses on the integration of music context to help the model create a performance which correctly matches the musical elements. However, to the best of our knowledge, nobody within the field of robotic dance automation has attempted to use this variety of input sources, either for an educational purpose or to develop new forms of human-machine interaction where the AI software does not perform end-to-end generative tasks, but rather serves as an enabling tool to improve human creative skills. Indeed, with these new challenge modalities, we expect students to form larger groups where members can focus on the artistic task they prefer, hopefully allowing them to increase both their creative and teamwork abilities.

4. Robotic Choreography Evaluation
As previously stated in Section 2.3, the evaluation of art works and performances is a complex task. This entails that dance researchers and choreographers lack formal knowledge that might help them create more aesthetically pleasing performances. For this reason, two different audiences – one with a scientific (S) and one with an artistic (A) background – were asked to evaluate the challenge participants' works with respect to both the structure of the choreography and the behaviour of the robot performer [8]. We collected this data into two different datasets, one per type of audience, and eventually analyzed them with machine learning methods in order to extract the features that best predicted the success of a performance. The two datasets are publicly available (https://github.com/ProjectsAI/NAOPlanningChallenge/tree/main/datasets); both are composed of 403 records – 31 attendees to whom 13 choreographies were submitted – each encoding 20 different features concerning various aspects of a choreography such as, e.g., its duration, the number of movements it contains, the background music genre, the AI technique used, etc.

4.1. Feature Importance Analysis
The data analysis phase started with a straightforward prediction task aimed at assessing whether machine learning models would be able to extract informative patterns from the input features. Four different symbolic and explanatory models were taken into account in order to handle both the low amount of data and the open issue of interpretability in sub-symbolic AI. Among them, Gradient Boosting performed best, so it came as a natural choice for the subsequent task of feature importance analysis. This analysis highlighted interesting trends. Some features such as the time duration, the music BPM, and the music genre showed a higher impact in both the S and the A datasets (see Figure 1), confirming our idea that background music strongly influences the pleasantness of a robotic dance performance. Similarly, other music-based features exhibited not only the significance of audio information in the overall evaluation, but also how this significance differs between the two social groups, with certain music genres influencing the scientific group more than the artistic one, and vice versa. Apart from those, even features directly assessing choreographic elements confirmed that members of the two audiences systematically leaned on certain aspects of the performance rather than others when judging its aesthetic quality. A more exhaustive analysis of the audience responses, along with a more detailed description of the evaluation questionnaire and dataset structure, can be found in [8].

Figure 1: Feature importance analysis for two targets analyzed in [8], respectively the public involvement of the choreography and its human reproducibility, averaged over both analyzed audiences (i.e., A and S).
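For reference, a minimal sketch of this pipeline on the public dataset, assuming hypothetical file and column names (the actual schema is documented in the repository) and scikit-learn's gradient boosting implementation:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical file and target column names; see the repository for the schema.
df = pd.read_csv("scientific_audience.csv")     # one of the two datasets
X = pd.get_dummies(df.drop(columns=["score"]))  # one-hot encode categorical features
y = df["score"]

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Rank features by impurity-based importance, in the spirit of the analysis in [8].
ranking = pd.Series(model.feature_importances_, index=X.columns)
print(ranking.sort_values(ascending=False).head(10))
```

Impurity-based importances are only one option; permutation importance offers a model-agnostic alternative when comparing across model families.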
4.2. Multi-modal Evaluation Perspective
As already mentioned in Section 2.2, fully automated music generation tools seem far from being achieved in a short time span, especially if our aim is to keep the human in the loop. Nonetheless, not only can we experiment with some existing techniques for educational purposes, guided by the previous feature importance analysis, but we should also design proper ways to assess the artistic quality of both the generated music and the overall multi-modal artistic product. As in any other creative task, most evaluation methodologies for automated music generation are based on subjective judgements [18], and thus require human involvement. However, annotators are rarely asked to express an aesthetic opinion on the track; rather, they are simply required to discern machine-generated from human-generated pieces of music. In our case, the technical aspects of the song must be taken for granted, with the audience being explicitly asked to assess its artistic traits. So far, works regarding the aesthetic evaluation of music have mainly focused on the musical performance [46], while in our case the performance is strictly choreographic and does not comprise any live musical act; quite the opposite, we do not even care about the songs on their own, but rather about how well they conform to the choreographic elements in terms of rhythm and mood.
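As one automatic complement to such human judgements, a rough beat-alignment score between the track and the choreography could be computed; the sketch below uses librosa's beat tracker and assumes a hypothetical list of movement onset times exported from the simulator.

```python
import numpy as np
import librosa

def beat_alignment_score(audio_path, move_times):
    """Mean distance (in seconds) from each movement onset to the nearest
    musical beat; lower values suggest tighter dance-music synchronization."""
    y, sr = librosa.load(audio_path)
    _, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    beat_times = librosa.frames_to_time(beat_frames, sr=sr)
    return float(np.mean([np.min(np.abs(beat_times - t)) for t in move_times]))

# Hypothetical usage: keyframe times exported from the simulated NAO choreography.
# score = beat_alignment_score("performance_track.wav", [0.5, 1.2, 2.1, 2.9])
```

Such a measure would only cover rhythm; the mood dimension still calls for human judgement.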
For this reason, we will need to extend the set of questionnaire topics to take these specific integration aspects between dance and music into account. Previous results demonstrate the strength of integrating complementary sources of information in different tasks such as artistic recognition [47] and creation [48], and also indicate the potential of applying multi-modal approaches within specific research areas such as artistic evaluation.

5. Conclusions
The role of AI in creative activities is becoming more and more central. Within the field of robotic dance performance, this is manifested by the large number of studies on computational dance automation and humanoid robotic dance. We claim that this research area can particularly benefit from AI when humans are kept in the loop, both in the creative and in the evaluation process. Furthermore, starting from one of our previous works aimed at discovering correlations between certain features of a choreography and its success among a human audience, we address the issue of multi-modality. Indeed, since musical elements emerged from our feature importance analysis as having a large impact, we argue that the creative aspect must comprise music generation as well. Even though very few AI-based tools exist for assisting non-professionals in creating musical tracks from scratch, we provide some future directions on how to build such an integration and, eventually, how to evaluate it with appropriate questionnaires. The final purpose is twofold, since we would like both to explore new creative paths where human ideas can be broadened by the use of AI software and to foster awareness about which aspects of a choreography correlate with a greater impact on the audience.

Acknowledgements
This work has been partially supported by European ICT-48-2020 Project TAILOR (g.a. 952215). We thank the Performing Robots group (https://site.unibo.it/performingrobots/en).

References
[1] R. López de Mántaras, Artificial intelligence and the arts: Toward computational creativity, 2016.
[2] M. Mazzone, A. Elgammal, Art, creativity, and the potential of artificial intelligence, in: Arts, volume 8, MDPI, 2019, p. 26.
[3] F. Sagasti, Information technology and the arts: the evolution of computer choreography during the last half century, Dance Chronicle 42 (2019) 1–52.
[4] A. Plone, The influence of artificial intelligence in dance choreography, 2019.
[5] S. Anwar, N. A. Bascou, M. Menekse, A. Kardgar, A systematic review of studies on educational robotics, Journal of Pre-College Engineering Education Research (J-PEER) 9 (2019) 2.
[6] J.-W. Hong, N. M. Curran, Artificial intelligence, artists, and art: attitudes toward artwork produced by humans vs. artificial intelligence, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 15 (2019) 1–16.
[7] Y. Wang, H. Ma, The value evaluation of artificial intelligence works of art, in: 2019 International Joint Conference on Information, Media and Engineering (IJCIME), IEEE, 2019, pp. 445–449.
[8] A. De Filippo, P. Mello, M. Milano, Do you like dancing robots? AI can tell you why, in: PAIS 2022, IOS Press, 2022. doi:10.3233/faia220064.
[9] A. Manfrè, I. Infantino, F. Vella, S. Gaglio, An automatic system for humanoid dance creation, Biologically Inspired Cognitive Architectures 15 (2016) 1–9. doi:10.1016/j.bica.2015.09.009.
[10] O. E. Ramos, N. Mansard, O. Stasse, C. Benazeth, S. Hak, L. Saab, Dancing humanoid robots: Systematic use of OSID to compute dynamically consistent movements following a motion capture pattern, IEEE Robotics & Automation Magazine 22 (2015) 16–26. doi:10.1109/mra.2015.2415048.
[11] K. Shinozaki, A. Iwatani, R. Nakatsu, Concept and construction of a dance robot system, in: Proceedings of the 2nd International Conference on Digital Interactive Media in Entertainment and Arts (DIMEA '07), ACM Press, 2007. doi:10.1145/1306813.1306848.
[12] A. Joseph, B. Christian, A. A. Abiodun, F. Oyawale, A review on humanoid robotics in healthcare, MATEC Web of Conferences 153 (2018) 02004. doi:10.1051/matecconf/201815302004.
[13] O. Mubin, C. J. Stevens, S. Shahid, A. A. Mahmud, J.-J. Dong, A review of the applicability of robots in education, Technology for Education and Learning 1 (2013). doi:10.2316/journal.209.2013.1.209-0015.
[14] A. K. Pandey, R. Gelin, Humanoid robots in education: A short review, in: Humanoid Robotics: A Reference, Springer Netherlands, 2018, pp. 2617–2632. doi:10.1007/978-94-007-6046-2_113.
[15] K. Shinozaki, A. Iwatani, R. Nakatsu, Construction and evaluation of a robot dance system, in: RO-MAN 2008 – The 17th IEEE International Symposium on Robot and Human Interactive Communication, IEEE, 2008. doi:10.1109/roman.2008.4600693.
[16] D. Grunberg, R. Ellenberg, Y. Kim, P. Oh, Creating an autonomous dancing robot, in: Proceedings of the 2009 International Conference on Hybrid Information Technology (ICHIT '09), ACM Press, 2009. doi:10.1145/1644993.1645035.
[17] C. Angulo, J. Comas, D. Pardo, Aibo JukeBox – a robot dance interactive experience, in: Advances in Computational Intelligence, Springer Berlin Heidelberg, 2011, pp. 605–612. doi:10.1007/978-3-642-21498-1_76.
[18] S. Ji, J. Luo, X. Yang, A comprehensive survey on deep music generation: Multi-level representations, algorithms, evaluations, and future directions, CoRR abs/2011.06801 (2020). arXiv:2011.06801.
[19] A. Gartland-Jones, MusicBlox: A real-time algorithmic composition system incorporating a distributed interactive genetic algorithm, in: Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2003, pp. 490–501. doi:10.1007/3-540-36605-9_45.
[20] C. Rizzuti, E. Bilotta, P. Pantano, A GA-based control strategy to create music with a chaotic system, in: Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2009, pp. 585–590. doi:10.1007/978-3-642-01129-0_66.
[21] F. Mauceri, S. M. Majercik, A swarm environment for experimental performance and improvisation, in: Computational Intelligence in Music, Sound, Art and Design, Springer International Publishing, 2017, pp. 190–200. doi:10.1007/978-3-319-55750-2_13.
[22] A. S. Ramanto, N. U. Maulidevi, Markov chain based procedural music generator with user chosen mood compatibility, International Journal of Asia Digital Art and Design Association 21 (2017) 19–24. doi:10.20668/adada.21.1_19.
[23] J. Wu, C. Hu, Y. Wang, X. Hu, J. Zhu, A hierarchical recurrent neural network for symbolic melody generation, IEEE Transactions on Cybernetics 50 (2020) 2749–2757. doi:10.1109/tcyb.2019.2953194.
[24] C.-Z. A. Huang, A. Vaswani, J. Uszkoreit, N. Shazeer, I. Simon, C. Hawthorne, A. M. Dai, M. D. Hoffman, M. Dinculescu, D. Eck, Music transformer, 2018. arXiv:1809.04281.
[25] J. A. Biles, GenJam: An interactive genetic algorithm jazz improviser, The Journal of the Acoustical Society of America 102 (1997) 3181–3181. doi:10.1121/1.420841.
[26] M. Kikuchi, Y. Osana, Automatic melody generation considering chord progression by genetic algorithm, in: 2014 Sixth World Congress on Nature and Biologically Inspired Computing (NaBIC 2014), IEEE, 2014. doi:10.1109/nabic.2014.6921876.
[27] H. Lim, S. Rhyu, K. Lee, Chord generation from symbolic melody using BLSTM networks, in: Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017), Suzhou, China, 2017, pp. 621–627. URL: https://ismir2017.smcnus.org/wp-content/uploads/2017/10/134_Paper.pdf.
[28] F. Pachet, P. Roy, J. Moreira, M. d'Inverno, Reflexive loopers for solo musical improvisation, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2013. doi:10.1145/2470654.2481303.
[29] I. P. Yamshchikov, A. Tikhonov, Music generation with variational recurrent autoencoder supported by history, SN Applied Sciences 2 (2020). doi:10.1007/s42452-020-03715-w.
[30] S. Chatfield, W. Byrnes, Correlational analysis of aesthetic competency, skill acquisition and physiologic capabilities of modern dancers, in: 5th Hong Kong International Dance Conference Papers, 1990, pp. 79–100.
[31] D. Krasnow, S. J. Chatfield, Development of the "performance competence evaluation measure": assessing qualitative aspects of dance performance, Journal of Dance Medicine & Science 13 (2009) 101–107.
[32] P. Gemeinboeck, R. Saunders, Movement matters, in: Proceedings of the 4th International Conference on Movement Computing, ACM, 2017. doi:10.1145/3077981.3078035.
[33] P. Gemeinboeck, R. Saunders, Dancing with the nonhuman, in: Thinking in the World, Bloomsbury Academic, London, 2019.
[34] P. Gemeinboeck, The aesthetics of encounter: A relational-performative design approach to human-robot interaction, Frontiers in Robotics and AI 7 (2021). doi:10.3389/frobt.2020.577900.
[35] J. L. Oliveira, L. P. Reis, B. M. Faria, F. Gouyon, An empiric evaluation of a real-time robot dancing framework based on multi-modal events, TELKOMNIKA Indonesian Journal of Electrical Engineering 10 (2012). doi:10.11591/telkomnika.v10i8.1327.
[36] L. Yang, S. Chou, Y. Yang, MidiNet: A convolutional generative adversarial network for symbolic-domain music generation, in: Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017), Suzhou, China, 2017, pp. 324–331. URL: https://ismir2017.smcnus.org/wp-content/uploads/2017/10/226_Paper.pdf.
[37] H.-M. Liu, Y.-H. Yang, Lead sheet generation and arrangement by conditional generative adversarial network, in: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), IEEE, 2018. doi:10.1109/icmla.2018.00114.
[38] Y. Yan, E. Lustig, J. VanderStel, Z. Duan, Part-invariant model for music generation and harmonization, in: Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018), Paris, France, 2018, pp. 204–210. URL: http://ismir2018.ircam.fr/doc/pdfs/293_Paper.pdf.
[39] Q. Cao, X. Chen, R. Song, H. Jiang, G. Yang, Z. Cao, Multi-modal experience inspired AI creation, arXiv preprint arXiv:2209.02427 (2022).
[40] F. Veit, L. Riedel, D. Jeraj, Does jumping to the beat result in better ratings from gymnastics experts?, Journal of Human Sport and Exercise 17 (2021). doi:10.14198/jhse.2022.174.17.
[41] F. Veit, J. Veit, T. Heinen, The influence of music on judges' evaluation of complex skills in gymnastics, European Journal of Sport Sciences 1 (2022) 1–7. doi:10.24018/ejsport.2022.1.5.31.
[42] M. Suzuki, Y. Matsuo, A survey of multimodal deep generative models, Advanced Robotics 36 (2022) 261–278. doi:10.1080/01691864.2022.2035253.
[43] J. Li, H. Peng, H. Hu, Z. Luo, C. Tang, Multimodal information fusion for automatic aesthetics evaluation of robotic dance poses, International Journal of Social Robotics 12 (2019) 5–20. doi:10.1007/s12369-019-00535-w.
[44] K. Kritsis, A. Gkiokas, A. Pikrakis, V. Katsouros, Attention-based multimodal feature fusion for dance motion generation, in: Proceedings of the 2021 International Conference on Multimodal Interaction, ACM, 2021. doi:10.1145/3462244.3479961.
[45] G. Valle-Pérez, G. E. Henter, J. Beskow, A. Holzapfel, P.-Y. Oudeyer, S. Alexanderson, Transflower, ACM Transactions on Graphics 40 (2021) 1–14. doi:10.1145/3478513.3480570.
[46] E. Gordon, Rating scales and their uses for measuring and evaluating achievement in music performance, GIA Publications, 2002.
[47] M. Wysoczanska, T. Trzcinski, Multimodal dance recognition, in: VISIGRAPP (5: VISAPP), 2020, pp. 558–565.
[48] F. Ofli, E. Erzin, Y. Yemez, A. M. Tekalp, Multi-modal analysis of dance performances for music-driven choreography synthesis, in: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, 2010, pp. 2466–2469.