=Paper=
{{Paper
|id=Vol-2009/fmt-proceedings-2017-paper19
|storemode=property
|title=HoloKeys - An Augmented Reality Application for Learning the Piano
|pdfUrl=https://ceur-ws.org/Vol-2009/fmt-proceedings-2017-paper19.pdf
|volume=Vol-2009
|authors=Dominik Hackl,Christoph Anthes
|dblpUrl=https://dblp.org/rec/conf/fmt/HacklA17
}}
==HoloKeys - An Augmented Reality Application for Learning the Piano==
Dominik Hackl, University of Applied Sciences Upper Austria, 4232 Hagenberg/Austria, Email: dominikhackl@gmx.at

Christoph Anthes, University of Applied Sciences Upper Austria, 4232 Hagenberg/Austria, Email: christoph.anthes@fh-hagenberg.at
Abstract: This paper describes the design and implementation of a piano training application. HoloKeys is an Augmented Reality tool that superimposes the keys to be played on a real piano. Musical pieces are loaded as MIDI files, interpreted, and can be displayed in two different ways. This prototype provides many possibilities for extension which can make it a powerful teaching tool.

I. INTRODUCTION

Augmented Reality (AR), described by Azuma as a technology where the user sees 'the real world, with virtual objects superimposed upon or composited with the real world' [1], has become a hot topic in recent years. The application areas are widespread and range far beyond simple advertisements and virtual manuals, from advanced training to sophisticated remote collaboration scenarios. Using AR to train musical instruments has a long tradition in the field, but because of the rapid development of AR Head-Mounted-Displays (HMDs) this application area has gained new attention.

We present HoloKeys, a prototypical implementation of an AR training tool for learning the piano. HoloKeys runs on an HMD which the user wears while sitting in front of a physical piano. The application indicates the notes to be played by displaying virtual keys superimposed on the physical keyboard, using two different approaches. Because the musical data is acquired dynamically by loading and processing MIDI (Musical Instrument Digital Interface) files, the application is fully agnostic regarding the musical pieces to be trained. To achieve the required precision for the augmentations on the piano, the application was implemented using fiducial marker tracking. Since this application is a prototype, an extensive collection of possible enhancements and prospects for the future is given.

A. Outline

The remainder of this paper is structured as follows: The next chapter provides an overview of related work in music teaching applications. Chapter III introduces the conceptual design of the application, describing the architecture and the user interface. Implementation details are provided in Chapter IV. Finally, conclusions are drawn and an outlook on future work is given.

II. RELATED WORK

Music education has a long tradition in the field of AR. In an early approach Cheng and Robinson provided a visual sheet music overlay displayed as a plane in the visual field of the user. The display of the augmentations is triggered when the user looks at their hands. The type of sheet depends on which hand the user looks at. The augmentation is not registered (meaning it is not directly spatially connected) to a real object, as opposed to the approach presented in this publication. An HMD is used for display [2]. Cakmakci et al. augmented a guitar with information on which string to pluck, with the intention of reducing cognitive discontinuities compared to the traditional way of learning an instrument. They were the first to provide information on the interaction to be performed directly on an instrument [3]. The registration of the guitar and the virtual hand is implemented with the help of fiducial markers.

In order to avoid the use of fiducial markers on the piano, Huang et al. use their knowledge of the application domain and track the keys of the piano for pose estimation with the help of natural feature recognition [4]. Unfortunately they provide no details on the display used, but the frame rate of 15 frames per second implies that it has not been developed for a head-tracked system.

Chow et al. focus on the educational level of AR piano teaching, showing that with the help of augmentations and gamification components the motivation and interest in learning the piano could be increased. They provided a system illustrating the notes to be played by lines approaching the keys. Their findings also indicate that notation literacy does not increase using their system of illustration [5]. We use a similar approach for the augmentations of the notes to be played but rely on an optical see-through HMD instead of a video-based HMD.

In contrast to this visualisation approach, Torres-Fernandez et al. introduce a virtual character which illustrates how well the piano player has performed. To interpret the played music they
compare the input from a MIDI keyboard with an initially loaded MIDI file [6]. A similar analysis was suggested and implemented earlier by Barakonyi and Schmalstieg [7]. They make use of fiducials for tracking and a desktop AR system equipped with a webcam and a traditional screen.

In terms of visualisation Weing et al. demonstrate a system in the area of Spatial Augmented Reality where they project the keys to be pressed directly on the piano. Different modes show, for example, the current and the next keys to be pressed. If a wrong key is pressed it is highlighted in red to provide feedback to the user [8].

Zhang et al. use a completely virtual keyboard and track the hand of the user with fiducial markers and the finger positions with a self-developed data glove. Their approach targets the rehabilitation of the motor function of stroke survivors rather than teaching the piano [9].

Compared to the existing approaches presented here, our system is unique in terms of the display technology used.

III. CONCEPTUAL DESIGN

The following chapter gives an overview of the application's hardware and software components and explains how the individual parts interact with each other.

A. Architecture Overview

The application's setup is illustrated in Fig. 1 and consists of the following two hardware components.

Fig. 1. Illustration of the conceptual design. The user, sitting in front of the piano and wearing an HMD, looks down at the keyboard. When there are notes to be played the respective key is highlighted. Underneath the keyboard there is an image marker which is used for tracking.

1) The Piano: The core component is a physical piano which is used for the actual playing. Underneath the piano keyboard, which usually consists of 88 keys, a fiducial marker is placed which is used by the application for tracking. The keys of a regular piano are standardized in size, which makes the application fully independent of the type of piano. In case an electronic keyboard is used, the key width can be adjusted.

2) The Head-Mounted-Display: The user sits in front of the piano and wears an HMD on which the application runs. Through the HMD the user sees augmentations in the form of highlighted keys on top of the real keyboard. The HMD also handles tracking by recognizing the image marker with the help of computer vision algorithms. The HMD therefore keeps track of the player's position and displays the augmentations accordingly. Additionally, the HMD is responsible for sound output of the music to be played. This gives the user an impression of how the piece is supposed to sound and makes it easier to play along with it.

B. Interface

In order to manage different settings and control the playback, a simple user interface was implemented. The originally two-dimensional UI is placed inside the 3D scene using world-stabilized coordinates. Considering the usually static setup of the application with the user sitting in front of the piano, the world-stabilized menu is a reasonable approach [10]. User input works through gaze-based interaction combined with gestures.

1) The Main Menu: The initial scene of the application is the main menu. There the user can select the musical piece to play as well as the desired playback speed. When the start button is pressed, the application switches to playback mode and begins visualizing and playing the musical piece.

2) Playback Mode: In playback mode the user sees the augmentations of the keys to be played superimposed on the physical keyboard. Additionally a timeline shows the current playback position and gives the user the option to jump to different positions inside the piece. With the pause button the user is able to interrupt the playback or return to the main menu.

3) Calibration Mode: In calibration mode the application displays an augmentation of only one key, the middle C. The user can adjust the position of the marker until the virtual key perfectly fits the real one. This is useful for setting up the optimal position of the marker on the piano. Additionally the user can adjust the pitch of the virtual piano sound in calibration mode, because it does not necessarily match that of the real piano. Playback volume can be adjusted in the HMD.

C. Display of Augmentations

Generally the HMD displays an augmentation of a bright green key to indicate that the actual key at that position has to be pressed. Two different approaches, as seen in Fig. 2, were tested, and both have their advantages and disadvantages concerning predictability and Field Of View (FOV) limitations.

1) The Instant Approach: The moment a key is supposed to be pressed it becomes highlighted. Once it is supposed to be released it switches back to normal. This way the user can more or less observe the playing of the piece in real time, comparable to watching the fingers of an actual pianist. While this approach can be useful for advanced players, it is hardly possible to learn a new piece or even to play along with it, because the player has no way of predicting the next notes. Still, this mode is visually appealing and could be used for showcase purposes (self-playing piano), as the limited FOV is also less of a problem there.

2) The Beatmania Approach: Note objects are created far in the distance and from there start moving towards the particular keys. As soon as the virtual object reaches the real key, the note should be played. With this approach, which became
popular with the game 'Beatmania' [11] and is still used in many music rhythm games today, the user can anticipate the upcoming notes and prepare accordingly. When learning a piano piece the musician's brain utilizes its 'muscle memory' and fine motor skills rather than memorizing each individual note [12]. Therefore learning a piece with the Beatmania approach should be as efficient as learning it from sheet music, especially for beginners.

Fig. 2. Comparing the two tested approaches. Left: The Instant Approach. Right: The Beatmania Approach.

IV. IMPLEMENTATION

This chapter goes into detail regarding the concrete implementation of HoloKeys. It starts with a brief overview of the hardware and software tools used, followed by an in-depth description of the two main development tasks, visualization and MIDI processing.

A. Used Technologies

The application was developed for tablet devices as well as the HoloLens. The tablet approach is mainly used for demonstration purposes, rather than actual training.

1) Hardware:

• HoloLens (https://www.microsoft.com/en-us/hololens)
The HoloLens as a current AR HMD provides good sensory support as well as spatial audio and stereoscopic display capabilities. Its main disadvantage, the limited FOV, poses an issue for its applicability to this use case.

2) Software: To allow cross-platform and cross-device development the following set of tools and libraries was used.

• Unity (https://unity3d.com/)
Unity is traditionally a game engine which has found wide adoption in the whole domain of Mixed Reality [13]. It allows scene setup and provides scripting capabilities. Applications developed with Unity can easily be deployed on a multitude of target platforms including iOS and Android devices as well as UWP (Universal Windows Platform) devices.

• Vuforia (https://www.vuforia.com/)
The Augmented Reality part of the project is based on Vuforia, an AR tracking library which integrates well with Unity. Vuforia supports several different tracking methods ranging from recognizing plain images to complex objects. With a specific setup, Vuforia can also be used on the HoloLens.

• C# Synth Project and MIDI Support (https://csharpsynthproject.codeplex.com/)
The C# Synth Project is an open-source library which is used for processing MIDI data and synthesizing it to audio data. MIDI is an industry standard for interconnection between musical instruments and digital devices. Its file format represents musical information like note values, volume and tempo. Although MIDI is a complex format, it is still the most popular and commonly used format to store musical data. For piano pieces the format is usually sufficient because only one channel is required to store a series of notes and tempo changes.

B. Visualization and Tracking

The application's visuals consist of a Unity 3D scene which renders the virtual keys, combined with Vuforia's tracking abilities to provide the information on where to render the keys.

1) Vuforia's image target: For this application, tracking via fiducial marker and image target was used. The image target in Unity is a planar object in 3D space which is associated with a set of 2D images. These images represent the markers that are placed somewhere in the real world. Once the camera recognizes a marker, the application can derive the position of the HMD and can therefore project all augmented objects accordingly.

2) Tracking setup: Marker images and other tracking settings can be configured in Vuforia's web interface. This configuration with all related assets is then compiled into a Unity package that can subsequently be imported into Unity. In Unity two components of Vuforia, ARCamera and ImageTarget, are used. Child objects of the ImageTarget are affected by the marker-related projection.

3) Generating the keyboard: In order to display the currently played keys, first an entire virtual keyboard is displayed half-transparently, superimposed on the real one. A script takes care of automatically generating all 88 key objects. One base key object is placed in the scene and aligned at around 90 degrees relative to the ImageTarget. This registration has to match the real-world relation between marker and piano keyboard. All other keys are then generated as duplicates of the base object with their respective offset and color (black or white).

C. Audio and MIDI Playback

The two core components of the C# Synth Project library are the MidiSequencer, which handles loading and processing MIDI data, and the MidiStreamSynthesizer, which handles the actual audio playback.
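As a rough illustration of how the 88 generated key objects relate to the MIDI note numbers delivered during playback, the following sketch builds a keyboard layout and toggles a highlight flag per key. It is written in Python for brevity and is not the Unity/C# code of HoloKeys. All names (Key, build_keyboard, on_note_on, on_note_off) are hypothetical, the white-key width of 23.5 mm is an assumed default, and centering each black key on the gap between its white neighbours is a simplification. The only fixed facts used are that a standard 88-key piano spans MIDI notes 21 (A0) to 108 (C8) and that middle C is note 60.

```python
from dataclasses import dataclass

# MIDI note numbers of the lowest and highest key on a standard 88-key piano.
LOWEST_NOTE = 21    # A0
HIGHEST_NOTE = 108  # C8
# Pitch classes (note number modulo 12) that correspond to black keys.
BLACK_PITCH_CLASSES = {1, 3, 6, 8, 10}  # C#, D#, F#, G#, A#
# Assumed white-key width in millimetres (not taken from the paper).
DEFAULT_WHITE_KEY_WIDTH_MM = 23.5


@dataclass
class Key:
    midi_note: int
    is_black: bool
    x_offset_mm: float        # key centre, measured from the left keyboard edge
    highlighted: bool = False


def build_keyboard(white_key_width_mm=DEFAULT_WHITE_KEY_WIDTH_MM):
    """Create all 88 keys, indexed by MIDI note number."""
    keys = {}
    white_keys_so_far = 0
    for note in range(LOWEST_NOTE, HIGHEST_NOTE + 1):
        is_black = note % 12 in BLACK_PITCH_CLASSES
        if is_black:
            # Simplification: centre the black key on the boundary
            # between its two white neighbours.
            x = white_keys_so_far * white_key_width_mm
        else:
            x = (white_keys_so_far + 0.5) * white_key_width_mm
            white_keys_so_far += 1
        keys[note] = Key(note, is_black, x)
    return keys


def on_note_on(keys, midi_note):
    """Highlight the key whose note starts; ignore notes outside the keyboard."""
    key = keys.get(midi_note)
    if key is not None:
        key.highlighted = True


def on_note_off(keys, midi_note):
    """Reset the key whose note ends; ignore notes outside the keyboard."""
    key = keys.get(midi_note)
    if key is not None:
        key.highlighted = False


# Usage: build the layout once, then react to note events during playback.
keyboard = build_keyboard()
on_note_on(keyboard, 60)   # middle C becomes highlighted
on_note_off(keyboard, 60)  # and returns to its default state
```

Keeping the keys in a dictionary indexed by MIDI note number makes the event handlers a single lookup. In a rendering engine the highlighted flag would correspond to switching the key's material color, and the offsets would be scaled to the calibrated key width.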
1) Handling key actions: During playback the MidiSequencer fires two events that are relevant for this application: MidiNoteOn and MidiNoteOff. These two events are fired when the playback of a note is triggered or terminated, respectively, and therefore indicate exactly the time when a key is pressed and released. In the implementations of these two event handlers the MIDI code of the affected note is passed as a parameter. The only operation is to map this MIDI code to the corresponding key object and set its material color to either green (in NoteOn) or the default color (in NoteOff).

2) Combining the audio sources: The MidiStreamSynthesizer creates actual audio data based on the sequencer's input. To make sure that this audio data is actually redirected to Unity's audio source, the special method OnAudioFilterRead has to be implemented. This method supports direct writing into the audio buffer and therefore redirects the contents of the StreamSynthesizer to Unity's audio source.

V. CONCLUSION

As a prototype the application serves well, but due to the limited FOV, which will most likely increase in the next years with the following generations of AR hardware, its real-world usefulness is currently doubtful. Furthermore, an evaluation of the different augmentation methods would be useful. Especially when trying out a few more possible approaches, a user test could find out which of the methods are most likely to work in a real-world scenario. A more in-depth study of musical augmentation methods would also be useful for teaching other instruments or even in completely different areas of music.

A. Future Work - The Virtual Piano Teacher

A long-term vision could be the creation of a full-featured virtual piano teacher using AR. Especially early-stage piano learning contains many tasks that could be implemented with AR technologies like the one explained in this paper, combined with gamification elements.

1) Use Cases:

• Learning notes and the piano keyboard
Simple exercises or games to recognize the note names and match them with the proper keys could noticeably increase the early-stage learning rate. For beginners the note names could be augmented on top of every key until they become familiar with them.

• Learning easy to intermediate musical pieces
Especially for smaller pieces the AR learning approach could surpass traditional learning from sheet music. Beginners who are not yet used to reading music would still be able to learn pieces quickly on their own. Additionally, a lot more useful information like fingering, expression and dynamics could be displayed during playback.

• Technical exercises
The importance of regular technical exercises for piano students is huge, but such exercises are generally underestimated and disliked. With the introduction of AR and gamification, many enjoyable and still pianistically valuable exercises could be realized. By adding some sort of level system, the student would be even more aware of their progress and more likely to remain motivated.

• Dictionary of chords, scales etc.
A very useful utility not only for beginners but also for advanced pianists would be a piano dictionary. The player could look up all possible chords and scales and would be able to see them highlighted right on top of the keyboard. Especially for jazz piano, where complex chords and scales are common, this technology would be of great service.

2) Further Improvements:

• Using music sheets as markers
The use of music sheets, perhaps in the form of a special music book, as fiducial markers could eliminate the need for additional markers placed on the piano. It could not only automatically detect the musical piece to be played but also indicate when to turn the sheets or even highlight musical attributes on the sheets.

• Checking the learning performance
Real-time feedback on the user's playing could greatly contribute to the learning experience. This could be achieved on the one hand by using MIDI keyboards to directly receive the MIDI input of pressed keys, or on the other hand by recording and deconstructing the audio data. The first approach would be technologically straightforward but would limit the application to electronic keyboard instruments, while the second approach would be more flexible but complicated to implement and perhaps inaccurate [14].

The possibilities of the virtual piano teacher are enormous, but all are based on the core concept of the technique explained in this paper. As soon as there are improvements in AR hardware, especially concerning FOV, virtual piano teachers can be implemented and actually start to become a helpful tool.

REFERENCES

[1] R. T. Azuma, "A survey of augmented reality," Presence: Teleoperators and Virtual Environments, vol. 6, no. 4, pp. 355–385, August 1997.
[2] L.-T. Cheng and J. Robinson, "Personal contextual awareness through visual focus," IEEE Intelligent Systems, vol. 16, no. 3, pp. 16–20, 2001.
[3] O. Cakmakci, F. Bérard, and J. Coutaz, "An augmented reality based learning assistant for electric bass guitar," in 10th International Conference on Human-Computer Interaction, 2003.
[4] F. Huang, Y. Zhou, Y. Yu, Z. Wang, and S. Du, "Piano AR: A markerless augmented reality based piano teaching system," in Third International Conference on Intelligent Human-Machine Systems and Cybernetics, 2011.
[5] J. Chow, H. Feng, R. Amor, and B. C. Wünsche, "Music education using augmented reality with a head mounted display," in Fourteenth Australasian User Interface Conference (AUIC2013). Melbourne, Australia: ACM, Jan. 2013, pp. 73–79.
[6] C. A. T. Fernandez, P. Paliyawan, and C. C. Yin, "Piano learning application with feedback provided by an AR virtual character," in 5th Global Conference on Consumer Electronics. Kyoto, Japan: IEEE, Oct. 2016.
[7] I. Barakonyi and D. Schmalstieg, "Augmented reality agents in the development pipeline of computer entertainment," in 4th International Conference on Entertainment Computing (ICEC'05). Sanda, Japan: Springer, Sep. 2005, pp. 345–356.
[8] M. Weing, A. Röhlig, K. Rogers, J. Gugenheimer, F. Schaub, B. Könings, E. Rukzio, and M. Weber, "P.I.A.N.O.: Enhancing instrument learning via interactive projected augmentation," in Conference on Pervasive and Ubiquitous Computing Adjunct Publication (UbiComp13). Zurich, Switzerland: ACM, Sep. 2013, pp. 75–78.
[9] D. Zhang, Y. Shen, S. Ong, and A. Nee, "An affordable augmented reality based rehabilitation system for hand motions," in International Conference on Cyberworlds (CW '10). Singapore, Singapore: IEEE, Oct. 2010.
[10] M. Billinghurst and H. Kato, "Collaborative mixed reality," in International Symposium on Mixed Reality (ISMR '99). Springer, 1999, pp. 261–284.
[11] S. Steinberg, Music Games Rock. P3: Power Play Publishing, 2011. [Online]. Available: http://www.musicgamesrock.com/
[12] R. Shusterman, "Muscle memory and the somaesthetic pathologies of everyday life," Human Movement, vol. 12, no. 1, pp. 4–15, 2011.
[13] P. Milgram, H. Takemura, A. Utsumi, and F. Kishino, "Augmented reality: A class of displays on the reality-virtuality continuum," Presence: Telemanipulator and Telepresence Technologies, vol. 2351, pp. 282–292, 1994.
[14] S. Dixon, "On the computer recognition of solo piano music," in Proceedings of Australasian Computer Music Conference, 2000, pp. 31–37.