=Paper= {{Paper |id=Vol-1343/paper6 |storemode=property |title=Introduction to Optical Music Recognition:Overview and Practical Challenges |pdfUrl=https://ceur-ws.org/Vol-1343/paper6.pdf |volume=Vol-1343 |dblpUrl=https://dblp.org/rec/conf/dateso/NovotnyP15 }} ==Introduction to Optical Music Recognition:Overview and Practical Challenges== https://ceur-ws.org/Vol-1343/paper6.pdf
        Introduction to Optical Music Recognition:
            Overview and Practical Challenges

               Jiří Novotný and Jaroslav Pokorný

  Department of Software Engineering, Faculty of Mathematics and Physics
  Charles University, Malostranské nám. 25, Prague, Czech Republic
               {novotny, pokorny}@ksi.mff.cuni.cz


          Abstract. Music has always been an integral part of human culture.
          In our computer age, it is not surprising that there is a growing
          interest in storing music in digitized form. Optical music recognition (OMR)
          refers to a discipline that investigates music score recognition systems.
          This is similar to well-known optical character recognition systems, ex-
          cept OMR systems try to automatically transform scanned sheet music
          into a computer-readable format. In such a digital format, semantic in-
          formation is also stored (instrumentation, notes, pitches and duration,
          contextual information, etc.). This article introduces the OMR field and
          presents an overview of the relevant literature and basic techniques. Prac-
          tical challenges and questions arising from the automatic recognition of
          music notation and its semantic interpretation are discussed as well as
          the most important open issues.

          Key words: optical music recognition, document image analysis, ma-
          chine learning


   1    Introduction
   Computer perception of music notation forms a constantly growing research field
   called optical music recognition (OMR). The main goal of all OMR systems is to
   automatically decode and interpret the symbols of music notation from scanned
   images. Results of the recognition are represented in a digital format suitable to
   store the semantic information (notes, pitches, dynamics and so on). The main
   advantage of such a representation of music scores is that it enables a range of
   applications, such as audio playback, re-editing, musicological analyses,
   conversions to different formats (e.g. Braille music notation) and the preservation of
   cultural heritage [23]. More recent applications include concert-planning
   systems sensitive to the emotional content of music [7] and automatic mapping of
   scanned sheet music to audio recordings [18].
       Over the years, music has traditionally been written down with ink and
   paper. During the 1980s, early computer music typesetting programs were
   developed, which revolutionized the way music can be recorded. Nowadays,
   the most common approach to transform music data into a computer-readable
   format (used by professional musicians) combines musical keyboard input (e.g.


M. Nečaský, J. Pokorný, P. Moravec (Eds.): Dateso 2015, pp. 65–76, CEUR-WS.org/Vol-1343.
MIDI piano) with computer keyboard and mouse. It is a time-consuming
procedure that requires advanced keyboard-playing skills. The musical keyboard
is used to enter the notes voice by voice, and then the computer
keyboard and mouse are used to correct mistakes and to add further information
such as articulation marks, slurs and dynamics.
    The majority of music scores exist only in the paper-based form and many
contemporary composers and musicians still prefer to use pen and paper as the
most efficient way to record their ideas. OMR systems can thus greatly simplify
the music data acquisition and save a lot of human time.
    In this article, we survey the area of OMR, its fundamental approaches and
problems. Section 1.1 introduces a few aspects of music notation, and Section 1.2
reviews the historical context of OMR. A general framework for music
recognition is presented in Section 2. Challenges that make OMR difficult in practice
are discussed in Section 3. Section 4 discusses several open questions of OMR
research and, finally, we conclude the paper in Section 5.

1.1   Music Notation
Music notation has evolved over centuries as composers and
musicians tried to express their musical ideas in written symbols [33]. In this
article, we focus exclusively on Western music notation (also known as common
music notation — CMN) [10, 39], although certain OMR systems are developed
to recognize other types of notation (e.g. medieval music notation [13, 42]).
    Understanding any music notation requires knowledge of the information
the notation attempts to capture. In the case of CMN, four types of
information are involved [10]: pitch, time, loudness (dynamics) and timbre
(tone quality). Figure 1 shows selected music notation marks. Clefs (Fig. 1a)
determine the pitches for each line and space of the staff (Fig. 1b), while accidentals
(Fig. 1e) temporarily modify the pitch of the following notes. The pitch of a note
(Fig. 1c) is indicated by its vertical placement on the staff, and its
appearance determines its relative duration. Ornaments (Fig. 1f) change the pitch
pattern of individual notes. Rests (Fig. 1d) indicate a relative duration of silence.
Dynamics (Fig. 1g) signify the varying loudness. Articulations (Fig. 1h) change
the timbre or duration of a note. In practice, certain symbols have almost unlim-
ited variations in representation (e.g. beams connecting notes into note groups
or slurs indicating phrasing). The most used CMN symbols and their graphical
aspects are listed e.g. in the Essential Dictionary of Music Notation [21].
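
The relation between a clef, the vertical placement of a note and its pitch can be illustrated with a short sketch. It is only an illustration under simplifying assumptions: the function name and the position numbering (0 for the bottom staff line, 1 for the space above it, and so on) are hypothetical, only the treble and bass clefs are covered, and accidentals and key signatures are ignored.

```python
STEPS = "CDEFGAB"

# Diatonic index (octave * 7 + step) of the bottom staff line for each clef.
BOTTOM_LINE = {
    "treble": 4 * 7 + 2,  # E4
    "bass": 2 * 7 + 4,    # G2
}


def pitch_from_staff_position(position, clef="treble"):
    """Return the pitch name for a vertical staff position.

    position 0 is the bottom staff line, 1 the first space, 2 the second
    line, and so on; negative values lie below the staff (ledger lines).
    """
    index = BOTTOM_LINE[clef] + position
    return f"{STEPS[index % 7]}{index // 7}"


print(pitch_from_staff_position(0))           # bottom line, treble clef
print(pitch_from_staff_position(8, "bass"))   # top line, bass clef
```

The same graphical position therefore yields different pitches under different clefs, which is why clef recognition has to precede pitch interpretation.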

Music Scores. For music recognition purposes, it is useful to note that
music scores can be divided into three categories: entirely printed music scores
(Figure 2a), scores written by hand over preprinted staff lines (Figure 2b) and
entirely handwritten music scores (Figure 2c). Although the majority of OMR
systems operate on printed scores only [35], OMR systems for handwritten
music have been researched as well (e.g. [1, 17, 27, 36, 37, 45]). It should also be
noted that scores of varying visual quality exist, from clean music sheets
to degraded ones (mentioned e.g. in [9, 11]).


Fig. 1: Selected common music notation symbols: (a) clefs, (b) staff, (c) notes,
(d) rests, (e) accidentals, (f) ornaments, (g) dynamics, (h) articulations.

Fig. 2: Examples of different sheet music categories: (a) entirely printed,
(b) preprinted staff lines, (c) entirely handwritten.


1.2   Historical Background

The OMR research began in 1966, when Pruslin [32] first attempted the auto-
matic recognition of sheet music. His system was able to recognize note heads
and chords. In 1970, Prerau [31] introduced the concept of image segmentation
to detect primitive elements of music notation. These two OMR founding works
were later reviewed by Kassler [24].
    With the availability of inexpensive optical scanners, OMR research
expanded in the late 1980s. An interesting contribution was the Japanese
keyboard-playing robot WABOT-2 [25], developed in 1984. It was the first robot able to
recognize simple music scores and play them on the organ. A critical survey of
the OMR systems developed between 1966 and 1990 can be found in [8].
    The first commercial OMR products appeared in the early 1990s [23,35]. Also,
the first attempts to handle handwritten scores were made (e.g. [37,45]). In 1997,
Bainbridge summarized the existing techniques and proposed an extensible music
recognition system [2] not restricted to particular primitive shapes and semantic
features. Together with Bell [3], they formulated a general framework for OMR
systems, which has been adopted by many researchers since then [35].
    In recent years, several important studies have been carried out: Jones
et al. [23] presented a study of music sheet digitization, recognition and
restoration. Moreover, they listed available OMR software and provided an
evaluation of three OMR systems. Noteworthy contributions to OMR have been
made by Rebelo et al. [30, 34, 36]. In 2012, they published probably the most
recent review of the OMR field [35], including an overview of the state-of-the-art
techniques and a discussion about the open issues.


2     General Framework

Automatic recognition of music scores is a complex task affecting many areas of
computer science. Different OMR systems use various strategies, but the most
common algorithms decompose the problem into four smaller tasks [35]:

 1. Image Preprocessing
 2. Segmentation
 3. Object Recognition
 4. Semantic Reconstruction

Terminology is not always the same: the segmentation is also called primitive
detection or musical object location, and the recognition phase is sometimes called
musical feature classification [2, 3].


2.1   Image Preprocessing

The main goal of the preprocessing phase is to adjust the scanned image to make
the recognition process more robust and efficient. Several methods are typically
used: enhancement, blurring and morphological operations [22], noise removal
(e.g. [20, 22, 40, 42]), deskewing [17, 20, 22, 27, 42] and binarization (e.g. [9, 17, 20,
22, 27, 30, 42]). In the following text, only binarization is introduced, as it is
the most crucial step for the vast majority of OMR systems.


Binarization. Binarization algorithms convert the input image into a binary
one, where objects of interest (music symbols, staves, etc.) are separated from
the background. This is motivated by the fact that music scores are inherently
binary (colors are not used to store music information in CMN).
    Binarization is usually an automated process performed without special
knowledge of the image content. It facilitates the subsequent tasks by reducing the
volume of information that needs to be processed. For example, it is much
easier to design an algorithm for staff detection, primitive segmentation and
recognition in binary images than in grayscale or color ones.
    In general, there are two types of binarization approaches. The first are global
thresholding methods, which apply a single threshold to the entire image.
Otsu's method [29] is often judged to be the best and fastest [38, 44]. Global
thresholding works well when extracting objects from uniform backgrounds, but
usually fails on non-uniform images. Nevertheless, it is used in several OMR
research articles (e.g. [22, 34, 42]) because of its simplicity and time efficiency.
The second category is represented by adaptive binarization techniques, which
select a threshold individually for each pixel using information from the local
neighborhood. These methods can cope with non-uniform backgrounds at the
expense of longer processing time. One of the most popular adaptive thresholding
methods is Niblack's [28], which computes a local threshold from the mean and
standard deviation of the pixel's surroundings. Adaptive thresholding is also used
in some OMR systems (e.g. [40]). An overview of binarization techniques used in
OMR can be found in [9, 38].
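
The difference between the two families can be made concrete with a minimal pure-Python sketch. It is not taken from any of the cited systems: the function names, the window radius, the weight k = -0.2 and the convention that 1 marks ink are assumptions chosen for illustration. The global variant picks one threshold by maximizing the between-class variance (the idea behind Otsu's method), whereas the local variant uses T = mean + k * std over each pixel's neighborhood (the idea behind Niblack's method).

```python
from math import sqrt


def otsu_threshold(pixels):
    """Return the gray level t (0..255) maximizing the between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0, sum0 = 0, 0  # size and gray-level sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2  # proportional to the variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize_global(image, threshold):
    """Apply one threshold to the whole image (1 = ink, 0 = background)."""
    return [[1 if v <= threshold else 0 for v in row] for row in image]


def niblack_binarize(image, radius=1, k=-0.2):
    """Threshold every pixel with T = mean + k * std of its local window."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            mean = sum(window) / len(window)
            std = sqrt(sum((v - mean) ** 2 for v in window) / len(window))
            row.append(1 if image[y][x] < mean + k * std else 0)
        result.append(row)
    return result


# A dark horizontal stroke on a bright background.
image = [[100, 100, 100], [20, 20, 20], [100, 100, 100]]
flat = [v for row in image for v in row]
print(binarize_global(image, otsu_threshold(flat)))
print(niblack_binarize(image))
```

The local variant visits a whole window per pixel, which illustrates the longer processing time mentioned above; practical implementations usually reduce this cost, for example with integral images.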

2.2   Segmentation
The segmentation stage parses music scores into the elementary primitives. It is
usually initiated by establishing the size of the music notation being processed.
This is an important step before any shape recognition. Staff lines are a reliable
feature of music notation used to estimate two important reference values: staff
line thickness and staff space height, which are further used to deduce the size
of other music symbols. The most common way to approximate them is based
on run-length encoding (RLE), which is a simple form of data compression.
For instance, let us assume the binary sequence {1 1 1 0 0 1 1 1 1 0 0 0 0}. It can
be represented in RLE as {3 2 4 4} (supposing 1 starts the original sequence,
otherwise the first number in the encoded sequence would be 0). Binarized music
scores can be encoded column by column with RLE, then the relative lengths
can be easily estimated: the most common black run approximates the staff
line thickness and the most common white run estimates the staff space height.
However, more robust approximations exist [12].
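
The estimation described above can be sketched in a few lines. This is a minimal illustration, assuming a binary image given as a list of rows in which 1 denotes a black pixel and which contains both black and white pixels; the function names are hypothetical, and the more robust estimators mentioned in [12] are not reproduced here.

```python
from collections import Counter


def run_lengths(sequence):
    """Encode a 0/1 sequence as run lengths, assuming the first run is of 1s.

    A leading 0 is emitted when the sequence actually starts with 0.
    """
    runs = []
    current, length = 1, 0
    for value in sequence:
        if value == current:
            length += 1
        else:
            runs.append(length)
            current, length = value, 1
    runs.append(length)
    return runs


def estimate_staff_metrics(image):
    """Return (staff line thickness, staff space height) of a binary image."""
    black, white = Counter(), Counter()
    height = len(image)
    for x in range(len(image[0])):
        column = [image[y][x] for y in range(height)]
        for index, length in enumerate(run_lengths(column)):
            if length == 0:
                continue
            # Even positions hold black runs, odd positions hold white runs.
            (black if index % 2 == 0 else white)[length] += 1
    return black.most_common(1)[0][0], white.most_common(1)[0][0]


print(run_lengths([1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]))  # [3, 2, 4, 4]
```

Taking the most frequent run lengths makes the estimate insensitive to page margins and to the symbols themselves, as long as staff lines dominate the columns of the image.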

Staff Lines. Staff line detection is fundamental in OMR, because the staff
creates a two dimensional coordinate system essential to understand the CMN.
Unfortunately, staff lines are not guaranteed to be perfectly horizontal, straight
or of uniform thickness in scanned images (even in printed music scores). Precise
staff detection is a tricky problem that still represents a challenge.
    The simplest algorithms use horizontal projections [19,20]. A horizontal pro-
jection maps a binary image into a histogram by accumulating the number of
black pixels in each row. If the lines are straight and horizontal, a staff can be
detected as five consecutive distinct peaks (local maxima) in the histogram.
Figure 3 shows an excerpt of music and its horizontal projection. In practice, several
horizontal projections of images with slightly different rotation angles are
computed to deal with staff lines that are not completely horizontal. The projection with the
highest local maxima is then chosen.
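
A horizontal projection and a simple peak search can be sketched as follows. The sketch makes two simplifying assumptions that are not part of the cited methods: the image is already deskewed, and a peak is taken to be any group of adjacent rows in which at least a fraction `min_fill` of the pixels is black (a thresholding shortcut instead of a true local-maximum search). The function names and the parameter are hypothetical.

```python
def horizontal_projection(image):
    """Count the black pixels (value 1) in every row of a binary image."""
    return [sum(row) for row in image]


def staff_line_rows(image, min_fill=0.5):
    """Return the center row of each group of adjacent, mostly black rows."""
    width = len(image[0])
    projection = horizontal_projection(image)
    centers, start = [], None
    # The trailing 0 is a sentinel that closes a group touching the last row.
    for y, count in enumerate(projection + [0]):
        if count >= min_fill * width:
            if start is None:
                start = y
        elif start is not None:
            centers.append((start + y - 1) // 2)
            start = None
    return centers


# A synthetic staff: five lines, 2 pixels thick, separated by 5 white rows.
staff = ([[0] * 4] * 3 + ([[1] * 4] * 2 + [[0] * 4] * 5) * 4
         + [[1] * 4] * 2 + [[0] * 4] * 3)
print(staff_line_rows(staff))
```

To handle slightly rotated scans in the way described above, the same projection would be computed for several candidate rotation angles and the angle with the sharpest peaks retained.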
    Other strategies use vertical scan lines [13], the Hough transform or grouping
of vertical columns [35]. Although there are many staff line detection techniques,
they all have certain limitations. Dalitz [15] surveyed the existing methods and
proposed a method based on skeletonization. Handwritten staff lines are usually
detected using different kinds of techniques (e.g. [1, 43]).

Symbol Segmentation. Once the staff lines have been detected, the music
primitives must be located and isolated. This can be performed in two manners:

           Fig. 3: The horizontal projection of a music score excerpt.



either remove the staff lines or ignore them. Although the majority of researchers
remove the staff lines in order to isolate the musical symbols as connected
components, some authors suggest the opposite (e.g. [4, 22]). The
simplest line removal algorithm removes the line piecewise, following it along
and replacing the black line pixels with white pixels unless there is evidence of
an object on either side of the line [3]. The staff line removal procedure must
be careful not to break any object. Despite that, the algorithms often cause
fragmentation, especially of objects that touch the staff lines tangentially.
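
The piecewise removal rule can be sketched as below. This is a simplified illustration and not the algorithm of [3]: it assumes a perfectly horizontal line whose top row and thickness are already known, and it treats a black pixel directly above or below the line as evidence of an object. The function name and its parameters are hypothetical.

```python
def remove_staff_line(image, top, thickness):
    """Erase one horizontal staff line column by column.

    The line occupies rows top .. top + thickness - 1 of a binary image
    (1 = black). A column is left untouched when a black pixel lies directly
    above or below the line, since a symbol probably crosses it there.
    """
    height = len(image)
    result = [row[:] for row in image]  # work on a copy
    for x in range(len(image[0])):
        above = top > 0 and image[top - 1][x] == 1
        below = top + thickness < height and image[top + thickness][x] == 1
        if above or below:
            continue
        for y in range(top, top + thickness):
            result[y][x] = 0
    return result


# A one-pixel staff line (row 2) crossed by a vertical stem (column 3).
image = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
for row in remove_staff_line(image, top=2, thickness=1):
    print(row)
```

The sketch also shows where fragmentation comes from: a symbol that only touches the line tangentially has no black pixel on the opposite side, so a stricter test (for example requiring evidence on both sides) would erase part of it.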
    The score is then divided into regions of interest to localize and isolate the
musical primitives. The best approach is hierarchical decomposition [35]. At first,
a music score is analyzed and split by staves. Then, the primitive symbols (note
heads, stems, flags, rests, etc.) are extracted [22, 27, 34]. Particular procedures
vary from system to system. For example, some approaches consider note heads, stems
and flags to be separate objects, whereas other concepts consider these primitives
as a whole object representing a single note. More details can be found in [34].


2.3   Object Recognition

Segmented symbols are further processed and given to the classifier that tries
to recognize them (assign them a label from predefined groups). Unfortunately,
music shapes are inherently complex — they are often formed by several touching
and overlapping graphical components. In addition, the staff line removal can
break some objects (they are sometimes already fragmented because of the music
score quality itself). Hence, the object recognition phase is very delicate and it
is usually combined with the segmentation step [35].
    Objects are classified according to their distinctive features. Some authors
suggest classification using projection profiles [19], others apply template match-
ing to recognize symbols [22] or propose a recognition process entirely driven by
grammars formalizing music knowledge [14]. Statistical classification
methods using support vector machines (SVMs), neural networks (NNs), k-nearest
neighbours (kNN) and hidden Markov models (HMMs) were
investigated by Rebelo et al. [34]. Handwritten music symbols are sometimes segmented
and recognized using mathematical morphology, applying a skeletonization
technique and an edge detection algorithm [26]. Despite the number of recog-
nition techniques available, research on symbol segmentation and recognition is
still important and necessary, because all OMR systems depend on it [35].

2.4    Semantic Reconstruction
An unavoidable task of all OMR systems is to reconstruct the musical
semantics from previously recognized graphical primitives and store the information
in a suitable data structure. This necessarily requires an interpretation of the
spatial relationships between objects found in the image. Relations in CMN are
essentially two-dimensional and the positional information is critical. For
example, a dot can change a note's duration if it is placed to the right of the note
head, or it can alter the articulation if it is placed above the note.
    These musical syntactic rules can be formalized using grammars [2,
14, 19, 31]. Grammar rules specify semantically valid music notation events and
the way in which the graphical primitives should be segmented. Alternative techniques
build the semantic reconstruction on a set of rules and heuristics (e.g. [16, 26]).
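
The dot example can be turned into a small rule-based sketch. It is a toy heuristic rather than a method from the cited works: the `Note` class, its fields and the tolerances are hypothetical, image coordinates are assumed to grow downwards, and the articulation is assumed to be a staccato. A dot to the right of the note head is interpreted as an augmentation dot (duration multiplied by 3/2), and a dot above the head as an articulation mark.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Note:
    x: float            # center of the note head (image coordinates)
    y: float
    head_width: float
    head_height: float
    duration: Fraction  # fraction of a whole note
    staccato: bool = False


def apply_dot(note, dot_x, dot_y):
    """Interpret a recognized dot according to its position relative to a note."""
    right_edge = note.x + note.head_width / 2
    top_edge = note.y - note.head_height / 2
    if dot_x > right_edge and abs(dot_y - note.y) <= note.head_height / 2:
        note.duration *= Fraction(3, 2)   # augmentation dot
    elif abs(dot_x - note.x) <= note.head_width / 2 and dot_y < top_edge:
        note.staccato = True              # articulation dot above the head
    return note


quarter = Note(x=100, y=50, head_width=12, head_height=10, duration=Fraction(1, 4))
print(apply_dot(quarter, dot_x=112, dot_y=50).duration)   # dotted quarter: 3/8
```

Identical graphical primitives thus receive different meanings purely from their spatial context, which is the kind of knowledge that grammars or rule sets have to encode.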
    The last fundamental aspect of OMR systems is the transformation
of semantically recognized scores into a coding format that is able to model and
store music information. Many computer formats are available, but none of them
has been accepted as a standard. The best known formats are: MIDI (Musical
Instrument Digital Interface), NIFF (Notation Interchange File Format), SMDL
(Standard Music Description Language) and MusicXML¹. MIDI is mainly used
as an interchange format between digital instruments and computers. Although
its capability of modeling music scores is very limited (e.g. the relationships
among symbols cannot be represented), most music editors can operate
with MIDI files. NIFF was developed in 1994 to exchange data between different
music notation software. NIFF is able to describe visual and logical aspects of
music; however, it is nowadays considered obsolete. SMDL strictly separates the
visual and logical sides and is a standardized formal scheme rather than a
practical file format. MusicXML is designed especially for sharing and archiving
music sheets. It covers the logical structure as well as graphical aspects of music
scores. It is becoming increasingly popular and aims to be the standard
open format for exchanging digital sheet music. A more detailed review and
comparison of music notation file formats can be found in [6].
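
To give an impression of how a recognized note might be stored, the following sketch builds a minimal MusicXML document with Python's standard `xml.etree.ElementTree` module. It is an illustration only: the function name is hypothetical, the score contains a single whole note in 4/4 time with a treble clef, the version attribute is an assumption, and a real export would add further elements (key signature, document type declaration, layout information).

```python
import xml.etree.ElementTree as ET


def single_note_score(step="C", octave=4):
    """Return a minimal partwise MusicXML score holding one whole note."""
    score = ET.Element("score-partwise", version="3.0")

    # Every part must be declared in the part list and referenced by id.
    part_list = ET.SubElement(score, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Music"

    part = ET.SubElement(score, "part", id="P1")
    measure = ET.SubElement(part, "measure", number="1")

    attributes = ET.SubElement(measure, "attributes")
    ET.SubElement(attributes, "divisions").text = "1"  # units per quarter note
    time = ET.SubElement(attributes, "time")
    ET.SubElement(time, "beats").text = "4"
    ET.SubElement(time, "beat-type").text = "4"
    clef = ET.SubElement(attributes, "clef")
    ET.SubElement(clef, "sign").text = "G"
    ET.SubElement(clef, "line").text = "2"

    note = ET.SubElement(measure, "note")
    pitch = ET.SubElement(note, "pitch")
    ET.SubElement(pitch, "step").text = step
    ET.SubElement(pitch, "octave").text = str(octave)
    ET.SubElement(note, "duration").text = "4"         # four quarter notes
    ET.SubElement(note, "type").text = "whole"

    return ET.tostring(score, encoding="unicode")


print(single_note_score("E", 5))
```

The pitch is stored as a logical value (step and octave), while the note type records the graphical appearance, reflecting the combination of logical and graphical information described above.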


3     Practical Challenges
Despite the fact that OMR systems have been researched thoroughly over the
last few decades and even several commercial tools exist, the practical results are
still far from ideal [35]. Proposed techniques are typically tailored to different
properties of music scores, which makes them difficult to combine in one general
OMR system robust enough to overcome all the practical issues. In this section,
we focus on the reasons that make OMR systems challenging in practice and
we also discuss some open problems of the research area.
¹ http://www.musicxml.com/

3.1   Preprocessing
Preprocessing is the initial step of all OMR systems and therefore affects the
subsequent stages. However, no goal-directed studies investigating the impact
of this phase on the recognition have been carried out [35]. Binarization often
produces artifacts and its advantages in the complete OMR process are not
clear. There have been a few attempts to use prior knowledge when performing
binarization [30]. Such an algorithm extracts content-related information from
a grayscale image and uses it to guide the binarization. Cardoso et al. [12]
encourage the idea of using grayscale images rather than binary ones.
Special care must also be given to highly degraded music scores [9].
    In our opinion, it is also worth considering the possibility of analyzing the
color information when processing handwritten scores with preprinted staff lines,
because the color of the composer's ink may differ slightly from the color of the
staff lines. This could result in a more efficient staff removal algorithm.
This and similar image analysis topics are among our research interests.

3.2   Music Notation Inherent Problems
Music notation itself allows many difficult-to-process variants and possibilities
of music representation, which are typically responsible for serious recognition errors. Two
different practical difficulties are shown in Figure 4. The long curves connecting
notes of distinct pitches (slurs) can have an arbitrary shape, and thus represent
a great challenge for OMR systems. The last bar of the example presents
another difficulty: the notes pass to another staff, while their beams cross
(moreover, they overlap the crescendo sign and the slur).
    There are plenty of similar problematic properties in CMN, for example:
a smaller staff placed above the main staff indicating how a part of music can
be alternatively played (ossia), simplifications for a better human readability
that can be interpreted ambiguously (e.g. omitting the number 3 in triplets or
alternating the left and right hands across the staves in the piano literature)
or ornamental note groups that do not fit the prescribed meter. It should
also be noted that not all notation formats are able to represent such features.
Nevertheless, these kinds of difficulties are not exceptional in real music
sheets and hence cannot be ignored in a practical OMR system.

3.3   Handwritten Scores
Handwritten music sheets produce specific kinds of problems. Although they
are also addressed in the literature [1, 17, 27, 36, 37, 45], the results are still
not usable for practical applications. In general, the major problems of OMR
systems are fragmented and connected (touching or overlapping) music symbols.
Handwritten music scores contain even more broken and merged symbols, which
may be part of a composer's writing style or just a consequence of quickly
made strokes. Figure 5 shows the large variability in the writing styles of four different
composers. These facts make the recognition of handwritten scores complicated.


    Fig. 4: Example of variations in notation (from Maurice Ravel’s Scarbo).




       Fig. 5: An example of composer variability in handwritten scores.



4    Open Problems

One of the most important open issues in OMR research is the lack of
an available ground-truth database that could serve as a benchmark. Such a
data set would contain a large number of music scores of different types and
qualities (clean scans, photocopies, degraded manuscripts, etc.) together with
their ground-truth representation in a uniform notation format. Compiling
such a corpus is extremely time-consuming, because the music sheets have to be
processed by hand. Available data sets (e.g. [30]) are typically very limited or
designed only for specific tasks [35]. One possible solution would be to design an
automatic method able to procedurally simulate different writing styles
and paper degradation levels from a given notation file. Available electronic
scores could then be easily transformed into images of different qualities.
    Another significant problem is the absence of common methodologies and
metrics for comparing the results of OMR systems. This is a more
complicated issue than it might seem at first glance, because OMR systems can target
different goals (audio playback, score archiving, ...) and the outputs can be
stored in very dissimilar formats. However, performance evaluation and related
topics have also been studied in the literature. For example, Szwoch [41] proposed
a method able to compare and evaluate the results of recognition systems stored
in MusicXML format. More on this topic can be found in [5, 11].
    In addition, we think that music knowledge should be incorporated more deeply to
support the recognition and reconstruction processes, for example by employing
advanced analysis of music harmony or by building a composer-adaptive system
(adaptive to the composer’s writing style as well as to the music style). To the
best knowledge of the authors, no studies concerning these or similar topics exist.
Together with the image analysis issues, this is one of the subjects on which we
would like to focus our research.


5    Conclusion

During the last decades, OMR has been actively studied and much has been
achieved. Even so, the problem is not solved and represents a great
challenge in many ways. Possible OMR applications are still relevant today,
which keeps the research area constantly growing.
   In this article, we have introduced the OMR field, its main goals and practical
applications. We have also presented an overview of the most common method-
ologies including the idea of a generalized framework. The most delicate and
challenging problems that all OMR systems have to face have been discussed as
well. We hope that our contribution helps to motivate researchers, because
there are many demanding problems waiting to be solved.


References

 [1] Alirezazadeh, F., Ahmadzadeh, M.R.: Effective staff line detection, restoration
     and removal approach for different quality of scanned handwritten music sheets.
     Journal of Advanced Computer Science and Technology 3(2), 136–142 (2014)
 [2] Bainbridge, D.: Extensible Optical Music Recognition. Ph.D. Thesis, Department
     of Computer Science, University of Canterbury, Christchurch, NZ (1997)
 [3] Bainbridge, D., Bell, T.: The Challenge of Optical Music Recognition. Computers
     and the Humanities 35(2), 95 – 121 (2001)
 [4] Bellini, P., Bruno, I., Nesi, P.: Optical music sheet segmentation. In: Web Deliv-
     ering of Music, 2001. Proceedings. First International Conference on. pp. 183–190
     (Nov 2001)
 [5] Bellini, P., Bruno, I., Nesi, P.: Assessing Optical Music Recognition Tools. Com-
     put. Music J. 31(1), 68–93 (Mar 2007)
 [6] Bellini, P., Nesi, P.: Modeling Music Notation in the Internet Multimedia Age.
     In: George, S.E. (ed.) Visual Perception of Music Notation: On-Line and Off Line
     Recognition, pp. 272–303. IGI Global (2004)
 [7] Billinge, D., Addis, T.: Towards Constructing Emotional Landscapes with Music.
     In: George, S.E. (ed.) Visual Perception of Music Notation: On-Line and Off Line
     Recognition, pp. 227–271. IGI Global (2004)
 [8] Blostein, D., Baird, H.S.: A Critical Survey of Music Image Analysis. In: Baird,
     H.S., Bunke, H., Yamamoto, K. (eds.) Structured Document Image Analysis, pp.
     405–434. Springer Berlin Heidelberg (1992)
 [9] Burgoyne, J.A., Pugin, L., Eustace, G., Fujinaga, I.: A Comparative Survey of
     Image Binarisation Algorithms for Optical Recognition on Degraded Musical
     Sources. pp. 509–512. Austrian Computer Society (2007)
[10] Byrd, D.: Music Notation by Computer. Ph.D. thesis, Indiana University, Com-
     puter Science Department (1984)
[11] Byrd, D., Simonsen, J.G.: Towards a Standard Testbed for Optical Music Recog-
     nition: Definitions, Metrics, and Page Images. University of Copenhagen, Copen-
     hagen (2013)
[12] Cardoso, J., Rebelo, A.: Robust Staffline Thickness and Distance Estimation in
     Binary and Gray-Level Music Scores. In: Pattern Recognition (ICPR), 2010 20th
     International Conference on. pp. 1856–1859 (Aug 2010)
[13] Carter, N.: Segmentation and Preliminary Recognition of Madrigals Notated in
     White Mensural Notation. Machine Vision and Applications 5(3), 223–229 (1992)
[14] Coüasnon, B., Camillerapp, J.: Using grammars to segment and recognize music
     scores. International Association for Pattern Recognition Workshop on Document
     Analysis Systems pp. 15–27 (1994)
[15] Dalitz, C., Droettboom, M., Pranzas, B., Fujinaga, I.: A Comparative Study
     of Staff Removal Algorithms. Pattern Analysis and Machine Intelligence, IEEE
     Transactions on 30(5), 753–766 (May 2008)
[16] Droettboom, M., Fujinaga, I., MacMillan, K.: Optical Music Interpretation. In:
     Caelli, T., Amin, A., Duin, R., de Ridder, D., Kamel, M. (eds.) Structural,
     Syntactic, and Statistical Pattern Recognition, Lecture Notes in Computer Science,
     vol. 2396, pp. 378–387. Springer Berlin Heidelberg (2002)
[17] Fornés, A., Lladós, J., Sánchez, G.: Primitive Segmentation in Old Handwritten
     Music Scores. In: Proceedings of the 6th International Conference on Graphics
     Recognition: Ten Years Review and Future Perspectives. pp. 279–290.
     Springer-Verlag, Berlin, Heidelberg (2006)
[18] Fremerey, C., Müller, M., Kurth, F., Clausen, M.: Automatic Mapping of Scanned
     Sheet Music to Audio Recordings. In: Bello, J.P., Chew, E., Turnbull, D. (eds.)
     ISMIR 2008, 9th International Conference on Music Information Retrieval, Drexel
     University, Philadelphia, PA, USA, September 14-18, 2008. pp. 413–418 (2008)
[19] Fujinaga, I.: Optical music recognition using projections. M.A. Thesis (1988)
[20] Fujinaga, I.: Staff Detection and Removal. In: George, S.E. (ed.) Visual Perception
     of Music Notation: On-Line and Off Line Recognition, pp. 1–39. IGI Global (2004)
[21] Gerou, T., Lusk, L.: Essential Dictionary of Music Notation. Alfred Music
     Publishing (1996)
[22] Göcke, R.: Building a System for Writer Identification on Handwritten Music
     Scores (2003)
[23] Jones, G., Ong, B., Bruno, I., Ng, K.: Optical Music Imaging: Music Document
     Digitisation, Recognition, Evaluation, and Restoration. Interactive Multimedia
     Music Technologies pp. 50–79 (2008)
[24] Kassler, M.: Optical Character Recognition of Printed Music: A Review of Two
     Dissertations. Perspectives of New Music 11 (1972)
[25] Matsushima, T.: Automated recognition system for musical score: The vision
     system of WABOT-2. Bulletin of Science and Engineering Research Laboratory
     (1985)
[26] Ng, K.C., Cooper, D., Stefani, E., Boyle, R.D., Bailey, N.: Embracing the
     Composer: Optical Recognition of Handwritten Manuscripts. In: Proceedings of the
     International Computer Music Conference. pp. 500–503 (1999)
[27] Ng, K.: Optical Music Analysis for Printed Music Score and Handwritten Music
     Manuscript. In: George, S.E. (ed.) Visual Perception of Music Notation: On-Line
     and Off Line Recognition, pp. 108–127. IGI Global (2004)
[28] Niblack, W.: An Introduction to Digital Image Processing. Strandberg Publishing
     Company, Birkeroed, Denmark (1985)
[29] Otsu, N.: A Threshold Selection Method from Gray-Level Histograms. Systems,
     Man and Cybernetics, IEEE Transactions on 9(1), 62–66 (Jan 1979)
[30] Pinto, T., Rebelo, A., Giraldi, G., Cardoso, J.: Music Score Binarization Based
     on Domain Knowledge. In: Vitrià, J., Sanches, J., Hernández, M. (eds.) Pattern
     Recognition and Image Analysis, Lecture Notes in Computer Science, vol. 6669,
     pp. 700–708. Springer Berlin Heidelberg (2011)
[31] Prerau, D.: Computer pattern recognition of standard engraved music notation.
     Ph.D. Dissertation, Massachusetts Institute of Technology (1970)
[32] Pruslin, D.: Automatic recognition of sheet music. Sc.D. Dissertation,
     Massachusetts Institute of Technology (1966)
[33] Read, G.: Music Notation: A Manual of Modern Practice. Taplinger Publishing
     Company (1979)
[34] Rebelo, A., Capela, G., Cardoso, J.: Optical recognition of music symbols: A
     comparative study. International Journal on Document Analysis and Recognition
     (IJDAR) 13(1), 19–31 (2010)
[35] Rebelo, A., Fujinaga, I., Paszkiewicz, F., Marcal, A., Guedes, C., Cardoso, J.:
     Optical music recognition: state-of-the-art and open issues. International Journal
     of Multimedia Information Retrieval 1(3), 173–190 (2012)
[36] Rebelo, A.M.: Robust Optical Recognition of Handwritten Musical Scores based
     on Domain Knowledge. Ph.D. thesis, University of Porto (2012)
[37] Roach, J.W., Tatem, J.E.: Using Domain Knowledge in Low-level Visual
     Processing to Interpret Handwritten Music: An Experiment. Pattern Recognition 21(1),
     33–44 (Jan 1988)
[38] Sezgin, M., Sankur, B.: Survey over image thresholding techniques and
     quantitative performance evaluation. Journal of Electronic Imaging 13(1), 146–168 (2004)
[39] Stone, K.: Music Notation in the Twentieth Century: A practical guidebook.
     Norton, New York (1980)
[40] Szwoch, M.: Guido: A Musical Score Recognition System. In: Document Analysis
     and Recognition, 2007. ICDAR 2007. Ninth International Conference on. vol. 2,
     pp. 809–813 (Sept 2007)
[41] Szwoch, M.: Using MusicXML to Evaluate Accuracy of OMR Systems. In:
     Stapleton, G., Howse, J., Lee, J. (eds.) Diagrammatic Representation and Inference,
     Lecture Notes in Computer Science, vol. 5223, pp. 419–422. Springer Berlin
     Heidelberg (2008)
[42] Tardón, L.J., Sammartino, S., Barbancho, I., Gómez, V., Oliver, A.: Optical
     Music Recognition for Scores Written in White Mensural Notation. J. Image Video
     Process. 2009, 6:3–6:3 (Feb 2009)
[43] Timofte, R., Van Gool, L.: Automatic Stave Discovery for Musical Facsimiles.
     In: Lee, K., Matsushita, Y., Rehg, J., Hu, Z. (eds.) Computer Vision – ACCV
     2012, Lecture Notes in Computer Science, vol. 7727, pp. 510–523. Springer Berlin
     Heidelberg (2013)
[44] Trier, O., Jain, A.: Goal-Directed Evaluation of Binarization Methods. Pattern
     Analysis and Machine Intelligence, IEEE Transactions on 17(12), 1191–1201 (Dec
     1995)
[45] Wolman, J., Choi, J., Asgharzadeh, S., Kahana, J.: Recognition of Handwritten
     Music Notation. Proceedings of the International Computer Music Conference pp.
     125–127 (1992)