<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Web application for recognition of mathematical formulas?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jan Stria</string-name>
          <email>stria.jan@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniel Prºu</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Charles University</institution>
          ,
          <addr-line>Malostransk</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Czech Technical University, Center for Machine Perception Karlovo n</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>am.</institution>
          <addr-line>13, 121 35 Prague 2</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <fpage>47</fpage>
      <lpage>54</lpage>
      <abstract>
        <p>We present a system for on-line mathematical singer and V. Hlavac in [15]. It is a general framework, formulas recognition. The main principles of the method suitable for recognition of objects exhibiting a rich we applied for recognition are explained. It is based on the structure. It has been also applied for musical structural construction paradigm and utilizes a sort of two- scores [14] and electric circuits [5]. The power of the dimensional grammars called coordinate grammars. Gram- structural construction is in driving symbols segmenmar productions are used to model spatial relationships tation by the structural analysis, avoiding thus erdaemvoenlogpemdaatsheamwaetbicaalppslyimcabtoiolsn. iTnhHeTsMysLte5m. Witseeglfivheadsebtaeeilns rors that could be normally done during a standalone on its client-server architecture and user interface. We also segmentation. Some of the known methods solve this discuss advantages of the chosen approach. by error recovery techniques, but such techniques are more complicated and less natural. Our work is incremental. We started with the sim1 Introduction plistic subset of formulas and the proof-of-concept recognition system implemented in Java [10, 11]. Later Mathematical formulas recognition is a task of grow- we adopted the system to on-line inputs written on ing importance. There are two domains in which it a tablet and we also extended the set of supported can be applied. The ¯rst one covers scanned math- formulas [12]. ematical texts (books or notes) we want to convert This paper reports our recent results. We focus into an electronic form automatically. We speak about on two topics. Principles of the developed recognio®-line formulas recognition. Another situation arises tion method are outlined in Section 2. We introduce when we need to support entering formulas into ap- a revised formalism of the previously used two-dimenplications in a natural way, i.e. by drawing them by sional grammars { so called coordinate grammars. The hand. This can be done on a tablet or simply using grammars model relationships among formula syma mouse. In such a case, we do not recognize raster bols. The system itself has been completely rewritten images but rather sets of strokes and we speak about in C# and includes several improvements. The new on-line recognition. priority was also make the system accessible by a web Several approaches to formulas recognition have application in which the user draws a formula and rebeen described in the literature [20, 8, 6]. A tax- ceives the output in TeX or MathML format. The web onomy of the methods can be found in [2]. application architecture, implementation in HTML5 Only few published papers o®er a publicly avail- and user interface design are described in Section 3. able implementation for evaluation. We mention three known applications giving good results. Infty 2 Method proposed for formulas project [17, 16] targets primarily o®-line recognition, however, it includes a module for on-line recognition recognition as well. xMathJournal [27] is a commercial software product which serves as a sophisticated calculator for handwritten mathematics. Microsoft Windows 7 provides a simple tool called Math Input Panel. It allows to draw a formula, recognize it, copy the result into the clipboard and paste it to other programs (e.g. Microsoft Word). The approach we chose is motivated by the structural construction paradigm presented by M.I. Schle-</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The method consists of two phases which are
common in pattern recognition { elementary symbols
detection and structural analysis. In our case, the ¯rst
phase does not make any ¯nal decision what the
elementary symbols are. Instead of it, only candidates to
such symbols are selected. It is completely up to the
structural analysis to decide then, which of the
candidates are really parts of the formula and what is their
meaning in the formula structure.</p>
      <p>We have chosen grammar based structural
analysis, since it allows to express the syntax of the whole
? The authors were supported by the Grant Agency of the</p>
      <p>
        Czech Republic under project P103/10/0783.
formula and ¯ts thus well to the structural construc- 1
tion pattern. Non-grammar based approaches usually
recognize local con¯gurations among symbols only and 3
transform them to graphs. We introduce a (2D)
coordinate grammar which can be viewed as an extension symbol variable V num. 6 frac. line minus square root
of the grammar described by R. Anderson in his sem- stroke(s) 1 3 2 2 1, 2
inal work [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The productions have context-free form
and are assigned by spatial constraints on the right- Fig. 1. Elementary symbol candidates: The input is a
se(hnaontdtshideetoeple-mdoewntns.oTnehelipkaerisnin[g1]i)s, asibnoctetoimta-ullpowprsobceests- caqiursecynlmec.ebToolhf.ethtraebelestlriostksess,uebagcrhoudpesnootfedstrboykeasnruemcobgenriziendthaes
ter control the number of derived elements and reduce
them by pruning. Moreover, our formalism is stochas- knowledge of the formula structure. An example of
tic and penalty oriented { each derivation is assigned possible candidates is given in Figure 1.
by a real number, which determines its quality. In general, there are 2jSj subsets of S. It is obvious
      </p>
      <p>
        The grammar is designed to recognize structures that not all these subsets have the potential to
repreformed of terminal elements freely located in a plane. sent a symbol. For performance reasons, we propose
This is something di®erent comparing to picture lan- several strategies that select and evaluate only some
guages [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] consisting of rectangular grids over ¯nite of the subsets:
alphabets. We have to deal with the time
complexity of parsing. In general, it can grow exponentially in
the number of elementary symbols. The well known
Cocke-Younger-Kasami algorithm [
        <xref ref-type="bibr" rid="ref19 ref4">4, 19</xref>
        ] recognizes
context-free languages in time O ¡n3¢ and can be
generalized to a sort of 2D context-free grammars [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>It both cases, the algorithm can rely on the
sequential ordering of characters in strings or pictures. When
working with coordinate grammars, we are forced to
restrict the set of sentential forms allowed to be
derived during the process to some sort of continuous
areas. The task how to parse 2D structures in a plane
e®ectively has been studied by others, especially by P.</p>
      <p>
        Viola and E. Miller [
        <xref ref-type="bibr" rid="ref7 ref9">9, 7</xref>
        ]. Some of the techniques we
employ in our method coincide with their proposals.
      </p>
    </sec>
    <sec id="sec-2">
      <title>1. A reasonable restriction is to limit the number</title>
      <p>of strokes that form one symbol: jS0j · K. This
leaves O ¡jSjK ¢ subsets to be evaluated by the
OCRtool. K = 4 should fully comply with all the
symbols in VT .
2. Construct a graph G where vertices correspond to
strokes and two vertices are connected i® the two
related strokes are ,,close enough" (various metrics
can be applied: the smallest distance between two
points, the greatest distance, the average distance,
etc.). Consider only those S0 µ S, where jS0j · K
and all vertices in the graph that correspond to
strokes in S0 are located in the same component
of connectivity.
3. Label each edge in G by the distance between the
related strokes. Find minimum spanning tree T .</p>
      <p>Search for subsets S0 using T instead of G.
4. If we assume the user does not make any
corrections in the already written symbols, we can even
consider only the subsets of consecutive strokes
S0 = fsi; si+1; : : : ; si+kg, where k &lt; K.</p>
    </sec>
    <sec id="sec-3">
      <title>By a stroke s we mean a ¯nite, non-empty sequence</title>
      <p>of points in a plane, i.e. s = ((x1; y1); : : : ; (xk; yk)),
where xi; yi 2 R, k ¸ 1.</p>
      <p>Let S = (s1; : : : ; sn) be a sequence of strokes
entered to the system by the user. For i &lt; j, we assume If the OCRtool ¯nds a match of a subset S0 to some
si has been entered before sj . Moreover, we assume symbol, it provides a record consisting of the following
the following condition is ful¯lled: items:
2.1</p>
      <sec id="sec-3-1">
        <title>Elementary symbols detection</title>
        <p>{ one stroke is not a part of two or more di®erent
symbols in the input formula</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>The goal of the elementary symbols detection phase is</title>
      <p>to ¯nd each subset S0 µ S which, taken separately, is
recognized by an Optical Character Recognition tool
(OCRtool) as an elementary symbol from a set of
known symbols denoted VT (alphabet letters,
numbers, mathematical operators, Greek letters, etc.).
Note that there can be even more interpretations for
one subset S0, the OCRtool can return several
hypothesis for it. The detection is performed without any
{ recognized symbol identi¯er l 2 VT
{ a penalty p expressing reliability of the recognition
(p 2 R+, a lower value implies a bigger con¯dence)
{ metrics M of the recognized symbol (a record
storing base line and mean line given relative within
the bounding box, see Figure 2)</p>
    </sec>
    <sec id="sec-5">
      <title>We denote tuple (S0; l; p; M ) as a terminal unit.</title>
      <p>2.2</p>
      <sec id="sec-5-1">
        <title>Coordinate grammars</title>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Before we can describe how mathematical relationships are modeled by a coordinate grammar, we need</title>
      <p>ascent
mean line
base line
descent
to de¯ne a structure to represent partial derivations
(an analogy to the sentential form). We denote this
structure a labeled group of strokes (over an input
sequence of strokes S, a set of elementary symbols {
terminals VT and a set of non-terminals VN ). It is a tuple
(S1; l1; p1; T ) such that S1 µ S, S1 6= ;, l1 2 VN [ VT ,
p1 2 R+ and T = (S2; l2; p2; M ) is a terminal unit,
where S2 µ S1 and l2 2 VT . The interpretation is as
follows:
{ there is a sequence of derivations which result in
assigning l1 to S1
{ penalty p expresses a con¯dence in the derivations
{ T determines so called leading symbol in S1; it is
a terminal recognized by the OCRtool as l2,
considered as the main (root) symbol among other
symbols in S1; the leading symbol is determined
by applied grammar productions
the mutual positions of leading symbols and
bounding boxes of labeled groups of strokes corresponding
to Si's. Figure 3 shows a constraint function
evaluation example for the production that models a binary
operation of the form:
BinOperation ! Expression ¯ BinOperator ¯ Term:</p>
    </sec>
    <sec id="sec-7">
      <title>The parsing algorithm starts with terminal units</title>
      <p>
        produced by the terminals detection phase (they are
A coordinate grammar G is a tuple (VT ; VN ; I; P ), transformed to labeled groups of strokes ¯rst). Larger
where I 2 VN is the initial non-terminal and P is groups of strokes are incrementally derived then. The
a set of productions of the form input is recognized i® (S; I; p; T ) is among the
derivations. When there are more such groups, the resulting
N ! A1 ¯ A2 ¯ : : : ¯ Ak; (1) one is the group with the lowest penalty. More details
on the parsing process can be found in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
where N 2 VN and each Ai 2 fVT [ VN g. Moreover,
each production is assigned by three functions: spatial
constraint ¾, penalty ¼ and leading symbol selector ¹. 2.3 Notes on method implementation
To explain their meaning and de¯ne their parameters One of the weaknesses in the former Java
implemenand functional values, let us consider k labeled groups tation was the quality of OCR results. We used own
of strokes S1; : : : ; Sk such that implementation based on the elastic matching
technique [
        <xref ref-type="bibr" rid="ref18 ref3">18, 3</xref>
        ]. The OCRtool in the current version is
Si = (Si; li; pi; Ti): a combination of two methods. We bene¯t partially
from the OCR provided by Microsoft .NET API which
Production (1) can be applied to derive is quite robust. However, it does not support
recogÃ k ! nition of all mathematical symbols, thus we are still
S = [ Si; N; p; T forced to maintain and use our OCR to detect certain
i=1 symbols. Nevertheless, we have made additional
improvements to it and the overall OCRtool accuracy is
i® the following conditions are ful¯lled: better now.
      </p>
      <p>{ Si's are pairwise disjunct The form of ¾ and ¼ is designed based on a
sta{ 8i 2 f1; : : : ; kg : li = Ai tistical model. Mutual placements of labeled groups of
{ ¾(S1; : : : ; Sk) = true strokes are expressed by a normal distribution which
{ p = ¼(S1; : : : ; Sk) + Pk of parameters are extracted from the training data for
{ T = ¹(T1; : : : ; Tk) i=1 pi each production. This is another progress comparing
to Java version where we set up constraints manually.</p>
      <p>Boolean function ¾ determines whether the produc- Grammar productions support constructs as
subtion can be applied at all, ¼ penalizes the application scripts, superscripts, brackets, common unary and
biof the production and ¹ selects the resulting leading nary operators, fractions, integrals, sums,
exponentiasymbols. In the implementation, ¾ is strongly based on tion, square roots, functions and binomial coe±cients.</p>
    </sec>
    <sec id="sec-8">
      <title>The related syntax is expressed by 200 productions stored in a text ¯le. A sample from the ¯le follows.</title>
      <p>BinOperation-&gt;BinOp|Expression@L|Term@R
BinOp-&gt;[+]
BinOp-&gt;[-]
Fraction-&gt;[line]|Expression@T|Expression@B
Power-&gt;Factor|Expression@TR
Root-&gt;[root]|Expression@TL|Expression@I</p>
      <p>Each line represents one production. On the
lefthand side of the production, there is a source
nonterminal symbol. On the right-hand side, there is a
mixture of nonterminal and terminal target symbols
separated by vertical lines. To distinguish the
nonterminal and terminal symbols, we enclose the
terminals in brackets. The ¯rst of the target symbols is
a mandatory leading symbol. Potential following
symbols are always denoted by special characters
determining their spatial relation to the leading target
symbol (e.g. @L stays for left which means that the
appropriate symbol is located on the left with respect to the
leading target symbol).</p>
      <p>The parameters de¯ning ¾, ¼ and ¹ more precisely
are saved in a separate data ¯le.
3</p>
      <p>Web application
web browsers found in Android and iOS operating
systems. However despite of some optimizations editing
of formulas in these mobile web browsers in not
always as smooth as in desktop browsers. It is caused by
a poorer performance of mobile devices which is not
su±cient for a frequent rendering of formulas. But as
the performance of these mobile devices grows rapidly
there is a well-founded reason to believe this will go
better soon.</p>
      <p>Our application for formulas recognition can be
started using immediately. This is especially
important in that case when the potential user has only
several formulas to recognize and installing a new
software would cost more time than writing the formula
in TeX of MathML format by a hand. We have also
tried to make the user interface simple and friendly so
usage of our application should be intuitive enough.</p>
      <p>The recognition engine itself runs at the server so it
can be easily updated. It is especially important in the
development phase when we can continuously provide
enhanced versions of the recognizer. We can also
easily collect all formulas supplied by users for a further
analysis which helps us to improve the recognition.</p>
      <p>The disadvantage of choosing an implementation
as a web application is a permanent need of an
Internet connection to work with it. In addition this
connection should have a low latency to ensure a true
real-time recognition during a formula insertion. On
the other hand, high throughput of the connection is
not demanded as not much data is transferred between
the client and the server.</p>
      <p>We have developed two web applications, both
utilizing the well known client-server architecture. The ¯rst
application serves for sample mathematical formulas
collection. It allows to insert a mathematical formula
together with a name of the writer and store it at 3.1 jQueryInk widget
the server. Formulas collected by this application are
then used to train and test our recognizer. The second For both the data collection application and the
recogapplication provides a functionality of the recognizer nition application the core part of the web user
initself. It continuously recognizes the formula during terface is a canvas that allows to insert
mathematiits insertion and provides the recognition result to the cal formulas in the form of strokes. It supports
writuser. ing new strokes, selecting and editing previously
writ</p>
      <p>Both these applications comprise an user inter- ten groups of strokes, erasing strokes or clearing the
face running in a web browser at the client side that whole canvas. The component representing the canvas
communicates with web services at the server side. was developed in HTML5 and Javascript as a jQuery
The web interface was developed in HTML5 [22] and UI [24] widget. It is called jQueryInk [23] and it is
pubJavascript. The web services were written in C# pro- licly available for download as an open-source project.
gramming language using WCF (Windows Communi- The jQueryInk widget internally uses a &lt;canvas&gt;
cation Foundation) technology and they run on Mi- element introduced by HTML5. It has a speci¯ed
crosoft Windows Server 2008 R2. The client and the width and height and provides a Javascript interface to
server communicate via Ajax. The main reasons for access and modify its surface dynamically. The
interchoosing such an architecture were accessibility and face comprises functions for drawing and
transformaeasy maintenance. tion of 2D shapes and bitmaps. These functions are</p>
      <p>Both applications are accessible from an every used by our widget to render strokes and additional
modern web browser supporting HTML5. They were annotations (as selection polygon). The jQueryInk
successfully tested in the latest versions of Internet widget can be easily created over a speci¯ed &lt;div&gt;
Explorer, Mozilla Firefox, Google Chrome, Apple Sa- element by creating a jQuery object representing the
fari and Opera. They also work in the default mobile element and calling ink() method on it.
3.2</p>
      <sec id="sec-8-1">
        <title>Data collection</title>
        <p>&lt;div id="myInk"&gt;&lt;/div&gt;
$("#myInk").ink( /* settings */ );
many vertices of a polyline approximating the stroke
lie inside the selection polygon. To speed up these
tests, results from the previous selection polygon tests
are reused. Finally, when at least 2/3 of stroke
approximating vertices lie in the selection polygon, the stroke
is selected. The described algorithm allows the user to
draw a shape of a selection region freely. This is
important because using a simple rectangular selection
region would not be su±cient due to a spatial
complexity of the formula. The described optimizations
ensure then a smooth selection.</p>
        <p>Calling the ink() method creates a new &lt;canvas&gt;
element inside the &lt;div&gt; element and initializes a new
jQueryInk object with the settings speci¯ed by the
method parameters. These settings can be changed
later by calling the ink() method again. The most
important setting is a mode in which the widget is
operating. Depending on this mode strokes are
written, erased or selected while the mouse interacts with
the &lt;canvas&gt; element. The operating mode can be set
independently for left and right mouse buttons. The
settings parameters also specify handler functions for
events generated by the widget which are ¯red when
something important happens as writing a new stroke,
erasing a stroke etc. Depending on the usage of the
introduced jQueryInk widget, these handler functions
can react on the events properly.</p>
        <p>The jQueryInk widget also de¯nes supporting data
structures representing strokes, their collections etc.
These data can be accessed either directly through
the widgets interface or as a parameters of the
previously described events. The handler functions can
utilize the data, e.g. send it to the recognizer located
on the server.</p>
        <p>As it has been stated, our jQueryInk widget can
operate in three modes:</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>The data collection application employs the described</title>
      <p>jQueryInk widget on the client side to enable insertion
of a mathematical formula. The user is also asked to ¯ll
his name to help us identify who wrote which formula.</p>
      <p>Then the formula can be sent and stored at the server.</p>
      <p>This is archieved by serializing objects representing
strokes to a JSON format and sending them via an
Ajax request. The receiver of this request is a WCF
service which stores the strokes together with the ¯lled
username and IP address of the sender.</p>
      <p>We have already collected approximately 200
formulas. To use these data to train and test our
recognizer we have to assign some semantics to them at
¯rst. Unfortunately as far as we know there is no
stanWriting: When it is in the writing mode and the user dard ¯le format for storing handwritten mathematical
is writing a stroke, our widget has to repeatedly render formulas with annotations denoting their meaning. So
the stroke. However, because rendering of all strokes we have developed our own XML ¯le format based
would be time consuming, only added points are ren- on the Presentation MathML [28] which is commonly
dered. After ¯nishing writing the stroke the whole can- used for representing math in web pages and other
vas is rendered once again to avoid artifacts. documents. The ¯rst part of our annotated MathML
¯le comprises coordinates and timestamps of points
Erasing: In the erasing mode there are two meth- forming individual handwritten strokes. Each of these
ods how to determine which strokes should be erased strokes is also assigned an unique numeric identi¯er
when the user moves a mouse. The ¯rst method serves that can be referenced in the second part of the
docfor testing short strokes comprising only a few points ument that comprises MathML description of the
forand having a small bounding rectangle. These strokes mula. To allow referencing strokes we have extended
are tested for proximity to a mouse path. The second MathML by tags denoting single symbols and having
method tests longer strokes by checking their intersec- identi¯ers of appropriate strokes as their attributes.
tion with a mouse path. We have to incorporate both Before the MathML description of the formula can
these methods because its quite di±cult for the user be annotated by identi¯ers of strokes, the MathML
to intersect short strokes such as dots. notation itself has to be created. To accomplish that
we have developed a fairly simple desktop application.</p>
      <p>Selection: When in the selection mode the closed path It loads the strokes as they were stored at the server
of a mouse is continuously approximated by a poly- and passes them to the Math Input Control [25] which
gon which determines the selection region. Also the is a COM control in Microsoft Windows 7. This
constrokes themselves are approximated by polylines to trol is a panel on which surface the formula can be
reduce the count of their vertices. Each time the selec- written by hand. The control continuously tries to
rection polygon changes, its bounding rectangle its com- ognize the formula and shows the recognition result.
puted. This rectangle is then tested for intersection Simply said, the control does almost the same thing
with bounding rectangles of all strokes to determine as our recognition application. The recognition result
potentially selected strokes. It is determined, how can be accessed as a string containing Presentation</p>
    </sec>
    <sec id="sec-10">
      <title>MathML notation. Unfortunately, the strokes to be</title>
      <p>recognized cannot be passed programmatically to the
control. Thus we developed an algorithm that utilizes
coordinates of the loaded strokes to move a mouse
cursor over the control simulating a real user interaction.</p>
      <p>Besides obtaining a meaning of the handwritten
formula in a MathML, the presented mechanism can
also be used for a comparison of our recognizer with
the Math Input Control. Although their recognizer
performs quite well its not de¯nitely perfect. But when
used by a real user it allows to erase and rewrite
arbitrary strokes or assign a meaning to a group of strokes
by a hand. So when the recognition of automatically
written strokes fails we input it to the panel by hand Fig. 4. User interface of the recognizer: Shows the
selecusing the mentioned editing possibilities to always ob- tion (number 1) and the editing of a meaning (variable q).
tain their correct description in MathML notation. Result of the recognition is at the top.</p>
      <p>Once we have the MathML notation of the formula,
we process it by adding special tags for each symbol
and we can start annotating it. It has to be done by the TeX format, the MathJax always converts it to an
a hand as the Math Input Control gives us only one image that is then shown by the browser.
MathML string describing the whole formula. Using Our goal was to develop a truly interactive
appliour application, each symbol included in the formula cation for formulas recognition. We wanted the user
has to be selected and bound with the appropriate to see partial results of a recognition as he inserts the
symbol tag. formula. We also wanted to provide a possibility of
seThe data collection application is available at [21]. lecting a group of strokes and determining its meaning.</p>
      <p>This can help the recognizer a lot when a part of the
formula is not recognized correctly, e.g. one of the
sym3.3 Data recognition bols is recognized wrong. Then strokes forming this
symbol can be selected and the recognizer proposes all
The web interface of the formulas is shown in the alternative meanings of that group of strokes.
Assum¯gure 4. It also utilizes presented jQueryInk widget ing that the correct meaning is present among these
for strokes insertion. Besides that it also comprises alternatives, the user can choose it. Then the correct
a ¯eld showing the recognition result which is a for- meaning is assigned to that group and the recognizer
mula represented in the Presentation MathML for- has to respect it in a following recognition process.
mat. Although MathML is intended for incorporat- To achieve the described interactivity, the
ing mathematics in web pages, its support in the cur- Javascript client and the WCF service running at the
rent browsers is quite poor. Among the most used web server have to communicate a lot. Figure 5 gives us
browsers only Mozilla Firefox and Opera support it. a closer look at that communication and it also shows
Moreover, even these two browsers do not render ev- how the user a®ects it. Each time the user enters the
erything correctly. Remaining browsers (Internet web interface, a new asynchronous Ajax request is sent
Explorer, Google Chrome and Apple Safari) still do to the service. The service generates a unique session
not support it at all. However, the MathML support id and initializes a new instance of the recognizer for
is planned in future versions of all mentioned browsers. this id. The session id is then sent back to the client</p>
      <p>To ensure a correct rendering of the recognized for a further communication.
formulas in all current browsers, we incorporated the Meanwhile the user could start inserting the
forMathJax library [26]. It is written in Javascript and mula, so it has to be checked whether he performed
it serves for a correct displaying of mathematics in all some actions that need to be sent to the server. These
major web browsers. It can display mathematics in actions comprehend insertion of a new stroke, erasing
both the MathML and TeX format. While displaying an existing stroke, clearing the whole canvas, selection
mathematics in the MathML format, it automatically of a group of strokes or determining a meaning of the
detects whether the browser supports the MathML na- previously selected group of strokes. Every time one of
tively. If so, it lets the browser to render the MathML. these actions is performed by the user, it is added to
If the browser lacks a MathML support, it converts a queue of actions waiting to be sent to the server. We
the MathML to an image and lets the browser display use this queue to ensure that there is always only one
this image instead. While displaying mathematics in request from the client to the server being processed.</p>
    </sec>
    <sec id="sec-11">
      <title>When the session is initialized and the queue of actions</title>
      <p>is non-empty or every time a new action is performed
and there is no request being processed, the queue of
actions is emptied and a new request containing all
actions from the queue is sent to the server.</p>
      <p>The request with the recent actions is sent as an
asynchronous Ajax request so the user can keep editing
the formula meanwhile. The request always contains
a forementioned session id so the service could pass
the actions to the appropriate recognizer. The
recognizer updates its set of strokes and provides the best
estimation of what these strokes mean. When there
is an action denoting selection of a group of strokes,
the recognizer also o®ers all possible meanings of these
strokes to allow the user to determine their meaning.
And ¯nally, when there is an action denoting a
meaning determination of some group of strokes, the
recognizer uses it.</p>
      <p>After that, a response containing a recognition
result is sent back to the client. The response can also
contain possible meanings of a selected group of
strokes if the request comprised a strokes selection
action. The client script receives the response and shows
the recognition result. It can also show possible
meanings of a previously selected group of strokes to let the
user to select the correct meaning. Then the queue of
actions is checked and if it is non-empty a new request
is generated. This goes over and over.</p>
      <p>Besides the request for a session initialization and
the request containing a queue of performed actions,
there is also the third one. It is sent synchronously
when the user leaves our web interface and it noti¯es
the service that the sessions ends and the recognizer
can be released.</p>
      <p>Unfortunately, sending an Ajax request when
a web page is being left or a web browser is being
closed is not fully reliable. We have adopted an
another mechanism for releasing unused resources. The
service periodically checks all the sessions and if it
¯nds a session without any requests for some time, it
releases the recognizer allocated for it. This can lead to
releasing a recognizer for a session that has not been
accessed for a long time but has not truly ended at
the client side. When a request is received from such
a session, a new recognizer for a speci¯ed session id is
created and the client is asked to send all its strokes
to restore an original state of the recognizer.
4</p>
      <p>Conclusions</p>
    </sec>
    <sec id="sec-12">
      <title>We have described a grammar based approach</title>
      <p>
        to mathematical formulas recognition. Our previous
results [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ] show that the proposed method is
practical and can be implemented e®ectively. The
implementation is responsive and gives acceptable results.
It is true that some OCR errors usually occur during
the recognition and even the structural analysis does
not help to prevent them all. We have addressed this
problem in the new user interface which allows to
correct OCR results interactively.
      </p>
      <p>
        We have redesigned the architecture of the whole
system and realized it as a web application. This gives
new possibilities for further improvements. Our
priority now is to assemble a su±ciently large set of
formulas, which will include samples from all common areas
of mathematics, written by di®erent users. We want
to analyze such data and use it for tuning parameters
of the grammar productions. We also plan to publish
a database of annotated formulas to be used by others
for testing purposes. We have not found any set of
online formulas publicly available. There are only o®-line
formulas [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] or samples of single on-line characters,
so this kind of contribution could be valuable for the
community.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>R. Anderson</surname>
          </string-name>
          :
          <article-title>Syntax-directed recognition of handprinted two-dimensional mathematics</article-title>
          .
          <source>In Interactive Systems for Experimental Applied Mathematics</source>
          , pp.
          <volume>436</volume>
          {
          <issue>459</issue>
          , London,
          <year>1968</year>
          . Academic Press.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>K.F.</given-names>
            <surname>Chan</surname>
          </string-name>
          and D.Y. Yeung:
          <article-title>Mathematical expression recognition: a survey</article-title>
          .
          <source>In IJDAR</source>
          , vol.
          <volume>3</volume>
          , pp.
          <volume>3</volume>
          {
          <issue>15</issue>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. S. Hellkvist:
          <article-title>On-line character recognition on small hand-held terminals using elastic matching</article-title>
          .
          <source>Master's Thesis</source>
          , Royal Institute of Technology,
          <source>Department of Numerical Analysis and Computing Science</source>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>T.</given-names>
            <surname>Kasami</surname>
          </string-name>
          :
          <article-title>An e±cient recognition and syntax analy- 20</article-title>
          . R.
          <string-name>
            <surname>Zanibbi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Blostein</surname>
            , and
            <given-names>J.R.</given-names>
          </string-name>
          <string-name>
            <surname>Cordy</surname>
          </string-name>
          :
          <article-title>Recognizsis algorithm for context-free languages</article-title>
          . Scienti¯c Re- ing
          <source>mathematical expressions using tree transformaport AFCLR-65-758</source>
          , Air Force Cambridge Research tion.
          <source>IEEE Transactions on Pattern Analysis and MaLaboratory</source>
          , Bedford, Mass., USA,
          <year>1965</year>
          .
          <source>chine Intelligence</source>
          ,
          <volume>24</volume>
          (
          <issue>11</issue>
          ),
          <year>2002</year>
          ,
          <volume>1455</volume>
          {
          <fpage>1467</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>V.M.</surname>
          </string-name>
          <article-title>Kiyko: Recognition of objects in images of paper 21. Data collection apllication. based line drawings</article-title>
          . In Third International Conference http://mfr.aspone.cz/DataCollector.html.
          <source>on Document Analysis and Recognition</source>
          , pp.
          <volume>970</volume>
          {
          <issue>973</issue>
          ,
          <fpage>22</fpage>
          . HTML5 working draft. Montreal, Canada,
          <year>1995</year>
          . http://www.w3.org/TR/html5.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>S.</given-names>
            <surname>Lavirotte</surname>
          </string-name>
          and L.
          <source>Pottier: Mathematical formula 23</source>
          .
          <article-title>jQueryInk widget. recognition using graph grammar</article-title>
          .
          <source>In Proceedings of http://plugins.jquery.com/project/Ink. the SPIE</source>
          <year>1998</year>
          , vol.
          <volume>3305</volume>
          , p p.
          <volume>44</volume>
          {
          <issue>52</issue>
          , San Jose, CA,
          <year>24</year>
          . jQuery UI library. http://jqueryui.com.
          <year>1998</year>
          . 25. Math Input Control reference.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Narasimhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Shilman</surname>
          </string-name>
          , and P. Vi- http://msdn.microsoft.com/en-us/library/ ola : E±
          <article-title>cient geometric algorithms for parsing in two dd317324</article-title>
          .
          <source>aspx. dimensions. International Conference on Document 26</source>
          .
          <article-title>MathJax library</article-title>
          . http://www.mathjax.
          <source>org. Analysis and Recognition</source>
          , pp.
          <volume>1172</volume>
          {
          <issue>1177</issue>
          ,
          <year>2005</year>
          . 27. MathJournal 2.1.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>N.</surname>
          </string-name>
          <article-title>Matsakis: Recognition of handwritten mathematical http://www</article-title>
          .xthink.com/MathJournal.html. expressions.
          <source>Master's Thesis</source>
          , Massachusetts Institute 28.
          <article-title>MathML2 recommendation</article-title>
          . of Technology, Cambridge, MA, May
          <year>1999</year>
          . http://www.w3.org/TR/MathML2.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>E.G.</given-names>
            <surname>Miller</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.A.</given-names>
            <surname>Viola</surname>
          </string-name>
          <article-title>: Ambiguity and constraint in mathematical expression recognition</article-title>
          .
          <source>In AAAI '98/IAAI '98</source>
          , pp.
          <volume>784</volume>
          {
          <fpage>791</fpage>
          . American Association for Arti¯
          <source>cial Intelligence</source>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10. D.
          <article-title>Prºu·sa and V. Hlav¶a·c: 2D context-free grammars: Mathematical formulae recognition</article-title>
          .
          <source>In J. Holub and J. Zdarek</source>
          , (eds),
          <source>Proceedings of the Prague Stringology Conference</source>
          , pp.
          <volume>77</volume>
          {
          <issue>89</issue>
          , Czech Technical University in Prague, Czech Republic,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. D.
          <article-title>Prºu·sa and V. Hlava¶·c: Mathematical formulae recognition using 2d grammars</article-title>
          .
          <source>In Proceedings of the 9th International Conference on Document Analysis and Recognition</source>
          , vol. II, pp.
          <volume>849</volume>
          {
          <issue>853</issue>
          ,
          <string-name>
            <surname>Curitiba</surname>
          </string-name>
          , Brazil,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12. D.
          <article-title>Prºu·sa and V. Hlava¶·c: Structural construction for on-line mathematical formulae recognition</article-title>
          .
          <source>In Proceedings of the Iberoamerican Conference on Pattern Recognition</source>
          , pp.
          <volume>317</volume>
          {
          <fpage>324</fpage>
          . Springer Verlag,
          <year>September 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. A. Rosenfeld:
          <article-title>Picture languages - formal models of picture recognition</article-title>
          . Academic Press, New York,
          <year>1979</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>B.</given-names>
            <surname>Savchynsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.I.</given-names>
            <surname>Schlesinger</surname>
          </string-name>
          , and
          <string-name>
            <surname>M.O.</surname>
          </string-name>
          <article-title>Anochina: Parsing and recognition of printed notes</article-title>
          .
          <source>In Proceedings of the Conference Control Systems and Computers</source>
          , pp.
          <volume>30</volume>
          {
          <issue>38</issue>
          , Kiev, Ukraine,
          <year>2003</year>
          . in Russian, preprint in English available.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>M.I.</given-names>
            <surname>Schlesinger</surname>
          </string-name>
          and
          <string-name>
            <given-names>V.</given-names>
            <surname>Hlava</surname>
          </string-name>
          <article-title>¶·c: Ten lectures on statistical and structural pattern recognition</article-title>
          .
          <source>Vol. 24 of Computational Imaging and Vision</source>
          . Kluwer Academic Publishers, Dordrecht, The Netherlands,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16. M.
          <article-title>Suzuki: Infty project</article-title>
          . http://www.inftyproject.org/en/index.html.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>M. Suzuki</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Tamari</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Fukuda</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Uchida</surname>
          </string-name>
          , and T. Kanahori:
          <article-title>Infty: an integrated ocr system for mathematical documents</article-title>
          .
          <source>In DocEng '03: Proceedings of the 2003 ACM Symposium on Document Engineering</source>
          , pp.
          <volume>95</volume>
          {
          <issue>104</issue>
          , New York, NY, USA,
          <year>2003</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>C.C.</surname>
          </string-name>
          <article-title>Tappert: Cursive script recognition by elastic matching</article-title>
          .
          <source>IBM J. Res. Dev.</source>
          , 26,
          <year>November 1982</year>
          ,
          <volume>765</volume>
          {
          <fpage>771</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>D.H. Younger</surname>
          </string-name>
          <article-title>: Recognition of context-free languages in time n3</article-title>
          .
          <source>Information and Control</source>
          ,
          <volume>10</volume>
          ,
          <year>1967</year>
          ,
          <volume>189</volume>
          {
          <fpage>208</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>