<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Angela Wigmore, Gordon Hunter, Eckhard Pflügel, James Denholm-Price, and Vincent Binelli.
Using automatic speech recognition to dictate mathematical expressions. Journal of Computers in
Mathematics and Science Teaching</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Proposal for Coexistence of Mathematical Handwritten and Keyboard Input in a WYSIWYG Expression Editor</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>In: A. Editor, B. Coeditor (eds.): Proceedings of the XYZ Workshop, Location, Country, DD-MMM-YYYY, published at</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>http://ceur-ws.org, 1https://www.w3.org/TR/html5/, 2https://www.w3.org/TR/html-markup/input.html, 3https://www.tinymce.com/</institution>
          ,
          <addr-line>4http://ckeditor.com/</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <volume>28</volume>
      <issue>2</issue>
      <abstract>
        <p>Although math notation is well known and extensively used in many fields, entering math expressions into a computer is not straightforward and requires specific notations. For this reason, users commonly rely on editors that make this task easier, and even handwriting can be used directly as an input method. In this paper, we propose a design of interface and behaviour for the coexistence of a keyboard-based mathematical editor and a handwritten mathematical expression recognizer. The advantages of both modes are analyzed and shown to be complementary, so a design based on coexistence is more appropriate for the final user than either mode alone. Regarding accessibility support, we also provide some hints and guidelines for an implementation that offers a good experience to users with disabilities.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Copyright © by the paper's authors. Copying permitted for private and academic purposes.</p>
      <p>[...] do not provide a standardized solution for this requirement.</p>
      <p>There are different libraries available to deal with math expressions in the previously discussed scenarios. Some
of them let the user insert expressions using LaTeX, like WikiEditor<sup>5</sup> (the content editor for Wikipedia), while other
libraries offer a more user-friendly interface, where what you see is what you get (WYSIWYG) and no previous
knowledge of any language or format is needed. Examples of this kind of library are DragMath,<sup>6</sup> MathQuill,<sup>7</sup>
CodeCogs<sup>8</sup> and WIRIS editor.<sup>9</sup></p>
      <p>With the rise of touch devices and artificial intelligence systems, new systems are appearing for recognizing
handwritten mathematical expressions, like MyScript,<sup>10</sup> MathBrush [ML15], min [SHP+12] or WIRIS hand.<sup>11</sup>
These systems have some advantages compared to keyboard input: they offer a more comfortable interface on
touch devices and they are more user-friendly and intuitive than the classic overloaded toolbars that keyboard-based
editors usually have. However, sometimes a user may prefer to insert expressions using the classic keyboard for
several reasons:</p>
      <p>Lack of a touch device or not achieving the required accuracy with the mouse when drawing an expression.</p>
      <sec id="sec-1-1">
        <title>Suffering from visual impairment.</title>
        <p>In general, keyboard input offers a wider variety of mathematical expressions than a handwriting recognizer.</p>
        <p>Creation of expressions with a specific style and formatting (colors, position of elements, etc.).</p>
        <p>For these reasons, the user should decide which input method to use, and should even be able to
change it during the editing session.</p>
        <p>In summary, developers should have a mathematical input field that is independent of the input method
(keyboard or handwriting), and users should have an interface that offers a way to manage the input methods. Some
vendors have already implemented this duality in their systems, like Learnosity<sup>12</sup> and WIRIS.<sup>13</sup></p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Keyboard Input Interface</title>
      <p>A keyboard-based mathematical editor is traditionally composed of a toolbar and an editing area. Of course,
this structure can vary between implementations by adding or removing interface elements, but these variations
are outside the scope of this paper. An example of this structure is shown in Figure 1.</p>
      <p><sup>5</sup> https://www.mediawiki.org/wiki/Extension:WikiEditor
<sup>6</sup> http://dragmath.bham.ac.uk/demo.html
<sup>7</sup> http://mathquill.com/
<sup>8</sup> http://www.codecogs.com/latex/about.php
<sup>9</sup> http://www.wiris.com/editor/demo/en/
<sup>10</sup> http://webdemo.myscript.com/#/demo/equation
<sup>11</sup> http://www.wiris.com/hand
<sup>12</sup> https://www.learnosity.com/
<sup>13</sup> http://www.wiris.com/editor/demo/en/</p>
      <p>The toolbar of a keyboard-based mathematical editor is used for changing content formatting, switching
element properties, managing clipboard (copy, cut, paste), performing undo/redo actions and inserting
mathematical structures that cannot be typed using a keyboard (like fractions or square roots), among other features.</p>
      <p>A keyboard-based mathematical editor also offers advanced features that are difficult to implement in a
handwriting recognition system, such as support for different character sets (Chinese, Japanese, Korean, Arabic),
right-to-left (RTL) support, syntax checking, copy/cut/paste, formatting or pixel-perfect element positioning. In
summary, a keyboard-based mathematical editor is a powerful tool that can be used not just to let the user
insert expressions, but also to decide exactly how they should look.</p>
    </sec>
    <sec id="sec-3">
      <title>Handwritten Input Interface</title>
      <p>It is difficult to determine the interface structure of a handwritten mathematical expression recognizer, since
the technology is still new and there are different implementations in the market without any established design
pattern.</p>
      <p>The common characteristic in most of them is that there are two main components: the drawing area and the
action buttons, which can be displayed inside a tiny toolbar (Figure 2) or floating over the drawing area (Figure 3).
Action buttons can be used to perform simple tasks, such as undo/redo actions or clearing the drawing area.</p>
      <p>The drawing area allows the user to write mathematical expressions using a touch screen or a mouse, but it
can also be used to enter gestures. A gesture is a special input that the system interprets as a command. For
example, drawing several strokes over an existing drawn character can be interpreted as a desire to delete the
character (Figure 4); or touching with two fingers at the same time and changing the distance between them can
be interpreted as a desire to increase/decrease the zoom of the drawing area.</p>
      <p>As said previously, both interfaces have different and complementary characteristics. In some scenarios, just one
of them is enough to fulfill the given specific requirements, but in other cases the final decision is up to the user.</p>
      <p>In this paper, we propose the coexistence of both interfaces. On startup, the system detects and
uses the appropriate interface depending on the device, the user preferences or the developer's decision, but the user can
switch the interface at any moment, even during the editing session.</p>
      <sec id="sec-3-1">
        <title>Interchange of Interface</title>
        <p>There are several ways to switch between a keyboard-based mathematical editor interface and a handwritten
mathematical expression recognizer interface. In this paper, we suggest the usage of an element that clearly indicates
the concept of duality, avoiding approaches based on gestures like swiping without any previous visual
clue. The described proposal can be implemented via a toolbar (Figure 5) or floating buttons.</p>
        <p>In this proposal, we also recommend preserving the existing content (if there is any) as far as
possible, converting it from one input method to the other, as explained below.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Conversion of Formal Content to Strokes and Vice Versa</title>
        <p>Given a handwritten math input, once recognized, its conversion to a formal content for the keyboard-based
mathematical editor is trivial: the formal result of the recognizer is just passed to the editor, assuming that all
the pieces use the same format.</p>
        <p>The conversion of content typed with a keyboard-based editor to a set of strokes understandable by the
handwritten expression recognizer is not immediate, because it is necessary to create a correct 2D structure
containing handwritten symbols. The technical details are outside the scope of this paper, but one possible
solution is to reuse the existing painting technology of the WYSIWYG editor, sending the paint
instructions directly to the drawing area of the recognizer.</p>
        <p>For the conversion of text characters to strokes, an existing font made with drawing instructions can be used,
and the system just has to execute the corresponding paint instructions when a character must be drawn. A
real implementation of this proposal has been done in the WIRIS editor. Figure 6 shows the automatically
generated handwritten expression.</p>
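        <p>The idea of redirecting paint instructions to the drawing area can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the <monospace>StrokeRecorder</monospace> surface, the tiny <monospace>STROKE_FONT</monospace> table and the <monospace>paint_text</monospace> helper are hypothetical and do not describe how the WIRIS editor is implemented.</p>
        <preformat><![CDATA[
```python
class StrokeRecorder:
    """Minimal drawing surface that records paint calls as strokes."""

    def __init__(self) -> None:
        self.strokes = []  # each stroke is a list of (x, y) points

    def move_to(self, x: float, y: float) -> None:
        # Lifting the pen and placing it elsewhere starts a new stroke.
        self.strokes.append([(x, y)])

    def line_to(self, x: float, y: float) -> None:
        # Drawing continues the stroke that is currently open.
        self.strokes[-1].append((x, y))


# Hypothetical stroke font: each glyph is a list of polylines in a unit box.
STROKE_FONT = {
    "1": [[(0.5, 0.0), (0.5, 1.0)]],
    "+": [[(0.0, 0.5), (1.0, 0.5)], [(0.5, 0.0), (0.5, 1.0)]],
}


def paint_glyph(surface, char: str, x: float, y: float, size: float) -> None:
    """Execute the paint instructions of one glyph at position (x, y)."""
    for polyline in STROKE_FONT[char]:
        (px, py), *rest = polyline
        surface.move_to(x + px * size, y + py * size)
        for px, py in rest:
            surface.line_to(x + px * size, y + py * size)


def paint_text(surface, text: str, size: float = 10.0, gap: float = 2.0) -> None:
    """Paint the characters left to right on a single baseline."""
    x = 0.0
    for char in text:
        paint_glyph(surface, char, x, 0.0, size)
        x += size + gap


recorder = StrokeRecorder()
paint_text(recorder, "1+1")
print(len(recorder.strokes))  # 4 strokes: one per "1", two for "+"
```
]]></preformat>
        <p>Because the recorder exposes the same calls as a painting surface, the layout code of the editor would not need to know whether it is drawing on screen or producing strokes for the recognizer.</p>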
      </sec>
      <sec id="sec-3-3">
        <title>Developer Tools</title>
        <p>In this section, we present some guidelines for API development where keyboard and handwriting inputs coexist.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Providing Clues and Hints About the Expected Value</title>
        <p>On some platforms, such as assessment environments, the mathematical input field is used to let students provide
an answer to a problem or question.</p>
        <p>Since recognition of handwritten math expressions is not 100% accurate and due to handwriting ambiguities,
it is recommended to pass hints and clues to the recognizer in order to increase the success rate.</p>
        <p>For example, if the platform knows that the student has to give an answer that is composed only of numbers,
notifying the recognizer of this can increase the success rate, since the recognizer will not classify an ambiguous drawing
of "2" as "z".</p>
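        <p>A minimal sketch of how such a hint could act is given below. It assumes a hypothetical recognizer that returns scored candidates for each drawn symbol; the function <monospace>pick_symbol</monospace>, the candidate format and the <monospace>allowed</monospace> parameter are illustrative assumptions and not part of any real recognizer API.</p>
        <preformat><![CDATA[
```python
from typing import Iterable, Optional

# Hypothetical candidate format: (symbol, confidence score between 0 and 1).
Candidate = tuple[str, float]


def pick_symbol(candidates: Iterable[Candidate],
                allowed: Optional[set[str]] = None) -> Optional[str]:
    """Return the best-scoring candidate, ignoring symbols outside `allowed`.

    `allowed` plays the role of the hint passed by the platform; None means
    that no hint was given and every candidate is eligible.
    """
    eligible = [(symbol, score) for symbol, score in candidates
                if allowed is None or symbol in allowed]
    if not eligible:
        return None  # nothing compatible with the hint was recognized
    return max(eligible, key=lambda pair: pair[1])[0]


# An ambiguous drawing: the recognizer slightly prefers "z" over "2".
ambiguous = [("z", 0.51), ("2", 0.49)]
DIGITS = set("0123456789")

print(pick_symbol(ambiguous))          # no hint: z
print(pick_symbol(ambiguous, DIGITS))  # digits-only hint: 2
```
]]></preformat>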
        <p>Technical details about the format of those clues and hints go beyond the purpose of this paper.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Obtaining Information About the Input Mode</title>
        <p>In some cases it is interesting for the platform to store the drawing of the user. For example, in assessment
environments, when the student answers using the handwriting recognizer and the system is unable to recognize
the expression with success, a teacher can inspect manually the strokes drawn by the student and determine if
the answer is correct or not.</p>
        <p>In some formats like MathML it is possible to insert semantic content next to the formal representation of
the expression (in this case, using a semantics tag<sup>14</sup>). In other formats like LaTeX, this semantic content can be
inserted just as a LaTeX comment, so the real code that represents the expression is not affected by it.</p>
        <p><sup>14</sup> https://www.w3.org/TR/REC-MathML/</p>
        <p>This paper proposes the insertion of the user drawing as semantic content inside the returned value when
the developer explicitly retrieves it in this way from the input field.</p>
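        <p>As a hedged illustration of this proposal, the sketch below wraps a presentation MathML fragment in a <monospace>semantics</monospace> element and stores the strokes in an <monospace>annotation</monospace> child. The encoding label <monospace>application/x-strokes+json</monospace>, the JSON stroke format and both helper functions are assumptions made for the example; the paper does not define them.</p>
        <preformat><![CDATA[
```python
import json
import xml.etree.ElementTree as ET

# Assumed, non-standard encoding label for the stroke annotation.
STROKES_ENCODING = "application/x-strokes+json"


def with_strokes(presentation: ET.Element, strokes: list) -> ET.Element:
    """Wrap presentation MathML in <semantics> and attach the strokes."""
    math = ET.Element("math", {"xmlns": "http://www.w3.org/1998/Math/MathML"})
    semantics = ET.SubElement(math, "semantics")
    semantics.append(presentation)  # first child: the formal expression
    annotation = ET.SubElement(semantics, "annotation",
                               {"encoding": STROKES_ENCODING})
    annotation.text = json.dumps(strokes)
    return math


def read_strokes(math: ET.Element):
    """Return the stored strokes, or None when the value carries none."""
    for annotation in math.iter("annotation"):
        if annotation.get("encoding") == STROKES_ENCODING:
            return json.loads(annotation.text)
    return None


# The expression "2", drawn as one stroke of three (x, y) points.
mn = ET.Element("mn")
mn.text = "2"
value = with_strokes(mn, [[[0, 0], [10, 0], [0, 12]]])
print(ET.tostring(value, encoding="unicode"))
```
]]></preformat>
        <p>A consumer that only understands the formal expression can ignore the annotation, while an assessment platform could call <monospace>read_strokes</monospace> to show the original drawing to a teacher.</p>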
        <p>In the future, if new input methods are implemented (like voice recognition), other semantic content can be
returned next to the real representation of the expression using this method.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Interface Events</title>
        <p>An API for managing interface events increases extensibility and adaptation to customer requirements, and gives
the developer full control over the behaviour of the input field. This paper proposes the following events, without
going into the details of the technical implementation. In general:</p>
        <sec id="sec-3-6-1">
          <title>An event that is fired when the value of the field changes.</title>
        </sec>
        <sec id="sec-3-6-2">
          <title>An event that is fired when the interface input mode changes.</title>
        </sec>
        <sec id="sec-3-6-3">
          <title>For the keyboard-based mathematical editor:</title>
        </sec>
        <sec id="sec-3-6-4">
          <title>An event that is fired when the selection or caret position changes.</title>
          <p>An event that is fired when the formatting of the expression or the selection changes.</p>
        </sec>
        <sec id="sec-3-6-5">
          <title>For the handwritten mathematical expression recognizer:</title>
        </sec>
        <sec id="sec-3-6-6">
          <title>An event that is fired when the drawing area changes.</title>
        </sec>
        <sec id="sec-3-6-7">
          <title>An event that is fired when the system recognizes an expression.</title>
          <p>An event that is fired when the system cannot recognize an expression.</p>
          <p>With these events a developer can, for example, block the submission of a form while the entered strokes have
not yet been recognized (or when there has been an error in the recognition), in order to prevent wrong submissions.</p>
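          <p>The form-submission example can be sketched as follows. The class <monospace>MathField</monospace>, the event names (<monospace>drawingchange</monospace>, <monospace>recognitionsuccess</monospace>, <monospace>recognitionerror</monospace>, <monospace>valuechange</monospace>) and the <monospace>SubmitGuard</monospace> helper are hypothetical names chosen for this illustration; the paper deliberately leaves the technical implementation open.</p>
          <preformat><![CDATA[
```python
from collections import defaultdict
from typing import Callable


class MathField:
    """Toy model of a math input field that emits the proposed events."""

    def __init__(self) -> None:
        self._listeners = defaultdict(list)
        self.value = ""

    def on(self, event: str, listener: Callable[..., None]) -> None:
        self._listeners[event].append(listener)

    def _fire(self, event: str, *args) -> None:
        for listener in self._listeners[event]:
            listener(*args)

    # The methods below stand in for what a real widget would do internally.
    def draw(self) -> None:
        self._fire("drawingchange")

    def recognized(self, value: str) -> None:
        self.value = value
        self._fire("recognitionsuccess", value)
        self._fire("valuechange", value)

    def recognition_failed(self) -> None:
        self._fire("recognitionerror")


class SubmitGuard:
    """Blocks submission until the latest strokes have been recognized."""

    def __init__(self, field: MathField) -> None:
        self.can_submit = True
        field.on("drawingchange", self._block)
        field.on("recognitionerror", self._block)
        field.on("recognitionsuccess", self._allow)

    def _block(self) -> None:
        self.can_submit = False

    def _allow(self, value: str) -> None:
        self.can_submit = True


field = MathField()
guard = SubmitGuard(field)
field.draw()
print(guard.can_submit)  # False: the new strokes are not recognized yet
field.recognized("x^2")
print(guard.can_submit)  # True
```
]]></preformat>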
        </sec>
      </sec>
      <sec id="sec-3-7">
        <title>Default Interface Mode on Startup</title>
        <p>On startup, the system has to decide the appropriate interface to be displayed for entering mathematical
expressions. This decision is based on several factors.</p>
        <p>The developer has full power to decide which interface must be used, since the developer knows the context
of the page and, for example, whether the entered value is going to be evaluated later or is going to be inserted in a
page with a specific formatting. The developer also knows the target audience (primary education, secondary
education, university students, etc.), and can decide that a complex expression should be entered only with the
keyboard-based editor (for example, a Taylor series expansion).</p>
        <p>If the developer does not make a decision about the default interface, there are several scenarios. If there are
stored user preferences from previous sessions, the system can load them and display the default interface based
on the current configuration.</p>
        <p>If a user configuration is not defined, this proposal suggests that the system should determine, as far as
possible, whether the platform expects a given value and whether the user can use the handwriting recognizer to enter it.</p>
        <p>In other words: if the handwriting recognizer is not able to recognize the expected value, the system must
display the keyboard-based editor interface by default (and in some cases, it is recommended to disable
handwritten input completely). Of course, in this scenario the developer needs to provide a clue about the expected
value, as seen previously.</p>
        <p>Otherwise, the system can detect characteristics from the device. For example, if the user is using a device
with a touch screen or a digital pen, the handwriting recognizer can be used as default interface. The system
can detect other properties, like if the user is using a screen reader due to a visual impairment and, in that case,
enable the keyboard-based editor by default.</p>
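        <p>The decision sequence described in this section can be summarized as a short sketch. The <monospace>Context</monospace> fields and the function <monospace>default_mode</monospace> are hypothetical names. Two points are assumptions not fixed by the text: the screen reader check is placed before the touch check, and the keyboard-based editor is used as the final fallback.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional

KEYBOARD = "keyboard"
HANDWRITING = "handwriting"


@dataclass
class Context:
    developer_choice: Optional[str] = None        # explicit developer decision
    stored_preference: Optional[str] = None       # from a previous session
    expected_recognizable: Optional[bool] = None  # None: no expected value known
    screen_reader: bool = False
    touch_or_pen: bool = False


def default_mode(ctx: Context) -> str:
    """Pick the interface shown on startup, in the order given in the text."""
    if ctx.developer_choice is not None:
        return ctx.developer_choice
    if ctx.stored_preference is not None:
        return ctx.stored_preference
    if ctx.expected_recognizable is False:
        # The recognizer cannot produce the expected value.
        return KEYBOARD
    if ctx.screen_reader:
        # Assumption: accessibility needs take priority over touch support.
        return KEYBOARD
    if ctx.touch_or_pen:
        return HANDWRITING
    return KEYBOARD  # assumption: fallback when nothing else applies


print(default_mode(Context(touch_or_pen=True)))                      # handwriting
print(default_mode(Context(touch_or_pen=True, screen_reader=True)))  # keyboard
```
]]></preformat>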
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Accessibility</title>
      <p>Web accessibility refers to the inclusive practice of removing barriers that prevent interaction with, or access
to websites, by people with disabilities. When sites are correctly designed, developed and edited, all users have
equal access to information and functionality.</p>
      <p>The needs that Web accessibility aims to address include:</p>
      <p>Visual: visual impairments including blindness, various common types of low vision and poor eyesight, and various
types of color blindness.</p>
      <p>Motor/mobility: difficulty or inability to use the hands, including tremors, muscle slowness and loss of fine
muscle control, due to conditions such as Parkinson's disease, muscular dystrophy, cerebral palsy or
stroke.</p>
      <p>Auditory: deafness or hearing impairments, including individuals who are hard of hearing.</p>
      <p>Seizures: photoepileptic seizures caused by visual strobe or flashing effects.</p>
      <p>Cognitive/intellectual: developmental disabilities, learning disabilities (dyslexia, dyscalculia, etc.), and
cognitive disabilities of various origins, affecting memory, attention, developmental "maturity", problem-solving
and logic skills, etc.</p>
      <p>Given the nature of a mathematical input field and the subject of this paper, only visual and motor/mobility
disabilities are taken into account in this proposal.</p>
      <p>In the case of motor/mobility disabilities, most of the work is already performed by the web browser,
the operating system and third-party native tools, which zoom the page as required and enable
special features on mouse and keyboard (such as enabling special keys or modifying the behaviour of classical mouse
drag&amp;drop). In this paper we assume that these features are enough to offer a good experience to users with
motor/mobility disabilities, and we do not propose any extra work. Implementations are, of course, free to include new
features that help remove this barrier.</p>
      <p>In the case of visual disabilities, due to the nature of a handwriting recognizer, this proposal recommends the
exclusive usage of a keyboard-based mathematical editor, which can include more advanced features for visually
impaired users than a handwriting recognizer, like keyboard navigation and support for screen readers.</p>
      <sec id="sec-4-1">
        <title>Keyboard Navigation</title>
        <p>On the keyboard-based editor, the system must implement a way of accessing all elements (toolbar, buttons,
editing area and others) using just the keyboard. This feature can already be implemented using the focus
property defined by W3C.<sup>15</sup></p>
        <p>When a user activates the button that changes the interface to the handwriting recognizer, we propose
making only the button that returns to the keyboard-based editor focusable. In this way, a user with a visual
impairment or a motor/mobility disability can easily go back to the classical editor, which offers full accessibility
support.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Screen Reader Support</title>
        <p>Modern accessible web technologies offer advanced support for screen readers, allowing the developer to decide what
the screen reader should synthesize. On the keyboard-based mathematical editor, this feature can be used for
two purposes:</p>
        <p>Providing a description of interface elements when they are focused, like tabs, action buttons and the editing area.</p>
        <p>Providing a description of the context when the caret is moved, like in a classic input field. When the caret
is moved in a classic input field, the screen reader usually spells the name of the character at the caret
position (or other details, such as whether the caret is at the end of a line). This feature can also be implemented in
a keyboard-based mathematical editor using the standard aria-live region,<sup>16</sup> providing a brief description of
the context of the caret (beginning of a square root, denominator of a fraction, etc.).</p>
        <p><sup>15</sup> https://www.w3.org/TR/html5/editing.html#focus
<sup>16</sup> https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/ARIA_Live_Regions</p>
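        <p>The text that such a live region announces has to be computed from the caret position. The sketch below shows one possible way to build it. The structure names in <monospace>PART_NAMES</monospace>, the path representation and the function <monospace>describe_caret</monospace> are assumptions for illustration only; writing the resulting string into the live region is left to the web page.</p>
        <preformat><![CDATA[
```python
# Hypothetical names for the structures the caret can be located in.
PART_NAMES = {
    "sqrt": "a square root",
    "numerator": "the numerator of a fraction",
    "denominator": "the denominator of a fraction",
    "superscript": "a superscript",
}


def describe_caret(path: list[str], at_start: bool = False) -> str:
    """Build the text a live region could announce for the caret position.

    `path` lists the enclosing structures from outermost to innermost.
    """
    if not path:
        return "top level of the expression"
    # Describe the innermost structure first, then the ones around it.
    names = [PART_NAMES.get(part, part) for part in reversed(path)]
    text = ("beginning of " if at_start else "inside ") + names[0]
    for outer in names[1:]:
        text += ", in " + outer
    return text


print(describe_caret(["sqrt"], at_start=True))
# beginning of a square root
print(describe_caret(["sqrt", "denominator"]))
# inside the denominator of a fraction, in a square root
```
]]></preformat>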
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Examples and First Integrations</title>
      <p>The interface model proposed in this paper has already been implemented in WIRIS quizzes. WIRIS quizzes
(Figure 7) is a software component that allows teachers to create questions for students, containing random
variables, providing automatic evaluation of answers and using WIRIS editor as the mathematical input field for
students.</p>
      <p>When a teacher creates a question and provides the structure of a correct answer (e.g. specifying which
variables, numbers or symbols are involved in it), WIRIS quizzes analyzes it and passes a set of constraints to
WIRIS editor to increase the success rate of the handwriting recognizer. For example, if the correct answer
contains only numbers and does not contain any "z", the ambiguous strokes that might be either a number "2"
or a letter "z" will be recognized as numbers. Figure 8 shows an example of failed recognition due to missing
information about the context, while Figure 9 shows how, with that information, the recognizer has a higher success
rate.</p>
    </sec>
    <sec id="sec-6">
      <title>Discussion and Future Work</title>
      <sec id="sec-6-1">
        <title>Technical Details and Standardization</title>
        <p>This paper is a proposal for the coexistence of mathematical handwritten and keyboard input in a WYSIWYG
expression editor, and is not an API proposal or the definition of a standard.</p>
        <p>From the point of view of the authors of this paper, the following subjects are still pending discussion and
standardization:</p>
        <sec id="sec-6-1-1">
          <title>Inclusion of a math type among the standard HTML input field types.</title>
          <p>Definition of an API for the control of "math" input fields, including configuration parameters and event
listeners.</p>
          <p>Selection of a standard mathematical format for "math" input field values, like MathML.</p>
        </sec>
      </sec>
      <sec id="sec-6-2">
        <title>Other Input Methods</title>
        <p>Advancements in artificial intelligence and machine learning are promoting the development of new technologies
that can analyze and classify information that was impossible to handle until now. In particular, advancements in ASR
(Automatic Speech Recognition) introduce speech as a new input method for mathematical expressions [WHP+09].
There are even some studies that combine both handwriting and speech input to enter a mathematical expression
[MMPVG13].</p>
        <p>The interface proposal of this paper can be extended to these new input methods by adding a new button
for switching the interface to the new input modes, by extending the functionality of the existing switch
button, or by adding a new button that just opens a modal dialog for speech input, as traditional text inputs already
do (e.g. Google, Apple, Microsoft).</p>
      </sec>
      <sec id="sec-6-3">
        <title>Drawing over the Current Formal Expression</title>
        <p>Another possible approach to the problem handled in this paper is to allow the user to draw new
subexpressions over an existing formal expression in the mathematical editor. But this approach presents some
problems:</p>
        <p>Using the mouse or the touch screen to handle the caret position and selection can conflict with the feature
of drawing or performing gestures.</p>
        <p>Sometimes there is not enough space to draw a subexpression. For example, in this case it would be difficult
to draw a new fraction inside the square root:
<disp-formula><tex-math>\frac{\sqrt{x}}{2}</tex-math></disp-formula></p>
        <p>Recognizing incomplete subexpressions has a lower success rate than recognizing complete expressions.</p>
        <p>In this paper we proposed a design of interface and behaviour for the coexistence of a keyboard-based
mathematical editor and a handwritten mathematical expression recognizer. The advantages of both modes have been analyzed
and it has been shown that they are complementary, so a design based on coexistence is more appropriate for
the final user than single modes.</p>
        <p>In this proposal we also present some guidelines about the communication channel between developers and
mathematical input fields that fulfills the needs a platform can have in a real scenario.</p>
        <p>Regarding accessibility support, we proposed some hints and guidelines for an implementation that offers a
good experience to users with disabilities.</p>
        <p>Finally, future work should be focused on the definition of a standard API and format for real case scenarios
and exposed.</p>
        <p>[ML15] Scott MacLean and George Labahn. A Bayesian model for recognizing handwritten mathematical
expressions. Pattern Recognition, 48(8):2433–2445, 2015.</p>
        <p>[SHP+12] Christopher Sasarak, Kevin Hart, Richard Pospesel, David Stalnaker, Lei Hu, Robert Livolsi, Siyu
Zhu, and Richard Zanibbi. min: A multimodal web interface for math search. Symp.
Human-Computer Interaction and Information Retrieval, Cambridge, MA, 2012.</p>
        <p>[WHP+09] Angela Wigmore, Gordon Hunter, Eckhard Pflügel, James Denholm-Price, and Vincent Binelli.
Using automatic speech recognition to dictate mathematical expressions. Journal of Computers in
Mathematics and Science Teaching, 28(2).</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>