<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>InkChat: A Collaboration Tool for Mathematics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rui Hu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stephen M. Watt</string-name>
          <email>Stephen.Watt@uwo.ca</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The University of Western Ontario, London, Ontario</institution>
          ,
          <country country="CA">Canada</country>
          <addr-line>N6A 5B7</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>We investigate the question of how multimodality can be used in computer-mediated mathematical collaboration. To demonstrate our ideas, we present InkChat, a whiteboard application, which can be used to conduct collaborative sessions on a shared canvas. It allows participants to use voice and digital ink independently and simultaneously, which has been found useful in mathematical collaboration.</p>
      </abstract>
      <kwd-group>
        <kwd>Pen computing</kwd>
        <kwd>multimodal computing</kwd>
        <kwd>computer aided collaboration</kwd>
        <kwd>InkML</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        A number of computer applications have been developed over the past years
to accommodate the needs of collaboration. One general category of these is
whiteboard systems, where multiple users can interact over a shared canvas using
a particular input method. For example, the use of pen input allows participants
to write or draw naturally, which has great potential to increase productivity,
especially in mathematics [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Existing systems, however, typically allow only one
input method to be used at a time. For example, in Microsoft OneNote [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], one
can either type or draw, but not simultaneously. This places strong limitations
on what can be done. For example, it becomes quite awkward to explain an
activity while it is being performed.
      </p>
      <p>Mathematical collaboration software can be most useful when it supports
input from multiple modalities, as it allows collaborators to interact more richly
and participants to use the input methods that are most suitable. For example, in
mathematics, some points can be most efficiently made through the spoken word
while others can best be communicated by a hand-drawn diagram or equation.
Software for collaboration also should allow users to perform various editing
operations to revise or tidy up jointly created drawings, text and so on. Finally,
collaborative work usually includes exploring ideas in discussions that can have
false starts and dead ends, so the ability to record and roll back the discussion
to prior points is important.</p>
      <p>
        The primary objective of this article is to analyze the design issues in
incorporating multimodal interactions in this kind of mathematical collaboration.
Considerable related work has been conducted, some of which we highlight here.
In the 1990s, QuickSet [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] was a multimodal framework used by the US Navy
and US Marine Corps to set up training scenarios and to control virtual
environments. It accepted voice and pen input, communicating via a wireless LAN
through an agent architecture to a number of systems. The system could
recognize voice input and respond accordingly. If voice interaction was not feasible,
it could still analyze digital ink and offer several possible interpretations.
This demonstrated that multimodal interactions could enable efficient
communication. Classroom 2000 was an application [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] whose primary purpose was
to create an environment to capture as many activities as possible from the
classroom experience. It included tools to automate the production of lecture
notes and to assist students in reliving lectures. This application did not support
real-time distributed collaboration, however. In 2004, the InkBoard [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
network-shared whiteboard application was released. It allowed graphical collaboration
and design, including network-shared ink strokes and audio/video conferencing.
Because it integrated the Microsoft Conference XP technology, it was limited to
Windows platforms. In the same year, Electronic Chalkboard [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] was developed.
Its goal was to integrate distance education tools with the traditional blackboard
experience. It could load images and interactive programs from a file system or
the Internet and could interact with computer algebra systems and display
computation results. Because content was saved as images, it was not possible to
later edit or perform semantic operations on saved sessions.
      </p>
      <p>
        To explore ideas for mathematical collaboration, Regmi and Watt [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
developed a whiteboard application that provided collaborative sessions with
synchronized voice and digital ink on a shared canvas. This system could save sessions
with the digital ink in InkML [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] format and the voice as an MP3 sound file.
A significant drawback, however, was that the client interface implementation
varied from platform to platform: The client for Windows was implemented in
C#, using the .NET framework, while the client for Linux and Mac OS X was
implemented in Python. Although Python supports cross-platform portability,
the client was constructed using Linux-specific and Mac OS X-specific APIs and
thus could not be ported to other platforms.
      </p>
      <p>
        To address the portability issue, Hu, Mazalov, and Watt [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposed a
streaming digital ink framework for multi-party collaboration. The framework
consists of a number of extensions that can work independently and
simultaneously, serving as plug-ins for the host collaboration software. It is portable
across multiple platforms, including Windows, Linux, and Mac OS X. It currently uses the
popular Skype and Google Talk services as the backbone to deliver data streams,
but other transport mechanisms could be used. The digital ink data is
represented in InkML, allowing flexible manipulation of different content types, such
as mathematics and diagrams. Collaborative sessions can be recorded and
stored for playback, analysis or annotation. InkChat is available for download
at http://www.orcca.on.ca/InkChat/.
      </p>
      <p>
        The present article explores support for collaboration in this framework. It
is based on the same InkChat software infrastructure as [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], but where that
article focused on the handling of streaming digital ink, we now focus on the
multimodal aspects. The remainder of the article is organized as follows. In
Section 2, we examine how multimodal input can best be used in collaborative
environments to improve efficiency. Section 3 recalls InkChat, the whiteboard
application for multimodal collaboration. In Section 4, we explain how InkChat
supports multimodality. Section 5 describes the collaborative aspects of InkChat.
In Section 6, we conclude the article.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Multimodal Collaboration</title>
      <p>Multimodal input is useful as it provides versatile means for users to interact
with computers. These input modalities include keyboard, mouse, voice, pen,
video, and so on. We will focus on voice input and pen input in this
section, and explore their capabilities in mathematical collaboration.
Voice Input Voice communication is a fast and natural way to interact with
other people and computers. Most people speak faster than they can type or
manipulate a mouse. Notably, certain people with physical disabilities prefer
operating their computers simply by speaking. Voice input is hands-free, which
is useful if one is driving. It is also flexible: one does not have to sit in
front of a computer, and it is possible to use voice input while moving about or while
sitting, standing, or reclining. Most modern devices come with built-in microphones.
Handwriting Input With the widespread availability of pen-based devices such
as Tablet PCs, PDAs and even cell phones, pen input has begun to play an important
role in human-computer interaction. Pen input is a natural and powerful input
modality, since everyone learns to write in school. It is versatile, providing
more gestures and motions than mouse and keyboard input.
Many devices without pens support touch input, which may also be used to
capture handwriting using a finger, but at a lower resolution.</p>
      <p>Handwritten input is expressive. Modern devices capture digital ink traces
in a two-dimensional writing plane, and may support pressure, angles and
non-contact pen height. This may be used to capture mathematics, as most
mathematical notations are two dimensional, with similarities to both text and
drawing. Mathematical formulae are hard to convey by means of voice, keyboard
or mouse, but can be expressed easily in handwriting.</p>
      <p>Voice and Handwriting Multimodal Input We chose voice and pen as
the input modalities for collaboration. Together they provide more than either
individually, and indeed we find in this combination the whole is more than
the sum of its parts. The advantages of voice and pen multimodal collaboration
include:
Portability Most computing platforms support both voice and pen input.
Ease of use Writing and voice are familiar and require little training to use.
Complementarity Both input modalities can work independently and
simultaneously. Speaking and writing at the same time gives two communication
channels, allowing one to be used to explain or amplify the other. Either may give
the main message, with the other supporting it, or both be equally important.
Two-channel communication avoids the clutter and confusion that arise when
two related messages are multiplexed through one modality, for example, text
with footnotes or parenthetical remarks, or formulae with many annotations.</p>
    </sec>
    <sec id="sec-3">
      <title>InkChat</title>
      <p>
        As stated earlier, we have developed a platform-independent version of InkChat
to evaluate and demonstrate our ideas. This is built on top of the portable
framework presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], whose primary purpose is to collect digital ink across
a variety of platforms and to provide a platform-independent, consistent interface
for digital ink applications. As a result, InkChat is available on these platforms
and can process ink data without knowing the underlying details. To support
digital ink data portability, InkChat uses InkML to represent digital ink data as
it provides digital ink streaming and archival support independent of platforms.
This allows flexible interchange of digital ink data in collaborative environments
and, in addition, allows cut and paste of digital ink between applications, e.g.
between Microsoft Office 2010 and InkChat.
      </p>
      <p>Figure 1 shows the user interface of InkChat. A number of control buttons are
located at the top of the canvas and grouped together to minimize the distance
of pen movement. InkChat provides a set of pre-defined colors and a palette to
create new colors, if needed. To accommodate the needs of different writing
activities, we have developed a few brush types. For example, one can choose
the pen or pencil for diagramming and the tear drop brush for digital painting
or calligraphy. Editing is also supported. This includes redo, undo, and select,
cut, copy, and paste of different kinds of content, such as images, typed text,
and digital ink.</p>
    </sec>
    <sec id="sec-4">
      <title>InkChat Support for Multimodality</title>
      <p>InkChat is a multi-year ongoing project. Its primary design objective is to
enhance mathematical collaboration by incorporating multiple modalities. This
allows participants to flexibly choose the input methods that are most suitable
for a particular topic. Below we describe the modalities that are supported by
InkChat.</p>
      <p>Ink Traces Handwriting is one of the most natural ways to input mathematics,
as most mathematical notations are two dimensional, with elements of both
writing and drawing. InkChat captures handwriting as ink traces and exchanges
the data with other clients using InkML.</p>
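To make this representation concrete, the sketch below serializes a list of captured pen samples as a minimal InkML trace. The helper name and the sample values are hypothetical, and only the X, Y and time channels of the format are shown.

```python
# Sketch: serializing captured pen samples as a minimal InkML trace.
# The function name and the sampling code that produces `points` are
# hypothetical; the trace syntax (comma-separated sample tuples)
# follows the InkML recommendation.
import xml.etree.ElementTree as ET

def points_to_inkml(points):
    """points: time-ordered list of (x, y, t) tuples from the pen device."""
    ink = ET.Element("ink", xmlns="http://www.w3.org/2003/InkML")
    trace = ET.SubElement(ink, "trace")
    # Each sample is "x y t"; samples are separated by commas.
    trace.text = ", ".join(f"{x} {y} {t}" for x, y, t in points)
    return ET.tostring(ink, encoding="unicode")
```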
      <p>Voice InkChat also supports voice input. The voice stream is paired with the ink
stream to improve the efficiency of collaboration. For example, one can verbally
explain the underlying meaning of a complex diagram while drawing it on a
shared canvas. This avoids the clutter and confusion that may arise when either
input method is used individually.</p>
      <p>Floating Pointer To support collaboration, InkChat provides users with
floating pointers that can be used to point at target objects on the shared canvas
without leaving any ink mark. Together with the voice channel, participants can
point to and discuss aspects of the common canvas.
</p>
    </sec>
    <sec id="sec-5">
      <title>InkChat Support for Collaboration</title>
      <p>Communication The primary goal of InkChat is to allow users on different
computers to collaborate on a shared canvas. It currently uses the popular Skype
and Google Talk services for the communication channel, but other transport
mechanisms could be used. The primary design principle is to give users the
freedom to choose the communication mechanism without too much
configuration. If one service is not available in a particular location, it is easy to switch
to another. Conference mode is supported, where more than two participants
can be involved in one conversation. Depending on the chosen underlying
communication service, InkChat adopts different mechanisms to exchange data with
other participants. For example, when a P2P backbone is used, the conference is
initiated by the host, which has a connection with every other participant. Digital
ink routing shares the same mechanism as audio routing: each ink stroke
is broadcast by the host to all participants except the initiator.
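The host-side routing rule can be sketched as follows; the function and participant names are illustrative, not InkChat's actual API.

```python
# Sketch of the host-side routing rule described above: an ink stroke
# received by the host is relayed to every participant except its
# initiator. Names are illustrative, not InkChat's actual API.
def route_stroke(stroke, initiator, participants):
    """Return (recipient, stroke) pairs for the host to send."""
    return [(peer, stroke) for peer in participants if peer != initiator]
```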
Page Navigation InkChat also supports page navigation. This has been found
useful when participants wish to cover multiple topics in one session or to load
previous work in the middle of a conversation. In both cases, the current page
will first be saved to the file system as an InkML file. Then the Ink Canvas will
send a page request to the file system to check if the next page already exists. If
so, the Ink Canvas will parse the InkML file and load the content so that users
can continue to work on that page. Otherwise, a blank page will be created.
Figure 2 illustrates the communications used in page navigation.
Ink Editing In collaboration it is useful to edit or modify a work in progress,
and in order to edit, it is necessary to be able to erase digital ink. InkChat
provides two ways to erase ink: either by erasing whole strokes or parts of strokes.
We call these “stroke-wise” and “point-wise” erasing. Stroke-wise erasing uses
a hit testing method to detect whether a particular stroke is selected. If so,
it removes the stroke from the canvas and re-renders other strokes that may
be affected. Point-wise erasing erases part of a stroke instead of removing the
whole from the canvas. A stroke may be split into pieces when using point-wise
erasing, and this requires the application to detect where the stroke is broken
up. Point-wise erasing uses a hit testing method as well, and this returns a
collection of ink points that need to be removed from the target stroke. It then
groups the remaining ink points into new strokes and calculates the properties
for each, including starting time and duration. The new strokes are then placed
in sequence by starting time.</p>
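A minimal sketch of point-wise erasing as described above, assuming a stroke is a time-ordered list of (x, y, t) samples and the hit test has already reported which sample indices the eraser touched; the names are illustrative.

```python
# Sketch of point-wise erasing as described above. A stroke is assumed
# to be a time-ordered list of (x, y, t) samples, and `hit_indices`
# holds the sample indices reported by the hit test; the function name
# is illustrative.
def pointwise_erase(stroke, hit_indices):
    hits = set(hit_indices)
    fragments, current = [], []
    for i, sample in enumerate(stroke):
        if i in hits:
            if current:                   # the eraser breaks the stroke here
                fragments.append(current)
                current = []
        else:
            current.append(sample)
    if current:
        fragments.append(current)
    # Derive per-fragment properties from the surviving timestamps.
    new_strokes = [{"points": f, "start": f[0][2], "duration": f[-1][2] - f[0][2]}
                   for f in fragments]
    # Place the new strokes in sequence by starting time.
    return sorted(new_strokes, key=lambda s: s["start"])
```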
      <p>Drag and Drop InkChat allows existing ink to be moved on the canvas
using drag and drop. This uses a special lasso cursor to select the content to be
moved. This lasso is a free selection tool that allows users to create a selection
by encircling a region with a pen. The Lasso is useful in mathematical domains
as users may often wish to select a portion of a mathematical expression. Figure
3 shows an example of using Lasso. Notably, as the bounding boxes of character
“a” and “+” overlap, a rectangular selection is not suitable for this operation.
Real-Time Mirroring InkChat is able to animate the drawing of ink strokes,
and uses this to render the ink of collaborators as it is being written. To avoid
the jarring and distraction caused by large visual changes, InkChat splits
long ink traces into small pieces and sends each individually, allowing
smooth rendering on each participant’s canvas. The representation of digital ink
data is the key to the success of this animation. Collaboration sessions often take
place in heterogeneous environments, where participants may work on different
platforms and use various pen devices. These pen devices typically have different
settings such as sampling rate, sensitivity, channel properties and so on, and
consequently output digital ink data with different characteristics. This requires
digital ink data to be represented in a flexible, platform- and vendor-independent
format so that the animation is possible across different platforms. Meanwhile,
ink strokes must be organized in time order in order to support smooth rendering
and synchronization with other modalities.</p>
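The trace-splitting step can be sketched as follows, with the chunk size as an assumed tuning parameter rather than a documented InkChat value.

```python
# Sketch of the trace-splitting used for real-time mirroring: a long
# trace is cut into small pieces so each can be streamed and rendered
# incrementally. The chunk size is an assumed tuning parameter.
def chunk_trace(samples, chunk_size=16):
    """Split a time-ordered list of pen samples into small pieces."""
    return [samples[i:i + chunk_size]
            for i in range(0, len(samples), chunk_size)]
```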
      <p>We have found InkML suitable for these animation purposes. It is
platform- and vendor-independent and allows complete and accurate representation of
digital ink by capturing and recording information such as the device characteristics,
pen tilt, pen pressure and so on. Most importantly, it provides a wide range of
features to support smooth rendering and synchronization.</p>
      <p>
        Session Recording and Playback Collaboration sessions may be recorded
and stored for later playback, analysis or annotation. InkChat stores digital ink
in InkML archival style, which keeps the contextual information and ink traces
separate in order to achieve a compact representation. When playback is
desired, the digital ink data can be efficiently converted into streaming style, which
organizes ink strokes along with contextual information in time order [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. In
addition, each ink trace and its constituent ink points can be timestamped in
order to support accurate synchronization with content input by other
modalities. Figure 4 shows an example of playback in InkChat.
      </p>
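A sketch of how timestamped events from two modalities might be merged into one replay schedule; the event representation is an assumption for illustration.

```python
# Sketch of timestamp-driven playback: events from the ink channel and
# the voice channel, each a (timestamp, payload) pair in time order,
# are merged into a single replay schedule. The event representation
# is an assumption for illustration.
import heapq

def playback_schedule(ink_events, voice_events):
    return list(heapq.merge(ink_events, voice_events, key=lambda e: e[0]))
```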
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Future Work</title>
      <p>We have shown how InkChat has been made to support multimodal
interaction and communication. We have found, informally, that these features greatly
enhance InkChat’s effectiveness for collaboration. This seems to arise
primarily by separating the creation and manipulation of the objects of discourse
(diagrams, equations, and so on) from the discussion about the objects and
the manipulations. Quantifying these findings is a subject of ongoing
investigation.</p>
      <p>We would like to thank Michael Friesen, Vadim Mazalov, Amit Regmi, and
Coby Viner for their contributions to the implementation of InkChat. We would
also like to thank James Wake for investigating how InkChat may be integrated
in other environments, including Google Hangouts.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Anthony</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koedinger</surname>
            ,
            <given-names>K.R.</given-names>
          </string-name>
          :
          <article-title>Evaluation of multimodal input for entering mathematical equations on the computer</article-title>
          .
          <source>In: CHI '05 Extended Abstracts on Human Factors in Computing Systems. CHI EA '05</source>
          ,
          <source>ACM</source>
          (
          <year>2005</year>
          )
          <fpage>1184</fpage>
          -
          <lpage>1187</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Microsoft</surname>
            <given-names>Inc.</given-names>
          </string-name>
          : OneNote 2010
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>P.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johnston</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGee</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oviatt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pittman</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clow</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>QuickSet: Multimodal Interaction for Distributed Applications</article-title>
          .
          <source>In: Proceedings of the fifth ACM international conference on Multimedia. MULTIMEDIA '97</source>
          ,
          <source>ACM</source>
          (
          <year>1997</year>
          )
          <fpage>31</fpage>
          -
          <lpage>40</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Abowd</surname>
            ,
            <given-names>G.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brotherton</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bhalodia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <source>Classroom</source>
          <year>2000</year>
          :
          <article-title>A system for capturing and accessing multimedia classroom experiences</article-title>
          .
          <source>In: CHI'98: CHI 98 conference summary on Human factors in computing systems</source>
          ,
          <source>ACM</source>
          (
          <year>1998</year>
          )
          <fpage>20</fpage>
          -
          <lpage>21</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ning</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Slocum</surname>
            ,
            <given-names>A.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanchez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>InkBoard - Tablet PC Enabled Design Oriented Learning</article-title>
          .
          <source>In: Proc. of the 7th International Conference on Computers and Advanced Technology in Education. CATE</source>
          (
          <year>2004</year>
          )
          <fpage>154</fpage>
          -
          <lpage>160</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Friedland</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knipping</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rojas</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tapia</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Teaching with an intelligent electronic chalkboard</article-title>
          .
          <source>In: ETP'04: Proceedings of the 2004 ACM SIGMM workshop on Effective telepresence, ACM</source>
          (
          <year>2004</year>
          )
          <fpage>16</fpage>
          -
          <lpage>23</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Regmi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watt</surname>
            ,
            <given-names>S.M.:</given-names>
          </string-name>
          <article-title>A Collaborative Interface for Multimodal Ink and Audio Documents</article-title>
          .
          <source>In: Proceedings of the 2009 10th International Conference on Document Analysis and Recognition</source>
          . ICDAR '
          <volume>09</volume>
          (
          <year>2009</year>
          )
          <fpage>901</fpage>
          -
          <lpage>905</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Watt</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Underhill</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (Editors)
          <article-title>: Ink Markup Language (InkML) W3C Recommendation</article-title>
          . http://www.w3.org/TR/InkML/ (
          <year>September 2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mazalov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watt</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          :
          <article-title>A Streaming Digital Ink Framework for Multi-Party Collaboration</article-title>
          .
          <source>In: Proceedings of the 11th international conference on Intelligent Computer Mathematics</source>
          . CICM'
          <volume>12</volume>
          (
          <year>2012</year>
          )
          <fpage>81</fpage>
          -
          <lpage>95</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Keshari</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Watt</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          :
          <article-title>Streaming-archival inkml conversion</article-title>
          .
          <source>In: Proc. 2007 International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Brazil, September 23-26</source>
          (
          <year>2007</year>
          ) pp.
          <fpage>1253</fpage>
          -
          <lpage>1257</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>