<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The plain text trap when copying mathematical formulæ</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paul Libbrecht</string-name>
          <email>paul@hoplahup.net</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matija Lokar</string-name>
          <email>Matija.Lokar@fmf.uni-lj.si</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Copyright © by the paper's authors. Copying permitted for private and academic purposes.</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>In: A. Editor, B. Coeditor (eds.): Proceedings of the XYZ Workshop</institution>
          ,
          <addr-line>Location, Country, DD-MMM-YYYY, published at, http://ceur-ws.org</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>PH Weingarten</institution>
          ,
          <addr-line>Weingarten</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Ljubljana</institution>
          ,
          <addr-line>Ljubljana</addr-line>
          ,
          <country country="SI">Slovenia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>When an object, of any nature, is displayed and selectable on a computer screen, users expect it to be copy-and-paste-able: one can invoke the copy function and insert (paste) it at other places, within the same programme or beyond. This holds for many different kinds of objects: texts and images, at least. Unfortunately, for mathematical objects, this is rarely so. Most operating systems offer multiple channels to carry exchanged content, but most mathematical systems do not take advantage of them: they transfer the content in plain text, expecting it to have the right syntax or, if necessary, expecting the user to use a different copy function so that the right syntax is exchanged. While ways to circumvent these issues are available, they are mostly not used by mathematical software. We explore potential justifications and describe the types of users for which these justifications do not apply. To support this, we report briefly on an experiment in which students described their expectations and observations of the above-mentioned process.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>work somewhat properly. Switching between different systems, transferring relevant data, worrying about things
getting "out of sync", differences in command sets and capabilities between different applications, soon becomes
overwhelming [Lie00].</p>
      <p>Users mostly expect a simple sequence: select, copy, switch, insert, paste. However, it is not rare that a more
complex procedure is needed, such as invoking a special copy function or adjusting the pasted content before it is
further processed. For example [Kin02], the Stanford Interactive Workspaces' smart clipboard can copy and paste
data between incompatible applications on different platforms [Kic00]. The smart clipboard must transparently
invoke the machinery whenever the user performs a copy and paste operation. A more sophisticated but less
general approach, semantic snarfing, as implemented in Carnegie Mellon's Pebbles project, captures content from
a large display onto a small display and attempts to emulate the content's underlying behaviors [Mey01].</p>
      <p>In the middle of this process lies the exchanged content. Most operating systems offer multiple channels to
carry this exchanged content, but most mathematical systems do not take advantage of them: they transfer the
content in plain text, expecting it to have the right syntax or, if necessary, expecting the user to use a different
copy function so that the right syntax is exchanged.</p>
      <p>Why is this a problem? Users encounter a range of issues that are all due to this
choice of plain text. They range from syntax mismatch to unmasterable expressions, and from the failure of apparent
syntax compatibility to the somewhat arbitrary text-linearization of the text appearing within the formula.</p>
      <p>While ways to circumvent these issues have long been available in a standardized form (clipboard flavours or
alternative representations in MathML), they are not used by mathematical software. We propose potential
justifications and describe the types of users for which these justifications do not apply.</p>
    </sec>
    <sec id="sec-2">
      <title>Observation Methods for Copy and Paste of Formulæ</title>
      <p>To evaluate what users expect when transferring content between systems, the authors draw on
their usage and teaching experience. This includes courses dedicated to the introduction of various systems
(e.g. introduction to LaTeX, to computer algebra systems, or to dynamic geometry) in undergraduate teacher
education classes. While this class of users is clearly not representative of the complete population of users of
mathematical systems, it represents the important share of moderately technical users and also includes those
who will educate broad masses of citizens.</p>
      <p>With this class of users, a practical experiment was carried out at the University of Ljubljana with about 30
students in the first-cycle professional study program Practical Mathematics: a mathematical task was given,
explicitly requiring the exchange of mathematical expressions between various systems, on which comments and
reports were expected.</p>
      <p>In the first weeks of the subject called Computer tools in mathematics, the students get used to typical examples of
mathematical software they are likely to apply in their career. They mostly obtain basic knowledge, so they
know only the most common functions and functionality of the applications. After a few weeks they received the
experiment's task as one of the weekly assignments: they should solve a certain mathematical task and
report in detail the process of obtaining the solution. In all the tasks given we foresaw the use of different
tools. Most often, the combination of a computer algebra system, a dynamic geometry system and numerically
oriented matrix software was needed. So even when certain tasks could be solved within one program, their
limited familiarity with the software led them to use several programs. For reporting they mostly used the most
common word processing software. We were interested in how they would cope with the process of exchanging
mathematical objects between programs, and in how they would report on the steps where a transfer of various mathematical
objects was expected. The subjects were instructed beforehand to report on all difficulties and obstacles.</p>
      <p>To evaluate what the mathematical systems are able to import, clipboard inspectors are used.
On Windows, the ClipSpy utility shows the bytes allocated for each flavour within the clipboard. The
application seems to show most of the information accessible through the API as documented in [MSCl]. On MacOSX,
ClipboardViewer (an example application of the Apple Developer Tools) has been used. It shows, similarly, an
association between the "Uniform Type Identifiers" (the name of the clipboard flavour types defined by Apple,
see [ApUTI]) and the byte-values. Both the Uniform Type Identifiers (on MacOSX) and the Windows flavour
names are a flexible mechanism to encode the type of content-flavours: the strings define, by a kind of cooperative
agreement, which data-type is put in the clipboard. Platform makers define basic types (e.g. basic images
and texts) while applications are free to define new formats. In the case of UTIs, a mechanism of inheritance is
provided.</p>
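      <p>As a rough illustration of the flavour mechanism described above, the following Python sketch models a clipboard as a mapping from flavour names to bytes and follows a simplified, single-parent version of the UTI inheritance chain. The identifiers public.utf8-plain-text, public.plain-text, public.text and public.data are defined by Apple, and public.mathml.presentation is the identifier listed in the MathML media-types; the helper functions, the dictionary layout and the sample content are assumptions made for this example and are not the API of either platform.</p>
      <preformat><![CDATA[
```python
# Toy model: one clipboard entry per flavour name, each holding raw bytes.
# CONFORMS_TO is a simplified, single-parent excerpt of the UTI hierarchy.
CONFORMS_TO = {
    "public.utf8-plain-text": "public.plain-text",
    "public.plain-text": "public.text",
    "public.text": "public.data",
}


def conforms(uti, ancestor):
    """Return True if `uti` equals `ancestor` or inherits from it."""
    while uti is not None:
        if uti == ancestor:
            return True
        uti = CONFORMS_TO.get(uti)  # None once the chain is exhausted
    return False


def flavours_conforming_to(clipboard, ancestor):
    """List the flavour names on the clipboard that conform to `ancestor`."""
    return sorted(name for name in clipboard if conforms(name, ancestor))


# Hypothetical clipboard content after copying the formula x^2 from some system.
clipboard = {
    "public.utf8-plain-text": "x^2".encode("utf-8"),
    "public.mathml.presentation": b"<math><msup><mi>x</mi><mn>2</mn></msup></math>",
}

text_flavours = flavours_conforming_to(clipboard, "public.text")
```
]]></preformat>
      <p>In this sketch, a recipient that only asks for public.text still finds the UTF-8 plain text entry through inheritance, whereas the MathML entry is only found by a recipient that asks for it by name.</p>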
    </sec>
    <sec id="sec-3">
      <title>Issues when employing plain text in copying and pasting formulæ</title>
      <p>The first family of issues that we have met is that default copy mechanisms are rarely universally applicable and
are, very commonly, restricted to copying and pasting internally, for which the system clipboard is not used (as
copying from the system to itself is much safer). This same default copy function tends to copy a representation
that is close to a source format within the text flavours. While this representation may be readable by users,
it is not effective as input to most other mathematical systems, as the input syntax is specific to each system.
Mismatches of syntaxes then need to be manually fixed by the user, who has to understand both syntaxes;
something that is cognitively demanding when the mathematical thinking also demands cognitive resources.</p>
      <p>Examples of such incompatibilities include the incomplete LaTeX compatibility of a dynamic geometry system:
the user is left to explore bit by bit what works and what does not and, unlike for LaTeX itself, the documentation
of the supported features is far scantier.</p>
      <p>The second family of issues lies in the apparent compatibility of syntaxes between mathematical systems.
While some expressions seem naturally exchangeable, e.g. simple polynomials written with "^" as the exponent,
this compatibility breaks very quickly (e.g. such a polynomial is not usable as is in a Python source,
where the exponent is written "**" and "^" denotes bitwise exclusive-or). Much more delicate is the exchange of formulæ between
Mathematica and Geogebra, as is displayed in the picture below, produced by one of the experiment subjects: it
shows that the Sum operator does not follow exactly the same syntax in the two computer algebra systems,
even though the syntaxes are quite similar, as is also the case for elementary formulæ in these systems.</p>
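      <p>The fragility of this apparent compatibility can be checked directly. The sketch below is an illustration added here, not part of the experiment: it evaluates the caret form of a polynomial as Python source. The expression is accepted without any error but computes something else, because "^" is bitwise exclusive-or in Python and binds less tightly than "+". The helper eval_expr is a hypothetical name, and eval is used only because the input strings are fixed and trusted.</p>
      <preformat><![CDATA[
```python
def eval_expr(source, x):
    """Evaluate a fixed, trusted expression string with a single variable x."""
    return eval(source, {"__builtins__": {}}, {"x": x})


caret_form = "x^2 + 1"                       # as copied from a system using ^
python_form = caret_form.replace("^", "**")  # manual repair for Python

# In Python, x^2 + 1 parses as x ^ (2 + 1), a bitwise exclusive-or.
silently_wrong = eval_expr(caret_form, 3)    # 3 ^ 3 == 0
intended = eval_expr(python_form, 3)         # 3**2 + 1 == 10
```
]]></preformat>
      <p>The pasted text is thus not rejected, which would at least alert the user; it is silently given a different meaning.</p>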
      <p>A third issue lies in the need to use dedicated copy functions. It forces the user to navigate through sequential
menu choices to decide which format to copy (at worst, she first needs to invoke a dedicated command). Among the
particularly invasive ones is the frequent function "Copy as MathML", which puts the (generally large) MathML
source code on the clipboard as text, to be intercepted by paste actions ("this looks like MathML"): the difficulty
of reaching the menu and the rather wild appearance of the XML expression make this operation very far from
natural.</p>
      <p>Finally, another issue in using plain text arises in web browsers: as MathML there is generally represented as
part of an HTML text page, the copied content is text, sometimes styled text. The copy function generally copies
only an unordered sequence of characters, which may be valid only if fractions or roots are not present; moreover,
the copy function relies on a selection mechanism which completely ignores the underlying mathematical
structure (even that of MathML).</p>
    </sec>
    <sec id="sec-4">
      <title>User types and their affinity to fiddling with text</title>
      <p>While we acknowledge that using plain text to copy and paste often yields readable syntax and, what is probably
more important, repairable syntax, it can be accepted differently depending on the users.</p>
      <p>Experts we have met generally accept the syntax differences well; these can even be a good hint of the capacity
spectrum of the computing system. However, there is a clear need to master the different syntaxes well, a mastery
that many of the user types we met generally lack.</p>
      <p>The following comments appeared in our experiment and illustrate our case:</p>
      <p>"... that Copy/Paste method does not work with drawing plots in different programs. For drawing plots each
one of used programs has its own 'language' which we have to use."</p>
      <p>"... we were very rarely able to use the Copy/Paste method to transfer expressions between programs and
even then some corrections were usually needed for programs to work"</p>
      <p>"After several tries I gave up. The only way to transfer an expression between programs is to manually type
it..."</p>
      <p>"To input matrix in X is surprisingly identical to input matrix in Y. But unfortunately here the joy ends."</p>
      <p>In the discussion after the task, all the students we performed the experiment with expressed their deep
disappointment with the software. When they started the task, almost all of them expected that there would be only
minor problems. As one student said: "it is all mathematical software and mathematics is just one, so I expected
no problems in using Copy/Paste".</p>
      <p>We conclude that the learning of syntaxes is the biggest hurdle for these students and that alternative ways
need to be found to make a more productive use of the interchanges between pieces of software.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion: Standards that Support Copy and Paste</title>
      <p>While copying plain text formulæ has been considered a universal approach, we have described a population of users
for which this approach fails and for which different exchange mechanisms are needed.</p>
      <p>Standards exist to this end. They include the clipboard flavour names of Windows and the UTIs on MacOSX as
specified in the media-types of MathML: these describe the names of three clipboard flavours that can encode
MathML for several applications. Using them, a computing system can use its internal mechanisms
to export MathML content in the form of a parseable tree and MathML presentation in the form of a rendered
LaTeX fragment.</p>
      <p>Similarly, embedding MathML within HTML fragments might carry the full wealth of MathML
exchanges, all the more so as the content of web pages receives considerably more attention. However, many of the efforts
towards empowering web-oriented copy-and-paste such as the Clipboard API draft tend to consider [Che15].</p>
      <p>Exchanging MathML offers another possibility to combine multiple types: through the semantics,
annotation, and annotation-xml elements, the markup may contain alternative representations of the formulæ in
different media-types (which we consider equivalent to clipboard flavour names), see [Car13, §6.4]. It is not clear whether
this competing alternate-representations mechanism may produce different results.</p>
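      <p>To make the alternate-representations mechanism more concrete, here is a minimal Python sketch that parses a MathML fragment and collects what its semantics element offers, keyed by the encoding attribute. The sample formula, the function name alternate_representations and the choice of the encoding values "MathML-Content" and "application/x-tex" are assumptions made for this illustration; a real system would have to decide which encodings it emits and recognises.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative fragment: presentation markup plus two alternate representations.
SAMPLE = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><msup><mi>x</mi><mn>2</mn></msup><mo>+</mo><mn>1</mn></mrow>
    <annotation-xml encoding="MathML-Content">
      <apply><plus/><apply><power/><ci>x</ci><cn>2</cn></apply><cn>1</cn></apply>
    </annotation-xml>
    <annotation encoding="application/x-tex">x^2 + 1</annotation>
  </semantics>
</math>"""


def alternate_representations(mathml_source):
    """Map each annotation's encoding to its textual or serialized XML content."""
    root = ET.fromstring(mathml_source)
    found = {}
    for semantics in root.iter("{%s}semantics" % MATHML_NS):
        for child in semantics:
            local_name = child.tag.split("}")[-1]
            if local_name == "annotation":
                # Text-based alternative, e.g. a LaTeX source string.
                found[child.get("encoding")] = (child.text or "").strip()
            elif local_name == "annotation-xml":
                # XML-based alternative: serialize its first child element.
                inner = list(child)
                found[child.get("encoding")] = (
                    ET.tostring(inner[0], encoding="unicode") if inner else ""
                )
    return found


representations = alternate_representations(SAMPLE)
```
]]></preformat>
      <p>The first child of semantics (the presentation markup) is deliberately not listed: it is the default rendering, and the annotations are the alternatives a recipient may prefer.</p>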
      <p>If this is done, the receiving party can choose the format it best understands from the list of clipboard flavour
names and from the alternate representations in MathML. Thus a layout software might use a
MathML-presentation or Rich Text Format fragment, or even a picture; however, another computing system would use the
MathML-content fragment or another flavour which is potentially better suited (e.g. to accommodate
bilateral import-export functions). Moreover, receiving multiple representations is possible with clipboard flavours
and with MathML alternate representations. This allows the recipient to request all of the data-streams for each
of the formats and potentially evaluate further its processability or offer a choice to the user using the internal
display mechanisms (as investigated in [Lib09]).</p>
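      <p>The choice made by the receiving party can be sketched as a simple preference lookup. In the Python example below, "MathML", "MathML Presentation" and "MathML Content" are the Windows flavour names given in the MathML media-types; the other names, the two preference lists and the function choose_flavour are hypothetical and only serve to show how two different recipients could pick different flavours from the same clipboard.</p>
      <preformat><![CDATA[
```python
def choose_flavour(available, preferences):
    """Return the first preferred flavour name that is on offer, or None."""
    for name in preferences:
        if name in available:
            return name
    return None


# Hypothetical preference orders, best first, with plain text as last resort.
LAYOUT_PREFERENCES = ["MathML Presentation", "Rich Text Format", "PNG", "Text"]
CAS_PREFERENCES = ["MathML Content", "MathML", "Text"]

# Flavours placed on the clipboard by a hypothetical well-behaved sender.
offered = {"MathML Presentation", "MathML Content", "Text"}

layout_choice = choose_flavour(offered, LAYOUT_PREFERENCES)
cas_choice = choose_flavour(offered, CAS_PREFERENCES)
text_only_choice = choose_flavour({"Text"}, CAS_PREFERENCES)
```
]]></preformat>
      <p>With a sender that only offers plain text, both recipients fall back to the last entry of their list, which is the situation the students of our experiment were facing.</p>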
      <p>As for systems which do not understand any of the mathematically oriented flavours: a picture, a rich text
format or... a fragment of plain text (as a last resort) might be satisfactory.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref-ApUTI">
        <mixed-citation>
          [ApUTI] Apple Inc. Introduction to Uniform Type Identifiers. Mac Developer Library. Available at https://developer.apple.com/library/mac/documentation/FileManagement/Conceptual/understanding_utis/understand_utis_intro/understand_utis_intro.html.
        </mixed-citation>
      </ref>
      <ref id="ref-Car10">
        <mixed-citation>
          [Car10] D. Carlisle. MathML on the Clipboard. Blog entry. http://dpcarlisle.blogspot.de/2010/01/mathml-on-clipboard.html.
        </mixed-citation>
      </ref>
      <ref id="ref-Car13">
        <mixed-citation>
          [Car13] D. Carlisle and P. Ion and R. Miner. Mathematical Markup Language, Version 3 (2nd edition). W3C. http://www.w3.org/TR/MathML3/.
        </mixed-citation>
      </ref>
      <ref id="ref-Che15">
        <mixed-citation>
          [Che15] D. Cheng. Clipboard API: remove dangerous formats from mandatory data types. public-webapps mailing list post on 2015-06-09. Archive of the thread visible at https://lists.w3.org/Archives/Public/public-webapps/2015AprJun/0819.html.
        </mixed-citation>
      </ref>
      <ref id="ref1">
        <mixed-citation>
          [Kic00]
          <string-name>
            <given-names>E.</given-names>
            <surname>Kiciman</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Fox</surname>
          </string-name>
          .
          <article-title>Using Dynamic Mediation to Integrate COTS Entities in a Ubiquitous Computing Environment</article-title>
          . In
          <source>Handheld and Ubiquitous Computing: Second International Symposium, HUC 2000, Bristol, UK, September 25–27, 2000, Proceedings</source>
          ,
          <fpage>221</fpage>
          –
          <lpage>226</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Kin02]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kindberg</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Fox</surname>
          </string-name>
          .
          <article-title>System software for ubiquitous computing</article-title>
          .
          <source>IEEE Pervasive Computing</source>
          ,
          <volume>1</volume>
          (
          <issue>1</issue>
          ):
          <fpage>70</fpage>
          –
          <lpage>81</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Lie00]
          <string-name>
            <given-names>H.</given-names>
            <surname>Lieberman</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Selker</surname>
          </string-name>
          .
          <article-title>Out of context: Computer systems that adapt to, and learn from, context</article-title>
          .
          <source>IBM Systems Journal</source>
          ,
          <volume>39</volume>
          (
          <issue>3.4</issue>
          ):
          <fpage>617</fpage>
          –
          <lpage>632</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Lib09]
          <string-name>
            <given-names>P.</given-names>
            <surname>Libbrecht</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Andres</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gu</surname>
          </string-name>
          .
          <article-title>Smart pasting for ActiveMath Authoring</article-title>
          .
          <source>Proceedings of MathUI 09 Workshop</source>
          , http://www.activemath.org/workshops/MathUI/09/,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [MSCl]
          <string-name>Microsoft Inc</string-name>
          . Clipboard Class, API documentation. Available at https://msdn.microsoft.com/en-us/library/system.windows.clipboard.aspx (checked 2016-07-20).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Mey01]
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Myers</surname>
          </string-name>
          and
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Peck</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Nichols</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Kong</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>Interacting at a Distance Using Semantic Snarfing</article-title>
          . In
          <source>Ubicomp 2001: Ubiquitous Computing: International Conference, Atlanta, Georgia, USA, September 30–October 2, 2001, Proceedings</source>
          ,
          <fpage>305</fpage>
          –
          <lpage>314</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Pad04]
          <string-name>
            <given-names>L.</given-names>
            <surname>Padovani</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Solmi</surname>
          </string-name>
          .
          <article-title>An Investigation on the Dynamics of Direct-Manipulation Editors for Mathematics</article-title>
          .
          In
          <source>Mathematical Knowledge Management: Third International Conference, MKM 2004, Białowieża, Poland, September 19–21, 2004, Proceedings</source>
          ,
          <fpage>302</fpage>
          –
          <lpage>316</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Zha03]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Fateman</surname>
          </string-name>
          .
          <article-title>Survey of user input models for mathematical recognition: Keyboards, mice, tablets, voice</article-title>
          . Computer Science Division, University of California,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>