<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Factors for Reading Mathematical Expressions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea Kohlhase</string-name>
        </contrib>
        <aff>Hochschule Neu-Ulm, Germany</aff>
      </contrib-group>
      <abstract>
        <p>Mathematical expressions are all around us, so the practice of reading them should be a skill comparable to reading text. But there is an intuitively experienced and by now also better understood difference. In this paper we report on our experiments that helped to get a deeper grasp on this difference. By taking a closer look we hope to understand how to bridge the gap between text and represented mathematics by supporting the readers with customized interactive documents.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Expressing mathematical knowledge in mathematical expressions evolved over
the last three centuries and because of its elegance is considered a form of art by
mathematicians. Indeed, it has revolutionized the way mathematics is created,
stored, and communicated. However, mathematicians are not the only ones who
read and need to understand mathematical expressions, as these are used in all kinds of
science, technology, engineering, and mathematics (STEM) documents. Unfortunately,
formulae are at the same time an indispensable tool for the initiated and a formidable
barrier to novices; consider, for example, the well-known “formula anxiety” (see
e.g. [SGM10]). Surprisingly little is known about the skill of reading formulae, so
in 2016 we decided to start a series of eye-tracking experiments with a Tobii T60
eye-tracker to understand how people read them.</p>
      <p>In a pilot study [KF16] we explored and identified idiosyncratic practices with
math expressions. We invited 23 participants to look at concrete
math expressions like the one in Figure 1. One goal of this
exploratory study was to find relevant discrepancies between various user groups,
particularly along the dimensions older vs. younger, female vs. male, and math-oriented
vs. non-math-oriented. To our surprise, the only difference
we could make out was the one between math-oriented
(math) and non-math-oriented (no-math) subjects.</p>
      <p>Fig. 1. A Simple Set of Equations</p>
      <p>For instance, Fig. 2 shows the gaze opacity maps after
the indicated time for the math and no-math participant
groups. A gaze opacity map shows only those areas that the selected participants
looked at and thus saw; everything else is blacked out.</p>
      <p>If nothing in particular strikes the human eye, a viewer typically starts taking in
a given input at its centre. This explains why participants in the no-math group
began to process the math expression at its centre. Interestingly,
the members of the math group did not. Instead they started by looking at the first
equals sign “=” and later at the operator “+”. We interpreted this as a
content-driven reading behaviour by our math-literate subjects.</p>
      <p>Even more interestingly, we could demonstrate that people in the no-math
group read a complex formula like text, from left to right (see the exemplary
gaze plot of one such participant on the right-hand side of Figure 3). In contrast, math-literate
subjects, like the one on the left-hand side of Figure 3, explored the
complex formula along its operator tree: they first understand the left-hand part of the
equation and analyze its constituents, and only then move on to the following parts.
A gaze plot is a visualization of fixations over time.</p>
    </sec>
    <sec id="sec-2">
      <title>Factor: Community of Practice</title>
      <p>For [KKF17] we conducted an eye-tracking study with 29 math-affine subjects to
confirm, refine, or reject these hypotheses. In a nutshell, we could show that math
expressions are decoded in the order of a depth-first traversal of the operator
tree, that simple expressions often serve only as placeholders for argument positions, and
that visual patterns are used for top-level structure detection in math expressions. We
summarized our observations in the “Gestalt Law Hypothesis”, which basically
states that the decoding process of a mathematical expression has a first parsing
phase based on its Gestalt structure and a second understanding phase, in which
structural details are cross-checked.</p>
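      <p>To make the notion of a depth-first traversal of an operator tree concrete, the following Python sketch may help. It is an illustration only, not the analysis code of the study: the class Node, the function decoding_order, and the example equation a + b = c are assumptions introduced here, and we assume a pre-order variant of depth-first traversal (operator first, then its arguments from left to right).</p>
      <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of an operator tree: an operator or a leaf symbol."""
    label: str
    children: list = field(default_factory=list)


def decoding_order(node):
    """Return the node labels in depth-first pre-order.

    The operator is listed first, then each argument subtree from
    left to right, each one fully explored before the next.
    """
    order = [node.label]
    for child in node.children:
        order.extend(decoding_order(child))
    return order


# Hypothetical example: the equation a + b = c as an operator tree.
equation = Node("=", [Node("+", [Node("a"), Node("b")]), Node("c")])

print(decoding_order(equation))
```
      </preformat>
      <p>For this example the sketch prints ['=', '+', 'a', 'b', 'c']: the left-hand side of the equation is explored completely before the right-hand side is visited.</p>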
      <p>The data from the same experiment led to an analysis in [Koh17] with respect
to subjects belonging to different Communities of Practice (CoP). We started
out with the distinction between mathematicians and computer
scientists, as the whole group of subjects could be split roughly in half according
to their self-conception. Astonishingly, this analysis was fruitless: no distinctions
were observable. Looking for a reason, we realized that our test set-up used
equations from different CoPs and that not all participants belonged to the CoP
matching a given equation. When we compared CoP members and
non-CoP members for a specific equation, a clear distinction could be observed.</p>
      <p>Fig. 5 shows Variant A, a modification of the Type equation given in
Fig. 4 with masked subexpressions: the index/type → → was replaced with
α → → . Interestingly, members of both groups (Type Theory experts
from the CS group and classical mathematicians from the MATH group) identified the
erroneous modification quite clearly, as all fixated on the error spot the longest (rows
1 and 2).</p>
      <p>But did they also grasp the error? After each formula presentation we asked
the participants whether they considered the presented math
expression to be a representation of the original expression. So we looked at the
questionnaire answers and were surprised to find that 75% of the CS subjects
acknowledged the modification (and 17% did not), whereas the MATH participants
were evenly split between acknowledging it and not (50% each, see rows 5 and 6). A closer
look at the heatmap of the CS participants who judged Variant A to be a false
variant (4th row in Fig. 5) reveals that those participants focused on the erroneous
type. Their knowledge retrieval seems to be very effective: they focus on the
relevant information concerning indices/types and check the wrong type against the
other types to confirm their suspicion. The CS group paid significantly more
attention to the type components, whereas the mathematicians seemed more
interested in the homomorphic structure at the term level. In general, CS members
are more acquainted with Type Theory than MATH members.</p>
      <p>Note that knowing that there was an error (as stated by the subjects in the
questionnaire) could hint at differences in comprehension, that is, competency.
Nevertheless, the CoP of type theorists within the CS group coincided strongly
with the group of CS participants who answered the question in the negative. This should
be confirmed in a future study.</p>
    </sec>
    <sec id="sec-3">
      <title>Factor: Math Competency</title>
      <p>The most recent eye-tracking study [KKO18] was carried out with 23 students
from the Biomechanical and Electrical Engineering Program at Srinakharinwirot
University (SWU), Thailand. The students were observed reading and
understanding a ‘solved-problem’, which was, or at least should have been, familiar to all of them
from their previous studies.</p>
      <p>Here, we want to look at the data from the perspective of the students’ comprehension
level, or competency. We used the eye-tracker’s own metrics of visits,
fixations, and dwell time for the equations in the solved-problem document.
There were three equations {Eq-1, Eq-2, Eq-3} in the text. Each equation
consisted of sub-expressions, like Eq-1-1 in Eq-1, and we even went one level deeper,
with Eq-1-1-1 being part of Eq-1-1.</p>
      <p>In the following figures we see the differences between the students in the
high group (those students who demonstrated a high level of
comprehension in an interview after the experiment) and the low group (those who did
not). The data points represent the means of the corresponding values.</p>
      <p>In particular, if we look at Figure 6, we observe that members of the high group
consistently visited all math expressions more often than members of the low group. Interestingly, the
percentage difference (“Visits-Diff”) is relatively stable across all levels: it ranges
only between 12.61% and 19.23%. For the high group, the first equation gets the most visits,
the second fewer, and the third the fewest, as if
interest in the later equations were reduced. Note that this is not the case for the
low group: here, the first and second equations are on the same level.</p>
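      <p>The text does not spell out how the percentage differences between the two groups are computed. The following Python sketch shows one plausible reading, under the assumption that the difference of the group means is expressed relative to the low-group mean. The function name percent_diff and all numbers are made up for illustration and are not data from the study.</p>
      <preformat>
```python
from statistics import mean


def percent_diff(high_values, low_values):
    """Difference of the group means, in percent of the low-group mean.

    Assumption: the low group serves as the baseline; the study may
    have used a different normalization.
    """
    high_mean = mean(high_values)
    low_mean = mean(low_values)
    return 100.0 * (high_mean - low_mean) / low_mean


# Made-up visit counts per participant for one equation (not study data).
high_visits = [12, 10, 14]
low_visits = [10, 9, 11]

print(round(percent_diff(high_visits, low_visits), 2))
```
      </preformat>
      <p>With these made-up counts the group means are 12 and 10 visits, so the sketch prints 20.0. The same function could be applied to fixation counts or visit durations.</p>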
      <p>Moreover, Figure 7 shows the number of fixations for the distinct expression levels
by the high vs. the low group. Here, the members of the high group clearly
have more fixations in the upper expression levels. The percentage difference
(“Fixations-Diff”) is again relatively stable across all levels: it ranges only between 12.95%
and 19.08%. Note that the number of fixations by members of the high group
is highest for the first equation, lower for the second, and lowest for the
third. Again, this is different for the low group, whose fixation counts
for the first and second equations are almost identical.</p>
      <p>Finally, if we look at the relative visit durations in Figure 8, that is, the time the
participants spent on the distinct areas relative to their own total time,
we conclude that the high subjects spent only slightly more time on the distinct
expressions than the low subjects. This is interesting, as they clearly made more
visits and fixations, that is, their eyes jumped a lot more within the text. We
conjecture that they are much more immersed and active in their
information perception. We cannot tell whether this activity is a cause or
an effect; a further experiment will have to provide the answer.
The percentage difference (“Rel. Visit Duration-Diff”) is
even more stable across all levels: it ranges only between 10.07% and 14.13%.</p>
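      <p>The relative visit duration described above can be sketched as a simple normalization: the time a participant spent on an area divided by that participant's own total time. The Python snippet below is a minimal illustration under that reading. The function name relative_visit_durations, the area labels, and all times are assumptions made up for this example, not study data.</p>
      <preformat>
```python
def relative_visit_durations(durations_by_area, total_time):
    """Share of a participant's total time spent on each area, in percent."""
    return {
        area: 100.0 * seconds / total_time
        for area, seconds in durations_by_area.items()
    }


# Made-up visit durations in seconds for one participant (not study data).
durations = {"Eq-1": 12.0, "Eq-2": 6.0, "Eq-3": 3.0}
shares = relative_visit_durations(durations, total_time=60.0)

print(shares)
```
      </preformat>
      <p>For this hypothetical participant the sketch prints {'Eq-1': 20.0, 'Eq-2': 10.0, 'Eq-3': 5.0}. Normalizing by each participant's own total time makes fast and slow readers comparable before the group means are formed.</p>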
      <p>We can conclude that the competency of readers has an impact on reading
behaviour as well.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper we summarized the results of our eye-tracking experiments with
respect to the factors that cause the practice of reading math to differ
from that of reading text. Math literacy, as an important first factor, seems to help
differentiate reading math from reading text. A second factor is membership
in a community of practice. This membership and its associated practices help
to distinguish important information from irrelevant information, which in turn
causes distinct reading patterns for math. As a third factor we hypothesize that
competency changes how mathematical expressions are looked at. Note that CoP
membership does not necessarily imply competency, nor vice versa. In the long
run we hope to understand how to bridge the gap between reading text and
mathematics by supporting readers with customized active documents.</p>
      <p>Acknowledgement. Many thanks to the colleagues Michael Kohlhase, Michael Fürsich, and Taweechai
Ouypornkochagorn working jointly on the eye-tracking experiments and analyses
thereof.</p>
      <p>[KF16] Andrea Kohlhase and Michael Fürsich. “Understanding Mathematical
Expressions: An Eye-Tracking Study”. In: Mathematical User Interfaces
Workshop. Ed. by Andrea Kohlhase and Paul Libbrecht. July 2016. url:
http://ceur-ws.org/Vol-1785/M2.pdf.</p>
      <p>[KKF17] Andrea Kohlhase, Michael Kohlhase, and Michael Fürsich. “Visual
Structure in Math Expressions”. In: Intelligent Computer Mathematics (CICM)
2017. Conferences on Intelligent Computer Mathematics. Ed. by Herman
Geuvers et al. LNAI 10383. Springer, 2017. doi: 10.1007/978-3-31962075-6.
url: http://kwarc.info/kohlhase/papers/cicm17-eyetracking.pdf.</p>
      <p>[KKO18] Andrea Kohlhase, Michael Kohlhase, and Taweechai Ouypornkochagorn.
“Discourse Phenomena in Math Documents”. In: Intelligent Computer
Mathematics (CICM) 2018. Conferences on Intelligent Computer
Mathematics. Ed. by Florian Rabe et al. LNAI. accepted. Springer, 2018. url:
http://kwarc.info/kohlhase/papers/cicm18-discourse.pdf.</p>
      <p>[Koh17] Andrea Kohlhase. “Domain-Dependant Decoding of Math Expressions”.
In: MathUI 2017: The 12th Workshop on Mathematical User Interfaces.
Ed. by Andrea Kohlhase and Marco Pollanen. 2017.</p>
      <p>[SGM10] Alexander Strahl, Julian Grobe, and Rainer Müller. “Was schreckt bei
Formeln ab? - Untersuchung zur Darstellung von Formeln”. In: PhyDid B
- Didaktik der Physik - Beiträge zur DPG-Frühjahrstagung 0.0 (2010).</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>