
Factors for Reading Mathematical Expressions

Andrea Kohlhase

Hochschule Neu-Ulm

Germany

Mathematical expressions are all around us, so reading them should be a skill comparable to reading text. Yet there is a difference that readers experience intuitively and that is by now also better understood. In this paper we report on experiments that helped us get a deeper grasp of this difference. By taking a closer look we hope to understand how to bridge the gap between text and represented mathematics by supporting readers with customized interactive documents.

Introduction

The practice of expressing mathematical knowledge in mathematical expressions has evolved over the last three centuries and, because of its elegance, is considered a form of art by mathematicians. Indeed, it has revolutionized the way mathematics is created, stored, and communicated. However, it is not only mathematicians who read and have to understand mathematical expressions, as they are used in all kinds of science/technology/engineering/mathematics (STEM) documents. Unfortunately, formulae are at the same time an indispensable tool for the initiated and a formidable barrier to novices; consider for example the well-known “formula anxiety” (see e.g. [SGM10]).

Surprisingly little is known about the skill of reading formulae, so in 2016 we decided to start a series of eye-tracking experiments with a Tobii T60 eye-tracker to understand how people read them. In a pilot study [KF16] we explored and identified idiosyncratic practices with math expressions. We invited 23 participants to look at concrete math expressions like the one in Figure 1. One goal of this exploratory study was to find relevant discrepancies between various user groups, particularly along the dimensions older/younger, female/male, and math-oriented/non-math-oriented. To our surprise, the only difference we could make out was the one between math-oriented (math) and non-math-oriented (no-math) subjects.

Fig. 1. A Simple Set of Equations

For instance, we gathered the gaze opacity maps¹ after the indicated time for the math and no-math participant groups in Fig. 2.

¹ With a gaze opacity map one sees only those areas that the selected participants looked at and thus saw. Everything else is blacked out.

If nothing in particular strikes the human eye, a viewer typically starts to take in the given input at its centre. This explains why participants in the no-math group started with the centre of the math expression. Interestingly though, the members of the math group did not. Instead they started by looking at the first equality symbol “=” and later at the operator “+”. We interpreted this as content-driven reading behaviour by our math-literate subjects.

Even more interestingly, we could demonstrate that people in the no-math group read a complex formula like text, from left to right (see the exemplary gaze plot² of one such participant on the right-hand side of Figure 3). In contrast, math-literate subjects (like the one on the left-hand side of Figure 3) explored the complex formula along its operator tree: they first understand the left-hand part of the equation and analyze its constituents, and only then go on to the following parts.

² A gaze plot is a visualization of fixations over time.

Factor: Community of Practice

For [KKF17] we conducted an eye-tracking study with 29 math-oriented subjects to confirm, refine, or reject these hypotheses. In a nutshell, we could show that math expressions are decoded in the order of a depth-first traversal of the operator tree, that simple expressions often serve only as placeholders for argument positions, and that visual patterns are used to detect the top-level structure of math expressions. We summarized our observations in the “Gestalt Law Hypothesis”, which basically states that the decoding process of a mathematical expression has a first parsing phase based on its Gestalt structure and a second understanding phase, in which structural details are cross-checked.
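To make the contrast between the two reading orders concrete, the following is a minimal sketch and not the analysis code used in the studies. The `Node` class, the example expression a = b + c, and the choice of a pre-order variant of depth-first traversal (operator before its arguments) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of an operator tree: an operator or a leaf symbol (hypothetical type)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def depth_first(node: Node) -> List[str]:
    """Pre-order depth-first traversal: a node first, then its subtrees left to right."""
    order = [node.label]
    for child in node.children:
        order.extend(depth_first(child))
    return order


def left_to_right(node: Node) -> List[str]:
    """Written (infix) order of the symbols; assumes every operator is binary."""
    if not node.children:
        return [node.label]
    left, right = node.children
    return left_to_right(left) + [node.label] + left_to_right(right)


# Hypothetical expression, not taken from the experiments: a = b + c
tree = Node("=", [Node("a"), Node("+", [Node("b"), Node("c")])])

print(depth_first(tree))    # ['=', 'a', '+', 'b', 'c']
print(left_to_right(tree))  # ['a', '=', 'b', '+', 'c']
```

Under these assumptions the depth-first order visits “=” before “+”, whereas the left-to-right order simply follows the symbols as written.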

The data of the same experiment led to an analysis in [Koh17] with respect to subjects belonging to different Communities of Practice (CoP). We started out with the distinction between mathematicians and computer scientists, as the whole group of subjects could be split roughly in half according to their self-conception. Astonishingly, this analysis was fruitless: no distinctions were observable. Looking for a reason, we realized that our test set-up used equations from different CoPs and that not all participants belonged to the CoP fitting a given equation. When we looked at CoP members versus non-CoP members for a specific equation, a clear distinction could be observed.

Fig. 5 shows variant A, a modification of the Type equation given in Fig. 4 with masked subexpressions: the index/type → → was replaced with α → →.³ Interestingly, members of both groups (Type Theory experts from CS and classical mathematicians from the MATH group) identified the erroneous modification quite clearly, as all of them fixated the error spot the longest (rows 1 and 2).

But did they also grasp the error? After each formula presentation we asked the participants whether they considered the presented math expression to be a representation of the original expression. So we looked at the questionnaire answers and were surprised to find that 75% of the CS subjects acknowledged the modification (and 17% did not), whereas the MATH participants were evenly split (50% each, see rows 5 and 6). A closer look at the heatmap of the CS participants judging variant A to be a false variant (4th row in Fig. 5) reveals that those participants focused on the erroneous type. Their knowledge retrieval seems to be very effective: they focus on the relevant information concerning indices/types and check the wrong type against the other types to confirm their suspicion. The CS group paid significantly more attention to the type components, whereas the mathematicians seemed more interested in the homomorphic structure at the term level.

³ In general, CS members are more acquainted with Type Theory than MATH members.

Note that knowing that there was an error (as stated by the subjects in the questionnaire) could hint at differences in comprehension, that is, competency. Nevertheless, the CoP of type theorists within the CS group coincided strongly with the group of CS participants who answered the question in the negative. This should be confirmed in a future study.

Factor: Math Competency

The most recent eye-tracking study [KKO18] was carried out with 23 students from the Biomechanical and Electrical Engineering Program at Srinakharinwirot University (SWU), Thailand. The students were observed reading and understanding a ‘solved problem’ that was, or should have been, familiar to all of them from their previous studies.

Here, we want to look at the data with respect to the students’ comprehension level, or competency. We used the eye-tracker’s own metrics of visits, fixations, and dwell time for the equations present in the solved-problem document. There were three equations {Eq-1, Eq-2, Eq-3} in the text. Each equation consisted of sub-expressions like Eq-1-1 in Eq-1, and we went one level further down, with Eq-1-1-1 being part of Eq-1-1.
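The nesting encoded in labels such as Eq-1-1-1 can be recovered mechanically. The sketch below is only an illustration under the assumption that every label follows the hyphen-separated pattern shown above; the helper names `aoi_level` and `aoi_parent` are hypothetical and not part of any eye-tracking software.

```python
from collections import defaultdict
from typing import Optional


def aoi_level(label: str) -> int:
    """Expression level of a label: 'Eq-1' -> 1, 'Eq-1-1' -> 2, 'Eq-1-1-1' -> 3."""
    return len(label.split("-")) - 1


def aoi_parent(label: str) -> Optional[str]:
    """Enclosing expression of a label, or None for a top-level equation."""
    if aoi_level(label) <= 1:
        return None
    return label.rsplit("-", 1)[0]


# Labels following the naming pattern described in the text.
labels = ["Eq-1", "Eq-1-1", "Eq-1-1-1", "Eq-2", "Eq-3"]

# Group the labels by expression level.
by_level = defaultdict(list)
for label in labels:
    by_level[aoi_level(label)].append(label)

print(dict(by_level))
# {1: ['Eq-1', 'Eq-2', 'Eq-3'], 2: ['Eq-1-1'], 3: ['Eq-1-1-1']}
print(aoi_parent("Eq-1-1-1"))  # Eq-1-1
```

Grouping by level in this way is one possible basis for comparing metrics per expression level.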

In the following figures we see the differences between the students in the high group (those students who showed a high level of comprehension in an interview after the experiment) and the low group (those who did not). The data points represent the means of the corresponding values.

In particular, if we look at Figure 6, we observe that high-group members consistently visited all math expressions more often than low-group members. Interestingly, the percentage difference (“Visits-Diff”) is relatively stable across all levels: it ranges only between 12.61% and 19.23%. For the high group, the first equation gets the most visits, the second fewer, and the third the fewest, as if interest in the later equations were reduced. Note that this is not the case for the low group: here, the first and second equations are on the same level.
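A percentage difference between two group means can be computed as in the following sketch. The text does not state which group mean serves as the baseline, so dividing by the low-group mean is an assumption, and the visit counts are invented for illustration; they are not data from the study.

```python
from statistics import mean
from typing import Sequence


def percent_diff(high_values: Sequence[float], low_values: Sequence[float]) -> float:
    """Percentage by which the high-group mean exceeds the low-group mean.

    Assumption: the low-group mean is the baseline (denominator).
    """
    high_mean = mean(high_values)
    low_mean = mean(low_values)
    return 100.0 * (high_mean - low_mean) / low_mean


# Invented visit counts per participant for one equation (not study data).
high_visits = [12, 10, 14]  # mean 12
low_visits = [9, 11, 10]    # mean 10

print(percent_diff(high_visits, low_visits))  # 20.0
```

The same function applies unchanged to fixation counts or durations, since it only compares two means.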

Moreover, Figure 7 shows the number of fixations on the distinct expression levels for the high vs. the low group. Here, the members of the high group clearly have more fixations on the upper expression levels. The percentage difference (“Fixations-Diff”) is again relatively stable across all levels: it ranges only between 12.95% and 19.08%. Note that the number of fixations by members of the high group is highest for the first equation, lower for the second, and lowest for the third. Again this is different for the low-group members, whose numbers of fixations on the first and second equations are almost identical.

Finally, if we look at the relative visit durations in Figure 8, that is, the time the participants spent on the distinct areas relative to their own total time, we conclude that the high subjects spent only slightly more time on the distinct expressions than the low subjects. This is interesting, as they clearly used more visits and fixations, that is, their eyes jumped a lot more within the text. We take this to mean that they are much more immersed and active in their information perception. We cannot tell whether this activity is the cause or the effect of their higher comprehension; a further experiment will have to answer this. The percentage difference (“Rel. Visit Duration-Diff”) is even more stable across all levels: it ranges only between 10.07% and 14.13%.
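The normalization behind a relative visit duration can be sketched as follows. This is a minimal illustration of dividing the time on each area by a participant's own total time; the function name, the unit (seconds), and the numbers are assumptions and not values from the study.

```python
from typing import Dict


def relative_visit_duration(area_durations: Dict[str, float], total_time: float) -> Dict[str, float]:
    """Share of one participant's total time spent on each area, in percent."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    return {area: 100.0 * duration / total_time
            for area, duration in area_durations.items()}


# Invented durations in seconds for one participant (not study data).
durations = {"Eq-1": 6.0, "Eq-2": 3.0, "Eq-3": 1.5}

print(relative_visit_duration(durations, total_time=60.0))
# {'Eq-1': 10.0, 'Eq-2': 5.0, 'Eq-3': 2.5}
```

Normalizing by each participant's own total time keeps slow and fast readers comparable before the group means are formed.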

We can conclude that the competency of readers has an impact on reading behaviour as well.

Conclusion

In this paper we summarized the results of our eye-tracking experiments with respect to the factors that make the practice of reading math different from reading text. Math literacy, an important first factor, seems to help to differentiate reading math from reading text. A second factor is membership in a community of practice. This membership and its associated practices help to distinguish important from irrelevant information, which in turn causes distinct reading patterns for math. As a third factor we hypothesize that competency changes how mathematical expressions are looked at. Note that CoP membership does not necessarily imply competency, nor vice versa. In the long run we hope to understand how to bridge the gap between reading text and reading mathematics by supporting readers with customized active documents.

Acknowledgement

Many thanks to my colleagues Michael Kohlhase, Michael Fürsich, and Taweechai Ouypornkochagorn, who worked jointly with me on the eye-tracking experiments and their analyses.

References

[KF16] Andrea Kohlhase and Michael Fürsich. “Understanding Mathematical Expressions: An Eye-Tracking Study”. In: Mathematical User Interfaces Workshop. Ed. by Andrea Kohlhase and Paul Libbrecht. July 2016. url: http://ceur-ws.org/Vol-1785/M2.pdf.

[KKF17] Andrea Kohlhase, Michael Kohlhase, and Michael Fürsich. “Visual Structure in Math Expressions”. In: Intelligent Computer Mathematics (CICM) 2017. Conferences on Intelligent Computer Mathematics. Ed. by Herman Geuvers et al. LNAI 10383. Springer, 2017. doi: 10.1007/978-3-319-62075-6. url: http://kwarc.info/kohlhase/papers/cicm17-eyetracking.pdf.

[KKO18] Andrea Kohlhase, Michael Kohlhase, and Taweechai Ouypornkochagorn. “Discourse Phenomena in Math Documents”. In: Intelligent Computer Mathematics (CICM) 2018. Conferences on Intelligent Computer Mathematics. Ed. by Florian Rabe et al. LNAI. Accepted. Springer, 2018. url: http://kwarc.info/kohlhase/papers/cicm18-discourse.pdf.

[Koh17] Andrea Kohlhase. “Domain-Dependant Decoding of Math Expressions”. In: MathUI 2017: The 12th Workshop on Mathematical User Interfaces. Ed. by Andrea Kohlhase and Marco Pollanen. 2017.

[SGM10] Alexander Strahl, Julian Grobe, and Rainer Müller. “Was schreckt bei Formeln ab? - Untersuchung zur Darstellung von Formeln”. In: PhyDid B - Didaktik der Physik - Beiträge zur DPG-Frühjahrstagung 0.0 (2010).