=Paper=
{{Paper
|id=Vol-2068/uistda2
|storemode=property
|title=Reading Type Classification based on Generative Models and Bidirectional Long Short-Term Memory
|pdfUrl=https://ceur-ws.org/Vol-2068/uistda2.pdf
|volume=Vol-2068
|authors=Seyyed Saleh Mozaffari Chanijani,Federico Raue,Saeid Dashti Hassanzadeh,Stefan Agne,Syed Saqib Bukhari,Andreas Dengel
|dblpUrl=https://dblp.org/rec/conf/iui/ChanijaniRHABD18
}}
==Reading Type Classification based on Generative Models and Bidirectional Long Short-Term Memory==
Seyyed Saleh Mozaffari Chanijani (TU Kaiserslautern; German Research Center for Artificial Intelligence, mozafari@dfki.uni-kl.de), Federico Raue (TU Kaiserslautern; German Research Center for Artificial Intelligence, Federico.Raue@dfki.de), Saeid Dashti Hassanzadeh (TU Clausthal, saeid.dashti.hassanzadeh@tu-clausthal.de), Stefan Agne (German Research Center for Artificial Intelligence, Stefan.Agne@dfki.de), Syed Saqib Bukhari (German Research Center for Artificial Intelligence, Saqib.Bukhari@dfki.de), Andreas Dengel (TU Kaiserslautern; German Research Center for Artificial Intelligence, dengel@dfki.uni-kl.de)

ABSTRACT
Measuring the attention of users is necessary to design smart Human Computer Interaction (HCI) systems. Particularly in reading, the reading types, namely reading, skimming, and scanning, are signs that express the degree of attentiveness. Eye movements are informative spatiotemporal data for measuring the quality of reading, and eye tracking technology is the tool for recording them. Even though eye trackers are increasingly used in research, especially in psycholinguistics, collecting appropriate task-specific eye movement data is expensive and time-consuming. Moreover, machine learning tools like Recurrent Neural Networks need sufficiently large samples to be trained. Hence, it is desirable to design a generative model that produces reliable, research-oriented synthetic eye movements. This paper has two main contributions. First, a generative model is developed to synthesize reading, skimming, and scanning patterns. Second, in order to evaluate the generative model, a bidirectional Long Short-Term Memory (BLSTM) is proposed. It was trained with synthetic data and tested with real-world eye movements to classify reading, skimming, and scanning, achieving more than 95% classification accuracy.

Author Keywords
Eye tracking; reading type; classification; synthetic data; generative models; Hierarchical Hidden Markov Models; Gaussian Mixture Models; LSTM; Recurrent Neural Networks; reading; skimming; scanning

ACM Classification Keywords
I.5.4. Computing Methodologies: PATTERN RECOGNITION; Applications

©2018. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. UISTDA '18, March 11, 2018, Tokyo, Japan.

INTRODUCTION
Reading is the ability to extract visual information from the page and comprehend the meaning of the underlying text [16]. Considering attention, as presented in Figure 1, the reading types are divided into three categories: reading, skimming, and scanning. In the eye tracking context, reading is a method of moving the eyes over the text to comprehend its meaning. Skimming is a rapid eye movement over the document with the purpose of getting only the main ideas and a general overview of the document, whereas scanning rapidly covers a lot of content in order to locate a specific fact or piece of information.

Figure 1: The three reading types. A: Reading; the saccades are short and progressive over the text. B: Skimming; the saccades are longer compared to the reading pattern. C: Scanning; compared to skimming, the saccades are less structured.

The fixation progress on words, expressed in character units, must be measured in order to detect the reading type, i.e., to decide to which reading type the observed eye movement patterns belong [4, 12]. This approach applies in cases where the eye tracking accuracy is high enough to provide word-level resolution [16]. Example applications include ScentHighlight [6], which highlights related sentences during reading; the eyeBook [2], where ambient effects are triggered in proximity of the reading position; or QuickSkim [3], where non-content words may be faded out in real time with increasing skimming speed to make reading more efficient.
Due to the noisy nature of the eye tracking apparatus, where the point of gaze cannot be determined exactly, it can be desirable to automatically decide to what extent eye movements resemble a reading type pattern. In this regard, a psycholinguist is able to determine which segments of a scanpath belong to reading, skimming, or scanning, even though the fixations do not match the underlying text.

Biedert et al. [4] proposed a reading-skimming classifier. Although the model classifies reading and skimming patterns, it does not cover scanning patterns. Having scanning patterns is desirable in order to better estimate the degree of attention in reading when designing a proper Human Document Interaction system. In addition, in the domain of information retrieval, it has been shown that acquiring implicit feedback from a reading type detection that includes scanning can significantly improve search accuracy through personalization [5].

Detecting the reading types is a sequence classification problem. The term sequence classification encompasses all tasks where sequences of data are transcribed with sequences of discrete labels [9]. The discrete labels in the reading type classification problem are shown in Figure 1. Long Short-Term Memory (LSTM) [10] is a variation of Recurrent Neural Networks (RNNs) which is suitable for classifying sequential data. However, such networks need sufficiently large samples for training. Unfortunately, task-specific eye tracking datasets are not big enough to be applied to deep networks. Hence, there is a need to synthesize task-specific eye movement patterns in order to deploy deep neural networks.

In this paper, we propose a generative model which synthesizes the reading type patterns. We also designed and evaluated a BLSTM model, trained on both the original dataset and the synthetic dataset.

The paper is structured as follows. We start with presenting an experiment conducted to collect real-world eye movements in reading in order to build a reference for data synthesization and reading type classification. Then, a two-layered Hierarchical Hidden Markov Model (HHMM) for eye movement data synthesization is proposed. Moreover, we present a BLSTM-based sequential model to detect and classify reading, skimming, and scanning; this model builds on the features described in the Features and Annotation section. In the Evaluation and Results section, we evaluate our models and describe our results. This is followed by our conclusion.

REAL-WORLD DATA ACQUISITION
The first step in constructing a system which is able to learn and distinguish the eye movement patterns of reading types is to record real-world eye movement data during reading. The recorded data must comprise all possible state categories: reading, skimming, and scanning. To perform this task, we designed an experiment to record the eye movements of ten participants from the local university. We used this data as a reference to build a Hierarchical Hidden Markov Model (HHMM) for synthetic data generation. Furthermore, this real-world data was partially employed for testing and evaluating the classifiers.
Apparatus
In this study, we deployed a SensoMotoric Instruments iViewX scientific REDn eye tracker operating at 60 Hz (https://www.smivision.com/eye-tracking/product/redn-scientific-eye-tracker/). The tracking error reported by the manufacturer is less than 0.4 degrees, which makes the device appropriate for fixation-based eye movement studies.

Experimental Setup
The experiment was designed so that all three reading types could be obtained. Two articles in plain English were chosen from Wikipedia; they are about two airplane crashes that took place in Colombia (https://en.wikipedia.org/wiki/LaMia_Flight_2933) and Pakistan (http://en.wikipedia.org/wiki/Pakistan_International_Airlines_Flight_661) in 2016. Ten participants from the local university participated in our study. None of them were native English speakers, but they were fluent in English as a second language. In the first phase of the experiment, the participants were requested to write a comprehensive report on the article they had been given to read; hence, they would read the selected article thoroughly. In the second phase, the participants were asked to find specific information in the second article, e.g., how many crew members were on the airplane or what the flight number was. Therefore, most of the eye movement patterns were associated with skimming and scanning. Consequently, all three reading type patterns were recorded during the trials. The trials were recorded in the specialized eye-tracking interface eyeReading [14], which facilitates research in reading psychology and provides a framework for gaze-based Human Document Interaction. Figure 2 shows the top-level architecture of eyeReading.

Figure 2: The top-level architecture of the eyeReading framework. The eyeReading Server and eyeReading Client components are connected via the WebSocket protocol. [14]

FEATURES AND ANNOTATION
The recorded raw gaze information must be processed in order to extract the saccadic features associated with reading. The extracted saccadic features are the length of the saccade, the velocity of the saccadic movement, the fixation duration associated with the saccade, and the angularity of the saccade. In this section, we first describe the feature extraction step and then explain the process of building the ground truth.

Features
On account of the inevitable noise in eye tracking trials, we first applied a virtual median filter to the raw input gaze points E' = e'_1, ..., e'_n to eliminate possible outliers:

    e_i = (med_x(e'_{i-2}, ..., e'_{i+2}), med_y(e'_{i-2}, ..., e'_{i+2}))    (1)

In the second step, the fixations were detected using the dispersion method [11], with 100 ms temporal and 50 px spatial dispersion parameters. A saccade is the movement between two consecutive fixations and is described by the following features:

1. amplitude (ℓ): the distance between two progressive fixations, in virtual character units (vc).

2. angularity (θ): the angle of the saccade with respect to its starting point, indicating the direction of the saccade in the circular domain: −180° ≤ θ ≤ 179°.

3. velocity (ν): the speed of the saccade,

    ν = ℓ / (t_e − t_s)    (2)

where t_s and t_e are the first timestamps of the start fixation and the end fixation of the saccade, in milliseconds.

4. duration (γ): the duration of the start fixation of the saccade, in milliseconds.

Therefore, F(ℓ, θ, ν, γ) are the features selected for the saccades in the collected data.

Figure 3: The feature components of our study. The fixations are shown by yellow circles. A regression is an implicit sign that the reader is having difficulty understanding the material; it is shown by the red gaze path. When processing the fixations to form forward reads, a forward read is stopped when (1) a regression is encountered, (2) a forward saccade is too large and likely a forward skip through the text, or (3) the eye gaze moves to another line of text. The last case is called a sweep return and is indicated with a dashed red line in the figure. [16] [13]
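To make the pipeline concrete, the following is a minimal Python sketch of the steps above: the five-sample median filter of Eq. (1), dispersion-based (I-DT) fixation detection with the 100 ms / 50 px thresholds, and the four saccade features F(ℓ, θ, ν, γ). The function names and the pixels-per-character constant are our own assumptions, not from the paper.

```python
# A minimal sketch of the preprocessing above, assuming gaze samples arrive
# as (t_ms, x_px, y_px) rows; names and px_per_char are ours, not the paper's.
import numpy as np

def median_filter_gaze(points, half_window=2):
    """Eq. (1): replace each gaze point by the per-axis median of its
    5-sample neighborhood to suppress outliers."""
    pts = np.asarray(points, dtype=float)
    out = pts.copy()
    for i in range(half_window, len(pts) - half_window):
        win = pts[i - half_window:i + half_window + 1]
        out[i, 1] = np.median(win[:, 1])  # med_x
        out[i, 2] = np.median(win[:, 2])  # med_y
    return out

def detect_fixations_idt(samples, max_dispersion=50.0, min_duration=100.0):
    """Dispersion-threshold (I-DT) fixation detection with the paper's
    parameters: 50 px spatial and 100 ms temporal thresholds."""
    fixations, i = [], 0
    while i < len(samples):
        j = i
        # Grow the window while its bounding-box dispersion stays under 50 px.
        while j + 1 < len(samples):
            win = samples[i:j + 2, 1:3]
            if np.ptp(win[:, 0]) + np.ptp(win[:, 1]) > max_dispersion:
                break
            j += 1
        if samples[j, 0] - samples[i, 0] >= min_duration:
            seg = samples[i:j + 1]
            # (t_start, t_end, centroid_x, centroid_y)
            fixations.append((seg[0, 0], seg[-1, 0],
                              seg[:, 1].mean(), seg[:, 2].mean()))
            i = j + 1
        else:
            i += 1
    return fixations

def saccade_features(fixations, px_per_char=10.0):
    """Features F(l, theta, v, gamma) between consecutive fixations;
    px_per_char, converting pixels to virtual characters, is an assumption."""
    feats = []
    for (ts0, te0, x0, y0), (ts1, te1, x1, y1) in zip(fixations, fixations[1:]):
        length = np.hypot(x1 - x0, y1 - y0) / px_per_char  # amplitude l (vc)
        theta = np.degrees(np.arctan2(y1 - y0, x1 - x0))   # angularity
        velocity = length / max(ts1 - ts0, 1e-6)           # Eq. (2)
        duration = te0 - ts0                               # start-fixation gamma
        feats.append((length, theta, velocity, duration))
    return feats
```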
Data Annotation
After the feature extraction step, we designed a labeling application to create the ground truth from the data. Figure 4 shows the interface of the labeling application. Ground-truthing the data was accomplished in two steps: saccade labeling and sequence labeling.

Figure 4: The application designed for the two-step annotation. A sequence buffer of window size 5 is presented to the annotator. This facilitates the annotator's decision on the state label as well as the saccade label of the red saccade. The orange circles represent the fixation durations; a bigger circle implies a longer duration.

Saccade Labeling
In the first step of labeling, the expert made a judgment about each saccade with respect to the provided features F(ℓ, θ, ν, γ). The saccades were grouped into six categories, or labels [13]:

• FR: A Forward Read (FR) is a progressive saccade associated with reading. Its amplitude is between 7 and 10 characters [16].

• FS: A Forward Skim (FS) is also a progressive saccade, whose amplitude is larger than that of an FR but not too large.

• LS: Long Saccades (LS) are saccades whose amplitude is larger than the threshold considered for the context. The direction of the saccade does not matter for an LS.

• RG: Regressions (RG) are regressive saccades, usually associated with reading, which are a sign of difficulties in reading. Their amplitude varies, and they must target already-read context.

• SR: Sweeping back to the left on the next line of text is called a Sweep Return (SR).

• SW: Unstructured sweeps over the text to look up information are labeled as Sweeps (SW).

Figure 1 intuitively shows the difference between the six saccade labels.
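For illustration, the taxonomy above can be rendered as a rough rule-based labeler. This sketch is not the authors' method (their labeling was done manually by an expert); apart from the 7–10 vc forward-read band from [16], the thresholds are invented placeholders.

```python
def label_saccade(dx_vc, dy_lines, ls_min=20.0):
    """Heuristic illustration of the six saccade labels; ls_min is invented.
    dx_vc: horizontal amplitude in virtual characters (+ = forward);
    dy_lines: vertical displacement in text lines (+ = downward)."""
    if abs(dy_lines) >= 1:
        # Line change: a leftward move down to the next line is a sweep
        # return; anything else is treated as an unstructured sweep.
        return "SR" if (dx_vc < 0 and dy_lines > 0) else "SW"
    if abs(dx_vc) >= ls_min:
        return "LS"   # long saccade; direction is ignored
    if dx_vc < 0:
        return "RG"   # regression into already-read context
    if dx_vc <= 10.0:
        return "FR"   # forward read: short progressive saccade (~7-10 vc [16])
    return "FS"       # forward skim: progressive, longer than the FR band
```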
Sequence Annotation
After all the saccades had been labeled into the six categories, in the second step the sequences formed by these saccades were annotated as reading, skimming, or scanning. In the end, 396 annotated sequences for reading, 378 for skimming, and 118 for scanning were acquired.

SYNTHETIC EYE MOVEMENTS IN READING
It is always desirable to have enough data samples to construct robust machine learning models. Especially in deep neural networks, a very big training set is usually required. Unfortunately, appropriate data acquisition in eye tracking studies is not an easy task. It needs eye trackers, which are still expensive on the market, as well as an experimental setup designed for a specific goal, here reading type classification. Hence, the idea is to generate task-specific eye movement data from the real-world data, such that the synthetic data can be used to construct a better model and even be used in other applications and research. This motivated us to design a Hierarchical Hidden Markov Model (HHMM) to generate synthetic eye movements in reading.

Ordinary Markov chains are usually not flexible enough for the analysis of real-world data, as the state corresponding to a specific event (observation) has to be known; in many problems of interest, this is not given. Hidden Markov Models (HMMs), as originally proposed by Baum et al. [1], can be viewed as an extension of Markov chains. The only difference compared to common Markov chains is that the state sequence corresponding to a particular observation sequence, i.e., the reading types in our case, is not observable but hidden. In other words, the observation is a probabilistic function of the state, whereas the underlying state sequence itself is a hidden stochastic process [15]. That means the underlying state sequence can only be observed indirectly through another stochastic process that emits an observable output. Hidden Markov models are extremely popular for dealing with sequential data, as in speech recognition, character recognition, and gesture recognition, as well as for biological sequences. Therefore, the HMM is a fitting candidate to handle eye movement patterns, which are sequential and by nature stochastic.

In order to synthesize our data, the graphical model should be able to generate both saccadic sequences and reading state sequences. Therefore, in this paper, a two-layered Hierarchical HMM is designed. In an HHMM, each state is considered to be a self-contained probabilistic model [8]. Briefly, in the first layer, as shown in Figure 5, we modeled reading, skimming, and scanning as the states of the Markov model, and the emissions are FR (Forward Read), FS (Forward Skim), SR (Sweep Return), RG (Regression), SW (Sweeps), and LS (Long Saccades). As shown in Figure 6, each state of the first level is a self-contained mixture graphical model, a so-called GMM-HMM (Hidden Markov Model with Gaussian Mixture Model). This second layer is responsible for generating values for the four saccadic features F(ℓ, θ, ν, γ).

Figure 5: The first layer of the probabilistic model for the reading types data generation. An HMM where the states are our class labels and the emissions are the saccade labels. The output is the sequence of class labels with the corresponding lengths.

Method
The two-layered HHMM constitutes the probabilistic model that generates saccades associated with the reading types. It is a top-down approach to synthesizing natural reading types. The task of the first layer, shown in Figure 5, is to generate the sequence of the states reading, skimming, and scanning; Algorithm 1 constructs this layer. In order to build the first-layer HMM λ = (Π, A, B), the state transition matrix A and the emission matrix B are built from the labeled data explained in the Data Annotation section. We considered equal probabilities (roughly 33%) for the reading type states in Π. The states reading, skimming, and scanning are then generated based on a multinomial distribution.

Algorithm 1: Simulate an HMM state sequence S given the model λ = {Π, A, B}.
Data: ground-truth states s_1, s_2, ..., s_n and ground-truth observations o_1, o_2, ..., o_n, where n is the number of saccades in the ground truth.
Result: state sequence S = ([s_1, l_1], [s_2, l_2], ..., [s_k, l_k]), where k is the number of sequences, s_i ∈ C is the sequence label, and l_i is the length of s_i.
1. Π = (0.34, 0.33, 0.33);
2. A = {a_ij | i = 1, 2, 3; j = 1, 2, 3}: state transition probabilities, where a_ij = P(s_{t+1} = j | s_t = i) and Σ_{j=1..3} a_ij = 1;
3. B = {b_k(o_t) | k = 1, 2, 3; t = 1, ..., n}: observation probabilities, where b_k(o_t) = P(o_t | s_t = k);
4. Choose an initial state s_1 according to the initial state distribution Π;
5. for t = 1, ..., n do
6.     Draw o_t from the probability distribution B_{s_t};
7.     Go to state s_{t+1} according to the transition probabilities A_{s_t};
8.     Set t = t + 1;
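A minimal Python sketch of Algorithm 1 follows. Π matches the near-equal priors above, while the rows of A are placeholders standing in for the matrix that the paper estimates from the annotated sequences.

```python
# A minimal sketch of Algorithm 1; the transition rows of A are placeholders,
# not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
STATES = ["reading", "skimming", "scanning"]
PI = np.array([0.34, 0.33, 0.33])        # near-equal priors, as in the paper
A = np.array([[0.80, 0.15, 0.05],        # placeholder rows; the paper builds
              [0.20, 0.70, 0.10],        # A from the labeled ground truth
              [0.10, 0.20, 0.70]])

def sample_state_sequence(n_saccades):
    """Walk the first-layer HMM, emitting one reading-type state per saccade."""
    s = rng.choice(3, p=PI)
    states = [s]
    for _ in range(n_saccades - 1):
        s = rng.choice(3, p=A[s])        # multinomial transition step
        states.append(s)
    return [STATES[i] for i in states]

def run_lengths(seq):
    """Collapse the walk into ([label, length]) segments, matching the
    Result format of Algorithm 1."""
    out = []
    for s in seq:
        if out and out[-1][0] == s:
            out[-1][1] += 1
        else:
            out.append([s, 1])
    return out

print(run_lengths(sample_state_sequence(20)))
```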
Algorithm 2 presents the construction of the second layer of our graphical model, the so-called GMM-HMM, whose input is the state sequence produced by the first layer. In contrast to the first layer, the emissions (observations) associated with each sequence are generated from a Gaussian distribution. Hence, for each state (reading, skimming, and scanning), the model needs the transition matrix of the observations, the mean of each component (the features F(ℓ, θ, ν, γ)), and the covariance matrix of the features. Figure 6 presents the second layer of the model.

Figure 6: The second layer of the probabilistic model for the reading types data generation. Here, in the GMM-HMM, the states are the saccade labels, and the component (feature) values are calculated through a Gaussian distribution with respect to the covariance matrix and the mean matrix of the feature set F(ℓ, θ, ν, γ).

Algorithm 2: Generate the emission sequence S' from S.
Data: S and ground-truth observations o_1, o_2, ..., o_n, where n is the number of saccades in the ground-truth data.
Result: synthetic emissions for all s ∈ S.
1. Π' = (π'_1, ..., π'_6): initial state probabilities, where π'_i = P(s'_1 = i) and Σ_{i=1..6} π'_i = 1;
2. A' = {a'_ij | i = 1, ..., 6; j = 1, ..., 6}: emission transition probabilities, where a'_ij = P(s'_{t+1} = j | s'_t = i) and Σ_{j=1..6} a'_ij = 1;
3. For each of the states reading, skimming, and scanning, calculate the covariance matrix COV of the emissions FR, FS, RG, LS, SW, SR for the features ℓ, θ, ν, γ;
4. For each of the states reading, skimming, and scanning, calculate the mean matrix M of the emissions FR, FS, RG, LS, SW, SR for the features ℓ, θ, ν, γ;
5. Choose an initial state s'_1 according to the initial state distribution Π';
6. for t = 1, ..., n do
7.     Draw o_t from the Gaussian distribution with covariance cov_{s'_t} and mean mean_{s'_t};
8.     Go to state s'_{t+1} according to the transition probabilities A'_{s'_t};
9.     Set t = t + 1;
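A minimal sketch of Algorithm 2 in the same vein. In the paper, Π', A', and the per-label mean/covariance matrices are estimated per reading type from the annotated saccades; here they are random placeholders so the sketch runs standalone.

```python
# A minimal sketch of the second (GMM-HMM) layer, Algorithm 2; all parameters
# below are placeholders for the statistics estimated in steps 1-4.
import numpy as np

rng = np.random.default_rng(1)
LABELS = ["FR", "FS", "RG", "LS", "SW", "SR"]

def make_placeholder_params():
    """Stand-ins for Pi', A', and the per-label Gaussian parameters."""
    pi = np.full(6, 1 / 6)
    A = rng.dirichlet(np.ones(6), size=6)             # rows sum to 1
    means = {l: rng.normal(size=4) for l in LABELS}   # mean of (l, theta, v, gamma)
    covs = {l: np.eye(4) for l in LABELS}             # covariance of the 4 features
    return pi, A, means, covs

def sample_emissions(n, pi, A, means, covs):
    """Walk the saccade-label chain and draw one 4-d feature vector per step
    from the current label's multivariate Gaussian."""
    s = rng.choice(6, p=pi)
    out = []
    for _ in range(n):
        label = LABELS[s]
        feats = rng.multivariate_normal(means[label], covs[label])
        out.append((label, feats))
        s = rng.choice(6, p=A[s])
    return out

pi, A, means, covs = make_placeholder_params()
for label, f in sample_emissions(5, pi, A, means, covs):
    print(label, np.round(f, 2))
```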
READING TYPE CLASSIFICATION WITH BLSTM
Recurrent neural networks (RNNs) are able to access a wide range of context in sequences [17]. However, standard RNNs make use of the previous context only, whereas bidirectional RNNs (BRNNs) are able to incorporate emissions on both sides of every position in the input sequence [18]. This is useful for reading type detection, since it is often necessary to look both to the right and to the left of a given sequence position in order to identify it. A BLSTM is a BRNN whose hidden layers are made up of so-called Long Short-Term Memory (LSTM) cells. LSTM is an RNN architecture specifically designed to bridge long time delays between relevant input and target events, making it suitable for problems where long-range emission sequences are required to disambiguate individual labels [10]. In fact, BLSTM networks suit the reading type detection task well. Figure 7 shows the BLSTM-RNN architecture.

Figure 7: BLSTM architecture: the forward (resp. backward) layer processes the input sequence in the original (resp. reverse) order. The output layer concatenates the hidden layer values at each timestep to make a decision considering both past and future contexts. [9]

Figure 8 presents the model implemented for our study with keras [7]. It consists of two BLSTMs, one dropout layer to prevent overfitting, and two fully connected (dense) layers, where N is the sequence length of the input. The loss function is categorical cross-entropy, and softmax is used as the activation function of the output layer.

Figure 8: The BLSTM model used in our study. It consists of two BLSTMs, one dropout layer to prevent overfitting, and two fully connected layers. Here, N is the sequence length of the input. The loss function is categorical cross-entropy, and softmax is used as the output activation function.
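A minimal keras sketch of the Figure 8 model, assuming four saccade features per timestep; the hidden-layer sizes, dropout rate, and optimizer are our assumptions, as the paper does not specify them.

```python
# A minimal sketch of the Figure 8 classifier; layer widths, dropout rate,
# and optimizer are assumptions, not values reported in the paper.
from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, Dense, Dropout

N = 10          # input sequence length (number of saccades), as in Table 1
N_FEATURES = 4  # the saccade features F(l, theta, v, gamma)

model = Sequential([
    Bidirectional(LSTM(64, return_sequences=True),
                  input_shape=(N, N_FEATURES)),  # first BLSTM
    Bidirectional(LSTM(64)),                     # second BLSTM -> one vector
    Dropout(0.5),                                # dropout against overfitting
    Dense(32, activation="relu"),                # first fully connected layer
    Dense(3, activation="softmax"),              # reading/skimming/scanning
])
model.compile(loss="categorical_crossentropy",
              optimizer="adam", metrics=["accuracy"])
model.summary()
```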
EVALUATION AND RESULTS
In this section, the evaluation of both the generative model and the classifier is presented. Two types of data are available: the original data recorded with the eye tracker (actual data) and the synthetic data. The actual data was randomly split into a training set (60%), a validation set (10%), and a test set (30%). The first half of the actual data was used in the generative model for data synthesization; the other half was used for testing. In all cases, the training set was first fitted with a standard scaling function to scale the mean (μ) to 0 and the standard deviation (σ) to 1; the validation and test sets were then transformed with respect to the fitted training data.

Baseline: Actual data and SVM-RBF classifier
The baseline is to test and evaluate the SVM-RBF classifier proposed by Biedert et al. [4]. For the test set, there are only 196 sequences to support the model: 75 for reading, 83 for skimming, and 38 for scanning. With 5-fold cross-validation, the best accuracy acquired was 69%, with parameters C = 1000 and gamma = 0.001. A closer look at the confusion matrix in Figure 9 makes obvious an unacceptable confusion in the scanning class. This problem is due to the small number of supports for the scanning class, which comprises just 19% of the class labels. Another reason is the sequential characteristics of the data, which show that Support Vector Machines are not the machine learning model best tailored for such data.

Figure 9: Confusion matrices for the two classifiers. The left confusion matrix (SVM-RBF baseline) shows high confusion between scanning and skimming, whereas the BLSTM confusion matrix (synthetic data of size 10^5 with sequence length N = 5) shows the robustness of the model.

Proposed method: Actual data and BLSTM
The model presented in Figure 8 was used to train and test on the original data. The model was trained with different sequence lengths N = 5, 8, 10. In case an input sequence had a different length, the sequence was padded or truncated to the fixed length N. Table 1 shows the accuracies for the different lengths; 92.5% accuracy was achieved for a sequence length of 10.

Table 1: The results of the BLSTM model on the original dataset. The model was trained with different sequence lengths N = 5, 8, 10. The 196 testing sequences are distributed as 75 for reading, 83 for skimming, and 38 for scanning.

    N     Precision    Recall    Accuracy
    5     0.81         0.79      0.784
    8     0.89         0.90      0.896
    10    0.93         0.93      0.925

Proposed method: Synthetic data and BLSTM
Finally, the BLSTM model was trained with synthetic data generated from one half of the original dataset. The same half of the original dataset was also used to validate the model, and the second half was used for testing. Table 2 presents the results for different variations of data size and sequence length. While larger sequence windows yield better results for the model, an instant user interface favors smaller sequences. The length of a sequence is related to the number of fixations: if we take the average fixation duration in reading to be 250 ms [16], the model must wait 2.5 s for an input sequence of length 10. These results support the reliability of the synthetic data; the larger the data, the better the performance, as expected in deep network frameworks.

Table 2: The results of the BLSTM model trained with synthesized data. The results support the reliability of the synthetic data: the larger the data, the better the performance. Larger sequences show better results, but a real-time classifier favors shorter sequences.

    Synthetic Data Size    N     Precision    Recall    Accuracy
    5K                     5     0.87         0.86      0.86
    5K                     8     0.92         0.91      0.91
    5K                     10    0.93         0.93      0.93
    10K                    5     0.88         0.87      0.87
    10K                    8     0.92         0.92      0.92
    10K                    10    0.94         0.93      0.94
    50K                    5     0.88         0.92      0.92
    50K                    8     0.95         0.94      0.94
    50K                    10    0.96         0.95      0.95
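A minimal sketch of the preprocessing described at the start of this section: standard scaling fitted on the training split only, plus padding/truncation to the fixed sequence length N. The helper names are ours.

```python
# A minimal sketch of the split-and-scale protocol above; each sequence is a
# (length, 4) array of saccade features, and the helper names are assumptions.
import numpy as np
from sklearn.preprocessing import StandardScaler
from keras.preprocessing.sequence import pad_sequences

N = 10  # fixed sequence length, as in the best-performing rows of Tables 1-2

def scale_splits(train_seqs, val_seqs, test_seqs):
    """Fit mu=0, sigma=1 scaling on the flattened training saccades and
    apply the same transform to the validation and test splits."""
    scaler = StandardScaler().fit(np.vstack(train_seqs))
    transform = lambda seqs: [scaler.transform(s) for s in seqs]
    return transform(train_seqs), transform(val_seqs), transform(test_seqs)

def to_fixed_length(seqs, n=N):
    """Pad shorter sequences with zeros and truncate longer ones to n,
    so every sample has shape (n, 4) for the BLSTM input."""
    return pad_sequences(seqs, maxlen=n, dtype="float32",
                         padding="post", truncating="post")
```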
CONCLUSION
Recurrent Neural Networks (RNNs) are suitable for modeling spatiotemporal data, i.e., eye movements in reading, but they usually need sufficiently large data samples to be trained. On account of constraints in experimental setups, the limited accessibility of the expensive eye tracking apparatus, and the difficulty of finding appropriate and enough participants, there is usually a lack of data in eye tracking research for employing RNNs. In this paper, a novel probabilistic approach for eye movement data synthesization in reading was proposed, along with a BLSTM model, trained on both the original recorded data and the synthetic data, to classify the reading types reading, skimming, and scanning. The RNN-based classifier proposed in this paper achieved more than 95% accuracy in reading type detection, which not only outperforms previous works but also covers the scanning reading type.

One important note concerns the strategy for selecting the sequence length N for the model. Even though a longer sequence length leads to higher accuracy, a shorter length is more desirable for designing an instant user interface, as the model has to wait for N saccades before classifying. Depending on the application, the sequence length can be selected accordingly.

The outcome of this research is promising in that appropriate data synthesization breaks the limitations on using RNNs in eye tracking research in general. It may also offer the possibility of providing standard eye movement datasets, not only for reading type detection but also for other research aspects of reading, e.g., research on dyslexia. Moreover, it gives insight to other eye-tracking research for generating eye movement transitions between different areas of interest, which is very helpful for distinguishing experts from novices in several domains of education. It is also desirable to explore alternatives to HMMs for data synthesization; in this regard, using an LSTM itself as a generative model for eye tracking data is on our agenda for future work.

Acknowledgment
This work was funded by the Federal Ministry of Education and Research (BMBF) for the project AICASys.

REFERENCES
1. Leonard E. Baum, Ted Petrie, George Soules, and Norman Weiss. 1970. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics 41, 1 (1970), 164–171.
2. Ralf Biedert, Georg Buscher, and Andreas Dengel. 2010a. The eyeBook – using eye tracking to enhance the reading experience. Informatik-Spektrum 33, 3 (2010), 272–281.
3. Ralf Biedert, Georg Buscher, Sven Schwarz, Jörn Hees, and Andreas Dengel. 2010b. Text 2.0. In CHI '10 Extended Abstracts on Human Factors in Computing Systems. ACM, 4003–4008.
4. Ralf Biedert, Jörn Hees, Andreas Dengel, and Georg Buscher. 2012. A robust realtime reading-skimming classifier. In Proceedings of the Symposium on Eye Tracking Research and Applications. ACM, 123–130.
5. Georg Buscher, Andreas Dengel, Ralf Biedert, and Ludger V. Elst. 2012. Attentive documents: Eye tracking as implicit feedback for information retrieval and beyond. ACM Transactions on Interactive Intelligent Systems (TiiS) 1, 2 (2012), 9.
6. Ed H. Chi, Lichan Hong, Michelle Gumbrecht, and Stuart K. Card. 2005. ScentHighlights: highlighting conceptually-related sentences during reading. In Proceedings of the 10th International Conference on Intelligent User Interfaces. ACM, 272–274.
7. François Chollet. 2015. keras. https://github.com/fchollet/keras. (2015).
8. Shai Fine, Yoram Singer, and Naftali Tishby. 1998. The hierarchical hidden Markov model: Analysis and applications. Machine Learning 32, 1 (1998), 41–62.
9. Alex Graves and others. 2012. Supervised Sequence Labelling with Recurrent Neural Networks. Vol. 385. Springer.
10. Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
11. Kenneth Holmqvist, Marcus Nyström, Richard Andersson, Richard Dewhurst, Halszka Jarodzka, and Joost Van de Weijer. 2011. Eye Tracking: A Comprehensive Guide to Methods and Measures. OUP Oxford.
12. Aulikki Hyrskykari. 2006. Eyes in Attentive Interfaces: Experiences from Creating iDict, a Gaze-Aware Reading Aid. Tampereen yliopisto.
13. Seyyed Saleh Mozaffari Chanijani, Mohammad Al-Naser, Syed Saqib Bukhari, Damian Borth, Shanley E. M. Allen, and Andreas Dengel. 2016a. An eye movement study on scientific papers using wearable eye tracking technology. In Mobile Computing and Ubiquitous Networking (ICMU), 2016 Ninth International Conference on. IEEE.
14. Seyyed Saleh Mozaffari Chanijani, Syed Saqib Bukhari, and Andreas Dengel. 2016b. eyeReading: Interaction with Text through Eyes. In Proceedings of the Ninth International Conference on Mobile Computing and Ubiquitous Networking. 1–2.
15. Lawrence R. Rabiner. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 2 (1989), 257–286.
16. Keith Rayner, Alexander Pollatsek, Jane Ashby, and Charles Clifton Jr. 2012. Psychology of Reading. Psychology Press.
17. Raúl Rojas. 2013. Neural Networks: A Systematic Introduction. Springer Science & Business Media.
18. Mike Schuster and Kuldip K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45, 11 (1997), 2673–2681.