=Paper=
{{Paper
|id=Vol-1543/p11
|storemode=property
|title=A Prototype Gaze-Controlled Speller for Text Entry
|pdfUrl=https://ceur-ws.org/Vol-1543/p11.pdf
|volume=Vol-1543
|authors=Mindaugas Vasiljevas,Justas Šalkevičius,Tadas Gedminas,Robertas Damaševičius
|dblpUrl=https://dblp.org/rec/conf/system/VasiljevasSGD15
}}
==A Prototype Gaze-Controlled Speller for Text Entry==
Mindaugas Vasiljevas, Justas Šalkevičius, Tadas Gedminas, Robertas Damaševičius
Software Engineering Department, Kaunas University of Technology, Kaunas, Lithuania
robertas.damasevicius@ktu.lt

Copyright © 2016 held by the authors

Abstract—Eye typing provides a means of communication that is useful for people with disabilities that prevent using their hands for text entry. In this paper, we describe the development of a prototype gaze-controlled speller and discuss its experimental evaluation. Using a scrollable virtual keyboard interface, a text input speed of 1.2 wpm was achieved.

Keywords—gaze tracking; gaze writing; eye typing; hands-free text entry; speller; assistive technology.

I. INTRODUCTION

Communication is central to human life and experience. With the rise of electronic means of communication and internet-based social networks, as well as the widespread use of smartphones and tablet PCs, the role of texting has increased significantly. According to one report [1], different types of text-based communication (text messaging, e-mail) are the preferred modes of communication for young people (text messaging – 54%, email – 11%, vs., for example, cell-phone calls – 38%, face-to-face talk – 33%).

To most people, text entry is a simple action. However, over a billion people are estimated to be living with some form of disability or impairment [2]. To those suffering from physical disabilities or age-related impairments, the text entry task may present a significant challenge. For example, disabilities such as amyotrophic lateral sclerosis (ALS) often lead to a complete loss of control over voluntary muscles, except the eye muscles. Today's computer systems are not suitable for such people, as input to computers is still largely limited to mechanical (keyboard, mouse), audio (speech) and tactile (touchpad) modalities. The inability to use a conventional physical input device, such as the mouse or the keyboard, raises the importance of other input modalities, such as the eyes, for connecting persons with severe motor impairments to the digital world.

The design of the hardware and software that enables access of handicapped people to ICT services often fails to take the user needs into account [3]. Such limitations raise barriers that prevent people with major or minor disabilities, such as elderly people with motor impairments, from benefiting from modern ICT applications. Therefore, a large number of individuals are at risk of becoming excluded from the information and knowledge society [4].

To overcome these barriers, new concepts and methods of human-computer interaction (HCI) must be researched and developed in order to efficiently and effectively address the accessibility problems in human interaction with software applications and services while meeting the individual requirements of users. Eye typing has been defined as the production of text using the focus of the eye (aka gaze) as a means of input [5]. Eye typing has been known for 30 years now [6], but it has recently received increased attention from researchers with the arrival of affordable eye tracking devices on the market.

Systems using gaze-controlled eye typing may be called gaze spellers (by analogy to brain-controlled BCI spellers [7]). A gaze speller is a kind of assistive technology [8], specifically designed to increase or maintain the capabilities of people with disabilities, and can be used in ambient assisted living (AAL) environments [9] for people with special needs. It has the general aim of bridging the digital divide and providing universal access [10] to anyone.

The current research is important because existing eye typing systems still have many limitations (low entry speed, poor usability, etc.), and even small improvements in the design of such systems can lead to a significantly improved quality of life for impaired people.

The structure of the remaining parts of the paper is as follows. Section 2 discusses the related work. Section 3 describes the developed prototype of the gaze-controlled speller. Section 4 describes the experimental results. Finally, Section 5 presents conclusions and discusses future work.

II. RELATED WORK

Several different eye typing systems have been described in research papers. These systems mainly differ in their approach to the presentation and layout of letters in the user interface. A typical example is an on-screen keyboard. The user has two options for action: looking at the desired letter or key to select it, and dwelling (i.e., pausing eye movements for a moment) on it for input. Known examples of such systems are GazeTalk [11], ERICA [12], pEYEwrite [13], and the scrollable keyboards of Špakov et al. [14].

GazeTalk [11] uses a probabilistic character layout strategy to show only the 6 most likely next characters on-screen, while the 6 most likely next words, predicted from the previous words in the sentence, are also shown. Users have achieved text entry speeds from 4 words per minute (wpm) for character-only input up to 10 wpm using the most-likely-words feature.

In ERICA [12], six large on-screen keys were used instead of an entire keyboard due to the limited resolution of the eye tracker. A prediction algorithm allowed eye-typewriting time to be decreased by 25%.
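The probabilistic layout idea behind GazeTalk can be illustrated with a simple character bigram model that ranks likely next characters. This is a sketch of the general technique only, not the GazeTalk implementation; the toy corpus and function names are assumptions, and a real system would train on a large corpus or the user's own text.

```python
from collections import Counter, defaultdict

def build_bigram_model(corpus):
    """For each character, count which characters tend to follow it."""
    followers = defaultdict(Counter)
    for cur, nxt in zip(corpus, corpus[1:]):
        followers[cur][nxt] += 1
    return followers

def most_likely_next(model, last_char, n=6):
    """The n most probable next characters -- the keys a GazeTalk-style
    interface would show on screen after `last_char` is typed."""
    return [c for c, _ in model[last_char].most_common(n)]

# Toy corpus for illustration only.
model = build_bigram_model("the quick brown fox jumps over the lazy dog")
```

With this toy corpus, `most_likely_next(model, 't')` returns only `'h'`, since every `t` is followed by `h`; with a realistic corpus the function would fill all six on-screen slots.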
pEYEwrite [13] groups the letters into a hierarchical pie structure. To enter a letter, the user first dwells on the pie slice containing the desired group of letters, then dwells on the desired slice in a popup pie menu. Novice entry rates of 7.9 wpm were reported with a dwell time of 400 ms.

Špakov et al. [14] use "scrollable keyboards", in which one or more rows are hidden to save space, combined with keyboard layout optimization according to letter-to-letter probabilities. Users achieved speeds of 8.86 wpm for the 1-row keyboard and 12.18 wpm for the 2-row keyboard, respectively.

Other related work includes various kinds of text entry systems using a virtual keyboard interface. Methods employed in these systems for increasing input speed can be directly transferred to the gaze speller domain, e.g., the predictive keyboard layouts of SoftType [15].

AUK [16] uses a 12-key soft keyboard similar to those used in mobile phones and supports several different entry modes (1-to-10 key, joystick), various layout configurations for different performance levels, and integration with additional performance-enhancing techniques, such as text prediction and dictionary- or prefix-based word disambiguation.

Alternative interfaces for gaze typing include Dasher [17], which allows users to write by zooming through a world of boxes. Each box represents an individual letter, and the size of a box is proportional to the probability of that letter given the preceding letters. The entry rates for Dasher range between 16–26 wpm [17].

The dwell-free eye-typing interface [18] lets users simply look at or near their desired letters without stopping to dwell on each letter. Users reached a mean entry rate of 46 wpm with a perfect letter recognizer. While dwell-free eye typing may be more than twice as fast as traditional eye typing systems, a working prototype that deals effectively with entry errors has yet to be implemented.

SMOOVS [19] utilizes smooth-pursuit eye movements combined with a two-stage interface that uses a hexagonal layout of letters. The system achieved a speed of 4.5 wpm, while users complained about the low comprehensibility of the interface.

Word/phrase prediction or completion is also widely used [20]. As a word is entered, the stem of the current word is expanded to form a list of matching complete words. The list is displayed in a dedicated region of the user interface, allowing the user to select the word early. An example is Filteryedping [21], a key filtering-based approach for supporting dwell-free eye typing that recognizes the intended word by looking up, in a word list, the possible words that can be formed after discarding none or some of the letters the user has looked at. It sorts the candidate words by length and frequency and presents them to the user for confirmation. The method achieved a rate of 19.8 wpm.

III. DEVELOPMENT OF GAZE SPELLER

A. Usage scenario

Eye-tracking interfaces are usually designed to imitate the operation of a standard pointing device such as a mouse. The gaze tracking system, either head mounted or placed in front of the user, tracks the user's gaze and transforms it into screen coordinates.

During eye typing, the user first locates a letter on a virtual keyboard by moving his/her gaze to it. The gaze tracking device follows the user's point of gaze while software records and analyses the gaze behavior. For input, the user has to fix his/her gaze on the letter for a pre-defined time interval (aka dwell time). When the dwell time has passed, the letter is selected by the system and the user can move his/her gaze on to the next letter. Feedback is typically shown both on focus and on selection.

B. Advantages and disadvantages

As an input method, gaze has both advantages and disadvantages. It is easy to focus on items by looking at them, and target acquisition using gaze is very fast, given that the targets are sufficiently large [22]. However, gaze is not as accurate as the mouse, partly for technological reasons and partly due to features of the eye itself [23]. The size of the fovea and the inability of the camera to resolve the fovea position restrict the accuracy of the measured point of gaze [5].

C. Technical limitations

When humans look at things, they fix their gaze on them for 200 to 600 ms [23]. For a computer to distinguish whether the user is looking at an object, a longer interval of time is needed. Usually, 1000 ms is long enough to prevent false selections [22]. While requiring the user to fixate for long intervals prevents false selections, it may be uncomfortable for most users.

The dwell time also places an upper limit on eye typing speed: e.g., if the dwell time is 1,000 ms, the upper limit for typing speed is 12 words per minute (wpm) (considering that 1 word is equal to about 5 characters for English text).
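The dwell-based selection mechanism described above can be sketched as a small loop over timestamped gaze samples. This is a minimal illustration of the general technique, not the prototype's actual code; the sample format and the 1,000 ms threshold are assumptions, the latter matching the limit discussed in the text (1 character per second caps dwell typing at 60/5 = 12 wpm).

```python
DWELL_MS = 1000  # dwell threshold; 1 char/s caps speed at 60/5 = 12 wpm

def dwell_select(samples, dwell_ms=DWELL_MS):
    """samples: iterable of (timestamp_ms, key_under_gaze) pairs.
    A key is typed once the gaze has rested on it for dwell_ms;
    the gaze must then leave the key before it can repeat."""
    typed = []
    current, dwell_start = None, None
    for t, key in samples:
        if key != current:
            current, dwell_start = key, t   # gaze moved to a new key
        elif dwell_start is not None and t - dwell_start >= dwell_ms:
            typed.append(key)               # dwell time reached: select
            dwell_start = None              # disarm until gaze leaves
    return typed
```

Disarming after each selection is one simple way to prevent a held gaze from typing the same letter repeatedly; real systems may instead restart the dwell timer with feedback.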
D. Accessibility/usability requirements and limitations

Accessibility limitations of eye gaze tracking systems have been formulated by Hansen et al. [11] as follows:

1. A large portion of users is not able to get a sufficiently good calibration due to false reflections from glasses, occlusion by eyelids or eyelashes, interference from ambient light, or low contrast between the iris and the pupil.

2. Gaze tracking systems usually require that the user does not move. This is very difficult for most people and impossible for people with involuntary, e.g. spastic, movements.

4. People's eyes tend to dry out due to eye fatigue and long exposure to strong light.

5. Present eye-tracking systems are only for stationary and indoor use.

The requirements for interfaces for impaired users are [24]:

1) Limited access to details: complex and vital details of the system have to be hidden to avoid overwhelming and trapping the user.

2) Self-learning: common patterns detected in the behavior of the user should be used to automatically create rules or shortcuts that speed up and ease the use of the system.

3) System interruption: impaired users in most cases have no idea how the system works; therefore, easy cancellation of the system's activities must be ensured.

According to Lopes [25], a user interface for persons with disabilities must: support variability, providing the means to adapt to user-specific requirements; support a wide range of input devices and output modes; provide a minimal user interface design; promote interaction and retain user attention on the tasks; and provide strong feedback mechanisms that may provide rewarding schemes for correct results.

E. Architecture

The architecture of the developed prototype gaze speller system is quite simple (see Fig. 1). It consists of the gaze tracking device (Eye Tribe), which communicates with a PC via a USB 3.0 connection. On the PC, the core modules are responsible for the calibration procedure and gaze feedback.

Fig. 1. Architecture of the gaze speller prototype system
F. Interface

The primary driving motive when designing a user interface for a gaze speller is usability, as a good user experience also enhances user acceptance of the system. Our interface was inspired by Špakov et al. [14] and is based on the concept of a "scrollable keyboard" (see Fig. 2).

The current implementation uses a standard QWERTY layout mapped to a single scrollable line of letters. Feedback is ensured by a black line that always stays at the center of the screen, while the one-line keyboard moves underneath it depending on the horizontal position of the gaze. Letter selection for input is performed by eye dwelling. Additional menu buttons are provided for calibration, connection to the gaze tracking device, loading of alternative keyboard layouts, and setting program options. A layout editor has been implemented for designing other keyboard layouts.

Finally, the operation of the system can be imitated using a mouse if a gaze tracking device is disconnected.

Fig. 2. Interface of the developed gaze speller
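The scrolling behavior, with a fixed center marker and the one-line keyboard sliding underneath according to the horizontal gaze position, can be sketched as below. All constants (key width, dead zone, scroll speed) are invented for illustration; the paper does not specify them.

```python
LAYOUT = "QWERTYUIOPASDFGHJKLZXCVBNM"
SCREEN_W = 1920          # px, matching the experimental display
CENTER = SCREEN_W / 2
KEY_W = 60               # assumed key width, px
DEAD_ZONE = 100          # px around the center marker: no scrolling
MAX_SPEED = 400          # px/s when the gaze is at the screen edge

def scroll_velocity(gaze_x):
    """Keyboard scroll speed: looking right of center slides the
    keyboard left (bringing righthand letters under the marker),
    faster the farther the gaze is from the center."""
    offset = gaze_x - CENTER
    if abs(offset) < DEAD_ZONE:
        return 0.0
    return -MAX_SPEED * offset / CENTER

def focused_key(scroll_px):
    """Letter currently under the fixed center marker."""
    idx = int((CENTER - scroll_px) // KEY_W) % len(LAYOUT)
    return LAYOUT[idx]
```

The dead zone lets the user read the focused key without the keyboard drifting, a common design choice in gaze-scrolled interfaces.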
IV. EXPERIMENTS

A. Apparatus

The eyeTribe eye tracker (tracking range 45 cm – 75 cm, tracking area 40 cm x 30 cm at 65 cm distance) was connected via USB 3.0 to an HP Ultrabook notebook running 64-bit Microsoft Windows 8 with an Intel Core i5-4202Y 1.60 GHz CPU and 4 GB RAM. The application was displayed on a 14" LCD display with LED backlight and a screen resolution of 1920x1080 (see Fig. 3).

Fig. 3. Deployment of the eye tracking system.

B. Procedure

Prior to collecting data, the experimenter explained the task and demonstrated the software. The experiment was carried out with one disabled person, who could not control his legs and whose hand movements are limited. The participant was instructed on the method of text entry, early word selection, error correction, and the audio feedback. He was instructed to enter the given phrases as quickly and accurately as possible, and to make corrections only if an error was detected in the current or previous word. The participant was allowed to enter a few trial phrases to become familiar with the gaze-controlled selection and correction methods.

For the experiment, we used a fragment of the well-known novel "Alice in Wonderland" by Lewis Carroll (Charles Lutwidge Dodgson):

"The rabbit-hole went straight on like a tunnel for some way, and then dipped suddenly down, so suddenly that Alice had not a moment to think about stopping herself before she found herself falling down a very deep well."

The text consists of 219 characters (including spaces). A volunteer participant, who had no prior experience using an eye tracker, was recruited to enter the text.

C. Performance metrics

Typing speed is measured in wpm, where a word is any sequence of five characters, including letters, spaces, punctuation, etc. When measuring accuracy, both corrected errors and errors left in the final text are taken into account. Keystrokes per character (KSPC) measures the average number of keystrokes used to enter each character of text; it is an accuracy measure reflecting the overhead incurred in correcting mistakes. The error rate is calculated by comparing the text written by the participant with the presented text.
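These metrics can be computed as in the sketch below. The paper gives only the definitions, so the exact formulas are illustrative assumptions; in particular, the per-position error comparison is a simplification of the minimum-string-distance error rates usually used in text entry studies.

```python
def wpm(chars, seconds):
    """Words per minute, with a 'word' defined as 5 characters."""
    return (chars / 5.0) / (seconds / 60.0)

def kspc(keystrokes, final_chars):
    """Keystrokes per character: values above 1.0 reflect correction overhead."""
    return keystrokes / final_chars

def error_rate(presented, transcribed):
    """Simplified per-position error rate against the presented text."""
    n = max(len(presented), len(transcribed))
    errs = sum(a != b for a, b in zip(presented, transcribed))
    return (errs + abs(len(presented) - len(transcribed))) / n

# At the reported 1.2 wpm, the 219-character phrase takes roughly
# 219 / 5 / 1.2 = 36.5 minutes to enter.
```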
D. Results

The mean typing speed achieved was 1.2 wpm. This is quite typical for traditional dwell-based eye typing, but is still too slow for fluent text entry. However, the experiment showed that the participant improved with practice over the four blocks of input. The error rate is quite low overall, as the participant generally chose to correct errors during text entry.

TABLE I. RESULTS OF EXPERIMENT

Speed, wpm | KSPC | Error rate
1.2 | 1.44 | 0.01

E. Evaluation

We can compare the input speed of the developed gaze speller with other text typing systems using both traditional and alternative input methods and modalities. Average computer users achieve a text entry speed of 33 wpm [26] using a standard PC and keyboard. An average user of the "T9 input method" on a 12-key mobile phone keypad can produce up to 10 wpm [27]. The speed achieved using Brain Computer Interface (BCI) or Neural Computer Interface (NCI) spellers with electroencephalogram (EEG) / electromyogram (EMG) data as input is in the range of 0.2–2.55 wpm, while the eye-blink based EMG speller developed by the authors of this paper achieved 2.4 wpm [28, 29]. Other gaze tracking based text entry spellers report up to 12 wpm for dwell-based interfaces [14] and 20 wpm for dwell-free interfaces [21].

The prototype gaze speller described in this paper is still in an early stage of development, and its performance is in the lower range of similar systems. However, there is still much room for improvement.

V. CONCLUSION AND FUTURE WORK

This paper has presented a new hands-free text entry system using gaze as the only source of input. The gaze speller is designed to assist severely motor-impaired individuals who are unable to create motion input but are able to voluntarily control their eyes.

Further research is needed to perform more extensive experiments with a large group of participants (both healthy and impaired), to analyze more efficient letter layouts based on letter frequency and letter/word prediction, to implement adaptive control of dwell time, to evaluate the usability of the user interface using common usability evaluation procedures such as NASA-TLX [30], to assess user learnability vs. fatigue with the gaze speller in prolonged sessions, and, possibly, to integrate several different input modalities (e.g., also using EMG signals) for text entry tasks.

ACKNOWLEDGMENT

The authors would like to acknowledge the contribution of the COST Action IC1303 – Architectures, Algorithms and Platforms for Enhanced Living Environments (AAPELE).
REFERENCES

[1] Interpersonal Communication: A First Look. SAGE. https://us.sagepub.com/sites/default/files/upm-binaries/52575_Gamble_(IC)_Chapter_1.pdf
[2] M. Dawe, "Desperately Seeking Simplicity: How Young Adults with Cognitive Disabilities and Their Families Adopt Assistive Technologies," CHI '06: Proc. of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1143-1152, 2006.
[3] M.C. Domingo, "Review: An overview of the Internet of Things for people with disabilities," J. Netw. Comput. Appl. 35(2), pp. 584-596, 2012.
[4] P. Gregor and A. Dickinson, "Cognitive difficulties and access to information systems: an interaction design perspective," Universal Access in the Information Society, Vol. 5, pp. 393-400, 2006.
[5] P. Majaranta, I. Scott MacKenzie, A. Aula, and K.-J. Räihä, "Effects of feedback and dwell time on eye typing speed and accuracy," Universal Access in the Information Society 5(2), pp. 199-208, 2006.
[6] P. Majaranta and K.-J. Räihä, "Twenty years of eye typing: systems and design issues," Proc. of the ACM Symposium on Eye Tracking Research and Applications—ETRA 2002, pp. 15-22. ACM, New York, 2002.
[7] I. Scott MacKenzie and K. Tanaka-Ishii, "Text Entry Systems: Mobility, Accessibility, Universality," Morgan Kaufmann Publishers Inc., 2007.
[8] A. Gillespie, C. Best, and B. O'Neill, "Cognitive function and Assistive Technology for Cognition: A Systematic Review," Journal of the International Neuropsychological Society, 18, pp. 1-19, 2012.
[9] A. Dohr, R. Modre-Opsrian, M. Drobics, D. Hayn, and G. Schreier, "The Internet of Things for Ambient Assisted Living," 7th Int. Conf. on Information Technology: New Generations (ITNG), pp. 804-809, 2010.
[10] C. Stephanidis, G. Salvendy, D. Akoumianakis, A. Arnold, N. Bevan, D. Dardailler, P.L. Emiliani, I. Iakovidis, P. Jenkins, A. Karshmer, P. Korn, A. Marcus, H. Murphy, C. Oppermann, C. Stary, H. Tamura, M. Tscheligi, H. Ueda, G. Weber, and J. Ziegler, "Toward an Information Society for All: HCI challenges and R&D recommendations," Int. Journal of Human-Computer Interaction 11(1), pp. 1-28, 1998.
[11] J.P. Hansen, D.W. Hansen, and A.S. Johansen, "Bringing gaze based interaction back to basics," Proc. of HCI International 2001, pp. 325-328. Erlbaum, Mahwah, NJ, 2001.
[12] M. Ashmore, A.T. Duchowski, and G. Shoemaker, "Efficient eye pointing with a fisheye lens," Proc. of Graphics Interface 2005, pp. 203-210, 2005.
[13] A. Huckauf and M.H. Urbina, "Gazing with pEYE: new concepts in eye typing," ACM SIGGRAPH Symposium on Applied Perception in Graphics and Visualization, APGV 2007, p. 141, 2007.
[14] O. Špakov and P. Majaranta, "Scrollable Keyboards for Casual Eye Typing," PsychNology Journal 7(2), Special issue on "Gaze Control for Work and Play", pp. 159-173, 2009.
[15] P.E. Jones, "Virtual keyboard with scanning and augmented by prediction," Proc. of the 2nd European Conference on Disability, Virtual Reality and Associated Technologies, pp. 45-51, 1998.
[16] A. Mourouzis, E. Boutsakis, S. Ntoa, M. Antona, and C. Stephanidis, "An accessible and usable soft keyboard," Proc. of HCI International 2007, pp. 961-970. Berlin: Springer, 2007.
[17] O. Tuisku, P. Majaranta, P. Isokoski, and K.-J. Räihä, "Now Dasher! Dash away! Longitudinal study of fast text entry by eye gaze," Proc. of the 5th ACM Symposium on Eye-Tracking Research & Applications, ACM Press, pp. 19-26, 2008.
[18] P.O. Kristensson and K. Vertanen, "The potential of dwell-free eye-typing for fast assistive gaze communication," Proc. of the Symposium on Eye Tracking Research and Applications (ETRA '12), ACM, pp. 241-244, 2012.
[19] O. Lutz, A. Venjakob, and S. Ruff, "SMOOVS: Towards calibration-free text entry by gaze using smooth pursuit movements," Journal of Eye Movement Research 8(1):2, pp. 1-11, 2015.
[20] J. Miro and P.A. Bernabeu, "Text entry system based on a minimal scan matrix for severely physically handicapped people," Proc. of the 11th Conference on Computers Helping People with Special Needs—ICCHP 2008, pp. 1216-1219. Springer, Berlin, 2008.
[21] D. Pedrosa, M. da Graça Pimentel, and K.N. Truong, "Filteryedping: A Dwell-Free Eye Typing Technique," Proc. of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA '15), ACM, pp. 303-306, 2015.
[22] C. Ware and H.H. Mikaelian, "An evaluation of an eye tracker as a device for computer input," Proc. of CHI/GI '87, ACM Press, pp. 183-188, 1987.
[23] R.J.K. Jacob, "Eye tracking in advanced interface design," in W. Barfield and T.A. Furness (eds.), Virtual Environments and Advanced Interface Design, Oxford University Press, pp. 258-288, 1995.
[24] A. Marinc, C. Stocklöw, A. Braun, C. Limberger, C. Hofmann, and A. Kuijper, "Interactive personalization of ambient assisted living environments," Proc. of the 2011 Int. Conf. on Human Interface and the Management of Information - Volume Part I (HI'11), LNCS vol. 6771, Springer-Verlag, pp. 567-576, 2011.
[25] J.B. Lopes, "Designing user interfaces for severely handicapped persons," Proc. of the 2001 EC/NSF Workshop on Universal Accessibility of Ubiquitous Computing: Providing for the Elderly (WUAUC'01), ACM, pp. 100-106, 2001.
[26] C.M. Karat, C. Halverson, D. Horn, and J. Karat, "Patterns of entry and correction in large vocabulary continuous speech recognition systems," Proc. of the SIGCHI Conference on Human Factors in Computing Systems (CHI '99), pp. 568-575, 1999.
[27] A. Cockburn and A. Siresena, "Evaluating Mobile Text Entry with the Fastap Keypad," British Computer Society Conference on Human Computer Interaction, pp. 77-80, 2003.
[28] M. Vasiljevas, R. Turčinas, and R. Damaševičius, "EMG Speller with Adaptive Stimulus Rate and Dictionary Support," Proc. of FedCSIS'2014: Federated Conference on Computer Science and Information Systems, pp. 233-240, 2014.
[29] M. Vasiljevas, R. Turčinas, and R. Damaševičius, "Development of EMG-based speller," Proc. of INTERACCION 2014: XV International Conference on Human Computer Interaction, 2014.
[30] T. Hayashi and R. Kishi, "Utilization of NASA-TLX for Workload Evaluation of Gaze-Writing Systems," Proc. of the 2014 IEEE International Symposium on Multimedia (ISM '14), pp. 271-272, 2014.