Advanced HCI and 3D Web over Low Performance Devices

David Oyarzun, Arantza del Pozo, John Edgar Congote, Igor G. Olaizola
Vicomtech Research Centre
{doyarzun, adelpozo, jcongote, iolaizola}@vicomtech.org

Iñaki Sainz
Aholab Group, Basque Country University
Ingeniaritza Goi Eskola Teknikoa
inaki@bips.bi.ehu.es

Igor Leturia
Elhuyar Foundation
i.leturia@elhuyar.com

Xabier Arregi
IXA Group, University of the Basque Country
Informatika Fakultatea
xabier.arregi@ehu.es

Oscar Ruiz
Universidad EAFIT
oruiz@eafit.edu.co

ABSTRACT
This position paper presents the authors' goals for advanced human-computer interaction and the 3D Web. Previous work on speech, natural language processing and visual technologies has led to the development of the BerbaTek language learning demonstrator, a 3D virtual tutor that supports students of Basque through spoken interaction. The next steps consist of migrating the whole system to multidevice web technologies. This paper presents the architecture that has been defined and the steps to be carried out in the coming months.

Categories and Subject Descriptors
D.2.2 [Software Engineering]: Design tools and techniques – user interfaces.

General Terms
Algorithms, Performance, Design, Standardization, Languages

Keywords
Advanced HCI, Virtual Characters, WebGL, Speech, Natural Language

Copyright © 2012 for the individual papers by the papers' authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.
Dec3D2012 workshop at WWW2012, Lyon, France

1. INTRODUCTION
During the last 3 years, the authors of this paper have been working on BerbaTek, a strategic research project on speech, language and visual technologies for Basque, promoted by the Basque Government.

For the field of education, we have created a demo of a personal language learning tutor that provides a natural form of human-computer interaction by means of 3D virtual characters, speech and natural language (see Figure 1).

The tutor is a 3D avatar that shows emotions, developed by Vicomtech-IK4. It speaks Basque and understands spoken Basque using Aholab's technology. The tutor is capable of assisting the user in the following tasks:

- carrying out grammar exercises (e.g. verb conjugation, word inflection) and reading comprehension exercises (e.g. filling in gaps in a text, choosing from several options), which are created automatically from texts using technology from IXA;
- evaluating the quality of the user's pronunciation through Aholab technology;
- and giving aids for writing texts (e.g. inflection of words, writing of numbers or querying dictionaries) by means of technology from IXA and Elhuyar.

Figure 1: Appearance of the BerbaTek language learning tutor

This position paper presents the current efforts to migrate the language learning demonstrator resulting from the BerbaTek project to 3D Web technologies using the new WebGL API [1, 2].

The resulting prototype will have the following features:

• Avoid the use of plug-ins. Web technologies capable of mixing several kinds of media content natively will be exploited so that no plug-ins are needed.
• Keep all the BerbaTek demo functionalities. Both the 3D virtual character and the advanced HCI technologies will be included.
• Be multidevice and multiplatform. The result will be designed to be accessible from low performance devices such as standard mobile phones.

2. ARCHITECTURE OVERVIEW
After a review of the state of the art, a system architecture that takes into account the targeted multidevice capabilities has been defined. Figure 3 shows the architecture schema.

On the server side, the BerbaTek modules and databases (both internal and external) are managed. When a new device connects, the system automatically discovers the device's capabilities and decides on the most suitable way to render the 3D content.

3. SERVER MANAGEMENT
The goal of this part of the architecture is to decentralize the storage of the modules.

Figure 3: Architecture schema

Each arrow in the architecture represents an HTTP call. Therefore, the physical location of each part of the system is not relevant.

A typical process will be as follows:

• The user connects to the system.
• The device features are detected.
• The virtual character is sent via streaming or rendered on the device, depending on the detected system capabilities.
• The user interacts with the system, ideally via voice. If the device does not support it, other input/output paradigms can be used.

4. MULTIDEVICE FEATURES
One of the main goals of the migration is to obtain a truly multidevice system. To achieve that, protocols for device discovery and automatic content adaptation will be implemented.

These protocols will allow the server to know what kind of device is connected and what its performance features are. With this information, the server will be able to decide the best way to send the multimedia information to the device (in-device rendering or interactive streaming). In the case of interactive streaming, standard protocols like RTP/RTCP will be used to obtain better performance.

As a consequence, the resulting prototype will allow a complete description of immersive environments that will be automatically adapted to the specific visualization properties of the end device (e.g. from a totally immersive 360° environment with haptic interaction in a cave, to a small representation on auto-stereoscopic handheld devices with simple input controls).

Moreover, the interaction paradigm will also be adapted to the device features. Depending on the end device, not only speech-based paradigms but also new paradigms that better fit the specific interaction features of the device will be automatically applied.

5. CONCLUSIONS AND NEXT STEPS
In this position paper, the goals for the evolution of the BerbaTek language learning demonstrator have been presented. The conclusions of a technological review and an architecture design have been explained. The goal of this initial work is to obtain an advanced HCI prototype based on Web 3D and HTML5 technologies.

The next steps will consist of implementing the system described in the previous sections.

• In a first stage, the architecture will be implemented using local infrastructure. Both PC and mobile devices will be used to test the discovery protocol.
• In a second stage, advanced interaction and visualization devices will be included, in order to implement extra features like stereoscopy.

The results of these stages will be disseminated among the scientific community by means of publications, technical workshops and ad-hoc developed web sites.

6. REFERENCES
[1] Khronos Group. WebGL Specification. 2011. Retrieved from https://www.khronos.org/registry/webgl/specs/latest/
[2] Behr, J., Eschler, P., Jung, Y. and Zöllner, M. X3DOM: a DOM-based HTML5/X3D integration model. Proceedings of the 14th International Conference on 3D Web Technology, ACM, 2009, pp. 175-184.
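
The server-side decision between in-device rendering and interactive streaming, made from the detected device capabilities, could be sketched as follows. This is only an illustrative sketch, not the BerbaTek implementation: the `DeviceCapabilities` fields, the `gpu_score` benchmark, the threshold value and both function names are hypothetical, since the paper does not specify the discovery protocol or its decision criteria. The RTP/RTCP streaming itself is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class DeliveryMode(Enum):
    """The two delivery options named in the paper."""
    IN_DEVICE_RENDERING = "in-device rendering"
    INTERACTIVE_STREAMING = "interactive streaming"


@dataclass(frozen=True)
class DeviceCapabilities:
    """Hypothetical capability report produced by device discovery."""
    supports_webgl: bool        # can the browser create a WebGL context?
    supports_voice_input: bool  # is a microphone / speech input available?
    gpu_score: int              # hypothetical benchmark score, 0 to 100


# Hypothetical threshold: below it, the device is treated as low performance.
MIN_GPU_SCORE_FOR_LOCAL_RENDERING = 40


def choose_delivery_mode(caps: DeviceCapabilities) -> DeliveryMode:
    """Render on the device only if it supports WebGL and is fast enough."""
    if caps.supports_webgl and caps.gpu_score >= MIN_GPU_SCORE_FOR_LOCAL_RENDERING:
        return DeliveryMode.IN_DEVICE_RENDERING
    # Otherwise the server renders the character and streams the result.
    return DeliveryMode.INTERACTIVE_STREAMING


def choose_input_paradigm(caps: DeviceCapabilities) -> str:
    """Prefer voice; fall back to another paradigm if voice is unavailable."""
    return "voice" if caps.supports_voice_input else "touch-or-keyboard"


# Two hypothetical devices: a desktop PC and a standard mobile phone.
desktop = DeviceCapabilities(supports_webgl=True, supports_voice_input=True, gpu_score=85)
phone = DeviceCapabilities(supports_webgl=False, supports_voice_input=False, gpu_score=10)

print(choose_delivery_mode(desktop).value, "/", choose_input_paradigm(desktop))
print(choose_delivery_mode(phone).value, "/", choose_input_paradigm(phone))
```

Keeping the decision in one pure function on the server means that new device classes only require a new capability report, not changes to the client, which matches the paper's aim of adapting content automatically.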