Advanced HCI and 3D Web over Low Performance Devices
David Oyarzun, Arantza del Pozo, John Edgar Congote, Igor G. Olaizola
Vicomtech Research Centre
{doyarzun, adelpozo, jcongote, iolaizola}@vicomtech.org

Iñaki Sainz
Aholab Group, Basque Country University
Ingeniaritza Goi Eskola Teknikoa
inaki@bips.bi.ehu.es

Igor Leturia
Elhuyar Foundation
i.leturia@elhuyar.com

Xabier Arregi
IXA Group, University of the Basque Country
Informatika Fakultatea
xabier.arregi@ehu.es

Oscar Ruiz
Universidad EAFIT
oruiz@eafit.edu.co


ABSTRACT
This position paper presents the authors' goals for advanced
human-computer interaction and the 3D Web. Previous work on speech,
natural language processing and visual technologies has led to the
BerbaTek language learning demonstrator, a 3D virtual tutor that
supports Basque language students through spoken interaction. The next
step is to migrate the whole system to multidevice web technologies.
This paper presents the architecture defined and the steps to be
carried out in the coming months.

Categories and Subject Descriptors
D.2.2 [Software Engineering]: Design tools and techniques –
user interfaces.

General Terms
Algorithms, Performance, Design, Standardization, Languages

Keywords
Advanced HCI, Virtual Characters, WebGL, Speech, Natural Language

Copyright © 2012 for the individual papers by the papers' authors.
Copying permitted only for private and academic purposes. This volume
is published and copyrighted by its editors.
Dec3D2012 workshop at WWW2012, Lyon, France

1. INTRODUCTION
Over the last 3 years, the authors of this paper have been working on
BerbaTek, a strategic research project on speech, language and visual
technologies for Basque, promoted by the Basque Government.

For the field of education, we have created a demo of a personal
language learning tutor that offers a natural form of human-computer
interaction by means of 3D virtual characters, speech and natural
language (see Figure 1).

The tutor is a 3D avatar that shows emotions, developed by
Vicomtech-IK4. It speaks Basque and understands what is said in Basque
using Aholab's technology. The tutor is capable of assisting the user
in the following tasks:

     -    carrying out grammar exercises (e.g. verb conjugation, word
          inflection) and reading comprehension exercises (e.g.
          filling in gaps in a text, choosing from several options),
          which are created automatically from texts using technology
          from IXA;
     -    evaluating the quality of the user's pronunciation through
          Aholab technology;
     -    and providing writing aids (e.g. inflection of words,
          writing of numbers or querying dictionaries) by means of
          technology from IXA and Elhuyar.

Figure 1: Appearance of the BerbaTek language learning tutor

This position paper presents the current efforts to migrate this
language learning demonstrator resulting from the BerbaTek
project to 3D Web technologies using the new WebGL API [1, 2].

The resulting prototype will have the following features:
     •    Avoid the use of plug-ins. Web technologies capable of
          natively mixing several types of media content will be
          exploited instead.
     •    Keep all the BerbaTek demo functionalities. Both the 3D
          virtual character and the advanced HCI technologies will be
          included.
     •    Be multidevice and multiplatform. The result will be
          designed to be accessible from low performance devices such
          as standard mobile phones.

2. ARCHITECTURE OVERVIEW
After a review of the state of the art, a system architecture that
takes into account the targeted multidevice capabilities has been
defined. Figure 3 shows the architecture schema.

Basically, the BerbaTek modules and databases (both internal and
external) are managed on the server side. When a new device connects
to the system, the system automatically discovers the device's
capabilities and decides which is the most suitable way to render the
3D content.

3. SERVER MANAGEMENT
The goal of this part of the architecture is to decentralize the
storage of the modules.

Figure 3: Architecture schema

Each arrow in the architecture represents an HTTP call. Therefore, the
physical location of each part of the system is not relevant.

A typical process will be as follows:
     •    The user connects to the system.
     •    The device features are detected.
     •    The virtual character is sent via streaming or rendered on
          the device, depending on the detected system capabilities.
     •    The user interacts with the system, ideally via voice. If
          the device does not support it, other input/output
          paradigms can be used.

4. MULTIDEVICE FEATURES
One of the main goals of the migration is to obtain a truly
multidevice system. To achieve this, protocols for device discovery
and automatic content adaptation will be implemented.

These protocols will allow the server to know which kind of device is
connected and what its performance features are. With this
information, the server will be able to decide the best way to send
the multimedia information to the device (in-device rendering or
interactive streaming). In the case of interactive streaming, standard
protocols like RTP/RTCP will be used to obtain better performance.

As a consequence, the resulting prototype will allow a complete
description of immersive environments that will be automatically
adapted to the specific visualization properties of the end device
(e.g. from a totally immersive 360° display with haptic interaction in
a cave, to a small representation on auto-stereoscopic handheld
devices with simple input controls).

Moreover, the interaction paradigm will also be adapted to the device
features. Depending on the end device, not only speech-based paradigms
but also new paradigms that better fit the specific interaction
features of the device will be automatically applied.

5. CONCLUSIONS AND NEXT STEPS
In this position paper, the goals for the evolution of the BerbaTek
language learning demonstrator have been presented. The conclusions of
a technological review and an architecture design have been explained.
The goal of this initial work is to obtain an advanced HCI prototype
based on Web 3D and HTML5 technologies.

The next steps will consist of implementing the system described in
the previous sections.
     •    In a first stage, the architecture will be implemented
          using local infrastructure. Both PCs and mobile devices
          will be used to test the discovery protocol.
     •    In a second stage, advanced interaction and visualization
          devices will be included in order to implement extra
          features like stereoscopy.

The results of these stages will be disseminated among the scientific
community by means of publications, technical workshops and web sites
developed ad hoc.

6. REFERENCES
[1] Khronos Group. WebGL Specification. 2011. Retrieved from
    https://www.khronos.org/registry/webgl/specs/latest/
[2] Behr, J., Eschler, P., Jung, Y. and Zöllner, M. X3DOM: a
    DOM-based HTML5/X3D integration model. Proceedings of the 14th
    International Conference on 3D Web Technology, ACM, 2009,
    pp. 175-184.
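
As an illustration only, the device detection and delivery decision
described in Sections 3 and 4 could be sketched as follows in
TypeScript. Everything named here is an assumption made for the
example and is not part of the BerbaTek design: the DeviceCapabilities
fields, the gpuScore benchmark value, the MIN_GPU_SCORE threshold and
the planDelivery function are all hypothetical. The sketch only shows
the shape of the decision: a device that reports WebGL support and
sufficient graphics performance renders the character itself, any
other device receives an interactive stream, and speech is chosen for
interaction only where the device supports it.

```typescript
// Hypothetical capability report sent by a device after it connects.
// The field names are illustrative assumptions, not taken from the paper.
interface DeviceCapabilities {
  supportsWebGL: boolean;       // the browser can create a WebGL context
  supportsSpeechInput: boolean; // a microphone / speech capture is available
  gpuScore: number;             // assumed benchmark result in the range 0..1
}

type RenderingMode = "in-device" | "streaming";
type InteractionMode = "speech" | "touch-or-keyboard";

interface DeliveryPlan {
  rendering: RenderingMode;
  interaction: InteractionMode;
}

// Assumed cut-off below which a device is treated as low performance.
const MIN_GPU_SCORE = 0.5;

// Decide how the virtual character reaches the device and how the
// user will interact with it, based only on the reported capabilities.
function planDelivery(caps: DeviceCapabilities): DeliveryPlan {
  const canRenderLocally = caps.supportsWebGL && caps.gpuScore >= MIN_GPU_SCORE;
  const rendering: RenderingMode = canRenderLocally ? "in-device" : "streaming";
  const interaction: InteractionMode = caps.supportsSpeechInput
    ? "speech"
    : "touch-or-keyboard";
  return { rendering, interaction };
}

// Two made-up devices: a standard mobile phone and a desktop PC.
const phone: DeviceCapabilities = {
  supportsWebGL: false,
  supportsSpeechInput: true,
  gpuScore: 0.1,
};
const desktop: DeviceCapabilities = {
  supportsWebGL: true,
  supportsSpeechInput: true,
  gpuScore: 0.9,
};

console.log(planDelivery(phone));   // rendering: "streaming", interaction: "speech"
console.log(planDelivery(desktop)); // rendering: "in-device", interaction: "speech"
```

In a real deployment the capability report would presumably arrive
over one of the HTTP calls of the architecture, and any threshold
would have to be calibrated on actual devices.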