=Paper=
{{Paper
|id=None
|storemode=property
|title=A Supportive User Interface for Customization of Graphical-to-Vocal Adaptation
|pdfUrl=https://ceur-ws.org/Vol-828/SUI_2011_paper1.pdf
|volume=Vol-828
}}
==A Supportive User Interface for Customization of Graphical-to-Vocal Adaptation==
Fabio Paternò, Christian Sisti
CNR-ISTI, HIIS Laboratory, Via Moruzzi 1, 56124 Pisa, Italy
Fabio.Paterno@isti.cnr.it, Christian.Sisti@isti.cnr.it
ABSTRACT
In this paper, we describe an approach to adapting graphical Web pages into vocal ones, and show how the approach is supported by a tool that allows the user to drive the adaptation results by customizing the adaptation parameters. The adaptation process exploits model-based user interface descriptions.
Keywords
Vocal Interfaces, Model-Based, Adaptation, Supportive User Interfaces, Accessibility.
INTRODUCTION
Vocal interfaces are important in a number of different contexts, such as for vision-impaired users or when the visual channel is busy (e.g., while driving a car) [7]. Design techniques for developing Vocal Interfaces have been widely studied [1], but little attention has been paid to how to adapt Web pages for vocal browsing. Moreover, recognition of natural language is improving [2], and in the future it will be possible to develop vocal interfaces able to recognize any user input.
We found that the adaptation of graphical Web pages into vocal ones needs to be supplemented by Supportive User Interfaces (SUIs), which enable users to customize the adaptation. Indeed, a completely automatic transformation cannot provide good results in many cases.
The adaptation process is based on MARIA [5], a recent model-based language, which allows designers to specify abstract and concrete user interface languages according to the CAMELEON Reference Framework [3]. The customization tool has a Web interface that allows the user to drive the generation of the Vocal Interfaces.
In this workshop paper we first present the overall Model-Based Language Architecture, then introduce the adaptation approach, and lastly show an example application of the supportive interface for graphical-to-vocal adaptation, also showing how a parameter change can lead to different results in the final user interface.
MODEL-BASED INTERFACES IN MULTI-DEVICE ENVIRONMENTS
MARIA is a model-based language, which allows designers to specify abstract and concrete user interface languages. Abstract User Interfaces (AUIs) are independent of the interaction modalities, while Concrete User Interfaces (CUIs) are dependent on the interaction resources of the target platforms but are independent of the implementation languages.
An AUI is composed of a number of presentations, a data model and a set of external functions. Each presentation contains a number of user interface elements, called interactors, and a number of so-called interactor compositions. Examples of interactor compositions are grouping and relation, which group or relate different interactors. The interactors can be classified in terms of editing, selection, output and control, and may have a number of associated event handlers.
As already mentioned, the CUIs are dependent on the interaction resources of the target platform. Thus, while in the Desktop modality a presentation can be defined as a set of user interface elements perceivable at a given time, in the Vocal modality a presentation is defined as a set of dialogues between user and platform that can be identified as a logical unit (e.g., the communication necessary for filling in a vocal form).
Figure 1. Some Possible Abstraction Levels
Figure 1 shows the relationship between the AUI and the CUIs, limited to the Desktop and Vocal target platforms (other available target platforms are Mobile, Multi-Touch and Multi-Modal). Figure 1 also represents some possible transformations that can be performed, such as the HTML generation from Desktop Logical
Descriptions (an instance of a Desktop CUI) and the VoiceXML generation from Vocal Logical Descriptions.
The aim of our work is to develop an adaptation process that takes HTML pages as input and generates corresponding VoiceXML documents, suitably adapted for the voice modality. This is not a simple task and raises a large number of adaptation issues (such as the retrieval of the menu items for vocal interaction and the adaptation of images). In this context Supportive User Interfaces can provide useful support, in particular in the customization of the adaptation rules.
APPROACH
Our solution is based on an adaptation server that consists of three modules (see Figure 2):
• Reverser: parses the Web pages and builds an equivalent Desktop Concrete Logical Description.
• Adapter: transforms the Desktop Concrete Logical Description into an adapted Vocal Concrete Logical Description.
• Generator: generates the VoiceXML, taking the Vocal Concrete Logical Description as input.
Figure 2. The Adaptation Server Architecture.
The reverser, taking into account the associated page style sheet, transforms the HTML tag patterns into suitable Desktop CUI elements. This process makes it possible to obtain a more semantic description.
The adapter is subdivided into three sub-modules that are executed in a pipeline:
1. Pre-Converter: removes the elements that cannot be rendered vocally (e.g., images without an ALT attribute) and also corrects possible inconsistencies due to the reverse process (e.g., a grouping that contains only one interactor for formatting purposes).
2. Menu-Generator: vocal interfaces are generally navigated through lists of menus. This step aims to convert a Desktop Logical Description into a new one structured as a hierarchy of menus and sub-menus.
3. Graphical-to-Vocal Mapper: in this step each element of the Desktop CUI is mapped into a (semantically equivalent) element of a new Vocal CUI.
The final implementation language is VoiceXML 2.0 [8], a standard language, supported by W3C, for the specification of Vocal Interfaces. The VoiceXML code generated by the transformation has been tested with the Voxeo Voice Browser [9] (suggested by W3C), and has passed the validation test integrated in it. More detail on the VoiceXML generation is provided in [4].
THE CUSTOMIZATION SUPPORT
The adaptation process is complex and the results depend on a number of factors, such as the structure of the input Web pages and their conformance to the accessibility guidelines. In order to obtain better results we have designed a Supportive User Interface, which allows the user to customize the adaptation results.
The adaptation process can be driven by setting a number of parameters. Such parameters can influence different stages of the transformation process.
To adjust the pre-conversion step the following parameters are available:
• Remove Whitespaces: if enabled, it removes from the computation the groupings that contain only whitespace. Such groupings can occur for graphical formatting purposes (e.g., list of “ ”).
• Min Image Width/Height: images under these size limits that do not contain an ALT attribute are removed.
• Min Grouping Threshold: removes, from the specification provided by the reverse engineering, the grouping operators that contain little text to synthesize (below the threshold).
To customize the menu generator step it is possible to set the following parameters:
• Max Grouping Threshold: if the length of the textual content of a grouping is above the max threshold, then new menu items are created by splitting the original grouping.
• Descr/Nav ratio: sets the ratio between the description and navigator interactors that is used to identify the groupings that contain a navigation bar.
Finally, to customize the mapper step, the parameters are:
• Multiple Choice: sets how the final vocal interface will perform a multiple choice. There are two solutions: Yes/No Questions, where for every possible choice the platform asks the user for a Yes/No confirmation; and Grammar Based, where the user can select more than one
  possible choice with one single sentence (listing the choices in sequence).
• End Form Sound: decides whether each vocal dialogue should terminate with a short sound.
Figures 3 and 4 show our Supportive User Interface, which allows such parameterization. The left panel (shown in Figure 3) contains some modifiable parameters and their default values.
Figure 3. Customization of the adapter.
The right panel (see figure below) shows the structure and the menu items of the generated vocal page. In this way the designer can decide whether to download the final vocal interface (as a zip file containing the VoiceXML documents) or change the transformation parameters in order to obtain a different structure.
Figure 4. Application right panel: vocal menu structure.
EXAMPLE CONFIGURATION PARAMETER CHANGE
In this section we show an example of a configuration parameter change that affects the structure of the resulting user interface.
In particular, we consider the Max_Threshold parameter, which defines the threshold in terms of the length of the text to render vocally. If the length exceeds this limit, the adaptation system splits the presentation content. If we set max_threshold = 2500 then we obtain the structure in Figure 4.
Figure 4. Initial parameter set.
Thus, the Returning home part (see Figure 5) will be rendered as a single piece of information.
Figure 5. The considered content part.
If we change the parameter to max_threshold = 700 we obtain the structure in Figure 6.
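As a rough illustration of how such a threshold can change the resulting structure, the following Python sketch builds a menu tree from nested groupings. It is not the implementation used by the tool: the Grouping class, the build_menu and depth functions, and the sample text lengths are hypothetical, and we assume that an over-long grouping is split along its existing nested groupings.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Grouping:
    """Hypothetical stand-in for a grouping of a logical description."""
    title: str
    text: str = ""  # text owned directly by this grouping
    children: list[Grouping] = field(default_factory=list)

    def text_length(self) -> int:
        # Characters to synthesize, including all nested groupings.
        return len(self.text) + sum(c.text_length() for c in self.children)


def build_menu(group: Grouping, max_threshold: int) -> dict:
    """Return a nested menu description for `group`.

    Assumption: a grouping longer than `max_threshold` becomes a sub-menu
    with one item per child; otherwise it is read as a single piece.
    A grouping without children cannot be split in this sketch.
    """
    if group.text_length() <= max_threshold or not group.children:
        return {"title": group.title, "items": []}
    return {
        "title": group.title,
        "items": [build_menu(c, max_threshold) for c in group.children],
    }


def depth(menu: dict) -> int:
    """Number of menu levels, counting `menu` itself."""
    return 1 + max((depth(item) for item in menu["items"]), default=0)


# Made-up content: 600 + 900 + 400 = 1900 characters in total.
returning_home = Grouping(
    "Returning home",
    children=[
        Grouping("Part A", text="a" * 600),
        Grouping("Part B", children=[
            Grouping("Part B.1", text="b" * 500),
            Grouping("Part B.2", text="b" * 400),
        ]),
        Grouping("Part C", text="c" * 400),
    ],
)

single = build_menu(returning_home, max_threshold=2500)  # not split
split = build_menu(returning_home, max_threshold=700)    # split into sub-menus
```

With the larger threshold the sample section stays a single menu item, while with the smaller one it becomes a sub-menu whose longest part is subdivided again.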
Figure 6. The resulting modified structure
Note that the resulting structure has more sub-levels: the section Returning home is subdivided into multiple parts, highlighted by dashed lines in Figure 7, which can be further subdivided.
Figure 7. How the content is further divided.
CONCLUSION
We have introduced a Model-Based approach to supporting Graphical-to-Vocal Adaptation. We have then proposed a Supportive User Interface (as a Web application) to help the user manage the overall adaptation process.
We consider this tool a useful support that gives users full control over the final results. Given the complexity of existing Web content, we plan to add new features to both the adaptation rules and the customization interface, in order to provide more flexible control over the adaptation results.
ACKNOWLEDGMENTS
We gratefully acknowledge support from the Artemis EU SMARCOS and the ICT EU SERENOA projects.
REFERENCES
1. Edwards, A., Pitt, I.: Design of Speech-Based Devices. Springer (2007).
2. Franz, A., Milch, B.: Searching the Web by Voice. In Proceedings of the 19th International Conference on Computational Linguistics, Volume 2, pp. 1-5, Stroudsburg, PA, USA (2002).
3. Calvary, G., Coutaz, J., Bouillon, L., Florins, M., Limbourg, O., Marucci, L., Paternò, F.: The CAMELEON reference framework. CAMELEON project, Deliverable 1.1 (2002).
4. Paternò, F., Sisti, C.: Deriving Vocal Interfaces in Multi-device Authoring Environments. In Proceedings of the 10th International Conference on Web Engineering, pp. 204-217 (2010).
5. Paternò, F., Santoro, C., Spano, L.D.: MARIA: A universal, declarative, multiple abstraction-level language for service-oriented applications in ubiquitous environments. ACM Trans. Comput.-Hum. Interact. 16(4) (2009).
6. UNICEF. http://www.unicef.org/.
7. Voice Browser Activity. http://www.w3.org/Voice/.
8. Voice extensible markup language (VoiceXML) version 2.0. http://www.w3.org/TR/2009/REC-voicexml20-20090303/7.
9. Voxeo Voice Browser. http://www.voxeo.com/.