E-Composer: Enabling the Composition of Mobile Assistants

Ilhan Aslan, Dyuti Menon, Robert Brauer, Kristin Albert and Christian Maugg
Fraunhofer ESK, Germany
name.lastname@esk.fraunhofer.de

ABSTRACT
ELEPHANT (ELEments for Pervasive and Handheld AssistaNTs) is a system that aims to integrate a broad range of users (e.g. designers, domain experts and end users) with different backgrounds in the process of developing personal mobile assistants. In this paper we present a user study that we conducted for two reasons: first, to screen characteristics of the modeling of mobile assistants by non-experts of mobile software development; and second, to test a first prototype of the ELEPHANT system's graphical modeling tool (E-Composer). In order to derive essential feedback regarding the composer tool, its reception by users and its functionality, we describe usability tests that we conducted to measure user satisfaction with the tool and its overall performance. A small test scenario was set up in which users were given the task of modeling a mobile assistant using the E-Composer. Based on the users' reactions and suggestions during and after the tests, conclusions were drawn regarding the performance and efficacy of the composer and how it may be improved. We describe the usability tests, their set-up, the collected data, what we intended to deduce from the tests and which methods we used to evaluate the data.

1. INTRODUCTION
Today, the use of mobile phones is very widespread, and the capabilities of mobile technology as well as the underlying infrastructure are increasing on a regular basis. This development qualifies mobile phones as digital companions in everyday life. However, when it comes to modeling interaction for a broad spectrum of target users, target domains and contexts of use, the modeling process becomes very cumbersome. On the one hand, designing interaction and user interfaces is a profession in itself, and most software engineers do not have the required skills to build user centered, attractive and usable interactions without being guided or having a framework set for them. On the other hand, general modeling languages (e.g. UML based) that are used by software engineers are either too low level or foreign to most designers and domain experts. The ELEPHANT system (ELEments for Pervasive and Handheld AssistaNTs) aims to integrate non-software engineers (e.g. designers, domain experts and end users) in the process of developing personal mobile assistants.

The ELEPHANT system's modeling tool, which we refer to as the E-Composer, allows a high level of modeling based on components [1]. One of the reasons why users access services while mobile is that they need assistance to complete an activity (e.g. shopping, dining, driving or route finding) or to proceed with an activity in the real world. Although today's mobile phones have advanced interfaces and can handle most websites that were originally designed for the desktop environment, single services that focus on content and functionality are not sufficient to assist mobile users during their specific activities. Especially if users are involved in real world activities in which they are pressed for time, the assistance provided through the capabilities of the mobile phone has to be highly personalized and centered on the user's activity. The requirements on personalization and adaptation to user activities are therefore very high. To fulfill these requirements, domain experts and end users have to participate in the design process. The ELEPHANT system therefore provides browser based tool support for the participative design of mobile assistants. The E-Composer is the front-end of the ELEPHANT system that allows users to graphically compose mobile assistants based on components. The graphical presentation of a mobile assistant modeled with the E-Composer has a tree-like structure (see figure 1). The backend of the ELEPHANT system manages these components. Components can be accessed and tagged with information by all users. Users can search for components and set up a component library. In [1] we described the component based development of mobile assistants in more detail.
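To illustrate the tree-like, component based structure mentioned above, the following minimal sketch composes a travel assistant as a tree of named, taggable components. The class and component names are hypothetical illustrations only and do not reflect the actual ELEPHANT or E-Composer data model.

class Component:
    """A node in an assistant's tree: a named component with tags and children."""
    def __init__(self, name, tags=None):
        self.name = name
        self.tags = tags or []      # components can be tagged with information
        self.children = []          # sub-components bundled under this node

    def add(self, child):
        self.children.append(child)
        return child

    def show(self, indent=0):
        """Print the tree structure, one node per line."""
        print(" " * indent + self.name)
        for child in self.children:
            child.show(indent + 2)

# Composing a hypothetical Barcelona assistant as a tree of components
assistant = Component("Barcelona Assistant")
phrases = assistant.add(Component("Spanish phrases", tags=["language"]))
phrases.add(Component("Buying tickets"))
phrases.add(Component("Ordering food"))
sights = assistant.add(Component("Sightseeing guide", tags=["city guide"]))
sights.add(Component("Background information on sights"))
assistant.add(Component("Suggestions: places to eat", tags=["recommendations"]))
assistant.show()

In the actual system the backend manages and tags such components; a plain tree is enough here to convey the idea of composition.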
2. User Study
The usability tests were conducted with 11 participants in the age group of 22-28 years. They came with different backgrounds in the areas of computer expertise, authoring systems and system modeling skills. The tests were conducted individually and in an undisturbed setting, with each test subject being initially instructed as to the nature and goal of the test. The test subjects were advised to complete the test within 1 hour and to keep in mind that the test was composed of 2 separate tasks. Once the test subjects had been given all the instructions and provided with all the material to proceed with the test, the members of our team left the premises.

The goal of the tests was for the participants to create a mobile assistant which would assist a friend who would shortly be travelling to the city of Barcelona. This mobile assistant would aid the visitor with the Spanish language by helping them with translations of common phrases (to buy tickets, order food etc.), be a guide for sightseeing in the city of Barcelona (by providing background information on the interesting places to see) and provide additional information such as suggestions about interesting places to eat or things to do in Barcelona. Keeping the generation of a Barcelona mobile assistant as the common goal, two tasks were designed to differentiate between a known and an unknown framework. The first task was to design a paper based Barcelona mobile assistant (see figure 2). The second task was to do the same, i.e. design a Barcelona mobile assistant, with the help of the ELEPHANT composer (see figure 1). For both tasks, the test subjects were provided with a list of content they had at their disposal to create this assistant. The content included text data, images, video clips and audio files, all connected to Barcelona and the Spanish language.

Our aim in conducting these tests was to measure the system performance, the user satisfaction and the emotional response (in terms of stress and cognitive load on the participant) caused by using the tool.

System performance: Evaluating the operation and efficiency of the tool is a key step in its development. Identifying areas that require more attention, or areas that we can build on, helps enrich the authoring tool and provides a solid basis for creating an advanced product.

User satisfaction: Based on actual user experience, this metric is a powerful indicator of how the product might be received and how quickly it might be adopted by users. The test subjects rate and rank different features and functionalities of the tool, and we as developers are able to interpret this and change and improve the authoring tool accordingly.

Indication of stress and cognitive load: The term cognitive load (CL) may be described as the amount of effort that accompanies learning, thinking and reasoning [9] and hence has a bearing on the overall evaluation of the tool.

System performance and user satisfaction: In our usability tests, both these metrics were evaluated from user feedback in the form of questionnaires, user comments and user reactions. Real-time user reactions were also recorded by capturing the screen activity, recording any comments made by the test subjects while doing the tests and using a webcam to record the activity of the test subjects (see figure 1).

Stress and cognitive load: As discussed earlier, both stress and cognitive load introduce physiological changes in the body; they can therefore be identified using biosensors that monitor and record certain bio-signals. In our usability tests, we monitored the heart rate, skin conductivity and skin temperature of our test subjects.
3. Data Collection
Two questionnaires were administered to the users. The first was used to understand the background of the users and their experience with any of the authoring tools available on the market; it was answered by each test subject before beginning the usability test. The second questionnaire, addressing issues related to the ELEPHANT composer, was answered by the test participants after the completion of both tasks. It was largely based on the USE questionnaire for user interface satisfaction designed by Arnold Lund [6]. This questionnaire evaluates four key factors (Usefulness, Ease of Use, Ease of Learning and Satisfaction) through a series of questions which are answered by a rating from 1 to 7, between a strongly positive reaction (scored as 7) and a strongly negative one (scored as 1). Test subjects were also given the freedom to express their suggestions and ideas.
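As a rough illustration of how such 1-to-7 ratings can be aggregated per factor, consider the sketch below. The grouping of items into factors and the sample ratings are hypothetical; the actual evaluation followed the guidelines provided by the questionnaire's author [6].

from statistics import mean

# Hypothetical ratings (1 = strongly negative, 7 = strongly positive) given by
# one participant, grouped by USE factor; grouping and values are examples only.
ratings = {
    "Usefulness":       [6, 5, 6],
    "Ease of Use":      [5, 4, 5, 6],
    "Ease of Learning": [6, 6],
    "Satisfaction":     [5, 6, 5],
}

# Mean score per factor for this participant
factor_scores = {factor: mean(values) for factor, values in ratings.items()}

for factor, score in factor_scores.items():
    print(f"{factor}: {score:.2f} / 7")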
The test subjects were asked to think aloud, and a continuous audio and video recording was made, whereby we could register their thoughts and reactions during the course of the task. In order to correlate these audio comments with the task being performed, the activity on the screen was also captured with the help of the Camtasia Studio 5 screen recording software. Using Camtasia we were also able to record the video feed from a webcam that was monitoring the test subject (see figure 1). All these 3 inputs were recorded to become part of the usability test analysis.

Figure 1: Screenshot of one subject's audio and video data

In our study, we intended to measure changes in 3 physiological variables, namely heart rate (an indicator of stress), skin conductivity (or electrodermal activity [3], an indicator of CL) and skin temperature (an indicator of stress). To carry out these measurements we used two biosensors, the Alive Technologies Heart Monitor and the SenseWear BMS from Body Media. We monitored the bio-signals of the test subjects over both tasks, allowing us to compare levels of parameters such as CL or stress between the paper-based and the tool-based task.

Figure 2: Photo of one subject's paper based model of a mobile assistant

4. Data Interpretation
An initial questionnaire was answered by the test subjects at the start of the test to ascertain their level of computer knowledge and their experience with authoring tools and system modeling. Since the test subjects' professions ranged from computer scientists to economists and electrical engineers, we encountered different levels of both computer knowledge and design and modeling experience. However, all participants rated themselves as being capable of operating personal computers, while the self-assessment regarding experience with software modeling and authoring tools varied quite a lot between the test subjects. We were expecting to see reduced cognitive load for participants with a high level of knowledge regarding software modeling and authoring tools.

The second questionnaire (based on the USE questionnaire) was administered after the completion of both tasks. It was evaluated based on the guidelines set by its author and gave us insight into the levels of user satisfaction and the ease of use of the composer.

The audio and video recording was evaluated in conjunction with the task that was being performed at that time. The comments made were interpreted along with the activity occurring on the screen and the webcam feed recorded within that time frame, to see what it was about our tool that caused a problem and whether the participants had any suggestions to change and improve the tool.

Our aim was to analyze the cognitive load on the test subjects (the evaluation of stress is part of our future work) and, depending on the findings, to find ways to improve the tool and make it easier to use. To this effect, we analyzed the Galvanic Skin Response (GSR) values tracked by the SenseWear BMS biosensor. We performed a simple statistical analysis, calculating the mean over the entire test duration and over each of the tasks separately.

Any task which requires learning, thinking and/or reasoning puts a certain amount of load on the working memory, known as cognitive load (CL) [8]. There are 3 types of CL associated with learning a task. The intrinsic CL is the inherent difficulty and complexity associated with a task. The extraneous CL is produced by the manner in which the instruction or information is presented to the student and must be minimized for optimum learning. Finally, the germane CL also originates from the manner of instruction, but contributes towards the learning process [8].
As the number of issues that can be handled simultaneously by the working memory is limited, Cognitive Load Theory (CLT) provides a basis for designing optimal instructional interfaces which reduce the extraneous CL, thereby ensuring more effective learning [7]. A lot of work has been done on using CL to reduce the difficulties associated with learning computer programming, which is a highly interactive task. More interaction increases the CL on the working memory, as multiple activities and skills are being called upon simultaneously [10]. For tasks rich in interactivity, it is particularly important to reduce the extraneous CL [8]. As in [9], we use the GSR data obtained from our biosensors to analyze the effect of CL on our participants, as there is a directly proportional correlation between GSR values and CL (an increase in CL results in an increase in GSR [9] and vice versa). Out of the 11 participants, 9 were chosen for the analysis of biosensor data (the data for the other 2 participants was not collected as planned due to problems with improper skin contact).

For the analysis, the entire duration of the test was split up into 3 parts (see figure 3), namely:
- Listening to instructions: the participants received the initial instructions, including a brief description of the test and its goals
- Paper based task: the participant carried out the paper-based task (not time limited) to design a mobile travel assistant on paper
- Computer based task: the participants used the ELEPHANT composer to create the same travel assistant

Figure 3: Rise of GSR in μS for participant Banner, plotted for each of the three individual parts (instruction, paper based and computer based)

The SenseWear BMS from Body Media provided us with a moving average of the GSR for every minute over the entire duration of the test. As each participant spent variable amounts of time on each of the tasks, we calculated the mean GSR for each of the above time intervals for each participant, which allowed us to compare these values:

avgGSR_task(i) = Σ GSR_task(i) / t_task(i)    (1)

where t_task is the duration of each task, i represents the participant and GSR_task represents the recorded moving average GSR values for the task being undertaken (listening to the instructions, working on the paper-based task, or using the composer). The mean GSR values of the paper-based and computer-based tasks for each of the participants were then compared. Based on these metrics, we present our results below.
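A minimal sketch of the aggregation in equation (1) is given below. It assumes the per-minute moving-average GSR values are available as (minute, value) pairs and that the task boundaries are known; the sample values, boundaries and variable names are hypothetical and not taken from the study data.

# Sketch of equation (1): mean GSR per task segment, computed from per-minute
# moving-average GSR readings. All values and boundaries below are hypothetical.

# per-minute GSR readings in microsiemens: (minute index, GSR value)
gsr_per_minute = [(0, 0.17), (1, 0.19), (2, 0.18),   # instructions
                  (3, 0.22), (4, 0.25), (5, 0.24),   # paper based task
                  (6, 0.27), (7, 0.29), (8, 0.28)]   # computer based task

# task boundaries as half-open minute ranges [start, end)
tasks = {"instructions": (0, 3), "paper": (3, 6), "computer": (6, 9)}

def avg_gsr(task):
    """Equation (1): sum of the GSR samples within the task interval,
    divided by the task duration t_task in minutes."""
    start, end = tasks[task]
    samples = [value for minute, value in gsr_per_minute if start <= minute < end]
    return sum(samples) / (end - start)

for task in tasks:
    print(f"{task}: {avg_gsr(task):.3f} microsiemens")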
Figure 4: Average GSR for the paper based and computer based tasks for each participant

Once the mean GSR for each participant and for each of the tasks had been calculated, we performed the following comparisons to deduce the CL generated in our test subjects by using our tool. The average GSR values for the 3 parts of the usability tests were as follows: listening to instructions 0.18 μS, paper based 0.24 μS and computer based 0.28 μS. As expected, there was an increase in the average GSR for the computer based task, indicating an increase in CL. This supports the assumption that moving from a known environment (paper based) to an unknown environment (the ELEPHANT composer), which involves the usage of a new computer tool, causes a rise in the cognitive load on the working memory.

The next step was to examine the average GSR for each of the participants individually. As we are specifically interested in the paper based and computer based tasks, figure 4 plots the average GSR calculated for each participant in these 2 tasks. In order to see the significance of the change (increase or decrease), we also calculated the change in the average GSR of the computer based task with respect to that of the paper based task and expressed it as a percentage:

Change%(i) = (avgGSR_computer(i) - avgGSR_paper(i)) / avgGSR_paper(i) × 100    (2)

where i represents each participant. While the general trend is an increase in the GSR (and hence an increase in CL), we observed that for 2 participants (Richards and Parker) there was a decrease in the GSR recorded during the computer based test. Comparing the GSR results with those of the questionnaires, we saw that Richards and Parker, both hailing from an IT background with extensive computer expertise and experience in using authoring systems, found our tool easy to use and were able to learn its use quickly. This was expected, as we had already noticed the test subjects' varying levels of knowledge in software modeling and authoring, as pointed out above. The CL exerted on their working memories was reduced during the computer based task.
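As a quick worked example of equation (2), applying it to the group-level averages reported above (0.24 μS for the paper based and 0.28 μS for the computer based task), rather than to an individual participant as in figure 4, yields an increase of roughly 17%:

# Worked example of equation (2) using the group-level averages reported in the
# text (0.24 and 0.28 microsiemens); in the study the equation is applied per
# participant i, using the values plotted in figure 4.

def change_percent(avg_gsr_computer, avg_gsr_paper):
    return (avg_gsr_computer - avg_gsr_paper) / avg_gsr_paper * 100

print(f"{change_percent(0.28, 0.24):.1f} %")   # prints 16.7 %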
5. Conclusion and Future Work
Using the composer, people felt comfortable with the system and commended the quite simple use of its interface. User-friendliness and ease of learning were also appreciated by most of the participants. All participants succeeded in searching for resources and arranging them into the expected final structure, with marginal variations depending on the respective level of creativity and effort put into the application. The limited set of ELEPHANT elements (E-elements) provided by the system within the testing scenario restricted the freedom of choice. Participants felt constrained by the predetermined set of E-elements. They desired a drill-down into basic E-elements, with the possibility to vary these items according to their goals.

In [1] we defined an ELEPHANT element (E-element) as a component with application logic. E-elements could only be developed by software engineers or by designers with scripting abilities. We are planning to allow new E-elements to be composed with the E-Composer as well (see figure 5). With this improvement, the modeling based on components becomes more flexible but still keeps its high level. Because of the flexibility we gain, we also approach our long term goal of supporting activity-based design. Activities are dynamic and hierarchical structures. In activity theory, the objective of an activity can be realized through different sets of actions [5]; different people might need different actions for the same activity and hence different ways to model the assistance for the same activity. The same actions can contribute to different activities, and may also have different meanings for the people undertaking them [4].

Figure 5: Bundling of substructures in tree nodes

6. ACKNOWLEDGMENTS
This work was funded in part by the Bavarian Ministry of Economic Affairs, Infrastructure, Transport and Technology within the project "Dynamische Plattformen für Verteilte Systeme".

7. REFERENCES
[1] Aslan, I. and Menon, D. Component-based development of mobile assistants with the ELEPHANT system. In Proceedings of Mobility 2009, Nice, France, September 2-4, 2009.
[2] Elliot, S. N. et al. Cognitive load theory and universal design principles: Applications to test item development. Vanderbilt University, NASP Session, 2009.
[3] Haag, A., Goronzy, S., Schaich, P., and Williams, J. Emotion recognition using bio-sensors: First steps towards an automatic system. 2004, pp. 36-48.
[4] Kuutti, K. Activity theory as a potential framework for human-computer interaction research. In Context and Consciousness: Activity Theory and Human-Computer Interaction, 1996, pp. 17-44.
[5] Leont'ev, A. Activity, Consciousness, and Personality. Prentice Hall, New Jersey, 1978.
[6] Lund, A. Measuring usability with the USE questionnaire. STC Usability SIG Newsletter, 8:2.
[7] Oviatt, S. Human-centered design meets cognitive load theory: Designing interfaces that help people think. In Proceedings of the 14th Annual ACM International Conference on Multimedia, 2006, pp. 871-880.
[8] Mayer, R. E. (ed.). The Cambridge Handbook of Multimedia Learning. Cambridge University Press, 2005.
[9] Shi, Y., Ruiz, N., Taib, R., Choi, E., and Chen, F. Galvanic skin response (GSR) as an index of cognitive load. In CHI '07 Extended Abstracts on Human Factors in Computing Systems, ACM, New York, NY, USA, 2007, pp. 2651-2656.
[10] Yousoof, M., Sapiyan, M., and Kamaluddin, K. Reducing cognitive load in learning computer programming. World Academy of Science and Technology, Volume 12, 2006, ISSN 1307-6.