=Paper= {{Paper |id=Vol-1514/paper1 |storemode=property |title=Active World Model for Testing Autonomous Systems Using CEFSM |pdfUrl=https://ceur-ws.org/Vol-1514/paper1.pdf |volume=Vol-1514 |dblpUrl=https://dblp.org/rec/conf/models/AndrewsAG15 }} ==Active World Model for Testing Autonomous Systems Using CEFSM== https://ceur-ws.org/Vol-1514/paper1.pdf
        Active World Model for Testing Autonomous
                  Systems Using CEFSM
             Anneliese Andrews                       Mahmoud Abdelgawad                             Ahmed Gario
      Department of Computer Science            Department of Computer Science            Department of Computer Science
           University of Denver                      University of Denver                      University of Denver
         Denver, CO 80208 USA                      Denver, CO 80208 USA                      Denver, CO 80208 USA
        Email: andrews@cs.du.edu                  Email: abdelgaw@cs.du.edu                    Email: agario@du.edu



   Abstract—This paper describes a model-based test generation      To address these challenges, we propose a systematic MBT
approach for testing autonomous systems interacting with their      approach, World Model-based Test Generation (WMBTG),
environment (i.e., world). Unlike other approaches that assume      that identifies what, where and how to use worlds for testing
a static world with attributes and values, we present and test
a dynamic world. We use Communicating Extended Finite               autonomous systems inteacting with their surroundings. Test
State Machine (CEFSM) to illustrate an active world model           cases are generated by aggregating test paths in the individ-
that describes behaviors of environmental factors (i.e., actors).   ual models. These test paths are grouped as concurrent test
Abstract World Behavioral Test Cases (AWBTCs) are then gen-         paths which can be used with simulators or test-harnesses to
erated by covering the active world model using graph coverage      validate autonomous systems. WMBTG has been introduced,
criteria. We also generate test-data by input-space partitioning
to transform the generated AWBTCs into executable test cases.       in our previous work, and was applied to the unmanned
We apply the World Model-based Test Generation (WMBTG)              vehicle application domain [12]. We extend the applicability of
technique to a case study from the Human-Robot Interaction          WMBTG to the Human-Robot Interaction (HRI) domain. We
domain (HRI) specifically a tour-guide robot. Reachability of the   also evaluate the efficiency of test path coverage criteria used
active world model and efficiency of coverage criteria are also     to generate Abstract World Behavioral Test Cases (AWBTs).
discussed.
                                                                    We evaluate input-space partitioning coverage criteria [13]
                      I. I NTRODUCTION                              used to generate test-data. We use UML class diagrams to
   Autonomous systems are commonly defined as those sys-            construct the structural model of actors and their relationships.
tems that are able to accomplish entirely or in part certain        We also use CEFSM [7,14] to represent landscapes in worlds.
tasks/goals without human intervention [1,2]. Autonomous            We call these landscapes snippets. Snippets are used to link the
systems exist in various applications such as driverless vehicles   behavioral models of various actors that are involved in this
(so-called Unmanned Vehicles) [3], Search and Rescue robots         world. We also explore the applicability of the MBT technique
(SaR) [4], and Human-Interaction Robots (HRI) [5]. As such,         for testing autonomous systems behaviors in dynamic worlds
the robotic vacuum cleaner (Roomba) is a prime example              in [12] alongside behavioral testing that is flexible, system-
of autonomous systems [6]. Testing the interactions between         atic, scalable, and shows that this technique is extendable to
autonomous systems and world actors- pedestrians, mobile            other domains of autonomous systems and to other types of
objects, and unknown obstacles- poses a series of challenges,       behavioral models.
due to the complexity of these systems and the uncertainty          The remainder of this paper is organized as follows. Section
of their surroundings. In order to generate behavioral test         II discuses the state of research related to MBT, testing
cases in the form of simultaneous world stimuli, Model-based        autonomous systems, and World Model-based Testing. Sec-
Testing (MBT) is able to leverage behavioral models, such           tion III defines the human-robot interaction domain, explores
as CEFSM [7], Coloured Petri Nets (CPN) [8], Labelled               HRI classifications, and describes the case study. Section
Transition Systems (LTS) [9], and sequence and communi-             IV presents our approach and applies it to the case study.
cation diagrams in Unified Modeling Language (UML) [10],            We analyze and discuss reachability and efficiency issues in
to describe the behavioral scenarios that can occur between         section V. Section VI draws conclusions and future work.
the System Under Test (SUT) and its world. This requires                              II. S TATE OF R ESEARCH
testing solutions to deal with the large number of possibilities
                                                                    A. Model-based Testing (MBT)
of the behavioral scenarios. Current MBT approaches for
testing Real-time Embedded Systems (RTES) interacting with             According to Dias-Neto et al. [15], MBT uses various
their worlds assume a static world model [11]. However,             models to automatically generate tests. MBT includes three
for autonomous systems, the world cannot be described only          key elements: models that describe software behavior, crite-
statically with attributes and values, the world should also be     ria that guide the test-generation algorithms, and tools that
presented and used for testing dynamically.                         generate supporting infrastructure for the tests. Zander et al.
                                                                    [16] define MBT as an algorithm that generates test cases
automatically from models instead of creating them manually.        characteristics of CPNs, such as color-based and event-based
Utting et al. [16,17] define six dimensions of MBT approaches       coverage criteria. Color-based coverage criteria focus on the
(a taxonomy): model scope, characteristics, paradigm, test          consumption of tokens that belong to pre-defined color sets.
selection criteria, test generation technology and test execu-      Event-based coverage criteria focus on the CPN events, where
tion. They also classify MBT techniques as state, history,          an event is defined as a transition together with enabling
functional, operational, stochastic, and transition based. Shi-     variables. This work does not provide validation, nor does it
role et al. [10] present a survey on model-based test genera-       address dynamic worlds. The authors however found MBT is
tion from UML behavioral specification diagrams (sequence,          the most promising option for testing autonomous systems.
state chart and activity diagrams). They classify the various
                                                                    C. World Model-based Testing
research approaches based on formal specifications, graph-
based, heuristics, and direct UML specification processing. In         Most approaches in the literature for modeling the world
graph-based testing, a test case is a path that covers some         of autonomous systems define the world model as a software
specific system requirement. Test case generation from graphs       control component that represents the autonomous system’s
includes the following steps: build a graph model, identify test    view of its world. In our approach, the world is consid-
requirements, select test paths to cover those requirements,        ered as independent actors interacting with the SUT instead
and derive test data. Shafique and Labiche [18] present a           of being part of the system. These approaches are mostly
systematic review to determine the current state of the art         for the purpose of increasing the understandability of the
of MBT tool support. They scope their study to tools that           decision-making module to the relevant surroundings in order
use state-based models: FSM, Extended FSM, Abstract State           to implement proper, efficient, and safe behaviors, but they
Machine (ASM), State-chart, UML state machine, (timed,              are not aiming for model-based testing. Gheta et al. [21]
input/output)-automata, Harel Statechart, Petri Net, State Flow     contribute an intelligent information storage and management
diagram and Markov chain. They grouped MBT tools based              system approach for autonomous systems with the aim of
on test criteria similarity. Twelve MBT tools are selected for      modeling the world of an autonomous system. The world
primary studies. A comparison enables tool selection based on       model is represented as instances of classes with class specific
project needs. The literature shows that CEFSM is practicable       attributes and relations. Furda and Vlacic [22] also present
for representing dynamic behaviors including dependency,            an object-oriented world model approach for the road traffic
concurrency, and communication.                                     environment of autonomous vehicles. The approach uses UML
                                                                    class diagrams to represent the structure of the world actors.
B. Testing Autonomous Systems                                       The authors conducted an experiment, a field trial, using
   The techniques for testing autonomous systems are mostly         two autonomous vehicles. The experiment illustrates that the
computer-based simulation and test fields/arenas. Jacoff et al.     world model strongly supports the decision-making module
[19], by the National Institute of Standards and Technology         for making appropriate driving decisions in real-time. Never-
(NIST), introduce a standard for designing and evaluating test      theless, this work neither intends to be a testing technique for
arenas (Reference Test Arena for Autonomous Mobile Robots).         autonomous systems nor does it handle the dynamic aspect of
The test arenas consist of collapsed structures that are designed   world actors.
from buildings in various stages of the collapse. Pepper et al.     A closely related approach for world model-based testing is
[20] illustrate a computer-based simulation technique for eval-     presented in [23]. It generates black-box test cases automati-
uating USAR robots using USARSim, a robot simulation tool.          cally based on a static world model. The main characteristics
The computer-based simulation is flexible, repeatable, and          of the approach are: 1) modeling the structural and behavioral
accurate compared with physical test fields/arenas; however,        world properties, especially real-time properties. Invariants
it lacks physical fidelity. Both techniques, computer-based         and error states such as unsafe, undesirable, or illegal states
simulation and test fields/arenas, also limit possible behavioral   are also modeled. They use an extension of UML (MARTE)
scenarios that may occur in autonomous system worlds. Lill          that models and analyzes real-time embedded systems. They
et al. [8] use a MBT technique for testing autonomous sys-          also use Object-Constraints Language (OCL)for specifying
tems. The authors, first, compare different modeling notations      environmental constraints. 2) Test oracles are then generated
(Process Algebras like Calculus of Communicating Systems            automatically from the world model. They use a simulator to
(CCS) and Communicating Sequential Processes (CSP), UML             observe actual response. 3) To identify feasible test cases and
activity diagrams, Petri Nets (PNs), and Coloured Petri Nets        maximize possibilities of fault detection, heuristic algorithms
(CPNs)) that are used to model concurrent behavior of co-           such as Random Testing (RT), Adaptive Random Testing
operating autonomous systems. The comparison is based on            (ART), and Search-based Testing (SBT) (specifically, Genetic
four evaluating criteria (understandability, well-definedness,      algorithm and (1+1) Evolutionary algorithm) are used as
scalability, and testability). They then select CPNs to model       test generation strategies. An empirical study is conducted
a factory robot that carries a load from one place to another.      to identify which test case generation approach obtains the
Four behavioral transitions (look ahead, raise alarm, go ahead,     best results. The experiment shows that ART is the best
mission completed) were modeled. Obstacle passing is not            among these algorithms. Recently, this approach has been
considered. They also define coverage criteria tailored to the      enhanced and applied to automotive sensor system [24]. This
approach generates use cases automatically by using Natural           [31]. It has a head with two eyes, two arms, a body, and
Language Processing (NLP). The approach (so called Use                wheels for mobility. Its mobile platform includes two driving
Case Modeling for System Tests Generation (UMTG)), is                 wheels and one free wheel. Robovie is equipped with 10 tactile
applied to BodySense system. BodySense system monitors                sensors, an omnidirectional vision sensor, two microphones to
a car seat to classify the occupants. It disables the airbag          listen, and 24 ultrasonic sensors for detecting obstacles. The
for children and unoccupied seats while it enables airbag for         eyes have a pan-tilt mechanism with direct-drive motors, and
adults. It also includes a seat belt reminder function. The           they are used for stereo vision and gaze control. It also has
result indicates that test requirements generated by UMTG             skin sensors for realizing interactive behaviors. Robovie com-
are entirely feasible. However, this approach limits the world        municates with its sensory environment via wireless LAN. En-
model to a static world. It is also specified for testing real-time   vironmental sensors that Robovie communicates with usually
embedded systems (RTES). It is not applicable for autonomous          are Laser Range Finders (LRFs). Many cognitive experiments
systems because their worlds are dynamic.                             were conducted on Robovie to increase its behavioral skills.
                                                                      For instance, Robovie can predict human behaviors. This skill
               III. A PPLICATION DESCRIPTION
                                                                      was added recently to Robovie for the purpose of escaping
   In this paper, we consider tour guide-robot applications. Ac-      from children’s abuse [32]. Robovie can perform meaningful
cording to [5,25,26], a tour guide-robot is classified as a highly    interactive-behaviors for a human. For example, it can gaze,
autonomous, serviceable, anthropomorphic, navigational, and           gesture, greet, converse, listen, assist, follow, accompany and
social HRI robot. Tour guide-robots usually perform in indoor         guide people. Robovie also can perform in various sensory en-
sensory environments where sensors are placed ubiquitously            vironments. Kanda et al. [33] present a field trial conducted on
for perceiving static/mobile objects. Tour guide-robots use the       Robovie at a shopping mall for five weeks. Each day, approxi-
sensory environment to understand their surroundings and to           mately 100 groups of customers signed up to interact with the
localize themselves. Museum, campus, shopping arcade, train           robot. The experimenters observe Robovie’s interaction with
station, and hotel-lobby are examples of indoor sensory envi-         customers. They also collect feedback using a questionnaire.
ronments. MacDougall et al. [27] present a sensory environ-           The findings show that the customers accepted the robot with
ment. They built the sensory environment in the Electrical and        positive impressions. In [34] and [35], the authors present the
Computer Engineering Department of Kettering University for           results of two field trials on Robovie at a train station. Its task
the purpose of college tours. They then conducted a field trial       includes greeting, providing directions, and advertising. The
on a tour guide-robot that gives tours to visitors. Although,         experimenters also consider this robot being capable to elicit
experimenters use visitors’ opinions (questionnaire), which is        spontaneous participation from pedestrians. This experiment
not enough for validation, these experiments illustrate that the      investigates the robot’s technical performance and its attitudes.
tour guide-robots are capable to be proximate, conversational,        The authors illustrate fine results in both considerations except
serviceable, and sociable. Burgard et al. [28] conducted a            speech recognition. Shiomi et al. [36] also present a field
field trial on a tour guide-robot, RHINO, in the Deutsche             trial conducted on Robovie at Osaka Science Museum for two
Museum Bonn. For six days, RHINO gave tours to more                   months. The robot is assigned to welcome, guide, and provide
than 2000 visitors. Thrun et al. [29] experimented with a tour        scientific information about the exhibits. In this field trial,
guide-robot, MINERVA, in the Smithsonian National Museum              the target visitors are children, therefore Robovie expresses
of American History. MINERVA successfully educated and                childlike behaviors such as handshaking, hugging, and free-
entertained many thousand visitors. Socially, MINERVA is              playing. The findings indicate that performing childlike in-
compared with RHINO. A key difference between both robots             teractions effectively attracts visitors’ attention for scientific
relates to their interactive capabilities. RHINO acts more            explanations. These field trials were conducted with a few
rudimentarily and does not exhibit emotional states, while            interventions (so-called semi-autonomous trial) by operators
MINERVA behaves more effectively in attracting people and             (humans). Operators use ubiquitous cameras to watch partici-
making progress. The Microsoft Research Team [30] also                pants and start the robot for greeting when a participant talks
conducted a field trial on a tour guide-robot (a humanoid robot       to it. They also intervene when the robot encounters a critical
(NAO), from Aldebaran Robotics, France). The experiment               situation. In compliance with these field trials, one can image
focuses on conversational engagement, handling queries, and           that Robovie generally deals with three types of environmental
providing directions to visitors. The experimental results show       actors (people, obstacles, and environmental sensors). Also, to
successful conversational engagements with individuals but            present an active world model, we assume that this robot is
not multiparty conversational situations. The HRI literature          fully autonomous.
with the most relevance to the guide-robot applications comes            2) Crowds in a train station: As mentioned, Robovie can
from a series of studies on Robovie, a humanoid robot invented        perform in various sensory environments where crowds can be
by Advanced Telecommunications Research (ATR) Institute,              formed. A crowd is defined as a large group of individuals
Japan. Therefore, we select Robovie as a System Under Test            in the same physical environment, sharing a common goal
(SUT) in our case study.                                              [37]. Each crowd has a set of behaviors resulting from world
   1) Tour-guide Robot (Robovie): Robovie is an interactive           actors. For instance, in a museum, an attendee can attend, ask,
humanoid robot performing human-like physical expressions             film, and leave an exhibit. In a train station, crowds can be
                                                                                            IV. A PPROACH
                                                                      Our objective is to apply a systematic model-based test
                                                                    generation approach [12] to generate test cases from an active
                                                                    world model that represents world actors of an autonomous
                                                                    system. There are a multitude of world actors. They can
                                                                    be mobile or static. World actors also act independently,




        Fig. 1: Shopping Arcade in a Train Station, [35]

formed over many snippets (entry gate, train doors, shopping
arcade, restaurant, exit gate). Each snippet has a somewhat
different set of behaviors from other snippets. At a restaurant,
for example, people can walk, talk, order, sit, eat, etc. At the
entry gate, people run, sometimes push and shove, follow, pass,
etc. In both locations, people engage in common behaviors
such as talking, listening, gesturing, gazing, etc. In addition,
people naturally do these behaviors simultaneously. Therefore,
crowd behaviors are infinite and simultaneous. [37] explores                Fig. 3: World Behavioral Test Generation Process
the crowd behaviors in detail.
In this paper, the shopping arcade in the train station presented   simultaneously, and unpredictably. To avoid scalability and
in [35] is considered as a world snippet of the case study. As      complexity issues of the dynamic worlds, we concentrate on
                                                                    actors that autonomous systems are dealing with and the
shown in Fig. 1, the shopping arcade in a train station consists
of participants, obstacles, and four LRFs mounted around            behavioral messages that autonomous systems can perceive
the trial area. In [35], the authors classify the participants      from these actors. The locations where actors interact are also
                                                                    considered. In other words, a set of actions that a group of
as addressees, side-participants, bystanders, and pedestrians.
Addressee is a person in front of the robot listening to and        actors can perform may occur over a particular snippet. For
                                                                    instance, when travelers stop by a shopping arcade in a train
following. Side-participants are participants accompanying an
addressee. Addressee and Side-participants are responsible for      station, behaviors can be walking or standing.
responding to the robot. Bystanders are participants encour-        We build the world behavioral model in two steps. First,
                                                                    we construct a structural model of actors to represent their
aged by the robot, addressee, or side-participants but they are
not responsible for interacting with the robot. Pedestrians are     attributes, functions and relations. Second, we construct the
people who are not arranged in any of these classes. Obstacles      behavioral model to describe actors’ possible states and tran-
                                                                    sitions and their interactions. Each actor is presented by
are classified as Mobile obstacles (e.g., drivable cleaning-cart)
and Static obstacles like ”caution: wet floor” sign. The four       one behavioral model showing its behavioral messages. The
LRFs are assigned to sense the obstacles and the participants’      interactions between these actors represent the active world
movements in the trial area and provide the sensory data to         model. These interactions need to be modeled by a com-
the robot. Fig. 2 visualizes a set of shopping arcade actors.       municating behavioral semantic model such as CEFSM. As
                                                                    such in our application, actors are interacting simultaneously,
                                                                    the active world model should cover not only the internal
                                                                    transitions of actors, but also the interactions between them.
                                                                    The active world model can then be leveraged to generate
                                                                    world behavioral test cases. Once we build the active world
                                                                    model, any member of the graph-based testing criteria from
                                                                    [8,13] can be used to generate abstract behavioral test paths,
                                                                    which are AWBTCs. Finally, we generate test-data by input-
             Fig. 2: A set of shopping arcade actors
                                                                    space partitioning to transform the generated AWBTCs into
The areas also are classified into two types, Area of Audience      executable test cases. The test generation process is illustrated
(AoA) and Area of Passing (AoP) [35]. AoAs are locations            in Fig. 3. The world Model-based test generation process
where pedestrians tend to become members of the audience            follows three phases:
like a restaurant. AoPs are locations where pedestrians tend to        • Model the active world by constructing structural and
keep passing, for instance entry and exit gates, see [35]. The            then behavioral models.
shopping arcade is considered as AoA.
           TABLE I: Instances of Shopping Arcade Actors                      world actors is shown in Fig. 4. The actors are aggregated
                                                                             into a Crowd snippet. Similar actors are generalized to a
   Class           Actor Instance          Behavioral message example
                                                                             single class. For instance, Static and Mobile obstacle are
   Participant     An elderly person,      child.attend()                    generalized into the Obstacle class. The number of actors in
                   an adult, a teen, and   child.behave(talk)
                   a child                 elderly.behave(gaze)              the Crowd is determined by their multiplicity relationship.
   Mobile          Cleaning cart and       cleaningCart.appear()             Instances of shopping arcade actors and examples of their
   obstacle        maintenance cart        cleaningCart.act(moveBW)          behavioral messages are illustrated in Table I. We assume
                                           cleaningCart.act(flashingLight)
                                                                             that only one robot performs in a snippet; this robot is not
   Static          Caution signs, flash    caution.appear()                  considered a world actor as it is the SUT.
   obstacle        lights, siren, and      caution.inform(”Wet Floor”)
                   fire alarms             alram.inform(”siren”)                2) Behavioral Model: Although a wide range of behavioral
   LRF             LRF1,      LRF2,        LRF1.on()                         models exists, we illustrate the behavioral model using com-
                   LRF3, and LRF4          LRF1.detect(participant)          municating extend finite state machine (CEFSM). The strength
                                                                             of CEFSM is that it can model orthogonal states of a system
                                                                             in a flat manner and does not need to compose the whole
  • Select proper graph-based coverage criteria for test-path                system in one state as in state charts, which would make it
    generation and proper input-space partitioning coverage                  more complicated and harder to analyze and/or test [7,38].
    criteria for test-data.                                                  CEFSM = (S, s0 , E, P, T, A, M, V, C), such that: S is a finite
  • Generate AWTCs which are concurrent test paths and                       set of states, s0 is the initial state, E is a set of events, P is
    then generate test-data to transform these concurrent test               a set of boolean predicates, T is a set of transition functions
    paths into executable test cases.                                        such that T: S×P ×E→S×A×M , A is a set of actions, M
                                                                             is a set of communicating messages, V is a set of variables,
A. Phase 1: World Models                                                     and C is the set of input/output communication channels used
   1) Structural Model: The structural model is constructed                  in the CEFSM. State changes (action language): The function
using a UML class diagram, where classes represent actors                    T returns a next state, a set of output signals, and an action
including their important characteristics, behavioral messages,              list for each combination of a current state, an input signal,
and relationships. In our application, the shopping arcade in a              and a predicate. It is defined as: T(si , pi , get(mi ))/(sj , A,
train station can be represented by a single snippet ”Crowd”.                send(mj1 ,..., mjk )) where, si is the current state, sj is the
World actors that are considered to perform in this snippet are              next state, pi is the predicate that must be true in order to
of three types: participants (humans), obstacles, and LRFs.                  execute the transition, ei is the event that when combined
A participant can be addressee, side-participant, bystander,                 with a predicate trigger the transition function, mi1 ,..., mik
or pedestrian [35]. Obstacles also are classified as mobile or               are the messages. CEFSM is a generalization of an EFSM
static. Pedestrians and mobile obstacles perform independently               [39] (i.e., adding communication channels between EFSMs).
                                                                             Modeling behavioral models follows two steps. First, we




              Fig. 4: Structural Model for Shopping Arcade

and concurrently several behaviors (communicated via mes-                         Fig. 5: Behavioral Models for Shopping Arcade Actors
sages); however, static obstacles inform messages only. LRFs
detect objects that appear in the Crowd and describe these                   model each individual actors as EFSMs. Then we model the
objects to the robot. However, these LRFs do not interact with               interaction between these actors as CEFSM. Fig. 5 shows
other world actors except the robot. The UML class diagram                   a set of CFSM that represents a group of shopping arcade
that represents the shopping arcade in a train station and its               actors interacting with each other. It is clear that a participant
actor can express multiple behavioral messages simultane-              processes. We use transition-based coverage [7] to generate in-
ously. For instance, a participant can walk and talk at the            ternal paths that cover the processes of shopping arcade actors.
same time. Similarly, mobile obstacles also can concurrently           In the shopping arcade, as illustrated in Table II, each of these
reveal several behavioral messages. For instance, a cleaning-
cart moves forward/backward and flashes its lighting-alarm at                          TABLE II: Internal Test Path Sets
the same time. However, in some cases, exhibiting different
behavioral messages concurrently is infeasible. For example,            1. Participant process, T P1 = {tp11 }
                                                                                                p.attend()                 p.behave(behavior)
a pedestrian cannot sit and walk at the same time. Therefore,           tp11 : absent −−−−−−→                 present      −−−−−−−−−−−−→
                                                                                 p.miss()
proper input-space partitioning criteria can be used to exclude         present −−−−−→ absent
the infeasible combinations of behavioral messages. However,            2. Mobile obstacle process, T P2 = {tp21 }
static obstacles inform by messages only. For example, a                                        mo.appear()
                                                                        tp21 : hidden −−−−−−−−→ visible                       −
                                                                                                                               mo.act(action)
                                                                                                                              −−−−−−−−−
                                                                                                                                      →
cautionary sign shows a wet floor message. As shown in Fig.                      mo.disappear()
                                                                        visible −−−−−−−−−−→ hidden
5, a participant initially is out of the crowd (absent). When
this participant attends the crowd, the attend() transition fires       3. Static obstacle process, T P3 = {tp31 }
                                                                                                so.appear()               so.inf orm(message)
and the participant moves to present place. This participant            tp31 : unseen −−−−−−−→                  seen      −−−−−−−−−−−−−→
                                                                              so.disappear()
then starts a behavior. Whenever this participant behaves, the          seen −−−−−−−−−→ unseen
behave() transition fires, the behavioral message reveals, and          4. LRF process, T P4 = {tp41 }
the participant moves to present state again. This process                           lrf.on()        lrf.detect(object)      lrf.of f ()
                                                                        tp41 : of f −−−−−→ On −−−−−−−−−−−→ On −−−−−−→ Of f
can occur at will. It is similar to other actors’ processes.
The key difference is that the behavioral/information messages
these processes reveal are dissimilar. Secondly, these behav-          processes T Pi is coincidentally covered by one internal test
ioral models (EFSMs) that represent world actors are linked            path tpi1 only because the size of these processes is small. The
together into a higher level behavioral model which describes          test path sets, (T P1 , T P2 , T P3 , T P4 ), interact concurrently
the interactions among these actors. Fig. 6 illustrates the high       with each other via the exchange of behavioral/information
level behavioral model. This model shows that participants             messages (i.e., interaction messages). These interaction mes-
can interact with mobile obstacles and vice-versa. However,            sages represent the high level of execution behavior of the
                                                                       active world model, as shown in the behavioral model (High-
                                                                       Level), in Fig. 6. The interaction among the processes can
                                                                       be covered by interaction test paths which represent the
                                                                       possibilities of execution behavior of the interaction messages.
                                                                       To avoid cyclic paths, the Simple-path coverage criterion [13],
                                                                       is used to generate the interaction test paths that cover the
                                                                       shopping arcade interaction messages. Table III shows six
                                                                       simple paths, Interaction Test Paths (IT P1 , IT P2 , . . . , IT P6 ),

                                                                                       TABLE III: Interaction Test Paths

                                                                          IT P1 : (participant −
                                                                                               → mobile obstacle −
                                                                                                                 → participant)
              Fig. 6: Behavioral Model (High-Level)
                                                                          IT P2 : (participant −
                                                                                               → LRF )
static obstacles can only show messages to other actors (i.e.,            IT P3 : (mobile obstacle −
                                                                                                   → LRF )
unidirectional interaction). A LRF also is a unidirectional actor
due to the fact that they detect objects only.                            IT P4 : (static obstacle −
                                                                                                   → participant)

B. Phase 2: Coverage Criteria                                             IT P5 : (static obstacle −
                                                                                                   → mobile obstacle)

   Since each actor is represented as a EFSM (a process), an              IT P6 : (static obstacle −
                                                                                                   → LRF )
active world behavioral model can be defined as a collection
of concurrent processes, AW M = {P1 , P2 , . . . , Pi } where          that cover the high level of the active world model for shopping
1 ≤ i ≤ M and M is the number of actors that share                     arcade actors. Each interaction test path combines the internal
a snippet. Model-flow coverage criteria such as node (state-           test paths of processes T Pi that are involved in the interaction
based) coverage, edge (transition-based) coverage, etc. [8,13],        scenario. For instance, IT P1 , shown in Table III, covers the
can be applied. Using any of a number of test path generation          interaction between the participant process and the mobile
techniques, test paths that fulfill these coverage criteria can        obstacle process. Thus, the internal paths of this interaction is
then be generated. Let T Pi = (tpi1 , tpi2 , ..., tpik ) be a set of   (tp11 → tp21 → tp11 ). However, the behaviors vary due to the
such internal test paths that cover the process Pi and k is            non-deterministic interactions between the internal test paths.
the number of these internal test paths. These internal paths          For instance, in the shopping arcade, when a participant moves
describe the internal execution (possible behaviors) of the            ahead of a cleaning-cart, the cleaning-cart may stop to give the
participant free way, or it may keep moving and alerting the                        3) Input-space Partitioning Coverage Criteria: The gen-
participant by a beep. The interaction test paths are considered                 erated concurrent test paths are still abstract. To transform
concurrent paths. The concurrent interaction between internal                    these concurrent test paths into executable test cases, test-
test paths that represent multiple processes produces a number                   data coverage criteria, i.e. input-space partitioning [13], also
of possible combinations of internal paths. As a result, we                      are required. The input-space partitioning criteria can be
have two types of coverage criteria, path combination and                        considered as methods to divide a collection of values (input-
concurrent test path coverage criteria.                                          domain) into test-data blocks that make the concurrent test
   1) Path Combination: In order to cover all possible                           paths executable. The input-domain is the set of possible
combinations of internal paths, path combination coverage                        values that input variables can take on. In the shopping arcade
criteria should determine what combinations are required.                        snippet, the behavior execution of actors is controlled by
Let (T P1 , T P2 , . . ., T Pn ) be sets of internal test paths for              five input-domains: participant type, participant behaviors,
(P1 , P2 , . . . , Pn ) where T P1 = {tp11 , tp12 , . . ., tp1k }, T P2 =        mobile obstacle actions, static obstacles and LRF detected
{tp21 , tp22 , . . ., tp2k }, . . . , and T Pj = {tpj1 , tpj2 , . . ., tpjk }.   objects. There is one block for each. The Participant type
Then, the selection of a tp1i from T P1 and a tp2j from                          block includes {addressee, side participant, bystander, and
T P2 is called a path combination. Let len(tpij ) be the                         pedestrian}. The Participant behavior block consists of values
number of nodes in tpij , the length of tpij . The com-                          {sit, walk, listen, talk, gaze, gesture, eat, drink}. The mo-
bination set for interaction test path IT Pi , CombIT Pi =                       bile obstacle actions block consists of {move forward, move
{(tpjk , . . ., tpmn )|tpmn ∈ T Pm , m = len(IT Pi ), n =                        backward, turn right, turn lift, beep, flash lights} while the
|T Pi |, 1 ≤ j ≤ m, 1 ≤ k ≤ n}. The number of all path                           static obstacles block includes the messages {“Wet Floor”,
combinations of IT P1 , for instance, will be the product of                     “Do Not Enter”}. The LRF detected objects block contains
the number of internal paths of T P1 , T P2 , and T P1 . Each                    {participant, mobile obstacle, and static obstacle}. In this
combination introduces a set of concurrent test paths.                           work, we use All Combinations Coverage (ACoC) which
   2) Concurrent Test Path Coverage Criteria: The path com-                      exercises all possible combinations of test-data. The number of
bination sets do not show how these paths interact concur-                                                             Q
                                                                                                                       Q
                                                                                 test-data sets that satisfy ACoC is       (Bi ), where Bi is a block
rently. Therefore, concurrent test path coverage criteria are                                                          i=1

required. In this work, we apply the all possible serialized                     of values for a parameter and Q is the number of parameters.
execution sequences coverage criterion. We also use the Ren-                     To compare with ACoC, we also use Each Choice Coverage
dezvous coverage criterion, as in [14]. These concurrent test                    (ECC) that selects one value from each block of values. The
                                                                                                                              Q
path coverage criteria are defined as follows:                                   number of test-data that satisfy ECC is M AXi=1  (Bi ) [13].

   • All Possible Serialized Execution Sequences Coverage                        C. Phase 3: Test Generation
      Criterion (APSESCC): Test requirements contain a set                          The path combinations are represented as ordered references
      of all possible serialized nodes of the test paths that are                to internal test paths of the processes involved in the execution.
      included in each path combination, i.e. each node in                       These combinations may result in a huge number of concurrent
      test path tpij can be triggered by each node in test path                  test paths, AWBTCs, although not all of these concurrent test
      tpmn and vice versa. For example, let tpij be a → b and                    paths are feasible. We used the serialization algorithm in [14]
      tpmn be x → y , where tpij and tpmn are in the same                        to generate these concurrent test paths. The concurrent test
      path combination. All serialized execution sequences of                    paths are serialized nodes of the internal paths. We expressed
      path combination cxy = (tpij , tpmn ) will be:                             the concurrency of test paths using double-bar “||” as used in
       ((a → b → x → y), (a → x → b → y), (a → x → y →                           LOTOS for defining concurrency [40]. In the shopping arcade
       b), (x → y → a → b), (x → a → y → b), (x → a → b →                        snippet, each process involved in the interaction is satisfied by
       y)).                                                                      one internal test path only. As a result, each interaction test
                                                                                 path is composed of one combination which represents the
       If a path combination includes two paths and each                         concurrent test path, an AWBTC. Table IV shows the path
       one contains three nodes, the all serialized execution                    combinations and the AWBTCs that satisfy the interaction
       sequences will be 20 possible serializations. All possible                test paths presented in Table III. Six path combinations are
       number of serializations of nodes is                                      created for covering interaction test paths of the shopping
                        len(cij )
                           P
                                                                                 arcade; consequently, six AWBTCs are generated. This num-
       |CombIT Pi | (
          X               j=1
                                  |tpij |)!                                      ber of AWBTCs is reasonable for this small number of actor
                   ( len(cij )                ).                                 processes. Nevertheless, when we impose the APSESCC and
                          Q
           i=1                   (|tpij |)!                                      RCC to serialize these AWBTCs, the total number of test paths
                         j=1

   •   Rendezvous Coverage Criterion (RCC): The test require-                    serialized by the APSESCC is 35000 serialized paths while
       ments contain a set of all paths that have rendezvous                     the RCC produces 244 rendezvous paths. To transform these
       nodes. Then the possible number of rendezvous-paths                       AWBTCs into executable test cases, we also apply ACoC and
                                                 Q
                                                 n                               ECC coverage criteria to generate test-data that meet these
       RZV of the interaction test path IT Pi is   (T Pj +1)−1.
                                                                                 AWBTCs. The five blocks of values described in section IV-B3
                                                         j=1
                                                                                 meet these criteria. For each interaction test path IT Pi , there
       TABLE IV: Combinations of Concurrent Test Paths                                352 states connected by 1046 arcs. However, the number of
                                                                                      reachable states grows exponentially as the number or the size
   1. Combination CombIT P 1 (tp11 , tp21 , tp11 ) = AW BT C1 :
                      p.attend()                                 p.behave(behavior)   of processes increase. For instance, for 8 processes, the number
   (tp11 [absent      −−−−−−−→          present                  −−−−−−−−−−−−−→
                p.miss()                  ||                           mo.appear()    of reachable states expands to 1346 states with 3658 arcs.
                           →tp21 [hidden −−−−−−−−−→
   present −−−−−−→ absent] −
                   mo.act(action)                                    mo.disappear()
                                                                                      Although CADP is scalable up to 1013 nodes, a display in
   visible        −
                  −−−−−−−−−−→      visible                          −−−−−−−−−−−→      CADP is no longer easily readable.
                         ||                                              p.attend()
   hidden]               →tp11 [absent
                         −                                              −−−−−−−→
              p.behave(behavior)                          p.miss()
   present −−−−−−−−−−−−−→ present −−−−−−→ absent])                                    B. Criteria Efficiency
   2. Combination CombIT P 2 (tp11 , tp41 ) = AW BT C2 :                                It is clear that the number of generated concurrent test paths
                      p.attend()                                 p.behave(behavior)
   (tp11 [absent      −−−−−−−→          present                  −−−−−−−−−−−−−→       depends on several factors: the number of actor processes that
                  p.miss()                            ||                   lrf.on()   are involved in the execution and the size of these processes,
   present       −−−−−−→        absent]              →tp41 [of f
                                                     −                     −
                                                                           −−−−−
                                                                               →
        lrf.detect(object)          lrf.of f ()
   On −−−−−−−−−−−−→ On −−−−−−→ Of f ])                                                the combination criteria selected to combine the internal
                                                                                      paths of these actor processes, the coverage criteria chosen
   3. Combination CombIT P 3 (tp21 , tp41 ) = AW BT C3 :
                        mo.appear()                                  mo.act(action)   to serialize these internal paths, and the coverage criteria
   (tp21 [hidden       −−−−−−−−−→               visible              −
                                                                     −−−−−−−−−−
                                                                              →
               mo.disappear()                              ||              lrf.on()
   visible     −−−−−−−−−−−→         hidden]               →tp41 [of f
                                                          −                −
                                                                           −−−−−
                                                                               →                      TABLE V: Test Criteria Efficiency
        lrf.detect(object)          lrf.of f ()
   On −−−−−−−−−−−−→ On −−−−−−→ Of f ])
                                                                                        Interaction   Test-paths C.    Test-data C.   APSESCC RCC
   4. Combination CombIT P 4 (tp31 , tp11 ) = AW BT C4 :                                Test                                          with    with
                       so.appear()                              so.inf orm(message)     Path          APSESCC RCC      ACoC    ECC    ACoC    EEC
   (tp31 [unseen      −−−−−−−−→           seen                  −−−−−−−−−−−−−−−→
             so.disappear()                     ||                       p.attend()     IT P1         34650      124   192     8      6652800 992
                            →tp11 [absent −−−−−−−→
   seen −−−−−−−−−−→ unseen] −
              p.behave(behavior)                                                        IT P2         70         24    96      8      6720      192
   present −−−−−−−−−−−−−→ present])
                                                                                        IT P3         70         24    18      6      1260      144
   5. Combination CombIT P 5 (tp31 , tp21 ) = AW BT C5 :                                IT P4         70         24    96      8      6816      192
                       so.appear()                              so.inf orm(message)
   (tp31 [unseen      −−−−−−−−→           seen                  −−−−−−−−−−−−−−−→        IT P5         70         24    18      6      1260      144
          so.disappear()                   ||                          mo.appear()
                            →tp21 [hidden −−−−−−−−−→
   seen −−−−−−−−−−→ unseen] −                                                           IT P6         70         24    9       3      630       72
             mo.act(action)               mo.disappear()
   visible −        → visible −−−−−−−−−−−→ hidden])
           −−−−−−−−−−                                                                   Total         35000      244   429     39     6669486   1736
   6. Combination CombIT P 6 (tp31 , tp41 ) = AW BT C6 :
                       so.appear()                              so.inf orm(message)
   (tp31 [unseen      −−−−−−−−→           seen                  −−−−−−−−−−−−−−−→      chosen to generate test-data. We use APSESCC and RCC
             so.disappear()                          ||                  LRF.ON()
   seen      −−−−−−−−−−→        unseen]           →tp41 [of f
                                                  −                     −
                                                                        −−−−−−−
                                                                              →       coverage criteria on six interaction test paths to serialize the
        LRF.detect(object)
   On −−−−−−−−−−−−−→ On])                                                             generated AWBTCs. We then apply ACoC and ECC coverage
                                                                                      criteria to generate test-data in oder to transform the serialized
                                                                                      test paths into executable test cases. Table V illustrates the
is a set of test-data that fulfills at least one concurrent test path                 efficiency of these coverage criteria. As mentioned above,
that belongs to this IT Pi . This set of test-data is selected from                   although exercising AWBTCs generated by APSESCC on test-
blocks that only are related to the actor processes involved                          data sets selected by ACoC is not practicable, it is considered
in the IT Pi . For instance, IT P1 represents the interactions                        as an upper bound. On the other hand, exploiting RCC on
between participant and mobile obstacle; as a result, three                           ECC is more feasible and efficient.
blocks (participant types, participant behaviors, and mobile
                                                                                                 VI. C ONCLUSION AND F UTURE WORK
obstacle actions) are used to generate test-data for IT P1 . The
ACoC results in 429 test-data while the ECC produces 39.                                 This paper presented the applicability of a model-based test
When these 429 test-data are used with the 35000 serialized                           generation approach [12] that allows testing of autonomous
paths, this results in 6669486 executable test cases. On the                          systems in their active world. We modeled an active world
other hand, the number of executable test cases generated                             of an autonomous system. A test generation process is ap-
by ECC with RCC is 1736. Using APSESCC with ACoC is                                   plied. Path serialization techniques APSESCC and RCC are
clearly impractical; however, it presents the upper bound of                          imposed. The APSESCC is also compared withx RCC. The
test cases. We also consider using ECC with RCC as a lower                            findings indicate that RCC is practically feasible. To transform
bound.                                                                                the generated AWBTCs into executable test cases, we also
                                                                                      exploited ACoC and ECC coverage criteria to generate test-
        V. R EACHABILITY & C RITERIA E FFICIENCY                                      data. The findings also show that the number of executable
A. Reachability                                                                       test cases depends on the size of generated test-data and
  To perform reachability analysis on the behavioral models,                          the size/number of actor processes that are involved in the
we use the Construction and Analysis of Distributed Processes                         execution. The CADP toolbox is used for reachability analysis.
(CADP) toolbox [41]. For generating all possible states that                          Future work will explore other testing techniques such as
a system can reach, CADP transforms the LOTOS code                                    search-based testing techniques to handle path-selection and
that represents a CEFSM into a Labelled Transition System                             test-data generation of the concurrent processes. Future work
(LTS) graph. The reachability graph generated by CADP, for                            will also investigate the effectiveness of this approach by
four concurrent processes with three nodes each, consists of                          executing the generated test cases.
              VII. ACKNOWLEDGMENTS                                                [20] C. Pepper, S. Balakirsky, and C. Scrapper, “Robot simulation physics
                                                                                       validation,” in Proceedings of the 2007 Workshop on Performance
This work was supported, in part, by NSF IUCRC grant                                   Metrics for Intelligent Systems, ser. PerMIS ’07. New York, NY, USA:
#0934413, 1127947, 1332078, and 1439693 to the University                              ACM, 2007, pp. 97–104.
of Denver.                                                                        [21] I. Ghete, M. Heizmann, A. Belkin, and J. Beyerer, “World modeling for
                                                                                       autonomous systems,” in KI 2010: Advances in Artificial Intelligence,
                              R EFERENCES                                              ser. Lecture Notes in Computer Science, R. Dillmann, J. Beyerer,
                                                                                       U. Hanebeck, and T. Schultz, Eds. Springer Berlin Heidelberg, 2010,
 [1] L. Steels, “When are robots intelligent autonomous agents?” Robotics              vol. 6359, pp. 176–183.
     and Autonomous Systems, vol. 15, no. 12, pp. 3–9, 1995, the Biology          [22] A. Furda and L. Vlacic, “An object-oriented design of a world model
     and Technology of Intelligent Autonomous Agents.                                  for autonomous city vehicles,” in Intelligent Vehicles Symposium (IV),
 [2] S. Franklin and A. Graesser, “Is it an agent, or just a program?:                 IEEE, June 2010, pp. 1054–1059.
     A taxonomy for autonomous agents,” in Intelligent Agents III Agent           [23] M. Iqbal, A. Arcuri, and L. Briand, “Environment modeling with
     Theories, Architectures, and Languages, 1997, pp. 21–35.                          UML/MARTE to support black-box system testing for real-time embed-
 [3] H. Cheng, Autonomous Intelligent Vehicles: Theory, Algorithms, and                ded systems: Methodology and industrial case studies,” in Model Driven
     Implementation, 1st ed. Springer London Dordrecht Heidelberg, New                 Engineering Languages and Systems, ser. Lecture Notes in Computer
     York: Springer-Verlag, 2011.                                                      Science. Springer Berlin Heidelberg, 2010, vol. 6394, pp. 286–300.
 [4] Y. Liu and G. Nejat, “Robotic urban search and rescue: A survey from         [24] C. Wang, F. Pastore, A. Goknil, L. Briand, and Z. Iqbal, “Automatic
     the control perspective,” Journal of Intelligent and Robotic Systems,             generation of system test cases from use case specifications,” in
     vol. 72, no. 2, pp. 147–165, 2013.                                                Proceedings of the 2015 International Symposium on Software Testing
 [5] M. A. Goodrich and A. C. Schultz, “Human-robot interaction: A survey,”            and Analysis, ser. ISSTA 2015. New York, NY, USA: ACM, 2015,
     Found. Trends Hum.-Comput. Interact., vol. 1, no. 3, pp. 203–275, Jan.            pp. 385–396. [Online]. Available: http://doi.acm.org/10.1145/2771783.
     2007.                                                                             2771812
 [6] Y.-W. Bai and M.-F. Hsueh, “Using an adaptive iterative learning             [25] H. Yanco and J. Drury, “Classifying human-robot interaction: an updated
     algorithm for planning of the path of an autonomous robotic vacuum                taxonomy,” in Proceedings of the IEEE International Conference on
     cleaner,” in Proceedings of the 1st IEEE Global Conference on Con-                Systems, Man and Cybernetics, vol. 3, Oct 2004, pp. 2841–2846 vol.3.
     sumer Electronics (GCCE), 2012, pp. 401–405.                                 [26] A. Steinfeld, T. Fong, D. Kaber, M. Lewis, J. Scholtz, A. Schultz,
 [7] J. Li and W. Wong, “Automatic test generation from communicating                  and M. Goodrich, “Common metrics for human-robot interaction,” in
     extended finite state machine (CEFSM)-based models,” in Proceedings               Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-
     of 5th IEEE International Symposium on Object-Oriented Real-Time                  robot Interaction, ser. HRI ’06. New York, NY, USA: ACM, 2006, pp.
     Distributed Computing. (ISORC 2002), 2002, pp. 181–185.                           33–40.
 [8] R. Lill and F. Saglietti, “Model-based testing of autonomous systems         [27] J. MacDougall and G. Tewolde, “Tour guide robot using wireless based
     based on coloured petri nets,” in ARCS Workshops (ARCS), 2012, pp.                localization,” in Proceedings of the IEEE International Conference on
     1–5.                                                                              Electro/Information Technology (EIT), May 2013, pp. 1–6.
 [9] J. Tretmans, “Model based testing with labelled transition systems,” in      [28] W. Burgard, A. B. Cremers, D. Fox, D. Hähnel, G. Lakemeyer,
     Formal Methods and Testing, ser. Lecture Notes in Computer Science,               D. Schulz, W. Steiner, and S. Thrun, “Experiences with an interactive
     R. Hierons, J. Bowen, and M. Harman, Eds. Springer Berlin Heidelberg,             museum tour-guide robot,” Artif. Intell., vol. 114, no. 1-2, pp. 3–55, Oct.
     2008, vol. 4949, pp. 1–38.                                                        1999.
[10] M. Shirole and R. Kumar, “UML behavioral model based test case               [29] S. Thrun, M. Bennewitz, W. Burgard, A. Cremers, F. Dellaert, D. Fox,
     generation: A survey,” Softw. Eng. Notes, SIGSOFT, vol. 38, no. 4, pp.            D. Hahnel, C. Rosenberg, N. Roy, J. Schulte, and D. Schulz, “Minerva:
     1–13, Jul. 2013.                                                                  a second-generation museum tour-guide robot,” in Proceedings of the
[11] M. Iqbal, A. Arcuri, and L. Briand, “Empirical investigation of search            IEEE International Conference on Robotics and Automation, vol. 3,
     algorithms for environment model-based testing of real-time embedded              1999, pp. 1999–2005 vol.3.
     software,” in Proceedings of the 2012 International Symposium on             [30] D. Bohus, C. W. Saw, and E. Horvitz, “Directions robot: In-the-wild
     Software Testing and Analysis, ser. ISSTA 2012. New York, NY, USA:                experiences and lessons learned,” in Proceedings of the 2014 Interna-
     ACM, 2012, pp. 199–209.                                                           tional Conference on Autonomous Agents and Multi-agent Systems, ser.
[12] A. Andrews, M. Abdelgawad, and A. Gario, “Towards world model-                    AAMAS ’14. Richland, SC: International Foundation for Autonomous
     based test generation in autonomous systems,” in Proceedings of the 3rd           Agents and Multiagent Systems, 2014, pp. 637–644.
     International Conference on Model-Driven Engineering and Software            [31] T. Kanda, H. Ishiguro, T. Ono, M. Imai, T. Maeda, and R. Nakatsu,
     Development (MODELSWARD) 2015. SCITEPRESS Digital Library,                        “Development of robovie as a platform for everyday-robot research,”
     2015, pp. 165–176.                                                                Electronics and Communications in Japan (Part III: Fundamental Elec-
[13] P. Ammann and J. Offutt, Introduction to Software Testing, 1st ed.                tronic Science), vol. 87, no. 4, pp. 55–65, 2004.
     32 Avenue of the Americas, New York, NY 10013, USA: Cambridge                [32] D. Brscić, H. Kidokoro, Y. Suehiro, and T. Kanda, “Escaping from
     University Press, 2008.                                                           children’s abuse of social robots,” in Proceedings of the Tenth Annual
[14] R. Yang and C.-G. Chung, “A path analysis approach to concurrent                  ACM/IEEE International Conference on Human-Robot Interaction, ser.
     program testing,” in Proceedings of the 9th Annual International Phoenix          HRI ’15. New York, NY, USA: ACM, 2015, pp. 59–66.
     Conference on Computers and Communications, Mar 1990, pp. 425–               [33] T. Kanda, D. Glas, M. Shiomi, and N. Hagita, “Abstracting people’s tra-
     432.                                                                              jectories for social robots to proactively approach customers,” Robotics,
[15] A. Dias-Neto, R. Subramanyan, M. Vieira, and G. H. Travassos, “A                  IEEE Transactions on, vol. 25, no. 6, pp. 1382–1396, Dec 2009.
     survey on model-based testing approaches: A systematic review,” in           [34] M. Shiomi, D. Sakamoto, T. Kanda, C. Ishi, H. Ishiguro, and N. Hagita,
     Proceedings of the 1st ACM International Workshop on Empirical                    “Field trial of a networked robot at a train station,” International Journal
     Assessment of Software Engineering Languages and Technologies: Held               of Social Robotics, vol. 3, no. 1, pp. 27–40, 2011.
     in Conjunction with the 22Nd IEEE/ACM International Conference on            [35] M. Shiomi, T. Kanda, H. Ishiguro, and N. Hagita, “A larger audience,
     Automated Software Engineering (ASE) 2007. ACM, 2007, pp. 31–36.                  please!: Encouraging people to listen to a guide robot,” in Proceedings
[16] J. Zander, I. Schieferdecker, and P. J. Mosterman, Model-based testing            of the 5th ACM/IEEE International Conference on Human-robot Inter-
     for embedded systems, 1st ed. CRC Press, 2012.                                    action. IEEE Press, 2010, pp. 31–38.
[17] M. Utting, A. Pretschner, and B. Legeard, “A taxonomy of model-based         [36] ——, “Interactive humanoid robots for a science museum,” in Pro-
     testing approaches,” Softw. Test. Verif. Reliab., vol. 22, no. 5, pp. 297–        ceedings of the 1st ACM SIGCHI/SIGART Conference on Human-robot
     312, August 2012.                                                                 Interaction. ACM, 2006, pp. 305–312.
[18] M. Shafique and Y. Labiche, “A systematic review of state-based test         [37] M. Bouchard, J. Haegele, and H. Hexmoor, “Crowd dynamics of be-
     tools,” International Journal on Software Tools for Technology Transfer,          havioural intention: train station and museum case studies,” Connection
     pp. 1–18, 2013.                                                                   Science, pp. 1–24, 2014.
[19] A. Jacoff, E. Messina, B. Weiss, S. Tadokoro, and Y. Nakagawa, “Test
     arenas and performance metrics for urban search and rescue robots,” in
     Proceedings of the IEEE International Conference on Intelligent Robots
     and Systems (IROS), vol. 4, Oct 2003, pp. 3396–3403 vol.3.
[38] D. Brand and P. Zafiropulo, “On communicating finite-state machines,”    [40] M. Sighireanu, C. Chaudet, H. Garavel, M. Herbert, R. Mateescu, and
     J. ACM, vol. 30, no. 2, pp. 323–342, Apr. 1983.                               B. Vivien, “LOTOS NT user manual,” 2000.
[39] K. T. Cheng and A. Krishnakumar, “Automatic functional test generation   [41] H. Garavel, F. Lang, R. Mateescu, and W. Serwe, “CADP 2011: a
     using the extended finite state machine model,” in Proceedings of the         toolbox for the construction and analysis of distributed processes,”
     30th International Design Automation Conference, ser. DAC ’93. New            International Journal on Software Tools for Technology Transfer, vol. 15,
     York, NY, USA: ACM, 1993, pp. 86–91.                                          no. 2, pp. 89–107, 2013.