=Paper=
{{Paper
|id=Vol-1514/paper1
|storemode=property
|title=Active World Model for Testing Autonomous Systems Using CEFSM
|pdfUrl=https://ceur-ws.org/Vol-1514/paper1.pdf
|volume=Vol-1514
|dblpUrl=https://dblp.org/rec/conf/models/AndrewsAG15
}}
==Active World Model for Testing Autonomous Systems Using CEFSM==
Active World Model for Testing Autonomous
Systems Using CEFSM
Anneliese Andrews Mahmoud Abdelgawad Ahmed Gario
Department of Computer Science Department of Computer Science Department of Computer Science
University of Denver University of Denver University of Denver
Denver, CO 80208 USA Denver, CO 80208 USA Denver, CO 80208 USA
Email: andrews@cs.du.edu Email: abdelgaw@cs.du.edu Email: agario@du.edu
Abstract—This paper describes a model-based test generation To address these challenges, we propose a systematic MBT
approach for testing autonomous systems interacting with their approach, World Model-based Test Generation (WMBTG),
environment (i.e., world). Unlike other approaches that assume that identifies what, where and how to use worlds for testing
a static world with attributes and values, we present and test
a dynamic world. We use Communicating Extended Finite autonomous systems inteacting with their surroundings. Test
State Machine (CEFSM) to illustrate an active world model cases are generated by aggregating test paths in the individ-
that describes behaviors of environmental factors (i.e., actors). ual models. These test paths are grouped as concurrent test
Abstract World Behavioral Test Cases (AWBTCs) are then gen- paths which can be used with simulators or test-harnesses to
erated by covering the active world model using graph coverage validate autonomous systems. WMBTG has been introduced,
criteria. We also generate test-data by input-space partitioning
to transform the generated AWBTCs into executable test cases. in our previous work, and was applied to the unmanned
We apply the World Model-based Test Generation (WMBTG) vehicle application domain [12]. We extend the applicability of
technique to a case study from the Human-Robot Interaction WMBTG to the Human-Robot Interaction (HRI) domain. We
domain (HRI) specifically a tour-guide robot. Reachability of the also evaluate the efficiency of test path coverage criteria used
active world model and efficiency of coverage criteria are also to generate Abstract World Behavioral Test Cases (AWBTs).
discussed.
We evaluate input-space partitioning coverage criteria [13]
I. I NTRODUCTION used to generate test-data. We use UML class diagrams to
Autonomous systems are commonly defined as those sys- construct the structural model of actors and their relationships.
tems that are able to accomplish entirely or in part certain We also use CEFSM [7,14] to represent landscapes in worlds.
tasks/goals without human intervention [1,2]. Autonomous We call these landscapes snippets. Snippets are used to link the
systems exist in various applications such as driverless vehicles behavioral models of various actors that are involved in this
(so-called Unmanned Vehicles) [3], Search and Rescue robots world. We also explore the applicability of the MBT technique
(SaR) [4], and Human-Interaction Robots (HRI) [5]. As such, for testing autonomous systems behaviors in dynamic worlds
the robotic vacuum cleaner (Roomba) is a prime example in [12] alongside behavioral testing that is flexible, system-
of autonomous systems [6]. Testing the interactions between atic, scalable, and shows that this technique is extendable to
autonomous systems and world actors- pedestrians, mobile other domains of autonomous systems and to other types of
objects, and unknown obstacles- poses a series of challenges, behavioral models.
due to the complexity of these systems and the uncertainty The remainder of this paper is organized as follows. Section
of their surroundings. In order to generate behavioral test II discuses the state of research related to MBT, testing
cases in the form of simultaneous world stimuli, Model-based autonomous systems, and World Model-based Testing. Sec-
Testing (MBT) is able to leverage behavioral models, such tion III defines the human-robot interaction domain, explores
as CEFSM [7], Coloured Petri Nets (CPN) [8], Labelled HRI classifications, and describes the case study. Section
Transition Systems (LTS) [9], and sequence and communi- IV presents our approach and applies it to the case study.
cation diagrams in Unified Modeling Language (UML) [10], We analyze and discuss reachability and efficiency issues in
to describe the behavioral scenarios that can occur between section V. Section VI draws conclusions and future work.
the System Under Test (SUT) and its world. This requires II. S TATE OF R ESEARCH
testing solutions to deal with the large number of possibilities
A. Model-based Testing (MBT)
of the behavioral scenarios. Current MBT approaches for
testing Real-time Embedded Systems (RTES) interacting with According to Dias-Neto et al. [15], MBT uses various
their worlds assume a static world model [11]. However, models to automatically generate tests. MBT includes three
for autonomous systems, the world cannot be described only key elements: models that describe software behavior, crite-
statically with attributes and values, the world should also be ria that guide the test-generation algorithms, and tools that
presented and used for testing dynamically. generate supporting infrastructure for the tests. Zander et al.
[16] define MBT as an algorithm that generates test cases
automatically from models instead of creating them manually. characteristics of CPNs, such as color-based and event-based
Utting et al. [16,17] define six dimensions of MBT approaches coverage criteria. Color-based coverage criteria focus on the
(a taxonomy): model scope, characteristics, paradigm, test consumption of tokens that belong to pre-defined color sets.
selection criteria, test generation technology and test execu- Event-based coverage criteria focus on the CPN events, where
tion. They also classify MBT techniques as state, history, an event is defined as a transition together with enabling
functional, operational, stochastic, and transition based. Shi- variables. This work does not provide validation, nor does it
role et al. [10] present a survey on model-based test genera- address dynamic worlds. The authors however found MBT is
tion from UML behavioral specification diagrams (sequence, the most promising option for testing autonomous systems.
state chart and activity diagrams). They classify the various
C. World Model-based Testing
research approaches based on formal specifications, graph-
based, heuristics, and direct UML specification processing. In Most approaches in the literature for modeling the world
graph-based testing, a test case is a path that covers some of autonomous systems define the world model as a software
specific system requirement. Test case generation from graphs control component that represents the autonomous system’s
includes the following steps: build a graph model, identify test view of its world. In our approach, the world is consid-
requirements, select test paths to cover those requirements, ered as independent actors interacting with the SUT instead
and derive test data. Shafique and Labiche [18] present a of being part of the system. These approaches are mostly
systematic review to determine the current state of the art for the purpose of increasing the understandability of the
of MBT tool support. They scope their study to tools that decision-making module to the relevant surroundings in order
use state-based models: FSM, Extended FSM, Abstract State to implement proper, efficient, and safe behaviors, but they
Machine (ASM), State-chart, UML state machine, (timed, are not aiming for model-based testing. Gheta et al. [21]
input/output)-automata, Harel Statechart, Petri Net, State Flow contribute an intelligent information storage and management
diagram and Markov chain. They grouped MBT tools based system approach for autonomous systems with the aim of
on test criteria similarity. Twelve MBT tools are selected for modeling the world of an autonomous system. The world
primary studies. A comparison enables tool selection based on model is represented as instances of classes with class specific
project needs. The literature shows that CEFSM is practicable attributes and relations. Furda and Vlacic [22] also present
for representing dynamic behaviors including dependency, an object-oriented world model approach for the road traffic
concurrency, and communication. environment of autonomous vehicles. The approach uses UML
class diagrams to represent the structure of the world actors.
B. Testing Autonomous Systems The authors conducted an experiment, a field trial, using
The techniques for testing autonomous systems are mostly two autonomous vehicles. The experiment illustrates that the
computer-based simulation and test fields/arenas. Jacoff et al. world model strongly supports the decision-making module
[19], by the National Institute of Standards and Technology for making appropriate driving decisions in real-time. Never-
(NIST), introduce a standard for designing and evaluating test theless, this work neither intends to be a testing technique for
arenas (Reference Test Arena for Autonomous Mobile Robots). autonomous systems nor does it handle the dynamic aspect of
The test arenas consist of collapsed structures that are designed world actors.
from buildings in various stages of the collapse. Pepper et al. A closely related approach for world model-based testing is
[20] illustrate a computer-based simulation technique for eval- presented in [23]. It generates black-box test cases automati-
uating USAR robots using USARSim, a robot simulation tool. cally based on a static world model. The main characteristics
The computer-based simulation is flexible, repeatable, and of the approach are: 1) modeling the structural and behavioral
accurate compared with physical test fields/arenas; however, world properties, especially real-time properties. Invariants
it lacks physical fidelity. Both techniques, computer-based and error states such as unsafe, undesirable, or illegal states
simulation and test fields/arenas, also limit possible behavioral are also modeled. They use an extension of UML (MARTE)
scenarios that may occur in autonomous system worlds. Lill that models and analyzes real-time embedded systems. They
et al. [8] use a MBT technique for testing autonomous sys- also use Object-Constraints Language (OCL)for specifying
tems. The authors, first, compare different modeling notations environmental constraints. 2) Test oracles are then generated
(Process Algebras like Calculus of Communicating Systems automatically from the world model. They use a simulator to
(CCS) and Communicating Sequential Processes (CSP), UML observe actual response. 3) To identify feasible test cases and
activity diagrams, Petri Nets (PNs), and Coloured Petri Nets maximize possibilities of fault detection, heuristic algorithms
(CPNs)) that are used to model concurrent behavior of co- such as Random Testing (RT), Adaptive Random Testing
operating autonomous systems. The comparison is based on (ART), and Search-based Testing (SBT) (specifically, Genetic
four evaluating criteria (understandability, well-definedness, algorithm and (1+1) Evolutionary algorithm) are used as
scalability, and testability). They then select CPNs to model test generation strategies. An empirical study is conducted
a factory robot that carries a load from one place to another. to identify which test case generation approach obtains the
Four behavioral transitions (look ahead, raise alarm, go ahead, best results. The experiment shows that ART is the best
mission completed) were modeled. Obstacle passing is not among these algorithms. Recently, this approach has been
considered. They also define coverage criteria tailored to the enhanced and applied to automotive sensor system [24]. This
approach generates use cases automatically by using Natural [31]. It has a head with two eyes, two arms, a body, and
Language Processing (NLP). The approach (so called Use wheels for mobility. Its mobile platform includes two driving
Case Modeling for System Tests Generation (UMTG)), is wheels and one free wheel. Robovie is equipped with 10 tactile
applied to BodySense system. BodySense system monitors sensors, an omnidirectional vision sensor, two microphones to
a car seat to classify the occupants. It disables the airbag listen, and 24 ultrasonic sensors for detecting obstacles. The
for children and unoccupied seats while it enables airbag for eyes have a pan-tilt mechanism with direct-drive motors, and
adults. It also includes a seat belt reminder function. The they are used for stereo vision and gaze control. It also has
result indicates that test requirements generated by UMTG skin sensors for realizing interactive behaviors. Robovie com-
are entirely feasible. However, this approach limits the world municates with its sensory environment via wireless LAN. En-
model to a static world. It is also specified for testing real-time vironmental sensors that Robovie communicates with usually
embedded systems (RTES). It is not applicable for autonomous are Laser Range Finders (LRFs). Many cognitive experiments
systems because their worlds are dynamic. were conducted on Robovie to increase its behavioral skills.
For instance, Robovie can predict human behaviors. This skill
III. A PPLICATION DESCRIPTION
was added recently to Robovie for the purpose of escaping
In this paper, we consider tour guide-robot applications. Ac- from children’s abuse [32]. Robovie can perform meaningful
cording to [5,25,26], a tour guide-robot is classified as a highly interactive-behaviors for a human. For example, it can gaze,
autonomous, serviceable, anthropomorphic, navigational, and gesture, greet, converse, listen, assist, follow, accompany and
social HRI robot. Tour guide-robots usually perform in indoor guide people. Robovie also can perform in various sensory en-
sensory environments where sensors are placed ubiquitously vironments. Kanda et al. [33] present a field trial conducted on
for perceiving static/mobile objects. Tour guide-robots use the Robovie at a shopping mall for five weeks. Each day, approxi-
sensory environment to understand their surroundings and to mately 100 groups of customers signed up to interact with the
localize themselves. Museum, campus, shopping arcade, train robot. The experimenters observe Robovie’s interaction with
station, and hotel-lobby are examples of indoor sensory envi- customers. They also collect feedback using a questionnaire.
ronments. MacDougall et al. [27] present a sensory environ- The findings show that the customers accepted the robot with
ment. They built the sensory environment in the Electrical and positive impressions. In [34] and [35], the authors present the
Computer Engineering Department of Kettering University for results of two field trials on Robovie at a train station. Its task
the purpose of college tours. They then conducted a field trial includes greeting, providing directions, and advertising. The
on a tour guide-robot that gives tours to visitors. Although, experimenters also consider this robot being capable to elicit
experimenters use visitors’ opinions (questionnaire), which is spontaneous participation from pedestrians. This experiment
not enough for validation, these experiments illustrate that the investigates the robot’s technical performance and its attitudes.
tour guide-robots are capable to be proximate, conversational, The authors illustrate fine results in both considerations except
serviceable, and sociable. Burgard et al. [28] conducted a speech recognition. Shiomi et al. [36] also present a field
field trial on a tour guide-robot, RHINO, in the Deutsche trial conducted on Robovie at Osaka Science Museum for two
Museum Bonn. For six days, RHINO gave tours to more months. The robot is assigned to welcome, guide, and provide
than 2000 visitors. Thrun et al. [29] experimented with a tour scientific information about the exhibits. In this field trial,
guide-robot, MINERVA, in the Smithsonian National Museum the target visitors are children, therefore Robovie expresses
of American History. MINERVA successfully educated and childlike behaviors such as handshaking, hugging, and free-
entertained many thousand visitors. Socially, MINERVA is playing. The findings indicate that performing childlike in-
compared with RHINO. A key difference between both robots teractions effectively attracts visitors’ attention for scientific
relates to their interactive capabilities. RHINO acts more explanations. These field trials were conducted with a few
rudimentarily and does not exhibit emotional states, while interventions (so-called semi-autonomous trial) by operators
MINERVA behaves more effectively in attracting people and (humans). Operators use ubiquitous cameras to watch partici-
making progress. The Microsoft Research Team [30] also pants and start the robot for greeting when a participant talks
conducted a field trial on a tour guide-robot (a humanoid robot to it. They also intervene when the robot encounters a critical
(NAO), from Aldebaran Robotics, France). The experiment situation. In compliance with these field trials, one can image
focuses on conversational engagement, handling queries, and that Robovie generally deals with three types of environmental
providing directions to visitors. The experimental results show actors (people, obstacles, and environmental sensors). Also, to
successful conversational engagements with individuals but present an active world model, we assume that this robot is
not multiparty conversational situations. The HRI literature fully autonomous.
with the most relevance to the guide-robot applications comes 2) Crowds in a train station: As mentioned, Robovie can
from a series of studies on Robovie, a humanoid robot invented perform in various sensory environments where crowds can be
by Advanced Telecommunications Research (ATR) Institute, formed. A crowd is defined as a large group of individuals
Japan. Therefore, we select Robovie as a System Under Test in the same physical environment, sharing a common goal
(SUT) in our case study. [37]. Each crowd has a set of behaviors resulting from world
1) Tour-guide Robot (Robovie): Robovie is an interactive actors. For instance, in a museum, an attendee can attend, ask,
humanoid robot performing human-like physical expressions film, and leave an exhibit. In a train station, crowds can be
IV. A PPROACH
Our objective is to apply a systematic model-based test
generation approach [12] to generate test cases from an active
world model that represents world actors of an autonomous
system. There are a multitude of world actors. They can
be mobile or static. World actors also act independently,
Fig. 1: Shopping Arcade in a Train Station, [35]
formed over many snippets (entry gate, train doors, shopping
arcade, restaurant, exit gate). Each snippet has a somewhat
different set of behaviors from other snippets. At a restaurant,
for example, people can walk, talk, order, sit, eat, etc. At the
entry gate, people run, sometimes push and shove, follow, pass,
etc. In both locations, people engage in common behaviors
such as talking, listening, gesturing, gazing, etc. In addition,
people naturally do these behaviors simultaneously. Therefore,
crowd behaviors are infinite and simultaneous. [37] explores Fig. 3: World Behavioral Test Generation Process
the crowd behaviors in detail.
In this paper, the shopping arcade in the train station presented simultaneously, and unpredictably. To avoid scalability and
in [35] is considered as a world snippet of the case study. As complexity issues of the dynamic worlds, we concentrate on
actors that autonomous systems are dealing with and the
shown in Fig. 1, the shopping arcade in a train station consists
of participants, obstacles, and four LRFs mounted around behavioral messages that autonomous systems can perceive
the trial area. In [35], the authors classify the participants from these actors. The locations where actors interact are also
considered. In other words, a set of actions that a group of
as addressees, side-participants, bystanders, and pedestrians.
Addressee is a person in front of the robot listening to and actors can perform may occur over a particular snippet. For
instance, when travelers stop by a shopping arcade in a train
following. Side-participants are participants accompanying an
addressee. Addressee and Side-participants are responsible for station, behaviors can be walking or standing.
responding to the robot. Bystanders are participants encour- We build the world behavioral model in two steps. First,
we construct a structural model of actors to represent their
aged by the robot, addressee, or side-participants but they are
not responsible for interacting with the robot. Pedestrians are attributes, functions and relations. Second, we construct the
people who are not arranged in any of these classes. Obstacles behavioral model to describe actors’ possible states and tran-
sitions and their interactions. Each actor is presented by
are classified as Mobile obstacles (e.g., drivable cleaning-cart)
and Static obstacles like ”caution: wet floor” sign. The four one behavioral model showing its behavioral messages. The
LRFs are assigned to sense the obstacles and the participants’ interactions between these actors represent the active world
movements in the trial area and provide the sensory data to model. These interactions need to be modeled by a com-
the robot. Fig. 2 visualizes a set of shopping arcade actors. municating behavioral semantic model such as CEFSM. As
such in our application, actors are interacting simultaneously,
the active world model should cover not only the internal
transitions of actors, but also the interactions between them.
The active world model can then be leveraged to generate
world behavioral test cases. Once we build the active world
model, any member of the graph-based testing criteria from
[8,13] can be used to generate abstract behavioral test paths,
which are AWBTCs. Finally, we generate test-data by input-
Fig. 2: A set of shopping arcade actors
space partitioning to transform the generated AWBTCs into
The areas also are classified into two types, Area of Audience executable test cases. The test generation process is illustrated
(AoA) and Area of Passing (AoP) [35]. AoAs are locations in Fig. 3. The world Model-based test generation process
where pedestrians tend to become members of the audience follows three phases:
like a restaurant. AoPs are locations where pedestrians tend to • Model the active world by constructing structural and
keep passing, for instance entry and exit gates, see [35]. The then behavioral models.
shopping arcade is considered as AoA.
TABLE I: Instances of Shopping Arcade Actors world actors is shown in Fig. 4. The actors are aggregated
into a Crowd snippet. Similar actors are generalized to a
Class Actor Instance Behavioral message example
single class. For instance, Static and Mobile obstacle are
Participant An elderly person, child.attend() generalized into the Obstacle class. The number of actors in
an adult, a teen, and child.behave(talk)
a child elderly.behave(gaze) the Crowd is determined by their multiplicity relationship.
Mobile Cleaning cart and cleaningCart.appear() Instances of shopping arcade actors and examples of their
obstacle maintenance cart cleaningCart.act(moveBW) behavioral messages are illustrated in Table I. We assume
cleaningCart.act(flashingLight)
that only one robot performs in a snippet; this robot is not
Static Caution signs, flash caution.appear() considered a world actor as it is the SUT.
obstacle lights, siren, and caution.inform(”Wet Floor”)
fire alarms alram.inform(”siren”) 2) Behavioral Model: Although a wide range of behavioral
LRF LRF1, LRF2, LRF1.on() models exists, we illustrate the behavioral model using com-
LRF3, and LRF4 LRF1.detect(participant) municating extend finite state machine (CEFSM). The strength
of CEFSM is that it can model orthogonal states of a system
in a flat manner and does not need to compose the whole
• Select proper graph-based coverage criteria for test-path system in one state as in state charts, which would make it
generation and proper input-space partitioning coverage more complicated and harder to analyze and/or test [7,38].
criteria for test-data. CEFSM = (S, s0 , E, P, T, A, M, V, C), such that: S is a finite
• Generate AWTCs which are concurrent test paths and set of states, s0 is the initial state, E is a set of events, P is
then generate test-data to transform these concurrent test a set of boolean predicates, T is a set of transition functions
paths into executable test cases. such that T: S×P ×E→S×A×M , A is a set of actions, M
is a set of communicating messages, V is a set of variables,
A. Phase 1: World Models and C is the set of input/output communication channels used
1) Structural Model: The structural model is constructed in the CEFSM. State changes (action language): The function
using a UML class diagram, where classes represent actors T returns a next state, a set of output signals, and an action
including their important characteristics, behavioral messages, list for each combination of a current state, an input signal,
and relationships. In our application, the shopping arcade in a and a predicate. It is defined as: T(si , pi , get(mi ))/(sj , A,
train station can be represented by a single snippet ”Crowd”. send(mj1 ,..., mjk )) where, si is the current state, sj is the
World actors that are considered to perform in this snippet are next state, pi is the predicate that must be true in order to
of three types: participants (humans), obstacles, and LRFs. execute the transition, ei is the event that when combined
A participant can be addressee, side-participant, bystander, with a predicate trigger the transition function, mi1 ,..., mik
or pedestrian [35]. Obstacles also are classified as mobile or are the messages. CEFSM is a generalization of an EFSM
static. Pedestrians and mobile obstacles perform independently [39] (i.e., adding communication channels between EFSMs).
Modeling behavioral models follows two steps. First, we
Fig. 4: Structural Model for Shopping Arcade
and concurrently several behaviors (communicated via mes- Fig. 5: Behavioral Models for Shopping Arcade Actors
sages); however, static obstacles inform messages only. LRFs
detect objects that appear in the Crowd and describe these model each individual actors as EFSMs. Then we model the
objects to the robot. However, these LRFs do not interact with interaction between these actors as CEFSM. Fig. 5 shows
other world actors except the robot. The UML class diagram a set of CFSM that represents a group of shopping arcade
that represents the shopping arcade in a train station and its actors interacting with each other. It is clear that a participant
actor can express multiple behavioral messages simultane- processes. We use transition-based coverage [7] to generate in-
ously. For instance, a participant can walk and talk at the ternal paths that cover the processes of shopping arcade actors.
same time. Similarly, mobile obstacles also can concurrently In the shopping arcade, as illustrated in Table II, each of these
reveal several behavioral messages. For instance, a cleaning-
cart moves forward/backward and flashes its lighting-alarm at TABLE II: Internal Test Path Sets
the same time. However, in some cases, exhibiting different
behavioral messages concurrently is infeasible. For example, 1. Participant process, T P1 = {tp11 }
p.attend() p.behave(behavior)
a pedestrian cannot sit and walk at the same time. Therefore, tp11 : absent −−−−−−→ present −−−−−−−−−−−−→
p.miss()
proper input-space partitioning criteria can be used to exclude present −−−−−→ absent
the infeasible combinations of behavioral messages. However, 2. Mobile obstacle process, T P2 = {tp21 }
static obstacles inform by messages only. For example, a mo.appear()
tp21 : hidden −−−−−−−−→ visible −
mo.act(action)
−−−−−−−−−
→
cautionary sign shows a wet floor message. As shown in Fig. mo.disappear()
visible −−−−−−−−−−→ hidden
5, a participant initially is out of the crowd (absent). When
this participant attends the crowd, the attend() transition fires 3. Static obstacle process, T P3 = {tp31 }
so.appear() so.inf orm(message)
and the participant moves to present place. This participant tp31 : unseen −−−−−−−→ seen −−−−−−−−−−−−−→
so.disappear()
then starts a behavior. Whenever this participant behaves, the seen −−−−−−−−−→ unseen
behave() transition fires, the behavioral message reveals, and 4. LRF process, T P4 = {tp41 }
the participant moves to present state again. This process lrf.on() lrf.detect(object) lrf.of f ()
tp41 : of f −−−−−→ On −−−−−−−−−−−→ On −−−−−−→ Of f
can occur at will. It is similar to other actors’ processes.
The key difference is that the behavioral/information messages
these processes reveal are dissimilar. Secondly, these behav- processes T Pi is coincidentally covered by one internal test
ioral models (EFSMs) that represent world actors are linked path tpi1 only because the size of these processes is small. The
together into a higher level behavioral model which describes test path sets, (T P1 , T P2 , T P3 , T P4 ), interact concurrently
the interactions among these actors. Fig. 6 illustrates the high with each other via the exchange of behavioral/information
level behavioral model. This model shows that participants messages (i.e., interaction messages). These interaction mes-
can interact with mobile obstacles and vice-versa. However, sages represent the high level of execution behavior of the
active world model, as shown in the behavioral model (High-
Level), in Fig. 6. The interaction among the processes can
be covered by interaction test paths which represent the
possibilities of execution behavior of the interaction messages.
To avoid cyclic paths, the Simple-path coverage criterion [13],
is used to generate the interaction test paths that cover the
shopping arcade interaction messages. Table III shows six
simple paths, Interaction Test Paths (IT P1 , IT P2 , . . . , IT P6 ),
TABLE III: Interaction Test Paths
IT P1 : (participant −
→ mobile obstacle −
→ participant)
Fig. 6: Behavioral Model (High-Level)
IT P2 : (participant −
→ LRF )
static obstacles can only show messages to other actors (i.e., IT P3 : (mobile obstacle −
→ LRF )
unidirectional interaction). A LRF also is a unidirectional actor
due to the fact that they detect objects only. IT P4 : (static obstacle −
→ participant)
B. Phase 2: Coverage Criteria IT P5 : (static obstacle −
→ mobile obstacle)
Since each actor is represented as a EFSM (a process), an IT P6 : (static obstacle −
→ LRF )
active world behavioral model can be defined as a collection
of concurrent processes, AW M = {P1 , P2 , . . . , Pi } where that cover the high level of the active world model for shopping
1 ≤ i ≤ M and M is the number of actors that share arcade actors. Each interaction test path combines the internal
a snippet. Model-flow coverage criteria such as node (state- test paths of processes T Pi that are involved in the interaction
based) coverage, edge (transition-based) coverage, etc. [8,13], scenario. For instance, IT P1 , shown in Table III, covers the
can be applied. Using any of a number of test path generation interaction between the participant process and the mobile
techniques, test paths that fulfill these coverage criteria can obstacle process. Thus, the internal paths of this interaction is
then be generated. Let T Pi = (tpi1 , tpi2 , ..., tpik ) be a set of (tp11 → tp21 → tp11 ). However, the behaviors vary due to the
such internal test paths that cover the process Pi and k is non-deterministic interactions between the internal test paths.
the number of these internal test paths. These internal paths For instance, in the shopping arcade, when a participant moves
describe the internal execution (possible behaviors) of the ahead of a cleaning-cart, the cleaning-cart may stop to give the
participant free way, or it may keep moving and alerting the 3) Input-space Partitioning Coverage Criteria: The gen-
participant by a beep. The interaction test paths are considered erated concurrent test paths are still abstract. To transform
concurrent paths. The concurrent interaction between internal these concurrent test paths into executable test cases, test-
test paths that represent multiple processes produces a number data coverage criteria, i.e. input-space partitioning [13], also
of possible combinations of internal paths. As a result, we are required. The input-space partitioning criteria can be
have two types of coverage criteria, path combination and considered as methods to divide a collection of values (input-
concurrent test path coverage criteria. domain) into test-data blocks that make the concurrent test
1) Path Combination: In order to cover all possible paths executable. The input-domain is the set of possible
combinations of internal paths, path combination coverage values that input variables can take on. In the shopping arcade
criteria should determine what combinations are required. snippet, the behavior execution of actors is controlled by
Let (T P1 , T P2 , . . ., T Pn ) be sets of internal test paths for five input-domains: participant type, participant behaviors,
(P1 , P2 , . . . , Pn ) where T P1 = {tp11 , tp12 , . . ., tp1k }, T P2 = mobile obstacle actions, static obstacles and LRF detected
{tp21 , tp22 , . . ., tp2k }, . . . , and T Pj = {tpj1 , tpj2 , . . ., tpjk }. objects. There is one block for each. The Participant type
Then, the selection of a tp1i from T P1 and a tp2j from block includes {addressee, side participant, bystander, and
T P2 is called a path combination. Let len(tpij ) be the pedestrian}. The Participant behavior block consists of values
number of nodes in tpij , the length of tpij . The com- {sit, walk, listen, talk, gaze, gesture, eat, drink}. The mo-
bination set for interaction test path IT Pi , CombIT Pi = bile obstacle actions block consists of {move forward, move
{(tpjk , . . ., tpmn )|tpmn ∈ T Pm , m = len(IT Pi ), n = backward, turn right, turn lift, beep, flash lights} while the
|T Pi |, 1 ≤ j ≤ m, 1 ≤ k ≤ n}. The number of all path static obstacles block includes the messages {“Wet Floor”,
combinations of IT P1 , for instance, will be the product of “Do Not Enter”}. The LRF detected objects block contains
the number of internal paths of T P1 , T P2 , and T P1 . Each {participant, mobile obstacle, and static obstacle}. In this
combination introduces a set of concurrent test paths. work, we use All Combinations Coverage (ACoC) which
2) Concurrent Test Path Coverage Criteria: The path com- exercises all possible combinations of test-data. The number of
bination sets do not show how these paths interact concur- Q
Q
test-data sets that satisfy ACoC is (Bi ), where Bi is a block
rently. Therefore, concurrent test path coverage criteria are i=1
required. In this work, we apply the all possible serialized of values for a parameter and Q is the number of parameters.
execution sequences coverage criterion. We also use the Ren- To compare with ACoC, we also use Each Choice Coverage
dezvous coverage criterion, as in [14]. These concurrent test (ECC) that selects one value from each block of values. The
Q
path coverage criteria are defined as follows: number of test-data that satisfy ECC is M AXi=1 (Bi ) [13].
• All Possible Serialized Execution Sequences Coverage C. Phase 3: Test Generation
Criterion (APSESCC): Test requirements contain a set The path combinations are represented as ordered references
of all possible serialized nodes of the test paths that are to internal test paths of the processes involved in the execution.
included in each path combination, i.e. each node in These combinations may result in a huge number of concurrent
test path tpij can be triggered by each node in test path test paths, AWBTCs, although not all of these concurrent test
tpmn and vice versa. For example, let tpij be a → b and paths are feasible. We used the serialization algorithm in [14]
tpmn be x → y , where tpij and tpmn are in the same to generate these concurrent test paths. The concurrent test
path combination. All serialized execution sequences of paths are serialized nodes of the internal paths. We expressed
path combination cxy = (tpij , tpmn ) will be: the concurrency of test paths using double-bar “||” as used in
((a → b → x → y), (a → x → b → y), (a → x → y → LOTOS for defining concurrency [40]. In the shopping arcade
b), (x → y → a → b), (x → a → y → b), (x → a → b → snippet, each process involved in the interaction is satisfied by
y)). one internal test path only. As a result, each interaction test
path is composed of one combination which represents the
If a path combination includes two paths and each concurrent test path, an AWBTC. Table IV shows the path
one contains three nodes, the all serialized execution combinations and the AWBTCs that satisfy the interaction
sequences will be 20 possible serializations. All possible test paths presented in Table III. Six path combinations are
number of serializations of nodes is created for covering interaction test paths of the shopping
len(cij )
P
arcade; consequently, six AWBTCs are generated. This num-
|CombIT Pi | (
X j=1
|tpij |)! ber of AWBTCs is reasonable for this small number of actor
( len(cij ) ). processes. Nevertheless, when we impose the APSESCC and
Q
i=1 (|tpij |)! RCC to serialize these AWBTCs, the total number of test paths
j=1
• Rendezvous Coverage Criterion (RCC): The test require- serialized by the APSESCC is 35000 serialized paths while
ments contain a set of all paths that have rendezvous the RCC produces 244 rendezvous paths. To transform these
nodes. Then the possible number of rendezvous-paths AWBTCs into executable test cases, we also apply ACoC and
Q
n ECC coverage criteria to generate test-data that meet these
RZV of the interaction test path IT Pi is (T Pj +1)−1.
AWBTCs. The five blocks of values described in section IV-B3
j=1
meet these criteria. For each interaction test path IT Pi , there
TABLE IV: Combinations of Concurrent Test Paths 352 states connected by 1046 arcs. However, the number of
reachable states grows exponentially as the number or the size
1. Combination CombIT P 1 (tp11 , tp21 , tp11 ) = AW BT C1 :
p.attend() p.behave(behavior) of processes increase. For instance, for 8 processes, the number
(tp11 [absent −−−−−−−→ present −−−−−−−−−−−−−→
p.miss() || mo.appear() of reachable states expands to 1346 states with 3658 arcs.
→tp21 [hidden −−−−−−−−−→
present −−−−−−→ absent] −
mo.act(action) mo.disappear()
Although CADP is scalable up to 1013 nodes, a display in
visible −
−−−−−−−−−−→ visible −−−−−−−−−−−→ CADP is no longer easily readable.
|| p.attend()
hidden] →tp11 [absent
− −−−−−−−→
p.behave(behavior) p.miss()
present −−−−−−−−−−−−−→ present −−−−−−→ absent]) B. Criteria Efficiency
2. Combination CombIT P 2 (tp11 , tp41 ) = AW BT C2 : It is clear that the number of generated concurrent test paths
p.attend() p.behave(behavior)
(tp11 [absent −−−−−−−→ present −−−−−−−−−−−−−→ depends on several factors: the number of actor processes that
p.miss() || lrf.on() are involved in the execution and the size of these processes,
present −−−−−−→ absent] →tp41 [of f
− −
−−−−−
→
lrf.detect(object) lrf.of f ()
On −−−−−−−−−−−−→ On −−−−−−→ Of f ]) the combination criteria selected to combine the internal
paths of these actor processes, the coverage criteria chosen
3. Combination CombIT P 3 (tp21 , tp41 ) = AW BT C3 :
mo.appear() mo.act(action) to serialize these internal paths, and the coverage criteria
(tp21 [hidden −−−−−−−−−→ visible −
−−−−−−−−−−
→
mo.disappear() || lrf.on()
visible −−−−−−−−−−−→ hidden] →tp41 [of f
− −
−−−−−
→ TABLE V: Test Criteria Efficiency
lrf.detect(object) lrf.of f ()
On −−−−−−−−−−−−→ On −−−−−−→ Of f ])
Interaction Test-paths C. Test-data C. APSESCC RCC
4. Combination CombIT P 4 (tp31 , tp11 ) = AW BT C4 : Test with with
so.appear() so.inf orm(message) Path APSESCC RCC ACoC ECC ACoC EEC
(tp31 [unseen −−−−−−−−→ seen −−−−−−−−−−−−−−−→
so.disappear() || p.attend() IT P1 34650 124 192 8 6652800 992
→tp11 [absent −−−−−−−→
seen −−−−−−−−−−→ unseen] −
p.behave(behavior) IT P2 70 24 96 8 6720 192
present −−−−−−−−−−−−−→ present])
IT P3 70 24 18 6 1260 144
5. Combination CombIT P 5 (tp31 , tp21 ) = AW BT C5 : IT P4 70 24 96 8 6816 192
so.appear() so.inf orm(message)
(tp31 [unseen −−−−−−−−→ seen −−−−−−−−−−−−−−−→ IT P5 70 24 18 6 1260 144
so.disappear() || mo.appear()
→tp21 [hidden −−−−−−−−−→
seen −−−−−−−−−−→ unseen] − IT P6 70 24 9 3 630 72
mo.act(action) mo.disappear()
visible − → visible −−−−−−−−−−−→ hidden])
−−−−−−−−−− Total 35000 244 429 39 6669486 1736
6. Combination CombIT P 6 (tp31 , tp41 ) = AW BT C6 :
so.appear() so.inf orm(message)
(tp31 [unseen −−−−−−−−→ seen −−−−−−−−−−−−−−−→ chosen to generate test-data. We use APSESCC and RCC
so.disappear() || LRF.ON()
seen −−−−−−−−−−→ unseen] →tp41 [of f
− −
−−−−−−−
→ coverage criteria on six interaction test paths to serialize the
LRF.detect(object)
On −−−−−−−−−−−−−→ On]) generated AWBTCs. We then apply ACoC and ECC coverage
criteria to generate test-data in oder to transform the serialized
test paths into executable test cases. Table V illustrates the
is a set of test-data that fulfills at least one concurrent test path efficiency of these coverage criteria. As mentioned above,
that belongs to this IT Pi . This set of test-data is selected from although exercising AWBTCs generated by APSESCC on test-
blocks that only are related to the actor processes involved data sets selected by ACoC is not practicable, it is considered
in the IT Pi . For instance, IT P1 represents the interactions as an upper bound. On the other hand, exploiting RCC on
between participant and mobile obstacle; as a result, three ECC is more feasible and efficient.
blocks (participant types, participant behaviors, and mobile
VI. C ONCLUSION AND F UTURE WORK
obstacle actions) are used to generate test-data for IT P1 . The
ACoC results in 429 test-data while the ECC produces 39. This paper presented the applicability of a model-based test
When these 429 test-data are used with the 35000 serialized generation approach [12] that allows testing of autonomous
paths, this results in 6669486 executable test cases. On the systems in their active world. We modeled an active world
other hand, the number of executable test cases generated of an autonomous system. A test generation process is ap-
by ECC with RCC is 1736. Using APSESCC with ACoC is plied. Path serialization techniques APSESCC and RCC are
clearly impractical; however, it presents the upper bound of imposed. The APSESCC is also compared withx RCC. The
test cases. We also consider using ECC with RCC as a lower findings indicate that RCC is practically feasible. To transform
bound. the generated AWBTCs into executable test cases, we also
exploited ACoC and ECC coverage criteria to generate test-
V. R EACHABILITY & C RITERIA E FFICIENCY data. The findings also show that the number of executable
A. Reachability test cases depends on the size of generated test-data and
To perform reachability analysis on the behavioral models, the size/number of actor processes that are involved in the
we use the Construction and Analysis of Distributed Processes execution. The CADP toolbox is used for reachability analysis.
(CADP) toolbox [41]. For generating all possible states that Future work will explore other testing techniques such as
a system can reach, CADP transforms the LOTOS code search-based testing techniques to handle path-selection and
that represents a CEFSM into a Labelled Transition System test-data generation of the concurrent processes. Future work
(LTS) graph. The reachability graph generated by CADP, for will also investigate the effectiveness of this approach by
four concurrent processes with three nodes each, consists of executing the generated test cases.
VII. ACKNOWLEDGMENTS [20] C. Pepper, S. Balakirsky, and C. Scrapper, “Robot simulation physics
validation,” in Proceedings of the 2007 Workshop on Performance
This work was supported, in part, by NSF IUCRC grant Metrics for Intelligent Systems, ser. PerMIS ’07. New York, NY, USA:
#0934413, 1127947, 1332078, and 1439693 to the University ACM, 2007, pp. 97–104.
of Denver. [21] I. Ghete, M. Heizmann, A. Belkin, and J. Beyerer, “World modeling for
autonomous systems,” in KI 2010: Advances in Artificial Intelligence,
R EFERENCES ser. Lecture Notes in Computer Science, R. Dillmann, J. Beyerer,
U. Hanebeck, and T. Schultz, Eds. Springer Berlin Heidelberg, 2010,
[1] L. Steels, “When are robots intelligent autonomous agents?” Robotics vol. 6359, pp. 176–183.
and Autonomous Systems, vol. 15, no. 12, pp. 3–9, 1995, the Biology [22] A. Furda and L. Vlacic, “An object-oriented design of a world model
and Technology of Intelligent Autonomous Agents. for autonomous city vehicles,” in Intelligent Vehicles Symposium (IV),
[2] S. Franklin and A. Graesser, “Is it an agent, or just a program?: IEEE, June 2010, pp. 1054–1059.
A taxonomy for autonomous agents,” in Intelligent Agents III Agent [23] M. Iqbal, A. Arcuri, and L. Briand, “Environment modeling with
Theories, Architectures, and Languages, 1997, pp. 21–35. UML/MARTE to support black-box system testing for real-time embed-
[3] H. Cheng, Autonomous Intelligent Vehicles: Theory, Algorithms, and ded systems: Methodology and industrial case studies,” in Model Driven
Implementation, 1st ed. Springer London Dordrecht Heidelberg, New Engineering Languages and Systems, ser. Lecture Notes in Computer
York: Springer-Verlag, 2011. Science. Springer Berlin Heidelberg, 2010, vol. 6394, pp. 286–300.
[4] Y. Liu and G. Nejat, “Robotic urban search and rescue: A survey from [24] C. Wang, F. Pastore, A. Goknil, L. Briand, and Z. Iqbal, “Automatic
the control perspective,” Journal of Intelligent and Robotic Systems, generation of system test cases from use case specifications,” in
vol. 72, no. 2, pp. 147–165, 2013. Proceedings of the 2015 International Symposium on Software Testing
[5] M. A. Goodrich and A. C. Schultz, “Human-robot interaction: A survey,” and Analysis, ser. ISSTA 2015. New York, NY, USA: ACM, 2015,
Found. Trends Hum.-Comput. Interact., vol. 1, no. 3, pp. 203–275, Jan. pp. 385–396. [Online]. Available: http://doi.acm.org/10.1145/2771783.
2007. 2771812
[6] Y.-W. Bai and M.-F. Hsueh, “Using an adaptive iterative learning [25] H. Yanco and J. Drury, “Classifying human-robot interaction: an updated
algorithm for planning of the path of an autonomous robotic vacuum taxonomy,” in Proceedings of the IEEE International Conference on
cleaner,” in Proceedings of the 1st IEEE Global Conference on Con- Systems, Man and Cybernetics, vol. 3, Oct 2004, pp. 2841–2846 vol.3.
sumer Electronics (GCCE), 2012, pp. 401–405. [26] A. Steinfeld, T. Fong, D. Kaber, M. Lewis, J. Scholtz, A. Schultz,
[7] J. Li and W. Wong, “Automatic test generation from communicating and M. Goodrich, “Common metrics for human-robot interaction,” in
extended finite state machine (CEFSM)-based models,” in Proceedings Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-
of 5th IEEE International Symposium on Object-Oriented Real-Time robot Interaction, ser. HRI ’06. New York, NY, USA: ACM, 2006, pp.
Distributed Computing. (ISORC 2002), 2002, pp. 181–185. 33–40.
[8] R. Lill and F. Saglietti, “Model-based testing of autonomous systems [27] J. MacDougall and G. Tewolde, “Tour guide robot using wireless based
based on coloured petri nets,” in ARCS Workshops (ARCS), 2012, pp. localization,” in Proceedings of the IEEE International Conference on
1–5. Electro/Information Technology (EIT), May 2013, pp. 1–6.
[9] J. Tretmans, “Model based testing with labelled transition systems,” in [28] W. Burgard, A. B. Cremers, D. Fox, D. Hähnel, G. Lakemeyer,
Formal Methods and Testing, ser. Lecture Notes in Computer Science, D. Schulz, W. Steiner, and S. Thrun, “Experiences with an interactive
R. Hierons, J. Bowen, and M. Harman, Eds. Springer Berlin Heidelberg, museum tour-guide robot,” Artif. Intell., vol. 114, no. 1-2, pp. 3–55, Oct.
2008, vol. 4949, pp. 1–38. 1999.
[10] M. Shirole and R. Kumar, “UML behavioral model based test case [29] S. Thrun, M. Bennewitz, W. Burgard, A. Cremers, F. Dellaert, D. Fox,
generation: A survey,” Softw. Eng. Notes, SIGSOFT, vol. 38, no. 4, pp. D. Hahnel, C. Rosenberg, N. Roy, J. Schulte, and D. Schulz, “Minerva:
1–13, Jul. 2013. a second-generation museum tour-guide robot,” in Proceedings of the
[11] M. Iqbal, A. Arcuri, and L. Briand, “Empirical investigation of search IEEE International Conference on Robotics and Automation, vol. 3,
algorithms for environment model-based testing of real-time embedded 1999, pp. 1999–2005 vol.3.
software,” in Proceedings of the 2012 International Symposium on [30] D. Bohus, C. W. Saw, and E. Horvitz, “Directions robot: In-the-wild
Software Testing and Analysis, ser. ISSTA 2012. New York, NY, USA: experiences and lessons learned,” in Proceedings of the 2014 Interna-
ACM, 2012, pp. 199–209. tional Conference on Autonomous Agents and Multi-agent Systems, ser.
[12] A. Andrews, M. Abdelgawad, and A. Gario, “Towards world model- AAMAS ’14. Richland, SC: International Foundation for Autonomous
based test generation in autonomous systems,” in Proceedings of the 3rd Agents and Multiagent Systems, 2014, pp. 637–644.
International Conference on Model-Driven Engineering and Software [31] T. Kanda, H. Ishiguro, T. Ono, M. Imai, T. Maeda, and R. Nakatsu,
Development (MODELSWARD) 2015. SCITEPRESS Digital Library, “Development of robovie as a platform for everyday-robot research,”
2015, pp. 165–176. Electronics and Communications in Japan (Part III: Fundamental Elec-
[13] P. Ammann and J. Offutt, Introduction to Software Testing, 1st ed. tronic Science), vol. 87, no. 4, pp. 55–65, 2004.
32 Avenue of the Americas, New York, NY 10013, USA: Cambridge [32] D. Brscić, H. Kidokoro, Y. Suehiro, and T. Kanda, “Escaping from
University Press, 2008. children’s abuse of social robots,” in Proceedings of the Tenth Annual
[14] R. Yang and C.-G. Chung, “A path analysis approach to concurrent ACM/IEEE International Conference on Human-Robot Interaction, ser.
program testing,” in Proceedings of the 9th Annual International Phoenix HRI ’15. New York, NY, USA: ACM, 2015, pp. 59–66.
Conference on Computers and Communications, Mar 1990, pp. 425– [33] T. Kanda, D. Glas, M. Shiomi, and N. Hagita, “Abstracting people’s tra-
432. jectories for social robots to proactively approach customers,” Robotics,
[15] A. Dias-Neto, R. Subramanyan, M. Vieira, and G. H. Travassos, “A IEEE Transactions on, vol. 25, no. 6, pp. 1382–1396, Dec 2009.
survey on model-based testing approaches: A systematic review,” in [34] M. Shiomi, D. Sakamoto, T. Kanda, C. Ishi, H. Ishiguro, and N. Hagita,
Proceedings of the 1st ACM International Workshop on Empirical “Field trial of a networked robot at a train station,” International Journal
Assessment of Software Engineering Languages and Technologies: Held of Social Robotics, vol. 3, no. 1, pp. 27–40, 2011.
in Conjunction with the 22Nd IEEE/ACM International Conference on [35] M. Shiomi, T. Kanda, H. Ishiguro, and N. Hagita, “A larger audience,
Automated Software Engineering (ASE) 2007. ACM, 2007, pp. 31–36. please!: Encouraging people to listen to a guide robot,” in Proceedings
[16] J. Zander, I. Schieferdecker, and P. J. Mosterman, Model-based testing of the 5th ACM/IEEE International Conference on Human-robot Inter-
for embedded systems, 1st ed. CRC Press, 2012. action. IEEE Press, 2010, pp. 31–38.
[17] M. Utting, A. Pretschner, and B. Legeard, “A taxonomy of model-based [36] ——, “Interactive humanoid robots for a science museum,” in Pro-
testing approaches,” Softw. Test. Verif. Reliab., vol. 22, no. 5, pp. 297– ceedings of the 1st ACM SIGCHI/SIGART Conference on Human-robot
312, August 2012. Interaction. ACM, 2006, pp. 305–312.
[18] M. Shafique and Y. Labiche, “A systematic review of state-based test [37] M. Bouchard, J. Haegele, and H. Hexmoor, “Crowd dynamics of be-
tools,” International Journal on Software Tools for Technology Transfer, havioural intention: train station and museum case studies,” Connection
pp. 1–18, 2013. Science, pp. 1–24, 2014.
[19] A. Jacoff, E. Messina, B. Weiss, S. Tadokoro, and Y. Nakagawa, “Test
arenas and performance metrics for urban search and rescue robots,” in
Proceedings of the IEEE International Conference on Intelligent Robots
and Systems (IROS), vol. 4, Oct 2003, pp. 3396–3403 vol.3.
[38] D. Brand and P. Zafiropulo, “On communicating finite-state machines,” [40] M. Sighireanu, C. Chaudet, H. Garavel, M. Herbert, R. Mateescu, and
J. ACM, vol. 30, no. 2, pp. 323–342, Apr. 1983. B. Vivien, “LOTOS NT user manual,” 2000.
[39] K. T. Cheng and A. Krishnakumar, “Automatic functional test generation [41] H. Garavel, F. Lang, R. Mateescu, and W. Serwe, “CADP 2011: a
using the extended finite state machine model,” in Proceedings of the toolbox for the construction and analysis of distributed processes,”
30th International Design Automation Conference, ser. DAC ’93. New International Journal on Software Tools for Technology Transfer, vol. 15,
York, NY, USA: ACM, 1993, pp. 86–91. no. 2, pp. 89–107, 2013.