<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Experiences with an Approach to Abstract Handling of Content for Human Machine Interaction Applications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Richard Schmidt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Johannes Fonfara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sven Hellbach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hans-J. Böhme</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Lab, University of Applied Sciences</institution>
          ,
          <addr-line>Dresden</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Current robotic software frameworks lack a means to aid in the generation, validation, and presentation of high-quality content for user interaction. This paper introduces a new approach to extend a basic robotic software framework with a layer for content management. This layer can control the content presentation subsystems that are already integrated. We introduce abstract dialog acts as a centerpiece for creating and handling robot behavior, including user interactions. A simple file format is used to edit the dialog act structure, which allows dialog creation to be delegated to domain experts in the desired field. We demonstrate that the creation of different sets of dialog acts allows the implementation of completely different use cases without requiring any changes to existing software components.</p>
      </abstract>
      <kwd-group>
        <kwd>dialog content</kwd>
        <kwd>content creation</kwd>
        <kwd>corpus building</kwd>
        <kwd>human machine interaction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In recent years, the field of human machine interaction (HMI) has drawn
more and more attention within the robotics community. Interactions with
human users play a key role in numerous disciplines such as robotic guidance,
entertainment, and ambient assisted living.</p>
      <p>There are plenty of ways for robotic platforms to communicate with
human users. The most common ones are to display text, images, and videos on
a mounted screen, and output speech – either prerecorded or generated by a
text-to-speech system. Touch screens and automated speech recognition systems
along with dialog management systems receive and process the input from the
user. A schematic overview can be seen in Fig. 1.</p>
      <p>For researchers and developers working in this field, the following three
problems usually arise:</p>
      <p>This work was supported by ESF grant number 100076162.</p>
      <p>[Fig. 1: Schematic overview of the interaction between user and robot. Labels in the figure: speech, movement, touchscreen input; microphone, cameras, laser range finder, touchscreen; speakers, projector; speech, images, videos.]</p>
    </sec>
    <sec id="sec-3">
      <title>Robot</title>
      <p>
        Missing Presentation Middleware. In the robotics community, several
software packages such as ROS [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], Player/Stage [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and MIRA [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] aid the
implementation process of real-world robotic applications. These frameworks are very
helpful for abstracting hardware access, interconnecting software modules, and developing
algorithms even up to a behavioral level. Yet they all lack a focus on the
interaction with humans, as they were never designed specifically for this
purpose.
      </p>
      <p>Content Creation. In real-world applications, the dialog content<sup>1</sup> that the
robot can present has to be gathered, edited, and evaluated. Unfortunately, the
developers of a robotic application commonly do not have the expert knowledge
to generate high-quality dialog content for the robot’s operational scenario.
Domain experts should therefore be enabled to author such content instead. Furthermore,
the authored dialog content usually has to be tested and evaluated in real-world
scenarios, as this may reveal additional ways to improve the content.</p>
      <p>Corpus Building. Log data from previous deployments might be needed to
create a data collection allowing further analysis, also known as a corpus. This
corpus can be used to develop and tune speech recognition or dialog management
systems and adapt them to the content. But as long as these systems are not
yet capable of performing a user-satisfactory dialog autonomously, reliable data
is hard to obtain.</p>
      <p>In the following section, we formulate the requirements for an HMI-capable
museum tour guide robot (see Fig. 2). In Sect. 3, we describe our proposed
framework extension to fulfill the requirements. We continue with a discussion of how
we applied our extension to multiple real-world use cases in Sect. 4. Sect. 5
concludes this paper with an evaluation of our approach and gives an outlook on
possible subsequent work.</p>
      <p><sup>1</sup> The term dialog content herein comprises everything that is used for user interaction,
such as speech, text, images, videos, and even interactive applications like games.</p>
      <p>1.1 Robotic Platform</p>
      <p>For our experiments we used a Scitos G5 robot by MetraLabs GmbH<sup>2</sup>, as shown
in Fig. 2. Its anthropomorphic qualities, such as its life-sized proportions and a
movable head, make this platform adequate for HMI applications. A sonar array,
two laser range finders (front and back), microphones, a 360° camera array, and
a depth camera are the sensors on the platform. Speakers, a touchscreen, and
a digital video projector are the devices that allow presentation of information
to visitors.</p>
      <p>1.2 Related Work</p>
      <p>
        Several approaches to multimodal dialog management for robotic
applications, such as MuDiS [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and the dialog system of the BIRON project [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] exist.
Their goal is to enable natural interaction with robot applications by
interpreting input from different modalities, fusing the input, and generating dialog
output accordingly. These systems commonly do not address the aspects of dialog content
authoring, evaluation, and presentation that we focus on in this paper.
      </p>
      <p>
        The Artificial Intelligence Markup Language (AIML) [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is a markup
language that serves as the knowledge base for HMI applications. It shares
similarities with the markup language proposed in this paper, but it is more complex
and lacks means of handling multimodal dialog content. In [
        <xref ref-type="bibr" rid="ref2 ref3">3,2</xref>
        ] the Multimodal
Interaction Markup Language (MIML) is introduced. The language abstracts
global tasks, means of interaction, and low-level modalities. MIML itself, being
a language concept, does not solve the real-world problems that we
address in this paper.
      </p>
      <p><sup>2</sup> http://metralabs.com</p>
      <sec id="sec-3-1">
        <title>Requirements</title>
        <p>
          Our goal is to use the robot as a tour guide in a museum. In this real-world
application, the robot has to inform and entertain visitors who were not trained
to interact with the device. Therefore, we think that spoken natural language
is the best means of interaction. The main reason why robots in public areas still
lack complex dialog capabilities is that speaker-independent speech recognition is
still a challenging task. With this problem in conjunction with a dialog system
that is still in its development phase, a satisfactory spoken user interaction is
not within short-term reach. In order to still be able to gather dialog data for
research and evaluate our already created subsystems under real-world conditions,
we decided to deploy a so-called Wizard of Oz setup [
          <xref ref-type="bibr" rid="ref11 ref9">11,9</xref>
          ], in which a human
operator remotely controls the application. This creates the illusion of an already
completely operational system with spoken dialog interactions in a manner and
quality that we aim to achieve eventually with a completely autonomous system.
        </p>
        <p>For this Wizard of Oz extension to our existing platform, the following main
requirements were formulated:</p>
        <p>Framework Integration. The extension has to be implemented on top of an
already employed robotic framework without requiring extensive modifications
to the framework itself or existing subsystems. This enables the evaluation of
these subsystems, for example navigation and people tracking, in a real – possibly
crowded – environment.</p>
        <p>Remote Operation. A remote operator must be able to control certain high-level
aspects of the interaction and the robot behavior, for example triggering
dialog reactions or letting the robot navigate to waypoints. The operator therefore
works from a remote location, to which video and audio data are streamed from
the robot’s sensors via a wireless connection.</p>
        <p>Multimodal Dialog Presentation. We intend to present dialog content mainly
as natural language output generated by a text-to-speech system on the
robot, together with a touchscreen and a digital video projector presenting
images, videos, and text. The touch capability of the screen should be used
to allow browsing through a graphical user interface.</p>
        <p>Content Creation. In our scenario, the expert knowledge and media files about
museum exhibits are not directly available to the developers. We therefore want
to hand over certain aspects of the content authoring process to the
domain experts of the exhibition. A template structure for all the content has to be
created that is easy enough for the experts to fill in without background
information about the software framework. The development of additional content
authoring tools should be avoided for the sake of simplicity.</p>
        <p>Besides letting domain experts author the content, real-world deployment
sometimes requires the ability to adapt content without much effort, for example
to react to unforeseen changes in the environment.</p>
        <p>Contextual Dialogs. It should be possible to provide a dialog text statement
in different alternatives. This is necessary to adapt dialogs to the current
operational context of the robot and achieve socially acceptable behavior. For example,
groups should be addressed differently than a single person, and facts should be
explained in a way that is easier for children to understand.</p>
        <p>Migration to Dialog System. From the software developer’s point of view, the
extension to our framework has to work independently of whether the dialog
is directed by the remote operator or a dialog management system. A parallel
deployment of a human operator and a dialog management system needs to be
possible as well, which is desirable for an iterative test-and-development cycle.
More and more tasks of the human operator can then be gradually taken over
by an autonomous dialog management system.</p>
        <p>
          Corpus Building. A corpus of speech and interaction between robot and
visitors is needed as the foundation to develop and train a dialog system, as noted
in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. The extension should aid in the process of building such a corpus.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Design and Implementation</title>
        <p>
          We decided to base our dialog system on our General-Robot framework, whose
design is heavily influenced by the actor model [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and thus allows the concurrent
processing of internal messages.
        </p>
        <p>Depending on the design of the robot software framework in use, concurrent
processing might not be necessary or even desired at the message passing and
dispatching level. For an adaptation of our approach to different frameworks, a
simple observer pattern, whereby messages are passed to observers, should be
sufficient.</p>
        <p>The General-Robot framework maintains a set of states, state containers, and
state processors. Data objects representing messages, for example sensor data,
are encapsulated into state objects. These encapsulated state objects are then
enqueued into state containers, which retain either a certain constant number of states
or all states within a certain time horizon. State processors can
observe these containers, in which case they are notified about new states. We
will discuss certain aspects of the design further in this section.</p>
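        <p>The interplay of states, state containers, and state processors described above can be sketched as follows. This is a minimal, single-threaded illustration and not the General-Robot implementation: the names State, StateContainer, observe, and enqueue are assumptions made for this example, only the variant that retains a constant number of states is shown, and observers are notified synchronously rather than concurrently.</p>
        <preformat preformat-type="code">
```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class State:
    """Wraps a message payload (for example sensor data) as a state object."""
    payload: Any


class StateContainer:
    """Retains the most recent states and notifies observers about new ones."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque silently drops the oldest state once it is full.
        self._states = deque(maxlen=capacity)
        self._observers: List[Callable[[State], None]] = []

    def observe(self, callback: Callable[[State], None]) -> None:
        """Register a state processor callback (hypothetical API)."""
        self._observers.append(callback)

    def enqueue(self, state: State) -> None:
        """Store the state, then notify every observer in registration order."""
        self._states.append(state)
        for notify in self._observers:
            notify(state)

    def states(self) -> List[State]:
        """Return the retained states, oldest first."""
        return list(self._states)


received = []
container = StateContainer(capacity=2)
container.observe(lambda state: received.append(state.payload))
for reading in ("scan-1", "scan-2", "scan-3"):
    container.enqueue(State(reading))
```
</preformat>
        <p>In this sketch the observer sees all three readings, while the container itself retains only the two most recent ones.</p>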
        <p>3.1 Dialog Acts</p>
        <p>As mentioned in Sect. 1.1, our robot can present dialog content as uttered speech,
on a touchscreen, and as projected images or videos, of which speech is the most
common and important mode. For natural language generation we use static
text blocks. In our speech module, we use a third-party text-to-speech system
to transform text blocks directly on the robot into an audible signal. Although a
system with prepared audio files would also be possible, we use this approach because
it avoids having to generate new audio files after even minor text changes.</p>
        <p>Formally, a dialog is made up of a series of dialog acts. Every dialog act
consists of one or more atomic system commands and has a unique label that
also serves as a short description. When a dialog act is triggered by either the dialog
system or the remote operator, the dialog act dispatcher sends the associated
commands to the respective subsystems, where they are processed accordingly
(see Fig. 3). In order to formalize dialogs, we use a clear-text markup language
with a focus on human readability, which is shown in Fig. 4. This allows us to
externalize the creation of the dialog text as mentioned in Sect. 2 and also
avoids the need to create additional authoring tools.</p>
        <p>The text blocks that the robot should utter are directly embedded into the
dialog acts file. Although a stricter separation between structure and content –
to which the text belongs – of a dialog act might appear desirable, we find that
the convenience of being able to edit both in one common place is worth the
structural breach and allows the required fast changes to text as mentioned in
Sect. 2.</p>
        <p>As shown with dialog act OK in Fig. 4, it is possible to offer several
alternative texts for one statement. The dialog act dispatcher chooses randomly from
the alternatives. This avoids a tedious listening experience for frequently repeated
dialog acts like YES and NO.</p>
        <p>To allow different text alternatives for different dialog contexts as required
in Sect. 2, we added an extra layer of differentiation, as shown for dialog act
WHERE_FROM, which is available in the alternatives named Text, TextGroup,
and TextFormally. Text is the default and has to exist for all dialog acts where
text utterance is desired. Other alternatives can be created and named freely.
The dialog act dispatcher will choose the one preferred by the dialog management
system.</p>
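        <p>The two levels of text selection described above, first by dialog context and then randomly among alternatives, can be illustrated with a short sketch. It is not the dispatcher's actual code: the dictionary layout, the function name choose_text, and the fallback to the default Text variant when the preferred variant is missing are assumptions made for this example.</p>
        <preformat preformat-type="code">
```python
import random
from typing import Dict, List, Optional

# Hypothetical in-memory form of two dialog acts modeled on Fig. 4.
DIALOG_ACTS: Dict[str, Dict[str, List[str]]] = {
    "OK": {"Text": ["OK!", "Great!", "Splendid!"]},
    "WHERE_FROM": {
        "Text": ["Where do you come from?"],
        "TextGroup": ["Where are you from?"],
        "TextFormally": ["May I ask from where you are?"],
    },
}


def choose_text(label: str, preferred: str = "Text",
                rng: Optional[random.Random] = None) -> str:
    """Pick one utterance for the dialog act with the given label.

    Assumption: if the preferred variant does not exist for this act,
    the default "Text" variant is used instead. Within the chosen
    variant, one alternative is drawn at random.
    """
    rng = rng or random
    act = DIALOG_ACTS[label]
    alternatives = act.get(preferred) or act["Text"]
    return rng.choice(alternatives)


group_question = choose_text("WHERE_FROM", preferred="TextGroup")
confirmation = choose_text("OK", preferred="TextGroup")
```
</preformat>
        <p>Here group_question is the TextGroup wording, whereas confirmation falls back to one of the three default alternatives because OK defines no TextGroup variant.</p>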
        <p>3.2 Command Dispatching</p>
        <p>A command represents a single task to be executed by the robot. Every command
has a certain command type and may or may not carry arguments. There are
command types for every aspect of our existing robotic platform that needs to be
controlled remotely in our museum scenario. An overview of the types is shown
in Tab. 1. Within the framework, commands and arguments get encapsulated
within a state.</p>
        <p>Software components can instruct the dialog act dispatcher to trigger dialog
acts by their label. The dispatcher then looks up the label in its in-memory representation
of the dialog acts file and resolves the act into commands. Submodules – which
are also state processors – can listen to the dispatcher’s command state
container in order to receive notifications when new commands arrive (see Fig. 3).
        </p>
        <p>[Fig. 3: Overview of dialog act dispatching. The dialog management system (with its dialog model) and the remote operator (via WiFi link, network socket, and deserialization) trigger dialog acts. The dialog act dispatcher loads the dialog acts file into a map from dialog acts to commands (DialogAct1: CMD 1A, CMD 1B; DialogAct2: CMD 2A, CMD 2B) and generates commands. A command container passes the commands to the touchscreen, projection, speech, and drive submodules. The touchscreen submodule loads dialog content (images, videos, pages) and can itself trigger dialog acts.]</p>
        <p>
We use the touchscreen and the projector to present visual dialog content to
visitors. Similar to the speech submodule, the corresponding submodules are
controlled by commands from the dialog act dispatcher. But here, the commands
carry a file path to media files as argument. It is up to the submodules to render
the files on the output devices appropriately.</p>
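        <p>The resolution of a dialog act label into commands, and their delivery to listening submodules, can be sketched as follows. This is an illustration under stated assumptions rather than the framework's implementation: the class DialogActDispatcher, its subscribe and trigger methods, and the routing of commands by type are hypothetical, and SAY is an invented command type for speech output. PROJECT and its file path argument are taken from Fig. 4.</p>
        <preformat preformat-type="code">
```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# A command is a (command type, argument) pair, e.g. a media file path.
Command = Tuple[str, str]


class DialogActDispatcher:
    """Resolves dialog act labels into commands and notifies submodules."""

    def __init__(self, acts: Dict[str, List[Command]]) -> None:
        self._acts = acts  # in-memory representation of the dialog acts
        self._listeners = defaultdict(list)

    def subscribe(self, command_type: str,
                  handler: Callable[[str], None]) -> None:
        """Let a submodule listen for commands of one type (assumption)."""
        self._listeners[command_type].append(handler)

    def trigger(self, label: str) -> None:
        """Look up the act by label and send each command to its listeners."""
        for command_type, argument in self._acts[label]:
            for handler in self._listeners[command_type]:
                handler(argument)


spoken: List[str] = []
projected: List[str] = []

dispatcher = DialogActDispatcher({
    "RUN_VIDEO_EXHIBIT": [
        ("SAY", "Let me show you a video about this exhibit."),  # hypothetical type
        ("PROJECT", "Exhibit/video.mpg"),
    ],
})
dispatcher.subscribe("SAY", spoken.append)         # stands in for the speech submodule
dispatcher.subscribe("PROJECT", projected.append)  # stands in for the projection submodule
dispatcher.trigger("RUN_VIDEO_EXHIBIT")
```
</preformat>
        <p>Because the submodules only see commands, the sketch behaves the same whether trigger is called by a remote operator client or by a dialog management system.</p>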
        <p>
          The projection submodule takes care of finding projection regions and
perspective correction which is further described in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. For touchscreen content, we
decided not to restrict ourselves to plain images and videos, but also to use an HTML
renderer. This allows the creation of interactive pages, through which users can
browse by touch gestures. We extended the HTML renderer with the possibility
to trigger dialog acts, which enables more complex reactions to touch gestures,
for example speech output.
        </p>
        <p>[Fig. 4: Dialog acts file in the clear-text markup language.]</p>
        <preformat>
- Label: WELCOME
  Text:
    - Hello!
  Cmds:
    - DISPLAY_PAGE Welcome/Welcome.html
- Label: OK
  Text:
    - OK!
    - Great!
    - Splendid!
- Label: WHERE_FROM
  Text:
    - Where do you come from?
  TextGroup:
    - Where are you from?
  TextFormally:
    - May I ask from where you are?
- Label: RUN_VIDEO_EXHIBIT
  Text:
    - Let me show you a video about this exhibit.
  Cmds:
    - PROJECT Exhibit/video.mpg
</preformat>
        <p>It should be noted that the capabilities of the projection and touchscreen
submodules are not limited to preexisting media files, as we can display
everything that can be rendered to a pixel buffer. Therefore, the modules allow the
presentation of media content generated at runtime – for example interactive games
– which we plan to integrate in the future.</p>
        <p>3.4 Client/Server Communication</p>
        <p>We developed a remote control client that connects to a server component on
the robot over the network. The server itself is an extension to an existing robot
software stack; it acquires live camera images and audio from the sensors and
streams them over the network to the client, which plays them back to the
remote operator.</p>
        <p>On startup, the client loads and parses a dialog content file. The remote
operator can trigger each dialog act by clicking the corresponding button. A
screenshot can be seen in Fig. 5. The client then forwards the triggered dialog
act over the network to the dialog act dispatcher on the robot. There, the dialog
acts get resolved into commands, which are sent to the listening subsystems. It
should be noted that it does not matter to the subsystems where the commands
come from, which allows a seamless migration between remote and autonomous
dialog operation as required in Sect. 2.</p>
        <p>In this section, we discuss the usage of our dialog content extension in
real-world applications, regarding the tour guide use case and the building of a
corpus for dialog management systems. Over time, other use cases emerged that
we wanted to realize with our existing robot hardware and software. Our extension
proved to be quite flexible and could also handle these new use cases with little
to no modification to the existing setup. We discuss two of these additional use
cases in Sect. 4.3 and Sect. 4.4.</p>
        <p>We deploy our system in an exhibition of vintage computer hardware. As
developers do not have access to all the resources and knowledge of the museum staff,
a major part of the content authoring for the tour guide was done by the staff.
To ease the process for them, we provided documentation and a template dialog
acts structure that could be used as a building block for different exhibits.</p>
        <p>We used the remote operation capability to give personalized tours to
single visitors or smaller groups. In this setup, the remote workstation is
hidden from the visitors and connected to the robot via wireless LAN. Our
approach proved well suited to providing entertaining tours to visitors and gathering
real-life data of dialog interactions. We were also able to evaluate already
existing submodules, like people tracking and path planning, in a candid real-world
environment.</p>
        <p>
          Remotely operating the robot has proven to be a complex task, as the operator
has to choose an appropriate dialog option from a wide range of possibilities in
a short period of time [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
        <p>4.2 Corpus Building</p>
        <p>To build a corpus, we use the logging capabilities of the robotic framework to
record audio data and camera video streams. The dialog interactions triggered by
the remote operator are recorded by a simple extension to the framework’s
logging capabilities. Because current speech recognition systems are not reliable enough
in our operational scenarios, we have to resort to manual annotation of
the recorded sensor data to comprehensively record the dialog interaction from
the visitors towards the robot.</p>
        <p>During several sessions we gathered a corpus of about six hours of audio and
video material, showing genuine interactions between robot and visitors. The
corpus consists of 133 dialogs involving 378 test subjects. We annotated the
corpus, distinguishing about 30 different dialog situations, and transcribed
all spoken utterances from the visitors. Additionally, major movement actions
and the location and attention of the user were labeled.</p>
        <p>The corpus analysis was very helpful in many ways. Firstly, it gave us a
general feeling for the type of behavior to expect from visitors interacting with
our system. We were able to designate four main classes of interaction behavior:
interested, chatting, passively interested, and not interested. Surprisingly, most
people showed chatting behavior, which included a lot of small talk before their
interest shifted towards the museum exhibits.</p>
        <p>
          Secondly, we computed various statistics of user behavior which were used
to train a user simulator. The simulator model used is described in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Using
this simulator, we were able to reproduce versatile interactions of the tour guide
scenario and train a dialog management system.
        </p>
        <p>Thirdly, having all the user utterances annotated, we tested several
algorithms for text classification. This allowed us to build a natural language
processing module that can make use of a large-vocabulary speech recognizer to
recognize a broad range of speech inputs.</p>
        <p>4.3 Info Terminal</p>
        <p>We deployed a robot as an advanced info terminal at a variety of venues. There,
the robot is meant to provide basic information, such as schedules and maps of the
location, to users via the touchscreen. Remote dialog operation was not used.</p>
        <p>This use case has been implemented without modifications to our software,
only by creating dialog content. We had to write a dialog acts file and appropriate
browsable pages for the touchscreen. The ability to trigger dialog acts from
page elements such as buttons (see Sect. 3.3) allowed users not only to move
between pages, but also to have the robot verbally describe the pages.</p>
        <p>We tested our info terminal application on various occasions and it behaved
as expected. To further improve this use case, a simple dialog system could
be added that employs data from our people tracker to allow the robot to react
to nearby persons and automatically advertise itself as a source of information.</p>
        <p>4.4 Poster Presenter</p>
        <p>We also wanted to use our robot in an entertaining way as a presenter for posters
at exhibitions, workshops, and conferences. To present a poster, the robot
highlights a certain area on a poster pinned to a wall using its projector, while
uttering speech towards its audience. The touchscreen is used to show
supplementary information. After a poster area has been explained, the robot proceeds
to the next one.</p>
        <p>For this setup, the projection submodule is used to simply project black
images containing white patches matching the areas of the poster. Every poster
section is represented by a dialog act. The dialog progress is controllable either
remotely by a remote operator or automatically. For the automatic progression,
we use JavaScript timers in the touchscreen HTML page to trigger the following
dialog act.</p>
        <p>
          Both variations were tested successfully on various occasions. However, the
additional flexibility that currently only the operator can provide allows for
a sometimes desirable variation from an otherwise static flow of information
towards the user. Further information about the projection setup of this
application is presented in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Conclusion</title>
        <p>In this paper, we proposed a concept to deal with the problem of dialog
content handling in robotic applications. The described software stack covers not only
the organization of dialog content and its presentation, but also the
authoring phase. By involving domain experts with the required domain knowledge
in the authoring phase, the dialog content can become more useful and achieve a
higher quality. In the end, this will increase the usefulness of the resulting robotic
application to the users and might improve their impression of how interesting and
pleasant the interaction with the robot turns out to be.</p>
        <p>The presented approach proved highly versatile and flexible, as it allowed
the realization of different applications by merely authoring additional content.
Uttered texts are the main focus, but also on-screen content, images, and videos
are considered.</p>
        <p>Because the dialog flow can be controlled remotely, the approach allows building
a corpus of real-world dialog data needed for the further improvement of dialog
systems. The migration to a completely autonomous dialog system can be done
directly using the existing implementation.</p>
        <p>Further Work. The extension of the framework towards our application is fairly
complete. However, there is still room for improvement. As we are employing
predefined text blocks as the foundation for spoken utterances, an extension
towards less static representations of text might be desirable for further iterations
of our dialog system.</p>
        <p>With regard to the requirements noted in Sect. 2, the content creation phase,
and especially the externalization aspect, could be optimized further. Even though we
employed a very simple markup language to hold dialog actions, specially written
editing software could still prove more user friendly. Such software could easily
be integrated into the current workflow.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Agha</surname>
          </string-name>
          , G.:
          <article-title>ACTORS: A Model of Concurrent Computation in Distributed Systems</article-title>
          . MIT Press, Cambridge, MA, USA (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Araki</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Proposal of a markup language for multimodal semantic interaction</article-title>
          .
          <source>In: Proceedings of the 2007 Workshop on Multimodal Interfaces in Semantic Interaction</source>
          . pp.
          <fpage>58</fpage>
          -
          <lpage>62</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Araki</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tachibana</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Multimodal dialog description language for rapid system development</article-title>
          .
          <source>In: Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue</source>
          . pp.
          <fpage>109</fpage>
          -
          <lpage>116</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Donner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Himstedt</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellbach</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Böhme</surname>
            ,
            <given-names>H.J.:</given-names>
          </string-name>
          <article-title>Awakening history: Preparing a museum tour guide robot for augmenting exhibits</article-title>
          .
          <source>In: Proceedings of the European Conference on Mobile Robots (ECMR)</source>
          . pp.
          <fpage>337</fpage>
          -
          <lpage>342</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Einhorn</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Langner</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stricker</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gross</surname>
            ,
            <given-names>H.M.</given-names>
          </string-name>
          :
          <article-title>MIRA - middleware for robotic applications</article-title>
          .
          <source>In: Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS)</source>
          . pp.
          <fpage>2591</fpage>
          -
          <lpage>2598</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Fonfara</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellbach</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Böhme</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          :
          <article-title>Learning Dialog Management for a Tour Guide Robot using Museum Visitor Simulation</article-title>
          .
          <source>In: Proceedings of the Workshop - New Challenges in Neural Computation 2012 (NC2)</source>
          . pp.
          <fpage>61</fpage>
          -
          <lpage>68</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Gerkey</surname>
            ,
            <given-names>B.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vaughan</surname>
            ,
            <given-names>R.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howard</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Player/Stage project: Tools for multi-robot and distributed sensor systems</article-title>
          .
          <source>In: Proceedings of the 11th International Conference on Advanced Robotics</source>
          . pp.
          <fpage>317</fpage>
          -
          <lpage>323</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Giuliani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kaßecker</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwärzler</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bannat</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gast</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wallhoff</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mayer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wimmer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wendt</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>MuDiS - a multimodal dialogue system for human-robot interaction</article-title>
          .
          <source>In: Proc. 1st Intern. Workshop on Cognition for Technical Systems</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Poschmann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Donner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahrmann</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudolph</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fonfara</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellbach</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Böhme</surname>
            ,
            <given-names>H.J.</given-names>
          </string-name>
          :
          <article-title>Wizard of Oz revisited: Researching on a tour guide robot while being faced with the public</article-title>
          .
          <source>In: 21st IEEE Int. Symposium on Robot and Human Interactive Communication (RO-MAN)</source>
          . pp.
          <fpage>701</fpage>
          -
          <lpage>706</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Quigley</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Conley</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gerkey</surname>
            ,
            <given-names>B.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Faust</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Foote</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leibs</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wheeler</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A.Y.</given-names>
          </string-name>
          :
          <article-title>ROS: an open-source Robot Operating System</article-title>
          .
          <source>In: ICRA Workshop on Open Source Software</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Shiomi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kanda</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koizumi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ishiguro</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hagita</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Group attention control for communication robots with wizard of oz approach</article-title>
          .
          <source>In: Proceedings of Conference on Human-Robot Interaction (HRI)</source>
          . pp.
          <fpage>121</fpage>
          -
          <lpage>128</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Toptsis</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wrede</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fink</surname>
            ,
            <given-names>G.A.</given-names>
          </string-name>
          :
          <article-title>A multi-modal dialog system for a mobile robot</article-title>
          .
          <source>In: Proc. Int. Conf. on Spoken Language Processing</source>
          . pp.
          <fpage>273</fpage>
          -
          <lpage>276</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Wallace</surname>
            ,
            <given-names>R.S.</given-names>
          </string-name>
          :
          <article-title>The anatomy of A.L.I.C.E.</article-title>
          In:
          <string-name>
            <surname>Epstein</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Beber</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (eds.)
          <source>Parsing the Turing Test</source>
          , pp.
          <fpage>181</fpage>
          -
          <lpage>210</lpage>
          . Springer Netherlands (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>