<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Designing Multimodal Interactive Systems Using EyesWeb XMI</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name><given-names>Gualtiero</given-names> <surname>Volpe</surname></string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name><given-names>Paolo</given-names> <surname>Alborno</surname></string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name><given-names>Antonio</given-names> <surname>Camurri</surname></string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name><given-names>Paolo</given-names> <surname>Coletta</surname></string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name><given-names>Simone</given-names> <surname>Ghisio</surname></string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Genova, DIBRIS, Genova</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <fpage>49</fpage>
      <lpage>56</lpage>
      <abstract>
        <p>This paper introduces the EyesWeb XMI platform (for eXtended Multimodal Interaction) as a tool for fast prototyping of multimodal systems, including the interconnection of multiple smart devices, e.g., smartphones. EyesWeb is endowed with a visual programming language enabling users to compose modules into applications. Modules are collected in several libraries and include support for many input devices (e.g., video, audio, motion capture, accelerometers, and physiological sensors), output devices (e.g., video, audio, 2D and 3D graphics), and synchronized multimodal data processing. Specific libraries are devoted to real-time analysis of nonverbal expressive motor and social behavior. The EyesWeb platform encompasses further tools such as EyesWeb Mobile, which supports the development of customized Graphical User Interfaces for specific classes of users. The paper reviews the EyesWeb platform and its components, starting from its historical origins, and with a particular focus on Human-Computer Interaction aspects.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title />
      <p>Copyright is held by the author/owner(s).
AVI, June 07–10, 2016, Bari, Italy.</p>
    </sec>
    <sec id="sec-2">
      <title>Author Keywords</title>
      <p>Multimodal interactive systems; visual programming
languages; EyesWeb XMI</p>
    </sec>
    <sec id="sec-3">
      <title>ACM Classification Keywords</title>
      <p>H.5.2 [Information interfaces and presentation (HCI)]: User
interfaces</p>
    </sec>
    <sec id="sec-4">
      <title>Introduction</title>
      <p>Summer 1999, Opening of Salzburg Festival, Austria: In
the music theatre opera Cronaca del Luogo by Italian
composer Luciano Berio, a major singer (David Moss) plays a
schizophrenic character, at times appearing wise and calm,
at other times appearing crazy, with nervous and jerky
movements. Some of his movement qualities are
automatically extracted by using sensors embedded in his clothes
and a flashing infrared light on his helmet, synchronized
with video cameras positioned above the stage. This
information is used to morph the singer’s voice from profound
(wise) to a harsh, sharp (crazy) timbre. The encounter with
such concrete real-world applications of multimodal
analysis and mapping was of paramount importance in shaping
the requirements for our first publicly available version of
EyesWeb [1].</p>
      <sec id="sec-4-1">
        <title />
        <p>In particular, the need for fast prototyping tools made us
abandon the concept of EyesWeb as a monolithic
application, to be recompiled and rebuilt after any minor
change, and move to a more flexible approach, which
had already been adopted by other software
platforms, both in the tradition of computer music programming
languages and tools and in other domains such as
simulation tools for system engineering. EyesWeb was thus
conceived as a modular software platform, where a user can
assemble single modules into an application by means of
a visual programming language. As such, EyesWeb
supports its users in designing and developing interactive
multimodal systems in several ways, for example (i) by
providing built-in input/output capabilities for a broad range
of sensor and capture systems, (ii) by making it easy to
define and customize how data is processed and feedback
is generated, and (iii) by offering tools for creating a wide
palette of interfaces for different classes of users.
Since then, EyesWeb has been reworked, improved, and
extended over the years, going through five major versions
while always remaining available for free from
http://www.casapaganini.org (the current release is EyesWeb
5.6.0.0). Nowadays, it is employed in various application
domains beyond the original area of computer music and
performing arts, including active experience of cultural heritage,
exergaming, education and technology-enhanced learning, and
therapy and rehabilitation.</p>
        <p>This paper is organized as follows: the next section presents
some related work, i.e., other modular platforms endowed
with a visual programming language with a particular
reference to multimedia and multimodal systems; then, the
major components of the EyesWeb platform are introduced;
finally, the different classes of users for EyesWeb and the
reasons that make it suitable for fast prototyping of
applications including interconnection of smart objects are
discussed from an HCI perspective.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Related work</title>
      <p>
        Whereas general-purpose tools such as
Mathworks’ Simulink have existed for a long time, platforms
especially devoted to (possibly real-time) analysis of multimodal
signals are far less common. Max [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is a platform and a visual
programming language for music and multimedia, originally
conceived by Miller Puckette at IRCAM, Paris, and
nowadays developed and maintained by Cycling ’74. Born for
sound and music processing in interactive computer
music, it is also endowed with packages for real-time video,
3D graphics, and matrix processing. Pd (Pure Data) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
is similar in scope and design to Max. It also includes a
visual programming language and it is intended to support
development of interactive sound and music computing
applications. The addition of GEM (Graphics Environment for
Multimedia) enables real-time generation and processing of
video, OpenGL graphics, images, and so on. Moreover, Pd
is natively designed to enable live collaboration across
networks or the Internet. vvvv [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is a hybrid graphical/textual
programming environment for easy prototyping and
development. It has a special focus on real-time video
synthesis and it is designed to facilitate the handling of large
media environments with physical interfaces, real-time motion
graphics, audio and video that can interact with many users
simultaneously. Isadora [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is an interactive media
presentation tool created by composer and media-artist Mark
Coniglio. It mainly includes video generation, processing,
and effects and is intended to support artists in developing
interactive performances. In the same field of performing
arts, Eyecon [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] aims at facilitating interactive performances
and installations in which the motion of human bodies is
used to trigger or control various other media, e.g., music,
sounds, photos, films, lighting changes, and so on. The
Social Signal Interpretation framework (SSI) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] offers tools
to record, analyze, and recognize human behavior in real
time, such as gestures, facial expressions, head nods, and emotional
speech. Following a patch-based design, pipelines are set
up from autonomic components and allow the parallel and
synchronized processing of sensor data from multiple input
devices.
      </p>
      <p>Whereas Max and Pd are especially devoted to audio
processing, and vvvv to video processing, EyesWeb has a
special focus on higher-level nonverbal communication,
i.e., EyesWeb provides modules to automatically compute
features describing the expressive, emotional, and affective
content that multimodal signals convey, with particular reference
to full-body movement and gesture. EyesWeb also has a
somewhat wider scope with respect to Isadora, which
especially addresses interactive artistic performances, and
to SSI, which is particularly suited for analysis of social
interaction. Finally, with respect to its previous versions, the
current version of EyesWeb XMI encompasses enhanced
synchronization mechanisms, improved management and
analysis of time-series (e.g., with novel modules for analysis
of synchronization and coupling), extended scripting
capabilities (e.g., a module whose behavior can be controlled
through Python scripts), and a reorganization of the
EyesWeb libraries including novel supported I/O devices (e.g.,
Kinect V2) and modules for expressive gesture processing.</p>
    </sec>
    <sec id="sec-6">
      <title>EyesWeb kernel, tools, libraries, and devices</title>
      <p>Figure 1 shows the overall architecture of EyesWeb (the
current version is named XMI, i.e., for eXtended Multimodal
Interaction).</p>
      <sec id="sec-6-1">
        <title>The EyesWeb GDE</title>
        <p>The EyesWeb Graphic Development Environment
tool (GDE), shown in Figure 2, manages the
interaction with the user and supports the design of applications
(patches). An EyesWeb patch consists of interconnected
modules (blocks). A patch can be defined as a structured
network of blocks that channels and manipulates a digital
input dataflow resulting in a desired output dataflow. Data
manipulation can be done either automatically or through
real-time interaction with a user. For example, to create
a simple video processing patch, an EyesWeb developer
would drag and drop an input video block (e.g., to capture
the video stream from a webcam), a processing block (e.g.,
a video effect), and an output block (e.g., a video display).
The developer would then connect the blocks and she may
also include some interaction with the user, e.g., by adding
a block that computes the energy of the user’s movement
and connecting it to the video effect block to control the
amount of effect to be applied.</p>
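        <p>As an illustration, the webcam-effect-display patch just described can be sketched as a tiny dataflow of Python callables. This is a didactic analogy only: EyesWeb expresses this graph visually rather than in code, and all block names and data below are invented.</p>
        <preformat>
```python
# Sketch of the described patch: webcam -> effect -> display, with a
# movement-energy block controlling the amount of effect applied.
def webcam():                      # input block: yields fake "frames"
    for frame in ([0, 1, 2], [2, 3, 4], [8, 9, 10]):
        yield frame

def energy(frame, prev):           # analysis block: mean absolute change
    if prev is None:
        return 0.0
    return sum(abs(a - b) for a, b in zip(frame, prev)) / len(frame)

def effect(frame, amount):         # processing block: scale pixel values
    return [round(p * (1.0 + amount)) for p in frame]

def display(frame):                # output block: stand-in for a window
    print(frame)

prev = None
for frame in webcam():             # the "links" of the patch
    e = energy(frame, prev)
    display(effect(frame, amount=e))
    prev = frame
```
        </preformat>
        <p>In EyesWeb the same topology is obtained by dragging four blocks onto the canvas and drawing the links; no textual code is written.</p>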
        <p>The build-up of an EyesWeb patch in many ways
resembles that of an object-oriented program (a network of
clusters, classes, and objects). A patch can therefore be
interpreted as “a small program or application”. The sample
patch displayed in Figure 2 was built for performing
synchronized playback of multimodal data. In particular, this
example displays motion capture (MoCap) data obtained
from a MoCap system (Qualisys). The recorded MoCap
data is synchronized with the audio and video tracks of the
same music performance (a string quartet performance).</p>
      </sec>
      <sec id="sec-6-2">
        <title>The EyesWeb kernel</title>
        <p>The core of EyesWeb is its kernel. It
manages the execution of patches by scheduling each block,
it handles the data flow, it notifies events to the user interface,
and it is responsible for enumerating and organizing blocks
in catalogs, each including a set of coherent libraries. The
kernel works as a finite state machine with two major
states: design-time and run-time. At design-time, users
design and develop their patches, which are then executed
at run-time. Patches are internally represented by a graph
whose nodes are the blocks and whose edges are the links
between the blocks. In the transition between design and
run time, the kernel performs a topological sort of the graph
associated with the patch to be started and thus establishes the
order for scheduling the execution of each block.</p>
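        <p>The design-to-run transition can be mimicked with the topological sort in the Python standard library. This is a sketch of the scheduling idea, not the kernel’s actual code; the example patch is invented.</p>
        <preformat>
```python
# Scheduling the blocks of a patch by topologically sorting the graph
# of links, as the kernel does when a patch is started.
from graphlib import TopologicalSorter

# Each block maps to the set of blocks it depends on (its inputs).
patch = {
    "display": {"effect"},            # display reads the effect output
    "effect":  {"webcam", "energy"},  # effect reads frames and energy
    "energy":  {"webcam"},            # energy computed from frames
    "webcam":  set(),                 # capture block has no inputs
}

schedule = list(TopologicalSorter(patch).static_order())
print(schedule)  # every block appears after all of its inputs
```
        </preformat>
        <p>Executing blocks in this order guarantees that, within one scheduling pass, each block consumes data its upstream blocks have already produced.</p>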
      </sec>
      <sec id="sec-6-3">
        <title>The EyesWeb libraries</title>
        <p>
          EyesWeb libraries include modules for image and video
processing, for audio processing, for mathematical
operations on scalars and matrices, for string processing, for
time-series analysis, and for machine learning (e.g., SVMs,
clustering, neural networks, Kohonen maps, and so on). They
also implement basic data structures (e.g., lists, queues,
and labeled sets) and enable connecting with the
operating system (e.g., for launching processes or operating on
the filesystem). Particularly relevant in EyesWeb are the
libraries for real-time analysis of nonverbal full-body
movement and expressive gesture of single and multiple users
[
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], and the libraries for the analysis of nonverbal social
interaction within groups [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The former include modules for
computing features describing human full-body movement
both at a local temporal granularity (e.g., kinematics,
energy, postural contraction and symmetry, smoothness, and
so on) and at the level of entire movement units (e.g.,
directness, lightness, suddenness, impulsivity, equilibrium,
fluidity, coordination, and so on). The latter implement
techniques that have been employed for analyzing features
of the social behavior of a group, such as physical and
affective entrainment and leadership. Techniques include, e.g.,
Recurrence Quantification Analysis [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], Event
Synchronization [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], SPIKE Synchronization [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], and nonlinear
asymmetric measures of interdependence between time-series.
        </p>
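        <p>By way of example, a strongly simplified variant of Event Synchronization [11] fits in a few lines of Python: given two lists of event times, count quasi-simultaneous events (closer than a window tau) and normalize. The symmetric window and the normalization below are a didactic reduction of the published measure, not the EyesWeb module, and the event times are invented.</p>
        <preformat>
```python
# Didactic reduction of the Event Synchronization measure [11].
from math import sqrt

def event_sync(x, y, tau):
    """Return Q in [0, 1]: 1 = fully synchronized event trains."""
    def coincidences(a, b):
        # events of `a` that have a matching event of `b` within tau
        return sum(1 for ta in a if any(abs(ta - tb) < tau for tb in b))
    mx, my = len(x), len(y)
    if mx == 0 or my == 0:
        return 0.0
    return (coincidences(x, y) + coincidences(y, x)) / (2 * sqrt(mx * my))

print(event_sync([0.1, 1.0, 2.0], [0.12, 1.05, 2.5], tau=0.1))
```
        </preformat>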
      </sec>
      <sec id="sec-6-4">
        <title>Available Tools</title>
        <p>Referring to Figure 1, in addition to the EyesWeb GDE
described above, further available tools are:
• EyesWeb Mobile: an external tool supporting the
design of Graphical User Interfaces linked to an EyesWeb
patch. EyesWeb Mobile is composed of a designer
and a runtime component. The former is used to
design the visual layout of the interfaces; the latter runs
together with an EyesWeb patch and communicates
over the network with the EyesWeb kernel to receive
results and control the patch remotely.
• EyesWeb Console: a tool that runs patches
from the command line to reduce the GDE overhead.
On Windows, it runs either as a standard application or
as a Windows service; additionally, it can be used
to run patches on Linux and OS X.
• EyesWeb Query: a tool to automatically generate
documentation from a specific EyesWeb
installation, including the icon, description, and input and output
datatypes of each block. Documentation can be
generated in LaTeX (PDF), text, and MySQL formats.
• EyesWeb Register Module: a tool for adding
new blocks (provided in a DLL) to an existing
EyesWeb installation, and for using them to make and run
patches. The new blocks extend the platform and
may be developed and distributed by third parties.</p>
      </sec>
      <sec id="sec-6-5">
        <title>Supported devices</title>
        <p>EyesWeb supports a broad range of input and output
devices, which are managed by a dedicated layer (devices)
located in between the kernel and the operating system. In
addition to usual computer peripherals (mouse, keyboard,
joystick and game controllers, and so on), input devices
include audio (from low-cost mother-board-integrated to
professional audio cards), video (from low-cost webcams
to professional video cameras), motion capture systems (to
get with high accuracy 3D coordinates of markers in
environments endowed with a fairly controlled setup), RGB-D
sensors (e.g., Kinect for X-Box One, also known as Kinect
V2, extracting 2D and 3D coordinates of relevant body
joints, and capturing RGB image, grayscale depth image,
and infrared image of the scene), Leap Motion (a sensor
device capturing hand and finger movement),
accelerometers (e.g., on board Android smartphones and connected
via network by means of the Mobiles to EyesWeb app,
see Figure 3; X-OSC sensors, and so on), Arduino,
Nintendo Wiimote, RFID (Radio-Frequency Identification)
devices (e.g., used to detect the presence of a user in a
specific area), and biometric sensors (e.g., respiration, heart
rate, skin conductivity, and so on). Output includes audio,
video, 2D and 3D graphics, and possible control signals to
actuator devices (e.g., haptic devices, robots). Moreover,
EyesWeb implements standard networking protocols such
as, for example, TCP, UDP, OSC (Open Sound Control),
and ActiveMQ. In such a way, it can receive and send data
from/to any device, including smart objects, endowed with
network communication capabilities.</p>
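        <p>For instance, a networked smart device can feed a patch by sending an OSC message over UDP. The following self-contained Python sketch packs a single float according to the OSC 1.0 padding rules; the address pattern and port are invented examples, not EyesWeb defaults.</p>
        <preformat>
```python
# Sending one float over OSC/UDP with no external dependencies: the
# kind of message a smart device could send to an EyesWeb input block.
import socket
import struct

def osc_pad(b: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes (OSC 1.0 rule)."""
    b += b"\x00"
    return b + b"\x00" * (-len(b) % 4)

def osc_message(address: str, value: float) -> bytes:
    return (osc_pad(address.encode("ascii"))
            + osc_pad(b",f")                 # type tag: one float32
            + struct.pack(">f", value))      # big-endian payload

msg = osc_message("/eyesweb/energy", 0.42)   # invented address pattern
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(msg, ("127.0.0.1", 7000))        # port is an assumption
```
        </preformat>
        <p>A patch listening on the same port with a network input block would then receive the value and route it through its links like any other datum.</p>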
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Classes of users and interfaces</title>
      <p>
        Users of EyesWeb can ideally be grouped into several classes.
EyesWeb end users usually do not directly deal with the
platform, but rather experience the multimodal interactive
systems that are implemented using the platform. Their
interface is therefore a natural interface they can operate
by means, e.g., of their expressive movement and gesture
to act upon multimedia content. In such a way, end users
can also interact with smart devices they may wear or hold
(e.g., smartphones). Smart devices can either be used to
directly operate on content, or the data they collect can be
presented back to end users, e.g., using data sonification
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and visualization technologies.
      </p>
      <p>EyesWeb developers make applications (patches) using
the EyesWeb GDE and its visual programming language.
This implements all the basic constructs of programming
languages, such as sequences (a chain of interconnected
blocks), conditional instructions (e.g., by means of switch
blocks that direct the flow of data to a specific part of a
patch when a given condition is matched), iterative
instructions (by means of a specific mechanism that allows a
given sub-patch to be executed repeatedly), and subprograms
(implemented as sub-patches). Because of the visual
programming paradigm, EyesWeb developers do not need to
be computer scientists or expert programmers. In our
experience, EyesWeb patches were developed by artists,
technological staff of artists (e.g., sound technicians),
designers, content creators, students in performing arts and
digital humanities, and so on. Still, some skills in usage of
computer tools and, especially, in algorithmic thinking are
required. EyesWeb developers can exploit EyesWeb as a
tool for fast-prototyping of applications for smart objects:
EyesWeb can receive data from such objects by means of
its input devices and the task of the developer is to design
and implement in a patch the control flow of the application.</p>
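      <p>In conventional programming terms, the constructs listed above can be mimicked as follows. This is only an analogy with invented names, not EyesWeb code: a switch block routes a datum to one of two sub-patches, and an iterative mechanism applies a sub-patch to each element of a stream.</p>
      <preformat>
```python
# Analogy between visual-language constructs and plain Python.
def switch_block(value, condition, if_true, if_false):
    """Conditional: route `value` to one of two downstream sub-patches."""
    return if_true(value) if condition(value) else if_false(value)

def iterate_subpatch(values, subpatch):
    """Iteration: apply a sub-patch to each element of an input stream."""
    return [subpatch(v) for v in values]

doubled = iterate_subpatch([1, 2, 3], lambda v: 2 * v)  # -> [2, 4, 6]
routed = switch_block(5, lambda v: v > 3, str, float)   # -> "5"
```
      </preformat>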
      <sec id="sec-7-1">
        <title>Supervisors of EyesWeb patches</title>
        <p>Supervisors of EyesWeb patches are users who supervise
and control the execution of an EyesWeb application.
They can both set parameters to customize the execution of
the patch before running it and act upon the patch (e.g., by
changing the value of some parameters) while the patch is
running. Consider, for example, a teacher who customizes
an educational serious game for her pupils and operates
on the game as a child is playing with it, or a therapist who
sets e.g., target and difficulty level of an exergame for a
patient. Supervisors of EyesWeb patches do not need any
particular skill in computer programming and indeed they
are usually a special kind of end user, with a specific
expertise in the given application area. The EyesWeb Mobile tool
allows for endowing them with traditional Graphical User
Interfaces they can use for their task of customizing and
controlling the execution of a patch. In such a way, they do not
need to go into the details of the EyesWeb GDE and they
work with an interaction paradigm that is more familiar to
them. Moreover, the patch is protected from possible
unwanted modifications. The EyesWeb Mobile interfaces can
also work on mobile devices (e.g., tablets) to facilitate
operations when the operator cannot stay at the computer (e.g.,
a therapist who needs to participate in the therapy session).
This feature makes EyesWeb Mobile a tool that also suits
remote configuration of smart objects applications. Figure 4
shows an EyesWeb Mobile interface developed for enabling
a teacher to customize a serious game for children.</p>
        <p>EyesWeb programmers develop new software modules for
the EyesWeb platform. They need to be skilled C++
programmers and are provided with the EyesWeb SDK, which
enables them to extend the platform with third-party
modules. In particular, the EyesWeb SDK enables including in
the platform new modules for interfacing smart
objects that, e.g., do not communicate through standard
networking protocols or are not supported by the platform yet.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Conclusion</title>
      <p>
        EyesWeb is nowadays employed by thousands of users
spanning several application domains. It has been adopted
in both research and industrial projects by research centers,
universities, and companies. A one-week tutorial, the
EyesWeb Week, is organized every two years at our research
center. In our experience at Casa Paganini - InfoMus,
EyesWeb was used in research projects that combine
scientific research in information and communications
technology (ICT) with artistic and humanistic research [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In this
context, the platform provided the technological ground for
artistic performances (e.g., in Allegoria dell’opinione
verbale by R. Doati, Medea by A. Guarnieri, and Invisible Line
by A. Cera, to name just some of them), for multimodal
interactive systems supporting new ways of experiencing art
and cultural heritage (e.g., Museum of Bali in Fano, Italy;
Museum of La Roche d’Oëtre in Normandy, France; and the
Enrico Caruso Museum near Florence, Italy), for serious games in
educational settings (e.g., The Potter and BeSound), and
for exergames for therapy and rehabilitation (e.g., in an
ongoing collaboration with the Giannina Gaslini children’s hospital
in Genova, Italy). EyesWeb is currently being improved and
extended in the framework of the EU-H2020-ICT DANCE
Project, which investigates how sound and music can express,
represent, and analyze the affective and relational qualities
of body movement.
      </p>
      <sec id="sec-8-1">
        <title />
        <p>The need to support such a broad range of application
domains required a trade-off between implementing
general-purpose mechanisms and exploiting domain-specific
knowledge. For example, on the one hand,
support for generality sometimes entails somewhat
reduced learnability and usability of the platform for
less-skilled EyesWeb developers, due to the increased
complexity of the implemented mechanisms. On the other hand,
some EyesWeb modules developed on purpose for specific
application scenarios have a limited scope and are difficult
to reuse. Future directions and open challenges include
increased cross-platform interoperability and a tighter
integration with cloud services and storage technologies.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgements</title>
      <p>This research has received funding from the European
Union’s Horizon 2020 research and innovation programme
under grant agreement No 645553 (H2020-ICT Project
DANCE, http://dance.dibris.unige.it). DANCE investigates
how affective and relational qualities of body movement can
be expressed, represented, and analyzed by the auditory
channel.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. Antonio Camurri, Shuji Hashimoto, Matteo Ricchetti, Andrea Ricci, Kenji Suzuki, Riccardo Trocca, and
          <string-name>
            <given-names>Gualtiero</given-names>
            <surname>Volpe</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>EyesWeb: Toward Gesture and Affect Recognition in Interactive Dance and Music Systems</article-title>
          .
          <source>Computer Music Journal</source>
          <volume>24</volume>
          ,
          <issue>1</issue>
          (April
          <year>2000</year>
          ),
          <fpage>57</fpage>
          -
          <lpage>69</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Antonio Camurri, Barbara Mazzarino, and
          <string-name>
            <given-names>Gualtiero</given-names>
            <surname>Volpe</surname>
          </string-name>
          .
          <year>2004</year>
          .
          <article-title>Analysis of expressive gesture: The EyesWeb expressive gesture processing library</article-title>
          . In
          <source>Gesture-based Communication in Human-Computer Interaction</source>
          . Springer,
          <fpage>460</fpage>
          -
          <lpage>467</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Antonio Camurri and
          <string-name>
            <given-names>Gualtiero</given-names>
            <surname>Volpe</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>The Intersection of Art and Technology</article-title>
          .
          <source>IEEE MultiMedia</source>
          <volume>23</volume>
          ,
          <issue>1</issue>
          (Jan
          <year>2016</year>
          ),
          <fpage>10</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4. Eyecon.
          <year>2008</year>
          . http://eyecon.palindrome.de/. Accessed: 2016-03-25.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Hermann</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Taxonomy and Definitions for Sonification and Auditory Display</article-title>
          .
          <source>In Proceedings of the 14th International Conference on Auditory Display (ICAD</source>
          <year>2008</year>
          ),
          Patrick Susini
          and Olivier Warusfel (Eds.). IRCAM.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Isadora.
          <year>2002</year>
          . http://troikatronix.com/. Accessed: 2016-03-25.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Kreuz</surname>
          </string-name>
          , Daniel Chicharro, Conor Houghton, Ralph G. Andrzejak, and
          <string-name>
            <given-names>Florian</given-names>
            <surname>Mormann</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Monitoring spike train synchrony</article-title>
          .
          <source>Journal of Neurophysiology</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Norbert</given-names>
            <surname>Marwan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. Carmen</given-names>
            <surname>Romano</surname>
          </string-name>
          , Marco Thiel, and
          <string-name>
            <given-names>Jürgen</given-names>
            <surname>Kurths</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Recurrence plots for the analysis of complex systems</article-title>
          .
          <source>Physics Reports</source>
          <volume>438</volume>
          ,
          <fpage>5</fpage>
          -
          <lpage>6</lpage>
          (
          <year>2007</year>
          ),
          <fpage>237</fpage>
          -
          <lpage>329</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9. Max.
          <year>1988</year>
          . http://cycling74.com/products/max/. Accessed: 2016-03-25.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>Miller</given-names>
            <surname>Puckette</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>Pure Data</article-title>
          .
          <source>In Proceedings of the International Computer Music Conference</source>
          .
          <fpage>224</fpage>
          -
          <lpage>227</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. Rodrigo Quian Quiroga, Thomas Kreuz, and
          <string-name>
            <given-names>Peter</given-names>
            <surname>Grassberger</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Event synchronization: A simple and fast method to measure synchronicity and time delay patterns</article-title>
          .
          <source>Physical Review E 66</source>
          ,
          <issue>4</issue>
          (Oct
          <year>2002</year>
          ),
          <fpage>041904</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>Giovanna</given-names>
            <surname>Varni</surname>
          </string-name>
          , Gualtiero Volpe, and Antonio Camurri.
          <year>2010</year>
          .
          <article-title>A system for real-time multimodal analysis of nonverbal affective social interaction in user-centric media</article-title>
          .
          <source>IEEE Transactions on Multimedia 12</source>
          ,
          <issue>6</issue>
          (
          <year>2010</year>
          ),
          <fpage>576</fpage>
          -
          <lpage>590</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13. vvvv.
          <year>1998</year>
          . http://vvvv.org/. Accessed: 2016-03-25.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>Johannes</given-names>
            <surname>Wagner</surname>
          </string-name>
          , Florian Lingenfelser, Tobias Baur, Ionut Damian, Felix Kistler, and
          <string-name>
            <given-names>Elisabeth</given-names>
            <surname>André</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>The Social Signal Interpretation (SSI) Framework: Multimodal Signal Processing and Recognition in Real-time</article-title>
          .
          <source>In Proceedings of the 21st ACM International Conference on Multimedia (MM '13)</source>
          . ACM, New York, NY, USA,
          <fpage>831</fpage>
          -
          <lpage>834</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>