   Towards Multimodal Characterization of Dialogic
   Moments On Social Group Face-to-Face Interaction

    Ngoc N. T. Doan, Andrius Penkauskas, Ecaterina Grigoriev, Lisa E. Rombout,
     Paris Mavromoustakos-Blom, Maryam Alimardani, and Martin Atzmueller

       Tilburg University, Department of Cognitive Science and Artificial Intelligence,
                    Warandelaan 2, 5037 AB Tilburg, The Netherlands
              {n.t.n.doan,a.penkauskas,e.grigoriev}@uvt.nl
                {l.e.rombout,p.mavromoustakosblom}@uvt.nl
                     {m.alimardani,m.atzmuller}@uvt.nl


       Abstract. Multimodal data enables powerful methodological approaches to in-
       vestigate social group interaction. This paper specifically focuses on dialogic mo-
       ments, i. e., episodes of human communication with high mutual understanding.
       We present preliminary results of a pilot study, where we apply multimodal anal-
       ysis of dialogic moments in the context of storytelling, for obtaining data-driven
       characterizations. We collected multimodal sensor data, including skin conduc-
       tance, face-to-face proximity, and vocal non-verbal features of the participants,
       complemented by their subjective experiences collected via self-report question-
       naires. Our first preliminary findings provide novel perspectives on different pro-
       files of dialogic moments, characterized by objective and subjective features.

       Keywords: Social Signal Processing, Multimodal Analytics, Dialogic Moment


1 Introduction
For analyzing group interactions, wearable sensors complemented by information from
self-report questionnaires enable powerful methodological approaches for analysis, cf.
e. g., [1–3]. This paper focuses on specific interaction episodes, so-called dialogic mo-
ments. These are characterized by a considerable level of mutual understanding, leading
to the development of a sense of belonging to the group. During storytelling conversa-
tions, for example, such moments can be elicited efficiently [4, 5]. In particular, dialogic
moments occur when each participant aims to retain their “own truth” while acknowl-
edging the “experienced truth” of a respective other, leading to mutual understanding.
     The occurrence of dialogic moments can be useful for supporting collaborations and
mediating conflict situations. Here, the detection and analysis of dialogic moments are
important steps to understand the underlying mechanisms. In this paper, we take first
steps towards the data-driven analysis and characterization of such dialogic moments
during an immersive story-telling experience, focussing on face-to-face interaction. We
investigated multimodal information of the participants while they were engaged in
a group discussion that was based on fictional storytelling. The collected multimodal
(sensor) data includes skin conductance, face-to-face proximity, vocal non-verbal fea-
tures of the participants, and turn-taking interaction; in addition, the subjective experi-
ences of the participants were collected via self-report questionnaires.




Copyright © 2019 for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2 Related work

In this section, we briefly summarize related work on observing human group interac-
tions and the idea of dialogic moments in such interaction contexts. Furthermore, we
sketch time series analysis methods for the analysis of the respective sensor data.


2.1     Observing Human Group Interaction using Sensors

Based on collected sensor data we can construct social interaction networks which cap-
ture offline interactions between people [6–8]. Eagle and Pentland [9], for example,
presented an analysis using proximity information collected by Bluetooth devices as a
proxy for human proximity. While this relates to face-to-face communication,
the detected proximity does not necessarily correspond to face-to-face contacts [10].
Also, it does not cover vocal (non-)verbal aspects of communication. Another approach
for observing human face-to-face communication is the Sociometric Badge.1 It records
more details of the interaction, but requires significantly larger devices. The SocioPat-
terns Collaboration2 developed proximity tags based on Radio Frequency Identifica-
tion technology (RFID), which have been used in several ubiquitous and social envi-
ronments, e. g., regarding educational/university contexts including scientific confer-
ences [11–14] and schools [15]. In this paper, we apply the Openbeacon as well as the
Rhythm badges [16], a successor of the Sociometric Badge, which provides a richer
set of information. In addition, we complement the data collection using further sensors
(e. g., skin conductance, cf. [17, 18]) and subjective questionnaire-based information.
     Regarding dialogic moments, research has shown their importance but provides
only few indications of how to predict or replicate a dialogic moment [5]. While
dialogic moments have not yet been extensively studied, especially from a computa-
tional perspective, multimodal approaches to the study of social interaction are becom-
ing more common [2, 19]. In this paper we target the multimodal characterization of
dialogic moments focussing on social group face-to-face interaction.


2.2     Explicative Time Series Data Analysis

For the characterization of dialogic moments, explicative data analysis methods are
important, because they provide interpretable and explainable approaches, in order to
make sense of the data and the obtained patterns [20, 21]. For multimodal sensor data,
time series data is recorded from multiple channels at the same time. Capturing the
interplay of the different signals and using it for gaining relevant information from
the data represents a challenge. In this context, several single time series techniques
are used after different feature transformation procedures. One simple but quite ef-
fective transformation is the symbolic aggregate approximation (SAX), where a time
series is initially transformed into a set of words through quantization using sliding
windows [22]. Then, we can directly apply the interpretable symbolic representation
towards the characterization of time series segments in the context of dialogic moments.
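
For illustration, a minimal sketch of such a SAX-style transformation (z-normalization, PAA, and mapping of segment means to letters via Gaussian breakpoints) is given below in Python; the segment count, alphabet, and synthetic input are illustrative assumptions and do not reflect the settings used later in this paper.

    import numpy as np
    from scipy.stats import norm

    def sax_word(series, n_segments=4, alphabet="abcd"):
        """Sketch of a basic SAX transformation: z-normalize the series,
        compute the piecewise aggregate approximation (PAA), and map each
        segment mean to a letter using equiprobable Gaussian breakpoints."""
        x = np.asarray(series, dtype=float)
        x = (x - x.mean()) / (x.std() + 1e-12)                # z-normalization
        paa = np.array([seg.mean() for seg in np.array_split(x, n_segments)])
        breakpoints = norm.ppf(np.linspace(0, 1, len(alphabet) + 1)[1:-1])
        return "".join(alphabet[i] for i in np.searchsorted(breakpoints, paa))

    # Example on a synthetic signal; prints a 4-letter word over {a, b, c, d}
    print(sax_word(np.sin(np.linspace(0, 3, 120))))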
 1 http://hd.media.mit.edu/badges
 2 http://www.sociopatterns.org
3 Method
In the following, we first outline our research questions for data analysis before describ-
ing the data collection context, measures, and methods in detail.

3.1   Research Questions
In our analysis, we focus on the following research questions:
 1. Is it possible to observe certain traces of dialogic moments in the multimodal data
    of the participants?
 2. Can we identify characteristic multi-modal features?
 3. Can we identify conforming/deviating patterns among different moments and users,
    respectively?

3.2   Procedure
For the pilot study outlined below, we recorded a diverse set of data while participants
were engaged in a storytelling-based discussion. In order to obtain ground-truth infor-
mation about dialogic moments, we employed feedback from a domain expert for de-
tecting and labeling the respective dialogic moments, also assessed by an independent
annotator. In evaluating the data of the pilot study, we mainly resorted to basic statistical
analysis and similarity assessment of the computed aggregated measurements, in order
to identify characterizing features and patterns.
    The participants (n=4, 2 female) were all university students. They received no in-
centives for their participation and gave informed consent for performing the experi-
ment. The participants sat around a square table in an empty room and remained seated
during the experiment. The experiment included 4 discussion sessions of 10 minutes
each, with breaks of 2 minutes. All discussions were moderated by a storytelling ex-
pert. Each session was moderated to have either a negative or a positive valence. For
the four discussion sessions, this was given with the order Negative – Positive – Nega-
tive – Positive, as follows:

Session 1 : Negative valence.
Session 2 : Positive valence.
Session 3 : Negative valence.
Session 4 : Positive valence.
    The moderator presented a fictional scenario to the group – the earth becoming unin-
habitable so that the group was seeking asylum on Mars. Valence was indicated by dilemmas
or opportunities, respectively. The moderator could also choose to moderate interaction
among the group, by directing attention to a less active member of the group, or by
asking a direct question. After the discussions, the moderator then indicated the critical
moments where participants seemed to have come closer together and potentially all
shared a mutual understanding, as a subjective labeling of those moments. The strongest
dialogic moments were then heuristically chosen, and additionally checked by an exter-
nal observer.
3.3     Measurements

We utilized several sensors for collecting multi-modal data, including galvanic skin
response, vocal non-verbal features and face-to-face interaction information, cf. [3, 16–
18, 23, 24].


3.4     Multimodal Sensor Data

Galvanic skin response (skin conductance, GSR) of the participants was recorded us-
ing the Shimmer GSR+ wearable sensors, which are often applied in similar contexts,
e. g., [17, 18]. Essentially, GSR indicates arousal [23]. Arousal can be calculated by
subtracting the average GSR in the 30 seconds prior to the period of interest from the
highest GSR within that period. Besides that, we specifically focus on face-to-face interaction com-
plemented by questionnaire information in our preliminary analysis. In addition, speak-
ing volume and physical proximity of the participants were observed using the Rhythm
Badge [16]. Given the speaking data from the Rhythm badge and the video recordings
we investigated the speaking turns. We transcribed the content of participants’ speech,
and manually assessed and recorded their speaking turns. The recording of speaking
turns assumes one speaker at a time. In a period of overlapping speech, the participant
who spoke the longest was recorded as taking the turn. For observing face-to-face prox-
imity, we utilized wearable sensors developed by the SocioPatterns consortium3, as dis-
cussed above [24, 25]. The proximity tags are based on Radio Frequency Identification
technology (RFID tags), capable of detecting close-range and face-to-face proximity
(1–1.5 meters). The most significant advantage of using the SocioPatterns tags is that
the human body acts as a radio-frequency blocker at the frequency used on the tags [24].
Therefore, only the signals that are broadcasted directly forward from the person wear-
ing the tag will be detected by other tags.
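
As an illustrative sketch of the arousal computation described above (peak GSR within a period of interest minus the mean GSR in the 30 seconds before it), consider the following Python fragment; the sampling rate and the synthetic signal are assumptions for illustration only.

    import numpy as np

    def arousal(gsr, fs, start_s, end_s, baseline_s=30):
        """Sketch: peak GSR within [start_s, end_s) minus the mean GSR in the
        baseline_s seconds before start_s; gsr is a 1-D array sampled at fs Hz."""
        gsr = np.asarray(gsr, dtype=float)
        start, end = int(start_s * fs), int(end_s * fs)
        baseline = gsr[max(0, start - int(baseline_s * fs)):start].mean()
        return gsr[start:end].max() - baseline

    # Example: a 5-minute synthetic recording at an assumed rate of 128 Hz
    fs = 128
    signal = np.random.default_rng(0).normal(2.0, 0.05, 5 * 60 * fs)
    print(arousal(signal, fs, start_s=60, end_s=90))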


3.5     Self-Report Assessment Information

In addition to collecting multimodal data, an Inclusion of Other in the Self (IOS) ques-
tionnaire was distributed to the participants to fill out at the baseline and after each dis-
cussion session (2-minute break each). In this break, they were asked to momentarily
exit the fictional world and individually report (on a 7-point Likert scale) their perceived
level of inclusion in the group in the preceding session. Here, we also applied the self-
assessment manikin (SAM) [26] for estimating the affective dimensions of pleasure,
arousal and dominance after each session. In general, SAM is an emotion assessment
tool using graphic scales, depicting cartoon characters expressing the respective three
emotion elements – pleasure, arousal and dominance – for the affective dimensions. In
that context, pleasure refers to the happiness in a specific situation, arousal to the emo-
tional arousal, while dominance relates to the control of a situation. Low dominance,
for example, relates to the feeling of lacking control in a situation. A person with
low dominance can present states such as subordination, intimidation or withdrawal.
In contrast, the participants with high dominance have control over a situation.
 3 http://www.sociopatterns.org
4 Results and Discussion

In our first exploratory analysis, we provide preliminary results focussing on face-to-
face interaction, in particular, face-to-face proximity contacts, turn taking and the IOS
questionnaire information. After that, we present initial results on the collected GSR
data. It is important to note that, due to the exploratory nature of this study and the
small number of participants, insights and conclusions are regarded as preliminary ex-
ploratory results, which we aim to extend in future work.


4.1     Global Group Interaction Behavior: Session Face-to-Face Proximity

In the following, we briefly discuss insights on face-to-face proximity contacts of the
participants during the individual sessions. Overall, the collected contact data was rela-
tively sparse. Below, we focus on indications of proximity contacts between the partic-
ipants, where especially the existence of a “non-contact” vs. “some contact” is interest-
ing. Figure 1 shows an overview of the contact behavior during the different sessions in
different graphs. In particular, the figures for the different sessions (Session 1 – Session
4) indicate the participants (as nodes of the respective graph) and proximity contacts
(as edges of the respective graph) between those. An edge between two nodes (partici-
pants) is created if at least one signal between the tags of the participants was received,
where the width of an edge is proportional to the number of proximity signal contacts.


        (a) Session 1             (b) Session 2             (c) Session 3             (d) Session 4

Fig. 1: Face-to-face proximity networks of the different sessions (Session 1 – Session 4). Partici-
        pants are denoted by nodes of the respective graphs; an edge between two nodes is created
        if there is at least one signal between the tags of the participants; edge width is propor-
        tional to the number of proximity signal contacts.
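
The construction of such a session graph from pairwise proximity detections can be sketched as follows (using networkx); the detection list is hypothetical and only illustrates the edge and weight construction.

    import networkx as nx

    # Hypothetical proximity detections within one session:
    # one (participant_a, participant_b) tuple per received signal.
    detections = [("P1", "P2"), ("P1", "P2"), ("P1", "P3"),
                  ("P2", "P3"), ("P3", "P4")]

    G = nx.Graph()
    for a, b in detections:
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1      # edge weight = number of proximity contacts
        else:
            G.add_edge(a, b, weight=1)  # an edge exists after at least one signal

    # Draw with edge widths proportional to the number of contacts
    # (drawing requires matplotlib)
    nx.draw(G, with_labels=True,
            width=[G[u][v]["weight"] for u, v in G.edges()])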



    Overall, we can distinguish different face-to-face proximity contact situations. Ses-
sion 1 and Session 3 are quite similar, with participants P1, P2, and P3 in a triangle, and
a weakly connected participant P4. Also, Session 2 and Session 4 indicate quite similar
situations. The interesting feature that is common to the two pairs of sessions is the
equivalence in the topological graph structure, as well as their valence, since Session 1
and Session 3 were negative, while Session 2 and Session 4 were positive regarding their
valence. We will also investigate this further in the following sections.
4.2   Group Interaction Behavior: Turn-Taking

In total, 6 moments were selected as dialogic moments. Content-wise, moment M4 was
when participants engaged in a heated dispute about the ethical way to choose which
people on Earth were to be evacuated first. Conversely, the other 5 moments revolved
around rather non-provocative sharing, such as during moment M1. As shown in Fig-
ure 2 (left) for the overall turn-taking behavior (shares), turn-taking is not completely
balanced. Participant P1, for example, tended to dominate during sessions S2–S4.






Fig. 2: Turn-taking of the participants (P1–P4) in each speaking session (left) and their turn-
        taking in each specific dialogic moment (right). Share (%) indicates the respective speak-
        ing turns per session and moment, respectively. Moments M1–M3 belong to Session S2;
        Moment M4 belongs to Session S3; the moments M5 and M6 belong to Session S4.



    Investigating the individual moments (see Figure 2, right), we also observe no bal-
anced speaking contribution. Looking at the shares of speaking contributions, partic-
ipants P1 and P4 tend to dominate the turn-taking. The only moment with a
rather even share of speaking turns is moment M4 in Session 3. This already makes this
moment special. While it had a negative valence at first, participants tended to become
more "aligned" in a balanced way. Moment M4 was also the moment started by almost
everyone, while the other 5 moments were started by a single member (a chief sto-
ryteller), who also took the most speaking turns, cf. Figure 2. Overall, we observed
a trend towards more stable alignment in the "Inclusion of Others in the Self" self-
reported scores, see Figure 3. Here, we can already see that moment M4 (in session S3)
seems special – we observe a (positive) change in the overall IOS trend,
while also the speaking contributions are more uniformly distributed.
    We further investigated the differences among the moments M1–M3 and M5–M6,
which indicated rather similar patterns. Figure 4 visualizes the similarities in turn-taking
behavior for Sessions 2 and 4, containing the respective dialogic moments M1–M3
and M5–M6. Specifically, for estimating the (dis-)similarity we applied a very simple
measure; essentially, we computed the Euclidean distance on the respective turn-taking
contributions (a darker color indicates more similarity). Here, moments M1 and M5
somewhat stand out, while M2, M3 and M6 conform more to the overall session
contributions in terms of speaking turns.
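
This similarity computation can be sketched as follows; the turn-taking share vectors are hypothetical and only illustrate the calculation, not the values observed in our study.

    import numpy as np

    def turn_share_distance(shares_a, shares_b):
        """Sketch: Euclidean distance between two turn-taking share vectors
        (one fraction per participant); a smaller distance means more similarity."""
        return float(np.linalg.norm(np.asarray(shares_a) - np.asarray(shares_b)))

    # Hypothetical speaking-turn shares for P1..P4
    session_s2 = [0.40, 0.20, 0.25, 0.15]
    moment_m1  = [0.55, 0.10, 0.20, 0.15]
    print(turn_share_distance(session_s2, moment_m1))   # ~0.19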
Fig. 3: Self-report scores of participants (P1–P4) of “Inclusion of Others in the Self” (IOS);
        7-point Likert scale after every session.

Fig. 4: Similarities w.r.t. turn-taking behavior among the moments (M1–M3, M5–M6) and
        enclosing sessions (S2, S4).



4.3       Self-Report Group Interaction: Self-Assessment Manikin

For obtaining a broader view on the affective dimensions after each session, we applied
the self-assessment manikin (SAM) approach, cf. [26], as discussed above. Figure 5
outlines the results.




            Fig. 5: Self-Assessment Manikin (SAM): Emotional state after each session.



    Overall, we see that the scores remain relatively stable/similar for the individual par-
ticipants, with a few exceptions (e. g., participant P2, who reported a more “dynamic”
behavior). Specifically, the majority of participants reported a rather stable level
of pleasure/happiness, which implies a generally satisfactory state. They also reported
the level of arousal in a similar fashion: arousal was self-assessed to be unchanged or
slightly decreased. This analysis of the self-report information provides some first indi-
cation on the dialogic moments; we aim to validate the observed trends in future studies,
and to provide more context information on the affective states of the participants.
4.4   Group Interaction Behavior: Galvanic Skin Response (GSR)




Fig. 6: Examples of two different GSR signal profiles during a dialogic period (M1): highly fluc-
        tuating signals (P2 and P4) vs. rather stable signals (P1 and P3).



    Looking into each individual’s GSR, we identified different profiles (see Fig-
ure 6), mainly distinguishing between signals that fluctuate highly and those that
remain rather stable. The figures manifest this diversity among the participants of a di-
alogic event, visualizing the mean of GSR signal 30 seconds prior to, during and 30
seconds after the event.
    Interestingly, the signal during the event can be considered a marker of an overall
persistent change in the level of arousal: the mean of GSR 30 seconds prior to the
event (in red) is highly distinct from the mean of GSR 30 seconds after the event (in
orange). This phenomenon is prevalent among the moments and among the participants.
The only difference is whether the blue line - the mean of a participant’s GSR during
the event - is closer to the red or the orange line. If the blue line is closer to the red one,
the participant seemed to experience the "dialogic" impact at the end of the event: a late
adopter. If the blue line is closer to the orange one, the participant seemed to experience
the "dialogic" impact at the beginning of the event: an early adopter.
    Additionally, in half of the moments the participants are a mix of late and early
adopters, such as M1 in Fig. 6, while in the other half the participants fall unanimously
into one category, either early or late adopters, such as M3 in Fig. 7. Regardless of when
we can witness the impact of a dialogic period on each participant, we can assume that
such an impact exists and can be significant.




Fig. 7: Example of a dialogic period (M3) in which all participants are "late adopters", experi-
        encing the "dialogic impact" rather at the end of this period.
    For the interpretation of time series data, explicative methods [20, 21] provide suit-
able approaches for making the analysis interpretable and explainable. One central idea
is symbolic abstraction of the data, in order to make it easier to understand via feature
reduction. One exemplary method for that is the piecewise aggregate approximation
(PAA), which is employed in the SAX transformation. Applying that method, segments
of a time series are mapped to an alphabet and can subsequently be processed into
words, which – due to their symbolic nature – facilitate analysis and interpretation,
also allowing for explanation, e. g., by detailed inspection in a “drill-down” ap-
proach [27, 28] referring back to the detailed time series.
    To compare the individuals’ arousal during dialogic events, as discussed before,
SAX was applied to each of their GSR signals to transform them into a discrete and
symbolic representation (a word, i. e., a sequence of characters). Then, the Levenshtein
distance (also called edit distance) metric, e. g., [29], was employed to measure the dif-
ference between these sequences of letters, thus allowing the calculation of the normal-
ized similarity between each pair of strings. Essentially, the Levenshtein metric estimates
the distance between two character sequences (words), where intuitively, the distance
between two sequences is the minimum number of single-character edits (i. e., inser-
tions, deletions, or substitutions) which are required in order to transform one sequence
into the other.
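
A minimal sketch of this computation is given below, assuming normalization by the (equal) word length, which matches the values reported in Table 2; in R, the stringdist package [29] provides such measures directly.

    def levenshtein(a, b):
        """Standard dynamic-programming Levenshtein (edit) distance."""
        d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            d[i][0] = i
        for j in range(len(b) + 1):
            d[0][j] = j
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,        # deletion
                              d[i][j - 1] + 1,        # insertion
                              d[i - 1][j - 1] + cost) # substitution
        return d[len(a)][len(b)]

    def normalized_similarity(a, b):
        """Sketch: 1 - edit distance / length of the longer word."""
        return 1.0 - levenshtein(a, b) / max(len(a), len(b))

    # Example with two SAX words from Table 1 (M1: P1 vs. P3)
    print(normalized_similarity("dcaa", "cdca"))   # 0.5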
     Initially, the number of PAA segments for SAX was set to 4, cf. Table 1, corre-
sponding to each event’s 4 statistical quartiles. Hence, there are in total 256 permutations
with replacement of the 4 letters a, b, c, and d; therefore, there is approximately a chance
of 0.016 that 2 participants have 2 or more of the same letters at exactly the same positions.
In other words, this means having a normalized similarity value of 0.5 or higher. Yet such
values have been recorded on numerous occasions (Table 2); especially in the case of M2
and M4, the GSR signals of the participants fluctuate in a highly similar fashion. Over-
all, as can be seen in Table 2, M2, M4 and M5 appear to be the moments with the highest
similarity in terms of GSR fluctuation patterns among the participants. Content-wise,
while the other moments emerged around a sharing of pleasant experiences or ideas,
M2, M4 and M5 arose from disputes around moral-related issues. Likewise, in general,
every participant tends to slightly "synchronize" during these moments; however, the
level of pairwise “synchronization” varies. In other words, just looking at these tables,
we could argue that the facilitator should have paid more attention to, for example, P3
and P4 in M3. The reason is that even though, during M3, P3 and P4 vocally contributed
to the discussion more than P1 and P2 (see Figure 2), it is more likely that P3
and P4 had felt less included in the group than the others: on average, the normalized
similarity of GSR patterns of P3 and P4 to every other participant is 0.167, while those of P1 and
P2 are 0.333. Furthermore, we applied the same analysis process to the data again
but increased the level of sensitivity by a factor of 5, setting the PAA to 20. This en-
tails 4^20 permutations with replacement of the used letters, making it highly unlikely
for two random symbolic representations of participants’ GSR signals to be similar. As
expected, Table 3 overall contains smaller values of normalized similarity than Table 2.
    Table 1: Strings representing each participant’s GSR signal after SAX conversion (PAA=4)
 Participants      M1            M2            M3           M4          M5           M6
      P1          dcaa          aacd          bbcc         dcba        aacd         bdca
      P2          dbbb          aadc          ccca         ddab        bbdb         babd
      P3          cdca          cacc          dcba         daad        dbca         ccca
      P4          accc          abcc          abdc         bbad        abcd         dabc

  Table 2: Pairwise Levenshtein normalized similarity between the pairs of strings (PAA=4).
   M        P1-P2     P1-P3     P1-P4     P2-P3       P2-P4     P3-P4       Mean       SD
   M1        0.25      0.5       0.25        0           0       0.25       0.208     0.188
   M2        0.5       0.5        0.5       0.5         0.5        0        0.417     0.204
   M3        0.25       0        0.50       0.5          0         0        0.208     0.246
   M4        0.25      0.25      0.25       0.5        0.25       0.5       0.333     0.129
   M5          0       0.25      0.75      0.25        0.25       0.5       0.333     0.258
   M6        0.25      0.5         0         0          0.5        0        0.208     0.246
  Mean       0.25     0.333      0.375     0.292      0.250     0.208
   SD       0.158     0.204      0.262     0.246      0.224     0.246

    Note that the complete list of symbolic representations, based on which Table 3 was
calculated, is not included here due to its size. Also note that we abstained from setting
the PAA to be proportional to the varying lengths of the events, because the Levenshtein
distance algorithm is highly sensitive to string length. This means that, if the PAA were
set to be proportional to the events’ lengths, certain events would on average induce more
similar GSR patterns merely because they are shorter than other events. In other
words, by keeping the same PAA among the moments, we normalize their length and
thus make it possible to compare their enclosed pairwise similarity values.
Interestingly, M2 is distinct from M5 in terms of its length: M2 is the longest dialogic period
(136 seconds) while M5 is the shortest period (23 seconds) (mean = 64.83, SD = 40.29).
However, both of them entail the highest similarity in GSR among the participants
regardless of the PAA setting.




 Table 3: Pairwise Levenshtein normalized similarity between each pair of strings (PAA=20).
   M        P1-P2     P1-P3     P1-P4     P2-P3       P2-P4    P3-P4       Mean       SD
  M1        0.301     0.051       0          0          0       0.051      0.067     0.117
  M2        0.051     0.250     0.151       0.1       0.051      0.1       0.117     0.075
  M3        0.051     0.051     0.151      0.250        0         0        0.084     0.098
  M4        0.151     0.151     0.051      0.151      0.051       0        0.093     0.067
  M5         0.1      0.051     0.250      0.051      0.151     0.151      0.126     0.076
  M6          0         0       0.051        0         0.1        0        0.025     0.042
  Mean      0.109     0.092     0.109      0.092      0.059     0.050
   SD       0.107     0.092     0.092      0.097      0.059     0.064
5 Conclusions
The preliminary descriptive analysis of the data has shown great potential in studying
dialogic moments computationally. Although all 6 moments in the pilot arguably fell
into the category of dialogic moments, we observed different profiles regarding the
agreement/similarity between their resulting multimodal data. In particular, moment
M4 was quite distinctive. However, we were able to obtain characteristic indicators,
considering turntaking, GSR data, as well as the questionnaire-based (IOS) information.
We aim to complement the analysis further for enhancing these preliminary insights,
and to provide a more comprehensive context for facilitating multimodal interpretation.
    Overall, future research in the direction initiated by this pilot can help to cre-
ate digital storytelling technology that has the ability to identify dialogic moments in
conversation, quantifying an otherwise very subjective phenomenon. Potentially, under-
standing dialogic moments can even help us induce them in (otherwise) unproductive
conversations, assisting in the context of difficult discussions and negotiations. Then,
such factors can also be incorporated in the design process of affective systems.


Acknowledgements
This work has been partially supported by the German Research Foundation (DFG)
project “MODUS” (under grant AT 88/4-1). Furthermore, we wish to thank the artists
of SPACE, Petra Ardai and Rinske Bosma, for their help in designing, setting up and
executing this study, and transcribing the discussions.


References
 1. Thiele, L., Atzmueller, M., Kauffeld, S., Stumme, G.: Subjective versus Objective Captured
    Social Networks: Comparing Standard Self-Report Questionnaire Data with Observational
    RFID Technology Data. In: Proc. Measuring Behavior, Wageningen, The Netherlands (2014)
 2. Rombout, L., Atzmueller, M., Postma, M.: Towards Estimating Collective Motor Behavior:
    Aware of Self vs. Aware of the Other. In: Proc. Workshop on Affective Computing and
    Context Awareness in Ambient Intelligence, UPV, Valencia, Spain (2018)
 3. Atzmueller, M., Thiele, L., Stumme, G., Kauffeld, S.: Analyzing Group Interaction on Net-
    works of Face-to-Face Proximity using Wearable Sensors. In: Proc. IEEE International Con-
    ference on Future IoT Technologies, Boston, MA, USA, IEEE Press (2018)
 4. Cissna, K.N., Anderson, R.: Theorizing about dialogic moments: The buber-rogers position
    and postmodern themes. Communication theory 8(1) (1998) 63–104
 5. Black, L.W.: Deliberation, storytelling, and dialogic moments. Communication Theory
    18(1) (2008) 93–116
 6. Mitzlaff, F., Atzmueller, M., Benz, D., Hotho, A., Stumme, G.: Community Assessment
    using Evidence Networks. In: Analysis of Social Media and Ubiquitous Data. Volume 6904
    of LNAI. (2011)
 7. Mitzlaff, F., Atzmueller, M., Stumme, G., Hotho, A.: Semantics of User Interaction in Social
    Media. In: Complex Networks IV. Volume 476 of Studies in Computational Intelligence.
    Springer, Berlin/Heidelberg, Germany (2013)
 8. Atzmueller, M.: Data Mining on Social Interaction Networks. Journal of Data Mining and
    Digital Humanities 1 (June 2014)
 9. Eagle, N., Pentland, A.S., Lazer, D.: Inferring Friendship Network Structure by Using Mo-
    bile Phone Data. PNAS 106(36) (2009) 15274–15278
10. Barrat, A., Cattuto, C., Colizza, V., Pinton, J.F., den Broeck, W.V., Vespignani, A.: High
    Resolution Dynamical Mapping of Social Interactions with Active RFID. PLoS ONE 5(7)
    (2010) e11596
11. Cattuto, C., Van den Broeck, W., Barrat, A., Colizza, V., Pinton, J.F., Vespignani, A.: Dy-
    namics of Person-to-Person Interactions from Distributed RFID Sensor Networks. PLoS
    ONE 5(7) (2010)
12. Atzmueller, M., Doerfel, S., Hotho, A., Mitzlaff, F., Stumme, G.: Face-to-Face Contacts at
    a Conference: Dynamics of Communities and Roles. In: Modeling and Mining Ubiquitous
    Social Media. Volume 7472 of LNAI. Springer, Berlin/Heidelberg, Germany (2012)
13. Macek, B.E., Scholz, C., Atzmueller, M., Stumme, G.: Anatomy of a Conference. In: Proc.
    ACM Hypertext, New York, NY, USA, ACM, ACM Press (2012) 245–254
14. Atzmueller, M., Becker, M., Kibanov, M., Scholz, C., Doerfel, S., Hotho, A., Macek, B.E.,
    Mitzlaff, F., Mueller, J., Stumme, G.: Ubicon and its Applications for Ubiquitous Social
    Computing. New Review of Hypermedia and Multimedia 20(1) (2014) 53–77
15. Stehlé, J., Voirin, N., Barrat, A., Cattuto, C., Isella, L., Pinton, J.F., Quaggiotto, M., Van den
    Broeck, W., Régis, C., Lina, B., et al.: High-Resolution Measurements of Face-to-Face
    Contact Patterns in a Primary School. PloS one 6(8) (2011) e23176
16. Lederman, O., Mohan, A., Calacci, D., Pentland, A.S.: Rhythm: A unified measurement
    platform for human organizations. IEEE MultiMedia 25(1) (2018) 26–38
17. Burns, A., Greene, B.R., McGrath, M.J., O’Shea, T.J., Kuris, B., Ayer, S.M., Stroiescu, F.,
    Cionca, V.: Shimmer™ – a wireless sensor platform for noninvasive biomedical research.
    IEEE Sensors Journal 10(9) (2010) 1527–1534
18. Mavromoustakos-Blom, P., Bakkes, S., Spronck, P.: Personalized crisis management training
    on a tablet. In: Proc. FDG, ACM (2018) 33
19. Gaskin, J., Jenkins, J., Meservy, T., Steffen, J., Payne, K.: Using wearable devices for non-
    invasive, inexpensive physiological data collection. In: Proc. HICCS. (2017)
20. Merino, S., Atzmueller, M.: Behavioral Topic Modeling on Naturalistic Driving Data. In:
    Proc. BNAIC, JADS, Den Bosch, The Netherlands (2018)
21. Atzmueller, M.: Declarative Aspects in Explicative Data Mining for Computational Sense-
    making. In: Proc. DECLARE, Heidelberg, Germany, Springer (2018)
22. Keogh, E., Lin, J., Fu, A.: Hot Sax: Efficiently Finding the Most Unusual Time Series
    Subsequence. In: Proc. ICDM, IEEE (2005)
23. Critchley, H.D.: Electrodermal responses: what happens in the brain. The Neuroscientist
    8(2) (2002) 132–142
24. Cattuto, C., Van den Broeck, W., Barrat, A., Colizza, V., Pinton, J.F., Vespignani, A.: Dy-
    namics of Person-to-Person Interactions from Distributed RFID Sensor Networks. PloS ONE
    5(7) (2010) e11596
25. Atzmueller, M., Becker, M., Doerfel, S., Kibanov, M., Hotho, A., Macek, B.E., Mitzlaff, F.,
    Mueller, J., Scholz, C., Stumme, G.: Ubicon: Observing Social and Physical Activities. In:
    Proc. CPSCom, Washington, DC, USA, IEEE Computer Society (2012) 317–324
26. Bradley, M.M., Lang, P.J.: Measuring emotion: the self-assessment manikin and the semantic
    differential. Journal of behavior therapy and experimental psychiatry 25(1) (1994) 49–59
27. Atzmueller, M., Roth-Berghofer, T.: The Mining and Analysis Continuum of Explaining
    Uncovered. In: Proc. AI-2010. (2010)
28. Le Nguyen, T., Gsponer, S., Ilie, I., O’Reilly, M., Ifrim, G.: Interpretable Time Series Clas-
    sification Using Linear Models and Multi-Resolution Multi-Domain Symbolic Representa-
    tions. Data Mining and Knowledge Discovery (2019) 1–40
29. Van der Loo, M.P.: The stringdist package for approximate string matching. The R Journal
    6(1) (2014) 111–122