<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards a distributed framework to analyze multimodal data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vanessa Echeverría</string-name>
          <email>vecheverria@cti.espol.edu.ec</email>
          <aff>Escuela Superior Politécnica del Litoral</aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Federico Domínguez</string-name>
          <email>fexadomi@espol.edu.ec</email>
          <aff>Escuela Superior Politécnica del Litoral</aff>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katherine Chiluiza</string-name>
          <email>kchilui@espol.edu.ec</email>
          <aff>Escuela Superior Politécnica del Litoral</aff>
        </contrib>
      </contrib-group>
      <fpage>52</fpage>
      <lpage>57</lpage>
      <abstract>
        <p>Synchronizing data gathered from multiple sensors, and analyzing those data reliably, remains a difficult challenge for scalable multimodal learning systems. To tackle this issue, we developed a distributed framework that decouples the capture task from the analysis task by connecting nodes through a publish/subscribe server. To validate the framework, we built a multimodal learning system that gives on-time feedback to presenters. Fifty-four presenters used the system and reported positive perceptions of it. Planned functionality will allow easy plug-and-play deployment of mobile devices and other gadgets.</p>
      </abstract>
      <kwd-group>
        <kwd>learning analytics</kwd>
        <kwd>distributed framework</kwd>
        <kwd>data synchronization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Related Work</title>
      <p>
        Research on multimodal data has gained considerable attention in recent years, independently of the type of analysis or the area under study. The central goal of such multimodal systems is to gather data from several sources and analyze them to discover patterns. While most previous work analyzes multimodal data from a single media source, it is still difficult to find a framework that offers simple interconnectivity and easy data handling and analysis. Manual actions such as clapping or performing a gesture are a common way for researchers to start collecting data at the same time
        <xref ref-type="bibr" rid="ref9">(Leong, Chen, Feng, Lee &amp; Mulholland, 2015)</xref>
        ; nevertheless, this approach yields imprecisely synchronized data. In the literature we found well-structured software and frameworks for gathering data from multiple inputs. These systems control several inputs through components and translate them into predefined actions or output signals derived from a basic analysis of the data stream
        <xref ref-type="bibr" rid="ref3 ref8">(Camurri et al., 2000; Hoste, Dumas &amp; Signer, 2011)</xref>
        . One concern about these systems is that all input data is processed on the same machine, so they do not scale when new inputs are added.
      </p>
      <p>
        A framework presented by a research group at the National Institute of Standards and Technology
        <xref ref-type="bibr" rid="ref5">(Diduch, Fillinger, Hamchi, Hoarau, &amp; Stanford, 2008)</xref>
        strives to capture multimodal data from several sources using a decentralized NTP server and one node per input source. That framework is similar to the one presented in this paper, but our approach differs in allowing TCP/IP connections from any input source that wants to subscribe to the capture session.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Framework Architecture</title>
      <p>Our framework architecture is based on a publish/subscribe service that synchronizes data collection, processing and storage among distributed computational nodes. Data collection is performed by nodes attached to sensors (for example a webcam, a microphone or a Kinect), depicted as capture device nodes in Figure 1. Each capture device node subscribes to start- and stop-recording events with the centralized server. These events are triggered from an interface through which the system’s user controls the session. When an event is published, the centralized server synchronizes all data coming from the device nodes.</p>
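      <p>The paper does not name the messaging library used between the Node.js server and the Python nodes, so the following minimal sketch assumes a Socket.IO channel (via the python-socketio package) and hypothetical event names; it only illustrates how a capture device node could subscribe to the start- and stop-recording events.</p>
      <preformat>
# Hypothetical capture-node client: server URL and event names are assumptions.
import socketio

sio = socketio.Client()
recording = False

@sio.on('start-recording')
def on_start(data=None):
    # Begin pushing raw sensor data to the assigned processing node(s).
    global recording
    recording = True
    print('capture started for session', (data or {}).get('session_id'))

@sio.on('stop-recording')
def on_stop(data=None):
    # Stop the stream; processing nodes will then submit their reports.
    global recording
    recording = False
    print('capture stopped')

if __name__ == '__main__':
    sio.connect('http://pubsub-server:3000')  # hypothetical server address
    sio.wait()
</preformat>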
      <p>When the user triggers a start-recording event via the session station, capture device nodes start streaming their raw data to one or more processing nodes (Figure 2). Each processing node handles one input mode (e.g. video, audio or posture), and each capture device node can send one or more streams to several processing nodes (for example, the capture device node for the Kinect sends several streams to different processing nodes).</p>
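      <p>The paper likewise does not specify how raw streams are transported to the processing nodes. As one possible shape of that loop, the sketch below pushes time-stamped chunks to the URIs a capture node received at registration, using plain HTTP POST requests; the endpoint and payload layout are assumptions for illustration only.</p>
      <preformat>
# Hypothetical streaming step for a capture device node.
import time
import requests

# URIs assigned to this node at registration (assumed values).
PROCESSING_URIS = ['http://video-processing-node:8000/frames']

def stream_chunk(chunk_bytes):
    # Stamp the chunk with the local, NTP-synchronized clock and forward it.
    params = {'timestamp': time.time()}
    for uri in PROCESSING_URIS:
        requests.post(uri, data=chunk_bytes, params=params, timeout=5)
</preformat>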
      <p>All data processing tasks are performed in parallel while the session is being recorded. When the user decides to finish the session, a stop-recording event is published and all capture device nodes stop their data streams. After this event, the data aggregation service waits for all mode-processing nodes to submit their reports before preparing a feedback summary that is sent to the user (Figure 2).</p>
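      <p>As a sketch of the aggregation step, assuming hypothetical mode names, the service below simply collects one report per mode and builds the summary only once every expected mode has reported.</p>
      <preformat>
# Hypothetical aggregation-service logic; mode names are assumptions.
EXPECTED_MODES = {'audio', 'gaze', 'posture', 'slides'}
reports = {}

def build_summary(collected):
    # Trivial stand-in for the real summary builder.
    return '\n'.join(f'{mode}: {text}' for mode, text in sorted(collected.items()))

def on_report(mode, report):
    # Called when a mode-processing node submits its report after the stop event.
    reports[mode] = report
    if EXPECTED_MODES.issubset(reports):
        summary = build_summary(reports)
        print(summary)  # in the MLS this summary is e-mailed to the presenter
</preformat>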
      <p>The purpose of this architecture is to decouple the data processing tasks from the data capture tasks. Capture devices and mode-processing nodes can easily be added to or removed from the multimodal system without major reconfiguration. Upon registration, each capture device is given one or more Uniform Resource Identifiers (URIs) of its corresponding processing nodes.</p>
      <p>The publish/subscribe server is implemented in a central server using Node.js while all nodes use
Python to receive and send events. Messages are not queued or stored and all recorded data is time-stamped
locally. All server and node clocks are synchronized using the Network Time Protocol (NTP).</p>
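      <p>Because samples are stamped with each node’s local clock, the usefulness of the timestamps depends on the clocks staying NTP-disciplined. The fragment below shows local stamping and one optional way (the ntplib package, not something the framework prescribes) to sanity-check a node’s clock offset.</p>
      <preformat>
import time
import ntplib

def timestamp_sample(sample):
    # Stamp with the local, NTP-synchronized clock.
    return {'t': time.time(), 'data': sample}

def clock_offset_seconds(server='pool.ntp.org'):
    # Optional check of how far the local clock drifts from an NTP reference.
    response = ntplib.NTPClient().request(server, version=3)
    return response.offset
</preformat>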
    </sec>
    <sec id="sec-3">
      <title>Application Example: Multimodal Learning System</title>
      <p>To test the developed framework, we created a multimodal learning system (MLS) to collect and analyze multimodal data from students’ oral presentations. The aim of the MLS is to capture data from several sensors while students present their work orally and to provide on-time feedback at the end of the presentation by analyzing nonverbal skills in the gathered data. To this end, we designed a physical space that hosts all sensors, resulting in an immersive, non-intrusive and automatic learning system.</p>
    </sec>
    <sec id="sec-4">
      <title>Hardware and Software</title>
      <p>The MLS is composed of three media streams: audio, video and Kinect data. Audio is recorded using a 6-microphone array with embedded echo cancellation and background noise reduction. This device is located at the lower border of the presenter’s field of view. Video is recorded using three Raspberry Pis, each attached to two low-cost cameras, forming a 6-camera array that covers the entire sensing area (Figures 3 and 4). Kinect data is recorded with a Microsoft Kinect sensor (version 1). This device is located at the lower border of the presenter’s field of view, near the audio device. As depicted in Figure 1, all recording hardware is positioned to cover the multimodal sensing area (approximately 4 m<sup>2</sup>).</p>
      <p>On top of the proposed framework, we created an application that captures the presenter’s data and gives on-time feedback. The application works as follows: 1) the presenter loads the slides onto the computer terminal via USB; 2) the presenter enters a valid email address to receive the feedback results; 3) the presenter starts the recording of all inputs by clicking a “Start” button on the computer terminal; 4) the presenter gives the oral presentation while the servers gather and analyze each input source; 5) the presenter stops the recording by clicking a “Stop” button that appears on the computer terminal; 6) all partial analyses are sent to the central server; 7) a summary of the analysis is constructed and emailed to the presenter.</p>
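      <p>The session-station side of steps 3 to 7 can be pictured as two event publications; the sketch below reuses the assumed Socket.IO transport and hypothetical event names from the framework section, and the payload fields are illustrative only.</p>
      <preformat>
# Hypothetical session-station flow for steps 3-7.
import socketio

sio = socketio.Client()
sio.connect('http://pubsub-server:3000')  # hypothetical server address

def start_session(email, slides_path):
    # Step 3: publish the start-recording event with the presenter's details.
    sio.emit('start-recording', {'email': email, 'slides': slides_path})

def stop_session():
    # Step 5: publish the stop-recording event; steps 6-7 (collecting partial
    # analyses and e-mailing the summary) run server-side in the aggregation service.
    sio.emit('stop-recording', {})
</preformat>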
      <sec id="sec-4-1">
        <title>Data Analysis</title>
        <p>Giving an oral presentation involves both verbal and nonverbal communication skills. The purpose of this MLS is to explore nonverbal skills through the analysis of the audio, video and Kinect data streams. Therefore, a set of features is extracted and analyzed from each data stream to provide a feedback message after the presentation.</p>
        <p>
          The audio stream is used to measure the clarity of the presenter’s speech. We calculate the speech rate and detect the presenter’s filled pauses following the work of
          <xref ref-type="bibr" rid="ref4">De Jong &amp; Wempe (2009</xref>
          ) and
          <xref ref-type="bibr" rid="ref1">Audhkhasi, Kandhway, Deshmukh &amp; Verma (2009</xref>
          ), respectively.
        </p>
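        <p>The two audio measures reduce to simple counts once the detectors have run; the toy functions below assume syllable-nucleus times and filled-pause intervals already produced by detectors following the cited works.</p>
        <preformat>
# Illustration only: nucleus_times and filled_pause_intervals are assumed to come
# from detectors based on De Jong and Wempe (2009) and Audhkhasi et al. (2009).
def speech_rate(nucleus_times, duration_seconds):
    # Syllable nuclei per second over the whole presentation.
    return len(nucleus_times) / duration_seconds

def filled_pause_count(filled_pause_intervals):
    # Number of detected filled pauses (e.g. 'uhm', 'eh').
    return len(filled_pause_intervals)
</preformat>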
        <p>
          The video streams from the six-camera array are used to estimate the presenter’s gaze. Four of the cameras, located in front of the presenter, indicate whether the presenter is looking at the virtual audience screen, while the left and right corner cameras indicate whether the presenter is looking at the presentation screen. The Haar cascade face detection algorithm
          <xref ref-type="bibr" rid="ref10">(Lienhart, Kuranov, &amp; Pisarevsky, 2003)</xref>
          is applied to each video input and the partial results are then combined to obtain the final gaze position, expressed as one of two states: facing the audience or watching the presentation. Each frame is labeled with one of these two states.
        </p>
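        <p>A minimal version of the per-camera face check can be written with OpenCV’s stock Haar cascade; the grouping of the six binary results into the two gaze states below follows the camera layout described above, but the grouping and detector parameters are assumptions rather than the exact implementation.</p>
        <preformat>
# Sketch of per-camera face detection and gaze labelling (assumed parameters).
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

def face_present(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0

def gaze_state(front_frames, corner_frames):
    # front_frames: frames from the four cameras facing the presenter;
    # corner_frames: frames from the two corner cameras.
    if any(face_present(f) for f in front_frames):
        return 'facing audience'
    if any(face_present(f) for f in corner_frames):
        return 'watching presentation'
    return 'unknown'
</preformat>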
        <p>
          Body posture is extracted from the Kinect skeleton data. Each skeleton frame is composed of the 3D coordinates of 20 joints covering the full body. For the purposes of this application, only the upper-limb and torso joints are relevant to calculating the presenter’s body posture. To determine whether the presenter is adopting a specific posture, we defined three common postures identified in previous work
          <xref ref-type="bibr" rid="ref6">(Echeverría, Avendaño, Chiluiza, Vásquez &amp; Ochoa, 2014)</xref>
          . Euclidean distances and orientations are calculated between limbs at the frame level, and each frame is labeled with one of the three postures.
        </p>
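        <p>At the frame level the posture features are plain Euclidean distances between Kinect joints; the sketch below uses Kinect v1 joint names and placeholder thresholds, since the actual rules come from the cited previous work.</p>
        <preformat>
# Per-frame posture features; thresholds and rules are hypothetical placeholders.
import math

def distance(a, b):
    # Euclidean distance between two 3D joints given as (x, y, z) tuples.
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))

def label_posture(joints):
    # joints: dict mapping a joint name to its (x, y, z) position in metres.
    hand_spread = distance(joints['hand_left'], joints['hand_right'])
    right_reach = distance(joints['hand_right'], joints['shoulder_center'])
    if hand_spread > 0.8:
        return 'explaining'
    if right_reach > 0.6:
        return 'pointing'
    return 'arms down'
</preformat>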
        <p>
          An additional feature was obtained by analyzing the presenter’s slide file. Based on a slide tutoring system tool
          <xref ref-type="bibr" rid="ref7">(Echeverria, Guaman &amp; Chiluiza, 2015)</xref>
          , we extracted three features from each presentation: contrast, number of words and font size. The tool then determined whether the presentation was good or not, analyzing it both per slide and globally.
        </p>
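        <p>Two of the three slide features (word count and font size) are straightforward to pull from a slide file; the sketch below uses the python-pptx package as an illustration, is not the slide tutoring system tool itself, and omits the contrast feature, which requires colour extraction.</p>
        <preformat>
# Illustrative slide-feature extraction with python-pptx (assumed .pptx input).
from pptx import Presentation

def slide_features(path):
    features = []
    for slide in Presentation(path).slides:
        words, sizes = 0, []
        for shape in slide.shapes:
            if not shape.has_text_frame:
                continue
            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    words += len(run.text.split())
                    if run.font.size is not None:
                        sizes.append(run.font.size.pt)
        features.append({'words': words,
                         'min_font_pt': min(sizes) if sizes else None})
    return features
</preformat>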
        <p><bold>Feedback.</bold> Because providing on-time feedback in traditional setups is time-consuming, the MLS delivers the feedback information right after the presentation finishes. The email sent to the presenter shows a summary of the states gathered from each modality. Predefined messages are selected by a set of rules that describe whether the nonverbal communication skills were good or bad (Figure 5).</p>
        <p>For the audio analysis, the speech rate and the number of filled pauses determine a set of rules that describe the presenter’s speaking performance. For the posture analysis, each Kinect frame labeled as explaining, pointing or arms down contributes to a per-posture percentage, and these percentages feed another set of rules describing the presenter’s performance with respect to body posture. Finally, the results obtained from the slide tutoring system tool define the set of rules for the slide analysis.</p>
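        <p>As an idea of what such rules can look like, the fragment below maps two of the measures to canned messages; the thresholds and wording are invented for illustration and are not the rules used by the MLS.</p>
        <preformat>
# Hypothetical feedback rules; thresholds and messages are illustrative only.
def audio_feedback(speech_rate_sps, filled_pauses):
    if speech_rate_sps > 4.0:
        return 'Try to slow down: your speech rate was high.'
    if filled_pauses > 10:
        return 'Watch out for filled pauses such as "uhm" and "eh".'
    return 'Your pace and fluency were adequate.'

def posture_feedback(posture_percentages):
    # posture_percentages: dict with keys 'explaining', 'pointing' and 'arms down'.
    if posture_percentages.get('arms down', 0) > 50:
        return 'You kept your arms down most of the time; try gesturing while explaining.'
    return 'Good use of gestures during the presentation.'
</preformat>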
      </sec>
      <sec id="sec-4-2">
        <title>Experiment</title>
        <p>Fifty-four computer science undergraduate students, 42 male and 12 female, were asked to participate in an
experiment to evaluate the proposed framework and the multimodal learning system.</p>
        <p>Prior to the session, students were asked to select an oral presentation they had previously prepared. On the day of the presentation, each student received a brief introduction to the learning system, consisting of an explanation of how to use it, and then immediately started the presentation.</p>
        <p>Once the oral presentation concluded, the student reviewed the system’s feedback and filled in a questionnaire consisting of six questions on a 10-point Likert scale (1: lowest value, 10: highest value) and four open-ended questions about the overall impression of the learning system and suggestions to improve it.</p>
        <p>After recording all students, a manual verification pass was carried out to discard presentations in which a source was not correctly recorded. Fifty presentations, with an average length of 8.52 minutes, were selected as the final dataset.</p>
      </sec>
      <sec id="sec-4-3">
        <title>Results</title>
        <p>Learners’ feedback showed positive results regarding ease of use, intrusiveness, motivation and the experience with a non-traditional classroom (mode: 8), whereas students’ perception of the system’s usefulness was rated somewhat lower (mode: 7). Nonetheless, students felt that they learned something with the MLS (mode: 9) compared to their previous knowledge. Table 1 shows the minimum, maximum, mode and standard deviation for each Likert-scale question.</p>
        <table-wrap id="table-1">
          <label>Table 1</label>
          <caption>
            <p>Minimum, maximum, mode and standard deviation for each Likert-scale question.</p>
          </caption>
          <table>
            <thead>
              <tr>
                <th>Question</th>
                <th>Min</th>
                <th>Max</th>
                <th>Mode</th>
                <th>Stdv.</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>On a scale from 1 to 10, with 1 being very awkward and 10 being very natural, how would you rate your experience with the application?</td>
                <td>1</td><td>10</td><td>8</td><td>1.84</td>
              </tr>
              <tr>
                <td>On a scale from 1 to 10, with 1 being very motivated and 10 being very bored, how motivated would you be to use the application again?</td>
                <td>1</td><td>10</td><td>8</td><td>2.95</td>
              </tr>
              <tr>
                <td>On a scale from 1 to 10, with 1 being low and 10 being very high, how invasive were the sensors being used to collect data about you?</td>
                <td>1</td><td>10</td><td>1</td><td>2.88</td>
              </tr>
              <tr>
                <td>On a scale from 1 to 10, with 1 being very likely and 10 being very unlikely, how likely would you be to use this application in your free time?</td>
                <td>1</td><td>10</td><td>7</td><td>2.73</td>
              </tr>
              <tr>
                <td>On a scale of 1 to 10, with 1 being not at all and 10 being completely, do you feel like you learned anything while interacting with the application?</td>
                <td>2</td><td>10</td><td>9</td><td>1.98</td>
              </tr>
              <tr>
                <td>On a scale of 1 to 10, with 1 being much worse and 10 being much better, how does using this application compare to how you would normally learn the same content in a traditional classroom?</td>
                <td>1</td><td>10</td><td>8</td><td>2.05</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>In the open-ended questions, learners reported that the feedback helped them learn about specific issues related to posture, slide content and contrast, and filled pauses while speaking.</p>
        <p>It is important to note that during the verification task we found that some recordings (sources) had not been captured correctly because of where the devices were placed relative to the presenter. For instance, some audio recordings were discarded because the presenter’s voice was too quiet; in this particular case, the coverage area of the microphone had been overestimated.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Discussion and future work</title>
      <p>This paper describes the architecture of a distributed framework to gather and analyze multimodal data. The framework uses a publish/subscribe paradigm to facilitate connectivity among nodes and their sensors. The framework also keeps all data well organized and in one place throughout recording sessions. Data analysis is performed on dedicated nodes, which boosts the performance of the different algorithms for feature extraction and further analysis.</p>
      <p>Using this framework helps researchers keep all data synchronized more efficiently. In our experience, it reduced the time spent on synchronization and let us put more effort into the analysis of the data.</p>
      <p>In the future, we will make this framework publicly available. We plan to test it not only with mobile devices (e.g. smartphone cameras and voice recorders) but also with digital pens, other gadgets and any kind of sensor. In addition, we want to add functionality such as basic feature extraction algorithms for each medium, to help the multimodal community focus on the analysis of the data.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The authors would like to thank SENESCYT for its support in the development of this study, and ESPOL's educators and students who participated in the experiment.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Audhkhasi</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kandhway</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deshmukh</surname>
            ,
            <given-names>O. D.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Verma</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Formant-based technique for automatic filled-pause detection in spontaneous spoken English</article-title>
          .
          <source>Acoustics, Speech and Signal Processing (ICASSP), 2009 IEEE International Conference on</source>
          (pp.
          <fpage>4857</fpage>
          -
          <lpage>4860</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Blikstein</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Multimodal learning analytics</article-title>
          .
          <source>In Proceedings of the third international conference on learning analytics and knowledge</source>
          (pp.
          <fpage>102</fpage>
          -
          <lpage>106</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Camurri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hashimoto</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricchetti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ricci</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suzuki</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trocca</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Volpe</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          (
          <year>2000</year>
          ).
          <article-title>Eyesweb: Toward gesture and affect recognition in interactive dance and music systems</article-title>
          .
          <source>Computer Music Journal</source>
          ,
          <volume>24</volume>
          (
          <issue>1</issue>
          ),
          <fpage>57</fpage>
          -
          <lpage>69</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>De Jong</surname>
            ,
            <given-names>N. H.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wempe</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>Praat script to detect syllable nuclei and measure speech rate automatically</article-title>
          .
          <source>Behavior research methods</source>
          ,
          <volume>41</volume>
          (
          <issue>2</issue>
          ),
          <fpage>385</fpage>
          -
          <lpage>390</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Diduch</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fillinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamchi</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoarau</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Stanford</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2008</year>
          ).
          <article-title>Synchronization of data streams in distributed realtime multimodal signal processing environments using commodity hardware</article-title>
          .
          <source>In ICME</source>
          (pp.
          <fpage>1145</fpage>
          -
          <lpage>1148</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Echeverría</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Avendaño</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chiluiza</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vásquez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ochoa</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Presentation Skills Estimation Based on Video and Kinect Data Analysis</article-title>
          .
          <source>In Proceedings of the 2014 ACM workshop on Multimodal Learning Analytics Workshop and Grand Challenge</source>
          (pp.
          <fpage>53</fpage>
          -
          <lpage>60</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Echeverria</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guaman</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chiluiza</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Mirroring Teachers' Assessment of Novice Students' Presentations through an Intelligent Tutor System</article-title>
          .
          <source>In Computer Aided System Engineering (APCASE)</source>
          ,
          <year>2015</year>
          Asia-Pacific Conference on (pp.
          <fpage>264</fpage>
          -
          <lpage>269</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Hoste</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumas</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Signer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2011</year>
          , November).
          <article-title>Mudra: a unified multimodal interaction framework</article-title>
          .
          <source>In Proceedings of the 13th international conference on multimodal interfaces</source>
          (pp.
          <fpage>97</fpage>
          -
          <lpage>104</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Leong</surname>
            ,
            <given-names>C. W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feng</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>C. M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mulholland</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Utilizing Depth Sensors for Analyzing Multimodal Presentations: Hardware, Software and Toolkits</article-title>
          .
          <source>In Proceedings of the 2015 ACM on International Conference on Multimodal Interaction</source>
          (pp.
          <fpage>547</fpage>
          -
          <lpage>556</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Lienhart</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuranov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Pisarevsky</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Empirical analysis of detection cascades of boosted classifiers for rapid object detection</article-title>
          .
          <source>In Pattern Recognition</source>
          (pp.
          <fpage>297</fpage>
          -
          <lpage>304</lpage>
          ). Springer Berlin Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Oviatt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Problem solving, domain expertise and learning: Ground-truth performance results for math data corpus</article-title>
          .
          <source>In Proceedings of the 15th ACM on International conference on multimodal interaction</source>
          (pp.
          <fpage>569</fpage>
          -
          <lpage>574</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Scherer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weibel</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morency</surname>
            ,
            <given-names>L. P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Oviatt</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Multimodal prediction of expertise and leadership in learning groups</article-title>
          .
          <source>In Proceedings of the 1st International Workshop on Multimodal Learning Analytics</source>
          (p.
          <fpage>1</fpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Worsley</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Blikstein</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Leveraging multimodal learning analytics to differentiate student learning strategies</article-title>
          .
          <source>In Proceedings of the Fifth International Conference on Learning Analytics And Knowledge</source>
          (pp.
          <fpage>360</fpage>
          -
          <lpage>367</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>