<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Dataset on an online collaborative learning situation in a computer networks course</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eduardo Gómez-Sánchez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alejandra Martínez-Monés</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristina Villa-Torrano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan I. Asensio-Pérez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miguel L. Bote-Lorenzo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pankaj Chejara</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Dimitriadis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universidad de Valladolid</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Tallinn University</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents a dataset of a collaborative learning situation. Students were enrolled in two undergraduate courses on computer networks, where they were required to carry out a set of learning activities supported by Moodle and an online collaborative environment called CoTrackV2. The data collected includes logs of the writing process of shared documents, logs of the chat messages between the group members, and logs from Moodle with coarser-grained information about course-level interactions. This dataset has been generated with the aim of allowing researchers to study self- and socially-shared regulation in online environments.</p>
      </abstract>
      <kwd-group>
        <kwd>Computer-Supported Collaborative Learning</kwd>
        <kwd>Socially-Shared Regulation of Learning</kwd>
        <kwd>Self-Regulated Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Academic and work contexts are increasingly demanding the
competence of being able to collaborate with peers [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] as one
of the 21st Century Skills [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Many studies show that successful collaboration requires
regulatory processes through which group members activate
and maintain their cognition, motivation, and emotion
towards their common goals [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This need is also present in
computer science and engineering courses. Moreover, the use
of Information and Communication Technology (ICT) tools
to support collaboration (leading to Computer-Supported
Collaborative Learning (CSCL) settings [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]) enables the
collection of traces to model students' behavior while
collaborating.
      </p>
      <p>
        Copyright ©2021 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).
      </p>
      <p>
        The literature has shown that students' motivation and
strategic regulation play a critical role in their success in Science,
Technology, Engineering and Mathematics (STEM) courses
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For example, in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] the authors studied how motivation,
strategic self-regulation, and creative competency were
associated with computational thinking knowledge and skills
in introductory computer science courses. They found that
student performance and long-term retention were positively
correlated with the use of self-regulated strategies.
Concerning motivation, higher goal pursuit and positive
affect were also correlated with high performance, higher
knowledge retention, strategic self-regulation, and
engagement. Moreover, collaborative activities, especially those
including CSCL tools, have been shown to favor knowledge
building [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and can also benefit from socially-shared
regulation in order to be successful. However, although
studies show that regulatory processes need to be developed
while collaborating, further study is needed in
STEM courses and, specifically, in computer science courses.
Concerning the latter, we have not found shared datasets
enabling the study of regulation in collaborative activities
in computer science. This is a challenging issue, because the
study of regulation in collaborative learning settings requires
the collection and integration of a variety of data sources,
such as logs of different learning platforms, the
communication between group members, and self-reported
data. The absence of such a dataset led us to generate
one. Computer networks are part of the
ACM Computer Curricula [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and we had access to two
courses on this topic. Therefore, we generated a dataset
from a learning situation on this subject, designed to
fulfill the aforementioned requirements. Further details are
provided in the following sections.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. CONTEXT AND DATASET</title>
    </sec>
    <sec id="sec-3">
      <title>2.1 Description of the learning situation</title>
      <sec id="sec-3-1">
        <title>Characters added during this action</title>
      </sec>
      <sec id="sec-3-2">
        <title>Length of the text before performing the action</title>
        <p>Type of operation (&gt;: writing, &lt;: deleting)</p>
        <p>The difference in number of characters caused
by the current action and source length</p>
      </sec>
      <sec id="sec-3-3">
        <title>Text from the document at the current time</title>
        <p>Have you contacted your Internet service
provider, i.e. your operator? [...]</p>
        <p>The learning situation took place in two undergraduate courses
on Computer Networks over 4 days in the spring semester
of the academic year 2021 at a European university. There
were 33 students, who were divided into 8 groups
of 4-5 people to carry out an introductory learning
situation aimed at challenging their previous knowledge and
beliefs about certain computer network topics. Before starting
the learning situation, students were asked to fill out an
informed consent form.</p>
        <p>The situation was designed following the so-called pyramid
or snowball pattern, where the students first had to carry
out the proposed activities individually and then in groups
(thus fostering agreement among the group members in
order to submit a common solution). The different activities
were carried out during 4 two-hour face-to-face sessions.</p>
        <p>The learning situation was based on the following scenario:
A hotel owner (role played by the teacher) goes to a team
of telco engineers (role played by the students) to ask them
to solve his problem: the internet connection is not
working properly; the internet access is very slow and sometimes
does not work at all. The hotel owner and the telco
engineers agree to an interview in a few days. In order for the
telco engineers to think about the problem, the hotel owner
sends them a diagram of the current network. The
activities that students needed to complete were:</p>
        <list list-type="order">
          <list-item><p>Questions ind (individual): Thinking of questions to
ask the hotel owner to find out more about his network.</p></list-item>
          <list-item><p>Questions group (in groups of 4-5 students): Agreeing
on 7 final questions to ask the hotel owner.</p></list-item>
          <list-item><p>Questions class (whole class): Asking the hotel owner
about his network. For this task, there was a
spokesperson in each group. The teacher, playing the role of the
hotel owner, answered the questions posed by the
groups.</p></list-item>
          <list-item><p>Diagnosis ind (individual): Proposing a solution to the
hotel's Internet access problem.</p></list-item>
          <list-item><p>Diagnosis group (in groups of 4-5 students): Agreeing
on a final proposal with the rest of the group members.</p></list-item>
          <list-item><p>Diagnosis class (whole class): Creating a concept map
of the technical concepts that emerged during the whole
situation.</p></list-item>
        </list>
        <p>Students had to work through an online collaborative
environment called CoTrackV2<sup>1</sup>. This environment offered
the possibility to write documents collaboratively and had
a built-in chat so that the members of each group
could communicate. In addition, students used Moodle to
submit individual assignments, to visit subject-related
content and to access the link to the CoTrackV2 sessions, so we
were able to obtain traces of the content visited by the
students, the writing process and the chat messages. Besides
these traces, at the end of the learning situation, the
students answered a questionnaire related to group regulation.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.2 Description of the dataset</title>
      <p>The collected dataset<sup>2</sup> is based on the logs of two
different tools: CoTrackV2 and Moodle. The data obtained by
CoTrackV2 is divided into 2 files: 1) document_logs.csv,
with actions from the writing process of the shared
documents; and 2) chat_logs.csv, which contains the logs of
the communication between the group members. The
attributes in the two files are presented in Table 1 and Table
2, respectively. Regarding the data obtained through
Moodle, we have 2 files: 1) moodle_logs.csv, with the logs of
the contents visited by the students (details are given in
Table 3<sup>3</sup>); and 2) individual_submissions.csv, where the
individual submissions for activities 1 and 4 are collected,
containing the timestamp of the submission, the id of the
student submitting the solution and the solution itself.
Besides these files, we have two others: 1) a file containing the
learning design, including the start time and the name of
the tasks; and 2) a file containing the students' answers to
the final questionnaire. All files provided have been properly
anonymized.</p>
      <p><sup>1</sup> CoTrackV2 website: https://www.cotrack.website/</p>
      <p><sup>2</sup> The dataset will be available at
https://zenodo.org/record/5033198#.YNsQv-gzaUk</p>
      <p><sup>3</sup> The examples presented in the different tables have been
translated into English for a better understanding, but the
dataset is in Spanish.</p>
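      <p>As an illustration of how the writing logs might be summarized, the following minimal Python sketch counts writing and deleting actions and accumulates the net character difference per student. The column names (user, operation, char_diff) and the sample rows are hypothetical and are used only for illustration; the actual headers in document_logs.csv may differ.</p>
      <preformat><![CDATA[
```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. The column names are assumptions made for this
# sketch and may not match the real headers of document_logs.csv.
SAMPLE = """user,operation,char_diff
u1,>,12
u2,>,5
u1,<,-4
u2,>,7
u1,>,3
"""


def summarize_writing(csv_text):
    """Count writing (">") and deleting ("<") actions and net characters per user."""
    summary = defaultdict(lambda: {"writes": 0, "deletes": 0, "net_chars": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        stats = summary[row["user"]]
        if row["operation"] == ">":
            stats["writes"] += 1
        elif row["operation"] == "<":
            stats["deletes"] += 1
        # The character difference is assumed to be signed
        # (negative values for deletions).
        stats["net_chars"] += int(row["char_diff"])
    return dict(summary)


result = summarize_writing(SAMPLE)
for user, stats in sorted(result.items()):
    print(user, stats)
```
]]></preformat>
      <p>With the real file, the in-memory sample could be replaced by the contents of document_logs.csv once the column names have been adjusted to the published headers.</p>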
    </sec>
    <sec id="sec-5">
      <title>3. ANALYSIS</title>
      <p>The dataset we have generated may allow researchers to
answer different research questions related to Socially-Shared
Regulation of Learning (SSRL). For example, the research
questions that have guided the design of this learning
situation are the following: 1) How do self- and socially-shared
regulation processes occur in groups that complete group
activities with different levels of success? 2) Are there
different patterns of regulation associated with the performance
of groups when solving activities? To answer these
questions, we want to analyse the data from a temporal
perspective using different techniques, such as process mining (e.g.,
Heuristic Miner or Fuzzy Miner algorithms), Markov models
(e.g., pMiner algorithm), social network analysis (temporal
networks) and epistemic network analysis. Beforehand, we
want to identify SSRL features that allow us to map
low-level data to higher-level constructs. After that, we could
make use of the techniques mentioned above and compare
the results of the different approaches. Beyond detecting the
different processes, we would like to build predictive models
with the identified features. However, at this stage of the
research, it would be very beneficial to get feedback from
the community to better guide the analysis.</p>
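      <p>To make the temporal perspective more concrete, the following minimal Python sketch estimates first-order Markov transition probabilities from a sequence of coded events. It is a plain maximum-likelihood estimate under stated assumptions, not an implementation of the pMiner, Heuristic Miner or Fuzzy Miner algorithms mentioned above. The event labels (write, chat, delete) are hypothetical; in practice they would be replaced by the SSRL features identified from the low-level data.</p>
      <preformat><![CDATA[
```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate first-order Markov transition probabilities from one sequence."""
    counts = defaultdict(Counter)
    # Count each observed pair (current event, next event).
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    probabilities = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        probabilities[state] = {
            target: n / total for target, n in next_counts.items()
        }
    return probabilities


# Hypothetical, time-ordered events for a single group (illustrative only).
events = ["write", "chat", "write", "write", "chat", "delete", "write"]
model = transition_probabilities(events)
for state in sorted(model):
    print(state, model[state])
```
]]></preformat>
      <p>Transition tables estimated in this way for each group could then be compared across groups with different levels of success, in line with the research questions above.</p>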
    </sec>
    <sec id="sec-6">
      <title>4. ACKNOWLEDGMENTS</title>
      <p>This research is partially funded by the European Regional
Development Fund and the Spanish National Research Agency
under project grant TIN2017-85179-C3-2-R.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] ACM and IEEE-CS.
          <article-title>Computing Curricula 2020 - CC 2020: Paradigms for Global Computing Education</article-title>
          ,
          <source>A Computing Curricula Series Report</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <collab>National Research Council</collab>
          .
          <article-title>How students learn: History, mathematics, and science in the classroom</article-title>
          . National Academies Press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Dillenbourg</surname>
          </string-name>
          , S. Jarvela, and
          <string-name>
            <given-names>F.</given-names>
            <surname>Fischer</surname>
          </string-name>
          .
          <article-title>The evolution of research on computer-supported collaborative learning</article-title>
          .
          <source>In Technology-enhanced learning</source>
          , pages
          <volume>3</volume>
          {
          <fpage>19</fpage>
          . Springer,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Järvelä</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Gašević</surname>
          </string-name>
          , T. Seppänen, M. Pechenizkiy, and
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Kirschner</surname>
          </string-name>
          .
          <article-title>Bridging learning sciences, machine learning and affective computing for understanding cognition and affect in collaborative learning</article-title>
          .
          <source>British Journal of Educational Technology</source>
          ,
          <volume>51</volume>
          (
          <issue>6</issue>
          ):
          <fpage>2391</fpage>
          -
          <lpage>2406</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Malmberg</surname>
          </string-name>
          , S. Jarvela, H. Jarvenoja, and
          <string-name>
            <given-names>E.</given-names>
            <surname>Panadero</surname>
          </string-name>
          .
          <article-title>Promoting socially shared regulation of learning in CSCL: Progress of socially shared regulation among high-and low-performing groups</article-title>
          .
          <source>Computers in Human Behavior</source>
          ,
          <volume>52</volume>
          :
          <fpage>562</fpage>
          {
          <fpage>572</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Rotherham</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Willingham</surname>
          </string-name>
          . 21st century.
          <source>Educational Leadership</source>
          ,
          <volume>67</volume>
          (
          <issue>1</issue>
          ):
          <fpage>16</fpage>
          -
          <lpage>21</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Scardamalia</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Bereiter</surname>
          </string-name>
          .
          <article-title>Knowledge building</article-title>
          . The Cambridge,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Shell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Hazley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-K.</given-names>
            <surname>Soh</surname>
          </string-name>
          , E. Ingraham, and
          <string-name>
            <given-names>S.</given-names>
            <surname>Ramsay</surname>
          </string-name>
          .
          <article-title>Associations of students' creativity, motivation, and self-regulation with learning and achievement in college computer science courses</article-title>
          . In
          <source>2013 IEEE Frontiers in Education Conference (FIE)</source>
          , pages
          <fpage>1637</fpage>
          -
          <lpage>1643</lpage>
          . IEEE,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>