<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Detect Abnormal Submissions for CodeWorkout Dataset</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alex Hicks</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yang Shi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arun-Balajiee Lekshmi-Narayanan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Wei Yan</string-name>
          <email>wei.yan@nau.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Samiha Marwan</string-name>
          <email>samihamarwan21@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>CS1, Introductory Programming, Dataset Cleaning, Dataset Standards, Educational Data Mining</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept of Computer Science, Utah State University</institution>
          ,
          <addr-line>Logan, UT</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept of Computer Science, Virginia Tech</institution>
          ,
          <addr-line>Blacksburg, VA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dept. of Computer Science, University of Virginia</institution>
          ,
          <addr-line>Charlottesville, VA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Intelligent Systems Program, University of Pittsburgh</institution>
          ,
          <addr-line>Pittsburgh, PA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>School of Informatics, Computing and Cyber Systems, Northern Arizona University</institution>
          ,
          <addr-line>Flagstaff, AZ</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Students' interactions while solving problems in learning environments (i.e., log data) are often used to support students' learning. For example, researchers use log data to develop systems that provide students with personalized problem recommendations based on their knowledge level. However, anomalies in students' log data, such as cheating when solving programming problems, could introduce a hidden bias. As a result, these systems may provide inaccurate problem recommendations and therefore defeat their purpose. Classical cheating detection methods, such as MOSS, can be used to detect code plagiarism. However, these methods cannot detect other abnormal events, such as a student gaming a system with multiple attempts of similar solutions to a particular programming problem. This paper presents a preliminary study of log data with anomalies. The goal of our work is to account for abnormal instances when modeling personalized recommendations in programming learning environments.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Students cheating when submitting programming solutions is a
common occurrence. Cheating can take many forms: copying
solutions that are available online, copying from other students
taking the same course, or other means of
plagiarism. Researchers have explored methods
to curb cheating in the context of academic integrity [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Some techniques that could work [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] include the detection
of collusion and continual feedback that encourages
students towards better academic integrity. There is a tendency
for students to cheat when solving programming puzzles
or practice assignments. When log data is collected
from the interfaces used for programming
assignments, some of these anomalies risk being
recorded alongside regular student interactions. This could
affect student modeling approaches that use the
interaction logs to make recommendations for students [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Student modeling approaches for programming
assignments such as the Normalized Programming State
Model [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] can use the Error Quotient and the Watwin score,
measures of change that help estimate student knowledge or
understanding [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. In other cases, student modeling facilitates
the identification and prediction of students’ learning
profiles in tutoring systems, which, in turn, enables such
systems to be adaptive and personalized to students’ needs [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
This makes them sensitive to the quality of the data and
anomalies created by students gaming the system or
cheating.
      </p>
      <p>CSEDM’24: 8th Educational Data Mining in Computer Science Education
(CSEDM) Workshop, June 14, 2024, Atlanta, GA.
https://www.samihamarwan.com/ (S. Marwan)</p>
      <p>
        One of the most popular approaches to detecting cheating is “The Measure Of
Software Similarity (MOSS)”, an open-source tool designed
to identify similarities between students’ programming
assignments [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. However, to our knowledge, there is no
evidence that researchers apply cheating detection methods
on online shared data before applying log data analysis and
student modeling.
      </p>
      <p>We present a work in progress in which we examine this
issue closely in order to mitigate anomalies in student
submissions using 1) classical methods such as the Measure of
Software Similarity (MOSS), and 2) alternative approaches based on
analyzing log data. We use the CodeWorkout (CWO)
programming dataset (as introduced in [10]), available at
https://pslcdatashop.web.cmu.edu/DatasetInfo?datasetId=3458. Although
generative AI is now widely used, this dataset was
collected before 2021, when generative AI was not generally
used to cheat when submitting programming solutions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methods &amp; Analysis</title>
      <p>In this work, we compare two ways to analyze abnormal
submissions:</p>
      <p>Proposed method: Log Data Analysis. We used two
main indicators to explore anomalies, such as suspected
cheating behaviors, in the submission log data: the number
of submission attempts before completing the exercise, and
the elapsed time between correct submissions. We chose
these variables because students
who submit a correct solution on their first
attempt could be cheating. We discuss this in more detail
below.</p>
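      <p>As a minimal sketch of how these two indicators could be derived from a submission log, the following Python example counts attempts up to the first fully correct submission and measures the time between a student's consecutive correct submissions. The row layout (student, problem, timestamp, score) and the sample data are assumptions made for illustration; they are not the actual CWO schema, and the functions are hypothetical helpers rather than the analysis code used in this study.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log rows: (student_id, problem_id, ISO timestamp, score in [0, 1]).
# The field layout is an assumption for illustration, not the CWO schema.
LOG = [
    ("s1", "p1", "2020-02-01T10:00:00", 0.5),
    ("s1", "p1", "2020-02-01T10:04:00", 1.0),
    ("s1", "p2", "2020-02-01T10:05:00", 1.0),
    ("s2", "p1", "2020-02-01T11:00:00", 1.0),
    ("s2", "p2", "2020-02-01T11:00:30", 1.0),
]


def attempts_until_correct(log):
    """Map (student, problem) to the number of attempts up to the first fully correct one."""
    counts = defaultdict(int)
    solved = {}
    for student, problem, ts, score in sorted(log, key=lambda r: r[2]):
        key = (student, problem)
        if key in solved:
            continue  # ignore resubmissions after the first correct attempt
        counts[key] += 1
        if score >= 1.0:
            solved[key] = counts[key]
    return solved


def gaps_between_correct(log):
    """Map student to the seconds elapsed between consecutive first-correct submissions."""
    first_correct = {}
    for student, problem, ts, score in sorted(log, key=lambda r: r[2]):
        if score >= 1.0 and (student, problem) not in first_correct:
            first_correct[(student, problem)] = datetime.fromisoformat(ts)
    by_student = defaultdict(list)
    for (student, _), when in first_correct.items():
        by_student[student].append(when)
    gaps = {}
    for student, times in by_student.items():
        times.sort()
        gaps[student] = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return gaps
```
]]></preformat>
      <p>In this toy log, student s2 solves both problems on the first attempt and only 30 seconds apart, which is the kind of pattern the two indicators are meant to surface for closer inspection. Neither indicator alone shows that a student cheated.</p>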
      <p>Baseline method: MOSS. MOSS is a tool used to detect
cheating in programming submissions. The tool works by
taking in all of the students’ submissions and comparing
them pairwise for similarities. We used MOSS to compare students’ code
submissions for a selected set of problems drawn from the easy,
medium, and hard assignments available on CWO.</p>
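      <p>To make the idea of pairwise comparison concrete, the sketch below scores every pair of submissions by the Jaccard overlap of their token k-grams. This is a simplified stand-in chosen for illustration and is not the algorithm MOSS itself uses; the tokenizer, the value of k, and the function names are all assumptions.</p>
      <preformat><![CDATA[
```python
import re
from itertools import combinations


def token_kgrams(code, k=3):
    """Return the set of k-grams over a crude token stream (identifiers, numbers, symbols)."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(submissions, k=3):
    """Compare every pair of submissions and return {(id_a, id_b): similarity}."""
    grams = {sid: token_kgrams(code, k) for sid, code in submissions.items()}
    return {
        (a, b): jaccard(grams[a], grams[b])
        for a, b in combinations(sorted(grams), 2)
    }
```
]]></preformat>
      <p>Because every submission is compared with every other one, the number of comparisons grows quadratically with the number of submissions per problem.</p>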
    </sec>
    <sec id="sec-3">
      <title>3. Results &amp; Discussions</title>
      <sec id="sec-3-1">
        <title>3.1. MOSS Detection Results</title>
        <p>We further evaluated whether accessible and common
cheating detection tools such as MOSS can be applied to detect
students’ cheating in this dataset. However, we found that
running MOSS across CWO exercises led to high rates of
similarity on a majority of students’ submissions. In
addition, we found no clear difference between students whom
we previously identified as suspicious and those whom we believe
engaged authentically with the CWO exercises. We
hypothesize that this failure is due to the small size of the
solutions to several CWO exercises. Some solutions to these
exercises could be just 10 lines of source code, as these
problems are well-constrained and target specific learning goals.
Hence, these problems may not admit meaningfully different alternative
solutions (see Figure 1). Students like those in the
example may end up with 93% of their solutions matching despite
no indications of anomalous behaviour. This indicates that
identifying an acceptable threshold for MOSS detection on
CWO exercises is impractical and highlights the need for
other options.</p>
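        <p>The effect of short, well-constrained solutions can be illustrated with a toy example. The sketch below replaces identifiers with a placeholder and compares the resulting token sequences with Python's difflib. The two snippets are hypothetical answers written for this illustration (they are not the pair shown in Figure 1), the keyword list is deliberately incomplete, and the comparison is not how MOSS works internally.</p>
        <preformat><![CDATA[
```python
import re
from difflib import SequenceMatcher

# Small, non-exhaustive keyword list; an assumption made for this illustration.
KEYWORDS = {"public", "int", "boolean", "return", "if", "else", "for", "while"}


def normalize(code):
    """Tokenize and replace every non-keyword identifier with the placeholder ID."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return [
        "ID" if re.match(r"[A-Za-z_]", t) and t not in KEYWORDS else t
        for t in tokens
    ]


def similarity(code_a, code_b):
    """Ratio in [0, 1] of matching tokens between two normalized submissions."""
    return SequenceMatcher(None, normalize(code_a), normalize(code_b)).ratio()


# Two hypothetical, independently written answers to the same tiny exercise.
STUDENT_A = "public int sum(int a, int b) { return a + b; }"
STUDENT_B = "public int sum(int x, int y) { return x + y; }"
```
]]></preformat>
        <p>Here the two answers differ only in variable names, so after normalization they are indistinguishable and similarity(STUDENT_A, STUDENT_B) is 1.0, even though nothing suggests that either was copied.</p>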
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Log Data Analysis Detection</title>
        <p>We calculate each student’s “one shot” percent, i.e., the percentage of
CWO exercises that a student answers correctly
on their first attempt. In Table 1, this is represented as
the one_shot column and is reported as a correlation with
the student’s final course grade. Once this value was
calculated, we compared the
correlation for a student’s first score on a given problem with
the correlation for how often they got their first attempt fully correct,
and found a suspicious difference. Figure 2 shows the
relationship between the first scores of the students’ submissions
to the exercises and their final exam scores, and the plot
on the right shows the distribution of the first scores of the
students’ submissions. While many students perform well on
their first submissions of exercises, which would suggest mastery
of programming skills, only a small subset match this
performance in the course as a whole. Specifically, students who
perform well on the CWO exercises on their first attempt
often do not perform well in their final course grade.
This preliminary finding did not make intuitive sense
and led us to investigate this phenomenon further using
more traditional methods, including MOSS.</p>
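        <p>A minimal sketch of the one-shot computation and of a correlation with final grades is given below. The input layout and function names are assumptions for illustration, and Pearson's coefficient is used here only as one plausible choice of correlation; this section does not state which coefficient underlies Table 1.</p>
        <preformat><![CDATA[
```python
from collections import defaultdict
from math import sqrt


def one_shot_percent(first_scores):
    """first_scores: iterable of (student_id, problem_id, score on the first attempt).

    Returns {student_id: percent of attempted exercises solved fully on the first try}.
    """
    solved = defaultdict(int)
    total = defaultdict(int)
    for student, _problem, score in first_scores:
        total[student] += 1
        if score >= 1.0:
            solved[student] += 1
    return {s: 100.0 * solved[s] / total[s] for s in total}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```
]]></preformat>
        <p>Given a mapping of final grades per student, the one-shot values and the grades can be aligned by student identifier and passed to pearson. A weak or negative coefficient for students with very high one-shot percentages would be consistent with the mismatch described above.</p>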
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Limitations and Future Work</title>
      <p>This preliminary investigation focused only on CWO
submissions, but we hope this data cleaning approach can be
generalized to other datasets that use the ProgSnap2 format.
We also hope to continue investigating the submission metadata
included in this format to find more accurate
indicators of cheating behavior in the programming snapshot
data. While MOSS is generally used to compare students’ final
submissions with one another,
in future work we will consider running MOSS
on sequential data from platforms
like CWO that allow multiple submissions. For example, we
could compare attempt 1 of student 1 with attempt 2 of
student 2, and so on, to see whether students copy each other’s
solutions from their first attempt onwards, or only after
failing multiple attempts, in order to proceed to the
next programming problem on the CWO platform.</p>
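      <p>One way such cross-attempt comparisons could be enumerated is sketched below. The generator lists every pairing of attempts made by two different students on one problem; the data layout and the function name are hypothetical, and each resulting pair would still have to be passed to a similarity tool such as MOSS.</p>
      <preformat><![CDATA[
```python
from itertools import combinations, product


def cross_attempt_pairs(attempts_by_student):
    """Yield ((student_a, i), (student_b, j)) for every pair of attempts by different students.

    attempts_by_student maps a student id to that student's ordered list of
    submissions for one problem; i and j are 1-based attempt numbers.
    """
    for a, b in combinations(sorted(attempts_by_student), 2):
        n_a = len(attempts_by_student[a])
        n_b = len(attempts_by_student[b])
        for i, j in product(range(1, n_a + 1), range(1, n_b + 1)):
            yield (a, i), (b, j)
```
]]></preformat>
      <p>The number of pairs grows with the product of the attempt counts, so in practice the comparison may need to be restricted, for example to attempts that are close in time.</p>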
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We thank Dr. Thomas Price for his
guidance on this work. We also thank the organizers of the 2023
LearnLab Summer School and our sponsors, Dr.
Peter Brusilovsky and the SPLICE project PI(s), for bringing us
all together to work on this.</p>
      <p>[9, continued] (IEEE Cat. No. 99CH37011), volume 3, IEEE, 1999, pp.
13B3–18.</p>
      <p>[10] Y. Shi, R. Schmucker, M. Chi, T. Barnes, T. Price,
KC-Finder: Automated knowledge component discovery
for programming problems, International Educational
Data Mining Society (2023).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. E.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. F.</given-names>
            <surname>Kizilcec</surname>
          </string-name>
          ,
          <article-title>A systemic model of academic (mis)conduct to curb cheating in higher education</article-title>
          ,
          <source>Higher Education</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O.</given-names>
            <surname>Karnalim</surname>
          </string-name>
          , Simon,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chivers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. S.</given-names>
            <surname>Panca</surname>
          </string-name>
          ,
          <article-title>Educating students about programming plagiarism and collusion via formative feedback</article-title>
          ,
          <source>ACM Transactions on Computing Education (TOCE)</source>
          <volume>22</volume>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>31</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Brusilovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Millán</surname>
          </string-name>
          ,
          <article-title>User models for adaptive hypermedia and adaptive educational systems</article-title>
          ,
          <source>in: The adaptive web: methods and strategies of web personalization</source>
          , Springer,
          <year>2007</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>53</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Carter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Hundhausen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Adesope</surname>
          </string-name>
          ,
          <article-title>The normalized programming state model: Predicting student performance in computing courses based on programming behavior</article-title>
          ,
          <source>in: Proceedings of the eleventh annual international conference on international computing education research</source>
          ,
          <year>2015</year>
          , pp.
          <fpage>141</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T. W.</given-names>
            <surname>Price</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hovemeyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Rivers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Bart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Kazerouni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Becker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Petersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gusukuma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Edwards</surname>
          </string-name>
          , et al.,
          <article-title>ProgSnap2: A flexible format for programming process data</article-title>
          ,
          <source>in: Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>356</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Umer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Susnjak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mathrani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Suriadi</surname>
          </string-name>
          ,
          <article-title>Current stance on predictive analytics in higher education: Opportunities, challenges and future directions</article-title>
          ,
          <source>Interactive Learning Environments</source>
          <volume>31</volume>
          (
          <year>2023</year>
          )
          <fpage>3503</fpage>
          -
          <lpage>3528</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hellas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Leinonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ihantola</surname>
          </string-name>
          ,
          <article-title>Plagiarism in take-home exams: Help-seeking, collaboration, and systematic cheating</article-title>
          ,
          <source>in: Proceedings of the 2017 ACM Conference on Innovation and Technology in Computer Science Education</source>
          , ITiCSE '17,
          Association for Computing Machinery, New York, NY, USA,
          <year>2017</year>
          , pp.
          <fpage>238</fpage>
          -
          <lpage>243</lpage>
          . URL: https://doi.org/10.1145/3059009.3059065. doi:10.1145/3059009.3059065.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sosnovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Müter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Valkenier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brinkhuis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hofman</surname>
          </string-name>
          ,
          <article-title>Detection of student modelling anomalies</article-title>
          ,
          <source>in: Lifelong Technology-Enhanced Learning: 13th European Conference on Technology Enhanced Learning, EC-TEL 2018, Leeds, UK, September 3-5, 2018, Proceedings 13</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>531</fpage>
          -
          <lpage>536</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K. W.</given-names>
            <surname>Bowyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. O.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <article-title>Experience using “MOSS” to detect cheating on programming assignments</article-title>
          ,
          <source>in: FIE'99 Frontiers in Education. 29th Annual Frontiers in Education Conference. Designing the Future of Science and Engineering Education</source>
          . Conference Proceedings
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>