=Paper=
{{Paper
|id=Vol-2531/paper2
|storemode=property
|title=Segmenting Student Answers to Textual Exercises Based on Topic Modeling
|pdfUrl=https://ceur-ws.org/Vol-2531/poster03.pdf
|volume=Vol-2531
|authors=Jan Philip Bernius,Anna Kovaleva,Bernd Bruegge
|dblpUrl=https://dblp.org/rec/conf/seuh/BerniusKB20
}}
==Segmenting Student Answers to Textual Exercises Based on Topic Modeling==
<pdf width="1500px">https://ceur-ws.org/Vol-2531/poster03.pdf</pdf>
<pre>
   Segmenting Student Answers to Textual Exercises
              Based on Topic Modeling
                Jan Philip Bernius                                            Anna Kovaleva                                                    Bernd Bruegge
           Department of Informatics                                Department of Informatics                                        Department of Informatics
         Technical University of Munich                           Technical University of Munich                                   Technical University of Munich
               Munich, Germany                                          Munich, Germany                                                  Munich, Germany
            janphilip.bernius@tum.de                                  anna.kovaleva@tum.de                                               bruegge@in.tum.de


   Abstract—Giving feedback when grading textual exercises in                                 Multiple graders require means to create consistent feedback
very large courses is a challenge, especially when instructors                                for learners.
want to provide consistent feedback to each student in real-time                                 This paper outlines a segmentation algorithm to be applied
already during the lecture.
   This paper outlines a real-time assessment approach based on                               to student answers to textual exercises. It is intended to be
topic modeling and reuse. Segmenting student answers fosters a                                used as part of an assessment system for textual exercises,
structured form of feedback, improving the feedback’s                                         fostering reuse of feedback between students and increasing
reusability. We present the design of an answer segmentation system, to                       consistency between assessments [7].
be integrated with an assessment system for textual exercises. The
resulting system aims at quicker and more consistent feedback                                                    II. SEGMENTING STUDENT ANSWERS
for textual exercises and an improved learning experience for
students.                                                                                        We abstracted the topic modeling approach and preserve
                                                                                              the idea that every answer is a collection of topics, and
                         I. INTRODUCTION                                                      many topics are distributed among different answers [8]. We
   With a growing number of students enrolled at universi-                                    compensate for the scarcity of the words in the answers by
ties worldwide,1 large courses have thousands of students                                     reducing topics to keywords. Another strategy adapted from
participating. Large courses pose a problem for instructors                                   other works is “vocabulary introduction” [9]: as soon as new
when grading textual exercises. The main problem is the                                       keywords are introduced, a new segment begins. Unlike
asynchronous assessment, which usually requires a week of                                     thesaurus- or ontology-based approaches, ours does not know
time, or even longer. To reduce this delay, we teach interactive                              the keywords in advance; they are calculated for every
lectures where we combine theory and exercises live during the                                problem separately.
lectures, grade them immediately and provide quick feedback                                      The algorithm can be separated into three phases: Text Pre-
to students [1]. This increases student comprehension and                                     processing, Keyword Extraction and Segmentation. Figure 1
deepens understanding.                                                                        depicts the algorithm’s flow of events, which is described in
   Technology to foster interaction and discussion within large                               detail in the following sections. Segments can be used as a
courses does exist [2, 3], as well as scalable exercise sys-                                  baseline for providing manual structured instructor feedback,
tems for programming and modeling exercises with automatic                                    or as a unit for assessment systems to generate feedback
assessment [4, 5]. Textual exercises are commonly used in                                     automatically [7].
examination, but no automatic assessment solution is available
on the market for this exercise type.                                                         A. Text Preprocessing
   Conducting open answer questions requires time-consuming                                      Student answers are of inconsistent quality in regards to
activities from instructors, including designing exercises and                                spelling, formatting and use of punctuation. Poor data qual-
manual assessment, due to the high variability in student                                     ity impacts the segmentation quality negatively. Due to the
answers. To reduce efforts, instructors tend to reuse exer-                                   nature of the system, manual preprocessing is not practical.
cises from previous years. Grading is a repeatable process,                                   Student submissions must not be modified, as feedback should
instructors look for common mistakes or predefined solution                                   be based on the original answer only. We correct common
patterns. The students’ learning success benefits from detailed                               irregularities to an intermediate format suitable for further
and personalized feedback [6]. To enable large scale courses,                                 calculations.
the need to reuse feedback comments arises. Individual feed-                                     Removing stop words from text is a very common way
back can still rely on the domain expertise of the teacher.                                   to clean textual data for Natural Language Processing (NLP)
                                                                                              tasks [9, 10]. Words like “I”, “the”, “what” and “did” do not
  1 United Nations, “UN Global Assessment on Higher Education Reveals
                                                                                              contain much lexical content and can be removed.
Broad Socio-Economic, Gender Disparities,” https://news.un.org/en/story/
2017/04/555642-un-global-assessment-higher-education-reveals-broad-                              Lemmatization is the process of reducing a word to its
socio-economic-gender, 2017.                                                                  meaningful root. Naturally, students use different forms of a


S. Krusche, S. Wagner (Hrsg.): SEUH 2020                                                                                                                            72


                         Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
[Figure 1: UML activity diagram of the segmentation algorithm.
 Actions: Remove punctuation, Convert to lower-case, Remove stop words, Lemmatize all words, Extract keywords, Stem keywords,
          Segment answers into text blocks, Search for stemmed keywords, Merge text blocks between topic shifts.
 Objects: Answer, Keyword, Sentence, Atomic Text Segment, TopicShift, TextBlock.
 Guards:  [same keywords found], [new keywords found].]

                              Fig. 1. The segmentation algorithm’s flow of events depicted using a UML activity diagram.


word: either singular or plural, different tenses, degrees of                     The algorithm produces topically coherent segments.
comparison, etc. The result of the text preprocessing is a set                 Segments allow for more structured assessment approaches,
of lemmatized lower-case words without any punctuation or                      similar to how modeling exercises can be assessed today. This
stop words.                                                                    enables the use of semi-automated assessment systems,
                                                                               reducing the delay between exercise and feedback.
B. Keyword Extraction
                                                                               Further, tools can help to keep feedback
  We generalize the idea of topic modeling that claims that
                                                                               between segments.
every student’s submission is a collection of topics that are
                                                                                  The result of the algorithm’s application can be improved in
common among different answers. Compensating for data
                                                                               two areas: (1) deriving keywords and text blocks using statisti-
scarcity, we reduce each topic to a single keyword.
                                                                               cal models, topic models, or decision trees. (2) Additionally, a
  The resulting keywords are the ten most frequently used
                                                                               thesaurus could be used to recognize synonyms. Future work
words in the texts. The number was chosen empirically based
                                                                               is needed to evaluate this algorithm in a lecture setting.
on our data.
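
The preprocessing and keyword-extraction phases described above can be sketched as follows. This is a minimal
illustration, not the authors' implementation: the stop-word list, the LEMMAS lookup table (a stand-in for a real
lemmatizer) and the function names are assumptions made for this example. Only the step order and the default of
ten keywords come from the text.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real system would use a full one.
STOP_WORDS = {"i", "the", "what", "did", "a", "an", "is", "are", "of", "to", "and", "in"}

# Assumption: a hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"patterns": "pattern", "classes": "class", "uses": "use"}


def preprocess(answer):
    """Return lemmatized lower-case words without punctuation or stop words."""
    no_punct = re.sub(r"[^\w\s]", " ", answer)   # remove punctuation
    words = no_punct.lower().split()              # convert to lower-case
    content = [w for w in words if w not in STOP_WORDS]  # remove stop words
    return [LEMMAS.get(w, w) for w in content]    # lemmatize all words


def extract_keywords(answers, k=10):
    """Return the k most frequently used preprocessed words over all answers."""
    counts = Counter()
    for answer in answers:
        counts.update(preprocess(answer))
    return [word for word, _ in counts.most_common(k)]
```

Calling extract_keywords on all submissions to one exercise yields the keyword list for that exercise, which matches
the statement that keywords are calculated for every problem separately.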
                                                                                                             REFERENCES
C. Segmentation                                                                 [1] S. Krusche, A. Seitz, J. Börstler, and B. Bruegge, “Interactive learning:
                                                                                    Increasing student participation through shorter exercise cycles,” in 19th
   The segmentation of the texts is split up into two steps:                        Australasian Computing Education Conference. ACM, 2017, pp. 17–
First, the answers are split up into initial text blocks. Second,                   26.
                                                                                [2] J. Knobloch and E. Gigantiello, “AMATI: Another massive audience
adjacent text blocks are considered and merged if there are                         teaching instrument,” in 15. Workshop für Software Engineering im
no new keywords introduced. The result of this is a set of                          Unterricht der Hochschulen, 2017, pp. 63–68.
segments for each answer.                                                       [3] R. Mayer, A. Stull, K. DeLeeuw, K. Almeroth, B. Bimber, D. Chun,
                                                                                    M. Bulger, J. Campbell, A. Knight, and H. Zhang, “Clickers in college
   For identifying sentences we use a pre-trained model of                          classrooms: Fostering learning with questioning methods in large lecture
the “punkt tokenizer” [11, 12] and a custom implementation                          classes,” Contemporary Educational Psychology, vol. 34, pp. 51–57,
for bulleted lists. To identify clauses we rely on conjunctions.                    2009.
                                                                                [4] S. Krusche and A. Seitz, “Artemis: An automatic assessment manage-
This is an incomplete clause identification approach, however                       ment system for interactive learning,” in 49th ACM Technical Symposium
sufficient for this use case. We consider that subordinating                        on Computer Science Education. ACM, 2018, pp. 284–289.
conjunctions indicate a new clause, only considering sentences                  [5] S. Krusche and A. Seitz, “Increasing the Interactivity in Software
                                                                                    Engineering MOOCs - A Case Study,” in 52nd Hawaii International
that are longer than 20 words to reduce false positives.                            Conference on System Sciences, 2019, pp. 1–10.
   We use a stemmer to unify different forms of a word in the                   [6] A. Poulos and M. J. Mahony, “Effectiveness of feedback: the students’
text. Based on lexical cohesion and vocabulary introduction                         perspective,” Assessment & Evaluation in Higher Education, vol. 33,
                                                                                    no. 2, pp. 143–154, 2008.
[9, 13], we define segments. Within each student answer, the                    [7] J. P. Bernius and B. Bruegge, “Towards the Automatic Assessment of
extracted keywords are compared for adjacent segments. A                            Text Exercises,” in 2nd Workshop on Innovative Software Engineering
change in keywords signals a topic shift. For equal keywords,                       Education, 2019, pp. 19–22.
                                                                                [8] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” J.
segments are merged into a single text block.                                       Mach. Learn. Res., vol. 3, pp. 993–1022, 2003.
                                                                                [9] M. A. Hearst, “TextTiling: Segmenting text into multi-paragraph subtopic
                          III. SUMMARY                                              passages,” Computational Linguistics, vol. 23, pp. 33–64, 1997.
                                                                               [10] A. Hulth, “Improved automatic keyword extraction given more linguistic
                                                                                    knowledge,” in Conference on Empirical Methods in Natural Language
   In this paper we have presented a high level overview of a                       Processing. ACL, 2003, pp. 216–223.
new algorithm based on topic modeling and text segmentation                    [11] S. Bird, E. Klein, and E. Loper, Natural Language Processing with
to segment student answers into topically coherent text blocks.                     Python, 1st ed. O’Reilly Media, Inc., 2009.
                                                                               [12] T. Kiss and J. Strunk, “Unsupervised multilingual sentence boundary
Following a “divide & conquer” approach, we first divide                            detection,” Comput. Linguist., vol. 32, no. 4, pp. 485–525, 2006.
student answers into initial, small segments and then merge                    [13] M. A. K. Halliday and R. Hasan, Cohesion in English, ser. English
them according to topic boundaries to larger text blocks.                           Language Series. London: Longman, 1976.
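
The two-step segmentation of Section II-C (split into initial text blocks, then merge adjacent blocks that introduce
no new keywords) can be sketched as follows. This is an illustration under stated assumptions, not the authors'
code: the regex sentence splitter stands in for the pre-trained punkt tokenizer, stem() is a naive stand-in for a
real stemmer, the SUBORDINATING list is an example set, and bulleted-list handling is omitted. The merge rule (a block
joins the previous segment when its keywords are a subset of that segment's keywords) is one possible reading of
"no new keywords introduced".

```python
import re

# Assumption: an example set of subordinating conjunctions.
SUBORDINATING = {"because", "although", "while", "since", "whereas", "unless"}


def stem(word):
    """Naive suffix stripper; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if suffix == "s" and word.endswith("ss"):
            continue
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def split_clauses(sentence, min_words=20):
    """Split sentences longer than min_words before subordinating conjunctions."""
    words = sentence.split()
    if len(words) <= min_words:
        return [sentence]
    clauses, current = [], []
    for word in words:
        if word.lower().strip(",;") in SUBORDINATING and current:
            clauses.append(" ".join(current))
            current = []
        current.append(word)
    clauses.append(" ".join(current))
    return clauses


def initial_blocks(answer):
    """Step 1: split an answer into sentences, then long sentences into clauses."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [clause for s in sentences for clause in split_clauses(s)]


def segment(answer, keywords):
    """Step 2: merge adjacent blocks until a block introduces new keywords."""
    stemmed = {stem(k) for k in keywords}
    segments = []  # list of (text, keywords found when the segment was opened)
    for block in initial_blocks(answer):
        found = {stem(t) for t in re.findall(r"\w+", block.lower())} & stemmed
        if segments and found.issubset(segments[-1][1]):
            text, seen = segments[-1]
            segments[-1] = (text + " " + block, seen)  # same keywords: merge
        else:
            segments.append((block, found))            # new keywords: topic shift
    return [text for text, _ in segments]
```

The keywords argument is expected to be the per-exercise keyword list produced by the keyword-extraction phase.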



</pre>