Clustering Students’ Short Text Reflections: A Software Engineering Course Case Study

Mohsen Dorodchi, Alexandria Benedict, Erfan Al-Hossami, Andrew Quinn, Sandra Wiktor, Aileen Benedict, and Mohammadali Fallahian
University of North Carolina at Charlotte, Charlotte, NC 28223
Mohsen.Dorodchi@uncc.edu, abenedi4@uncc.edu, ealhossa@uncc.edu, aquinn16@uncc.edu, swiktor@uncc.edu, abenedi3@uncc.edu, mfallahi@uncc.edu


ABSTRACT
Student reflections can provide instructors with beneficial knowledge regarding students’ progress in the course, the challenges they are facing, and how the instructor can respond more effectively to their needs. Reading every student reflection, however, can be a time-consuming task that may affect the instructor’s ability to efficiently address student needs in a timely manner. In this research, we explore the use of clustering and sorting of student reflections to shorten reading time while maintaining a comprehensive understanding of the reflection content. We obtain student reflections from a software engineering course. Next, we generate transformer-based sentence embeddings and then cluster the reflections using K-means. Lastly, we sort the reflections based on the distance of each reflection from its cluster center. We conduct a small-scale user study with the course’s teaching assistants and provide promising preliminary results showing a significant increase in reading-time efficiency without sacrificing understanding.

Keywords
Natural Language Processing, Student Reflections, Clustering

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. INTRODUCTION
Reflections are an effective way for instructors to detect what their students may be struggling with throughout their courses, gain a perspective on students’ impressions of course content, and track their overall progress [9]. However, in order to utilize these benefits to the fullest, instructors would need to manually read through each individual reflection. Manually analyzing reflections can be overwhelming for an instructor, especially in large classroom settings, where timely feedback is needed to address students’ possible concerns. Machine learning and knowledge discovery-based methods have been used to assist educators in understanding and helping students [14, 1, 20]. Unsupervised methods in natural language processing (NLP), such as topic modeling, have been used to automatically extract topics from student reflective journals [5]. However, they fall short when it comes to short text, typically around a sentence in length, such as tweets. Recent research has utilized K-means clustering along with transformer-based sentence embeddings to automatically extract topics from tweets [12, 2]. K-means clustering is often supplemented with a representation of the text. Representations include statistically learnt representations such as term frequency–inverse document frequency (TF-IDF) [13], neurally learnt representations also known as word embeddings (e.g., Word2Vec [19], GloVe [22]), and, more recently, representations computed by large pretrained transformer deep learning models.

Transformers are deep learning models following the architecture proposed by Vaswani et al. [27]. These models often undergo unsupervised pretraining on a massive text corpus to create an initial version of the network that is later fine-tuned for more specific tasks, in a process called transfer learning. Pretrained transformers such as BERT [6], RoBERTa [16], and GPT-3 [4] have achieved state-of-the-art results in many natural language processing tasks. Some of these tasks include detecting positive and uplifting discussion on social media (e.g., [17]), determining answers to questions given a passage of text (e.g., [28]), summarizing text (e.g., [15]), and estimating semantic similarity between sentences. For this reason, we select a transformer-based language model to create a semantic representation of student responses.

In this research, we implement an approach using k-means
clustering from the scikit-learn library and utilize transformer-based sentence embeddings. We evaluate our approach in a preliminary user study, observing the time taken for teaching assistants to read and analyze student reflections.

2. DATASET
Course. The data used in our research was collected from an undergraduate software engineering course based on the active learning course model proposed in [7]. There were 108 students enrolled in the course. Modules are organized based on the concepts being taught and typically span approximately one week. The course contained 11 modules in total, with the topics listed in Table 1. Following the active learning course model presented in Dorodchi et al. [10, 7], each module is typically divided into multiple scaffolds: prep-work to complete before class, including reading assignments and videos to watch; in-class activities; post-lecture activities, including assignments and labs; and a reflection at the end of the module. Labs are more challenging assignments provided to students which require hands-on coding. These lab activities are typically divided into multiple parts. There are a total of 4 labs in this course, with the first lab beginning in Module 2 and the last lab being introduced in Module 8.

Table 1: Concepts taught within each module of the software engineering course.

  Module   Topic(s)
  1        Introduction to Software Engineering and Agile Development Methods
  2        Introduction to Requirements and Modeling
  3        Requirement Analysis and Modeling
  4        Architecture and Modeling
  5        Data Flow Diagrams, Context Diagrams, and UML Diagrams
  6        Use Case Diagrams & Extracting Requirements
  7        Cloud-based Software Engineering, Testing, Object-oriented Design Pattern
  8        Microservices, Feasibility
  9        Reliable Programming
  10       Final Exam
  11       Final Project

Data Collection. A survey questionnaire was provided to students within Canvas, the University’s Learning Management System (LMS), at the end of each module to allow students to reflect on their learning and challenges. We refer to student responses to this questionnaire as student reflections throughout this work. The questions asked of students were:

  1. On a scale of 1 to 5, with 5 being Very Active and 1 being Not Active, how engaged would you rate your group this week?

  2. What was your biggest challenge this past week? This can include in-class activities, assignments, prep work, studying, time management, motivation, and so on.

  3. How can you address the challenge you mentioned above? What can you do to overcome this challenge for next time?

For the purpose of this research, we focused solely on the students’ responses to question 2, as this question was free-response and would provide unique responses for the clustering process.

Dataset Statistics. We used two different module reflections from the software engineering course throughout this study: the Module 7 reflections and the Module 8 reflections. Table 2 shows descriptive statistics of the collected student reflection corpora. The selected module reflections were comparable in size. The response rates for the Module 7 and Module 8 reflections were 94 (87.0%) and 89 (82.4%) out of 108 students after preprocessing, respectively. The total word counts were 1866 and 1390 for the Module 7 and Module 8 reflections, respectively. Most reflections contained one to two sentences, and most were around a sentence in length.

Table 2: Descriptive statistics of the Module 7 and Module 8 reflections collected from the undergraduate software engineering course.

  Module Reflection          Reflection 7   Reflection 8
  Responses (%)              94 (87.0%)     89 (82.4%)
  Avg. Word Count            19.4           15.4
  Avg. Number of Sentences   1.4            1.3
  Avg. Words per Sentence    14.0           11.5
  Total Words                1866           1390
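For concreteness, statistics of this kind can be reproduced from a plain list of response strings with a few lines of Python. The sketch below is ours rather than the authors’ code, and it uses a naive punctuation-based sentence split, which the paper does not specify.

```python
import re

def reflection_stats(responses):
    """Descriptive statistics (Table 2 style) for a list of response strings."""
    word_counts = [len(r.split()) for r in responses]
    # Naive sentence segmentation on ., ! and ?; the paper's method is unspecified.
    sent_counts = [
        max(1, len([s for s in re.split(r"[.!?]+", r) if s.strip()]))
        for r in responses
    ]
    total_words = sum(word_counts)
    total_sents = sum(sent_counts)
    return {
        "responses": len(responses),
        "avg_word_count": total_words / len(responses),
        "avg_sentences": total_sents / len(responses),
        "avg_words_per_sentence": total_words / total_sents,
        "total_words": total_words,
    }
```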
3. APPROACH
Our overall approach is illustrated in Figure 1. First, we collect data from an undergraduate course with 108 students, as described in more detail in Section 2. Then, we preprocess the data using natural language processing (Section 3.1). Next, we generate sentence embeddings (Section 3.2), cluster those embeddings (Section 3.3), and sort the reflections based on the clusters for TAs to view (Section 3.4).

Figure 1: Illustration of our clustering and sorting approach of short student reflections.

3.1 Preprocessing
Before we generate sentence embeddings from our reflections dataset, we first preprocess the data by removing any blank, or null, student responses and removing any non-breaking spaces which appear in the text. Next, the student responses are compiled and provided to the model for generating sentence embeddings.
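This step is simple enough to sketch. Assuming the reflections are exported from the LMS as a CSV with one free-text response per row (the file and column names below are hypothetical, not taken from the paper), the preprocessing could look roughly as follows; this is an illustration, not the authors’ exact code.

```python
import pandas as pd

# Hypothetical LMS export with one free-text answer per row.
df = pd.read_csv("module7_reflections.csv")

# Replace non-breaking spaces and trim surrounding whitespace.
df["response"] = (
    df["response"]
    .astype(str)
    .str.replace("\xa0", " ", regex=False)
    .str.strip()
)

# Drop blank or null responses.
df = df[(df["response"] != "") & (df["response"].str.lower() != "nan")]
responses = df["response"].tolist()
```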


3.2 Sentence Transformers
Background. Transformer architectures can be computationally inefficient when trying to find the most semantically similar pair in a sizable collection of sentences. To address this issue, sentence transformers were developed. Sentence transformers utilize mean pooling, which computes the average of all the word-level vectors in the input sentence. Pooling helps sentence transformers maintain a fixed-size vector as their output. Sentence transformers then undergo a fine-tuning process using the SNLI dataset [3], which contains over 570,000 annotated sentence pairs. During fine-tuning, Siamese and triplet networks [26] are utilized to compute weights so that the resulting sentence embeddings are optimized for meaning and can be compared with cosine similarity. Working with sentence-level representations makes tasks such as computing the semantic similarity of two sentences easier and more efficient. Sentence transformers reduce the computation time of finding the most similar Quora question from over 50 hours with standard Transformer architectures to a few milliseconds [23]. Furthermore, sentence transformers outperform regular transformers on several semantic textual similarity tasks [23].

Approach. We use the sentence-transformers package [23]. We particularly select the DistilRoBERTa-base-cased model to obtain our sentence embeddings. DistilRoBERTa-base-cased is a RoBERTa transformer model [16] distilled using [25]. The dimension of the embeddings is 768. In the embedding process, we take each student response, which is typically a sentence in length, and convert it into a vector of 768 floats representing the sentence. These embeddings are then used to cluster the reflections as described in the next subsection.
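A minimal sketch of this embedding step with the sentence-transformers API is shown below. The checkpoint name is an assumption on our part; the paper names the model only as DistilRoBERTa-base-cased, so the exact identifier may differ from what the authors used.

```python
from sentence_transformers import SentenceTransformer

# Load a distilled RoBERTa sentence-embedding model (assumed checkpoint name).
model = SentenceTransformer("all-distilroberta-v1")

# `responses` is the list of preprocessed student responses from Section 3.1.
# encode() returns one 768-dimensional vector per response.
embeddings = model.encode(responses, show_progress_bar=True)
print(embeddings.shape)  # (number_of_responses, 768)
```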
3.3 Clustering
Our earlier step yields a set of embedded student responses: one set for the Module 7 reflections and another for the Module 8 reflections. For each set of embedded student responses, we apply K-means clustering using the scikit-learn machine learning library [21]. We compute the cluster centers from the embedded student responses, hence each cluster center is represented by an embedding vector of the same shape. We also assign each response to a cluster based on the nearest cluster center.

The number of clusters was determined using the Silhouette method [24] for finding the optimal number of clusters. Using the Silhouette method, we generate 4 clusters for module reflection 7 and 8 clusters for module reflection 8.
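The clustering step can be sketched as follows: scan a small range of candidate values of k, pick the one with the highest silhouette score, and fit the final K-means model on the embeddings. The candidate range and random seed are assumptions; the paper only states that K-means and the Silhouette method were used (yielding k = 4 for Module 7 and k = 8 for Module 8).

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Choose the number of clusters by silhouette score (candidate range assumed).
best_k, best_score = 2, -1.0
for k in range(2, 11):
    candidate_labels = KMeans(n_clusters=k, random_state=42).fit_predict(embeddings)
    score = silhouette_score(embeddings, candidate_labels)
    if score > best_score:
        best_k, best_score = k, score

# Fit the final model; each cluster center has the same 768-dimensional shape
# as the sentence embeddings.
kmeans = KMeans(n_clusters=best_k, random_state=42).fit(embeddings)
labels = kmeans.labels_            # cluster assignment per reflection
centers = kmeans.cluster_centers_  # one center vector per cluster
```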

3.4 Sorting of Student Reflections
After each student reflection is assigned a cluster, the reflections undergo a sorting process. The goal of the sorting process is to order reflections from most similar to least similar to assist in the reading process. Cluster distances were calculated using the scikit-learn fit_transform function, which computes and transforms the sentence embeddings to cluster-distance space. This function uses the Euclidean distance formula for calculating the distance between a student reflection response r and its assigned cluster center r_c, as follows:

\[ \mathrm{distance}(e(r), r_c) = \sqrt{\, e(r) \cdot e(r) \;-\; 2\, e(r) \cdot r_c \;+\; r_c \cdot r_c \,} \]

where e(r) represents the student response r embedded using sentence-transformers into a vector of 768 elements, and r_c represents the computed cluster center assigned to r.

After computation, we first sort the reflections by their assigned cluster number to group reflections within the same cluster together. We then sort the reflections within each cluster by their distance to the cluster center, from smallest to largest. This way, reflections are ordered from most semantically similar to the cluster center to least semantically similar. Next, we describe our user study setup and evaluate how well this approach assists in the reading process.
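A sketch of this sorting step is given below. The paper refers to scikit-learn’s fit_transform; because the K-means model is already fitted in the previous sketch, transform() is used here to obtain the same cluster-distance mapping. The use of pandas for the sorted view is our choice, not necessarily the authors’.

```python
import numpy as np
import pandas as pd

# Map each embedding to cluster-distance space (Euclidean distance to every
# cluster center), then keep only the distance to the assigned center.
dist_to_centers = kmeans.transform(embeddings)               # shape: (n, k)
dist_to_own = dist_to_centers[np.arange(len(labels)), labels]

# Group by cluster, then order within each cluster from closest to farthest.
sorted_view = (
    pd.DataFrame({"response": responses,
                  "cluster": labels,
                  "distance": dist_to_own})
    .sort_values(["cluster", "distance"])
    .reset_index(drop=True)
)
```

A view like sorted_view is what a reader would scan in the sorted format: responses near each cluster center first, outliers last.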
4. RESULTS
4.1 Experimental Setup
In order to measure the efficacy of clustering in the knowledge extraction process, we developed a user study which compares the time efficiency of reading through and extracting topics from student reflections in two formats:

  1. Unsorted student reflections exported directly from the LMS.

  2. Sorted student reflections, ordered based on cluster distances.

First, the method of the user study is described, followed by a summary of the results. Our hypothesis when conducting this study was that clustering can help reduce the cognitive load and increase the effectiveness and efficiency of knowledge extraction.

In this user study, four teaching assistants (TAs) were selected to read through the student reflections of the software engineering course. The Module 7 and Module 8 reflections were chosen as the corpora to extract knowledge from, as the TAs had not yet read these in particular.

Each TA was assigned a reflection and a format. For example, TA 1 would read and extract topics from Reflection 7 unsorted, TA 2 would read and extract topics from Reflection 7 clustered/sorted, and so on, as illustrated in Table 3. The TAs assigned the clustered/sorted format individually ran the K-means clustering algorithm first, without reading any responses, before beginning the process.

Table 3: Assignment of TAs to specified reflection and format type for the knowledge extraction process.

  Module Reflection   Unsorted   Sorted
  Reflection 7        TA 1       TA 2
  Reflection 8        TA 3       TA 4

The free-response question used in particular for this study was:

  “What was your biggest challenge this past week? This can include in-class activities, assignments, prep work, studying, time management, motivation, and so on.”

Each TA individually read through each student’s reflection response to this question, extracted any new topics mentioned in the response, and timed themselves for the duration of the process. Once all TAs had finished, they met to discuss the topics they found and compared times and results.
4.2 Evaluation
After comparing the results of this study, we find that by providing instructors with student reflections in a clustered and sorted format, the time needed for knowledge extraction decreases while maintaining the accuracy of identifying topics. Reflection 7, with a total of 94 student responses, took 90 minutes to completely read through and extract topics from in the unsorted format, while only requiring 15 minutes in the sorted and clustered format. Reflection 8 had similar results in which efficiency increased, with a total of 89 responses taking approximately 121.4 minutes in the unsorted format and 20.9 minutes in the clustered and sorted format. It is important to note that the TA extracting knowledge from the unsorted Reflection 8 did not complete within a 90-minute time frame, thus their results were normalized based on how many reflections they did complete. These results are provided in Table 4.

Table 4: Normalized time taken to fully extract knowledge from all student responses per module reflection.

  Module Reflection   Unsorted        Sorted         N
  Reflection 7        90.0 minutes    15.0 minutes   94
  Reflection 8        121.4 minutes   20.9 minutes   89
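The paper does not spell out the normalization procedure. One reading that is consistent with the reported 121.4 minutes is a linear extrapolation from the portion the TA finished within the 90-minute window; the completed count of roughly 66 responses below is inferred from the reported figures, not stated in the paper.

\[ t_{\text{normalized}} = t_{\text{elapsed}} \cdot \frac{N}{N_{\text{completed}}}, \qquad 90 \cdot \frac{89}{66} \approx 121.4 \ \text{minutes}. \]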
In addition to the increased efficiency of knowledge extraction with a clustered and sorted format, the topics extracted remained consistent, with a slight improvement in comparison to the unsorted format. Following the portion of the user study which required TAs to individually extract topics from the reflections, they met afterwards to discuss their similarities and differences in topics. The TAs who analyzed Reflection 7 extracted the same topics from the student responses with no differences. During the discussion, the Reflection 7 TAs took turns sharing the topics they had extracted during the user study and concluded that they were in 100% agreement on the topics coded. Reflection 8, however, had one topic which was extracted from the clustered and sorted reflections but not from the unclustered/unsorted reflections. The TAs assigned Reflection 8 noted that this was most likely due to a lack of time to completely analyze all unsorted student reflections, displaying how time efficiency can also be beneficial to improving the accuracy of knowledge extraction under a time constraint. Despite the improved time efficiency, the clustered and sorted reflection format missed no topics.

We utilize the dimensionality reduction algorithm UMAP [18] to visualize the resulting clusters of student reflections, as shown in Figure 2. The student reflections for Module 7 resulted in 4 clusters with 4 major topics: managing workload, motivation and time management, lab work, and group work. The Module 8 student reflections resulted in 8 clusters, with each cluster containing a challenge in at least one of the following categories: lab work, time management, studying, motivation, and group work, while some reflections mentioned no challenges whatsoever. Managing workload, motivation, studying, and time management relate to the students’ own discerned ability to handle the coursework in general. Lab work and group work were challenges in which students related their troubles more specifically to difficult topics being covered, confusion about instructions, or trouble communicating within their groups to complete activities. Students in the “no challenges” category noted that they did not have any difficulties or confusion during the span of that module. As displayed in these scatter plots and the major topics described, there are overlaps among several of the clusters. This overlap is created by similarities in the students’ wordings. For example, two student responses within the “Managing Workload” cluster of the Module 7 reflection were:

  1. “My biggest challenge has been not procrastinating my work.”

  2. “The biggest challenge this week was working with the dash and the dashboard framework.”
Figure 2: UMAP scatter plots visualizing the student reflection K-means clusters. (a) Module 7 reflection clusters based on question 2 (student challenges). (b) Module 8 reflection clusters based on question 2 (student challenges).


The first student response was the closest to the cluster center, with a distance of 3.12, and the second student response was one of the farthest points from the cluster center, with a distance of 7.05. Therefore, clusters still maintain semantic similarity to many of the responses with smaller intracluster distances, but contain outliers due to the overlap caused by similar word usage.
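A Figure 2-style plot can be reproduced with a short script such as the one below: project the 768-dimensional embeddings to two dimensions with UMAP and color the points by their K-means cluster. UMAP hyperparameters are left at their defaults here, as the paper does not report them.

```python
import matplotlib.pyplot as plt
import umap  # provided by the umap-learn package

# Project the sentence embeddings to 2-D and color points by cluster label.
reducer = umap.UMAP(random_state=42)
coords = reducer.fit_transform(embeddings)   # shape: (n, 2)

plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=20)
plt.title("Module 7 reflection clusters (UMAP projection)")
plt.savefig("module7_clusters.png", dpi=200)
```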
5. RELATED WORK
Reflections are a necessary component of active learning courses, as they allow the instructor to track students’ impressions of the course, activities, and social learning aspects [9]. In Dorodchi et al. [8], student reflections are used in an introductory computer science (CS1) course to test their efficacy as a feature for predicting early on which students may be at risk of failing. By including student reflection data as a feature in a temporal data model, referred to as the student sequence model, the authors were able to increase the accuracy of predicting student outcomes of pass or fail [8]. Despite the advantages of integrating student reflections into a course model, these benefits require the time-consuming process of manually reading through individual reflections and extracting common themes. For this reason, creating an automated process to assist instructors is similarly explored in [5]. Chen et al. [5] present positive results in exploring the usage of topic modeling for analyzing and extracting knowledge from student reflections. In that study, the MALLET toolkit was utilized for the topic modeling process, and the number of clusters K was manually selected. These methods of knowledge extraction are not only effective in an academic environment, but are also used in other applications such as social media mining for COVID-19-related information. Similar to the time-sensitive task of analyzing student reflections, clustering can also be used to discover new information from relevant tweets to assist in the decision-making steps that may follow [12]. For this task, Ito et al. [12] and Asgari et al. [2] implement algorithms using K-means clustering and sentence embeddings, both of which provide positive results in topic extraction. Our study is distinguished from prior work in that we collect and cluster short-text student reflections, and we conduct an educator-centered evaluation where we assess the direct impact of our approach on teaching assistants’ reading and analysis time.

6. DISCUSSION & FUTURE WORK
In our research, we implement an approach using K-means clustering and sentence transformers on student reflections to aid in reducing the labor and time required to manually analyze reflections. Our study presents promising preliminary results showing that by clustering student reflections based on semantic similarities and sorting by intracluster distance, instructors are able to decrease the time needed to extract topics from the student corpora. However, our study suffers from several limitations. Firstly, our sample size for the user study is very small (N = 4), and our results may not generalize to different classes or reflection corpora. Furthermore, teaching assistants read at different paces, so our results may not generalize to different teaching assistants. To address these limitations, we intend to conduct a user study with a significantly larger pool of participants and module reflections, and in multiple courses. In addition, we are planning to utilize fuzzy clustering [11] in a future version as well.

Reflections are fundamental for enhancing learning in classrooms [9] and provide the instructor with instant feedback on student progress. This study focuses on exploring the impact of clustering student reflections to assist instructors in reducing the time cost of analysis. In our future work, we plan to integrate our K-means clustering algorithm into a dashboard tool for instructors and conduct an expanded user study to further evaluate our approach. The dashboard will provide instructors and TAs the functionality to cluster student reflections from the LMS and be guided through the responses.
7. ADDITIONAL AUTHORS

8. REFERENCES
[1] A. Al-Doulat, N. Nur, A. Karduni, A. Benedict, E. Al-Hossami, M. L. Maher, W. Dou, M. Dorodchi, and X. Niu. Making sense of student success and risk through unsupervised machine learning and interactive storytelling. In International Conference on Artificial Intelligence in Education, pages 3–15. Springer, 2020.
[2] M. Asgari-Chenaghlu, N. Nikzad-Khasmakhi, and S. Minaee. Covid-transformer: Detecting covid-19 trending topics on twitter using universal sentence encoder.
[3] S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 632–642, Lisbon, Portugal, Sept. 2015. Association for Computational Linguistics.
[4] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020.
[5] Y. Chen, B. Yu, X. Zhang, and Y. Yu. Topic modeling for evaluating students' reflective writing: A case study of pre-service teachers' journals. In Proceedings of the Sixth International Conference on Learning Analytics & Knowledge, pages 1–5, 2016.
[6] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
[7] M. Dorodchi, E. Al-Hossami, M. Nagahisarchoghaei, R. S. Diwadkar, and A. Benedict. Teaching an undergraduate software engineering course using active learning and open source projects. In 2019 IEEE Frontiers in Education Conference (FIE), pages 1–5. IEEE, 2019.
[8] M. Dorodchi, A. Benedict, D. Desai, M. J. Mahzoon, S. Macneil, and N. Dehbozorgi. Design and implementation of an activity-based introductory computer science course (CS1) with periodic reflections validated by learning analytics. December 2018.
[9] M. Dorodchi, L. Powell, N. Dehbozorgi, and A. Benedict. Strategies to Incorporate Active Learning Practice in Introductory Courses, pages 20–37. April 2020.
[10] M. M. Dorodchi, N. Dehbozorgi, A. Benedict, E. Al-Hossami, and A. Benedict. Scaffolding a team-based active learning course to engage students: A multidimensional approach. In 2020 ASEE Virtual Annual Conference Content Access. ASEE Conferences, Virtual Online, 2020.
[11] M. Doroodchi and A. Reza. Implementation of fuzzy cluster filter for nonlinear signal and image processing. In Proceedings of IEEE 5th International Fuzzy Systems, volume 3, pages 2117–2122, 1996.
[12] H. Ito and B. Chakraborty. Social media mining with dynamic clustering: A case study by covid-19 tweets. In 2020 11th International Conference on Awareness Science and Technology (iCAST), pages 1–6. IEEE, 2020.
[13] K. S. Jones. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 1972.
[14] H. Li, W. Ding, S. Yang, and Z. Liu. Identifying at-risk K-12 students in multimodal online environments: A machine learning approach. arXiv preprint arXiv:2003.09670, 2020.
[15] Y. Liu and M. Lapata. Text summarization with pretrained encoders. arXiv preprint arXiv:1908.08345, 2019.
[16] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692, 2019.
[17] K. Mahajan, E. Al-Hossami, and S. Shaikh. TeamUNCC@LT-EDI-EACL2021: Hope speech detection using transfer learning with transformers. In Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion, pages 136–142, Kyiv, Apr. 2021. Association for Computational Linguistics.
[18] L. McInnes, J. Healy, and J. Melville. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426, 2018.
[19] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. arXiv preprint arXiv:1310.4546, 2013.
[20] N. Nur, N. Park, M. Dorodchi, W. Dou, M. J. Mahzoon, X. Niu, and M. L. Maher. Student network analysis: A novel way to predict delayed graduation in higher education. In International Conference on Artificial Intelligence in Education, pages 370–382. Springer, 2019.
[21] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[22] J. Pennington, R. Socher, and C. D. Manning. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543, 2014.
[23] N. Reimers and I. Gurevych. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Nov. 2019.
[24] P. J. Rousseeuw. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20:53–65, 1987.
[25] V. Sanh, L. Debut, J. Chaumond, and T. Wolf. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.
[26] F. Schroff, D. Kalenichenko, and J. Philbin. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815–823, 2015.
[27] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. Attention is all you need. arXiv preprint arXiv:1706.03762, 2017.
[28] Z. Zhang, J. Yang, and H. Zhao. Retrospective reader for machine reading comprehension. arXiv preprint arXiv:2001.09694, 2020.