<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Clustering Students' Short Text Reflections: A Software Engineering Course Case Study</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohsen Dorodchi</string-name>
          <email>Mohsen.Dorodchi@uncc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexandria Benedict</string-name>
          <email>abenedi4@uncc.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrew Quinn</string-name>
          <email>aquinn16@uncc.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sandra Wiktor</string-name>
          <email>swiktor@uncc.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammadali Fallahian</string-name>
          <email>mfallahi@uncc.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Erfan Al-Hossami</string-name>
          <email>ealhossa@uncc.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aileen Benedict</string-name>
          <email>abenedi3@uncc.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of North Carolina at Charlotte</institution>
          ,
          <addr-line>Charlotte, NC 28223</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of North Carolina at Charlotte</institution>
          ,
          <addr-line>Charlotte, NC 28223</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Student reflections can provide instructors with beneficial knowledge regarding students' progress in the course, what challenges they are facing, and how the instructor can more effectively address the students' needs. Reading every student reflection, however, can be a time-consuming task that may affect the instructor's ability to efficiently address student needs in a timely manner. In this research, we explore the use of clustering and sorting of student reflections to shorten reading time while maintaining a comprehensive understanding of the reflection content. We obtain student reflections from a software engineering course. Next, we generate transformer-based sentence embeddings and then cluster the reflections using K-means. Lastly, we sort the reflections based on the distance of each reflection from its cluster center. We conduct a small-scale user study with the course's teaching assistants and provide promising preliminary results showing a significant increase in reading-time efficiency without sacrificing understanding.</p>
      </abstract>
      <kwd-group>
        <kwd>Natural Language Processing</kwd>
        <kwd>Student Reflections</kwd>
        <kwd>Clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Reflections are an effective way for instructors to detect
what their students may be struggling with throughout their
courses, gain a perspective on students' impressions of course
content, and track their overall progress [
        <xref ref-type="bibr" rid="ref11">9</xref>
        ]. However, in
order to utilize these benefits to the fullest, instructors would
need to manually read through each individual reflection.
Manually analyzing reflections can be overwhelming for an
instructor, especially in large classroom settings, where timely
feedback is needed to address students' possible concerns.
Machine learning and knowledge discovery-based methods
have been used to assist educators in understanding and
helping students [
        <xref ref-type="bibr" rid="ref16 ref22 ref3">14, 1, 20</xref>
        ]. Unsupervised methods in
natural language processing (NLP), such as topic modeling, have
been used to automatically extract topics from student
reflective journals [
        <xref ref-type="bibr" rid="ref7">5</xref>
        ]. However, they fall short when it comes
to short text, typically around a sentence in length, such
as tweets. Recent research has utilized K-means
clustering along with transformer-based sentence embeddings to
automatically extract topics from tweets [
        <xref ref-type="bibr" rid="ref14 ref4">12, 2</xref>
        ]. K-means
clustering is often supplemented with a representation of the
text. Representations can include statistically learnt
representations such as term frequency-inverse document frequency
(TF-IDF) [
        <xref ref-type="bibr" rid="ref15">13</xref>
        ], neural-learnt representations also known as word
embeddings (e.g. Word2Vec [
        <xref ref-type="bibr" rid="ref21">19</xref>
        ], GloVe [
        <xref ref-type="bibr" rid="ref24">22</xref>
        ]), and, more
recently, representations computed from large pretrained
transformer deep learning models.
      </p>
      <p>
        Transformers are deep learning models following the
architecture proposed by Vaswani et al. [<xref ref-type="bibr" rid="ref29">27</xref>]. These models often
undergo unsupervised pretraining on a massive text
corpus to create an initial version of the network that is later fine-tuned
for more specific tasks, in a process called transfer learning.
Pretrained transformers such as BERT [
        <xref ref-type="bibr" rid="ref8">6</xref>
        ], RoBERTa [
        <xref ref-type="bibr" rid="ref18">16</xref>
        ],
and GPT-3 [
        <xref ref-type="bibr" rid="ref6">4</xref>
        ], have achieved state-of-the-art results in many
natural language processing tasks. Some of these tasks include
detecting positive and uplifting discussions on social media
(e.g. [
        <xref ref-type="bibr" rid="ref19">17</xref>
        ]), determining answers to questions given a passage
of text (e.g. [<xref ref-type="bibr" rid="ref30">28</xref>]), summarizing text (e.g. [
        <xref ref-type="bibr" rid="ref17">15</xref>
        ]), and
estimating semantic similarity between sentences. For this reason,
we select a transformer-based language model to create a
semantic representation of student responses.
      </p>
      <p>In this research, we implement an approach using K-means
clustering from the scikit-learn library and utilize
transformer-based sentence embeddings. We evaluate our approach in a
preliminary user study, observing the time taken for
teaching assistants to read and analyze student reflections.</p>
    </sec>
    <sec id="sec-2">
      <title>2. DATASET</title>
      <p>
        Course. The data used in our research was collected from
an undergraduate software engineering course based on the
active learning course model proposed in [
        <xref ref-type="bibr" rid="ref1 ref9">7</xref>
        ]. A total of 108
students were enrolled in the course. Modules
are organized based on the concepts being taught and
typically span approximately one week. The course
contained 11 modules in total, with the topics listed in
Table 1. Following the active learning course model presented
in Dorodchi et al. [
        <xref ref-type="bibr" rid="ref1 ref12 ref9">10, 7</xref>
        ], each module is typically divided
into multiple scaffolds: prep-work to complete before class,
including reading assignments and videos to watch; in-class
activities; post-lecture activities, including assignments and
labs; and a reflection at the end of the module. Labs are
more challenging assignments provided to students which
require hands-on coding. These lab activities are typically
divided into multiple parts. There are a total of 4 labs in
this course, with the first lab beginning in Module 2 and the
last lab being introduced in Module 8.
      </p>
      <p>Data Collection. A survey questionnaire was provided to
students within Canvas, the University's Learning
Management System (LMS), at the end of each module to allow
students to reflect on their learning and challenges. We refer to
student responses to this questionnaire as student reflections
throughout this work. The questions asked of students were:
1. On a scale of 1 to 5, with 5 being Very Active and 1
being Not Active, how engaged would you rate your
group this week?
2. What was your biggest challenge this past week? This
can include in-class activities, assignments, prep work,
studying, time management, motivation, and so on.
3. How can you address the challenge you mentioned above?
What can you do to overcome this challenge for next
time?
For the purpose of this research, we focused solely on the
students' responses to question 2, as this question was
free-response and would provide unique responses for the
clustering process.</p>
      <p>Dataset Statistics. We used two different module reflections
from the software engineering course throughout this study:
Module 7 reflections and Module 8 reflections. Table 2
showcases descriptive statistics of our collected student
reflection response corpora. The selected module reflections were
comparable in size. After preprocessing, the response rates were
94 of 108 students (87.0%) for Module 7 and 89 of 108 students
(82.4%) for Module 8. Moreover, the total word counts were
1,866 and 1,390 for the Module 7 and Module 8 reflections,
respectively. We also observe that most student reflections
contained one to two sentences in both module reflections;
indeed, most reflections in our corpus were
around a sentence in length.</p>
    </sec>
    <sec id="sec-3">
      <title>3. APPROACH</title>
      <p>Our overall approach is illustrated in Figure 1. First, we
collect data from an undergraduate course with 108 students,
as described in more detail in Section 2. Then, we
preprocess the data using natural language
processing (Section 3.1). Next, we generate sentence
embeddings (Section 3.2), cluster those embeddings (Section 3.3),
and sort the reflections based on clusters for TAs to view
(Section 3.4).</p>
    </sec>
    <sec id="sec-4">
      <title>3.1 Preprocessing</title>
      <p>
        Before we generate sentence embeddings from our
reflections dataset, we first preprocess the data by removing any
blank, or null, student responses and removing any
non-breaking spaces which appear in the text. Next, the
student responses are compiled and provided to the model
for generating sentence embeddings.
      </p>
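      <p>A minimal sketch of this preprocessing step follows; the CSV path and column name are hypothetical, since the exact format of the LMS export is not specified here.</p>
      <preformat>
# Preprocessing sketch (hypothetical file and column names).
import pandas as pd

def load_responses(path: str) -> list:
    df = pd.read_csv(path)
    # Drop blank or null student responses to question 2.
    responses = df["q2_biggest_challenge"].dropna()
    responses = responses[responses.str.strip() != ""]
    # Remove non-breaking spaces appearing in the text.
    responses = responses.str.replace("\xa0", " ", regex=False)
    return responses.tolist()

responses = load_responses("module7_reflections.csv")
      </preformat>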
    </sec>
    <sec id="sec-4b">
      <title>3.2 Sentence Embeddings</title>
      <p>
        Background. Transformer architectures can be
computationally inefficient when trying to find the most semantically
similar pair in a sizable collection of sentences. To address
this issue, sentence transformers were developed. Sentence
transformers utilize mean pooling, which computes the
average of all the word-level vectors in the input sentence.
Pooling helps sentence transformers maintain a fixed-size
vector as their output. Sentence transformers then undergo
a fine-tuning training process using the SNLI dataset [
        <xref ref-type="bibr" rid="ref5">3</xref>
        ]
containing over 570,000 annotated sentence pairs. In the
fine-tuning process, Siamese and triplet networks [
        <xref ref-type="bibr" rid="ref28">26</xref>
        ] are utilized
to compute weights so that sentence
embeddings are optimized for meaningfulness and can be
compared with cosine similarity. Working with sentence-level
representations makes tasks such as computing the semantic
similarity of two sentences easier and more efficient.
Sentence transformers reduce the computation time of finding
the most similar Quora question from over 50 hours with
standard transformer architectures to a few milliseconds [
        <xref ref-type="bibr" rid="ref25">23</xref>
        ].
Furthermore, sentence transformers outperform regular
transformers on several semantic textual similarity tasks [
        <xref ref-type="bibr" rid="ref25">23</xref>
        ].
Approach. We use the sentence-transformers package [
        <xref ref-type="bibr" rid="ref25">23</xref>
        ].
We specifically select the DistilRoBERTa-base-cased
model to compute our sentence embeddings.
DistilRoBERTa-base-cased is a RoBERTa transformer model [
        <xref ref-type="bibr" rid="ref18">16</xref>
        ],
distilled using the method of [
        <xref ref-type="bibr" rid="ref27">25</xref>
        ]. The dimension of the embeddings is 768.
In the embedding process, we take each student response,
which is typically a sentence in length, and convert it into a
vector of 768 floats representing the sentence. These
embeddings are then used to cluster the reflections, as described in
the next subsection.
      </p>
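      <p>The embedding step can be sketched with the sentence-transformers package [23] as follows. The checkpoint name is an assumption, since the exact model identifier is not given above; any DistilRoBERTa-based sentence-transformers model yields 768-dimensional embeddings.</p>
      <preformat>
# Embedding sketch; "all-distilroberta-v1" is an assumed checkpoint name.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-distilroberta-v1")
# `responses` is the list of preprocessed student responses.
embeddings = model.encode(responses)  # array of shape (n_responses, 768)
      </preformat>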
    </sec>
    <sec id="sec-5">
      <title>3.3 Clustering</title>
      <p>
        Our earlier step yields a set of embedded student responses:
one set for the Module 7 reflections and another for the
Module 8 reflections. For each set of embedded student responses
from our earlier step, we use K-means clustering from the
scikit-learn machine learning library [
        <xref ref-type="bibr" rid="ref23">21</xref>
        ]. We compute the
cluster center of each cluster using the embedded student
responses; hence, cluster centers are represented by an
embedding vector of the same shape. We also assign each response
to a cluster based on the nearest cluster center.
      </p>
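      <p>A minimal clustering sketch with scikit-learn [21] follows; k = 4 corresponds to the Module 7 reflections (see Section 4.1).</p>
      <preformat>
# K-means clustering sketch over the sentence embeddings.
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=4, random_state=0)
labels = kmeans.fit_predict(embeddings)  # cluster assignment per response
centers = kmeans.cluster_centers_        # one 768-dimensional center per cluster
      </preformat>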
    </sec>
    <sec id="sec-6">
      <title>3.4 Sorting of Student Reflections</title>
      <p>After each student reflection is assigned a cluster, the
reflections undergo a sorting process. The goal of the sorting
process is to order reflections from most similar to least
similar to assist in the reading process. Cluster distances
were calculated using the scikit-learn library fit_transform
function, which computes and transforms the sentence
embeddings to cluster-distance space. This function uses the
Euclidean distance formula for calculating the distance
between a student reflection response r and its assigned cluster
center r_c, as follows:
$\mathrm{distance}(e(r), r_c) = \sqrt{e(r) \cdot e(r) - 2\, e(r) \cdot r_c + r_c \cdot r_c}$
where e(r) represents a student response r embedded
using sentence-transformers into a vector of 768 elements, and r_c
represents the computed cluster center assigned to r.
After computation, we sort the reflections using the assigned
cluster number to group reflections within the same
cluster together. Lastly, we sort the reflections within the same
cluster by ascending distance, i.e., in descending order of
similarity. This way, reflections are sorted from most semantically
similar to the cluster center to least semantically similar to the
cluster center. Next, we describe our user study setup and
evaluate how well this approach assists in the reading
process.</p>
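      <p>The sorting step can be sketched as follows: fit_transform maps each embedding into cluster-distance space, after which responses are grouped by cluster and ordered by distance to their assigned center. Variable names are illustrative.</p>
      <preformat>
# Sorting sketch: group by cluster, then order by distance to the center.
import numpy as np
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=4, random_state=0)
distances = kmeans.fit_transform(embeddings)  # shape (n_responses, k)
labels = kmeans.labels_
dist_to_center = distances[np.arange(len(labels)), labels]

# Primary key: cluster label; secondary key: ascending distance,
# i.e., responses most similar to their center come first.
order = np.lexsort((dist_to_center, labels))
sorted_responses = [responses[i] for i in order]
      </preformat>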
    </sec>
    <sec id="sec-7">
      <title>4. RESULTS</title>
    </sec>
    <sec id="sec-8">
      <title>4.1 Experimental Setup</title>
      <p>In order to measure the efficacy of clustering in the
knowledge extraction process, we developed a user study which
compares the time efficiency of reading through and
extracting topics from student reflections in two formats:
1. Unsorted student reflections exported directly from the</p>
      <p>LMS.
2. Student reflections sorted based on cluster
distances.</p>
      <p>
        The number of clusters was determined using the Silhouette
method [
        <xref ref-type="bibr" rid="ref26">24</xref>
        ] for finding the optimal number of clusters.
Using the Silhouette method, we generate 4 clusters for module
reflection 7 and 8 clusters for module reflection 8.
First, the method of the user study will be described, and
then a summary of the results. Our hypothesis when
conducting this study was that clustering can help reduce
cognitive load and increase the effectiveness and efficiency of
knowledge extraction.
      </p>
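      <p>A sketch of this Silhouette-based selection of k [24] follows; the candidate range of k values is an assumption.</p>
      <preformat>
# Choose the k that maximizes the mean silhouette coefficient.
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_k(embeddings, candidates=range(2, 11)):
    scores = {}
    for k in candidates:
        labels = KMeans(n_clusters=k, random_state=0).fit_predict(embeddings)
        scores[k] = silhouette_score(embeddings, labels)
    return max(scores, key=scores.get)
      </preformat>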
      <p>In this user study, four teaching assistants (TAs) were selected to
read through the student reflections of a software
engineering course. The Module 7 and 8 reflections were chosen as
the corpora to extract knowledge from, as the TAs had not
yet read these in particular.</p>
      <p>Each TA was assigned a reflection and a format. For
example, TA 1 would read and extract topics from Reflection 7
unsorted, TA 2 would read and extract topics from
Reflection 7 clustered/sorted, and so on, as illustrated in Table
3. The TAs assigned the clustered/sorted
format individually ran the K-means clustering
algorithm first, without reading any responses, before beginning
the process.
The free-response question used in particular for this study
was:
"What was your biggest challenge this past week?
This can include in-class activities, assignments,
prep work, studying, time management,
motivation, and so on."
Each TA individually read through each student's reflection
response for this question, extracted any new topics
mentioned in the student response, and timed themselves
for the duration of the process. Once all TAs had
finished, they met to discuss what topics
they found, and compared times and results.</p>
    </sec>
    <sec id="sec-9">
      <title>4.2 Evaluation</title>
      <p>After comparing the results of this study, we find that
providing instructors with student reflections in a clustered and
sorted format decreases the time needed for knowledge extraction
while maintaining the accuracy of identifying
topics. Reflection 7, with a total of 94 student responses, took
90 minutes to completely read through and extract topics
from in the unsorted format, while requiring only 15 minutes
in the sorted and clustered format. Reflection 8 showed
similar results in which efficiency increased, with a total of 89
responses taking approximately 121.4 minutes in the
unsorted format and 20.9 minutes on the clustered and sorted
responses. It is important to note that the TA
extracting knowledge from Reflection 8 unsorted did not complete
within a 90-minute time frame, so their results were
normalized based on how many reflections they did complete.</p>
      <p>These results are provided in Table 4.</p>
      <p>In addition to the increased efficiency of knowledge
extraction with a clustered and sorted format, the topics extracted
remained consistent, with a slight improvement in
comparison to the unsorted format. Following the portion of the
user study which required TAs to individually extract
topics from the reflections, they met afterwards to discuss
the similarities and differences in their topics. The TAs who
analyzed Reflection 7 extracted the same topics from the
student responses with no differences. During the discussion,
the Reflection 7 TAs took turns sharing the topics they had
extracted during the user study, and concluded that they
were in 100% agreement on the topics coded. Reflection 8,
however, had one topic which was extracted from the clustered
and sorted reflections but not from the unclustered/unsorted
reflections. The TAs assigned to Reflection 8 noted that
this was most likely due to a lack of time to completely
analyze all unsorted student reflections, demonstrating how
time efficiency can also benefit the
accuracy of knowledge extraction under a time constraint.</p>
      <p>Despite the improved time efficiency of the clustered and
sorted reflection format, no topics were missed.</p>
      <p>
        We utilize the dimension reduction algorithm UMAP [
        <xref ref-type="bibr" rid="ref20">18</xref>
        ]
to visualize the resulting clusters of student reflections, as
shown in Figure 2. The student reflections for Module 7
resulted in 4 clusters with 4 major topics: managing
workload, motivation and time management, lab work,
and group work. The Module 8 student reflections resulted
in 8 clusters, with each cluster containing a challenge in at
least one of the following categories: lab work, time
management, studying, motivation, group work, and, for some
reflections, no challenges whatsoever. Managing
workload, motivation, studying, and time management relate to
the student's own perceived ability to handle the
coursework in general. Lab work and group work were challenges
in which students related their troubles more specifically to
difficult topics being covered, confusion about instructions,
or trouble communicating within their groups to
complete activities. Students in the "no
challenges" category noted that they did not have any difficulties or
confusion during the span of that module. As displayed in
these scatter plots and the major topics described, there are
overlaps among several of the clusters. This overlap is
created by the similarities in the students' wordings. For
example, two student responses within the "Managing Workload"
cluster of the Module 7 reflection were:
1. "My biggest challenge has been not procrastinating my
work."
2. "The biggest challenge this week was working with the
dash and the dashboard framework."
[Figure 2: (a) Module 7 reflection clusters based on question 2: student challenges. (b) Module 8 reflection clusters based on question 2: student challenges.]
The first student response was the cluster center, with a
distance of 3.12, and the second student response was one of
the farthest points from the cluster center, with a distance
of 7.05. Therefore, clusters still maintain semantic
similarity to many of the responses with smaller intra-cluster
distances, but contain outliers due to the overlap caused by
similar word usages.</p>
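      <p>A sketch of this visualization step with the umap-learn package [18] follows; the plotting details are illustrative rather than those used to produce Figure 2.</p>
      <preformat>
# Project the 768-d embeddings to 2-d and color points by cluster label.
import matplotlib.pyplot as plt
import umap

points = umap.UMAP(n_components=2, random_state=0).fit_transform(embeddings)
plt.scatter(points[:, 0], points[:, 1], c=labels, cmap="tab10", s=12)
plt.title("Student reflection clusters (UMAP projection)")
plt.show()
      </preformat>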
    </sec>
    <sec id="sec-10">
      <title>5. RELATED WORK</title>
      <p>
        Reflections are a necessary component in active learning
courses, as they allow the instructor to track students'
impressions of the course, activities, and social learning aspects
[
        <xref ref-type="bibr" rid="ref11">9</xref>
        ]. In Dorodchi et al. [
        <xref ref-type="bibr" rid="ref10 ref2">8</xref>
        ], student reflections are used in an
introductory computer science (CS1) course to test their
efficacy as a feature to predict early on which students may be
at risk of failing. By including student reflection data as a
feature in a temporal data model, referred to as the student
sequence model, the authors were able to increase the
accuracy of predicting student outcomes of pass or fail [
        <xref ref-type="bibr" rid="ref10 ref2">8</xref>
        ].
Despite the advantages of integrating student reflections into a
course model, these benefits require the time-consuming
process of manually reading through individual reflections and
extracting common themes. For this reason, creating an
automated process to assist instructors is similarly explored in
[
        <xref ref-type="bibr" rid="ref7">5</xref>
        ]. Chen et al. [
        <xref ref-type="bibr" rid="ref7">5</xref>
        ] present positive results in exploring the
usage of topic modeling for analyzing and extracting
knowledge from student reflections. In that particular study, the
MALLET toolkit was utilized for the topic modeling
process, and the number of clusters K was manually selected.
These methods of knowledge extraction are not only effective
in an academic environment, but are also used in other
applications, such as social media mining for COVID-19-related
information. Similar to the time-sensitive task
of analyzing student reflections, clustering can also be used
to discover new information from relevant tweets to assist
in the decision-making steps that may follow [
        <xref ref-type="bibr" rid="ref14">12</xref>
        ]. For this
task, Ito et al. [
        <xref ref-type="bibr" rid="ref14">12</xref>
        ] and Asgari et al. [
        <xref ref-type="bibr" rid="ref4">2</xref>
        ] implement
algorithms using K-means clustering and sentence embeddings,
both of which provide positive results in topic extraction. Our
study is distinguished from prior works in that we collect
and cluster short-text student reflections, and we conduct
an educator-centered evaluation where we assess the direct
impact of our approach on teaching assistants' reading and
analysis time.
      </p>
    </sec>
    <sec id="sec-11">
      <title>6. DISCUSSION &amp; FUTURE WORK</title>
      <p>
        In our research, we implement an approach using K-means
clustering and sentence transformers on student reflections
to reduce the labor and time consumption of
manually analyzing reflections. Our study presents promising
preliminary results showing that by clustering student
reflections based on semantic similarities and sorting by
intra-cluster distance, instructors are able to decrease the time
needed to extract topics from the student corpora.
However, our study suffers from several limitations. Firstly, our
sample size for the user study is very small (N = 4), and
our results may not generalize to different classes or
reflection corpora. Furthermore, teaching assistants read at
different paces, so our results may not generalize to different
teaching assistants. To address these limitations, we intend
to conduct a user study with a significantly larger pool of
participants and module reflections, across multiple courses. In
addition, we plan to utilize fuzzy clustering [
        <xref ref-type="bibr" rid="ref13">11</xref>
        ] in
a future version as well.
      </p>
      <p>
        Reflections are fundamental for enhancing learning in
classrooms [
        <xref ref-type="bibr" rid="ref11">9</xref>
        ], and provide the instructor with instant feedback
on student progress. This study focuses on exploring the
impact of clustering on student reflections to assist
instructors in reducing the time costs of analysis. In our future work,
we plan to integrate our K-means clustering algorithm into
a dashboard tool for instructors and conduct an expanded
user study to further evaluate our approach. The dashboard
will provide instructors and TAs the functionality to cluster
student reflections from the LMS and be guided through the
responses.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>7. ADDITIONAL AUTHORS</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>8. REFERENCES</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Al-Doulat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Karduni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Benedict</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Al-Hossami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Maher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Dou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dorodchi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Niu</surname>
          </string-name>
          .
          <article-title>Making sense of student success and risk through unsupervised machine learning and interactive storytelling</article-title>
          .
          <source>In International Conference on Artificial Intelligence in Education</source>
          , pages
          <fpage>3</fpage>
          -
          <lpage>15</lpage>
          . Springer,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Asgari-Chenaghlu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nikzad-Khasmakhi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Minaee</surname>
          </string-name>
          .
          <article-title>Covid-transformer: Detecting covid-19 trending topics on twitter using universal sentence encoder</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Bowman</surname>
          </string-name>
          , G. Angeli,
          <string-name>
            <given-names>C.</given-names>
            <surname>Potts</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>A large annotated corpus for learning natural language inference</article-title>
          .
          <source>In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>632</fpage>
          -
          <lpage>642</lpage>
          , Lisbon, Portugal, Sept.
          <year>2015</year>
          .
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [4]
          <string-name>
            <surname>T. B. Brown</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Mann</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Ryder</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Subbiah</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Kaplan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Dhariwal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Neelakantan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Shyam</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Sastry</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Askell</surname>
          </string-name>
          , et al.
          <article-title>Language models are few-shot learners</article-title>
          .
          <source>arXiv preprint arXiv:2005.14165</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          .
          <article-title>Topic modeling for evaluating students' reflective writing: A case study of pre-service teachers' journals</article-title>
          .
          <source>In Proceedings of the sixth international conference on learning analytics &amp; knowledge, pages 1-5</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          . Bert:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          .
          <source>arXiv preprint arXiv:1810.04805</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dorodchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Al-Hossami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nagahisarchoghaei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Diwadkar</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Benedict</surname>
          </string-name>
          .
          <article-title>Teaching an undergraduate software engineering course using active learning and open source projects</article-title>
          .
          <source>In 2019 IEEE Frontiers in Education Conference (FIE)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          . IEEE,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dorodchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Benedict</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Desai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Mahzoon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Macneil</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Dehbozorgi</surname>
          </string-name>
          .
          <article-title>Design and implementation of an activity-based introductory computer science course (cs1) with periodic re ections validated by learning analytics</article-title>
          .
          <source>12</source>
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dorodchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Powell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Dehbozorgi</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Benedict</surname>
          </string-name>
          . Strategies to Incorporate
          <source>Active Learning Practice in Introductory Courses</source>
          , pages
          <fpage>20</fpage>
          -
          <lpage>37</lpage>
          . 04
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [10]
          <string-name>
            <surname>M. M. Dorodchi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Dehbozorgi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Benedict</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <string-name>
            <surname>Al-Hossami</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Benedict</surname>
          </string-name>
          .
          <article-title>Scaffolding a team-based active learning course to engage students: A multidimensional approach</article-title>
          .
          <source>In 2020 ASEE Virtual Annual Conference Content Access. ASEE Conferences</source>
          , Virtual On line,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Doroodchi</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Reza</surname>
          </string-name>
          .
          <article-title>Implementation of fuzzy cluster filter for nonlinear signal and image processing</article-title>
          .
          <source>In Proceedings of IEEE 5th International Fuzzy Systems</source>
          , volume
          <volume>3</volume>
          , pages
          <fpage>2117</fpage>
          -
          <lpage>2122</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Ito</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          .
          <article-title>Social media mining with dynamic clustering: A case study by covid-19 tweets</article-title>
          .
          <source>In 2020 11th International Conference on Awareness Science and Technology (iCAST)</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . IEEE,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Jones</surname>
          </string-name>
          .
          <article-title>A statistical interpretation of term specificity and its application in retrieval</article-title>
          .
          <source>Journal of documentation</source>
          ,
          <year>1972</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Identifying at-risk k-12 students in multimodal online environments: A machine learning approach</article-title>
          . arXiv preprint arXiv:
          <year>2003</year>
          .09670,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Lapata</surname>
          </string-name>
          .
          <article-title>Text summarization with pretrained encoders</article-title>
          .
          <source>arXiv preprint arXiv:1908.08345</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          .
          <article-title>Roberta: A robustly optimized bert pretraining approach</article-title>
          . arXiv preprint arXiv:
          <year>1907</year>
          .11692,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>K.</given-names>
            <surname>Mahajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Al-Hossami</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Shaikh</surname>
          </string-name>
          . TeamUNCC@LT-EDI-EACL2021
          :
          <article-title>Hope speech detection using transfer learning with transformers</article-title>
          .
          <source>In Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion</source>
          , pages
          <fpage>136</fpage>
          -
          <lpage>142</lpage>
          , Kyiv
          , Apr.
          <year>2021</year>
          .
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>L.</given-names>
            <surname>McInnes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Healy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Melville</surname>
          </string-name>
          . Umap:
          <article-title>Uniform manifold approximation and projection for dimension reduction</article-title>
          .
          <source>arXiv preprint arXiv:1802.03426</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          , G. Corrado, and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>arXiv preprint arXiv:1310.4546</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>N.</given-names>
            <surname>Nur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dorodchi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Dou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Mahzoon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Niu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Maher</surname>
          </string-name>
          .
          <article-title>Student network analysis: a novel way to predict delayed graduation in higher education</article-title>
          .
          <source>In International Conference on Artificial Intelligence in Education</source>
          , pages
          <fpage>370</fpage>
          -
          <lpage>382</lpage>
          . Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Duchesnay</surname>
          </string-name>
          .
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>12</volume>
          :
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          . Glove:
          <article-title>Global vectors for word representation</article-title>
          .
          <source>In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)</source>
          , pages
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          .
          <article-title>Sentence-bert: Sentence embeddings using siamese bert-networks</article-title>
          .
          <source>In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics</source>
          ,
          11
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Rousseeuw</surname>
          </string-name>
          .
          <article-title>Silhouettes: A graphical aid to the interpretation and validation of cluster analysis</article-title>
          .
          <source>Journal of Computational and Applied Mathematics</source>
          ,
          <volume>20</volume>
          :
          <fpage>53</fpage>
          -
          <lpage>65</lpage>
          ,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sanh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Debut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaumond</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolf</surname>
          </string-name>
          .
          <article-title>Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter</article-title>
          . arXiv preprint arXiv:
          <year>1910</year>
          .01108,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>F.</given-names>
            <surname>Schroff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kalenichenko</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Philbin</surname>
          </string-name>
          .
          <article-title>Facenet: A unified embedding for face recognition and clustering</article-title>
          .
          <source>In Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          , pages
          <fpage>815</fpage>
          -
          <lpage>823</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          .
          <article-title>Attention is all you need</article-title>
          .
          <source>arXiv preprint arXiv:1706.03762</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <article-title>Retrospective reader for machine reading comprehension</article-title>
          .
          <source>arXiv preprint arXiv:2001.09694</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>