=Paper= {{Paper |id=Vol-1584/paper22 |storemode=property |title=Student Understanding and Engagement in a Class Employing COMPS Computer Mediated Problem Solving: A First Look |pdfUrl=https://ceur-ws.org/Vol-1584/paper22.pdf |volume=Vol-1584 |authors=Jung Hee Kim,Michael Glass,Taehee Kim,Kelvin Bryant,Angelica Willis,Ebonie McNeil,Zachery Thomas |dblpUrl=https://dblp.org/rec/conf/maics/KimGKBWMT16 }} ==Student Understanding and Engagement in a Class Employing COMPS Computer Mediated Problem Solving: A First Look== https://ceur-ws.org/Vol-1584/paper22.pdf

Jung Hee Kim et al. MAICS 2016 pp. 69–74

Student Understanding and Engagement in a Class Employing COMPS Computer Mediated Problem Solving: A First Look

Jung Hee Kim, North Carolina A&T State U., jungkim@ncat.edu
Michael Glass, Valparaiso U., michael.glass@valpo.edu
Taehee Kim, North Carolina A&T, tkim@ncat.edu
Kelvin Bryant, North Carolina A&T, ksbryant@ncat.edu
Angelica Willis, North Carolina A&T, awillis@aggies.ncat.edu
Ebonie McNeil, North Carolina A&T, eimcneil@aggies.ncat.edu
Zachery Thomas, North Carolina A&T, zithomas@aggies.ncat.edu

Copyright retained by the authors.

Abstract

COMPS computer-mediated group discussion exercises are being added to a second-semester computer programming class. The class is a gateway for computer science and computer engineering students, where many students have difficulty succeeding well enough to proceed in their major. This paper reports on first results of surveys on student experience with the exercises. It also reports on the affective states observed in the discussions that are candidates for analysis of group functioning. As a step toward computer monitoring of the discussions, an experiment in using dialogue features to identify the gender of the participants is described.

Introduction

The second Java programming class, GEEN 165, at North Carolina A&T State University is a bottleneck for many Computer Science and Computer Engineering students. As an experiment in improving student learning and interest, COMPS computer-mediated discussion exercises (Glass et al., 2014a) have been introduced. This paper reports on first measurements of a) student self-efficacy and interest, and b) expressions of affect within the discussions. As a test of our ability to have the computer monitor the conversation, the expressions of affect were applied toward the task of using dialogue features to identify the gender of the participant.

GEEN 165 corresponds to the CS2 (second semester) class in the ACM/IEEE curriculum (ACM/IEEE, 2013). The historical success rate for students attempting GEEN 165 is low. From 2003 to 2012, comprising about 1000 student-semesters, approximately 66% of students succeeded well enough (grade C or better) on the first attempt to continue to the next class. The fact that so many students have difficulty makes it potentially a fertile class for experimenting with educational innovation.

Lab-based computer programming classes traditionally permit unstructured group interaction. Students can talk to each other even though they are required to write their own software. Therefore problem-solving discussions, where students respond to each other in normal dialogue fashion, are a natural addition to the lab component of a computer programming class.

NC A&T has migrated to an objects-later curriculum, meaning that CS2 contains more object concepts than the first semester CS1 class. The student exercises in this intervention are thus oriented toward object concepts.

Expressions of affect have three potential uses for this project. One is that they are indications of emotional states that may affect student enthusiasm, self-efficacy and satisfaction. Another is that they will be used in studies of group interaction. Finally, they may be detectable by machine, contributing to an instructor's dashboard or other assessment of how well the group discussions are working.

Background

COMPS Dialogue Platform and Exercises

COMPS is a web-delivered computer-mediated chat environment (Kim et al., 2013). It permits the instructor (or a TA) to monitor each conversation. The dialogue data from this study comes from log files. Attesting to the interactivity of the COMPS experience, about half of all typing occurs while several students are typing. Even three students at a time can be typing and responding to each other, all contributing to the same discussion, since they can see each other's keystrokes in real time. In spoken conversation productive dialogue does not happen when three people are talking at once, but we have shown that in the chat domain it indeed occurs (Glass et al., 2015).

The exercises in this project involve students solving multiple-choice questions. When implementing these as group collaborations, we pay attention to three principles that promote successful collaborative learning: a structure or activity script for the students to follow, creative interdependence, and individual accountability (Eberly Center, 2016). The activity is structured as follows. The students are instructed to come to consensus on the answer, then have one student approach the instructor or a TA to verify the answer. That student is responsible for bringing
the correct answer (or a hint) back to the group, and they must reach consensus again. Creative interdependence means that students should need each other to complete the exercise; it should not be reasonable for one or several students to race ahead and finish it and leave the others behind or let them not participate. During the discussion the obligations of discourse require that students explain themselves in the course of reaching consensus. Having conceptual knowledge as the learning goal promotes explanatory dialogue. We have examples where seemingly the weakest student serves as a metacognitive regulator, challenging or directing every reasoning step and becoming a participant in all dialogue exchanges as the other students seem to teach that weakest one (Glass et al., 2013). Individual accountability typically occurs after the group exercise, where the students have a quiz or an exercise utilizing what they have learned. Individual accountability also occurs within the discussion, as the students find themselves responsible for explaining their positions in order to reach consensus.

Addressing Student Learning

Our collaborative inquiry learning exercises are in line with current practices in Computer Supported Collaborative Learning. A key concept is group cognition, where different participants in a conversation contribute different parts of the epistemic knowledge construction task. The Virtual Math Teams project, where students solve math problems through computer-mediated chat, has documented this phenomenon (Stahl, 2009). Learning through group cognition is justified both in terms of learning outcomes and student motivation. There is also research specifically showing that collaborative activity is a desirable pedagogical approach for “relational understanding” or understanding of concepts (Tchounikine et al., 2010). Dialogue that engages in domain reasoning, such as explaining, negotiating, or inferring, is observed in these kinds of exercises (Zhou, 2009; Stahl, 2004).

The implication for COMPS technology is that monitoring the health of student conversations could be informed by a) whether students are talking to each other, and b) whether they are engaging in reasoning activities.

Addressing Student Interest and Self-Efficacy

Group exercises address many of the components of student interest. Interest refers to an individual’s psychological inclination to participate in particular content over time (Hidi & Renninger, 2006). There is a relationship between interest, achievement goals, performance and retention (Harackiewicz et al., 2008). Interest plays a critical role in students’ further decisions on engaging and reengaging in the major (Brown, 2012). The four-phase model of interest posits four sequential interest phases: triggered situational interest (“catching”), maintained situational interest (“holding”), emerging individual interest, and well-developed individual interest. Mitchell (1993), as an example, reported that group work activities, computer-based activities, engaging puzzles, and meaningful activities were correlated with triggering and holding interest in a mathematics classroom.

Recently Kim and Schallert (2014) have investigated the mediating effect interpersonal interactions have on student interest. It is possible to track student interest in four developmental phases throughout a semester, not just within the time frame of individual activities. It is affected not only by the enthusiasm expressed by the teacher and fellow students, but also by factors such as affiliative motivations: the desire to belong to the group. The social factors enhancing interest were found within college classes in a number of diverse disciplines (e.g. history, chemistry, religion) in both upper and lower level college classes.

Viewed in this light, group exercises should address student motivation issues through social interaction at the same time as they address learning of concepts through group cognition. The exercises are constructed so that students engage with other students, providing the small-group interpersonal contact that best transmits enthusiasm. The students know the teacher is watching the conversations and is taking an active interest in the students' progress, sometimes by intervening and sometimes by providing answers and hints.

The implication for COMPS technology is that monitoring the health of student conversations could be informed by expressions of student affect. Affect, the observable manifestation of emotion, mediates social interaction and is related to student interest.

Self-efficacy, an individual’s belief in being capable of performing a particular task (Bandura, 1997), has been widely studied because of its relationship to performance including academic achievement (Choi, 2005; Pajares and Miller, 1995; Wood and Locke, 1987) and even choice of major in college (Hackett, 1985). In accordance with the suggestions of Finney and Schraw (2003), we measured self-efficacy using task-specific survey items rather than generalized questions. This project measures students’ self-efficacy both at the level of the skills in individual assignments at the time of the COMPS exercises and overall in the topics of the class at the beginning and end of the semester.

Data and Methods

We have collected data from one semester of the GEEN 165 class. There were 55 students at the start of the semester and 47 at the end. We administered COMPS exercises four times during the semester, with 53 group discussions in total. Most groups had 3 or 4 participants. The bulk of students were assigned to sessions quasi-randomly as students arrived in lab. Cliques of friends, who tended to arrive together, were split into different random groups. We deviated from this protocol by creating a few all-female groups, for comparison with the all-male groups. Altogether there were about 8000 dialogue turns. Students were surveyed near the beginning and end of the semester regarding their enthusiasm for the class, their self-efficacy in programming, and their desire to continue. Every COMPS exercise was also accompanied by a survey of the student experiences.

Transcript Processing and Annotation

Table 1 contains an extract from a COMPS discussion. From COMPS log files we extract dialogue turns in spreadsheet format for processing. The text from one dialogue turn is in one line of the spreadsheet. In addition to the metadata such as problem number, turn number, and time stamp, each dialogue turn is tagged with features. Some are derived by software and some are annotated by hand. These features are available for machine learning experiments and for human analysis and study of dialogues. The machine-derived classifiers are available for feeding software that will monitor the health of the conversation.

Some of the existing machine-derived features (Glass et al., 2014b) that have been relevant to transcript studies and machine monitoring of the health of the conversation are:
• The presence of discourse marker words, e.g. “now” or “therefore” near the beginning of a dialogue turn. These are linguistically associated with reasoning, and are therefore possibly indicative of productive discussion.
• The presence of pronouns that include another participant in the dialogue: “you,” “we,” “us.” These are possibly indicative of transactive discussion.
• The presence of question marks.
• The presence of emoticons. It is possible that emoticons are associated with students attending to each other's affect.
• The length of a turn in words.
• Whether typing this turn overlapped with other people typing.

Affective States Evinced in Dialogue

Of particular interest are six affective states that we have chosen as initial targets. These are annotated by hand. They were chosen because they may be salient for monitoring both the learning aspects (whether the students are reasoning together) and the social health of the conversation. We show here some of the definitions that the coders have applied for consistency in recognizing and coding.
• Excited.
• Apologetic. Refers to a user expressing regret for previous action. This type of message is usually aimed towards another user or towards the group as a whole.
• Humor.
• Frustrated.
• Confused. User explicitly expressing confusion, or exhibiting confusion e.g. through questions.
• Sad. A negative emotion determined by keywords and sad emoticons that are usually directed at self.

Some of these affective states have been tagged and illustrated in the Table 1 dialogue. Table 2 indicates some of the textual indications for the various states. These are being used by the coders at present, but will become machine-derived features for the purpose of machine-annotating the affective states.

Surveys

The survey administered to all students at the beginning and end of the semester has an interest part and a self-efficacy part. The end-of-semester survey also inquires about student plans for continuing in the major and registering for the next programming class. All items use a 6-point scale. The interest survey items are derived from a survey from Harackiewicz et al. (2008). One of the authors of this paper has utilized these items to assess how much a student's interest in a class is affected by the enthusiasm of fellow students (Kim and Schallert, 2014). Some representative items are “What we are learning in GEEN165 this year can be applied to real life” and “To be honest, I don’t find what we do in the GEEN165 class interesting.” The self-efficacy items inquire about student confidence in completing 13 tasks corresponding to class topics. This list was obtained from the instructor. A typical item is “Design inner classes that implement event handling interfaces.”

The after-COMPS-lab survey had items covering student perceptions in three areas: student interest, whether the student learned from the lab, and how well the group exercise functioned. An example item is “I contributed to the understanding of other students in my group.”

Results

Survey Results

Table 3 shows the students' perceptions of interest and efficacy at the beginning and end of the semester. All interest items were combined into one mean, and likewise all efficacy items. In total 28 students participated in both pre- and post-surveys.
• Regarding students’ interest toward the course, their interest did not change. The averages of students’ interest toward the course at the beginning of the semester and end of the semester were 4.33 and 4.32 respectively.
• Self-efficacy with respect to the course content indicated significant improvement between the beginning and the end of the semester, rising from 2.83 to 3.81.

The increase in self-efficacy was significant, p < 0.01.

Table 4 shows students' perception of the COMPS labs, surveyed immediately after each lab. There seemed to be a clear improvement between the first part of the semester (Labs 1 and 2) and the later part (Labs 3 and 4). Students perceived:
• more effective group work in the second part (means rose from about 3.1 to about 3.4)
• better understanding of concepts in the second half (means rose from about 3.4 to about 3.9).

Multiple one-way ANOVAs support the hypothesis that mean scores are indeed different, p = 0.03 for both effectiveness and understanding. Post hoc analyses using the Tukey test for significance indicated that the mean scores of Lab 3 were significantly higher than Lab 2 for both effectiveness and understanding.

However, students’ interest in each exercise in the lab sessions seemed to fluctuate throughout the semester. Lab 3 had the highest interest, which corresponded with the highest effectiveness and understanding. But interest in Lab 4 was approximately the same as in Labs 1 and 2.

Affective States by Gender

We annotated the 14 group discussions of one COMPS exercise, comprising 2147 dialogue turns, for the six affective features. In total 199 turns showed evidence of one or more features, or 9.3%.

As a first test of the utility of these features along with the machine-generated ones, we tried to use them to predict the gender of the participant. Among 49 students we had 16 women and 33 men. First we aggregated all the turns from each student, and looked at statistical differences between the two populations. Two-tailed t-tests revealed that none of the features were significantly different between the genders at the p < 0.05 level. However, expressions of apology were different at the p = 0.06 level. The most common affective feature was confusion, with 62 instances of utterances expressing confusion. Women expressed confusion in 4.6% of turns, and men in 2.2%. This suggests the two genders behave differently, but the p < 0.22 level does not show significance. The two genders also showed differences in the amount of participation. Men each uttered an average of 46 turns per dialogue and women 36 turns.

We then trained a J48 decision tree classifier and a multiple-regression linear classifier using the Weka data mining tool (Witten and Frank, 2005). The task is to classify each dialogue turn with the gender of the speaker, to mimic the task of monitoring a conversation in real time turn-by-turn. These classifiers have not been successful. The same features that are statistically correlated with gender are discovered by the decision trees, but accuracy has been quite low.

Discussion and Future Work

Survey Results

The students' experience of the COMPS exercises improved during the semester. They reported that the groups worked better in the last two exercises and that they learned more. It is not clear why student interest was lower in the last lab. Anecdotally there are two reasons that have been suggested by the instructor and lab TAs who supervised this session. One is that the last lab was optional, presented during Thanksgiving week. That fewer students attended could indicate that the general level of engagement was lower than usual. The other is that perhaps the novelty was wearing off. Some students expressed as much during the session. We will need to find some way to survey the reasons for student interest.

The pre- and post-semester survey is hard to interpret because of low participation rate and dropouts. In the next semester we are enforcing better participation. The increase in self-efficacy was striking, but we do not yet have any comparison with other classes. Future work includes comparing interest and self-efficacy with learning gains on the pre- and post-tests. Future work also includes comparing pre- and post-semester survey results with individual lab surveys, to see whether there are correlations between overall student interest and the situational interest in individual COMPS exercises.

Another analysis in the future will be between the participants of the same group: do they agree about learning and group functioning, and do they have similar learning gains?

Affective States in Dialogue

Hand-annotating the remaining 6000 turns of dialogue may result in more reliable statistical correlations. We are also at work toward machine-annotation of these features.

Annotation of the affective states so far has relied entirely on the text of the dialogues. Future work will include extra-linguistic features. In COMPS group exercises in other classes evidence of student engagement sometimes presents through Comic Sans typeface, big or bold fonts, and wild colors. We are also exploring using timing features from the overlapped typing. Students can all type simultaneously while seeing each other's developing chat text (Glass et al., 2015). We think that typing speed, degree of simultaneous typing, and pauses as they look at each other's turns, may provide indications of
affective states such as being excited or indications of when they are attending to each other's utterances.

Acknowledgements

Partial support for this work was provided by the National Science Foundation's Improving Undergraduate STEM Education (IUSE) program under Award No. 1504917. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

References

ACM/IEEE. Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science. New York: ACM, 2013.

Bandura, A. (1997). Self-efficacy: the exercise of control. New York: Freeman.

Brown, J. (2012). Developing a freshman orientation survey to improve student retention within a college. College Student Journal, 46, pp. 83–851.

Choi, N. (2005). Self-efficacy and self-concept as predictors of college students’ academic performance. Psychology in the Schools, 42, pp. 197–204.

Eberly Center (2016). Carnegie Mellon, Eberly Center for Teaching Excellence and Innovation. What are best practices for designing group projects? http://www.cmu.edu/teaching/designteach/design/instructionalstrategies/groupprojects/design.html. Retrieved March, 2016.

Finney, S. and G. Schraw (2003). Self-efficacy beliefs in college statistics courses. Contemporary Educational Psychology, 28, pp. 161–186.

Glass, Michael, Melissa Desjarlais, Jung Hee Kim, and Kelvin Bryant. (2013). COMPS Computer-Mediated Problem Solving Dialogues (poster abstract). International Conference on Computer Supported Collaborative Learning, Madison, WI.

Glass, Michael, Jung Hee Kim, Kelvin Bryant, Melissa Desjarlais, Micayla Goodrum, and Thomas Martin. (2014a). Toward Measurement of Conversational Interactivity in COMPS Computer Mediated Problem Solving. Proceedings of the 25th Modern Artificial Intelligence and Cognitive Science Conference (MAICS-14), Spokane, WA.

Glass, Michael, Jung Hee Kim, Kelvin Bryant, and Melissa Desjarlais. (2014b). Indicators of Conversational Interactivity in COMPS Problem-Solving Dialogues, Third Workshop on Intelligent Support for Learning in Groups (ISLG) at Twelfth International Conference on Intelligent Tutoring Systems, Honolulu, Hawaii, June.

Glass, Michael, Jung Hee Kim, Kelvin Bryant, and Melissa Desjarlais (2015). Come Let Us Chat Together: Simultaneous Typed-Chat in Computer-Supported Collaborative Dialogue. Journal of Computing Sciences in Colleges, 31 no. 2, Dec.

Hackett, G. (1985). Role of mathematics self-efficacy in the choice of math-related majors of college women and men: a path analysis. Journal of Counseling Psychology, 32, pp. 47–56.

Harackiewicz, J. M., A.M. Durik, K.E. Barron, L. Linnenbrink-Garcia, and J.M. Tauer. (2008). The role of achievement goals in the development of interest: Reciprocal relations between achievement goals, interest, and performance. Journal of Educational Psychology, 100, pp. 105–122.

Hidi, S., and K. Renninger. (2006). The four-phase model of interest development. Educational Psychologist, 41, pp. 111–127.

Kim, Jung Hee, Melissa Desjarlais, Kelvin Bryant, and Michael Glass. (2013). Observations of Collaborative Behavior in COMPS Computer-Mediated Problem Solving, Proceedings of the 24th Midwest AI and Cognitive Science Symposium, New Albany, IN.

Kim, Taehee, and Diane L. Schallert. (2014). Mediating effects of teacher enthusiasm and peer enthusiasm on students’ interest in the college classroom. Contemporary Educational Psychology, 39, no. 2 pp. 134–144.

Mitchell, M. (1993). Situational interest: Its multifaceted structure in the secondary school mathematics classroom. Journal of Educational Psychology, 85, pp. 424–436.

Pajares, F., and M. D. Miller. (1995). Mathematics self-efficacy and mathematics outcomes: The need for specificity of assessment. Journal of Counseling Psychology, 42, pp. 190–198.

Stahl, Gerry (2004). Building collaborative knowing: Elements of a social theory of CSCL. In J. W. Strijbos, P. Kirschner and R. Martens, eds., What we know about CSCL and implementing it in higher education, Boston, MA: Kluwer Academic Publishers, pp. 53–86.

Stahl, Gerry, ed. (2009). Studying Virtual Math Teams, Springer, pp. 57–73.

Tchounikine, Pierre, Nikol Rummel, and Bruce M. McLaren. (2010). Computer Supported Collaborative Learning and Intelligent Tutoring Systems. In R. Nkambou, J. Bourdeau, & R. Mizoguchi (Eds.) Advances in Intelligent Tutoring Systems. Chapter 22, pp. 447–463, Springer.

Witten, Ian H. and Eibe Frank. (2005). Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition, Morgan Kaufmann.

Wood, R.E. and E.A. Locke. (1987). The relation of self-efficacy and grade goals to academic performance. Educational and Psychological Measurement, 47, pp. 1013–1024.

Zhou, Nan (2009). Question Co-Construction in VMT Chats. In Stahl, 2009, pp. 141–159.

Table 1: Example of Dialogue Transcript with Affective Features
Turn  Student  Time     Dialogue turn                                             Affective State
1     A        06:44.2  f and foo are the refernece variables
2     A        07:05.2  so those together make 16? for the refrence types
3     B        07:11.9  yup yup
4     A        07:27.9  16 bytes
5     C        07:30.2  2a = 20
6     C        07:36.0  :D                                                        Excited
7     B        07:39.7  there ya go lol                                           Humor
8     D        07:54.9  Wait where did you get 16?                                Confused
9     D        08:05.8  wouldnt it be 48 at least for the main method
10    D        08:18.3  because the array creates 5 object
11    A        08:26.1  oh yeah i looked over that was just counting m f and foo  Apologetic
12    C        08:28.7  those are on the heap not the stack
13    D        08:48.0  So the objects created by an array are on the heap
14    A        09:13.8  yeah run time stack = 48
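
The machine-derived turn features listed under Transcript Processing and Annotation can be sketched in a few lines of Python. This is only an illustration, not the COMPS implementation: the function name `turn_features`, the three-word window for "near the beginning", the extra marker words "so" and "because", and the emoticon pattern are all assumptions. The overlapped-typing feature is omitted because it needs keystroke timing that the transcript text alone does not carry.

```python
import re

# "now" and "therefore" come from the paper; "so" and "because" are assumed additions.
DISCOURSE_MARKERS = {"now", "therefore", "so", "because"}
# Pronouns that include another participant, as listed in the paper.
INCLUSIVE_PRONOUNS = {"you", "we", "us"}
# Hypothetical emoticon pattern covering forms such as :D  :(  ):  D:<
EMOTICON_RE = re.compile(r"[:;]-?[()DPp]|[()D]-?:<?")
WORD_RE = re.compile(r"[a-z']+")


def turn_features(text, window=3):
    """Return simple surface features for one dialogue turn."""
    words = WORD_RE.findall(text.lower())
    return {
        # marker word within the first `window` words of the turn
        "discourse_marker": any(w in DISCOURSE_MARKERS for w in words[:window]),
        "inclusive_pronoun": any(w in INCLUSIVE_PRONOUNS for w in words),
        "question_mark": "?" in text,
        "emoticon": bool(EMOTICON_RE.search(text)),
        "length_words": len(text.split()),
    }


# Turns 2, 6 and 8 from Table 1.
for turn in ["so those together make 16? for the refrence types",
             ":D",
             "Wait where did you get 16?"]:
    print(turn, "->", turn_features(turn))
```

Each turn yields one row of features, which matches the one-turn-per-spreadsheet-line layout described in the text.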

Table 2: Example of Feature words
Excited   Apologetic    Confused            Frustrated     Sad
:D        sorry         i'm confused        D:<            :(
yay       my bad        how                 ):<            ):
yes!      nvm           why                 This is hard   I feel stupid
!!!       whoops        what is
cool!     i messed up   I don't understand
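
The paper says the Table 2 cues are expected to become machine-derived features. A minimal keyword-matching sketch of that idea follows. It is not the authors' annotator: the names `AFFECT_CUES` and `tag_affect` and the matching rules (whole-token comparison for emoticons, word-boundary matching for phrases) are assumptions. Humor has no cues in Table 2, so it is absent here.

```python
import re

# Cue lists transcribed from Table 2 (phrases lowercased for matching).
AFFECT_CUES = {
    "Excited": [":D", "yay", "yes!", "!!!", "cool!"],
    "Apologetic": ["sorry", "my bad", "nvm", "whoops", "i messed up"],
    "Confused": ["i'm confused", "how", "why", "what is", "i don't understand"],
    "Frustrated": ["D:<", "):<", "this is hard"],
    "Sad": [":(", "):", "i feel stupid"],
}


def _matches(cue, text):
    if ":" in cue:
        # Emoticon: compare whole whitespace-separated tokens, case-sensitive,
        # so that "):" does not fire inside "):<".
        return cue in text.split()
    if not any(ch.isalnum() for ch in cue):
        # Pure punctuation such as "!!!": plain substring test.
        return cue in text
    # Word or phrase: case-insensitive, not embedded in a longer word.
    pattern = r"(?<!\w)" + re.escape(cue) + r"(?!\w)"
    return re.search(pattern, text.lower()) is not None


def tag_affect(text):
    """Return the affective states whose Table 2 cues appear in the turn."""
    return [state for state, cues in AFFECT_CUES.items()
            if any(_matches(cue, text) for cue in cues)]


print(tag_affect(":D"))
print(tag_affect("Wait where did you get 16?"))
```

Applied to Table 1, this tags turn 6 as Excited but returns nothing for turn 8, which the coders marked Confused. That is consistent with the coding definition, which also counts confusion exhibited through questions, so cue words alone would likely need to be combined with features such as question marks.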

Table 3: Beginning and end of semester surveys
Time Interest Efficacy
beginning of sem. 4.33 / 5 2.83 / 5
ending of sem. 4.32 / 5 3.81 / 5

Table 4: After lab surveys
       Effectiveness of group work   Understanding of concept   Interest in lab
       Mean    SD                    Mean    SD                 Mean    SD
Lab1   3.17    0.68                  3.45    0.96               3.19    0.94
Lab2   3.08    0.93                  3.42    1.05               3.08    0.93
Lab3   3.47    0.71                  4.03    1.06               3.65    0.76
Lab4   3.40    0.61                  3.78    0.85               3.17    0.89
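
The first-half versus second-half figures quoted in the Results ("about 3.1 to about 3.4" for effectiveness, "about 3.4 to about 3.9" for understanding) can be approximately recovered from the Table 4 lab means. The sketch below averages each pair of labs with equal weight. That is an assumption: the paper does not state how the quoted figures were computed, and the number of respondents per lab is not given, so a respondent-weighted mean could differ slightly. The names `LAB_MEANS` and `half_means` are illustrative.

```python
# Lab means copied from Table 4, in the order Lab1..Lab4.
LAB_MEANS = {
    "effectiveness": [3.17, 3.08, 3.47, 3.40],
    "understanding": [3.45, 3.42, 4.03, 3.78],
    "interest": [3.19, 3.08, 3.65, 3.17],
}


def half_means(values):
    """Unweighted mean of Labs 1-2 and of Labs 3-4 (assumes equal weight per lab)."""
    first = sum(values[:2]) / 2
    second = sum(values[2:]) / 2
    return first, second


for name, values in LAB_MEANS.items():
    first, second = half_means(values)
    print(f"{name}: Labs 1-2 = {first:.3f}, Labs 3-4 = {second:.3f}")
```

Under this assumption the interest column rises only from roughly 3.1 to 3.4 because Lab 4 falls back toward the Lab 1 and Lab 2 level, which agrees with the fluctuation described in the text.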