=Paper=
{{Paper
|id=Vol-1584/paper22
|storemode=property
|title=Student Understanding and Engagement in a Class Employing COMPS Computer Mediated Problem Solving: A First Look
|pdfUrl=https://ceur-ws.org/Vol-1584/paper22.pdf
|volume=Vol-1584
|authors=Jung Hee Kim,Michael Glass,Taehee Kim,Kelvin Bryant,Angelica Willis,Ebonie McNeil,Zachery Thomas
|dblpUrl=https://dblp.org/rec/conf/maics/KimGKBWMT16
}}
==Student Understanding and Engagement in a Class Employing COMPS Computer Mediated Problem Solving: A First Look==
Jung Hee Kim et al. MAICS 2016 pp. 69–74

* Jung Hee Kim, North Carolina A&T State U., jungkim@ncat.edu
* Michael Glass, Valparaiso U., michael.glass@valpo.edu
* Taehee Kim, North Carolina A&T, tkim@ncat.edu
* Kelvin Bryant, North Carolina A&T, ksbryant@ncat.edu
* Angelica Willis, North Carolina A&T, awillis@aggies.ncat.edu
* Ebonie McNeil, North Carolina A&T, eimcneil@aggies.ncat.edu
* Zachery Thomas, North Carolina A&T, zithomas@aggies.ncat.edu

Copyright retained by the authors.

===Abstract===
COMPS computer-mediated group discussion exercises are being added to a second-semester computer programming class. The class is a gateway for computer science and computer engineering students, where many students have difficulty succeeding well enough to proceed in their major. This paper reports on first results of surveys on student experience with the exercises. It also reports on the affective states observed in the discussions that are candidates for analysis of group functioning. As a step toward computer monitoring of the discussions, an experiment in using dialogue features to identify the gender of the participants is described.

===Introduction===
The second Java programming class, GEEN 165, at North Carolina A&T State University is a bottleneck for many Computer Science and Computer Engineering students. As an experiment in improving student learning and interest, COMPS computer-mediated discussion exercises (Glass et al., 2014a) have been introduced. This paper reports on first measurements of a) student self-efficacy and interest, and b) expressions of affect within the discussions. As a test of our ability to have the computer monitor the conversation, the expressions of affect were applied toward the task of using dialogue features to identify the gender of the participant.

GEEN 165 corresponds to the CS2 (second semester) class in the ACM/IEEE curriculum (ACM/IEEE, 2013). The historical success rate for students attempting GEEN 165 is low. From 2003 to 2012, comprising about 1000 student-semesters, approximately 66% of students succeeded well enough (grade C or better) on the first attempt to continue to the next class. The fact that so many students have difficulty makes it potentially a fertile class for experimenting with educational innovation.

Lab-based computer programming classes traditionally permit unstructured group interaction. Students can talk to each other even as the classes require them to write their own software. Therefore problem-solving discussions, where students respond to each other in normal dialogue fashion, are a natural addition to the lab component of a computer programming class.

NC A&T has migrated to an objects-later curriculum, meaning that CS2 contains more object concepts than the first semester CS1 class. The student exercises in this intervention are thus oriented toward object concepts.

Expressions of affect have three potential uses for this project. One is that they are indications of emotional states that may affect student enthusiasm, self-efficacy and satisfaction. Another is that they will be used in studies of group interaction. Finally, they may be detectable by machine, contributing to an instructor's dashboard or other assessment of how well the group discussions are working.

===Background===

====COMPS Dialogue Platform and Exercises====
COMPS is a web-delivered computer-mediated chat environment (Kim et al., 2013). It permits the instructor (or a TA) to monitor each conversation. The dialogue data from this study comes from log files. Attesting to the interactivity of the COMPS experience, about half of all typing occurs while several students are typing. Even three students at a time can be typing and responding to each other, all contributing to the same discussion, since they can see each other's keystrokes in real time. In spoken conversation productive dialogue does not happen when three people are talking at once, but we have shown that in the chat domain it indeed occurs (Glass et al., 2015).

The exercises in this project involve students solving multiple-choice questions. When implementing these as group collaborations, we pay attention to three principles that promote successful collaborative learning: a structure or activity script for the students to follow, creative interdependence, and individual accountability (Eberly Center, 2016).

The activity is structured as follows. The students are instructed to come to consensus on the answer, then have one student approach the instructor or a TA to verify the answer. That student is responsible for bringing the correct answer (or a hint) back to the group, and they must reach consensus again.

Creative interdependence means that students should need each other to complete the exercise; it should not be reasonable for one or several students to race ahead and finish it, leaving the others behind or letting them not participate. During the discussion the obligations of discourse require that students explain themselves in the course of reaching consensus. Having conceptual knowledge as the learning goal promotes explanatory dialogue. We have examples where seemingly the weakest student serves as a metacognitive regulator, challenging or directing every reasoning step and becoming a participant in all dialogue exchanges as the other students seem to teach that weakest one (Glass et al., 2013).

Individual accountability typically occurs after the group exercise, where the students have a quiz or an exercise utilizing what they have learned. Individual accountability also occurs within the discussion, as the students find themselves responsible for explaining their positions in order to reach consensus.

====Addressing Student Learning====
Our collaborative inquiry learning exercises are in line with current practices in Computer Supported Collaborative Learning. A key concept is group cognition, where different participants in a conversation contribute different parts of the epistemic knowledge construction task. The Virtual Math Teams project, where students solve math problems through computer-mediated chat, has documented this phenomenon (Stahl, 2009). Learning through group cognition is justified both in terms of learning outcomes and student motivation. There is also research specifically showing that collaborative activity is a desirable pedagogical approach for “relational understanding” or understanding of concepts (Tchounikine et al., 2010). Dialogue that engages in domain reasoning, such as explaining, negotiating, or inferring, is observed in these kinds of exercises (Zhou, 2009; Stahl, 2004).

The implication for COMPS technology is that monitoring the health of student conversations could be informed by a) whether students are talking to each other, and b) whether they are engaging in reasoning activities.

====Addressing Student Interest and Self-Efficacy====
Group exercises address many of the components of student interest. Interest refers to an individual’s psychological inclination to participate in particular content over time (Hidi & Renninger, 2006). There is a relationship between interest, achievement goals, performance and retention (Harackiewicz et al., 2008). Interest plays a critical role in students’ further decisions on engaging and reengaging in the major (Brown, 2012). The four-phase model of interest posits four sequential interest phases: triggered situational interest (“catching”), maintained situational interest (“holding”), emerging individual interest, and well-developed individual interest. Mitchell (1993), as an example, reported that group work activities, computer-based activities, engaging puzzles, and meaningful activities were correlated with triggering and holding interest in a mathematics classroom.

Recently Kim and Schallert (2014) have investigated the mediating effect interpersonal interactions have on student interest. It is possible to track student interest in four developmental phases throughout a semester, not just within the time frame of individual activities. It is affected not only by the enthusiasm expressed by the teacher and fellow students, but also by factors such as affiliative motivations: the desire to belong to the group. The social factors enhancing interest were found within college classes in a number of diverse disciplines (e.g. history, chemistry, religion) in both upper and lower level college classes.

Viewed in this light, group exercises should address student motivation issues through social interaction at the same time as they address learning of concepts through group cognition. The exercises are constructed so that students engage with other students, providing the small-group interpersonal contact that best transmits enthusiasm. The students know the teacher is watching the conversations and is taking an active interest in the students' progress, sometimes by intervening and sometimes by providing answers and hints.

The implication for COMPS technology is that monitoring the health of student conversations could be informed by expressions of student affect. Affect, the observable manifestation of emotion, mediates social interaction and is related to student interest.

Self-efficacy, an individual’s belief in being capable of performing a particular task (Bandura, 1977), has been widely studied because of its relationship to performance including academic achievement (Choi, 2005; Pajares and Miller, 1995; Wood and Locke, 1987) and even choice of major in college (Hackett, 1985). In accordance with the suggestions of Finney and Schraw (2003), we measured self-efficacy using task-specific survey items rather than generalized questions. This project measures students’ self-efficacy both at the level of the skills in individual assignments at the time of the COMPS exercises and overall in the topics of the class at the beginning and end of the semester.

===Data and Methods===
We have collected data from one semester of the GEEN 165 class. There were 55 students at the start of the semester and 47 at the end. We administered COMPS exercises four times during the semester, with 53 group discussions in total. Most groups had 3 or 4 participants. The bulk of students were assigned to sessions quasi-randomly as students arrived in lab. Cliques of friends, who tended to arrive together, were split into different random groups. We deviated from this protocol by creating a few all-female groups, for comparison with the all-male groups. Altogether there were about 8000 dialogue turns. Students were surveyed near the beginning and end of the semester regarding their enthusiasm for the class, their self-efficacy in programming, and their desire to continue. Every COMPS exercise was also accompanied by a survey of the student experiences.

====Transcript Processing and Annotation====
Table 1 contains an extract from a COMPS discussion. From COMPS log files we extract dialogue turns in spreadsheet format for processing. The text from one dialogue turn is in one line of the spreadsheet. In addition to the metadata such as problem number, turn number, and time stamp, each dialogue turn is tagged with features. Some are derived by software and some are annotated by hand. These features are available for machine learning experiments and for human analysis and study of dialogues. The machine-derived classifiers are available for feeding software that will monitor the health of the conversation.

Some of the existing machine-derived features (Glass et al., 2014b) that have been relevant to transcript studies and machine monitoring of the health of the conversation are:
* The presence of discourse marker words, e.g. “now” or “therefore” near the beginning of a dialogue turn. These are linguistically associated with reasoning, and are therefore possibly indicative of productive discussion.
* The presence of pronouns that include another participant in the dialogue: “you,” “we,” “us.” These are possibly indicative of transactive discussion.
* The presence of question marks.
* The presence of emoticons. It is possible that emoticons are associated with students attending to each other's affect.
* The length of a turn in words.
* Whether typing this turn overlapped with other people typing.

====Affective States Evinced in Dialogue====
Of particular interest are six affective states that we have chosen as initial targets. These are annotated by hand. They were chosen because they may be salient for monitoring both the learning aspects (whether the students are reasoning together) and the social health of the conversation. We show here some of the definitions that the coders have applied for consistency in recognizing and coding.
* Excited.
* Apologetic. Refers to a user expressing regret for a previous action. This type of message is usually aimed towards another user or towards the group as a whole.
* Humor.
* Frustrated.
* Confused. User explicitly expressing confusion, or exhibiting confusion e.g. through questions.
* Sad. A negative emotion determined by keywords and sad emoticons that are usually directed at self.

Some of these affective states have been tagged and illustrated in the Table 1 dialogue. Table 2 indicates some of the textual indications for the various states. These are being used by the coders at present, but will become machine-derived features for the purpose of machine-annotating the affective states.

====Surveys====
The survey administered to all students at the beginning and end of the semester has an interest part and a self-efficacy part. The end-of-semester survey also inquires about student plans for continuing in the major and registering for the next programming class. All items use a 6-point scale. The interest survey items are derived from a survey from Harackiewicz et al. (2008). One of the authors of this paper has utilized these items to assess how much a student's interest in a class is affected by the enthusiasm of fellow students (Kim and Schallert, 2014). Some representative items are “What we are learning in GEEN165 this year can be applied to real life” and “To be honest, I don’t find what we do in the GEEN165 class interesting.” The self-efficacy items inquire about student confidence in completing 13 tasks corresponding to class topics. This list was obtained from the instructor. A typical item is “Design inner classes that implement event handling interfaces.”

The after-COMPS-lab survey had items covering student perceptions in three areas: student interest, whether the student learned from the lab, and how well the group exercise functioned. An example item is “I contributed to the understanding of other students in my group.”

===Results===

====Survey Results====
Table 3 shows the students' perceptions of interest and efficacy at the beginning and end of the semester. All interest items were combined into one mean, and likewise all efficacy items. In total 28 students participated in both pre- and post-surveys.
* Students’ interest toward the course did not change. The averages of students’ interest toward the course at the beginning and end of the semester were 4.33 and 4.32 respectively.
* Self-efficacy with respect to the course content indicated significant improvement between the beginning and the end of the semester, rising from 2.83 to 3.81. The increase in self-efficacy was significant, p < 0.01.

Table 4 shows students' perception of the COMPS labs, surveyed immediately after each lab. There seemed to be a clear improvement between the first part of the semester (Labs 1 and 2) and the later part (Labs 3 and 4). Students perceived:
* more effective group work in the second part (means rose from about 3.1 to about 3.4)
* better understanding of concepts in the second half (means rose from about 3.4 to about 3.9).

Multiple one-way ANOVAs support the hypothesis that mean scores are indeed different, p = 0.03 for both effectiveness and understanding. Post hoc analyses using the Tukey test for significance indicated that the mean scores of Lab 3 were significantly higher than Lab 2 for both effectiveness and understanding.

However, students’ interest in each exercise in the lab sessions seemed to fluctuate throughout the semester. Lab 3 had the highest interest, which corresponded with the highest effectiveness and understanding. But interest in Lab 4 was approximately the same as in Labs 1 and 2.

====Affective States by Gender====
We annotated the 14 group discussions of one COMPS exercise, comprising 2147 dialogue turns, for the six affective features. In total 199 turns, or 9.3%, showed evidence of one or more features.

As a first test of the utility of these features along with the machine-generated ones, we tried to use them to predict the gender of the participant. Among 49 students we had 16 women and 33 men. First we aggregated all the turns from each student, and looked at statistical differences between the two populations. Two-tailed t-tests revealed that none of the features were significantly different between the genders at the p < 0.05 level. However, expressions of apology were different at the p = 0.06 level. The most common affective feature was confusion, with 62 instances of utterances expressing confusion. Women expressed confusion in 4.6% of turns, and men in 2.2%. It suggests the two genders behave differently, but the p < 0.22 level does not show significance. The two genders also showed differences in the amount of participation. Men each uttered an average of 46 turns per dialogue and women 36 turns.

We then trained a J48 decision tree classifier and a multiple-regression linear classifier using the Weka data mining tool (Witten and Frank, 2005). The task is to classify each dialogue turn with the gender of the speaker, to mimic the task of monitoring a conversation in real time turn-by-turn. These classifiers have not been successful. The same features that are statistically correlated with gender are discovered by the decision trees, but accuracy has been quite low.

===Discussion and Future Work===

====Survey Results====
The students experienced improvement in their experience of the COMPS exercises during the semester. They reported that the groups worked better in the last two exercises and that they learned more. It is not clear why student interest was lower in the last lab. Anecdotally there are two reasons that have been suggested by the instructor and lab TAs who supervised this session. One is that the last lab was optional, presented during Thanksgiving week. That fewer students attended could indicate that the general level of engagement was lower than usual. The other is that perhaps the novelty was wearing off. Some students expressed as much during the session. We will need to find some way to survey the reasons for student interest.

The pre- and post-semester survey is hard to interpret because of low participation rate and dropouts. In the next semester we are enforcing better participation. The increase in self-efficacy was striking, but we do not yet have any comparison with other classes. Future work includes comparing interest and self-efficacy with learning gains on the pre- and post-tests. Future work also includes comparing pre- and post-semester survey results with individual lab surveys, to see whether there are correlations between overall student interest and the situational interest in individual COMPS exercises.

Another analysis in the future will be between the participants of the same group: do they agree about learning and group functioning, and do they have similar learning gains?

====Affective States in Dialogue====
Hand-annotating the remaining 6000 turns of dialogue may result in more reliable statistical correlations. We are also at work toward machine-annotation of these features.

Annotation of the affective states so far has relied entirely on the text of the dialogues. Future work will include extra-linguistic features. In COMPS group exercises in other classes evidence of student engagement sometimes presents through Comic Sans typeface, big or bold fonts, and wild colors. We are also exploring using timing features from the overlapped typing. Students can all type simultaneously while seeing each other's developing chat text (Glass et al., 2015). We think that typing speed, degree of simultaneous typing, and pauses as they look at each other's turns may provide indications of affective states such as being excited, or indications of when they are attending to each other's utterances.

===Acknowledgements===
Partial support for this work was provided by the National Science Foundation's Improving Undergraduate STEM Education (IUSE) program under Award No. 1504917. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

===References===
* ACM/IEEE. Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science. New York: ACM, 2013.
* Bandura, A. (1997). Self-efficacy: the exercise of control. New York: Freeman.
* Brown, J. (2012). Developing a freshman orientation survey to improve student retention within a college. College Student Journal, 46, pp. 83–851.
* Choi, N. (2005). Self-efficacy and self-concept as predictors of college students’ academic performance. Psychology in the Schools, 42, pp. 197–204.
* Eberly Center (2016). Carnegie Mellon, Eberly Center for Teaching Excellence and Innovation. What are best practices for designing group projects? http://www.cmu.edu/teaching/designteach/design/instructionalstrategies/groupprojects/design.html. Retrieved March, 2016.
* Finney, S. and G. Schraw (2003). Self-efficacy beliefs in college statistics courses. Contemporary Educational Psychology, 28, pp. 161–186.
* Glass, Michael, Melissa Desjarlais, Jung Hee Kim, and Kelvin Bryant. (2013). COMPS Computer-Mediated Problem Solving Dialogues (poster abstract). International Conference on Computer Supported Collaborative Learning, Madison, WI.
* Glass, Michael, Jung Hee Kim, Kelvin Bryant, Melissa Desjarlais, Micayla Goodrum, and Thomas Martin. (2014a). Toward Measurement of Conversational Interactivity in COMPS Computer Mediated Problem Solving. Proceedings of the 25th Modern Artificial Intelligence and Cognitive Science Conference (MAICS-14), Spokane, WA.
* Glass, Michael, Jung Hee Kim, Kelvin Bryant, and Melissa Desjarlais. (2014b). Indicators of Conversational Interactivity in COMPS Problem-Solving Dialogues, Third Workshop on Intelligent Support for Learning in Groups (ISLG) at Twelfth International Conference on Intelligent Tutoring Systems, Honolulu, Hawaii, June.
* Glass, Michael, Jung Hee Kim, Kelvin Bryant, and Melissa Desjarlais (2015). Come Let Us Chat Together: Simultaneous Typed-Chat in Computer-Supported Collaborative Dialogue. Journal of Computing Sciences in Colleges, 31 no. 2, Dec.
* Hackett, G. (1985). Role of mathematics self-efficacy in the choice of math-related majors of college women and men: a path analysis. Journal of Counseling Psychology, 32, pp. 47–56.
* Harackiewicz, J. M., A.M. Durik, K.E. Barron, L. Linnenbrink-Garcia, and J.M. Tauer. (2008). The role of achievement goals in the development of interest: Reciprocal relations between achievement goals, interest, and performance. Journal of Educational Psychology, 100, pp. 105–122.
* Hidi, S., and K. Renninger. (2006). The four-phase model of interest development. Educational Psychologist, 41, pp. 111–127.
* Kim, Jung Hee, Melissa Desjarlais, Kelvin Bryant, and Michael Glass. (2013). Observations of Collaborative Behavior in COMPS Computer-Mediated Problem Solving, Proceedings of the 24th Midwest AI and Cognitive Science Symposium, New Albany, IN.
* Kim, Taehee, and Diane L. Schallert. (2014). Mediating effects of teacher enthusiasm and peer enthusiasm on students’ interest in the college classroom. Contemporary Educational Psychology, 39, no. 2, pp. 134–144.
* Mitchell, M. (1993). Situational interest: Its multifaceted structure in the secondary school mathematics classroom. Journal of Educational Psychology, 85, pp. 424–436.
* Pajares, F., and M. D. Miller. (1995). Mathematics self-efficacy and mathematics outcomes: The need for specificity of assessment. Journal of Counseling Psychology, 42, pp. 190–198.
* Stahl, Gerry (2004). Building collaborative knowing: Elements of a social theory of CSCL. In J. W. Strijbos, P. Kirschner and R. Martens, eds., What we know about CSCL and implementing it in higher education, Boston, MA: Kluwer Academic Publishers, pp. 53–86.
* Stahl, Gerry, ed. (2009). Studying Virtual Math Teams, Springer, pp. 57–73.
* Tchounikine, Pierre, Nikol Rummel, and Bruce M. McLaren. (2010). Computer Supported Collaborative Learning and Intelligent Tutoring Systems. In R. Nkambou, J. Bourdeau, & R. Mizoguchi (Eds.) Advances in Intelligent Tutoring Systems. Chapter 22, pp. 447–463, Springer.
* Witten, Ian H. and Eibe Frank. (2005). Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition, Morgan Kaufmann.
* Wood, R.E. and E.A. Locke. (1987). The relation of self-efficacy and grade goals to academic performance. Educational and Psychological Measurement, 47, pp. 1013–1024.
* Zhou, Nan (2009). Question Co-Construction in VMT Chats. In Stahl, 2009, pp. 141–159.

{| class="wikitable"
|+ Table 1: Example of Dialogue Transcript with Affective Features
! Turn !! Student !! Time !! Dialogue turn !! Affective State
|-
| 1 || A || 06:44.2 || f and foo are the refernece variables ||
|-
| 2 || A || 07:05.2 || so those together make 16? for the refrence types ||
|-
| 3 || B || 07:11.9 || yup yup ||
|-
| 4 || A || 07:27.9 || 16 bytes ||
|-
| 5 || C || 07:30.2 || 2a = 20 ||
|-
| 6 || C || 07:36.0 || <nowiki>:D</nowiki> || Excited
|-
| 7 || B || 07:39.7 || there ya go lol || Humor
|-
| 8 || D || 07:54.9 || Wait where did you get 16? || Confused
|-
| 9 || D || 08:05.8 || wouldnt it be 48 at least for the main method ||
|-
| 10 || D || 08:18.3 || because the array creates 5 object ||
|-
| 11 || A || 08:26.1 || oh yeah i looked over that was just counting m f and foo || Apologetic
|-
| 12 || C || 08:28.7 || those are on the heap not the stack ||
|-
| 13 || D || 08:48.0 || So the objects created by an array are on the heap ||
|-
| 14 || A || 09:13.8 || yeah run time stack = 48 ||
|}

{| class="wikitable"
|+ Table 2: Example of Feature words
! Excited !! Apologetic !! Confused !! Frustrated !! Sad
|-
| <nowiki>:D</nowiki> || sorry || i'm confused || <nowiki>D:<</nowiki> || <nowiki>:(</nowiki>
|-
| yay || my bad || how || <nowiki>):<</nowiki> || <nowiki>):</nowiki>
|-
| yes! || nvm || why || This is hard || I feel stupid
|-
| <nowiki>!!!</nowiki> || whoops || what is || ||
|-
| cool! || i messed up || I don't understand || ||
|}

{| class="wikitable"
|+ Table 3: Beginning and end of semester surveys
! Time !! Interest !! Efficacy
|-
| beginning of sem. || 4.33 / 5 || 2.83 / 5
|-
| ending of sem. || 4.32 / 5 || 3.81 / 5
|}

{| class="wikitable"
|+ Table 4: After lab surveys
! rowspan="2" | Lab
! colspan="2" | Effectiveness of group work
! colspan="2" | Understanding of concept
! colspan="2" | Interest in lab
|-
! Mean !! SD !! Mean !! SD !! Mean !! SD
|-
| Lab1 || 3.17 || 0.68 || 3.45 || 0.96 || 3.19 || 0.94
|-
| Lab2 || 3.08 || 0.93 || 3.42 || 1.05 || 3.08 || 0.93
|-
| Lab3 || 3.47 || 0.71 || 4.03 || 1.06 || 3.65 || 0.76
|-
| Lab4 || 3.40 || 0.61 || 3.78 || 0.85 || 3.17 || 0.89
|}
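
The machine-derived turn features listed under Transcript Processing and Annotation, together with the Table 2 indicator words, could be computed roughly as in the following Python sketch. It is an illustration only, not the COMPS implementation: the names `extract_features`, `TurnFeatures`, `AFFECT_CUES` and `MARKER_WINDOW` are hypothetical, the word lists contain just the examples given in the paper, the emoticon pattern is an assumed approximation, and "near the beginning" is assumed to mean the first three words. The overlapped-typing feature is omitted because it needs keystroke timing data that the turn text alone does not carry.

```python
import re
from dataclasses import dataclass
from typing import List

# Word lists taken from the examples in the text and from Table 2.
# They are illustrative, not the project's actual lexicons.
DISCOURSE_MARKERS = {"now", "therefore"}
INCLUSIVE_PRONOUNS = {"you", "we", "us"}
MARKER_WINDOW = 3  # assumption: "near the beginning" means the first 3 words

AFFECT_CUES = {
    "Excited": [":D", "yay", "yes!", "!!!", "cool!"],
    "Apologetic": ["sorry", "my bad", "nvm", "whoops", "i messed up"],
    "Confused": ["i'm confused", "i don't understand"],
    "Frustrated": ["D:<", "):<", "this is hard"],
    "Sad": [":(", "):", "i feel stupid"],
}

# Assumed pattern for a few common emoticons such as :D  :(  ):  D:
EMOTICON = re.compile(r"[:;=][-']?[()DPp]|[()D][-']?[:;=]")


@dataclass
class TurnFeatures:
    discourse_marker: bool
    inclusive_pronoun: bool
    question_mark: bool
    emoticon: bool
    length_in_words: int
    affect_candidates: List[str]


def _cue_matches(cue: str, tokens: List[str], text: str) -> bool:
    """Single-token cues must equal a whole token; phrases match as substrings."""
    cue = cue.lower()
    return cue in text if " " in cue else cue in tokens


def extract_features(turn_text: str) -> TurnFeatures:
    """Compute the text-only features for one dialogue turn."""
    tokens = [t.lower() for t in turn_text.split()]
    text = " ".join(tokens)
    # Strip surrounding punctuation so that "you," still counts as "you".
    words = [t.strip(".,!?;:\"'()") for t in tokens]
    return TurnFeatures(
        discourse_marker=any(w in DISCOURSE_MARKERS for w in words[:MARKER_WINDOW]),
        inclusive_pronoun=any(w in INCLUSIVE_PRONOUNS for w in words),
        question_mark="?" in turn_text,
        emoticon=EMOTICON.search(turn_text) is not None,
        length_in_words=len(tokens),
        affect_candidates=[
            label for label, cues in AFFECT_CUES.items()
            if any(_cue_matches(c, tokens, text) for c in cues)
        ],
    )


if __name__ == "__main__":
    # Three turns from Table 1.
    for turn in [":D", "there ya go lol", "Wait where did you get 16?"]:
        print(turn, extract_features(turn))
```

Keyword cues of this kind only propose candidate labels. Turn 8 of Table 1, for example, was hand-coded as Confused because of the question it asks, and none of the Table 2 phrases used here would flag it, which is one reason hand annotation remains the reference.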