COMPS Computer Mediated Problem Solving: A First Look

         Melissa A. Desjarlais                               Jung Hee Kim                             Michael Glass
            Valparaiso University                          North Carolina A&T                       Valparaiso University
         melissa.desjarlais@valpo.edu                        jungkim@ncat.edu                      michael.glass@valpo.edu


                                   Abstract

  COMPS is a web-delivered computer-mediated problem-solving environment designed for
  supporting instructional activities in mathematics. It is being developed as a platform
  for student collaborative exploratory learning using problem-specific affordances. COMPS
  will support computer-aided monitoring and assessment of these dialogues. In this paper
  we report on the first use of COMPS in the classroom, supporting an exercise in
  quantitative problem-solving. We have identified a number of categories of dialogue
  contribution that will be useful for monitoring and assessing the dialogue and built
  classifiers for recognizing these contributions. Regarding the usability of the interface
  for problem-solving exercises, the primary unexpected behavior is an increase (compared
  to in-person exercises) in off-task activity and concomitant decrease in shared
  construction of the answer. Its first large deployment will be for Math 110, a
  quantitative literacy class at Valparaiso University.

                                 Introduction

COMPS is a web-delivered computer-mediated problem-solving environment designed for
supporting instructional activities in mathematics.
   In its initial classroom use COMPS supports groups of students engaging in a particular
exercise in quantitative literacy: figuring out a winning strategy, should one exist, for
a Nim-like game. It has problem-related affordances for the students to manipulate, shows
the instructor the conversations in real time, permits the instructor to intervene, and
records all events for analysis. The intelligent part of COMPS, which has not been
deployed in classroom use, has the computer itself participate in the supervisory task:
monitoring the conversation status for bits of knowledge and other markers of progress or
lack of progress and displaying its findings to the supervising instructor.
   COMPS gives us a platform for deploying AI techniques in mathematics dialogues.
Immediate applications include:
• Exploratory learning. COMPS is an environment with affordances for computer-supported
  collaborative exploratory-learning dialogues. Plug-in modules provide problem-specific
  aids and affordances. The Poison game we report on here comes with a visualization of
  the game state and buttons for playing.
• Computer-monitored dialogues. COMPS has provisions for an instructor to oversee and
  intervene in the student conversations. In the style of, e.g., Argunaut [De Groot et al.
  2007], COMPS will provide a status screen for the instructor, showing what knowledge the
  students have discovered in their inquiry learning as well as measures of affective
  state (e.g. are they on-task or frustrated) and other measures of progress. Experiments
  toward computer-generated status are described in this paper.
• Assessment reports. Using similar techniques as for monitoring, COMPS will provide the
  instructor with assessment reports of the conversations. This will permit the instructor
  to have the students engage in the exercises out of class, on their own time.
• Observation and data collection. COMPS collects transcripts and data that will be useful
  both in understanding the student problem-solving behaviors and in producing better
  computer understanding of COMPS conversations.
   In this paper we report on the interaction model of COMPS, the educational context for
its initial deployment, results from first use in a classroom setting, and first results
toward having it monitor the progress of the student conversation.

                               The COMPS Model

The common threads to COMPS applications are a) dialogue, b) solving problems, and c)
third parties. It is intended to facilitate and capture the kinds of interactions that
would occur in mathematics problem-solving conversations. We have a simplified
keyboard-chat communication channel instead of in-person face-to-face and voice
communication. This permits us to readily log all interaction; more importantly, it
facilitates having the computer understand, monitor, assess, and potentially intervene in
the dialogue. Because the problem domain is mathematics, COMPS includes facilities for
interpreting and rendering “ASCII Math,” expressions typed in-line using ordinary keyboard
characters [MathForum 2012a].
   COMPS conversations can be tutorial or they can be peer-to-peer explorations. Our view
of how to support interactions is informed by the tutorial problem-solving dialogue
studies of [Fox 1993] and the Virtual Math Team problem-solving dialogue studies of [Stahl
2009]. Wooz, the immediate predecessor to COMPS, has been used for recording and
facilitating tutorial dialogues in algebra and differential equations, experiments in
structured tutorial interactions, and exploratory learning with differential equations
visualization applets [Kim and Glass 2004][Patel et al. 2003][Glass et al. 2007].
   The other element of COMPS conversations is possible third parties: teachers
figuratively looking over the shoulders of the students as they work, computers also
looking over the shoulders, teachers and computers intervening in the conversation,
reports generated afterward with assessments of the student learning sessions, and
analyses of the transcripts of interactions.
   The common elements of COMPS applications are thus:
• Interactivity. Just as in in-person interactions, participants can behave
  asynchronously: interrupting and chatting over each other. Participants can see the
  other participants’ keystrokes in real time; they do not need to take turns or wait for
  the other person to press enter. One use for this is documented by Fox, who found tutors
  using transition relevance points [Sacks et al. 1974]. These are places within a
  dialogue turn where the other party is licensed to take over. For example, the tutor can
  provide scaffolding by starting to say an answer. Using prosodic cues (rising voice,
  stretched vowels), the tutor provides the student opportunities to take over the
  dialogue and complete the thought.
• A problem window. The problem is configurable, but generally there are areas of the
  screen window that keep the problem statement and elements of the solution in view
  without scrolling them off the screen. These items are assumed to be within the dialogue
  focus of all participants at all times, the objects of team cognition (Stahl) and shared
  construction (Fox).
• A central server. The server routes interaction traffic between the participants and
  optional third parties to the conversation (both human and machine), and records all
  interactions in log files.
Figure 1 at the end of this paper illustrates COMPS at work.
   COMPS runs as a Flash application within a web browser; the server is a Java program.
The application is configurable: plug-in modules written in Flash provide custom
environments tailored for particular mathematics problems and learning modalities.
   COMPS is similar in spirit to the Virtual Math Teams (VMT) chat interface [MathForum
2012b]. The VMT interface supports a generalized graphical whiteboard instead of having
specialized interfaces for particular exercises. However, many of the exercises that COMPS
is intended to support are currently executed in class with manipulatives. For example,
the Poison game described in this report uses piles of tiles. It was incumbent on us to
have COMPS provide software affordances that mimic the manipulatives.

                             Math 110 Background

A goal of this project is to introduce COMPS computer-mediation to the group collaborative
exercises in the Valparaiso University (VU) Math 110 class. Math 110 delivers the
quantitative literacy skills expected of an educated adult [Gillman 2006] along with the
mathematics skills expected in quantitative general education classes in a liberal arts
curriculum. It achieves this by using modern pedagogical techniques and a selection of
topics and problems that are quite different from, more motivating than, and we hope more
successful than the typical bridge or college algebra class.
   It is the style of instruction that matches Math 110 to COMPS, viz:
• Problems are explored by experimentation, using manipulatives and written instructions.
• Four-person groups collaborate on the in-class explorations, with students adopting
  special assigned roles in the collaborative process.
• During the class period the instructor observes the group interactions and offers
  suggestions or guiding questions, as needed.
These are aligned with the three threads of COMPS: solving problems, dialogue, and third
parties. During a semester, students solve twenty in-class problems. An emphasis is placed
on problem-solving strategies.
   Math 110 in its current form has been the established bridge class in the VU curriculum
for 15 years. Students enrolled in Math 110 performed poorly on the mathematics placement
exam and must successfully complete the course before they can enroll in
quantitatively-based general education courses. Data show that completing Math 110 has a
positive effect on retention and success at the university [Gillman 2006].
   Math 110 differs from simply repeating high school algebra not only in teaching style
but also in content. There are five topical themes: Pattern Recognition, Proportional
Reasoning, Fairness, Graphs and Decision Science, and Organizing Information. Together
these themes provide a background in logical reasoning, quantitative skills, and critical
thinking.
   Writing skills are exercised by requiring students to write up each problem in a
narrative format. Each written solution includes the statement of the problem in the
student’s own words, the solution of the problem, and an explanation of the solution.
Often this entails a description of the experimental activities and results. The students
are assessed on the written aspect of the solution in addition to the mathematical aspect.

                               Poison Exercise

An example of a Math 110 collaborative exercise, the first we have implemented in COMPS,
is the Poison problem. The prompt is shown in Figure 2 at the end of this paper. Poison is
a Nim-like two-person game. Starting from a pile of tiles, each person removes one or two
tiles per turn. The last tile is “poisoned”: the person who removes the last tile loses.
The question before the students is to figure out how to play perfectly, to find an
algorithm for either person A or person B to force a win. In a classroom setting the
manipulative for this exploratory learning exercise in pattern
recognition is a box of tiles. Students also have pencil and paper.
   For purposes of moving this exercise to the computer-mediated environment, we wrote a
COMPS module that simulates the manipulatives: the pile of tiles. There are buttons for
each of the two teams to remove one or two tiles. There is an option to arrange the tiles
into small groups, a useful way to visualize the game and its solution. Students sometimes
discover this method while playing with the tiles on the table-top. There is an option to
restart the game with an arbitrary number of tiles. Students often find that they can
better analyze the game if they consider a simpler problem, with only a few tiles.
Finally, there is a record of the moves played, since in the face-to-face regime students
typically write down the sequences of moves for study.
   The current status of this COMPS plug-in is that students can play Poison, the teacher
can monitor all the ongoing conversations in the computer lab, and the teacher can
intervene. The computer is not yet monitoring the conversation.

                                 First Usage

Setup
In November 2011 students in an elementary education mathematics course used the COMPS
version of the Poison exercise. These were not Math 110 students, but education students
who would normally engage in quantitative literacy classroom exercises as part of both
learning the mathematics and experiencing how it is taught.
   Twenty-five students were arranged in six groups in a computer lab so that group
members were not near each other and verbal conversations were discouraged. The students
were accustomed to working in groups sitting around a table. Keyboard chat was a new
element. Each student was given a copy of the problem. The instructor logged in as a
member of each group so that she could monitor and contribute to the conversations. A
sample from a conversation is shown in Figure 3. The session ran for approximately 40
minutes, at which time the students stopped where they were and gathered together offline
to share notes for their written reports.

 A    well everytime ive had 4, or 7 i lose.
 C    huh?
 A    Oh wait, that’s every round >:(
 C    i dont think it matters
 B    hahaha
      (playing game)
 B    lets do 23 again and ill pick a 1 to start instead of a 2?
 A    FINE
      ...
 D    i just tried to avoid 7 and still got stuck with 4

          Figure 3: Dialogue from Poison Exercise Using COMPS

   A    How?
 //D    If you take 2, then whatever you do on the next turn, you can do the opposite to
        leave 1.
   B    If you take 1 or 2, then you can take 1 or 2 to counter balance that//
   A    OK
   C    OK
 //C    So if I take 2, whatever they do ...
   B    So basically if the other team ends up 4 left, then you can win. //
   D    Yes
   B    And that’s if the other team ends up with 4 left
   B    OK
   A    We could maybe abbreviate opponent as OPP or something. Whatever, you might be
        writing a lot.
   B    So yeah. um
        (sounds of mumbling)
   C    Ok. Um
   B    Oh boy
   A    We don’t need grammar.
   B    Um so, if they 4 left you can win have how can you get it so that ..
   D    If you have 5 or 6 on your turn, you can either take 1 or two to get it to that
        situation.
   B    Ok you you want to get to 4, that’s kind of a stable point where you can force
        them

                  Figure 4: In-Person Poison Dialogue

Observations
Both from experience observing Poison exercises, and from prior audiotaped sessions,
differences between COMPS-mediated and in-person versions of Poison were evident.
• The COMPS students spent considerable time off-task, chatting about things not related
  to the problem. From the start, when students were logging in and greeting each other,
  it took some time for them to focus on the problem. Off-task conversation was almost
  negligible in our audio tapes, and not extensively observed in the classroom before the
  problem is solved.
• The COMPS groups spent much time playing the game for entertainment value, without
  advancing toward the goal of deducing whether a winning strategy existed.
• In the COMPS environment there was more team rivalry between the two teams within a
  group. There was even an instance where a student was reluctant to share the winning
  strategy with the rest of the group.
   A consequence of all three of these behaviors is that instances of shared construction
of the winning strategy are less often observed in the COMPS transcripts than in the
transcribed verbal ones. Figure 4 (in-person) and Figure 3 (computer-mediated) illustrate
the typical difference. The in-person group engages in long exchanges where group
cognition is evident. In the computer-mediated group the students rarely engage with each
other for more than several turns at a stretch.
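
   For reference when reading these transcripts, the strategy the groups are trying to
discover can be checked mechanically. The following is a minimal Python sketch, not part
of COMPS, that uses backward induction over pile sizes under the rules stated for Poison
(remove one or two tiles per turn; whoever removes the last tile loses). The function
names losing_positions and best_move are hypothetical, chosen only for illustration.

```python
def losing_positions(max_tiles, max_take=2):
    """Pile sizes (1..max_tiles) at which the player about to move loses
    against perfect play. Names and signature are illustrative assumptions."""
    # wins[n] is True when the player about to move with n tiles can force a win.
    # With 0 tiles left, the opponent has just removed the poisoned last tile,
    # so the player "to move" has already won.
    wins = [True] + [False] * max_tiles
    for n in range(1, max_tiles + 1):
        # A move is good if it leaves the opponent in a position they cannot win.
        wins[n] = any(not wins[n - take]
                      for take in range(1, min(max_take, n) + 1))
    return [n for n in range(1, max_tiles + 1) if not wins[n]]


def best_move(n, max_take=2):
    """Number of tiles to remove from a pile of n so that the opponent is left
    on a losing pile size, or None if every move leaves them a winning one."""
    losing = set(losing_positions(n, max_take))
    for take in range(1, min(max_take, n) + 1):
        if (n - take) in losing:
            return take
    return None


if __name__ == "__main__":
    print(losing_positions(20))  # [1, 4, 7, 10, 13, 16, 19]
    print(best_move(20))         # 1
```

   Starting from 20 tiles, the sketch recommends removing 1 tile so that the opponent
faces 19, and it lists 1, 4, 7, 10, 13, 16, 19 as the pile sizes to leave the opponent on.
This agrees with the realizations recorded as categories 3, 5, and 6 in Table 1.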
The student experience
Students were surveyed the next day in class. There were 8 Likert questions
(strongly-disagree to strongly-agree) and 6 short-answer questions. The students told us
the following.
• Using the computer seemed easy: 19 of the 25 students either agreed or strongly agreed.
• Students were divided over whether it was easier to play Poison on a computer than with
  tiles on a table.
• Eleven students were neutral with regard to whether it was easier to find a winning
  strategy for Poison on a computer than with tiles on a table, while 10 students either
  agreed or strongly agreed that the computer was easier.
  This finding stands in contrast with our observation that the computer-mediated groups
  were less successful in finding a winning strategy.
• Responding to open-ended questions, students enjoyed the chat option in COMPS and the
  fact that the activity was different from other class activities.
• On the other hand, when comparing using COMPS to solving problems face-to-face around a
  table, the students commented that it took time to type their ideas (which were
  sometimes difficult to put into words) and they could not show things to the others.
  One student did comment that the chat environment made the student try to solve the
  problem individually rather than sharing the solution right away among the group
  members.
• Aspects of the Poison module were troublesome. Students were confused about the L/R
  buttons (they were for the two teams), they would have preferred images of tiles to the
  @ symbol, and they found keeping up with the conversation difficult at times.
  This was congruent with our own observation of students using the interface. Images of
  tiles, and perhaps even a way to drag them with a mouse cursor, would be a better model
  for the manipulatives than the simple row of @ symbols and buttons. It took students a
  while to learn to use the interface in this respect.
• The students would have liked to have a way to have a private chat between members of a
  team so that the other team could not see their conversation.
Other observations of student use of the interface:
• The physical tiles are limited to 20, but the computer placed no limit on virtual tiles.
  Combined with the Poison game’s evident play value, this resulted in some COMPS groups
  playing longer games with more tiles than the physical-tiles groups do. Such games did
  not contribute to understanding.
• In-person student groups picked and maintained teams a bit more readily. We think COMPS
  should allow students to pick a team, and have the software display the current team
  rosters.
• We observed students using the full-duplex chat communication constantly. They often do
  not take turns, and they react to the other students’ developing turns as they are
  typed.

                        Studies in Computer Monitoring

The first COMPS application of intelligence is to figuratively look over the shoulder of
the students as they work, then display a real-time summary for the instructor. We have
initially approached this task by writing shallow text classifiers. The work in this
section is described in an unpublished report [Dion et al. 2011].

Background
We created a preliminary set of categories and classifiers based on two sources of
language data:
• Tape-recorded dialogues of upper-class students working the Poison exercise. Figure 4
  shows an extract of recorded verbal interaction.
• Written reports of the Poison problem that Math 110 students provided in earlier
  semesters. These reports exhibit many of the mathematical realizations that students
  exhibit while solving the Poison problem, but none of the dialogue or
  problem-construction phenomena.
This work was completed before the initial collection of COMPS-supported Poison dialogues,
so it does not include the COMPS data.
   For the COMPS Math 110 project we are concentrating first on identifying epistemic
knowledge and social co-construction phenomena. This is congruent with the results of a
study of the criteria that teachers use for assessing student collaborative efforts [Gweon
et al. 2011]. We categorized the dialogue data according to the following types of
phenomena we deemed useful for real-time assessment along these axes:
• Bits of knowledge: domain-specific realizations that are either needed or
  characteristically occur during the path toward solving the problem.
• Varieties of student activities that were on-task but not part of the cognitive work of
  constructing the solution: e.g. picking sides, clarifying rules, playing the game.
• Student utterances related to constructing a solution: e.g. making observations,
  hypothesizing, wrong statements.
• Off-task statements, filler.
Altogether we annotated the student utterances with 19 categories, shown in Table 1. In
this study, individual dialogue turns or sentences were assigned to one of these
categories.

Experiment in machine classification
For our classifiers we chose two numerical methods: non-negative matrix factorization
(NMF) and singular value decomposition (SVD). SVD is the most common numerical technique
used in latent semantic analysis (LSA). Both of these methods rely on factoring a
word-document co-occurrence matrix to build a semantic space: a set of
dimensionality-reduced vectors. The training set for these experiments (the text used for
building semantic spaces) was 435 sentences from the written corpus. The test sets
                                                                    Some categories occurred very infrequently in both the train-
 Table 1: Dialogue Categories from Poison Conversations             ing and test corpora, resulting in very low success rates.
          Dialogue Category                                         Thus we also report the percent correct among the most
     1    4 tiles is important                                      common three categories in the test corpus: numbers 6, 11,
     2    2 and 3 are good tiles                                    and 15 in Table 1. Together these represented n = 59, more
     3    You want to leave your opponent with 19 tiles             than half the test sentences.
     4    Going first gives you control of the game                    A χ2 test on tagging sentences in the top three categories
     5    You want to take 1 tile on your first move                shows that the computer tagging success rates are indeed
     6    1, 4, 7, 10, 13, 16, 19 are the poison numbers            not due to random chance. All values are significant at the
     7    “Opposite” strategy                                       p < 0.05 level and some at the p < 0.01 level. We found
     8    “3 pattern”                                               no consistent advantage to using unigrams, bigrams, or both
     9    Wrong statements                                          together. In this our result is similar to [Rosé et al. 2008],
    10    Exploring                                                 where differences among these conditions are slight. That
                                                                    study of classifiers for collaborative learning dialogues eval-
    11    Playing the game
                                                                    uated its results using κ interrater reliability between human
    13    Making an observation
                                                                    and computer annotaters. We have not computed κ, as the
    14    Clarifying observations
                                                                    number of categories is large and the number of test sen-
    15    Clarifying rules                                          tences is small, rendering the statistic not very meaningful
    16    Exploring further versions of the game                    [Di Eugenio and Glass 2004].
    17    Hypothesizing                                                In the NMF-u method many dimensions did not corre-
    18    There is a winning strategy                               late with any tag. It was thus not capable of categorizing a
    19    Filler                                                    test sentence into all the possible categories, leaving most of
                                                                    the categories unrecognized. Table 3 summarizes the most
                                                                    prominent categories that the NMF-u method found. For
were taken from approximately 100 sentences from the writ-          some of the most attested categories NMF-u was successful
ten corpus and 500 spoken dialogue turns. All our semantic          at correctly tagging the sentences in those categories, at the
spaces had 20 dimensions. Our feature sets included uni-            cost of a high rate of false positives. It had high recall but
grams (individual words) and bigrams.                               the precision was startlingly low.
   We report here on three computer-tagging methods: SVD,
NMF-s, and NMF-u.                                                               Data Collection for Analysis
   The SVD and NMF-s methods are supervised. They                   One of the benefits of COMPS is the ability to gather data on
match test sentences to manually accumulated bundles of             students, their interactions, and the exercise that they engage
exemplar sentences. This technique is much the same as              in.
the latent semantic analysis algorithm used successfully by            An advantage of recording group problem-solving is that
Auto-Tutor [Graesser et al. 2007].                                  ordinary obligations and discourse pragmatics dictate that
   In the NMF-s method the vector for a test sentence was           the participants signal when they achieve some understand-
built by solving a set of linear equations in 20 unknowns,          ing or some common ground. This means that not only are
which effectively computed what the vector for the test sen-        all the learnable knowledge components visible, but partici-
tence would have been had that sentence been a part of the          pants in the discussion should be making recognizable signs
training set. We believe that this technique for using non-         of whether the components are understood [Koschmann
negative matrix factorization to build text classifiers is novel.   2011]. In short, student thinking is forced out into the open
   The NMF-u method is unsupervised. The reduced dimen-             in ways that an assessment test, a cognitive experiment, or a
sions of the factored matrices are assumed to correspond di-        think-aloud protocol might never get at.
rectly to semantic dimensions within the data. This approach           Our study of Poison collaborative dialogues [Dion et al.
was described by [Segaran 2007] for classifying blog posts.         2011] has already uncovered knowledge components that
Our training documents (sentences) were sorted according            students realize and express before they arrive at a closed-
to a) their manually-assigned category and b) which of the          form solution but are not themselves part of the solution.
20 dimensions in the NMF vector representation of the doc-          Examples are: 2 and 3 tiles force a win, 4 tiles is a simple
ument had the largest value. The dimensions were then man-          completely-analyzable case. There is no good way besides
ually associated with individual tags, if possible.                 observation to find out the ancillary realizations that students
                                                                    characteristically pass through as they explore the problem.
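The NMF-s step described above, placing a new test sentence into an existing factorization by solving a small linear system, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the training matrix has been factored as V ≈ W H with H holding k rows of feature weights, and that the test sentence's vector w is found by unconstrained least squares through the normal equations (H Hᵀ) w = H v, which is k equations in k unknowns (k = 20 in the paper, k = 2 in the toy data). The names `solve_linear`, `fold_in`, `H`, and `v` are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fold_in(h, v):
    """Return the k-vector w minimizing ||w h - v||^2.

    h: k rows of feature weights (assumed factor matrix from training).
    v: feature counts (e.g. unigram counts) for one test sentence.
    """
    k = len(h)
    # Normal equations: (h h^T) w = h v, i.e. k equations in k unknowns.
    gram = [[sum(x * y for x, y in zip(h[i], h[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(x * y for x, y in zip(h[i], v)) for i in range(k)]
    return solve_linear(gram, rhs)


# Toy data: k = 2 dimensions over 4 features (hypothetical values).
H = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]
v = [2.0, 3.0, 2.0, 3.0]
w = fold_in(H, v)
```

The resulting vector can then be compared with the vectors of the exemplar sentences, for example by cosine similarity. Note that an unconstrained solve may return negative entries; whether the original method constrained the solution is not stated in the text.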
Results                                                             And it is necessary to understand these ancillary realizations
                                                                    in order to assess the state of the knowledge-construction
Table 2 summarizes the classification success rates of the          task.
two supervised methods, using unigram, bigram, and com-
bined uni- and bi-gram feature spaces. We report the per-
centage of sentences that were correctly tagged from n =                       Conclusions and Future Work
113 test sentences. Test sentences represented all categories.      COMPS is being developed with several uses in mind, viz:
Overall classification accuracy varied from 45% to 55%.             a platform for student collaborative exploratory learning us-
                                          Table 2: Accuracy of Supervised Classifiers
                                                % Correct                               Top 3 Tags
                                          All Tags Top 3 Tags         χ2      Tag 6       Tag 11 Tag 15
                                          n = 113     n = 59       p value   n = 19      n = 13 n = 27
                     NMF-s Unigrams         47%        61%           .003     58%          31%     78%
                     NMF-s Bigrams          45%        58%           .027     37%          38%     81%
                       NMF-s Both           48%        64%           .024     52%          54%     78%
                      SVD Unigrams          51%        66%          .0002     52%          86%     85%
                      SVD Bigrams           55%        68%           .028     63%          15%     96%
                        SVD Both            53%        59%           .003     42%           0%     100%


                                      Table 3: Unsupervised NMF Classifier Results
                                                                                  Correctly        False
                      Feature set             Class                         N     classified      positives
                      Unigrams                #7 Opposite Strategy         13     13 (100%)          63
                                              #6 Poison Numbers            13     12 (92%)            2
                      Unigrams no-stopwords   #15 Clarifying Rules         27     16 (59%)            8
                                              #1 Four Tiles Important       9      5 (56%)           15
                      Bigrams                 #7 Opposite Strategy         13     11 (85%)           19
                                              #15 Clarifying Rules         27     23 (85%)           23
                                              #6 Poison Numbers            13     12 (92%)           10
                                              #1 Four Tiles Important       9      5 (56%)           15
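The trade-off between recall and false positives in Table 3 can be made explicit by computing precision for each row. The sketch below assumes that N is the number of test sentences manually tagged with the class, that "Correctly classified" counts true positives, and that the feature-set labels attach to the rows as read from the table layout; the names `TABLE_3` and `precision_recall` are hypothetical.

```python
# (feature set, class, N, correctly classified, false positives), copied from Table 3.
# The grouping of rows under each feature set is an assumption read from the layout.
TABLE_3 = [
    ("Unigrams", "#7 Opposite Strategy", 13, 13, 63),
    ("Unigrams", "#6 Poison Numbers", 13, 12, 2),
    ("Unigrams no-stopwords", "#15 Clarifying Rules", 27, 16, 8),
    ("Unigrams no-stopwords", "#1 Four Tiles Important", 9, 5, 15),
    ("Bigrams", "#7 Opposite Strategy", 13, 11, 19),
    ("Bigrams", "#15 Clarifying Rules", 27, 23, 23),
    ("Bigrams", "#6 Poison Numbers", 13, 12, 10),
    ("Bigrams", "#1 Four Tiles Important", 9, 5, 15),
]


def precision_recall(n, correct, false_pos):
    """Return (precision, recall) for one class.

    recall    = true positives / sentences that really belong to the class
    precision = true positives / sentences the classifier assigned to the class
    """
    recall = correct / n
    precision = correct / (correct + false_pos)
    return precision, recall


for feature_set, label, n, correct, false_pos in TABLE_3:
    p, r = precision_recall(n, correct, false_pos)
    print(f"{feature_set:22s} {label:24s} precision={p:.2f} recall={r:.2f}")
```

Under these assumptions the unigram classifier for #7 Opposite Strategy recovers all 13 sentences but has a precision of only 13/76, which matches the pattern of high recall with very low precision.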


ing problem-specific affordances, computer-aided monitor-         person exercises) in off-task activity and concomitant de-
ing and assessment of these dialogues, and recording di-          crease in shared construction of the answer. Certain updates,
alogues for study. Its first large deployment will be for         such as making the interface more explanatory and reducing
Math 110, a quantitative literacy class at VU.                    the maximum number of tiles, may reduce the evidently en-
   First use with 25 students exercising the Poison               hanced play value provided by the computer mediated envi-
exercise in six teams shows that COMPS is quite usable.           ronment. Also specifically addressing this goal we have two
What seemed like a fairly straightforward translation of the      improvements on offer:
Poison exercise manipulatives to software affordances will,       • Unlike earlier Wooz exercises, the Poison problem
however, benefit from updating and experimentation.                  prompt was not on permanent display in a COMPS win-
   Analyzing dialogues collected before COMPS, we have               dow. The students have it on paper. Possibly putting the
identified a number of categories of dialogue contribution           problem on display will serve to keep the students more
that will be useful in monitoring and assessing the dialogue.        on-task. In short, we may be suffering the consequence of
With regard to epistemic knowledge in the Poison problem             not following our own COMPS interaction model strictly
domain, we have identified realizations that students pass           enough.
through on the way toward building the final solution. These
realizations may not appear in the final solution, but hav-       • In Math 110 team members are assigned roles. For ex-
ing students engage in dialogue and team cognition seems             ample one student is a moderator, one is a reflector, and
to successfully force the cognitive processes into the open.         so on. These are not represented in the COMPS interface.
   We have classifiers based on latent semantic analysis and         Possibly displaying which students are assigned to which
non-negative matrix factorization that can recognize a few           role will foster more focused interactions.
of the most important of these epistemic categories in solv-         We note that in addition to the epistemic tags, teachers
ing the Poison exercise. One of our classifiers relies on a       have been found to evaluate student collaborative activities
somewhat novel method of using NMF. It entails discover-          on a number of axes such as goal-setting, division of labor,
ing where a test sentence would be in the factor matrices         and participation [Gweon et al. 2011] [Gweon et al. 2009].
by solving a system of linear equations. It performed about       Accordingly, we have been annotating our dialogues using
as well as LSA on our data set, but more testing would be         the VMT threaded markup scheme [Strijbos 2009] which
needed. Our classifiers are trained on student written re-        shows when a turn addresses previous turns and annotates
ports; we expect that accuracy will improve once we train         the discourse relationship between them. Future work on
them on student dialogue data.                                    the text classifiers needs to address these discourse relations.
   Regarding the usability of the interface for problem-          The VMT Chat interface [MathForum 2012b] permits users
solving exercises, the primary unexpected behavior that we        to explicitly link their dialogue utterances: a user can indi-
will address in future tests is the increase (compared to in-     cate that a particular dialogue turn responds to a different,
earlier turn, possibly uttered by somebody else. COMPS           Gahgene Gweon, Rohit Kumar, Soojin Jun, and Carolyn P.
does not have this functionality, but it might be useful.          Rosé. Towards automatic assessment for project based
                                                                   learning groups. In Proceedings of the 2009 conference
                  Acknowledgments                                  on Artificial Intelligence in Education, pages 349–356,
                                                                   Amsterdam, 2009. IOS Press.
This work could not have been done without our hardwork-
ing, patient, and resourceful students. Special thanks to        Gahgene Gweon, Soojin Jun, Joonhwan Lee, Susan Finger,
Nicole Rutt, Lisa Dion, and Jeremy Jank from the 2011 VU           and Carolyn Penstein Rosé. A framework for assessment
mathematics REU who worked on analyzing and classify-              of student project groups on-line and off-line. In Sad-
ing Poison dialogues, Scott Ramsey from NC A&T who de-             hana Puntambekar, Gijsbert Erkens, and Cindy E. Hmelo-
signed and implemented much of COMPS in ActionScript,              Silver, editors, Analyzing Interactions in CSCL, volume
and Bryan Lee who helped update the server.                        12, part 3 of Computer-Supported Collaborative Learn-
   This work is supported by the National Science Founda-          ing Series, pages 293–317. Springer, 2011.
tion REESE program under awards 0634049 to Valparaiso            Jung Hee Kim and Michael Glass. Evaluating dialogue
University and 0633953 to North Carolina A&T State Uni-            schemata with the wizard of oz computer-assisted alge-
versity and the NSF REU program under award 0851721 to             bra tutor. In James C. Lester, Rosa Maria Vicari, and
Valparaiso University. The content does not reflect the posi-      Fábio Paraguaçu, editors, Intelligent Tutoring Systems,
tion or policy of the government and no official endorsement       7th International Conference, Maceió, Brazil, volume
should be inferred.                                                3220 of Lecture Notes in Computer Science, pages 358–
                                                                   367. Springer, 2004.
                       References                                Tim Koschmann. Understanding understanding in action.
R. De Groot, R. Drachman, R. Hever, B. Schwarz,                    Journal of Pragmatics, 43, 2011.
  U. Hoppe, A. Harrer, M. De Laat, R. Wegerif, B. M.             MathForum. Math notation in email messages or web
  McLaren, and B. Baurens. Computer supported moder-              forms. Web help page from Math Forum: Virtual Math
  ation of e-discussions: the ARGUNAUT approach. In               Teams project, 2012a. URL http://mathforum.
  Clark Chinn, Gijsbert Erkens, and Sadhana Puntambekar,          org/typesetting/email.html.
  editors, Mice, Minds, and Society: The Computer Sup-           MathForum. VMT software orientation. Web help page
  ported Collaborative Learning (CSCL) Conference 2007,           from Math Forum: Virtual Math Teams project, 2012b.
  pages 165–167. International Society of the Learning Sci-       URL http://vmt.mathforum.org/vmt/help.
  ences, 2007.                                                    html.
Barbara Di Eugenio and Michael Glass. The kappa statistic:       Niraj Patel, Michael Glass, and Jung Hee Kim. Data col-
  A second look. Computational Linguistics, 32:95–101,             lection applications for the NC A&T State University al-
  2004.                                                            gebra tutoring dialogue (Wooz tutor) project. In Anca
Lisa Dion, Jeremy Jank, and Nicole Rutt. Computer moni-            Ralescu, editor, Fourteenth Midwest Artificial Intelligence
  tored problem solving dialogues. Technical report, Math-         and Cognitive Science Conference (MAICS-2003), pages
  ematics and CS Dept., Valparaiso University, July 29             120–125, 2003.
  2011. REU project.                                             Carolyn Rosé, Yi-Chia Wang, Yue Cui, Jaime Arguello,
Barbara Fox. The Human Tutoring Dialogue Project. Erl-             Karsten Stegmann, Armin Weinberger, and Frank Fis-
  baum, Hillsdale, NJ, 1993.                                       cher. Analyzing collaborative learning processes automat-
                                                                   ically: Exploiting the advances of computational linguis-
Rick Gillman. A case study of assessment practices in quan-        tics in CSCL. Int. J. of Computer-Supported Collabora-
  titative literacy. In Current Practices in Quantitative Lit-     tive Learning, 3(3), 2008.
  eracy, MAA Notes 70, pages 165–169. Mathematical As-
  sociation of America, 2006.                                    H. Sacks, E.A. Schegloff, and G. Jefferson. A simplest sys-
                                                                   tematics for the organization of turn-taking for conversa-
Michael Glass, Jung Hee Kim, Karen Allen Keene, and                tion. Language, pages 696–735, 1974.
 Kathy Cousins-Cooper. Towards Wooz-2: Supporting tu-            Toby Segaran. Programming Collective Intelligence.
 torial dialogue for conceptual understanding of differen-          O’Reilly, 2007.
 tial equations. In Eighteenth Midwest AI and Cognitive
 Science Conference (MAICS-2007), Chicago, pages 105–            Gerry Stahl. Studying Virtual Math Teams. Springer, 2009.
 110, 2007.                                                      Jan-Willem Strijbos. A multidimensional coding scheme
Art Graesser, Phanni Penumatsa, Matthew Ventura,                   for VMT. In Gerry Stahl, editor, Studying Virtual Math
  Zhiqiang Cai, and Xiangen Hu. Using LSA in AutoTutor:            Teams, chapter 22. Springer, 2009.
  Learning through mixed-initiative dialogue in natural lan-
  guage. In Thomas K. Landauer, Danielle S. McNamara,
  Simon Dennis, and Walter Kintsch, editors, Handbook of
  Latent Semantic Analysis, pages 243–262. Lawrence Erl-
  baum, 2007.
                                             Figure 1: COMPS with Poison problem.


The people in each group are to form two teams. One team will play against the other team in the group. To begin, place 20
tiles between the two teams. Here are the rules:
1. Decide which team will play first.
2. When it is your team’s turn, your team is to remove 1 or 2 tiles from the pile.
3. The teams alternate taking turns.
4. The team that is forced to take the last tile – the poison tile – loses the game.
Play this game a number of times, alternating which team plays first. As you play these games, keep track of your
moves/choices. Eventually, you want to be able to determine how your team should play to force the other team to lose.
In order to make this determination, you will need to look for a pattern. In order to find a pattern, you will need data, and so
you will need to decide how to collect and organize these data to see if a pattern will appear.
                                                Figure 2: Poison Assignment
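The rules in Figure 2 are simple enough to analyze exhaustively, which shows the kind of pattern the students are asked to discover. The following is a minimal sketch that is not part of COMPS; it assumes optimal play by both teams, and the function names `losing_positions` and `best_move` are hypothetical.

```python
def losing_positions(max_tiles, max_take=2):
    """Return pile sizes from which the team about to move loses under best play."""
    # wins[n] is True when the team to move with n tiles left can force a win.
    wins = [False] * (max_tiles + 1)
    # With 0 tiles left, the other team has just taken the poison tile.
    wins[0] = True
    for n in range(1, max_tiles + 1):
        # A move wins if it leaves the opponent in a position they cannot win.
        wins[n] = any(not wins[n - take]
                      for take in range(1, max_take + 1) if take <= n)
    return [n for n in range(1, max_tiles + 1) if not wins[n]]


def best_move(tiles, max_take=2):
    """Return how many tiles to take to leave the opponent losing, or None."""
    losing = set(losing_positions(tiles, max_take))
    for take in range(1, max_take + 1):
        if take <= tiles and (tiles - take) in losing:
            return take
    return None


print(losing_positions(20))
print(best_move(20))
```

The computed losing positions include 1 and 4 tiles and exclude 2 and 3, which agrees with the intermediate realizations reported for the student dialogues (2 and 3 tiles force a win; 4 tiles is a completely analyzable case).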