Mining an Online Judge System to Support Introductory Computer Programming Teaching

Rodrigo Elias Francisco
Instituto Federal Goiano – Campus Morrinhos
Morrinhos – GO – Brazil
+55 (64) 34137900
rodrigo.francisco@ifgoiano.edu.br

Ana Paula Ambrosio
Instituto de Informatica, Universidade Federal de Goiás
Goiânia – GO – Brazil
+55 (62) 35211181
apaula@inf.ufg.br

ABSTRACT
Computer programming is an activity that requires a set of cognitive processes which develop naturally through practice, that is, through writing algorithmic solutions. Students learn a great deal from their mistakes, but for that they need feedback on their work. Marking students' work is very time consuming, which often limits a teacher's capacity to offer close individual guidance. The PROBOCA project aims to build a tool, based on the BOCA online judge, suited to learning computer programming through practice. In addition to a problem database organized by theme and difficulty, the system provides functionalities that support the teacher in the classroom. One of the main endeavors is to develop a procedure for estimating the degree of difficulty of a given problem. This "nominal" parameter may then be compared to the difficulty level perceived by each student; the result is a valuable indicator of which students are experiencing difficulties. This paper presents the preliminary specification of PROBOCA's architecture and functional requirements, along with its current state of development.

Keywords
Online Judge; Computer Programming Education.

1. INTRODUCTION
The learning process of Computer Programming (CP) usually involves practicing the resolution of many problems. Students write code to implement algorithms that meet the exercise requirements, and teachers should review that code and give feedback to the students. As this is a time-consuming task, some teachers have been installing and using online judge systems as automatic reviewing and scoring tools.

Online judge refers to software tools originally designed for programming competitions. They usually run on a server that contestants access online. Basically, their role is to present the contestant teams with a list of problems, to which the teams respond by uploading program code that satisfies the criteria of each problem. The tool then evaluates the answer code by running it on a set of predefined inputs and comparing the program results to predefined outputs. If the output of the answer code exactly matches the expected output for the corresponding input, the answer code is considered correct; otherwise it is considered incorrect. No indication of where the program went wrong is given. Although helpful as teaching assistants, these tools were not designed for use in a classroom and therefore lack some features that are important for academic management.
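To make the judging rule just described concrete, the sketch below (in PHP, the language in which BOCA itself is written) shows the essence of exact output matching. It is an illustration only, not BOCA's actual grading code; the file names and verdict strings are assumptions.

<?php
// Minimal sketch of the exact-match judging rule described above.
// Illustration only: not BOCA's real grader; file names and verdict
// strings are assumptions.
function judgeSubmission(string $producedOutputFile, string $expectedOutputFile): string
{
    $produced = file_get_contents($producedOutputFile);
    $expected = file_get_contents($expectedOutputFile);

    // Accept only when the program output matches the registered expected
    // output exactly; any formatting difference (an extra space, a missing
    // newline) makes the answer "incorrect".
    return ($produced === $expected) ? 'correct' : 'incorrect';
}

// Example: echo judgeSubmission('student_output.txt', 'expected_output.txt');

Because the comparison is byte for byte, even a correct program whose output differs by a single space or newline is judged incorrect, a point that becomes relevant in Section 3.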
The BOCA software [1] is an online judge system used in programming marathons in Brazil. It is freely available for institutions to download and install. This particular system allows teachers to register problems and to track their students' work. However, the system is neither easy to handle nor does it include an exercise database (DB), which is needed to facilitate the generation of problem lists. This project proposes to extend the BOCA online judge to make it more suitable for use in introductory programming teaching. The resulting system, called PROBOCA, complements the online judge functionality with features that improve its ease of use, enhancing teacher productivity and course effectiveness.

One of the introduced features provides automatic classification of problem difficulty, based on the submitted solutions and on the students' evaluation of the degree of difficulty they encountered in solving a given problem. This helps teachers when composing exercise lists, allowing them to better gauge list complexity. It also lets students organize the order in which they solve the problems, tackling the easier ones before turning to the more complex ones.

Another feature is the production of student reports based on the submitted solutions and on the students' behavior while using the tool. Student evaluation of problem difficulty, number of submissions for the same exercise, time between submissions and order of submissions, among other information gathered by the tool, reveal how students behave and how easily they solve the proposed problems.

2. EXERCISES IN PROGRAMMING EDUCATION
Aiming at automatic correction of program code and at monitoring and evaluating students during the programming process, several initiatives have been developed.

Within this line of research, the study of Chaves et al. [2] explores resources of online judges with the aim of integrating them into the Moodle system. The goal is to provide a Virtual Learning Environment (VLE) with an automatic evaluation feature, allowing the teacher to monitor the students' problem solving. The authors defined an architecture containing a module for integration with online judges (MOJO), which connected the VLE structure to the URI Online Judge [3] and SPOJ Brazil [4].

In [5], the authors developed a prototype intended to be an educational online judge. They argue that online judge tools are suited to competition and have few educational features. They also criticize the way online judges provide feedback to students, indicating only whether the answer is right or wrong or whether there were errors at compilation or runtime. Their project, called JOnline, aims to add the following didactic features: presenting tips in Portuguese for fixing compilation errors found in the submitted source code; presenting the test cases that generate erroneous results; organizing the problems by topic; deducing a difficulty level from a poll conducted by the system; and providing a collaborative programming resource that allows two students to co-write one common program.

Automatic generation of exercise lists based on user-defined criteria requires the classification of problems according to those criteria. In [6], the authors conclude that there is a strong relationship between the difficulty of a problem and the total number of lines of code and the amount of flow control in the program (IF, WHILE, FOR, ...). New students have difficulty reading and interpreting problem statements, a problem that has been related to the difficulty of dealing with abstraction [9].
3. BOCA
Designed for use in programming marathons, the BOCA online judge system has its vocabulary and requirements contextualized by competitions and teams. The main interest in this system is that it was designed to enable its users, the competitors, to interact with a set of problems. Figure 1 shows a use case diagram with the different functions provided by BOCA. The software allows the registration of problems for a given competition. That is done by submitting a PDF file stating the problem to be tackled and the files containing the input and output data sets for the corresponding test cases. The registered problems are associated with a single competition; to reuse the problems in another competition, the files must be inserted again. BOCA does not include a database to store the problems.

Figure 1. BOCA's use case diagram

In the system, colored balloons represent the points gained by the competitors. This feature is part of the programming marathon context, where correctly answering a problem yields a balloon for the team. Besides allowing teams to monitor their submissions and the related results, BOCA also provides a screen displaying the team ranking, including each team's overall score and the list of solved exercises, using balloons to identify those successfully done. It also produces information on the students' interaction with the problems, including the time spent and the number of resolution attempts.

BOCA has been used in introductory CP classes at our institute for some years, with good results. Using this system enhances the practice of problem solving by providing automatic feedback to students. We have observed that the number of problems solved by students using BOCA is significantly higher than with traditional exercise lists. The burden on the teacher is significantly reduced and the students get feedback. Teachers are then able to focus on helping the students who fail to solve the problems even after several attempts. This strategy allows students to tackle a larger number of exercises, thus increasing their potential to master the cognitive skills required for programming.

However, in spite of these advantages, some drawbacks were revealed. Looking at the process of reviewing the students' answers, two issues were identified. First, the system assesses a submitted program by comparing its output, as generated by the student's code in response to a given input data set, to the registered expected output. For the two outputs to be identical, exact formatting of the program's output is required. At the beginning, students made many errors just for using a slightly different format. This situation marks the problem's result as wrong even though the program's logic may be right, causing frustration among students, as the system does not point out where the error is. Students, however, managed to adapt and began paying more attention to output formatting. Second, there is no cross-answer plagiarism identification, in other words, no control over cheating. It was observed that some students simply copied their colleagues' programs and submitted them as their own, and were thus considered to have solved the problem without any real comprehension of the solution.

Figure 2 shows the BOCA code submission screen. It was modified to include the "level of difficulty" selection, where students evaluate how hard it was to solve the problem they are submitting.

Figure 2. BOCA's submission functionality

Regarding class management functionality, it was noted that each installed instance of BOCA supports one active competition at a time, meaning a single class per instance. Thus, if a teacher wants to have several competitions running at the same time, i.e., different groups doing the same or different sets of exercises, he must install multiple instances of BOCA. Each instance is independent, so even if two competitions are composed of the same problems (e.g., for different classes), these problems must be registered separately in each instance. As each competition is independent, the system does not group the results of the different competitions, a feature that would be interesting in the case of multiple exercise lists delivered to the same class. It should also be noted that the system is not trivial to install and manage, which ends up discouraging teachers from adopting it. On top of that, the teacher needs to add each student to the system, which proves to be quite tedious, especially for large classes. To overcome this drawback, teachers have implemented programs that automatically generate registration ids and corresponding passwords based on the students' university registration numbers. As BOCA does not allow password modification, students often know other students' passwords, since they are generated following known patterns. This facilitates copying other students' solutions.

This shows that, although BOCA has several advantages, there are problems that hinder the use of the system in the classroom. To solve this, beyond what has already been presented, we aim to tackle the following requirements: automatic generation of problem lists based on subject and level of difficulty, measurement of the students' experience with problem solving by subject, and ranking of problems by level of difficulty. We consider the last item important for didactic purposes and, as we did not find an existing technique or algorithm to use, it is the focus of this work.
4. PROBOCA
From an architectural viewpoint, the proposed system builds upon BOCA, which is treated as an internal component whose structure, PHP source code and PostgreSQL DB are reused. To avoid complicating the development process further, the "competitions and teams" context is kept, with some adaptations: the terms "competition" and "team" are used loosely to indicate an exercise list and a student, respectively. BOCA's user interface allows for this extrapolation, which is already practiced by the teachers who use the system in their classrooms.

Adapting BOCA to support CP teaching brought the need to introduce new functional requirements, as presented in the use case diagram in Figure 3.

Figure 3. Use case diagram with the new requirements

PROBOCA required some changes to the BOCA DB structure. It was necessary to modify the internal tables in order to link the stored problems to a given course syllabus and to difficulty levels, as well as to include more detailed data about the students. Furthermore, the original competition-based structure will be altered to accommodate the concept of multiple exercise lists.

Following requirement R1, student registration is now done by the student himself, using a token handed out by the teacher. Besides saving a password, other information is collected during registration that permits class-wide statistical analysis.

R2 is intended to simplify the task of competition generation, which is performed by the teacher. To achieve this goal, a database of problems was implemented. Currently, to generate a competition, as shown in Figure 4, the user indicates three parameters: a difficulty level (1 - easy; 2 - medium; 3 - difficult); the desired syllabus elements for the given problem set; and, finally, the number of problems to compose the list. The system then analyzes which problems best fit the user-defined parameters and generates a competition list from among the problems available in its DB. The difficulty level of each problem is estimated automatically by the system, based on data obtained from the solutions submitted by students and other parameters, as described in Section 6.

Figure 4. Generating problem lists (competitions)
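As an illustration of the R2 generation step, the sketch below selects a list of problems from the DB by difficulty, topic and quantity. The table and column names (problem, difficulty, topic) are hypothetical, and the real system analyzes which problems best fit the parameters; here a simple random choice among the matching problems stands in for that analysis.

<?php
// Sketch of R2: pick $quantity problems matching a difficulty level (1-3)
// and a set of syllabus topics. Table and column names are hypothetical;
// the actual PROBOCA schema, built on BOCA's PostgreSQL DB, may differ.
function generateProblemList(PDO $db, int $difficulty, array $topics, int $quantity): array
{
    $placeholders = implode(',', array_fill(0, count($topics), '?'));

    $sql = "SELECT id, name
              FROM problem
             WHERE difficulty = ?
               AND topic IN ($placeholders)
             ORDER BY random()
             LIMIT " . (int) $quantity;     // PostgreSQL syntax

    $stmt = $db->prepare($sql);
    $stmt->execute(array_merge([$difficulty], $topics));

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Example: six medium problems covering selection and repetition.
// $list = generateProblemList($pdo, 2, ['selection', 'repetition'], 6);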
R3's purpose is to provide teachers with a report of each student's experience level based on his interaction with the system. Using the collected data, it should be possible to present information about a student's experience regarding the syllabus elements addressed in each problem, thus allowing the teacher to detect where students show more difficulty. This requirement depends on R4, which measures the difficulty level of a given problem; a simple mechanism to measure the degree of difficulty of a given problem was developed and tested. For R3 to be reliable, a plagiarism identification strategy (R9) must also be implemented in the system.

R5 aims at associating a student with multiple competitions without the need to register each time. Thus, the data from different competitions can be associated with the student, allowing the integration of the results obtained for the different lists presented to a given class. This requirement depends on R6, which adapts BOCA to judge problems from several competitions. R7 is also related to R5, with the goal of allowing a student to participate in several competitions, analogous to answering several problem lists.

R8 aims to display the problems' balloons color coded to indicate the problem's difficulty level. Currently, balloon colors are registered by the system administrator along with the problems and have no relation to difficulty or content, unless the classification and corresponding color attribution are done by the teacher when uploading the problems.

5. IMPLEMENTATION
The functionalities of detailing student information and of automatically generating exercise lists are already implemented and tested. Requirement R4, estimating the difficulty level of a given problem, is currently in progress.

Within this requirement, the first task was to create 74 exercises using BOCA's problem registration functionality. These exercises cover the syllabus of the "Introduction to Programming" course (CS1) and range from beginner programming instructions ("Hello World" style) up to coding with matrices. The inserted problems populated an initial database to be used for testing; additional problems are expected to be easily included. An estimated level of difficulty for each problem was supplied by the teacher who inserted the problem into the database.

To submit an answer, the student uploads one code file per problem solved. Along with the file upload, the student is asked to evaluate the problem's difficulty by indicating one of three choices, Easy, Medium or Difficult, based on his experience when solving the problem (Figure 2).

Success was obtained in computing several parameters that are needed to calculate problem difficulty. They include different measures, such as the number of repetition and selection structures used in the solution and the number of topics involved in the problem. Problem topics were defined using the introductory programming course syllabus: input/output; assignment; selection; repetition; vectors; strings; matrices; and functions. In this list, topics appear in the order they are taught and therefore in increasing level of complexity for novice students; since programming is incremental, more complex topics usually build upon less complex ones. Several tests have been conducted to define and validate the mathematical function that will calculate the difficulty of a problem, but this is still an open issue.

Based on the calculated difficulty and the topics involved in the problem, colored balloons are associated with the problem. Different colors represent different topics and, within each topic, the level of difficulty is represented by the intensity of the color. For example, blue represents problems whose most complex topic is selection: light blue balloons are associated with easy problems involving selection, medium blue balloons with medium difficulty problems, and dark blue balloons with problems that demand more from the students.
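A minimal sketch of this color coding is shown below; the concrete topic-to-hue mapping is an assumption made for illustration, not the palette actually used in PROBOCA.

<?php
// Illustrative sketch of the balloon color coding: the hue identifies the
// most complex topic of the problem and the intensity encodes difficulty.
// The hues below are assumptions, not PROBOCA's real palette.
function balloonColor(string $mostComplexTopic, int $difficulty): string
{
    $hues = [
        'selection'  => 'blue',
        'repetition' => 'green',
        'vectors'    => 'red',
    ];
    $shades = [1 => 'light', 2 => 'medium', 3 => 'dark'];

    $hue   = $hues[$mostComplexTopic] ?? 'gray';
    $shade = $shades[$difficulty] ?? 'medium';

    return "$shade $hue";   // e.g. "light blue" for an easy selection problem
}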
The task of providing teachers with a report on student achievement (R3) has not been tackled yet. Ideas in this direction include: (1) comparing a student's perceived problem difficulty with the mean problem difficulty: if a student perceives problems as harder or easier than the mean, this could indicate a lower or higher level of programming competency; (2) comparing successive submissions of the same problem, which may show whether students adopt a trial-and-error approach; (3) the mean time taken to solve problems; (4) the mean number of submissions per problem; (5) the score when compared to the other students; among others.
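As a sketch of how indicators (2) to (4) could be derived, the fragment below aggregates one student's submission records. The record fields (problem, time) are assumptions made for illustration, since this report has not yet been implemented; in PROBOCA the data would come from the submission tables.

<?php
// Sketch of indicators (2)-(4): aggregate one student's submission records.
// The record fields ('problem', 'time') are assumptions for illustration.
function studentIndicators(array $submissions): array
{
    $byProblem = [];
    foreach ($submissions as $s) {
        $byProblem[$s['problem']][] = $s['time'];   // unix timestamps
    }

    $counts = [];   // submissions per problem
    $gaps   = [];   // seconds between successive submissions of the same problem
    foreach ($byProblem as $times) {
        sort($times);
        $counts[] = count($times);
        for ($i = 1; $i < count($times); $i++) {
            $gaps[] = $times[$i] - $times[$i - 1];
        }
    }

    return [
        'mean_submissions_per_problem' => $counts ? array_sum($counts) / count($counts) : 0,
        'mean_seconds_between_tries'   => $gaps ? array_sum($gaps) / count($gaps) : null,
    ];
}

Under these assumptions, a high mean number of submissions combined with short gaps between them would point to the trial-and-error behavior mentioned in item (2).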
6. ESTIMATING PROBLEM DIFFICULTY
In a sense, "difficulty" expresses the lack of ability, or the amount of skill or effort needed to accomplish something. Obviously, the perception of difficulty is an individual matter, related to many aspects, including the moment at which the question is posed. Nonetheless, this work investigates the possibility of characterizing a programming problem in such a way that calculated parameters correlate with the difficulty level expressed by students.

In their work [6], Alvarez and Scott present program control flow and number of lines of code as variables that correlate with the difficulty of a problem. The experiments undertaken in the current study corroborate this information.

Also, other works [7, 8] show the need to deal with abstraction in order to solve problems. A highly abstract problem is one that implies a greater level of generalization. Thus, a problem that involves many different topics can become more abstract and hence more difficult. It is fairly safe to assume that, in order to measure the difficulty of a problem, it is necessary to make a detailed analysis of the topics addressed within it. That proposition helps in formulating the approach described below.

6.1 Problem Difficulty Mechanism
On one hand, the survey of the students' opinions about the difficulty of the submitted problems provided a first estimate. For each of the 74 registered exercises, this information was stored in the DB along with the respective answers. Although information was collected for all students, the statistics were calculated considering only those who completed at least 95% of the exercises (70 out of 74). This excluded students who dropped out of the course, in order to prevent their partial answers from skewing the results. After filtering, the mean difficulty "Mean_Dif" was calculated for each exercise based on the answers of the remaining 62 students. "Median_Dif" and "Mode_Dif" were also calculated.

In addition, C source code was written to correctly solve each of the registered problems. A PHP program was also developed to analyze the programs' internal structures and to extract some measures from each program. The measures include counting:

• Lines of code, represented by the variable N_LC;
• Repetition structures used in the solution, N_RP;
• Selection structures, N_SL;
• Edges in a graph that represents the algorithm, N_EDG;
• Edges in that graph that represent repetition structures in the algorithm, N_EDG_RP;
• Height of a tree in which each internally nested sub-block of code is a node, MX_HT; and
• Number of topics involved in the problem, N_TPC.

This last count was obtained manually.
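The fragment below sketches, in the spirit of that PHP analyzer, how the simpler keyword-based counts can be extracted from a C solution file. It is a simplification: the structural measures (N_EDG, N_EDG_RP and MX_HT), which require building the algorithm's graph and nesting tree, are omitted.

<?php
// Simplified sketch of the measure extraction for a C solution file.
// Only keyword-based counts are shown; N_EDG, N_EDG_RP and MX_HT need
// the control-flow graph and nesting tree and are omitted here.
function extractMeasures(string $cSourceFile): array
{
    $code = file_get_contents($cSourceFile);

    // Non-empty lines of code.
    $lines = array_filter(array_map('trim', explode("\n", $code)),
                          fn ($l) => $l !== '');

    preg_match_all('/\b(for|while|do)\b/', $code, $repetitions);
    preg_match_all('/\b(if|switch)\b/',    $code, $selections);

    return [
        'N_LC' => count($lines),              // lines of code
        'N_RP' => count($repetitions[0]),     // repetition structures
        'N_SL' => count($selections[0]),      // selection structures
    ];
}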
Correlating these measures with the student evaluations (Median_Dif, Mean_Dif and Mode_Dif) gives the results in Table 1.

Table 1. Correlation between measured variables and student evaluation
(Pearson correlation; Sig. 2-tailed in parentheses; N = 74)

                        Median_Dif    Mean_Dif    Mode_Dif
N_RP (Repetitions)      .49 (.000)    .52 (.000)  .44 (.000)
N_SL (Selections)       .16 (.171)    .34 (.003)  .19 (.108)
N_LC (Lines of Code)    .44 (.000)    .62 (.000)  .45 (.000)
N_TPC (Topics)          .57 (.000)    .69 (.000)  .59 (.000)
N_EDG (Edges)           .42 (.000)    .58 (.000)  .41 (.000)
MX_HT (Tree Height)     .52 (.000)    .67 (.000)  .56 (.000)
N_EDG_RP (Rep. Edges)   .53 (.000)    .60 (.000)  .51 (.000)

To verify whether a combination of variables obtained better results, the following formulae were also tested against the Mean_Dif variable to verify their correlation. Table 2 shows the results. Mean_Dif correlated positively with all measured variables, and the best correlation was associated with f4 (r = .76, p = .000), which improves on the individual correlations with N_TPC (number of topics) and MX_HT (maximum tree height).

f1 = (N_RP + N_SL + N_LC + N_TPC + N_EDG + MX_HT + N_EDG_RP) / 7
f2 = 0.3 * N_EDG + 1.4 * MX_HT + 1.4 * N_EDG_RP + 0.2 * N_LC
f3 = (N_TPC + MX_HT + N_LC) / 3 + (N_EDG_RP + N_EDG + N_RP + N_SL) / 8
f4 = (N_TPC + MX_HT) / 2
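The sketch below shows how one of these formulae (f4) and the Pearson correlation used throughout this section can be computed from the per-problem measures. The input arrays are assumed to be already loaded from the DB; this is not the exact analysis script used in the study.

<?php
// Sketch: compute f4 for every problem and correlate it with the students'
// mean difficulty rating (Mean_Dif). Input arrays are assumed inputs.
function f4(array $m): float
{
    return ($m['N_TPC'] + $m['MX_HT']) / 2;
}

function pearson(array $x, array $y): float
{
    $n  = count($x);
    $mx = array_sum($x) / $n;
    $my = array_sum($y) / $n;

    $cov = $sx = $sy = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $cov += ($x[$i] - $mx) * ($y[$i] - $my);
        $sx  += ($x[$i] - $mx) ** 2;
        $sy  += ($y[$i] - $my) ** 2;
    }
    return $cov / sqrt($sx * $sy);
}

// $measures: one entry per problem (N_TPC, MX_HT, ...); $meanDif: one rating per problem.
// $r = pearson(array_map('f4', $measures), $meanDif);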
Table 2. Correlation of student perceived difficulty and the developed formulae
(Pearson correlation; Sig. 2-tailed in parentheses; N = 74)

     Mean_Dif
f1   .59 (.000)
f2   .55 (.000)
f3   .67 (.000)
f4   .76 (.000)

We also verified the correlations with the teachers' evaluation of the problems' difficulty. For this, we asked five teachers to read each problem and attribute a level of difficulty in the range 1-5. Since we had only five evaluations, we chose to work with the median difficulty obtained (Median_Prof_Dif). Correlating this variable with the measured variables, we obtained the results presented in Table 3. The best correlations were obtained with N_RP and N_EDG_RP, r = .62, p = .000, both related to the number of repetition structures found in the solution codes.

Table 3. Correlation between measured variables and teacher evaluation
(Pearson correlation; Sig. 2-tailed in parentheses; N = 74)

                        Median_Prof_Dif
N_RP (Repetitions)      .62 (.000)
N_SL (Selections)       .20 (.093)
N_LC (Lines of Code)    .56 (.000)
N_TPC (Topics)          .50 (.000)
N_EDG (Edges)           .53 (.000)
MX_HT (Tree Height)     .50 (.000)
N_EDG_RP (Rep. Edges)   .62 (.000)

Table 4 correlates the teacher-defined difficulty with the formulae defined above. A positive correlation was found with all formulae, the best correlation being associated with f2, r = .63, p = .000. Furthermore, a positive correlation was found between the teacher-defined difficulty and the mean student perceived difficulty, r = .69, p = .000.

Table 4. Correlation of teacher defined difficulty and the developed formulae
(Pearson correlation; Sig. 2-tailed in parentheses; N = 74)

     Median_Prof_Dif
f1   .59 (.000)
f2   .63 (.000)
f3   .66 (.000)
f4   .68 (.000)

Although student perceived difficulty and teacher defined difficulty are correlated, there are differences, which can be verified in the correlations found with the measured variables and the proposed formulae. This could be explained by the fact that students evaluate difficulty based on the knowledge they have when developing the solution. As they do not know what comes ahead, they cannot base their evaluation on an overall knowledge of programming. Teachers, on the other hand, have this overall view of the domain and evaluate difficulty accordingly. It must also be observed that the teachers who did the evaluation are new to teaching and have not yet acquired a more critical understanding of the difficulties students encounter when learning to program: after developing algorithmic reasoning, people tend to forget how they thought before and find many concepts obvious when, for beginners, they are not.

7. CONCLUSION
PROBOCA is a system under development whose goal is to adapt the BOCA software for use in teaching programming. While BOCA itself was developed for programming marathons, it is already in use as a support tool in introductory programming courses. As a learning aid, BOCA has several limitations, especially regarding class administration and student monitoring. In addition, as a system specifically developed for competitions, it lacks mechanisms that facilitate the creation of exercise lists, such as a problem database organized by topic and difficulty.

PROBOCA supports the persistence of the problems registered by teachers in the database and provides greater access to student-related information. Unlike other online judge systems that are available exclusively online and are managed by their creators, PROBOCA can be downloaded and installed by the teacher, giving the teacher control over the problems stored in the database. Since teachers are responsible for introducing the problems, this solution has the additional advantage of being language free, i.e., it is not limited to teachers and students who speak the language in which the problem specifications were written, as is the case for online systems administered by third parties.

In addition to implementing an environment that facilitates the use of BOCA in teaching programming, PROBOCA also aims to provide teachers and students with information that will help students in their learning process. One of the important features of PROBOCA is the automatic evaluation of problem difficulty. This gives students direction when choosing which exercises to solve first, allowing them to solve easier exercises before more complex ones and diminishing student frustration at not solving problems. It also allows teachers to better gauge the level of difficulty of exercise lists. As shown by the collected data, teacher and student evaluations of problem difficulty do not match, and this may lead to distortions when teaching programming. Future work could include developing a system that automatically suggests problems to students based on their performance and the calculated problem difficulty.

This work has shown the approach being taken to calculate the difficulty of the problems. This approach differs from others by treating the algorithms submitted by students as graphs and trees in order to identify properties that can be correlated with the difficulty of the problems. For this, data mining using correlations was undertaken. Part of the system has already been implemented and has been successfully used in the classroom. The success of these tasks shows the feasibility of the project and encourages further work.

REFERENCES
[1] BOCA Online Contest Administrator. Available online; accessed March 20th, 2015.
[2] Chaves, J. O. et al. (2013). Uma Ferramenta de Auxílio ao Processo de Ensino-Aprendizagem para Disciplinas de Programação de Computadores. TISE 2013.
[3] URI Online Judge. Available online; accessed May 20th, 2014.
[4] SPOJ Brasil. Available online; accessed May 20th, 2014.
[5] Santos, J. C. S. and Ribeiro, A. R. L. JOnline: proposta preliminar de um juiz online didático para o ensino de programação. SBIE 2011.
[6] Alvarez, A. and Scott, T. A. (2010). Using student surveys in determining the difficulty of programming assignments. Journal of Computing Sciences in Colleges, 26(2):157–163.
[7] Gomes, A. et al. (2008). Aprendizagem de programação de computadores: dificuldades e ferramentas de suporte. Revista Portuguesa de Pedagogia, 42(2).
[8] Mendonça, A. et al. (2009). Difficulties in solving ill-defined problems: A case study with introductory computer programming students. In Frontiers in Education Conference (FIE '09), 39th IEEE, pages 1–6. IEEE.
[9] Mendonça, A., de Oliveira, C., Guerrero, D., and Costa, E. (2009). Difficulties in solving ill-defined problems: A case study with introductory computer programming students. In Frontiers in Education Conference (FIE '09), 39th IEEE, pages 1–6. IEEE.