

 The development of a web application for assessment by
     tests generated using genetic-based algorithms

                   Doru Anastasiu Popescu1, Victor Tița2, Nicolae Bold1
        1University of Pitesti, Department of Mathematics and Computer Science, Romania
    2University of Agronomic Sciences and Veterinary Medicine Bucharest, Faculty of Management,
     Economic Engineering in Agriculture and Rural Development, Slatina Branch, Romania
dopopan@gmail.com, victortita@yahoo.com, bold_nicolae@yahoo.com


         Abstract. Technology-based tools are now commonly used for educational
         purposes. They support the educational process either by easing organizational
         tasks or by being part of the teaching materials. This paper presents
         Kromatine, a generator of assessment tests built on a genetic algorithm, which
         places it in the first category (organizational purposes). The genetic
         algorithm uses basic genetic operations and structures and is delivered in the
         form of a web application. It eases the organizational tasks of the teacher by
         making it possible to generate tests that will later be used in assessment.
         The questions are stored in a database; the user can add questions to the
         database and generate tests for later use. Each question is of multiple-choice
         type and is characterized by a degree of difficulty. A genetic algorithm was
         chosen because the problem amounts to generating an arrangement of questions
         whose degrees of difficulty sum to a given total (comparable to the subset sum
         problem), which places it in the category of NP-complete problems. In
         addition, the problem structure can be easily modeled on the general structure
         of a genetic algorithm.

         Keywords: genetic, tool, web application, education, assessment.


1        Introduction

As technology advances, the means of easing organizational tasks become more numerous,
following several recent research breakthroughs. This research is based on
functionalities and structures found in nature. Education, in turn, is an extremely
important field of human activity, from primary school to adult training [15], because
it is a basis for every other human activity.
   We present in this paper a first version of a tool that generates assessment tests
using a genetic algorithm. The questions are of multiple-choice type and are selected
from a database which is built over time. Section 2 reviews related work. Section 3
introduces the theoretical base, formed of the notions and operations used to build the
tool, and Section 4 presents the actual implementation of the tool, in the form of a web
application, together with the pages used to obtain assessment tests.


2       Research and related work

Genetic algorithm theory is developing rapidly due to advances in technology and
research. Genetic algorithms are known for their wide applicability; their approximate
nature is both a feature and a drawback, and it is currently studied in order to increase
the accuracy of the solutions obtained. Thus, state-of-the-art research on genetic
algorithms aims at solving both classical theoretical problems and unusual, particular
issues.
   In the first direction, the applicability of genetic algorithms to fuzzy problems makes
them candidates for solving matrix problems (involving chessboard-like structures, such
as the queens problem [1]), which belong to the larger field of combinatorics. Genetic
algorithms are also widely used as an alternative solution for NP-complete problems; one
of the closest to the education area is the generation of a timetable or a schedule [2]
under given constraints. In addition, genetic algorithms may be combined with neural
network notions in order to help with pattern and classification problems [11].
   The problem studied in this paper belongs to the second set of problems. Because
generating tests formed of questions that meet a given requirement is not a common issue
in the literature, the existing papers on the topic focus on the efficiency of
genetic-based generation [4]. Other types of generators use random-based methods [5] or
ant-colony algorithms (ACA), for which effectiveness is shown to be slightly greater in
terms of generation time. However, the key parameter in test generation is not
necessarily the generation time, but the precision of the results. The precision is
similar whether an ACA or a genetic algorithm is used [6].
   Another issue is that the studied problem can be classified as NP-complete, due to
its relation to the subset sum problem: it asks for subsets of finite cardinality whose
difficulty degrees sum close to a given parameter, and the subset sum problem is known
to be NP-complete [3]. This is why an evolutionary approach is preferred.
   Issues regarding test generation that are secondary in this paper include the type of
the generated questions, whose number can be extended using existing methods based on
word analysis and NLP algorithms [7], and the automatic determination of the degree of
difficulty of a given question [8]. These issues open new fields of research and can be
integrated in future work. Furthermore, the questions that form tests can be seen as
nodes in a complex network, which would allow the use of graph-based structures [14] and
would introduce the concept of linked questions within the implementation of the
algorithm.
   Finally, the problem described in the paper is a new integration of technology tools
within the vast domain of education. We should not exclude the social part of education
[10] and the implications of the usage of technology [13], which immensely influence the
educational development of the students. Thus, a future development would be the
inclusion of a social aspect within the tool, either in the selection of the test or
regarding the interaction between users.


3        Theoretical notions and application structure

Before presenting the tool itself, we present the notions that led to its creation. The
tool has been developed based on a genetic algorithm, meaning that the structures used
are the gene and the chromosome. The questions and the generated tests are stored in a
database. The definitions that follow present the particular notions and clarify the
terminology used in this paper. The database has four tables, which cover the basic
needs of the generator:

─ table Questions, which contains fields storing the identification number of the ques-
  tion, the statement, the number of choices, the degree of difficulty of the question,
  the correct choice(s) and the user who proposed it;
─ table Choices, which contains the question identification number, the choice letter
  from 'a' to 'z' and the choice text;
─ table Tests, containing fields storing the identification number of the test, the ques-
  tions, the total degree of difficulty of the test, the user who generated it, the genera-
  tion timestamp and the generation time. The latter field is used entirely for monitor-
  ing and research purposes;
─ table Users, containing fields related to the user who uses the generator, such as user
  identification number or alias. The table is designed to store user data and has an
  organizational purpose.
     A detailed perspective on the database tables is presented in Table 1.

                     Table 1. A detailed perspective of the database structure.




  The structure of the database DBQ containing the tables and the connections be-
tween them is presented in figure 1.


                            Fig. 1. Visual concept of the database
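
   The four tables described above can be sketched as a schema. The block below is only
an illustration, not the schema of the actual application: it uses Python's built-in
sqlite3 module instead of MySQL so that it runs without a server, and all table and
column names, as well as the storage of the test questions as a comma-separated text
field, are assumptions made for the example.

```python
import sqlite3

# Hypothetical names: the paper describes the fields only in prose.
SCHEMA = """
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    alias   TEXT NOT NULL
);
CREATE TABLE questions (
    question_id INTEGER PRIMARY KEY,
    statement   TEXT NOT NULL,
    n_choices   INTEGER NOT NULL,
    difficulty  INTEGER NOT NULL CHECK (difficulty BETWEEN 1 AND 5),
    correct     TEXT NOT NULL,           -- letter(s) of the correct choice(s)
    proposed_by INTEGER REFERENCES users(user_id)
);
CREATE TABLE choices (
    question_id INTEGER NOT NULL REFERENCES questions(question_id),
    letter      TEXT NOT NULL
                CHECK (length(letter) = 1 AND letter BETWEEN 'a' AND 'z'),
    choice_text TEXT NOT NULL,
    PRIMARY KEY (question_id, letter)
);
CREATE TABLE tests (
    test_id          INTEGER PRIMARY KEY,
    question_ids     TEXT NOT NULL,      -- assumption: comma-separated ids
    total_difficulty INTEGER NOT NULL,
    generated_by     INTEGER REFERENCES users(user_id),
    generated_at     TEXT NOT NULL,      -- generation timestamp
    generation_time  REAL                -- seconds, kept for monitoring only
);
"""


def create_database(path=":memory:"):
    """Create the four tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


# Usage: one user, one question of difficulty 1 with three choices.
conn = create_database()
conn.execute("INSERT INTO users (user_id, alias) VALUES (1, 'teacher')")
conn.execute("INSERT INTO questions VALUES (1, 'What is 2 + 2?', 3, 1, 'b', 1)")
conn.executemany(
    "INSERT INTO choices VALUES (?, ?, ?)",
    [(1, "a", "3"), (1, "b", "4"), (1, "c", "5")],
)
```

   The CHECK constraints mirror the ranges stated in the text (difficulty from 1 to 5,
choice letters from 'a' to 'z'), so invalid rows are rejected by the database itself.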

  Specification 1. A question q (id; st; dd; V) is an object formed of the following
components:
─ the identification number of the question id;
─ the statement st;
─ the degree of difficulty dd, dd ϵ {1, 2, 3, 4, 5};
─ choices set V.




     Observations:
   The degree of difficulty dd is subjective for each question and is considered to
be input data given by a human operator. The degree lies on a scale from 1 to 5, where
1 is the least difficult and 5 the most difficult. In order to normalize the difficulty
and to reduce, to a certain extent, the subjectivity of its assessment, a short
explanation is given to the users.

        The set V contains objects describing a choice vi (id; l; cst), i = 1, …, |V|, of
        the question, as follows:
     • question identification number id;
     • choice identification letter l. We use letters of the English alphabet as choice
       identifiers, thus l ϵ {'a', 'b', …, 'z'}. The number of choices is thus
       limited to 26;
     • choice statement cst.

     Observation:

     a) A test T (S, GD) is a set of questions qi, i = 1, …, |S|, where S is the set of
        questions that form the test and GD is the degree of difficulty of the test,
        i.e., the sum of the degrees of difficulty ddi of its questions:

                              $GD = \sum_{i=1}^{|S|} dd_i$                     (1)
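
   The objects of Specification 1 can be illustrated with a short sketch. It is not
taken from the application: the class and field names are hypothetical, and the function
total_difficulty assumes that the degree of difficulty GD of a test is the sum of the
degrees of difficulty of its questions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Choice:
    question_id: int   # id of the question the choice belongs to
    letter: str        # l, one of 'a' .. 'z'
    statement: str     # cst


@dataclass
class Question:
    id: int
    statement: str                                        # st
    difficulty: int                                       # dd, in {1, ..., 5}
    choices: List[Choice] = field(default_factory=list)   # V

    def __post_init__(self):
        # Reject degrees of difficulty outside the scale used in the paper.
        if self.difficulty not in range(1, 6):
            raise ValueError("dd must be in {1, 2, 3, 4, 5}")


def total_difficulty(questions):
    """GD of a test, assumed to be the sum of dd over its questions."""
    return sum(q.difficulty for q in questions)


# Usage: a test of three questions with degrees of difficulty 2, 5 and 3.
sample = [Question(1, "Q1", 2), Question(2, "Q2", 5), Question(3, "Q3", 3)]
print(total_difficulty(sample))  # prints 10
```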

   Specification 2. Given the database question set Q and the selected test question set
S for a given set of input data, a gene gi is an integer value from the set
{1, …, |Q|}, i ϵ {1, …, |S|}.

     Observations:
 a) Basically, a gene stores the order number of a question (g is equivalent to qid).
 b) |S| is an input value used by the algorithm.
 c) The elements of the set S are unknown before the generation, being output data.

   Specification 3. Given the database question set Q, the selected test question set S
at a given state, the population set NC and the desired total degree of difficulty MGD,
a chromosome C is an object formed of:

        - order number id, id ϵ {0, …, |NC|};

        - the gene set Gj = {gi | i ϵ {1, …, |S|} }, where G = S; j = 1, …, |NC|;

        - the fitness function f defined as follows

                                                                    (2)




         Observations:

         a) Gj is equivalent to qid.
         b) We can easily observe that MGD ϵ [|S|, 5×|S|].
         c) The fitness function checks whether the sum of the difficulty degrees of the
          questions within a chromosome is lower than, and as close as possible to, the
          value MGD.
         d) The chromosome contains the order numbers of the questions that form a test.
          If we denote the test questions set by T, then T = S = {Gi | i = 1, …, |S|}.
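
   Observations b) and c) can be made concrete with a small sketch. The form of the
fitness function used below is an assumption made for illustration, not the formula of
the paper: a chromosome whose total difficulty exceeds MGD scores 0, otherwise it scores
the ratio between its total difficulty and MGD, so that a total equal to MGD is treated
as the best case. The function names and the dictionary that maps question numbers to
degrees of difficulty are hypothetical.

```python
def mgd_is_reachable(size, mgd):
    """Observation b): with dd in {1..5}, the total lies in [|S|, 5*|S|]."""
    return size <= mgd <= 5 * size


def fitness(genes, difficulty, mgd):
    """Hypothetical fitness in [0, 1]; higher is better.

    genes      -- question numbers forming the chromosome
    difficulty -- mapping from question number to degree of difficulty
    mgd        -- desired total degree of difficulty
    """
    total = sum(difficulty[g] for g in genes)
    if total > mgd:          # overshooting the target is penalised
        return 0.0
    return total / mgd       # closer to mgd from below gives a higher score


# Usage with five questions of known difficulty and a target of 8.
difficulty = {1: 1, 2: 3, 3: 5, 4: 2, 5: 4}
print(fitness([1, 2, 4], difficulty, 8))   # total 6  -> 0.75
print(fitness([2, 3], difficulty, 8))      # total 8  -> 1.0
print(fitness([2, 3, 5], difficulty, 8))   # total 12 -> 0.0
```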

        Proposition 4. Given a chromosome Ci (i = 1, …, |NC|) and random positions a and b
     (a, b = 1, …, |S|), the mutation operation is defined as the shift of the genes found
     on the positions a and b.
        Observation. The mutation results in the generation of a new chromosome.
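
   A minimal sketch of the mutation follows. It assumes that the "shift" of the genes on
positions a and b means that the two genes exchange places, and that a and b are
distinct; the function name and the optional random generator argument are hypothetical.

```python
import random


def mutate(genes, rng=random):
    """Return a new chromosome with the genes at two random positions swapped.

    The parent chromosome is left unchanged; it needs at least two genes.
    """
    a, b = rng.sample(range(len(genes)), 2)   # two distinct positions
    child = list(genes)
    child[a], child[b] = child[b], child[a]
    return child


# Usage: the parent is kept and a new chromosome is produced.
parent = [4, 7, 1, 9]
child = mutate(parent, random.Random(0))
print(parent, child)
```

   Under this reading the mutation changes only the order of the genes, so the total
degree of difficulty of the chromosome is the same before and after; the new order
matters when the chromosome is later split by the crossover.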

        Proposition 5. Given two chromosomes Ci and Cj and a random position p, the
     crossover operation is defined as a succession of steps as follows:
     ─ The two chromosomes are split at the position p.
     ─ The first part of the chromosome Ci is combined with the second part of the
       chromosome Cj and the first part of the chromosome Cj is combined with the second
       part of the chromosome Ci.
     ─ Two new chromosomes Ci' and Cj' are obtained, as follows:

          $C_i' = (g_{i,1}, g_{i,2}, \ldots, g_{i,p-1}, g_{j,p}, \ldots, g_{j,|S|})$   (3)

          $C_j' = (g_{j,1}, g_{j,2}, \ldots, g_{j,p-1}, g_{i,p}, \ldots, g_{i,|S|})$   (4)

        Within the algorithm, the order of the operations is:

a.        Generation of the initial population
b.        Sorting of chromosomes based on fitness
c.        Mutation of chromosomes
d.        Crossover of chromosomes

       Operations b), c) and d) are repeated for a previously-set number of generations.
     The final result is a list of tests from which we store a finite number of tests which
     have the highest value of the fitness.
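
        The order of operations a) to d) can be sketched as a short program. This is an
     illustration under stated assumptions, not the code of the application: the fitness
     function (ratio to MGD, 0 when MGD is exceeded), the reading of the mutation as a
     swap, the rejection of offspring that repeat a question, the truncation of the
     population to its initial size and all names and default parameter values are
     assumptions. The crossover follows equations (3) and (4) with a 1-based position p.

```python
import random


def fitness(genes, difficulty, mgd):
    """Assumed fitness: 0 if the total exceeds mgd, else total / mgd."""
    total = sum(difficulty[g] for g in genes)
    return 0.0 if total > mgd else total / mgd


def mutate(genes, rng):
    """Swap the genes found on two distinct random positions."""
    a, b = rng.sample(range(len(genes)), 2)
    child = list(genes)
    child[a], child[b] = child[b], child[a]
    return child


def crossover(ci, cj, p):
    """One-point crossover at 1-based position p, as in equations (3), (4)."""
    k = p - 1
    return ci[:k] + cj[k:], cj[:k] + ci[k:]


def generate_tests(difficulty, size, mgd, pop_size=50, generations=100,
                   mutation_rate=0.1, crossover_rate=0.5, keep=5, seed=None):
    """Return the `keep` fittest chromosomes (lists of question numbers)."""
    rng = random.Random(seed)
    ids = list(difficulty)

    def score(chromosome):
        return fitness(chromosome, difficulty, mgd)

    # a. initial population: random tests without repeated questions
    population = [rng.sample(ids, size) for _ in range(pop_size)]
    for _ in range(generations):
        # b. sort chromosomes by fitness, best first
        population.sort(key=score, reverse=True)
        offspring = []
        # c. mutation
        for c in population:
            if rng.random() < mutation_rate:
                offspring.append(mutate(c, rng))
        # d. crossover of neighbouring chromosomes
        for ci, cj in zip(population[::2], population[1::2]):
            if rng.random() < crossover_rate:
                offspring.extend(crossover(ci, cj, rng.randint(2, size)))
        # Assumption: drop children that repeat a question, keep the best.
        offspring = [c for c in offspring if len(set(c)) == size]
        population = sorted(population + offspring, key=score,
                            reverse=True)[:pop_size]
    return population[:keep]


# Usage: 30 questions, tests of 5 questions, desired total difficulty 15.
difficulty = {i: (i % 5) + 1 for i in range(1, 31)}
for test in generate_tests(difficulty, size=5, mgd=15, seed=1):
    print(test, sum(difficulty[g] for g in test))
```

        The sketch requires at least two questions per test, since both the swap and the
     split need two positions. Because the best chromosomes of each generation are
     kept, the highest fitness in the population cannot decrease from one generation to
     the next.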


4       Implementation

The implementation was made in the form of a web application. The front end is based on
the Bootstrap framework, used for display and structural components. The back-end
component is based on PHP combined with MySQL, used for database storage. The
customizable parameters, i.e. the ones which influence the performance of the final
output (the size of the initial population, the mutation rate, the crossover rate), can
be modified, but they have default values intended to yield a close-to-optimum solution.
Thus, a user who is unaware of the meaning of these parameters can leave them unset. In
either case, the technical details are presented in a help section.
   The main page of the application is shown in Figure 2 (a-d).


                            Fig. 2. (a) Main panel of the application


                                   Fig. 2. (b) Activity page.




                               Fig. 2. (c) Generation form.


                               Fig. 2. (d) Submission form.

   The application consists of the following components:
─ the dashboard, which shows a summary of the user activity;
─ the script for proposing questions, consisting of an extended form;
─ the page for generating tests, which is the core of the entire application and
  where the input data is set;
─ the page showing the tests generated for a given user, where the user can choose
  some of the tests generated before.

   The visual representation of the application scheme is presented in figure 3.




                      Fig. 3. Visual representation of the application scheme


5       Conclusions

The presented application is essentially a core for the future development of an
assessment aid tool for teachers. In this respect, the implemented tool can be included
in the long list of technology-based tools used in education, which are widely developed
[12] on different platforms, including mobile ones [9]. Given that the underlying
problem relates to NP-completeness, the chosen genetic approach is justified with
respect to user requirements. Future work would consist in developing the existing tool
in two directions: functionalities for the user, such as the automatic output of the
test in a desired form (document), and the theoretical structure, such as adding
requirements to the fitness function.
   The educational process depends on mathematical parameters that technology can use
in order to ease the organisational tasks of the person who is in charge of the
educational process (e.g., the teacher). Technology also has implications for the
actual educational process by providing materials that create an interactive learning
environment.


References
 1. Alharbi, S., Venkat, I.: A Genetic Algorithm Based Approach for Solving the Minimum
    Dominating Set of Queens Problem. Journal of Optimization, Volume 2017 (2017)
 2. Colorni, A., Dorigo, M., Maniezzo, V.: A Genetic Algorithm To Solve The Timetable
    Problem (1994).




 3. Moon, B.: The Subset Sum Problem: Reducing Time Complexity of NP-Completeness with
    Quantum Search, Undergraduate Journal of Mathematical Modeling: One + Two: Vol. 4:
    Iss. 2, Article 2 (2012).
 4. Li Y., Li S., Li X.: Test Paper Generating Method Based on Genetic Algorithm, AASRI
    Procedia, Volume 1, Pages 549-553, ISSN 2212-6716 (2012).
 5. Guang C., Yuxiao D., Wanlin G., Lina Y., Simon S., Qing W., Ying Y., Hongbiao J.: An
    implementation of an automatic examination paper generation system, Mathematical and
    Computer Modelling, Volume 51, Issues 11–12, Pages 1339-1342, ISSN 0895-7177 (2010).
 6. Liu, D. W., Jianmin Z., Lijuan.: Automatic Test Paper Generation Based on Ant Colony
    Algorithm. Journal of Software, 8, doi:10.4304/jsw.8.10.2600-2606 (2013).
 7. Thessen A. E., Cui H., Mozzherin D.: Applications of Natural Language Processing in Bio-
    diversity Science. Advances in Bioinformatics, 2012:391574 (2012).
 8. Boopathiraj, C., Chellamani, K.: Analysis of Test Items on Difficulty Level and
    Discrimination Index in the Test for Research in Education. International Journal Of
    Social Sciences & Interdisciplinary Research, [S.l.], p. 189-193, ISSN 2277-3630 (2013).
 9. Elvira Popescu, Constantin Stefan, Sorin Ilie, Mirjana Ivanovic, EduNotes - A Mobile
    Learning Application for Collaborative Note-Taking in Lecture Settings, Proceedings ICWL
    2016, Lecture Notes in Computer Science, Vol. 10013, Springer, ISBN: 978-3-319-47439-7,
    pp. 131-140, 2016.
10. Alex Becheru, Elvira Popescu, Design of a conceptual knowledge extraction framework for
    a social learning environment based on social network analysis methods, Proceedings ICCC
    2017, ISBN: 978-1-5090-4862-5, pp. 177-182, 2017.
11. Doru Constantin, Emilia Clipici, “A new model for estimating the risk of bankruptcy of the
    insurance companies based on the artificial neural networks”, Proceedings of the 16th edi-
    tion of the SGEM International GeoConferences (International Multidisciplinary Scientific
    GeoConferences), 28 June-7 July, 2016.
12. C. Baron, A. Şerb, N.M. Iacob, C.L. Defta, IT Infrastructure Model Used for Implementing
    an E-learning Platform Based on Distributed Databases, Quality-Access to Success Journal,
    Vol. 15, Issue 140, pp. 195-201, 2014
13. C.L Defta, A. Şerb, N.M. Iacob, C. Baron, Threats analysis for E-learning platforms, Vol. 6
    / Nr. 1, pp. 132–135, 2014
14. Domşa Ovidiu, Emilian Ceuca, Mircea Râşteiu, Algorithm to find a tree with Maximal
    Terminal Nodes, 1st Balkan Conference in Informatics, 21-23 November 2003, Thessaloniki,
    Greece, ISSN 960-287-045-1, pp. 113-122
15. Victor Tiţa, Raluca Necula: Trends In Educational Training For Agriculture In Olt
    County, Scientific Papers Series Management, Economic Engineering in Agriculture and
    Rural Development, Vol. 15, Issue 4, 2015, PRINT ISSN 2284-7995, E-ISSN 2285-3952

