SUBJECTIVE MODEL ANSWER GENERATION TOOL FOR DIGITAL EVALUATION SYSTEMS

Shubham, Research Scholar, Bhilai Institute of Technology, Durg, +91-9644026902, Shubhamlive1010@gmail.com
Dr. Arpana Rawal, Professor, Bhilai Institute of Technology, Durg, +91-9907180993, arpana.rawal@gmail.com
Dr. Ani Thomas, Professor, Bhilai Institute of Technology, Durg, +91-9893165872, ani.thomas@bitdurg.ac.in

ABSTRACT
Automated subjective answer assessment in modern digital evaluation environments promises structural consistency, but it can distort the very nature of the complex, context-rich information put up for evaluation. In modern teaching-learning environments, where a wide variety of bias is observed while fabricating human-scripted Memoranda-of-Instructions, it becomes difficult to evaluate subjective answers with appropriate justification. Answer evaluation systems have seen extensive research by academicians for a few decades; research on subjective model answer generation, on the other hand, is still in its infancy. An algorithm for subjective model answer generation has therefore become necessary for developing a generic framework covering all types of subjective questions. In this paper, we describe one such algorithm for generating model answers to all types of descriptive (subjective) questions from a given text corpus.

CCS Concepts
• Information systems → Information retrieval → Retrieval tasks and goals → Question answering.

Keywords
Answer Generation Systems; Algorithm; Content filtering; Restricted Domain; Vocabulary; Seed; Co-occurring; Domain specific; Answer retrieval

1. INTRODUCTION
As the modern education system is augmented with digital environments around the globe, evaluation systems are also being digitized progressively. With the advent of automated subjective answer evaluation tools such as Electronic Essay Rater (E-rater) by Burstein, Kukich, Wolff, Chi and Chodorow (1998), Conceptual Rater (C-rater) (Valenti et al., 2003), Intelligent Essay Assessor (IEA) (Valenti et al., 2003), Educational Testing Service (ETS-I) by Whittington and Hunt (1999), BETSY (Valenti et al., 2003) and Schema Extract Analyze and Report (SEAR) (Christie, 1999), a drastic drift is seen in the preparation of question paper manuscripts from MCQ questionnaires to a blend of both objective and subjective questions [1, 2]. Academicians are observed spending more of their time setting question papers and evaluating answers than analyzing the scores and counselling the students. According to recent statistics, it takes one month on average to evaluate 700 candidate answer scripts for six subjects in total, so that results for the same lot of students take almost two to three months to declare. Apart from this, even with expert evaluators it is not always possible to justify which answer is better and why. Envisioning such a series of hurdles, an attempt is made here to ease the task of manual answer generation by obtaining machine-generated answers for some question categories.

The rest of this paper is organized as follows: Section 2 addresses preprocessing issues that have been dealt with by other systems for model answer generation. Section 3 outlines the subjective answer generation algorithm. Section 4 suggests further applications and developments possible in the near future.

2. PRE-PROCESSING ISSUES
The considerable issues that need to be investigated in depth before building the prototype tool for generating model answers are enumerated below:

Language Support: One of the design issues in the algorithm demands concurrent modification of passive data objects in an already existing dictionary while checking for terms in that domain-specific vocabulary, in an attempt to expand the vocabulary at runtime. Not all languages support this feature; hence there arises a need to choose an appropriate language for tool development. This issue can also be resolved by using the method described by D. Clarke et al. in their article [3].

Natural Language Processing (NLP) Tool selection: The selected NLP tool must support the following annotator properties: Tokenization, Sentence Splitting, Lemmatization, Parts-of-Speech Tagging, Constituency Parsing, Dependency Parsing and Co-reference Resolution (a configuration sketch is given at the end of this section).

Supporting Domain-specific Vocabularies: Using 'WordNet' as the source of open-domain vocabulary may seem optimal at first, but it usually hinders the generation of the most accurate answers, which demand information retrieval over a narrowly specified subject domain. The Information Retrieval model built for Question-Answering (QA) systems by IBM's statistical system finds its greatest hindrance in the last step of trimming the set of optimal sentences from the ranked set of passages obtained in the previous step. The best alternative for reducing such system errors is to use restricted domains as background knowledge rather than open domains [4]. In another exhaustive survey, L. Hirschman and R. Gaizauskas emphasized the crucial role of passages in the extraction of answers to subjective questions through IR techniques [5].
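The paper does not commit to a particular NLP toolkit. As one illustration of the annotator requirements listed above, the sketch below configures Stanford CoreNLP (an assumed choice, not named in this paper) for tokenization, sentence splitting, POS tagging, lemmatization, constituency and dependency parsing, and co-reference resolution; the annotator identifiers follow CoreNLP's conventions and the sample text is arbitrary.

    import java.util.Properties;
    import edu.stanford.nlp.pipeline.CoreDocument;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;

    public class AnnotatorSetup {
        public static void main(String[] args) {
            // Annotators covering the properties listed above; "ner" is included
            // because the co-reference annotator depends on named-entity tags.
            Properties props = new Properties();
            props.setProperty("annotators",
                    "tokenize,ssplit,pos,lemma,ner,parse,depparse,coref");
            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

            CoreDocument doc = new CoreDocument(
                    "An operating system schedules processes. It also manages memory.");
            pipeline.annotate(doc);

            // Sentence splits and per-token lemmas are now available for the
            // seed-sentence and vocabulary-expansion steps described in Section 3.
            doc.sentences().forEach(s -> System.out.println(s.lemmas()));
        }
    }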
3. SUBJECTIVE ANSWER GENERATION ALGORITHM
The syntax used in the pseudo code borrows some elements from Java syntax. All input and output objects are specified in bold. Language constructs such as conditionals and loops use italics. Square brackets denote the index used for storing and accessing array elements. The assignment operation is denoted by the symbolic notation ←. The variables and procedures used in the pseudo code are described as follows:

a. Q is the raw question string to be used for finding answers.
b. K is the List of keyword strings present in question Q, which can be generated using open-source NLP tools.
c. C is the List of corpus sections as strings, which can be the sections or chapters defined in a standard text book.
d. V is the source vocabulary, which can be generated for all the keywords of Corpus C using either WordNet for an enhanced domain vocabulary or manual human intervention for a restricted domain.
e. Get-Entry-Points: Function for initial filtering based on the count of question keywords K found in the different sections of the text corpus (an illustrative sketch follows this list).
f. Get-Seed-Sentences: Function for getting the seed sentences present in a particular section, based on the keywords K and the keyword vocabulary for the given question Q.
g. Get-Section-Sentences: Function returning the list of all sentences present in a text paragraph from a section.
h. Get-Vocab: Returns a list of strings for each term supplied; each list in the returned composite list contains the synonyms of the respective word in the input list.
i. Get-Keywords: Returns a list of related keywords based on the NLP dependencies provided by NLP parsers.
j. Get-Co-Occurring-NP: Returns the co-occurring noun phrases after performing anaphora resolution on the supplied text.
k. Get-Seed-Index: Returns the position of a given seed sentence within the supplied list of section sentences.
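The procedures above are specified only by their behaviour. As one possible reading of item (e), the following minimal Java sketch filters corpus sections by counting how many question keywords each section mentions; the class name, method name and sample data are illustrative assumptions, not part of the tool described in this paper.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Locale;

    public class EntryPointFilter {

        // Returns the indices of corpus sections that mention at least one question
        // keyword (item e, Get-Entry-Points); a stricter threshold or a ranking by
        // the hit count could equally be applied here.
        public static List<Integer> getEntryPoints(List<String> keywords, List<String> corpusSections) {
            List<Integer> entryPoints = new ArrayList<>();
            for (int i = 0; i < corpusSections.size(); i++) {
                String section = corpusSections.get(i).toLowerCase(Locale.ROOT);
                int hits = 0;
                for (String keyword : keywords) {
                    if (section.contains(keyword.toLowerCase(Locale.ROOT))) {
                        hits++;
                    }
                }
                if (hits > 0) {
                    entryPoints.add(i);
                }
            }
            return entryPoints;
        }

        public static void main(String[] args) {
            List<String> keywords = List.of("process", "scheduling");
            List<String> corpus = List.of(
                    "A process is a program in execution...",
                    "File systems organise data on disk...",
                    "Process scheduling decides which process runs next...");
            System.out.println(getEntryPoints(keywords, corpus));   // prints [0, 2]
        }
    }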
The algorithm for generating answers is as follows:

ALGORITHM Generate-Answer is
  INPUT:  Question Q with Keywords K,
          Text Corpus C as List of section fragments,
          Vocab Source V
  OUTPUT: Answer A comprising concatenated fragments

  E ← Get-Entry-Points(Q, C)
  CREATE an empty list Answer_Fragments of type String
  FOR i = 0 to E.size do
    Cur_Segment ← E[i]
    Seed_Sentences ← Get-Seed-Sentences(E[i], K, C)
    Section_Sentences ← Get-Section-Sentences(E[i], C)
    CREATE an empty list SectionWise_Fragments of type String
    FOR j = 0 to Seed_Sentences.size do
      Seed_Index ← Get-Seed-Index(Section_Sentences, Seed_Sentences[j])
      CREATE List Seed_Vocab of type String
      CREATE List Co_Occurring_NP of type String
      Seed_Vocab ← Get-Vocab(Get-Keywords(Section_Sentences[Seed_Index]), V)
      Left_Marker ← Seed_Index
      Right_Marker ← Seed_Index
      WHILE there exists a String from Seed_Vocab or Co_Occurring_NP
            in Section_Sentences[Left_Marker]
        Cur_Co_Occurring_NP ← Get-Co-Occurring-NP(Section_Sentences[Left_Marker])
        add Cur_Co_Occurring_NP to Co_Occurring_NP
        Left_Marker ← Left_Marker - 1
        IF Left_Marker = 0
          break from while loop
        END IF
      END WHILE
      WHILE there exists a String from Seed_Vocab or Co_Occurring_NP
            in Section_Sentences[Right_Marker]
        Cur_Co_Occurring_NP ← Get-Co-Occurring-NP(Section_Sentences[Right_Marker])
        add Cur_Co_Occurring_NP to Co_Occurring_NP
        Right_Marker ← Right_Marker + 1
        IF Right_Marker = Section_Sentences.size
          break from while loop
        END IF
      END WHILE
      INITIALIZE Cur_Frag to empty String
      FOR k = Left_Marker to Right_Marker
        Concatenate Section_Sentences[k] to Cur_Frag
      END FOR
      Add Cur_Frag to SectionWise_Fragments
    END FOR
    Remove duplicate sentences from SectionWise_Fragments
    INITIALIZE Cur_Section_Answer to empty String
    FOR j = 0 to SectionWise_Fragments.size do
      Concatenate SectionWise_Fragments[j] to Cur_Section_Answer
    END FOR
    Add Cur_Section_Answer to Answer_Fragments
  END FOR
  INITIALIZE A to empty String
  FOR i = 0 to Answer_Fragments.size
    Concatenate Answer_Fragments[i] to A
  END FOR
  RETURN A
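The pseudo code above is deliberately implementation-agnostic. As a concrete illustration of the fragment-extraction step, the following minimal Java sketch grows a fragment leftwards and rightwards from one seed sentence over the section's sentence list; the vocabulary and co-occurring-NP test is reduced here to a plain substring match, and the class, methods and sample data are illustrative assumptions rather than the authors' implementation.

    import java.util.List;

    public class FragmentExpansion {

        // True if the sentence mentions any term from the seed vocabulary; in the
        // full algorithm this test also covers co-occurring NPs found by anaphora
        // resolution (Get-Co-Occurring-NP).
        static boolean mentionsAny(String sentence, List<String> terms) {
            String s = sentence.toLowerCase();
            return terms.stream().anyMatch(t -> s.contains(t.toLowerCase()));
        }

        // Expands left and right from the seed sentence while neighbouring sentences
        // keep mentioning seed-vocabulary terms, then concatenates the covered
        // sentences into one answer fragment (cf. the Left_Marker / Right_Marker
        // loops in the pseudo code).
        static String expandFragment(List<String> sectionSentences, int seedIndex,
                                     List<String> seedVocab) {
            int left = seedIndex;
            int right = seedIndex;
            while (left - 1 >= 0 && mentionsAny(sectionSentences.get(left - 1), seedVocab)) {
                left--;
            }
            while (right + 1 < sectionSentences.size()
                    && mentionsAny(sectionSentences.get(right + 1), seedVocab)) {
                right++;
            }
            StringBuilder fragment = new StringBuilder();
            for (int k = left; k <= right; k++) {
                fragment.append(sectionSentences.get(k)).append(' ');
            }
            return fragment.toString().trim();
        }

        public static void main(String[] args) {
            List<String> section = List.of(
                    "Deadlock handling is an important OS topic.",
                    "A deadlock occurs when processes wait for each other's resources.",
                    "Deadlock prevention breaks one of the four necessary conditions.",
                    "File allocation tables are unrelated to this discussion.");
            List<String> seedVocab = List.of("deadlock", "resources");
            // Prints the first three sentences joined into one fragment.
            System.out.println(expandFragment(section, 1, seedVocab));
        }
    }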
4. FURTHER APPLICATIONS AND DEVELOPMENT
This tool is observed to provide answer fragments that compare fairly well with model answers prepared by human assessors. The algorithm presented here is capable of generating answers with high precision, depending on the vocabulary source, but other parameters such as context continuity and context span must be incorporated to limit the locality of context and obtain more accurate results with high recall. Software testing of the tool shows promising results in performing fair and unbiased evaluation of students' answer scripts. Combining this algorithm with a good answer evaluation approach can provide a robust answer evaluation feature for automating digital evaluation systems. Another field of application for this tool is the evaluation of online assignments at the institute level for analyzing students' appraisals on a continuous scale. Future upgrades of the tool include real-time answer generation for different types of subjective questions posed in a wide variety of grammatical styles and for versatile subject domains.

5. ACKNOWLEDGMENTS
This work was supported by the Research and Development Laboratory, Department of Computer Science and Engineering, Bhilai Institute of Technology, Durg, Chhattisgarh, India, and is awaiting sponsorship from suitable funding agencies.

6. REFERENCES
[1] Valenti, S., Neri, F. and Cucchiarelli, A. 2003. An Overview of Current Research on Automated Essay Grading. Journal of Information Technology Education (JITE), pp. 319-330.
[2] Christie, J. 1999. Assessment of Essay Marking - Focus on Style and Content. In Proceedings of the 3rd International Computer Assisted Assessment Conference (CAA), pp. 39-45.
[3] Diekema, R., Yilmazel, O. and Liddy, E. D. 2004. Minimal Ownership of Active Objects. In Proceedings of the 6th Asian Symposium on Programming Languages and Systems (APLAS 2008), Bangalore, pp. 139-154.
[4] Guruji, P. A., Pagnis, M. M., Pawar, S. M. and Kulkarni, P. J. 2015. Evaluation of Subjective Answers Using GLSA Enhanced with Contextual Synonymy. International Journal on Natural Language Computing (IJNLC), Vol. 4, No. 1, pp. 51-60.
[5] Tiedemann, J. 2005. Integrating Linguistic Knowledge in Passage Retrieval for Question Answering. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, Vancouver, British Columbia, Canada, pp. 939-946.