An Ontological Inference Driven Interactive Voice Response System

Mohammad Ababneh and Duminda Wijesekera
Department of Computer Science, George Mason University, Fairfax, VA, USA
mababneh@gmu.edu, dwijesek@gmu.edu

Abstract— Someone seeking entry to an access-controlled facility or through a border control point may face an in-person interview. Questions that may be asked in such an interview may depend on the context and vary in detail. One of the issues that interviewers face is to ask relevant questions that would enable them to either accept or reject entrance. Repeating questions asked at entry-point interviews may render them useless, because most interviewees may come prepared to answer common questions. As a solution, we present an interactive voice response system that can generate a random set of questions that are contextually relevant, of the appropriate level of difficulty, and not repeated in successive question-answer sessions. Furthermore, our system has the ability to limit the number of questions based on the available time, the degree of difficulty of generated questions or the desired subject concentration. Our solution uses Item Response Theory to select questions from a large item bank generated by inferences over multiple distributed ontologies.

Keywords—Ontology; Semantic Web; OWL; Dialogue; Question Answering; Voice Recognition; IVR; VXML; Access Control Policy; Security; Item Response Theory.

I. INTRODUCTION

Physical control points such as human-guarded gates, border control points and visa counters provide entry into facilities or geographical regions to those that can be admitted legitimately. Legitimacy is usually determined by rules, regulations or policies known to entry control personnel, whose duty is to ensure that these policies are enforced while admitting people. In order to do so, they hold an interview in which an aspiring entrant is asked a series of questions, and may have to show some documents and demonstrate some knowledge about the contents of the documents or the attributes contained in them. Successful interviews should have questions that are relevant, of a reasonable level of difficulty (i.e., not too difficult, yet not common knowledge) and not asked in prior interviews for the same purpose, all without drawing accusations of bias from rejected entrants. Ideally, a successful interview should accommodate differences in accents and provide assurance that it is unbiased against similar attributes.

Given the recent success of interactive voice response (IVR) systems such as auto attendants, satellite navigation, and personal assistants such as Apple's Siri, Google's Voice and Microsoft's Speech, we investigated the possibility of specializing IVR systems for access control, such as: visa interviews, entry-point interviews, biometric enrollment interviews, password reset, etc.

Although IVR systems have come a long way in recognizing human voice and responding to human requests as if the responses came from another human, most existing IVR systems are pre-programmed with questions and their acceptable answers, and consequently have limited capability in satisfying the use case at hand.

The first, minor limitation of current IVR systems comes from the fact that the human starts and drives the conversation. The second limitation is that most IVR systems have a finite number of pre-programmed conversations; therefore the set of questions generated by such a system is the same for every conversation. This limitation may expose the set of questions, so that aspiring entrants may come with prepared question-answer pairs even if the subject matter of the questions is unfamiliar to them. Consequently, having the ability to select questions from a large pool may resolve this limitation. The third limitation is that when selecting a random set of questions from a large pool, the set of questions asked may not have the desired overall level of difficulty to challenge the user. Solving this issue is relevant because all aspiring entrants expect to have a fair interview. The fourth limitation is that questions must be able to discriminate between someone who knows the subject matter and someone who guesses an answer.

STIDS 2013 Proceedings Page 125

As a solution, we created an ontological-inference-based IVR system that uses Item Response Theory (IRT) to select the questions [13, 3]. Our system uses the XACML language as a base to establish entry policies that consist of rules specifying the attributes that must be possessed by permitted entrants [7]. The IVR system has the responsibility of determining access by asking questions generated using ontological inferences and IRT.

In previous work, we introduced a policy-based IVR system for use in access control to resources [1]. Later, we presented an enhancement that uses IRT to select queries from a large set of attributes present in a policy [2]. Here we introduce an ontology-aided access control system by including questions related to the base attributes in order to ascertain the interviewee's familiarity, and provide a score for the entire set of answers [8]. We also have the added capability to generate the succeeding question based on the accuracy of the answer to the preceding question. We do so by aligning each attribute with an ontology that encodes the subject matter expertise on that attribute and deriving facts from these ontologies using reasoners to generate questions. We then assign weights to these derivations based on the axioms and rules of derivation used in the proof tree.

Usually ontologies have a large number of axioms and assert even more facts when using reasoners. Consequently, blindly converting such an axiom base to a human-machine dialogue would result in very long conversations, with many disadvantages. The first is that human users would become frustrated at being subjected to long machine-driven interrogations, thereby reducing the usability of the system. The second is that long conversations take a longer time to arrive at an accept/reject decision, and are likely to create long queues at points of service such as airports and guarded doors. In addition, having a line of people behind one person in close proximity may leak private information of the interviewee. Also, others may quickly learn the set of questions and answers that would get them mistakenly authorized, thereby gaining unauthorized access.

We use IRT, which provides the basis for selecting tests from a large number of potential questions. Psychometricians in the social sciences and standardized-test organizations such as the Educational Testing Service, which administer standardized examinations like the SAT, MCAT, GMAT, etc., have developed methodologies to measure an examinee's trust or credibility from answers provided to a series of questions. In traditional tests, the ability of the examinee is calculated by adding up the scores of correct answers. Currently, Computerized Adaptive Testing (CAT) that relies on IRT is used to better estimate an examinee's ability. It has also been shown that the use of CAT/IRT reduces the number of questions necessary to reach a credible estimation of the examinee's ability by 50%. CAT/IRT can be used to control the number and order of questions to be generated based on the examinee's previous answers [4, 5].

Our goal in this work is to demonstrate and build an access control system using dialogues of questions and answers generated from a suitable collection of ontologies. Table I shows a sample dialogue generated from our research. Our prototype automated IVR system can help immigration enforcement at a border control point in making a decision to permit or deny a person asking for entry. Through a dialogue of questions and answers, the interviewee is assigned a numerical score that then serves as a threshold in the decision-making process. This score is calculated using IRT, which takes into account the correctness of the user's responses and the weight of the individual questions.

The rest of the paper is organized as follows. Section II describes an ontological use case, Section III describes item response theory, Section IV describes the system architecture, Section V describes our implementation, Section VI is about experimental results and Section VII concludes the paper.

II. MOTIVATING USE CASE

In this section, we describe an example ontology used in our work to generate efficient dialogues of questions and answers that are used in assigning a numerical value to an interviewee's ability or trust level.

TABLE I. A SAMPLE DIALOGUE
Fig. 1 illustrates a class diagram of our under-development ontology for homeland security. The purpose of this ontology is to collect, organize and infer information that can help deter possible attacks, enforce strict entry and enable faster reach to suspects. The ontology defines classes, individuals, properties and relationships using the OWL 2 Web Ontology Language (OWL) [9]. The major entities in the ontology are:
• Person: defines humans in general and has subclasses like International Student and Friend.
• Event: defines an event that has a location, date, time and type, like a terrorist attack.
• International Student: a person who is on an F-1 or J-1 visa.
• University: defines a university. Some of its current members are MIT and GMU.
• City: defines a city like Boston.
• Country: defines a country like USA, Russia, Dagestan, Kazakhstan, etc.
• State: defines a state like Massachusetts.
• Visa: defines visa types like F-1 and J-1 student visas, and maybe others.

This ontology represents many kinds of data classes and relationships between these major classes and individuals. For example, we define the "Boston Marathon Bombing" as a "Terrorist Attack" that happened in "Boston", which is a city in "Massachusetts" state. Another fact is that "Dzhokhar Tsarnaev" is an "Event Character" in the "Boston Marathon Bombing" "Terrorist Attack". Also, we have an "International Student" who is a friend of the "Event Character" in the "Boston Marathon Bombing".

Fig. 1. The Homeland Security Ontology in Protégé

We use this ontology in our work because it serves as a good example showing the strength of our system. First, it shows the possibility of generating valuable questions from asserted or inferred facts. Second, it enables the implementation of the theory under consideration (to be discussed later in the background section) to generate efficient and secure dialogs that are used in: (1) making entry control decisions, (2) assigning numerical values to ability or trust in the shortest time possible and (3) load distribution among interviewers and diverting people for further investigation.

The use of an ontology in such an application provides many benefits. The most important amongst them is reasoning. Using a reasoner, we are able to derive facts from asserted ones. These facts are used to generate questions to measure the knowledge or ability level of an interviewee on a subject under questioning. In IRT, better item selection and ability estimation happen when a large set of items is available to draw questions from. Using an ontology, the large number of derivable facts provides us with the ability to increase the number of questions, and also to control the quality and difficulty of questions.

Although there are many reasoners such as FaCT++, JFact, Pellet and RacerPro, we use HermiT [12] in our work. Given an OWL file, HermiT can determine whether or not the ontology or an axiom is consistent, identify subsumption relationships between classes and deduce other facts. Most reasoners are also able to provide explanations of how an inference was reached using the predefined axioms or asserted facts.

One such fact derived from asserted ones in our ontology is finding the friends, holding a student visa, of a person involved in a terrorist attack. To explain this, we have "dzhokhar is a friend of Dias", "Dias is a friend of Azamat", "Dias has an F-1 visa", "Azamat has a J-1 visa", "dzhokhar is an Event Character in the Boston Marathon Bombing", and "the Boston Marathon Bombing is a Terrorist Attack". Thus we infer (using the HermiT reasoner) that Azamat and Dias are friends of the Boston bomber and therefore need to be questioned at any entry point. We use this chain of derivations to generate specific questions from them.

Reasoners and the explanations that they provide are very important components in our work for generating relevant and critical questions from an ontology that measure knowledge and estimate ability from a response in order to grant access or assign trust. In the example above, the reasoner provided an explanation of the inference using 11 axioms. We use such a number in defining the difficulty of questions generated from such inferences, as will be explained in Section V. Fig. 2 shows the HermiT reasoner explanation of our inferred fact.

Fig. 2. A sample explanation of an inferred axiom in Protégé using the HermiT reasoner
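The friend-of/visa derivation above can be illustrated with a toy sketch. This is our illustration only — the actual system delegates such inference to the HermiT reasoner over the OWL ontology, and the class, field and method names here are ours:

```java
import java.util.*;

// Toy illustration (ours) of the derivation sketched above: starting from
// asserted friend-of and visa facts, collect every person reachable through
// friend-of links from an event character who holds an F-1 or J-1 visa.
// The real system obtains such facts from the HermiT reasoner instead.
public class FriendInference {
    static Map<String, List<String>> friendOf = new HashMap<>();
    static Map<String, String> visa = new HashMap<>();

    public static Set<String> studentVisaFriendsOf(String person) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> toVisit =
            new ArrayDeque<>(friendOf.getOrDefault(person, List.of()));
        Set<String> seen = new HashSet<>();
        while (!toVisit.isEmpty()) {
            String p = toVisit.pop();
            if (!seen.add(p)) continue;      // skip anyone already examined
            String v = visa.get(p);
            if ("F-1".equals(v) || "J-1".equals(v)) result.add(p);
            toVisit.addAll(friendOf.getOrDefault(p, List.of()));
        }
        return result;
    }
}
```

Seeding friendOf with dzhokhar→Dias and Dias→Azamat, and visa with Dias→F-1 and Azamat→J-1, yields {Dias, Azamat}, mirroring the chain of derivations above.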
III. BACKGROUND

A. IVR Systems

The main purpose of an IVR system is to interact with humans using a voice stream. An IVR environment consists of a markup language to specify voice dialogues, a voice recognition engine, a voice browser and auxiliary services that allow a computer to interact with humans using voice and Dual Tone Multi-Frequency (DTMF) tones with a keypad, enabling hands-free interactions between a user and a host machine [13]. Recently, many applications such as auto attendants, satellite navigation, and personal assistants such as Apple's Siri, Google's Voice, Microsoft's Voice, etc., have started using IVR systems. The IVR language we use is VoiceXML, sometimes abbreviated as VXML [14]. Briefly, VoiceXML is a voice markup language (comparable to HTML among the visual markup languages) developed and standardized by the W3C's Voice Browser Working Group to create audio dialogues that feature synthesized speech, digitized audio, recognition of spoken input and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations.

B. Item Response Theory

IRT, sometimes called latent trait theory, is popular among psychometricians for testing individuals, and a score assigned to an individual in IRT is said to measure his latent trait or ability. Mathematically, IRT provides a characterization of what happens when an individual meets an item, such as an exam or an interview. In IRT, each person is characterized by a proficiency parameter that represents his ability, mostly denoted by θ in the literature. Each item is characterized by a collection of parameters, mainly its difficulty (b), discrimination (a) and guessing factor (c). When an examinee answers a question, IRT uses the examinee's proficiency level and the item's parameters to predict the probability of the person answering the item correctly. The probability of answering a question correctly according to IRT in a three-parameter model is shown in (1), where e is the constant 2.718, b is the difficulty parameter, a is the discrimination parameter, c is the guessing value and θ is the ability level [3].

P(θ) = c + (1 − c) · 1 / (1 + e^(−a(θ − b)))   (1)

In IRT, test items are selected to yield the highest information content about the examinee by presenting items with difficulty parameter values that are close to his ability value. This reduces time by asking fewer, relevant questions rather than wider-range ones, while satisfying content considerations such as items or rules that are critical for a decision of access or scoring.

1) IRT parameter estimation

In order to determine the difficulty and discrimination parameters of a test item, IRT uses Bayesian estimates, maximum likelihood estimates (MLE) or similar methods [3, 4]. In the original IRT, an experiment is conducted to estimate these values for each item at an assumed level of ability, for various groups, with the analyst assigning the associated IRT parameter values using his judgment and experience. Nevertheless, by using our system we can also revise any initial values for these parameters. We model rule attributes as test items and rely on the policy administrator to provide the estimated probabilities.

2) IRT ability estimation

In IRT, responses to questions are dichotomously scored. That is, a correct answer gets a score of "1" and an incorrect answer gets a score of "0". The list of such results constitutes an item response vector. To estimate the examinee's ability, IRT utilizes maximum likelihood estimation in an iterative process involving an a priori value of the ability, the item parameters and the response vector, as shown in (2). Here, θ_(s+1) is the estimated ability within iteration s, a_i is the discrimination parameter of item i (i = 1, 2, ..., N), u_i is the response of the examinee (1/0 for correct/incorrect), P_i(θ_s) is the probability of a correct response from (1), and Q_i(θ_s) = 1 − P_i(θ_s) is the probability of an incorrect response [3, 4].

θ_(s+1) = θ_s + [ Σ_(i=1..N) a_i (u_i − P_i(θ_s)) ] / [ Σ_(i=1..N) a_i² P_i(θ_s) Q_i(θ_s) ]   (2)

Then, the ability estimate is adjusted to bring the computed probabilities closer to the examinee's responses to items. This process is repeated until the MLE adjustment becomes small enough that the change is negligible. IRT accommodates multiple stopping criteria, such as a fixed number of questions, an ability threshold or a standard error confidence level. The result is then considered an estimate of the examinee's ability and the estimation procedure stops. The ability or trait usually ranges from −∞ to +∞, but for computational reasons acceptable values are limited to the range [−3, +3].
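As a concrete illustration of the estimation machinery, equations (1) and (2) can be sketched in Java. This is a minimal sketch for exposition only; the class and method names are ours, not those of the paper's implementation:

```java
// Sketch of IRT equations (1) and (2); class and method names are illustrative.
public class IrtMath {

    // Equation (1), three-parameter logistic model:
    // P(theta) = c + (1 - c) / (1 + e^(-a(theta - b)))
    public static double probability(double theta, double a, double b, double c) {
        return c + (1.0 - c) / (1.0 + Math.exp(-a * (theta - b)));
    }

    // Equation (2), one MLE iteration over the response vector u:
    // theta_{s+1} = theta_s + sum_i a_i (u_i - P_i) / sum_i a_i^2 P_i Q_i
    public static double updateTheta(double theta, double[] a, double[] b,
                                     double[] c, int[] u) {
        double num = 0.0, den = 0.0;
        for (int i = 0; i < u.length; i++) {
            double p = probability(theta, a[i], b[i], c[i]);
            num += a[i] * (u[i] - p);          // a_i (u_i - P_i)
            den += a[i] * a[i] * p * (1.0 - p); // a_i^2 P_i Q_i
        }
        return theta + num / den;
    }
}
```

For example, with a = 1, b = 0, c = 0 and θ = 0, equation (1) yields P = 0.5, and a single correct answer (u = 1) moves the estimate to θ = 0 + 0.5/0.25 = 2.0; in practice the estimate is then kept within the working range [−3, +3].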
C. Access Control and XACML

Access control policies specify which subjects may access which resources under some specified conditions [6]. An attribute-based access control policy specifies subjects, objects and resources using some attributes. XACML is an OASIS standard XML-based language for specifying access control policies [7]. In a typical XACML usage scenario, a subject that seeks access to a resource submits a query through an entity called a Policy Enforcement Point (PEP), which is responsible for controlling access to the resource. It forms a request in the XACML request language format and sends it to the Policy Decision Point (PDP), which in turn evaluates the request and sends back one of the following responses: accept, reject, error, or unable to evaluate.

IV. USING IRT TO MANAGE AND CONTROL DIALOGUES FROM ONTOLOGIES

Fig. 3. Ontology-based IVR using IRT

Fig. 3 shows the overall architecture of our system. We use derived or axiomatic facts of the ontology to create the questions asked by our IVR system. Given that a large number of facts can be derived from our ontology, but only a few questions can be asked during an interview, we use IRT to select the facts that are used to generate questions.

Our questions are automatically created without human involvement by combining English words or phrases such as "Does" or "Is-a" with ones chosen from the ontology's (subject, property, object) triples. The expectation is a dichotomous answer of either (yes, no) or (true, false). Ontological property names such as "is-a" and "has-something" are prime candidates for creating true/false questions. Our system transforms the question into VoiceXML and plays it to the user. Then the system waits for the user's utterance, and if the user provides one, the system's voice recognition software attempts to recognize the input and checks the correctness of the answer. Based on the answer, the IRT estimation procedure either increases the a priori ability score or decreases it. The process continues until a predetermined level of ability or accuracy, specified according to the application, is reached.

Because ontologies produce a large number of facts, it would be impractical to run a dialogue that lasts hours in order to estimate a user's ability. Our homeland security ontology uses 167 axioms. The reasoner was able to infer 94 facts, raising the total number of axioms, and hence of candidates for generating questions, to 273.

We use IRT to manage and control dialogue questions generated from a large pool of ontologically derived facts in a way that shortens the length of dialogues while keeping the maximum accuracy in estimating the user's trust. The IRT-based estimate (θ) represents the trust or confidence of the system in the person answering the questions in order to make an access decision.

We have used the OWL annotation property to assign IRT parameters to axioms. Annotations were selected in order to keep the semantics and structure of the original ontology intact. We annotate every asserted axiom in the ontology with the IRT parameters, which are: difficulty (b), discrimination (a) and guessing (c). Currently, we assume all asserted axioms have the same default difficulty and discrimination values of 1. The code snippet in Fig. 4 illustrates our annotation using Java with the OWL API. An improvement to this approach would be to assign different values for difficulty and discrimination by using domain experts.

OWLAnnotationProperty irtDifficultyAP =
    df.getOWLAnnotationProperty(IRI.create("#irt_difficulty"));
OWLAnnotation irtAnnotation = df.getOWLAnnotation(
    irtDifficultyAP, df.getOWLLiteral(1.0));
for (OWLAxiom axiom : axioms) {
    OWLAxiom axiom2 = axiom.getAnnotatedAxiom(
        Collections.singleton(irtAnnotation));
    manager.addAxiom(ontology, axiom2);
}

Fig. 4. Java code for asserted axiom annotation

Set<OWLAxiom> inferredAxioms = inferredOntology.getAxioms();
DefaultExplanationGenerator explanationGenerator =
    new DefaultExplanationGenerator(
        manager, factory, ontology, reasoner,
        new SilentExplanationProgressMonitor());
for (OWLAxiom axiom : inferredAxioms) {
    Set<OWLAxiom> explanation =
        explanationGenerator.getExplanation(axiom);
    // Annotate inferred axioms using the number of explanation axioms
    OWLAxiom tempAxiom = axiom.getAnnotatedAxiom(
        Collections.singleton(irtAnnotation));
    manager.addAxiom(inferredOntology, tempAxiom);
}

Fig. 5. Java code for inferred axiom annotation

We weigh inferred facts more heavily during the estimation process. We calculate these parameter values from the number of explanation axioms used in each individually inferred fact. Our current scheme of difficulty value assignment is shown in Table II, where higher values, or weights, are assigned according to the number of explanation axioms used to infer a fact; consequently, a question generated from an inferred fact is considered to be more difficult than one generated from an asserted fact. Fig. 5 illustrates a code snippet for inferred axiom annotation.

In our current work, and for testing purposes, we use a default value of "1.0" for discrimination and "0.0" for guessing, which practically neutralizes them, leaving the difficulty parameter as the sole factor in estimating ability using equation (2). However, our solution and algorithm are based on the IRT two-parameter model, which relies on the item's difficulty and discrimination parameters. Fig. 6 shows our algorithm to estimate ability based on equation (2) [3]. Our system estimates the ability of a user after every answer to a question generated from an axiom, before selecting and asking the next question. If the ability estimate exceeds the threshold, then access is granted. If the threshold is not reached, then additional questions are offered. If the estimated ability doesn't reach the threshold, the dialog stops and access is denied. Depending on the application, the dialog might be run again, giving a second chance. When the ability estimation again reaches a predefined threshold, the system concludes the dialog and conveys the decision.

The resultant decision is based on the IRT characteristics of the axioms and not on the number or percentage of correctly answered questions as in traditional testing. The ability estimate produced by our implementation also comes with a standard error (SE) value that is a measure of the accuracy of the estimate. Equation (3) presents the formula used for the standard error calculation [7].

SE(θ) = 1 / √( Σ_(i=1..N) a_i² P_i(θ) Q_i(θ) )   (3)

A higher standard error indicates that the estimate is not very accurate, while lower values indicate higher confidence in the estimation. This too can be used as a means to discontinue the dialogue or use an alternate decision method.
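The difficulty assignment of Table II and the standard error of equation (3) are straightforward to compute. The following Java sketch is ours (illustrative names only; the actual implementation stores the difficulty as an OWL annotation):

```java
// Sketch of the Table II difficulty scheme and equation (3); names are ours.
public class DifficultyScheme {

    // Map the number of explanation axioms behind a fact to an IRT
    // difficulty value, following Table II.
    public static double difficultyFor(int explanationAxioms) {
        if (explanationAxioms <= 1) return 0.0;  // easy
        if (explanationAxioms <= 3) return 1.0;
        if (explanationAxioms <= 5) return 1.5;  // moderate
        if (explanationAxioms <= 7) return 2.0;
        if (explanationAxioms <= 9) return 2.5;
        return 3.0;                              // hard
    }

    // Equation (3): SE(theta) = 1 / sqrt(sum_i a_i^2 P_i Q_i), given each
    // item's discrimination a_i and probability p_i = P_i(theta) from (1).
    public static double standardError(double[] a, double[] p) {
        double info = 0.0;
        for (int i = 0; i < a.length; i++) {
            info += a[i] * a[i] * p[i] * (1.0 - p[i]);
        }
        return 1.0 / Math.sqrt(info);
    }
}
```

Under this scheme, the 11-axiom explanation of the Boston-bomber inference maps to difficulty 3.0, the highest value in Table II.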
V. IMPLEMENTING THE ONTOLOGY-BASED IVR SYSTEM FOR ENTRY CONTROL

Here, we present a prototype of our system showing the major components. It is not yet validated as a deployable system, but it works for the sample use case.

TABLE II. IRT DIFFICULTY ASSIGNMENT BASED ON NUMBER OF AXIOMS IN EXPLANATION

Number of explanations | IRT Difficulty
1                      | 0    (Easy)
2-3                    | 1
4-5                    | 1.5  (Moderate)
6-7                    | 2
8-9                    | 2.5
>=10                   | 3    (Hard)

Algorithm 1: IRT ability estimation
Input: a priori theta, difficulty, discrimination, answer
Output: posteriori theta, standard error
/* calculate theta and standard error */
1: for (counter < items.length) do
2:   itemDifficulty = parseFloat(difficultyArray[i]);
3:   itemDiscrimination = parseFloat(discriminationArray[i]);
4:   answer = parseFloat(answerArray[i]);
5:   probTheta = calculateProbability(itemDiscrimination, aTheta, itemDifficulty); // equation (1)
6:   thetaSplus1 = calculateTheta(probTheta, thetaS); // equation (2)
7: endfor;
8: estimatedTheta = thetaSplus1;
9: return thetaSplus1;

Fig. 6. Algorithm for ability estimation in IRT

1) Voice Platform (Voxeo)

We use Voxeo's Prophecy local server as our voice platform for voice recognition and to run the dialogues. Java, Java Server Pages (JSP) and JavaScript (JS) are used to implement the architecture modules and the ability/trust scores.

Voxeo's Prophecy is a comprehensive, standards-based IVR platform [15]. Some of the capabilities integrated into the platform are: automatic speech recognition, speech synthesis (Text-to-Speech), a Software Implemented Phone (SIP) browser, and libraries to create and deploy IVR or VoIP applications using VXML and CCXML. It supports most server-side languages and has a built-in web server.

2) Item bank

In our work, we start with an ontology and annotate every axiom with an "irt_difficulty" property of value "1". Then we use this ontology in the HermiT reasoner to infer implicit axioms and their explanations. The inferred facts are themselves annotated with the "irt_difficulty" property, with values calculated by factoring the number of explanation axioms using the scheme stated in Table II.

For example, when annotating the inferred fact "the friends of the Boston Attack Bomber", which has an explanation that includes the 11 axioms shown in Fig. 2, the irt_difficulty annotation would be "3.0", which is the highest value on the scale of IRT difficulty parameter values in Table II. We assume that answering a question generated from a high-valued fact is a difficult task. Consequently, if the answer to a question derived from this fact is correct, the ability estimate is impacted more positively than by a correct but easy one, and more negatively if the opposite happens. An example is the asserted axiom that "Boston is located in Massachusetts". Because this is an asserted fact, it is annotated with value "1.0", which makes a question generated from it an easy one that does not affect the ability estimate greatly.

This process essentially generates the item bank, in CAT/IRT terminology. Each item in the item bank contains a question, an answer and IRT parameters. In addition to saving it as an ontology in any of the supported formats, this item bank can also be supported by using a more specialized CAT/IRT platform like Cambridge University's Concerto [16].

3) Generating dialogues from an ontology

"Welcome to the United States. To accelerate your entry, we will appreciate your responses to some questions to verify your identity and eligibility."

Fig. 7. A sample homeland security VoiceXML greeting form

The conversation starts with a menu in VoiceXML hosted on the local Voxeo Prophecy web server. The voice browser connects to the web server and converts text to speech and speech to text. Fig. 7 shows a sample VoiceXML greeting.

Fig. 8 shows our algorithm integrating ontology, IVR and IRT. This algorithm was successfully implemented using JavaScript and Java Server Pages (JSP) embedded in VoiceXML pages. The main steps are as follows:
• Load the ontology and parse the XML into a Document Object Model (DOM).
• Extract each axiom's triple (subject, property, object).
• Extract the axiom's IRT difficulty value from the annotation.
• Establish a VoiceXML loop that synthesizes a question from string or text values to speech (TTS). The question consists of an auxiliary verb, object, property and subject, to test the correctness of an axiom.
• The system waits for a response. If there is one, it converts it to text and recognizes it. If it adheres to the grammar, then a value is assigned as an answer.
• If there was no answer, then VXML re-prompts the question up to a programmed number of times. If this is exceeded, then an appropriate VXML form is executed.
• If the answer is correct ("yes" or "true"), a value of "1" is assigned. If not, a "0" is assigned.
• The vector of binary answers is used to estimate the IRT ability.
• The IRT ability estimation algorithm, as illustrated in Fig. 6, takes the variables answer vector, a priori θ, difficulty and discrimination, and calculates a posteriori θ.
• The loop continues until a threshold of θ or the maximum number of questions is reached.
• The last posteriori θ in the loop is the estimated user's ability θ and can be compared to a threshold value set by an administrator. Access is granted if (θ > threshold) and denied otherwise.
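The steps above can be condensed into a minimal Java sketch of the evaluation loop. This is our simplification for exposition: the real implementation runs as JavaScript/JSP inside VoiceXML pages, the per-item update below is equation (2) restricted to a single item with the paper's defaults a = 1 and c = 0, and all names are ours:

```java
// Sketch (ours) of the dialogue evaluation loop: build a yes/no question
// from each (subject, property, object) triple, score the recognized answer,
// re-estimate theta after every item, and stop once the threshold is passed.
public class DialogueLoop {
    public static boolean evaluate(String[][] triples, double[] difficulty,
                                   boolean[] answers, double threshold,
                                   double theta) {
        for (int i = 0; i < triples.length; i++) {
            // Question text that would be synthesized to speech (TTS).
            String question = "Is " + triples[i][0] + " " + triples[i][1]
                            + " " + triples[i][2] + "?";
            int u = answers[i] ? 1 : 0;   // stands in for voice recognition
            // Single-item MLE step: equation (2) with a = 1, c = 0.
            double p = 1.0 / (1.0 + Math.exp(-(theta - difficulty[i])));
            theta = theta + (u - p) / (p * (1.0 - p));
            if (theta > threshold) return true;   // permit early
        }
        return theta > threshold;                 // otherwise deny
    }
}
```

With a single easy item (difficulty 0) and threshold 1.0, a correct answer drives θ from 0 to 2.0 and grants access, while an incorrect answer drives it to −2.0 and denies it.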
Algorithm 2: dialogue access evaluation
Input: a priori theta, difficulty, discrimination, answer
Output: access control decision
/* make access control decision from ontology */
1: domDocument = parse(ontology); // DOM
2: subjectArray = getAxiomSubject(axiom);
3: propertyArray = getAxiomProperty(axiom);
4: objectArray = getAxiomObject(axiom);
5: difficultyArray = getAxiomDifficulty(axiom);
6: /* use VoiceXML and JSP to generate the dialog */
7: for (counter < items.length) do
8:   <prompt> = '[auxiliary verb]' + propertyArray[i] + " " + objectArray[i] + " " + subjectArray[i];
9:   <field> = user_utterance;
10:  response[i] = Field.voiceRecognition(user_utterance);
11:  if response[i] = 'Yes' or 'true'
12:    resultVector[i] = 1;
13:  else
14:    resultVector[i] = 0;
15: endfor;
16: theta = IRT_algorithm(resultVector, difficulty, discrimination, aPrioriTheta);
17: if theta > thetaThreshold
18:   permit;
19: else
20:   deny;

Fig. 8. Ontology-IVR algorithm with IRT

VI. EXPERIMENTAL RESULTS

Our implementation shows that efficient dialogs can be generated from ontologies that have been enhanced with IRT attributes. The successful application of IRT to dialogues of questions and answers shortens the number of questions necessary to reach an accurate estimation of a subject's ability, knowledge or trust by at least 50%, as has already been shown in the IRT literature [4, 5]. This reduction in the number of questions necessary to estimate ability produces shorter dialogs without losing accuracy. Also, the use of IRT enables multiple stopping criteria, such as a fixed number of questions or a time limit, an ability threshold, and a standard error confidence interval. The availability of a large number of ontology axioms enables generating a set of questions different from the set generated for the user immediately after the current one, preserving privacy and protecting against question exposure, especially in voice systems. The success of a dialog system depends upon multiple timing factors and the scalability of supporting multiple users. Our on-going research addresses these two aspects.

VII. CONCLUSION

We have designed and implemented a novel IVR system that can dynamically generate efficient interactive voice dialogs from ontologies for entry control. We have used IRT to generate shorter dialogues between the system and a human speaker. IRT is useful in compensating for inaccurate voice recognition of answers during dialogs, or for accidental mistakes. Our entry control decisions are made based on an estimation of a level of trust in a subject, derived from the importance or relevance of axioms in an ontology. The use of IRT also enables the reordering of questions with the purpose of preserving privacy in IVR systems. With advancements in the fields of mobile, cloud and cloud-based voice recognition, such systems become important in defence and physical security applications [17, 18, 19].

REFERENCES

[1] M. Ababneh, D. Wijesekera, J. B. Michael, "A Policy-based Dialogue System for Physical Access Control", 7th International Conference on Semantic Technology for Intelligence, Defense, and Security (STIDS 2012), Fairfax, VA, October 24-25, 2012.
[2] M. Ababneh, D. Wijesekera, "Dynamically Generating Policy Compliant Dialogues for Physical Access Control", CENTERIS 2013 - Conference on Enterprise Information Systems - aligning technology, organizations and people, Lisbon, Portugal, October 23-25, 2013.
[3] F. B. Baker, The Basics of Item Response Theory, ERIC Clearinghouse on Assessment and Evaluation, 2001.
[4] D. J. Weiss, G. G. Kingsbury, "Application of computerized adaptive testing to educational problems", Journal of Educational Measurement, 21, 361-375, 1984.
[5] H. Wainer, Computerized Adaptive Testing: A Primer, Second Edition, Lawrence Erlbaum Associates, 2000.
[6] M. Bishop, Computer Security: Art and Science, Addison Wesley, 2002.
[7] XACML, OASIS, URL: https://www.oasis-open.org/committees/tc_home.php?wgabbrev=xacml, accessed September 30, 2013.
[8] T. R. Gruber, "A translation approach to portable ontologies", Knowledge Acquisition, 5(2):199-220, 1993.
[9] W3C, OWL 2 Web Ontology Language Primer, URL: http://www.w3.org/TR/owl2-primer/, accessed August 22, 2013.
[10] W3C, SPARQL Protocol and RDF Query Language, URL: http://www.w3.org/2009/sparql/, accessed August 22, 2013.
[11] OWL API reasoner list, URL: http://owlapi.sourceforge.net/reasoners.html, accessed August 22, 2013.
[12] HermiT reasoner, URL: http://hermit-reasoner.com/, accessed August 22, 2013.
[13] W3C Voice Browser Working Group, URL: http://www.w3.org/Voice, accessed August 22, 2013.
[14] W3C, Voice Extensible Markup Language (VoiceXML/VXML), URL: http://www.w3.org/Voice/, accessed August 22, 2013.
[15] Voxeo web site, URL: http://www.Voxeo.com, accessed August 22, 2013.
[16] Concerto IRT Platform, URL: http://www.psychometrics.cam.ac.uk/page/338/concerto-testing-platform, accessed August 22, 2013.
[17] Microsoft Windows Phone Speech, URL: http://www.windowsphone.com/en-us/how-to/wp7/basics/use-speech-on-my-phone, accessed September 3, 2013.
[18] Apple Siri, URL: http://www.apple.com/ios/siri, accessed September 3, 2013.
[19] Google Android Mobile Search, URL: http://www.google.com/mobile/search/, accessed September 3, 2013.