Mining Ontologies for Analogy Questions: A Similarity-based Approach

Tahani Alsubait, Bijan Parsia, and Uli Sattler
School of Computer Science, The University of Manchester, United Kingdom
{alsubait,bparsia,sattler}@cs.man.ac.uk

Abstract. In this paper, we propose a new approach to generate analogy questions of the form "A is to B as ... is to ?" from ontologies. Analogy questions are widely used in multiple-choice tests such as SATs and GREs and are used to assess students' higher cognitive abilities. The design, implementation and evaluation of the new approach are presented in this paper. The results show that mining ontologies for such questions is fruitful.

1 Introduction

Learning may be seen as its own reward; however, assessment is usually required to provide various types of reward and recognition. This notion of assessment is usually referred to as summative assessment, in contrast to formative assessment, which mainly provides feedback to students to support the learning process.

Assessment items (i.e. questions) can be classified into two widely used formats: (i) objective (e.g. Multiple Choice Questions (MCQs) or True/False questions) and (ii) subjective (e.g. essays or short answers). Each family of questions has its own advantages and disadvantages w.r.t. the different phases of testing (i.e. setting, taking and marking). On the one hand, objective tests can be used to assess a broad range of knowledge and yet require less administration time. In addition, they are scored easily, quickly and objectively, either manually or automatically, and can be used to provide instant feedback to test takers. On the other hand, objective questions are hard to prepare and require considerable time per question [26]. For example, Davis [8] and Lowman [19] pointed out that even professional test developers cannot prepare more than 3-4 items per day. In addition to the considerable preparation time, manual construction of MCQs does not necessarily imply that they are well-constructed; see, for example, the study carried out by Paxton [23], who analysed a large number of MCQs and reported that they are often not well-constructed.

A major challenge in preparing MCQs is the need for good distractors, which should appear as plausible answers to the question for those students who have not achieved the objective being assessed. At the same time, distractors should appear as implausible answers for those students who have achieved the objective [3]. Moreover, a well-written MCQ is a question that does not confuse students and yields scores that can be used to determine the extent to which students have achieved the educational objectives [3, 17]. Many guidelines have been proposed to ensure the effectiveness of distractors; however, major issues such as the appropriate number of distractors remain debatable [10, 23].

Before the effectiveness of MCQs is discussed further, the difficulty of evaluating such questions should be mentioned. The difficulty of evaluation lies, among other things, in the need to administer the questions to real students in normal settings and to analyse their grades according to well-defined procedures. For example, one can follow the procedures described in Item Response Theory (IRT) [16, 21, 20], a theory that explains the statistical behavior of good and bad questions.
According to IRT, good test questions have the following three characteristics: (i) they make it hard for students to guess the correct answer, (ii) they discriminate properly between good and poor students, and (iii) different questions in the test have different difficulties. In addition to these characteristics of good questions, we should also mention the need for questions that assess different levels of learning objectives. For example, a test developer might be interested in knowing the level to which a student has achieved a learning objective (ranging from the ability to recall information to the ability to analyse and judge the provided information) [2]. Note that questions addressing a specific level of learning objectives can have different difficulties for a specific set of students. Note also that questions assessing lower-level objectives are not necessarily questions of lower quality, as long as they meet the intended learning objectives [11].

Given the considerable time and effort required to develop MCQs, we propose to automate the generation of these questions using an ontology-based approach. Our motivation for using OWL ontologies in particular is their precise semantics, the available reasoning services and the considerable effort put into their development. One of the promises of representing knowledge in such ontologies is that it can be used for different applications. In this paper, we investigate the potential benefit of ontology-based question generation. Recently, a handful of studies [12, 13, 22, 29, 30, 1] explored the generation of MCQs over ontologies. A brief overview of these approaches is provided in Section 5. A general critique of these approaches is the lack of backing by pedagogic theory, which we try to overcome in this paper. Moreover, most of these approaches generate questions of the type "What is X?" or "Which of the following is an example of X?" based on class-subclass and/or class-individual relationships. Questions of this type can only assess lower levels of learning objectives [2]. Therefore, it is crucial to design approaches capable of generating questions of other types.

In this paper, we present a new approach for generating multiple-choice analogy questions from ontologies. Such questions aim to assess the analogical reasoning ability of students (i.e. the ability to determine the underlying relation between a pair of concepts and to identify a similar pair that has the same underlying relation). We also describe the notion of relational similarity and how to use it to control the difficulty of the generated questions. In addition, we report on some experiments carried out to evaluate the new approach using a large corpus.

2 Preliminaries

To understand the procedure required to generate MCQs, we first present a simple, yet general, definition of MCQs.

Definition 1. A multiple-choice question (MCQ) is a tool that can be used to evaluate whether (or not) our students have achieved a certain learning objective. It consists of the following parts:
– A statement S that introduces a problem to the student (i.e. the stem).
– A number of functional options A = {Ai | 2 ≤ i ≤ max} that can be further divided into two sets:
  1. a number of correct options K = {Km | 1 ≤ m ≤ i} (i.e. keys), and
  2. a number of incorrect options D = {Dn | n := i − m} (i.e. distractors).

To generate good MCQs we need a psychologically plausible theory that guides us in the generation.
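For illustration only, Definition 1 can be read as the following minimal data structure (a sketch in Python; the concrete field names and the validity check are ours and are not part of the definition):

```python
# Illustrative sketch of Definition 1; field names are for exposition only.
from dataclasses import dataclass
from typing import List

@dataclass
class MCQ:
    stem: str                  # the statement S that introduces the problem
    keys: List[str]            # the correct options K (at least one)
    distractors: List[str]     # the incorrect options D

    def __post_init__(self):
        # Definition 1 requires at least two functional options in total.
        if not self.keys or len(self.keys) + len(self.distractors) < 2:
            raise ValueError("an MCQ needs at least one key and at least two options")
```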
In this paper, we propose to use the notion of similarity to control the characteristics of the generated questions. For example, consider a question whose stem is similar to the key and different from the distractors. We would expect students to find this question easy (assuming that they notice the clues provided by the correct answer). Similarly, we would expect the question to be difficult if the stem is very similar to one (or all) of the distractors and different from the key.

There are at least two major types of similarity. Moreover, similarity is different from the general notion of relatedness: for example, we say that cars and fuel are closely related, whereas cars and bicycles are closely similar. The latter notion is usually referred to as semantic similarity. A number of measures have been proposed to measure semantic similarity between concepts; see, for example, [24, 25, 18, 15] for general similarity measures and [7] for a semantic similarity measure designed for DL ontologies. Another important type of similarity is relational similarity [28, 27], which corresponds to similarity in the underlying relations. For example, food is to body as fuel is to car. When two pairs of concepts have a strong relational similarity, we say that they are analogous. In analogical reasoning, we compare two different types of objects and identify points of resemblance.

Different types of similarity can play different roles in question generation. For example, semantic similarity can be used to generate plausible distractors for simple recall questions of the form "What is X?". Also, controlling the degree of similarity between the stem, key and distractors allows us to generate questions of different difficulties. Along similar lines, relational similarity plays a major role in generating questions that assess higher cognitive abilities. As an example of questions that can be generated using our proposed similarity-based approach, we consider analogy questions of the form "X is to Y as:". The alternative answers to such questions take the form "Xi is to Yi", where the key is the pair (Xi, Yi) that has the same underlying relation as the pair (X, Y) in the stem. See Table 1 for a sample multiple-choice analogy question taken from the GRE exam. For our purposes, we define analogy questions as follows (a detailed explanation of the Relatedness and Analogy functions is provided later):

Definition 2. Let Q be a multiple-choice analogy question with stem S = (X, Y), key K = (V, W) and a set of distractors D = {Di = (Ai, Bi) | 1 < i ≤ max}. We assume that Q satisfies the following conditions:
1. The stem S, the key K and each distractor Di are good, i.e. Relatedness(X, Y) ≥ ∆R, Relatedness(V, W) ≥ ∆R and Relatedness(Ai, Bi) ≥ ∆R.
2. The key K is significantly more analogous to S than the distractors, i.e. Analogy(S, K) ≥ Analogy(S, Di) + ∆1.
3. The key K is sufficiently analogous to S, i.e. Analogy(S, K) ≥ ∆2.
4. The distractors are analogous to S to some extent, i.e. Analogy(S, Di) ≥ ∆3.
5. Each distractor Di is unique, i.e. Analogy(S, Di) ≠ Analogy(S, Dj) for i ≠ j.

Table 1. A sample multiple-choice analogy question [9]
Stem: Cat : Mouse
Options: (A) Lion : Tiger  (B) Food : Warmth  (C) Bird : Worm  (D) Dog : Tail
Key: (C) Bird : Worm

We would like to be able to control the difficulty of the generated questions.
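As an illustration of how conditions 1-5 of Definition 2 could be checked mechanically, consider the following sketch (Python). The Relatedness and Analogy functions are assumed to be supplied elsewhere (Definition 3 below gives one candidate for Analogy); the sketch is a plain reading of the definition, not a full implementation:

```python
# Rough check of conditions 1-5 in Definition 2; relatedness() and analogy()
# are assumed to be provided by the caller.
from itertools import combinations
from typing import Callable, Sequence, Tuple

Pair = Tuple[str, str]

def satisfies_definition_2(stem: Pair, key: Pair, distractors: Sequence[Pair],
                           relatedness: Callable[[Pair], float],
                           analogy: Callable[[Pair, Pair], float],
                           delta_r: float, delta_1: float,
                           delta_2: float, delta_3: float) -> bool:
    # Condition 1: stem, key and every distractor are sufficiently related pairs.
    if any(relatedness(p) < delta_r for p in [stem, key, *distractors]):
        return False
    key_score = analogy(stem, key)
    distractor_scores = [analogy(stem, d) for d in distractors]
    # Condition 3: the key is sufficiently analogous to the stem.
    if key_score < delta_2:
        return False
    # Condition 2: the key is significantly more analogous to the stem than any distractor.
    if any(key_score < score + delta_1 for score in distractor_scores):
        return False
    # Condition 4: every distractor is analogous to the stem to some extent.
    if any(score < delta_3 for score in distractor_scores):
        return False
    # Condition 5: distractors are unique w.r.t. their analogy scores.
    if any(a == b for a, b in combinations(distractor_scores, 2)):
        return False
    return True
```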
According to Definition 2 and Proposition 1 below, we can control the difficulty of Q by increasing or decreasing ∆1, ∆2 and ∆3.

Proposition 1.
(a) Increasing ∆1 decreases the difficulty of Q.
(b) Increasing ∆2 decreases the difficulty of Q.
(c) Decreasing ∆3 decreases the difficulty of Q.

To generate analogy questions, we need to define two functions, Relatedness and Analogy. A very basic example of a Relatedness function is to consider concepts that are both referenced in one (or more) of the axioms of the source ontology as sufficiently related (e.g. X ⊑ ∃r.Y → Relatedness(X, Y) > 0). However, such a syntax-based notion of relatedness is sensitive to tautologies and therefore cannot be adopted without further consideration. For simplicity, we currently adopt a simple relatedness notion that considers a pair of named classes to be sufficiently related if they have one of the structures in Figure 1. These structures have at most one change of direction in the path connecting the two nodes and at most two steps in each direction. Other structures were discarded to avoid overly difficult (and probably confusing) questions. While in the most general case one should consider pairs of arbitrarily related classes (e.g. by considering user-defined relations), for current purposes we only consider class-subclass relations. This simplifies the problem considerably in several dimensions while still generating a reasonable number of candidate pairs (as we will see later).

Fig. 1. Closely related structures of class-subclass relations (labels represent the number of steps and the direction, up or down, in the path that connects the two concepts)

In addition, we need to define the function Analogy, which is the core function for generating multiple-choice analogy questions. This function is defined as follows:

Definition 3. Let Analogy(x, y) be the function that takes two pairs of concepts and returns a numerical score for their analogy value according to the equation:

Analogy(x, y) = \frac{SharedSteps(x, y)}{TotalSteps(x, y)} \times \frac{SharedDirections(x, y)}{TotalDirections(x, y)}    (1)

3 Extracting Analogy Questions from Ontologies

One of the questions that arises here is: how many MCQs can be generated from a given ontology? To answer this question, we first need to determine which parts of the ontology (i.e. classes, individuals, properties and annotations) will be considered in the generation process. Secondly, we need to determine whether (or not) a filtering mechanism is used to differentiate between good and bad questions and to generate only those questions that are expected to be good. As an example, equation (2) can be used to count the number of possible multiple-choice questions of the form "What is [class name]?" with one key and three distractors (all class names), assuming that no filtering mechanism is used, where n is the number of classes in the given ontology and Ti is the number of correct answers (i.e. super-classes) for class i:

\sum_{i=1}^{n} \binom{T_i}{1} \times \binom{n - 1 - T_i}{3}    (2)

Needless to say, the number of questions increases rapidly as n grows (see Figure 2 for some examples). It reaches its maximum value when Ti equals (n − 1)/4 (i.e. the ratio of correct answers to wrong answers is 1:3). The number of possible questions that can be generated from a given ontology can be increased further if we consider other parts of the ontology (e.g. individuals, properties).
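For concreteness, the following sketch illustrates equations (1) and (2). It assumes that each pair of concepts is summarised by the number of up and down steps on the class-subclass path connecting its two concepts (cf. Figure 1); the particular reading of SharedSteps, TotalSteps, SharedDirections and TotalDirections as overlaps between these step profiles is an assumption made for exposition rather than a normative definition:

```python
# Sketch of equations (1) and (2); the step-profile reading of Shared*/Total* is an assumption.
from dataclasses import dataclass
from math import comb

@dataclass(frozen=True)
class PathProfile:
    up: int    # number of steps towards superclasses
    down: int  # number of steps towards subclasses

def analogy(x: PathProfile, y: PathProfile) -> float:
    shared_steps = min(x.up, y.up) + min(x.down, y.down)
    total_steps = max(x.up, y.up) + max(x.down, y.down)
    directions = [(x.up, y.up), (x.down, y.down)]
    shared_directions = sum(1 for a, b in directions if a > 0 and b > 0)
    total_directions = sum(1 for a, b in directions if a > 0 or b > 0)
    if total_steps == 0 or total_directions == 0:
        return 0.0
    return (shared_steps / total_steps) * (shared_directions / total_directions)

# Equation (2): number of possible 1-key/3-distractor "What is X?" questions,
# where superclass_counts[i] is T_i and n is the total number of classes.
def possible_questions(superclass_counts: list) -> int:
    n = len(superclass_counts)
    return sum(comb(t, 1) * comb(n - 1 - t, 3) for t in superclass_counts)

# Example: two pairs that each take exactly one step upwards are maximally analogous.
assert analogy(PathProfile(1, 0), PathProfile(1, 0)) == 1.0
```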
Having said this, we should also mention that generating a large number of questions is not desirable unless the generated questions are all expected to be good. A similar analysis of the number of possible analogy questions is part of future work.

Fig. 2. Number of possible questions from some BioPortal ontologies

In what follows, we provide an algorithm (see Algorithm 1) that can be used to generate multiple-choice analogy questions from a given ontology O. The algorithm is founded on the premise that varying the relational similarity (i.e. the analogy degree) between the stem, the key and the distractors allows us to control the difficulty of the generated questions. This can be achieved by setting the parameters ∆1, ∆2 and ∆3 to different values. The proposed approach consists of two phases: (i) extraction of interesting pairs of concepts, which are identified using the proposed Relatedness function; these pairs can be used as stems, keys or distractors; and (ii) generation of multiple-choice questions based on the similarity between pairs, which is derived from the proposed Analogy function. Note that this approach can be generalized to generate other types of questions, such as finding antonyms or the odd one out.

4 Empirical Evaluation

To evaluate the proposed approach, we implemented a question generation engine that utilizes Algorithm 1 and used it to generate analogy questions from three ontologies (one specialized ontology and two tutorial-based ontologies). The three ontologies are presented in Table 2 below with some basic ontology statistics. The first ontology is the Gene Ontology, a structured vocabulary for the annotation of gene products. It has three main parts: (i) molecular function, (ii) cellular component and (iii) biological process. The other two ontologies are the People & Pets Ontology and the Pizza Ontology, which are very simple ontologies that are often used in ontology development tutorials. The table shows the number of classes in each ontology and the number of sample questions generated by the engine (this is only a representative sample of all the possible questions). The table also shows the percentage of questions that our proposed solver agent can solve correctly; the approach used to simulate question solving is explained below. Other ontologies can be used as input for our question generation engine; however, we tried to avoid ontologies that use difficult-to-read labels (e.g. labels that have no spaces between words).

Table 2. Ontologies used to generate analogy questions along with basic statistics
Ontology         No. of classes   No. of questions   % correct
Gene Ontology    36146            25                 8%
People & Pets    58               15                 67%
Pizza Ontology   97               16                 88%

In order to evaluate the proposed similarity-based approach defined in Algorithm 1, we need at least to simulate students solving the generated questions and to check whether (or not) the proposed approach can successfully control the difficulty of the questions. To do this, we follow the method described by Turney & Littman [28, 27] for evaluating analogies using a large corpus. In their study, Turney & Littman reported that their method can solve about 47% of multiple-choice analogy questions (compared to an average of 57% correct answers achieved by high school students). The solver takes a pair of words representing the stem of the question and five other pairs representing the answers presented to students.
Their proposed method is inspired by the Vector Space Model (VSM) of information retrieval. For each candidate answer, the solver creates two vectors, one representing the stem (R1) and one representing the given answer (R2), and returns a numerical value for the degree of analogy between the stem and that answer. The answers are then ranked according to their analogy values, and the highest-ranked answer is taken to be the correct one. To create the vectors, Turney & Littman proposed a table of 64 joining terms that can be used to join the two words in each pair (stem or answer). The two words are joined by these joining terms in two different ways (e.g. "X is Y" and "Y is X") to create a vector of 128 features. The actual values stored in each vector are calculated by counting the frequencies of the constructed phrases in a large corpus (e.g. web resources indexed by a search engine). To improve the accuracy of the method, they suggested using the logarithm of the frequency instead of the raw frequency.

In this paper, we follow a similar procedure to evaluate the difficulty of our generated MCQs. First, we constructed a table of joining terms relevant to the relations considered in our approach (e.g. "is a", "type", "and", "or"). Based on these joining terms, we create vectors of 10 features for the stem, the key and each distractor. The constructed phrases are sent as queries to a search engine (Yahoo!) and the logarithm of the hit count is stored in the corresponding element of the vector. The hit count is always incremented by one to avoid undefined values. Following this procedure, our proposed solver agent solved 8% of the questions generated from the Gene Ontology, 67% of the questions generated from the People & Pets Ontology and 88% of the questions generated from the Pizza Ontology. We attribute the poor performance on the Gene Ontology to its specific terminology and to the relative lack of web resources with information about it, compared to the other two ontologies.

Examples of the questions that were generated using our proposed approach are presented in Tables 3 and 4; these questions were generated from the People & Pets ontology and the Pizza ontology, respectively. Moreover, we varied the difficulty-control parameters to generate different sets of questions (i.e. questions of different difficulties) from the two tutorial ontologies. The results (see Table 5) show that the proposed parameters sufficiently controlled the difficulty of the generated questions.

Table 3. A sample question generated from the People & Pets Ontology
Stem: Haulage Truck Driver : Driver
Options: (A) Quality Broadsheet : Newspaper  (B) Giraffe : Sheep  (C) Bus : Vehicle  (D) Giraffe : Cat Liker
Key: (C) Bus : Vehicle

Table 4. A sample question generated from the Pizza Ontology
Stem: Sloppy Giuseppe : Pizza De Carne
Options: (A) Cogumelo : Pizza Vegetariana  (B) Pizza : Food  (C) Cogumelo : Napoletana  (D) Cogumelo : Sorvete
Key: (B) Pizza : Food

Table 5. Difference in the percentage of correctly answered questions (generated from the People & Pets Ontology and the Pizza Ontology)
Ontology         Type of questions      ∆1    ∆2    ∆3    No. of questions   % correct
People & Pets    Decreased difficulty   0.5   1     0     15                 67%
People & Pets    Increased difficulty   0.75  0.5   0.1   11                 27%
Pizza Ontology   Decreased difficulty   0.5   1     0     16                 88%
Pizza Ontology   Increased difficulty   0.75  0.5   0.1   16                 50%
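For concreteness, the solving step used to obtain the results in Tables 2 and 5 can be sketched as follows. The sketch assumes five joining terms (so that the two word orders yield the 10 features mentioned above), a placeholder hit_count() standing in for the search-engine query, and cosine similarity for ranking; the exact joining-term list and ranking function are illustrative assumptions:

```python
# Rough sketch of the vector-based solver; joining terms, hit_count()
# and the cosine ranking are illustrative assumptions.
import math

JOINING_TERMS = ["is a", "type", "and", "or", "is"]  # 5 terms x 2 orders = 10 features

def hit_count(query: str) -> int:
    """Placeholder for a web search engine hit count (not implemented here)."""
    raise NotImplementedError

def pair_vector(x: str, y: str) -> list:
    # Join the two words in both orders and store log(hits + 1), as described above.
    queries = [f'"{x} {t} {y}"' for t in JOINING_TERMS] + \
              [f'"{y} {t} {x}"' for t in JOINING_TERMS]
    return [math.log(hit_count(q) + 1) for q in queries]

def cosine(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def solve(stem, options) -> str:
    """Return the label of the option whose pair is most analogous to the stem."""
    stem_vec = pair_vector(*stem)
    return max(options, key=lambda label: cosine(stem_vec, pair_vector(*options[label])))

# Example (requires a real hit_count implementation to run):
# solve(("cat", "mouse"), {"A": ("lion", "tiger"), "C": ("bird", "worm")})
```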
5 Related Work

Chung, Niemi, and Bewley (2003) [4] developed the Assessment Design and Delivery System (ADDS). The purpose of ADDS is to assist non-expert physics teachers in designing appropriate assessments by constraining the design process with structure-based and cognitive-based rules derived from an ontology that was specifically designed for the system. In addition, the ADDS domain ontology has links to a set of reusable assessment tasks or components of tasks (i.e. text, graphics, multimedia), along with information to guide teachers' practice.

Holohan et al. (2005) [12] described the OntAWare system, which is an ontology-based authoring environment for learning content. It employs an ontology graph traversal algorithm that generates MCQs of the form "Which of the following items is (or is not) an example of the concept X?". The alternative answers are generated randomly, and the question as a whole can be exported to external systems that conform to the IMS/QTI standard [14]. One of the central problems with OntAWare, other than the highly constrained forms of questions, is that the ontology graph transformations employed in the system are hard-coded (in Java) to incorporate implicit instructional strategies, and therefore the approach cannot readily be generalized and adopted in other systems. Holohan et al. (2006) [13] extended this work by focusing on the generation of SQL exercise problems for database students using domain-dependent algorithms.

Stankov and Zitko (2008) [29] proposed templates and algorithms for the automatic generation of objective questions (i.e. MCQs and True/False questions) over ontologies. The focus of their work was to extend the functionality of a previously implemented tutoring system (Tex-Sys) by concentrating on the assessment component. The proposed methodology generates a set of random alternative answers for each MCQ without attempting to filter them according to their pedagogical appropriateness.

Papasalouros et al. (2008) [22] presented various ontology-based strategies for the automatic generation of MCQs. These strategies are used for selecting the correct and wrong (distracting) answers of the questions. The answers are later transformed into English sentences using simple natural language generation techniques. An evaluation of the produced questions by domain experts showed that the questions are satisfactory for assessment, but not all of them are syntactically correct. The major problem with this approach is the use of highly constrained rules with no theoretical backing that would motivate the selection of these rules. For example, the distractors in each MCQ are mainly picked from the set of siblings of the correct answer, while there might be other plausible distractors.

Cubric and Tosic (2009) [5] reported their experience in implementing a Protégé plugin for question generation based on the strategies proposed by Papasalouros et al. (2008) [22]. More recently, Cubric and Tosic (2010) [6] extended their previous work by considering additional ontology elements (i.e. annotations). In addition, they suggested employing question templates to avoid syntactic problems in the automatically generated questions. This also enables the generation of questions at different levels of Bloom's taxonomy [2].

6 Conclusion and Future Work

A handful of studies have already proposed approaches to generate MCQs over ontologies; however, little has been done on the theoretical and evaluation aspects. In this paper, we propose a new approach to generate multiple-choice analogy questions from ontologies. The paper describes the foundations of the proposed approach from a psychological point of view.
In addition, the paper reports on evaluations carried out to assess the proposed approach. The results show that mining ontologies for analogy questions in particular, and for assessment questions in general, is fruitful. Moreover, the results show that the proposed approach can be used to control the difficulty of the generated questions.

For future work, we aim to generalize the proposed approach for generating analogies and to consider arbitrary relations found in existing ontologies (i.e. user-defined relations instead of only class-superclass relations). To evaluate such analogies, we suggest using Latent Relational Analysis [27], which has the ability to learn relations instead of relying on predefined joining terms.

References

1. M. Al-Yahya. OntoQue: A question generation engine for educational assessment based on domain ontologies. In 11th IEEE International Conference on Advanced Learning Technologies, 2011.
2. B. S. Bloom and D. R. Krathwohl. Taxonomy of Educational Objectives: The Classification of Educational Goals by a Committee of College and University Examiners. Handbook 1: Cognitive Domain. New York: Addison-Wesley, 1956.
3. S. Burton, R. Sudweeks, P. Merrill, and B. Wood. How to prepare better multiple-choice test items: Guidelines for university faculty. Brigham Young University Testing Services and the Department of Instructional Science. Retrieved November 22, 2011, from http://testing.byu.edu/info/handbooks/betteritems.pdf, 1991.
4. G. Chung, D. Niemi, and W. L. Bewley. Assessment applications of ontologies. Paper presented at the Annual Meeting of the American Educational Research Association, 2003.
5. M. Cubric and M. Tosic. SEmcq: Protégé plugin for automatic ontology-driven multiple choice question tests generation. In 11th Intl. Protégé Conference, Poster and Demo Session, 2009.
6. M. Cubric and M. Tosic. Towards automatic generation of e-assessment using semantic web technologies. In Proceedings of the 2010 International Computer Assisted Assessment Conference, University of Southampton, July 2010.
7. C. d'Amato, S. Staab, and N. Fanizzi. On the influence of description logics ontologies on conceptual similarity. In EKAW 2008: Proceedings of the 16th International Conference on Knowledge Engineering: Practice and Patterns, 2008.
8. B. B. Davis. Tools for Teaching. San Francisco, CA: Jossey-Bass, 2001.
9. GRE Sample Questions: Best sample questions. Retrieved March 10, 2012, from http://www.bestsamplequestions.com/gre-questions/analogies/.
10. T. M. Haladyna and S. M. Downing. How many options is enough for a multiple-choice test item? Educational & Psychological Measurement, 53(4):999-1010, 1993.
11. M. Hufler, M. Al-Smadi, and C. G. Investigating content quality of automatically and manually generated questions to support self-directed learning. In Whitelock, D., Warburton, W., Wills, G., and Gilbert, L. (Eds.), CAA 2011 International Computer Assisted Assessment Conference, University of Southampton, 2011.
12. E. Holohan et al. Adaptive e-learning content generation based on semantic web technology. In Proceedings of the Workshop on Applications of Semantic Web Technologies for e-Learning, pages 29-36, Amsterdam, The Netherlands, 2005.
13. E. Holohan et al. The generation of e-learning exercise problems from subject ontologies. In Proceedings of the Sixth IEEE International Conference on Advanced Learning Technologies, pages 967-969, 2006.
14. IMS. IMS Question & Test Interoperability: ASI Best Practice & Implementation Guide, Final Specification Version 1.2. IMS Global Learning Consortium Inc., June 2002.
15. J. Jiang and D. Conrath. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the 10th International Conference on Research on Computational Linguistics, Taiwan, 1997.
16. J. Kehoe. Basic item analysis for multiple-choice tests. Practical Assessment, Research & Evaluation, 4(10), 1995.
17. K. King, D. Gardner, S. Zucker, and M. Jorgensen. The distractor rationale taxonomy: Enhancing multiple-choice items in reading and mathematics. Assessment Report. Pearson, July 2004.
18. D. Lin. An information-theoretic definition of similarity. In Proceedings of the 15th International Conference on Machine Learning, pages 296-304, San Francisco, CA, 1998. Morgan Kaufmann.
19. J. Lowman. Mastering the Techniques of Teaching (2nd ed.). San Francisco: Jossey-Bass, 1995.
20. M. Miller, R. Linn, and N. Gronlund. Measurement and Assessment in Teaching, Tenth Edition. Pearson, 2008.
21. R. Mitkov, L. An Ha, and N. Karamanis. A computer-aided environment for generating multiple-choice test items. Natural Language Engineering, 12(2):177-194, 2006.
22. A. Papasalouros, K. Kotis, and K. Kanaris. Automatic generation of multiple-choice questions from domain ontologies. In IADIS e-Learning 2008 Conference, Amsterdam, 2008.
23. M. Paxton. A linguistic perspective on multiple choice questioning. Assessment & Evaluation in Higher Education, 25(2):109-119, 2001.
24. R. Rada, H. Mili, E. Bicknell, and M. Blettner. Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man, and Cybernetics, 19(1):17-30, 1989.
25. P. Resnik. Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI'95), volume 1, pages 448-453, 1995.
26. J. T. Sidick, G. V. Barrett, and D. Doverspike. Three-alternative multiple-choice tests: An attractive option. Personnel Psychology, 47:829-835, 1994.
27. P. Turney. Measuring semantic similarity by latent relational analysis. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005.
28. P. Turney and M. Littman. Corpus-based learning of analogies and semantic relations. Machine Learning, 60(1-3):251-278, 2005.
29. B. Zitko, S. Stankov, M. Rosić, and A. Grubišić. Dynamic test generation over ontology-based knowledge representation in authoring shell. Expert Systems with Applications, 36(4):8185-8196, 2008.
30. K. Zoumpatianos, A. Papasalouros, and K. Kotis. Automated transformation of SWRL rules into multiple-choice questions. In Proceedings of the FLAIRS Conference, 2011.