<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Generating educational assessment items from Linked Open Data: the case of DBpedia</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Muriel</forename><surname>Foulonneau</surname></persName>
							<email>muriel.foulonneau@tudor.lu</email>
							<affiliation key="aff0">
								<orgName type="institution">Tudor Research Centre</orgName>
								<address>
									<addrLine>29, av. John F. Kennedy</addrLine>
									<postCode>L-1855</postCode>
									<settlement>Luxembourg</settlement>
									<country key="LU">Luxembourg</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Generating educational assessment items from Linked Open Data: the case of DBpedia</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">43DA23C8CA7D0A3D03A457490A7B720F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T21:49+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Linked Data</term>
					<term>open data</term>
					<term>DBpedia</term>
					<term>eLearning</term>
					<term>e-assessment</term>
					<term>formative assessment</term>
					<term>assessment item generation</term>
					<term>data quality</term>
					<term>IMS-QTI</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This work uses Linked Open Data for the generation of educational assessment items. We describe the pipeline used to create variables and populate simple choice item models using the IMS-QTI standard. The generated items were then imported into an assessment platform. Five item models were tested. They allowed us to identify the main challenges in improving the usability of Linked Data sources for the generation of formative assessment items, in particular data quality issues and the identification of relevant sub-graphs for the generation of item variables.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Assessment plays a very important role in education. Tests are created to evaluate what students have learned in class, to assess their level at the beginning of a cycle, to enter a prestigious university, or even to obtain a degree. Assessment is also increasingly praised for its contribution to the learning process through formative assessment (i.e., assessment to learn, not to measure) and/or self-assessment, whereby the concept of a third party controlling the acquisition of knowledge is taken out of the assessment process entirely. The role of assessment in the learning process has widened considerably. The New York Times even recently published an article entitled "To Really Learn, Quit Studying and Take a Test" <ref type="bibr" target="#b0">[1]</ref>, reporting on a study by Karpicke et al. <ref type="bibr" target="#b1">[2]</ref> which suggests that tests are actually the most efficient knowledge acquisition method.</p><p>The development of e-assessment has been hampered by a number of obstacles, in particular the time and effort necessary to create assessment items (i.e., test questions) <ref type="bibr" target="#b2">[3]</ref>. Automatic or semi-automatic item generation has therefore gained attention in recent years. Item generation consists in using an item model to create multiple items automatically or semi-automatically from that model.</p><p>The Semantic Web can provide relevant resources for the generation of assessment items because it includes models of factual knowledge and structured datasets for the generation of item model variables. Moreover, through the interlinking of different data sources, it can provide links to relevant learning resources.</p><p>Using a heterogeneous factbase to support the learning process, however, raises issues related, for instance, to potential disparities in data quality. We implemented a pipeline to generate simple choice items from DBpedia. Our work aims to identify the potential difficulties and the feasibility of using Linked Open Data to generate items for low-stakes assessment, in this case formative assessment.</p><p>We present existing approaches to the creation of item variables, the construction of the assessment item creation pipeline, and an experiment applying the process to generate five sets of items.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Existing work</head><p>Item generation consists in creating multiple instances of items based on an item model. The item model defines variables, i.e., the parts which change for each generated item. There are different approaches to the generation of variables, depending on the type of items under consideration.</p><p>In order to fill item variables for mathematics or science, the creation of computational models is the easiest solution. Other systems use natural language processing (NLP) to generate, for instance, vocabulary questions and cloze questions (fill-in-the-blanks) for language learning formative assessment exercises (<ref type="bibr" target="#b3">[4]</ref>, <ref type="bibr" target="#b4">[5]</ref>, <ref type="bibr" target="#b5">[6]</ref>). Karamanis et al. <ref type="bibr" target="#b6">[7]</ref> also extract questions from medical texts.</p><p>The generation of variables from structured datasets has been experimented with in particular in the domain of language learning. Lin et al. <ref type="bibr" target="#b7">[8]</ref> and Brown et al. <ref type="bibr" target="#b8">[9]</ref>, for instance, generated vocabulary questions from the WordNet dataset, which is now available as RDF data on the Semantic Web. Indeed, the semantic representation of data can help extract relevant variables. Sung et al. <ref type="bibr" target="#b9">[10]</ref> use natural language processing to extract semantic networks from a text and then generate English comprehension items.</p><p>Linnebank et al. <ref type="bibr" target="#b10">[11]</ref> use a domain model as the basis for the generation of entire items. This approach requires experts to elicit knowledge in specifically dedicated models. However, this knowledge often already exists in many data sources (e.g., scientific datasets), contributed by many different experts who would probably never gather for long modeling exercises. Those modeling exercises would have to be repeated over time, as the knowledge of different disciplines evolves. Moreover, in many domains, the classic curricula for which models could potentially be developed and maintained by authorities are not suitable. This is the case for professional knowledge, for instance.</p><p>Given the potential complexity of the models for generating item variables, Liu <ref type="bibr" target="#b11">[12]</ref> defines reusable components for the generation of items (including the heuristics behind the creation of math variables, for instance). Our work complements this approach by including the connection to semantic datasets as sources of variables. Existing approaches to item generation usually focus on language learning <ref type="bibr" target="#b12">[13]</ref> or mathematics and physics, where variables can be created from formulae <ref type="bibr" target="#b13">[14]</ref>. We aim to define approaches applicable to a wider range of domains (e.g., history) by reusing existing interlinked datasets.</p><p>An item model includes a stem, options, and potentially auxiliary information <ref type="bibr" target="#b14">[15]</ref>. Only the stem (i.e., the question) is mandatory. Response options are provided in the case of a multiple choice item. Auxiliary information can be a multimedia resource, for instance. In some cases, other parameters can be adapted, including the feedback provided to candidates after they answer the item.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 1 -Semi-automatic item generation from semantic datasets</head><p>In order to investigate the use of Linked Data as a source of assessment items, we built a pipeline to generate simple choice items from a SPARQL endpoint on the Web. The item generation process is split into the steps detailed in this section. Figure <ref type="figure">1</ref> shows the item model represented as an item template, the queries to extract data from the Semantic Web, the generation of a set of potential variables as a variable store, the organization of all the variable values for each item in data dictionaries, and the creation of items in QTI-XML format from the item template and item data dictionaries.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Creating an IMS QTI-XML template</head><p>In order to generate items which are portable to multiple platforms, it is necessary to format them in IMS-QTI (IMS Question &amp; Test Interoperability Specification) <ref type="foot" target="#foot_0">1</ref>. IMS-QTI is the main standard used to represent assessment items <ref type="bibr" target="#b15">[16]</ref>. It specifies metadata (as a Learning Object Metadata profile), usage data (including psychometric indicators), as well as the structure of items, tests, and test sections. It allows representing multimedia resources in a test. IMS-QTI has an XML serialization.</p><p>&lt;choiceInteraction responseIdentifier="RESPONSE" shuffle="false" maxChoices="1"&gt;
&lt;prompt&gt;What is the capital of {prompt}?&lt;/prompt&gt;
&lt;simpleChoice identifier="{responseCode1}"&gt;{responseOption1}&lt;/simpleChoice&gt;
&lt;simpleChoice identifier="{responseCode2}"&gt;{responseOption2}&lt;/simpleChoice&gt;
&lt;simpleChoice identifier="{responseCode3}"&gt;{responseOption3}&lt;/simpleChoice&gt;
&lt;/choiceInteraction&gt;</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 2 -Extract of the QTI-XML template for a simple choice item</head><p>No dedicated language exists for assessment item templates. We therefore used the syntax of JSON templates for an XML-QTI file (Figure <ref type="figure">2</ref>). All variables are represented by the variable name in curly brackets. Unlike RDF and XML template languages, JSON templates can define variables for an unstructured part of text in a structured document. For instance, in Figure <ref type="figure">2</ref>, the {prompt} variable is only defined in part of the content of the &lt;prompt&gt; XML element. The question itself can thus be stored in the item model, and only the relevant part of the question is represented as a variable.</p></div>
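The curly-bracket substitution described above can be sketched in a few lines of Python (the implementation described in this paper is in Java; the fill_template helper below is an illustrative stand-in):

```python
import re

def fill_template(template: str, data: dict) -> str:
    """Replace each {variable} placeholder with its value from the data
    dictionary; a placeholder may cover only part of an unstructured text."""
    return re.sub(r"\{(\w+)\}", lambda m: str(data[m.group(1)]), template)

# Only the relevant part of the question is a variable:
stem = fill_template("What is the capital of {prompt}?", {"prompt": "Bulgaria"})
print(stem)  # What is the capital of Bulgaria?
```

Because the placeholder is plain text, the same mechanism works inside attribute values and element content alike, which is what makes this approach suitable for QTI-XML.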
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Collecting structured data from the Semantic Web</head><p>In order to generate values for the variables defined in the item template, data sources from the Semantic Web are used. The Semantic Web contains data formatted as RDF. Datasets can be interlinked in order, for instance, to complement the knowledge about a given resource. They can be accessed through browsing, through data dumps, or through a SPARQL interface made available by the data provider. For this experiment, we used the DBpedia SPARQL query interface (Figure <ref type="figure">3</ref>). The query results only provide a variable store from which items can be generated. All the response options are then extracted from the variable store (Figure <ref type="figure">1</ref>).</p><p>SELECT ?country ?capital WHERE {
?c &lt;http://dbpedia.org/property/commonName&gt; ?country .
?c &lt;http://dbpedia.org/property/capital&gt; ?capital
} LIMIT 30</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 3 -SPARQL query to generate capitals in Europe</head><p>Linked Data resources are represented by URIs. However, displaying variables in an assessment item requires finding a suitable label for each concept. In the case presented in Figure <ref type="figure">3</ref>, the ?c variable represents the resource as identified by a URI. The &lt;http://dbpedia.org/property/commonName&gt; property allows finding a suitable label for the country. Since the range of the &lt;http://dbpedia.org/property/capital&gt; property is a literal, it is not necessary to find a distinct label.</p><p>The label is, however, not located in the same property in all datasets and for all resources. In the example of Figure <ref type="figure">3</ref>, we used the property &lt;http://dbpedia.org/property/commonName&gt;, which provides the country names as literals. However, other properties, such as &lt;foaf:name&gt;, are used for the same purpose. In any case, the items always need to be generated from a path in a semantic graph rather than from a single triple. This makes Linked Data particularly relevant, since the datasets can complement each other.</p></div>
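Assuming the endpoint returns the standard SPARQL 1.1 JSON results format, flattening the bindings into a variable store might look as follows (the sample payload is a hand-written stand-in for real DBpedia results):

```python
import json

# Hand-written sample in the SPARQL 1.1 JSON results format.
payload = json.loads("""
{"head": {"vars": ["country", "capital"]},
 "results": {"bindings": [
   {"country": {"type": "literal", "value": "Bulgaria"},
    "capital": {"type": "literal", "value": "Sofia"}},
   {"country": {"type": "literal", "value": "Latvia"},
    "capital": {"type": "literal", "value": "Riga"}}]}}
""")

def to_variable_store(results: dict) -> list:
    """Flatten SPARQL bindings into plain name-to-value rows."""
    names = results["head"]["vars"]
    return [{n: b[n]["value"] for n in names if n in b}
            for b in results["results"]["bindings"]]

store = to_variable_store(payload)
print(store[0])  # {'country': 'Bulgaria', 'capital': 'Sofia'}
```

Each row of the store supplies the stem variable and the correct answer for one item; the remaining rows feed the distractor pool.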
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Generating item distractors</head><p>The SPARQL queries aim to retrieve statements from which the stem variable and the correct answer are extracted. However, a simple or multiple choice item also needs distractors, i.e., the incorrect answers presented as options in the item. In the case of Figure <ref type="figure">3</ref>, the query retrieves different capitals, from which the distractors are randomly selected to generate an item. For instance, the capital of Bulgaria is Sofia; distractors can be Bucharest and Riga.</p></div>
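Random distractor selection from the variable store can be sketched as follows (pick_distractors is a hypothetical helper; the fixed seed is only for reproducibility):

```python
import random

def pick_distractors(correct: str, pool: list, n: int = 2, seed: int = 0) -> list:
    """Select n incorrect options at random, excluding the correct answer
    (and any duplicates of it) from the pool of equivalent values."""
    candidates = sorted({value for value in pool if value != correct})
    return random.Random(seed).sample(candidates, n)

capitals = ["Sofia", "Bucharest", "Riga", "Vilnius", "Madrid"]
options = pick_distractors("Sofia", capitals)
print(options)  # two capitals other than Sofia
```

Drawing distractors from values of the same property keeps the options plausible, which is exactly why defective values in the pool propagate into defective distractors, as measured in Section 5.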
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4">Creating a data dictionary from Linked Data</head><p>The application then stores all the variables for the generated items in data dictionaries. Each item is therefore represented natively by its data dictionary. We created data dictionaries as Java objects designed for the storage of QTI data. We also recorded the data as JSON data dictionaries. In addition to the variables, the data dictionary includes provenance information, such as the creation date and the data source.</p></div>
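A JSON data dictionary of the kind described here, bundling the item variables with provenance, could look like this (field names are illustrative; the implementation described in the paper uses Java objects):

```python
import json
from datetime import date

def make_data_dictionary(variables: dict, source: str) -> str:
    """Serialize item variables plus provenance (creation date, data source)."""
    record = {"variables": variables,
              "provenance": {"created": date.today().isoformat(),
                             "source": source}}
    return json.dumps(record, indent=2)

doc = make_data_dictionary(
    {"prompt": "Bulgaria", "responseOption1": "Sofia",
     "responseOption2": "Bucharest", "responseOption3": "Riga"},
    "http://dbpedia.org/sparql")
```

Keeping the source endpoint in each record is what later enables traceability between generated items and the RDF paths they were derived from.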
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5">Generating QTI Items</head><p>QTI-XML items are then generated from the variables stored in the data dictionary and from the item model formalized as a JSON template. All the variables defined in the model are replaced by the content of the data dictionary. If the stem is a picture, it can be included in the QTI-XML structure as an external link.</p></div>
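Assembling a QTI choiceInteraction from a data dictionary can be sketched as follows (build_choice_interaction is a hypothetical helper mirroring the template of Figure 2, not the paper's Java implementation):

```python
import xml.etree.ElementTree as ET

def build_choice_interaction(data: dict) -> str:
    """Fill the simple-choice interaction of Figure 2 from a data dictionary
    and return its XML serialization."""
    root = ET.Element("choiceInteraction", responseIdentifier="RESPONSE",
                      shuffle="false", maxChoices="1")
    prompt = ET.SubElement(root, "prompt")
    prompt.text = "What is the capital of %s?" % data["prompt"]
    for i in (1, 2, 3):
        choice = ET.SubElement(root, "simpleChoice",
                               identifier=data["responseCode%d" % i])
        choice.text = data["responseOption%d" % i]
    # The result is well-formed XML by construction.
    return ET.tostring(root, encoding="unicode")

item = build_choice_interaction({
    "prompt": "Bulgaria",
    "responseCode1": "A", "responseOption1": "Sofia",
    "responseCode2": "B", "responseOption2": "Bucharest",
    "responseCode3": "C", "responseOption3": "Riga"})
```

Building the element tree programmatically, rather than splicing strings, also escapes any special characters occurring in labels fetched from the Web.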
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">The DBpedia experiment</head><p>In order to validate this process, we tested the generation of assessment items for five single choice item models. We used DBpedia as the main source of variables. The item models illustrate the different difficulties which can be encountered and help assess the usability of Linked Data for the generation of item variables.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">The generation of variables for five item models</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Q1 -What is the capital of { Azerbaijan }?</head><p>The first item model uses the query presented in Figure <ref type="figure">3</ref>. This query uses the http://dbpedia.org/property/ namespace, i.e., the Infobox dataset. This dataset, however, is not built on top of a consistent ontology; it rather transforms the properties used in Wikipedia infoboxes. Therefore, the quality of the data is a potential issue <ref type="foot" target="#foot_1">2</ref>.</p><p>Out of 30 value pairs generated, 3 were not generated for a country (Neuenburg am Rhein, Wain, and Offenburg); for those, the capital was represented by the same literal as the country. Two distinct capitals were found for Swaziland (Mbabane, the administrative capital, and Lobamba, the royal and legislative capital). The Congo is identified as a country, whereas it has since been split into two distinct countries, and its capital Leopoldville has been renamed Kinshasa. The capital of Sri Lanka is a URI, whereas the range of the capital property is usually a de facto literal. Finally, the capital of Nicaragua is represented with technical display instructions ("Managua right|20px"). Overall, 7 value pairs out of 30 were deemed defective.</p></div>
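Several of the defects listed above can be detected automatically. A minimal filter, with hypothetical checks derived from the observed defect patterns, might look like this:

```python
def is_defective(country: str, capital: str) -> bool:
    """Flag value pairs showing the defects observed in the Q1 sample:
    a capital equal to the country name (non-country resources), a URI
    instead of a plain label, or leftover display instructions ('|20px')."""
    return (capital == country
            or capital.startswith("http://")
            or "|" in capital)

pairs = [("Bulgaria", "Sofia"),
         ("Wain", "Wain"),
         ("Nicaragua", "Managua right|20px")]
print([p for p in pairs if is_defective(*p)])
```

Such heuristics cannot catch semantic defects (e.g., an outdated capital), but they remove the obviously unusable value pairs before item generation.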
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Q2 -Which country is represented by this flag?</head><p>SELECT ?flag ?country WHERE {
?c &lt;http://xmlns.com/foaf/0.1/depiction&gt; ?flag .
?c &lt;http://dbpedia.org/property/commonName&gt; ?country .
?c &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;http://dbpedia.org/class/yago/EuropeanCountries&gt;
} LIMIT 30</p><p>Q2 uses the Infobox dataset to identify the label of the different countries. However, the FOAF ontology also helps identify the flag of the country, and the YAGO (Yet Another Great Ontology) <ref type="bibr" target="#b16">[17]</ref> ontology ensures that only European countries are selected. This excludes data which do not represent countries.</p><p>Nevertheless, it is more difficult to find flags for non-European countries while ensuring that only countries are selected. Indeed, the YAGO class hierarchy is not resolved by the SPARQL endpoint, so that querying a generic country class does not return resources typed only with its more specific subclasses.</p><p>Q3 uses the YAGO ontology to ensure that the resource retrieved is indeed a king of France. Out of 30 results, one was incorrect (The Three Musketeers). The query generated duplicates because of the multiple labels associated with each king. The same king was named, for instance, Louis IX, Saint Louis, and Saint Louis IX. Whereas deduplication is a straightforward process in this case, the risk of inconsistent naming patterns among options of the same item is more difficult to tackle. An item was indeed generated with the following three options: Charles VII the Victorious, Charles 09 Of France, and Louis VII. They all use a different naming pattern, with or without the king's nickname and with a different numbering pattern.</p><p>Q4 is a variation of Q1. It adds a picture collection from a distinct dataset in the response feedback. It uses the YAGO ontology to exclude countries outside Europe and resources which are not countries. A feedback section is added: when candidates answer the item, they receive feedback if the platform allows it.</p></div>
In the feedback, additional information or formative resources can be suggested. Q4 uses the linkage of the DBpedia dataset with the Flickr wrapper dataset. However, the Flickr wrapper data source was unavailable when we performed the experiment.</p></div>
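Deduplicating the multiple labels returned for the same resource, as with the kings of France above, can be sketched by grouping bindings by resource URI (keeping the first label is one arbitrary policy; enforcing a consistent naming pattern would be better):

```python
def deduplicate(bindings: list) -> dict:
    """Keep a single label per resource URI; here the first label wins."""
    labels = {}
    for uri, label in bindings:
        labels.setdefault(uri, label)
    return labels

rows = [("dbpedia:Louis_IX", "Louis IX"),
        ("dbpedia:Louis_IX", "Saint Louis"),
        ("dbpedia:Louis_IX", "Saint Louis IX"),
        ("dbpedia:Charles_VII", "Charles VII the Victorious")]
print(deduplicate(rows))
```

Grouping by URI solves duplication within one resource, but, as noted above, it does not make the labels of different resources follow the same pattern within a single item.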
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Q4 -What is the capital of { Argentina }?</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Q5 -Which category does { Asthma } belong to?</head><p>SELECT DISTINCT ?diseaseName ?category WHERE {
?x &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;http://dbpedia.org/ontology/Disease&gt; .
?x &lt;http://dbpedia.org/property/meshname&gt; ?diseaseName .
?x &lt;http://purl.org/dc/terms/subject&gt; ?y .
?y &lt;http://www.w3.org/2004/02/skos/core#prefLabel&gt; ?category
} LIMIT 30</p><p>Q5 aims to retrieve diseases and their categories. It uses SKOS and Dublin Core properties; the Infobox dataset is only used to find labels. Labels from the MeSH vocabulary are also available. Nevertheless, the SKOS concepts are not related to a specific SKOS scheme: the categories retrieved range from Skeletal disorders to childhood. For instance, the correct answer to the question on Obesity is childhood.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">The publication of items on the TAO platform</head><p>The TAO platform <ref type="foot" target="#foot_2">3</ref> is an open source semantic platform for the creation and delivery of assessment tests and items. It has been used in multiple assessment contexts, including large-scale assessment in the PIAAC and PISA surveys of the OECD, diagnostic assessment, and formative assessment. We imported the QTI items generated for the different item models into the platform in order to validate the overall Linked Data based item creation pipeline. Figure <ref type="figure">4</ref> presents an item generated from Q1 (Figure <ref type="figure">3</ref>) imported into the TAO platform.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Figure 4 -Item preview on the TAO platform</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Data analysis</head><p>The pipeline was thus tested with SPARQL queries which use various ontologies and collect various types of variables. The experiment raised two types of issues for which future work should find relevant solutions: the quality of the data and the relevance of particular statements for the creation of an assessment item.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">Data quality challenges</head><p>In our experiment, the chance that an item has a defective prompt or a defective correct answer is equal to the proportion of defective variables used for the item creation. Q1 uses the most challenging dataset in terms of data quality: 7 out of 30 questions had a defective prompt or a defective correct answer (23.33%).</p><p>The chance that an item has defective distractors is represented by the following formula, where D is the total number of distractors, d(V) is the number of defective variables, and V is the total number of variables: P = 1 − C(V − d(V), D) / C(V, D), with C(n, k) denoting the number of k-combinations of n elements.</p><p>We used 2 distractors. Among the items generated from Q1, 10 items had a defective distractor (33.33%). Overall, 16 out of 30 items had neither a defective prompt, nor a defective correct answer, nor a defective distractor (53.33%). As a comparison, the proportion of items generated from unstructured content (text) that are deemed usable without editing was measured at between 3.5% and 5% by Mitkov et al. <ref type="bibr">[18]</ref> and between 12% and 21% by Karamanis et al. <ref type="bibr" target="#b6">[7]</ref>. The difficulty of generating items from structured sources should be lower. Although a manual selection is necessary in any case, the mechanisms we have implemented can be improved.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>The ontology</head><p>Q1 used properties from the Infobox dataset, which has no proper underlying ontology. Q1 can therefore be improved by using the ontologies provided with DBpedia, as demonstrated by Q2, for which no distractor issue was identified. We present Q1 and Q2 to illustrate this improvement, but it should be noted that there is not always a straight equivalent to the properties extracted from the Infobox dataset. Q5 could be improved either if the dataset were linked to a more structured knowledge organization system (KOS) or through an algorithm which would verify the nature of the literals provided as a result of the SPARQL query.</p></div>
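Assuming the D distractors are drawn without replacement from the V variables in the store, the chance of at least one defective distractor can be computed as follows (an illustrative reconstruction of the formula described above, not necessarily the authors' exact form):

```python
from math import comb

def p_defective_distractor(V: int, dV: int, D: int) -> float:
    """Probability that at least one of D distractors, drawn without
    replacement from V variables of which dV are defective, is defective."""
    return 1 - comb(V - dV, D) / comb(V, D)

# Q1 sample: 30 variables, 7 defective, 2 distractors per item.
print(round(p_defective_distractor(30, 7, 2), 3))
```

The predicted rate under this reading is somewhat higher than the 33.33% observed for Q1, which is consistent with a small sample of 30 generated items.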
<div xmlns="http://www.tei-c.org/ns/1.0"><head>The labels</head><p>The choice of the label for each concept to be represented in an item is a challenge when concepts are represented by multiple labels (Q4). The selection of labels and their consistency can be ensured by defining representation patterns or by using datasets with consistent labeling practices.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Inaccurate statements</head><p>Most statements provided for the experiment are not inaccurate in their original context, but they sometimes use properties which are not sufficiently precise for the usage envisioned (e.g., administrative capital). In other cases, the context of validity of the statement is missing (e.g., Leopoldville used to be the capital of a country called Congo). The choice of DBpedia as a starting point can increase this risk in comparison to domain-specific data sources provided by scientific institutions, for instance. Nevertheless, the Semantic Web raises quality challenges similar to those encountered in heterogeneous and distributed data sources <ref type="bibr" target="#b18">[19]</ref>. Web 2.0 approaches, as well as the automatic reprocessing of data, can help improve the usability of Semantic Web statements. This requires setting up a traceability mechanism between the RDF paths used for the generation of items and the items generated.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Data linkage</head><p>Data linkage raises a reliability issue, because the mechanism depends on several data sources being available. Q3 provided 6 problematic URIs out of 30 (i.e., 20%). Q4 generated items for which no URI from the linked dataset was resolvable, since the whole Flickr wrapper data source was unavailable; this clearly makes the generated items unusable. The creation of infrastructure components such as the SPARQL Endpoint status for CKAN<ref type="foot" target="#foot_3">4</ref>-registered datasets<ref type="foot" target="#foot_4">5</ref> can help provide solutions to this quality issue in the longer run.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Missing inferences</head><p>Finally, the SPARQL endpoint does not provide access to inferred triples. Our pipeline does not tackle transitive closures on the data consumer side (e.g., through repeated queries), as illustrated with Q3. Further consideration should be given to the provision of data including inferred statements. Alternatively, full datasets could be imported, and inferences could then be performed in order to support the item generation process.</p><p>Different strategies can therefore be implemented to cope with the data quality issues we encountered. Data publishers can improve the usability of the data, for instance with the implementation of an upper ontology in DBpedia. However, other data quality issues require data consumers to improve their data collection strategy, for instance to collect as much information as possible on the context of validity of the data, whenever it is available.</p></div>
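Computing the transitive closure on the consumer side through repeated queries can be sketched over an in-memory subclass map standing in for one SPARQL query per class (the class names are hypothetical, modeled on the YAGO country example):

```python
def subclass_closure(root: str, subclasses: dict) -> set:
    """Collect root and all its (transitive) subclasses by repeated
    lookups, emulating one SPARQL query per class on the consumer side."""
    seen, frontier = {root}, [root]
    while frontier:
        cls = frontier.pop()
        for sub in subclasses.get(cls, []):  # one query per class in practice
            if sub not in seen:
                seen.add(sub)
                frontier.append(sub)
    return seen

# Miniature class tree mirroring the YAGO country example.
tree = {"yago:Country108544813": ["yago:EuropeanCountries"],
        "yago:EuropeanCountries": ["yago:BalticCountries"]}
print(sorted(subclass_closure("yago:Country108544813", tree)))
```

The resulting set of classes can then be substituted into the type constraint of the generation query, so that instances typed only with a subclass are still retrieved.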
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">Data selection</head><p>The experiment also showed that Linked Data statements need to be selected. The suitability of an assessment item for a test delivered to a candidate or a group of candidates is measured in particular through information such as the item difficulty.</p><p>The difficulty can be assessed through a thorough calibration process in which the item is given to beta candidates in order to extract psychometric indicators. In low-stakes assessment, however, the evaluation of the difficulty is often manual (candidate or teacher evaluation) or implicit (the performance of previous candidates who took the same item). In the item generation models we have used, each item has a different construct (i.e., it assesses a different piece of knowledge). In this case, the psychometric variables are more difficult to predict <ref type="bibr" target="#b19">[20]</ref>. A dedicated model is necessary to assess the difficulty of items generated from Semantic Web sources. For instance, it is likely that for a European audience, the capital of the Cook Islands will raise a higher rate of failure than the capital of Belgium, yet there is no information in the datasets that can support the idea of a higher or lower difficulty. Moreover, the difficulty of the item also depends on the distractors, which in this experiment were generated on a random basis from a set of equivalent instances. As the generation of items from structured Web data sources becomes more elaborate, it will therefore be necessary to design a model for predicting the difficulty of generated items.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">Conclusion and future work</head><p>This experiment demonstrates a process for generating assessment items and/or assessment variables from Linked Data. The performance of the system in comparison with other approaches shows its potential as a strategy for assessment item generation. It is expected that data linkage can provide relevant content, for instance to propose formative resources to candidates who failed an item or to illustrate a concept with a picture published as part of a distinct dataset. The experiment also shows the quality issues related to the generation of items from a resource such as DBpedia. It should be noted that the measurements were made with a question which raises particular quality issues and can easily be improved, as shown with the other questions. Moreover, the Linked Data Cloud also contains datasets published by scientific institutions, which may raise fewer data accuracy concerns. In addition, the usage model we propose is centered on low-stakes assessment, for which we believe the time saved makes it worthwhile to clean some of the data, so that the overall process remains valuable.</p><p>Nevertheless, additional work is necessary both on the data and on the assessment items. The items created demonstrate the complexity of generating item variables even for simple assessment items. We aim to investigate the creation of more complex items and the relevance of formative resources which can be included in the item as feedback. Moreover, the Semantic Web can provide knowledge models from which items could be generated. Our work is focused on semi-automatic item generation, where users create item models, while the system aims to generate the variables.
Nevertheless, the generation of items from a knowledge model as in <ref type="bibr" target="#b10">[11]</ref> requires that more complex knowledge be encoded in the data (e.g., what happens to water when the temperature decreases). The type and nature of data published as Linked Data therefore need to be further analyzed in order to support the development of such models for the fully automated generation of items based on knowledge models.</p><p>We will focus our future work on the creation of an authoring interface for item models using data sources from the Semantic Web, on the assessment of item quality, on the creation of different types of assessment items from Linked Data sources, on the traceability of items created, including the path in the Semantic Web datasets which was used to generate the item, and on the improvement of data selection from semantic datasets.</p><p>Acknowledgments. This work was carried out in the scope of the iCase project on computer-based assessment. It has benefited from the TAO semantic platform for e-assessment (https://www.tao.lu/), which is jointly developed by the Tudor Research Centre and the University of Luxembourg, with the support of the Fonds National de la Recherche in Luxembourg, the DIPF (Bildungsforschung und Bildungsinformation), the Bundesministerium für Bildung und Forschung, the Luxembourgish ministry of higher education and research, as well as the OECD.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Q3 -Who succeeded to { Charles VII the Victorious } as ruler of France?</head><label></label><figDesc>In the YAGO ontology, &lt;http://dbpedia.org/class/yago/EuropeanCountries&gt; is a subclass of &lt;http://dbpedia.org/class/yago/Country108544813&gt;, but most European countries are not retrieved when querying the dataset with &lt;http://dbpedia.org/class/yago/Country108544813&gt;. Indeed, the SPARQL endpoint does not provide access to inferred triples. It is necessary to perform a set of queries to retrieve relevant subclasses and use them for the generation of variables. Out of 30 items including pictures of flags used as stimuli, 6 URIs did not resolve to a usable picture (HTTP 404 errors or an encoding problem).</figDesc><table><row><cell>SELECT DISTINCT ?kingHR ?successorHR WHERE {
?x &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;http://dbpedia.org/class/yago/KingsOfFrance&gt; .
?x &lt;http://dbpedia.org/property/name&gt; ?kingHR .
?x &lt;http://dbpedia.org/ontology/successor&gt; ?z .
?z &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;http://dbpedia.org/class/yago/KingsOfFrance&gt; .
?z &lt;http://dbpedia.org/property/name&gt; ?successorHR
} LIMIT 30</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">http://www.imsglobal.org/question/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">http://wiki.dbpedia.org/Datasets</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">http://www.tao.lu</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">http://www.ckan.net</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_4">http://labs.mondeca.com/sparqlEndpointsStatus/index.html</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">To Really Learn, Quit Studying and Take a Test</title>
		<author>
			<persName><forename type="first">P</forename><surname>Belluck</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2011-01-20">January 20th, 2011</date>
			<pubPlace>New York Times</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Retrieval Practice Produces More Learning than Elaborative Studying with Concept Mapping</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">D</forename><surname>Karpicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">R</forename><surname>Blunt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Science</title>
		<imprint>
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Gilbert</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Gale</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Warburton</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wills</surname></persName>
		</author>
		<title level="m">Report on Summative E-Assessment Quality (REAQ)</title>
				<meeting><address><addrLine>Southampton.</addrLine></address></meeting>
		<imprint>
			<publisher>Joint Information Systems Committee</publisher>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Arikiturri: an Automatic Question Generator Based on Corpora and NLP techniques</title>
		<author>
			<persName><forename type="first">I</forename><surname>Aldabe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lopez De Lacalle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Maritxalar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Martinez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Uria</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="s">ser. Lecture Notes in computer science</title>
		<imprint>
			<biblScope unit="volume">4053</biblScope>
			<biblScope unit="page" from="584" to="594" />
			<date type="published" when="2006">2006</date>
			<publisher>Springer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Automatic correction of grammatical errors in non-native English text</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">S Y</forename><surname>Lee</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
		<respStmt>
			<orgName>Massachusetts Institute of Technology</orgName>
		</respStmt>
	</monogr>
	<note type="report_type">PhD dissertation</note>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Automatic Generation System of Multiple-Choice Cloze Questions and its Evaluation</title>
		<author>
			<persName><forename type="first">T</forename><surname>Goto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kojiri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Watanabe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Iwata</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Yamada</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge Management &amp; E-Learning: An International Journal (KM&amp;EL)</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page">210</biblScope>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Generating multiple-choice test items from medical text: a pilot study</title>
		<author>
			<persName><forename type="first">N</forename><surname>Karamanis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Ha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mitkov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Fourth International Natural Language Generation Conference</title>
				<meeting>the Fourth International Natural Language Generation Conference</meeting>
		<imprint>
			<date type="published" when="2006">2006</date>
			<biblScope unit="page" from="111" to="113" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">An Automatic Multiple-Choice Question Generation Scheme for English Adjective Understanding</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">C</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">C</forename><surname>Sung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th International Conference on Computers in Education (ICCE 2007)</title>
				<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="137" to="142" />
		</imprint>
	</monogr>
	<note>Workshop on Modeling,</note>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Automatic question generation for vocabulary assessment</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">C</forename><surname>Brown</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">A</forename><surname>Frishkoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Eskenazi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing</title>
				<meeting>the conference on Human Language Technology and Empirical Methods in Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="page" from="819" to="826" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">The Design of Automatic Quiz Generation for Ubiquitous English E-Learning System</title>
		<author>
			<persName><forename type="first">L.-C</forename><surname>Sung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-C</forename><surname>Lin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">C</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Technology Enhanced Learning Conference</title>
				<meeting><address><addrLine>TELearn; Jhongli, Taiwan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2007">2007</date>
			<biblScope unit="page" from="161" to="168" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Question generation and answering</title>
		<author>
			<persName><forename type="first">F</forename><surname>Linnebank</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Liem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Bredeweg</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">DynaLearn, EC FP7 STREP project 231526</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
	<note>Deliverable D3.3.</note>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">SARAC: A Framework for Automatic Item Generation</title>
		<author>
			<persName><forename type="first">B</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2009 Ninth IEEE International Conference on Advanced Learning Technologies (ICALT)</title>
				<meeting><address><addrLine>Riga, Latvia</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="556" to="558" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Speech-Based Interactive Games for Language Learning: Reading, Translation, and Question-Answering</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Seneff</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Linguistics and Chinese Language Processing</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="133" to="160" />
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Using automatic item generation to address item demands for CAT</title>
		<author>
			<persName><forename type="first">H</forename><surname>Lai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Alves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Gierl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing</title>
				<meeting>the 2009 GMAC Conference on Computerized Adaptive Testing</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Developing a Taxonomy of Item Model Types to Promote Assessment Engineering</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Gierl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Alves</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Technology, Learning, and Assessment</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="issue">2</biblScope>
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Reusability in e-assessment: Towards a multifaceted approach for managing metadata of e-assessment resources</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sarre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Foulonneau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Fifth International Conference on Internet and Web Applications and Services</title>
				<imprint>
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Yago: a core of semantic knowledge</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M</forename><surname>Suchanek</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Kasneci</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Weikum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th international conference on World Wide Web</title>
				<meeting>the 16th international conference on World Wide Web</meeting>
		<imprint>
			<biblScope unit="page" from="697" to="706" />
			<date type="published" when="2007">2007</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A computer-aided environment for generating multiple-choice test items</title>
		<author>
			<persName><forename type="first">R</forename><surname>Mitkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Ha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Karamanis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Natural Language Engineering</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="issue">02</biblScope>
			<biblScope unit="page" from="177" to="194" />
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Strategies for reprocessing aggregated metadata</title>
		<author>
			<persName><forename type="first">Muriel</forename><surname>Foulonneau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Timothy</forename><forename type="middle">W</forename><surname>Cole</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">European Conference on Digital Libraries</title>
		<title level="s">Lecture notes in computer science</title>
		<imprint>
			<date type="published" when="2005">2005</date>
			<biblScope unit="volume">3652</biblScope>
			<biblScope unit="page" from="290" to="301" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<monogr>
		<title level="m" type="main">A feasibility study of on-the-fly item generation in adaptive testing</title>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">I</forename><surname>Bejar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">R</forename><surname>Lawless</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Morley</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Wagner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">E</forename><surname>Bennett</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Revuelta</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2002">2002</date>
			<publisher>Educational Testing Service</publisher>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
