Sherlock: a Semi-Automatic Quiz Generation System using Linked Data

Dong Liu (1) and Chenghua Lin (2)

(1) BBC Future Media & Technology - Knowledge & Learning, Salford M50 2QH, UK, Dong.Liu@bbc.co.uk
(2) Department of Computing Science, University of Aberdeen, AB24 3UE, UK, chenghua.lin@abdn.ac.uk



       Abstract. This paper presents Sherlock, a semi-automatic quiz generation system for educational purposes. By exploiting semantic and machine learning technologies, Sherlock not only offers a generic framework for domain-independent quiz generation, but also provides a mechanism for automatically controlling the difficulty level of the generated quizzes. We evaluate the effectiveness of the system on three real-world datasets.

       Keywords: Quiz Generation, Linked Data, RDF, Educational Games


1     Introduction
Interactive games are an effective way of transferring knowledge between humans and machines. For instance, efforts have been made to unleash the potential of using Linked Data to generate educational quizzes. However, the existing approaches [1, 2] share common limitations: they are either based on domain-specific templates, or the creation of quiz templates relies heavily on ontologists and Linked Data experts. In either case, no mechanism is provided for end-users to engage in customised quiz authoring.
    Moreover, a system that can generate quizzes at different difficulty levels will better serve users' needs. However, this important feature is rarely offered by existing systems, most of which simply select the distractors (i.e., the wrong candidate answers) at random from an answer pool (e.g., obtained by querying Linked Data repositories). Some work has attempted to determine the difficulty of a quiz, but only by assessing the popularity of an RDF resource, without considering that the difficulty level of a quiz is directly affected by the semantic relatedness between the correct answer and the distractors [3].
    In this paper, we present a novel semi-automatic quiz generation system (Sherlock) empowered by semantic and machine learning technologies. Sherlock is distinguished from existing systems in several respects: (1) it offers a generic framework for generating quizzes across multiple domains with minimal human effort; (2) it introduces a mechanism for controlling the difficulty level of the generated quizzes; and (3) it provides an intuitive interface for engaging users in creating customised quizzes.


[Fig. 1 shows the system diagram: offline components (Data Collection and Integration; Similarity Computation, comprising LOD Similarity and Adaptive Clustering; Template-based Question and Answer Generator) populate the Question and Answer Database and the Incorrect Distractor Database, which the online Quiz Renderer and Quiz Creator consume.]

Fig. 1. Overall architecture of Sherlock.


The live Sherlock system can be accessed at http://sherlock.pilots.bbcconnectedstudio.co.uk/ (for the best experience, please use Safari or Opera to access the demo).
2     System Architecture
Fig. 1 depicts an overview of the Sherlock framework, in which the components are logically divided into two groups: online and offline. The components interact with each other via shared databases that store the questions, the correct answers and the distractors (i.e., incorrect answers).
Data Collection and Integration: We collected RDF data published by DBpedia and the BBC. These data play two main roles: they serve as the knowledge base for quiz generation, and they provide the basis for calculating the similarity scores between objects/entities (i.e., answers and distractors).
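A minimal sketch of this step, assuming the published resources can be dereferenced as Linked Data (the resource URIs below are illustrative, not the system's actual inputs):

    # Collect RDF descriptions of a few resources into one local graph.
    # rdflib dereferences each URI and negotiates an RDF serialisation.
    from rdflib import Graph

    graph = Graph()
    for uri in ("http://dbpedia.org/resource/Cheetah",
                "http://dbpedia.org/resource/Leopard"):
        graph.parse(uri)
    print(len(graph), "triples collected")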
Similarity Computation: The similarity computation module is the core component for controlling the difficulty level of quiz generation. It first accesses the RDF store and calculates a similarity score for each pair of objects/entities. It then performs K-means clustering to partition the distractors into different difficulty levels according to their Linked Data Semantic Distance (LDSD) [4] scores with respect to the correct answer of a quiz. In our preliminary experiments, we empirically set K=3, corresponding to three difficulty levels: "easy", "medium" and "difficult".
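A minimal sketch of this clustering step, assuming the LDSD-based similarity of each candidate distractor to the correct answer has already been computed (the function name and input format are ours, not the system's):

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_distractors(similarity, k=3):
        """Partition distractors into k difficulty levels, using their
        similarity to the correct answer as a single 1-D feature."""
        names = list(similarity)
        scores = np.array([[similarity[n]] for n in names])
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(scores)
        # Rank clusters by mean similarity: the closer a distractor is
        # to the correct answer, the harder the quiz it produces.
        order = np.argsort([scores[labels == c].mean() for c in range(k)])
        level = {int(c): name
                 for name, c in zip(("easy", "medium", "difficult"), order)}
        return {n: level[int(l)] for n, l in zip(names, labels)}

For example, cluster_distractors({"Leopard": 0.95, "Tiger": 0.91, "Penguin": 0.45}) would place "Penguin" in the "easy" group and "Leopard" in the "difficult" one.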
Template-based Question and Answer Generator: This module automates the generation of questions and their correct answers. Fig. 2(a) shows the instantiation of an example template, "Which of the following animals is {?animal name}?", where the variable is replaced with the rdfs:label of the animal. The generated questions and answers are saved in the database.
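A minimal sketch of the template instantiation, assuming the collected data is exposed through a SPARQL endpoint (the endpoint URL, class URI and result limit are illustrative):

    from SPARQLWrapper import SPARQLWrapper, JSON

    TEMPLATE = "Which of the following animals is {animal_name}?"

    def generate_questions(endpoint="https://dbpedia.org/sparql"):
        sparql = SPARQLWrapper(endpoint)
        sparql.setQuery("""
            PREFIX dbo:  <http://dbpedia.org/ontology/>
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            SELECT ?animal ?label WHERE {
                ?animal a dbo:Animal ; rdfs:label ?label .
                FILTER (lang(?label) = "en")
            } LIMIT 10
        """)
        sparql.setReturnFormat(JSON)
        for row in sparql.query().convert()["results"]["bindings"]:
            # The resource URI identifies the correct answer; its label
            # fills the variable slot of the question template.
            yield (row["animal"]["value"],
                   TEMPLATE.format(animal_name=row["label"]["value"]))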
Quiz Renderer: The rendering module first retrieves the question and the correct answer from the database, and then selects suitable distractors from the entities returned by the similarity computation module. Fig. 2(a) shows the module's intuitive gaming interface, as well as a distinctive feature for dynamically tuning the quiz difficulty level up or down, enabling Sherlock to better serve the needs of different user groups (e.g., users of different ages and educational backgrounds).




Fig. 2. (a) User interface for playing a quiz; (b) User interface for creating a quiz.


[Fig. 3 shows two panels: (a) a bar chart of averaged similarity (y-axis, 0.86-0.96) for the Easy, Medium and Difficult quiz groups; (b) accuracy (y-axis, 0.0-0.8) plotted against similarity (x-axis, 0.90-0.94).]

Fig. 3. (a) Averaged similarity of the testing quizzes (Wildlife domain); (b) Correlation measure of the Wildlife domain (r = −0.97, p < 0.1).


Furthermore, to enhance the user's learning experience, the "learn more" link at the bottom left of the interface points to a Web page containing detailed information about the correct answer (e.g., Cheetah).
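A minimal sketch of this selection, reusing the difficulty labels produced by the offline clustering step above (names are hypothetical):

    import random

    def pick_distractors(labelled, level, n=3):
        """Draw n distractors whose difficulty label matches the level
        requested by the player: "easy", "medium" or "difficult"."""
        pool = [name for name, lvl in labelled.items() if lvl == level]
        return random.sample(pool, min(n, len(pool)))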
Quiz Creator: Fig. 2(b) depicts the quiz creator module, which complements the automatic quiz generation by allowing users to create customised quizzes on more diverse topics and to share them with others. Quiz authoring involves three simple steps: 1) write a question; 2) set the correct answer (distractors are suggested automatically by Sherlock); and 3) preview and submit. For instance, one can take a picture of several ingredients and let others guess which dish one is going to cook. The quiz creator interface can be accessed at http://sherlock.pilots.bbcconnectedstudio.co.uk/#/quiz/create.
3     Empirical Evaluation
This demo aims to show how Sherlock can effectively generate quizzes for different domains, and how well a standard similarity measure can suggest quiz difficulty levels that match human perception. Our hypothesis is that if objects/entities have a higher degree of semantic relatedness, their differences are subtler and hence harder to disambiguate, and vice versa.

    We investigated the correlation between the difficulty level captured by the similarity measure and that perceived by humans. To test our hypothesis, a group of 10 human evaluators was presented with 45 test quizzes generated by Sherlock from the BBC Wildlife domain data, i.e., 15 quizzes per difficulty level. The averaged pairwise similarity between the correct answer and the distractors of each test quiz was then computed, as shown in Fig. 3(a). Fig. 3(b) shows that the evaluators' quiz accuracy indeed correlates negatively (r = −0.97, p < 0.1) with the average similarity of the quiz answer choices (each data point is the average over the 15 quizzes of one difficulty level). This suggests that LDSD is an appropriate similarity measure for indicating quiz difficulty, which aligns with our hypothesis.
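The reported correlation can be checked with Pearson's r over the three per-level averages; a minimal sketch with illustrative placeholder values (not the paper's data):

    from scipy.stats import pearsonr

    avg_similarity = [0.90, 0.92, 0.94]  # easy, medium, difficult
    human_accuracy = [0.70, 0.45, 0.15]  # fraction answered correctly

    r, p = pearsonr(avg_similarity, human_accuracy)
    print(f"r = {r:.2f}, p = {p:.2f}")   # a strong negative correlation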
    In another set of experiments, we evaluated Sherlock as a generic framework for quiz generation, testing the system on structured RDF datasets from three different domains, namely BBC Wildlife, BBC Food and BBC YourPaintings (http://www.bbc.co.uk/nature/wildlife, http://www.bbc.co.uk/food and http://www.bbc.co.uk/arts/yourpaintings), with 321, 991 and 2,315 quizzes automatically generated for the respective domains. Benefiting from the domain-independent similarity measure (LDSD), Sherlock can be easily adapted to generate quizzes for new domains with minimal human effort, i.e., without manually defining rules or rewriting SPARQL queries.
4     Conclusion
In this paper, we presented a novel generic framework (Sherlock) for generating educational quizzes using Linked Data. Compared to existing systems, Sherlock offers several distinctive features: it not only provides a generic framework for generating quizzes across multiple domains with minimal human effort, but also introduces a mechanism for controlling the difficulty level of the generated quizzes based on a semantic similarity measure.
Acknowledgements
The research described here is supported by the BBC Connected Studio pro-
gramme and the award made by the RCUK Digital Economy theme to the
dot.rural Digital Economy Hub; award reference EP/G066051/1. The authors
would like to thank Ryan Hussey, Tom Cass, James Ruston, Herm Baskerville
and Nava Tintarev for their valuable contribution.
References
[1] Damljanovic, D., Miller, D., O’Sullivan, D.: Learning from quizzes using intelligent
    learning companions. In: WWW (Companion Volume). (2013) 435–438
[2] Álvaro, G., Álvaro, J.: A linked data movie quiz: the answers are out there, and
    so are the questions [blog post]. http://bit.ly/linkedmovies (2010)
[3] Waitelonis, J., Ludwig, N., Knuth, M., Sack, H.: WhoKnows? - evaluating linked
    data heuristics with a quiz that cleans up DBpedia. International Journal of Interactive
    Technology and Smart Education (ITSE) 8 (2011) 236–248
[4] Passant, A.: Measuring semantic distance on linking data and using it for resources
    recommendations. In: AAAI Symposium: Linked Data Meets AI. (2010)