=Paper=
{{Paper
|id=Vol-307/paper-6
|storemode=property
|title=Exploiting Image Segmentation Techniques for Social Filtering of Educational Content
|pdfUrl=https://ceur-ws.org/Vol-307/paper06.pdf
|volume=Vol-307
|authors=Pythagoras Karampiperis,Aristeidis Diplaros
}}
==Exploiting Image Segmentation Techniques for Social Filtering of Educational Content==
Proceedings of the 1st Workshop on Social Information Retrieval for Technology-Enhanced Learning & Exchange
Pythagoras Karampiperis (1) and Aristeidis Diplaros (2)
(1) Department of Technology Education and Digital Systems, University of Piraeus, Greece, pythk@ieee.org
(2) Informatics Institute, University of Amsterdam, The Netherlands, diplaros@gmail.com
Abstract. The need for applying advanced social information retrieval
techniques for personalizing web-based information discovery has been
identified as a key challenge. Until now, significant R&D effort has been
devoted to applying collaborative filtering techniques to
educational content retrieval. However, limited attention has been given to the
use of educational metadata as a means to enhance social filtering techniques via
educationally informed filtering decisions. In this paper we propose the use of
an add-on filtering service on existing social filtering systems/applications so as
to create a data post-filtering mechanism that makes use of intelligence stored
in TEL metadata. The proposed methodology starts with the generation of a
matrix that represents the educational characteristics of the resources suggested
by typical social filtering techniques and applies post-filtering using the
educational “footprint” of the resources already used by the targeted end-user.
Keywords: Technology Enhanced Learning, Educational Metadata, Social
Filtering, Data Clustering.
1 Introduction
The rapid evolution of Web 2.0 applications implies that, on the one hand,
increasingly complex and dynamic web-based learning infrastructures need to be
managed more efficiently and, on the other hand, new types of learning services and
mechanisms need to be developed and provided. To meet current needs, such
services should satisfy a diverse range of requirements, such as
personalization based on social filtering [1].
In this context, the need for applying advanced social information retrieval
techniques for personalizing web-based information discovery and retrieval has been
identified as a key challenge. This has become more critical in the case of Technology
Enhanced Learning applications, since on the Web a vast variety of digital learning
resources exist that have the potential to facilitate teaching and learning tasks. Until
now, significant R&D effort has been devoted to applying collaborative
filtering techniques to educational content retrieval [2]. These techniques use
usage log files over a set of educational resources to provide personalized
recommendations by comparing the profile of the learner at hand with similar
persons/groups recorded in the historical log data [3, 4, 5]. However, limited attention
has been given to the use of educational metadata as a means to enhance social
filtering techniques via educationally informed filtering decisions.
In this paper we propose the use of an add-on filtering service on existing social
filtering systems/applications so as to create a data post-filtering mechanism that
makes use of intelligence stored in TEL metadata. This work was
inspired by the idea of using information visualization for accessing Learning Object
Repositories [6]. Our goal was to investigate how image segmentation techniques
could be applied in order to enhance the social filtering process of educational
content. More precisely, the proposed methodology starts with the generation of a
matrix that represents (in visual form) the educational characteristics of the resources
suggested by typical social filtering techniques and applies post-filtering using the
educational “footprint” of the resources already used by the targeted end-user. For the
generation of the resource filter we utilize image segmentation techniques, taking into
account the spatial coherence of the created visual representation. We treat the
filtering problem as an inference problem, assuming that each pixel in the educational
“footprint” (visualization) has a hidden binary label associated with it which specifies
if it is appropriate for the targeted learner or not. In order to solve the inference
problem, we use a variation of the EM algorithm [7] which incorporates the spatial
constraints with just a small computational overhead [8].
Moreover, a potential drawback of social filtering techniques is that
the models used are not fully transparent to the end-user, which affects end-users’
trust in the provided recommendations [9]. Since the filter generated by the proposed
approach is represented visually, end-users can directly observe the core of the
educational filtering process and make modifications/updates if desired.
The paper is structured as follows: In section 2, we discuss how educational
metadata could be used in order to generate the educational “footprint” (visualization)
of a set of educational resources. Section 3 presents the proposed methodology for
generating the post-filter for the resources recommended by typical social filtering
techniques, using as an input the educational “footprint” of the resources already used
by the targeted end-user. Finally, Section 4 demonstrates the application of the proposed
visualization and filtering process on an easy-to-understand real-life scenario.
2 Social Filtering via Educational Metadata Visualizations
Social filtering is a method for making automatic predictions (filtering) about the
preferences of a user by collecting preference information from many users. The
underlying assumption of social filtering is that the users with similar preferences in
the past tend to have similar preferences in the future. There exist three main types of
social filtering: active filtering, passive filtering and item-based filtering. Active
filtering uses a peer-to-peer approach, based on explicit user ratings over a set of
available digital resources. On the other hand, passive filtering uses preference
information that was implicitly collected via usage log files. Passive filtering relies on
the historical actions of users to determine a value rating for digital content. Finally,
in the case of item-based filtering, items (digital resources) are rated and used as
parameters instead of users. This type of filtering uses the ratings to group various
items so as to enable potential users to compare them.
Our proposed method is an add-on filtering service on existing passive social
filtering systems/applications, utilizing intelligence stored in TEL metadata. The main
idea of the proposed approach is to post-filter the recommendations provided by
typical passive social filtering techniques using the educational “footprint” of the
resources already used by the targeted end-user. To achieve this, we create a matrix
that represents (in visual form) the educational characteristics of the resources already
recorded in the historical log files. Based on this matrix, we generate another matrix
that represents the educational preferences of the targeted user. The latter matrix acts
as an educational post-filter on the resources suggested by a typical social filtering
system. This post-filtering is made by comparing the generated filter with the
educational “footprint” of the resources suggested by a passive social filtering
technique. The following paragraphs present how educational metadata are used to create the
educational “footprint” of a single resource, as well as of a set of resources. The same
method is used to create both the educational representation of the
resources already used by the targeted user (which is the input for the filter
generation process) and the educational representation of the resources suggested by
a passive social filtering technique (which is the input for the post-filtering process).
2.1 Creating the Educational Footprint of a Learning Resource
In order to generate the educational footprint (representation) of an educational
resource we use the corresponding metadata record, a subset of the IEEE Learning
Object Metadata (LOM) standard elements. The metadata elements used were
selected in such a way that each element uses a specific state vocabulary, as illustrated
in Table 1.
Fig. 1. Examples of representing the educational footprint of individual learning resources with
Learning Resource Type (RT) equal to “simulation”.
The educational footprint of a learning resource is a 15×8-pixel image, where the first
dimension (lines) stands for the states of the Learning Resource Type attribute and the
second dimension (columns) stands for the remaining eight attributes used. Each pixel is
colored according to the value of the corresponding attribute of the second dimension.
The color coding used for each metadata attribute j is defined by the formula:
$$\mathrm{Color}^{RED}_j = \mathrm{Color}^{GREEN}_j = \mathrm{Color}^{BLUE}_j = \left(1 - \frac{k_j}{N_j}\right) \times 255,$$
where $N_j$ stands for the number of vocabulary states of metadata attribute j, and
$k_j$ is the state code of attribute j for a given educational resource.
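As a quick check of the color formula, the following minimal Python sketch computes the gray level for a given state code. The function name `gray_level` and the rounding to the nearest integer are assumptions made here for illustration; the paper does not state how fractional values are mapped to 8-bit levels.

```python
from fractions import Fraction


def gray_level(state_code: int, num_states: int) -> int:
    """Gray level X (R = G = B = X) for state k_j of an attribute with N_j states."""
    if not 1 <= state_code <= num_states:
        raise ValueError("state code must lie in 1..num_states")
    # X = (1 - k_j / N_j) * 255, rounded to an integer (rounding is an assumption).
    return round((1 - Fraction(state_code, num_states)) * 255)


# "Interactivity Type" has three states: active=1, expositive=2, mixed=3.
interactivity_type = {"active": 1, "expositive": 2, "mixed": 3}
levels = {name: gray_level(code, 3) for name, code in interactivity_type.items()}
print(levels)  # {'active': 170, 'expositive': 85, 'mixed': 0}
```

Higher state codes map to darker pixels, and the last state of every vocabulary is always black (X = 0).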
Table 1. Educational Resource Description Model and Color Coding used.
Metadata Element         Vocabulary State Used   State Code   Color Code (R-G-B)=(X-X-X)
Interactivity Type       active                  1            X=(2/3)*255
                         expositive              2            X=(1/3)*255
                         mixed                   3            X=0
Interactivity Level      very low                1            X=(4/5)*255
                         low                     2            X=(3/5)*255
                         medium                  3            X=(2/5)*255
                         high                    4            X=(1/5)*255
                         very high               5            X=0
Semantic Density         Same Vocabulary and Color Coding as “Interactivity Level”
Typical Age Range        K12                     1            Custom Vocabulary (not defined in IEEE LOM). In our simulations we used the same Color Coding as “Interactivity Type”
                         13-18                   2
                         Adults                  3
Difficulty               Same Vocabulary and Color Coding as “Interactivity Level”
Intended End User Role   teacher                 1            X=(3/4)*255
                         author                  2            X=(2/4)*255
                         learner                 3            X=(1/4)*255
                         manager                 4            X=0
Context                  school                  1            Same Color Coding as “Intended End User Role”
                         higher education        2
                         training                3
                         other                   4
Typical Learning Time    Custom Vocabulary (not defined in IEEE LOM). In our simulations we used the same Vocabulary and Color Coding as “Interactivity Level”
Learning Resource Type   exercise                1            This metadata element was used as the first dimension (lines) for the creation of the resource visual matrix. Thus, no color coding was used for this metadata element, since each line (or set of lines) in the visual matrix directly represents the value of the “Learning Resource Type”
                         simulation              2
                         questionnaire           3
                         diagram                 4
                         figure                  5
                         graph                   6
                         index                   7
                         slide                   8
                         table                   9
                         narrative text          10
                         exam                    11
                         experiment              12
                         problem statement       13
                         self assessment         14
                         lecture                 15
Fig. 1 presents examples of the produced representations for different cases of
educational content with the same learning resource type. For simplicity of presentation,
we have used resources that take only two values (states) for each metadata attribute
(represented with gray and black colors, respectively).
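The construction of a single-resource footprint can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes that only the line of the resource's own Learning Resource Type is colored, that the other lines stay white, and that the eight columns follow the order in which the attributes are listed in Table 1. The names `footprint`, `NUM_STATES`, and `WHITE` are hypothetical.

```python
ROWS, COLS = 15, 8   # 15 Learning Resource Type states x 8 color-coded attributes
WHITE = 255          # assumed background for lines of other resource types

# Number of vocabulary states N_j per column, in the order of Table 1:
# Interactivity Type, Interactivity Level, Semantic Density, Typical Age Range,
# Difficulty, Intended End User Role, Context, Typical Learning Time.
NUM_STATES = [3, 5, 5, 3, 5, 4, 4, 5]


def footprint(resource_type: int, state_codes: list[int]) -> list[list[int]]:
    """Return a 15x8 gray-level matrix for one resource (type code in 1..15)."""
    if not 1 <= resource_type <= ROWS:
        raise ValueError("resource type code must lie in 1..15")
    if len(state_codes) != COLS:
        raise ValueError("exactly eight state codes are required")
    image = [[WHITE] * COLS for _ in range(ROWS)]
    # X = (1 - k_j / N_j) * 255 for each attribute, rounded to an integer.
    image[resource_type - 1] = [
        round(255 * (n - k) / n) for k, n in zip(state_codes, NUM_STATES)
    ]
    return image


# A hypothetical "simulation" resource (Learning Resource Type code 2).
simulation = footprint(2, [1, 4, 3, 2, 2, 3, 2, 1])
```

Under these assumptions the second line of `simulation` carries the eight gray levels and every other line remains white.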
2.2 Creating the Educational Footprint of a Set of Learning Resources
In order to generate the representation of a set of learning resources, we start from the
representation of the first learning resource in the set and increase the resolution of the
generated image each time the set grows to n × n resources per learning resource
type, with n ≥ 2, n ∈ ℕ*. The size of the generated representation for a set is therefore (15k) × (8k) pixels,
where k ∈ ℕ*. As a result, the generated visualizations can be (15 × 8), (30 × 16), (45
× 24), … pixels. Fig. 2 presents the aggregated representation of the resources
shown in the previous section (Fig. 1).
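One way to picture the aggregation is sketched below in Python. The layout is an assumption: each (resource type, attribute) cell of the 15 × 8 grid is expanded into a k × k block, and the m-th resource of a given type fills sub-pixel (m div k, m mod k) of every block on that type's lines. The paper does not specify the ordering of resources inside a block, and the function name `aggregate` is hypothetical.

```python
import math

ROWS, COLS, WHITE = 15, 8, 255


def aggregate(rows_by_type: dict[int, list[list[int]]]) -> list[list[int]]:
    """Tile per-resource gray-level rows (8 values each) into a (15k x 8k) image.

    rows_by_type maps a Learning Resource Type code (1..15) to the list of
    gray-level rows of the resources of that type.
    """
    most = max([len(rows) for rows in rows_by_type.values()] + [1])
    k = math.isqrt(most - 1) + 1          # smallest k with k * k >= most
    image = [[WHITE] * (COLS * k) for _ in range(ROWS * k)]
    for rtype, resources in rows_by_type.items():
        for m, grays in enumerate(resources):
            dy, dx = divmod(m, k)         # position of resource m inside each block
            for col, gray in enumerate(grays):
                image[(rtype - 1) * k + dy][col * k + dx] = gray
    return image


# Four hypothetical "simulation" resources (type code 2) give a 30 x 16 image.
demo = aggregate({2: [[170] * 8, [85] * 8, [0] * 8, [170] * 8]})
```

With one to nine resources per type the sketch yields images of 15 × 8, 30 × 16, and 45 × 24 pixels, matching the sizes listed above.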
Fig. 2. Example of aggregated representation of a set of learning resources.
The next section presents the methodology for generating the educational post-filter
(that is, a matrix which represents the educational preferences of the targeted user) for
the resources suggested by a typical passive social filtering system.
3 Generating the Filter for Educational Resource Post-Filtering
The core idea of the filter generation method used in this paper is to treat the pixel
labels of a representation as independent random variables drawn from a common prior
distribution p(s_i) (which we learn with the EM algorithm), but to constrain
their posterior distributions (computed in the E-step of the EM algorithm) according
to the spatial dependencies between pixels [8]. Although educational metadata
properties are correlated, preliminary investigation suggests that treating them as
independent random variables does not affect the filtering process. This issue
will be investigated in more depth in the future, since our goal in this
paper was to set up the framework for educational post-filtering of social
filtering processes rather than to compare in depth the data clustering techniques that
can handle the correlation of educational metadata.
In particular, we define a log-likelihood function:
$$L(\theta) = \sum_{i=1}^{n} \log \sum_{s_i} p(c_i \mid s_i)\, p(s_i) \qquad (1)$$
where the parameter θ summarizes all unknown parameters in the model. These
unknown parameters are learned by the EM algorithm [10]. More precisely, θ
includes the prior probability of each state of the educational metadata parameters. In
order to capture the spatial constraints of the pixel labels into an EM algorithm, we
employ a variational approximation in which we maximize in each step a lower bound
of L(θ). This bound F(θ, Q) is a function of the current mixture parameters θ and a
factorized distribution $Q = \prod_{i=1}^{n} q_i(s_i)$, where each q_i(s_i) corresponds to pixel i
but defines an otherwise arbitrary discrete distribution over s_i.
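The text does not write the bound out explicitly. In the standard variational view of EM [10], and assuming that view applies unchanged here, it takes the form

$$F(\theta, Q) = \sum_{i=1}^{n} \sum_{s_i} q_i(s_i) \log \frac{p(c_i \mid s_i)\, p(s_i)}{q_i(s_i)} = L(\theta) - \sum_{i=1}^{n} \mathrm{KL}\big(q_i(s_i) \,\|\, p(s_i \mid c_i)\big) \le L(\theta),$$

so the bound is tight exactly when every $q_i(s_i)$ equals the Bayes posterior $p(s_i \mid c_i)$, and any other choice of $q_i$ (for instance a spatially smoothed posterior) trades some tightness for the desired constraint.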
An attractive property of the variational EM framework is that in each step of the
algorithm we are allowed to assign any distribution q_i(s_i) to individual pixels as long
as this increases the energy F. In summary, our variational EM algorithm is as
follows:
1. (Initialization) Start with a random guess for the parameter vector θ .
2. (Standard E-step) Compute the Bayes posterior probabilities over pixel labels
given the pixel colors and the current estimate of θ.
3. Smooth the responsibilities of neighboring pixels by applying a local filter on
the set of assigned posteriors (and then renormalize if needed). An efficient
way to do this is to represent the set of assigned responsibilities as an image
and apply a standard Gaussian smoothing filter.
4. (Standard M-step) Use the smoothed responsibilities in order to update the
parameter θ as in standard EM [9]. If the algorithm has converged, stop; otherwise go to step 2.
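The four steps above can be sketched in plain Python as follows. This is a simplified illustration under several assumptions that are not taken from the paper: labels are binary, p(c_i | s_i) is a fixed Gaussian around a per-label mean gray level (`means`, `sigma`), only the prior p(s_i) is learned, a 3×3 binomial kernel stands in for the Gaussian smoothing filter, and the initial guess is fixed rather than random so that the run is reproducible.

```python
import math

KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))  # 3x3 binomial stand-in for a Gaussian filter


def smooth(grid):
    """Blur a 2-D grid with KERNEL, replicating pixels at the border."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    out[y][x] += KERNEL[dy + 1][dx + 1] * grid[yy][xx] / 16.0
    return out


def likelihood(c, mean, sigma):
    """Unnormalized Gaussian p(c | s); the normalizer cancels for a shared sigma."""
    return math.exp(-((c - mean) ** 2) / (2.0 * sigma ** 2))


def spatial_em(image, means=(0.1, 0.9), sigma=0.2, max_iter=50, tol=1e-6):
    """Return (smoothed P(s_i = 1) per pixel, learned prior [p(s=0), p(s=1)])."""
    prior = [0.5, 0.5]  # step 1, with a fixed start instead of a random guess
    n = len(image) * len(image[0])
    resp = None
    for _ in range(max_iter):
        # Step 2 (E-step): Bayes posterior of label 1 for every pixel.
        resp = []
        for row in image:
            line = []
            for c in row:
                p0 = prior[0] * likelihood(c, means[0], sigma)
                p1 = prior[1] * likelihood(c, means[1], sigma)
                line.append(p1 / (p0 + p1))
            resp.append(line)
        # Step 3: smooth the responsibilities. With two labels, 1 - resp is the
        # smoothed responsibility of label 0, so no renormalization is needed.
        resp = smooth(resp)
        # Step 4 (M-step): the prior is the mean smoothed responsibility.
        new_p1 = sum(map(sum, resp)) / n
        converged = abs(new_p1 - prior[1]) < tol
        prior = [1.0 - new_p1, new_p1]
        if converged:
            break
    return resp, prior


# Toy image scaled to [0, 1]: dark left half, bright right half, one stray bright pixel.
image = [[0.1] * 3 + [0.9] * 3 for _ in range(4)]
image[1][1] = 0.9
posterior, prior = spatial_em(image)
mask = [[p > 0.5 for p in row] for row in posterior]
```

In this toy run the stray bright pixel is absorbed into the dark region, which is the effect the smoothing step is meant to have: the binary mask follows spatially coherent regions rather than individual pixel values.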
4 Demonstration
In order to make a preliminary evaluation of the effectiveness of the proposed approach,
we used 10 Learning Object sets, each consisting of 135 learning object metadata records,
that is, 9 Learning Objects per Learning Resource Type (simulating the historical log files
of 10 different end-users), and a set of 20 learning object metadata records (simulating
recommendations from a passive social filtering system), with values normally distributed
over the value space of each metadata element. The goal of the evaluation was to test
the ability to filter out learning resources whose educational footprint does not
match the educational preferences of a given end-user. This preliminary
evaluation provides evidence that such an add-on service has the potential to enhance
social filtering techniques via educationally informed filtering decisions.
Fig. 3 presents an example of how the educational footprint for a set of 9 Learning
Objects per Learning Resource Type is generated, depicting the step-by-step result of
this process for the “Interactivity Type” metadata attribute. As we can
observe, this is an incremental process that starts with the representation of the
educational footprint of the first learning object in the set (Fig. 3a), continues with the
representation of the first 2x2 learning objects (Fig. 3b), then with the representation
of the first 3x3 learning objects (Fig. 3c), and so on for larger sets of learning objects
(Fig. 3d).
Resource ID    Learning Resource Interactivity Type    Color Code
Resource #1    active
Resource #2    expositive
Resource #3    expositive
Resource #4    mixed
Resource #5    active
Resource #6    active
Resource #7    expositive
Resource #8    mixed
Resource #9    expositive
…              …                                       …
Fig. 3. Generating the educational footprint of a set of learning resources.
This representation is used as an input for generating the resource filter for the
educational post-filtering of the resources suggested by a typical social filtering
system/application. An example of such a filter is presented in Fig.4.
[Fig. 4, panels (a) and (b): rows RT 1, RT 2, RT 3, RT 4, ...; columns P1, P2, P3, P4, P5, P6, ...]
Fig. 4. (a) Example of representing a set of 16 learning objects (4 per learning resource
type); (b) result of the proposed algorithm acting as a post-filter for future recommendations.
5 Conclusions
In this paper we propose the use of an add-on filtering service on existing social
filtering systems/applications so as to create a data post-filtering mechanism that
makes use of intelligence stored in TEL metadata. This work was
inspired by the idea of using information visualization for accessing Learning Object
Repositories. Our goal was to investigate how image segmentation techniques could
be applied in order to enhance the social filtering process of educational content. The
proposed methodology starts with the generation of a matrix that represents the
educational characteristics of the resources suggested by typical social filtering
techniques and applies post-filtering using the educational “footprint” of the resources
already used by the targeted end-user. We treat the filtering problem as an inference
problem, assuming that each pixel in the educational content visualization has a
hidden binary label associated with it which specifies if it is appropriate for the
targeted learner or not. In order to solve the inference problem, we use a variation of
the EM algorithm which incorporates the spatial constraints with just a small
computational overhead.
References
1. Ahn, J.-W., Farzan, R., Brusilovsky, P.: Social Search in the Context of Social
Navigation. Journal of the Korean Society for Information Management, vol. 23(2), pp.
147--165 (2006)
2. Recker, M. M., Walker, A., Wiley, D. A.: Collaboratively filtering learning objects. In:
Wiley, D. A. (ed.) The Instructional Use of Learning Objects: Online Version (2000)
3. Craven, M., DiPasquo, D., Freitag, D., McCallum, A., Mitchell, T., Nigam, K., Slattery, S.:
Learning to construct knowledge bases from the World Wide Web. Artificial Intelligence,
vol. 118(1-2), pp. 69--113 (2000)
4. Mobasher, B., Cooley, R., Srivastava, J.: Automatic personalization based on web usage
mining. Communications of the ACM, vol. 43(8), pp. 142--151 (2000)
5. Mobasher, B., Dai, H., Luo, T., Nakagawa, M.: Discovery and evaluation of aggregate
usage profiles for web personalization. Data Mining and Knowledge Discovery, vol. 6, pp.
61--82 (2002)
6. Klerkx, J., Duval, E., Meire, M.: Using Information Visualization for Accessing Learning
Object Repositories. In: Proc. of the 8th IEEE International Conference on Information
Visualisation, pp. 465--470 (2004)
7. Dempster, A. P., Laird, N. M., Rubin, D. B.: Maximum likelihood from incomplete data
via the EM algorithm. J. Roy. Statist. Soc. B, vol. 39, pp. 1--38 (1977)
8. Diplaros, A., Vlassis, N., Gevers, T.: A spatially constrained generative model and an
EM algorithm for image segmentation. IEEE Transactions on Neural Networks, vol. 18(3),
pp. 798--808 (2007)
9. Pierrakos, D., Paliouras, G., Papatheodorou, C., Spyropoulos, C. D.: Web usage mining
as a tool for personalization: A survey. User Modeling and User-Adapted Interaction, vol.
13, pp. 311--372 (2003)
10. Neal, R. M., Hinton, G. E.: A view of the EM algorithm that justifies incremental,
sparse, and other variants. In: Jordan, M. I. (ed.) Learning in Graphical Models, pp.
355--368. Kluwer Academic Publishers (1998)