<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploiting Image Segmentation Techniques for Social Filtering of Educational Content</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pythagoras Karampiperis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aristeidis Diplaros</string-name>
          <email>diplaros@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Technology Education and Digital Systems, University of Piraeus</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Informatics Institute, University of Amsterdam</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <fpage>45</fpage>
      <lpage>52</lpage>
      <abstract>
        <p>The need for applying advanced social information retrieval techniques for personalizing web-based information discovery has been identified as a key challenge. Until now, significant R&amp;D effort has been devoted to applying collaborative filtering techniques to educational content retrieval. However, limited attention has been given to the use of educational metadata as a means to enhance social filtering techniques via educationally informed filtering decisions. In this paper we propose the use of an add-on filtering service on existing social filtering systems/applications so as to create a data post-filtering mechanism that makes use of intelligence stored in Technology Enhanced Learning (TEL) metadata. The proposed methodology starts with the generation of a matrix that represents the educational characteristics of the resources suggested by typical social filtering techniques and applies post-filtering using the educational “footprint” of the resources already used by the targeted end-user.</p>
      </abstract>
      <kwd-group>
        <kwd>Technology Enhanced Learning</kwd>
        <kwd>Educational Metadata</kwd>
        <kwd>Social Filtering</kwd>
        <kwd>Data Clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        The rapid evolution of Web 2.0 applications implies that, on the one hand,
increasingly complex and dynamic web-based learning infrastructures need to be
managed more efficiently and, on the other hand, new types of learning services and
mechanisms need to be developed and provided. To meet current needs, such
services should satisfy a diverse range of requirements, for example
personalization based on social filtering [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        In this context, the need for applying advanced social information retrieval
techniques for personalizing web-based information discovery and retrieval has been
identified as a key challenge. This need is more pressing in Technology
Enhanced Learning (TEL) applications, since the Web hosts a vast variety of digital learning
resources that have the potential to facilitate teaching and learning tasks. Until
now, significant R&amp;D effort has been devoted to applying collaborative
filtering techniques to educational content retrieval [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. These techniques use
usage log files over a set of educational resources to provide personalized
recommendations by comparing the profile of the learner at hand with similar
persons/groups recorded in the historical log data [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ]. However, limited attention
has been given to the use of educational metadata as a means to enhance social
filtering techniques via educationally informed filtering decisions.
      </p>
      <p>
        In this paper we propose the use of an add-on filtering service on existing social
filtering systems/applications so as to create a data post-filtering mechanism that
makes use of intelligence stored in TEL metadata. This work was
inspired by the idea of using information visualization for accessing Learning Object
Repositories [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Our goal was to investigate how image segmentation techniques
could be applied in order to enhance the social filtering process of educational
content. More precisely, the proposed methodology starts with the generation of a
matrix that represents (in visual form) the educational characteristics of the resources
suggested by typical social filtering techniques and applies post-filtering using the
educational “footprint” of the resources already used by the targeted end-user. For the
generation of the resource filter we utilize image segmentation techniques, taking into
account the spatial coherence of the created visual representation. We treat the
filtering problem as an inference problem, assuming that each pixel in the educational
“footprint” (visualization) has a hidden binary label associated with it which specifies
if it is appropriate for the targeted learner or not. In order to solve the inference
problem, we use a variation of the EM algorithm [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] which incorporates the spatial
constraints with just a small computational overhead [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Moreover, a potential drawback of social filtering techniques is that
the models used are not fully transparent to the end-user, which affects the end-users’
trust in the provided recommendations [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Since the filter generated by the proposed
approach is represented visually, end-users can directly observe the core of the
educational filtering process and make modifications/updates if desired.
      </p>
      <p>The paper is structured as follows. In section 2, we discuss how educational
metadata can be used to generate the educational “footprint” (visualization)
of a set of educational resources. Section 3 presents the proposed methodology for
generating the post-filter for the resources recommended by typical social filtering
techniques, using as input the educational “footprint” of the resources already used
by the targeted end-user. Finally, section 4 demonstrates the application of the proposed
visualization and filtering process on an easy-to-understand real-life scenario.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Social Filtering via Educational Metadata Visualizations</title>
      <p>Social filtering is a method for making automatic predictions (filtering) about the
preferences of a user by collecting preference information from many users. The
underlying assumption of social filtering is that users with similar preferences in
the past tend to have similar preferences in the future. There are three main types of
social filtering: active filtering, passive filtering and item-based filtering. Active
filtering uses a peer-to-peer approach, based on explicit user ratings over a set of
available digital resources. Passive filtering, in contrast, uses preference
information that is collected implicitly via usage log files; it relies on
the historical actions of users to determine a value rating for digital content. Finally,
in item-based filtering, items (digital resources) rather than users are rated and used as
parameters. This type of filtering uses the ratings to group
items so as to enable potential users to compare them.</p>
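      <p>As a rough illustration of passive filtering, the sketch below ranks unseen resources for one user from usage logs alone. It assumes that each log reduces to a set of used resource identifiers and that user similarity is measured with the Jaccard index; neither choice is prescribed by this paper, and the names usage_log and recommend are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets of resource identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(usage_log, target_user, top_n=3):
    """Rank resources the target user has not used yet.

    Each candidate resource is scored by the summed similarity of the
    other users who used it (an assumed, deliberately simple scheme).
    """
    seen = usage_log[target_user]
    scores = Counter()
    for user, resources in usage_log.items():
        if user == target_user:
            continue
        weight = jaccard(seen, resources)
        for resource in resources - seen:
            scores[resource] += weight
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [resource for resource, score in ranked[:top_n] if score > 0]


# Hypothetical usage logs: user -> set of resources already used.
usage_log = {
    "alice": {"r1", "r2", "r3"},
    "bob": {"r1", "r2", "r4"},
    "carol": {"r5"},
}
print(recommend(usage_log, "alice"))
```
]]></preformat>
      <p>In this toy log only one other user shares resources with the target user, so only that user's unseen resources receive a positive score. The output of such a step is what the post-filter proposed here would receive as input.</p>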
      <p>Our proposed method is an add-on filtering service on existing passive social
filtering systems/applications, utilizing intelligence stored in TEL metadata. The main
idea of the proposed approach is to post-filter the recommendations provided by
typical passive social filtering techniques using the educational “footprint” of the
resources already used by the targeted end-user. To achieve this, we create a matrix
that represents (in visual form) the educational characteristics of the resources already
recorded in the historical log files. Based on this matrix, we generate another matrix
that represents the educational preferences of the targeted user. The latter matrix acts
as an educational post-filter on the resources suggested by a typical social filtering
system. This post-filtering is made by comparing the generated filter with the
educational “footprint” of the resources suggested by a passive social filtering
technique. The following paragraphs present how educational metadata are used to create the
educational “footprint” of a single resource, as well as of a set of resources. The same
method is used for creating both the educational representation of the
resources already used by the targeted user (which is the input for the filter
generation process) and the educational representation of the resources suggested by
a passive social filtering technique (which is the input for the post-filtering process).</p>
      <sec id="sec-2-1">
        <title>2.1 Creating the Educational Footprint of a Learning Resource</title>
        <p>In order to generate the educational footprint (representation) of an educational
resource we use the corresponding metadata record, a subset of the IEEE Learning
Object Metadata (LOM) standard elements. The metadata elements used were
selected in such a way that each element uses a specific state vocabulary, as illustrated
in Table 1.</p>
        <p>The educational footprint of a learning resource is a 15 × 8 pixel image, where the first
dimension (lines) stands for the states of the Learning Resource Type attribute and the
second dimension (columns) stands for the remaining eight attributes used. Each pixel is
colored according to the value of the corresponding attribute of the second dimension.
The color coding used for each metadata attribute j is defined by the formula:
<disp-formula><tex-math><![CDATA[\mathrm{Color}_j^{\mathrm{RED}} = \mathrm{Color}_j^{\mathrm{GREEN}} = \mathrm{Color}_j^{\mathrm{BLUE}} = \left(1 - \frac{k_j}{N}\right) \times 255,]]></tex-math></disp-formula>
where N stands for the number of vocabulary states of metadata attribute j, and
k<sub>j</sub> is the state code of attribute j for a given educational resource.</p>
        <p>The “Learning Resource Type” metadata element was used as the line
dimension of the resource visual matrix. Thus, no color
coding was used for this metadata element,
since each line (or set of lines) in the visual
matrix directly represents its value.</p>
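        <p>The color coding above can be sketched as follows. The sketch assumes that state codes run from 1 to N, that lines not belonging to the resource's own Learning Resource Type stay white, and that the vocabulary sizes shown are placeholders; none of these details is stated in the text, and the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
def gray_level(state_code, n_states):
    """Gray value (1 - k_j / N) * 255, shared by the red, green and blue channels."""
    if not 1 <= state_code <= n_states:
        raise ValueError("state code must lie in 1..n_states (assumed coding)")
    return round(255 * (n_states - state_code) / n_states)


def footprint(resource_type, state_codes, vocab_sizes, n_types=15, background=255):
    """Build the 15 x 8 footprint of one resource as a list of lines.

    Only the line of the resource's own type is colored; all other lines
    keep the background value (an assumption of this sketch).
    """
    image = [[background] * len(state_codes) for _ in range(n_types)]
    image[resource_type] = [
        gray_level(k, n) for k, n in zip(state_codes, vocab_sizes)
    ]
    return image


# Placeholder vocabulary sizes for the eight column attributes.
vocab_sizes = [3, 5, 5, 5, 5, 4, 5, 3]
# Hypothetical state codes of one resource whose type has line index 4.
state_codes = [1, 2, 5, 3, 1, 4, 2, 3]

image = footprint(4, state_codes, vocab_sizes)
print(image[4])
```
]]></preformat>
        <p>Under this coding the first state of a three-state vocabulary maps to a light gray and the last state maps to black, so darker pixels indicate higher state codes.</p>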
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Creating the Educational Footprint of a Set of Learning Resources</title>
        <p>In order to generate the representation of a set of learning resources, we start from the
representation of the first learning resource in the set and extend the resolution of the
generated image each time n × n resources (n ≥ 2, n ∈ ℕ*) are available per learning resource
type. The size of the generated representation for a set is therefore (15k) × (8k) pixels,
where k ∈ ℕ*. As a result, the generated visualizations can be 15 × 8, 30 × 16,
45 × 24, … pixels. Fig. 2 presents the aggregated representation of the resources
shown in the previous section (Fig. 1).</p>
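        <p>One possible reading of this aggregation is sketched below: each pixel of the single-resource footprint becomes a k × k block, and the k × k resources of a given type fill the blocks of that type's lines. The exact placement of resources inside a block is an assumption of this sketch, as are the function names and the placeholder vocabulary sizes.</p>
        <preformat><![CDATA[
```python
def gray_level(state_code, n_states):
    """Gray value (1 - k_j / N) * 255 for a state code in 1..n_states (assumed)."""
    return round(255 * (n_states - state_code) / n_states)


def set_footprint(resources_by_type, vocab_sizes, k):
    """Tile k*k resources per learning resource type into a (15k) x (8k) image.

    resources_by_type[t] holds k*k lists of state codes, one list per
    resource of type t, in the order in which the resources were used.
    """
    n_types, n_attrs = len(resources_by_type), len(vocab_sizes)
    image = [[0] * (n_attrs * k) for _ in range(n_types * k)]
    for t, resources in enumerate(resources_by_type):
        if len(resources) != k * k:
            raise ValueError("each type needs exactly k*k resources")
        for index, state_codes in enumerate(resources):
            a, b = divmod(index, k)  # position of this resource inside every block
            for j, (code, n) in enumerate(zip(state_codes, vocab_sizes)):
                image[t * k + a][j * k + b] = gray_level(code, n)
    return image


# Hypothetical data: 15 types, 9 resources per type, 8 three-state attributes.
vocab_sizes = [3] * 8
resources_by_type = [
    [[(t + i + j) % 3 + 1 for j in range(8)] for i in range(9)]
    for t in range(15)
]
image = set_footprint(resources_by_type, vocab_sizes, k=3)
print(len(image), len(image[0]))
```
]]></preformat>
        <p>With nine resources per type (k = 3) the result has 45 lines and 24 columns, which matches the third size listed above.</p>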
        <p>The next section presents the methodology for generating the educational post-filter
(that is, a matrix which represents the educational preferences of the targeted user) for
the resources suggested by a typical passive social filtering system.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3 Generating the Filter for Educational Resource Post-Filtering</title>
      <p>
        The core idea of the filter generation method used in this paper is to treat the pixel
labels of a representation as independent random variables drawn from a common prior
distribution p(s<sub>i</sub>) (which we learn with the EM algorithm), but to constrain
their posterior distributions (computed in the E-step of the EM algorithm) according
to the spatial dependencies between pixels [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Although educational metadata
properties are correlated, treating them as independent random variables
does not appear (from preliminary investigation) to affect the filtering process.
This issue will be investigated in more depth in the future, since in this
paper our goal was to set up the framework for educational post-filtering of social
filtering processes rather than to compare in depth the data clustering techniques that
handle the correlation of educational metadata.
      </p>
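      <p>In particular, we define a log-likelihood function:
<disp-formula><label>(1)</label><tex-math><![CDATA[L(\theta) = \sum_{i=1}^{n} \log \sum_{s_i} p(c_i \mid s_i)\, p(s_i)]]></tex-math></disp-formula></p>
      <p>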
where the parameter θ summarizes all unknown parameters in the model. These
unknown parameters are learned by the EM algorithm [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. More precisely, θ
includes the prior probability of each state of the educational metadata parameters. In
order to incorporate the spatial constraints on the pixel labels into an EM algorithm, we
employ a variational approximation in which we maximize in each step a lower bound
of L(θ). This bound F(θ, Q) is a function of the current mixture parameters θ and a
factorized distribution <inline-formula><tex-math><![CDATA[Q = \prod_{i=1}^{n} q_i(s_i)]]></tex-math></inline-formula>, where each q<sub>i</sub>(s<sub>i</sub>) corresponds to pixel i
but defines an otherwise arbitrary discrete distribution over s<sub>i</sub>.
      </p>
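      <p>For orientation, a bound of this kind can be written in the free-energy form of Neal and Hinton [<xref ref-type="bibr" rid="ref10">10</xref>], using the symbols of Eq. (1). This is the standard form and is given here as an assumption about the bound intended; the spatially constrained model of [<xref ref-type="bibr" rid="ref8">8</xref>] may include further terms.</p>
      <preformat><![CDATA[
```latex
\begin{aligned}
F(\theta, Q)
  &= \sum_{i=1}^{n} \sum_{s_i} q_i(s_i)\,
     \log \frac{p(c_i \mid s_i)\, p(s_i)}{q_i(s_i)} \\
  &= L(\theta) - \sum_{i=1}^{n}
     \mathrm{KL}\!\left( q_i(s_i) \,\middle\|\, p(s_i \mid c_i) \right)
  \;\le\; L(\theta)
\end{aligned}
```
]]></preformat>
      <p>Because each Kullback-Leibler term is non-negative, F never exceeds L(θ), and the two coincide when every q<sub>i</sub> equals the exact posterior, which is the distribution computed by a standard E-step.</p>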
      <p>
        An attractive property of the variational EM framework is that in each step of the
algorithm we are allowed to assign any distribution q<sub>i</sub>(s<sub>i</sub>) to individual pixels as long
as this increases the energy F. In summary, our variational EM algorithm is as
follows:
        <list list-type="order">
          <list-item><p>(Initialization) Start with a random guess for the parameter vector θ.</p></list-item>
          <list-item><p>(Standard E-step) Compute the Bayes posterior probabilities over pixel labels, given the pixel colors and the current estimate of θ.</p></list-item>
          <list-item><p>Smooth the responsibilities of neighboring pixels by applying a local filter on the set of assigned posteriors (and then renormalize if needed). An efficient way to do this is to represent the set of assigned responsibilities as an image and apply a standard Gaussian smoothing filter.</p></list-item>
          <list-item><p>(Standard M-step) Use the smoothed responsibilities in order to update the parameter θ as in standard EM [<xref ref-type="bibr" rid="ref9">9</xref>]. If the algorithm has converged, stop; otherwise go to step 2.</p></list-item>
        </list>
      </p>
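      <p>The four steps above can be sketched as follows for a gray-level footprint with binary labels. The sketch makes several assumptions that the text does not fix: the emission model p(c<sub>i</sub> | s<sub>i</sub>) is taken to be a Gaussian over gray values, a 3 × 3 binomial kernel stands in for the Gaussian smoothing filter, the random initialization uses a fixed seed, and a variance floor guards against degenerate clusters. All function and variable names are hypothetical.</p>
      <preformat><![CDATA[
```python
import math
import random


def smooth(grid):
    """Step 3: 3x3 binomial filter (a small Gaussian approximation), edges replicated."""
    h, w = len(grid), len(grid[0])
    kernel = ((1, 2, 1), (2, 4, 2), (1, 2, 1))
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += kernel[dy + 1][dx + 1] * grid[yy][xx]
            out[y][x] = total / 16.0
    return out


def gaussian(c, mean, var):
    """Assumed emission model p(c_i | s_i): a Gaussian over gray values."""
    return math.exp(-(c - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def spatial_em(image, max_iter=50, tol=1e-4, seed=0):
    """Return the smoothed responsibilities q_i(s_i = 1) of a two-label model."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    pixels = [c for row in image for c in row]
    # Step 1: random initial guess for theta = (prior, means, variances).
    means = sorted(rng.uniform(0, 255) for _ in range(2))
    variances, prior1 = [1000.0, 1000.0], 0.5
    q = [[0.5] * w for _ in range(h)]
    for _ in range(max_iter):
        # Step 2: Bayes posterior of label 1 for every pixel.
        for y in range(h):
            for x in range(w):
                p0 = (1.0 - prior1) * gaussian(image[y][x], means[0], variances[0])
                p1 = prior1 * gaussian(image[y][x], means[1], variances[1])
                q[y][x] = p1 / (p0 + p1) if p0 + p1 > 0.0 else 0.5
        # Step 3: smooth; q0 = 1 - q1 stays normalized because the kernel sums to 1.
        q = smooth(q)
        # Step 4: M-step with the smoothed responsibilities.
        flat_q = [v for row in q for v in row]
        old_means = list(means)
        prior1 = sum(flat_q) / len(flat_q)
        for label in (0, 1):
            weights = [v if label == 1 else 1.0 - v for v in flat_q]
            total = sum(weights)
            if total > 1e-9:
                means[label] = sum(wt * c for wt, c in zip(weights, pixels)) / total
                spread = sum(wt * (c - means[label]) ** 2 for wt, c in zip(weights, pixels))
                variances[label] = max(spread / total, 1.0)  # floor avoids collapse
        # Convergence test on the change of the cluster means.
        if max(abs(a - b) for a, b in zip(means, old_means)) < tol:
            break
    return q


# Toy 6 x 8 "footprint": dark left half, bright right half.
image = [[20] * 4 + [230] * 4 for _ in range(6)]
q = spatial_em(image)
labels = [[1 if v > 0.5 else 0 for v in row] for row in q]
print(labels[0])
```
]]></preformat>
      <p>Storing only q<sub>i</sub>(s<sub>i</sub> = 1) keeps the renormalization of step 3 implicit for the binary case; with more than two labels, each label image would be smoothed separately and the results renormalized per pixel.</p>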
    </sec>
    <sec id="sec-4">
      <title>4 Demonstration</title>
      <p>To make a preliminary evaluation of the effectiveness of the proposed approach,
we used 10 Learning Object sets, each consisting of 135 learning object metadata records,
that is, 9 Learning Objects per Learning Resource Type (simulating the historical log files
of 10 different end-users), and a set of 20 learning object metadata records (simulating
recommendations from a passive social filtering system), with normal distribution
over the value space of each metadata element. The goal of the evaluation was to test
the ability to filter out learning resources whose educational footprint does not
match the educational preferences of a given end-user. This preliminary
evaluation provides evidence that such an add-on service has the potential to enhance
social filtering techniques via educationally informed filtering decisions.</p>
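      <p>A data set of this shape could be simulated roughly as below. The mean and standard deviation of the normal distribution, the clipping to the vocabulary range, the seed, and the vocabulary sizes are all assumptions of this sketch, since the text does not report them; the function names are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def sample_state(rng, n_states):
    """Draw a state code in 1..n_states from a clipped, rounded normal distribution."""
    mean = (n_states + 1) / 2  # centre of the vocabulary (assumed)
    std = n_states / 4         # spread (assumed)
    code = round(rng.gauss(mean, std))
    return min(max(code, 1), n_states)


def simulate_log(rng, vocab_sizes, n_types=15, per_type=9):
    """One simulated usage log: per_type metadata records for each resource type."""
    return [
        [[sample_state(rng, n) for n in vocab_sizes] for _ in range(per_type)]
        for _ in range(n_types)
    ]


rng = random.Random(42)
vocab_sizes = [3, 5, 5, 5, 5, 4, 5, 3]  # placeholder vocabulary sizes

# Ten simulated end-user logs of 15 * 9 = 135 records each.
logs = [simulate_log(rng, vocab_sizes) for _ in range(10)]
# Twenty simulated recommendations: a random type index plus state codes.
recommendations = [
    (rng.randrange(15), [sample_state(rng, n) for n in vocab_sizes])
    for _ in range(20)
]
print(sum(len(records) for records in logs[0]))  # 135
```
]]></preformat>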
      <p>Fig. 3 presents an example of how the educational footprint for a set of 9 Learning
Objects per Learning Resource Type is generated, depicting the step-by-step result of
this process for the “Interactivity Type” metadata attribute. As we can
observe, this is an incremental process that starts with the representation of the
educational footprint of the first learning object in the set (Fig. 3a), continues with the
representation of the first 2 × 2 learning objects (Fig. 3b), then with the representation
of the first 3 × 3 learning objects (Fig. 3c), and so on for larger sets of learning objects
(Fig. 3d).</p>
      <table-wrap>
        <table>
          <thead>
            <tr>
              <th>Learning Resource ID</th>
              <th>Resource Interactivity Type</th>
              <th>Color Code</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>Resource #1</td><td>active</td><td></td></tr>
            <tr><td>Resource #2</td><td>expositive</td><td></td></tr>
            <tr><td>Resource #3</td><td>expositive</td><td></td></tr>
            <tr><td>Resource #4</td><td>mixed</td><td></td></tr>
            <tr><td>Resource #5</td><td>active</td><td></td></tr>
            <tr><td>Resource #6</td><td>active</td><td></td></tr>
            <tr><td>Resource #7</td><td>expositive</td><td></td></tr>
            <tr><td>Resource #8</td><td>mixed</td><td></td></tr>
            <tr><td>Resource #9</td><td>expositive</td><td></td></tr>
            <tr><td>…</td><td>…</td><td>…</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>This representation is used as an input for generating the resource filter for the
educational post-filtering of the resources suggested by a typical social filtering
system/application. An example of such a filter is presented in Fig. 4.</p>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusions</title>
      <p>In this paper we proposed the use of an add-on filtering service on existing social
filtering systems/applications so as to create a data post-filtering mechanism that
makes use of intelligence stored in TEL metadata. This work was
inspired by the idea of using information visualization for accessing Learning Object
Repositories. Our goal was to investigate how image segmentation techniques could
be applied in order to enhance the social filtering process of educational content. The
proposed methodology starts with the generation of a matrix that represents the
educational characteristics of the resources suggested by typical social filtering
techniques and applies post-filtering using the educational “footprint” of the resources
already used by the targeted end-user. We treat the filtering problem as an inference
problem, assuming that each pixel in the educational content visualization has a
hidden binary label associated with it, which specifies whether it is appropriate for the
targeted learner or not. In order to solve the inference problem, we use a variation of
the EM algorithm which incorporates the spatial constraints with just a small
computational overhead.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Ahn</surname>
            ,
            <given-names>J.-W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farzan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Brusilovsky</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Social Search in the Context of Social Navigation</article-title>
          .
          <source>Journal of the Korean Society for Information Management</source>
          , vol.
          <volume>23</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>147</fpage>
          --
          <lpage>165</lpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Recker</surname>
            ,
            <given-names>M. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walker</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Wiley</surname>
            ,
            <given-names>D. A.</given-names>
          </string-name>
          :
          <article-title>Collaboratively filtering learning objects</article-title>
          . In D. A. Wiley (Ed.),
          <source>The Instructional Use of Learning Objects: Online Version</source>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Craven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>DiPasquo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Freitag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>McCallum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nigam</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Slattery</surname>
          </string-name>
          .:
          <article-title>Learning to construct knowledge bases from the world wide web</article-title>
          .
          <source>Artificial Intelligence</source>
          , vol.
          <volume>118</volume>
          (
          <issue>1-2</issue>
          ), pp.
          <fpage>69</fpage>
          --
          <lpage>113</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cooley</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          :
          <article-title>Automatic personalization based on web usage mining</article-title>
          .
          <source>Communications of the ACM</source>
          , vol.
          <volume>43</volume>
          (
          <issue>8</issue>
          ), pp.
          <fpage>142</fpage>
          --
          <lpage>151</lpage>
          (
          <year>2000</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>B.</given-names>
            <surname>Mobasher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dai</surname>
          </string-name>
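          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Luo</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Nakagawa</surname>
          </string-name>
          :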
          <article-title>Discovery and evaluation of aggregate usage profiles for web personalization</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          , vol.
          <volume>6</volume>
          , pp.
          <fpage>61</fpage>
          --
          <lpage>82</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Klerkx</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Duval</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Meire</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Using Information Visualization for Accessing Learning Object Repositories</article-title>
          .
          <source>In Proc. of the 8th IEEE International Conference on Information Visualisation</source>
          , pp.
          <fpage>465</fpage>
          --
          <lpage>470</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>A. P.</given-names>
            <surname>Dempster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. M.</given-names>
            <surname>Laird</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. B.</given-names>
            <surname>Rubin</surname>
          </string-name>
          .:
          <article-title>Maximum likelihood from incomplete data via the EM algorithm</article-title>
          ,
          <source>J. Roy. Statist. Soc. B</source>
          , vol.
          <volume>39</volume>
          , pp.
          <fpage>1</fpage>
          --
          <lpage>38</lpage>
          (
          <year>1977</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.</given-names>
            <surname>Diplaros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Vlassis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Gevers</surname>
          </string-name>
          .:
          <article-title>A spatially constrained generative model and an EM algorithm for image segmentation</article-title>
          .
          <source>IEEE Transactions on Neural Networks</source>
          , vol.
          <volume>18</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>798</fpage>
          --
          <lpage>808</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>D.</given-names>
            <surname>Pierrakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Paliouras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Papatheodorou</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.D.</given-names>
            <surname>Spyropoulos</surname>
          </string-name>
          .:
          <article-title>Web usage mining as a tool for personalization: A survey</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          , vol.
          <volume>13</volume>
          , pp.
          <fpage>311</fpage>
          --
          <lpage>372</lpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Neal</surname>
          </string-name>
          and
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          :
          <article-title>A view of the EM algorithm that justifies incremental, sparse, and other variants</article-title>
          . In M. I. Jordan (Ed.),
          <source>Learning in Graphical Models</source>
          . Kluwer Academic Publishers, pp.
          <fpage>355</fpage>
          --
          <lpage>368</lpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>