<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SlideWiki – A Platform for Authoring FAIR Educational Content</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ali Khalili</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Klaas Andries de Graaf</string-name>
          <email>ka.de.graaf@vu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Vrije Universiteit Amsterdam</institution>
          ,
          <addr-line>NL</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>SlideWiki.org is a Web-based OpenCourseWare (OCW) authoring system that enables educators and learners to collaborate on creating, sharing, re-using and re-purposing multilingual open educational content. The SlideWiki platform allows people to author FAIR (Findable, Accessible, Interoperable, and Reusable) educational content, and it offers several features for semantically enriching that content. In this paper we present these features: a Linked Data interface, manual and automatic content annotation, content linking, and metadata.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        A major obstacle to increasing the efficiency, effectiveness and quality of
education is the lack of widely available, accessible, multilingual, timely, engaging and
high-quality educational material (i.e. OpenCourseWare). The creation of
comprehensive OpenCourseWare (OCW) is tedious, time-consuming, and expensive.
Courseware employed by educators is therefore often incomplete, outdated, dull,
and inaccessible to learners with disabilities. With the open-source SlideWiki
platform (available at SlideWiki.org), the effort of creating, translating, and
evolving FAIR (Findable, Accessible, Interoperable, and Reusable) OCW can
be widely shared (i.e. crowdsourced). Similarly to Wikipedia for encyclopaedic
content, SlideWiki [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] allows its users (1) to collaboratively create
comprehensive OCW (curricula, slide presentations, self-assessment tests, illustrations etc.)
online in a crowdsourcing manner, (2) to semi-automatically translate this
content into more than 50 different languages and to improve the translations in a
collaborative manner, and (3) to support engagement and social networking of
educators and learners around that content. SlideWiki.org (funded by an EU
H2020 grant<fn id="fn1"><label>1</label><p>See http://slidewiki.eu for more information.</p></fn>) is already used by thousands of educators and learners.
      </p>
      <p>Figure 1 depicts the 3-tier technical architecture of the SlideWiki platform,
in which data, service and user interaction concerns are decoupled as individual
stand-alone components. Our contribution to semantically enriching educational
content in SlideWiki touches all three layers: RDF and Linked Data
versions of the content are provided in the data layer; NLP (Natural Language
Processing) services are exposed for automatic content annotation; and user
interfaces for manual content annotation, together with inline metadata that increases
the findability of content, are provided in the interaction layer. The following
subsections briefly introduce the semantics-related features of SlideWiki.</p>
    </sec>
    <sec id="sec-2">
      <title>1.1 RDF Generation and Linked Data Interface</title>
      <p>
        An RDF data model allows easy exposure and integration of educational data
and the interlinking of data across system boundaries. To foster the
generation of RDF from legacy systems, there is a
stack of technologies and standards: a) R2RML (a W3C recommendation<fn id="fn2"><label>2</label><p>https://www.w3.org/TR/r2rml/</p></fn>)
to generate RDF from relational databases, b) RML (an extension of R2RML<fn id="fn3"><label>3</label><p>http://rml.io/RML_R2RML.html</p></fn>)
to generate RDF from JSON, XML and CSV datasets, and c) XR2RML [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] (an
extension of RML) to generate RDF from non-relational databases such as MongoDB.
      </p>
      <p>To generate the RDF version of SlideWiki, we employed Morph
(Morph-XR2RML<fn id="fn4"><label>4</label><p>https://github.com/frmichel/morph-xr2rml/tree/morph-xr2rml-1.0</p></fn>), which implements the XR2RML standard. Morph is a Java program
that reads two files: a config file and a mapping file. Although developers can
modify both files, in practice the config file is fixed (stable over
time) and the real effort goes into the mapping file, which establishes
how the elements in the SlideWiki MongoDB instance are converted. The result
of the SlideWiki RDF conversion is exposed as a Linked Data interface on top of an
open SPARQL endpoint<fn id="fn5"><label>5</label><p>Available at http://slidewiki.oeg-upm.net/sparql</p></fn>.</p>
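      <p>To make the role of a mapping concrete, the following minimal sketch turns one MongoDB-style deck document into N-Triples lines. It is plain Python, not XR2RML mapping syntax and not the Morph implementation. The field names (_id, title, language, tags), the example.org base URI, and the choice of Dublin Core terms are assumptions made for illustration only.</p>
      <preformat><![CDATA[
```python
"""Sketch: map one MongoDB-style deck document to N-Triples lines."""

# Hypothetical namespaces; the URIs and vocabulary used by SlideWiki may differ.
BASE = "http://example.org/slidewiki/deck/"
DCT = "http://purl.org/dc/terms/"


def escape_literal(value):
    """Escape a Python string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def deck_to_ntriples(deck):
    """Map the fields of a deck document to RDF triples, one per line."""
    subject = "<" + BASE + str(deck["_id"]) + ">"
    title = escape_literal(deck["title"])
    language = escape_literal(deck["language"])
    triples = [
        f'{subject} <{DCT}title> "{title}" .',
        f'{subject} <{DCT}language> "{language}" .',
    ]
    # A multi-valued field yields one triple per value.
    for tag in deck.get("tags", []):
        triples.append(f'{subject} <{DCT}subject> "{escape_literal(tag)}" .')
    return triples


# Illustrative document; the real MongoDB schema is not described in the text.
example_deck = {
    "_id": 42,
    "title": 'Intro to "Linked Data"',
    "language": "en",
    "tags": ["RDF", "SPARQL"],
}

for line in deck_to_ntriples(example_deck):
    print(line)
```
]]></preformat>
      <p>In a declarative mapping language the same correspondences (document field to predicate, document identifier to subject URI) would be stated as rules in the mapping file rather than coded by hand, which is why the effort concentrates on that file.</p>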
    </sec>
    <sec id="sec-3">
      <title>1.2 Manual Content Annotation</title>
      <p>Presentations get more visibility when they are tagged with words or entities
about their topic and contents. Annotations make the contents of a deck more
meaningful to users as tags tell a user what the deck is about. An author can
manually set tags for his/her presentation. To support the user in this process,
tag recommendations are calculated via a Natural Language Processing (NLP)
service (based on calculation of important words and named entities in the
presentation) and presented to the user. The user can select/approve recommended
tags to improve the semantic enrichment of his/her presentation (cf. Figure 2).</p>
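      <p>The calculation of important words behind tag recommendation can be sketched as follows. The paper names TF-IDF as the ranking used by the NLP service; the smoothed IDF formula, the tokenizer, and the function names below are assumptions for illustration and not the code of the SlideWiki service.</p>
      <preformat><![CDATA[
```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def recommend_tags(deck_text, corpus_texts, top_n=3):
    """Rank the words of one deck by TF-IDF against a corpus of other decks."""
    documents = [set(tokenize(text)) for text in corpus_texts]
    term_counts = Counter(tokenize(deck_text))
    total_terms = sum(term_counts.values())
    scores = {}
    for term, count in term_counts.items():
        tf = count / total_terms
        # Smoothed IDF, so terms absent from the corpus do not divide by zero.
        doc_freq = sum(1 for doc in documents if term in doc)
        idf = math.log((1 + len(documents)) / (1 + doc_freq)) + 1
        scores[term] = tf * idf
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:top_n]


# Illustrative corpus and deck text, not taken from SlideWiki.
corpus = ["the web", "the course", "the rdf graph"]
print(recommend_tags("sparql sparql rdf rdf the", corpus, top_n=2))
```
]]></preformat>
      <p>Words that are frequent in the deck but rare in the rest of the corpus rank highest, which is what makes them useful tag candidates for the author to approve.</p>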
      <p>Another content annotation method is in-slide annotation. Users select
phrases in a slide and annotate them as instances of an ontology
class. We will provide three sources of classes: our own SlideWiki-specific ontology (partially based
on existing OCW and OER ontologies), a user-created or uploaded ontology, and
existing ontologies or vocabularies used in knowledge bases that are accessible on
the internet via a SPARQL endpoint. We also call DBpedia Spotlight to
suggest annotations for phrases based on the DBpedia knowledge base (linking
to a DBpedia instance or instantiating as a DBpedia concept). Presentations are
considered semantically enriched when they are linked to external knowledge
bases. The in-slide annotation (cf. Figure 3) is still under development.</p>
    </sec>
    <sec id="sec-4">
      <title>1.3 Automatic Content Annotation</title>
      <p>SlideWiki also annotates content automatically. For each
slide, the NLP analysis of the NLP
service (see https://nlpstore.experimental.slidewiki.org/documentation)
identifies DBpedia Spotlight entities, each with its DBpedia URI. Named Entity Recognition is used to recommend tags
via the NLP service. The results computed by the NLP service are stored in the
NLPstore, so that later annotation and tag suggestions can be served quickly
by querying pre-processed recommendations.</p>
      <p>The main NLP API, nlpForDeck, encapsulates several NLP steps
in one service. It can be divided into three main parts: 1) preprocessing (such as HTML-to-text
conversion, automatic language detection, and tokenization); 2) automatic retrieval of the
entities used in the slides via DBpedia Spotlight (used for in-slide annotation and
semantic enrichment); and 3) identification of important words and entities that are useful for
tagging the presentation in the platform<sup>6</sup>.</p>
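      <p>The preprocessing part of such a pipeline (HTML to text, language detection, tokenization) could look roughly like the sketch below. It does not reproduce nlpForDeck: the function names are hypothetical, only the Python standard library is used, and the stop-word based language guess is a deliberately naive stand-in for a real language detector. Entity retrieval and word ranking are left out.</p>
      <preformat><![CDATA[
```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment and drop all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment with normalized spaces."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(" ".join(extractor.parts).split())


# Tiny illustrative stop-word lists; a real detector would use a trained model.
STOP_WORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "de": {"der", "die", "das", "und", "ist", "von"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def detect_language(tokens):
    """Pick the language whose stop words occur most often (ties: first listed)."""
    hits = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOP_WORDS.items()
    }
    return max(hits, key=hits.get)


def preprocess(slide_html):
    """Run the three preprocessing steps on the HTML of one slide."""
    text = html_to_text(slide_html)
    tokens = tokenize(text)
    return {"text": text, "language": detect_language(tokens), "tokens": tokens}


print(preprocess("<h1>The Web</h1><p>RDF is the basis of Linked Data</p>"))
```
]]></preformat>
      <p>Detecting the language before the later steps matters in a multilingual platform, because entity spotting and stop-word handling are language-specific.</p>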
    </sec>
    <sec id="sec-5">
      <title>1.4 Content Interlinking</title>
      <p>Presentations are considered semantically enriched when linked to presentations
with similar content. By viewing related presentations, the user can dive deeper into a
topic and understand it better. This can
be done either by showing presentations with the same tags or by linking to
presentations with similar content (even if they do not share the same tags).
The latter is performed by deck recommendation based on the content of
the presentation.</p>
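      <p>Content-based linking of this kind can be illustrated with a small similarity measure. The sketch below compares decks by the cosine similarity of their word counts, so that two decks can be related without sharing any tag. The text does not state which similarity measure SlideWiki uses; cosine similarity, the deck identifiers, and the function names are assumptions chosen for illustration.</p>
      <preformat><![CDATA[
```python
import math
import re
from collections import Counter


def term_vector(text):
    """Represent a text as a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_decks(target_id, decks, top_n=2):
    """Return the ids of the decks most similar in content to the target deck."""
    target = term_vector(decks[target_id])
    scored = [
        (cosine_similarity(target, term_vector(text)), deck_id)
        for deck_id, text in decks.items()
        if deck_id != target_id
    ]
    # Most similar first; ties are broken by deck id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Decks without any shared word are not recommended.
    return [deck_id for score, deck_id in scored[:top_n] if score > 0]


# Illustrative decks, keyed by hypothetical identifiers.
decks = {
    "rdf-basics": "RDF triples and graphs",
    "sparql-intro": "Querying RDF graphs with SPARQL",
    "cooking": "Baking bread at home",
}
print(related_decks("rdf-basics", decks))
```
]]></preformat>
      <p>Raw word counts are the simplest possible representation; weighting the terms first, for example by TF-IDF, would reduce the influence of very common words.</p>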
    </sec>
    <sec id="sec-6">
      <title>1.5 Metadata</title>
      <p>As most web search engines incorporate semantic data in their search, semantic
enrichment of educational resources results in a higher ranking and better
visibility. This is particularly important for improving how content within SlideWiki
performs in search engines. SlideWiki will make use of HTML metadata tags as
well as embedded Microdata and JSON-LD descriptions of educational resources
to support SEO (Search Engine Optimization).</p>
      <p><sup>6</sup> The importance of the words and entities is determined via TF-IDF (term
frequency-inverse document frequency) ranking.</p>
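      <p>An embedded JSON-LD description of a deck, as mentioned in this section, could be generated along the following lines. The sketch uses the schema.org type PresentationDigitalDocument with a few general CreativeWork properties. Which types and properties SlideWiki actually emits is not stated in the text, so the selection, the deck fields, and the example.org URL are assumptions.</p>
      <preformat><![CDATA[
```python
import json


def deck_jsonld(deck):
    """Describe a deck as a schema.org JSON-LD object (as a Python dict)."""
    return {
        "@context": "https://schema.org",
        "@type": "PresentationDigitalDocument",
        "name": deck["title"],
        "inLanguage": deck["language"],
        "keywords": ", ".join(deck["tags"]),
        "url": deck["url"],
    }


def jsonld_script_tag(deck):
    """Wrap the JSON-LD in the script element that is embedded in the page."""
    # Escape "</" so that the JSON text cannot close the script element early.
    payload = json.dumps(deck_jsonld(deck), indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Illustrative deck record; field names and URL are hypothetical.
example_deck = {
    "title": "Intro to RDF",
    "language": "en",
    "tags": ["RDF", "Linked Data"],
    "url": "https://example.org/deck/1",
}
print(jsonld_script_tag(example_deck))
```
]]></preformat>
      <p>Reusing the tags from manual and automatic annotation as keywords means that the semantic enrichment described in the earlier subsections also becomes visible to search engines.</p>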
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>A.</given-names>
            <surname>Khalili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Tarasowa</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Ermilov</surname>
          </string-name>
          .
          <article-title>SlideWiki: elicitation and sharing of corporate knowledge using presentations</article-title>
          .
          <source>In International Conference on Knowledge Engineering and Knowledge Management</source>
          , pages
          <fpage>302</fpage>
          –
          <lpage>316</lpage>
          . Springer,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>F.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Djimenou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faron-Zucker</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Montagnat</surname>
          </string-name>
          .
          <article-title>Translation of relational and non-relational databases into RDF with xR2RML</article-title>
          .
          <source>In WEBIST2015</source>
          , pages
          <fpage>443</fpage>
          –
          <lpage>454</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>