=Paper=
{{Paper
|id=None
|storemode=property
|title=Linked Data based applications for Learning Analytics Research: faceted searches, enriched contexts, graph browsing and dynamic graphic visualisation of data
|pdfUrl=https://ceur-ws.org/Vol-974/lakdatachallenge2013_03.pdf
|volume=Vol-974
|dblpUrl=https://dblp.org/rec/conf/lak/MaturanaALIE13
}}
==Linked Data based applications for Learning Analytics Research: faceted searches, enriched contexts, graph browsing and dynamic graphic visualisation of data==
Ricardo Alonso Maturana, gnoss.com, +34 941248905, riam@gnoss.com

María Elena Alvarado, gnoss.com*, elenaalvarado@gnoss.com

María José Ibáñez, gnoss.com*, mariajoseibanez@gnoss.com

Susana López-Sola, gnoss.com*, susanalopez@gnoss.com

Lorena Ruiz Elósegui, gnoss.com*, lorenaruiz@gnoss.com

* Piqueras 31, 4th floor, E-26006 Logroño, La Rioja, Spain
ABSTRACT

We present a case of exploitation of Linked Data about learning analytics research through innovative end-user applications built on GNOSS, a semantic and social software platform. It allows users to find and discover knowledge from two datasets, Learning Analytics and Knowledge (LAK) and Educational Data Mining (EDM), and also to reach related external information thanks to the correlation with other datasets. We used four additional datasets, either to supplement information or to generate enriched contexts: DBpedia, Geonames, DBLP-GNOSS (an index of scientific publications in Computer Science from DBLP) and DeustoTech Publications (publications of the Institute of Technology of the University of Deusto, and more specifically a selection of works by the DeustoTech Learning research unit). The featured applications are: faceted searches, enriched contexts, navigation through graphs and graphic visualization in charts or geographic maps. Faceted searches can be performed on three basic items: scientific publications, researchers (authors of the publications) and organizations in the learning analytics area. The search engine enables aggregated searches by different facets and summarization of results for each successive search. Analytics on data are provided firstly through the summarization given for results in every facet, and secondly through dynamic graphic representations for some attributes. Several charts are available to show the distribution of publications depending on different attributes (e.g. per publication type and year, or per organization). The search results for organizations and researchers can be visualized in geographic maps.

This work was presented to the LAK Data Challenge 2013.

Categories and Subject Descriptors

Information systems - Data management systems - Database design and models - Graph-based database models

Information systems - Information systems applications - Collaborative and social computing systems and tools - Social networking sites

Classification Scheme: The 2012 ACM Computing Classification System (CCS) http://dl.acm.org/ccs.cfm

General Terms

Algorithms, Documentation, Standardization, Languages.

Keywords

Learning Analytics, Educational Data Mining, Semantic Web, Linked Data, linked open data, faceted search, semantic contexts, recommendation systems, geographic visualisation, geolocated data, discovery information systems, knowledge management, semantic data platform, end-user semantic platform, GNOSS.

1. MOTIVATION: PURPOSES OF THE SOLUTION

The main purpose of the service developed on the gnoss.com¹ software platform is to provide end-users with innovative applications that allow them to find and discover knowledge related to learning analytics research from the LAK and EDM datasets [1].² Based on the exploitation of Linked Data [2, 3], the system includes faceted searches, recommendation systems and adapted contexts. More specifically, the software solution serves the following purposes:

1. Explore and navigate the datasets (LAK and EDM) through faceted searches and graph browsing. This lets users find publications, researchers and organizations in the area, and learn, for instance, which topics are being investigated, who is working in which fields, where those people and their organizations are located, who has published about LAK in an organization, or who is collaborating with whom.

2. Access a geographic visualisation of researchers and organizations working in learning analytics, with the option of filtering results by different and aggregated facets.

3. Visualise dynamic charts of some analytic information related to the evolution and distribution of publications. The charts are dynamic in that the results shown in the chart change with each successive filter selected in the search facets.

4. Discover related information within the dataset once the user has accessed a specific item (internal context), such as related LAK publications, co-authors, related and nearest organizations, etc.

5. Discover external related information through the correlation with other datasets. Some examples of datasets have been chosen for the demo to show the potential of the GNOSS platform tools.

6. Facilitate the potential relevant re-use of these datasets as contexts in other scenarios by linking them to existing social learning communities on gnoss.com, or to any community related to the study of those topics.

¹ GNOSS: http://www.gnoss.com/en/about-gnoss

² LAK and EDM datasets are available online at: http://www.solaresearch.org/resources/lak-dataset

2. DESCRIPTION OF THE SOLUTION AND DATASETS

The solution exploiting the LAK datasets has been developed on gnoss.com, a social and semantic platform with a deep focus on the generation of social knowledge ecosystems and end-user applications in a Linked Data environment. It includes faceted searches, recommendation systems and adapted contexts for education, universities and enterprises. GNOSS can be thought of as a network of networks, or a space of linked networks, oriented to using semantic technologies for data and service integration. It expresses the data generated by users with default basic semantic standard vocabularies. This is done automatically when a user shares content on the platform. In addition, GNOSS has an engine for developing specific ontologies and, as a consequence, specific search engines if necessary. Moreover, it has a wide range of configurable social tools, most of which have been deactivated for this demo, except for comments and the option to share the link via email or other social networks.

2.1 The basis: LAK and EDM datasets

The baseline information to develop the solution was obtained from two datasets related to learning analytics: 1) Learning Analytics and Knowledge (LAK) 2011-2012 and 2) Educational Data Mining (EDM) 2008-2012. Both of them have information about people (researchers), the organizations in which they work, and publications (proceedings, inproceedings and articles).

The information of the original datasets was enriched with data coming from DBpedia³ and Geonames,⁴ and also with automatically generated tags. Moreover, some duplicated data (researchers and organizations) that appeared when unifying the two datasets were eliminated.

This information was uploaded to an online space inside the gnoss.com platform to consume and exploit the data and present the end-user applications.

We prepared a general navigation through tabs that includes a homepage with content selection and three other tabs corresponding to the three entities from the datasets: publications, researchers and organizations.

These three entities were represented on the platform with their specific ontologies thanks to the semantic CMS of GNOSS, following the standard vocabularies of the original data: FOAF (Friend-of-a-Friend),⁵ SWRC (Semantic Web for Research Communities),⁶ DC (Dublin Core),⁷ etc. In addition, other vocabularies were included for representing the extended information and/or correlating datasets, for instance SIOC (Semantically Interlinked Online Communities),⁸ SKOS (Simple Knowledge Organization System),⁹ DBPROP¹⁰ or GN (Geonames).¹¹

2.2 Other datasets

Besides the direct consumption of the information provided in the datasets on learning analytics research, we used other additional datasets, either to supplement the information of the former ones (as explained above) or to automatically generate dynamic contexts with external related information. The following additional datasets were employed:

1. DBpedia, for supplementing the data of organizations and obtaining geographic information that enables the connection with Geonames.

The automation of this process gave rise to incomplete information for some items, so that we could not obtain the necessary information to represent all the researchers and organizations in the geographic map. As a consequence, the presentation of results differs between the ‘mosaic view’ (which includes all the results) and the ‘map view’ (which only represents the geolocated data). These data could be improved in the future.

This is a common problem in the Web of Data. Datasets usually need to be refined for one or more of the following reasons: incomplete data, insufficient (missing) data or inconsistent data (data that are not well described or depicted, or are named wrongly). This complicates the provision of an adequate service and, in particular, makes it difficult to upload datasets and set relations between data.

2. Geonames, to retrieve geolocation data and use them for the geographic visualization of data.

3. Two existing GNOSS datasets of scientific publications that we found interesting as contexts in the field of learning analytics: DBLP-GNOSS¹² and DeustoTech publications.¹³

‘DBLP-GNOSS’ is an index with over two million scientific publications in IT, developed by GNOSS in collaboration with the University of Deusto. The data of DBLP-GNOSS have been obtained from the ‘DBLP’ dataset in the LOD cloud, promoted by the University of Trier, and have been enriched with abstracts and keywords.

‘DeustoTech publications’ is the dataset of scientific publications of the Technology Institute of the University of Deusto, DeustoTech. As a demo of a relevant external context, we included a selection of the publications produced by the research team DeustoTech Learning.

³ DBpedia: http://dbpedia.org/

⁴ Geonames: http://www.geonames.org/

⁵ FOAF Vocabulary specification: http://xmlns.com/foaf/spec/

⁶ SWRC ontology: http://swrc.ontoware.org/ontology#

⁷ Dublin Core terms: http://purl.org/dc/terms/

⁸ SIOC Core Ontology Specification: http://rdfs.org/sioc/spec/

⁹ SKOS namespace: http://www.w3.org/2004/02/skos/core#

¹⁰ DBpedia ontology: http://dbpedia.org/Ontology

¹¹ Geonames ontology: http://www.geonames.org/ontology

¹² GNOSS Research Groups: http://researchgroups.gnoss.com

¹³ DeustoTech publications: http://deusto.gnoss.com/comunidad/DeustoTech/Publications
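The data preparation described in Sections 2.1 and 2.2 (unifying the two datasets, eliminating duplicates, and ending up with some records that lack geolocation) can be illustrated with a minimal sketch. It is not the GNOSS implementation: the `Organization` record, the name-normalization rule, the function names and the sample records are all assumptions made for illustration, and the paper does not say how duplicates were actually detected.

```python
import unicodedata
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Organization:
    """Hypothetical record; the field names are not taken from the datasets."""
    name: str
    source: str                    # "LAK" or "EDM"
    lat: Optional[float] = None    # None when enrichment found no coordinates
    lon: Optional[float] = None


def normalize(name: str) -> str:
    """Case-fold, strip accents and collapse whitespace so near-identical names match."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.casefold().split())


def merge_without_duplicates(*datasets):
    """Keep one record per normalized name, preferring a geolocated record."""
    merged = {}
    for dataset in datasets:
        for org in dataset:
            key = normalize(org.name)
            current = merged.get(key)
            if current is None or (current.lat is None and org.lat is not None):
                merged[key] = org
    return list(merged.values())


def map_view(orgs):
    """Only records with coordinates can be drawn on the map."""
    return [org for org in orgs if org.lat is not None and org.lon is not None]


# Invented sample records (names and coordinates are placeholders).
lak = [
    Organization("Example University", "LAK", lat=42.0, lon=-71.0),
    Organization("Instituto de Investigación", "LAK"),
]
edm = [
    Organization("example  university", "EDM"),
    Organization("Instituto de Investigacion", "EDM"),
]

mosaic = merge_without_duplicates(lak, edm)   # every unified record
on_map = map_view(mosaic)                     # only the geolocated subset
```

In this sketch the 'mosaic view' corresponds to the full merged list and the 'map view' to the subset with coordinates, which is why the two views can show different numbers of results.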
2.3 Faceted searches

The web of structured data makes it possible to develop strategies for intelligent information retrieval based on faceted searches [4, 5, 6, 7]. GNOSS has a powerful faceted search engine that is generated from the GNOSS semantic graphs (RDF triples); the search engine exploits those graphs through reasoned or inference-based searches.

The main advantages of facet-based searches are:

• They are based on meaning and concepts, and on the relations between them.

• Users obtain reduced lists of results based on semantic properties or attributes of the data.

• They allow reasoning: each new search restricts the subset of data from the previous search across multiple facets. You can progressively filter results until you reach a manageable data set.

Searches on the LAK Data Challenge space in GNOSS can be started in two ways:

• As a meta-search, seeking in any kind of content.

• By selecting the item type on which to perform the search, either choosing it from the facet ‘item type’ on the home page, or navigating through the corresponding tab for each item. In this case, there are three basic item types: publications, researchers and organizations.

Once an item type is selected, the search engine provides specific facets for it, which are configurable according to the available data. The relevant facets that have been set for each case of the LAK Data are:

• For publications: categories, tags, author, year, conference and publication type.

• For researchers: categories, tags, affiliation (organization) and country.

• For organizations: categories, tags, country, region, city and number of students.

The GNOSS faceted search engine allows concatenated searches, and all relationships among the facets are recalculated with each successive filter for the corresponding set of results.

2.3.1 Summarization of results: direct quantitative exploitation of data

GNOSS offers summarization of the number of results for each property represented in the facets. The values are recalculated for every set of results in aggregated searches. This gives direct analytic information that is represented in the form of facets for searches (see example in Figure 1).

Thus, the search results convey a lot of information through the facets, namely how the results relate to the other search attributes. For example, if you look for publications with the tag ‘intelligent tutoring system’, you obtain 12 results, you see who worked on this topic and who published the most papers, and you learn that the author Zachary A. Pardos, for example, wrote 3 publications in the field. If you select this author, all the facets are recalculated and you can see how they relate to the publications about intelligent tutoring systems by Pardos: for example, that they were published in 2008, 2011 and 2012, that one of them is related to Bayesian knowledge, and that he collaborated with four other authors.

Figure 1 Example of facets showing summarization of results

2.4 Navigation through graphs and relationships between entities and properties

The possibilities of navigation through graphs that connect entities and properties (among them and with each other) are immense and n-dimensional. Some examples of relationships between items and possible navigation paths in the present case are:

• Authors and papers: authors who wrote articles on a specific topic, authors of a publication you are interested in, papers written by a selected author.

• Researchers and organizations in which they work: related organizations working on similar topics.

• Authors and co-authors: if you find a researcher, you could be interested in the people working with them, and then discover other research areas those co-authors are working on, and see their location on a map.

• Related topics and their relation with researchers (authors): you look for a keyword and you see other related terms and the researchers publishing on that subject. You can navigate through the authors and discover new publications, co-authors, etc.

• Location, people and research topics: you look for researchers by geographic criteria, e.g. United Kingdom, and you get the topics they are working on (tags).
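The concatenated filtering and per-facet summarization described in Sections 2.3 and 2.3.1 can be sketched as follows. This is only an illustration under stated assumptions: GNOSS works on RDF graphs, whereas this sketch uses plain in-memory dictionaries, and the record layout, the function names and the sample publications are invented rather than taken from the LAK or EDM data.

```python
from collections import Counter

# Invented sample records; titles, authors and tags are placeholders.
PUBLICATIONS = [
    {"title": "Paper A", "year": 2008, "authors": ["Author X", "Author Y"],
     "tags": ["intelligent tutoring system"]},
    {"title": "Paper B", "year": 2011, "authors": ["Author X"],
     "tags": ["intelligent tutoring system", "bayesian knowledge tracing"]},
    {"title": "Paper C", "year": 2012, "authors": ["Author Z"],
     "tags": ["intelligent tutoring system"]},
    {"title": "Paper D", "year": 2012, "authors": ["Author Y"],
     "tags": ["data mining"]},
]

FACETS = ("year", "authors", "tags")


def as_values(value):
    """Treat single-valued and multi-valued properties uniformly."""
    return value if isinstance(value, list) else [value]


def apply_filters(items, filters):
    """Keep the items that match every (facet, value) pair selected so far."""
    return [
        item for item in items
        if all(value in as_values(item[facet]) for facet, value in filters)
    ]


def summarize(items, facets=FACETS):
    """Count, for each facet, how many results carry each value."""
    return {
        facet: Counter(v for item in items for v in as_values(item[facet]))
        for facet in facets
    }


# First search: one tag selected.
filters = [("tags", "intelligent tutoring system")]
by_tag = apply_filters(PUBLICATIONS, filters)
tag_summary = summarize(by_tag)

# Successive search: add an author on top of the tag; all counts are recomputed.
filters.append(("authors", "Author X"))
by_tag_and_author = apply_filters(PUBLICATIONS, filters)
author_summary = summarize(by_tag_and_author)
```

Each summary plays the role of the numbers shown next to the facet values: after the second filter, `author_summary["year"]` tells in which years the remaining publications appeared, in the same way as the Pardos example above.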
2.5 Enriched contexts of information and recommendations

The Web of Data also makes it possible to connect information in meaningful ways, which can be exploited in GNOSS for the generation of dynamic contexts that can be customized for each case.

In the present work on LAK and EDM data, we set several demonstration contexts depending on the object or entity that the user is viewing:

1. Contexts for the entity ‘publication’: related LAK and EDM publications (internal), related DBLP publications (external), DeustoTech Learning publications (external).

2. Contexts for the entity ‘researcher’: co-authors (internal), related organizations by topics (internal) and related DBLP publications (external).

3. Contexts for the entity ‘organization’: related organizations (internal), related researchers (internal), nearest organizations (internal), geolocation (external) and its visualization on a map.

4. Contexts of general purpose: Freebase definitions of the tags of the contents (when the concepts have an article in Freebase). For example, if you select a publication about data mining and hover over the tag ‘data mining’, a window appears with its definition on Freebase and links to the Freebase and Wikipedia articles.

2.6 Geographical visualisation of data

The present work includes the development of an application to represent a set of geolocated results on a geographical map. In the case of the LAK and EDM datasets, this visualisation is enabled for researchers and organizations (see example in Figure 2), combined with the option of filtering results by different and aggregated facets.

Figure 2 Geographic visualisation of search results (organizations)

2.7 Visualisation of analytics with dynamic charts

The analytics provided by summarization in search facets was supplemented with some graphic visualisations. Google chart tools¹⁴ were integrated in the platform to represent some analytics related to the evolution and distribution of publications. Four types of charts were used: column chart, intensity map, pie chart and bar chart. The user can choose among several charts and continue filtering through facets successively, thus seeing how the results evolve in the chart with the selected filters. Six charts were included to analyse LAK data:

• Evolution of number of publications per year and publication type (column chart, Figure 3). It shows how the number of publications in this area has increased in recent years, and how they are distributed among inproceedings (the main part), articles and proceedings. These results can be restricted to selected criteria by filtering through facets.

Figure 3 Evolution of number of publications per year and publication type

• Evolution of number of inproceedings per conference (column chart).

• Distribution of number of publications per country (intensity map, Figure 4). It gives a quick idea of the countries with the most scientific production in the field, according to the search criteria (total number, one or more specific topics, a selected year, etc.).

Figure 4 Distribution of number of publications per country (intensity map)

• Distribution of number of researchers with publications per year (pie chart, Figure 5).

Figure 5 Distribution of number of researchers with publications per year (pie chart)

• Distribution of number of publications per organization (bar chart, Figure 6). The first view shows the total number of publications for each organization over the years, and shows clearly which ones have produced the largest amount, with the Worcester Polytechnic Institute leading the list. By filtering through facets, such as year, tags or publication type, the user can observe how the chart changes depending on those filter options.

Figure 6 Distribution of number of publications per organization

• Distribution of number of publications per author (bar chart). It is similar to the previous one, but representing authors instead of organizations.

This work shows some examples of charts representing analytics on the LAK data, and it is extensible to additional similar exploitations.

¹⁴ Google chart tools. Information for developers available at https://developers.google.com/chart/.

3. LINK TO THE PLATFORM

The demo GNOSS solution for Learning Analytics Research is available at the following link: http://datasetexplorer.gnoss.com/en/community/LAKChallenge

4. ACKNOWLEDGMENTS

Government of Spain, Ministry of Economy and Competitiveness, Centre for Industrial Technological Development (CDTI), for the funding of a project financed by CDTI (Project IDI-20110600).

University of Deusto, for the collaborative work in the generation of external contexts.

5. REFERENCES

[1] Taibi, D.; Dietze, S. 2013. Fostering analytics on learning analytics research: the LAK dataset. Technical Report, 03/2013. URL: http://resources.linkededucation.org/2013/03/lak-dataset-taibi.pdf.

[2] Bizer, C.; Heath, T.; Berners-Lee, T. 2009. Linked Data - The Story So Far. International Journal on Semantic Web and Information Systems, 5(3), 1-22. DOI: 10.4018/jswis.2009081901.

[3] Bauer, F.; Kaltenböck, M. 2012. Linked Open Data: The Essentials. A Quick Start Guide for Decision Makers. Edition Mono/monochrom, Vienna, Austria. ISBN: 978-3-902796-05-09.

[4] Suominen, O.; Viljanen, K.; Hyvönen, E. 2007. User-centric faceted search for semantic portals. In The Semantic Web: Research and Applications. Proceedings of the 4th European Semantic Web Conference ESWC2007, forthcoming (Innsbruck, Austria, June 3-7, 2007), 356-370. DOI: 10.1007/978-3-540-72667-8_26.

[5] Stefaner, M.; Ferré, S.; Perugini, S.; Koren, J.; Zhang, Y. 2009. User Interface Design. In Dynamic Taxonomies and Faceted Search. Sacco, G. M.; Tzitzikas, Y. (Eds.)

[6] Ferré, S.; Hermann, A.; Ducassé, M. 2011. Semantic Faceted Search: Safe and Expressive Navigation in RDF Graphs. Research report. ISSN: 2102-6327.

[7] Dal Mas, M. 2012. Faceted Semantic Search for Personalized Social Search. In Computing Research Repository, 2012, abs-1202-6685. URL: http://arxiv.org/abs/1202.6685.