<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Semantics for Interactive Visual Analysis of Linked Open Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gerwald Tschinkel</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eduardo Veas</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Belgin Mutlu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vedran Sabol</string-name>
          <email>vsabol@know-center.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graz University of Technology</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Know-Center</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Providing easy-to-use methods for visual analysis of Linked Data is often hindered by the complexity of semantic technologies. On the other hand, the semantic information inherent to Linked Data provides opportunities to support the user in interactively analysing the data. This paper demonstrates an interactive, Web-based visualisation tool, the "Vis Wizard", which makes use of semantics to simplify the process of setting up visualisations, transforming the data and, most importantly, interactively analysing multiple datasets using brushing and linking methods.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
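      <list list-type="order">
        <list-item><p>Selecting and configuring the visualisations</p></list-item>
        <list-item><p>Aggregating datasets</p></list-item>
        <list-item><p>Brushing and linking over multiple datasets</p></list-item>
      </list>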
      <p>This paper illustrates the use of semantic technologies in a visual analytics
tool that enables novice users to perform complex operations and analyses on
Linked Data. The demonstration focuses mainly on step 3; a screencast of
the demonstration is also available<sup>6</sup>.</p>
      <sec id="sec-1-1">
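        <fn-group>
          <fn id="fn3"><label>3</label><p>http://code-research.eu</p></fn>
          <fn id="fn4"><label>4</label><p>http://www.w3.org/TR/vocab-data-cube</p></fn>
          <fn id="fn5"><label>5</label><p>http://code.know-center.tugraz.at/vis</p></fn>
          <fn id="fn6"><label>6</label><p>http://youtu.be/aBfuGhgVaxA</p></fn>
        </fn-group>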
        <p>
          Related work: A wide range of tools offer functionality for visualising and
interacting with data, but only a few rely on semantic information to support the
analytical process. Tableau [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] provides a powerful visualisation toolset; however,
it does not make use of semantic information to assist the user. The CubeViz
Framework [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] facilitates visual analytics on RDF Data Cubes, but does not use
semantics for the user interface. CubeViz supports neither brushing, nor direct
comparison of datasets, nor automatic selection of visualisations.
Cammarano et al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] introduce a method to automatically analyse data attributes
and map them to visual properties of the visualisation. Even so, this method does not
include an automatic selection of visualisation types.
        </p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>The Linked Data Vis Wizard</title>
      <p>The underlying idea is to enable users to analyse data visually
without knowing about the concepts of Linked Data or RDF Data Cubes.
At the same time, the Vis Wizard utilises the available semantic information to support users
in interacting with the data and performing analytical tasks.</p>
      <p>Scenario: Figure 1 compares two datasets taken from the EU Open Data
Endpoint<sup>7</sup> in the Vis Wizard. The first one, shown in parallel coordinates, represents
the 3G coverage in Europe, as a percentage, per country for each year. The
second dataset, shown in the geo-chart, contains active SIM cards per 100 people
(encoded by colour-grading) for countries in Europe. In the following we use the
Vis Wizard to gain insights into the data and ascertain whether the datasets correlate.</p>
      <sec id="sec-2-1">
        <fn-group>
          <fn id="fn7"><label>7</label><p>http://open-data.europa.eu</p></fn>
        </fn-group>
        <p>
          Of the supported charts, only those are made available which can actively and meaningfully be
used with the provided data. For example, the geo-chart is only available if the
data contains a geographic dimension. Once a chart has been selected, the user
can adjust the mapping of data onto the visual properties (e.g. axes, colours,
item sizes) of the chart, whereby only suitable mappings are offered. Chart
selection and data mapping are computed by an algorithm [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] that compares the
semantic information in the RDF Data Cube with the visual capabilities of the
chart, which are described using the Visual Analytics Vocabulary<sup>8</sup>.
        </p>
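        <p>The idea of offering only charts that fit the data can be illustrated with a minimal Python sketch. It is not the cited algorithm: the chart requirements, the component kinds ("geographic", "numeric", "categorical") and the function name available_charts are assumptions made for illustration, whereas the Vis Wizard obtains this information from the RDF Data Cube and the Visual Analytics Vocabulary.</p>
        <preformat>
```python
from collections import Counter

# Hypothetical chart descriptions: how many components of each kind a chart needs.
CHART_REQUIREMENTS = {
    "geo-chart": Counter({"geographic": 1, "numeric": 1}),
    "parallel coordinates": Counter({"numeric": 2}),
    "bar chart": Counter({"categorical": 1, "numeric": 1}),
}


def available_charts(component_kinds):
    """Return the charts whose requirements are covered by the dataset."""
    offered = Counter(component_kinds)
    return sorted(
        chart
        for chart, needed in CHART_REQUIREMENTS.items()
        # A missing kind counts as zero, so the chart is not offered.
        if all(offered[kind] >= count for kind, count in needed.items())
    )


# Hypothetical dataset: one geographic dimension and one numeric measure.
print(available_charts(["geographic", "numeric"]))
```
        </preformat>
        <p>In this sketch a dataset without a geographic component never receives the geo-chart, which mirrors the example given above.</p>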
        <p>Aggregation: We provide a dialogue for aggregating the data and creating a
new Data Cube. In the scenario shown in Fig. 1 the second dataset was averaged
over the years and visualised over the countries. Using semantics we differentiate
between dimensions and measures and enable validation of the user choices.</p>
        <p>To suggest charts and support aggregation, we utilise the RDF datatype,
occurrence and persistence.</p>
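        <p>The aggregation described above, averaging a measure over the years while keeping the country dimension, might look as follows in Python. This is a sketch under assumptions: the observations, the component names and the ROLES table are made up for illustration, and the validation only mimics the distinction between dimensions and measures.</p>
        <preformat>
```python
from collections import defaultdict
from statistics import mean

# Hypothetical component roles, as they could be read from a Data Cube structure.
ROLES = {"country": "dimension", "year": "dimension", "sim_cards_per_100": "measure"}

# Made-up observations, for illustration only.
observations = [
    {"country": "AT", "year": 2011, "sim_cards_per_100": 150.0},
    {"country": "AT", "year": 2012, "sim_cards_per_100": 160.0},
    {"country": "FR", "year": 2011, "sim_cards_per_100": 100.0},
    {"country": "FR", "year": 2012, "sim_cards_per_100": 110.0},
]


def average_over(rows, keep_dimension, measure):
    """Average `measure` per value of `keep_dimension`, dropping other dimensions."""
    # Validate the user choice: group by a dimension, aggregate a measure.
    if ROLES.get(keep_dimension) != "dimension" or ROLES.get(measure) != "measure":
        raise ValueError("group by a dimension and aggregate a measure")
    groups = defaultdict(list)
    for row in rows:
        groups[row[keep_dimension]].append(row[measure])
    return {key: mean(values) for key, values in groups.items()}


print(average_over(observations, "country", "sim_cards_per_100"))
```
        </preformat>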
        <p>
          Brushing and Linking: The idea behind brushing and linking is to combine
different visualisations to overcome the shortcomings of single techniques [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
Interactive changes made in one visualisation are automatically reflected in the
other ones. Our scenario contains two separate datasets: the first dataset has
the dimensions "country" and "year", the second dataset has only "country".
For conventional tools it is hard to provide interaction over different datasets,
because relationships between them are usually not explicitly available. When
columns are labelled using equal strings, guessing the relationships may be
possible, but when labels differ, e.g. a dimension in dataset A is called "Country"
while in dataset B it is called "State", the relation cannot be established. In
such cases the burden of understanding the structure of the datasets and linking
them together falls on the user. Within RDF Data Cubes, each dimension has a
URI which is (by definition) unique and can be used to establish the connection
between datasets, making linking and brushing over different datasets possible.
        </p>
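        <p>A minimal Python sketch of linking by URI rather than by label is given below. The dimension URIs are placeholders under example.org, not taken from a real endpoint, and linkable_dimensions is a hypothetical helper; the sketch only shows that differently labelled dimensions can be paired once they share a URI.</p>
        <preformat>
```python
# Hypothetical dimension descriptions: local label mapped to dimension URI.
dataset_a = {
    "Country": "http://example.org/dimension/country",
    "Year": "http://example.org/dimension/year",
}
dataset_b = {
    "State": "http://example.org/dimension/country",
}


def linkable_dimensions(dims_a, dims_b):
    """Pair up labels from both datasets that refer to the same dimension URI."""
    label_by_uri = {uri: label for label, uri in dims_b.items()}
    return [
        (label_a, label_by_uri[uri], uri)
        for label_a, uri in dims_a.items()
        if uri in label_by_uri
    ]


# "Country" and "State" are linked although their labels differ.
print(linkable_dimensions(dataset_a, dataset_b))
```
        </preformat>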
        <p>Applied to our scenario, the following interactive analysis is performed (see
Fig. 1): The user applies a brush on the first dataset by selecting a specific
value range in the "3G coverage" dimension using the parallel coordinates chart.
Countries outside of the selected range are greyed out in the geo-chart, which
shows the second dataset (SIM card penetration). Evidently, a high 3G coverage
correlates with high SIM card penetration (red), with one exception: France.</p>
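        <p>The brushing interaction in this scenario can be sketched in Python as follows, assuming the country dimension of both datasets has already been identified as the same. All values are made up for illustration, and brush and linked_view are hypothetical helper names, not part of the Vis Wizard.</p>
        <preformat>
```python
# Made-up first dataset: 3G coverage in percent per country and year.
coverage = [
    {"country": "AT", "year": 2012, "coverage": 95.0},
    {"country": "FR", "year": 2012, "coverage": 98.0},
    {"country": "RO", "year": 2012, "coverage": 60.0},
]
# Made-up second dataset: active SIM cards per 100 people per country.
sim_cards = {"AT": 155.0, "FR": 105.0, "RO": 110.0}


def brush(rows, measure, low, high):
    """Return the countries with a row whose measure lies in the closed range."""
    return {row["country"] for row in rows if high >= row[measure] >= low}


def linked_view(values_by_country, selected):
    """Mark each country of the second dataset as highlighted or greyed out."""
    return {
        country: "highlighted" if country in selected else "greyed out"
        for country in values_by_country
    }


# Brush the coverage range 90 to 100 and propagate it to the second view.
selected = brush(coverage, "coverage", 90.0, 100.0)
print(linked_view(sim_cards, selected))
```
        </preformat>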
        <p>It should be noted that the functionality of linking data over different datasets,
or even different endpoints, depends on the quality of the semantic information:
the URIs of the cube dimensions in different datasets need to be consistent. If
datasets use different, domain-specific URI namespaces, linking the data will not
be possible.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Evaluation</title>
      <p>
        We conducted a formative evaluation to explore whether our goals regarding the
usability of the Vis Wizard could be achieved and to ascertain that users were able
to analyse complex datasets. Eight test users participated and executed six
tasks, one of which was exclusively about linking and brushing. Test users had
a good knowledge of computers, but were not familiar with semantic data. We
conducted a quantitative subjective workload test, using the simplified NASA
R-TLX, and a qualitative thinking aloud test. More details on the evaluation,
including methodology, test users and results, are available in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <fn-group>
        <fn id="fn8"><label>8</label><p>http://code.know-center.tugraz.at/static/ontology/visual-analytics.owl</p></fn>
      </fn-group>
      <p>The functionality supporting the choice and configuration of the
visualisation was much appreciated, but users pointed out that immediately suggesting
the most suitable visualisation would have been even more helpful. For the task
on brushing in the scatterplot, users rated their own performance very highly
(the median was 91.25 on a scale from 0 to 100, 100 being the
highest value achievable). The conclusion of our evaluation is that, while several
usability issues still need to be fixed, the overall advantage is clearly observable.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Work</title>
      <p>In this research we have observed a high potential in using semantic
information for improving interaction in visual analytics. The evaluation indicates that the
user-supporting techniques were helpful in gaining insights from the data,
without spending much time on selecting and configuring visualisations or analysing
how to link the datasets manually.</p>
      <p>Since the correctness of the semantic annotations of the data is
essential for our purpose, the stability of our approach could be improved by supporting
URI aliases. We will also explore possibilities for ranking the visualisations
so that, given a particular dataset, the most suitable one is shown automatically.</p>
      <p>Acknowledgements. This work is funded by the EC FP7 projects CODE (grant 296150) and
EEXCESS (grant 600601). The Know-Center GmbH is funded by the Austrian Federal Government within
the Austrian COMET Program, managed by the Austrian Research Promotion Agency (FFG).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Cammarano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>X.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chan</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klingner</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Talbot</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halevy</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanrahan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Visualization of heterogeneous data</article-title>
          .
          <source>In: IEEE Information Visualization</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Keim</surname>
            ,
            <given-names>D.A.</given-names>
          </string-name>
          :
          <article-title>Information visualization and visual data mining</article-title>
          .
          <source>In: IEEE Transactions on Visualization and Computer Graphics</source>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Mutlu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoefler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tschinkel</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veas</surname>
            ,
            <given-names>E.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sabol</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stegmaier</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Granitzer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Suggesting visualisations for published data</article-title>
          .
          <source>In: Proceedings of IVAPP 2014</source>
          . pp.
          <fpage>267</fpage>
          -
          <lpage>275</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Sabol</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tschinkel</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Veas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoefler</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mutlu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Granitzer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Discovery and visual analysis of linked data for humans</article-title>
          .
          <source>In: Accepted for publication at the 13th International Semantic Web Conference</source>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Salas</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mota</surname>
            ,
            <given-names>F.M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Breitman</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casanova</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>Publishing statistical data on the web</article-title>
          .
          <source>In: Proceedings of the 6th International IEEE Conference on Semantic Computing</source>
          . IEEE (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Stolte</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanrahan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Polaris: A system for query, analysis and visualization of multi-dimensional relational databases</article-title>
          .
          <source>IEEE Transactions on Visualization and Computer Graphics</source>
          <volume>8</volume>
          ,
          <fpage>52</fpage>
          -
          <lpage>65</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>