=Paper= {{Paper |id=Vol-1224/paper8 |storemode=property |title=LODmilla: a Linked Data Browser for All |pdfUrl=https://ceur-ws.org/Vol-1224/paper8.pdf |volume=Vol-1224 |dblpUrl=https://dblp.org/rec/conf/i-semantics/MicsikTG14 }} ==LODmilla: a Linked Data Browser for All== https://ceur-ws.org/Vol-1224/paper8.pdf
            LODmilla: a Linked Data Browser for All

                     András Micsik, Sándor Turbucz, Attila Györök

                Department of Distributed Systems, MTA SZTAKI
                    Lágymányosi u. 11., Budapest, Hungary
    {andras.micsik,sandor.turbucz,attila.gyorok}@dsd.sztaki.hu



       Abstract. Although the Linked Data paradigm is extremely popular, and there
       is an immense amount of Linked Open Data available worldwide, the human
       exploration of these datasets is limited. In our work we try to evolve a generic
       platform called LODmilla for exploring and editing Linked Open Data. Our aim
       is to enable the extraction and sharing of data associations (or information)
       hidden in Linked Open Data. LODmilla is an open web application supporting
       graph views, graph searching and many other commodity features for surfing
       over Linked Data.

       Keywords: Linked Data, LOD, Semantic Web, graph visualization


1      Introduction

   In 2006 Tim Berners-Lee outlined a set of best practices for publishing and
connecting structured data on the Web: the Linked Data (LD) principles [1]. These
principles endorse connecting RDF (Resource Description Framework) datasets with
each other to form a global data network. The combination of the LD and Open Data
concepts has become very popular in recent years under the name Linked Open Data
(LOD) [2].
   Although the LOD cloud diagram [3] recorded the immense growth of available
LOD datasets, the human exploitation of this data bonanza is still very ad hoc. In our
work we try to evolve a generic platform for exploring and editing Linked Open Data.
Our aim is to enable the extraction and sharing of data associations (or information)
hidden in Linked Open Data.
   Linked Data is built using triples, where each triple defines a statement in the form
of subject-predicate-object. The graph representation of such data is quite straightfor-
ward and widely used. The subject and the object are nodes in the graph, and edges
between them are labelled with the predicate name. This way we get a directed, la-
belled graph as a view of the Linked Data. Another natural way to present LOD con-
tent is using a tabular format.
   Like in a spreadsheet, the three parts of a triple can be sorted or grouped in
separate columns. Graphity [4] and Tabulator [5] are examples of tabular browsing
with nested tables, and one could list several prototypes of graph-based LOD
browsers (LodLive [6], oobian [7], RelFinder [8], etc.). Here we present LODmilla
[9], a continuously improving service for generic Linked Data browsing that tries
to combine the best features of both tabular and graph-based browsers.
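The two presentations described above can be illustrated with a minimal sketch. The sample triples, the `ex:` identifiers and the function names are hypothetical and are not taken from LODmilla; the code only shows how the same triples yield a directed, labelled graph and a sortable table.

```python
from collections import defaultdict

# Hypothetical sample triples (subject, predicate, object), for illustration only.
TRIPLES = [
    ("ex:paper8", "dc:title", "LODmilla"),
    ("ex:paper8", "dc:creator", "ex:micsik"),
    ("ex:micsik", "foaf:name", "András Micsik"),
]

def graph_view(triples):
    """Directed, labelled graph: each subject maps to its (predicate, object) edges."""
    edges = defaultdict(list)
    for subject, predicate, obj in triples:
        edges[subject].append((predicate, obj))
    return dict(edges)

def table_view(triples, sort_by=0):
    """Tabular view: rows sorted by one column (0 = subject, 1 = predicate, 2 = object)."""
    return sorted(triples, key=lambda row: row[sort_by])

graph = graph_view(TRIPLES)
table = table_view(TRIPLES, sort_by=1)
```

Here `graph["ex:paper8"]` holds the two outgoing edges of the paper node, while `table` lists the same three statements ordered by predicate name.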
32      Micsik et al.


2      The LODmilla browser

   The aim of LODmilla is to facilitate the human inspection of information
accessible as Linked Data. LODmilla users may find associations between objects, and
record various “mind map” views of the underlying data. For this, we provide both
graph-based and table-based browsing, and exploration functions specific to RDF.




                         Fig. 1. A snapshot of LODmilla in action

   LODmilla (Fig. 1) runs in conventional web browsers as a web app. While it
is primarily visual, it also contains textual representations of resource properties in
order to combine the best of both worlds. Its goal is to provide a single application for
the interactive exploration of LOD content residing in multiple knowledge bases.
   The browser provides the following function groups:
─ Opening URIs as nodes, expanding and browsing by RDF properties,
─ Zooming and panning in the graph view,
─ Reorganization of the graph view,
─ Various search operations in the graph,
─ Saving and loading graph views as well as sharing them with other users,
─ Editing Linked Data,
─ Undo of previous actions.

   The specific search operations allow users to find string occurrences hidden in tri-
ples both in the current view and in the neighborhood of selected nodes. This way one
                                    Posters & Demos Track @ SEMANTiCS2014              33


can expand the graph view in the desired direction, for example by opening all nodes
representing creators, or by searching for the word ‘semantic’ near one author. Fur-
thermore, a path search is also offered, revealing connections between selected enti-
ties (nodes).
   In order to facilitate caching and fast triple loading, the search operations use a
dedicated backend, which is also responsible for saving and sharing graph views.
LODmilla can switch between two methods for fetching triple data; the first one is
based on SPARQL querying, the second one uses actionable URIs. By using the Jena
toolkit at the backend, we can parse incoming RDF as Turtle, RDF/XML, JSON, etc.
Therefore, a large variety of datasets can be used at the same time, even without con-
figuring the dataset details in the frontend (in this case actionable URIs are used to
load graph details). Future plans include the use of VoID [10] for automatic configu-
ration of dataset-specific features in the frontend.
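The two fetching methods can be sketched as follows. This is not LODmilla's backend code (which uses Jena); it is a hypothetical Python illustration that only builds request descriptions and sends nothing. It assumes that "actionable URIs" means URIs that can be dereferenced over HTTP with content negotiation, and the function names, the query shape and the `Accept` preferences are our own choices.

```python
from urllib.parse import urlencode

# RDF media types requested when dereferencing a URI (assumed preference order).
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9, application/ld+json;q=0.8"

def outgoing_triples_query(uri):
    """SPARQL query for all triples in which `uri` is the subject."""
    if any(ch in uri for ch in '<>" \n\t'):
        raise ValueError(f"not a safe IRI: {uri!r}")
    return f"SELECT ?p ?o WHERE {{ <{uri}> ?p ?o }}"

def fetch_plan(uri, sparql_endpoint=None):
    """Describe the HTTP GET request for a node; nothing is sent here."""
    if sparql_endpoint is not None:
        # Method 1: ask a configured SPARQL endpoint.
        query = urlencode({"query": outgoing_triples_query(uri)})
        return {
            "url": f"{sparql_endpoint}?{query}",
            "headers": {"Accept": "application/sparql-results+json"},
        }
    # Method 2: no dataset configuration known, dereference the URI itself.
    return {"url": uri, "headers": {"Accept": RDF_ACCEPT}}
```

Calling `fetch_plan(uri)` without an endpoint corresponds to the unconfigured case in the text, where the node is loaded directly from its URI.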
   As we cannot rely on SPARQL querying for path extraction, we had to apply graph
traversal methods, which have the advantage that they may work across datasets as
well. Path finding currently works between two nodes: it starts from both nodes, using
simple heuristics to select the next path segment to explore. We plan to improve this
algorithm both by improving the heuristics and by parallelizing it on several virtual
machines. The connection and content search operations use breadth-first traversal of
triples (naturally excluding too common connections such as rdf:type and nodes hav-
ing too many connections). The content search is typically useful for finding text
occurrences hidden in the multitude of properties, while the connection search helps
users to see selected aspects (connection types) of the graph. In both cases, there is an
important problem due to the unidirectionality of node connections: we can search
only by outgoing connections (where the current node entity is the subject in the tri-
ple) and not by incoming connections, therefore some of the information sought may
remain undiscovered.
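A minimal sketch of such a breadth-first content search is given below. The sample data, the degree cap, the depth limit and all names are hypothetical; the real backend may differ in every detail. The sketch follows outgoing edges only, so it also reproduces the limitation noted above.

```python
from collections import deque

EXCLUDED_PREDICATES = {"rdf:type"}  # too common to be informative
MAX_DEGREE = 3                      # hypothetical cap for skipping hub nodes

# Hypothetical outgoing edges: subject -> list of (predicate, object).
EDGES = {
    "ex:author": [("rdf:type", "foaf:Person"),
                  ("foaf:made", "ex:paper1"),
                  ("foaf:made", "ex:paper2")],
    "ex:paper1": [("dc:title", "Semantic Web browsing")],
    "ex:paper2": [("dc:title", "Graph layout")],
}

def content_search(edges, start, needle, max_depth=2):
    """Return triples within max_depth hops of start whose object contains needle."""
    hits = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        outgoing = edges.get(node, [])
        if node != start and len(outgoing) > MAX_DEGREE:
            continue  # skip nodes having too many connections
        for predicate, obj in outgoing:
            if predicate in EXCLUDED_PREDICATES:
                continue
            if needle.lower() in obj.lower():
                hits.append((node, predicate, obj))
            if depth + 1 < max_depth and obj not in seen:
                seen.add(obj)
                queue.append((obj, depth + 1))
    return hits

near_author = content_search(EDGES, "ex:author", "semantic")
near_paper = content_search(EDGES, "ex:paper1", "layout")
```

Searching from `ex:author` finds the title of `ex:paper1`, whereas searching from `ex:paper1` for "layout" finds nothing: `ex:paper2` is reachable only through an incoming edge of `ex:paper1`, which this traversal cannot follow.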
   During the processing of triples, we use some assumptions to improve the visual
presentation. For example, to show small icons for nodes, we use the foaf:depiction
and dbpedia:thumbnail property values; if these are not present, the rdf:type
property values are mapped to a set of predefined icons (e.g. person, paper,
organizational unit, etc.).
Similarly, the texts shown in nodes are taken from rdfs:label, dc:title, foaf:name or
skos:prefLabel properties. Properties are also scanned for images, geolocations and
external URLs. Images are shown inline in the info panel, and locations are shown on
a map.
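The fallback rules for labels and icons can be sketched as below. The property names come from the text; the priority order among them, the type-to-icon mapping, the icon file names and the fallback to the node URI are assumptions made for this illustration.

```python
LABEL_PROPERTIES = ["rdfs:label", "dc:title", "foaf:name", "skos:prefLabel"]
IMAGE_PROPERTIES = ["foaf:depiction", "dbpedia:thumbnail"]

# Hypothetical mapping from rdf:type values to predefined icons.
TYPE_ICONS = {
    "foaf:Person": "person.png",
    "bibo:Article": "paper.png",
    "foaf:Organization": "organization.png",
}
DEFAULT_ICON = "generic.png"

def first_value(properties, names):
    """First value of the first property in `names` that has any value."""
    for name in names:
        values = properties.get(name)
        if values:
            return values[0]
    return None

def node_label(uri, properties):
    """Text shown in the node; falls back to the URI (an assumption)."""
    return first_value(properties, LABEL_PROPERTIES) or uri

def node_icon(properties):
    """Image property if present, otherwise an icon chosen by rdf:type."""
    image = first_value(properties, IMAGE_PROPERTIES)
    if image:
        return image
    for node_type in properties.get("rdf:type", []):
        if node_type in TYPE_ICONS:
            return TYPE_ICONS[node_type]
    return DEFAULT_ICON

person = {"foaf:name": ["András Micsik"], "rdf:type": ["foaf:Person"]}
label = node_label("ex:micsik", person)
icon = node_icon(person)
```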
   Recent improvements of the browser include the editing and reorganization func-
tions. Changing a graph is more natural by drawing than by modifying the triples,
therefore we added the possibility to insert new nodes, draw new edges and also to
remove edges and nodes in the graph. Such light-weight editing can be used to quick-
ly fix errors or complete missing parts in the graph. These changes are translated into
SPARQL Update statements for further use by the author. In this sense one can think
of LODmilla as a Linked Data editor for non-professionals.
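How drawn edits might map to SPARQL Update statements is sketched below. The edit representation `(action, subject, predicate, object)` and the function names are hypothetical; only the `INSERT DATA` and `DELETE DATA` forms are taken from the SPARQL 1.1 Update language.

```python
def term(value):
    """Serialize an http(s) IRI as <...>, anything else as a plain string literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"'

def to_sparql_update(edits):
    """Translate graph edits into one SPARQL Update request string."""
    keywords = {"add": "INSERT DATA", "remove": "DELETE DATA"}
    statements = []
    for action, subject, predicate, obj in edits:
        triple = f"{term(subject)} {term(predicate)} {term(obj)} ."
        statements.append(f"{keywords[action]} {{ {triple} }}")
    return " ;\n".join(statements)

# Hypothetical edits: one edge drawn, one edge removed.
EDITS = [
    ("add", "http://example.org/a", "http://xmlns.com/foaf/0.1/knows", "http://example.org/b"),
    ("remove", "http://example.org/a", "http://xmlns.com/foaf/0.1/name", "Old name"),
]
update = to_sparql_update(EDITS)
```

Each drawn or deleted edge becomes one operation, and the operations are joined with `;` so that they can be submitted as a single update request.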
   When the graph view gets cluttered the user can ask for rearrangement of nodes us-
ing several methods. We experiment with the adaptation of Spring, Grid and Radial
layout algorithms to LOD graphs (which typically contain many cycles) and their
parametrizations to provide useful presets for various usage scenarios [11]. For
example, it is possible to lay out strongly connected nodes closer to each other, or to group
nodes by their types. In general, the insertion of new nodes is done in the least disrup-
tive way, without moving existing nodes significantly, yet positioning new nodes
closely in the free areas of the canvas. The layout algorithms also include genetic
modifications of the layout between iterations based on graph details such as node
distances or node types. In the future we would like to develop metrics for the ‘good-
ness’ of the layout and to guess the number of iterations necessary for a suitable lay-
out.
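As an illustration of grouping nodes by their types, the following sketch places each type in its own grid row. It is a deliberately simple stand-in and not one of the layout algorithms used in LODmilla; the spacing value, the alphabetical ordering and all names are hypothetical.

```python
from collections import defaultdict

def grid_layout_by_type(node_types, spacing=100):
    """Map each node to (x, y) so that nodes of the same type share a row."""
    rows = defaultdict(list)
    for node, node_type in node_types.items():
        rows[node_type].append(node)
    positions = {}
    # Sort types and nodes so that the layout is deterministic.
    for row_index, node_type in enumerate(sorted(rows)):
        for col_index, node in enumerate(sorted(rows[node_type])):
            positions[node] = (col_index * spacing, row_index * spacing)
    return positions

# Hypothetical nodes with their types.
NODES = {"ex:a": "Person", "ex:b": "Paper", "ex:c": "Person"}
positions = grid_layout_by_type(NODES)
```

With these inputs the single paper occupies the first row and the two persons sit side by side in the second row.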
   We think that LODmilla unifies most features found in previously implemented
LOD browsers and it also exhibits novel principles such as serving multiple LOD
datasets at a time and presenting connections between nodes in separate datasets.
Beyond the new graph search operations, our development of the browser continues
to include and improve useful features for Linked Data exploration. LODmilla can be
used as a public service¹ and its source code is available on GitHub².


References
 1. Berners-Lee, T.: Linked Data - Design Issues. 2006,
    http://www.w3.org/DesignIssues/LinkedData.html
 2. Bizer, C.: The Emerging Web of Linked Data. IEEE Intelligent Systems, Vol.24, no.5,
    pp.87-92, Sept.-Oct. 2009, doi: 10.1109/MIS.2009.102
 3. Cyganiak, R., Jentzsch, A.: The Linking Open Data cloud diagram. http://lod-cloud.net/
 4. Graphity, http://graphity.org
 5. Berners-Lee, T., Chen, Y., Chilton, L., Connolly, D., Dhanaraj, R., Hollenbach, J., Lerer,
    A., Sheets, D.: Tabulator: Exploring and analyzing linked data on the semantic web. In
    Proceedings of the 3rd International Semantic Web User Interaction Workshop (SWUI06)
    (2006).
 6. LodLive, http://en.lodlive.it/
 7. ::oobian::, http://oobian.com/
 8. Lohmann, S., Heim, P., Stegemann, T., Ziegler, J.: The RelFinder user interface:
    interactive exploration of relationships between objects of interest. In Proceedings
    of the 15th International Conference on Intelligent User Interfaces (IUI '10). ACM,
    New York, NY, USA, pp. 421-422. DOI: 10.1145/1719970.1720052
 9. Micsik, A., Tóth, Z., Turbucz, S.: LODmilla: Shared Visualization of Linked Open Data.
    In: L. Bolikowski, V. Casarosa, P. Goodale, N. Houssos, P. Manghi, J. Schirrwagen (eds.)
    Theory and Practice of Digital Libraries - TPDL 2013 Selected Workshops, Springer 2014
    CCIS, DOI: 10.1007/978-3-319-08425-1_9
10. Alexander, K., Cyganiak, R., Hausenblas, M., and Zhao, J.: Describing Linked Datasets -
    On the Design and Usage of VoID, the 'Vocabulary of Interlinked Datasets'. In WWW
    2009 Workshop: Linked Data on the Web (LDOW2009) (Madrid, Spain, 2009).
11. Golbeck, J., Mutton, P.: Spring-Embedded Graphs for Semantic Visualization. In: V.
    Geroimenko, Ch. Chen (eds.) Visualizing the Semantic Web, Springer 2006, pp 172-182,
    DOI: 10.1007/1-84628-290-X_10

¹ http://munkapad.sztaki.hu/lodmilla/
² https://github.com/dsd-sztaki-hu/LODmilla-frontend