=Paper= {{Paper |id=Vol-2578/BigVis9 |storemode=property |title=Providing Effective Visualizations over Big Linked Data |pdfUrl=https://ceur-ws.org/Vol-2578/BigVis9.pdf |volume=Vol-2578 |authors=Federico Desimoni,Laura Po |dblpUrl=https://dblp.org/rec/conf/edbt/DesimoniP20 }} ==Providing Effective Visualizations over Big Linked Data== https://ceur-ws.org/Vol-2578/BigVis9.pdf
         Providing Effective Visualizations over Big Linked Data
                                                                    Short Research Paper

                            Federico Desimoni                                                                        Laura Po
                   federico.desimoni@unimore.it                                                          laura.po@unimore.it
               “Enzo Ferrari" Engineering Department                                           “Enzo Ferrari" Engineering Department
               University of Modena and Reggio Emilia                                          University of Modena and Reggio Emilia
                             Modena, Italy                                                                   Modena, Italy

ABSTRACT                                                                               LD by offering different levels of abstraction. The first version
The number and the size of Linked Data sources are constantly                          of H-BOLD appeared in 2018 [13, 15]. In this paper, we want to
increasing. In some lucky cases, the data source is equipped with a                    describe the re-engineering process that has been carried out and
tool that guides and helps the user during the exploration of the                      the additional features developed.
data, but in most cases, the data are published as an RDF dump                            The architecture of the tool is introduced in Section 2, while
through a SPARQL endpoint that can be accessed only through                            the new features implemented are reported in Section 3. Related
SPARQL queries. Although the RDF format was designed to be                             work is described in Section 4. Section 5 sketches conclusions
processed by machines, there is a strong need for visualization                        and future work.
and exploration tools. Data visualizations make big and small
linked data easier for the human brain to understand, and visual-                      2     H-BOLD
ization also makes it easier to detect patterns, trends, and outliers                  H-BOLD1 (High-level visualizations on Big Open Linked Data) is
in groups of data.                                                                     a tool, available online, for visualizing, and interacting with LD.
   For this reason, we developed a tool called H-BOLD (High-                           It is defined in the context of hierarchical and interactive visual
level Visualization over Big Linked Open Data). H-BOLD aims to                         exploration and analysis over LD [13].
help the user explore the content of a Linked Data source by providing                     H-BOLD starts from our past experience with the tool LODeX,
a high-level view of the structure of the dataset and an interactive                   a tool able to automatically provide a summarization of a LD,
exploration that allows users to focus on the connections and                          including its inferred schema [1–3, 5], and tried to overcome the
attributes of one or more classes. Moreover, it provides a visual                      main limitations that arose during its evaluation [4]. The main goal
interface for querying the endpoint that automatically generates                       of H-BOLD was to facilitate the exploration of LD with a high
SPARQL queries.                                                                        number of classes. The current architecture of H-BOLD is shown
                                                                                       in Figure 1 where the updates, with respect to the previous version
KEYWORDS                                                                               [13], are highlighted in blue.
Linked Data, Visualization, Big Data, SPARQL, Semantic Web,                                H-BOLD is composed of a server layer and a presentation layer
Visual Querying, Data Visualization                                                    that will be described in the following.

                                                                                       2.1     Server layer
1    INTRODUCTION
                                                                                       The server layer aims to generate high-level representations of a
Since 2006, the year in which Sir Tim Berners-Lee coined the
                                                                                       set of LD starting from a list of SPARQL endpoints. The outputs
term Linked Data (LD), leading to a new way in which data can
                                                                                       are the Cluster Schema (a high level representation of a complex
be structured and accessed through the Internet, the number
                                                                                       and big source) and the Schema Summary (a low level repre-
of LD exploded. Starting from governments, many institutions,
                                                                                       sentation of the instanced classes within a source). The SPARQL
enterprises, and private individuals adopted this method for publishing data.
                                                                                       endpoint list is created starting from the old list of endpoints used
As a result, there is a huge number of LD that can be accessed
                                                                                       in LODeX and adding new SPARQL endpoints that are present
through the Internet. Unfortunately, endpoints are big containers
                                                                                       on Open data portals. Moreover, we enable users to manually
of triples. They can contain every kind of information and the
                                                                                       add a new URL for a SPARQL endpoint they wish to visualize
high number of triples required to express a concept made LD
                                                                                       and explore in H-BOLD. On each endpoint a set of queries is
visualization a non-trivial task. Due to the volume and the variety
                                                                                       executed in order to extract structural and statistical information
of information, it is hard to find a common procedure for explor-
                                                                                       that describes the LD; this phase is called Index Extraction. In
ing every dataset but nonetheless, several research groups tried
                                                                                       particular, the indexes are the number of instances, the number
to address this task [10, 12, 16, 17]. A hierarchical visualization
                                                                                       of classes, the list of classes with the respective properties and
can address the problem of information overloading, offering an
                                                                                       the number of instances belonging to a specific class. The Index
effective mechanism for information abstraction and summariza-
                                                                                       Extraction is able to deal with the performance issues of the dif-
tion. Additionally, an interactive exploration allows the user to
                                                                                       ferent implementations of SPARQL endpoints by using pattern
understand step-by-step the content of even complex and big LD.
                                                                                       strategies [1].
   In this paper, we present a new version of the tool H-BOLD
                                                                                          Starting from the indexes, it is then possible to create a Schema
(High-level visualizations on Big Open Linked Data). H-BOLD
                                                                                       Summary of the LD, a pseudograph that represents, through
enables the exploratory search and multilevel analysis of Big
                                                                                       nodes and arcs, the relations between the various instantiated
© 2020 Copyright for this paper by its author(s). Published in the Workshop            classes of the dataset [2, 5]. The Schema Summary is a good
Proceedings of the EDBT/ICDT 2020 Joint Conference (March 30-April 2, 2020, Copenhagen, approach to compactly represent an RDF dataset; however, when
Denmark) on CEUR-WS.org. Use permitted under Creative Commons License At-
tribution 4.0 International (CC BY 4.0)
                                                                                       1 https://dbgroup.ing.unimore.it/hbold/
                                           Figure 1: Architecture and Workflow of H-BOLD
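
The construction of the Schema Summary from the extracted indexes can be illustrated with a minimal sketch. It assumes that the Index Extraction yields records of the form (subject class, property, object class, number of triples); this record layout, the function names and the sample class and property names are hypothetical and are not taken from the H-BOLD code base.

```python
from collections import defaultdict


def build_schema_summary(records):
    """Build a pseudograph from (source class, property, target class, count) records.

    Parallel edges (same pair of classes, different property) and
    self-loops are both allowed, as in a pseudograph.
    """
    nodes = set()
    edges = defaultdict(int)  # key: (source class, property, target class)
    for source, prop, target, count in records:
        nodes.update((source, target))
        edges[(source, prop, target)] += count
    return nodes, dict(edges)


def degree(nodes, edges):
    """Degree of each class: in-degree plus out-degree (a self-loop counts twice)."""
    deg = {node: 0 for node in nodes}
    for source, _prop, target in edges:
        deg[source] += 1
        deg[target] += 1
    return deg


# Illustrative records only; the names are invented for this example.
records = [
    ("Event", "hasSituation", "Situation", 120),
    ("SessionEvent", "isSubEventOf", "Event", 40),
    ("Event", "isSubEventOf", "Event", 15),
]
nodes, edges = build_schema_summary(records)
deg = degree(nodes, edges)
```

Keeping the property name in the edge key is what lets two classes be linked by several distinct arcs, one per property.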


we are dealing with big sources, it happens that the number                Another option for the user is to start from the exploration
of classes is high, thus the graph contains a high number of            of the Schema Summary; here, he/she will see a complete graph
nodes and the visualization becomes complex and confusing. On           containing all the instantiated classes of the LD (see Figure 2,
the Schema Summary, a set of community detection techniques             step 4). The user can focus on a particular class and explore its
has been used to create a high-level visualization for Big LD           attributes and properties. Figure 2 shows the visualization steps
[15]. The classes of the Schema Summary are grouped into                in H-BOLD starting from the Cluster Schema and then selecting
Clusters; therefore, a Cluster Schema is generated for each LD,         a class and expanding the graph till the visualization of the entire
where nodes are groups of classes and arcs are connections              Schema Summary over the Scholarly LD2 .
among these Clusters. In the clustering of the Schema Summary,
the possibility that a node belongs to several Clusters is avoided.     3    RE-ENGINEERING THE TOOL
The labels in the Cluster Schema are assigned based on the degree       The H-BOLD application has been renovated both in the server
(the sum of in-degree and out-degree) of the classes (nodes) that       layer and in the presentation layer. The server layer is
are represented by the cluster. For the formal definition of the        implemented entirely in Python and it uses MongoDB for the storage
Schema Summary and Cluster Schema see [15].                             of information. With respect to the previous version of H-BOLD,
   The Schema Summary and Cluster Schema offer several                  it has been equipped with new features for making the
advantages: they can be easily stored in and retrieved from             application more responsive to the user requests. First of all, to enhance
MongoDB, improving data retrieval performance and graph                 the understanding of the behavior of the tool, a guideline has
visualization.                                                          been drawn up: the application code has been studied and
                                                                        various flowcharts have been extracted. A process for continuously
2.2    Presentation layer                                               updating and enriching our collection of SPARQL endpoints has
                                                                        been implemented (see Subsection 3.1 for further details). The
In the presentation layer, the first step for a user is the selection   Cluster Schema creation has been re-implemented and included
of a dataset; then the user can start the exploration of the Cluster    in the Server Layer, while previously it was calculated on-the-fly
Schema or the exploration of the Schema Summary. The first is           in the Presentation Layer (Subsection 3.2). The list of SPARQL
more concise, while the second is the complete visualization of         endpoints has been enriched thanks to the integration of three
the structural information on the LD.                                   new open data portals (Subsection 3.3).
   If the user opts for the Cluster Schema (see Figure 2, step 1),         The presentation layer went through a complete re-engineering.
he/she will see a condensed representation of the Schema Summary        Due to many incompatibilities between current browsers
that is obtained by applying a community detection algorithm.           and the version of Polymer3 in use (i.e., a JavaScript framework
From the Cluster Schema, by selecting a class within a cluster,         developed by Google that helps build web applications through
a new visualization focused on the selected class is proposed           the concept of web components), we were forced to re-implement
to the user. The user might then further explore the class, its         the presentation layer of H-BOLD.
connections with other classes and its attributes, or can                  We started searching for alternatives that could replace Polymer.
iteratively enlarge the displayed graph by expanding the connections    At the end of this search, we decided to move towards
starting from some classes (nodes in the graph). In each partial        Bootstrap4 , a front-end framework developed at Twitter for
representation of the Schema Summary, the user is informed              building and styling web applications. Therefore, the interface of the
about the percentage of the instances represented by the graph          web application has been completely rebuilt with Bootstrap. The
and the total number of nodes (see Figure 2, steps 2 and 3). This
                                                                        2 http://www.scholarlydata.org/
expansion can be repeated until all the classes are displayed, as       3 https://www.polymer-project.org/
in the Schema Summary visualization.                                    4 https://getbootstrap.com/
Figure 2: Step-by-step visualization of the Scholarly LD. From left to right: 1) visualization of the Cluster Schema,
2) exploration of the "Event" class, 3) further expansion and exploration of the dataset, 4) complete visualization of the Schema
Summary.
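
The step-by-step expansion of Subsection 2.2, together with the percentage of instances reported to the user, can be sketched as follows. This is only an illustration under simplifying assumptions: the Schema Summary is reduced to a list of (source class, target class) pairs, each instance is assumed to belong to exactly one class, and the function names and sample counts are hypothetical.

```python
def expand(focus, edges, shown=None):
    """Return the displayed classes after expanding the connections of `focus`."""
    shown = set(shown or ()) | {focus}
    for source, target in edges:
        if source == focus:
            shown.add(target)
        elif target == focus:
            shown.add(source)
    return shown


def coverage(shown, instances):
    """Percentage of all instances that belong to the classes currently displayed."""
    total = sum(instances.values())
    if total == 0:
        return 0.0
    return 100.0 * sum(instances[c] for c in shown) / total


# Invented instance counts and connections, for illustration only.
instances = {"Event": 50, "Situation": 30, "Person": 20}
edges = [("Event", "Situation"), ("Person", "Situation")]

step1 = expand("Event", edges)             # focus on one class
step2 = expand("Situation", edges, step1)  # expand one of its neighbours
```

Repeating `expand` on newly displayed classes eventually reaches every class connected to the starting one, at which point the coverage of that component matches the full Schema Summary view.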


JavaScript graphic library, D35 , already in use in H-BOLD, has         3.2     Cluster Schema visualization and
been integrated with Bootstrap to create interactive                            generation
visualizations. Moreover, new features improve H-BOLD. A new
                                                                        The Cluster Schema is the core part that has been inserted in
interface allowing users to manually insert a SPARQL endpoint
                                                                        H-BOLD, starting from the predecessor LODeX. The Cluster
has been added (Subsection 3.4). New visualization layouts have
                                                                        Schema makes it possible to visualize, in an aggregated way, all the classes of
been conceived for the exploration of the Cluster Schema and
                                                                        a LD. In the previous demo of H-BOLD [13], the Cluster Schema
the Schema Summary (Subsection 3.5).
                                                                        was calculated on-the-fly by running the community detection
                                                                        algorithm each time a user asked to see a Cluster Schema. This
3.1      Endpoint extraction automation and                             procedure has several weak points. Since the Cluster Schema
         updates                                                        is computed over the Schema Summary, if the Schema Summary
                                                                        does not change, then the Cluster Schema will not change either,
Adding new content inside a SPARQL endpoint is a simple task
                                                                        so it does not make sense to recompute the Cluster Schema on
since the only constraint is that data must be represented in one
                                                                        each user click. Moreover, even if the community detection
of the RDF serialization formats. Adding new classes or relations
                                                                        algorithm took a short time to run, the user had to wait for the information
is just as simple as adding new instances. As a consequence, the
                                                                        to be both transformed and loaded before its visualization. For
structure and also the content of a LD could change very often.
                                                                        these reasons, we moved the computation of the community
In H-BOLD, we want to display the most up-to-date version of the
                                                                        detection algorithm to the server layer, together with the storage of
dataset we indexed. For this reason, we automated the procedure
                                                                        the Cluster Schema in the MongoDB. Now, the Cluster Schema
of index extraction and Schema Summary and Cluster Schema
                                                                        is computed only once, after the index extraction procedure and
generation to run daily. Working with SPARQL Endpoints since
                                                                        the Schema Summary computation, and then stored in the DB.
2014, we noticed two important aspects. First, a SPARQL
                                                                        Therefore, both the Schema Summary and Cluster Schema can
Endpoint might often be unavailable6 , but this does not mean that
                                                                        be visualized by directly querying the DB. Experimental results
it is permanently out of order; it might work again after 1 or 2 days.
                                                                        showed that, on half of the SPARQL endpoints stored in H-BOLD,
Second, LD do not change daily; they usually change weekly or
                                                                        the time needed to display the Cluster Schema to the user is
monthly, or never change. For these reasons, it is unnecessary to
                                                                        decreased by 35%.
run the index extraction over all the datasets daily; it is enough to
run it weekly. However, since a SPARQL endpoint might not be
available one day and be online the next, we should                     3.3     Automatic insertion of SPARQL
check its availability. Therefore, we decided to store the date of              endpoints by crawling open data portals
the last index extraction for each SPARQL endpoint. If the last         In the previous version of H-BOLD, the list of datasets that users
index extraction was executed less than seven days before, then         could visualize was manually created from a list of SPARQL
we do not update the information for that LD; however, if               endpoints available on DataHub. Unfortunately, the endpoint
something went wrong with the last index extraction because the         we queried for obtaining such information is no longer available
endpoint was not available, we can repeat the index extraction          but the same data can be found at this link 7 . Moreover, that
every day.                                                              list was quite old and some of the endpoints were no longer
                                                                        available. For this reason, we conducted a survey to identify
                                                                        the most relevant open data portals that contain
                                                                        links to SPARQL endpoints. Then, we extended both the server
5 https://d3js.org/
6 https://sparqles.ai.wu.ac.at/availability                             7 https://old.datahub.io/dataset/sparql-endpoint-status
             Figure 3: Interface for the insertion of a SPARQL endpoint and the e-mail confirming a successful extraction
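
The update policy of Subsection 3.1 can be summarized in a short sketch. We read the rule as: an endpoint indexed successfully within the last seven days is skipped, while an endpoint whose last attempt failed is retried at the next daily run. The function name, its parameters and the way the outcome of the last attempt is recorded are assumptions made for this example; only the seven-day interval and the daily retry come from the text.

```python
from datetime import date, timedelta

# Weekly refresh interval stated in Subsection 3.1.
REINDEX_INTERVAL = timedelta(days=7)


def needs_extraction(last_attempt, last_succeeded, today):
    """Decide whether the daily job should run the index extraction on an endpoint.

    last_attempt   -- date of the last index extraction, or None if never indexed
    last_succeeded -- True if that extraction completed (hypothetical stored flag)
    today          -- date of the current daily run
    """
    if last_attempt is None:
        return True   # never indexed: extract now
    if not last_succeeded:
        return True   # endpoint was unavailable: retry every day
    return today - last_attempt >= REINDEX_INTERVAL


today = date(2020, 3, 30)
recent_ok = needs_extraction(date(2020, 3, 27), True, today)
recent_failed = needs_extraction(date(2020, 3, 29), False, today)
old_ok = needs_extraction(date(2020, 3, 20), True, today)
```

Storing the outcome next to the date is what allows the daily job to distinguish a fresh index from a failed attempt.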


layer and the presentation layer. The server layer was set up to           3.4    Manual insertion of new endpoints
query three new open data portals. In particular, now, H-BOLD              Even with the crawling of SPARQL endpoints from open data
searches for new endpoints in the following portals:                       portals, we are not able to reach, index and expose every SPARQL
      • European Data Portal8 . It contains open data published            endpoint available on the Internet. Therefore, to further increase
        on the open data portals of the different European countries,      the number of indexed datasets, we integrated in H-BOLD a
        regions and local administrations;                                 procedure through which the user is able to upload the URL of a
      • EU Open Data Portal9 . It holds the data produced by the           SPARQL endpoint and to see this new dataset listed among the
        different organizations of the European Union;                     others in the H-BOLD dataset list (we tested this procedure over
      • IO Data Science of Paris10 . Born with the idea of creat-          some open datasets [6]). Since the index extraction procedure
        ing new synergies between data, it contains a metadata             can be time-consuming, the user is asked to provide an e-mail
        description of several LD.                                         address so that the system can notify him/her about the status of
                                                                           the extraction. At the end of the extraction, the e-mail address
   The portals were deeply explored in order to gain knowledge
                                                                           is deleted, since we do not want to keep personal data, while the
over their content. Initially, we tried to produce three ad-hoc
                                                                           dataset is added to the list of available datasets. In this way,
queries for extracting the maximum amount possible of SPARQL
                                                                           we enlarge the list of LD that H-BOLD is able to visualize and
endpoints but then we found that the query presented in Listing
                                                                           we make H-BOLD more responsive to the user requests. Figure
1 perfectly fits all the portals.
                                                                           3 contains the interface for uploading the URL of a SPARQL
                                                                           endpoint and the email that will be sent to the users in case the
Listing 1: Query sent to the open data portals to extract a                index extraction procedure is successful.
list of SPARQL endpoints
PREFIX dcat: <http://www.w3.org/ns/dcat#>                                  3.5    New visualizations
PREFIX dc: <http://purl.org/dc/terms/>
                                                                           The previous version of H-BOLD adopted only graphs for displaying
SELECT ?dataset ?title ?url                                                the information (as shown in Figure 2), and on some occasions
WHERE {                                                                    they were not suited for extracting the maximum amount of
    ?dataset a dcat:Dataset .                                              information. For instance, the number of instances belonging to a
    ?dataset dc:title ?title .
    ?dataset dcat:distribution ?distribution .
                                                                           class is hard to extrapolate from a graph and it is also hard to
    ?distribution dcat:accessURL ?url .                                    understand which classes have been aggregated into which cluster.
    filter(regex(?url, 'sparql')) .                                        Moreover, when the number of classes and relations is high, it is
}                                                                          hard to understand the various connections. For this reason we
    With this research, we discovered 65 SPARQL endpoints on               implemented four supplementary visualization layouts, three for
the European Data Portal, 9 SPARQL endpoints on the EU Open                the Cluster Schema and one for the Schema Summary. We drew
Data Portal and 15 SPARQL endpoints on the IO Data Science of              inspiration from an extensive analysis over other LD visualiza-
Paris. Some endpoints were already present in H-BOLD, therefore            tion tools [14]. The new visualizations for the Cluster Schema
we were effectively able to increment the number of endpoints in           allow displaying together the clusters and the classes, providing
our collection by 70 units. As a result, the number of endpoints           users with a complete high-level overview of the dataset. The
listed in H-BOLD rose from 610 to 680. Since some of them                  new visualization for the Schema Summary allows a better
are not working or are not compatible with the index extraction            understanding of the interconnections among classes and the incoming
phase of H-BOLD, we were able to index and expose the structure            and out-going properties.
of 20 new datasets raising the number of indexed endpoints from               3.5.1 Treemap visualizations of the Cluster Schema. Treemaps
110 to 130.                                                                are an alternative way of visualising the hierarchical structure of
                                                                           a Cluster Schema while also displaying quantities for each cluster
8 https://www.europeandataportal.eu/                                       and each class within the cluster via area size. Each cluster is
9 https://data.europa.eu/euodp/en/home/                                    assigned a rectangular area with a specific color, and the rectangles
10 https://io.datascience-paris-saclay.fr
                                                                           of its classes are nested inside it. When a quantity is assigned to
 Figure 4: Treemap visualization of the Cluster Schema
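
The way the results of Listing 1 feed the endpoint collection of Subsection 3.3 can be sketched as below. The sketch assumes that a portal returns its answer in the standard SPARQL JSON results format and that two URLs differing only in surrounding whitespace or a trailing slash denote the same endpoint; the function names, the normalization rule and the example URLs are hypothetical.

```python
import json


def extract_endpoint_urls(sparql_json):
    """Read the ?url bindings from a SPARQL JSON results document."""
    data = json.loads(sparql_json)
    return [b["url"]["value"] for b in data["results"]["bindings"] if "url" in b]


def normalize(url):
    """Simplistic normalization (an assumption): trim spaces and a trailing slash."""
    return url.strip().rstrip("/")


def merge_endpoints(known, discovered):
    """Return the discovered URLs not yet in the collection, without duplicates."""
    seen = {normalize(u) for u in known}
    new = []
    for url in discovered:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            new.append(url)
    return new


# Invented response and collection, for illustration only.
response = json.dumps({
    "head": {"vars": ["dataset", "title", "url"]},
    "results": {"bindings": [
        {"url": {"type": "uri", "value": "http://example.org/sparql/"}},
        {"url": {"type": "uri", "value": "http://data.example.net/sparql"}},
        {"url": {"type": "uri", "value": "http://data.example.net/sparql"}},
    ]},
})
known = ["http://example.org/sparql"]
new_endpoints = merge_endpoints(known, extract_endpoint_urls(response))
```

A deduplication step of this kind accounts for the gap between the endpoints discovered on the three portals and the endpoints actually added to the collection.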


a class, its rectangle area size is displayed in proportion to that
quantity and to the other quantities within the same cluster in
a part-to-whole relationship. Also, the area size of the cluster
is the total of its classes. If no quantity is assigned to a class,
then its area is divided equally amongst the other classes within
its cluster. The treemap built over the instance counts (shown        Figure 6: Circle Pack visualization of the Cluster Schema
in Figure 4) highlights the classes with the highest number of
instances, the size of the clusters and the predominance of some
                                                                      represented as a circle and its sub-branches are represented as
classes in terms of instances.
                                                                      circles inside it. Similarly to the rectangles in the Treemap, the
                                                                      circles might have different dimensions. As displayed in Figure
                                                                      6, the inner circles represent the classes, while the intermediate
                                                                      circles represent the clusters, an external circle represents the
                                                                      entire dataset. In some cases, a cluster can contain only one class.
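
The three-level hierarchy (dataset, clusters, classes) displayed by the Cluster Schema layouts can be represented as a nested structure. The sketch below is a minimal illustration and not the H-BOLD implementation: the name/children/value keys follow the default shape accepted by d3.hierarchy, while the function names, the cluster names and the instance counts are assumptions.

```python
def cluster_hierarchy(dataset, clusters, instances):
    """Build a dataset -> cluster -> class tree with instance counts on the leaves.

    clusters  -- mapping from cluster label to the list of its classes
    instances -- mapping from class name to its number of instances
    """
    root = {"name": dataset, "children": []}
    for cluster_name, classes in clusters.items():
        leaves = [{"name": c, "value": instances.get(c, 0)} for c in classes]
        root["children"].append({"name": cluster_name, "children": leaves})
    return root


def total(node):
    """Size of a node: its own value for a class, the sum of its children otherwise."""
    if "children" in node:
        return sum(total(child) for child in node["children"])
    return node["value"]


# Invented clusters and counts, for illustration only.
clusters = {"Event": ["Event", "SessionEvent"], "Person": ["Person"]}
instances = {"Event": 50, "SessionEvent": 30, "Person": 20}
tree = cluster_hierarchy("Scholarly LD", clusters, instances)
```

Because the size of a cluster is the sum of its classes, the same tree can drive layouts based on nested rectangles, rings or circles without further processing.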




 Figure 5: Sunburst visualization of the Cluster Schema


   3.5.2 Sunburst visualizations of the Cluster Schema. The Sun-
burst Chart visualization (Figure 5) shows the hierarchy through
a series of rings that are sliced for each category node. The inner
ring represents the clusters while the outer ring shows the classes
grouped by the clusters.                                              Figure 7: Hierarchical Edge graph visualization of the
                                                                      Schema Summary
   3.5.3 Circle Pack visualizations of the Cluster Schema. The
Circle Packing (Figure 6) is a variation of a Treemap that uses
circles instead of rectangles. Containment within each circle           3.5.4 Hierarchical edge bundling visualization for the Schema
represents a level in the hierarchy: each branch of the tree is       Summary. Hierarchical edge bundling is a method developed by
Holten in 2006 [11] to visualize adjacency relations                    project funded by the “Enzo Ferrari" Engineering Department
between entities organized in a hierarchy. The idea is to bun-          of the University of Modena and Reggio Emilia within FAR2019.
dle the adjacency edges together to decrease the clutter usually        The contents of this publication are the sole responsibility of its
observed in complex networks. This data visualisation method            authors and do not necessarily reflect the opinion of the European
makes it possible to check connections between leaves (classes in our case) Union.
of a hierarchical network.
   Using the Hierarchical Edge Bundling layout, the classes are         REFERENCES
displayed over an invisible circumference and the properties are         [1] Fabio Benedetti, Sonia Bergamaschi, and Laura Po. 2014. Online Index Ex-
                                                                             traction from Linked Open Data Sources. In LD4IE@ISWC (CEUR Workshop
arcs within the circumference. This layout is perfectly suited               Proceedings), Vol. 1267. CEUR-WS.org, 9–20.
for understanding links within the classes and the domain and            [2] Fabio Benedetti, Sonia Bergamaschi, and Laura Po. 2015. Exposing the
range of the properties that connect the classes. As an example,             Underlying Schema of LOD Sources. In IEEE/WIC/ACM International Con-
                                                                             ference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2015,
in Figure 7, the node in bold (Event) is the class of interest, the          Singapore, December 6-9, 2015 - Volume I. IEEE Computer Society, 301–304.
node in green (Situation) is the rdfs:range class of the properties          https://doi.org/10.1109/WI-IAT.2015.99
that connect it to the class of interest and the nodes in red            [3] Fabio Benedetti, Sonia Bergamaschi, and Laura Po. 2015. LODeX: A Tool for
                                                                             Visual Querying Linked Open Data. In International Semantic Web Conference
(Vevent, SessionEvent, ConferenceSeries and InformationObject) are           (Posters & Demos) (CEUR Workshop Proceedings), Vol. 1486. CEUR-WS.org.
the rdfs:Domain classes of the properties that connect them to           [4] Fabio Benedetti, Sonia Bergamaschi, and Laura Po. 2015. Visual Querying
                                                                             LOD sources with LODeX. In K-CAP. ACM, 12:1–12:8.
the class of interest.                                                   [5] Fabio Benedetti, Laura Po, and Sonia Bergamaschi. 2014. A Visual Summary for
                                                                             Linked Open Data sources. In Proceedings of the ISWC 2014 Posters & Demon-
4    RELATED WORK                                                            strations Track a track within the 13th International Semantic Web Conference,
                                                                             ISWC 2014, Riva del Garda, Italy, October 21, 2014 (CEUR Workshop Proceedings),
In [7], a model for building, visualizing, and interacting with hier-        Matthew Horridge, Marco Rospocher, and Jacco van Ossenbruggen (Eds.),
archically organized numeric and temporal LD has been proposed.              Vol. 1272. CEUR-WS.org, 173–176. http://ceur-ws.org/Vol-1272/paper_136.pdf
                                                                         [6] Domenico Beneventano, Sonia Bergamaschi, Luca Gagliardelli, and Laura
This method has been implemented in a framework for hierarchi-               Po. 2015. Open Data for Improving Youth Policies. In KEOD 2015 - Proceed-
cal charting and exploration of LD called "rdf:SynopsViz" [8]. The           ings of the International Conference on Knowledge Engineering and Ontology
tool is available online11 for the exploration of a single dataset           Development, part of the 7th International Joint Conference on Knowledge Discov-
                                                                             ery, Knowledge Engineering and Knowledge Management (IC3K 2015), Volume
and the hierarchical charting available are mainly focused on                2, Lisbon, Portugal, November 12-14, 2015, Ana L. N. Fred, Jan L. G. Dietz,
numeric or datetime properties. The exploration of new SPARQL                David Aveiro, Kecheng Liu, and Joaquim Filipe (Eds.). SciTePress, 118–129.
                                                                             https://doi.org/10.5220/0005625401180129
endpoint is not working12 .                                              [7] Nikos Bikakis, George Papastefanatos, Melina Skourla, and Timos Sellis.
   In [18], a hierarchical co-clustering approach over LD has                2017. A hierarchical aggregation framework for efficient multilevel vi-
been proposed. It simultaneously groups links and entity classes             sual exploration and analysis. Semantic Web 8, 1 (2017), 139–179. https:
                                                                             //doi.org/10.3233/SW-160226
exploiting measures of intra-link and intra-class similarity. This       [8] Nikos Bikakis, Melina Skourla, and George Papastefanatos. 2014. rdf: Syn-
approach has been implemented in a LD browser called CoClus.                 opsViz - A Framework for Hierarchical Linked Data Visual Exploration and
Although extensive evaluations have been carried out and demon-              Analysis. In The Semantic Web: ESWC 2014 Satellite Events - ESWC 2014 Satel-
                                                                             lite Events, Anissaras, Crete, Greece, May 25-29, 2014, Revised Selected Papers
strated that the approach provides useful support for entity ex-             (Lecture Notes in Computer Science), Valentina Presutti, Eva Blomqvist, Raphaël
ploration, the browser CoClus is not directly accessible online,             Troncy, Harald Sack, Ioannis Papadakis, and Anna Tordai (Eds.), Vol. 8798.
                                                                             Springer, 292–297. https://doi.org/10.1007/978-3-319-11955-7_37
therefore it is impossible to compare it w.r.t. H-BOLD. Another          [9] Marie Destandau. 2019. Interactive visualisation techniques for the web of
interesting and recent tool for interactive LD visualization is              data. The Web Conference 2019 - Companion of the World Wide Web Conference,
S-Paths [9]. It can display multiple views on RDF resource sets              WWW 2019 (2019), 17–21. https://doi.org/10.1145/3308560.3314189
                                                                        [10] Sébastien Ferré. 2017. Sparklis: An expressive query builder for SPARQL
and supports browsing over the Web of Data. It is able to show               endpoints with guidance in natural language. Semantic Web 8, 3 (2017), 405–
different properties along paths in the graph. Users can navigate            418. https://doi.org/10.3233/SW-150208
between different resource sets by selecting a subset or pivoting.      [11] Danny Holten. 2006. Hierarchical Edge Bundles: Visualization of Adjacency
                                                                             Relations in Hierarchical Data. IEEE Transactions on Visualization and Com-
                                                                             puter Graphics 12 (2006). Issue 5.
5    CONCLUSION AND FUTURE WORK                                         [12] Steffen Lohmann, Vincent Link, Eduard Marbach, and Stefan Negru. 2016.
                                                                             Extraction and Visualization of TBox Information from SPARQL Endpoints. In
In this paper, we presented H-BOLD, a tool for multilevel in-                Proceedings of the 20th International Conference on Knowledge Engineering and
teractive visual exploration of Big LD that has been enhanced                Knowledge Management (EKAW 2016) (LNAI), Vol. 10024. Springer, 713–728.
                                                                        [13] Laura Po. 2018. High-level Visualization Over Big Linked Data. In Proceedings
with new visualizations: TreeMap, Sunburst Chart and Circle                  of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks
Packing. H-BOLD has been tested on 130 Big LD showing good                   co-located with 17th International Semantic Web Conference (ISWC 2018), Mon-
performances.                                                                terey, USA, October 8th - to - 12th, 2018 (CEUR Workshop Proceedings), Marieke
                                                                             van Erp, Medha Atre, Vanessa López, Kavitha Srinivas, and Carolina Fortuna
   In the next future, we intend to raise the number of endpoints            (Eds.), Vol. 2180. CEUR-WS.org. http://ceur-ws.org/Vol-2180/paper-50.pdf
indexed in H-BOLD by improving the index extraction procedure           [14] Laura Po, Nikos Bikakis, Federico Desimoni, and George Papastefanatos. 2020.
                                                                             Linked Data Visualization. Morgan & Claypool Publishers. to appear.
and by querying new repositories that collect SPARQL endpoints          [15] Laura Po and Davide Malvezzi. 2018. Community Detection Applied on Big
metadata. Moreover, we intend to evaluate the effectiveness of H-            Linked Data. J. UCS 24, 11 (2018), 1627–1650. http://www.jucs.org/jucs_24_
BOLD as a visualization tool through a survey involving different            11/community_detection_applied_on
                                                                        [16] Georgia Troullinou, Haridimos Kondylakis, Evangelia Daskalaki, and Dimitris
kinds of LD consumers: practitioners, unskilled users, domain                Plexousakis. 2015. RDF Digest: Efficient Summarization of RDF/S KBs. In
experts.                                                                     ESWC (Lecture Notes in Computer Science), Vol. 9088. Springer, 119–134.
                                                                        [17] Fabio Viola, Luca Roffia, Francesco Antoniazzi, Alfredo D’Elia, Cristiano
                                                                             Aguzzi, and Tullio Salmon Cinotti. 2018. Interactive 3D Exploration of
6    ACKNOWLEDGMENTS                                                         RDF Graphs through Semantic Planes. Future Internet 10, 8 (2018). https:
This work has been partially supported by the TRAFAIR project                //doi.org/10.3390/fi10080081
                                                                        [18] Liang Zheng, Yuzhong Qu, Xinqi Qian, and Gong Cheng. 2018. A hierarchical
2017-EU-IA-0167, co-financed by the Connecting Europe Facility               co-clustering approach for entity exploration over Linked Data. Knowl.-Based
of the European Union and the “Networking over Linked Data"                  Syst. 141 (2018), 200–210. https://doi.org/10.1016/j.knosys.2017.11.017

11 http://synopsviz.imis.athena-innovation.gr/
12 Tests have been performed on Jan 3rd, 2020
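
As a complement to the hierarchical edge bundling layout of Section 3.5.4, the following Python sketch illustrates, under simplifying assumptions, how classes could be placed on a circumference, how the classes linked to a class of interest could be split by domain and range, and how an edge could be bundled. It is not H-BOLD's implementation: the function names, the property names, and the flat hierarchy with a single root at the centre of the circle are assumptions made for illustration; only the class names come from the Figure 7 example. The straightening step follows the bundling-strength idea of Holten [11], where beta = 1 keeps the hierarchy-guided control points and beta = 0 yields a straight line.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Triple = Tuple[str, str, str]  # (domain class, property, range class)

# Class names come from the Figure 7 example; the property names are
# hypothetical placeholders, not taken from any real dataset.
PROPERTIES: List[Triple] = [
    ("Event", "hasSituation", "Situation"),
    ("Vevent", "relatedTo", "Event"),
    ("SessionEvent", "partOf", "Event"),
    ("ConferenceSeries", "hasEvent", "Event"),
    ("InformationObject", "describes", "Event"),
]


def circle_positions(classes: Sequence[str], radius: float = 1.0) -> Dict[str, Point]:
    """Place the classes evenly along a circumference centred at the origin."""
    n = len(classes)
    return {
        name: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, name in enumerate(classes)
    }


def neighbours(focus: str, properties: Sequence[Triple]) -> Tuple[List[str], List[str]]:
    """Return (range classes, domain classes) linked to the class of interest.

    Range classes are reached by properties whose domain is `focus`
    (green in Figure 7); domain classes are those whose properties
    point to `focus` (red in Figure 7).
    """
    ranges = sorted({r for d, _, r in properties if d == focus and r != focus})
    domains = sorted({d for d, _, r in properties if r == focus and d != focus})
    return ranges, domains


def control_points(source: Point, target: Point) -> List[Point]:
    """Control polygon for one edge, assuming a flat hierarchy whose only
    internal node (the root) sits at the centre of the circle."""
    return [source, (0.0, 0.0), target]


def straighten(points: Sequence[Point], beta: float) -> List[Point]:
    """Blend each control point with the straight line between the two
    endpoints: beta = 1 keeps the points, beta = 0 gives a straight edge."""
    n = len(points)
    if n < 2:
        return list(points)
    (x0, y0), (xn, yn) = points[0], points[-1]
    result = []
    for i, (x, y) in enumerate(points):
        t = i / (n - 1)
        line_x, line_y = x0 + t * (xn - x0), y0 + t * (yn - y0)
        result.append((beta * x + (1 - beta) * line_x,
                       beta * y + (1 - beta) * line_y))
    return result


if __name__ == "__main__":
    classes = sorted({c for d, _, r in PROPERTIES for c in (d, r)})
    positions = circle_positions(classes)
    ranges, domains = neighbours("Event", PROPERTIES)
    print("range classes (green):", ranges)
    print("domain classes (red):", domains)
    for domain, prop, rng in PROPERTIES:
        polygon = control_points(positions[domain], positions[rng])
        print(prop, straighten(polygon, beta=0.85))
```

The resulting control points would typically be passed to a spline renderer to draw the arcs; a deeper class hierarchy would add one control point per ancestor on the path between the two classes.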