=Paper=
{{Paper
|id=Vol-538/paper-2
|storemode=property
|title=Explorator: A tool for exploring RDF data through direct manipulation
|pdfUrl=https://ceur-ws.org/Vol-538/ldow2009_paper2.pdf
|volume=Vol-538
|dblpUrl=https://dblp.org/rec/conf/www/AraujoS09
}}
==Explorator: A tool for exploring RDF data through direct manipulation==
Explorator: a tool for exploring RDF data through direct manipulation.

Samur F. C. de Araújo
Catholic University of Rio de Janeiro
R. M. S. Vicente 225
Gávea, Rio de Janeiro, RJ, Brazil
+55 21 3527-1500
saraujo@inf.puc-rio.br

Daniel Schwabe
Catholic University of Rio de Janeiro
R. M. S. Vicente 225
Gávea, Rio de Janeiro, RJ, Brazil
+55 21 8241-4313
dschwabe@inf.puc-rio.br
ABSTRACT
In this paper we introduce Explorator, a tool for exploring Semantic Web data by direct manipulation. Explorator implements a model of operations that is supported by a visual interface that enables the user, with minimal knowledge of the RDF model, to explore an RDF database without a priori knowledge of the data domain. Consequently, it is well suited for tasks that involve information search, exploration and visualization.

Categories and Subject Descriptors
H.5.3 Web-based interaction; H.5.4 Hypertext/Hypermedia - Navigation; H.3.3 Information Search and Retrieval - search process, query formulation.

General Terms
Algorithms, Design, Experimentation, Human Factors, Languages, Theory, Verification.

Keywords
RDF, exploratory search, exploration, ontology, semantic web.

1. INTRODUCTION
As the volume of information on the Web increases considerably, we need better tools to help us discover and make sense of the available information, as well as to seek answers to specific questions we may have.

Currently, seeking information is a task that permeates most of our day-to-day activities. Depending on the type of activity we perform, we use different strategies and tactics to search for information. On the Web, these tactics are supported by computational tools such as keyword search, navigation and browsing [11]. But the process of seeking information is not simply finding it: we must keep in mind that the task of the user ranges from simply searching for a known item to activities such as knowledge acquisition, understanding of concepts, discovery, planning, transforming, etc. [11]

A more recent development has been the Semantic Web (SW), and the rapidly growing amount of semantically annotated data leads to the need to support not only searching, but also investigating and learning about a set of data without a priori knowledge of its domain. This data is expressed in RDF¹, and is typically stored in very large interconnected databases, without a homogeneous schema. The exploration mechanisms currently available are not sufficient to accomplish the user tasks in the SW. Keyword search, e.g. Sindice², only addresses simple information lookup. Explicitly formulated queries, e.g. iSparql³, require schema and technical knowledge from the users. Semantic browsers, e.g. Tabulator [3], are not designed to explore huge datasets, and semantic faceted browsing, e.g. BrowseRDF [12], is inefficient for fact-finding or known-item retrieval and for some more complex exploratory tasks.

In this paper we describe a model for representing information processing by users in exploratory tasks, and the Explorator tool, which provides a browser interface supporting this model. Explorator is based on the metaphor of direct manipulation of information on the interface, with immediate feedback of user actions. The remainder of the paper is organized as follows. Section 2 defines more precisely the exploratory search itself; Section 3 presents the information processing model and describes the Explorator tool and its interface; Section 4 presents some details of its implementation; Section 5 presents some conclusions and directions for further work.

¹ RDF: Resource Description Framework
² http://sindice.com
³ iSparql can be accessed at http://demo.openlinksw.com/isparql/

Copyright is held by the author/owner(s).
LDOW 2009, April 20-24, 2009, Madrid, Spain.

2. EXPLORATORY SEARCH
In the hypertext field, we call information exploration the process of seeking, learning about and investigating a (potentially large) collection of information items through search, browsing or navigation, but not excluding other forms, in order to discover something new.

Research in the area called exploratory search [11] has tried to develop solutions that support information exploration. Exploratory search is applicable in situations where the user's task and the search environment have complex elements that require constant user interpretation during the exploration process. For example, how to support the user's search task when she is not familiar with the search domain, or she does not have sufficient knowledge about the domain to make a query; how to support the navigation in vast information spaces, or when the navigation, searching and browsing are not enough. In other words, how to
take into account all aspects [2, 7, 11] that influence the exploration process: the user's task, the user's context, the user's profile, the environment, the information provenance, etc.

Marchionini [11] made a distinction between exploratory search, lookup and search retrieval. According to him, exploratory search is based not only on lookup but also on investigation and learning. He argues that investigative search and learning search require more human iteration than a simple lookup, because these are exploratory processes that support tasks that require the cognitive and interpretative ability of the user. These kinds of tasks are commonly found in the exploration of RDF databases, where the users need to identify classes and properties from the schema, in order to understand concepts, acquire knowledge and learn about the domain.

Berners-Lee et al. [3] argue that once the information sought is found, it may be necessary to analyze it. According to their description, exploration and analysis are distinct processes that are inter-related during the user's task. In our view, the process of exploration involves both finding a piece of information and investigating or learning about its domain, because it is guided by the need to perform a task. The cognitive process of analysis permeates the entire exploratory task, since while browsing, the user creates an expectation of what she will obtain, sees what has been achieved and uses this information to guide her in the next step.

In order to provide the user with an exploratory search tool that supports learning and investigative search on the SW, we focused on three fronts:
• Information search (how semantic data is found on the Semantic Web),
• Information usage (how semantic data is used on the Semantic Web),
• Information visualization (how semantic data is presented on the Semantic Web).

2.1 Information Search (in the SW)
Nowadays, we can access SW data in three different manners: through a SPARQL Endpoint⁴, through a URI, or by processing semantically annotated HTML pages (e.g. Microformats⁵ or RDFa⁶). There are tools which can explore the SW directly, namely semantic web browsers such as Tabulator [3], Disco⁷, Zitgist data viewer⁸, Marbles⁹, ObjectViewer¹⁰ and Openlink RDF Browser¹¹.

These tools all implement a similar exploration strategy, allowing the user to visualize an RDF sub-graph in a tabular fashion or in a more "visual" way (e.g., map views or timelines) when applicable. The sub-graph is obtained by dereferencing [4, 6] a URI, and each tool uses a distinct approach for this. Tabulator is able to extract semantic annotations from HTML pages obtained from URIs that cannot be dereferenced as an RDF file, using GRDDL. Although distinct dereferencing processes may retrieve different amounts of information, the process itself does not improve the nature of the tasks performed in these tools. In fact, the set of exploration tasks is limited to navigation between sub-graphs by clicking on the resources displayed in the interface and dereferencing the corresponding URIs.

Another way to access SW data is by querying a SPARQL Endpoint that receives a SPARQL¹² query and returns a set of RDF resources described in XML notation. There are a few tools that allow us to explore a SPARQL Endpoint. NITELIGHT [15] and iSPARQL¹³ are Visual Query Systems (VQS) [5] which allow visual construction of SPARQL queries, differing mainly in the visual notation employed. To use these tools the user must have a full comprehension of the underlying RDF schema and the query language syntax, which leads to a high cognitive load for newcomers and less experienced users.

Tabulator also provides a way to query its data using SPARQL, through an interface in which the user can formulate a query based on the selection of elements of the RDF graph displayed on the interface. However, more complex queries need to be edited manually, exposing the user to some of the issues cited before.

Some tools address a different goal in the process of accessing SW data. Instead of focusing on access to RDF data, they focus on how to consume RDF data. Exhibit [9] is a lightweight structured data publishing tool that can be used to export small collections of RDF data. This tool plays an important role on the SW, by publishing content from different sources on the Web.

Taking all this into consideration, we can see there are no tools adequate to explore the semantic web as a whole. Currently, the browsers and SPARQL query builders address different goals, and were designed for different kinds of users. In order to provide a complete and integrated exploratory search mechanism to access SW data, we propose Explorator.

⁴ http://www.w3.org/TR/rdf-sparql-protocol/
⁵ http://microformats.org/
⁶ http://www.w3.org/TR/xhtml-rdfa-primer/
⁷ http://www4.wiwiss.fu-berlin.de/bizer/ng4j/disco/
⁸ http://dataviewer.zitgist.com/
⁹ http://beckr.org/marbles
¹⁰ http://objectviewer.semwebcentral.org/
¹¹ http://demo.openlinksw.com/rdfbrowser/index.html
¹² http://www.w3.org/TR/rdf-sparql-query/
¹³ iSparql can be accessed at http://demo.openlinksw.com/isparql/

2.2 Information Usage (in the SW)
The RDF model provides a format for data, information, and knowledge exchange. However, the repositories of data are scattered on the SW, which demands a unified mechanism to access them. Many information-intensive human tasks demand the manipulation of multiple pieces of information. In a SW exploration tool, at a low level, the objects manipulated are RDF data (resources, triples, literals, properties, etc.) and queries. These are the information items being manipulated when using an RDF browser.

Consider the SW user looking for all papers mentioning another paper, or all paper authors' phone numbers. The user may encounter different data architectures while performing such tasks. For example, the information sought may be stored in multiple RDF files or in a single large RDF repository, and expressed in distinct vocabularies. It is crucial that any exploratory tool be able to consolidate the information to be accessed in an integrated way. The user should be able to merge information described in different vocabularies, at least by directly manipulating each piece of information. For example,
suppose she is looking for all email addresses by dereferencing four different URIs, each one returning triples expressed in a distinct vocabulary. Even if she could see all the data together, she would not be able to manipulate this set of information to obtain a unique final set of email addresses using only the functionality of current RDF browsers.

Some of these browsers, like Openlink RDF Browser, cache all RDF data during the user's navigation. Therefore, the user can treat pieces of information from different sources as coming from a unique repository. However, the user cannot issue a query on the results, which limits the kinds of tasks supported. For example, it is very difficult to obtain the homepage address for all people known to someone, as reported in their FOAF profile, by using one of the RDF browsers mentioned earlier.

From the user's task point of view, exploring the SW involves asking questions and getting answers about the schema and instances. Obviously, understanding what is presented, and what can be manipulated and how, is essential for the user to be able to formulate her question. Thus, querying is an important way for the user to increase her knowledge about the schema and data contained in an RDF repository. Direct SPARQL query formulation, which is allowed in some browsers, still imposes a high mental load on the user, even on the more advanced ones. In addition, the user often does not have enough knowledge about the domain to formulate a query. As seen in Catarci et al. [5], the raw use of query languages induces the user to make mistakes during writing, considerably increasing the time for query formulation, and is usually far from the mental model that the user has of the reality.

Ding et al. [7] argue that the object of interest is not only the domain schema and instances, but also the source of data, which is an important piece of information in the exploratory process. In fact, when we are exploring several repositories, we may want to know where each piece of information comes from. Marbles and Disco are examples of RDF browsers that track the provenance of the information, helping the user in judging its credibility.

In summary, current tools allow the user to manipulate raw RDF data and do not provide a user-friendly way to ask questions. The user is limited to visualizing the result as aggregate data. Any processing is done manually, and the user has only limited ways to rearrange, group or filter the data, and process it further. We will discuss later how Explorator can be a step forward in SW data manipulation.

2.3 Information Visualization (in the SW)
A SW browser navigates along relationships between concepts. At each step of navigation, in this unknown and semi-structured (in the sense of schema-less) space, a set of RDF triples is displayed in the interface.

Browsers such as Disco, Marbles, Zitgist data viewer and Openlink RDF Browser represent RDF data in a tabular fashion. In Disco's interface, each triple is a line in a two-column table, and navigation is done by clicking on the resources displayed in the interface. Marbles does the same, and groups the values of properties that occur more than once for the same resource. In addition to the tabular presentation, the user has a more refined view of the triples being displayed. As in Disco, for each navigation step, the whole content is replaced by a new set of triples retrieved from the dereferenced URI.

Tabulator's more general view represents the information in a tree structure. As the user selects a resource in the interface, a new node is added to the tree, thus recording the user's navigation process in the interface. The authors argue that it is comfortable for the user to see the information in a tree-oriented interface, due to familiarity with other sources of data that are also represented in a hierarchical structure. The authors also proposed a model of views to be applied when the domain is known. A view oriented towards a specific domain improves the understanding of the instances being explored. For example, it is better to see geographic coordinates on a map than in a table.

From the user's task point of view, the representation of information helps its assimilation, but it does not expand the kinds of tasks that can be done. What we have observed so far is that without a proper model of exploration, involving well-defined operations, the user's exploration is reduced to navigating sequentially between the nodes of an RDF graph.

3. EXPLORATOR
Explorator¹⁴ is an open-source exploratory search tool for RDF graphs, built around a direct manipulation interface metaphor. It implements a custom model of operations, and also provides a Query-by-example [18] interface. Additionally, it provides faceted navigation over any set obtained through the operations of the model that are exposed in the interface. It can be used to explore a SPARQL endpoint as well as an RDF graph in the same way as "traditional" RDF browsers. Its general architecture is represented in the diagram below:

[Layered diagram, top to bottom: EXPLORATOR INTERFACE; EXPLORATOR MODEL; REPOSITORIES; SEMANTIC WEB DATA]
Figure 1. Explorator's general architecture.

At the most elementary level, the user's task is reduced to dereferencing a URI or formulating and executing a SPARQL query against a SPARQL Endpoint. In Explorator, every SPARQL Endpoint is a repository that can be enabled or disabled, and that can be manipulated individually or integrated into a single global source of RDF data. The triples obtained from dereferenced URIs are stored in a local SESAME¹⁵ repository, which can then be queried and manipulated as if it were a SPARQL Endpoint. In other words, the user always explores a federation of databases, containing SPARQL Endpoints and RDF triples obtained by dereferencing specific URIs.

¹⁴ Explorator information, including a demo interface and the URL of the subversion repository, can be accessed at http://www.tecweb.inf.puc-rio.br/explorator
¹⁵ http://www.openrdf.org/
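The idea of a federation of repositories that can be enabled or disabled individually can be illustrated with a small in-memory sketch. This is not Explorator's implementation (which, per the paper, relies on SPARQL Endpoints and a local SESAME store); the `Repository` class, the `federated_triples` function and the `ex:` triples are hypothetical names and toy data introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for a SPARQL endpoint or the local triple store."""
    name: str
    triples: set = field(default_factory=set)
    enabled: bool = True


def federated_triples(repositories):
    """Return the union of the triples held by every enabled repository."""
    merged = set()
    for repo in repositories:
        if repo.enabled:
            merged |= repo.triples
    return merged


# Toy data: one "remote endpoint" and one store of dereferenced URIs.
endpoint = Repository("remote-endpoint", {("ex:Russia", "rdf:type", "ex:Country")})
local = Repository("dereferenced-uris", {("ex:Baikal", "rdf:type", "ex:Lake")})
federation = [endpoint, local]

all_triples = federated_triples(federation)   # both repositories enabled
endpoint.enabled = False                      # the user disables one repository
local_only = federated_triples(federation)    # only the local store contributes
```

Treating the federation as a plain union keeps the user's view uniform: every operation sees a single set of triples, regardless of how many repositories are currently enabled.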
The set of manipulation operations is limited to the operations defined in our information processing model, which we describe next.

3.1 The Information Processing Model
Exploring a set of information items in the SW is understood here as a process of transforming resources and triples by successive application of operations.

Our experience in Web application design methods [10, 16] has shown us that it is useful to characterize the user's information processing as a set of manipulation operations, in what has been called "set based navigation" [14]. This view is also supported by more recent proposals such as Parallax¹⁶. Basically, the user is always processing (browsing) information items within a set of interest; if necessary, this set is further manipulated to either remove uninteresting elements or to add additional elements of interest.

¹⁶ http://mqlx.com/~david/parallax/index.html

Explorator's model is composed of two elements: the manipulated items and the manipulation operations. The items are primitive elements in the RDF model: triples, resources, literals, URIs, etc. The operations are grouped in two sets: set operations and search operations.

We will show in the following sub-sections that this model can encompass classical browsing, set-based navigation as found in SHDM [10], and faceted browsing, as well as keyword search.

3.1.1 Sets
The model manipulates two kinds of sets: sets of RDF triples and sets of RDF resources. When dealing with sets of RDF resources, the usual set operations (union, intersection and difference) are available. Since RDF resources are treated as URIs, blank nodes will only be included if they are assigned URIs, as occurs in some data stores.

When operating on sets of triples, we interpret the set operations as applying to any of the triple components, namely, subjects (S), predicates (P) or objects (O). This is equivalent to projecting a set of triples along one of its three slots.

3.1.2 Search Operation
As previously stated, there are two ways to access the data in the SW: dereferencing a URI or querying a SPARQL Endpoint. We define in our model a general query operation, called SPO(S, P, O), to be applied to a SPARQL Endpoint. This operation allows the user to obtain a new set of interest, which can then be processed in the next step of the task.

The SPO operation has three parameters, all of which are sets: a set of subjects, a set of predicates, and a set of objects. This operation covers a subset of general SPARQL queries, allowing the user to query an RDF database by providing an example pattern of the desired set of triples.

For example, the function SPO(∅,∅,∅) can be translated into the following SPARQL query:

SELECT ?s ?p ?o WHERE { ?s ?p ?o }

For the following data:

@prefix foaf: .
_:a foaf:name "Johnny Lee Outlaw" .
_:a foaf:mbox .
_:b foaf:name "Peter Goodguy" .
_:b foaf:mbox .
_:c foaf:mbox .

The query above should return all triples. On the other hand, the function SPO(∅,{foaf:mbox},∅) can be translated to:

SELECT ?s ?p ?o WHERE { ?s ?p ?o . FILTER (?p = foaf:mbox) }

This query returns all triples that have the property foaf:mbox.

Consider a more complex example of how this model could be used, to solve the task "find all Russian lakes". Let S be a function that returns all subjects from a set of triples.

SPO(
  S( SPO(∅,{rdf:type},{mondial:Lake}) ),
  {mondial:locatedIn},
  {mondial:Russia}
)

The expression above returns all triples whose subject is an instance of mondial:Lake and that have the property mondial:locatedIn with value mondial:Russia.

It should be noted that, whereas these examples show single-valued parameters, in general the parameters for SPO are sets.

3.1.3 Set Operations
The model allows the user to manipulate items of information within the RDF domain. Once the user has obtained a set of triples and resources, she can manipulate them individually, formulate new queries, or create new sets. To do so, the model supports the following set operations.

Let A be the set of all triples.

Union:
Given two sets of triples M and N, the union between M and N is the union of the triples of M and N.

Intersection:
The intersection set I between M and N contains the triples in A whose subjects appear in triples in both M and N.

Difference:
The difference set D between M and N contains the triples in A whose subjects appear in triples in M and do not appear in triples in N.

Note that, in this model, the result is always a set of triples, and the operations are always computed on the sets of subjects, predicates or objects of these triples.

3.2 Visualizing RDF data with Explorator
In existing RDF browsers, the data are expressed in one of the following metaphors: table, tree or graph. In our approach, the interface represents the elements of the underlying exploration model: resources, triples and sets.
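To make the translation of the SPO operation of Section 3.1.2 into SPARQL more concrete, the following minimal sketch builds a query string from three sets of terms. It is only an illustration under stated assumptions, not the translation performed by Explorator: the function name `spo_to_sparql` is hypothetical, an empty collection is taken to mean "any value", the terms are assumed to be already valid SPARQL terms (prefixed names or bracketed IRIs), and set-valued parameters are expressed as a disjunction of equality tests inside FILTER.

```python
def spo_to_sparql(subjects=(), predicates=(), objects=()):
    """Build a SPARQL SELECT for SPO(S, P, O); an empty collection means 'any'."""
    filters = []
    for var, terms in (("?s", subjects), ("?p", predicates), ("?o", objects)):
        terms = sorted(terms)  # sorted only to make the output deterministic
        if terms:
            # A set-valued parameter becomes a disjunction of equality tests.
            condition = " || ".join("%s = %s" % (var, term) for term in terms)
            filters.append("FILTER (%s)" % condition)
    body = " ".join(["?s ?p ?o ."] + filters)
    return "SELECT ?s ?p ?o WHERE { %s }" % body


# SPO(∅, ∅, ∅): no restriction on any slot.
all_triples_query = spo_to_sparql()

# SPO(∅, {foaf:mbox}, ∅): restrict the predicate only.
mbox_query = spo_to_sparql(predicates={"foaf:mbox"})
```

A nested expression such as the "Russian lakes" example would be evaluated in two steps under this sketch: the inner query is executed first, and the subjects of its result are passed as the `subjects` argument of the outer call.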
Figure 2. A set of triples displayed in Explorator. The subject is "Niger"; the properties and values are listed under it.

Considering a generic exploration mechanism over the RDF model, the concepts of triple, entity and resource are mixed in Explorator's interface. The predicates and objects of the triples are nested and right-aligned under the subject, thus evidencing the entity represented by the subject of the triple, as shown in Figure 2.

Explorator uses the following heuristic to render a resource (or URI) in the interface:
• If the resource has a label, name or title property, it renders its value.
• Otherwise the URI localname is rendered.

In this interface, each element can be manipulated individually. Sets of subjects, predicates and objects can be selected by the user and provided as parameters in the operations described in the model. Dereferencing a URI, or applying an operation of the model, always results in a new set in the interface. In this sense, Explorator incorporates elements of the Direct Manipulation paradigm [17], since the output of an operation may be used as input of another, as they are expressed in the same notation. Direct Manipulation is a user-system interaction paradigm that allows users to point at visual representations of objects and actions to carry out tasks rapidly and observe the results immediately. Explorator's interface follows this paradigm.

The interface has two main elements, the toolbar and the result sets. The toolbar has a menu giving access to repository configuration and additional functionalities; a search box; and a group of buttons representing the operations of the model.

Figure 3. Explorator toolbar.

The operations menu is divided in two groups, as shown in Figure 4. The first area (Fig. 4 - 1) has the set operations. To operate, the user must select the first set among the sets displayed, then click on the operation (union, intersection or difference), then select (click on) another set, and then click on '='. Specifically for union, the user can also click on multiple resources in the interface (ctrl-click) and then click on the union operation to form the corresponding set.

The second subdivision, marked as 2, includes the operands for the SPO operation. In this case, the user must select one set, and then click on one of S, P or O. She may also assign another set to one of the other operands (S, P, O). Clicking on "=" produces the result. Clicking on "clear" resets the operands previously selected.

Figure 4. Operations in Explorator toolbar.

The sets are represented as boxes, and stand for both sets of triples and sets of resources. Strictly speaking, all boxes represent sets of triples, which can be grouped by subject, property or object. Classes are shown in blue, and RDF properties are shown in green.

Figure 5. Sets of triples represented in Explorator's interface. On the left we have all triples with Budapest as subject. On the right we have some triples grouped by subject.

To select a triple the user simply clicks on the surrounding box, whose border becomes dashed to indicate the selection. If the user double-clicks on a triple, it is interpreted as a request for all triples with the same subject as the subject of the clicked triple.

3.3 Faceted Navigation
In addition to the operations already described, we have also defined a model for specifying tailor-made facets. This model can be specified using a custom-made vocabulary called FACETO, which we do not elaborate here for reasons of space.

While many tools implement faceted navigation (FacetMap¹⁷, Longwell¹⁸, BrowseRDF¹⁹, Flamenco²⁰, Exhibit²¹, /facet²² [8]), none allow the specification of facets using RDF.

¹⁷ http://www.facetmap.com/
¹⁸ http://simile.mit.edu/wiki/Longwell
¹⁹ http://browserdf.org/
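The rendering heuristic of Section 3.2 (show the value of a label, name or title property if one exists, otherwise the local name of the URI) can be sketched as follows. This is an illustrative reading of the heuristic, not Explorator's code: the function names, the matching of properties by their local name, and the preference order label, name, title are assumptions, and the example URIs are made up.

```python
# Properties whose value is used as display text; the order is an assumption.
LABEL_PROPERTIES = ("label", "name", "title")


def local_name(uri):
    """Return the part of a URI after the last '#' or '/'."""
    trimmed = uri.rstrip("/#")
    cut = max(trimmed.rfind("#"), trimmed.rfind("/"))
    return trimmed[cut + 1:]


def render(resource, triples):
    """Pick a display string for `resource` from the given (s, p, o) triples."""
    for wanted in LABEL_PROPERTIES:
        for s, p, o in triples:
            if s == resource and local_name(p) == wanted:
                return o
    # No label-like property found: fall back to the URI local name.
    return local_name(resource)


triples = [
    ("http://example.org/country/NE",
     "http://www.w3.org/2000/01/rdf-schema#label", "Niger"),
    ("http://example.org/country/NE",
     "http://example.org/schema/population", "9113001"),
]
labelled = render("http://example.org/country/NE", triples)
unlabelled = render("http://example.org/country/Chad", triples)
```

Matching on the local name of the property lets the same rule cover properties from different vocabularies (for instance a label property and a name property defined in separate namespaces) without listing each full URI.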
Figure 6: Explorator's faceted interface.

Using FACETO, the designer may:
1. Specify a facet based on a given RDF property;
2. Specify a facet based on computed values. For example, she may define a "dimension" facet based on the combination of values of the "width" and "height" properties.
3. Define synonyms among different resources that represent the same information.
4. Define a facet as an arbitrary enumeration of values, or as a range. For example, "inexpensive" and "expensive".
5. Specify a facet based on a hierarchical relation, such as "located in".

Note also that none of the existing tools can be applied directly to an arbitrary SPARQL Endpoint. Using Explorator, the user can facet any set of triples retrieved during her navigation.

As an added convenience, we have also implemented an algorithm, based on entropy measures, that, given a set of triples, determines the set of properties that is most discriminating for that set, and builds a set of facets based on these properties. Again, due to space limitations, we do not detail this algorithm here. This operator can be activated by clicking on the F* button in the interface of any set.

Due to limitations of the SPARQL language (the lack of aggregation functions), applying this operation over a SPARQL endpoint may be very time consuming.

²⁰ http://flamenco.berkeley.edu/
²¹ http://simile.mit.edu/exhibit/
²² http://slashfacet.semanticweb.org/

3.4 An Example
Let us now illustrate the usage of Explorator. Suppose the user needs to find all the lakes contained exclusively in Russia. There are several possible ways to achieve this task; one possible way would be as follows:
1. Find all the lakes in the database;
2. Find Russia, the country;
3. Find all the lakes in Russia, obtaining a set we will call LR;
4. Find the countries that share a boundary with Russia (Russia's neighbors);
5. Find all the lakes in Russia's neighbors, obtaining a set we will call LN; and
6. Build the set of the lakes contained exclusively in Russia by calculating the difference between the previous sets: LR-LN.

To find all the lakes in the database, the user first searches for "lake".

She locates the Lake class (in blue) in the resulting set, and gets the set of instances of the Lake class by clicking on it, to obtain all the lakes in the database.
Next, to find Russia, she searches for "Russia" and locates the resource Russia in the resulting set.

To make sure she has the right resource, she views the resource details.

Next, to find all lakes LR in Russia, she selects the set of all lakes and sets it as the subject of her query by clicking on the [S] toolbar button.

Continuing to build the query, she selects the resource Russia and sets it as the object of her query.

She executes the query to obtain the set of all lakes in Russia.

Next, to find the countries that share a boundary with Russia, she views the details of the Russia resource and locates the "neighbor" property for Russia, thereby finding its neighboring countries.

To find all the Russian lakes that are also in Russia's neighbors, she selects the set of lakes in Russia and sets it as the subject of her next query.

She selects the set of Russia's neighbors and sets it as the object of her query.

She then executes the query to find all lakes in Russia's neighboring countries.

Finally, to build the set of the lakes contained exclusively in Russia, she needs to calculate the difference between the set of lakes in Russia and the set of lakes in Russia's neighbors. To do this, she selects the first set and the difference operator.

She then selects the second set (containing the lakes in Russia's neighbors) and executes the difference operation by clicking on the equal sign [=] toolbar button, thereby obtaining the desired result.
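The walkthrough above can be replayed on a small in-memory set of triples. The sketch below is only an illustration of the model's operations, not Explorator's implementation: the function names are hypothetical, the triples are toy data using an invented `ex:` vocabulary rather than the database used in the paper, an empty parameter set is assumed to mean "any value", and `difference` follows the subject-based definition given in Section 3.1.3.

```python
def spo(store, subjects=frozenset(), predicates=frozenset(), objects=frozenset()):
    """Triples of `store` matching SPO(S, P, O); an empty set matches anything."""
    return {
        (s, p, o) for (s, p, o) in store
        if (not subjects or s in subjects)
        and (not predicates or p in predicates)
        and (not objects or o in objects)
    }


def subjects_of(triples):
    """Project a set of triples onto its subject slot."""
    return {s for (s, _, _) in triples}


def objects_of(triples):
    """Project a set of triples onto its object slot."""
    return {o for (_, _, o) in triples}


def difference(store, m, n):
    """Triples of `store` whose subject occurs in m but not in n."""
    keep = subjects_of(m) - subjects_of(n)
    return {t for t in store if t[0] in keep}


# Toy data (the set A of all triples); names are illustrative only.
A = {
    ("ex:Baikal", "rdf:type", "ex:Lake"),
    ("ex:Baikal", "ex:locatedIn", "ex:Russia"),
    ("ex:Peipus", "rdf:type", "ex:Lake"),
    ("ex:Peipus", "ex:locatedIn", "ex:Russia"),
    ("ex:Peipus", "ex:locatedIn", "ex:Estonia"),
    ("ex:Russia", "ex:neighbor", "ex:Estonia"),
}

# Step 1: all lakes. Step 3: lakes in Russia (LR).
lakes = subjects_of(spo(A, predicates={"rdf:type"}, objects={"ex:Lake"}))
lr = spo(A, lakes, {"ex:locatedIn"}, {"ex:Russia"})
# Step 4: Russia's neighbors. Step 5: Russian lakes also in a neighbor (LN).
neighbors = objects_of(spo(A, {"ex:Russia"}, {"ex:neighbor"}))
ln = spo(A, subjects_of(lr), {"ex:locatedIn"}, neighbors)
# Step 6: LR - LN, computed on the subjects of the triples.
only_russia = subjects_of(difference(A, lr, ln))
```

Because an empty parameter means "any value" in this sketch, a real implementation would need to distinguish an unset operand from an operand whose set happens to be empty (for example, when no lakes are found in step 1).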
4. IMPLEMENTATION
In the following we outline our implementation architecture and
some notable details. We decided to use a two layer architecture
which separates the upper presentation layer from the lower
model layer.
4.1 Presentation Layer

For the implementation of the proposed interface we adopted the approach of adding semantic annotations to the HTML code to define the behavior of interface widgets. To that end, we used the Prototype23 library, which allows us to easily navigate the DOM tree, select elements by their class attribute values (using CSS selectors), and link operations to interface events such as onclick, onmouseover and onkeyup. This technique enables us to create very dynamic interfaces for direct manipulation with continuous representation, incremental actions and feedback. Also, all user requests to the server are made using Ajax24, allowing users to continue exploring data while their requests are being processed.

23 http://www.prototypejs.org/
24 http://ajaxpatterns.org/

4.2 Model Layer

The model layer is summarized in the figure below:

Figure 7. Explorator model architecture (layers, top to bottom: Explorator model, ActiveRDF, RDF database)

We used the ActiveRDF [13] framework as a layer for translating the Explorator model to the RDF model. Basically, we used ActiveRDF to generate SPARQL queries from our model. The set operations are performed on Ruby objects because ActiveRDF and SPARQL do not support those operations natively. The query and cache mechanisms of ActiveRDF were modified to better support integration with Explorator’s model.

The default dereferencing mechanism is quite simple: it retrieves all triples available at the URI and loads them into a SESAME repository. No inference or recursive dereferencing heuristic is applied. As a result, the user can explore the triples retrieved through direct URI navigation in the same way as a SPARQL endpoint.

5. CONCLUSION

Exploratory search is a data exploration technique that supports complex user tasks involving lookup as well as learning and investigation. We have shown how this technique can be employed for arbitrary RDF databases. We have developed an information-processing model that supports tasks in the Semantic Web that consist not only of searching for a known item, but also of acquiring and assimilating knowledge and concepts in an RDF database. This model has been implemented in a tool called Explorator. We use the direct manipulation metaphor in the construction of the interface, which was very effective in the formulation of complex queries over an unknown domain.

Explorator also allows faceted navigation, and we developed an RDF vocabulary for facet specification and an algorithm for automatic extraction of all facets of a set of triples.

We have conducted a preliminary study [1] that has shown encouraging results. Users with only basic knowledge of RDF were able to elaborate nontrivial queries with Explorator. We realized that Explorator’s performance (query execution time) had a negative impact on the user experience, especially when accessing remote endpoints. It may be the case that users explored less because of the time it took to compute the queries. In fact, most of this time is spent in the SPARQL datastores, which are still in early stages, especially when compared to relational DBMSs. This issue is of the utmost importance and is being addressed for future versions.

Not surprisingly, the experiments showed us that Explorator is better suited to advanced users who have solid knowledge about RDF. Nevertheless, the experiments were brief, so we cannot yet draw any conclusions about Explorator’s learning curve. Preliminary evidence indicates that once the initial difficulty is overcome, users can become quite proficient with the system.

The next step in our study will be to investigate the use of Explorator as an epistemic tool, for users to understand more about the represented data domain, as opposed to performing predefined tasks and answering specific questions. In particular, an open hypothesis is the adequacy of the RDF model to match the user’s mental models: some of the collected evidence suggests that it might be too low level, which means suitable abstractions might have to be introduced. Exposing Explorator’s operation model to naïve users is still a challenge, which is the subject of ongoing research.

Additional larger-scale experiments should be conducted to compare different user interface alternatives and interaction paradigms to better support both novice and expert users in exploring the semantic web. To do so, Explorator can be instrumented to remotely capture the users’ actions at the user interface and on the underlying processing model.

As future work, we will extend the model to support the definition of parameterized sets, i.e., sets derived from parameterized operations. Following the QBE paradigm, the user will be able to select any set in the interface and indicate which should be the parameters. Once this has been done, the user can then plug the output of a box as the input of another box (set), thus establishing a graph of inter-related operations, much like a spreadsheet. Such parameterized sets can be saved to libraries, to be later reused by any user.

Explorator needs some improvements related to the dereferencing heuristics. Also, we are working on mechanisms for exporting RDF and for enabling alternative views that allow the user to visualize the resources and triples in tables, timetables and maps, as well as in customized domain-dependent formats.

In summary, Explorator’s contributions are:
• An information exploration model for RDF based on facet and set navigation;
• An exploration environment that allows query formulation by direct manipulation, allowing exploration of remote and local SPARQL endpoints;
• Automatic facet generation for given sets of RDF triples;
• A facet specification vocabulary and corresponding implementation within the tool (not shown in this paper).

Explorator is an open source project and can be accessed at http://www.tecweb.inf.puc-rio.br/explorator.

ACKNOWLEDGMENT. Daniel Schwabe was partially supported by a grant from CNPq.

6. REFERENCES
[1] Araújo F. C. S.; Schwabe D.; Barbosa D. J. S. Experimenting with Explorator: a Direct Manipulation Generic RDF Browser and Querying Tool. Visual Interfaces to the Social and the Semantic Web. VISSW 2009. Sanibel Island, Florida, February 2009 (http://www.smart-ui.org/events/vissw2009/index.html)
[2] Baldonado M. Q. W., Winograd T. SenseMaker: An Information-Exploration Interface Supporting the Contextual Evolution of a User’s Interests. 1996
[3] Berners-Lee T., Chen Y., Chilton L., Connolly D., Dhanaraj R., Hollenbach J., Lerer A., and Sheets D. Tabulator: Exploring and Analyzing linked data on the Semantic Web. Decentralized Information Group. Computer Science and Artificial Intelligence Laboratory. Massachusetts Institute of Technology. Cambridge, MA, USA. 2006.
[4] Best Practice Recipes for Publishing RDF Vocabularies. http://www.w3.org/TR/swbp-vocab-pub/
[5] Catarci, T., Costabile, M. F., Levialdi, S., Batini, C. Visual Query Systems for Databases: A Survey. Journal of Visual Languages and Computing, 8(2), 215-260, 1997.
[6] Dereferencing a URI to RDF. http://esw.w3.org/topic/DereferenceURI
[7] Ding L., Zhou L., Finin T., Joshi A. How the Semantic Web is Being Used: An Analysis of FOAF Documents. Proceedings of the 38th Hawaii International Conference on System Sciences, 2005
[8] Hildebrand M., Ossenbruggen J. v. and Hardman L. /facet: A Browser for Heterogeneous Semantic Web Repositories. The 5th International Semantic Web Conference (ISWC). Athens, GA, USA. 2005
[9] Huynh D. F., Karger D. R., Miller R. C. Exhibit: lightweight structured data publishing. International World Wide Web Conference. Proceedings of the 16th international conference on World Wide Web (WWW). Banff, Alberta, Canada. 2007
[10] Lima, F.; Schwabe, D.: “Application Modeling for the Semantic Web”, Proceedings of LA-Web 2003, Santiago, Chile, Nov. 2003. IEEE Press, pp. 93-102, ISBN (available at http://www.la-web.org).
[11] Marchionini G. Exploratory search: From finding to understanding. Comm. of the ACM, 49(4), 2006.
[12] OREN, E.; Delbru, R.; Decker, S. Extending faceted navigation for RDF data. 5th International Semantic Web Conference, Athens, GA, USA, LNCS 4273, p. 5-9. 2006
[13] Oren E., Delbru R., Gerke S., Haller A., Decker S. ActiveRDF: Object-Oriented Semantic Web Programming. Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland. 2007
[14] ROSSI, G.; SCHWABE, D.; LYARDET, F.; "Patterns for Designing Navigable Spaces", Proceedings of PLoP98 (Tech Report TR #WUCS-98-25, Washington University, St. Louis, MO, USA), Monticello, Illinois, USA, August 1998.
[15] Russell, A., Smart, P. R., Braines, D. and Shadbolt, N. R. NITELIGHT: A Graphical Tool for Semantic Query Construction. In: Semantic Web User Interaction Workshop (SWUI 2008), 5th April, Florence, Italy. 2008.
[16] Schwabe, D., Rossi, G.: An object-oriented approach to web-based application design. Theory and Practice of Object Systems (TAPOS), Special Issue on the Internet, v. 4#4, October, 1998, 207-225.
[17] Shneiderman, Ben, Direct manipulation: a step beyond programming languages. IEEE Computer 16,8 (August 1983), 57-69.
[18] Zloof, M. M. Query-by-example: a database language. IBM System Journal 16, 324-343, 1977.
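
As an illustration of the remark in Section 4.2 that set operations are performed on in-memory objects rather than inside SPARQL, the following is a minimal sketch. It is written in Python rather than the Ruby used by Explorator, and the ResourceSet class, its method names and the example URIs are hypothetical; none of them is part of Explorator or ActiveRDF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceSet:
    """Hypothetical stand-in for the set of resources returned by one query."""
    label: str
    uris: frozenset

    def union(self, other: "ResourceSet") -> "ResourceSet":
        # Resources that appear in either set.
        return ResourceSet(f"({self.label} OR {other.label})", self.uris | other.uris)

    def intersection(self, other: "ResourceSet") -> "ResourceSet":
        # Resources that appear in both sets.
        return ResourceSet(f"({self.label} AND {other.label})", self.uris & other.uris)

    def difference(self, other: "ResourceSet") -> "ResourceSet":
        # Resources in this set that are absent from the other one.
        return ResourceSet(f"({self.label} MINUS {other.label})", self.uris - other.uris)


# Made-up URIs standing in for the results of two separate queries.
professors = ResourceSet("professors", frozenset({"ex:alice", "ex:bob"}))
authors = ResourceSet("authors", frozenset({"ex:bob", "ex:carol"}))

both = professors.intersection(authors)
either = professors.union(authors)
only_professors = professors.difference(authors)
print(both.label, sorted(both.uris))
```

Because each operation returns a new ResourceSet, results can be combined again, which mirrors the incremental, set-by-set style of exploration described in the paper.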
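
The paper mentions automatic facet generation for a set of RDF triples but does not show the algorithm. The sketch below is therefore only one plausible reading, not the authors' method: it assumes that a facet is a predicate together with the distinct values it takes, and that each value is annotated with the number of distinct subjects carrying it. The function name extract_facets and the example triples are hypothetical.

```python
from collections import defaultdict


def extract_facets(triples):
    """Group (subject, predicate, object) triples into facets.

    Returns {predicate: {object: number of distinct subjects with that value}}.
    """
    subjects_by_value = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        subjects_by_value[predicate][obj].add(subject)
    return {
        predicate: {obj: len(subjects) for obj, subjects in values.items()}
        for predicate, values in subjects_by_value.items()
    }


# Made-up triples; a real set would come from a SPARQL endpoint.
triples = [
    ("ex:paper1", "ex:year", "2008"),
    ("ex:paper2", "ex:year", "2009"),
    ("ex:paper3", "ex:year", "2009"),
    ("ex:paper1", "ex:venue", "ex:ISWC"),
]
facets = extract_facets(triples)
print(facets)
```

Counting distinct subjects per value, rather than triples, keeps the counts meaningful when the same statement is retrieved more than once.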