=Paper=
{{Paper
|id=Vol-1311/paper8
|storemode=property
|title=Introducing a User Interface with an Entity-Strategy-based Approach for Exploring Document Collections
|pdfUrl=https://ceur-ws.org/Vol-1311/paper8.pdf
|volume=Vol-1311
}}
==Introducing a User Interface with an Entity-Strategy-based Approach for Exploring Document Collections==
<pdf width="1500px">https://ceur-ws.org/Vol-1311/paper8.pdf</pdf>
<pre>
    Introducing a User Interface with an Entity-Strategy-
    based Approach for Exploring Document Collections

                           Daniel Hienert and Wilko van Hoek

               GESIS – Leibniz Institute for the Social Sciences, Cologne, Germany
                   {daniel.hienert,wilko.vanhoek}@gesis.org


       Abstract. In this paper we present a first sketch of an alternative approach for
       searching and exploring document collections. The traditional approach applied
       in Digital Libraries and Web Search Engines is based on search forms and result
       lists. The user enters a keyword and is presented with a list of document
       metadata with authors, titles and descriptions. We propose an alternative
       approach that is based on entities in a document collection like authors,
       documents and topics. The user can search for these entities and can then choose
       from a set of highly abstracted search strategies, e.g. to get highly cited papers
       from an author. The approach is applied in a zoomable and infinite user interface
       that enables the user to explore freely and where the search history is always
       present.


       Keywords: Visual Interface, Exploratory Search, Visual Exploration, Search
       Strategies.


1      Introduction

Today’s Digital Libraries (DLs) still rely on the standard query-response paradigm.
Users enter a query and are presented with a list of relevant documents, which they
have to inspect and filter according to their information need. Bates already
presented a list of alternative real-world search strategies such as ‘Citation
Searching’ or ‘Journal Run’ [1], some of which have been adopted in modern scholarly
database systems such as Scopus or Web of Science. However, this is far from being
usual practice in DLs, where full-text indexing of documents opens new possibilities.
Exploratory Search [5] proposes a model beyond query-response, with a focus on the
learning and investigation steps of the search process. Highly interactive search
systems, including interactive visual search systems, can support these steps.
Accordingly, a number of visual search tools have already been proposed for the
exploration of DL content. Early attempts experimented with different visual
metaphors or tried to gain insight by visualising the distribution of information
facets. More recent tools include the INVISQUE system [4], which supports the search
and manipulation of results on an infinite panel, and PivotPaths [2], which shows
relations between concepts, resources and people. Another important aspect of
learning and investigation while searching is the visualisation of the search process
itself. Scientists spend much time on literature search; their search history can
grow quickly over months or even years. Today’s DLs only support saving search
results or documents; other artefacts, such as the record of which documents were
inspected, are lost. Research has shown that search histories support revisitation in
web search [6] and support the user’s orientation within a search session [3]. In the
following we present a User Interface (UI) concept which combines visual information
search with different search strategies and a visible search history.


2        Concept Overview

The concept consists of four core ideas:
1. The UI is an infinite panel with zoom & pan functionality. The user can start a
    search session from any point on the surface. The starting point is a simple search
    form, where one can search for artefacts of a document collection such as topics,
    persons, documents and journals, similar to a search in a standard DL.
2. The UI does not only return result lists with document metadata, but also small
    interactive elements which represent entities and artefacts like persons, documents
    and topics, as well as results of applied search strategies like a list of
    highly-cited papers.
3. For any of these entities, the user can choose a search strategy or functionality
    from a select box to initiate the next search step. A search strategy for the entity
    person can be e.g. ‘Highly-cited papers’, for the entity document e.g. ‘Similar
    Topics’. See Table 1 for more search strategies based on the entity type.
4. Over time a search graph emerges on the surface that represents the user’s whole
    search history. Users can select certain regions or search paths to (a)
    minimise/expand them, (b) label, categorise or annotate them, (c) set up an
    email alert for new results, (d) save/export/share them, or apply any other
    action that is applicable to a search path.
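
The coupling of entity types and search strategies described in core idea 3 could be
sketched as a simple registry. This is only an illustration under our own assumptions:
the names EntityType, STRATEGIES and strategies_for are hypothetical, and only the
strategy labels are taken from Table 1.

```python
from enum import Enum


class EntityType(Enum):
    """Entity types named in the concept (hypothetical encoding)."""
    PERSON = "person"
    DOCUMENT = "document"
    TOPIC = "topic"


# Strategy labels follow Table 1; the registry structure itself is an assumption.
STRATEGIES = {
    EntityType.PERSON: [
        "Highly-cited papers",
        "Main co-authors",
        "Main research topics",
    ],
    EntityType.DOCUMENT: [
        "Cited by",
        "Referenced by",
        "Similar Topics",
        "Same Journal",
    ],
    EntityType.TOPIC: [
        "By relevance",
        "Highly-cited papers",
        "Important authors",
    ],
}


def strategies_for(entity_type):
    """Return the entries a select box could offer for the given entity type."""
    return list(STRATEGIES[entity_type])
```

With such a registry, the select box of a person element would be filled from
strategies_for(EntityType.PERSON), and a new strategy would be added by extending the
corresponding list.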

    Table 1. Examples of search strategies based on the entity type. The corresponding
             search strategies of Bates [1] are shown in brackets.

Based on person
• Highly-cited/highly-referenced papers
• Main/important co-authors
• Main/sub research topics

Based on topic
• By relevance (tf-idf, BM25)
• Highly-cited papers
• Important authors/journals
• Journal/author productivity
• Author centrality

Based on document
• Cited by (Citation Searching)
• Referenced by
• Similar Topics (Subject search)
• Main/co-authors (Author searching)
• Documents from footnotes (Footnote chasing)
• Same Journal (Journal run)
• Same category/classification (Area scanning)
• Related/Similar papers (e.g. by content, topics, references, subject etc.)

     Fig. 1. Exploring social science literature starting from the author ‘Ulrich Beck’


Figure 1 shows the core idea of our approach applied to an example from the field of
the social sciences. The mockup shows one possible exploration path based on a
real-world document collection from the social science portal Sowiport¹.
    First, the user searches for the author ‘Ulrich Beck’ and can then choose
‘Highly-cited papers’ from the strategies menu to get an overview of his most
influential work. As a result the most-cited papers are presented in a small list.
After inspecting the abstracts in the document view, the user classifies the third
paper as interesting and chooses ‘Cited by’ from the strategies menu to show the
latest papers that cite it. Based on this paper the user becomes interested in the
topic ‘Cosmopolitanism’ and wants to see highly-cited papers for this topic. She/he
arrives at the author ‘Esref Aksu’, picks up the keywords ‘Cosmopolitan Democracy’
from the abstract and initiates a new search. Choosing ‘Main Authors’ from the
strategies menu shows two authors and their papers, with which the search process can
continue.
    Because the search history is always visible on the user interface, the user can
return to a previous search step such as a search, a person or a document and can
continue the search there. In the example above, the user returned to the topic
‘Globalization’ from Beck’s ‘Cosmopolitical Realism’ and initiated a new search.
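
The exploration path above, including the return to an earlier step, could be
represented as a tree of search steps. The following is a minimal sketch under our own
assumptions: the class SearchStep and its methods apply and path are hypothetical
names, and no real search is executed; each step only records a label and the strategy
that produced it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class SearchStep:
    """One node of the search graph (hypothetical structure)."""
    label: str                                   # entity or result shown on the panel
    strategy: Optional[str] = None               # strategy that led to this step
    parent: Optional["SearchStep"] = field(default=None, repr=False)
    children: List["SearchStep"] = field(default_factory=list)

    def apply(self, strategy, result_label):
        """Record a new step reached by applying a strategy to this step."""
        child = SearchStep(result_label, strategy, self)
        self.children.append(child)
        return child

    def path(self):
        """Return the labels from the session start down to this step."""
        labels = []
        step = self
        while step is not None:
            labels.append(step.label)
            step = step.parent
        return list(reversed(labels))


# Usage: the first steps of the example, then a branch from an earlier step.
start = SearchStep("Ulrich Beck")
papers = start.apply("Highly-cited papers", "Most-cited papers")
citing = papers.apply("Cited by", "Latest citing papers")
branch = papers.apply("New search", "Globalization")
```

Because every step keeps its children, returning to an earlier step and continuing
from there simply adds a second branch; nothing from the first branch is lost.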


¹ http://sowiport.gesis.org

3         Discussion & Future Work

The proposed approach has several benefits over the standard search form/result list
paradigm. Based on the four core ideas of our approach these are:
1. UI as infinite panel: The use of a UI panel with infinite space is a prerequisite
     that has been used in other visual exploration tools as well [4]. In our context it
     removes the limitation of showing only one search step and can represent whole
     search sessions, or even multiple sessions.
2. Entities: In a standard DL, search results are limited to a list of documents
     ordered by relevance. The use of entities such as persons, documents or topics
     allows the intuitive application of search strategies.
3. Search Strategy: Complex search strategies are encapsulated in one-click UI
     elements and can be applied easily. This allows alternative exploration and views
     on the document collection. New search strategies can be implemented easily.
4. Search Graph: Every step in the search process is visible on the UI and forms a
     search graph over time. This gives an overview of the whole search session, but
     also of a set of search sessions. A prior search path can therefore be continued,
     and a search session can be shared with another person, e.g. among colleagues
     in a research group.
However, most search strategies require complex computation and a rich data set. For
example, ‘Highly-cited papers’ for an author needs a separate citation index, which
may not always be present in today’s DLs, and the real-time computation of metrics
like author centrality can be a challenge. As a next step we want to implement a
system prototype that can be used for exploring different document collections which
contain this rich information, such as the arXiv² corpus for the natural sciences or
Sowiport for the social sciences. Based on that, we will perform various user tests to
verify the basic plausibility of our approach.
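
To illustrate why a strategy such as ‘Highly-cited papers’ depends on a citation index,
the following sketch ranks an author’s papers by the number of incoming citations. It
is a toy illustration under our own assumptions: the function highly_cited_papers and
the two input structures (a mapping from paper id to authors, and a list of
citing/cited pairs) are hypothetical and do not describe the data model of any
particular DL.

```python
from collections import Counter


def highly_cited_papers(author, authorship, citations, top_n=3):
    """Rank the papers of one author by incoming citation count.

    authorship: dict mapping paper id -> list of author names (assumed format)
    citations:  iterable of (citing_paper_id, cited_paper_id) pairs (assumed format)
    """
    # Count how often each paper appears as the cited side of a pair.
    counts = Counter(cited for _citing, cited in citations)
    # Restrict to papers that list the given author.
    own = [paper for paper, authors in authorship.items() if author in authors]
    # Sort by citation count (descending), then by id for a stable order.
    ranked = sorted(own, key=lambda paper: (-counts[paper], paper))
    return [(paper, counts[paper]) for paper in ranked[:top_n]]
```

Without the list of citing/cited pairs the ranking cannot be computed at all, which is
the dependency on a citation index mentioned above.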

References

1. Bates, M.J.: The Design of Browsing and Berrypicking Techniques for the Online Search
   Interface. Online Rev. 13, 5, 407–424 (1989).
2. Dörk, M. et al.: PivotPaths: Strolling through Faceted Information Spaces. IEEE Trans.
   Vis. Comput. Graph. 18, 12, 2709–2718 (2012).
3. Imko, J. et al.: Semantic History Map: Graphs Aiding Web Revisitation Support. Presented
   at the August (2010).
4. Kodagoda, N. et al.: Using Interactive Visual Reasoning to Support Sense-Making:
   Implications for Design. IEEE Trans. Vis. Comput. Graph. 19, 12, 2217–2226 (2013).
5. Marchionini, G.: Exploratory search: from finding to understanding. Commun. ACM. 49, 4,
   41–46 (2006).
6. Mayer, M.: Web History Tools and Revisitation Support: A Survey of Existing Approaches
   and Directions. Found. Trends® Hum.-Comput. Interact. 2, 3, 173–278 (2007).


² http://arxiv.org

</pre>