<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>PaperExplorer: Personalized Exploratory Search for Conference Proceedings</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Behnam Rahdari</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Brusilovsky</string-name>
          <email>peterb@pitt.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computing and Information, University of Pittsburgh</institution>
          ,
          <addr-line>135 North Bellefield Avenue Pittsburgh, PA 15260</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2021</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>This paper presents our attempt to create an exploratory search system, PaperExplorer, for a historic archive of conference proceedings. PaperExplorer uses concept extraction, knowledge graphs, and user-controlled recommendation to assist users with various levels of domain expertise in their information needs.</p>
      </abstract>
      <kwd-group>
        <kwd>Exploratory Search</kwd>
        <kwd>Knowledge Graph</kwd>
        <kwd>Information Exploration</kwd>
        <kwd>Intelligent interface</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and Background</title>
      <p>Exploratory search systems form an increasingly popular category of information access and exploration tools. These systems creatively combine search, browsing, and information analysis steps, shifting user efforts from recall (formulating a query) to recognition (i.e., selecting a link) and helping users to gradually learn more about the explored domain [<xref ref-type="bibr" rid="ref1">1</xref>].</p>
      <p>In this paper we present our attempt to augment the set of search systems focused on conference proceedings with a personalized exploratory search system, PaperExplorer. We hope that the ability of PaperExplorer to support information discovery, learning-while-searching, and personalization could help a broader set of users to benefit from the assembled collection of conference proceedings.</p>
      <sec id="sec-1-1">
        <title>1.1. Exploratory Search</title>
        <p>A number of real-life search tasks require a considerable amount of learning during the search process to achieve adequate results. These tasks are known as exploratory search tasks [<xref ref-type="bibr" rid="ref2">2</xref>]. Since simple search systems are usually not efficient in supporting exploratory search tasks, a range of specialized systems has been developed and evaluated.</p>
        <p>More recently, a few projects in this area demonstrated that the effectiveness of exploratory search could be improved by using a personalized system, which builds a profile of user interests and adapts to the individual user [<xref ref-type="bibr" rid="ref3">3</xref>]. The work presented in this paper investigates the ideas of profile-based exploratory search in the context of finding research publications related to a certain conference.</p>
      </sec>
      <sec>
        <title>1.2. Controllability</title>
        <p>User controllability has been recognized as a valuable component of advanced information access interfaces. The ideas of controllability were made popular by a stream of work on user-controllable recommender systems [<xref ref-type="bibr" rid="ref4">4</xref>]. However, the value of extended user control has also been demonstrated in the area of exploratory search.</p>
        <p>For example, NameSieve [<xref ref-type="bibr" rid="ref5">5</xref>] presented a summary of search results in the form of entity clouds, which allowed controllable filtering and exploration of results. PeopleExplorer [<xref ref-type="bibr" rid="ref6">6</xref>] offered users an option to re-sort people search results based on multiple user-related factors. uRank [<xref ref-type="bibr" rid="ref7">7</xref>] introduced a controllable interface for refining and reorganizing search results, and SciNoon [<xref ref-type="bibr" rid="ref8">8</xref>] simplifies the exploratory search process for scientific groups.</p>
      </sec>
      <sec id="sec-1-2">
        <p>The idea to apply open user profiles (also known as open user models) to better support personalized information access was among the early ideas explored in this field. Open user profiles allow users to examine and possibly change the content of their interest profiles, which are used to personalize their search or browsing process.</p>
        <p>Since open user profiles increase the interactivity, transparency, and controllability of the information exploration process, their application is a good match to the nature of exploratory search. While first attempts to introduce “bag-of-words” open user profiles had mixed success [<xref ref-type="bibr" rid="ref9">9</xref>], more recent work focused on semantic-level user profiles demonstrated their potential for personalized exploratory search [<xref ref-type="bibr" rid="ref10 ref3">3, 10</xref>].</p>
        <p>We start the paper with the presentation of the PaperExplorer interface and follow with the details on concept extraction, knowledge graph organization, and recommendation that enable the work of this interface.</p>
      </sec>
    </sec>
    <sec>
      <title>2. The Interface of PaperExplorer</title>
      <p>Personalized information exploration in PaperExplorer is centered around the user interest profile [<xref ref-type="bibr" rid="ref11">11</xref>], a collection of concepts represented by keyphrases that express user interests. Unlike traditional search, which requires users to specify all keyphrases in a query, PaperExplorer supports users in the process of gradual discovery and refinement of their interests. It also allows users to control the importance of each keyphrase in recommending relevant results. The PaperExplorer interface consists of the following main sections.</p>
      <sec id="sec-1-3">
        <title>2.1. Instant Search Box</title>
        <p>The search box (Figure 1A) is the gateway to the system. The instant search approach allows users to discover relevant keyphrases representing concepts of interest without a fully formulated query. When a user starts typing a query, a series of matching keyphrases appears, helping the user to discover concepts of interest (e.g., User Interfaces and User Modeling). When an item is selected from the list, it is automatically added to the slider area (Figure 1C). At the same time, an updated list of search results is presented to the user.</p>
      </sec>
      <sec id="sec-1-4">
        <title>2.2. Recommended Keyphrases</title>
        <p>When at least one keyphrase is added to the user’s profile, the system recommends five semantically similar concepts (shown as keyphrases) in the Similar keyphrases area of the interface (Figure 1B). Users can add recommended keyphrases to their interest profiles by clicking on the plus button to the right of each keyphrase. As the user’s profile grows and is refined, the set of recommended concepts is updated, since the system recommends instances similar to all concepts in the user’s profile. Each recommended concept also provides users with a short description of the concept. Clicking on the question mark button next to the add button opens a separate window containing the abstract of that concept’s Wikipedia entry.</p>
      </sec>
      <sec id="sec-1-5">
        <title>2.3. Open User Profile</title>
        <p>The slider area (Figure 1C) displays the current user profile of interest. PaperExplorer implements a content-based recommendation approach, which generates the list of recommended results (Figure 1D) using the profile. To support transparency and controllability of this process, the interest profile is visible and directly editable by the end users.</p>
        <p>To build the profile, the user can add relevant concepts represented by keyphrases, as explained above, and remove less relevant keyphrases (using the red x) as they discover more relevant concepts or explore different interests.</p>
        <p>Sliders associated with each keyphrase enable users to control the relative importance of the represented concept compared to others in their profile, ranging from 1 (least important) to 10 (most important). The use of sliders for fine-tuning the user profile was motivated by the keyword tuning approach in uRank [<xref ref-type="bibr" rid="ref7">7</xref>], which was confirmed as user-friendly and efficient in an exploratory search context. All actions within the profile (adding, removing, or adjusting sliders) immediately affect the search results list.</p>
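        <p>The profile operations described above can be illustrated with a minimal sketch. The class and method names (InterestProfile, add, remove, set_weight) and the default slider value of 5 are assumptions made for illustration and are not taken from the PaperExplorer implementation; only the slider range of 1 to 10 comes from the text.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass, field

MIN_WEIGHT, MAX_WEIGHT = 1, 10  # slider range stated in the text


@dataclass
class InterestProfile:
    """Open user profile: keyphrase -> slider value (hypothetical structure)."""
    sliders: dict = field(default_factory=dict)

    def add(self, keyphrase, weight=5):
        # A newly added keyphrase gets a slider; the default of 5 is an assumption.
        self.sliders[keyphrase] = self._clamp(weight)

    def remove(self, keyphrase):
        # Mirrors the "red x" action; removing an absent keyphrase is a no-op.
        self.sliders.pop(keyphrase, None)

    def set_weight(self, keyphrase, weight):
        # Only keyphrases already in the profile have a slider to adjust.
        if keyphrase not in self.sliders:
            raise KeyError(keyphrase)
        self.sliders[keyphrase] = self._clamp(weight)

    @staticmethod
    def _clamp(weight):
        # Keep every slider value inside the 1..10 range.
        return max(MIN_WEIGHT, min(MAX_WEIGHT, int(weight)))


profile = InterestProfile()
profile.add("user modeling")
profile.add("recommender system", weight=8)
profile.set_weight("user modeling", 12)  # out of range, clamped to 10
profile.remove("recommender system")
print(profile.sliders)
```
]]></preformat>
        <p>In a design like this, every call that changes the sliders would be followed by a refresh of the result list, which matches the immediate feedback described in this section.</p>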
      </sec>
      <sec id="sec-1-6">
        <title>2.4. Search Results</title>
        <p>As soon as the user adds the first keyphrase to the interest profile, a table of the 20 most relevant publications is generated (Figure 1D). The first column of the table visualizes the combined relevance between keyphrases in the user interest profile and each result. The colors in the stacked bar (Figure 1D1) match the colors of the sliders in the profile, and the size and opacity of each bar express the relevance of the result to each profile keyphrase.</p>
        <p>The second column of the table lists the titles of relevant publications. Clicking on a title expands a window that holds the abstract of the paper. The mentioned keyphrases are highlighted with the corresponding colors. The opacity of the colors reflects the relevance of a keyphrase to the paper and the current value of the slider for that keyphrase. To further assist users, PaperExplorer underlines all available keyphrases in the text (both in the title and the abstract).</p>
        <p>Hovering over the underlined portion of the text opens a popup window (Figure 1D2) that enables the user to (1) see the relevance of the keyphrase to the text in the form of a vertical bar chart, (2) add the keyphrase directly to the interest profile, and (3) report improper keyphrases to the administrator for removal. The latter helps us to improve the quality of extracted keyphrases and eliminate occasional errors in the extraction process.</p>
      </sec>
    </sec>
    <sec>
      <title>3. The Knowledge Graph</title>
      <p>The knowledge graph consists of three main entities (publications, authors, and keyphrases) and their relationships, extracted from our data set and hosted in the native graph database Neo4j.</p>
      <p>Figure 2 presents the schematic representation of the knowledge graph. Authors are interconnected by the relation Co-Author (based on co-authorship) and connected to papers by the relation Published. Papers are connected to keyphrases using the Has-Key relationship. The latter carries a weight that determines the strength of the relationship between each keyphrase and the publication.</p>
      <sec id="sec-1-6-1">
        <title>3.1. Data Source and Keyphrase Extraction</title>
        <p>We used the collection of proceedings from two main conferences (Hypertext and UMAP) as the main source of data to build the knowledge graph and extract the keyphrases. This collection covers all publications of these two conferences from 2008 to 2020. Using this dataset and the concept extraction explained below, we generated a knowledge graph covering 2023 publications. A total of 14404 keyphrases were extracted from the titles and abstracts of these publications.</p>
        <p>We used TopicRank [<xref ref-type="bibr" rid="ref12">12</xref>], a graph-based keyphrase extraction method, to extract the initial set of candidate keyphrases from the title and abstract of each publication. We then used the Wikipedia API to filter all extracted keyphrases; only keyphrases with an entry in Wikipedia were kept in the knowledge graph. We further assigned a weight to each publication-keyphrase pair using the cosine similarity between the bags-of-words extracted from the Wikipedia page and the publication.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>4. Profile-Based Search</title>
      <p>We deployed a two-phase search process to produce the most relevant results based on the user interest profile. In the first phase, a primary list of candidates is selected from the graph; the second phase ensures that the results are presented to the user in the right order based on their relevance to the query. We describe these two phases in more detail below.</p>
      <p>Candidate selection: We used the Cypher query language to generate the initial list of candidate publications. At each instance of user interaction with the system (e.g., adding/removing keyphrases or tuning the sliders), the system considers all publications connected to at least one of the concepts of interest in the user profile.</p>
      <p>Reordering the results: After generating the list of candidate results, the system rearranges the results so that the most relevant results appear at the top of the list. To do that, a complete list of the keyphrases that appear in the text (title and abstract) of each publication, along with their relevance scores (weights), is first generated. Then, for every keyphrase that exists in the user interest profile, we multiply its weight by the value of the corresponding slider. Finally, a relevance score is assigned to each candidate considering the candidate’s similarity to each of the profile concepts and the values of the sliders.</p>
      <p>The PaperExplorer system has been deployed online and also demonstrated to several target users. The early results indicate that the success of the system depends to a considerable extent on the quality of keyphrase extraction. We are interested in collaborating with experts on keyphrase extraction to develop approaches optimized for exploratory search.</p>
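      <p>As an illustration of the weighting and the two-phase search described in this paper, the following sketch uses a small in-memory stand-in for the knowledge graph. All names (cosine_weight, select_candidates, rank, HAS_KEY), the sample weights, and the whitespace tokenization are hypothetical. Summing the products of weight and slider value is one plausible reading of the relevance score, since the text does not state the exact aggregation, and the Cypher pattern in the comment uses assumed labels.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math
from collections import Counter


def cosine_weight(text_a, text_b):
    """Cosine similarity between two bag-of-words vectors (whitespace tokens)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


# Hypothetical Has-Key edges: publication -> {keyphrase: weight}.
HAS_KEY = {
    "paper-1": {"user modeling": 0.8, "knowledge graph": 0.2},
    "paper-2": {"exploratory search": 0.6},
    "paper-3": {"user modeling": 0.3, "exploratory search": 0.5},
    "paper-4": {"hypertext": 0.9},
}


def select_candidates(has_key, profile):
    """Phase 1: publications connected to at least one profile keyphrase.

    In Neo4j this could be a Cypher pattern such as (assumed labels):
    MATCH (p:Publication)-[:HAS_KEY]->(k:Keyphrase)
    WHERE k.name IN $keyphrases RETURN DISTINCT p
    """
    return [pub for pub, keys in has_key.items()
            if any(key in profile for key in keys)]


def rank(has_key, profile, limit=20):
    """Phase 2: order candidates by summed weight times slider value."""
    scored = []
    for pub in select_candidates(has_key, profile):
        score = sum(weight * profile[key]
                    for key, weight in has_key[pub].items() if key in profile)
        scored.append((pub, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]  # the interface shows the 20 most relevant results


profile = {"user modeling": 10, "exploratory search": 4}  # keyphrase -> slider
ranking = rank(HAS_KEY, profile)
print([pub for pub, _ in ranking])
print(cosine_weight("open user model", "user model for search"))
```
]]></preformat>
      <p>Raising the slider of a keyphrase increases the score of every candidate linked to it, so a publication that matches several highly weighted keyphrases moves toward the top of the list.</p>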
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R. W.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kules</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Drucker</surname>
          </string-name>
          , et al.,
          <article-title>Supporting exploratory search</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>49</volume>
          (
          <year>2006</year>
          )
          <fpage>36</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Marchionini</surname>
          </string-name>
          ,
          <article-title>Exploratory search: From finding to understanding</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>49</volume>
          (
          <year>2006</year>
          )
          <fpage>41</fpage>
          -
          <lpage>46</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bakalov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>König-Ries</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nauerz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welsch</surname>
          </string-name>
          ,
          <article-title>IntrospectiveViews: An interface for scrutinizing semantic user models</article-title>
          ,
          <source>in: 18th International Conference on User Modeling, Adaptation, and Personalization</source>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>219</fpage>
          -
          <lpage>230</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>B. P.</given-names>
            <surname>Knijnenburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bostandjiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>O'Donovan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kobsa</surname>
          </string-name>
          ,
          <article-title>Inspectability and control in social recommenders</article-title>
          ,
          <source>in: 6th ACM Conference on Recommender Systems</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>43</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.-w.</given-names>
            <surname>Ahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Brusilovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Grady</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Florian</surname>
          </string-name>
          ,
          <article-title>Semantic annotation based exploratory search for information analysts</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>46</volume>
          (
          <year>2010</year>
          )
          <fpage>383</fpage>
          -
          <lpage>402</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <article-title>Supporting exploratory people search: a study of factor transparency and user control</article-title>
          ,
          <source>in: Proceedings of the 22nd ACM international conference on Information &amp; Knowledge Management, ACM</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>di Sciascio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sabol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E.</given-names>
            <surname>Veas</surname>
          </string-name>
          ,
          <article-title>Rank as you go: User-driven exploration of search results</article-title>
          ,
          <source>in: 21st International Conference on Intelligent User Interfaces</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>118</fpage>
          -
          <lpage>129</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Nedumov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Babichev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Mashonsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Semina</surname>
          </string-name>
          ,
          <article-title>SciNoon: Exploratory search system for scientific groups</article-title>
          ,
          <source>in: IUI 2019 Workshop on Exploratory Search and Interactive Data Analytics</source>
          ,
          <year>2019</year>
          . URL: http://ceur-ws.org/Vol-2327/IUI19WS-ESIDA-3.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.-w.</given-names>
            <surname>Ahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Brusilovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Grady</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Y.</given-names>
            <surname>Syn</surname>
          </string-name>
          ,
          <article-title>Open user profiles for adaptive news systems: help or harm?</article-title>
          ,
          <source>in: the 16th international conference on World Wide Web, WWW '07</source>
          ,
          ACM
          ,
          <year>2007</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ruotsalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Jacucci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kaski</surname>
          </string-name>
          ,
          <article-title>Interactive faceted query suggestion for exploratory search: Whole-session effectiveness and interaction engagement</article-title>
          ,
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>71</volume>
          (
          <year>2020</year>
          )
          <fpage>742</fpage>
          -
          <lpage>756</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B.</given-names>
            <surname>Rahdari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Brusilovsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Babichenko</surname>
          </string-name>
          ,
          <article-title>Personalizing information exploration with an open user model</article-title>
          ,
          <source>in: 31st ACM Conference on Hypertext and Social Media (HT '20)</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , p.
          <fpage>0</fpage>
          . doi: <pub-id pub-id-type="doi">10.1145/3372923.3404797</pub-id>.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bougouin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Boudin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Daille</surname>
          </string-name>
          ,
          <article-title>TopicRank: Graph-based topic ranking for keyphrase extraction</article-title>
          ,
          <source>in: Proceedings of the Sixth International Joint Conference on Natural Language Processing, Asian Federation of Natural Language Processing</source>
          , Nagoya, Japan,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>