<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Analysis of Footnote Chasing and Citation Searching in an Academic Search Engine</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ameni Kacem</string-name>
          <email>ameni.sahraoui@gesis.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Philipp Mayr</string-name>
          <email>philipp.mayr@gesis.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>GESIS - Leibniz Institute for the Social Sciences</institution>
          ,
          <addr-line>Cologne</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In interactive information retrieval, researchers study user behavior towards systems and search tasks and analyze past interactions in order to adapt search results. In this paper, we analyze user behavior towards Marcia Bates' search stratagems 'footnote chasing' and 'citation search' in an academic search engine. We performed a preliminary analysis of their frequency and stage of use in the social sciences search engine sowiport. In addition, we explored the impact of these stratagems on the performance of the whole search process. We conclude that the appearance of these two search features in real retrieval sessions leads to an improvement in precision, measured in terms of positive interactions, of 16% when using footnote chasing and 17% for the citation search stratagem.</p>
      </abstract>
      <kwd-group>
        <kwd>Whole Session Retrieval</kwd>
        <kwd>Information Behavior</kwd>
        <kwd>Session Log</kwd>
        <kwd>Cited Reference Searching</kwd>
        <kwd>Stratagem Search</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Interactive information retrieval (IIR) refers to a research discipline that studies
the interaction between the user and the search system. In fact, researchers have
moved from considering only the current query and results set to focus more on
the user’s past interactions and the analysis of whole retrieval sessions. Research
approaches aim to understand the user searching behavior in order to improve
the ranking of results after submitting a query and enhance the user experience
within an IR system.</p>
      <p>
        In Digital Libraries (DLs), researchers study concepts such as search
strategies [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], search term suggestions [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], community detection [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
personalization of search results [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the impact of recommendations [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], and the frequency of and changes in users’ information needs. In addition, many interactive IR models have been
proposed in the literature (e.g. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]) that describe the user’s behavior as a sequence of
steps (stages) of information seeking and interaction with an IR system.
      </p>
      <p>
        Similarly, in the academic search engine sowiport<fn id="fn1"><label>1</label><p>http://sowiport.gesis.org/</p></fn> [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], we aim at
understanding user behavior in order to support users during the search session. In
fact, DL users behave differently when interacting with the system, as underlined
by Bates [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] who highlighted different concepts such as moves, tactics,
stratagems, and strategies.
      </p>
      <p>
        The goal of this paper is to explore the specific stratagems "footnote chasing"
and "citation searching" which are often utilized as exploratory search
functionalities in DLs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. According to Bates [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] footnote chasing is defined as checking
the reference and related material of a work backward in time. Citation
searching refers to a forward chaining of works citing another one through a citation
index. Both stratagems are important search features that are built into
state-of-the-art academic search engines such as Google Scholar or the ACM Digital Library
and support the natural search behavior of a majority of academic searchers.
      </p>
      <p>
        Stratagems in general are not always supported by DLs because most of the
search functions available in academic search engines remain on the "moves" or
"tactics" level (described in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]) or because cited references are missing from the system entirely. However, the use of stratagems can enhance the search experience of
the user, which is the focus of this work. In particular, we address the
following research questions:
      </p>
    </sec>
    <sec id="sec-2">
      <title>RQ 1: Which usage patterns can be observed from footnote chasing (FC) and citation searching (CS)?</title>
      <p>In this paper, we analyze the usage patterns of footnote chasing (FC) and
citation searching (CS) stratagems in real retrieval sessions in terms of frequency
of their use and the stage at which they appear.</p>
    </sec>
    <sec id="sec-3">
      <title>RQ 2: How successful are retrieval sessions using the FC and CS stratagems?</title>
      <p>
        The use of a stratagem can impact the session conduct in different ways. We
examine the interactions of the users in sowiport DL in order to measure the
usefulness and the precision of sessions having such stratagems. We determine
the session success based on the presence of positive actions proposed recently by
Hienert and Mutschke [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In the following, we measure the number of positive
actions before and after our two stratagems (FC and CS) occur in a retrieval
session.
      </p>
      <p>The remainder of this paper is organized as follows. In the next section, we
present a short overview of basic concepts explored by researchers in the field
of DLs. In Section 3 we analyze the user behavior towards the FC and CS
stratagems and how using them affects the quality of the whole session search.
Finally, we summarize our findings and present some perspectives relevant for
future work.</p>
      <sec id="sec-3-1">
        <title>Related Work</title>
        <p>
          Bates [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] has specified different types of user behavior towards search systems,
among them moves, tactics, stratagems and strategies. A move refers to
a basic action performed by the user. A tactic consists of one or more moves
made to further the search. Stratagems are complexes of multiple
moves and/or tactics that exploit knowledge of a particular search domain. A strategy is a
combination of moves, tactics and stratagems that forms a plan to pursue during the
search session. Footnote chasing and citation searching are popular stratagems
that refer to the study of documents and their bibliographic references and
citations. Schneider and Borlund [9] studied the effectiveness of using stratagems
in constructing and maintaining thesauri vocabulary and structure. Mahoui and
Cunningham [10] specified the importance of understanding the information of
DL users in creating useful and stable search systems. They analyzed transaction
logs to study usage patterns of CiteSeer in terms of query and search patterns.
Xie [11] analyzed the users’ search behaviors and their relationships with their
information needs by specifying a hierarchical level of users’ goals. Shute and
Smith [12] identified 13 knowledge-based tactics arranged into three categories:
broaden topic scope, narrow topic scope and change topic scope. Carevic and
Mayr [13] proposed bibliometric-enhanced search facilities such as "journal run"
or "citation search" and their possible integration in DLs. In their position paper,
they argue that bibliometric-enhanced stratagems can facilitate domain-specific
search activities by applying bibliometric measures for re-ranking and/or
rearranging DL entities like documents, journals or authors. They propose different
types of stratagem implementations like "extended journal run" or
"context-preserving journal run" or extended versions of citation search.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Methodology</title>
        <p>In this section, we first provide details about the data set that we used for
our analysis. Then, we describe the approach used to answer the research questions
raised in Section 1.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Data Set</title>
      <p>
        Sowiport<sup>2</sup> is a DL for the Social Sciences that contains more than nine
million records, full texts and research projects included from twenty-two different
databases whose content is in English and German [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. For a part of the
collections, namely the ProQuest databases "Sociological Abstracts", "Social
Services Abstracts", "Applied Social Sciences Index and Abstracts", "Worldwide
Political Science Abstracts" and "Physical Education Index", sowiport provides
references and builds a citation index over its collections. These references and
citations are part of the analysis in the following.
      </p>
      <p>The Sowiport User Search Sessions Data Set (SUSS)<fn id="fn3"><label>3</label><p>http://dx.doi.org/10.7802/1380 and [14]</p></fn> contains individual
search sessions extracted from the transaction log of sowiport. The data was
collected over a period of one year (between 2nd April 2014 and 2nd April
2015)<fn id="fn4"><label>4</label><p>A detailed description of the data set can be found in [15].</p></fn>. Web server log files and specific JavaScript-based logging techniques
were used to capture the user behavior within the system. The log was heavily
filtered to exclude transactions performed by robots. All transaction activities
are mapped to a list of 58 different user actions which cover all types of activities
and pages that can be carried out or visited within the system (e.g. typing a query,
visiting a document, selecting a facet, exporting a document, etc.). For each
action, a session id, a date stamp and additional information (e.g. query terms,
document ids, and result lists) are stored. Based on the session id and date
stamp, the step in which an action is conducted and the length of the action are
included in the data set as well. The session id is assigned via browser cookies
and allows tracking user behavior over multiple searches. In total, the data set
contains 484,449 individual search sessions and 7,982,427 log entries.
<fn id="fn2"><label>2</label><p>http://www.sowiport.de</p></fn></p>
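      <p>As an illustration of how the step and the length of an action can be derived from the session id and the date stamp, the following minimal Python sketch groups log entries by session and orders them by time. The tuple layout, the function name add_step_and_length and the choice to leave the length of the last action of a session undefined are assumptions made for this example and are not taken from the SUSS data set.</p>
      <preformat preformat-type="code">
```python
from collections import defaultdict
from datetime import datetime


def add_step_and_length(entries):
    """Derive step number and action length (seconds) per log entry.

    `entries` is an iterable of (session_id, timestamp, action) tuples;
    this layout is a hypothetical stand-in for the real log format.
    """
    by_session = defaultdict(list)
    for session_id, timestamp, action in entries:
        by_session[session_id].append((timestamp, action))

    rows = []
    for session_id, items in by_session.items():
        items.sort(key=lambda pair: pair[0])  # chronological order
        for i, (timestamp, action) in enumerate(items):
            if i + 1 == len(items):
                length = None  # assumption: last action has no known length
            else:
                length = (items[i + 1][0] - timestamp).total_seconds()
            rows.append((session_id, i + 1, action, length))
    return rows


entries = [
    ("s1", datetime(2014, 4, 2, 10, 0, 0), "query_form"),
    ("s1", datetime(2014, 4, 2, 10, 0, 30), "view_record"),
    ("s2", datetime(2014, 4, 2, 11, 0, 0), "query_form"),
    ("s1", datetime(2014, 4, 2, 10, 2, 0), "view_citation"),
]
rows = add_step_and_length(entries)
for row in rows:
    print(row)
```
      </preformat>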
    </sec>
    <sec id="sec-5">
      <title>Description of Actions in the Session Log</title>
      <p>
        Searching sowiport can be performed through an All fields search box (default
search without specification), or through specifying one or more field(s): title,
person, institution, number, keyword or year. The users’ main actions are
described in Table 1. In fact, we grouped the main actions into two categories:
"Query"-related and "Document"-related actions. Another categorization of
actions was proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] by specifying search interactions and successive positive
actions.
      </p>
      <p>As described above, the main user actions can be categorized into actions
concerning either search queries or documents. These actions occur with different frequencies
in the data set. Query-related actions represent 29.84% while document-related
actions represent 35.79% of the total number of actions. The remaining actions
are navigational interactions such as logging into the system, managing favorites,
and accessing the system pages.</p>
      <p>Table 2 shows a specific session with the user’s ID and the label and
length (in seconds) of each action. In this session, the user with ID 41821 started by logging
into the system and, after some navigational actions, submitted a query describing his/her information need
(query_form). After getting the result list,
labeled as resultlistids, the user performed additional searches (searchterm_2)
and displayed the content of some results (view_record). Finally, he/she checked the
external availability of a result (goto_google_scholar). We notice that the user
spent more than 40% of the time reading documents’ content.</p>
      <p>In this paper, we are interested in the stratagems view_citation (CS) and
view_references (FC), which are found in 20,353 sessions of the SUSS data set:
1,520 sessions in which a user performed FC and 18,833
sessions with a CS.</p>
    </sec>
    <sec id="sec-6">
      <title>Measurements</title>
      <p>To answer the first research question described in Section 1, we analyze the
sessions with the mentioned stratagems FC and CS.</p>
      <p>
        For a session <italic>S</italic> during which a set of interactions {<italic>I</italic>} is performed by the
user, we define:
– <italic>Strat</italic> is a stratagem such as FC and CS,
– <italic>Pos</italic> is a positive interaction present in our data set among the following set
{<italic>P</italic>} described in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]:
goto_fulltext, goto_google_scholar, goto_local_availability, goto_google_books,
view_description, export_cite, export_bib, export_mail, to_favorites,
export_search_mail, save_search, save_search_history, save_to_multiple_favorites.
      </p>
      <p>To answer the second research question, we measure the precision of a stratagem
before (<inline-formula><tex-math>\mathit{Precision}(\mathit{Strat})_b</tex-math></inline-formula>) and after (<inline-formula><tex-math>\mathit{Precision}(\mathit{Strat})_a</tex-math></inline-formula>) its use in order to verify whether it has
an influence on the conduct of a session. We verify whether we find more
positive actions after using a stratagem compared to the number of positive actions
before its utilization.</p>
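      <p>
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math>\mathit{Precision}(\mathit{Strat})_b = \frac{|\mathit{Pos} \in \{P\}|_b}{|I|_b}</tex-math>
        </disp-formula>
      </p>
      <p>
        <disp-formula id="eq2">
          <label>(2)</label>
          <tex-math>\mathit{Precision}(\mathit{Strat})_a = \frac{|\mathit{Pos} \in \{P\}|_a}{|I|_a}</tex-math>
        </disp-formula>
      </p>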
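      <p>The two precision values can be illustrated with a minimal Python sketch that splits a session at the first occurrence of a stratagem and computes the share of positive actions in each part. The function names and the decision to exclude the stratagem action itself from both parts are assumptions of this sketch, not a description of the implementation used in the study.</p>
      <preformat preformat-type="code">
```python
# Positive actions as listed in the text above.
POSITIVE_ACTIONS = frozenset({
    "goto_fulltext", "goto_google_scholar", "goto_local_availability",
    "goto_google_books", "view_description", "export_cite", "export_bib",
    "export_mail", "to_favorites", "export_search_mail", "save_search",
    "save_search_history", "save_to_multiple_favorites",
})


def precision(actions):
    """Share of positive actions in a list of action labels (0.0 if empty)."""
    if not actions:
        return 0.0
    positives = sum(1 for action in actions if action in POSITIVE_ACTIONS)
    return positives / len(actions)


def precision_before_after(session, stratagem):
    """Return (precision before, precision after) the first use of `stratagem`.

    Assumption of this sketch: the stratagem action itself belongs to
    neither part. Raises ValueError if the stratagem does not occur.
    """
    idx = session.index(stratagem)
    return precision(session[:idx]), precision(session[idx + 1:])


session = ["query_form", "view_record", "view_citation",
           "goto_fulltext", "view_record", "export_bib", "view_record"]
before, after = precision_before_after(session, "view_citation")
print(before, after)
```
      </preformat>
      <p>In this toy session no positive action precedes the citation search, while two of the four actions that follow it are positive.</p>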
      <p>
        To obtain an overview of the benefit of a stratagem, we measure the Usefulness as
the percentage of successful sessions, in terms of positive actions, among all the
sessions that include the studied stratagem. This measure is inspired by
the Global Usefulness measure proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]:
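        <disp-formula id="eq3">
          <label>(3)</label>
          <tex-math>\mathit{Usefulness}(\mathit{Strat}) = \frac{s^{+}_{\mathit{Strat}}}{|s_{\mathit{Strat}}|}</tex-math>
        </disp-formula>
where <inline-formula><tex-math>s^{+}_{\mathit{Strat}}</tex-math></inline-formula> counts the sessions that are successful in terms of the occurrence of positive actions
after using a specific stratagem, and <inline-formula><tex-math>|s_{\mathit{Strat}}|</tex-math></inline-formula> represents the number of sessions
using a stratagem (footnote chasing or citation search) no matter the type of
user’s interactions (positive or not).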
      </p>
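      <p>The Usefulness measure can be sketched in Python as follows. The sketch treats a session as successful when at least one positive action occurs after the first use of the stratagem; this success criterion, the function name and the toy sessions are assumptions made for illustration.</p>
      <preformat preformat-type="code">
```python
def usefulness(sessions, stratagem, positive_actions):
    """Fraction of sessions containing `stratagem` that are successful.

    Assumption of this sketch: a session is successful if at least one
    positive action occurs after the first occurrence of the stratagem.
    """
    with_stratagem = [s for s in sessions if stratagem in s]
    if not with_stratagem:
        return 0.0
    successful = 0
    for session in with_stratagem:
        tail = session[session.index(stratagem) + 1:]
        if any(action in positive_actions for action in tail):
            successful += 1
    return successful / len(with_stratagem)


positive = {"export_cite", "to_favorites", "goto_fulltext"}
sessions = [
    ["query_form", "view_references", "export_cite"],   # successful
    ["query_form", "view_references", "view_record"],   # no positive action
    ["query_form", "view_record", "goto_fulltext"],     # stratagem absent
    ["view_references", "to_favorites", "view_record"], # successful
    ["view_references"],                                # nothing afterwards
]
result = usefulness(sessions, "view_references", positive)
print(result)
```
      </preformat>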
      <sec id="sec-6-1">
        <title>Results</title>
        <p>Our preliminary results show the distribution of stratagems at different stages of
the sessions (see Figure 1 and Figure 2). We observe that citation search (Figure
1a) appears mostly at the end of sessions (90%), in 5,400 sessions, and in the
middle (50%), in 3,504 sessions, among the 18,833 sessions containing this stratagem.
Footnote chasing (Figure 1b) appears mostly in the middle of the session (50%
- 60%), in 489 of the 1,520 sessions including this stratagem.</p>
        <p>We note that the position of the stratagem differs from one session to another
because sessions differ in length. We noticed from the user behavior
analysis that in short sessions the stratagem appears in or after the
middle of the session. In longer sessions, the stratagem appears between the first
30% and 50% of a session’s interactions.</p>
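        <p>A stage such as "50%" or "90%" can be obtained by relating the step at which a stratagem first occurs to the session length. The following Python sketch maps this relative position to ten-percent buckets; the bucketing scheme and the function name stage_bucket are assumptions of this example, and the paper does not state how the stages were computed.</p>
        <preformat preformat-type="code">
```python
def stage_bucket(session, stratagem, n_buckets=10):
    """Percentage bucket (10, 20, ..., 100) of the first use of `stratagem`.

    The bucket is the relative position of the step within the session,
    rounded up to the next bucket boundary (hypothetical scheme).
    """
    step = session.index(stratagem) + 1             # 1-based step number
    bucket = -(-step * n_buckets // len(session))   # ceiling division
    return bucket * 100 // n_buckets


long_session = ["query_form", "view_record", "view_record", "searchterm_2",
                "view_references", "view_record", "export_cite",
                "view_record", "view_record", "goto_fulltext"]
short_session = ["query_form", "view_record", "view_citation",
                 "goto_google_scholar"]

long_stage = stage_bucket(long_session, "view_references")
short_stage = stage_bucket(short_session, "view_citation")
print(long_stage, short_stage)
```
        </preformat>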
        <p>
          Then, in order to study the effectiveness of the stratagems Footnote chasing
and Citation searching, we measure their precision before and after their
appearance during search sessions. This measure is based on a set of positive actions
that are considered as indicators of the session success [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>In Table 3, we present the results of the measures described in Equations 1,
2 and 3. As a baseline, we use a set of random sessions that include neither
of the two stratagems.</p>
        <p>From Table 3, we conclude that footnote chasing and citation search have a
positive impact on the session performance in terms of the occurrence of positive actions,
with 16.19% and 17% respectively. In most cases, fewer positive actions occur
before a stratagem than after it.
We thus observe an improvement in the occurrence of positive actions when a
stratagem is employed by the users and conclude that these stratagems lead
to successful sessions and positive interactions. Moreover, the query is reformulated
in only 7.67% of sessions with citation search and 12.63% of those
with footnote chasing. Thus, in most of the sessions, the positive actions are
directly related to the first search and first appearance of stratagems.</p>
        <p>As for the global performance of both stratagems, the usefulness values
of 77.2% for footnote chasing and 63.3% for citation search are
promising results, given that this measure takes values in [0, 1], and they improve on the
results of the baseline by over 44%.</p>
        <p>In Table 4 we present the different ways in which a stratagem can affect the
conduct of a session: positive, non-positive or neutral. We obtain those values
from the difference <inline-formula><tex-math>\mathit{Diff}(a, b)</tex-math></inline-formula> between the precision before (<inline-formula><tex-math>\mathit{Precision}(\mathit{Strat})_b</tex-math></inline-formula>)
and the one after (<inline-formula><tex-math>\mathit{Precision}(\mathit{Strat})_a</tex-math></inline-formula>) the use of a stratagem. We can see that
both stratagems affect the sessions mostly in a positive way. This means that the
use of a stratagem influences the user behavior and makes users interact more
with the system in a beneficial way. Also, the non-positive effect occurs in
a small number of sessions compared to the neutral one. In fact, the absence
of positive actions does not mean a negative conduct of a session because the
user may still be interacting with the system using moves and tactics that are not
judged positive but are not specified as negative either.
<fn id="fn5"><label>5</label><p>The baseline is a set of 100 random sessions that include neither of the two stratagems.</p></fn>
<fn id="fn6"><label>6</label><p>Gain: computed as the difference between the precision-after and precision-before.</p></fn></p>
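        <p>The three kinds of effect can be sketched in Python by looking at the sign of the difference between the precision after and the precision before a stratagem. Mapping a negative difference to "non-positive" and a zero difference to "neutral" is our reading of the categories and an assumption of this sketch; the function name effect and the tolerance parameter are hypothetical.</p>
        <preformat preformat-type="code">
```python
from collections import Counter


def effect(precision_before, precision_after, tol=1e-9):
    """Classify the effect of a stratagem on one session.

    Assumed mapping: after > before is "positive", after lower than
    before is "non-positive", equal values (within `tol`) are "neutral".
    """
    diff = precision_after - precision_before
    if diff > tol:
        return "positive"
    if -diff > tol:
        return "non-positive"
    return "neutral"


# (precision before, precision after) for four toy sessions
pairs = [(0.0, 0.5), (0.5, 0.25), (0.2, 0.2), (0.1, 0.3)]
counts = Counter(effect(b, a) for b, a in pairs)
print(counts)
```
        </preformat>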
      </sec>
      <sec id="sec-6-2">
        <title>Summary</title>
        <p>
          In this paper, we started to investigate the use of two stratagems, namely footnote
chasing and citation search, in the sowiport digital library and more precisely in the
SUSS data set [14]. Studying the user behavior towards stratagems can
enhance the user-system interactions and lead to more useful academic search
engines [13]. Using the SUSS data set, we examined the frequency and stage of
use of these stratagems as well as their impact on sessions. We examined whether
their utilization can lead to successful sessions. We measured the success of
sessions as the difference between the precision before and after using
a stratagem. Both precision values are obtained from the occurrence of positive actions
in sessions [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ].
        </p>
        <p>As future work, we plan to analyze other stratagems such
as journal run, and to go beyond log analysis by conducting user studies in
order to compare user feedback with the findings of this study.</p>
      </sec>
      <sec id="sec-6-3">
        <title>Acknowledgement</title>
        <p>This work was funded by Deutsche Forschungsgemeinschaft (DFG), grant no.
MA 3964/5-1; the AMUR project at GESIS together with the working group of
Norbert Fuhr. The AMUR project aims at improving the support of interactive
retrieval sessions following two major goals: improving user guidance and system
tuning.</p>
        <ref-list>
          <ref id="ref9">
            <mixed-citation>9. Schneider, J.W., Borlund, P.: Introduction to bibliometrics for construction and maintenance of thesauri: Methodical considerations. Journal of Documentation 60(5) (2004) 524–549</mixed-citation>
          </ref>
          <ref id="ref10">
            <mixed-citation>10. Mahoui, M., Cunningham, S.J.: Search behavior in a research-oriented digital library. In: International Conference on Theory and Practice of Digital Libraries, Springer (2001) 13–24</mixed-citation>
          </ref>
          <ref id="ref11">
            <mixed-citation>11. Xie, H.I.: Patterns between interactive intentions and information-seeking strategies. Information Processing &amp; Management 38(1) (2002) 55–77</mixed-citation>
          </ref>
          <ref id="ref12">
            <mixed-citation>12. Shute, S.J., Smith, P.J.: Knowledge-based search tactics. Information Processing &amp; Management 29(1) (1993) 29–45</mixed-citation>
          </ref>
          <ref id="ref13">
            <mixed-citation>13. Carevic, Z., Mayr, P.: Extending search facilities via bibliometric-enhanced stratagems. CoRR abs/1503.06683 (2015)</mixed-citation>
          </ref>
          <ref id="ref14">
            <mixed-citation>14. Mayr, P.: Sowiport User Search Sessions Data Set (SUSS) (2016)</mixed-citation>
          </ref>
          <ref id="ref15">
            <mixed-citation>15. Mayr, P., Kacem, A.: A Complete Year of User Retrieval Sessions in a Social Sciences Academic Search Engine. In: 21st International Conference on Theory and Practice of Digital Libraries (TPDL 2017). (2017)</mixed-citation>
          </ref>
        </ref-list>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Carevic</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mayr</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>Survey on high-level search activities based on the stratagem level in digital libraries</article-title>
          .
          <source>In: Proceedings of TPDL 2016</source>
          (
          <year>2016</year>
          )
          <fpage>54</fpage>
          -
          <lpage>66</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Hienert</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mutschke</surname>
            ,
            <given-names>P.:</given-names>
          </string-name>
          <article-title>A usefulness-based approach for measuring the local and global effect of IIR services</article-title>
          .
          <source>In: Proceedings of CHIIR '16</source>
          ,
          <publisher-name>ACM</publisher-name>
          (
          <year>2016</year>
          )
          <fpage>153</fpage>
          -
          <lpage>162</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Mayr</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>How do practitioners, PhD students and postdocs in the social sciences assess topic-specific recommendations?</article-title>
          <source>In: Proc. of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL2016)</source>
          , Newark, New Jersey, USA, CEUR-WS.org (
          <year>2016</year>
          )
          <fpage>84</fpage>
          -
          <lpage>92</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Akbar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shaffer</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fox</surname>
            ,
            <given-names>E.A.</given-names>
          </string-name>
          :
          <article-title>Deduced social networks for an educational digital library</article-title>
          .
          <source>In: Proceedings of JCDL '12</source>
          ,
          <publisher-name>ACM</publisher-name>
          (
          <year>2012</year>
          )
          <fpage>43</fpage>
          -
          <lpage>46</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Belkin</surname>
            ,
            <given-names>N.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cole</surname>
            ,
            <given-names>M.J.:</given-names>
          </string-name>
          <article-title>Personalization of search results using interaction behaviors in search sessions</article-title>
          .
          <source>In: Proceedings of SIGIR '12</source>
          ,
          <publisher-name>ACM</publisher-name>
          (
          <year>2012</year>
          )
          <fpage>205</fpage>
          -
          <lpage>214</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Ellis</surname>
            ,
            <given-names>D.:</given-names>
          </string-name>
          <article-title>A behavioural approach to information retrieval system design</article-title>
          .
          <source>Journal of Documentation</source>
          <volume>45</volume>
          (
          <issue>3</issue>
          ) (
          <year>1989</year>
          )
          <fpage>171</fpage>
          -
          <lpage>212</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hienert</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sawitzki</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mayr</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Digital Library Research in Action - Supporting Information Retrieval in Sowiport</article-title>
          .
          <source>D-Lib Magazine</source>
          <volume>21</volume>
          (
          <issue>3/4</issue>
          ) (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Bates</surname>
            ,
            <given-names>M.J.:</given-names>
          </string-name>
          <article-title>Where should the person stop and the information search interface start?</article-title>
          <source>Inf. Process. Manage.</source>
          <volume>26</volume>
          (
          <issue>5</issue>
          ) (
          <month>October</month> <year>1990</year>
          )
          <fpage>575</fpage>
          -
          <lpage>591</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>