<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Pluggable Work-bench for Creating Interactive IR Interfaces</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mark M. Hall</string-name>
          <email>m.mhall@sheffield.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Spyros Katsaris</string-name>
          <email>evolve.sheffieldis@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elaine Toms</string-name>
          <email>e.toms@sheffield.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sheffield University</institution>
          ,
          <addr-line>S1 4DP, Sheffield</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Sheffield University</institution>
          ,
          <addr-line>S1 4DP, Sheffield</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Sheffield University</institution>
          ,
          <addr-line>S1 4DP, Sheffield</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Information Retrieval (IR) has benefited from standard evaluation practices and re-usable software components that enable comparability between systems and experiments. However, Interactive IR (IIR) has had only very limited benefit from these developments, in part because experiments are still built using bespoke components and interfaces. In this paper we propose a flexible workbench for constructing IIR interfaces that will standardise aspects of the IIR experiment process to improve the comparability and reproducibility of IIR experiments.</p>
      </abstract>
      <kwd-group>
        <kwd>evaluation</kwd>
        <kwd>framework</kwd>
        <kwd>standardisation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>MOTIVATION</title>
      <p>Information Retrieval (IR) has benefited from standard
evaluation practices and re-usable software components. The
Cranfield-style evaluation methodology enabled evaluation
programmes such as TREC, INEX, or CLEF. At the same
time, the provision of re-usable software components such as
Lucene<fn><p>https://lucene.apache.org/</p></fn>, Terrier<fn><p>http://terrier.org/</p></fn>, Heritrix<fn><p>https://webarchive.jira.com/wiki/display/Heritrix/Heritrix</p></fn>, or Nutch<fn><p>http://nutch.apache.org/</p></fn> has enabled IR
researchers to focus on the development of those components
directly related to their research. However, Interactive IR
(IIR) has had only very limited benefit from these
developments.</p>
      <p>
        Typically IIR research is still conducted using a single
system in a laboratory setting in which a researcher observes
and interacts with a participant [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], usually using a
bespoke IIR interface. Developing and running such
experiments is a time-consuming, resource-intensive, and
labour-intensive process [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. As a result of this bespoke approach,
the comparability of IIR experiments and their results
suffers. Where studies of the same activities show divergent
results, it is difficult to determine whether the differences
are due to the specific aspect of IIR under investigation, or
simply due to different participant samples or small
differences in how the non-investigated user-interface (UI)
components were implemented. The bespoke nature also makes it
harder to replicate studies, as publications frequently do not
contain sufficient detail to exactly replicate the experiment.
      </p>
      <p>Presented at EuroHCIR2013. Copyright 2013 for the individual papers
by the papers’ authors. Copying permitted only for private and
academic purposes. This volume is published and copyrighted by its
editors.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] we proposed a flexible, standardised IIR
evaluation framework that aims to address the issues created by
variations in the experimental processes and by how context
information is acquired from the participants. However, that
framework makes no provision for standardised IIR
components that would improve the comparability
of the experiment itself, the ease of setting up the
experiment, and the ease of reproducing it.
      </p>
      <p>
        A number of attempts at developing a configurable,
reusable IIR evaluation system have been made in the past.
In 2004, Toms, Freund and Li designed and implemented
the WiIRE (Web-based Interactive Information Retrieval)
system [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which devised an experimental workflow
process that took the participant through a variety of
questionnaires and the search interface. Used in the TREC 11
Interactive Track, it was built using Microsoft Office desktop
technologies, which severely limited its capabilities. The system was
re-created for the web and successfully used in INEX2007
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], but lacked flexibility in setup and data extraction. More
recently, SCAMP (Search ConfigurAtor for experiMenting
with PuppyIR) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] was developed to assess IR systems, but
does not cover the range of IIR research designs that are
typically used. A heavy-weight solution is PIIRExS<fn><p>http://sourceforge.net/projects/piirexs</p></fn> [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
which supports the researcher through the whole process
from setting up the experiment to analysis, providing greater
support but also a steeper learning curve. These approaches
highlight the difficulty of balancing the two main constraints
that limit a system's widespread use:
        <list list-type="bullet">
          <list-item><p>sufficient flexibility to support the wide range of IIR
interfaces and experiments;</p></list-item>
          <list-item><p>sufficient simplicity of implementation that it does not
increase the resource commitment required to set up the
experiment.</p></list-item>
        </list>
      </p>
    </sec>
    <sec id="sec-2">
      <title>DESIGN</title>
      <p>
        To achieve the goal of developing a system that fulfils
these requirements, we propose a system design that is based
around a very lean core into which the researcher can plug
the IIR components they wish to include in their experiment.
We have implemented this design in our web-based
evaluation framework (fig. 1), which complements the larger IIR
experiment support system presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. To achieve
maximum flexibility, the system was designed using a
message-passing architecture that consists of the following four
components:
      </p>
      <p>Web Frontend handles the interface between the
participant's browser and the evaluation workbench
and is implemented using a combination of client-side
and server-side functionality.</p>
      <p>Message Bus handles the inter-component
communication and forms the core of the system. It is
responsible for passing messages from the Web Frontend
to the IIR components configured to be listening for
those messages and also for passing messages directly
between the components.</p>
      <p>Session handles loading and saving the components'
current state for a specific participant, hiding the
complexities of web-application state from the individual
components.</p>
      <fig>
        <label>Figure 3</label>
        <preformat>[SearchResults]
handler = application.components.SearchResults
name = search_results
layout = grid-9 vgrid-expand
connect = search_box:query</preformat>
      </fig>
      <p>When the researcher sets up the workbench for their
experiment, they can freely configure which components to
use, how to lay them out, and which components to
connect to which other components. Based on this
configuration the Web Frontend generates the initial user-interface
that is shown to the participants. Then, when the
participant interacts with a UI element (fig. 2), the resulting UI
event is handled by the Web Frontend, which generates a
message based on the UI event. This message is passed to
the Message Bus, which uses the configuration provided
by the researcher to determine which components to deliver
the message to. The components that are listening for that
message update their own Session state based on the
message and then mark themselves as changed. After message
processing has been completed for all components, the Web
Frontend then updates the UI for each of the changed
components.</p>
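      <p>The message flow described above can be illustrated with a minimal Python sketch. It is not the workbench's actual code: the class and method names (Component, MessageBus, connect, send) and the use of a plain dictionary as the Session state are assumptions made purely for illustration.</p>
      <preformat>
```python
class Component:
    """Hypothetical IIR component that keeps its state in a session dict."""

    def __init__(self, name):
        self.name = name
        self.changed = False

    def handle(self, message, payload, session):
        # Store the payload in this component's slice of the session state
        # and mark the component as changed so the frontend re-renders it.
        session.setdefault(self.name, {})[message] = payload
        self.changed = True


class MessageBus:
    """Delivers messages to the components configured to listen for them."""

    def __init__(self):
        # Maps (source component name, message name) to listening components.
        self._listeners = {}

    def connect(self, source, message, listener):
        self._listeners.setdefault((source, message), []).append(listener)

    def send(self, source, message, payload, session):
        listeners = self._listeners.get((source, message), [])
        for listener in listeners:
            listener.handle(message, payload, session)
        # Once all listeners have run, report which ones need a UI update.
        return [listener.name for listener in listeners if listener.changed]


bus = MessageBus()
results = Component("search_results")
bus.connect("search_box", "query", results)

session = {}
changed = bus.send("search_box", "query", "cultural heritage", session)
print(changed)
print(session)
```
      </preformat>
      <p>In this sketch the frontend would only need the list returned by send to decide which parts of the UI to refresh, which mirrors the "mark as changed" step in the description.</p>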
      <p>An example of the configuration used to set up the
experiment is shown in figure 3 (from the experiment in figure 4),
specifying the configuration of the “search results”
component. It specifies that the component should be displayed 9
grid-cells wide (the application layout uses a 12-by-12 cell
grid layout) and should expand vertically to use as much
space as is available. The component is configured to be
connected to the “search box” component via the “query”
message. It is this ability to freely plug components together
that, we believe, makes the framework sufficiently flexible to
support the wide range of IIR experiments, while remaining
simple to set up and use.</p>
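      <p>To make the configuration format more concrete, the following Python sketch reads the figure 3 fragment with the standard-library configparser module. How the workbench itself parses its configuration is not described here, so the helper functions parse_connections and grid_width, and the assumption that several connections would be separated by whitespace, are hypothetical.</p>
      <preformat>
```python
import configparser

# The configuration fragment from figure 3.
CONFIG = """
[SearchResults]
handler = application.components.SearchResults
name = search_results
layout = grid-9 vgrid-expand
connect = search_box:query
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)


def parse_connections(value):
    """Split 'component:message' pairs (whitespace-separated by assumption)."""
    return [tuple(item.split(":", 1)) for item in value.split()]


def grid_width(layout):
    """Return the column count from a 'grid-N' layout token, or None."""
    for token in layout.split():
        if token.startswith("grid-"):
            return int(token[len("grid-"):])
    return None


section = parser["SearchResults"]
connections = parse_connections(section["connect"])
width = grid_width(section["layout"])
print(section["name"], connections, width)
```
      </preformat>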
    </sec>
    <sec id="sec-3">
      <title>STANDARD COMPONENTS</title>
      <p>The core system provides only the framework into which
the IIR components can be plugged. This allows the
researcher to build any custom IIR UI they wish to test, while
at the same time being able to take advantage of the
standardised session and log handling functionality. As IIR UIs
frequently include required elements that are not the focus of
the study the researcher wishes to undertake, an optional set
of default components for core IR UI elements is provided to
reduce set-up time. This has the additional advantage that
as their behaviour is consistent across experiments, the
comparability of experiments using the framework is improved.
</p>
    </sec>
    <sec id="sec-4">
      <title>Search Box</title>
      <p>Logging, the fourth component of the core system described in the
DESIGN section, provides a standardised logging interface that
allows the components to easily attach logging
information to the UI event generated by the participant.</p>
      <p>
        The Search Box component ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], p. 49; “Formulate Query
Interface” [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], p. 76) provides a standard search box. When
the participant enters text and clicks on the “Search” button,
it generates a query message, which is usually connected to
a Standard Results List.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Standard Results List</title>
      <p>
        The Standard Results List component ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], p. 50;
“Examine Results Interface” [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], p. 77) provides a default 10-item
listing of search results. The Standard Results List includes
support for displaying snippets ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], p. 51) and what Wilson
calls “Usable Information” ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], p. 51) for each result
document. Unlike the other standard components, which can
be used out-of-the-box, the Standard Results List has to be
extended by the researcher in order to be able to access the
search-engine used to power the UI.
      </p>
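      <p>One way to picture this extension point is the following Python sketch. The base class shown here is a hypothetical stand-in, not the workbench's real Standard Results List, and the method names search and results_page are assumptions; the in-memory subclass simply takes the place of a real search-engine backend.</p>
      <preformat>
```python
from abc import ABC, abstractmethod


class StandardResultsList(ABC):
    """Hypothetical stand-in for the default results-list component."""

    page_size = 10  # the default 10-item listing

    @abstractmethod
    def search(self, query, start):
        """Return result dicts for the query, beginning at rank 'start'."""

    def results_page(self, query, start=1):
        # Only one page of results is ever shown at a time.
        return self.search(query, start)[: self.page_size]


class InMemoryResultsList(StandardResultsList):
    """Example extension backed by a plain list instead of a search engine."""

    def __init__(self, documents):
        self.documents = documents

    def search(self, query, start):
        needle = query.lower()
        matches = [d for d in self.documents if needle in d["title"].lower()]
        # Ranks are 1-based in this sketch.
        return matches[start - 1:]


docs = [
    {"title": "Roman coin hoard", "snippet": "Coins found near the river."},
    {"title": "Bronze brooch", "snippet": "A decorated brooch."},
    {"title": "Silver coin", "snippet": "A single silver coin."},
]
page = InMemoryResultsList(docs).results_page("coin")
print([d["title"] for d in page])
```
      </preformat>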
    </sec>
    <sec id="sec-6">
      <title>Pagination</title>
      <p>
        The Pagination component ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], p. 70) displays a configurable
number of pages around the current search-results
page. In response to user interaction it sends a start
message with the rank of the first document to paginate to.
      </p>
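      <p>The arithmetic behind such a component can be sketched as follows. This is an illustrative Python sketch only: the function names, the 1-based ranks, the page size of 10, and the dictionary layout of the start message are all assumptions and are not taken from the workbench.</p>
      <preformat>
```python
def page_window(current_page, total_pages, around=2):
    """Return the page numbers to show around the current page (1-based)."""
    first = max(1, current_page - around)
    last = min(total_pages, current_page + around)
    return list(range(first, last + 1))


def start_rank(page, page_size=10):
    """Rank (1-based) of the first document on the given page."""
    return (page - 1) * page_size + 1


def start_message(page, page_size=10):
    """Build the payload of a 'start' message for the requested page."""
    return {"message": "start", "rank": start_rank(page, page_size)}


print(page_window(5, 20))
print(page_window(1, 20))
print(start_message(3))
```
      </preformat>
      <p>The "around" parameter plays the role of the configurable number of pages shown on either side of the current page.</p>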
    </sec>
    <sec id="sec-7">
      <title>Category Browsing</title>
      <p>
        The Category Browsing component ([
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], p. 54) provides a
hierarchical category structure that the participant can use
to explore a collection. Clicking on a category sends a query
message with the category's identifier.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Saved Documents</title>
      <p>The Saved Documents component provides an area where
the participant can save documents that they have found
interesting, to support them in their current task. Documents
are added through a save_document message. The Saved
Documents component supports an optional tagging feature
enabling the participant to tag the document with values
specified by the researcher. This can be used to let the
participant specify why they have chosen that document or how
much it helps them in their current task.</p>
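      <p>A minimal Python sketch of this behaviour is given below. Only the save_document message name comes from the description above; the class layout, the doc_id payload key, and the example tag values are hypothetical.</p>
      <preformat>
```python
class SavedDocuments:
    """Hypothetical saved-documents component with optional tagging."""

    def __init__(self, allowed_tags=()):
        # The tag values are fixed by the researcher in advance.
        self.allowed_tags = set(allowed_tags)
        self.saved = {}

    def handle(self, message, payload):
        # React only to the save_document message; ignore everything else.
        if message == "save_document":
            self.saved.setdefault(payload["doc_id"], set())

    def tag(self, doc_id, tag):
        # Participants may only choose from the researcher's tag values.
        if tag not in self.allowed_tags:
            raise ValueError("tag not offered by the researcher: " + tag)
        self.saved[doc_id].add(tag)


saved = SavedDocuments(allowed_tags=["relevant", "background"])
saved.handle("save_document", {"doc_id": "doc-17"})
saved.tag("doc-17", "relevant")
print(saved.saved)
```
      </preformat>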
    </sec>
    <sec id="sec-9">
      <title>Task</title>
      <p>The Task component provides a static display of the task
information to show to the user. Two versions of this
component are provided, one that displays a static text set in
the configuration, and one that can fetch a task description
from the database, based on a parameter passed to it.</p>
    </sec>
    <sec id="sec-10">
      <title>APPLICATION</title>
      <p>The evaluation work-bench has so far been used to build
two IIR experiments that are very different in nature,
demonstrating the work-bench's flexibility.</p>
      <p>The first experiment (fig. 4) re-uses the standard Task,
Search Box, Pagination, and Saved Documents components,
and extends the Standard Results List to work with the
specific search backend. This set-up re-creates what is
essentially a relatively standard search UI configuration that is
being used to investigate query session behaviour.</p>
      <p>The second experiment (fig. 5) demonstrates a much
richer interface, with more modifications to the components
and an experiment-specific component. It re-uses the Task
and Category Browsing components, extends the default
Search Box, Pagination, Standard Results List, and Saved
Documents components, and adds a new Item View
component. The message-passing nature of the system made
it possible to quickly integrate the new component, so that
when the participant clicks on a meta-data facet in the Item
View, a query message is sent to the Standard Results List
to find items with the same meta-data value. The interface
was used to investigate un-directed exploration behaviour in
a large digital cultural heritage collection.</p>
    </sec>
    <sec id="sec-11">
      <title>WHERE TO GO NEXT?</title>
      <p>The stated aim of this paper was to present a novel,
pluggable, extensible, and configurable IIR interface work-bench
that supports our wider aim of improving IIR experiment
comparability. The work-bench is sufficiently flexible to
support the wide range of web-based IIR experiments that are
undertaken, while being sufficiently simple and light-weight
to encourage widespread use of the workbench.</p>
      <p>To enable this widespread use, the system has been
released under an open-source license<fn><p>https://bitbucket.org/mhall/pyire</p></fn>. We are also moving
to engage with the wider research community to determine
to what degree the work-bench satisfies their needs for an
evaluation system and what needs to be done to achieve the
widespread use needed to improve IIR experiment
comparability.</p>
    </sec>
    <sec id="sec-12">
      <title>ACKNOWLEDGEMENTS</title>
      <p>The research leading to these results was supported by
the Network of Excellence co-funded by the 7th Framework
Program of the European Commission, grant agreement no.
258191.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bierig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gwizdka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Belkin</surname>
          </string-name>
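          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,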
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <article-title>An experiment and analysis system framework for the evaluation of contextual relationships</article-title>
          .
          <source>In CIRSE 2010</source>
          , page
          <fpage>5</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chua</surname>
          </string-name>
          .
          <article-title>A user interface guide for web search systems</article-title>
          .
          <source>In Proceedings of the 24th Australian Computer-Human Interaction Conference</source>
          ,
          <source>OzCHI '12</source>
          , pages
          <fpage>76</fpage>
          –
          <lpage>84</lpage>
          , New York, NY, USA,
          <year>2012</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. M.</given-names>
            <surname>Hall</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. G.</given-names>
            <surname>Toms</surname>
          </string-name>
          .
          <article-title>Building a common framework for IIR evaluation</article-title>
          .
          <source>In Information Access Evaluation meets Multilinguality, Multimodality, and Visualization. 4th International Conference of the CLEF Initiative - CLEF 2013</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Renaud</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          .
          <article-title>SCAMP: a tool for conducting interactive information retrieval experiments</article-title>
          .
          <source>In Proceedings of the 4th Information Interaction in Context Symposium</source>
          , pages
          <fpage>286</fpage>
          –
          <lpage>289</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tague-Sutcliffe</surname>
          </string-name>
          .
          <article-title>The pragmatics of information retrieval experimentation, revisited</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>28</volume>
          (
          <issue>4</issue>
          ):
          <fpage>467</fpage>
          –
          <lpage>490</lpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>E. G.</given-names>
            <surname>Toms</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Freund</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          .
          <article-title>WiIRE: the web interactive information retrieval experimentation system prototype</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>40</volume>
          (
          <issue>4</issue>
          ):
          <fpage>655</fpage>
          –
          <lpage>675</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E. G.</given-names>
            <surname>Toms</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>O'Brien</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mackenzie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jordan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Freund</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Toze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Dawe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Macnutt</surname>
          </string-name>
          .
          <article-title>Task effects on interactive search: The query factor</article-title>
          .
          <source>In Focused access to XML documents</source>
          , pages
          <fpage>359</fpage>
          –
          <lpage>372</lpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Wilson</surname>
          </string-name>
          .
          <article-title>Search User Interface Design</article-title>
          , volume
          <volume>20</volume>
          . Morgan &amp; Claypool Publishers,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>