=Paper= {{Paper |id=Vol-512/paper-2 |storemode=property |title=A User-Centered Experiment and Logging Framework for Interactive Information Retrieval |pdfUrl=https://ceur-ws.org/Vol-512/paper02.pdf |volume=Vol-512 |dblpUrl=https://dblp.org/rec/conf/sigir/BierigGC09 }} ==A User-Centered Experiment and Logging Framework for Interactive Information Retrieval== https://ceur-ws.org/Vol-512/paper02.pdf
    A User-Centered Experiment and Logging Framework for
             Interactive Information Retrieval ∗ †

                    Ralf Bierig                        Jacek Gwizdka                         Michael Cole
             SC&I Rutgers University                SC&I Rutgers University             SC&I Rutgers University
               4 Huntington St.,                      4 Huntington St.,                   4 Huntington St.,
                New Brunswick                          New Brunswick                       New Brunswick
                NJ 08901, USA                          NJ 08901, USA                       NJ 08901, USA
            bierig@rci.rutgers.edu              jgwizdka@scils.rutgers.edu mcole@scils.rutgers.edu


ABSTRACT
This paper describes an experiment system framework that enables researchers to design and conduct task-based experiments for Interactive Information Retrieval (IIR). The primary focus is on multidimensional logging to obtain rich behavioral data from participants. We summarize initial experiences and highlight the benefits of multidimensional data logging within the system framework.

Categories and Subject Descriptors
H.4 [Information Systems Applications]: Miscellaneous

Keywords
User logging, Interactive Information Retrieval, Evaluation

∗Copyright is held by the author/owner(s). SIGIR’09, July 19-23, 2009, Boston, USA.
†This work is supported, in part, by the Institute of Museum and Library Services (IMLS grant LG-06-07-0105-07).

1.   INTRODUCTION
Over the last two decades, Interactive Information Retrieval (IIR) has established a new direction within the tradition of IR. Evaluation in traditional IR is often performed in laboratory settings where controlled collections and queries are evaluated against static information needs. IIR introduces the user at the center of a more naturalistic search environment. Belkin and colleagues [3, 2] suggested the concept of an information seeking episode composed of a sequence of a person’s interactions with information objects, determined by a specific goal, conditioned by an initial task, the general context and the more specific situation in which the episode takes place, and the application of a particular information seeking strategy.

This poses new challenges for the evaluation of information retrieval systems. An enriched set of possible user behaviors needs to be addressed and included as part of the evaluation process. Systems need to address information about the entire interactive process with which users accomplish a task. This problem has so far only been initially explored [4].

This paper describes an experiment system framework that enables researchers to design and conduct task-based IIR experiments. The paper is focused on the logging features of the system designed to obtain rich behavioral data from participants. The following section describes the overall architecture of the system. Section 3 provides more details about its specific logging features. Section 4 summarizes initial experiences with multidimensional data logging within the system framework based on initial data analysis from three user studies. Future work is proposed in section 5.

2.     THE POODLE IIR EXPERIMENT SYSTEM FRAMEWORK
The PooDLE IIR Experiment System Framework is part of an ongoing research project. The goal of PooDLE1 is to investigate ways to improve information seeking in digital libraries; the analysis concentrates on an array of interacting factors involved in such online search activities. The overall aim of the framework is to reduce the complexity of designing and conducting IIR experiments using multidimensional logging of users’ interactive search behavior. Such experiments usually require a complex arrangement of system components (e.g. GUI, user management and persistent data storage) including logging facilities that monitor implicit user behavior. Our framework enables researchers to focus on the design of the experiment including questionnaire and task design and the selection of appropriate logging tools. This can help to reduce the overall time and effort that is needed to design and conduct experiments that support the needs of IIR. As shown in figure 1, the experiment system framework consists of two sides – a server that operates in an Apache webserver environment and a client that resides on the machine where the experiment is conducted.

Figure 1: System components of the PooDLE IIR Experiment System Framework. Logging features highlighted in grey.

1 http://www.scils.rutgers.edu/imls/poodle/index.html

We distinguish the following components:

      • Login and Authentication manages participants, allows them to authenticate with the system, and enables the system to direct individuals to particular experiment
        setups; multiple experiments may exist and users can be registered for multiple or multi-part experiments at any time.

      • The Graphical UI allows participants to authenticate with the framework and activate their experiment. Each experiment consists of a number of rotated tasks that are provided with a generic menu that presents the predefined task order to the user. After every completed task, the UI guides the participant back to the menu that now highlights the completed tasks. This allows participants to navigate between tasks and gain feedback that helps them to track their progress. In addition, the interface presents participants with additional information, instructions and warnings when progressing through the tasks of an experiment.

      • The Experimenter controls and coordinates the core components of the system – these are:

          – An Extensible Task Framework that provides a range of standard tasks for IIR experiments that are part of the framework (e.g. questionnaires for acquiring background information and general feedback from participants, search tasks with a bookmarking feature and an evaluation procedure, and cognitive tasks to obtain information about individual differences between participants). Tasks are easily added to this basic collection and can be reused as part of the framework in different experiments.

          – The Task Progress and Control Management provides participants with (rotated) task sequences, monitors their state within the experiment, and allows them to continue interrupted experiments at a later point in time.

          – The Interaction Logger allows tasks to register and trigger logging messages at strategic points within the task. The system automatically logs the beginning and end of each task at task boundaries.

          – Remote Logging Application Invocation calls logging applications that reside on the client. This allows for rich client-side logging of low level user behavior obtained from specific hardware (e.g. mouse movements or eye-tracking information).

      • The Database interface manages all access to one or more databases that store users’ interaction logs as
        well as the basic experiment design for other system components (e.g. participants, tasks and experiment blocks in the form of task rotations for individual users).

3.    USER INTERACTION LOGGING
This section focuses on the logging features of the Experiment System Framework as highlighted in grey in figure 1. The logging features and the arrangement of logging tools within the framework have been informed by the following requirements:

      • Hybridity: All logging functionality is divided between a more general server architecture and a more specific client; this integrates server-based as well as client-based logging features into a hybrid system framework. Whereas the server logs user interactions uniformly across experiments, client logging is targeted to the capabilities of the particular client machine used for the experiment. Researchers can select from a range of logging tools or integrate their own tools to record user behavior. This enables the system to use low level input devices, which are normally inaccessible to the server, through logging tools residing on the client.

      • Flexibility: Client logging tools can be combined through a loosely coupled XML-based configuration that is provided at task granularity. The system framework uses these task configurations to start logging tools on the client when the participant enters a task and to stop them when the participant completes a task. This gives researchers the flexibility to compose logging tools as part of the experiment design and attach them to the configuration of the task. Such configurations can later be reused as design templates, which promotes uniformity across experiments and ensures important types of user interaction data are being logged.

      • Scalability: Experiments can be configured to use a number of different client machines as part of the data collection. A researcher can, for example, trigger another client computer to record video from a second web camera or simultaneously activate several clients for experiments that involve more than one participant. Redundant instances of the same logging tools can be instantiated to produce multiple data streams to overcome potential measurement errors and instabilities on a data stream due to load or general failure of hardware and software.

The client is configured to work with the following selection of open-source and commercial logging tools that record different behavioral aspects of participants:

      • RUIConsole is an adapted command line version of the RUI tools developed at Pennsylvania State University [5]. RUI logs low level mouse movements, mouse clicks, and keystrokes. Our extension additionally provides full control over its logging features through a command line interface to allow for more efficient automated use within our experiment framework.

      • UsaProxy is a JavaScript-based HTTP proxy developed at the University of Munich [1] that logs interactive user behavior unobtrusively through injected JavaScript. It monitors page loads as well as resize and focus events. It identifies mouse hover events over page elements, mouse movements, mouse clicks, keystrokes, and scrolling. Our version of UsaProxy is slightly modified as we don’t log mouse movements with this tool. UsaProxy can run directly on the client, but can also be activated on a separate computer to balance load.

      • The URL Tracker is a command line tool that extracts and logs the user’s current web location directly from the Internet Explorer (IE) address bar and makes it available to the system framework. This allows any task to determine participants’ current position on the web and to monitor their browsing history within a task.

      • Tobii Eyetracker: We use the Tobii T60 eyetracking hardware which is packaged with Tobii Studio2, a commercial eyetracking recording and analysis software. The software records eye movements, eye fixations, as well as webpage access, mouse events and keystrokes.

      • Morae is a commercial software package for usability testing and user experience developed and distributed by TechSmith3. It records participants’ webcam and computer screen as video, captures audio, and logs screen text, mouse clicks and keystrokes occurring within Internet Explorer.

This extensible list of logging tools is loosely coupled to the Interaction Logger and the Remote Logging Application Framework components through task configurations for individual tasks. The task configuration describes which logging tools are used during a task and the software framework activates them as soon as participants enter a task and deactivates them as soon as they complete a task.

The researcher can create a selection of relevant tools for each task of a particular IIR experiment from the available logging tools supported by the system framework. First, one should select all user behaviors the researcher is interested in. Second, the observable data types that provide evidence for the existence and the structure of these user behaviors are identified. Finally, these data types are linked with relevant logging tools. In the next section we summarize experiences from three distinct experiments that were designed and performed with our experiment system framework. We do not describe these experiments in this paper. Instead, we focus on key points and issues that should be addressed when collecting multidimensional logging data from hybrid logging tools.

4.      EXPERIENCES FROM MULTIDIMENSIONAL DATA LOGGING
Data logging with an array of hybrid tools, as described in the previous section, has a number of benefits and challenges. This section summarizes our initial experiences from conducting three IIR user experiments with the system framework and some initial processing and integration of its data logs.

2 http://www.tobii.com
3 http://www.techsmith.com
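To make the task-level configuration from Section 3 more concrete, the following is a minimal sketch of how an XML task configuration might drive the starting and stopping of logging tools at task boundaries. The paper does not publish the framework's schema or API, so the element names (`task`, `logger`), the attributes (`id`, `name`, `host`) and the functions `parse_task_config` and `task_logging` are all hypothetical. The sketch records start and stop actions in a journal instead of invoking real tools.

```python
import xml.etree.ElementTree as ET
from contextlib import contextmanager
from dataclasses import dataclass

# Hypothetical task configuration: the schema is an assumption made for
# illustration, not the framework's actual format.
TASK_CONFIG = """
<task id="search-task-1">
  <logger name="RUIConsole" host="client-1"/>
  <logger name="UsaProxy" host="client-2"/>
  <logger name="URLTracker" host="client-1"/>
</task>
"""


@dataclass(frozen=True)
class LoggerSpec:
    name: str  # logging tool to activate
    host: str  # client machine the tool runs on


def parse_task_config(xml_text):
    """Return the task id and the logging tools configured for that task."""
    root = ET.fromstring(xml_text.strip())
    specs = [LoggerSpec(e.get("name"), e.get("host")) for e in root.findall("logger")]
    return root.get("id"), specs


@contextmanager
def task_logging(specs, journal):
    """Start every configured tool on task entry and stop it on task exit."""
    started = []
    try:
        for spec in specs:
            # A real system would invoke the tool remotely at this point.
            journal.append(("start", spec.name, spec.host))
            started.append(spec)
        yield
    finally:
        # Stop in reverse order, even if the task ends with an error.
        for spec in reversed(started):
            journal.append(("stop", spec.name, spec.host))


task_id, specs = parse_task_config(TASK_CONFIG)
journal = []
with task_logging(specs, journal):
    journal.append(("task", task_id, "running"))

for entry in journal:
    print(entry)
```

Keeping the tool list in the configuration rather than in code mirrors the loose coupling described in Section 3: the same configuration could be reused as a template for another task without touching the start and stop logic.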
      • Accuracy and Reliability: Using data streams from multiple logging tools limits the risk of measurement errors entering data analysis. This is especially relevant to IIR due to its need to conduct experiments in naturalistic settings where people perform tasks in conditions that are not fully controlled and therefore less predictable. Such settings allow participants to solve tasks with great degrees of freedom. As a result of this, user actions in such settings tend to be highly variable. Measurement errors or missing data, for example based on varying system performance and network latencies, have a larger impact because the entire interaction is studied. Multiple data streams from different sources improve the overall accuracy of recorded sessions and increase the reliability of detecting features in individual logs. Furthermore, the use of multiple data logs limits the chances that artifacts created by individual logging tools and their assumptions will affect downstream analysis.

      • Disambiguation: The use of multiple data logs makes it possible to contextualize each log with the logs produced by other tools and disambiguate uncertainties in the interpretation of logging event sequences. We found that the most common cases are timestamp disambiguation and the synchronization of event accuracies.

          – Timestamp disambiguation: The timestamp granularity of recorded events usually varies between logging tools. For example, Tobii Studio records eye tracking data with a constant frequency determined by the eye tracking hardware (e.g. 60 logs per second, i.e. one about every 17 ms, for the T60 model) whereas UsaProxy records events only every full second and RUIConsole records events dynamically only when they occur. The combination of logging data from different tools helps to better determine the real timing of events by providing different viewpoints for the same sequence of actions a user has performed. Low granularity timestamps might collapse a number of user events to a single point of time and, based on that, change the natural order in which these events are recorded. Alternative secondary logging data can help to detect such event sequences and help to disambiguate and correct them.

          – Detail of event structure: Every logging tool imposes a number of assumptions on the data produced by a user – which events to log, which events to differentiate and how to label them. Two logging tools recording the same events can therefore produce different event structures with varying detail. For example, RUIConsole differentiates a mouse click into a press and a release event whereas Tobii Studio considers a mouse click as a single event. Different logging tools recording the same user actions produce events with a structure of different detail that can be used to contextualise conflicting recordings of user actions.

      • Scalability: Concurrent use of logging tools may create performance issues on the client machine especially with tools that produce large amounts of data. In particular, the combined use of Morae and Tobii Studio can be demanding when using high quality web camera and screen capture recording. Limited hardware resources may have a direct effect on the recording accuracy of other logging tools. More importantly, however, an overloaded client may have an effect on participants and their ability to accomplish tasks realistically. This can be avoided by choosing a sufficiently equipped client machine and a fast network. As mentioned in section 3, the software framework supports the distribution of logging tools over several machines, while these tools are activated centrally by the server architecture, which can help to better balance the load.

      • Stability: Concurrent use of multiple logging applications can destabilize the client computer. Individual applications can affect each other especially when logging from the same resources (e.g. from the same instance of Internet Explorer). Currently, our system framework does not monitor running logging tools and there is no mechanism to recover tools that hang or break during a task. This is a feature we will incorporate into a future version of the system framework.

5.    FUTURE WORK
Future work on the experiment system framework will focus on further improvement of logging tool integration and monitoring. We are currently developing a graphical user interface for researchers to more easily design IIR experiments with the system and monitor progress of running experiments and the accuracy of their data logs. An extension to the experiment system framework presented in this paper is a data analysis system that allows us to fully integrate, analyse and develop models from the recorded data. In particular, we are interested in creating higher level constructs from integrated low-level logging data that can be used to personalise interactive search for users. The experiment system framework will be released as open source to the wider research community.

6.    REFERENCES
[1] R. Atterer, M. Wnuk, and A. Schmidt. Knowing the User’s Every Move - User Activity Tracking for Website Usability Evaluation and Implicit Interaction. In 15th International World Wide Web Conference (WWW 2006), Edinburgh, Scotland, 2006.
[2] N. Belkin. Intelligent Information Retrieval: Whose Intelligence? In Fifth International Symposium for Information Science (ISI), pages 25–31, Konstanz, Germany, 1996. Universitaetsverlag Konstanz.
[3] N. Belkin, C. Cool, A. Stein, and U. Thiel. Cases, Scripts, and Information-Seeking Strategies: On the Design of Interactive Information Retrieval Systems. Expert Systems with Applications, 9(3):379–395, 1995.
[4] A. Edmonds, K. Hawkey, M. Kellar, and D. Turnbull. Workshop on logging traces of web activity: The mechanics of data collection. In 15th International World Wide Web Conference (WWW 2006), Edinburgh, Scotland, 2006.
[5] U. Kukreja, W. E. Stevenson, and F. E. Ritter. RUI – Recording User Input from interfaces under Windows and Mac OS X. Behavior Research Methods, 38(4):656–659, 2006.
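A note on the timestamp disambiguation discussed in Section 4: the sketch below shows one possible way to restore the order of events that a coarse log has collapsed into the same second, using a finer-grained secondary log of the same actions. It is an illustration under stated assumptions, not the authors' integration procedure. It assumes that both logs share a synchronized clock and use the same event labels, and the function name `refine_timestamps` and the sample events are invented for this example.

```python
from collections import defaultdict, deque


def refine_timestamps(coarse, fine):
    """Assign millisecond timestamps to coarse events using a finer log.

    coarse: list of (second, kind) pairs with one-second granularity.
    fine:   list of (millisecond, kind) pairs covering the same actions.
    Returns the coarse events as (millisecond, kind) pairs in time order.
    """
    # Queue the fine-grained timestamps for each (second, kind) combination.
    candidates = defaultdict(deque)
    for ms, kind in sorted(fine):
        candidates[(ms // 1000, kind)].append(ms)

    refined = []
    for second, kind in coarse:
        queue = candidates[(second, kind)]
        # Without a matching fine-grained event, fall back to the start
        # of the second, which keeps the event but not its exact timing.
        ms = queue.popleft() if queue else second * 1000
        refined.append((ms, kind))
    return sorted(refined)


# The coarse log lists the keypress before the click, but both fall
# into second 12, so their real order is ambiguous.
coarse = [(12, "keypress"), (12, "click"), (13, "scroll")]
fine = [(12100, "click"), (12850, "keypress")]

refined = refine_timestamps(coarse, fine)
print(refined)
# [(12100, 'click'), (12850, 'keypress'), (13000, 'scroll')]
```

In this example the secondary log shows that the click preceded the keypress, which the one-second granularity alone could not reveal. Real logs would additionally need clock alignment between tools and a mapping between their event vocabularies.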