=Paper=
{{Paper
|id=Vol-512/paper-2
|storemode=property
|title=A User-Centered Experiment and Logging Framework for Interactive Information Retrieval
|pdfUrl=https://ceur-ws.org/Vol-512/paper02.pdf
|volume=Vol-512
|dblpUrl=https://dblp.org/rec/conf/sigir/BierigGC09
}}
==A User-Centered Experiment and Logging Framework for Interactive Information Retrieval==
A User-Centered Experiment and Logging Framework for Interactive Information Retrieval ∗ †

Ralf Bierig, Jacek Gwizdka, Michael Cole
SC&I, Rutgers University, 4 Huntington St., New Brunswick, NJ 08901, USA
bierig@rci.rutgers.edu, jgwizdka@scils.rutgers.edu, mcole@scils.rutgers.edu

∗ Copyright is held by the author/owner(s). SIGIR'09, July 19-23, 2009, Boston, USA.
† This work is supported, in part, by the Institute of Museum and Library Services (IMLS grant LG-06-07-0105-07).

ABSTRACT
This paper describes an experiment system framework that enables researchers to design and conduct task-based experiments for Interactive Information Retrieval (IIR). The primary focus is on multidimensional logging to obtain rich behavioral data from participants. We summarize initial experiences and highlight the benefits of multidimensional data logging within the system framework.

Categories and Subject Descriptors
H.4 [Information Systems Applications]: Miscellaneous

Keywords
User logging, Interactive Information Retrieval, Evaluation

1. INTRODUCTION
Over the last two decades, Interactive Information Retrieval (IIR) has established a new direction within the tradition of IR. Evaluation in traditional IR is often performed in laboratory settings where controlled collections and queries are evaluated against static information needs. IIR introduces the user at the center of a more naturalistic search environment. Belkin and colleagues [3, 2] suggested the concept of an information seeking episode composed of a sequence of a person's interactions with information objects, determined by a specific goal, conditioned by an initial task, the general context and the more specific situation in which the episode takes place, and the application of a particular information seeking strategy.

This poses new challenges for the evaluation of information retrieval systems. An enriched set of possible user behaviors needs to be addressed and included as part of the evaluation process. Systems need to address information about the entire interactive process with which users accomplish a task. This problem has so far only been initially explored [4].

This paper describes an experiment system framework that enables researchers to design and conduct task-based IIR experiments. The paper focuses on the logging features of the system, which are designed to obtain rich behavioral data from participants. The following section describes the overall architecture of the system. Section 3 provides more details about its specific logging features. Section 4 summarizes initial experiences with multidimensional data logging within the system framework, based on initial data analysis from three user studies. Future work is proposed in Section 5.

2. THE POODLE IIR EXPERIMENT SYSTEM FRAMEWORK
The PooDLE IIR Experiment System Framework is part of an ongoing research project. The goal of PooDLE (http://www.scils.rutgers.edu/imls/poodle/index.html) is to investigate ways to improve information seeking in digital libraries; the analysis concentrates on an array of interacting factors involved in such online search activities. The overall aim of the framework is to reduce the complexity of designing and conducting IIR experiments using multidimensional logging of users' interactive search behavior. Such experiments usually require a complex arrangement of system components (e.g. GUI, user management and persistent data storage) including logging facilities that monitor implicit user behavior. Our framework enables researchers to focus on the design of the experiment, including questionnaire and task design and the selection of appropriate logging tools. This can help to reduce the overall time and effort needed to design and conduct experiments that support the needs of IIR. As shown in figure 1, the experiment system framework consists of two sides: a server that operates in an Apache webserver environment and a client that resides on the machine where the experiment is conducted. We distinguish the following components:

• Login and Authentication manages participants, allows them to authenticate with the system, and enables the system to direct individuals to particular experiment setups; multiple experiments may exist and users can be registered for multiple or multi-part experiments at any time.

• The Graphical UI allows participants to authenticate with the framework and activate their experiment. Each experiment consists of a number of rotated tasks that are provided with a generic menu that presents the predefined task order to the user. After every completed task, the UI guides the participant back to the menu, which now highlights the completed tasks. This allows participants to navigate between tasks and gain feedback that helps them to track their progress. In addition, the interface presents participants with additional information, instructions and warnings when progressing through the tasks of an experiment.

• The Experimenter controls and coordinates the core components of the system. These are:

– An Extensible Task Framework that provides a range of standard tasks for IIR experiments that are part of the framework (e.g. questionnaires for acquiring background information and general feedback from participants, search tasks with a bookmarking feature and an evaluation procedure, and cognitive tasks to obtain information about individual differences between participants). Tasks are easily added to this basic collection and can be reused as part of the framework in different experiments.

– The Task Progress and Control Management provides participants with (rotated) task sequences, monitors their state within the experiment, and allows them to continue interrupted experiments at a later point in time.

– The Interaction Logger allows tasks to register and trigger logging messages at strategic points within the task. The system automatically logs the beginning and end of each task at task boundaries.

– Remote Logging Application Invocation calls logging applications that reside on the client. This allows for rich client-sided logging of low level user behavior obtained from specific hardware (e.g. mouse movements or eye-tracking information).

• The Database interface manages all access to one or more databases that store users' interaction logs as well as the basic experiment design for other system components (e.g. participants, tasks and experiment blocks in the form of task rotations for individual users).

Figure 1: System components of the PooDLE IIR Experiment System Framework. Logging features highlighted in grey.

3. USER INTERACTION LOGGING
This section focuses on the logging features of the Experiment System Framework as highlighted in grey in figure 1. The logging features and the arrangement of logging tools within the framework have been informed by the following requirements:

• Hybridity: All logging functionality is divided between a more general server architecture and a more specific client; this integrates server-based as well as client-based logging features into a hybrid system framework. Whereas the server logs user interactions uniformly across experiments, client logging is targeted to the capabilities of the particular client machine used for the experiment. Researchers can select from a range of logging tools or integrate their own tools to record user behavior. This enables low level input devices, normally inaccessible to the server, to be controlled by logging tools residing on the client.

• Flexibility: Client logging tools can be combined through a loosely coupled XML-based configuration that is provided at task granularity. The system framework uses these task configurations to start logging tools on the client when the participant enters a task and to stop them when the participant completes a task. This gives researchers the flexibility to compose logging tools as part of the experiment design and attach them to the configuration of the task. Such configurations can later be reused as design templates, which promotes uniformity across experiments and ensures that important types of user interaction data are being logged.

• Scalability: Experiments can be configured to use a number of different client machines as part of the data collection. A researcher can, for example, trigger another client computer to record video from a second web camera or simultaneously activate several clients for experiments that involve more than one participant. Redundant instances of the same logging tools can be instantiated to produce multiple data streams to overcome potential measurement errors and instabilities on a data stream due to load or general failure of hardware and software.

The client is configured to work with the following selection of open-source and commercial logging tools that record different behavioral aspects of participants:

• RUIConsole is an adapted command line version of the RUI tools developed at Pennsylvania State University [5]. RUI logs low level mouse movements, mouse clicks, and keystrokes. Our extension additionally provides full control over its logging features through a command line interface to allow for more efficient automated use within our experiment framework.

• UsaProxy is a JavaScript-based HTTP proxy developed at the University of Munich [1] that logs interactive user behavior unobtrusively through injected JavaScript. It monitors page loads as well as resize and focus events. It identifies mouse hover events over page elements, mouse movements, mouse clicks, keystrokes, and scrolling. Our version of UsaProxy is slightly modified as we don't log mouse movements with this tool. UsaProxy can run directly on the client, but can also be activated on a separate computer to balance load.

• The URL Tracker is a command line tool that extracts and logs the user's current web location directly from the Internet Explorer (IE) address bar and makes it available to the system framework. This allows any task to determine participants' current position on the web and to monitor their browsing history within a task.

• Tobii Eyetracker: We use the Tobii T60 eyetracking hardware, which is packaged with Tobii Studio (http://www.tobii.com), a commercial eyetracking recording and analysis software. The software records eye movements, eye fixations, as well as webpage access, mouse events and keystrokes.

• Morae is a commercial software package for usability testing and user experience developed and distributed by TechSmith (http://www.techsmith.com). It records participants' webcam and computer screen as video, captures audio, and logs screen text, mouse clicks and keystrokes occurring within Internet Explorer.

This extensible list of logging tools is loosely coupled to the Interaction Logger and the Remote Logging Application Framework components through task configurations for individual tasks. The task configuration describes which logging tools are used during a task, and the software framework activates them as soon as participants enter a task and deactivates them as soon as they complete a task.

The researcher can create a selection of relevant tools for each task of a particular IIR experiment from the available logging tools supported by the system framework. First, the researcher should select all user behaviors of interest. Second, the observable data types that provide evidence for the existence and the structure of these user behaviors are identified. Finally, these data types are linked with relevant logging tools. In the next section we summarize experiences from three distinct experiments that were designed and performed with our experiment system framework. We do not describe these experiments in this paper. Instead, we focus on key points and issues that should be addressed when collecting multidimensional logging data from hybrid logging tools.

4. EXPERIENCES FROM MULTIDIMENSIONAL DATA LOGGING
Data logging with an array of hybrid tools, as described in the previous section, has a number of benefits and challenges. This section summarizes our initial experiences from conducting three IIR user experiments with the system framework and some initial processing and integration of its data logs.
• Accuracy and Reliability: Using data streams from multiple logging tools limits the risk of measurement errors entering data analysis. This is especially relevant to IIR due to its need to conduct experiments in naturalistic settings where people perform tasks in conditions that are not fully controlled and therefore less predictable. Such settings allow participants to solve tasks with great degrees of freedom. As a result, user actions in such settings tend to be highly variable. Measurement errors or missing data, for example based on varying system performance and network latencies, have a larger impact because the entire interaction is studied. Multiple data streams from different sources improve the overall accuracy of recorded sessions and increase the reliability of detecting features in individual logs. Furthermore, the use of multiple data logs limits the chances that artifacts created by individual logging tools and their assumptions will affect downstream analysis.

• Disambiguation: The use of multiple data logs makes it possible to contextualize each log with the logs produced by other tools and to disambiguate uncertainties in the interpretation of logging event sequences. We found that the most common cases are timestamp disambiguation and the synchronization of event accuracies.

– Timestamp disambiguation: The timestamp granularity of recorded events usually varies between logging tools. For example, Tobii Studio records eye tracking data with a constant frequency determined by the eye tracking hardware (e.g. 60 logs per second, about one every 17 ms, for the T60 model), whereas UsaProxy records events only every full second and RUIConsole records events dynamically only when they occur. The combination of logging data from different tools helps to better determine the real timing of events by providing different viewpoints for the same sequence of actions a user has performed. Low granularity timestamps might collapse a number of user events to a single point of time and, based on that, change the natural order in which these events are recorded. Alternative secondary logging data can help to detect such event sequences and to disambiguate and correct them.

– Detail of event structure: Every logging tool imposes a number of assumptions on the data produced by a user: which events to log, which events to differentiate and how to label them. Two logging tools recording the same events can therefore produce different event structures with varying detail. For example, RUIConsole differentiates a mouse click into a press and a release event whereas Tobii Studio considers a mouse click as a single event. Different logging tools recording the same user actions produce events with a structure of different detail that can be used to contextualise conflicting recordings of user actions.

• Scalability: Concurrent use of logging tools may create performance issues on the client machine, especially with tools that produce large amounts of data. In particular, the combined use of Morae and Tobii Studio can be demanding when using high quality web camera and screen capture recording. Limited hardware resources may have a direct effect on the recording accuracy of other logging tools. More importantly, however, an overloaded client may have an effect on participants and their ability to accomplish tasks realistically. This can be avoided by choosing a sufficiently equipped client machine and a fast network. As mentioned in Section 3, the software framework supports the distribution of logging tools over several machines, while these tools are activated centrally by the server architecture, which can help to better balance the load.

• Stability: Concurrent use of multiple logging applications can destabilize the client computer. Individual applications can affect each other, especially when logging from the same resources (e.g. from the same instance of Internet Explorer). Currently, our system framework does not monitor running logging tools and there is no mechanism to recover tools that hang or break during a task. This is a feature we will incorporate into a future version of the system framework.

5. FUTURE WORK
Future work on the experiment system framework will focus on further improvement of logging tool integration and monitoring. We are currently developing a graphical user interface for researchers to more easily design IIR experiments with the system and monitor progress of running experiments and the accuracy of its data logs. An extension to the experiment system framework presented in this paper is a data analysis system that allows us to fully integrate, analyse and develop models from the recorded data. In particular, we are interested in creating higher level constructs from integrated low-level logging data that can be used to personalise interactive search for users. The experiment system framework will be released as open source to the wider research community.

6. REFERENCES
[1] R. Atterer, M. Wnuk, and A. Schmidt. Knowing the User's Every Move - User Activity Tracking for Website Usability Evaluation and Implicit Interaction. In 15th International World Wide Web Conference (WWW 2006), Edinburgh, Scotland, 2006.
[2] N. Belkin. Intelligent Information Retrieval: Whose Intelligence? In Fifth International Symposium for Information Science (ISI), pages 25–31, Konstanz, Germany, 1996. Universitaetsverlag Konstanz.
[3] N. Belkin, C. Cool, A. Stein, and U. Thiel. Cases, Scripts, and Information-Seeking Strategies: On the Design of Interactive Information Retrieval Systems. Expert Systems with Applications, 9(3):379–395, 1995.
[4] A. Edmonds, K. Hawkey, M. Kellar, and D. Turnbull. Workshop on logging traces of web activity: The mechanics of data collection. In 15th International World Wide Web Conference (WWW 2006), Edinburgh, Scotland, 2006.
[5] U. Kukreja, W. E. Stevenson, and F. E. Ritter. RUI – Recording User Input from interfaces under Windows and Mac OS X. Behavior Research Methods, 38(4):656–659, 2006.
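As an illustration of the timestamp disambiguation discussed in Section 4, the following minimal sketch uses a fine-grained secondary log to restore the order of events that a coarse log collapsed into the same full second. It is one possible approach under stated assumptions, not the authors' procedure: the function name, the tuple layout, the action labels and the sample values are hypothetical, and both logs are assumed to share a clock and to use the same labels for the same user actions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple


def reorder_with_secondary(
    coarse: List[Tuple[int, str]],  # (second, action), order within a second unreliable
    fine: List[Tuple[int, str]],    # (millisecond, action), sorted by time
) -> List[Tuple[int, str]]:
    """Assign each coarse event the timestamp of its matching fine event.

    A coarse event without a match keeps the start of its second (second * 1000).
    The result is sorted by the resolved millisecond timestamp.
    """
    # Queue the fine timestamps per (second, action) so that repeated
    # actions within one second are matched in chronological order.
    candidates: Dict[Tuple[int, str], Deque[int]] = defaultdict(deque)
    for ms, action in fine:
        candidates[(ms // 1000, action)].append(ms)

    resolved = []
    for second, action in coarse:
        queue = candidates[(second, action)]
        ms = queue.popleft() if queue else second * 1000
        resolved.append((ms, action))
    return sorted(resolved)


# Made-up example: the coarse log lists the keypress before the click,
# although the fine-grained log shows that the click happened first.
coarse_log = [(3, "keypress"), (3, "click")]
fine_log = [(3120, "click"), (3480, "keypress")]
print(reorder_with_secondary(coarse_log, fine_log))
```

Matching on the action label alone is a simplification; in practice the two tools may label or split events differently, as the discussion of event structure in Section 4 notes, so a mapping between the tools' event vocabularies would be needed first.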