<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the Development of a Collaborative Search System</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sindunuraga Rikarno Putra, Kilian Grashof</string-name>
          <email>k.c.grashof@student.tudelft.nl</email>
          <email>sindunuragarikarnoputra@student.tudelft.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felipe Moraes, Claudia Hauff</string-name>
          <email>f.moraes@tudelft.nl</email>
          <email>c.hauff@tudelft.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Delft University of Technology</institution>
          ,
          <addr-line>Delft</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <abstract>
        <p>Collaborative search has been an active area of research in the IR community for many years. Despite this, there is a lack of open-source tools available to jump-start research in collaborative search. It is common for collaborative search researchers to implement their own tooling, leading to unnecessary duplication of engineering effort. In this work, we describe the design process and the challenges in implementing SearchX, an open-source collaborative search system built using modern Web standards. SearchX implements essential features of collaborative search as found in the literature. In the design process, we focused on providing support for modern research needs (such as running crowdsourcing experiments and fast prototyping). SearchX is open-sourced at https://github.com/felipemoraes/searchx-frontend (front-end) and https://github.com/felipemoraes/searchx-backend (back-end).</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Web search is generally seen as a solitary activity, as most
mainstream technologies are designed for single-user search sessions.
However, for a sufficiently complex task, collaboration during the
information seeking process is beneficial [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. A survey by
Morris [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] has shown that collaborating during search is a common
activity, albeit one that relies on ad hoc solutions such as email and instant
messaging. Morris also found a significant increase in the number
of people who collaborate during search on a regular basis, from
0.9% in 2006 to 11% in 2012. This increasing use of collaborative
search (CSE) has also been reflected in the research community,
where CSE has been an active area of research for many years.
Workshops that explicitly focus on collaborative search, and more
generally information seeking, started to appear in 2008 [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]
and continue to do so to this day, e.g. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        In contrast to single-user search, where a number of up-to-date
and open-source tools are readily available (e.g. Terrier<sup>1</sup> and
Elasticsearch<sup>2</sup>), the CSE research community currently has just one
maintained open-source option (Coagmento, cf. Table 1), despite the
fact that researchers have designed and implemented a number
of systems in the past ten years [
        <xref ref-type="bibr" rid="ref1 ref15 ref16 ref21 ref24 ref4 ref8 ref9">1, 4, 8, 9, 15, 16, 21, 24</xref>
        ]. While
Coagmento provides an extensive collaboration feature set, it
requires users to install either a browser plugin or an Android/iOS
app, making it less viable for large-scale CSE experiments, which
are often conducted with crowd workers. Furthermore, we believe
that as researchers we should have a choice of tooling, instead of relying
on a single tool.
        <fn><label>1</label><p>http://terrier.org/</p></fn>
        <fn><label>2</label><p>https://www.elastic.co/</p></fn>
      </p>
    </sec>
    <sec id="sec-2">
      <title>BACKGROUND</title>
      <p>
        Collaborative search is a subset of the more generic field of
collaborative information seeking (CIS). Golovchinsky et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] have
characterised the collaboration aspect of online CIS along four
dimensions: intent (explicit or implicit), mediation (user interface or
algorithm), concurrency (synchronous or asynchronous), and
location (remote or co-located). Morris [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] suggested two additional
dimensions: role (symmetric or asymmetric) and medium (desktop
or emerging devices). In our interpretation, collaborative search is
scoped around collaboration with explicit intent, with the mediation
and role dimensions explored in designing the system, and the other
three dimensions (concurrency, location, medium) describing the
potential application scenarios of the system.
      </p>
      <sec id="sec-2-1">
        <title>Systems for Collaborative Search</title>
        <p>
          Our analysis of previously proposed systems in Table 1 is limited to those similar to SearchX, that is,
systems that support at least synchronous and remote collaboration.
Additionally, we limit the scope to text retrieval systems, since text retrieval
is the most common use case in Web search. One of the first
attempts at such a system was the design of SearchTogether [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ],
which focused on supporting awareness, division of labour, and
persistence. Paul and Morris [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] built CoSense, an extension for
SearchTogether to improve sense-making by providing additional
views. Shah et al. [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] built upon the weaknesses of SearchTogether
and created Coagmento, which has been analyzed for its
experimental suitability in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          More recent systems were created to explore specific aspects of
online collaborations. Golovchinsky et al. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] designed Querium to
better support the collaboration in an exploratory search process,
specifically through implementing a shared document history that
ranks documents based on relevance feedback. Capra et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]
designed ResultsSpace to study mostly asynchronous collaborations
(though synchronous collaborations are possible too), therefore
features for direct communication such as a chat were not added. Yue
et al. [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] investigated the search behaviour of users and designed
CollabSearch with basic collaborative features for their purpose.
Leelanupab et al. [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] explored the effectiveness of visual snippets
for sense-making by introducing the SnapBoard feature into the
CoZpace system.
        </p>
        <p>
          As stated before, most of the listed systems are only described
in publications, and not open-sourced (or even available as
binaries). As mediation is a vital factor for CSE, we first analysed how
mediation was designed in prior works, and used that as a starting
point in implementing SearchX. As CSE solutions should require
low additional effort compared to ad hoc solutions [
          <xref ref-type="bibr" rid="ref11 ref14 ref9">9, 11, 14</xref>
          ], we
strove to implement features that look familiar to today's users, all of whom
use Web search engines.
        </p>
        <p>
          Designing Mediation. There are two main directions in
developing mediation for CSE: interface mediation adapts the search
interface towards a multi-user context, usually in the form of a
shared workspace; system mediation directly mediates the
collaboration process, mostly through re-ranking of documents [
          <xref ref-type="bibr" rid="ref4 ref9">4, 9</xref>
          ]
or modifying the distribution of documents [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Both types of
mediation are complementary to each other. The support for
collaboration features can be categorized along three lines [
          <xref ref-type="bibr" rid="ref20 ref6">6, 20</xref>
          ]: division
of labour, sharing of knowledge, and awareness. Table 1 provides a
feature comparison of prior works [
          <xref ref-type="bibr" rid="ref15 ref16 ref21 ref24 ref4 ref9">4, 9, 15, 16, 21, 24</xref>
          ] in relation
to these concepts. We now elaborate on each one and outline what
is implemented in SearchX.
        </p>
        <p>
          Division of Labour refers to the distribution of workload across
collaborators. This division can be left to the user (user-driven) or
mediated by the system. The latter can be implemented at the user
level through the assignment of roles, or at the document level through
assigning different document subsets [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. Prior systems mostly
support user-driven division of labour through the provision of
communication features. Group chat and document sharing (the
explicit recommendation of a document to a collaborator) are two
features which have been shown to be favoured by users [
          <xref ref-type="bibr" rid="ref15 ref21">15, 21</xref>
          ].
Following this, SearchX implements group chat; we argue that
document sharing in the sense above can be achieved through
the chat feature as well and thus does not warrant a separate UI
element.
        </p>
        <p>
          Sharing of Knowledge refers to the ability to share ideas and
information effectively between collaborators [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]. This can be
facilitated either through shared workspaces [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], or through the
re-ranking of search results based on relevance feedback [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Prior
systems support sharing of knowledge primarily through
providing a shared workspace with features for collectively capturing
information. Bookmarking documents (i.e. document saving) and
document rating are both relevance feedback mechanisms, with
prior systems either implementing one or the other. Document
bookmarking promotes shortlisting, which involves forming and
refining a shared list of potential resources [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]; document
rating provides a finer granularity of feedback, which is needed for
algorithmic mediation [
          <xref ref-type="bibr" rid="ref4 ref9">4, 9</xref>
          ]. Document annotation supports the
previous two features by communicating the rationale behind an
action [
          <xref ref-type="bibr" rid="ref11 ref15">11, 15</xref>
          ]. The choice of features largely depends on the
experimental setup; SearchX therefore implements all three in such a way that
individual features are easy to toggle. In SearchX, document saving is
instantiated as bookmarking, as users are familiar with this concept.
        </p>
        <p>
          Awareness is defined as “the ability to maintain some
knowledge about the situation and activities of others” [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], encompassing
knowledge of the workspace and collaborators’ actions, as well
as the ability to instantaneously notice changes on the work
conducted. Prior systems focus on providing lightweight information
regarding collaborators’ search activities (query history, document
history, and colour coding) and the overall sense-making process
(document metadata and group summary). All features were found
to be useful in past experiments, except for the document history
which Kelly et al. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] reported to provide too much information.
As group summaries are mostly beneficial for asynchronous
sessions [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], SearchX implements query history, document metadata,
and colour coding as awareness features.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>SYSTEM DESIGN</title>
      <p>SearchX is designed for the following experimental workflow: a
researcher first implements an experimental setup of their user study
using SearchX (either relying on existing features, or adding their
own). Each study participant accesses the SearchX instance through
a designated URL; all major browsers are supported. The system
then allocates a collaboration group to each set of m ≥ 2
participants (m is a configuration parameter). Throughout a group’s search
session (which may include pre/post questionnaires), SearchX
continuously captures fine-grained user activity logs.</p>
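      <p>As an illustration of the allocation step described above, the following TypeScript sketch queues participants as they arrive and forms a group as soon as m of them are waiting. It is a minimal sketch under stated assumptions and is not taken from the SearchX code base: the names GroupAllocator, join, and groupSize (standing in for m) are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```typescript
// Illustrative sketch only; not taken from the SearchX code base.
// All names (GroupAllocator, join, groupSize) are hypothetical.

type Group = { id: number; members: string[] };

class GroupAllocator {
  private waiting: string[] = [];
  private nextId = 1;

  // groupSize corresponds to the configuration parameter m (m >= 2).
  constructor(private readonly groupSize: number) {
    if (Math.floor(groupSize) !== groupSize || groupSize < 2) {
      throw new Error("groupSize must be an integer of at least 2");
    }
  }

  // Registers a participant. Returns the newly formed group once
  // groupSize participants are waiting, and null otherwise.
  join(participantId: string): Group | null {
    if (this.waiting.indexOf(participantId) !== -1) {
      return null; // already waiting; a duplicate join is ignored
    }
    this.waiting.push(participantId);
    if (this.waiting.length < this.groupSize) {
      return null;
    }
    const members = this.waiting.splice(0, this.groupSize);
    return { id: this.nextId++, members };
  }
}

const allocator = new GroupAllocator(2);
const first = allocator.join("p1");  // null: still waiting for a partner
const second = allocator.join("p2"); // a group containing p1 and p2
```
]]></preformat>
      <p>In this sketch, a duplicate join (for instance after a page refresh) is ignored instead of creating a second queue entry; other policies are equally possible.</p>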
      <p>We now discuss the architecture of SearchX and then elaborate
on the two main design directions: supporting collaboration, and
empowering research.
</p>
    </sec>
    <sec id="sec-4">
      <title>Architecture</title>
      <p>
        When implementing SearchX, we chose to start from an existing
system/interface to save development time. The options were
limited as search engine interfaces are generally not open-sourced. We
decided to use the single-user Pienapple search system [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]<sup>3</sup> as a
starting point, as it provides a generic Web search interface built
with modern Web technologies (node.js<sup>4</sup>, React<sup>5</sup>) that have
active developer communities and are supported by large
companies, which should keep the system relevant technology-wise for
the coming years. Given this base system, we vastly expanded
its functionality for collaboration and experimentation, and then
refactored the code base to be modular and reusable.
      </p>
      <p>SearchX’s client-server architecture is shown in Figure 1. The
front-end is responsible for presenting the interface, managing task
sessions, and logging user activities; the back-end is responsible
for communicating with the retrieval engine, and managing group
creation and synchronisation.</p>
      <p>Front-end. The front-end (shown in Figure 2) is developed using
React (a JavaScript library). React manages its own data model,
minimising communication with the back-end; it enforces the
creation of standalone view components, resulting in an interface
that is simple to modify and extend. As the front-end is a Web application,
any user with a modern browser can access it without requiring
additional installation.</p>
      <p>The front-end consists of three logical abstractions. The search
interface is composed of features related to searching and
collaboration, and is presented to the user during the search session. Each
feature is implemented as a standalone component, which makes
changing the layout or design of the interface efficient. Additionally,
we separate the rendering component from the data management
to make adjustments to the interface more efficient (e.g. adapting
the interface for mobile or emerging devices). The task session
implements the desired experimental setup, and controls the search
task, group creation, and the experimental procedure (e.g. a pre-test,
the search session, and then a post-test). An experimental
procedure consists of a sequence of pages, which we bootstrapped by
implementing template components for the search session and
questionnaires, which we found to be the most commonly required
templates in our experiments. The logger accepts activity data
from each component, and regularly sends the logs to the back-end
through an HTTP request for storage. This abstraction provides
a clear separation of concerns between interface features,
experimental setup, and data collection, making it clear which part of the
system needs to be changed for a particular experimental need.
        <fn><label>3</label><p>The authors kindly provided us with their source code.</p></fn>
        <fn><label>4</label><p>https://nodejs.org/</p></fn>
        <fn><label>5</label><p>https://reactjs.org/</p></fn>
      </p>
      <p>Back-end. The back-end is developed with the node.js server
environment, which directly supports asynchronous I/O operations,
making it suitable for applications requiring real-time updates. An
added benefit of node.js is its language (JavaScript): developing
both the front-end and back-end in the same language made the
development more manageable for us. The back-end provides the
application data services, which are made available to the front-end
through APIs implemented using the express<sup>6</sup> framework for
HTTP and the socket.io<sup>7</sup> library for Web sockets. We chose these
two libraries as they are currently the most common libraries for
their respective roles. We chose MongoDB<sup>8</sup> for the data storage as
it uses a dynamic data schema, providing added flexibility during
the development and modification of features.</p>
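      <p>The logger abstraction described above, which accepts activity data from each component and regularly sends it to the back-end, might be sketched as follows. This is an illustrative sketch and not the SearchX logger: the class name BufferedLogger, the event names, the batch size, and the injected send function (standing in for the HTTP request) are all assumptions made for this example.</p>
      <preformat preformat-type="code"><![CDATA[
```typescript
// Illustrative sketch only; not the SearchX logger. All names are hypothetical.

type LogEntry = { userId: string; event: string; timestamp: number; meta?: object };
type Sender = (batch: LogEntry[]) => void;

class BufferedLogger {
  private buffer: LogEntry[] = [];

  // `send` stands in for the HTTP request to the back-end; it is injected
  // so the sketch makes no assumption about endpoints or HTTP clients.
  constructor(private readonly send: Sender, private readonly maxBatch: number = 20) {}

  // Called by interface components whenever a user activity occurs.
  log(userId: string, event: string, meta?: object): void {
    this.buffer.push({ userId, event, timestamp: Date.now(), meta });
    if (this.buffer.length >= this.maxBatch) {
      this.flush();
    }
  }

  // Sends all buffered entries as one batch; could also be called on a timer.
  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.send(batch);
  }
}

const sent: LogEntry[][] = [];
const logger = new BufferedLogger((batch) => { sent.push(batch); }, 2);
logger.log("u1", "QUERY", { query: "collaborative search" });
logger.log("u1", "CLICK_RESULT", { rank: 3 }); // buffer reaches maxBatch: one batch is sent
logger.log("u1", "BOOKMARK");
logger.flush();                                 // the remaining entry is sent
```
]]></preformat>
      <p>Keeping the transport behind an injected function mirrors the separation of concerns described in the text: components only call log, and how the batch reaches storage can change independently.</p>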
      <p>The data services are categorised into four types. Retrieval
services include communication with the retrieval system through the
provider, and further processing of the retrieval results through the
regulator. Currently, we provide support for the Bing Search API
for searching the Web, and support for Elasticsearch and Indri<sup>9</sup>
servers for custom collections. Session services handle group
communication and the assignment of search tasks to users. Collaboration
services include the back-end logic of the collaborative features in the
front-end. Utility services include data collection tools such as the
log collector, which stores user logs received from the front-end,
and the URL scraper, which scrapes all documents returned to the
user. Additionally, we have a URL renderer, which makes it
possible to load external Web pages inside our Web-based system
(think of a browser inside a browser), allowing us to implement
the front-end document viewer. This offers the possibility to keep
users inside the system at all times, allowing the system to log user
interactions within the documents as well. Both the URL scraper
and URL renderer utilise a headless browser via Puppeteer<sup>10</sup>.</p>
    </sec>
    <sec id="sec-5">
      <title>Supporting Collaboration</title>
      <p>As can be seen in Table 1, only three prior systems implement
system mediation, with each of them implementing a different
type. In contrast, a number of interface mediation features (i.e. all
features apart from system mediation) are popular across systems.
SearchX also primarily supports collaboration through interface
mediation, while facilitating custom implementation of system
mediation through the addition of a regulator layer in the back-end.
We now discuss each implemented feature. Figure 2 shows the UI
of our interface with all features enabled.</p>
      <p>Group Chat. Even though knowledge sharing is already facilitated
through more specialised mediation features, direct communication
is necessary for coordination and discussions. We opted for a
familiar pop-up design where the chat window is always visible in the
interface but can be minimised when not in use (to avoid cluttering
the interface). It is implemented using converse.js<sup>11</sup>, which
provides a robust chat window out of the box. A downside, though, is
its black-box nature, which prevents direct access to the internals,
making it difficult to extend (e.g. we are not able to automatically
assign usernames). Currently the benefits still outweigh this
limitation, but we will consider creating our own implementation in the
future if needed.
        <fn><label>6</label><p>https://expressjs.com/</p></fn>
        <fn><label>7</label><p>https://socket.io/</p></fn>
        <fn><label>8</label><p>https://www.mongodb.com/</p></fn>
        <fn><label>9</label><p>http://www.lemurproject.org/lemur/</p></fn>
        <fn><label>10</label><p>https://github.com/GoogleChrome/puppeteer</p></fn>
      </p>
      <p>
        Bookmarking. Apart from functioning as a means to bookmark
documents for later revisits, document bookmarking also promotes
the shortlisting strategy which involves curating a shared list of
potential documents [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Given the central role of bookmarking
in such collaborative efforts, we wanted to make it more accessible;
we therefore placed the bookmark button directly next to each
search result. The list of bookmarked documents is always visible
in the sidebar to promote awareness of collaborators’ actions; it is
sorted by time, with the most recent bookmarks at the top. In
addition, users benefit in the sense-making process when given the
option to manage and rearrange their bookmarks [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], therefore
SearchX also implements pinned/starred bookmarks.
      </p>
      <p>Document Rating. Document rating is mainly considered a
fine-grained source of information for relevance feedback. To avoid
cluttering the search engine result page (SERP), we present the
rating buttons not on the SERP, but inside the document viewer;
the added benefit is that users can only rate once they have seen
the document. Document rating is implemented as a like/dislike
button to leverage users’ familiarity with this type of interaction.</p>
      <p>Document Annotation. Unlike existing systems, we implemented
annotations as a message thread similar to chat interfaces. This
setup highlights the bidirectional nature of the annotation process,
promoting sense-making through the exchange of opinions. The
annotation interface is presented inside the document viewer, directly
next to the document, to make adding new annotations a quick
process. A feature planned for future development is the ability to
highlight specific parts of the document along with an annotation.
        <fn><label>11</label><p>https://conversejs.org/</p></fn>
      </p>
      <p>
        Query History. This feature has been implemented in all prior
systems; it provides awareness of collaborators’ search activities,
allowing users to avoid duplication of effort and be inspired by
their collaborators’ choice of keywords [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Its implementation is
similar across prior systems: as a list of queries that can be clicked
on to immediately open results for that query. SearchX provides a
scrollable list of recent queries in the sidebar.
      </p>
      <p>Document Metadata. This feature is presented below each SERP
entry to provide information about collaborators’ activities on the
document. This allows users to quickly identify documents that are
considered relevant by their group. The information is presented
using a number of simple icons.</p>
      <p>Colour Coding. We colour code elements of the interface that are
associated with a particular collaborator’s actions (such as querying
and bookmarking). This allows users within the group to
differentiate between the activities and contributions of each individual
collaborator. The colours are generated randomly.</p>
      <p>System Mediation. As stated before, SearchX does not implement
a specific form of system mediation, but facilitates such an
implementation if needed. In Section 2 we outlined that system mediation
is usually performed in the form of modifications to the retrieved
list of results (re-ranking). We have designed the retrieval service
in the back-end to also contain a regulator layer that enables us to
adjust the SERP sent to each collaborator based on the actions of
the group’s members.</p>
      <p>[Figure panels: (a) Search result page; (b) Document viewer]</p>
      <p>The regulator layer collects the necessary input data for system
mediation by fetching and aggregating it from the database. One
example of such data is the current collection of bookmarked and
up-voted results for the entire group, which can be used as input
for relevance feedback. The input data can be sent to the search
provider in order to incorporate the data in the retrieval algorithm.
This is useful for incorporating features of the search provider into
system mediation. For example, Indri supports relevance feedback
based on a set of results sent to it. The input data can also be used directly
in the regulator to re-rank or filter the list of results. This option
can be used, e.g., for division of labour, by filtering the documents
that are assigned to each user according to a distribution criterion.</p>
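      <p>The two uses of the regulator described above (re-ranking based on group feedback, and filtering for division of labour) might be sketched as follows. This is an illustrative sketch and not the SearchX regulator: the function names, the Result type, and the round-robin distribution criterion are assumptions chosen for brevity.</p>
      <preformat preformat-type="code"><![CDATA[
```typescript
// Illustrative sketch only; not the SearchX regulator. All names are hypothetical.

type Result = { url: string; title: string };

// Re-ranks results so that documents bookmarked by the group come first.
// The relative order within each of the two partitions is preserved.
function promoteBookmarked(results: Result[], groupBookmarks: string[]): Result[] {
  const bookmarked = results.filter((r) => groupBookmarks.indexOf(r.url) !== -1);
  const others = results.filter((r) => groupBookmarks.indexOf(r.url) === -1);
  return bookmarked.concat(others);
}

// Division of labour with a round-robin criterion (an assumption): the result
// at position i is shown only to the member with index i mod groupSize.
function assignToMember(results: Result[], memberIndex: number, groupSize: number): Result[] {
  return results.filter((_, i) => i % groupSize === memberIndex);
}

const serp: Result[] = [
  { url: "a", title: "A" },
  { url: "b", title: "B" },
  { url: "c", title: "C" },
  { url: "d", title: "D" },
];

const reranked = promoteBookmarked(serp, ["c"]); // order of urls: c, a, b, d
const forMember0 = assignToMember(serp, 0, 2);   // urls: a, c
const forMember1 = assignToMember(serp, 1, 2);   // urls: b, d
```
]]></preformat>
      <p>Both functions take the original list and return a new one, so a regulator built this way could chain several such steps before the SERP is sent to a collaborator.</p>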
    </sec>
    <sec id="sec-6">
      <title>Empowering Research</title>
      <p>We now elaborate on how SearchX was designed as a tool for
research.</p>
      <p>Availability and Accessibility. SearchX is open-sourced for use
and development by other researchers. We iterated on the
installation process a number of times to make it simple and effective. We
provide three example implementations of different experimental
setups (synchronous collaborative search, asynchronous
collaborative search, single-user search) to be expanded upon. We also put
significant effort into extensively documenting how researchers can
modify the system, e.g. by adding new UI features, or by changing
the retrieval system in the back-end.</p>
      <p>
        Study Creation. Currently, modifying the system requires
programming knowledge, as we do not yet provide a graphical interface
for creating user studies. However, we have created reusable
implementations of common components in the experimental setup:
questionnaires and the search session. The questionnaires are
implemented using SurveyJS<sup>12</sup>, which allows defining questionnaires
directly using JSON. We have created a React component that
abstracts over SurveyJS, adding logging features and flow control.
We did the same for the search session, with a component that abstracts over
the search interface, adding session-related logs, flow control, and
a task bar describing the search task. We found this to simplify
the creation of new user studies, since it takes away much of the
boilerplate code needed in configuring the experimental procedure.
        <fn><label>12</label><p>https://surveyjs.io/</p></fn>
      </p>
      <p>Data Collection. A requirement for a CSE user study is the
collection of user activity logs. In SearchX, we have added logging to all
interactive components of the system so that it records when a user
hovers over or directly interacts with a component (e.g. clicking,
querying, opening a document). We also log session-related data
(e.g. starting/finishing the search session, submitting a
questionnaire) and interactions with the browser (e.g. changing tabs), which
helps in understanding all actions executed by a user. All logs are
captured directly by the interface without the need for third-party
plugins installed by the user. All logs are defined and implemented
in the front-end, while the back-end only handles storage of logs,
making it easy to modify the logs or create additional logs.</p>
      <p>
        Interface Guide. Prior works report that some features of their
systems were not explored much by users because the users did not know
or understand them [
        <xref ref-type="bibr" rid="ref15 ref9">9, 15</xref>
        ]. We address this issue by adding a guided
walk-through of the interface (built using Intro.js<sup>13</sup>),
which explains step by step what each feature is meant to do. This
interface guide is launched when a user first starts the search
session, ensuring that they are aware of the features we want them to
use before moving on to the search task.
      </p>
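      <p>To illustrate how a questionnaire can be defined directly in JSON, the following sketch shows a small survey definition as a TypeScript object. The questions are invented for this example and the helper requiredQuestions is hypothetical. The property names (pages, elements, type, name, title, isRequired, choices) are assumed to follow the SurveyJS survey-definition format; the SurveyJS documentation remains the authoritative reference.</p>
      <preformat preformat-type="code"><![CDATA[
```typescript
// Illustrative sketch only. The questionnaire content is invented for this
// example; the layout is assumed to follow the SurveyJS JSON format.

const preTest = {
  title: "Pre-test questionnaire",
  pages: [
    {
      name: "background",
      elements: [
        {
          type: "radiogroup",
          name: "search-frequency",
          title: "How often do you search the Web together with others?",
          isRequired: true,
          choices: ["Never", "Monthly", "Weekly", "Daily"],
        },
        {
          type: "comment",
          name: "topic-knowledge",
          title: "Describe what you already know about the topic.",
          isRequired: false,
        },
      ],
    },
  ],
};

// Hypothetical helper: lists the names of all required questions, e.g. so
// that flow control can check them before moving to the next page.
function requiredQuestions(survey: typeof preTest): string[] {
  const names: string[] = [];
  for (const page of survey.pages) {
    for (const element of page.elements) {
      if (element.isRequired) names.push(element.name);
    }
  }
  return names;
}

const required = requiredQuestions(preTest);
```
]]></preformat>
      <p>Because the definition is plain data, it can be stored alongside the experimental setup and changed without touching the component that renders it.</p>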
    </sec>
    <sec id="sec-7">
      <title>CHALLENGES</title>
      <p>We now discuss issues that are usually hidden from view: things
that did not go as intended or slowed down the process. While the
design decisions of SearchX were taken by the last two authors
of this paper, the engineering effort of making the design a reality
was made by the first three authors (two MSc computer science
students working on their thesis and a PhD student).
        <fn><label>13</label><p>https://introjs.com/</p></fn>
      </p>
      <p>Iterating on the Experimental Setup. We initially implemented
the basic version of SearchX with a paper deadline in mind. This led
to a working but not very modular version of SearchX, which we
realised when attempting to implement a number of CSE experiments:
for each experiment, multiple files in both the front-end and
back-end required changes. Since we wanted the system to be reusable
for different experiments, we invested effort in refactoring the
code for a more intuitive experimental setup. We started fixing this
in the front-end by separating all code related to the
experimental setup from the search interface and encapsulating it
in reusable React components. While this simplified the
experimental setup, the communication with the back-end remained a
complex issue. We now limit the responsibility of the back-end
to group management and synchronisation only, allowing us to
implement the limited range of functionality directly inside the
task components. Had we spent more time on the initial
design, we would have saved substantial development time that
instead went into iterating a number of times on the architecture
and the interactions among the components.</p>
      <sec id="sec-7-1">
        <title>Deploying a Crowdsourced Study</title>
        <p>
          In an effort to support online
studies, we adapted SearchX for crowdsourced studies. During a
first CSE pilot on Crowdflower, we found crowd workers to not
be overly motivated to properly execute our assigned
collaborative search tasks (many tasks on Crowdflower, such as image labelling, tend to be short
and do not require elaborate instructions).
We found two ways around this issue: (i) a new platform and (ii)
actively encouraging compliant behaviour. We switched to the
research-focused Prolific<sup>14</sup> platform, which was shown to provide
higher quality data [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], something we found to be true as well
in our work. We also spent significant development time on
monitoring workers’ attentiveness and actively keeping them on track.
We logged browser interactions (tab changes, context menu use) and
notified workers about their tab changes in real time (after n tab
changes a worker is no longer paid). We also added quality control
questions and disabled copy and paste operations in the
questionnaires. All these steps improved the quality of the data we collected, but
were slow to implement, as we discovered them as solutions
to worker compliance issues one by one, after running another (and
another) pilot study.
        </p>
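        <p>The tab-change policy described above (notify the worker on each tab change, and stop payment after n changes) might be sketched as follows. This is an illustrative sketch and not the SearchX implementation: the names AttentionMonitor, recordTabChange, and maxTabChanges (standing in for n), as well as the warning texts, are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```typescript
// Illustrative sketch only; not the SearchX implementation. Names are hypothetical.

type AttentionStatus = { tabChanges: number; warning: string; paid: boolean };

class AttentionMonitor {
  private tabChanges = 0;

  // maxTabChanges corresponds to n in the text: once a worker reaches
  // n tab changes, the worker is no longer paid.
  constructor(private readonly maxTabChanges: number) {}

  // Would be called from a browser event handler (for example on the
  // document's "visibilitychange" event) each time the worker leaves the tab.
  recordTabChange(): AttentionStatus {
    this.tabChanges += 1;
    const remaining = this.maxTabChanges - this.tabChanges;
    if (remaining > 0) {
      return {
        tabChanges: this.tabChanges,
        warning: `Please stay on this page (${remaining} tab change(s) left).`,
        paid: true,
      };
    }
    return { tabChanges: this.tabChanges, warning: "Tab-change limit reached.", paid: false };
  }
}

const monitor = new AttentionMonitor(3);
const afterFirst = monitor.recordTabChange();  // still paid, 2 changes left
const afterSecond = monitor.recordTabChange(); // still paid, 1 change left
const afterThird = monitor.recordTabChange();  // limit reached, no longer paid
```
]]></preformat>
        <p>Keeping the counting logic free of browser APIs makes the policy easy to unit-test; only the event wiring depends on the browser.</p>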
        <p>Synchronising Group Sessions. Running a synchronous search
session through a crowdsourcing platform is tricky, since workers
are not available right away, therefore a type of “waiting room” is
needed for the grouping so that workers assigned to a single group
start their search session at the same time. This problem becomes
particularly intense as the group size increases—an experiment
with 20 workers requires 20 workers to accept the task at roughly
the same time. Another issue we encountered was that workers
were disconnected from the grouping process when the page was
refreshed/closed during the waiting period, resulting in the worker
not being able to continue the study. We currently just warn workers
that attempt to refresh/leave the Web page running SearchX but
14https://prolific.ac/
a better way to handle (and entertain?) workers in the “waiting
room” is needed to enable CSE experiments with large group sizes.
Implementing a Document Viewer. Ensuring that
crowdworkers remain within SearchX (and rescinding the payment otherwise) is
a good way of ensuring compliance, but this idea breaks
down when we want workers to interact with the SERP, click
on links and view documents in another browser tab. We
thus needed to implement a document viewer (again requiring
valuable development time) that allows users to view a document
within SearchX. This is rather straightforward for static resources
such as text or images. However, it is not possible to render another
Web page directly inside SearchX because of CORS (cross-origin
resource sharing) restrictions. We thus had to render the URL in
the back-end and pass the rendered HTML to an iframe in the
front-end. This is an imperfect solution, since the resulting
page is static with most interactive elements disabled, and at times
the rendering is flawed. We are still improving this aspect of
SearchX. Currently, to alleviate the issue of imperfect renders, we
add a button that opens the Web page in another tab.</p>
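        <p>The last step of the document viewer, handing static HTML to an iframe, could look roughly like the sketch below. It assumes the page has already been fetched or rendered in the back-end, and it only drops script elements and inline event handlers, so it is not a complete sanitiser. The names StaticHTML, to_static_html and iframe_markup are hypothetical and do not come from the SearchX code base.</p>
        <preformat><![CDATA[
```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class StaticHTML(HTMLParser):
    """Re-emits HTML without script elements and inline event handlers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0    # greater than 0 while inside a script element
        self.in_style = False  # style content is passed through unescaped

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        if tag == "style":
            self.in_style = True
        # Drop onclick, onload, ... so the embedded page stays static.
        kept = [(k, v) for k, v in attrs if not k.startswith("on")]
        rendered = "".join(
            f' {k}="{escape(v, quote=True)}"' if v is not None else f" {k}"
            for k, v in kept
        )
        self.parts.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag == "script":
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if tag == "style":
            self.in_style = False
        if not self.skip_depth and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth:
            return
        self.parts.append(data if self.in_style else escape(data, quote=False))


def to_static_html(page_html: str) -> str:
    """Return the page with scripts and inline handlers removed."""
    parser = StaticHTML()
    parser.feed(page_html)
    parser.close()
    return "".join(parser.parts)


def iframe_markup(page_html: str) -> str:
    """Wrap the static page in a sandboxed iframe via the srcdoc attribute."""
    srcdoc = escape(to_static_html(page_html), quote=True)
    return f'<iframe sandbox srcdoc="{srcdoc}"></iframe>'
```
]]></preformat>
        <p>Because the embedded markup is served by the application itself, the browser does not need to load the third-party page directly. The price is the one noted above: interactive elements no longer work.</p>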
      </sec>
      <sec id="sec-7-2">
        <title>Dynamic Result List and System Mediation</title>
        <p>If the regulator layer of SearchX is used, the list of results is no longer a function
of only the user’s query, because it can change based on other input
data as described in Section 3.2. In our experience this can lead to
several challenges.</p>
        <p>Data analyses that rely on the state of the SERP a user observed are
complicated by the use of system mediation features. This is because the
SERP that a user observed cannot be replayed from their query alone,
since it also depends on the other input data that the regulator
layer made use of. A possible solution to this problem is to store the
search results that were displayed. We implemented a logger that records
each search result visible on the user’s screen to facilitate
our data analyses.</p>
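        <p>A logger of this kind might be sketched as below. The record fields and the function names log_visible_results and replay_serp are assumptions made for illustration. The point is only that storing what was displayed makes the observed SERP recoverable even when two users issue the same query and see different lists.</p>
        <preformat><![CDATA[
```python
import json
import time


def log_visible_results(log, user_id, query, visible_results, now=None):
    """Append one JSON line per result that was visible on the user's screen."""
    timestamp = time.time() if now is None else now
    for rank, result in enumerate(visible_results, start=1):
        log.append(json.dumps({
            "time": timestamp,
            "user": user_id,
            "query": query,
            "rank": rank,
            "url": result["url"],
        }))


def replay_serp(log, user_id, query):
    """Rebuild the URLs a user saw for a query, ordered by time and rank."""
    entries = [json.loads(line) for line in log]
    seen = [e for e in entries if e["user"] == user_id and e["query"] == query]
    return [e["url"] for e in sorted(seen, key=lambda e: (e["time"], e["rank"]))]
```
]]></preformat>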
        <p>Another aspect that needs to be considered when implementing
system mediation is the interaction of the mediation features with
the search interface. A result list that is modified can lead to a
jarring user experience, especially if the list updates in real time.
We consider it better to apply changes to the SERP after a user
initiates an action (e.g. a new query or page change): when
the update due to system mediation is combined with a user-initiated update of
the SERP, the user is not confused by the page changing. Another
approach to prevent confusion is to indicate to users when results
have been omitted or re-ranked and to give them the option to
enable and disable mediation features. In this manner, users are
given the autonomy to decide when system mediation features are
useful.</p>
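        <p>The deferred-update idea can be illustrated with a small sketch. The class MediatedSerp and its method names are hypothetical. Mediated result lists are buffered and shown only when the user triggers an update, and a flag lets the user switch mediation off.</p>
        <preformat><![CDATA[
```python
class MediatedSerp:
    """Holds the list shown to the user and buffers mediation updates."""

    def __init__(self, results):
        self.displayed = list(results)
        self.pending = None  # latest mediated list that is not yet shown
        self.mediation_enabled = True

    def on_mediation_update(self, results):
        """Called when mediation yields a new list; nothing is redrawn yet."""
        if self.mediation_enabled:
            self.pending = list(results)

    def on_user_action(self):
        """Called on a new query or page change; applies any buffered update."""
        if self.pending is not None:
            self.displayed = self.pending
            self.pending = None
        return self.displayed
```
]]></preformat>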
      </sec>
    </sec>
    <sec id="sec-8">
      <title>CONCLUSIONS</title>
      <p>We have presented SearchX, a collaborative search system whose
design and implementation are an ongoing process, born out of the
unmet need for an open-source CSE tool that can be deployed
online without the need for additional installations (one of the main
reasons for ruling out Coagmento for our purposes). The version
we have just described provides a good starting point for online
collaborative search research. Having a well-functioning and
modular system is the basic requirement for the research we actually
want to conduct with SearchX: large-scale (think tens or hundreds
of users) collaborative search. We will continue to develop SearchX
and by open-sourcing the system we hope that other researchers
can benefit from it too.</p>
    </sec>
    <sec id="sec-9">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This research has been supported by NWO projects LACrOSSE
(612.001.605) and SearchX (639.022.722).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Saleema</given-names>
            <surname>Amershi</surname>
          </string-name>
          and Meredith Ringel Morris.
          <year>2008</year>
          .
          <article-title>CoSearch: A System for Co-located Collaborative Web Search</article-title>
          . In CHI '
          <volume>08</volume>
          .
          <fpage>1647</fpage>
          -
          <lpage>1656</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Leif</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          , Jeremy Pickens, Chirag Shah, Laure Soulier, and
          <string-name>
            <given-names>Lynda</given-names>
            <surname>Tamine</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Second International Workshop on the Evaluation of Collaborative Information Seeking and Retrieval (ECol'17)</article-title>
          . In CHIIR '
          <volume>17</volume>
          .
          <fpage>429</fpage>
          -
          <lpage>431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Martynas</given-names>
            <surname>Buivys</surname>
          </string-name>
          and
          <string-name>
            <given-names>Leif</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Pineapple Search: An Integrated Search Interface to Support Finding, Refinding and Sharing</article-title>
          . In ASIST '
          <volume>16</volume>
          .
          <fpage>122:1</fpage>
          -
          <lpage>122:5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Robert</given-names>
            <surname>Capra</surname>
          </string-name>
          , Annie T. Chen, Katie Hawthorne, Jaime Arguello,
          <string-name>
            <given-names>Lee</given-names>
            <surname>Shaw</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Gary</given-names>
            <surname>Marchionini</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Design and evaluation of a system to support collaborative search</article-title>
          .
          <source>ASIST 49</source>
          ,
          <issue>1</issue>
          (
          <year>2012</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Colum</given-names>
            <surname>Foley</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alan F.</given-names>
            <surname>Smeaton</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Synchronous Collaborative Information Retrieval: Techniques and Evaluation</article-title>
          . In ECIR '
          <volume>09</volume>
          .
          <fpage>42</fpage>
          -
          <lpage>53</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Colum</given-names>
            <surname>Foley</surname>
          </string-name>
          and
          <string-name>
            <given-names>Alan F.</given-names>
            <surname>Smeaton</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Division of Labour and Sharing of Knowledge for Synchronous Collaborative Information Retrieval</article-title>
          .
          <source>IPM 46</source>
          ,
          <issue>6</issue>
          (
          <year>2010</year>
          ),
          <fpage>762</fpage>
          -
          <lpage>772</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Jonathan</given-names>
            <surname>Foster</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Collaborative Information Seeking and Retrieval</article-title>
          .
          <source>Annual Review of Information Science and Technology 40</source>
          ,
          <issue>1</issue>
          (Dec.
          <year>2006</year>
          ),
          <fpage>329</fpage>
          -
          <lpage>356</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Gene</given-names>
            <surname>Golovchinsky</surname>
          </string-name>
          , John Adcock, Jeremy Pickens, Pernilla Qvarfordt, and
          <string-name>
            <given-names>Maribeth</given-names>
            <surname>Back</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Cerchiamo: a collaborative exploratory search tool</article-title>
          . Computer Supported Cooperative Work (
          <year>2008</year>
          ),
          <fpage>8</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Gene</given-names>
            <surname>Golovchinsky</surname>
          </string-name>
          , Abdigani Diriye, and
          <string-name>
            <given-names>Tony</given-names>
            <surname>Dunnigan</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>The Future is in the Past: Designing for Exploratory Search</article-title>
          . In IIIX '
          <volume>12</volume>
          .
          <fpage>52</fpage>
          -
          <lpage>61</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Gene</given-names>
            <surname>Golovchinsky</surname>
          </string-name>
          , Jeremy Pickens, and
          <string-name>
            <given-names>Maribeth</given-names>
            <surname>Back</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>A taxonomy of collaboration in online information seeking</article-title>
          .
          <source>JCDL Workshop on Collaborative Information Retrieval</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Ryan</given-names>
            <surname>Kelly</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stephen J.</given-names>
            <surname>Payne</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Collaborative Web Search in Context: A Study of Tool Use in Everyday Tasks</article-title>
          . In CSCW '
          <volume>14</volume>
          .
          <fpage>807</fpage>
          -
          <lpage>819</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Teerapong</given-names>
            <surname>Leelanupab</surname>
          </string-name>
          , Hannarin Kruajirayu, and
          <string-name>
            <given-names>Nont</given-names>
            <surname>Kanungsukkasem</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Snapboard: A Shared Space of Visual Snippets - A Study in Individual and Asynchronous Collaborative Web Search</article-title>
          .
          In
          <source>Information Retrieval Technology</source>
          . Springer International Publishing,
          <fpage>161</fpage>
          -
          <lpage>173</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Olivier</given-names>
            <surname>Liechti</surname>
          </string-name>
          .
          <year>2000</year>
          .
          <article-title>Awareness and the WWW: An Overview</article-title>
          .
          <source>SIGGROUP Bulletin 21</source>
          ,
          <issue>3</issue>
          (Dec.
          <year>2000</year>
          ),
          <fpage>3</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Meredith Ringel</given-names>
            <surname>Morris</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Collaborative Search Revisited</article-title>
          . In CSCW '
          <volume>13</volume>
          .
          <fpage>1181</fpage>
          -
          <lpage>1192</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Meredith Ringel</given-names>
            <surname>Morris</surname>
          </string-name>
          and
          <string-name>
            <given-names>Eric</given-names>
            <surname>Horvitz</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>SearchTogether: An Interface for Collaborative Web Search</article-title>
          .
          <source>In UIST '07</source>
          .
          <fpage>3</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Sharoda A.</given-names>
            <surname>Paul</surname>
          </string-name>
          and Meredith Ringel Morris.
          <year>2009</year>
          .
          <article-title>CoSense: Enhancing Sensemaking for Collaborative Web Search</article-title>
          . In CHI '
          <volume>09</volume>
          .
          <fpage>1771</fpage>
          -
          <lpage>1780</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Eyal</given-names>
            <surname>Peer</surname>
          </string-name>
          , Laura Brandimarte, Sonam Samat, and
          <string-name>
            <given-names>Alessandro</given-names>
            <surname>Acquisti</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Beyond the Turk: Alternative platforms for crowdsourcing behavioral research</article-title>
          .
          <source>Journal of Experimental Social Psychology</source>
          <volume>70</volume>
          (
          <year>2017</year>
          ),
          <fpage>153</fpage>
          -
          <lpage>163</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Jeremy</given-names>
            <surname>Pickens</surname>
          </string-name>
          , Gene Golovchinsky, and Meredith Ringel Morris.
          <year>2009</year>
          .
          <article-title>Proceedings of 1st International Workshop on Collaborative Information Seeking</article-title>
          .
          <source>CoRR abs/0908.0583</source>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Steven</given-names>
            <surname>Poltrock</surname>
          </string-name>
          , Jonathan Grudin, Susan Dumais, Raya Fidel, Harry Bruce, and Annelise Mark Pejtersen.
          <year>2003</year>
          .
          <article-title>Information Seeking and Sharing in Design Teams</article-title>
          .
          <source>In 2003 International ACM SIGGROUP Conference on Supporting Group Work (GROUP '03)</source>
          . ACM,
          <fpage>239</fpage>
          -
          <lpage>247</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Chirag</given-names>
            <surname>Shah</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gary</given-names>
            <surname>Marchionini</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Awareness in Collaborative Information Seeking</article-title>
          .
          <source>ASIST 61</source>
          ,
          <issue>10</issue>
          (
          <year>2010</year>
          ),
          <fpage>1970</fpage>
          -
          <lpage>1986</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Chirag</given-names>
            <surname>Shah</surname>
          </string-name>
          , Gary Marchionini, and
          <string-name>
            <given-names>Diane</given-names>
            <surname>Kelly</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Learning Design Principles for a Collaborative Information Seeking System</article-title>
          .
          <source>In CHI EA '09</source>
          .
          <fpage>3419</fpage>
          -
          <lpage>3424</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Laure</given-names>
            <surname>Soulier</surname>
          </string-name>
          and
          <string-name>
            <given-names>Lynda</given-names>
            <surname>Tamine</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>On the Collaboration Support in Information Retrieval</article-title>
          .
          <source>Comput. Surveys</source>
          <volume>50</volume>
          ,
          <issue>4</issue>
          , Article 51 (Aug.
          <year>2017</year>
          ), 34 pages.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Ke-Thia</given-names>
            <surname>Yao</surname>
          </string-name>
          , Robert Neches, In-Young Ko, Ragy Eleish, and
          <string-name>
            <given-names>Sameer</given-names>
            <surname>Abhinkar</surname>
          </string-name>
          .
          <year>1999</year>
          .
          <article-title>Synchronous and asynchronous collaborative information space analysis tools</article-title>
          .
          <source>In 1999 International Workshops on Parallel Processing. IEEE</source>
          ,
          <fpage>74</fpage>
          -
          <lpage>79</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Zhen</given-names>
            <surname>Yue</surname>
          </string-name>
          , Shuguang Han, and
          <string-name>
            <given-names>Daqing</given-names>
            <surname>He</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Search tactics in collaborative exploratory web search</article-title>
          .
          <source>In HCIR</source>
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>