On the Development of a Collaborative Search System
         Sindunuraga Rikarno Putra, Kilian Grashoff                                Felipe Moraes, Claudia Hauff
                 Delft University of Technology                                      Delft University of Technology
                     Delft, The Netherlands                                               Delft, The Netherlands
    {sindunuragarikarnoputra,k.c.grashoff}@student.tudelft.nl                         {f.moraes,c.hauff}@tudelft.nl

ABSTRACT                                                                   For these reasons, we have designed and implemented SearchX,
Collaborative search is an active area of research in the IR com-       a CSE system built on modern Web standards, allowing it to be
munity (and has been for many years)—despite this, there is a lack      accessed from multiple platforms without the need for user-side
of open-source tools available to jump-start research in               installations. We designed SearchX specifically for CSE research
collaborative search. It is common for collaborative search researchers   and provide comprehensive documentation to enable others to
to implement their own tooling, leading to unnecessary duplicate        implement and run their own CSE experiments.
engineering efforts. In this work, we describe the design process
and challenges in implementing SearchX, an open-source collabo-         2   BACKGROUND
rative search system, built using modern Web standards. SearchX         Collaborative search is a subset of the more generic field of collab-
implements essential features of collaborative search as found          orative information seeking (CIS). Golovchinsky et al. [10] have
in the literature. In the design process, we focused on providing       characterised the collaboration aspect of online CIS along four di-
support for modern research needs (such as running crowdsourc-          mensions: intent (explicit or implicit), mediation (user interface or
ing experiments and fast prototyping). We open-sourced SearchX at       algorithm), concurrency (synchronous or asynchronous), and loca-
https://github.com/felipemoraes/searchx-frontend (front-end) and        tion (remote or co-located). Morris [14] suggested two additional
https://github.com/felipemoraes/searchx-backend (back-end).             dimensions: role (symmetric or asymmetric) and medium (Desktop
                                                                        or emerging devices). In our interpretation, collaborative search is
1     INTRODUCTION
                                                                        scoped around collaboration of explicit intent, with the mediation
Web search is generally seen as a solitary activity, as most main-      and role dimension explored in designing the system, and the other
stream technologies are designed for single-user search sessions.       three dimensions (concurrency, location, medium) describing the
However, for a sufficiently complex task, collaboration during the      potential application scenarios of the system.
information seeking process is beneficial [7]. A survey by Mor-
ris [14] has shown that collaborating during search is a common         Systems for Collaborative Search. Our analysis of previously
activity, albeit using ad hoc solutions such as email and instant       proposed systems in Table 1 is limited to those similar to SearchX—
messaging. Morris also found a significant increase in the number       systems that support at least synchronous and remote collaborations.
of people who collaborate during search on a regular basis, from        Additionally, we limit the scope to text retrieval systems, since it
0.9% in 2006 to 11% in 2012. This increasing use of collaborative       is the most common use case in Web search. One of the first
search (CSE) has also been reflected in the research community,         attempts at such a system was the design of SearchTogether [15],
where CSE has been an active area of research for many years.           which focused on supporting awareness, division of labour, and
Workshops that explicitly focus on collaborative search (and more       persistence. Paul and Morris [16] built CoSense, an extension for
generally information seeking) started to appear in 2008 [18]           SearchTogether to improve sense-making by providing additional
and continue to do so to this day, e.g. [2].                            views. Shah et al. [21] built upon the weaknesses of SearchTogether
   In contrast to single-user search where a number of up-to-date       and created Coagmento, which has been analyzed for its experimen-
and open-source tools are readily available (e.g. Terrier1 and          tal suitability in [11].
Elasticsearch2), the CSE research community currently has just one         More recent systems were created to explore specific aspects of
maintained open-source option (Coagmento, cf. Table 1) despite the      online collaborations. Golovchinsky et al. [9] designed Querium to
fact that researchers have designed and implemented a number            better support the collaboration in an exploratory search process,
of systems in the past ten years [1, 4, 8, 9, 15, 16, 21, 24]. While    specifically through implementing a shared document history that
Coagmento provides an extensive collaboration feature set, it re-       ranks documents based on relevance feedback. Capra et al. [4] de-
quires users to either install a browser plugin or an Android/iOS       signed ResultsSpace to study mostly asynchronous collaborations
app, making it less viable for large-scale CSE experiments which        (though synchronous collaborations are possible too), therefore fea-
are often conducted with crowd workers. Furthermore, we believe         tures for direct communication such as a chat were not added. Yue
as researchers we should have a choice of tooling, instead of relying   et al. [24] investigated the search behaviour of users and designed
on a single one.                                                        CollabSearch with basic collaborative features for their purpose.
                                                                        Leelanupab et al. [12] explored the effectiveness of visual snippets
1 http://terrier.org/
2 https://www.elastic.co/                                               for sense-making by introducing the SnapBoard feature into the
                                                                        CoZpace system.
                                                                           As stated before, most of the listed systems are only described
DESIRES 2018, August 2018, Bertinoro, Italy                             in publications, and not open-sourced (or even available as bina-
© 2018 Copyright held by the author(s).
                                                                        ries). As mediation is a vital factor for CSE, we first analysed how
Table 1: Feature comparison of existing remote collaborative search systems and SearchX (ordered by publication year of
the first paper describing the system). A dash − indicates that this information is not available. Language and platform ab-
breviations: JS=JavaScript, BP=Browser Plugin, IE=Internet Explorer, FF=Firefox, GC=Google Chrome. Note that we only list
programming languages listed in the respective papers (if no open-source code is available). †The Coagmento iOS app is only
available in Apple’s US app store.

                                 SearchTogether [15]    CoSense [16]   Coagmento [21]      Querium [9]            ResultsSpace [4]       CollabSearch [24]   CoZpace [12]   SearchX
 Division of Labour
 Group Chat                               ✓                  ✓               ✓                  ✓                                               ✓                 ✓           ✓
 Document Sharing                         ✓                                  ✓                  ✓
 Sharing of Knowledge
 Bookmarking / Document Saving                                               ✓                                                                  ✓                             ✓
 Document Rating                          ✓                  ✓                                  ✓                        ✓                                        ✓           ✓
 Document Annotation                      ✓                  ✓               ✓                                                                                    ✓           ✓
 Awareness
 Query History                            ✓                  ✓               ✓                  ✓                        ✓                      ✓                 ✓           ✓
 Document History                                            ✓               ✓                  ✓
 Document Metadata                        ✓                  ✓               ✓                  ✓                        ✓                                        ✓           ✓
 Group Summary                                               ✓                                  ✓                                               ✓                 ✓
 Colour Coding                                               ✓                                                                                  ✓                             ✓
 System Mediation                    Split Search            –               –          Ranked Doc. History   Re-ranked Search Results          –                 –           –
 Tool Availability
 Functioning                             ✗                   –               ✓                  –                       –                       –                 –           ✓
 Open Source                             –                   –               ✓                  –                       –                       –                 –           ✓
 Last Update                            2009                 –              2018                –                       –                       –                 –          2018
 Language                                –                   –           PHP & JS               JS                     PHP                      –                 JS          JS
 Platform                              BP (IE)               –     iOS†, Android, BP (FF, GC)   Web                     Web                     Web               Web         Web


mediation was designed in prior works, and used that as a starting                              Sharing of Knowledge refers to the ability to share ideas and
point in implementing SearchX. As CSE solutions should require                               information effectively between collaborators [23]. This can be
low additional effort compared to ad hoc solutions [9, 11, 14], we                           facilitated either through shared workspaces [19], or through the
strove to implement features that look familiar to users (who all                            re-ranking of search results based on relevance feedback [5]. Prior
use Web search engines) today.                                                               systems support sharing of knowledge primarily through provid-
Designing Mediation. There are two main directions in devel-                                 ing a shared workspace with features for collectively capturing
oping mediation for CSE: interface mediation adapts the search                               information. Bookmarking documents (i.e. document saving) and
interface towards a multi-user context, usually in the form of a                             document rating are both relevance feedback mechanisms, with
shared workspace; system mediation directly mediates the collab-                             prior systems either implementing one or the other. Document
oration process, mostly through re-ranking of documents [4, 9]                               bookmarking promotes shortlisting, which involves forming and
or modifying the distribution of documents [15]. Both types of                               refining a shared list of potential resources [11]; document rat-
mediation are complementary to each other. The support for collab-                           ing provides a finer granularity of feedback, which is needed for
                                                                                             algorithmic mediation [4, 9]. Document annotation supports the
oration features can be categorized along three lines [6, 20]: division
                                                                                             previous two features by communicating the rationale behind an
of labour, sharing of knowledge, and awareness. Table 1 provides a
feature comparison of prior works [4, 9, 15, 16, 21, 24] in relation                         action [11, 15]. The choice of features largely depends on the experi-
to these concepts. We now elaborate on each one and outline what                             mental setup, therefore SearchX implements all three in a way that makes
is implemented in SearchX.                                                                   toggling individual features easy. In SearchX, document saving is
    Division of Labour refers to the distribution of work load across                        instantiated as bookmarking as users are familiar with this concept.
collaborators. This division can be left to the user (user-driven) or                           Awareness is defined as “the ability to maintain some knowl-
mediated by the system. The latter can be implemented at the user                            edge about the situation and activities of others” [13], encompassing
level through assignment of roles, or at the document level through                          knowledge of the workspace and collaborators’ actions, as well
assigning different document subsets [22]. Prior systems mostly                              as the ability to instantaneously notice changes on the work con-
                                                                                             ducted. Prior systems focus on providing lightweight information
support user-driven division of labour through the provision of
                                                                                             regarding collaborators’ search activities (query history, document
communication features. Group chat and document sharing (the
explicit recommendation of a document to a collaborator) are two                             history, and colour coding) and the overall sense-making process
features which have been shown to be favoured by users [15, 21].                             (document metadata and group summary). All features were found
Following this, SearchX implements group chat; we argue that                                 to be useful in past experiments, except for the document history
document sharing in the sense above can be achieved through                                  which Kelly et al. [11] reported to provide too much information.
the chat feature as well and thus does not warrant a separate UI                             As group summaries are mostly beneficial for asynchronous ses-
element.                                                                                     sions [20], SearchX implements query history, document metadata,
                                                                                             and colour coding as awareness features.
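
As an illustration of the document-level division of labour discussed in this section (assigning different document subsets to collaborators), the following minimal sketch splits a ranked result list in round-robin fashion. It is written in Python for brevity, whereas SearchX itself is implemented in JavaScript. The function name split_results and the round-robin policy are assumptions made for this example; they do not reproduce the split search of SearchTogether or any code in SearchX.

```python
from typing import Dict, List, Sequence


def split_results(results: Sequence[str], user_ids: Sequence[str]) -> Dict[str, List[str]]:
    """Distribute a ranked result list over collaborators in round-robin order.

    Hypothetical helper: each collaborator receives a disjoint subset, and the
    original rank order is preserved within each subset.
    """
    if not user_ids:
        raise ValueError("at least one collaborator is required")
    assignment: Dict[str, List[str]] = {user: [] for user in user_ids}
    for rank, doc in enumerate(results):
        # Result at rank i goes to collaborator i mod (number of collaborators).
        assignment[user_ids[rank % len(user_ids)]].append(doc)
    return assignment
```

Other policies (for example, splitting by topic or by source) would fit the same interface; round-robin is used here only because it keeps the subsets balanced and is easy to follow.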
3     SYSTEM DESIGN                                                       from each component, and regularly sends the logs to the back-end
SearchX is designed for the following experimental workflow: a re-        through an HTTP request for storage. This abstraction provides
searcher first implements an experimental setup of their user study       a clear separation of concern between interface features, experi-
using SearchX (either relying on existing features, or adding their       mental setup, and data collection, making it clear which part of the
own). Each study participant accesses the SearchX instance through        system needs to be changed for a particular experimental need.
a designated URL; all major browsers are supported. The system            Back-end. The back-end is developed with the node.js server
then allocates a collaboration group to each set of m ≥ 2 partici-        environment, which directly supports asynchronous I/O operations,
pants (m is a configuration parameter). Throughout a group’s search       making it suitable for applications requiring real-time updates. An
session (which may include pre/post questionnaires), SearchX con-         added benefit of node.js is its language (JavaScript)—developing
tinuously captures fine-grained user activity logs.                       both the front-end and back-end in the same language made the
   We now discuss the architecture of SearchX and then elaborate          development more manageable for us. The back-end provides the
on the two main design directions: supporting collaboration, and          application data services which are made available to the front-end
empowering research.                                                      through APIs—implemented using the express6 framework for
                                                                          HTTP and the socket.io7 library for Web sockets. We chose these
3.1      Architecture                                                     two libraries as they are currently the most common libraries for
When implementing SearchX, we chose to start from an existing             their respective role. We chose MongoDB8 for the data storage as
system/interface to save development time. The options were lim-          it uses a dynamic data schema, providing added flexibility during
ited as search engine interfaces are generally not open-sourced. We       the development and modification of features.
decided to use the single-user Pienapple search system [3]3 as a              The data services are categorised into four types. Retrieval ser-
starting point, as it provides a generic Web search interface built       vices include communication with the retrieval system through the
with modern Web technologies (node.js4 , React5 ) which have              provider, and further processing of the retrieval results through the
active developer communities and are supported by large compa-            regulator. Currently, we provide support for the Bing Search API
nies, ensuring that the system will be relevant technology-wise for       for searching the Web, and support for Elasticsearch and Indri9
the upcoming years. Given this base system, we vastly expanded            servers for custom collections. Session services handle group com-
its functionalities for collaboration and experimentation, and then       munication and assigning search tasks to users. Collaboration ser-
refactored the code base to be modular and reusable.                      vices include the back-end logic of collaborative features in the
    SearchX’s client-server architecture is shown in Figure 1. The        front-end. Utility services include data collection tools such as the
front-end is responsible for presenting the interface, managing task      log collector which stores user logs received from the front-end,
sessions, and logging user activities; the back-end is responsible        and the URL scrapper which scrapes all documents returned to the
for communicating with the retrieval engine, and managing group           user. Additionally, we also have a URL renderer which makes it
creation and synchronisation.                                             possible to load external Web pages inside our Web-based system
                                                                          (think of a browser inside a browser), allowing us to implement
Front-end. The front-end (shown in Figure 2) is developed using
                                                                          the front-end document viewer. This offers the possibility to keep
React (a JavaScript library). React manages its own data model,
                                                                          users inside the system at all times, allowing the system to log user
minimising communication with the back-end; it enforces the cre-
                                                                          interactions within the documents as well. Both the URL scraper
ation of standalone view components, resulting in an interface
                                                                          and URL renderer utilise a headless browser via Puppeteer10 .
simple to modify and extend. As the front-end is a Web application,
any user with a modern browser can access it without requiring
additional installation.
                                                                          3.2      Supporting Collaboration
   The front-end consists of three logical abstractions. The search       As can be seen in Table 1, only three prior systems implement
interface is composed of features related to searching and collabo-       system mediation, with each of them implementing a different
ration, and is presented to the user during the search session. Each      type. In contrast, a number of interface mediation features (i.e. all
feature is implemented as a standalone component which makes              features apart from system mediation) are popular across systems.
changing the layout or design of the interface efficient. Additionally,   SearchX also primarily supports collaboration through interface
we separate the rendering component from the data management              mediation, while facilitating custom implementation of system
to make adjustments to the interface more efficient (e.g. adapting        mediation through the addition of a regulator layer in the back-end.
the interface for mobile or emerging devices). The task session im-       We now discuss each implemented feature. Figure 2 shows the UI
plements the desired experimental setup, and controls the search          of our interface with all features enabled.
task, group creation, and the experimental procedure (e.g. a pre-test,    Group Chat. Even though knowledge sharing is already facilitated
the search session, and then a post-test). An experimental proce-         through more specialised mediation features, direct communication
dure consists of a sequence of pages, which we bootstrapped by            is necessary for coordination and discussions. We opted for a famil-
implementing template components for the search session, and              iar pop-up design where the chat window is always visible in the
questionnaires, which we found to be the most commonly required
templates in our experiments. The logger accepts activity data            6 https://expressjs.com/
                                                                          7 https://socket.io/
3 The authors kindly provided us with their source code.                  8 https://www.mongodb.com/
4 https://nodejs.org/                                                     9 http://www.lemurproject.org/lemur/
5 https://reactjs.org/                                                    10 https://github.com/GoogleChrome/puppeteer
                                                 Figure 1: SearchX architecture overview.

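The group allocation step of the workflow in Section 3 (forming a collaboration group for each set of m ≥ 2 participants) can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the SearchX implementation, which is written in JavaScript: the class name GroupAllocator, its join method, and the first-come-first-served policy are hypothetical.

```python
from typing import List, Optional


class GroupAllocator:
    """Collects arriving participants and forms a group once m are waiting."""

    def __init__(self, group_size: int) -> None:
        if group_size < 2:
            raise ValueError("a collaboration group needs at least two members")
        self.group_size = group_size  # the configuration parameter m
        self._waiting: List[str] = []

    def join(self, participant_id: str) -> Optional[List[str]]:
        """Register a participant; return the new group once it is complete."""
        self._waiting.append(participant_id)
        if len(self._waiting) < self.group_size:
            return None  # the participant keeps waiting for more members
        group, self._waiting = self._waiting, []
        return group
```

With m = 3, the first two calls to join return None and the third returns the complete group. A real deployment would additionally need to handle participants who leave before their group is complete.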

interface but can be minimised when not in use (to avoid cluttering     process. A feature planned for future development is the ability to
the interface). It is implemented using converse.js11 which pro-        highlight specific parts of the document along with an annotation.
vides a robust chat window out of the box. A downside though is         Query History. This feature has been implemented in all prior
its black-box nature, which prevents direct access to the internals,
                                                                        systems; it provides awareness of collaborators’ search activities,
making it difficult to extend (e.g. we are not able to automatically
                                                                        allowing users to avoid duplication of effort and be inspired by
assign usernames). Currently the benefits still outweigh this limita-   their collaborators’ choice of keywords [15]. Its implementation is
tion, but we will consider creating our own implementation in the       similar across prior systems: as a list of queries that can be clicked
future if needed.                                                       on to immediately open results for that query. SearchX provides a
Bookmarking. Apart from functioning as a means to bookmark              scrollable list of recent queries in the sidebar.
documents for later revisits, document bookmarking also promotes        Document Metadata. This feature is presented below each SERP
the shortlisting strategy which involves curating a shared list of      entry to provide information about collaborators’ activities on the
potential documents [11]. Given the central role of bookmarking         document. This allows users to quickly identify documents that are
in such collaborative efforts, we wanted to make it more accessible,    considered relevant by their group. The information is presented
therefore we implement the bookmark button directly next to each        using a number of simple icons.
search result. The list of bookmarked documents is always visible
in the sidebar to promote awareness of collaborators’ actions; it is    Colour Coding. We colour code elements of the interface that are
sorted by time, with the most recent bookmarks at the top. In           associated with a particular collaborator’s actions (such as querying
addition, users benefit in the sense-making process when given the      and bookmarking). This allows users within the group to differen-
option to manage and rearrange their bookmarks [11], therefore          tiate between the activities and contributions of each individual
SearchX also implements pinned/starred bookmarks.                       collaborator. The colours are generated randomly.
Document Rating. Document rating is mainly considered a fine-           System Mediation. As stated before, SearchX does not implement
grained source of information for relevance feedback. To avoid          a specific form of system mediation, but facilitates such an imple-
cluttering of the search engine result page (SERP), we present the      mentation if needed. In Section 2 we outlined that system mediation
rating buttons not on the SERP, but inside the document viewer;         is usually performed in the form of modifications to the retrieved
the added benefit is that users can only rate once they have seen       list of results (re-ranking). We have designed the retrieval service
the document. Document rating is implemented as a like/dislike          in the back-end to also contain a regulator layer that enables us to
button to leverage users’ familiarity with this type of interaction.    adjust the SERP sent to each collaborator based on the actions of
                                                                        the group’s members.
Document Annotation. Unlike existing systems, we implemented                The regulator layer collects the necessary input data for system
annotations as a message thread similar to chat interfaces. This        mediation by fetching and aggregating it from the database. One
setup highlights the bidirectional nature of the annotation process,    example of such data is the current collection of bookmarked and
promoting sense-making through the exchange of opinions. The an-
                                                                        up-voted results for the entire group, which can be used as input
notation interface is presented inside the document viewer, directly
                                                                        for relevance feedback. The input data can be sent to the search
next to the document to make adding new annotations a quick             provider in order to incorporate the data in the retrieval algorithm.
11 https://conversejs.org/
                                                                        This is useful to incorporate features from the search provider into
                                         (a) Search result page                                         (b) Document viewer

Figure 2: SearchX collaborative search interface. [A] Document metadata, [B] shared query history, [C] shared bookmarks, [D]
chat, [E] document rating, [F] document comments.

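To make the role of the regulator layer from Section 3.2 more concrete, the sketch below re-ranks, and optionally filters, a result list using the group's bookmarks and likes. It is an illustrative example in Python under stated assumptions and not the SearchX code, which is written in JavaScript: the function name regulate, its parameters, and the promotion policy (liked documents first) are hypothetical.

```python
from typing import List, Set


def regulate(results: List[str], bookmarked: Set[str], liked: Set[str],
             hide_bookmarked: bool = False) -> List[str]:
    """Re-rank (and optionally filter) a SERP using the group's feedback.

    Documents liked by any group member move to the top. The sort is stable,
    so the search provider's order is kept within each tier.
    """
    if hide_bookmarked:
        # Filtering: drop documents the group has already saved.
        results = [url for url in results if url not in bookmarked]
    # False sorts before True, so liked documents come first.
    return sorted(results, key=lambda url: url not in liked)
```

The input list is left unmodified, so the same provider response could be regulated differently for each collaborator.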

system mediation. For example, Indri supports relevance feedback           a task bar describing the search task. We found this to simplify
by sending it a set of results. The input data can also be used directly   the creation of new user studies, since it takes away much of the
in the regulator to re-rank or filter the list of results. This option     boilerplate code needed in configuring the experimental procedure.
can be used, e.g., for division of labour by filtering the documents       Data Collection. A requirement for a CSE user study is the collec-
that are assigned to each user according to a distribution criterion.      tion of user activity logs. In SearchX, we have added logging to all
                                                                           interactive components of the system so that it records when a user
3.3      Empowering Research                                               hovers over or directly interacts with a component (e.g. clicking,
We now elaborate on how SearchX was designed as a tool for                 querying, opening a document). We also log session related data
research.                                                                  (e.g. starting/finishing the search session, submitting a question-
Availability and Accessibility. SearchX is open-sourced for use            naire) and interactions with the browser (e.g. changing tabs), which
and development by other researchers. We iterated on the installa-         helps understanding all actions executed by a user. All logs are
tion process a number of times to make it simple and effective. We         captured directly by the interface without the need for third party
provide three example implementations of different experimental            plugins installed by the user. All logs are defined and implemented
setups (synchronous collaborative search, asynchronous collabora-          in the front-end, while the back-end only handles storage of logs,
tive search, single-user search) to be expanded upon. We also put          making it easy to modify the logs or create additional logs.
significant effort into extensively documenting how researchers can        Interface Guide. Prior works report that some features of their
modify the system, e.g. by adding new UI features, or by changing          system were not explored much by users because they do not know
the retrieval system in the back-end.                                      or understand them [9, 15]. We address this issue by adding a guided
Study Creation. Currently, modifying the system requires pro-              walk-through of the interface (built using Intro.js13)
gramming knowledge as we do not provide a graphical interface              which explains step-by-step what each feature is meant to do. This
to create user studies yet. However, we have created reusable im-          interface guide is launched when a user first starts the search ses-
plementations of common components in the experimental setup:              sion, ensuring that they are aware of the features we want them to
questionnaires and the search session. The questionnaires are im-          use, before moving on to the search task.
plemented using SurveyJS12 , which allows defining questionnaires
directly using JSON. We have created a React component that ab-            4     CHALLENGES
stracts over SurveyJS, adding logging features and flow control.           We now discuss issues that are usually hidden from view—things
We also did the same for the search session, which abstracts over          that did not go as intended or slowed down the process. While the
the search interface, adding session-related logs, flow control, and       design decisions of SearchX were taken by the last two authors
12 https://surveyjs.io/                                                    13 https://introjs.com/
of this paper, the engineering effort of making the design a reality was
carried out by the first three authors (two MSc computer science students
working on their thesis and a PhD student).

Iterating on the Experimental Setup. We initially implemented the basic
version of SearchX with a paper deadline in mind. This led to a working but
not very modular version of SearchX, which we realized when attempting to
implement a number of CSE experiments: for each experiment, multiple files in
both the front-end and back-end required changes. Since we wanted the system
to be reusable for different experiments, we invested effort into refactoring
the code for a more intuitive experimental setup. We started fixing this in
the front-end by separating out all code related to the experimental setup
from the search interface and encapsulating it into reusable React
components. While this simplified the experimental setup, the communication
with the back-end remained a complex issue. We now limit the responsibility
of the back-end to only group management and synchronisation, allowing us to
directly implement the limited range of functionality inside the task
components. Had we spent more time on the initial design, we would have saved
the substantial development time that went into iterating a number of times
on the architecture and the interactions among the components.

Deploying a Crowdsourced Study. As an effort to support online studies, we
adapted SearchX for crowdsourced studies. During a first CSE pilot on
Crowdflower, we found crowd workers to not be overly motivated to properly
execute our assigned collaborative search tasks (many tasks on Crowdflower,
such as image labeling, tend to be short and do not require elaborate
instructions). We found two ways around this issue: (i) a new platform and
(ii) actively encouraging complying behaviour. We switched to the
research-focused Prolific¹⁴ platform, which was shown to provide higher
quality data [17], something we found to be true as well in our work. We also
spent significant development time on monitoring workers’ attentiveness and
actively keeping them on track. We logged browser interactions (change tabs,
context menu) and notified workers about their tab changes in real-time
(after n tab changes a worker is no longer paid). We also added quality
control questions and disabled copy and paste operations in the
questionnaires. All these steps improved the quality of data we collected,
but were slow to be implemented as we discovered them as solutions to worker
compliance issues one by one after running another (and another) pilot study.

Synchronising Group Sessions. Running a synchronous search session through a
crowdsourcing platform is tricky: workers are not available right away, so a
type of “waiting room” is needed for the grouping, ensuring that workers
assigned to a single group start their search session at the same time. This
problem becomes particularly severe as the group size increases: an
experiment with 20 workers requires 20 workers to accept the task at roughly
the same time. Another issue we encountered was that workers were
disconnected from the grouping process when the page was refreshed/closed
during the waiting period, resulting in the worker not being able to continue
the study. We currently just warn workers that attempt to refresh/leave the
Web page running SearchX but a better way to handle (and entertain?) workers
in the “waiting room” is needed to enable CSE experiments with large group
sizes.

Implementing a Document Viewer. Ensuring that crowdworkers remain within
SearchX (and rescinding the payment otherwise) is a good way of ensuring
compliance, but of course this idea breaks down when we want the workers to
interact with the SERP (and click on links and view documents in another
browser tab). We thus needed to implement a document viewer (again requiring
valuable development time) that allows users to view the document within
SearchX. This is rather straightforward for static resources such as text or
images; however, it is not possible to render another Web page directly
inside SearchX because of browser cross-origin restrictions (CORS,
cross-origin resource sharing). We thus had to render the URL in the back-end
and pass the rendered HTML to an iframe in the front-end. This is an
imperfect solution though, since the resulting page is static with most
interactive elements disabled, and at times the rendering is not perfect. We
are still improving this aspect of SearchX. Currently, to alleviate the issue
of imperfect renders, we add a button to open the web page in another tab.

Dynamic Result List and System Mediation. If the regulator layer of SearchX
is used, the list of results is no longer a function of only the user’s
query, because it can change based on other input data as described in
Section 3.2. In our experience this can lead to several challenges.

Data analyses that rely on the state of the SERP that a user observed are
complicated by the use of system mediation features. This is because the SERP
that a user observed cannot be replayed based on their query, since it also
depends on the other input data that the regulator layer made use of. A
possible solution for this problem is to store the search results that were
displayed. We implemented a logger for each search result that is visible on
the user’s screen to facilitate our data analyses.

Another aspect that needs to be considered when implementing system mediation
is the interaction of the mediation features with the search interface. A
result list that is modified can lead to a jarring user experience,
especially if the list updates in real time. We consider it better to apply
changes to the SERP after a user initiates an action (e.g. a new query or
page change); by combining the update due to system mediation with a
user-initiated update of the SERP, the user is not confused by the page
changing. Another approach to prevent confusion is to indicate to users when
results have been omitted or re-ranked and to give them the option to enable
and disable mediation features. In this manner, users are given the autonomy
to decide when system mediation features are useful.

¹⁴ https://prolific.ac/

5   CONCLUSIONS
We have presented SearchX, a collaborative search system whose design and
implementation are an ongoing process, born out of the unmet need for an
open-source CSE tool that can be deployed online without the need for
additional installations (one of the main reasons for ruling out Coagmento
for our purposes). The version we have just described provides a good
starting point for online collaborative search research. Having a
well-functioning and modular system is the basic requirement for the research
we actually want to conduct with SearchX: large-scale (think tens or hundreds
of users) collaborative search. We will continue to develop SearchX and by
open-sourcing the system we hope that other researchers can benefit from it
too.
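
To make the “waiting room” grouping from Section 4 (Synchronising Group Sessions) more concrete, the following TypeScript sketch shows one minimal way such grouping could work: workers are queued until enough have arrived to fill a group, and a worker who leaves while waiting is removed from the queue. This is an illustration under our own assumptions, not the SearchX implementation; the names WaitingRoom, join, leave and size are hypothetical.

```typescript
// Hypothetical sketch of a waiting room; not taken from the SearchX code base.
type WorkerId = string;

class WaitingRoom {
  private readonly groupSize: number;
  private waiting: WorkerId[] = [];

  constructor(groupSize: number) {
    if (!Number.isInteger(groupSize) || groupSize < 1) {
      throw new Error("groupSize must be a positive integer");
    }
    this.groupSize = groupSize;
  }

  // Register a worker. Returns the complete group as soon as enough workers
  // are waiting, otherwise null. Joining twice (e.g. after a page refresh)
  // does not create a duplicate entry.
  join(worker: WorkerId): WorkerId[] | null {
    if (!this.waiting.includes(worker)) {
      this.waiting.push(worker);
    }
    if (this.waiting.length < this.groupSize) {
      return null;
    }
    // Remove the first groupSize workers from the queue and return them.
    return this.waiting.splice(0, this.groupSize);
  }

  // Remove a worker who closed the page while waiting.
  leave(worker: WorkerId): void {
    this.waiting = this.waiting.filter((w) => w !== worker);
  }

  // Number of workers currently waiting for a group.
  get size(): number {
    return this.waiting.length;
  }
}
```

In such a design the search session would only be started for the workers returned by join, which is why larger group sizes lengthen the wait: no group is released until the last required worker has arrived.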

ACKNOWLEDGEMENTS
This research has been supported by NWO projects LACrOSSE
(612.001.605) and SearchX (639.022.722).

REFERENCES
 [1] Saleema Amershi and Meredith Ringel Morris. 2008. CoSearch: A System for
     Co-located Collaborative Web Search. In CHI ’08. 1647–1656.
 [2] Leif Azzopardi, Jeremy Pickens, Chirag Shah, Laure Soulier, and Lynda Tamine.
     2017. Second International Workshop On the Evaluation of Collaborative
     Information Seeking and Retrieval (Ecol’17). In CHIIR ’17. 429–431.
 [3] Martynas Buivys and Leif Azzopardi. 2016. Pienapple Search: An Integrated
     Search Interface to Support Finding, Refinding and Sharing. In ASIST ’16.
     122:1–122:5.
 [4] Robert Capra, Annie T. Chen, Katie Hawthorne, Jaime Arguello, Lee Shaw, and
     Gary Marchionini. 2012. Design and evaluation of a system to support
     collaborative search. ASIST 49, 1 (2012), 1–10.
 [5] Colum Foley and Alan F. Smeaton. 2009. Synchronous Collaborative Information
     Retrieval: Techniques and Evaluation. In ECIR ’09. 42–53.
 [6] Colum Foley and Alan F. Smeaton. 2010. Division of Labour and Sharing of
     Knowledge for Synchronous Collaborative Information Retrieval. IPM 46, 6
     (2010), 762–772.
 [7] Jonathan Foster. 2006. Collaborative Information Seeking and Retrieval. Annual
     Review of Information Science and Technology 40, 1 (Dec. 2006), 329–356.
 [8] Gene Golovchinsky, John Adcock, Jeremy Pickens, Pernilla Qvarfordt, and
     Maribeth Back. 2008. Cerchiamo: a collaborative exploratory search tool.
     Computer Supported Cooperative Work (2008), 8–12.
 [9] Gene Golovchinsky, Abdigani Diriye, and Tony Dunnigan. 2012. The Future is
     in the Past: Designing for Exploratory Search. In IIIX ’12. 52–61.
[10] Gene Golovchinsky, Jeremy Pickens, and Maribeth Back. 2008. A taxonomy of
     collaboration in online information seeking. JCDL Workshop on Collaborative
     Information Retrieval (2008).
[11] Ryan Kelly and Stephen J. Payne. 2014. Collaborative Web Search in Context: A
     Study of Tool Use in Everyday Tasks. In CSCW ’14. 807–819.
[12] Teerapong Leelanupab, Hannarin Kruajirayu, and Nont Kanungsukkasem. 2015.
     Snapboard: A Shared Space of Visual Snippets - A Study in Individual and
     Asynchronous Collaborative Web Search. In Information Retrieval Technology.
     Springer International Publishing, 161–173.
[13] Olivier Liechti. 2000. Awareness and the WWW: An Overview. SIGGROUP
     Bulletin 21, 3 (Dec. 2000), 3–12.
[14] Meredith Ringel Morris. 2013. Collaborative Search Revisited. In CSCW ’13.
     1181–1192.
[15] Meredith Ringel Morris and Eric Horvitz. 2007. SearchTogether: An Interface for
     Collaborative Web Search. In UIST ’07. 3–12.
[16] Sharoda A. Paul and Meredith Ringel Morris. 2009. CoSense: Enhancing
     Sensemaking for Collaborative Web Search. In CHI ’09. 1771–1780.
[17] Eyal Peer, Laura Brandimarte, Sonam Samat, and Alessandro Acquisti. 2017.
     Beyond the Turk: Alternative platforms for crowdsourcing behavioral research.
     Journal of Experimental Social Psychology 70 (2017), 153–163.
[18] Jeremy Pickens, Gene Golovchinsky, and Meredith Ringel Morris. 2009.
     Proceedings of 1st International Workshop on Collaborative Information
     Seeking. CoRR abs/0908.0583 (2009).
[19] Steven Poltrock, Jonathan Grudin, Susan Dumais, Raya Fidel, Harry Bruce, and
     Annelise Mark Pejtersen. 2003. Information Seeking and Sharing in Design
     Teams. In 2003 International ACM SIGGROUP Conference on Supporting Group
     Work (GROUP ’03). ACM, 239–247.
[20] Chirag Shah and Gary Marchionini. 2010. Awareness in Collaborative Information
     Seeking. ASIST 61, 10 (2010), 1970–1986.
[21] Chirag Shah, Gary Marchionini, and Diane Kelly. 2009. Learning Design Principles
     for a Collaborative Information Seeking System. In CHI EA ’09. 3419–3424.
[22] Laure Soulier and Lynda Tamine. 2017. On the Collaboration Support in
     Information Retrieval. Comput. Surveys 50, 4, Article 51 (Aug. 2017), 34 pages.
[23] Ke-Thia Yao, Robert Neches, In-Young Ko, Ragy Eleish, and Sameer Abhinkar.
     1999. Synchronous and asynchronous collaborative information space analysis
     tools. In 1999 International Workshops on Parallel Processing. IEEE, 74–79.
[24] Zhen Yue, Shuguang Han, and Daqing He. 2012. Search tactics in collaborative
     exploratory web search. In HCIR 2012.