=Paper=
{{Paper
|id=Vol-2741/paper-07
|storemode=property
|title=Towards a Framework for Harm Prevention in Web Search
|pdfUrl=https://ceur-ws.org/Vol-2741/paper-07.pdf
|volume=Vol-2741
|authors=Steven Zimmerman,Stefan Herzog,Jon Chamberlain,David Elsweiler,Udo Kruschwitz
|dblpUrl=https://dblp.org/rec/conf/sigir/ZimmermanHCEK20
}}
==Towards a Framework for Harm Prevention in Web Search==
Towards a Framework for Harm Prevention in Web Search

Steven Zimmerman (1), Stefan M. Herzog (2), David Elsweiler (3), Jon Chamberlain (1), and Udo Kruschwitz (3)

(1) University of Essex, Colchester, United Kingdom
(2) Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany
(3) Universität Regensburg, Regensburg, Germany

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). BIRDS 2020, 30 July 2020, Xi'an, China (online).

Abstract. We introduce a framework aimed at the information science (IS), information retrieval (IR) and data science communities, as well as behavioral and cognitive scientists and policy makers in government and at corporations operating Web search platforms. The goal of this framework is to instigate collaborative discussion and research across these communities to address potential dangers that searchers and society face in modern Web search. We provide an overview of the harms, such as poor health outcomes, and their possible causes, including searcher and system biases being tuned for maximum profit. Modifications to policy, additional evaluation metrics, a mixture of cognitive decision-making tools and improvements to the IR system are the suggested pathways. Examples are provided of how the framework can be put into practice.

1 Introduction

Research and theory developed in information science (IS) has inspired many of the algorithms, models and interfaces developed and implemented in modern information retrieval (IR) systems [78]. Even with this influential link between the communities, recent commentary [17] suggests there is a gap in research between the two areas and a need for a holistic view of the searcher (the IS focus) and the search system (the IR focus) as one system together [36, 37]. In fact, Ingwersen and Järvelin [37] suggest this view must go beyond just the system and the searcher. Data science, a profession deemed "the sexiest job" [19], also has agency in the modern Web search environment. The data science community has quite an influential role in modern Web search environments, as members of this community are often tasked with developing models that are optimized to maximize user satisfaction [78], revenue and profits [19, 88] and user engagement [19]. Finally, there is the community of searchers who seek and find information on the Web through a multitude of IR environments, such as search engines (e.g. Google), product websites (e.g. Amazon) and social media news feeds (e.g. Facebook), each of which has its own biases and beliefs [7, 77].

Unfortunately, the biases and beliefs of the searcher and the IR systems they interact with form a feedback loop that not only changes the system [7, 8, 73] but also changes the beliefs of the user [7, 73, 77]. Ultimately, this self-reinforcing cycle exposes individual searchers and broader society to potentially harmful and dangerous outcomes [7, 55, 73, 77]. It is our view that the potential and already-realized harms caused by this reinforcing cycle are a side effect of the non-holistic view, a view which must extend beyond the searcher and the system [37], and this motivates our framework for harm prevention in Web search.

Our framework is aimed at communities which include researchers in IS, IR, data science and the behavioral and cognitive sciences, as well as policy makers in governments and the leadership teams of Web platforms.
There is common recognition by these communities of the ethical concerns and potentially grave implications of the technologies that are now ubiquitous in our everyday lives. Simultaneously, aside from efforts by IS and IR [11, 17, 37], these communities appear to be working independently of one another. As such, we see a need for a common framework through which all of these communities can jointly work towards the common goal of ethical responsibility to the searcher and the broader society of which we are all a part. The components of the framework (Section 3) are the ones we believe these communities should focus on first. Components include policy updates, cognitive interventions, evaluation methods, and considerations for overall search system design. Central to our framework are four themes: collaborative effort by the communities mentioned, greater transparency for the user, greater choice for the user and an ethics-based approach to search system development.

2 Background

Why develop a framework? A number of researchers in the IR, IS and data science communities have expressed ethical concerns about, and the potential for harmful ramifications of, the information that is collected and provided by the current Web search environments we deploy [7, 22, 35, 67, 77, 78]. In parallel, some researchers in our community have proposed and demonstrated methods that address some of these matters (see [10, 23, 27, 34, 64, 67, 87]). The behavioral and cognitive sciences community has voiced similar concerns [41, 43, 45] and offered possible solutions [41–43, 45] in line with the IR and IS communities. Yet, even though the leadership of many popular Web platforms (where search for information commonly occurs) publicly express their concerns about these same matters, there are very few instances where they set policies for their platforms that align with the recommendations of researchers and policy makers (exceptions include Twitter, which now includes credibility assessments of claims for some Tweets [16], and Facebook, which moderates news feeds for hate speech [15]). The more common response is to take no action until required by law to do so: nearly a decade has passed since [34, 64] suggested approaches to give searchers a pathway to better credibility assessments, yet to our knowledge no commercial search engine has implemented these methods, whereas in 2018 both the General Data Protection Regulation (GDPR) (to better protect privacy) [24] and NetzDG (to reduce racism and extremism) [63] forced major platforms to update their policies and practices or face stiff penalties.

What are the harms? A broad spectrum of harms has occurred or has the potential to occur. Dangerous and potentially deadly health outcomes [77, 55], as well as destabilized political systems [14], have occurred. Excessive interaction (e.g. internet addiction) with information in social media environments is shown to have small but negative impacts on adolescents [53] (this study also argues for better evaluation measures, which we discuss in Section 3.4). State-sponsored surveillance that monitors searcher behavior is sometimes used to harm individuals [13]. Other individuals interacting with IR systems risk being radicalized and motivated to join extremist communities known to cause harm to others [73]. Users can receive dangerous advertisements based on their beliefs [62]. Indeed, the harms can become quite dystopian [74], and these serve as the primary motivation for our proposal, which aims to prevent individual and societal harms due to Web search.

What are the causes? The harms have many underlying causes, but we emphasize key factors. First, corporate (IR) platform policy can encourage data scientists to create addictive systems [19] and environments that are non-transparent to searchers [41, 45]; for example, to understand privacy impacts, searchers must read lengthy privacy policy statements written in an obfuscated manner. There are of course the previously mentioned biases present in both the system [7, 77] and the user [38, 51, 77], and the reinforcement factor [7, 8, 73]. Information itself is a factor.
This comes in two forms: the information found (e.g. search results, Web pages, videos, social media posts) and the information collected about the searcher (e.g. queries, IP address, usernames). Both offer many benefits to searchers, such as exposure to more relevant information and faster task completion [78], but the former may contain harmful content [77] and the latter may cost searchers their privacy [39, 78]. The centralized nature of search environments [82, 83], now commonplace, was built upon IS models of search developed and tested in quite different environments, such as the library [78, 83], and is another factor likely playing into concerns around privacy. The profit motive [88] certainly encourages searchers to view information they might not otherwise view [74]. Moreover, the metrics used to evaluate systems (a focus of data scientists and IR researchers [80]) are quite different from those suggested by IS researchers (e.g. [21]). The issue of the communities (e.g. IR, data science and cognitive science) operating independently, rather than collectively towards a shared goal, is also a possible cause that should not be overlooked. The field of research known as interactive information retrieval (IIR) aims for a collective and interdisciplinary view of the search process [36, 37], and it is through the IIR lens that this framework was developed and is now introduced.

3 Framework Components

A recent proposal by Smith and Rieh [67] provides important insights for the development of a framework for Web search systems designed to prevent harms to individuals and society. Their proposal suggests that many users have information literacy and critical thinking skills that are useful for reducing the risks of harm from Web search. They highlight that current implementations of search systems, such as the Search Engine Results Page (SERP), offer little if any support for utilizing these skills. Furthermore, they point out that users (likely due to the high cognitive demands of applying information literacy skills) put too much trust in the results found in the SERP, as has been demonstrated by other research [38, 77]. It appears that some vitally important processes of search introduced by the IS community, including sense-making and exploratory search, are being inhibited by the current design [67], which is unfortunate given the role they play in learning [78]. Their proposal also suggests that current IR systems are optimized for closed-ended tasks (e.g. fact-based question answering), but should instead be optimized for learning. Ultimately, they propose that information in Web search interfaces should offer cues (e.g. topic, author and their affiliations, affective semantics including hate and humor) that enable users to utilize their literacy skills and assist them with critical thinking.
However, their framework, as encompassing as it is, does not consider important matters such as the privacy of the searcher [39] and platform policies optimized for corporate profit [80, 88]. That said, many elements of their proposal, such as considerations for interface design and informational cues that engage critical thinking for better decision making, are central to our proposal. We therefore see our work as an extension of theirs. The decision-making factor is in fact fundamental to search [60, 78], suggesting a strong need for cognitive interventions, which we propose as a bridge between many of the other framework components.

The proposed framework is segmented into four main areas which we see as core to the development of search environments that reduce the risk of harm to both the searcher and society. (1) Policy measures, which include law, education and corporate policy, are suggested. (2) A set of cognitive approaches, developed for the specific purpose of engaging decision making that reduces risk to individuals and society, is introduced. (3) Considerations for system design are provided, with content enrichment and interface design as the main focus. (4) Any framework, and any approach for that matter, needs to be evaluated, and suggestions for evaluation are also given. The considerations provided are not exhaustive, and are intended as a foundation for a way forward.

3.1 Policy

Policy is a broad topic, which encompasses relevant areas such as law and education and can be used as a mechanism to prevent harm in Web search. Policies set by Web search systems are used as a means to leverage their commercial, legal and overall organizational interests. For instance, explicit privacy policies and data usage policies may be tailored to protect the provider from legal ramifications (e.g. under the GDPR), while simultaneously maximizing commercial profits [88]. Alternatively, a provider may shift its policy to meet social norms and address public outcry over issues such as misinformation [16]. There are also policy decisions around design choices for the core product that searchers interact with, such as how to present information in a SERP, which search support tools to include in the product (e.g. query suggestion), and which underlying retrieval and ranking models to implement. Clearly, Web search platforms should keep harm prevention central to their design policy.

Laws can be implemented locally, nationally, within economic regions and globally, with examples including California, Canada, the European Union and human rights law, respectively. For Web searchers, the GDPR [24], a law designed to better protect their privacy and allow greater transparency and control over how data collected about them is used, is perhaps the most well-known law for harm prevention to date. Laws may also be used to force information providers (e.g. social media and search platforms) to take down and/or filter out content that is perceived by law makers to be harmful to individuals and/or societies. Examples include Germany's NetzDG [63], which requires the removal of hateful speech (e.g. Nazi imagery), and the censoring of Google search results (e.g. websites mentioning the Tiananmen Square massacre) by the Chinese government [72]. Some laws, such as the Communications Decency Act in the USA [75], place the onus of legal liability on the publisher of the content (e.g. the author of a news article), but not on the provider (e.g. search engine) or the user (e.g. searcher).
Legal tools achieve harm prevention through penalties (e.g. fines, imprisonment) for non-adherence to the rules stated by law; they also carry a sense of authoritarianism, dictating what is good or bad. As differentiating between good and bad can be problematic [27], we suggest that law be used as a tool of last resort. Nonetheless, ethical considerations are a critical factor for IIR systems and research and must therefore be taken into account [40, 78]. As part of the framework, basic universal human rights [69–71] are the recommended lens through which policy is set.

Finally, education approaches and campaigns are a suggested pathway to improve search capabilities in ways that minimize personal harm. There are some efforts by platforms to provide education tools and programs in primary and secondary schooling (see [30]), as well as material broadcast to searchers of any age (see [30, 59]). However, a searcher is not provided such tools directly in the search engine (i.e. there is no link provided), although query recommendations and spelling corrections may be educational if it can be shown that the user improves their query behavior or their spelling over time. In our view, education is a promising pathway, as it overlaps with the cognitive interventions discussed in the section that follows.

3.2 Behavioral and Cognitive Interventions

Decision making is fundamental to the search process [60, 78]. It is therefore worthwhile to consider strategies from the behavioral and cognitive sciences that were developed specifically for minimizing risk and harm. Three approaches, nudging [56, 68], boosting [33, 41] and techno-cognition [43], were recently proposed as pathways to minimize and address harms in the modern online world [41, 45]. We focus on the first two, nudging and boosting, as they are quite different in their methodology, yet very similar in their aim of reducing individual and societal risks. Additionally, we introduce nutrition labels and fact boxes as means to communicate potential harms.

Nudging [68] is a popular behavioral-public-policy approach, which has gained prominence in recent years. Nudges aim to push people towards—what the 'nudger' believes to be—more beneficial decisions through the 'choice architecture' of people's environment (e.g., default settings). Thaler and Sunstein [68] provide the following definition of a nudge:

"A nudge ... is any aspect of the choice architecture that alters people's behavior in a predictable way without forbidding any options or significantly changing their economic incentives. To count as a mere nudge, the intervention must be easy and cheap to avoid. Nudges are not mandates. Putting fruit at eye level counts as a nudge. Banning junk food does not."

Nudging requires that the 'choice architecture' include an element of libertarian paternalism [68]; that is, the nudge must allow the individual to opt out (e.g., choose the non-default option). This is different from a purely paternalistic approach such as bans, which have no opt-out mechanism by design and intent. Some nudges, such as nutrition labels and warning lights, have educational elements [41, 86], but for the most part, nudges aim to directly change behavior without targeting people's competences [33]. The political philosophy and claims about human nature underlying nudging have been criticized recently [52, 28]; see, for example, [56] for a review of the issues discussed.
Self-nudging [56] – people acting as their own "citizen choice architects" – has been proposed as a way to harness the power of nudging while largely circumventing its problems.

Boosting [33] is another approach to behavior change based on evidence from behavioral science. Quoting Hertwig and Grüne-Yanoff (p. 974):

"The objective of boosts is to improve people's competence to make their own choices. The focus of boosting is on interventions that make it easier for people to exercise their own agency by fostering existing competences or instilling new ones. Examples include the ability to understand statistical health information, the ability to make financial decisions on the basis of simple accounting rules, and the strategic use of automatic processes (...)"

In the context of Web search, boosting aims to improve people's skills so that they can search the Web effectively and safely. To achieve this, a boosting approach combines IR research on how people search and adapt their search strategies to the environment [34, 42, 66] with general insights on human judgment and decision making online [41, 45] and offline [29] to design and evaluate boosting interventions. The key difference between a boosting and a nudging approach lies in the former's assumption that people are not simply "irrational" (and thus in need of being nudged towards better decisions), but that the human cognitive architecture is malleable and thus new competencies and skills can be instilled [33]—often requiring little time and effort. However, whether nudging or boosting is the "better" approach for a particular situation depends both on ethical considerations (e.g., how much value is placed on people's autonomy) and on pragmatic considerations of which approach will likely be more successful in terms of effectiveness and economic and non-economic costs. For example, since boosting needs people's cooperation to be effective, it has the advantage that it—by design—cannot be manipulative. But this cooperation requirement also implies that boosting will not be successful in situations where people are unwilling or unable to learn or make use of a boost. See [32] for a discussion and some rules of thumb for when nudging or boosting is likely to work better.

Nutrition Labels and Fact Boxes. Cognitive science has also provided a large body of evidence on visual approaches for communicating risk in an understandable way. Nutrition labels are one popular visual approach to risk communication, and traffic-light-style labels have been shown to produce better outcomes and to be preferred by users [48, 65, 76] over tabular approaches, such as those produced by the Food and Drug Administration (https://www.fda.gov/food/nutrition-education-resources-materials/new-nutrition-facts-label). Originally designed for medical decision making by doctors and patients, fact boxes are another promising means of providing information in a manner that includes the benefits and harms of the available decisions [49] (see examples at the Harding Center for Risk Literacy, https://hardingcenter.de/en/fact-boxes). Interestingly, both nutrition labels and fact boxes can act as the medium for performing a nudge or a boost.
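To make the traffic-light idea concrete, the minimal sketch below (our own illustrative example with arbitrary thresholds, not an implementation from the cited studies) maps a normalized harm-risk cue score to a traffic-light warning that could accompany a search result:

```python
def traffic_light(risk_score: float) -> str:
    """Map a normalized harm-risk cue score in [0, 1] to a traffic light.

    The thresholds are placeholders; in practice they would be calibrated
    against user studies of the kind cited above [48, 65, 76].
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be in [0, 1]")
    if risk_score < 0.33:
        return "green"  # low risk: no warning needed
    if risk_score < 0.66:
        return "amber"  # medium risk: caution advised
    return "red"        # high risk: strong warning

print(traffic_light(0.8))  # -> red
```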
3.3 Search System Design

Many components are necessary to build a fully functional search engine [18], and it is clear that the underlying systems have a tendency to become biased and steer users towards harmful information [7]. Here, we focus on content enrichment and the search interface, providing only limited discussion of other components, such as log analysis and retrieval models, while hinting at where they might play a role within the framework. Given commonly used implementation methods, such as those leveraging query logs as the primary means of modeling searchers and providing support tools (e.g. query recommendations, collaborative search models) [18, 78], it is conceivable that many of the system biases currently present [7] would abate over time, as the system logs the interactions of the subset of users who make use of the system elements discussed below. That is, users who take the effort to minimize personal harm could in fact provide benefit to all users.

Content Enrichment with Informational Cues. Processes that enrich and classify information in Web documents are fundamental to modern search engines and IR systems [18]. As previous research indicates, there are many different cues [27, 42, 67] that can be attached to information during this enrichment process and that may be useful for minimizing risk to searchers; many of them are also important factors for making better decisions in search [34, 42, 66]. For IR researchers and data scientists, the listing of cues provided by Smith and Rieh (see [67]), as well as the methods outlined by Fuhr et al. (see [27]), are useful guides for developing cue-extraction methods. Methods are already available for extracting cues such as the reading level, the virality of the content (i.e. how likely the information is to spread), emotionality (e.g. language that is angry, overly positive, etc.), the prevalence of factual, opinionated and/or controversial information, the trustworthiness of the source (e.g. mechanisms to determine the credibility of a Web page), technicality (e.g. a score for the amount of technical jargon in a document) and whether the document is currently topically relevant [27]. Bibliographic cues (e.g. author affiliation) and inferential cues (e.g. citations to and from a document) are also needed for critical thinking and the evaluation of information [67]. A lack of transparency exists around the affiliations of authors and publishers of information [67], and there is therefore a clear need to develop methods that evaluate the affiliation(s) of authors and publishers (e.g. who is funding think tank X that publishes Web page Y) [43].

Methods to identify content that is hateful [84], misogynistic [9] or containing vulgar language [20] are also readily available and potentially useful for minimizing exposure to content that some users may find offensive, which Smith and Rieh [67] classify as valence cues. Marking content which is sexually explicit (written, verbal and/or visual) [54] may be useful for developing strategies to minimize harms to minors, as well as to users who are susceptible to addiction.

As privacy is of paramount concern too, extracting cues that give the searcher greater transparency into what data is collected, and by whom it is collected and shared, is also critical to prevent harms from the collected information. One such task in this space is the identification of the 3rd parties that data will be shared with when visiting a Web page [44, 81]; another is the classification of privacy statements on the websites where the content is hosted, a task that could be designed with existing privacy statement corpora [79].
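As an illustration of how one such privacy cue might be computed, the sketch below (our own simplified example, not a method from [44, 81]) counts the distinct third-party hosts referenced by the resource URLs embedded in a page's HTML. A production system would inspect actual network requests and use the Public Suffix List to compare registrable domains; the suffix matching here is a deliberate simplification.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ResourceExtractor(HTMLParser):
    """Collect the hosts referenced by src/href attributes in a page."""
    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value and value.startswith("http"):
                host = urlparse(value).netloc.lower()
                if host:
                    self.hosts.add(host)

def third_party_hosts(page_html: str, first_party: str) -> set:
    """Return embedded hosts that do not belong to the first party."""
    parser = ResourceExtractor()
    parser.feed(page_html)
    return {h for h in parser.hosts
            if h != first_party and not h.endswith("." + first_party)}

# Hypothetical page embedding two third-party resources:
html = ('<img src="https://cdn.tracker-example.com/pixel.gif">'
        '<script src="https://ads.example-network.net/ad.js"></script>'
        '<link href="https://www.example.org/style.css">')
print(third_party_hosts(html, "example.org"))
# -> {'cdn.tracker-example.com', 'ads.example-network.net'}
```

The size of this set, or the identities of the hosts in it, could then be surfaced to the searcher as a privacy cue alongside each result.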
In a similar vein to the author affiliation cues proposed by [67] and [43], privacy-based ontologies containing information (e.g. total fines, number of GDPR violations) about 1st-party providers and their 3rd-party affiliations could be developed to present privacy cues.

Many of the cues can be extracted with models produced by machine learning algorithms. Nevertheless, for the data scientists who develop these models, it is critical to minimize model bias, such as gender bias [12]. Many solutions in the field of IR are designed with a "find the best" model mindset, as evidenced by the leaderboard approach to shared tasks (e.g. TREC, SemEval), and this is a likely cause of some model biases and subsequent poor predictions. There is evidence that ensemble approaches are more robust and resilient to bias and more likely to outperform a single model [31, 46, 84], and they are one possible alternative. In search spaces with potentially dangerous outcomes (e.g. health), data scientists should also consider interpretable models [58].
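To make the ensemble idea concrete, the sketch below (an illustrative example, not the method of [31, 46, 84]) combines three standard text classifiers with hard majority voting using scikit-learn; the training texts and labels are placeholders for a properly curated dataset.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder data; a real system needs a curated, bias-audited corpus [12].
train_texts = ["you are wonderful", "I despise people like you", "have a nice day"]
train_labels = [0, 1, 0]  # 1 = harmful content

ensemble = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    VotingClassifier(
        estimators=[
            ("logreg", LogisticRegression(max_iter=1000)),
            ("nb", MultinomialNB()),
            ("svm", LinearSVC()),
        ],
        voting="hard",  # majority vote is less sensitive to one model's bias
    ),
)
ensemble.fit(train_texts, train_labels)
print(ensemble.predict(["I despise people like you"]))
```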
Interface Design. Informational cues, cognitive interventions and policy are all important for harm reduction [45], but they need a medium for implementation, and the search interface (such as a SERP) is this medium. Extracting cues that allow for the design of better decision-making tools (thus enabling users to better tap into their critical thinking skills) and designing interfaces that present such cues and tools in a not-too-disruptive manner are two major challenges for interface design. Commercial SERPs are typically presented as ranked lists [67] and, depending on the query, will contain content such as advertisements, social media posts and news articles [5]. Search support tools are an important IR system component for improving search [78], some of which are available within the SERP, including query suggestions, auto-completion and spelling correction. Thus, any component that allows the user to minimize the chance of harm also falls within the scope of search support.

Space is at a premium, and one challenge is to ensure that the screen is not overloaded [47]. Risk communication tools such as nutrition labels and fact boxes are highly effective and desirable, but may not fit on small mobile devices, where warning lights are likely the better option. Link enrichment is another approach [47], in which pop-ups populated with informational cues are included with the results; it is especially appealing as it could be applied to both desktop and mobile search. Link enrichment also need not apply only to the SERP: it can be applied as searchers navigate within [2] or across domains and the Web (Wikipedia on desktop offers link enrichment and is one live example). Alternatively, the SERP could be designed to rank or filter results so as to attenuate possible harms from, say, privacy concerns or dangerous medical advice [87]. Indeed, commercial search engines already offer a default filter for adult content (e.g. content classified as sexually explicit), which takes up little space within the interface. Altering results in the SERP in this manner is a nudge, so long as the user is given the capability to opt out [68]. However, we caution against such approaches, as they do not tap into the important critical thinking and literacy skills of searchers [67] and thus likely do not generalize to other contexts where no such nudge is present.

The interface is also where policy can be implemented. It is conceivable that law makers may someday require IR systems to include any number of the approaches already discussed—an information nutrition label [27] is one such possibility. Or platforms, such as Google, could voluntarily set policy that provides links to educational resources in the SERP, simplifying the process for searchers to learn how to better protect themselves during the search process.

3.4 Evaluation

Evaluation is a fundamental and necessary process for IIR research. There are many readily available resources for evaluating IIR systems and user interactions (for an introduction, see [40] on the evaluation of user studies and [18] on the evaluation of IR systems). Traditional evaluation metrics, such as relevance-based metrics (e.g. precision, F-measure) and human-based metrics (e.g. time to complete tasks, query abandonment rate), are essential for harm prevention strategies developed with this framework, as there should be minimal impact on these metrics. Data scientists and other researchers who perform evaluations should also consider the recent suggestions of leading experts (see [26, 61]), as IR studies often lack statistical rigor [61] and many easily avoidable mistakes are made [26]. We now turn our attention to more recently proposed metrics that should also be considered, including metrics that take an economic view and a somewhat newer generation of outcome-based metrics.

Economic-Based Metrics. Interventions that reduce the risk of harm, such as those suggested in the framework, have costs (e.g. time) for the individual [32, 68], and costs are an important economic consideration for IIR environments [4, 6]. (There are also costs for designing and operationalizing interventions in a search environment, such as salaries for software engineers; here, however, we are strictly concerned with the economics of the searcher.) The economic view has inspired a new set of useful evaluation approaches, which integrate theories from economics with the overall aim of better predicting user behavior in the search environment [3]. Incorporating the economic view of IIR is potentially useful for evaluating the framework, as it allows evaluation from the perspective of trade-offs between costs and benefits [3, 5], such as trading the cost of time for the benefit of a reduced risk of harm as part of the search process. In addition to time, examples of relevant costs one might consider are the money a searcher is willing to pay for information of high quality, the amount of data they are willing to share with 3rd parties and the effort of searching for information relevant to their task.
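As a toy illustration of this cost-benefit view (our own sketch, not a metric from [3–6]; all weights are hypothetical and would need empirical grounding), an intervention could be scored as the expected harm it avoids minus the weighted costs it imposes on the searcher:

```python
def net_benefit(risk_reduction: float, harm_cost: float,
                extra_time_s: float, time_value_per_s: float,
                data_shared_kb: float, data_value_per_kb: float) -> float:
    """Toy cost-benefit score for a harm-prevention intervention.

    benefit = reduction in the probability of harm * cost of that harm;
    cost    = the searcher's extra time plus the value of data shared.
    All inputs are hypothetical placeholders, not calibrated values.
    """
    benefit = risk_reduction * harm_cost
    cost = extra_time_s * time_value_per_s + data_shared_kb * data_value_per_kb
    return benefit - cost

# A privacy fact box that cuts harm risk by 2 percentage points but adds
# 15 seconds of reading time (illustrative numbers only):
print(net_benefit(risk_reduction=0.02, harm_cost=500.0,
                  extra_time_s=15.0, time_value_per_s=0.005,
                  data_shared_kb=0.0, data_value_per_kb=0.1))
# -> 9.925
```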
Outcome-Oriented Metrics. Sense-making, one area of research within IS that considers the process of filling in gaps of knowledge, also has a strong focus on the ultimate outcome of this process [21], outcomes which may have positive or negative impacts [21, 57]. Such impacts fall in the domain of success metrics [78], which are possibly the most important evaluation approach for the framework, as they can be measured from the user perspective (does the user believe their risk of harm was reduced?) and from the system perspective (did user X, making use of a privacy intervention, share less data than user Y, who did not?). Such metrics are important in the area of health search, as incorrect information (resulting in incorrect knowledge) risks great harm [55, 77]. Also promising are new evaluation methods, such as the overall reputation of commercial search platforms and longitudinal studies that sample regular users (the Harvard nurse study, https://www.nurseshealthstudy.org/, may be a useful template for longitudinally assessing the harm and risk factors of search systems).

4 The Framework in Practice

Several empirical studies and commercial systems have some (but not all) of the elements of the framework. It is worth noting these to provide a lens into how the framework can be used in practice. To our knowledge, however, no approach addresses all four components. Behavioral interventions (nudges) combined with system-based content enrichment were shown to effectively steer users towards healthier food choices [23] and away from websites with a greater impact on personal privacy [1, 87]. Outcome-oriented evaluation measures were considered in the latter study [87], but the policy element is lacking. Some commercial search engines (e.g. DuckDuckGo) have used policy, system design and cognitive approaches to protect users from adult material, but do not publish evaluation approaches.

Specific to behavioral and cognitive interventions, there are additional empirical findings worth noting. A subset of the cues suggested by Smith and Rieh [67] was used to visually augment search results, nudging users to more accurately assess credible information [64]. Browser plug-ins can provide a visual nudge during Web browsing and exploration, such as the Ghostery 3rd-party blocking tool (https://www.ghostery.com/), which by default blocks data sharing with 3rd parties (although caution should prevail with 3rd-party blocking tools, as recent findings question their effectiveness [25]). In line with a boosting approach, one study tested low-cost search tips as a means of providing skills for better searching [50], and another improved novice searchers' skills through feedback that compared their search behavior with that of expert searchers [10]; note that neither study was explicitly designed for harm reduction, nor did either explicitly refer to boosting.

Evidence suggests that elements of the URL can be utilized to boost users with a skill that better protects privacy and simultaneously improves health outcomes [85], and Figure 1 is a prototype of a boost that combines these findings with a fact box [49] as a means of reducing the risks of 3rd-party tracking.

Fig. 1. A proposed boost based upon fact box methods [49] and previous findings on the usefulness of the top-level domain of a URL for harm reduction [85]. Data scientists and IR researchers could make use of sampling techniques to dynamically populate fact boxes based upon the topical search space of a user's information need.

Commercial search platforms may someday shift their policy to place more focus on harm prevention, where policy might place such a fact box in the SERP, or alternatively offer tools allowing users to self-nudge [56] (e.g. selecting search domains, such as health and politics, in which non-credible information is filtered out by default).
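The figure itself cannot be reproduced here, but a plain-text analogue of such a fact box can be sketched as follows (illustrative layout and numbers only, loosely following the fact-box format of [49] and the top-level-domain findings of [85]):

```python
def render_fact_box(option_a: str, option_b: str, rows: list) -> str:
    """Render a minimal plain-text fact box comparing two search options.

    rows holds (outcome description, value under A, value under B) triples;
    the layout loosely follows [49], and every number supplied by the
    caller is a placeholder for an empirically derived estimate.
    """
    width = max(len(r[0]) for r in rows) + 2
    lines = [f"{'Per 100 searchers':<{width}}{option_a:>14}{option_b:>14}"]
    lines.append("-" * (width + 28))
    for outcome, a, b in rows:
        lines.append(f"{outcome:<{width}}{a:>14}{b:>14}")
    return "\n".join(lines)

# Hypothetical numbers for illustration only:
print(render_fact_box(
    ".gov results", ".com results",
    [("exposed to 3rd-party trackers", 9, 67),
     ("found credible health advice", 81, 55)],
))
```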
5 Conclusions

We introduced a framework as a pathway to reduce the risk of the harms present in modern Web search. Central to the framework are cognitive decision-making tools and three further components: policy, system design and overall evaluation. Baeza-Yates' recent commentary on the interactions between IR systems and searchers [7] as a cause of harms in Web search systems is direct real-world evidence of the pervasive nature of the current Web search setup. Implementing the strategies sketched in this article is a possible approach to improving the overall system. Given current algorithms and their ability to learn from log data, it would hypothetically only require a subset of users concerned about harms in Web search to shift the results for everyone else (i.e., a positive externality). The environment could naturally evolve into something more protective of individuals and society as a whole.

We recognize that our framework is just a start, and there is much more to be considered, especially in the space of ensuring overall commercial value and limiting the impacts on system performance. Embedding the current, initial framework in the IIR view, which takes a multi-faceted and interdisciplinary approach inclusive of these additional factors, will ensure that it can mature into a comprehensive and realistic framework. A key message from our proposal is that the methods from cognitive science introduced in our framework appear particularly promising [33, 41, 45]. We hope that this framework will be a useful discussion point for members of all the communities involved as they work towards the common goal of minimizing harm for searchers and our society.

Acknowledgements

Thank you to the reviewers for their constructive feedback, which has helped strengthen this paper. This work was supported by the Economic and Social Research Council, grant number ES/M010236/1 (The Human Rights Big Data and Technology (HRBDT) Project).

References

1. Acquisti, A., Adjerid, I., Balebako, R., Brandimarte, L., Cranor, L.F., Komanduri, S., Leon, P.G., Sadeh, N., Schaub, F., Sleeper, M., Wang, Y., Wilson, S.: Nudges for privacy and security: Understanding and assisting users' choices online. ACM Comput. Surv. 50(3) (August 2017)
2. Alhindi, A., Kruschwitz, U., Fox, C., Albakour, M.D.: Profile-based summarisation for web site navigation. ACM Transactions on Information Systems (TOIS) 33(1), 4:1–4:39 (Mar 2015)
3. Azzopardi, L.: The economics in interactive information retrieval. In: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 15–24. SIGIR '11 (2011)
4. Azzopardi, L.: Modelling interaction with economic models of search. In: Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 3–12. SIGIR '14, ACM (2014)
5. Azzopardi, L., Thomas, P., Craswell, N.: Measuring the utility of search engine result pages: An information foraging based measure. In: The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 605–614. SIGIR '18 (2018)
6. Azzopardi, L., Zuccon, G.: An analysis of the cost and benefit of search interactions. In: Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval. pp. 59–68. ICTIR '16 (2016)
7. Baeza-Yates, R.: Bias on the web. Communications of the ACM 61(6), 54–61 (2018)
8. Bakshy, E., Messing, S., Adamic, L.A.: Exposure to ideologically diverse news and opinion on Facebook. Science 348(6239), 1130–1132 (2015)
9. Basile, V., Bosco, C., Fersini, E., Debora, N., Patti, V., Pardo, F.M.R., Rosso, P., Sanguinetti, M., et al.: SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter. In: 13th International Workshop on Semantic Evaluation. pp. 54–63. Association for Computational Linguistics (2019)
10. Bateman, S., Teevan, J., White, R.W.: The search dashboard: How reflection and comparison impact search behavior. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. pp. 1785–1794. CHI '12, Association for Computing Machinery (2012)
11. Belkin, N.J.: Some(what) grand challenges for information retrieval. SIGIR Forum 42(1), 47–54 (2008)
12. Bolukbasi, T., Chang, K.W., Zou, J.Y., Saligrama, V., Kalai, A.T.: Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In: Advances in Neural Information Processing Systems 29. pp. 4349–4357. NIPS, Curran Associates, Inc. (2016)
13. Botsman, R.: Big data meets Big Brother as China moves to rate its citizens. Wired UK 21 (2017)
14. Cadwalladr, C., Graham-Harrison, E.: Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach. The Guardian 17, 22 (2018), https://www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election
15. Carlsen, A., Haque, F.: What does Facebook consider hate speech? Take our quiz. https://www.nytimes.com/interactive/2017/10/13/technology/facebook-hate-speech-quiz.html (October 2017), (Accessed on 10/21/2018)
16. Conger, K., Alba, D.: Twitter refutes inaccuracies in Trump's tweets about mail-in voting. The New York Times. https://www.nytimes.com/2020/05/26/technology/twitter-trump-mail-in-ballots.html (May 2020), (Accessed on 06/14/2020)
17. Croft, W.B.: The importance of interaction for information retrieval. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 1–2. SIGIR '19, ACM, New York, NY, USA (2019)
18. Croft, W.B., Metzler, D., Strohman, T.: Search Engines: Information Retrieval in Practice. Pearson Education (2015)
19. Davenport, T.H., Patil, D.: Data scientist: The sexiest job of the 21st century. Harvard Business Review 90(5), 70–76 (2012)
20. Davidson, T., Warmsley, D., Macy, M., Weber, I.: Automated hate speech detection and the problem of offensive language. arXiv preprint arXiv:1703.04009 (2017)
21. Dervin, B.: Sense-making theory and practice: An overview of user interests in knowledge seeking and use. Journal of Knowledge Management 2(2), 36–46 (1998)
22. Diaz, F.: Worst practices for designing production information access systems. ACM SIGIR Forum 50(1), 2–11 (2016)
23. Elsweiler, D., Trattner, C., Harvey, M.: Exploiting food choice biases for healthier recipe recommendation. In: Proceedings of the 40th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 575–584. SIGIR '17, ACM (2017)
24. European Union: General Data Protection Regulation (GDPR) – Official Legal Text (April 2016), https://gdpr-info.eu/, (Accessed on 04/02/2018)
25. Fouad, I., Bielova, N., Legout, A., Sarafijanovic-Djukic, N.: Missed by filter lists: Detecting unknown third-party trackers with invisible pixels. Proceedings on Privacy Enhancing Technologies 2020(2), 499–518 (2020)
26. Fuhr, N.: Some common mistakes in IR evaluation, and how they can be avoided. ACM SIGIR Forum 51(3), 32–41 (2017)
27. Fuhr, N., Giachanou, A., Grefenstette, G., Gurevych, I., Hanselowski, A., Järvelin, K., Jones, R., Liu, Y., Mothe, J., Nejdl, W., Peters, I., Stein, B.: An information nutritional label for online documents. ACM SIGIR Forum 51(3), 46–66 (2017)
28. Gigerenzer, G.: On the supposed evidence for libertarian paternalism. Review of Philosophy and Psychology 6(3), 361–383 (2015)
29. Gigerenzer, G., Hertwig, R., Pachur, T. (Eds.): Heuristics: The Foundations of Adaptive Behavior. Oxford University Press (2011)
30. Google: Search Education. https://www.google.com/insidesearch/searcheducation/, (Accessed on 06/14/2020)
31. Hagen, M., Potthast, M., Büchner, M., Stein, B.: Webis: An ensemble for Twitter sentiment detection. In: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). pp. 582–589 (2015)
32. Hertwig, R.: When to consider boosting: Some rules for policy-makers. Behavioural Public Policy 1(2), 143–161 (2017)
33. Hertwig, R., Grüne-Yanoff, T.: Nudging and boosting: Steering or empowering good decisions. Perspectives on Psychological Science 12(6), 973–986 (2017)
34. Hilligoss, B., Rieh, S.Y.: Developing a unifying framework of credibility assessment: Construct, heuristics, and interaction in context. Information Processing & Management 44(4), 1467–1484 (2008)
35. Horvitz, E., Mulligan, D.: Data, privacy, and the greater good. Science 349(6245), 253–255 (2015)
36. Ingwersen, P.: Cognitive perspectives of information retrieval interaction: Elements of a cognitive IR theory. Journal of Documentation 52(1), 3–50 (1996)
37. Ingwersen, P., Järvelin, K.: The Turn: Integration of Information Seeking and Retrieval in Context, vol. 18. Springer Science & Business Media (2005)
38. Joachims, T., Granka, L., Pan, B., Hembrooke, H., Gay, G.: Accurately interpreting clickthrough data as implicit feedback. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 154–161. SIGIR '05, ACM (2005)
39. Jones, K.S.: Privacy: What's different now? Interdisciplinary Science Reviews 28(4), 287–292 (2003)
40. Kelly, D.: Methods for evaluating interactive information retrieval systems with users. Foundations and Trends® in Information Retrieval 3(1–2), 1–224 (2009)
41. Kozyreva, A., Lewandowsky, S., Hertwig, R.: Citizens versus the internet: Confronting digital challenges with cognitive tools. Psychological Science in the Public Interest (in press, 2020)
42. Lee, M.D., Loughlin, N., Lundberg, I.B.: Applying one reason decision-making: The prioritisation of literature searches. Australian Journal of Psychology 54(3), 137–143 (2002)
43. Lewandowsky, S., Ecker, U.K., Cook, J.: Beyond misinformation: Understanding and coping with the "post-truth" era. Journal of Applied Research in Memory and Cognition 6(4), 353–369 (2017)
44. Libert, T.: Privacy implications of health information seeking on the web. Communications of the ACM 58(3), 68–77 (2015)
45. Lorenz-Spreen, P., Lewandowsky, S., Sunstein, C.R., Hertwig, R.: How behavioural sciences can promote truth, autonomy and democratic discourse online. Nature Human Behaviour (2020)
46. MacAvaney, S., Yao, H.R., Yang, E., Russell, K., Goharian, N., Frieder, O.: Hate speech detection: Challenges and solutions. PLoS ONE 14(8) (2019)
47. Marchionini, G., Geisler, G., Brunk, B.: Agileviews: A human-centered framework for interfaces to information spaces. In: Proceedings of the Annual Conference of the American Society for Information Science. pp. 271–280 (2000)
48. Maubach, N., Hoek, J., Mather, D.: Interpretive front-of-pack nutrition labels: Comparing competing recommendations. Appetite 82, 67–77 (2014)
49. McDowell, M., Rebitschek, F.G., Gigerenzer, G., Wegwarth, O.: A simple tool for communicating the benefits and harms of health interventions: A guide for creating a fact box. MDM Policy & Practice 1(1) (2016)
50. Moraveji, N., Russell, D., Bien, J., Mease, D.: Measuring improvement in user search performance resulting from optimal search tips. In: Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 355–364. SIGIR '11, Association for Computing Machinery (2011)
51. Novin, A., Meyers, E.: Making sense of conflicting science information: Exploring bias in the search engine result page. In: Proceedings of the 2017 Conference on Human Information Interaction and Retrieval. pp. 175–184. CHIIR '17, ACM (2017)
52. Oliver, A.: Nudging, shoving, and budging: Behavioural economic-informed policy. Public Administration 93(3), 700–714 (2015)
53. Orben, A.: Teenagers, screens and social media: A narrative review of reviews and key studies. Social Psychiatry and Psychiatric Epidemiology pp. 1–8 (2020)
54. Perez, M., Avila, S., Moreira, D., Moraes, D., Testoni, V., Valle, E., Goldenstein, S., Rocha, A.: Video pornography detection through deep learning techniques and motion information. Neurocomputing 230, 279–293 (2017)
55. Pogacar, F.A., Ghenai, A., Smucker, M.D., Clarke, C.L.: The positive and negative influence of search results on people's decisions about the efficacy of medical treatments. In: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval. pp. 209–216. ICTIR '17, ACM (2017)
56. Reijula, S., Hertwig, R.: Self-nudging and the citizen choice architect. Behavioural Public Policy pp. 1–31 (2020). https://doi.org/10.1017/bpp.2020.5
57. Reinhard, C.D., Dervin, B.: Comparing situated sense-making processes in virtual worlds: Application of Dervin's sense-making methodology to media reception situations. Convergence 18(1), 27–48 (2012)
58. Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1(5), 206–215 (2019)
59. Russell, D., Callegaro, M.: How to be a better web searcher: Secrets from Google scientists. Researchers who study how we use search engines share common mistakes, misperceptions and advice. Scientific American (2019), https://blogs.scientificamerican.com/observations/how-to-be-a-better-web-searcher-secrets-from-google-scientists/
60. Ruthven, I.: Interactive information retrieval. Annual Review of Information Science and Technology 42(1), 43–91 (2008)
61. Sakai, T.: Statistical significance, power, and sample sizes: A systematic review of SIGIR and TOIS, 2006–2015. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 5–14. SIGIR '16, ACM, New York, NY, USA (2016)
62. Sankin, A.: Want to find a misinformed public? Facebook's already done it. https://themarkup.org/coronavirus/2020/04/23/want-to-find-a-misinformed-public-facebooks-already-done-it (April 2020), (Accessed on 04/30/2020)
63. Schulz, W.: Regulating intermediaries to protect privacy online – the case of the German NetzDG. Personality and Data Protection Rights on the Internet (2018)
64. Schwarz, J., Morris, M.: Augmenting web pages and search results to support credibility assessment. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. pp. 1245–1254. CHI '11, Association for Computing Machinery (2011)
65. Siegrist, M., Leins-Hess, R., Keller, C.: Which front-of-pack nutrition label is the most efficient one? The results of an eye-tracker study. Food Quality and Preference 39, 183–190 (2015)
66. Smith, C.L., Kantor, P.B.: User adaptation: Good results from poor systems. In: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 147–154. SIGIR '08, Association for Computing Machinery, New York, NY, USA (2008)
67. Smith, C.L., Rieh, S.Y.: Knowledge-context in search systems: Toward information-literate actions. In: Proceedings of the 2019 Conference on Human Information Interaction and Retrieval. pp. 55–62. CHIIR '19, ACM, New York, NY, USA (2019)
68. Thaler, R.H., Sunstein, C.R.: Nudge: Improving Decisions about Health, Wealth, and Happiness. Penguin (2009)
69. The United Nations: Universal Declaration of Human Rights (1948)
70. The United Nations General Assembly: International Covenant on Civil and Political Rights. Treaty Series 999, 171 (December 1966)
71. The United Nations General Assembly: International Covenant on Economic, Social, and Cultural Rights. Treaty Series 999, 171 (December 1966)
72. Thompson, C.: Google's China problem (and China's Google problem). The New York Times (2006), (Accessed on 06/2020)
73. Tufekci, Z.: Facebook said its algorithms do help form echo chambers, and the tech press missed it. New Perspectives Quarterly 32(3), 9–12 (2015)
74. Tufekci, Z.: We're building a dystopia just to make people click on ads (September 2017), https://www.ted.com/
75. U.S. Government Publishing Office: 47 USC 230: Protection for private blocking and screening of offensive material. https://www.govinfo.gov/content/pkg/USCODE-2011-title47/pdf/USCODE-2011-title47-chap5-subchapII-partI-sec230.pdf (1996), (Accessed on 06/2020)
76. Van Herpen, E., Van Trijp, H.C.: Front-of-pack nutrition labels: Their effect on attention and choices when consumers have varying goals and time constraints. Appetite 57(1), 148–160 (2011)
77. White, R.: Beliefs and biases in web search. In: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 3–12. SIGIR '13, ACM (2013)
78. White, R.W.: Interactions with Search Systems. Cambridge University Press (2016)
79. Wilson, S., Schaub, F., Dara, A.A., Liu, F., Cherivirala, S., Leon, P.G., Andersen, M.S., Zimmeck, S., Sathyendra, K.M., Russell, N.C., et al.: The creation and analysis of a website privacy policy corpus. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. pp. 1330–1340 (2016)
80. Young, K.S.: Internet addiction: A new clinical phenomenon and its consequences. American Behavioral Scientist 48(4), 402–415 (2004)
81. Yu, Z., Macbeth, S., Modi, K., Pujol, J.M.: Tracking the trackers. In: Proceedings of the 25th International Conference on World Wide Web. pp. 121–132. WWW '16, International World Wide Web Conferences Steering Committee (2016)
82. Zimmer, M.: Privacy on Planet Google: Using the theory of contextual integrity to clarify the privacy threats of Google's quest for the perfect search engine. Journal of Business & Technology Law 3, 109–126 (2008)
83. Zimmer, M.: Web search studies: Multidisciplinary perspectives on web search engines, pp. 507–521. Springer Netherlands (2010)
84. Zimmerman, S., Fox, C., Kruschwitz, U.: Improving hate speech detection with deep learning ensembles. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (Miyazaki, Japan). LREC 2018 (2018)
85. Zimmerman, S., Thorpe, A., Chamberlain, J., Kruschwitz, U.: Towards search strategies for better privacy and information. In: Proceedings of the 2020 Conference on Human Information Interaction and Retrieval. pp. 124–134. CHIIR '20, Association for Computing Machinery (2020)
86. Zimmerman, S., Thorpe, A., Fox, C., Kruschwitz, U.: Investigating the interplay between searchers' privacy concerns and their search behavior. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 953–956. SIGIR '19, ACM (2019)
87. Zimmerman, S., Thorpe, A., Fox, C., Kruschwitz, U.: Privacy nudging in search: Investigating potential impacts. In: Proceedings of the 2019 Conference on Human Information Interaction and Retrieval. pp. 283–287. CHIIR '19 (2019)
88. Zuboff, S.: Big other: Surveillance capitalism and the prospects of an information civilization. Journal of Information Technology 30(1), 75–89 (Mar 2015)