=Paper= {{Paper |id=Vol-1873/IWPE17_paper_23 |storemode=property |title=Engineering Privacy and Protest: A Case Study of AdNauseam |pdfUrl=https://ceur-ws.org/Vol-1873/IWPE17_paper_23.pdf |volume=Vol-1873 |authors=Daniel C. Howe,Helen Nissenbaum |dblpUrl=https://dblp.org/rec/conf/sp/HoweN17 }} ==Engineering Privacy and Protest: A Case Study of AdNauseam== https://ceur-ws.org/Vol-1873/IWPE17_paper_23.pdf
                                   Engineering Privacy and Protest:
                                    a Case Study of AdNauseam
                                    Daniel C. Howe                                          Helen Nissenbaum
                                School of Creative Media                                       Cornell Tech
                               City University, Hong Kong                                   New York University
                               Email: daniel@rednoise.org                                   Email: hfn1@nyu.edu




   Abstract—The strategy of obfuscation has been broadly applied—          tools addressing this issue, by leveraging obfuscation in an attempt
in search, location tracking, private communication, anonymity—and         to frustrate profiling based on interests revealed through ad clicks.
has thus been recognized as an important element of the privacy            Specifically AdNauseam enables users to click ads behind the scenes
engineer’s toolbox. However, there remains a need for clearly articu-      in order to register visits in ad network databases. The aim of the
lated case studies describing not only the engineering of obfuscation      software is to pollute the data gathered by trackers and render their
mechanisms but, further, providing a critical appraisal of obfusca-        efforts to profile less effective and less profitable. At the same time,
tion’s fit for specific socio-technical applications. This is the aim of   the software allows users an avenue of proactive expression, by
our paper, which presents our experiences designing, implementing,         actively disrupting the economic model that drives the system, and
and distributing AdNauseam, an open-source browser extension that          by sowing mistrust (advertisers generally pay ad networks for clicks)
leverages obfuscation to frustrate tracking by online advertisers.         within it. Additionally, AdNauseam makes the ads it collects available
   At its core, AdNauseam works like a list-based blocker, hiding          to users to explore via interactive real-time displays.
or blocking ads and trackers. However, it provides two additional
features. First, it collects the ads it finds in its ‘Vault’, allowing     A. Engineering Philosophy
users to interactively explore the ads they have been served, and             Our approach builds on prior work that explicitly takes social
providing insight into the algorithmic profiles created by advertising     values into account during tool design [14], [12], [26]. In plan-
networks. Second, AdNauseam simulates clicks on ads in order to            ning, development, and testing phases, we have integrated values-
confuse trackers and diminish the value of aggregated tracking data.       oriented concerns as first-order “constraints” together with more
   A critic might ask: why click? Why not simply hide ads from             typical metrics such as efficiency, speed, and robustness. Specific
users and hide users from trackers? The twofold answer reveals             instances of values-oriented constraints include transparency in in-
what may be distinctive elements of the AdNauseam approach. To             terface, function, code, process, and strategy; personal autonomy,
begin, we conceptualize privacy as a societal value. Whereas many          where users need not rely on third parties; social privacy with
privacy tools offer solutions only for individual users, AdNauseam is      distributed/community-oriented action; minimal resource consump-
built on the assumption that, often, effective privacy protection must     tion (cognitive, bandwidth, client and server processing); and us-
be infused throughout a system. This assumption presents different         ability (size, speed, configurability, and ease-of-use). Enumerating
and interesting engineering challenges. Second, AdNauseam seeks to         values-oriented constraints early in the design process enables us to
concurrently achieve the goal of resistance through protest. And since     iteratively revisit and refine them in the context of specific technical
protest frequently involves being vocal, AdNauseam’s core design           decisions. Where relevant in the following sections, we discuss ways
conflicts at times with conceptions of privacy based on secrecy or         in which AdNauseam benefited from this values-oriented approach,
concealment. While such tensions, and the tradeoffs that result, are       as well as tensions between design goals that emerged. We have
not uncommon in privacy engineering, the process of designing and          also followed strategies from privacy-by-design [19], [24], [20], [22],
building AdNauseam demanded their systematic consideration.                [5], including Data Minimization, Legitimacy Analysis and Socially-
   In this paper we present challenges faced in attempting to apply        informed Risk Analysis as elements of our design process.
obfuscation to a new domain, that of online tracking by advertisers.
We begin with the goals of the project and the implemented features        B. Design Goals and Constraints
to which they map. We then present our engineering approach, the              The AdNauseam extension attempts to realize three tangible goals
set of tensions that arose during implementation, and the ways in          for the context of online tracking via advertising. The first is to
which these tensions were addressed. We discuss our initial evaluation     offer protection: protection for users against malware and
efforts on both technical and ethical dimensions, and some of the          “malvertising” (malicious software that leverages advertising mech-
challenges that remain. We conclude with thoughts on the broader           anisms to gain access to users’ systems [31]), as well as protection
issues facing privacy tools that must operate within complex socio-        against data aggregation and profiling (either for individual users,
technical contexts—especially those dominated by actors openly             the aggregate set of users, or both) via clicks on advertisements. The
resistant to them—informed by our experience with AdNauseam’s              second goal is to provide a means of proactive engagement, allowing
ban from Google’s Chrome store.                                            users an avenue for expression of their dissatisfaction with current
                                                                           advertising mechanisms to those in control of such systems. In the
                         I. INTRODUCTION                                   case of AdNauseam, this expression has an interventional aspect,
  The ad blocking wars [34] reflect wide ranging resistance to the         as the software actively attempts to disrupt the economic model
online advertising landscape. AdNauseam, an open-source, cross-            that drives advertising surveillance. The third goal is to facilitate
platform browser extension, contributes to the growing arsenal of          transparency regarding the advertising ecosystem—and the profiling
on which it operates—by providing users with the ability to view              which may have yielded important insights. Yet this both clarified
the ads they are served in real-time, and later, to explore interactive       our own position in regard to data collection, and also enabled us to
visualizations of the ads collected over time, providing a rare glimpse       sidestep potential problems of data leakage or interception in transit.
of how advertisers view them.
                                                                                   [D]ata minimization does not necessarily imply anonymity,
C. Social-Technical Context                                                        but may also be achieved by means of concealing informa-
   AdNauseam applies obfuscation to the context of tracking by                     tion related to identifiable individuals [19].
online advertisers. If we compare this context to others where                Additionally we have applied the principle of data minimization to
obfuscation has been applied (e.g., search), we notice similarities and       our ad export feature, which allows users to export their aggregate ad
differences. In both cases users are confronted with large corporate          collection (as JSON) in order to sync or migrate between machines,
entities (search engines and advertising networks) whose business             to backup, or to share. From our experiences user-testing this feature,
models depend on the bulk collection of personal data. And in both            we noted that such exports contained potentially sensitive data,
cases users have little say in shaping these interactions, except to          specifically users’ click trails (stored as the locations for found ads),
take it or leave it. One difference is that although “leaving it” may         be feasible in search, using instead a non-tracking alternative such as
be feasible in search, using instead a non-tracking alternative such as       this data we noted that it also existed in the browser’s local storage,
DuckDuckGo, it is unclear what alternative exists for those wanting to        which could potentially be accessed by a malicious actor. Thus we
opt-out of advertising surveillance – cease using the Web? 1 A second         subsequently implemented encryption for this data, both in memory
difference is the degree to which users want the service provided by          and in storage, as well as offering users the option, before each export,
the trackers. In search we can assume most users do in fact want (or          to redact ad locations if desired.
need) the service offered by the search engine, which also happens
to be the tracker. Advertising, by contrast, is not as clear. Tracking        F. Legitimacy Analysis
aside, some users may find ads useful; while others prefer not to see
ads at all, while still others might tolerate non-tracking ads in order to         Before any privacy-by-design activities are embarked upon,
avoid subscription fees. A third, structural difference, is that in search         a discussion needs to take place with respect to the “legitimacy”
there is a single adversary with full knowledge of both search and                 of the desired system given its burden on privacy.
meta data, including prior queries, click frequency, results clicked,              [28] [20]
timing data, prior interest profiles, etc. By contrast, the advertising       A critic might ask: why click? Why not simply hide ads from
ecosystem is fractured into multiple actors, including ad-hosting web         users and hide users from trackers? There are two reasons. First,
sites, advertisers, trackers, advertising networks, ad servers, analytics     AdNauseam is inspired by the path-breaking work of Priscilla Regan,
companies, etc., each of which interacts with the user in different           who argued that beyond the protection of individual interests, privacy
ways, and is privy to different aspects of those interactions.                may serve social ends, similar to collective goods such as clean
D. Feature Mapping                                                            air or national defense [38]. This notion of privacy as a collective
                                                                              good presents interesting engineering and evaluation challenges,
   The mapping of goals to features (and to the system modules                which, in our view, warrant close attention. Thus AdNauseam may
described below) was performed as follows: The goal of protection             stimulate deliberation not only on its particular features, but may
was implemented at a basic level by the clicking of collected ads,            draw attention to the conception of privacy it seeks to promote. A
via the visitation module; and by the blocking of non-visual trackers         second reason for clicking, as opposed to simply blocking, is that
and other malware, via the detection module. The former attempts to           AdNauseam seeks concurrently to achieve the goal of expressive
protect the user from data profiling via clicks on advertisements, and        resistance to tracking through protest. And since protest generally
the latter from non-visual tracking and potential malware. Expression         involves being vocal, AdNauseam’s design seeks to give voice to
was realized through clicks, again via the visitation module, and also        users. Rather than enacting privacy as concealment, AdNauseam
in our implementation of the EFF’s Do Not Track (DNT) mechanism               provides a means for users to express, in plain sight, their dissent
[11]. With DNT enabled (the default setting), the DNT header is sent          by disrupting the dominant model of commercial surveillance. This
with all requests, and ads on DNT sites remain visible. Ads on these          approach embodies a principle drawn from the theory of contextual
DNT pages are also not visited by AdNauseam. The goal of increased            integrity, namely, privacy as appropriate flow of information [36].
transparency is realized through the visualization module, specifically       Thus, AdNauseam does not hide deliberate clicks from trackers but
via the real-time menu interface, where users can watch as new ads            rather, by surrounding these clicks with decoy clicks, obfuscates
are discovered, then visited; the vault interface (described below),          inferences from clicks to users’ interests, which may be manipulated
and a range of explanatory links embedded throughout AdNauseam’s              in various ways, including via behavioral advertising. AdNauseam
settings pages. Additionally, an in-depth Frequently-Asked-Questions          does not block clicks; instead it blocks inappropriate access to interest
(FAQ) list is linked from multiple locations within the interface.            profiles that trackers may infer from them.
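
The DNT behavior described under Feature Mapping can be illustrated with a minimal sketch. It is not AdNauseam's actual code: the names `AdHandling`, `with_dnt_header`, and `handling_for`, and the idea of a precomputed set of DNT-committed domains, are assumptions made for illustration. The sketch assumes only what the text states: the DNT header accompanies all requests, and ads on sites that commit to DNT are neither hidden nor visited by default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdHandling:
    hide: bool   # hide the ad element from the user
    visit: bool  # make the ad eligible for a simulated click


def with_dnt_header(headers: dict) -> dict:
    """Return a copy of the request headers with the DNT header set."""
    updated = dict(headers)
    updated["DNT"] = "1"
    return updated


def handling_for(page_domain: str, dnt_sites: frozenset,
                 dnt_enabled: bool = True) -> AdHandling:
    """Decide how ads on a page are treated under the DNT setting.

    `dnt_sites` is a hypothetical set of domains known to have made
    the DNT commitment; how that set is obtained is out of scope here.
    """
    if dnt_enabled and page_domain in dnt_sites:
        # Sites that commit to DNT keep their ads visible and unvisited.
        return AdHandling(hide=False, visit=False)
    return AdHandling(hide=True, visit=True)
```

For example, `handling_for("respectful.example", frozenset({"respectful.example"}))` yields a handling in which the ad is neither hidden nor visited, while any domain outside the set falls back to the default of hiding and visiting.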
E. Data Minimization                                                             Some have argued that simply using a quality ad blocker offers
                                                                              similar degrees of protection and expression. Although basic ad
   Following a growing body of literature on privacy-by-design [19],
                                                                              blocking may protect individual users, its scope of impact is limited to
[24], [20], [22], [5], our design and implementation process followed
                                                                              those users. There is also a need for tools whose impacts reach beyond
principles of data minimization. Thus AdNauseam was designed to
                                                                              those individuals who know they exist and possess the sufficient
function without ever communicating to a “home server” or sending
                                                                              technical competence and confidence to install them. AdNauseam’s
user-data to any other entity, for any reason. For developers, this
                                                                              aim of polluting aggregate data has the potential to reduce its value
meant we were unable to access usage patterns and related data,
                                                                              to profilers and, more generally, to draw attention to the problematic
  1 From this perspective, obfuscation may be even more legitimate for        practices of behavioral advertisers. Although blocking may also
advertising than for search, due to the lack of viable alternative options.   realize expressive goals, for example, via industry studies and media
reports, the expressed message differs from AdNauseam’s. Ad                      are made to the first group, on which this module focuses, which
blocker use is generally interpreted by the advertising industry as a            includes ad and ad-tracking services [43]. This module determines
rejection of problematic aesthetic aspects of the ad experience, while           which requests to block and which to allow, and distinguishes, in the
AdNauseam’s expressive intent specifically targets the industry’s                latter category, between those that yield visual elements and those
unethical surveillance practices 2 . Anecdotal reports from tool users,          used only for tracking.
to which we return briefly below, also suggest qualitative differences              In order to categorize such requests, we leverage the capabilities of
of intent in their use of AdNauseam.                                             the open-source uBlock-Origin [17] project, a configurable, list-based
   Finally, critics have claimed that AdNauseam harms independent                “blocker” that is effective and efficient [43]. Like other blockers,
content producers who can no longer support their sites. As this                 uBlock allows users to specify publicly accessible lists of resources
critique touches a broad array of tools, including standard ad blockers,         which contain syntactic matching rules for the retrieval of web
it will take us too far afield to address it fully here. However, setting        resources. Based on these lists, we first determine whether a request
aside the rejoinder which points out that these sites are enabling               should be blocked or allowed, and then, if allowed, whether it should
surveillance, or more harshly, “selling out” their visitors, the hope            be visible or hidden. If hidden, the element is downloaded and
is that loosening the chokehold of tracking over web and mobile                  included in the page, but made invisible to the user via a content-
domains will allow other business models to flourish. Toward this                script. Both blocking and hiding are specified via rules that may
end we have enabled support in AdNauseam for the EFF’s DNT                       include the serving domain, the type of resource (e.g., images or
mechanism, a machine-verifiable, and potentially legally-binding,                video), and/or properties of the DOM container (for example, a DIV
assertion on the part of sites that commit to privacy-respecting                 with a specific id or class). Rules are included from widely distributed
behavior [11]. For sites that make this commitment, AdNauseam does               lists that are updated and maintained by individuals and communities
not (by default) hide, block, or click their ads.                                (e.g., “EasyList” [8]). Additionally, users can augment these lists with
                                                                                 custom rules they create, either to block or hide new content, or to
G. Socially-informed Risk Analysis                                               whitelist a site, page, or element.
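
The list-based categorization described for the Detection module (block or allow, then visible or hidden, with user whitelisting) might be sketched as follows. This is a simplified illustration, not uBlock's filter syntax or matching engine: the `Rule` structure, the suffix-based domain match, and the precedence order (whitelist over block over hide) are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    domain: str                          # serving domain the rule applies to
    action: str                          # "block", "hide", or "allow" (whitelist)
    resource_type: Optional[str] = None  # e.g. "image"; None matches any type


def matches(rule: Rule, domain: str, resource_type: str) -> bool:
    """A rule matches its own domain and any subdomain of it."""
    domain_ok = domain == rule.domain or domain.endswith("." + rule.domain)
    type_ok = rule.resource_type is None or rule.resource_type == resource_type
    return domain_ok and type_ok


def classify(domain: str, resource_type: str, rules: Sequence[Rule]) -> str:
    """Return "block", "hide", or "show" for a single request."""
    actions = {r.action for r in rules if matches(r, domain, resource_type)}
    if "allow" in actions:   # assumed: user whitelist rules take precedence
        return "show"
    if "block" in actions:   # disallowed at the network level
        return "block"
    if "hide" in actions:    # downloaded, then made invisible on the page
        return "hide"
    return "show"
```

With rules such as `Rule("tracker.example", "block")` and `Rule("ads.example", "hide", "image")`, an image request to `cdn.ads.example` is classified as `"hide"`, which corresponds to the case the text singles out: a visual ad is allowed through so that it can be collected, but is kept out of the user's view.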
  Given the goals we hoped to achieve and the set of features to                    Requests marked as blockable in AdNauseam are disallowed at
which these mapped, we set out to identify risks to which users                  the network level, mimicking the behavior of most other blockers,
might be exposed. For each such risk, we considered the degree to                including uBlock, AdBlock Plus, Adblock, and Adguard, which
which the user would be exposed when browsing the web using                      perform blocking on some percentage of requests, and hiding on the
an unmodified browser, in comparison to the degree of exposure                   remainder. The difference for AdNauseam is that a subset of requests
while using AdNauseam. Finally we considered their exposure using                which might be blocked in other blockers must be allowed in AdNau-
existing alternatives, ad-blockers like AdBlock Plus [1] or wide-                seam; specifically those that result in visual advertisements.3 At the
spectrum blockers like uBlock [17](see, for example, Figure 3 below).            element hiding level, the detection module is invoked incrementally,
The following risks were identified:                                             via content-scripts, as page elements are loaded (or dynamically
  • Increased tracking by advertisers and other data-gatherers                   generated) and inserted into the DOM. Elements marked for hiding
  • Personal data leakage (via clicks, hiding or export)                         are assigned a CSS class that sets their display to invisible, and the
  • Harms via malware or “malvertising”                                          surrounding DOM is collapsed so as not to leave blank space on the
To establish a lower-bound on exposure, we imposed a constraint that             page. Each hidden element (generally a visual ad) is then passed to
exposure with AdNauseam must be strictly lower on all dimensions                 the Extraction module.
than with an unmodified browser. Conversely, we hypothesized that                B. Extraction
the current performance of uBlock, the open-source blocker with
                                                                                    Once a visual element has been detected and hidden, we must
the best performance metrics, would provide an upper-bound on
                                                                                 then determine whether it is in fact an advertisement. If so, the
exposure. As AdNauseam must interact, at least minimally, with
                                                                                 extraction module of the system must extract the properties needed
advertising servers in order to fulfill its functions, it would necessarily
                                                                                 by the Visualization and Visitation modules. These properties include
expose users to more risk than the current state-of-the-art blocker.
                                                                                 timestamp, size, content-URL, target-URL, page-detected-on, etc.
For all cases (see Comparative Evaluation below) we were able to
                                                                                 Text-only ads, as often found on search engines, present a different
verify that risk to users was diminished with AdNauseam, both in
                                                                                 challenge, as these are generally served inline along with page content
comparison with the no-blocker case, and to AdBlock Plus, the most
                                                                                 rather than requested from a 3rd-party server. In these non-image
commonly installed blocker [37].
                                                                                 cases, several additional fields are aggregated to form the content
                            II. ARCHITECTURE                                     payload (title, description, tagline) and there is no content-URL
                                                                                 linking to an external resource. To enable extraction of such data,
   The AdNauseam software is comprised of four modules, each
                                                                                 AdNauseam includes a custom set of CSS selectors used to parse
responsible for one of its primary functions: detection, extraction,
                                                                                 specific DOM attributes from text-ad sites (Google, Ask, Bing, etc.).
visualization, and visitation.
                                                                                 Such filters run only on specific domains where text-ads have been
A. Detection                                                                     previously discovered.
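
The difference between image ads and text-only ads described under Extraction can be made concrete with a small record type. The sketch below is illustrative only: the field names loosely follow the properties listed in the text, while `TEXT_AD_DOMAINS`, `AdRecord`, and `extract_text_ad` are hypothetical, and the CSS-selector step is assumed to have already produced a dictionary of fields.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Hypothetical set of domains on which text-ad extraction is enabled.
TEXT_AD_DOMAINS = {"search.example"}


@dataclass(frozen=True)
class AdRecord:
    found_at: float                    # timestamp of detection
    page_url: str                      # page the ad was detected on
    target_url: str                    # where a click on the ad leads
    content_url: Optional[str] = None  # external resource; image ads only
    title: Optional[str] = None        # text ads only
    description: Optional[str] = None  # text ads only
    tagline: Optional[str] = None      # text ads only

    @property
    def is_text_ad(self) -> bool:
        # Text ads are served inline, so they carry no content URL.
        return self.content_url is None


def extract_text_ad(page_url: str, fields: dict, found_at: float) -> Optional[AdRecord]:
    """Build a text-ad record, but only on domains known to serve text ads."""
    host = urlparse(page_url).hostname or ""
    if host not in TEXT_AD_DOMAINS:
        return None
    if not fields.get("title") or not fields.get("target_url"):
        return None  # incomplete match; treat as not an ad
    return AdRecord(
        found_at=found_at,
        page_url=page_url,
        target_url=fields["target_url"],
        title=fields["title"],
        description=fields.get("description"),
        tagline=fields.get("tagline"),
    )
```

Gating on the page's host mirrors the statement that such filters run only on specific domains, which keeps the extra parsing off pages where no text ads are expected.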
   This module is responsible for the analysis and categorization of             C. Visualization
requests following a page view. Such requests, most often to third-                 In order to facilitate transparency regarding tracking and profiling
parties, are first classified according to the type of elements they             by advertisers, AdNauseam provides users with interactive visualiza-
realize; whether advertisements, analytics, beacons, social-media, or            tions of their collected ad data. These visualizations provide both
functional widgets. The largest proportion of such requests (40-50%)
                                                                                   3 Interestingly, it is exactly this standard combination of functions—hiding
  2 This is our intent at least; Google’s recent ban of the software may imply   and blocking—that Google cites as being in violation of its Terms of Service,
that this intent is understood.                                                  a claim discussed below in the Distribution section.
                                                                             What are the expected results of visiting some percentage of each
                                                                          user’s collected ads? First, the data profiles of these users stored by
                                                                          advertising networks and data brokers may be polluted, as users’
                                                                          actual interests are hidden by generated clicks. This both protects
                                                                          individual users (assuming they have clicked, or may click, some ad in
                                                                          the future) as well as the larger user community, as aggregate statistics
                                                                          are less accurate and thus less valuable. Second, as advertisers must
                                                                          now potentially pay publishers for decoy clicks, a degree of mistrust
                                                                          is introduced into the economic model that drives the system. This is
                                                                          perhaps the most compelling argument for this strategy, as it could,
                                                                          given adequate adoption, force advertisers to change their behavior,
                                                                          either by developing new algorithms to filter such clicks, and/or by
                                                                          adopting more privacy-friendly policies (e.g., the EFF’s Do Not Track
                                                                          mechanism).
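
                                                                          A minimal sketch of what visiting some percentage of collected ads means in practice follows. It is not taken from the AdNauseam codebase: the function names and the use of a single per-ad probability are assumptions, based only on the description of a user-configurable click-probability setting.

```python
import random
from typing import List, Sequence


def select_ads_to_visit(ads: Sequence[str], click_probability: float,
                        rng: random.Random) -> List[str]:
    """Independently choose each collected ad with the given probability."""
    if not 0.0 <= click_probability <= 1.0:
        raise ValueError("click_probability must be between 0 and 1")
    return [ad for ad in ads if rng.random() < click_probability]


def genuine_click_share(genuine_clicks: int, decoy_clicks: int) -> float:
    """Fraction of recorded clicks that reflect the user's real interests."""
    total = genuine_clicks + decoy_clicks
    return genuine_clicks / total if total else 0.0
```

                                                                          For instance, a user who deliberately clicked 2 ads while 18 decoy visits were generated has a genuine share of 2 / 20 = 0.1, so nine in ten clicks recorded in that profile say nothing about the user's actual interests.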
                                                                          E. Distribution
                 Fig. 1. AdNauseam’s AdVault visualization.
                                                                             Although not often discussed in an engineering context, the distri-
                                                                          bution issues we experienced highlight concerns we imagine will be
                                                                          only more relevant with the growing influence of corporate players
                                                                          in the software ecosystem.
                                                                             The prototype for AdNauseam was initially developed as a Firefox-
                                                                          only extension available in Mozilla’s addon store. In our production
                                                                          release, we added Opera and Chrome support and made the extension
                                                                          available in the Opera and Chrome stores respectively. We distributed
                                                                          upwards of 50,000 copies of the software over the subsequent six
                                                                          months, with the majority via Google’s Chrome store. In January of
                                                                          2017 however, we learned that Google had banned AdNauseam from
                                                                          the store, and further, had begun disallowing even manual installation
                                                                          or updates, effectively locking users out of their own saved data, all
                                                                          without prior notice or warning.
                Fig. 2. Estimated cost to advertising networks.
                                                                             Google responded to our requests for justification by saying that
                                                                          AdNauseam had violated the following clause of the Store’s Terms
                                                                          of Service: “an extension should have a single purpose that is clear to
high-level displays of aggregate data (see Figure 1), as well as the      users.”4 The single purpose of AdNauseam, we would argue, is quite
option to inspect individual ads for a range of data. A number of         clear—namely to resist the non-consensual surveillance conducted by
derived functions provide additional metrics (e.g., the total estimated   advertising networks, of which Google is a prime example. We do
charge to advertising networks for the ads visited for a page, site       recognize that Google might prefer users not to install AdNauseam,
or time-period, as in Figure 2). Ads may be filtered and sorted by        as it opposes their core business model, but the Terms of Service do
date, topic-category, ad-network, page-category, etc. The visualization   not (at least thus far) require extensions to endorse Google’s business
module is a distinct contribution of AdNauseam that attempts to a)        model. Moreover, this is not the justification cited for the ban.
provide users with greater insight concerning their interactions with     Whether or not one is an advocate of obfuscation, it is disconcerting
advertisers, and b) enable interested users and researchers to study      to know that Google can make a privacy extension, along with stored
the ad data collected. To facilitate the latter, we include mechanisms    data and preferences, disappear without warning. Here it is a counter-
for importing and exporting ad data sets (as JSON) from within the        surveillance tool that is banned; perhaps tomorrow it will be a secure
extension. The use of this data for further research, with appropriate    chat app, or password manager. For developers, who, incidentally,
mechanisms for user consent, is an area of future work.                   must pay a fee to post items in the Chrome store, this is cause for
                                                                          concern. Not only can one’s software be banned without warning,
                                                                          but comments, ratings, reviews, releases and statistics are removed
D. Visitation
                                                                          as well.
   This module simulates clicks (or visits) on collected ads, with
                                                                                                   III. DESIGN TENSIONS
the intention of appearing to the serving website (as well as to
advertisers and ad networks) as if the ad had been manually clicked.      A. Indistinguishability and Protection
Currently, these clicks are implemented via AJAX, which simulates           For obfuscation to function effectively as a means of counter-
requests (matching headers, referer, etc.) that the browser would         surveillance, the noise generated must exhibit a high degree of
normally send. This provides users with protection against potential      indistinguishability with regard to data the system intends to capture;
malware in ad payloads, as responses are not executed in the browser,     that is, it must be difficult for an adversary to distinguish injected
and JavaScript, Flash, and other client-side scripting mechanisms
                                                                             4 In the one subsequent email we received, it was stated that a single
are not executed. Similarly, AdNauseam blocks incoming cookies
                                                                          extension should not perform “both blocking and hiding,” a claim that is
in responses to ad visits. The likelihood that a particular ad will       difficult to accept at face value, as most blockers (including uBlock, AdBlock
be clicked depends on the user-configurable click-probability setting     Plus, Adguard, etc.) perform both blocking and hiding, and have not been
described further below.                                                  banned.
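   The Visitation behaviour described above can be illustrated with a
minimal Python sketch. It is not AdNauseam's code (the extension itself
runs in the browser); the function names, the header set, and the example
URLs are assumptions made for illustration. The sketch shows the two ideas
in the text: a visit happens only with the user-configured click
probability, and the response is fetched as inert bytes, so no DOM is
built, no script runs, and no cookie jar is attached.

```python
import random
import urllib.request


def should_visit(click_probability, rng=random):
    """Return True with the user-configured probability (0.0 to 1.0)."""
    if not 0.0 <= click_probability <= 1.0:
        raise ValueError("click_probability must be between 0 and 1")
    # rng.random() is in [0, 1), so 0.0 never visits and 1.0 always visits.
    return rng.random() < click_probability


def build_visit_request(ad_target_url, page_url, user_agent):
    """Build (but do not send) a GET request resembling a manual click.

    The header set here is a hypothetical minimum, not the list used
    by the actual extension.
    """
    headers = {
        "Referer": page_url,        # page on which the ad was shown
        "User-Agent": user_agent,   # should match the user's browser
        "Accept": "text/html,application/xhtml+xml",
    }
    return urllib.request.Request(ad_target_url, headers=headers, method="GET")


def visit(request, timeout=10):
    """Send the request and read the body as bytes only.

    Nothing in the response is parsed or executed, and the default
    opener keeps no cookie jar, so Set-Cookie headers are not retained.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, len(response.read())
```

A caller would typically write `if should_visit(p): visit(build_visit_request(...))`
for each collected ad, where `p` is the slider value scaled to the range 0 to 1.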
noise from the data it is attempting to collect [15]. However, there                  ads are clicked, there is a higher likelihood the adversary will infer the
are times when this goal comes into tension with other aims of the                    use of AdNauseam and may choose to discard all click data from the
software, specifically that of protection, e.g., from malware.                        user in question. In this case personal protection may be achieved as
   For example, following a software-generated ad click, we must                      the user is no longer profiled, but there is no immediate gain in social
decide whether the DOM for the response should be parsed and                          protection. (As noted earlier, however, the fractured online advertising
executed, and whether scripts should be allowed to run. In current Ad-                ecosystem makes this less obvious than, say, in the domain of search.)
Nauseam versions, visits are implemented via AJAX requests, which                     At the extreme left, the tool’s clicks are undetectable (as there are
means that no DOM is constructed from the response, and scripts are                   none), and AdNauseam then functions like a standard blocker, simply
not executed. While protection is maximized here (against malicious                   blocking and hiding ads.
code embedded in ads), obfuscatory power may be diminished. For                          Detectability by an adversary is not, however, the only measure
example, one attack we have noted is from an adversary who, upon                      of a tool’s expressive powers, as there may be other audiences
receiving a click request, sends a preliminary response containing                    that developers seek to impress. Take the case of ScareMail (men-
code that executes, within the DOM, the actual request for the ad’s                   tioned in Related Work below), an obfuscation tool that appends an
target. If the code in the preliminary response never runs, then, from                algorithmically-generated narrative containing NSA “trigger” words
the advertising network’s perspective, the click is never executed. We                to the end of sent emails. Users are able to express resistance to
have experimented with solutions that address this issue (including                   the recipients of their emails irrespective of whether the adversary,
executing clicks in sandboxed invisible tabs), but have yet to settle on              presumably the email provider, is able to detect its use. Whether
a cross-platform solution that adequately protects users from potential                   ScareMail is actually a “privacy tool,” or simply a tool for social
malware/malvertising. For now we leave this as future work.                           protest focusing on email privacy, is a question we will not take up
                                                                                      here. Our purpose, instead, is to argue that the expressive potential of
B. Expression, Detectability, and Social Privacy
                                                                                      software need not be mapped only and directly to detectability by the
   We have spoken of the expressive capabilities of data obfuscation                  actor identified as the adversary. This would rule out subtle forms of
generally, and of AdNauseam specifically. But how does this design                    social expression that we are seeing; for example, where users have
goal relate to detectability (the degree to which an adversary can                    spontaneously sought ways to share their ad collections online.
detect usage of the tool)? Abstractly conceived, expression and
detectability appear to lie at opposite ends of a spectrum; that is,                                                IV. EVALUATION
if a tool is undetectable to an adversary, its expressive capability is                  Qualitative evaluation was performed iteratively throughout devel-
minimal, at least in relation to the adversary. Thus, if expression is                opment, often guided by solicited and unsolicited feedback from var-
a goal of an obfuscation tool, designers may wish, perhaps counter-                   ious constituencies, including users, developers, reviewers at Mozilla
intuitively, to make its use detectable. If a goal of the tool is social              and Opera, and a range of privacy and security advocates. When
privacy—the pollution of aggregate data collected by trackers—then,                   considering how to evaluate the software, the question of whether
one might argue, the tool should be undetectable, so that the adversary               AdNauseam in fact “worked” seemed at first to be most obvious and
cannot simply discard all data from those discovered to be using the                  simple to address. We soon realized, however, that the meaning of
tool. It appears, at least in a simplistic analysis, that a tool cannot                this question shifted as users’ goals, expectations, and perceived risks
simultaneously achieve expressivity and protect social privacy5 .                     varied. Evaluating AdNauseam on the basis of feedback from the
   To address this tension, we adapt the design of the user-                          various constituencies was often a two-part process: first determining
configurable query-frequency in TrackMeNot [26], to AdNauseam,                        user orientations, and then examining feedback in light of their
allowing users to adjust the probability (from 0-100%) that discovered                goals, concerns, and priorities. Additionally, beyond the technical
ads will be clicked. As the slider is moved to the left, the likelihood               issues with which we grappled, a subset of critiques consistently
that an ad will be clicked decreases, and vice versa to the right. If                 addressed ethical concerns. Thus we have split the discussion below
we hypothesize that, all other elements being equal, a lower click                    into technical and ethical components.
frequency will be harder to detect, then this setting would represent
                                                                                      A. Technical
a mapping between expression, detectability, and social privacy6 . As
the slider moves left, expressivity is reduced as is the likelihood                      Evaluation of obfuscation-based strategies for counter-surveillance
of detection, and the potential for social protection is increased.                   is often relatively straightforward. Take search, for example. One can
When moved right, the likelihood of detection increases, as do both                   extract query logs from tool users, containing both user-generated
expressivity and the potential for (economic) disruption, while the                   and software-generated queries, and then attempt to separate the two,
degree of social protection decreases. At the right extreme, when all                 either manually or automatically; in the latter case, by running best-
                                                                                      practice machine-learning (or other) algorithms. Although one may
    5 Real-world domains, like advertising surveillance, are often complicated
                                                                                      not know the exact capabilities of the adversary, evaluators can make
by a range of socio-economic factors. For example, the analysis above assumes         educated guesses as to the type of attacks to be considered, whether
a single adversary with full knowledge of the system, which, as discussed,
is not the case here. Further, simply because an adversary can filter the data        timing, query content, side-channel, or other means (for details of
for tool users does not mean they will, especially given high enough adoption         such evaluations in the search case, see [15]). If we find that the
rates. Such data is at the core of the business model that drives such collection,    adversary can differentiate true queries with high accuracy, then our
and thus ignored profiles have direct economic impact. Clearly there is some          generated queries can be filtered and, from a protection standpoint,
number of ignored users after which the practice is no longer economically
viable. One must also consider the effort and expense required to initiate such
                                                                                      we must say that the tool fails.7
filtering, and the questions which it raises – should, for example, the data of          At first glance, evaluating AdNauseam would seem to call for a
tool users be discarded forever, or are such users to be monitored for tool           similar approach in which one measures the difficulty with which an
stoppage as well? A range of social, economic, and cultural factors interact
here to influence what is, in the end, a complex business decision.                      7 We may still argue that the socio-economic cost of filtering is prohibitively
    6 As a variety of factors influence detectability, the actual assertion of such   high, or that the tool is successful in terms of expression, but these are non-
a linear relationship would require supporting evidence.                              technical concerns which we must bracket for the moment.
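   The evaluation procedure described for search obfuscation (label
requests as user-generated or software-generated, then try to separate
them) can be sketched in Python as follows. This is a toy stand-in, not
the method of [15]: it assumes a single numeric feature per request (for
example, a timing value), and the function name and data layout are
hypothetical. It reports the best accuracy that a one-threshold rule can
reach on the labeled sample; a value near 0.5 on balanced data suggests
the feature does not separate the two classes, while a value near 1.0
suggests the generated requests could be filtered.

```python
def best_threshold_accuracy(samples):
    """Best accuracy of a single-threshold classifier on labeled samples.

    samples: list of (feature_value, is_generated) pairs, where
    is_generated is True for software-generated requests.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    values = sorted({value for value, _ in samples})
    # One candidate below every value, plus each observed value.
    candidates = [values[0] - 1] + values
    total = len(samples)
    best = 0.0
    for threshold in candidates:
        # Rule: predict "generated" when the feature exceeds the threshold.
        correct = sum((value > threshold) == generated
                      for value, generated in samples)
        # The reversed rule gets total - correct right; keep the better one.
        best = max(best, max(correct, total - correct) / total)
    return best
```

Because the threshold is chosen on the same data it is scored on, the
result is an optimistic estimate for this one rule, and says nothing about
stronger classifiers or side-channels.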
adversary, using standard techniques, can distinguish user clicks from
generated clicks. However, there are three distinct cases to consider,
depending on what ads, if any, are seen by the user. In the first case,
where a user enables ad-hiding, disables DNT exceptions, and does
not provide a whitelist, no ads are visible to the user, and there are
no true clicks for an adversary to discover. This is also true for the
second case in which the only ads visible are those of DNT-respecting
sites (AdNauseam’s default settings). As such sites by definition do
not track users, there are again no true clicks to discern. The third
case applies for users who see non-DNT ads, either because they have
disabled hiding entirely, or because they have manually whitelisted
                                                                                        Fig. 3. Number of distinct third-parties contacted.
sites. Here we must consider the tool’s detectability determined in
large part by the user-selected click-probability. If this probability
is set high enough that detection is possible, the adversary may
simply discard all clicks from the user in question; a result similar
to that obtained from a successful blocking-only strategy, except
with an enhanced expressive component (as the adversary must both
recognize the tool and then take action to perform the filtering). While
this may be considered a success for some users, as they are no longer
profiled via click-data, since the data is discarded there is no net gain
in what we have referred to as social privacy. If click-probability is
low enough, however, that the tool’s actions are not detectable, then in
order to evaluate the degree of social protection provided, we need to
                                                                                               Fig. 4. Total page load time (sec).
assess both a) the indistinguishability of the clicks, and b) the impact
that successful decoy clicks have on the resulting user’s profile (a
complex question we return to in the Future Work below). Of course
even if requests themselves are indistinguishable, there may still be       [16], [41]. Data from this surveillance contributes to the creation
side-channels available to adversaries, as discussed above. For the         of valuable, but often highly problematic profiles that fuel big
moment we leave the specifics of such evaluations to future work.           data industries with uncertain, potentially negative effects on their
   1) Comparative: To further evaluate performance we compare               subjects. Against this backdrop, we judge the aims of AdNauseam,
AdNauseam with commonly used blockers on a range of dimensions,             which include the disruption of this process, to be morally defensible.
relating both to protection (number of 3rd parties contacted) and              The second charge asks whether obfuscation imposes lower
usability (page-load speed and memory efficiency). Tests were first         collateral costs than alternatives for achieving similar ends. Compar-
run without any extension, then with AdNauseam, Adblock Plus [1],           ing the purported cost of AdNauseam against alternative approaches
uBlock-Origin [17], and Privacy Badger [10]. Tests were performed           involves uncertainties we are unable to tackle here. But, by the same
with each extension’s default settings after resetting the browser to       token, this dearth of concrete evidence poses a challenge to critics
its install state. After visiting the websites in the test set (between     who accuse ad blockers—and AdNauseam—of harming the web’s
15 and 85 popular URLs, depending on the test) via the Selenium             economy. Even if one holds that the “best” resolution would be
browser automation tool, we evaluated the safety of each extension          societal-level regulation, there has been little progress on this front.
in terms of the number of 3rd parties contacted (Figure 3), page-           As important as seeking credible alternatives, however, is weighing
load speed (Figure 4), and memory efficiency. As shown in the               the purported costs of using AdNauseam. Among the latter, the harm
graphs below, AdNauseam performed better on all dimensions than             of “wasting” network bandwidth or server resources is ironic at best,
no blocker and, perhaps surprisingly, better than AdBlock Plus. As          given the vast amount of bandwidth used by advertisers and trackers,
expected, AdNauseam performed less well than uBlock, due to the             the performance degradation resulting from loading this unwanted
need to allow visual ad resources, rather than blocking them outright.      content, and the financial toll on those paying for fixed data plans.
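   As an illustration of the "distinct third-parties contacted" measure in
the comparison above, the following Python sketch counts third-party sites
from a list of request URLs recorded during a page load. It is an assumed
reconstruction of the metric, not the test harness used for the figures:
the function names are hypothetical, and a site is approximated by the
last two labels of the hostname, which mishandles suffixes such as
"co.uk" (a real measurement would consult the Public Suffix List).

```python
from urllib.parse import urlsplit


def site_of(url):
    """Approximate the site of a URL by its last two hostname labels."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def distinct_third_parties(page_url, request_urls):
    """Return the set of sites contacted that differ from the page's own."""
    first_party = site_of(page_url)
    contacted = {site_of(url) for url in request_urls}
    # Drop the first party and any URL without a usable hostname.
    return contacted - {first_party, ""}
```

The size of the returned set, summed or averaged over the test URLs, gives
a per-extension figure that can be compared across blockers.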
                                                                            From an ethical perspective, it is questionable whether the term
B. Ethical                                                                  “waste” is appropriate at all. For those who deliberately choose to use
   In adopting the philosophy of data obfuscation AdNauseam seeks           AdNauseam it offers a potential escape from inappropriate profiling.
to shield users from the inexorable and inappropriate probes of             In our view, this is not a worthless endeavor.
services and third parties. Choosing obfuscation, however, means               One of the most aggressive charges leveled at AdNauseam is that
taking seriously the ethical critiques it has drawn, including charges      it perpetuates “click fraud.” Since obfuscation and fraud both involve
of dishonesty, wasted resources, and polluted databases. Addressing         forms of lying that disrupt entrenched systems, it is important to
these issues, Brunton and Nissenbaum [4] ask creators of obfuscating        evaluate whether the two forms are alike. To carry this out, we
systems to answer two questions: first, whether their aims are laud-        consulted various definitions: “[Click] fraud occurs when a person,
able; and second, whether alternatives exist that might achieve these       automated script or computer program imitates a legitimate user of a
aims at lesser cost. Regarding the first charge we begin by saying          web browser, clicking on such an ad without having actual interest in
that ubiquitous online surveillance violates the tenets of a liberal        the target of the ad’s link” [29] comes close to capturing AdNauseam
democracy. The troubling nature of this surveillance is exacerbated         in its notion of clicking without actual interest, but this definition
by its surreptitious operation, its prevarication, and its resistance to    seemed overly broad in that it commits users to click only on ads
the wishes of a majority of users; claims clearly established through       in which they are interested, and seems an unjustifiable restriction
systems’ analysis, demonstrations and public opinion surveys [42],          on liberty of action. We also argue that if the automated script is
                                                                             CacheCloak [32], for location data. There have also been a number
                                                                             of obfuscation schemes for web search [15].
                                                                                Other relevant work, described in [27], has come from the art/tech
                                                                             community. “I Like What I See” is a tool that clicks all ‘Like’
                                                                             links on Facebook to obscure user interests. “ScareMail” [18] is an
                                                                             extension built atop Gmail that appends an algorithmically-generated
                                                                             narrative containing NSA “trigger-words” to the end of each sent
                                                                             email. “Invisible” [21] extends obfuscation to the context of genetic
                                                                             privacy via a spray that obfuscates DNA to frustrate identification.
                                                                                Two early tools addressing surveillance integrate ad-blocking with
                                                                             some broadly-defined social good: AddArt [2] replaces ads with user-
                                                                             configurable art, while AdLiPo [25] does the same with language art.
                                                                             Lightbeam [33] provides displays of users’ connections, including to
               Fig. 5. Opt-in settings on initial install page.              advertising networks (though not ads themselves). Floodwatch [13]
                                                                             is the one tool we have found that provides visualizations similar to
                                                                             our own, though it requires communication with one or more 3rd-
performing as an agent of an individual, through that individual’s           party servers to do so. Privacy Badger [10] blocks third-party requests
legitimate choice, then the script is a proxy for the user. John Bat-        based on real-time monitoring of the connections they attempt rather
telle’s account [3], which includes motive and intention, gets closer        than via lists, blocking only those resources engaged in tracking.
to the standard meaning of “fraud” in “click fraud”: the “‘decidedly
black hat’ practice of publishers illegitimately gaming paid search                                   VI. FUTURE WORK
advertising by employing robots or low-wage workers to repeatedly               AdNauseam provides individuals with the means to express their
click on each AdSense ad on their sites, thereby generating money            commitment to online privacy without the need to depend on the
to be paid by the advertiser to the publisher and to Google.” While          good will or intervention of third-parties. Although fully functional,
elements of the above definitions overlap with AdNauseam’s clicking          AdNauseam is perhaps best considered as a proof of concept for a
(without genuine interest in their targets), machine automation is           particular approach to privacy, that is, privacy through obfuscation.
only incidental to click fraud, and may instead involve “low-wage            As discussed, AdNauseam’s potential lies in its capacity to protect
workers.” More significant is what AdNauseam does not share with             individuals against data profiling, as well as simultaneously providing
click fraud, namely action on behalf of stakeholders resulting in            a proactive means of expressing one’s views to monolithic and largely
financial gain. In litigated cases of click fraud the intention to inflate   uninterested corporations. Going forward, a scientific approach to
earnings has been critical.                                                  evaluating AdNauseam’s performance, or the performance of any
   We readily admit that a primary aim of AdNauseam is to disrupt            system adopting obfuscation, needs a rigorous means of measuring
business models that support surreptitious surveillance. It does not         success—namely, evidence that decoy clicks have been registered and
follow however that AdNauseam is responsible for the demise of free          have an impact on the resulting profile. Such needs are likely to turn
content on the web. First, it is not, as we make clear on the project        not only on the statistical analysis of signal-to-noise ratios, but also
page, advertising that is the primary target of the project, but rather      on a practical understanding of how click data is actually mined and
the tracking of users without their consent. Contextual advertising          used, and the extent to which it influences aspects of user profiles.
that does not involve tracking can certainly support free content just       This would allow future iterations of obfuscation-based tools to be
as it has in the past. Second, web content is not actually ‘free’ as         both effective and efficient in the noise they produce.
this argument implies. The development of the Internet has been                 Future work could take several directions. In the near term we hope
supported largely by government funding (and thus by taxpayers)              to better answer the question of how to perform indistinguishable
since its beginning. In fact, vast infrastructure and energy costs are       clicks without exposing users to potential harms via downloaded
still borne in large part by taxpayers, not to mention the potentially        content, as discussed above. Though complex, P2P approaches for
species-threatening cost to the environment posed by increasing data         the sharing of obfuscation data between users are a ripe area of future
traffic [23]. Critics may say that ad blocking users free ride upon          work, with users potentially visiting the ads detected by peers as a
those who allow themselves to be tracked, however, in our view this          means of both shielding their data and maximizing indistinguisha-
presumes an entitlement on the part of trackers that is indefensible;        bility. A central challenge here would be meeting functional criteria
one may equally charge trackers with destructive exploitation of users       while not compromising the design constraints discussed early in this
[4]. Lastly, in regard to free riding, we wish to point out that the         paper, e.g., transparency and independence from third-parties. Finally,
hiding of ads is an optional element of AdNauseam, one that users            beyond the technical, work exploring the motivations and qualitative
must explicitly opt into when they install the software (see Figure          experiences of users who select obfuscation tools could shed light on
5); AdNauseam’s visitation and visualization modules work equally            the unique potential such tools might offer in additional domains.
well whether the user elects to view ads or to hide them.
                                                                                                     VII. CONCLUSIONS
                          V. RELATED WORK
                                                                                AdNauseam operates in a technologically and socially complex
   The strategy of obfuscation has been broadly applied—in search            environment, one in which user data is perceived to be highly
[26], location tracking [32], social networks [30], anonymity [7], [39],     valuable. For individuals, however, recorded patterns potentially open
etc.—and, as such, has been recognized as an important element               a window into their lives, interests, and ambitions. Thus surveillance
of the privacy engineer’s toolbox. A range of obfuscation-based              via advertising is not only a source of individual vulnerability, but
projects have been described in [4], including FaceCloak [30], for           also interferes with the rights to inquiry, association, and expression
Facebook profiles, BitTorrent Hydra [39], for decoy torrent sites, and       that are essential to a healthy democratic society. Consequently,
there remain tensions between individual users, collective social              [13] Floodwatch. “Floodwatch.” n.d. https://floodwatch.o-c-r.org/.
and political values, and the economic interests of publishers and             [14] Friedman, Batya, Daniel C. Howe, and Edward Felten. “Informed
                                                                                    Consent in the Mozilla Browser: Implementing Value-Sensitive Design”.
advertisers. In a better world, this tension would be resolved in
                                                                                    Proceedings of the 35th Annual Hawaii International Conference on
a transparent, trust-based accommodation of respective interests.                   System Sciences. IEEE, 2002.
Instead, concerned users find little transparency and few credible             [15] Gervais, Arthur, et al. “Quantifying Web-Search Privacy.” Proceedings of
assurances from advertisers that privacy will ever trump the pursuit                the 2014 ACM SIGSAC Conference on Computer and Communications
of profit. Thus trust-based mutual accommodation gives way to                       Security. ACM, 2014.
                                                                               [16] Goldfarb, Avi, and Catherine Tucker. “Shifts in Privacy Concerns.” The
an adversarial relationship, one in which we must leverage all the                  American Economic Review 102.3 (2012): 349-353.
strategies at our disposal. Our success in this endeavor will depend in        [17] Gorhill. “uBlock Origin - An efficient blocker for Chromium and
part on how well we share our experience applying known strategies                  Firefox.” 2016. https://github.com/gorhill/uBlock
to new contexts, in concrete and specific detail, according to an              [18] Grosser, Ben. “ScareMail.” 2013.
                                                                                    http://bengrosser.com/projects/scaremail/.
evolving set of best practices, as we have attempted above.                    [19] Gürses, Seda, Carmela Troncoso, and Claudia Diaz. “Engineering Pri-
   We conclude with a philosophical point. In some of the most                      vacy by Design.” Computers, Privacy & Data Protection 14.3, 2011.
revealing exchanges we have had with critics, we note a palpable               [20] Gürses, Seda, Carmela Troncoso, and Claudia Diaz. “Engineering Pri-
sense of indignation, one that appears to stem from the belief that                 vacy by Design Reloaded.” Amsterdam Privacy Conference. 2015.
human users have an obligation to remain legible to their systems,
a duty to remain trackable. We see things differently; advertisers
and service providers are not by default entitled to the externalities
of our online activity. Rather, users should control the opacity of
their actions, while powerful corporate entities should be held to the
highest standards of transparency. Unfortunately this is the opposite
of the status quo; our trackers want us to remain machine-readable so
that they can exploit our most human endeavors (sharing, learning,
searching, socializing) in the pursuit of profit. AdNauseam attempts
to represent an alternative position.

                        ACKNOWLEDGEMENTS
   The authors wish to thank all those who have helped with the
creation and spread of AdNauseam, particularly Sally Chen, Leon
Eckert, Cyrus Suen, and Emily Goldsher-Diamond. This paper has
been greatly improved by comments from our (anonymous) reviewers,
our “shepherd” Ero Balsa, and Seda Gürses; specifically for
her substantive guidance throughout the writing process and her
unflagging support of productive, cross-disciplinary work, no matter
the difficulty. Finally, thanks go to Mushon Zer-Aviv, for his profound
contributions to all aspects of the project.
   This publication has been supported in part by grants from US
NSF CNS/NetS 105833, US NSF SATC 1642553, and the Research
Grants Council of Hong Kong, China (Project No. CityU 11669616).

                              REFERENCES
 [1] AdBlock Plus. “AdBlock Plus.” n.d. https://adblockplus.org/.
 [2] AddArt. “AddArt.” n.d. http://add-art.org/.
 [3] Battelle, John. The Search: How Google and Its Rivals Rewrote the
     Rules of Business and Transformed Our Culture. Nicholas Brealey, 2011.
 [4] Brunton, Finn, and Helen Nissenbaum. Obfuscation: A User’s Guide for
     Privacy and Protest. MIT Press, 2015.
 [5] Cavoukian, Ann, and Michelle Chibba. “Cognitive Cities, Big Data and
     Citizen Participation: The Essentials of Privacy and Security”. Towards
     Cognitive Cities. Springer International Publishing, 2016. 61-82.
 [6] Click Fraud. (n.d.). In Wikipedia. Retrieved Aug 1, 2016.
     https://en.wikipedia.org/wiki/Click_fraud
 [7] Chakravarty, Sambuddho, et al. “Detecting Traffic Snooping in
     Anonymity Networks Using Decoys.” 2011.
 [8] “EasyList.” 2016. https://easylist.to/
 [9] Englehardt, Steven, and Arvind Narayanan. “Online Tracking: A
     1-million-site Measurement and Analysis.” Proceedings of the ACM
     SIGSAC Conf. on Computer and Communications Security. ACM, 2016.
[10] Electronic Frontier Foundation. “Privacy Badger.”
     n.d. https://www.eff.org/privacybadger.
[11] Electronic Frontier Foundation. “Do Not Track.”
     n.d. https://www.eff.org/issues/do-not-track.
[12] Flanagan, Mary, Daniel C. Howe, and Helen Nissenbaum. “Embodying
     Values in Technology: Theory and Practice.” Information technology and
     moral philosophy. (2008): 322-353.
[21] Dewey-Hagborg, H. “Invisible.” 2014.
     http://www.newmuseumstore.org/browse.cfm/invisible/4,6471.html.
[22] Hansen, Marit, Meiko Jensen, and Martin Rost. “Protection Goals for
     Privacy Engineering.” Security and Privacy Workshops. IEEE, 2015.
[23] Hazas, Mike, et al. “Are there limits to growth in data traffic?: On time
     use, data generation and speed.” Proceedings of the Second Workshop
     on Computing within Limits. ACM, 2016.
[24] Hoepman, Jaap-Henk. “Privacy Design Strategies.” IFIP International
     Information Security Conference. Springer Berlin Heidelberg, 2014.
[25] Howe, Daniel C. “AdLiPo” 2014. http://rednoise.org/adlipo/.
[26] Howe, Daniel C. and Helen Nissenbaum. “TrackMeNot: Resisting
     Surveillance in Web Search.” Lessons from the Identity Trail: Anonymity,
     Privacy and Identity in a Networked Society 23, 2009: 417-436.
[27] Howe, Daniel C. “Surveillance Countermeasures: Expressive Privacy
     via Obfuscation”. APRJA, A Peer-Reviewed Journal About Datafied
     Research 4.1, 2015.
[28] Iachello, Giovanni, and Gregory D. Abowd. “Privacy and Proportionality:
     Adapting Legal Evaluation Techniques to Inform Design In Ubiquitous
     Computing.” Proceedings of the SIGCHI conference on Human
     factors in computing systems. ACM, 2005.
[29] Liu, De, Jianqing Chen, and Andrew B. Whinston. “Current Issues in
     Keyword Auctions”. Business Computing (Handbooks in Information
     Systems, Vol. 3) 2009: 69-97.
[30] Luo, Wanying, Qi Xie, and Urs Hengartner. “FaceCloak: An Architecture
     for User Privacy on Social Networking Sites” International Conference
     on Computational Science and Engineering, 2009.
[31] Mansfield-Devine, Steve. “When advertising turns nasty”. Network
     Security 2015.11 2015: 5-8.
[32] Meyerowitz, Joseph and R. R. Choudhury. “Hiding stars with fireworks:
     Location privacy through camouflage.” Proc. of the 15th annual
     international conference on Mobile computing and networking. ACM, 2009.
[33] Mozilla. “Lightbeam.” 2016. https://www.mozilla.org/en-US/lightbeam/.
[34] Murphy, Kate. “The Ad Blocking Wars.” New York Times, 20 Feb. 2016.
[35] Nikiforakis, Nick, et al. “Cookieless monster: Exploring the ecosystem
     of web-based device fingerprinting.” IEEE symposium on Security and
     privacy (SP). IEEE, 2013.
[36] Nissenbaum, Helen. Privacy in Context: Technology, Policy and the
     Integrity of Social Life. Palo Alto: Stanford University Press, 2010.
[37] PageFair, Adobe “The cost of ad blocking–PageFair and Adobe 2015
     Ad Blocking Report”, 2015.
[38] Regan, Priscilla M. Legislating privacy: Technology, social values, and
     public policy. Univ of North Carolina Press, 1995.
[39] Schulze, Hendrik, and Klaus Mochalski. “Internet Study 2008/2009.”
     Ipoque Report 37 2009: 351-362.
[40] Spiekermann, Sarah, and Lorrie Faith Cranor. “Engineering Privacy.”
     IEEE Transactions on software engineering 35.1 2009: 67-82.
[41] Tucker, Catherine E. “Social networks, personalized advertising, and
     privacy controls.” Journal of Marketing Research 51.5 2014: 546-562.
[42] Turow, Joseph, et al. “Americans reject tailored advertising and three
     activities that enable it.” 2009.
[43] Wills, Craig E., and Doruk C. Uzunoglu. “What Ad Blockers Are (and
     Are Not) Doing.” Fourth IEEE Workshop on Hot Topics in Web Systems
     and Technologies (HotWeb). IEEE, 2016.