<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>TAR on Social Media: A Framework for Online Content Moderation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eugene Yang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David D. Lewis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ophir Frieder</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IR Lab, Georgetown University</institution>
          ,
          <addr-line>Washington, DC</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Reveal Brainspace</institution>
          ,
          <addr-line>Chicago, IL</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Content moderation (removing or limiting the distribution of posts based on their contents) is one tool social networks use to fight problems such as harassment and disinformation. Manually screening all content is usually impractical given the scale of social media data, and the need for nuanced human interpretations makes fully automated approaches infeasible. We consider content moderation from the perspective of technology-assisted review (TAR): a human-in-the-loop active learning approach developed for high recall retrieval problems in civil litigation and other fields. We show how TAR workflows, and a TAR cost model, can be adapted to the content moderation problem. We then demonstrate on two publicly available content moderation data sets that a TAR workflow can reduce moderation costs by 20% to 55% across a variety of conditions.</p>
      </abstract>
      <kwd-group>
        <kwd>Technology-assisted review</kwd>
        <kwd>active learning</kwd>
        <kwd>social media</kwd>
        <kwd>content moderation</kwd>
        <kwd>cost analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Online social networks are powerful platforms for personal communication, community building, and free expression. Unfortunately, they can also be powerful platforms for harassment, disinformation, and perpetration of criminal and terrorist activities. Organizations hosting social networks, such as Facebook, Twitter, Reddit, and others, have deployed a range of techniques to counteract these threats and maintain a safe and respectful environment for their users.</p>
      <p>One such approach is content moderation: removing (hard moderation) or demoting (soft moderation) policy-violating posts [1, 2]. Despite recent progress in machine learning, online content moderation still relies heavily on human review [3]. Facebook’s CEO Mark Zuckerberg stated that language nuances could get lost when relying on automated detection approaches, emphasizing the need for human judgment.<sup>1</sup> Ongoing changes in what is considered inappropriate content complicate the use of machine learning [4]. Policy experts have argued that complete automation of content moderation is socially undesirable regardless of algorithmic accuracy [5].</p>
      <p>It is thus widely believed that both human moderation and automated classification will be required for online content moderation for the foreseeable future [1, 5, 6]. This has meant not just capital investments in machine learning tools for moderation, but also massive ongoing personnel expenses for teams of human reviewers [7].</p>
      <p>Surprisingly, the challenge of reducing costs when both machine learning and manual review are necessary has been an active area of interest for almost two decades, but in a completely different area: civil litigation. Electronic discovery (eDiscovery) projects involve teams of attorneys, sometimes billing the equivalent of hundreds of euros per person-hour, seeking to find documents responsive to a legal matter [8]. As the volume of electronically produced documents grew, machine learning began to be integrated in eDiscovery workflows in the early 2000s, a history we review elsewhere [9].</p>
      <p>The result in the legal world has been technology-assisted review (TAR): human-in-the-loop active learning workflows that prioritize the most important documents for review [10, 11]. One-phase (continuous model refinement) and two-phase (with separate training and deployment phases) TAR workflows are both in use [9, 12]. Because of the need to find most or all relevant documents, eDiscovery has been referred to as a high recall review (HRR) problem [13, 14, 15]. HRR problems also arise in systematic reviews in medicine, sunshine law requests, and other tasks [16, 17, 18]. Online content moderation is an HRR problem as well, in that a very high proportion of inappropriate content should be identified and removed.</p>
      <p>Our contributions in this paper are two-fold. First, we describe how to adapt TAR and its cost-based evaluation framework to the content moderation problem. Second, we test this approach using two publicly available content moderation datasets. Our experiments show substantial cost reductions using the proposed TAR framework over both manual review of unprioritized documents and training of prioritized models on random samples.</p>
      <p>DESIRES 2021 – 2nd International Conference on Design of Experimental Search &amp; Information REtrieval Systems, September 15–18, 2021, Padua, Italy. eugene@ir.cs.georgetown.edu (E. Yang); desires2021paper@davelewis.com (D. D. Lewis); ophir@ir.cs.georgetown.edu (O. Frieder)</p>
      <p>© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
      <p>1 https://www.businessinsider.com/zuckerberg-nuances-content-moderation-ai-misinformation-hearing-2021-3</p>
      <sec>
        <title>2. Background</title>
        <p>Content moderation on online platforms is a necessity [19, 20] and has been argued by some to be the defining feature of an online platform [6]. Despite terms of service and community rules on each platform, users produce inappropriate content, particularly when anonymous [21]. Inappropriate content includes toxic content such as hate speech [22], offensive content [23], and mis- and disinformation [4, 23]. It also includes content that is inappropriate for legal or commercial reasons, such as potential copyright violations [5, 24].</p>
        <p>The identification of toxic content can require subtle human insight [4, 22], both due to attempts at obfuscation by posters, and because the inappropriateness of the content is often tied to its cultural, regional, and temporal context [1, 3]. Mis- and disinformation often consists of subtle mixtures of truthful and misleading content that require human common sense inferences and other background knowledge [4, 23].</p>
        <p>Social media organizations have deployed numerous techniques for implementing community policies, including graph- and time-based analyses of communication patterns, user profile information, and others [25]. Our focus here, however, is on methods that use the content of a post.</p>
        <p>Content monitoring falls into three categories: manual moderation, text classification, and human-in-the-loop methods. The latter two approaches leverage machine learning models and are sometimes collectively referred to as algorithmic content moderation in policy research [5].</p>
        <p>Manual moderation is the oldest approach, dating back to email mailing lists. It is, however, extremely expensive at the scale of large social networks and suffers from potential human biases. Additionally, mental health concerns are an issue for moderators exposed to large volumes of toxic content [25, 26, 27].</p>
        <p>The simplest text classification approaches are keyword filters, but these are susceptible to embarrassing mistakes<sup>2</sup> and countermeasures by content creators. More effective text classification approaches to content moderation are based on supervised machine learning [28, 29]. Content types that have been addressed include cyberbullying [29, 30, 31, 32], hate speech [22, 31, 33, 34, 35, 36], and offensive language in general [23, 37, 38, 39, 40, 41, 42].</p>
        <p>However, some moderation judgments are inevitably too subtle for purely automated methods<sup>3</sup>, particularly when content is generated with the intent of fooling automated systems [1, 25, 43]. Content that is recontextualized from the original problematic context, for example, through reposting, screenshotting, and embedding in new contexts, complicates moderation [2]. Additionally, bias in automated systems can arise both from learning from biased labels and from numerous other choices in data preparation and algorithmic settings [27, 44, 45].</p>
        <p>Biased models risk further marginalizing and disproportionately censoring groups that already face discrimination [1]. Differences in cultural and regulatory contexts further complicate the definition of appropriateness, creating another dimension of complexity when deploying automated content moderation [4].</p>
        <p>Human-in-the-loop approaches, where AI systems actively manage which materials are brought to the attention of human moderators, attempt to address the weaknesses of both approaches while gathering training data to support supervised learning components [25, 46].</p>
        <p>Filtering mechanisms that proactively present only approved content (pre-moderation) and/or removal mechanisms that passively take down inappropriate content are used by platforms depending on the intensity [4]. Reviewing protocols could shift from one to the other based on the frequency of violations or during a specific event, such as elections<sup>4</sup>. Regardless of the workflow, the core and arguably most critical component is review.</p>
        <p>However, the primary research focus of human-in-the-loop content moderation has been on classification algorithm design and bias mitigation, rarely on the investigation of the overall workflow.</p>
        <p>Like content moderation, eDiscovery is a high recall retrieval task applied to large bodies of primarily textual content (typically enterprise documents, email, and chat) [11, 12]. Both fixed data set and streaming task structures have been explored, though the streaming context tends to be bursty (e.g., all data from a single person arriving at once) rather than continuous. Since cost minimization is a primary rationale for TAR [47], research on TAR has focused on training regimens and workflows for minimizing the number, or more generally the cost, of documents reviewed [9, 12]. A new TAR approach is typically evaluated for its ability to meet an effectiveness target while minimizing cost or a cost target while maximizing effectiveness [18, 48, 49]. This makes approaches developed for TAR natural to consider for content moderation.</p>
        <p>2 https://www.techdirt.com/articles/20200912/11133045288/paypal-blocks-purchases-tardigrade-merchandise-potentially-violating-us-sanctions-laws.shtml</p>
      </sec>
      <sec id="sec-1-1">
        <p>3 https://venturebeat.com/2020/05/23/ai-proves-its-a-poor-substitute-for-human-content-checkers-during-lockdown/</p>
        <p>4 https://www.washingtonpost.com/technology/2020/11/07/facebook-groups-election/</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Applying TAR to Content Moderation</title>
      <p>In most TAR applications, at least a few documents of the (usually rare) category of interest are available at the start of the workflow. These are used to initialize an iterative pool-based active learning workflow [50]. Reviewed documents are used to train a predictive model, which in turn is used to select further documents based on predicted relevance [51], uncertainty [52], or composite factors. Workflows may be batch-oriented (mimicking pre-machine learning manual workflows common in the law) or a stream of documents may be presented through an interactive interface with training done in the background. These active learning workflows have almost completely displaced training from random examples when supervised learning is used in eDiscovery.</p>
      <p>Two workflow styles can be distinguished [9]. In a one-phase workflow, iterative review and training simply continues until a stopping rule is triggered [49, 53, 54]. Stopping may be conditioned on estimated effectiveness (usually recall), cost limits, and other factors [53, 55, 56]. Two-phase workflows stop training before review is finished, and deploy the final trained classifier to rank the remaining documents for review. The reviewed documents are typically drawn from the top of the ranking, with the depth in the ranking chosen so that an estimated effectiveness target is reached [18, 48]. Two-phase workflows are favored when labeling of training data needs to be done by more expensive personnel than are necessary for routine review.</p>
      <p>The cost of both one- and two-phase TAR workflows can be captured in a common cost model [9]. The model defines the total cost of a one-phase review terminated at a particular point as the cost incurred in reviewing documents to that point, plus a penalty if the desired effectiveness target (e.g., a minimum recall value) has not been met. The penalty is simply the cost of continuing on to an optimal second-phase review from that point, i.e., reviewing the minimum number of prioritized documents needed to hit the effectiveness target. For a two-phase workflow, we similarly define total cost to be the cost of the training phase plus the cost of an optimal second phase using the final trained model.</p>
      <p>These costs in both cases are idealizations in that there may be additional cost (e.g., a labeled random sample) to choose a phase two cutoff [cikmpaper]. However, the model allows a wide range of workflows to be compared on a common basis, as well as allowing differential costs for review of positive vs. negative documents, or phase one vs. phase two documents.</p>
      <p>While developed for eDiscovery, the above cost model is also a good fit for content moderation. As discussed in the previous section, the human-in-the-loop moderation approaches used in social media are complex, but in the end reduce to some combination of machine-assisted manual decisions (phase one) and automated decisions based on deploying a trained model (phase two). Operational decisions such as flagging and screening all posts from an account or massive reviewing of posts related to certain events [4, 6] are all results of applying previously trained models, which is also a form of deployment. Also, broadly applying the model to filter content vastly reduces the moderation burden when similar content is rapidly being published on the platform, at the risk of false removals [4]. We do not claim that this simplified model is optimal for evaluating content moderation; it is an initial effort at modeling the human-in-the-loop moderation process.</p>
      <p>When applying the model to content moderation, however, we assume uniform review costs for all documents. This seems the best assumption given the short length of texts reviewed and what is known publicly about the cost structure of moderation [6].</p>
      <p>In the next section, we describe our experimental setting for adapting and evaluating TAR for content moderation.</p>
      <sec>
        <title>4. Experiment Design</title>
        <p>Here we review the data sets, evaluation metric, and implementation details for our experiment.</p>
        <sec>
          <title>4.1. Data Sets</title>
          <p>We used two fully labeled and publicly available content moderation data sets with a focus on inappropriate user-generated content. The Wikipedia personal attack data set [32] consists of 115,737 Wikipedia discussion comments with labels obtained via crowdsourcing. An example comment is presented in Figure 1(a). Eight annotators assigned one of five mutually exclusive labels to each document: Recipient Target, Third Party Target, Quotation Attack, Other Attack, and No Attack (our names). We defined three binary classification tasks corresponding to distinguishing Recipient Target, Third Party Target, or Other Attack from all other classes. (Quotation Attack had too low a prevalence.) A fourth binary classification task distinguished the union of all attacks from No Attack. A document was a positive example if 5 or more annotators put it in the positive class. The proportion of the positive class ranged from 13.44% to 0.18%.</p>
          <p>The ASKfm cyberbullying dataset [29] contains 61,232 English utterance/response pairs, each of which we treated as a single document. An example conversation is presented in Figure 1(b). Linguists annotated both the poster and responder with zero or one of four mutually exclusive cyberbullying roles, as well as annotating the pair as a whole for any combination of 15 types of textual expressions related to cyberbullying. We treated these annotations as defining 23 binary classifications for a pair, with prevalence of the positive examples ranging from 4.63% to 0.04%.</p>
          <p>Figure 1: Example content in the collections. (a) Wikipedia collection: "shut up mind your own business and go f*** some one else over". (b) ASKfm collection: "being in love with a girl you dont even know yours is sadder" and "f*** off you f***ing c***!".</p>
          <p>For both data sets we refer to the binary classification tasks as topics and the units being classified as documents. Documents were tokenized by separating at punctuation and whitespace. Each distinct term became a feature. We used log tf weighting as the features for the underlying classification model. The value of a feature was 0 if the term was not present, and otherwise 1 + log(tf), where tf is the number of occurrences of that term in the document.</p>
        </sec>
        <sec>
          <title>4.2. Algorithms and Workflow</title>
          <p>Our experiments simulated a typical TAR workflow. The first training round is a seed set consisting of one random positive example (simulating manual input) and one random negative example. At the end of each round, a logistic regression model was trained and applied to the unlabeled documents. The training batch for the next round was then selected by one of three methods: a random sampling baseline, uncertainty sampling [52], or relevance feedback (top scoring documents) [51]. Variants of the latter two are widely used in eDiscovery [57].</p>
          <p>Labels for the training batch were looked up, the batch was added to the training set, and a new model was trained to repeat the cycle. Batches of size 100 and 200 were used and training continued for 80 and 40 iterations respectively, resulting in 8002 coded training documents at the end.</p>
          <p>We implemented the TAR workflow in libact<sup>5</sup> [58], an open-source framework for active learning experiments. We fit logistic regression models using Vowpal Wabbit<sup>6</sup> with default parameter settings. Our experiment framework is available on GitHub<sup>7</sup>.</p>
          <p>5 https://github.com/ntucllab/libact</p>
          <p>6 https://vowpalwabbit.org/</p>
        </sec>
        <sec>
          <title>4.3. Evaluation</title>
          <p>Our metric was total cost to reach 80% recall as described in Section 3. This was computed at the end of each training round as the sum of the number of training documents, plus the ideal second phase review cost as a penalty, which is the number of additional top-ranked documents (if any) needed to bring recall up to 80%. Ranking was based on sorting the non-training documents by probability of relevance using the most recently trained model. Note that we experimented with 80% recall as an example. However, the TAR workflow is capable of running with an arbitrary recall target, such as 95% for systematic review [18, 56].</p>
          <p>In actual TAR workflows, recall would be estimated from a labeled random sample. Since the cost of this sample would be constant across our experimental conditions, we used an oracle for recall instead.</p>
        </sec>
      </sec>
      <sec id="sec-3-1">
        <title>5. Results and Analysis</title>
        <p>Our core finding was that, as in eDiscovery, active selection of which documents to review reduces costs over random selection. Figure 2 shows mean cost to reach 80% recall over 20 replications (different seed sets and random samples) for six representative categories. On all six categories, all TAR workflows within a few iterations beat the baseline of reviewing a random 80% of the data set (horizontal line labeled Manual Review).</p>
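        <p>As an illustration of the metric behind Figure 2, the following sketch computes the cost to reach a recall target from the labels of the reviewed training documents and a ranking of the remaining documents, together with the manual review baseline. It is a minimal sketch under stated assumptions, not the code used in the experiments: the function names are hypothetical, every reviewed document is assumed to cost one unit, and rounding the required number of positives and the baseline up to whole documents is an assumption.</p>
        <preformat><![CDATA[
```python
import math


def cost_to_reach_recall(train_labels, ranked_labels, target=0.8):
    """Training documents reviewed plus the ideal second-phase penalty.

    train_labels:  0/1 labels of documents already reviewed during training.
    ranked_labels: 0/1 labels of the remaining documents, highest score first.
    (Both argument names are hypothetical, chosen for this sketch.)
    """
    total_positives = sum(train_labels) + sum(ranked_labels)
    # Assumption: the number of positives needed is rounded up.
    needed = math.ceil(target * total_positives)
    found = sum(train_labels)
    cost = len(train_labels)
    for label in ranked_labels:
        if found >= needed:  # target already met, so no further penalty
            break
        cost += 1            # uniform cost of one unit per reviewed document
        found += label
    return cost


def manual_review_cost(collection_size, target=0.8):
    """Baseline: review a random fraction `target` of the collection."""
    return math.ceil(target * collection_size)
```
]]></preformat>
        <p>For example, with two training documents of which one is positive, and a ranking whose labels are 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, four of the five positives are needed, so four ranked documents are reviewed and the cost is 6. For the 115,737 Wikipedia documents, the baseline function returns 92,590.</p>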
        <p>The Wikipedia Attack category is typical of low to moderate prevalence categories (prevalence 0.1344). Uncertainty sampling strongly dominates both random sampling (too few positives chosen) and relevance feedback (too many redundant positives chosen for good training). Costs decrease uniformly with additional training. We plot 99% confidence intervals under the assumption that costs are normally distributed across replicates. Costs are not only higher for relevance feedback, but less predictable.</p>
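        <p>One way to obtain such an interval from the replicate costs at a given iteration is sketched below. This is an assumption about the construction, since the text does not say whether a normal or Student's t critical value was used: the sketch takes the interval for the mean cost, with the sample standard deviation and a normal critical value. The function name is hypothetical.</p>
        <preformat><![CDATA[
```python
from statistics import NormalDist, mean, stdev


def normal_confidence_interval(costs, level=0.99):
    """Two-sided interval for the mean of `costs` under a normality assumption."""
    # Critical value of the standard normal, about 2.576 for a 99% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    center = mean(costs)
    # Standard error of the mean from the sample standard deviation.
    half_width = z * stdev(costs) / len(costs) ** 0.5
    return center - half_width, center + half_width
```
]]></preformat>
        <p>The interval is symmetric around the mean cost, and its width shrinks with the square root of the number of replicates, so wider bands reflect less predictable costs across replicates.</p>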
        <p>The ASKfm Curse Exclusion (prevalence 0.0169) and Wikipedia Other Attack (prevalence 0.0019) categories are typical low prevalence categories. Uncertainty sampling and relevance feedback act similarly in such circumstances: even top scoring documents are at best uncertainly positive. Average cost across replicates levels off and starts to increase after 44 iterations for uncertainty sampling and 45 iterations for relevance feedback. This is the point at which additional training no longer pays for itself by improving the ranking of documents. For this category (and
typically) this occurs shortly before 80% recall is reached on the training data alone (iteration 48 for uncertainty sampling and iteration 52 for relevance feedback).</p>
      </sec>
      <sec id="sec-3-2">
        <p>7 https://github.com/eugene-yang/TAR-Content-Moderation</p>
        <p>Tasks such as the ASKfm Sexism category (prevalence 0.0030), which deal with nuances in human language, require more training data to produce a stable classifier. While obtaining training data by random sampling stops reducing the cost after the first iteration, uncertainty sampling and relevance feedback continue to take advantage of additional training data to minimize the cost and become more predictable.</p>
        <p>Note that the general relationship between the prevalence of the task and the cost of reaching a certain recall target using TAR workflows is discussed by Yang et al. [9].</p>
        <p>Table 1 looks more broadly at the two datasets, averaging costs both over all topics and over 20 replicate runs for each topic for batch sizes of both 100 and 200. By 20 iterations with batch size of 100 (2002 training documents), TAR workflows with both relevance feedback and uncertainty sampling significantly reduce costs versus TAR with random sampling. (Significance is based on paired t-tests assuming non-identical variances and making a Bonferroni correction for 72 tests.) All three TAR methods in turn dominate reviewing a random 80% of the dataset, which costs 92,590 for Wikipedia and 90,958 for ASKfm.</p>
        <p>The cost improvement plateaued after the training sets reached 5000 documents for ASKfm but continued for Wikipedia. Categories in Wikipedia (prevalence 0.1344 to 0.0018) are generally more frequent than those in ASKfm (prevalence 0.0463 to 0.001), giving training a greater advantage in identifying positive documents. A larger batch size slightly reduces the improvement, as the underlying classifiers are retrained less frequently. In practice, batch sizes depend on the cost structure of reviewing and the specific workflows in each organization. However, as the classifiers are frequently updated with more coded documents, the total cost would be reduced over the iterations.</p>
        <p>Besides the overall cost reduction, Figure 3 shows a heatmap of mean precision across 20 replicates for batches 1 to 81 with batch size of 100, to give insight into the moderator experience of TAR workflows. Precision for relevance feedback starts high and declines very gradually. Uncertainty sampling maintains relatively constant precision. For the very low prevalence category Curse Exclusion we cut off the heatmap at 52 iterations for relevance feedback and 48 iterations for uncertainty sampling, since on average 80% recall is obtained on training data alone by those iterations. For both categories, even uncertainty sampling, which is intended to improve the quality of the classifier, improves the batch precision over the random sampling baseline.</p>
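        <p>The quantity shown in a heatmap of this kind, mean batch precision per training round, can be computed as in the following sketch. It assumes that each replicate is stored as a list of batches and each batch as a list of 0/1 labels assigned by the reviewer; this layout and the function names are hypothetical and are not taken from the experiment framework.</p>
        <preformat><![CDATA[
```python
def batch_precision(batch_labels):
    """Fraction of documents in one reviewed batch that were positive."""
    return sum(batch_labels) / len(batch_labels)


def mean_precision_by_batch(replicates):
    """Average batch precision across replicates, one value per batch index.

    replicates: list of runs; each run is a list of batches of 0/1 labels.
    Only batch indices present in every run are averaged.
    """
    n_batches = min(len(run) for run in replicates)
    return [
        sum(batch_precision(run[i]) for run in replicates) / len(replicates)
        for i in range(n_batches)
    ]
```
]]></preformat>
        <p>Each returned value is the share of a moderator's batch that turned out to be policy-violating, averaged over runs, which is what determines how much inappropriate content a reviewer sees per batch.</p>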
      </sec>
    </sec>
    <sec id="sec-4">
      <title>6. Summary and Future Work</title>
      <sec id="sec-4-1">
        <p>Our results suggest that TAR workflows developed for legal review tasks may substantially reduce costs for content moderation tasks. Other legal workflow techniques, such as routing near duplicates and conversational threads in batches to the same reviewer, may be worth testing as well.</p>
        <p>This preliminary experiment omitted complexities that should be explored in more detailed studies. Both content moderation and legal cases involve (at different time scales) streaming collection of data, and concomitant constraints on the time available to make a review decision. Batching and prioritization must reflect these constraints.</p>
        <p>Moderation in addition must deal with temporal variation in both textual content and the definitions of sensitive content, as well as scaling across many languages and cultures. As litigation and investigations become more international, these challenges may be faced in the law as well, providing opportunity for the legal and moderation fields to learn from each other.</p>
        <p>Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 1668–1678.</p>
        <p>[28] J. Pavlopoulos, P. Malakasiotis, I. Androutsopoulos, Deeper attention to abusive user content moderation, in: Proceedings of the 2017 conference on empirical methods in natural language processing, 2017, pp. 1125–1135.</p>
        <p>[29] C. Van Hee, G. Jacobs, C. Emmery, B. Desmet, E. Lefever, B. Verhoeven, G. De Pauw, W. Daelemans, V. Hoste, Automatic detection of cyberbullying in social media text, PloS one 13 (2018) e0203794.</p>
        <p>[30] K. Reynolds, A. Kontostathis, L. Edwards, Using machine learning to detect cyberbullying, in: 2011 10th International Conference on Machine learning and applications and workshops, volume 2, IEEE, 2011, pp. 241–244.</p>
        <p>[31] A. Schmidt, M. Wiegand, A survey on hate speech detection using natural language processing, in: Proceedings of the Fifth International workshop on natural language processing for social media, 2017, pp. 1–10.</p>
        <p>[32] E. Wulczyn, N. Thain, L. Dixon, Ex machina: Personal attacks seen at scale, in: Proceedings of the 26th International Conference on World Wide Web, International World Wide Web Conferences Steering Committee, 2017, pp. 1391–1399.</p>
        <p>[33] T. Davidson, D. Warmsley, M. Macy, I. Weber, Automated hate speech detection and the problem of offensive language, in: Eleventh international aaai conference on web and social media, 2017.</p>
        <p>[34] N. Djuric, J. Zhou, R. Morris, M. Grbovic, V. Radosavljevic, N. Bhamidipati, Hate speech detection with comment embeddings, in: Proceedings of the 24th international conference on world wide web, ACM, 2015, pp. 29–30.</p>
        <p>[35] P. Fortuna, S. Nunes, A survey on automatic detection of hate speech in text, ACM Computing Surveys (CSUR) 51 (2018) 1–30.</p>
        <p>[36] C. Nobata, J. Tetreault, A. Thomas, Y. Mehdad, Y. Chang, Abusive language detection in online user content, in: Proceedings of the 25th international conference on world wide web, International World Wide Web Conferences Steering Committee, 2016, pp. 145–153.</p>
        <p>[37] M. Zampieri, S. Malmasi, P. Nakov, S. Rosenthal, N. Farra, R. Kumar, Predicting the type and target of offensive posts in social media, arXiv preprint arXiv:1902.09666 (2019).</p>
        <p>[38] R. Kumar, A. N. Reganti, A. Bhatia, T. Maheshwari, Aggression-annotated corpus of hindi-english code-mixed data, in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018), 2018.</p>
        <p>[39] G. K. Pitsilis, H. Ramampiaro, H. Langseth, Detecting offensive language in tweets using deep learning, arXiv preprint arXiv:1801.04433 (2018).</p>
        <p>[40] S. Sotudeh, T. Xiang, H.-R. Yao, S. MacAvaney, E. Yang, N. Goharian, O. Frieder, Guir at semeval-2020 task 12: Domain-tuned contextualized models for offensive language detection, arXiv preprint arXiv:2007.14477 (2020).</p>
        <p>[41] M. Zampieri, S. Malmasi, P. Nakov, S. Rosenthal, N. Farra, R. Kumar, Semeval-2019 task 6: Identifying and categorizing offensive language in social media (offenseval), arXiv preprint arXiv:1903.08983 (2019).</p>
        <p>[42] M. Zampieri, P. Nakov, S. Rosenthal, P. Atanasova, G. Karadzhov, H. Mubarak, L. Derczynski, Z. Pitenis, Ç. Çöltekin, Semeval-2020 task 12: Multilingual offensive language identification in social media (offenseval 2020), arXiv preprint arXiv:2006.07235 (2020).</p>
        <p>[43] R. Binns, M. Veale, M. Van Kleek, N. Shadbolt, Like trainer, like bot? inheritance of bias in algorithmic content moderation, in: International Conference on Social Informatics, Springer, 2017, pp. 405–415.</p>
        <p>[44] L. Dixon, J. Li, J. Sorensen, N. Thain, L. Vasserman, Measuring and mitigating unintended bias in text classification, in: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 2018, pp. 67–73.</p>
        <p>[45] N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, A. Galstyan, A survey on bias and fairness in machine learning, arXiv preprint arXiv:1908.09635 (2019).</p>
        <p>[46] D. Link, B. Hellingrath, J. Ling, A human-is-the-loop approach for semi-automated content moderation, in: ISCRAM, 2016.</p>
        <p>[47] N. M. Pace, L. Zakaras, Where the money goes: Understanding litigant expenditures for producing electronic discovery, RAND Corporation, 2012.</p>
        <p>[48] M. Bagdouri, W. Webber, D. D. Lewis, D. W. Oard, Towards minimizing the annotation cost of certified text classification, in: CIKM 2013, ACM, 2013, pp. 989–998.</p>
        <p>[49] G. V. Cormack, M. R. Grossman, Autonomy and reliability of continuous active learning for technology-assisted review, arXiv preprint arXiv:1504.06868 (2015).</p>
        <p>[50] B. Settles, Active learning literature survey (2009).</p>
        <p>[51] J. Rocchio, Relevance feedback in information retrieval, The Smart retrieval system-experiments in automatic document processing (1971) 313–323.</p>
        <p>[52] D. D. Lewis, W. A. Gale, A sequential algorithm for training text classifiers, in: SIGIR 1994, 1994, pp. 3–12.</p>
        <p>[53] G. V. Cormack, M. R. Grossman, Engineering Quality and Reliability in Technology-Assisted Review,</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>in: SIGIR, ACM Press, Pisa, Italy, 2016, pp. 75–84. URL: http://dl.acm.org/citation.cfm?doid=2911451.2911510. doi:10.1145/2911451.2911510.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[54] D. D. Lewis, E. Yang, O. Frieder, Certifying one-phase technology-assisted reviews (2021).</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[55] E. Yang, D. D. Lewis, O. Frieder, Heuristic stopping […] Proceedings of the 21st ACM Symposium on Document Engineering, 2021.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[56] D. Li, E. Kanoulas, When to stop reviewing in […] Systems (TOIS) 38 (2020) 1–36.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[57] G. V. Cormack, M. R. Grossman, Evaluation of […] review in electronic discovery, SIGIR 2014 (2014) 153–162. doi:10.1145/2600428.2609601.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[58] Y.-Y. Yang, S.-C. Lee, Y.-A. Chung, T.-E. Wu, S.-A. […] University, 2017. URL: https://github.com/ntu […] //arxiv.org/abs/1710.00379.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>