<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Comparative Religion, Topic Models, and Conceptualization: Towards the Characterization of Structural Relationships between Online Religious Discourses</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zachary K. Stine</string-name>
          <email>zkstine@ualr.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>James E. Deitrick</string-name>
          <email>deitrick@uca.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nitin Agarwal</string-name>
          <email>nxagarwal@ualr.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Arkansas at Little Rock, 2801 S. University Ave.</institution>
          ,
          <addr-line>Little Rock, AR 72204</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Central Arkansas</institution>
          ,
          <addr-line>201 Donaghey Ave., Conway, AR 72035</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>3903</fpage>
      <lpage>3914</lpage>
      <abstract>
        <p>The similarity between the lexicons of different religious discourses does not necessarily reflect the similarity between the ways of understanding the world inherent in their discourses. Drawing on scholarship from comparative religion that distinguishes between surface-level, lexical distinctions and deeper grammatical and structural distinctions between two religious traditions, we present a computational approach to assessing the structural similarity between religious discourses irrespective of their lexical differences. We argue that unsupervised machine learning models trained on different discourses can be indirectly compared by how consistently they organize information, as an operationalization of structural similarity. This consistency can be quantified as the mutual information between the models' clusterings of a designated set of comparison data. We present our approach through a case study comparing discussions from Reddit concerning Buddhism and Christianity.</p>
      </abstract>
      <kwd-group>
        <kwd>comparative religion</kwd>
        <kwd>topic modeling</kwd>
        <kwd>information theory</kwd>
        <kwd>digital religion</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        rather than cultural lexicon in order to measure what we are calling their structural similarity.
We assume that a particular discourse reflects a way of understanding the world or some aspect
of it [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. In other words, a discourse reflects a cultural grammar or structure. However, as in
the example of Olcott, a discourse may be expressed within a cultural lexicon that is
incongruous with the underlying cultural grammar (see section 2.1 for a more detailed discussion of
this phenomenon). Importantly, our use of the term “lexical” should be understood to refer
to how culturally specific a particular term is, reflecting this notion of cultural lexicon.
      </p>
      <p>Our approach is based on the assumption that a discourse divides the world, or some aspect
of it, in a particular way. In other words, a categorization scheme is implicit within a discourse.
Given this assumption, we argue that if discourses are structurally similar, they can be expected
to produce categorization schemes that carve up information in a mutually consistent manner,
despite differences in culturally specific lexicons used in each discourse. We operationalize this
notion using unsupervised machine learning models that are trained on each discourse being
compared. Each model learns a clustering scheme that is specific to the discourse used to train
it, thereby acting as a plausible representation of that discourse’s categorization scheme. We
then interrogate the relationship between categorization schemes (represented by the learned
models) by forcing each model to apply its discourse-specific scheme to the unseen discourse
with which it is being compared. We then measure the mutual consistency with which each
discourse-specific model classifies both its own discourse and the comparison discourse using
the mutual information between the resulting clusterings.</p>
      <p>
        In order to better clarify what we are attempting to do in this approach, we draw on and
extend a particular usage of the term “conceptualization.” A clustering of a data set can be
understood as implying a particular conceptualization of that data, and multiple clusterings
may imply various ways that a researcher might conceptualize the data, with potential
differences or similarities between them [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. In this sense, clustering data leads a researcher to
interpret the resulting clusters as salient concepts for understanding the data. In that case, a
single data set is explored through various clusterings in order to find useful conceptualization
schemes. Here, we use “conceptualization” to mean how one discourse—as represented within
a model trained on it—organizes another discourse in terms of its own semantic elements. In
other words, the representation of a different, unseen discourse by a model trained on a
different discourse can be understood as how the training discourse “conceptualizes” this unseen
discourse.
      </p>
      <p>
        This usage of “conceptualization” is especially useful given the type of unsupervised model
we use: latent Dirichlet allocation (or LDA). LDA learns two things from a corpus: a set of
word-usage patterns (or topics), which can be understood as corpus-specific concepts, and a
representation of each document in the corpus as a mixture of these word-usage patterns [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
Importantly, these word-usage patterns may be characterized by the corpus-specific lexicon
alongside less corpus-specific terms. When we force such a model to represent the documents
of a different corpus as mixtures of its own word-usage patterns, we get a representation of
that corpus through the lens of the corpus used to train the model. In other words, we get a
conceptualization of this different corpus in terms of the training corpus. We argue that the
mutual consistency with which both corpus-specific models “conceptualize” each other reflects
their structural similarity: the degree to which the two discourses-as-models agree in how they
categorize each training corpus. In this operationalization, structural similarity is reflected by
the mutual information between how the two models organize input, regardless of how different
the actual word-usage patterns are between the two models. In this way, we are not comparing
the features of each model directly, but instead are comparing only how consistently these
corpus-specific features are applied by each model. From this, we get a mapping from the
“true” word-usage patterns (from the model trained on the corpus) to those used in another
model’s conceptualization of the corpus. This mapping can be usefully thought
of as the interpretation of one model’s topics by another model.
      </p>
      <p>
        Motivated by prior work in comparative religion concerning encounters between Buddhism
and American Protestantism [
        <xref ref-type="bibr" rid="ref11 ref12">12, 11</xref>
        ], we explore the empirical implications of this
operationalization in a narrow case study between two English-language discourses from the popular
discussion platform, Reddit: r/Buddhism and r/Christianity. Importantly, there is no reason to
assume that either discourse we examine constitutes a general representation of global
Buddhism or Christianity (assuming such general forms, untethered from particular social systems,
are even valid to begin with). Instead, these discourses should be understood to reflect only
the particular versions of Buddhism and Christianity which emerge from these online
communities. In other words, rather than focus our comparisons on abstract representations of
Buddhism and Christianity, we focus our comparisons on specific communities engaged in
discussing Buddhism and Christianity. Therefore, our findings should not be construed as
reflections of Buddhism and Christianity as transcendent forms, but as contingent upon these
online communities. We include two additional communities to help contextualize our results.
      </p>
      <p>
        Far from being trivial or unserious objects of scholarly inquiry, such online discourses offer
valuable insights into how religious traditions are understood and engaged with in popular
culture. In recent years, a body of literature has emerged specifically around the study of
religion in digital contexts under the name of “digital religion” [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Given the popularity of
Reddit, it is reasonable to think that an understanding of its religious communities does have
salience for understanding popular conceptions of religious traditions in the English-speaking
world. Additionally, the quantity of data that is available from these communities is sufficiently
large to be an obstacle to researchers analyzing these data without the aid of computational
tools. Quantitative methods are underused within the study of digital religion [
        <xref ref-type="bibr" rid="ref17 ref22">23, 17</xref>
        ], and so
another goal of this work is to demonstrate how such methods may be imported and customized
as useful complements to qualitative methods.
      </p>
      <p>We find evidence that our proposed operationalization of structural similarity accords with
our expectations about the relationship between the two subreddits’ discourses and with the
discourses of two secondary subreddits. Additionally, we investigate which features from models
of r/Buddhism and r/Christianity are most responsible for their structural similarities by
calculating the pointwise mutual information between each possible pair of features. We find
that the context in which the two corpora are compared strongly influences which feature pairs
emerge as most strongly related between models. We also find that, while these feature pairs
may have stark differences between the lexical items that characterize them, their mappings
between models often appear surprisingly reasonable, as if the paired features were analogies
for each other within their different lexical contexts.</p>
      <p>In the following sections of this article, we provide background for understanding the
theoretical framework we present, describe the data and methods used to illustrate this framework
within the case study of the r/Buddhism and r/Christianity discourses from Reddit, present
our findings from the case study, and briefly discuss what these findings suggest about our
operationalization and directions for further investigation.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>In the following subsections, we provide background information necessary for constructing
our argument that the mutual information between topic models trained on lexically distinct
religious discourses can be understood as a reflection of their structural relationship.</p>
      <sec id="sec-2-1">
        <title>2.1. Comparative religion and religious creolization</title>
        <p>
          While this comparative problem may be faced in a variety of cultural contexts, we explore it
from within the context of comparative religion, and so a brief consideration of the problems
faced in comparative religion will provide important context for understanding the challenges
faced by this work. Paden identifies three primary criticisms that have been levied against
traditional comparative approaches [
          <xref ref-type="bibr" rid="ref30">31</xref>
          ]. First, comparativism may mislead by suppressing
differences between cultures, engaging in colonialist reductiveness. Second, comparativists have
sometimes been guilty of introducing theological or ontological assumptions into their work in
an unscientific manner. Finally, charges have been made that comparativism is untheoretical
in that it lacks the ability to explain religious differences and similarities.
        </p>
        <p>
          The use of empirical methods in comparative religion has been suggested as a possible
antidote to this last criticism [
          <xref ref-type="bibr" rid="ref23">24</xref>
          ], and while computational methods are certainly not objective,
they at least reduce the ways in which researchers may introduce their own faulty assumptions
into an analysis or make those assumptions explicit. However, the potential for reductionism in
computational approaches is worth consideration. Computational methods, specifically those
from machine learning, are effective in identifying large-scale patterns within data too
numerous for individuals to comb through. Such large-scale analyses require a trade-off between
the particular and the general. In other words, machine learning methods excel at
illuminating trends and generalities, but potentially at the expense of finer-grained variation. While
reductionism is certainly a concern, it has been argued by some that a preoccupation with
reductionism has substantially hindered comparative religion [
          <xref ref-type="bibr" rid="ref36 ref9">37, 9</xref>
          ].
        </p>
        <p>
          With these challenges in mind, we now turn to the comparative work undertaken by Deitrick
concerning the relationship between the social ethics of engaged Buddhism and mainstream
American religion, which serves as the inspiration for the present study. In [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], Deitrick
invokes a theory of religious creolization put forward to describe the religious life of Henry
Steel Olcott [
          <xref ref-type="bibr" rid="ref32">33</xref>
          ]. This theory posits a distinction between a religion’s grammatical structures
and the particular lexical forms through which these structures are expressed. In the case
of American engaged Buddhism, Deitrick argues that, in terms of its social ethics, it can be
understood as the adoption of a Buddhist lexicon to describe cultural structures that ultimately
reflect mainstream American religion. Deitrick refers to this as an “inverse creole faith” in that
it reverses the power dynamics of what is typically referred to as “creole”—a dominant group
adopts the lexicon of a minority group [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>In the present study, we are interested in whether the Buddhist discourse from Reddit is
only lexically distinct from the Christian discourse, or if it is both lexically and structurally
distinct.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Religion on Reddit</title>
        <p>
          Reddit consists of a large number of communities, called subreddits, which facilitate discussions
around a defined theme or topic. Users can author submissions to a subreddit and author
comments within discussion threads that accompany each submission. Data from Reddit have
been usefully analyzed in work ranging from the effectiveness of hate speech bans [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], violations
of community norms [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], persuasion [
          <xref ref-type="bibr" rid="ref40">41</xref>
          ], birth narratives [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], and discourses around China [
          <xref ref-type="bibr" rid="ref39">40</xref>
          ].
        </p>
        <p>Reddit is a useful source of popular discourses for several reasons. Most importantly, each
community constitutes a discourse that is endogenously defined. Constructing a corpus that
represents a particular religious tradition is complicated by the decisions that must be made
about which documents to include and exclude from the corpus. In the case of Reddit, such
consequential decisions are avoided: The community of users and their discussions presents an
unambiguously delineated discourse. Additionally, comparative analyses of subreddits have the
benefit that all subreddits being analyzed are subject to the same effects that stem from simply
being on Reddit, whether in the form of demographic trends of its users or the affordances of
the platform. Each subreddit we analyze is predominantly English-language.</p>
        <p>While a number of subreddits exist which focus on Buddhist and Christian traditions, we
limit ourselves to r/Buddhism and r/Christianity for two reasons. First, our primary goal in
this paper is to explain our proposed approach for making structural comparisons between
religious discourses; therefore, we analyze these two subreddits to serve as a focused case study.
Second, r/Buddhism and r/Christianity appear to be the most general subreddits dedicated to
their respective religious traditions as well as having the largest discussion histories. We are
more interested in popular conceptions of Buddhism generally rather than engagements with
more specific traditions within, for example, Theravada, Mahayana, or Vajrayana Buddhism.
Similarly, we are interested in general conceptions of Christianity rather than in specific
denominations. This is not to suggest that communities with a narrower focus on more specific
traditions and denominations are irrelevant to our questions, but simply that, within the
current study, we are interested in the two most popular subreddits that involve discussions of
Buddhism or Christianity.</p>
        <p>Various sects and denominations are surely represented to some extent in these communities,
but there is no reason to think they are represented in a balanced way—certain perspectives
may loom larger than others. However, to reiterate a previous point, we are not studying
r/Buddhism because we mistakenly believe it to be an accurate representation of global
Buddhist perspectives. Instead, we study it because it is a wildly popular Buddhist discussion
community on a wildly popular social media platform and its discourse is therefore salient
for understanding Buddhism within popular English-language online culture. The same
applies to r/Christianity. In future work, we intend to extend our approach to other
communities including several smaller sect-specific subreddits alongside those analyzed here. However,
r/Buddhism and r/Christianity remain reasonable and interesting starting places for our case
study for the reasons just given.</p>
        <p>To provide context for our results comparing r/Buddhism and r/Christianity, we also report
results comparing them with two other subreddits: r/religion and r/math. Our rationale for
including r/religion is that we expect it to reflect a tendency in Western culture to associate
the notion of religion with Abrahamic traditions and especially with Christianity. We include
r/math because we expect that, while r/Buddhism and r/Christianity may present two distinct
discourses, they are more likely to reflect similar conceptualization schemes with each other
than with discussions about mathematics. Additionally, the inclusion of r/math serves as a
check to make sure that our approach is still capable of showing dissimilarity and not simply
forcing all corpora being compared to appear mostly similar.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Latent Dirichlet allocation</title>
        <p>
          We use latent Dirichlet allocation (LDA) to represent each discourse as a topic model. LDA
views a corpus as the result of a generative statistical process in which each document in the
corpus is generated by drawing a probability distribution over a set of “topics”—probability
distributions over the vocabulary of the corpus—from which each word in the document is then
drawn [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. In training, LDA attempts to infer the distribution over topics for each document
in the corpus as well as the distributions over the vocabulary (or “topics”). The learned topics
correspond to latent features underlying the corpus. While these features may sometimes
correspond to colloquial usages of “topic,” they are better understood as patterns of
word-usage, or as [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] suggests, contexts. The topics of LDA can also be understood to reflect several
concepts from the sociology of culture [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. LDA not only provides a representation of each
document in the corpus as a mixture of these features but can also provide representations of
unseen documents not included in the training corpus as mixtures of these features.
        </p>
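As a concrete illustration of the generative view just described, the sketch below implements LDA with a minimal collapsed Gibbs sampler over a toy corpus of integer word ids. This is an illustrative reconstruction, not the implementation used in the paper, and the hyperparameter values α and β here are arbitrary:

```python
import numpy as np

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA: resample each token's topic from its
    conditional distribution, then read off document-topic (theta) and
    topic-word (phi) distributions from the accumulated counts."""
    rng = np.random.default_rng(seed)
    z = [rng.integers(n_topics, size=len(d)) for d in docs]  # token-level topic assignments
    doc_topic = np.zeros((len(docs), n_topics))
    topic_word = np.zeros((n_topics, vocab_size))
    topic_total = np.zeros(n_topics)
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            doc_topic[d, t] += 1; topic_word[t, w] += 1; topic_total[t] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]
                # remove the current assignment, then resample from the conditional
                doc_topic[d, t] -= 1; topic_word[t, w] -= 1; topic_total[t] -= 1
                p = (doc_topic[d] + alpha) * (topic_word[:, w] + beta) \
                    / (topic_total + vocab_size * beta)
                t = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = t
                doc_topic[d, t] += 1; topic_word[t, w] += 1; topic_total[t] += 1
    theta = (doc_topic + alpha) / (doc_topic + alpha).sum(axis=1, keepdims=True)
    phi = (topic_word + beta) / (topic_word + beta).sum(axis=1, keepdims=True)
    return theta, phi

# Toy corpus: each document is a list of integer word ids.
theta, phi = lda_gibbs([[0, 1, 2, 0], [3, 4, 5, 3], [0, 2, 1, 1]],
                       n_topics=2, vocab_size=6)
print(theta.round(2))  # each row is a document's mixture over the k topics
```

A model of this kind supplies exactly the two objects the text describes: corpus-specific word-usage patterns (phi) and per-document topic mixtures (theta).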
        <p>
          An unsupervised algorithm, LDA learns the topics and document-topic distributions without
any specifications of what the content of its features ought to look like. However, LDA does
require the selection of the number of topics, k. Different choices of k may influence the
specificity of the learned features, with smaller values of k yielding more general topics and
larger values yielding more specific topics [
          <xref ref-type="bibr" rid="ref1 ref28">29, 1</xref>
          ]. Quantitative evaluation of LDA models is a
complex problem, and qualitative evaluation is typically necessary to ensure that a model is
understandable and therefore helpful to a researcher [35]. Ultimately, it may not make sense
to think of one model as more correct than another, even if one appears optimal according
to one or more evaluation metrics, but rather to see each as a plausible representation of the
training corpus.
        </p>
        <p>
          LDA has been previously used within the context of religious studies including a comparative
analysis of three Confucian texts [
          <xref ref-type="bibr" rid="ref29">30</xref>
          ] and an investigation into mind-body holism in medieval
Chinese thought [
          <xref ref-type="bibr" rid="ref37">38</xref>
          ]. LDA has also been used in comparative contexts outside of religious
studies to compare the proceedings of natural language processing conferences over time [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]
and to compare two discourses about China from Reddit [
          <xref ref-type="bibr" rid="ref39">40</xref>
          ]. In each of these cases, LDA is
used to train a common topic model that is shared by each of the collections being compared.
The relevant documents, terms, or collections of documents are then compared within this
shared topic space. This approach makes sense when the objects being compared are not
characterized by distinct lexicons or if such lexical distinctions are of interest. What differentiates
the approach we describe here is that we are not comparing objects within a shared topic space
but are instead comparing how topic models try to fit unseen, lexically distinct discourses into
their own topic spaces that are specific to their training discourses. We are not looking at
which topics are associated with which discourse but are instead comparing how much
consistency exists between how models place documents from different discourses within their own
discourse-specific features. In other words, we are looking at how different models
“conceptualize” other discourses and measuring the consistency between those conceptualizations rather
than measuring the similarity between the concepts themselves.
        </p>
        <p>
          The LDA models trained on the discussions of r/Buddhism and r/Christianity can be thought
of as representations of their corresponding discourses, where we understand a discourse as a
way of understanding the world or some aspects of it [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. While useful, these
representations are not perfect, functioning more like metonyms of the corresponding discourses [
          <xref ref-type="bibr" rid="ref31">32</xref>
          ].
We propose thinking of LDA models as not only representations of a discourse, but also as
operationalizations of a discourse in that we can deploy the organizational scheme of the model
in novel contexts to see how the model organizes new information, i.e., how it conceptualizes.
In addition to learning features and a representation of the training corpus as mixtures of
those features, a trained LDA model also has the ability to infer the topic mixtures of new
documents using the Dirichlet parameter for document-topic distributions (typically notated
as α), which serves as the prior in the inference process for new documents’ topic distributions.
When inferring the topic distributions of unseen documents, this prior acts as the conceptual
disposition of the model, which is taken in along with the observed text of the new document
to determine its topic distribution. If we were to ask the model to infer the topic mixture of a
blank document, it would simply assign this prior topic mixture.
        </p>
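The fold-in behavior described here can be sketched as follows: the topic-word distributions are held fixed, and only the new document's mixture is inferred, with a symmetric prior α standing in for the model's conceptual disposition. This is our EM-style illustration of the idea, not the authors' exact inference procedure, and the topic matrix is a toy example:

```python
import numpy as np

def infer_mixture(word_ids, phi, alpha=0.1, n_iter=50):
    """Infer a topic mixture for an unseen document against FIXED topics phi,
    so the model applies its discourse-specific scheme without retraining."""
    n_topics = phi.shape[0]
    theta = np.full(n_topics, 1.0 / n_topics)
    for _ in range(n_iter):
        resp = theta[:, None] * phi[:, word_ids]   # topic responsibility per token
        resp /= resp.sum(axis=0, keepdims=True)
        counts = resp.sum(axis=1)                  # expected topic counts
        theta = (counts + alpha) / (counts + alpha).sum()
    return theta

phi = np.array([[0.7, 0.2, 0.1],   # toy topic 0 over a 3-word vocabulary
                [0.1, 0.2, 0.7]])  # toy topic 1
print(infer_mixture([0, 0, 1], phi))  # leans toward topic 0
print(infer_mixture([], phi))         # blank document: just the prior disposition
```

Note that, as the text states, a blank document receives the prior mixture: with no observed tokens, the update reduces to normalizing α alone.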
        <p>Contrary to the usual goals of machine learning, we do not want these discourse models to
generalize beyond their training data. Instead, we want them to reflect only the conceptual
schemes latent in their training corpus. Rather than examine the similarity between the
features of the models, which reflect differences in lexical content, we are interested in the
mutual consistency between how the models conceptualize—do certain features tend to be
co-applied to documents regardless of the lexical differences that constitute those features?</p>
      </sec>
      <sec id="sec-2-4">
        <title>2.4. Information theory</title>
        <p>
          Information theory provides a useful means for quantifying the kinds of relationships we are
trying to uncover between discourses. To quantify the consistency with which two LDA models
conceptualize a discourse, we use the mutual information between each model’s topic
assignments. Introduced in the context of communication channels by [
          <xref ref-type="bibr" rid="ref35">36</xref>
          ], the mutual information of
two random variables, I(X; Y), quantifies, in bits, the reduction in uncertainty about X (or Y)
that is provided by knowing Y (or X) [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. A common usage of mutual information is to
measure how similarly two clustering schemes partition a set of observations (e.g., [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]).
Typically, this is done for hard clusterings in which each observation is assigned to a single class,
as distinct from LDA, which assigns observations (documents) to a mixture of multiple classes
(topics). A method for “hardening” topic mixtures from LDA is proposed by [
          <xref ref-type="bibr" rid="ref39">40</xref>
          ]. However,
we calculate the mutual information between the probabilistic clusters of documents, following
[21], which does have some complications. Other information theoretic quantities exist for
comparing two clusterings, including variations based on mutual information (e.g., [
          <xref ref-type="bibr" rid="ref27">28</xref>
          ]) and
the metric, variation of information [
          <xref ref-type="bibr" rid="ref24">25</xref>
          ], which we plan to compare with the standard mutual
information in further work.
        </p>
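For the hard-clustering case described above, the mutual information between two partitions of the same documents can be computed directly from their contingency table. A minimal, self-contained sketch (illustrative only; the probabilistic case replaces counts with summed topic probabilities):

```python
import numpy as np

def mutual_information(labels_a, labels_b):
    """Mutual information (in bits) between two hard clusterings of the same items."""
    a_vals, a_idx = np.unique(labels_a, return_inverse=True)
    b_vals, b_idx = np.unique(labels_b, return_inverse=True)
    # joint contingency table of cluster co-assignments, normalized to probabilities
    joint = np.zeros((len(a_vals), len(b_vals)))
    for i, j in zip(a_idx, b_idx):
        joint[i, j] += 1
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)   # marginal over clustering A
    pb = joint.sum(axis=0, keepdims=True)   # marginal over clustering B
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())

# Identical binary partitions (up to relabeling) share exactly 1 bit.
print(mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

Because mutual information depends only on co-assignment patterns, the two clusterings' labels need not correspond: relabeled but identical partitions still yield maximal agreement, which is exactly what lets lexically distinct models be compared.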
        <p>
          Additionally, measures of information divergence provide a useful means for quantifying how
lexically distinct two discourses are and how strongly each individual term distinguishes them. One such
quantity, the Kullback-Leibler divergence, provides an asymmetric measure of how much one
probability distribution differs from an expectation based on another distribution [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ]. The
Kullback-Leibler divergence (or KLD) has been previously used alongside LDA to characterize
the reading behavior of Charles Darwin [
          <xref ref-type="bibr" rid="ref26">27</xref>
          ], innovation within parliamentary speeches [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], and
legislative change [
          <xref ref-type="bibr" rid="ref38">39</xref>
          ]. The Jensen-Shannon divergence (or JSD) is a symmetrical divergence
derived from the KLD [
          <xref ref-type="bibr" rid="ref21">22</xref>
          ]. The JSD has been previously used to measure the distinguishability
between distributions of features from violent and non-violent court trials [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. It has also been
used to measure the difference between LDA topics (e.g., [
          <xref ref-type="bibr" rid="ref25">26</xref>
          ]).
        </p>
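Both divergences are short to state in code. The sketch below computes the KLD and the JSD derived from it for two toy word-frequency distributions over a shared vocabulary:

```python
import numpy as np

def kld(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; asymmetric, and assumes
    q > 0 wherever p > 0 (always true for the mixture used in the JSD)."""
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / q[nz])))

def jsd(p, q):
    """Jensen-Shannon divergence: a symmetrized KLD against the mixture m."""
    m = 0.5 * (p + q)
    return 0.5 * kld(p, m) + 0.5 * kld(q, m)

# Toy relative word frequencies for two discourses.
p = np.array([0.5, 0.3, 0.2, 0.0])
q = np.array([0.0, 0.2, 0.3, 0.5])
print(round(jsd(p, q), 3))
```

With base-2 logarithms the JSD is bounded between 0 (identical distributions) and 1 bit (disjoint supports), which makes values comparable across subreddit pairs.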
        <p>
          The contribution of each feature to the total JSD between distributions can also be
calculated. For example, this is done in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] to identify which trial features most distinguish violent
from non-violent trials and vice versa. We use the JSD between the relative frequencies of each
word between subreddits to quantify how lexically distinct the discourses of the subreddits are
from each other. Additionally, we can characterize the extent to which each word functions
as part of a discourse’s lexicon by calculating each word’s individual contribution to the total
JSD between discourses. In this context, a word’s contribution to the JSD between discourses
represents how strongly the word implies one discourse over another.
        </p>
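The per-word decomposition described here follows directly from the JSD definition: each word's term in the sum is its contribution, and the contributions sum to the total divergence. The words and frequencies below are hypothetical:

```python
import numpy as np

def jsd_contributions(p, q):
    """Per-word contributions to the Jensen-Shannon divergence; larger values
    mark words that more strongly imply one discourse over the other."""
    m = 0.5 * (p + q)
    contrib = np.zeros_like(m)
    nz_p, nz_q = p > 0, q > 0
    contrib[nz_p] += 0.5 * p[nz_p] * np.log2(p[nz_p] / m[nz_p])
    contrib[nz_q] += 0.5 * q[nz_q] * np.log2(q[nz_q] / m[nz_q])
    return contrib

vocab = ["dharma", "prayer", "people", "world"]   # hypothetical word types
p = np.array([0.40, 0.01, 0.30, 0.29])            # discourse A frequencies
q = np.array([0.01, 0.40, 0.30, 0.29])            # discourse B frequencies
for word, c in sorted(zip(vocab, jsd_contributions(p, q)), key=lambda x: -x[1]):
    print(f"{word}: {c:.4f}")  # the lexically marked words dominate the total JSD
```

Words used at similar rates in both discourses contribute nothing, while words concentrated in one discourse contribute most, matching the interpretation of contributions as lexical markers.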
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Methods and Data</title>
      <p>In this section, we describe our data collection and preprocessing steps, then assemble the
framework built from the concepts introduced in the previous section, explaining it in parallel
with the methods we use to compare the discourses of r/Buddhism and r/Christianity.1</p>
      <sec id="sec-3-1">
        <title>3.1. Data collection and preprocessing</title>
        <p>We collected data from the two subreddits of primary interest, r/Buddhism and r/Christianity,
as well as from r/religion and r/math. For the subreddits of interest, we first collected all
available submission IDs from the creation date of the subreddit through the end of 2019.
These submission IDs were collected from the service PushShift.io (using the Python wrapper
PSAW), which maintains historical data from Reddit. We then used Reddit’s own Application
Programming Interface (API) (using the Python wrapper, PRAW) to collect the submission
title, body text, and all comments for each submission ID, which were written to CSV files
along with relevant metadata such as user ID and timestamps.</p>
        <p>After collecting the submissions from each subreddit, we performed basic preprocessing on
the text. Tokens are lowercase strings with a minimum length of three characters. URLs are
tokenized so that they are reduced to their hostname with hyphens replacing any punctuation
(e.g., “en.wikipedia.org” becomes “en-wikipedia-org”). References to users and subreddits are
preceded by “u/” and “r/” respectively. We preserve these indicators when tokenizing so that
a distinction is made in cases where a user name or subreddit name overlaps with another
word type. For example, if a comment references the subreddit, r/Buddhism, that reference
will be assigned to the word type, “r-buddhism” in order to distinguish it from the word
type, “buddhism.” Tokens other than URLs, user names, and subreddit names do not include
punctuation or numeric characters.</p>
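        <p>The tokenization rules above can be sketched in Python as follows. This is our own minimal reconstruction for illustration; the function name and regular expressions are assumptions, and the paper's released code (linked below) may differ in detail:</p>
```python
import re

def tokenize(text):
    """Sketch of the tokenization rules described above (names are ours).

    - URLs are reduced to their hostname, with punctuation replaced by hyphens.
    - u/ and r/ references are preserved as distinct word types (u-name, r-name).
    - All other tokens are lowercase alphabetic strings of length >= 3.
    """
    # Replace each URL with its hyphenated hostname (e.g., en-wikipedia-org).
    def url_repl(m):
        host = re.sub(r'https?://', '', m.group(0)).split('/')[0]
        return ' ' + re.sub(r'[^a-z0-9]+', '-', host.lower()) + ' '
    text = re.sub(r'https?://\S+', url_repl, text)
    # Preserve user/subreddit references as distinct word types.
    text = re.sub(r'\b([ur])/(\w+)', r'\1-\2', text.lower())
    tokens = []
    for tok in re.findall(r'[a-z][a-z0-9-]*', text):
        # Non-URL, non-reference tokens must be purely alphabetic, length >= 3.
        if '-' in tok or (tok.isalpha() and len(tok) >= 3):
            tokens.append(tok)
    return tokens
```
        <p>For example, tokenizing "See https://en.wikipedia.org/wiki/Dharma in r/Buddhism" under these rules yields the word types "see", "en-wikipedia-org", and "r-buddhism".</p>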
        <p>We created a custom set of 42 stopwords from the most frequent words in each subreddit
which were removed from all documents. Additionally, words which occurred in fewer than
five documents within each subreddit were removed from all documents. After word removal,
the final vocabulary was limited to words that were within the 30,000 most frequent words of a
subreddit. Using this final vocabulary, only documents with 20 or more tokens were included
in each subreddit’s corpus. An overview of the data collected can be seen in Table 1.</p>
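        <p>A minimal sketch of this vocabulary and document filtering, written by us for illustration (parameter names and defaults mirror the description above but are otherwise assumptions):</p>
```python
from collections import Counter

def build_corpus(docs, stopwords, min_df=5, max_vocab=30000, min_len=20):
    """Sketch of the filtering described above (names and structure are ours)."""
    df = Counter()  # document frequency of each word
    tf = Counter()  # total frequency of each word
    for doc in docs:
        tf.update(doc)
        df.update(set(doc))
    # Drop stopwords and words in fewer than min_df documents, then keep
    # only the max_vocab most frequent remaining words.
    kept = [w for w, _ in tf.most_common()
            if w not in stopwords and df[w] >= min_df]
    vocab = set(kept[:max_vocab])
    # Keep only documents with at least min_len in-vocabulary tokens.
    filtered = [[w for w in doc if w in vocab] for doc in docs]
    return vocab, [doc for doc in filtered if len(doc) >= min_len]
```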
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Quantifying the lexical distinctness of discourses</title>
        <p>While it might be reasonable to take it for granted that the discourses of r/Buddhism and
r/Christianity use cultural lexicons that distinguish each from the other, we use the JSD
between the relative word frequencies from each subreddit to quantify the degree to which
they are lexically distinct from each other. For each word type in the combined vocabulary
of the subreddits, we calculate the probability of each word within a subreddit as the number
of times that word occurs divided by the total number of tokens present in all documents
from that subreddit. We then calculate the JSD between the two distributions for each pair
of subreddits under consideration.</p>
        <p>1 All code used for this analysis is available at https://github.com/zacharykstine/chr2020_comp_relg_lda.</p>
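        <p>The JSD between two word-probability distributions can be computed as follows (our own sketch; the paper's released code may differ). Words absent from one subreddit simply receive probability zero in that subreddit's distribution over the combined vocabulary:</p>
```python
import numpy as np

def jsd(p, q, base=2.0):
    """Jensen-Shannon divergence between two probability vectors p and q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)  # the mean distribution
    def kld(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute 0.
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) / np.log(base)
    return 0.5 * kld(p, m) + 0.5 * kld(q, m)
```
        <p>With base 2, the JSD of two non-overlapping distributions is 1 bit, and the JSD of identical distributions is 0.</p>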
        <p>
          Additionally, we calculate the individual contributions of each word to the JSD between
r/Buddhism and r/Christianity to see if the words which contribute the most to the total
JSD reasonably correspond to what we would expect to see in the cultural lexicons of the
subreddits. The way in which we calculate the JSD contribution of each term differs slightly
from the method used by [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. There, the authors calculate the partial KLD of each feature
from one distribution to the mean of the two distributions, which quantifies how much each
feature signals one particular distribution over the other. Here, we simply calculate the
per-feature JSD contributions by calculating the partial KLD of each feature for both distributions.
This results in two partial KLD values for each feature from which we take the mean to get
the partial JSD of the feature. Done this way, we can see which terms are most distinguishing
between the two subreddits from both directions, rather than which terms distinguish one
subreddit over the other.
        </p>
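        <p>A vectorized sketch of the per-word contributions just described (our own implementation; variable names are assumptions). Each word's contribution is the mean of its two partial KLD terms, and the contributions sum to the total JSD:</p>
```python
import numpy as np

def jsd_contributions(p, q):
    """Per-word JSD contributions (base 2): mean of the two partial KLDs."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    with np.errstate(divide='ignore', invalid='ignore'):
        # Partial KLD terms; words with zero probability contribute 0.
        part_p = np.where(p > 0, p * np.log2(p / m), 0.0)
        part_q = np.where(q > 0, q * np.log2(q / m), 0.0)
    return 0.5 * (part_p + part_q)  # sums to the total JSD
```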
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Structural comparisons between discourses</title>
        <p>
          We now propose and explain our implementation of the structural comparisons between the
discourses of r/Buddhism and r/Christianity. We separately train LDA models with 30 topics
on the r/Buddhism corpus and the r/Christianity corpus using the Gensim package for Python
[
          <xref ref-type="bibr" rid="ref33">34</xref>
          ]. For brevity, we will refer to the model trained on r/Buddhism as model B, and the
30-topic model trained on r/Christianity as model C, and refer to the ith topic of a model as B.i or
C.i. After training each model, we get three primary results: a set of “topics” (or features) as
probability distributions over the vocabulary, a representation of all documents in the training
corpus as distributions of topics, and a way to infer topic distributions for unseen documents.
We qualitatively choose labels for each topic based on the highest probability words in the
topic as well as close readings of exemplar documents of the topic.
        </p>
        <p>
          A more common way to compare these two models would be to calculate the similarity
or distance between the topics from one model and the topics from the other model (e.g.,
as in [
          <xref ref-type="bibr" rid="ref25">26</xref>
          ]). However, we are less interested in how similar the models’ topics are, and more
interested in how similarly the models apply their topics. This is a substantial distinction. We
are acknowledging that the two different models, trained on two different corpora, may have
completely different topics. However, as long as the models apply those topics to documents in
a mutually consistent fashion, the models functionally conceptualize the documents similarly.
Our assumption is that, if two models organize input in a mutually consistent fashion,
then they are similar at a structural level regardless of how different their particular features
are from each other.
        </p>
        <p>
          It is common to think of LDA models as primarily being the set of inferred topics, but
this is only half of the full picture. In addition to their topics, LDA models instantiate a
particular organizational scheme that takes input text and categorizes it as a mixture of those
topics, weighing the observed text being input with a model’s learned disposition for applying
its topics. In other words, LDA models can be thought of as both a representation and an
operationalization of a discourse, with topics being the former and the way in which models
apply those topics to particular documents being the latter. Drawing on and extending the
use of the term in [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] and [
          <xref ref-type="bibr" rid="ref15">15</xref>
], we frame this activity of assigning topics to new information
as conceptualization—the activity of the model representing the novel information in
terms of its own discursive features (topics) and dispositions (the trained model’s posterior
document-topic distribution parameters, which act as a prior when doing inference on new
documents). By comparing how two models apply their topics to a set of documents (rather
than comparing their topics directly), we are comparing how each model conceptualizes that
particular set of documents. If two models conceptualize information in a mutually consistent
way, then they share a kind of similarity that is deeper than the particular forms their concepts
(or topics) take. This is what we are referring to as structural similarity, distinct from lexical
similarity.
        </p>
        <p>To quantify this shared consistency between two ways of conceptualizing input, we calculate
the mutual information between their conceptualizations of a set of documents. Given two
clusterings of the same set of objects, the mutual information between the two clusterings
represents how much information knowing one cluster assignment provides for knowing the
assignment made by the other clustering. As previously noted, mutual information is typically
used to quantify the similarity between two “hard” clusterings—those in which each object is
assigned to a single cluster. However, we calculate the mutual information between document-topic
distributions from two models following the method proposed in [21] by multiplying the
transposed document-topic probability matrix from one model with the document-topic matrix
of the other model to create a kind of contingency table from which the joint and marginal
probabilities of the topics from the two models can be calculated.</p>
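        <p>This construction can be made concrete with a short NumPy sketch (our own reconstruction of the computation described above; array and function names are assumptions):</p>
```python
import numpy as np

def soft_mutual_information(theta_a, theta_b):
    """Mutual information (in bits) between two soft clusterings of the
    same n documents.

    theta_a: (n, k_a) document-topic matrix from model A (rows sum to 1)
    theta_b: (n, k_b) document-topic matrix from model B (rows sum to 1)
    """
    theta_a, theta_b = np.asarray(theta_a, float), np.asarray(theta_b, float)
    n = theta_a.shape[0]
    # Soft contingency table: entry (i, j) accumulates how much probability
    # mass topics A.i and B.j receive in the same documents.
    contingency = theta_a.T @ theta_b
    joint = contingency / n                  # joint topic probabilities
    pa = joint.sum(axis=1, keepdims=True)    # marginal over model A's topics
    pb = joint.sum(axis=0, keepdims=True)    # marginal over model B's topics
    with np.errstate(divide='ignore', invalid='ignore'):
        pmi = np.log2(joint / (pa * pb))
    # MI is the joint-probability-weighted sum of the pointwise values.
    return float(np.sum(np.where(joint > 0, joint * pmi, 0.0)))
```
        <p>As a sanity check, two identical hard clusterings of four documents into two balanced clusters share 1 bit of mutual information, while a clustering compared against a uniform (maximally uncertain) one shares 0 bits.</p>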
        <p>Calculating the mutual information between two LDA models in this way requires us to
choose the set of documents across which the two models will be compared, and there is no
reason to suppose that the mutual information between models will be the same when different
document sets are used to calculate it. If we assume that the topic assignments made on
the same documents which were used to train the model are the “true” topic assignments of
that corpus, we can think of the mutual information between the models based on that corpus
as representing how well the other model is able to interpret the first.</p>
        <p>For example, when comparing models B and C on the r/Buddhism corpus used to train model
B, we consider the topic assignments made by model B to be the “true” assignments, since B
was trained from this corpus. The topic assignments made by model C, on the other hand,
represent something very different. Model C, acting as an extension of its training corpus,
is forced to apply its own set of topics (or contexts) from r/Christianity to r/Buddhism. In
other words, model C conceptualizes r/Buddhism based on the broad discourse underlying
r/Christianity. So if model B assigns a document from r/Buddhism to have high probability of
topic B.i, the topic assignment made by model C can be understood as model C’s interpretation
of B.i. If model C is highly certain about how to assign a topic mixture, perhaps assigning
the document to have high probability for topic C.j, then model C can be understood as
interpreting B.i as C.j within the context of this single document. If, on the other hand, model
C is highly uncertain about how to assign a topic mixture to the document, the resulting
distribution of topics may be highly spread out, lacking a clear mapping from model B to C.</p>
        <p>As this is repeated over all of the documents from r/Buddhism, if B.i and C.j continue to
occur with high probability in the same documents, then the association between them (in
the context of r/Buddhism) continues to strengthen. If, however, model C applies a variety
of topics to documents with topic B.i, whether by topic distributions that are continually
spread out over the topics or by applying high probability topics which vary from document
to document, then the interpretation of B.i by model C becomes less clear. This relationship
between the topics is quantified by the mutual information between the models.</p>
        <p>Specific relationships between a single topic from one model with a single topic from the
other model are quantified by the pointwise mutual information between them. The mutual
information is simply the expected pointwise mutual information between all topic pairs across
models. Importantly, the mutual information between two models is contingent on the set of
documents over which they are compared. As we will show, the mutual information between
B and C will depend on the comparison corpus, and more notably, the strongest mappings
between topic pairs will also depend on the comparison corpus.</p>
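        <p>The per-pair values can be read off the same soft contingency table used for the mutual information; a sketch (ours, with assumed names) follows. The mutual information is the joint-probability-weighted sum of these pointwise values:</p>
```python
import numpy as np

def topic_pair_pmi(theta_a, theta_b):
    """Pointwise mutual information for every topic pair (A.i, B.j),
    given (n, k_a) and (n, k_b) document-topic matrices over the same docs."""
    theta_a, theta_b = np.asarray(theta_a, float), np.asarray(theta_b, float)
    joint = (theta_a.T @ theta_b) / theta_a.shape[0]
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    with np.errstate(divide='ignore', invalid='ignore'):
        pmi = np.log2(joint / (pa * pb))  # shape (k_a, k_b)
    return pmi
```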
        <p>The argument we are exploring here is that if two models representing two lexically
distinct discourses are functionally similar (in that they organize information similarly), then the
discourses represented by the two models are structurally similar—the two discourses divide
aspects of the world up into similar categories, despite using different lexical items to describe
the categories. The degree of structural similarity between models is reflected in the mutual
information between them on a particular discourse.</p>
        <p>In the present article, we empirically explore this argument by comparing how models trained
on the discourses of r/Buddhism and r/Christianity interpret each other by calculating the
mutual information between their topic assignments twice: once for each corpus to act as the
comparison corpus. To contextualize these results, we compare them to the self-mutual
information of each corpus and corresponding model. We also compare how each model interprets
models trained on the discussions of r/math and r/religion. For each comparison, we refer to
the model trained on the comparison corpus as the source model and refer to the model trained
on a corpus other than the comparison corpus as the interpreting model.</p>
        <p>To better understand what the mutual information between models represents, we look at
which topic pairs between models B and C have the largest pointwise mutual information. To
assess how diferent our proposed method for comparing models is from a direct comparison
of topics between models, we also calculate the distance between all topic pairs using the
Jensen-Shannon divergence.</p>
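        <p>This direct, topic-to-topic comparison can be sketched as follows (our own implementation; it assumes both models' topic-word distributions are expressed over a shared vocabulary, with zeros for out-of-model words):</p>
```python
import numpy as np

def topic_jsd_matrix(topics_a, topics_b):
    """JSD (base 2) between every pair of topic-word distributions."""
    A = np.asarray(topics_a, float)  # shape (k_a, vocab)
    B = np.asarray(topics_b, float)  # shape (k_b, vocab)
    out = np.empty((A.shape[0], B.shape[0]))
    with np.errstate(divide='ignore', invalid='ignore'):
        for i, p in enumerate(A):
            for j, q in enumerate(B):
                m = 0.5 * (p + q)
                # Partial KLDs; zero-probability words contribute nothing.
                kl_p = np.sum(np.where(p > 0, p * np.log2(p / m), 0.0))
                kl_q = np.sum(np.where(q > 0, q * np.log2(q / m), 0.0))
                out[i, j] = 0.5 * (kl_p + kl_q)
    return out
```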
        <p>To get a sense of how dependent these results are on using models with 30 topics, we also
train models on each subreddit with 60 topics and calculate the mutual information between
them. To differentiate models with different numbers of topics, we subscript the model name
with k (e.g., B30 or C60).</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>In this section, we report the results of our methodology within the narrow case study of
the subreddits previously described. Our goal in reporting these results is to illustrate the
empirical implications of the way we have defined and operationalized the notion of structural
similarity.</p>
      <sec id="sec-4-1">
        <title>4.1. Lexical comparisons</title>
        <p>When we calculate the JSD between the vocabulary distributions of each subreddit, we find that
r/Christianity and r/religion have the least divergence between them. In other words, they are
the least lexically distinct pair. We also find that r/Buddhism is slightly less lexically distinct
from r/religion than from r/Christianity. The JSD values between the vocabulary distributions
of the subreddits are given in Table 2. The relationships between subreddits that emerge from
their lexical distinctness provide a good baseline against which we can compare their structural
similarity. As we will show in the sections below, there is some disagreement between the
ordering of lexical similarity between subreddits in Table 2 and the orderings we obtain from
their structural similarity, reported in the subsections that follow. This disagreement, though
slight, is an encouraging sign that our approach to calculating structural similarity is not
simply a more complicated but functionally equivalent calculation of the lexical similarity—
it measures something different.</p>
        <p>A sample of the twenty-two words with the largest contributions to the JSD between the
vocabulary distributions of r/Buddhism and r/Christianity is provided in Table 3. Most of
these highly distinguishing terms are reasonable candidates for the cultural lexicons of either
subreddit. Some terms such as “practice” or “suffering” may not be unique to a single
religious lexicon. However, their relatively large JSD contributions indicate that they are highly
distinguishing terms between the subreddits—they are strong signals of one discourse over the
other. Importantly, this way of quantifying the extent to which a word functions as part of a
discourse’s lexicon is dependent on the discourse it is being compared with.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Mutual information between models</title>
        <p>Given that we are calculating mutual information between probabilistic clusterings of
documents, we first calculate the mutual information between each model and itself. This
self-mutual information for each model gives us a rough sense of the maximum mutual information
possible for the corpus on which the model was trained. For that reason, when we report
mutual information between models trained on different corpora, we also report what percentage
of the self-mutual information that value is, according to the self-mutual information of the
source corpus. The self-mutual information values for each subreddit can be found in Table 4.</p>
        <p>When we calculate the mutual information between models B30 and C30 with r/Buddhism as
the comparison corpus, we get 0.182 bits or 59% of the mutual information model B30 has with
itself. Similarly, we find that B60 and C60 have mutual information of 0.133 bits (57% of the
self-mutual information of B60). When we calculate the mutual information between B30 and
C30 within the context of the r/Christianity corpus, we get 0.168 bits or 42% of the self-mutual
information of model C30. We likewise find 0.125 bits of mutual information between B60 and
C60 (40% of the self-mutual information of C60).</p>
        <p>These results can be understood to reflect how well the two models—and as imperfect
representations, the two discourses—interpret each other. In the form of their corresponding
models, the discourse of r/Christianity is capable of interpreting the discourse of r/Buddhism
better than r/Buddhism can interpret r/Christianity. This is true in the case of the models
with 30 topics as well as the 60-topic models (see Tables 5 and 6).</p>
        <p>Simply knowing the mutual information values between does not provide strong intuitions
about their structural similarity, so we contextualize these values with comparisons to r/religion
and r/math. Given the number of topics in the model trained on r/religion that reflect
generally Abrahamic and monotheistic religious concerns, we expect r/Christianity and r/religion to
have higher structural similarity with each other than any other subreddit pairing. We find that
the largest mutual information between any two subreddits occurs between r/Christianity and
r/religion when the comparison corpus is r/Christianity. This is the case for both the 30-topic
and 60-topic models. In the case of the 30-topic models, r/religion interprets r/Christianity
with 54% of the self-mutual information of r/Christianity, the third-highest. In the
60-topic models, r/religion interprets r/Christianity with 63% of the self-mutual information
of r/Christianity, rising to the second-highest.</p>
        <p>In the case of r/math, we expect both r/Buddhism and r/Christianity to be highly distinct,
both lexically and structurally. Accordingly, the four comparisons done between the subreddits
of interest and r/math generate the lowest four mutual information values (as percentages of
the appropriate self-mutual information). This is true for the 30-topic models and for the
60-topic models. Mutual information values for models with 30 topics can be seen in Table 5,
and values for models with 60 topics can be seen in Table 6.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Pointwise mutual information between topics</title>
        <p>While the mutual information between models on a comparison corpus provides a high-level
picture of the relationship between models, it is also possible to dig into which features of the
discourses are mapped together by looking at which topic pairs between models have the highest
pointwise mutual information. For brevity, we only focus on the 30-topic models, B30 and C30,
as a case study for which we obtain all 900 pointwise mutual information values for each
combination of topics for both r/Buddhism and r/Christianity with each as the source corpus.
As examples, we report the ten topic pairs with the highest pointwise mutual information
in Tables 7 and 8, annotated with our qualitative topic labels based on high-probability topic
words and close readings of exemplar documents for each topic. Notably, these examples reveal
that, despite their lexical differences, these mappings appear surprisingly reasonable in many
cases.</p>
        <p>The topic pairs with high pointwise mutual information suggest interesting analogies. For
example, the association that emerges between topics B.24 and C.18 suggests that discussions
about dietary ethics are to r/Buddhism what discussions about abortion are to r/Christianity.
The content of these discussions is considerably distinct lexically. Yet, these divisive ethical
and moral debates occur in both subreddits with the particular focus of the debates marking
the discourse as that of r/Christianity (in the case of abortion) or of r/Buddhism (in the case
of eating meat).</p>
        <p>This example provides important clues as to how this method of comparison works. When
model B30 encounters discussions about abortion in r/Christianity, it is confronted with terms
that are not prominent in its training corpus from r/Buddhism. None of the topics in model
B30 include the term “abortion” as a high-probability term and so the term does not play much
of a role in model B30 choosing an appropriate topic mixture. Instead, model B30 is forced
to ignore lexically distinct terms like “abortion” in favor of terms that are less distinguishing
between the two discourses. Thus a common structural property emerges between the discourses,
one that we might give a non-discourse-specific label such as “contentious ethical
issues.”</p>
        <p>Additionally, we find that the relative strength of the associations between topics is
dependent on the comparison corpus used. The interpretation by model C30 of B.24 as C.18 has
the second-highest pointwise mutual information (see Table 7), whereas the interpretation by
model B30 of C.18 as B.24 ranks tenth (see Table 8).</p>
        <p>In order to assess how diferent these topic mappings are from those we might get using a more
standard method of comparing topics directly, we calculate the Jensen-Shannon divergence
between each pair of topics between model B30 and model C30. The ten most similar topic
pairs (i.e., those with the lowest Jensen-Shannon divergence) can be seen in Table 9. We find
that, while overlap certainly exists, the ten most similar pairs of topics between models are
not necessarily those that appear most salient when making indirect comparisons within the
context of a comparison corpus.</p>
        <p>Topics B.16 and C.15 appear as the most similar when compared directly in this way. This is
also true when compared indirectly through the interpretation of r/Buddhism by r/Christianity
(in the form of model B30 and C30) as shown in Table 7. However, this topic pair is ranked
twelfth when indirectly compared through the interpretation of r/Christianity by r/Buddhism.
Evidently, the choice of comparison corpus is consequential for how salient the same topic
pair is within the comparison. The differences between direct and indirect comparisons can
be severe. When r/Buddhism interprets r/Christianity, the
relationship between C.04 and B.07 is strongest. When r/Christianity interprets r/Buddhism,
this pairing is ranked 32nd. When compared directly using the Jensen-Shannon divergence,
the pair is ranked 672nd. Clearly, indirect comparisons between topics within the context
specified by a comparison corpus are capable of painting substantially different pictures of how
the features between two models are mapped.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>Our goal in reporting the above results is not to prove the validity of our operationalization of
structural similarity but to provide a glimpse of what this operationalization looks like within a
narrow case study, and to see how closely the results of this case study conform to our intuitions.
Further work is therefore necessary to continue exploring the method we have proposed here.
While we have suggested one possible operationalization of structural similarity, there are likely
to be many different possible operationalizations which may overcome limitations present in
ours.</p>
      <p>An important limitation of the analysis we present here is that we have only considered
two sets of LDA models for representing the discourses. LDA models trained on the same
corpus and with the same parameters may still exhibit differences due to the randomness in
the training process. For this reason, it is possible that particularities within these models may
produce mutual information that is highly dependent on those particularities. In future work,
we will examine the relationships between corpora where each is represented by a variety of
LDA models in order to get a more robust reading of the mutual information that tends to
occur between models trained on different corpora.</p>
      <p>We believe that an important strength of the approach we outline here is that it does not
require any significant modifications to each corpus beyond standard preprocessing. However,
our next steps will include an approach in which each corpus is modified in such a way that it is
forced to be less lexically distinct from the corpora with which it is compared. Possibilities for
reducing the lexical distinctness between two corpora might include the removal of certain terms
based on their contribution to the JSD between the vocabulary distributions of the corpora
being compared. Additionally, the methods put forward by [42] to reduce the correlation
between the topics of an LDA model and metadata may be appropriate for this context as
well.</p>
      <p>
        If our attempt at quantifying structural relationships between discourses has some
validity, we can begin to explore comparative religion (and perhaps comparative culture more
broadly conceived) as a meta-clustering problem in which relationships between various
clustering schemes learned from different discourses suggest similarities and differences that go
far deeper than lexical distinctions. This is similar to the meta-clustering problem described
in [
        <xref ref-type="bibr" rid="ref15">15</xref>
], except in that case, the different clusterings being compared are all learned from the
same set of observations. Our case, wherein each clustering is learned from a different set of
observations, brings up additional complications. Most importantly, it is not clear whether
or not the structural similarity, as we have defined it here, between two discourses is stable
across various contexts in which the discourses are compared (i.e., the comparison corpus). As
our results show, the structural similarity is contingent on the context in which the discourses
are compared. However, it is possible that, as two discourses are compared within a greater
variety of comparison corpora, their structural similarity becomes stable. Even if a stable
trend of structural similarity does not emerge between discourses, examining the contexts
in which their structural similarity differs should still offer useful insights.
      </p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>
        Drawing from the comparative religion research in [
        <xref ref-type="bibr" rid="ref11 ref12">12, 11</xref>
        ] and the framing of unsupervised
machine learning models as conceptualization schemes found in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we have
proposed a computational theory of the structural similarity between lexically distinct religious
discourses—discourses that are characterized by distinct lexicons. We have argued that, if two
unsupervised machine learning models organize information with a high degree of mutual
consistency as quantified by the mutual information between them, then they share a high degree
of structural similarity, regardless of the lexical distinctions between the models’
representations. Using latent Dirichlet allocation as our model of choice, we developed our theory and
explored its empirical implications for a case study comparing the discourses of two discussion
communities from Reddit: r/Buddhism and r/Christianity. The results from this case study
suggest that our method for quantifying structural similarity has merit and warrants further
exploration.
      </p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This research is funded in part by grants from the U.S. National Science Foundation
(OIA1946391, OIA-1920920, IIS-1636933, ACI-1429160, and IIS-1110868), U.S. Office of Naval
Research (N00014-10-1-0091, N00014-14-1-0489, N00014-15-P-1187, N00014-16-1-2016,
N0001416-1-2412, N00014-17-1-2675, N00014-17-1-2605, N68335-19-C-0359, N00014-19-1-2336,
N6833520-C-0540), U.S. Air Force Research Lab, U.S. Army Research Office (W911NF-17-S-0002,
W911NF-16-1-0189), U.S. Defense Advanced Research Projects Agency (W31P4Q-17-C-0059),
Arkansas Research Alliance, the Jerry L. Maulden/Entergy Endowment at the University of
Arkansas at Little Rock, and the Australian Department of Defense Strategic Policy Grants
Program (SPGP) (award number: 2020-106-094) to the third co-author, Nitin Agarwal. Any
opinions, findings, and conclusions or recommendations expressed in this material are those
of the authors and do not necessarily reflect the views of the funding organizations. The
researchers gratefully acknowledge this support.</p>
      <p>[21] Y. Lei et al. “Generalized information theoretic cluster validity indices for soft clusterings”. In: 2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM). 2014, pp. 24–31.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Allen</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Murdock</surname>
          </string-name>
          .
          <article-title>LDA Topic Modeling: Contexts for the History &amp; Philosophy of Science. Preprint of a chapter forthcoming in Ramsey</article-title>
          , G.,
          <string-name>
            <surname>De Block</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          .(Eds.)
          <article-title>The Dynamics of Science: Computational Frontiers in History and Philosophy of Science</article-title>
          . Pittsburgh University Press; Pittsburgh. May
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Antoniak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mimno</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Levy</surname>
          </string-name>
          .
          “
          <article-title>Narrative Paths and Negotiation of Power in Birth Stories”</article-title>
          .
          <source>In: Proc. ACM Hum.-Comput. Interact. 3.CSCW (Nov.</source>
          <year>2019</year>
          ). doi: 10.1145/3359190.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K. D.</given-names>
            <surname>Bailey</surname>
          </string-name>
          .
          <article-title>Typologies and Taxonomies: An Introduction to Classification Techniques. 1st. Quantitative Applications in the Social Sciences</article-title>
          . Beverly Hills, CA: Sage,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A. T. J.</given-names>
            <surname>Barron</surname>
          </string-name>
          et al. “
          <article-title>Individuals, institutions, and innovation in the debates of the French Revolution”</article-title>
          .
          <source>In: Proceedings of the National Academy of Sciences 115.18</source>
          (
          <year>2018</year>
          ), pp.
          <fpage>4607</fpage>
          -
          <lpage>4612</lpage>
. issn: 0027-8424. doi: 10.1073/pnas.1717729115. eprint: https://www.pnas.org/content/115/18/4607.full.pdf. url: https://www.pnas.org/content/115/18/4607.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
<string-name>
<given-names>M. I.</given-names>
<surname>Jordan</surname>
</string-name>
.
          “
          <article-title>Latent Dirichlet Allocation”</article-title>
          .
          <source>In: Journal of Machine Learning Research 3.1</source>
          (
<year>2003</year>
          ), pp.
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Campbell</surname>
          </string-name>
          . “
          <article-title>Making Space for Religion in Internet Studies”</article-title>
          .
          <source>In: The Information Society 21.4</source>
          (
<year>2005</year>
), pp.
<fpage>309</fpage>
-
<lpage>315</lpage>
. doi: 10.1080/01972240591007625.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Chandrasekharan</surname>
          </string-name>
          et al. “
          <article-title>The Internet's Hidden Rules: An Empirical Study of Reddit Norm Violations at Micro, Meso, and Macro Scales”</article-title>
          .
          <source>In: Proc. ACM Hum.-Comput. Interact. 2.CSCW (Nov</source>
          .
          <year>2018</year>
). doi: 10.1145/3274301.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Chandrasekharan</surname>
          </string-name>
          et al. “
          <article-title>You Can't Stay Here: The Efficacy of Reddit's 2015 Ban Examined Through Hate Speech”</article-title>
          .
          <source>In: Proc. ACM Hum.-Comput. Interact. 1.CSCW (Dec</source>
          .
          <year>2017</year>
). doi: 10.1145/3134666.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F.</given-names>
            <surname>Cho</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. K.</given-names>
            <surname>Squiers</surname>
          </string-name>
          . “
          <article-title>Religion as a Complex and Dynamic System”</article-title>
          .
          <source>In: Journal of the American Academy of Religion 81.2</source>
          (
Apr.
          <year>2013</year>
          ), pp.
          <fpage>357</fpage>
          -
          <lpage>398</lpage>
. issn: 0002-7189. doi: 10.1093/jaarel/lft016.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Cover</surname>
          </string-name>
          and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Thomas</surname>
          </string-name>
          .
<source>Elements of Information Theory</source>. 2nd. Hoboken
          , NJ: John Wiley &amp; Sons, Inc.,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
<string-name>
<given-names>J. E.</given-names>
<surname>Deitrick</surname>
</string-name>
.
          “
          <article-title>Engaged Buddhist ethics: Mistaking the boat for the shore”</article-title>
          . In: Action Dharma: New Studies in Engaged Buddhism. Ed. by
          <string-name>
            <given-names>C.</given-names>
            <surname>Queen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Prebish</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Keown</surname>
          </string-name>
          . 1st.
          <article-title>RoutledgeCurzon Critical Studies in Buddhism</article-title>
          . New York, NY: RoutledgeCurzon,
          <year>2003</year>
          , pp.
          <fpage>252</fpage>
          -
          <lpage>269</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
<string-name>
<given-names>J. E.</given-names>
<surname>Deitrick</surname>
</string-name>
.
          “
          <article-title>Mistaking the Boat for the Shore?: A Critical Analysis of Socially Engaged Buddhism in the United States”</article-title>
          .
          <source>UMI Number: 3041445</source>
          . Los Angeles, CA: University of Southern California,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>I. S.</given-names>
            <surname>Dhillon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mallela</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Modha</surname>
          </string-name>
          . “
<article-title>Information-Theoretic Co-Clustering”</article-title>
          .
          <source>In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD '03</source>
          .
Washington
          , D.C.: Association for Computing Machinery,
          <year>2003</year>
          , pp.
          <fpage>89</fpage>
          -
          <lpage>98</lpage>
. isbn: 1581137370. doi: 10.1145/956750.956764.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>DiMaggio</surname>
          </string-name>
,
<string-name>
<given-names>M.</given-names>
<surname>Nag</surname>
</string-name>
, and
          <string-name>
            <given-names>D.</given-names>
            <surname>Blei</surname>
          </string-name>
          . “
          <article-title>Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of U.S. government arts funding”</article-title>
          .
          <source>In: Poetics 41.6</source>
          (
          <year>2013</year>
          ).
          <source>Topic Models and the Cultural Sciences</source>
          , pp.
          <fpage>570</fpage>
          -
          <lpage>606</lpage>
. issn: 0304-422X. doi: 10.1016/j.poetic.2013.08.004. url: http://www.sciencedirect.com/science/article/pii/S0304422X13000661.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Grimmer</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>King.</surname>
          </string-name>
          “
          <article-title>General purpose computer-assisted clustering and conceptualization”</article-title>
          .
          <source>In: Proceedings of the National Academy of Sciences 108.7</source>
          (
<year>2011</year>
          ), pp.
          <fpage>2643</fpage>
          -
          <lpage>2650</lpage>
. issn: 0027-8424. doi: 10.1073/pnas.1018067108.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>D.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jurafsky</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          . “
          <article-title>Studying the History of Ideas Using Topic Models”</article-title>
          .
          <source>In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. EMNLP '08</source>
          .
Honolulu
          , Hawaii: Association for Computational Linguistics,
          <year>2008</year>
          , pp.
          <fpage>363</fpage>
          -
          <lpage>371</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hutchings</surname>
          </string-name>
          . “
          <article-title>Digital Humanities and the Study of Religion”</article-title>
          . In:
          <article-title>Between Humanities and the Digital</article-title>
          . Ed. by
          <string-name>
            <given-names>P.</given-names>
            <surname>Svensson</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. T.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          . 1st. Cambridge, MA: The MIT Press,
          <year>2015</year>
          , pp.
          <fpage>283</fpage>
          -
          <lpage>294</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jørgensen</surname>
          </string-name>
          and
          <string-name>
            <given-names>L. J.</given-names>
            <surname>Phillips</surname>
          </string-name>
          .
          <article-title>Discourse Analysis as Theory and Method</article-title>
          .
          <source>1st. London: Sage</source>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>S.</given-names>
            <surname>Klingenstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hitchcock</surname>
          </string-name>
          , and
<string-name>
<given-names>S.</given-names>
<surname>DeDeo</surname>
</string-name>
.
          “
          <article-title>The civilizing process in London's Old Bailey”</article-title>
          .
          <source>In: Proceedings of the National Academy of Sciences 111.26</source>
          (
          <year>2014</year>
          ), pp.
          <fpage>9419</fpage>
          -
          <lpage>9424</lpage>
. issn: 0027-8424. doi: 10.1073/pnas.1405984111. eprint: https://www.pnas.org/content/111/26/9419.full.pdf. url: https://www.pnas.org/content/111/26/9419.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kullback</surname>
          </string-name>
          and
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Leibler</surname>
          </string-name>
          . “
          <article-title>On Information and Sufficiency”</article-title>
          . In: Ann. Math. Statist.
          <volume>22</volume>
          .1 (
Mar.
          <year>1951</year>
          ), pp.
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
. doi: 10.1214/aoms/1177729694. url: https://doi.org/10.1214/aoms/1177729694.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          . “
          <article-title>Divergence measures based on the Shannon entropy”</article-title>
          .
          <source>In: IEEE Transactions on Information Theory 37.1</source>
          (
<year>1991</year>
          ), pp.
          <fpage>145</fpage>
          -
          <lpage>151</lpage>
. doi: 10.1109/18.61115.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lövheim</surname>
          </string-name>
          and
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Campbell</surname>
          </string-name>
          . “
          <article-title>Considering critical methods and theoretical lenses in digital religion studies”</article-title>
          .
          <source>In: New Media &amp; Society 19.1</source>
          (
<year>2017</year>
          ), pp.
          <fpage>5</fpage>
          -
          <lpage>14</lpage>
. doi: 10.1177/1461444816649911.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>L. H.</given-names>
            <surname>Martin</surname>
          </string-name>
          . “
          <article-title>Comparison”</article-title>
          . In: Guide to the Study of Religion. Ed. by
          <string-name>
            <given-names>W.</given-names>
            <surname>Braun</surname>
          </string-name>
and
<string-name>
<given-names>R. T.</given-names>
<surname>McCutcheon</surname>
</string-name>
. 1st. London: Cassell
          ,
          <year>2005</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>56</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.</given-names>
            <surname>Meilă</surname>
          </string-name>
          . “
          <article-title>Comparing clusterings-an information based distance”</article-title>
          .
          <source>In: Journal of Multivariate Analysis 98.5</source>
          (
<year>2007</year>
          ), pp.
          <fpage>873</fpage>
          -
          <lpage>895</lpage>
. issn: 0047-259X. doi: 10.1016/j.jmva.2006.11.013. url: http://www.sciencedirect.com/science/article/pii/S0047259X06002016.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>F.</given-names>
            <surname>Morstatter</surname>
          </string-name>
          et al. “
          <article-title>Is the Sample Good Enough? Comparing Data from Twitter's Streaming API with Twitter's Firehose”</article-title>
          .
          <source>In: International AAAI Conference on Web and Social Media</source>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>J.</given-names>
            <surname>Murdock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Allen</surname>
          </string-name>
          , and
<string-name>
<given-names>S.</given-names>
<surname>DeDeo</surname>
</string-name>
.
          “
          <article-title>Exploration and exploitation of Victorian science in Darwin's reading notebooks”</article-title>
          .
          <source>In: Cognition</source>
          <volume>159</volume>
          (
          <year>2017</year>
          ), pp.
          <fpage>117</fpage>
          -
          <lpage>126</lpage>
. issn: 0010-0277. doi: 10.1016/j.cognition.2016.11.012. url: http://www.sciencedirect.com/science/article/pii/S0010027716302840.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>M. E. J.</given-names>
            <surname>Newman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. T.</given-names>
            <surname>Cantwell</surname>
          </string-name>
, and
<string-name>
<given-names>J.-G.</given-names>
<surname>Young</surname>
</string-name>
. “
          <article-title>Improved mutual information measure for clustering, classification, and community detection”</article-title>
          .
          <source>In: Phys. Rev. E</source>
          <volume>101</volume>
          (
          <issue>4</issue>
          Apr.
          <year>2020</year>
          ), p.
          <fpage>042304</fpage>
. doi: 10.1103/PhysRevE.101.042304. url: https://link.aps.org/doi/10.1103/PhysRevE.101.042304.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>D.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          et al. “
          <article-title>How we do things with words: Analyzing text as social and cultural data”</article-title>
. In: CoRR abs/1907.01468 (
<year>2019</year>
). arXiv: 1907.01468.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>R.</given-names>
            <surname>Nichols</surname>
          </string-name>
          et al. “
          <article-title>Modeling the Contested Relationship between Analects, Mencius, and Xunzi: Preliminary Evidence from a Machine-Learning Approach”</article-title>
          .
          <source>In: The Journal of Asian Studies 77.1</source>
          (
<year>2018</year>
          ), pp.
          <fpage>19</fpage>
          -
          <lpage>57</lpage>
. doi: 10.1017/S0021911817000973.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [31]
<string-name>
<given-names>W. E.</given-names>
<surname>Paden</surname>
</string-name>
.
          “
          <article-title>Comparative Religion”</article-title>
          . In:
          <article-title>The Routledge Companion to the Study of Religion</article-title>
          . Ed. by
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Hinnells</surname>
          </string-name>
          . 1st. New York, NY: Routledge,
          <year>2005</year>
. Chap. 11
          , pp.
          <fpage>208</fpage>
          -
          <lpage>225</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>A.</given-names>
            <surname>Piper</surname>
          </string-name>
. “
<article-title>There Will Be Numbers”</article-title>
.
          <source>In: Journal of Cultural Analytics (May</source>
          <year>2016</year>
). doi: 10.22148/16.006.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>S.</given-names>
            <surname>Prothero</surname>
          </string-name>
          .
          <article-title>The White Buddhist: The Asian Odyssey of Henry Steel Olcott</article-title>
          .
1st. Indianapolis, IN: Indiana University Press,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>R.</given-names>
            <surname>Řehůřek</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Sojka</surname>
          </string-name>
          . “
          <article-title>Software Framework for Topic Modelling with Large Corpora”</article-title>
          .
          <source>English. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks</source>
          . http://is.muni.cz/publication/884893/en. Valletta, Malta: ELRA, May
          <year>2010</year>
          , pp.
          <fpage>45</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
[35]
<string-name>
<given-names>M. E.</given-names>
<surname>Roberts</surname>
</string-name>
,
<string-name>
<given-names>B. M.</given-names>
<surname>Stewart</surname>
</string-name>
, and
<string-name>
<given-names>D.</given-names>
<surname>Tingley</surname>
</string-name>
          . “
          <article-title>Navigating the Local Modes of Big Data: The Case of Topic Models”</article-title>
          .
          <source>In: Computational Social Science: Discovery and Prediction</source>
          . Ed. by
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Alvarez</surname>
          </string-name>
          . 1st. New York, NY: Cambridge University Press,
          <year>2016</year>
. Chap. 2
          , pp.
          <fpage>51</fpage>
          -
          <lpage>97</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [36]
<string-name>
<given-names>C. E.</given-names>
<surname>Shannon</surname>
</string-name>
. “
          <article-title>A mathematical theory of communication”</article-title>
          .
          <source>In: The Bell system technical journal 27</source>
          .3 (
<year>1948</year>
          ), pp.
          <fpage>379</fpage>
          -
          <lpage>423</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>E.</given-names>
            <surname>Slingerland</surname>
          </string-name>
          . “
          <article-title>Who's Afraid of Reductionism? The Study of Religion in the Age of Cognitive Science”</article-title>
          .
          <source>In: Journal of the American Academy of Religion 76.2</source>
          (
Mar.
          <year>2008</year>
          ), pp.
          <fpage>375</fpage>
          -
          <lpage>411</lpage>
. doi: 10.1093/jaarel/lfn004.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>E.</given-names>
            <surname>Slingerland</surname>
          </string-name>
          et al. “
          <article-title>The Distant Reading of Religious Texts: A “Big Data” Approach to Mind-Body Concepts in Early China”</article-title>
          .
          <source>In: Journal of the American Academy of Religion 85.4</source>
          (
Mar.
          <year>2017</year>
          ), pp.
          <fpage>985</fpage>
          -
          <lpage>1016</lpage>
. issn: 0002-7189. doi: 10.1093/jaarel/lfw090. url: https://doi.org/10.1093/jaarel/lfw090.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Z. K.</given-names>
            <surname>Stine</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          .
          <article-title>“A Quantitative Portrait of Legislative Change in Ukraine”</article-title>
          . In: Social, Cultural, and Behavioral Modeling. Ed. by
          <string-name>
            <given-names>R.</given-names>
            <surname>Thomson</surname>
          </string-name>
          et al. Cham: Springer International Publishing,
          <year>2019</year>
          , pp.
          <fpage>50</fpage>
          -
          <lpage>59</lpage>
. isbn: 978-3-030-21741-9.
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>Z. K.</given-names>
            <surname>Stine</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          . “
          <article-title>Comparative Discourse Analysis Using Topic Models: Contrasting Perspectives on China from Reddit”</article-title>
          .
          <source>In: International Conference on Social Media and Society</source>
. SMSociety'20
          . Toronto, ON, Canada: Association for Computing Machinery,
          <year>2020</year>
          , pp.
          <fpage>73</fpage>
          -
          <lpage>84</lpage>
. isbn: 9781450376884. doi: 10.1145/3400806.3400816.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>C.</given-names>
            <surname>Tan</surname>
          </string-name>
          et al. “
          <article-title>Winning Arguments: Interaction Dynamics and Persuasion Strategies in Good-Faith Online Discussions”</article-title>
          .
          <source>In: Proceedings of the 25th International Conference on World Wide Web. WWW '16</source>
          .
Montréal
          , Québec, Canada: International World Wide Web Conferences Steering Committee,
          <year>2016</year>
          , pp.
          <fpage>613</fpage>
          -
          <lpage>624</lpage>
. isbn: 9781450341431. doi: 10.1145/2872427.2883081.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>