Distinguishing Discourses: a Data-Driven Analysis of
Works and Publishing Networks of the Scottish
Enlightenment
Iiro Tiihonen1 , Yann Ryan1 , Lidia Pivovarova1 , Aatu Liimatta1 , Tanja Säily1 and
Mikko Tolonen1
1 Faculty of Arts, University of Helsinki, Finland


               Abstract
               A key feature of the Enlightenment is the development of a discourse on commerce and economy
               entangled in larger discussions around politics, morality, and progress. Importantly, this debate was not
               formalised in the way it may be seen today. Instead, it was an emerging subject incorporating diverse
               and contested ideas. The objective of this case study, then, is to use various methods to identify the
               boundaries of these emerging economic, political, and moral disquisitions in a data-driven way, using
               a unified version of the metadata (e.g. publisher, publication year, format of the print product) of the
               English Short Title Catalogue (ESTC) and full texts of the Eighteenth Century Collections Online (ECCO).
               We approach the task iteratively, first making a separation between broadly defined economic documents
               and other eighteenth-century documents by modeling the features which separate samples from two
               collections of historical economic texts from the wider ECCO data. Then, based on the previous step, we
               distinguish works similar to Hume’s Political Discourses (a text at the heart of the Scottish Enlightenment)
               from other branches of commercial and economic discourse, and analyse this set of works in more detail.
               We also experiment on how a purely unsupervised approach – a contextualized topic model using BERT
               encodings – groups our set of economic texts.
                   Previous historical scholarship has taken the perspective that we ought to identify language use from
               large corpora of text. The aim has been to contextually understand language from the perspective of a
               particular group of historical actors or, as is the case with conceptual historians, detect contested and
               changing concepts. Our approach is different in that we closely link language use to the material and
               historical circumstances of the individual texts within which these uses can be found. Essentially, we
               combine computation and social network analysis with the study of changing concepts and word uses
               over time, taking individual editions rather than language abstracted from them as the basic object of
               study. This approach allows us to identify additional contextually relevant works by both their linguistic
               features, and the material and network history of their production.
                   Jointly, the combination of iterative data-driven discourse detection and the focus on manifested
               editions allows us not only to extract a significant proportion of the debates forming the Scottish
               Enlightenment in a data-driven manner, but to link them to the social networks and commercial context
               in which they were produced and in which they evolved. Thus, this approach allows us to evaluate the
               existence, scope, and contexts of historical discourses (in this case, economic discourse) in the eighteenth
               century in a way which is both computationally state-of-the-art and relevant to historical practices and
               interests. It also demonstrates how data-driven analysis and the traditional hermeneutic approach can
               be combined to study meanings and their changes over time.

               Keywords
               computational history, eighteenth-century studies, economic discourse, social network analysis




1. Introduction
The emergence and development of commercial and economic discourse in the eighteenth
century is one of the major strands of study for the intellectual history of the Enlightenment
era [1, 2, 3]. This article links to that tradition, aiming to detect and analyse the debates of
the Scottish Enlightenment that lie at the intersection of entangled political, economic and
moral considerations. Compared to most of the existing scholarship, our approach differs in
two crucial and interlinked ways. First, it is data-driven, utilising the most comprehensive data sets
of full texts and metadata of early modern British print products in a quantitative manner.
Previous historical scholarship has focused either on language and its historical context [4] or
on the contest over and change of concepts. Second, we connect the
analysis of language to the way it is manifested, namely the production process of the physical
editions from which the text comes. Thus, the novelty of our contribution is in the contextually
sensitive approach we take to computational intellectual history rather than in the application of
the computational methods themselves. We also aim to use methods that are interpretable at the level of
editions, for example by analysing their term counts and publishers. We especially focus on the
participants of the publishing process of different editions and their collaboration networks –
the context in which the Enlightenment discourse was physically produced.
   In this article we describe the first test case for this approach. First, we produce a computa-
tionally derived set of editions similar to Hume’s Political Discourses. In a second step, this set
is linked to its producers, allowing us to analyse both the discourse itself and the co-operation
networks of publishers in which it was produced. In a third step, we find meaningful groups
of economic texts with a state-of-the-art topic model, and these subsets also show interest-
ing variation in their presence in different communities of the book trade. The qualitative
interpretation of this set of works and the related publisher networks confirms that – despite
the need for further improvement – computational detection of Enlightenment discourses and the
publishing networks that physically produced them is a realistic goal, something that can be
achieved with the methods at hand. Our computationally derived data agrees with the expert
evaluation, in that the works most similar to Political Discourses represent a coherent set of
Scottish Enlightenment texts. Furthermore, the clusters of publisher networks which produced
them match the communities and co-operation patterns found with traditional approaches.
   The term discourse is in itself ambiguous,1 and instead of fixing its meaning or resolution for
the entire article, we start from a very broad category of economic texts and iterate onward to
see at which resolution and level of nuance we are currently able to detect and cluster texts
that are similar, and whether this similarity is meaningful from the point of view of historians.

The 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), Uppsala, Sweden, March 15-18,
2022.
iiro.tiihonen@helsinki.fi (I. Tiihonen); yann.ryan@helsinki.fi (Y. Ryan); lidia.pivovarova@helsinki.fi
(L. Pivovarova); aatu.liimatta@helsinki.fi (A. Liimatta); tanja.saily@helsinki.fi (T. Säily); mikko.tolonen@helsinki.fi
(M. Tolonen)
ORCID: 0000-0003-0703-4556 (I. Tiihonen); 0000-0003-1878-4838 (Y. Ryan); 0000-0002-0026-9902 (L. Pivovarova);
0000-0001-9056-1087 (A. Liimatta); 0000-0003-4407-8929 (T. Säily); 0000-0003-2892-8911 (M. Tolonen)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.




               1
                    Our use of it should not be confused with e.g. historical discourse analysis as it is defined in [5].




The article considers the extent to which definitions of discourse are currently functional for
computational intellectual history, and what the level of nuance could be in the future. The aim
is that this practical work will also in time make it possible for us to develop a more coherent
theoretical position that can be directly compared e.g. to Quentin Skinner’s contextualism [6]
or other relevant frameworks of traditional intellectual history. The initial step of model fitting and
evaluation serves to see to what extent commercial and economic texts can be
differentiated from all other types of textual content as a discrete category in a data-driven way,
based on existing collections of economic texts partially linked to the ESTC. The motivation for
this is twofold: computational differentiation of an economic corpus from the rest allows us to
expand the set of economic texts. It also produces a more interpretable set of terms related to
economic matters, and by exploring the variation of this terminology within our set of economic
texts, we can try to cluster this data further, which would allow us to speak of discourses at
a more precise level. In this article, we attempt to identify works similar to Hume’s Political
Discourses from the wider set of economic texts based on similarity of term frequencies, and
reflect on how this set of works relates to the traditional understanding of which works form
part of the Scottish Enlightenment. The reason for picking Hume’s Political Discourses as our
test case was that it is a work that is specifically about commerce and economy in its widest
sense instead of being a technical work focused on trade only.


2. Data
Our data set is derived from the English Short Title Catalogue (ESTC) and the Eighteenth
Century Collections Online (ECCO) [7]. The ESTC is the largest catalogue of early modern
print products of the Anglosphere, and its machine-readable form – enriched, harmonised and
refined [8] by the Helsinki Computational History Group – is our main source of information
for metadata. This metadata includes information about the place and time of publication,
actors related to the print products, physical characteristics and – crucially – references to other
collections of which the ESTC record or its copy is a part or in which it is referenced. ECCO contains OCR
full texts for roughly 200,000 documents, and all of these can be linked to corresponding ESTC
records.2 Our group has developed an API called Octavo for more convenient information
retrieval from ECCO, allowing e.g. the retrieval of a filtered set of terms for each document
instead of the full texts that are often heavily affected by OCR noise.
   By combining the text data from ECCO with metadata from the ESTC, we created a starting
point data set for the analysis of economic discourse. We can form a subset – based on
information in the ESTC fields 510a and 533f – of about 11,000 ESTC records belonging to the
Goldsmiths’-Kress Library of Economic Literature (GKL), a microfilm collection that covers
documents related to economic history in the widest sense, from the primary works of classical
economic theory to political pamphleteering and agriculture [9]. Additionally, we know of 4,000
ESTC records that are cited in the Contemporary Printed Sources for British and Irish Economic
History 1701–1750 (CSEH) catalogue, which also covers a broad range of topics related to
early modern economy. The two collections cover the same eight topics in their classification

    2
     However, it is important to note that this mapping is not one to one, as an ESTC record can e.g. cover multiple
volumes that are separate documents in ECCO.




schemes, although GKL adds five additional categories. We then linked these ESTC records to
ECCO, which resulted in roughly 8,000 documents from the collections with full text available
(this sample is henceforth referred to as GKL-CSEH). Having obtained our economic texts, we took
a sample of 16,000 ECCO documents that did not link to the GKL-CSEH. In the following steps,
our default assumption for these documents was that they were not about economic discourse.
The motivation for the random sample was that it could help us detect those linguistic features
that differentiate economic discourses from other eighteenth century texts.
   Instead of using ECCO texts in their entirety, we aimed for information that would summarize
the general trends of term use in the data while not being inflated by OCR noise and the most
common words. To balance representativeness and the need to filter, we queried the Octavo
API for a vector (i.e. a list) of the 5,000 most frequent tokens plus their term frequencies (the
number of times they were used in a given document) that were globally (i.e. in the entire ECCO
data) found in at least 100 and at most 50,000 documents. This data set of 58 million rows was
further filtered with a number of steps to make it easier (both in terms of computation and
interpretation) to analyse: tokens with non-standard characters were removed, as were tokens
that occurred in fewer than 200 of the GKL-CSEH documents. Sample documents with fewer than
250 terms were also removed. The relative presence of a token in a document was normalised
by the sum of tokens in the top 5,000 vector of the document (in most cases this is the same as
normalising with filtered token count) and this relative frequency of each token per document
was scaled with its own standard deviation approximated from the training and test data. The
end result of this process was a matrix of roughly 22,000 observations (documents) and 13,000
features (scaled relative frequencies of tokens).
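   The filtering and scaling pipeline can be summarised in a short sketch (pandas; the file and column names are hypothetical, and the Octavo query itself is omitted):

```python
import pandas as pd

# Long-format result of the Octavo query: one row per (document, token),
# restricted to each document's 5,000 most frequent tokens that occur in
# 100-50,000 documents globally. File and column names are hypothetical.
df = pd.read_csv("octavo_top5000.csv")          # columns: doc_id, token, tf
gkl = set(pd.read_csv("gkl_cseh_ids.csv")["doc_id"])

# Drop tokens with non-standard characters.
df = df[df["token"].str.fullmatch(r"[a-z]+")]

# Drop tokens occurring in fewer than 200 GKL-CSEH documents.
doc_freq = df[df["doc_id"].isin(gkl)].groupby("token")["doc_id"].nunique()
df = df[df["token"].isin(doc_freq[doc_freq >= 200].index)]

# Drop documents with fewer than 250 terms (total filtered tokens here).
sizes = df.groupby("doc_id")["tf"].sum()
df = df[df["doc_id"].isin(sizes[sizes >= 250].index)]

# Document-term matrix; normalise rows by their token sums, then scale
# each token by its own standard deviation over the documents.
dtm = df.pivot_table(index="doc_id", columns="token", values="tf",
                     aggfunc="sum", fill_value=0)
rel = dtm.div(dtm.sum(axis=1), axis=0)
scaled = rel / rel.std(axis=0)
```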


3. Derivation of Economic Vocabulary
Our aim was to use this matrix to detect terms relevant for discourse on commerce, which we
did by using it as data for a classifier. We fitted an 𝐿1 penalised logistic regression model with
cross validation 20 times, using the features of the matrix as predictors for the class (whether it
was about an economy-related topic or not) of the document. The random sample was treated
as if it consisted entirely of non-economic texts. Penalised regression is a relatively standard
and well-performing approach to text classification with large numbers of features [10]. The
specific choices were motivated by the fact that an 𝐿1 penalised model tends to drop features as
predictors entirely and with cross validation the parametrisation of the model would be more
robust. However, single runs of the cross validated models seemed to have enough variance that
further robustness was needed, and this was obtained by running the model 20 times and only
keeping those tokens that were non-zero predictors in each run. For these always non-zero
parameters, the final parametrisation was the average over these 20 runs. Qualitative evaluation
of the 1,060 terms with a non-zero effect size served as a sanity check for our approach, as most
of the very impactful words with a positive sign were clearly related to economic discourse.
The very top of the predictors (in terms of effect size) is presented in Table 1.
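   A minimal sketch of the repeated fitting and stable-feature selection, assuming scikit-learn (the fold count and solver are our assumptions; X is the scaled matrix from Section 2 and y the binary GKL-CSEH label):

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

coefs = []
for run in range(20):
    # The L1 penalty drives many coefficients exactly to zero; the penalty
    # strength is chosen by cross validation within each run.
    clf = LogisticRegressionCV(penalty="l1", solver="saga", Cs=10, cv=5,
                               max_iter=5000, random_state=run)
    clf.fit(X, y)  # X: scaled token matrix, y: 1 for GKL-CSEH documents
    coefs.append(clf.coef_.ravel())

coefs = np.vstack(coefs)
# Keep only tokens with a non-zero coefficient in every run; their final
# effect size is the average over the 20 runs.
stable = (coefs != 0).all(axis=0)
effect_size = coefs[:, stable].mean(axis=0)
```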
   The model was validated with test data of 3,692 observations that were left out of the model
fitting steps. In evaluating the model’s success in finding “new” instances of economic writings,
we did not accept documents that would only have qualified for the GKL-CSEH categories




Table 1
Top 20 predictors of the model by effect size. The word goldsmiths is sometimes stamped at the beginning
of the ECCO copy of a GKL document, which most likely explains the significant effect size of the term.
The table also highlights the high amount of OCR errors and noise in the data, which motivated the use of
the 𝐿1 penalty.
                             term       effect size         term      effect size
                         goldsmiths     0.25              interef     0.081
                             ooo        0.14               ships      0.081
                            prices      0.14              wrhich      0.077
                         merchants      0.14                ico       0.071
                           trading      0.13              traders     0.07
                            stock       0.12             draining     0.067
                          woollen       0.087             enrich      0.067
                            molk        0.087             pound       0.066
                             pays       0.085           contributed   0.063
                         individual     0.083            necelfary    0.062


Table 2
Confusion matrix of the test data. Columns indicate the real and rows the predicted category.
                                                Positive     Negative
                                    Positive          850      175
                                    Negative          354      2313


of politics or miscellaneous (e.g. theological works were by default not accepted even though
miscellaneous includes some theology). The model performed quite well on this test data, as
Table 2 demonstrates. Prior to hand-checking of the data, the model had a 71 percent true positive
and a 93 percent true negative rate. As a further step of model validation, we looked at
the documents labeled as false positives or true negatives, as it was relevant for further steps
to understand whether the model had misclassified documents or detected new instances of
economic discourse, and to which extent it detects (and GKL-CSEH covers) ECCO’s economic
documents. All of the false positives were checked, and of these 167 (95 percent) contained
at least some content that could justify them belonging to a collection such as GKL-CSEH.
A random sample of 100 of the true negatives was checked, and out of these 29 contained at
least some content that could justify them belonging to a collection such as GKL-CSEH. The
model does not detect all economic documents, but it makes very few false positive mistakes.
We concluded that it works well enough to provide material and structure for the historical
and linguistic research it serves, but as it fails to detect much of what could be put under its
topics, the suitability of its elements (predictions and the obtained list of term features related
to economic topics) for analysing any specific discourse must be evaluated separately.
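   For reference, the rates reported above follow directly from Table 2 (columns give the real class, rows the predicted one):

```python
tp, fp = 850, 175    # predicted positive: real positive / real negative
fn, tn = 354, 2313   # predicted negative: real positive / real negative

tpr = tp / (tp + fn)  # 850 / 1204  ≈ 0.71 -> 71 percent true positive rate
tnr = tn / (tn + fp)  # 2313 / 2488 ≈ 0.93 -> 93 percent true negative rate
```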
   Next we applied the model for two purposes: the extrapolation of our data set of discourse on
commerce was the first step, and it resulted in 15,488 new documents classified as being about
economic matters. We further extended the data set by including other editions of extrapolated
and original GKL-CSEH documents in our data set, which resulted in 33,153 documents, 32,895




of which had corresponding term frequencies to be used in subsequent analyses.3 The second step
was to see whether we could cluster this extended set of economic documents further in a way
that allowed us to track specific discourses, which we will discuss in the following section.


4. The Discourse of Political Discourses
Hume’s Political Discourses is a work that sits at the centre of the Scottish Enlightenment and
its debates on commerce [11]. It touches on all the main topics of civil society, greatly
paving the way for Adam Smith and Adam Ferguson. Hume’s Political Discourses is a death
blow to mercantilism. It changes the nature of the luxury debate. It penetrates deep into
continental discourses with its treatment of rich country, poor country (it was also translated
twice into French within a short period of time after its initial publication). And, perhaps
most importantly, it engages with the ancient and modern debate and, in tandem with Robert
Wallace, with the discussion of population growth, in a manner that no eighteenth-century
political thinker could avoid. Without exaggeration we may note that the relevance of
Political Discourses as an individual work for the rise of commercial society is not surpassed
even by Montesquieu in the eighteenth century. We may also note that Hume’s contemporary
reputation and canonization in the eighteenth century relies on Political Discourses and not
his philosophical works or other essays. These qualities make it an ideal starting point for
clustering works of commercial Enlightenment discourse, the next step in our process.
    The comparison of the economic documents to Political Discourses was done by comparing
the distribution of the economic terms (the selected features from the model-fitting step with a
positive effect) in it to the corresponding distributions of all the other economic documents,
measured using the Jensen-Shannon divergence.4 The motivation bears some
resemblance to the theoretical assumption in topic modeling that texts belonging to the same
topic come from the same distribution of term frequencies, but instead of an unsupervised
approach, we cluster based on a pre-selected work and consider only the frequencies of a heavily
filtered set of terms. The approach was evaluated qualitatively by looking at how the similarity
to Hume’s Political Discourses from a historian’s point of view varied as a function of the
similarity score. This process was also used to determine the threshold scores for Hume-likeness
used to form a subset of the data for further linguistic and network analysis of works that were
deemed similar.
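   A sketch of the comparison step, assuming SciPy; term_freqs (a mapping from document to token counts), econ_terms (the positively weighted tokens from Section 3) and the smoothing constant are hypothetical:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def economic_profile(tf, econ_terms, alpha=1e-6):
    """Smoothed probability distribution over the economic vocabulary."""
    counts = np.array([tf.get(term, 0) for term in econ_terms], dtype=float)
    counts += alpha  # small smoothing keeps the divergence well defined
    return counts / counts.sum()

hume = economic_profile(term_freqs["hume_political_discourses"], econ_terms)
# SciPy returns the Jensen-Shannon *distance* (the square root of the
# divergence), hence the squaring; the ranking of documents is the same.
divergence = {doc: jensenshannon(economic_profile(tf, econ_terms), hume) ** 2
              for doc, tf in term_freqs.items()}
closest = sorted(divergence, key=divergence.get)  # cf. Table 3
```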
    Our method for identifying textual similarity is based on word frequencies, hence we also
tested the reuses of Hume’s Political Discourses (i.e. textual overlaps between all the works
and the work itself). Text reuses can be full or partial reprints of Hume’s essays or
direct quotes. The assumption was that if there are a lot of direct quotes or reprints of individual
essays by Hume included in the works examined, then this would obviously be at least a partial
explanation for a high similarity score between them and Hume’s work, and hence an excellent sanity
check for our approach. This indeed turned out to be the case, but importantly it is not the
full explanation for similarity. Hume’s essay collections that include Political Discourses are

     3
42 document term vectors were lost to problems in querying Octavo; the remaining losses did not have full
text in ECCO.
     4
       A small amount of smoothing was applied to the distributions so that the metric could be computed.




Table 3
Works (with known authors) with the least Jensen-Shannon divergence from Hume’s Political Discourses.
The year given is the year of publication of the first edition, not necessarily the edition with the lowest
divergence.
     Author                            Short Title                                   First published
     Priestley, Joseph                 Lectures on history, and general policy                  1788
     Kames, Henry Home, Lord           Sketches of the history of Man                           1774
     Steuart, James, Sir               An inquiry into the principles of political              1767
                                       oeconomy
     Raynal, Abbé                      A philosophical and political history of                 1775
                                       the settlements and trade of the
                                       Europeans in the East and West Indies
     Mortimer, Thomas                  The elements of commerce                                 1772
     Wallace, Robert                   Characteristics of the present political                 1758
                                       state of Great Britain
     Michell, Charles                  Principles of legislation                                1796
     Tucker, Josiah                    A brief essay on the advantages &                        1749
                                       disadvantages which respectively attend
                                       France and Great-Britain
     Williams, John                    The rise, progress, and present state of                 1777
                                       the northern governments
     Adams, John                       Curious thoughts on the history of man                   1789


obviously the ones that score highest by both of our methods (left out of Table 3
as trivial). Other works in Table 3 with large segments of text reuse include
Priestley’s Lectures and Thomas Mortimer’s Elements of Commerce. It should also be noted
that Robert Wallace’s Characteristics is written with Hume as its interlocutor and it includes
plenty of quotes from Political Discourses. At the same time, the list of works closest to Political
Discourses based on our method contains several titles that include only a few (or no) direct
quotes from Hume. Overall, if we study the top-list of highest similarity score to Hume with a
qualitative eye (Table 3), we may note that it is a very interesting list of works that in different
ways can be said to be similar to Political Discourses.


5. Publishing Networks of the Scottish Enlightenment
Another way to study these texts is to look at the networks of actors – printers, booksellers,
publishers and so forth – which produced them. Early modern texts were complex commodities
and their networks should be taken into account alongside their content. Previous work by
the project has shown the validity and potential of historical network analysis using ESTC
data [12]. To understand more about the networks which produced the works identified as
relevant by the above methods, we constructed a bipartite graph from the co-occurrences
of book trade actors in the imprints of all works deemed sufficiently close to Hume’s
Political Discourses (a Jensen-Shannon divergence below 0.42). This section defines and




describes this graph and its subsequent analysis.
   The term network is often used informally by book historians – books in the seventeenth
and eighteenth centuries were ‘networked technologies’ [13], in that they were produced collab-
oratively and those involved naturally formed networks, had overlapping alliances and fostered
both local and international links. Studying these networks – at least qualitatively – has long
been a key mode of study of the history of the book. But the term ‘network’ can also be formally
defined using network science. In this field, a network is defined as a mathematical graph of
entities, known as nodes, and their connections, known as edges. To create a network of book
trade actors, we first drew a link between each work and the actors listed in its imprint. Next,
a co-occurrence network was created by directly connecting actors based on co-occurrences on
these imprints.
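   A minimal sketch of this construction with NetworkX; imprint_actors, which maps each work to the actors listed in its imprint, is a hypothetical name:

```python
import networkx as nx
from networkx.algorithms import bipartite

# imprint_actors: {work_id: [actor, ...]}; work ids and actor names are
# assumed to be distinct, as required for a bipartite graph.
B = nx.Graph()
for work, actors in imprint_actors.items():
    B.add_node(work, bipartite="work")
    for actor in actors:
        B.add_node(actor, bipartite="actor")
        B.add_edge(work, actor)

actor_nodes = [n for n, d in B.nodes(data=True) if d["bipartite"] == "actor"]
# Project onto actors: edge weight = number of imprints two actors share.
G = bipartite.weighted_projected_graph(B, actor_nodes)
```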
   This approach has been utilised in computational book history elsewhere, for example
to analyse the works of John Milton from a collaborative, materialist perspective, to situate
authors’ works within a larger marketplace of print, and to look for shifts in the practice of book
dedications [14, 15, 16]. This approach typically does not attempt to re-create the complete
social network of publishers and their relationships: publishers are sometimes listed on the
same imprint without having collaborated in any meaningful way, and many publishers will
have known each other but not collaborated, meaning their connections will not show up.
Rather, it attempts to represent a specific type of network, one based solely on co-production
within texts. By combining a similar approach to constructing networks along with a threshold
based on the similarity scores described above, we have been able to take into account both the
books’ content and the material circumstances surrounding their production, and understand
the data from the dual perspectives of both metadata and full text.
   Once the connections between the actors have been extracted in this way, the resulting
mathematical graph can be analysed using a range of standard metrics, such as counting a
node’s connections or the paths in the network. Another common form of analysis is to detect
clusters – or communities – within the network. In network science, a community is usually
considered to be a group of nodes which are more densely connected to each other than they are
to nodes outside that community [17]. Detecting communities in an actor network of this kind
might help to understand the various and overlapping sub-groups involved in the production of
a group of closely-related texts. To form these communities we used one of the most widely-used
algorithms, known as Louvain [18]. This uses an iterative process to maximise a global network
metric – modularity – which measures the number of links between nodes within a given set of
community labels, and compares it to the same measurement in a random graph of the same
size and degree distribution [19].
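   Continuing from the projected graph G above, and anticipating the two-imprint threshold described in the next paragraph, the community step might look as follows (NetworkX ships a Louvain implementation; whether it matches the exact tooling used here is an assumption):

```python
# Keep only pairs of actors who co-occur on at least two imprints.
H = nx.Graph()
H.add_edges_from((u, v, d) for u, v, d in G.edges(data=True)
                 if d["weight"] >= 2)

communities = nx.community.louvain_communities(H, weight="weight", seed=0)
significant = [c for c in communities if len(c) > 10]

# Modularity compares within-community link density to the expectation in
# a random graph with the same size and degree distribution.
q = nx.community.modularity(H, communities, weight="weight")
```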
   To find particularly relevant communities, we retained all pairs of actors who occurred
together on at least two separate imprints. The resulting network of 510 nodes and 2,740 edges
is divided into 35 separate communities by the algorithm. Not all are significant: only seven
of these contain more than ten nodes – many smaller communities are formed by groups of
actors who might work on only one text and are therefore disconnected from the wider network.
Looking at the works produced by actors in each of the ten largest communities, we see
that they are often geographically or temporally distinct, and sometimes both. Charting the
editions in each by year (Figure 1) shows that communities 1 and 2, for example, are large,
interconnected, and mostly consist of actors based in London, but are clearly temporally distinct. Works




produced by community 2 are gradually superseded by works produced by communities 3 and
then 1. Community 4 is almost exclusively Dublin publishers. Works from Scotland are mostly
found either in community 5 or community 8 – the former also has significant numbers of
works published in London.




Figure 1: Force-directed diagram showing Louvain community groupings, as well as a plot of works by
authors in each community by publication year. For the full network analysis, edges (links) are drawn
between pairs of publishers, booksellers, and printers who co-occur on at least two imprints in the ESTC
data. In this network diagram, only the strongest links are visualised, for legibility.


  These communities therefore allow us to understand more about the distinct but overlapping
series of networks responsible for the production and eventual diffusion of works similar to the
Hume edition. Drawing the network as a force-directed network graph also helps to grasp its
topological structure (Figure 1). This shows, for instance, a group of American publishers and
booksellers (community 7) entirely disconnected from the wider network: unsurprisingly, those
separated by the Atlantic were unlikely to directly collaborate with each other. The network also
shows the impact and patterns of piracy: there is a mostly disconnected community of Dublin
actors, who, in producing pirated editions, naturally did not tend to work with ‘legitimate’
London publishers. However, this network does have some links, and looking at these can
point to crucial figures who straddled both established and illegitimate print enterprises – or
perhaps moved from one to another. Three figures in particular stand out: Luke White, an
‘obscure bookseller’ based in Dublin who became a wealthy Member of Parliament, winning
Government contracts [20]; John Archer, often listed as the Dublin agent for London-produced
books [21]; and the bookseller Hulton Bradley. In the network diagram these can be seen as
some of the few figures with connections both to this ‘Dublin’ community as well as to the
communities consisting primarily of large London publishers.
  Network analysis is also useful in seeing the changing patterns of production of various




editions of particular works, allowing us to trace their evolution from one set of actors to another,
and even highlight some of the key figures who facilitated this change. To demonstrate this, we
outline the evolution and diffusion of Hume’s Essays and Treatises on Several Subjects, a collected
volume of various shorter works by Hume (which also includes his Political Discourses), most
of which had been published initially by the Edinburgh bookseller Alexander Kincaid and his
London partner, Andrew Millar. The network and circumstances surrounding the publication of
this work forms the basis of a chapter on the Scottish Enlightenment by Richard Sher, and serves
as a useful test case in comparing the data-driven results with what is known from the historical
record [22]. Tracing the actors involved in the various editions of Essays and Treatises illustrates
how the text moved through various and overlapping networks of publishers. Imprints of the
first editions – in 1753 – show the involvement of Kincaid and his partner Alexander Donaldson,
both found in community 5, which consists primarily of actors whose works are printed in
Edinburgh, whereas the London edition of the same year lists Kincaid, Donaldson, as well
as Millar. The same three actors are attached to most subsequent editions until 1768, when
Millar is replaced by Thomas Cadell, to whom he had resigned his business (Millar died in
June 1768) [23]. The edition of 1777 lists William Creech, found in community 5, a community
notable for containing works almost evenly spread across Edinburgh and London. Subsequent
editions also have actors from a London-Edinburgh community: the bookseller Thomas Kay,
based in London, and the Edinburgh bookseller Charles Elliot, who opened a shop in London in
1786. The final editions of the century were worked on by actors found in a mostly London
community, community 1. Cases like this have been found by looking for patterns in works
deemed interesting from domain expertise, but a future approach will be to look for individuals
and works which move from one community to another in a more systematic manner.
   Community detection, then, is a useful tool to further our understanding of the various
overlapping groups of actors who worked in the book trade in London and Edinburgh. In this
case it has helped to see how a work such as Essays and Treatises diffused or circulated from
one set of closely-connected actors to another, and the intermediate steps it took along the way.
Alongside this should be noted the drawbacks of this approach: the sizes of the communities
are somewhat arbitrary, and the results give no indication as to whether a given node is well
embedded in a particular community, or whether it is an edge case sitting between multiple communities.
Also worth noting is that the network connections do not capture the extent or quality of any
given relationship – just that each pair co-occurred on an imprint together. Despite this, it is
clearly a useful way to understand more about the various networks involved in the
production of texts within a particular discourse.


6. Categories of Commerce
Following previous work that uses topic modelling to track discourses in historical collec-
tions [24, 25, 26], we apply topic modelling to further investigate communities and try to find
out whether they differ in terms of the content they publish. We trained a Contex-
tualized Topic Model (CTM [27]), which is a neural network that takes as input a contextualized
sentence representation (embedding) and predicts a topic distribution. The model is trained
using text reconstruction: from the topic distribution it should predict a bag-of-words text




Table 4
Example topics detected by the CTM model trained on the collection of commercial documents.
  #   top 10 words in the topic                                                   interpretation
  0   men people man army officers persons body troops enemy poor                 WAR
  1   geo fol car pa andl tha thi ta mi die                                       -
  2   man thing men things people manner number nature place persons              PEOPLE
  3   war army french general troops france peace enemy men treaty                HISTORIES
  4   miles called river north town south sea fide place built                    GEOGRAPHY
  5   great britain country large good greater parts england trade land           POLITICAL COMMERCE
  6   sea north water miles feet fide river south half called                     GEOGRAPHY
  7   goods pay paid persons person duties duty hall money ship                   SHIPPING
  8   england king kingdom france majesty crown parliament queen prince britain   CROWN
  9   king year reign time years fame parliament majesty late henry               HISTORY
 10   city town county place london country miles places towns church             LONDON / TRAVEL
 15   money pounds paid price pay pence gold cent sum pound                       MONEY
 25   cafe lands law estate plaintiff goods defendant party debt cafes            LAW
 33   plants leaves flowers plant flower trees ground fruit grow roots            NATURAL HISTORY / AGRICULTURE
 47   people nation war peace government church country power public state        NATION



representation. As a result, each topic is associated with word probabilities, thus resulting in
a final topic model similar to standard LDA output, even though it is trained using different
principles. A bag-of-words representation is needed only during training; inference is done
using a contextualized representation. Thus, the model is able to assign topic probabilities even
to sentences that consist of words not seen during training.
   To make contextualized text representations we used the ECCO BERT model, which has been
made publicly available and is described in a separate publication [28]. Since BERT takes at
most 512 input tokens, we use paragraphs as the unit of analysis. Paragraphs are run
through the BERT model and the resulting token embeddings are averaged to obtain paragraph
embeddings. We use 500,000 random paragraphs from the extrapolated data set of economic
documents presented in Section 3 and train a model with 50 topics.
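   The embedding and training steps can be sketched as follows, assuming the Hugging Face transformers library and the contextualized-topic-models package of [27] (which in recent versions accepts precomputed embeddings); the model identifier and variable names are placeholders:

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from contextualized_topic_models.models.ctm import ZeroShotTM
from contextualized_topic_models.utils.data_preparation import TopicModelDataPreparation

tokenizer = AutoTokenizer.from_pretrained("ecco-bert")  # placeholder id
bert = AutoModel.from_pretrained("ecco-bert")

def embed(paragraph):
    """Average the token embeddings of one paragraph (max 512 tokens)."""
    enc = tokenizer(paragraph, truncation=True, max_length=512,
                    return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state     # (1, seq_len, hidden)
    return hidden.mean(dim=1).squeeze(0).numpy()

# paragraphs: 500,000 raw paragraph strings; bow_texts: the same texts
# preprocessed for the bag-of-words reconstruction target.
embeddings = np.vstack([embed(p) for p in paragraphs])
prep = TopicModelDataPreparation()
dataset = prep.fit(text_for_contextual=paragraphs, text_for_bow=bow_texts,
                   custom_embeddings=embeddings)

ctm = ZeroShotTM(bow_size=len(prep.vocab), n_components=50,
                 contextual_size=embeddings.shape[1])
ctm.fit(dataset)
topic_words = ctm.get_topics(10)  # ten most probable words per topic (Table 4)
```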
   Several topics are exemplified in Table 4. Since all documents in the collection are already
relevant to economic discourse (in a broad sense), the resulting topics describe different themes
related to this major discourse, such as war (topic 0), shipping (topic 7) or law (topic 25). As
is usually the case with topic modelling, a few topics are meaningless, a problem exacerbated
by OCR errors in historical documents. We try to minimize the problem by introducing a
collection-specific list of stopwords, which include the most common non-words produced by
OCR – e.g. ‘tbe’, ‘thc’, ‘thle’, which are all misspelled variations of a standard stopword ‘the’.
Still, two of the fifty topics consist of meaningless words, e.g. topic 1 in the table. In addition,
some topics overlap, as topics 0 and 3 in the table show. In training 50 topics we follow the same
logic as [26]: it is always an option to merge topics in further analysis, whereas splitting them is
impossible.
   After inferring a topic distribution for each paragraph we average them across documents to
get a document-topic distribution. Two meaningless topics – 1 and 24 – are the most prominent
for approximately 37% of documents. Thus, we use the top 3 topics for each document for further
analysis. Then we manually check how documents group together according to their topics.
Note that the most probable words, as presented in Table 4, may not be the best representation
for human topic interpretation [29]. For our study of changing discourses the most crucial




question is whether topics group together documents in a meaningful way. Manual inspection
shows that this is the case with most of the topics. For example, topic 25 is the most prominent
in legal reports and groups together law documents, which is not immediately clear from the
word list, especially due to the fact that the most salient word in this topic – ‘case’ – is misspelled
as ‘cafe’ (confusing the long s with ‘f’ is one of the most frequent OCR errors). In Table 4 we add a
human interpretation assigned after inspection of documents where the topic was among the three
most prominent. After the manual inspection we conclude that topics group documents in a
meaningful manner and could be used to further explore network communities as detected in
the previous step (Section 5).
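   The aggregation can be sketched in a few lines (para_topics and para_doc are hypothetical names for the CTM output and the paragraph-to-document mapping):

```python
import numpy as np

# para_topics: (n_paragraphs, 50) topic distributions from the CTM;
# para_doc: the document id of each paragraph.
para_doc = np.asarray(para_doc)
doc_topic = {doc: para_topics[para_doc == doc].mean(axis=0)
             for doc in np.unique(para_doc)}

# The noise topics (1 and 24) often rank first, so each document is
# characterised by its three most prominent topics rather than the top one.
top3 = {doc: np.argsort(dist)[::-1][:3] for doc, dist in doc_topic.items()}
```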
   Looking at the most prominent topics in each work, we can see that they successfully
differentiate between different groups of texts. Topic 47, for example, is most prominent for
Hume’s Political Discourses, as well as Essays and Treatises, which reprints it. Other works with
this as their most prominent topic have in common the study of history, morality, politics, and
the economy in a broad sense, including Henry Home, Lord Kames’s Sketches of the History of
Man, and Joseph Priestley’s Lectures on history, and general policy, as well as large numbers of
works by Edmund Burke and many of Daniel Defoe’s political essays and pamphlets. Topic 15,
on the other hand, can be linked to technical works or those with a narrower focus on trade
and the economy. Exemplary texts include Adam Smith’s An inquiry into the nature and causes
of the wealth of nations, James Steuart’s An inquiry into the principles of political oeconomy, and
various tracts and texts written on specific economic theory or principles of trade and money.
In this way, the topics can be used to get a more fine-grained division between works already
identified as part of a more general corpus of economic texts.
   The topics also make sense with respect to the way they are distributed across the communities
of publisher networks extracted using network analysis techniques. There are significant
differences between communities in terms of the distribution of topics within them (Figure
2). In communities 1, 4, 5, and 10, topic 47 is the most prominent: as suggested above, those
where this topic is most prominent are texts which deal with the economy in a broad sense.
Communities 2 and 3 have a very different profile: in both, topic 15 stands out, one which has a
narrower and more technical focus. This suggests (though at this early stage we cannot fully
assert this claim) that within those working on economic texts and discourse, there are different
– though not distinct – publishing communities, and that these can be found with this approach.


7. Conclusions
The aim we set for ourselves for this article was to develop approaches for analysing early
modern discourses on commerce in a data-driven manner, and to link these discourses to the
wider context of the physically manifested texts through the publisher networks in which they
were produced. We achieved this aim, as we were able to cluster and analyse the relevant
context of Hume’s Political Discourses – both in terms of works and publisher communities
– in a data-driven manner, and the results align with traditional humanities expertise both
at a macro and micro level. Similarly, many of the topics produced by the CTM model were
meaningful from a historian’s point of view, and were not equally distributed across publishing
communities, but varied in their prominence. The main implication of this success is that the




Figure 2: Distribution of selected topics across the publisher communities. We show the relative number
of documents within a community where a given topic was among the three most prominent.


development of the process and tools applied in this test case and their application for detecting
and analysing other historical discourses have the potential to contribute significantly to the
historical understanding of the Enlightenment era. It can also be further expanded, as we utilised
only some of the metadata related to the material aspects of the texts. In the near future
we will deepen the analysis of publisher communities and text similarities by considering the
variation in the format (closely connected to the price and status of the print product) of different
editions of the same work, which allows an analysis of the intended audience of a work over
its history, or of the strategies of different clusters of publishers. The same iterative approach
that was applied within this article also applies to the larger aim of developing state-of-the-art
ways to approach complex questions such as the detection of soft-edged discourses, and we
deem the work presented in this article a successful first iteration.


Acknowledgments
We thank the Academy of Finland for funding the work on this article (Rise of commercial
society and eighteenth-century publishing, grant numbers 333716 and 333717) and Helsinki
Computational History Group members for discussions and previous work that made this
article possible. We especially thank Eetu Mäkelä for the Octavo API and Ville Vaara for the
harmonisation and enrichment of the publisher data. We also thank Richard Sher and Mark
Hill for discussions that facilitated the work.


References
 [1] J. Robertson, The Case for The Enlightenment: Scotland and Naples 1680–1760, Cam-
      bridge University Press, 2005. doi:10.1017/CBO9780511490705 .
 [2] I. Hont, Jealousy of Trade. International Competition and the Nation-state in Historical
      Perspective, Harvard University Press, 2005.




 [3] M. Tolonen, Mandeville and Hume: Anatomists of civil society, SVEC, University of
     Oxford, Voltaire Foundation, United Kingdom, 2013.
 [4] P. de Bolla, E. Jones, P. Nulty, G. Recchia, J. Regan, The idea of liberty, 1600–1800:
     A distributional concept analysis, Journal of the History of Ideas 81 (2020), 381–406.
     doi:10.1353/jhi.2020.0023 .
 [5] L. Given, The Sage encyclopedia of qualitative research methods, 2008. URL: https://
     methods.sagepub.com/reference/sage-encyc-qualitative-research-methods. doi:10.4135/
     9781412963909 .
 [6] Q. Skinner, Meaning and understanding in the history of ideas, History and Theory 8
     (1969), 3–53. URL: http://www.jstor.org/stable/2504188.
 [7] M. Tolonen, E. Mäkelä, A. Ijaz, L. Lahti, Corpus linguistics and Eighteenth Century
     Collections Online (ECCO), Research in Corpus Linguistics 9 (2021), 19–34. URL: https:
     //ricl.aelinco.es/index.php/ricl/article/view/161. doi:10.32714/ricl.09.01.03 .
 [8] L. Lahti, J. Marjanen, H. Roivainen, M. Tolonen, Bibliographic data science and the
     history of the book (c. 1500–1800), Cataloging & Classification Quarterly 57 (2019),
5–23. URL: https://doi.org/10.1080/01639374.2018.1543747. doi:10.1080/01639374.2018.1543747 .
 [9] D. Whitten, Democracy returns to the library: The Goldsmiths’-Kress library of economic
     literature, Journal of Economic Literature 16 (1978), 1004–1006. URL: http://www.jstor.org/
     stable/2723473.
[10] M. Gentzkow, B. Kelly, M. Taddy, Text as data, Journal of Economic Literature 57 (2019),
     535–74. URL: https://www.aeaweb.org/articles?id=10.1257/jel.20181020. doi:10.1257/jel.
     20181020 .
[11] M. Tolonen, The Scottish Enlightenment, in: G. Clayes, M. S. Cummings, L. T. Sargent
     (Eds.), The Encyclopedia of Modern Political Thought (Volume 2), Sage, 2013, 740–745.
[12] M. J. Hill, V. Vaara, T. Säily, L. Lahti, M. Tolonen, Reconstructing intellectual networks:
From the ESTC’s bibliographic metadata to historical material, in: C. Navarretta, M. Agir-
     rezabal, B. Maegaard (Eds.), Proceedings of the Digital Humanities in the Nordic Countries
     4th Conference, volume 2364 of CEUR Workshop Proceedings, CEUR, Copenhagen, Denmark,
     2019, 201–219. URL: http://ceur-ws.org/Vol-2364/#19_paper.
[13] B. Greteman, Making connections with Milton’s Epitaphium Damonis, in: Making Milton,
     Oxford University Press, 2021, 31–41. URL: https://oxford.universitypressscholarship.
     com/view/10.1093/oso/9780198821892.001.0001/oso-9780198821892-chapter-3.
     doi:10.1093/oso/9780198821892.003.0003 .
[14] M. Gavin, Historical text networks: The sociology of early english criticism, Eighteenth-
     Century Studies 50 (2016), 53–80. URL: https://muse.jhu.edu/article/634558. doi:10.1353/
     ecs.2016.0041 .
[15] B. Greteman, Milton and the early modern social network: The case of the Epitaphium
      Damonis, Milton Quarterly 49 (2015), 79–95. URL: https://www.jstor.org/stable/26603192.
[16] J. R. Ladd, Imaginative networks: Tracing connections among early modern book
     dedications, Journal of Cultural Analytics (2021). URL: https://culturalanalytics.org/article/
21993-imaginative-networks-tracing-connections-among-early-modern-book-dedications.
     doi:10.22148/001c.21993 .




[17] M. Girvan, M. E. J. Newman, Community structure in social and biological networks,
     Proceedings of the National Academy of Sciences 99 (2002), 7821–7826. URL: http://www.
     pnas.org/cgi/doi/10.1073/pnas.122653799. doi:10.1073/pnas.122653799 .
[18] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities
     in large networks, Journal of Statistical Mechanics: Theory and Experiment 2008 (2008),
     P10008. URL: https://iopscience.iop.org/article/10.1088/1742-5468/2008/10/P10008. doi:10.
     1088/1742- 5468/2008/10/P10008 .
[19] M. E. J. Newman, Modularity and community structure in networks, Proceedings of the
     National Academy of Sciences 103 (2006), 8577–8582. URL: https://www.pnas.org/content/
     103/23/8577. doi:10.1073/pnas.0601602103 .
[20] WHITE, Luke (c.1750-1824), of Woodlands, (formerly Luttrellstown), co. Dublin and
     Porters, Shenley, Herts. | History of Parliament Online, 2022. URL: http://www.
     historyofparliamentonline.org/volume/1820-1832/member/white-luke-1750-1824.
[21] M. Kennedy, The domestic and international trade of an eighteenth-century Dublin
     bookseller: John Archer (1782-1810), Dublin Historical Record 49 (1996), 94–105. URL:
     https://www.jstor.org/stable/30101144.
[22] R. B. Sher, The Enlightenment & the book: Scottish authors & their publishers in eighteenth-
     century Britain, Ireland, & America, University of Chicago Press, 2006.
[23] H. Amory, Millar, Andrew (1705–1768), bookseller, in: Oxford Dictionary of National Biog-
     raphy, volume 1, Oxford University Press, 2004. URL: http://www.oxforddnb.com/view/10.
     1093/ref:odnb/9780198614128.001.0001/odnb-9780198614128-e-18714. doi:10.1093/ref:
     odnb/18714 .
[24] L. Viola, J. Verheul, Mining ethnicity: Discourse-driven topic modelling of immigrant
     discourses in the USA, 1898–1920, Digital Scholarship in the Humanities (2019).
[25] E. Bunout, Grasping the anti-modern discourse on Europe in the digitised press or can
     text mining help identify an ambiguous discourse? (2020).
[26] J. Marjanen, E. Zosa, S. Hengchen, L. Pivovarova, M. Tolonen, Topic modelling discourse
     dynamics in historical newspapers, in: Digital Humanities in the Nordic Countries
     Conference, Schloss Dagstuhl Leibniz Center for Informatics, 2021, 63–77.
[27] F. Bianchi, S. Terragni, D. Hovy, Pre-training is a hot topic: Contextualized document
     embeddings improve topic coherence, in: Proceedings of the 59th Annual Meeting of the
     Association for Computational Linguistics and the 11th International Joint Conference on
     Natural Language Processing (Volume 2: Short Papers), 2021, 759–766.
[28] I. Rastas, Y. Ciarán Ryan, I. Tiihonen, M. Qaraei, L. Repo, R. Babbar, E. Mäkelä, M. Tolonen,
     F. Ginter, Explainable publication year prediction of eighteenth century texts with the
     BERT model, in: Proceedings of the 3rd Workshop on Computational Approaches to
     Historical Language Change, Association for Computational Linguistics, 2022, 68–77.
     URL: https://aclanthology.org/2022.lchange-1.7.
[29] J. H. Lau, D. Newman, T. Baldwin, Machine reading tea leaves: Automatically evaluating
topic coherence and topic model quality, in: Proceedings of the 14th Conference of the
     European Chapter of the Association for Computational Linguistics, 2014, 530–539.



