<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">How BERT Speaks Shakespearean English? Evaluating Historical Bias in Masked Language Models</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Miriam</forename><surname>Cuscito</surname></persName>
							<email>miriam.cuscito@unicas.it</email>
							<affiliation key="aff0">
								<orgName type="department">Dipartimento di Lettere e Filosofia</orgName>
								<orgName type="institution">Università degli Studi di Cassino e del Lazio Meridionale</orgName>
								<address>
									<addrLine>Via Zamosch 43</addrLine>
									<postCode>03043</postCode>
									<settlement>Cassino</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Alfio</forename><surname>Ferrara</surname></persName>
							<email>alfio.ferrara@unimi.it</email>
							<affiliation key="aff1">
								<orgName type="department">Dipartimento di Informatica &quot;Giovanni Degli Antoni&quot;</orgName>
								<orgName type="institution">Università degli Studi di Milano</orgName>
								<address>
									<addrLine>Via Celoria 18</addrLine>
									<postCode>20133</postCode>
									<settlement>Milano</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Martin</forename><surname>Ruskov</surname></persName>
							<email>martin.ruskov@unimi.it</email>
							<affiliation key="aff2">
								<orgName type="department" key="dep1">Dipartimento di Lingue</orgName>
								<orgName type="department" key="dep2">Letterature</orgName>
								<orgName type="department" key="dep3">Culture e Mediazioni</orgName>
								<orgName type="institution">Università degli Studi di Milano</orgName>
								<address>
									<addrLine>Piazza Sant&apos;Alessandro 1</addrLine>
									<postCode>20123</postCode>
									<settlement>Milano</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">How BERT Speaks Shakespearean English? Evaluating Historical Bias in Masked Language Models</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">E75CDCD1175C9E5AA3517D8344003271</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:11+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Masked Language Models</term>
					<term>Early Modern English</term>
					<term>Historical Bias</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper, we explore the idea of analysing the historical bias of masked language models based on BERT by measuring their adequacy with respect to Early Modern (EME) and Modern (ME) English. In our preliminary experiments, we perform fill-in-the-blank tests with 60 masked sentences (20 EME-specific, 20 ME-specific and 20 generic) and three different models (i.e., BERT Base, MacBERTh, BL Books). We then rate the model predictions according to a 5-point bipolar scale between the two language varieties and derive a weighted score to measure the adequacy of each model to EME and ME varieties of English.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Masked language models (MLMs) are deep neural language models which create contextualised word representations, in the sense that the representation for each word depends on the entire context in which it is used. That is to say, word representations are a function of the entire input sentence. Such models are designed to have high predictive capabilities and are usually pre-trained on large textual corpora. This makes them closely tied to the domains on which they were trained and dependent on the infrastructure upon which they are based. The presence of various biases in MLMs has been extensively studied, typically with the aim of proposing effective mitigation strategies <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3,</ref><ref type="bibr" target="#b3">4]</ref>. However, there are instances where the bias in certain MLMs is not necessarily negative. This is particularly true when the bias manifested in the language reflects its socio-temporal context. This bias could be advantageous for tasks that demand such socio-temporal staging <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b5">6]</ref>.</p><p>In this paper, we explore the idea of analysing this bias by focusing on the major syntactic, semantic, and grammatical differences between two varieties of the English language: Early Modern (EME) and Modern (ME). More precisely, we propose a method and a measure of adequacy to test the adherence of MLMs to the natural language variety of interest. In particular, we assess the level of diachronic bias of three MLMs: Bert-Base-Uncased <ref type="bibr" target="#b6">[7]</ref> (referred to as BERT Base here) 1 ; MacBERTh <ref type="bibr" target="#b7">[8]</ref> 2 ; and Bert British Library Books English <ref type="bibr" target="#b8">[9]</ref> (BL Books) 3 . 
In our preliminary experiments, we perform tests with 60 masked sentences in which the models are tasked with predicting the masked word.</p><p>We then rate the proposed responses according to a 5-point bipolar scale between the two language varieties and derive a weighted score from the response probabilities and their respective scores on the scale.</p><p>These results, although preliminary, might suggest a method applicable in the digital humanities when MLMs are employed for the analysis of historical corpora.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>If it is true that language shapes culture while being shaped by it <ref type="bibr" target="#b9">[10]</ref>, language models in general -and MLMs in particular -constitute a still partially covered mirror of this dual relationship. Not only can an MLM be tested on its level of representativeness of the language to determine its reliability, it can also tell us about linguistic, social, and historical phenomena that concern the culture tied to that specific language. In other words, an MLM could be a valuable tool for expanding the broader social knowledge of a given culture, rightfully becoming part of the basic toolkit of Cultural Analytics discussed by <ref type="bibr" target="#b10">Manovich [2020]</ref>. According to <ref type="bibr" target="#b11">Bruner's [1984]</ref> pragmatic-cultural perspective, learning a language also means learning the cultural patterns associated with it. Similarly, analysing the language in its various realisations means having the opportunity to visualise the underlying cultural patterns.</p><p>MLMs can also be highly beneficial for philological <ref type="bibr" target="#b12">[13]</ref>, pragmatic <ref type="bibr" target="#b13">[14]</ref>, critical <ref type="bibr" target="#b5">[6]</ref>, and literary work <ref type="bibr" target="#b14">[15]</ref>. However, the effectiveness of these models depends on their ability to adapt to language specificity in its historical dimension. This is typically achieved by training models on historical text corpora. However, the difficulty of accessing large historical documentary collections means that the available models are still few, and it is necessary to verify whether they adapt effectively to the historical linguistic context.</p><p>BERT is a foundational masked language model (MLM) which is to date the most widely adopted <ref type="bibr" target="#b15">[16]</ref>. 
A number of studies have explored different forms of bias in BERT <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2,</ref><ref type="bibr" target="#b2">3]</ref>. Three BERT-based MLMs are of particular interest for our study: (i) Bert-Base-Uncased <ref type="bibr" target="#b6">[7]</ref>, created from a corpus of texts from Wikipedia and BookCorpus and a model of contemporary language, which we use as a control condition in our experiment; (ii) MacBERTh <ref type="bibr" target="#b7">[8]</ref>, pre-trained on texts from 1500 to 1950; and (iii) Bert British Library Books English, pre-trained on contemporary texts and fine-tuned on historical texts from the 19th century to the present.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Method</head><p>To evaluate the adequacy of MLMs on a test set, we define a temporal valence task consisting of a collection of test sentences, each with a masked token (i.e., word). This is a typical fill-in-the-blank task, where the models are required to predict the masked token. Formally, we consider the following three sets: (i) we denote with 𝒮 the set of all test sentences, (ii) with 𝒱 we denote a set of vocabulary words, and (iii) with 𝒯 = {−1, −0.5, 0, 0.5, 1} ⊂ ℝ, we denote a 5-point bipolar temporal valence scale, where −1 represents the farthest historical period and 1 the closest to today.</p><p>With the above notation, for each masked sentence (denoted as 𝑠 ∈ 𝒮), we define a function 𝜌 : 𝒮 → 𝒯 representing the sentence temporal valence score. This function indicates the period of which the masked sentence is typical.</p><p>Then, we calculate a token-in-sentence temporal valence score 𝜎 : 𝒱 × 𝒮 → 𝒯, indicating the score of a token substituting the sentence mask.</p><p>These temporal valence scores are assigned at the researchers' discretion, according to the research hypotheses. In this study, the criterion used to determine each score was the degree of alignment of a sentence or token with a specific historical period, assessed on a philological-linguistic basis. Scholars adopting this methodological approach can choose the scores assigned to their own test sets according to their specific research needs. The proposed methodology is thus adaptable to a diverse array of fields of interest. 
This flexibility enables researchers to integrate their own metrics, tailoring the analysis without undermining the consistency of the results.</p><p>As an example of temporal valence scoring, given EME (Early Modern English) as the farthest period (i.e., −1 ∈ 𝒯 ) and ME (Modern English) as the closest (i.e., 1 ∈ 𝒯 ), if we consider the sentence 𝑠 1 = "Why wilt [MASK] be offended by that?" we have 𝜌(𝑠 1 ) = −1, as 𝑠 1 is a representative sentence for EME, and 𝜎("𝑡ℎ𝑜𝑢", 𝑠 1 ) = −1, because in this context "thou" is indicative of EME. On the other hand, 𝜎("𝑛𝑜𝑡", 𝑠 1 ) = 0, because "not" is neutral with regard to the two language varieties. Given a model 𝑚, for the masked token in each sentence (𝑠 ∈ 𝒮), we have the set {𝑤 1 , 𝑤 2 , . . . , 𝑤 𝑛 } ⊂ 𝒱 of 𝑛 words predicted by 𝑚 for 𝑠, which are associated with the vector of corresponding probabilities from this model, shown in Equation <ref type="formula" target="#formula_0">1</ref>.</p><formula xml:id="formula_0">p 𝑚 = (𝑝(𝑤 1 ), 𝑝(𝑤 2 ), . . . , 𝑝(𝑤 𝑛 )) 𝑇<label>(1)</label></formula><p>For this set, using the temporal valence score 𝜎, we define a token-in-sentence temporal valence score vector x 𝑚 for 𝑚 given the sentence 𝑠 as in Equation <ref type="formula" target="#formula_1">2</ref>.</p><formula xml:id="formula_1">x 𝑚 = (𝜎(𝑤 1 , 𝑠), 𝜎(𝑤 2 , 𝑠), . . . , 𝜎(𝑤 𝑛 , 𝑠)) 𝑇<label>(2)</label></formula><p>This allows us to define the bias of a model regarding the sentence as the dot product of the model-derived probabilities and the token valence scores, providing us with a weighted score as in Equation <ref type="formula" target="#formula_2">3</ref> and effectively yielding a single-value measurement from the two vectors above.</p><formula xml:id="formula_2">𝛽(𝑚, 𝑠) = x 𝑇 𝑚 p 𝑚<label>(3)</label></formula><p>We can then define the domain adequacy of a model with respect to a sentence 𝑠 (see Equation <ref type="formula" target="#formula_3">4</ref>), based on the difference between the sentence temporal valence score 𝜌(𝑠) and the model bias 𝛽(𝑚, 𝑠). To do this, we take the absolute difference between the model bias and the sentence temporal valence and project it onto the unit interval, so that more similar values lead to higher adequacy scores.</p><formula xml:id="formula_3">𝛿(𝑚, 𝑠) = 1 − | 𝜌(𝑠) − 𝛽(𝑚, 𝑠) | / 2<label>(4)</label></formula><p>Examples of three sentences classified in different periods are provided in Tables 1, 2 and 3, which show the corresponding values for 𝜌, 𝑝, 𝜎, 𝛽(𝑚, 𝑠) and 𝛿(𝑚, 𝑠).</p></div>
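As a minimal sketch of how the two measures combine, the following Python fragment computes 𝛽 (Equation 3) and 𝛿 (Equation 4) for the EME example sentence; the probabilities and valence scores are BERT Base's values from Table 1, while the function names are our own illustrative choices.

```python
def bias(probs, valences):
    """beta(m, s): dot product of the model's prediction probabilities
    and the token-in-sentence temporal valence scores (Equation 3)."""
    return sum(p * v for p, v in zip(probs, valences))

def domain_adequacy(rho, beta_value):
    """delta(m, s): one minus half the absolute gap between the sentence
    temporal valence rho and the model bias beta (Equation 4)."""
    return 1.0 - abs(rho - beta_value) / 2.0

# BERT Base top-5 predictions for "Why wilt [MASK] be offended by that?"
# (tokens thou, you, i, she, he; probabilities and sigma from Table 1)
p = [0.712, 0.101, 0.085, 0.055, 0.048]
x = [-1.0, 0.0, 0.0, 0.0, 0.0]

b = bias(p, x)                # approx. -0.712
d = domain_adequacy(-1.0, b)  # approx. 0.856
```

The resulting values match the 𝛽 and 𝛿 entries for BERT Base in Table 1.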
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Evaluation</head><p>We test our metrics with three BERT-based linguistic models we consider relevant for the varieties of the English language of interest: (i) Bert-Base-Uncased, (ii) MacBERTh, and (iii) BL Books. In accordance with the objectives of this study, the choice of models reflects a specific interest in language; they can therefore be replaced to best fit any other specific interest in diachronic language analysis. For the test we used 60 word-masked sentences created specifically for this study. To create the test set, we relied on different types of written language: contemporary standard, journalistic language, non-standard social media language, and Early Modern language.</p><p>The elements to be masked were selected on the basis of their belonging to word classes known to be more exposed to the diachronic variation of the English language: pronouns, verbs, adverbs, adjectives, and nouns. Of the 60 sentences<ref type="foot" target="#foot_0">4</ref>, 20 were selected to be suggestive of the EME variety of English, a further 20 to be suggestive of ME, and the final 20 are generic. Once the test set was complete, a temporal valence score was assigned to each sentence (see 𝜌 in Section 3) based on its level of chronological markedness.</p><p>The test set was administered to the three MLMs, and the suggested words with their probabilities were collected. The resulting vocabulary was scored independently of the models that provided it, by assigning the token-in-sentence temporal valence score (i.e., 𝜎) to each word, based on an estimate of the proximity of the token's meaning to a certain linguistic variety in the context in which it appeared. Notably, during this phase, we decided to work at the sentence level (contextually) rather than at the set level (globally). 
The method proved highly effective in avoiding the risk of semantic flattening, given that almost every word showed some level of semantic specificity when taken in context rather than globally. An example is the pronoun you in "fare you well, sir", which is globally neutral and yet acquires a strong diachronic value when evaluated in its context, in which it appears decidedly archaic.</p><p>Once 𝛽 and 𝛿 were calculated, we proceeded with the analysis of the data and the collection of results. The distribution of the bias score 𝛽 and the domain adequacy score 𝛿 for the sentences in the three groups (i.e., EME, Neutral, and ME) is shown in Figures <ref type="figure" target="#fig_2">1 and 2</ref>, respectively. Figure <ref type="figure" target="#fig_1">1</ref> shows that for all three test sets, MacBERTh is most aligned with EME, whereas BERT Base is always most aligned with ME. BL Books shows a tendency towards more neutral language than the other two models in marked sentences, whilst surprisingly it aligns with ME in neutral sentences. Figure <ref type="figure" target="#fig_2">2</ref> shows that MacBERTh has the best domain adequacy for EME, and BERT Base the best for ME. In the case of the neutral test set, domain adequacy is no less informative. Although the sentences do not inherently carry expectations regarding language, the models appear consistently well-suited to a neutral context, and none of them pushes towards the strong specificity of its training domain. In effect, this leaves the sentences close to their original neutrality. This preliminary study provides an illustration of the nature and functioning of the predictive behaviour of MLMs. The presence or absence of markedness in the sentences enables all three MLMs to select the type of element which best fits the co-text. 
So, while for diachronically marked sentences models without training in that domain attempted to suggest probable solutions, sometimes resulting in a form of linguistically inconsistent mimicry, in unmarked sentences the models performed exceptionally well and linguistic inaccuracies were rare.</p></div>
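To illustrate the scoring step end to end, the sketch below assumes model predictions shaped like the output of Hugging Face's fill-mask pipeline (a list of dictionaries with 'token_str' and 'score' keys); the values are hardcoded from MacBERTh's column of Table 1 so that the fragment runs without downloading a model, and the sigma annotations stand in for the manual token scoring described above.

```python
# Predictions for "Why wilt [MASK] be offended by that?" in the shape
# returned by a transformers fill-mask pipeline; the numbers are
# MacBERTh's top-5 from Table 1, hardcoded to avoid a model download.
predictions = [
    {"token_str": "thou", "score": 0.987},
    {"token_str": "not",  "score": 0.008},
    {"token_str": "you",  "score": 0.004},
    {"token_str": "ye",   "score": 0.001},
    {"token_str": "he",   "score": 0.000},
]
# Manually assigned token-in-sentence temporal valence scores (sigma)
sigma = {"thou": -1.0, "not": 0.0, "you": 0.0, "ye": -1.0, "he": 0.0}
rho = -1.0  # the sentence is marked as typical of EME

# Weighted bias (Equation 3) and domain adequacy (Equation 4)
beta = sum(pr["score"] * sigma[pr["token_str"]] for pr in predictions)
delta = 1.0 - abs(rho - beta) / 2.0
# beta approx. -0.988 and delta approx. 0.994, matching Table 1
```

With real models, the `predictions` list would be produced per sentence by each of the three pipelines, and only the 𝜎 annotation step would remain manual.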
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>Both notions of bias (𝛽) and domain adequacy (𝛿) provide important insights into the nature of the models. The first, 𝛽, indicates a tendency in terms of temporal valence. In other words, the interpretation of its value should be considered within the context of the specific dichotomy of language varieties. On the other hand, 𝛿 reflects the adequacy for an individual language variety. It successfully captures model tendencies when completing historically predetermined sentences. However, due to its inclusion of 𝜌, 𝛿 is less informative when there is no bias originating from the sentence, i.e., when completing temporally neutral sentences.</p><p>Notably, our measures demonstrate that MacBERTh is better at representing the EME historical context than BL Books. This can possibly be explained by the nature of the models in question. First, MacBERTh is a model created from scratch and trained on texts spanning a time range that takes into account the evolution of the English language from EME to ME. BL Books, on the other hand, was only fine-tuned on texts from the modern period, so it has no direct exposure to EME. It does perform better on ME than MacBERTh and worse than BERT Base. Thus, MacBERTh demonstrates strong linguistic consistency, given the wide range of language varieties it is trained on, but in tasks related to ME it yields worse results than other, more specialised models. 
Simply put, having a specific, narrower domain poses fewer problems when working within it but reveals clear gaps when moving outside that domain.</p><p>The notion that LMs can serve as a window into the history of a population is not new, but there is a growing interest in exploring the relationships between these models and their socio-linguistic and socio-cultural contexts <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b17">18,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b19">20]</ref>. It is equally imperative to establish a procedural framework to address the lack of evaluative methods for these models, as previously hinted at in this text. This is particularly useful when no direct links can be drawn between the corpus used to train the model and the social context of the test set.</p><p>Within this evaluation, we created a dedicated test set for the models under scrutiny, drawing upon approaches used for the evaluation of bias in MLMs. In creating our test sets, we built our sentences on both logical-semantic and logical-syntactic tasks. Future work could create a test set for model interrogation that is culture-oriented, delving into socio-culturally significant elements such as customs, historical events, and attitudes towards social groups, elements recognised as belonging to social knowledge. Alternatively, tests could be derived from word-in-context datasets, such as TempoWiC and HistWiC <ref type="bibr" target="#b20">[21,</ref><ref type="bibr" target="#b21">22,</ref><ref type="bibr" target="#b22">23]</ref>.</p><p>As a further alternative, the temporal valence of word tokens could be derived not simply from the sentence where they appear, but from the wider historical context, e.g. from a large corpus representative of the period <ref type="bibr" target="#b23">[24,</ref><ref type="bibr" target="#b24">25]</ref>. 
This would allow automating not only the calculation of the token temporal valence 𝜎, but also the identification of sentences that are representative of each historical period. As a consequence, the dependence on manual expert evaluation would be strongly reduced, resulting in both higher reproducibility and wider generalisability of the approach.</p><p>This study aims not only to propose a methodology for assessing language models but also to put forth hypotheses for expanding the tools available to humanities scholars interested in studying complex socio-cultural phenomena with an approach which begins by interpreting textual clues and inferring their connections to reality. As such, it is applicable not only to diachronic contexts but also across dialects or professional jargons.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Distribution of the bias 𝛽(𝑚, 𝑠) of the three models with respect to the three test sets.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Distribution of the domain adequacy 𝛿(𝑚, 𝑠) of the three models with respect to the three test sets.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Scores of the models for the historically biased sentence "Why wilt [MASK] be offended by that?" (temporal valence 𝜌 = −1)</figDesc><table><row><cell cols="2">BERT Base</cell><cell></cell><cell cols="2">MacBERTh</cell><cell></cell><cell cols="2">BL books</cell><cell></cell></row><row><cell>token</cell><cell>𝑝</cell><cell>𝜎</cell><cell>token</cell><cell>𝑝</cell><cell>𝜎</cell><cell>token</cell><cell>𝑝</cell><cell>𝜎</cell></row><row><cell>thou</cell><cell>0.712</cell><cell>-1.0</cell><cell>thou</cell><cell>0.987</cell><cell>-1.0</cell><cell>thou</cell><cell>0.573</cell><cell>-1.0</cell></row><row><cell>you</cell><cell>0.101</cell><cell>0.0</cell><cell>not</cell><cell>0.008</cell><cell>0.0</cell><cell>you</cell><cell>0.246</cell><cell>0.0</cell></row><row><cell>i</cell><cell>0.085</cell><cell>0.0</cell><cell>you</cell><cell>0.004</cell><cell>0.0</cell><cell>he</cell><cell>0.102</cell><cell>0.0</cell></row><row><cell>she</cell><cell>0.055</cell><cell>0.0</cell><cell>ye</cell><cell>0.001</cell><cell>-1.0</cell><cell>she</cell><cell>0.040</cell><cell>0.0</cell></row><row><cell>he</cell><cell>0.048</cell><cell>0.0</cell><cell>he</cell><cell>0.000</cell><cell>0.0</cell><cell>Thou</cell><cell>0.038</cell><cell>-1.0</cell></row><row><cell>𝛽</cell><cell>-0.712</cell><cell></cell><cell>𝛽</cell><cell>-0.988</cell><cell></cell><cell>𝛽</cell><cell>-0.612</cell><cell></cell></row><row><cell>𝛿</cell><cell>0.856</cell><cell></cell><cell>𝛿</cell><cell>0.994</cell><cell></cell><cell>𝛿</cell><cell>0.806</cell><cell></cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Scores for the neutral sentence "Have you come [MASK] to torment us before the time?" (𝜌 = 0)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 3</head><label>3</label><figDesc>Scores for the sentence "Should men who are known sexual [MASK] be given a platform?", biased towards modernity (𝜌 = 1)</figDesc><table><row><cell cols="2">BERT Base</cell><cell></cell><cell cols="2">MacBERTh</cell><cell></cell><cell cols="2">BL books</cell><cell></cell></row><row><cell>token</cell><cell>𝑝</cell><cell>𝜎</cell><cell>token</cell><cell>𝑝</cell><cell>𝜎</cell><cell>token</cell><cell>𝑝</cell><cell>𝜎</cell></row><row><cell>orientation</cell><cell>0.720</cell><cell>1.0</cell><cell>##ists</cell><cell>0.493</cell><cell>-0.5</cell><cell>##ly</cell><cell>0.582</cell><cell>0.0</cell></row><row><cell>misconduct</cell><cell>0.112</cell><cell>1.0</cell><cell>offenders</cell><cell>0.165</cell><cell>1.0</cell><cell>##ists</cell><cell>0.355</cell><cell>0.0</cell></row><row><cell>minorities</cell><cell>0.067</cell><cell>1.0</cell><cell>characters</cell><cell>0.130</cell><cell>0.5</cell><cell>##ally</cell><cell>0.024</cell><cell>0.0</cell></row><row><cell>partners</cell><cell>0.052</cell><cell>1.0</cell><cell>drunkards</cell><cell>0.117</cell><cell>0.0</cell><cell>men</cell><cell>0.021</cell><cell>0.0</cell></row><row><cell>harassment</cell><cell>0.048</cell><cell>1.0</cell><cell>delinquents</cell><cell>0.095</cell><cell>0.5</cell><cell>to</cell><cell>0.018</cell><cell>0.0</cell></row><row><cell>𝛽</cell><cell>1.000</cell><cell></cell><cell>𝛽</cell><cell>0.031</cell><cell></cell><cell>𝛽</cell><cell>0.000</cell><cell></cell></row><row><cell>𝛿</cell><cell>1.000</cell><cell></cell><cell>𝛿</cell><cell>0.516</cell><cell></cell><cell>𝛿</cell><cell>0.500</cell><cell></cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_0">For transparency and reproducibility purposes, the following anonymous link contains the complete test set with the corresponding values produced during evaluation: https://tinyurl.com/bert-shakespearean</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The research leading to these results has received funding from MUR, PRIN2022 project "MetaLing Corpus: Creating a corpus of English linguistics metalanguage from the 16th to the 18th century", ref.: 202233C93X, funded by the European Union under the programme NextGenerationEU.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Stereotype and skew: Quantifying gender bias in pre-trained and fine-tuned language models</title>
		<author>
			<persName><forename type="first">D</forename><surname>De Vassimon Manela</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Errington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Fisher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Van Breugel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Minervini</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.eacl-main.190</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Association for Computational Linguistics</title>
				<meeting>the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2232" to="2242" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Mitigating language-dependent ethnic bias in BERT</title>
		<author>
			<persName><forename type="first">J</forename><surname>Ahn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Oh</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.emnlp-main.42</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Online and</title>
				<meeting>the 2021 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Online and<address><addrLine>Punta Cana, Dominican Republic</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="533" to="549" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Hate speech detection and racial bias mitigation in social media based on BERT model</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mozafari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Farahbakhsh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Crespi</surname></persName>
		</author>
		<idno type="DOI">10.1371/journal.pone.0237861</idno>
	</analytic>
	<monogr>
		<title level="j">PLOS ONE</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page">e0237861</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">HONEST: Measuring Hurtful Sentence Completion in Language Models</title>
		<author>
			<persName><forename type="first">D</forename><surname>Nozza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Bianchi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hovy</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2021.naacl-main.191</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Rumshisky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Hakkani-Tur</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><surname>Beltagy</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Bethard</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Cotterell</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Chakraborty</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Zhou</surname></persName>
		</editor>
		<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2398" to="2406" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Fleeting gestures and changing styles of greeting: researching daily life in British towns in the long eighteenth century</title>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">J</forename><surname>Corfield</surname></persName>
		</author>
		<idno type="DOI">10.1017/S0963926821000274</idno>
	</analytic>
	<monogr>
		<title level="j">Urban History</title>
		<imprint>
			<biblScope unit="volume">49</biblScope>
			<biblScope unit="page" from="555" to="567" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Explicit references to social values in fairy tales: A comparison between three European cultures</title>
		<author>
			<persName><forename type="first">A</forename><surname>Morollon Diaz-Faes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Murteira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ruskov</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2023.nlp4dh-1.8" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Hämäläinen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Öhman</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Pirinen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Alnajjar</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Miyagawa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Bizzoni</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Partanen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Rueter</surname></persName>
		</editor>
		<meeting>the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages, Association for Computational Linguistics<address><addrLine>Tokyo, Japan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="62" to="75" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/N19-1423</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="4171" to="4186" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Adapting vs. Pre-training Language Models for Historical Languages</title>
		<author>
			<persName><forename type="first">E</forename><surname>Manjavacas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Fonteyn</surname></persName>
		</author>
		<idno type="DOI">10.46298/jdmdh.9152</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Data Mining &amp; Digital Humanities</title>
		<title level="s">NLP4DH</title>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Pretrained Language Models on British Library Corpus</title>
		<author>
			<persName><forename type="first">S</forename><surname>Schweter</surname></persName>
		</author>
		<idno type="DOI">10.5281/zenodo.10715629</idno>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">How Language Shapes Thought</title>
		<author>
			<persName><forename type="first">L</forename><surname>Boroditsky</surname></persName>
		</author>
		<idno type="DOI">10.1038/scientificamerican0211-62</idno>
	</analytic>
	<monogr>
		<title level="j">Scientific American</title>
		<imprint>
			<biblScope unit="volume">304</biblScope>
			<biblScope unit="page" from="62" to="65" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<title level="m" type="main">Cultural analytics</title>
		<author>
			<persName><forename type="first">L</forename><surname>Manovich</surname></persName>
		</author>
		<ptr target="https://mitpress.mit.edu/9780262037105/cultural-analytics/" />
		<imprint>
			<date type="published" when="2020">2020</date>
			<publisher>The MIT Press</publisher>
			<pubPlace>Cambridge, Massachusetts</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Pragmatics of Language and Language of Pragmatics</title>
		<author>
			<persName><forename type="first">J</forename><surname>Bruner</surname></persName>
		</author>
		<ptr target="https://www.jstor.org/stable/40970973" />
	</analytic>
	<monogr>
		<title level="j">Social Research</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="page" from="969" to="984" />
			<date type="published" when="1984">1984</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<title level="m" type="main">Among Digitized Manuscripts. Philology, Codicology, Paleography in a Digital World</title>
		<author>
			<persName><forename type="first">L</forename><surname>Van Lit</surname></persName>
		</author>
		<idno type="DOI">10.1163/9789004400351</idno>
		<imprint>
			<publisher>Brill</publisher>
			<pubPlace>Leiden, The Netherlands</pubPlace>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Who and How: Using Sentence-Level NLP to Evaluate Idea Completeness</title>
		<author>
			<persName><forename type="first">M</forename><surname>Ruskov</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-36336-8_44</idno>
	</analytic>
	<monogr>
		<title level="m">Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky</title>
		<title level="s">Communications in Computer and Information Science</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Wang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Rebolledo-Mendez</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">V</forename><surname>Dimitrova</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Matsuda</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">O</forename><forename type="middle">C</forename><surname>Santos</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham, Switzerland</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="284" to="289" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Modeling Narrative Revelation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Piper</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">D</forename><surname>Kolaczyk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Computational Humanities Research Conference 2023</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<editor>
			<persName><forename type="first">A</forename><surname>Šeļa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Jannidis</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><surname>Romanowska</surname></persName>
		</editor>
		<meeting>the Computational Humanities Research Conference 2023<address><addrLine>Paris, France</addrLine></address></meeting>
		<imprint>
			<publisher>CEUR</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3558</biblScope>
			<biblScope unit="page" from="500" to="511" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Lexical Semantic Change through Large Language Models: a Survey</title>
		<author>
			<persName><forename type="first">F</forename><surname>Periti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Montanelli</surname></persName>
		</author>
		<idno type="DOI">10.1145/3672393</idno>
		<ptr target="https://dl.acm.org/doi/10.1145/3672393" />
	</analytic>
	<monogr>
		<title level="j">ACM Comput. Surv</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="page">38</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">T5 meets Tybalt: Author Attribution in Early Modern English Drama Using Large Language Models</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">M M</forename><surname>Hicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Mimno</surname></persName>
		</author>
		<ptr target="https://ceur-ws.org/Vol-3558/#paper2757" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Computational Humanities Research Conference 2023</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<editor>
			<persName><forename type="first">A</forename><surname>Šeļa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Jannidis</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">I</forename><surname>Romanowska</surname></persName>
		</editor>
		<meeting>the Computational Humanities Research Conference 2023<address><addrLine>Paris, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">3558</biblScope>
			<biblScope unit="page" from="274" to="302" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">Translation from historical to contemporary Japanese using Japanese T5</title>
		<author>
			<persName><forename type="first">H</forename><surname>Usui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Komiya</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2023.nlp4dh-1.4" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Hämäläinen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Öhman</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Pirinen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Alnajjar</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Miyagawa</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Bizzoni</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Partanen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Rueter</surname></persName>
		</editor>
		<meeting>the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages, Association for Computational Linguistics<address><addrLine>Tokyo, Japan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="27" to="35" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Machines in the media: semantic change in the lexicon of mechanization in 19th-century British newspapers</title>
		<author>
			<persName><forename type="first">N</forename><surname>Pedrazzini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mcgillivray</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2022.nlp4dh-1.12" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Hämäläinen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Alnajjar</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Partanen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Rueter</surname></persName>
		</editor>
		<meeting>the 2nd International Workshop on Natural Language Processing for Digital Humanities, Association for Computational Linguistics<address><addrLine>Taipei, Taiwan</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="85" to="95" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">BERToldo, the historical BERT for Italian</title>
		<author>
			<persName><forename type="first">A</forename><surname>Palmero Aprosio</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Menini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Tonelli</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2022.lt4hala-1.10" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages</title>
				<editor>
			<persName><forename type="first">R</forename><surname>Sprugnoli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Passarotti</surname></persName>
		</editor>
		<meeting>the Second Workshop on Language Technologies for Historical and Ancient Languages, European Language Resources Association<address><addrLine>Marseille, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="68" to="72" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Pilehvar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Camacho-Collados</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/N19-1128</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long and Short Papers</title>
		<editor>
			<persName><forename type="first">J</forename><surname>Burstein</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Doran</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Solorio</surname></persName>
		</editor>
		<meeting>the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Minneapolis, Minnesota</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="1267" to="1273" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">TempoWiC: An Evaluation Benchmark for Detecting Meaning Shift in Social Media</title>
		<author>
			<persName><forename type="first">D</forename><surname>Loureiro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Souza</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Muhajab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">A</forename><surname>White</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Wong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Espinosa-Anke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Neves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Barbieri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Camacho-Collados</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2022.coling-1.296" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29th International Conference on Computational Linguistics</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C.-R</forename><surname>Huang</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Kim</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Pustejovsky</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Wanner</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K.-S</forename><surname>Choi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P.-M</forename><surname>Ryu</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H.-H</forename><surname>Chen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Donatelli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Kurohashi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Paggio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Xue</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Kim</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Y</forename><surname>Hahm</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">Z</forename><surname>He</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><forename type="middle">K</forename><surname>Lee</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Santus</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Bond</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S.-H</forename><surname>Na</surname></persName>
		</editor>
		<meeting>the 29th International Conference on Computational Linguistics, International Committee on Computational Linguistics<address><addrLine>Gyeongju, Republic of Korea</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="3353" to="3359" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">(Chat)GPT v BERT: Dawn of Justice for Semantic Change Detection</title>
		<author>
			<persName><forename type="first">F</forename><surname>Periti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Dubossarsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tahmasebi</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2024.findings-eacl.29" />
	</analytic>
	<monogr>
		<title level="m">Findings of the Association for Computational Linguistics: EACL 2024</title>
				<editor>
			<persName><forename type="first">Y</forename><surname>Graham</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Purver</surname></persName>
		</editor>
		<meeting><address><addrLine>St. Julian&apos;s, Malta</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="420" to="436" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A Strategy to Identify the Peculiarity of a Lexicon in the Analysis of a Corpus</title>
		<author>
			<persName><forename type="first">G</forename><surname>De Gasperis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Pavone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bolasco</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-55917-4_9</idno>
	</analytic>
	<monogr>
		<title level="m">New Frontiers in Textual Data Analysis</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Giordano</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">M</forename><surname>Misuraca</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham, Switzerland</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="105" to="118" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<monogr>
		<title level="m" type="main">L&apos;analisi automatica dei testi: fare ricerca con il text mining</title>
		<title level="s">Studi superiori. Statistica, 922</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bolasco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>De Mauro</surname></persName>
		</author>
		<imprint>
			<publisher>Carocci</publisher>
			<pubPlace>Roma</pubPlace>
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
	<note>1a edizione</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
