<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Promises from an Inferential Approach in Classical Latin Authorship Attribution</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giulio Tani Raffaelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computer Science, Czech Academy of Sciences</institution>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <issue>5</issue>
      <fpage>610</fpage>
      <lpage>619</lpage>
      <abstract>
        <p>Applying stylometry to Authorship Attribution requires distilling the elements of an author's style sufficient to recognise their mark in anonymous documents. Often, this is accomplished by contrasting the frequency of selected features in the authors' works. A recent approach, CP2D, uses innovation processes to infer the author's identity, accounting for their propensity to introduce new elements. In this paper, we apply CP2D to a corpus of Classical Latin texts to test its effectiveness in a new context and explore the additional insight it can offer the scholar. We show its effectiveness on this corpus and how, moving beyond maximum likelihood, we can visualise the stylistic relationships and gather additional information on the relationships among documents.</p>
      </abstract>
      <kwd-group>
        <kwd>authorship attribution</kwd>
        <kwd>inference</kwd>
        <kwd>classical Latin</kwd>
        <kwd>visualisation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>approach. If the anonymous text is split into fragments, the researchers either compute the
text likelihood by assuming the fragments are independent or let each fragment cast a vote
(Majority Rule). The authors test the approach on literary prose in three languages and
informal English texts. This approach is interesting as it is transparent and can be applied without
relying on language tools—e.g., lemmatisers or large pre-trained models—whose quality can
vary dramatically from language to language.</p>
      <p>While this approach has proven effective, its basic formulation does not fully exploit its
capabilities. Although the model has few hyperparameters, in the case of small corpora, optimising
the hyperparameters based on the best performance on known texts risks overfitting. In the
same paper, the performance on the test set for the smallest corpus is considerably lower than
on the training set, while on large corpora, it tends to be stable or increase [18, Table 1]. Also,
while using lemmatisers is not necessary, this could still help overcome data sparsity when the
corpus is small. On a different note, even in the case of dubious attribution, the likelihoods
produced by CP2D can offer further insight. The actual distribution of the likelihood values
can help assess the relative position of the document of disputed attribution. This paper has
three aims: testing the application of CP2D to Classical Latin poetry, dealing with the risks
of overfitting, and proposing a projection to examine the model output directly.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Results</title>
      <p>
        The first promising result is the correct attribution of at least 34 out of the 36 documents in
the corpus when following the method used in [<xref ref-type="bibr" rid="ref17">18</xref>]. We say “at least” because the method,
as described by its proponents, suggests no procedure to choose among different sets of
hyperparameters that all share the same micro-averaged recall on the training data. For one fourth
of the documents, the same maximum is obtained with at least 15 different hyperparameter
sets. Lacking a way to select a single one based on the training corpus increases the risk of
overfitting (selecting a parameter that is effective on the training set but not on the test set).
      </p>
      <p>Considering all sets of hyperparameters that offer performances comparable to the
maximum (see Methods section) and choosing the most common author, the correctly attributed
documents are 35. This requires accounting for a relevant fraction of all hyperparameters
tested. The only document not assigned to its canonical author is the Halieutica from Ovid,
whose authorship is indeed debated [8, Chap. 12]. In most cases, the first author is selected
with more than 70% of the hyperparameter sets, replacing the potential instability of the simple
attribution with a clear rule (see Fig. 1, panel C). Moreover, these results are comparable to the
baseline imposters method [14]. The size and relative simplicity of the corpus do not allow us to
claim a significant difference.</p>
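      <p>As an illustration of the rule of choosing the most common author among comparable hyperparameter sets, the following minimal sketch counts the proposals and reports the share obtained by the winner. The function name and the example proposals are hypothetical and are not taken from the cp2d module; ties are resolved in favour of the author encountered first, which is an assumption, as no tie-breaking rule is stated here.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def vote_over_sets(proposed_authors):
    """Return the most frequently proposed author and its share of the votes."""
    counts = Counter(proposed_authors)
    # most_common(1) returns a list with one (author, count) pair.
    winner, n_votes = counts.most_common(1)[0]
    return winner, n_votes / len(proposed_authors)


# Hypothetical proposals from ten comparable hyperparameter sets.
proposals = ["Ovid"] * 8 + ["Propertius", "Tibullus"]
winner, share = vote_over_sets(proposals)
print(winner, share)
```
]]></preformat>
      <p>The share of votes is the quantity that can be compared with the 70% level mentioned above.</p>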
      <p>A second observation is that, while CP2D does not require the use of lemmatisers, we find
that, in this corpus, the use of sequences of lemmas instead of words increases the number
of sets of hyperparameters that offer performances comparable to the maximum up to
one fifth of all parameters tried (see Fig. 1, panel A). At the same time, simply relying on any set
of parameters that gives the best attribution gives 33 correct attributions. In this case, one
fourth of the documents has at least 22 best-performing hyperparameter sets. These changes,
possibly due to reduced sparsity, require additional care in identifying which parameters to
trust. However, for most documents with both kinds of tokens, a single author is identified as
the most likely with all sets of hyperparameters (Fig. 1, panel B). Also, even when more than
one author is proposed, most hyperparameter sets usually select the same one (Fig. 1, panel C).
Considering lemmas, the baseline method has two misclassified documents, the Heroides and
the Consolatio ad Liviam, for which Seneca is preferred.</p>
      <p>[Figure 1, panels B and C. Recoverable axis labels: “Number of documents” (vertical) and “Number of proposed authors” (horizontal); legend “Tokens”: Words, Lemmas.]</p>
      <p>The method so far has two ways to offer better insight into the position of each document
relative to the candidate authors: how often each author is selected with different sets of
hyperparameters or, for each set of hyperparameters in the model, the relative likelihood of the
authors or the number of fragments assigned to each one. However, these methods do not
make it easy to consider several documents at once. Here, we try to overcome this issue by
projecting the documents on a hyper-sphere where the relationships between texts and authors
and among texts are encoded as angles.</p>
      <p>In Fig. 2, we show the positioning of the documents and texts of uncertain attribution in our
corpus and a sample document. In every plot, we show the position of all documents of the
three closest authors.</p>
      <p>We notice how we can have different scenarios with a shared message. The attribution of
Halieutica is either barely to Ovid or to Horace (panels A and B), but other documents from all
authors are always distant from it. The attribution can remain correct over the full range of
variation of the Macro Recall (panels C and D). The difference in recall is driven by documents
of other authors crossing attribution boundaries but does not affect stable attributions. A
document can change attribution even with the same Macro Recall on the training set (panels G and
H). However, even in panels F and G, when the anonymous document crosses the boundary
towards a different author, it remains closer to the books of the actual author than to any
other. This suggests that assigning the documents to the author of the nearest document could
give better results if cases like panels F and G were common. However, on this corpus, there
is no noticeable difference.</p>
      <p>
        Figure 2 shows that the Halieutica seems far from all the authors in our corpus, while the
Heroidum Epistulae and the Consolatio ad Liviam are well integrated into the Ovidian
production. We also observe how different books from the same collection tend to be grouped. On
a similar note, we can read that while the fourth book of Propertius seems close in style to
its reported author, it may be the least typical in its author’s production. Perhaps
unsurprisingly [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], even removing the subdivision in verses for poetical documents, the works in prose
from Seneca form a well-isolated group, and none of the other documents is ever attributed
to Seneca with any choice of parameters. Less obvious is that using lemmas and ignoring the
Heroides, no author is proposed outside Ovid, Propertius and Tibullus for all documents in
elegiac distich.
      </p>
      <p>These observations suggest that closeness between documents in this space is a good proxy
for stylistic similarity. At the same time, being closer to the bottom or one of the top corners
of the graphs indicates similarity to that author.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Discussion</title>
      <p>We showed that the CP2D approach is also effective on Classical Latin texts, even on a small
and imbalanced corpus. We showed that it is possible to increase the stability of the results
by accounting for the many equivalent sets of hyperparameters and that using lemmas instead
of words expands the subspace of hyperparameters where CP2D has high accuracy. We also
showed how a suitable projection of the documents gives a meaningful representation of the
relationships among documents. This representation can offer insight into the stylistic
properties of the documents. Lastly, we proposed different approaches to attribution leveraging the
distances among documents.</p>
      <p>
        However, this corpus proved to be simple, and it is not possible to judge if some of the
proposed alternatives (use of lemmas, authorship of the nearest document) would have a positive
effect on corpora where the simple majority vote over the equivalent sets of hyperparameters
is not satisfying. A future challenge will be attributing not entire books but individual poems.
This tougher challenge (some poems are only a few tens of words long) is of greater interest
because often, as in the case of the Heroidum Epistulae, the uncertainty in attribution is mainly
on selected poems [
        <xref ref-type="bibr" rid="ref16">17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Methods</title>
      <p>
        We selected a corpus of 34 documents from six different authors writing in Classical Latin for
this work. Poems in elegiac distich form the main part of the corpus (works by Ovid,
Propertius and Tibullus), followed by other works in various metres (works by Horace and Catullus)
and three examples of prose from Seneca (Consolationes: Ad Marciam, Ad Helviam matrem, Ad
Polybium). We designed the corpus to be imbalanced (the works of Ovid comprise half of the
documents) and divided into literary genres that we expect to challenge the attribution to
different extents. The Consolationes in prose from Seneca might show similarities with the Ovidian
text of similar topic. Moreover, the corpus contains four documents considered entirely or
partially from a different author. These are the third book of the Elegiae from Tibullus [8, Chapters
8-11], Ovid’s Halieutica [ibid., Chapters 12-13], the Consolatio ad Liviam [ibid., Chapter 14]
and the Heroidum Epistulae [ibid., Chapter 15]. The lemmatised sequences are publicly available
in the LASLA collection from the University of Liège [<xref ref-type="bibr" rid="ref14">15</xref>], with the exception of the Consolatio
ad Liviam; see Table 2 for the complete list of the documents included. Despite the known
relevance of morphosyntactic annotations [9], for this work, we took into consideration only the
lemmas. We executed the entire analysis in Python, using standard packages (numpy, scipy)
and the cp2d module from [<xref ref-type="bibr" rid="ref18">19</xref>]. The code is available at https://github.com/GiulioTani/CHR24.
      </p>
      <p>We prepared the texts, removing the separation in verses. The distinction between ‘u’ and
‘v’ is already removed in the documents, and we removed the distinction between upper- and
lower-case letters and all the non-alphabetic characters (i.e., punctuation). We considered the
sequences of words and lemmas and of <italic>N</italic>-grams with <italic>N</italic> ∈ [3, 6] for both. Note that the built-in
definition of <italic>N</italic>-grams in CP2D is derived from [13] and allows a space to appear only at the
beginning or at the end of the <italic>N</italic>-gram. This definition excludes words or lemmas shorter than
<italic>N</italic> − 2 (accounting for spaces at both ends).</p>
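      <p>To make the N-gram definition concrete, the sketch below extracts character N-grams in which a space may appear only as the first or last character. It is a hypothetical re-implementation for illustration, not the code of the cp2d module; the function name and the padding of the text with one space at each end are assumptions.</p>
      <preformat><![CDATA[
```python
def boundary_ngrams(tokens, n):
    """Character n-grams in which a space may only be the first or last character."""
    # Pad so that every word has a space on both sides (assumption).
    text = " " + " ".join(tokens) + " "
    grams = []
    for start in range(len(text) - n + 1):
        gram = text[start:start + n]
        # Reject n-grams with a space anywhere except at the two ends.
        if " " not in gram[1:-1]:
            grams.append(gram)
    return grams


# A two-letter word is shorter than n - 2 = 4, so it never appears in a 6-gram.
print(boundary_ngrams(["et", "arma"], 6))
print(boundary_ngrams(["arma", "uirumque", "cano"], 6))
```
]]></preformat>
      <p>With this reading, a word of exactly N − 2 characters contributes a single N-gram that includes both surrounding spaces, while shorter words contribute none.</p>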
      <p>As a baseline method, we used the imposters [14] approach built into Stylo [7], using the
top 2000 character 4-grams and the Wurzburg delta.</p>
      <p>We followed a nested leave-one-out paradigm to evaluate CP2D’s performance, because
the method works better when the size of the training corpus is maximised and most authors in the
corpus have only 3-5 documents. All the results are obtained by excluding one document at a
time and treating it as anonymous. Then, we optimise the model hyperparameters, maximising
the attribution in a new leave-one-out experiment on the remaining documents. Finally, we
evaluate the attribution of the left-out document. This procedure requires each author to have
at least three documents. To this end, we split the book of Catullus into three parts containing
39 carmina each, in order of appearance. The final corpus contains 36 documents.</p>
      <p>
        The simplest approach to attribution requires searching, for every document, the set of
hyperparameters that maximises the attribution on the remaining corpus. We followed the
authors in [<xref ref-type="bibr" rid="ref17">18</xref>] and used a grid search considering two normalisations of <italic>P</italic><sub>0</sub> (constant and
author dependent), five lengths of fragments (full documents, 50, 100, 150 and 300 tokens, as
the shortest document contains 339 words), five token definitions (full words and four lengths
of <italic>N</italic>-grams), two options for the attribution (Maximum Likelihood and Majority Rule) and 21
values of delta logarithmically spaced between 0.01 and 100. The left-out document is attributed
using (one of) the sets of hyperparameters that give the best accuracy out of the 2100 taken
into consideration. The search over the entire space of parameters for the attribution of one
document (including the use of lemma and word sequences) takes about two hours on a regular
laptop computer (8 × 2.4 GHz CPU, 16 GiB RAM).
      </p>
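      <p>The size of the grid described above can be checked with a short enumeration. This is only a sketch: the labels used for the options are hypothetical and do not correspond to the parameter names of the cp2d module, and including both endpoints in the logarithmic spacing of delta is an assumption.</p>
      <preformat><![CDATA[
```python
from itertools import product

import numpy as np

# Hypothetical labels for the five hyperparameters of the grid.
p0_normalisations = ["constant", "author_dependent"]
fragment_lengths = [None, 50, 100, 150, 300]   # None stands for full documents
token_definitions = ["word", 3, 4, 5, 6]       # full words or N-grams with N in [3, 6]
attribution_rules = ["maximum_likelihood", "majority_rule"]
deltas = np.logspace(-2, 2, num=21)            # 21 values from 0.01 to 100

grid = list(product(p0_normalisations, fragment_lengths,
                    token_definitions, attribution_rules, deltas))
print(len(grid))  # 2 * 5 * 5 * 2 * 21 = 2100
```
]]></preformat>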
      <p>The first step forward is not to limit the analysis to the set of parameters that offers the best
attribution on the training set but to consider all other sets that provide comparable results.
To determine which results are comparable, we assume that for every set of parameters,
a “true” probability of correct attribution exists. We sample this probability in a leave-one-out
experiment, but the number of correctly attributed texts can be higher or lower than expected
due to chance. To limit the effect of the class imbalance, we consider the macro-averaged recall
instead of the simple fraction of correctly attributed books. Taking the best-performing
set of parameters as a reference, we consider all the sets for which the fraction of correctly
attributed texts is at least at the 2.5th percentile in the confidence interval of the best result,
assuming a Bernoulli distribution. This choice allows us to distinguish cases where the
attribution is unanimous from those where different authors compete. In this case, every set of parameters
votes for the final attribution.</p>
      <p>While this procedure allows attribution, it does not allow comparisons between
documents. For every document <inline-formula><tex-math>d</tex-math></inline-formula>, the software returns the average log-likelihood per token
<inline-formula><tex-math>\ell_{j} = \frac{1}{T} \log \mathcal{L}(d \mid A_{j})</tex-math></inline-formula>
of every author <inline-formula><tex-math>A_{j}</tex-math></inline-formula>, with <inline-formula><tex-math>T</tex-math></inline-formula> the number of tokens. These likelihoods are not
directly comparable across documents. Indeed, in the leave-one-out approach, each known
document of an author and the anonymous one are compared against slightly different versions
of the author’s corpus. For each of the <inline-formula><tex-math>m</tex-math></inline-formula> documents of <inline-formula><tex-math>A_{j}</tex-math></inline-formula>, the likelihood
<inline-formula><tex-math>\mathcal{L}(d \mid A_{j})</tex-math></inline-formula> will be computed using a corpus of <inline-formula><tex-math>m - 1</tex-math></inline-formula> documents.
The reference corpus of <inline-formula><tex-math>A_{j}</tex-math></inline-formula> contains all <inline-formula><tex-math>m</tex-math></inline-formula> documents for the anonymous document.</p>
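      <p>The selection of comparable hyperparameter sets described earlier in this section can be sketched as follows. This is one possible reading, not the published implementation: it assumes that the macro-averaged recall of the best set is used as the success probability of a binomial distribution over the training documents, and that a set is comparable when its recall reaches the 2.5th percentile of that distribution. All function names and the example values are hypothetical.</p>
      <preformat><![CDATA[
```python
from math import comb


def macro_recall(true_authors, predicted_authors):
    """Average over authors of the fraction of their documents attributed correctly."""
    per_author = {}
    for truth, predicted in zip(true_authors, predicted_authors):
        hits, total = per_author.get(truth, (0, 0))
        per_author[truth] = (hits + (truth == predicted), total + 1)
    return sum(hits / total for hits, total in per_author.values()) / len(per_author)


def binomial_quantile(q, n, p):
    """Smallest k such that P(X <= k) >= q for X ~ Binomial(n, p)."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if cumulative >= q:
            return k
    return n


def comparable_sets(recalls, n_documents, q=0.025):
    """Names of the sets whose recall reaches the q-quantile of the best result."""
    best = max(recalls.values())
    threshold = binomial_quantile(q, n_documents, best) / n_documents
    return [name for name, recall in recalls.items() if recall >= threshold]


# Hypothetical recalls of three hyperparameter sets on ten training documents.
recalls = {"set_1": 0.9, "set_2": 0.7, "set_3": 0.6}
print(comparable_sets(recalls, n_documents=10))
```
]]></preformat>
      <p>When the best set attributes every training document correctly, the threshold equals one and only sets with perfect recall are retained.</p>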
      <p>To compare documents, we will ignore this aspect for two reasons. First, the reference corpus
is meant to reflect the best available description of the author as a proxy for the author’s style.
Each of the different versions represents the author with varying approximations. Second, from
a more technical point of view, the effect on the likelihood of the changing reference corpus
decreases with the size of the corpus itself.</p>
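      <p>We will now consider minus the inverse of the output of CP2D, i.e.,
<inline-formula><tex-math>x_{j} = -1/\ell_{j}</tex-math></inline-formula>, where <inline-formula><tex-math>\ell_{j}</tex-math></inline-formula> is the average log-likelihood per token returned for author <inline-formula><tex-math>A_{j}</tex-math></inline-formula>,
and treat these as Cartesian coordinates. With this transformation, the most likely author
is still associated with the maximum coordinate, and each author identifies with one of the
axes in space. The smallest angle between the document and the axes in the <inline-formula><tex-math>n</tex-math></inline-formula>-dimensional
space, with <inline-formula><tex-math>n</tex-math></inline-formula> the number of authors, identifies the most likely author. In the limit of
<inline-formula><tex-math>\mathcal{L}(d \mid A_{j}) \to 1</tex-math></inline-formula> (increasing likelihood of the author), the associated coordinate
<inline-formula><tex-math>x_{j} \to \infty</tex-math></inline-formula> and the document moves towards the axis (the angle tends to zero).</p>
      <p>The same attribution results would be achieved by projecting all the points on the surface
of an <inline-formula><tex-math>n</tex-math></inline-formula>-ball, i.e., an <inline-formula><tex-math>(n - 1)</tex-math></inline-formula>-sphere. Since the variability of the likelihood values is limited in practice,
most documents are scattered around the <inline-formula><tex-math>n</tex-math></inline-formula>-dimensional bisector. Thus, the distance from the
origin encodes general information on the typicality of the documents. In the following, we
will disregard this information and work only with the angles <inline-formula><tex-math>\theta_{k}</tex-math></inline-formula> computed as:
<disp-formula><label>(1)</label><tex-math>\theta_{k} = \arctan \frac{\sqrt{\sum_{j=k+1}^{n} x_{j}^{2}}}{x_{k}}</tex-math></disp-formula>
with <inline-formula><tex-math>n</tex-math></inline-formula> the number of candidate authors and <inline-formula><tex-math>\theta_{n} = \pi/2 - \theta_{n-1}</tex-math></inline-formula>. We apply this transformation
only to the likelihood values computed with the sets of hyperparameters that include
Maximum Likelihood attribution, setting aside the attribution with Majority Rule. When
computing attribution based on the angle between documents, we use the cosine distance of the
angle vectors to determine the nearest document.</p>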
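      <p>The projection can be sketched in a few lines of numpy. The code below is an illustrative reconstruction under stated assumptions, not the code released with the paper: it takes the coordinates as minus the inverse of the average log-likelihood per token, computes the angles with the arctangent of the norm of the remaining coordinates over the current one, sets the last angle to π/2 minus the previous one, and compares documents with the cosine distance of their angle vectors. Function names and the example log-likelihoods are hypothetical.</p>
      <preformat><![CDATA[
```python
import numpy as np


def to_coordinates(avg_loglik):
    """Minus the inverse of the (negative) average log-likelihood per token."""
    return -1.0 / np.asarray(avg_loglik, dtype=float)


def hyperspherical_angles(x):
    """Angles of a point with positive coordinates; requires at least two authors."""
    x = np.asarray(x, dtype=float)
    # squared_tail[k] = sum of x[j]**2 for j >= k
    squared_tail = np.cumsum(x[::-1] ** 2)[::-1]
    theta = np.empty(x.size)
    theta[:-1] = np.arctan(np.sqrt(squared_tail[1:]) / x[:-1])
    theta[-1] = np.pi / 2 - theta[-2]
    return theta


def cosine_distance(u, v):
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


# Hypothetical average log-likelihoods of three documents under three authors.
avg_loglik = {
    "doc_a": [-5.0, -6.0, -8.0],
    "doc_b": [-5.2, -6.1, -7.5],
    "doc_c": [-7.0, -5.0, -6.5],
}
angles = {name: hyperspherical_angles(to_coordinates(values))
          for name, values in avg_loglik.items()}
nearest = min((name for name in angles if name != "doc_a"),
              key=lambda name: cosine_distance(angles["doc_a"], angles[name]))
print(nearest)
```
]]></preformat>
      <p>Because the transformation is monotonic, the author with the largest coordinate is also the one with the highest likelihood, so the Maximum Likelihood attribution is unchanged.</p>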
      <p>This measure misses some characteristics of a proper metric. Most notably, the angle
between two documents can be zero without being the same text. If the two texts differ only in
the order of the words and in words that appear only in the individual documents (and with
the same distribution of the frequencies), every author will have the same likelihood for both
texts, which will have zero distance. The distance between texts should not be interpreted as
a measure of their textual difference, as the position in space depends on the relationship with
the authors. However, it can be viewed as a measure of the stylistic difference.</p>
      <p>This projection allows us to visualise on 2D paper the relationship with up to three authors
without dimensionality reduction (a sphere in 3D has a 2D surface). This natural representation
allows visualising decision boundaries, defining regions associated with each author and
corresponding to the ML attribution. Moreover, when interested in stylistic relationships and not
in attribution, we can use just a single level of the leave-one-out procedure. This means
looking at the documents of a group of authors when none of them is treated as anonymous. Here,
each document is compared against all others. In practice, in Fig. 2, this is the case of the works
of Horace, Propertius and Tibullus in panels A–F.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>The author thanks Noemi Daria Zaccagnino for providing the lemmatisation of the Consolatio
ad Liviam and other advice in the assembly of the corpus. Her contribution was essential to
the completion of this work.</p>
      <p>[12] Kestemont, Mike, Moens, Sara, and Deploige, Jeroen. “Collaborative authorship in the
twelfth century: a stylometric study of Hildegard of Bingen and Guibert of Gembloux”.
In: Digital Scholarship in the Humanities 30.2 (2015), 199-224.</p>
      <p>[Figure 2: positions of the documents relative to the three closest authors for selected hyperparameter sets. Recoverable panel titles: “P0: auth.dep., F: 150, N: 6, delta: 0.204 (M.R.: 1.0)”; “P0: auth.dep., F: 50, N: 5, delta: 0.602 (M.R.: 1.0)”; “P0: fixed, f.d., N: 6, delta: 0.398 (M.R.: 0.9804)”.]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Agapitos</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>van Cranenburgh</surname>
          </string-name>
          .
          <source>A Stylometric Analysis of Seneca’s Disputed Plays. Authorship Verification of Octavia and Hercules Oetaeus</source>
          . Tech. rep. 1. Darmstadt: TU Darmstadt,
          <year>2024</year>
          , 31 pp. doi: https://doi.org/10.26083/tuprints-00027394.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bamman</surname>
          </string-name>
          and
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Burns</surname>
          </string-name>
          .
          <article-title>Latin BERT: A Contextual Language Model for Classical Philology</article-title>
          .
          <year>2020</year>
          . arXiv: 2009.10053 [cs.CL].
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Benzécri</surname>
          </string-name>
          .
          <source>L'Analyse des Correspondances</source>
          . Vol.
          <volume>2</volume>
          . 2 vols. Paris, Bruxelles, Montreal: Dunod,
          <year>1973</year>
          . 625 pp.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Benzécri</surname>
          </string-name>
          .
          <source>L'Analyse des Données</source>
          . 2 vols. Paris, Bruxelles, Montreal: Dunod,
          <year>1973</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T. J.</given-names>
            <surname>Bolt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Flynt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Dexter</surname>
          </string-name>
          .
          “
          <article-title>A Stylometry Toolkit for Latin Literature</article-title>
          ”. In:
          <source>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations</source>
          . Ed. by
          <string-name>
            <given-names>S.</given-names>
            <surname>Padó</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Huang</surname>
          </string-name>
          . Hong Kong, China: Association for Computational Linguistics,
          <year>2019</year>
          , pp.
          <fpage>205</fpage>
          -
          <lpage>210</lpage>
          . doi: 10.18653/v1/D19-3035.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P.</given-names>
            <surname>Chaudhuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Dasgupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Dexter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Iyer</surname>
          </string-name>
          . “
          <article-title>A small set of stylometric features differentiates Latin prose and verse</article-title>
          ”. In:
          <source>Digital Scholarship in the Humanities</source>
          <volume>34</volume>
          .
          <issue>4</issue>
          (
          <year>2018</year>
          ), pp.
          <fpage>716</fpage>
          -
          <lpage>729</lpage>
          . doi: 10.1093/llc/fqy070.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Eder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rybicki</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          . “
          <article-title>Stylometry with R: A Package for Computational Text Analysis</article-title>
          ”. In:
          <source>The R Journal</source>
          <volume>8</volume>
          .
          <issue>1</issue>
          (
          <year>2016</year>
          ), pp.
          <fpage>107</fpage>
          -
          <lpage>121</lpage>
          . doi: 10.32614/rj-2016-007.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T. E.</given-names>
            <surname>Franklinos</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Fulkerson</surname>
          </string-name>
          .
          <source>Constructing Authors and Readers in the Appendices Vergiliana, Tibulliana, and Ouidiana</source>
          . Oxford University Press,
          <year>2020</year>
          . doi: 10.1093/oso/9780198864417.001.0001.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gorman</surname>
          </string-name>
          . “
          <article-title>Morphosyntactic Annotation in Literary Stylometry</article-title>
          ”. In:
          <source>Information</source>
          <volume>15</volume>
          .
          <issue>4</issue>
          (
          <year>2024</year>
          ). doi: 10.3390/info15040211.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Juola</surname>
          </string-name>
          . “
          <article-title>JGAAP: A system for comparative evaluation of authorship attribution”</article-title>
          .
          <source>In: Journal of the Chicago Colloquium on Digital Humanities and Computer Science</source>
          .
          <year>2009</year>
          . doi: 10.6082/m1n29v4z.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Karsdorp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kestemont</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Riddell</surname>
          </string-name>
          .
          <source>Humanities Data Analysis: Case Studies with Python</source>
          . Princeton University Press,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Koppel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Argamon</surname>
          </string-name>
          . “
          <article-title>Authorship attribution in the wild</article-title>
          ”. In:
          <source>Language Resources and Evaluation</source>
          <volume>45</volume>
          .
          <issue>1</issue>
          (
          <year>2011</year>
          ), pp.
          <fpage>83</fpage>
          -
          <lpage>94</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [14]
          In:
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>65</volume>
          .
          <issue>1</issue>
          (
          <year>2014</year>
          ), pp.
          <fpage>178</fpage>
          -
          <lpage>187</lpage>
          . doi: 10.1002/asi.22954.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>Longree</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Fantoli</surname>
          </string-name>
          .
          <source>LASLAfiles_Latin_APNformat</source>
          . Version V1.
          <year>2023</year>
          . doi: 10.58119/ulg/qjj0sa.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>B.</given-names>
            <surname>Nagy</surname>
          </string-name>
          .
          <article-title>(Not) Understanding Latin Poetic Style with Deep Learning</article-title>
          .
          <year>2024</year>
          . doi: 10.48550/arXiv.2404.06150.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>B.</given-names>
            <surname>Nagy</surname>
          </string-name>
          . “
          <article-title>Some stylometric remarks on Ovid's Heroides and the Epistula Sapphus</article-title>
          ”.
          <source>In: Digital Scholarship in the Humanities</source>
          <volume>38</volume>
          (3
          <year>2023</year>
          ), pp.
          <fpage>1183</fpage>
          -
          <lpage>1199</lpage>
          . doi: 10.1093/llc/fqac098.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <given-names>G.</given-names>
            <surname>Tani Raffaelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lalli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Tria</surname>
          </string-name>
          . “
          <article-title>Inference through innovation processes tested in the authorship attribution task</article-title>
          ”.
          <source>In: Communications Physics</source>
          <volume>7</volume>
          (
          <issue>1</issue>
          <year>2024</year>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          . doi: 10.1038/s42005-024-01714-6.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <string-name>
            <given-names>G.</given-names>
            <surname>Tani Raffaelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lalli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Tria</surname>
          </string-name>
          .
          <source>GiulioTani/InnovationProcessesInference: Accepted</source>
          . Version v1.0.0.
          <year>2024</year>
          . doi: 10.5281/zenodo.12163218.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>