=Paper= {{Paper |id=Vol-2021/paper3 |storemode=property |title=A choice of relationship-revealing variants for a cladistic analysis of Old Norse texts: Some methodological considerations |pdfUrl=https://ceur-ws.org/Vol-2021/paper3.pdf |volume=Vol-2021 |authors=Katarzyna Anna Kapitan }} ==A choice of relationship-revealing variants for a cladistic analysis of Old Norse texts: Some methodological considerations== https://ceur-ws.org/Vol-2021/paper3.pdf
  A Choice of Relationship-Revealing Variants for a Cladistic
         Analysis of Old Norse texts: Some Methodological
                                    Considerations



                                   Katarzyna Anna Kapitan
        Department of Nordic Studies and Linguistics, University of Copenhagen
                                       kak@hum.ku.dk




Keywords: stemmatology, manuscript studies, manuscripts, cladistic analysis,
Old Norse


Abstract
The research presented in this article centers on the methodological field of computer-assisted
stemmatics, specifically the application of the tools and methods originating from phylogenetics
to answer questions of textual criticism. Given the well-known problem of stemmatics that the
shape of the stemma changes depending on the readings selected for analysis, surprisingly little
discussion has been devoted to the definition of a relationship-revealing reading among the
practitioners of computer-assisted methods. This article discusses some of the controversies
regarding the methodological principles within the field of computer-assisted stemmatics (new
stemmatics and cladistic textual criticism), with the main focus on a choice of relationship-
revealing readings. This research takes an experimental approach towards different
methodological principles, and tests them using PHYLIP (the Phylogeny Inference Package,
version 3.695). The experiments aim to assess the influence of different types of variants on the
results of a cladistic analysis. The experiments are based on readings collected from the oldest
part of the manuscript tradition of an Icelandic saga, Hrómundar saga Gripssonar. The results of
the experiments suggest that the cladistic method can be employed in traditional textual
research, but the results achieved through this process are highly dependent on the type of
variation included in the input file: the shape of the unrooted tree of relationships changes
depending on whether it was built on major or on minor variants.
Introduction
The application of computer-assisted methods, originating from phylogenetics, to answer
questions of textual criticism has been recognized in academic discourse as a powerful tool for
revealing the filiation of manuscripts (recently in Old Norse studies: Hall & Parsons, 2013;
Zeevaert et al., 2013). Surprisingly, not much discussion within the field of "New Stemmatics"
has been devoted to the definition of a relationship-revealing reading, and there is disagreement
among practitioners of computer-assisted methods regarding the fundamental question: What
type of textual variation can, or should, be used for text-genealogical analysis? Salemans (1996)
suggested a strictly systematized classification of text-genealogically informative variants, while
Robinson (1996, p. 75) opted for basing the analysis on all substantive variants of different
readings. This paper takes an experimental approach towards this problem, and presents the
results of applying a phylogenetic analysis to the oldest part of the tradition of an Icelandic saga,
Hrómundar saga Gripssonar (HsG). The discussion is based on experiments conducted
employing a package of programs for inferring phylogenies, PHYLIP, developed by Felsenstein
(version 3.695, 2013). The aim of the experiments was to assess the influence of different types
of variants on the results of cladistic analysis.


Theoretical framework
Close similarities between the theoretical assumptions of cladistics and stemmatics have been
noted by Platnick and Cameron (1977, pp. 384–385), who pointed out "that cladistic analysis is a
general comparative method applicable to all studies of historical interrelationships based on
actual ancestor-descendant sequences," including stemmatics and historical linguistics. The
same idea has been recently revived by Howe et al. (2004), who discussed the main similarities
between evolutionary biology and stemmatics, emphasizing the similarity of the challenges both
disciplines face, namely contamination in manuscript traditions and the evolutionary processes
occurring in DNA-sequences, such as recombination, transposition, and homoplasy (cf. O’Hara
& Robinson, 1993; Robinson & O’Hara, 1996).


In recent years, the use of a cladistic analysis in stemmatological research has become highly
popular, and has led to the appearance of two terms, "cladistic textual criticism," and "New
Stemmatics." Even though both terms refer to computer-assisted stemmatics based on cladistic
analysis, the principles governing the data collection are different. "New Stemmatics," according
to the definition published on The Textual Scholarship webpage (Robinson & Bordalejo, 2010b),
aims at obtaining, as far as possible, a comprehensive overview of the relationships between
witnesses through an analysis based on all available data using quantitative tools, typically
computer-assisted analysis. In contrast, "cladistic textual criticism," a term introduced by
Salemans (1987), argues for a careful selection of variants to be processed by a computer, and
emphasizes that only very few textual differences can be considered genealogically informative,
and hence used for the analysis.


In my view, it is justifiable to state that all philologists, regardless of their background, would
agree that not all modifications of a text (innovations) can be considered genealogically
informative. There is, however, no consensus regarding which variants (readings, errors) can be
considered as such, and can be used to build the stemma (chain, tree) of the manuscript
tradition. The well-known approach, practiced by neo-Lachmannians, including to some extent
also Salemans, follows the rule that exclusively “non-polygenetic significant errors” can be
considered text-genealogically informative, and can thus be used to build a stemma (cf. Trovato
2014, p. 55, p. 110). In this case, the judgment of a polygenetic reading remains in the
philologist's individual domain, and gives philologists an opportunity to make their critical
judgment based on their own preferences and experiences. This subjectivity, as suggested
recently by Andrews (2016), might pose some challenges to text-critical research, and bring
under scrutiny the quality of the decisions made by philologists.


Conversely, the dominant view within "New Stemmatics" is that orthography, punctuation, and
formal presentation are types of variation that do not carry genealogy-revealing information (cf.
Bordalejo 2015, p. 566; Robinson 2016, p. 639); accordingly, only substantive variants can be
used to build a stemma. This suggestion follows Greg’s (1950) distinction between substantives,
which are likely to have been copied from witness to witness, and accidentals, which might be
particular to a scribe (cf. Robinson 1996, p. 75). However, Greg's considerations were more of
an editorial than a stemmatic nature and, even though he observed that "the distribution of
substantive variants generally agrees with the genetic relation of the texts" (Greg 1950, p. 22),
he did not comment on how accurate this "general agreement" is, or how useful substantives are
to build a stemma. On the contrary, in the earlier stages of his career, Greg was a zealous
aficionado of building stemmas using exclusively type-two variants, which he defined as
genealogically significant variants that divide a given tradition into exactly two groups, where
each variant occurs in at least two text versions (Greg, 1927, pp. 22–23). Due to the lack of more
detailed discussion within “New Stemmatics” to define genealogically informative readings, the
most reasonable way to reveal some of the principles applied in computer-assisted analysis is to
consult lists of variants used in previous scholarship. The complete lists of variants of The Wife
of Bath's Prologue, The General Prologue, The Miller's Tale, The Nun's Priest's Tale, and
Sólarljóð are all available online on The Textual Scholarship webpage (Robinson & Bordalejo,
2010a); considering the scope of this paper, I have decided to consult Robinson's (2004) list of
variants for the Old Norse-Icelandic poem Sólarljóð.
The readings listed for Sólarljóð, and therefore those considered genealogically informative,
include, for example, the sentence: helgir englar komu or himni ofan ok tóku sál hans "holy
angels came from heaven above and took his soul" (nos. 117-125). In this sentence, variation
appears in a verb tense (komu - koma, no. 119), omission or use of a different preposition (or -
af - frá, no. 120), the number of a noun (himni - himnum, no. 121), and inversion of a determiner
and head noun (sál hans - hans sál, no. 125). Other examples include omission and variation in
conjunctions (því - því að, no. 797), prepositions (fyrir - frá, no. 847), adverbs (brot - í burt - burt,
no. 853), and word order (menn ég sá þá - menn sá ég þar, nos. 1129-1131), as well as obvious
errors (fuglar [birds] as "fulgar" [sic], no. 991). Even though many of the listed variants can be
considered minor from a traditional point of view, it has been suggested by Hall and Parsons
(2013, § 39) that "an accumulation of minor variants all pointing in the same direction can
become a powerful argument for a particular manuscript filiation" and as Robinson (2016, p. 649)
emphasized, the results achieved by a phylogenetic analysis are reliable "because our analysis
does not rest on only these one or two variants ('indicative' as they might be), but on patterns
within the whole mass of variation."


Research questions
Given, firstly, the dichotomy in approaching genealogically informative variants, and secondly,
the well-known problem of stemmatology that the shape of the stemma changes depending on
the readings selected for the analysis, there is a need to evaluate the effectiveness of both
approaches in the context of computer-assisted stemmatology. Obvious questions arise, such as
how to select readings that carry relationship-revealing information, and how to use them in
computer-assisted analysis. Should we base stemmas on neo-Lachmannian non-polygenetic
significant errors, and build the input file exclusively on loci critici, or should we register all types
of substantives, as new stemmaticists suggest? To my knowledge there has been no research
evaluating how these two categories influence the results of computer-assisted analysis.
Instead, scholars have been creating computer-assisted stemmas and comparing them with
existing, traditionally-built stemmas, in order to evaluate how accurate computer-generated
stemmas can be. It is hoped that the innovative approach of the experiments presented in this
paper will shed new light on this problem.


Methodology
Choice of manuscripts
The experiments presented in this paper are a case study of the oldest witnesses of HsG, listed
in Table 1. These are the manuscripts on which previous scholars, mainly Kölbing (1876) and
Andrews (1911), but also to some extent Rafn (1829, pp. xii–xiii), based their arguments
regarding the saga's origin and transmission, although they arrived at contradictory conclusions. The
exceptions are B4859, which was dismissed by both Andrews and Kölbing as worthless, and L222,
which was most likely unknown to them: both manuscripts are included in my analysis based on
chronological criteria.
                           Table 1: Oldest manuscripts of HsG, by shelf mark.

   Siglum                 Shelf mark                Date                        Scribe
   A193                   AM 193 e fol.             1690-1697                   Ásgeir Jónsson
   A345                   AM 345 4to                1695                        Jón Þórðarson
   A587                   AM 587 b 4to              1686-1688                   Ásgeir Jónsson
   A601                   AM 601 b 4to              1650-1689                   Jón Eggertsson
   B4859                  BL Add. 4859              1695                        Jón Þórðarson
   L222                   Lbs 222 fol.              1695                        Jón Þórðarson
   P67                    Papp. Fol. nr 67          1687                        Jón Eggertsson
   T1768                  Thott 1768 4to            1686-1688                   Ásgeir Jónsson


This choice of witnesses serves the purpose of placing the results of the computer-assisted
analysis in the context of the existing classifications of HsG manuscripts, by Kölbing and
Andrews, who based their judgment on traditional genealogical methods. Kölbing (1876, p. 181)
excluded the possibility that any of the witnesses examined could be the codex optimus of the
existing tradition, or even an ancestor of the other manuscripts examined, as presented in Figure
1. Andrews (1911, p. 531) rejected the idea that all the manuscripts represent independent
branches; instead he suggested that A601 preserves an original text of the saga, and is an
ancestor of the entire tradition, as presented in Figure 2. Andrews did not discuss the
relationships between other manuscripts, but his stemma seemed to group other witnesses into
two branches: on one side he placed manuscripts in Ásgeir Jónsson’s hand, A587, A193, and a
third unnamed node, which certainly represents T1768, to which he refers in the text; on the
other side he placed P67 in Jón Eggertsson's hand, and A345 in Jón Þórðarson's hand.




                            Figure 1: Stemma based on Kölbing 1876, p. 182.
   Figure 2: Stemma based on Andrews 1911, p. 531. The dotted line represents a connection between
                 A601 and a manuscript that was missing a label in the original stemma.

Andrews's stemma is certainly incorrect regarding the relationships between the manuscripts in
Ásgeir Jónsson's hand (A193, A587, T1768). A193 and A587 are almost identical and must be
closely related, and A193 is most likely a descendant of A587. Kölbing's stemma includes only
four manuscripts and might be a good hypothesis for the relationships between these
manuscripts, but whether A601 and P67 are independent witnesses is not obvious. Since there
is insufficient space here either for a detailed discussion of relationships between the manuscripts
based on extra-textual evidence, or to present this author's own stemma and argue
in favor of it, this paper focuses on comparing the trees of relationships based on
different criteria. Additionally, the computer-assisted stemmas are put into the context of the two
traditionally achieved stemmas presented above, testing whether any of them can be obtained
by computer-assisted analysis based on different data sets.


Cladistic analysis
The experiments presented in this article were inspired by Hall's (2013) experiments with the
phylogenetic analysis conducted on small samples from Konráðs saga keisarasonar, as well as
experiments conducted on chapter 86 of Brennu-Njáls saga during the stemmatology
workshop organized by Hall and Zeevaert at the Arnamagnæan Summer School in Manuscript
Studies (Zeevaert et al., 2013). Following Hall's example, instead of conducting the experiments
with the help of PAUP* (Phylogenetic Analysis Using Parsimony), which currently seems to be the
most popular software in the field of stemmatics, I have chosen a free open-source package of
programs for inferring phylogenies, PHYLIP, specifically the general parsimony program PARS, the
tree-plotting program DRAWTREE, and the consensus-tree program CONSENSE (Felsenstein, 2013).


Complete transcriptions of the texts were prepared in plain text format, and collated in a
spreadsheet. The texts were transcribed on a simplified diplomatic level, with abbreviations
expanded in round brackets (), unclear readings marked in square brackets [], deletions likely to
be by the scribe between slashes //, and scribal additions in insertion marks `´. No variation of
graphemes or orthography has been represented and, for the sake of practicality, modern
Icelandic letterforms have been employed (but not modern Icelandic orthography). Variation in
the transcriptions posed no difficulty for the research because PARS requires manual encoding of
the variants: this means that it is the scholar’s subjective decision which readings will be
considered genealogically informative variants and which not. In practice, within the spreadsheet
the numeric values are assigned manually to each character and then converted into the PARS
input files: therefore, normalization of the transcriptions is not crucial. Where it was impossible to
determine which variant appears in a particular witness, for example due to its illegibility, the
character was encoded with a question mark to indicate uncertainty.
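
The manual encoding step described above can be sketched in a few lines of Python. This is only an illustration, not the workflow actually used for the HsG data: the function names, the sample readings, and the rule "first reading seen becomes state 0" are assumptions made for the example, and the illegible cell for L222 is invented to show the question mark. The sketch assumes the usual PHYLIP layout, in which the first line gives the number of witnesses and characters and every following line starts with a name padded to ten characters; the state symbols accepted by PARS should be checked against the PHYLIP documentation.

```python
def encode_column(readings):
    """Encode one character (column): siglum -> state symbol.

    `readings` maps siglum -> reading; None marks an illegible witness,
    which is encoded as "?".
    """
    states = {}
    encoded = {}
    for siglum, reading in readings.items():
        if reading is None:
            encoded[siglum] = "?"
            continue
        if reading not in states:
            if len(states) >= 10:
                # Keep every state symbol to a single digit.
                raise ValueError("more than ten readings in one character")
            states[reading] = str(len(states))  # first reading seen -> "0"
        encoded[siglum] = states[reading]
    return encoded


def to_phylip(columns, sigla):
    """Build a PHYLIP-style discrete-character matrix as one string."""
    encoded = [encode_column(column) for column in columns]
    lines = [f"{len(sigla):>5}{len(columns):>5}"]
    for siglum in sigla:
        row = "".join(column[siglum] for column in encoded)
        lines.append(f"{siglum:<10}{row}")
    return "\n".join(lines) + "\n"


# Hypothetical mini-collation: two characters, four witnesses.
SIGLA = ["P67", "A601", "A587", "L222"]
COLUMNS = [
    {"P67": "herskip", "A601": "herskip", "A587": "herskip", "L222": "skip"},
    {"P67": "stokkar", "A601": "stokkar", "A587": "steinar", "L222": None},
]

matrix = to_phylip(COLUMNS, SIGLA)
print(matrix)
```

Because the states are assigned by hand (here, by a fixed rule), differences in spelling between transcriptions only matter if the scholar chooses to treat them as distinct readings.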


The complete transcriptions comprised approximately 3600 words per witness (e.g. 3587 words
in A587, and 3643 words in L222) and were collated in 690 characters (columns), which
correspond to places of possible variation, averaging 5.2 words per character. This number may
give rise to some controversies, since scholars usually tend to have characters as small as
possible: Salemans (1996, p. 15), for example, preferred a collation where each word is
considered an independent place of variation. However, the collation should imitate the
manuscript's copying process. It is rather unlikely that any professional scribe would copy any
given text word by word, especially a text in the vernacular: copying phrase by phrase seems to
be more probable. Where more than one place of variation appears within one
character, this character is divided into as many smaller characters as necessary; this excludes
the possibility that some changes arose independently of each other at different stages of the
saga's transmission and accidentally ended up in one character. For example, a reading
Hrómundur spyr hver nú vill ganga í hauginn "Hrómundur asks who now wants to go into the
mound", which was first intended as one character, was divided into three characters based on
the registered variation (e.g. hver - hvor, nú vill - vill, ganga í - ganga inn í). Characters
that do not contain any places of actual variation remained in their initial form and can be fairly
long, reaching up to eight words, such as Setti hann þá klær sínar á hnakka Hrómundi "He put
then his claws on Hrómundur's neck".


Choice of relationship-revealing variants
There is no golden rule for revealing significant errors regardless of genre, language, and scribal
tradition. Therefore, as Trovato (2014, p. 115) suggested, philologists must distinguish significant
errors from noise in each individual case. Van Mulken (1993, p. 25) drew the same conclusion,
stating that "it is the corpus which should dictate a typology of variants with respect to the
kinship-revealing character." The typology she used to distinguish and assess the use of a
number of variant types in the Perceval tradition is vital for the development of stemmatic
methodology. Van Mulken (1993, pp. 36–40) distinguished variants with low and high relationship-
revealing power. The variants with low relationship-revealing power include interpolation or
omission of verses, variants affecting the possessive or determinative quality of a word,
numbers, changes in a narrative point of view, as well as extra-textual variants (e.g. historiated
initials and lombards). The variants with high relationship-revealing power include interchange of
verses, or of rhyming constituents, changes in word order, variants concerning the aspect, time
or mood of verbs, and important semantic changes. Salemans (1996, p. 4) also developed a
typology of relationship-revealing variants and established rules of text-critical research; he
suggested that "only very few textual differences can serve as genealogical, relationship-
revealing elements." He defined four basic text-genealogical rules, which included a definition of
a genealogical variant, place of variation, type-two variants, and the necessity of presenting
variants in the apparatus. Within the first rule, Salemans discussed the types of variants that
cannot be used for stemmatic analysis (essentially polygenetic variants): these include
synonymous, regional, inflectional and historical parallelism.

The present paper draws extensively on the implications of Salemans’s (1996) and van Mulken’s
(1993) methodology, which served as the basis for developing my preliminary typology of
genealogically informative variants in the HsG early tradition. It must be emphasized,
however, that both aforementioned classifications were developed for poetic texts, not for prose
such as HsG. Taking into consideration the stylistic differences which might influence the
copying process, the criteria applied to the saga’s tradition therefore required some adjustments.
For the purpose of this paper I divided all the variants appearing in the oldest witnesses of the
saga into two main categories: major and minor. Major variants are readings with a high
likelihood of having been copied from the exemplar, which are mainly lexical, such as:

   •   omission, addition, or replacement of nouns, e.g. grátt og sítt hár, skegg "grey and long
       hair, beard" - grátt og sítt skegg "grey and long beard", and stokkar "timber stakes" -
       steinar "stones", herskip "a warship" - skip "a ship";
   •   omission, addition, or replacement of verbs, e.g. batt "bound" - bar "carried", including
       synonyms, e.g. datt - féll "fell", but excluding verbs introducing speech;
   •   omission, addition or replacement of adjectives, including synonyms, e.g. myrkt - dimt
       "dark";
   •   omission, addition, or replacement of entire phrases, e.g. heldur enn ræna kotkarla
       "rather than rob a cottager" - omitted;
   •   clear errors, e.g. kü for ský "sky", and Svylöð for Gunnlöð (personal name).

Minor variants are readings with a high probability of occurring independently from the exemplar,
and might be scribe-individual, and thus often polygenetic, such as:

   •   number of nouns, e.g. búkum (dat. pl.) "human corpses" - búki (dat. sg);
   •   tense of verbs, e.g. komu (imperf.) "they came" - koma (pres.);
   •   definiteness of nouns and adjectives, e.g. líf "a life" - lífið "the life";
   •   position, addition, omission, or replacement of prepositions, adverbs, and conjunctions,
       e.g. og spyr eftir "and asks after" - og spyr "and asks";
   •   linguistic variation, e.g. hvor - hver "who", gera - gjöra "to do";
   •   word order, e.g. sá hét Greipur "this [man] was called Greipur" - er Greipur hét "who was
       Greipur called", and maður frægur "man famous" - frægur maður "famous man";
   •   numbers, e.g. xvi undir "sixteen wounds" - xiv undir "fourteen wounds".

Additionally, I use Greg's definition of type-two variants, as introduced above, and build on this
concept by including in one experiment quasi-type-two variants, which are minor variants that
are hypothetically not genealogically significant, and which divide the tradition into two groups of
readings with at least two witnesses representing each group.
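
The type-two criterion can be stated as a small check. The Python sketch below is a simplified reading of the definition given above (exactly two groups of readings, each attested by at least two witnesses); the function name and the `min_witnesses` parameter are introduced here for illustration only. The two sample characters are taken from nos. 2 and 13 of Table 2.

```python
from collections import Counter


def is_type_two(readings, min_witnesses=2):
    """Return True if the readings form exactly two groups of sufficient size.

    `readings` maps siglum -> reading; None (illegible) is ignored.
    """
    counts = Counter(r for r in readings.values() if r is not None)
    return len(counts) == 2 and all(n >= min_witnesses for n in counts.values())


# No. 2 in Table 2: herskip against skip.
NO_2 = {
    "P67": "herskip", "A601": "herskip", "A587": "herskip", "A193": "herskip",
    "T1768": "herskip", "L222": "skip", "A345": "skip", "B4859": "skip",
}

# No. 13 in Table 2: sókk, hljóp, and omission give three groups.
NO_13 = {
    "P67": "sókk", "A601": "hljóp", "A587": "hljóp", "A193": "hljóp",
    "T1768": "hljóp", "L222": "hljóp", "A345": "om.", "B4859": "om.",
}

print(is_type_two(NO_2))   # two groups of five and three witnesses
print(is_type_two(NO_13))  # three groups, one of them a single witness
```

The check says nothing about whether a variant is major or minor; that distinction remains a philological judgment made before the readings are encoded.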

For each experiment, different types of variation were used to build unrooted trees of
relationships; therefore the number of parsimony-informative characters changes from one
experiment to another. The criteria for selecting the variants are described in detail in the next
section. The environment in which experiments were conducted, such as the number of
characters analyzed, order of manuscripts in the input file and the settings in PARS, DRAWTREE,
and CONSENSE, remained unchanged.


Experiments and results
Experiment 1
The input file for the first experiment was based on all types of variants, both major and minor,
thus 255 characters were considered parsimony-informative: around 37% of all characters in the
data set. The vast majority of variants included in this experiment were minor variants. Due to
limitations of space, the full list of 255 variants is not included in this paper but the full data set
containing collated transcriptions and PARS input files can be obtained from the author on
request.




             Figure 3: An unrooted tree of relationships: Experiment 1 (all variants included).
The PARS analysis found one most-parsimonious tree, which can be
presented in Newick notation, with branch lengths removed and spaces introduced for the
reader's convenience: (((B4859, A345), L222), ((T1768, (A193, A587)), A601), P67). The results
were passed to the program DRAWTREE in order to visualize an unrooted tree of relationships,
as presented in Figure 3. The unrooted tree presents the relationships between the manuscripts,
including the branch lengths. The manuscripts in Jón Þórðarson's hand (A345, B4859, L222) are
placed relatively close to each other, while the manuscripts in Ásgeir Jónsson's hand (A193,
A587, T1768) are grouped together with A601. The relationship between Jón Eggertsson's
manuscripts (P67 and A601) was not recognized in such a straightforward way.


Experiment 2
In the second experiment, only 73 potentially significant variants or major variants were
considered parsimony-informative: around 11% of all the characters analyzed. In this
experiment, three possible trees of relationships were found, which can be represented as
follows:
       a. ((A345, (B4859, L222)), (T1768, (A193, A587)), A601, P67);
       b. ((A345, (B4859, L222)), (T1768, ((A193, A587), A601)), P67);
       c. ((A345, (B4859, L222)), ((T1768, (A193, A587)), A601), P67);
All three trees have the same groupings of A193 and A587 (green) and of Jón Þórðarson's
manuscripts (red). Since PARS returned three possible trees of relationships, its
output file was first passed to the CONSENSE program in order to obtain a consensus tree,
applying a strict consensus method. Next, the CONSENSE output was passed to DRAWTREE to
visualize the results, as presented in Figure 4.
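
The idea behind a strict consensus can be illustrated with the three trees listed above. The Python sketch below is not a reimplementation of CONSENSE: it is a minimal illustration that parses the simplified Newick strings (names only, no branch lengths), collects the groupings (splits) of each unrooted tree, and keeps only those present in all three. All function names are hypothetical, and P67 is used merely as a reference witness so that splits written in different ways can be compared.

```python
import re


def parse_newick(text):
    """Parse simplified Newick (names only) into nested tuples."""
    tokens = re.findall(r"[(),]|[^(),;\s]+", text)
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            name = tokens[pos]
            pos += 1
            return name
        pos += 1  # skip "("
        children = [node()]
        while tokens[pos] == ",":
            pos += 1
            children.append(node())
        pos += 1  # skip ")"
        return tuple(children)

    return node()


def splits(tree, reference):
    """Return the non-trivial splits, each as the side without `reference`."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    result = set()
    for side in found:
        if reference in side:
            side = all_leaves - side
        # Drop trivial splits (a single witness against all others).
        if 2 <= len(side) <= len(all_leaves) - 2:
            result.add(side)
    return result


def strict_consensus(newick_strings, reference):
    """Keep only the splits that occur in every input tree."""
    split_sets = [splits(parse_newick(s), reference) for s in newick_strings]
    return set.intersection(*split_sets)


TREES = [
    "((A345, (B4859, L222)), (T1768, (A193, A587)), A601, P67);",
    "((A345, (B4859, L222)), (T1768, ((A193, A587), A601)), P67);",
    "((A345, (B4859, L222)), ((T1768, (A193, A587)), A601), P67);",
]

consensus = strict_consensus(TREES, "P67")
for group in sorted(consensus, key=lambda s: (len(s), sorted(s))):
    print(sorted(group))
```

Under these assumptions only three groupings survive: A193 with A587, B4859 with L222, and the three manuscripts in Jón Þórðarson's hand. The placement of T1768, A601, and P67 differs between the input trees and is therefore left unresolved.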




  Figure 4: A strict consensus tree: Experiment 2 (note that branch lengths in the consensus tree are not
                                                relevant)
As shown in Figure 4, Jón Þórðarson's manuscripts (A345, B4859, L222) are derived from a
common ancestor, while the two manuscripts in Ásgeir Jónsson's hand (A193 and A587) are
presented as siblings; also P67, A601, and T1768 can be interpreted as siblings.


Experiment 3
In the third experiment, the data set was based on quasi-type-two variants and contained 68
parsimony-informative characters: around 10% of all characters in the data set.
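
The shares quoted for Experiments 1 to 3 follow from the 690 collated characters. The short Python check below recomputes them from the counts reported in the text; the variable names are illustrative only.

```python
TOTAL_CHARACTERS = 690

# Parsimony-informative characters reported for each experiment.
INFORMATIVE = {"Experiment 1": 255, "Experiment 2": 73, "Experiment 3": 68}

# Share of all characters, rounded to whole percent.
shares = {
    name: round(100 * count / TOTAL_CHARACTERS)
    for name, count in INFORMATIVE.items()
}

for name, share in shares.items():
    print(f"{name}: about {share}% of {TOTAL_CHARACTERS} characters")
```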




      Figure 5: An unrooted tree of relationships: Experiment 3 (all quasi-type-two variants included)

As a result (Figure 5), one most-parsimonious tree was found: (((B4859, A345), L222), ((T1768,
(A193, A587)), A601), P67). The results allow us to identify two groups of manuscripts: Ásgeir
Jónsson's manuscripts (A193, A587, T1768) and Jón Þórðarson's group (A345, B4859, L222).
The relationship between Jón Eggertsson's manuscripts is inconclusive, and A601 is grouped
together with Ásgeir Jónsson's group.


Experiment 4
This experiment was based on type-two variants, which are listed in Table 2; variants are
collated against P67. The symbol ✔ indicates that the particular witness contains the same
reading as P67; readings that disagree with P67 are given in the table.
                        Table 2: Distribution of major type-two variants used in Experiment 4

   No.  P67                                 A601                        A587              A193              T1768             L222                      A345                      B4859
   1.   Sá kongur réði fyrir Görðum         add. `/í Danmörk/´          add. í Danmörk    add. í Danmörk    ✔                 ✔                         ✔                         ✔
   2.   engin herskip                       ✔                           ✔                 ✔                 ✔                 herskip] skip             herskip] skip             herskip] skip
   3.   og ræna drauga fé                   ✔                           ✔                 ✔                 ✔                 om. fé                    ✔                         om. fé
   4.   heldur enn ræna kotkarla            ✔                           om.               om.               ✔                 ✔                         ✔                         ✔
   5.   Hrómundur þakkar karli fregn þessa  ✔                           ✔                 ✔                 ✔                 fregn þessa] frá söguna   fregn þessa] frá söguna   fregn þessa] frá sögu
   6.   að grjót og stokkar gengu upp,      ✔                           stokkar] steinar  stokkar] steinar  ✔                 ✔                         ✔                         ✔
   7.   og víst ertu hraustur               ✔                           add. maður        add. maður        ✔                 ✔                         ✔                         ✔
   8.   Gunnlöð                             ✔                           ✔                 ✔                 ✔                 Svílöð                    Svílöð                    Svílöð
   9.   og er þeir voru á leið komnir       ✔                           ✔                 ✔                 ✔                 leið] veg                 leið] veg                 leið] veg
   10.  koma af landi svört sky             ✔                           ✔                 ✔                 ✔                 sky] kü                   sky] kü                   sky] kü
   11.  mér þótti jarnhringur settur        ✔                           ✔                 ✔                 ✔                 settur] sleginn           ✔                         settur] sleginn
   12.  í `jördina´ upp að hjöltum          í /völlinn/ upp að hjöltum  í upp að hjöltum  í upp að hjöltum  í upp að hjöltum  í völlinn upp að hjöltum  í völlinn upp að hjöltum  í völlinn að hjöltum
   13.  svo sverðið sókk að hjöltum         sókk] hljóp                 sókk] hljóp       sókk] hljóp       sókk] hljóp       sókk] hljóp               om.                       om.
   14.  om.                                 ofan /í völlinn/            ofan              ofan              ofan              ofan í völlinn            om.                       om.
   15.  karl sagði, að hann                 ✔                           karl] hann        karl] hann        ✔                 ✔                         ✔                         ✔
   16.  og að líðnum fjórum dögum           ✔                           fjórum] sex       fjórum] sex       fjórum] sex       ✔                         ✔                         ✔


As presented in Table 2, this experiment is based mainly on type-two variants, but there are
some exceptions to this rule. For example, the definite form of the noun saga against its
indefinite form (no. 5) is not considered a variant; only the reading frá sögu against fregn þessa is
encoded as text-genealogically informative. Additionally, the variants in the light grey rows
(nos. 12-14) are type-four variants, which need some explanation. In no. 12, the omission of the
preposition upp was not considered parsimony-informative; only the omission of völlinn is
considered a major variant, because it creates a sentence without an object, thus creating a
 type-two variant in the input file. At the same time, it is open to discussion whether this reading
 can be considered as genealogically informative, since P67 has a supralinear addition of jördina,
 while A601 has völlinn deleted, so the scribe copying A601 could either restore the deleted
reading or follow the deletion, and create a sentence without an object, as in case of A587 and
A193. In no. 13, P67 is the only manuscript that reads sóck for hljóp, while A345 and B4859 omit
the entire reading; the omission of the entire phrase should be considered a major difference
and this is why it was encoded in the input file. No. 14 could be considered a minor variant, since
the omission of the preposition does not meet the criteria for major variants. However, the
readings were included in the analysis as a type-four variant, because both readings 12 and 14
seem to be somehow related to each other, perhaps as a result of an error involving the
omission of the word völlinn.

The variants in the dark grey rows (nos. 15-16) can be polygenetic, but are included in the
analysis: no. 16 because in some of the witnesses the numbers were spelled out, while other
witnesses had Roman numerals, so even though there is a high possibility that Roman
numerals will be copied incorrectly, the written-out forms are more likely to remain unchanged;
no. 15 because the use of a personal pronoun for a noun did not fit directly with my definition of a
minor variant, even though it might be polygenetic. Taking these methodological concerns into
consideration, I have split the experiment into four subtests:
   •   Test 4a: based on 14 parsimony-informative characters (nos. 1-14)
   •   Test 4b: based on 16 characters (nos. 1-16)
   •   Test 4c: based on 13 characters (nos. 1-11, and 15-16)
   •   Test 4d: based on 11 characters (nos. 1-11)
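
The four subtests differ only in which Table 2 characters are kept. The following is a minimal
sketch of that selection, assuming a character matrix stored as one string of states per
manuscript. The all-zero matrix is a placeholder and not the encoded readings, and the helper
names select_characters and to_phylip are hypothetical. The output layout follows PHYLIP's
documented discrete-character input: a line with the two counts, then ten-character names
followed by the states.

```python
# Character numbers (from Table 2) used in each subtest of Experiment 4.
SUBTESTS = {
    "4a": list(range(1, 15)),              # nos. 1-14
    "4b": list(range(1, 17)),              # nos. 1-16
    "4c": list(range(1, 12)) + [15, 16],   # nos. 1-11 and 15-16
    "4d": list(range(1, 12)),              # nos. 1-11
}

SIGLA = ["P67", "A601", "A587", "A193", "T1768", "L222", "A345", "B4859"]

# Placeholder matrix (assumption): 16 characters per manuscript, all state "0".
# The real states would come from the encoded readings.
placeholder = {siglum: "0" * 16 for siglum in SIGLA}


def select_characters(matrix, numbers):
    """Keep only the characters whose 1-based numbers are listed."""
    return {
        siglum: "".join(row[n - 1] for n in numbers)
        for siglum, row in matrix.items()
    }


def to_phylip(matrix):
    """Serialise a matrix as PHYLIP discrete-character input."""
    n_chars = len(next(iter(matrix.values())))
    lines = [f"{len(matrix):5d}{n_chars:5d}"]
    # PHYLIP expects names padded with blanks to exactly ten characters.
    lines += [f"{siglum:<10}{row}" for siglum, row in matrix.items()]
    return "\n".join(lines) + "\n"


for name, numbers in SUBTESTS.items():
    subset = select_characters(placeholder, numbers)
    print(name, len(numbers), "characters")
    print(to_phylip(subset))
```

Keeping the character numbers in one table makes it easy to check that each subtest contains the
number of characters stated in the list above.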
Test 4a yielded three possible trees of relationships, which were identical in shape to the
results of Experiment 2. The visualization of the tree is not reproduced here, since it is
identical to the one presented in Figure 4 above.


As the result of Test 4b, one most-parsimonious tree was found (Figure 6), which shows a
promising distinction between Ásgeir Jónsson's manuscripts (A193, A587, T1768), Jón
Þórðarson's manuscripts (L222, B4859, A345), and Jón Eggertsson's manuscripts (P67, A601).
         Figure 6: An unrooted tree of relationships: Experiment 4b (16 major type-two variants).


As the result of Test 4c, one most-parsimonious tree was found (Figure 7). In this unrooted tree
P67 and A601 (Jón Eggertsson's manuscripts) are identical, A193 and A587 (Ásgeir Jónsson's
manuscripts) are also identical and are related to T1768, while B4859 is no longer a sibling of
L222, but is now presented as its descendant, and L222 in turn as a descendant of A345 (Jón
Þórðarson's group).




         Figure 7: An unrooted tree of relationships: Experiment 4c (13 major, type-two variants).

As the result of Test 4d, one most-parsimonious tree was found (Figure 8). The distinction
between P67, A601 and T1768 has disappeared, while other manuscripts have stayed in the
same position relative to each other.
     Figure 8: An unrooted tree of relationships based on 11 major, type-two variants: Experiment 4d.



Additional experiment 2'
Inspired by the significant differences in the results of Experiment 4, I decided to apply similar
rules to the data set from Experiment 2 and include readings 15-16 (from Table 2) in the data
set; thus the input file for Experiment 2' contained 75 parsimony-informative characters (11%).




  Figure 9: An unrooted tree of relationships: Experiment 2' (73 major variants, variants 1-16 from Table 2 included).

Excluding branch lengths, the trees obtained in Experiment 2' and Experiment 4b are identical,
and in Newick notation can be represented as ((A345, (B4859, L222)), (T1768, (A193, A587)),
A601, P67).
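
A comparison of two trees "excluding branch lengths" can be checked mechanically. The following
is a minimal sketch, not the procedure used in the experiments: it parses Newick strings with a
small hand-written parser (internal node labels are not supported) and compares the sets of
leaf bipartitions, which define an unrooted topology. It assumes the tree above is read as
((A345,(B4859,L222)),(T1768,(A193,A587)),A601,P67). The branch lengths in the second string are
invented, and the third string is a hypothetical rearrangement of Jón Þórðarson's group.

```python
import re


def parse_newick(text):
    """Parse a Newick string into nested lists of leaf names.

    Branch lengths are discarded; internal node labels are not supported.
    """
    text = re.sub(r":[-+0-9.eE]+", "", text.strip().rstrip(";"))
    tokens = re.findall(r"[(),]|[^(),\s]+", text) + [""]  # "" marks the end
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # consume "("
            children = [node()]
            while tokens[pos] == ",":
                pos += 1
                children.append(node())
            if tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1                      # consume ")"
            return children
        if tokens[pos] in ("", ")", ","):
            raise ValueError("leaf name expected")
        pos += 1
        return tokens[pos - 1]

    tree = node()
    if tokens[pos] != "":
        raise ValueError("unexpected text after the tree")
    return tree


def leaf_set(tree):
    """All leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of the leaves implied by the tree."""
    everything = leaf_set(tree)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return
        clade = leaf_set(node)
        rest = everything - clade
        if len(clade) > 1 and len(rest) > 1:
            found.add(frozenset([clade, rest]))
        for child in node:
            walk(child)

    walk(tree)
    return found


as_text = "((A345,(B4859,L222)),(T1768,(A193,A587)),A601,P67);"
# Same grouping, different order, invented branch lengths.
reordered = "(P67:0.1,A601:0.2,((A587:1,A193:1):0.5,T1768:1):2,((L222:1,B4859:1):1,A345:1):2);"
# Hypothetical alternative: A345 and B4859 as siblings instead of B4859 and L222.
rearranged = "((L222,(A345,B4859)),(T1768,(A193,A587)),A601,P67);"

print(splits(parse_newick(as_text)) == splits(parse_newick(reordered)))   # True
print(splits(parse_newick(as_text)) == splits(parse_newick(rearranged)))  # False
```

A string with unbalanced parentheses raises ValueError in this sketch, which is a quick way to
catch transcription slips in a Newick string.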
Discussion
The experiments presented in the previous section tested the effectiveness of the assorted types
of variants for cladistic analysis. Not surprisingly, the results show that the shape of unrooted
trees of relationships changes depending on the criteria applied while selecting the potentially
genealogically informative readings. As shown by the tests within Experiment 4, and also by
Experiments 2 and 2', the analysis is very sensitive to any change in the data set: the presence
or absence of two or three variants in the input file changed the outcome from three
possible trees of relationships to one most-parsimonious tree.
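
Why two characters can tip the balance may be illustrated with a small parsimony count. The
following is a minimal sketch and not the procedure used in the experiments, which were run in
PHYLIP: it applies Fitch's algorithm to count the changes each character requires on a given
tree. The states for characters 15 and 16 are as read from Table 2 (0 = agrees with the base
text, 1 = variant). Both rooted binary trees are illustrative assumptions: the first is an
arbitrary resolution of the grouping by scribe, the second is a hypothetical alternative that
separates T1768 from A193 and A587.

```python
def fitch(tree, states):
    """Fitch small parsimony on a rooted binary tree of nested 2-tuples.

    Returns (possible states at this node, number of changes below it).
    """
    if isinstance(tree, str):                 # leaf: a manuscript siglum
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:                                # children agree: no extra change
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, characters):
    """Total number of changes over all characters."""
    return sum(fitch(tree, states)[1] for states in characters)


SIGLA = ["P67", "A601", "A587", "A193", "T1768", "L222", "A345", "B4859"]


def character(variant_witnesses):
    """Binary character: 1 for witnesses with the variant, 0 otherwise."""
    return {siglum: int(siglum in variant_witnesses) for siglum in SIGLA}


char_15 = character({"A587", "A193"})            # karl] hann
char_16 = character({"A587", "A193", "T1768"})   # fjórum] sex

# Assumption: an arbitrary binary resolution of the grouping by scribe.
tree_by_scribe = (("P67", "A601"),
                  (("A345", ("B4859", "L222")), ("T1768", ("A193", "A587"))))
# Hypothetical alternative that moves T1768 next to A345.
tree_alternative = (("P67", "A601"),
                    ((("A345", "T1768"), ("B4859", "L222")), ("A193", "A587")))

print(tree_length(tree_by_scribe, [char_15, char_16]))    # 2
print(tree_length(tree_alternative, [char_15, char_16]))  # 3
```

In this toy comparison character 16 alone separates the two trees; with so few characters in
the input file, one such character can decide which tree is the most parsimonious.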


Generally, it can be observed that, regardless of which variants were included in the input file,
the general grouping of the manuscripts corresponds to the expected grouping by scribe. In all
experiments, manuscripts in Jón Þórðarson's hand (A345, B4859, L222) were grouped as
derived from a common ancestor, as were all manuscripts in Ásgeir Jónsson's hand (A193,
A587, T1768). However, the relationship between manuscripts in Jón Eggertsson's hand (A601,
P67) was not obvious, and it must be emphasized that one of them, A601, had previously been
considered a codex optimus, so its position is especially interesting from the perspective of the
HsG tradition. The tree resulting from Experiment 1 corresponds closely to the tree from
Experiment 3, while the consensus tree achieved in Experiment 2' corresponds to the tree from
Experiment 4b. So, the data sets based on major variants deliver similar results, while the tests
based on all types of variants result in a different grouping. The change can be observed in Jón
Þórðarson's group: in Experiments 1 and 3, based on all variants, A345 can be interpreted as a
sibling of B4859, and their parent as a sibling of L222, while in Experiments 2' and 4b, based on
major variants, L222 can be interpreted as a sibling of B4859 and their parent as a sibling of
A345.


Even though Salemans, following Greg, claimed that only type-two variants can be used to build
a stemma, these experiments show that the shape of the tree is influenced more by the type of
variation included in the analysis than by the restriction to exclusively type-two variants
(compare Experiments 1 and 2 with Experiments 3 and 4). The tree resulting from Experiment 2',
which is based on all major variants, seems to represent a better hypothesis of the relationships
between the witnesses within the tradition under scrutiny than the one in Experiment 4b. This is
certainly a result of including readings which are unique to single witnesses (lectio singularis
or type-one variants), since some of them might play an important role as separative errors. I
did not test how type-four variation influences the results of the analysis, but such variants
were also included in the input files of Experiments 2 and 2'.
The similarities between trees based on minor variants and trees based on major variants
require further explanation. The number of parsimony-informative characters included in each
experiment was as follows: Experiment 1: 255 (all variants); 2: 73 (major variants); 3: 68
(quasi-type-two variants); 4a: 14, 4b: 16, 4c: 13, 4d: 11 (major type-two variants); and 2': 75
(major variants). This shows that minor variants, which are likely to be polygenetic or individual
to a scribe, constitute the vast majority of the variation within the HsG tradition: 71% of the
characters in Experiment 1 can be considered minor variants, and similarly 79% in Experiment
3. In both cases there is a remarkable imbalance between major and minor variants in favor of
the latter; thus the results achieved in Experiments 1 and 3 are largely based on minor variants,
which override any potentially genealogical information carried by major variants.
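
The two percentages can be reproduced from the character counts listed above. The following is
a short worked check, under the assumption that the major subset of Experiment 1 is the 73
characters of Experiment 2 and that the major subset of Experiment 3 is the 14 characters of
Test 4a; this pairing is inferred from the figures, not stated in the text.

```python
# Parsimony-informative characters per experiment, as listed in the text.
parsimony_informative = {"1": 255, "2": 73, "3": 68, "4a": 14}


def minor_share(all_count, major_count):
    """Percentage of characters that fall outside the major-variant subset."""
    return 100 * (all_count - major_count) / all_count


# Assumed pairing: Experiment 1 against 2, Experiment 3 against Test 4a.
share_exp1 = minor_share(parsimony_informative["1"], parsimony_informative["2"])
share_exp3 = minor_share(parsimony_informative["3"], parsimony_informative["4a"])

print(f"Experiment 1: {share_exp1:.0f}% minor")  # 71%
print(f"Experiment 3: {share_exp3:.0f}% minor")  # 79%
```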


After a close examination of the distribution of selected minor variants, as presented in Table 3,
one might get the impression that there are some patterns which perhaps reflect the stylistic
preferences of the scribe. For example, the use of the historical present (no. 16) seems to be a
stylistic choice by Ásgeir Jónsson, while neither Jón Þórðarson nor Jón Eggertsson employs this
form in this particular example. In some instances, however, other scribes also use the historical
present, for example Jón Þórðarson in L222 and Jón Eggertsson in both P67 and A601 (no. 15).
Moreover, the historical present is very common, effectively the default form used in Icelandic
saga narrative, so it cannot be considered a distinctive feature of Ásgeir Jónsson's
manuscripts. The use of the singular instead of the plural genitive of ferð "journey" (no. 1) seems
again to be an innovation by Ásgeir Jónsson, who additionally in T1768 changed the tense of a
verb from bjóst (imperf. passive) to býst (pres. passive).


Other variants, which I refer to as "linguistic variants" (nos. 3-12), are not consistently distributed
between the scribes. Even though some of them give this impression, such as Ásgeir Jónsson's
choice of the definite form of the noun sverðið "the sword" (no. 3), which is common to all of his
manuscripts, it is rather difficult to argue that this choice is not polygenetic. Similarly, it seems as though
Ásgeir Jónsson preferred the unrounded form gera instead of gjöra "to do" (nos. 5-6), and words
starting hver- instead of hvor- (nos. 9-10); elsewhere in the saga, however, he used hvornin, for
example in "spurðu menn þa hvorninn Þrainn oc hann hefþo skiliþ" (A587, 5r:15-16), and hvorn
in "Hrongvidur matti kiosa hvorn `dag´ mann fyrir sverdsins odde" (T1768, 261v:13-14).


Word order seems to be even more accidental, as in example 17, where T1768 agrees with
L222, and in example 2, where T1768 agrees with A345, but it is impossible to prove any
relationship between these manuscripts. The distribution of conjunctions, pronouns, and
prepositions also seems to be absolutely fortuitous, as shown in examples 12-14.
                    Table 3: Distribution of selected minor (quasi type-two) variants.

         Jón Eggertsson                          Ásgeir Jónsson                              Jón Þórðarson
         P67          A601           A587             A193           T1768          L222          A345         B4859
1.    hann bjóst      ferða]      ferða] ferðar    ferða] ferðar   bjóst] býst,       ✔             ✔             ✔
       til ferda      ferðar                                       ferða] ferðar
2.         í            ✔              ✔                ✔           þú vanst í        ✔         þú vanst          ✔
      hólmgöngu                                                    hólmgöngu                         í
      m þú vanst                                                       með                      hólmgön
         með                                                        Mistilteini                  gu með
      Mistilteini                                                                               Mistilteini
3.    hvar sverð      sverð]         sverð]           sverð]          sverð]          ✔             ✔             ✔
        hangir       sverðið        sverðið          sverðið         sverðið
4.    þeir gjörðu       ✔              ✔             gjörðu]            ✔             ✔          gjörðu]       gjörðu]
         svo                                          gerðu                                       gerðu         gerðu
5.    og svo var    gjört] gert    gjört] gert      gjört] gert         ✔             ✔         gjört] gert   gjört] gert
         gjört
6.    og gjörðist       ✔           gjörðist]        gjörðist]          ✔             ✔             ✔         gjörðist]
          þar                       gerðist          gerðist                                                   gerðist
7.    engin jarn        ✔         engin] engi      engin] engi          ✔           engin]        engin]        engin]
                                                                                     engi          engi          engi
8.    hvorn dag         ✔            hvorn]           hvorn]            ✔             ✔             ✔          hvorn]
                                     hvern            hvern                                                     hvern
9.     hvor sá      hvor] hver     hvor] hver       hvor] hver      hvor] hver        ✔             ✔             ✔
         væri
10.    hvort er         ✔         hvort] hvert     hvort] hvert    hvort] hvert     hvort]          ✔             ✔
       nafn þitt                                                                     hvert
11.     spurðu          ✔              ✔             hvornin]        hvornin]         ✔             ✔         hvornin]
       menn þá                                       hvernin         hvernin                                   hvernin
       hvornin
12.     sagði           ✔              ✔             add. að            ✔             ✔          add. að          ✔
        Blindur
13.   og kastaði     og] enn        og] enn          og] enn            ✔          og] enn       og] enn       og] enn
         niður
14.   þó sér væri       ✔              ✔                ✔               ✔           þó] að       þó] að        þó] að
      niður slept
15.   hann tekur        ✔              ✔                ✔               ✔             ✔         tekur] tók    tekur] tók
       sér kylfu
16.    og þegar         ✔         komu] koma       komu] koma      komu] koma         ✔             ✔             ✔
      þeir komu
17.   hafa með          ✔              ✔                ✔              með           með        gert hafa         ✔
       göldrum                                                       göldrum       göldrum      ísin með
       gjört ísin                                                   hafa gjört     hafa gjört   göldrum
                                                                       ísin           ísin
Even if the minor variants can be regarded as individual to a scribe, the grouping we achieved
using them is based on some sort of "scribal signal" and not on genealogically significant
information. In the case of the oldest witnesses of HsG, the general grouping was surprisingly
accurate, most likely because there were only three scribes who might have been somewhat
consistent in their stylistic choices, but it does not mean that the relationship between the
manuscripts was correctly established from a genealogical point of view. If we consider the
minor variants as polygenetic, assuming that any scribe at any point in time could make the
same change (for example in the word order or use of adverbs), then the results obtained with
the use of minor variants must be considered inconclusive.


The distribution of the major variants does not deliver definite results either. As shown in
Experiment 4, the shape of the tree changes depending on which variants are considered major
and whether we include 11, 13, 14, or 16 parsimony-informative variants in the input file.
Moreover, the results obtained from the data set based on 11 variants do not allow us to
determine the relationship between A601, P67, and T1768, since they are identical with respect
to major type-two variants. The only reading that differentiates them in the input file is the
reading "sá konungur réði fyrir Görðum," for which A601 reads "sá konugur réði fyrir /Görðum/ `i
Danmörk´." It should be noted that in A601 "Görðum" is crossed out, and "í Danmörk" is a
supralinear addition, which was later also crossed out: thus T1768, A587, and A193 are the only
witnesses that preserve the fully written out form "sá konungur réði fyrir Görðum í Danmörk." In
this case, one has to refer to variants classified in this paper as minor variants in order to reveal
the relationships between these manuscripts, which calls into question the purpose of variant
classification.


Taking the external evidence into consideration, or rather material aspects of the manuscripts
A601 and P67, the tree obtained in Experiments 4b and 2' seems to represent a better
hypothesis about the relationship between these manuscripts than the others and, together with
the consensus tree of Experiments 2 and 4a, these trees are close to the outcome of Kölbing's
hypothesis. However, it should not be forgotten that P67 and A601 are both in Jón Eggertsson's
hand, and one can be a copy of the other.


There is no space here for an extensive discussion of the material features of these manuscripts,
but it is important to mention some of them. P67 is a large manuscript in folio with the Old Norse-
Icelandic text written only on the verso sides of the leaves, while the recto sides are left blank,
presumably to leave room for a Latin or Swedish translation. This is typical of manuscripts from
the late seventeenth and early eighteenth centuries, for example, Papp. fol. nr 73, dated to 1738,
preserves Gríms saga loðinkinna, Ketils saga hængs and Örvar-Odds saga, with the Old Norse-
Icelandic text on the verso side and a Swedish translation on the recto side; Papp. fol. nr 90,
dated to 1683-1720, preserves Hálfdanar saga Eysteinssonar written in two columns, Old Norse-
Icelandic on the left-hand side and Swedish on the right-hand side; similarly Papp. fol. nr 88,
dated to 1683-1691, preserves Göngu-Hrólfs saga written in two columns with Old Norse and
Swedish side by side. P67 does not employ many abbreviations and the text is not very dense,
which might suggest that this manuscript was originally meant to have a representative function.
A601 is a quarto manuscript with a denser text, containing many abbreviations and corrections in
the scribe's own hand. A601 might therefore be considered as a draft, or a fast copy made while
the scribe had access to the exemplar for a limited period of time, perhaps during Jón
Eggertsson’s incarceration in Copenhagen (cf. Jucknies, 2009). Extensive textual differences
between A601 and P67 may suggest that P67 is an intentionally revised version of the text in
A601, and thus useless from a text-genealogical point of view.


This brings us to some final considerations of the purpose of the stemma. If the stemma serves
only to visualize a general network of relationships between manuscripts, and not as a
tool for the application of majority rule, as it does in traditional textual criticism, then perhaps the
results achieved in the experiments based on all types of variants are sufficient to fulfill this role
and the classification of variants is not necessary. It is especially relevant in the light of
Experiment 4d, which did not allow us to draw any conclusions about the filiation of some of the
manuscripts. Moreover, if one is inclined to build a stemma based exclusively on 11 variants,
then it seems more reasonable to draw the stemma by hand rather than to go through the entire
process of complete transcriptions, collations, and computer processing. Also, if the results
achieved by a computer-assisted analysis are only of a preliminary nature and always require
manual adjustments, as recently suggested by Buzzoni et al. (2016, pp. 652, 665), and a
stemma is always a hypothesis and simplification of the manuscript tradition, then perhaps it is
not crucial whether we take the results from Experiments 1 or 4b as a point of departure.


Conclusion
The results of my experiments suggest that the cladistic method can be employed in traditional
textual research, and a careful selection of variants can improve the results, but it does not
guarantee conclusive results. As shown by the results of Experiments 4 and 2', the input file can
be based only on traditionally selected major variants (or major type-two variants) and the results
still show some signs of manuscript filiation, so the data set can be built exclusively on loci critici
characters. Computer-assisted methods improve the efficiency of a textual analysis by delivering
a rough hypothesis of the relationships between manuscripts relatively quickly, but the results
achieved through this process are of a preliminary nature and require further detailed
investigation to construct the stemma.


The presence of minor variants in the case of HsG did not disturb the general grouping, but the
relationships between the selected manuscripts were different from the ones based on major
variants only. It has to be emphasized that we were dealing with only three scribes in this case
study. Even if the minor variants are scribe-specific and if one can distinguish Ásgeir Jónsson's
group from Jón Þórðarson's group based on their "scribal signal", it does not mean that these
minor variants are genealogically informative because descendants of these manuscripts might
not preserve these features. This is a vital question which requires further investigation, and is
the subject of my ongoing research on the complete manuscript tradition of HsG.


REFERENCES
Manuscripts by repository
Stofnun Árna Magnússonar í íslenskum fræðum, Reykjavík:
   •   AM 193 e fol.
   •   AM 345 4to
   •   AM 587 b 4to
   •   AM 601 b 4to
Det Kongelige Bibliotek, Copenhagen:
   •   Thott 1768 4to
Landsbókasafn Íslands og Háskólabókasafn, Reykjavík:
   •   Lbs 222 fol.
Kungliga biblioteket, Stockholm:
   •   Isl. papp. fol. nr 67
   •   Isl. papp. fol. nr 73
   •   Isl. papp. fol. nr 88
   •   Isl. papp. fol. nr 90
British Library, London:
   •   BL Add. 4859


Secondary literature
Andrews, A. L. (1911). Studies in the fornaldarsögur Norðurlanda. Modern Philology, 8, 527–
       544.
Andrews, T. L. (2016). Analysis of Variation Significance in Artificial Traditions Using
       Stemmaweb. Digital Scholarship in the Humanities, 31(3), 523–539.
Bordalejo, B. (2015). The Genealogy of Texts: Manuscript Traditions and Textual Traditions.
       Digital Scholarship in the Humanities, 31(3), 563–577. Retrieved from
       https://doi.org/10.1093/llc/fqv038
Buzzoni, M., Burgio, E., Modena, M., & Simion, S. (2016). Open versus closed recensions
       (Pasquali): Pros and cons of some methods for computer-assisted stemmatology. Digital
       Scholarship in the Humanities, 31(3), 652. Retrieved from
       https://doi.org/10.1093/llc/fqw014
Felsenstein, J. (2013). PHYLIP (Phylogeny Inference Package) (Version 3.695). Seattle:
       Department of Genome Sciences. Retrieved from
       http://evolution.genetics.washington.edu/phylip/doc/main.html
Greg, W. W. (1927). The calculus of variants, an essay on textual criticism. Oxford: Clarendon
       Press.
Greg, W. W. (1950). The Rationale of Copy-Text. Studies in Bibliography, 3, 19–36.
Hall, A., & Parsons, K. (2013). Making stemmas with small samples, and digital approaches to
       publishing them: testing the stemma of Konráðs saga keisarasonar. Digital Medievalist, 9.
Howe, C., Barbrook, A., Mooney, L., & Robinson, P. (2004). Parallels Between Stemmatology
       and Phylogenetics. In Studies in Stemmatology II (pp. 3–11). Amsterdam/Philadelphia:
       John Benjamins Publishing Company.
Jucknies, R. (2009). Der Horizont eines Schreibers, Jón Eggertsson (1643-89) und seine
       Handschriften. Frankfurt am Main: Peter Lang.
Kölbing, E. (1876). Beiträge zur Vergleichenden Geschichte der Romantischen Poesie und Prosa
       des Mittelalters. Breslau: Verlag von Wilhelm Koebner.
O’Hara, R. J., & Robinson, P. M. W. (1993). Computer-assisted methods of stemmatic analysis.
       Occasional Papers of the Canterbury Tales Project, 1, 53–74.
Platnick, N. I., & Cameron, H. D. (1977). Cladistic Methods in Textual, Linguistic, and
       Phylogenetic Analysis. Systematic Zoology, 26(4), 380–385. Retrieved from
       https://doi.org/10.2307/2412794
Rafn, C. C. (1829). Fornaldarsögur Norðrlanda. Kaupmannahöfn.
Robinson, P. (1996). Computer-Assisted Stemmatic Analysis and “Best-Text” Historical Editing.
       In Studies in Stemmatology (pp. 71–104). Amsterdam: John Benjamins Publishing
       Company.
Robinson, P. (2004). The Old Norse Sólarljóð. NEXUS file created from “NexSeg.xml” Monday
       September 13 16:07:41 2004. Retrieved from
       http://www.textualscholarship.org/newstemmatics/data/Sol.nex
Robinson, P. (2016). Four rules for the application of phylogenetics in the analysis of textual
       traditions. Digital Scholarship in the Humanities, 31(3), 637. Retrieved from
       https://doi.org/10.1093/llc/fqv065
Robinson, P., & Bordalejo, B. (2010a). The New Stemmatics: Data. Retrieved from
       http://www.textualscholarship.org/newstemmatics/data/index.html
Robinson, P., & Bordalejo, B. (2010b). What is “The New Stemmatics”? Retrieved from
       http://www.textualscholarship.org/newstemmatics/index.html
Robinson, P., & O’Hara, R. J. (1996). Cladistic analysis of an Old Norse manuscript tradition.
       Research in Humanities Computing, 4, 115–137.
Salemans, B. (1987). Van Lachmann tot Hennig: cladistische tekstkritiek. Gramma, 11, 191–224.
Salemans, B. (1996). Cladistics or the Resurrection of the Method of Lachmann: On Building the
       Stemma of Yvain. In Studies in Stemmatology (Pieter van Reenen, Margot van Mulken,
       Janet Dyk). Amsterdam: John Benjamins Publishing Company.
Trovato, P. (2014). Everything you always wanted to know about Lachmann’s method: a non-
       standard handbook of genealogical textual criticism in the age of post-structuralism,
       cladistic, and copy-text. Padova: Libreriauniversitaria.it edizioni.
van Mulken, M. (1993). The manuscript tradition of the Perceval of Chrétien de Troyes. A
       stemmatological and dialectological approach. University of Amsterdam, Amsterdam.
Zeevaert, L., Hall, A., Kapitan, K. A., et al. (2013). A new stemma of Njáls saga - a working
       paper. Retrieved from
       https://www.academia.edu/7317515/A_New_Stemma_of_Njáls_saga


ACKNOWLEDGMENTS

This article reflects on the content of the paper presented during the International Digital
Humanities Symposium held in Växjö, Sweden, 7-8 November 2016, organized by Koraljka
Golub, Marcelo Milrad, and Tamara Laketic. I would like to thank the organizers for allowing me
to present my methodological considerations during the symposium and participants of the
conference for some stimulating discussions. The content of this article exceeds the subject of
my original presentation and it is based on the work I have conducted as a part of my PhD
fellowship at the University of Copenhagen (2015-2018). I would like to thank the supervisors of
my PhD project Matthew James Driscoll and Annette Lassen for their feedback, as well as my
colleagues Sheryl McDonald Werronen, Tarrin Wills, and Seán Douglas Vrieland, who read
drafts of this paper and helped to improve its style.