<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshop on Computational Humanities Research, November</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Between Flexibility and Universality: Combining TAGML and XML to Enhance the Modeling of Cultural Heritage Text</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elli Bleeker</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bram Buitendijk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ronald Haentjens Dekker</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>R&amp;D group, Humanities Cluster, Royal Netherlands Academy of Arts and Sciences</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>1</volume>
      <issue>4</issue>
      <fpage>8</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>This short paper first presents a conceptual workflow of a digital scholarly editor, and then illustrates how the smaller components of the workflow can be supported and advanced by technology. The focus of the paper is on the need to encode a historical text from multiple, co-existing research perspectives. Step by step, we show how this need translates to a computational pipeline, and how this pipeline can be implemented. The case study constitutes the transformation of a TAGML document containing multiple concurrent hierarchies into an XML document with one single leading hierarchy. We argue that this data transformation requires input from the editor who is thus actively involved in the process of text modeling.</p>
      </abstract>
      <kwd-group>
        <kwd>text modeling</kwd>
        <kwd>text encoding</kwd>
        <kwd>TEI XML</kwd>
        <kwd>overlapping hierarchies</kwd>
        <kwd>Multi-Colored Trees</kwd>
        <kwd>editorial workflows</kwd>
        <kwd>digital scholarly editing</kwd>
        <kwd>TAGML</kwd>
        <kwd>computational pipelines</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
How can we effectively model the workflow of a scholarly editor so that we can develop
computational technology to support and advance this research process? This has been the overarching
research question that informs the work of the R&amp;D group of the Humanities Cluster (part
of the Royal Netherlands Academy of Arts and Sciences). Considering the vastness of the
question, the fact that our research is ongoing, and the limited number of pages allowed for a
short paper, this contribution focuses on two smaller aspects of this question. First, how can
we model different research perspectives on the same cultural heritage text? And secondly,
how can we ensure that the resulting documents can be processed by generic text analysis
tools? Our research takes place in the context of the Text As Graph (TAG) model, under
development at the R&amp;D group since 2017, which is set up to address several long-existing
challenges for the digital editing of cultural heritage texts [
        <xref ref-type="bibr" rid="ref10 ref22 ref8">8, 10, 23</xref>
        ].
      </p>
      <p>
        Text encoding can be considered an intellectual activity as scholars are compelled to
translate their interpretation of the text into a computer-readable format. This includes selecting a
data model, a markup syntax, and a specific vocabulary to encode cultural heritage documents
such as literary or historical texts. The choices made here directly influence the subsequent
processing, querying, analysing, or repurposing of the encoded document. Ideally, then,
scholarly editors base their choice for a data model and an encoding vocabulary on their research
question(s), the goal(s) of the encoding project, and/or the properties of the source material.
In reality, the majority of editing projects opt for XML as a data model [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. XML is after
all the de facto standard for text encoding and accordingly omnipresent: the Text Encoding
Initiative (TEI) Guidelines [
        <xref ref-type="bibr" rid="ref24">25</xref>
        ] are currently based on XML; the family of X-languages offers
wide support for navigating, querying, and transforming XML documents; and many text
editing tools take XML as input format. However, the single-rooted, fully ordered tree structure
of XML offers only limited support for the modeling of cultural heritage texts (see section 2).
      </p>
      <p>The questions we address in this paper – how can we enable scholarly editors to model
different research perspectives on text and make the result widely available – also involve a
form of knowledge dissemination among the scholarly editing community: we want the user
to understand how their abstract, conceptual idea of the text corresponds with the way the
computer understands that text, how the textual data transforms during the editing process,
and how they may influence these transformations. To this end, we conceptualized the
workflow of a scholarly editor as a step-by-step process that can be subdivided into smaller tasks or
“modular components”. This conceptual workflow can be quite easily translated into a
computational pipeline in which the output from one step forms the input for the next. As a result,
we could develop technology to address the individual components, making conscious choices
about which step(s) to automate and which step(s) needed to remain a manual act because
they require user input.</p>
      <p>After briefly describing our conceptual model of the editorial workflow in section 3, we
discuss how we approached the translation from workflow to data model to implementation. After
introducing the data model of TAG (section 4.1), we highlight two features of the
implementation: the representation of multiple perspectives on a text (section 4.2) and the export from
one data format to another (section 4.3). In both cases we will indicate how these features
correspond with the steps in the conceptual editorial workflow. Section 5 dives deeper into
the export feature by providing the high-level description of the code flow of the export, and
section 6 illustrates the transformation process with a small input and output sample.<sup>2</sup></p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        Probably the most exemplified limitation of XML for text modeling is the fact that the
tree structure does not inherently provide for overlapping, discontinuous, or non-linear
structures [
        <xref ref-type="bibr" rid="ref11 ref21 ref26 ref8">11, 27, 22, 8</xref>
        ]. These types of structures are nevertheless common in cultural heritage
texts: the various research perspectives from which a text can be encoded often constitute
different hierarchies that (partly) overlap [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Since modeling a text from different research
perspectives remains a widely acknowledged objective of digital scholarly editing, several
alternatives to XML for text encoding have been developed, ranging from stand-off approaches
to entirely new markup languages (see [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]; see also [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] for a more recent overview of these
alternatives).
      </p>
      <p>
        When using XML, scholarly editors are compelled to use local, project-specific approaches
or to do significant additional coding work in order to model overlapping structures.
<fn id="fn2"><label>2</label><p>It should be noted that the implementation of the pipeline is currently at prototype level. Still, we consider
our findings thus far to be relevant for a productive discussion on conceptual models of scholarly activities, and
on the extent to which these can be delegated to software.</p></fn>
As a consequence, the result is in a non-generic file format or uses proprietary software. The
result will thus significantly hinder any subsequent querying and interchange of the encoded
documents [
        <xref ref-type="bibr" rid="ref18 ref3">19, 3</xref>
        ]. The approach of the digital edition of Johann Wolfgang Goethe’s Faust
(2016) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is a good example of a local solution to modeling a text from multiple,
overlapping perspectives. The entire text of Faust has been transcribed in TEI/XML several times.
Each transcription represents a different research perspective and has a different hierarchical
structure. The transcriptions are subsequently synchronized using the collation algorithm of
CollateX [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], and the result is stored in a graph database similar to an MCT data structure [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
Users can switch between perspectives in the edition’s graphical user interface.
      </p>
      <p>
        Another relevant example is presented by the Annotated Turki Manuscripts from the Jarring
Collection Online (ATMO) project [
        <xref ref-type="bibr" rid="ref20">21</xref>
        ]. This project combines regular embedded TEI/XML
markup with Trojan Horse markup [
        <xref ref-type="bibr" rid="ref11 ref20">11, 21</xref>
        ]. Trojan Horse markup elements are a specific
type of milestone element or “marker” that carries the namespace prefix th: to
differentiate it from regular milestone elements. Two related markers are linked by means of
matching start and end attributes, so the regular XML &lt;s&gt;The sun is yellow&lt;/s&gt; becomes
&lt;th:s sID="foo"/&gt;The sun is yellow&lt;th:s eID="foo"/&gt; in Trojan Horse markup. It is
a well-known strategy for expressing (partly) overlapping hierarchies in XML, and various
XSLT strategies exist to convert Trojan Horse markup to regular XML content elements and vice
versa [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In the ATMO project, transcriptions contain three hierarchical structures, and
none of them is permanently dominant: users are able to switch between dominant structures
via an XSLT template that converts the Trojan Horse milestones to regular XML content
elements. A third approach to representing multiple textual perspectives that makes use of XML
and related X-technologies is Concurrent XML [
        <xref ref-type="bibr" rid="ref12 ref7">12, 7</xref>
        ], which represents multiple concurrent
hierarchical structures in XML by dividing text into “atomic text nodes” (usually based on
word division). Each node is reachable via an XPath expression locating its place in one or
more hierarchies.
      </p>
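      <p>The Trojan Horse convention described above can be illustrated with a minimal Python sketch. It assumes attribute-free tags and well-formed input; the function name flatten and the generated marker IDs (s1, s2, and so on) are hypothetical and chosen for illustration only. It is not the XSLT used in the ATMO project.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re


def flatten(xml: str, tag: str) -> str:
    """Replace each <tag>...</tag> pair with Trojan Horse start/end markers.

    Assumption: tags carry no attributes and the input is well-formed.
    """
    counter = 0
    open_ids = []  # stack of marker IDs for elements that are still open

    def to_marker(match):
        nonlocal counter
        if match.group(1) == "/":
            # Close tag: reuse the ID of the most recently opened element.
            return f'<th:{tag} eID="{open_ids.pop()}"/>'
        # Open tag: issue a fresh (hypothetical) marker ID.
        counter += 1
        marker_id = f"{tag}{counter}"
        open_ids.append(marker_id)
        return f'<th:{tag} sID="{marker_id}"/>'

    return re.sub(rf"<(/?){re.escape(tag)}>", to_marker, xml)


flattened = flatten("<s>The sun is yellow</s>", "s")
print(flattened)
```
]]></preformat>
      <p>Because the start and end markers are empty elements, they can cross the boundaries of regular content elements without violating XML well-formedness, which is what makes the convention suitable for overlapping hierarchies.</p>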
      <p>
        Finally, there exist several stand-off systems, such as the Just In Time Markup system
(1999–2005), in which users create “on demand user-customized versions of electronic editions of
historical documents” [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], or the stand-off properties system of Desmond Schmidt and others [
        <xref ref-type="bibr" rid="ref19">20</xref>
        ].
The downsides of stand-off systems are that they often require a specialized, non-generic
editing environment, and that they require a stable base text (while in practice the text is often unstable
and subject to change). What is more, they usually produce quite illegible transcriptions (see
[
        <xref ref-type="bibr" rid="ref20">21</xref>
        ]). So even though it may not be the perfect choice for text modeling from a
computational perspective, XML’s substantial role in text encoding and the significant number
of XML-based text processing and analysis tools make it an undeniable community standard.
Alternative text encoding approaches are therefore more likely to succeed if they take XML
into account, instead of offering a completely new encoding model.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Conceptual model</title>
      <p>
        In order to best accommodate the work of digital scholarly editors, we first mapped their
research activities (see figure 1).<fn id="fn3"><label>3</label><p>This workflow is inspired by [
        <xref ref-type="bibr" rid="ref13 ref25">26, 18, 13</xref>
        ] among others.</p></fn> The model has three “levels”: the upper level represents the
intellectual activity of the editor; the middle level shows the editor’s action(s) associated with
this thinking; the third and lowest level indicates the product(s) of the actions. Starting at the
upper left corner, for example, we can see that the research perspective of the editor influences
the conceptual modeling of the source text. When encoding, annotating, and linking the text,
the editor has to make decisions about syntax, vocabulary, schemata, etc. The community’s
standards are also at play here, as they are with the digitization of the source text. The actions
of analyzing and/or querying are again influenced by the project’s research objective. They
produce a selection of the information contained in the transcription, which can subsequently
be represented in different ways: via a published edition of the text, a visualization, a data
set, etc.
      </p>
      <p>What we intend to demonstrate with this visualization is how an editor’s research
perspective(s) on the source text translates to a choice for a certain data model, a syntax, a markup
vocabulary, a normalization policy, etc. In turn, these choices influence the way the text can
be analyzed, queried or represented. The visualization also shows the importance of an editor
who knows what they want to do with the text in terms of querying, visualising, or exporting:
knowing this at the outset helps them decide what would be the most suitable data format,
markup strategy, and tools. The TAG data model addresses primarily the steps of encoding,
annotating, linking, analyzing, and querying in the workflow. In the remainder of the paper,
we will focus on the encoding step, showing how TAG allows for the encoding of multiple,
co-existing research perspectives on the source text, and on how these encoded documents can
be exported to XML for analysis or publication.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Implementation</title>
      <sec id="sec-4-1">
        <title>4.1. Data model</title>
        <p>
          In order to computationally support the encoding step in the workflow, TAG makes use of
a variant of a General Ordered-Descendant Directed Acyclic Graph (GODDAG) [
          <xref ref-type="bibr" rid="ref21">22</xref>
          ]. The data
model is based on the Multi-Colored Trees model (MCT), which permits nodes to be shared by
multiple hierarchies that are distinguishable by color [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The TAG data model distinguishes
Text nodes, Markup nodes, and Annotation nodes. In contrast to the mono-hierarchical, fully
ordered tree that is implied by XML, the fact that TAG is based on a graph data model means
that it can offer much more flexibility in modeling documents from different perspectives,
including overlapping hierarchies and non-linear or discontinuous structures [
          <xref ref-type="bibr" rid="ref10 ref3">10, 3</xref>
          ]. As a
result, we find that TAG is able to surpass the limitations of XML for text modeling.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Alexandria: from research perspectives to views</title>
        <p>
          TAG’s reference implementation is the text-repository system Alexandria. In contrast to many
local or technologically complex approaches to modeling different perspectives on text (see
section 2), the code of Alexandria is open source and designed to be implemented in any
editorial workflow.<sup>4</sup> Users can directly encode texts in the TAG markup language TAGML [
          <xref ref-type="bibr" rid="ref23">24</xref>
          ]
and upload the TAGML documents to the Alexandria repository. This is the functionality that
corresponds to the encoding step in the editorial workflow. Via Alexandria’s command-line
interface, users can subsequently query the TAGML documents (the querying step), or export
them to other formats like PNG, SVG, DOT, and XML for analysis or publication (cf. the
analyzing and representing steps).
        </p>
        <p>
          The TAGML syntax is similar to that of embedded markup systems like XML or
TexMECS [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], with a markup element consisting of an open and a close tag:
[s&gt;This is an [del&gt;easy&lt;del] example of [add&gt;the&lt;add] TAGML syntax.&lt;s]
Additionally, TAGML markup elements can be grouped together in sets or “layers”. A layer of
markup elements may be used to express a perspective on text. Each TAGML document can
contain one or more layers of markup elements, and markup elements may be shared between
layers. A TAGML document could for example consist of a transcription of a poem’s text plus
three layers of markup: one layer of markup that describes the physical aspects of the poem;
one layer of markup that describes its poetic structure; and one layer of markup that describes
the poem’s linguistic features. In this
way, it becomes easy for the editor to group together the markup elements that represent their
research perspective on the text. Note the relationship between the act of encoding and the
editor’s research perspective as visualized in the workflow above.
        </p>
        <p>Markup elements are associated with a layer by means of a layer suffix on the open tag:
[s|+T,+D&gt;This is an [del|D&gt;easy&lt;del] example of [add|T,D&gt;the&lt;add]
TAGML syntax.&lt;s]
In the example above, the s element is associated with the layers T and D; the del element is
associated with the layer D; and the add element is again part of both the T and the D layer.
Each layer in the TAGML is represented by a color in the MCT data model. At the time of
writing, the markup elements within each layer are hierarchically ordered. This means that
each color in the MCT forms one single-rooted tree. Using the layer functionality, TAGML
markup elements can overlap. Figure 2 shows a visualization of the following TAGML sentence:
[s|+T,+D&gt;This is an [del|D&gt;easy [add|T&gt;example&lt;del] illustration&lt;add]
of the TAGML syntax.&lt;s]</p>
        <p>A TAGML document in the Alexandria repository can contain many (layers of) markup
elements, but it is safe to assume that editors do not always want to see all markup. Especially
documents containing many annotations can quickly become near-illegible for human readers.
So, in addition to grouping together markup elements in layers, Alexandria also allows users
to generate a new document that contains only a selection of information from the master
TAGML document. The user can indicate which information they want to retain and which
information can be ignored in a separate file, called a “view definition”.
<fn id="fn4"><label>4</label><p>Interested users are invited to explore the use of the tool for their own textual material. See
https://huygensing.github.io/alexandria/, last accessed September 14, 2020.</p></fn></p>
        <p>
          In Alexandria, views work as a filter mechanism: users indicate in a view what markup
and/or layers they want to see. Alexandria works similarly to Git: the user can use
the view to “checkout” a selection of the information in the TAGML document, while
the master document remains intact in the repository (see figure 3 for a visualization
of this workflow). The example above showed a set of markup elements grouped
together in a layer with the ID D. A view definition is expressed in JSON [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] and, in
this example, will look as follows: {"includeLayers":["D"]}. The editor can also
filter on specific markup elements by adding an includeMarkup entry to the view definition:
{"includeLayers":["D"],"includeMarkup":["del"]}. Checking out a TAGML
document with this view definition generates a new TAGML document containing only the selected
markup elements and the selected layer(s). In this case, the checkout generates a TAGML file
that contains all markup elements with the layer ID D plus all del elements. The edited
document can subsequently be exported as XML, SVG, or PNG. The export files are not
intended to be edited, but can be used for visualizing, analyzing, or publication purposes. This
corresponds respectively with the analyzing, querying, and representing steps.
        </p>
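        <p>The filtering behaviour described above can be sketched in Python. The keys includeLayers and includeMarkup are taken from the view definition shown in the text; the Markup class, the function select_markup, and the reading that the two keys combine as a union (everything in layer D plus all del elements) are assumptions made for illustration, not Alexandria’s actual implementation.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Markup:
    """Hypothetical stand-in for a TAGML markup element."""
    name: str
    layers: frozenset


def select_markup(markup, view_json):
    """Keep markup that is in an included layer or has an included name."""
    view = json.loads(view_json)
    layers = set(view.get("includeLayers", []))
    names = set(view.get("includeMarkup", []))
    return [m for m in markup if m.layers & layers or m.name in names]


# Markup of the overlapping example sentence: s in T and D, del in D, add in T.
document = [
    Markup("s", frozenset({"T", "D"})),
    Markup("del", frozenset({"D"})),
    Markup("add", frozenset({"T"})),
]

selected = select_markup(document, '{"includeLayers": ["D"]}')
print([m.name for m in selected])
```
]]></preformat>
        <p>In this sketch the master list of markup is never modified; the view only determines which elements end up in the newly generated selection, mirroring the checkout behaviour described above.</p>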
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Exporting to XML and Trojan Horse markup</title>
        <p>In the following paragraphs, we outline how we handle the graph-to-tree conversion of the
MCT-to-XML export. From a technical perspective, transforming a graph into a tree is rather
straightforward. However, since the transformation of a graph with multiple concurrent
hierarchies into a mono-hierarchical tree inevitably leads to information loss, it presents some
philological questions. What information in the TAGML document is relevant, what markup
elements should form the leading XML hierarchy, and what information can be left out and/or
does not need to be part of the leading hierarchy? Since the answers to these questions are
based on the editor’s research perspectives and on what they want to do with the exported
XML document (e.g., analyze or visualize), they may vary from time to time.</p>
        <p>Again, as figure 1 shows, the editor’s conceptual idea of the text translates to their actions
which in turn affect the output. In order to ensure that the XML output aligns with the
editor’s objectives, they can provide their preferences via Alexandria’s view feature. By
indicating which hierarchy (i.e., which layer of markup elements) should be leading, the editor
can coordinate the conversion of a document with several overlapping hierarchies to one single
hierarchy. TAGML elements that belong to the leading hierarchy are transformed into XML
content elements; TAGML elements that are not part of the leading hierarchy are transformed
into Trojan Horse markup elements.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Computational pipeline</title>
      <p>This section describes the export function from a higher-level perspective. The flow chart
in figure 4 starts with a TAGML document and a view document that are uploaded to the
Alexandria repository. The steps that follow are listed below:</p>
      <list list-type="order">
        <list-item><p>The user gives the xml export command to the Alexandria server.</p></list-item>
        <list-item><p>The TAGTraverser iterates over the MCT of the TAG document, using information from
the TAG view.</p></list-item>
        <list-item><p>The TAGTraverser generates a stream of Events. It traverses the
nodes in topological order and creates TextEvent, MarkupOpen, or MarkupClose events.
A Text Node generates a TextEvent, and a Markup Node generates a MarkupOpen event
for that Node. For the MarkupClose events, we need to keep track of the markup that is
currently open by using stacks. Every Markup Node is added to the relevant color
stack and to the global stack. For each Node, before we can generate a TextEvent or
MarkupOpen event, we have to check the top of the relevant color stacks for Markup that
is not a parent of the current Node. Those Markups generate MarkupClose events and
are removed from the color stacks and the global stack. After all Nodes have been
processed, the Markup that is left on the global stack generates the remaining MarkupClose
events.</p></list-item>
        <list-item><p>Each Event is checked to determine whether it is an open tag, a close tag, or text characters.</p></list-item>
        <list-item><p>If the Event is text, the characters are transformed into an XML text node.</p></list-item>
        <list-item><p>If the Event is an open tag or a close tag, it is checked whether the tag is part of the leading
hierarchy:</p>
          <list list-type="alpha-lower">
            <list-item><p>If not, the open tag or close tag is transformed into a Trojan Horse start or end
element, respectively.</p></list-item>
            <list-item><p>If the tag is part of the leading hierarchy, the open tag or close tag is transformed
into an XML open tag or an XML close tag.</p></list-item>
          </list>
        </list-item>
      </list>
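      <p>Steps 4 to 6 above can be sketched in Python as follows. The event classes mirror the event names mentioned in step 3, but their fields, the function export_xml, and the marker numbering scheme are assumptions made for illustration; this is not Alexandria’s actual code. The sketch also omits the XML declaration and the root element that declares the th namespace.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class TextEvent:
    text: str


@dataclass
class MarkupOpen:
    name: str
    layers: tuple  # layer IDs the markup belongs to (assumed field)


@dataclass
class MarkupClose:
    name: str
    layers: tuple


def export_xml(events, leading):
    """Serialize an event stream with `leading` as the dominant hierarchy."""
    out = []
    counters = {}  # tag name -> number of Trojan Horse markers issued
    open_ids = {}  # tag name -> stack of marker IDs that are still open
    for ev in events:
        if isinstance(ev, TextEvent):
            out.append(escape(ev.text))  # step 5: XML text node
        elif leading in ev.layers:
            # step 6b: leading hierarchy becomes regular content elements
            is_open = isinstance(ev, MarkupOpen)
            out.append(f"<{ev.name}>" if is_open else f"</{ev.name}>")
        else:
            # step 6a: everything else becomes Trojan Horse markers
            doc = " ".join(ev.layers)
            if isinstance(ev, MarkupOpen):
                counters[ev.name] = counters.get(ev.name, 0) + 1
                marker = f"{ev.name}{counters[ev.name]}"
                open_ids.setdefault(ev.name, []).append(marker)
                out.append(f'<{ev.name} th:doc="{doc}" th:sId="{marker}"/>')
            else:
                marker = open_ids[ev.name].pop()
                out.append(f'<{ev.name} th:doc="{doc}" th:eId="{marker}"/>')
    return "".join(out)


# Hand-written event stream for the overlapping example sentence of section 4.2.
events = [
    MarkupOpen("s", ("T", "D")),
    TextEvent("This is an "),
    MarkupOpen("del", ("D",)),
    TextEvent("easy "),
    MarkupOpen("add", ("T",)),
    TextEvent("example"),
    MarkupClose("del", ("D",)),
    TextEvent(" illustration"),
    MarkupClose("add", ("T",)),
    TextEvent(" of the TAGML syntax."),
    MarkupClose("s", ("T", "D")),
]

result = export_xml(events, leading="D")
print(result)
```
]]></preformat>
      <p>With the D layer leading, the del element is written as a regular content element, while the add element of the T layer is written as a pair of Trojan Horse markers that may cross the boundary of del.</p>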
    </sec>
    <sec id="sec-6">
      <title>6. Evaluation</title>
      <p>By way of example, we return to the TAGML example containing overlapping markup (see
section 4.2):
[s|+T,+D&gt;This is an [del|D&gt;easy [add|T&gt;example&lt;del] illustration&lt;add]
of the TAGML syntax.&lt;s]
Suppose we have indicated in the view definition that the D layer should be leading in the
XML export. The markup elements that are part of the T layer will thus be exported as Trojan Horse
elements. The XML output looks as follows:
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;xml xmlns:th="http://www.blackmesatech.com/2017/nss/trojan-horse" th:doc="D T"&gt;
&lt;s&gt;This is an &lt;del&gt;easy &lt;add th:doc="T" th:sId="add1"/&gt;example&lt;/del&gt;
illustration&lt;add th:doc="T" th:eId="add1"/&gt; of
&lt;add th:doc="T" th:sId="add2"/&gt;the
&lt;add th:doc="T" th:eId="add2"/&gt; TAGML syntax.
&lt;/s&gt;
&lt;/xml&gt;</p>
      <p>Switching from flattened XML into hierarchically structured XML and back again is a
common need in the XML community and there exists a large body of XSLT patterns designed
specifically for this purpose.<sup>5</sup> Still, though “flattening” and “raising” the hierarchies in XML
documents is by no means a novel activity, this practice usually takes place within a digital
edition project, which means that the code is tailored to a specific text and rarely shared as
part of the project’s output. The idea presented in this paper, by contrast, is to encourage
editors to create a custom, modular workflow in which they can use both Alexandria and, via
the TAGML-to-XML export feature, generic XML-based tools.</p>
      <p>While generating an XML tree from an MCT is computationally straightforward (a
depth-first graph traversal), as far as we know it is unprecedented to put the user in control by
including them in the conversion process, and to take this process out of a customized,
project-specific environment. As a consequence, we believe that Alexandria provides a more
powerful and flexible approach to text modeling. This paves the way for an editorial modeling
workflow that strikes the right balance between flexibility and universality.</p>
      <p>As mentioned above, the TAG model is currently under active development. Among other things,
it is currently not possible for multiple users to collaborate in Alexandria, since the repository
is now only initialized locally. We aim to allow the user of Alexandria to check out and edit
a TAGML document, and to upload the document again to Alexandria. Similar to Git, this
upload first diffs the master document and the edited document. Any detected changes are
subsequently merged into the master document. Such a diff and merge is quite straightforward
for Git (as well as for stand-off systems), but because we want to track the edits in a TAGML
document on the level of the markup as well as on the level of the text, we require a more
advanced diff that is very challenging to implement. A second point of attention is that thinking
about data transformations and working with Alexandria on the command line requires a level
of technical know-how that is not available to all scholarly editors. For that reason, we are
keen to provide training in TAG text modeling via workshops, summer schools, and online
Jupyter Notebooks. Similar to the present contribution, these courses aim to illustrate the
relationships between a conceptual, abstract model and its technical implementation(s).</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>This paper discussed the text modeling approach of Alexandria and focused on the intellectual
process of translating a conceptual model to a technical implementation. Using an MCT data
structure, Alexandria facilitates the modeling and encoding of cultural heritage text. Editors
can express different research perspectives (“views”) on the source text and subsequently export
a view to XML. If the XML output contains multiple overlapping hierarchies, the editor can
indicate which hierarchy should be leading. The other hierarchical structures are expressed
in Trojan Horse milestone elements. The XML-export function of Alexandria allows editors
to build a custom pipeline in which they combine the best of both worlds: TAG’s flexible
text modeling capacity with XML’s generic publication tools. The result is an editorial
workflow that strikes the right balance between flexibility and universality. The editor’s close
involvement in every step of the text modeling process contributes to their control over
and insight into the various transformations the data undergoes.</p>
      <p>
        <fn id="fn5"><label>5</label><p>[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] presents an extensive survey of at least seven possible approaches to raising flattened XML, four of
which use XSLT.</p></fn>
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Barwell</surname>
          </string-name>
          et al.
          <source>Authenticated Electronic Editions Project: A Progress Report</source>
          .
          <year>2001</year>
          . url: https://ro.uow.edu.au/cgi/viewcontent.cgi?article=1535&amp;context=artspapers.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Birnbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E.</given-names>
            <surname>Beshero-Bondar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Sperberg-McQueen</surname>
          </string-name>
          .
          <article-title>“Flattening and Unflattening XML Markup: a Zen Garden of XSLT and Other Tools”</article-title>
          .
          <source>In: Proceedings of Balisage: The Markup Conference</source>
          . Vol.
          <volume>21</volume>
          .
          <year>2018</year>
          . doi: https://doi.org/10.4242/BalisageVol21.HaentjensDekker01.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Bleeker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Buitendijk</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Dekker</surname>
          </string-name>
          . “
          <article-title>Marking Up Microrevisions with Major Implications”</article-title>
          .
          <source>In: Proceedings of Balisage: The Markup Conference</source>
          . Vol.
          <volume>25</volume>
          .
          <year>2020</year>
          . doi: https://doi.org/10.4242/BalisageVol25.Bleeker01.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <article-title>“Faustedition”</article-title>
          . Ed. by
          <string-name>
            <given-names>A.</given-names>
            <surname>Bohnenkamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Henke</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Jannidis</surname>
          </string-name>
          .
          <year>2016</year>
          . url: http://beta.faustedition.net/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Bray</surname>
          </string-name>
          et al.
          <source>XML 1.0. Tech. rep. W3C</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Coombs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. H.</given-names>
            <surname>Renear</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. J.</given-names>
            <surname>DeRose</surname>
          </string-name>
          .
          <article-title>“Markup Systems and the Future of Scholarly Text Processing”</article-title>
          .
          <source>In: The Digital Word: Text-Based Computing in the Humanities</source>
          . Ed. by
          <string-name>
            <given-names>G. P.</given-names>
            <surname>Landow</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Delany</surname>
          </string-name>
          .
          <year>1993</year>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>118</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Dekhtyar</surname>
          </string-name>
          and
          <string-name>
            <given-names>I. E.</given-names>
            <surname>Iacob</surname>
          </string-name>
          .
          <article-title>“A Framework for Management of Concurrent XML Markup”</article-title>
          .
          <source>In: Data &amp; Knowledge Engineering 52.2</source>
          (
          <year>2005</year>
          ), pp.
          <fpage>185</fpage>
          -
          <lpage>208</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Dekker</surname>
          </string-name>
          and
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Birnbaum</surname>
          </string-name>
          .
          <article-title>“It's More Than Just Overlap: Text As Graph”</article-title>
          .
          <source>In: Proceedings of Balisage: The Markup Conference</source>
          . Vol.
          <volume>19</volume>
          .
          <year>2017</year>
          . doi: https://doi.org/10.4242/BalisageVol19.Dekker01.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Dekker</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Middell</surname>
          </string-name>
          .
          <source>CollateX</source>
          .
          <year>2019</year>
          . url: https://collatex.net/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R. H.</given-names>
            <surname>Dekker</surname>
          </string-name>
          et al.
          <article-title>“TAGML: a Markup Language of Many Dimensions”</article-title>
          .
          <source>In: Proceedings of Balisage: The Markup Conference</source>
          . Vol.
          <volume>21</volume>
          .
          <year>2018</year>
          . doi: https://doi.org/10.4242/BalisageVol21.HaentjensDekker01.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>DeRose</surname>
          </string-name>
          .
          <article-title>“Markup Overlap: A Review and a Horse”</article-title>
          .
          <source>In: Proceedings of the Extreme Markup Languages</source>
          .
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Durusau</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>O'Donnell</surname>
          </string-name>
          .
          <article-title>“Implementing Concurrent Markup in XML”</article-title>
          .
          <source>In: Extreme Markup Languages</source>
          . Vol.
          <volume>2001</volume>
          .
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Hoekstra</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Koolen</surname>
          </string-name>
          .
          <article-title>“Data Scopes for Digital History Research”</article-title>
          .
          <source>In: Historical Methods: A Journal of Quantitative and Interdisciplinary History 52.2</source>
          (
          <year>2019</year>
          ), pp.
          <fpage>79</fpage>
          -
          <lpage>94</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Huitfeldt</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Sperberg-McQueen</surname>
          </string-name>
          .
          <source>TexMECS: An Experimental Markup Meta-Language for Complex Documents</source>
          .
          <year>2001</year>
          . url: http://www.hit.uib.no/claus/mlcd/papers/texmecs.html.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>H.</given-names>
            <surname>Jagadish</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Lakshmanan</surname>
          </string-name>
          .
          <article-title>“Colorful XML: One Hierarchy Isn't Enough”</article-title>
          .
          <source>In: Proceedings of the 2004 ACM SIGMOD international conference on management of data. ACM</source>
          ,
          <year>2004</year>
          . doi: https://doi.org/10.1145/1007568.1007598.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>JSON</surname>
          </string-name>
          .
          <article-title>JavaScript Object Notation (JSON)</article-title>
          .
          <year>2007</year>
          . url: https://www.json.org/json-en.html.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>G.</given-names>
            <surname>Middell</surname>
          </string-name>
          .
          <article-title>On the Value of Comparing Truly Remarkable Texts</article-title>
          .
          <source>Presented at the symposium “Knowledge Organization and Data Modeling in the Humanities”</source>
          .
          <year>2012</year>
          . url: https://datasymposium.wordpress.com/middell/.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>D.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          .
          <article-title>“A Model of Versions and Layers”</article-title>
          .
          <source>In: DHQ: Digital Humanities Quarterly 13.3</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Colomb</surname>
          </string-name>
          .
          <article-title>“A Data Structure for Representing Multi-version Texts Online”</article-title>
          .
          <source>In: International Journal of Human-Computer Studies 67.6</source>
          (
          <year>2009</year>
          ), pp.
          <fpage>497</fpage>
          -
          <lpage>514</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>C.</given-names>
            <surname>Sperberg-McQueen</surname>
          </string-name>
          .
          <article-title>“Representing Concurrent Document Structures Using Trojan Horse Markup”</article-title>
          .
          <source>In: Proceedings of Balisage: The Markup Conference</source>
          . Vol.
          <volume>21</volume>
          .
          <year>2018</year>
          . doi: https://doi.org/10.4242/BalisageVol21.Sperberg-McQueen01.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>C.</given-names>
            <surname>Sperberg-McQueen</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Huitfeldt</surname>
          </string-name>
          .
          <article-title>“GODDAG: A Data Structure for Overlapping Hierarchies”</article-title>
          .
          <source>In: Lecture Notes in Computer Science</source>
          . Ed. by
          <string-name>
            <given-names>P.</given-names>
            <surname>King</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Munson</surname>
          </string-name>
          . Vol.
          <volume>2023</volume>
          . Berlin: Springer-Verlag,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [23]
          <string-name>
            <surname>TAG</surname>
          </string-name>
          .
          <article-title>Text As Graph</article-title>
          .
          <source>Version alexandria 2.3</source>
          .
          <year>2019</year>
          . url: https://huygensing.github.io/TAG/.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [24]
          <string-name>
            <surname>TAGML</surname>
          </string-name>
          .
          <article-title>Text as Graph Markup Language</article-title>
          .
          <year>2019</year>
          . url: https://github.com/HuygensING/TAG/tree/master/TAGML.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [25]
          <string-name>
            <surname>TEI-Consortium</surname>
          </string-name>
          .
          <article-title>TEI P5: Guidelines for Electronic Text Encoding and Interchange</article-title>
          .
          <source>Version 4.0.0</source>
          .
          <year>2019</year>
          . url: http://www.tei-c.org/Guidelines/P5/.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Unsworth</surname>
          </string-name>
          .
          <article-title>“Scholarly Primitives: What Methods Do Humanities Researchers Have in Common, and How Might Our Tools Reflect This?”</article-title>
          .
          <source>In: Symposium on Humanities Computing: Formal Methods, Experimental Practice. King's College</source>
          , London. Vol.
          <volume>13</volume>
          .
          <year>2000</year>
          , pp.
          <fpage>5</fpage>
          -
          <lpage>00</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>A.</given-names>
            <surname>Witt</surname>
          </string-name>
          .
          <article-title>“Multiple Hierarchies: New Aspects of an Old Solution”</article-title>
          .
          <source>In: Proceedings of the Extreme Markup Languages</source>
          .
          <year>2004</year>
          . url: http://www.mulberrytech.com/Extreme/Proceedings/html/2004/Witt01/EML2004Witt01.html.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>