<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Whole is Greater than the Sum of its Parts: Analyzing Aristotle Commentaries in Collaboration Between Philology and Data Science</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Germaine Gotzelmann</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Philipp H</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soring</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Freie Universitat Berlin</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Karlsruhe Institute of Technology</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Technical University of Darmstadt</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <fpage>103</fpage>
      <lpage>114</lpage>
      <abstract>
        <p>This paper presents the added value of collaboration between philologists and data scientists in research on digitized medieval manuscripts. Both the great potential and the challenges of such a collaboration are addressed. The case study originates from research conducted in the Collaborative Research Center “Episteme in Motion. Transfer from the Ancient World to the Early Modern Period”, which is located at the Freie Universität Berlin and funded by the German Research Foundation (DFG). One of the goals of this collaboration is to advance research questions in which the data basis is complex, or even too complex, for traditional research methods. The case study deals with knowledge transfer and text transmission in manuscripts of Aristotle's ancient Greek treatises on logic, the so-called Organon, and focuses on the manuscripts of his work de interpretatione (On Interpretation) and on the commentaries and explanations found as paratexts in these manuscripts.</p>
      </abstract>
      <kwd-group>
        <kwd>Information Infrastructure ters De interpretatione</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The collaboration between one of the humanities projects that deals with Aristotelian works on the one side and the information infrastructure in use in the CRC on the other side will illustrate the specific requirements and challenges that this kind of collaboration generates.</p>
    </sec>
    <sec id="sec-2">
      <title>The Humanities Research Problem</title>
      <p>
        Aristotle's writings were transmitted in manuscript form (in codices) during the Middle Ages. Today, we still possess approximately 150 copies of his logical treatise de interpretatione, which date from the 9th to the 16th century. They are preserved in many different libraries worldwide. Together with the manuscripts of the Categories, another Aristotelian treatise belonging to his logical writings, this is the highest number of preserved manuscripts of a pagan ancient Greek text. Although these 150 manuscripts are only those that were not lost or destroyed in the course of the centuries, the comparatively enormous number of preserved manuscripts reveals an intensive engagement with Aristotelian logic among scholars between the 9th and 16th centuries [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Further evidence for the scholarly engagement with the treatise is found inside the manuscripts themselves. The manuscripts contain not only the text of the treatise but also various forms of explanations on the pages, a kind of so-called paratext ([
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] for a general overview of the paratexts in the manuscripts of the Aristotelian treatises).<sup>4</sup> Even the pages of the earliest manuscripts were structured from the beginning so that longer explanations could be added, e.g. by leaving a wide margin. Other manuscripts also show a large space between the individual lines to give room for explanations of single words (the so-called interlinear glosses).<sup>5</sup>
      </p>
      <p>
        On the basis of paleographical particularities it can also be observed that the margins or interlinear spaces were often filled by different scholars or scribes at different times, sometimes over the course of centuries. All these paratexts (glosses, logical diagrams, scholia and commentaries) show that scholars worked with these manuscripts and that the manuscripts were the basis for teaching and learning Aristotelian logic within schools or erudite circles (examples are given in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]).
      </p>
      <p>Both the high number of preserved manuscripts of the Aristotelian treatise de interpretatione and its integration into logical instruction have fascinated scholars for centuries from a philological, philosophical and historical perspective. Two research fields and challenges in particular can be stressed: (1) the philological question of what the original Aristotelian text looked like; (2) the challenge of following the paths along which this Aristotelian treatise spread over the centuries. Determining the way the manuscripts travelled and detecting epistemic centers where many manuscripts were copied amounts to identifying historical times and places with a particular interest in Aristotelian logic.</p>
      <p><sup>4</sup> For an overview of the manuscripts and whether they contain paratexts cf. http://pinakes.irht.cnrs.fr/notices/oeuvre/2973/ (2020-08-13) and particularly https://cagb-db.bbaw.de/register/werke.xql?cRef=Int. (2020-08-13), although neither list contains all preserved manuscripts of Aristotle's treatise de interpretatione.</p>
      <p><sup>5</sup> To mention only two manuscripts by way of illustration, see the codices Vatican, BAV, Urb. Gr. 35 (9th century), f. 55v.: https://digi.vatlib.it/view/MSS_Urb.gr.35 (2020-08-13), and Paris, BNF, Par. Gr. 1845 (end of the 13th / beginning of the 14th century), f. 33v.: https://gallica.bnf.fr/ark:/12148/btv1b107218100/f70.image (2020-08-13).</p>
      <p>Research on the first, philological question has encountered significant problems and challenges which have not yet been resolved. It is a long tradition in philology, particularly in classical philology, to endeavour to edit the text of an ancient author in a form that comes as close as possible to the original. Related manuscripts usually contain the same mistakes or the same textual variants. While copying the text, scribes made mistakes, e.g. unintentionally omitting words. Text editors have to find and eliminate such mistakes and opt for the textual variant which they consider to be the authentic one.<sup>6</sup></p>
      <p>
        The problem which philologists have encountered in the case of Aristotle's work de interpretatione is a so-called highly contaminated transmission of the text. Contaminated means that scribes or scholars copied the text not from one older copy only, but from more than one, whereby these older copies showed different variants in individual passages. In such cases, copying the text went hand in hand with a philological and philosophical interpretation and examination of the text, as is found within an erudite school or teaching context, for which the texts were also copied and annotated with paratexts. Because such contamination is widespread in the manuscripts of the Aristotelian logic [
        <xref ref-type="bibr" rid="ref14 ref8">8, 14</xref>
        ], one of the greatest experts in the field of Aristotelian manuscript studies has stated that retracing the relations between the manuscripts of the Aristotelian logical writings, including de interpretatione, is such a complex task that it cannot be solved within the lifetime of a scholar [11, pp. 57-69].
      </p>
      <p>Since scholars who have dealt with this question have come to a negative result, the detailed research data which they derived, e.g. by applying the philological methods briefly outlined above, and which led them to their conclusion have never been published. The data are simply lost with the end of the research project or the end of the scholar's life. This is the second problem facing everyone who deals with this research question: she or he has to start again from scratch.</p>
      <p>
        Scholars conducting research on the paths along which the knowledge of Aristotelian logic spread, e.g. across Europe between approximately 800 and 1600, and on the epistemic centers in which the Aristotelian logic was taught and studied and in which the Aristotelian logical treatises were copied also face major challenges, because for the majority of the manuscripts we do not possess information about the exact year and place in which the manuscript was copied. In many cases, paleographical and codicological particularities allow a rough approximation of the date or region in which the manuscript was copied (cf. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for a good approach). But for tracing the exact paths it is indispensable to find, where possible, additional relations between manuscripts whose exact provenance we do not know and manuscripts whose exact provenance we do know thanks to corresponding notes which scribes or scholars left in the manuscripts.
      </p>
      <p>
        <sup>6</sup> For the stemmatological method, also called the ‘Lachmannsche Methode’ after its founder Karl Lachmann (1793-1851), cf. in detail already [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Cf. further [
        <xref ref-type="bibr" rid="ref12 ref5 ref6">5, 6, 12</xref>
        ]. By applying this method, relations between the manuscripts can also be detected if they share e.g. the same mistakes. For a categorization of different mistakes which often occur while copying a text cf. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
      </p>
      <p>One method which can be applied in this case is the one outlined above. If, for example, two manuscripts have to be dated to different centuries and it is, moreover, clear from their paleographical or codicological particularities that they originate from different regions (e.g. one from the Greek East, the other from southern Italy), these particularities would suggest that there is no relation between the two manuscripts. It can nevertheless become evident, for instance, that the southern Italian manuscript, which has to be dated later, shares the same significant mistakes and textual variants with the older manuscript from the Greek East and contains some additional mistakes of its own. It can then be concluded that this later southern Italian manuscript is a copy of the earlier Greek manuscript and that the Greek manuscript was brought to southern Italy by scholars. One path of the dissemination of the Greek manuscripts can be detected in this way. If the exact place where the southern Italian manuscript was copied can be determined, historical research can then examine, provided further evidence can be found, whether there was an epistemic center at this place which might have been the reason why manuscripts were brought there to be copied.</p>
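      <p>The shared-error argument described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's tooling: the class <monospace>Witness</monospace>, the function <monospace>is_candidate_copy</monospace>, the sigla and the error labels (e1, e2, e3) are hypothetical, and a positive result marks a candidate for closer philological inspection rather than a proven direct copy.</p>
      <preformatted><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Witness:
    """A manuscript with an approximate century and its significant errors."""
    siglum: str
    century: int
    errors: frozenset


def is_candidate_copy(later: Witness, earlier: Witness) -> bool:
    """True if `later` may descend from `earlier` under the shared-error rule."""
    # Every significant error of the presumed model recurs in the later witness.
    inherits_all = earlier.errors <= later.errors
    # The later witness also has at least one error of its own.
    adds_own = len(later.errors - earlier.errors) > 0
    return later.century > earlier.century and inherits_all and adds_own


# Hypothetical data: an older Greek witness and a later southern Italian one.
greek = Witness("G", 10, frozenset({"e1", "e2"}))
italian = Witness("I", 13, frozenset({"e1", "e2", "e3"}))
print(is_candidate_copy(italian, greek))
```
]]></preformatted>
      <p>The check is deliberately strict (all errors of the older witness must recur); a real workflow would probably weight errors by significance instead.</p>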
      <p>Researching these paths becomes more complicated when, e.g., the southern Italian manuscript shows not only the significant mistakes or textual variants of the older Greek manuscript but also significant mistakes or textual variants from another older manuscript, e.g. a Greek one, whereby the two older manuscripts do not share each other's mistakes. In the case of a text transmission accompanied by a very erudite engagement with the text, as is the case with Aristotle's treatise de interpretatione, such examples suggest that an erudite scholar copied the text from two older manuscripts. This scholar did not merely want to copy a text but compared two templates with different variants and decided in favour of the textual variant which he considered to be the authentic Aristotelian one. Moreover, the scholar may also have corrected the variant which he considered to be wrong in the respective template. Such a case is a culturally contextualized example of the contamination of the text mentioned above. If another scribe or scholar then copies the older copy, including the scholar's correction, into a new copy, a reader who has only this new copy in front of him can no longer recognize the correction as a correction.</p>
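      <p>The pattern that signals contamination, a witness agreeing with two older witnesses that do not agree with each other, can be expressed as a simple set test. The following Python sketch assumes that significant errors have already been collated into one set per manuscript; the function name <monospace>contamination_sources</monospace>, the sigla and the error labels are hypothetical.</p>
      <preformatted><![CDATA[
```python
from itertools import combinations


def contamination_sources(target_errors, older):
    """Return pairs of older witnesses that each share errors with the target
    but share no errors with each other (a hint at contamination)."""
    pairs = []
    for a, b in combinations(sorted(older), 2):
        shares_a = bool(target_errors & older[a])
        shares_b = bool(target_errors & older[b])
        independent = not (older[a] & older[b])
        if shares_a and shares_b and independent:
            pairs.append((a, b))
    return pairs


# Hypothetical collation data: siglum -> set of significant errors.
older = {"G1": {"e1", "e2"}, "G2": {"e3"}, "G3": {"e1"}}
target = {"e1", "e3", "e9"}
print(contamination_sources(target, older))
```
]]></preformatted>
      <p>Such a test can only point to candidate crossing points; whether a scholar really worked from both templates still has to be argued from the manuscripts themselves.</p>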
      <p>For the scholar today who tries to research the path of copies in order to gain knowledge about the way in which this text spread, it becomes very difficult to explain this contamination unless he succeeds in finding the original copy with the correction. To detect the origin, the scholar has to find the crossing point at which a previous scholar had, as in the given example, two manuscripts with two different textual versions. Moreover, it is possible that an erudite scholar tried to correct obvious mistakes which he detected in the template for his copy, but that his correction originates only from his own thought. This so-called conjecture can, in the end, differ from the authentic variant and thus also cause contamination. Even more complicated cases arise when an erudite scholar copied his text from more than one already contaminated older copy of the text. Lastly, a scholar, particularly an erudite one, was also able to insert paratexts such as glosses, scholia, extracts of commentaries or logical diagrams, or to copy them (or some of them) from one of these witnesses into the other if she or he found them in only one of the templates. The latter fact becomes particularly significant for detecting later contacts and relations between manuscripts. This potential has not yet been exploited.</p>
      <p>To conclude: the complexity of the outlined tasks and challenges in researching a textual transmission which took place in connection with highly erudite scholarship on the text is enormous. The difficulty lies particularly in finding exactly these crossing points of manuscripts with different textual variants and mistakes. These difficulties and the complexity of the research have made it so hard for scholars to make progress in this field that it was concluded that the life of a single scholar is too short to solve the problem.</p>
    </sec>
    <sec id="sec-3">
      <title>The Solution</title>
      <p>Although the described situation poses a big challenge, it has never been doubted that significant progress can be made by widening the research data and by extending the method of research. Regarding the first point, a promising approach is to take into account also similarities between the paratexts in different manuscripts. If, e.g., two or more manuscripts contain the same paratexts or the same false logical diagram, and these were added to the manuscripts years or centuries after the text itself was copied, this can be a strong argument that these manuscripts came together at the same place at a later date, when these paratexts were added. It might seem paradoxical and counterproductive to further increase the complexity of an already complex research field. Indeed, it only makes sense within an extended methodological approach which builds upon the great potential of a digital research infrastructure and digital research tools for the specific research problems outlined above.</p>
      <p>Thus, a two-way approach is promising: firstly, researching the relations between the manuscripts by means of the described philological method, and secondly, researching the relations between the manuscripts by comparing the paratexts in the manuscripts. A digital research infrastructure can be the key to handling the described problems of contamination and complexity. The first way, which resembles the traditional method, will now be complemented by the possibilities offered by a digital research infrastructure, as explained in the following. The second way, in which the research data are enriched by including the paratexts, represents a useful supplement. There is a significant number of cases in which relations between manuscripts, as far as their later mutual contact is concerned, cannot be proved by merely comparing significant mistakes and textual variants. On the contrary, the outcome of such textual comparisons in the narrow sense can be that these manuscripts appear unrelated because they do not share significant mistakes or textual variants. Thus, including the second way is important for finding later contacts between manuscripts and crossing points, e.g. where an erudite scholar caused a contamination as explained above.</p>
      <p>If such points of contamination of two or more manuscripts can be determined and if we also possess reliable metadata for all or at least some of these manuscripts, it becomes possible to reveal epistemic centers in which manuscripts of different origins with different textual variants circulated, particularly owing to a broad scholarly philological and philosophical interest in the Aristotelian logic in general and in the treatise de interpretatione in particular. The interest in copying the text and editing a manuscript with a reliable text which comes as close to the original as possible can be a second reason for the circulation of different copies. And the wish to create the best text for didactic or philosophical reasons can be another reason for different textual variants.</p>
      <p>A digital research infrastructure can help to handle all the problems mentioned methodologically if it is built in a bottom-up process that addresses the research problems precisely. As a first step, a data repository for the digitized manuscripts of the Aristotelian treatise was set up, with access restricted to researchers associated with the Aristotle archive at Freie Universität Berlin. This archive comprises the collection of microfilms of all known (approximately 1000 preserved) manuscripts. Digital reproductions of these microfilms in black and white, in a few cases accompanied by digitized manuscripts in color, were put into the data repository. Metadata files (TEI-XML)<sup>7</sup> were added to these digital reproductions. These files contain scholarly descriptions which are based in particular on the individual descriptions of each manuscript found in the respective library catalogues or in scholarly articles on the respective manuscript. The files mainly focus on the provenance, the date of the manuscript, and the material and size of the codex. These data are complemented with research results of the current project itself concerning the relations between individual manuscripts.</p>
      <p>
        As a second step, several digital tools to address the research problems and challenges were newly developed or adapted. Firstly, a semiautomatic layout analysis focuses on measuring the size of a manuscript page and of the text area and margin area [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. We know that the pages of the codices were structured in advance, before the text was copied, because a lining for the text (and in many cases also for the paratexts) can be seen on the individual pages.<sup>8</sup> Furthermore, it is evident that manuscripts within a school were often given the same layout. Thus, if an automatic layout analysis can detect the same layout in different manuscripts and if we possess e.g. the metadata of only one manuscript out of a group of manuscripts with the same or a very similar layout, we can conclude that the other manuscripts originate from the same place. These results help us to detect the provenance of a manuscript, which is the prerequisite for following the path which this manuscript has traveled.
      </p>
      <p>
        <sup>7</sup> http://www.tei-c.org/Guidelines/P5/ (2020-09-03).
      </p>
      <p>
        <sup>8</sup> Cf. e.g. the codex Oxford, Magdalen College, Fonds principal Gr. 15, f. 1r.: https://digital.bodleian.ox.ac.uk/inquire/p/b4890d13-f697-494f-8bea-4ee3cee103d8 (2020-08-13).
      </p>
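      <p>How measured layouts might be compared can be illustrated with a small Python sketch. It is not the layout analysis tool cited above; it merely assumes that page and text-area dimensions are available (here in millimetres) and compares their proportions, so that reproductions at different scales remain comparable. The class <monospace>Layout</monospace>, the function <monospace>similar_layout</monospace>, the tolerance value and all measurements are hypothetical.</p>
      <preformatted><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Measured page and text-area dimensions of one manuscript page."""
    page_w: float
    page_h: float
    text_w: float
    text_h: float

    def proportions(self):
        # Ratios are independent of the scale of the reproduction.
        return (
            self.text_w / self.page_w,
            self.text_h / self.page_h,
            self.page_w / self.page_h,
        )


def similar_layout(a: Layout, b: Layout, tol: float = 0.03) -> bool:
    """True if all proportions of the two layouts differ by at most `tol`."""
    return all(abs(x - y) <= tol for x, y in zip(a.proportions(), b.proportions()))


# Hypothetical measurements: the first two pages share proportions at different sizes.
ms_a = Layout(200, 300, 100, 180)
ms_b = Layout(210, 315, 105, 189)
ms_c = Layout(200, 300, 150, 250)
print(similar_layout(ms_a, ms_b), similar_layout(ms_a, ms_c))
```
]]></preformatted>
      <p>The tolerance would have to be calibrated against manuscripts of known common origin before any conclusion about provenance is drawn.</p>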
      <p>
        Secondly, an annotation tool for enriching the data on the digitized manuscripts in the data repository has been developed [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This tool enables scholars to transcribe, translate and, if necessary or helpful, also describe paratexts (glosses, scholia, diagrams and commentaries) and to tag them semantically. Furthermore, the transcription and semantic tagging of mistakes and significant textual variants in the Aristotelian text itself will soon be possible with this tool for scholarly use.
      </p>
      <p>Thirdly, search and analysis tools have been set up. By means of these tools it becomes possible to search for and find the same glosses, scholia, diagrams or commentaries in different manuscripts, also against the backdrop of the metadata of the manuscripts, and in particular to find relations which cannot be found by means of the traditional analog method (one example will follow).</p>
      <p>The first evident advantage of this collaboration between philologists and data scientists, compared to all previous research in this field, is that research data will not, as is often the case, disappear at the end of the project but will be stored and made accessible in a repository. Due to the application of standardized models, the reusability of the data is guaranteed. The second big advantage is that scholars from different places can research the outlined question together thanks to the digital infrastructure, which makes data, tools and services accessible independently of location and operating system. Thus, the competences and research capacity of humanities scholars interested in this question can be pooled on an international level. A third big advantage is that all manuscripts can be aggregated via this digital infrastructure. This makes it possible to combine the results of search and analysis tools directly with the respective digital reproductions of the manuscript pages on which the searched or analyzed items are found, have been transliterated, tagged etc. Handwritings can be compared directly with each other in this way. Scholars can see e.g. whether the search results originate from manuscripts of the same region or of a different region, of the same or a different time, whether there are further similarities between the results etc.</p>
      <p>Just to give two concrete examples: (1) Both the paratexts to an important chapter of the Aristotelian treatise and the textual variants and mistakes in the text of this chapter have been transliterated and tagged by scholars. If, during her or his research on the manuscripts and the paratexts, a scholar encounters among the transliterated diagrams a false logical diagram in the de interpretatione manuscript Vienna, ÖNB, Suppl. Gr. 67, f. 117v., it is possible to search via these tools whether a second manuscript contains the false version of the diagram. The result of the search shows that this false diagram is also contained in two other manuscripts (Paris, BNF, Par. Gr. 1971, f. 34v. and Vatican, BAV, Urb. Gr. 56, f. 88r.). In the next step, this tool can additionally be used to analyze whether these manuscripts also contain further similar paratexts (in this case glosses, because glosses are richly found in all three manuscripts). The result of this automatic analysis shows that these three manuscripts also have a comparatively high degree of matches among the glosses. In a further step the scholar can test whether there are other manuscripts besides these three which share significant variants of glosses with them. The result of this last analysis is negative: there are no matches.</p>
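      <p>The "degree of matches among the glosses" is not specified further in the text. One simple way to quantify it, offered here only as an assumption, is the Jaccard overlap between the sets of tagged glosses of each pair of manuscripts. In the following Python sketch the function names, sigla and gloss identifiers are hypothetical.</p>
      <preformatted><![CDATA[
```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Share of glosses common to both sets among all glosses in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def ranked_pairs(glosses: dict) -> list:
    """Rank all manuscript pairs by gloss overlap, highest first."""
    scores = {
        (m, n): jaccard(glosses[m], glosses[n])
        for m, n in combinations(sorted(glosses), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical tagged glosses per manuscript (identifiers, not real readings).
glosses = {
    "A": {"g1", "g2", "g3"},
    "B": {"g1", "g2", "g4"},
    "C": {"g9"},
}
for pair, score in ranked_pairs(glosses):
    print(pair, round(score, 2))
```
]]></preformatted>
      <p>Exact set overlap presupposes normalized transcriptions; glosses that differ only in spelling would need a fuzzy comparison first.</p>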
      <p>
        Thus, by applying the respective components of the digital infrastructure to the paratexts, the scholar concludes that there is a close relation between these manuscripts. The fact that one of these three manuscripts does not share the significant mistakes or textual variants of the other two, a result which the automatic analysis will also be able to show us in the future, makes the importance of this result, which we owe to the digital research infrastructure, even more evident. The codices Vienna, ÖNB, Suppl. Gr. 67 and Paris, BNF, Par. Gr. 1971 do not share the same significant mistakes, which shows that the Viennese codex cannot be the copy or the template of the Parisian codex; and since, for paleographical reasons, the paratexts in the Viennese codex must have been added later than the text, there must have been a later contact between these codices. If a scholar goes into further detail, this context also helps to explain the contamination in the codex Vienna, ÖNB, Suppl. Gr. 67, which contains textual corrections. The corrected versions are the same as those found in the codices Paris, BNF, Par. Gr. 1971 and Vatican, BAV, Urb. Gr. 56. Thus, one crossing point and point of contamination of manuscripts can be detected in this way. In short: taking into account also the metadata of these manuscripts, we can follow the paths the Viennese codex has most probably travelled (from Greece to Italy) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Similarly, research can proceed in many other cases by uncovering the relations between the manuscripts.
      </p>
      <p>(2) The advantages of such digital methods resulting from the intensive collaboration of humanities scholars and data scientists cannot be stressed enough, because the scholar's attention can be drawn again and again, sometimes by chance, to relations which could hardly have been detected without this digital infrastructure and its research tools. During her or his research the scholar can e.g. notice a rare textual variant which is given as a citation of the text by a commentary found in the margin of a manuscript.<sup>9</sup> The scholar can then use a fuzzy search to check whether this textual variant occurs in another paratext or manuscript. The astonishing result shows that, among other relevant results, even logical diagrams can be found with a lettering which derives from this textual variant. Since the digital infrastructure also enables the scholar to compare results synoptically with the images of the diagrams on the respective manuscript pages, she or he can more or less by chance notice that this diagram not only occurs in more than one manuscript, but that two of these diagrams were drawn by the same hand, because the handwriting is identical (in the codices Milan, BA, Ambr. Q 87sup., f. 54r and Genova F VI 9, f. 68v). Thus, a contact between these two manuscripts has been proved and can be further analyzed, although one might not be the copy of the other.</p>
      <p><sup>9</sup> Cf. for such a case the codex Vatican, BAV, Urb. Gr. 35, f. 55r, where a fragment of Olympiodorus' otherwise lost commentary on de interpretatione is transmitted in the form of a scholion in the margin of the manuscript page. This fragment shows a text-critically important long variant of the definition of the noun.</p>
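      <p>The fuzzy search mentioned in example (2) is not described in technical detail. A minimal sketch of the general idea, using only Python's standard library, could strip diacritics and then rank transcribed snippets by string similarity. The function names, the cutoff and the transliterated snippets are illustrative assumptions and are not taken from the manuscripts or from the project's search tool.</p>
      <preformatted><![CDATA[
```python
import difflib
import unicodedata


def normalize(text: str) -> str:
    """Lower-case the text and strip combining marks (accents, macrons)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fuzzy_find(query: str, snippets: list, cutoff: float = 0.8) -> list:
    """Return the snippets similar to `query`, most similar first."""
    by_norm = {normalize(s): s for s in snippets}
    hits = difflib.get_close_matches(
        normalize(query), list(by_norm), n=max(len(by_norm), 1), cutoff=cutoff
    )
    return [by_norm[h] for h in hits]


# Illustrative transliterated snippets standing in for tagged paratexts.
snippets = [
    "onoma esti phone semantike",
    "onoma estin phone semantike",
    "rhema de esti",
]
print(fuzzy_find("ónoma esti phōnē semantike", snippets))
```
]]></preformatted>
      <p>Comparing whole snippets keeps the sketch short; searching for a variant inside longer scholia would require a sliding window or an index with approximate matching.</p>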
    </sec>
    <sec id="sec-4">
      <title>The Collaboration Experience</title>
      <p>The given example of a humanities problem and its solution with digital tools and methods may seem rather peculiar. Owing to its potential for methodological advances, it has become a leading example of our approach to collaboration between humanities scholars and data scientists. However, within the scope of a Collaborative Research Center, which provides the specific academic context for the given collaboration experience, this is only part of a bigger picture.</p>
      <p>In this CRC, although mostly traditional research is carried out and digital methods are not the focus of many participants, more and more data are produced. For the preservation of digital results and insights, an ‘information infrastructure project’ (INF project) was established after four years of research activity and needed to be integrated into an established environment. In this context, a successful information infrastructure approach is expected to be applicable to the research data of all projects involved. Basically there are two options for this: (1) providing a generic data storage solution for all projects or (2) exploring the potential of joint research. In our case a generic data repository was set up to manage the research data from the CRC in a structured way and to serve as an access point for further tool integration. But the main focus is on implementing a concept of joint research and development in which both sides aim to profit by pushing the boundaries of their respective (disciplinary) research, be it established philological methods or the application of big data technologies to heterogeneous humanities data.</p>
      <p>Owing to the projects' heterogeneous disciplinary affiliations and research topics, it is quite challenging to focus on their specific subjects, research questions and methods and, at the same time, to provide generic solutions beyond individual project scopes. The focal point of interest common to all is concepts of knowledge transfer. Thus, the data science collaboration also aimed to focus on those concepts, in order to bridge the gap between particular research interests and information infrastructure but also to provide an additional point of view on those concepts. Multiple types of knowledge transfer were identified early on, enabling the application and development of technical components such as established metadata standards (TEI, CIDOC-CRM, etc.) or visualization techniques (spatio-temporal visualization, visualization of multidimensional relations and similarities) with high potential to become a methodological complement for as many of the projects as possible.</p>
      <p>The entry barrier for such a close, reciprocal collaboration is certainly higher than for simple data storage. From the perspective of the humanities scholars, one of the most crucial starting points is to explain their specific research problem clearly to the data scientists and to outline what would have to be done to make progress. The data scientists need to break down logical concepts which work well in theory and apply them to the ‘real world’ and thus, as regards the humanities, often to messy data. Afterwards both sides can carefully discuss where research and infrastructure needs could meet to benefit both.</p>
      <p>In the case of the CRC, four of the existing projects volunteered as 'pilot
projects' for this new form of interdisciplinary research, one of them being the
project on de interpretatione. All of the projects were already proficient in
explaining their research to scholars from other fields due to the CRC's structure,
but this cannot be taken for granted in general and needs to be actively
encouraged. Still, in spite of this advantage, the results range from conceptual examples
of possibilities to comprehensive digital enrichment of data resulting in new
research insights. The reasons for this outcome are manifold and complex; to
name just a few:
– In the beginning some scholars were unsure whether they could 'fulfill' the
requirements of an infrastructure, e.g. being able to provide a sufficient amount of
data. In this case it worked quite well to start with a very small data set,
show prototypes with a few functionalities early and then extend both sides
iteratively. In the outlined case study we began with a
data set of only 50 out of 150 manuscripts. This procedure gave us the
opportunity to train the involved humanities scholars in the specific
metadata management for manuscripts. After the annotation tool had been adapted to
the scholars' specific needs, this limited data set made it possible to
integrate the tools into their daily research early on. By using the
infrastructure, the scholars not only became familiar with the tools, but also
recognized the potential for progress in their research questions. Thus,
they also started to request further functions for the tools. Since all
projects in the CRC are concerned with research on knowledge transfer,
the annotation tool in particular also became of interest to another pilot
project. Thus, as a next step the generic parts of the tool could be adapted
again to the particular requirements of this second project, also beginning with a small
data set.
– On the other hand, data might not be collected to the full extent initially
aimed for, due to, for example, limited time or the need for comprehensive
research work. While this is not an unusual development in qualitative
research, there is a particular risk of invalidating any additional quantitative
research if it cannot be performed on a complete or coherent data set. Joint
research, and here especially the data collection part, is resource intensive
and tends to be underestimated. For small projects it is nearly impossible to
do it on top of their regular workload. It seems particularly important that
humanities scholars plan time for collecting and structuring their
research data. The collaboration can only work if there is a careful matching
of resources and personnel on both sides.
– Even if data already exist, they might not fit the requirements of the
digital methods and tools in use, e.g. owing to non-compliance with standards for
(semi-)automatic analysis or a lack of metadata. This is typically the case if the
data were collected in advance without any intention of including a digital
infrastructure in the research process afterwards. An adaptation of the tools
or of the data, or even of both, is needed if a digital analysis
is to become possible.</p>
      <p>All in all, one has to conclude that the effort for joint research is high. As
research on all sides is an ongoing process, the required communication and
the need for flexibility should not be underestimated. Additionally, in the case
study above the humanities scholar contributed all his competence regarding
paleography and codicology by collecting and adding information to the existing
descriptions of the codices as well as transliterating, transcribing and tagging
the Greek paratexts. As the infrastructure was still in development during that
time, it required a huge amount of trust on his side that it would be beneficial for
his research in the end.</p>
      <p>Our experience shows that exemplary case studies such as the one presented are crucial for
collaboration in a diverse project environment. Although a small project will
probably hesitate to commit if a comprehensive example is shown, in general it
is fruitful to demonstrate the full range of collaboration and research possibilities
and afterwards define the specifics. While discussing the case studies within the
CRC, the projects developed ideas on how the presented methods and tools could
contribute to their research. Thus, even if data scientists develop the most
sophisticated infrastructure, it will not be used unless there are fascinating use
cases and enthusiastic scholars who spark interest in their colleagues.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Recommendations to Educators</title>
      <p>This paper explores the possibilities and pitfalls of interdisciplinary
collaboration in the context of a Collaborative Research Center. The presented case study
on Aristotle's de interpretatione illustrates a concept of joint research and
development between disciplines, in particular philology and data science.</p>
      <p>In the given example our experience shows that it is advisable for the
philologists to gain insight into the management of their research data, either as
part of the regular curriculum or through specific courses at the beginning of the
collaboration. It is very helpful if they not only use tools, but also understand some
of their basics and are able to handle, e.g., the XML modeling of their metadata
to some extent. Such an education allows the assessment of possibilities as well
as limitations when applying digital methods, thus raising awareness and fostering
reuse and reproducibility. Reciprocal understanding on both sides can, for
example, help to find creative solutions as far as categorizations and vocabularies
are concerned, in order to avoid too much free text if a systematic and
federated search is the aim. This understanding is indispensable if both
sides, humanists and data scientists, are to profit from the research and work of the
other. If this works out well, the success also increases the motivation for further
collaboration on both sides.</p>
      <p>The CRC 980 has thrived on interdisciplinary research between various so-called
'small disciplines'. Thus, the later extension to include computer
scientists in the CRC was particularly easy in this well-established
interdisciplinary context. In our experience, especially those small humanities disciplines
can benefit tremendously from such a collaboration and a larger digital
infrastructure while offering unique and challenging use cases for digital approaches. All
in all, we argue that it is absolutely crucial to strengthen openness to other
disciplines' questions, tools and methods and to advance interdisciplinary profiles,
creating a research environment where the whole is truly greater than the sum
of its parts.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Arnesano</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Aristotele in Terra d'Otranto. I Manoscritti fra XIII e XIV secolo</article-title>
          .
          <source>Segno e Testo 4</source>
          <volume>2</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>149</fpage>
          -
          <lpage>190</lpage>
          +Tav.
          <volume>1</volume>
          –
          <issue>16</issue>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Busch</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chandna</surname>
          </string-name>
          , S.: eCodicology.
          <article-title>The Computer and the Mediaeval Library</article-title>
          . In: Busch,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            ,
            <surname>Sahle</surname>
          </string-name>
          , P. (eds)
          <article-title>Codicology and Palaeography in the Digital Age 4</article-title>
          , pp.
          <volume>3</volume>
          –
          <fpage>24</fpage>
          . Books on Demand,
          <source>Norderstedt</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Gotzelmann, G.,
          <string-name>
            <surname>Tonne</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Aristoteles annotieren – Vom Handschriftendigitalisat zur qualitativ-quantitativen Analyse</article-title>
          . In: Hastik,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Hegel</surname>
          </string-name>
          , Ph. (eds.) Bilddaten in den Digitalen Geisteswissenschaften, pp.
          <volume>53</volume>
          –
          <fpage>66</fpage>
          .
          <string-name>
            <surname>Harrassowitz</surname>
          </string-name>
          ,
          <string-name>
            <surname>Wiesbaden</surname>
          </string-name>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Krewet</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hegel</surname>
          </string-name>
          , Ph.:
          <article-title>Diagramme in Bewegung. Scholien und Glossen zu Aristoteles' de interpretatione</article-title>
          . In: Hastik,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Hegel</surname>
          </string-name>
          , Ph. (eds.) Bilddaten in den Digitalen Geisteswissenschaften, pp.
          <volume>199</volume>
          –
          <fpage>216</fpage>
          .
          <string-name>
            <surname>Harrassowitz</surname>
          </string-name>
          ,
          <string-name>
            <surname>Wiesbaden</surname>
          </string-name>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kristeller</surname>
            ,
            <given-names>P. O.</given-names>
          </string-name>
          :
          <article-title>The Lachmann Method</article-title>
          .
          <source>Merits and Limitations. TEXT - Transactions of the Society for Textual Scholarship 1</source>
          <volume>2</volume>
          (
          <issue>5</issue>
          ), pp.
          <volume>11</volume>
          –
          <issue>20</issue>
          (
          <year>1981</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Maas</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : Textkritik. Teubner, Leipzig (
          <year>1927</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Minio-Paluello</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <string-name>
            <surname>Aristotelis</surname>
          </string-name>
          categoriae et liber de interpretatione. Oxford University Press, Oxford (
          <year>1949</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Montanari</surname>
          </string-name>
          , E.:
          <source>La sezione linguistica del Peri Hermeneias di Aristotele. 2</source>
          Vol.
          <article-title>Università degli Studi di Firenze</article-title>
          ,
          <source>Firenze</source>
          (
          <year>1984</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Moraux</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : Aristoteles Graecus.
          <article-title>Die griechischen Manuskripte des Aristoteles</article-title>
          . Erster Band: Alexandrien – London. De Gruyter, Berlin / New York (
          <year>1976</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Pasquali</surname>
          </string-name>
          , G.:
          <article-title>Storia della tradizione e critica del testo</article-title>
          .
          <source>Le Monnier</source>
          ,
          <string-name>
            <surname>Firenze</surname>
          </string-name>
          (
          <year>1934</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Reinsch</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Fragmente einer Organon-Handschrift des zehnten Jahrhunderts aus dem Katharinenkloster auf dem Berg Sinai</article-title>
          .
          <source>Philologus 145</source>
          <volume>2</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>57</fpage>
          -
          <lpage>69</lpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Sahle</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          : Digitale Editionsformen.
          <article-title>Zum Umgang mit der Überlieferung unter den Bedingungen des Medienwandels. Teil 1: Das typographische Erbe</article-title>
          . Books on Demand,
          <source>Norderstedt</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Trizio</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          : Reading and Commenting on Aristotle. In: Kaldellis,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Siniossoglou</surname>
          </string-name>
          , N. (eds.) The Cambridge Intellectual History of Byzantium, pp.
          <volume>397</volume>
          –
          <fpage>412</fpage>
          . Cambridge University Press, Cambridge (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Weidemann</surname>
          </string-name>
          , H.: Aristoteles. De interpretatione. De Gruyter, Berlin / Boston (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>West</surname>
            ,
            <given-names>M. L.</given-names>
          </string-name>
          :
          <article-title>Textual Criticism and Editorial Technique: Applicable to Greek and Latin Texts</article-title>
          . Teubner,
          <string-name>
            <surname>Stuttgart</surname>
          </string-name>
          (
          <year>1973</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>