=Paper=
{{Paper
|id=Vol-2717/paper11
|storemode=property
|title=The Whole is Greater Than the Sum of its Parts Analyzing Aristotle Commentaries in Collaboration between Philology and Data Science
|pdfUrl=https://ceur-ws.org/Vol-2717/paper11.pdf
|volume=Vol-2717
|authors=Michael Krewet,Danah Tonne,Germaine Götzelmann,Philipp Hegel,Sibylle Söring
|dblpUrl=https://dblp.org/rec/conf/dhn/KrewetTGHS20
}}
==The Whole is Greater Than the Sum of its Parts Analyzing Aristotle Commentaries in Collaboration between Philology and Data Science==
<pdf width="1500px">https://ceur-ws.org/Vol-2717/paper11.pdf</pdf>
<pre>
                   The Whole is Greater than the Sum of its Parts
                       Analyzing Aristotle Commentaries in
                     Collaboration Between Philology and Data
                                      Science

                      Michael Krewet (1) [0000-0001-7807-6089], Danah Tonne (2) [0000-0001-6296-7282],
                      Germaine Götzelmann (2) [0000-0003-3974-3728], Philipp Hegel (3) [0000-0001-6867-1511],
                      and Sibylle Söring (1) [0000-0002-1698-3289]

                      (1) Freie Universität Berlin, Germany
                      (2) Karlsruhe Institute of Technology, Germany
                      (3) Technical University of Darmstadt, Germany


                           Abstract. This paper aims to present the added value of collaboration
                           between philologists and data scientists in research on digitized
                           medieval manuscripts. Both the great potential and the challenges of
                           such a collaboration will be addressed. The following case study
                           originates from research conducted in the Collaborative Research
                           Center “Episteme in Motion. Transfer from the Ancient World to the
                           Early Modern Period”, which is located at the Freie Universität Berlin
                           and funded by the German Research Foundation (DFG). One of the goals
                           of this collaboration is to advance research questions in which the data
                           basis is complex or too complex for traditional research methods.
                           The case study presented in this paper will deal with the knowledge
                           transfer and text transmission in manuscripts of Aristotle’s ancient Greek
                           treatises on logic, the so-called Organon, and will focus on the manuscripts
                           of his work de interpretatione (On Interpretation) and on commentaries
                           and explanations which are found as paratexts in the manuscripts.

                           Keywords: Information Infrastructure · Collaborative Research Cen-
                           ters · De interpretatione.


                  Our cooperative work is embedded into the Collaborative Research Center 980
                  “Episteme in Motion. Transfer from the Ancient World to the Early Modern
                  Period”. The majority of projects within this CRC originate from so-called
                  small disciplines, that is, disciplines with a very limited number
                  of professors and students at an institution. Hence, a forum for interdisciplinary
                  discussion and definition of core terminology was implemented. The variety of
                  scholarly approaches, perspectives, and interpretations is a distinctive
                  characteristic of this interdisciplinary CRC, but it also calls for particularly
                  close collaboration to ensure its success. Despite this variety, a closer look at
                  the collaboration between one of the humanities projects, which deals with
                  Aristotelian works, and the information infrastructure in use in the CRC
                  illustrates the specific requirements and challenges that this kind of
                  collaboration generates.

                     Copyright 2020 for this paper by its authors. Use permitted under Creative Commons
                     License Attribution 4.0 International (CC BY 4.0).

Twin Talks 2 and 3, 2020           Understanding and Facilitating Collaboration in Digital Humanities     103/143

                  1        The Humanities Research Problem
                  Aristotle’s writings were transmitted in manuscript form (via codices) during
                  the Middle Ages. Today, we still possess approximately 150 copies of his
                  logical treatise de interpretatione, which date from the 9th to the 16th century.
                  They are preserved in many different libraries worldwide. Together with the
                  manuscripts of the Categories, another Aristotelian treatise which belongs to
                  his logical writings, this is the highest number of preserved manuscripts of a
                  pagan ancient Greek text. Although these 150 manuscripts represent only
                  those which were not lost or destroyed in the course of the centuries,
                  the comparatively enormous number of preserved manuscripts reveals
                  an intensive engagement with Aristotelian logic among scholars between the
                  9th and 16th century [13].
                      Further evidence for the scholarly engagement with the treatise is found inside
                  the manuscripts themselves. The manuscripts contain not only the text of the
                  treatise but also various forms of explanations on the pages, a kind of
                  so-called paratext (cf. [9] for a general overview of the paratexts in the manuscripts of
                  the Aristotelian treatises).4 Even the pages of the earliest manuscripts were
                  structured from the beginning for the purpose of adding longer explanations, e.g. by
                  leaving a wide margin. Other manuscripts also show wide spacing between the
                  individual lines to give room for explanations of single words (the so-called interlinear
                  glosses).5
                      On the basis of paleographical particularities it can also be observed that the
                  margins or interlinear spaces were often filled by different scholars or scribes
                  at different times, sometimes over the course of centuries. All these paratexts
                  (glosses, logical diagrams, scholia and commentaries) show that scholars have
                  worked with these manuscripts and that these manuscripts were the basis for
                  teaching and learning Aristotelian logic within schools or erudite circles
                  (examples are given in [1]).
                      Both the high number of preserved manuscripts of the Aristotelian treatise
                  de interpretatione and its integration in the logical instruction have fascinated
                   4
                     For an overview of the manuscripts and whether they contain paratexts
                     cf. http://pinakes.irht.cnrs.fr/notices/oeuvre/2973/ (2020–08–13) and particularly
                     https://cagb-db.bbaw.de/register/werke.xql?cRef=Int. (2020–08–13) although both
                     lists do not contain all preserved manuscripts of Aristotle’s treatise de interpreta-
                     tione.
                   5
                     To give an impression, two manuscripts may be mentioned here:
                     the codices Vatican, BAV, Urb. Gr. 35 (9th century),
                     f. 55v.: https://digi.vatlib.it/view/MSS_Urb.gr.35 (2020–08–13), and Paris, BNF,
                     Par. Gr. 1845 (end of the 13th / beginning of the 14th century), f. 33v.:
                     https://gallica.bnf.fr/ark:/12148/btv1b107218100/f70.image (2020–08–13).



                  scholars over centuries from a philological, philosophical and historical
                  perspective. Two research fields and challenges in particular can be stressed: (1) the
                  philological question of what the original Aristotelian text
                  looked like; (2) the challenge of tracing the paths along which this Aristotelian
                  treatise spread over the centuries. Determining the way the manuscripts travelled
                  and detecting epistemic centers where many manuscripts were copied is
                  equivalent to identifying historical times and places with a particular interest
                  in Aristotelian logic.
                      Research on the first, philological question has encountered significant
                  problems and challenges which have not yet been resolved. It is a long tradition
                  in the field of philology, particularly in classical philology, to endeavour
                  to edit the text of an ancient author so that it comes as close as possible to the
                  original. While copying the text, scribes made mistakes, e.g. unintentionally
                  omitting words. Related manuscripts usually contain the same mistakes or the
                  same textual variants. Text editors have to find and eliminate such mistakes and
                  opt for the textual variant which they consider to be the authentic one.6
                      The problem which philologists have encountered in the case of Aristotle’s
                  work de interpretatione is a so-called highly contaminated transmission of
                  the text. Contaminated means that there were scribes or scholars who copied
                  the text not from one older copy alone, but from more than one older copy,
                  whereby these older copies showed different variants in single passages. In such
                  cases copying the text was accompanied by a philological and philosophical
                  interpretation and examination of the text, as can be found within an erudite school
                  or teaching context, for which the texts were also copied and annotated with
                  paratexts. Because such contamination is widespread in the
                  manuscripts of the Aristotelian logic [8, 14], one of the greatest experts in the
                  field of Aristotelian manuscript studies has stated that retracing the
                  relations of the manuscripts of the Aristotelian logical writings – including de
                  interpretatione – is such a complex task that it cannot be solved within the
                  lifetime of a scholar [11, pp. 57–69].
                      Since scholars who have dealt with this question have come to a negative
                  result, the detailed research data which derived e.g. from applying the philological
                  methods briefly outlined above, and which led them to their conclusion,
                  have never been published. The data are simply lost with the end of the research project
                  or the end of the scholar’s life. This is the second problem faced by everyone
                  who deals with this research question: she or he has to start again from scratch.
                      Scholars conducting research on the paths along which the knowledge
                  of Aristotelian logic spread, e.g. across Europe between approximately 800 and 1600,
                  and on epistemic centers in which Aristotelian logic was taught and studied
                  and in which the Aristotelian logical treatises were copied also face major challenges,
                   6
                       For the stemmatological method, also called the ‘Lachmannsche Methode’ after
                       its founder Karl Lachmann (1793-1851), cf. in detail [10]. Cf. further [5, 6, 12].
                       By applying this method, relations between the manuscripts can also be detected
                       if they share e.g. the same mistakes. For a categorization of different mistakes which
                       often occur while copying a text cf. [15].



                  because for the majority of the manuscripts we do not possess information about
                  the exact year and the place where the manuscript was copied. On the basis of
                  paleographical and codicological particularities, a rough approximation of the date or
                  region when or where the manuscript was copied is possible in many cases
                  (cf. [1] for a good approach). But for tracing the exact paths it is indispensable
                  to find, where possible, relations between manuscripts whose exact provenience
                  is unknown and manuscripts whose exact provenience is known
                  thanks to corresponding notes which scribes or scholars left in the manuscripts.
                      One method which can be applied in this case is the same as the one
                  outlined above. If, for example, two manuscripts have to be dated to different
                  centuries and if it is, moreover, clear that they originate from different regions
                  (e.g. one from the Greek East, the other one from southern Italy) due to their
                  paleographical or codicological particularities, these particularities would suggest
                  that there is no relation between the two manuscripts. But it can become evident
                  e.g. that the southern Italian manuscript, which has to be dated later, shares
                  the same significant mistakes and textual variants with the older manuscript
                  from the Greek East and additionally contains some mistakes of its own. Thus, it can
                  be concluded that this later southern Italian manuscript is a copy of the earlier
                  Greek manuscript and that the Greek manuscript was brought to southern Italy
                  by scholars. One path of the dissemination of the Greek manuscripts can be
                  detected in this way. If the exact place, where the southern Italian manuscript
                  was copied, can be determined, the historical research can now examine – if
                  further evidence can be found – whether there was an epistemic center at this
                  place which might have been the reason why manuscripts were brought to it to
                  be copied.
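
The reasoning in the paragraph above can be sketched in a few lines of Python. This is only
an illustration under stated assumptions, not the project's tooling: the sigla ("GreekEast",
"SouthItaly"), the error labels and the century values are hypothetical placeholders, and a
hit is merely a candidate relation that a philologist still has to verify (an indirect copy
or a lost intermediate witness would look the same).

```python
from typing import Dict, List, Set, Tuple


def candidate_copies(errors: Dict[str, Set[str]],
                     centuries: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return (older, younger) pairs in which the younger manuscript contains
    every significant error of the older one plus at least one of its own."""
    pairs = []
    for older, older_errors in errors.items():
        for younger, younger_errors in errors.items():
            if older == younger or not older_errors:
                continue
            # "<" on sets is the proper-subset test: all errors of the older
            # witness are present and the younger one adds further errors.
            if centuries[older] < centuries[younger] and older_errors < younger_errors:
                pairs.append((older, younger))
    return sorted(pairs)


# Hypothetical data: labels stand for significant mistakes or variants.
errors = {
    "GreekEast": {"e1", "e2", "e3"},
    "SouthItaly": {"e1", "e2", "e3", "e4"},
    "Unrelated": {"e7"},
}
centuries = {"GreekEast": 10, "SouthItaly": 13, "Unrelated": 12}

print(candidate_copies(errors, centuries))
```

With the placeholder data, only the pair of the older Greek and the later southern Italian
manuscript is reported, mirroring the example in the text.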
                       Researching these paths becomes more complicated when e.g. the southern
                  Italian manuscript shows not only the significant mistakes or textual
                  variants of the older Greek manuscript but also significant mistakes or textual
                  variants from another older manuscript, e.g. a Greek one, whereby the two older
                  manuscripts do not share each other’s mistakes. In the case of a text
                  transmission which is accompanied by a very erudite engagement with the text, as
                  is the case with Aristotle’s treatise de interpretatione, such examples suggest
                  that an erudite scholar copied the text from two older manuscripts. This
                  scholar did not merely want to copy a text but compared two templates with
                  different variants and opted for the textual variant which he considered to be
                  the authentic Aristotelian one. Moreover, the scholar can also correct the variant
                  which he considers to be wrong in the respective template. Such a case is a cul-
                  turally contextualized example of the already mentioned contamination of the
                  text. If another scribe or scholar copies from the older copy with the correction
                  of the scholar into a new copy, this correction cannot be noticed as a correction
                  anymore by a reader who has only this new copy in front of him.
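
The pattern of contamination described above (one later manuscript sharing significant
mistakes with two older manuscripts that share none with each other) can be expressed as a
simple set test. The following is a minimal sketch under that reading; the sigla, error
labels and century values are hypothetical, and the function only flags candidates for
closer philological inspection.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def contamination_candidates(errors: Dict[str, Set[str]],
                             centuries: Dict[str, int]) -> List[Tuple[str, str, str]]:
    """Return (target, source_a, source_b) triples where the target shares
    significant errors with two older manuscripts that share none mutually."""
    hits = []
    for target, target_errors in errors.items():
        older = sorted(m for m in errors if centuries[m] < centuries[target])
        for a, b in combinations(older, 2):
            shares_with_a = errors[a] & target_errors
            shares_with_b = errors[b] & target_errors
            sources_independent = not (errors[a] & errors[b])
            if shares_with_a and shares_with_b and sources_independent:
                hits.append((target, a, b))
    return hits


# Hypothetical data: A and B are older and textually unrelated to each other,
# C combines readings of both and adds a mistake of its own.
errors = {
    "A": {"a1", "a2"},
    "B": {"b1", "b2"},
    "C": {"a1", "b2", "c1"},
}
centuries = {"A": 10, "B": 11, "C": 13}

print(contamination_candidates(errors, centuries))
```

Note that a copy made from C would inherit the mixed readings without any visible trace of
the correction, which is exactly why the crossing point itself has to be found.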
                     For the scholar today who tries to research the path of copies in order to gain
                  knowledge about the way in which this text spread, it becomes very difficult to
                  explain this contamination unless he succeeds in finding the original copy with
                  the correction. To detect the origin the scholar has to find the crossing point at



                  which a previous scholar had two manuscripts – as in the given example – with
                  two different textual versions. Moreover, it is possible that an erudite scholar
                  tried to correct obvious mistakes, which he detected in the template for his
                  copy, but that his correction originates only from his own thought. This so-called
                  conjecture can, at the end, be different from the authentic variant and, thus, also
                  cause contamination. Even more complicated cases arise when an
                  erudite scholar copied the text from more than one already contaminated older
                  copy of the text. Lastly, a scholar – particularly if she or he was erudite –
                  was also able to insert paratexts such as glosses, scholia, extracts of commentaries
                  or logical diagrams, or to copy them (or some of them) from one of these witnesses
                  into the other if she or he found them in only one of the templates. The latter fact
                  becomes particularly significant for detecting later contacts and relations between
                  manuscripts. This potential has not yet been exploited.
                      To conclude: The complexity of the outlined tasks and challenges to research
                  a textual transmission which took place in connection with a highly erudite
                  scholarship concerning the text is enormous. The difficulty particularly consists
                  in finding exactly these crossing points of manuscripts with different textual
                  variants and mistakes. These difficulties and the complexity of the research have made
                  it so hard for scholars to make progress in this field that it was
                  concluded that the life of a single scholar is too short to solve the problem.


                  2        The Solution

                  Although the described situation poses a big challenge, it has never been doubted
                  that significant progress can be made by broadening the research data and by
                  extending the method of research. Regarding the first point, it is a promising
                  approach to also take into account similarities between the paratexts in different
                  manuscripts. If e.g. two or more manuscripts contain the same paratexts or the
                  same false logical diagram, added to the manuscripts years or centuries
                  after the text itself was copied, this can be a strong
                  argument that these manuscripts came together at the same place at a later date
                  when these paratexts were added. It might seem paradoxical and
                  counterproductive to further increase the complexity of the already complex research
                  field. Indeed, it only makes sense within an extended methodological
                  approach which builds upon the great potential of a digital research infrastructure
                  and digital research tools for the specific research problems outlined above.
                      Thus, a two-way approach is promising: firstly, researching the relations
                  between the manuscripts by means of the described philological method, and
                  secondly, researching the relations between the manuscripts by means of a
                  comparison of the paratexts in the manuscripts. A digital research infrastructure can be
                  the key to handling the described problems of contamination and complexity.
                  The first way, which resembles the traditional method, will now be complemented by
                  the possibilities offered by a digital research infrastructure, as will be explained
                  in the following. The second way, in which the research data are enriched
                  by including the paratexts, represents a useful supplement. There is a
                  significant number of cases in which relations between manuscripts, as far as their
                  later mutual contact is concerned, cannot be proved via the mere comparison of
                  significant mistakes and textual variants. On the contrary, the outcome of such
                  textual comparisons in the narrow sense can be that these manuscripts are not
                  related to each other because they do not share significant mistakes or textual
                  variants. Thus, including the second way is important for finding later contacts
                  between manuscripts and crossing points when e.g. an erudite scholar caused a
                  contamination as explained above.
                      If such points of contamination of two or more manuscripts can be deter-
                  mined and if we also possess reliable metadata for all or at least some of these
                  manuscripts, it becomes possible to reveal epistemic centers in which manuscripts
                  of different origins with different textual variants circulated particularly due to
                  a broad scholarly philological and philosophical interest in the Aristotelian logic
                  in general and in the treatise de interpretatione in particular. The interest in
                  copying the text and editing a manuscript with a reliable text which comes as
                  close to the original as possible can be a second reason for the circulation of
                  different copies. The wish to create the best text for didactic or philosophical
                  reasons can be another reason for different textual variants.
                      A digital research infrastructure can help to handle all the problems mentioned
                  in a methodical way, if it is built in a bottom-up process that addresses the
                  research problems precisely. As a first step a data repository for the digitized manuscripts
                  of the Aristotelian treatise was set up with restricted access to researchers asso-
                  ciated with the Aristotle archive at Freie Universität Berlin. This archive com-
                  prises the collection of microfilms of all known (approximately 1000 preserved)
                  manuscripts. Digital reproductions of said microfilms in black and white, in a few
                  cases accompanied by digitized manuscripts in color, were put in the data
                  repository. Metadata files (TEI-XML)7 were added to these digital reproductions.
                  These files contain scholarly descriptions which are based in particular on the individual
                  descriptions of each manuscript found in the respective library
                  catalogues or scholarly articles on the respective manuscript. The files mainly focus
                  on the provenience, the date of the manuscript and the material and size of the
                  codex. These data are complemented with research results of the current project
                  itself concerning the relation between single manuscripts.
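
To make the kind of metadata described above concrete, the following sketch reads
provenience, date, material and size from a small TEI-style manuscript description. The
element names (msDesc, msIdentifier, supportDesc, origDate, origPlace) come from the TEI P5
manuscript description module; the project's actual schema is not given in this paper, so
the structure shown here is an assumption, and all values in the sample record are
hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical record; none of the values describes a real codex.
SAMPLE = """<msDesc xmlns="http://www.tei-c.org/ns/1.0">
  <msIdentifier>
    <settlement>Example City</settlement>
    <repository>Example Library</repository>
    <idno>Cod. gr. 000</idno>
  </msIdentifier>
  <physDesc>
    <objectDesc>
      <supportDesc material="chart">
        <extent>
          <dimensions unit="mm"><height>250</height><width>170</width></dimensions>
        </extent>
      </supportDesc>
    </objectDesc>
  </physDesc>
  <history>
    <origin>
      <origDate notBefore="1276" notAfter="1325">13th/14th century</origDate>
      <origPlace>unknown</origPlace>
    </origin>
  </history>
</msDesc>"""


def read_metadata(xml_text: str) -> dict:
    """Extract identifier, material, size and dating from one msDesc element."""
    root = ET.fromstring(xml_text)

    def text(path: str):
        node = root.find(path, TEI_NS)
        return node.text.strip() if node is not None and node.text else None

    def number(path: str):
        value = text(path)
        return int(value) if value is not None else None

    support = root.find("tei:physDesc/tei:objectDesc/tei:supportDesc", TEI_NS)
    date = root.find("tei:history/tei:origin/tei:origDate", TEI_NS)
    return {
        "settlement": text("tei:msIdentifier/tei:settlement"),
        "repository": text("tei:msIdentifier/tei:repository"),
        "idno": text("tei:msIdentifier/tei:idno"),
        "material": support.get("material") if support is not None else None,
        "height_mm": number(".//tei:dimensions/tei:height"),
        "width_mm": number(".//tei:dimensions/tei:width"),
        "not_before": date.get("notBefore") if date is not None else None,
        "not_after": date.get("notAfter") if date is not None else None,
        "orig_place": text("tei:history/tei:origin/tei:origPlace"),
    }


metadata = read_metadata(SAMPLE)
print(metadata)
```

Keeping the date as a notBefore/notAfter range rather than a single year reflects the fact,
noted earlier, that most manuscripts can only be dated approximately.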
                      As a second step, several digital tools addressing the research problems and
                  challenges were newly developed or adapted. Firstly, a semiautomatic layout
                  analysis measures the size of a manuscript page and of its text area
                  and margin area [2]. We know that the pages of the codices were structured
                  before the text was copied, because ruling for the text (and in
                  many cases also for the paratexts) can be seen on the individual pages.8 The
                  layout was therefore planned before copying began.
                  Furthermore, manuscripts produced within one school often received the same
                   7
                       http://www.tei-c.org/Guidelines/P5/ (2020–09–03).
                   8
                       Cf. e.g. the codex Oxford, Magdalen College, Fonds principal Gr. 15, f. 1r.:
                       https://digital.bodleian.ox.ac.uk/inquire/p/b4890d13-f697-494f-8bea-4ee3cee103d8
                       (2020–08–13).



                  layout. Thus, if an automatic layout analysis can detect the same layout in
                  different manuscripts and if we possess e.g. the metadata of only one manuscript
                  out of a group of manuscripts with the same or a very similar layout, we can
                  infer that the other manuscripts probably originate from the same place. These results
                  help us to detect the provenience of a manuscript which is the prerequisite for
                  following the path which this manuscript has traveled.
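
How such a layout comparison might feed into provenience hypotheses can be sketched as
follows. This is not the layout analysis cited as [2], whose internals are not described
here: the choice of features (text area relative to page size), the tolerance of 0.03, the
sigla and all measurements are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Layout:
    siglum: str
    page_w: float   # page width in mm
    page_h: float   # page height in mm
    text_w: float   # width of the text area in mm
    text_h: float   # height of the text area in mm
    place: Optional[str] = None  # known place of origin, if any

    def proportions(self) -> Tuple[float, float]:
        """Share of the page taken by the text area, horizontally and vertically."""
        return (self.text_w / self.page_w, self.text_h / self.page_h)


def similar(a: Layout, b: Layout, tol: float = 0.03) -> bool:
    """Two layouts count as similar if both proportions differ by at most tol."""
    return all(abs(x - y) <= tol for x, y in zip(a.proportions(), b.proportions()))


def suggest_places(layouts: List[Layout], tol: float = 0.03) -> Dict[str, List[str]]:
    """For manuscripts without a known place, list the places of similar layouts."""
    suggestions = {}
    for ms in layouts:
        if ms.place is not None:
            continue
        places = sorted({other.place for other in layouts
                         if other.place is not None and similar(ms, other, tol)})
        if places:
            suggestions[ms.siglum] = places
    return suggestions


# Hypothetical measurements.
layouts = [
    Layout("X", 200, 300, 100, 200, place="Place A"),
    Layout("Y", 210, 310, 106, 205),
    Layout("Z", 200, 300, 150, 260),
]

print(suggest_places(layouts))
```

Using proportions rather than absolute sizes keeps the comparison independent of later
trimming of the pages only to a limited degree, so any suggestion remains a hypothesis to be
checked against paleographical and codicological evidence.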
                      Secondly, an annotation tool for enriching the data on the digitized manu-
                  scripts in the data repository has been developed [3]. This tool enables scholars
                  to transcribe, translate and – if necessary or helpful – also describe paratexts
                  (glosses, scholia, diagrams and commentaries) and to tag them
                  semantically. Furthermore, the transcription and semantic tagging of mistakes
                  and significant textual variants in the Aristotelian text itself will soon be possible
                  with this tool for scholarly use.
                      Thirdly, search and analysis tools have been set up. By means of these
                  tools it becomes possible to search for and find the same glosses, scholia, diagrams
                  or commentaries in different manuscripts, also against the backdrop of the metadata
                  of the manuscripts, and in particular to find relations which cannot be found by means of
                  the traditional analog method (one example will follow).
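
A search for the same paratext in different manuscripts can be thought of as a lookup in an
index keyed by a normalized transcription. The sketch below assumes that annotations are
available as simple (siglum, folio, kind, transcription) records and that identity is judged
after removing diacritics, case and spacing differences; both are assumptions, and the
records themselves are hypothetical.

```python
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Record = Tuple[str, str, str, str]  # (siglum, folio, kind, transcription)


def normalize(text: str) -> str:
    """Strip diacritics, fold case and collapse whitespace."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return " ".join(stripped.casefold().split())


def build_index(records: Iterable[Record]) -> Dict[Tuple[str, str], Set[Tuple[str, str]]]:
    """Map (kind, normalized transcription) to the places where it occurs."""
    index = defaultdict(set)
    for siglum, folio, kind, transcription in records:
        index[(kind, normalize(transcription))].add((siglum, folio))
    return index


def find_same(index, kind: str, transcription: str) -> List[Tuple[str, str]]:
    """Return all (siglum, folio) pairs carrying the same paratext."""
    return sorted(index.get((kind, normalize(transcription)), set()))


# Hypothetical annotation records.
records = [
    ("MS-1", "1r", "gloss", "ὄνομα"),
    ("MS-2", "5v", "gloss", "ονομα"),
    ("MS-3", "2r", "gloss", "ῥῆμα"),
]
index = build_index(records)

print(find_same(index, "gloss", "ὄνομα"))
```

Exact matching after normalization is deliberately strict; real glosses vary in spelling and
abbreviation, so a production tool would likely need fuzzier matching.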
                      The first evident advantage of this collaboration between philologists and
                  data scientists compared to all previous research in this field is that
                  research data will not, as is often the case, disappear at the end of the
                  project, but will be stored and made accessible in a repository. Due to the
                  application of standardized models, the reusability of the data is guaranteed. The
                  second big advantage is that scholars from different places can research the
                  outlined question together thanks to the digital infrastructure, which makes data, tools
                  and services accessible independently of their location and operating system.
                  Thus, the competences and research power of humanities scholars interested in this
                  question can be pooled on an international level. A third big advantage is that
                  all manuscripts can be aggregated via this digital infrastructure. This makes it
                  possible to combine results of search and analysis tools directly with the respective
                  digital reproductions of the manuscript pages on which the searched or analyzed
                  items are found, have been transliterated, tagged etc. Handwritings can be
                  compared directly with each other in this way. Scholars can see e.g. whether the
                  search results originate from manuscripts of the same region, of a different
                  region, of the same or different time, whether there are more similarities between
                  the results etc.
                      Just to give two concrete examples: (1) Both the paratexts to an important
                  chapter of the Aristotelian treatise and the textual variants and mistakes in the
                  text of this chapter have been transliterated and tagged by scholars. If, during
                  her or his research on the manuscripts and the paratexts, a scholar encounters
                  a false logical diagram among the transliterated diagrams in the de interpretatione
                  manuscript Vienna, ÖNB, Suppl. Gr. 67, f. 117v., it is possible to search
                  via these tools whether a second manuscript contains the false version of the
                  diagram. The result of the search shows us that this false diagram is also
                  contained in two other manuscripts (Paris, BNF, Par. Gr. 1971, f. 34v. and Vatican,



                  BAV, Urb. Gr. 56, f. 88r.). In the next step, the tool can be used to analyze
                  whether these manuscripts also contain further similar paratexts (in this case
                  glosses, because glosses are richly found in all three manuscripts). The result of
                  this automatic analysis shows that these three manuscripts also share a
                  comparatively high degree of matches among the glosses. In a further step the
                  scholar can test whether there are other manuscripts besides these three which
                  share significant variants of glosses with these three manuscripts. The result of
                  this last analysis is negative: there are no matches.
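
The "degree of matches among the glosses" can be quantified in several ways; the paper does
not say which measure the tool uses. The sketch below uses the Jaccard coefficient (shared
glosses divided by all glosses of both manuscripts) as one plausible choice. Sigla and gloss
labels are hypothetical and do not reproduce the results reported above.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of glosses common to both manuscripts among all their glosses."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_pairs(glosses: Dict[str, Set[str]]) -> List[Tuple[float, str, str]]:
    """Score every pair of manuscripts and list the closest pairs first."""
    scores = [(jaccard(glosses[x], glosses[y]), x, y)
              for x, y in combinations(sorted(glosses), 2)]
    return sorted(scores, key=lambda item: (-item[0], item[1], item[2]))


# Hypothetical sets of normalized glosses per manuscript.
glosses = {
    "MS-1": {"g1", "g2", "g3", "g4"},
    "MS-2": {"g1", "g2", "g3", "g5"},
    "MS-3": {"g1", "g2", "g3"},
    "MS-4": {"g9"},
}

for score, first, second in rank_pairs(glosses):
    print(f"{first} / {second}: {score:.2f}")
```

In this toy data the first three manuscripts form a group with high mutual overlap, while
the fourth shares nothing with them, which corresponds to the negative result of the last
analysis step described in the text.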
                      Thus, by applying the respective components of the digital infrastructure to
                  the paratexts, the scholar can conclude that there is a close relation between
                  these manuscripts. The importance of this result, which we owe to the digital
                  research infrastructure, becomes even more evident when we consider that one of
                  the three manuscripts does not share the significant mistakes or textual
                  variants of the other two (a result which the automatic analysis will also be
                  able to show in the future). The codices Vienna, ÖNB, Suppl. Gr. 67 and Paris,
                  BNF, Par. Gr. 1971 do not share the same significant mistakes, so the Viennese
                  codex can be neither the copy nor the exemplar of the Parisian codex. Since
                  paleographical evidence shows that the paratexts in the Viennese codex were
                  added later than the main text, there must have been a later contact between
                  these codices. On closer inspection, this contact also helps to explain
                  contamination in the codex Vienna, ÖNB, Suppl. Gr. 67, which contains textual
                  corrections. The corrected readings are the same as those found in the codices
                  Paris, BNF, Par. Gr. 1971 and Vatican, BAV, Urb. Gr. 56. In this way, one point
                  of contact and contamination between manuscripts can be detected. In short:
                  taking the metadata of these manuscripts into account as well, we can trace the
                  path the Viennese codex most probably travelled (from Greece to Italy) [4].
                  Research can proceed similarly in many other cases by uncovering the relations
                  between the manuscripts.
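
                      The gloss comparison described in (1) can be illustrated with a minimal
                  sketch. It assumes that the transliterated glosses of each manuscript are
                  available as a set of strings. The sigla A, B and C and the gloss labels are
                  hypothetical placeholder data, and the Jaccard overlap used here is only one
                  possible measure, not necessarily the one implemented in the infrastructure.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of glosses common to both sets, relative to all glosses in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pairwise_gloss_overlap(glosses_by_ms):
    """Return {(siglum1, siglum2): overlap} for every pair of manuscripts."""
    return {
        (m1, m2): jaccard(glosses_by_ms[m1], glosses_by_ms[m2])
        for m1, m2 in combinations(sorted(glosses_by_ms), 2)
    }


# Hypothetical placeholder data, not taken from the manuscripts discussed here.
glosses_by_ms = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g1", "g2", "g3", "g5"},
    "C": {"g6", "g7"},
}

overlaps = pairwise_gloss_overlap(glosses_by_ms)
for pair, score in sorted(overlaps.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

                  With this placeholder data, A and B share three of five distinct glosses,
                  while C shares none with either, which corresponds to the negative result for
                  manuscripts outside the group.
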
                      (2) The advantages of such digital methods, which result from the intensive
                  collaboration of humanities scholars and data scientists, cannot be stressed
                  enough, because the scholar’s attention can be drawn again and again, sometimes
                  by chance, to relations which could hardly have been detected without this
                  digital infrastructure and its research tools. During her or his research the
                  scholar may, for example, notice a rare textual variant that a commentary in the
                  margin of a manuscript quotes from the text.9 Using a fuzzy search, the scholar
                  can then check whether this textual variant occurs in another paratext or
                  manuscript. The astonishing result is that, among other relevant hits, even
                  logical diagrams can be found whose lettering derives from this textual variant.
                  Since the digital infrastructure also enables the scholar to compare results
                  synoptically with the images of the diagrams on the respective manuscript pages,
                  she or he can notice, more or less by chance, that this diagram not only occurs
                  in more than one manuscript, but that two of these diagrams were drawn by the
                  same hand, because the handwriting is identical (in the codices Milan, BA,
                  Ambr. Q 87sup., f. 54r and Genova F VI 9, f. 68v). Thus, a contact between these
                  two manuscripts has been established and can be analyzed further, although one
                  need not be a copy of the other.

                   9
                       Cf. for such a case the codex Vatican, BAV, Urb. Gr. 35, f. 55r, where a fragment
                       of Olympiodorus’ otherwise lost commentary on de interpretatione is transmitted
                       in the form of a scholion in the margin of the manuscript page. This fragment
                       shows a text-critically important long variant of the definition of the noun.
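
                      The fuzzy search for a textual variant mentioned in (2) could look roughly
                  like the following sketch. It assumes that the transliterated paratexts are
                  stored as simple records, and it uses the similarity ratio of the Python
                  standard library (difflib) as a stand-in for whatever matching the actual
                  search tool performs. The record fields, sigla, folios, sample strings and the
                  threshold of 0.8 are all hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_search(query, records, threshold=0.8):
    """Return records whose text resembles the query, best match first."""
    hits = []
    for rec in records:
        # ratio() lies between 0.0 (nothing in common) and 1.0 (identical strings).
        ratio = SequenceMatcher(None, query.lower(), rec["text"].lower()).ratio()
        if ratio >= threshold:
            hits.append((ratio, rec["manuscript"], rec["folio"], rec["text"]))
    return sorted(hits, reverse=True)


# Hypothetical transliterated records, for illustration only.
records = [
    {"manuscript": "Ms1", "folio": "1r", "text": "onoma"},
    {"manuscript": "Ms2", "folio": "2v", "text": "onomma"},
    {"manuscript": "Ms3", "folio": "3r", "text": "rhema"},
]

hits = fuzzy_search("onoma", records)
for ratio, manuscript, folio, text in hits:
    print(f"{manuscript} f. {folio}: {text} ({ratio:.2f})")
```

                  A tolerant threshold lets spelling variants surface alongside exact matches,
                  at the cost of more candidates that the scholar has to check against the
                  manuscript images.
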


                  3        The Collaboration Experience

                  The given example of a humanities problem and its solution with digital tools and
                  methods may seem rather specific. Because of its potential for methodological
                  advances, it has nevertheless become a leading example of our approach to
                  collaboration between humanities scholars and data scientists. In the scope of a
                  Collaborative Research Center, which provides the specific academic context for
                  the given collaboration experience, however, this is only part of a bigger picture.
                      In this CRC, although mostly traditional research is carried out and digital
                  methods are not the focus of many participants, more and more data are
                  produced. To preserve digital results and insights, an ‘information
                  infrastructure project’ (INF project) was established after four years of
                  research activity and had to be integrated into an established environment. A
                  successful information infrastructure approach in this context is expected to
                  be applicable to the research data of all projects involved. Basically there
                  are two options for this: (1) providing a generic data storage solution for all
                  projects or (2) exploring the potential of joint research. In our case a generic
                  data repository was set up to manage the research data from the CRC in a
                  structured way and to serve as an access point for further tool integration.
                  The main focus, however, is on implementing a concept of joint research and
                  development in which both sides aim to profit by pushing the boundaries of their
                  respective (disciplinary) research, be it established philological methods or
                  the application of big data technologies to heterogeneous humanities data.
                      Because the projects differ widely in their disciplinary affiliations and
                  research topics, it is quite challenging to focus on their specific subjects,
                  research questions and methods and, at the same time, to provide generic
                  solutions beyond the scope of individual projects. The focus of interest common
                  to all projects is concepts of knowledge transfer. The data science
                  collaboration therefore also focused on those concepts, both to bridge the gap
                  between particular research interests and information infrastructure and to
                  provide an additional point of view on those concepts. Multiple types of
                  knowledge transfer were identified early on, enabling the application and
                  development of technical components like established metadata standards (TEI,
                  CIDOC-CRM, etc.) or visualization techniques (spatio-temporal visualization,
                  visualization of multidimensional relations and similarities) with high
                  potential to become a methodological complement for as many of the projects as
                  possible.
                      The entry barrier for such a close, reciprocal collaboration is certainly higher
                  than for simple data storage. From the perspective of the humanities scholars, one
                  of the most crucial starting points is to explain their specific research problem
                  clearly to the data scientists and to outline what would have to be done to make
                  progress. The data scientists need to break down logical concepts which work
                  well in theory and apply them to the ‘real world’, which in the humanities often
                  means messy data. Afterwards both sides can carefully discuss where research and
                  infrastructure needs could meet to benefit both.
                      In the case of the CRC, four of the existing projects volunteered as ‘pilot
                  projects’ for this new form of interdisciplinary research, one of them being the
                  project on de interpretatione. All of the projects were already proficient in ex-
                  plaining their research to scholars from other fields due to the CRC’s structure,
                  but this cannot be taken for granted in general and needs to be actively encour-
                  aged. Still, in spite of this advantage, the results range from conceptual examples
                  of possibilities to comprehensive digital enrichment of data resulting in new
                  research insights. The reasons for this outcome are manifold and complex; to
                  name just a few:


                    – In the beginning some scholars were unsure whether they could ‘fulfill’ the
                      requirements of an infrastructure, e.g. by being able to provide a sufficient
                      amount of data. In this case it worked quite well to start with a very small
                      data set, show prototypes with a few functionalities early and then extend
                      both sides iteratively. In the outlined case study we began with a data set of
                      only 50 out of 150 manuscripts. This procedure gave us the opportunity to train
                      the humanities scholars involved in the specific metadata management for
                      manuscripts. Once the annotation tool had been adapted to the scholars’
                      specific needs, this limited data set made it possible to integrate the tools
                      into their daily research early on. By using the infrastructure, the scholars
                      not only became familiar with the tools, but also recognized their potential
                      for progress on their research questions. As a result, they began to request
                      further functions for the tools. Since all projects in the CRC are concerned
                      with research on knowledge transfer, the annotation tool in particular also
                      became of interest to another pilot project. As a next step, the generic parts
                      of the tool could therefore be adapted to the special issues of this second
                      project, again beginning with a small data set.
                    – On the other hand, data might not be collected to the full extent initially
                      aimed for, for example because of limited time or the need for comprehensive
                      research work. While this is not an unusual development in qualitative
                      research, it carries a particular risk of invalidating any additional
                      quantitative research that cannot be performed on a complete or coherent
                      dataset. Joint research, and especially its data collection part, is resource
                      intensive, and the effort tends to be underestimated. For small projects it is
                      nearly impossible to do it on top of their regular workload. It seems
                      particularly important that humanities scholars plan time for collecting and
                      structuring their research data. The collaboration can only work if resources
                      and staff are carefully matched on both sides.
                    – Even if data already exist, they might not fit the requirements of the digital
                      methods and tools in use, e.g. because they do not comply with the standards
                      needed for a (semi-)automatic analysis or because they lack metadata. This is
                      typically the case if the data were collected in advance without any intention
                      of including a digital infrastructure in the research process afterwards. An
                      adaptation of the tools, of the data or even of both is needed if a digital
                      analysis is to become possible.

                       All in all one has to conclude that the effort for joint research is high. As
                  research on all sides is an on-going process, the required communication and
                  the need for flexibility should not be underestimated. In addition, in the case
                  study above the humanities scholar contributed all his competence in
                  paleography and codicology by collecting and adding information to the existing
                  descriptions of the codices as well as by transliterating, transcribing and
                  tagging the Greek paratexts. As the infrastructure was still in development at
                  that time, this required a huge amount of trust on his side that it would be
                  beneficial for his research in the end.
                       Our experience shows that exemplary case studies such as the one presented
                  are crucial for collaboration in a diverse project environment. Although a small
                  project will probably hesitate to commit when shown a comprehensive example, it
                  is in general fruitful to demonstrate the full range of collaboration and
                  research possibilities first and to define the specifics afterwards. While
                  discussing the case studies within the CRC, the projects developed ideas of how
                  the presented method and tools could contribute to their research. Data
                  scientists can develop the most sophisticated infrastructure, but it will not be
                  used unless there are fascinating use cases and enthusiastic scholars who spark
                  interest in their colleagues.


                  4        Conclusions and Recommendations to Educators

                  This paper explores the possibilities and pitfalls of interdisciplinary collabora-
                  tion in the context of a Collaborative Research Center. The presented case study
                  on Aristotle’s de interpretatione illustrates a concept of joint research and de-
                  velopment between disciplines, in particular philology and data science.
                      In the given example our experience shows that it is advisable for the
                  philologists to gain insight into the management of their research data, either
                  as part of the regular curriculum or through specific courses at the beginning
                  of the collaboration. It is very helpful if they not only use tools but also
                  understand some of their basics and are able to handle, for example, the XML
                  modeling of their metadata to some extent. Such an education allows them to
                  assess the possibilities as well as the limitations of applying digital methods,
                  and raises awareness of the need to foster reuse and reproducibility. Reciprocal
                  understanding on both sides can, for example, help to find creative solutions
                  for categorizations and vocabularies, so that too much free text is avoided when
                  a systematic and federated search is the aim. This understanding is
                  indispensable if both sides, humanists and data scientists, are to profit from
                  the research and work of the
                  other. If this works out well, success also increases the motivation for further
                  collaboration on both sides.
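
                      To give an impression of what handling the XML modeling of manuscript
                  metadata can involve, the following minimal sketch reads a shelfmark from a
                  TEI-style manuscript description. The fragment is an assumed, simplified
                  example built from the TEI elements msDesc and msIdentifier; it is not the
                  data model actually used in the CRC, and the function name read_identifier is
                  hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Assumed, simplified fragment for illustration; real descriptions are far richer.
SAMPLE = """<msDesc xmlns="http://www.tei-c.org/ns/1.0">
  <msIdentifier>
    <settlement>Vienna</settlement>
    <repository>ÖNB</repository>
    <idno>Suppl. Gr. 67</idno>
  </msIdentifier>
</msDesc>"""


def read_identifier(xml_text):
    """Extract place, holding institution and shelfmark from an msDesc element."""
    root = ET.fromstring(xml_text)
    ident = root.find("tei:msIdentifier", TEI_NS)
    return {
        field: ident.findtext(f"tei:{field}", default="", namespaces=TEI_NS)
        for field in ("settlement", "repository", "idno")
    }


ident = read_identifier(SAMPLE)
print(", ".join(ident.values()))
```

                  Keeping place, repository and shelfmark in separate elements rather than in
                  one free-text field is what makes a systematic search across manuscripts
                  feasible.
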
                      The CRC 980 has thrived on interdisciplinary research between various
                  so-called ‘small disciplines’. Thus, the later extension of the CRC to include
                  computer scientists was particularly easy in this well-established
                  interdisciplinary context. In our experience especially those small humanities disciplines
                  can benefit tremendously from such a collaboration and larger digital infrastruc-
                  ture while offering unique and challenging use cases for digital approaches. All
                  in all we argue that it is absolutely crucial to strengthen openness to other dis-
                  ciplines’ questions, tools and methods and to advance interdisciplinary profiles
                  – creating a research environment where the whole is truly greater than the sum
                  of its parts.


                  References
                  1. Arnesano, D.: Aristotele in Terra d’Otranto. I Manoscritti fra XIII e XIV secolo.
                     Segno e Testo 4, pp. 149–190 + Tav. 1–16 (2006)
                  2. Busch, H., Chandna, S.: eCodicology. The Computer and the Mediaeval Library. In:
                     Busch, H., Fischer, F., Sahle, P. (eds.) Codicology and Palaeography in the Digital
                     Age 4, pp. 3–24. Books on Demand, Norderstedt (2017)
                  3. Götzelmann, G., Tonne, D.: Aristoteles annotieren – Vom Handschriftendigitalisat
                     zur qualitativ-quantitativen Analyse. In: Hastik, C., Hegel, Ph. (eds.) Bilddaten in
                     den Digitalen Geisteswissenschaften, pp. 53–66. Harrassowitz, Wiesbaden (2020)
                  4. Krewet, M., Hegel, Ph.: Diagramme in Bewegung. Scholien und Glossen zu Aristote-
                     les’ de interpretatione. In: Hastik, C., Hegel, Ph. (eds.) Bilddaten in den Digitalen
                     Geisteswissenschaften, pp. 199–216. Harrassowitz, Wiesbaden (2020)
                  5. Kristeller, P. O.: The Lachmann Method. Merits and Limitations. TEXT -
                     Transactions of the Society for Textual Scholarship 1, pp. 11–20 (1981)
                  6. Maas, P.: Textkritik. Teubner, Leipzig (1927)
                  7. Minio-Paluello, L.: Aristotelis categoriae et liber de interpretatione. Oxford Univer-
                     sity Press, Oxford (1949)
                  8. Montanari, E.: La sezione linguistica del Peri Hermeneias di Aristotele. 2 Vol. Uni-
                     versità degli Studi di Firenze, Firenze (1984)
                  9. Moraux, P.: Aristoteles Graecus. Die griechischen Manuskripte des Aristoteles. Er-
                     ster Band: Alexandrien – London. De Gruyter, Berlin / New York (1976)
                  10. Pasquali, G.: Storia della tradizione e critica del testo. Le Monnier, Firenze (1934)
                  11. Reinsch, D.: Fragmente einer Organon-Handschrift des zehnten Jahrhunderts aus
                     dem Katharinenkloster auf dem Berg Sinai. Philologus 145, pp. 57–69 (2001)
                  12. Sahle, P.: Digitale Editionsformen. Zum Umgang mit der Überlieferung unter den
                     Bedingungen des Medienwandels. Teil 1: Das typographische Erbe. Books on
                     Demand, Norderstedt (2013)
                  13. Trizio, M.: Reading and Commenting on Aristotle. In: Kaldellis, A., Siniossoglou, N.
                     (eds.) The Cambridge Intellectual History of Byzantium, pp. 397–412. Cambridge
                     University Press, Cambridge (2017)
                  14. Weidemann, H.: Aristoteles. De interpretatione. De Gruyter, Berlin / Boston
                     (2014)
                  15. West, M. L.: Textual Criticism and Editorial Technique: Applicable to Greek and
                     Latin Texts. Teubner, Stuttgart (1973)


Twin Talks 2 and 3, 2020         Understanding and Facilitating Collaboration in Digital Humanities           114/143

</pre>