=Paper=
{{Paper
|id=Vol-2717/paper09
|storemode=property
|title=Deep learning for paleographic analysis of medieval Hebrew manuscripts: a DH team collaboration experience
|pdfUrl=https://ceur-ws.org/Vol-2717/paper09.pdf
|volume=Vol-2717
|authors=Daria Vasyutinsky Shapira,Irina Rabaev,Berat Kurar Barakat,Ahmad Droby,Jihad El-Sana
|dblpUrl=https://dblp.org/rec/conf/dhn/ShapiraRBDE20
}}
==Deep learning for paleographic analysis of medieval Hebrew manuscripts: a DH team collaboration experience==
Deep learning for paleographic analysis of
medieval Hebrew manuscripts: a DH team
collaboration experience
Daria Vasyutinsky Shapira1 , Irina Rabaev2 , Berat Kurar Barakat1 , Ahmad
Droby1 , and Jihad El-Sana1
1
Ben-Gurion University of the Negev,Beer-Sheva, Israel
2
Shamoon College of Engineering, Beer-Sheva, Israel
{dariavas,berat,drobya}@post.bgu.ac.il
irinar@ac.sce.ac.il, el-sana@cs.bgu.ac.il
Abstract. Our research project is part of the Visual Media Lab, headed
by Professor Jihad El-Sana, the Department of Computer Science at Ben-
Gurion University of the Negev, Israel.
In this interdisciplinary project we apply deep learning models to classify
script types and sub-types in medieval Hebrew manuscripts. The model
incorporates the the techniques and databases of Hebrew paleography
and (with reservations) Hebrew codicology.
Main theoretical base of our project is the SfarData dataset, that in-
cludes the full codicological descriptions and paleographical definitions
of all dated medieval Hebrew manuscripts till the year 1540. In some ex-
ceptional cases, we go beyond this dataset framework. The major source
of the data in terms of high definition photos of manuscripts is the In-
stitute of Microfilmed Hebrew Manuscripts at the National Library of
Israel that has undertaken the mission to collect copies of all extant He-
brew manuscripts from all over the world. We mostly use manuscripts
from the National library of Israel, the British library, and the French
National library.
This multidisciplinary project brings together researchers from both fields,
Humanities and Computer Science. Currently, one professor, one lec-
turer, one post-doc, and two doctoral students are participating in the
project. This is a very exciting work in which there are no ready-made so-
lutions for the various challenges. We collectively discuss ways to address
these challenges and adapt our solution on the go.
During the presentation, we will talk about how our project functions
and how we strive to achieve a common result. The inevitable difficul-
ties that we face during this collaboration include, inter alia, different
research systems in Humanities and in Computer Sciences, lack of com-
mon terminology, different technical training, different requirements for
publications and conferences, etc.
Copyright 2020 for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 84/143
2 Vasyutinsky Shapira et.al.
1 The humanities research problem
Human history, as we know it, is based on written text. It can be stone or
papyrus or paper, but history consists of what was written down and has survived
through the generations. Even the most ancient and longest traditions of oral
transmission of a text are known to us to the extent that they were eventually
recorded in writing.
For centuries the study of these written sources could only be from frag-
mentary information. They were limited by both geography and the physical
capabilities of a human researcher. Already by the 18-19th centuries the amount
of accumulated knowledge was big enough that a scientist could not master such
a mass of information in his lifetime. However, it is obvious that a significant
part of the data is still waiting to be discovered and analyzed.
Our research project is looking for ways to make some of these written
sources, namely, Hebrew medieval manuscripts, available for study and research
through machine learning. In other words, we want to teach the computer to
recognize handwritten medieval Hebrew texts, and thus incorporate them into
the available compendium of historical sources.
Unlike modern books, each manuscript is unique, as it was written at a
certain point, under certain circumstances, by a certain scribe or scribes. In
order to study a large amount of material, it must be classified in one way or
another. Paleography and codicology are one of such classifications.
In our research project, we built upon existing achievements of Hebrew pale-
ography and codicology. Paleography and codicology, the science of researching
and classifying manuscripts, is one of the most important disciplines exploring
ancient texts. Hebrew paleography is a relatively young discipline that began
to take its current form in the middle of the 20th century, and which quickly
borrowed and adapted tools and techniques from other paleography domains,
such as Greek and Latin.
The first generation of Hebrew paleographers (Malachi Beit-Arié, Norman
Golb, Benjamin Richler, Colette Sirat) collected and studied various key manu-
scripts, formulated and published the solid theoretical foundation in the field [1,
3, 10, 12, 8, 7]. In addition, the Sfar-Data project?? , which is lead by Malachi
Beit-Arié and includes a large collection of classified dated manuscripts, is now
partly incorporated into the catalogue of the National Library of Israel.
There is also a number of journal articles that use the same method of pa-
leographic research of a manuscript as in the book of Engel and Beir-Arié[5, 6].
The Institute for Microfilmed Hebrew Manuscripts at the National Library of
Israel has been collecting microfilms (now digital photos) of Jewish manuscripts
for decades. The goal of this ongoing project is to obtain digital copies of all
Hebrew manuscripts worldwide and make them easily available and accessible
for the research. Today, the Institute hosts more than 70, 000 microfilms and
thousands of digital images, which makes more than 90% of the known Hebrew
manuscripts. Besides, the National Library of Israel includes 11, 000 original
Hebrew manuscripts. These collections are large enough to train deep learning
algorithms.
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 85/143
Deep learning for paleographic analysis of medieval Hebrew manuscripts 3
At the initial stage of our project we are training the algorithm to recognize
different sub-types of the Medieval Hebrew script.
2 Solution and preliminary results
In this project we utilize recent development in deep learning for classifying dif-
ferent script types of historical Hebrew manuscripts. According to paleography
research, handwriting styles evolve over time differently in various regions. Pale-
ography experts estimate the origin of a manuscript and its approximate period
using the writing style. However, this manual work is time consuming, tedious,
expensive, and relies on highly trained experts. The number of paleography ex-
perts in Hebrew scripts is very small and is not expected to increase in the
near future. In addition, these manuscripts originate from different geographical
regions and their dates span over thousands of years.
Hebrew Script
Regional
Types
Graphical classifications
Fig. 1. Hierarchy of Medieval Hebrew Scripts
Medieval Hebrew scripts are classified into six regional script types: Ashke-
nazi, Italian, Sephardi, Byzantine, Oriental, and Yemenite. Each type is subdi-
vided into three graphical classifications (sub-types): square, semi-cursive, and
cursive [2], as shown in Figure 1. In total there are 15 different sub-classes, as
some regional script types do not have semi-square or square form.
We have access to a large collection of various samples from different Hebrew
scripts, the Sfar Data (http://sfardata.nli.org.il/), which are categorized into
script type classes, including the raw material and high resolution copies.
Since the image sizes are quite big, to overcome technical limitations, we
extract patch from each images, which are further are fed into CNN.
So far, we have experimented with two different architectures (simple CNN
with three convolutional layers and ResNet). The dataset was divided into train-
ing and test sets, which include 538, 468 and 70, 000 patches, respectively.
We conducted several studies to determine which alterations of solution works
best for this task.
Deep learning models are prone to over-fitting and can utilize much non-
relevant information for the task at hand to decrease their loss and increase
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 86/143
4 Vasyutinsky Shapira et.al.
classification accuracy. Therefore, we experimented with different input repre-
sentations to determine the optimal amount of information passed to the ma-
chine learning model to achieve high accuracy while avoiding over-fitting. In this
experiment, a simple CNN with three convolutional layers, which was trained
using patches with varying attributes. Such attributes include color space: gray-
scale, inverted gray-scale, and binary; shape: rectangular, and square patches;
and whether the patches are smoothed or not (see examples in Fig. 2.)
We have found that gray-scale patches of size 350 × 350 gave the highest
accuracy on the test sets and the lowest difference between the train and test
losses, suggesting no over-fitting.
We recognized that in order to determine definitely that the model can clas-
sify writing styles based on the text alone and not other visual cues, it should
be tested on manuscripts that that were seen during training. Thus, new manu-
scripts were added to the dataset, which was re-split into train, validation, and
test sets, where the validation and test sets include pages from manuscripts that
are not present in the training set.
Initially, the model’s accuracy on the unseen manuscripts were low. We found
that this is because the text size in the training set is very different form that
in the validation and test sets. Therefore, there is a need to either re-scale the
training, validation, and test sets to a nearly uniform text size or increasing the
variation of text size in the training set using augmentation.
Table 1 presents the results on three types of test sets. Normal test set in-
cludes patches from unseen pages of the training manuscripts. Blind test set
consist of patches from unseen pages. Scaled test set includes the scaled versions
of the blind test patches. We experimented with four different architectures; a
simple CNN with three layers, VGG19 [9], InceptionV3 [11] and ResNet152 [4].
Each of them trained from scratch (random weights), pre-trained using Ima-
geNet, and trained with the augmented dataset, as explained above.
Practically, we need to know the how accurate the machine-learning model
predicts the writing style of a give page. Table 2 shows the page prediction
accuracy of the unseen pages from the train manuscripts. The accuracy increases
as the number of patches sampled from the page increases, but the processing
time also increases proportional to the number of patch in each page.
The network’s coarse localization map provides evidences that the machine
discriminates between the writing styles by considering specific parts of the text
in the given patch (Fig. 3). It is left to the discretion of paleographers how
legitimate is the machine’s decision criteria.
3 The collaboration experience
Our project in its current form started in January 2020. The experience is very
positive and even exciting, due both to the fact that it gives the feeling of
constant scientific research and discovery, and also because of the satisfaction
from constantly overcoming expected and unexpected challenges.
These challenges can be briefly formulated as follows:
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 87/143
Deep learning for paleographic analysis of medieval Hebrew manuscripts 5
Grayscale Smoothed Rectangular
Binary Inverted Smoothed and Inverted
Fig. 2. Example patches with varying attributes
Table 1. Accuracy on the different test sets for the different architectures trained for
writing style classification task.
Random Pretrained Augmented
Normal Blind Scaled Normal Blind Scaled Normal Blind Scaled
Simple CNN 95.25 12.75 14.99 - - - 90.50 23.75 38.69
VGG19 94.69 12.89 12.49 93.49 15.99 16.31 98.31 30.31 50.66
Inception v3 97.17 28.02 21.18 98.27 29.41 29.44 98.65 31.64 49.16
Resnet152 89.26 27.66 21.64 95.96 23.96 15.30 90.54 26.21 40.33
Table 2. Page-level accuracy computed using different numbers of patches randomly
sampled from each page and the time elapsed for each accuracy computation. These
results belongs to a pre-trained VGG-19 which is trained on 16000 patches and reach
to a validation accuracy of %91.25
# of patches 3 5 7 9 11 13 15 17 19 21
Accuracy 74.04 76.99 82.89 84.07 81.71 84.66 86.14 86.43 89.38 87.61
Time (s) 192 226 262 295 330 357 387 393 457 483
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 88/143
6 Vasyutinsky Shapira et.al.
Sephardic semi-square Oriental square Italian square Italian square
Fig. 3. Visualization of network’s coarse localization map highlights the important
regions in the document image patch for predicting the writing style.
– Finding your team. The main initial challenge in a DH project is to find one’s
counterpart. Researchers in the Humanities and in the Computer Sciences
(CS) sit in different building on campus, attend different conferences, read
different journals. There are practically no intersection points. In case of our
project, both sides were looking for each other for a long time, and still we
only met by a lucky coincidence. And yet, our project was initially in an
advantageous position, because the CS team knew that they were looking
for a paleographer (though they did not know where to find one) and our
Humanities researcher knew approximately which CS tools could advance the
project he was dreaming about. Finding a collaborator can be much harder
if each side has only a vague idea of what the other side can offer, and this is
often the case because of the totally different academic backgrounds. It goes
without saying that it is much easier and more effective to work with those
people who already have an interest in your topic, than to seek the help of
people for whom your project might appear weird or incomprehensible.
– New team, new rules. In the Humanities, the researcher more often works
alone, or with one collaborator, now one needs to get used to teamwork. It is
easier on one hand, because each team member is responsible for his part of
work, and tasks like writing a paper or making a presentation became easier.
A team brainstorm is also a very positive factor. On the other hand, it is
necessary to take into account the abilities and desires of the group members,
which are not always clear in advance. The same is true for articles writing. In
the Humanities, a researcher most often writes his article alone, or with one
co-author. In the CS, as in the DH, an article is typically written by team.
Both approaches have their advantages, and both require certain specific
skills.
The participation of Dr. Vasyutinsky Shapira in this project is funded by Israeli
Ministery of Science, Technology and Space, Yuval Ne’eman scholarship n. 3-16784.
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 89/143
Deep learning for paleographic analysis of medieval Hebrew manuscripts 7
– Unpredictability. When a researcher works alone on a project, for example
preparing a compilation of different Manuscripts (Mss) of a text, he know
how he will do it, he can check which methods have been used before, and
he knows more or less what the outcome will be. Of cause, he could face an
unexpected challenge, like a previously unknown manuscript that will change
the general picture dramatically, but mostly we talk about minor changes.
In a DH project, on the other hand, the previous experience one can rely on
is very limited. Not only the ways of solving a problem have to be adjusted
on the go, but also the goal itself has to be sometimes modified depending
on the results. In our project, it turned out that the human paleography is
so much based on intuition that it cannot be directly applied to machine
learning. On the other hand, the machine can extract incomparably more
small fragments of exact data. This leads us to a situation when even as we
write this paper our approaches are constantly adjusted and improved.
– Learning a new language. Effective communication between all participants
is essential for the success of any project. When participants come from
different research backgrounds, it is of cause necessary that we learn to un-
derstand each other. The humanities researcher must be able to clearly for-
mulate the problem. The Computer scientist should, again understandably,
explain possible solutions, if any. The difficulty here is both the difference in
the general approaches (for example, in the humanities, a problem is usually
solved manually, while in computer science it is not customary to manually
process the source material) and the lack of a common terminology. Pro-
fessional literature in both fields is highly specialized to study it without
relevant background, and thus, all members of the team have constantly to
learn from each other.
– New tools. In the humanities, we typically use basic computer tools in our
research: Word or other similar program for text processing, and a simple
presentation program for conferences. In most fields in the humanities, the
most prominent researchers are aged 50-70 and many of them will prefer to
avoid using computer tools unless absolutely necessary. In the CS, the situa-
tion is of cause quite different, and it is the responsibility of the humanities
researcher to learn at least some basic programs (i.e. the LaTeX that was
used to write this paper) in order to work effectively with the team.
4 Conclusions and recommendations
Our research team includes both CS and Humanities researchers and work in
a CS university lab, is a textbook example of a DH team. Our experience tells
that this collaboration provides very a successful, promising, and satisfactory
ecosystem for the entire team. There is little doubt that this type of research
collaboration will become more mainstream in the near future, and its impact
on the development of the Humanities will be even greater than can be imagined
now.
We want also to suggest possible solutions for the challenges as described in
the Collaboration Experience Section. These solutions aim at helping researchers
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 90/143
8 Vasyutinsky Shapira et.al.
to find each other, learn to understand each other, and make their collaboration
more efficient from the start.
– First of all, it is very desirable to have a common platform where people
from the Humanities and CS could describe their projects and look for col-
laborators. This could be especially helpful when researchers do not know
exactly what kind of counterpart they are looking for. Today, researchers
that sit in different buildings of the same campus, often have no means to
find each other. Within a particular university, such a role can be played by
a dedicated DH research center.
– Both fields, the CS and the humanities, are highly specialized and compli-
cated, and require many years of training. It is hardly possible to expect
that one person could successfully master both fields and achieve high pro-
ficiency in both. Besides, a researcher in the humanities often needs years
of practice in his field before he assembles enough knowledge and experi-
ence to put challenging research questions. Thus, though there is no point
for a humanities researcher to try to really master CS, it is important to
acquire general understanding of the field. This problem could be solved by
adding to the university curriculum courses in the fundamentals of computer
sciences tailored for MA and PhD students of Humanities. A DH research
center could also make an effective bridge between the CS and Humanities
faculties. DH conferences and workshops do help humanities researchers to
master new computer skills, and they also often provide an overview of the
state of art in a specific field, but first the more general understanding is
required and the more professionally and academically its done, the better.
– In our project, we held regular weekly team meetings. At these meetings,
both general issues and more specific technical issues are discussed, and at
all parts of the discussion all team members are present. Thus, we can all
consult each other, clarify complicated matters, and adjust our approach and
methods on the go, in accordance with the results we get. These meetings
help us learn each other’s terminology, ideas and methods. Additionally,
one of the CS team members gives the humanities member regular tutoring
about the relevant fields of the CS. All this combined together gives very
noticeable positive results, and half a year after the start of the project, the
whole team speaks, as a rule, in a common and efficient language.
References
1. Beit-Arié, M.: Hebrew codicology. Jerusalem: Israel Academy of Sciences and Hu-
manities (1981)
2. Beit-Arié, M.: Hebrew codicology: historical and comparative typology of Hebrew
medieval codices based on the documentation of the extant dated manuscripts from
a quantitative approach. M. Beit-Arié (2012)
3. Beit-Arié, M., Engel, E.: Specimens of mediaeval Hebrew scripts, in 3 vol. Israel
Academy of Sciences and Humanities (1987, 2002, 2017)
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 91/143
Deep learning for paleographic analysis of medieval Hebrew manuscripts 9
4. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In:
Proceedings of the IEEE conference on computer vision and pattern recognition.
pp. 770–778 (2016)
5. Klaus, A., al-Rahman ibn Muhammad Ibn Wafid (Abu al Mutarrif), A.: Traduc-
ciones y adaptaciones al hebreo de los tratados médicos-farmacológicos del toledano
Ibn Wafid. PPU (2007)
6. Pérez, I.: El testament de na Baladre (1325): nova aportació a l’estudi de les
sinagogues de Girona. Agrupación de Editores y Autores Universitarios (2012)
7. Richler, B., Beit-Airé, M., Pasternak, N.: Hebrew manuscripts in the vatican li-
brary. Catalogue. Compiled by the Staff of the Institute of the Microfilmed Hebrew
Manuscripts, Jewish National and University Library (Città del Vaticano). PMCid:
PMC3523710 (2008)
8. Richler, B., Beit-Arié, M.: Hebrew manuscripts in the biblioteca palatina in parma:
catalogue; palaeographical and codicological descriptions (2011)
9. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale
image recognition. arXiv preprint arXiv:1409.1556 (2014)
10. Sirat, C.: Hebrew manuscripts of the Middle Ages. Cambridge University Press
(2002)
11. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the incep-
tion architecture for computer vision. In: Proceedings of the IEEE conference on
computer vision and pattern recognition. pp. 2818–2826 (2016)
12. Yardeni, A., et al.: The book of Hebrew script: history, palaeography, script styles,
calligraphy & design. Carta Jerusalem (1997)
Twin Talks 2 and 3, 2020 Understanding and Facilitating Collaboration in Digital Humanities 92/143