      Teiresias: a Tool for Automatic Greek Handwriting
                           Translation

          Anna Berlino                       Luciano Caroprese                        Giuseppe Mirabelli
    berlinoanna93@gmail.com              l.caroprese@dimes.unical.it              giuseppemirabelli22@libero.it

                                                 Ester Zumpano
                                            e.zumpano@dimes.unical.it
                Università della Calabria, Ponte P. Bucci, 87036 Arcavacata di Rende (CS), Italy




                                                      Abstract
                      The term epigraphy refers to the science that studies the written texts
                      (epigraphs) of the different civilizations of the past in the different his-
                      torical periods to understand the history of the ancient world. Greek
                      epigraphs of the archaic and ancient periods contain many different
                      variations of each symbol; moreover, each of them is often written
                      imperfectly or ruined by time. The goal of epigraphists is to associate
                      these archaic symbols with the correct modern Greek letters. This
                      work is a contribution in the direction of providing ICT facilities in
                      the context of cultural heritage. More specifically, the paper presents
                      a multiclass classifier for the recognition of symbols belonging to the
                      archaic Greek alphabet. The Machine Learning technique used for our
                      classifier is that of Convolutional Neural Networks (CNN). The name
                      chosen for the tool is Teiresias, who in Greek mythology was a blind
                      prophet of Apollo in Thebes, famous for clairvoyance and for being
                      transformed into a woman for seven years. Teiresias's final aim is help-
                      ing epigraphists in finding the correct association from a handwritten
                      letter to its corresponding modern Greek letter, and therefore, in a
                      more general context, from a handwritten script to the corresponding
                      modern Greek script.




1    Introduction
Is there any use in rebuilding the past of humanity? Does it make sense with respect to the present and the
future? The answer is affirmative if, as Marc Bloch argued, historical reconstruction refers not only to the
“history of events”, but above all to the “history of mentalities”: the need to understand what the men of the
past thought, how they conceived religion and time, what emotions they felt and how they expressed them.
Everything must be considered a historical phenomenon, even our way of thinking, and everything, as a historical
phenomenon, originated in a given historical period. Therefore, the

Copyright © 2020 by the paper’s authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY
4.0).
In: A. Amelio, G. Borgefors, A. Hast (eds.): Proceedings of the 2nd International Workshop on Visual Pattern Extraction and
Recognition for Cultural Heritage Understanding, Bari, Italy, 29-Jan-2020, published at http://ceur-ws.org




knowledge of the so-called “structures”, that is, of those elements that still continue to be present in our mentality,
is fundamental to understanding the reality that surrounds us. The need to leave a testimony of one's own passage
has always been present in the human race, even in primordial civilizations, in periods prior to the advent of the
first writing systems, for example by handing down, through orality and material culture, the exploits of ancient
heroes and one's own uses and customs. Today, reconstructing the remote past, and therefore humanity, has
become fascinating if we consider that the historian relies not only on what has been disclosed by previous
studies, but also on the so-called “auxiliary sciences” of history, which are charged with confirming, integrating
or refuting what has or has not been documented by the men of a given period; disciplines such as archeology,
archaeometry, numismatics, anthropology and epigraphy are ready to give voice to a ceramic fragment, a coin,
or an inscription placed on a simple ex voto. Among the disciplines listed above, in relation to the final goal of
this work, the focus of this section will be on epigraphy, more specifically, the Greek one.

Generally, the term epigraphy refers to the science that studies the written texts (epigraphs) of the different
civilizations of the past in the different historical periods. On the basis of what has been said, excluding the
inscriptions on papyrus, parchment and paper, the focus of epigraphy is the inscriptions on sepulchral stones,
votive altars, supports of sculptures, slabs, stelae, columns, tablets and the plaster of buildings, on materials
such as stone, marble, metal, wood, bone, ceramic and precious stones. Deciphering an epigraph, and more
specifically a Greek epigraph, therefore means understanding the history of the ancient world, not only in its
most striking manifestations, such as the relations maintained with other civilizations of the past, but also in
aspects of everyday life: beliefs, habits, ways of thinking, ways of healing or feeding. The Greek world has
returned a multitude of epigraphs [Mnam] (see the figures below) that have to be deciphered. They can be
classified into sepulchral inscriptions (Figure 1), dedicatory/sacred inscriptions (Figure 2) and public registrations
(Figure 3).




                     (a) Archaeological Museum of Athens        (b) Archaeological Museum of Pithecusae

                                          Figure 1: Sepulchral Inscriptions



But where and when was the Greek alphabet born? Like every aspect of the Greek world, even the birth of
writing has undergone a process of mythization. The birth of writing has been attributed to several legendary
characters: Palamedes, Prometheus, the Muses inspired by Zeus and the versatile Hermes. However, parallel to
the mythical tradition, there is a historiographical tradition that considers the Greek alphabet a derivation of
the Phoenician one, with particular reference to the figure of king Cadmus.

Once they had acquired the Phoenician graphic system, the Greeks began to introduce a series of additions
(especially complementary signs) and modifications, both from a phonetic and a graphic point of view. All this
led, at least at first, to a rather inhomogeneous development of Greek writing across the different regions.




                       (a) British Museum, London             (b) Museum of Fine Arts, Boston

                                   Figure 2: Dedicatory/Sacred inscriptions




                                           Figure 4: Kirchhoff’s map


The first scholar who understood this was Adolf Kirchhoff, who explained, through a map, the results he had
achieved (Figure 4). In this map the German scholar distinguished the various regions of archaic Greece with
different colors: green indicated the regions whose alphabets had no complementary signs, more precisely the
islands of Thera, Melos and Crete; dark blue was for the regions that were part of the eastern group; light blue,
on the other hand, marked the areas that used the signs of the dark-blue group, but with small differences;
finally, red was the color of the regions of the western area (Magna Graecia and Sicily) [Kla78, Gua05, Blo09].




                    (a) Archaeological Museum of Athens           (b) British Museum, London

                                           Figure 3: Public Registrations




                  Figure 5: Principal characteristic forms of representative local Greek scripts

Figure 5 reports the modern Greek letters in its columns and, in its rows, the variations in shape of the
corresponding letters in the different ancient Greek local alphabets. Note that the letter A had several minor
variants depending on the position of the middle bar, some of them characteristic of local varieties. The letter B
had the largest number of highly divergent local forms: besides the standard form (either rounded or pointed),
there were many others. The letters Γ and Λ had multiple forms that could often be confused with each other,
as both are just an angle that could occur in various orientations. As is evident from Figure 5, each region used
a different variation in shape of the modern Greek letter; in addition, handwritten letters are imperfect, that is,
they present further variations in shape.




Different handwritten scripts were produced in the classical period and, luckily, some of them have reached us.
Obviously, as with any language, the quality of the documents can vary greatly, making a huge difference in
readability; handwritten scripts are not perfectly written, i.e. letters have imperfections, and therefore for each
handwritten letter the correct association with the corresponding modern Greek letter has to be established.
Even in today's digital age, translating handwritten Greek documents remains a challenge, and this work, far
from being easy, is generally performed manually by epigraphists.
Required skills go beyond linguistic ones: an epigraphist, in fact, must have detailed notions of the historical
context, the evolution of the graphic system and therefore the dialects present in the various areas, and must
know the type of material that supports the inscription, the area of origin, and the style of the object and of the
letters that make up the inscription. Without these additional skills it is impossible to read an epigraph, and
therefore to reconstruct a mutilated text, to distinguish a fake or an alien inscription from a genuine and local
one, or to recognize a mistake made by a stonemason and report it today as such.
In the interpretation of an inscription, epigraphists follow different steps: they carefully analyze the text,
reconstruct it, transcribe it from the archaic alphabet into modern Greek, translate what is reported on the
object and try to interpret its meaning, looking at the various elements hidden behind the inscription that allow
them to understand the period to which it dates.
This work is a contribution in the direction of providing ICT facilities in the context of cultural heritage. More
specifically, the paper presents a tool that helps in finding the correct association from a handwritten letter to
its corresponding modern Greek letter and therefore, in a more general context, from a handwritten script to the
corresponding modern Greek script. The name chosen for the tool is Teiresias, who in Greek mythology was a
blind prophet of Apollo in Thebes, famous for clairvoyance and for being transformed into a woman for seven
years.

2     Background
In this section the concepts of Machine Learning, Deep Neural Network, Convolutional Neural Network and Data
Augmentation, fundamental in the architecture of the Teiresias system, will be recalled.

2.1   Machine Learning
Machine Learning [Ger19] consists of the study and development of techniques that allow computers to learn from
data without being explicitly programmed. The sets of examples that the system uses to learn are called training
sets. Each example of a training set (in our case, a symbol of the archaic Greek alphabet) is called an instance.
Machine Learning is particularly effective when traditional approaches do not guarantee any acceptable solution
for the problem to be solved, or when the number of rules to be codified using these approaches is extremely
broad and complex. Think of the intrinsic complexity of the problem of recognizing a symbol (in our case
belonging to the Greek archaic alphabet). A traditional algorithm should analyze the source image identifying
its main components (lines, angles, etc.) and the position of each of these with respect to the others. At
the same time it should be able to recognize and ignore imperfections and anomalies and take into account
horizontal/vertical translations, rotations and deformations of the image. All these tasks are extremely difficult
to code using procedural programming techniques. With Machine Learning, a model is created whose task is to
map an input sample (e.g. a symbol belonging to the archaic Greek alphabet) into one or more output values
(e.g. a letter belonging to the modern Greek alphabet). This model is characterized by a set of parameters
whose optimal values are calculated during a learning phase, minimizing the distance (error) between the output
produced by the model and the expected value.
There are supervised and unsupervised Machine Learning techniques. For the former, samples belonging to the
training set are labeled and the output values of the model have to be as close as possible to the labels. For the
latter, samples in the training set do not have any label. The second important distinction is between regression
techniques and classification techniques. Regression models (or regressors) provide continuous output values.
For example, they can be used to predict the price of stocks, or calculate the value of real estate properties in a
certain area. Classification models (or classifiers) instead derive the class an instance belongs to. If only a single
class is considered, we have a binary classifier (e.g. detecting whether or not a cat is present in a picture). If
more classes are considered, we have a multiclass classifier: in this case, for each sample, the classifier outputs
the class the instance most probably belongs to.
In this paper a multiclass classifier for the recognition of symbols belonging to the archaic Greek alphabet will
be developed. The Machine Learning technique used for our classifier is that of Deep Neural Networks (DNN)




and in particular that of Convolutional Neural Networks (CNN).

2.2   Deep Neural Networks




                                           Figure 6: An Artificial Neuron

An Artificial Neural Network (ANN) is a computational model composed of artificial neurons, inspired by a
biological neural network. An artificial neuron has the structure shown in Figure 6. It receives as input the
values [x1 , ..., xn ], performs a linear combination of them using the weights [w1 , ..., wn ] and bias b and processes
the result z using a non-linear function g, called activation function (Formula 1).
                                          z = Σ_{j=1}^{n} x_j w_j + b,     y = g(z)                                  (1)
The output y can be part of the input of other neurons or returned as a final result.
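Formula 1 can be sketched directly in Python (a minimal illustration, not part of the paper's code; the input values, weights and bias below are arbitrary examples):

```python
import math

def neuron(x, w, b, g):
    """Artificial neuron (Formula 1): weighted sum of the inputs plus bias,
    passed through the activation function g."""
    z = sum(xj * wj for xj, wj in zip(x, w)) + b
    return g(z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# z = 1*0.5 + 2*(-0.25) + 0 = 0, and sigmoid(0) = 0.5
y = neuron([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid)
```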
An Artificial Neural Network is composed of many layers of artificial neurons. A Deep Neural Network (DNN)
is an Artificial Neural Network that takes advantage of modern algorithms (AdaGrad [DHS11], RMSProp [HSS],
Adam [DJ15], etc.) improving the gradient descent technique [Rud16] by avoiding the problems that normally
affect it (vanishing/exploding gradients, overfitting, long training times, etc.).
There are many activation functions for the inner layers (hidden layers) of a neural network. Here we report just
a few of them:

  • Step function:
                                          g(z) = 0 if z ≤ 0;   1 if z > 0

  • Sigmoid:
                                          g(z) = 1 / (1 + e^{−z})

  • Hyperbolic Tangent:
                                          g(z) = (e^z − e^{−z}) / (e^z + e^{−z})

  • Relu:
                                          g(z) = max(0, z)

  • Leaky Relu:
                                          g(z) = max(αz, z)

  • Elu:
                                          g(z) = α(e^z − 1) if z ≤ 0;   z if z > 0
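These activations can be written down in a few lines of Python (an illustrative sketch, with the small positive constant α = 0.1 as in the text):

```python
import math

def step(z):
    return 0.0 if z <= 0 else 1.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hyperbolic_tangent(z):
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))

def relu(z):
    return max(0.0, z)

def leaky_relu(z, alpha=0.1):
    return max(alpha * z, z)

def elu(z, alpha=0.1):
    return alpha * (math.exp(z) - 1.0) if z <= 0 else z

# e.g. relu(-2.0) == 0.0 while leaky_relu(-2.0) == -0.2:
# the small negative slope keeps the gradient alive for z < 0
```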




In previous formulas α is a small positive constant (e.g. α = 0.1). The output of a neural network is returned
by a layer with these activation functions:
  • Linear function for regressors;
  • Sigmoid function for binary classifiers;
  • Softmax function for multiclass classifiers (like in our case). A multiclass classifier with k classes has k
    outputs, each of them representing the probability that the input belongs to the corresponding class (see
    Figure 7).




                                         Figure 7: A Multiclass Classifier.

      Let [z_1, ..., z_k] be the outputs of the last layer before applying the output activation function. Then the
      softmax function is defined as follows:

                                 ŷ_i = e^{z_i} / Σ_{j=1}^{k} e^{z_j},    for each i ∈ [1..k]


The training process minimizes the value of a loss function, representing the gap between what is predicted by
the network for each instance (ŷ = [ŷ1 , ..., ŷk ]) and the label of the instance (y = [y1 , ..., yk ]).
Observe that, as ŷi (i ∈ [1..k]) represents the probability for the instance x to belong to the class i, its value is
between 0 and 1. Moreover, the label y contains only one 1 corresponding to the class c the instance x belongs
to (yc = 1). The loss function used for multiclass classifiers is the categorical crossentropy (Formula 2):
                                          L(y, ŷ) = − Σ_{j=1}^{k} y_j log(ŷ_j)                                   (2)
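The softmax output and the crossentropy loss can be sketched together (illustrative Python, not the paper's implementation; the example logits are arbitrary):

```python
import math

def softmax(z):
    """Softmax over the last-layer outputs; the maximum is subtracted first,
    a standard trick for numerical stability (it does not change the result)."""
    m = max(z)
    exps = [math.exp(zi - m) for zi in z]
    s = sum(exps)
    return [e / s for e in exps]

def categorical_crossentropy(y, y_hat):
    """Formula 2: L(y, y_hat) = -sum_j y_j * log(y_hat_j), with y one-hot."""
    return -sum(yj * math.log(pj) for yj, pj in zip(y, y_hat) if yj > 0)

probs = softmax([1.0, 1.0, 1.0])                   # uniform: [1/3, 1/3, 1/3]
loss = categorical_crossentropy([0, 1, 0], probs)  # -log(1/3) ≈ 1.0986
```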


2.3    Convolutional Neural Networks
Convolutional Neural Networks (CNN) make it possible to process large images. This task would be prohibitive
for traditional neural networks, because the number of parameters to be trained would be too large: in a
traditional neural network, every neuron belonging to a layer receives as input all the values returned by the
previous layer, and the training algorithm has to train the corresponding weights (and biases).
For example, suppose we have a 1000 x 1000 picture in input and 1000 neurons in the first layer of a fully
connected neural network. In this case, the parameters to be trained (for the first layer alone) number more
than a billion!
Convolutional Neural Networks solve this problem. They were designed to imitate the structure and behavior
of the human visual cortex.
The neurons of the visual cortex have a small local receptive field, which means that they react only to visual
stimuli located in a limited region of the visual field. The receptive fields of different neurons can overlap and
together cover the entire visual field.
Furthermore, it has been shown that these neurons are specialized in detecting only certain features of the image.




Neurons closest to the input (low layers) detect simpler features (e.g. vertical and horizontal lines).
The neurons of the next layers gradually detect more complex features derived from the simpler ones extracted
by the previous layers (e.g. geometric shapes, shades, etc.).
This schema has been reproduced in Convolutional Neural Networks. The main component of a Convolutional
Neural Network is a Convolutional Layer followed by a non-linear activation function (e.g. Relu). A Convolutional
Layer is characterized by a certain number of filters (each of them used to detect a particular feature), a
stride (the step used by the filters to scan the input image) and a padding (a frame of zero values added to the
input to make the output the same size as the input). Each filter has a certain size (e.g. 3 x 3).
Convolutional layers alternate with pooling layers (e.g. maxpool layers), whose purpose is to reduce the size of
the image being processed.
The last layers are traditional layers (fully connected layers). In particular, the last one, in the case of a multiclass
classifier, has a softmax activation function (Figure 8).
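To make the roles of filters, stride and padding concrete, here is a naive single-filter convolution (stride 1, zero "same" padding, as commonly implemented in CNN libraries) together with a 2 x 2 max pooling, in NumPy. This is a didactic sketch only; the edge-detection kernel below is an arbitrary example:

```python
import numpy as np

def conv2d_same(image, kernel):
    """Single-filter convolution, stride 1, zero 'same' padding:
    the output has the same height/width as the input."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # slide the filter over the image with step (stride) 1
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def maxpool2d(image, size=2):
    """Non-overlapping max pooling: halves each spatial dimension when size=2."""
    h, w = image.shape
    return image[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.arange(16.0).reshape(4, 4)
edge = conv2d_same(img, np.array([[-1.0, 0.0, 1.0]] * 3))  # a vertical-edge filter
pooled = maxpool2d(img)                                    # shape (2, 2)
```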




                                     Figure 8: A Convolutional Neural Network

2.4    Data Augmentation
As is well known, having a large dataset is crucial for the performance of a deep learning model. However,
collecting new data is often not easy; the strategy, therefore, consists in improving the performance of the model
by augmenting the data we already have. Data augmentation means increasing the number of data points; in
the specific case of images, it means increasing the number of images in the dataset.
Data augmentation encompasses a suite of techniques used to artificially expand the size and quality of a training
dataset by creating modified versions of its available data. This technique is of great importance in all those
application domains that have access to limited data, such as, for example, medical image analysis or our specific
domain. Transformations include a range of operations specific to the field of image manipulation, such as shifts,
rotations, flips, changes of lighting conditions, zooms, and many others. Many different techniques for data
augmentation exist in the literature; see [SK19] for a survey of image data augmentation techniques.

3     Teiresias: Dataset, Architecture and Experimental Results
This work is a contribution in the direction of providing ICT facilities in the context of cultural heritage. More
specifically, the paper provides a tool, called Teiresias, for supporting epigraphists in understanding the corpus
of ancient inscriptions, as it provides the correct association from ancient handwritten Greek symbols to the
corresponding ones in modern Greek.
The first problem in developing Teiresias has been the lack of a sufficient amount of training data. More
specifically, in our setting, to the best of our knowledge, no dataset of handwritten ancient Greek letters exists.
In order to solve this problem and obtain sufficient training data, an image dataset has been synthesized by
applying data augmentation to the different variations in shape of specific ancient local alphabets [Wiki]. For
each letter (represented by an image) of each specific ancient local alphabet, many different imperfect variants
have been synthesized by means of data augmentation. The final aim was to expand the training dataset with
new, plausible samples of handwritten letters.
The original dataset consists of 583 images (format: jpeg, size: 519 x 640) belonging to 28 classes (modern Greek
letters from ’Alpha’ to ’Zeta’).
We applied the following transformations:
We applied the following transformations:
    • zoom: ±20%,




  • rotation: ±10 degrees,

  • width/height shift: ±10%,

  • shear: ±20%.

We could not apply transformations like horizontal/vertical flips because they would have changed the semantics
of the original symbols. For the same reason, we limited the range of rotation to only ±10 degrees. Starting from
the original dataset we generated 500.000 small images (format: png, size: 28 x 28).
The deep neural architecture chosen for Teiresias is a Convolutional Neural Network. The main motivation is
that CNNs are currently the main architecture used for image analysis and classification tasks.
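In Keras such transformations are typically configured through the ImageDataGenerator class; as a library-free illustration of the idea, a random width/height shift within ±10% can be sketched in NumPy as follows (a hypothetical helper, not the paper's code; the seed and the fake glyph are arbitrary):

```python
import numpy as np

def random_shift(image, max_frac=0.10, rng=None):
    """Shift an image by up to ±max_frac of its size along each axis,
    filling the uncovered border with zeros (the background value)."""
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape
    dy = int(rng.integers(-int(h * max_frac), int(h * max_frac) + 1))
    dx = int(rng.integers(-int(w * max_frac), int(w * max_frac) + 1))
    out = np.zeros_like(image)
    ys, yd = (dy, 0) if dy >= 0 else (0, -dy)   # source / destination row offsets
    xs, xd = (dx, 0) if dx >= 0 else (0, -dx)   # source / destination column offsets
    out[yd:h - ys, xd:w - xs] = image[ys:h - yd, xs:w - xd]
    return out

letter = np.zeros((28, 28))
letter[10:18, 12:16] = 1.0                      # a fake centered glyph
augmented = random_shift(letter, 0.10, np.random.default_rng(0))
```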

The system has been developed in Python (ver. 3.6) using the Deep Learning libraries TensorFlow (ver. 1.13.1)
and Keras (ver. 2.2.4).
The structure of the CNN is the following:

  • Input layer, size: 28 x 28 x 1

  • Conv2D, filters: 64, size: 7 x 7, padding: same, activation: relu

  • MaxPooling2D, size: 2 x 2

  • Conv2D, filters: 128, size: 3 x 3, padding: same, activation: relu

  • Conv2D, filters: 128, size: 3 x 3, padding: same, activation: relu

  • MaxPooling2D, size: 2 x 2

  • Conv2D, filters: 256, size: 3 x 3, padding: same, activation: relu

  • Conv2D, filters: 256, size: 3 x 3, padding: same, activation: relu

  • MaxPooling2D, size: 2 x 2

  • Conv2D, filters: 512, size: 3 x 3, padding: same, activation: relu

  • Conv2D, filters: 512, size: 3 x 3, padding: same, activation: relu

  • MaxPooling2D, size: 2 x 2

  • Flatten

  • Dense, neurons: 128, activation: relu

  • Dropout, rate: 0.5

  • Dense, neurons: 64, activation: relu

  • Dropout, rate: 0.5

  • Dense, neurons: 28, activation: softmax

The number of trainable parameters is 4.725.596.
Observe that the dimension of the input is 28 x 28 x 1, as the width and the height of the input instances are 28
pixels and they are greyscale pictures (one channel).
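The reported total can be checked by hand from the layer list above: a Conv2D layer has (kh · kw · cin + 1) · filters parameters, a Dense layer (inputs + 1) · neurons (one bias per filter/neuron), and the four poolings shrink the 28 x 28 input to 14, 7, 3 and finally 1, so Flatten outputs 512 values:

```python
def conv_params(kh, kw, c_in, filters):
    # (kernel height * kernel width * input channels + 1 bias) per filter
    return (kh * kw * c_in + 1) * filters

def dense_params(n_in, n_out):
    # fully connected: one weight per input plus one bias, per neuron
    return (n_in + 1) * n_out

total = (
    conv_params(7, 7, 1, 64)          # 28x28 -> pool -> 14x14
    + conv_params(3, 3, 64, 128)
    + conv_params(3, 3, 128, 128)     # -> pool -> 7x7
    + conv_params(3, 3, 128, 256)
    + conv_params(3, 3, 256, 256)     # -> pool -> 3x3
    + conv_params(3, 3, 256, 512)
    + conv_params(3, 3, 512, 512)     # -> pool -> 1x1, flatten -> 512
    + dense_params(512, 128)
    + dense_params(128, 64)
    + dense_params(64, 28)
)
# total == 4725596, matching the figure reported above
```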
It is worth recalling that Conv2D, MaxPooling2D, Flatten and Dense are classes belonging to the module
tensorflow.keras.layers, implementing the main types of layers used in CNNs.
Conv2D implements a convolutional layer. It receives in input the number of filters, the dimension of the filters,
the stride (default is 1), the padding (same if the output must have the same dimensions as the input, valid
otherwise) and the activation function.




MaxPooling2D implements a maxpool layer. It receives in input the dimension of the pool (e.g. if it is 2 x 2, the
dimensions of the input will be halved).
Flatten reshapes its input into a flat vector.
Finally, Dense implements a dense layer (each neuron is connected to all the inputs). It receives in input the
number of neurons and the activation function. The algorithm used for the training phase is Adam, with
categorical crossentropy (Formula 2) as loss function. To use the system manually, we developed a very simple
interface allowing a user to test it. Figure 9 reports part of the output of an experiment.
We used 360.000 images for the training set, 40.000 images for the validation set and the remaining 100.000 for
the test set. The accuracy reached by Teiresias over the test set (100.000 images) is 90.22%.
We believe that this result is encouraging and allows us to consider Teiresias a good starting point for the
development of more sophisticated systems to support epigraphists.
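The 360.000/40.000/100.000 split can be sketched as a shuffled index partition (an illustrative helper; the paper does not describe its exact pipeline, and the seed below is an arbitrary assumption):

```python
import numpy as np

def split_indices(n, n_train, n_val, seed=42):
    """Shuffle the indices 0..n-1 and partition them into
    train / validation / test index arrays (the remainder goes to test)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(500_000, 360_000, 40_000)
# len(train_idx) == 360000, len(val_idx) == 40000, len(test_idx) == 100000
```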




                                        Figure 9: The output of Teiresias

4   State of the Art
Many different epigraphs have been collected over the centuries, translated and interpreted by epigraphists.
An example is the Corpus Inscriptionum Graecarum (CIG), published in Berlin by the Prussian Academy in
1815. The CIG is a collection of Greek inscriptions found in different areas, accompanied by the drawing of the
object bearing the inscription, its transcription and important comments of a historical and philological nature.
This valuable instrument was subsequently replaced by the Inscriptiones Graecae (IG), also published in Berlin
and considered a sort of integration of the first, which reports more in-depth information on the origin of the
inscriptions: Greece, the islands, Magna Graecia. The Supplementum Epigraphicum Graecum (SEG) is an online
journal that collects Greek inscriptions found later. This tool provides interested users with the opportunity to
carry out research on Greek historical fragments with their English translation.




The Perseus Project [Pers] is another website that allows one to carry out research on Greek texts, with their
translations and various information and references with comments. Its final aim is to build a database of
historical information, including information of an archaeological nature.
Axon [Axon] is an Italian database created by the Ca’ Foscari University of Venice. The database can be
consulted without any subscription, unlike the previous two; it is a valid support for everyone, scholars or the
merely curious, as it is quite easy to use and interpret. Each finding reported in the “catalog” is accompanied
by a detailed sheet that can be downloaded. It contains much information characterizing the finding: the author
who edited the file, the name of the find, a summary describing the object, the year to which it dates back, the
type of support and object, the place of conservation, the text engraved on the object and its translation into
Italian, the place and period of the finding and the various historians who have taken care of the registration.
In addition, since the authors who interpret and translate the inscriptions are often different, the names of the
authors who interpreted the inscription on the object are highlighted in black.
Another important database is PHI Greek Inscriptions [Insc]. This is a database collecting Greek epigraphs
divided by region; texts are reported only in Greek, without any translation. Once logged into the database, it
is possible to perform sophisticated searches in order to obtain information regarding all the regions of Greece,
including the islands, and the links to the pages containing the inscription. An interesting feature is that a
search for a specific word or name can be performed in a specific area or ancient region, or can be extended to
all regions: the final result will highlight all the fragments bearing the previously typed word or name belonging
to the specified areas or regions.
A recent AI system is Pythia, named after the Pythia, the priestess of the god Apollo who, according to ancient
beliefs, provided the answers whenever the faithful addressed questions to the oracle. Pythia is the first trained
AI system that restores the missing characters in a deteriorated epigraphic text using deep learning. The goal of
this system, therefore, is to provide important support to the epigraphists who approach the activity of
interpreting an ancient deteriorated text.
All the above-mentioned systems have the final aim of providing information on handwritten Greek texts; in
addition, some of them also provide the translation obtained thanks to the work manually performed by
epigraphists. As pointed out, this work is far from easy, as the main problem is that handwritten scripts are not
perfectly written, i.e. letters present imperfections, and therefore for each handwritten letter the correct
association with the corresponding modern Greek letter has to be established. It is therefore evident that a tool
providing the correct association from a handwritten letter to its corresponding modern Greek letter could be a
valid support for epigraphists.

5   Concluding Remarks and Direction for Future Research
In the archaic period many different ancient Greek local alphabets existed, and therefore many different variations
in shape of each symbol. In addition, different handwritten scripts were produced in the classical period and,
obviously, as with any language, the quality of the documents can vary greatly, making a huge difference in
readability. Handwritten scripts are not perfectly written, i.e. letters have imperfections, and therefore for each
handwritten letter the correct association with the corresponding modern Greek letter has to be established.
Even in today's digital age, translating handwritten Greek documents remains a challenge, and this work, far
from being easy, is generally performed manually by epigraphists, whose goal is to interpret an inscription, i.e.
to associate archaic symbols with the correct modern Greek ones.
This work is a contribution in the direction of providing ICT facilities in the context of cultural heritage: Teiresias
helps in translating a handwritten script into the corresponding modern Greek script. Teiresias is a multiclass
classifier based on a Convolutional Neural Network. In Teiresias we used 360.000 images for the training set,
40.000 images for the validation set and 100.000 for the test set. The accuracy reached by Teiresias on the test
set (100.000 images) is 90.22%. The obtained results are encouraging and allow us to consider Teiresias a good
starting point for the development of more sophisticated systems to support epigraphists.




References
[SK19]    Connor Shorten, Taghi M. Khoshgoftaar. A survey on Image Data Augmentation for Deep Learning.
          J. Big Data 6: 60 (2019)

[Wiki]    https://en.wikipedia.org/wiki/Archaic_Greek_alphabets

[Pers]    www.perseus.tufts.edu

[Axon]    https://virgo.unive.it

[Insc]    https://inscriptions.packhum.org

[Kla78]   Günther Klaffenbach. Epigrafia Greca. Firenze, 1978.

[Gua05]   Margherita Guarducci. L’epigrafia greca dalle origini al tardo impero. Roma, 2005.

[Blo09]   Marc Bloch. Apologia della storia o Mestiere di storico. Torino, 2009.

[Mnam]    http://mnamon.sns.it/index.php?page=Esempi&id=12

[Ger19]   Aurélien Géron. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts,
          Tools, and Techniques to Build Intelligent Systems. O’Reilly, 2019, ISBN-13: 978-1492032649.

[DHS11]   John C. Duchi, Elad Hazan, Yoram Singer. Adaptive Subgradient Methods for Online Learning and
          Stochastic Optimization. JMLR, volume 12, 2011, pages 2121–2159.

[HSS]     Geoffrey Hinton, Nitish Srivastava, Kevin Swersky.
          https://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf, slide 29.

[DJ15]    Diederik P. Kingma, Jimmy Ba. Adam: A Method for Stochastic Optimization. International Conference
          on Learning Representations, ICLR (Poster), 2015.

[Rud16]   Sebastian Ruder. An overview of gradient descent optimization algorithms. CoRR, volume
          abs/1609.04747, 2016.



