=Paper= {{Paper |id=Vol-3015/141 |storemode=property |title=Italian Sign Language (LIS) and Natural Language Processing: an Overview |pdfUrl=https://ceur-ws.org/Vol-3015/paper141.pdf |volume=Vol-3015 |authors=Sabina Fontana,Gaia Caligiore |dblpUrl=https://dblp.org/rec/conf/aiia/FontanaC21 }} ==Italian Sign Language (LIS) and Natural Language Processing: an Overview== https://ceur-ws.org/Vol-3015/paper141.pdf
      Italian Sign Language (LIS) and Natural Language
                    Processing: an Overview

           Sabina Fontana [0000-0003-3083-1676] and Gaia Caligiore [000-0002-7087-1819]
               1 University of Catania, Department of Humanities, Italy
                               sabina.fontana@unict.it
               2 University of Catania, Department of Humanities, Italy
                             gaia.caligiore@phd.unict.it



       Abstract. The past decade has seen an increase in studies conducted on the inter-
       action between NLP and sign languages. In this paper, we mainly focus on LIS
       while discussing the current state of the art, possible future developments and the
       ethical implications of this growing research context. In the history of NLP, hu-
       man/computer interaction has been mainly based on the transcription of spoken
       languages. We investigate how existing resources for spoken language processing
       can be applied to SLs and combined with language-specific tools, providing ex-
       amples of recent resources. We discuss novel strategies for sign transcription that
       consider both the need for standardized writing forms to enable NLP, as well as
       the language-specific features of SLs that are conveyed through the visual-manual
       channel. Deaf contributors are fundamental within this research. When NLP and
       SLs interact, we find a shift from a user-centric approach towards a user-based one
       to be essential: Deaf end-users of the resulting resources thus become part of the
       design process¹.


       Keywords: Italian Sign Language (LIS), Natural Language Processing, Transla-
       tion.


1      Deafness and Deaf Communities

   Different degrees of deafness affect millions of people around the world. The Italian
National Institute of Statistics (ISTAT) states that in 2019 a total of 3 million people
(5.2% of Italy’s population) suffered from hearing loss [1], a marked increase over the
877,000 hearing-impaired people estimated in 1999 [2]. Outside of Italy,

¹ The present paper has been co-authored by Sabina Fontana and Gaia Caligiore. Sabina Fontana
developed section 1, subsection 2.1 and section 4. Gaia Caligiore developed subsections 2.2 to
2.6 and section 3. After each author provided feedback on the sections written by the other,
the paper was jointly revised.

Copyright ©2021 for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
2   S. Fontana and G. Caligore



the WHO (World Health Organization) predicts that by 2050 nearly 2.5 billion people in
the world will have some degree of hearing loss [3].
   Whether they are born deaf or lose their hearing in early infancy due to illness or an
accident, individuals with different levels of deafness (moderate, severe, profound) face
difficulties in communicating with hearing speakers. These difficulties differ depending on
whether deafness is experienced in early infancy, during language development, or acquired
in adulthood, and on whether deaf children are exposed to a sign language in early infancy.
About 10% of the deaf population is born deaf to deaf parents. The remaining 90% have
hearing parents who do not know a sign language. This raises an issue concerning the
transmission of sign languages, which tends to occur horizontally (among peers) rather than
vertically (from generation to generation). Recently, mainstreaming and technology have
further influenced the educational path of deaf children and have led to delayed or denied
access to a sign language. Families tend to prefer normalization, choosing a spoken
language and resisting bilingualism. Consequently, sign language is very often accessed and
learnt in adulthood, and Deaf signers may acquire different levels of proficiency in spoken
language and sign language [4].
   Sign languages and spoken languages vastly differ since they employ different mo-
dalities: auditory-oral and visual-manual. For this reason, the bilingualism of Deaf sign-
ers is often referred to as bimodal [5]. Deaf people's bilingualism has some specific fea-
tures. Firstly, it is bimodal [6] because they learn or acquire two languages exploiting
different modalities through different paths: spoken language through speech therapy
starting from early infancy and sign language through exposure, rarely during infancy,
more frequently at a young age or even later in their lives. The two languages are used
alternately but remain in contact: they influence each other and can even occur
simultaneously, since the two modalities impose no serial order constraints [7]. Secondly,
it is generally an unbalanced bilingualism: even if learnt later, the sign language appears
to be their natural language, and their skills in the spoken language are rarely comparable
to those of native speakers². Lastly, the two languages have different sociolinguistic
statuses. On the one hand, the spoken language, that is, the majority language, is widely
shared, institutional and used in education, in media communication and in many other
formal contexts. On the other hand, the sign language is a minority language that was long
stigmatized and has only recently begun to be used in formal contexts, though still not
enough in education. Although many schools and
universities provide support in sign language through professional interpreters or com-
munication assistants, there are only a few bilingual sign/spoken language schools in
Italy (in Biella, Cossato and Rome) where LIS is studied like any other subject and taught
to all students [8].
   It is important to highlight that this condition of bilingualism influences the usage
and the perception of sign languages by Deaf users [9]. It also influences the research
work that must be carried out. It is fundamental to keep in mind that the target group is
far from being homogeneous as it consists of people who have vastly different


² We are not using the categories ‘first language’, ‘second language’, and ‘mother tongue’ as
  they do not exactly mirror the specificities of deaf bilingualism.



experiences of Deafness and language acquisition. The experience of Deaf signers who
present bimodal bilingualism will be different from that of individuals who lost their
hearing later in life and are mainly faced with communication problems related to access
rather than comprehension of a spoken language.

1.1     The Italian Sign Language (LIS)
   As with spoken languages, analyzing a sign language from an interlinguistic perspective
and within a broader, more international frame of reference provides a more rounded
understanding of linguistic and social phenomena. Given our research interests, in this
paper we mainly focus on LIS. Therefore, all images and resources mentioned are
considered within the LIS framework.
   As for the signing population, the database Ethnologue [10] lists 121 sign languages
used by 70 million Deaf people, with 60,000 users of LIS. But what are the characteristics
that determine the basic traits of LIS? Studying sign language from a phonocentric
perspective can lead to a search for the same categories that define spoken languages.
While this has been the preferred strategy in the past, it is fundamental to keep in mind
that the different modalities (visual-manual/auditory-vocal) will call for different de-
scriptions of a language that may not mirror each other. For this reason, we like to start
describing LIS by observing how gestuality is systematically organized to create mean-
ing. LIS employs the visual-manual modality, involving both manual and non-manual
elements in the construction of signs. Manual elements are usually defined using the
following major parameters: the shape that the hand (or hands) assumes while performing
a sign, its place of articulation (also referred to as ‘location’), the way the hands move
in space, and hand orientation, i.e., the position of the palm of the dominant hand [11].
Non-manual elements play an equally crucial role in the construction of meaning and
include head and body movements, facial expressions and mouth gestures [11]. These
elements can simultaneously convey semantic information or have pronominal value
during role-shift, a complex process that allows signers to ‘become’ the person, animal
or object they are representing.




Fig. 1. The sign ‘TU?’ (‘YOU?’) in LIS. The man depicted in the figure is raising his eyebrows
[12] and using a specific configuration of the lips. In doing so, he is creating the interrogative form
of the sign. This example shows how non-manual elements allow us to distinguish between state-
ments and questions.



The systematic creation of meaning in LIS can generate signs with different degrees of
arbitrariness or iconicity. Gestuality is an essential aspect of human communication: if
hands become the core elements of a language, there will be a continuity between
gestuality and signs, which gives rise to iconic phenomena. The presence of iconicity does not
exclude arbitrariness. The spontaneous evolution of an arbitrary sign is arbitrary and un-
predictable and takes place within a community that is impacted by specific cultural,
social and geographical influences.
      More than half a century has passed since the publication of the first methodolog-
ical studies on sign languages. During this time, a shift in linguistics has slowly taken
place, leading to a transformation in the perception of sign languages that are now uni-
versally accepted as natural ones and have been legally recognized as official languages
in numerous countries. Changes in attitude and increased visibility forced signers to reflect
on their language and define a notion of correctness. At the same time, when sign
languages became visible and accessible to a larger Deaf and hearing public (also
through professional interpreting services on TV and in other official contexts), Deaf
people realized that their languages lacked many lexical items and functions needed to
meet different communicative needs [13].
     The recognition of sign languages has been gradual but steady and has seen a fast
development in the last 20 years. Within the Italian political and social context, the 2020
pandemic made LIS (Lingua dei Segni Italiana–Italian Sign Language) increasingly vis-
ible and public, as interpreters translated presidential speeches and conferences [14].
This newly acquired visibility – together with the widespread use of social media by
Italian Deaf people – only sped up the ongoing process of standardization. At the same time,
as LIS becomes more prominent and is used in more contexts, linguistic growth becomes
necessary, leading to a rapid expansion of its vocabulary. Furthermore, the pandemic
highlighted the hardship of Deaf people in contexts where face-to-face communication
is limited, culminating in the official recognition of LIS and LIST (Tactile Italian Sign
Language) by the Italian Government in May of 2021 [15].
    As for the analysis of LIS structures, the main issue is that, up until the end of 2020,
there were no grammars that extensively described its structural phenomena. This is due
to different factors. LIS was first described in 1987 by Virginia Volterra [16]. At the
time, the objective was the ‘dignification’ of the language, leading to a pursuit of the
same parameters that define spoken languages within an assimilationist perspective.
These first stages of LIS studies were characterized by a search for categories such as
phonemes or minimal pairs. After the first decade of research on LIS, signers
themselves started using LIS in more formal environments [17]: the move of LIS beyond
domestic and informal contexts led to an expansion and standardization of the language,
which is still ongoing. With that came a newfound awareness among Italian signers, who
now see their language as a vehicle of Deaf pride and as an object in need of protection
and preservation.
    We take it for granted that sign languages are natural, meaning that they evolve spon-
taneously within a community. They are different from spoken languages in that they are
not a manual translation of spoken languages and do not have standard written forms.
Since the members of these communities are Deaf, they will make use of their bodies to



convey meaning. As we have said before, at present, sign languages are considered ‘oral’
in that there is no formal system for their transcription. As bimodal bilinguals, signers
usually rely on the written form of spoken languages. This has largely influenced
research on sign languages, leading to the use of glosses for sign transcription: the
translation of the sign into spoken language, written in all capitals. Of course, an
external writing system that relies on translation does not convey the expressiveness of
signs, hence the central need to move beyond glosses in sign language studies. Within this
context, several attempts have been made at designing an independent writing system for
signs, the most successful being SignWriting [18].


2      Natural Language Processing and Italian Sign Language:
       Issues and Resources

   We focus our attention on the possible contacts between NLP and sign languages to
facilitate interactions between Deaf and hearing people and reflect on the practical and
ethical challenges researchers are faced with when these two worlds come into contact.
The sign language we focus on is Italian Sign Language (LIS). However, the observations
we make in this paper refer to topics that lie at the core of all sign languages. We
also make some observations on the implications of automatically translating sign
languages into spoken languages and vice versa.
   In this section, we first focus on linguistic issues to be considered when working with
LIS. We then move on to the current state of the art, describing projects for text to video
synthesis in the Italian context. After that, we discuss available options for glossing LIS.
We also provide Italian and international examples of different strategies employed to
build datasets by converting text into sign language and sign language videos into text,
through processes of manual and automatic recognition.

2.1    Linguistic and Ethical Issues
   In shaping a dataset, different types of issues must be taken into consideration. First,
we consider those issues that are related to the specific characteristics of LIS. We men-
tioned that sign languages should be looked at and analyzed using independent categories
that do not necessarily mirror those of spoken languages. However, given that a compu-
tational analysis must go through a written message, it is necessary to reflect upon sign
language transcription, its linguistic, social and ethical implications. Like other sign lan-
guages, LIS is an oral language and does not have a standardized writing system, given
the visual-manual modality it employs and the fact that it relies on written Italian as an
external writing system. An international, language-specific transcription system is
SignWriting [18] which consists of a database of iconic symbols representing orientation
and handshape of one or both hands, facial expression, movements, contact and location
of the sign. These symbols are combined to represent signs. Due to its iconic nature,
SignWriting allows for a detailed and simultaneous representation of the multilinearity
of signed discourse but, despite the positive response from researchers, it has failed to



become the annotation system of choice for signers. As discussed in subsection 1.1,
only recently have sign languages been analyzed, described and recognized as natural
languages. For these reasons, some signs may not have been developed yet. From a
social standpoint, dataset collection should involve Deaf people not only as informants
but as participants, since their lived experience and linguistic knowledge are crucial for
the development of a system that should account for their needs. The contribution of
Deaf researchers is also important in defining the setting for data elicitation so as to
create a more natural dataset. Very often, when data are elicited in a highly artificial
setting, Deaf informants may shift to spoken language or to mixed spoken and signed
language varieties, i.e., contact signing [19]. This brings us to the second issue that
should be considered in collecting datasets: the need to account for the variability
deriving from different ages of acquisition and different skills in spoken and sign
languages. Ideally, to correctly identify linguistic structures, datasets should be
collected in a naturalistic setting and involve Deaf people as participants.

2.2      Automatic Processing
   Different issues arise once automatic processing is applied to the language.
Once data is obtained, researchers will be faced with the need for data annotation. A
basic requirement is, of course, a readable transcription. This requirement is problematic
for sign languages since current methodologies rely primarily on word labels: transla-
tions taken from spoken languages [20]. For this reason, in this section we also discuss
tasks of word segmentation, reflecting on the issues one comes across when applying
them to LIS. Word segmentation is a basic step, fundamental to sign annotation. When it
comes to spoken languages, the content of each segment will naturally be one word,
usually separated from the rest of the written string by a space delimiter [21].
However, this step is not as straightforward when it comes to sign languages since it calls
for the recognition of the beginning and end of a sign as well as the design of a transcrip-
tion strategy. For this reason, gloss annotation is one of the main and most time-consum-
ing issues [22].
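
   The contrast between the two kinds of segmentation can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the Italian sentence, the
gloss labels and the millisecond boundaries are invented, and SignSegment and
segment_signs are hypothetical names that do not belong to any existing tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


def segment_text(sentence: str) -> List[str]:
    """Written language: segmentation follows directly from the space delimiter."""
    return sentence.split()


@dataclass
class SignSegment:
    """One annotated sign: a gloss plus the time span it occupies in the video."""
    gloss: str
    start_ms: int
    end_ms: int


def segment_signs(boundaries: List[Tuple[int, int]], glosses: List[str]) -> List[SignSegment]:
    """Pair (start, end) boundaries, identified by an annotator, with gloss labels.

    Unlike segment_text, nothing here can be derived from the input itself:
    both the boundaries and the labels must be supplied from outside.
    """
    if len(boundaries) != len(glosses):
        raise ValueError("each sign boundary needs exactly one gloss")
    return [SignSegment(g, start, end) for g, (start, end) in zip(glosses, boundaries)]


# Invented example data.
words = segment_text("il cane è a casa")
signs = segment_signs([(0, 420), (480, 910)], ["CANE", "CASA"])
```

   The written sentence is segmented by a single built-in call, whereas the signed
utterance needs externally supplied boundaries and labels, which is where the annotation
effort concentrates.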

2.3      State of the Art
   Several models have been used to automatically transfer information from sign lan-
guage to spoken language and vice versa. The general research problem is sign language
recognition, as well as the processes leading to it [23]. Sign language recognition is con-
cerned with the identification, segmentation and definition of signs. Given the multimo-
dality and multilinearity of sign languages, data must be collected through video or im-
ages. To achieve said recognition, multi-channel approaches have been used, combining
hardware-based or software-based resources [24]. In the past year, among the recent
technologies employed for data collection for sign recognition, we find camera images
[25] – in some instances acquired with RGB and RGB-D sensors [26] – fused with radar
sensor data [27].



2.4    Text to Video Synthesis Projects for LIS
   Within the Italian context, different projects have been carried out on the application
of NLP and machine translation (MT) to LIS. We will now describe two related projects
developed in the past years. The LIS4ALL project (2012–2014) [28] was funded in 2012 by
Piedmont to create a prototype system for ITA-LIS automatic translation, providing a
service for displaying information in LIS on mobile devices in train station terminals.
The output was LIS utterances signed by an animated interpreter [20]. The LIS4ALL
designers found that a small number of templates covered most announcements [29] and thus
built three regular expressions that matched three templates. Following a process of
sentence simplification, non-mandatory components (for example deictic signs) were not
translated into LIS. The project resulted in the translation of 63 announcements. The main
sources of error identified during translation were lexical gaps and the inability to
handle subject doubling [29]. Given the novelty of the project, central aspects of LIS,
such as non-manual elements and the morphological aspects of sign movement and location,
were not considered.
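
   To give an idea of how a template-based approach of this kind can work, the following
Python sketch matches one announcement against one regular expression and extracts the
variable slots. The actual LIS4ALL expressions are not reproduced in this paper, so the
template, the slot names and the sample announcement are all assumptions made for
illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical template for a departure announcement (not the LIS4ALL original).
DEPARTURE = re.compile(
    r"^Il treno (?P<category>\w+) (?P<number>\d+) per (?P<destination>[\w' ]+?) "
    r"è in partenza dal binario (?P<platform>\d+)$"
)


def parse_announcement(text: str) -> Optional[Dict[str, str]]:
    """Return the slots of a departure announcement, or None if it does not fit."""
    match = DEPARTURE.match(text.strip())
    return match.groupdict() if match else None


# Invented sample announcement.
slots = parse_announcement(
    "Il treno regionale 2043 per Torino Porta Nuova è in partenza dal binario 4"
)
unmatched = parse_announcement("Si prega di allontanarsi dalla linea gialla")
```

   Once the slots are extracted, only the variable parts (train category, number,
destination, platform) need to be translated, while the fixed part of the template can be
mapped to a prepared LIS structure. Announcements that match no template return None and
would need a different strategy.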
   The LIS4ALL system was based on an existing one developed in the context of the
ATLAS project for the translation of weather forecasts [30–31] through the creation of
a lexicon of 2350 signs [30]. Moreover, the translation strategies used for LIS4ALL were
interlingua rule-based and statistical translation [28], an evolution of the strategies that
had been employed in the ATLAS project (Automatic Translation into sign LAnguageS,
2009–2012) for the description and identification of the relations between signs. For
ATLAS, glossed LIS utterances had been analyzed using a set of lexical items and sev-
eral combinatorial rules [28] that were elaborated by a LIS generator that ‘[…] builds a
tree representing the generic LIS lexical items and some generic syntactic relations
among them […]’ [28]. Additionally, a set of values was associated with each sign gloss
and provided information on the database name of the sign, its ID, the number of hands
used to perform the sign and the part of speech of the sign [31] with the final aim of
obtaining a signed target text performed by an avatar.
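
   The set of values attached to each gloss can be pictured as a small record. The sketch
below is an assumption-laden illustration rather than the ATLAS data model: the field
names, the lexicon entries and every value in them are invented, and resolve is a
hypothetical helper that separates known signs from lexical gaps of the kind reported for
LIS4ALL.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SignEntry:
    gloss: str     # gloss used in the glossed LIS utterance
    db_name: str   # database name of the sign
    sign_id: int   # ID of the sign
    hands: int     # number of hands used to perform the sign
    pos: str       # part of speech of the sign


# Placeholder lexicon: every value here is invented for illustration.
LEXICON: Dict[str, SignEntry] = {
    entry.gloss: entry
    for entry in (
        SignEntry("PIOGGIA", "pioggia_01", 101, 2, "NOUN"),
        SignEntry("DOMANI", "domani_01", 102, 1, "ADV"),
    )
}


def resolve(glosses: List[str]) -> Tuple[List[SignEntry], List[str]]:
    """Split a gloss sequence into known lexicon entries and lexical gaps."""
    found = [LEXICON[g] for g in glosses if g in LEXICON]
    gaps = [g for g in glosses if g not in LEXICON]
    return found, gaps


found, gaps = resolve(["DOMANI", "PIOGGIA", "GRANDINE"])
```

   In this sketch the gloss without a lexicon entry is reported as a gap instead of being
silently dropped, which makes missing signs visible before an avatar is asked to perform
the utterance.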

2.5    Going Beyond Glosses: Available Options for LIS
   Traditionally, signs have been transcribed using glosses, i.e., a translation of the sign
into the spoken target language. Therefore, in the context of sign transcription, if a LIS
signer is describing something their dog did at home, we will have to transcribe ‘CANE’
(DOG) and ‘CASA’ (HOUSE). As can be inferred, despite their widespread use, glosses
flatten the complexity of LIS. In
fact, to a non-signer, the mentioned glosses provide no information on the production of
the sign itself, only specifying its meaning. Additionally, signs may vary for many rea-
sons, such as the geographical origin of a signer. Therefore, glosses make it virtually
impossible for non-signers to obtain information on the sign once removed from the
signed context.
   Within the Italian framework, different strategies have been adopted to work around
this issue. Generally, the most successful solution is the combination of video files and



univocal glosses. In the resource A Grammar of Italian Sign Language (LIS) [32], recently
published online, the authors opted for a variation of gloss annotation by including
videos of signers performing an isolated sign or an utterance in LIS, thus showing how
signs combine and interact, as well as different variations of the same sign. The Grammar
was published in the context of the SIGN-HUB project, which aims at preserving and
researching the linguistic, historical and cultural heritage of European Deaf signing
communities with an integral resource [33].




Fig. 2. An example of gloss writing taken from the Grammar. As can be seen, the authors included
the gloss sequence as well as a translation into English. The translation into LIS of each utterance
can be accessed by clicking on the hand-shaped icon to the right [32].


By associating glosses and video files, the Grammar shows a clear inclination towards
reworking the widespread glossing methodology. However, it does not collect its signs
and utterances within a dictionary and, even if it did, the amount of data would be
extremely limited. As for LIS dictionaries, at present, the only digital resources
available to LIS researchers are the Dizionario Bilingue Elementare della Lingua dei Segni
Italiana (LIS)³ [34] in its digital version, and the website SpreadTheSign [35–36].
    The Dictionary is one of the most renowned and retrievable resources for LIS and it
includes more than 2500 videos of signs performed by native signers. Each video is
marked by a specific code that includes the translation of the sign into Italian and a se-
quence of numbers and/or letters. As mentioned, the richness of sign languages cannot
be reproduced through capital letters [37] and might even lead to the creation of ambi-
guity, not providing information on the variation of the sign. For this reason, the Dic-
tionary can be an excellent resource for LIS transcription since it provides an unambig-
uous code for each sign, thus creating an unequivocal association between the sign and
its translation.
    Despite its usefulness and the vastness of the resource, the Dictionary is limited in
that it was completed in 1992. For this reason, the multi-language dictionaries available
on the website SpreadTheSign provide an invaluable contribution. The website is the
result of an EU-funded project created for Deaf education and provides multilingual dic-
tionaries for several sign languages from around the world.




³ Henceforth referred to as Dictionary.




Fig. 3. Image taken from the website SpreadTheSign [35]. The signer is performing the LIS sign
HOUSE.


2.6    Existing Datasets for Sign Languages
   In the previous section, we mentioned existing online dictionaries for LIS. We will
now discuss different annotation tools and strategies used for other European and North
American sign languages.
     With regard to available resources for the segmentation and analysis of sign language
videos, at present, ELAN is the most widely used tool in this field. It was initially
released in 2000 by The Max Planck Institute for Psycholinguistics in Nijmegen, Neth-
erlands, as a tool to annotate audiovisual files on different levels. ELAN is a useful tool
for multimodality research on sign languages since it allows users to create time-aligned
annotation levels where simultaneous information on manual and non-manual elements
can be included [38].
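
     The idea of time-aligned tiers can be illustrated with a minimal data structure. The
sketch below does not reproduce ELAN's file format or API: the tier names, time values
and the helper active_at are assumptions chosen to mirror the interrogative ‘TU?’ example
of Fig. 1, where a manual sign and raised eyebrows co-occur.

```python
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    start_ms: int
    end_ms: int
    value: str


Tiers = Dict[str, List[Annotation]]


def active_at(tiers: Tiers, t_ms: int) -> Dict[str, str]:
    """Return, for each tier, the value of the annotation covering time t_ms."""
    result: Dict[str, str] = {}
    for name, annotations in tiers.items():
        for ann in annotations:
            if ann.start_ms <= t_ms < ann.end_ms:
                result[name] = ann.value
                break
    return result


# Invented annotations: a manual tier and a non-manual tier over the same stretch of video.
tiers: Tiers = {
    "gloss": [Annotation(0, 400, "TU")],
    "eyebrows": [Annotation(0, 450, "raised")],
}

during_sign = active_at(tiers, 200)
after_sign = active_at(tiers, 420)
```

     Querying a single point in time returns information from every tier at once, which
is how simultaneous manual and non-manual elements can be kept together in one
annotation.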
     Sign language translation requires glossing, which is a time-consuming, yet necessary,
annotation process. Tokenization on the gloss level is widely used, since recognizing
each sign that makes up an utterance facilitates the translation process. An example of a
tokenized sign language corpus is the Swedish Sign Language Corpus (SSLC), a resource
developed at the Department of Linguistics of the University of Stockholm. The SSLC was
compiled between 2009 and 2011 and includes video-recorded conversations of 42 Swedish
Sign Language signers aged 20 to 82. The SSLC contains 33,600 tokens [39]. Tokenization
was carried out in ELAN. Annotators developed different levels (or tiers) to provide
information on sign glosses taken from the Swedish Sign Language Dictionary [40], adding
specifically developed tags and symbols to signal phenomena such as overlapping, merging,
fingerspelling, or gesture-like signs. Another corpus partly annotated and tagged using
ELAN is the British Sign Language Corpus Project (BSLCP) [41]. For the creation of the
corpus, 249 Deaf BSL signers were recorded in conversational contexts. The ELAN software
was used to provide information on what was being signed by the participants [42]. Within
the context of the SIGN-HUB Project, Pfau [43] and other researchers from Germany, Italy,
the Netherlands, Spain and Turkey collaborated on a shared project for sign language
annotation in ELAN. The goal was an annotation that went beyond translation or glossing,
including aspects of non-manual production and even comments from



annotators. As a result, more than 5 hours of footage were annotated by Deaf and hearing
researchers.
     All sign language data of the projects discussed in this section were annotated man-
ually. As mentioned, annotation is a time-consuming process. To minimize the amount
of time spent on it as well as facilitate the growth of large, annotated corpora for sign
languages research, Dreuw and Ney [44] created a new interface for ELAN able to
automatically recognize signs and annotate simultaneous tiers. The tiers include glosses
corresponding to the translations into spoken language, as well as information on the
features of each annotated sign. The richness of gloss annotation was designed to be
adjusted by users depending on their interests.
      Given that most of the research on sign language recognition has been mainly fo-
cused on the tasks of gesture recognition, the linguistic qualities of sign languages have
been overlooked. However, recent papers suggest a new approach to sign language trans-
lation problems. Sign language recognition uses contact [45] or vision-based systems
[46]. The former are cumbersome, while the latter are mainly focused on identifying
individual signs. Instead, real-world sign language recognition should be continuous in
order to process the signing flow accurately. In 2018, Camgoz et al. [47] proposed the
generation of spoken language translations from German Sign Language videos, mirroring the
steps of standard Neural Machine Translation, which resulted in the PHOENIX14T dataset.
In this dataset, the videos are accompanied by gloss information. Further development came
with the introduction of sign language transformers [48], which can translate from spoken
language sentences to a 3D skeleton [49].
      To our knowledge, several attempts have been made at collecting sign language
data by combining computer vision and information captured through gloves. Using the
CopyCat system, Zafrulla et al. [50] collected 320 utterances from ASL signers. The
tools for data collection used were a camcorder and colored gloves containing accel-
erometers providing information on acceleration, direction and rotation of hands. An-
other tool is the AcceleGlove, an electronic glove placed on a signer’s hand and arm.
The glove collected information on hand movement, orientation and location in relation
to the body and recognized 176 signs in isolation [51]. The AcceleGlove was combined
with a gesture recognition toolkit [52] by McGuire et al. [45] to record 665 utterances in
ASL to establish a pattern recognition framework to be expanded.
   The idea of associating glosses taken from a set of values, together with additional
information on the manual qualities of each sign, was developed in a 2020 master’s thesis
[53] and article [54]. The goal was the creation of a Universal Dependencies-compliant
resource for the syntactic annotation of LIS, i.e., the first LIS treebank. To create this
treebank, tasks of segmentation, annotation, POS tagging and parsing had to be carried
out. Particular attention was paid to the inclusion of unambiguous information in the
segmentation and annotation process. LIS videos were segmented in ELAN; after that,
each sign was analyzed on different levels. The first tier provided an unambiguous gloss
taken from the Dictionary or SpreadTheSign. The following tiers included information
on sign location and Universal Dependencies POS tag. Utterances were then transferred
into CoNLL-U format for the construction of dependency trees. During that step, gloss,
sign location and POS tag were transferred in the columns, together with a translation



into Italian. The treebank can be found on GitHub [55]. The main issue encountered in
this project is, once again, the lack of information on the annotation of non-manual ele-
ments. Annotating on ELAN, which allows for simultaneous viewing of video and an-
notation, is convenient. However, once we move to the level of syntactic annotation in
CoNLL-U format, the video material is no longer visible. Therefore, all information that
could not be codified in the tiers through word labels is virtually inaccessible.
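   The transfer from annotation tiers to CoNLL-U columns described above can be
illustrated with a minimal sketch. It is not the procedure used to build the treebank:
the `SignAnnotation` structure, the `to_conllu` function, the `text_it` comment key and
the choice of storing sign location in the MISC column are all assumptions made for
illustration, and the glosses and location labels are invented placeholders rather than
treebank data. Only the ten-column layout follows the CoNLL-U format itself.

```python
from dataclasses import dataclass

# The ten CoNLL-U columns, in the order defined by Universal Dependencies.
CONLLU_FIELDS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS",
                 "FEATS", "HEAD", "DEPREL", "DEPS", "MISC"]


@dataclass
class SignAnnotation:
    """One segmented sign with its tier values (hypothetical structure)."""
    gloss: str      # first tier: unambiguous gloss
    location: str   # following tier: sign location
    upos: str       # following tier: Universal Dependencies POS tag
    head: int       # ID of the syntactic head (0 marks the root)
    deprel: str     # dependency relation to the head


def to_conllu(signs, translation, sent_id):
    """Serialize one utterance as a CoNLL-U sentence block."""
    # Comment lines carry the sentence ID and the Italian translation.
    lines = [f"# sent_id = {sent_id}", f"# text_it = {translation}"]
    for i, sign in enumerate(signs, start=1):
        row = {
            "ID": str(i), "FORM": sign.gloss, "LEMMA": sign.gloss,
            "UPOS": sign.upos, "XPOS": "_", "FEATS": "_",
            "HEAD": str(sign.head), "DEPREL": sign.deprel, "DEPS": "_",
            # Assumption: location is kept as a key=value pair in MISC.
            "MISC": f"Loc={sign.location}",
        }
        lines.append("\t".join(row[name] for name in CONLLU_FIELDS))
    # A CoNLL-U sentence block ends with a blank line.
    return "\n".join(lines) + "\n\n"


# Invented example utterance in Subject-Object-Verb order (placeholder values).
example_signs = [
    SignAnnotation("CHILD", "location_a", "NOUN", 3, "nsubj"),
    SignAnnotation("BOOK", "location_b", "NOUN", 3, "obj"),
    SignAnnotation("READ", "location_b", "VERB", 0, "root"),
]

if __name__ == "__main__":
    print(to_conllu(example_signs, "Il bambino legge il libro.", "demo-1"), end="")
```

   The sketch also makes the limitation discussed above concrete: whatever is not
written into one of the columns as a label, such as non-manual information, has no
place in the resulting file.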
   As regards available datasets, a list of sign language recognition datasets, with
information on data type and annotation systems as well as related papers, is available
online [56].


3      Additional Challenges for LIS–IT and IT–LIS Translation

3.1    Subdivision of Manual and Non-Manual Elements

   Sign transcription for data collection is not the only relevant problem to be faced for
sign language processing. On the one hand, the manual elements that make up a sign in
LIS have been divided into different parameters: handshape, place of articulation, orien-
tation and movement. On the other hand, non-manual elements are equally important in
utterance construction and sign disambiguation and include facial expression (move-
ments of eyebrows, eye, mouth, nose), body posture and movements.
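   As a hedged illustration of this subdivision, the sketch below models a sign as a
record with a manual and a non-manual component. All class names, field names and values
are hypothetical placeholders and do not correspond to an existing annotation scheme; the
example only shows that two signs sharing the same manual parameters become
indistinguishable once non-manual elements are discarded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ManualParameters:
    """The four manual parameters of a sign."""
    handshape: str
    place_of_articulation: str
    orientation: str
    movement: str


@dataclass(frozen=True)
class NonManualElements:
    """Non-manual elements co-occurring with the sign."""
    facial_expression: tuple = ()   # eyebrow, eye, mouth, nose movements
    body_posture: str = ""
    body_movement: str = ""


@dataclass(frozen=True)
class SignRecord:
    gloss: str
    manual: ManualParameters
    non_manual: NonManualElements = field(default_factory=NonManualElements)


def manual_only_key(record):
    """Return what an annotation limited to manual elements would retain."""
    return record.manual


# Two invented signs with identical manual parameters (placeholder labels).
shared_manual = ManualParameters("HS-1", "PLACE-1", "ORI-1", "MOV-1")
sign_a = SignRecord("GLOSS-A", shared_manual,
                    NonManualElements(facial_expression=("eyebrows-raised",)))
sign_b = SignRecord("GLOSS-B", shared_manual,
                    NonManualElements(facial_expression=("eyebrows-lowered",)))

if __name__ == "__main__":
    # The full records differ, but their manual-only keys are identical.
    print(sign_a != sign_b)
    print(manual_only_key(sign_a) == manual_only_key(sign_b))
```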
    Once a methodology for sign transcription is identified, the second aspect to be
considered is the identification of the relevant elements in sign construction. Can sign
annotation be limited to manual elements only? Or are non-manual elements so fundamental
that they cannot be left out? This question is central in both the Italian and the
international context, and two main perspectives answer it differently. ‘Assimilationists’
tend to focus on those aspects that draw sign languages closer to spoken ones, thus giving
priority to ‘standard signs’, which are easier to define and mainly constituted by manual
elements. ‘Non-assimilationists’ instead highlight the language-specific properties of
sign languages, such as the relevance of non-manual elements [57].
   Whichever approach one may choose, the search for relevance remains central. Ide-
ally, manual and non-manual elements should be taken into consideration, thus providing
information on every aspect of the sign. However, given the current technologies
employed in this field and discussed in Section 2.3, such as RGB cameras or radar sensors,
manual elements hold a central position in the investigation, temporarily taking
precedence over non-manual ones.

3.2    Challenges of LIS Utterance Structure Description
   Despite our non-assimilationist inclinations, it is undeniable that if the aim of a pro-
cess of data collection and annotation is MT, a dismissal of the pre-existing POS tagging
systems is counterproductive. For this reason, in this subsection, we examine recent as-
similationist observations on LIS utterances, by taking into consideration the manual and
non-manual levels. Utterances are based on signs, the entire body and non-manual
features such as facial expression, mouth actions, movements of the torso and eye gaze,
which is used to point at, describe or depict the referent.
   Sign languages have specific constructions: the structure of an utterance in LIS does
not mirror that of spoken Italian. It has been observed that the unmarked order of signs
in LIS is Subject-Object-Verb, which characterizes LIS as a head-final language, where the
most meaningful element is found in final position [58]. In most cases, however,
utterances are much more complex, as they follow pragmatic constraints based on iconicity,
and the multifaceted structures of LIS have not yet been comprehensively described.
   Non-manual elements represent the hardest challenge in the representation and
recognition of sign languages because of their non-segmentable nature. Facial expressions,
which also include eye gaze and mouth actions, co-occur with signing and convey
information that is hard to process automatically but is crucial for understanding the
meaning of both the single sign and the utterance.


4      Conclusion

   This paper explores some of the main challenges in the field of sign language pro-
cessing. We have discussed the socio-political situation of the Deaf community and con-
sidered how, behind the label of Deafness, there can be different users with different
needs. This highlights that the involvement of users is essential in every phase of research
and development. When creating sign language datasets, Deaf people should be involved
in collecting reliable data that represent sign language usage, making possible the crea-
tion of appropriate computational models, interface design and, finally, of the overall
systems. Furthermore, the creation of datasets based on the processing of natural conver-
sation and long utterances will be necessary to go beyond the state of the art. Another
crucial step is annotation. The lack of a standard written form and the need for fluent
signers to produce the machine-readable annotations used to train algorithms represent
the main difficulties in applying NLP methods to sign languages. Only a multidimensional
approach that combines NLP with computer vision and radar-based technologies, and that
involves Deaf participants, can make it possible to design effective technologies with an
impact at both the social and the linguistic level. On a social level, effective
translation could, for example, support interactions between hearing and Deaf people when
an interpreting service is not available. On a linguistic level, sign language processing
would play an important role in sign language acquisition and learning.
   The proper development of this technology will bring innovation along two main
directions: technological and social. On the technological side, the collaboration of
different research approaches (radar technologies, artificial intelligence, and computer
vision research) can offer new insights into how machines can be shaped to respond to
human needs. On the social side, it will bring substantial benefits in terms of the
inclusion of Deaf people.

References


   1.  Istituto Nazionale di Statistica: Conoscere il Mondo della Disabilità: Persone,
       Relazioni e Istituzioni. Istituto Nazionale di Statistica, Roma (2019).
   2. Deafness and Hearing Loss,
       https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss,
       last accessed 2021/10/02.
   3. Malerba, D.: Sordità Percezione e Realtà nell’approccio pedagogico. Sapienza
       Università Editrice, Roma (2020).
   4. Fontana, S., Mignosi, E.: Segnare, Parlare, Intendersi: Modalità e forme. Mi-
       mesis, Milano and Udine (2012).
   5. Onofrio, D., Rinaldi, P., Caselli, M.C., Volterra, V.: Il Bilinguismo Bimodale
       dei Bambini Sordi: Aspetti Teorici ed Esperienze di Ricerca. Rivista di
       Psicolinguistica Applicata XIV (1), 25–42 (2014).
   6. Pinto M. A., Volterra V.: Bilinguismo lingue dei segni / lingue vocali: Aspetti
       educativi e psicolinguistici – Sign languages, spoken languages and bilingua-
       lism: educational and psycholinguistics issues. Italian Journal of Applied Psy-
       cholinguistics VIII (2008).
   7. Capek, C. M. et al.: Hand and Mouth: Cortical Correlates of Lexical Processing
       in British Sign Language and Speechreading English. Journal of Cognitive
       Neuroscience 20 (7), pp. 1220–1233 (2008).
   8. Maragna S., Roccaforte M., Tomasuolo E.: Una Didattica innovativa per l’ap-
       prendente sordo. FrancoAngeli, Milano (2013).
   9. Fontana S: Metalinguistic Awareness in sign language: epistemological con-
       siderations, in Pinto M.A., Rinaldi P. (eds) Metalinguistic Awareness and Bi-
       modal Bilingualism: Studies on Deaf and Hearing Subjects, Italian Journal of
       Applied Psycholinguistics XVI(2) (2016).
   10. Ethnologue Homepage, https://www.ethnologue.com/, last accessed
       2021/10/08.
   11. Scelzi, R.: Le componenti non manuali della LIS. Studi di Glottodidattica (1),
       261–291 (2010).
   12. Romeo, O.: Dizionario dei Segni. Zanichelli Editore S. p. A., Bologna (1991).
   13. Fontana S., Corazza S., Boyes Braem P., Volterra V.: Language research and
       language community change: Italian Sign Language 1981–2013, in Interna-
       tional Journal of the Sociology of Language, pp. 1–30. De Gruyter Mouton,
       Berlin (2015).
   14. Tomasuolo E., Gulli T., Volterra V., Fontana S.: The Italian Deaf Community
       at the time of Coronavirus. Frontiers in Sociology (2021).
       DOI: https://doi.org/10.3389/fsoc.2020.612559
   15. Italian Official Gazette, Decree Law of March 22nd 2021, n.41,
       https://www.gazzettaufficiale.it/atto/serie_generale/caricaDettaglioAtto/originario?atto.dataPubblicazioneGazzetta=2021-05-21&atto.codiceRedazionale=21A03181&elenco30giorni=false,
       last accessed 2021/10/01.
     16. Volterra, V.: La Lingua Italiana dei Segni. La comunicazione visivo-gestuale
         dei sordi. Il Mulino, Bologna (1987).
     17. Fontana, S., Roccaforte, M.: Oltre l’approccio assimilazionista nella descri-
         zione LIS. Quando la prassi comunicativa diventa norma. In: LII Congresso
         Internazionale di Studi della Società di Linguistica Italiana, pp. 273–286. Bern
         (2019).
     18. Di Renzo, A., Lamano, L., Lucioli, T., Pennacchi, B., Ponzo, L.: Italian Sign
         Language (LIS): can we write it and transcribe it with SignWriting? In:
         2nd Workshop on the Representation and Processing of Sign Languages:
         Lexicographic Matters and Didactic Scenarios, International Conference
         on Language Resources and Evaluation, LREC 2006, Genoa (2006).
     19. Volterra V., Roccaforte M., Di Renzo A., Fontana S.: Descrivere Lingue dei
         Segni. Una prospettiva sociosemiotica. Il Mulino, Bologna (2019).
     20. Antinoro Pizzuto, E., Chiari, I., Rossini, P: The Representation Issue and its
         Multifaceted Aspects in Constructing Sign Language Corpora: Questions, An-
         swers, Further Problems. In: Proceedings of the LREC2008 3rd Workshop on
         the Representation and Processing of Sign Languages: Construction and Ex-
         ploitation of Sign Language Corpora, pp. 150–150. ELRA, Marrakech (2009).
     21. Mitkov, R.: The Oxford Handbook of Computational Linguistics. Oxford Uni-
         versity Press, New York (2003).
     22. Camgöz, N.C., Koller, O., Hadfield, S., Bowden, R.: Multi-channel Trans-
         formers for Multi-articulatory Sign Language Translation. (2020).
     23. Manaris, B.: Natural Language Processing: A Human-Computer Interaction
         Perspective. Advances in Computers 47, 1–66 (1998).
     24. Farooq, U., Rahim, M.S.M., Sabir, N., Hussain, A., Abid, A.: Advances in ma-
         chine translation for sign language: approaches, limitations, and challenges.
         Neural Computing and Applications (2021).
     25. Nobis, F., Geisslinger, M., Weber, M., Johannes, B., Lienkamp, M.: A Deep
         Learning-based Radar and Camera Sensor Fusion Architecture for Object De-
         tection. (2020).
     26. Huang, J., Wen-gang, Z., Qilin, Z., Houqiang, L., Weiping, L.: Video-based
         Sign Language Recognition without Temporal Segmentation. AAAI (2018).
     27. Santhalingam, P., Du, R. Wilkerson, R., Al Amin, H., Ding, Z., Parth, P.,
         Rangwala, H., Kushalnagar, R.: Expressive ASL Recognition using Millime-
         ter-wave Wireless Signals. 2020 17th Annual IEEE International Conference
         on Sensing, Communication, and Networking (SECON), pp. 1–9. (2020).
     28. Lombardo, V., Battaglino, C., Damiano, R., Nunnari, F.: An Avatar-based In-
         terface for the Italian Sign Language. International Conference on Complex,
         Intelligent and Software Intensive Systems, CISIS 2011. Korean Bible Univer-
         sity, Seoul (2011).
     29. Battaglino, C., Geraci, C., Lombardo, V., Mazzei, A.: Prototyping and Prelim-
         inary Evaluation of Sign Language Translation System in the Railway Domain.
         International Conference on Universal Access in Human-Computer Interac-
         tion, pp. 229–350. (2015).
30. Mazzei, A.: Translating Italian to LIS in the Rail Stations. Proceedings of the
    15th European Workshop on Natural Language Generation, ENLG. Associa-
    tion for Computational Linguistics, Brighton (2015).
31. Mazzei, A., Lesmo, L., Battaglino, C., Vendrame, M., Bucciarelli, M.: Deep
    Natural Language Processing for Italian Sign Language Translation. In: Bal-
    doni M., Baroglio C., Boella G., Micalizio R. (eds) Advances in Artificial In-
    telligence, AI*IA 2013, vol 8249. Springer, Cham.
32. Branchini, C., Mantovan, L.: A Grammar of Italian Sign Language (LIS). Published
    with open access at
    https://www.sign-hub.eu/grammardetail/UUID-GRMM-e0adecd1-c01e-47ef-b2c0-c2d6a4ce45dc
    (2020).
33. The SIGN-HUB Project Homepage, https://www.sign-hub.eu/project, last ac-
    cessed 2021/10/04.
34. Radutzky, E.: Dizionario bilingue elementare della lingua italiana dei segni.
    Oltre 2500 significati. Con DVD-ROM. Kappa (1992).
35. SpreadTheSign Homepage, https://www.spreadthesign.com/it.it/search/, last
    accessed 2021/10/04.
36. Hilzensauer, M., Krammer, K.: A multilingual dictionary for sign languages:
    “SpreadTheSign”. The 8th annual International Conference of Education, Re-
    search and Innovation, ICERI2015, Seville (2015).
37. Chesi, C., Geraci, C.: Segni al computer. Manuale di documentazione della
    Lingua Italiana dei Segni e alcune applicazioni computazionali. Siena (2009).
38. Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., and Sloetjes, H.:
    ELAN: a professional framework for multimodality research. Proceedings of
    the Fifth International Conference on Language Resources and Evaluation,
    LREC’06. European Language Resources Association (ELRA), Genoa (2006).
39. Mesch, J., Wallin, L.: Gloss annotations in the Swedish Sign Language Corpus.
    International Journal of Corpus Linguistics (20), pp. 102–120 (2015).
40. Svenskt teckenspråkslexikon (Swedish Sign Language Dictionary),
    http://teckensprakslexikon.su.se/, last accessed 2021/10/08.
41. BSLCP Homepage, https://bslcorpusproject.org/, last accessed 2021/10/08.
42. Schembri, A.: British Sign Language Corpus Project: Open Access Archives
    and the Observer's Paradox. (2015).
43. Pfau, R.: Annotation in ELAN, Version 1.1. SIGN-HUB Project, European
    Commission (2018).
44. Drew, P., Ney, H.: Towards Automatic Sign Language Annotation for the
    ELAN Tool. (2008).
45. McGuire et al.: Towards a One-way American sign language translator. In Pro-
    ceedings of the Sixth IEEE International Conference on Automatic Face and
    Gesture Recognition, pp. 620–625. Seoul (2004).
46. Santhalingam, P.S., et al.: Expressive ASL Recognition using Millimeter-
    wave Wireless Signals. In 17th Annual IEEE International Conference on Sens-
    ing, Communication and Networking, pp. 1–9. SECON, Como (2020).
     47. Camgoz, N.C., Hadfield, S., Koller, O., Ney, H., Bowden, R.: Neural Sign
         Language Translation. In: Conference on Computer Vision and Pattern Recog-
         nition, pp. 7784–7793. (2018).
     48. Camgoz, N.C., Hadfield, S., Koller, O., Bowden, R.: Sign Language Trans-
         formers: Joint End-to-End Sign Language Recognition and Translation. In:
         Conference on Computer Vision and Pattern Recognition. (2020).
     49. Saunders, B., Camgoz, N.C., Bowden, R.: Progressive Transformers for End-
         to-End Sign Language Production. In: European Conference on Computer Vi-
         sion (ECCV) (2020).
     50. Zafrulla et al.: American Sign Language Recognition with the Kinect. In: Pro-
         ceedings of the 13th International Conference on Multimodal Interfaces, ICMI
         2011. Alicante (2011).
     51. Hernandez-Rebollar, J.L., Kyriakopoulos, N., Lindeman, R.W.: A New Instrumented
         Approach for Translating American Sign Language into Sound and Text. In:
         Proceedings of the Sixth IEEE International Conference on Automatic
         Face and Gesture Recognition. Seoul (2004).
     52. Westeyn, T.L., Brashear, H., Atrash, A., Starner, T.: Georgia tech gesture
         toolkit: supporting experiments in gesture recognition. In Proceedings of the
         5th International Conference on Multimodal Interfaces, ICMI 2003. Vancou-
         ver (2003).
     53. Caligiore, G.: Universal Dependencies for Italian Sign Language: a treebank
         from the storytelling domain. University of Turin, Turin (2020).
     54. Caligiore, G., Bosco, C., Mazzei, A.: Building a treebank in Universal Depend-
         encies for Italian Sign Language. Proceedings of Seventh Italian Conference
         on Computational Linguistics, CLIC-2020, vol. 2769, (2021).
     55. Caligiore, G., Bosco, C., Mazzei, A.: LIS-UD.
         https://github.com/alexmazzei/LIS-UD.git, (2020), last accessed 2020/10/06.
     56. Sign language datasets,
         http://facundoq.github.io/guides/sign_language_datasets/slr,
         last accessed 2021/10/03.
     57. Scelzi, R.: Le Componenti non Manuali (CNM) della LIS. In: Studi di Glotto-
         didattica, pp. 261–291. (2010).
     58. Geraci, C.: L’ordine dei segni nella LIS (Lingua dei Segni Italiana). University
         of Milan, Milan (2002).