Towards Bidirectional Conversion between Arabic Sign Language and Speech/Text

Souha Ben Hamouda1,*, Wafa Gabsi1 and Bechir Zalila1
1 ReDCAD Laboratory, ENIS, University of Sfax, Tunisia

Abstract
Sign language is the essential communication tool for deaf and non-verbal people. Using sign language, deaf and mute people can communicate among themselves, but they find it tough to face the outside world. The automatic interpretation of sign language and its conversion into text and voice remains a challenging task for breaking the barrier between deaf people and the non-deaf majority. Different approaches and techniques have been proposed as solutions, such as Android applications, smart gloves and gesture recognition based on image processing and artificial intelligence. In this paper, we provide an overview of sign language and of the other means used in communication between deaf and non-deaf people, as well as of the factors involved in this communication. We then describe our approach, which ensures bidirectional conversion between Arabic sign language and speech/text.

Keywords
Arabic Sign language, Artificial intelligence, Recognition, Image processing, Bidirectional conversion

1. Introduction

Deaf individuals are those with either complete or partial hearing loss, often referred to as hard of hearing or Deaf. To express themselves, they rely not only on facial expressions, eyebrow and eye movements, but also on a language made up of gestures (or signs) and movements. As a consequence, school, social and professional integration presents a major problem for these people, leading to feelings of isolation, exclusion and introversion, even within their own families. This comes down to the fact that the majority of people do not know sign language and do not want to learn it, which has also led to a conflict between teaching oral language and teaching sign language among educators. In this context, the objective of research in this field is to help these people, who use sign language as their main mode of communication, by enhancing their participation and integration.

Various researchers have explored technology-driven solutions for automatic bidirectional translation between sign language and spoken/written language [1]. Researchers have tried to develop different solutions translating sign language from Arabic, French, English alphabets and other written or spoken languages [2, 3]. Various contributions have been made to translate hand and mouth movements, using sensors [4], mobile applications [5, 6] and gloves [7, 8]. Based on image processing and deep learning algorithms, the sign language alphabets were then translated into speech or text.

TACC 2023, Tunisian-Algerian Joint Conference on Applied Computing, 6-8 Nov, 2023, Sousse, Tunisia
* Corresponding author.
souha.benhamouda@redcad.org (S. Ben Hamouda); wafa.gabsi@redcad.org (W. Gabsi); bechir.zalila@redcad.org (B. Zalila)
ORCID: 0000-0003-4985-6718 (W. Gabsi); 0000-0002-2432-3520 (B. Zalila)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073
The analysis of these works reveals several problems:
• Most of the work has aimed at recognizing the alphabetical letters of each language, whereas in reality deaf people use words and sentences.
• Few works focus on the processing of Arabic sign language.
• Most of the work aims at recognizing gestures and translating them into text/audio, but not the other way around.
• Each of the cited contributions has its own limitations and its own cost, which make them commercially unusable.
• Each of the existing works is generally interested in one category of sign language only, such as static or dynamic gestures.

In our work, we address these gaps by proposing a two-way recognition system for Arabic sign language, with the aim of helping deaf people and contributing to their social and professional integration. Our approach offers a two-way translation system between Arabic sign language words and text/audio, relying on mobile development. In the first direction, we designed and developed a 3D hand that renders gestures after voice recognition of Arabic words; our tests involved both static and dynamic gestures. In the second direction, we present a mobile application utilizing image processing for sign recognition and translation into text/audio. Our solution provides an affordable means of facilitating bidirectional communication.

The remainder of this paper is structured as follows: Section 2 presents background concepts related to deep learning and image processing algorithms. In Section 3, we present an overview of our proposed approach. Then, Section 4 gives and discusses results. Finally, Section 5 concludes this paper and outlines ongoing work.

2. Background

Given the significant evolution of sign language, developing technologies that improve the quality and performance of communication with deaf people has become a challenge. We first present the communication methods of deaf people and the various factors involved. Second, we define and classify the different sign languages. Next, we present the importance of body expression and the techniques used for sign language recognition.

2.1. Communication methods

Deaf and hard of hearing people have different profiles. They make choices based on their situation and personal history. These choices are neither definitive nor exclusive: the same person can communicate differently over the course of their life or according to the context (family, work, friends...). Deaf and hard of hearing people can combine the following elements to communicate:
• Lip reading allows deaf individuals to better perceive speech. Specific training is required to master it.
• Sign language is a language in its own right: each sign language must be learned separately and has its own grammar and vocabulary.

Besides sign language, there are several means of getting in touch with a deaf person, such as e-mail, video-interpretation services (SVIsual), mediation centers for deaf people, mobile phone text messages, letters, etc. Parents of deaf people often learn how to sign as well; for deaf children of signing parents, their parents' sign language will be their first language, acquired before any spoken language. In addition, parents, brothers and sisters of deaf children learn to sign to communicate with them.
Many people also learn sign language in their spare time because they have deaf friends.

2.2. Factors involved in communication

Several factors come into play when establishing communication with a deaf person:
• Brightness: The deaf person must be in a strategic position that gives them a general visual overview of the place, and the place must be well lit so that they have good visibility.
• Eye contact: It is important to maintain eye contact with the deaf person and to avoid excessive movement, otherwise eye contact is lost. If contact is not possible because of the distance, there are several other ways to get their attention: knocking hard on the floor or gently on the table so that they feel the vibrations, turning off the lights, or moving an arm within the visual field of the deaf person.
• Speed of speech: Do not speak too quickly or too slowly. One must articulate clearly, using short and simple sentences, for a good understanding of the subject of the conversation.
• Way of speaking: One has to speak without covering the mouth so that the deaf person can read the lips.
• Facial expression: This is an element of great help, as are the components that complement verbal speech: gesture, writing, etc. If the deaf person does not understand, the message must be expressed differently.
• Position in relation to the others: If several people are going to take part in the conversation, it is advisable to stand in a circle to facilitate good visibility. To attract the attention of a deaf person, touch their arm or shoulder; never touch a deaf person on the back or the head.

One can always ask for help, guidance and assistance from federations and associations of deaf people. This is the best way to adapt communication to the particularities of each group of deaf people.

2.3. Sign language

Sign language is a language that relies on hand movements, body orientation and facial expressions for communication, without relying on sound.

2.3.1. Principle

Sign language is a communication system used by deaf and hard of hearing people to communicate not only with each other but also with the hearing world. It is a language open to all: it is not limited to deaf people but can also be used by parents, companions, specialized educators or doctors. Courses are offered to learn sign language, enabling deaf people to communicate with each other and with their loved ones, and likewise enabling hearing people to communicate with the hearing impaired.

2.3.2. Classification of sign languages

Sign languages have been grouped into families. A classification was established by Henri Wittmann in 1991 [9], who proposed the following list of families:
• French Sign Language family: Also known as the Francosign family, it descends from Old French Sign Language, which developed in France from the 17th century.
• Arabic sign language family: It mainly includes sign languages from the Arabic-speaking Middle East.
• German sign language family: It includes German and Polish sign languages.
• British Sign Language family: It is historically derived from a prototype variety of British Sign Language (BSL).
• Japanese sign language family: It includes Japanese, Taiwanese and Korean sign languages. There are few difficulties in communication between these three languages.
• Lyons sign language family: It includes Lyons Sign Language and the sign languages of French-speaking and Flemish Belgium.

Even though there are many sign languages, they are all based on movements of the fingers, hands and lips, which is referred to as body expression.

2.3.3. Importance of body expression

Different factors and parts of the body are involved in producing sign language:
• Fingers: For instance, the difference between the gestures for the letters V and X lies in the position of the folded fingers for the X.
• Hands: Different positions of the hands can be considered, such as raised, in front, or in the shape of a fist, to perform static and dynamic gestures.
• Movements: These make the hand turn and the fingers move.
• Location: The space behind and in front of the signer is used to express time.
• Expressions of the face, the eyes, the movements of the eyebrows or the mouth are also important to express emotions, questions or feelings.

In addition to facial and body expressions, the deaf person's position is also an important factor in understanding the messages exchanged. In some cases, a facial expression even makes it possible to tell the difference between two words signed in the same way. For this reason, deaf people must position themselves well in front of their interlocutor when they express themselves in sign language. Simply turning their head away can cause the person they are communicating with to miss many of the intricacies of the conversation. All of these body expressions are gestures formed to spell out letters, words, numbers and also sentences. In our context, we focus on hand gestures, both static and dynamic.

2.3.4. Types of gestures

Hand gestures can be categorized as follows:
• Static gestures: A static gesture is a particular position of the hand, represented by a single image [10]. Such gestures express information through a static posture. Since temporal patterns are also present in gestures, researchers have been inspired to use temporal models whose objective is to correct errors in the recognition of static hand gestures [11]. Figure 1 shows the signs of the Arabic alphabet.

Figure 1: Arabic alphabet signs [12].

• Dynamic gestures: Dynamic gestures are gestures in motion, represented by a sequence of images [10]. They express information through the dynamic movement of the arm, wrist and fingers. A dynamic hand gesture of finger movements can be considered as a temporal sequence of static hand gestures [11].

3. Description of the proposed approach

Our main goal is to propose an Arabic sign language recognition approach based on word recognition at low cost. To that end, we propose a dual-way communication system that translates signs of Arabic words into spoken language, based on image processing, and vice versa. In this section, we present an overview of our proposed approach in both directions of communication and then give details about each of them.

Our approach is bidirectional, allowing translation in both directions between speech and signs. We have therefore broken it down into two axes. The first axis aims to translate Arabic words into sign language: for this we propose the creation of a sign language database and an Arduino-based application to realize the hand movements. The second axis aims to translate Arabic sign language into audio or text based on image processing. We briefly describe each of these axes below.
3.1. From audio into sign language

Figure 2: Block diagram of speech to sign language synthesis.

For the transformation of spoken words into gestures, we propose a process based on several stages, as shown in Figure 2. Starting from the spoken words, voice recognition generates the corresponding text. For voice recognition, we used a Voice Recognition Module V3 microphone: this module accepts a spoken word as input and generates the corresponding text as output. For the translation of gestures from text, we used an Arduino board and designed a 3D hand. The Arduino Uno board transforms the words into movement commands for servomotors. Servomotors are a particular type of motor, widely used to rotate a shaft to a precise position. These movements place the fingers of the 3D hand in the positions indicated by the gestures associated with the spoken words.

This section includes screenshots of some word interfaces processed by the mobile application and the equivalent gestures generated by the 3D hand, accompanied by a brief description. Figures 3 and 4 show the voice recognition of the two words إثنان and أربعة respectively, with the static sign language gestures that correspond to them. Figures 5 and 6 show the voice recognition of the two words عشرة and عمل جيد respectively, with their very similar static sign language gestures. Regarding dynamic gestures, Figure 7 gives an example of the gesture conversion corresponding to the word السبت. We notice that many different words with different meanings have similar gestures. For that reason, in future work, we aim to focus on the meaning of words and on the conversion of full sentences.

Figure 3: Sign of the word إثنان
Figure 4: Sign of the word أربعة
Figure 5: Sign of the word عشرة
Figure 6: Sign of the word عمل جيد
Figure 7: Speech recognition of the word السبت and its corresponding gesture

3.2. From Arabic sign language into audio or text

Similarly, in the opposite direction, the proposed approach for translating gestures into words follows several steps, as shown in Figure 8. First, we built a database of gestures. The second step is the pre-processing of this database before the processing phase. The third step is the use of the algorithms necessary for classification. The same steps are applied for the words of each sign. In this phase, it is necessary to concretely define the steps of our application; we describe in the rest of this paper the different tasks carried out to develop this work.

Figure 8: Block diagram of sign language to speech synthesis.

The last step of our approach was the realization of our 3D hand. The hand consists of five straws representing the fingers. Each finger can take different positions depending on the gesture. Therefore, for each word, the positions of the corresponding fingers must be specified for the servomotors responsible for moving them (a minimal sketch of this mapping is given below). As a proof of concept, we programmed the necessary movements for a few words. The main advantage of our approach is the price of the material needed to build the 3D hand: the hand costs around $100, which makes our solution much less expensive in terms of materials than alternatives such as gloves or electric hands. In what follows, we detail the different steps to translate gestures into text.
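Before that, the following minimal Python sketch illustrates, on the host side, how a recognized word could be mapped to the finger positions of the 3D hand. The word table, the transliterated keys, the serial port name, the baud rate and the one-byte-per-servo protocol are illustrative assumptions, not the exact implementation.

```python
# Hypothetical sketch: map a recognized Arabic word to five finger servo angles
# and send them to the Arduino driving the 3D hand.
# Port name, baud rate, word table and protocol are assumptions.
import serial  # pyserial

# Each word maps to five servo angles (thumb .. little finger), in degrees.
# Keys are transliterations standing in for the Arabic words.
WORD_TO_FINGERS = {
    "ithnan": [0, 180, 180, 0, 0],      # "two": index and middle fingers extended
    "arbaa": [0, 180, 180, 180, 180],   # "four": four fingers extended
}

def send_gesture(word: str, port: str = "/dev/ttyACM0") -> None:
    """Look up the finger positions for a word and send them over serial."""
    angles = WORD_TO_FINGERS.get(word)
    if angles is None:
        print(f"No gesture stored for '{word}'")
        return
    with serial.Serial(port, 9600, timeout=1) as arduino:
        # One byte per servo; the Arduino sketch is assumed to read five bytes
        # and write each value to the corresponding servomotor.
        arduino.write(bytes(angles))

if __name__ == "__main__":
    send_gesture("ithnan")
```

Keeping the word-to-position table on the host side makes it easy to add new words without reprogramming the board.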
Our focus is on sign detection for the digits from zero to nine.

• Creating the dataset for gesture detection
In order to validate our approach, we created a training database of gestures corresponding to Arabic numerals and a separate test database. Each database contains ten folders of images captured with a camera, corresponding to the digits from 0 to 9. The training database contains 2,060 images: for each digit, we recorded 206 gesture images while varying the background and the lighting. The test database is composed of 10 gestures. We perform the necessary experiments on both databases. These two databases represent an important resource for the development of image recognition applications; they also provide a resource for future researchers to test and evaluate applications developed for this purpose.

• Gesture detection
To detect a hand, we capture the live camera feed using OpenCV (an open-source computer vision and machine learning software library) and define a ROI (region of interest), i.e., the part of the frame in which we want to detect the hand; the captured gestures are saved in a directory. The gestures directory contains the two folders train and test holding the captured images, and a blue box drawn on the live webcam feed marks the ROI. To separate the background, we compute an accumulated weighted average of the background and then subtract it from frames containing an object in front of the background, so that the object can be distinguished as foreground. Concretely, we accumulate the weighted average over the first 60 frames to model the background; after these 60 frames, we subtract this background from each frame we read in order to find any object covering it. When edges are detected (i.e., a hand is present in the ROI), we start recording ROI images into the train and test sets for the digit being captured.

• CNN training
On the created dataset, we train a CNN (Convolutional Neural Network). First, we load the data using Keras' ImageDataGenerator, whose flow_from_directory function loads the train and test sets; the name of each digit folder becomes the class name of the loaded images. A CNN consists of multiple layers: the input layer, convolutional layers, pooling layers and fully connected layers, as shown in Figure 9. The convolutional layers apply filters to the input image to extract features, the pooling layers downsample the image to reduce computation, and the fully connected layers make the final prediction. The network learns the optimal filters through backpropagation and gradient descent. The plotImages function is used to plot images of the loaded dataset.

Figure 9: Simple CNN architecture.

We then design the CNN; other hyperparameters can be chosen by trial and error.

Figure 10: Training precision.

In training, we use ReduceLROnPlateau, which reduces the learning rate when a monitored metric has stopped improving, and EarlyStopping, which stops training when a monitored metric has stopped improving; both monitor the loss on the validation dataset.
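The following minimal Keras sketch illustrates a network of this shape trained with these two callbacks. The layer sizes, image dimensions, directory names and callback settings are illustrative assumptions, not the exact configuration used.

```python
# Minimal sketch (hypothetical hyperparameters) of the CNN and callbacks described above.
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE, NUM_CLASSES = 64, 10  # ten digit gestures (0-9)

# Load the train/test folders; each digit folder name becomes a class label.
datagen = ImageDataGenerator(rescale=1.0 / 255)
train = datagen.flow_from_directory("gestures/train", target_size=(IMG_SIZE, IMG_SIZE),
                                    color_mode="grayscale", class_mode="categorical")
test = datagen.flow_from_directory("gestures/test", target_size=(IMG_SIZE, IMG_SIZE),
                                   color_mode="grayscale", class_mode="categorical")

# Convolution + pooling layers extract features, dense layers classify.
model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(IMG_SIZE, IMG_SIZE, 1)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])

# Lower the learning rate when the validation loss plateaus; stop early if it keeps worsening.
callbacks = [
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
]
model.fit(train, validation_data=test, epochs=30, callbacks=callbacks)
```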
After each epoch, the accuracy and the loss are computed on the validation dataset. If the validation loss does not decrease, ReduceLROnPlateau lowers the learning rate to prevent the model from overshooting the loss minima, and EarlyStopping halts learning if the validation accuracy keeps decreasing for a certain number of epochs. The training script contains the callbacks used as well as the two optimization algorithms tested: SGD (stochastic gradient descent, in which the weights are updated at each training instance) and Adam (a combination of Adagrad and RMSProp). We found that SGD gave higher accuracies for our model. During training, we reached 100% training accuracy.

• Predicting the gesture
To predict a gesture, we create a bounding box for the ROI and compute the accumulated background average as we did when creating the dataset, in order to identify any foreground object. We then find the largest contour; if a contour is detected, a hand is present and the thresholded ROI is treated as a test image. We load the previously saved model using keras.models.load_model and feed it the thresholded ROI image containing the hand as input for prediction. In practice, after importing the gesture model, we load the model created earlier, set the variables we need (initializing the background variable and setting the dimensions of the ROI), and compute the accumulated weighted average of the background, as when creating the dataset, in order to detect the hand on the live camera feed.

The tests of our model give good results, with good recognition for most digits. Figures 11 and 12 show snapshots of the recognition of the digits 1 and 4 respectively, while Figure 13 shows that the system mistranslated the gesture of digit 8, recognizing it as 9. This error may be due to the similarity between the different gesture shapes in Arabic sign language; the same kind of error appears in Figure 14 for the recognition of digit 9.

Figure 11: Recognition of number 1
Figure 12: Recognition of number 4
Figure 13: Bad recognition of number 8
Figure 14: Bad recognition of number 9
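The prediction step described above can be summarized by the following Python sketch, which combines the background accumulation, ROI thresholding and CNN prediction. The ROI coordinates, threshold value, input size, model file name and the OpenCV 4.x findContours signature are assumptions for illustration.

```python
# Hypothetical sketch of the prediction loop: background accumulation over the
# first 60 frames, foreground segmentation in the ROI, then CNN prediction.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("gesture_model.h5")          # assumed file name
TOP, BOTTOM, LEFT, RIGHT = 100, 300, 350, 550   # assumed ROI coordinates
background = None

def segment_hand(gray, threshold=25):
    """Subtract the accumulated background; return (mask, largest contour) or None."""
    diff = cv2.absdiff(background.astype("uint8"), gray)
    _, mask = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    return mask, max(contours, key=cv2.contourArea)

cap = cv2.VideoCapture(0)
for frame_idx in range(1000):
    ok, frame = cap.read()
    if not ok:
        break
    roi = cv2.cvtColor(frame[TOP:BOTTOM, LEFT:RIGHT], cv2.COLOR_BGR2GRAY)
    roi = cv2.GaussianBlur(roi, (7, 7), 0)
    if frame_idx < 60:                            # build the background model first
        if background is None:
            background = roi.astype("float")
        cv2.accumulateWeighted(roi, background, 0.5)
        continue
    result = segment_hand(roi)
    if result is not None:
        mask, _ = result                          # a hand is present in the ROI
        thresholded = cv2.resize(mask, (64, 64)).reshape(1, 64, 64, 1) / 255.0
        digit = int(np.argmax(model.predict(thresholded, verbose=0)))
        print("Predicted digit:", digit)
cap.release()
```

The 60-frame calibration mirrors the dataset-creation step, so the same segmentation is applied at training and prediction time.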
4. Evaluation

In this section, we give and discuss the results of both proposed directions of conversion, from speech to gestures and conversely.

4.1. From audio to 3D gesture

To validate our approach, we evaluated its performance. Table 1 gives the rates of correct and false detections for static and dynamic signs.

Table 1: Rates of correct and false conversion of both static and dynamic signs

                              Static gestures   Dynamic gestures   Average
Number of test images         30 images         30 images
Rate of correct detection     95.82%            88.33%             93.32%
Rate of false detection       4.14%             11.66%             6.64%

Preliminary tests of our 3D hand showed satisfactory overall accuracy for the translation of Arabic words into sign language. The results are globally satisfactory, with on average approximately 93.32% correct detections and 6.64% false detections: about 95.82% correct detections (4.14% false) for static gestures and about 88.33% correct detections (11.66% false) for dynamic gestures. Our application developed for the first direction (from speech to gesture) saves time, as it analyzes words and not letters. Moreover, the use of the 3D hand helps hearing people communicate with deaf people without having to learn their language. From the usage point of view, our application is simple to use and solves the problems of cost and availability of material.

4.2. From gesture to Arabic text

We successfully developed the approach for sign language digit detection, which can be further extended to detect Arabic words. This approach shows good gesture detection performance in terms of both classification rate and computation time. The results are globally satisfactory, with on average approximately 94.36% correct detections and 5.64% false detections.

5. Conclusion and perspectives

Sign language is an important area of research that is attracting increasing attention from research communities aiming to make life easier for deaf people. Deaf individuals face limits in terms of communication. For this reason, researchers have developed translation applications capable of translating sign language into written language and vice versa. Each of the existing solutions has disadvantages with respect to certain criteria, such as the language treated, the obstacles considered, the treatment of letters rather than words, the high cost and the unavailability of materials. In addition, few works aim at bidirectional communication; they are generally interested in the translation of gestures into speech/text and not the reverse.

In our work, we proposed a low-cost and easy-to-use two-way communication system based on Arabic word recognition. We developed and implemented the first axis of our approach, translating words into gestures based on voice recognition and an Arduino board; we tested and validated it, achieving a 93% recognition rate. Regarding the second axis, we also developed our own approach for the recognition of gestures and their translation into text, which we have tested so far on digits, obtaining a highly satisfactory recognition rate. In the short term, we aim to improve both axes through larger databases of words and gestures for increased credibility. In the medium term, we aim to integrate facial recognition and lip reading to improve our approach and distinguish similar words. In the long term, we aim to extend our approach to sentences by processing sequences of words within a reasonable time frame.

References

[1] B. H. Souha, G. Wafa, Arabic sign language recognition: Towards a dual way communication system between deaf and non-deaf people, in: 2021 IEEE/ACIS 22nd International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), 2021, pp. 37–42. doi:10.1109/SNPD51163.2021.9705002.
[2] A. Er-Rady, R. Faizi, R. O. H. Thami, H. Housni, Automatic sign language recognition: A survey, in: 2017 International Conference on Advanced Technologies for Signal and Image Processing, ATSIP'17, 2017, pp. 1–7. doi:10.1109/ATSIP.2017.8075561.
[3] R. Rastgoo, K. Kiani, S. Escalera, Sign language recognition: A deep survey, Expert Systems with Applications 164 (2021) 113794. doi:10.1016/j.eswa.2020.113794.
[4] P. Kumar, H. Gauba, P. Pratim Roy, D. Prosad Dogra, A multimodal framework for sensor based sign language recognition, Neurocomputing 259 (2017) 21–38. doi:10.1016/j.neucom.2016.08.132. Multimodal Media Data Understanding and Analytics.
[5] Setiawardhana, R. Y. Hakkun, A. Baharuddin, Sign language learning based on android for deaf and speech impaired people, in: 2015 International Electronics Symposium (IES), 2015, pp. 114–117. doi:10.1109/ELECSYM.2015.738082.
[6] S. Ghanem, C. Conly, V. Athitsos, A survey on sign language recognition using smartphones, in: Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments, PETRA '17, Association for Computing Machinery, New York, NY, USA, 2017, pp. 171–176. doi:10.1145/3056540.3056549.
[7] M. Mohandes, J. Liu, M. Deriche, A survey of image-based arabic sign language recognition, in: 2014 IEEE 11th International Multi-Conference on Systems, Signals Devices (SSD'14), 2014, pp. 1–4. doi:10.1109/SSD.2014.6808906.
[8] S. Sarker, M. M. Hoque, An intelligent system for conversion of bangla sign language into speech, in: 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET), 2018, pp. 513–518. doi:10.1109/ICISET.2018.8745608.
[9] H. Wittmann, Classification linguistique des langues signées non vocalement, Revue québécoise de linguistique théorique et appliquée 10 (1991) 215–288.
[10] J. Mahmood, Z. Tao, H. Md, A real-time computer vision-based static and dynamic hand gesture recognition system, Int. J. Image Graph. 14 (2014) 881–898. doi:10.1142/S0219467814500065.
[11] K. Hu, L. Yin, T. Wang, Temporal interframe pattern analysis for static and dynamic hand gesture recognition, in: 2019 IEEE International Conference on Image Processing, ICIP 2019, Taipei, Taiwan, September 22-25, 2019, IEEE, Singapore, 2019, pp. 3422–3426. doi:10.1109/ICIP.2019.8803472.
[12] M. Mustafa, A study on Arabic sign language recognition for differently abled using advanced machine learning classifiers, Journal of Ambient Intelligence and Humanized Computing 2 (2021) 211–226. doi:10.1016/j.eswa.2020.113794.