Multi-script Off-line Signature Verification: A Two Stage Approach

           Srikanta Pal                                   Umapada Pal                              Michael Blumenstein
     School of Information and                   Computer Vision and Pattern                     School of Information and
   Communication Technology,                   Recognition Unit, Indian Statistical             Communication Technology,
  Griffith University, Gold Coast                  Institute, Kolkata, India,                  Griffith University, Gold Coast,
          Australia, Email:                      Email: umapada@isical.ac.in                           Australia, Email:
  srikanta.pal@griffithuni.edu.au                                                              m.blumenstein@griffith.edu.au

Abstract—Signature identification and verification are of great    been devoted to the task of multi-script signature
importance in authentication systems. The purpose of this          verification. Very few published papers involving multi-
paper is to introduce an experimental contribution in the          script signatures, including non-English signatures, have
direction of multi-script off-line signature identification and    been communicated in the field of signature verification.
verification using a novel technique involving off-line English,
                                                                       Pal et al. [5] introduced a signature verification system
Hindi (Devnagari) and Bangla (Bengali) signatures. In the first
evaluation stage of the proposed signature verification            employing Hindi Signatures. The direction of the paper was
technique, the performance of a multi-script off-line signature    to present an investigation of the performance of a signature
verification system, considering a joint dataset of English,       verification system involving Hindi off-line signatures. In
Hindi and Bangla signatures, was investigated. In the second       that study, two features, the gradient feature and the
stage of experimentation, multi-script signatures were             Zernike moment feature, together with SVM classifiers, were
identified based on the script type, and subsequently the          employed. Encouraging results were obtained in this
verification task was explored separately for English, Hindi       investigation. In a different contribution by Pal et al. [6], a
and Bangla signatures based on the identified script result. The   multi-script off-line signature identification technique was
gradient and chain code features were employed, and Support
                                                                   proposed. In that report, the signatures involving Bangla
Vector Machines (SVMs) along with the Modified Quadratic
Discriminant Function (MQDF) were considered in this                (Bengali), Hindi (Devnagari) and English were considered
scheme. From the experimental results achieved, it is noted        for the signature script identification process. A multi-script
that the verification accuracy obtained in the second stage of     off-line signature identification and verification approach,
experiments (where a signature script identification method        involving English and Hindi signatures, was presented by
was introduced) is better than the verification accuracy           Pal et al. [7]. In that paper, the multi-script signatures were
produced following the first stage of experiments.                 identified first on the basis of signature script type, and
Experimental results indicated that average error rates of         afterward, verification experiments were conducted based
20.80% and 16.40% were obtained for the two different types of     on the identified script result.
verification experiments.
                                                                      Development of a general multi-script signature
    Keywords—Biometrics; off-line signature verification; multi-
script signature identification.                                   verification system, which can verify signatures of all
                                                                   scripts, is very complicated. The verification accuracy in
                      I.   INTRODUCTION                            such multi-script signature environments will not be as
Biometrics are the most widely used approaches for                 successful when compared to single script signature
personal identification and verification. Among all of the         verification [10]. To achieve the necessary accuracy for
biometric authentication systems, handwritten signatures, a        multi-script signature verification, it is important to identify
pure behavioral biometric, have been accepted as an official       signatures based on the type of script and then use an
means to verify personal identity for legal purposes on such       individual single script signature verification system for the
documents as cheques, credit cards and wills [1].                  identified script [10]. Based on this observation, in the
In general, automated signature verification is divided into       proposed system, the signatures of three different scripts are
two broad categories: static (off-line) methods and dynamic        separated to feed into the individual signature verification
(on-line) methods [2], depending on the mode of                    system. On the other hand to get a comparative idea, multi-
handwritten signature acquisition. If both the spatial as well     script signature verification results on a joint English, Hindi
as temporal information regarding signatures are available         and Bangla dataset, without using any script identification,
to the systems, verification is performed using on-line [3]        are also investigated.
data. In the case where temporal information is not available         The remainder of this paper is organized as follows. The
and the system can only utilize spatial information gleaned        multi-script signature verification concept is described in
through scanned or even camera-captured documents,                 Section II. Section III introduces the notable properties of
verification is performed on off-line data [4].                    Hindi and Bangla script. The Hindi, Bangla and English
    Considerable research has previously been undertaken in        signature database used for the current research is described
the area of signature verification, particularly involving         in Section IV. Section V briefly presents the feature
single-script signatures. On the other hand, less attention has    extraction techniques employed in this work. The classifier
                                                                   details are described in Section VI. The experimental
settings are presented in Section VII. Results and a                               III.   PROPERTIES OF HINDI AND BANGLA SCRIPT
discussion are provided in Section VIII. Finally, conclusions                Most of the Indian scripts including Bangla and Devanagari
and future work are discussed in Section IX.                                 have originated from ancient Brahmi script through various
  II.    MULTI-SCRIPT SIGNATURE VERIFICATION CONCEPT                         transformations and evolution [8]. Bangla and Devanagari
                                                                             are the two most accepted scripts in India. In both scripts,
When a country deals with two or more scripts and                            the writing style is from left to right and there is no concept
languages for reading and writing purposes, it is known as a                 of upper/lower case. These scripts have a complex
multi-script and multi-lingual country. In India, there are                  composition of their constituent symbols. The scripts are
officially 23 (Indian constitution accepted) languages and 11                recognizable by a distinctive horizontal line called the ‘head
different scripts.                                                           line’ that runs along the top of full letters, and it links all the
    In such a multi-script and multi-lingual country like                    letters together in a word. Both scripts have about fifty
India, languages are not only used for writing/reading                       basic characters including vowels and consonants.
purposes but also applied for reasons pertaining to signing
and signatures. In such an environment in India, the                                IV.    DATABASE USED FOR EXPERIMENTATION
signatures of an individual with more than one language                      A. Hindi and Bangla Signature Database
(regional language and international language) are
essentially needed in official transactions (e.g. in passport                As there has been no public signature corpus available for
                                                                             Hindi and Bangla script, it was necessary to create a database
application forms, examination question papers, money
                                                                             of Hindi and Bangla signatures. The Hindi and Bangla
order forms, bank account application forms etc.). To deal                   signature databases used for experimentation consisted of 50
with these situations, signature verification techniques                     sets per script type. Each set consists of 24 genuine
employing single-script signatures are not sufficient for                    signatures and 30 skilled forgeries. Some genuine signature
consideration. Therefore in a multi-script and multi-lingual                 samples of Hindi and Bangla, with their corresponding
scenario, signature verification methods considering more                    forgeries, are displayed in Table 1 and Table 2.
than one script are required.
   Towards this direction of verification, the contribution of               B. GPDS English Database
this paper is twofold: first, multi-script signature                         Another database, consisting of 50 sets from GPDS-160 [9],
verification considering a joint dataset, as shown in Figure 1;              was also utilised for these experiments. Each signature set
second, identification of signatures based on script,                        of this corpus consists of 24 genuine signatures and 30
and subsequent verification for English, Hindi and Bangla                    simple forgeries. Only 50 sets were used from the
signatures based on the identified script result. A diagram of               GPDS on this occasion because the Bangla
this second verification mode is shown in Figure 2.                          and Hindi datasets described previously comprised
                                                                             50 sets each, and it was considered important to have
              Multi-script off-line Signatures (Signatures                   equivalent signature numbers for experimentation.
                    of English, Hindi and Bangla)
                                                                                 TABLE 1. SAMPLES OF HINDI GENUINE AND FORGED SIGNATURES

             Verification based on Multi-script Signatures                          Genuine Signatures                 Forged signatures


                          Accuracy of Verification
  Figure 1. Diagram of signature verification considering a joint dataset.

                          Multi-script Signatures
                        (English, Hindi and Bangla)
                                                                                TABLE 2. SAMPLES OF BANGLA GENUINE AND FORGED SIGNATURES
                                                                                   Genuine Signatures                  Forged signatures
                        Signature Script Identification


        Signatures of           Signatures of         Signatures of
        English Script          Hindi Script          Bangla Script


           English                  Hindi                   Bangla
          Signature               Signature                Signature                         V.    FEATURE EXTRACTION
         Verification            Verification             Verification          Feature extraction is a crucial step in any pattern
        Figure 2. Diagram of multi-script signature identification           recognition system. Two feature extraction
     and verification based on English, Hindi and Bangla signatures.         techniques, gradient feature extraction and chain code
                                                                             feature extraction, are considered here.
A. Computation of 576-dimensional gradient Features                          f(x) = \sum_j \alpha_j \, x_j \cdot x + b
576-dimensional gradient features were extracted for this
research and experimentation, which are described in paper                   where {x_j} is the set of support vectors and the parameters
[7].                                                                         \alpha_j and b have been determined by solving a quadratic
B. 64-Dimensional Chain Code Feature Extraction                              problem [11]. The linear SVM can be extended to various
                                                                             non-linear variants; details can be found in [11, 12]. In these
The 64-dimensional Chain Code feature is determined as                       proposed experiments, the Gaussian kernel SVM
follows. In order to compute the contour points of a                         outperformed other non-linear SVM kernels, hence only the
two-tone image, a 3 x 3 window is considered surrounding the                 identification/verification results based on the Gaussian
object point. If any one of the four neighbouring points (as                 kernel are reported.
shown in Fig. 3 (a)) is a background point, then this object
point (P) is considered as a contour point. Otherwise it is a                B. MQDF Classifier
non-contour point.                                                           The Modified Quadratic Discriminant Function is defined as
The bounding box (minimum rectangle containing the                           follows [13].
character) of an input character is then divided into 7 x 7
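blocks. In each of these blocks, the direction chain code for                g(X) = (N + N_0 + n - 1) \ln\left[ 1 + \frac{1}{N_0 \sigma^2} \left[ \|X - M\|^2 - \sum_{i=1}^{k} \frac{\lambda_i}{\lambda_i + \frac{N_0}{N}\sigma^2} \{\phi_i^T (X - M)\}^2 \right] \right]
each contour point is noted and the frequency of the
direction codes is computed. Here, the chain code of four
directions only [directions 1 (horizontal), 2 (45 degree                            + \sum_{i=1}^{k} \ln\left( \lambda_i + \frac{N_0}{N}\sigma^2 \right)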
slanted), 3 (vertical) and 4 (135 degree slanted)] is used.
                                                                             where X is the feature vector of an input character; M is the
Four chain code directions are shown in Fig. 3 (b). It is
                                                                             mean vector of samples; \phi_i is the ith eigenvector of the
assumed that the chain code of directions 1 and 5, 2 and 6, 3
                                                                             sample covariance matrix; \lambda_i is the ith eigenvalue of the
and 7, 4 and 8, are the same. Thus, in each block, an array is
                                                                             sample covariance matrix; k is the number of eigenvalues
obtained of four integer values representing the frequencies,
                                                                             considered here; n is the feature size; \sigma^2 is the initial
and those frequency values are used as features. Thus, for 7
                                                                             estimate of the variance; N is the number of learning
x 7 blocks, 7 x 7 x 4 = 196 features are obtained. To reduce
                                                                             samples; and N_0 is a confidence constant for \sigma^2, which is
the feature dimensions, after the histogram calculation into 7
                                                                             set to 3N/7 for experimentation. Not all of the eigenvalues
x 7 blocks, the blocks are down-sampled with a Gaussian
                                                                             and their respective eigenvectors are used for
filter into 4 x 4 blocks. As a result, 4 x 4 x 4 = 64 features
                                                                             classification. Here, the eigenvalues are sorted in
are obtained for recognition. To normalize the features, the
                                                                             descending order and the first 60 (k=60) eigenvalues and
maximum value of the histograms over all the blocks is
                                                                             their respective eigenvectors are used for classification.
computed. Each of the above features is divided by this
                                                                             The value k=60 was chosen as a trade-off between accuracy
maximum value to obtain the feature values between 0 and
                                                                             and computation time.
1.
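The block-wise direction histogram described in Section V.B can be illustrated with a short Python sketch. It is not the authors' implementation and rests on several assumptions: the image is a nested list of 0/1 values, pixels outside the image are treated as background, the direction of a contour point is taken from its neighbouring contour points rather than from an ordered contour trace, and the Gaussian down-sampling from 7 x 7 to 4 x 4 blocks is omitted, so the sketch returns the 196-dimensional vector. The names contour_points and chain_code_features are hypothetical.

```python
# 8-neighbour offsets (dy, dx) folded into four directions, as in the text:
# 0 = horizontal (codes 1 and 5), 1 = 45 degrees (2 and 6),
# 2 = vertical (3 and 7), 3 = 135 degrees (4 and 8). Image y grows downward.
OFFSETS = {
    (0, 1): 0, (0, -1): 0,
    (-1, 1): 1, (1, -1): 1,
    (-1, 0): 2, (1, 0): 2,
    (-1, -1): 3, (1, 1): 3,
}


def contour_points(img):
    """Return object pixels that have at least one background 4-neighbour."""
    h, w = len(img), len(img[0])
    points = set()
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                ny, nx = y + dy, x + dx
                # Assumption: anything outside the image counts as background.
                if not (0 <= ny < h and 0 <= nx < w) or not img[ny][nx]:
                    points.add((y, x))
                    break
    return points


def chain_code_features(img, grid=7):
    """Histogram of four folded directions per block, scaled to [0, 1]."""
    points = contour_points(img)
    hist = [0] * (grid * grid * 4)
    if not points:
        return [0.0] * len(hist)
    top = min(p[0] for p in points)
    left = min(p[1] for p in points)
    box_h = max(p[0] for p in points) - top + 1
    box_w = max(p[1] for p in points) - left + 1
    for y, x in points:
        # Map the pixel to its block inside the bounding box.
        by = (y - top) * grid // box_h
        bx = (x - left) * grid // box_w
        for (dy, dx), direction in OFFSETS.items():
            if (y + dy, x + dx) in points:
                hist[(by * grid + bx) * 4 + direction] += 1
    peak = max(hist)
    # Divide by the global maximum, as in the normalization step of the text.
    return [v / peak for v in hist] if peak else [float(v) for v in hist]
```

For a filled 7 x 7 square, each contour pixel falls into its own block, and the blocks along the top edge receive only horizontal (and, next to the corners, diagonal) counts.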
                                                                                                 VII. EXPERIMENTAL SETTINGS
                                                                             A. Settings for Verification used in 1st Stage of Experiments
                                                                             The skilled forgeries were not considered for training
                                                                             purposes. For experimentation, random signatures were
                         (a)                         (b)                     considered for training purposes. For each signature set, an
 Figure 3. (a) A point P and its neighbours. (b) The direction codes for
                 the eight neighbouring points of P.
                                                                             SVM was trained with 12 randomly chosen genuine
                                                                             signatures. The negative samples for training (random
                     VI.       CLASSIFIER DETAILS                            signatures) were the genuine signatures of 149 other
  Based on these features, Support Vector Machines                           signature sets. Two signatures were taken from each set. In
(SVMs) and the Modified Quadratic Discriminant Function                      total, there were 149x2=298 random signatures employed
(MQDF) are applied as the classifiers for the experiments.                   for training. For testing, the remaining 12 genuine
                                                                             signatures and 30 skilled forgeries of the signature set being
A. SVM Classifier                                                            considered were employed. The number of samples for
     For this experiment, a Support Vector Machine (SVM)                     training and testing for these experiments are shown in
classifier is used. The SVM is originally defined for two-                   Table 3.
class problems and it looks for the optimal hyper plane,                        Table 3. No. of Signatures used per set in 1st Phase of Verification
which maximizes the distance and the margin, between the                                            Genuine          Random           Skilled
nearest examples of both classes, named support vectors                                             Signature       Signatures       Forgeries
(SVs). Given a training database of M data: {xm| m=1,..., M},                         Training         12              298              n/a
the linear SVM classifier is then defined as:                                         Testing          12              n/a              30
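As a minimal sketch of the decision function in Section VI.A, f(x) = sum_j alpha_j (x_j . x) + b, the following Python code evaluates it for a given set of support vectors, and also with a Gaussian kernel in place of the dot product. It assumes that the class labels are already folded into signed coefficients alpha_j, since the formula in the text carries no separate label term. The coefficient values, the gamma parameter and the function names are hypothetical; training (solving the quadratic problem) is not shown.

```python
import math


def linear_kernel(u, v):
    """Plain dot product, as in the linear SVM formula."""
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value only."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def decision(x, support_vectors, alphas, b, kernel=linear_kernel):
    """Evaluate f(x) = sum_j alpha_j * K(x_j, x) + b.

    alphas are signed coefficients (assumption: labels folded in).
    The sign of the result gives the predicted class.
    """
    total = sum(a * kernel(sv, x) for a, sv in zip(alphas, support_vectors))
    return total + b
```

Passing kernel=gaussian_kernel switches from the linear form to the non-linear variant without changing the rest of the evaluation.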
B. Settings for Verification used in 2nd Stage of Experiments           identification stage by using the SVM classifier. The
1) Settings for Signature Script Identification                         accuracies for Bangla, English and Hindi are 85.19%, 95.74% and
150 sets of signatures (50 sets of English, 50 sets of Hindi            98.33%, respectively. The confusion matrix obtained using
and 50 sets of Bangla) were used for signature script                   the SVM classifier and the 64-dimensional chain code features
identification. 30 sets of signatures from each script were             is shown in Table 6.
considered for training, and the remaining 20 sets were                  TABLE 5. ACCURACY OBTAINED USING SVM AND MQDF CLASSIFIERS
considered for testing purposes. The number of samples for
                                                                                     Classifiers              Identification Accuracy (%)
training and testing used in experimentation of the
identification approach are shown in Table 4.                                          SVMs                                    93.08
                                                                                      MQDF                                     82.45
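Table 5 reports the MQDF alongside SVMs. As a rough illustration of how an MQDF score of the kind defined in Section VI.B could be evaluated, the sketch below assumes the commonly cited form from [13], g(X) = (N + N0 + n - 1) ln[1 + (1/(N0 sigma^2)) (||X - M||^2 - sum_i lambda_i/(lambda_i + (N0/N) sigma^2) (phi_i . (X - M))^2)] + sum_i ln(lambda_i + (N0/N) sigma^2). The eigenvalues and eigenvectors are taken as precomputed inputs, and the function name and argument names are hypothetical. Under this form a lower score would indicate a closer match to the class.

```python
import math


def mqdf_score(x, mean, eigvals, eigvecs, sigma2, n_samples, n0):
    """Evaluate an MQDF-style score for one class.

    x, mean      : feature vector and class mean (same length n)
    eigvals      : the k leading eigenvalues, in descending order
    eigvecs      : the matching k eigenvectors (each of length n)
    sigma2       : initial estimate of the variance
    n_samples, n0: number of learning samples and confidence constant
    """
    n = len(x)
    diff = [a - m for a, m in zip(x, mean)]
    sq_dist = sum(d * d for d in diff)
    shrink = (n0 / n_samples) * sigma2
    proj_term = 0.0
    log_term = 0.0
    for lam, vec in zip(eigvals, eigvecs):
        # Projection of (x - mean) onto the i-th eigenvector.
        proj = sum(v * d for v, d in zip(vec, diff))
        proj_term += lam / (lam + shrink) * proj * proj
        log_term += math.log(lam + shrink)
    inner = 1.0 + (sq_dist - proj_term) / (n0 * sigma2)
    return (n_samples + n0 + n - 1) * math.log(inner) + log_term
```

With N0 = 3N/7 as in the text, the score at the class mean reduces to the sum of the logarithmic terms, and it grows as X moves away from the mean.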
TABLE 4. SIGNATURE SAMPLES USED FOR SCRIPT IDENTIFICATION PHASE.

            English Signatures   Hindi Signatures   Bangla Signatures
            Genuine    Forged    Genuine   Forged   Genuine   Forged
 Training     720       900       720       900      720        900
 Testing      480       600       480       600      480        600

TABLE 6. CONFUSION MATRIX OBTAINED USING THE CHAIN CODE FEATURE AND SVM CLASSIFIER

                 Bangla     English     Hindi
 Bangla            920         19        141
 English            27       1034         19
 Hindi              10          8       1062
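The per-script accuracies quoted for the identification stage can be recomputed from the confusion matrix in Table 6. The short Python sketch below assumes that rows hold the true script and columns the predicted script, which is the reading consistent with the reported figures. The overall value comes out at about 93.09% when rounded, which agrees with the reported 93.08% up to rounding.

```python
LABELS = ["Bangla", "English", "Hindi"]

# Rows: true script; columns: predicted script (assumed orientation).
CONFUSION = [
    [920, 19, 141],
    [27, 1034, 19],
    [10, 8, 1062],
]


def per_class_accuracy(matrix):
    """Percentage of each row that lies on the diagonal."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Percentage of all samples that lie on the diagonal."""
    correct = sum(row[i] for i, row in enumerate(matrix))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


if __name__ == "__main__":
    for label, acc in zip(LABELS, per_class_accuracy(CONFUSION)):
        print(f"{label}: {acc:.2f}%")
    print(f"Overall: {overall_accuracy(CONFUSION):.2f}%")
```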
2) Settings for Signature Verification after Signature Script
Identification                                                          Based on the outcomes of the identification phase,
The verification task in the second stage was explored                  verification    experiments     subsequently     followed.
separately for English signatures, Hindi signatures and                 Verification results obtained for individual scripts were
Bangla signatures based on the identified script result.                calculated on 93.08% (identification rate) accuracy levels.
Signature samples (30 sets from each script) that were                  In this phase of experimentation, the SVMs produced an
considered for training purposes in signature script                    overall AER of 21.10%, 13.05% and 15.05% using English,
identification were not used for the individual verification            Hindi and Bangla signatures respectively. The overall
task. Only the correctly identified samples from 20 sets                verification accuracy obtained for the second major
(used for the testing part in identification) were considered           experiments (identification plus verification) was 83.60%
for verification. For each signature set, an SVM was trained            (average of 78.90% of English, 86.95% of Hindi and
with 12 genuine signatures. The negative samples for                    84.94% of Bangla).
training were 95 (19x5) genuine signatures of 19 other
signature sets.                                                         B. Comparison of Performance

               VIII. RESULTS AND DISCUSSION                             From the experimental results obtained, it was observed that
                                                                        the performance of signature verification in the second set
A. Experimental Results
                                                                        of experiments (identification and verification) was
    1) First Verification Experiments                                   encouraging compared to the signature verification accuracy
In this stage of experimentation, 8100 (150x54) signatures              from the first experiment set (verification only). Table 7
involving English, Hindi and Bangla scripts were employed               demonstrates the accuracies attained in the first experiment
for training and testing purposes. At this operational point,           set as well as separate verification results for English, Hindi
the SVMs produced an AER of 20.80%, and an encouraging                  and Bangla from the second experiment set.
accuracy of 79.20% was achieved in this first mode of
verification.                                                              TABLE 7. VERIFICATION ACCURACIES RESULTING FROM DIFFERENT
                                                                                                      EXPERIMENTS
    2) Second Verification Experiments
In this stage of verification the signatures are identified                            Verification Techniques                     Accuracy (%)
based on their script and subsequently, the identified                       Experiment Sets              Dataset Used
signatures are applied separately for verification. In the                                           English, Hindi and
signature script identification stage, only 64-dimensional                    1st experiment                                           79.20
                                                                                                            Bangla
chain code features were used because a slightly better
accuracy was obtained when compared to the gradient                                                         English                    78.90
feature. The MQDF classifier was also taken into account in
                                                                              2nd experiment                 Hindi                     86.95
the script identification step applying chain code features for                                             Bangla                     84.94
a better accuracy, but the MQDF did not outperform the
SVMs in this study. To get a
comparative idea, script identification results using two               In the second stage of verification, the overall accuracy is
different classifiers with chain code features are shown in             83.60% (Avg. of 78.90%, 86.95% and 84.94%) which is
Table 5. An accuracy of 93.08% is achieved at the script                4.40 percentage points (83.60-79.20) higher than the accuracy in the first
stage. The comparison of these two accuracies is shown in            substantially affects the verification accuracy, indicates an
Table 8.                                                             important step in the process. The comparatively higher
                                                                     verification accuracy obtained in the second experimental
     TABLE 8. ACCURACY IN DIFFERENT PHASES OF VERIFICATION
                                                                     approach is likewise a substantial contribution. The gradient
       Verification Experiment           Verification Accuracy (%)   feature, chain code feature as well as SVM and MQDF
     Without Script Identification                  79.20            classifiers were employed for experimentation. The idea of a
       With Script Identification                   83.60            multi-script signature verification approach, which deals
                                                                     with an identification phase, is a very important contribution
                                                                     to the area of signature verification. The proposed off-line
From the above table it is evident that verification accuracy        multi-script signature verification scheme is a new
with script identification is 4.40 percentage points higher than without script        investigation in the field of off-line signature verification. In
identification. This increased accuracy is achieved because          the near future, we plan to extend our work considering
of the proper application of the identification stage. This          further sets of signature samples, which may include
research clearly demonstrates the importance of using                different languages/scripts.
identification in multi-script signature verification
techniques.                                                                               X.      ACKNOWLEDGMENTS
C. Error Analysis                                                    Thanks to my colleague Mr. Nabin Sharma for his help
Most of the methods used for signature verification generate         towards the preparation of this paper.
some erroneous results. In these experiments, a few                                               REFERENCES
signature samples were mis-identified in both the
                                                                     [1]  R. Plamondon and G. Lorette, “Automatic signature verification and
identification and verification stages. A few of the confusing            writer identification - the state of the art”, Pattern Recognition,
signature samples obtained in the signature script                        pp.107–131, 1989.
identification stage using the SVM classifier are shown in           [2] S. Madabusi, V. Srinivas, S. Bhaskaran and M. Balasubramanian,
Figures 4, 5 and 6. Three categories of confusing samples                 “On-line and off-line signature verification using relative slope
are generated by the classifier. The first category illustrates           algorithm”, in proc. International Workshop on Measurement
                                                                          Systems for Homeland Security, pp. 11-15, 2005.
a Bangla signature sample treated as a Hindi signature
                                                                     [3] D. Impedovo, G. Pirlo, “ Automatic signature verification: The state
sample. The second one illustrates an English signature                   of the art”, IEEE transactions on Systems, Man, and Cybernetics part-
sample treated as a Bangla signature sample and the third                 C, vol. 38, no. 5, pp. 609–635, 2008.
one illustrates a Hindi signature sample treated as a Bangla         [4] M. Kalera, S. Srihari and A. Xu. “Offline signature verification and
signature sample.                                                         identification using distance statistics”, International Journal on
                                                                          Pattern Recognition and Artificial Intelligence, pp.1339-1360, 2004.
                                                                     [5] S. Pal, U. Pal, M. Blumenstein, “Hindi Off-line Signature
                                                                          Verification”, in proc. International Conference on Frontiers in
                                                                          Handwritten Recognition, ICFHR 2012, Bari, Italy, pp. 371-376.
                                                                     [6] S. Pal, A. Alaei, U. Pal, M. Blumenstein, “Multi-Script off-line
                Figure 4. Bangla sample treated as Hindi                  signature identification” , in proc. International Conference on Hybrid
                                                                          Intelligent Systems, pp. 236-240, 2012.
                                                                     [7] S. Pal, U. Pal, M. Blumenstein, “Hindi and English off-line signature
                                                                          identification and verification”, in proc. International Conference on
                                                                          Advances in Computing. pp. 905–910, 2012.
                  Figure 5. English treated as Bangla                [8] B. B. Chaudhuri and U. Pal, “An OCR system to read two Indian
                                                                          language scripts: Bangla and Devnagari (Hindi)”, in proc.
                                                                          International Conference on Document Analysis and Recognition, pp.
                                                                          1011–1015, 1997.
                                                                     [9] M. A. Ferrer, J. B. Alonso, and C. M. Travieso, “Offline geometric
                                                                          parameters for automatic signature verification using fixed-point
             Figure 6. Hindi Signature treated as Bangla                  arithmetic”, IEEE transactions on Pattern Analysis and Machine
                                                                          Intelligence, 27:993–997, 2005.
          IX.      CONCLUSIONS AND FUTURE WORK                       [10] S. Pal, U. Pal and M. Blumenstein, “A Two-Stage Approach for
This paper provides an investigation of the excellent                     English and Hindi Off-line Signature Verification”, International
                                                                          workshop on Emerging Aspects in Handwritten signature processing,
performance of a multi-script signature verification                      2013 (Accepted).
technique involving English, Hindi and Bangla off-line               [11] V. Vapnik, “The Nature of Statistical Learning Theory”, Springer-
signatures. The novel approach used in a multi-script                     Verlag, 1995.
signature verification environment with the combination of a         [12] C. Burges, “A Tutorial on support Vector machines for pattern
custom Hindi and Bangla off-line signature dataset provides               recognition”, Data Mining and Knowledge Discovery, pp.1-43, 1998.
a substantial contribution to the field of signature                 [13] F. Kimura, K. Takashina, S. Tsuruoka and Y. Miyake, “Modified
                                                                          quadratic discriminant function and the application to Chinese
verification. In such a verification environment, the proper              character recognition”, IEEE transactions on Pattern Analysis and
utilization of a script identification technique, which                   Machine Intelligence, Vol. 9, pp 149-153, 1987.