Proceedings of the 1st International Workshop on Automated Forensic Handwriting Analysis (AFHA) 2011


   The effect of training data selection and sampling
        time intervals on signature verification
                                          János Csirik, Zoltán Gingl, Erika Griechisch
                                                       Department of Informatics
                                                         University of Szeged
                                                           Szeged, Hungary
                                                {csirik,gingl,grerika}@inf.u-szeged.hu


   Abstract—Based on an earlier proposed procedure and data,               on the template feature selection [3]; some combine local and
we extended our signature database and examined the differences            global features [4].
between signature samples recorded at different times and the
relevance of training data selection. We found that the false accept          A key step in signature recognition was provided in the
and false reject rates strongly depend on the selection of the             First International Signature Verification Competition [5], and
training data, but samples taken during different time intervals           reviews about the automatic signature verification process
hardly affect the error rates.                                             were written by Leclerc and Plamondon [6], [7], Gupta [8],
   Index Terms—online signature; signature verification                    Dimauro et al. [9] and Sayeed et al. [10].
                                                                              Many signals and therefore many different devices can be
                       I. INTRODUCTION                                     used in signature verification. Different types of pen tablets
   In our earlier study [1], we investigated a procedure for               have been used in several studies, as in [11], [12]; the F-Tablet
signature verification which is based on acceleration signals.             was described in [13] and the Genius 4x3 PenWizard was used
The necessary details of the method, used in both the earlier and          in [14]. In several studies (like ours), a special device (pen)
the present study, are explained in Section II. Previously, we             was designed to measure the dynamic characteristics of the
created a database of genuine signatures and unskilled forgeries and       signing process.
used the dynamic time warping method to solve a two-class                     In [15], the authors considered the problem of measuring
pattern recognition problem.                                               the acceleration produced by signing with a device fitted with
   In the present study we extended the database with fresh                4 small embedded accelerometers and a pressure transducer. It
recordings of the signatures from former signature suppliers,              mainly focused on the technical background of signal record-
thus we were able to compare signature samples recorded                    ing. In [16], they described the mathematical background
in different time periods. In addition, we examined how                    of motion recovery techniques for a special pen with an
the selection of training data can affect the results of the               embedded accelerometer.
verification process.                                                         Bashir and Kempf in [17] used a Novel Pen Device and
   Several types of biometric authentication exist. Some of                DTW for handwriting recognition and compared the accel-
them have appeared in the last few decades, such as DNA and                eration, grip pressure, longitudinal and vertical axis of the
iris recognition and they provide more accurate results than the           pen. Their main purpose was to recognize characters and PIN
earlier methods did (e.g. fingerprint, signature). Hence they              words, not signatures. Rohlik et al. [18], [19] employed a
are more difficult to forge. However, a signature is still the             similar device to ours to measure acceleration. Theirs was
most widely accepted method for identification (in contracts,              able to measure 2-axis accelerations, in contrast to ours
bank transfers, etc.). This is why studies tackle the problem              which can measure 3-axis accelerations. However, our pen
of signature verification and examine the process in detail.               cannot measure pressure like theirs. The other difference is
Usually their aim is to study the mechanics of the process and             the method of data processing. In [18] they had two aims,
learn what features are hard to counterfeit.                               namely signature verification and author identification, while
   There are two basic ways of recognizing signatures, namely              in [19] the aim was just signature verification. Both made use
the offline and the online. Offline signature recognition is               of neural networks.
based on the image of the signature, while the online case uses               Many studies have their own database [12], [13], but
data related to the dynamics of the signing process (pressure,             generally they are unavailable for testing purposes. However,
velocity, etc.). The main problem with the offline approach is             some large databases are available, like the MCYT biometric
that it gives higher false accept and false reject errors, but the         database [20] and the database of the SVC2004 competition1
dynamic approach requires more sophisticated techniques.                   [5].
   The online signature recognition systems differ in their
feature selection and decision methods. Some studies analyze
the consistency of the features [2], while others concentrate                1 Available at http://www.cse.ust.hk/svc2004/download.html




                  II. PROPOSED METHOD                                 B. Database
                                                                         The signature samples were collected from 40 subjects.
A. Technical background                                               Each subject supplied 10 genuine signatures and 5 unskilled
                                                                      forgeries, and 8-10 weeks later the recording was repeated with
   We used a ballpoint pen fitted with a three-axis accelerom-        20 subjects, so we had a total of 40 × 15 + 20 × 15 = 900
eter to follow the movements of handwriting sessions. Ac-             signatures. The signature forgers were asked each time to
celerometers can be placed at multiple positions of the pen,          produce 5 signatures of another person participating in the
such as close to the bottom and/or close to the top of the            study.
pen [15], [17]. Sometimes grip pressure sensors are also                 In order to make the signing process as natural as possible,
included to get a comprehensive set of signals describing the         there were no constraints on how the person should sign. This
movements of the pen, finger forces and gesture movements.            led to some problems in the analysis because it was hard
In our study we focused on the signature-writing task, so we          to compare the 3 pairs of axis curves of two signatures. During a
placed the accelerometer very close to the tip of the pen to          signing session, the orientation of the pen can vary somewhat
track the movements as accurately as possible (see Figure 1).         (e.g. a rotation with a small angle causes big differences for
                                                                      each axis). This was why we chose to reduce the 3 dimensional
   In our design we chose the LIS352AX accelerometer chip
                                                                      signals to 1 dimensional signals and we only compared the
because of its signal range, high accuracy, impressively low
                                                                      magnitudes of the acceleration vector data.
noise and ease-of-use. The accelerometer was soldered onto a             Figure 3 shows the acceleration signals of 2 genuine signa-
very small printed circuit board (PCB) and this board was             tures and 2 forged signatures. Figures 3a and 3b show samples
glued about 10mm from the writing tip of the pen. Only                from the same author, and they appear quite similar. Figures 3c
the accelerometer, the decoupling and filtering chip capacitors       and 3d are the corresponding forged signatures, which differ
were placed on the assembled PCB. A thin five-wire                    significantly from the first two.
ribbon cable was used to power the circuit and carry the three
acceleration signals from the accelerometer to the data acqui-        C. Distance between time series
sition unit. The cable was thin and long enough so as not to             An elastic distance measure was applied to determine
disturb the subject when s/he provided a handwriting sample.          dissimilarities between the data. The dynamic time warping
Our tiny general purpose three-channel data acquisition unit          (DTW) approach is a commonly used method to compare time
served as a sensor-to-USB interface [21].                             series. The DTW algorithm finds the best non-linear alignment
   The unit has three unipolar inputs with a signal range of 0        of two vectors such that the overall distance between them is
to 3.3V, and it also supplied the 3.3V needed to power the sensor.    minimized. The DTW distance between the vectors
The heart of the unit is a mixed-signal microcontroller called        $u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_m)$ (in our case,
C8051F530A that incorporates a precision multichannel 12-bit          the acceleration vector data of the signatures) can be calculated
analogue-to-digital converter. The microcontroller runs a data        in $O(n \cdot m)$ time.
logging program that allows easy communication with the host             We can construct, iteratively, a matrix
computer via an FT232RL-based USB-to-UART interface. The              $C \in \mathbb{R}^{(n+1) \times (m+1)}$ in the following way:
general purpose data acquisition program running on the PC
was written in C#, and it allowed the real-time monitoring                  $C_{0,0} = 0$
of signals. Both the hardware and software developments are                 $C_{i,0} = +\infty, \quad i = 1, \ldots, n$
fully open-source [22]. A block diagram of the measurement                  $C_{0,j} = +\infty, \quad j = 1, \ldots, m$
setup is shown in Figure 2.                                                 $C_{i,j} = |u_i - v_j| + \min(C_{i-1,j}, C_{i,j-1}, C_{i-1,j-1})$
   The bandwidth of the signals was set to 10Hz in order                    for $i = 1, \ldots, n$ and $j = 1, \ldots, m$.
to remove unwanted high-frequency components and prevent
aliasing. Moreover, the sample rate was set to 1000Hz. The
signal range was closely matched to the input range of the               The entry $C_{n,m}$ then gives the DTW distance between the
data acquisition unit, hence a clean, low noise output was            vectors $u$ and $v$:
obtained. The acquired signals were then saved to a file for                $d_{DTW}(u, v) = C_{n,m}.$
offline processing and analysis.
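
The reduction from three axes to a single magnitude signal (Section II-B) can be illustrated with a minimal Python sketch. The function names magnitudes and rotate_z and the sample values are illustrative assumptions, not part of the recorded data. The example only shows that rotating the sensor frame changes the per-axis values while leaving the magnitude unchanged.

```python
import math

def magnitudes(samples):
    """Reduce 3-axis acceleration samples (ax, ay, az) to 1-D magnitudes."""
    return [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples]

def rotate_z(sample, angle):
    """Rotate one (ax, ay, az) sample about the z axis by `angle` radians."""
    ax, ay, az = sample
    c, s = math.cos(angle), math.sin(angle)
    return (c * ax - s * ay, s * ax + c * ay, az)

# Hypothetical samples, chosen so that the magnitudes are easy to check.
signal = [(3.0, 4.0, 0.0), (1.0, 2.0, 2.0)]
rotated = [rotate_z(p, math.radians(10)) for p in signal]

# The per-axis values differ after the rotation, but the magnitude
# sequences agree up to floating-point rounding.
same = all(math.isclose(a, b)
           for a, b in zip(magnitudes(signal), magnitudes(rotated)))
```

Comparing magnitudes therefore avoids aligning the three axes of two recordings, at the cost of discarding the direction of the acceleration.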


Fig. 1: The three-axis accelerometer is mounted close to the
tip of the pen                                                            Fig. 2: Block diagram of the data acquisition system




      (a) Genuine - 1st time period    (b) Genuine - 2nd time period         (c) Forgery - 1st time period            (d) Forgery - 2nd time period

       Fig. 3: The images and corresponding acceleration signals of two genuine signatures and two forged signatures


   The DTW algorithm has several versions (e.g. weighted                                        False acceptance/rejection rates
                                                                                                 Type I Type II No of cases
DTW and bounded DTW), but we decided to use the simple
version above, where |ui − vj | denotes the absolute difference                                    0%           0%                39
                                                                                                  20%           0%               135
between the coordinate i of vector u and coordinate j of vector                                   40%           0%                68
v.                                                                                                60%           0%                 7
   Since n and m are of the order of $10^3$ to $10^4$,                                            80%           0%                 3
our implementation does not store the whole C matrix, whose                                                   Total              252
                                                                                               24.13%           0%
size is about $n \times m \approx 10^6$ to $10^8$. Instead, for each
iteration, just the last two rows of the matrix were stored.                        TABLE I: A typical distribution of error rates
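
The recurrence for C, together with the two-row storage scheme just described, can be sketched in Python as follows. This is a minimal illustration under our own naming (dtw_distance is a hypothetical helper), not the implementation used in the experiments.

```python
import math

def dtw_distance(u, v):
    """DTW distance C[n][m] for two 1-D sequences.

    Only the previous and the current row of C are kept, so the memory
    use is O(m) while the running time stays O(n * m).
    """
    n, m = len(u), len(v)
    # Row 0 of C: C[0][0] = 0 and C[0][j] = +inf for j = 1..m.
    prev = [0.0] + [math.inf] * m
    for i in range(1, n + 1):
        # C[i][0] = +inf; the remaining entries are filled in below.
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(u[i - 1] - v[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]
```

For example, dtw_distance([1, 2, 3], [1, 2, 2, 3]) is 0, because the repeated sample in the second sequence can be aligned with the single 2 in the first one.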
        III. SELECTION OF REFERENCE SIGNATURES
                                                                                                False acceptance/rejection rates
   First, we examined the 40 · 15 = 600 signatures from                                         Type I    Type II No of cases
the first time period. For each person, 5 genuine signatures                                       0%           0%                13
were first chosen at random as references and included in                                          0%          20%                52
                                                                                                   0%          60%                45
the training set. All the other signatures of this person and                                     20%           0%                 8
unskilled forgeries of their signature were used for testing.                                     20%          60%                58
Thus the test set contained 5 genuine and 5 unskilled forged                                      20%          20%                45
                                                                                                  40%          20%                 8
signatures for each person.                                                                       40%          60%                22
   We first computed the minimum pairwise distance among the five                                 60%          60%                 1
elements of the training set ($D_{min}$). Then, for each signature                                             Total             252
in the test set, the minimum distance of the signature from                                    13.81%        38.33%
the training set’s five signatures was found ($D_{dis}$). Now, if for
some t in the test set                                                            TABLE II: A different distribution of error rates
                         $D_{dis} < m \cdot D_{min}$

then t was accepted as a true signature; otherwise it was                     Based on our earlier studies [1], we set the multiplier m at
rejected.                                                                  2.16 because we got the highest overall accuracy ratio (88.5%)
   Besides the minimum we also used two other metrics,                     with this value.
namely the maximum and average distances, but the minimum                     A typical distribution of Type I and Type II error rates is
produced the lowest error rates.                                           shown in Table I. The first two columns show the error rates,
   The performance of a signature verification algorithm can be            while the third shows the number of cases with the corresponding
measured by the Type I error rate (false reject), when a genuine           error rates. The last row shows the average error rates.
signature is labelled as a forgery and Type II error rate (false              According to this table, in 39 cases (out of 252) the Type I
accept), when a forged signature is marked as genuine. After               and Type II error rates are equal to 0. The average Type I error
we analyzed the results, we observed that the Type I and II                rate over the 252 possibilities is 24.13%, while the average
errors depend on how we choose the reference signatures, so                Type II error rate is 0%. For 27 authors (out of 40) and for each
we checked all the possible choices of reference signatures and            case, the false reject rates were 0%. A much worse, but very rare
compared the error rates. For each person there were                       case is shown in Table II.
$\binom{10}{5} = 252$ ways to choose the 5 reference signatures               The average false accept rate was 14.34%, with a standard
from the 10 genuine signatures.                                            deviation of 13.62%; the average false reject rate was 12.89%,
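
The decision rule and the exhaustive check of all reference choices described in this section can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names, the layout of the distance matrix and the toy data are ours, and only the multiplier 2.16 and the rule $D_{dis} < m \cdot D_{min}$ are taken from the text.

```python
from itertools import combinations

def error_rates(dist, genuine, forged, refs, multiplier=2.16):
    """Return (Type I, Type II) error rates for one choice of references.

    dist[a][b] is the DTW distance between signatures a and b (symmetric);
    genuine and forged are lists of indices; refs is a subset of genuine.
    """
    # D_min: smallest pairwise distance among the reference signatures.
    d_min = min(dist[a][b] for a, b in combinations(refs, 2))

    def accepted(t):
        # D_dis: distance from test signature t to its nearest reference.
        d_dis = min(dist[t][r] for r in refs)
        return d_dis < multiplier * d_min

    test_genuine = [g for g in genuine if g not in refs]
    false_rejects = sum(not accepted(t) for t in test_genuine)
    false_accepts = sum(accepted(t) for t in forged)
    return false_rejects / len(test_genuine), false_accepts / len(forged)

def rates_for_all_choices(dist, genuine, forged, n_refs=5, multiplier=2.16):
    """Evaluate every way of choosing n_refs references from the genuine set."""
    return {refs: error_rates(dist, genuine, forged, refs, multiplier)
            for refs in combinations(genuine, n_refs)}

# Toy data (not from the study): signatures are points on a line and the
# "distance" between two of them is their absolute difference.
points = [0, 1, 2, 3, 5, 11]
toy_dist = [[abs(p - q) for q in points] for p in points]
toy_rates = rates_for_all_choices(toy_dist, genuine=[0, 1, 2, 3],
                                  forged=[4, 5], n_refs=2)
```

With 10 genuine signatures and n_refs=5 the dictionary has 252 entries, one per row of counts summarised in Tables I and II. In the toy data the choice (0, 3) yields a looser threshold than (0, 1) and accepts one of the two forgeries, which mirrors the dependence on the reference selection reported above.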




                   DTW          AE50         AE51         AE52         AE53         AE54         AE55         AE56         AE57         AE58         AE59         ME60         ME61          ME62         ME63         ME64

                  AE50           0
                  AE51           63             0
                  AE52           98            64            0
                  AE53          125            71          105           0
                  AE54          116            65           67          101            0
                  AE55           63           113          136          167          157           0
                  AE56          114            80           76          127           67          155            0
                  AE57          104            68           76          115           73          147           63           0
                  AE58           74            66           63          111           59          105           37           49           0
                  AE59          233           173           86          177           82          317          165          152          122           0
                  ME60          344           239          254          281          386          532          333          202          234          372           0
                  ME61          274           232          252          285          441          450          402          239          246          501          135           0
                  ME62          237           177          175          231          255          350          222          179          158          316           70          107            0
                  ME63          318           259          260          304          410          494          334          221          227          372           50           83           67           0
                  ME64          710           677          697          716          875          854          796          670          684          977          260          198          395          269           0


                                                           TABLE III: Sample distance matrix – First time period
                  DTW2          AE80         AE81         AE82         AE83         AE84         AE85         AE86         AE87         AE88         AE89         ME90         ME91          ME92         ME93         ME94

                  AE80           0
                  AE81          34             0
                  AE82          34            41            0
                  AE83          50            63           47            0
                  AE84          52            58           43           49            0
                  AE85          217           213          179          227          206           0
                  AE86          139           130          152          150          145          325            0
                  AE87          117           103          144          154          147          339           81           0
                  AE88          55            52           52           91           82           140          154          121           0
                  AE89          65            63           60           71           65           233          105          125          92            0
                  ME90          293           245          270          355          310          236          336          302          228          328           0
                  ME91          227           198          208          295          252          245          275          262          165          259          54            0
                  ME92          339           298          322          419          387          288          393          348          273          413          45           106           0
                  ME93          617           625          569          617          699          473          518          415          473          770          202          260          117           0
                  ME94          388           425          492          540          582          293          469          376          395          582          67           150          40           100           0


                                                      TABLE IV: Sample distance matrix – Second time period
   DTW   AE50   AE51     AE52          AE53         AE54         AE55         AE56         AE57         AE58         AE59         AE80         AE81         AE82         AE83         AE84         AE85         AE86        AE87   AE88   AE89

  AE50    0
  AE51    63      0
  AE52    98     64         0
  AE53   125     71       105           0
  AE54   116     65        67          101            0
  AE55    63    113       136          167          157            0
  AE56   114     80        76          127           67          155            0
  AE57   104     68        76          115           73          147           63           0
  AE58    74     66        63          111           59          105           37           49            0
  AE59   233    173        86          177           82          317          165          152          122            0
  AE80    74     51        47           95           75          112           65           67           50          168           0
  AE81    75     51        50          102           69          119           64           59           47          179           34            0
  AE82    67     40        48           96           54          104           74           66           57          179           34           41            0
  AE83    94     63        58           94           58          121           78           75           68          129           50           63           47            0
  AE84    90     54        57           87           44          120           65           53           49          124           52           58           43           49            0
  AE85    84    238       265          259          251          147          352          303          268          453          217          213          179          227          206            0
  AE86   223    145       111          192          141          306          128          145          110           92          139          130          152          150          145           325           0
  AE87   179    126       126          190          170          252           84          108           96          203          117          103          144          154          147           339          81          0
  AE88    45     63        77          132          105           82           87           83           64          217           55           52           52           91           82           140          154        121      0
  AE89   133     70        55          120           52          185           67          77            65          109           65           63           60           71           65           233          105        125     92     0


                                       TABLE V: Distances between genuine signatures from both time periods


with a standard deviation of 24.33%.                                                                                        117 and a standard deviation of 73], but between a genuine and
                                                                                                                            a forged signature it varies from 158 to 977 with an average
                 IV. DIFFERENT TIME PERIOD                                                                                  of 393 and a standard deviation of 211 [from 165 to 770 with
   Since a signature can change over time, we decided to                                                                    an average value of 382 and a standard deviation of 142]. The
examine how this affects the DTW distances of the accelera-                                                                 distance matrices for other persons are similar to those given
tion signals of signatures. We recorded genuine and forged                                                                  above.
signatures from 20 authors in two time periods this year:                                                                      In most cases there were no significant differences between
between January and April and between May and June.                                                                         distance matrices calculated for different time periods (and
   Tables III and IV are two DTW distance matrices calculated                                                               distance matrices calculated for different time periods (and
for the same subject in the two time periods.                                                                               between genuine signatures taken from the same author for
   The intersection of the first 10 columns and 10 rows shows                                                               the different time periods. AE50-59 are from the first period,
the distance values between the genuine signatures (obtained                                                                while AE80-89 are from the second. The average distance is
from the same person). The intersection of the first 10 rows and                                                            114, the minimum is 34, the maximum is 453 and the standard
the last 5 columns tells us the distances between genuine and                                                               deviation of the distances is 70.3.
the corresponding forged signatures. The rest (the intersection                                                                Figures 4a and 4b show the false reject and false accept rates
of the last 5 rows and last 5 columns) shows the distances                                                                  as a function of the constant multiplier m of the minimum
between the corresponding forged signatures.                                                                                distance obtained from the training dataset.
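
The block structure described above suggests how the summary statistics of a distance matrix can be computed. The following Python sketch is illustrative only: block_stats is a hypothetical helper, the matrix is assumed to be stored as a lower triangle (as in Tables III and IV), and the population standard deviation (pstdev) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev

def block_stats(lower, n_genuine):
    """Summarise a lower-triangular distance matrix block by block.

    lower[i][j] (j <= i) is the distance between signatures i and j;
    the first n_genuine signatures are genuine, the rest are forgeries.
    """
    blocks = {"genuine-genuine": [], "genuine-forged": [], "forged-forged": []}
    for i, row in enumerate(lower):
        for j in range(i):  # skip the zero diagonal
            if i < n_genuine:
                key = "genuine-genuine"
            elif j < n_genuine:
                key = "genuine-forged"
            else:
                key = "forged-forged"
            blocks[key].append(row[j])
    return {
        key: {"min": min(vals), "max": max(vals),
              "mean": mean(vals), "std": pstdev(vals)}
        for key, vals in blocks.items() if vals
    }

# Toy matrix (not from the tables): 2 genuine signatures and 2 forgeries.
toy = [[0], [2, 0], [10, 12, 0], [11, 13, 4, 0]]
toy_stats = block_stats(toy, n_genuine=2)
```

Applied to one of the 15 × 15 matrices with n_genuine=10, the three blocks correspond to the genuine-genuine, genuine-forged and forged-forged regions described above.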
   In Table III [Table IV] the distance between the genuine                                                                    We can see that in both time intervals we get a zero false
signatures varies from 60 to 317 with an average of 108 and a                                                               accept rate when m = 7. The curves decrease quite quickly,
standard deviation of 53 [from 34 to 334 with an average value of                                                           while the increase of the false reject rate is less marked. The




main difference between the false reject rate curves of the two                                          REFERENCES
time intervals is that the curve for the first interval increases          [1] H. Bunke, J. Csirik, Z. Gingl, and E. Griechisch, “Online signature
faster than that for the second. The reason is probably that in the            verification method based on the acceleration signals of handwriting
second time interval the acceleration signals were quite similar               samples.” submitted, 2011.
                                                                           [2] H. Lei and V. Govindaraju, “A comparative study on the consistency of
(see Tables III and IV).                                                       features in on-line signature verification,” Pattern Recognition Letters,
                                                                               vol. 26, pp. 2483–2489, 2005.
                                                                           [3] J. Richiardi, H. Ketabdar, and A. Drygajlo, “Local and global feature
                                                                               selection for on-line signature verification,” in Proc. IAPR 8th In-
                                                                               ternational Conference on Document Analysis and Recognition (ICDAR
                                                                               2005), pp. 625–629, 2005.
                                                                           [4] L. Nanni, E. Maiorana, A. Lumini, and P. Campisi, “Combining local,
                                                                               regional and global matchers for a template protected on-line signature
                                                                               verification system,” Exp. Syst. Appl., vol. 37, pp. 3676–3684, May 2010.
                                                                           [5] D.-Y. Yeung, H. Chang, Y. Xiong, S. George, R. Kashi, T. Matsumoto,
                                                                               and G. Rigoll, “SVC2004: First international signature verification com-
                                                                               petition,” in Proceedings of the International Conference on Biometric
                                                                               Authentication (ICBA), Hong Kong, pp. 16–22, Springer, 2004.
                                                                           [6] R. Plamondon and G. Lorette, “Automatic signature verification and
                                                                               writer identification - the state of the art,” Pattern Rec., vol. 22, no. 2,
                                                                               pp. 107–131, 1989.
                        (a) 1st time period                                [7] F. Leclerc and R. Plamondon, Progress in automatic signature verifica-
                                                                               tion, vol. 13, ch. Automatic Signature Verification – The State Of The
                                                                               Art 1989–1993, pp. 643–660. World Scientific, 1994.
                                                                           [8] G. K. Gupta, “Abstract the state of the art in on-line handwritten
                                                                               signature verification,” 2006.
                                                                           [9] G. Dimauro, S. Impedovo, M. Lucchese, R. Modugno, and G. Pirlo,
                                                                               “Recent advancements in automatic signature verification,” in Frontiers
                                                                               in Handwriting Recognition, 2004. IWFHR-9 2004. Ninth International
                                                                               Workshop on, pp. 179–184, oct. 2004.
                                                                          [10] S. Sayeed, A. Samraj, R. Besar, and J. Hossen, “Online Hand Signature
                                                                               Verification: A Review,” Journal of Applied Sciences, vol. 10, pp. 1632–
                                                                               1643, Dec. 2010.
                                                                          [11] S. Daramola and T. Ibiyemi, “An efficient on-line signature verification
                                                                               system,” International Journal of Engineering and Technology IJET-
                                                                               IJENS, vol. 10, no. 4, 2010.
                                                                          [12] A. Kholmatov and B. Yanikoglu, “Identity authentication using an
                        (b) 2nd time period                                    improved online signature verification method,” Pattern Recognition
                                                                               Letters, vol. 26, pp. 2400–2408, 2005.
      Fig. 4: False acceptance and false rejection rates                  [13] P. Fang, Z. Wu, F. Shen, Y. Ge, and B. Fang, “Improved dtw algorithm
                                                                               for online signature verification based on writing forces,” in Advances
                                                                               in Intelligent Computing (D.-S. Huang, X.-P. Zhang, and G.-B. Huang,
                       V. C ONCLUSIONS                                         eds.), vol. 3644 of Lecture Notes in Computer Science, pp. 631–640,
                                                                               Springer Berlin / Heidelberg, 2005.
   In this paper an online signature verification method was              [14] M. Mailah and B. H. Lim, “Biometric signature verification using pen
proposed for verifying human signatures. The new procedure                     position, time, velocity and pressure parameters.,” Jurnal Teknologi A,
                                                                               vol. 48A, pp. 35–54, 2008.
was implemented and then tested. First, a test dataset was                [15] R. Baron and R. Plamondon, “Acceleration measurement with an in-
created using a special device fitted with an accelerometer.                   strumented pen for signature verification and handwriting analysis,” In-
The dataset contained 600 + 300 = 900 signatures, where 600                    strumentation and Measurement, IEEE Transactions, vol. 38, pp. 1132–
                                                                               1138, Dec. 1989.
signatures were genuine and 300 were forged. By applying                  [16] J. S. Lew, “Optimal accelerometer layouts for data recovery in signature
a time series approach and various metrics we were able to                     verification,” IBM J. Res. Dev., vol. 24, pp. 496–511, July 1980.
place signature samples into two classes, namely those that               [17] M. Bashir and J. Kempf, “Reduced dynamic time warping for hand-
                                                                               writing recognition based on multi-dimensional time series of a novel
are probably genuine and those that are probably forged.                       pen device,” World Academy of Science, Engineering and Technology
   Based on our earlier experiments, we examined how the                       45, pp. 382–388, 2008.
training set selection varies over a period of weeks (in most             [18] O. Rohlik, Pavel Mautner, V. Matousek, and J. Kempf, “A new approach
                                                                               to signature verification: digital data acquisition pen,” Neural Network
cases it was a few months) and how time influences the false                   World, vol. 11, no. 5, pp. 493–501, 2001.
acceptance and false rejection rates. We found that a person’s            [19] P. Mautner, O. Rohlik, V. Matousek, and J. Kempf, “Signature verifi-
signature does not vary much over a period of weeks or                         cation using art-2 neural network,” in Neural Information Processing,
                                                                               2002. ICONIP ’02. Proceedings of the 9th International Conference,
months, but it could vary more over longer periods.                            vol. 2, pp. 636–639, nov. 2002.
                                                                          [20] J. Ortega-Garcia, J. Fierrez-Aguilar, D. Simon, J. Gonzalez, M. Faundez-
Acknowledgments: This work has been supported by the                           Zanuy, V. Espinosa, A. Satue, I. Hernaez, J. J. Igarza, C. Vivaracho,
Project ”TÁMOP-4.2.1/B-09/1/KONV-2010-0005 - Creating                         D. Escudero, and Q. I. Moro, “MCYT baseline corpus: a bimodal bio-
the Center of Excellence at the University of Szeged”, sup-                    metric database,” Vision, Image and Signal Processing, IEE Proceedings,
                                                                               vol. 150, no. 6, pp. 395–401, 2003.
ported by the European Union, co-financed by the Euro-                    [21] K. Kopasz, P. Makra, Z. Gingl, and Edaq530, “A transparent, open-end
pean Regional Development Fund and by the ”TÁMOP-                             and open-source measurement solution in natural science education,”
4.2.2/08/1/2008-0008” program of the Hungarian National                        Eur. J. Phys. 32, pp. 491–504, March 2011.
                                                                          [22] “http://www.noise.physx.u-szeged.hu/edudev/edaq530.”
Development Agency.
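
Illustrative sketch: the false acceptance and false rejection rates discussed in the conclusions can be computed as below. This is only a minimal example under stated assumptions, not the procedure used in the paper. It assumes that each signature has already been reduced to a single distance score against the training signatures and that a sample is accepted as genuine when this distance does not exceed a threshold. The function name `error_rates`, the threshold value and the sample distances are hypothetical.

```python
from typing import Sequence, Tuple


def error_rates(
    genuine_scores: Sequence[float],
    forged_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (false_accept_rate, false_reject_rate) for a distance threshold.

    Assumption: a sample is accepted as genuine when its distance to the
    training signatures is at most `threshold`.
    """
    if not genuine_scores or not forged_scores:
        raise ValueError("both score lists must be non-empty")
    # A forged sample that falls within the threshold is a false accept.
    false_accepts = sum(1 for s in forged_scores if s <= threshold)
    # A genuine sample that exceeds the threshold is a false reject.
    false_rejects = sum(1 for s in genuine_scores if s > threshold)
    return (false_accepts / len(forged_scores),
            false_rejects / len(genuine_scores))


# Made-up distances for illustration only (not data from the paper).
genuine = [0.8, 1.1, 0.9, 2.4]
forged = [2.9, 3.5, 1.0, 4.2]
far, frr = error_rates(genuine, forged, threshold=2.0)
print(f"FAR = {far:.2f}, FRR = {frr:.2f}")
```

Evaluating the same function on score lists derived from different training sets or recording periods would give one way to compare how the two error rates change between them.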

