=Paper= {{Paper |id=None |storemode=property |title=Fusing Modalities in Forensic Identification with Score Discretization |pdfUrl=https://ceur-ws.org/Vol-1022/Paper09.pdf |volume=Vol-1022 |dblpUrl=https://dblp.org/rec/conf/icdar/LengSS13 }} ==Fusing Modalities in Forensic Identification with Score Discretization== https://ceur-ws.org/Vol-1022/Paper09.pdf
Fusing Modalities in Forensic Identification with Score Discretization

Y.L. Wong, S. M. Shamsuddin, S. S. Yuhaniz
Soft Computing Research Group
Universiti Teknologi Malaysia
81310 Johor, Malaysia
yeeleng28@gmail.com, mariyam@utm.my, sophia@utm.my

Sargur N. Srihari
Department of Computer Science and Engineering
University at Buffalo, The State University of New York
Buffalo, NY 14260 USA
srihari@cedar.buffalo.edu



Abstract—The fusion of different forensic modalities for arriving at a decision on whether the evidence can be attributed to a known individual is considered. Since close similarity and high dimensionality can adversely affect the process, a method of score fusion based on discretization is proposed. It is evaluated on signatures and fingerprints. Discretization is performed as a filter to find the unique and discriminatory features of each modality in an individual class before their use in matching. Since fingerprints and signatures are not compatible for direct integration, the idea is to convert the features into the same domain. The features are assigned an appropriate matched score, $MS_{bp}$, based on their lowest distance. The final scores are then fed to the fusion, $FS_{bp}$. The top matches with $FS_{bp}$ less than a predefined threshold value, $\eta$, are expected to have the true identity. Two standard fusion approaches, namely Mean and Min fusion, are used to benchmark the efficiency of the proposed method. The results of these experiments show that the proposed approach produces a significant improvement in the forensic identification rate of fingerprint and signature fusion, and these findings support its usefulness.

Keywords—forensic; multimodal; discretization; matching scores; fusion; identification

I. INTRODUCTION

The goal of forensic analysis is to determine whether observed evidence can be attributed to an individual. The final decision of forensic analysis can take one of three values: identification, no conclusion, or exclusion. Biometric systems have a similar goal of going from input to conclusion but with different terminology: biometric identification means determining the best match in a closed set of individuals, and verification means deciding whether the input and the known sample have the same source. While biometric systems attempt to do the entire process automatically, forensic systems narrow down the possibilities among a set of individuals, with the final decision being made by a human examiner. Automatic tools for forensic analysis have been developed for several forensic modalities including signatures [1], fingerprints [2], handwriting [3], and footwear prints or marks [4]. In both forensic analysis and biometric analysis more than one modality of data can be used to improve accuracy [5], [6]. Examples of the need to combine evidence in forensic analysis are: signature and fingerprints on the same questioned document, pollen found on the clothing of an assailant together with human DNA [7], and multiple shoe-prints in a crime scene [8]. In this paper we explore how evidence of different modalities can be combined for the forensic decision.

Unlike conventional identification systems such as token based and password based systems, unimodal biometric identification recognizes a user by "who the person is", using a one-to-many matching process (1:M), rather than by "what the person carries along". Conventional systems suffer from numerous drawbacks such as forgotten passwords, misplaced ID cards, and forgery. To address these problems, unimodal identification was developed and has seen extensive enhancements in reliability and accuracy. However, several studies have shown that poor quality of image samples or the methodology itself can lead to a significant decrease in the performance of a unimodal identification system [9], [10], [11]. The common issues include intra-class variability, spoof attacks, non-universality, and noisy data. To overcome these difficulties, multimodal identification systems (MIS) have been developed. As the name suggests, in an MIS the identification process is based on evidence presented by multiple modality sources from an individual. Such systems are more robust to variations in the sample quality than unimodal systems due to the presence of multiple (and usually independent) pieces of evidence [12].

A key to successful multimodal system development for forensic identification is an effective methodology organization and fusion process, capable of integrating and handling important information such as the distinctive characteristics of an individual. The emphasis on an individual's distinctive characteristics is particular to forensics. Therefore, in this paper, the multi-matched scores based discretization method is proposed for forensic identification of an individual from different modalities. Compared to previous methods, the proposed method is unique in the sense that the extracted features correspond to the individuality of a particular person and are discretized and represented in standard sizes. The method is robust and able to overcome dimensionality issues without requiring image normalization. The low-dimensional, standardized features make the design of the post-processing phase (classifier or decision) straightforward. Moreover, the discretized features have clear physical meanings, are distinctive, and can be used in more complex systems (e.g., expert systems for interpretation and inference).
II. RELATED WORK

In identification systems, fusion takes into account a set of features that can reflect the individuality and characteristics of the person under consideration. However, it is difficult to extract and select features that are discriminatory, meaningful and important for identification. Different sets of features may perform better for different groups of individuals, and therefore a technique is needed to represent the feature set of each sample. In this paper, multi-matched scores fusion based discretization is proposed for forensic identification to represent the distinctiveness in multi-modalities of an individual.

A. Representation of individuality features

Extracting and representing relevant features that contain the natural characteristics of an individual is essential for good performance of identification algorithms. Existing multimodal identification systems assume that each modality feature set from an individual is local, wide-ranging, and static. Thus, these extracted feature sets are commonly fed directly to individual matching and/or classification algorithms.

As a result, the identification system becomes more complex, time consuming, and costly because a classifier is needed for each modality. Furthermore, concatenating features from different modalities after feature extraction leads to the need to compare high-dimensional, heterogeneous data, which is a nontrivial issue. Much work has been proposed to overcome the dimensionality issues in extracted features, such as applying normalization techniques after extraction. Careful observation and experimental analysis need to be performed in order to improve identification performance. Too much normalization will diminish the original characteristics of an individual from different modality images. Thus, another process is needed to produce a more discriminative, reliable, unique and informative representation that converts these inherently multiple continuous features into standardized discrete features (per individual). This leads to the multi-matched score fusion discretization approach introduced in this paper, which is explored in the context of forensic identification of different modalities for distinguishing the true identity of a person.

B. The discretization algorithm

Discretization is a process whereby a continuous valued variable is represented by a collection of discrete values. It has attracted a lot of interest and work in several different domains [13], [14], [15]. The discretization method introduced here is based on the discretization defined in [16].

Given a set of features, the discretization algorithm first computes the size of the interval, i.e., it determines its upper and lower bounds. The range is then divided by the number of features, which gives each interval its upper and lower approximation. The number of intervals generated is equal to the dimensionality of the feature vectors, maintaining the original number of extracted features from different extraction methods in this study. Subsequently, a single representation value for each interval, or cut, is computed by taking the midpoint of the lower approximation, $Approx_{lower}$, and the upper approximation, $Approx_{upper}$, of the interval. Algorithm 1 shows the discretization steps discussed above.

Algorithm 1: Discretization Algorithm
Require: Dataset with f continuous features, D samples and C classes
Ensure: Discretized features, D′
  for each individual do
    Find the Max and the Min values of the D samples
    numb_bin = numb_extracted_feature
    Divide the range from Min to Max into numb_bin intervals

    Compute representation values, RepValue:
    for each bin do
      Find the Approx_lower and Approx_upper
      Compute the midpoint of Approx_lower and Approx_upper
    end for

    Form a set of all discrete values, Dis_Feature:
    for 1 to numb_extracted_feature do
      for each bin do
        if (feature in range of interval) then
          Dis_Feature = RepValue
        end if
      end for
    end for
  end for

C. Processing and extraction of Signature and Fingerprint

For signature, the input image is first binarized by adaptive thresholding, followed by morphology operations (i.e., remove and skel) to obtain a clean universe of discourse (UOD) signature image, as illustrated in Fig. 1. Features are extracted from the UOD of the signature using a geometry based extraction approach [17], which is based on a 3x3 window concept. The process is done on individual windows instead of the whole image to give more information about the signature image, including the positions of different line structures.

Fig. 1. Examples of preprocessed signature image: (a) Original image, (b) Binarized image, (c) Skeletonized image, (d) UOD.

For fingerprint, two types of minutia points, namely termination and bifurcation points, are extracted using a minutia based extraction approach. Fig. 2 shows the block diagram of the minutia based extraction process. Fingerprint images are binarized and thinned, and false minutiae are removed, to extract the regions of interest (ROIs). Finally, the extracted ROI for the fingerprint and the UOD for the signature are fed to the discretization.
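The discretization steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names `build_bins` and `discretize` are hypothetical, bins are assumed to have equal width, and a value lying exactly on an inner bin boundary is assumed to fall into the upper bin (the text does not specify boundary handling).

```python
from typing import List, Sequence, Tuple


def build_bins(samples: Sequence[Sequence[float]]) -> List[Tuple[float, float, float]]:
    """Return (lower, upper, rep_value) for each bin of one individual.

    Assumption: the number of bins equals the number of features per sample,
    and the range Min..Max over all samples is split into equal-width bins.
    """
    values = [v for row in samples for v in row]
    lo, hi = min(values), max(values)
    n_bins = len(samples[0])
    width = (hi - lo) / n_bins
    bins = []
    for b in range(n_bins):
        lower = lo + b * width
        upper = lo + (b + 1) * width
        # RepValue is the midpoint of the lower and upper approximation.
        bins.append((lower, upper, (lower + upper) / 2.0))
    return bins


def discretize(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Replace every feature value by the RepValue of the bin it falls into."""
    bins = build_bins(samples)
    lo = bins[0][0]
    width = (bins[-1][1] - lo) / len(bins)
    discretized = []
    for row in samples:
        new_row = []
        for v in row:
            if width == 0:
                idx = 0  # all values identical: a single degenerate bin
            else:
                # Clamp so that the Max value lands in the last bin.
                idx = min(int((v - lo) / width), len(bins) - 1)
            new_row.append(bins[idx][2])
        discretized.append(new_row)
    return discretized
```

With nine features per sample and values ranging from 10.8096 to 98.8273, the sketch yields nine bins of width of about 9.78, which matches the layout of the bin table given for the signature data.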
Fig. 2. Examples of preprocessed fingerprint image: (a) Original image, (b) Binarized image, (c) Thinned image, (d) Minutia points, (e) False minutia removed, (f) ROI.

Unimodal extraction and the discretization step are illustrated in Table I for the signature data of individual 1, and in Table II for the fingerprint data of the same individual. In each of these tables, the feature values are divided into a predefined number of bins, which is based on the number of features for each modality image. In the top portion of these tables, for each bin, the lower and upper values are recorded in columns two and three respectively, and the bin RepValue, the average of the lower and upper values, is recorded in column four. The Min and Max values are listed above the bins. In the bottom portion of each table, the discretized features for signature and fingerprint are displayed. These tables show an example of how the actual feature sets from an individual are discretized. As can be seen from Table I, the feature value 35.259 occurs in each of the four samples of the nine-feature signature data of the same individual. This means that the first individual is uniquely recognized by this discriminatory value. A similar discussion holds for Table II, where the discriminatory value for the fingerprint data of the first individual, obtained from four different images, is 104.

The selected features are the representation values (discriminatory features, DF, of an individual) that describe the unique characteristics of an individual and are used in the matching process. In the matching module, the distance between the discretized values and the stored feature values is computed by the Euclidean distance defined in (1).

$$ED_{bp} = \sqrt{\sum_{i=1}^{N} \left( Df_{bp,i} - Df^{(r)}_{bp,i} \right)^{2}} \tag{1}$$

where $Df_{bp,i}$ represents the ith discretized feature of the new modality image, $Df^{(r)}_{bp,i}$ is the ith discretized feature of the reference modality image in the stored template, and bp represents either the behavioral or the physiological trait of the individual. The total number of features extracted from a single modality image is denoted by N. Let $X_{sign} = ED_{sign}(x)$, where $X_{sign} = (x_1, \ldots, x_d)$ denotes the distances for the discretized signature features, and $Y_{finger} = ED_{finger}(y)$, where $Y_{finger} = (y_1, \ldots, y_d)$ denotes the distances for the discretized fingerprint features. The lowest distance for signature can be denoted as $\min[ED_{sign}(x)]$ and the lowest distance for fingerprint as $\min[ED_{finger}(y)]$. Then, we define the modality features with the lowest distance as match score 1 ($MS_{bp} = 1$), the modality features with the second lowest distance as $MS_{bp} = 2$, and so on. Here bp denotes either the behavioral (i.e., signature) or the physiological (i.e., fingerprint) trait of the individual. The match score $MS_{bp}$ is then fed to the fusion approach.

D. Multi-modality fusion

After matching, the matched scores of signature and fingerprint are fed to the fusion method. Let $X_{sign} = \{MS_{sign}(1), MS_{sign}(2), \ldots, MS_{sign}(n)\}$ denote the computed signature match scores and $Y_{finger} = \{MS_{finger}(1), MS_{finger}(2), \ldots, MS_{finger}(n)\}$ the computed match scores for fingerprint. In this work, the final fused score $FS_{bp}$ of the individual is computed using Equation (2), where k represents the number of different modalities of an individual. The MS for fingerprint and signature are combined and divided by k to generate a single score, which is then compared to a predefined threshold to make the final decision.

$$FS_{bp} = \frac{MS_{sign} + MS_{finger}}{k} \tag{2}$$

Two fusion approaches, namely Mean fusion, $MeanFS_{bp}$, and Min fusion, $MinFS_{bp}$, as defined in (3) and (4), are chosen for comparison to show the efficiency of the proposed method on multi-modality identification.

$$MeanFS_{bp} = \left( x\,MS_{sign} + y\,MS_{finger} \right) / 2 \tag{3}$$

$$MinFS_{bp} = \min\left( MS_{sign}, MS_{finger} \right) \tag{4}$$

Finally, $FS_{bp}$ is forwarded to the next phase for identification. In the identification process of one-to-many matching (1:M), $FS_{bp}$ is compared with the predefined identification threshold, $\eta$, in order to identify the individual from M individuals. In this work, the identity of a person is identified if

$$FS_{bp} \leq \eta \tag{5}$$

III. EXPERIMENTAL RESULTS

The performance of this work is evaluated using the ROC curve, which plots the Genuine Acceptance Rate (GAR) of a system against the False Acceptance Rate (FAR). In this work, GAR is equal to 1-FRR, where FRR is the False Rejection Rate. Fig. 3 shows the performance of unimodal identification for signature and fingerprint. Discretization is applied in this experiment. No normalization and fusion methods are implemented. The performance of identification for the discretized signature and fingerprint datasets is compared with that for the non-discretized datasets.
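The matching and fusion rules of Eqs. (1), (2), (4) and (5) can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function names, the dictionary-of-templates layout and the tie-breaking by enrollment order are assumptions, and the distance is taken to be the standard Euclidean distance. Eq. (3) is left out because the factors x and y are not defined in the text.

```python
import math
from typing import Dict, List, Sequence


def euclidean_distance(query: Sequence[float], reference: Sequence[float]) -> float:
    """Eq. (1): Euclidean distance between query and stored discretized features."""
    return math.sqrt(sum((q - r) ** 2 for q, r in zip(query, reference)))


def match_scores(query: Sequence[float],
                 templates: Dict[str, Sequence[float]]) -> Dict[str, int]:
    """Rank enrolled identities: the lowest distance gets match score 1.

    Assumption: ties keep the enrollment order (Python's sort is stable).
    """
    distances = {ident: euclidean_distance(query, ref)
                 for ident, ref in templates.items()}
    ranked = sorted(distances, key=distances.get)
    return {ident: rank for rank, ident in enumerate(ranked, start=1)}


def fuse(ms_sign: int, ms_finger: int, k: int = 2) -> float:
    """Eq. (2): sum of the match scores divided by the number of modalities k."""
    return (ms_sign + ms_finger) / k


def min_fusion(ms_sign: int, ms_finger: int) -> int:
    """Eq. (4): Min fusion baseline."""
    return min(ms_sign, ms_finger)


def identify(sign_query: Sequence[float], finger_query: Sequence[float],
             sign_templates: Dict[str, Sequence[float]],
             finger_templates: Dict[str, Sequence[float]],
             eta: float) -> List[str]:
    """Eq. (5): return identities whose fused score is at most eta, best first."""
    ms_s = match_scores(sign_query, sign_templates)
    ms_f = match_scores(finger_query, finger_templates)
    fused = {ident: fuse(ms_s[ident], ms_f[ident]) for ident in ms_s}
    return sorted((i for i, s in fused.items() if s <= eta), key=fused.get)
```

Because the match scores are ranks, a fused score of 1 means that the identity is the closest template in both modalities, so a small threshold such as eta = 1 accepts only identities on which both modalities agree.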
TABLE I
EXAMPLE OF DISCRETIZATION PROCESS FOR SIGNATURE FEATURES OF FIRST INDIVIDUAL

LOW and UPPER BIN for Individual: 1
MIN Value 10.8096   MAX Value 98.8273

Bin    Lower      Upper      RepValue
0      10.8096    20.5893    15.69945
1      20.5893    30.3691    25.4792
2      30.3691    40.1488    35.259
3      40.1488    49.9286    45.0387
4      49.9286    59.7083    54.8184
5      59.7083    69.4881    64.5982
6      69.4881    79.2678    74.3779
7      79.2678    89.0476    84.1577
8      89.0476    98.8273    93.9374

DISCRETIZED DATA (discriminatory value for the 1st individual is 35.259)
f1         f2         f3        f4        f5         f6         f7         f8         f9         Class
15.69945   15.69945   35.259    35.259    15.69945   15.69945   15.69945   25.4792    25.4792    1s
54.8184    64.5982    93.9374   35.259    15.69945   35.259     54.8184    45.0387    25.4792    1s
25.4792    35.259     35.259    25.4792   25.4792    54.8184    35.259     45.0387    45.0387    1s
64.5982    35.259     25.4792   25.4792   35.259     74.3779    45.0387    15.69945   15.69945   1s



TABLE II
EXAMPLE OF DISCRETIZATION PROCESS FOR FINGERPRINT FEATURES OF FIRST INDIVIDUAL

LOW and UPPER BIN for Individual: 1
MIN Value 55   MAX Value 195

Bin    Lower    Upper    RepValue
0      55       69       62
1      69       83       76
2      83       97       90
3      97       111      104
4      111      125      118
5      125      139      132
6      139      153      146
7      153      167      160
8      167      181      174
9      181      195      188

DISCRETIZED DATA (discriminatory value for the 1st individual is 104)
f1     f2     f3     f4     f5     f6     f7     f8     f9     f10    Class
104    90     104    132    104    118    104    104    62     146    1f
90     132    104    132    160    90     146    146    160    188    1f
76     90     104    132    160    62     104    104    160    181    1f
90     104    118    132    146    132    76     62     160    160    1f
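The text does not state the rule by which the discriminatory value is picked from the discretized data. One plausible reading, assumed here purely for illustration, is that it is the representation value occurring most often across all samples of the individual. The function name `discriminatory_value` is hypothetical, and the rows are copied from the discretized signature data of Table I.

```python
from collections import Counter
from typing import List, Sequence


def discriminatory_value(discretized_rows: Sequence[Sequence[float]]) -> float:
    """Return the most frequent representation value over all samples.

    Assumption: "discriminatory value" is read as the mode of the
    discretized features; the source does not define the rule explicitly.
    """
    counts = Counter(v for row in discretized_rows for v in row)
    value, _ = counts.most_common(1)[0]
    return value


# Discretized signature features of the first individual (Table I).
signature_rows: List[List[float]] = [
    [15.69945, 15.69945, 35.259, 35.259, 15.69945, 15.69945, 15.69945, 25.4792, 25.4792],
    [54.8184, 64.5982, 93.9374, 35.259, 15.69945, 35.259, 54.8184, 45.0387, 25.4792],
    [25.4792, 35.259, 35.259, 25.4792, 25.4792, 54.8184, 35.259, 45.0387, 45.0387],
    [64.5982, 35.259, 25.4792, 25.4792, 35.259, 74.3779, 45.0387, 15.69945, 15.69945],
]
```

Under this reading, `discriminatory_value(signature_rows)` returns 35.259, the value reported for the first individual, and the same rule applied to the discretized fingerprint rows of Table II returns 104.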




The ROC graph clearly shows that the use of discretization on the unimodal dataset significantly enhances the overall identification performance over identification without discretization. Because of the efficiency of the discretization method on unimodal identification, the same technique is applied to multimodal identification in order to improve the accuracy of identification on multiple modalities.

Fig. 3. Performance of uni-modality identification.

Fig. 4 and Fig. 5 show the ROC performance for two different fusion methods, namely the Mean fusion rule and the Min fusion rule, with the implementation of Z-score normalization and of the matched scores fusion based discretization approach on multiple modalities. From the ROC graph depicted in Fig. 4, it can be seen that the proposed discretization-based method on the multi-modality fusion of signature and fingerprint shows better performance than the standard signature and fingerprint identification system. At FAR of 0.1%, 1.0%, and 10.0%, the proposed method has a GAR of 96.9%, 98.9%, and 99.9% respectively, which is better than Z-score normalization with Mean fusion on signature and fingerprint modalities: 93.5%, 93.7%, and 96.4%. Fig. 5 shows the GAR performance of Min fusion based on Z-score normalization and of the proposed multi-matched score based discretization. Again, in Fig. 5, the proposed method based on discretization on signature and fingerprint modalities yields the best performance over the range of FAR. At 0.1%, 1.0%, and 10.0% of FAR, the Min fusion method works best with the proposed method: 95.0%, 97.99%, and 99.40% respectively. Therefore, it can be summarized that the use of discretization and the proposed fusion of fingerprint and signature modalities generally performs better than the use of normalization and conventional fusion approaches for personal identification.

Fig. 4. Performance of Multi-modality fusion methods for signature and fingerprint.

Fig. 5. Performance of Multi-modality fusion methods for signature and fingerprint.

IV. CONCLUSION

A key to successful multimodal system development for forensic identification is an effective methodology organization and fusion process, capable of integrating and handling important information such as the distinctive characteristics of an individual. In this paper, match score discretization is proposed and implemented on different modality datasets of an individual. The experiments are done on signature and fingerprint datasets collected from 156 students (both female and male), where each student contributes 4 samples of signatures and fingerprints. Ten features describing the bifurcation and termination points of the fingerprint were extracted using the minutia based extraction approach, whereas signature features were extracted using the geometry based extraction approach. In the matching process, each template-query pair of feature sets is compared using Euclidean distance. Two fusion approaches, namely Mean and Min fusion, are performed to assess the efficiency of the proposed method in multimodal identification. The experimental results show that the proposed multi-matched scores discretization performs well on multiple sets of individual traits, consequently improving the identification performance.

ACKNOWLEDGMENT

This work is supported by The Ministry of Higher Education (MOHE) under Research University Grant (GUP) and Mybrain15. The authors would especially like to thank Universiti Teknologi Malaysia, Skudai, Johor Bahru, Malaysia for the support and the Soft Computing Research Group (SCRG) for their excellent cooperation and contributions to improve this paper.

REFERENCES

[1] S. N. Srihari, “Computational methods for handwritten questioned document examination,” National Criminal Justice Research Report, 2010.
[2] C. Su and S. Srihari, “Evaluation of rarity of fingerprints in forensics,” Advances in Neural Information Processing Systems, vol. 23, pp. 1207–1215, 2010.
[3] S. N. Srihari and K. Singer, “Role of automation in the examination of handwritten items,” in Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on. IEEE, 2012, pp. 619–624.
[4] Y. Tang, H. Kasiviswanathan, and S. N. Srihari, “An efficient clustering-based retrieval framework for real crime scene footwear marks,” International Journal of Granular Computing, Rough Sets and Intelligent Systems, vol. 2, no. 4, pp. 327–360, 2012.
[5] J. Gonzalez-Rodriguez, J. Ortega-Garcia, and J.-L. Sanchez-Bote, “Forensic identification reporting using automatic biometric systems,” in Biometric Solutions. Springer, 2002, pp. 169–185.
[6] J. Gonzalez-Rodriguez, J. Fierrez-Aguilar, D. Ramos-Castro, and J. Ortega-Garcia, “Bayesian analysis of fingerprint, face and signature evidences with automatic biometric systems,” Forensic Science International, vol. 155, no. 2-3, pp. 126–140, 2005.
[7] K. J. Craft, J. D. Owens, and M. V. Ashley, “Application of plant DNA markers in forensic botany: Genetic comparison of Quercus evidence leaves to crime scene trees using microsatellites,” Forensic Science International, vol. 165, no. 1, pp. 64–70, 2007.
[8] Y. Tang, S. N. Srihari, H. Kasiviswanathan, and J. J. Corso, “Footwear print retrieval system for real crime scene marks,” in Computational Forensics. Springer, 2011, pp. 88–100.
[9] A. Jain and A. Ross, “Introduction to biometrics,” Handbook of Biometrics, pp. 1–22, 2008.
[10] A. Ross, K. Nandakumar, and A. Jain, “Introduction to multibiometrics,” Handbook of Biometrics, pp. 271–292, 2008.
[11] N. Solayappan and S. Latifi, “A survey of unimodal biometric methods,” in Proceedings of the 2006 International Conference on Security and Management, 2006, pp. 57–63.
[12] A. Ross and A. Jain, “Information fusion in biometrics,” Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–2125, 2003.
[13] H. Liu, F. Hussain, C. L. Tan, and M. Dash, “Discretization: An enabling technique,” Data Mining and Knowledge Discovery, vol. 6, no. 4, pp. 393–423, 2002.
[14] R. Ahmad, M. Darus, S. M. H. Shamsuddin, and A. A. Bakar, “Pendiskretan set kasar menggunakan taakulan boolean terhadap pencaman simbol matematik,” Journal Teknologi Maklumat & Multimedia, pp. 15–26, 2004.
[15] B. O. Mohammed and S. M. Shamsuddin, “Feature discretization for individuality representation in twins handwritten identification,” Journal of Computer Science, vol. 7, no. 7, pp. 1080–1087, 2011.
[16] A. Muda, S. Shamsuddin, and M. Darus, “Invariants discretization for individuality representation in handwritten authorship,” Computational Forensics, pp. 218–228, 2008.
[17] K. Huang and H. Yan, “Off-line signature verification based on geometric feature extraction and neural network classification,” Pattern Recognition, vol. 30, no. 1, pp. 9–17, 1997.