=Paper=
{{Paper
|id=None
|storemode=property
|title=Fusing Modalities in Forensic Identification with Score Discretization
|pdfUrl=https://ceur-ws.org/Vol-1022/Paper09.pdf
|volume=Vol-1022
|dblpUrl=https://dblp.org/rec/conf/icdar/LengSS13
}}
==Fusing Modalities in Forensic Identification with Score Discretization==
Y. L. Wong, S. M. Shamsuddin, S. S. Yuhaniz
Soft Computing Research Group, Universiti Teknologi Malaysia, 81310 Johor, Malaysia
yeeleng28@gmail.com, mariyam@utm.my, sophia@utm.my

Sargur N. Srihari
Department of Computer Science and Engineering, University at Buffalo, The State University of New York, Buffalo, NY 14260 USA
srihari@cedar.buffalo.edu
Abstract: The fusion of different forensic modalities for arriving at a decision of whether the evidence can be attributed to a known individual is considered. Since close similarity and high dimensionality can adversely affect the process, a method of score fusion based on discretization is proposed. It is evaluated on signatures and fingerprints. Discretization is performed as a filter to find the unique and discriminatory features of each modality in an individual class before their use in matching. Since fingerprints and signatures are not compatible for direct integration, the idea is to convert the features into the same domain. The features are assigned an appropriate matched score, MS_{bp}, based on their lowest distance. The final scores are then fed to the fusion, FS_{bp}. The top matches with FS_{bp} at or below a predefined threshold value, η, are expected to have the true identity. Two standard fusion approaches, namely Mean and Min fusion, are used to benchmark the efficiency of the proposed method. The results of these experiments show that the proposed approach produces a significant improvement in the forensic identification rate of fingerprint and signature fusion, and these findings support its usefulness.

Keywords: forensic; multimodal; discretization; matching scores; fusion; identification

I. INTRODUCTION

The goal of forensic analysis is to determine whether observed evidence can be attributed to an individual. The final decision of forensic analysis can take one of three values: identification, no conclusion, or exclusion. Biometric systems have a similar goal of going from input to conclusion but use different terminology: biometric identification means determining the best match in a closed set of individuals, and verification means deciding whether the input and the known sample have the same source. While biometric systems attempt to do the entire process automatically, forensic systems narrow down the possibilities among a set of individuals, with the final decision being made by a human examiner. Automatic tools for forensic analysis have been developed for several forensic modalities including signatures [1], fingerprints [2], handwriting [3], and footwear prints or marks [4]. In both forensic analysis and biometric analysis more than one modality of data can be used to improve accuracy [5], [6]. Examples of the need to combine evidence in forensic analysis are: signature and fingerprints on the same questioned document, pollen found on the clothing of an assailant together with human DNA [7], multiple shoe-prints in a crime scene [8], etc. In this paper we explore how evidence of different modalities can be combined for the forensic decision.

Unlike conventional token-based and password-based identification systems, unimodal biometric identification recognizes a user by "who the person is", using a one-to-many matching process (1:M), rather than by "what the person carries along". Conventional systems suffer from numerous drawbacks such as forgotten passwords, misplaced ID cards, and forgery. To address these problems, unimodal identification was developed and has seen extensive enhancements in the reliability and accuracy of identification. However, several studies have shown that poor quality of image samples or the methodology itself can lead to a significant decrease in the performance of a unimodal identification system [9], [10], [11]. The common issues include intra-class variability, spoof attacks, non-universality, and noisy data. In order to overcome these difficulties in unimodal identification, multimodal identification systems (MIS) have been developed. As the name suggests, in an MIS the identification process is based on evidence presented by multiple modality sources from an individual. Such systems are more robust to variations in the sample quality than unimodal systems due to the presence of multiple (and usually independent) pieces of evidence [12].

A key to successful multimodal system development for forensic identification is an effective methodology organization and fusion process, capable of integrating and handling important information such as the distinctive characteristics of an individual. These distinctive characteristics of an individual are what forensic identification relies on. Therefore, in this paper, the multi-matched scores based discretization method is proposed for forensic identification of an individual from different modalities. Compared to previous methods, the proposed method is unique in the sense that the extracted features correspond to the individuality of a particular person and are discretized and represented in standard sizes. The method is robust and able to overcome dimensionality issues without requiring image normalization. The low dimension and standardized features make the design of the post-processing phase (classifier or decision) straightforward. Moreover, the discretized features have clear physical meanings, are distinctive, and can be used in more complex systems (e.g., expert systems for interpretation and inference).
II. RELATED WORK

In identification systems, fusion takes into account a set of features that can reflect the individuality and characteristics of the person under consideration. However, it is difficult to extract and select features that are discriminatory, meaningful and important for identification. Different sets of features may perform better for different groups of individuals and therefore a technique is needed to represent the set of features of each sample. In this paper, multi-matched scores fusion based discretization is proposed for forensic identification to represent the distinctiveness in multiple modalities of an individual.

A. Representation of individuality features

Extracting and representing relevant features which contain the natural characteristics of an individual is essential for good performance of the identification algorithms. Existing multimodal identification systems assume that each modality feature set from an individual is local, wide-ranging, and static. Thus, these extracted feature sets are commonly fed to individual matching and/or classification algorithms directly.

As a result, the identification system becomes more complex, time consuming, and costly because a classifier is needed for each modality. Furthermore, concatenating features from different modalities after feature extraction leads to the need to compare high dimensional, heterogeneous data, which is a nontrivial issue. Much work has been proposed to overcome the dimensionality issues in extracted features, such as applying normalization techniques after extraction. However, careful observation and experimental analysis need to be performed in order to improve the performance of identification: too much normalization will diminish the original characteristics of an individual from different modality images. Thus, another process is needed to produce a more discriminative, reliable, unique and informative feature representation that converts these inherently multiple continuous features into standardized discrete features (per individual). This leads to the multi-matched score fusion discretization approach introduced in this paper, which is explored in the context of forensic identification of different modalities for distinguishing the true identity of a person.

B. The discretization algorithm

Discretization is a process whereby a continuous valued variable is represented by a collection of discrete values. It has attracted a lot of interest and work in several different domains [13], [14], [15]. The discretization method introduced here is based on the discretization defined in [16].

Given a set of features, the discretization algorithm first computes the size of the interval, i.e., it determines its upper and lower bounds. The range is then divided by the number of features, which gives each interval its upper and lower approximation. The number of intervals generated is equal to the dimensionality of the feature vectors, maintaining the original number of extracted features from the different extraction methods in this study. Subsequently, a single representation value for each interval, or cut, is computed by taking the midpoint of the lower approximation, Approx_lower, and the upper approximation, Approx_upper, of the interval. Algorithm 1 shows the discretization steps discussed above.

Algorithm 1: Discretization Algorithm
Require: Dataset with f continuous features, D samples and C classes
Ensure: Discretized features, D′
for each individual do
    Find the Max and the Min values of D samples
    numb_bin = numb_extracted_feature
    Divide the range of Min to Max with numb_bin
    Compute representation values, RepValue:
    for each bin do
        Find the Approx_lower and Approx_upper
        Compute the midpoint of Approx_lower and Approx_upper
    end for
    Form a set of all discrete values, Dis_Features:
    for 1 to numb_extracted_feature do
        for each bin do
            if (feature in range of interval) then
                Dis_Feature = RepValue
            end if
        end for
    end for
end for

C. Processing and extraction of Signature and Fingerprint

For signature, the input image is first binarized by adaptive thresholding, followed by morphology operations (i.e., remove and skel) to obtain a clean universe of discourse (UOD) signature image, as illustrated in Fig. 1. Features of the signature UOD are extracted using the geometry-based extraction approach [17], which is based on a 3x3 window concept. The process is applied to each individual window instead of the whole image in order to capture more information about the signature image, including the positions of different line structures.

Fig. 1. Examples of preprocessed signature image (a) Original image (b) Binarized image (c) Skeletonized image (d) UOD.

For fingerprint, two types of minutia points, namely termination and bifurcation points, are extracted using the minutia-based extraction approach. Fig. 2 shows the block diagram of the minutia-based extraction process. The fingerprint image is binarized and thinned, and false minutiae are removed, to extract the region of interest (ROI). Finally, the extracted ROI for fingerprint and UOD for the signature are fed to the discretization.
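As a minimal sketch of Algorithm 1, assuming equal-width bins over the pooled minimum and maximum of one individual's samples, the Python function below computes the bin bounds, the representation values (midpoints), and the discretized features. The function name `discretize`, the pooling of all samples of an individual, and the rule that a value lying on a shared boundary falls into the upper bin are assumptions made for illustration; the paper does not state them.

```python
def discretize(samples, num_bins=None):
    """Equal-width discretization of one individual's feature samples.

    samples: list of feature vectors (one per image) of a single individual.
    Returns (bins, discretized), where bins is a list of
    (lower, upper, rep_value) tuples and discretized mirrors samples.
    """
    values = [v for row in samples for v in row]
    lo, hi = min(values), max(values)
    if num_bins is None:
        # Algorithm 1: number of bins = number of extracted features
        num_bins = len(samples[0])
    width = (hi - lo) / num_bins

    bins = []
    for b in range(num_bins):
        lower = lo + b * width
        upper = lo + (b + 1) * width
        bins.append((lower, upper, (lower + upper) / 2))  # midpoint = RepValue

    def rep_value(v):
        if width == 0:  # all values identical: a single degenerate bin
            return lo
        # Assumption: boundary values go to the upper bin, Max stays in the last bin
        idx = min(int((v - lo) / width), num_bins - 1)
        return bins[idx][2]

    return bins, [[rep_value(v) for v in row] for row in samples]


# Hypothetical fingerprint-like data: 2 samples, 10 features, Min 55, Max 195
bins, disc = discretize([
    [55, 90, 100, 130, 100, 120, 100, 105, 60, 150],
    [85, 130, 100, 130, 160, 90, 150, 150, 160, 195],
])
print(bins[3])   # bounds and representation value of bin 3
print(disc[0])   # first sample after discretization
```

With the minimum and maximum reported in Table I (10.8096 and 98.8273) and nine features, this sketch gives a third bin of roughly 30.369 to 40.149 with a representation value of about 35.259, in line with the table.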
[Figure 2: block diagram of the minutia-based extraction process; recoverable step labels: 1. Binarization, 3. Find Minutia, 5. Orientation (ROI).]

Fig. 2. Examples of preprocessed fingerprint image (a) Original image (b) Binarized image (c) Thinned image (d) Minutia points (e) False minutia removed (f) ROI.

Unimodal extraction and the discretization step are illustrated in Table I for the signature data of individual 1, and in Table II for the fingerprint data of the same individual. In each of these tables, the feature values are divided into a predefined number of bins, which is based on the number of features for each modality image.

In the top portion of these tables, for each bin, the lower and upper values are recorded in columns two and three respectively, and the bin's RepValue, the average of the lower and upper values, is recorded in column four. Max and Min values are highlighted in bold face. In the bottom portion of each table, the discretized features for signature and fingerprint are displayed. These tables show an example of how the actual feature sets from an individual are discretized. As can be seen from Table I, the feature value 35.259 occurs in every one of the four samples (rows) across the nine features of the signature data of the same individual. This means that the first individual is uniquely recognized by this discriminatory value. A similar discussion holds for Table II, where the discriminatory value for the fingerprint data of the first individual, obtained from four different images, is 104.

The selected features are the representation values (discriminatory features, DF, of an individual) that describe the unique characteristics of an individual and are used in the matching process. In the matching module, the distance between the discretized values and the stored feature values is computed by the Euclidean distance equation as defined in (1).

ED_{bp} = \sum_{i=1}^{N} \left( Df_{bp,i} - Df^{(r)}_{bp,i} \right)    (1)

where Df_{bp,i} represents the ith discretized feature of the new modality image, Df^{(r)}_{bp,i} is the ith discretized feature of the reference modality image in the stored template, and bp represents either the behavioral or the physiological trait of the individual. The total number of features extracted from a single modality image is denoted by N.

Let X_{sign} = ED_{sign}(x), where X_{sign} = (x_1, ..., x_d) denotes the distances for the discretized signature features, and Y_{finger} = ED_{finger}(y), where Y_{finger} = (y_1, ..., y_d) denotes the distances for the discretized fingerprint features. The lowest distance for signature can be denoted as min[ED_{sign}(x)] and the lowest distance for fingerprint as min[ED_{finger}(y)]. Then, we define the modality features with the lowest distance as match score 1 (MS_{bp} = 1), the modality features with the second lowest distance as MS_{bp} = 2, and so on. Here bp denotes either the behavioral (i.e., signature) or the physiological (i.e., fingerprint) trait of the individual. The match score, MS_{bp}, is then fed to the fusion approach.

D. Multi-modality fusion

After matching, the matched scores of signature and fingerprint are fed to the fusion method. Let X_{sign} = MS_{sign}(1), MS_{sign}(2), ..., MS_{sign}(n) denote the computed signature match scores and Y_{finger} = MS_{finger}(1), MS_{finger}(2), ..., MS_{finger}(n) denote the computed match scores for fingerprint. In this work, the final fused score, FS_{bp}, of the individual is computed using Equation (2), where k represents the number of different modalities of an individual. The MS for fingerprint and signature are summed and divided by k to generate a single score, which is then compared to a predefined threshold to make the final decision.

FS_{bp} = \frac{MS_{sign} + MS_{finger}}{k}    (2)

Two fusion approaches, namely Mean fusion, MeanFS_{bp}, and Min fusion, MinFS_{bp}, as defined in (3) and (4), are chosen for comparison to show the efficiency of the proposed method on multi-modality identification.

MeanFS_{bp} = (x\,MS_{sign} + y\,MS_{finger}) / 2    (3)

MinFS_{bp} = \min(MS_{sign}, MS_{finger})    (4)

Finally, FS_{bp} is forwarded to the next phase for identification. In the identification process of one-to-many matching (1:M), FS_{bp} is compared with the predefined identification threshold, η, in order to identify the individual from M individuals. In this work, the identity of a person is identified if

FS_{bp} \le \eta    (5)

III. EXPERIMENTAL RESULTS

The performance of this work is evaluated using ROC curves, which plot the Genuine Acceptance Rate (GAR) of a system against the False Acceptance Rate (FAR). In this work, GAR is equal to 1 − FRR, where FRR is the False Rejection Rate. Fig. 3 shows the performance of unimodal identification for signature and fingerprint. Discretization is applied in this experiment; no normalization or fusion methods are implemented. The identification performance of the discretized signature and fingerprint datasets is compared with that of the non-discretized datasets.
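The matching and fusion rules of Equations (1), (2), (4) and (5) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the distance is the standard Euclidean distance (square root of the summed squared differences), match scores are taken to be ranks (1 for the closest template), ties keep template order, and every function name and all template values are hypothetical.

```python
import math


def euclidean_distance(query, reference):
    # Assumption: standard Euclidean distance between discretized feature vectors
    return math.sqrt(sum((q - r) ** 2 for q, r in zip(query, reference)))


def match_scores(query, templates):
    """Rank stored templates by distance to the query.

    The closest identity gets match score 1, the next closest 2, and so on.
    """
    distances = {ident: euclidean_distance(query, ref)
                 for ident, ref in templates.items()}
    ranked = sorted(distances, key=distances.get)  # stable: ties keep input order
    return {ident: rank for rank, ident in enumerate(ranked, start=1)}


def fuse(ms_sign, ms_finger, k=2):
    # Equation (2): sum of the match scores divided by the number of modalities k
    return {i: (ms_sign[i] + ms_finger[i]) / k for i in ms_sign}


def min_fusion(ms_sign, ms_finger):
    # Equation (4): smaller of the two match scores
    return {i: min(ms_sign[i], ms_finger[i]) for i in ms_sign}


def identify(fused, eta):
    # Equation (5): keep identities whose fused score is at most eta, best first
    return sorted((i for i, s in fused.items() if s <= eta), key=fused.get)


# Hypothetical stored templates (discretized features) for three identities
sign_templates = {"A": [35.259, 15.7], "B": [54.8, 64.6], "C": [93.9, 84.2]}
finger_templates = {"A": [104, 90], "B": [132, 160], "C": [62, 76]}

ms_sign = match_scores([35.259, 25.5], sign_templates)
ms_finger = match_scores([104, 104], finger_templates)
fused = fuse(ms_sign, ms_finger)
print(fused)
print(identify(fused, eta=1.0))
```

Because the fused score is built from ranks rather than raw distances, the two modalities are combined on a common scale without a separate normalization step, which is the motivation the paper gives for converting the features into the same domain.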
TABLE I
EXAMPLE OF DISCRETIZATION PROCESS FOR SIGNATURE FEATURES OF FIRST INDIVIDUAL

Lower and upper bin values for individual 1 (MIN value 10.8096, MAX value 98.8273)

Bin   Lower     Upper     RepValue
0     10.8096   20.5893   15.69945
1     20.5893   30.3691   25.4792
2     30.3691   40.1488   35.259
3     40.1488   49.9286   45.0387
4     49.9286   59.7083   54.8184
5     59.7083   69.4881   64.5982
6     69.4881   79.2678   74.3779
7     79.2678   89.0476   84.1577
8     89.0476   98.8273   93.9374

Discretized data

f1        f2        f3       f4       f5        f6        f7        f8        f9        Class
15.69945  15.69945  35.259   35.259   15.69945  15.69945  15.69945  25.4792   25.4792   1s
54.8184   64.5982   93.9374  35.259   15.69945  35.259    54.8184   45.0387   25.4792   1s
25.4792   35.259    35.259   25.4792  25.4792   54.8184   35.259    45.0387   45.0387   1s
64.5982   35.259    25.4792  25.4792  35.259    74.3779   45.0387   15.69945  15.69945  1s

Discriminatory value for the 1st individual: 35.259

TABLE II
EXAMPLE OF DISCRETIZATION PROCESS FOR FINGERPRINT FEATURES OF FIRST INDIVIDUAL

Lower and upper bin values for individual 1 (MIN value 55, MAX value 195)

Bin   Lower   Upper   RepValue
0     55      69      62
1     69      83      76
2     83      97      90
3     97      111     104
4     111     125     118
5     125     139     132
6     139     153     146
7     153     167     160
8     167     181     174
9     181     195     188

Discretized data

f1    f2    f3    f4    f5    f6    f7    f8    f9    f10   Class
104   90    104   132   104   118   104   104   62    146   1f
90    132   104   132   160   90    146   146   160   188   1f
76    90    104   132   160   62    104   104   160   181   1f
90    104   118   132   146   132   76    62    160   160   1f

Discriminatory value for the 1st individual: 104
From the ROC graph, it is clear that the use of discretization on the unimodal dataset enhances the overall identification performance significantly compared with identification without discretization. Because the discretization method is effective for unimodal identification, the same technique is applied to multimodal identification in order to improve the accuracy of identification on multiple modalities.

Fig. 3. Performance of uni-modality identification.

Fig. 4 and Fig. 5 show the ROC performance for two different fusion methods, namely the Mean fusion rule and the Min fusion rule, with Z-score normalization and with the proposed matched scores fusion based discretization approach on multiple modalities. From the ROC graph depicted in Fig. 4, it can be seen that the proposed discretization-based method on the multi-modality fusion of signature and fingerprint performs better than the standard signature and fingerprint identification system. At FAR of 0.1%, 1.0%, and 10.0%, the proposed method has a GAR of 96.9%, 98.9%, and 99.9% respectively, which is better than Z-score normalization with Mean fusion on the signature and fingerprint modalities (93.5%, 93.7%, and 96.4%). Fig. 5 shows the GAR performance of Min fusion with Z-score normalization and with the proposed multi-matched score based discretization. Again, in Fig. 5, the proposed discretization-based method on signature and fingerprint modalities yields the best performance over the range of FAR. At 0.1%, 1.0%, and 10.0% of FAR, the Min fusion method works best with the proposed method, with GAR of 95.0%, 97.99%, and 99.40% respectively. Therefore, it can be summarized that the use of discretization and the proposed fusion of fingerprint and signature modalities generally performs better than normalization with conventional fusion approaches for personal identification.

Fig. 4. Performance of Multi-modality fusion methods for signature and fingerprint.

Fig. 5. Performance of Multi-modality fusion methods for signature and fingerprint.

IV. CONCLUSION

A key to successful multimodal system development for forensic identification is an effective methodology organization and fusion process, capable of integrating and handling important information such as the distinctive characteristics of an individual. In this paper, match scores discretization is proposed and implemented on different modality datasets of an individual. The experiments are done on signature and fingerprint datasets, which consist of 156 students (both female and male), where each student contributes 4 samples each of signature and fingerprint. Ten features describing the bifurcation and termination points of the fingerprint were extracted using the minutia-based extraction approach, whereas signature features are extracted using the geometry-based extraction approach. In the matching process, each template-query pair of feature sets is compared using Euclidean distance. Two fusion approaches, namely Mean and Min fusion, are used to assess the efficiency of the proposed method in multimodal identification. The experimental results show that the proposed multi-matched scores discretization performs well on multiple sets of individual traits, consequently improving the identification performance.

ACKNOWLEDGMENT

This work is supported by The Ministry of Higher Education (MOHE) under Research University Grant (GUP) and Mybrain15. The authors would especially like to thank Universiti Teknologi Malaysia, Skudai, Johor Bahru, Malaysia for the support and the Soft Computing Research Group (SCRG) for their excellent cooperation and contributions to improve this paper.

REFERENCES

[1] S. N. Srihari, "Computational methods for handwritten questioned document examination," National Criminal Justice Research Report, 2010.
[2] C. Su and S. Srihari, "Evaluation of rarity of fingerprints in forensics," Advances in Neural Information Processing Systems, vol. 23, pp. 1207–1215, 2010.
[3] S. N. Srihari and K. Singer, "Role of automation in the examination of handwritten items," in Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on. IEEE, 2012, pp. 619–624.
[4] Y. Tang, H. Kasiviswanathan, and S. N. Srihari, "An efficient clustering-based retrieval framework for real crime scene footwear marks," International Journal of Granular Computing, Rough Sets and Intelligent Systems, vol. 2, no. 4, pp. 327–360, 2012.
[5] J. Gonzalez-Rodriguez, J. Ortega-Garcia, and J.-L. Sanchez-Bote, "Forensic identification reporting using automatic biometric systems," in Biometric Solutions. Springer, 2002, pp. 169–185.
[6] J. Gonzalez-Rodriguez, J. Fierrez-Aguilar, D. Ramos-Castro, and J. Ortega-Garcia, "Bayesian analysis of fingerprint, face and signature evidences with automatic biometric systems," Forensic Science International, vol. 155, no. 2-3, pp. 126–140, 2005.
[7] K. J. Craft, J. D. Owens, and M. V. Ashley, "Application of plant DNA markers in forensic botany: Genetic comparison of Quercus evidence leaves to crime scene trees using microsatellites," Forensic Science International, vol. 165, no. 1, pp. 64–70, 2007.
[8] Y. Tang, S. N. Srihari, H. Kasiviswanathan, and J. J. Corso, "Footwear print retrieval system for real crime scene marks," in Computational Forensics. Springer, 2011, pp. 88–100.
[9] A. Jain and A. Ross, "Introduction to biometrics," Handbook of Biometrics, pp. 1–22, 2008.
[10] A. Ross, K. Nandakumar, and A. Jain, "Introduction to multibiometrics," Handbook of Biometrics, pp. 271–292, 2008.
[11] N. Solayappan and S. Latifi, "A survey of unimodal biometric methods," in Proceedings of the 2006 International Conference on Security and Management, 2006, pp. 57–63.
[12] A. Ross and A. Jain, "Information fusion in biometrics," Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–2125, 2003.
[13] H. Liu, F. Hussain, C. L. Tan, and M. Dash, "Discretization: An enabling technique," Data Mining and Knowledge Discovery, vol. 6, no. 4, pp. 393–423, 2002.
[14] R. Ahmad, M. Darus, S. M. H. Shamsuddin, and A. A. Bakar, "Pendiskretan set kasar menggunakan taakulan boolean terhadap pencaman simbol matematik," Journal Teknologi Maklumat & Multimedia, pp. 15–26, 2004.
[15] B. O. Mohammed and S. M. Shamsuddin, "Feature discretization for individuality representation in twins handwritten identification," Journal of Computer Science, vol. 7, no. 7, pp. 1080–1087, 2011.
[16] A. Muda, S. Shamsuddin, and M. Darus, "Invariants discretization for individuality representation in handwritten authorship," Computational Forensics, pp. 218–228, 2008.
[17] K. Huang and H. Yan, "Off-line signature verification based on geometric feature extraction and neural network classification," Pattern Recognition, vol. 30, no. 1, pp. 9–17, 1997.