=Paper=
{{Paper
|id=None
|storemode=property
|title= Ethnicity Prediction Based on Iris Texture Features
|pdfUrl=https://ceur-ws.org/Vol-710/paper29.pdf
|volume=Vol-710
|dblpUrl=https://dblp.org/rec/conf/maics/LagreeB11
}}
== Ethnicity Prediction Based on Iris Texture Features==
Stephen Lagree and Kevin W. Bowyer
Department of Computer Science and Engineering
University of Notre Dame
Notre Dame, Indiana 46556 USA
slagree@nd.edu, kwb@cse.nd.edu
Abstract

This paper examines the possibility of predicting ethnicity based on iris texture. This is possible if there are similarities in the iris texture within a certain ethnicity, and these similarities differ from ethnicity to ethnicity. This sort of "soft biometric" prediction could be used, for example, to narrow the search of an enrollment database for a match to a probe sample. Using an iris image dataset representing 120 persons and 10-fold person-disjoint cross-validation, we obtain 91% correct Asian / Caucasian ethnicity classification.

Introduction

Iris texture has been shown to be useful for biometric identification and verification (Bowyer, Hollingsworth, and Flynn 2008; Phillips et al. 2005; Phillips et al. 2010; Daugman 2006). Studies have been done to determine whether iris texture contains information that can determine "soft biometric" attributes of a person, such as ethnicity (Qiu, Sun, and Tan 2006; Qiu, Sun, and Tan 2007a) or gender (Thomas et al. 2007). This paper analyzes the possibility of ethnicity prediction based on iris texture. The ability of biometric systems to recognize the ethnicity of a subject could allow automatic classification without human input. Also, in an iris recognition system, an identification request includes a "probe" iris, which is checked against a "gallery" of enrolled images to find the correct identity of the requested iris. One application of ethnicity prediction is to narrow down the gallery of subjects against which a probe iris is compared. In a system with millions of enrolled subjects, comparing a probe against all enrolled irises could take an extremely long time. Narrowing the gallery to only irises with the same ethnicity as the probe could give a substantial speed improvement.

Related Work

The CASIA biometrics research group has performed research on iris texture elements, including studies (Qiu, Sun, and Tan 2006; Qiu, Sun, and Tan 2007a; Qiu, Sun, and Tan 2007b) on determining ethnicity based on iris texture. To our knowledge, this is the only other work on predicting ethnicity from iris texture. In (Qiu, Sun, and Tan 2006), they report 86% accuracy in Asian / Caucasian classification. Thomas et al. (2007) suggest that the work in (Qiu, Sun, and Tan 2006) may be biased due to illumination differences between the two datasets the images were taken from, the Asian subject images coming from one dataset and the Caucasian subject images from another. If one dataset was generally brighter or darker than the other, the learned algorithm could have separated the subjects based on lighting rather than iris texture. In the results presented in this paper, we eliminate this issue by using images taken from a single database to build our classifier, so that any acquisition setup differences are equally likely to appear in either ethnicity class. In (Qiu, Sun, and Tan 2007a), the CASIA group reports 91% accuracy in Asian / non-Asian ethnicity classification, using support vector machines and texton features. The dataset in that work is composed of 2,400 images representing 60 different persons, so that there are 20 images per iris. They divide the dataset into a 1,200-image training set and a 1,200-image test set, with the training and test sets not specified to be person-disjoint. In general, if iris images from the same person appear in both the training and the test set, then the performance estimate obtained is optimistically biased. In the results presented in this paper, we eliminate this issue by using person-disjoint ten-fold cross-validation.
Figure 1 – Example LG 4000 Iris Images From Subjects with Caucasian Ethnicity (top: image 02463d1892; middle: image 04327d1264; bottom: image 04397d1461).

Figure 2 – Example LG 4000 Iris Images From Subjects with Asian Ethnicity (top: image 04815d908; middle: image 04629d1385; bottom: image 05404d80).
In a study of how human observers categorize images, Stark, Bowyer, and Siena (2010) found that humans perceive general differences in iris texture that can be used to classify iris textures into categories of similar texture pattern. Observers grouped a set of 100 iris images into categories of similar texture. The 100 images represented 100 different persons, and the 100 persons were balanced on gender and on Asian / Caucasian ethnicity. The observers did not know the gender or ethnicity of the persons in the iris images. However, the grouping of images into categories of similar iris texture resulted in categories that were, on average, split 80% / 20% on ethnicity. The same categories were on average divided much more closely to 50% / 50% on gender. Thus, one result of Stark's work (2010) is that human observers perceive consistent ethnicity-related differences in iris texture. In this paper, we want to train a classifier to explicitly perform the sort of ethnicity classification that was found as a side effect of the texture similarity grouping done by humans in (Stark, Bowyer, and Siena 2010) and that was previously explored in (Qiu, Sun, and Tan 2006; Qiu, Sun, and Tan 2007a).
Dataset

We want to see how accurately we can identify ethnicity based on iris texture. For this study we use two ethnicity classes, Caucasian and Asian. The study used 1,200 iris images selected from the University of Notre Dame's iris image database. (This is a newer database than the one released to the iris biometrics research community for the government's Iris Challenge Evaluation (ICE) program (Phillips et al. 2005; Phillips et al. 2010).) All images were obtained using an LG 4000 sensor at Notre Dame. As with all commercial iris biometrics systems that we are aware of, the images are obtained using near-infrared illumination, and are 480x640 in size. One half of the images, 600, were of persons whose ethnicity is classified as Asian and the other half were from persons classified as Caucasian. For each ethnicity, the 600 images represented 60 different persons, with 5 left iris images and 5 right iris images per person. This 1,200-image dataset was randomly divided into 10 folds of 120 images each, with 6 persons of each ethnicity in each fold. Thus the images in the folds are person-disjoint; that is, each person's images appear in just one fold.
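For concreteness, the person-disjoint fold assignment described above can be sketched as follows. This is a minimal illustration, not the authors' code; the record fields (subject_id, ethnicity) and function name are assumed names for the purpose of the sketch.

```python
import random
from collections import defaultdict

def assign_person_disjoint_folds(images, n_folds=10, seed=0):
    """Assign each image to a fold so that all images of a person share one fold,
    and each fold receives 6 Asian and 6 Caucasian persons.

    `images` is a list of dicts with keys 'subject_id' and 'ethnicity'
    (illustrative field names; any per-image metadata record works).
    """
    # Group the persons by ethnicity.
    persons = defaultdict(set)
    for img in images:
        persons[img['ethnicity']].add(img['subject_id'])

    rng = random.Random(seed)
    fold_of_person = {}
    for ethnicity, ids in persons.items():
        ids = sorted(ids)
        rng.shuffle(ids)
        # Deal the 60 persons of this ethnicity round-robin into the 10 folds,
        # giving 6 persons per ethnicity per fold.
        for i, pid in enumerate(ids):
            fold_of_person[pid] = i % n_folds

    # Every image inherits the fold of its person, so folds are person-disjoint.
    return [fold_of_person[img['subject_id']] for img in images]
```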
Segmentation

For this iris texture prediction study, we want to base our findings solely on iris texture. Therefore we exclude periocular clues that might be used as an indicator of ethnicity. We segment the images to obtain the region of interest, and mask out the eyelid-occluded portions of the iris. We use Notre Dame's IrisBEE software to perform the segmentations (Phillips et al. 2005). The output from IrisBEE that we use for texture examination is a 240x40 pixel normalized iris image along with the corresponding bitmask of eyelid and eyelash occlusion locations. The image segmentation and masking are exactly those that would be used by IrisBEE in processing the images for biometric recognition of a person's identity. However, the normalized images are not processed by the log-Gabor filters that are used by IrisBEE to create the "iris code" for biometric recognition. We create a different texture feature vector for ethnicity prediction.

Figure 3 – Examples of Segmented, Normalized Iris Images (top: normalized image derived from image 02463d1892 above; bottom: normalized image derived from image 04815d908 above). The green regions indicate the "mask" of where the iris texture is occluded by eyelid / eyelash / specular highlights.
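The 240x40 normalized image is produced by IrisBEE. Purely as an illustration of the representation, the following is a minimal sketch of a Daugman-style "rubber sheet" unwrapping under the simplifying assumption of concentric circular pupil and limbus boundaries; IrisBEE's actual implementation differs in detail, and the function and parameter names here are our own.

```python
import numpy as np

def unwrap_iris(image, center_xy, pupil_r, iris_r, width=240, height=40):
    """Sample the iris annulus into a height x width polar grid.

    image     : 2-D grayscale array (e.g., 480x640 near-infrared image)
    center_xy : (x, y) center shared by pupil and limbus circles
                (a simplifying assumption; real boundaries need not be concentric)
    pupil_r   : pupil (inner) radius in pixels
    iris_r    : limbus (outer) radius in pixels
    """
    cx, cy = center_xy
    thetas = np.linspace(0.0, 2.0 * np.pi, width, endpoint=False)
    radii = np.linspace(0.0, 1.0, height)            # 0 = pupil boundary, 1 = limbus
    normalized = np.zeros((height, width), dtype=image.dtype)
    for j, t in enumerate(thetas):
        for i, r in enumerate(radii):
            rho = pupil_r + r * (iris_r - pupil_r)    # radial position between boundaries
            x = int(round(cx + rho * np.cos(t)))
            y = int(round(cy + rho * np.sin(t)))
            if 0 <= y < image.shape[0] and 0 <= x < image.shape[1]:
                normalized[i, j] = image[y, x]        # nearest-neighbour sampling
    return normalized
```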
Feature Generation

After an image is segmented and normalized, we compute texture features that can be used in training a classifier to categorize images according to ethnicity. To do this we apply different filters to the image at every non-masked pixel location, and use the results of the filters to build a feature vector. Six of the filters we use are "spot detectors" and "line detectors" of various sizes, as depicted in Tables I to VI. For a given point in the image, if applying a given filter would result in using any pixel that is masked, then that filter application is skipped for that point. The remaining three filters (S5S5, R5R5, and E5E5) were created using Laws' Texture Measures (Laws 1980); the S5S5 and R5R5 kernels are shown in Tables VII and VIII. These are designed to give responses for various types of textures when convolved with images.

A feature vector that describes the texture is computed for each iris image. We divided the normalized image array into a number of smaller sections in order to compute statistics for sub-regions of the normalized image. This is so that classification could be based on, for example, relative differences between the band of the iris nearer the pupil versus the band of the iris furthest from the pupil. These regions were ten four-pixel horizontal bands and four 60-pixel vertical bands of neighboring pixels in the normalized iris image. The ten horizontal bands correspond to concentric circular bands of the iris, running from the pupil out to the sclera (white) of the eye. The four vertical bands correspond roughly to the top, right, bottom and left parts of the iris.

Since the filters are looking for different phenomena in the image, we compute statistics of the filter responses for each image. Each image yields 630 features, with 5 statistics calculated for each of the 9 filters on each of the 14 regions. The five statistics are: (1) average value of filter response, (2) standard deviation of filter response, (3) 90th percentile value of filter response, (4) 10th percentile value of filter response, and (5) range between the 90th and 10th percentile values. The motivation for using the average value is to represent the strength of a given spot size or line width in the texture. The motivation for using the standard deviation is to represent the degree of variation in the response. The motivation for using the percentiles and range is to have an alternate representation of the variation that is not affected by small amounts of image segmentation error.
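As a concrete illustration of the feature vector layout just described (5 statistics x 14 regions x 9 filters = 630 values per image), here is a minimal NumPy sketch. It assumes the nine filter-response arrays and the occlusion mask are already computed, and the helper names are our own rather than the authors'.

```python
import numpy as np

def region_statistics(response, mask, region):
    """Five statistics of one filter response over one region, ignoring masked pixels.

    response : 40x240 array of filter responses (NaN where the filter was skipped)
    mask     : 40x240 boolean array, True where iris texture is occluded
    region   : (row_slice, col_slice) selecting one of the 14 bands
    """
    rows, cols = region
    values = response[rows, cols][~mask[rows, cols]]
    values = values[~np.isnan(values)]
    if values.size == 0:                      # fully occluded region
        return [0.0] * 5
    p10, p90 = np.percentile(values, [10, 90])
    return [values.mean(), values.std(), p90, p10, p90 - p10]

def build_feature_vector(responses, mask):
    """responses: list of 9 filter-response arrays for one normalized iris image."""
    # Ten 4-row horizontal bands (pupil to sclera) and four 60-column vertical bands.
    regions = [(slice(r, r + 4), slice(None)) for r in range(0, 40, 4)]
    regions += [(slice(None), slice(c, c + 60)) for c in range(0, 240, 60)]
    features = []
    for resp in responses:                    # 9 filters
        for region in regions:                # 14 regions
            features.extend(region_statistics(resp, mask, region))   # 5 statistics
    return np.array(features)                 # 9 * 14 * 5 = 630 features
```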
TABLE I: Small Spot Detector Filter
-1/8  -1/8  -1/8
-1/8   +1   -1/8
-1/8  -1/8  -1/8

TABLE II: Large Spot Detector Filter
-1/16  -1/16  -1/16  -1/16  -1/16
-1/16  +1/9   +1/9   +1/9   -1/16
-1/16  +1/9   +1/9   +1/9   -1/16
-1/16  +1/9   +1/9   +1/9   -1/16
-1/16  -1/16  -1/16  -1/16  -1/16

TABLE III: Vertical Line Detector Filter
-1/20  -1/20  +1/5  -1/20  -1/20
-1/20  -1/20  +1/5  -1/20  -1/20
-1/20  -1/20  +1/5  -1/20  -1/20
-1/20  -1/20  +1/5  -1/20  -1/20
-1/20  -1/20  +1/5  -1/20  -1/20

TABLE IV: Wide Vertical Line Detector Filter
-1/10  +1/15  +1/15  +1/15  -1/10
-1/10  +1/15  +1/15  +1/15  -1/10
-1/10  +1/15  +1/15  +1/15  -1/10
-1/10  +1/15  +1/15  +1/15  -1/10
-1/10  +1/15  +1/15  +1/15  -1/10

TABLE V: Horizontal Line Detector Filter
-1/20  -1/20  -1/20  -1/20  -1/20
-1/20  -1/20  -1/20  -1/20  -1/20
+1/5   +1/5   +1/5   +1/5   +1/5
-1/20  -1/20  -1/20  -1/20  -1/20
-1/20  -1/20  -1/20  -1/20  -1/20

TABLE VI: Wide Horizontal Line Detector
-1/10  -1/10  -1/10  -1/10  -1/10
+1/15  +1/15  +1/15  +1/15  +1/15
+1/15  +1/15  +1/15  +1/15  +1/15
+1/15  +1/15  +1/15  +1/15  +1/15
-1/10  -1/10  -1/10  -1/10  -1/10

TABLE VII: S5S5
+1   0  -2   0  +1
 0   0   0   0   0
-2   0  +4   0  -2
 0   0   0   0   0
+1   0  -2   0  +1

TABLE VIII: R5R5
+1   -4   +6   -4  +1
-4  +16  -24  +16  -4
+6  -24  +36  -24  +6
-4  +16  -24  +16  -4
+1   -4   +6   -4  +1
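A minimal sketch of how kernels like those above can be applied under the skip-if-any-masked-pixel rule described in the Feature Generation section follows. The Laws kernels are formed as outer products of Laws' 1-D vectors (Laws 1980); the explicit E5E5 construction and the function names are our own illustrative choices, not the authors' implementation.

```python
import numpy as np

# Laws' 1-D vectors (Laws 1980); 2-D kernels are their outer products, e.g. S5S5 = outer(S5, S5).
S5 = np.array([-1.0, 0.0, 2.0, 0.0, -1.0])   # "spot"
R5 = np.array([1.0, -4.0, 6.0, -4.0, 1.0])   # "ripple"
E5 = np.array([-1.0, -2.0, 0.0, 2.0, 1.0])   # "edge"
S5S5 = np.outer(S5, S5)
R5R5 = np.outer(R5, R5)
E5E5 = np.outer(E5, E5)

def masked_filter_response(image, mask, kernel):
    """Apply `kernel` at every non-masked pixel of the normalized iris image.

    If the kernel window would touch any masked pixel (or fall off the image),
    the application is skipped and the output is NaN at that location.
    image, mask : 40x240 arrays (mask is True where texture is occluded)
    """
    kh, kw = kernel.shape
    rh, rw = kh // 2, kw // 2
    out = np.full(image.shape, np.nan)
    for i in range(rh, image.shape[0] - rh):
        for j in range(rw, image.shape[1] - rw):
            window_mask = mask[i - rh:i + rh + 1, j - rw:j + rw + 1]
            if window_mask.any():
                continue                      # skip: some pixel under the kernel is masked
            window = image[i - rh:i + rh + 1, j - rw:j + rw + 1]
            out[i, j] = float(np.sum(window * kernel))
    return out
```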
Results

We tried a variety of different classification algorithms included in the WEKA package (Weka), including meta-algorithms such as Bagging combined with other classifiers. By changing parameters, we achieved performance gains for some of the algorithms. However, we found our best results using the SMO algorithm with the default parameters in WEKA. The SMO algorithm implements "Sequential Minimal Optimization", John Platt's algorithm for building a support vector machine classifier (Weka). The input to the SMO algorithm is the feature vectors of all 1,200 iris images that we have computed. To assess the results of our classifier we use ten-fold cross-validation with stratification based on ethnicity. These folds are also subject-disjoint, ensuring that the persons whose images are in the test data have not been seen by the classification algorithm in the training data.

The SMO classifier results in higher accuracy than a broad range of other classifiers, including decision-tree-based algorithms and bagging. Using Bagging on the top two classifiers, SMO and Random Forest, did not improve performance. Running the experiment with the SMO classifier and the feature vector described above gives an accuracy of 90.58%. This is good accuracy, representing an improvement on the 86% reported in (Qiu, Sun, and Tan 2006) and close to the 91% reported in (Qiu, Sun, and Tan 2007a) for a train-test split that was not person-disjoint. When we do not use person-disjoint folds, we see an accuracy of 96.17%, which is significantly higher than Qiu, Sun, and Tan (2006; 2007a) reported.

We computed the classification accuracy for each feature separately to see the impact of individual features. Table X shows that some of the single features have almost the performance of all of the features together. However, none of them does as well as the combination of all of the features. Some filters may be redundant; a combination of a few might reproduce the performance of all nine filters.

To ensure that the size of our training dataset was not limiting our accuracy levels, we ran the classifier with different numbers of folds. Table XI shows the results we achieved using 5-, 10-, and 20-fold cross-validation. The accuracy levels are all within one percent, indicating that our performance should not be limited by our dataset size.
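The evaluation protocol (ten folds, stratified by ethnicity, person-disjoint) could be sketched as below. The authors used WEKA's SMO implementation; here a linear SVM from scikit-learn stands in as an analogue, and the variable names (X, y, fold_ids) are our own illustrative choices.

```python
import numpy as np
from sklearn.svm import SVC

def person_disjoint_cv(X, y, fold_ids, n_folds=10):
    """Person-disjoint cross-validation over precomputed fold assignments.

    X        : (n_images, 630) feature matrix
    y        : (n_images,) ethnicity labels (e.g., 0 = Caucasian, 1 = Asian)
    fold_ids : (n_images,) fold index per image; all images of a person share a fold,
               and each fold holds 6 persons per ethnicity (stratified by construction)
    """
    accuracies = []
    for k in range(n_folds):
        test = fold_ids == k
        train = ~test
        clf = SVC(kernel='linear', C=1.0)     # stand-in for WEKA's SMO with default settings
        clf.fit(X[train], y[train])
        accuracies.append(float(np.mean(clf.predict(X[test]) == y[test])))
        print(f"fold {k + 1}: {100 * accuracies[-1]:.3f}% accuracy")
    print(f"average: {100 * np.mean(accuracies):.3f}%")
    return accuracies
```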
TABLE IX: Results for Different Classifiers
Algorithm                              Accuracy (%)
SMO                                    90.58
RandomForest (100 Trees/Features)      89.50
Bagged FT                              89.33
FT                                     87.67
ADTree                                 85.25
J48Graft                               83.67
J48                                    83.08
Naïve Bayes                            68.42

TABLE X: Feature Performance with SMO
Feature                                Accuracy (%)
Small Spot Detector                    85.58
Large Spot Detector                    85.67
Vertical Line Detector                 87.42
Wide Vertical Line Detector            85.50
Horizontal Line Detector               78.92
Wide Horizontal Line Detector          78.33
S5S5                                   78.17
R5R5                                   73.33
E5E5                                   88.0
All Features                           90.58

TABLE XI: SMO Accuracy by Number of Folds Used in Cross-Validation
Folds    Accuracy (%)
5        90.00
10       90.583
20       90.1667

TABLE XII: SMO Accuracy by Fold Using 10-Fold Cross-Validation
Fold       Accuracy (%)
1          91.667
2          100.000
3          88.333
4          90.833
5          97.500
6          82.500
7          98.333
8          90.000
9          87.500
10         79.167
Average    90.583

Future Work

To achieve even greater accuracy, we intend to implement additional and more sophisticated features, and to look at the effects of the size of the training set. We envision that the number of different persons represented in the training data is likely to be more important than the number of images in the training set; that is, doubling the training set by using twice as many images per person is likely not as powerful as doubling the number of persons.

For this experiment, we only looked at very broad ethnicity classifications. More work could be done to examine finer categories, such as Indian and Southeast Asian. The performance of a classifier such as this also has not been tested on subjects of multiple ethnic backgrounds.

Acknowledgments

This work is supported by the Technical Support Working Group under US Army contract W91CRB-08-C-0093, and by the Central Intelligence Agency. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of our sponsors.
References

Bowyer, K. W.; Hollingsworth, K.; and Flynn, P. J. Image understanding for iris biometrics: a survey. Computer Vision and Image Understanding, 110(2), 281-307, May 2008.

Daugman, J. Probing the uniqueness and randomness of iris codes: results from 200 billion iris pair comparisons. Proceedings of the IEEE, 94(11), 1927-1935, Nov. 2006.

Laws, K. Textured Image Segmentation. Ph.D. Dissertation, University of Southern California, January 1980.

Phillips, P. J.; Bowyer, K. W.; Flynn, P. J.; Liu, X.; and Scruggs, T. W. The Iris Challenge Evaluation 2005. In Biometrics: Theory, Applications and Systems (BTAS 08), September 2008, Washington, DC.

Phillips, P. J.; Scruggs, W. T.; O'Toole, A.; Flynn, P. J.; Bowyer, K. W.; Schott, C. L.; and Sharpe, M. FRVT 2006 and ICE 2006 large-scale experimental results. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5), 831-846, May 2010.

Stark, L.; Bowyer, K. W.; and Siena, S. Human perceptual categorization of iris texture patterns. In Biometrics: Theory, Applications and Systems (BTAS), September 2010.

Thomas, V.; Chawla, N.; Bowyer, K. W.; and Flynn, P. J. Learning to predict gender from iris images. In Proc. IEEE Int. Conf. on Biometrics: Theory, Applications, and Systems, Sept 2007.

Qiu, X. C.; Sun, Z. A.; and Tan, T. N. Global texture analysis of iris images for ethnic classification. In Springer LNCS 3832: Int. Conf. on Biometrics, pages 411-418, June 2006.

Qiu, X. C.; Sun, Z. A.; and Tan, T. N. Learning appearance primitives of iris images for ethnic classification. In Int. Conf. on Image Processing, pages II: 405-408, 2007a.

Qiu, X. C.; Sun, Z. A.; and Tan, T. N. Coarse iris classification by learned visual dictionary. In Springer LNCS 4642: Int. Conf. on Biometrics, pages 770-779, Aug 2007b.

Weka 3. http://www.cs.waikato.ac.nz/ml/weka/.