Multispectral Recognition Using Genetic and Evolutionary Feature Extraction

Pablo A. Arias, Joseph Shelton, Kaushik Roy, Gerry V. Dozier, Foysal Ahmad
Department of Computer Science, North Carolina A&T State University, Greensboro, U.S.A
Center for Advanced Studies in Identity Sciences @ NC A&T
{parias, jashelt1, fahmad}@aggies.ncat.edu, {kroy, gvdozier}@ncat.edu

Copyright held by the author(s).

Abstract

Traditionally, iris recognition systems capture iris images in the 700 to 900nm range. It is within this range that researchers have found the most viable iris textures for iris recognition. Recently, there has been interest in exploring spectrum ranges that fall outside of this traditional range. In this work, we explore the performance of feature extraction techniques on a wider spectrum, specifically between 400nm and 1550nm. More specifically, we apply the traditional Local Binary Pattern (LBP) technique and a hybrid LBP technique (Genetic and Evolutionary Feature Extraction (GEFE)) in an effort to elicit the most important iris information. We also perform intra-spectral and cross spectrum analysis on the iris images captured in different wavelengths. Results show that GEFE outperforms the LBP technique on all spectrums.

Introduction

Biometric technologies are becoming the predominant form of access control. Physiological and behavioral traits are aspects of biometrics that allow for unique forms of identification. The physiological traits (face, iris, fingerprint, etc.) have many advantages over traditional techniques such as knowledge and/or token based access control. Knowledge can be forgotten and tokens can be easily stolen, whereas physiological traits cannot be forgotten or easily stolen. Biometrics is a subarea of the broader field of identity science. “Identity Science is the field of study devoted towards the understanding of how the dynamic nature of ‘self’ interacts with a possibly intractable number of dynamic environments in an effort to observe, track, and identify ‘self’ (in terms of its beliefs, desires, intentions, regression, progression, etc.) via a set of external witnesses” (Dozier 2015). An external witness can be viewed as any entity that has the ability to: perceive an interaction between ‘self’ and an environment, process the observed interaction, and track and/or identify ‘self’ with respect to that environment.

The iris biometric has been shown to be a viable biometric for verification and identification (Ross 2010, Masek 2003). Daugman developed the first iris recognition algorithm (Daugman 2004), and many researchers have further extended the work of iris recognition (Verma et al. 2012, Bowyer, Hollingsworth and Flynn 2008, Sanchez-Avila and Sanchez-Reillo 2005). Google headquarters uses iris recognition access control systems for identifying individuals (Adam, Neven and Steffens 2010), and the company M2SYS Technology has patented an iris identification system that can link hospital patients to their medical records (Archbold 2014). In the Penobscot County Jail, iris scanners are being implemented in order to eliminate improper inmate releases (Ricker 2006).

The iris biometric has advantages over other types of biometrics. The accuracy of recognition is generally better than the accuracy of other biometrics such as facial or fingerprint (Masek 2003). An iris is also well protected from wear and tear, as opposed to one’s fingerprint. There are disadvantages in that the iris is small and intricate, which makes it difficult to capture from any significant distance without missing vital information. Individuals who have eye issues such as blindness or cataracts could also be difficult to recognize. Environmental issues such as poor lighting or shade can also be a factor. Finally, the iris can often be covered by eyewear such as glasses or shades.

Images captured in the near infrared (NIR) wavelength band contain more iris textural information than images captured in the visible wavelength. Most of the current iris recognition schemes capture iris images in the NIR wavelength band. This range is generally from 700nm to 900nm. Though iris recognition using NIR wavelengths has an acceptable recognition rate, there are still vulnerabilities such as faking iris images (Park and Kang 2005). Researchers are interested in exploring the wavelengths outside of the 700nm-900nm range.

Most of the existing multispectral iris recognition systems use Gabor filters as a feature extraction technique (Daugman 2004). There has been research done using texture based feature extraction methods such as a Modified Local Binary Pattern (MLBP) algorithm (Popplewell et al. 2014). This variation of the Local Binary Pattern (LBP) algorithm (Shelton et al. 2011) segments an image into even sized regions and extracts features from each region. Each region can be referred to as a patch. MLBP segments a biometric image into sub-regions and extracts more features from each patch than traditional LBP. A Cooperative Game Theory (CGT) based patch selector was implemented in (Ahmad, Roy and Popplewell 2014). Though the CGT with MLBP method showed promise, the approach reported in (Ahmad, Roy and Popplewell 2014) utilized only the patches based on the image partition of LBP. It is possible that noisy data may be included if patches from the entire image space are considered. To mitigate this, we propose the application of a genetic algorithm-based feature extraction technique. This feature extraction method evolves Feature Extractors (FEs) that can have patches of varying sizes in various positions on an image and applies them in an overlapping fashion.

In the past, a feature extraction technique known as Genetic and Evolutionary Feature Extraction (GEFE) was created at the Center for Advanced Studies in Identity Sciences (CASIS) (Shelton et al. 2011, Shelton et al. 2014). GEFE was initially applied on facial images (Shelton et al. 2011) but has been recently applied on iris images (Shelton et al. 2014).

GEFE is an instance of a Genetic and Evolutionary Computation (GEC), an algorithm that uses natural selection techniques to evolve a population of candidate solutions for a problem (Engelbrecht 2007, Davis 1991, Goldberg 1989). GEFE allows sub-regions to be at any location on an image and to be any size. Previous research has shown success of GEFE when testing on a single modality; GEFE outperformed traditional LBP in terms of recognition accuracy as well as the number of features used. In this work, we apply GEFE towards multispectral iris recognition (O’Connor et al. 2014). More specifically, we conduct intra-spectral and cross-spectral recognition on a wide range of spectrums (450nm-1550nm) to determine the effectiveness of GEFE on this dataset.

The remainder of this paper is as follows. In Section 2, we discuss the feature extraction methods used. In Section 3, we describe the experimental setup. Section 4 contains the results of the experiments and Section 5 provides the conclusions and future remarks.

Feature Extraction Methods

Local Binary Patterns

The Local Binary Pattern (LBP) feature extraction algorithm is a method that is used for texture classification (Ojala, Pietikainen and Maenpaa 2002, Ahonen, Hadid and Pietikainen 2006). This technique can be used to classify texture patterns in images, and it uses these textures to create Feature Vectors (FVs) for images. For biometric recognition, the LBP technique works by segmenting an image into uniform sized, non-overlapping regions, as shown in Figure 1. Each region has a histogram associated with it, where the bins in the histogram correspond to the texture patterns found in each region. A FV is created by concatenating the histograms from all regions of a segmented image.

Figure 1: Image partitioned into patches

Texture patterns are created by comparing a center pixel, a pixel that is surrounded by |N| neighboring pixels on all sides, with each of its neighboring pixels. A texture pattern can be represented as a binary string, and that string can be decoded into a decimal value, denoted as LBP(N, c), where c is the pixel intensity value of a center pixel, N is the set of neighboring pixel intensity values and N_i is the ith neighboring pixel of c. LBP(N, c) is computed in (1) and (2), where the difference is taken between each neighboring pixel and the center pixel. The function s(N_i, c) computes the difference and returns the appropriate bit.

LBP(N, c) = \sum_{i=0}^{|N|-1} s(N_i, c) \, 2^i    (1)

s(N_i, c) = \begin{cases} 0, & \text{if } N_i - c < 0 \\ 1, & \text{if } N_i - c \geq 0 \end{cases}    (2)

The total number of texture patterns that can exist depends on the number of neighboring pixels, |N|; the number of possible patterns is 2^{|N|}. However, the common way to create FVs with the LBP technique is to use mostly uniform patterns for bins in the histograms. A uniform pattern is one where the bits in a texture pattern change two or fewer times when traversing the texture pattern circularly.

This common variation of the LBP technique is popular, but it is also possible to simply consider all of the possible patterns as opposed to just uniform patterns. In this case, histograms would have length 2^{|N|}, or, for a neighborhood size of 8, 2^8 = 256.

During the process of recognition, a probe template, p, is compared to a gallery set of vectors G = {g_0, g_1, ..., g_{k-1}} using the (Manhattan) City Block distance metric. This distance is a numerical representation of the distinction between two biometric instances and can be calculated using the following formula:

d = \sum_{i=0}^{n-1} |p_i - g_{k,i}|    (3)

where d is the distance between two subjects, p is the probe feature template, g is the gallery feature template in set G, n is the total number of features, i is the index of the feature, and k is the kth individual in the gallery. The subject, g_k, is considered a match to p when the distance between the two vectors is the smallest compared to all other subjects in G.

GEFEML

In the original implementation of GEFE, a set of FEs were evolved on a training set to produce FEs that could correctly identify subjects in that particular data set. To produce FEs that could generalize well on unseen subjects, supervised learning was added to the GEFE process. Cross validation in Genetic and Evolutionary Feature Extraction – Machine Learning (GEFEML) (Shelton et al. 2012) is done by initially generating a population of random FEs. Every candidate FE is then evaluated on the training set and additionally evaluated on a validation set. The results of the FEs on the validation set do not affect the training of FEs. While a stopping condition has not been met, FEs are selected to breed, and offspring FEs are created. The offspring are evaluated on the training set, but they are also applied on the validation set. The FE with the best results on the validation set is stored as FE*. FE* is only updated when a new candidate FE performs better on the validation set than the currently stored FE*. The offspring are used to create the new population and this process repeats until the stopping condition has been met. Under this design, FE* should generalize better on unseen subjects than the best performing FE on the training set.

GEFE is an instance of a GEC that evolves LBP-based FEs. Whereas a traditional LBP FE uses even sized, non-overlapping patches over an entire image, GEFE evolves FEs that can have patches of varying sizes in various positions on an image. Because GEFE is an instance of a GEC, a FE must be represented as a candidate solution.
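As a rough illustration of the patch-based extraction that both traditional LBP and GEFE rely on, the following Python sketch computes the 256-bin LBP histogram of one rectangular patch following equations (1) and (2), concatenates patch histograms into a FV, and matches a probe against a gallery with the City Block distance of equation (3). The function names, the neighbor ordering in `OFFSETS`, and the `(top, left, height, width)` patch convention are assumptions made for this sketch; they are not taken from the authors' implementation.

```python
# Offsets (row, col) of the 8 neighbors of a center pixel.
# The ordering is an assumption; it only fixes which neighbor maps to which bit.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(neighbors, center):
    """Equations (1)-(2): bit i is 1 when neighbor i minus the center is >= 0."""
    return sum((1 if n - center >= 0 else 0) << i for i, n in enumerate(neighbors))


def patch_histogram(image, top, left, height, width):
    """256-bin histogram of LBP codes inside one rectangular patch.

    Pixels on the image border are skipped because they lack 8 neighbors.
    """
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(max(top, 1), min(top + height, rows - 1)):
        for c in range(max(left, 1), min(left + width, cols - 1)):
            neighbors = [image[r + dr][c + dc] for dr, dc in OFFSETS]
            hist[lbp_code(neighbors, image[r][c])] += 1
    return hist


def feature_vector(image, patches):
    """Concatenate the histograms of all patches into one feature vector."""
    fv = []
    for top, left, height, width in patches:
        fv.extend(patch_histogram(image, top, left, height, width))
    return fv


def city_block(p, g):
    """Equation (3): sum of absolute differences between two templates."""
    return sum(abs(a - b) for a, b in zip(p, g))


def best_match(probe, gallery):
    """Index of the gallery template with the smallest distance to the probe."""
    return min(range(len(gallery)), key=lambda k: city_block(probe, gallery[k]))
```

Because patches are passed in as an explicit list, the same functions cover both a fixed grid partition and overlapping patches of arbitrary position and size.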
We use a 6-tuple with 5 sets and 1 single value, represented as <X_i, Y_i, W_i, H_i, M_i, f_i>. Each of the patches in a particular FE, fe_i, is designed using the values in the 6-tuple. The X_i and Y_i sets hold the points of the center of each patch in fe_i, while the sets W_i and H_i hold the widths and heights of the patches. The set M_i denotes a masking value for each patch in fe_i. Though there can be multiple patches defined by the 6-tuple, a patch’s specific masking value determines whether the features extracted by that patch are included in the resulting FV.

The fitness, f_i, is determined by applying fe_i to a dataset of subjects’ iris images. The number of images per subject varies, and these images are separated into a probe set and a gallery set (G). The fe_i is applied on these images to create FVs, and the FVs in the probe set are compared to all of the FVs in the gallery set using the Manhattan distance measure. The two FVs that have the least Manhattan distance are considered to be matches. If a probe FV is incorrectly matched with a gallery FV, then fe_i is said to cause an error. The resulting f_i is ten times the number of errors (ε) plus the percentage of patches not masked out (ζ), shown below.

f_i = 10ε + ζ    (4)

Previous research in (Shelton et al. 2011) has shown that GEFE instances with uniform patch sizes had a statistically better performance than GEFE instances with non-uniform patches. This means that the sets W_i and H_i will each have one value, representing the parameters for all patches in fe_i.

Experiments

We conducted our experiments on a multispectral iris image dataset that contains 38,129 images (Multispectral Iris Dataset). These images were acquired using Goodrich/Sensors with a custom designed lens package to acquire iris images at wavelengths in the range 400 to 1600nm. We segmented the multispectral iris images using Canny edge detection in an effort to identify the iris regions and applied circular Hough transforms to define the iris and pupil boundaries (Masek 2003, Popplewell et al. 2014). A technique based on Daugman’s Rubber Sheet Model was then used for normalization. These images were divided into 13 sections, each section depicting a spectral band consisting of roughly 2945 iris images. For our first experiment, we did intra-class comparisons for each spectral band. We divided the data set into three sections: training, validation, and testing. The training set had a total of 44 subjects, the validation set had 18 subjects, and the testing set had 29 subjects. We further separated each set into a probe set and a gallery set; the first sample of each subject went into the probe set while the remaining samples went into the gallery set. GEFE was run 30 times, each run lasting 1000 generations. For cross spectral analysis, we used feature extractors evolved from intra-spectral comparisons and applied them to each of the differing spectral band images.

In this work, similarity scores are computed by modifying the Manhattan city block distance metric. The variables h_i and h_j represent the two FVs being measured, l represents the length of the FV, and z represents the current position in the FV.

[...] LBP and found that 7 columns by 10 rows was the best performing partition. This 70 patch LBP partition is compared to GEFE in the results. Figures 2-5 show the normalized iris regions with the overlapped patches.

Results and Discussions

Shown in Table I are the test set accuracies of FEs produced by different feature extraction algorithms. In the table, Spectrum represents the spectrum used for feature extraction on the training and validation set. Cross Spectrum represents the spectrum used for the test set. Accuracy represents the identification accuracy of the algorithm on the test set. For LBP, the number represents the accuracy of the traditional LBP feature extractor that partitions an image into 7 by 10. For the GEFE variations on the iris, GEFE_opt represents FEs that were optimized on the training set, while GEFE_val represents the best performing FEs on the validation set. For GEFE, the number on the outside represents the accuracy of the best feature extractor, whereas the number within the parentheses represents the average accuracy of the 30 best FEs. The column P represents the number of patches used on average by the feature extractors.

Table I. Performance on multispectral iris dataset.

Spectrum  Cross     Accuracy                               Patches (P)
          Spectrum  LBP    GEFE Opt      GEFE Val      Opt    Val
405       -         7.02   13.79(8.05)   12.07(6.21)   28.16  22.87
405       800       n/a    96.55(89.83)  94.83(84.02)  28.16  22.87
505       -         45.61  63.79(53.79)  53.44(41.03)  32.80  24.15
505       800       n/a    94.83(92.59)  96.55(86.09)  32.80  24.15
620       -         64.91  79.31(74.02)  74.13(60.80)  33.71  24.25
620       1200      n/a    98.28(92.01)  93.10(81.49)  33.71  24.25
700       -         50.87  67.24(59.31)  63.79(56.84)  30.96  24.48
700       800       n/a    96.55(93.62)  94.83(8.91)   30.96  24.48
800       -         91.23  98.28(94.54)  98.28(87.07)  28.76  24.71
800       911       n/a    98.28(91.38)  94.83(84.26)  28.76  24.71
910       -         89.66  96.55(92.18)  96.55(84.37)  31.48  25.08
910       800       n/a    96.55(87.30)  96.55(87.30)  25.08  25.08
911       -         87.93  96.55(90.75)  93.10(81.72)  31.22  24.88
911       800       n/a    98.27(94.60)  96.55(86.32)  31.22  24.88
970       -         84.48  89.66(85.17)  87.93(75.46)  30.83  24.12
970       800       n/a    98.28(94.60)  94.83(85.46)  30.83  24.12
1070      -         89.66  94.83(89.31)  93.10(77.24)  30.92  25.08
1070      800       n/a    98.28(95.11)  96.55(87.99)  30.92  25.08
1200      -         93.10  98.27(92.36)  93.10(78.79)  31.62  24.78
1200      800       n/a    98.28(94.66)  94.83(85.75)  31.62  24.78
1300      -         82.76  93.10(89.66)  91.38(77.99)  33.59  25.27
1300      800       n/a    96.55(93.39)  94.83(85.00)  33.59  25.27
1450      -         36.21  67.24(58.16)  51.72(31.72)  38.55  25.08
1450      911       n/a    98.28(92.76)  64.83(80.11)  38.55  25.08
1550      -         31.03  43.10(36.94)  37.93(24.60)  34.15  24.71
1550      800       n/a    96.55(92.01)  96.55(86.49)  34.15  24.71

Results show that for all spectrums, GEFE outperforms traditional LBP within their respective spectrums. GEFE_opt and GEFE_val had similar performances with respect to recognition accuracy, but with respect to the number of patches, GEFE_val was proven to be statistically better. The results show that on average, the 800 nm spectrum performs the best for identification accuracy for both intra-class comparisons and cross spectrum comparisons. An ANOVA test was used as a statistical measure of performance, with a 95% confidence interval.

In Figures 2-5, the images show the best performing FEs on the 800nm test set. In Figures 2 and 3, the best opt-gen and val-gen FEs are shown. The areas with overlapping patches are the most salient areas to extract features from for identification. Figures 4 and 5 show the best cross-spectral FEs on the 800nm test set for opt-gen and val-gen. It appears that the rightmost area of iris images contains more salient texture information than the left area. Depending on whether an iris belongs to a left or right eye, there will be slight noisy data in the form of eyelashes. It could be the case that the grouping of patches on the right side is locking on to that noisy data or the lack of it. In Figures 6-9, we show the Cumulative Match Characteristic (CMC) curves and the Receiver Operating Characteristic (ROC) curves for LBP and the best FE from GEFE. The CMC curve plots the rank accuracies of the methods, while the ROC curve plots the True Accept Rate (TAR) and the False Accept Rate (FAR) of subjects. The results in Table I show that the 800 nm spectrum achieved rank 1 accuracy of 98.28% for both the opt-gen and val-gen FEs on the test set.

Figure 2: 800nm FE (Opt Gen) on 800nm Image.
Figure 3: 800nm FE (Val Gen) on 800nm Image.
Figure 4: 1070nm FE (Opt Gen) on 800nm Image.
Figure 5: 1070nm FE (Val Gen) on 800nm Image.

The CMC curves plot the accuracy at each rank. The rank represents the ranking of match scores for all probe subjects.
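To make the notion of rank accuracy concrete, here is a minimal Python sketch of how a CMC curve could be computed from a matrix of probe-to-gallery distances. It assumes a closed-set scenario (every probe subject appears in the gallery) and scores each gallery subject by its closest template; the function name `cmc_curve` and its arguments are hypothetical and not taken from the paper.

```python
def cmc_curve(distances, probe_ids, gallery_ids):
    """Cumulative match accuracy at ranks 1..S, where S is the number of gallery subjects.

    distances[p][g] is the distance between probe p and gallery template g.
    Assumes every probe subject also appears in the gallery (closed set).
    """
    n_subjects = len(set(gallery_ids))
    hits = [0] * n_subjects  # hits[r] counts probes whose true subject is at rank r + 1
    for row, true_id in zip(distances, probe_ids):
        # Keep the smallest distance seen for each gallery subject.
        best = {}
        for d, gid in zip(row, gallery_ids):
            if gid not in best or d < best[gid]:
                best[gid] = d
        ranked = sorted(best, key=lambda s: best[s])  # closest subject first
        hits[ranked.index(true_id)] += 1
    # Accumulate so that entry r holds the fraction of probes matched within rank r + 1.
    curve, running = [], 0
    for h in hits:
        running += h
        curve.append(running / len(probe_ids))
    return curve


# Toy example: three probes against a three-subject gallery.
toy = cmc_curve(
    [[1, 5, 9], [4, 2, 8], [3, 7, 6]],
    ["a", "b", "c"],
    ["a", "b", "c"],
)
```

The first entry of the returned list is the rank 1 identification accuracy, and the curve reaches 1.0 at the rank where every probe's true subject has been included.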
For Figures 6 and 7, the ROC curve plots the rate of impostor attempts accepted on the x-axis against the corresponding rate of genuine attempts accepted on the y-axis along an increasing threshold. In Figure 8, the CMC curves show a superior performance of GEFE compared to LBP. Though neither technique achieves 100% accuracy until rank 57, GEFE continues to outperform LBP. In Figure 9, the CMC curves for cross spectrum analysis were created by taking the best performing FEs from each spectrum when used for cross validation. It appears that the FE from the 800nm spectrum performed best in cross spectrum analysis. This supports the intra-class results, where the FEs on the 800nm spectrum had the best performance overall.

Even though the results show that FEs evolved on the 800 nm spectrum had the overall best performance, it is worth mentioning that some bands outside of the NIR range performed well. In Table I, GEFE on the 620nm spectrum performed better than on the 700nm spectrum not only in intra-class matching, but also when using feature extractors from the 1200 nm spectrum. The 1200 nm feature extractors performed similarly to those in the range of 910nm - 970nm. The proposed work using GEFE achieved 61% TAR at 1% FAR, while the FEs on the 910, 911, and 970 nm spectral bands achieved 48%, 60%, and 62% TAR respectively at 1% FAR.

Figure 6: ROC Curves for Intra-Spectral Matching
Figure 7: ROC Curves for Cross-Spectral Matching
Figure 8: CMC Curves for Intra-Class Matching
Figure 9: CMC Curves for Cross Spectral Matching.

Conclusion and Future Work

We find from the experimental results that GEFE outperforms the LBP approach for cross spectral and intra-spectral analysis. The best performing wavelength for the entire dataset was the 800nm wavelength for recognition accuracy. However, there seems to be promise with feature extractors evolved on the 1200 nm wavelength images. Future work will be focused on fusing the cross spectral data in order to evolve features across several different spectral bands. This may improve accuracy by extracting features that may not have been present within a single spectrum.

Acknowledgements

This research was funded by the Army Research Laboratory (ARL) for the multi-university Center for Advanced Studies in Identity Sciences (CASIS) and by the National Science Foundation (NSF), Science & Technology Center: Bio/Computational Evolution in Action Consortium (BEACON). The authors would like to thank the ARL, NSF, and BEACON for their support of this research.

References

Adam, H., Neven, H., and Steffens, J. B. 2010. Image base multi-biometric system and method. U.S. Patent No. US7697735.

Ahmad, F., Roy, K., and Popplewell, K. 2014. Multispectral Iris Recognition Using Patch Based Game Theory. Proc. in International Conference on Image Analysis and Recognition. pp. 112-119.

Ahonen, T., Hadid, A., and Pietikainen, M. 2006. Face description with local binary patterns: Application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 2037-2041.

Archbold Memorial Hospital Implements RightPatient® Patient Safety and Data Integrity System. M2SYS. N.p., 1 July 2014. Web. 11 Nov. 2014.

Bowyer, K. W., Hollingsworth, K., and Flynn, P. J. 2008. Image understanding for iris biometrics: a survey. Computer Vision and Image Understanding, vol. 110, no. 2, pp. 281-307.

Daugman, J. 2004. How iris recognition works. IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 21-30.

Davis, L., ed. 1991. Handbook of genetic algorithms. vol. 115. New York: Van Nostrand Reinhold.

Dozier, G.V. Jan 12, 2015. Center for Advanced Studies in Identity Sciences (CASIS) Presentation.

Engelbrecht, A.P. 2007. Computational Intelligence: An Introduction. John Wiley & Sons.

Goldberg, D. 1989. Genetic Algorithms in Search, Optimization & Machine Learning. Addison-Wesley, Reading.

Kennedy, J., and Eberhart, R. 1995. Particle Swarm Optimization. Proceedings in International Conference on Neural Networks. pp. 1942-1948.

Masek, L. 2003. Recognition of human iris patterns for biometric identification. B.Sc. thesis, University of Western Australia.

Multispectral Iris Dataset: Portions of the research in this paper use the Consolidated Multispectral Iris Dataset of iris images collected under the Consolidated Multispectral Iris Dataset Program, sponsored by the US Government.

O’Connor, B., Roy, K., Shelton J., and Dozier, G. 2014. Iris Recognition using fuzzy level set and GEFE. International Journal of Machine Learning and Computing (IJMLC), vol. 4, no. 3, pp. 225-231.

Ojala, T., Pietikainen, M., and Maenpaa, T. 2002. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971-98.

Park, J. H., and Kang, M. G. 2005. Iris recognition against counterfeit attack using gradient based fusion of multi-spectral images. Advances in Biometric Person Authentication, vol. 3781, pp. 150-156.

Popplewell, K., Roy, K., Ahmad, F., and Shelton, J. 2014. Multispectral Iris Recognition Utilizing Hough Transform and Modified LBP. Proc. in IEEE Conference on Systems, Man and Cybernetics (SMC), pp. 1396-1399.

Ricker, N. Penobscot County Jail Iris Scanner to Eliminate Improper Inmate Releases. Bangor Daily News RSS. N.p., 6 Dec. 2006. Web. 11 Nov.

Ross, A. 2010. Iris recognition: The path forward. Computer, vol. 43, no. 2, pp. 30-35.

Sanchez-Avila, C., and Sanchez-Reillo, R. 2005. Two different approaches for iris recognition using Gabor filters and multiscale zero-crossing representation. Pattern Recognition, vol. 38, no. 2, pp. 231-240.

Shelton, J., Dozier, G. V., Bryant, K., Small, L., Adams, J., Leflore, D., Alford, A., Woodard, D. L., and Ricanek, K. 2011. Genetic and Evolutionary Feature Extraction via X-TOOLSS. Proceedings in International Conference on Genetic and Evolutionary Methods.

Shelton, J., Alford, A., Small, L., Leflore, D., Williams, J., Adams, J., Dozier, G. V., Bryant, K., Abegaz, T., and Ricanek, K. 2012. Genetic & evolutionary biometrics: feature extraction from a machine learning perspective. Proceedings in IEEE Southeastcon, pp. 1-7.

Shelton, J., Roy, K., O’Connor, B., and Dozier, G. V. 2014. Mitigating Iris-Based Replay Attacks. International Journal of Machine Learning and Computing (IJMLC), vol. 4, no. 3, pp. 204-209.

Verma, P., Dubey, M., Verma, P., and Basu, S. 2012. Daughman's Algorithm Method For Iris Recognition- A Biometric Approach. International Journal of Emerging Technology and Advanced Engineering, vol. 2, no. 6, pp. 177-185.