Pixel Consistency, K-Tournament Selection, and Darwinian-Based Feature Extraction

Joseph Shelton, Melissa Venable, Sabra Neal, Joshua Adams, Aniesha Alford, Gerry Dozier
Computer Science Department, North Carolina Agricultural and Technical State University
1601 East Market St., Greensboro, NC 27411, United States of America
jashelt1@ncat.edu, mdvenabl@ncat.edu, saneal@ncat.edu, jcadams2@ncat.edu, aalford@ncat.edu, gvdozier@ncat.edu

Abstract

In this paper, we present a two-stage process for developing feature extractors (FEs) for facial recognition. In this process, a genetic algorithm is used to evolve a number of Local Binary Pattern (LBP) based FEs, each consisting of a number of (possibly) overlapping patches from which features are extracted from an image. These FEs are then overlaid to form what is referred to as a hyper FE. The hyper FE is then used to create a probability distribution function (PDF). The PDF is a two-dimensional matrix that records the number of patches within the hyper FE that a particular pixel is contained within; the PDF matrix thus records the consistency with which pixels fall within patches of the hyper FE. Darwinian-based FEs (DFEs) are then constructed by sampling the PDF via k-tournament selection to determine which pixels of a set of images will be used to extract features. Our results show that DFEs have a higher recognition rate as well as a lower computational complexity than other LBP-based feature extractors.

Introduction

Genetic & Evolutionary Biometrics (GEB) is the field of study devoted to the development, analysis, and application of Genetic & Evolutionary Computation (GEC) to the area of biometrics (Ramadan and Abdel-Kader 2009; Galbally et al. 2007; Alford et al. 2012; Shelton et al. 2012c). Over the past few years there has been a growing interest in GEB. To date, GEB has been applied to biometrics in the form of feature extraction (Shelton et al. 2011a; Adams et al. 2010), feature selection (Kumar, Kumar and Rai 2009; Dozier et al. 2011), and feature weighting (Popplewell et al. 2011; Alford et al. 2011), as well as to cyber security (Shelton et al. 2012a; Shelton et al. 2012b).

GEFEML (Genetic and Evolutionary Feature Extraction – Machine Learning) (Shelton et al. 2012c) is a GEB method that uses GECs to evolve feature extractors (FEs) that achieve high recognition accuracy while using a small subset of pixels from a biometric image. The results of Shelton et al. (2012c) show that FEs evolved via GEFEML outperform the FEs developed via the traditional Local Binary Pattern (LBP) (Ojala and Pietikainen 2002) approach.

In this paper, we present a two-stage process for facial recognition (Tsekeridou and Pitas 1998; Zhao et al. 2003) known as Darwinian-based feature extraction (DFE). The first stage takes a set of FEs evolved by GEFEML and superimposes them to create a hyper FE. From this hyper FE, a probability distribution function (PDF) is created. The PDF is represented as a two-dimensional matrix in which each position corresponds to a pixel within a set of images, and each value records the number of patches the associated pixel is contained within. In the second stage of the process, a Darwinian feature extractor (dFE) is developed by sampling the PDF via k-tournament selection (Miller and Goldberg 1996). The selected pixels are then grouped into clusters by randomly selecting α pixels to serve as centers.

Our results show that the computational cost of dFEs (in terms of the total number of pixels being processed) is far lower than that of the FEs evolved via GEFEML. The dFEs also outperform GEFEML-evolved FEs in terms of recognition accuracy.

The remainder of this paper is organized as follows. Section 2 provides an overview of the LBP feature extraction method, GECs, and GEFEML. Section 3 describes the two-stage process for developing dFEs. Sections 4 and 5 present our experimental setup and our results, respectively. Finally, in Section 6, we present our conclusions and future work.

Background

Local Binary Pattern Method

The LBP method (Ojala and Pietikainen 2002; Ahonen, Hadid and Pietikainen 2006) extracts texture patterns from images in an effort to build a feature vector (FV).
It does this by segmenting an image into rectangular regions, referred to as patches, and comparing the grey-scale intensity value of each pixel with the intensity values of that pixel's nearest neighbors. After pixels are compared with their nearest neighbors, a pattern is extracted. This pattern is represented by a binary string. A histogram is built using the frequency of the patterns occurring in a patch, and the histograms for every patch are concatenated to form an FV.

In the LBP method, images are traditionally partitioned into uniform-sized, non-overlapping patches. Within each patch, pixels are sought out that have d neighboring pixels on all sides at a distance of r pixels. Each such pixel is referred to as a center pixel, cp, because it is surrounded by a complete neighborhood of pixels. A texture pattern can be extracted using Equations 1 and 2, where N is the set of intensity values of the neighboring pixels. In Equation 1, the difference between each neighboring pixel n_t and cp is calculated and passed to Equation 2, which returns either a 1 or a 0 depending on the sign of the difference. The d bits returned are concatenated to form the texture pattern:

    LBP(N, cp) = ||_{t=0}^{d-1} s(n_t - cp)        (1)

where || denotes bit concatenation and

    s(x) = 1 if x >= 0,  0 if x < 0.               (2)

Each patch has a histogram that stores the frequencies of the texture patterns extracted from it. The histograms for all patches of an image are concatenated to create an FV for the image. This FV can be compared to the FV of another image using a distance measure such as the Manhattan distance or the Euclidean distance.
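The patch segmentation and neighborhood comparison of Equations 1 and 2 can be sketched as follows. This is a minimal illustration, assuming d = 8 neighbors at radius r = 1 and a grey-scale image stored as a 2D array; the function names are ours, not part of the original method.

```python
import numpy as np

# Offsets of the d = 8 neighbors at radius r = 1, in a fixed order.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_pattern(image, row, col):
    """Equations 1 and 2: threshold each neighbor against the center
    pixel and concatenate the d resulting bits into a binary string."""
    center = image[row, col]
    bits = ["1" if image[row + dr, col + dc] >= center else "0"
            for dr, dc in NEIGHBORS]
    return "".join(bits)

def patch_histogram(image, top, left, height, width):
    """Histogram of pattern frequencies over one patch; patch histograms
    are concatenated across patches to form the feature vector (FV)."""
    hist = {}
    for r in range(top + 1, top + height - 1):      # interior pixels only,
        for c in range(left + 1, left + width - 1): # so every neighbor exists
            p = lbp_pattern(image, r, c)
            hist[p] = hist.get(p, 0) + 1
    return hist
```

Concatenating the per-patch histograms, in a fixed patch order, yields the FV compared under the Manhattan or Euclidean distance.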
GECs

GEFEML uses GECs to evolve FEs (Shelton et al. 2012c). The resulting FEs have been shown to have high recognition rates. A GEC uses artificial evolution to evolve a population of candidate solutions (CSs) to a particular problem. Initially, a population of CSs is randomly generated. Each CS in the population is then assigned a fitness based on a user-specified evaluation function. Parent CSs are then selected based on their fitness and allowed to create offspring using a number of recombination and mutation techniques (Spears and DeJong 1991). After the offspring are created, they are evaluated and typically replace the weaker members of the previous population. The process of selecting parents, creating offspring, and replacing weaker CSs is repeated until a user-specified stopping condition is met.

GEFEML

GEFEML evolves LBP-based FEs using a GEC, so each FE must be represented as a CS. GEFEML represents an FE, fe_i, as a six-tuple <X_i, Y_i, W_i, H_i, M_i, f_i>. The set X_i = {x_i,0, x_i,1, ..., x_i,n-1} represents the x-coordinates of the center pixels of the n possible patches, and Y_i = {y_i,0, y_i,1, ..., y_i,n-1} represents the y-coordinates. The widths and heights of the n patches are represented by W_i = {w_i,0, w_i,1, ..., w_i,n-1} and H_i = {h_i,0, h_i,1, ..., h_i,n-1}. Because the patches are uniform, w_k,0 = w_k,1 = ... = w_k,n-1 and h_k,0 = h_k,1 = ... = h_k,n-1, meaning that the widths and heights of every patch are the same. Uniform-sized patches are used because they outperformed non-uniform-sized patches in (Shelton et al. 2011b). M_i = {m_i,0, m_i,1, ..., m_i,n-1} represents the masking values for the patches, and f_i represents the fitness of fe_i. A masking value determines whether a patch is activated or deactivated. If a patch is deactivated, by setting m_i,j = 0, then its sub-histogram is not considered in the distance measure, and the number of features used in comparisons is reduced. Otherwise, the patch is activated, with m_i,j = 1.

The fitness f_i is determined by how many incorrect matches fe_i makes on a training dataset D and by how much of the image is processed by fe_i. The dataset D is composed of multiple snapshots of subjects and is divided into two subsets, a probe set and a gallery set. The fe_i is applied to both the probe set and the gallery set to create FVs for each set. A distance measure is used to compare FVs in the probe to FVs in the gallery, and the smallest distance is considered a match. If the FV of an individual in the probe is incorrectly matched with the FV of another individual in the gallery, that is counted as an error. The fitness, shown in Equation 3, is the number of errors, ε(D), multiplied by 10, plus the percentage of the image space being processed, π(fe_i):

    f_i = 10 ε(D) + π(fe_i)                        (3)

To prevent FEs from overfitting the training set during the evolutionary process, cross-validation is used to identify the FEs that generalize well to a dataset of unseen subjects. While offspring are applied to the training dataset to be evaluated, they are also applied to a mutually exclusive validation dataset that does not affect the evolutionary process. The offspring with the best performance on the validation dataset is recorded regardless of its performance on the training set.

The Two-stage Process for Developing a Hyper FE and a PDF

Stage I: Hyper FE/PDF

The hyper FE is constructed by taking a set of FEs from GEFEML and overlaying them. Figure 1a shows a set of sample FEs, while Figure 1b shows a sample hyper FE. After the hyper FE is constructed, a PDF, in the form of a matrix, is created. Each position in the matrix contains the number of patches a pixel was contained in. When patches in an FE overlap a position multiple times, the overlap is counted as well, so if each of the κ FEs had n patches, the greatest number of times a pixel could be contained in a patch would be n * κ. Figure 1c shows a 3D plot of a PDF, while Figure 1d shows the 3D plot laid over a facial image.
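Stage I can be sketched as follows: a minimal illustration, assuming each FE is given as a list of patches (x, y, w, h) centered at (x, y); build_pdf is a hypothetical helper name, not the authors' code.

```python
import numpy as np

def build_pdf(fes, height, width):
    """Overlay the patches of a set of FEs (the hyper FE) and count,
    for each pixel position, how many patches contain it; overlapping
    patches are counted once per patch."""
    pdf = np.zeros((height, width), dtype=int)
    for fe in fes:
        for (x, y, w, h) in fe:
            top, left = max(0, y - h // 2), max(0, x - w // 2)
            pdf[top:y + h // 2 + 1, left:x + w // 2 + 1] += 1
    return pdf
```

With κ FEs of n patches each, no entry of the matrix can exceed n * κ, matching the bound stated above.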
Figure 1: (a) Set of FEs, (b) hyper FE, (c) 3D plot of PDF, and (d) overlay of the 3D plot on a facial image

Stage II: Developing dFEs

A dFE can be defined by the number of clusters it has, α, the selection pressure of tournament selection, µ, and the patch resolution, ρ. The variables µ and ρ are expressed as percentages, i.e., values between 0 and 1. Assuming that β represents the number of pixels a user wants per cluster, α*ρ*β positions will be selected to be clustered. Tournament selection selects µ*σ pixels to compete for clustering, where σ represents the total number of positions in the PDF that have been processed at least once. When performing tournament selection, the position with the greatest consistency is the winner; if there is a tie, the first selected position wins. Winning pixels are selected without replacement.

After the α*ρ*β pixels have been selected via tournament selection, α random cluster centers are placed within the PDF. The distance between each selected position and each center is computed, and the pixel is clustered towards the closest center. After pixels have been assigned to clusters, those pixels undergo LBP feature extraction to extract texture patterns for each cluster. Due to the random placement of the centers, different clusters may have different numbers of pixels assigned to them. The clusters are similar to patches: a histogram is associated with each cluster, the extracted patterns are used to build the histogram, and the histograms ultimately form FVs for images.

Experiments

Two hyper FEs were used in this experiment: (a) a hyper FE composed of a set of FEs that performed well on the training set, HFEtrn, and (b) a hyper FE composed of a set of the best performing FEs on the validation set, HFEval. The FEs were evolved using the experimental setup in Shelton et al. (2012c), which used GEFEML. GEFEML was run 30 times using increments of 1000, 2000, 3000 and 4000 evaluations. An EDA instance (Larranaga and Lozano 2002) of GEFEML was used with a population of 20 FEs and an elite of 1, meaning that every generation starting from the second contained the single best performing FE of the previous generation. On each run, GEFEML returned the best performing FE on the training set and the best performing FE with respect to the validation set. The FEs were trained and validated on two mutually exclusive sets and were then applied to a test set. The datasets were composed of subjects from the Facial Recognition Grand Challenge database (FRGC) (Phillips et al. 2005). The training set was composed of 100 subjects (FRGC-100trn), the validation set of 109 subjects (FRGC-109), and the test set of 100 subjects (FRGC-100tst).

The average number of patches used by the set of generalizing FEs, as well as the average number of pixels processed per patch, were calculated in order to set a starting point for this experiment. On average, 12 patches were activated and 504 pixels were processed per patch using GEFEML.

In this experiment, instances of 16, 12, 8 and 4 clusters were tested, along with different patch resolutions, which control the number of pixels that can belong to a cluster. In this experiment, β = 504, the average number of pixels in the patches of the set of FEs from GEFEML. Instances of DFE with patch resolutions of 1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2 and 0.1 were run.
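The Stage II sampling and clustering described in the previous section can be sketched as follows. This is a minimal illustration under stated assumptions (a NumPy PDF matrix, Euclidean distance to the centers, and parameter names chosen by us); it is not the authors' implementation.

```python
import numpy as np

def k_tournament_sample(pdf, n_pixels, mu, rng):
    """Select n_pixels PDF positions without replacement. Each tournament
    draws mu * sigma candidates (at least 2) from the positions processed
    at least once; the most consistent candidate wins, ties going to the
    first one drawn."""
    positions = np.argwhere(pdf > 0)
    sigma = len(positions)                    # positions processed >= once
    k = max(2, int(mu * sigma))               # pressure 0.0 still compares 2
    chosen = []
    available = np.ones(sigma, dtype=bool)
    for _ in range(n_pixels):
        idx = rng.choice(np.flatnonzero(available),
                         size=min(k, int(available.sum())), replace=False)
        counts = pdf[tuple(positions[idx].T)]
        winner = idx[np.argmax(counts)]       # first maximum wins ties
        chosen.append(tuple(positions[winner]))
        available[winner] = False             # without replacement
    return chosen

def cluster_pixels(pixels, alpha, rng):
    """Assign each selected pixel to the nearest of alpha random centers."""
    pts = np.array(pixels, dtype=float)
    centers = pts[rng.choice(len(pts), size=alpha, replace=False)]
    dists = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    return [pts[labels == c] for c in range(alpha)]
```

Here n_pixels would be α·ρ·β; because the centers are placed randomly, the resulting clusters can hold different numbers of pixels, as noted in the text.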
Each resolution used selection pressures from 0.0 (where the number of pixels compared in tournament selection is actually 2) to 1.0, at every tenth percentage in between. A dFE instance is defined by a cluster count, patch resolution, and selection pressure, giving a total of 880 dFEs (4 cluster counts * 10 patch resolutions * 11 selection pressures * 2 hyper FEs), and each dFE instance was run 30 times. For each run, the dFE was applied to FRGC-100tst.

Results

The results were obtained by running each dFE listed in Section 4 on FRGC-100tst. To compare the effectiveness of each method, we compare the results of the different selection pressures within a given cluster count and resolution. The results of the best selection pressure for a resolution are compared to the best selection pressures of every other resolution within the cluster group, and this is done for every cluster count. After the best performing FEs are obtained for each cluster count, they are compared to each other as well as to the results of GEFEML. Results are compared using an ANOVA test and a t-test on the recognition accuracies for each cluster-resolution-selection pressure instance.

Table I shows the results of this experiment. The first column shows the methods used. The method DFEval represents dFEs that sampled the HFEval, while the method DFEtrn represents dFEs that sampled the HFEtrn. The two methods are compared to the original GEFEML method, shown as GEFEML. The second column, Feature Extractor, shows the number of clusters used, the resolution, and the selection pressure for a dFE. The third and fourth columns show the computational complexity (CC) and the average recognition accuracy (Acc), respectively, for each method. The computational complexity is the number of pixels processed, or extracted, by each method. Though 880 dFEs were tested, the only ones shown are those that produced superior results to GEFEML.

Table I: Results of DFEval, DFEtrn and GEFEML

Method   Feature Extractor   CC       Acc
DFEval   <16,1.0,0.8>        8064.0   99.82%
         <16,0.9,0.5>        7257.6   99.69%
         <16,0.8,0.4>        6451.2   99.85%
         <16,0.7,0.5>        5644.8   99.60%
         <12,1.0,0.2>        6048.0   99.66%
         <12,0.9,1.0>        5443.2   99.39%
DFEtrn   <12,0.5,0.2>        3024.0   98.70%
GEFEML   ----                6048.0   99.10%

For DFEval, each dFE shown in Table I outperformed GEFEML in terms of recognition accuracy. For DFEtrn, the dFE <12,0.5,0.2> was statistically equivalent to the FEs evolved using GEFEML. Though we compare results based on recognition accuracy, we also considered computational complexity. The results show that the <12,0.5,0.2> dFE (of DFEtrn) outperforms GEFEML in terms of computational complexity, and that the <12,0.9,0.1> instance of DFEval outperformed DFEtrn and GEFEML in terms of recognition accuracy as well as computational complexity. These results are promising in terms of both the recognition performance and the feature reduction of DFE.

Conclusion and Future Work

The results of the experiment suggest that the HFEval produces dFEs that generalize well to unseen subjects. The dFEs resulting from the HFEtrn also generalized well, but were not as effective as the dFEs resulting from the HFEval. The dFEs built from both hyper FEs performed better than the set of generalizing FEs from GEFEML. Future work will be devoted to using additional GECs for the DFE.

Acknowledgment

This research was funded by the Office of the Director of National Intelligence (ODNI), Center for Academic Excellence (CAE) for the multi-university Center for Advanced Studies in Identity Sciences (CASIS), NSF S-STEM, Lockheed Martin, and the National Science Foundation (NSF) Science & Technology Center: Bio/computational Evolution in Action CONsortium (BEACON). The authors would like to thank the ODNI, the NSF, and Lockheed Martin for their support of this research.

References

Adams, J.; Woodard, D.L.; Dozier, G.; Miller, P.; Bryant, K.; Glenn, G. "Genetic-Based Type II Feature Extraction for Periocular Biometric Recognition: Less is More," Proceedings of the 20th International Conference on Pattern Recognition (ICPR), pp. 205-208, 23-26 Aug. 2010.

Ahonen, T.; Hadid, A.; Pietikainen, M. "Face Description with Local Binary Patterns: Application to Face Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 2037-2041, Dec. 2006.

Alford, A.; Popplewell, K.; Dozier, G.; Bryant, K.; Kelly, J.; Adams, J.; Abegaz, T.; Shelton, J. "GEFeWS: A Hybrid Genetic-Based Feature Weighting and Selection Algorithm for Multi-Biometric Recognition," Proceedings of the 2011 Midwest Artificial Intelligence and Cognitive Science Conference (MAICS), April 16-17, Cincinnati, OH, 2011.

Alford, A.; Steed, C.; Jeffrey, M.; Sweet, D.; Shelton, J.; Small, L.; Leflore, D.; Dozier, G.; Bryant, K.; Abegaz, T.; Kelly, J.C.; Ricanek, K. "Genetic & Evolutionary Biometrics: Hybrid Feature Selection and Weighting for a Multi-Modal Biometric System," IEEE SoutheastCon, 2012.

Dozier, G.; Purrington, K.; Popplewell, K.; Shelton, J.; Bryant, K.; Adams, J.; Woodard, D.L.; Miller, P. "GEFeS: Genetic & Evolutionary Feature Selection for Periocular Biometric Recognition," Proceedings of the 2011 IEEE Workshop on Computational Intelligence in Biometrics and Identity Management (CIBIM-2011), April 11-15, Paris, France, 2011.

Galbally, J.; Fierrez, J.; Freire, M.R.; Ortega-Garcia, J. "Feature Selection Based on Genetic Algorithms for On-Line Signature Verification," 2007 IEEE Workshop on Automatic Identification Advanced Technologies, pp. 198-203, 7-8 June 2007.

Kumar, D.; Kumar, S.; Rai, C.S. "Feature selection for face recognition: a memetic algorithmic approach," Journal of Zhejiang University Science A, vol. 10, no. 8, pp. 1140-1152, 2009.

Larranaga, P.; Lozano, J.A. Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers, 2002.

Miller, B.L.; Goldberg, D.E. "Genetic algorithms, selection schemes, and the varying effects of noise," Evolutionary Computation, vol. 4, no. 2, pp. 113-131, June 1996.

Ojala, T.; Pietikainen, M. "Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 971-987, 2002.

Phillips, P.J.; Flynn, P.J.; Scruggs, T.; Bowyer, K.W.; Chang, J.; Hoffman, K.; Marques, J.; Min, J.; Worek, W. "Overview of the face recognition grand challenge," IEEE Conference on Computer Vision and Pattern Recognition, 2005.

Popplewell, K.; Dozier, G.; Bryant, K.; Alford, A.; Adams, A.; Abegaz, T.; Purrington, K.; Shelton, J. "A Comparison of Genetic Feature Selection and Weighting Techniques for Multi-Biometric Recognition," Proceedings of the 2011 ACM Southeast Conference, March 24-26, Kennesaw, GA, 2011.

Ramadan, R.M.; Abdel-Kader, R.F. "Face Recognition Using Particle Swarm Optimization-Based Selected Features," International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 2, no. 2, June 2009.

Shelton, J.; Dozier, G.; Bryant, K.; Smalls, L.; Adams, J.; Popplewell, K.; Abegaz, T.; Woodard, D.; Ricanek, K. "Comparison of Genetic-based Feature Extraction Methods for Facial Recognition," Proceedings of the 2011 Midwest Artificial Intelligence and Cognitive Science Conference (MAICS-2011), April 16-17, Cincinnati, OH, 2011a.

Shelton, J.; Dozier, G.; Bryant, K.; Smalls, L.; Adams, J.; Popplewell, K.; Abegaz, T.; Woodard, D.; Ricanek, K. "Genetic and Evolutionary Feature Extraction via X-TOOLSS," Proceedings of the 8th Annual International Conference on Genetic and Evolutionary Methods (GEM), 2011b.

Shelton, J.; Bryant, K.; Abrams, S.; Small, L.; Adams, J.; Leflore, D.; Alford, A.; Ricanek, K.; Dozier, G. "Genetic & Evolutionary Biometric Security: Disposable Feature Extractors for Mitigating Biometric Replay Attacks," Proceedings of the 10th Annual Conference on Systems Engineering Research (CSER), 2012a.

Shelton, J.; Dozier, G.; Adams, J.; Alford, A. "Permutation-Based Biometric Authentication Protocols for Mitigating Replay Attacks," Proceedings of the 2012 IEEE World Congress on Computational Intelligence (WCCI 2012), 2012b.

Shelton, J.; Alford, A.; Abegaz, T.; Small, L.; Leflore, D.; Williams, J.; Adams, J.; Dozier, G.; Bryant, K. "Genetic & Evolutionary Biometrics: Feature Extraction from a Machine Learning Perspective," Proceedings of IEEE SoutheastCon, 2012c.

Spears, W.M.; DeJong, K.A. "An analysis of multi-point crossover," Proceedings of the 1990 Workshop on the Foundations of Genetic Algorithms, pp. 301-315, 1991.

Tsekeridou, S.; Pitas, I. "Facial feature extraction in frontal views using biometric analogies," Proceedings of the IX European Signal Processing Conference, I:315-318, 1998.

Zhao, W.; Chellappa, R.; Phillips, P.J.; Rosenfeld, A. "Face recognition: A literature survey," ACM Computing Surveys, vol. 35, no. 4, pp. 399-458, December 2003.