
Face Recognition Research @ AGC Group of AIT
(Latest update Sept. 15, 2011)

Face recognition workflow

Click on the blocks of the face recognition workflow diagram to navigate to the associated topic.

Facial image preprocessing

Recognition of correctly registered, expressionless faces is mature. Illumination variations can be controlled using near-infrared illumination and cameras, or image processing. Techniques for the latter are discussed in the following subsections. Note that these techniques can be combined, but also that they offer a performance boost only when the faces are frontal and have approximately neutral expressions. For more information on these techniques see [1].
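One common image-processing option for taming illumination variations is histogram equalization. The sketch below is a minimal NumPy illustration, not the preprocessing pipeline used in our work; the function name and the 8-bit grayscale assumption are ours:

```python
import numpy as np

def equalize_histogram(face, levels=256):
    """Flatten the intensity histogram of an 8-bit grayscale face crop.

    A simple illumination-normalisation step; `face` is a 2-D uint8 array.
    """
    hist = np.bincount(face.ravel(), minlength=levels)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]                                   # normalise CDF to [0, 1]
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[face]                                 # map pixels through the LUT
```

The lookup table maps each intensity to its (scaled) cumulative frequency, spreading a low-contrast face over the full intensity range.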
Feature extraction

Transformation-based

The Discrete Cosine Transform changes the image data from the spatial to the frequency domain. It is applied on 8x8 blocks, yielding a vector of a number of the low-frequency coefficients, excluding the DC one. Each such vector is normalised to unit norm and is concatenated with the vectors obtained from the rest of the 8x8 blocks. Its effect is to discard the local average intensity variations, while at the same time reducing the dimensionality of the input space. Our implementation (MATLAB code) for block-DCT feature extraction can be found here. Another widely used transformation is the Gabor wavelet. Unlike the DCT, this is not applied on the entire image but rather on certain key locations of the face. The most well-known method based on Gabor wavelets is Elastic Bunch Graph Matching. Finally, Local Binary Patterns is a kind of transformation that yields feature vectors from the pixel values. It is extensively used for texture analysis.

Statistical

These are projection algorithms from the image pixel space to lower-dimensional spaces. They can be unsupervised, like PCA (Eigenfaces), or supervised, like LDA (Fisherfaces); both are described below.
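The block-DCT feature extraction described above can be sketched as follows. This is a minimal NumPy illustration, not our MATLAB implementation; the choice of ten low-frequency AC coefficients per block and the zig-zag ordering are assumptions made for the example:

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

# First low-frequency coefficients in JPEG zig-zag order, DC (0,0) excluded
ZIGZAG = [(0, 1), (1, 0), (2, 0), (1, 1), (0, 2),
          (0, 3), (1, 2), (2, 1), (3, 0), (4, 0)]

def block_dct_features(image, block=8, n_coeffs=10):
    """Per 8x8 block: 2-D DCT, keep low-frequency AC coefficients,
    normalise each block's vector to unit norm, then concatenate."""
    C = dct_matrix(block)
    h, w = image.shape
    feats = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            d = C @ image[r:r + block, c:c + block] @ C.T   # 2-D DCT-II
            v = np.array([d[i, j] for i, j in ZIGZAG[:n_coeffs]])
            norm = np.linalg.norm(v)
            feats.append(v / norm if norm > 0 else v)
    return np.concatenate(feats)
```

Dropping the DC coefficient discards each block's average intensity, and the per-block unit-norm step suppresses local contrast variations, as described above.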
Face recognition methods

There are so many different methods because there are many different adverse situations caused by pose, expression and illumination variations, or simply the passage of time.

Eigenfaces

Eigenfaces [4] is a linear subspace projection algorithm that uses Principal Component Analysis (PCA). As no class label information is used in PCA, the projection is estimated in an unsupervised manner. After the linear projection, the resulting recognition space is of much lower dimension. The PCA feature vectors are robust to noise and minor head rotations, but not to illumination changes [5]. Since its introduction in 1991, the eigenface technique has seen many modifications. In [5] the influence of distance metrics and eigenvector selection on PCA performance is analysed. Eigenvector selection consists of discarding a few of the eigenvectors with the largest eigenvalues and/or some of those with the smallest. This is motivated by the empirical observation that the discarded eigenvectors with the largest eigenvalues encode direction-of-illumination changes. Discarding three such eigenvectors is shown in [5] to greatly enhance PCA performance.

Fisherfaces

PCA maximizes the total scatter of the training vectors while reducing their dimensionality. It is optimum in the mean-squared error sense for representation in the resulting subspace, but offers no guarantee of optimality for classification. Linear Discriminant Analysis (LDA), on the other hand, does take class labels into account and maximizes the between-class scatter under the constraint that the within-class scatter is minimized. This results in compact clusters for each class, as far apart from each other as possible. This projection is optimum for classification and is supervised. A PCA+LDA combination, termed Fisherfaces, was introduced in [5] and was proven robust to illumination changes.
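A minimal Eigenfaces sketch in NumPy, following the description above. The function names and component count are illustrative choices; the drop_first option discards the leading eigenvectors, which [5] reports mainly encode illumination changes:

```python
import numpy as np

def train_eigenfaces(faces, n_components=20, drop_first=0):
    """faces: (n_samples, n_pixels) matrix, one flattened face per row.

    Returns the mean face and a projection matrix. drop_first=3 reproduces
    the eigenvector-selection heuristic of [5].
    """
    mean = faces.mean(axis=0)
    X = faces - mean
    # Principal directions via SVD of the centred data (rows of Vt are
    # eigenvectors of the covariance, sorted by decreasing eigenvalue)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[drop_first:drop_first + n_components]     # (n_components, n_pixels)
    return mean, W

def project(faces, mean, W):
    """Project (centred) faces into the low-dimensional eigenface space."""
    return (faces - mean) @ W.T

def nearest_neighbour(probe, gallery, labels):
    """Classify a projected probe vector by Euclidean distance."""
    d = np.linalg.norm(gallery - probe, axis=1)
    return labels[np.argmin(d)]
```

Classification in the projected space is typically done with a nearest-neighbour rule, as sketched; [5] analyses the effect of the distance metric used here.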
For an in-depth analysis of the different options of Eigenfaces and Fisherfaces, the two most widely used subspace projection face recognition methods, see [6]. We have proposed a variant of Fisherfaces, based on subclass LDA, that is suitable for the complicated face manifolds obtained in multiple-pose and multiple-expression face recognition. For details, see the Subclass LDA section, or [1].

Elastic Bunch Graph Matching

EBGM [7] assumes that the positions of certain facial features are known for each training image. The image regions around these features are convolved with 40 complex 2D Gabor kernels. The resulting 80 coefficients constitute the Gabor jet for each facial feature. The Gabor jets for all facial features are grouped in a graph, the Face Graph, where each jet is a node and the distances between facial features are the weights on the corresponding edges. The information in the Face Graph is all that is needed for recognition; the image itself is discarded. All Face Graphs from the training images are combined in a stack-like structure called the Face Bunch Graph (FBG). Each node of the FBG contains a list of the Gabor jets for the corresponding facial feature from all training images, and the edges are now weighted with the average distances across the training set. The positions of the facial features in the testing images are unknown; EBGM estimates them based on the FBG. A Face Graph can then be constructed for each testing image based on the estimated positions. The Face Graph of each testing image is compared with the FBG to determine the training image it is most similar to, according to some jet-based metric. In [7], a number of such metrics are proposed, most of which can also be used for the feature estimation step. Our results indicate that Displacement Estimation Local Search is the best choice for facial feature localization; for the actual identification stage, Displacement Estimation Grid Search yields the best recognition rate.
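The Gabor-jet computation at a single facial-feature location might be sketched as below. This is an illustrative NumPy version with the commonly used 5 scales x 8 orientations (40 complex kernels, hence 80 coefficients, here magnitudes plus phases); the kernel parametrisation and function names are assumptions, not necessarily the exact choices of [7]:

```python
import numpy as np

def gabor_kernel(k, theta, size=31, sigma=2 * np.pi):
    """Complex 2-D Gabor kernel with wave number k and orientation theta.

    Gaussian envelope times a complex plane wave, with the DC response
    subtracted so the kernel ignores the local mean intensity.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    kx, ky = k * np.cos(theta), k * np.sin(theta)
    envelope = (k**2 / sigma**2) * np.exp(-k**2 * (x**2 + y**2) / (2 * sigma**2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2)
    return envelope * carrier

def gabor_jet(image, row, col, n_scales=5, n_orient=8, size=31):
    """Responses of 40 Gabor kernels at one facial-feature location:
    40 magnitudes followed by 40 phases = 80 coefficients."""
    half = size // 2
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    jet = []
    for nu in range(n_scales):
        k = np.pi * 2.0 ** (-(nu + 2) / 2.0)         # decreasing frequencies
        for mu in range(n_orient):
            theta = mu * np.pi / n_orient
            jet.append(np.sum(patch * gabor_kernel(k, theta, size)))
    jet = np.array(jet)
    return np.concatenate([np.abs(jet), np.angle(jet)])
```

The jet-based similarity metrics of [7] then compare such 80-coefficient vectors between a testing Face Graph and the FBG.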
Pseudo Two-Dimensional Hidden Markov Models

Face recognition using HMMs is based on approximating blocks from the face image with a chain of states of a stochastic model [8]. For the pseudo 2D HMM, the image blocks are extracted by scanning the face from left to right and top to bottom, with overlap in both the horizontal and vertical directions. Pixel intensities do not lead to robust features, as they are susceptible to illumination changes and other detrimental effects. A transformation like the 2D Discrete Cosine Transform attenuates those distorting effects, leading to better performance. A pseudo 2D HMM of hidden states is obtained by linking left-right 1D HMMs with vertical superstates. For the training of each class the Baum-Welch algorithm is used. In the recognition phase, the class that gives the highest value for the probability of the observation sequence of the testing image, given the class model, is considered the most likely to be the true identity of the testing face. Our results indicate that closely cropping the faces and using mixtures of three Gaussians for the states enhance performance.

Correlation Filters

Face recognition can be performed by cross-correlating a face image with a suitable filter and processing the output. Many correlation filters have been proposed [9]; amongst them, the Minimum Average Correlation Energy (MACE) filter is reported to perform best. It reduces the large sidelobes by minimizing the average energy of the correlation plane while at the same time satisfying the correlation peak constraints at the origin. These constraints result in a correlation plane that is close to zero everywhere except at the location of a trained object, where a sharp peak appears. For recognition, the output plane is searched for its highest point, and that value, as well as the values of its surrounding points, is used to determine the class that the face belongs to. For implementation and testing of correlation filters refer to this MSc thesis.
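The MACE filter has a closed-form frequency-domain solution, h = D^-1 X (X^+ D^-1 X)^-1 u, where the columns of X are the FFTs of the training images, D is the diagonal average power spectrum, and u holds the desired correlation peaks at the origin. A minimal NumPy sketch under these definitions (the image sizes and the unit peak constraints are illustrative):

```python
import numpy as np

def mace_filter(images):
    """Minimum Average Correlation Energy filter in the frequency domain.

    images: list of same-sized 2-D training images of one person.
    Returns H, the filter's 2-D frequency response, constrained so that the
    frequency-domain inner product with every training image equals 1.
    """
    shape = images[0].shape
    # d x N matrix of flattened image spectra
    X = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)
    Dinv = 1.0 / np.mean(np.abs(X) ** 2, axis=1)   # inverse avg power spectrum
    u = np.ones(X.shape[1])                        # unit correlation peaks
    A = X.conj().T @ (Dinv[:, None] * X)           # X^+ D^-1 X  (N x N)
    h = (Dinv[:, None] * X) @ np.linalg.solve(A, u)
    return h.reshape(shape)
```

The D^-1 weighting is what suppresses the average correlation-plane energy (the sidelobes), while the solve enforces the peak constraints exactly for the training images.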
Laplacianfaces

The Eigenfaces and Fisherfaces algorithms both use a linear projection to a subspace, aiming to preserve the global structure of the face. Laplacianfaces [10], on the other hand, is an algorithm that uses optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator. By aiming to preserve the local structure of the face, Laplacianfaces attempts to attenuate the unwanted variations resulting from changes in lighting, facial expression and pose. In this respect, Laplacianfaces shares many of the properties of nonlinear projection algorithms. For implementation and testing of the Laplacianfaces algorithm refer to this MSc thesis.

Effect of face registration

Correct face registration is important, as without it the within-class variations become more pronounced than the between-class ones. In [11], we show that different recognition methods tolerate different amounts of face registration error.
Subclass LDA

The challenge posed by the large variation in pose, expression and illumination of the far-field, non-attentive video data is that the resulting face manifolds of different people are not linearly separable, i.e., different poses of different people under different illuminations and expressions are highly similar and confusable.
In our work [15], we utilise a nearest-neighbour approach to automatically select the subclasses. It employs hierarchical bottom-up tree clustering of the training vectors of every class. After the tree is built, we prune it at the appropriate depth.
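The subclass selection step might look like the following sketch, which uses SciPy's single-linkage (nearest-neighbour) agglomerative clustering and prunes the tree to a requested number of subclasses. In our actual method the pruning depth is chosen automatically, which is not reproduced here:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def subclass_partition(class_vectors, n_subclasses):
    """Bottom-up nearest-neighbour clustering of one class's training
    vectors, pruned so that n_subclasses subclasses remain.

    class_vectors: (n_samples, n_features) array for a single class.
    Returns an integer subclass label per training vector.
    """
    Z = linkage(class_vectors, method='single')    # agglomerative tree
    return fcluster(Z, t=n_subclasses, criterion='maxclust')
```

Each class is partitioned this way, and subclass LDA then uses the subclass labels in place of (or alongside) the class labels when building the scatter matrices.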
Performance on the CLEAR 2007 dataset

References

[1] A. Pnevmatikakis and L. Polymenakos, ‘Subclass Linear Discriminant Analysis for Video-Based Face Recognition,’ Journal of Visual Communication and Image Representation, vol. 20, no. 8, pp. 543-551, Nov. 2009.
[2] B.S. Venkatesh, S. Palanivel and B. Yegnanarayana, ‘Face Detection and Recognition in an Image Sequence using Eigenedginess,’ Third Indian Conference on Computer Vision, Graphics and Image Processing, Ahmedabad, India, Dec. 2002.
[3] M. Bartlett, J. Movellan and T. Sejnowski, ‘Face Recognition by Independent Component Analysis,’ IEEE Trans. on Neural Networks, vol. 13, no. 6, pp. 1450-1464, 2002.
[4] M. Turk and A. Pentland, ‘Eigenfaces for Recognition,’ J. Cognitive Neuroscience, pp. 71-86, 1991.
[5] P. Belhumeur, J. Hespanha and D. Kriegman, ‘Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,’ IEEE Trans. Pattern Analysis and Machine Intelligence, pp. 711-720, 1997.
[6] A. Pnevmatikakis and L. Polymenakos, ‘Subspace Projection Face Recognition: Comparison of Methods and Metrics,’ technical report, 2004.
[7] L. Wiskott, J.-M. Fellous, N. Krueger and C. von der Malsburg, ‘Face Recognition by Elastic Bunch Graph Matching,’ in L.C. Jain et al. (eds.), Intelligent Biometric Techniques in Fingerprint and Face Recognition, CRC Press, pp. 355-396, 1999.
[8] F. Samaria and A. Harter, ‘Parametrisation of a Stochastic Model for Human Face Identification,’ in 2nd IEEE Workshop on Applications of Computer Vision, pp. 138-142, 1994.
[9] C. Xie, B.V.K. Vijaya Kumar, S. Palanivel and B. Yegnanarayana, ‘A Still-to-Video Face Verification System Using Advanced Correlation Filters,’ in International Conference on Biometric Authentication, pp. 102-108, 2004.
[10] X. He, S. Yan, Y. Hu, P. Niyogi and H.-J. Zhang, ‘Face Recognition Using Laplacianfaces,’ IEEE Trans. Pattern Analysis and Machine Intelligence, no. 3, pp. 328-340, 2005.
[11] E. Rentzeperis, A. Stergiou, A. Pnevmatikakis and L. Polymenakos, ‘Impact of Face Registration Errors on Recognition,’ in I. Maglogiannis, K. Karpouzis and M. Bramer (eds.), Artificial Intelligence Applications and Innovations (AIAI06), Springer, Berlin Heidelberg, pp. 187-194, June 2006.
[12] G. Baudat and F. Anouar, ‘Generalized Discriminant Analysis Using a Kernel Approach,’ Neural Computation, vol. 12, no. 10, pp. 2385-2404, 2000.
[13] M. Zhu and A.M. Martinez, ‘Subclass Discriminant Analysis,’ IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 8, pp. 1274-1286, 2006.
[14] K. Fukunaga, Statistical Pattern Recognition, Academic Press, 1990.
[15] A. Pnevmatikakis and L. Polymenakos, ‘Subclass Linear Discriminant Analysis for Video-Based Face Recognition,’ Journal of Visual Communication and Image Representation, vol. 20, no. 8, pp. 543-551, 2009.

Contact: Aristodemos Pnevmatikakis, apne@ait.edu.gr
