Research @ AGC Group of AIT
(Latest update Sept. 15, 2011)
Click on the blocks of the face recognition workflow diagram to navigate to the associated topic.
Recognition of correctly registered, expressionless faces is mature. Illumination variations can be controlled using near-infrared illumination and cameras, or image processing. Techniques for the latter are discussed in the following subsections. Note that these techniques can be combined, but also that they offer a performance boost only when the faces are frontal and have approximately neutral expressions. For more information on these techniques, see the references below.
The Discrete Cosine Transform changes the image data from the spatial to the frequency domain. It is applied on 8x8 blocks, yielding a vector of low-frequency coefficients, excluding the DC one. Each such vector is normalised to unit norm and is concatenated with the vectors obtained from the rest of the 8x8 blocks. Its effect is to discard the local average intensity variations, while at the same time reducing the dimensionality of the input space. Our implementation (MATLAB code) for block-DCT feature extraction can be found here.
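As a rough illustration of this feature-extraction step (the linked MATLAB code is the actual implementation), a minimal Python sketch follows; the coefficient count and the diagonal frequency ordering are chosen for illustration:

```python
import numpy as np
from scipy.fft import dctn

def block_dct_features(img, block=8, n_coeffs=15):
    """Block-DCT features: per 8x8 block, keep the low-frequency 2D-DCT
    coefficients (DC excluded), normalise to unit norm, concatenate."""
    h, w = img.shape
    # order coefficients by increasing spatial frequency (diagonal order)
    idx = sorted(((r, c) for r in range(block) for c in range(block)),
                 key=lambda rc: (rc[0] + rc[1], rc[0]))
    idx = idx[1:n_coeffs + 1]                 # skip the DC coefficient
    feats = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            d = dctn(img[r:r + block, c:c + block].astype(float), norm='ortho')
            v = np.array([d[i, j] for i, j in idx])
            n = np.linalg.norm(v)
            feats.append(v / n if n > 0 else v)
    return np.concatenate(feats)

rng = np.random.default_rng(0)
face = rng.random((32, 32))                   # toy "face" image
f = block_dct_features(face)                  # 16 blocks x 15 coefficients
print(f.shape)
```

The unit-norm step is what discards the per-block average intensity variation mentioned above.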
Another widely used transformation is the Gabor wavelet. Unlike DCT, this is not applied on the entire image but rather on certain key locations of the face. The most well-known method based on Gabor Wavelets is Elastic Bunch Graph Matching.
Finally, Local Binary Patterns is a kind of transformation that yields feature vectors from the pixel values. It is extensively used for texture analysis.
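The basic 3x3 LBP operator can be sketched as below; in practice the codes are histogrammed over image regions to form the texture feature vector. This is an illustrative Python sketch, not the group's implementation:

```python
import numpy as np

def lbp_8_1(img):
    """Basic 3x3 Local Binary Patterns: each interior pixel becomes an
    8-bit code, one bit per neighbour, set when neighbour >= centre."""
    c = img[1:-1, 1:-1]
    # 8 neighbours, clockwise from top-left
    shifts = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dr, dc) in enumerate(shifts):
        nb = img[dr:dr + c.shape[0], dc:dc + c.shape[1]]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]], dtype=np.uint8)
codes = lbp_8_1(img)          # a single centre pixel here
print(codes[0, 0])            # bits 3..6 set: 8+16+32+64 = 120
```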
These are projection algorithms from the image pixel space to lower-dimensional feature spaces. They can be linear or nonlinear, supervised or unsupervised.
There are so many different methods because there are many different adverse situations caused by pose, expression and illumination variations, or simply the passage of time.
Eigenfaces  is a linear subspace projection algorithm that uses Principal Component Analysis (PCA). As no class label information is used in PCA, the projection is estimated in an unsupervised manner. After linear projection, the resulting recognition space is of much lower dimension. The PCA feature vectors are robust to noise and minor head rotations, but not to illumination changes . Since its introduction in 1991, the eigenface technique has seen many modifications. In , the influence of distance metrics and eigenvector selection on PCA performance is analysed. Eigenvector selection consists of discarding a few eigenvectors with the largest eigenvalues and/or some of those with the smallest. This is motivated by the empirical observation that the discarded eigenvectors with the largest eigenvalues encode direction-of-illumination changes. Discarding three such eigenvectors is shown in  to greatly enhance PCA performance.
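The core eigenface computation, including the eigenvector-selection step of discarding the first few components, can be sketched as follows (component counts are illustrative):

```python
import numpy as np

def eigenfaces(X, n_components=20, n_discard=3):
    """PCA on vectorised faces (rows of X). Optionally discard the first
    few eigenvectors, which empirically encode illumination direction."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centred data: rows of Vt are the eigenfaces,
    # ordered by decreasing eigenvalue of the scatter matrix
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[n_discard:n_discard + n_components]    # projection matrix
    return mean, W

def project(x, mean, W):
    return W @ (x - mean)

rng = np.random.default_rng(1)
X = rng.random((40, 64 * 64))                     # 40 toy "face" vectors
mean, W = eigenfaces(X, n_components=10, n_discard=3)
y = project(X[0], mean, W)
print(W.shape, y.shape)
```

Recognition then reduces to nearest-neighbour matching of the projected vectors under some distance metric.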
PCA maximizes the total scatter of the training vectors while reducing their dimension. It is optimum in the mean-squared-error sense for representation in the resulting subspace, but offers no guarantee of optimality for classification. Linear Discriminant Analysis (LDA), on the other hand, does take class labels into account and maximizes the between-class scatter under the constraint that the within-class scatter is minimized. This results in compact clusters for each class, as far from each other as possible. This projection is supervised and optimum for classification. A PCA+LDA combination, termed Fisherfaces, was introduced in  and proven robust to illumination changes.
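A minimal Fisherfaces-style pipeline on toy data, using scikit-learn's PCA and LDA; the PCA step keeps at most N - C components so that the within-class scatter matrix LDA inverts is non-singular:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
n_classes, per_class, dim = 5, 8, 256
# toy data: each class is a cluster around its own mean
means = rng.normal(scale=3.0, size=(n_classes, dim))
X = np.vstack([m + rng.normal(size=(per_class, dim)) for m in means])
y = np.repeat(np.arange(n_classes), per_class)

# PCA first (to N - C dimensions), then LDA finds at most C-1
# discriminant axes that separate the class clusters
pca = PCA(n_components=n_classes * per_class - n_classes)
Xp = pca.fit_transform(X)
lda = LinearDiscriminantAnalysis()
Xl = lda.fit_transform(Xp, y)
print(Xl.shape)    # at most C-1 = 4 discriminant directions
```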
For an in-depth analysis of the different options of Eigenfaces and Fisherfaces, the two most widely used subspace projection face recognition methods, see .
We have proposed a variant of Fisherfaces, based on sub-class LDA, that is suitable for the complicated face manifolds obtained in the multiple pose and expression face recognition. For details, see the Sub-class LDA section, or .
EBGM  assumes that the positions of certain facial features are known for each training image. The image regions around these features are convolved with 40 complex 2D Gabor kernels. The resulting 80 coefficients constitute the Gabor jet for each facial feature. The Gabor jets for all facial features are grouped in a graph, the Face Graph, where each jet is a node and the distances between facial features are the weights of the corresponding edges. The information in the Face Graph is all that is needed for recognition; the image itself is discarded. All Face Graphs from the training images are combined in a stack-like structure called the Face Bunch Graph (FBG). Each node of the FBG contains a list of Gabor jets for the corresponding facial feature from all training images, and the edges are now weighted with the average distances across the training set. The positions of the facial features in the testing images are unknown; EBGM estimates them based on the FBG. Then a Face Graph can be constructed for each testing image based on the estimated positions. The Face Graph of each testing image is compared with the FBG to determine the training image it is most similar to, according to some jet-based metric. In , a number of such metrics are proposed, most of which can also be used for the feature estimation step. Our results indicate that Displacement Estimation Local Search is the best choice for facial feature localization; for the actual identification stage, Displacement Estimation Grid Search yields the best recognition rate.
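The jet-extraction step can be sketched as follows. The kernel bank (5 frequencies x 8 orientations = 40 complex kernels, 80 magnitude/phase coefficients) follows the usual EBGM setup, but the exact parameter values here are illustrative:

```python
import numpy as np

def gabor_kernel(size, freq, theta, sigma=2.0):
    """Complex 2D Gabor kernel: a plane wave restricted by a Gaussian."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.exp(1j * 2 * np.pi * freq * rot)

def gabor_jet(img, r, c, size=15):
    """40 kernels (5 frequencies x 8 orientations) applied at one facial
    feature point; magnitude and phase give the 80 jet coefficients."""
    half = size // 2
    patch = img[r - half:r + half + 1, c - half:c + half + 1].astype(float)
    jet = []
    for k in range(5):                        # frequency index
        for o in range(8):                    # orientation index
            kern = gabor_kernel(size, freq=0.25 / 2**(k / 2),
                                theta=o * np.pi / 8)
            resp = np.sum(patch * np.conj(kern))   # filter response at (r, c)
            jet += [np.abs(resp), np.angle(resp)]
    return np.array(jet)

rng = np.random.default_rng(3)
face = rng.random((64, 64))
jet = gabor_jet(face, 32, 32)                  # jet at one feature point
print(jet.shape)
```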
Pseudo Two-Dimensional Hidden Markov Models
Face recognition using HMM is based on approximating blocks from the face image with a chain of states of a stochastic model . For the pseudo 2D HMM, the image blocks are extracted by scanning the face from left to right, top to bottom, with an overlap in both the horizontal and vertical directions. Pixel intensities do not lead to robust features, as they are susceptible to illumination changes and other detrimental effects. A transformation like the 2D Discrete Cosine Transform attenuates those distorting effects, leading to better performance. A pseudo 2D HMM model of hidden states is obtained by linking left-right 1D HMM models with vertical super-states. For the training of each class, the Baum-Welch algorithm is used. In the recognition phase, the class that gives the highest value for the probability of the observation sequence of the testing image, given the class model, is considered the most likely to be the true identity of the testing face. Our results indicate that closely cropping the faces and using a mixture of three Gaussians for the states enhance performance.
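The block-scanning and 2D-DCT observation-extraction step can be sketched as follows. Block size, overlap and coefficient count are illustrative, and the HMM training itself (Baum-Welch, e.g. via a library such as hmmlearn) is omitted:

```python
import numpy as np
from scipy.fft import dctn

def hmm_observations(img, block=(12, 12), step=(4, 4), n_coeffs=10):
    """Scan the face left-to-right, top-to-bottom with overlapping blocks;
    each block yields a low-frequency 2D-DCT observation vector. Returns
    one observation sequence per block row (fed to the row's 1D HMM)."""
    bh, bw = block
    sh, sw = step
    # low-frequency coefficient positions, in diagonal order
    idx = sorted(((r, c) for r in range(bh) for c in range(bw)),
                 key=lambda rc: (rc[0] + rc[1], rc[0]))[:n_coeffs]
    rows = []
    for r in range(0, img.shape[0] - bh + 1, sh):
        row = []
        for c in range(0, img.shape[1] - bw + 1, sw):
            d = dctn(img[r:r + bh, c:c + bw].astype(float), norm='ortho')
            row.append([d[i, j] for i, j in idx])
        rows.append(np.array(row))
    return rows

rng = np.random.default_rng(4)
face = rng.random((48, 40))
obs = hmm_observations(face)
print(len(obs), obs[0].shape)   # block rows x (blocks per row, coeffs)
```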
Face recognition can be performed by cross-correlating a face image with a suitable filter and processing the output. Many correlation filters have been proposed ; amongst them, the Minimum Average Correlation Energy (MACE) filter is reported to perform best. It reduces the large sidelobes by minimizing the average energy of the correlation plane while at the same time satisfying the correlation peak constraints at the origin. These constraints result in a correlation plane that is close to zero everywhere except at the location of a trained object, where a sharp peak appears. For recognition, the output plane is searched for its highest point, and that value, as well as the values of the surrounding points, is used to determine the class the face belongs to. For implementation and testing of correlation filters, refer to this MSc thesis.
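The MACE filter has a closed-form frequency-domain solution, h = D^-1 X (X^+ D^-1 X)^-1 u, where the columns of X hold the training-image FFTs, D is the diagonal matrix of the average power spectrum, and u holds the peak constraints. A sketch on toy data (sizes and data are illustrative):

```python
import numpy as np

def mace_filter(train_imgs, u=None):
    """Synthesise a MACE filter in the frequency domain:
    h = D^-1 X (X^+ D^-1 X)^-1 u."""
    X = np.stack([np.fft.fft2(im).ravel() for im in train_imgs], axis=1)
    if u is None:
        u = np.ones(X.shape[1])              # unit correlation peaks at origin
    d = np.mean(np.abs(X)**2, axis=1)        # average power spectrum (diag of D)
    Dinv_X = X / d[:, None]
    A = X.conj().T @ Dinv_X                  # small N x N system
    h = Dinv_X @ np.linalg.solve(A, u)
    return h.reshape(train_imgs[0].shape)

rng = np.random.default_rng(5)
imgs = [rng.random((16, 16)) for _ in range(3)]
H = mace_filter(imgs)
# the constraint X^+ h = u forces a unit correlation value at the
# origin for every training image:
peak = np.vdot(np.fft.fft2(imgs[0]).ravel(), H.ravel())
print(abs(peak))
```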
The Eigenfaces and Fisherfaces algorithms both use linear projection to a subspace, aiming to preserve the global structure of the face. Laplacianfaces , on the other hand, is an algorithm that uses optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator. By aiming to preserve the local structure of the face, Laplacianfaces attempts to attenuate the unwanted variations resulting from changes in lighting, facial expression and pose. In this respect, Laplacianfaces shares many of the properties of nonlinear projection algorithms. For implementation and testing of the Laplacianfaces algorithm, refer to this MSc thesis.
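Computationally, Laplacianfaces reduces to Locality Preserving Projections: a generalized eigenproblem built on a neighbourhood graph of the training vectors. A small-scale sketch (graph parameters are illustrative; in practice a PCA step precedes this, since the matrices below have the pixel dimension):

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def lpp(X, n_components=2, k=5, t=1.0):
    """Locality Preserving Projections: solve X^T L X a = lambda X^T D X a
    for the smallest eigenvalues, where L = D - W is the Laplacian of a
    k-NN heat-kernel affinity graph over the training vectors (rows of X)."""
    n = X.shape[0]
    d2 = cdist(X, X, 'sqeuclidean')
    W = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(d2[i])[1:k + 1]              # k nearest neighbours
        W[i, nn] = np.exp(-d2[i, nn] / t)            # heat-kernel weights
    W = np.maximum(W, W.T)                           # symmetrise the graph
    D = np.diag(W.sum(axis=1))
    L = D - W
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])      # regularise for stability
    vals, vecs = eigh(A, B)                          # ascending eigenvalues
    return vecs[:, :n_components]                    # locality-preserving axes

rng = np.random.default_rng(6)
X = rng.random((30, 10))
P = lpp(X)
print(P.shape)
```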
Effect of face registration
Correct face registration is important, as without it the within-class variations become more pronounced than the between-class ones. In , we show that different recognition methods tolerate different amounts of face registration error.
The challenge posed by the large variation in pose, expression and illumination of the far-field, non-attentive video data is that the resulting face manifolds of different people are not linearly separable, i.e., different poses of different people under different illumination and expressions are highly similar and confusable.
In our work , we use a nearest-neighbour approach to automatically select the subclasses, based on hierarchical bottom-up tree clustering of the training vectors of every class. After the tree is built, we prune it at the appropriate depth to obtain the subclasses.
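The subclass-selection idea can be sketched with SciPy's hierarchical clustering, where single linkage plays the role of the nearest-neighbour merging rule; the data and the subclass count here are illustrative:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy stand-in for the training vectors of one class:
# two pose/expression clusters within the same identity
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0.0, 0.3, size=(10, 4)),
               rng.normal(3.0, 0.3, size=(10, 4))])

# Bottom-up tree: repeatedly merge the nearest (single-link) clusters
Z = linkage(X, method='single')
# Prune the tree so that the class splits into a fixed number of subclasses
labels = fcluster(Z, t=2, criterion='maxclust')
print(sorted(set(labels)))
```

Each resulting subclass then acts as a separate class mean in the sub-class LDA scatter matrices.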
Performance on the CLEAR 2007 dataset
A. Pnevmatikakis and L. Polymenakos, ‘Subclass Linear Discriminant Analysis for Video-Based Face Recognition,’ Journal of Visual Communication and Image Representation, vol. 20, no. 8, pp. 543-551, Nov. 2009.
B.S. Venkatesh, S. Palanivel and B. Yegnanarayana, ‘Face Detection and Recognition in an Image Sequence using Eigenedginess,’ Third Indian Conference on Computer Vision, Graphics and Image Processing, Ahmedabad, India, Dec. 2002.
M. Bartlett, J. Movellan and T. Sejnowski, ‘Face Recognition by Independent Component Analysis,’ IEEE Trans. on Neural Networks, vol. 13, no. 6, pp. 1450-1464, 2002.
P. Belhumeur, J. Hespanha and D. Kriegman, ‘Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,’ IEEE Trans. Pattern Analysis and Machine Intelligence, pp. 711-720, 1997.
A. Pnevmatikakis and L. Polymenakos, ‘Subspace Projection Face Recognition: Comparison of Methods and Metrics,’ technical report, 2004.
L. Wiskott, J.-M. Fellous, N. Krueger and C. von der Malsburg, ‘Face Recognition by Elastic Bunch Graph Matching,’ in L.C. Jain et al. (eds.), Intelligent Biometric Techniques in Fingerprint and Face Recognition, CRC Press, pp. 355-396, 1999.
F. Samaria and A. Harter, ‘Parametrisation of a Stochastic Model for Human Face Identification,’ 2nd IEEE Workshop on Applications of Computer Vision, pp. 138-142, 1994.
C. Xie, B.V.K. Vijaya Kumar, S. Palanivel and B. Yegnanarayana, ‘A Still-to-Video Face Verification System Using Advanced Correlation Filters,’ International Conference on Biometric Authentication, pp. 102-108, 2004.
E. Rentzeperis, A. Stergiou, A. Pnevmatikakis and L. Polymenakos, ‘Impact of Face Registration Errors on Recognition,’ in I. Maglogiannis, K. Karpouzis and M. Bramer (eds.), Artificial Intelligence Applications and Innovations (AIAI06), Springer, Berlin Heidelberg, pp. 187-194, June 2006.
G. Baudat and F. Anouar, ‘Generalized Discriminant Analysis Using a Kernel Approach,’ Neural Computation, vol. 12, no. 10, pp. 2385-2404, 2000.
Contact Aristodemos Pnevmatikakis, email@example.com