Face Recognition

Research @ AGC Group of AIT

(Latest update Sept. 15, 2011)

Face recognition workflow

Click on the blocks of the face recognition workflow diagram to navigate to the associated topic.

Workflow: Facial image preprocessing → Feature extraction → Face recognition methods

Facial image preprocessing

Recognition of correctly registered, expressionless faces is mature. Illumination variations can be controlled using near-infrared illumination and cameras, or image processing. Techniques for the latter are discussed in the following subsections. Note that these techniques can be combined, but also that they offer a performance boost only when the faces are frontal and have approximately neutral expressions. For more information on these techniques see [1].

Intensity normalization

This is a simple and quite effective approach. The intensity of the faces is scaled so that its mean is set to 128 and its standard deviation to 40 (to keep the range of pixel values within the unsigned 8-bit integer representation).
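As a sketch in Python with NumPy (the clipping back to the 8-bit range is our assumption for handling out-of-range values):

```python
import numpy as np

def normalize_intensity(face, target_mean=128.0, target_std=40.0):
    """Scale pixel intensities to a fixed mean and standard deviation.

    The targets (128, 40) keep most values inside the unsigned 8-bit range.
    """
    face = face.astype(np.float64)
    std = face.std()
    if std == 0:                      # flat image: just shift the mean
        out = np.full_like(face, target_mean)
    else:
        out = (face - face.mean()) / std * target_std + target_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```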

Histogram equalization

Histogram equalization forces the images to occupy uniformly the available intensity range, alleviating the effects of uneven illumination.
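A minimal NumPy sketch of the classic equalization mapping (the handling of constant images is our addition):

```python
import numpy as np

def equalize_histogram(face):
    """Map each intensity through the image's cumulative histogram so the
    output occupies the full 8-bit range roughly uniformly."""
    face = np.asarray(face, dtype=np.uint8)
    hist = np.bincount(face.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                 # first occupied bin
    if cdf[-1] == cdf_min:                    # constant image: nothing to do
        return face.copy()
    lut = np.round((cdf - cdf_min) / float(cdf[-1] - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[face]                          # apply the lookup table
```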

Edginess filtering

Edginess is a form of high-pass filtering comprising successive horizontal and vertical 1-D filters. It was introduced to face recognition in [2]. Its effect is to retain mostly edge information.
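The separable filtering can be sketched as follows; the Gaussian/derivative-of-Gaussian kernel pair, its size and the magnitude combination of the two directional responses are our assumptions, not the exact filters of [2]:

```python
import numpy as np

def edginess(face, sigma=1.0):
    """Edginess-style filtering: smooth along one axis with a 1-D Gaussian
    and differentiate along the other with its derivative, then combine
    the horizontal and vertical responses into an edge magnitude."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1, dtype=np.float64)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()                                         # smoothing kernel
    dg = -x / sigma**2 * np.exp(-x**2 / (2 * sigma**2))  # derivative kernel

    def conv_rows(img, k):   # 1-D filter applied to every row
        return np.apply_along_axis(np.convolve, 1, img, k, mode='same')

    def conv_cols(img, k):   # 1-D filter applied to every column
        return np.apply_along_axis(np.convolve, 0, img, k, mode='same')

    f = face.astype(np.float64)
    horiz = conv_cols(conv_rows(f, dg), g)   # edges along x
    vert = conv_rows(conv_cols(f, dg), g)    # edges along y
    return np.hypot(horiz, vert)
```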


Facial image preprocessing on part of the BioID face database


Feature extraction

Transformation-based

The Discrete Cosine Transform (DCT) changes the image data from the spatial to the frequency domain. It is applied on 8x8 blocks, yielding a vector of a number of the low-frequency coefficients, excluding the DC one. Each such vector is normalised to unit norm and is concatenated with the vectors obtained from the rest of the 8x8 blocks. Its effect is to discard the local average intensity variations, while at the same time reducing the dimensionality of the input space. Our implementation (MATLAB code) for block-DCT feature extraction can be found here.
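A self-contained NumPy illustration of the idea (the diagonal ordering of coefficients and the number kept are our assumptions; the linked MATLAB code is the reference implementation):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II transform matrix."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0] /= np.sqrt(2)
    return c

def block_dct_features(face, n_coef=5):
    """Block-DCT features: 2-D DCT on each 8x8 block, keep the n_coef
    lowest-frequency AC coefficients (DC excluded), normalise each
    block's vector to unit norm and concatenate."""
    C = dct_matrix(8)
    # low frequencies first: order the 8x8 coefficient grid by diagonal
    order = sorted(((u, v) for u in range(8) for v in range(8)),
                   key=lambda p: (p[0] + p[1], p[0]))[1:1 + n_coef]
    feats = []
    h, w = face.shape
    for i in range(0, h - 7, 8):
        for j in range(0, w - 7, 8):
            d = C @ face[i:i + 8, j:j + 8].astype(np.float64) @ C.T
            vec = np.array([d[u, v] for u, v in order])
            norm = np.linalg.norm(vec)
            feats.append(vec / norm if norm > 0 else vec)
    return np.concatenate(feats)
```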

Another widely used transformation is the Gabor wavelet. Unlike DCT, this is not applied on the entire image but rather on certain key locations of the face. The most well-known method based on Gabor Wavelets is Elastic Bunch Graph Matching.

Finally, Local Binary Patterns (LBP) is a transformation that yields feature vectors directly from the pixel values. It is extensively used for texture analysis.
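The basic 3x3 LBP operator can be sketched as follows; a histogram of the resulting codes would then serve as the feature vector:

```python
import numpy as np

def lbp_image(face):
    """Basic 3x3 Local Binary Patterns: each interior pixel is encoded by
    thresholding its 8 neighbours against it, yielding one byte per pixel."""
    f = np.asarray(face, dtype=np.int32)
    c = f[1:-1, 1:-1]                           # centre pixels
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]   # clockwise neighbours
    code = np.zeros_like(c)
    for bit, (di, dj) in enumerate(offs):
        nb = f[1 + di:f.shape[0] - 1 + di, 1 + dj:f.shape[1] - 1 + dj]
        code |= (nb >= c).astype(np.int32) << bit
    return code
```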

Statistical

These are projection algorithms from the image pixel space to spaces of possibly different dimension. They can be:

  • Linear, resulting in the same or lower dimension. Typical examples are Principal Component Analysis (PCA), used in the Eigenfaces method, and Linear Discriminant Analysis (LDA), used in the Fisherfaces method.
  • Non-linear, resulting in the same dimension (e.g. ICA [3]).
  • Non-linear, resulting in a higher (even infinite) dimension (kernel methods).


Face recognition methods

There are many different methods because recognition must cope with many different adverse conditions, caused by pose, expression and illumination variations, or simply the passage of time.

Eigenfaces

Eigenfaces [4] is a linear subspace projection algorithm that uses Principal Component Analysis (PCA). As no class label information is used in PCA, the projection is estimated in an unsupervised manner. After linear projection, the resulting recognition space is of much lower dimension. The PCA feature vectors are robust to noise and minor head rotations, but not to illumination changes [5]. Since its introduction in 1991, the eigenface technique has seen many modifications. In [5] the influence of distance metrics and eigenvector selection on PCA performance is analysed. Eigenvector selection consists of discarding a few eigenvectors with larger eigenvalues and/or some of those with the smallest. This is attributed to the empirical observation that the discarded eigenvectors with the larger eigenvalues encode direction-of-illumination changes. Discarding three such eigenvectors is shown in [5] to greatly enhance PCA performance.
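The PCA projection with eigenvector selection can be sketched in NumPy as follows; the Gram-matrix trick for computing the eigenvectors is the standard Eigenfaces device, while the default of discarding three leading eigenvectors follows the observation in [5]:

```python
import numpy as np

def eigenfaces_basis(train, n_components, n_drop=3):
    """PCA basis for Eigenfaces with eigenvector selection: the leading
    n_drop eigenvectors, which tend to encode illumination direction,
    are discarded. train is an (n_images, n_pixels) matrix."""
    mean = train.mean(axis=0)
    X = train - mean
    # eigendecomposition of the small (n_images x n_images) Gram matrix
    # instead of the huge pixel-space covariance
    vals, vecs = np.linalg.eigh(X @ X.T)
    order = np.argsort(vals)[::-1]              # largest eigenvalue first
    keep = vecs[:, order][:, n_drop:n_drop + n_components]
    basis = X.T @ keep                          # lift to pixel space
    basis /= np.linalg.norm(basis, axis=0, keepdims=True)
    return mean, basis

# faces are projected as (face - mean) @ basis before nearest-neighbour
# matching in the reduced space
```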

Fisherfaces

PCA maximizes the total scatter of the training vectors while reducing their dimensions. It is optimum in the mean-squared error sense for representation in the resulting subspace, but offers no guarantee of optimality for classification. Linear Discriminant Analysis (LDA) on the other hand does take into account class labels and maximizes the between-class scatter under the constraint that the within-class scatter is minimized. This results in compact clusters for each class, as far from each other as possible. This projection is optimum for classification and is supervised. A PCA+LDA combination, termed Fisherfaces, was introduced in [5] and was proven robust to illumination changes.
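A compact NumPy sketch of the PCA+LDA combination (the choice of PCA dimension and the scatter-matrix formulation are the usual textbook ones, not necessarily the exact recipe of [5]):

```python
import numpy as np

def fisherfaces(train, labels, n_pca):
    """Fisherfaces sketch (PCA followed by LDA). PCA first reduces the
    data so that the within-class scatter becomes non-singular; LDA then
    finds directions maximizing between-class over within-class scatter.
    Returns the data mean and an (n_pixels, n_classes - 1) projection."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    mean = train.mean(axis=0)
    X = train - mean
    _, _, Vt = np.linalg.svd(X, full_matrices=False)  # PCA via SVD
    Wpca = Vt[:n_pca].T
    Y = X @ Wpca
    d = Y.shape[1]
    Sw = np.zeros((d, d))                   # within-class scatter
    Sb = np.zeros((d, d))                   # between-class scatter
    for c in classes:
        Yc = Y[labels == c]
        mc = Yc.mean(axis=0)
        Sw += (Yc - mc).T @ (Yc - mc)
        Sb += len(Yc) * np.outer(mc, mc)    # overall mean of Y is zero
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1][:len(classes) - 1]
    return mean, Wpca @ vecs[:, order].real
```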

LDA vs. PCA: Maximising the between-class scatter under the constraint of minimum within-class scatter leads to a more suitable subspace projection, provided the training set is representative of the classes.


For an in-depth analysis of the different options of Eigenfaces and Fisherfaces, the two most widely used subspace projection face recognition methods, see [6].

We have proposed a variant of Fisherfaces, based on sub-class LDA, that is suitable for the complicated face manifolds obtained in the multiple pose and expression face recognition. For details, see the Sub-class LDA section, or [1].

Elastic Bunch Graph Matching

EBGM [7] assumes that the positions of certain facial features are known for each training image. The image regions around these features are convolved with 40 complex 2D Gabor kernels. The resulting 80 coefficients constitute the Gabor jet for each facial feature. The Gabor jets for all facial features are grouped in a graph, the Face Graph, where each jet is a node and the distances between facial features are the weights of the corresponding edges. The information in the Face Graph is all that is needed for recognition; the image itself is discarded. All Face Graphs from the training images are combined in a stack-like structure called the Face Bunch Graph (FBG). Each node of the FBG contains a list of Gabor jets for the corresponding facial feature from all training images, and the edges are now weighted with the average distances across the training set. The positions of the facial features in the testing images are unknown; EBGM estimates them based on the FBG. Then a Face Graph can be constructed for each testing image based on the estimated positions. The Face Graph of each testing image is compared with the FBG to determine the training image it is most similar to, according to some jet-based metric. In [7], a number of such metrics are proposed, most of which can also be used for the feature estimation step. Our results indicate that Displacement Estimation Local Search is the best choice for facial feature localization; for the actual identification stage, Displacement Estimation Grid Search yields the best recognition rate.
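The Gabor jet computation can be sketched as follows; the kernel family (5 scales x 8 orientations) matches the description above, but the exact frequency and width parameters here are assumptions following common EBGM choices:

```python
import numpy as np

def gabor_jet(patch, n_scales=5, n_orients=8):
    """Compute a Gabor jet for a square patch centred on a facial feature:
    40 complex kernel responses, returned as 80 real coefficients
    (40 magnitudes followed by 40 phases)."""
    size = patch.shape[0]
    half = size // 2
    y, x = np.mgrid[-half:size - half, -half:size - half]
    sigma = 2 * np.pi                          # relative Gaussian width
    jet = []
    for s in range(n_scales):
        k = (np.pi / 2) / (np.sqrt(2) ** s)    # spatial frequency per scale
        for o in range(n_orients):
            th = o * np.pi / n_orients
            kx, ky = k * np.cos(th), k * np.sin(th)
            gauss = np.exp(-k**2 * (x**2 + y**2) / (2 * sigma**2))
            # complex carrier wave, DC-compensated
            wave = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma**2 / 2)
            kernel = (k**2 / sigma**2) * gauss * wave
            jet.append(np.sum(kernel.conj() * patch))
    jet = np.array(jet)
    return np.concatenate([np.abs(jet), np.angle(jet)])
```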

Pseudo Two-Dimensional Hidden Markov Models

Face recognition using HMMs is based on approximating blocks from the face image with a chain of states of a stochastic model [8]. For the pseudo 2D HMM the image blocks are extracted by scanning the face from left to right, top to bottom, with an overlap in both the horizontal and vertical directions. Pixel intensities do not lead to robust features, as they are susceptible to illumination changes and other detrimental effects. A transformation like the 2D Discrete Cosine Transform attenuates those distorting effects, leading to better performance. A pseudo 2D HMM model of hidden states is obtained by linking left-right 1D HMM models with vertical super-states. For the training of each class the Baum-Welch algorithm is used. In the recognition phase the class that gives the highest value for the probability of the observation sequence of the testing image, given the class model, is considered the most likely to be the true identity of the testing face. Our results indicate that closely cropping the faces and using a mixture of three Gaussians for the states enhance performance.
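The observation-sequence extraction can be sketched as follows; the block size and the 50% overlap are our assumptions, and each block would then be DCT-transformed before training:

```python
import numpy as np

def scan_blocks(face, block=(8, 8), step=(4, 4)):
    """Extract overlapping blocks, scanned left to right, top to bottom,
    as the observation sequence for a pseudo 2D HMM."""
    bh, bw = block
    sh, sw = step
    h, w = face.shape
    return [face[i:i + bh, j:j + bw]
            for i in range(0, h - bh + 1, sh)
            for j in range(0, w - bw + 1, sw)]
```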

Correlation Filters

Face recognition can be performed by cross correlating a face image with a suitable filter and processing the output. Many correlation filters have been proposed [9]; amongst them, the Minimum Average Correlation Energy filter (MACE) is reported to perform best. It reduces the large sidelobes by minimizing the average correlation energy plane while at the same time satisfying the correlation peak constraints at the origin. These constraints result in a correlation plane that is close to zero everywhere except at the location of a trained object, where a sharp peak appears. For recognition, the output plane is searched for the highest point and that value, as well as the values of its surrounding points, is used to determine the class that the face belongs to. For implementation and testing of correlation filters refer to this MSc thesis.
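The closed-form MACE solution can be sketched in the frequency domain as follows (the all-ones peak constraint is the common default; this is an illustration, not the thesis implementation referred to above):

```python
import numpy as np

def mace_filter(images, peaks=None):
    """MACE filter sketch in the frequency domain:
        h = D^-1 X (X^H D^-1 X)^-1 u
    where X holds the vectorized 2-D FFTs of the training images, D is
    the diagonal average power spectrum, and u holds the desired
    correlation peaks at the origin (ones by default)."""
    n = len(images)
    X = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)
    u = np.ones(n) if peaks is None else np.asarray(peaks, dtype=complex)
    d = np.mean(np.abs(X) ** 2, axis=1)       # diagonal of D
    Dinv_X = X / d[:, None]
    h = Dinv_X @ np.linalg.solve(X.conj().T @ Dinv_X, u)
    return h.reshape(images[0].shape)         # frequency-domain filter
```

By construction the filter satisfies the peak constraints exactly: correlating any training image with it yields the prescribed value at the origin of the correlation plane.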

Laplacianfaces

The Eigenfaces and Fisherfaces algorithms both use linear projection to a subspace aiming to preserve the global structure of the face. Laplacianfaces [10] on the other hand, is an algorithm that uses optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator. By aiming to preserve the local structure of the face, Laplacianfaces attempts to attenuate the unwanted variations resulting from changes in lighting, facial expression and pose. In that, Laplacianfaces shares many of the properties of nonlinear projection algorithms. For implementation and testing of the Laplacianfaces algorithm refer to this MSc thesis.


Effect of face registration

Correct face registration is important, as without it the within-class variations become more pronounced than the between-class ones. In [11], we show that different recognition methods tolerate different amounts of face registration errors.
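The perturbation experiment can be sketched as follows; drawing the offset magnitude uniformly up to the given percentage, in a uniformly random direction, is our assumption about the exact sampling scheme:

```python
import numpy as np

def perturb_eyes(left, right, percent, rng=None):
    """Perturb each eye position by a random offset whose magnitude is at
    most the given percentage of the inter-eye distance."""
    rng = np.random.default_rng() if rng is None else rng
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    d = np.linalg.norm(right - left)          # inter-eye distance
    out = []
    for eye in (left, right):
        angle = rng.uniform(0, 2 * np.pi)
        r = rng.uniform(0, percent / 100.0) * d
        out.append(eye + r * np.array([np.cos(angle), np.sin(angle)]))
    return out
```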

Impact of face registration on recognition: The eyes are randomly perturbed by a given percentage relative to the inter-eye distance.



Sub-class LDA

The challenge posed by the large variation in pose, expression and illumination of the far-field, non-attentive video data is that the resulting face manifolds of different people are not linearly separable, i.e., different poses of different people under different illumination and expressions are highly similar and confusable.

Two fundamentally different discriminant analysis algorithms have been proposed to solve this problem:

  • Kernel Discriminant Analysis (KDA) [12]. KDA attempts to find, via a non-linear mapping, a higher-dimensional space in which the classes are linearly separable. Three shortcomings of this approach are listed in [13]: finding the appropriate kernel for each problem, the need for a very large number of training vectors, and the higher computational requirements.
  • Subclass LDA [14]. Subclass LDA attempts to split the classes into subclasses which are linearly separable. The underlying assumption is that each subclass is modeled correctly by a Gaussian distribution, even though the class itself is not. Subclass LDA offers computational efficiency but requires an automatic method for splitting the gallery vectors of each class into subclasses. This is especially hard for video-based face recognition, where there is a large number of vectors collected from gallery video streams.

Addressing non linearly separable face manifolds: Kernel LDA projection to higher dimensional space or subclass LDA projection to a subspace.

In our work [15], we use a nearest-neighbour approach to automatically select the subclasses, based on hierarchical bottom-up tree clustering of the training vectors of every class. After the tree is built, we prune it at the appropriate depth.
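The bottom-up clustering can be sketched as follows; single linkage as the merge criterion, and stopping once the desired number of subclasses remains (the effect of pruning at a fixed depth), are our assumptions for this illustration:

```python
import numpy as np

def split_subclasses(vectors, n_subclasses):
    """Agglomerative nearest-neighbour clustering of one person's gallery
    vectors: repeatedly merge the two closest clusters until n_subclasses
    remain. Returns lists of vector indices, one per subclass."""
    V = np.asarray(vectors, dtype=float)
    clusters = [[i] for i in range(len(V))]
    while len(clusters) > n_subclasses:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single-linkage distance between the two clusters
                d = min(np.linalg.norm(V[i] - V[j])
                        for i in clusters[a] for j in clusters[b])
                if d < best:
                    best, pair = d, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)        # merge the closest pair
    return clusters
```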

Automatic derivation of subclasses for an individual from CLEAR 2007. At the level the tree is pruned, the individual is split into 6 subclasses, each capturing a pose.



Performance on the CLEAR 2007 dataset



References

[1] A. Pnevmatikakis and L. Polymenakos, ‘Subclass Linear Discriminant Analysis for Video-Based Face Recognition,’ Journal of Visual Communication and Image Representation, vol. 20, no. 8, pp. 543-551, Nov. 2009.

[2] B.S. Venkatesh, S. Palanivel and B. Yegnanarayana, ‘Face Detection and Recognition in an Image Sequence using Eigenedginess,’ Third Indian Conference on Computer Vision, Graphics and Image Processing, Ahmedabad, India, Dec. 2002.

[3] M. Bartlett, J. Movellan and T. Sejnowski, ‘Face Recognition by Independent Component Analysis,’ IEEE Trans. on Neural Networks, vol. 13, no. 6, pp. 1450-1464, 2002.

[4] M. Turk and A. Pentland, ‘Eigenfaces for Recognition,’ J. Cognitive Neuroscience, pp. 71-86, 1991.

[5] P. Belhumeur, J. Hespanha and D. Kriegman, ‘Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,’ IEEE Trans. Pattern Analysis and Machine Intelligence, pp. 711-720, 1997.

[6] A. Pnevmatikakis and L. Polymenakos, ‘Subspace Projection Face Recognition: Comparison of Methods and Metrics,’ technical report, 2004.

[7] L. Wiskott, J.-M. Fellous, N. Krueger and C. von der Malsburg, ‘Face Recognition by Elastic Bunch Graph Matching,’ in L.C. Jain et al. (eds.), Intelligent Biometric Techniques in Fingerprint and Face Recognition, CRC Press, pp. 355-396, 1999.

[8] F. Samaria and A. Harter, ‘Parametrisation of a Stochastic Model for Human Face Identification,’ in 2nd IEEE Workshop on Applications of Computer Vision, pp. 138-142, 1994.

[9] C. Xie, B.V.K. Vijaya Kumar, S. Palanivel and B. Yegnanarayana, ‘A Still-to-Video Face Verification System Using Advanced Correlation Filters,’ in International Conference on Biometric Authentication, pp. 102-108, 2004.

[10] X. He, S. Yan, Y. Hu, P. Niyogi and H.-J. Zhang, ‘Face Recognition Using Laplacianfaces,’ IEEE Trans. Pattern Analysis and Machine Intelligence, no. 3, pp. 328-340, 2005.

[11] E. Rentzeperis, A. Stergiou, A. Pnevmatikakis and L. Polymenakos, ‘Impact of Face Registration Errors on Recognition,’ in I. Maglogiannis, K. Karpouzis and M. Bramer (eds.), Artificial Intelligence Applications and Innovations (AIAI06), Springer, Berlin Heidelberg, pp. 187-194, June 2006.

[12] G. Baudat and F. Anouar, ‘Generalized Discriminant Analysis Using a Kernel Approach,’ Neural Computation, vol. 12, no. 10, pp. 2385-2404, 2000.

[13] M. Zhu and A.M. Martinez, ‘Subclass Discriminant Analysis,’ IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 28, no. 8, pp. 1274-1286, 2006.

[14] K. Fukunaga, Statistical Pattern Recognition, Academic Press, 1990.

[15] A. Pnevmatikakis and L. Polymenakos, ‘Subclass Linear Discriminant Analysis for Video-Based Face Recognition,’ Journal of Visual Communication and Image Representation, vol. 20, no. 8, pp. 543-551, 2009.

Contact Aristodemos Pnevmatikakis, apne@ait.edu.gr
