https://doi.org/10.31449/inf.v43i4.2117
Informatica 43 (2019) 535–543

A Novel Approach to Fuzzy-Based Facial Feature Extraction and Face Recognition

Aniruddha Dey
Department of Computer Science & Engineering, Jadavpur University, Kolkata, India
E-mail: anidey007@gmail.com

Manas Ghosh
Department of Computer Application, RCCIIT, Kolkata, India
E-mail: manas.ghosh@rcciit.org

Keywords: fuzzy G-2DLDA, fuzzy set theory, Fk-NN, RBFNN based classifier, membership degree matrix

Received: December 29, 2017

Abstract: Generalized two-dimensional Fisher’s linear discriminant (G-2DFLD) is an effective feature extraction technique that maximizes class separability along the row and column directions simultaneously. In this paper, we present a fuzzy-based feature extraction technique, named the fuzzy generalized two-dimensional Fisher’s linear discriminant analysis (FG-2DLDA) method, which is an extended version of the G-2DFLD method. We also demonstrate face recognition using the presented method with a radial basis function (RBF) neural network as the classifier. The fuzzy membership matrix for the training samples is computed by means of the fuzzy k-nearest neighbour (Fk-NN) algorithm. The global mean and class-wise mean training images are generated by combining the fuzzy membership values with the training samples. These mean images are used to compute the fuzzy intra- and inter-class scatter matrices along the x- and y-directions. Finally, by solving the eigenvalue problems of these scatter matrices, we find the optimal fuzzy projection vectors, which are used to generate more discriminant features. The presented method has been validated on three public face databases using an RBF neural network, and the results show that the proposed FG-2DLDA method provides more favourable recognition rates than some contemporary face recognition methods.
Povzetek (summary): The paper describes a method of two-dimensional Fisher linear discriminant analysis based on fuzzy sets (FG-2DLDA).

1 Introduction

Facial feature extraction has been a popular research area in computer vision and machine learning over the last 20 years [1-6]. Popular linear methods include principal component analysis (PCA) [5-6], linear discriminant analysis (LDA) [7] and their variants, which use eigenfaces and/or Fisherfaces to compute features. In particular, PCA maximizes the total scatter across all face images. However, PCA retains undesirable variations caused by lighting, facial expression and other factors [6]. Many researchers argue that PCA provides no information for class discrimination and only performs dimension reduction [6, 7]. LDA has been proposed as a better alternative to PCA because it provides class discrimination information [8, 9]. The main objective of LDA is to find the projection vectors that best discriminate among the classes by maximizing the between-class differences and minimizing the within-class ones [8]. The disadvantage of LDA is that it suffers from the “small sample size (SSS)” problem [9], which arises when the number of samples is smaller than the sample dimension. The dimension of face images is generally very high; as a result, the within-class scatter matrix becomes singular, which makes the FLD method infeasible. The SSS problem in LDA can be solved by down-sampling the face images to a smaller size [10]. LDA is one of the most important linear approaches for feature extraction; it maximizes the ratio of the between-class scatter to the within-class scatter. However, for very high-dimensional facial images, LDA may suffer from the singularity problem.
To solve this problem, PCA has been applied to reduce the dimension of the high-dimensional vector space before employing the LDA method [11]. While PCA seeks projections that are optimal for image reconstruction from a low-dimensional space, it may remove dimensions that contain discriminant information required for face recognition. The R-LDA method was introduced to solve the singularity problem [12]. The main drawback of R-LDA is that the dimensionality of the covariance matrix is often more than ten thousand, so it is impractical for R-LDA to process such a large covariance matrix when the computing platform is not sufficiently powerful. Huang et al. [13] introduced a more efficient null space LDA method. The key idea of this technique is that the within-class scatter matrix ($S_w$) is more effective for calculating discriminant features, whereas the between-class scatter matrix ($S_b$) is useless. However, the method is often criticized for its high storage requirement and computational cost in facial feature extraction and recognition. Chen et al. [8] claimed that the eigenvectors of $S_w$ corresponding to zero eigenvalues contain the maximum discriminant information. Yu and Yang [14] proposed a direct linear discriminant analysis method that diagonalizes the between- and within-class scatter matrices. It is well known that the between- and within-class scatter are two important measures of the separability of the projected samples. Independent component analysis (ICA) has also been proposed as an effective feature extraction technique [15]. ICA computes discriminant features from the covariance matrix by considering high-order statistics. The two-dimensional PCA (2DPCA) works directly on the 2D image matrices and has been found to be computationally more efficient than PCA and superior to it for face recognition and reconstruction [16]. The two-dimensional FLD (2DFLD) method maximizes the class separability in one direction (row or column) at a time [17].
The significant characteristic of the 2DFLD method is that it works directly on the 2D image matrices. The G-2DFLD method extracts the projection vectors from the row and column directions of the training images simultaneously [18]. The discriminant feature matrices are found by linearly projecting an image matrix onto these directions. The method therefore maximizes the discriminative information among the classes while minimizing it within a class [18]. To increase its pertinence, many LDA extensions, such as direct LDA [19], complete LDA [20], LDA/QR [21] and LDA/GSVD [22], have been developed in the last decades. These extensions try to preserve the same validation and overcome singularity problems either by first projecting the problem into a convenient subspace or by using alternative indirect or approximate optimizations.

Very recently, several researchers have presented fuzzy-based methods for feature extraction, such as fuzzy k-nearest neighbour (Fk-NN) [23], fuzzy two-dimensional Fisher’s linear discriminant (F-2DFLD) [25], fuzzy maximum scatter difference (F-MSD) [28], fuzzy two-dimensional principal component analysis (F-2DPCA) [32], fuzzy two-dimensional linear discriminant analysis (GPG-2DLDA), generalized multiple maximum scatter difference (GMMSD) [33], fuzzy local mean discriminant analysis (FLMDA) [36], and fuzzy linear regression discriminant projection (FLRDP) [37]. Keller et al. (1985) presented the fuzzy k-nearest neighbour (Fk-NN) approach, which fuzzifies the class assignment [23]. The method popularly known as fuzzy Fisherface (Fuzzy-FLD) [24] incorporates the fuzzy membership grades into the within- and between-class scatter matrices for binary labelled patterns to extract features, which are then used for face recognition [25]. The fuzzy 2DFLD (F-2DFLD) is an extension of the fuzzy Fisherface [26]. Its scatter matrices were redefined by introducing membership values into each training sample. Yang et al.
proposed feature extraction using fuzzy inverse FDA [26]. In fuzzy inverse FDA, the Fk-NN is used to calculate the membership degree matrix, which is incorporated into the definitions of the between-class and within-class scatter matrices [26]. A reformative LDA method is used along with the Fk-NN method to redefine the scatter matrices [27]. A weighted maximum scatter difference algorithm is used for face recognition [28]. A fuzzy LDA algorithm is derived by incorporating the fuzzy membership into learning, and a random walk method is introduced to reduce the effect of outliers [29]. Fuzzy set theory is integrated with the scatter difference discriminant criterion (SDDC) algorithm, where the Fk-NN method is used to compute the membership grades, which are utilized to redefine the scatter matrices [30]. A fuzzy maximum scatter difference model is proposed in which Fk-NN is used to calculate the membership degree matrix of the training samples [31]. The fuzzy 2DPCA method was introduced, where the Fk-NN method is applied to compute the membership matrix for the training samples; this matrix is used to obtain the fuzzy mean of each class, and the average of these means is used to define the scatter matrices [32]. A generalized multiple maximum scatter difference discriminant criterion has been introduced for effective feature extraction and classification [33]. Gaussian probability distribution information was incorporated into the definitions of the between-class and within-class scatter matrices [34]. The membership grade and label information were used to define the scatter matrices [35]. Fuzzy local mean discriminant analysis was employed to construct the scatter matrices by redefining the fuzzy local class means [36]. The fuzzy linear regression discriminant projection method computes the fuzzy membership grade for each sample and incorporates it into the definitions of the within-class and between-class scatter matrices [37].
In the proposed method, we incorporate the fuzzy membership values of the training images (samples) with respect to the different classes. We use the fuzzy k-NN to obtain the membership degrees of each training sample and use them to calculate the global and class-wise mean training image matrices. Finally, the fuzzy between-class and within-class scatter matrices are computed separately in the row and column directions. The features are extracted by solving the eigenvalue problems of these scatter matrices.

The remaining sections of this paper are organized as follows. Section 2 gives a brief overview of the G-2DFLD method. Section 3 proposes a novel feature extraction method based on the G-2DFLD method, called the FG-2DLDA method. Section 4 presents the simulation results on three public face image datasets. Concluding remarks are given in Section 5.

2 Brief summary of the generalized 2DFLD method

Our technique is an extended version of the G-2DFLD feature extraction technique [18], which is briefly presented in this section. Let the face images be of dimension $r \times s$, represented as 2D matrices $\mathbf{X}_i$ ($i = 1, 2, \ldots, N$). The $N$ face images belong to $C$ classes. The $c$-th class, denoted $C_c$, has $N_c$ samples, so that $\sum_{c=1}^{C} N_c = N$. Given an image $\mathbf{X}$, the G-2DFLD-based 2D feature matrix $\mathbf{Y}$ is generated by the following linear transformation:

$$\mathbf{Y} = (\mathbf{P}_{opt})^T \mathbf{X}\, \mathbf{Q}_{opt} \tag{1}$$

where $\mathbf{P}_{opt}$ and $\mathbf{Q}_{opt}$ are the two optimal projection matrices.
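As a minimal illustration of the bilateral projection in Eq. (1), the sketch below multiplies a toy image by two hand-picked projection matrices. It is written in plain Python for brevity (the authors' implementation is in C); the helper names `transpose`, `matmul` and `project`, and the toy matrices, are assumptions made for this example only and are not optimal projections.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def project(x, p, q):
    """Return Y = P^T X Q for an r x s image X, an r x u matrix P
    and an s x v matrix Q; the result is a u x v feature matrix."""
    return matmul(matmul(transpose(p), x), q)


# Toy 3 x 4 "image" reduced to a 2 x 2 feature matrix.
X = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
P = [[1, 0], [0, 1], [0, 0]]          # r x u = 3 x 2, selects rows 0 and 1
Q = [[1, 0], [0, 0], [0, 0], [0, 1]]  # s x v = 4 x 2, selects columns 0 and 3
Y = project(X, P, Q)
print(Y)  # [[1, 4], [5, 8]]
```

The example makes the dimensions explicit: $\mathbf{P}$ acts on the rows and $\mathbf{Q}$ on the columns, so an $r \times s$ image becomes a $u \times v$ feature matrix.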
The two Fisher’s criteria (objective functions) along the row and column directions, $J(\mathbf{P})$ and $J(\mathbf{Q})$, are expressed as follows:

$$J(\mathbf{P}) = \left(\mathbf{P}^T \mathbf{G}_{rb} \mathbf{P}\right)\left(\mathbf{P}^T \mathbf{G}_{rw} \mathbf{P}\right)^{-1} \quad \text{and} \quad J(\mathbf{Q}) = \left(\mathbf{Q}^T \mathbf{G}_{cb} \mathbf{Q}\right)\left(\mathbf{Q}^T \mathbf{G}_{cw} \mathbf{Q}\right)^{-1} \tag{2}$$

The optimal projection matrices $\mathbf{P}_{opt}$ and $\mathbf{Q}_{opt}$ can be obtained by finding the normalized eigenvectors of $\mathbf{G}_{rb}\mathbf{G}_{rw}^{-1}$ and $\mathbf{G}_{cb}\mathbf{G}_{cw}^{-1}$, respectively. The eigenvalues are sorted in descending order and the eigenvectors are rearranged accordingly [18]. The optimal projection (eigenvector) matrices $\mathbf{P}_{opt}$ and $\mathbf{Q}_{opt}$ can be stated as follows:

$$\mathbf{P}_{opt} = \arg\max_{\mathbf{P}} \left|\mathbf{G}_{rb}\mathbf{G}_{rw}^{-1}\right| = [p_1, p_2, \ldots, p_u] \quad \text{and} \quad \mathbf{Q}_{opt} = \arg\max_{\mathbf{Q}} \left|\mathbf{G}_{cb}\mathbf{G}_{cw}^{-1}\right| = [q_1, q_2, \ldots, q_v] \tag{3}$$

The between-class and within-class scatter matrices along the row direction ($\mathbf{G}_{rb}$ and $\mathbf{G}_{rw}$) and the column direction ($\mathbf{G}_{cb}$ and $\mathbf{G}_{cw}$) are computed as follows:

$$\mathbf{G}_{rb} = \sum_{c=1}^{C} N_c (\mathbf{m}_c - \mathbf{m})(\mathbf{m}_c - \mathbf{m})^T \quad \text{and} \quad \mathbf{G}_{rw} = \sum_{c=1}^{C} \sum_{i \in c}^{N} (\mathbf{X}_i - \mathbf{m}_c)(\mathbf{X}_i - \mathbf{m}_c)^T \tag{4a}$$

$$\mathbf{G}_{cb} = \sum_{c=1}^{C} N_c (\mathbf{m}_c - \mathbf{m})^T (\mathbf{m}_c - \mathbf{m}) \quad \text{and} \quad \mathbf{G}_{cw} = \sum_{c=1}^{C} \sum_{i \in c}^{N} (\mathbf{X}_i - \mathbf{m}_c)^T (\mathbf{X}_i - \mathbf{m}_c) \tag{4b}$$

In the above expressions, the global mean training image is $\mathbf{m} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{X}_i$ and the class-wise mean training image is $\mathbf{m}_c = \frac{1}{N_c}\sum_{\mathbf{X}_i \in C_c} \mathbf{X}_i$. The row-wise scatter matrices ($\mathbf{G}_{rb}$ and $\mathbf{G}_{rw}$) are of size $r \times r$ and the column-wise scatter matrices ($\mathbf{G}_{cb}$ and $\mathbf{G}_{cw}$) are of size $s \times s$.

3 Proposed fuzzy generalized two-dimensional linear discriminant analysis (FG-2DLDA) method

The appearance of a human face varies considerably under different conditions, such as illumination and pose. As a result, images of one person may sometimes resemble those of a different person, while the images of the same person may differ quite significantly from one another. The proposed FG-2DLDA method is based on the concept of fuzzy class assignment, where a face image belongs to different classes as characterized by its fuzzy membership values.
The idea of fuzzification using the fuzzy k-nearest neighbour (Fk-NN) was conceived by Keller et al. and found to be more effective [23]. In the present study, we use the Fk-NN to generate fuzzy membership values for the training images, resulting in a fuzzy membership matrix. The fuzzy membership values are combined with the training images to obtain the global and class-wise mean images, which in turn are used to form the fuzzy (between- and within-class) scatter matrices. These scatter matrices therefore carry useful information about the association of each training image with several classes. The optimal fuzzy 2D projection vectors are obtained by solving the eigenvalue problems of these scatter matrices. Finally, the FG-2DLDA-based features are extracted by projecting a face image onto these optimal fuzzy 2D projection vectors. The steps of the FG-2DLDA method are presented in detail in the following sub-sections.

3.1 Generation of membership matrix by fuzzy k-nearest neighbour (Fk-NN)

Let there be $C$ classes and $N$ training images, each represented as a 2D matrix $\mathbf{X}_i$ ($i = 1, 2, \ldots, N$). A fuzzy k-NN-based decision algorithm is used to assign membership values (degrees) to the training images [23, 24]. This fuzzy k-nearest neighbour (Fk-NN) method redefines the membership values of the labelled face images. Let $n_{ij}$ denote the number of the $k$ nearest neighbours of the $j$-th image that belong to the $i$-th class, and let $\mu_{ij}$ denote the membership of the $j$-th image in the $i$-th class. When all of the neighbours belong to the $i$-th class, and this is also the class of the $j$-th image under consideration, then $n_{ij} = k$ and $\mu_{ij}$ equals 1, so the membership values for the other classes are zero. In addition, $\mu_{ij}$ satisfies two obvious properties: $\sum_{i=1}^{C} \mu_{ij} = 1$ and $0 < \sum_{j=1}^{N} \mu_{ij} < N$.
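The membership assignment can be sketched as follows. This section does not state the membership formula itself, so the sketch assumes the initialization commonly attributed to Keller et al. [23]: $\mu_{ij} = 0.51 + 0.49\,(n_{ij}/k)$ if $i$ is the labelled class of the $j$-th image and $\mu_{ij} = 0.49\,(n_{ij}/k)$ otherwise. The function name `fknn_memberships`, the Euclidean distance on flattened images, and the toy data are likewise assumptions for illustration.

```python
import math


def fknn_memberships(samples, labels, num_classes, k):
    """Return U as a C x N list of lists, where U[i][j] is the membership
    of training sample j in class i (assumed Keller-style initialization)."""
    n = len(samples)
    U = [[0.0] * n for _ in range(num_classes)]
    for j in range(n):
        # Euclidean distances from sample j to every other training sample.
        dists = sorted(
            (math.dist(samples[j], samples[m]), m) for m in range(n) if m != j
        )
        neighbours = [m for _, m in dists[:k]]
        for i in range(num_classes):
            # n_ij: how many of the k nearest neighbours carry label i.
            n_ij = sum(1 for m in neighbours if labels[m] == i)
            U[i][j] = 0.49 * n_ij / k
            if i == labels[j]:
                U[i][j] += 0.51
    return U


# Toy data: six flattened one-pixel "images" in two classes. The last
# sample is labelled class 1 but lies among the class-0 samples.
samples = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,), (0.3,)]
labels = [0, 0, 0, 1, 1, 1]
U = fknn_memberships(samples, labels, num_classes=2, k=2)
for row in U:
    print([round(v, 3) for v in row])
```

With this assumed rule, a sample whose neighbours all share its label receives a membership of 1 in its own class, and each column of `U` sums to 1, which matches the two properties stated above. The mislabelled-looking last sample keeps a membership of only 0.51 in its own class.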
The fuzzy membership matrix $U$ obtained with the Fk-NN can therefore be written as:

$$U = [\mu_{ci}]; \quad c = 1, 2, 3, \ldots, C; \quad i = 1, 2, 3, \ldots, N \tag{5}$$

3.2 Fuzzy generalized two-dimensional linear discriminant analysis (FG-2DLDA) algorithm

The FG-2DLDA method combines the fuzzy membership values with the training images and redefines the scatter matrices along the row and column directions. The optimal fuzzy projection vectors are then generated by solving the eigenvalue problems of these scatter matrices. Let the training set contain $N$ images of $C$ classes (subjects), each denoted $\mathbf{X}_i$ ($i = 1, 2, 3, \ldots, N$) and of dimension $r \times s$. The $c$-th class $C_c$ has $N_c$ images in total and satisfies $\sum_{c=1}^{C} N_c = N$. For an image $\mathbf{X}$, the FG-2DLDA-based feature matrix of size $u \times v$ is generated by projecting it onto the optimal fuzzy projection matrices through the following linear transformation:

$$\mathbf{Y}^f = (\mathbf{P}^f_{opt})^T \mathbf{X}\, \mathbf{Q}^f_{opt} \tag{6}$$

The Fisher’s criteria (objective functions) $J^f(\mathbf{P})$ and $J^f(\mathbf{Q})$ along the row and column directions are defined as follows:

$$J^f(\mathbf{P}) = (\mathbf{P}^f)^T \mathbf{G}^f_{rb} \mathbf{P}^f \times \left\{(\mathbf{P}^f)^T \mathbf{G}^f_{rw} \mathbf{P}^f\right\}^{-1} \quad \text{and} \quad J^f(\mathbf{Q}) = (\mathbf{Q}^f)^T \mathbf{G}^f_{cb} \mathbf{Q}^f \times \left\{(\mathbf{Q}^f)^T \mathbf{G}^f_{cw} \mathbf{Q}^f\right\}^{-1} \tag{7}$$

The ratios in Eq. (7) are maximized when the column vectors of the projection matrices $\mathbf{P}^f$ and $\mathbf{Q}^f$ are the eigenvectors of $\mathbf{G}^f_{rb}(\mathbf{G}^f_{rw})^{-1}$ and $\mathbf{G}^f_{cb}(\mathbf{G}^f_{cw})^{-1}$, respectively. The fuzzy optimal projection matrices $\mathbf{P}^f_{opt}$ and $\mathbf{Q}^f_{opt}$ are obtained by finding the eigenvectors of $\mathbf{G}^f_{rb}(\mathbf{G}^f_{rw})^{-1}$ and $\mathbf{G}^f_{cb}(\mathbf{G}^f_{cw})^{-1}$ corresponding to the $u$ and $v$ largest eigenvalues, respectively.
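The scatter matrices entering Eq. (7) are built from membership-weighted mean images, whose definitions accompany Eqs. (9a) and (9b). A minimal sketch of the fuzzy means and of the row-direction between-class scatter matrix is given below. The function names and toy data are hypothetical, and the weight $N^f_c$ is taken to be $\sum_i \mu_{ci}$, which is an assumption because the text does not define it.

```python
def add_scaled(acc, m, w):
    """In place: acc += w * m for matrices stored as lists of rows."""
    for a_row, m_row in zip(acc, m):
        for j, v in enumerate(m_row):
            a_row[j] += w * v


def fuzzy_means(images, U):
    """images: N matrices of size r x s; U: C x N membership matrix.
    Returns the fuzzy global mean and the list of fuzzy class-wise means."""
    r, s = len(images[0]), len(images[0][0])
    global_sum = [[0.0] * s for _ in range(r)]
    total_weight = 0.0
    class_means = []
    for row in U:
        class_sum = [[0.0] * s for _ in range(r)]
        for mu, x in zip(row, images):
            add_scaled(class_sum, x, mu)
            add_scaled(global_sum, x, mu)
        weight = sum(row)
        total_weight += weight
        class_means.append([[v / weight for v in line] for line in class_sum])
    global_mean = [[v / total_weight for v in line] for line in global_sum]
    return global_mean, class_means


def row_between_scatter(U, global_mean, class_means):
    """r x r matrix (1/N) * sum_c N_c^f (m_c - m)(m_c - m)^T.
    ASSUMPTION: N_c^f is taken as the sum of the memberships in class c."""
    n = len(U[0])
    r = len(global_mean)
    G = [[0.0] * r for _ in range(r)]
    for row, mc in zip(U, class_means):
        d = [[a - b for a, b in zip(l1, l2)] for l1, l2 in zip(mc, global_mean)]
        n_cf = sum(row)
        for a in range(r):
            for b in range(r):
                G[a][b] += n_cf * sum(x * y for x, y in zip(d[a], d[b])) / n
    return G


# Toy data: three 2 x 2 images, two classes, each column of U sums to 1.
images = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [0.0, 3.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
U = [[1.0, 0.755, 0.49], [0.0, 0.245, 0.51]]
m, m_c = fuzzy_means(images, U)
G_rb = row_between_scatter(U, m, m_c)
```

Because the memberships of each image sum to 1 over the classes, the fuzzy global mean coincides with the ordinary average of the training images; only the class-wise means are shifted by the fuzzy weights. With crisp memberships (0 or 1) the class-wise means reduce to the usual class averages.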
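The fuzzy optimal projection matrices $\mathbf{P}^f_{opt}$ and $\mathbf{Q}^f_{opt}$ can be represented as follows:

$$\mathbf{P}^f_{opt} = \arg\max_{\mathbf{P}^f} \left|\mathbf{G}^f_{rb}(\mathbf{G}^f_{rw})^{-1}\right| = [p_1, p_2, \ldots, p_u] \quad \text{and} \quad \mathbf{Q}^f_{opt} = \arg\max_{\mathbf{Q}^f} \left|\mathbf{G}^f_{cb}(\mathbf{G}^f_{cw})^{-1}\right| = [q_1, q_2, \ldots, q_v] \tag{8}$$

where $\{p_i \mid i = 1, 2, \ldots, u\}$ is the set of normalized eigenvectors of $\mathbf{G}^f_{rb}(\mathbf{G}^f_{rw})^{-1}$ corresponding to the $u$ largest eigenvalues $\{\lambda_i \mid i = 1, 2, \ldots, u\}$ and $\{q_j \mid j = 1, 2, \ldots, v\}$ is the set of normalized eigenvectors of $\mathbf{G}^f_{cb}(\mathbf{G}^f_{cw})^{-1}$ corresponding to the $v$ largest eigenvalues $\{\alpha_j \mid j = 1, 2, \ldots, v\}$. The four (within- and between-class) fuzzy scatter matrices ($\mathbf{G}^f_{rb}$, $\mathbf{G}^f_{rw}$, $\mathbf{G}^f_{cb}$, $\mathbf{G}^f_{cw}$) along the row and column directions are defined as follows:

$$\mathbf{G}^f_{rb} = \frac{1}{N}\sum_{c=1}^{C} N^f_c (\bar{\mathbf{m}}_c - \bar{\mathbf{m}})(\bar{\mathbf{m}}_c - \bar{\mathbf{m}})^T \quad \text{and} \quad \mathbf{G}^f_{rw} = \frac{1}{N}\sum_{c=1}^{C}\sum_{i \in c}^{N} (\mathbf{X}_i - \bar{\mathbf{m}}_c)(\mathbf{X}_i - \bar{\mathbf{m}}_c)^T \tag{9a}$$

$$\mathbf{G}^f_{cb} = \frac{1}{N}\sum_{c=1}^{C} N^f_c (\bar{\mathbf{m}}_c - \bar{\mathbf{m}})^T (\bar{\mathbf{m}}_c - \bar{\mathbf{m}}) \quad \text{and} \quad \mathbf{G}^f_{cw} = \frac{1}{N}\sum_{c=1}^{C}\sum_{i \in c}^{N} (\mathbf{X}_i - \bar{\mathbf{m}}_c)^T (\mathbf{X}_i - \bar{\mathbf{m}}_c) \tag{9b}$$

where the fuzzy membership degrees are integrated into the training images to obtain the fuzzy global mean image

$$\bar{\mathbf{m}} = \frac{\sum_{c=1}^{C}\sum_{i=1}^{N} \mu_{ci}\, \mathbf{X}_i}{\sum_{c=1}^{C}\sum_{i=1}^{N} \mu_{ci}}$$

and the fuzzy class-wise mean images

$$\bar{\mathbf{m}}_c = \frac{\sum_{i=1}^{N} \mu_{ci}\, \mathbf{X}_i}{\sum_{i=1}^{N} \mu_{ci}}; \quad c = 1, 2, 3, \ldots, C.$$

It may also be noted that the size of the $\mathbf{G}^f_{rb}$ and $\mathbf{G}^f_{rw}$ scatter matrices is $r \times r$, whereas that of the $\mathbf{G}^f_{cb}$ and $\mathbf{G}^f_{cw}$ scatter matrices is $s \times s$.

4 Simulation results and discussion

We have assessed the performance of the proposed FG-2DLDA method on three publicly available databases, namely FERET [39, 40], AT&T [41], and UMIST [42]. The average recognition rate is calculated as:

$$R_{avg} = \frac{\sum_{i=1}^{q} n^{i}_{cls}}{q \times n_{tot}} \tag{10}$$

where $q$ denotes the total number of experimental runs, $n^{i}_{cls}$ is the number of correct recognitions in the $i$-th run, and $n_{tot}$ is the total number of test face images. The FERET face database is used to evaluate the FG-2DLDA method under varying facial expression, pose and lighting conditions.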
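Eq. (10) pools the correct classifications over all runs and divides by the total number of test decisions. A minimal sketch follows; the function name and the sample counts are hypothetical, and the factor of 100 (to express the rate as a percentage, as in the reported tables) is an addition that is not part of Eq. (10).

```python
def average_recognition_rate(correct_per_run, n_test):
    """Average recognition rate in percent: 100 * sum_i n_cls^i / (q * n_tot).

    correct_per_run: number of correctly recognised test images in each run.
    n_test: number of test images per run (n_tot).
    """
    q = len(correct_per_run)
    if q == 0 or n_test <= 0:
        raise ValueError("need at least one run and a positive test-set size")
    return 100.0 * sum(correct_per_run) / (q * n_test)


# Hypothetical counts for q = 4 runs with n_tot = 200 test images each.
runs = [186, 190, 188, 192]
rate = average_recognition_rate(runs, 200)
print(rate)  # 94.5
```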
The AT&T and UMIST databases are used to assess the presented method under minor variations of rotation and scaling. In these experiments, we use an RBFNN classifier due to its superiority and simplicity over other types of neural networks. The experiments are performed to validate the FG-2DLDA method proposed in Section 3. The FG-2DLDA algorithm is implemented in the C programming language on the Linux operating system with an Intel Core i5 (2.4 GHz) processor and 8 GB of DDR3 memory (1333 MHz).

The suggested method is evaluated on a subset of the FERET face database [39, 40]. The subset consists of 1400 images of 200 individuals, with 7 images per individual. The images differ in facial expression, illumination and pose. In our study, the facial portion of each original image was cropped and resized to 80×80 pixels based on the location of the eyes. Here, the number of training images per subject, s, is taken as 2, 3 and 4, and our method is run 10 times for each value of s with different training and test sets. Some example images of a person are shown in Fig. 1 (i).

The AT&T face database comprises 400 images of 40 persons, with 10 different images per person. In the present study, s images of each person are picked at random from the database to form the training set and the remaining (10 − s) images form the test set. Hence, the training and test sets are disjoint. The values of s are taken as 3, 4, 5, and 6 to form different pairs of training and test sets. Some example images of an individual are shown in Fig. 1 (ii).

The multi-view UMIST database contains a total of 575 grey-scale images of 20 different individuals covering a variety of race, sex, and appearance. The number of images per individual varies from 19 to 48.
In the present study, we have reduced each image to 112 × 92 pixels. Fig. 1 (iii) shows face images of one person from the database.

Figure 1: Some pictures of a person from the (i) FERET, (ii) AT&T, and (iii) UMIST face databases.

Figure 2: Minimum, maximum and average recognition rates of the FG-2DLDA method for different values of s by varying feature size on the FERET face database. [Panels: s = 2, s = 3, s = 4; x-axis: the dimension of feature vector (6×6 to 20×20); y-axis: avg. recognition rate (%).]

Table 2: Comparison in terms of average recognition rates (%) obtained from different methods on the FERET face database (feature size in parentheses).

| Method | s = 2 | s = 3 | s = 4 |
|---|---|---|---|
| FG-2DLDA | 49.05 (10×10) | 58.81 (12×12) | 65.51 (10×10) |
| F-2DFLD [22] | 48.88 (40×8) | - | - |
| MMSD (θ = 0.4) [28] | - | 52.6 (-) | 55.81 (-) |
| MSD (θ = 0.4) [28] | - | 50.5 (-) | 53.68 (-) |
| FMSD (θ = 0.4) [28] | - | 53.46 (-) | 56.9 (-) |
| Alternative-2DPCA [35] | 48.31 (112×20) | 53.21 (112×20) | 53.97 (112×20) |
| (2D)²PCA [35] | 47.70 (112×20) | 52.36 (112×20) | 55.45 (112×20) |
| 2DPCA [35] | 47.12 (112×20) | 52.66 (112×20) | 55.20 (112×20) |

*The highest recognition rate in each column is achieved by FG-2DLDA.

The experiments on the FERET face database are repeated 10 times for each value of s with different training and test sets. Here, we choose s = 2, 3, 4 images from each subject at random for training, and the remaining (7 − s) images are used for testing. The proposed method is evaluated with feature matrix sizes from 6×6 to 20×20 using the RBFNN as a classifier. Fig. 2 shows the minimum, maximum and average recognition rates (%) of the FG-2DLDA method for different values of s (2, 3 and 4) as the feature size varies.
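The random hold-out protocol used in these experiments (s training images per subject, the remaining images for testing, repeated over several runs) could be sketched as follows. The function name, the string identifiers standing in for images, the reduced number of subjects, and the fixed seed are illustrative assumptions.

```python
import random


def split_per_subject(images_by_subject, s, rng):
    """Pick s random training images per subject; the rest form the test set.
    Returns two lists of (subject, image) pairs that share no image."""
    train, test = [], []
    for subject, images in images_by_subject.items():
        if s >= len(images):
            raise ValueError("s must be smaller than the images per subject")
        chosen = set(rng.sample(range(len(images)), s))
        for idx, img in enumerate(images):
            (train if idx in chosen else test).append((subject, img))
    return train, test


# Hypothetical stand-in for a FERET-like subset: 3 subjects, 7 images each.
data = {subj: [f"subj{subj}_img{i}" for i in range(7)] for subj in range(3)}
rng = random.Random(0)  # fixed seed so the runs are reproducible
splits = [split_per_subject(data, 3, rng) for _ in range(10)]  # 10 runs
train, test = splits[0]
print(len(train), len(test))  # 9 12
```

Each run draws a fresh split from the same generator, so the ten training sets generally differ while every individual split keeps the training and test images disjoint.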
We have compared the performance of the proposed FG-2DLDA method with other related methods. The FG-2DLDA method extracts discriminative features by calculating the within-class and between-class scatter matrices in the row and column directions. The results in Table 2 show that FG-2DLDA achieves higher average recognition rates than the other methods on the FERET database. On the AT&T face database, we have validated the performance of our method with 20 different pairs of training and test sets for each value of s.

Figure 3: Minimum, maximum and average recognition rates of the FG-2DLDA method for different values of s by varying feature size on the AT&T face database. [Panels: s = 3, s = 4, s = 5, s = 6; x-axis: the dimension of feature vector (6×6 to 22×22); y-axis: avg. recognition rate (%).]

Figure 4: Minimum, maximum and average recognition rates of the FG-2DLDA method for different values of s by varying feature size on the UMIST face database. [Panels: s = 4, s = 6, s = 8, s = 10; x-axis: the dimension of feature vector (8×8 to 24×24); y-axis: avg. recognition rate (%).]
Since the present method considers that a face image may simultaneously belong to different classes with possibly different membership values, the class-wise mean images may differ from the actual ones. Fig. 3 shows the minimum, maximum and average recognition rates of the FG-2DLDA method for different values of s by varying feature size on the AT&T face database. The proposed method yields the best average recognition rates of 93.41% (14×14), 96.08% (16×16), 98.08% (14×14), and 98.68% (18×18) for s = 3, 4, 5, and 6, respectively. Table 3 presents the best average recognition rates achieved by this algorithm for different combinations of training and test sets. Moreover, we have also compared the results of our method with those of other methods. In general, face images are severely affected by different environmental conditions. These factors need to be investigated to measure their impact on the intra-class assignment. The scatter matrices incorporate the overlapping sample distribution information for classification.

In the experiment on the UMIST database, s is taken as 4, 6, 8 and 10 to generate distinct pairs of training and test sets. Each pair of training and test sets is disjoint. The performance of the proposed technique is evaluated for each value of s with 20 different pairs of training and test sets. Fig. 4 shows the minimum, maximum and average recognition rates (%) of the FG-2DLDA method for different values of s by varying feature size. Table 4 compares the FG-2DLDA method with other contemporary methods in terms of best average recognition rates. The proposed method yields the best average recognition rates (dimension of feature vector) of 86.81% (18×18), 92.75% (20×20), 96.83% (14×14), and 97.3% (14×14) for s = 4, 6, 8 and 10, respectively. In this case, the discriminative information is extracted by calculating the fuzzy scatter matrices.
The discriminative projection vectors can be obtained provided that the fuzzy within-class scatter matrices are non-singular, since Eqs. (7) and (8) require their inverses. The results in Table 4 show that, for every value of s, the FG-2DLDA method achieves a higher average recognition rate than the other methods.

5 Conclusion

In this paper, a fuzzy generalized two-dimensional Fisher’s linear discriminant analysis (FG-2DLDA) method for face recognition is presented. The method assumes that a face image may belong to several classes with possibly different membership values. These membership values are generated by the fuzzy k-NN algorithm and used to generate the fuzzy global mean image and the fuzzy class-wise mean images. These mean images are then used to generate the fuzzy intra-class and inter-class scatter matrices along the row and column directions. The projection matrices obtained by solving the eigenvalue problems of these scatter matrices, satisfying the two Fisher’s criteria, yield rich information leading to the generation of superior discriminant features. Image classification and recognition are performed using an RBF neural network. The performance of our method is validated on the FERET, AT&T and UMIST face databases. The experimental results demonstrate that the FG-2DLDA method outperforms the compared methods on these databases.

Table 4: Comparison in terms of average recognition rates (%) obtained from different methods on the UMIST face database (feature size in parentheses).

| Method | s = 4 | s = 6 | s = 8 | s = 10 |
|---|---|---|---|---|
| FG-2DLDA | 86.81 (18×18) | 92.75 (20×20) | 96.83 (14×14) | 97.30 (14×14) |
| G-2DFLD [18] | 86.22 (14×14) | 92.28 (14×14) | 95.54 (14×14) | 96.92 (18×18) |
| F-LDA [29] | 84.5 (19) | - | - | 92.01 (19) |
| MF-LDA [29] | 85.38 (19) | - | - | 92.53 (19) |
| 2DFLD [18] | 86.12 (112×14) | 92.16 (112×14) | 95.25 (112×14) | 96.55 (112×18) |
| 2DPCA [18] | 85.70 (112×14) | 91.91 (112×14) | 95.07 (112×14) | 96.60 (112×18) |
| RF-LDA [29] | 84.8 (19) | - | - | 92.38 (19) |
| PCA [18] | 80.72 (60) | 86.53 (60) | 94.01 (60) | 95.11 (60) |

*The highest recognition rate in each column is achieved by FG-2DLDA.
Acknowledgement

This work was supported by the Senior Research Fellowship Program of Aniruddha Dey under the State Government Fellowship (Ref. No. - P-1/RS./365/12, dated 05th October, 2012) of the Department of Computer Science & Engineering, Jadavpur University, Kolkata.

References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey. (1995) Human and machine recognition of faces: a survey. Proc. IEEE, 83: 705–740. https://doi.org/10.1109/5.381842
[2] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld. (2003) Face recognition: a literature survey. ACM Comput. Surveys, 35: 399–458. https://doi.org/10.1145/954339.954342
[3] A. S. Tolba, A. H. El-Baz, and A. A. El-Harby. (2006) Face recognition: a literature review. Int. J. Signal Process., 2: 88–103.
[4] H. Zhou, A. Mian, L. Wei, D. Creighton, M. Hossny, and S. Nahavandi. (2014) Recent advances on singlemodal and multimodal face recognition: a survey. IEEE Trans. Human Machine Systems, 44(6): 701–716. https://doi.org/10.1109/THMS.2014.2340578
[5] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. (1997) Eigenfaces versus fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell., 19: 711–720. https://doi.org/10.1109/34.598228
[6] B. Poon, M. A. Amin, and H. Yan. (2011) Performance evaluation and comparison of PCA based human face recognition methods for distorted images. International Journal of Machine Learning and Cybernetics, 2(4): 245-259. https://doi.org/10.1007/s13042-011-0023-2
[7] G. J. Alvarado, W. Pedrycz, M. Reformat, and K.-C. Kwak. (2006) Deterioration of visual information in face classification using eigenfaces and fisherfaces. Machine Vision and Applications, 17(1): 68–82. https://doi.org/10.1007/s00138-006-0016-4
[8] L. F. Chen, H. Y. M. Liao, M. T. Ko, J. C. Lin, and G. J. Yu. (2000) A new LDA-based face recognition system which can solve the small sample size problem. Pattern Recogn., 33: 1713–26.
https://doi.org/10.1016/S0031-3203(99)00139-9
[9] H. Yu and J. Yang. (2001) A direct LDA algorithm for high-dimensional data with application to face recognition. Pattern Recogn., 34: 2067–70. https://doi.org/10.1016/S0031-3203(00)00162-X
[10] X. S. Zhuang and D. Q. Dai. (2007) Improved discriminant analysis for high-dimensional data and its application to face recognition. Pattern Recogn., 40(5): 1570-1578. https://doi.org/10.1016/j.patcog.2006.11.015
[11] D. Swets and J. Weng. (1996) Using discriminant eigenfeatures for image retrieval. IEEE Trans. Pattern Anal. Mach. Intell., 18(8): 831–836. https://doi.org/10.1109/34.531802
[12] J. Friedman. (1989) Regularized discriminant analysis. J. Am. Stat. Assoc., 165–175. https://doi.org/10.1080/01621459.1989.10478752
[13] R. Huang, Q. Liu, H. Lu, and S. Ma. (2002) Solving the small sample size problem of LDA. In Proceedings of the 16th International Conference on Pattern Recognition, 3, 29–32. https://doi.org/10.1109/ICPR.2002.1047787
[14] H. Yu and J. Yang. (2001) A direct LDA algorithm for high-dimensional data with application to face recognition. Pattern Recogn., 34(10): 2067-2075. https://doi.org/10.1016/S0031-3203(00)00162-X
[15] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski. (2002) Face recognition by independent component analysis. IEEE Trans. Neural Netw., 13(6): 1450-1464. https://doi.org/10.1109/TNN.2002.804287
[16] J. Yang, D. Zhang, A. F. Frangi, and J. Y. Yang. (2004) Two-dimensional PCA: a new approach to appearance-based face representation and recognition. IEEE Trans. Pattern Anal. Mach. Intell., 26(1): 131–137. https://doi.org/10.1109/TPAMI.2004.1261097
[17] H. Xiong, M. N. S. Swamy, and M. O. Ahmad. (2005) Two-dimensional FLD for face recognition. Pattern Recogn., 38(7): 1121–1124. https://doi.org/10.1016/j.patcog.2004.12.003
[18] S. Chowdhury, J. K. Sing, D. K. Basu, and M. Nasipuri.
(2011) Face recognition by generalized two-dimensional FLD method and multi-class support vector machines. Appl. Soft Comput., 11(7): 4282–4292. https://doi.org/10.1016/j.asoc.2010.12.002
[19] H. Yu, and J. Yang. (2001) A direct LDA algorithm for high-dimensional data with application to face recognition. Pattern Recogn., 34: 2067–2070. https://doi.org/10.1016/S0031-3203(00)00162-X
[20] J. Yang, and J. Yang. (2003) Why can LDA be performed in PCA transformed space? Pattern Recogn., 36(2): 563-566. https://doi.org/10.1016/S0031-3203(02)00048-1
[21] J. Ye, and Q. Li. (2004) LDA/QR: An efficient and effective dimension reduction algorithm and its theoretical foundation. Pattern Recogn., 37(4): 851–854. https://doi.org/10.1016/j.patcog.2003.08.006
[22] J. Ye, R. Janardan, C. Park, and H. Park. (2004) An optimization criterion for generalized discriminant analysis on undersampled problems. IEEE Trans. Pattern Anal. Mach. Intell., 26(8): 982–994. https://doi.org/10.1109/TPAMI.2004.37
[23] J. M. Keller, M. R. Gray, and J. A. Givens. (1985) A fuzzy k-nearest neighbor algorithm. IEEE Trans. Syst. Man Cybernet., 15(4): 580–585. https://doi.org/10.1109/TSMC.1985.6313426
[24] K. C. Kwak, and W. Pedrycz. (2005) Face recognition using a fuzzy fisherface classifier. Pattern Recogn., 38(10): 1717-1732. https://doi.org/10.1016/j.patcog.2005.01.018
[25] W. Yang, J. Wang, M. Ren, and J. Yang. (2009) Fuzzy 2-dimensional FLD for face recognition. Journal of Information and Computing Science, 4(3): 233-239. https://doi.org/10.1109/CCPR.2009.5344077
[26] W. Yang, J. Wang, M. Ren, L. Zhang, and J. Yang. (2009) Feature extraction using fuzzy inverse FDA. Neurocomputing, 72(13-15): 3384–3390. https://doi.org/10.1016/j.neucom.2009.03.011
[27] X. N. Song, Y. J. Zheng, X. J. Wu, X. B. Yang, and J. Y. Yang. (2010) A complete fuzzy discriminant analysis approach for face recognition.
Applied Soft Computing, 10: 208-214. https://doi.org/10.1016/j.asoc.2009.07.002
[28] L. Xiaodong, F. Shumin, and T. Zhang. (2013) Weighted maximum scatter difference based feature extraction and its application to face recognition. Machine Vision and Applications, 22: 591-595.
[29] M. Zhao, T. W. S. Chow, and Z. Zhang. (2012) Random walk-based fuzzy linear discriminant analysis for dimensionality reduction. Soft Computing, 16: 1393-1409. https://doi.org/10.1007/s00500-012-0843-3
[30] J. Wang, W. Yang, and J. Yang. (2013) Face recognition using fuzzy maximum scatter discriminant analysis. Neural Computing and Applications, 23: 957-964. https://doi.org/10.1007/s00521-012-1020-4
[31] X. Li, and A. Song. (2013) Fuzzy MSD based feature extraction method for face recognition. Neurocomputing, 122: 266-271. https://doi.org/10.1016/j.neucom.2013.06.025
[32] X. Li. (2014) Face recognition method based on fuzzy 2DPCA. Journal of Electrical and Computer Engineering, 2014: 1-7. https://doi.org/10.1155/2014/919041
[33] N. Zheng, L. Qi, and L. Guan. (2014) Generalised multiple maximum scatter difference feature extraction using QR decomposition. Journal of Visual Communication and Image Representation, 25: 1460-1471. https://doi.org/10.1016/j.jvcir.2014.04.009
[34] J. K. Sing. (2015) A novel Gaussian probabilistic generalized 2DLDA for feature extraction and face recognition. In Proceedings of the IEEE Conference on Computer Graphics, Vision and Information Security, pages 258-263. https://doi.org/10.1109/CGVIS.2015.7449933
[35] P. Huang, Z. Yang, and C. Chen. (2015) Fuzzy local discriminant embedding for image feature extraction. Computers and Electrical Engineering, 46: 231-240. https://doi.org/10.1016/j.compeleceng.2015.03.013
[36] J. Xu, Z. Gu, and K. Xie. (2016) Fuzzy local mean discriminant analysis for dimensionality reduction. Neural Processing Letters, 44: 701-718. https://doi.org/10.1007/s11063-015-9489-3
[37] P. Huang, G. Gao, C. Qian, G. Yang, and Z. Yang.
(2017) Fuzzy linear regression discriminant projection for face recognition. IEEE Access, 23:169-174. https://doi.org/10.1109/ACCESS.2017.2680437
[38] Q. Zhu, and Y. Xu. (2013) Multi-directional two dimensional PCA with matching score level fusion for face recognition. Neural Comput. & Applic., 23(1): 169-174. https://doi.org/10.1007/s00521-012-0851-3
[39] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss. (2000) The FERET evaluation methodology for face-recognition algorithms. IEEE Trans. Pattern Anal. Mach. Intell., 22: 1090–1104. https://doi.org/10.1109/34.879790
[40] P. J. Phillips. (2004) The Facial Recognition Technology (FERET) database.
[41] The ORL face database.
[42] D. B. Graham, N. M. Allinson, H. Wechsler, P. J. Phillips, V. Bruce, F. Fogelman-Soulie, and T. S. Huang (Eds.). (1998) Characterizing virtual eigen signatures for general purpose face recognition: From theory to applications. NATO ASI Series F Computer and Systems Sciences, 163: 446–456. https://doi.org/10.1007/978-3-642-72201-1_25