https://doi.org/10.31449/inf.v43i4.2117 Informatica 43 (2019) 535–543
A Novel Approach to Fuzzy-Based Facial Feature Extraction and 
Face Recognition 
Aniruddha Dey 
Department of Computer Science & Engineering, Jadavpur University, Kolkata, India. 
E-mail: anidey007@gmail.com 
 
Manas Ghosh 
Department of Computer Application, RCCIIT, Kolkata, India 
E-mail: manas.ghosh@rcciit.org 
Keywords: fuzzy G-2DLDA, fuzzy set theory, Fk-NN, RBFNN based classifier, membership degree matrix  
Received: December 29, 2017 
Abstract: Generalized two-dimensional Fisher’s linear discriminant (G-2DFLD) is an effective feature
extraction technique that maximizes class separability along the row and column directions simultaneously.
In this paper, we present a fuzzy-based feature extraction technique, named the fuzzy generalized
two-dimensional Fisher’s linear discriminant analysis (FG-2DLDA) method, which is an extended
version of the G-2DFLD method. We also demonstrate face recognition using the presented method
with a radial basis function (RBF) neural network as the classifier. The fuzzy membership matrix
for the training samples is computed by means of the fuzzy k-nearest neighbour (Fk-NN) algorithm.
The global mean and class-wise mean training images are generated by combining the fuzzy
membership values with the training samples. These mean images are used to compute the fuzzy
intra- and inter-class scatter matrices along the x- and y-directions. Finally, by solving the
eigenvalue problems of these scatter matrices, we find the optimal fuzzy projection vectors, which
are used to generate more discriminant features. The presented method has been validated on
three public face databases using an RBF neural network, and the results establish that the
proposed FG-2DLDA method provides better recognition rates than some contemporary face
recognition methods.
Summary: The paper describes a method of two-dimensional Fisher’s linear discriminant analysis
based on fuzzy sets (FG-2DLDA).
1 Introduction 
Facial feature extraction has been an active research area in computer vision and machine
learning over the last 20 years [1-6]. Popular linear methods include principal component
analysis (PCA) [5-6], linear discriminant analysis (LDA) [7] and their variants, which use
Eigenfaces and/or Fisherfaces to compute features. In particular, PCA maximizes the total
scatter across all face images. However, undesirable variations caused by lighting, facial
expression and other factors are retained by PCA [6]. Many researchers argue that PCA does
not provide any information for class discrimination and only performs dimension reduction
[6, 7]. LDA has been proposed as a better alternative to PCA because it provides class
discrimination information [8, 9]. The main objective of LDA is to find the projection vectors
that best discriminate among the classes by maximizing the between-class differences and
minimizing the within-class ones [8]. The disadvantage of LDA is that it suffers from the
“small sample size (SSS)” problem [9], which occurs when the number of samples is smaller
than the sample dimension. The dimension of face images is generally very high; as a result,
the within-class scatter matrix becomes singular, which makes the FLD method infeasible. The
SSS problem in LDA can be mitigated by down-sampling the face images to a smaller size [10].
LDA is one of the most important linear approaches for feature extraction; it maximizes the
ratio of the between-class to the within-class scatter. However, for a task with very
high-dimensional facial images, LDA may suffer from the singularity problem. To solve this
problem, PCA has been applied to reduce the dimension of the high-dimensional vector space
before employing LDA [11]. While PCA seeks projections that are optimal for image
reconstruction from a low-dimensional space, it may remove dimensions that contain
discriminant information required for face recognition. The regularized LDA (R-LDA) method
was introduced to solve the singularity problem [12]. The main drawback of R-LDA is that the
dimensionality of the covariance matrix is often more than ten thousand, so it is impractical
for R-LDA to process such a large covariance matrix when the computing platform is not
sufficiently powerful. Huang et al. [13] introduced a more efficient null space LDA method.
The key idea of this technique is that the within-class scatter matrix ($S_w$) is more
effective for calculating discriminant features, whereas the between-class scatter matrix
($S_b$) is not useful. However, the method is
often criticized for its high storage requirement and computational cost in facial feature
extraction and recognition. Chen et al. [8] claimed that the eigenvectors corresponding to
the zero eigenvalues of $S_w$ contain the maximum discriminant information. Yu and Yang [14]
proposed a direct linear discriminant analysis method that diagonalizes the between- and
within-class scatter matrices. It is well known that the between- and within-class scatter
are two important measures of the separability of the projected samples. Independent
component analysis (ICA) has also been proposed as an effective feature extraction technique
[15]. ICA computes discriminant features from the covariance matrix by considering
high-order statistics. The two-dimensional PCA (2DPCA) works directly on the 2D image
matrices and has been found to be computationally more efficient than PCA and superior to it
for face recognition and reconstruction [16]. The two-dimensional FLD (2DFLD) method
maximizes the class separability in one direction (row or column) at a time [17]. The
significant characteristic of the 2DFLD method is that it works directly on the 2D image
matrices. In the G-2DFLD method, the projection vectors are extracted from the row and
column directions of the training images simultaneously [18]. The discriminant feature
matrices are found by linearly projecting an image matrix onto these directions. Therefore,
this method maximizes the discriminative information among the classes while minimizing it
within a class [18]. To increase its applicability, many LDA extensions, such as direct LDA
[19], complete LDA [20], LDA/QR [21] and LDA/GSVD [22], have been developed in the last
decades. These extensions try to retain the validity of LDA and overcome the singularity
problem, either by first projecting the problem into a convenient subspace or by using
alternative indirect or approximate optimizations.
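To make the small-sample-size problem discussed above concrete, the following worked bound may help. It borrows the FERET setting of Section 4 (200 subjects, 80×80 images, two training images per subject) purely as an illustration; the rank bound is a standard property of the vector-based within-class scatter matrix and is not a result of this paper.

```latex
% Vector-based LDA on 80 x 80 images with N = 200 x 2 = 400 samples and C = 200 classes:
\operatorname{rank}(S_w) \;\le\; N - C \;=\; 400 - 200 \;=\; 200
\;\ll\; d \;=\; 80 \times 80 \;=\; 6400 .
% S_w is a 6400 x 6400 matrix of rank at most 200, hence singular and not invertible.
```

By comparison, the 2D methods reviewed above work with scatter matrices whose size is set by the image height or width (80×80 in this setting) rather than by the number of pixels.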
Recently, several researchers have presented fuzzy-based methods for feature extraction,
such as fuzzy k-nearest neighbour (Fk-NN) [23], fuzzy two-dimensional Fisher’s linear
discriminant (F-2DFLD) [25], fuzzy maximum scatter difference (F-MSD) [28], fuzzy
two-dimensional principal component analysis (F-2DPCA) [32], fuzzy two-dimensional linear
discriminant analysis (GPG-2DLDA), generalized multiple maximum scatter difference (GMMSD)
[33], fuzzy local mean discriminant analysis (FLMDA) [36], and fuzzy linear regression
discriminant projection (FLRDP) [37]. Keller et al. (1985) presented the Fk-NN approach,
which fuzzifies the class assignment [23]. Building on it, the method popularly known as
fuzzy Fisherface (Fuzzy-FLD) [24] incorporates the fuzzy membership grades into the within-
and between-class scatter matrices for binary labelled patterns to extract features, which
are then used for face recognition [25]. The fuzzy 2DFLD (F-2DFLD) is an extension of the
fuzzy Fisherface [26]; its scatter matrices are redefined by introducing membership values
for each training sample. Yang et al. proposed feature extraction using fuzzy inverse FDA
[26], in which the Fk-NN is used to calculate the membership degree matrix that is
incorporated into the definition of the between-class and within-class scatter matrices
[26]. The reformative LDA method is used along with the Fk-NN method to redefine the scatter
matrices [27]. A weighted maximum scatter difference algorithm is used for face recognition
[28]. A fuzzy LDA algorithm is derived by incorporating the fuzzy membership into learning,
and a random walk method is introduced to reduce the effect of outliers [29]. Fuzzy set
theory is integrated with the scatter difference discriminant criterion (SDDC) algorithm,
where the Fk-NN method is used to compute the membership grades that are utilized to
redefine the scatter matrices [30]. A fuzzy maximum scatter difference model is proposed in
which the Fk-NN is used to calculate the membership degree matrix of the training samples
[31]. In the fuzzy 2DPCA method, the Fk-NN method is applied to compute the membership
matrix of the training samples, which is used to obtain the fuzzy mean of each class; the
average of these means is then used to define the scatter matrices [32]. A generalized
multiple maximum scatter difference discriminant criterion has been introduced for effective
feature extraction and classification [33]. Gaussian probability distribution information
was incorporated into the definition of the between-class and within-class scatter matrices
[34]. The membership grade and label information were used to define the scatter matrices
[35]. Fuzzy local mean discriminant analysis constructs the scatter matrices by redefining
the fuzzy local class means [36]. The fuzzy linear regression discriminant projection method
computes the fuzzy membership grade of each sample and incorporates it into the definition of
the within-class and between-class scatter matrices [37].
In the proposed method, we incorporate the fuzzy membership values of the training images
(samples) in the different classes. The membership degrees of each training sample are
obtained using the fuzzy k-NN and are used to calculate the global and class-wise mean
training image matrices. The fuzzy between- and within-class scatter matrices are then
computed separately along the row and column directions. Finally, the features are extracted
by solving the eigenvalue problems of these scatter matrices.
The remaining sections of this paper are organized as follows. Section 2 gives a brief
overview of the G-2DFLD method. Section 3 proposes a novel feature extraction method based on
the G-2DFLD method, called the FG-2DLDA method. The simulation results on three public face
image datasets are presented in Section 4. Concluding remarks are given in Section 5.
2 Brief summary of the generalized 
2DFLD method 
The presented technique is an extended version of the G-2DFLD feature extraction technique
[18], which is briefly reviewed in this section.
Let the face images be of dimension $r \times s$ and be represented as 2D matrices
$\mathbf{X}_i$ ($i = 1, 2, \ldots, N$). The $N$ face images belong to $C$ classes. The
$c$-th class is denoted by $C_c$ and contains $N_c$ samples, so that
$\sum_{c=1}^{C} N_c = N$. Given an image $\mathbf{X}$, the G-2DFLD-based 2D feature matrix
$\mathbf{Y}$ is generated by the following linear transformation:
$$\mathbf{Y} = (\mathbf{P}_{opt})^{T}\,\mathbf{X}\,(\mathbf{Q}_{opt}) \tag{1}$$
where $\mathbf{P}_{opt}$ and $\mathbf{Q}_{opt}$ are the two optimal projection matrices.
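The transformation in Eq. (1) can be sketched in a few lines. The example below is only an illustration in plain Python: the helper names (`transpose`, `matmul`, `project`) and the toy matrices are assumptions made for this sketch, and the projection matrices are hand-picked rather than learned.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def project(x, p, q):
    """Bilinear projection Y = P^T X Q, as in Eq. (1)."""
    return matmul(matmul(transpose(p), x), q)


# Toy 2x3 "image" (r = 2, s = 3); real face images are far larger.
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
# Hypothetical projection matrices: P is r x u (u = 1), Q is s x v (v = 2).
P = [[1.0],
     [0.0]]
Q = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0]]

Y = project(X, P, Q)  # u x v feature matrix
print(Y)
```

With these choices, P keeps only the first row of the image and Q keeps only its first two columns, so the feature matrix has size 1×2.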
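The two Fisher’s criteria (objective functions) along the row and column directions,
$J(\mathbf{P})$ and $J(\mathbf{Q})$, are expressed as follows:
$$
\begin{cases}
J(\mathbf{P}) = \mathbf{P}^{T}\mathbf{G}_{rb}\mathbf{P} \times \left(\mathbf{P}^{T}\mathbf{G}_{rw}\mathbf{P}\right)^{-1} \\
J(\mathbf{Q}) = \mathbf{Q}^{T}\mathbf{G}_{cb}\mathbf{Q} \times \left(\mathbf{Q}^{T}\mathbf{G}_{cw}\mathbf{Q}\right)^{-1}
\end{cases}
\tag{2}
$$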
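The optimal projection vectors $\mathbf{P}_{opt}$ and $\mathbf{Q}_{opt}$ can be obtained by
finding the normalized eigenvectors of $\mathbf{G}_{rb}\mathbf{G}_{rw}^{-1}$ and
$\mathbf{G}_{cb}\mathbf{G}_{cw}^{-1}$, respectively. The eigenvalues are sorted in descending
order and the eigenvectors are rearranged accordingly [18]. The optimal projection
(eigenvector) matrices $\mathbf{P}_{opt}$ and $\mathbf{Q}_{opt}$ can be stated as follows:
$$
\begin{cases}
\mathbf{P}_{opt} = \arg\max_{\mathbf{P}} \left|\mathbf{G}_{rb}\mathbf{G}_{rw}^{-1}\right| = [p_1, p_2, \ldots, p_u] \\
\mathbf{Q}_{opt} = \arg\max_{\mathbf{Q}} \left|\mathbf{G}_{cb}\mathbf{G}_{cw}^{-1}\right| = [q_1, q_2, \ldots, q_v]
\end{cases}
\tag{3}
$$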
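The between-class and within-class scatter matrices along the row direction
($\mathbf{G}_{rb}$ and $\mathbf{G}_{rw}$) and the column direction ($\mathbf{G}_{cb}$ and
$\mathbf{G}_{cw}$) are computed as follows:
$$
\begin{cases}
\mathbf{G}_{rb} = \sum_{c=1}^{C} N_c\,(\mathbf{m}_c - \mathbf{m})(\mathbf{m}_c - \mathbf{m})^{T} \\
\mathbf{G}_{rw} = \sum_{c=1}^{C} \sum_{i \in c} (\mathbf{X}_i - \mathbf{m}_c)(\mathbf{X}_i - \mathbf{m}_c)^{T}
\end{cases}
\tag{4a}
$$
$$
\begin{cases}
\mathbf{G}_{cb} = \sum_{c=1}^{C} N_c\,(\mathbf{m}_c - \mathbf{m})^{T}(\mathbf{m}_c - \mathbf{m}) \\
\mathbf{G}_{cw} = \sum_{c=1}^{C} \sum_{i \in c} (\mathbf{X}_i - \mathbf{m}_c)^{T}(\mathbf{X}_i - \mathbf{m}_c)
\end{cases}
\tag{4b}
$$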
In the above expressions, the global mean training image is
$\mathbf{m} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{X}_i$ and the class-wise mean training image
is $\mathbf{m}_c = \frac{1}{N_c}\sum_{i=1,\;\mathbf{X}_i \in C_c}^{N}\mathbf{X}_i$. The
dimensions of the row-wise scatter matrices ($\mathbf{G}_{rb}$ and $\mathbf{G}_{rw}$) and
the column-wise scatter matrices ($\mathbf{G}_{cb}$ and $\mathbf{G}_{cw}$) are $r \times r$
and $s \times s$, respectively.
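The four scatter matrices of Eqs. (4a) and (4b) can be computed directly from their definitions. The following is a minimal sketch in plain Python, assuming images are stored as lists of rows and class labels are given as a list; the function and helper names are illustrative and not part of the original method description.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def combine(a, b, wa=1.0, wb=1.0):
    """Element-wise wa * a + wb * b for two equally sized matrices."""
    return [[wa * x + wb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mean(mats):
    total = mats[0]
    for m in mats[1:]:
        total = combine(total, m)
    return [[v / len(mats) for v in row] for row in total]


def zeros(n):
    return [[0.0] * n for _ in range(n)]


def g2dfld_scatter(images, labels):
    """Return (G_rb, G_rw, G_cb, G_cw) following Eqs. (4a) and (4b)."""
    r, s = len(images[0]), len(images[0][0])
    m = mean(images)                                  # global mean image
    g_rb, g_rw, g_cb, g_cw = zeros(r), zeros(r), zeros(s), zeros(s)
    for c in sorted(set(labels)):
        members = [x for x, y in zip(images, labels) if y == c]
        m_c = mean(members)                           # class-wise mean image
        d = combine(m_c, m, 1.0, -1.0)                # m_c - m
        n_c = len(members)
        g_rb = combine(g_rb, matmul(d, transpose(d)), 1.0, n_c)
        g_cb = combine(g_cb, matmul(transpose(d), d), 1.0, n_c)
        for x in members:
            e = combine(x, m_c, 1.0, -1.0)            # X_i - m_c
            g_rw = combine(g_rw, matmul(e, transpose(e)))
            g_cw = combine(g_cw, matmul(transpose(e), e))
    return g_rb, g_rw, g_cb, g_cw


# Four toy 2x3 images from two classes (made-up pixel values).
images = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[2.0, 3.0, 4.0], [5.0, 6.0, 7.0]],
    [[9.0, 8.0, 7.0], [6.0, 5.0, 4.0]],
    [[8.0, 7.0, 6.0], [5.0, 4.0, 3.0]],
]
labels = [0, 0, 1, 1]
G_rb, G_rw, G_cb, G_cw = g2dfld_scatter(images, labels)
print(len(G_rb), len(G_cb))  # row-wise matrices are r x r, column-wise are s x s
```

The sketch makes the size difference visible: for r×s images, the row-direction matrices are r×r and the column-direction matrices are s×s, independent of the number of pixels r·s.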
3 Proposed fuzzy generalized two-
dimensional linear discriminant 
analysis (FG-2DLDA) method 
The appearance of a human face varies considerably under different environmental
conditions, such as illumination and pose. As a result, images of one person may sometimes
resemble those of a different person, while the images of the same person may differ quite
significantly from one another. The proposed FG-2DLDA method is based on the concept of
fuzzy class assignment, where a face image belongs to different classes as characterized by
its fuzzy membership values. The idea of fuzzification using the fuzzy k-nearest neighbour
(Fk-NN) was conceived by Keller et al. and found to be effective [23]. In the present study,
we have used the Fk-NN to generate fuzzy membership values for the training images, which
results in a fuzzy membership matrix. The fuzzy membership values are combined with the
training images to obtain the global and class-wise mean images, which in turn are used to
form the fuzzy between- and within-class scatter matrices. These scatter matrices therefore
carry useful information about the association of each training image with several classes.
The optimal fuzzy 2D projection vectors are obtained by solving the eigenvalue problems of
these scatter matrices. Finally, the FG-2DLDA-based features are extracted by projecting a
face image onto these optimal fuzzy 2D projection vectors. The different steps of the
FG-2DLDA method are presented in detail in the following sub-sections.
3.1 Generation of membership matrix by 
fuzzy k-nearest neighbour (Fk-NN) 
Let there be $C$ classes and $N$ training images, each represented as a 2D matrix
$\mathbf{X}_i$ ($i = 1, 2, \ldots, N$). A fuzzy k-NN-based decision algorithm is used to
assign membership values (degrees) to the training images [23, 24]. The fuzzy k-nearest
neighbour (Fk-NN) method redefines the membership values of the labelled face images. Here,
$\mu_{ij}$ denotes the membership of the $j$-th image in the $i$-th class, and $n_{ij}$
denotes the number of the $k$ nearest neighbours of the $j$-th image that belong to the
$i$-th class. When all of the neighbours belong to the $i$-th class, and this is also the
class of the $j$-th image under consideration, then $n_{ij} = k$ and $\mu_{ij} = 1$, so the
membership values for the other classes are zero. In addition, $\mu_{ij}$ satisfies two
obvious properties: $\sum_{i=1}^{C}\mu_{ij} = 1$ and $0 < \sum_{j=1}^{N}\mu_{ij} < N$. The
fuzzy membership matrix $U$ obtained using the Fk-NN is written as:
$$U = [\mu_{ci}];\quad c = 1, 2, 3, \ldots, C;\quad i = 1, 2, 3, \ldots, N \tag{5}$$
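The membership assignment can be sketched as follows. The text above does not state the membership formula explicitly, so this sketch assumes the initialisation commonly attributed to Keller et al. [23]: 0.51 + 0.49·(n/k) for the sample's own class and 0.49·(n/k) for the other classes, which is consistent with the case n = k giving a membership of 1. Flattened image vectors, Euclidean distance, and the function name are further assumptions of this sketch.

```python
import math


def fknn_membership(samples, labels, num_classes, k):
    """Return U as a C x N list, where U[c][i] is the membership of sample i in class c.

    ASSUMPTION: Keller-style initialisation (0.51 + 0.49 * n/k for the labelled
    class, 0.49 * n/k otherwise); the source text does not spell the formula out.
    """
    n = len(samples)
    u = [[0.0] * n for _ in range(num_classes)]
    for i in range(n):
        # Sort all other training samples by Euclidean distance to sample i.
        dists = sorted((math.dist(samples[i], samples[j]), j)
                       for j in range(n) if j != i)
        neighbours = [j for _, j in dists[:k]]
        for c in range(num_classes):
            n_ci = sum(1 for j in neighbours if labels[j] == c)
            share = 0.49 * n_ci / k
            u[c][i] = 0.51 + share if labels[i] == c else share
    return u


# Six toy 2-D "images" (already flattened); the last one lies between the two groups.
samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [1.0, 1.0], [1.1, 1.0], [0.2, 0.2]]
labels = [0, 0, 0, 1, 1, 1]
U = fknn_membership(samples, labels, num_classes=2, k=2)
for row in U:
    print([round(v, 2) for v in row])
```

In this toy set, the last sample is labelled as class 1 but both of its nearest neighbours belong to class 0, so its membership is split between the two classes instead of being a crisp 0 or 1. Each column of U still sums to 1, matching the first property stated above.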
3.2 Fuzzy generalized two dimensional 
linear discriminant analysis (FG-
2DLDA) algorithm 
The FG-2DLDA method combines the fuzzy membership values with the training images and
redefines the scatter matrices along the row and column directions. Finally, the optimal
fuzzy projection vectors are generated by solving the eigenvalue problems of these scatter
matrices. Let the training set contain $N$ images of $C$ classes (subjects), each denoted as
$\mathbf{X}_i$ ($i = 1, 2, 3, \ldots, N$) and having dimension $r \times s$. The $c$-th class
$C_c$ has $N_c$ images in total and satisfies $\sum_{c=1}^{C} N_c = N$.
For an image $\mathbf{X}$, the FG-2DLDA-based features, in the form of a 2D matrix of size
$u \times v$, are generated by projecting it onto the optimal fuzzy projection matrices
through the following linear transformation:
$$\mathbf{Y}^{f} = (\mathbf{P}_{opt}^{f})^{T}\,\mathbf{X}\,(\mathbf{Q}_{opt}^{f}) \tag{6}$$
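As a quick dimension check on Eq. (6), the following worked example takes the 80×80 FERET image size from Section 4 and, as an illustrative choice, one of the feature sizes evaluated there (10×10). The shapes of the projection matrices follow from the requirement that the product be defined and of size u×v.

```latex
% r = s = 80 (image size), u = v = 10 (feature size), chosen for illustration
\mathbf{P}_{opt}^{f} \in \mathbb{R}^{80 \times 10}, \quad
\mathbf{X} \in \mathbb{R}^{80 \times 80}, \quad
\mathbf{Q}_{opt}^{f} \in \mathbb{R}^{80 \times 10}
\;\Longrightarrow\;
\mathbf{Y}^{f} = (\mathbf{P}_{opt}^{f})^{T}\,\mathbf{X}\,\mathbf{Q}_{opt}^{f} \in \mathbb{R}^{10 \times 10}
% 80 * 80 = 6400 pixel values are reduced to 10 * 10 = 100 features.
```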
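The Fisher’s criteria (objective functions) $J^{f}(\mathbf{P})$ and $J^{f}(\mathbf{Q})$
along the row and column directions are defined as follows:
$$
\begin{cases}
J^{f}(\mathbf{P}) = (\mathbf{P}^{f})^{T}\mathbf{G}_{rb}^{f}\mathbf{P}^{f} \times \left\{(\mathbf{P}^{f})^{T}\mathbf{G}_{rw}^{f}\mathbf{P}^{f}\right\}^{-1} \\
J^{f}(\mathbf{Q}) = (\mathbf{Q}^{f})^{T}\mathbf{G}_{cb}^{f}\mathbf{Q}^{f} \times \left\{(\mathbf{Q}^{f})^{T}\mathbf{G}_{cw}^{f}\mathbf{Q}^{f}\right\}^{-1}
\end{cases}
\tag{7}
$$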
The ratios in Eq. (7) are maximized when the column vectors of the projection matrices
$\mathbf{P}^{f}$ and $\mathbf{Q}^{f}$ are the eigenvectors of
$\mathbf{G}_{rb}^{f}(\mathbf{G}_{rw}^{f})^{-1}$ and
$\mathbf{G}_{cb}^{f}(\mathbf{G}_{cw}^{f})^{-1}$, respectively. The fuzzy optimal projection
matrices $\mathbf{P}_{opt}^{f}$ and $\mathbf{Q}_{opt}^{f}$ are obtained by finding the
eigenvectors of these two matrices corresponding to the $u$ and $v$ largest eigenvalues,
respectively, and can be represented as follows:
$$
\begin{cases}
\mathbf{P}_{opt}^{f} = \arg\max_{\mathbf{P}^{f}} \left|\mathbf{G}_{rb}^{f}(\mathbf{G}_{rw}^{f})^{-1}\right| = [p_1, p_2, \ldots, p_u] \\
\mathbf{Q}_{opt}^{f} = \arg\max_{\mathbf{Q}^{f}} \left|\mathbf{G}_{cb}^{f}(\mathbf{G}_{cw}^{f})^{-1}\right| = [q_1, q_2, \ldots, q_v]
\end{cases}
\tag{8}
$$
where $\{p_i \mid i = 1, 2, \ldots, u\}$ is the set of normalized eigenvectors of
$\mathbf{G}_{rb}^{f}(\mathbf{G}_{rw}^{f})^{-1}$ corresponding to the $u$ largest eigenvalues
$\{\lambda_i \mid i = 1, 2, \ldots, u\}$, and $\{q_j \mid j = 1, 2, \ldots, v\}$ is the set
of normalized eigenvectors of $\mathbf{G}_{cb}^{f}(\mathbf{G}_{cw}^{f})^{-1}$ corresponding
to the $v$ largest eigenvalues $\{\alpha_j \mid j = 1, 2, \ldots, v\}$.
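The eigenvector step can be illustrated with a small numerical sketch. It forms the product of a between-class matrix and the inverse of a within-class matrix in the order written in Eq. (8) and extracts only the leading normalized eigenvector by power iteration. Power iteration, the Gauss-Jordan inverse, and the 2×2 matrices are assumptions chosen for a self-contained example; they are not the solver used by the authors, and the sketch presumes a non-singular within-class matrix and a unique, real dominant eigenvalue.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (assumes a is non-singular)."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def leading_eigenpair(m, iters=500):
    """Power iteration: dominant eigenvalue and unit-norm eigenvector of m."""
    v = [1.0] * len(m)
    for _ in range(iters):
        w = matvec(m, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, matvec(m, v)))  # Rayleigh quotient, |v| = 1
    return lam, v


# Made-up 2x2 stand-ins for a between-class and a within-class scatter matrix.
G_b = [[4.0, 1.0], [1.0, 3.0]]
G_w = [[2.0, 0.0], [0.0, 1.0]]
M = matmul(G_b, inverse(G_w))      # product in the order written in Eq. (8)
lam, p1 = leading_eigenpair(M)     # largest eigenvalue and first projection vector
print(round(lam, 4), [round(x, 4) for x in p1])
```

Only the first projection vector is obtained here; collecting the u (or v) leading eigenvectors would require deflation or a general-purpose eigen-solver.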
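The four (within- and between-class) fuzzy scatter matrices ($\mathbf{G}_{rb}^{f}$,
$\mathbf{G}_{rw}^{f}$, $\mathbf{G}_{cb}^{f}$, $\mathbf{G}_{cw}^{f}$) along the row and
column directions are defined as follows:
$$
\begin{cases}
\mathbf{G}_{rb}^{f} = \frac{1}{N}\sum_{c=1}^{C} N_c^{f}\,(\bar{\mathbf{m}}_c - \bar{\mathbf{m}})(\bar{\mathbf{m}}_c - \bar{\mathbf{m}})^{T} \\
\mathbf{G}_{rw}^{f} = \frac{1}{N}\sum_{c=1}^{C}\sum_{i \in c} (\mathbf{X}_i - \bar{\mathbf{m}}_c)(\mathbf{X}_i - \bar{\mathbf{m}}_c)^{T}
\end{cases}
\tag{9a}
$$
$$
\begin{cases}
\mathbf{G}_{cb}^{f} = \frac{1}{N}\sum_{c=1}^{C} N_c^{f}\,(\bar{\mathbf{m}}_c - \bar{\mathbf{m}})^{T}(\bar{\mathbf{m}}_c - \bar{\mathbf{m}}) \\
\mathbf{G}_{cw}^{f} = \frac{1}{N}\sum_{c=1}^{C}\sum_{i \in c} (\mathbf{X}_i - \bar{\mathbf{m}}_c)^{T}(\mathbf{X}_i - \bar{\mathbf{m}}_c)
\end{cases}
\tag{9b}
$$
where the fuzzy membership degrees are integrated into the training images to obtain the
fuzzy global mean image
$$\bar{\mathbf{m}} = \frac{\sum_{c=1}^{C}\sum_{i=1}^{N}\mu_{ci}\,\mathbf{X}_i}{\sum_{c=1}^{C}\sum_{i=1}^{N}\mu_{ci}}$$
and the fuzzy class-wise mean images
$$\bar{\mathbf{m}}_c = \frac{\sum_{i=1}^{N}\mu_{ci}\,\mathbf{X}_i}{\sum_{i=1}^{N}\mu_{ci}};\quad c = 1, 2, 3, \ldots, C.$$
It may also be noted that the size of the $\mathbf{G}_{rb}^{f}$ and $\mathbf{G}_{rw}^{f}$
scatter matrices is $r \times r$, whereas the size of the $\mathbf{G}_{cb}^{f}$ and
$\mathbf{G}_{cw}^{f}$ scatter matrices is $s \times s$.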
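The fuzzy means and the row-direction pair of fuzzy scatter matrices can be sketched as follows. The text does not define the fuzzy class size used to weight the between-class terms, so this sketch assumes it equals the sum of the memberships in that class; it also assumes integer labels 0..C-1 that index the rows of U. The membership values and images are made up, and the column-direction pair follows by swapping the order of each outer product.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def combine(a, b, wa=1.0, wb=1.0):
    """Element-wise wa * a + wb * b."""
    return [[wa * x + wb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


def fuzzy_means(images, u):
    """Fuzzy global mean and fuzzy class-wise means from a C x N membership matrix u."""
    r, s = len(images[0]), len(images[0][0])
    weighted_sum, weight_total, class_means = zeros(r, s), 0.0, []
    for mu_c in u:
        acc = zeros(r, s)
        for mu, x in zip(mu_c, images):
            acc = combine(acc, x, 1.0, mu)            # accumulate mu_ci * X_i
        w = sum(mu_c)
        class_means.append([[v / w for v in row] for row in acc])
        weighted_sum = combine(weighted_sum, acc)
        weight_total += w
    global_mean = [[v / weight_total for v in row] for row in weighted_sum]
    return global_mean, class_means


def fuzzy_row_scatter(images, labels, u):
    """Row-direction fuzzy between- and within-class scatter matrices (r x r)."""
    n, r = len(images), len(images[0])
    m_bar, m_bar_c = fuzzy_means(images, u)
    g_rb, g_rw = zeros(r, r), zeros(r, r)
    for c, mu_c in enumerate(u):
        d = combine(m_bar_c[c], m_bar, 1.0, -1.0)
        n_c_f = sum(mu_c)  # ASSUMPTION: fuzzy class size = sum of memberships
        g_rb = combine(g_rb, matmul(d, transpose(d)), 1.0, n_c_f / n)
        for x, y in zip(images, labels):
            if y == c:                                # crisp label picks i in c
                e = combine(x, m_bar_c[c], 1.0, -1.0)
                g_rw = combine(g_rw, matmul(e, transpose(e)), 1.0, 1.0 / n)
    return g_rb, g_rw


images = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [3.0, 0.0]],
    [[0.0, 5.0], [5.0, 0.0]],
]
labels = [0, 0, 1, 1]
U = [[1.0, 0.75, 0.0, 0.25],   # hypothetical memberships in class 0
     [0.0, 0.25, 1.0, 0.75]]   # hypothetical memberships in class 1
m_bar, m_bar_c = fuzzy_means(images, U)
G_rb_f, G_rw_f = fuzzy_row_scatter(images, labels, U)
print(m_bar, m_bar_c[0])
```

Because every column of U sums to 1, the fuzzy global mean in this sketch coincides with the ordinary mean of the training images, while the class-wise means are pulled towards samples that have partial membership in the class.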
4 Simulation results and discussion 
We have assessed the performance of the proposed FG-2DLDA method on three publicly available
databases, namely FERET [39, 40], AT&T [41], and UMIST [42]. The average recognition rate is
calculated as follows:
$$R_{avg} = \frac{\sum_{i=1}^{q} n_{cls}^{i}}{q \times n_{tot}} \tag{10}$$
where $q$ denotes the total number of experimental runs, $n_{cls}^{i}$ is the number of
correctly recognized test images in the $i$-th run, and $n_{tot}$ is the total number of
test face images.
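Eq. (10) amounts to pooling the correct recognitions over all runs. A minimal sketch is given below; the function name and the counts are hypothetical, and the result is multiplied by 100 only to express it as a percentage, as the rates are reported in this section.

```python
def average_recognition_rate(correct_per_run, n_test):
    """R_avg of Eq. (10): total correct recognitions over q runs, divided by q * n_tot."""
    q = len(correct_per_run)
    if q == 0 or n_test <= 0:
        raise ValueError("need at least one run and a positive number of test images")
    return sum(correct_per_run) / (q * n_test)


# Hypothetical counts: three runs, each evaluated on 100 test images.
correct_counts = [93, 95, 94]
rate = average_recognition_rate(correct_counts, n_test=100)
print(f"{100 * rate:.2f}%")
```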
The FERET face database is used to evaluate the FG-2DLDA method under several facial
expressions, poses and lighting conditions. The AT&T and UMIST databases are used to assess
the presented method under minor variations of rotation and scaling. In these experiments,
we have used an RBFNN classifier because of its simplicity and its good performance relative
to other types of neural networks. The experiments are performed to validate the FG-2DLDA
method described in Section 3. The FG-2DLDA algorithm is implemented in the C programming
language on the Linux operating system with an Intel Core i5 (2.4 GHz) processor and 8 GB of
DDR3 (1333 MHz) memory. The method is evaluated on a subset of the FERET face database
[39, 40]. The subset consists of 1400 images of 200 individuals, with 7 images per
individual. The images differ in facial expression, illumination and pose. In our study, the
facial portion of each original image was cropped and resized to 80×80 pixels based on the
location of the eyes. Here, the number of training images per individual, s, is taken as 2,
3 and 4, and the method is run 10 times for each value of s with different training and test
sets. Some example images of a person are shown in Fig. 1 (i). The AT&T face database
comprises 400 images of 40 persons, with 10 different images for each person. In the present
study, s images of each person are picked at random from the database to form the training
set, and the remaining (10 − s) images are used as the test set. Hence, the training and
test sets are disjoint. The values of s are taken as 3, 4, 5, and 6 to form different pairs
of training and test sets. Some example images of an individual are shown in Fig. 1 (ii).
The multi-view UMIST database contains a total of 575 grey-scale images of 20 different
individuals covering a variety of race, sex, and appearance. The number of images per
individual varies from 19 to 48. In the present study, we have resized each image to
112 × 92 pixels. Fig. 1 (iii) shows face images of one person from the database.
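The random per-subject split described above can be sketched as follows, using an AT&T-like layout of 40 subjects with 10 images each. The function name, the dictionary layout, and the fixed seed are assumptions of this sketch; it is not the authors' C implementation.

```python
import random


def split_per_subject(images_by_subject, s, rng):
    """Pick s images per subject at random for training; the rest form the test set."""
    train, test = [], []
    for subject, image_ids in sorted(images_by_subject.items()):
        if not 0 < s < len(image_ids):
            raise ValueError("s must leave at least one test image per subject")
        chosen = set(rng.sample(image_ids, s))
        for image_id in image_ids:
            (train if image_id in chosen else test).append((subject, image_id))
    return train, test


# AT&T-like layout: 40 subjects, 10 image indices each (indices stand in for image files).
dataset = {subject: list(range(10)) for subject in range(40)}
rng = random.Random(0)  # fixed seed so that a run can be repeated
train_set, test_set = split_per_subject(dataset, s=5, rng=rng)
print(len(train_set), len(test_set))
```

Repeating the call with a different random state yields another disjoint pair of training and test sets, which is how multiple runs per value of s can be generated.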
The experiments on the FERET face database are repeated 10 times for each value of s with
different training and test sets. Here, s = 2, 3, 4 images of each subject are chosen at
random for training and the remaining (7 − s) images are used for testing. The proposed
method is evaluated with feature matrix sizes from 6×6 to 20×20 using the RBFNN as a
classifier. Fig. 2 shows the minimum, maximum and average recognition rates (%) of the
FG-2DLDA method for different values of s (2, 3 and 4) as the feature size varies. We have
compared the performance of the proposed FG-2DLDA method with other competitive related
methods.

Figure 1: Some pictures of a person from the (i) FERET, (ii) AT&T, and (iii) UMIST face databases.

Figure 2: Minimum, maximum and average recognition rates of the FG-2DLDA method
for different values of s by varying feature size on the FERET face database.
[Panels: s = 2, s = 3, s = 4. x-axis: the dimension of feature vector (6×6 to 20×20); y-axis: avg. recognition rate (%).]

Table 2: Comparison in terms of average recognition rates (%) obtained from
different methods on the FERET face database.
| Method | s = 2 | s = 3 | s = 4 |
|---|---|---|---|
| FG-2DLDA | 49.05 (10×10) | 58.81 (12×12) | 65.51 (10×10) |
| F-2DFLD [22] | 48.88 (40×8) | - | - |
| MMSD (θ = 0.4) [28] | - | 52.6 | 55.81 |
| MSD (θ = 0.4) [28] | - | 50.5 | 53.68 |
| FMSD (θ = 0.4) [28] | - | 53.46 | 56.9 |
| Alternative-2DPCA [35] | 48.31 (112×20) | 53.21 (112×20) | 53.97 (112×20) |
| (2D)2PCA [35] | 47.70 (112×20) | 52.36 (112×20) | 55.45 (112×20) |
| 2DPCA [35] | 47.12 (112×20) | 52.66 (112×20) | 55.20 (112×20) |
*Highest recognition rates are indicated by the bold values.

Figure 3: Minimum, maximum and average recognition rates of the FG-2DLDA method for
different values of s by varying feature size on the AT&T face database.
[Panels: s = 3, s = 4, s = 5, s = 6. x-axis: the dimension of feature vector (6×6 to 22×22); y-axis: avg. recognition rate (%).]

Figure 4: Minimum, maximum and average recognition rates of the FG-2DLDA method for
different values of s by varying feature size on the UMIST face database.
[Panels: s = 4, s = 6, s = 8, s = 10. x-axis: the dimension of feature vector (8×8 to 24×24); y-axis: avg. recognition rate (%).]

The FG-2DLDA method extracts discriminative features by calculating the within-class and
between-class
scatter matrices in the row and column directions. Thus, the results again demonstrate the
superiority of the FG-2DLDA method over the other methods.
In this study, we have validated the performance of our method with 20 different pairs of
training and test sets for each value of s on the AT&T face database. Since the present
method considers that a face image may simultaneously belong to different classes with
possibly different membership values, the class-wise mean images may differ from the actual
ones. Fig. 3 shows the minimum, maximum and average recognition rates of the FG-2DLDA method
for different values of s as the feature size varies on the AT&T face database. The proposed
method yields best average recognition rates of 93.41% (14×14), 96.08% (16×16), 98.08%
(14×14), and 98.68% (18×18) for s = 3, 4, 5, and 6, respectively.
Table 3 presents the best average recognition rates achieved by this algorithm for different
combinations of training and test sets, together with the results of the other competitive
methods. In general, face images are severely affected by environmental conditions. These
factors need to be investigated to measure their impact on the intra-class assignment. The
scatter matrices incorporate the overlapping sample distribution information for
classification.
For the UMIST database, s is taken as 4, 6, 8 and 10 to generate distinct pairs of training
and test sets, and each pair of training and test sets is disjoint. The performance of the
proposed technique is evaluated for each value of s with 20 different pairs of training and
test sets. Fig. 4 shows the minimum, maximum and average recognition rates (%) of the
FG-2DLDA method for different values of s as the feature size varies. Table 4 compares the
FG-2DLDA method with other contemporary methods in terms of the best average recognition
rates. The proposed method yields best average recognition rates (feature size) of 86.81%
(18×18), 92.75% (20×20), 96.83% (14×14), and 97.3% (14×14) for s = 4, 6, 8 and 10,
respectively. In this case, the discriminative information is extracted by calculating the
fuzzy scatter matrices. The discriminative projection vectors are obtained when the fuzzy
scatter matrices are singular. The results show that, in all cases, the performance of the
FG-2DLDA method is superior to that of the other methods.
5 Conclusion 
In this paper, a fuzzy generalized two-dimensional Fisher’s linear discriminant analysis
(FG-2DLDA) method for face recognition is presented. The method assumes that a face image
may belong to several classes with possibly different membership values. These membership
values are generated by the fuzzy k-NN algorithm and used to compute the fuzzy global mean
image and the fuzzy class-wise mean images. These mean images are then used to compute the
fuzzy intra-class and inter-class scatter matrices along the row and column directions. The
projection matrices obtained by solving the eigenvalue problems of these scatter matrices,
satisfying the two Fisher’s criteria, yield rich information that leads to superior
discriminant features. Image classification and recognition are performed using an RBF
neural network. The performance of our method is validated on the FERET, AT&T and UMIST face
databases. The experimental results demonstrate that the FG-2DLDA method outperforms the
competing methods.
Table 4: Comparison in terms of average recognition rates (%) obtained
from different methods on the UMIST face database.
| Method | s = 4 | s = 6 | s = 8 | s = 10 |
|---|---|---|---|---|
| FG-2DLDA | 86.81 (18×18) | 92.75 (20×20) | 96.83 (14×14) | 97.30 (14×14) |
| G-2DFLD [18] | 86.22 (14×14) | 92.28 (14×14) | 95.54 (14×14) | 96.92 (18×18) |
| F-LDA [29] | 84.5 (19) | - | - | 92.01 (19) |
| MF-LDA [29] | 85.38 (19) | - | - | 92.53 (19) |
| 2DFLD [18] | 86.12 (112×14) | 92.16 (112×14) | 95.25 (112×14) | 96.55 (112×18) |
| 2DPCA [18] | 85.70 (112×14) | 91.91 (112×14) | 95.07 (112×14) | 96.60 (112×18) |
| RF-LDA [29] | 84.8 (19) | - | - | 92.38 (19) |
| PCA [18] | 80.72 (60) | 86.53 (60) | 94.01 (60) | 95.11 (60) |
*Highest recognition rates are indicated by the bold values.
Acknowledgement 
This work was supported by the Senior Research Fellowship Program of Aniruddha Dey under the
State Government Fellowship (Ref. No. - P-1/RS./365/12, dated 05th October, 2012) of the
Department of Computer Science & Engineering, Jadavpur University, Kolkata.
References 
[1] R. Chellappa, C. L. Wilson, and S. Sirohey. (1995) 
Human and machine recognition of faces: a survey. 
Proc. IEEE vol. 83, 705–740. 
https://doi.org/10.1109/5.381842 
[2] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld.
(2003) Face recognition: a literature survey.
ACM Comput. Surveys. 35: 399–458. 
https://doi.org/10.1145/954339.954342  
[3] A. S. Tolba, A.H. El-Baz, and A.A. El-Harby. 
(2006) Face recognition: a literature review. Int. J. 
Signal Process, 2: 88–103. 
[4] H. Zhou, A. Mian, L. Wei, D. Creighton, M. 
Hossny, and S. Nahavandi. (2014) Recent advances 
on singlemodal and multimodal face recognition: a 
survey. IEEE Trans. Human Machine Systems, 
44(6): 701–716. 
https://doi.org/10.1109/THMS.2014.2340578 
[5] P. N. Belhumeur, J. P. Hespanha, and D. J. 
Kriegman. (1997) Eigenfaces versus fisherfaces: 
recognition using class specific linear projection. 
IEEE Trans. Pattern Anal. Mach. Intell., 19: 711–720. 
https://doi.org/10.1109/34.598228 
[6] B. Poon, M. A. Amin, and H. Yan. (2011) 
Performance evaluation and comparison of PCA 
Based human face recognition methods for distorted 
images. International Journal of Machine Learning 
and Cybernetics, 2(4): 245-259. 
https://doi.org/10.1007/s13042-011-0023-2 
[7] G. J. Alvarado, W. Pedrycz, M. Reformat, and 
K.-C. Kwak. (2006) Deterioration of visual 
information in face classification using eigenfaces 
and fisherfaces. Machine Vision and Applications, 
17(1): 68–82. 
https://doi.org/10.1007/s00138-006-0016-4 
[8] L. F. Chen, H. Y. M. Liao, M. T. Ko, J. C. Lin, and 
G. J. Yu. (2000) A new LDA-based face recognition 
system which can solve the small sample size 
problem. Pattern Recogn., 33: 1713–1726. 
https://doi.org/10.1016/S0031-3203(99)00139-9 
[9] H. Yu, and J. Yang. (2001) A direct LDA 
algorithm for high-dimensional data with 
application to face recognition. Pattern Recogn., 34: 
2067–2070. 
https://doi.org/10.1016/S0031-3203(00)00162-X 
[10] X. S. Zhuang, and D. Q. Dai. (2007) Improved 
discriminant analysis for high-dimensional data and 
its application to face recognition. Pattern Recogn., 
40(5): 1570-1578. 
https://doi.org/10.1016/j.patcog.2006.11.015 
[11] D. Swets, and J. Weng. (1996) Using discriminant 
eigenfeatures for image retrieval. IEEE Trans. 
Pattern Anal. Mach. Intell., 18(8): 831–836. 
https://doi.org/10.1109/34.531802 
[12] J. Friedman. (1989) Regularized discriminant 
analysis. J. Am. Stat. Assoc., 84(405): 165–175. 
https://doi.org/10.1080/01621459.1989.10478752 
[13] R. Huang, Q. Liu, H. Lu, and S. Ma. (2002) Solving 
the small sample size problem of LDA. In 
Proceedings of the 16th International Conference 
on Pattern Recognition, 3, 29–32. 
https://doi.org/10.1109/ICPR.2002.1047787 
[14] H. Yu, and J. Yang. (2001) A direct LDA algorithm 
for high-dimensional data with application to face 
recognition. Pattern Recogn., 34(10): 2067–2070. 
https://doi.org/10.1016/S0031-3203(00)00162-X 
[15] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski. 
(2002) Face recognition by independent component 
analysis. IEEE Trans. Neural Netw., 13(6): 1450–1464. 
https://doi.org/10.1109/TNN.2002.804287 
[16] J. Yang, D. Zhang, A. F. Frangi, and J. Y. Yang. 
(2004) Two-dimensional PCA: a new approach to 
appearance-based face representation and 
recognition. IEEE Trans. Pattern Anal. Mach. 
Intell., 26(1):131–137. 
https://doi.org/10.1109/TPAMI.2004.1261097 
[17] H. Xiong, M. N. S. Swamy, and M. O. Ahmad. 
(2005) Two-dimensional FLD for face recognition. 
Pattern Recogn., 38(7):1121–1124. 
https://doi.org/10.1016/j.patcog.2004.12.003 
[18] S. Chowdhury, J. K. Sing, D. K. Basu, and M. 
Nasipuri. (2011) Face recognition by generalized 
two-dimensional FLD method and multi-class 
support vector machines. Appl. Soft 
Comput., 11(7): 4282–4292. 
https://doi.org/10.1016/j.asoc.2010.12.002 
[19] H. Yu, and J. Yang. (2001) A direct LDA algorithm 
for high-dimensional data with application to face 
recognition, Pattern Recogn., 34: 2067–2070. 
https://doi.org/10.1016/S0031-3203(00)00162-X 
[20] J. Yang, and J. Yang. (2003) Why can LDA be 
performed in PCA transformed space? Pattern 
Recognit. 36 (2): 563-566. 
https://doi.org/10.1016/S0031-3203(02)00048-1 
[21] J. Ye, and Q. Li. (2004) LDA/QR: An efficient and 
effective dimension reduction algorithm and its 
theoretical foundation, Pattern Recognit. 37 (4), 
851–854. 
https://doi.org/10.1016/j.patcog.2003.08.006 
[22] J. Ye, R. Janardan, C. Park, and H. Park. (2004) An 
optimization criterion for generalized discriminant 
analysis on undersampled problems, IEEE Trans. 
Pattern Anal. Mach. Intell. 26 (8): 982–994. 
https://doi.org/10.1109/TPAMI.2004.37 
[23] J. M. Keller, M. R. Gray, and J. A. Givens. (1985) 
A fuzzy k-nearest neighbor algorithm. IEEE Trans. 
Syst. Man Cybernet., 15(4):580–585. 
https://doi.org/10.1109/TSMC.1985.6313426 
[24] K. C. Kwak, and W. Pedrycz. (2005) Face 
recognition using a fuzzy fisherface classifier. 
Pattern Recogn., 38(10): 1717–1732. 
https://doi.org/10.1016/j.patcog.2005.01.018 
[25] W. Yang, J. Wang, M. Ren, and J. Yang. (2009) 
Fuzzy 2-dimensional FLD for face recognition. 
Journal of Information and Computing Science, 
4(3): 233-239. 
https://doi.org/10.1109/CCPR.2009.5344077 
[26] W. Yang, J. Wang, M. Ren, L. Zhang, and J. Yang. 
(2009) Feature extraction using fuzzy inverse FDA. 
Neurocomputing, 72(13–15): 3384–3390. 
https://doi.org/10.1016/j.neucom.2009.03.011 
[27] X. N. Song, Y. J. Zheng, X. J. Wu, X. B. Yang, and 
J. Y. Yang. (2010) A complete fuzzy discriminant 
analysis approach for face recognition. Applied Soft 
Computing, 10: 208-214. 
https://doi.org/10.1016/j.asoc.2009.07.002 
[28] L. Xiaodong, F. Shumin, and T. Zhang. (2013) 
Weighted maximum scatter difference based feature 
extraction and its application to face recognition. 
Machine Vision and Applications, 22: 591-595.  
[29] M. Zhao, T. W. S. Chow, and Z. Zhang. (2012) 
Random walk-based fuzzy linear discriminant 
analysis for dimensionality reduction. Soft 
Computing, 16:1393-1409. 
https://doi.org/10.1007/s00500-012-0843-3 
[30] J. Wang, W. Yang, and J. Yang. (2013) Face 
recognition using fuzzy maximum scatter 
discriminant analysis. Neural Computing and 
Applications, 23: 957–964. 
https://doi.org/10.1007/s00521-012-1020-4 
[31] X. Li, and A. Song. (2013) Fuzzy MSD based 
feature extraction method for face recognition. 
Neurocomputing, 122: 266-271. 
https://doi.org/10.1016/j.neucom.2013.06.025 
[32] X. Li. (2014) Face recognition method based on 
fuzzy 2DPCA. Journal of Electrical and Computer 
Engineering, 2014: 1–7. 
https://doi.org/10.1155/2014/919041 
[33] N. Zheng, L. Qi, and L. Guan. (2014) Generalised 
multiple maximum scatter difference feature 
extraction using QR decomposition. Journal of 
Visual Communication and Image Representation, 
25:1460-1471. 
https://doi.org/10.1016/j.jvcir.2014.04.009 
[34] J.K. Sing. (2015) A novel Gaussian probabilistic 
generalized 2DLDA for feature extraction and face 
recognition. In Proceedings of the IEEE Conference 
on Computer Graphics, Vision and Information 
Security, pages 258-263. 
https://doi.org/10.1109/CGVIS.2015.7449933 
[35] P. Huang, Z. Yang, and C. Chen. (2015) Fuzzy 
local discriminant embedding for image feature 
extraction. Computers and Electrical Engineering, 
46:231-240. 
https://doi.org/10.1016/j.compeleceng.2015.03.013 
[36] J. Xu, Z. Gu, and K. Xie. (2016) Fuzzy local mean 
discriminant analysis for dimensionality reduction. 
Neural Processing Letters, 44: 701–718. 
https://doi.org/10.1007/s11063-015-9489-3 
[37] P. Huang, G. Gao, C. Qian, G. Yang, and Z. Yang. 
(2017) Fuzzy linear regression discriminant 
projection for face recognition. IEEE Access, 
23:169-174. 
https://doi.org/10.1109/ACCESS.2017.2680437 
[38] Q. Zhu, and Y. Xu. (2013) Multi-directional two 
dimensional PCA with matching score level fusion 
for face recognition. Neural Comput. & Applic., 
23(1): 169–174. 
https://doi.org/10.1007/s00521-012-0851-3 
[39] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss. 
(2000) The FERET evaluation methodology for 
face-recognition algorithms. IEEE Trans. Pattern 
Anal. Mach. Intell., 22: 1090–1104. 
https://doi.org/10.1109/34.879790 
[40] P. J. Phillips. (2004) The Facial Recognition 
Technology (FERET) database. 
<http://www.itl.nist.gov/iad/humanid/feret/feret_master.html>. 
[41] The ORL face database, 
<http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html>. 
[42] D. B. Graham, and N. M. Allinson. (1998) 
Characterizing virtual eigensignatures for general 
purpose face recognition. In H. Wechsler, P. J. 
Phillips, V. Bruce, F. Fogelman-Soulie, and T. S. 
Huang (Eds.), Face Recognition: From Theory to 
Applications. NATO ASI Series F, Computer and 
Systems Sciences, 163: 446–456. 
https://doi.org/10.1007/978-3-642-72201-1_25 
 
  