DOI: 10.2478/orga-2013-0003

Financial Distress Prediction of Iranian Companies Using Data Mining Techniques

Mahdi Moradi1, Mahdi Salehi1*, Mohammad Ebrahim Ghorgani2, Hadi Sadoghi Yazdi1

1Ferdowsi University of Mashhad, Iran
2East Oil and Gas Company, NIOC, Iran

Decision-making problems in the area of financial status evaluation are considered very important. Incorrect decisions in firms are very likely to cause financial crises and distress. Predicting the financial distress of factories and manufacturing companies is of interest to managers, investors, auditors, financial analysts, government officials and employees. The current study therefore aims to predict the financial distress of Iranian companies. It applies support vector data description (SVDD) to the financial distress prediction problem in an attempt to suggest a new model with better explanatory power and stability. To this end, we use a grid-search technique with 3-fold cross-validation to find the optimal parameter values of the SVDD kernel function. To evaluate the prediction accuracy of SVDD, we compare its performance with that of fuzzy c-means (FCM). The experimental results show that SVDD outperforms FCM in the years before financial distress occurs. The data used in this research were obtained from the Iran Stock Market and Accounting Research Database. From the data between 2000 and 2009, 70 pairs of companies listed on the Tehran Stock Exchange were selected as the initial data set.

Keywords: financial distress prediction; support vector data description; fuzzy c-means.

1 Introduction

The empirical literature on financial distress prediction has gained considerable attention since the 2007-2009 global financial crisis. Policymakers (Dodd-Frank Act of 2010) and regulators (SEC, Basel III) have emphasized the failure of many banks in the aftermath of the global financial crisis and are seeking the best way to predict business failures.
Prior studies have followed two major research trends in financial distress prediction. One investigates the situation of failure in order to find its symptoms (Dambolena & Khoury, 1980; Gombola & Ketz, 1983; Jo & Han, 1997; Scott, 1981). The other compares the prediction accuracy of diverse classification methods (Tam & Kiang, 1992; Jo & Han, 1997). This study belongs to the second group. Its primary purpose is to apply support vector data description (SVDD) to the financial distress prediction problem in an attempt to suggest a new model with better explanatory power and stability. We use a grid-search technique with 3-fold cross-validation to examine the optimal parameter values of the SVDD kernel function. In addition, to evaluate the prediction accuracy of SVDD, we compare its performance with that of fuzzy c-means (FCM). Using data from the Iran Stock Market and Accounting Research Database for 70 pairs of companies listed on the Tehran Stock Exchange between 2000 and 2009, we find that SVDD outperforms FCM.

2 Literature review

The empirical literature on financial distress prediction has recently gained further momentum and attention from financial institutions. Academics and practitioners realize that the problem of asymmetric information between banks and firms lies at the heart of important market failures such as credit rationing, and that improvement in monitoring techniques represents a valuable alternative to any incomplete contractual arrangement aimed at reducing the borrowers' moral hazard (Becchetti & Sierra, 2003; Stiglitz & Weiss, 1981; Xu, 2000). Among financial distress forecasting methods, discriminant analysis was the dominant method for predicting corporate failure from 1966 until the early 1980s (Altman, 1968, 1983; Back et al., 1996b). It gained wide popularity due to its ease of use and interpretation.
However, both linear and quadratic discriminant analyses are sensitive to deviations from multivariate normality (Karels & Prakash, 1987; Laitinen & Laitinen, 2000). During the 1980s, the probit (Zmijewski, 1984) and, especially, the logit (logistic regression) methods (Back et al., 1996b; Ohlson, 1980) were used in place of the discriminant method. These two models give a crisp relationship between the explanatory and response variables of the given data from a statistical viewpoint and do not assume multivariate normality. However, the probit model assumes that the cumulative probability distribution is the standard normal distribution, while the logit model assumes that it is the logistic distribution. Since the 1990s, neural networks have been the most widely used technique in quantitative financial distress prediction (Back et al., 1996b; Tam & Kiang, 1992; Wilson & Sharda, 1994), in particular because of the approximation and classification power of the multilayer perceptron (MLP) trained by the backpropagation algorithm (Hassoun, 1995; Hertz, Krogh, & Palmer, 1991). Many studies have compared backpropagation neural networks with statistical methods and found that they outperform methods such as multivariate discriminant analysis (MDA), probit and logit (Back et al., 1996a; Shin, Lee & Kim, 2005; Wilson & Sharda, 1994). Neural networks have recently been employed to extract rules for solving fuzzy classification problems (Kim et al., 2003).

* Corresponding author: Ferdowsi University of Mashhad, Faculty of Economics and Business Administration, Azadi Square, Vakilabad Bolvard, Mashhad City, Khorasan Razavi Province, Iran, E-mail: mahdi.salehi@um.ac.ir
Received: 22nd October 2012; revised: 14th December 2012; accepted 5th January 2013
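The distributional assumptions of the probit and logit models mentioned above can be illustrated with a minimal sketch. Both models map a linear score of the explanatory variables to a probability; they differ only in the cumulative distribution function applied to that score. The coefficient values, the two ratios and the function names below are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math


def probit_probability(score: float) -> float:
    """Probit link: standard normal CDF evaluated at the linear score."""
    return 0.5 * (1.0 + math.erf(score / math.sqrt(2.0)))


def logit_probability(score: float) -> float:
    """Logit link: logistic CDF evaluated at the linear score."""
    return 1.0 / (1.0 + math.exp(-score))


def linear_score(coefficients, ratios, intercept=0.0):
    """Intercept plus the weighted sum of the explanatory variables."""
    return intercept + sum(c * r for c, r in zip(coefficients, ratios))


# Hypothetical coefficients and financial ratios (illustration only).
coefficients = [-1.2, 0.8]
ratios = [0.5, 0.25]

score = linear_score(coefficients, ratios, intercept=0.1)
p_probit = probit_probability(score)
p_logit = logit_probability(score)
print(f"score={score:.3f} probit={p_probit:.4f} logit={p_logit:.4f}")
```

Both links return 0.5 at a score of zero; for the same nonzero score the probit probability lies further from 0.5, because the standard normal distribution has thinner tails than the logistic distribution.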
A number of fields use the radial basis function network (RBFN), for example for classification problems (Jang, Sun, & Mizutani, 1997; Surendra & Krishnamurthy, 1997), function approximation (Chuang, Jeng, & Lin, 2004; Hertz et al., 1991; Jang, 1993; Jang et al., 1997; Nam & Thanh, 2003) and management science (Stam, Sun, & Haines, 1996; Vythoulkas & Koutsopoulos, 2003). The approximation or classification power of the MLP trained by the backpropagation algorithm (Hassoun, 1995; Hertz et al., 1991) and of the RBFN is determined by the number of hidden nodes. In fact, the number of hidden layers also influences the performance of the backpropagation MLP. Additionally, an RBFN is functionally equivalent to a zero-order Sugeno fuzzy inference system under some conditions (Jang et al., 1997). It has also been proven that the zero-order Sugeno fuzzy inference system can approximate any nonlinear function on a compact set to an arbitrary degree of accuracy under certain conditions (Jang, 1993). However, if a phenomenon under consideration does not have stochastic variability but is nevertheless uncertain in some sense, it is more natural to seek a fuzzy functional relationship for the given data, which may be either fuzzy or crisp. Sun and Li (2008) used a weighted majority voting combination of multiple classifiers for financial distress prediction (FDP), and Chen and Du (2009) introduced an integration strategy with subject weight based on neural networks for FDP. These studies all generated diverse classifiers by applying different learning algorithms (with heterogeneous model representations) to a single data set, and concluded that, to some degree, FDP based on a combination of multiple classifiers was superior to single classifiers in terms of accuracy rate or stability.
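The idea of combining several classifiers by weighted majority voting can be sketched as follows. This is only an illustrative sketch: the labels, the three classifier outputs and the weights are hypothetical, and the weighting scheme is not claimed to be the one used in the cited studies.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the largest total weight."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three classifiers for one company, with
# hypothetical weights (for example, derived from validation accuracy).
predictions = ["distressed", "healthy", "healthy"]
weights = [0.9, 0.3, 0.4]

combined = weighted_majority_vote(predictions, weights)
print(combined)
```

With these assumed weights the single high-weight classifier outvotes the two low-weight ones, whereas equal weights would reduce the scheme to a simple majority vote.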
The most widely used machine learning technique is the neural network model (Haykin, 1999) trained by the back-propagation learning algorithm (Wong et al., 1997; Wong and Selvi, 1998), whose prediction accuracy outperforms that of statistical models, including logistic regression (LR), linear discriminant analysis (LDA) and multiple discriminant analysis (MDA), and of other machine learning models, such as k-nearest neighbor (k-NN) and decision trees. In addition, the back-propagation neural network (BPN) model can be used as the benchmark for financial decision support models. Chen and Du (2009) found that the prediction performance of the clustering approach is more aggressively influenced than that of the BPN model, and that the BPN approach obtains better prediction accuracy than the data mining (DM) clustering approach in developing a financial distress prediction model. Other work used multiple classifiers, diversified by training neural networks on different data sets, for financial distress prediction; the experimental results showed that multiple neural network classifiers did not outperform a single best neural network classifier, on which basis the authors considered that the proposed multiple-classifier system may not be suitable for a binary classification problem such as financial distress prediction. Song et al. (2010) applied a genetic algorithm (GA) based approach and statistical filter approaches to identify the best features for the support vector machine (SVM). The proposed GA-based approach is designed to optimize the features and the parameters of the SVM simultaneously. Experimental results on data from Chinese companies showed that the GA-based approach could extract fewer features with a higher accuracy than the statistical filter approaches. Recent artificial intelligence (AI) approaches, such as ANN (Ravisankar & Ravi, 2010) and SVM (Lin et al.,
2011; Min & Lee, 2005; Bao et al., 2012), have also been successfully applied to financial distress prediction. The purpose of this paper is to apply fuzzy c-means clustering and support vector data description (SVDD) to financial distress prediction. Fuzzy c-means (FCM) clustering is a well-known unsupervised clustering technique that allows one piece of data to belong to two or more clusters. SVDD is related to the support vector machine (SVM), an algorithm that finds a special kind of linear model, the maximum margin hyperplane. The maximum margin hyperplane gives the maximum separation between decision classes. The training examples that are closest to the maximum margin hyperplane are called support vectors. The SVDD classifier will be trained with different kernel functions in order to compare it with the benchmark of the neural network model. In SVDD, using different kernel functions and determining the optimal values of the parameters used in training will lead to different results. Therefore, the current study aims to compare the accuracy of these two forecasting techniques in predicting the financial distress of companies. The original classification accuracy indicates that SVDD outperforms the FCM model.

3 Technical background

3.1 Support vector data description

SVDD, inspired by the idea of the support vector machine of Vapnik (1995), is a method of one-class classification in which the goal is to find not the optimal separating hyperplane but the sphere with minimal volume containing all or most objects; a sketch in two-dimensional space is shown in Figure 1. It is often used as a method of novelty detection. Boundary-based novelty detection essentially finds a sphere with minimum volume containing all (or most of) the normal data objects. For a data set containing N normal data objects, when one or a few very remote objects are in it, a very large sphere is obtained, which will not represent the data very well.
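The effect of a single remote object on the enclosing sphere can be illustrated with a small sketch. It approximates the smallest sphere that contains every point by a simple iterative scheme (the centre is moved towards the current farthest point with a shrinking step). This is a plain enclosing sphere without a kernel, not the SVDD formulation itself, and the data points are hypothetical.

```python
import math


def approx_enclosing_sphere(points, iterations=1000):
    """Approximate the smallest sphere that encloses all points.

    Starts at the first point and repeatedly moves the centre a
    shrinking step towards the point that is currently farthest away.
    """
    centre = list(points[0])
    for k in range(1, iterations + 1):
        farthest = max(points, key=lambda p: math.dist(p, centre))
        centre = [c + (f - c) / (k + 1) for c, f in zip(centre, farthest)]
    radius = max(math.dist(p, centre) for p in points)
    return centre, radius


# Hypothetical "normal" objects in two dimensions, plus one remote object.
normal_objects = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
with_outlier = normal_objects + [(10.0, 10.0)]

_, radius_clean = approx_enclosing_sphere(normal_objects)
_, radius_outlier = approx_enclosing_sphere(with_outlier)
print(f"radius without outlier: {radius_clean:.3f}")
print(f"radius with outlier:    {radius_outlier:.3f}")
```

In this assumed example the one remote object inflates the radius roughly tenfold, so the sphere no longer describes the four normal objects well.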
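Therefore, some data points are allowed to lie outside the sphere by introducing slack variables $\xi_i$; the sphere can then be described by its centre $a$ and radius $R$ as follows:

\[
\min L(R) = R^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad (x_i - a)^T (x_i - a) \le R^2 + \xi_i, \quad \xi_i \ge 0 \quad (i = 1, 2, \ldots, N) \tag{1}
\]

where the variable $C$ gives the trade-off between simplicity (or volume of the sphere) and the number of errors (number of target objects rejected). We construct the Lagrangian

\[
L(R, a, \alpha_i, \xi_i) = R^2 + C \sum_{i} \xi_i - \sum_{i} \alpha_i \left[ R^2 + \xi_i - \left( x_i^T x_i - 2 a^T x_i + a^T a \right) \right] - \sum_{i} \gamma_i \xi_i
\quad (\alpha_i \ge 0, \ \gamma_i \ge 0) \tag{2}
\]

Figure 1. The sketch of SVDD in two-dimensional space (labels: outlier; hypersphere (boundary))

Setting the partial derivatives to zero yields the new constraints

\[
\sum_{i} \alpha_i = 1, \qquad a = \sum_{i} \alpha_i x_i, \qquad C - \alpha_i - \gamma_i = 0 \quad \forall i \tag{3}
\]

Then the new optimization problem is obtained:

\[
\max L = \sum_{i=1}^{N} \alpha_i (x_i \cdot x_i) - \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j (x_i \cdot x_j) \tag{4}
\]

s.t. 0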