Credit Scoring Using Ensemble of Various Classifiers on Reduced Feature Set

  • Shashi Dahiya Manav Rachna International University (MRIU), Department of Computer Science and Engineering, Faridabad, India
  • S. Handa Manav Rachna International University (MRIU), Faridabad, India
  • Netra Pal Singh Management Development Institute
Keywords: Classification, Ensemble, Machine learning, Credit Scoring, Credit

Abstract


Credit scoring methods are widely used to evaluate loan applications in financial and banking institutions. A credit score identifies whether an applicant belongs to the good-risk group or the bad-risk group. These decisions are based on the customers' demographic data, their overall business with the bank, and their loan payment history. The advantages of credit scoring models include lower credit analysis costs, faster credit decisions, and reduced risk. Many statistical and machine learning techniques, such as Logistic Regression, Support Vector Machines, Neural Networks, and Decision Tree algorithms, have been used independently and as hybrid credit scoring models. This paper proposes an ensemble-based technique that combines seven individual models to increase classification accuracy. Feature selection is also used to select the attributes most important for classification. Cross classification was conducted using three data partitions. The study uses the German credit dataset, which has 1000 instances and 21 attributes. The experiments showed that the ensemble model achieved very good accuracy compared to the individual models. In all three partitions, the ensemble model correctly classified more than 80% of the loan customers as good creditors. For the 70:30 partition, feature selection also had a positive effect on classifier accuracy: results improved for almost all models, including the ensemble model.
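The abstract does not say how the outputs of the seven individual models are combined. As a minimal sketch only, the Python below assumes simple majority voting over the seven models' predicted labels, together with a 70:30 train/test partition of 1000 instances. The function names (majority_vote, split_indices, accuracy) and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter
import random


def majority_vote(predictions):
    """Combine per-model label lists into one label per instance.

    `predictions` holds one list of labels per model. With seven models
    and two classes a tie cannot occur, so the most common label wins.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def split_indices(n, train_fraction, seed=0):
    """Shuffle the indices 0..n-1 and split them into train and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels and predictions from seven hypothetical models (illustration only,
# not results from the paper).
y_true = ["good", "good", "bad", "good", "bad"]
model_predictions = [
    ["good", "good", "bad", "good", "good"],
    ["good", "bad", "bad", "good", "bad"],
    ["good", "good", "bad", "bad", "bad"],
    ["bad", "good", "bad", "good", "bad"],
    ["good", "good", "good", "good", "bad"],
    ["good", "bad", "bad", "good", "good"],
    ["good", "good", "bad", "bad", "bad"],
]

ensemble_labels = majority_vote(model_predictions)
ensemble_accuracy = accuracy(y_true, ensemble_labels)
first_model_accuracy = accuracy(y_true, model_predictions[0])

# A 70:30 partition of 1000 instances, the dataset size given in the abstract.
train_idx, test_idx = split_indices(1000, 0.7)
```

In this toy case each model makes at least one mistake, but the mistakes fall on different instances, so the vote recovers every true label. That is the general motivation for ensembles; it says nothing about the accuracy figures reported in the paper.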

 

Author Biography

Netra Pal Singh, Management Development Institute

Department of Information Technology/Information Management

Professor

Published
2016/03/24
Section
Short Communication