SHAP-Instance Weighted and Anchor Explainable AI: Enhancing XGBoost for Financial Fraud Detection
DOI: 10.28991/ESJ-2024-08-06-016
References
West, J., & Bhattacharya, M. (2016). Intelligent financial fraud detection: A comprehensive review. Computers & Security, 57, 47-66. doi:10.1016/j.cose.2015.09.005.
Abdallah, A., Maarof, M. A., & Zainal, A. (2016). Fraud detection system: A survey. Journal of Network and Computer Applications, 68, 90-113. doi:10.1016/j.jnca.2016.04.007.
Khaksar, J., Salehi, M., & DashtBayaz, M. L. (2022). The relationship between auditor characteristics and fraud detection. Journal of Facilities Management, 20(1), 79-101. doi:10.1108/JFM-02-2021-0024.
Cordis, A. (2023). Political alignment and corporate fraud: Evidence from the United States of America. Journal of Applied Accounting Research. doi:10.1108/JAAR-06-2022-0159.
Rahman, M. J., & Jie, X. (2024). Fraud detection using fraud triangle theory: Evidence from China. Journal of Financial Crime, 31(1), 101–118. doi:10.1108/JFC-09-2022-0219.
Maniatis, A. (2022). Detecting the probability of financial fraud due to earnings manipulation in companies listed in Athens stock exchange market. Journal of Financial Crime, 29(2), 603–619. doi:10.1108/JFC-04-2021-0083.
Gong, Y., Li, J., Xu, Z., & Li, G. (2022). Detecting financial fraud using two types of Benford factors: Evidence from China. Procedia Computer Science, 214, 656–663. doi:10.1016/j.procs.2022.11.225.
Xiuguo, W., & Shengyong, D. (2022). An Analysis on Financial Statement Fraud Detection for Chinese Listed Companies Using Deep Learning. IEEE Access, 10, 22516-22532. doi:10.1109/ACCESS.2022.3153478.
Craja, P., Kim, A., & Lessmann, S. (2020). Deep learning for detecting financial statement fraud. Decision Support Systems, 139, 113421. doi:10.1016/j.dss.2020.113421.
Pai, P. F., Hsu, M. F., & Wang, M. C. (2011). A support vector machine-based model for detecting top management fraud. Knowledge-Based Systems, 24, 314–321. doi:10.1016/j.knosys.2010.10.003.
Alfaiz, N. S., & Fati, S. M. (2022). Enhanced Credit Card Fraud Detection Model Using Machine Learning. Electronics, 11(4), 662. doi:10.3390/electronics11040662.
Strelcenia, E., & Prakoonwit, S. (2023). Improving Classification Performance in Credit Card Fraud Detection by Using New Data Augmentation. AI, 4(1), 172–198. doi:10.3390/ai4010008.
Chaquet-Ulldemolins, J., Gimeno-Blanes, F.-J., Moral-Rubio, S., Muñoz-Romero, S., & Rojo-Álvarez, J.-L. (2022). On the Black-Box Challenge for Fraud Detection Using Machine Learning (I): Linear Models and Informative Feature Selection. Applied Sciences, 12(7), 3328. doi:10.3390/app12073328.
Zhao, Z., & Bai, T. (2022). Financial Fraud Detection and Prediction in Listed Companies Using SMOTE and Machine Learning Algorithms. Entropy, 24(8), 1157. doi:10.3390/e24081157.
Ali, A. A., Khedr, A. M., El-Bannany, M., & Kanakkayil, S. (2023). A Powerful Predicting Model for Financial Statement Fraud Based on Optimized XGBoost Ensemble Learning Technique. Applied Sciences, 13(4), 2272. doi:10.3390/app13042272.
Cheah, P. C. Y., Yang, Y., & Lee, B. G. (2023). Enhancing Financial Fraud Detection through Addressing Class Imbalance Using Hybrid SMOTE-GAN Techniques. International Journal of Financial Studies, 11(3), 110. doi:10.3390/ijfs11030110.
El Hlouli, F. Z., Riffi, J., Sayyouri, M., Mahraz, M. A., Yahyaouy, A., El Fazazy, K., & Tairi, H. (2023). Detecting Fraudulent Transactions Using Stacked Autoencoder Kernel ELM Optimized by the Dandelion Algorithm. Journal of Theoretical and Applied Electronic Commerce Research, 18(4), 2057-2076. doi:10.3390/jtaer18040103.
Raval, J., Bhattacharya, P., Jadav, N. K., Tanwar, S., Sharma, G., Bokoro, P. N., Elmorsy, M., Tolba, A., & Raboaca, M. S. (2023). RaKShA: A Trusted Explainable LSTM Model to Classify Fraud Patterns on Credit Card Transactions. Mathematics, 11(8), 1901. doi:10.3390/math11081901.
El Kafhali, S., Tayebi, M., & Sulimani, H. (2024). An Optimized Deep Learning Approach for Detecting Fraudulent Transactions. Information, 15(4), 227. doi:10.3390/info15040227.
Altman, E. I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance, 23(4), 589-609. doi:10.2307/2978933.
Altman, E. I. (1983). Corporate financial distress: A complete guide to predicting, avoiding, and dealing with bankruptcy. John Wiley & Sons. doi:10.1002/9781118267806.
Martins, T., de Almeida, A. M., Cardoso, E., & Nunes, L. (2024). Explainable Artificial Intelligence (XAI): A Systematic Literature Review on Taxonomies and Applications in Finance. IEEE Access, 12, 618-629. doi:10.1109/access.2023.3347028.
Medianovskyi, K., Malakauskas, A., Lakstutiene, A., & Yahia, S. B. (2023). Interpretable machine learning for SME financial distress prediction. 12th International Conference on Information Systems and Advanced Technologies (ICISAT 2022), Lecture Notes in Networks and Systems, Istanbul, Turkey. doi:10.1007/978-3-031-25344-7_42.
Torky, M., Gad, I., & Hassanien, A. E. (2023). Explainable AI Model for Recognizing Financial Crisis Roots Based on Pigeon Optimization and Gradient Boosting Model. International Journal of Computational Intelligence Systems, 16, 50. doi:10.1007/s44196-023-00222-9.
Tran, K. L., Le, H. A., Nguyen, T. H., & Nguyen, D. T. (2022). Explainable Machine Learning for Financial Distress Prediction: Evidence from Vietnam. Data, 7(11), 160. doi:10.3390/data7110160.
Nallakaruppan, M. K., Balusamy, B., Shri, M. L., Malathi, V., & Bhattacharyya, S. (2024). An Explainable AI framework for credit evaluation and analysis. Applied Soft Computing, 153, 111307. doi:10.1016/j.asoc.2024.111307.
Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. 31st International Conference on Neural Information Processing Systems (NIPS). Curran Associates, New York, United States. doi:10.48550/arXiv.1705.07874.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2018). Anchors: high-precision model-agnostic explanations. AAAI Conference on Artificial Intelligence, 2-7 February 2018, Louisiana, United States. doi:10.1609/aaai.v32i1.11491.
Band, S. S., Yarahmadi, A., Hsu, C. C., Biyari, M., Sookhak, M., Ameri, R., Dehzangi, I., Chronopoulos, A. T., & Liang, H. W. (2023). Application of explainable artificial intelligence in medical health: A systematic review of interpretability methods. Informatics in Medicine Unlocked, 40, 101286. doi:10.1016/j.imu.2023.101286.
Akiba, T., Sano, S., Yanase, T., Ohta, T., & Koyama, M. (2019). Optuna: A next-generation hyperparameter optimization framework. 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, United States. doi:10.48550/arXiv.1907.10902.
Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A., & Talwalkar, A. (2017). Hyperband: A novel bandit-based approach to hyperparameter optimization. Journal of Machine Learning Research, 18, 6765–6816. doi:10.48550/arXiv.1603.06560.
Sawangarreerak, S., & Thanathamathee, P. (2021). Detecting and Analyzing Fraudulent Patterns of Financial Statement for Open Innovation Using Discretization and Association Rule Mining. Journal of Open Innovation: Technology, Market, and Complexity, 7(2), 128. doi:10.3390/joitmc7020128.
Elewa, M. M. (2022). Using Altman Z-Score Models for Predicting Financial Distress for Companies - The Case of Egypt panel data analysis. Alexandria Journal of Accounting Research, 6(1), 1-28. doi:10.21608/aljalexu.2022.225155.
Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357. doi:10.1613/jair.953.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32. doi:10.1023/A:1010933404324.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 13-17 August 2016, San Francisco, United States. doi:10.1145/2939672.2939785.
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T.-Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, California, United States.
Anbananthen, K. S. M., Busst, M. B. M. A., Kannan, R., & Kannan, S. (2023). A Comparative Performance Analysis of Hybrid and Classical Machine Learning Method in Predicting Diabetes. Emerging Science Journal, 7(1), 102-115. doi:10.28991/ESJ-2023-07-01-08.
Abraham, A., Mohideen, H. S., & Kayalvizhi, R. (2023). A Tabular Variational Auto Encoder-Based Hybrid Model for Imbalanced Data Classification with Feature Selection. IEEE Access, 11, 122760-122771. doi:10.1109/ACCESS.2023.3329139.
Antwarg, L., Miller, R. M., Shapira, B., & Rokach, L. (2021). Explaining anomalies detected by autoencoders using Shapley Additive Explanations. Expert Systems with Applications, 186, 115736. doi:10.1016/j.eswa.2021.115736.
Ge, W., & McVay, S. (2005). The disclosure of material weaknesses in internal control after the Sarbanes-Oxley Act. Accounting Horizons, 19(3), 137-158. doi:10.2308/acch.2005.19.3.137.
Beaver, W. H. (1966). Financial ratios as predictors of failure. Journal of Accounting Research, 4, 71-111. doi:10.2307/2490171.
Beneish, M. D. (1999). The detection of earnings manipulation. Financial Analysts Journal, 55(5), 24-36.
Ou, J. A., & Penman, S. H. (1989). Financial statement analysis and the prediction of stock returns. Journal of Accounting and Economics, 11(4), 295-329. doi:10.1016/0165-4101(89)90017-7.
Skousen, C. J., Smith, K. R., & Wright, C. J. (2009). Detecting and predicting financial statement fraud: The effectiveness of the fraud triangle and SAS No. 99. Advances in Financial Economics, 13, 53-81. doi:10.1108/S1569-3732(2009)0000013005.
Cressey, D. R. (1953). Other people's money: A study in the social psychology of embezzlement. Free Press, Glencoe, United States.
Wolfe, D. T., & Hermanson, D. R. (2004). The fraud diamond: Considering the four elements of fraud. The CPA Journal, 74, 38.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263-291. doi:10.2307/1914185.
Abdel-Khalik, A. R. (2014). Prospect theory predictions in the field: Risk seekers in settings of weak accounting controls. Journal of Accounting Literature, 33(1-2), 58-84. doi:10.1016/j.acclit.2014.10.001.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1996). Causes and consequences of earnings manipulation: An analysis of firms subject to enforcement actions by the SEC. Contemporary Accounting Research, 13(1), 1-36. doi:10.1111/j.1911-3846.1996.tb00489.x.
Persons, O. S. (1995). Using financial statement data to identify factors associated with fraudulent financial reporting. Journal of Applied Business Research, 11(3), 38. doi:10.19030/jabr.v11i3.5858.
Kanapickienė, R., & Grundienė, Ž. (2015). The model of fraud detection in financial statements by means of financial ratios. Procedia-Social and Behavioral Sciences, 213, 321-327. doi:10.1016/j.sbspro.2015.11.545.
Dechow, P. M., Ge, W., Larson, C. R., & Sloan, R. G. (2011). Predicting material accounting misstatements. Contemporary Accounting Research, 28(1), 17-82. doi:10.1111/j.1911-3846.2010.01041.x.
Kaminski, K. A., Wetzel, T. S., & Guan, L. (2004). Can financial ratios detect fraudulent financial reporting? Managerial Auditing Journal, 19(1), 15-28. doi:10.1108/02686900410509802.
Summers, S. L., & Sweeney, J. T. (1998). Fraudulently misstated financial statements and insider trading: An empirical analysis. The Accounting Review, 73(1), 131-146.
Grice, J. S., & Ingram, R. W. (2001). Tests of the generalizability of Altman's bankruptcy prediction model. Journal of Business Research, 54(1), 53-61. doi:10.1016/S0148-2963(00)00126-0.
Sharma, D. S., & Iselin, E. R. (2003). The relative relevance of cash flow and accrual information for solvency assessments: A multi-method approach. Journal of Business Finance & Accounting, 30(7-8), 1115-1140. doi:10.1111/1468-5957.05421.
Burgstahler, D., & Dichev, I. (1997). Earnings management to avoid earnings decreases and losses. Journal of Accounting and Economics, 24(1), 99-126. doi:10.1016/S0165-4101(97)00017-7.
SEC. (2024). The Securities and Exchange Commission (SEC), Bangkok, Thailand. Available online: https://www.sec.or.th/EN/Pages/Home.aspx (accessed in November 2024).
Rezaee, Z. (2005). Causes, consequences, and deterrence of financial statement fraud. Critical Perspectives on Accounting, 16(3), 277-298. doi:10.1016/S1045-2354(03)00072-8.