Analyzing the Effect of Basic Data Augmentation for COVID-19 Detection through a Fractional Factorial Experimental Design

Mateo Hidalgo Davila, Maria Baldeon-Calisto, Juan Jose Murillo, Bernardo Puente-Mejia, Danny Navarrete, Daniel Riofrío, Noel Peréz, Diego S. Benítez, Ricardo Flores Moyano

Abstract


The COVID-19 pandemic has created a worldwide healthcare crisis. Convolutional Neural Networks (CNNs) have recently been used with encouraging results to help detect COVID-19 from chest X-ray images. However, CNNs require large labeled datasets to generalize well to unseen data. Because few COVID-19 datasets are publicly available, most CNN-based approaches apply data augmentation during training. Even so, there has been no thorough statistical analysis of how data augmentation operations affect classification performance for COVID-19 detection. In this study, a fractional factorial experimental design is used to examine the impact of basic augmentation methods on COVID-19 detection. This design makes it possible to identify which data augmentation techniques and interactions have a statistically significant effect on classification performance, whether positive or negative. The most common basic augmentation methods in the literature are evaluated using the CoroNet architecture and two publicly available COVID-19 datasets. The experiments show that zoom range and height shift positively affect the model's accuracy on dataset 1, whereas none of the data augmentation operations affects performance on dataset 2. Additionally, a new state-of-the-art performance is achieved on both datasets by training CoroNet with the best data augmentation values found through the experimental design. Specifically, 97% accuracy, 93% precision, and 97.7% recall were attained on dataset 1, and 97% accuracy, 97% precision, and 97.6% recall on dataset 2. These results indicate that analyzing the effects of data augmentation on a particular task and dataset is essential for achieving the best performance.
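
As a rough illustration of the fractional factorial idea mentioned in the abstract, the sketch below builds a two-level half-fraction design and estimates main effects from the run results. The factor names, the number of factors, and the generator (the fifth factor set to the product of the other four) are assumptions made for illustration only; the abstract does not state the design actually used in the study.

```python
from itertools import product
from math import prod

# Hypothetical factors: the abstract does not list the exact set studied.
BASE_FACTORS = ["rotation", "width_shift", "height_shift", "zoom"]
# Assumed generator: horizontal_flip = rotation * width_shift * height_shift * zoom
GENERATED = {"horizontal_flip": tuple(BASE_FACTORS)}


def fractional_factorial(base, generated):
    """Return the runs of a two-level 2^(k-p) design as dicts of -1/+1 levels."""
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        run = dict(zip(base, levels))
        # A generated factor takes the product of its generator columns.
        for name, parents in generated.items():
            run[name] = prod(run[p] for p in parents)
        runs.append(run)
    return runs


def main_effect(runs, responses, factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for r, y in zip(runs, responses) if r[factor] == 1]
    low = [y for r, y in zip(runs, responses) if r[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


design = fractional_factorial(BASE_FACTORS, GENERATED)
print(len(design))  # 16 runs instead of the 32 of a full 2^5 design
```

In such a setup, each run would correspond to one training of the network with the indicated augmentations switched off (-1) or on (+1), and the recorded accuracy of each run would be passed to `main_effect`. Testing whether an effect is statistically significant would additionally require replicated runs and an analysis of variance, which this sketch leaves out.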

 

Doi: 10.28991/ESJ-2023-SPER-01


Keywords


Medical Image Classification; COVID-19 Detection; Convolutional Neural Networks; Image Data Augmentation; Design of Experiments; Fractional Factorial Design.



Copyright (c) 2022 Maria Baldeon Calisto, Juan Jose Murillo, Bernardo Puente-Mejia, Danny Navarrete, Daniel Riofrío, Noel Peréz, Diego Benítez, Ricardo Flores Moyano