Diagnosis of Covid-19 Via Patient Breath Data Using Artificial Intelligence

Özge Doğuç, Gökhan Silahtaroğlu, Zehra Nur Canbolat, Kailash Hambarde, Ahmet Alperen Yiğitbaşı, Hasan Gökay, Mesut Yılmaz


Rapid diagnosis of COVID-19 with machine learning algorithms, together with the isolation of infected patients from crowded environments, is very important for controlling the epidemic. This study aims to develop a point-of-care testing (POCT) system that detects COVID-19 from volatile organic compounds (VOCs) in a patient's exhaled breath using the Gradient Boosted Trees Learner algorithm. A total of 294 breath samples were collected from 142 patients at Istanbul Medipol Mega Hospital between December 2020 and March 2021. Of the 142 cases, 84 were negative and 58 were positive. All breath samples were converted into numeric values through five air sensors. Ten percent of the data were held out for validating the model; of the remaining data, 75% were used for training an AI model to predict the presence of the coronavirus and 25% were used for testing. The SMOTE oversampling method was used to increase the training set size and reduce the imbalance between the negative and positive classes in the training and test data. Other machine learning algorithms were also tried in developing the e-nose model. The test results suggest that the Gradient Boosting algorithm produced the best model. It provides 95% recall when predicting COVID-19 positive patients and 96% accuracy when predicting COVID-19 negative patients.
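The oversampling step mentioned above can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority neighbours. This is not the authors' implementation. The function name `smote`, the choice of k = 5, and the Gaussian "sensor readings" are all assumptions made for illustration, and the class sizes simply reuse the 84 negative and 58 positive case counts from the abstract.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical data: five made-up sensor readings per sample (not study data).
data_rng = random.Random(42)
negatives = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(84)]
positives = [[data_rng.gauss(1.0, 1.0) for _ in range(5)] for _ in range(58)]

# Generate just enough synthetic positives to match the negative class size.
new_positives = smote(positives, n_new=len(negatives) - len(positives))
balanced_positives = positives + new_positives
print(len(negatives), len(balanced_positives))
```

Because every synthetic point is a convex combination of two real minority points, it stays inside the range already covered by the minority class. In common practice the oversampling is applied only to the training portion, so that evaluation data remain untouched.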


DOI: 10.28991/ESJ-2023-SPER-08



Keywords: COVID-19; Epidemic Disease; Artificial Intelligence; E-Nose; Machine Learning; Breath Data.




Copyright (c) 2022 Ozge Doguc, Gokhan Silahtaroglu, Zehra Nur Canbolat, Kailash Hambarde, Ahmet Alperen Yiğitbaşı, Hasan Gökay, Mesut Yılmaz