The Eye: A Light Weight Mobile Application for Visually Challenged People Using Improved YOLOv5l Algorithm

Kalaiarasi Sonai Muthu Anbananthen, Sridevi Subbiah, Subiksha Gayathri Baskar, Ratchana Selvaraj, Jayakumar Krishnan, Subarmaniam Kannan, Deisy Chelliah

Abstract


The eye is an essential sensory organ that allows us to perceive our surroundings at a glance, and losing this sense creates numerous challenges in daily life. Because society is designed largely for the sighted majority, visually impaired individuals face even more difficulties, so empowering them and promoting self-reliance are crucial. To address this need, we propose a new Android application called “The Eye” that uses Machine Learning (ML)-based object detection to recognize objects in real time through a smartphone camera or a camera attached to a stick. The article proposes an improved YOLOv5l algorithm for object detection in visual applications. YOLOv5l is a larger model that captures more complex features and details, giving higher detection accuracy than smaller variants such as YOLOv5s and YOLOv5m. The primary enhancement in the improved YOLOv5l algorithm is the integration of L1 and L2 regularization, which reduce overfitting and improve generalization by adding a regularization term to the loss function during training. Our approach combines an image processing module with a text-to-speech conversion module: objects are first recognized in the camera image, and the Android text-to-speech module then converts the recognition results into audio output. According to the experimental results, the improved YOLOv5l achieves higher detection accuracy than the original YOLOv5, including on small, multiple, and overlapping targets. This study contributes to the advancement of technology that helps visually impaired individuals become more self-sufficient and confident.
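As an illustration of how a regularization term can be added to a training loss, the following is a minimal sketch of a combined L1 and L2 penalty. It is not the authors' implementation: the abstract does not state the exact form of the penalty or its coefficients, so the function name `regularized_loss`, the parameters `l1_lambda` and `l2_lambda`, and their default values are all assumptions chosen for illustration.

```python
from typing import Sequence


def regularized_loss(
    base_loss: float,
    weights: Sequence[float],
    l1_lambda: float = 1e-5,  # assumed coefficient, not taken from the paper
    l2_lambda: float = 5e-4,  # assumed coefficient, not taken from the paper
) -> float:
    """Return base_loss plus L1 and L2 penalties on the model weights.

    base_loss stands in for the detector's usual training loss; weights is a
    flat sequence of the trainable parameters (hypothetical representation).
    """
    # L1 penalty: sum of absolute values, which pushes weights toward zero.
    l1_term = sum(abs(w) for w in weights)
    # L2 penalty: sum of squares, which discourages large weights.
    l2_term = sum(w * w for w in weights)
    return base_loss + l1_lambda * l1_term + l2_lambda * l2_term


if __name__ == "__main__":
    # Toy example with two weights and deliberately large coefficients.
    print(regularized_loss(1.0, [1.0, -2.0], l1_lambda=0.1, l2_lambda=0.01))
```

With both coefficients set to zero the function returns the unregularized loss, which makes it easy to compare training with and without the penalty.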

 

DOI: 10.28991/ESJ-2023-07-05-011



Keywords


Text to Speech; Visually Challenged; YOLO; Machine Learning.





Copyright (c) 2023 Kalaiarasi Sonai Muthu Anbananthen, Sridevi Subbiah, Subiksha Gayathri Baskar, Ratchana Selvaraj, Jayakumar Krishnan, Subarmaniam Kannan, Deisy Chelliah