Identification of Sickle Cell Anemia Using Deep Neural Networks

Hemoglobin is a molecule found in red blood cells that carries oxygen throughout the body. In a healthy person, hemoglobin keeps red blood cells elastic, round, and stable, which allows them to move easily through the blood vessels. In sickle cell disease, the hemoglobin is abnormal, which makes the red blood cells stiff and bent. These irregular cells obstruct the flow of blood. The condition is dangerous and can result in severe pain, organ damage, strokes, and other symptoms. It can also shorten life expectancy. Early identification of sickle cells helps patients recognize the signs so that treatment with antibiotics, supplements, blood transfusion, pain-relieving medications, and other therapies can begin in time. Manual assessment, diagnosis, and cell counting are time-consuming and may result in misclassification and miscounting, since a single sample contains millions of red blood cells. With data mining techniques such as the multilayer perceptron classifier algorithm, sickle cells can be detected effectively and with high precision. The proposed approach tackles the limitations of manual assessment by implementing a powerful and efficient MLP (Multi-Layer Perceptron) classification algorithm that classifies red blood cell samples into three classes: Normal (N), Sickle Cells (S), and Thalassemia (T). This paper also compares the accuracy of the MLP classifier with that of other popular data mining and machine learning algorithms on the dataset obtained from the Thalassemia and Sickle Cell Society (TSCS) located in Rajendra Nagar, Hyderabad, Telangana, India.

ceases, it may cause serious body pain, strokes, etc. Sickle cell disease was first found in the black community and was later seen in people of many ethnic groups, including people from parts of the Middle East, Central India, and the Mediterranean region, Italy and Greece in particular [6]. Sickle Cell Disease (SCD) is treated early with measures such as antibiotics, blood transfusion, and bone marrow transplant. Most physicians prescribe blood transfusion for at least 1-2 months, and patients can take antibiotics to manage risks, including persistent pain. SCD is an uncommon blood condition [7] that is now detected in the hemoglobin of newborn babies. In SCD, red blood cells are crescent-shaped, stiff, and sticky, and they clump with other cells. If one organ is gradually damaged, the damage can spread across the body, which may lead to death. To avoid such outcomes, early treatment (for example, blood transfusion) is needed to prevent severe complications. A person with the disease produces hemoglobin S, which results from inheriting two defective genes.

1-1-Signs and Symptoms of Sickle Cell Anemia [8]
Signs and symptoms of sickle cell anemia typically begin at around 5 months of age. They differ from person to person and change over time. Signs and symptoms can include:
1) Episodes of pain: Sickled RBCs block the blood supply through narrow blood vessels to the joints, abdomen, and lungs; the resulting pain is one of the main signs of sickle cell anemia. Chronic pain in adolescents and adults has been associated with bone and joint damage and ulcers.
2) Painful swelling of hands and feet: Sickled RBCs block the blood supply to the hands and feet, which may cause them to swell.
3) Frequent infections: Sickle cells can damage the organs that fight infection (i.e., the spleen), which leaves patients more vulnerable to infections. Physicians also prescribe vaccinations and drugs to prevent diseases that endanger patient health, such as pneumonia.
4) Delayed development: A shortage of healthy RBCs reduces the oxygen supplied to the body, which can slow development.
5) Vision problems: Sickled RBCs block blood flow through the tiny blood vessels that supply the eyes. The eyes then do not receive enough oxygen, which may damage the retina and cause vision problems.

1-2-Sickle Cell Complications [9]
1) Stroke: A stroke may occur when sickle cells disrupt the blood supply to the brain. Signs of stroke include weakness or numbness of the arms and legs, sudden speech difficulties, and loss of consciousness [10].
2) Acute chest syndrome: It causes chest pain, fever, and breathing difficulty. The condition is severe and can be life-threatening. Clinical intervention with antibiotics and other therapies is needed [11].
3) Pulmonary hypertension: High blood pressure in the lungs (pulmonary hypertension) can occur in patients with sickle cell anemia. This condition affects not only children but also adults. Symptoms include breathing difficulties and tiredness.
5) Gallstones: The breakdown of RBCs produces a substance called bilirubin. A high level of bilirubin in the body can lead to gallstones.
2) The sickle cell gene is now known to be common among people of the Deccan plateau of central India, with a lesser presence in northern Kerala and Tamil Nadu.

In the Worldwide Scenario [12]
1) Sickle cell disease was originally little known but is now found across the world, particularly in the Western Hemisphere (the Caribbean, South America, and Central America), Saudi Arabia, Asia, the Mediterranean countries (Greece, Italy, Turkey), and Sub-Saharan Africa [13].
2) Sickle cell deaths are observed more often among African-Americans; between 1999 and 2002, sickle cell mortality [14] in children younger than 4 years declined by 42%. This decline coincided with the introduction in 2000 of a vaccine that protects against invasive pneumococcal disease.
3) Mortality among children with sickle cell disease identified by newborn screening during the period 1990-1994 was reported for California, Illinois, and New York.
4) Among African-American children with SCD in California and Illinois, the mortality rate stood at 1.5 per 100 by the end of 1995. For all Black or African-American babies born in California and Illinois during this time, the mortality rate was 2.0 per 100 [15].
5) SCD is one of the biggest public health issues, with a total expense of around $475 million for about 75,000 SCD-related hospitalizations in the US.
The rest of this paper is organized as follows. Section 2 (Literature Review) gives a comprehensive review of the sickle cell crisis and of the various attempts to detect it through computational methods. Section 3 gives a complete overview of the dataset used in the experiments. Section 4 covers the previous results obtained on the dataset and the use of the Multi-Layer Perceptron to improve the prediction accuracy. Section 5 presents the results obtained from the Multi-Layer Perceptron and compares them with those of other models.

2-1-Sickle Cell Disease-A Comprehensive Study and Usage of Technology for Diagnosis [16]
This source presents a comprehensive study of sickle cell disease, covering its properties, symptoms, signs, and treatment procedures. The authors also compare the characteristics of the disease with those of similar diseases (such as thalassemia) and discuss the technological implications and uses in the field of sickle cell disease.

2-2-Healthy and Unhealthy Red Blood Cell Detection in Human Blood Smears using Neural Networks [17]
The following methods are used in this paper for diagnosing anemia: a hemoglobin level recognition system based on microscopic blood smear examination, and an RBC classification. The author used neural networks to classify and count [18] three forms of anemia cells (sickle cells, elliptocytosis cells, and microsite cells), as well as cells with unknown shapes, by applying circular Hough transforms and morphological tools to microscopic images.

2-3-Emerging Point-of-care Technologies for Sickle Cell Disease Diagnostics [19]
In this paper, point-of-care (POC) platforms are described that offer cost efficiency and portability and could enable millions of people in low-resource countries to be diagnosed with sickle cell anemia. A comprehensive literature analysis was conducted to evaluate the sensitivity and specificity of several POC diagnostics developed for SCD, with an emphasis on use in resource-constrained settings. Microfluidic paper-based devices, paper-based tests for screening newborns, sickle cell detection using a smartphone, and similar technologies are discussed.

2-4-Data Mining Technique using WEKA Classification for Sickle Cell Disease [17]
In this paper, two classification strategies, J48 and Random Tree, are used to predict sickle cell disease, which heavily affects the tribal regions of Gujarat, and the two strategies are then compared. The WEKA platform, an open-source application, is used for this prediction method.

2-5-A Comparative Analysis by KNN, SVM & ELM Classification to Detect Sickle Cell Anemia [8]
In this paper, the dataset of blood samples is taken in image format. In the image pre-processing phase, grayscale conversion, noise filtering, and image enhancement are applied. This paper also presents Fuzzy C, which separates regular and sickle cells. Graphical and mathematical properties are used for the analysis. The author used KNN, SVM, and Extreme Learning Machine classification tools [20] to evaluate the images [21].

2-6-Edge Detection of Sickle Cells in Red Blood Cells [4]
In this paper, sickle cells and regular cells are identified by measuring the smallest, largest, and mean radius of each cell and comparing these with the typical cell dimensions. Sickle cells are marked with red circles using edge detection techniques [22]. In the clinical process, regular, sickle, and other irregular cells are located using a microscope, but superimposed cells are very difficult to identify manually. In this automatic process, overlapping and incomplete blood cells are detected first, and the cells are then separated into regular or irregular cells by edge detection algorithms that rely on the shape of each cell in the microscopic image [23].

2-7-Automatic Blood Cell Segmentation using K-Mean Clustering from Microscopic Thin Blood Images [25]
In this source, Savkare [25] presents automatic blood cell segmentation for parasitic diseases using the K-Means algorithm on an online image library (http://www.dpd.cdc.gov/dpdx/). The method is tested on 60 microscopic thin blood images and reaches a clustering accuracy of 98.89%.

2-8-Machine Learning Approaches to the Application of Disease Modifying Therapy for Sickle Cell using Classification Models [26]
Khalaf [26] obtained results with several algorithms on sickle cell disease, in particular the Elman-Jordan Hybrid Neural Network (EJNN) and the Levenberg-Marquardt algorithm (LEVNN), whose accuracy was considerably higher than that of the other models.

2-9-Detection of Anemia Disease in Human Red Blood Cells using Cell Signature, Neural Networks and SVM [22]
Elsalamony et al. detected sickle cell anemia in RBCs using several popular algorithms, namely cell signature, Support Vector Machine (SVM), Back Propagation (BP), and Self-Organizing Map (SOM) neural networks, and tested them with 45 microscopic colour images from 15 patients who were already suffering from this kind of anemia.

After pre-processing, 13 attributes are identified along with the class label (diagnosis of a blood sample). The complete list of attributes and their meanings can be found in our previous research article (in press) presented at ICACECS-2020 (International Conference on Advances in Computer Engineering and Communication Systems 2020) and available in Gowtham et al. (2020) [27].

4-Implementation
In our earlier study, Gowtham et al. (2020) [27], we presented the results of identifying Thalassemia and Sickle Cell traits in blood samples using classification algorithms [28]; these results are summarized [27] in Table 1. Later, another 100 patient records were received from patients who approached TSCS during the period September 2019 to August 2020, which allowed us to experiment with and test another model, the "Multi-Layer Perceptron (MLP) Classifier".
The aim of this work is the detection and identification of sickle cell anemia with the aid of the Multilayer Perceptron. The model architecture includes the following stages: dataset preparation, dataset analysis, dataset splitting (training and testing), and use of classification algorithms [29]. The proposed classifier, the Multi-Layer Perceptron Classifier, gives the detection results for sickle cell anemia. The proposed technique is implemented in Python. The system is evaluated in terms of accuracy and log loss to show the performance of the technique.

4-1-Dataset Preparation
As stated above, the data collection obtained from the Thalassemia and Sickle Cell Society includes 1387 patients with 13 parameters. The parameters are explained in Table 2.

Table 2
No. | Parameter | Description
9 | Reticulocytes (RETIC) | An immature red blood cell without a nucleus, having a granular or reticulated appearance when suitably stained.
10 | Fetal Hemoglobin (HBF) | The main oxygen carrier protein in the human fetus.
11 | HBA0 | Defined as the non-glycated hemoglobin.
12 | HBA2 | A normal variant of hemoglobin A that consists of two alpha and two delta chains and is found at low levels in normal human blood.
13 | Diagnosis | The output variable. It must be predicted from a set of features (inputs): whether the sample is a Normal Cell / Sickle Cell / Thalassemia Cell.

In machine learning, datasets often have multiple labels in one or more columns. The training data is frequently labelled in words to make it understandable in human terms.

4-2-Analyzing Dataset and Splitting Dataset
Label encoding transforms labels into numeric form so that they are machine-readable. Machine learning algorithms can then decide how those labels should be handled. It is an essential pre-processing step for structured datasets in supervised learning.
The dataset is split using the train-test-split function provided by scikit-learn, which partitions the data into a training set and a test set (80% and 20%, respectively). The training set contains known outputs, and the model learns from this data so that it can later generalize to other data. The test set (or subset) is held out to check the accuracy of the model's predictions.
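The two pre-processing steps above can be illustrated with a minimal standard-library sketch. It is not the authors' code: the toy records, the helper names `encode_labels` and `split_train_test`, and the seed are assumptions made for illustration. In practice, scikit-learn's `LabelEncoder` and `train_test_split` serve the same purpose.

```python
import random

def encode_labels(labels):
    """Map each distinct text label to an integer code (alphabetical order)."""
    classes = sorted(set(labels))
    to_code = {name: code for code, name in enumerate(classes)}
    return [to_code[name] for name in labels], classes

def split_train_test(rows, targets, test_fraction=0.2, seed=42):
    """Shuffle the record indices once, then hold out test_fraction of them."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return (pick(rows, train_idx), pick(rows, test_idx),
            pick(targets, train_idx), pick(targets, test_idx))

# Hypothetical toy records: two made-up feature values per patient.
diagnoses = ["Normal", "Sickle", "Thalassemia", "Normal", "Sickle"] * 2
features = [[float(i), float(i) / 2] for i in range(len(diagnoses))]

codes, classes = encode_labels(diagnoses)
X_train, X_test, y_train, y_test = split_train_test(features, codes)
```

With ten toy records, the 80/20 split leaves eight records for training and two for testing; the same proportions would apply to the full dataset.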

4-3-Multi-Layer Perceptron (MLP) Classifier
The MLP Classifier, also called the multilayer perceptron classifier [30], is itself a neural network. To carry out classification, the MLP Classifier relies on an underlying Artificial Neural Network (ANN). The term MLP is used ambiguously: sometimes loosely for any feedforward ANN, and sometimes strictly for networks composed of multiple layers of perceptrons (with threshold activation). Multilayer perceptrons are sometimes called "vanilla" neural networks, particularly when they have a single hidden layer. An MLP is rather robust and can generally be used to learn the mapping from input to output.
MLPs are valuable in research for their ability to solve problems stochastically, which often yields approximate solutions to highly complex problems such as fitness approximation. Because classification is a special case of regression in which the response variable is categorical, MLPs make effective classification algorithms. MLPs were a popular machine learning approach in the 1980s and found applications in a variety of fields, such as speech recognition, image recognition, and machine translation. Interest in backpropagation networks resurfaced because of the achievements of deep learning. As shown in Figure 2, the proposed MLP model comprises four layers of non-linearly activating nodes (an input layer, two hidden layers, and an output layer). Since MLPs are fully connected, each node in one layer connects to every node in the following layer with a certain weight.
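The fully connected four-layer structure can be sketched as a forward pass in plain Python. This is an illustrative sketch only: the layer widths (12 inputs, two hidden layers of 8 nodes), the ReLU and softmax activations, and the random initial weights are assumptions, since the source gives only the number of layers and the three output classes.

```python
import math
import random

def dense(inputs, weights, biases):
    """Fully connected layer: every input feeds every output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Non-linear activation used here for the hidden layers (an assumption)."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn the output scores into class probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(sample, layers):
    """Pass one sample through the hidden layers, then the output layer."""
    activation = sample
    for weights, biases in layers[:-1]:
        activation = relu(dense(activation, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activation, weights, biases))

rng = random.Random(0)
# Assumed widths: 12 input features -> 8 -> 8 -> 3 classes (N, S, T).
layers = [init_layer(12, 8, rng), init_layer(8, 8, rng), init_layer(8, 3, rng)]
sample = [0.1 * i for i in range(12)]  # made-up feature values
probabilities = forward(sample, layers)
```

The three values in `probabilities` correspond to the Normal, Sickle Cell, and Thalassemia classes; an untrained network such as this one gives arbitrary probabilities until the weights are learned.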
Learning occurs in the perceptron by adjusting the connection weights after each piece of data is processed, based on the size of the error in the output compared with the expected result. This is an example of supervised learning and is carried out through backpropagation, a generalization of the least mean squares algorithm in the linear perceptron.
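The least mean squares rule that backpropagation generalizes can be shown for a single linear unit. The sketch below is illustrative only; the function name `lms_update`, the learning rate, and the sample values are assumptions.

```python
def lms_update(weights, inputs, target, learning_rate=0.1):
    """One least-mean-squares step for a single linear unit.

    Each weight moves in proportion to the output error and its own input.
    """
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    new_weights = [w + learning_rate * error * x
                   for w, x in zip(weights, inputs)]
    return new_weights, error

weights = [0.0, 0.0]
inputs, target = [1.0, 2.0], 1.0  # made-up training sample

weights, first_error = lms_update(weights, inputs, target)
_, second_error = lms_update(weights, inputs, target)
```

After the first update the error on the same sample falls from 1.0 to 0.5, which shows the weights moving toward the expected result. Backpropagation applies the same idea layer by layer, passing the error backwards through the network.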
The MLP Classifier trains iteratively: at each step, the partial derivatives of the loss function with respect to the model parameters are computed and used to update the parameters. A regularization term can also be added to the loss function; it shrinks the model parameters to avoid overfitting [31,32].
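One common way to write the regularized loss and the parameter update for a three-class problem is given below. The symbols are assumptions introduced for illustration: n is the number of training samples, y_{ik} is 1 when sample i belongs to class k and 0 otherwise, \hat{p}_{ik} is the predicted probability, W collects the connection weights, \alpha is the regularization strength, and \eta is the learning rate. The scaling of the penalty and the plain gradient step are one convention among several; the optimizer actually used may differ.

```latex
\mathcal{L}(\theta)
  = -\frac{1}{n}\sum_{i=1}^{n}\sum_{k \in \{N,\,S,\,T\}} y_{ik}\,\log \hat{p}_{ik}
    \;+\; \frac{\alpha}{2n}\,\lVert W \rVert_2^{2},
\qquad
\theta \leftarrow \theta - \eta\,\frac{\partial \mathcal{L}}{\partial \theta}
```

The first term is the log loss reported in the evaluation; the second term penalizes large weights, which is how the regularization shrinks the model parameters.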

5-Results
The classifier is implemented in Python 3.6 on a PC with an Intel i7 processor and 16 GB RAM running Windows 10. The system proposed in this paper is evaluated using the confusion matrix and the measures derived from it: accuracy score, precision, recall, and f1-score (metrics assessment techniques [24]). The following formulae are used to arrive at the various measures of accuracy.
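The standard definitions of these measures can be sketched in Python for a three-class confusion matrix, together with the log loss used elsewhere in this paper. The counts and probabilities below are hypothetical and are not results from the TSCS dataset; the function names are assumptions.

```python
import math

def per_class_metrics(confusion):
    """confusion[i][j] = samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n)) / total
    report = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # TP + FP
        actual_k = sum(confusion[k])                          # TP + FN
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report.append({"precision": precision, "recall": recall, "f1": f1})
    return accuracy, report

def log_loss(true_codes, probabilities, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for code, probs in zip(true_codes, probabilities):
        total -= math.log(min(max(probs[code], eps), 1 - eps))
    return total / len(true_codes)

# Hypothetical counts; rows and columns are ordered Normal, Sickle, Thalassemia.
confusion = [[8, 1, 1],
             [0, 9, 1],
             [2, 0, 8]]
accuracy, report = per_class_metrics(confusion)
loss = log_loss([0, 1], [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25]])
```

Precision is the share of predictions for a class that are correct, recall is the share of actual members of a class that are found, and the f1-score is their harmonic mean.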
Finally, the results obtained using all classification algorithms, including the MLP, are shown in Table 3. Validation of results: An additional dataset of 100 patient records, kindly provided by TSCS (data of patients who approached TSCS during September 2019 to August 2020), was used to validate the performance of the algorithms; the results are given in Table 4.

5-1-Web Application Development
A web application that uses popular machine learning algorithms is developed based on the inputs given by TSCS. The results were demonstrated to the team at TSCS using the Google Meet service. The team at TSCS expressed their satisfaction with the performance of the application. The application is ready to be deployed at the premises of TSCS for their regular use.

6-Conclusion
Sickle cell anemia is a hematological condition. It was historically regarded as a peculiar trait of indigenous communities, but it now extends across the entire world and requires urgent treatment. This paper describes Sickle Cell Disease (SCD) and its history both internationally and in the national scenario (in the Indian context). The disease symptoms, signs, complications, and treatment are presented. Blood cells have also been characterized by different blood parameters among sickle cell and Thalassemia patients. This paper further describes the detection of sickle cell disease with better precision using the Multi-Layer Perceptron and builds an effective predictive model to reduce the time and effort required in managing sickle cell disease. This paper examines the working of the deep learning model called "MLP Classifier", which gives better results than other models such as Support Vector Machine, K-Nearest Neighbor, Logistic Regression, Decision Tree Classifier, and Random Forest. The results of the experiments demonstrate that the proposed MLP classifier has a prediction accuracy of 99 percent for both sickle cell and thalassemia. A web application that includes the prediction model with the MLP classification algorithm is also developed for use in the TSCS environment, so that even regular medical laboratory staff can use it in an easier and simpler way.