INTRODUCTION
Autism spectrum disorder (ASD) is a neurodevelopmental condition affecting a child’s communication, social interaction, and knowledge acquisition, typically presenting within the first 2 years of life (Frith and Happé, 2005). People with autism face various obstacles, including difficulty with focus, learning disabilities, mental health issues such as anxiety and depression, movement and sensory issues, and other challenges (Tripathy et al., 2021). As a result, the condition affects an individual’s entire cognitive, social, emotional, and physical health (Omar et al., 2019; Alenizi and Al-Karawi, 2023a). Its symptoms vary in extent and intensity and include communication difficulties, obsessive interests, and repetitive mannerisms in social situations. Detecting ASD requires a comprehensive examination, comprising a thorough evaluation and a range of assessments performed by child psychologists and other qualified professionals (Bastiaansen et al., 2011; Alenizi and Al-Karawi, 2023b, c). Autism is a rapidly growing global condition, affecting approximately one child in every 160, according to the World Health Organization (Suhas et al., 2021; Al-Karawi, 2023). ASD affects social interaction and communication abilities, and some individuals require 24-h care and assistance (Vaishali and Sasikala, 2018; Thabtah, 2019). Individuals with ASD often experience lifelong challenges in these areas. The condition, characterized by persistent symptoms, is believed to be caused by a combination of genetic and environmental factors; there is no known cure, but early detection can help manage its effects. Genes, environmental influences, and risk factors such as low birth weight, having a sibling with ASD, and older parents can affect a person’s development. Early diagnosis of autism is highly beneficial because it allows doctors to provide patients with appropriate treatment at an earlier stage.
It can potentially halt further deterioration of the patient’s condition and cut down the expenditures associated with delayed diagnosis over the long term. Therefore, there is a significant need for a screening instrument that is time efficient, accurate, and simple: one that predicts autistic symptoms in an individual and determines whether that individual requires a thorough autism examination (Lakhan et al., 2020; Alenizi and Al-Karawi, 2022). Early detection and intervention are crucial for mitigating ASD symptoms and improving quality of life. Observation is the primary method, with parents, teachers, and special education teams identifying potential symptoms. Children showing such symptoms should then receive healthcare for further testing; identifying ASD symptoms in adults can be more challenging, whereas behavioral changes in children can be recognized as early as 6 months (Al-Karawi and Ahmed, 2021; Alenizi and Al-Karawi, 2023c). This study aims to develop a platform for accurately predicting autistic characteristics in individuals of any age, using machine-learning (ML) approaches to aid early diagnosis and intervention.
BACKGROUND AND LITERATURE REVIEW
In their study, Vaishali and Sasikala (2018) proposed a method for identifying ASD using optimized behavior sets. The researchers experimented with an ASD diagnosis dataset containing 21 features from the UCI machine-learning repository, employing a swarm intelligence-based binary firefly feature selection wrapper to explore it. They tested the hypothesis that a machine-learning model could maintain classification accuracy using minimal feature subsets, finding that only 10 of the original 21 features were sufficient. The study showed that binary firefly feature selection can achieve accurate ASD diagnosis with fewer features, reaching an average accuracy in the range of 92.12 to 97.95% and potentially improving efficiency and reducing computational complexity in ASD diagnostic systems. Thabtah (2017b) introduced an ASD screening model incorporating machine-learning adaptation and Diagnostic and Statistical Manual of Mental Disorders (DSM-5) criteria. Screening tools play a crucial role in achieving various objectives in ASD screening, and this paper explores the use of machine learning for ASD classification, highlighting its advantages and disadvantages and the challenges existing tools face in aligning with the DSM-5 manual. Mythili and Shanavas (2014) researched ASD using classification techniques; the primary objective of their paper was to detect and classify levels of autism. They employed neural networks, support vector machines (SVM), and fuzzy techniques with WEKA tools to analyze students’ behavior and social interaction. In another study, Kosmicki et al. (2015) proposed a method for identifying a minimal set of traits for autism detection, using machine learning to assess ASD clinically with the Autism Diagnostic Observation Schedule (ADOS).
With subsets of the 28 behaviors, they achieved 98.27% accuracy on module 2 and 97.66% on module 3. The effectiveness of ML in predicting various diseases from their symptoms is highly noteworthy. For instance, Khan et al. (2017) and Al-Karawi and Ahmed (2021) utilized ML to predict whether a person has diabetes, whereas Cruz and Wishart (2006) attempted to diagnose cancer using ML. An alternating decision tree (ADTree) was used by Wall et al. (2012a) and Alenizi and Al-Karawi (2023b) to shorten the screening process and speed up the identification of ASD features. With data from 891 people, they employed the Autism Diagnostic Interview, Revised (ADI-R) approach. They reached high accuracy, but the test was restricted to people between the ages of 5 and 17 and could not predict ASD across age groups (children, adolescents, and adults). Machine learning has been used in several lines of research to enhance and expedite the diagnosis of ASD. Using a 65-item Social Responsiveness Scale, Duda et al. (2016) applied forward feature selection and undersampling to distinguish between autism and attention deficit hyperactivity disorder (ADHD). The metrics of Al-Karawi (2021) and Deshpande et al. (2013) for predicting ASD were based on brain activity. Soft computing approaches, such as artificial neural networks (ANN), probabilistic reasoning, and classifier combinations, have also been employed (Pratap et al., 2014; Alenizi and Al-Karawi, 2022). Numerous papers have discussed automatic ML models that consider only personal characteristics as input features, while several studies also used brain neuroimaging data. Parikh et al. (2019) selected six personal traits from the ABIDE database and used a cross-validation technique to train and test ML models on data from 851 subjects, categorizing patients with and without ASD accordingly.
Rules of machine learning, introduced by Thabtah and Peebles (2020), provide users with a knowledge base of rules for understanding the classification’s underlying causes and detecting ASD characteristics. Al Banna et al. (2020) proposed a system to track and support ASD patients while they deal with the COVID-19 epidemic. The study utilized five machine-learning models to classify participants as having ASD or no ASD based on parameters such as age, sex, and ethnicity; each classifier was then analyzed to find the best-performing model. Bone et al. (2016) applied SVM to the same goal, achieving 89.2% sensitivity and 59% specificity in a study involving 1264 people with ASD and 462 people without ASD features. However, because of the vast age range (4-55 years), their research was not approved as a screening method for all age groups. Allison et al. (2012) used the “Red Flags” tool to screen for ASD in both children and adults with the Autism Spectrum Quotient before shortlisting to the AQ-10, achieving more than 90% accuracy. Schankweiler et al. (2023) attempted to identify the relatively more important screening questions of the ADI-R and ADOS screening methods, finding that the two tests work better when combined. Thabtah compared previous works on ML algorithms for predicting autism traits (Thabtah, 2017b). To identify ASD symptoms in children, such as developmental delay, obesity, and insufficient physical activity, van den Bekerom (2017) utilized and compared multiple ML algorithms, including naïve Bayes (NB), SVM, and random forest. According to the study by Wall et al. (2012b) on identifying autism with a short screening test and validation, ADTree and the functional tree fared well, with high sensitivity, specificity, and accuracy. Heinsfeld et al.
(2018) used a sizable brain imaging dataset from the Autism Brain Imaging Data Exchange (ABIDE I) to identify ASD patients and obtained a mean classification accuracy of 70%, with accuracy in the range of 66 to 71%. The random forest classifier (RFC) reached a mean accuracy of 63%, compared with 65% for the SVM classifier. The accuracy, specificity, sensitivity, and AUC reported in that study were 88.51%. To pinpoint problems with conceptual problem formulation, methodology implementation, and result interpretation, Bone et al. (2015) analyzed the earlier works of Wall et al. (2012b) and Kosmicki et al. (2015). The researchers used machine learning to replicate their findings, but there is no consensus on the best approach for generalizing autism screening tools across different age ranges.
WORKING MODEL
This research aims to create a robust machine-learning model for detecting autism in individuals of different ages, ensuring accurate and effective detection. Figure 1 shows our system’s operation and data flow, starting with preliminary data processing: removing noise, handling missing values and outliers, and encoding categorical attributes. We use feature-engineering techniques to reduce dataset dimensionality and improve training speed, then pass the preprocessed datasets to SVM, decision tree, and RF classifiers. The system evaluates classifier accuracy through this structured workflow of data preprocessing, feature selection, and classification, identifying the most accurate model for further training and categorization tasks.
RESEARCH METHODOLOGY
The research involved five stages: data collection, synthesis, prediction model development, evaluation, and application development; each phase is discussed briefly below.
Data collection
The datasets utilized in this research were acquired from publicly available repositories. The four ASD datasets, covering toddlers, adolescents, children, and adults, were obtained from Kaggle and the UCI ML repository (Hasan et al., 2022). These repositories provide a valuable data source for research and analysis related to ASD.
These datasets have 20 common attributes that are used for prediction. These attributes are listed below:
Data preprocessing
Data preparation encompasses all the necessary preprocessing steps before commencing model training, aiming to achieve optimal results (Gopal Krishna Patro and Sahu, 2015). This preparation entails three stages.
Data encoding involves transforming a dataset comprising 6 numeric and 13 nominal attributes. To effectively employ various machine-learning algorithms, it is essential to work with real numbers; consequently, all nominal values must be converted into real numbers. A straightforward representation is adopted here, using the real numbers 1 and 2. For instance, the male class is encoded as 1, while the female class is encoded as 2.
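The nominal-to-numeric encoding described above can be sketched as follows; the attribute names, value mappings, and sample rows are illustrative, not taken from the actual dataset.

```python
# Sketch of mapping each nominal value to its assigned real number (e.g., male -> 1, female -> 2).

def encode_column(values, mapping):
    """Replace each nominal value with its assigned numeric code."""
    return [mapping[v] for v in values]

sex_mapping = {"m": 1, "f": 2}       # illustrative encoding scheme
sex = ["m", "f", "f", "m"]           # illustrative column values

encoded_sex = encode_column(sex, sex_mapping)
print(encoded_sex)  # -> [1, 2, 2, 1]
```

The same mapping step would be applied to each of the 13 nominal attributes before training.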
Dealing with missing values is a crucial step in data handling. A significant portion (48.3%) of the data is missing in the given dataset, and removing the affected samples would render the dataset unusable, reducing it to 155 samples. Hence, it becomes essential to address this issue. A statistical approach is adopted whereby each missing value is replaced with the mean of the values of that attribute within the corresponding class (Wohlrab and Fürnkranz, 2011). This keeps the dataset intact and usable for further analysis and modeling.
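The per-class mean imputation described above can be sketched as follows; the column values and class labels are illustrative, with `None` standing in for a missing entry.

```python
# Sketch of replacing missing values with the mean of the same attribute within the same class.

def impute_class_mean(values, labels):
    """Replace None entries with the mean of non-missing values sharing the same class label."""
    class_sums = {}
    for v, c in zip(values, labels):
        if v is not None:
            total, count = class_sums.get(c, (0.0, 0))
            class_sums[c] = (total + v, count + 1)
    class_means = {c: total / count for c, (total, count) in class_sums.items()}
    return [class_means[c] if v is None else v for v, c in zip(values, labels)]

ages = [4.0, None, 6.0, 10.0, None]                      # illustrative attribute column
diagnosis = ["ASD", "ASD", "ASD", "no-ASD", "no-ASD"]    # illustrative class labels
print(impute_class_mean(ages, diagnosis))  # -> [4.0, 5.0, 6.0, 10.0, 10.0]
```

Here the first missing age is filled with the ASD-class mean (5.0) and the second with the no-ASD-class mean (10.0).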
Normalization becomes necessary as the dataset exhibits significant variations in the range of values, particularly after the nominal values have been encoded into real numbers (1 and 2). Without normalization, attributes with more extensive numeric ranges can dominate those with smaller ranges, potentially biasing the analysis. Moreover, normalization facilitates faster execution of algorithms by avoiding the utilization of wide-ranging numbers (Deshpande et al., 2013). In this case, the data are scaled to fit within the interval of 0 to 1, following Equation (1): x_normalized = (x − min_a) / (max_a − min_a), where x represents the original value of the attribute, x_normalized represents the scaled value, min_a is the minimum value of attribute a, and max_a is the maximum value of attribute a.
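Equation (1) corresponds to standard min-max scaling; a minimal sketch with illustrative values:

```python
# Sketch of Equation (1): x_norm = (x - min_a) / (max_a - min_a), scaling an attribute to [0, 1].

def min_max_normalize(values):
    """Scale a list of attribute values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant attribute: map every value to 0
    return [(x - lo) / (hi - lo) for x in values]

print(min_max_normalize([1, 2, 2, 1]))  # -> [0.0, 1.0, 1.0, 0.0]
```

After this step, an encoded nominal attribute with values {1, 2} and a wide-ranging attribute such as age occupy the same [0, 1] scale.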
List of ASD datasets (Hasan et al., 2022).
S. no. | Dataset name | Sources | Attribute type | Attributes number | Instances number |
---|---|---|---|---|---|
1 | ASD screening data for adult | UCI machine-learning repository ( Thabtah, 2017b) | Categorical, continuous, and binary | 21 | 704 |
2 | ASD screening data for children | UCI machine-learning repository ( Thabtah, 2017b) | Categorical, continuous, and binary | 21 | 292 |
3 | ASD screening data for adolescent | UCI machine-learning repository ( Thabtah, 2017a) | Categorical, continuous, and binary | 21 | 104 |
Abbreviation: ASD, autism spectrum disorder.
List of attributes in the dataset (Hasan et al., 2022).
Attribute id | Attributes description |
---|---|
1 | Patient age |
2 | Sex |
3 | Nationality |
4 | Whether the patient had jaundice at birth |
5 | Whether any family member has a pervasive developmental disorder |
6 | Who completed the test |
7 | The country in which the user lives |
8 | Did the user use the screening application before or not? |
9 | Screening test type |
10-19 | Based on the screening method, answers to 10 questions |
20 | Screening score |
Selecting the optimal feature subset
The feature selection block outlines the process of selecting the best subset of features, which is influenced by the chosen algorithm and the desired learning performance. The following steps are followed to accomplish this selection procedure:
This study employs adaptive wrapper feature selection, specifically backward elimination (Mao, 2004; Al-Karawi and Mohammed, 2023), to determine the optimal set of features; the results of this process are presented in the paper. Initially, all features relevant to the chosen algorithm are included. Then, in each iteration, the importance of each feature is evaluated, and the feature with the lowest priority is eliminated. This loop continues until only one feature remains. The process is repeated until a significant decline in diagnostic performance is observed, as discussed in the Results and Discussion section.
After selecting the feature subset with optimal performance as described earlier (starting with the full feature set in the first iteration), 10-fold cross-validation is employed to evaluate discriminant performance (Berrar, 2019). As demonstrated later, all trained models are saved to be utilized for diagnosing unseen samples. This process is repeated for each algorithm under consideration. To assess model performance, the cross-validation results for each feature subset are compared, determining the best-performing model for each specific number of features.
As part of this research, the objective is to develop a mobile application for patients or healthcare facilities. Therefore, one essential goal is to minimize the number of features, thereby reducing the cost of tests while maximizing accuracy. To achieve this, a procedure is implemented to identify the fewest features that yield the most optimal performance across 10 folds. The resulting 10 models, one from each fold, are saved for later use during the testing phase. This approach ensures that the application maintains high accuracy while minimizing the required features.
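The per-fold training and model-saving scheme above can be sketched as follows; the fold construction and the placeholder "model" are illustrative (in practice a library routine such as scikit-learn's `KFold` plus a real classifier fit would be used).

```python
# Sketch of 10-fold training where each fold produces one model,
# and all 10 trained models are retained for the later voting stage.

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds covering all samples."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i not in set(test_idx)]
        yield train_idx, test_idx
        start += size

saved_models = []
for train_idx, test_idx in k_fold_indices(20, k=10):
    model = ("trained-on", tuple(train_idx))  # placeholder for a fitted classifier
    saved_models.append(model)

print(len(saved_models))  # -> 10
```

The 10 saved models are what the testing framework later queries and combines by voting.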
Training framework architecture
As mentioned earlier, previous studies have predominantly focused on selecting features independently of the training model. In traditional classification systems, a feature selection technique is often applied once, and the selected features are then used across all algorithms to classify diseases. However, this approach can lead to varying performance for each model, depending on the algorithm used and the representation of the selected features; specific algorithms may underperform because the chosen features may not be the most suitable for them. To address this feature selection challenge, this subsection proposes and justifies a stand-alone platform for diagnosing ASD. The platform encompasses the training framework architecture, the testing framework architecture, and the real-time diagnosis platform. Figure 1 illustrates the complete training framework architecture, with each section detailed. The entire process is repeated for all selected algorithms.
Testing framework architecture
Figure 2 illustrates the execution of a simulated test on an unseen portion of the dataset. The testing process involves data preparation similar to that of the training process. The prepared data are then passed to a script that performs predictions using the 10 pretrained models, and a voting process determines the final decision based on the highest probability (Parikh et al., 2019). In the tie scenario where five models predict “affected” and five predict “healthy,” the patient is considered to have ASD. It is important to note that, since we are dealing with a medical condition, the patient is strongly advised to consult a doctor for further examination and diagnosis.
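The voting rule described above can be sketched as follows: a simple majority over the 10 per-fold predictions decides, and a 5-5 tie is resolved as "affected" so the case is flagged for a doctor's follow-up. The label strings are illustrative.

```python
# Sketch of majority voting over the 10 pretrained models, with ties resolved as "affected".

def vote(predictions):
    """predictions: list of 'affected' / 'healthy' labels from the pretrained models."""
    affected = sum(1 for p in predictions if p == "affected")
    healthy = len(predictions) - affected
    return "affected" if affected >= healthy else "healthy"

print(vote(["affected"] * 5 + ["healthy"] * 5))  # -> affected
```

Resolving ties toward "affected" trades a few extra false positives for fewer missed cases, which is the conservative choice for a screening tool.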
CLASSIFICATIONS ALGORITHMS
Support vector machine
SVM is a supervised machine-learning technique for classification and regression tasks. It is a practical approach to solving pattern recognition problems. One notable advantage of SVM is its ability to mitigate overfitting issues. By establishing a decision boundary, SVM effectively segregates classes (Huang et al., 2018).
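A minimal sketch of an SVM decision boundary on toy data, assuming scikit-learn is available; the data and the RBF/gamma setting (matching the configuration reported later in this study) are illustrative only.

```python
# Sketch: fit an RBF-kernel SVM on two well-separated toy clusters.
from sklearn.svm import SVC

X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [2, 3], [3, 2], [3, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # two separable clusters

clf = SVC(kernel="rbf", gamma=0.1)  # RBF kernel with gamma 0.1, as in the experiments below
clf.fit(X, y)
print(clf.predict([[0, 0], [3, 3]]))  # -> [0 1]
```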
Naïve Bayes
The NB classifier is a supervised learning algorithm that operates as a generative model based on a joint probability distribution. It makes use of independence assumptions to simplify computations. Compared to SVM and maximum entropy (ME) models, NB exhibits faster training times. It calculates the posterior probability for a dataset by combining prior probability and likelihood estimations (John and Langley, 2013).
Logistic regression
Logistic regression (LR) is a regression technique for analyzing binary dependent variables. Its output values are constrained to 0 or 1, making it suitable for binary classification tasks. LR is beneficial for datasets with continuous values. It enables examining the relationship between a single dependent binary variable and one or more nominal or ordinal variables. The relationship is typically represented using the sigmoidal function.
K-nearest neighbor
K-nearest neighbor (KNN) is a supervised learning method known for its simplicity. It is employed in both classification and regression tasks. The underlying principle of KNN is that similar data points tend to be located close to each other. The “K” in KNN refers to the number of neighboring points to consider. Selecting an appropriate “K” value is crucial in minimizing errors. KNN relies on similarity, measured by distance, closeness, or proximity. The widely used distance metric is the Euclidean distance.
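The KNN idea described above can be sketched as follows, using Euclidean distance and a majority vote among the K nearest training points; the data and labels are illustrative.

```python
# Sketch of KNN classification with Euclidean distance.
import math

def knn_predict(train_X, train_y, query, k=5):
    """Predict by majority label among the k training points nearest to the query."""
    dists = sorted((math.dist(x, query), label) for x, label in zip(train_X, train_y))
    nearest = [label for _, label in dists[:k]]
    return max(set(nearest), key=nearest.count)

train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["no-ASD", "no-ASD", "no-ASD", "ASD", "ASD", "ASD"]
print(knn_predict(train_X, train_y, [5, 5], k=3))  # -> ASD
```

Choosing K too small makes the prediction sensitive to noise, while choosing it too large blurs class boundaries, which is why K is tuned (K = 5 in the experiments below).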
Random forest classifier
The RFC is a versatile algorithm capable of handling classification, regression, and other tasks (Alam and Vuong, 2013). It operates by generating multiple decision trees using random subsets of the data. Once predictions are obtained from each tree, the final solution is determined by employing a voting mechanism. The prediction that receives the highest number of votes is selected as the best solution. This voting-based approach allows RFC to leverage the collective wisdom of multiple decision trees, resulting in improved accuracy and flexibility.
The random forest algorithm creates many decision trees from randomly selected subsets of the training dataset, as shown in Figure 3. The votes from the individual decision trees are then aggregated to establish the final class of test objects (Alam and Vuong, 2013).

An SVM classifier. Abbreviation: SVM, support vector machine (Alenizi and Al-Karawi, 2023c).
Decision tree classification method
The cornerstone of a decision tree is its decision-making process, which offers good accuracy and stability and can be visualized as a tree. A decision tree is displayed in Figure 2 (Song and Ying, 2015).
RESULTS AND DISCUSSION
The performance of the classification model is evaluated using metrics such as specificity, sensitivity, and accuracy, which are derived from the confusion matrix and classification report. These metrics provide insights into the model’s precision in predicting true negatives, positives, and overall accuracy. The model’s effectiveness depends on the accuracy of its training, as it directly influences the quality of the results obtained from these performance measures.
Performance evaluation
Evaluating the performance of a classification model is crucial to assess its effectiveness in achieving a desired outcome. Performance evaluation metrics quantitatively assess the model’s performance on a test dataset. Selecting appropriate metrics to evaluate the model’s performance accurately is essential. Several metrics can be utilized, including the confusion matrix, accuracy, specificity, sensitivity, and more. The following formulas are commonly employed to calculate these performance metrics.
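The standard formulas behind the metrics discussed above, computed from the confusion-matrix counts, can be sketched as follows; the counts are illustrative.

```python
# Sketch of accuracy, sensitivity, and specificity from confusion-matrix counts.

def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    return accuracy, sensitivity, specificity

acc, sens, spec = metrics(tp=90, tn=85, fp=5, fn=10)
print(round(acc, 4), round(sens, 4), round(spec, 4))  # -> 0.9211 0.9 0.9444
```

Sensitivity measures how many true ASD cases are caught, while specificity measures how many non-ASD cases are correctly cleared, which is why both are reported alongside accuracy below.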
The experimental results demonstrate the application of various machine-learning algorithms with feature selection for ASD screening data in children. All features were selected to evaluate the predictive models’ specificity, sensitivity, and accuracy. The specific implementations for each algorithm are as follows:
NB: Gaussian NB algorithm was used.
SVM: Radial basis function (RBF) kernel with a gamma value of 0.1 was utilized.
KNN: N = 5 neighbors were considered.
ANN: Adam optimizer with a learning rate of 0.01 and 100 epochs was employed. Random forest and decision tree algorithms were also used.
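The classifier configurations listed above can be instantiated as in the following sketch, assuming scikit-learn; implementation details beyond the stated settings (kernels, gamma, neighbors, optimizer, learning rate, epochs) are not specified in the text, so the remaining parameters are library defaults.

```python
# Sketch of the classifier configurations described above (scikit-learn assumed).
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

classifiers = {
    "NB": GaussianNB(),
    "SVM": SVC(kernel="rbf", gamma=0.1),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "ANN": MLPClassifier(solver="adam", learning_rate_init=0.01, max_iter=100),
    "RF": RandomForestClassifier(),
    "DT": DecisionTreeClassifier(),
}

# Toy data just to exercise the shared fit interface.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [4, 4], [4, 5], [5, 4], [5, 5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
for name, clf in classifiers.items():
    clf.fit(X, y)
print(sorted(classifiers))  # -> ['ANN', 'DT', 'KNN', 'NB', 'RF', 'SVM']
```

For the ANN, `max_iter=100` approximates "100 epochs" under scikit-learn's full-batch training; this is an interpretation, not stated in the source.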
Elements of a confusion matrix.
 | Actual ASD-positive | Actual ASD-negative |
---|---|---|
Predicted positive | True positive (TP) | False positive (FP) |
Predicted negative | False negative (FN) | True negative (TN) |
Abbreviation: ASD, autism spectrum disorder.
Performance measures for all machine-learning classifiers with the three datasets.
Classifier | Specificity | Sensitivity | Accuracy (%) |
---|---|---|---|
Logistic regression | 0.9375 | 0.9696 | 96.69 |
SVM | 0.9474 | 0.8889 | 98.11 |
Naïve Bayes | 0.9361 | 0.9676 | 96.24 |
KNN | 0.9148 | 0.9687 | 95.65 |
Random forest | 1.00 | 0.9933 | 99.75 |
Decision tree | 0.9887 | 0.9888 | 97.47 |
Abbreviations: KNN, K-nearest neighbor; SVM, support vector machine.
The evaluation of the different machine-learning models on the ASD diagnosis dataset yielded accuracies ranging from 95.65 to 99.75% on the original dataset. The KNN classifier with K = 5 achieved the lowest accuracy (95.65%), while the random forest model achieved the highest (99.75%). Additionally, the learning curves of the machine-learning algorithms provide further insight into the performance of the prediction models.
CONCLUSION
This study presents a machine-learning framework designed to detect ASD in individuals across various age groups, including toddlers, children, adolescents, and adults. Our findings demonstrate the effectiveness of predictive models based on machine-learning techniques as valuable tools for accomplishing this task. As a result, the prediction models proposed in this study can serve as an alternative or supportive tool for healthcare professionals in accurately identifying ASD cases across various age groups. The experimental analysis conducted in this research provides valuable insights for healthcare practitioners, enabling them to consider the most significant features when screening for ASD cases. It is important to note that a limitation of this study is the insufficient amount of data for developing a generalized model encompassing all stages of ASD. A large dataset is vital for constructing an appropriate model, and the dataset used in this analysis did not contain enough cases.
Our research findings have also contributed to creating an automated model that can assist medical professionals in diagnosing autism in children. In future work, we aim to gather a larger dataset related explicitly to ASD and construct a more comprehensive prediction model applicable to individuals of any age, thereby increasing generalization. This will further enhance ASD detection and facilitate improved identification of other neurodevelopmental disorders.