INTRODUCTION
Attention-deficit/hyperactivity disorder (ADHD) is a common neuropsychiatric condition that frequently manifests in childhood and adolescence (Kieling and Rohde, 2012). Its core manifestations are deficits in attention, abnormally high levels of activity, and impulsive behavior (Thomas et al., 2015). Subtypes are distinguished by the extent of these symptoms; the two most prominent are the inattentive subtype (ADD) and the combined subtype (ADHD-C) (Randall et al., 2009; Ahmadi et al., 2014). Individuals in both diagnostic categories have significant deficits in attention, but those diagnosed with ADHD-C also struggle to control their impulses and their level of activity. Several explanations of the underlying mechanisms have been proposed, the most prominent of which concentrate on dopamine processing and alterations in the functioning of the prefrontal cortex (Kessler et al., 2007; Ziegler et al., 2016; Luo et al., 2019). The current standard method for diagnosing ADHD involves a battery of assessments: clinical interviews, symptom questionnaires completed by multiple assessors, cognitive tests, and a methodical procedure for ruling out other potential causes of the reported symptoms. These potential causes include comorbid mental disorders, sensory impairments, thyroid dysfunction, and electroencephalogram (EEG) abnormalities.
ADHD has traditionally been considered a condition that primarily affects children, because its symptoms were believed to ameliorate as children mature (De Graaf et al., 2008). However, many extensive investigations have shown that individuals diagnosed with ADHD during childhood continue to exhibit symptoms that meet the diagnostic criteria into adulthood (Weiss et al., 2002; Montes et al., 2007; Kessler et al., 2010; Montejano et al., 2011; Park et al., 2011). Adult ADHD generally refers to individuals in late adolescence or older, namely those aged 17 years or older. The condition is characterized by a confluence of symptoms, chiefly inattention, hyperactivity, and impulsivity, which together contribute to a notable impairment in social functioning. In adults, the primary manifestations are inattentiveness and impulsive behavior, whereas symptoms related to hyperactivity improve significantly (Sibley et al., 2012).
People with ADHD may face a variety of challenges in starting and maintaining conversations with others. Studies have also shown that they often have difficulties at work, especially in organizing and completing tasks correctly (Ward et al., 1993; Adler et al., 2006), and that they often struggle to adjust to new settings, which has been identified as a major hurdle for those affected. Clinicians are less likely to diagnose ADHD accurately in adults (Barkley, 1997), so adults in the general population are more likely to receive an inaccurate diagnosis. This is because the symptoms of ADHD differ significantly between children and adults. Among the diagnostic criteria for ADHD, inattention and hyperactivity/impulsivity are considered the most essential. In adult ADHD, however, symptoms outside the basic diagnostic criteria become more noticeable. The most common are impairments in executive function, difficulty attending to inner feelings, disturbances of self-concept and self-esteem, and social difficulties (Conners et al., 1999; Shaw-Zirt et al., 2005; Willcutt et al., 2005; Canu and Carlson, 2007; Faraone et al., 2010; Safren et al., 2010; Brikell et al., 2015; Corbisiero et al., 2017; Musser and Nigg, 2019; Yoo et al., 2019; Faraone et al., 2021). It is therefore exceedingly challenging to screen adult patients thoroughly for ADHD using only the diagnostic criteria established in the Diagnostic and Statistical Manual of Mental Disorders or the International Classification of Diseases. This is a substantial barrier to improving clinical practice, and it is of utmost importance to develop a valid screening instrument for adult ADHD (Freeman-Fobbs, 2003).
Researchers are actively working to identify the risk factors associated with ADHD in order to reduce the prevalence of this condition in children and adolescents. One study (Stevens et al., 2005) provided empirical data supporting a substantial link between genetic traits and ADHD. The etiology of ADHD in younger children is strongly influenced by genetic predisposition, which contributes about 75% of the overall risk (Bazar et al., 2006). Alongside inherited traits, ADHD has been associated with other risk factors, such as brain damage, prenatal exposure to alcohol and nicotine, and preterm birth (Stevens et al., 2005). Several previous studies (Agranat-Meged et al., 2005; Kollins et al., 2005; Bramlett and Blumberg, 2007; Cortese et al., 2008; Waring and Lapane, 2008; Choy et al., 2018; Zhou et al., 2019; Ghaderzadeh et al., 2021) have shown a high correlation between ADHD in children and a range of factors, such as age, gender, asthma, race, anxiety, depression, obesity, smoking, and socioeconomic level. The primary objective of this research was to ascertain the risk variables associated with ADHD in children and adolescents. There is a clear need for a predictive model, and machine learning (ML) models offer a favorable alternative to conventional prediction techniques. ML models have been widely used across several domains, including medical imaging (Alanazi et al., 2017; Battineni et al., 2020; Zea-Vera et al., 2021), healthcare (Dwyer et al., 2018; Burke et al., 2019; Kessler et al., 2019), and mental health (Barry et al., 2003; Linthicum et al., 2019), to perform identification and prediction tasks effectively.
ADHD is a neurodevelopmental disorder that may manifest in individuals of various age groups and is characterized by symptoms such as inattention, hyperactivity, and impulsivity. Diagnosing ADHD is challenging because it relies on subjective assessments, including self-reports and observations provided by parents, teachers, and clinicians. These assessments are susceptible to bias, which can lead either to an inaccurate diagnosis or to a missed one.
The research gap highlights the need for further investigation and for artificial intelligence algorithms specifically tailored to identifying individuals with ADHD accurately. Addressing this gap is expected to enhance the accuracy and objectivity of ADHD diagnosis, leading to improved treatment options and outcomes for individuals with ADHD. The primary contributions of this research are as follows:
Developing a decision system based on machine learning models that can detect patients with ADHD.
The proposed approach aims to enhance clinicians’ comprehension and assessment of the likelihood of a person being diagnosed with ADHD by using the existing data.
The proposed system achieved 91% accuracy using a small standard dataset.
BACKGROUND
ML refers to computational approaches that autonomously identify appropriate techniques and parameters in order to reach an optimum solution to a given problem (Buchsbaum and Wender, 1973). In ML, a computer acquires knowledge from recorded data with little human interaction. It can identify patterns within the data and suggest ways to enhance the accuracy of diagnosis and prognosis. The technique is particularly useful for predicting human behavior, especially high-risk behavior, and its application holds potential for enhancing the efficacy and objectives of preventive programs and treatments (Buchsbaum and Wender, 1973). Compared with traditional statistical methods, ML offers benefits in terms of prediction accuracy and scalability (Robaey et al., 1992). Several recent studies have therefore used ML to distinguish individuals with ADHD from control groups, and they have shown a reasonable level of accuracy when using linear classifiers (Satterfield and Braley, 1977; Smith et al., 2003; Riaz et al., 2020; Hang et al., 2022; Zhao et al., 2022). However, a larger body of more rigorous research is required in order to predict ADHD effectively with ML techniques.
Diverse data gathering strategies and artificial intelligence algorithms have recently made substantial contributions to the field of ADHD diagnosis. Several research groups have employed deep learning and ML algorithms to study ADHD diagnosis, with the Neuro Bureau ADHD-200 dataset serving as a common resource. The ADHD-200 dataset comprises 776 instances of resting-state functional magnetic resonance imaging and structural magnetic resonance imaging (MRI) data (Liu et al., 2020; Luo et al., 2020; Riaz et al., 2020; Sun et al., 2020; Zhang et al., 2022; Zhao et al., 2022).
Peng et al. (2021) introduced a deep learning framework based on a convolutional neural network, which achieved a diagnostic accuracy of 72.9% for ADHD (Sun et al., 2020). Chen et al. developed an ML approach using support vector machines (SVMs) and reported a diagnostic accuracy of 88.1% (Chen et al., 2020). A high-quality public dataset allows multiple research groups to improve the reliability of their results through iterative algorithmic refinements. The dataset is openly accessible and is an excellent resource for researchers interested in exploring the potential of MRI data in the diagnosis of ADHD.
Several studies (Chen et al., 2019; Vahid et al., 2019; Dubreuil-Vall et al., 2020) have investigated whether ADHD can be diagnosed from EEG data. Tosun (2021) employed a deep learning system that includes long short-term memory, with the goal of diagnosing ADHD precisely, and the system achieved a classification accuracy of 92.2%. A total of 1088 participants who had been diagnosed with ADHD and the same number of controls participated in the study (Müller et al., 2019). In addition, Altınkaynak et al. (2020) analyzed EEG data collected from 23 people with ADHD and 23 people without ADHD, using the ML strategy known as multilayer perceptron (MLP). Their investigation reported an overall accuracy of 91.3% (Koh et al., 2022).
Another promising approach is based on data obtained from continuous performance tests (CPTs). The CPT is frequently used in healthcare facilities as an auxiliary method in the process of ADHD diagnosis, and it has served as the primary source of data in research on the classification of ADHD (Slobodin et al., 2020; Yasumura et al., 2020). Slobodin et al. (2020) analyzed CPT results from a sample of 213 individuals who had been diagnosed with ADHD and 245 individuals who did not have the condition; the participants were examined over a period of 5 years. Using random forests (RFs), an ML method, the study team achieved a success rate of 87%. A further investigation was conducted by O’Mahony et al. (2014), who based their classification of individuals diagnosed with ADHD on data collected while a CPT was administered. Each participant was equipped with two inertial measurement unit sensors, one fastened around the waist and the other positioned on the ankle or foot. Using the SVM approach, as described in Slobodin et al. (2020), a classification accuracy of 95.1% was attained.
MATERIALS AND METHODS
Framework of the proposed system
Figure 1 depicts the method that has been developed for the purpose of detecting and classifying ADHD.
Dataset
The dataset is a collection of the phenotypic characteristics of children diagnosed with ADHD (Kieling and Rohde, 2012). It contains only the variables of interest, with a sample size of 221 people and a total of eight variables. The participants were selected from the outpatient population at the Peking University Institute of Mental Health. The study used a standardized diagnostic interview known as the Clinical Diagnostic Interviewing Scale. The sample consisted of 63 female participants and 158 male participants. The dataset consisted of two classes, namely control and ADHD. Figure 2 shows the number of instances in each class. The dataset is available at the following link: https://github.com/rahmarid/dataset (accessed 2-8-2023).
Preprocessing data
Scaling data
Min-max normalization, also known as feature scaling, applies a linear transformation to the original data. In this study, all data were normalized to the interval [0, 1]. The formula is as follows:

$$f' = \frac{f - f_{\min}}{f_{\max} - f_{\min}}$$

where $f$ is an original feature value, $f'$ is its normalized value, and $f_{\min}$ and $f_{\max}$ denote the minimum and maximum values of that feature, respectively. Min-max normalization preserves the relative relationships between the original data values. A limitation of mapping the data into such a narrow range is an evident decrease in standard deviations, which may diminish the influence of outliers.
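The rescaling described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `min_max_scale`, the sample values, and the convention of mapping a constant feature to 0.0 are assumptions, not details taken from the study.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly to the interval [0, 1]."""
    f_min, f_max = min(column), max(column)
    if f_max == f_min:
        # A constant feature has no spread; mapping it to 0.0 is an assumed convention.
        return [0.0 for _ in column]
    return [(f - f_min) / (f_max - f_min) for f in column]


ages = [6, 9, 12, 15]              # hypothetical feature values
scaled_ages = min_max_scale(ages)  # smallest value maps to 0.0, largest to 1.0
print(scaled_ages)
```

Because the transformation is linear, the ordering and relative spacing of the original values are preserved.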
Balance data
An unbalanced dataset is characterized by a disparity in the number of instances between class labels, with one class label being more prevalent than the other. When classifying unbalanced data, ML algorithms tend to be biased toward the majority class. To address this issue, we used two distinct data sampling approaches: oversampling and undersampling. Oversampling randomly selects samples from the minority class with replacement and adds them to the training dataset, which enhances the efficacy of ML-based classifiers. Undersampling randomly selects samples from the majority class, without replacement, until a balanced distribution of class labels is achieved. In this dataset, the ADHD class contains more instances than the control class. Consequently, methods for handling class imbalance were used to improve the precision of the ML approaches. Figure 3 illustrates the class imbalance within the dataset.
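The two sampling strategies can be illustrated with the standard library alone. The sketch below is not the study's code: the helper names, the toy class sizes, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def random_oversample(minority, target_size, rng):
    """Grow the minority class to target_size by drawing extra samples with replacement."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra


def random_undersample(majority, target_size, rng):
    """Shrink the majority class to target_size by drawing samples without replacement."""
    return rng.sample(majority, target_size)


rng = random.Random(0)                          # fixed seed, only for reproducibility
adhd = [("adhd", i) for i in range(10)]         # hypothetical majority class
control = [("control", i) for i in range(4)]    # hypothetical minority class

oversampled_control = random_oversample(control, len(adhd), rng)
undersampled_adhd = random_undersample(adhd, len(control), rng)
print(len(oversampled_control), len(undersampled_adhd))
```

Oversampling keeps every original record but repeats some of them, whereas undersampling discards majority records, which matters on a dataset as small as the one used here.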
The synthetic minority oversampling technique (SMOTE) balances the distribution of data by generating synthetic minority samples from existing minority samples and their nearest neighbors. Despite its effectiveness in enhancing the classification accuracy of minority data, SMOTE has persisting issues, among them overgeneralization: the synthetic data it generates may fall among both the minority and majority classes. The formula for generating synthetic data using the SMOTE technique may be represented as follows:

$$D_{new} = D_i + (\hat{D}_j - D_i) \times \delta$$

where $D_{new}$ represents a synthetic sample added to the ADHD dataset, $D_i$ represents a sample from the minority group, $\hat{D}_j$ represents one of the k-nearest neighbors of $D_i$, and $\delta$ is a randomly generated number within the range of 0 to 1. We applied the SMOTE method to improve the classification process.
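The interpolation step that SMOTE performs can be sketched as follows. This is a simplified, standard-library illustration under stated assumptions (Euclidean distance, neighbors taken within the minority class, hypothetical data and function name); it is not the implementation used in the study.

```python
import math
import random


def smote_sample(minority, k, rng):
    """Create one synthetic point between a random minority sample and one of its k nearest neighbors."""
    d_i = rng.choice(minority)
    # Rank the other minority samples by Euclidean distance to d_i.
    others = [p for p in minority if p is not d_i]
    neighbors = sorted(others, key=lambda p: math.dist(p, d_i))[:k]
    d_j = rng.choice(neighbors)
    delta = rng.random()  # random number in [0, 1)
    # D_new = D_i + (D_j - D_i) * delta, applied feature by feature.
    return [a + (b - a) * delta for a, b in zip(d_i, d_j)]


rng = random.Random(42)  # fixed seed, only for reproducibility
minority = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.9, 0.8]]  # hypothetical scaled features
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(5)]
print(synthetic)
```

Each synthetic point lies on the line segment between a minority sample and one of its neighbors, so it is a new record rather than a copy of an existing one.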
Machine learning approaches
Support vector machine
The SVM is extensively used in supervised ML for problems such as classification and regression. Its basic objective is to identify a hyperplane that effectively partitions the feature space into separate classes. In binary classification, the SVM seeks an optimal decision boundary that maximizes the margin between the two groups, where the margin is the distance between the decision boundary and the support vectors, the data points of each class closest to the boundary. The algorithm is designed to find a decision boundary that separates the classes effectively and also performs well on new, unseen data. Figure 4 illustrates how the support vectors separate the classes. In this study, kernel functions were employed to classify the ADHD and control classes. Kernel functions are mathematical functions that map the original data into a feature space of higher dimensionality in which the transformed data can be linearly separable. Commonly used kernel functions include the polynomial kernel, the Gaussian kernel [radial basis function (RBF) kernel], and the sigmoid kernel.
A common form of the RBF kernel is

$$K(X, y) = \exp\left(-\gamma \lVert X - y \rVert^2\right)$$

where $X$ and $y$ denote two feature vectors from the ADHD dataset, used both for training the algorithm and for assessing it, $\lVert X - y \rVert^2$ is the squared Euclidean distance between the two feature inputs, and $\gamma$ is an adjustable parameter.
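To make the role of the squared Euclidean distance concrete, the following sketch evaluates a Gaussian (RBF) kernel for two feature vectors. The parametrization with a width parameter `gamma` is an assumption (it is the form used by common libraries), and the vectors are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


x = [0.2, 0.5, 0.9]  # hypothetical scaled feature vectors
y = [0.1, 0.4, 0.7]
same = rbf_kernel(x, x)                 # identical inputs give the maximum similarity, 1.0
near = rbf_kernel(x, y, gamma=1.0)
sharper = rbf_kernel(x, y, gamma=10.0)  # larger gamma makes similarity fall off faster
print(same, near, sharper)
```

The kernel value is 1 for identical inputs and decays toward 0 as the inputs move apart; the adjustable parameter controls how quickly that decay happens.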
Random forest tree
The RF algorithm is a well-recognized ensemble learning method in ML. A forest is formed by combining many decision trees (DTs). Each DT is trained on a distinct random subset of the training data, drawn by random sampling with replacement, a procedure often referred to as bootstrapping. The RF algorithm then aggregates the individual predictions of the trees to obtain the final prediction. In classification problems, the predicted class is the one that receives the majority of votes from the trees. In regression tasks, the final prediction is obtained by averaging the predictions of all the trees.
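The two ensemble ingredients described above, bootstrapping and majority voting, can be sketched independently of any tree implementation. The helper names, the row indices, and the per-tree votes below are hypothetical and serve only to illustrate the mechanics.

```python
import random
from collections import Counter


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) rows at random with replacement."""
    return [rng.choice(dataset) for _ in dataset]


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(1)     # fixed seed, only for reproducibility
rows = list(range(8))      # hypothetical row indices of a training set
training_sets = [bootstrap_sample(rows, rng) for _ in range(3)]  # one set per tree

tree_votes = ["ADHD", "control", "ADHD"]  # hypothetical predictions from three trees
final_label = majority_vote(tree_votes)
print(training_sets, final_label)
```

Because sampling is done with replacement, each tree sees some rows more than once and misses others, which is what makes the trees differ from one another.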
Random forest
function random_forest_tree(dataset, max_depth, num_features):
    if max_depth == 0 or dataset is pure:
        return create_leaf_node(dataset)
    feature_subset = select_random_features(num_features)
    best_feature, best_split_value = find_best_split(dataset, feature_subset)
    if best_feature is None:
        return create_leaf_node(dataset)
    left_dataset, right_dataset = split_dataset(dataset, best_feature, best_split_value)
    left_subtree = random_forest_tree(left_dataset, max_depth - 1, num_features)
    right_subtree = random_forest_tree(right_dataset, max_depth - 1, num_features)
    return create_decision_node(best_feature, best_split_value, left_subtree, right_subtree)

function create_leaf_node(dataset):
    label = majority_vote(dataset)
    return LeafNode(label)

function create_decision_node(feature_index, split_value, left_subtree, right_subtree):
    return DecisionNode(feature_index, split_value, left_subtree, right_subtree)

function select_random_features(num_features):
    // Randomly select a subset of features from the available features
    feature_subset = random.sample(available_features, num_features)
    return feature_subset

function find_best_split(dataset, feature_subset):
    best_feature = None
    best_split_value = None
    best_gini = infinity
    for feature in feature_subset:
        feature_values = get_feature_values(dataset, feature)
        unique_values = unique(feature_values)
        for value in unique_values:
            left_dataset, right_dataset = split_dataset(dataset, feature, value)
            gini = compute_gini(left_dataset, right_dataset)
            if gini < best_gini:
                best_gini = gini
                best_feature = feature
                best_split_value = value
    return best_feature, best_split_value

function split_dataset(dataset, feature_index, split_value):
    left_dataset = empty_dataset()
    right_dataset = empty_dataset()
    for instance in dataset:
        feature_value = instance[feature_index]
        if feature_value <= split_value:
            left_dataset.add(instance)
        else:
            right_dataset.add(instance)
    return left_dataset, right_dataset

function compute_gini(left_dataset, right_dataset):
    total_instances = len(left_dataset) + len(right_dataset)
    gini = 0
    for dataset in [left_dataset, right_dataset]:
        dataset_size = len(dataset)
        if dataset_size > 0:
            class_counts = count_classes(dataset)
            class_probabilities = class_counts / dataset_size
            gini += (1 - sum(class_probabilities ** 2)) * (dataset_size / total_instances)
    return gini
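The split search in the pseudocode above can be made concrete for a single feature. The following is a minimal runnable sketch rather than the study's implementation: the function names, the toy feature values, and the rule of skipping splits that leave one side empty are assumptions.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left_labels, right_labels):
    """Size-weighted average impurity of the two children of a split."""
    total = len(left_labels) + len(right_labels)
    return sum(
        gini_impurity(side) * len(side) / total
        for side in (left_labels, right_labels)
        if side  # skip an empty child
    )


def best_threshold(values, labels):
    """Try each observed value as a '<=' threshold and keep the lowest weighted Gini."""
    best = (None, float("inf"))
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # assumed rule: a split must send rows to both sides
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best


values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]  # hypothetical scaled feature
labels = ["control", "control", "control", "ADHD", "ADHD", "ADHD"]
threshold, score = best_threshold(values, labels)
print(threshold, score)
```

In this toy example the threshold 0.3 separates the two classes perfectly, so the weighted Gini of the chosen split is 0.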
A multilayer perceptron
An MLP is a specific type of feedforward neural network. It consists of an input layer, a hidden layer, and an output layer, with connections that go in only one direction between neurons, so that data move forward through the three layers of the network. The number of nodes in the input layer equals the number of input data attributes, and the number of nodes in the output layer equals the number of classes in the dataset. The layers are fully connected: each node in the input layer is connected to every node in the hidden layer, and each node in the hidden layer is connected to every node in the output layer. Figure 5 illustrates the structure, consisting of 7 inputs, 10 hidden nodes, and 2 outputs.
Multilayer perceptron
Step 1: Initialize the network
    Initialize weights and biases randomly.
Step 2: Forward propagation
    Calculate the activation of the first layer.
    For each hidden layer, calculate the weighted sum of inputs and biases for each neuron and apply the activation function.
    Obtain the final output of the network by applying the activation function to the weighted sum at the output layer.
Step 3: Backward propagation (updating weights and biases)
    Calculate the error at the output layer.
    Update the weights and biases.
Step 4: Continue to iterate steps 2 and 3 until either convergence is achieved or the maximum allowable number of iterations has been reached.
Step 5: Utilize the learned neural network for making predictions.
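Steps 1 and 2 of the procedure above can be sketched in plain Python for a network with 7 inputs, 10 hidden units, and 2 outputs, the sizes given in the description of Figure 5. The sigmoid and softmax activations, the initialization range, and the input values are assumptions made for illustration; backward propagation (step 3) is deliberately left out.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Step 1: random weights and biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def weighted_sums(inputs, weights, biases):
    """Weighted sum of the inputs plus the bias, for every neuron in a layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Turn output-layer sums into class probabilities (shifted by the max for stability)."""
    peak = max(zs)
    exps = [math.exp(z - peak) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def forward(inputs, hidden_layer, output_layer):
    """Step 2: propagate the inputs through the hidden layer to the output layer."""
    hidden = [sigmoid(z) for z in weighted_sums(inputs, *hidden_layer)]
    return softmax(weighted_sums(hidden, *output_layer))


rng = random.Random(0)                  # fixed seed, only for reproducibility
hidden_layer = init_layer(7, 10, rng)   # 7 inputs -> 10 hidden units
output_layer = init_layer(10, 2, rng)   # 10 hidden units -> 2 classes
features = [0.5] * 7                    # hypothetical scaled feature vector
class_probabilities = forward(features, hidden_layer, output_layer)
print(class_probabilities)
```

With untrained random weights the two probabilities carry no diagnostic meaning; training (steps 3 and 4) is what adjusts the weights so that they do.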
EXPERIMENT RESULTS
An ML model was developed using several techniques, including SVMs, DTs, RFs, and MLP, on the dataset described above. These algorithms were used to differentiate individuals diagnosed with ADHD from those who do not exhibit the disorder. The modeling work was carried out in Python. The characteristics in the dataset were used as input for the detection and classification of ADHD.
Configuration system
The experiments were run on a laptop equipped with an eighth-generation Intel Core i7 CPU and 8 GB of RAM. The scikit-learn Python library was used to develop the models. This configuration was used to train and evaluate our ML models.
RESULTS OF MACHINE LEARNING
Performance of the models
This section compares four ML-based classifiers for detecting and classifying children diagnosed with ADHD. Table 1 presents their predictive performance. The SVM-based classifier attained the highest classification accuracy, 91%, with a precision of 92% and a recall of 91%. In contrast, the DT classifier exhibited the lowest classification accuracy, 78%, with a precision of 78% and a recall of 78%. The RF algorithm yielded an accuracy of 87%, a precision of 85%, and a recall of 87%. The MLP algorithm reached an accuracy of 89%, with a precision of 87% and a recall of 89%. The performance indicators of the ML models are shown in Figure 6.

Accuracy performance of the machine learning model. Abbreviations: MLP, multilayer perceptron; SVM, support vector machine.
Results of the machine learning models.
Model | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
---|---|---|---|---|
Decision tree | 78 | 78 | 78 | 73 |
Random forest | 87 | 85 | 87 | 85 |
SVM | 91 | 92 | 91 | 89 |
MLP | 89 | 87 | 89 | 87 |
Abbreviations: MLP, multilayer perceptron; SVM, support vector machine.
For the binary classification task, the classification accuracy for distinguishing between the control and ADHD classes was examined. Figure 7 shows the confusion matrices of the ML algorithms. During the testing phase, the SVM algorithm categorized 39 instances as belonging to the control class and 2 instances as belonging to the ADHD class. The DT approach showed the worst result, with only 34 patients classified as control and 1 patient classified as ADHD, and with more misclassifications.
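The relationship between a confusion matrix and the metrics reported in Table 1 can be illustrated with a short calculation. The counts below are hypothetical and are NOT taken from Figure 7; treating ADHD as the positive class is also an assumption, and the study may have used averaged rather than per-class metrics.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (ADHD treated as the positive class).
metrics = binary_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and recall can diverge on the same confusion matrix, which is why Table 1 reports both alongside accuracy.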
DISCUSSION
Diagnosing ADHD accurately is a challenging task, and an incorrect diagnosis significantly increases the risk of unfavorable medical outcomes. Owing to the intricate nature of the condition and the difficulty of diagnosing it, there is currently no computerized expert diagnostic system available. In recent years, using artificial intelligence techniques to diagnose ADHD automatically by analyzing brain signals has emerged as one solution for early detection.
The objective of this study was to use ML techniques to predict and report symptoms of adult ADHD. Four ML algorithms, namely SVM, DT, RF, and MLP, were applied to distinguish between individuals with ADHD and control patients. The results demonstrated a notable level of accuracy, with scores varying between 78% and 91%. The accuracy of predicting ADHD symptoms was high, even though the different approaches showed some variation. The SVM algorithm allows for the identification of risk factors associated with a shorter attention span, a symptom of adult ADHD. A comparative analysis of the four ML classifiers was conducted using the receiver operating characteristic (ROC) curve, as shown in Figure 8. The RF-based classifier achieved the highest area under the curve (AUC) among the examined classifiers, 90%, and showed notable efficacy in reliably identifying children who have ADHD.

ROC of the proposed machine learning algorithms. Abbreviations: MLP, multilayer perceptron; ROC, receiver operating characteristic; SVM, support vector machine.
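The AUC summarized by a ROC curve has a simple interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way; the labels and scores are hypothetical, and the pairwise method is chosen for clarity rather than efficiency.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores: 1 = ADHD, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy example 8 of the 9 positive/negative pairs are ordered correctly, giving an AUC of 8/9; a value of 0.5 would correspond to random ranking.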
This research focuses on the detection of ADHD using a dataset obtained from individuals selected from the outpatient population at the Peking University Institute of Mental Health. Future research could extend the approach to diverse datasets derived from electroencephalography (EEG) and MRI. Notwithstanding this constraint, we posit that our study makes a valuable contribution to the expanding body of knowledge on the accurate identification of ADHD using ML approaches.
CONCLUSION
The prevalence of mental disorders is steadily increasing worldwide, with significant health implications as well as substantial social, human rights, and economic consequences across all nations. The objective of this study was therefore to use ML methodologies to classify ADHD, with the aim of contributing findings that accelerate progress toward an automated diagnostic system. The dataset was obtained from the Institute of Mental Health at Peking University. The study sample included 63 female and 158 male participants, and the dataset consisted of two classes, namely control and ADHD. The ML classifiers used in this study were SVM, DT, RF, and MLP. The SVM approach achieved the highest classification accuracy, 91%. Overall, the detection of ADHD using ML techniques shows encouraging outcomes. Beyond achieving high classification accuracy, ML techniques may also reveal the significance of features and the discriminative capabilities of modalities, which can provide valuable insights for both clinical and research purposes. Future research should prioritize enhancing the interpretability and generalizability of models.