


      Machine Learning Approaches for Detecting Early Biomarkers of Parkinson’s Disease Through Sleep Behavior Disorder Analysis


            Abstract

            An increasing number of people are living with Parkinson’s disease (PD), which has emerged as a major global health issue in recent years. Artificial intelligence methods have shown encouraging results and are becoming an important tool for the early diagnosis of PD. This study presents an overview of the identification and monitoring of PD using contemporary technology and focuses on developing diagnostic methods for early detection, with the aim of managing the condition effectively during its first phases. The proposed approach can assist professionals in continuously monitoring the PD rating scale. The machine learning (ML) techniques used for diagnosing PD were random forest (RF), K-nearest neighbors, support vector machine, and artificial neural network. The algorithms were evaluated on a standard dataset obtained from Kaggle, which includes data on PD, sleep behavior disorder, and healthy control subjects. The dataset was partitioned into a training set comprising 70% of the data and a testing set comprising 30% of the data. The empirical results show that the RF algorithm achieved the highest accuracy, 96%. Such a system can help improve the quality of social healthcare through timely interventions, better patient outcomes, and optimized resource allocation.

            Main article text

            INTRODUCTION

            Parkinson’s disease (PD), the second most prevalent neurodegenerative disorder, is distinguished by the gradual onset of motor symptoms over time. Patients with PD may also exhibit non-motor symptoms, including cognitive impairment, depressive symptoms, hallucinations, sleep disruption, and constipation, among other potential manifestations (DeMaagd and Philip, 2015). The global prevalence of this condition exceeded 8.5 million individuals in 2019 (World Health Organization, n.d.), with a greater occurrence in males than in females (Liu et al., 2016). The illness affects 1-2 individuals per 1000 in the general population and about 1% of those aged 60 and older (Tysenes and Storstein, 2017). Its prevalence is rising as a result of population aging. The number of PD patients in China is expected to reach 4.94 million in 2030 (Ou et al., 2021). A thorough investigation conducted in the United States found that the total yearly social and economic burden of PD amounted to $51.9 billion in 2017, when almost 1 million individuals had been diagnosed with PD (Yang et al., 2020). The number of people with PD is projected to surpass 1.6 million by the year 2037, resulting in an economic burden of $79 billion (Agosta et al., 2015).

            According to neuropathologists, misfolded proteins may have a role in the spread of neurodegenerative disorders across the brain and spinal cord (Meghdadi et al., 2021). Like the proliferation of "prions," this process unfolds slowly but surely. Pathological alterations impair brain function, and clinical symptoms emerge over time as the disease advances. In other words, neuropathological changes can occur up to 10 years before clinical symptoms appear (Dickson, 2012). This interval is the prodromal phase, which is characterized by a variety of symptoms that do not yet meet diagnostic criteria or identify the disease (Mahlknecht et al., 2015; Postuma et al., 2015). People may have hyperechogenicity in the substantia nigra, olfactory and autonomic dysfunction, depression, sleep problems related to rapid eye movement (REM) sleep behavior disorder (RBD), and cognitive impairment years before they experience clinical symptoms. An early diagnosis, even during the early prodromal stage, is essential for doctors to develop a neuroprotective treatment plan that can influence the evolution of the illness, with the goal of avoiding or slowing the disease’s advancement and giving patients a longer period of improved quality of life (Berg, 2008).

            To optimize care and improve the quality of life for patients with PD, a multidisciplinary team of researchers has collaborated. The existing literature covers a wide range of studies, from those exploring chemical and behavioral factors to those employing computer-assisted diagnostic techniques (Pereira et al., 2019). The primary contribution of this study is based on the latter methodologies, which utilize computer technologies to assist researchers in expediting and enhancing the diagnosis of PD.

            Most studies on this subject use artificial intelligence, namely machine learning (ML), to determine the factors that are most relevant for diagnosing individuals. Spadotto et al. (2010) proposed using the Optimum-Path Forest (OPF) classifier (Papa et al., 2009, 2012) to help in the automated detection of PD. The same authors subsequently suggested an evolutionary approach to determine the most discriminative set of features for improving the accuracy of identifying PD (Spadotto et al., 2011). The OPF classifier is considered a practical tool because it does not require any parameters and is easy to manage.

            ML has become a significant tool in the healthcare sector in recent years. ML approaches are computational techniques that enable computer-based applications to learn from data and derive meaningful insights independently, reducing the reliance on human involvement. Different types of data, such as handwriting, gait, neuroimaging, voice, cerebrospinal fluid, cardiac scintigraphy, serum, and optical coherence tomography (Bakar et al., 2012; Benba et al., 2016a, 2016b; Bernad-Elazari et al., 2016; Baby et al., 2017; Alqahtani et al., 2018; Amoroso et al., 2018; Anand et al., 2018; Alharthi and Ozanyan, 2019; Ali et al., 2019a, 2019b, 2019c; Baggio et al., 2019; Banerjee et al., 2019; Khodatars et al., 2021), have been used to diagnose PD through the application of ML models. ML also allows supplementary modalities, such as magnetic resonance imaging (MRI) and single photon emission computed tomography (SPECT) data, to be incorporated for the purpose of diagnosing PD (Bhati et al., 2019; Buongiorno et al., 2019). ML techniques can be used to find important structures that are not often included in the clinical diagnosis of PD. These alternative measures may therefore help identify PD in its early stages or in atypical forms. This research makes the following contributions:

            1. Developing a diagnosis system based on ML algorithms for identifying PD in its early stages through RBD.

            2. The proposed system achieved a high accuracy of 96% by using the random forest (RF) algorithm.

            BACKGROUND OF STUDY

            ML is a subfield of computational intelligence that focuses on the creation of algorithms that improve a computer program’s performance by leveraging previously acquired knowledge. Brain function is an active area of research, and this work has spurred several studies focused on using ML approaches to assist in the identification of PD (Alshammri et al., 2023).

            Drotár et al. (2016) recommended measuring handwriting skills using entropy, energy, and intrinsic metrics. To maximize handwriting’s potential, the authors considered using these measures for in-air movements and pressure. The researchers classified the data using a support vector machine (SVM) with a radial basis function (RBF) kernel and achieved good prediction accuracy. Connolly et al. (2015) used linear discriminant analysis (LDA), SVM, and K-nearest neighbors (K-NN) techniques to analyze local field potentials obtained from an implanted deep brain stimulation device. For this work, a total of 83 montages were recorded from 15 patients who were diagnosed with advanced idiopathic PD. This resulted in a precision rate of 91%.

            Wahid et al. (2015) made two primary contributions. First, the researchers used multiple regression normalization to assess and compare spatial-temporal gait characteristics in individuals with PD and individuals without the disease [healthy controls (HC)]. Second, they evaluated ML methods for PD gait classification after multiple regression normalization. The study has implications for assessing PD diagnosis and severity from spatial-temporal gait data. PD gait was detected using kernel Fisher discriminant (KFD), Naïve Bayes (NB), K-nearest neighbors (KNN), SVM, and RF algorithms.

            Smith et al. (2015) used evolutionary algorithms to develop meaningful and objective measures for the detection of PD in animals and people. The animal data came from fruit flies with and without PD genetic abnormalities, while the human data were gathered using commercial sensors and non-invasive methods. The researchers used Cartesian genetic programming to categorize patients’ dyskinesia severity and to distinguish between healthy individuals and individuals with PD. Hirschauer et al. (2015) devised a novel method for diagnosing PD by analyzing continuous phonation data. They employed the minimal redundancy maximum relevance (mRMR) method to choose the most significant features, and different feature selection approaches were tested and compared. The National Center for Voice and Speech in Denver, Colorado, and the University of Oxford both contributed to this dataset. After feature selection, the data were passed to both a standard artificial neural network (ANN) and complex-valued neural networks (CVANNs). The findings were encouraging: the ANN attained a precision rate of 98.12%, while the cross-validated CVANNs achieved an accuracy rate of 98.12%. To investigate and predict PD patients’ medication adherence, Tucker et al. (2015) developed a low-cost data mining system that uses non-wearable multimodal sensors and considers variations in patients’ walking patterns. Whole-body movement data of patients with PD can be used to differentiate between individuals who are undergoing pharmaceutical treatment and those who are not. A customized model may achieve an accuracy of 97%, whereas a generic model that uses gait data from several individuals may achieve an accuracy of 78%. Procházka et al. (2015) examined five well-documented PD sites using computer-assisted diagnostics. To distinguish PD patients from controls, experiments were conducted using 3D volumetric T1-weighted MRI. The study used SVM and achieved 86.67% accuracy. Wabnegger et al. (2015) examined brain activity during emotion perception in PD patients and healthy people using facial emotion recognition. Faces expressing different emotions were shown to participants while their brain activity was monitored.

            Martínez-Murcia et al. (2014) used dopamine transporter scan (DaTSCAN) images to develop a PD detection tool, training a classifier on independent component analysis features. DaTSCAN is a radiopharmaceutical imaging agent used in SPECT to evaluate dopamine transporter levels in the brain. It helps differentiate PD from other movement disorders, especially when clinical symptoms are unclear. The Parkinson’s Progression Markers Initiative (PPMI) is a large-scale, international research project focused on identifying biomarkers and understanding the progression of PD. It gathers and analyzes clinical, imaging, and biological data from individuals with PD as well as HC. The goal is to drive the development of more effective diagnostics and treatments for PD.

            Villa-Cañas et al. (2015) briefly introduced the Wigner-Ville distribution: a time-frequency analysis tool that depicts a signal’s energy distribution over both time and frequency. It is effective for analyzing non-stationary signals with varying spectral content, such as speech or complex modulated signals. Villa-Cañas et al. (2015) developed four time-frequency techniques using this distribution to examine low-frequency components in continuous speech signals of individuals with PD. These techniques aim to correlate spectral changes with specific characteristics, facilitating the detection of voice tremors linked to PD.

            Ertugrul et al. (2016) proposed local binary pattern (LBP) and ML approaches for classifying PD. LBP is a texture analysis method used in image processing. It captures local texture features by comparing each pixel with its neighbors, creating a binary pattern that represents the texture of the image. LBP is widely used for tasks such as texture classification and facial recognition because of its simplicity and effectiveness. In that study, the LBP method was used to extract features from the dataset. The experiments were run under various conditions and reached 88.88% accuracy. The authors claim that the method can diagnose PD and identify patterns from localized signal changes. An acoustic investigation of hypokinetic dysarthria in PD patients was performed by Smekal et al. (2015). They quantitatively analyzed Czech vowels and created a speech feature based on empirical mode decomposition. Combining this feature with sequential forward feature selection enhanced the performance of the analysis. The researchers reported 94% accuracy in recognizing vowels in Parkinson’s patients. Dai et al. (2015) introduced an approach that uses empirical mode decomposition to analyze filtered electromyograms (EMG). Filtered EMG are electrical recordings of muscle activity that have been processed to remove noise and artifacts. In diagnosing PD, filtered EMG can be used to analyze muscle tremors and other motor symptoms associated with the condition. By examining these filtered signals, clinicians can assess the characteristic tremor patterns and muscle activity that help differentiate PD from other movement disorders and monitor disease progression. The method analyzes PD using a variable number of features. The signals underwent a three-stage preprocessing procedure using a bandpass filtering approach to demonstrate the linear separability of the features. Subsequently, the suggested algorithm was deployed as a mobile application to make it more adaptable than existing techniques.

            MATERIALS AND METHODS

            Figure 1 depicts the proposed method for identifying PD. The developed system uses a standard dataset obtained from Kaggle, an online platform that provides data science and ML enthusiasts with access to datasets, competitions, and educational resources. The dataset has three labels: RBD, PD, and HC. A neurologist with expertise in movement disorders clinically assessed all patients. Each participant read a standardized, phonetically balanced text of 80 words and delivered a monologue of roughly 90 seconds about their interests, career, family, or current activities. Missing values were handled computationally. Once the missing values were addressed, the data were separated into 70% training examples and 30% testing examples. The ML algorithms were then used to classify the dataset into RBD, PD, and HC.
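
The 70%/30% partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_70_30`, the fixed seed, and the placeholder records are assumptions, and the class counts are taken from the Dataset section on the assumption that all 130 recordings are retained after cleaning.

```python
import random

def split_70_30(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and testing subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder records (features omitted): 30 PD, 50 RBD, and 50 HC participants.
records = ["PD"] * 30 + ["RBD"] * 50 + ["HC"] * 50
train, test = split_70_30(records)
print(len(train), len(test))  # 91 39
```

A stratified split, which preserves the class proportions in both subsets, is a common alternative when classes are imbalanced; the text does not say which variant was used.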

            Framework of the proposed system
            Figure 1:

            A generic system.

            Dataset

            The dataset comprises 30 individuals diagnosed with early untreated PD, 50 individuals diagnosed with RBD who are at a heightened risk of developing PD or other synucleinopathies, and 50 individuals serving as HC. The clinical scoring of all patients was conducted by a neurologist specializing in movement disorders. Every participant was evaluated in a single session with a speech specialist. The participants read a standardized, phonetically balanced text of 80 words and delivered a monologue of roughly 90 seconds about their hobbies, occupation, family, or ongoing activities. Hlavnička et al. (2017) conducted an automated analysis of speech characteristics. The labels of the dataset are shown in Figure 2. Figure 3 displays the features of the dataset as boxplots. A boxplot is a standardized method for visually representing a dataset using the five-number summary: the minimum, first quartile, median, third quartile, and maximum.

            Figure 2:

            Labels of dataset.

            Figure 3:

            Features of PD dataset. Abbreviation: PD, Parkinson’s disease.

            Preprocessing approaches

            The preprocessing stage is important in developing a classification model because it prepares the dataset for fitting, thereby improving the accuracy and performance of ML algorithms. In this study, missing values were eliminated to clean the dataset. After cleaning, Min-Max normalization was used to bring the dataset’s features to a common scale.

            (1) $x_{\text{normalize}} = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$

            In Eqn (1), $x$ is a feature value in the training data, and $x_{\min}$ and $x_{\max}$ are the smallest and largest values of that feature, so the smallest value is mapped to 0 and the largest to 1.
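
As a worked illustration of Eqn (1), the sketch below scales a single feature column to the range [0, 1]. The function name and the sample values are hypothetical, and the handling of a constant column is an added assumption that the text does not discuss.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] following Eqn (1)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant feature is mapped to 0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]

feature = [2.0, 4.0, 6.0, 10.0]  # hypothetical feature column
print(min_max_normalize(feature))  # [0.0, 0.25, 0.5, 1.0]
```

In practice the minimum and maximum are usually estimated on the training set only and then reused for the test set, so that no information leaks from the test data.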

            Machine learning algorithms
            K-nearest neighbors

            The KNN algorithm is a supervised ML approach used for classification and regression tasks, and it is a cornerstone classification method. It is extensively utilized in fields such as pattern recognition, data mining, and intrusion detection.

            The K-NN method is a highly adaptable and widely employed ML technique, largely favored for its simplicity and ease of implementation. It makes no assumptions about the underlying distribution of the data. Additionally, it can process both numerical and categorical data, which makes it a versatile option for diverse datasets in classification and regression problems. K-NN is a non-parametric technique that uses the similarity of data points within a dataset to generate predictions. When compared to other algorithms, K-NN exhibits a lower sensitivity toward outliers.

            The K-NN method operates by evaluating a distance metric, such as Euclidean distance, to identify the K nearest neighbors of a given data point. The class or numerical value of the data point is then determined by the majority vote or the mean of the K neighboring data points. This enables the algorithm to adapt to various patterns and make predictions from the local structure of the data.

            (2) $E_i = \sqrt{(x_1 - x_2)^2 + (x_3 - x_4)^2}$

            where $(x_1, x_3)$ and $(x_2, x_4)$ are the coordinates of two data points, so that Eqn (2) gives the Euclidean distance between them in a two-dimensional space.
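
The distance computation and majority vote described above can be sketched in a few lines. This is an illustrative implementation, not the one used in the study; the function names, the toy two-dimensional points, and the choice of K = 3 are assumptions.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    distances = sorted(
        (euclidean(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical normalized two-feature samples.
points = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.5, 0.1)]
labels = ["HC", "HC", "PD", "PD", "RBD"]
print(knn_predict(points, labels, (0.15, 0.15), k=3))  # HC
```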

            Support vector machine

            The SVM is a robust ML technique that is commonly employed for linear and nonlinear classification and regression tasks. The SVM approach has widespread applications in several domains, including but not limited to text classification, image classification, spam detection, handwriting recognition, and anomaly detection. It is versatile and effective in these applications owing to its capacity to handle high-dimensional data and nonlinear relationships efficiently. SVM approaches are particularly successful in determining the ideal hyperplane that separates the classes of the target feature. Although SVMs can also be applied to regression problems, classification is their most common application.

            The main aim of the SVM approach is to find the optimal hyperplane in an N-dimensional space that separates the data points belonging to different classes in the dataset. The hyperplane is chosen to maximize the separation between the nearest points of distinct classes. The dimension of the hyperplane is determined by the number of features. With two input features, the hyperplane is a straight line; with three input features, it becomes a two-dimensional plane. It becomes challenging to visualize when the number of features exceeds three. A viable choice for the optimal hyperplane is the one that gives the greatest separation, or margin, between classes. In this study, the data are divided into three classes using several hyperplanes. Each hyperplane maximizes the distance from itself to the nearest data point on each side. If such a hyperplane exists, it is commonly referred to as the maximum-margin hyperplane, and the margin is called a hard margin.
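
To make the notion of margin concrete, the sketch below computes the distance from each point to a given hyperplane w·x + b = 0 and reports the smallest such distance. It only evaluates a hyperplane that is supplied by hand; it does not perform the SVM optimization that searches for the maximum-margin hyperplane. The weights, bias, and points are hypothetical.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points):
    """Smallest absolute distance from any point to the hyperplane."""
    return min(abs(signed_distance(w, b, x)) for x in points)

w, b = (3.0, 4.0), -5.0                         # hypothetical hyperplane
points = [(3.0, 4.0), (0.0, 0.0), (1.0, 3.0)]   # hypothetical samples
print([signed_distance(w, b, x) for x in points])  # [4.0, -1.0, 2.0]
print(margin(w, b, points))                        # 1.0
```

The sign of the distance indicates the side of the hyperplane on which a point falls, which is how a binary decision is read off; a three-class problem such as this one is typically handled by combining several binary hyperplanes.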

            Random forest approach

            The RF technique is widely used in ML for classification and regression. It is an ensemble learning method that combines many decision trees to produce a final prediction.

            The fundamental concept underlying the RF method involves generating a substantial number of decision trees, which are subsequently aggregated to yield a more precise and robust outcome. Every decision tree inside the forest undergoes training using a randomly selected portion of the data and a randomly selected subgroup of the features. This training process serves to mitigate overfitting and enhances the model’s capacity for generalization. The RF method synergistically combines the predictions of all decision trees inside the forest to provide a final prediction. In classification tasks, the prevailing methodology involves employing majority voting, whereas regression tasks often rely on averaging techniques. The RF algorithm has several benefits compared to individual decision trees, including enhanced precision, less overfitting, and superior management of noisy and missing data. The utilization of this technology is prevalent across several domains such as banking, healthcare, and natural language processing.
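
The three ingredients named above (a random sample of the data per tree, a random subset of the features, and majority voting) can be sketched as follows. This is a simplified illustration that omits the training of the decision trees themselves; the function names, the seed, and the example votes are assumptions.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: the random data subset for one tree."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, n_selected, rng):
    """Pick the feature indices that one tree is allowed to use."""
    return sorted(rng.sample(range(n_features), n_selected))

def majority_vote(tree_predictions):
    """Combine per-tree class predictions into the forest's final class."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(0)
rows = list(range(10))                      # hypothetical row indices
sample = bootstrap_sample(rows, rng)
features = random_feature_subset(n_features=8, n_selected=3, rng=rng)
print(len(sample), len(features))           # 10 3
print(majority_vote(["PD", "RBD", "PD", "HC", "PD"]))  # PD
```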

            Artificial neural network

            The ANN is a model inspired by the structure and functioning of the networks of interconnected neurons in the human brain. It comprises linked nodes, known as artificial neurons, which are arranged in layers. Information is transmitted through the network: individual neurons process incoming signals and generate an output signal that in turn affects other neurons within the network. The multilayer perceptron (MLP) model is a type of ANN consisting of multiple layers of neurons, including an input layer, one or more hidden layers, and an output layer. Each neuron in one layer is connected to every neuron in the next layer, allowing the network to learn complex patterns and relationships within the data. In PD research, MLP models are used to analyze and classify various types of data, such as clinical symptoms, imaging features, or biomarkers. By training on large datasets, MLP models can help identify patterns associated with PD, predict disease progression, and improve diagnostic accuracy. They are particularly useful for developing predictive models and decision support tools in clinical settings. Figure 4 displays the proposed structure of the MLP model for detecting PD, which contains an input layer, hidden layers with 1 and 20 units, and an output layer with 3 classes.
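
A forward pass through a small fully connected network illustrates how signals move from the input layer to the three output classes. The sketch is not the network in Figure 4: the layer sizes, the weights, and the use of ReLU and softmax activations are all assumptions made for illustration, since the text does not specify them.

```python
import math

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    hidden = relu(dense(x, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))

# Hypothetical tiny network: 2 inputs, 2 hidden units, 3 output classes.
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
out_b = [0.0, 0.0, 0.0]
classes = ["HC", "PD", "RBD"]

probs = mlp_forward([1.0, 0.0], hidden_w, hidden_b, out_w, out_b)
print(classes[probs.index(max(probs))])  # HC
```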

            Figure 4:

            Structure of MLP approach. Abbreviations: HC, healthy controls; MLP, multilayer perceptron; PD, Parkinson’s disease; RBD, sleep behavior disorder.

            EXPERIMENTAL RESULTS

            This section provides an overview of the results achieved during the development of the system.

            Environment of testing the system

            The experiment was carried out using a computer operating on the Windows platform, utilizing the hardware components indicated in Table 1. Python was utilized to code the experiments.

            Table 1:

            Environment of testing the system.

            | Software | Hardware |
            | --- | --- |
            | Python | Memory 8 GB |
            | Sklearn library | AMD Ryzen 7 7730U with Radeon Graphics 2.00 GHz |
            | Matplotlib | |
            | Seaborn | |
            | NumPy | |
            | pandas | |

            Abbreviation: Sklearn, Scikit-learn.

            Assessment criteria

            To evaluate the classification approaches on the PD dataset, four metrics were used: accuracy, precision, recall (also called sensitivity), and F1-score. These metrics are widely regarded as standard means of assessing classification models. They are calculated using Equations (3-6), where TP (true positive) and TN (true negative) represent correctly classified cases, whereas FP (false positive) and FN (false negative) denote incorrectly classified cases.

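            (3) $\text{Accuracy} = \dfrac{TP + TN}{TP + FP + FN + TN} \times 100\%$

            (4) $\text{Sensitivity} = \dfrac{TP}{TP + FN} \times 100\%$

            (5) $\text{Precision} = \dfrac{TP}{TP + FP} \times 100\%$

            (6) $\text{F-score} = \dfrac{2 \times \text{precision} \times \text{sensitivity}}{\text{precision} + \text{sensitivity}} \times 100\%$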
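
For a three-class problem, Equations (3-6) are usually applied per class in a one-vs-rest fashion, reading TP, FP, and FN from a confusion matrix. The sketch below shows this calculation; the confusion matrix is hypothetical and is not taken from the study's results.

```python
def per_class_metrics(confusion, labels):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    report = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                  # class i predicted as another class
        fp = sum(row[i] for row in confusion) - tp   # other classes predicted as class i
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = (precision, recall, f1)
    return report, correct / total

labels = ["HC", "PD", "RBD"]
confusion = [      # hypothetical counts, rows = true class
    [10, 0, 0],
    [0, 6, 1],
    [0, 0, 8],
]
report, accuracy = per_class_metrics(confusion, labels)
for label, (p, r, f) in report.items():
    print(f"{label}: precision={p:.0%} recall={r:.0%} f1={f:.0%}")
print(f"accuracy={accuracy:.0%}")  # accuracy=96%
```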

            Receiver operating characteristic

            The receiver operating characteristic (ROC) curve is a visual depiction of how well a binary classifier system performs as the threshold for classifying data is adjusted. The plot is generated by graphing the true-positive rate (TPR) against the false-positive rate (FPR) at different threshold values. The ROC curve is a commonly employed technique in the fields of ML and data analysis for assessing the effectiveness of binary classification models.
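
The construction of the curve can be sketched by sweeping a threshold over the classifier scores and recording one (FPR, TPR) point per threshold. The scores and labels below are hypothetical, and the area under the curve is computed with the trapezoidal rule as one common summary; the text does not state how the ROC scores reported later were obtained.

```python
def roc_points(scores, is_positive):
    """Return (FPR, TPR) points from sweeping the decision threshold downward."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, p in zip(scores, is_positive) if s >= threshold and p)
        fp = sum(1 for s, p in zip(scores, is_positive) if s >= threshold and not p)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical one-vs-rest scores for the PD class.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
is_pd = [True, True, False, True, False, False]
curve = roc_points(scores, is_pd)
print(round(area_under_curve(curve), 3))  # 0.889
```

For a multi-class problem, this procedure is repeated once per class, treating that class as positive and the remaining classes as negative.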

            Results of machine learning algorithms
            Support vector machine

            Table 2 presents the outcomes of the SVM method in identifying PD. For the HC class, the SVM obtained precision, recall, and F1-score values of 92%, 100%, and 96%, respectively. The overall accuracy of the SVM technique was 85%.

            Table 2:

            The testing results of SVM approach.

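            | | Precision | Recall | F1-score |
            | --- | --- | --- | --- |
            | HC | 92 | 100 | 96 |
            | PD | 100 | 57 | 73 |
            | RBD | 70 | 88 | 78 |
            | Weighted Avg. | 87 | 85 | 84 |
            | Accuracy overall % | 85 | | |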

            Abbreviations: HC, healthy controls; PD, Parkinson’s disease; RBD, sleep behavior disorder; SVM, support vector machine; Weighted Avg., weighted average.

            An ROC curve is a visual representation of classification performance over different threshold values; for the multi-class task considered here, one curve is drawn per class. Figure 5 displays the ROC of the SVM. The SVM approach achieved the following ROC metrics: HC = 100%, PD = 100%, and RBD = 99%.

            ROC of support vector machine approach
            Figure 5:

            ROC of SVM approach. Abbreviations: ROC, receiver operating characteristic; SVM, support vector machine.

            The outcomes of the KNN technique are presented in Table 3. The experimental findings indicate that the KNN technique performed poorly overall. The HC class showed the best performance, with a precision of 61%, recall of 100%, and F1-score of 76%. The overall accuracy of the KNN technique was 69%. We have examined the suitability of the KNN technique as an algorithm for identifying PD.

            Table 3:

            Results of KNN approach.

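            | | Precision | Recall | F1-score |
            | --- | --- | --- | --- |
            | HC | 61 | 100 | 76 |
            | PD | 100 | 43 | 60 |
            | RBD | 80 | 50 | 62 |
            | Weighted Avg. | 77 | 69 | 67 |
            | Accuracy overall % | 69 | | |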

            Abbreviations: HC, healthy controls; KNN, K-nearest neighbors; PD, Parkinson’s disease; RBD, sleep behavior disorder; Weighted Avg., weighted average.

            The ROC plot of the KNN model for all classes is depicted in Figure 6. In the ROC analysis, the KNN algorithm scored 90% for the HC and RBD classes, which is much higher than the 72% seen for the PD class. Based on this investigation, we suggest utilizing the KNN technique for identifying PD with only three labels.

            ROC of K-nearest neighbors approach
            Figure 6:

            ROC of KNN approach. Abbreviations: KNN, K-nearest neighbors; ROC, receiver operating characteristic; SVM, support vector machine.

            The findings of the RF technique for identifying PD are presented in Table 4. The RF algorithm achieved high scores for all classes. Precision was 100% for both the HC and PD classes. For the HC class, recall and F1-score were also 100%; for the RBD class, recall was 100% and the F1-score was 94%. In terms of overall accuracy, the RF classifier attained a score of 96%.

            Table 4:

            Results of RF approach.

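            | | Precision | Recall | F1-score | Accuracy |
            | --- | --- | --- | --- | --- |
            | HC | 100 | 100 | 100 | |
            | PD | 100 | 86 | 92 | |
            | RBD | 89 | 100 | 94 | |
            | Weighted Avg. | 97 | 96 | 96 | |
            | Accuracy overall % | | | | 96 |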

            Abbreviations: HC, healthy controls; PD, Parkinson’s disease; RBD, sleep behavior disorder; RF, Random forest; Weighted Avg., weighted average.

            Figure 7 shows the ROC of the RF model. The RF model attained an ROC score of 100% for all classes. Consequently, we propose using RF to detect and classify PD.

            ROC of random forest approach
            Figure 7:

            ROC of RF approach. Abbreviations: ROC, receiver operating characteristic; SVM, support vector machine.

            Table 5 displays the outcomes of the ANN model in diagnosing PD. The ANN attained a recall of 100% in detecting the HC class and a precision of 100% for the PD class. The testing results of the ANN are assessed using precision, recall, and F1-score metrics. The weighted averages of these metrics are 86%, 77%, and 79%, respectively, and the overall accuracy is 81%.

            Table 5: Results of ANN approach (all values in %).

            Class            Precision   Recall   F1-score   Accuracy
            HC               73          100      85         -
            PD               100         57       73         -
            RBD              86          75       80         -
            Weighted Avg.    86          77       79         -
            Overall          -           -        -          81

            Abbreviations: ANN, artificial neural networks; HC, healthy controls; PD, Parkinson’s disease; RBD, sleep behavior disorder; Weighted Avg., weighted average.
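
            The per-class and weighted-average rows in Tables 4 and 5 are typically derived from a confusion matrix on the test set. The following is a minimal sketch under that assumption; the confusion matrix is hypothetical, since the test-set class counts are not reported here, and the function name is illustrative.

```python
from typing import Dict, List


def classification_report(classes: List[str], cm: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """Per-class and support-weighted metrics from a confusion matrix.

    cm[i][j] counts the samples of true class i that were predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    report: Dict[str, Dict[str, float]] = {}
    weighted = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for k, name in enumerate(classes):
        tp = cm[k][k]
        predicted = sum(row[k] for row in cm)  # column sum: predicted as class k
        support = sum(cm[k])                   # row sum: truly class k
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
        # Weight each class by its share of the test samples.
        for key, value in report[name].items():
            weighted[key] += value * support / total
    report["weighted avg"] = weighted
    correct = sum(cm[k][k] for k in range(len(classes)))
    report["accuracy"] = {"accuracy": correct / total}
    return report


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
classes = ["HC", "PD", "RBD"]
cm = [
    [10, 0, 0],
    [1, 8, 1],
    [1, 0, 9],
]

report = classification_report(classes, cm)
for name, metrics in report.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

            With support weights, the weighted-average recall coincides with the overall accuracy, as in Table 4 (96% for both). The corresponding values in Table 5 (77% and 81%) differ, so they may have been computed with a different weighting.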

            Figure 8 illustrates the performance of the ANN using the ROC measure for detecting PD. The ANN performs very well in recognizing the HC class, with a ROC score of 98%.

            Figure 8: ROC of ANN approach. Abbreviations: ANN, artificial neural network; ROC, receiver operating characteristic; SVM, support vector machine.

            Biomarkers of PD are investigated extensively in neuropsychiatry. PD is diagnosed based on clinical observation, but improved biomarkers are needed to identify the illness in its early stages. ML technology has therefore been proposed as a viable way of detecting PD at an early stage. Individuals with PD exhibit speech impairments, so we used a standardized dataset collected from patients with PD, patients with RBD, and HC subjects. The dataset comprises features linked to respiratory deficits, dysphonia, imprecise articulation, and dysrhythmia. These features are extracted from microphone recordings of natural connected speech. The purpose is to predict early and distinctive patterns of PD. We used an ML approach for classifying and predicting PD from speech, revealing early biomarkers. Figure 9 displays the testing results and performance of the suggested system.

            Figure 9: Performance of ML approach (accuracy). Abbreviations: ANN, artificial neural network; KNN, K-nearest neighbors; ML, machine learning; SVM, support vector machine.

            The main objective of statistical analysis is to reveal patterns, correlations, and trends in the data that might facilitate comprehension of the underlying phenomena or enable informed decision-making. Data analysis encompasses the utilization of various statistical techniques and methodologies on data sets. These approaches may encompass descriptive statistics, inferential statistics, hypothesis testing, regression analysis, and data visualization.

            Regression analysis is a statistical method used to model and examine the association between a dependent variable and one or more independent variables. It helps to explain how changes in the independent variables affect the dependent variable, and it can be used to quantify the agreement between the target values and the predicted values. The RF method demonstrated superior prediction accuracy in terms of mean squared error (MSE) and root mean squared error (RMSE), as depicted in Figure 10. The RF approach achieved a high regression score, with an R-squared (R²) value above 90%, whereas the KNN approach scored below 50%. Figure 11 displays the R-squared of the ML approaches.

            Figure 10: Statistical analysis between prediction values and target values. Abbreviations: ANN, artificial neural network; KNN, K-nearest neighbors; SVM, support vector machine.

            Figure 11: R-squared between the prediction values and target values. Abbreviations: ANN, artificial neural network; KNN, K-nearest neighbors; SVM, support vector machine.
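
            The error measures compared in Figures 10 and 11 can be sketched as follows. This is a minimal illustration of MSE, RMSE, and R-squared between target and predicted values. It assumes the class labels are encoded as integers, which the text does not specify; the encoding and the sample values are hypothetical.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MSE, RMSE and R-squared between targets and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    # R-squared is undefined when all targets are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2}


# Hypothetical integer encoding (0 = HC, 1 = PD, 2 = RBD); not data from the study.
targets = [0, 1, 2, 1, 0, 2]
predictions = [0, 1, 2, 2, 0, 2]

metrics = regression_metrics(targets, predictions)
print({key: round(value, 3) for key, value in metrics.items()})
```

            Lower MSE and RMSE and a higher R-squared indicate closer agreement between predictions and targets, which is the sense in which the approaches are ranked above.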

            CONCLUSIONS

            Early detection of PD is crucial for the effective management of the patient, because it allows appropriate therapy and medication to begin sooner. Computer-aided learning has advanced considerably, particularly in the field of ML, with a substantial impact on the medical industry. ML algorithms have been important in categorizing various diseases, tracking the health of patients, and making early predictions of PD. Our study used the RF, KNN, SVM, and ANN methods to detect PD. The algorithms were evaluated using a standardized dataset obtained from Kaggle, which includes data on PD, RBD, and HC subjects. The dataset was partitioned into a training set containing 70% of the data and a testing set containing 30% of the data. The empirical results show that the RF algorithm achieves an accuracy of 96%. This system can improve the quality of social healthcare by enabling timely interventions, enhancing patient outcomes, and optimizing the allocation of resources. The suggested model can identify PD in its initial phases and is expected to aid practitioners in the early detection of PD in patients. To extend this research, we will investigate additional deep learning models discussed in the literature.

            DATASET LINK

            Author and article information

            Journal
            jdr
            Journal of Disability Research
            King Salman Centre for Disability Research (Riyadh, Saudi Arabia)
            1658-9912
            02 November 2024
            Volume: 3
            Issue: 8
            Article ID: e20240104
            Affiliations
            [1] King Salman Center for Disability Research, Riyadh 11614, Saudi Arabia
            [2] Department of Quantitative Methods, School of Business, King Faisal University, Al-Ahsa 31982, Saudi Arabia (https://ror.org/00dn43547)
            [3] Department of Health Informatics, College of Health Sciences, Saudi Electronic University, Riyadh 11673, Saudi Arabia
            [4] College of Applied in Abqaiq, King Faisal University, Al-Ahsa 31982, Saudi Arabia
            [5] School of Computer Science, University of Petroleum & Energy Studies, Dehradun, India
            [6] Department of Computer Science and Information Technology, Dr Babasaheb Ambedkar Marathwada University, Aurangabad, India
            Author notes
            Correspondence to: Theyazn H.H. Aldhyani*, e-mail: taldhyani@kfu.edu.sa, Tel.: 00966504937970
            Author information
            https://orcid.org/0000-0003-1822-1357
            Article
            10.57197/JDR-2024-0104
            2024 The Author(s).

            This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY) 4.0, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

            History
            : 04 April 2024
            : 24 July 2024
            : 25 August 2024
            Page count
            Figures: 11, Tables: 5, References: 49, Pages: 10
            Funding
            Funded by: King Salman Center for Disability Research
            Award ID: KSGR-2023-236
            The authors extend their appreciation to the King Salman Center for Disability Research for funding this work through Research Group no. KSGR-2023-236.

            Keywords: Parkinson’s disease, sleep behavior disorder, mental disability, artificial intelligence
