INTRODUCTION
Dyslexia, a complex neurodevelopmental disorder, presents a global challenge that affects a significant portion of the population. Specific regions of the brain, including the posterior corpus callosum and the temporoparietal regions of both hemispheres, are affected by the condition. The left temporoparietal region is particularly relevant as it is part of the reading network, while the right hemisphere is involved in attentional networks. These neural differences play a crucial role in the manifestation of dyslexia and contribute to the difficulties individuals with the condition face in processing written and spoken language. Symptoms typically become most noticeable when children begin to learn to read, around the ages of 5-7 years, a critical period for developing reading and language skills. According to the Yale Center for Dyslexia and Creativity, an estimated 20% of the world’s population is affected by dyslexia, making it one of the most prominent neurocognitive disorders worldwide (Shaywitz et al., 2020). Dyslexia is characterised by difficulties in reading, spelling, and language processing, often stemming from differences in the way the brain processes information (Hachinski et al., 2006; Ahmad et al., 2022). While dyslexia cannot be entirely cured, its symptoms can be minimised with the right interventions and support. Early detection and intervention are therefore crucial, as they open the door to timely and effective strategies that provide children who may be struggling with dyslexia the support they need.
Conventional methods for detecting dyslexia have largely relied on time-consuming, manual assessments and examinations. Traditionally, the diagnosis and management of dyslexia have relied on a range of practices, such as standardised reading and cognitive tests, often administered by trained professionals (Al-Barhamtoshy and Motaweh, 2017; Hussein et al., 2024). These assessments are designed to identify specific reading difficulties and provide insights into the severity of the condition (Hatcher et al., 2002; Helland, 2007; Aldehim et al., 2024). However, the consequences of undiagnosed or poorly managed dyslexia are significant, leading to lifelong academic struggles, diminished self-esteem, and emotional distress. These approaches also suffer from several limitations. First, they are labour-intensive, requiring extensive one-on-one testing, which can be expensive and impractical on a large scale. Second, they are susceptible to subjective interpretation, making results dependent on the assessor’s experience and judgement. Furthermore, global awareness of dyslexia, while improving, remains inadequate; this limited awareness hampers timely identification, preventing countless children from receiving the necessary support and accommodations in educational and social settings and leading to unnecessary academic and emotional challenges. In this context, artificial intelligence (AI) techniques offer a promising avenue for addressing the challenges associated with dyslexia detection. By leveraging AI-driven approaches such as machine learning and neural networks, we can create standardised, scalable, and objective methods for identifying dyslexia (Colenbrander et al., 2018; Ahire et al., 2023). These technologies can process vast amounts of data swiftly and accurately, thereby enabling early detection and intervention.
They also have the potential to reduce the subjectivity inherent in conventional methods. Moreover, by utilising AI tools, we can promote global awareness and education about dyslexia, as these systems can be made widely accessible, helping create a more inclusive society where all children, regardless of their neurodevelopmental differences, can thrive (Abd Rauf et al., 2018; Schabmann et al., 2020; Subramaniyan et al., 2020).
Our research centres on a unique dataset: an online gamified test featuring tactical alphabetical games as questions, which records players’ correct answers, wrong answers, and scores, offering distinctive cognitive insights and performance metrics. In the subsequent sections, we present our research, which advances beyond contemporary approaches. At the core of our methodology lies an artificial neural network (ANN) model, complemented by the integration of modern preprocessing techniques. ANNs are well suited to dyslexia detection because of their ability to model non-linear, high-dimensional relationships, which is essential for handling the complex patterns intrinsic to neurodevelopmental disorders such as dyslexia, patterns on which traditional algorithms often fail. ANNs capture and learn from these intricate patterns and hence provide a strong and effective means of detecting dyslexia. Moreover, they offer flexibility: techniques such as dropout and regularisation can be applied to curb overfitting, helping the model generalise well to unseen, out-of-sample data. This combination of deep learning capability with resilience against overfitting gives ANNs a clear advantage over other machine learning approaches for dyslexia detection. To begin, we employ label encoding to ensure that our data are in the optimal format for analysis, setting the stage for meaningful insights.
Furthermore, our research involves comprehensive feature extraction, a process that allows us to filter critical information from a wide range of features within the dataset. This data refinement approach also encompasses the detection and elimination of outliers, an imperative step to enhance data quality by removing anomalous data points that might otherwise distort our analysis. Uniformity and consistency within the dataset are ensured through normalisation, a crucial measure to maintain the integrity of the data. Moreover, we address the challenge of class imbalance, a common issue in real-world datasets. This collective approach culminates in the development of an ANN model that performs strongly across all aspects of our analysis, marking a pivotal milestone in our research. The significance of this approach lies in its potential to redefine disease detection and intervention, made possible by the wealth of information and insights encapsulated within this unique online gamified test dataset.
The main contributions of this research are:
We propose an approach to redefine dyslexia detection and intervention using an ANN model that achieves state-of-the-art accuracy.
This research incorporates a coherent pipeline of preprocessing techniques for noisy and imbalanced data, whose steps complement one another and help the model achieve better results.
Our proposed model outperforms the results of state-of-the-art methods.
The rest of the paper is organised as follows: the Literature Review section provides a detailed literature review. The Methodology section details the proposed preprocessing techniques and ANN model. The experimental tests and results are presented in the Results section. Finally, the paper concludes with the Conclusion section.
LITERATURE REVIEW
As discussed earlier, traditional dyslexia diagnosis methods have limitations, encouraging a transition toward more effective AI-based tools. The following literature highlights the growing role of AI in the identification and diagnosis of dyslexia, signalling a paradigm shift in the discipline. The research presented by Rello and Ballesteros (2015) explores the use of eye-tracking measures in predicting individuals with dyslexia, a specific learning disorder affecting approximately 10% of the global population. Leveraging machine learning techniques, specifically a support vector machine binary classifier, the study achieved an accuracy of 80.18% in distinguishing between readers with and without dyslexia based on eye-tracking data. The work presented by Usman et al. (2021) offers a critical examination of recent machine learning approaches aimed at detecting dyslexia and its associated biomarkers. The study highlights the importance of addressing specific challenges to ensure clinical relevance and high accuracy in the utilisation of deep learning methods for dyslexia diagnosis. The review incorporates a systematic analysis of 22 selected articles, employing the Preferred Reporting Items for Systematic review and Meta-Analyses (PRISMA) protocol to enhance transparency and clarity. The study presented by Frid and Manevitz (2018) introduces an automated approach using event-related potential (ERP) signals and machine learning for classifying dyslexic and skilled readers. It achieves state-of-the-art results without human intervention, offering reliable differentiation between the two groups. Novel complex features reveal distinctions primarily in the left hemisphere. Surprisingly, the research uncovers valuable information within typically disregarded high pass signals. Moreover, this method’s versatility extends to ERP-based studies, showcasing its potential to advance dyslexia research and diagnosis.
Early identification of dyslexia is very important; hence, the work presented by Sanfilippo et al. (2020) reviews recent genetic and neuroimaging research and highlights the heritability of dyslexia and early brain differences. Early literacy skill deficits can serve as warning signs, detectable in preschoolers. This supports the idea that paediatricians can identify dyslexia before school, during a crucial window for effective interventions (O’Hare, 2010). This review explores the clinical implications and stresses the importance of early identification and screening to prevent dyslexia’s adverse effects. Practical strategies for paediatricians are also discussed to better support patients and families dealing with dyslexia. In the work done by Le Jan et al. (2011), the authors developed a multi-step approach divided into two sections, with the first part consisting of picking the most representative task using principal component analysis (PCA) and implementing logistic regression (LR) models on preselected variables, as well as performing spelling and reading tasks independently. The resulting model performed well in 94% of the children, with good sensitivity (91%) and specificity (95%). The study by Vanitha and Kasthuri (2023) aims to select features for the detection of dyslexia using machine learning models. The authors used a benchmark online gamified dataset and achieved 89.8% accuracy for the selection of features by correlation attribute evaluation with the LR classifier model. The study by Pennington et al. (2012) tested five cognitive models of dyslexia in two large population-based samples, one cross-sectional (Colorado Learning Disability Research Centre) and one longitudinal (International Longitudinal Twin Study).
They employed two methods to determine individual cases; across both methods, multiple-deficit models fit 30-36% of cases and single-deficit models 24-28%, with the hybrid model providing the best fit to the data and the remaining roughly 40% of the sample showing no deficit. They also examine the clinical significance of these findings for the diagnosis of school-age children and preschoolers. In Mather and Schneider (2023), the authors investigate the benefits and drawbacks of several approaches for identifying specific learning disorders in schools. They also explored current issues surrounding the use of standardised cognitive testing in dyslexia examinations. In Morciano et al. (2024), the authors provided an outline of clinical observations and research data and provided support for cognitive exams.
Based on the state-of-the-art methods, a summary of the various works, their methods, and research gaps is given in Table 1.
Summary of the various works, their methods, and research gaps.
Reference | Method used | Research gap |
---|---|---|
Ahmad et al. (2022) | SVM optimisation and game testing data for dyslexia detection | Emerging machine learning techniques have not yet been explored beyond SVM kernel functions for improved detection rates |
Ahire et al. (2023) | Clustering algorithms in dyslexia detection | The integration of clustering algorithms with real-time adaptive learning systems has not yet been achieved, which is necessary for dynamic feedback and personalised learning support |
Usman et al. (2021) | Challenges in using deep learning for dyslexia prediction | Robust, efficient models that operate efficiently with smaller, less curated datasets have not yet been developed, requiring fewer resources |
Mather and Schneider (2023) | Predictive models for dyslexia using machine learning | The actionable integration of predictive models with educational software and real-world applications to provide insights has not been fully realised yet |
Abbreviation: SVM, support vector machine.
Based on the research gaps highlighted in Table 1, our research work improves dyslexia detection by applying an ANN model with advanced preprocessing, addressing class imbalance and noisy data. Our technique advances the application of machine learning in dyslexia detection by incorporating a unique online gamified test, refining data analysis. Moreover, our technique offers an AI-driven, accessible, and scalable solution, reducing reliance on traditional cognitive assessments and making early detection more practical.
METHODOLOGY
The methodology employed in this research makes use of data preprocessing techniques to develop an ANN model responsible for disease identification. The dataset (Kaggle, 2020) we used had multiple issues, such as noisy data, class imbalance, and outliers. To detect outliers, we used boxplots and eliminated them using the interquartile range (IQR) technique. To address class imbalance, we applied the Synthetic Minority Oversampling Technique (SMOTE). To handle noise in the data, regularisation and sensitivity parameters were considered in our ANN model. These factors significantly impact the performance and accuracy of the models. Noisy elements introduce erroneous data, resulting in faulty predictions and distorted findings. Outliers exert disproportionate influence if not treated properly, thereby altering the model’s decision boundaries and negatively influencing the learning process. Furthermore, class imbalance leads to the misclassification of the minority class, which can be especially problematic in disease detection scenarios. To address these issues, we diligently constructed a rigorous preprocessing strategy suited specifically to this dataset. We systematically cleanse and improve the raw comma-separated values (CSV) data using this method, eventually eliminating these negative aspects and paving the road for enhanced model performance and accuracy. Our methodology is based on data refinement, which allows our models to work with more trustworthy and insightful input, critical for accurate dyslexia identification. Figure 1 illustrates this process in detail.
Feature extraction
The initial step in our preprocessing approach is feature extraction, which is critical given the dataset’s depth of information. The dataset has a total of 196 features, with the first four being demographic information. As a feature extraction step, we used PCA to supply high-quality features to the model. PCA is a dimension-reduction technique in which the features are transformed into a set of orthogonal components, called principal components, that capture the maximum variance in the data. In this work, we do not actually reduce the number of features used in training our models; rather, PCA is applied to resolve several key technical challenges in preprocessing the data, improving model performance in several ways:
The dataset represents each question with correlated features: Clicks1, Hits1, Misses1, Score1, Accuracy1, and Missrate1. PCA derives uncorrelated features from these correlated ones. For example, instead of these six features, PCA creates a smaller set of principal components that capture their combined variance in a non-redundant way, with each component adding unique information to the model.
PCA finds the directions of maximal variance in the data. For instance, in datasets with features such as Accuracy1, Accuracy2, and Accuracy3, a single direction may capture a large share of the variance. PCA identifies such directions, enabling the model to focus on the most informative aspects of the dataset. For example, one principal component may highlight patterns in accuracy rates across different questions, which are quite important for predicting dyslexia.
The dataset itself can be treated as noisy, with several features carrying little predictive information, such as slight variations in Miss-rate that do not add value when predicting dyslexia. By focusing on the principal components that explain the most variance, PCA effectively denoises the data.
The data are transformed into principal components, thereby reducing the effective variable space. For example, instead of analysing all the individual Clicks, Hits, Misses, Score, Accuracy, and Miss-rate variables for each question, PCA yields a few principal components representing overall performance trends. These components are more meaningful and easier to interpret, giving clear insight into how distinct performance metrics drive the prediction of dyslexia.
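The decorrelation step described above can be sketched as follows. This is an illustrative example, not the paper's exact pipeline: the six correlated per-question columns are simulated toy data, and the 95% variance threshold is an assumed choice.

```python
# Illustrative PCA sketch: decorrelating six per-question performance
# features (Clicks, Hits, Misses, Score, Accuracy, Missrate). The data
# below are synthetic stand-ins for the gamified-test columns.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

clicks = rng.integers(10, 60, size=200).astype(float)
hits = clicks * rng.uniform(0.6, 0.95, size=200)
misses = clicks - hits
X = np.column_stack([clicks, hits, misses,
                     hits * 10,          # Score, proportional to Hits
                     hits / clicks,      # Accuracy
                     misses / clicks])   # Missrate

# Standardise first so no single feature dominates the variance.
X_std = StandardScaler().fit_transform(X)

# Keep enough components to explain 95% of the variance.
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_std)

print(X_pca.shape[1], "components retained")
print(pca.explained_variance_ratio_.round(3))
```

Because Score, Accuracy, and Missrate are deterministic functions of Clicks and Hits here, only a couple of components are needed, which mirrors the redundancy the paper describes.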
Outlier detection and removal
The next crucial stage in our preprocessing pipeline is to identify and remove outliers from the dataset, which is critical for robust and trustworthy data analysis. We began with boxplots, which graphically reveal data points falling outside the normal range, defined as 1.5 times the interquartile range above the third quartile or below the first quartile, effectively exposing extreme values. Next, we calculated the z-score of each data point to measure how many standard deviations it lies from the mean; points with z-scores >3 or <−3 were flagged as outliers, as they deviate strongly from the bulk of the data. Finally, the IQR method was applied: values below the first quartile minus 1.5 times the IQR, or above the third quartile plus 1.5 times the IQR, were regarded as outliers. By combining boxplots, z-scores, and the IQR technique, we methodically eliminated outliers from our dataset. The data points that significantly deviate from the expected performance patterns observed in most participants are noted in Table 2, along with their descriptions.
Outliers in the dataset and their descriptions.
Outliers | Description |
---|---|
1. Extreme scores | Outliers might be participants who achieve exceptionally high or low scores in the game, which could indicate an unusual level of proficiency or struggle |
2. Unusually high click rate | Participants with an extraordinarily high click rate, suggesting rapid or erratic clicking during the game, might be considered outliers |
3. Aberrant Miss-rate | An extremely low or high Miss-rate, reflecting either a participant’s exceptional accuracy or frequent mistakes, could be indicative of outliers |
4. Inconsistent response patterns | Participants who exhibit inconsistent patterns in their responses to the game’s questions, such as an unusual sequence of hits and misses, could be identified as outliers |
5. Outliers in demographic features | Outliers in demographic features such as age, gender, or other personal characteristics are present and influence the overall data distribution |
The existence of outliers impairs the capacity of machine learning models to generalise. When outliers are ignored, models erroneously focus on them, resulting in inferior performance on the majority of the data; removing them allows for a more balanced learning process and increased model accuracy. Outlier elimination entails identifying data points that exceed preset thresholds, which are typically derived from the statistical techniques described above. In this study, once such outliers were detected, they were systematically removed from the dataset, resulting in a cleaner and more trustworthy data representation. This phase ultimately improves the integrity of our data and model performance and ensures that our analysis is founded on sound statistical principles.
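The combined z-score and IQR filtering can be sketched in a few lines of pandas. This is a hedged illustration under the thresholds stated above (|z| ≤ 3, 1.5 × IQR); the column name `Score1` and the toy values are assumptions, not the actual dataset.

```python
import numpy as np
import pandas as pd

def remove_outliers(df, cols, z_thresh=3.0, iqr_factor=1.5):
    """Drop rows flagged as outliers by either the z-score or the IQR rule."""
    mask = pd.Series(True, index=df.index)
    for col in cols:
        x = df[col]
        # z-score rule: keep points within z_thresh standard deviations
        z = (x - x.mean()) / x.std()
        mask &= z.abs() <= z_thresh
        # IQR rule: keep points inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
        q1, q3 = x.quantile(0.25), x.quantile(0.75)
        iqr = q3 - q1
        mask &= x.between(q1 - iqr_factor * iqr, q3 + iqr_factor * iqr)
    return df[mask]

# Toy example: one extreme score among typical ones
df = pd.DataFrame({"Score1": [10, 12, 11, 13, 12, 11, 10, 500]})
clean = remove_outliers(df, ["Score1"])
print(len(df), "->", len(clean))
```

Note that with a single extreme value and few samples, the z-score rule alone can miss the outlier (the outlier inflates the standard deviation), which is one practical reason for combining it with the IQR rule as the paper does.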
Elimination of class imbalance
The issue of class imbalance is pronounced in the target column, where non-dyslexic individuals substantially outnumber dyslexic individuals, and it has far-reaching ramifications for our entire modelling approach. This mismatch can significantly degrade the model’s performance by biasing it toward the majority class, resulting in inferior recognition of the minority class. In this context, class imbalance increases the chance of false negatives, in which dyslexic instances go unnoticed. The imbalance also affects the entire learning process, as the model favours the majority class due to the frequency of its occurrences. As a result, correcting class imbalance is critical to ensuring that the model retains equal sensitivity to both classes, allowing for precise detection.
To address the issue of class imbalance in our study, we employed SMOTE, a strong tool for reducing dataset imbalance. In the context of our study, SMOTE is critical for correcting the disparity between the numbers of non-dyslexic and dyslexic instances in the target variable. At its core, SMOTE generates synthetic samples for the minority class (in this case, dyslexic individuals) to produce a more balanced distribution. It accomplishes this by taking each minority-class instance and generating synthetic data points strategically interpolated between existing minority-class samples. These synthetic data points are derived by considering the feature space between neighbouring instances: SMOTE calculates the difference between the feature vector of an existing minority-class instance and those of its nearest neighbours, then multiplies this difference by a random value to generate the synthetic data points, as shown in Figure 2.

SMOTE working for elimination of class imbalance. Abbreviation: SMOTE, Synthetic Minority Oversampling Technique.
By oversampling the minority class in this manner, SMOTE effectively augments the dataset with synthetic instances, equalising the representation of both classes. This rebalancing is crucial in our project because it ensures that the model receives sufficient exposure to dyslexic cases, thereby improving its ability to learn and detect dyslexia accurately. Recognising that SMOTE’s effectiveness can vary depending on its configuration, we developed an algorithm specifically for fine-tuning its parameters. This algorithm systematically explores various combinations of SMOTE parameters (e.g. the number of neighbours considered when generating synthetic samples and the oversampling ratio) to identify the configuration that yields the best model performance without compromising data integrity. Through this iterative process, we aim to find a balance that enhances the dataset’s representation of the minority class without introducing significant biases; this method significantly reduced the data inconsistency and imbalance that can be caused by SMOTE oversampling. The procedure is given in Algorithm 1.
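The interpolation step just described (synthetic point = minority point + random fraction × difference to a nearest minority neighbour) can be sketched in pure NumPy. This is a minimal illustration of the SMOTE idea, not the imbalanced-learn implementation or the authors' code; the toy 2-D minority class is an assumption.

```python
import numpy as np

def smote_sample(X_min, n_synthetic, k=5, rng=None):
    """Generate synthetic minority samples by interpolating between a
    sampled minority point and one of its k nearest minority neighbours."""
    if rng is None:
        rng = np.random.default_rng(0)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.integers(len(X_min))            # pick a minority point
        x = X_min[i]
        d = np.linalg.norm(X_min - x, axis=1)   # distances to all minority points
        neighbours = np.argsort(d)[1:k + 1]     # k nearest, skipping x itself
        nb = X_min[rng.choice(neighbours)]
        gap = rng.random()                      # random position on the segment
        synthetic.append(x + gap * (nb - x))
    return np.array(synthetic)

# Toy minority class: 20 points in 2-D
rng = np.random.default_rng(42)
X_min = rng.normal(size=(20, 2))
X_syn = smote_sample(X_min, n_synthetic=30, rng=rng)
print(X_syn.shape)
```

Because each synthetic point is a convex combination of two real minority points, all generated samples lie within the existing minority region, which is what keeps the oversampling from inventing implausible cases.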
Fine-tuning SMOTE parameters for preprocessed data with ANN
Input: |
• D: Input Dataset with Minority and Majority Classes. |
• R: A List of Desired Balance Ratios to Be Tested, R = {r1, r2, …, rn}. |
• S: A List of Random State Values to Be Tested, S = {s1, s2, …, sm}. |
• K: A List of K-nearest Neighbour Values to Be Tested, K = {k1, k2, …, kl}. |
Output: |
Bestparameters: The set of SMOTE parameters (Bestr, Bests, Bestk) that yield the best model performance. |
Steps: |
1. Import Libraries: Import the necessary libraries, including scikit-learn for data processing and machine learning, and imbalanced-learn for applying the SMOTE algorithm. 2. Preprocess Dataset: Preprocess the input dataset D, ensuring it is ready for the application of SMOTE and subsequent modelling. 3. Initialise Best Parameters: Set Bestparameters = (null, null, null) and Bestscore = −∞ to store the optimal SMOTE parameters and the best score obtained. 4. Iterate Over Parameters: |
• For each ri ∈ R: |
• For each sj ∈ S: |
• For each kz ∈ K: |
4.1. Apply SMOTE: Generate a new dataset D′ by applying SMOTE with parameters (ri, sj, kz), i.e. D′ = SMOTE(D, ri, sj, kz). |
4.2. Split Dataset: Divide D′ into training (D′train) and testing (D′test) subsets. |
4.3. Preprocess for ANN: Apply necessary preprocessing steps (e.g. scaling, feature engineering) to D′train and D′test. |
4.4. Train ANN Model: Train the ANN model on D′train. |
4.5. Evaluate Model: Evaluate the model on D′test using a set of evaluation metrics, calculating performance score P(ri, sj, kz). |
4.6. Update Best Parameters: If P(ri, sj, kz) > Bestscore, update Bestparameters = (ri, sj, kz) and Bestscore = P(ri, sj, kz). |
5. Return Best Parameters: After exploring all combinations, return Bestparameters as the optimal set of SMOTE parameters for the given dataset and ANN model configuration. |
The importance of this algorithm lies in its systematic exploration of various SMOTE parameters, such as the ratio of synthetic samples to real samples, random state values, and the number of nearest neighbours (k), to achieve the best configuration for our imbalanced dataset. This technique examines a variety of combinations to discover the set of SMOTE parameters that maximises the performance of our model on unbalanced data, successfully addressing the class imbalance issue. The benefit of employing this technique is that it produces a more robust and balanced dataset, which improves the model’s capacity to diagnose dyslexia accurately. The techniques of data normalisation and encoding are the critical final phases of data refinement in our project. Normalisation is the process of scaling feature values to a standard range, often between 0 and 1 or −1 and 1, to ensure that each feature has equal weight in the model and to prevent certain attributes from dominating due to differing magnitudes. This creates a level playing field for all features, which improves the model’s interpretability and performance. Encoding, on the other hand, converts categorical input into a numerical representation. Both procedures are critical in data preparation, since they contribute to the dataset’s homogeneity, consistency, and readiness for modelling. Fine-tuning the data through these steps guarantees that our model is well equipped to deliver accurate disease identification, free of the biases that can arise from variable feature scales and categorical data. The detailed model summary is shown in Figure 3.
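The normalisation and encoding steps can be sketched with standard scikit-learn transformers. The column names (`Gender`, `Score1`, `Accuracy1`) and values here are illustrative assumptions, not the actual dataset; min-max scaling to [0, 1] is one of the ranges mentioned above.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Toy frame with one categorical and two numeric columns (illustrative names)
df = pd.DataFrame({
    "Gender": ["M", "F", "F", "M"],
    "Score1": [120, 300, 45, 210],
    "Accuracy1": [0.55, 0.91, 0.30, 0.76],
})

# Label encoding: categorical values -> integer codes
df["Gender"] = LabelEncoder().fit_transform(df["Gender"])

# Min-max normalisation: scale each numeric feature to [0, 1]
num_cols = ["Score1", "Accuracy1"]
df[num_cols] = MinMaxScaler().fit_transform(df[num_cols])

print(df)
```

After this step every feature contributes on a comparable scale, which is the "level playing field" the paragraph above refers to.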
In this model, the input layer has 196 nodes and feeds a dense layer with 128 neurons (25,216 trainable parameters). A dropout layer with a rate of 0.5 follows to mitigate overfitting. The next block is another dense layer with 128 neurons (16,512 trainable parameters), again followed by dropout. Subsequent blocks comprise a dense layer with 64 neurons (8,256 parameters) with dropout, then a dense layer with 32 neurons (2,080 parameters), also followed by dropout. The final layer is a dense output layer with a single neuron (33 parameters). We designed the model carefully to strike a delicate balance between complexity and generalisation, ensuring robust performance on the task of detecting dyslexia from gamified test data. We carried out extensive hyperparameter tuning, which encompassed adding regularisation techniques as well as tuning sensitivity parameters, to attain a high level of accuracy and robustness in model prediction. The inclusion of dropout layers and careful optimisation of the hyperparameters directly handles the noisy data and improves the sensitivity and specificity of the model’s predictions. Thanks to this rigorous data preparation and modelling effort, the ANN model demonstrates superior performance: it excels at predicting and detecting dyslexia, with a 97% accuracy rate. Such precision reflects the rigorous work invested at every stage, from data refinement to the strategic deployment of these techniques.
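The per-layer figures quoted above follow directly from the standard dense-layer formula, parameters = inputs × units + units (weights plus biases). A quick arithmetic check:

```python
# Verify the quoted dense-layer parameter counts for the 196 -> 128 -> 128
# -> 64 -> 32 -> 1 architecture (dropout layers add no parameters).
def dense_params(n_in, n_units):
    # each unit has n_in weights plus one bias
    return n_in * n_units + n_units

layers = [(196, 128), (128, 128), (128, 64), (64, 32), (32, 1)]
counts = [dense_params(i, u) for i, u in layers]
print(counts)       # per-layer trainable parameters
print(sum(counts))  # total trainable parameters
```

These counts match the 25,216, 16,512, 8,256, 2,080, and 33 figures in the model summary; in a framework such as Keras, the same stack would be built from alternating `Dense` and `Dropout(0.5)` layers.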
RESULTS
This section provides a thorough examination of the outcomes achieved by the methodical application of our ANN model. The model’s performance is evaluated against critical criteria, including accuracy and the confusion matrix. These findings attest to the quality and precision of our study methodology, as well as its potential to advance dyslexia detection. Furthermore, to contextualise our findings, we conduct a comprehensive comparison with the state of the art in the field, examining the relative accuracy and efficiency of our ANN model.
A contrasting narrative emerges from a comparison of our findings with the state of the art in dyslexia detection (Rello et al., 2020). Our study achieves a 97% accuracy rate, and this exceptional accuracy is due to our thorough data preprocessing approaches, which include feature extraction, outlier reduction, and class imbalance mitigation using SMOTE. On the testing data, our ANN model’s confusion matrix displays 490 true positives, 17 false positives, 6 false negatives, and 501 true negatives (TNs), highlighting not only its accuracy but also its excellent precision and sensitivity in dyslexia identification. In contrast, the state-of-the-art approach, centred on ensemble learning, presents a different performance profile. The primary objective of that study is to achieve high recall for the dyslexia class, minimising false negatives. This focus manifests in their confusion matrix, featuring a considerable 2568 TNs, signalling a strong ability to correctly identify non-dyslexic cases. However, it also includes 316 true positives, 76 false positives, and 684 false negatives, reflecting a lower sensitivity in detecting dyslexic cases.
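As a quick sanity check, the headline metrics follow directly from the confusion-matrix counts reported for our model (TP = 490, FP = 17, FN = 6, TN = 501):

```python
# Derive accuracy, precision, recall, and F1 from the reported counts.
tp, fp, fn, tn = 490, 17, 6, 501

accuracy = (tp + tn) / (tp + fp + fn + tn)   # (490 + 501) / 1014
precision = tp / (tp + fp)                    # 490 / 507
recall = tp / (tp + fn)                       # 490 / 496
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

The computed accuracy of roughly 0.977 is consistent with the 97% figure quoted above, and the recall near 0.99 reflects the low false-negative count.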
The stated focus on achieving high recall for the dyslexia class provides crucial context for interpreting the state-of-the-art confusion matrix. A high recall for the dyslexia class means the model is designed to be highly sensitive in identifying individuals with dyslexia, prioritising the minimisation of false negatives even at the cost of a higher false-positive rate.
Measured against that goal, their matrix shows a high TN count, indicating a strong ability to correctly identify non-dyslexic individuals, but also 684 false negatives. The design objective of catching as many dyslexic cases as possible is therefore only partially realised: in practice, the model misses dyslexic cases more often than its stated priority would suggest.
In contrast, our approach, as indicated by our confusion matrix, aims for a balance between sensitivity and precision, leading to a lower TN count but a higher true-positive count. This trade-off shows that our model correctly identifies both dyslexic and non-dyslexic cases with high accuracy.
This comparative analysis highlights the distinct priorities and strengths of the two approaches. Our ANN model excels at balancing sensitivity and precision, achieving a strong overall accuracy and providing a robust basis for early dyslexia detection. The state-of-the-art model, despite its stated emphasis on recall for the dyslexia class, proves strongest at correctly identifying non-dyslexic individuals and misses a substantial share of dyslexic cases. These insights underscore the advantages of our research for improving early detection, with attention to both sensitivity and precision. Turning from the comparison to the results themselves, our model performs well across multiple measures, particularly precision, recall, and F1 score. The overall evaluation of the ANN model's performance is given in the classification report in Table 3.
Model performance classification report.

| | Precision | Recall | F1 score | Support |
|---|---|---|---|---|
| Non-dyslexia | 0.98 | 0.95 | 0.97 | 381 |
| Dyslexia | 0.95 | 0.98 | 0.97 | 380 |
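The F1 column in the classification report is the harmonic mean of precision and recall. A small sketch (the function name is ours) shows the relationship; note that recomputing F1 from the two-decimal displayed values can differ in the last digit from the tabulated figure, which is derived from unrounded metrics:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Using the rounded report values for the dyslexia class (0.95, 0.98):
print(round(f1_score(0.95, 0.98), 4))
```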
In Figures 4–6, the graphical representations of training and validation loss and accuracy, together with the confusion matrix, summarise these results. The figures depict a model that not only makes accurate predictions but also generalises robustly.
Our research is compared with the state-of-the-art methods by contrasting performance metrics (Table 4). Even after the state-of-the-art methods apply feature selection and feature engineering techniques, their performance metrics remain low compared with those of our research.
Comparison of our research with the state-of-the-art methods.

| | Accuracy (%) | Recall (%) | Precision (%) |
|---|---|---|---|
| Our work (dyslexic) | 97 | 98 | 95 |
| Kaggle (2020) | 70.8 | 75.3 | 69.7 |
| Our work (non-dyslexic) | 97 | 95 | 98 |
| Kaggle (2020) | 70.8 | 75.3 | 69.7 |
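The gap in Table 4 can be stated as absolute and relative improvements. A trivial sketch using the tabulated accuracy percentages:

```python
# Accuracy (%) from the comparison table: our work vs. the Kaggle baseline.
ours, baseline = 97.0, 70.8

absolute_gain = ours - baseline              # percentage points
relative_gain = absolute_gain / baseline * 100

print(round(absolute_gain, 1))  # 26.2
print(round(relative_gain, 1))  # 37.0
```

That is, our model's accuracy exceeds the baseline by 26.2 percentage points, a relative improvement of about 37%.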
LIMITATIONS OF THIS WORK
Two technical limitations were recognised despite these promising results. Most importantly, the difficulty of training complex deep learning models posed challenges, especially when operationalising such solutions in settings with constrained computational resources. In addition, the model's predictive accuracy and performance depend heavily on dataset quality: the more biased the dataset, the more skewed the results.
Furthermore, an unrepresentative dataset limits how broadly the model can be applied. Looking ahead, our future studies will incorporate a wider variety of data types. Specifically, we aim to supplement the gamified test data with modalities such as eye-tracking metrics and neurophysiological signals, which may mitigate dataset-quality problems and improve the model's resilience. This should reduce the model's reliance on a single data modality and thereby facilitate the creation of more adaptable, universally applicable dyslexia detection tools.
CONCLUSION
Our novel approach to dyslexia identification improves diagnostic accuracy and efficiency and holds considerable potential for tackling the complex issues associated with dyslexia. By meticulously addressing the difficulties of data refinement, class imbalance, and model development, we have reached a significant milestone. Our overall strategy, which combined novel techniques with a unique online gamified test dataset, outperformed the state of the art and produced strong results, with an accuracy rate of 97%. These findings, together with the model's improved sensitivity and specificity, highlight the potential of our approach. This study demonstrates the ability of modern methods to improve early identification, allowing persons with dyslexia to receive prompt and effective care and opening the path to improved quality of life. It reflects our continued dedication to enhancing diagnosis and delivering tangible benefits to individuals and society. In the future, we plan to integrate advanced machine learning algorithms and multimodal data analysis to enrich our dyslexia detection approach, aiming for higher accuracy and personalised care solutions.