INTRODUCTION
Brain stroke (BS) is a cerebrovascular accident that occurs when the blood supply to the brain is interrupted, damaging brain tissue. BS may result in hemiplegia or hemiparesis, leading to impaired coordination and mobility (Mikhail et al., 2020; Sirsat et al., 2020; Yedavalli et al., 2021). This may lead to challenges in ambulation, executing routine activities, and maintaining autonomy in daily life (Inamdar et al., 2021). BS may also cause aphasia, a condition characterized by impaired speech and language skills. The damaged areas of the brain determine the kind and severity of the difficulties, which can include problems with speaking, comprehending, reading, and writing. BS survivors may encounter cognitive issues, including memory deficits, impaired concentration, and diminished problem-solving skills (Gaidhani et al., 2019). These cognitive impairments may negatively affect a person’s ability to perform everyday tasks and their overall well-being.
Early identification and management are vital for improving patient outcomes because stroke is a prominent cause of disability and death worldwide (Rahman et al., 2023). Computed tomography (CT) and magnetic resonance imaging (MRI) are crucial in the diagnosis of strokes (Li et al., 2020; Xu et al., 2020; Surya et al., 2021). However, their availability and cost can be limiting, particularly in settings with limited resources. There has been increasing interest in using artificial intelligence and machine learning (ML) techniques to help identify and diagnose BSs at an early stage (Nishio et al., 2020). These strategies can potentially improve the efficiency and precision of stroke detection, lowering the need for costly imaging equipment and specialist knowledge.
Current techniques for identifying strokes frequently rely on the subjective analysis of radiologists, resulting in inconsistent diagnoses, treatment delays, and lost chances for intervention ( Raghavendra et al., 2021). Moreover, traditional ML methods may not possess the capability to comprehend intricate patterns and subtleties seen in CT images, thus restricting their diagnostic precision and dependability ( Jung and Whangbo, 2020). Therefore, it is essential to create sophisticated computational models that harness the capabilities of deep learning (DL) and traditional ML methods in order to enhance the detection of strokes from CT scans.
DL models have shown encouraging outcomes in automating the interpretation of medical images for stroke identification (Chavva et al., 2022). DL algorithms may be used to automate the processing of MRI images for stroke identification, alleviating the workload of radiologists and healthcare workers. This automation improves operational efficiency and enables quicker analysis of imaging investigations. DL models may be deployed on many platforms, such as hospital imaging systems and cloud-based services, making stroke detection accessible to healthcare institutions of varying sizes and resource capacities.
DL algorithms can examine vast imaging data and detect small differences and biomarkers indicating stroke disease ( Gautam and Raman, 2021). This individualized method of identifying strokes enables customized treatment strategies and interventions, maximizing patient results and minimizing the impact of stroke-related impairments ( Schmitt et al., 2022). Due to the growing accessibility of high-quality medical imaging data and developments in computational approaches, there is now an exceptional chance to create advanced stroke identification models.
This research presents a detailed framework for detecting BSs by combining the capabilities of SqueezeNet v1.1 and MobileNet V3-Small using feature fusion approaches. To enhance the classification results and make them more interpretable, we include a CatBoost model in this fusion strategy. The primary objective of the proposed framework is to improve the precision, effectiveness, and comprehensibility of BS detection systems, eventually leading to enhanced patient care and outcomes in clinical settings. The contributions of this study are detailed as follows:
By using both DL and traditional ML methods, this study advances the development of advanced computational models for stroke detection. The proposed strategy combines DL architectures, namely SqueezeNet v1.1 and MobileNet V3-Small, with traditional ML models like CatBoost. This integration makes it possible to use the advantages of both paradigms.
The authors employ cutting-edge DL architectures, SqueezeNet v1.1 and MobileNet V3-Small, to examine the effectiveness of feature extraction from CT images. Through thorough experiments, the authors show that these architectures can extract meaningful features that capture minor details related to stroke pathophysiology.
This paper investigates feature fusion strategies for combining representations derived from several DL architectures. By combining advantageous characteristics from SqueezeNet v1.1 and MobileNet V3-Small, we amplify the ability of the feature set to distinguish strokes, resulting in enhanced stroke recognition performance.
The authors propose combining feature representations retrieved from CT scans with a gradient-boosting model, namely CatBoost. The CatBoost model is further improved through hyperparameter optimization with Optuna to create reliable and precise stroke identification models.
The present study undertakes a thorough assessment and validation of the built models on a set of CT images labeled to indicate the presence of stroke. To evaluate the effectiveness of the models in identifying strokes, the authors measure their accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AU-ROC).
LITERATURE REVIEW
Stroke is a prominent factor in causing disability and death on a worldwide scale, requiring prompt and precise detection for efficient treatment and control ( Sheth et al., 2023). Most research studies have recently focused on creating computer models to detect strokes using sophisticated ML methods and medical imaging technologies, including CT and MRI. DL is an effective technique for analyzing medical images ( Ali et al., 2021). It allows for directly extracting intricate patterns and characteristics from raw data. Multiple research studies have investigated the use of DL architectures to detect strokes from MRI images. A practical method includes using a convolutional neural network (CNN) to recognize strokes automatically. Badriyah et al. (2020) developed a CNN model to identify acute ischemic stroke lesions. The model exhibited exceptional accuracy and sensitivity in detecting stroke lesions, showcasing the promise of DL in assisting radiologists with stroke diagnosis. Using transfer learning to modify pretrained CNN models for application in stroke detection is the subject of an additional area of study. Lewick et al. (2020) introduced a transfer learning method that utilizes a pretrained model to diagnose acute ischemic stroke lesions accurately. They demonstrated the efficacy of transfer learning in medical image analysis by attaining competitive performance in stroke recognition after fine-tuning the pretrained model on a dataset of stroke patients.
DL models trained on CT images for BS diagnosis may be limited in their ability to generalize across patient demographics, scanner types, and imaging methods (Kanchana and Menaka, 2020; Phaphuangwittayakul et al., 2022). Research that confirms these models’ effectiveness on various datasets is needed to verify their dependability in real-world clinical environments. CT scans may exhibit abnormalities, including noise, motion artifacts, and metal artifacts, which may impact the efficacy of DL models. Developing robust preprocessing methods that can reduce the effects of artifacts and improve image quality is essential for making stroke detection models more reliable. High accuracy in research settings does not by itself establish the clinical effectiveness of DL-based stroke detection models, which depends on how they perform in real-world clinical practice. It is crucial to conduct prospective clinical trials to assess the efficacy of these models in assisting doctors’ decision-making processes. This is necessary for their acceptance and incorporation into regular clinical workflows.
METHODOLOGY
SqueezeNet v1.1 (https://github.com/forresti/SqueezeNet/blob/master/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel) and MobileNet V3 (https://github.com/kuan-wang/pytorch-mobilenet-v3) are recent CNN architectures that have shown remarkable performance in image classification challenges. SqueezeNet v1.1 is efficient and has been successful in many image recognition applications. It achieves high accuracy with few parameters, making it suited to situations with limited resources. MobileNet V3, in turn, performs very well in mobile and edge computing settings, delivering fast inference while maintaining accuracy. Although SqueezeNet v1.1 and MobileNet V3 have shown exceptional ability in image classification tasks, their performance in stroke detection may be further improved by using feature fusion approaches to combine their respective capabilities. Feature fusion combines features obtained from different CNNs to represent the input data better, which can enhance the accuracy and reliability of classification. Furthermore, conventional ML methods, including CatBoost (https://github.com/catboost/catboost), can be combined with DL models to tackle the difficulty of classifying stroke images that include intricate patterns and minor irregularities. CatBoost is a gradient-boosting framework. It is resistant to overfitting and suitable for medical image classification applications where interpretability and generalization are of utmost importance. Figure 1 shows the proposed BS detection model.
Data acquisition
The authors obtained the BS CT image dataset from the Kaggle repository (https://www.kaggle.com/datasets/afridirahman/brain-stroke-ct-image-dataset). The images were preprocessed and categorized into normal and abnormal classes. The normal class contains 1551 images, and the abnormal class contains 950 images. The authors employed data augmentation techniques to increase the number of images in the normal and abnormal classes to 9684 and 8897, respectively.
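The effect of augmentation on class balance follows directly from the counts reported above. The short sketch below only redoes that arithmetic; the helper names are illustrative, and the augmentation techniques themselves are not specified in the text, so none are assumed here.

```python
# Image counts reported in the text, before and after augmentation.
counts = {
    "normal": {"original": 1551, "augmented": 9684},
    "abnormal": {"original": 950, "augmented": 8897},
}


def imbalance_ratio(stage):
    """Ratio of normal to abnormal images at a given stage."""
    return counts["normal"][stage] / counts["abnormal"][stage]


def growth_factor(cls):
    """How many times larger a class became after augmentation."""
    return counts[cls]["augmented"] / counts[cls]["original"]


for stage in ("original", "augmented"):
    print(f"{stage}: normal/abnormal = {imbalance_ratio(stage):.2f}")
for cls in counts:
    print(f"{cls}: grew by a factor of {growth_factor(cls):.2f}")
```

The normal-to-abnormal ratio falls from about 1.63 to about 1.09, because the abnormal class was expanded more aggressively (roughly 9.4 times) than the normal class (roughly 6.2 times).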
SqueezeNet V1.1-based feature extraction
SqueezeNet v1.1 is primarily constructed from fire modules, each consisting of a squeeze layer followed by an expand layer. The squeeze layer is composed of 1 × 1 convolutions, which compress the input feature maps by decreasing the number of channels. The expand layer collects local and global information using a combination of 1 × 1 and 3 × 3 convolutions. SqueezeNet v1.1, although small, can extract distinctive feature representations from input images, such as CT scans of stroke-affected brains. The fire modules can effectively capture intricate features and significant semantic information, making them well-suited for various image identification applications. During forward propagation through the SqueezeNet v1.1 architecture, the input CT images are processed sequentially by several layers, including the fire modules.
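To make the parameter savings of a fire module concrete, the following sketch counts weights and biases for one module and for a plain 3 × 3 convolution with the same input and output widths. The channel sizes (64 input, 16 squeeze, 64 + 64 expand) are an assumption taken from the first fire module of the public SqueezeNet v1.1 definition, not from this paper, and the function names are illustrative.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


def fire_params(c_in, squeeze, expand1x1, expand3x3):
    """Parameters of a fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expands."""
    squeeze_p = conv_params(c_in, squeeze, 1)
    expand_p = conv_params(squeeze, expand1x1, 1) + conv_params(squeeze, expand3x3, 3)
    return squeeze_p + expand_p


# Assumed channel sizes (first fire module of the public SqueezeNet v1.1 definition).
fire = fire_params(c_in=64, squeeze=16, expand1x1=64, expand3x3=64)
# A single 3x3 convolution mapping the same 64 channels to the same 128 outputs.
plain = conv_params(c_in=64, c_out=128, k=3)
print(fire, plain, round(plain / fire, 2))
```

Under these assumptions the fire module needs 11,408 parameters against 73,856 for the plain convolution, a reduction of roughly 6.5 times, which comes almost entirely from feeding the 3 × 3 filters 16 channels instead of 64.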
Convolutional operations are carried out at each layer to extract features from the input images. The 1 × 1 convolutions in the squeeze layers reduce the channel depth of the input feature maps, while the following expand layers capture more intricate patterns and structures. As the CT images pass through the network, feature maps are produced at various levels of the SqueezeNet v1.1 architecture. The feature maps are abstract representations of the input images, where each channel encodes distinct image attributes that the network has learned.
Feature maps obtained from deeper layers of the network often carry higher-level semantic information, while those from shallower layers primarily store finer low-level features. The feature maps obtained via SqueezeNet v1.1 act as comprehensive representations of the raw CT images, encapsulating pertinent data for the identification of strokes. They include local image information, such as edges and textures, and global semantic elements that indicate stroke pathology. Using these feature representations, subsequent classification algorithms may differentiate between stroke and non-stroke instances. The pretrained weights of SqueezeNet v1.1, obtained from large-scale image classification tasks such as ImageNet, are fine-tuned on a smaller dataset of CT scans labeled to indicate the presence of strokes.
MobileNet V3-based feature extraction
MobileNet V3 is a compact and efficient CNN architecture developed specifically for mobile and embedded devices. Although MobileNet V3 is mostly intended for image classification tasks, it may also be used to extract features from medical images such as CT scans. The CT images are first preprocessed to ensure they are in an acceptable format for input to the MobileNet V3 architecture. This process may include resizing the images to a predetermined resolution, standardizing the pixel intensity values, and possibly applying other modifications such as cropping or windowing to improve image quality. The authors employed a pretrained MobileNet V3 model for feature extraction with limited computational resources. MobileNet V3 is composed of inverted residual blocks that use lightweight depthwise and pointwise convolutions, resulting in an efficient feature extraction process that maintains a high level of performance. The classification head of the MobileNet V3 model is removed, since feature extraction does not require class predictions. The authors performed a forward pass on the preprocessed CT images using the modified MobileNet V3 model to extract features from various levels of the network. Features derived from deeper layers capture high-level semantic information, while features from shallower layers retain finer low-level detail. The feature maps or embeddings produced by the MobileNet V3 model at the specified layers are then retrieved. The feature maps serve as abstract representations of the input CT images, with each channel representing distinct image properties acquired by the network during learning. In addition, the authors used principal component analysis (PCA) to reduce the dimensionality of the retrieved features while maintaining their ability to distinguish between classes.
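The dimensionality reduction step can be sketched with a small PCA built on NumPy's singular value decomposition. This is a minimal illustration under stated assumptions: the embedding width (576), the number of images (200), the number of retained components (32), and the function name `pca_fit_transform` are all hypothetical, and random data stands in for real MobileNet V3 embeddings.

```python
import numpy as np


def pca_fit_transform(features, n_components):
    """Project row-wise feature vectors onto their top principal components."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    explained = (singular_values ** 2) / np.sum(singular_values ** 2)
    return centered @ components.T, components, explained[:n_components]


rng = np.random.default_rng(0)
# Hypothetical stand-in for extracted embeddings: 200 images, 576 values each.
embeddings = rng.normal(size=(200, 576))
reduced, components, explained = pca_fit_transform(embeddings, n_components=32)
print(reduced.shape, float(explained.sum()))
```

In a real pipeline the mean and components would be estimated on the training folds only and then reused to transform validation and test features, so that no information leaks across the split.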
Feature fusion
The convolutional layers of the SqueezeNet and MobileNet models use learnable filters to scan the input CT image, identifying distinct characteristics at varying spatial scales. The filters are learned during training and can detect patterns such as edges, textures, and shapes that are characteristic of stroke-related abnormalities. Pooling layers are used to reduce the resolution of the feature maps produced by the convolutional layers. This decreases the spatial dimensions of the feature maps while preserving the crucial information, and it makes the resulting features more robust to minor changes and distortions in the input image. Traditional CNN designs often have fully connected layers at the network’s end to perform classification using the extracted features. In lightweight designs such as SqueezeNet and MobileNet, however, global average pooling is commonly used instead of fully connected layers. This technique calculates the mean value of each feature map, resulting in a fixed-length vector that represents the complete image. The final output of each model is therefore a vector representation of the input CT image that captures its most significant characteristics. These features are learned during training and reflect different characteristics seen in the brain scan, including those related to stroke.
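Global average pooling is simple enough to show directly. The sketch below, with an illustrative function name and toy inputs, averages each channel over its spatial grid, so the output length depends only on the number of channels and not on the spatial size of the feature maps.

```python
import numpy as np


def global_average_pool(feature_maps):
    """Average each channel over its spatial grid: (C, H, W) -> (C,)."""
    return feature_maps.mean(axis=(1, 2))


# Two toy inputs with the same channel count but different spatial sizes.
maps_small = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
maps_large = np.ones((2, 7, 7))

print(global_average_pool(maps_small))
print(global_average_pool(maps_large).shape)
```

Both inputs yield a vector of length 2, which is what makes the pooled output usable as a fixed-length feature vector for a downstream classifier.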
Feature fusion combines features obtained from multiple sources or modalities to improve overall representation and discriminative capability. In the context of BS detection, the authors combined the key features obtained from MobileNet V3 and SqueezeNet v1.1 to provide a complete feature representation. MobileNet V3 and SqueezeNet v1.1 provide feature maps at distinct network levels, capturing both fine-grained detail and higher-level semantic information. Each feature vector holds numerical values that describe the attributes of the input CT image, as encoded by the corresponding DL architecture. The authors standardized the feature vectors so that their values are on a consistent scale. Standardization prevents features from exerting excessive influence on the fusion process because of their magnitude. Lastly, the authors concatenated the standardized feature vectors from MobileNet V3 and SqueezeNet v1.1 into a unified feature vector, which serves as the combined representation of the input CT image.
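The standardize-then-concatenate step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the feature widths (512 and 576), the batch of 8 images, and the function names are assumptions, and random arrays with deliberately different scales stand in for the two backbones' outputs.

```python
import numpy as np


def standardize(features, eps=1e-8):
    """Z-score each feature column across the batch."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)


def fuse(features_a, features_b):
    """Standardize each backbone's features, then concatenate them per image."""
    return np.concatenate([standardize(features_a), standardize(features_b)], axis=1)


rng = np.random.default_rng(0)
# Hypothetical feature matrices with very different scales (8 images each).
squeezenet_feats = rng.normal(loc=5.0, scale=20.0, size=(8, 512))
mobilenet_feats = rng.normal(loc=0.0, scale=0.1, size=(8, 576))

fused = fuse(squeezenet_feats, mobilenet_feats)
print(fused.shape)
```

Without the standardization, the first block's values would be roughly two hundred times larger than the second's in this toy setup and could dominate any scale-sensitive step applied to the fused vector.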
Ischemic infarcts
Ischemic strokes are the most common type of stroke and occur when a clot obstructs blood flow to a part of the brain. In brain MRI, ischemic infarcts appear as hyperintense (bright) regions on diffusion-weighted imaging sequences and hypointense (dark) regions on apparent diffusion coefficient maps.
Hemorrhagic strokes occur when a blood vessel ruptures and causes bleeding into the brain tissue. In brain MRI, hemorrhages appear as hyperintense regions on T1-weighted and T2-weighted imaging sequences, often surrounded by edema. Perfusion imaging techniques such as dynamic susceptibility contrast or arterial spin labeling can assess blood flow in the brain. Regions of hypoperfusion or delayed perfusion may indicate areas affected by stroke. Swelling or edema of brain tissue can occur following a stroke. Edema appears as hyperintense regions on fluid-attenuated inversion recovery sequences and may indicate the extent of tissue damage. MRI angiography techniques such as magnetic resonance angiography can visualize the blood vessels in the brain. Stenosis or occlusion of blood vessels may indicate underlying vascular pathology contributing to stroke risk.
CatBoost-based BS identification
CatBoost is a robust gradient-boosting library renowned for its exceptional performance and efficiency in managing categorical data. Using Optuna’s hyperparameter optimization capabilities in conjunction with CatBoost’s resilience, we can develop an exceptionally optimized model for a wide range of applications, such as identifying BSs from medical imaging data. Combining Optuna with CatBoost streamlines the process of exploring hyperparameter space, improving model performance and enabling better generalization. This method automates the process of adjusting hyperparameters, liberating practitioners from the laborious and time-consuming chore of manually fine-tuning hyperparameters. Through a methodical examination of the hyperparameter space and assessing the model’s performance on validation data, the aim is to pinpoint the hyperparameters that provide the maximum accuracy and resilience in stroke detection tasks. The authors defined an objective function that accepts hyperparameters as input and produces a loss value that has to be minimized.
The objective function assesses the performance of the CatBoost model on the validation set using suitable evaluation measures, including accuracy, F1-score, and AU-ROC. The authors specified and initialized the hyperparameters to be tuned, including the learning rate, tree depth, number of trees, and regularization parameters. Optuna (Akiba et al., 2019) is a hyperparameter tuning framework that automatically searches for the best combination of hyperparameters for ML models. Optuna employs an iterative approach throughout the optimization process, proposing new sets of hyperparameters based on the outcomes of prior trials. CatBoost models are trained with the various hyperparameter settings and assessed on the validation set, and the resulting performance metrics are recorded. Optuna continues the exploration until a set number of trials has been executed or a convergence criterion has been satisfied.
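A tuning loop of this kind might look like the sketch below. It is an assumption-laden outline rather than the authors' implementation: the search ranges, the choice of 1 − AU-ROC as the loss, the trial count, and the helper names `suggest_params`, `make_objective`, and `tune` are all illustrative. Optuna, CatBoost, and scikit-learn are imported inside the functions so the search space can be inspected without them installed.

```python
def suggest_params(trial):
    """Map an Optuna trial to CatBoost hyperparameters (ranges are illustrative)."""
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "depth": trial.suggest_int("depth", 4, 10),
        "iterations": trial.suggest_int("iterations", 100, 1000),
        "l2_leaf_reg": trial.suggest_float("l2_leaf_reg", 1.0, 10.0),
    }


def make_objective(X_train, y_train, X_val, y_val):
    """Build an objective returning 1 - AU-ROC on the validation set (lower is better)."""
    from catboost import CatBoostClassifier
    from sklearn.metrics import roc_auc_score

    def objective(trial):
        model = CatBoostClassifier(**suggest_params(trial), verbose=0, random_seed=0)
        model.fit(X_train, y_train)
        stroke_probability = model.predict_proba(X_val)[:, 1]
        return 1.0 - roc_auc_score(y_val, stroke_probability)

    return objective


def tune(X_train, y_train, X_val, y_val, n_trials=30):
    """Run the study and return the best hyperparameters and their loss."""
    import optuna

    study = optuna.create_study(direction="minimize")
    study.optimize(make_objective(X_train, y_train, X_val, y_val), n_trials=n_trials)
    return study.best_params, study.best_value
```

Calling `tune(fused_train, y_train, fused_val, y_val)` on fused feature matrices (hypothetical variable names) would return the best parameter set found within the trial budget. The stopping rule here is a fixed number of trials, which is one of the two criteria mentioned in the text.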
The process of integrating SHAP (SHapley Additive exPlanations) values with CatBoost entails using the SHAP library to provide explanations for the predictions generated by the CatBoost model. SHAP values provide valuable insights into the impact of each feature on the model’s output for individual predictions, thus enhancing the interpretability of the model’s judgments.
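The quantity that SHAP attributes to each feature is a Shapley value: the feature's marginal contribution to the prediction, averaged over all orders in which features could be revealed. The sketch below computes it exactly for a toy three-feature value function. It illustrates the definition only; it does not use the SHAP library, which relies on much faster algorithms for tree ensembles, and the toy function and its numbers are hypothetical.

```python
from itertools import permutations


def shapley_values(value_fn, n_features):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        for feature in order:
            before = value_fn(frozenset(present))
            present.add(feature)
            totals[feature] += value_fn(frozenset(present)) - before
    return [total / len(orderings) for total in totals]


def toy_value(subset):
    """Hypothetical model output when only the features in `subset` are known."""
    score = 0.0
    if 0 in subset:
        score += 2.0
    if 1 in subset:
        score += 1.0
    if 0 in subset and 1 in subset:
        score += 1.0  # interaction effect, shared between features 0 and 1
    return score  # feature 2 never changes the output


phi = shapley_values(toy_value, 3)
print(phi)
```

The attributions sum to the difference between the full prediction and the empty-set baseline (4.0 here), the interaction term is split evenly between the two features that cause it, and the irrelevant feature receives zero. These are the properties that make such values useful for explaining individual predictions.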
Evaluation metrics
Accuracy quantifies the extent to which a model’s predictions are correct. It is determined by dividing the number of correctly predicted instances by the total number of instances. It offers a broad assessment of how well a model performs, but it may not be appropriate for datasets with uneven class distributions or when the consequences of misclassifying different classes differ. Precision is the ratio of correctly predicted positive cases to all instances that the model predicted as positive. It reflects the model’s capacity to minimize false positives, which is especially important when the consequences of false alarms are significant. Recall, or sensitivity, is the ratio of correctly predicted positive cases to the total number of actual positive cases. Recall evaluates the model’s capacity to identify all relevant examples of a class, which is vital when missing positive cases has significant consequences. The F1-score is the harmonic mean of precision and recall and offers a balanced evaluation of the model’s effectiveness. Specificity complements recall by measuring the model’s capacity to correctly identify negative instances, which is advantageous when false alarms are expensive. The number of parameters in a model has a direct impact on its complexity, memory demands, and training duration, so understanding and limiting the parameter count is essential to developing efficient models. The number of floating-point operations (FLOPs) is a crucial measure of the computational cost of DL models, especially when computing resources are limited.
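The definitions above translate directly into a few lines of arithmetic on the confusion matrix. In the sketch below the counts are invented for illustration and the function name is hypothetical; "positive" stands for the stroke (abnormal) class.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the evaluation metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        # Harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


# Invented counts: 95 stroke cases (90 found, 5 missed), 105 normal cases (10 false alarms).
metrics = classification_metrics(tp=90, fp=10, tn=95, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts recall (about 0.947) exceeds precision (0.900), and the F1-score (about 0.923) lies between the two, closer to the smaller value, as expected of a harmonic mean.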
RESULTS AND DISCUSSIONS
The experimental investigation was performed using a Windows 10 Professional operating system, an i7 fifth generation processor, 16 GB RAM, and NVIDIA R350X settings. The authors used a fivefold cross-validation technique to train the proposed model. The proposed model was developed using the PyTorch, TensorFlow, and Keras packages. The model was trained using 17 batches and 21 epochs. Table 1 presents the results of the fivefold cross-validation. The model delivered consistently strong results in each fold. It learned the essential patterns and categorized BS with high precision. The proposed feature extraction produced the critical features that aid the model in decision-making.
Table 1. Findings of the fivefold cross-validation (values in %).
Folds | Accuracy | Precision | Recall | F1-score | Specificity |
---|---|---|---|---|---|
1 | 98.5 | 97.9 | 98.2 | 98.0 | 98.6 |
2 | 98.9 | 97.8 | 97.6 | 97.7 | 99.4 |
3 | 98.4 | 98.3 | 98.1 | 98.2 | 98.3 |
4 | 98.9 | 98.6 | 98.6 | 98.6 | 97.9 |
5 | 99.2 | 98.7 | 98.9 | 98.8 | 98.7 |
The achieved accuracy of 99.1% in identifying BS using a fivefold cross-validation method highlights the strength and efficiency of the created model. The model’s high accuracy demonstrates its capacity to consistently detect cases of BS using medical imaging data. The exceptional performance of the model has important clinical implications. Precise and fast identification of BS is crucial for commencing immediate medical care, which may greatly enhance patient outcomes and decrease the likelihood of long-term neurological harm or impairment. Healthcare professionals might have more trust in the diagnostic expertise of a model that has such high accuracy. This can lead to faster decision-making and treatment planning.
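The per-fold values in Table 1 can be summarized as a mean and standard deviation per metric. The sketch below does so with the standard library only; the variable names are illustrative and the numbers are copied from Table 1.

```python
from statistics import mean, stdev

# Per-fold values copied from Table 1 (in %).
folds = {
    "accuracy": [98.5, 98.9, 98.4, 98.9, 99.2],
    "precision": [97.9, 97.8, 98.3, 98.6, 98.7],
    "recall": [98.2, 97.6, 98.1, 98.6, 98.9],
    "f1": [98.0, 97.7, 98.2, 98.6, 98.8],
    "specificity": [98.6, 99.4, 98.3, 97.9, 98.7],
}

# Mean and sample standard deviation across the five folds.
summary = {name: (mean(values), stdev(values)) for name, values in folds.items()}
for name, (avg, spread) in summary.items():
    print(f"{name}: {avg:.2f} +/- {spread:.2f}")
```

Averaged this way, the fold accuracies come to about 98.78% with a spread of roughly 0.33 percentage points, while the 99.1% figure quoted in the text matches the averaged accuracy reported in Table 2.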
The findings of the performance validation are highlighted in Table 2 and Figure 2. The factors, including quality of training data, effective feature representation, robust training procedure, regularization techniques, and hyperparameter tuning, have yielded an optimal outcome.
Table 2. Performance validation outcomes (values in %).
Classes | Accuracy | Precision | Recall | F1-score | Specificity |
---|---|---|---|---|---|
Normal | 98.8 | 99.1 | 98.7 | 98.9 | 99.3 |
Abnormal | 99.5 | 98.9 | 98.9 | 98.9 | 99.1 |
Average | 99.1 | 99.0 | 98.8 | 98.9 | 99.2 |
The comparative analysis outcomes are presented in Table 3 and Figure 3. The proposed DL-based BS prediction model has demonstrated remarkable performance compared to existing models, marking a significant advancement in medical image analysis and stroke detection. The results underscore the efficacy and potential of DL techniques in accurately identifying stroke lesions from CT images, with implications for improving patient outcomes and clinical decision-making in stroke care. One key aspect that sets the proposed model apart from existing approaches is its ability to leverage the strengths of DL architectures such as SqueezeNet v1.1 and MobileNet V3-Small for feature extraction. These architectures are designed to balance model complexity with computational efficiency, making them well-suited for medical imaging tasks where accuracy and speed are paramount. By harnessing the representational power of these architectures, the proposed model can effectively capture subtle patterns and nuances indicative of stroke pathology, leading to improved prediction accuracy. Furthermore, integrating feature fusion techniques enhances the discriminative power of the extracted features by combining complementary representations from multiple DL architectures. This feature fusion process allows the model to leverage the unique strengths of each architecture, resulting in a more comprehensive and robust feature representation of the input CT images. As a result, the model can effectively differentiate between stroke and non-stroke cases with higher accuracy and reliability.
Table 3. Comparative analysis outcomes (values in %).
Models | Accuracy | Precision | Recall | F1-score | Specificity |
---|---|---|---|---|---|
Proposed model | 99.1 | 99.0 | 98.8 | 98.9 | 99.2 |
Pan et al. (2021) | 97.7 | 97.2 | 97.6 | 97.4 | 98.1 |
Qiu et al. (2020) | 97.9 | 96.5 | 96.8 | 96.6 | 98.3 |
Yalçın and Vural (2022) | 98.3 | 97.1 | 97.6 | 97.3 | 96.5 |
Kumaravel et al. (2021) | 98.6 | 97.0 | 97.2 | 97.1 | 97.1 |
Patel et al. (2023) | 97.7 | 96.5 | 96.7 | 96.6 | 98.0 |
Table 4 presents the computational complexities of BS detection models. The proposed DL-based BS detection model delivers improved outcomes at lower complexity than the compared methods: it has the fewest parameters and the shortest testing time, and fewer FLOPs than every compared model except that of Patel et al. (2023). This accomplishment signifies a notable advance in medical image analysis, providing a streamlined and potent technique for identifying strokes from CT images. An inherent benefit of the proposed model is its capacity to obtain superior outcomes using fewer parameters. The model achieves excellent accuracy using SqueezeNet v1.1 and MobileNet V3-Small, which also greatly reduces the number of trainable parameters. The decrease in the number of parameters not only improves the computational efficiency of the model but also lowers the likelihood of overfitting, which can improve generalization to new, unseen data.
Table 4. Computational complexities.
Models | Parameters (in millions) | FLOPs (in giga) | Testing time (in seconds) | Learning rate |
---|---|---|---|---|
Proposed model | 29 | 45 | 1.02 | 0.0005 |
Pan et al. (2021) | 36 | 53 | 1.52 | 0.0031 |
Qiu et al. (2020) | 41 | 56 | 2.13 | 0.004 |
Yalçın and Vural (2022) | 39 | 57 | 1.62 | 0.001 |
Kumaravel et al. (2021) | 43 | 51 | 1.27 | 0.006 |
Patel et al. (2023) | 38 | 42 | 1.19 | 0.0004 |
Abbreviation: FLOPs, floating-point operations.
Moreover, the proposed model requires fewer FLOPs than most of the compared models (Table 4), showing its computational efficiency. By using a compact architecture and lightweight convolutional techniques, the proposed model can perform complex computation with fewer computing resources. This efficiency is especially beneficial in contexts with limited resources, such as mobile devices or edge computing platforms, where processing capabilities are restricted. Another notable benefit of the proposed approach is its shorter testing time. The model’s simplified architecture and efficient design provide quicker inference, enabling fast analysis and identification of stroke lesions from CT images. The shorter testing time has significant ramifications for clinical practice, allowing healthcare practitioners to make prompt and informed decisions about patient treatment. Taken together, the low parameter count, FLOPs, and testing time of the proposed model emphasize its practical usefulness and adaptability in real-world scenarios. This efficiency makes the model well suited to clinical settings, where rapid and precise stroke detection is essential for patient care and treatment decision-making.
The dataset used may be relatively limited or lacking in variety, which might impact the model’s capacity to generalize to wider populations or varying imaging settings. The dataset may exhibit an imbalance in the distribution of classes, with a notable disparity in the number of samples between one class (e.g. non-stroke) and the other. This discrepancy has the potential to introduce bias in the performance metrics of the model. The evaluation of the suggested model may have been limited to the same dataset that was used for training, without conducting external validation on alternative datasets to examine its potential to generalize. Although DL architectures are very effective at extracting features, the interpretability of these characteristics may be restricted, making it difficult to comprehend their underlying biological relevance or clinical value.
Developing precise models for stroke detection using medical imaging data might greatly improve the diagnosis procedure. Precise and prompt diagnosis is essential for promptly implementing therapies, maximizing patient outcomes, and minimizing the likelihood of long-term impairment or death linked to strokes. Incorporating these sophisticated models into clinical practice may function as important tools to assist healthcare workers in making decisions. These models may aid doctors in making better judgments about patient management and treatment strategies by automatically analyzing and interpreting medical imaging data. Future research should prioritize the smooth integration of these sophisticated models into current clinical processes. This entails tackling practical obstacles such as ensuring compatibility with healthcare systems, adhering to regulatory requirements, and designing user interfaces to guarantee seamless uptake and use for healthcare workers. It is crucial to do thorough external validation studies and prospective clinical trials in order to evaluate the applicability, consistency, and practical value of the suggested models. Engaging in partnerships with healthcare institutions and stakeholders to gather varied and inclusive datasets may enhance the body of evidence and assist the practical implementation in real-world settings.
CONCLUSIONS
The use of SqueezeNet v1.1, MobileNet V3-Small, feature fusion, and CatBoost models in a stroke recognition model signifies noteworthy progress in medical image analysis. By combining advanced DL architectures, feature fusion methods, and gradient-boosting models, we obtained high accuracy in detecting stroke lesions from CT scans. By using SqueezeNet v1.1 and MobileNet V3-Small for feature extraction, we can effectively capture both specific local features and broader global patterns that are suggestive of stroke pathology. The DL architectures extract meaningful features from CT images, resulting in a comprehensive depiction of stroke lesions. Feature fusion improves the discriminative power of the features by combining complementary representations acquired from distinct architectures. By merging the capabilities of SqueezeNet v1.1 and MobileNet V3-Small, we provide a broad range of features that capture many aspects of stroke pathology. Including CatBoost models improves the stroke recognition system’s prediction performance. By using Optuna for hyperparameter optimization, we tune the CatBoost models for the highest accuracy in detecting strokes. The attained accuracy of 99.1% serves as evidence for the efficacy of the proposed model in detecting stroke lesions from CT images. This level of accuracy highlights the model’s capacity to aid healthcare practitioners in promptly and precisely diagnosing strokes, supporting timely treatment and enhanced patient outcomes. Our stroke detection model can potentially enhance stroke diagnosis and patient care in clinical practice.
Potential future research avenues might include amalgamating supplementary imaging modalities, assimilation of clinical data, and verification in more extensive and varied patient cohorts to enhance the model’s performance and establish its validity.