

      Dual Kernel Support Vector-based Crossover Red Fox Algorithm: Advancements in Assistive Technology for Hearing-impaired Individuals

      Published
      research-article

            Abstract

Individuals with hearing impairment face several challenges, including difficulties in communication, social interaction, and limited access to auditory information. Innovations range from hearing aids to advanced communication devices and mobile apps. Designing solutions that prioritize user feedback ensures a more inclusive and empowering experience for people with hearing impairment. Assistive technology (AT) endeavors to enhance the daily lives of individuals, fostering greater connectivity, and plays a crucial role in addressing these challenges. Therefore, an attention dual kernel support vector-based crossover red fox (ADKS-CRF) algorithm is developed for superior performance. This research proposes a model combining a dual kernel support vector machine with an attention mechanism to operate implicitly in a high-dimensional feature space without computing the transformed feature vectors. The crossover strategy is incorporated into the red fox optimization algorithm, and the integrated CRF formulation fine-tunes the parameters of the ADKS model, avoiding entrapment in local optima. This work conducted experiments using raw data from an effective 3D ear acquisition system dataset. Experimental validation is conducted using various evaluation measures to assess effectiveness. The proposed hybrid approach achieves a sensitivity of 97.8%, an F1-score of 97.1%, a specificity of 96.3%, an accuracy of 98.4%, a false alarm rate of 90.8%, and a Matthews correlation coefficient of 97.3%. The comparative analysis evaluates the efficacy of the ADKS-CRF method against various baseline approaches for the development of ATs for hearing-impaired people.

            Main article text

            INTRODUCTION

Assistive technology (AT) plays a crucial role in enhancing the quality of life of individuals with disabilities. It is a form of service or device that assists impaired individuals in accomplishing tasks and improves their functional ability (Modi and Singh, 2022). Moreover, participation in work and education reduces the involvement of caretakers and minimizes healthcare and social costs. Products associated with AT include wheelchairs, hearing aids, sunglasses, and so on. The growing use of digital technologies, such as advanced sensors and artificial intelligence (AI), has become essential for developing innovations in various fields, leveraging substantial improvements in data analytics and computer processing power (Abdi et al., 2021). The World Health Organization identifies five groups of people who require AT: people with noncommunicable diseases, older people, people with mental health issues, people with gradual functional decline, and people with disabilities (Kbar et al., 2017; Ochsner et al., 2022). AI is finding application in numerous fields, and ATs are no exception (Abide et al., 2020a,b, 2022a,b). The design and development of AT, including sensors and AI, have been employed to satisfy the needs of disabled people. Among the various disabilities, such as hearing impairment and vision loss, hearing loss is one of the most prominent worldwide across all stages of life; affected individuals often recognize words by reading the lips of speakers, which can lead to miscommunication (Hermawati and Pieri, 2020; Abd Ghani et al., 2021). In addition, remote microphone technologies utilized as hearing aids provide improved audibility in certain environments.
In an emergency, people with hearing loss face many difficulties in protecting themselves; hence, alarm systems, such as smoke alarms, security alarms, and fire alarms, are generally employed to secure their lives. Sign language (SL) is the primary language that enables communication for hearing-impaired individuals through manual and non-manual expressions. AI supports the development of SL applications that pave the way for communicating with hearing-impaired people in their everyday lives (Šumak et al., 2021). The development of ATs for severe and profound hearing loss aids communication between individuals. In the design process, ATs are categorized as translating text or speech to SL, translating speech to text, and translating SL to text or speech (Zhang et al., 2022). In the first case, visual avatars are utilized to perform the task; in the second case, augmented reality is used, allowing hearing-impaired individuals to see the facial expressions and speech of the narrator; and the final case utilizes a Kinect 3D depth camera (Microsoft, USA) for translation. The sensors associated with AT include audio- and video-based input modalities, such as speech recognition from an audio signal captured by a microphone and facial expression recognition based on processing visual data using various sensors, including depth sensors and smart vision sensors. In addition, triboelectric sensors are used as external hearing aids, allowing hearing-impaired people to hear voices and music effectively. AT also uses wearable devices to support direct communication between impaired individuals (Ariza and Pearce, 2022; Kannan et al., 2023).

Disabled people face many difficulties in their day-to-day lives and accomplish their daily activities with difficulty (Kbar et al., 2016). However, existing techniques such as convolutional neural network-long short-term memory (CNN-LSTM), YOLOv4-ResNet101, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition (ASODCAE-SLR), and smoothed pseudo-Wigner-Ville distribution-based CNN (SPWVD-CNN) have drawbacks such as high cost, lack of scalability, long implementation time, lack of security, and limited datasets. Hence, the attention dual kernel support vector-based crossover red fox (ADKS-CRF) algorithm is introduced in this paper; this model can overcome the problems of the existing methods. Furthermore, the dual kernel support vector machine (DKSVM) is memory efficient, and red fox optimization (RFO) helps to enhance efficiency; motivated by this, this work develops the ADKS-CRF algorithm. The major contributions of the ADKS-CRF algorithm are as follows:

            • Novel method: For the design and development of ATs for hearing-disabled people, an ADKS-CRF algorithm is proposed that uses data collected from the source of raw data of an effective 3D ear acquisition system dataset.

            • Efficient hyperparameter tuning: The incorporation of the crossover strategy in the RFO process enables effective hyperparameter tuning. The strategy efficiently balances exploration and exploitation during the search for optimal hyperparameter configurations.

            • Performance analysis: The performance validation is conducted with different metrics using comparative analysis and an accuracy of 98.4% is achieved by the developed ADKS-CRF model.

            This paper is organized as follows: the section on “Related Works” provides a diverse analysis of the existing research papers related to hearing-impaired people. The proposed methodology to achieve effectiveness is discussed in the section on “Proposed Methodology”. The experimental evaluation is provided in the Experimental Results section, and the Discussion and Conclusion section with future works concludes the paper.

            RELATED WORKS

This section presents the existing literature in this domain. Gupta et al. (2022) introduced a three-dimensional convolutional neural network (3D-CNN) with long short-term memory for Indian sign language recognition. People with hearing loss face many challenges on online platforms because of communication barriers. To address these issues, the method introduced a WebRTC-based application, which aids the conversion of SL into audible sound. Furthermore, CNN-LSTM was employed to categorize the relevant features, and it is able to train on a single input image. The merit was a low computational cost because the app was deployed on wearable devices, so hearing people could also comprehend the SL; however, the dataset was extremely small.

Alahmadi et al. (2023) established text-to-speech and YOLOv4-ResNet101 methods to improve object detection for visually impaired people (VIP). In this method, ResNet-101 enhances object detection accuracy, handles complex visual forms, and provides a strong feature extraction network. Furthermore, the method provides aural information by employing a text-to-speech conversion model that helps blind people identify obstacles. The training process was improved with the help of image preprocessing, and the Microsoft Common Objects in Context dataset was utilized to verify the method, achieving an accuracy of 96.34%; however, this method was expensive for VIP object detection.

Madhusanka and Ramadass (2021) discussed communication intention-based activities of daily living for elderly or disabled people to enhance their well-being. People who are hard of hearing face many challenges in society, and it is hard to find a qualified caregiver. To overcome these issues, a vision-based design framework was introduced for each activity. Further, CNN-based support vector machines (SVMs) were used to interpret conversational intent from gaze. As a result, these methods help caretakers interact more easily with people who are hard of hearing.

Marzouk et al. (2022) implemented a study on hearing- and speech-disabled persons using atom search optimization with a deep convolutional autoencoder-enabled sign language recognition (ASODCAE-SLR). In this method, collections of feature vectors were utilized; CapsNet (capsule network) served as the feature extraction tool, and a weighted average filtering method was employed to preprocess the input frames. Furthermore, the ASO algorithm was used to tune the parameters, which enhanced the model's performance, and the DCAE method was deployed for SL recognition. As a result, this method avoids overfitting issues, but its drawback was the lack of a dataset for recognizing SL.

Nagarajan and Gopinath (2023) discussed distance estimation and detection of indoor objects for VIP using the Honey Adam African Vultures Optimization (HAAVO) algorithm. This method assessed distance with the help of two networks: a deep residual network and a deep convolutional neural network (DCNN). Furthermore, the integration of a DCNN and a generative adversarial network was employed to detect objects for VIP. HAAVO was deployed to train the networks and enhance their performance. As a result, this method has high efficiency, but the framework was complex, so the computational time for detection was also high.

Ashiq et al. (2022) proposed a solution for VIP using a CNN-based object detection and tracking system. Since VIP cannot see, monitoring their activities and providing timely medication reminders is essential. Therefore, the method employed a web server to share the VIP's current status so family members could easily identify their location and condition. Furthermore, a CNN was employed for detecting objects; this helps the person perceive real-time activity and supports independent living. As a result, this method provides more security, but its drawbacks are a longer training time and a lack of efficiency.

Younis et al. (2019) developed computer-enabled smart glasses equipped with a wide-angle camera for people with vision loss. The method generates a path for people with vision loss, and moving objects were identified by deploying a DL object detector. Furthermore, the identified objects were tracked with the help of a Kalman filter multi-object tracker, and both public and private datasets were used by the proposed method. The merit was a lower computational cost, but the potential complexity of implementation was not addressed.

Khare et al. (2021) aimed to identify schizophrenia (SZ) using a smoothed pseudo-Wigner-Ville distribution (SPWVD)-based CNN. SZ can be accompanied by auditory deficits, so many methods have been developed for detecting SZ, but they face challenges such as low accuracy and high time consumption. To avoid these problems, this method identifies SZ by feeding time-frequency plots into the CNN. Furthermore, the electroencephalogram data were analyzed in terms of both frequency and time. The merit was a lower computational cost, but the method was not able to detect the disorder within a fixed time.

Tasnim et al. (2024) recently presented an in-depth literature review on the use of machine learning for personalization in hearing aids. The authors presented contemporary technologies and discussed future research directions based on the challenges. Kumar et al. (2022) proposed a visual speech recognition (VSR) technique using deep learning. Active shape modeling was used for lip localization, and a CNN-based VSR unit was built to boost overall performance. Results indicated a high accuracy of about 95% with a low word error rate of 6.59%.

            Research gap

Disabled people face many difficulties in their daily activities, but ATs enhance their independence and safety and help detect potential health impairments early. Existing models such as CNN-LSTM, YOLOv4-ResNet101, ASODCAE-SLR, and SPWVD-CNN have drawbacks such as high cost, lack of scalability, long implementation time, lack of security, and limited datasets. To overcome these problems, a strong model is needed; hence, the ADKS-CRF algorithm is proposed in this work. The research gap is specified in the following subsections.

            Effective method

The above-mentioned existing methods struggle to alert disabled people at the correct time, but the proposed model (ADKS-CRF) resolves those issues: the attention DKSVM aids in alerting disabled people at the correct time.

            Optimization issues

The RFO algorithm is utilized to tune the parameters, which enhances the model's performance; however, plain RFO suffers from low computational accuracy and higher cost. The algorithm becomes more efficient when the crossover procedure is applied to manage these optimization issues.

            PROPOSED METHODOLOGY

A person affected by hearing impairment encounters difficulties in social interaction, communication, and related activities. Therefore, this work aims to enhance the intelligibility and quality of speech signals for hearing-disabled people. Accordingly, an efficacious algorithm, the ADKS-CRF algorithm, is introduced. The DKSVM and attention mechanism models give effective predictive output for hearing-impaired people. Within a short period, the developed ADKS-CRF algorithm transmits alerts to hearing-impaired people to protect them. The crossover red fox algorithm contributes to tuning the parameters of the proposed ADKS-CRF method.

The strength of the ADKS-CRF algorithm stems from its ability to process complex datasets and optimize model parameters effectively. The DKSVM component allows the algorithm to manage high-dimensional feature spaces efficiently, while the attention mechanism enhances its focus on relevant data patterns. The RFO algorithm, combined with a crossover strategy, ensures robust and efficient parameter tuning, overcoming common optimization challenges such as local optima. The ADKS-CRF method's performance is estimated using five evaluation parameters. Figure 1 provides the schematic representation of the ADKS-CRF method. A detailed elucidation of the ADKS-CRF model is presented in the sections below.

            Figure 1:

            Schematic representation of ADKS-CRF. Abbreviation: ADKS-CRF, attention dual kernel support vector-based crossover red fox.

            Assistive listening devices

An assistive listening device (ALD) amplifies weak audio signals for persons with hearing loss in large venues such as classrooms, airports, and conference halls. The main aim of speech enhancement methods is to enhance the quality of the speech signal, and with the help of an ALD, a person with hearing loss can live independently. This technology is designed to enhance sound perception and communication for hearing-disabled people, using advanced sensors and signal processing to amplify sounds in various environments.

            Augmentative and alternative communication

It refers to strategies and tools used to replace speech for individuals with hearing impairment (Crowe et al., 2022). Individuals with hearing impairment utilize alternative methods such as SL, choice boards, texting, and computer-assisted speech. For efficient communication and mutual understanding between hearing-impaired individuals and hearing people, users require an augmentative and alternative communication system.

            Predictive model for disabled people

            Predicting the needs and preferences of people with hearing impairment involves anticipating challenges and developing innovative solutions. Additionally, the integration of smart technologies may lead to more seamless and natural integrations for people with hearing disability in various aspects of life. Several methods may contribute to personalized treatment plans offering the requirements of hearing-disabled people. Here, novel approaches like DKSVM with attention mechanism and crossover RFO are implemented for the prediction model.

            Dual kernel SVM

It refers to an SVM (Qiu et al., 2021) that operates in the dual space, using kernel functions to map the input data into a higher-dimensional feature space implicitly. The benefit of DKSVM is that applying several kernel functions can yield superior performance in mapping each input signature, whereas a standard SVM typically applies only a single kernel. Given the input vectors $J=\{t_j^m, z_j\}$, where $t_j^m$ and $z_j$ denote the $j$th sample and its label, the dual kernel function is expressed as follows:

(1) $L(t_j^m, t_i^m) = \sum_{m=1}^{4} \lambda_m L_m^{\mathrm{dual}}(t_j^m, t_i^m)$

(2) $\lambda_m L_m^{\mathrm{dual}} = (\lambda_m - \mu) L_m^{c_1} + \mu L_m^{c_2}$

where $\lambda_m$ denotes the coefficient of each kernel, with $\sum_{m=1}^{4} \lambda_m = 1$. The dual kernel function $L_m^{\mathrm{dual}}$ contains two sub-kernels, $L^{c_1}$ and $L^{c_2}$: $\mu \in [0,1]$ is the sub-kernel coefficient, $L_m^{c_1}$ is the major kernel, and $L_m^{c_2}$ rectifies the support vectors miscategorized by $L_m^{c_1}$. It is noteworthy that two signatures in $J$ belonging to the same hearing-impaired class carry equivalent information. Consequently, the label information of the training samples can be combined with the kernel support vector machine (KSVM) prediction. The label information is applied dynamically using the dynamic KSVM, and the dynamic dual kernel is evaluated as follows:

(3) $L^c(t_j^m, t_i^m) = \exp\left(\log L(t_j^m, t_i^m) + S\right)$

(4) $S(t_j^m, t_i^m) = \begin{cases} \alpha, & z_j = z_i \\ 0, & z_j \neq z_i \end{cases}$

where $S(t_j^m, t_i^m)$ and $\alpha$ signify the dynamic ideal kernel and the label-weight information, respectively. The prediction samples of DKSVM are described using the Neumann divergence technique,

(5) $L^c(t_o^m, t_r^m) = L(t_o^m, t_r^m) + L(t_o^m, t_r^m)^T\, L(t_j^m, t_i^m)$

where $t_o^m, t_r^m \in J$ denote the prediction samples. The decision function can be expressed as follows:

(6) $\bar{z} = \operatorname{sign}\left(X^T L^c(t_o^m, t_r^m) + a\right)$

where $X^T$ and $a$ denote the learned weight vector and the bias term.
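The dual kernel construction can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the RBF and polynomial sub-kernels, the blend weight `mu`, and the label weight `alpha` are assumed stand-ins for $L^{c_1}$, $L^{c_2}$, and the $\alpha$ of Eq. (4).

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Gaussian (RBF) sub-kernel, playing the role of the major kernel L_c1
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def poly_kernel(X, Y, degree=2):
    # Polynomial sub-kernel, playing the role of the corrective kernel L_c2
    return (X @ Y.T + 1.0) ** degree

def dual_kernel(X, Y, lam=1.0, mu=0.3):
    # Eq. (2): weighted blend of the two sub-kernels
    return (lam - mu) * rbf_kernel(X, Y) + mu * poly_kernel(X, Y)

def dynamic_kernel(X, y, alpha=0.5):
    # Eqs. (3)-(4): boost kernel values for same-label training pairs
    K = dual_kernel(X, X)
    S = alpha * (y[:, None] == y[None, :])
    return np.exp(np.log(K + 1e-12) + S)
```

Because $\exp(\log K + S) = K e^{S}$, the dynamic step simply multiplies same-label entries of the kernel matrix by $e^{\alpha}$, leaving different-label entries unchanged.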

            Attention mechanism

The attention mechanism identifies target features by assigning weights to the inputs (Ding et al., 2022). Applying the attention mechanism improves the model's accuracy when constructing time-sequence models. Its main purposes are to assign attention according to the hidden-layer state, distinguish input details, and highlight the impact of crucial information. The attention weights are computed as follows:

(7) $f_s = v_a \tanh(w_a h_t + b_a)$

(8) $a_t = \dfrac{\exp(f_s)}{\sum_{i=1}^{s} \exp(f_i)}$

where $h_t$, $f_s$, $a_t$, $w_a$, and $b_a$ denote the hidden-layer state at time $t$, the attention score, the attention probability distribution, the weight vector, and the bias vector, respectively.
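Equations (7)–(8) amount to scoring each hidden state and normalizing the scores with a softmax. A minimal sketch (the shapes of `H`, `w_a`, `b_a`, and `v_a` are assumptions for illustration):

```python
import numpy as np

def attention_weights(H, w_a, b_a, v_a):
    """Eq. (7) scores each hidden state; Eq. (8) normalizes them (softmax).

    H:   (T, d_h) matrix of hidden states h_t
    w_a: (d_a, d_h) weight matrix, b_a: (d_a,) bias, v_a: (d_a,) score vector
    """
    f = v_a @ np.tanh(w_a @ H.T + b_a[:, None])   # scores f_t, shape (T,)
    e = np.exp(f - f.max())                       # numerically stable softmax
    return e / e.sum()                            # attention distribution a_t
```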

            Novel crossover RFO

The red fox is adept at hunting all kinds of wild and domestic prey, including small animals. Foxes can be divided into two types: those that live nomadic lives and those that live in packs led by an alpha pair (Natarajan et al., 2022). When conditions are favorable, individuals may leave the pack and establish their own. Each individual in the population is denoted as $\bar{c} = (c_0, c_1, \ldots, c_{n-1})$, and $(c_u^v)^s$ denotes coordinate $v$ of fox $u$ at iteration $s$. The solution $(\bar{c})^{(u)} = [(c_0)^{(u)}, (c_1)^{(u)}, \ldots, (c_{n-1})^{(u)}]$ represents a point in the $n$-dimensional search space, and the fitness value $f((\bar{c})^{(u)})$ is evaluated at that point.

            Global exploration stage

Every individual plays an important role in the survival of the pack. Members that find no food at their current location may search for food in distant places. Thus, the population is first sorted by fitness, and then the Euclidean distance of each individual to the best one is calculated as follows:

(9) $d\left((\bar{c}_u)^s, (\bar{c}_{best})^s\right) = \left\| (\bar{c}_u)^s - (\bar{c}_{best})^s \right\|$

Each individual then moves in the direction of the best individual, as given by the equation below:

(10) $(\bar{c}_u)^s = (\bar{c}_u)^s + \alpha \operatorname{sign}\left((\bar{c}_{best})^s - (\bar{c}_u)^s\right)$

where the step scale $\alpha \in \left(0, d\left((\bar{c}_u)^s, (\bar{c}_{best})^s\right)\right)$ is chosen randomly for each individual in the population and recalculated at every iteration.
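One global-exploration pass over a population can be sketched as follows. The sphere function is a toy fitness used only for illustration; it is not part of the ADKS-CRF objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(c):
    # toy fitness for this sketch (lower is better)
    return float((c ** 2).sum())

def global_exploration(population, fitness=sphere):
    """Sort by fitness, then move every fox toward the best (Eqs. 9-10)."""
    order = np.argsort([fitness(c) for c in population])
    pop = population[order].astype(float)
    best = pop[0]
    for i in range(1, len(pop)):
        d = np.linalg.norm(pop[i] - best)               # Eq. (9)
        alpha = rng.uniform(0.0, d) if d > 0 else 0.0   # random step scale
        pop[i] = pop[i] + alpha * np.sign(best - pop[i])  # Eq. (10)
    return pop
```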

            Traversing through the local habitat-local search phase

A random value $\mu \in (0, 1)$ models the chance that a fox is noticed while approaching prey. The action of the fox is defined by the equation below:

(11) $\begin{cases} \text{Move closer} & \text{if } \mu > 0.75 \\ \text{Stay and disguise} & \text{if } \mu \leq 0.75 \end{cases}$

When the fox moves closer, its displacement is scaled by an observation radius, which depends on the fox's viewing angle $\varphi_0$ and the scaling parameter $l$ representing the distance to the prey. This radius is expressed by the equation below:

(12) $A = \begin{cases} l \, \dfrac{\sin \varphi_0}{\varphi_0} & \text{if } \varphi_0 \neq 0 \\ \theta & \text{if } \varphi_0 = 0 \end{cases}$

where $\theta$ is a random number in $(0, 1)$. The movement of each coordinate is given in the equation below:

(13) $\begin{cases} c_0^{new} = lb\cos(\varphi_1) + c_0^{actual} \\ c_1^{new} = lb\sin(\varphi_1) + lb\cos(\varphi_2) + c_1^{actual} \\ c_2^{new} = lb\sin(\varphi_1) + lb\sin(\varphi_2) + lb\cos(\varphi_3) + c_2^{actual} \\ \vdots \\ c_{n-2}^{new} = lb\sum_{k=1}^{n-2}\sin(\varphi_k) + lb\cos(\varphi_{n-1}) + c_{n-2}^{actual} \\ c_{n-1}^{new} = lb\sin(\varphi_1) + lb\sin(\varphi_2) + \cdots + lb\sin(\varphi_{n-1}) + c_{n-1}^{actual} \end{cases}$

where the angles $\varphi_1, \varphi_2, \ldots, \varphi_{n-1} \in (0, 2\pi)$ are chosen randomly.
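Equation (13) displaces each coordinate by an accumulated sum of sine terms plus one cosine term. A direct transcription, with the radius product $lb$ passed in as a single parameter, might look like this:

```python
import numpy as np

def local_move(fox, lb, phi):
    """Eq. (13): displace coordinates using angles phi[1..n-1] (phi[0] unused)."""
    n = len(fox)
    new = np.asarray(fox, dtype=float).copy()
    sin_sum = 0.0
    for k in range(n - 1):
        # row k: lb * (sin(phi_1)+...+sin(phi_k)) + lb*cos(phi_{k+1}) + c_k
        new[k] = lb * (sin_sum + np.cos(phi[k + 1])) + fox[k]
        sin_sum += np.sin(phi[k + 1])
    new[n - 1] = lb * sin_sum + fox[n - 1]   # last row of Eq. (13)
    return new
```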

            Reproduction stage

To keep the population size constant, the two best individuals $(\bar{c}^{(1)})^s$ and $(\bar{c}^{(2)})^s$, the alpha couple, are used. The center of their habitat is given by the equation below:

(14) $\mathrm{habitat(center)}^s = \dfrac{(\bar{c}^{(1)})^s + (\bar{c}^{(2)})^s}{2}$

The habitat diameter is the distance between the alpha couple, shown in the equation below:

(15) $\mathrm{habitat(diameter)}^s = \left\| (\bar{c}^{(1)})^s - (\bar{c}^{(2)})^s \right\|$

Whether the worst individuals are replaced by new nomadic individuals or by offspring of the alpha couple is decided by a random value $g \in (0, 1)$:

$\begin{cases} \text{New nomadic individual} & \text{if } g \geq 0.45 \\ \text{Reproduction of the alpha couple} & \text{if } g < 0.45 \end{cases}$

If the alpha couple $(\bar{c}^{(1)})^s$ and $(\bar{c}^{(2)})^s$ reproduces, the offspring $(\bar{c}_{reproduced})^s$ is defined by the equation below:

(16) $(\bar{c}_{reproduced})^s = k \, \dfrac{(\bar{c}^{(1)})^s + (\bar{c}^{(2)})^s}{2}$
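The reproduction stage can be sketched as below. Sampling the nomadic individual uniformly inside the habitat box is an assumption made for illustration; the source only specifies that new individuals appear within the habitat.

```python
import numpy as np

def reproduction_step(c1, c2, k=0.45, g=None, rng=None):
    """Eqs. (14)-(16): habitat centre/diameter and the replacement individual."""
    rng = rng or np.random.default_rng()
    center = (c1 + c2) / 2.0                  # Eq. (14)
    diameter = np.linalg.norm(c1 - c2)        # Eq. (15)
    if g is None:
        g = rng.random()
    if g >= 0.45:
        # new nomadic individual sampled near the habitat centre (assumed)
        new = center + rng.uniform(-diameter / 2, diameter / 2, size=c1.shape)
    else:
        new = k * (c1 + c2) / 2.0             # Eq. (16), alpha-couple offspring
    return center, diameter, new
```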

            Crossover operation

To increase the diversity of the population, the original progeny chromosome and the mutant chromosome combine to form the cross chromosome $R_u^s = (r_{u1}^s, \ldots, r_{uB}^s)$. The algorithm assigns each gene a random number in $[0, 1]$; a gene is taken from the mutant chromosome when its random number is less than or equal to the crossover threshold (Zhang et al., 2019). This crossover strategy can be described as follows:

(17) $r_{uv}^s = \begin{cases} l_{uv}^s, & \text{if } rand_v[0,1] \leq DT \text{ or } v = v_{rand} \\ w_{uv}^s, & \text{otherwise} \end{cases} \quad v = 1, 2, \ldots, N$

where $rand_v[0,1]$ is the random number assigned to each gene to enable genetic crossover, $DT$ is the crossover threshold, and $v_{rand} \in \{1, 2, \ldots, N\}$ is a randomly selected index that guarantees at least one gene of the chromosome $R_u^s$ crosses over. Without such a mechanism, the diversity of the population would shrink rapidly, which is not optimal for global optimization.
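Equation (17) is a gene-wise selection between two chromosomes, and can be sketched as follows (here the mutant supplies $l_{uv}^s$ and the original parent supplies $w_{uv}^s$, which is the usual reading of this crossover):

```python
import numpy as np

def crossover(parent, mutant, DT=0.5, rng=None):
    """Eq. (17): gene-wise mix; v_rand guarantees at least one mutant gene."""
    rng = rng or np.random.default_rng()
    N = len(parent)
    take_mutant = rng.random(N) <= DT      # rand_v[0,1] <= DT
    take_mutant[rng.integers(N)] = True    # v = v_rand
    return np.where(take_mutant, mutant, parent)
```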

            Crossover red fox algorithm for optimizing the hyperparameters for the ADKS-CRF algorithm

An appropriate and optimal set of hyperparameters is responsible for the improved efficacy of the ADKS-CRF approach. Therefore, the hyperparameters of the ADKS-CRF algorithm are adjusted and optimized by applying the RFO algorithm. This technique is advantageous in determining the global best solution with a smaller step distance and a wider optimization interval. On the other hand, it can suffer from local optima when searching for food, which diminishes the performance of the algorithm. The crossover-based RFO is applied to tune the hyperparameters of the ADKS-CRF and raise its performance by addressing this difficulty. Drawing on the advantages of the novel crossover red fox algorithm, the hyperparameters are tuned. Figure 2 illustrates the flowchart representation of ADKS-CRF.

            Figure 2:

            Flowchart representation of ADKS-CRF. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; SVM, support vector machine.
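The overall tuning loop can be sketched as follows. The objective here is a hypothetical stand-in for validation loss over two assumed hyperparameters (a kernel width and the blend weight); a real run would train the dual-kernel SVM at each candidate and score held-out data. Greedy acceptance of improving trials is a simplification for this sketch.

```python
import numpy as np

rng = np.random.default_rng(4)

def val_loss(params):
    # hypothetical validation-loss surface over (gamma, mu); illustration only
    gamma, mu = params
    return (gamma - 0.5) ** 2 + (mu - 0.3) ** 2

def crf_tune(objective, bounds, n_foxes=10, iters=50, DT=0.5):
    """Crossover red fox search over a box-bounded hyperparameter space."""
    lo, hi = np.array(bounds, dtype=float).T
    pop = rng.uniform(lo, hi, size=(n_foxes, len(lo)))
    for _ in range(iters):
        fit = np.array([objective(c) for c in pop])
        best = pop[fit.argmin()].copy()
        for i in range(n_foxes):
            d = np.linalg.norm(pop[i] - best)
            alpha = rng.uniform(0.0, d) if d > 0 else 0.0
            moved = pop[i] + alpha * np.sign(best - pop[i])  # exploration step
            mask = rng.random(len(lo)) <= DT                 # crossover mask
            mask[rng.integers(len(lo))] = True
            trial = np.clip(np.where(mask, moved, pop[i]), lo, hi)
            if objective(trial) < fit[i]:                    # greedy acceptance
                pop[i] = trial
    fit = np.array([objective(c) for c in pop])
    return pop[fit.argmin()]
```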

            Alert system

            This system is developed to notify users of specific events, emergencies, and important information. Integrating advanced sensors into the alert system can improve the effectiveness for people with hearing impairment. Additionally, it integrates their connectivity with other devices like wearables and smartphones for alerts. It also utilizes an intense flash, loud sound, and vibration to alert hearing-disabled people when a certain instance happens. Some alarm security systems include doorbell alarms, fire alarms, wake-up alarms, baby-cry alarms, smoke alarms, security alarms, and so on.

            EXPERIMENTAL RESULTS

Existing works for developing effective models that help people with disabilities were reviewed, and several of these models are discussed in the section above. This clearly points out that existing methods meet difficulties, so this work introduces the ADKS-CRF algorithm. The proposed ADKS-CRF method is able to tackle the issues faced by the existing methods, and its performance is assessed against other methods, namely CNN-LSTM (Gupta et al., 2022), YOLOv4-ResNet101 (Alahmadi et al., 2023), ASODCAE-SLR (Marzouk et al., 2022), and smoothed pseudo-Wigner-Ville distribution-CNN (SPWVD-CNN) (Khare et al., 2021). The overall evaluation of the proposed ADKS-CRF covers the evaluation parameters, the experimental setup for the implementation, the dataset used for validation, and the parameter settings of the proposed method.

            Experimental setup

The proposed ADKS-CRF method is implemented using Python 3.6.5 (Python Software Foundation) on a PC with the following configuration: Intel Core i5-8600K CPU, 250 GB SSD, 1 TB HDD, GeForce GTX 1050 Ti 4 GB GPU, and 16 GB RAM. The parameter settings are specified in Table 1.

            Table 1:

            Parameter settings.

Parameters                 Values
Number of classes          4
Kernel function            2
Regularization parameter   1
Number of iterations       7
Random values              [0, 1]
            Dataset description

This work exploits the raw data of an effective 3D ear acquisition system dataset (Liu, 2015) for the validation of the proposed ADKS-CRF method. The ear has a stable structure that does not alter with facial expressions or age. The dataset contains many raw ear images for the testing and training processes of the proposed ADKS-CRF method; in total, it comprises 418 files with a large number of ear images. Utilizing this dataset supplies numerous advantages: it yields enhanced results and improves the efficiency and accuracy of the proposed model.

            Evaluation measures

            Matthews correlation coefficient (MCC), F1-score, accuracy, specificity, and sensitivity measures determine the performance through the evaluation. The mathematical calculation of these measures is indicated as follows:

            • Specificity: the proportion of true negatives among all actually negative samples, i.e. the total of true negatives and false positives.

              (18) Specificity = TUNE / (TUNE + FEPS)

            • False alarm rate: the proportion of actually negative samples that are incorrectly predicted as positive, computed as shown in Equation (19).

              (19) FAR = FEPS / (FEPS + TUNE)

            • Accuracy: the proportion of correct detections among the total number of image samples.

              (20) Accuracy = (TUPS + TUNE) / (TUPS + FEPS + FENE + TUNE)

            • Sensitivity: the proportion of true positives among the sum of true positives and false negatives.

              (21) Sensitivity = TUPS / (TUPS + FENE)

            • Matthews correlation coefficient: a balanced measure that accounts for both false positives and false negatives; its value ranges between −1 and 1.

              (22) MCC = (TUPS × TUNE − FEPS × FENE) / √[(TUPS + FEPS)(TUPS + FENE)(TUNE + FEPS)(TUNE + FENE)]

            • F1-score: the harmonic mean of precision and sensitivity, which accounts for both false positives and false negatives.

              (23) F1-score = 2TUPS / (2TUPS + FEPS + FENE)
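Equations (18)-(23) can be computed directly from the four confusion-matrix counts. The following self-contained sketch (the function name is ours, not from the authors' code) implements each measure exactly as defined above:

```python
import math

def classification_metrics(tups, tune, feps, fene):
    """Compute Equations (18)-(23) from confusion-matrix counts.

    tups: true positives, tune: true negatives,
    feps: false positives, fene: false negatives.
    """
    specificity = tune / (tune + feps)                      # Eq. (18)
    far = feps / (feps + tune)                              # Eq. (19)
    accuracy = (tups + tune) / (tups + feps + fene + tune)  # Eq. (20)
    sensitivity = tups / (tups + fene)                      # Eq. (21)
    mcc = (tups * tune - feps * fene) / math.sqrt(          # Eq. (22)
        (tups + feps) * (tups + fene) * (tune + feps) * (tune + fene)
    )
    f1 = 2 * tups / (2 * tups + feps + fene)                # Eq. (23)
    return {"specificity": specificity, "far": far, "accuracy": accuracy,
            "sensitivity": sensitivity, "mcc": mcc, "f1": f1}
```

For example, with TUPS = 90, TUNE = 85, FEPS = 5, and FENE = 10, sensitivity evaluates to 90/100 = 0.9 and FAR is the complement of specificity.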

            Performance evaluation

            This section provides a graphical analysis of the performance of the proposed ADKS-CRF approach for assisting people with hearing impairment. Each figure compares the CNN-LSTM, YOLOv4-Resnet101, ASODCAE-SLR, and SPWVD-CNN methods with the developed ADKS-CRF method. Figure 3 illustrates accuracy during training and testing: the developed ADKS-CRF method gains higher training and testing accuracy as the number of epochs increases, reaching a training accuracy of 0.9 and a testing accuracy of 0.7 at 20 epochs. Figure 4 presents the loss analysis: the ADKS-CRF method records a loss value of 98.44 during training and a lower loss value of 97.22 during testing.

            Figure 3:

            Accuracy analysis in training and testing.

            Figure 4:

            Loss analysis in training and testing.

            The MCC performance of the developed method is provided in Figure 5, which shows that the ADKS-CRF method attains the highest MCC, 97.3%. The CNN-LSTM and YOLOv4-Resnet101 methods yield the lowest MCC values, 90.1% and 89.6%, respectively, while the SPWVD-CNN and ASODCAE-SLR methods supply 92.5% and 91.3%. The F1-score comparison is examined in Figure 6. The proposed ADKS-CRF method yields an F1-score of 97.1%, well above the baselines: the ASODCAE-SLR method delivers 89.7%, the SPWVD-CNN model 92.3%, and the CNN-LSTM and YOLOv4-Resnet101 methods 88.8% and 90.5%, respectively.

            Figure 5:

            MCC analysis. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; MCC, Matthews correlation coefficient; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Figure 6:

            F1-score performance. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Specificity analysis is exhibited in Figure 7, which compares the CNN-LSTM, SPWVD-CNN, ASODCAE-SLR, and YOLOv4-Resnet101 methods with the proposed ADKS-CRF method. All baselines reach lower specificity than the developed ADKS-CRF method: the CNN-LSTM and SPWVD-CNN methods attain 93.2% each, while the YOLOv4-Resnet101 and ASODCAE-SLR methods attain 88.3% and 88.4%, respectively. The proposed ADKS-CRF model achieves the highest specificity, 96.3%.

            Figure 7:

            Specificity analysis. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Figure 8 shows the accuracy performance of the developed ADKS-CRF method, which attains the highest accuracy, 98.4%, exceeding the YOLOv4-Resnet101, CNN-LSTM, SPWVD-CNN, and ASODCAE-SLR methods. The ASODCAE-SLR and CNN-LSTM models attain accuracies of 92.4% and 92.7%, respectively, while the SPWVD-CNN and YOLOv4-Resnet101 algorithms deliver 88.6% and 91.4%. Among the baselines, the SPWVD-CNN approach records the lowest accuracy.

            Figure 8:

            Analysis of accuracy performance. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            The sensitivity performance of the proposed ADKS-CRF method is represented in Figure 9. As shown, the ADKS-CRF algorithm acquires the highest sensitivity, 97.8%. The YOLOv4-Resnet101 and SPWVD-CNN algorithms reach 93.2% and 89.3%, respectively, and the ASODCAE-SLR and CNN-LSTM methods reach 92.4% and 89.8%. Table 2 presents the false alarm rate (FAR) analysis of the existing and proposed methods. The developed ADKS-CRF method achieves the lowest FAR, indicating that it can reliably alert people with hearing impairment, whereas the SPWVD-CNN, YOLOv4-Resnet101, CNN-LSTM, and ASODCAE-SLR methods yield higher FAR values.

            Figure 9:

            Comparison of sensitivity performance. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Table 2:

            FAR performance.

            Methods | FAR performance
            ADKS-CRF | 90.8%
            CNN-LSTM | 95.4%
            YOLOv4-Resnet101 | 97.3%
            ASODCAE-SLR | 94.2%
            SPWVD-CNN | 96.8%

            Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; FAR, false alarm rate; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Figures 10 and 11 illustrate the hearing levels of the left and right ears as estimated by the proposed ADKS-CRF method, determined on the basis of frequency. The proposed ADKS-CRF algorithm reaches a hearing level close to 95.7 for the left ear and 96.9 for the right ear. In both analyses, the YOLOv4-Resnet101, CNN-LSTM, SPWVD-CNN, and ASODCAE-SLR methods provide lower hearing levels.

            Figure 10:

            Hearing level on left ear. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Figure 11:

            Hearing level on right ear. Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Table 3 summarizes the effectiveness of the proposed ADKS-CRF method in terms of sensitivity, F1-score, specificity, accuracy, and MCC. Table 4 reports the overall performance of the proposed ADKS-CRF approach and the other methods: the YOLOv4-Resnet101, CNN-LSTM, SPWVD-CNN, and ASODCAE-SLR methods perform worse, while the developed ADKS-CRF algorithm achieves the highest overall performance, 98.9%. Figure 12 shows the output of the suggested ADKS-CRF method, illustrating its efficiency in improving the quality of speech signals for people with hearing impairment and in sending alerts to them within a short time.

            Table 3:

            Efficiency of the ADKS-CRF method.

            Measures | ADKS-CRF | CNN-LSTM | YOLOv4-Resnet101 | ASODCAE-SLR | SPWVD-CNN
            Sensitivity (%) | 97.8 | 89.8 | 93.2 | 92.4 | 89.3
            F1-score (%) | 97.1 | 88.8 | 90.5 | 89.7 | 92.3
            Specificity (%) | 96.3 | 93.2 | 88.3 | 88.4 | 93.2
            Accuracy (%) | 98.4 | 92.7 | 91.4 | 92.4 | 88.6
            MCC (%) | 97.3 | 90.1 | 89.6 | 91.3 | 92.5

            Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; MCC, Matthews correlation coefficient; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Table 4:

            Overall performance.

            Methods | Overall performance (%)
            ADKS-CRF | 98.9
            CNN-LSTM | 90.6
            YOLOv4-Resnet101 | 92.7
            ASODCAE-SLR | 93.2
            SPWVD-CNN | 95.8

            Abbreviations: ADKS-CRF, attention dual kernel support vector-based crossover red fox; ASODCAE-SLR, atom search optimization with a deep convolutional autoencoder-enabled sign language recognition; CNN-LSTM, convolutional neural network-long short-term memory; SPWVD-CNN, smoothed pseudo-Wigner–Ville distribution-CNN.

            Figure 12:

            Output of the proposed ADKS-CRF algorithm. Abbreviation: ADKS-CRF, attention dual kernel support vector-based crossover red fox.

            DISCUSSION AND CONCLUSIONS

            Many methods have been employed for speech signal processing in this research field, but they face problems that must be tackled with a suitable model. Therefore, to improve speech signals and thereby assist people with hearing impairment, the ADKS-CRF algorithm is proposed in this work. In the proposed method, the DKSVM is integrated with an attention mechanism that operates in a high-dimensional feature space without explicitly computing the transformed feature vectors. The crossover red fox algorithm combines a crossover strategy with the RFO algorithm to tune the model parameters while avoiding local optima. The experimental results show that the proposed ADKS-CRF method outperforms the compared methods on six evaluation measures: sensitivity, specificity, accuracy, F1-score, FAR, and MCC. Based on these evaluations, it can be concluded that the developed ADKS-CRF algorithm is highly effective for improving the quality of speech signals.
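As a rough illustration of this tuning idea (a simplified sketch under our own assumptions, not the authors' implementation), a crossover-augmented red-fox-style optimizer can be expressed as a population search in which each candidate parameter vector steps toward the current best solution, adds a small exploratory perturbation, and is recombined with the best individual through uniform crossover to avoid stagnating in local optima:

```python
import random

def crossover_red_fox_sketch(fitness, dim, pop_size=20, iters=50, seed=0):
    """Minimal illustrative sketch of a crossover-augmented
    red-fox-style optimizer; not the authors' exact algorithm.

    Each candidate takes a scaled step toward the global best (the
    'approach' move) with a small random perturbation (the 'wandering'
    move), then a uniform crossover with the best individual recombines
    coordinates; greedy selection keeps the fittest variant.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(iters):
        for i, fox in enumerate(pop):
            alpha = rng.uniform(0, 0.5)
            # Approach move toward the current best, plus small wandering noise.
            moved = [x + alpha * (b - x) + rng.gauss(0, 0.1)
                     for x, b in zip(fox, best)]
            # Uniform crossover with the best individual.
            child = [b if rng.random() < 0.5 else x
                     for x, b in zip(moved, best)]
            # Greedy selection: keep whichever candidate is fitter.
            pop[i] = min((fox, moved, child), key=fitness)
        best = min(pop, key=fitness)
    return best

# Example: minimize a simple sphere function in 3 dimensions.
best = crossover_red_fox_sketch(lambda v: sum(x * x for x in v), dim=3)
```

The greedy selection step guarantees the population never worsens, while the crossover with the best individual mixes coordinates across candidates, which is the mechanism credited in the text with escaping local optima.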

            In order to apply the ADKS-CRF algorithm in a practical setting, the initial stage is incorporating it into AT equipment such as advanced hearing aids or mobile applications specifically developed for individuals with hearing impairment. The device would gather real-time auditory and visual data, which the program analyzes to improve communication and accessibility. For instance, the algorithm employs an attention mechanism in hearing aids to concentrate on crucial auditory information, thereby filtering and amplifying pertinent sounds while minimizing background noise. The DKSVM component effectively manages high-dimensional data from diverse contexts, assuring strong performance. Moreover, the RFO system consistently adjusts the device’s characteristics to accommodate various user settings and preferences, thereby enhancing the overall user experience. The device can also utilize the algorithm to activate notifications during emergency scenarios, such as fire alarms or security alerts, to protect the safety of the user. Collecting user feedback through the device’s interface can assist in further refining the model, thus ensuring its continued effectiveness and user-friendliness. This practical application utilizes the benefits of the ADKS-CRF algorithm to greatly improve the daily experiences of those with hearing impairment by offering a more prompt and adaptable AT solution.

            Several ethical considerations were addressed during the development and validation of the ADKS-CRF algorithm. Data privacy was prioritized by anonymizing all data and implementing stringent security protocols to safeguard user information. Inclusivity was ensured by incorporating diverse auditory and visual data from various demographics, enhancing the model’s generalizability and fairness. Potential biases in the dataset were assessed and mitigated through balanced data collection and preprocessing techniques. These efforts aim to ensure that the ADKS-CRF algorithm is effective, equitable, and ethically transparent, contributing positively to societal impact.

            While the ADKS-CRF algorithm offers substantial advancements, several limitations exist. Its high computational complexity may limit its use on devices with limited processing power. The algorithm’s performance is highly dependent on the quality and diversity of the training data, and any biases could affect its generalizability. The model’s complexity can reduce interpretability, and scaling for broader applications poses challenges. Real-time processing constraints may introduce latency, and continuous updates and maintenance are necessary, requiring ongoing effort and resources. Addressing these limitations is crucial for the practical deployment and adoption of the ADKS-CRF algorithm in AT solutions.

            In the future, the developed ADKS-CRF method will be extended to sign board recognition in real-time applications, and deep learning fusion will be incorporated to accelerate the proposed model on real-world datasets.

            ACKNOWLEDGMENTS

            The authors extend their appreciation to the King Salman Center for Disability Research for funding this work through Research Group no. KSRG-2023-451.

            REFERENCES

            1. Abd Ghani MK, Noma NG, Mohammed MA, Abdulkareem KH, Garcia-Zapirain B, Maashi MS, et al. 2021. Innovative artificial intelligence approach for hearing-loss symptoms identification model using machine learning techniques. Sustainability. Vol. 13(10):5406

            2. Abdi S, Kitsara I, Hawley MS, de Witte LP. 2021. Emerging technologies and their potential for generating new assistive technologies. Assist. Technol. Vol. 33 Suppl 1:17–26

            3. Abidi MH, Alkhalefah H, Mohammed MK, Umer U, Qudeiri JEA. 2020a. Optimal scheduling of flexible manufacturing system using improved lion-based hybrid machine learning approach. IEEE Access. Vol. 8:96088–96114

            4. Abidi MH, Umer U, Mohammed MK, Aboudaif MK, Alkhalefah H. 2020b. Automated maintenance data classification using recurrent neural network: enhancement by spotted hyena-based whale optimization. Mathematics. Vol. 8(11):2008

            5. Abidi MH, Alkhalefah H, Umer U. 2022a. Fuzzy harmony search based optimal control strategy for wireless cyber physical system with industry 4.0. J. Intell. Manuf. Vol. 33(6):1795–1812

            6. Abidi MH, Mohammed MK, Alkhalefah H. 2022b. Predictive maintenance planning for industry 4.0 using machine learning for sustainable manufacturing. Sustainability. Vol. 14(6):3387

            7. Alahmadi TJ, Ur Rahman A, Khalid Alkahtani H, Kholidy H. 2023. Enhancing object detection for VIPs using YOLOv4_Resnet101 and Text-to-Speech conversion model. Multimodal Technol. Interact. Vol. 7(8):77

            8. Ariza JÁ, Pearce JM. 2022. Low-cost assistive technologies for disabled people using open-source hardware and software: a systematic literature review. IEEE Access. Vol. 10:124894–124927

            9. Ashiq F, Asif M, Bin Ahmad M, Zafar S, Masood K, Mahmood T, et al. 2022. CNN-based object recognition and tracking system to assist visually impaired people. IEEE Access. Vol. 10:14819–14834

            10. Crowe B, Machalicek W, Wei Q, Drew C, Ganz J. 2022. Augmentative and alternative communication for children with intellectual and developmental disability: a mega-review of the literature. J. Dev. Phys. Disabil. Vol. 34(1):1–42

            11. Ding W, Huang J, Shang G, Wang X, Li B, Li Y, et al. 2022. Short-term trajectory prediction based on hyperparametric optimisation and a dual attention mechanism. Aerospace. Vol. 9(8):464

            12. Gupta M, Thakur N, Bansal D, Chaudhary G, Davaasambuu B, Hua Q. 2022. CNN-LSTM hybrid real-time IoT-based cognitive approaches for ISLR with WebRTC: auditory impaired assistive technology. J. Healthc. Eng. Vol. 2022:3978627

            13. Hermawati S, Pieri K. 2020. Assistive technologies for severe and profound hearing loss: beyond hearing aids and implants. Assist. Technol. Vol. 32(4):182–193

            14. Kannan RS, Ezhilarasi P, Rajagopalan VG, Krishnamithran S, Ramakrishnan H, Balaji HK, et al. 2023. Integrated AI based smart wearable assistive device for visually and hearing-impaired people. 2023 International Conference on Recent Trends in Electronics and Communication (ICRTEC); Mysore, India; 10-11 February 2023. p. 1–6

            15. Kbar G, Al-Daraiseh A, Mian SH, Abidi MH. 2016. Utilizing sensors networks to develop a smart and context-aware solution for people with disabilities at the workplace (design and implementation). Int. J. Distrib. Sens. Netw. Vol. 12(9):1550147716658606

            16. Kbar G, Al-Daraiseh A, Aly S, Abidi MH, Mian SH. 2017. Assessment of technologies relevant for people with motor hearing and speech impairment. IETE Tech. Rev. Vol. 34(3):254–264

            17. Khare SK, Bajaj V, Acharya UR. 2021. SPWVD-CNN for automated detection of schizophrenia patients using EEG signals. IEEE Trans. Instrum. Meas. Vol. 70:1–9

            18. Kumar LA, Renuka DK, Rose SL, Priya MCS, Wartana IM. 2022. Deep learning based assistive technology on audio visual speech recognition for hearing impaired. Int. J. Cogn. Comput. Eng. Vol. 3:24–30

            19. Liu Y. 2015. RAW data of an effective 3D ear acquisition system. Figshare. [Cross Ref]

            20. Madhusanka BGDA, Ramadass S. 2021. Implicit intention communication for activities of daily living of elder/disabled people to improve well-being. In: IoT in Healthcare and Ambient Assisted Living. Marques G, Bhoi AK, de Albuquerque VHC, Hareesha KS (eds). p. 325–342. Springer Singapore, Singapore

            21. Marzouk R, Alrowais F, Al-Wesabi FN, Hilal AM. 2022. Atom search optimization with deep learning enabled Arabic sign language recognition for speaking and hearing disability persons. Healthcare. Vol. 10(9):1606

            22. Modi N, Singh J. 2022. A survey of research trends in assistive technologies using information modelling techniques. Disabil. Rehabil. Assist. Technol. Vol. 17(6):605–623

            23. Nagarajan A, Gopinath MP. 2023. Hybrid optimization-enabled deep learning for indoor object detection and distance estimation to assist visually impaired persons. Adv. Eng. Softw. Vol. 176:103362

            24. Natarajan R, Megharaj G, Marchewka A, Divakarachari PB, Hans MR. 2022. Energy and distance based multi-objective red fox optimization algorithm in wireless sensor network. Sensors. Vol. 22(10):3761

            25. Ochsner B, Spöhrer M, Stock R. 2022. Rethinking assistive technologies: users, environments, digital media, and app-practices of hearing. NanoEthics. Vol. 16(1):65–79

            26. Qiu W, Tang Q, Zhu K, Yao W, Ma J, Liu Y. 2021. Cyber spoofing detection for grid distributed synchrophasor using dynamic dual-kernel SVM. IEEE Trans. Smart Grid. Vol. 12(3):2732–2735

            27. Šumak B, Brdnik S, Pušnik M. 2021. Sensors and artificial intelligence methods and algorithms for human-computer intelligent interaction: a systematic mapping study. Sensors (Basel). Vol. 22(1):20

            28. Tasnim NZ, Ni A, Lobarinas E, Kehtarnavaz N. 2024. A review of machine learning approaches for the personalization of amplification in hearing aids. Sensors. Vol. 24(5):1546

            29. Younis O, Al-Nuaimy W, Rowe F, Alomari MH. 2019. A smart context-aware hazard attention system to help people with peripheral vision loss. Sensors. Vol. 19(7):1630

            30. Zhang Z, Ding S, Jia W. 2019. A hybrid optimization algorithm based on cuckoo search and differential evolution for solving constrained engineering problems. Eng. Appl. Artif. Intell. Vol. 85:254–268

            31. Zhang S, Suresh L, Yang J, Zhang X, Tan SC. 2022. Augmenting sensor performance with machine learning towards smart wearable sensing electronic systems. Adv. Intell. Syst. Vol. 4(4):2100194

            Author and article information

            Journal
            jdr
            Journal of Disability Research
            King Salman Centre for Disability Research (Riyadh, Saudi Arabia )
            1658-9912
            05 June 2024
            Vol. 3(5): e20240066
            Affiliations
            [1 ] Advanced Manufacturing Institute, King Saud University, Riyadh 11421, Saudi Arabia ( https://ror.org/02f81g417)
            [2 ] King Salman Center for Disability Research, Riyadh 11614, Saudi Arabia ( https://ror.org/01ht2b307)
            [3 ] Department of Mechanical Engineering, Jamia Millia Islamia, New Delhi 110025, India ( https://ror.org/00pnhhv55)
            Author notes
            Correspondence to: Mustufa Haider Abidi*, e-mail: mabidi@ksu.edu.sa , Tel.: +966-11-4698773, Fax: +966-11-4670969
            Author information
            https://orcid.org/0000-0001-5609-5705
            https://orcid.org/0000-0003-1909-2534
            https://orcid.org/0000-0002-3573-8385
            Article
            10.57197/JDR-2024-0066
            Copyright © 2024 The Authors.

            This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY) 4.0, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

            History
            : 06 March 2024
            : 23 May 2024
            : 23 May 2024
            Page count
            Figures: 12, Tables: 4, References: 31, Pages: 13
            Funding
            Funded by: King Salman Center for Disability Research
            Award ID: KSRG-2023-451
            The authors extend their appreciation to the King Salman Center for Disability Research (funder ID: http://dx.doi.org/10.13039/501100019345) for funding this work through Research Group no. KSRG-2023-451.

            Social policy & Welfare,Political science,Education & Public policy,Special education,Civil law,Social & Behavioral Sciences
            predictive model,dual kernel SVM,red fox optimization,attention mechanism,crossover strategy,alert system,assistive technology
