INTRODUCTION
Rehabilitation robots represent a significant advancement in medical technology, offering novel solutions in physical therapy and patient care (Tejima, 2001; Qian and Bi, 2015; Alotaibi and Alsubaie, 2023; Alsubaie and Alotaibi, 2023; Luo et al., 2023). These sophisticated machines are designed to assist patients in recovering motor functions lost due to injuries, strokes, or chronic illnesses. Their significance lies in their ability to provide consistent, precise, and personalized therapy sessions, which can lead to better recovery outcomes compared to traditional therapy methods (Yakub et al., 2014; Yu et al., 2015; Gassert and Dietz, 2018). The adaptive nature of these robots allows them to cater to the unique needs and progress of each patient, making them an invaluable tool in rehabilitative medicine. Moreover, they offer the advantage of data collection and analysis, enabling therapists to make informed decisions about the course of treatment. The psychological benefits, such as increased patient engagement and motivation, are also noteworthy. Rehabilitation robots can make therapy sessions more engaging and enjoyable, leading to improved adherence to therapy regimens and better outcomes (Moulaei et al., 2023; Su et al., 2023). In essence, rehabilitation robots are transforming the landscape of physical therapy by offering a precise, patient-centric approach to rehabilitation (Han et al., 2023; Su et al., 2023).
Controlling rehabilitation robots effectively is crucial for their success in patient therapy (Ahmed et al., 2019; Li et al., 2023). Model predictive control (MPC) emerges as a promising candidate due to its forward-looking nature and ability to handle multivariable systems with constraints (Drgoňa et al., 2020; Jahanshahi et al., 2021; Wang et al., 2022). MPC works by continuously solving an optimization problem that predicts future outputs and optimizes control inputs based on a model of the system. This predictive capability allows MPC to make informed decisions about the optimal control actions required to achieve desired outcomes (Karamanakos and Geyer, 2019). The precision and adaptability of MPC are particularly beneficial in scenarios where the robot must respond to varying degrees of patient effort, resistance, or fatigue. Moreover, MPC’s capability to handle constraints is crucial in ensuring that the robot operates within safety limits while achieving therapeutic objectives (Rosolia and Ames, 2020; Mariano-Hernández et al., 2021). Thus, MPC’s predictive capabilities, combined with its ability to handle multivariable systems and constraints, make it an ideal control strategy for rehabilitation robots, offering a sophisticated and nuanced approach to robotic control (Holkar and Waghmare, 2010).
Despite its advantages, MPC faces significant challenges in the context of rehabilitation robots. The primary issue is the computational intensity of MPC, especially when solving complex optimization problems in real time (Karamanakos et al., 2020; Garcia-Torres et al., 2021). Additionally, MPC requires an accurate model of the system it controls. In rehabilitation robotics, where the robot interacts with humans whose intentions and responses can be unpredictable, creating an accurate model is exceptionally challenging (Zhan and Chong, 2021). This unpredictability often leads to suboptimal control performance, as MPC struggles to adapt to the variability and complexity inherent in human–robot interactions. The need for real-time responses and the difficulty in accurately modeling human behavior are major obstacles in effectively implementing MPC in rehabilitation robotics, underscoring the need for innovative solutions to enhance its adaptability and efficiency (Xi et al., 2013; Hewing et al., 2020; Schwenzer et al., 2021).
To address these challenges, recent advancements in machine learning offer promising solutions. Gaussian processes (GPs) stand out as a potential game-changer. GPs are capable of providing reliable predictions of uncertainties and require fewer training samples, making them well-suited for dynamic and uncertain environments like those encountered in rehabilitation robotics (Wang et al., 2005; Zhao et al., 2022; Sauer et al., 2023). Despite their potential, the integration of GPs with MPC in rehabilitation robotics remains underexplored in the literature. This presents a unique opportunity to enhance the performance of MPC, making it more adaptable, efficient, and effective in managing the complexities of human–robot interactions in rehabilitation settings. The application of GPs in this context could lead to significant improvements in the control and effectiveness of rehabilitation robots, offering a more responsive and patient-tailored approach to therapy.
Motivated by the limitations inherent in traditional MPC for rehabilitation robotics, our study introduces a novel kriging-based MPC approach. Utilizing GPs—a core element of kriging—this methodology proficiently models the uncertain dynamics of rehabilitation robots. It adeptly captures their stochastic nature, enabling more precise predictions of future states, which are vital for the exacting control requirements of rehabilitation scenarios. By integrating this predictive model within the MPC framework, our approach addresses the inherent uncertainties and nonlinearities in rehabilitation robotics, surpassing conventional MPC methods. Additionally, it significantly reduces the computational load, a common challenge in traditional MPC. This is achieved by leveraging GPs to efficiently approximate complex dynamics without necessitating intensive real-time computations. This computational efficiency is particularly beneficial in the context of rehabilitation, where rapid response and adaptability are essential. Overall, our method not only improves control precision but also offers a more computationally viable solution, rendering it highly suitable for real-world rehabilitation robotics applications.
The study is organized as follows: the Dynamic of 2-DOF Knee Rehabilitation Robot section provides a detailed description of the dynamical model for a two degrees of freedom (2-DOF) knee rehabilitation robot, laying the foundational understanding of the system. The nonlinear model predictive control (NMPC) section covers the NMPC tailored for rehabilitation. The heart of the study, the Proposed Learning-based Control Technique section, introduces our proposed learning-based control technique, detailing its guaranteed convergence and advantages. The Simulation Results section presents the results of testing the proposed algorithm. Finally, the Conclusion section concludes the study, summarizing the key findings and suggesting directions for future research.
DYNAMIC OF 2-DOF KNEE REHABILITATION ROBOT
We focus on a knee rehabilitation robotic system with 2-DOF, as depicted in Figure 1. This configuration effectively models the biomechanical dynamics crucial for knee rehabilitation exercises. The system is constructed with two main links, each exhibiting its own degree of freedom, identified as q_1 for the upper link and q_2 for the lower link. Associated with these links are distinct masses and lengths, marked as m_1, l_1 for the first link and m_2, l_2 for the second. Springs are integrated at the joints, introducing an aspect of flexibility, likely intended to mimic the natural elasticity inherent in a human knee. The coordinate points (x_1, y_1) and (x_2, y_2) serve to pinpoint specific locations or reference points on each link.

Illustration of a 2-DOF knee rehabilitation robot. Springs are modeled at each joint, with q_1 and q_2 indicating the reference coordinates for kinematic analysis purposes. Abbreviation: 2-DOF, two degrees of freedom.
The kinetic energy of the 2-DOF knee rehabilitation robot can be mathematically expressed using the principles of classical mechanics. This formulation takes into account the movement and dynamics of both links in the robot. Each link contributes to the overall kinetic energy of the system based on its mass, length, and velocity. The kinetic energy equation incorporates terms representing the rotational and translational kinetic energies of each link, factoring in their respective degrees of freedom, masses, and velocities, and is given as follows:
where, in our mathematical representation, we define q_c as the sum of these coordinates, specifically q_c = q_1 + q_2. The term l_ci denotes the distance from joint i − 1 to the center of mass of the ith link, where i can be either 1 or 2. I_i represents the moment of inertia for the ith link, calculated about an axis that is orthogonal to the plane of the diagram and passes through the link’s center of mass. The potential energy of this system is formulated based on these parameters and is expressed as follows:
By employing Lagrange’s equation, we can derive the dynamic equations that govern the system as follows:
In this context, the inertia matrix, represented by I(ψ), is a symmetric, positive definite n × n matrix. Additionally, C_d(ψ, ψ̇) captures the combined effects of Coriolis and centripetal forces acting on the system. The gravitational forces exerted on the system are represented by G(ψ). The Jacobian matrix, denoted as J^T(ψ), is assumed to be nonsingular. The force vector, which is subject to constraints, is indicated by f(t), and the input torque applied to the system is denoted by τ(t). By applying Lagrange’s equation, the individual components that make up this motion equation [as outlined in Equation (3)] are obtained as follows:
The force vector and Jacobian matrix within the system are characterized in the following manner:
In order to transform the motion equation of the 2-DOF multi-input multi-output rehabilitation robot into a state space representation, we define z_1 as the position vector [q_1, q_2]^T and z_2 as the velocity vector [q̇_1, q̇_2]^T. With these definitions, the dynamic behavior of the robot can be expressed as follows:
Based on Equations (3)– (5), the subsequent derivations can be made:
The equation outlined above represents the state space of the system, a model where each component might encounter uncertainties. This aspect is particularly pertinent given the robot’s interaction with patients, where the forces and uncertainties involved can vary significantly from one individual to another. Relying exclusively on these dynamic motion equations for designing controllers in rehabilitation robots might prove impractical in real-world scenarios. At best, the outcomes could be less than ideal, owing to the inherent uncertainties of the system.
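To make the preceding state-space model concrete, the sketch below numerically evaluates a simplified two-link dynamic in the form of Equation (6). The point-mass inertia entries, the omission of the Coriolis, gravitational, spring, and constraint-force terms, and the default parameter values are simplifying assumptions for illustration, not the full components derived above.

```python
import numpy as np

def inertia(q, m1=2.0, m2=0.85, l1=0.2, l2=0.25):
    """Illustrative 2x2 inertia matrix I(q) for a two-link arm with
    point masses at the link ends (assumed form, not the paper's exact terms)."""
    c2 = np.cos(q[1])
    a = m1 * l1**2 + m2 * (l1**2 + l2**2 + 2.0 * l1 * l2 * c2)
    b = m2 * (l2**2 + l1 * l2 * c2)
    return np.array([[a, b], [b, m2 * l2**2]])

def dynamics(z, tau):
    """State z = [q1, q2, q1_dot, q2_dot]; returns z_dot.
    Coriolis, gravity, and spring terms are omitted for brevity."""
    q, qdot = z[:2], z[2:]
    qddot = np.linalg.solve(inertia(q), tau)  # solves I(q) q_ddot = tau
    return np.concatenate([qdot, qddot])

z0 = np.array([0.5, 0.5, -1.0, -1.0])
zdot = dynamics(z0, np.zeros(2))  # zero torque -> zero acceleration here
```

One forward-Euler step would then be `z0 + dt * dynamics(z0, tau)`, which is how state trajectories of this kind can be propagated in simulation.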
NONLINEAR MODEL PREDICTIVE CONTROL
NMPC is particularly adept at handling the complex and dynamic environments in which modern robots operate, offering an efficient control strategy compared to linear MPC. The cost function for the MPC, related to the state space model outlined in Equation (6), is established in the following manner:
The cost function includes the stage cost, symbolized by Q(z(k), τ(k)), and a terminal cost component, T(z(N)), which accounts for the cost associated with the final state. The cost function spans a prediction horizon N, covering the control inputs τ(0), τ(1), …, τ(N − 1) and the state variables z(0), z(1), …, z(N). While Equation (8) presents the system in a general format, the controller in this study is tailored specifically to affine systems. This method focuses on minimizing the cost function [Equation (8)] subject to the baseline model [Equation (6)] as follows:
where Z(0) = z_0 denotes the initial state, the vector Z = [z_1, z_2]^T represents the state space of the system, and h and w denote the discretized vectors corresponding to the state space model of the system. The augmented cost function, represented by J_N, extends the original cost function outlined in Equation (8) to encompass both the constraints and the associated Lagrange multipliers. This expanded cost function is derived by integrating the Hamiltonian function over the prediction horizon N. Consequently, the augmented cost function is articulated as follows:
In this setting, λ(k + 1) signifies the Lagrange multiplier corresponding to the dynamics in the baseline model detailed in Equation (6). When the NMPC problem, as outlined in Equation (9), is resolved, it yields the optimal baseline trajectories, indicated as Z_o(k) and τ_o(k). The optimization process involves the application of the Karush–Kuhn–Tucker (KKT) conditions, which are crucial for determining the necessary conditions of optimality. This application ensures the effective minimization of the augmented cost function shown in Equation (10). The expressions defining these optimality conditions are as follows:
By employing the KKT conditions and using the Lagrange multipliers λ(k + 1), the optimal baseline solutions Z(k) and τ(k) can be calculated in real time. Considering the KKT conditions, the relevant expressions are as follows:
Taking into account Equations (11) and (12), the calculation of the Lagrange multipliers λ(k + 1) can be accomplished as follows:
Furthermore, the Lagrange multipliers, as outlined in Equation (13), are regarded as the baseline optimal Lagrange multipliers, represented by λ(k).
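To illustrate the receding-horizon procedure of Equations (8)–(9), the sketch below solves a horizon-N problem by direct single shooting with a general-purpose nonlinear solver, rather than the KKT-based recursion above. The toy double-integrator model F, the horizon length, and the weights are our assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def F(z, tau, dt=0.05):
    """Assumed toy discrete-time model z(k+1) = F(z(k), tau(k)):
    a double integrator, z = [position, velocity]."""
    return np.array([z[0] + dt * z[1], z[1] + dt * tau])

def nmpc_step(z0, N=20, q=1.0, r=0.01):
    """Solve the horizon-N problem with quadratic stage cost and no
    terminal cost; return only the first optimal input tau(0)."""
    def cost(u):
        z, J = np.asarray(z0, dtype=float), 0.0
        for k in range(N):
            J += q * float(z @ z) + r * u[k] ** 2  # stage cost Q(z, tau)
            z = F(z, u[k])                          # predicted dynamics
        return J
    res = minimize(cost, np.zeros(N), method="L-BFGS-B")
    return res.x[0]

# Closed loop: re-solve the optimization at every sampling instant.
z = np.array([0.5, -1.0])
for _ in range(60):
    z = F(z, nmpc_step(z))
```

Only τ(0) is applied before the problem is re-solved at the next sample, which is the receding-horizon principle underlying NMPC.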
Our control strategy focuses on the stage cost function Q(z(k), τ(k)). We use a kriging-based MPC with a GP model for accuracy. This setup guides the system toward its goals, making a separate terminal cost T(z(N)) less critical. In the following section, details of the proposed scheme are provided.
PROPOSED LEARNING-BASED CONTROL TECHNIQUE
In this section, we first outline our tailored GP for the control strategy, followed by a detailed presentation of our proposed control law, including its design and convergence properties. Additionally, we discuss the advantages of our approach over existing methods, highlighting its unique benefits and potential impact in complex control systems.
Formulation of GPs
Data gathered from laboratory experiments or computer simulations are often susceptible to noise and disturbances. This noise can lead to variations in the input–output relationship of the system, which can be conceptualized as a single realization of a random process, denoted by GP( Z):
where the unknown mean function is denoted as m(Z). This function can be constructed using a predetermined collection of basis functions, m(Z) = [m_1(Z), …, m_h(Z)], whose unknown coefficients μ = [μ_1, …, μ_h]^T must be estimated. Additionally, there is a zero-mean GP with a parametric covariance function defined as c_n(Z).
The process variance is symbolized as σ², and the correlation function is denoted by k(·), where k(Z*, Z′) measures the similarity between Z* and Z′. Several correlation functions have been proposed in the literature, but the Gaussian correlation function is the most commonly employed. It can be represented as follows:
The hyperparameters are represented as σ² and Ω, where Ω is a diagonal matrix. To perform GP modeling, it is crucial to estimate the parameters μ, Ω, and σ². This estimation can be achieved through maximum likelihood estimation, a statistical technique that estimates model parameters by maximizing the likelihood function. In the context of GP modeling, the multivariate Gaussian likelihood function is used; it is defined as the probability density function of a multivariate Gaussian distribution, determined by the mean vector and the covariance matrix. To estimate the unknown parameters in the parametric mean and covariance function within the GP formulation, several approaches have been investigated in the literature. While some studies opt for maximum likelihood estimation, others integrate prior information and employ maximum a posteriori estimation. For comprehensive insights into these methods, we suggest referring to seminal papers in this field, including Daemi et al. (2019). The likelihood function is expressed as follows:
where log(·) denotes the natural logarithm, the correlation matrix is represented by K, with elements K_ij = k(Z_i, Z_j) for i, j = 1, …, n, and τ_n = [τ_1, …, τ_n]. Additionally, M is an n × h matrix with (p, q)th element M_pq = m_q(Z_p), where p = 1, …, n and q = 1, …, h.
To derive estimations of the hyperparameters μ̂, σ̂², and Ω̂, numerical optimization techniques are employed to minimize Equation (17). A diverse array of global optimization methodologies has been applied to this task in the engineering literature, including particle swarm optimization (PSO), genetic algorithms (GAs), pattern search, and gradient-based optimization (Arı et al., 2012; Filippov et al., 2020; Jakubik et al., 2021; Paulson and Lu, 2022; Rajwar et al., 2023). It is essential to underscore that gradient-based optimization methods are frequently favored for maximizing the likelihood of GP models, primarily because of their computational efficiency and their capacity to navigate the intricate optimization landscapes posed by complex engineering models.
Utilizing GP modeling empowers us to derive the mean and variance of the predicted probability distribution at any input point Z*. To be precise, the mean prediction is calculated using the posterior mean function, which is a linear combination of the training data, with weights determined by the covariance between the training data and the test point. Conversely, the variance of the prediction is ascertained by the posterior covariance function. This covariance function quantifies the uncertainty in the prediction arising from both the noise in the data and the selection of the kernel function. The expressions for the mean and variance of the prediction are as follows:
where we have a set of functions denoted as M(Z*) = [m_1(Z*), …, m_h(Z*)]. For each training data point Z_i, we can compute the value c(Z*, Z_i) = σ̂²k(Z*, Z_i), and by collecting these values for all n training data points, we obtain a column vector c_n(Z*) [where c_n(Z*) quantifies the similarity of Z* with respect to all training data points]. By utilizing these similarity values among the training data points, we can construct a covariance matrix V, whose (i, j)th element is given by σ̂²k(Z_i, Z_j).
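A minimal numerical sketch of this workflow, for scalar inputs and a zero mean function, is given below. The Gaussian correlation function follows Equation (16); the log-space parameterization of the hyperparameters and the small nugget added for numerical stability are our choices, and the posterior mean and variance follow the expressions above.

```python
import numpy as np
from scipy.optimize import minimize

def corr(X1, X2, omega):
    """Gaussian correlation k(z, z') = exp(-omega * (z - z')^2), 1-D inputs."""
    d = X1[:, None] - X2[None, :]
    return np.exp(-omega * d ** 2)

def neg_log_lik(theta, X, y):
    """Negative log marginal likelihood; theta = log([omega, sigma^2])."""
    omega, sig2 = np.exp(theta)
    K = sig2 * corr(X, X, omega) + 1e-8 * np.eye(len(X))  # nugget for stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * float(y @ alpha) + np.sum(np.log(np.diag(L)))

def fit_predict(X, y, Xs):
    """Fit (omega, sigma^2) by maximum likelihood, then return the
    posterior mean and variance at the test points Xs."""
    theta = minimize(neg_log_lik, np.log([10.0, 1.0]), args=(X, y)).x
    omega, sig2 = np.exp(theta)
    K = sig2 * corr(X, X, omega) + 1e-8 * np.eye(len(X))
    ks = sig2 * corr(Xs, X, omega)              # c_n(Z*) for each test point
    mean = ks @ np.linalg.solve(K, y)           # posterior mean
    var = sig2 - np.sum(ks * np.linalg.solve(K, ks.T).T, axis=1)  # posterior var
    return mean, var

X = np.linspace(0.0, 1.0, 8)
y = np.sin(2.0 * np.pi * X)
m, v = fit_predict(X, y, X)  # predict back at the training points
```

At the training points themselves the posterior mean interpolates the data and the posterior variance collapses toward zero; it is this predictive standard deviation that the control design in the next subsection exploits.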
Robust learning-based technique with guaranteed convergence
The system error is defined as e = Z − Z_d, with Z_d representing the desired reference trajectory. The control law for robust tracking is proposed as follows:
where τ_GP represents the optimal control term as estimated by a trained GP. The term δσ(Z, Z_n)e + μσ(Z, Z_n)sign(e) is a correction term introduced to address uncertainties and errors in the GP, where δ and μ are user-defined parameters constrained to positive values. Z_n represents the training data utilized in the GP. This approach allows for the integration of predicted uncertainties, deviating from the optimal control law only when necessary, thus ensuring consistent operation amidst unpredicted disturbances or model inaccuracies. The control law accounts for the discrepancies between the training data and the current state of the system, effectively leveraging the inherent benefits of GPs.
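Under the assumption that the trained GP supplies both a control estimate τ_GP and a predictive standard deviation σ(Z, Z_n), the control law of Equation (19) can be sketched as follows; the default values of δ and μ are placeholders, not tuned gains.

```python
import numpy as np

def control(z, z_d, tau_gp, sigma, delta=5.0, mu=1.0):
    """Equation (19): tau = tau_GP + delta*sigma*e + mu*sigma*sign(e),
    with e = Z - Z_d. tau_gp is the GP mean prediction and sigma the
    GP predictive standard deviation at the current state (assumed inputs).
    The sign convention of the correction follows the plant's error dynamics."""
    e = np.asarray(z, dtype=float) - np.asarray(z_d, dtype=float)
    return tau_gp + delta * sigma * e + mu * sigma * np.sign(e)

# With sigma = 0 (full confidence in the GP), the correction vanishes:
tau = control(np.array([1.0, 1.0]), np.zeros(2),
              tau_gp=np.array([0.3, 0.3]), sigma=0.0)
```

When σ(Z, Z_n) is small, i.e., the query lies close to the training data, the GP term dominates and the correction is negligible; far from the data, σ grows and the correction takes over.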
Theorem 1
The proposed intelligent control law [ Equation (19)], utilizing a GP trained to generate optimal control signals and augmented with a correction term, ensures the convergence of the closed-loop system even when operating conditions deviate significantly from its trained state.
Proof
When the system operates outside of ideal conditions, applying the proposed control signal to the system’s dynamics allows for the deliberate management of the time-dependent error, N(t), within the system.
where τ_n represents the ideal control action for the system. However, due to uncertainties, errors in the GP model, inadequate training data, and external disturbances, applying τ_n precisely as intended is often not feasible. This leads to an error dynamic, N(t), highlighting the deviation from the desired behavior. The applied control action τ can be decomposed as τ = (τ − τ_n) + τ_n, separating the ideal action from the deviation. Theoretically, τ_n alone would eliminate the error dynamic, but discrepancies between the proposed GP-based controller signal and the expected optimal controller make some error inevitable. These practical limitations necessitate adjustments to the actual control action τ, thus influencing the error dynamic as follows:
Now, we should show that the correction term is able to suppress this error. We define Ñ = I⁻¹(τ_n − τ); consequently, the error dynamic is as follows:
Now, consider a Lyapunov function candidate denoted as V, which is expressed as follows:
The time derivative of the Lyapunov function V is expressed as follows:
The parameter μ should be chosen such that μσ(Z, Z_n) > |Ñ|. Consequently, we have
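For reference, taking V = ½e² and reading the error dynamic above as ė = Ñ − δσ(Z, Z_n)e − μσ(Z, Z_n)sign(e) (our reconstruction from the surrounding definitions), the derivative computation proceeds as:

```latex
\dot{V} = e\,\dot{e}
        = e\,\tilde{N} - \delta\,\sigma(Z, Z_n)\,e^{2} - \mu\,\sigma(Z, Z_n)\,|e|
        \le -\delta\,\sigma(Z, Z_n)\,e^{2}
            - \bigl(\mu\,\sigma(Z, Z_n) - |\tilde{N}|\bigr)|e| < 0,
```

which is negative definite whenever μσ(Z, Z_n) > |Ñ|, the condition stated in Remark 1.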
Given that both δ and σ(Z, Z_n) are positive, we can use Equation (19) to confirm that the convergence of the states of the closed-loop system toward the equilibrium point is assured. This conclusion aligns with the Lyapunov stability theorem, thereby completing the proof of our system’s stability.
Remark 1
The user-defined parameters δ and μ must comply with predetermined constraints to ensure the reliability of the results and the stability of the model. Specifically, δ should remain positive. Moreover, μ, apart from being positive, must satisfy an additional requirement: μσ(Z, Z_n) must exceed |Ñ|. This condition is essential for upholding Lyapunov stability.
Remark 2
In the realm of MPC, accurate system equations are crucial for ensuring convergence. Traditionally, this implies a need for precise knowledge of all parameters and functions, as indicated in Equation (9) of this paper. However, our proposed algorithm leverages kriging metamodeling to address the inherent uncertainties in system dynamics, enabling the proof of system convergence even in their presence. This approach not only acknowledges the limitations of relying on exact equations but also illustrates the robustness of our methodology in accommodating and adapting to uncertainties, as initially outlined in the Remark 1 section.
Comparison with other techniques
GPs hold a distinct place among algorithmic techniques, especially when compared with methods such as neural networks, deep learning, PSO, and GAs. Deep learning is adept at handling high-dimensional data and has revolutionized fields such as image processing, whereas GPs, owing to their nonparametric nature, provide a valuable probabilistic perspective for both supervised and unsupervised learning. In complex and noncontinuous search spaces, PSO and GAs may excel where traditional gradient-based methods falter, yet they lack the probabilistic depth of GPs.
A crucial advantage of GPs is their ability to estimate uncertainty. This characteristic is vital for our novel control approach, where it is leveraged for intelligent, efficient, and optimally converged control. The strengths of GPs are multifaceted: they inherently offer a framework for interpretability, crucial for understanding uncertainties in predictions; they are efficient with limited data, making credible predictions in contrast to the extensive data requirements of deep learning models; in contexts like Bayesian optimization, their intrinsic capacity for uncertainty quantification is exceptionally valuable; and with careful kernel and hyperparameter selection, GPs are less prone to overfitting, beneficial in sparse data scenarios.
The efficiency of kriging in sparse data environments allows for the generation of reliable models without the need for extensive data collection, presenting a significant advantage over neural networks, which typically require large volumes of training data to achieve comparable levels of accuracy. Moreover, kriging’s direct estimation of uncertainty is especially valuable in control applications, where understanding the confidence in predictions can critically influence system stability and performance. Additionally, the minimal requirement for parameter tuning in kriging models, as opposed to the often labor-intensive optimization process for neural networks, further validates our choice, enabling quicker deployment and adaptability in varying control scenarios.
By harnessing these attributes, especially in uncertainty estimation, our approach utilizes GPs for intelligent and rapid control optimization. This not only ensures adaptability under diverse conditions but also guarantees the convergence of the system, a critical aspect in engineering applications demanding accuracy, efficiency, and reliability. This integration of GPs into control systems represents a significant leap forward, showcasing the versatility and robustness of GPs in addressing complex engineering challenges.
One challenge with machine learning methods, such as the kriging algorithm, is their limited performance in beyond-the-data-range scenarios ( Yousefpour et al., 2022). However, our proposed approach mitigates this through a correction term. As long as the error remains within the acceptable range, as outlined in the theoretical framework and the Remark 1 section, the method’s performance and convergence are assured, even when extrapolating.
SIMULATION RESULTS
In this section, we assess the efficacy of our proposed method using two illustrative examples. Example 1 showcases the performance of our approach when the GP is trained with a comprehensive dataset. Conversely, example 2 delves into a scenario where the GP is trained on limited datasets. Here, the initial conditions are significantly divergent from those in the dataset. This exploration serves to highlight the robustness of our technique in managing unexpected system dynamics under varying initial conditions. This comprehensive evaluation demonstrates the adaptability and reliability of the proposed method in diverse data environments. The system parameters are defined as follows: l_1 = 0.2 kg·m², l_2 = 0.25 kg·m², m_1 = 2 kg, and m_2 = 0.85 kg. In numerous methodologies, especially with neural networks, precision in specifying system parameters is essential. Our approach, however, tolerates errors due to inaccuracies and uncertainties in these parameters. Provided the resulting uncertainty remains within the acceptable limits detailed in the Remark 1 section, our method ensures satisfactory performance and assures convergence.
Training optimal GP
To generate the dataset, we implemented a range of initial conditions, introducing variations by randomly modulating their amplitudes. Following this, MPC was employed to determine the nominal optimal control inputs, which were then measured and recorded as the output. Consequently, the dataset encompasses both the position and velocity of each link (as input parameters) and the corresponding optimal control signals (as outputs).
For the training of the GP, this collected dataset was utilized in an offline setting. Emphasizing the richness of the training data, we incorporated 2000 distinct sets of initial conditions to ensure comprehensive learning. Figure 2 illustrates the diversity of the training dataset, showcasing how the system responds to varied initial conditions. This extensive training approach is critical for the robust performance of the GP in diverse operational scenarios.
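The data-collection loop described above can be sketched as follows. The `solve_mpc` stub stands in for the nominal MPC of the Nonlinear Model Predictive Control section; the state-feedback form used there is purely illustrative, as is the uniform sampling of initial conditions.

```python
import numpy as np

rng = np.random.default_rng(0)

def solve_mpc(z):
    """Placeholder for the nominal NMPC of Equation (9); the linear
    state feedback below is an illustrative stand-in only."""
    return -0.5 * z[:2] - 0.2 * z[2:]

def build_dataset(n_conditions=2000, amplitude=4.0):
    """Sample random initial conditions, record (state, optimal torque)
    pairs: states are the GP inputs, torques the GP outputs."""
    X, Y = [], []
    for _ in range(n_conditions):
        z0 = rng.uniform(-amplitude, amplitude, size=4)  # [q1, q2, q1d, q2d]
        X.append(z0)
        Y.append(solve_mpc(z0))
    return np.array(X), np.array(Y)

X, Y = build_dataset()  # 2000 initial conditions, amplitudes in [-4, 4]
```

Scaling `amplitude` down by a factor of 10⁻³ reproduces the limited-dataset setting of example 2, where the controller must extrapolate beyond the training range.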
Example 1: performance evaluation with a rich dataset
For this experiment, the initial states were set to [q_1(0), q_2(0), q̇_1(0), q̇_2(0)] = [0.5, 0.5, −1, −1]. Figures 3 and 4 provide a clear illustration of the controller’s effectiveness in handling the rehabilitation system. As depicted in the figures, the system rapidly converges to the desired state, underscoring the efficiency of our proposed technique. This rapid convergence is indicative of the controller’s robust design and its capability to handle varying operational conditions.

Time response of the first link of the rehabilitation robot. This behavior is shown in a case where the GP is trained on a comprehensive dataset, with both links having a reference point at the origin and initial conditions set as [q_1(0), q_2(0), q̇_1(0), q̇_2(0)] = [0.5, 0.5, −1, −1]. Abbreviation: GP, Gaussian process.

The temporal response of the rehabilitation robot’s second link. Displayed in a scenario where the GP has been trained with an extensive dataset, this behavior illustrates both links set to a reference point at the origin and initial conditions defined as [q_1(0), q_2(0), q̇_1(0), q̇_2(0)] = [0.5, 0.5, −1, −1]. Abbreviation: GP, Gaussian process.
Figure 5 showcases the control signal as produced by our proposed method, specifically through the application of the control law outlined in Equation (19). To demonstrate how each component of the control signal [the optimal signal derived from the GP and the corrective term from the control law in Equation (19)] contributes to system stabilization, these elements are depicted separately. The figure underscores the controller’s steady performance, ensuring that operations stay within a desirable range. Predominantly, the system’s regulation is driven by the signal from the GP, with the corrective signal’s magnitude being significantly smaller. This reduced need for correction is due to the depth and quality of our training data, which effectively reduces uncertainties in the GP-generated control signal.
Example 2: performance evaluation with a limited dataset
In this example, our objective is to assess the controller’s performance when our proposed algorithm is trained using a suboptimal dataset. While maintaining the same number of data points as in the previous section, we deliberately reduced the amplitude of the initial conditions by a factor of 10⁻³. Specifically, while Figure 2 displays initial-condition amplitudes ranging randomly from −4 to 4, here the amplitudes range from −0.004 to 0.004. This adjustment ensures that during training, the model is exposed to only minimal fluctuations in the initial conditions, essentially encountering beyond-the-data-range scenarios at test time. As discussed in the Comparison with Other Techniques section, machine learning models often struggle in such situations. In terms of computational cost, both examples are within the same range, as we utilize a dataset of the same size, albeit with the added challenge of poor-quality data.
Given the reduced accuracy of the GP signal in such a scenario, we hypothesize a more pronounced role for the correction term. Figures 6 and 7 illustrate the state trajectories for the first and second links of the rehabilitation robot under our advanced control strategy. The initial states were set to [q_1(0), q_2(0), q̇_1(0), q̇_2(0)] = [1.5, −1, 0, 0]. The figures effectively highlight the robust adaptive controller’s efficiency, despite the limitations of the training dataset. This showcases the controller’s ability to maintain system stability and achieve desired performance levels under less-than-ideal training conditions.

Temporal response of the first link of the rehabilitation robot, illustrated in a scenario where the GP is trained on a limited dataset. In this case, both links are aligned with a reference point at the origin, and initial conditions are established as [q_1(0), q_2(0), q̇_1(0), q̇_2(0)] = [1.5, −1, 0, 0]. Abbreviation: GP, Gaussian process.

Time response of the rehabilitation robot’s second link, displayed in a situation where the GP is trained using a restricted dataset. Here, both links are set with a reference point at the origin, and the initial conditions are defined as [q_1(0), q_2(0), q̇_1(0), q̇_2(0)] = [1.5, −1, 0, 0]. Abbreviation: GP, Gaussian process.
Figure 8 depicts the control signals generated by our proposed control method. The graph presents a clear visual of the total control input and its individual components: the optimal term derived from the GP model and the correction term. It is observable that after an initial transient phase, both the optimal term and the correction term stabilize and closely align with the total control input, indicating a robust control strategy that swiftly counterbalances any uncertainties. The minimal deviation between the optimal and correction terms throughout the majority of the time horizon reflects the controller’s capability to ensure consistent performance, even when faced with the intrinsic limitations of the training dataset. This alignment showcases the effectiveness of the control scheme in adapting to and compensating for the dynamics of the system, confirming the precision and reliability of the GP model within the control framework.
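The decomposition of the total input into a GP-based optimal term and a correction term can be sketched on a toy scalar system. The error dynamics, gains, and correction law below are illustrative assumptions rather than the paper’s exact formulation; the sketch only reproduces the qualitative behavior seen in Figure 8, where the correction effort decays after the initial transient once the GP term accounts for the dynamics.

```python
# Toy sketch (hypothetical gains and dynamics): u_total = u_opt + u_corr,
# where u_opt cancels the GP-predicted dynamics and u_corr is an
# error-driven robust correction term.

def step(e, gp_mean, k_corr=5.0, dt=0.01):
    u_opt = -gp_mean            # optimal term from the GP model
    u_corr = -k_corr * e        # correction term driven by tracking error
    u = u_opt + u_corr
    f = 0.5 * e                 # "true" unknown dynamics in this toy system
    e_next = e + dt * (f + u)   # forward-Euler error update
    return e_next, u, u_opt, u_corr

e = 1.5                         # matches the simulated q1(0)
corr_history = []
for _ in range(200):            # two time units at dt = 0.01
    e, u, u_opt, u_corr = step(e, gp_mean=0.5 * e)  # GP assumed accurate
    corr_history.append(abs(u_corr))
# With an accurate GP, the correction effort decays with the tracking
# error; with a poor GP (biased gp_mean), u_corr must remain larger.
```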

Temporal progression of the control signals generated by the proposed control technique, implemented on both the first and second links of the rehabilitation robot. This display is in the context of a scenario where the GP has been trained with a limited dataset. Abbreviation: GP, Gaussian process.
A key aspect of this controller is its operational independence from the quality of the training data. It does not require prior knowledge about the training sample’s quality or how it might affect performance, which is a significant advantage in practical applications. This feature is particularly vital in the context of rehabilitation robotics, where fast, reliable, and adaptive responses are essential. The controller’s ability to handle uncertainties and adapt to varying conditions demonstrates its potential for reliable and effective use in scenarios where precise and adaptable control is crucial.
Comparison of simulation results
A side-by-side comparison of the simulations detailed in the Example 1: Performance Evaluation with a Rich Dataset and Example 2: Performance Evaluation with a Limited Dataset sections highlights the controller’s adaptability to the quality of the training data. With the rich dataset of Example 1, the system achieved rapid convergence with negligible corrective input, suggesting high predictive accuracy (Figures 3 and 5). Conversely, the limited dataset of Example 2 led to increased reliance on the correction term to achieve stability, as reflected in Figures 6 and 8. Despite this, effective control was maintained, demonstrating the controller’s robustness against data insufficiency. This comparative analysis accentuates the method’s resilience, maintaining efficacy across variable data conditions.
As shown in both examples in our simulations, the controller consistently achieved state stabilization in less than one time unit, a remarkable feat that underscores its robustness and efficiency. In summary, the results of our simulations provide strong evidence of the effectiveness of the control methodology we have developed for tracking control in rehabilitation robots, especially in situations involving unknown dynamics. This approach not only ensures the stability of the system but also enhances its robustness in the face of uncertainties and variable conditions. These attributes are particularly beneficial in real-world settings, where unpredictable system behaviors are common. The practicality and efficiency of our proposed control strategy make it a compelling solution for the dynamic and often unpredictable world of rehabilitation robotics.
CONCLUSION
In this study, we introduced a novel approach in the realm of rehabilitation robotics, merging the strengths of machine learning with classical control methods to meet the complex demands of this fast-evolving field. Our study emphasizes the need for a control mechanism that is not only reliable and precise but also adaptable to the varied needs of individual patients while ensuring the highest safety and effectiveness standards. Central to our methodology is the innovative use of kriging-based MPC, which benefits from the advanced capabilities of GPs. We have augmented this system with a correction term, designed to bolster robustness and reliability. This addition is critical, enabling the system to adapt and remain resilient against unforeseen challenges that were not encountered during its training phase. Grounded in the Lyapunov stability theorem, our approach has a solid theoretical underpinning that guarantees our controller’s stability. Numerical validations through two simulations have exhibited the efficacy and efficiency of our control scheme, even with a GP trained on a limited dataset. These simulations have illustrated not only the controller’s ability to manage unknown or dynamically evolving uncertainties but also its robust performance across various challenging conditions. The comparative analysis between the Example 1: Performance Evaluation with a Rich Dataset section and the Example 2: Performance Evaluation with a Limited Dataset section underscores our controller’s remarkable adaptability to varying data quality, exhibiting resilience and maintaining high performance under diverse conditions. This adaptability, showcased through rapid convergence with rich datasets and effective control with limited data, highlights the potential for further refining and testing the controller in broader application scenarios. Future work could explore enhancing this adaptability, potentially setting new benchmarks in control system efficiency and robustness.
Although our research represents a significant advancement in the control of rehabilitation robots, there is still ample room for further development. One promising avenue is the integration of finite-time control strategies within our kriging-based MPC framework. Such an approach has the potential to significantly enhance the agility and performance of rehabilitation robots, optimizing their function across a range of clinical settings. Additionally, the inherent uncertainty quantification provided by the kriging algorithm opens another compelling direction for future investigation. Leveraging this uncertainty data could lead to advancements in adaptive control strategies, further improving both the precision and responsiveness of the controller to achieve superior performance and agility in dynamic environments.
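One way to make the uncertainty-driven direction above concrete is to map the GP’s predictive standard deviation to the correction gain, so the robust term dominates only where the model is least certain. The gain law below is a hypothetical illustration of this idea, not a result from this study; the gain bounds and reference scale are assumed values.

```python
import numpy as np

def adaptive_gain(gp_std, k_min=1.0, k_max=10.0, std_ref=1.0):
    # Map predictive uncertainty to a correction gain in [k_min, k_max]:
    # low std (confident GP) -> small gain, high std -> gain saturates.
    # All parameters here are illustrative assumptions.
    w = np.clip(gp_std / std_ref, 0.0, 1.0)
    return k_min + (k_max - k_min) * w

print(adaptive_gain(0.0))   # confident model -> minimal correction gain
print(adaptive_gain(2.0))   # uncertain model -> gain saturates at k_max
```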