# Continual online learning-based optimal tracking control of nonlinear strict-feedback systems: application to unmanned aerial vehicles

*Complex Eng Syst* 2024;4:4.

## Abstract

A novel optimal trajectory tracking scheme is introduced for nonlinear continuous-time systems in strict feedback form with uncertain dynamics by using neural networks (NNs). The method employs an actor-critic-based NN backstepping technique for minimizing a discounted value function along with an identifier to approximate unknown system dynamics that are expressed in augmented form. Novel online weight update laws for the actor and critic NNs are derived by using both the NN identifier and the Hamilton-Jacobi-Bellman residual error. A new continual lifelong learning technique utilizing the Fisher Information Matrix via the Hamilton-Jacobi-Bellman residual error is introduced to obtain the significance of weights in an online mode and thereby overcome the issue of catastrophic forgetting for NNs, and closed-loop stability is analyzed and demonstrated. The effectiveness of the proposed method is shown in simulation by contrasting it with a recent method from the literature on an underactuated unmanned aerial vehicle, covering both its translational and attitude dynamics.

## Keywords

Optimal control, neural networks, unmanned aerial vehicles, strict-feedback systems

## 1. INTRODUCTION

Optimal control of nonlinear dynamical systems with known and uncertain dynamics is an important field of study due to numerous practical applications. Traditional optimal control methods^{[1,2]} for nonlinear continuous-time (CT) systems with known dynamics often require the solution to a partial differential equation, referred to as the Hamilton-Jacobi-Bellman (HJB) equation, which generally cannot be solved analytically. To address this challenge, actor-critic designs (ACDs) combined with approximate dynamic programming (ADP) have been proposed as an online method^{[3,4]}. Numerous optimal adaptive control (OAC) techniques for nonlinear CT systems in strict-feedback form have emerged, leveraging the backstepping design outlined in^{[5,6]}. These approaches, however, require prior knowledge of the system dynamics. In real-world industrial settings, where system dynamics might be partially or completely unknown, neural network (NN)-based optimal tracking for uncertain nonlinear CT systems in strict-feedback form has been demonstrated in^{[5,7]}, utilizing the policy/value iterations associated with ADP. However, these policy/value iteration methods often require a large number of iterations within each sampling period to solve the HJB equation and obtain the optimal control input, which imposes a significant computational burden.

The optimal trajectory tracking of nonlinear CT systems involves obtaining a time-varying feedforward term to ensure precise tracking and a feedback term to stabilize the system dynamics. Recent optimal tracking efforts^{[7,8]} have utilized a backstepping-based approach with completely known or partially unknown system dynamics, but the design of the feedforward term while minimizing a cost function has not been addressed. Instead, a linear term is used to design the control input. More recent studies^{[8,9]} employed a positive function to obtain simple weight update laws for the actor and critic NNs, which also relaxes the persistency of excitation (PE) condition. However, finding such a function for the time-varying trajectory tracking problem of a nonlinear CT system is challenging when an explicitly time-dependent value function and HJB equation are used at each stage of the backstepping design, since the Hamiltonian is nonzero along the optimal trajectory^{[10]}. In^{[8,11,12]}, simplified and optimized backstepping control schemes were developed for a class of nonlinear strict-feedback systems. These approaches differ from the one proposed in^{[5]}. However, they require either complete or partial knowledge of the system dynamics.

Moreover, all control techniques rooted in NN-based learning, whether aimed at regulation or tracking, routinely face the issue of catastrophic forgetting^{[13]}. This is understood as the tendency of the system to lose previously acquired knowledge while assimilating new information^{[13,14]}. Continual lifelong learning (CLL) is conceived as the sustained ability of a nonlinear system to acquire, assimilate, and retain knowledge over prolonged periods without the interference of catastrophic forgetting. This concept is particularly critical for online NN control strategies for nonlinear CT systems, as these systems are often tasked with managing complex processes in dynamic and varying environments and conditions. Nonetheless, the lifelong learning (LL) methodologies shown in^{[13,15]} operate in an offline mode and have not yet been applied to real-time NN control scenarios. This offers a promising direction for leveraging the advantages of LL in online control systems, addressing catastrophic forgetting and thus progressively enhancing the efficacy of the control system. Implementing LL-oriented strategies in online NN control enables persistent learning and adaptation without discarding prior knowledge, thereby improving overall performance. By developing an LL-based NN trajectory tracking scheme, it is possible to continuously learn and track trajectories of interest without losing information about previous tasks.

This paper presents an optimal backstepping control approach that incorporates reinforcement learning (RL) to design the controller. The proposed method utilizes an augmented system to address the tracking problem, incorporating both feedforward and feedback controls, which sets it apart from prior work such as^{[8,16]}. It uses a trajectory generator to produce the reference trajectories and thereby handles the non-stationary condition in the HJB equation that arises in optimal tracking problems due to the time-varying reference trajectory. In addition, the proposed weight update laws are directly error-driven, obtained using the Hamiltonian and the control input error, in contrast to^{[8,16]}, where the weight update laws are obtained using positive functions. Furthermore, the control scheme incorporates an identifier, whose approximation error is bounded above in terms of the system states, to approximate the unknown system dynamics, as opposed to prior work such as^{[8,16]}, where the system dynamics are either completely or partially known. Additionally, an HJB equation is used at each step of the backstepping process so that the entire sequence of steps is optimized.

The paper also examines the impact of LL and catastrophic forgetting on control systems and proposes strategies for addressing these challenges in control applications. Specifically, the proposed method employs a weight velocity attenuation (WVA)-based LL scheme in an online manner, in contrast to prior work such as^{[13,15]}, which utilizes offline learning. Additionally, the stability of the LL scheme is demonstrated via Lyapunov analysis, in contrast to offline learning^{[13,15]}, where weight convergence is not addressed. To validate the effectiveness of the proposed method, an unmanned aerial vehicle (UAV) application is considered, and the proposed method is contrasted with an existing approach. Lyapunov stability analysis shows the uniform ultimate boundedness (UUB) of the overall closed-loop continual lifelong RL (LRL) scheme.

The contributions include:

(1) A novel optimal trajectory tracking control formulation is presented, utilizing an augmented system approach for nonlinear strict-feedback systems within an ADP-based framework.

(2) An NN-based identifier is employed, wherein the reconstruction error is presumed to be upper-bounded by the norm of the state vector, providing an enhanced approximation of the system dynamics. New weight update laws are introduced, incorporating the Hamiltonian and the NN identifier within an actor-critic framework at each step of the backstepping process.

(3) An online LL method is developed in the critic NN weight update law, mitigating both catastrophic forgetting and gradient explosion, with the significance of weights for NN layers obtained using the Fisher Information Matrix (FIM) determined by the Bellman error, as opposed to offline LL-based methods with targets.

(4) Lyapunov stability analysis is undertaken for the entire closed-loop tracking system, involving the identifier NN and the LL-based actor-critic NN framework to show the UUB of the closed-loop system.

## 2. CONTINUAL LIFELONG OPTIMAL CONTROL FORMULATION

In this section, we provide the problem formulation and the development of our proposed LRL approach for uncertain nonlinear CT systems in strict feedback form.

### 2.1. System description

Consider the following strict feedback system

where

**Assumption 1** (^{[17]}). *The nonlinear CT system is controllable and observable. In addition, the control coefficient matrix satisfies *

**Assumption 2** (^{[4]}). *The state vector is considered measurable. The desired trajectory *

Next, the LRL control design is introduced. The goal of the LRL control scheme is to achieve satisfactory tracking and maintain the boundedness of all closed-loop system signals while minimizing the control effort and addressing the issue of catastrophic forgetting.

The design of the control system begins by implementing an optimal backstepping approach using augmented system-based actor-critic architecture and then using an online LL to mitigate catastrophic forgetting.

### 2.2. Optimal backstepping control

To develop optimal control using the backstepping technique, first, a new augmented system is expressed in terms of tracking error as follows. Define the tracking error as

where

In order to get both the feedforward and feedback part of the controller, the tracking problem is changed to a regulation problem by defining a new augmented state as

where

Step 1: For the first backstepping step, let

where

**Remark 1**. *Generally, addressing trajectory tracking control problems poses considerable challenges, particularly when dealing with a system characterized by nonlinear dynamics and a trajectory that evolves over time. In such instances, a prevalent approach is to employ a discounted cost function, denoted as* (4)*, to render the cost index,* (4) *along the trajectories of the augmented system* (3)*. In addition, the performance function is not explicitly dependent on time.*
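The role of discounting noted in Remark 1 can be illustrated numerically. The following is a minimal sketch, not the paper's equation (4): it assumes a scalar state, a quadratic utility, and hypothetical names (`discounted_cost`, `gamma`, `q`, `r`). For a signal that does not decay, as is typical when following a time-varying reference, the undiscounted cost grows with the horizon while the discounted cost stays bounded.

```python
import math

def discounted_cost(states, controls, dt, gamma=0.5, q=1.0, r=1.0):
    """Riemann-sum approximation of
    V = integral_0^T exp(-gamma * t) * (q * x(t)^2 + r * u(t)^2) dt
    for sampled scalar trajectories (assumed quadratic utility)."""
    total = 0.0
    for k, (x, u) in enumerate(zip(states, controls)):
        t = k * dt
        total += math.exp(-gamma * t) * (q * x * x + r * u * u) * dt
    return total

# A persistent (non-decaying) signal: the undiscounted cost (gamma = 0)
# grows linearly with the horizon, while the discounted cost approaches
# q / gamma as the horizon increases.
dt = 0.01
for horizon in (10.0, 50.0):
    n = int(round(horizon / dt))
    xs, us = [1.0] * n, [0.0] * n
    print(horizon,
          discounted_cost(xs, us, dt, gamma=0.0),
          discounted_cost(xs, us, dt, gamma=0.5))
```

The discount factor trades off long-horizon accuracy against boundedness of the cost, so in practice it would be treated as a design parameter.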

By taking the time derivative on both sides of the optimal performance function (4), the tracking Bellman equation is obtained as

By noting that the first term of (5) is

Therefore, the tracking HJB equation is generated as

where

It is well known that NNs have universal function approximation abilities and can approximate a nonlinear continuous function

Since

where

where optimal Hamiltonian function

where

where the estimated Hamiltonian function

where

where

Therefore,

where

where

where

since the optimal Hamiltonian value is zero. Notice that the estimated Hamiltonian,

**Remark 2**. *The subsequent section will leverage the HJB residual error* (20) *to formulate the weight update laws for the critic NN. Additionally, the control input error *

*Employing the HJB residual error and the control input error for formulating the weight update laws is pivotal as it enables efficient optimization of the weights in NN. This methodology ensures more accurate and reliable learning processes, allowing the network to better approximate the desired functions or policies, thereby enhancing the overall performance and robustness of the system.*

**Step 2:**

Since

where

where

The HJB equation for step 2 is given by

where

Since

Therefore,

where

where

where

Since

where

where

where

Similarly, the actor NN will be designed to estimate the optimal control as

where

since the optimal Hamiltonian for the second step is zero, notice that the estimated Hamiltonian,

**Remark 3**. *The optimal control input can be obtained by utilizing the gradient of the optimal value function* (26) *and an NN identifier. As a result, the critic NN outlined in* (26) *may be utilized to determine the actor input without the need for an additional NN. However, for the purpose of simplifying the derivation of weight update rules and subsequent stability analysis, separate NNs are employed for the actor and critic.*
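The observation in Remark 3 can be sketched for a scalar input-affine system with a quadratic control penalty, where the standard stationarity condition gives u = -(1/(2r)) g dV/dx. The basis functions, the function names, and the linear check case below are illustrative assumptions and are not taken from the paper; the check uses a system whose optimal value function is known in closed form, so the Hamiltonian should vanish.

```python
import math

def basis(x):
    """Assumed critic basis phi(x) = [x^2, x^4]."""
    return [x ** 2, x ** 4]

def basis_grad(x):
    """Gradient of the assumed basis with respect to x."""
    return [2.0 * x, 4.0 * x ** 3]

def control_from_critic(w, x, g, r):
    """u = -(1 / (2 r)) * g * dV/dx with V(x) ~= w^T phi(x)."""
    dV = sum(wi * gi for wi, gi in zip(w, basis_grad(x)))
    return -0.5 / r * g * dV

def hamiltonian(w, x, u, f, g, q, r, gamma=0.0):
    """Discounted Hamiltonian q x^2 + r u^2 - gamma V + dV/dx (f + g u)."""
    V = sum(wi * pi for wi, pi in zip(w, basis(x)))
    dV = sum(wi * gi for wi, gi in zip(w, basis_grad(x)))
    return q * x * x + r * u * u - gamma * V + dV * (f + g * u)

# Linear check case x_dot = -x + u with q = r = 1 and gamma = 0, where
# the exact value function is V = p x^2 with p = sqrt(2) - 1.
p = math.sqrt(2.0) - 1.0
w_star = [p, 0.0]
x = 2.0
u = control_from_critic(w_star, x, g=1.0, r=1.0)
print(u, hamiltonian(w_star, x, u, f=-x, g=1.0, q=1.0, r=1.0))
```

With estimated rather than ideal weights, the same Hamiltonian expression is nonzero, and that residual is the quantity used to drive the critic tuning.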

Next, an NN identifier will be used to approximate the unknown dynamics given by (3) and (22).

### 2.3. NN identifier

A single-layer NN is used to approximate both the nonlinear functions^{[17]}. Then, by using

where

**Assumption 3** (^{[17]}). *The NN identifier is of single-layer, and its reconstruction error is bounded above such that *

**Remark 4**. *Because *

^{[18]}, where

Define the dynamics of the NN identifier as

where

where

**Remark 5**. *The NN identifier weights are tuned by using both the augmented state estimation error and the input vector. The boundedness of the control input is needed to show the convergence of the NN identifier if the proof is shown separately, whereas this assumption is relaxed when the identifier is combined with the LRL control scheme, as shown in the next section.*

**Remark 6**. *Since the control input error and the HJB error used to tune the actor-critic NN weights require the system dynamics, which are uncertain, the NN identifier is used to approximate the unknown dynamics of the augmented system. The estimated values from the identifier are used in the actor-critic weight update laws to tune the NN weights, as shown in the subsequent section.*
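To make the identifier idea concrete, the following is a minimal sketch of a single-layer (linear-in-weights) identifier for a scalar plant, simulated with forward Euler. It is not the paper's update law (38): the plant, the regressor `sigma`, the observer gain `k`, the learning rate `alpha`, and the leakage gain `kappa` are all assumptions chosen for illustration. The weights are driven by the state estimation error and a regressor that contains the input, in the spirit of Remark 5.

```python
import math

def simulate_identifier(T=20.0, dt=1e-3, alpha=10.0, k=5.0, kappa=0.01):
    """Euler simulation of a single-layer (linear-in-weights) identifier.

    Plant (treated as unknown):  x_dot  = -x + 0.5*sin(x) + u
    Identifier:                  xh_dot = w_hat^T sigma(x, u) + k*(x - xh)
    Weight tuning:               w_hat_dot = alpha*sigma*(x - xh) - kappa*w_hat
    """
    x, xh = 0.5, 0.5
    w_hat = [0.0, 0.0, 0.0]
    steps = int(round(T / dt))
    sq_err_sum = 0.0
    for i in range(steps):
        t = i * dt
        u = math.sin(t) + 0.5 * math.sin(2.3 * t)   # test input signal
        sigma = [x, math.sin(x), u]                  # assumed regressor
        err = x - xh                                 # state estimation error
        sq_err_sum += err * err
        x_dot = -x + 0.5 * math.sin(x) + u
        xh_dot = sum(w * s for w, s in zip(w_hat, sigma)) + k * err
        # Gradient-type tuning with a sigma-modification (leakage) term.
        w_hat = [w + dt * (alpha * s * err - kappa * w)
                 for w, s in zip(w_hat, sigma)]
        x += dt * x_dot
        xh += dt * xh_dot
    return sq_err_sum / steps, w_hat

mse, w_final = simulate_identifier()
print("mean squared estimation error:", mse)
print("identified weights:", w_final)
```

In this sketch the regressor can represent the plant exactly, so there is no reconstruction error; with a generic basis, the estimation error would instead settle in a neighborhood whose size depends on the reconstruction error and the leakage gain.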

### 2.4. Actor critic NN weight tuning

In this section, the actor-critic weight update laws are obtained by applying the gradient descent method to the Hamiltonian-based performance function. The following lemma is stated.

**Lemma 1**. *Consider a system* (1)*, transformed system* (3)*, NN identifier weight update laws* (38)*, the update laws for the critic NN* (12)*,* (26) *and actor NN* (19)*,* (34)*. They can be written as*

*where* (20)*,* (35)*,* $\hat{\wedge}_{j}=\hat{\mathcal{G}}_{sj}r_{j}^{-1}\hat{\mathcal{G}}^{\top}_{sj}$*, and* $\hat{\mathcal{F}}_{sj}$, $\hat{\mathcal{G}}_{sj}$ *are approximated by using the NN identifier.*

**Proof:** The weight update laws for critic NN in step

By using the gradient descent algorithm, the weight update law can be obtained as

On simplifying (42), we obtain the weight update law for the critic NN, as shown in Lemma 1. The weight update law for the actor NN is obtained by defining the performance function as

By using the gradient descent approach, the weight update law for an actor NN is obtained as

On further solving and adding the stabilization terms, we will get the weight update law shown in Lemma 1.

**Remark 7**. *The weight update laws are obtained by applying the gradient descent method to the Hamiltonian-based performance function. The weight update equations for the critic and actor have additional terms to ensure stability and facilitate the convergence proof. The last term, known as the sigma modification term, relaxes the PE condition needed to ensure weight convergence. It is important to note that the right-hand side terms in the weight update equations can be measured.*
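The structure described in Remark 7 can be sketched as a discrete-time update. The code below is an illustrative assumption rather than the laws of Lemma 1: it performs normalized gradient descent on E = 0.5 e^2, where e = w^T rho + cost plays the role of the Hamiltonian residual, and adds a leakage term -kappa w as the sigma modification. The names `critic_step`, `rho`, `eta`, and `kappa`, and the scalar check case, are hypothetical.

```python
import math

def critic_step(w, rho, cost, eta=1.0, kappa=0.0):
    """One normalized gradient-descent step on E = 0.5 * e^2, where
    e = w^T rho + cost is the residual, combined with a
    sigma-modification leakage term -kappa * w."""
    e = sum(wi * ri for wi, ri in zip(w, rho)) + cost
    norm = (1.0 + sum(ri * ri for ri in rho)) ** 2
    w_new = [wi - eta * (ri * e / norm + kappa * wi)
             for wi, ri in zip(w, rho)]
    return w_new, e

# Check case: x_dot = -x + u under the fixed policy u = -p x, utility
# x^2 + u^2, critic V ~= w[0] * x^2. The exact weight is p = sqrt(2) - 1.
p = math.sqrt(2.0) - 1.0
w = [0.0]
for sweep in range(50):
    for x in (0.5, 1.0, 1.5, 2.0):
        u = -p * x
        rho = [2.0 * x * (-x + u)]   # d(phi)/dx * x_dot for phi = x^2
        cost = x * x + u * u
        w, residual = critic_step(w, rho, cost)
print(w[0], p)
```

With `kappa = 0` the weight converges to the exact value in this example; a positive `kappa` keeps the weights bounded without requiring persistent excitation, at the price of a small bias toward zero.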

The following assumption is stated next.

**Assumption 4** (^{[4]}). *It is assumed that the ideal weights exist and are bounded over a compact set by an unknown positive constant, such that *

Next, the following theorem is stated.

**Theorem 1**. *Consider the nonlinear system in strict-feedback form defined by* (1)*. By using the augmented system* (3)*, consider the optimal virtual control* (19) *and actual control terms* (34)*, along with the identifier, actor-critic updating laws* (38)*,* (39)*, and* (40)*. Further, assume that the design parameters are selected as per the conditions stated and that Assumptions 1 through 4 hold. If the system input is PE and its initial value, *

**Proof**: See Appendix.

**Remark 8**. *In the proposed optimal backstepping technique, the RL/ADP is employed at every step to obtain the optimal virtual and actual control inputs. We have derived the backstepping for a two-step process; however, it can be implemented up to *

**Remark 9**. *The sigma modification term does serve to alleviate the PE condition and assists in the process of forgetting; however, it does not prove effective in multitasking scenarios to minimize forgetting. Subsequently, a novel online LL strategy is presented to address the issue of catastrophic forgetting.*

Next, an online regularization-based approach to LL is introduced.

### 2.5. Continual lifelong learning

To mitigate the issues of catastrophic forgetting^{[13]}, a novel technique called WVA was proposed^{[15]}. However, WVA has only been used in an offline manner, which cannot be applied to NN-based online techniques.

In contrast, this study introduces a new online LL technique that can be integrated into an online NN-based trajectory tracking control scheme by identifying and safeguarding the most critical parameters during the optimization process. To achieve this, the proposed technique employs a performance function given by

where

where

where

where

Subsequently, leveraging normalized gradient descent allows us to formulate an additional term in the critic weight update law. This term is derived as follows

For LL, the terms from (49) are combined with the terms from the previously defined update law that is given in Theorem 1. Next, the following theorem is stated.

**Theorem 2**. *Consider the hypothesis stated in Theorem 1, and let Assumptions 1 to 4 hold, with the LRL critic NN tuning law for j step optimal backstepping, given by*

*where j denotes the number of steps in backstepping, *

**Proof**: See Appendix.

**Remark 10**. *From* (49)*, when the significance of the weights increases, *

**Remark 11**. *The first part of the NN weight update law in Theorem 2 is the same as in Theorem 1, whereas the second part includes regularization terms resulting from LL. Notice that the tracking and weight estimation error bounds increase due to the LRL-based control scheme because the bounding constant, *

**Remark 12**. *The proposed LL method is scalable to encompass *

**Remark 13**. *The efficacy of the LL method is pronounced when Tasks 1 and 2 share informational overlap reflected in the weights, facilitating the knowledge transfer for Task 2. However, in the absence of shared weights or knowledge between non-overlapping tasks, the visible enhancement in Task 2 performance might be negligible with online LL, albeit it mitigates the catastrophic forgetting of Task 1, offering long-term benefits when reverting to Task 1.*
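The regularization idea of this section can be illustrated with a small sketch. It is not the paper's law (49) or the tuning law of Theorem 2: it assumes a residual that is linear in the weights, e = w^T rho + c, and models it as unit-variance Gaussian, in which case the diagonal of the Fisher information reduces to the mean square of rho. The importance is estimated online as a running average during Task 1, then frozen and used in a quadratic penalty during Task 2. All names (`fisher_diag_update`, `ll_penalty_grad`, `train`, `lam`, `beta`) and the two toy tasks are hypothetical.

```python
def fisher_diag_update(omega, rho, beta=0.99):
    """Running average of squared residual sensitivities de/dw = rho,
    used as a diagonal stand-in for the Fisher Information Matrix."""
    return [beta * o + (1.0 - beta) * r * r for o, r in zip(omega, rho)]

def ll_penalty_grad(w, w_anchor, omega, lam):
    """Gradient of 0.5 * lam * sum_i omega_i * (w_i - w_anchor_i)^2."""
    return [lam * o * (wi - wa) for wi, wa, o in zip(w, w_anchor, omega)]

def train(w, samples, eta, steps, w_anchor=None, omega_fixed=None, lam=0.0):
    """Gradient descent on 0.5 * e^2 with e = w^T rho + c, plus an
    optional LL penalty. Returns final weights and importance estimate."""
    omega = [0.0] * len(w)
    for k in range(steps):
        rho, c = samples[k % len(samples)]
        e = sum(wi * ri for wi, ri in zip(w, rho)) + c
        grad = [ri * e for ri in rho]
        omega = fisher_diag_update(omega, rho)
        if w_anchor is not None:
            pen = ll_penalty_grad(w, w_anchor, omega_fixed, lam)
        else:
            pen = [0.0] * len(w)
        w = [wi - eta * (g + p) for wi, g, p in zip(w, grad, pen)]
    return w, omega

# Task 1 only excites weight 0; Task 2 pulls both weights elsewhere.
task1 = [([1.0, 0.0], -1.0)]                      # residual zero at w[0] = 1
task2 = [([1.0, 0.0], 1.0), ([0.0, 1.0], -2.0)]   # residual zero at w = [-1, 2]
w1, omega1 = train([0.0, 0.0], task1, eta=0.1, steps=200)
w_plain, _ = train(list(w1), task2, eta=0.1, steps=400)
w_ll, _ = train(list(w1), task2, eta=0.1, steps=400,
                w_anchor=w1, omega_fixed=omega1, lam=10.0)  # lam is illustrative
print(w1, w_plain, w_ll)
```

In this toy setting, the weight that mattered for Task 1 is overwritten without the penalty and largely retained with it, while the weight that Task 1 never excited remains free to adapt to Task 2. This mirrors the trade-off noted in Remark 13.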

## 3. UAV TRACKING SIMULATION OUTCOMES

This section presents the results of the LRL-based optimal tracking control applied to an underactuated UAV system.

### 3.1. Unmanned aerial vehicle (UAV) problem formulation and control design

Consider the UAV model depicted in Figure 1, which is characterized by two reference frames: the inertial frame

The quadrotor dynamics can be modeled by two unique equations: (1) translational; and (2) rotational. However, these dynamic equations interrelate via the rotation matrix, rendering them as two cohesive subsystems. A holistic control strategy involves both outer and inner loop controls, corresponding to the two subsystems. The outer loop aims to execute positional control by managing the state variables of

Define

Given the underactuated nature of UAV translational dynamics, an intermediate control vector

is introduced for optimal position control derivation, and the translational dynamic can thus be reformulated as

**Remark 14**. *The relation between *

Solving yields the control

Using the reference trajectory vector

For the coordinate transformation of rotational dynamic, the transformation relationship between rotational velocity

with

Applying time derivation to both sides yields the attitude dynamic as:

The function

So, the attitude dynamic can be rephrased in strict feedback form as

Reference signals are denoted as

Tracking error variables are designated as

Therefore, using the control law (34) and the weight update laws shown in Theorems 1 and 2 for translational and attitude dynamics will drive the UAV system to track the reference trajectory, as shown in the simulations.
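The step from the intermediate control vector to the thrust and reference attitude can be sketched as follows. The code assumes a common quadrotor convention (ZYX Euler angles, thrust along the body z-axis, gravity along the negative inertial z-axis), which may differ from the convention behind the equations of this section; the function names and the numerical values are hypothetical. The inverse map is checked against the assumed forward model by a round trip.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def thrust_attitude_to_intermediate(thrust, phi, theta, psi, mass):
    """Assumed forward model: intermediate control (ux, uy, uz) produced
    by thrust T, roll phi, pitch theta and yaw psi."""
    a = thrust / mass
    ux = (math.cos(phi) * math.sin(theta) * math.cos(psi)
          + math.sin(phi) * math.sin(psi)) * a
    uy = (math.cos(phi) * math.sin(theta) * math.sin(psi)
          - math.sin(phi) * math.cos(psi)) * a
    uz = math.cos(phi) * math.cos(theta) * a - G
    return ux, uy, uz

def intermediate_to_thrust_attitude(ux, uy, uz, psi, mass):
    """Invert the assumed model for thrust, roll and pitch, given the
    yaw reference psi. Valid while uz + G > 0 (no inverted flight)."""
    az = uz + G
    accel = math.sqrt(ux * ux + uy * uy + az * az)
    thrust = mass * accel
    phi = math.asin((ux * math.sin(psi) - uy * math.cos(psi)) / accel)
    theta = math.atan((ux * math.cos(psi) + uy * math.sin(psi)) / az)
    return thrust, phi, theta

# Round trip with illustrative values.
ux, uy, uz = thrust_attitude_to_intermediate(12.0, 0.2, -0.1, 0.7, 1.2)
print(intermediate_to_thrust_attitude(ux, uy, uz, 0.7, 1.2))
# Hover: zero intermediate control gives thrust m*g and level attitude.
print(intermediate_to_thrust_attitude(0.0, 0.0, 0.0, 0.3, 2.0))
```

The resulting roll and pitch values would then serve as reference signals for the inner attitude loop, with the yaw reference supplied externally.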

### 3.2. Simulation parameters and results

The desired position trajectory for

We consider two task scenarios in which the reference trajectory changes from one task to the next, as if the UAV were moving along a different path or in a different environment. In the simulations, Task 1 is shown again to demonstrate that the LL-based control helps mitigate catastrophic forgetting when the UAV returns to Task 1. The proposed method is able to drive the UAV to track the reference trajectory accurately, even when the task changes. Figure 2 shows the position and attitude tracking performance; with the proposed LRL method (blue), the UAV position states accurately follow the reference trajectory (red). The attitude tracking performance demonstrates that the UAV attitudes also follow the reference attitudes more closely than with the method from recent literature^{[9]} (green). Figure 3 illustrates the tracking errors, indicating that the tracking performance of the proposed method is superior to that of the recent literature method, referred to as 'r-lit'^{[9]}. Figure 4 illustrates the position and attitude tracking errors.

Figure 3. Tracking performance of position and attitude subsystems using LRL and recent literature (r-lit)^{[9]} methods.

Figure 4. Position and attitude tracking errors using proposed LRL and recent literature (r-lit)^{[9]} methods.

Both the system state tracking plots and the positional error plots in Figure 2 and Figure 3 demonstrate the superior performance of the proposed LL method, represented by blue lines. The method from recent literature^{[9]} ('r-lit', shown in green) exhibits a higher error, which shows the need for LL. Notably, the total average error shown in Figure 4 is lower when the proposed LL method is employed instead of the r-lit method, indicating a substantial enhancement in tracking accuracy.

Figure 5 depicts the torque inputs and cumulative costs; the cost of the proposed method is lower than that of the r-lit method, and all the closed-loop signals are bounded. The control effort demanded by the r-lit method is higher than that of the proposed LL-based method. The cumulative cost associated with r-lit (green) is also higher than that of the proposed LL method (blue), both during the tasks and as the tasks change.

## 4. CONCLUSION AND DISCUSSION

This paper proposed an LL tracking control technique for uncertain nonlinear CT systems in strict feedback form. The method combined the augmented system, trajectory generator, and optimal backstepping approach to design both the feedforward and feedback terms of the tracking scheme. By utilizing a combination of actor-critic NNs and an identifier NN, the method approximated the solution to the HJB equations with unknown nonlinear functions. The use of RL at each step of the backstepping process allowed for the development of virtual and actual optimal controllers that can handle the challenges posed by uncertain strict-feedback systems. The proposed work highlighted the significance of considering catastrophic forgetting in online controller design and developed a new method to address this issue. Simulation results on a UAV tracking a desired trajectory show acceptable performance. The proposed approach can be extended by using deep NNs for better approximation. In addition, an integral RL (IRL)-based approach can relax the need for knowledge of the drift dynamics. Dynamic surface control can be included to minimize the number of NNs used.

## DECLARATIONS

### Authors' contributions

Made substantial contributions to the conception and design of the study: Ganie I, Jagannathan S

Made contributions in writing, reviewing, editing, and methodology: Ganie I, Jagannathan S

### Availability of data and materials

Not applicable.

### Financial support and sponsorship

The project or effort undertaken was or is sponsored by the Office of Naval Research Grant N00014-21-1-2232 and Army Research Office Cooperative Agreements W911NF-21-2-0260 and W911NF-22-2-0185.

### Conflicts of interest

Both authors declared that there are no conflicts of interest.

### Ethical approval and consent to participate

Not applicable.

### Consent for publication

Not applicable.

### Copyright

© The Author(s) 2024.

## APPENDICES

### Proof of Theorem 1

**Step 1:** Consider the Lyapunov function as follows

The time derivative of

where

Considering the first term of (51), we can write it as

Substituting (3), (9) in (52) gives

Substituting the value of

Using

Separating the terms in (55) w.r.t actual NN weights and the terms w.r.t weight estimation error gives

Substituting

One can further simplify (57), as follows

Using (11), we have

which can be further written as

where

On substituting the value of

which can be further simplified by using the cyclic property of traces as

To simplify, we have

Consider the fourth term of (52),

Therefore, one can further simplify (65) by using Young's inequality in cross-product terms as follows

Consider the fifth term of (52)

Using

Considering the last term, we can write

Using Young's inequality in the cross product terms, we can write

Combining (60), (64), (66) and (69) and simplifying, we have

where

where

This demonstrates that the overall closed-loop system is bounded. Since

**Step 2:**

This is the final step. Consider the Lyapunov function as follows

The time derivative of

Let

Considering the second term of (74), substituting (3) in

which on further solving leads to

Using

Separating the terms in (77) w.r.t actual NN weights and the terms w.r.t weight estimation error gives

Substituting

One can further simplify (79), as follows

Using (11), we have

which can be further written as

where

Consider the second and third term of (52)

On substituting the value of

which can be further simplified by using the cyclic property of traces as

To simplify, we have

Consider the fourth term of (52),

Therefore, one can further simplify (87) by using Young's inequality in cross-product terms as follows

Consider the fifth term of (52)

Using

Considering the last term, we can write

Using Young's inequality in the cross product terms, we can write

Combining (82), (86), (88) and (91) and simplifying, we have

where

This demonstrates that the overall closed-loop system is bounded. Since

### Proof of Theorem 2

The convergence of weights for Task 1 remains in alignment with Theorem 1. For Task 2, an additional term emerges in the Lyapunov proof (92) due to the regularization penalty, denoted as

where

Substituting

Employing Young's inequality to the first and third terms of (95), we get

Substituting

Thus, the integration of this term into the proof solely modifies the error bound to

The aggregate contribution to the error bounds is calculated by adding

## REFERENCES

1. Abu-Khalaf M, Lewis FL. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. *Automatica* 2005;41:779-91.

2. McLain TW, Beard RW. Successive Galerkin approximations to the nonlinear optimal control of an underwater robotic vehicle. In Proceedings of the 1998 IEEE international conference on robotics and automation (Cat. No. 98CH36146). Leuven, Belgium. 20-20 May 1998.

3. Vrabie D, Pastravanu O, Abu-Khalaf M, Lewis FL. Adaptive optimal control for continuous-time linear systems based on policy iteration. *Automatica* 2009;45:477-84.

4. Modares H, Lewis FL. Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning. *Automatica* 2014;50:1780-92.

5. Gao W, Jiang ZP. Learning-Based adaptive optimal tracking control of strict-feedback nonlinear systems. *IEEE Trans Neural Netw Learn Syst* 2018;29:2614-24.

6. Zargarzadeh H, Dierks T, Jagannathan S. Optimal control of nonlinear continuous-time systems in strict-feedback form. *IEEE Trans Neural Netw Learn Syst* 2015;26:2535-49.

7. Huang Z, Bai W, Li T, et al. Adaptive reinforcement learning optimal tracking control for strict-feedback nonlinear systems with prescribed performance. *Inf Sci* 2023;621:407-23.

8. Wen G, Chen CLP, Ge SS. Simplified optimized backstepping control for a class of nonlinear strict-feedback systems with unknown dynamic functions. *IEEE Trans Cybern* 2021;51:4567-80.

9. Wen G, Hao W, Feng W, Gao K. Optimized backstepping tracking control using reinforcement learning for quadrotor unmanned aerial vehicle system. *IEEE Trans Syst Man Cybern Syst* 2022;52:5004-15.

10. Bryson AE. Applied optimal control: optimization, estimation and control. New York: Routledge; 1975. p. 496.

11. Wen G, Ge SS, Tu F. Optimized backstepping for tracking control of strict-feedback systems. *IEEE Trans Neural Netw Learn Syst* 2018;29:3850-62.

12. Wu J, Wang W, Ding S, Xie X, Yi Y. Adaptive neural optimized control for uncertain strict-feedback systems with unknown control directions and pre-set performance. *Commun Nonlinear Sci Numer Simul* 2023;126:107506.

13. Kirkpatrick J, Pascanu R, Rabinowitz NC, et al. Overcoming catastrophic forgetting in neural networks. *Proc Natl Acad Sci USA* 2017;114:3521-6.

14. Ganie I, Jagannathan S. Adaptive control of robotic manipulators using deep neural networks. *IFAC-PapersOnLine* 2022;55:148-53.

15. Kutalev A, Lapina A. Stabilizing elastic weight consolidation method in practical ML tasks and using weight importances for neural network pruning. *ArXiv* 2021. Available from: https://arxiv.org/abs/2109.10021 [Last accessed on 2 Feb 2024].

16. Liu Y, Zhu Q, Wen G. Adaptive tracking control for perturbed strict-feedback nonlinear systems based on optimized backstepping technique. *IEEE Trans Neural Netw Learn Syst* 2022;33:853-65.

17. Moghadam R, Jagannathan S. Optimal adaptive control of uncertain nonlinear continuous-time systems with input and state delays. *IEEE Trans Neural Netw Learn Syst* 2023;34:3195-204.

## How to Cite

Ganie, I.; Jagannathan, S. Continual online learning-based optimal tracking control of nonlinear strict-feedback systems: application to unmanned aerial vehicles. *Complex. Eng. Syst.* **2024**, *4*, 4. http://dx.doi.org/10.20517/ces.2023.35


## About This Article

### Copyright

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
