of the dynamical system in the absence of disturbances. In the general case
of a nonlinear dynamical system with several attractors, the situation is still
more intricate. Actually, fluctuations occur “almost surely”, enabling the
system to pass from one basin of attraction of the deterministic dynamical
system to another. The theory of “large deviations” provides a tool for
estimating the transition probability of those events [Benveniste et al. 1987;
Duflo 1996].
However, in this chapter (and in most applications), we are interested in
stabilizing a fixed point or in tracking a reference trajectory; therefore the
investigation of the coexistence of several dynamical attractors is not really
relevant.
5.2 Design of a Neural Control with an Inverse Model
5.2.1 Straightforward Inversion
The simplest method for designing a neural control law from a neural model
of the controlled dynamical system, identified as an open-loop neural
network, is the straightforward inversion of that model. The control system
is simply the inverse of the model of the process. If that model is nonlinear,
its inverse is nonlinear too, and can therefore be implemented as a neural
network. The training and operation of such a neural controller are shown in Fig. 5.2.
In that figure, a neural network that computes the control signal is added
to the neural model of the process. That neural controller is a feedforward
network whose inputs are the current state (at time k) and, optionally, the
desired state at the next time step if the task is the tracking of a state
trajectory; otherwise, the only input of the controller is the current state.
The output of the neural controller is the control signal at time k. That control is
fed to the control input of the model of the process during the training phase,
and to the process input during the operation phase.
The set (controller + model) is a feedforward neural network whose output
is the state at the next time step. Training is performed by minimizing the
difference between the reference state (or setpoint) and the network output. The
only parameters subject to change are the controller's parameters (weights and
biases); the model parameters stay unchanged during the training process.
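A minimal sketch of this training scheme, under simplifying assumptions: a scalar state and control, an illustrative frozen map x_{k+1} = 0.8 x_k + 0.5 tanh(u_k) standing in for the identified neural model of the process, and finite differences standing in for backpropagation through the cascade. Only the controller's parameters are updated; the model stays frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen model of the process (assumed identified beforehand):
# predicts the next state from the current state x and the control u.
def model(x, u):
    return 0.8 * x + 0.5 * np.tanh(u)

H = 8  # hidden units of the controller network
# Controller parameters, flattened: W1 (H x 2), b1 (H), W2 (H), b2 (1).
theta = rng.normal(scale=0.3, size=2 * H + H + H + 1)

def controller(x, x_ref, th):
    # Feedforward controller: inputs are the current state and the setpoint.
    W1 = th[:2 * H].reshape(H, 2)
    b1 = th[2 * H:3 * H]
    W2 = th[3 * H:4 * H]
    b2 = th[4 * H]
    h = np.tanh(W1 @ np.array([x, x_ref]) + b1)
    return float(W2 @ h + b2)

def cost(th, batch):
    # Squared distance between the desired state and the model's prediction
    # of the next state; the model parameters never change.
    return np.mean([(x_ref - model(x, controller(x, x_ref, th))) ** 2
                    for x, x_ref in batch])

batch = [(rng.uniform(-0.3, 0.3), rng.uniform(-0.2, 0.2)) for _ in range(32)]
c0 = cost(theta, batch)

# Train the controller by gradient descent; finite differences stand in
# for backpropagation to keep the sketch short.
eps, lr = 1e-5, 0.3
for _ in range(200):
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        tp = theta.copy(); tp[i] += eps
        tm = theta.copy(); tm[i] -= eps
        grad[i] = (cost(tp, batch) - cost(tm, batch)) / (2 * eps)
    theta -= lr * grad
```

In operation, the trained `controller` would be fed to the real process instead of the model, as described above.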
The cost function is usually the squared distance between the desired
output and the measured output. If constraints are imposed on the control
signal, they can be embedded into the controller. For instance, if the admissible
control is bounded, those bounds can be embedded into the sigmoid activation
function of the output neuron of the controller. Alternatively, a penalty term
that grows sharply when the constraint is violated may be added to the
error cost function.
That straightforward methodology gives good results for simple problems,
where the objective is a static function of the current state. If the objective