Approximate Dynamic Programming Based Solutions for Fixed-final-time Optimal Control and Optimal Switching PDF Download


Approximate Dynamic Programming Based Solutions for Fixed-final-time Optimal Control and Optimal Switching

Approximate Dynamic Programming Based Solutions for Fixed-final-time Optimal Control and Optimal Switching PDF Author: Ali Heydari
Publisher:
ISBN:
Category : Automatic programming (Computer science)
Languages : en
Pages : 239

Book Description
"Optimal solutions with neural networks (NN) based on an approximate dynamic programming (ADP) framework for new classes of engineering and non-engineering problems, and the associated difficulties and challenges, are investigated in this dissertation. In the enclosed eight papers, the ADP framework is utilized for solving fixed-final-time problems (also called terminal control problems) and problems of a switching nature. An ADP-based algorithm is proposed in Paper 1 for solving fixed-final-time problems with a soft terminal constraint, in which a single neural network with a single set of weights is utilized. Paper 2 investigates fixed-final-time problems with hard terminal constraints. The optimality analysis of the ADP-based algorithm for fixed-final-time problems is the subject of Paper 3, in which it is shown that the proposed algorithm leads to the globally optimal solution provided certain conditions hold. Afterwards, the developments in Papers 1 to 3 are used to tackle a more challenging class of problems, namely, optimal control of switching systems. This class of problems is divided into problems with a fixed mode sequence (Papers 4 and 5) and problems with a free mode sequence (Papers 6 and 7). Each of these two classes is further divided into problems with autonomous subsystems (Papers 4 and 6) and problems with controlled subsystems (Papers 5 and 7). Different ADP-based algorithms are developed, and proofs of convergence of the proposed iterative algorithms are presented. Moreover, in Paper 8, an extension is provided for online learning of the optimal switching solution for problems with modeling uncertainty. Each of the theoretical developments is numerically analyzed using different real-world or benchmark problems"--Abstract, page v.
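The time-varying value function at the heart of a fixed-final-time problem can be computed backward from the terminal cost. The sketch below does this exactly for a scalar linear-quadratic problem with hypothetical numbers; Heydari's algorithm instead trains a single neural network, fed the time index as an extra input, to approximate this time dependence for nonlinear systems.

```python
A, B = 0.9, 0.5                     # dynamics x_{k+1} = A*x_k + B*u_k (hypothetical)
Q, R, Qf, N = 1.0, 1.0, 10.0, 20    # running costs, terminal weight, horizon

P = [0.0] * (N + 1)                 # V_k(x) = P[k] * x^2
P[N] = Qf                           # soft terminal constraint
for k in range(N - 1, -1, -1):      # backward Riccati recursion
    K = B * P[k + 1] * A / (R + B * P[k + 1] * B)   # time-varying gain
    P[k] = Q + A * P[k + 1] * (A - B * K)

x = 1.0                             # simulate the closed loop from x_0 = 1
for k in range(N):
    K = B * P[k + 1] * A / (R + B * P[k + 1] * B)
    x = (A - B * K) * x
print(round(P[0], 4), round(abs(x), 6))
```

Note how the gain K depends on the time step k; that time dependence is exactly what a single set of NN weights has to capture when time is an input to the network.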

Adaptive Dynamic Programming for Control

Adaptive Dynamic Programming for Control PDF Author: Huaguang Zhang
Publisher: Springer Science & Business Media
ISBN: 144714757X
Category : Technology & Engineering
Languages : en
Pages : 432

Book Description
There are many methods of stable controller design for nonlinear systems. In seeking to go beyond the minimum requirement of stability, Adaptive Dynamic Programming in Discrete Time approaches the challenging topic of optimal control for nonlinear systems using the tools of adaptive dynamic programming (ADP). The range of systems treated is extensive: affine, switched, singularly perturbed and time-delay nonlinear systems are discussed, as are the uses of neural networks and techniques of value and policy iteration. The text features three main aspects of ADP in which the methods proposed for stabilization and for tracking and games benefit from the incorporation of optimal control methods:
• infinite-horizon control, for which the difficulty of solving partial differential Hamilton–Jacobi–Bellman equations directly is overcome, with proof that the iterative value-function updating sequence converges to the infimum of all the value functions obtained by admissible control law sequences;
• finite-horizon control, implemented in discrete-time nonlinear systems, showing the reader how to obtain suboptimal control solutions within a fixed number of control steps, with results more easily applied in real systems than those usually gained from infinite-horizon control;
• nonlinear games, for which a pair of mixed optimal policies is derived for solving games both when the saddle point does not exist and, when it does, while avoiding the existence conditions of the saddle point. Non-zero-sum games are studied in the context of a single-network scheme in which policies are obtained that guarantee system stability and minimize the individual performance functions, yielding a Nash equilibrium.
In order to make the coverage suitable for the student as well as for the expert reader, Adaptive Dynamic Programming in Discrete Time:
• establishes the fundamental theory clearly, with each chapter devoted to a clearly identifiable control paradigm;
• demonstrates convergence proofs of the ADP algorithms, deepening understanding of the derivation of stability and convergence with the iterative computational methods used; and
• shows how ADP methods can be put to use both in simulation and in real applications.
This text will be of considerable interest to researchers interested in optimal control and its applications in operations research, applied mathematics, computational intelligence and engineering. Graduate students working in control and operations research will also find the ideas presented here to be a source of powerful methods for furthering their study.
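The iterative value-function updating sequence the book analyzes can be illustrated on a toy problem. The following sketch (a hypothetical 1-D nonlinear system with a grid discretization standing in for neural-network approximation) starts from V_0 = 0 and iterates V_{i+1}(x) = min_u [x² + u² + V_i(f(x,u))]; with nonnegative costs the sequence is monotonically nondecreasing and settles at a fixed point.

```python
import numpy as np

# Toy 1-D nonlinear system x_next = 0.8*sin(x) + u (hypothetical, not from
# the book), discretized on state and control grids.
xs = np.linspace(-2, 2, 81)             # state grid (x = 0 is index 40)
us = np.linspace(-1, 1, 41)             # control grid
V = np.zeros_like(xs)                   # V_0 = 0

for i in range(300):
    x_next = 0.8 * np.sin(xs[:, None]) + us[None, :]        # f(x, u), all pairs
    cost_to_go = xs[:, None] ** 2 + us[None, :] ** 2 + np.interp(x_next, xs, V)
    V_new = cost_to_go.min(axis=1)      # greedy minimization over u
    if np.max(np.abs(V_new - V)) < 1e-10:
        break                           # value sequence has converged
    V = V_new
print(i, round(float(V[40]), 6), round(float(V[0]), 4))
```

The origin is an equilibrium with zero cost, so the converged value at x = 0 stays at zero while boundary states accumulate the cost of being driven toward it.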

Dynamic Optimization of Path-Constrained Switched Systems

Dynamic Optimization of Path-Constrained Switched Systems PDF Author: Jun Fu
Publisher: Springer Nature
ISBN: 3031234286
Category : Technology & Engineering
Languages : en
Pages : 113

Book Description
This book provides a series of systematic theoretical results and numerical solution algorithms for dynamic optimization problems of switched systems with infinite-dimensional inequality path constraints. Dynamic optimization of path-constrained switched systems is a challenging task due to the complexity of seeking the best combination of the system input, switch times and switching sequence. Meanwhile, to ensure safety and guarantee product quality, path constraints are required to be rigorously satisfied (i.e., at an infinite number of time points) within a finite number of iterations. Several novel methodologies are presented using dynamic optimization and semi-infinite programming techniques. The core advantages of the new approaches are twofold: i) the system input, switch times and switching sequence can be optimized simultaneously; ii) the proposed algorithms terminate within finitely many iterations while providing a certificate of feasibility for the path constraints. The book first provides brief surveys of dynamic optimization of path-constrained systems and of switched systems. For switched systems with a fixed switching sequence, a bi-level algorithm is proposed in which the input is optimized at the inner level and the switch times are updated at the outer level using the gradient of the optimal value function evaluated at the optimal input. An efficient single-level algorithm is then proposed that optimizes the input and switch times simultaneously, greatly reducing the number of nonlinear programs and the computational burden. For switched systems with free switching sequences, a solution framework is proposed that employs variant 2 of the generalized Benders decomposition technique.
In this framework, two different system formulations are adopted in constructing the primal and master problems, and the switching sequences are explicitly characterized by introducing a binary variable. Finally, a multi-objective dynamic optimization algorithm is proposed for locating approximate local Pareto solutions, and the approximation optimality of the obtained solutions is quantitatively analyzed. This book provides a unified framework for dynamic optimization of path-constrained switched systems. It can therefore serve as a useful reference for researchers and graduate students interested in the state of the art of dynamic optimization of switched systems and in recent advances in path-constrained optimization, and as a source of up-to-date optimization methods and algorithms for engineers who work in control and optimization fields such as robotics, chemical engineering and industrial processes.
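A much-simplified toy version of the outer-level switch-time update for a fixed mode sequence is sketched below. The two autonomous modes, costs and step sizes are all hypothetical, there are no path constraints, and a finite-difference gradient stands in for the book's gradient of the optimal value function; the point is only the bi-level structure: the inner level evaluates the cost for a candidate switch time, the outer level descends on it. Mode 1 decays fast but carries an extra running cost, mode 2 is free but slow, so the best switch time is interior.

```python
def total_cost(ts, T=2.0, dt=1e-3):
    """Inner level: simulate the two-mode system and return the total cost."""
    x, t, J = 1.0, 0.0, 0.0
    while t < T:
        if t < ts:                      # mode 1: dx/dt = -5x, running cost +0.5
            dx, c = -5.0 * x, 0.5
        else:                           # mode 2: dx/dt = -x, no extra cost
            dx, c = -1.0 * x, 0.0
        J += (x * x + c) * dt           # Euler quadrature of the cost integral
        x += dx * dt
        t += dt
    return J

ts = 1.0                                # initial guess for the switch time
for _ in range(200):                    # outer level: descent on the switch time
    eps = 1e-2
    g = (total_cost(ts + eps) - total_cost(ts - eps)) / (2 * eps)
    ts = min(max(ts - 0.05 * g, 0.0), 2.0)
print(round(ts, 3), round(total_cost(ts), 4))
```

Switching too early wastes the fast mode's decay; switching too late keeps paying its running cost, and the descent settles between the two.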

Approximate Dynamic Programming

Approximate Dynamic Programming PDF Author: Warren B. Powell
Publisher: John Wiley & Sons
ISBN: 0470182954
Category : Mathematics
Languages : en
Pages : 487

Book Description
A complete and accessible introduction to the real-world applications of approximate dynamic programming With the growing levels of sophistication in modern-day operations, it is vital for practitioners to understand how to approach, model, and solve complex industrial problems. Approximate Dynamic Programming is a result of the author's decades of experience working in large industrial settings to develop practical and high-quality solutions to problems that involve making decisions in the presence of uncertainty. This groundbreaking book uniquely integrates four distinct disciplines—Markov decision processes, mathematical programming, simulation, and statistics—to demonstrate how to successfully model and solve a wide range of real-life problems using the techniques of approximate dynamic programming (ADP). The reader is introduced to the three curses of dimensionality that impact complex problems and is also shown how the post-decision state variable allows for the use of classical algorithmic strategies from operations research to treat complex stochastic optimization problems. Designed as an introduction and assuming no prior training in dynamic programming of any form, Approximate Dynamic Programming contains dozens of algorithms that are intended to serve as a starting point in the design of practical solutions for real problems. The book provides detailed coverage of implementation challenges including: modeling complex sequential decision processes under uncertainty, identifying robust policies, designing and estimating value function approximations, choosing effective stepsize rules, and resolving convergence issues.
With a focus on modeling and algorithms in conjunction with the language of mainstream operations research, artificial intelligence, and control theory, Approximate Dynamic Programming:
• models complex, high-dimensional problems in a natural and practical way, drawing on years of industrial projects;
• introduces and emphasizes the power of estimating a value function around the post-decision state, allowing solution algorithms to be broken down into three fundamental steps: classical simulation, classical optimization, and classical statistics;
• presents a thorough discussion of recursive estimation, including fundamental theory and a number of issues that arise in the development of practical algorithms;
• offers a variety of methods for approximating dynamic programs that have appeared in previous literature but have never been presented in the coherent format of a book.
Motivated by examples from modern-day operations research, Approximate Dynamic Programming is an accessible introduction to dynamic modeling and is also a valuable guide for the development of high-quality solutions to problems that exist in operations research and engineering. The clear and precise presentation of the material makes this an appropriate text for advanced undergraduate and beginning graduate courses, while also serving as a reference for researchers and practitioners. A companion Web site is available for readers, which includes additional exercises, solutions to exercises, and data sets to reinforce the book's main concepts.
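The post-decision state idea can be sketched on a toy inventory problem (all numbers below are hypothetical, not an example from the book). The post-decision state s_post is the stock after ordering but before the random demand arrives; because the value estimates V are learned around s_post, the ordering decision is a deterministic lookup, with no expectation over demand inside the maximization.

```python
import random

random.seed(0)
MAX_INV, gamma, alpha = 10, 0.9, 0.05   # capacity, discount, stepsize
price, cost, hold = 4.0, 2.0, 0.1       # sale price, order cost, holding cost
V = [0.0] * (MAX_INV + 1)               # value of each post-decision stock level

def best_order(s):
    # Deterministic decision: the expectation over demand is already in V.
    return max(range(MAX_INV - s + 1), key=lambda o: -cost * o + V[s + o])

s = 5
for t in range(20000):
    o = random.randint(0, MAX_INV - s) if random.random() < 0.1 else best_order(s)
    s_post = s + o                      # post-decision state
    demand = random.randint(0, 6)       # exogenous information arrives
    sales = min(s_post, demand)
    reward = price * sales - hold * (s_post - sales)
    s_next = s_post - sales
    # Sampled value of s_post: realized reward plus the discounted value of
    # the next (again deterministic) ordering decision.
    o2 = best_order(s_next)
    v_hat = reward + gamma * (-cost * o2 + V[s_next + o2])
    V[s_post] = (1 - alpha) * V[s_post] + alpha * v_hat
    s = s_next
print(round(V[0], 2), round(V[MAX_INV], 2))
```

The learned values should rank a full warehouse above an empty one, since the stock on hand was already paid for; the smoothing stepsize alpha plays the role of the stepsize rules the book discusses at length.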

Iterative Dynamic Programming

Iterative Dynamic Programming PDF Author: Rein Luus
Publisher: CRC Press
ISBN: 9781420036022
Category : Mathematics
Languages : en
Pages : 346

Book Description
Dynamic programming is a powerful method for solving optimization problems, but it has a number of drawbacks that limit its use to problems of very low dimension. To overcome these limitations, author Rein Luus suggested using it in an iterative fashion. Although this method required vast computer resources, modifications to his original scheme…
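A stripped-down sketch of the region-contraction idea behind iterative dynamic programming: search piecewise-constant controls, then shrink the search region around the incumbent best policy on each pass. The scalar plant and costs below are hypothetical, and Luus's full scheme also grids the state and applies Bellman's principle stagewise; only the contraction mechanism is shown.

```python
import random

random.seed(1)
N = 10                                   # number of control stages

def cost(u_seq):
    x, J = 1.0, 0.0                      # toy plant x_{k+1} = x_k + 0.1*u_k
    for u in u_seq:
        J += x * x + 0.1 * u * u         # running cost
        x += 0.1 * u
    return J + 10.0 * x * x              # terminal penalty

best_u = [0.0] * N
best_J = cost(best_u)
region = 2.0                             # half-width of the control search region
for it in range(40):                     # outer passes
    for _ in range(50):                  # random candidates inside current region
        cand = [u + random.uniform(-region, region) for u in best_u]
        J = cost(cand)
        if J < best_J:
            best_J, best_u = J, cand
    region *= 0.9                        # contract around the best policy found
print(round(best_J, 3))
```

Each pass explores a smaller region centered on the best trajectory so far, which is what lets the method refine a coarse first answer without ever searching the full control space densely.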

Constrained Optimal Control of Linear and Hybrid Systems

Constrained Optimal Control of Linear and Hybrid Systems PDF Author: Francesco Borrelli
Publisher: Springer
ISBN: 3540362258
Category : Mathematics
Languages : en
Pages : 206

Book Description
Many practical control problems are dominated by characteristics such as state, input and operational constraints, alternations between different operating regimes, and the interaction of continuous-time and discrete event systems. At present no methodology is available to design controllers in a systematic manner for such systems. This book introduces a new design theory for controllers for such constrained and switching dynamical systems and leads to algorithms that systematically solve control synthesis problems. The first part is a self-contained introduction to multiparametric programming, which is the main technique used to study and compute state feedback optimal control laws. The book's main objective is to derive properties of the state feedback solution, as well as to obtain algorithms to compute it efficiently. The focus is on constrained linear systems and constrained linear hybrid systems. The applicability of the theory is demonstrated through two experimental case studies: a mechanical laboratory process and a traction control system developed jointly with the Ford Motor Company in Michigan.

Discrete-time Control Algorithms and Adaptive Intelligent Systems Designs

Discrete-time Control Algorithms and Adaptive Intelligent Systems Designs PDF Author: Asma Azmi Al-Tamimi
Publisher: ProQuest
ISBN: 9780549263791
Category : Dynamic programming
Languages : en
Pages :

Book Description
In this work, approximate dynamic programming (ADP) designs based on adaptive critic structures are developed to solve the discrete-time H2/H∞ optimal control problems in which the state and action spaces are continuous. This work considers linear discrete-time systems as well as nonlinear discrete-time systems that are affine in the input. This research resulted in forward-in-time reinforcement learning algorithms that converge to the solution of the Generalized Algebraic Riccati Equation (GARE) for linear systems. For the nonlinear case, a forward-in-time reinforcement learning algorithm is presented that converges to the solution of the associated Hamilton–Jacobi–Bellman (HJB) equation. The results in the linear case can be thought of as a way to solve the GARE of the well-known discrete-time H∞ optimal control problem forward in time. Four design algorithms are developed: heuristic dynamic programming (HDP), dual heuristic programming (DHP), action-dependent heuristic dynamic programming (ADHDP) and action-dependent dual heuristic programming (ADDHP). The significance of these algorithms is that for some of them, particularly the ADHDP algorithm, a priori knowledge of the plant model is not required to solve the dynamic programming problem. Another major outcome of this work is a convergent policy iteration scheme based on the HDP algorithm that allows the use of neural networks to approximate the value function of the discrete-time HJB equation arbitrarily closely. This online algorithm may be implemented in a way that requires only partial knowledge of the model of the nonlinear dynamical system. The dissertation includes detailed proofs of convergence for the proposed algorithms: HDP, DHP, ADHDP, ADDHP and the nonlinear HDP. Practical numerical examples are provided to show the effectiveness of the developed optimization algorithms.
For nonlinear systems, a comparison with methods based on the State-Dependent Riccati Equation (SDRE) is also presented. In all the provided examples, parametric structures like neural networks have been used to find compact representations of the value function and optimal policies for the corresponding optimal control problems.
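The model-free flavor of action-dependent designs can be sketched on a scalar linear-quadratic problem (the system below is hypothetical, not from the dissertation). A quadratic Q-function Q(x,u) = hxx·x² + 2·hxu·x·u + huu·u² is identified from simulated transitions by least squares; the learning update never uses the plant matrices, only observed next states, yet its fixed point recovers the Riccati solution.

```python
import numpy as np

A, B, Qc, R = 0.9, 0.5, 1.0, 1.0      # plant is only used to generate data
rng = np.random.default_rng(0)

h = np.array([Qc, 0.0, R])            # initial hxx, hxu, huu (huu > 0)
for j in range(60):
    xs = rng.uniform(-2, 2, 200)      # exploratory states and inputs
    us = rng.uniform(-2, 2, 200)
    xn = A * xs + B * us              # observed next states (the "data")
    hxx, hxu, huu = h
    Vn = (hxx - hxu**2 / huu) * xn**2         # min over u' of Q_j(x', u')
    target = Qc * xs**2 + R * us**2 + Vn      # Bellman target
    Phi = np.stack([xs**2, 2 * xs * us, us**2], axis=1)
    h, *_ = np.linalg.lstsq(Phi, target, rcond=None)  # model-free fit of Q_{j+1}

P_learned = h[0] - h[1]**2 / h[2]     # implied value-function weight
P = 0.0                               # compare: direct Riccati fixed point
for _ in range(500):
    P = Qc + A * P * A - (A * P * B) ** 2 / (R + B * P * B)
print(round(P_learned, 4), round(P, 4))
```

The two printed numbers agree: the least-squares Q-iteration, which only ever sees (x, u, x') samples, converges to the same value-function weight as the Riccati recursion that uses the model explicitly.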

Reinforcement Learning and Dynamic Programming Using Function Approximators

Reinforcement Learning and Dynamic Programming Using Function Approximators PDF Author: Lucian Busoniu
Publisher: CRC Press
ISBN: 1439821097
Category : Computers
Languages : en
Pages : 280

Book Description
From household appliances to applications in robotics, engineered systems involving complex dynamics can only be as effective as the algorithms that control them. While Dynamic Programming (DP) has provided researchers with a way to optimally solve decision and control problems involving complex dynamic systems, its practical value was limited by algorithms that lacked the capacity to scale up to realistic problems. However, in recent years, dramatic developments in Reinforcement Learning (RL), the model-free counterpart of DP, changed our understanding of what is possible. Those developments led to the creation of reliable methods that can be applied even when a mathematical model of the system is unavailable, allowing researchers to solve challenging control problems in engineering, as well as in a variety of other disciplines, including economics, medicine, and artificial intelligence. Reinforcement Learning and Dynamic Programming Using Function Approximators provides a comprehensive and unparalleled exploration of the field of RL and DP. With a focus on continuous-variable problems, this seminal text details essential developments that have substantially altered the field over the past decade. In its pages, pioneering experts provide a concise introduction to classical RL and DP, followed by an extensive presentation of the state-of-the-art and novel methods in RL and DP with approximation. Combining algorithm development with theoretical guarantees, they elaborate on their work with illustrative examples and insightful comparisons. Three individual chapters are dedicated to representative algorithms from each of the major classes of techniques: value iteration, policy iteration, and policy search. The features and performance of these algorithms are highlighted in extensive experimental studies on a range of control applications. 
The recent development of applications involving complex systems has led to a surge of interest in RL and DP methods and the subsequent need for a quality resource on the subject. For graduate students and others new to the field, this book offers a thorough introduction to both the basics and emerging methods. And for those researchers and practitioners working in the fields of optimal and adaptive control, machine learning, artificial intelligence, and operations research, this resource offers a combination of practical algorithms, theoretical analysis, and comprehensive examples that they will be able to adapt and apply to their own work. Access the authors' website at www.dcsc.tudelft.nl/rlbook/ for additional material, including computer code used in the studies and information concerning new developments.
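The exact-DP baseline that the book's approximate methods build on can be shown in a few lines of tabular policy iteration. The tiny 3-state, 2-action MDP below is hypothetical: policy evaluation solves a linear system for the current policy's value, and policy improvement acts greedily with respect to it, alternating until the policy stops changing.

```python
import numpy as np

P = np.array([  # P[a, s, s'] transition probabilities (hypothetical MDP)
    [[0.8, 0.2, 0.0], [0.0, 0.8, 0.2], [0.1, 0.0, 0.9]],   # action 0
    [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.6, 0.0, 0.4]],   # action 1
])
Rw = np.array([[0.0, 0.0, 1.0],      # rewards R[a, s]
               [0.0, 0.5, 1.0]])
gamma = 0.9
pi = np.zeros(3, dtype=int)          # start with action 0 everywhere

for _ in range(20):
    # Policy evaluation: V = (I - gamma * P_pi)^(-1) r_pi
    P_pi = P[pi, np.arange(3)]
    r_pi = Rw[pi, np.arange(3)]
    V = np.linalg.solve(np.eye(3) - gamma * P_pi, r_pi)
    # Policy improvement: greedy with respect to V
    Qsa = Rw + gamma * (P @ V)       # state-action values, shape (2, 3)
    new_pi = Qsa.argmax(axis=0)
    if np.array_equal(new_pi, pi):
        break                        # policy is stable, hence optimal
    pi = new_pi
print(pi.tolist(), np.round(V, 3).tolist())
```

The book's contribution is what happens when the table V is replaced by a function approximator over continuous states; the two-step evaluate/improve structure stays the same.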

Adaptive Dynamic Programming: Single and Multiple Controllers

Adaptive Dynamic Programming: Single and Multiple Controllers PDF Author: Ruizhuo Song
Publisher: Springer
ISBN: 9811317127
Category : Technology & Engineering
Languages : en
Pages : 278

Book Description
This book presents a class of novel optimal control methods and game schemes based on adaptive dynamic programming techniques. For systems with one control input, the ADP-based optimal control is designed for different objectives, while for systems with multiple players, the optimal control inputs are proposed based on games. In order to verify the effectiveness of the proposed methods, the book analyzes the properties of the adaptive dynamic programming methods, including the convergence of the iterative value functions and the stability of the system under the iterative control laws. Further, to substantiate the mathematical analysis, it presents various application examples, which provide reference to real-world practices.

Approximate Dynamic Programming with Adaptive Critics and the Algebraic Perceptron as a Fast Neural Network Related to Support Vector Machines

Approximate Dynamic Programming with Adaptive Critics and the Algebraic Perceptron as a Fast Neural Network Related to Support Vector Machines PDF Author: Thomas Hanselmann
Publisher:
ISBN:
Category : Dynamic programming
Languages : en
Pages : 386

Book Description
This thesis treats two aspects of intelligent control: the first part is about long-term optimization by approximate dynamic programming, and the second part considers a specific class of fast neural network related to support vector machines (SVMs). The first part relates to approximate dynamic programming, especially in the framework of adaptive critic designs (ACDs). Dynamic programming can be used to find an optimal decision or control policy over a long-term period. However, in practice it is difficult, and often impossible, to calculate a dynamic programming solution, due to the 'curse of dimensionality'. The adaptive critic design framework addresses this issue and tries to find a good solution by approximating the dynamic programming process for a stationary environment. In an adaptive critic design there are three modules: the plant or environment to be controlled, a critic to estimate the long-term cost, and an action or controller module to produce the decision or control strategy. Even though there have been many publications on the subject over the past two decades, some points have received less attention. While most publications address the training of the critic, one point that has not received systematic attention is the training of the action module.¹ Normally, training starts with an arbitrary, hopefully stable, decision policy whose long-term cost is then estimated by the critic. Often the critic is a neural network that has to be trained, using temporal differences and Bellman's principle of optimality. Once the critic network has converged, a policy improvement step is carried out by gradient descent to adjust the parameters of the controller network.
Then the critic is retrained to give the new long-term cost estimate. However, it would be preferable to focus more on extremal policies earlier in the training. The calculus of variations is therefore investigated, and the idea of using the Euler equations to train the actor is discarded. Instead, an adaptive critic formulation for a continuous plant with a short-term cost given as an integral cost density is made, and the chain rule is applied to calculate the total derivative of the short-term cost with respect to the actor weights. This differs from the discrete systems usually used in adaptive critics, which are used in conjunction with total ordered derivatives. The idea is then extended to second-order derivatives so that Newton's method can be applied to speed up convergence. Based on this, an almost concurrent actor and critic training is proposed. The equations are developed for any nonlinear system and short-term cost density function, and they are tested on a linear quadratic regulator (LQR) setup. With this approach the actor and critic weights can be obtained in only a few actor-critic training cycles. Some other, more minor issues in the adaptive critic framework are also investigated: the influence of the discounting factor in the Bellman equation on total ordered derivatives; the interpretation of targets in backpropagation through time as moving and fixed targets; the relation between simultaneous recurrent networks and dynamic programming; and a reinterpretation of the recurrent generalized multilayer perceptron (GMLP) as a recurrent generalized finite impulse response MLP (GFIR-MLP). Another subject investigated in this area is that of a hybrid dynamical system, characterized as a continuous plant and a set of basic feedback controllers, which are used to control the plant by finding a switching sequence that selects one basic controller at a time.
The special but important case is considered where the plant is linear but with some uncertainty in the state space and in the observation vector, and the cost function is quadratic. This is a form of robust control, where a dynamic programming solution has to be calculated.

¹ Werbos comments that most treatments of action nets or policies either assume enumerative maximization, which is good only for small problems (except for the games of Backgammon or Go [1]), or gradient-based training. The latter is prone to difficulties with local minima due to the non-convex nature of the cost-to-go function. With incremental methods, such as backpropagation through time, calculus of variations and model-predictive control, the danger of non-convexity of the cost-to-go function with respect to the control is much less than with respect to the critic parameters when the sampling times are small. Therefore, getting the critic right has priority. But with larger sampling times, when the control represents a more complex plan, non-convexity becomes more serious.
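The critic-then-actor training cycle described in this abstract can be sketched on a scalar LQR with hypothetical numbers: the critic weight p (for V(x) = p·x²) is retrained to convergence under the current policy u = −k·x, then the actor gain k takes one gradient-descent step on the one-step cost plus critic estimate. This is a minimal stand-in for the thesis's neural-network modules, not its actual algorithm.

```python
A, B, Q, R = 0.9, 0.5, 1.0, 1.0     # hypothetical plant and cost weights
p, k = 0.0, 0.0                     # critic weight and actor gain

for cycle in range(300):
    # Critic: fixed-point iteration of the Bellman equation for the current
    # (stabilizing) policy: p = Q + R*k^2 + p*(A - B*k)^2.
    for _ in range(200):
        p = Q + R * k * k + p * (A - B * k) ** 2
    # Actor: gradient of [Q + R*k^2 + p*(A - B*k)^2] with respect to k,
    # i.e. a policy-improvement step done by gradient descent.
    grad = 2 * R * k - 2 * p * B * (A - B * k)
    k -= 0.05 * grad
print(round(k, 4), round(p, 4))
```

At convergence the gradient vanishes, which recovers the LQR optimality condition R·k = p·B·(A − B·k); on this problem the alternation settles in well under the 300 outer cycles allowed.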