We study the problem of simultaneous system identification and model predictive
control of nonlinear systems. In particular, we provide an algorithm for systems
with unknown residual dynamics that can be expressed by Koopman operators. Such
residual dynamics can model external disturbances and modeling errors, such as
wind and wave disturbances to aerial and marine vehicles, or inaccurate model
parameters. The algorithm has finite-time near-optimality guarantees and
asymptotically converges to the optimal non-causal controller. Specifically,
the algorithm enjoys sublinear \textit{dynamic regret}, defined herein as the
suboptimality against an optimal clairvoyant controller that knows how the
unknown dynamics will adapt to its states and actions. To this end, we assume
the algorithm is given Koopman observable functions such that the unknown
dynamics can be approximated by a linear dynamical system. Then, it employs
model predictive control based on the current learned model of the unknown
residual dynamics. This model is updated online using least squares in a
self-supervised manner based on the data collected while controlling the
system. We validate our algorithm in physics-based simulations of a cart-pole
system, where the goal is to keep the pole upright despite inaccurate model parameters.
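
The online least-squares update described above can be illustrated with a minimal sketch. It is not the authors' implementation: the observable dictionary `lift`, the toy nominal model `nominal_step`, the synthetic `true_residual`, and the ridge regularizer `reg` are all hypothetical choices made for illustration. The sketch assumes the residual is measured as the observed next state minus the nominal prediction, which is what makes the update self-supervised; the model predictive control step that would use `predict` is omitted.

```python
import numpy as np


def lift(x, u):
    """Hypothetical observable functions (a small hand-picked dictionary)."""
    return np.array([1.0, x[0], x[1], np.sin(x[0]), u[0]])


def nominal_step(x, u, dt=0.1):
    """Toy nominal model (assumption): a discretized double integrator."""
    return x + dt * np.array([x[1], u[0]])


def true_residual(x, u):
    """Synthetic unknown residual, used here only to generate data."""
    return np.array([0.0, 0.5 * np.sin(x[0]) - 0.2 * u[0]])


class OnlineResidualModel:
    """Regularized least-squares fit of residual ~= W.T @ lift(x, u)."""

    def __init__(self, n_features, n_outputs, reg=1e-3):
        self.G = reg * np.eye(n_features)           # regularized Gram matrix
        self.B = np.zeros((n_features, n_outputs))  # feature-label correlations
        self.W = np.zeros((n_features, n_outputs))  # current weight estimate

    def update(self, x, u, residual):
        """Add one data point and re-solve the normal equations."""
        z = lift(x, u)
        self.G += np.outer(z, z)
        self.B += np.outer(z, residual)
        self.W = np.linalg.solve(self.G, self.B)

    def predict(self, x, u):
        return self.W.T @ lift(x, u)


rng = np.random.default_rng(0)
model = OnlineResidualModel(n_features=5, n_outputs=2)

for _ in range(200):
    x = rng.uniform(-2.0, 2.0, size=2)
    u = rng.uniform(-1.0, 1.0, size=1)
    # Observed transition of the "real" system: nominal model plus residual.
    x_next = nominal_step(x, u) + true_residual(x, u)
    # Self-supervised label: what the nominal model failed to explain.
    label = x_next - nominal_step(x, u)
    model.update(x, u, label)

# Check the learned residual at a held-out state-input pair.
x_test, u_test = np.array([0.7, -0.3]), np.array([0.4])
err = np.max(np.abs(model.predict(x_test, u_test) - true_residual(x_test, u_test)))
```

In this toy setting the synthetic residual lies in the span of the chosen observables, so the estimate converges quickly. With a poorly chosen dictionary the fit would only approximate the residual.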
arXiv:2504.15805v1