Bilevel optimization is a fundamental tool in hierarchical decision-making
and has been widely applied to machine learning tasks such as hyperparameter
tuning, meta-learning, and continual learning. While significant progress has
been made in bilevel optimization, existing methods predominantly focus on the
{nonconvex-strongly convex, or the} nonconvex-PL settings, leaving the more
general nonconvex-nonconvex framework underexplored. In questo documento, we address
this gap by developing an efficient gradient-based method inspired by the
recently proposed Relaxed Gradient Flow (RXGF) framework with a continuous-time
dynamic. In particular, we introduce a discretized variant of RXGF and
formulate convex quadratic program subproblems with closed-form solutions. Noi
provide a rigorous convergence analysis, demonstrating that under the existence
of a KKT point and a regularity assumption {(lower-level gradient PL
assumption)}, our method achieves an iteration complexity of
$\mathcal{O}(1/\epsilon^{1.5})$ in terms of the squared norm of the KKT
residual for the reformulated problem. Moreover, even in the absence of the
regularity assumption, we establish an iteration complexity of
$\mathcal{O}(1/\epsilon^{3})$ for the same metric. Through extensive numerical
experiments on convex and nonconvex synthetic benchmarks and a hyper-data
cleaning task, we illustrate the efficiency and scalability of our approach.
Questo articolo esplora i giri e le loro implicazioni.
Scarica PDF:



