Regret Bounds for the Adaptive Control of Linear Quadratic Systems

This video was recorded at the 24th Annual Conference on Learning Theory (COLT), Budapest 2011. We study the average-cost Linear Quadratic (LQ) problem with unknown model parameters, known in the control community as the adaptive control problem. We design an algorithm and prove that its regret up to time T is O(√T), apart from logarithmic factors. Unlike many classical approaches, which use a forced-exploration scheme to gather enough exploratory information for parameter estimation, we construct a high-probability confidence set around the model parameters and design an algorithm that plays optimistically with respect to this confidence set. The construction of the confidence set is based on new results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm. To the best of our knowledge, this is the first time that a regret bound has been derived for the LQ problem.
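To make the notion of regret concrete, the following minimal sketch computes the benchmark that regret is measured against: the optimal average cost of an LQ system whose parameters are known. It is not the algorithm from the talk. It assumes a scalar system x' = a·x + b·u + w with per-step cost q·x² + r·u², and the function names (`riccati_scalar`, `optimal_average_cost`, `regret`) are hypothetical, chosen only for illustration.

```python
def riccati_scalar(a, b, q, r, iters=1000, tol=1e-12):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point.

    Assumed model (illustrative only): x' = a*x + b*u + w, cost q*x^2 + r*u^2.
    Returns (p, k), where p solves the Riccati equation and u = -k*x is the
    optimal linear feedback when a and b are known.
    """
    p = q
    for _ in range(iters):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = a * b * p / (r + b * b * p)
    return p, k


def optimal_average_cost(a, b, q, r, noise_var):
    """Optimal long-run average cost J* = p * Var(w) for the scalar system."""
    p, _ = riccati_scalar(a, b, q, r)
    return p * noise_var


def regret(costs, j_star):
    """Regret up to time T: total incurred cost minus T times the optimum J*."""
    return sum(costs) - len(costs) * j_star


if __name__ == "__main__":
    p, k = riccati_scalar(1.0, 1.0, 1.0, 1.0)
    print("Riccati solution p:", p)
    print("Feedback gain k:", k)
    print("Optimal average cost (unit noise variance):",
          optimal_average_cost(1.0, 1.0, 1.0, 1.0, 1.0))
```

An adaptive controller does not know a and b, so its per-step costs exceed J* on average; the O(√T) bound in the abstract says that the accumulated excess, as returned by `regret`, grows no faster than √T up to logarithmic factors.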
