Material Detail

Toward the understanding of partial-monitoring games

Toward the understanding of partial-monitoring games

This video was recorded at Workshop on On‐lineTrading of Exploration and Exploitation 2, Washington 2011. Partial monitoring games form a common ground for problems such as learning with expert advice, the multi-armed bandit problem, dynamic pricing, the dark pool problem, label efficient prediction, channel allocation in cognitive radio, linear and convex optimization with various feedbacks. How hard is to learn to play well in a partial monitoring game can be characterized by the minimax growth rate of the learner's regret. It is well known that some of these games are harder, while some others are easier when it comes to learning, depending on how much information one receives at what cost: Some actions in a game might give more information for the learner but might be pricier, while some might give less selective (or no), but be cheaper. When designing a strategy, it should be clear that one should take into account the global information-cost structure of the game. The question asked in this talk is how? What are the good learning strategies? That is, given the structure of a game, what is the corresponding minimax regret and what algorithm achieves it? In this talk I will review recent progress toward answering this question, as well as some open problems.


  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material


Log in to participate in the discussions or sign up if you are not already a MERLOT member.