Exploration exploitation in Go: UCT for Monte-Carlo Go

This video was recorded at the NIPS Workshop on On-line Trading of Exploration and Exploitation, Whistler, 2006. Trading exploration and exploitation plays a key role in a number of learning tasks. For example, the bandit problem provides perhaps the simplest case in which we must trade off pulling the arm that appears most advantageous against experimenting with arms for which we do not have accurate information. Similar issues arise in learning problems where the information received depends on the choices made by the learner. Learning studies have frequently concentrated on the final performance of the learned system rather than on the errors made during the learning process. For example, reinforcement learning has traditionally been concerned with showing convergence to an...
