Material Detail

Information Geometry

Information Geometry

This video was recorded at Machine Learning Summer School (MLSS), Chicago 2005. This tutorial will focus on entropy, exponential families, and information projection. We'll start by seeing the sense in which entropy is the only reasonable definition of randomness. We will then use entropy to motivate exponential families of distributions — which include the ubiquitous Gaussian, Poisson, and Binomial distributions, but also very general graphical models. The task of fitting such a distribution to data is a convex optimization problem with a geometric interpretation as an "information projection": the projection of a prior distribution onto a linear subspace (defined by the data) so as to minimize a particular information-theoretic distance measure. This projection operation, which is more familiar in other guises, is a core optimization task in machine learning and statistics. We'll study the geometry of this problem and discuss two popular iterative algorithms for it.


  • User Rating
  • Comments
  • Learning Exercises
  • Bookmark Collections
  • Course ePortfolios
  • Accessibility Info

More about this material


Log in to participate in the discussions or sign up if you are not already a MERLOT member.