Learning Shared and Separate Features of Two Related Data Sets using GPLVMs

This video was recorded at the NIPS Workshop on Learning from Multiple Sources, Whistler 2008.

Dual-source learning problems can be formulated as learning a joint representation of the data sources, where the shared information is represented in terms of a shared underlying process. However, the shared information is not always the only useful information: interesting aspects of the data need not be common to both data sets. Some useful features within one data set may not be present in the other, and vice versa; this complementary property motivates the use of multiple data sources over single data sources, each of which captures only one type of useful information. For instance, having two eyes (and two streams of visual data) allows us to gain a 3-D impression of the world. Stereo vision combines both shared features and features private to each data stream to form a coherent representation of the world: common shifted features can be used in disparity estimation to infer the depths of objects, while features seen in one view but not in the other, due to occlusions, can provide additional information about the scene.

In this work, we present a probabilistic generative framework for analysing two sets of data, where the structure of each data set is represented in terms of a shared and a private latent space. Explicitly modelling a private component for each data set avoids an oversimplified representation of the within-set variation, so that the between-set variation can be modelled more accurately, and gives insight into potentially interesting features particular to a data set.
Since the two data sets may have a complex (possibly nonlinear) relationship, we use nonparametric Bayesian techniques: we define Gaussian process priors over the functions from latent to data spaces, so that each data set is modelled as a Gaussian Process Latent Variable Model (GPLVM) [1], where the dependency structure is captured in terms of shared and private kernels.
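
To make the idea of shared and private kernels concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes an additive construction in which each data set's covariance is the sum of an RBF kernel on shared latent coordinates and an RBF kernel on that data set's private latent coordinates; the abstract does not specify the kernel form, so the additive structure, the RBF choice, and all names (`rbf_kernel`, `sample_two_views`, `noise`) are illustrative assumptions. The sketch only samples from such a prior and does not learn the latent spaces.

```python
import numpy as np


def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale ** 2)


def sample_two_views(n=50, d_shared=1, d_private=1, out_dims=(3, 4),
                     noise=1e-2, seed=0):
    """Sample two data sets whose GP covariances share one latent space.

    ASSUMPTION: covariance of each data set = shared kernel + private kernel
    + noise. This additive form is illustrative, not taken from the source.
    """
    rng = np.random.default_rng(seed)
    X_shared = rng.standard_normal((n, d_shared))   # latent points common to both sets
    Z1 = rng.standard_normal((n, d_private))        # private latent points, data set 1
    Z2 = rng.standard_normal((n, d_private))        # private latent points, data set 2

    K_shared = rbf_kernel(X_shared)
    K1 = K_shared + rbf_kernel(Z1) + noise * np.eye(n)
    K2 = K_shared + rbf_kernel(Z2) + noise * np.eye(n)

    # Each output dimension is an independent draw from N(0, K).
    Y1 = np.linalg.cholesky(K1) @ rng.standard_normal((n, out_dims[0]))
    Y2 = np.linalg.cholesky(K2) @ rng.standard_normal((n, out_dims[1]))
    return Y1, Y2, K1, K2


if __name__ == "__main__":
    Y1, Y2, K1, K2 = sample_two_views()
    print(Y1.shape, Y2.shape)
```

In this sketch the two data sets are related only through the shared latent coordinates, while the private kernels add variation specific to each set. Fitting a model of this kind would require optimising the latent coordinates and kernel hyperparameters against observed data, which is omitted here.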

