Machine Learning in Acoustic Signal Processing

This video was recorded at the Machine Learning Summer School (MLSS), Chicago 2009. This tutorial presents a framework for understanding and comparing applications of pattern recognition in acoustic signal processing. Representative applications are organized along two binary distinctions: (1) regression vs. (2) classification (the inferred variables are continuous or discrete, respectively), and (A) instantaneous vs. (B) dynamic inference. (1) Regression problems include imaging and sound source tracking using a device with unknown properties, and inverse problems such as articulatory estimation from speech audio. (2) Classification problems include the detection of syllable onsets and offsets in a speech signal and the classification of non-speech audio events. (A) Instantaneous inference is performed using a universal approximator (neural network, Gaussian mixture, kernel regression), constrained or regularized, if necessary, to reduce generalization error (resulting in a support vector machine, shrunk net, pruned tree, or boosted classifier combination). (B) Dynamic inference methods apply prior knowledge of state transition probabilities, either in the form of a regularization term (e.g., using Bayesian inference), in the form of set constraints (e.g., using linear programming), or both; examples include speech-to-text transcription, acoustic-to-articulatory inversion using a switching Kalman filter, and computation of the query presence probability in an audio information retrieval task.
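To make the instantaneous vs. dynamic distinction concrete, here is a minimal sketch in Python. It is not taken from the tutorial. It assumes a two-state classification task (state 0 = silence, state 1 = syllable, both labels hypothetical) and made-up per-frame likelihoods, a transition matrix, and a prior. The instantaneous decoder labels each frame independently; the dynamic decoder is a standard Viterbi search that also uses the state transition probabilities.

```python
import math

def framewise_decode(frame_probs):
    """Instantaneous inference: label each frame independently."""
    return [max(range(len(p)), key=lambda s: p[s]) for p in frame_probs]

def viterbi_decode(frame_probs, trans, prior):
    """Dynamic inference: most likely state sequence given transition probabilities."""
    n_states = len(prior)
    # Work in the log domain to avoid underflow on long sequences.
    score = [math.log(prior[s]) + math.log(frame_probs[0][s])
             for s in range(n_states)]
    backptrs = []
    for p in frame_probs[1:]:
        new_score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at this frame.
            best = max(range(n_states),
                       key=lambda r: score[r] + math.log(trans[r][s]))
            new_score.append(score[best] + math.log(trans[best][s])
                             + math.log(p[s]))
            ptr.append(best)
        score = new_score
        backptrs.append(ptr)
    # Trace the best path back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy data: per-frame likelihoods for (silence, syllable).
frame_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.4, 0.6],  # one noisy frame that weakly favors "syllable"
    [0.8, 0.2],
    [0.3, 0.7],
    [0.1, 0.9],
    [0.2, 0.8],
]
trans = [[0.8, 0.2], [0.2, 0.8]]  # assumed: states tend to persist
prior = [0.5, 0.5]                # assumed: uniform initial state

print(framewise_decode(frame_probs))              # [0, 0, 1, 0, 1, 1, 1]
print(viterbi_decode(frame_probs, trans, prior))  # [0, 0, 0, 0, 1, 1, 1]
```

In this toy case the framewise decoder reports a spurious one-frame syllable, while the transition prior lets the dynamic decoder place a single onset at the fifth frame.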