Material Detail
Machine Learning in Acoustic Signal Processing
This video was recorded at Machine Learning Summer School (MLSS), Chicago 2009. This tutorial presents a framework for understanding and comparing applications of pattern recognition in acoustic signal processing. Representative applications will be delimited by two binary features: (1) regression vs. (2) classification (inferred variables are continuous vs. discrete), (A) instantaneous vs. (B) dynamic. (1. Regression) problems include imaging and sound source tracking using a device with unknown properties, and inverse problems, e.g., articulatory estimation from speech audio. (2. Classification) problems include, e.g., the detection of syllable onsets and offsets in a speech signal, and the classification of non-speech audio events. (A. Instantaneous) inference is performed using a universal approximator (neural network, Gaussian mixture, kernel regression), constrained or regularized, if necessary, to reduce generalization error (resulting in a support vector machine, shrunk net, pruned tree, or boosted classifier combination). (B. Dynamic) inference methods apply prior knowledge of state transition probabilities, either in the form of a regularization term (e.g., using Bayesian inference) or in the form of set constraints (e.g., using linear programming) or both; examples include speech-to-text transcription, acoustic-to-articulatory inversion using a switching Kalman filter, and computation of the query presence probability in an audio information retrieval task.
Quality
- User Rating
- Comments
- Learning Exercises
- Bookmark Collections
- Course ePortfolios
- Accessibility Info