Material Detail

Multi-stream modeling with applications in speech and multimodal processing

This video was recorded at the Machine Learning Workshop, Sheffield 2004. After a brief discussion of the problems arising from the processing and modeling of multi-stream (multi-channel, multi-sensor) signals, we will discuss a few statistical structures (such as the multi-stream HMM and the asynchronous HMM) that can accommodate multiple, possibly asynchronous, observation streams, which may also have different frame rates. Indeed, it will be shown on several speech recognition and multimodal fusion tasks that it can sometimes be beneficial to "desynchronize" the streams in order to maximize their joint likelihood. Different applications in speech recognition, such as multi-band and multi-stream speech processing, will be discussed. Finally, multimodal applications that benefit significantly from this multi-stream paradigm will also be discussed, including audio-visual speech recognition and the modeling of human interaction in meetings (by modeling the joint behaviours of participants through multiple audio and visual features).

Keywords:: videolectures, ocwc, oec

Disciplines:

Science and Technology / Computer Science

More about this material

Material Type:: Presentation
Date Added to MERLOT:: February 10, 2015
Date Modified in MERLOT:: February 10, 2015
Author:: Herve Bourlard, IDIAP Research Institute
Submitter:: The Open Education Consortium
Primary Audience:: College General Ed, College Lower Division, College Upper Division
Technical Format:: Video

Mobile Compatibility:: Not specified at this time
Language:: English
Cost Involved:: No
Source Code Available:: No
Creative Commons:: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States license