Unsupervised Training of an HMM-based Speech Recognizer for Topic Classification

This video was recorded at the Center for Language and Speech Processing (CLSP) Seminar Series. We address the problem of topic classification of speech when no transcriptions of the speech corpus of interest are available. Our approach learns about the corpus incrementally: it starts with adaptive segmentation of the speech, proceeds to the generation of discovered acoustic units and a segmental recognizer for these units, and ends with an initial tokenization of the speech that is used to train an HMM speech recognizer. The recognizer trained is BBN's Byblos system. We discuss the performance of this system and also consider the case in which a small amount of transcribed data is available.
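
The first stages of the pipeline described above (segmentation, unit discovery, tokenization) can be illustrated with a toy sketch. This is not the method used in the talk or in Byblos: a simple frame-distance threshold stands in for adaptive segmentation, and a small k-means over segment means stands in for acoustic unit discovery. All function names, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
from math import dist

def segment(frames, threshold):
    """Split frames wherever consecutive frames differ by more than threshold.
    Assumption: a stand-in for adaptive segmentation, not the talk's method."""
    if not frames:
        return []
    boundaries = [0]
    for i in range(1, len(frames)):
        if dist(frames[i - 1], frames[i]) > threshold:
            boundaries.append(i)
    boundaries.append(len(frames))
    return [frames[a:b] for a, b in zip(boundaries, boundaries[1:])]

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(points, k, iterations=10):
    """Tiny deterministic k-means: centroids start at the first k points.
    Assumption: a stand-in for acoustic unit discovery."""
    k = min(k, len(points))
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels

def tokenize(frames, threshold, k):
    """Return one discovered-unit label per segment."""
    segments = segment(frames, threshold)
    means = [mean_vector(s) for s in segments]
    return kmeans(means, k)

# Hypothetical 2-D "feature frames": two similar regions around a distinct one.
frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.0, 0.1], [0.1, 0.1]]
print(tokenize(frames, threshold=1.0, k=2))  # prints [0, 1, 0]
```

In a full system along the lines of the abstract, a label sequence like this would serve as the initial tokenization from which an HMM recognizer is trained; that training step is not shown here.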
