
Proceedings Paper

Psychoacoustic frequency scales versus frequency warping in scale cepstrum
Author(s): Srinivasan Umesh; Leon Cohen; Nenad M. Marinovic; Douglas J. Nelson

Paper Abstract

In this paper, we derive a frequency-warping function by analyzing speech data obtained from the TIMIT database. Numerous frequency scales have been proposed to date, based purely on psychoacoustic studies. Many speech recognition algorithms use such frequency scales for spectral analysis at the signal-processing front end. The motivation for using such psychoacoustic frequency scales is that, because they are based on the properties of human auditory perception, they may provide an accurate representation of the relevant spectral information in speech. Because the preference for one scale over another is ad hoc, and because the goal is to achieve better recognition, experiments are conducted to determine whether better recognition rates are indeed obtained using any one such scale. In this paper, we analyze actual speech data and present evidence of the kind of frequency warping that may be necessary to achieve speaker-independent recognition of vowels. This provides the motivation to use such frequency-warping functions in speech recognition. Surprisingly, the frequency warping obtained is similar to the Mel scale obtained from psychoacoustic studies. This suggests that the ear may be using such a frequency warping to remove extraneous speaker-specific information while identifying and recognizing phonemes.
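
For readers unfamiliar with the Mel scale mentioned in the abstract, the following is a minimal sketch of one widely used analytic approximation, m = 2595 log10(1 + f/700). It is not the warping function derived in the paper, which is not given on this page; the function names and the `warped_grid` helper are hypothetical and chosen only for illustration.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Map a frequency in Hz to mels using a common analytic approximation.

    Assumption: this is the 2595 * log10(1 + f / 700) form; it is not the
    warping function derived in the paper.
    """
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel: map mels back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def warped_grid(f_max_hz: float, n_points: int) -> list[float]:
    """Return n_points frequencies in Hz that are uniformly spaced in mels.

    Hypothetical helper: sampling a spectrum on this grid is one simple way
    to apply the warping before further analysis.
    """
    m_max = hz_to_mel(f_max_hz)
    return [mel_to_hz(m_max * i / (n_points - 1)) for i in range(n_points)]


if __name__ == "__main__":
    for f in warped_grid(8000.0, 9):
        print(f"{f:8.1f} Hz -> {hz_to_mel(f):7.1f} mel")
```

The resulting grid is dense at low frequencies and sparse at high frequencies, which is the qualitative behavior the abstract attributes to both the psychoacoustic scale and the data-derived warping.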

Paper Details

Date Published: 23 October 1996
PDF: 10 pages
Proc. SPIE 2825, Wavelet Applications in Signal and Image Processing IV, (23 October 1996); doi: 10.1117/12.255264
Author Affiliations
Srinivasan Umesh, Indian Institute of Technology (United States)
Leon Cohen, Hunter College/CUNY (United States)
Nenad M. Marinovic, Hunter College/CUNY (United States)
Douglas J. Nelson, Fort Meade/U.S. Dept. of Defense (United States)


Published in SPIE Proceedings Vol. 2825:
Wavelet Applications in Signal and Image Processing IV
Michael A. Unser; Akram Aldroubi; Andrew F. Laine, Editor(s)
