Linear Interpolation of Spectrotemporal Excitation Pattern Representations for Automatic Speech Recognition in the Presence of Noise
by Adriana Stan
Abstract:
This article studies new methods for improving the recognition capabilities of automatic speech recognition systems in the presence of noise. Instead of modifying complex recognition models, the study aims to enhance the reliability of the input data by processing the acoustic representations of speech. One of these representations, the SpectroTemporal Excitation Pattern (STEP), is used in recognition systems with missing or unreliable data. One idea behind this study was to increase the glimpsing areas in the STEP representations. Because the glimpsing algorithm requires prior knowledge of the noise, another idea was to estimate the noise characteristics and base the determination of the glimpsing areas on these estimates. Preliminary tests were conducted with an HMM recognition system; a full evaluation will be the object of a future study.
Reference:
Adriana Stan, "Linear Interpolation of Spectrotemporal Excitation Pattern Representations for Automatic Speech Recognition in the Presence of Noise", In Proceedings of the 5th Conference on Speech Technology and Human-Computer Dialogue, Constanta, Romania, 2009.
Bibtex Entry:
@inproceedings{SPED09,
  author = {Adriana Stan},
  title =  {{Linear Interpolation of Spectrotemporal Excitation Pattern Representations
                    for Automatic Speech Recognition in the Presence of Noise}},
  abstract = {This article studies new methods for improving the 
                   recognition capabilities of automatic speech recognition 
                   systems in the presence of noise. Instead of modifying 
                   complex recognition models, the study aims to enhance 
                   the reliability of the input data by processing the 
                   acoustic representations of speech. One of these 
                   representations, the SpectroTemporal Excitation Pattern 
                   (STEP), is used in recognition systems with missing or 
                   unreliable data. One idea behind this study was to 
                   increase the glimpsing areas in the STEP representations. 
                   Because the glimpsing algorithm requires prior knowledge 
                   of the noise, another idea was to estimate the noise 
                   characteristics and base the determination of the 
                   glimpsing areas on these estimates. Preliminary tests 
                   were conducted with an HMM recognition system; a full 
                   evaluation will be the object of a future study.},
  booktitle = {Proceedings of the $5^{th}$ Conference on Speech Technology and Human-Computer Dialogue},
  year = 2009,
  address = {Constanta, Romania}
}