Voice activity detection in the DFT domain based on a parametric noise model
- Author(s)
- Colin Breithaupt (Ruhr-Universität Bochum)
- Rainer Martin (Ruhr-Universität Bochum)
- Topics
- Voice activity detection and double-talk detection
Abstract
We present a robust voice activity detection (VAD) algorithm based on the statistics of discrete Fourier transform (DFT) coefficients derived from short signal segments. The algorithm uses a common parametric noise probability density function (PDF) in all frequency bins. The noise model is based on a Rayleigh inverse Gaussian distribution that is adapted to the statistics of the noise during speech absence. As only the current and past signal frames are analysed, the detection is causal and introduces no additional delay. A framework for protecting low-energy syllables at the end of utterances is also described.
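
The abstract does not specify how low-energy syllables at the end of utterances are protected. One common way to get this behaviour is a hangover counter, which keeps the speech flag raised for a few frames after the raw detector last fired. The sketch below illustrates that generic idea only and is not the framework described in the paper; the function name `apply_hangover` and the parameter `hangover_frames` are hypothetical. It uses only current and past decisions, so it stays causal.

```python
def apply_hangover(raw_decisions, hangover_frames=3):
    """Hold the speech flag for up to `hangover_frames` frames after
    the last raw speech decision (assumed generic hangover scheme)."""
    smoothed = []
    remaining = 0  # hangover frames still available
    for is_speech in raw_decisions:
        if is_speech:
            # Raw detector fired: flag speech and re-arm the hangover.
            remaining = hangover_frames
            smoothed.append(True)
        elif remaining > 0:
            # Raw detector is silent, but we are still inside the hangover.
            remaining -= 1
            smoothed.append(True)
        else:
            smoothed.append(False)
    return smoothed


# Per-frame raw decisions: speech in frames 1 and 2 only.
raw = [False, True, True, False, False, False, False]
print(apply_hangover(raw, hangover_frames=2))
```

A longer hangover protects more of a weak utterance ending but also labels more noise-only frames as speech, so the value is a tuning trade-off.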