Télécom Paris


GET
EURASIP
HINDAWI
Voice activity detection in the DFT domain based on a parametric noise model
Author(s)
  • Colin Breithaupt (Ruhr-Universität Bochum)
  • Rainer Martin (Ruhr-Universität Bochum)
Topics
  • Voice activity detection and double-talk detection

Abstract
We present a robust voice activity detection (VAD) algorithm based on the statistics of discrete Fourier transform (DFT) coefficients computed from short signal segments. The algorithm uses a common parametric noise probability density function (PDF) in all frequency bins. The noise model is based on a Rayleigh inverse Gaussian distribution that is adapted to the noise statistics during speech absence. Because only the current and past signal frames are analysed, the detection is causal and introduces no additional delay. A framework for protecting low-energy syllables at the end of utterances is also described.
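
The general shape of such a detector can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no parameters for the Rayleigh inverse Gaussian model, so the sketch swaps in a plain Rayleigh magnitude model (complex Gaussian noise), under which the normalised bin power |X_k|^2 / sigma_k^2 has mean 1 during speech absence. The function name `causal_vad`, all parameter values, the assumption that the first frames contain only noise, and the simple hangover counter standing in for the end-of-utterance protection are hypothetical choices made for illustration.

```python
import numpy as np

def causal_vad(signal, frame_len=256, hop=128, threshold=3.0,
               noise_frames=10, smoothing=0.95, hangover=5):
    """Return one boolean speech decision per frame (illustrative sketch).

    Assumption: a plain Rayleigh noise model replaces the paper's
    Rayleigh inverse Gaussian model; all defaults are hypothetical.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    decisions = np.zeros(n_frames, dtype=bool)
    noise_power = None   # per-bin estimate of the noise power sigma_k^2
    hang = 0             # remaining hangover frames

    for i in range(n_frames):
        # Only the current frame is read, so the detector is causal.
        frame = signal[i * hop:i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2

        if i < noise_frames:
            # Assumption: the first frames are noise only; average them.
            if noise_power is None:
                noise_power = power.copy()
            else:
                noise_power = (noise_power * i + power) / (i + 1)
            continue

        # Mean normalised power over all bins; close to 1 for noise only.
        stat = np.mean(power / (noise_power + 1e-12))

        if stat > threshold:
            decisions[i] = True
            hang = hangover          # re-arm the hangover counter
        elif hang > 0:
            decisions[i] = True      # keep weak utterance endings
            hang -= 1
        else:
            # Adapt the noise model only during speech absence.
            noise_power = smoothing * noise_power + (1 - smoothing) * power

    return decisions
```

Calling `causal_vad(x)` on a one-dimensional NumPy array `x` returns one decision per frame. The hangover counter keeps the decision at "speech" for a few frames after the statistic drops below the threshold, which is one simple way to avoid clipping weak syllables at the end of an utterance.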

©2006 Télécom Paris/TSI
Publisher: Télécom Paris, 2006