RWTH Aachen University, Institute for Communication Systems and Data Processing

Noise Reduction – Spectral Weighting

Spectral weighting means that different spectral regions of the noisy signal (speech plus noise) are attenuated by different factors. The aim is an audio signal that contains less noise than the original one. Besides keeping the distortion of the original speech minimal, it is also important that the residual noise, i.e. the noise remaining in the processed signal, does not sound unnatural.

The spectral weighting is usually performed in a transform domain, e.g. the frequency domain. A common transform is the Fourier transform, which provides an equidistant frequency resolution.
Assuming that the speech signal s(k) and the noise signal n(k) are additive, the (microphone) input signal x(k) is

x(k) = s(k) + n(k).

After segmentation and windowing this equation leads to

X(f) = S(f) + N(f)

in the frequency domain. The actual spectral weighting is performed by multiplying the spectrum X(f) by a function G(f), which we call the weighting function or weighting rule (see next page for more details). The result Y(f) is then given by

Y(f) = X(f) · G(f).
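
The two frequency-domain relations above can be checked numerically for a single frame. The following is a minimal sketch using NumPy; the frame length, the sine wave standing in for speech, the Hann window and the linearly decaying gain are all illustrative assumptions, not values taken from this page.

```python
import numpy as np

rng = np.random.default_rng(0)
frame_len = 256                            # assumed frame length in samples
k = np.arange(frame_len)

s = np.sin(2 * np.pi * 0.05 * k)           # stand-in for one speech frame s(k)
n = 0.3 * rng.standard_normal(frame_len)   # stand-in for one noise frame n(k)
x = s + n                                  # x(k) = s(k) + n(k)

window = np.hanning(frame_len)             # assumed analysis window
S = np.fft.rfft(window * s)
N = np.fft.rfft(window * n)
X = np.fft.rfft(window * x)

# Windowing and the DFT are both linear, so the additive model carries over.
additive_in_frequency = np.allclose(X, S + N)

# Hypothetical weighting function: one real-valued gain in [0, 1] per bin.
G = np.linspace(1.0, 0.2, X.shape[0])

# Spectral weighting is a bin-by-bin product, not a convolution.
Y = X * G
```

Because G is real-valued here, only the magnitude of each bin is changed and the phase of the noisy input is kept.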

The weighting function G(f) is usually a function of the spectrum X(f) and of the noise power spectral density (PSD) Rnn(f). Thus, calculating G(f) requires an estimate of the noise that is to be reduced. Basically, two methods exist for estimating the noise:

  • Voice Activity Detector (VAD): By using voice activity detectors, the average noise magnitude spectrum or power spectral density is estimated during speech pauses.
  • Minimum Statistics [Martin-01]: By considering the different temporal characteristics of the speech and noise power density spectra, it is possible to obtain a noise magnitude or PSD estimate by tracking spectral minima. In this way slow changes in the noise spectrum can be followed even if the speaker is active.
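
Both estimation ideas can be illustrated with a few lines of NumPy. This is a simplified sketch under stated assumptions: the VAD decisions are taken as given (no detector is implemented), and the minimum tracker only smooths the power recursively and takes the minimum over a sliding window. The refinements of the method in [Martin-01], such as compensating the bias of the minimum, are omitted. The function names and the parameters `smoothing` and `search_frames` are hypothetical.

```python
import numpy as np

def noise_psd_from_pauses(power_frames, speech_active):
    """Average the power spectra of all frames flagged as speech pauses."""
    return power_frames[~speech_active].mean(axis=0)

def noise_psd_from_minima(power_frames, smoothing=0.8, search_frames=50):
    """Per bin, track the minimum of the recursively smoothed power."""
    smoothed = np.empty_like(power_frames)
    smoothed[0] = power_frames[0]
    for m in range(1, len(power_frames)):
        smoothed[m] = (smoothing * smoothed[m - 1]
                       + (1.0 - smoothing) * power_frames[m])

    estimate = np.empty_like(power_frames)
    for m in range(len(power_frames)):
        start = max(0, m - search_frames + 1)
        # Minimum over the most recent search_frames smoothed values.
        estimate[m] = smoothed[start:m + 1].min(axis=0)
    return estimate

# Synthetic test data: noise power of mean 1 plus two short speech bursts.
rng = np.random.default_rng(1)
num_frames, num_bins = 300, 8
power_frames = rng.exponential(scale=1.0, size=(num_frames, num_bins))
speech_active = np.zeros(num_frames, dtype=bool)
speech_active[100:120] = True
speech_active[150:170] = True
power_frames[speech_active] += 50.0        # crude stand-in for speech energy

vad_estimate = noise_psd_from_pauses(power_frames, speech_active)
min_estimate = noise_psd_from_minima(power_frames)
```

In this toy setup the minimum tracker stays near the noise level during the bursts because the search window (50 frames) is longer than each burst (20 frames), so it always contains some speech-free frames. The raw minimum systematically underestimates the mean noise power, which is why a practical implementation needs a correction step.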

Finally, the output signal y(k) of the system is obtained by transforming Y(f) back into the time domain and applying overlap-add. The complete system is depicted in the following block diagram.
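
The processing chain (segmentation, windowing, transform, weighting, inverse transform, overlap-add) can also be sketched end to end. Everything specific in the code below is an assumption made for illustration: the function name `reduce_noise`, the frame length, the periodic Hann window with 50 % overlap, the noise PSD taken from the first frames (assumed to be speech-free, as a stand-in for a VAD), and the Wiener-like gain with a lower limit. The weighting rules actually discussed on the next page may differ.

```python
import numpy as np

def reduce_noise(x, frame_len=256, noise_frames=10, gain_floor=0.1):
    """Spectral weighting with overlap-add (illustrative sketch)."""
    hop = frame_len // 2
    # Periodic Hann window: shifted copies add up to 1 at 50 % overlap.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
    num_frames = (len(x) - frame_len) // hop + 1

    # Segmentation, windowing and transform: X(f) for every frame.
    spectra = np.array([
        np.fft.rfft(window * x[m * hop:m * hop + frame_len])
        for m in range(num_frames)
    ])

    # Noise PSD estimate: assumes the first noise_frames contain no speech.
    R_nn = np.mean(np.abs(spectra[:noise_frames]) ** 2, axis=0)

    y = np.zeros(len(x))
    for m, X in enumerate(spectra):
        noisy_power = np.maximum(np.abs(X) ** 2, 1e-12)
        # Assumed Wiener-like rule, limited to the range [gain_floor, 1].
        G = np.clip(1.0 - R_nn / noisy_power, gain_floor, 1.0)
        # Y(f) = X(f) * G(f), back to the time domain, then overlap-add.
        y[m * hop:m * hop + frame_len] += np.fft.irfft(G * X, n=frame_len)
    return y

# Usage: a tone that starts after 0.5 s of noise only (8 kHz sampling assumed).
rng = np.random.default_rng(2)
k = np.arange(16000)
s = np.where(k >= 4000, np.sin(2 * np.pi * 440 * k / 8000), 0.0)
x = s + 0.3 * rng.standard_normal(len(k))
y = reduce_noise(x)
```

The lower limit `gain_floor` keeps a small amount of the original noise in every bin, which is one simple way to address the requirement that the residual noise should not sound unnatural. Setting `gain_floor=1.0` forces G(f) = 1 and returns the input unchanged except for the first and last half frame, which is a convenient check of the overlap-add stage.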