SNR Maximization and Wiener Filtering

In digital audio processing, one of the most common tasks is reducing background noise while keeping the original signal as clear as possible. This balance is typically expressed as maximizing the Signal-to-Noise Ratio (SNR) — a measure of how dominant the clean signal is compared to unwanted noise.

In this post, I will walk through a practical example in Python, applying both a simple time-domain Wiener filter and an STFT-based Wiener filter to maximize SNR. In decibels, SNR is defined as:

(1) $\begin{equation*} \text{SNR} = 10 \log_{10} \frac{\sum_n \text{clean}[n]^2}{\sum_n \left(\text{clean}[n] - \text{noisy}[n]\right)^2} \end{equation*}$

A higher SNR means the noise level is lower relative to the signal — in other words, a cleaner output. In practical denoising systems, SNR can’t be directly maximized (since we don’t know the clean signal in real recordings). Instead, we estimate the noise and design a filter that minimizes its impact — this is where the Wiener filter shines.
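As a quick sanity check of equation (1), here is a minimal sketch on a synthetic signal where the clean reference is known. The 440 Hz tone, the noise level, the sample rate, and the helper name `snr_db` are assumptions chosen for illustration only.

```python
import numpy as np

def snr_db(clean, estimate):
    """SNR in dB of `estimate`, measured against a known `clean` signal."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_energy = np.sum(clean ** 2)
    error_energy = np.sum((clean - estimate) ** 2)
    return 10.0 * np.log10(signal_energy / error_energy)

# Synthetic test case (assumed values): a unit-amplitude tone plus white noise.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * rng.standard_normal(sr)

snr_noisy = snr_db(clean, noisy)
print(f"SNR of the noisy signal: {snr_noisy:.2f} dB")
```

With a tone power of 0.5 and a noise variance of 0.01, the printed value should land near 10 log10(50), or about 17 dB. This kind of measurement is only possible because the clean signal was constructed by hand.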

The Wiener filter is a statistical filter that aims to minimize the Mean Squared Error (MSE) between the clean and estimated signals. Mathematically, its frequency-domain form is:

(2) $\begin{equation*} H(f) = \frac{S(f)}{S(f) + N(f)} \end{equation*}$

where:

$S(f)$ = power spectrum of the clean signal (estimated)
$N(f)$ = power spectrum of the noise

Intuitively, it gives more weight to frequencies dominated by the clean signal and less weight to noisy regions. Because the gain always lies between 0 and 1, noisy bins are attenuated gradually instead of being gated off, which tends to introduce fewer harsh artifacts than hard thresholding.
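To make equation (2) concrete, the sketch below evaluates the gain for a handful of frequency bins. The power values are made up for illustration, and `wiener_gain` with its `eps` guard is a hypothetical helper, not part of any library.

```python
import numpy as np

def wiener_gain(signal_power, noise_power, eps=1e-12):
    """Per-bin Wiener gain H = S / (S + N); eps guards against 0/0."""
    signal_power = np.asarray(signal_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    return signal_power / (signal_power + noise_power + eps)

# Hypothetical power values for four frequency bins.
S = np.array([9.0, 1.0, 0.1, 0.0])  # clean-signal power per bin
N = np.array([1.0, 1.0, 0.9, 1.0])  # noise power per bin

H = wiener_gain(S, N)
print(np.round(H, 3))  # gains of about 0.9, 0.5, 0.1 and 0.0
```

The bin where the signal is nine times stronger than the noise passes almost unchanged, the bin with equal signal and noise power is halved in amplitude, and the bin with no signal is muted.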

For noise power, a simplified static noise estimation technique is used. It assumes the first 0.1 seconds of the recording contain only background noise and uses that segment to estimate noise power per frequency bin. While this assumption works well for many test signals, in real-world audio you might use adaptive noise tracking or voice activity detection (VAD) to locate noise-only regions dynamically.
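The static estimate described above can be sketched in isolation as follows. The function name `estimate_noise_power`, its default parameters, and the synthetic test signal (noise throughout, a 1 kHz tone starting at 0.2 s) are assumptions for illustration; the STFT settings mirror the ones used later in this post.

```python
import numpy as np
from scipy.signal import stft

def estimate_noise_power(x, sr, noise_seconds=0.1, nperseg=1024, noverlap=512):
    """Average per-bin power over the STFT frames in the leading noise-only segment."""
    _, _, Zxx = stft(x, fs=sr, nperseg=nperseg, noverlap=noverlap, window="hann")
    hop = nperseg - noverlap
    # Number of frames within the noise-only segment; keep at least one.
    n_frames = max(1, int(noise_seconds * sr / hop))
    power = np.abs(Zxx) ** 2
    return np.mean(power[:, :n_frames], axis=1, keepdims=True)

# Synthetic signal (assumed values): white noise everywhere, tone from 0.2 s on.
rng = np.random.default_rng(1)
sr = 16000
t = np.arange(sr) / sr
x = 0.05 * rng.standard_normal(sr)
x[t >= 0.2] += 0.5 * np.sin(2 * np.pi * 1000 * t[t >= 0.2])

noise_power = estimate_noise_power(x, sr)
print(noise_power.shape)  # (513, 1): one noise estimate per frequency bin
```

At 16 kHz with a hop of 512 samples, 0.1 seconds covers only three frames, so the estimate is fairly noisy. A longer noise-only segment or a smaller hop gives more frames to average over.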

Results:

In the script below, the noisy input itself serves as the reference, so the reported figure measures how closely each output tracks the input, not how much noise was removed. The simple Wiener filter gives a higher number on this measure because its output stays closer to the input, while the STFT-based Wiener filter often preserves frequency balance better and produces a cleaner, more natural sound. If the noise estimate and overlap parameters are well tuned, the STFT method can outperform the simpler one perceptually. Overall, Wiener filtering, especially in the STFT domain, offers a practical and effective method for SNR maximization in audio denoising.

import numpy as np
import soundfile as sf
import matplotlib.pyplot as plt
from scipy.signal import wiener, stft, istft

# --- Load audio ---
x, sr = sf.read('testMonoAudio.wav')

# --- Simple Wiener filter (time-domain) ---
wiener_output = wiener(x)

# --- Custom STFT-based Wiener filtering ---
f, t_stft, Zxx = stft(x, fs=sr, nperseg=1024, noverlap=512, window='hann')
power_spec = np.abs(Zxx)**2

# Estimate noise power from the first 0.1 s (assumed to be noise-only;
# a simplified, static noise estimate)
hop = 1024 - 512  # hop size = nperseg - noverlap
noise_frames = max(1, int(0.1 * sr / hop))  # at least one frame, otherwise the mean is NaN
noise_power = np.mean(power_spec[:, :noise_frames], axis=1, keepdims=True)

# Estimate clean signal power (total - noise)
signal_power = np.maximum(power_spec - noise_power, 1e-10)

# Wiener gain - clamp or smooth to avoid abrupt spectral changes
H = signal_power / (signal_power + noise_power)
H = np.clip(H, 0.05, 1.0)

# Apply gain
Y = H * Zxx

# Inverse STFT (same window parameters as the forward transform)
_, clean_est = istft(Y, fs=sr, nperseg=1024, noverlap=512, window='hann')

# --- SNR measurement function ---
def snr_db(reference, test):
    """Compute SNR between reference and test signals."""
    # Align length
    min_len = min(len(reference), len(test))
    reference = reference[:min_len]
    test = test[:min_len]
    ref_power = np.mean(reference**2)
    diff_power = np.mean((reference - test)**2)
    # Guard the denominator so identical signals do not divide by zero
    return 10 * np.log10(ref_power / (diff_power + 1e-12))

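# No clean reference is available, so the noisy input serves as the reference.
# The values below measure how far each output moved from the input, not true SNR.
snr_in = 0.0  # placeholder baseline for the input itself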
snr_wiener = snr_db(x, wiener_output)
snr_stft = snr_db(x, clean_est)

print("=== SNR RESULTS (dB) ===")
print(f"Input (reference): {snr_in:.2f} dB")
print(f"Simple Wiener:     {snr_wiener:.2f} dB")
print(f"STFT Wiener:       {snr_stft:.2f} dB")

# --- Save outputs ---
sf.write("input.wav", x, sr)
sf.write("wiener_clean_simple.wav", wiener_output, sr)
sf.write("wiener_clean_stft.wav", clean_est, sr)

# --- Plot for comparison (with SNR) ---
plt.figure(figsize=(10, 6))

plt.subplot(3, 1, 1)
plt.title(f"Input Signal (Reference) — SNR: {snr_in:.2f} dB")
plt.plot(x, color='gray')
plt.ylabel("Amplitude")

plt.subplot(3, 1, 2)
plt.title(f"Simple Wiener Filter Output — SNR: {snr_wiener:.2f} dB")
plt.plot(wiener_output, color='blue')
plt.ylabel("Amplitude")

plt.subplot(3, 1, 3)
plt.title(f"STFT Wiener Filter Output (SNR-Maximized) — SNR: {snr_stft:.2f} dB")
plt.plot(clean_est, color='green')
plt.ylabel("Amplitude")
plt.xlabel("Samples")

plt.tight_layout()
plt.show()
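Because the script above has no clean reference, its numbers cannot show whether SNR actually improved. A minimal sketch of how one might check that is to build a synthetic noisy signal from a known clean one and measure SNR against the clean signal before and after filtering. The test signal, the function names `stft_wiener` and `snr_db`, and all parameter values are assumptions for illustration; the filtering steps follow the same recipe as the script.

```python
import numpy as np
from scipy.signal import stft, istft

def snr_db(clean, estimate):
    """SNR in dB of `estimate` against a known clean reference."""
    n = min(len(clean), len(estimate))
    clean, estimate = clean[:n], estimate[:n]
    error_energy = np.sum((clean - estimate) ** 2) + 1e-12
    return 10.0 * np.log10(np.sum(clean ** 2) / error_energy)

def stft_wiener(x, sr, noise_seconds=0.1, nperseg=1024, noverlap=512, floor=0.05):
    """STFT-domain Wiener filter with a static noise estimate from the leading segment."""
    _, _, Zxx = stft(x, fs=sr, nperseg=nperseg, noverlap=noverlap, window="hann")
    power = np.abs(Zxx) ** 2
    hop = nperseg - noverlap
    n_noise = max(1, int(noise_seconds * sr / hop))
    noise_power = np.mean(power[:, :n_noise], axis=1, keepdims=True)
    # Clean-signal power by subtraction, floored to stay positive.
    signal_power = np.maximum(power - noise_power, 1e-10)
    gain = np.clip(signal_power / (signal_power + noise_power), floor, 1.0)
    _, y = istft(gain * Zxx, fs=sr, nperseg=nperseg, noverlap=noverlap, window="hann")
    return y[:len(x)]  # drop the padding added by the STFT

# Synthetic data (assumed values): tone starts at 0.3 s, white noise throughout.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(2 * sr) / sr
clean = 0.5 * np.sin(2 * np.pi * 440 * t) * (t >= 0.3)
noisy = clean + 0.1 * rng.standard_normal(len(t))

denoised = stft_wiener(noisy, sr)
snr_in = snr_db(clean, noisy)
snr_out = snr_db(clean, denoised)
print(f"Input SNR:  {snr_in:.2f} dB")
print(f"Output SNR: {snr_out:.2f} dB")
```

On this stationary white-noise example the output SNR should come out higher than the input SNR. Real recordings with non-stationary noise are harder, and the gain floor and noise segment length usually need tuning.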
