# Discrete Fourier Transform

**Python.** This is my DFT implementation to test a wav file. It only accepts mono wav files with 44100 Hz sample rate. I used 32 floating point (with max 23 mantissa) in the range -1 and +1 for normalisation.

I chose 2048 for N points. DFT’s computational complexity is O(N^2) while FFT’s computational complexity is O(NlogN), so it takes very long time when N point is chosen too big. It is acceptable till 8192. After that, it takes time. There is also time starting variable, so it can be start anywhere from 0. I did not use windowing with zero padding at this implementation because it was a direct application of DFT:

(1)

where k = 0, 1, 2,…, N – 1

The implementation gives magnitude spectra (one in normal one in dB) and phase. It is easier to see to the peaks (fundamental frequency and harmonics) in the normal spectrum, that’s why I added the normal magnitude spectrum to the figure as well. Since magnitude spectrum is symmetric, it would be better to take the first halves of the absolute value of spectra. Finally, to show the results in frequency range, frequency resolution (sample rate/N) was applied to the plots.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 |
import os import numpy as np import matplotlib.pyplot as plt from scipy.io.wavfile import read INT16_FAC = (2**15)-1 INT32_FAC = (2**31)-1 INT64_FAC = (2**63)-1 norm_fact = {'int16':INT16_FAC, 'int32':INT32_FAC, 'int64':INT64_FAC,'float32':1.0} def DFT(filename): #Reading a wav file and converting it to a normalized floating point array #fs: sampling rate of the wav file, x: integer point array if (os.path.isfile(filename) == False): # error if it is wrong input file raise ValueError("Input file is wrong") fs, x = read(filename) if (len(x.shape) !=1): # error if it is not mono raise ValueError("Audio file should be mono") if (fs !=44100): # error if it is not 44100 sample rate raise ValueError("Sampling rate of input sound should be 44100") #Converting the wav file into floating point 32 in range of -1 to 1 #Computational complexity of DFT is O(N^2), N should not be high x = np.float32(x)/norm_fact[x.dtype.name] time = 0.5 #starting time frame of the signal N = 2048 #frame size for DFT to apply, only even numbers x = x[int(time*fs):int(time*fs)+N] #input sequence for DFT without windowing X = np.array([]) for k in range(N): s = np.exp(-1j * 2 * np.pi * k * np.arange(N)/ N) X = np.append(X, sum(x*s)) mX = abs(X) #amplitude mXdb = 20*np.log(mX) #amplitude in dB pX = np.unwrap(np.angle(X)) #phase #Frequency resolution would be applied to x axis for amplitude and phase #Because of amplitude is symmetric, first halves will be used df = float(fs/N) #frequency resolution plt.subplot(311) plt.plot(df*np.arange(N/2), mX[:len(mX)/2], 'b') plt.axis([0, fs/2, 0, max(mX)+5]) plt.title('magnitude spectra and phase') plt.ylabel('amplitude') plt.subplot(312) plt.plot(df*np.arange(N/2), mXdb[:N/2], 'b') plt.axis([0, fs/2, min(mXdb[:N/2])-10, max(mXdb[:N/2])+15]) plt.ylabel('amplitude (dB)') plt.subplot(313) plt.plot(df*np.arange(N/2), pX[:N/2], 'b') plt.axis([0, fs/2, min(pX[:N/2])-10, max(pX[:N/2])+10]) plt.xlabel('frequency (Hz)') plt.ylabel('phase (radians)') plt.show() return |

I used a short frame of soprano singing at E4 to test it. It is clearly to see peaks in both spectrum since it was a single note singing. The first peak is fundamental frequency. It is approximately 336 Hz. However, she was singing in vibrato style, plus it depends on choosing starting time and N values. Therefore, it is a good idea to check its harmonics and measure their average values. For example, first harmonic/2, second harmonic/3, and third harmonic/4. Then, measuring the final average value for all of them can approximately give the correct fundamental frequency she was singing.

First harmonic is 650 Hz, second harmonic is 987 Hz, and third harmonic is 1323 Hz. Average values are approximately 325, 329, and 331 Hz in order. Final average value for 4 frequencies is 330.25 Hz. E4 frequency is exactly 329.63 Hz. As a result, it is a pretty close approximation. I would finally say that she is a well trained (opera) singer.

References:

[1] Mathematics of the Discrete Fourier Transform, Julius O. Smith III [2] The Scientist and Engineer’s Guide toDigital Signal Processing, Steven W. Smith [3] The Frequency Domain, www.alwayslearn.com