Melody Analyser (slow single line)
MATLAB. I developed a melody analyser that detects note frequencies together with their attack times. It only works with slow passages in which the notes are separable from each other. The algorithm is based on the signal's total power: it smooths the power curve, reshapes it into a binary signal of 1s and 0s by thresholding, and finds the attack time of each note from the transitions. It then cuts a sample segment for each note, which is passed to the FFT-based pitch calculation. It does not work well with fast or closely spaced notes because the power tail of the previous pitch carries over into the next one, so it fails to calculate the correct attack time and the related frequency.
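The thresholding idea, and the reason it breaks down for close notes, can be illustrated on synthetic data. The following is a minimal sketch, not the analyser itself: the exponentially decaying envelopes, the 10 ms frame spacing and the helper names `note` and `attacks` are assumptions made for illustration, while the 10% threshold mirrors the one used in the listing.

```matlab
% Hypothetical note envelope: zero before t0, then exponential decay with time constant tau
note = @(t, t0, tau) (t >= t0) .* exp(-(t - t0) / tau);

% Binarise at 10% of the maximum, then keep the rising edges (0 -> 1) as attack times
attacks = @(t, p) t(find(diff(double(p > 0.1 * max(p))) == 1) + 1);

t = (0:200) / 100;                                  % assumed frame times, 10 ms apart

% Two short notes far apart: the power falls below the threshold in between
separated = note(t, 0.2, 0.1) + note(t, 1.2, 0.1);

% Two long notes close together: the first tail is still loud when the second starts
crowded = note(t, 0.2, 0.5) + note(t, 0.4, 0.5);

attacksSeparated = attacks(t, separated);           % both onsets are found
attacksCrowded = attacks(t, crowded);               % the second onset is missed
```

In the crowded case the binary signal never returns to 0 between the two notes, so only one rising edge exists and the two notes are merged into a single segment.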
function result = melody_analyser
tic
[y, Fs] = audioread('piano.wav');            %input audio file (mono)
[~,~,T,P] = spectrogram(y,256,[],[],Fs);
totalpower = sum(P);

%SIGNAL SMOOTHING STAGE (GRAPH SMOOTHING)
%Applying First Signal Smoothing
smooth1 = totalpower(1);
for i = 2:2:length(T)-2
    new1 = (totalpower(i-1)+totalpower(i+1))/2;
    smooth1 = [smooth1, new1, new1];
end
if abs(length(T) - length(smooth1)) > 0
    for j = abs(length(T) - length(smooth1)):-1:1
        addition = length(T)+1-j;
        smooth1 = [smooth1, totalpower(addition)];
    end
end

%Applying Second Signal Smoothing
smooth2 = smooth1(1);
for i = 2:2:length(T)-2
    new2 = (smooth1(i-1)+smooth1(i+2))/2;
    smooth2 = [smooth2, new2, new2];
end
if abs(length(T) - length(smooth2)) > 0
    for j = abs(length(T) - length(smooth2)):-1:1
        addition = length(T)+1-j;
        smooth2 = [smooth2, smooth1(addition)];
    end
end

%RESHAPING SIGNAL STAGE
threshold = 0.1 * max(smooth2);              %deciding threshold
reshape = smooth2;
reshape(reshape <= threshold) = 0;           %replacing with zeros below threshold
reshape(reshape > 0) = 1;                    %replacing with ones above threshold

%Finding Attack Time Points
%Odd number elements in the list will be attack frame points
list = [];
for i = 1:length(reshape)-1
    if reshape(i) ~= reshape(i+1)            %a 0/1 transition between neighbouring frames
        list = [list, i+1];
    end
end
if isempty(list)                             %if it does not detect anything
    error('It is not a melody');
end
if mod(length(list), 2) ~= 0                 %a note still sounding at the end has no release point
    list = [list, length(reshape)];          %closing it at the last frame
end

%FREQUENCY DETECTION STAGE
attacktime = [];
frequency = [];
for k = 1:2:length(list)
    start = round(Fs * list(k) * T(1));      %data segment start (integer sample index)
    finish = round(Fs * list(k+1) * T(1));   %data segment finish (integer sample index)
    ysegment = y(start:finish);              %data segment of note
    attacktime = [attacktime, list(k)*T(1)]; %taking odd numbers
    frequency = [frequency, pitch_detection(ysegment)]; %calling nested function
end
attacktime = round(attacktime, 2);
result = [attacktime; frequency];            %Final result; 1. Attack Time, 2. Frequency

%PLOTTING STAGE
subplot(3,1,1); plot(T,totalpower);
subplot(3,1,2); plot(T,smooth2);
subplot(3,1,3); plot(T,reshape);

    %PITCH DETECTION FUNCTION
    function final_freq = pitch_detection(input)
        N = length(input);                   %FFT size for the segment itself
        Y = fft(input);                      %taking FFT of the signal
        X = Y';                              %transposing to a row vector (magnitude is unaffected)
        mX = abs(X);                         %symmetric magnitude
        mX = mX(1:floor(N/2));               %taking the first half (floor handles odd N)
        mX = 20*log10(mX);                   %magnitude in decibels

        %Calculating Stage - bands for fundamental and next 3 harmonics
        threshold = 0.7 * max(mX);           %threshold to filter mX
        fmX = mX;
        fmX(fmX < threshold) = 0;            %fmX = filtered magnitude array
        bandwidth = 0;
        for i = 1:length(fmX)                %finding the bandwidth
            if fmX(i) ~= 0                   %finding the first bin that reaches any value
                bandwidth = i - 1;
                if bandwidth < 2             %if it gets a value at the beginning
                    bandwidth = 2;
                end
                break
            end
        end
        base = zeros(1, floor(N/2));         %base to check frequency for each band
        flist = [];                          %temporary frequency list
        for j = 1:4
            start = floor((2*j-1)*bandwidth/2);
            finish = floor((2*j+1)*bandwidth/2);
            base(start:finish) = fmX(start:finish); %creating band to detect
            [~, freq_bin] = max(base);       %detecting frequency bin in each band
            %changing frequency bin to frequency
            %note: index 1 is the DC bin, so this reads Fs/(j*N) high
            freq = (Fs*freq_bin)/(j*N);
            flist = [flist, freq];           %creating temporary frequency list
            base = zeros(1, floor(N/2));     %preparing base for next band
        end

        %Consistency Checking Stage - of harmonics with the fundamental frequency
        %Harmonics were normalised by their ranks; 2, 3, and 4
        %Comparing absolute value of difference with 5% range to create final list
        frequencies = flist(1);              %fundamental frequency
        for k = 2:4
            if abs(flist(1) - flist(k)) <= 0.05 * flist(1)
                frequencies = [frequencies, flist(k)];
            end
        end

        %Final Stage - taking average value of the frequency list
        format long g
        final_freq = round(mean(frequencies), 2);
        if final_freq < 27.5 || final_freq > 13290
            error('Not detectable')
        end
    end %end of nested pitch detection function

toc
end %end of main function
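The pitch detection step turns an FFT bin index into a frequency with `Fs*freq_bin/(j*N)`. Because MATLAB indices start at 1 and index 1 holds the 0 Hz (DC) bin, index k corresponds to (k-1)*Fs/N. The sketch below checks this on a synthetic tone; the 8 kHz sample rate, the 440 Hz test tone and the variable names are assumptions chosen so that the tone falls exactly on a bin, and they are not taken from the piano recording.

```matlab
Fs = 8000;                                   % assumed sample rate for this example
N = 8000;                                    % one second of samples, so bins are 1 Hz apart
n = 0:N-1;
x = sin(2*pi*440*n/Fs);                      % synthetic 440 Hz test tone

mX = abs(fft(x));                            % magnitude spectrum
[~, peak_bin] = max(mX(1:floor(N/2)));       % 1-based index of the strongest bin

freq_zero_based = (peak_bin - 1) * Fs / N;   % treats index 1 as the 0 Hz bin
freq_one_based = peak_bin * Fs / N;          % treats index 1 as Fs/N, as in the listing
```

Here the peak lands on index 441, so the first conversion recovers 440 Hz and the second reads one bin width (Fs/N = 1 Hz) higher. For long note segments the bin width is small, so the offset in the listing's results is correspondingly small.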
For demonstration, I used a simple piano segment: a single melody line of 5 notes. The STFT parameters and the thresholds can be changed. The output lists the attack time and frequency of each note. For this line:
- 0.05 sec, 165.73 Hz
- 0.85 sec, 201.59 Hz
- 1.03 sec, 177.69 Hz
- 1.55 sec, 138.96 Hz
- 2.08 sec, 262.65 Hz
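The attack times are read off the STFT frame grid, so the window length limits how finely they can be placed. The following sketch shows the arithmetic under stated assumptions: a 44.1 kHz sample rate (the actual rate of piano.wav is not given), 50% overlap between segments, and time stamps at the midpoint of each segment. The variable names are illustrative.

```matlab
Fs = 44100;                                  % assumed; the rate of piano.wav is not stated
win = 256;                                   % window length used in the listing
hop = win / 2;                               % assumed 50% overlap between segments
frames = 1:5;                                % first few frame indices

T = (win/2 + (frames - 1) * hop) / Fs;       % frame midpoints in seconds
resolution = hop / Fs;                       % spacing between candidate attack times
```

Under these assumptions the k-th frame time equals k*T(1), which matches how the listing computes an attack time as `list(k)*T(1)`, and the spacing is roughly 2.9 ms. A longer window would smooth the power curve further at the cost of a coarser time grid.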