How to filter breath noise in audio?

The original audio files and the MATLAB filter files I used are attached. I tried low-pass and band-pass filtering, but neither had an obvious effect. The noise is mainly heavy breathing. How can I filter out the breathing while keeping the speech (Chinese or English) intact?
Jonas on 13 Jul 2022
Do you want to remove it only in this recording, or do you want to do this automatically for multiple files?
wei sun on 13 Jul 2022
Remove or attenuate this noise.


Accepted Answer

Mathieu NOE on 13 Jul 2022
Edited: Mathieu NOE on 13 Jul 2022
Hello,
I opted for a strategy based on the spectrogram content. I noticed that the "breathing" sections are characterized by a strong spectrogram level below 100 Hz (marked as red dots), which is not the case for the "speaking" sections.
I worked on channel 1 because channel 2 is clipped (distorted).
So I simply reduced the volume (here by about 30 dB) over each segment that runs from the local minimum just before a red dot to the local minimum just after it.
(You can also set these segments directly to zero if you prefer; see the options in the code.)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% options
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% spectrogram dB scale
spectrogram_dB_scale = 80; % dB range scale (the lowest displayed level is this many dB below the max level)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% load signal
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[signal,Fs] = audioread('original.wav');
dt = 1/Fs;
[samples,~] = size(signal);
% select channel (if needed)
channel = 1;
signal = signal(:,channel);
signal_filtered = signal;
% time vector
time = (0:samples-1)*dt;
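%% decimate (if needed)
% NB : decim = 1 will do nothing (output = input)
% NB : the decimate documentation suggests splitting factors above 13
% into several smaller stages for better results
decim = 40;
if decim>1
    signal_decim = decimate(signal,decim);
    Fs_decim = Fs/decim;
else
    % no decimation : pass the signal through unchanged
    signal_decim = signal;
    Fs_decim = Fs;
end
samples_decim = length(signal_decim);
time_decim = (0:samples_decim-1)*1/Fs_decim;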
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% FFT parameters
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
NFFT = 512; %
OVERLAP = 0.75;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% display : time / frequency analysis : spectrogram
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[sg,fsg,tsg] = specgram(signal_decim,NFFT,Fs_decim,hanning(NFFT),floor(NFFT*OVERLAP));
% FFT normalisation and conversion amplitude from linear to dB (peak)
sg_dBpeak = 20*log10(abs(sg))+20*log10(2/length(fsg)); % NB : X=fft(x.*hanning(N))*4/N; % hanning only
% saturation of the dB range :
min_disp_dB = round(max(max(sg_dBpeak))) - spectrogram_dB_scale;
sg_dBpeak(sg_dBpeak<min_disp_dB) = min_disp_dB;
% plots spectrogram
figure(2);
imagesc(tsg,fsg,sg_dBpeak);colormap('jet');
axis('xy');colorbar('vert');grid on
df = fsg(2)-fsg(1); % freq resolution
title(['Spectrogram / Fs = ' num2str(Fs) ' Hz / Delta f = ' num2str(df,3) ' Hz ']);
xlabel('Time (s)');ylabel('Frequency (Hz)');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% extract SG (dB) values from 0 to 100 Hz (a loud level in this frequency
% range indicates breath sound)
ind = find(fsg<=100);
fsg_breath = fsg(ind);
sg_dB_breath = sg_dBpeak(ind,:);
max_dB = max(sg_dB_breath,[],1);
max_dB = max_dB-min(max_dB); % shift the dB values so that the minimum is 0 dB (non-negative curve for islocalmax)
% select peaks with a prominence of at least 25 dB and the neighboring local minima
% find local maxima
tf = islocalmax(max_dB,'MinProminence',25);
x_peak = tsg(tf);
y_peak = max_dB(tf);
% find local minima
tm = islocalmin(max_dB);
x_min = tsg(tm);
y_min = max_dB(tm);
figure(3);plot(tsg,max_dB,x_peak,y_peak,'dr',x_min,y_min,'dk');
title('Spectrogram max dB value vs Time');
xlabel('Time (s)');ylabel('Max dB value');
% attenuate (or set to zero) the data between the local minima just before
% and just after each high peak
for ck = 1:numel(x_peak)
    % search the local minima just before and just after the peak
    dist = x_min - x_peak(ck);
    ind_bef = find(dist<0,1,'last');
    ind_aft = find(dist>0,1,'first');
    % fall back to the signal limits if there is no local minimum on one side
    if isempty(ind_bef), x_min_bef = time(1); else, x_min_bef = x_min(ind_bef); end
    if isempty(ind_aft), x_min_aft = time(end); else, x_min_aft = x_min(ind_aft); end
    % now process the time signal between these two instants
    ind = find(time>=x_min_bef & time<=x_min_aft);
    % signal_filtered(ind) = 0; % option 1 : zero
    signal_filtered(ind) = signal_filtered(ind)/30 ; % option 2 : gain of 1/30, i.e. about 29.5 dB attenuation
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% display : time domain plot
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
figure(1),
subplot(2,1,1),plot(time,signal,'b');grid on
title(['Time plot / Fs = ' num2str(Fs) ' Hz / raw data ']);
xlabel('Time (s)');ylabel('Amplitude');
subplot(2,1,2),plot(time,signal_filtered,'b');grid on
title(['Time plot / Fs = ' num2str(Fs) ' Hz / filtered data ']);
xlabel('Time (s)');ylabel('Amplitude');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% export signal
audiowrite('filtered.wav',signal_filtered,Fs); % audiowrite(filename,y,Fs,varargin)
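The attenuation step above changes the gain abruptly at each segment boundary, which may be audible as a click. As a rough sketch only (not part of the code above), the same idea can be written with an exact 30 dB gain and short linear ramps. The sample rate, test tone, segment limits, and ramp duration below are all made-up values for illustration; in the real script the limits would come from `x_min_bef` and `x_min_aft`.

```matlab
% Illustrative values only: none of these come from the original files.
Fs = 8000;                          % sample rate (Hz), assumed
t = (0:Fs-1)'/Fs;                   % 1 s time vector (column)
x = sin(2*pi*200*t);                % stand-in for the audio signal

atten_dB = 30;                      % desired attenuation
g = 10^(-atten_dB/20);              % exact linear gain for 30 dB

t_start = 0.40;  t_stop = 0.60;     % segment to attenuate (s), assumed
fade = 0.01;                        % ramp duration on each side (s), assumed

% Gain envelope: 1 outside the segment, g inside, linear ramps in between
gain = interp1([0, t_start-fade, t_start, t_stop, t_stop+fade, t(end)], ...
               [1, 1,            g,       g,      1,           1], t);

y = x .* gain;                      % attenuated signal without hard gain steps
```

The ramps sit just outside the detected segment, so the whole segment is still attenuated by the full amount. This assumes the segment does not touch the start or end of the signal.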
wei sun on 15 Jul 2022
OK, thank you, I have learned something. The FFT of the entire segment does take a lot of computing power, and it introduces a lot of irrelevant information.
Mathieu NOE on 15 Jul 2022
The saving in computation is proportional to the applied decimation factor (here 40), so I don't think it is negligible, especially if you want to apply the code to longer wav files.
But of course you can remove the decimation operation if you are not comfortable with it.
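To put rough numbers on this, here is a small sketch that counts spectrogram frames with and without decimation, using the NFFT, overlap, and decimation factor from the answer. The sample rate and duration are assumptions (the properties of the actual file are not stated in this thread).

```matlab
% Assumed values for illustration only: a 60 s recording sampled at 44.1 kHz.
Fs = 44100;  duration = 60;
NFFT = 512;  OVERLAP = 0.75;  decim = 40;      % values used in the answer

noverlap = floor(NFFT*OVERLAP);
hop = NFFT - noverlap;                         % samples between successive frames

N_full  = Fs*duration;                         % samples at the full rate
N_decim = ceil(N_full/decim);                  % samples after decimation

frames_full  = floor((N_full  - noverlap)/hop);
frames_decim = floor((N_decim - noverlap)/hop);

df_full  = Fs/NFFT;                            % bin spacing at the full rate (Hz)
df_decim = (Fs/decim)/NFFT;                    % bin spacing after decimation (Hz)

fprintf('frames: %d vs %d (ratio %.1f)\n', frames_full, frames_decim, frames_full/frames_decim);
fprintf('bin spacing: %.1f Hz vs %.2f Hz\n', df_full, df_decim);
```

Under these assumptions the full-rate spectrogram needs roughly 40 times as many frames. The bin spacing also changes from about 86 Hz to about 2.2 Hz, so with NFFT = 512 the 0 to 100 Hz band is only resolved finely after decimation; removing the decimation would probably call for a much larger NFFT.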

Release: R2021b
