How to remove noise (unwanted data)
I have data that contains some noise, which I need to remove. From the attachment, use only the Time and Pyrn1_Avg columns. I am also attaching two figures: one with noise and one without noise (which is what I need). I removed the noise from just one data set by manually putting NaN in place of the noisy values, but that is time consuming and I have thousands of data sets. Please suggest a suitable filter.
Accepted Answer
John D'Errico
on 22 Feb 2023
Finding and removing what I might call singleton outliers while leaving small amounts of noise in place can be a difficult task. After all, how far out does a point need to be before it counts as an outlier? Could it be just a rare event from the normal population of noise? This very much depends on the regular signal-to-noise ratio in your data.
The real problem, though, is how to find large blocks of possible noise.
data = readtable('KLOG0024.CSV')
T = data.Time;
Y = data.Pyrn1_Avg;
plot(datenum(T),Y)
If you knew what the normal level of noise was in this data, then you might decide that any region where the variability appears to be larger than the norm should just be dropped out. The problem is, the curve itself has a significant amount of signal in it. Since the time vector is just at a constant increment, we might consider a simple finite difference of the curve. That essentially eliminates any component of the signal itself.
So we might do this:
plot(datenum(T(2:end)),diff(Y))
Now you can see where crap is happening. Next, compute a moving, local estimate of the noise in that curve. I've attached my movingstd utility (it should be on the file exchange for download.)
Sigest = movingstd([0;diff(Y)],20,'central'); % A central moving window, width 20
plot(Sigest)
Now, you might decide to zap out any part of the curve where the local variability is greater than some level. If we assume the bad part is no more than 10% of the curve, that level would be the 90th percentile.
Sigmax = prctile(Sigest,90)
Y(Sigest>Sigmax) = NaN;
And finally, plot the result:
plot(datenum(T),Y,'-')
And while it looks like you chose to zap out a little more of the curve than I did, this looks at least reasonable.
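If you do not have the movingstd utility at hand, the same steps can be sketched with the built-in movstd function. This is only a rough stand-in, not the exact code above: the synthetic signal, the variable names, and the injected noisy block are all made up for illustration, and prctile still requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical synthetic data standing in for the Time / Pyrn1_Avg columns
rng(0);
n = 1000;
t = (1:n)';
y = sin(2*pi*t/n) + 0.01*randn(n,1);        % smooth signal + small noise
y(400:450) = y(400:450) + 0.5*randn(51,1);  % injected block of heavy noise

% Differencing suppresses the smooth signal; movstd then gives a
% centered moving standard deviation over a window of 20 samples
sigEst = movstd([0; diff(y)], 20);

% Assume at most ~10% of the curve is bad: threshold at the 90th percentile
sigMax = prctile(sigEst, 90);
yClean = y;
yClean(sigEst > sigMax) = NaN;

plot(t, y, t, yClean)
legend('Original', 'Cleaned')
```

Because the threshold is a percentile, roughly 10% of the samples are dropped no matter how clean the data is, so the percentile (and the window width) may need adjusting per data set.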
More Answers (1)
Askic V
on 22 Feb 2023
Edited: Askic V
on 22 Feb 2023
I would suggest the following approach:
% read file into table
%T = readtable('KLOG0024.csv');
outfile = websave('KLOG0024.csv', 'https://www.mathworks.com/matlabcentral/answers/uploaded_files/1303255/KLOG0024.CSV');
T = readtable(outfile);
% read data into arrays
time_t = table2array(T(:,'Time'));
data_d = table2array(T(:,'Pyrn1_Avg'));
% plot data
plot(time_t, data_d);
hold on
% medfilt1 replaces every point of a signal by the
% median of that point and a specified number of neighboring points (15)
filtered_data = medfilt1(data_d,15);
plot(time_t, filtered_data);
legend('Noisy data', 'Filtered data');
You can now play with the number of points until it suits your needs.
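If the goal is to put NaN in place of the noisy points rather than smooth the whole curve, the median-filtered signal can also serve as a reference for flagging outliers. The following is a minimal sketch under several assumptions: it uses the built-in movmedian instead of medfilt1 (which needs the Signal Processing Toolbox), the data are synthetic, and the factor of 5 on the robust scale is an arbitrary choice to tune.

```matlab
% Hypothetical synthetic data with a few injected spikes
rng(1);
n = 500;
t = (1:n)';
d = sin(2*pi*t/n) + 0.01*randn(n,1);
spikes = [50 200 201 350];
d(spikes) = d(spikes) + 2;

% Moving median over 15 points as a smooth reference curve
med = movmedian(d, 15);

% Residual from the reference, with a robust scale estimate
% (median absolute deviation, scaled to match a normal std)
resid = d - med;
scale = 1.4826 * median(abs(resid - median(resid)));

% Flag points far from the reference and replace them with NaN
isBad = abs(resid) > 5*scale;   % threshold factor 5 is an assumption
dClean = d;
dClean(isBad) = NaN;

plot(t, d, t, dClean)
legend('Noisy data', 'Outliers removed')
```

This leaves the untouched samples at their original values, which is closer to the manual NaN approach described in the question. It works best on isolated spikes; long noisy blocks can pull the moving median itself off course.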