How to remove the outliers

Skydriver on 11 Jul 2019
Edited: Jon on 12 Jul 2019
I have a data sequence that I believe contains some outliers, which I have shaded in red in my Excel plot. I have attached the file with my data.
Which MATLAB function can detect and delete the data in the red-shaded region?
If anyone can help, I would appreciate it.
Thanks

Accepted Answer

Steven Lord on 11 Jul 2019
Take a look at the filloutliers and rmoutliers functions on this documentation page.
  3 Comments
Jon on 11 Jul 2019
Maybe you are running an old version of MATLAB that does not have the filloutliers function.
filloutliers was introduced in MATLAB R2017a.
What version of MATLAB are you running? To find out, you can type the ver command.
In the future, it is good to use the code button in the MATLAB Answers toolbar for inserting code. That way it comes out nicely formatted and is easier to read, use, and copy.
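As a rough sketch of that check, the lines below print the running release and test whether filloutliers is available. The variable names are illustrative only, and the version number 9.2 is used on the assumption that it corresponds to R2017a.

```matlab
% Report the running release and whether filloutliers is available.
v = version;                                    % version string of this MATLAB session
hasFill = exist('filloutliers', 'file') == 2;   % true if filloutliers.m is found on the path
isOld = verLessThan('matlab', '9.2');           % assumption: 9.2 corresponds to R2017a
fprintf('MATLAB %s, filloutliers available: %d, older than R2017a: %d\n', ...
    v, hasFill, isOld);
```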
Skydriver on 11 Jul 2019
Edited: Skydriver on 11 Jul 2019
I use the 2013 version of MATLAB. Do you have any suggestion for removing outliers in this version, or for filling them in with values taken from the neighboring points?


More Answers (1)

Jon on 11 Jul 2019
Edited: Jon on 11 Jul 2019
Since you do not have filloutliers and rmoutliers in your version of MATLAB, I would first recommend updating to a more recent version if possible, as there have been many advances since 2013.
If that is not possible, you can look at the documentation in the link that Steven provided.
It gives MATLAB's default definition of an outlier as:
Outliers are defined as elements more than three scaled MAD from the median. The scaled MAD is defined as c*median(abs(A-median(A))), where c=-1/(sqrt(2)*erfcinv(3/2)).
So you could easily implement this in your code. For example, if you had vectors x and y and wanted to plot the data with the outliers removed, you could do the following:
% flag points more than three scaled MAD away from the median
isOutlier = abs(y - median(y)) > -3/(sqrt(2)*erfcinv(3/2))*median(abs(y - median(y)));
plot(x(~isOutlier),y(~isOutlier))
I would recommend, though, implementing isOutlier as a small function so you don't have to keep repeating this code.
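One possible shape for such a function is sketched below. The name isOutlierMAD and the optional nMAD argument are hypothetical, chosen here for illustration; in a 2013 release the function would need to be saved in its own file, isOutlierMAD.m. It assumes y is a numeric vector without NaN values.

```matlab
function tf = isOutlierMAD(y, nMAD)
% isOutlierMAD  Flag elements more than nMAD scaled MAD from the median.
%   Hypothetical helper (not a built-in); save as isOutlierMAD.m.
%   tf is a logical array the same size as y.
if nargin < 2
    nMAD = 3;                            % default threshold quoted from the documentation
end
c = -1/(sqrt(2)*erfcinv(3/2));           % scale factor, approximately 1.4826
med = median(y);
scaledMAD = c*median(abs(y - med));      % scaled median absolute deviation
tf = abs(y - med) > nMAD*scaledMAD;      % true where the deviation exceeds the threshold
end
```

With tf = isOutlierMAD(y), the outliers can be dropped with plot(x(~tf),y(~tf)). To fill them in from neighboring points instead, one option is y(tf) = interp1(x(~tf),y(~tf),x(tf)), which assumes the retained x values are distinct and that the outliers do not sit at the ends of the data.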
Another simple way to remove outliers is to sort your data using the sort command and then remove the first and last n values from the sorted list, where you choose n according to how conservative you want to be with the outlier removal.
For example, given vectors x and y and n = 5, you could implement this with something like
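n = 5;
[ySrt,iSrt] = sort(y);                   % iSrt maps sorted positions back to the original indices
iKeep = sort(iSrt(n+1:length(y)-n));     % drop the n smallest and n largest, then restore the original order
plot(x(iKeep),y(iKeep))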
Note that n/length(y) is the fraction of the data that you discard as outliers at each end of the sorted list. So you might choose n so that n/length(y) is approximately 0.025; you would then keep 100*(1 - 2*0.025) = 95% of your data and treat the extremes as outliers.
This method, although simple, assumes that you really do have outliers at the extremes. Otherwise you are just throwing away good data that happens to sit at the lower and upper ends of the sorted list.
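As a rough sketch of choosing n from a target fraction, the example below trims 2.5% of the points from each end. The data, the spike positions, and the variable name frac are all made up for illustration and are not taken from the attached file.

```matlab
% Assumed example data: a smooth signal with three artificial spikes.
x = 1:200;
y = sin(x/20);
y([10 50 120]) = [8 -9 7];

frac = 0.025;                            % fraction to discard at each end (assumption)
n = round(frac*numel(y));                % number of points trimmed from each end
[~, iSrt] = sort(y);                     % indices of y in ascending order
iKeep = sort(iSrt(n+1:numel(y)-n));      % drop both tails, keep the original ordering
plot(x(iKeep), y(iKeep))
```

Sorting iKeep before plotting keeps the points in their original sequence, so the line is drawn in order of x rather than in order of y.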
  2 Comments
Skydriver on 12 Jul 2019
Thank you, Steven Lord and Jon. It is working now.
Jon on 12 Jul 2019
Edited: Jon on 12 Jul 2019
Glad to hear it is working now. If you feel the question is answered, it would be good to "accept" the answer so that anyone else with the same issue can see that a solution is available. If you are still waiting to see whether there are other approaches, you should leave it open.

