KERNEL: mean integrated squared error - Bandwidth Selection
Hello all,
I have a data set and estimated its density with a kernel estimator, but the bandwidth has to be chosen properly to get a correct density for the data. I set it to 0.2 as a starting point so that I could experiment before looking into a proper method, but the kernel estimate did not work for width = 0.2, although the same width did work for another data set. A more rigorous way to pick the best bandwidth for given data is to minimize the mean integrated squared error. Is there a built-in function for this in MATLAB? I could not find one, and I am not sure whether one exists in a toolbox that is not available to me. I would also like to know why a width of 0.2 does not work with my code.
Thank you all,
sample1 = [6.52689332414481E7
6.52693837402845E7
6.5270203713004336E7
6.527122138667133E7
6.52717237415096E7
6.527173346449997E7
6.527211590239384E7
6.5272540473269284E7
6.527282568117965E7
6.527314005807114E7
];
x = sample1.';
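% ksdensity returns the density values first and the evaluation points second
[f,xi] = ksdensity(x,'width',0.2);
plot(xi,f);
% Mark each data point with a short green tick along the x-axis
line(repmat(x,2,1),repmat([0;0.1*max(f)],1,length(x)),'color','g');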
Accepted Answer
Ilya
on 29 Aug 2011
The "right" width depends on your assumptions about the fitted distribution. MATLAB does not choose the bandwidth "randomly". It computes the optimal bandwidth for the normal distribution:
help ksdensity
[snip]
[F,XI,U]=ksdensity(...) also returns the bandwidth of the kernel smoothing window.
[snip]
'width' The bandwidth of the kernel smoothing window. The default is optimal for estimating normal densities, but you may want to choose a smaller value to reveal features such as multiple modes.
If you look at that Wikipedia article, note this paragraph:
Neither the AMISE nor the h_AMISE formulas are able to be used directly since they involve the unknown density function ƒ or its second derivative ƒ'', so a variety of automatic, data-based methods have been developed for selecting the bandwidth. Many review studies have been carried out to compare their efficacities,[6][7][8][9][10] with the general consensus that the plug-in selectors[11] and cross validation selectors[12][13][14] are the most useful over a wide range of data sets.
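For reference, the two quantities named in that paragraph are usually written as follows. This is the standard textbook form, not something quoted in this thread. It assumes a kernel K, n observations, and the shorthand R(g) = ∫ g(x)² dx and m₂(K) = ∫ x² K(x) dx. The last line is the normal reference rule for a Gaussian kernel, which is the kind of rule behind a default that is "optimal for estimating normal densities"; the exact scale estimate that ksdensity plugs in for σ may differ.

```latex
\begin{align}
\mathrm{AMISE}(h) &= \frac{R(K)}{n h} + \frac{1}{4}\, m_2(K)^2\, h^4\, R(f'') \\
h_{\mathrm{AMISE}} &= \left( \frac{R(K)}{m_2(K)^2\, R(f'')\, n} \right)^{1/5} \\
h_{\text{normal}} &= \left( \frac{4\hat\sigma^5}{3n} \right)^{1/5} \approx 1.06\, \hat\sigma\, n^{-1/5}
\end{align}
```

The unknown R(f'') is what prevents using h_AMISE directly; the normal reference rule sidesteps it by assuming f is normal with standard deviation σ.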
I suggest that you choose the optimal bandwidth by cross-validation using ksdensity and crossval functions. Often the approximation based on the normal distribution (which you get by default from ksdensity) is good enough. -Ilya
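A minimal sketch of that suggestion, under stated assumptions: it scores each candidate bandwidth by the leave-one-out log-likelihood of the held-out points, which is likelihood cross-validation rather than a direct MISE estimate. The candidate grid (0.1 to 3 times the default bandwidth) and the variable names are illustrative choices, not part of the original answer. The option is spelled 'width' as in this thread; newer releases document the same option as 'Bandwidth'.

```matlab
% Sample data from the question, as a column vector
% (crossval treats each row as one observation)
x = [6.52689332414481E7
     6.52693837402845E7
     6.5270203713004336E7
     6.527122138667133E7
     6.52717237415096E7
     6.527173346449997E7
     6.527211590239384E7
     6.5272540473269284E7
     6.527282568117965E7
     6.527314005807114E7];

% Default bandwidth chosen by ksdensity (third output)
[~,~,u0] = ksdensity(x);

% Illustrative candidate grid: 0.1x to 3x the default bandwidth
h = u0 * logspace(-1, log10(3), 25);
score = zeros(size(h));

for k = 1:numel(h)
    % Log-density of the held-out point under the estimate
    % fitted to the remaining points with bandwidth h(k)
    fun = @(xtrain,xtest) sum(log(ksdensity(xtrain, xtest, 'width', h(k))));
    score(k) = sum(crossval(fun, x, 'leaveout', 1));
end

% Bandwidth with the highest cross-validated log-likelihood
[~,best] = max(score);
hBest = h(best);
fprintf('default bandwidth: %.4g, cross-validated bandwidth: %.4g\n', u0, hBest);
```

With only ten points the cross-validated score is noisy, so it is worth plotting score against h and checking that the maximum is not at an edge of the grid before trusting hBest.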
More Answers (1)
the cyclist
on 28 Aug 2011
In your case, your data are order-of-magnitude 1e7, but you are choosing a width of 0.2, so it is much, much too tiny. I suspect you do not have a very good understanding of what kernel density estimation is doing, so you might want to read some basic articles to understand the technique better. This is not a bad place to start:
The easiest thing to do is to not include the 'width' parameter at all, and let MATLAB choose it for you:
[f,xi] = ksdensity(x);
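To see how far off 0.2 is for this sample, the following sketch compares it with the spread of the data and with the bandwidth that ksdensity picks by default. The data values are the ones from the question; the variable names are illustrative. The sample spans roughly 4.2e3, so a window of 0.2 is thousands of times narrower than the data range, and also far narrower than the spacing of the points at which the density is evaluated.

```matlab
% Sample data from the question
x = [6.52689332414481E7 6.52693837402845E7 6.5270203713004336E7 ...
     6.527122138667133E7 6.52717237415096E7 6.527173346449997E7 ...
     6.527211590239384E7 6.5272540473269284E7 6.527282568117965E7 ...
     6.527314005807114E7];

% Spread of the data: the absolute size (about 1e7) matters less than this
spread = max(x) - min(x);

% Third output is the bandwidth ksdensity chose on its own
[f,xi,u] = ksdensity(x);

% Evaluation grid that results from forcing the width to 0.2
[fSmall,xiSmall] = ksdensity(x,'width',0.2);
gridStep = mean(diff(xiSmall));

fprintf('spread of data:        %.4g\n', spread);
fprintf('default bandwidth:     %.4g\n', u);
fprintf('grid step at width .2: %.4g\n', gridStep);
```

When the grid step is much larger than the bandwidth, almost every evaluation point falls far from every data point, so the estimate is close to zero nearly everywhere and the plot looks empty or spiky.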