Violin plot has tails that go beyond real data
Hi everyone,
I'm using the awesome "distributionPlot.m" from the File Exchange to plot some nice violins.
I've noticed that the violins have tails that don't reflect my data: they extend below the lowest value and above the highest.
For example, here's a violin of data that is ALL POSITIVE! How come it goes below zero?
appreciate any help,
shir
2 Comments
Philip G
on 10 Dec 2018
What input arguments do you use for the distributionPlot function?
The plot has some histogram smoothing options as specified:
% histOpt : histogram type to plot
%     0   : use hist command (no smoothing, fixed number of bins)
%     1   : smoothened histogram using ksdensity with Normal kernel. Default.
%     1.1 : smoothened histogram using ksdensity where the kernel is
%           robustly estimated via histogram.m. Normal kernel.
%     2   : histogram command (no smoothing, automatic determination of
%           thickness (y-direction) of bins)
Any smoothed histogram might give you tails that extend outside the range of your data, because the kernel spreads each observation over neighboring values. Use option 0 for a "true" histogram.
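To see the effect outside of distributionPlot, here is a minimal sketch that calls ksdensity directly (Statistics and Machine Learning Toolbox). The sample data and the evaluation grid are made up for illustration and are not the original poster's data. It shows that a normal-kernel estimate assigns some density below zero even though every observation is positive, whereas plain bin counts only tally the observations themselves.

```matlab
% Hypothetical all-positive sample (assumption: stands in for the poster's data)
rng(0);
data = abs(randn(200, 1));

% Evaluate the default estimate (normal kernel, unbounded support)
% on a grid that deliberately extends past both ends of the data
xi = linspace(-1, max(data) + 1, 300);
f  = ksdensity(data, xi);

% Estimated probability mass that the smoothed curve places below zero
massBelowZero = trapz(xi(xi < 0), f(xi < 0));
fprintf('Smallest observation: %.4f\n', min(data));
fprintf('Estimated mass below zero: %.4f\n', massBelowZero);

% Unsmoothed alternative: bin counts account for the observations only
[counts, edges] = histcounts(data);
fprintf('Observations counted in bins: %d of %d\n', sum(counts), numel(data));
```

The mass below zero is small but not zero, which is the tail that shows up in the violin.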
Answers (2)
Ruggero G. Bettinardi
on 11 Dec 2018
Hi Shir,
I uploaded an updated version of 'distributionPlot' to my File Exchange page, called 'distributionPlot_OnlyPositive'. It works exactly like the original function, but avoids violins whose lower tail goes below zero.
NOTE, however, that because this function is still based on normal kernel smoothing, it does not guarantee that the lower and upper tails of the violins stay within the exact range of your input values. It only guarantees that the tails do not extend below zero.
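As an illustration of the same idea with built-in tools, ksdensity accepts a 'Support' option that restricts the estimate to positive values. This is only a sketch under that assumption and is not necessarily how distributionPlot_OnlyPositive is implemented; the sample data are made up.

```matlab
% Hypothetical all-positive sample (assumption, for illustration only)
rng(0);
data = abs(randn(200, 1));

% Evaluate only at positive points, extending past the largest observation
xi   = linspace(0.001, max(data) + 1, 300);
fPos = ksdensity(data, xi, 'Support', 'positive');

% The lower tail is bounded at zero, but the upper tail still extends
% beyond the largest observation, matching the caveat above
upperTail = fPos(xi > max(data));
fprintf('Largest observation: %.4f\n', max(data));
fprintf('Max density above the largest observation: %.4g\n', max(upperTail));

plot(xi, fPos)
xlabel('value')
ylabel('estimated density')
```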
HTH
Ruggero
Adam Danz
on 9 Oct 2024
The important concept to understand is that violin plots show a visual estimate of the data's distribution, which can extend beyond the observed values, using a kernel density estimate based on the input data. It's up to the reader to interpret those results in the context of the data.
Unlike boxchart, histogram, swarmchart, and other similar distribution visualizations, the violinplot makes inferences about the population rather than merely depicting the input data.
violinplot was introduced in MATLAB R2024b. A good way to understand the visualization is to base the violinplot on a single data point at y=0. The shape forms a Gaussian curve centered at 0 with a standard deviation of 1. Compare the violin plot with the Gaussian curve.
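tiledlayout(2,1)
% Top tile: violin built from a single observation at 0
violinplot(nexttile(), 0, orientation='horizontal')
grid on
% Bottom tile: Gaussian curve with sigma = 1 centered at 0
% (gaussmf requires Fuzzy Logic Toolbox)
x = -3:.02:3;
y = gaussmf(x,[1,0]);
plot(nexttile, x, y)
ylim([-2 2])
grid on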
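If Fuzzy Logic Toolbox is not available, the comparison curve can be computed directly. This sketch assumes the documented form of gaussmf with parameters [sigma, c], which is exp(-(x-c)^2 / (2*sigma^2)); the variable names sigma and c are introduced here for illustration.

```matlab
% Gaussian curve without gaussmf, using sigma = 1 and center c = 0
sigma = 1;
c     = 0;
x = -3:.02:3;
y = exp(-(x - c).^2 ./ (2 * sigma^2));

plot(x, y)
ylim([-2 2])
grid on
```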