Can anybody suggest how to draw a more meaningful histogram? Or how can I get this bar plot with the axes swapped?

SA on 10 May 2021
Commented: William Rose on 11 May 2021
I have some data distributions with respect to a variable, and I've plotted the histogram in the following way. Would I get a more meaningful histogram if I changed the axes? Can anybody advise me on this? Thanks in advance.
xx=[1 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250];
K40=[0.4262 0.4313 0.4253 0.4278 0.8554 0.4206 0.426 0.4187 0.4255 0.858 0.8517 0.4257 0.8567 0.4288 0.8488 0.4265 0.8587 0.4183 0.4201 0.4278 0.8572 0.4251 0.4263 0.4291 0.4295 0.4303];
K82=[0.9941 0.8707 0.5442 0.9709 0.5538 0.7029 1.028 0.5364 0.7049 0.5278 0.5341 0.5141 0.8712 0.902 0.9365 0.5382 0.5316 0.9737 0.693 0.9925 0.532 0.9365 1.251 0.9374 0.5504 0.5298];
K152=[0.7171 0.7203 0.7256 0.7197 0.7258 0.7269 0.7132 0.7129 0.6941 0.7257 0.7131 0.7059 0.7191 0.7106 1.012 0.7141 0.7259 0.7182 0.9928 0.7104 0.7136 0.9859 1.001 0.712 0.7045 0.7184];
%% Making the Bar Plot Histogram
p = [K40;K82;K152];
figure; bar(p);
title("Bar Histogram");
xlabel("100sec stretches of Data"); ylabel("KK distribution");
legend('Distribution of K@40Hz','Distribution of K@82Hz','Distribution of K@152Hz')

Accepted Answer

William Rose
William Rose on 11 May 2021
@Sk. Alam, when I run your code I get the plot below. The triples of adjacent bars are hard to discern or appreciate. The x-axis labels overlap and are unreadable.
It is easier to perceive the differences between the frequencies 40, 82, and 152 Hz by making three bar charts or by plotting three x,y sets on one plot, as shown by the following two figures. I also adjusted the labels to minimize overlap.
I have questions about the data. Are these really histograms? What is causing the quantization effects? Are the distributions different? Do these represent about 8 hours of data (100 sec x 250)?
1. Three bar charts
2. Three sets of x,y on a single plot
Code for the above: attached.
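The attachment is not reproduced in this thread. A minimal sketch of how two figures of this kind could be produced is given below, using the data from the question. The layout, marker styles and labels are assumptions for illustration and may differ from the attached code.

```matlab
% Data copied from the question
xx   = [1 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250];
K40  = [0.4262 0.4313 0.4253 0.4278 0.8554 0.4206 0.426 0.4187 0.4255 0.858 0.8517 0.4257 0.8567 0.4288 0.8488 0.4265 0.8587 0.4183 0.4201 0.4278 0.8572 0.4251 0.4263 0.4291 0.4295 0.4303];
K82  = [0.9941 0.8707 0.5442 0.9709 0.5538 0.7029 1.028 0.5364 0.7049 0.5278 0.5341 0.5141 0.8712 0.902 0.9365 0.5382 0.5316 0.9737 0.693 0.9925 0.532 0.9365 1.251 0.9374 0.5504 0.5298];
K152 = [0.7171 0.7203 0.7256 0.7197 0.7258 0.7269 0.7132 0.7129 0.6941 0.7257 0.7131 0.7059 0.7191 0.7106 1.012 0.7141 0.7259 0.7182 0.9928 0.7104 0.7136 0.9859 1.001 0.712 0.7045 0.7184];

K     = [K40; K82; K152];                        % 3-by-26, one row per frequency
names = {'K @ 40 Hz', 'K @ 82 Hz', 'K @ 152 Hz'};

% Figure 1: three stacked bar charts, one per frequency
figure;
for i = 1:3
    subplot(3,1,i);
    bar(xx, K(i,:));
    ylim([0 1.3]);                               % same y-range in every panel
    ylabel(names{i});
end
xlabel('Serial no. of 100 s data stretch');

% Figure 2: three x,y sets on a single plot
figure;
plot(xx, K40, 'b-o', xx, K82, 'r-s', xx, K152, 'm-^');
grid on;
xlabel('Serial no. of 100 s data stretch');
ylabel('K');
legend(names, 'Location', 'best');
```

Giving each frequency its own panel with a shared y-range makes the levels comparable at a glance, while the single line plot shows where the three series jump together or separately.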
  1 Comment
SA on 11 May 2021
Thanks for your attempt. I had tried figure 2 earlier with the following script:
h=figure; grid on; hold on;
plot(xx,K40, 'LineStyle','-.', 'Color', 'b','linewidth', 1.35);
plot(xx,K82, 'LineStyle','--', 'Color', 'r','linewidth', 1.35);
plot(xx,K152, 'LineStyle',':', 'Color', 'm','linewidth', 1.35);
xlim([0 250]); xticks(0:10:250);
ylim([0.3 1.3]); yticks(0.3:.1:1.3);
title(sprintf ('Distribution of K for 250 times 100s stretches of data'));
xlabel('Serial No.'); %,'FontSize',08,'FontWeight','bold');
ylabel('Magnitude of K'); %,'FontSize',08,'FontWeight','bold');
legend('Distribution of K@40Hz','Distribution of K@82Hz','Distribution of K@152Hz')
set(gcf, 'PaperUnits','inches','PaperPosition',[0 0 6.3 4]);
fig_name = ('Distribution_K');
ax = gca; ax.FontSize = 6; %ax.FontWeight = 'bold';
"Are these really histograms?"
- I just wanted to check whether a histogram provides better visualization.
"What is causing the quantization effects?"
I don't understand your question on this point.
"Are the distributions different?"
The value of K is different for the 3 frequencies: 40 Hz, 82 Hz and 152 Hz.
"Do these represent about 8 hours of data (100 sec x 250)?"
No, it's not about 8 hours of data. xx=1 means that for the first 100 sec data bin I've calculated the values of K @40Hz, @82Hz & @152Hz as 0.4262, 0.9941 & 0.7171 respectively;
xx=10 means that for the 10th 100 sec data bin I've calculated the values of K @40Hz, @82Hz & @152Hz as 0.4313, 0.8707 & 0.7203 respectively, and so on.
One thing I have in mind: is there any way to draw the same bar plot with the x- and y-axes swapped? I don't understand how to alter the axes for that plot/histogram. Thank you again.
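One possible reading of "altering the axes" is sketched below. This is an assumption about what is being asked, not an answer given in the thread. bar treats each row of a matrix as one group, so the 3-by-26 matrix p gives 3 groups of 26 bars. Transposing p gives 26 groups of 3 bars (one group per data stretch), and barh draws the same bars horizontally so that K runs along the x-axis.

```matlab
% Data copied from the question
xx   = [1 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250];
K40  = [0.4262 0.4313 0.4253 0.4278 0.8554 0.4206 0.426 0.4187 0.4255 0.858 0.8517 0.4257 0.8567 0.4288 0.8488 0.4265 0.8587 0.4183 0.4201 0.4278 0.8572 0.4251 0.4263 0.4291 0.4295 0.4303];
K82  = [0.9941 0.8707 0.5442 0.9709 0.5538 0.7029 1.028 0.5364 0.7049 0.5278 0.5341 0.5141 0.8712 0.902 0.9365 0.5382 0.5316 0.9737 0.693 0.9925 0.532 0.9365 1.251 0.9374 0.5504 0.5298];
K152 = [0.7171 0.7203 0.7256 0.7197 0.7258 0.7269 0.7132 0.7129 0.6941 0.7257 0.7131 0.7059 0.7191 0.7106 1.012 0.7141 0.7259 0.7182 0.9928 0.7104 0.7136 0.9859 1.001 0.712 0.7045 0.7184];
p = [K40; K82; K152];                 % 3-by-26, as in the question

% Vertical bars: 26 groups (data stretches), 3 bars (frequencies) per group
figure;
hb = bar(xx, p.');                    % p.' is 26-by-3, so rows match xx
xlabel('Serial no. of 100 s data stretch');
ylabel('K');
legend('K@40Hz', 'K@82Hz', 'K@152Hz');

% Horizontal bars: same data with the x- and y-axes swapped
figure;
hh = barh(xx, p.');
ylabel('Serial no. of 100 s data stretch');
xlabel('K');
legend('K@40Hz', 'K@82Hz', 'K@152Hz');
```

With the transposed matrix the legend lines up with the three bar series, one per frequency.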


More Answers (1)

William Rose
William Rose on 11 May 2021
I still do not understand what you want to do. You said you are plotting histograms, and that you want to get a more meaningful histogram. However, it seems that you are plotting three time series: K versus time at 40 Hz, at 82 Hz, and at 152 Hz. That is not a histogram. You said "xx=10 means at the 10-th instant of 100sec data bin I've calculate the value of K". You report values for xx which range from 1 to 250. Does this mean there are 250 "instants" in 100 seconds? Does one "instant" have a duration of 0.4 seconds?
A histogram is a plot of the number of times that a specific value occurs in a dataset, plotted versus the value. (Or it can be the number of occurrences of values within specified bins, versus the bin center values.) This does not seem to be what you are plotting in your original bar chart. MATLAB has a histogram() function. If you want a histogram, why not use it? You could use it on your three data sets, or you could combine them and make a histogram of all 78 points.
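As a minimal sketch of that suggestion, histogram can be applied to each of the three data sets and to the pooled 78 values. The bin edges 0.4:0.05:1.3 are an arbitrary choice for illustration, picked only because they span the minimum and maximum of the data.

```matlab
% Data copied from the question
K40  = [0.4262 0.4313 0.4253 0.4278 0.8554 0.4206 0.426 0.4187 0.4255 0.858 0.8517 0.4257 0.8567 0.4288 0.8488 0.4265 0.8587 0.4183 0.4201 0.4278 0.8572 0.4251 0.4263 0.4291 0.4295 0.4303];
K82  = [0.9941 0.8707 0.5442 0.9709 0.5538 0.7029 1.028 0.5364 0.7049 0.5278 0.5341 0.5141 0.8712 0.902 0.9365 0.5382 0.5316 0.9737 0.693 0.9925 0.532 0.9365 1.251 0.9374 0.5504 0.5298];
K152 = [0.7171 0.7203 0.7256 0.7197 0.7258 0.7269 0.7132 0.7129 0.6941 0.7257 0.7131 0.7059 0.7191 0.7106 1.012 0.7141 0.7259 0.7182 0.9928 0.7104 0.7136 0.9859 1.001 0.712 0.7045 0.7184];

pooled = [K40 K82 K152];              % all 78 values in one vector
edges  = 0.4:0.05:1.3;                % assumed bin edges, same for every panel

figure;
subplot(4,1,1); histogram(K40,    edges); title('K @ 40 Hz');
subplot(4,1,2); histogram(K82,    edges); title('K @ 82 Hz');
subplot(4,1,3); histogram(K152,   edges); title('K @ 152 Hz');
subplot(4,1,4); histogram(pooled, edges); title('All 78 values');
xlabel('K'); ylabel('Count');

% histcounts returns the same bin counts without drawing anything
counts = histcounts(pooled, edges);
```

Using the same edges in every panel keeps the four histograms directly comparable.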
What is K? Since K depends on frequency, I wonder if K corresponds to amplitude, or power, or phase, or time lag, or some other frequency-dependent property of a signal.
When I asked "What is causing the quantization effects?", I meant why do the K values cluster at specific levels, instead of being spread randomly across the range from lowest to highest K. Why do the values appear to be constrained to specific values or narrow ranges? Here is an actual histogram showing the frequency of different values of the elements of array p, which shows that the values of p are clustered at particular levels. This is what I mean by quantization. This plot was made as follows:
>> histogram(p(:),[.3 .418 .432 .51 .555 .69 .73 .83 .85 .87 .89 .91 .93 .95 .97 1.03 1.24 1.26 1.3]);   % p(:) stacks all 78 elements of p into one column
>> xlabel('K'); ylabel('Frequency'); title('Histogram of p');
William Rose
William Rose on 11 May 2021
@Sk. Alam, the number of elements in qq may be more than, less than, or equal to the number of elements in p or in p(:,i). qq specifies the edges of the bins, i.e. the x-axis coordinates for the histogram plot. The attached code illustrates how the choice of qq influences the histogram. The values I chose for this example are arbitrary. There are no "right" or "wrong" values for qq, except that the values should at least span the min and max of the data set. It depends on what you want to do with this histogram, or what point you wish to illustrate. I don't know exactly how MATLAB chooses the default bins, but I do know that the number of default bins gets larger (i.e. the bins get narrower) when the number of values in the data set gets larger.
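>> pall=p(:);   % assumption: pall is not defined in this thread; taken here to be all 78 values of p as one vector
>> figure;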
>> subplot(3,1,1);
>> histogram(pall);
>> title('Matlab default bins');
>> qq=0.4:.05:1.3;
>> subplot(3,1,2);
>> histogram(pall,qq);
>> title('Bin edges 0.4:.05:1.3');
>> qq=0.2:.02:1.5;
>> subplot(3,1,3);
>> histogram(pall,qq);
>> title('Bin edges 0.2:.02:1.5');
The code above generates the figure below.
Here is another example of a histogram with default bins. In this case, the data set is 1000 normal random numbers. Code, then figure.
>> rands=randn(1000,1);
>> figure;
>> histogram(rands)
Figure made by code above.
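To see informally how the default bin count changes with sample size, histcounts can report the number of bins without drawing a figure. As far as I know it applies the same default automatic binning as histogram, but treat that as an assumption. The sample sizes and the random seed below are arbitrary choices for illustration.

```matlab
rng(0);                                % fixed seed so the run is repeatable
sizes = [100 1000 10000 100000];       % assumed sample sizes for the comparison
nbins = zeros(size(sizes));

for k = 1:numel(sizes)
    counts   = histcounts(randn(sizes(k), 1));   % default automatic binning
    nbins(k) = numel(counts);                    % number of bins chosen
end

disp([sizes; nbins].');                % column 1: sample size, column 2: bin count
```

Comparing the second column across rows shows whether the default bins become more numerous as the data set grows.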
