Problem in determining standard error and plotting 3 error bars
8 views (last 30 days)
Show older comments
Hello everyone,
I am trying to make a figure where Y-axis is temperature and X-axis represents particle concentration. The I-type vertical bars indicate standard error. I tried this:
Data = table2array(WORK01S1); % convert it into array
Temp = double(Data(:,1)); % extract the column of Temperature from Data
Particle = double(Data(:,2)); % extract the column of Particle from Data
Particle = Particle*1000000000; % To change kg/um3 to ug/um3
CAPE = double(Data(:,3)); % extract the column of CAPE from Data
[N,edges,bins] = histcounts(Particle,10) % I divided the Particle concentration in to 10 bins (just to check I selected 10 bins)
% I used below line to calculate standard error but I am not sure whether
% its correct or not
std_err= std(Data,[],2)/sqrt(size(Data,2))
I am not able to calculate standard error properly and aslo I want 3 error bars as shown in the figure (values of CAPE too). What changes and additions should I make in my code? I can't figure out this. I'm grateful for any help. Data is attached.
7 Comments
Accepted Answer
dpb
on 13 Jun 2022
tData=readtable('WORK.xlsx');
tData.ParticleConc=1E9*tData.ParticleConc;
tData.CAPE_Bin=discretize(tData.CAPE,[0,500,900,inf]);
tStats=grpstats(tData,'CAPE_Bin',{'mean','std',fnStdMn});
tStats.Properties.VariableNames=strrep(tStats.Properties.VariableNames,'Fun3','stdMn')
figure
hAx=axes; hold on
hL=splitapply(@plotrows,tData.ParticleConc,tData.Temp,tData.CAPE_Bin);
hAx.XScale='log';
grid on
provides the beginnings for the figure, but your data aren't separated by the temperature gradient nearly as neatly as the example figure -- and your particle concentrations cover several orders of magnitude wider range..
The data for the error bars is in the tStats structure as the stdMnTemp value, errorbar will let you add it to the plot although it's already messy enough owing to the overlay of the observations, adding the error bars will basically turn it into a solid blob over the bulk of the figure.
>> tStats
tStats =
3×11 table
CAPE_Bin GroupCount mean_Temp std_Temp stdMn_Temp mean_ParticleConc std_ParticleConc stdMn_ParticleConc mean_CAPE std_CAPE stdMn_CAPE
________ __________ _________ ________ __________ _________________ ________________ __________________ _________ ________ __________
1 1 75 245.93 3.554 0.047386 64.121 90.012 1.2002 156.36 131.14 1.7486
2 2 15 248.27 3.7696 0.2513 11.246 12.918 0.86118 653.97 109.81 7.3209
3 3 59 249.2 3.7406 0.063401 5.6333 10.606 0.17975 1781.2 640.89 10.862
>>
The basics to produce the figure; looks like will need to do a lot of clean up and possibly other selection of binning intervals or some other presentation to be useful.
Oh -- the legend is also simple enough, just write the desired strings.
I wrote a little helper function plotrows to be able to call from splitapply with the grouping variable; you could always just write a straightahead loop selecting the bin variable subset in turn. It looked like
>> type plotrows
function hL=plotrows(x,y,varargin)
xy=sortrows([x y]);
hL=plot(xy(:,1),xy(:,2),varargin{:});
end
The thing was to sort the data to have a smooth line; I tried getting an indexing variable by group in the table, but it didn't work as seemed as should have and I ran out of time to mess with it any further...
5 Comments
dpb
on 14 Jun 2022
Oh, I forgot I had defined an anonymous function to compute the std deviation of the mean -- it was/is
fnStdMn=@(x)std(x)./sqrt(size(x,1));
More Answers (1)
Peter Perkins
on 13 Jun 2022
Zhou, independently from how to make the figure, this code
Data = table2array(WORK01S1); % convert it into array
Temp = double(Data(:,1)); % extract the column of Temperature from Data
Particle = double(Data(:,2)); % extract the column of Particle from Data
Particle = Particle*1000000000; % To change kg/um3 to ug/um3
CAPE = double(Data(:,3)); % extract the column of CAPE from Data
seems much too complicated. I suggest perhaps
Temp = WORK01S1.Temp;
Particle = WORK01S1.ParticleConc*1000000000;
CAPE = WORK01S1.CAPE;
But more to answer your question:
Use the table. It will make your life easier.
>> WORK01S1 = readtable("WORK.xlsx");
>> WORK01S1.binnedCAPE = discretize(WORK01S1.CAPE,[0 500 900 Inf],"categorical");
>> WORK01S1.binnedParticleConc = discretize(WORK01S1.ParticleConc,10,"categorical");
>> meanTemp = varfun(@mean,WORK01S1,"GroupingVariables",["binnedCAPE" "binnedParticleConc" ],"InputVariables","Temp");
>> sem = @(x) std(x)/sqrt(length(x));
>> semTemp = varfun(sem,WORK01S1,"GroupingVariables",["binnedCAPE" "binnedParticleConc" ],"InputVariables","Temp")
>> meanTemp.SE = semTemp.Fun_Temp
meanTemp =
13×5 table
binnedCAPE binnedParticleConc GroupCount mean_Temp SE
__________ ____________________ __________ _________ _______
[0, 500) [0, 3.8e-08) 45 -26.889 0.52438
[0, 500) [3.8e-08, 7.6e-08) 13 -28.538 0.93106
[0, 500) [7.6e-08, 1.14e-07) 5 -29 0.89443
[0, 500) [1.14e-07, 1.52e-07) 2 -22.5 0.5
[0, 500) [1.52e-07, 1.9e-07) 2 -27.5 1.5
[0, 500) [1.9e-07, 2.28e-07) 2 -24.5 2.5
[0, 500) [2.66e-07, 3.04e-07) 2 -23 1
[0, 500) [3.04e-07, 3.42e-07) 3 -27.667 3.6667
[0, 500) [3.42e-07, 3.8e-07] 1 -26 0
[500, 900) [0, 3.8e-08) 14 -24.429 0.99291
[500, 900) [3.8e-08, 7.6e-08) 1 -29 0
[900, Inf] [0, 3.8e-08) 56 -23.661 0.48031
[900, Inf] [3.8e-08, 7.6e-08) 3 -26.333 3.6667
Now you can make the plot. One way is to loop over CAPE bins, where at each iteration, you get the data for one CAPE bin:
for cape_i = categories(meanTemp.binnedCAPE)'
i = meanTemp.binnedCAPE == cape_i;
Temp = meanTemp.mean_Temp(i);
Conc = meanTemp.binnedParticleConc(i);
% make plot
end
Of course, your data doesn't really support your figure, but that's for you to reconcile.
4 Comments
dpb
on 14 Jun 2022
Edited: dpb
on 14 Jun 2022
Thanks...I know there's more to do than time/resources to do it, but... :)
"... with grouping variables there's no way to sort the groups effectively ..."
This one ties in with the previous complaint/suggestions I've made that one can't return the sorting index from sort except as the second, optional output and there's no way to use anything except the first return in the anonymous function effectively for these purposes. I've built a customized sorti that does return the index first; I think there should be a flag parameter in sort for the purpose.
I still don't follow why my use of
@(x,y,ix) plot(x(ix),y(ix))
in splitapply wasn't in sorted order after building that sort index by group and passing it in the anonymous function, but don't have the time just now to pursue it further. I must've made a bonehead blunder somewhere, but it wasn't apparent where that was.
Walter Roberson
on 14 Jun 2022
In the case where duplicate values are to be assigned the same index, then findgroups() returns sort order.
See Also
Categories
Find more on Errorbars in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!