Grouping multiple variables in boxplot with different numbers of samples
James Hengenius
on 2 May 2023
Commented: James Hengenius
on 7 May 2023
Hi - I have n-by-m data where n is the number of datapoints and m is the number of variables.
I have an additional vector, length n, containing 0s and 1s. The 0s and 1s indicate which datapoints are controls and experimental conditions, respectively.
My data is a matrix with n = 43 datapoints and m = 15 variables.
I would like to plot paired boxplots, such as this example from the Mathematica Stack Exchange (https://mathematica.stackexchange.com/questions/66720/how-to-group-box-and-whisker-in-parallel-for-comparison-in-boxwhiskerchart):

Basically, I'm trying to plot 15 variables with two boxes each (A,B,C,.. in the image above), one for control and one for experimental condition for each variable (Apples and Oranges in the image above).
To play around, I tried the following:
X = randn(43,15); % Sample data - 43 points, 15 vars
group = randi([0,1],43,1); % Vector of length 43, 0 = control, 1 = exp
boxplot(X,group)
It seems to be assuming for a matrix input that different columns of X are different groups, but the different groups are within the columns of X, represented by the group var.
I feel like I'm missing something very basic here, but I'm sleep-deprived and not seeing it. (Please note also that I am currently limited to functionality in R2018b.)
Any help would be much appreciated. TYIA!
Accepted Answer
Cris LaPierre
on 2 May 2023
Edited: Cris LaPierre
on 2 May 2023
It is possible to create a plot like this in MATLAB, but with boxchart (introduced in R2020a), not boxplot, and your inputs must be vectors, not matrices. This means you need to reshape your inputs so that they are all vectors of the same length. I was able to do this using repmat and the colon operator.
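X = randn(43,15); % Sample data - 43 points, 15 vars
group = randi([0,1],43,1); % Vector of length 43, 0 = control, 1 = exp
% Repeat the condition labels once per variable (645-by-1)
groups = repmat(group,size(X,2),1);
% Variable names "A" through "O", one per column of X
var = string((char(65):char(65+size(X,2)-1)).');
% Repeat each name for every datapoint (43-by-15, matching X)
vars = repmat(categorical(var),1,size(X,1)).';
% Column-wise (:) flattening keeps vars, X, and groups aligned
boxchart(vars(:),X(:),'GroupByColor',groups(:))
legend("Apples","Oranges")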
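To see why the three flattened vectors stay aligned, here is a minimal sketch on a toy 3-by-2 matrix. The values and the names `names`, `value`, and `long` are made up for illustration and are not part of the answer above.

```matlab
% Toy data: 3 datapoints (rows), 2 variables (columns)
X = [1 2; 3 4; 5 6];
group = [0; 1; 0];                     % 0 = control, 1 = experimental

% One condition label per element of X(:), repeated once per column
groups = repmat(group, size(X,2), 1);

% Variable names "A" and "B", repeated for every datapoint
names = string(char(65:65+size(X,2)-1).');
vars = repmat(categorical(names), 1, size(X,1)).';   % 3-by-2, same shape as X

% (:) reads column by column, so row k of each vector describes one element of X
long = table(vars(:), X(:), groups, 'VariableNames', {'var','value','group'});
disp(long)
```

Each row of `long` then holds one measurement together with its variable name and its condition, which is the shape the three boxchart inputs need.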
If your data is in a table, then you can use stack and your table properties to do the data manipulation for you.
% Create a table and name the variables
data = table(group,X);
data = splitvars(data,'X','NewVariableNames',var)
% Stack the data to create the 3 inputs to boxchart
bpData = stack(data,2:width(data));
bpData.Properties.VariableNames = ["group","var","X"]
boxchart(bpData.var,bpData.X,'GroupByColor',bpData.group)
legend
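As a rough illustration of what stack does here, the sketch below runs the same steps on a toy 3-by-2 matrix. The values and the names `wide` and `long` are hypothetical. Note that stack interleaves the variables row by row rather than column by column, which is fine because each row still carries its own group and variable label.

```matlab
% Toy data: 3 datapoints, 2 variables
group = [0; 1; 0];                     % 0 = control, 1 = experimental
X = [1 2; 3 4; 5 6];

% Wide table: one row per datapoint, one column per variable
wide = table(group, X);
wide = splitvars(wide, 'X', 'NewVariableNames', {'A','B'});

% Long table: one row per (datapoint, variable) pair
long = stack(wide, 2:width(wide));
long.Properties.VariableNames = {'group','var','X'};
disp(long)
```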
3 Comments
Cris LaPierre
on 3 May 2023
It's not a great way, but yes, you can. You need to create 2 grouping variables. Note that boxplot does not have a legend option. It also shows the values of both grouping variables in each x-tick label.
% Create sample data
data = randn(43,15); % Sample data - 43 points, 15 vars
% Create grouping variables (value for each element of data)
group0 = string((char(65):char(65+size(data,2)-1)).');
group0 = repmat(group0,1,size(data,1)).';
group1 = randi([0,1],43,1); % Vector of length 43, 0 = control, 1 = exp
group1 = repmat(group1,size(data,2),1);
group = {group0(:),group1};
% Create boxplot with 2 boxes per variable (data(:) gives one value per grouping entry)
boxplot(data(:),group, 'Colors', ['b', 'r'], 'Widths', 0.5)
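Although boxplot has no legend option, one possible workaround is to pass the handles of the drawn boxes to legend. The sketch below is an assumption-laden illustration, not part of the comment above. It assumes that boxplot draws each box as a line object tagged 'Box', that 'Colors','br' yields pure blue and red, and that the boxes alternate control/experimental within each variable. The alternating 0/1 labels and the names `condition`, `varName`, `cond`, `hCtrl`, and `hExp` are made up.

```matlab
% Toy data in the same shape as the thread's example
data = randn(43,15);
condition = mod((1:43).', 2);          % stand-in labels: alternating 1/0

% Long-format grouping variables, one entry per element of data(:)
names = string(char(65:65+size(data,2)-1).');
varName = repmat(names, 1, size(data,1)).';       % 43-by-15
cond = repmat(condition, size(data,2), 1);        % 645-by-1

% Two boxes per variable, colored blue and red in turn
boxplot(data(:), {varName(:), cond}, 'Colors', 'br', 'Widths', 0.5)

% Pick one box of each color and hand those handles to legend
hCtrl = findobj(gca, 'Tag', 'Box', 'Color', [0 0 1]);
hExp  = findobj(gca, 'Tag', 'Box', 'Color', [1 0 0]);
legend([hCtrl(1), hExp(1)], {'Control (0)', 'Experimental (1)'})
```

Selecting the boxes by color avoids relying on the order in which findobj returns the handles.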

