Grouping multiple variables in boxplot with different numbers of samples

Hi - I have n-by-m data where n is the number of datapoints and m is the number of variables.
I have an additional vector, length n, containing 0s and 1s. The 0s and 1s indicate which datapoints are controls and experimental conditions, respectively.
My data is a matrix with n = 43 datapoints and m = 15 variables.
I would like to plot paired boxplots, such as this example from the Mathematica Stack Exchange (https://mathematica.stackexchange.com/questions/66720/how-to-group-box-and-whisker-in-parallel-for-comparison-in-boxwhiskerchart):
Basically, I'm trying to plot 15 variables with two boxes each (A,B,C,.. in the image above), one for control and one for experimental condition for each variable (Apples and Oranges in the image above).
To play around, I tried the following:
X = randn(43,15); % Sample data - 43 points, 15 vars
group = randi([0,1],43,1); % Vector of length 43, 0 = control, 1 = exp
boxplot(X,group)
Error using boxplot>straightenX
G must be the same length as X or the same length as the number of columns in X.

Error in boxplot (line 275)
[xDat,gDat,origRow,xlen,gexplicit,origInd,origNumXCols] = straightenX(x,g);
For a matrix input, boxplot seems to assume that each column of X is a separate group, but in my data the groups are within each column of X, as indicated by the group variable.
I feel like I'm missing something very basic here, but I'm sleep-deprived and not seeing it. (Please note also that I am currently limited to functionality in R2018b.)
Any help would be much appreciated. TYIA!

Accepted Answer

Cris LaPierre on 2 May 2023
Edited: Cris LaPierre on 2 May 2023
It is possible to create a plot like this in MATLAB, but with boxchart (introduced in R2020a), not boxplot, and your inputs must be vectors, not matrices. This means you need to reshape your inputs so that they all have the same number of elements. I was able to do this using repmat and the colon operator.
X = randn(43,15); % Sample data - 43 points, 15 vars
group = randi([0,1],43,1); % Vector of length 43, 0 = control, 1 = exp
groups = repmat(group,size(X,2),1);
var = string((char(65):char(65+size(X,2)-1)).');
vars = repmat(categorical(var),1,size(X,1)).';
boxchart(vars(:),X(:),'GroupByColor',groups(:))
legend("Apples","Oranges")
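In case it helps to see why the reshaping works, here is a small check (just a sketch, not part of the original code) that the three vectors passed to boxchart line up element by element. The fixed seed, the row/column indices, and the name `names` are my own choices for illustration.

```matlab
% Sketch: confirm the three boxchart inputs describe the same elements of X.
rng(0);                                            % assumption: fixed seed, only for repeatability
X = randn(43,15);                                  % 43 points, 15 vars
group = randi([0,1],43,1);                         % 0 = control, 1 = exp

groups = repmat(group,size(X,2),1);                % 645-by-1, group vector repeated once per column
names  = string(char(65:65+size(X,2)-1).');        % "A" through "O", 15-by-1
vars   = repmat(categorical(names),1,size(X,1)).'; % 43-by-15, one letter per column of X

% All three inputs need one entry per element of X
assert(numel(vars) == numel(X));
assert(numel(groups) == numel(X));

% Pick an arbitrary element (row 7, column 3) and find it by linear index
r = 7; c = 3;
k = sub2ind(size(X),r,c);
xv = X(:);
vv = vars(:);
assert(xv(k) == X(r,c));          % same data value
assert(string(vv(k)) == "C");     % column 3 is variable "C"
assert(groups(k) == group(r));    % same control/exp flag as row 7
```

Because X(:) and vars(:) are both read column by column, and groups repeats the 43-element flag once per column, index k refers to the same datapoint in all three vectors.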
If your data is in a table, then you can use stack and your table properties to do the data manipulation for you.
% Create a table and name the variables
data = table(group,X);
data = splitvars(data,'X','NewVariableNames',var)
data = 43×16 table
group          A          B          C          D          E          F          G          H          I          J          K          L          M          N          O
_____  _________  _________  _________  _________  _________  _________  _________  _________  _________  _________  _________  _________  _________  _________  _________
    0   -0.78365    -1.3672     1.2767   -0.66198    0.44009    -1.1647    -1.0523    0.91291    0.19244   -0.26104    0.85805    -1.5931     1.8209    0.13791     1.6072
    0     0.5338   -0.11093    -1.0237    0.31408     0.4591   -0.67494    0.21593    0.15389    0.36312    0.47997     1.0372   -0.73125     2.8055    0.42833  -0.050669
    1   -0.75449    0.14931    0.94171     1.0577   -0.63141    0.41018  -0.044837     2.2156    -1.0096   -0.32943   -0.89334    0.89178    -1.9457   -0.60051    0.22861
    0   -0.52258     1.9209    0.51428     1.6171   -0.67284    0.80086   -0.98011   -0.19906   -0.44956   -0.54665     1.2315    0.62645   -0.39145    0.14721     0.2675
    1   -0.28495  -0.011854      1.492     0.8763     1.1202   -0.76473     1.7985    0.60363   -0.24757     -1.038    0.55251    0.86205     1.5242    0.69273    0.39518
    1   -0.48679    0.95656    0.18366    0.46253   -0.17019    0.45269    0.36548     1.9587     1.0833    0.49743    -1.1618    0.87201    0.78753  -0.092399   -0.54733
    0   -0.81188  0.0065904   -0.32503     1.1963      0.253    -0.7246    0.31608   -0.37053     0.2739    0.92101     2.3102   -0.31289    0.21085     0.3611     1.1466
    1  -0.012616     1.2022    0.51439   -0.84723  -0.031184   -0.58333    0.34962    -1.2689   -0.08459    0.65182   -0.61438    0.32188     0.7876    -1.3868    -1.1578
    1    0.96285    -0.4046     0.2313     0.1222    0.58342    0.78438    -0.6552   -0.30795     1.1631     1.0569     2.2018    -1.2669   -0.37615   -0.78463    0.16056
    0     1.4431    -0.8278    0.67344   0.011972    -1.3331     1.2988   -0.85942    0.20976    0.52739   -0.27757   -0.81509     0.3486   -0.71525     1.0656   -0.70969
    1   -0.48448     1.2058   -0.70949     1.2218   -0.30543   -0.34847    0.10634   -0.40955    -1.2326    -1.1277    0.11259    0.31992    0.80036    -1.5255    -1.6419
    0    0.55114    0.85516    0.34844   -0.57411      1.542     2.0835    0.20677    -1.0867    0.77636     1.1072    0.55104     1.0428   -0.78806   -0.75644   0.045694
    0    -1.7639  -0.097171    -1.3497    0.60566   -0.36955   -0.36066   -0.87998   -0.57048   -0.41724     1.2859   -0.35725    -0.8071    -1.1749   -0.43949      0.649
    1    0.35676    0.82779   0.014518    -1.7933    0.15958   -0.20429  0.0099976    -1.4156   -0.80663  -0.063186    0.29063    -1.7622      1.595        1.8     -2.021
    1    -1.9696    -1.2014    0.10853   -0.22364   -0.20838    0.46176    -0.7397      2.394    -1.6997    0.68078    0.10383      -1.74    -1.2035    -2.2549    0.78085
    1   -0.42589   -0.17361  -0.088093    -0.7349   -0.57292     1.3826     1.2355   -0.63352    0.26533   -0.26311    0.77797    -0.6987   -0.41754    -1.1781   -0.61961
% Stack the data to create the 3 inputs to boxchart
bpData = stack(data,2:width(data));
bpData.Properties.VariableNames = ["group","var","X"]
bpData = 645×3 table
group  var          X
_____  ___  _________
    0    A   -0.78365
    0    B    -1.3672
    0    C     1.2767
    0    D   -0.66198
    0    E    0.44009
    0    F    -1.1647
    0    G    -1.0523
    0    H    0.91291
    0    I    0.19244
    0    J   -0.26104
    0    K    0.85805
    0    L    -1.5931
    0    M     1.8209
    0    N    0.13791
    0    O     1.6072
    0    A     0.5338
boxchart(bpData.var,bpData.X,'GroupByColor',bpData.group)
legend
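To make the row order that stack produces easier to see, here is a toy example with made-up numbers (a sketch; the 3-row table and the names `wide` and `stacked` are only for illustration). It also names the new columns in the stack call itself, using the 'NewDataVariableName' and 'IndexVariableName' options, instead of renaming them afterwards.

```matlab
% Toy table: 3 datapoints, 2 variables (made-up values for illustration)
group = [0; 1; 0];
A = [1; 2; 3];
B = [10; 20; 30];
wide = table(group,A,B);

% Stack A and B into one data column; the indicator column records the source variable
stacked = stack(wide,{'A','B'}, ...
    'NewDataVariableName','X','IndexVariableName','var');

% One output row per (original row, stacked variable) pair, interleaved by original row
assert(height(stacked) == 6);
assert(isequal(stacked.X,[1;10;2;20;3;30]));
assert(isequal(stacked.group,[0;0;1;1;0;0]));
assert(isequal(string(stacked.var),["A";"B";"A";"B";"A";"B"]));
```

The group value is carried along with each stacked row, which is why the stacked table can be passed to boxchart column by column without any repmat calls.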
  3 Comments
Cris LaPierre on 3 May 2023
It's not a great way, but yes, you can. You need to create 2 grouping variables. Note that boxplot does not have a legend option. It also adds both grouping variable values as the X tick label.
% Create sample data
data = randn(43,15); % Sample data - 43 points, 15 vars
% Create grouping variables (value for each element of data)
group0 = string((char(65):char(65+size(data,2)-1)).');
group0 = repmat(group0,1,size(data,1)).';
group1 = randi([0,1],43,1); % Vector of length 43, 0 = control, 1 = exp
group1 = repmat(group1,size(data,2),1);
group = {group0(:),group1};
% Create boxplot with 2 boxplots in each group
boxplot(data,group, 'Colors', ['b', 'r'], 'Widths', 0.5)
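Since boxplot has no legend and prints both grouping values in the tick labels, one possible variation (a sketch only, I have not run it on R2018b) is to make the second grouping variable descriptive and let it control the box color through the 'ColorGroup' option. The names `isExp`, `letters`, `varName` and `condition` are illustrative, and the 'control'/'exp' labels are assumed from the comments in the code above.

```matlab
% Sketch for the boxplot route; variable names here are illustrative
data  = randn(43,15);                              % 43 points, 15 vars
isExp = randi([0,1],43,1);                         % 0 = control, 1 = exp

letters = string(char(65:65+size(data,2)-1).');    % "A" through "O", 15-by-1
varName = repmat(letters.',size(data,1),1);        % 43-by-15, one letter per column
varName = categorical(varName(:));                 % 645-by-1, matches data(:)

% Descriptive condition labels, repeated once per column of data
condition = categorical(isExp,[0 1],{'control','exp'});
condition = repmat(condition,size(data,2),1);      % 645-by-1

% Every grouping variable needs one entry per element of data(:)
assert(numel(varName) == numel(data));
assert(numel(condition) == numel(data));

% ColorGroup makes the box color change with the condition only
boxplot(data(:),{varName,condition}, ...
    'ColorGroup',condition,'Colors','br','Widths',0.5)
```

With this version the tick labels should read something like "A control" and "A exp" rather than "A 0" and "A 1", which partly makes up for the missing legend.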
James Hengenius on 7 May 2023
Thank you!
I ended up jumping onto my personal computer with a newer version to use boxchart (I was under a time crunch). But this code snippet will be very useful in coding a permanent version of the script for use with R2018b.

Release

R2018b
