Elegant way to view and compare data based on feature/parameter selection?

1 view (last 30 days)
I have a data set containing results from various tests under different test conditions. Now I want to analyze the data so one can see the impact of the test conditions (parameters).
I read the data into a table. The data uploaded is an arbitrary minimal example. I have more test objects, more parameters and more tests.
Think of this example as an aging test under AC and DC Load.
tbl = readtable("TestData.xlsx", 'PreserveVariableNames', 1);
% treat missing duration as zero
tbl{isnan(tbl.("Duration / h")), "Duration / h"} = 0
tbl = 7×6 table
Test Object Ud / V Ud / Ud0 Temperature / °C Duration / h Load ___________ ______ ________ ________________ ____________ ______ {'TO1'} 10 NaN 30 0 {'AC'} {'TO1'} 12 NaN 30 0 {'DC'} {'TO2'} 11 NaN 30 0 {'AC'} {'TO2'} 13 NaN 30 0 {'DC'} {'TO1'} NaN 50 30 98 {'AC'} {'TO1'} NaN 80 30 112 {'DC'} {'TO2'} NaN 70 30 95 {'AC'}
Then I select some parameters I want to compare to.
For example compare TO1 and TO2 "new" under AC and DC Load:
% data for labels
x = categorical(tbl{(tbl.("Load") == "AC") & int16(tbl.("Duration / h") == 0), "Test Object"});
% select dataset 1
y1 = tbl{(tbl.("Load") == "AC") & int16(tbl.("Duration / h") == 0), "Ud / V"}
y1 = 2×1
10 11
% select dataset 2
y2 = tbl{(tbl.("Load") == "DC") & int16(tbl.("Duration / h") == 0), "Ud / V"}
y2 = 2×1
12 13
% plot both data sets
bar(x, [y1, y2])
legend("AC", "DC", 'location', 'northwest')
In this case I have hard coded the selection and variables. Is there a more elegant and/or easy way to select the data?
I already have approx. 10 more parameters and 10 more test objects. It is likely that even more parameters are added in the future. Therefore I'd like to have my code scalable from the beginning.
In Excel I can use Filter and the drop down menue to select data, but plotting is cumbersome.
Also you might have noticed there is no data for TO2, aged with DC. With the current solution I will get errors if I wanted to compare DC Load new to aged because y1 and y2 don't have compatible length anymore.
In this case it would be easy to say it's not possible to compare, of course. But if one or two out of ten test objects are missing, I still like to compare the 8 or 9 remaining.
Any Ideas for that?
  2 Comments
Ive J
Ive J on 31 Aug 2021
I don't think there is a straightforward way for this. But you can use groupsummary and groupfilter. For the sample dataset you provided, you can squeeze all grouping into one line:
T = groupsummary(data, {'Test Object', 'Load'}, @(x){x(~isnan(x))}, 'Ud / V')
Test Object Load GroupCount fun1_Ud / V
___________ ____ __________ ___________
"TO1" "AC" 2 {[10]}
"TO1" "DC" 2 {[12]}
"TO2" "AC" 2 {[11]}
"TO2" "DC" 1 {[13]}
% use T for visualization
This is especially useful if you have more numbers per each category (e.g. you have multiple 'Ud / V' values for TO1 & AC) for which you can apply a custom function.

Sign in to comment.

Answers (1)

Peter Perkins
Peter Perkins on 2 Mar 2022
Martin, you may have moved on already, but maybe this is still helpful.
First, I think using categorical will make life easier.
>> tbl = readtable("TestData.xlsx", 'PreserveVariableNames', 1);
>> tbl.("Duration / h")(isnan(tbl.("Duration / h"))) = 0; % treat missing duration as zero
>> tbl.("Test Object") = categorical(tbl.("Test Object"));
>> tbl.Load = categorical(tbl.Load)
Next, this seems like the next step in making this code more general:
>> designVar1 = "Test Object";
>> designVar2 = "Load";
>> measurementVar = "Ud / V";
>> newTbl = tbl(tbl.("Duration / h")==0,[designVar1 designVar2 measurementVar]);
>> newTbl = sortrows(newTbl,[designVar1 designVar2])
>> n = length(categories(tbl.(designVar1)));
>> m = length(categories(tbl.(designVar2)));
>> bar(reshape(newTbl.(measurementVar),n,m));
>> set(gca,'XTickLabel',categories(tbl.(designVar1)))
>> legend(categories(tbl.(designVar2)), 'location', 'northwest')
This would work for any number of levels of two design vars. I don't know what kind of chart you'd want to make for more than two design vars. Maybe groups of groups of bars. It's all in the reshape, I think.

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!