For-loop question

Ronan Rafferty on 6 Dec 2020
Commented: Walter Roberson on 9 Dec 2020
I am using the readmatrix function to read 90 rows and 13 columns of data from a .csv file, with each column being a measurement such as age or gender. I then need to apply if statements to this data, following a decision tree based on the values in each row, to produce an outcome. I have been told that a for loop will help me implement this decision tree on the data, but I haven't been told how.
This is what I have so far:
myData = readmatrix('BME501_Coursework_Testdata.csv');
for i = 1:90
    Val1 = Column1(1);
    Val2 = Column2(1);
    Val3 = Column3(1);
    Val4 = Column4(1);
    Val5 = Column5(1);
    Val6 = Column6(1);
    Val7 = Column7(1);
    Val8 = Column8(1);
    Val9 = Column9(1);
    Val10 = Column10(1);
    Val11 = Column11(1);
    Val12 = Column12(1);
    Val13 = Column13(1);
end
As you can see, I have read the file in and then defined each of the values in row 1 so that they can be used in the decision tree.
My question is: how can I do this for all 90 rows, so that each row (which is one set of data) is handled as a unit and the decision tree can use the values within it to give me an outcome for each row?
  4 Comments
dpb on 7 Dec 2020
Well, that's too illegible to decipher; one would have to work back from it to derive the actual logic before writing the algorithm.
It starts off with
data(:,3)<=3
as the first decision point and goes from there.
Where's the problem definition you were given?
But, that aside, the way to code it, even if you were to keep the compound if...elseif...end block, is to substitute indexed array references for the numbered variables.
Patient0utput = 1; % Global "else" if nothing matches
if X(i,3) <=3 % first level on parameter 3 <= 3
    if X(i,9) <=0 % second level on parameter 9 <= 0
        if X(i,1) <=55 % third level on parameter 1 <=55
            if X(i,3) <=1 % fourth level on parameter 3 <=1
                if X(i,2) <=0 % fifth level on parameter 2 <=0
                    Patient0utput = 0;
                else % Val2 >0
                    && Val7 <=1 && Val3 <=46
                    Patient0utput = 1;
                    Val7 <=1 && Val3 >46
                    Patient0utput = 0;
                    Val7 >1
                    Patient0utput = 0;
                end % Val2 level 5
            else % Val3 >1
                Patient0utput = 0;
            end % Val3 level 4
        else % Val1 >55
            Val7 <=0
            Patient0utput = 1;
            Val7 >0 && Val2 <=0
            Patient0utput = 0;
            Val7 >0 && Val2 >0 && Val7 <=0 && Val3 <=1
            Patient0utput = 1;
            Val7 >0 && Val2 >0 && Val7 <=0 && Val3 >1 && Val4 <=128 && Val8 <=142
            Patient0utput = 1;
            Val7 >0 && Val2 >0 && Val7 <=0 && Val3 >1 && Val4 <=128 && Val8 >142
            Patient0utput = 0;
            Val7 >0 && Val2 >0 && Val7 <=0 && Val3 >1 && Val4 >128
            Patient0utput = 0;
            Val7 >0 && Val2 >0 && Val7 >0 && Val7 <=1
            Patient0utput = 1;
            Val7 >0 && Val2 >0 && Val7 >0 && Val7 >1 && Val12 <=0 && Val7 <=271
            Patient0utput = 0;
            Val7 >0 && Val2 >0 && Val7 >0 && Val7 >1 && Val12 <=0 && Val7 >271
            Patient0utput = 1;
            Val7 >0 && Val2 >0 && Val7 >0 && Val7 >1 && Val12 >0
            Patient0utput = 1;
            Val7 <=0
        end % Val1 >55
    else % Val9 >0
        if X(i,1) <=1
            Patient0utput = 0;
        else % Val1 >1
            Patient0utput = 1;
        end
    end % Val9 level 2
else % Val3 >3
    && Val5 <=0
    Patient0utput = 1;
    && Val5 >0 && Val10 <=0.8 && Val2 <=0 && Val9 <=0
    Patient0utput = 0;
    && Val5 >0 && Val10 <=0.8 && Val2 <=0 && Val9 >0 && Val7 <=0
    Patient0utput = 1;
    && Val5 >0 && Val10 <=0.8 && Val2 <=0 && Val9 >0 && Val7 >0
    Patient0utput = 0;
    && Val5 >0 && Val10 <=0.8 && Val2 >0 && Val12 <=0 && Val13 <=3
    Patient0utput = 0;
    && Val5 >0 && Val10 <=0.8 && Val2 >0 && Val12 <=0 && Val13 >3
    Patient0utput = 1;
    && Val5 >0 && Val10 <=0.8 && Val2 >0 && Val12 >0
    Patient0utput = 1;
    && Val5 >0 && Val10 >0.8 && Val2 <=0 && Val13 <=3 && Val9 <=0
    Patient0utput = 0;
    && Val5 >0 && Val10 >0.8 && Val2 <=0 && Val13 <=3 && Val9 >0
    Patient0utput = 1;
    && Val5 >0 && Val10 >0.8 && Val2 <=0 && Val13 >3
    Patient0utput = 1;
end
A start at factoring the conditions...
Walter Roberson on 7 Dec 2020
Your conditions sometimes clash. For example your first condition
if Val3 <=3 && Val9 <=0 && Val1 <=55 && Val3 <=1 && Val2 <=0
Val3 must be <= 3 (first part) but also <= 1 (fourth part).
You use that same clash on tests 2, 3, 4. But then on test 5 you have
elseif Val3 <=3 && Val9 <=0 && Val1 <=55 && Val3 >1
Val3 must be <= 3 (first part) but also > 1 (fourth part). Those two make more sense to test together.
But look at test 3:
elseif Val3 <=3 && Val9 <=0 && Val1 <=55 && Val3 <=1 && Val2 >0 && Val7 <=1 && Val3 >46
Val3 must be <= 3 (first part) but also <= 1 (fourth part) but also > 46 (seventh part).
Consider going through your tests, and for each variable, make a list of the used conditions, each in numeric sorted order within the test, such as
val1 <= 1, val1 <= 55, val1 > 55
val2 <= 0, val2 > 0
val3 <= 1 & <= 3, val3 <= 1 & <= 3 & <= 46, val3 <= 1 & <= 3 & > 46, val3 > 1 & <= 3, val3 <= 3
Logically, val3 <= 1 & <= 3 & <= 46 could be simplified to val3 <= 1, but you need to review the medical part to see whether that makes sense or whether you instead named the wrong val* for one of the tests. But val3 <= 1 & <= 3 & > 46 is just plain false, so you need to review the medical part to see whether you named the wrong variable or whether the test is truly impossible to satisfy.
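A quick way to spot a test that can never succeed is to evaluate its constraints over a range of candidate values and see whether any value passes. The following is only a sketch: the probe range `v` is an arbitrary assumption, and a grid check like this is a sanity check rather than a proof. The two conditions are the Val3 parts of tests 3 and 5 quoted above.

```matlab
% Candidate values to probe; the range and step are arbitrary choices
% made for illustration only.
v = -10:0.5:100;

% Val3 constraints from test 3: <= 3, <= 1 and > 46 all at once.
test3 = (v <= 3) & (v <= 1) & (v > 46);

% Val3 constraints from test 5: <= 3 and > 1.
test5 = (v <= 3) & (v > 1);

canFire3 = any(test3);   % false: no probed value satisfies all three parts
canFire5 = any(test5);   % true: for example v = 2 satisfies both parts
```

Any test whose flag comes back false over a sensible range deserves a second look at the variable names it uses.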
After all tests are resolved and simplified, for each variable you will end up with a list of breakpoints, such as
val1 <= 1, val1 <= 55, val1 > 55
and you can process that test by discretizing the values into exclusive domains
val1 < 1, val1 = 1, 1 < val1 < 55, val1 = 55, val1 > 55
if you assign a number (possibly categorical or enumeration) to each, then in your tests you can code something like
ismember(val1idx, [EQ1, LT55])
which would encode val1 == 1 or (val1 > 1 & val1 < 55)
It is a bit of a nuisance to have to code the == separately from the < or >, but at the moment it looks to me as if some of your ranges are valN >= A & valN <= B, others are valN > A & valN <= B, and others are valN >= A & valN < B.
With this kind of coding, if you wanted to express a strict < 55 including <= 1, you would code the entire list up to that point,
ismember(val1idx, [LT1, EQ1, LT55])
This kind of setup helps to think of the conditions more methodically, and can even express "or" for disjoint ranges. But for your purposes it might turn out that you do not need disjoint ranges, and you might just need to know starting and ending condition numbers, like
ismember(val1idx, EQ1:LT55)
that would then potentially lend itself to encoding in a table, like
[1, EQ1, LT55;  %val1 [1, 55)
 2, -inf, LT0;  %val2 (-inf, 0)
 3, GT3, EQ46]  %val3 (3, 46]
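The discretization idea for val1 can be sketched as follows. The code numbers are arbitrary, the names EQ55 and GT55 are added here for completeness (they do not appear above), and the sample values are made up so that each domain is hit once.

```matlab
% Codes for the exclusive domains of val1, in increasing order.
% EQ55 and GT55 are hypothetical names added for this sketch.
LT1 = 1; EQ1 = 2; LT55 = 3; EQ55 = 4; GT55 = 5;

val1 = [0 1 30 55 70];                 % made-up sample values

% Map each value to exactly one domain code.
val1idx = zeros(size(val1));
val1idx(val1 < 1)             = LT1;
val1idx(val1 == 1)            = EQ1;
val1idx(val1 > 1 & val1 < 55) = LT55;
val1idx(val1 == 55)           = EQ55;
val1idx(val1 > 55)            = GT55;

% val1 == 1 or (val1 > 1 & val1 < 55), written as a list of codes ...
inList  = ismember(val1idx, [EQ1, LT55]);
% ... or as a contiguous run of codes.
inRange = ismember(val1idx, EQ1:LT55);
```

Because the codes are assigned in increasing order of the underlying values, a contiguous interval is always a contiguous run of codes, which is what makes the start/end table form possible.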

Accepted Answer

Walter Roberson on 6 Dec 2020
myData = readmatrix('BME501_Coursework_Testdata.csv');
for i = 1 : size(myData,1)
    thisrow = myData(i,:);
    % now use thisrow in your decision tree
end
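As a rough illustration of what "use thisrow in your decision tree" might look like, here is a minimal sketch with a two-level tree. The thresholds, the choice of columns 3, 9 and 5, the variable name outcome, and the small inline matrix standing in for the CSV contents are all assumptions made for the example; they are not the coursework tree.

```matlab
% Made-up stand-in for the file contents: 4 rows x 13 columns.
% Real code would get this matrix from the CSV file instead.
myData = zeros(4, 13);
myData(1, [3 9]) = [2 0];
myData(2, [3 9]) = [2 1];
myData(3, [3 5]) = [4 0];
myData(4, [3 5]) = [4 1];

nRows = size(myData, 1);
outcome = zeros(nRows, 1);        % one result per row, preallocated

for i = 1:nRows
    thisrow = myData(i, :);       % the 13 measurements for row i
    if thisrow(3) <= 3            % hypothetical first split
        if thisrow(9) <= 0        % hypothetical second split
            outcome(i) = 0;
        else
            outcome(i) = 1;
        end
    else
        if thisrow(5) <= 0        % hypothetical split on the other branch
            outcome(i) = 1;
        else
            outcome(i) = 0;
        end
    end
end

disp(outcome.')                   % prints 0 1 1 0 for the sample rows
```

Storing the result in outcome(i) rather than a scalar keeps the answer for every row, so the full vector is available after the loop finishes.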
  2 Comments
Ronan Rafferty on 7 Dec 2020
Thank you, this looks like it'll work.
I'm quite new to MATLAB, so I'm not sure exactly how this will look for my decision tree. Could you give me an example of how I would use thisrow in the decision tree? Even a very basic decision tree that I could use as a reference would help.
Thanks for your help!
Walter Roberson on 9 Dec 2020
You can just feed it sample data and associated class labels, and it will automatically figure out what the tree looks like.

Release

R2018b