Randomly split data set and make sure that a sample of each class is included

1 view (last 30 days)
Hello,
I am randomly splitting my dataset into train,validation and test sets as shown below:
% Random split Complete Data Set
Q = size(completeRDS,1);
Q1 = floor(Q*0.75);
Q2 = Q-Q1;
Q2 = round(Q2/2);
Q3 = Q2;
ind = randperm(Q);
ind1 = ind(1:Q1);
ind2 = ind(Q1+(1:Q2));
ind3 = ind(Q2+(1:Q3));
% Complete Data Set
trainData = completeRDS(ind1,1);
trainSeqLabels = seqLabels(ind1,1);
valData = completeRDS(ind3,1);
valLabels = seqLabels(ind3,1);
testData = completeRDS(ind2,1);
testLabels = seqLabels(ind2,1);
I have just realised that my test set has not included a sample from each class that I am trying to classify. It has not included 2 of the classes
How can i split my data set and make sure that the test set includes at least 1 sample of each class that i am trying to classify?
Thanks in advance

Answers (1)

MaryD
MaryD on 8 Jan 2020
If those are images I think best way is to load your data with class labels and then use splitEachLabel( __,'randomize') function.

Products


Release

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!