Oversampling method (SMOTE)
Dear all, I have used SMOTE (an oversampling method for balancing a data set), but the balanced data set it returns does not include the label column. The number of rows increases after balancing, but the label column is not extended along with the features. The original data set is 1000*25. The balanced data set is 2200*24, without the label column. The labels are returned separately in "final_labels", which is 2200*1 but contains only label 1. It should contain both labels 1 and 2.
I would be very grateful if anyone could guide me. Any suggestion will be appreciated.
------------------------------------------------
This is the script I use to balance the data set.
-----------------------------------------------------
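load creditgerman.mat
a = creditgerman;
[n, m] = size(a);
total_rows = (1:n);                 % row indices (not used below)
original_features = a(:, 1:m-1);    % all columns except the last
original_mark = a(:, m);            % last column holds the class label (1 or 2)
[creditgerman_balanced_SMOTE, final_labels] = SMOTE(original_features, original_mark);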
--------------------------------------------------------------------------
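Since the function returns the features and the labels as two separate outputs, the label column has to be appended by hand to get back a single matrix in the original layout. A minimal sketch of that step, using small made-up stand-ins (`balanced_features` is a hypothetical name) instead of the real outputs:

```matlab
% Stand-ins for the two outputs of SMOTE (hypothetical toy values).
balanced_features = [0.1 0.2; 0.3 0.4; 0.5 0.6; 0.7 0.8];
final_labels      = [1; 1; 2; 2];

% Append the labels as the last column, matching the layout of the original data.
balanced = [balanced_features, final_labels];

% Count the rows per class to check that both labels are present.
classes = unique(balanced(:, end));
counts  = arrayfun(@(c) sum(balanced(:, end) == c), classes);
disp([classes, counts])
```

Running the same count on the real `final_labels` is a quick way to confirm whether one class has disappeared.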
And this is the SMOTE function I used.
function [final_features, final_mark] = SMOTE(original_features, original_mark)
    % Rows with label 2 are used as the points to oversample
    ind = find(original_mark == 2);
    % P = candidate points
    P = original_features(ind, :);
    T = P';
    % X = Complete Feature Vector
    X = T;
    % Finding the 5 nearest neighbours of every candidate point within P
    % (6 are requested because the first match is expected to be the point itself)
    I = nearestneighbour(T, X, 'NumberOfNeighbours', 6);
    I = I';
    [r, c] = size(I);
    S = [];
    th = 0.3;   % not used below
    for i = 1:r
        for j = 2:c
            index = I(i, j);
            % Synthetic point at a random position between P(i,:) and its neighbour
            new_P = P(i, :) + ((P(index, :) - P(i, :)) * rand);
            S = [S; new_P];
        end
    end
    original_features = [original_features; S];
    [r, c] = size(S);
    mark = ones(r, 1);   % every synthetic row is given label 1
    original_mark = [original_mark; mark];
    % Cleaning step: flag rows whose 5 nearest neighbours disagree with their label
    train_incl = ones(length(original_mark), 1);
    I = nearestneighbour(original_features', original_features', 'NumberOfNeighbours', 6);
    I = I';
    for j = 1:length(original_mark)
        neighbors = I(j, 2:6);
        len = length(find(original_mark(neighbors) ~= original_mark(j, 1)));
        if (len >= 2)
            if (original_mark(j, 1) == 1)
                % Row has label 1: drop its neighbours that carry the other label
                train_incl(neighbors(original_mark(neighbors) ~= original_mark(j, 1)), 1) = 0;
            else
                % Row has any other label: drop the row itself
                train_incl(j, 1) = 0;
            end
        end
    end
    final_features = original_features(train_incl == 1, :);
    final_mark = original_mark(train_incl == 1, :);
end
-----------------------------------------------------------
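In the function above, the synthetic rows are interpolated from the rows labelled 2 but are then labelled with `mark = ones(r,1)`, and the cleaning loop drops any label-2 row that has two or more label-1 neighbours. That combination may be why only label 1 survives, although this has not been checked against creditgerman.mat. The sketch below shows only the oversampling step with the synthetic rows given the minority label. It is an illustration on made-up toy data, not a corrected version of the posted function: it assumes class 2 is the minority, omits the cleaning step, uses base MATLAB instead of `nearestneighbour`, and the names `minority_label`, `k` and `augmented` are hypothetical.

```matlab
% Toy data: 16 majority rows (label 1) and 4 minority rows (label 2).
rng(0);                                    % reproducible random numbers
features = [randn(16, 2); randn(4, 2) + 3];
labels   = [ones(16, 1); 2 * ones(4, 1)];

minority_label = 2;                        % assumption: class 2 is the minority
k = 3;                                     % neighbours used per minority point

P  = features(labels == minority_label, :);
nP = size(P, 1);

% Pairwise squared Euclidean distances between the minority points.
sq = sum(P.^2, 2);
D  = bsxfun(@plus, sq, sq') - 2 * (P * P');
D(1:nP+1:end) = Inf;                       % a point is not its own neighbour
[~, order] = sort(D, 2);
nbrs = order(:, 1:k);                      % k nearest minority neighbours per row

% One synthetic point per (point, neighbour) pair, at a random position between them.
S = zeros(nP * k, size(P, 2));
row = 0;
for i = 1:nP
    for j = 1:k
        row = row + 1;
        S(row, :) = P(i, :) + rand * (P(nbrs(i, j), :) - P(i, :));
    end
end

% Label the synthetic rows with the minority label instead of a hard-coded 1.
synthetic_labels = minority_label * ones(size(S, 1), 1);

% Keep the label as the last column, as in the original data set.
augmented = [features, labels; S, synthetic_labels];
counts = [sum(augmented(:, end) == 1), sum(augmented(:, end) == 2)];
disp(counts)                               % counts is [16 16]
```

If a cleaning step like the one in the posted function is kept, the test `original_mark(j,1) == 1` would presumably also need to compare against the minority label, since that branch decides which class is protected from removal.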
