Hi everyone,
I need your help with my project.
I have already built an SVM model for classification with 4 labels. The SVM model works very well: classification accuracy exceeds 90%.
However, when I test the model on new data (the original data passed through an AWGN channel with a 10 dB signal-to-noise ratio (SNR)), the classification accuracy is always below 30%.
I can't figure out why, despite trying many approaches. Please help!
[Attached images: untitled.jpg, untitled1.jpg]
My code is as follows:
%% preparing data
load('mydata.mat') % 200 observations, 120 features, 4 labels
output = grp2idx(Y);
rand_num = randperm(size(X,1));
% training data set 70%, test set 30%,
X_train = X(rand_num(1:round(0.7*length(rand_num))),:);
y_train = output(rand_num(1:round(0.7*length(rand_num))),:);
X_test = X(rand_num(round(0.7*length(rand_num))+1:end),:);
y_test = output(rand_num(round(0.7*length(rand_num))+1:end),:);
%% Train a classifier
% This code specifies all the classifier options and trains the classifier.
template = templateSVM(...
'KernelFunction', 'linear', ...
'PolynomialOrder', [], ...
'KernelScale', 'auto', ...
'BoxConstraint', 1, ...
'Standardize', true);
Mdl = fitcecoc(...
X_train, ...
y_train, ...
'Learners', template, ...
'Coding', 'onevsall',...
'OptimizeHyperparameters','auto',...
'HyperparameterOptimizationOptions',...
struct('AcquisitionFunctionName',...
'expected-improvement-plus'));
%% Perform cross-validation
partitionedModel = crossval(Mdl, 'KFold', 10);
% Compute validation predictions
[validationPredictions, validationScores] = kfoldPredict(partitionedModel);
% Compute validation accuracy
validation_error = kfoldLoss(partitionedModel, 'LossFun', 'ClassifError'); % validation error
validationAccuracy = 1 - validation_error;
%% test model
oofLabel_n = predict(Mdl,X_test);
oofLabel_n = double(oofLabel_n); % convert from categorical to double
test_accuracy_for_iter = sum((oofLabel_n == y_test))/length(y_test)*100;
%% save model
saveCompactModel(Mdl,'mySVM');

3 Comments

You forgot to attach 'mydata.mat', so we can't run your code.
Maybe SVM is not the best approach. Maybe you should try a discriminant analysis or something. Try the Classification Learner app on the Apps tab of the tool ribbon.
As @ImageAnalyst suggests, we can't do much without the data.
That being said, it is suspicious that almost all points for the new data are classified into class 4 (rather than a more random misclassification). That should give you a hint as to what is happening.
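One way to check the pattern described above is to count how often each class is predicted. The following is only a minimal sketch, not the poster's code: yTrue and yPred are made-up stand-in vectors (in the thread's test script they would correspond to y_test and oofLabel_n), and confusionmat comes from Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in labels; replace with y_test and oofLabel_n.
yTrue = [1 1 2 2 3 3 4 4]';
yPred = [4 4 4 2 4 4 4 4]';

% Rows of C are true classes, columns are predicted classes.
C = confusionmat(yTrue, yPred);

% Number of predictions that landed in each class.
predCounts = sum(C, 1);
disp(predCounts)
```

If one column dominates, as class 4 does in this made-up example, the noisy features may simply lie in a different region of feature space than the data the model was trained on.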
Thanks for your support!
I have attached my data; below is the code I use to test the SVM model.
Look forward to your advice!
clc
clear all
%% preparing data
%load('mydata.mat')
load('dataWithNoiseSNR10dBForTest.mat');
output = grp2idx(Y);
rand_num = randperm(size(X,1));
% training data set 70%, test set 30%,
X_train = X(rand_num(1:round(0.7*length(rand_num))),:);
y_train = output(rand_num(1:round(0.7*length(rand_num))),:);
X_test = X(rand_num(round(0.7*length(rand_num))+1:end),:);
y_test = output(rand_num(round(0.7*length(rand_num))+1:end),:);
%% load and test SVM model with noise
CompactMdl = loadCompactModel('mySVM');
oofLabel_n = predict(CompactMdl,X_test);
test_accuracy_for_iter = sum((oofLabel_n == y_test))/length(y_test)*100; % compute accuracy rate
%% plotconfusion
isLabels = unique(output);
nLabels = numel(isLabels);
[n,p] = size(X_test);
% Convert the integer label vector to a class-identifier matrix.
[~,grpOOF] = ismember(oofLabel_n,isLabels);
oofLabelMat = zeros(nLabels,n);
idxLinear = sub2ind([nLabels n],grpOOF,(1:n)');
oofLabelMat(idxLinear) = 1; % Flags the row corresponding to the class
[~,grpY] = ismember(y_test,isLabels);
YMat = zeros(nLabels,n);
idxLinearY = sub2ind([nLabels n],grpY,(1:n)');
YMat(idxLinearY) = 1;
figure;
plotconfusion(YMat,oofLabelMat);
h = gca;
h.XTickLabel = [cellstr(num2str(isLabels)); {''}]; % isLabels is numeric, so convert it to text first
h.YTickLabel = [cellstr(num2str(isLabels)); {''}];
title('Add white Gaussian noise to original data (SNR=10dB) ','FontWeight','bold','FontSize',12);


 Accepted Answer

Don Mathis on 8 Jul 2019

2 votes

If you want your classifier to perform well on data with Gaussian noise added, I suggest training it on your original data with Gaussian noise added. That is, create an "augmented" dataset and train on that.

5 Comments

Thanks for your help!
I tried following your instructions. The classification result is very good on data with noise. But when I use that SVM model on the original data set, the result is still poor (accuracy is below 30%).
[Attached images: untitled.jpg, untitled1.jpg]
Don Mathis on 8 Jul 2019
Edited: Don Mathis on 8 Jul 2019
Ok then, how about making an augmented dataset in which noise is added with random standard deviations ranging from 0 to the maximum noise you would like the model to handle? You could replicate each training example say, 10 times, each time adding noise with standard deviation rand(size)*maxStd, where maxStd is the maximum std you'd like to handle.
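The suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested recipe from the thread: X and Y here are random stand-ins for the real predictors and labels, and the values of nCopies and maxStd are arbitrary choices that you would pick for your own data.

```matlab
rng(0);                        % reproducible stand-in data
X = randn(200, 120);           % hypothetical predictors (200 observations, 120 features)
Y = randi(4, 200, 1);          % hypothetical labels, 4 classes

nCopies = 10;                  % noisy replicas per training example
maxStd  = 0.5;                 % assumed maximum noise std; choose for your data

[n, p] = size(X);
Xaug = [X; zeros(n*nCopies, p)];       % originals first, then room for the replicas
for k = 1:nCopies
    sigma = rand(n, p) * maxStd;       % random std in [0, maxStd] for every entry
    rows  = k*n + (1:n);               % rows that hold the k-th replica
    Xaug(rows, :) = X + sigma .* randn(n, p);
end
Yaug = repmat(Y, nCopies + 1, 1);      % labels repeat in the same row order
```

Xaug and Yaug would then replace X_train and y_train in the training call only; the test data stays untouched.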
Sorry, could you please explain that more clearly? I added white Gaussian noise to the data using the awgn command (X = awgn(X,10,'measured');). I followed this flowchart.
I hope you can help me to solve my problem soon. Thanks!
[Attached image: Untitled111.png]
I meant you could take your original predictor matrix, X, and replace it with something like this for training purposes only:
Xnew = X; % start with the clean predictors
snrs = [5 10 15]; % SNRs (dB) of the noisy copies
for snr = snrs
Xnew = [Xnew; awgn(X, snr)]; % append one noisy copy per SNR
end
Xnew is an augmented dataset that has 4 times as many rows as your original. It appends 3 noisy copies of your predictors to the original, using 3 different snrs.
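Because Xnew has four times as many rows, the label vector has to be replicated to match, and it is probably safest to augment only the training split so that noisy copies of test rows do not leak into training. The sketch below illustrates this under assumptions: X and Y are random stand-ins, the hold-out split uses cvpartition rather than the randperm indexing from the question, and awgn is called with 'measured' (as in the asker's own call) because without it awgn assumes a signal power of 0 dBW. awgn requires Communications Toolbox.

```matlab
rng(1);                               % reproducible stand-in data
X = randn(200, 120);                  % hypothetical predictors
Y = randi(4, 200, 1);                 % hypothetical labels, 4 classes

% Split first, then augment the training part only.
cv  = cvpartition(Y, 'HoldOut', 0.3);
Xtr = X(training(cv), :);
Ytr = Y(training(cv));

snrs = [5 10 15];                     % SNRs (dB) of the noisy copies
Xnew = Xtr;
for snr = snrs
    Xnew = [Xnew; awgn(Xtr, snr, 'measured')]; %#ok<AGROW>
end
Ynew = repmat(Ytr, numel(snrs) + 1, 1);   % one label block per copy, same row order

% Train on the augmented set with a linear SVM template.
t   = templateSVM('KernelFunction', 'linear', 'Standardize', true);
Mdl = fitcecoc(Xnew, Ynew, 'Learners', t, 'Coding', 'onevsall');
```

The resulting model can then be evaluated separately on the clean hold-out rows, X(test(cv), :), and on noisy versions of them.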
It seems that the results are very good. Thank you so much!
I will try it with some other popular classification methods like k-NN, Decision Trees, ANN.
Thank you again, Mr. Don Mathis!
