How to apply majority voting for classification ensemble in Matlab?

I have five classifiers: SVM, random forest, naive Bayes, decision tree, and KNN. My MATLAB code is attached. I want to combine the results of these five classifiers on a dataset using majority voting, with all classifiers given the same weight. Because the test set in this example contains 5 samples, each classifier outputs 5 labels (the class labels in this example are 1 or 2). I would be grateful for your opinions.
clear all
close all
clc
load data.mat;
data=data;
[n,m]=size(data);
rows=(1:n);
test_count=floor((1/6)*n);
sum_ens=0;sum_result=0;
test_rows=randsample(rows,test_count);
train_rows=setdiff(rows,test_rows);
test=data(test_rows,:);
train=data(train_rows,:);
xtest=test(:,1:m-1);
ytest=test(:,m);
xtrain=train(:,1:m-1);
ytrain=train(:,m);
%-----------svm------------------
svm=svm1(xtest,xtrain,ytrain);
%-------------random forest---------------
rforest=randomforest(xtest,xtrain,ytrain);
%-------------decision tree---------------
DT=DTree(xtest,xtrain,ytrain);
%---------------bayesian---------------------
NBModel = NaiveBayes.fit(xtrain,ytrain, 'Distribution', 'kernel');
Pred = NBModel.predict(xtest);
dt=Pred;
%--------------KNN----------------
knnModel=fitcknn(xtrain,ytrain,'NumNeighbors',4);
pred=knnModel.predict(xtest);
sk=pred;
How can I apply majority voting directly to these classifier outputs in MATLAB?
Thanks very much

 Accepted Answer

I don't think that there's an existing function that does that for you, so you have to build your own. Here is a suggested method:
  • Assuming you have your five prediction arrays from your five different classifiers, and
  • all prediction arrays have the same size = length(test_rows), and
  • you have 2 classes: 1 & 2, you can do the following:
% First we concatenate all prediction arrays into one big matrix.
% Make sure that all prediction arrays are of the same type; I am assuming here that they
% are type double. I am also assuming that all prediction arrays are column vectors.
Prediction = [svm,rforest,DT,dt,sk];
Final_decision = zeros(length(test_rows),1);
all_results = [1,2]; % possible outcomes
for row = 1:length(test_rows)
    election_array = zeros(1,2); % vote counter, one entry per class
    for col = 1:5 % your five different classifiers
        election_array(Prediction(row,col)) = ...
            election_array(Prediction(row,col)) + 1;
    end
    [~,I] = max(election_array); % class with the most votes
    Final_decision(row) = all_results(I);
end
Hope this helps.
Ahmad
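
To see what the loop above produces, here is a small self-contained sketch with made-up predictions for four test samples. The vectors p1 to p5 are hypothetical placeholders, not outputs of the classifiers in the question.

```matlab
% Hypothetical predictions of five classifiers for four test samples
% (placeholder values; class labels are 1 or 2).
p1 = [1; 2; 2; 1];
p2 = [1; 2; 1; 1];
p3 = [2; 2; 1; 1];
p4 = [1; 1; 2; 2];
p5 = [2; 2; 2; 1];

Prediction = [p1, p2, p3, p4, p5];   % one row per sample, one column per classifier
numSamples = size(Prediction, 1);
all_results = [1, 2];                % possible outcomes
Final_decision = zeros(numSamples, 1);

for row = 1:numSamples
    election_array = zeros(1, numel(all_results));   % vote counter per class
    for col = 1:size(Prediction, 2)
        vote = Prediction(row, col);
        election_array(vote) = election_array(vote) + 1;
    end
    [~, I] = max(election_array);    % index of the class with the most votes
    Final_decision(row) = all_results(I);
end

disp(Final_decision')   % displays 1 2 2 1
```

With five voters and two classes a tie cannot occur, so max always picks a unique winner here.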

7 Comments

Thank you very much, I'm very grateful
If I want to adapt your code to integer arrays, which part should I change?
I used this code on the lung dataset (attached):
clear all
close all
clc
load lung.mat;
data=lung;
[n,m]=size(data);
rows=(1:n);
test_count=floor((1/6)*n);
sum_ens=0;sum_result=0;
test_rows=randsample(rows,test_count);
train_rows=setdiff(rows,test_rows);
test=data(test_rows,:);
train=data(train_rows,:);
xtest=test(:,1:m-1);
ytest=test(:,m);
xtrain=train(:,1:m-1);
ytrain=train(:,m);
%-----------svm------------------
svm=svm1(xtest,xtrain,ytrain);
%-------------random forest---------------
rforest=randomforest(xtest,xtrain,ytrain);
%-------------decision tree---------------
DT=DTree(xtest,xtrain,ytrain);
%---------------bayesian---------------------
NBModel = NaiveBayes.fit(xtrain,ytrain, 'Distribution', 'kernel');
Pred = NBModel.predict(xtest);
dt=Pred;
%--------------KNN----------------
knnModel=fitcknn(xtrain,ytrain,'NumNeighbors',4);
pred=knnModel.predict(xtest);
sk=pred;
% First we concatenate all prediciton arrays into one big matrix.
% Make sure that all prediction arrays are of the same type, I am assumming here that they
% are type double. I am also assuming that all prediction arrays are column vectors.
Prediction = [svm,rforest,DTree,dt,sk];
Final_decision = zeros(length(test_rows),1);
all_results = [1,2]; %possible outcomes
for row = 1:length(test_rows)
election_array = zeros(1,2);
for col = 1:5 %your five different classifiers
election_array(Prediction(row,col)) = ...
election_array(Prediction(row,col)) + 1;
end
[~,I] = max(res_mat);
Final_decision(row) = all_results(I);
end
There is an error:
Not enough input arguments.
Error in DTree (line 2)
DTreeModel=ClassificationTree.fit(xtrain,ytrain);
Error in mathworks (line 36)
Prediction = [svm,rforest,DTree,dt,sk];
How can I remove this error?
Thank you very much
Oops, my mistake: the line should read DT, not DTree. I don't seem to be able to edit the answer, but just replace DTree on line 36 with DT (i.e. the name of your variable).
Ahmad
Thank you
% First we concatenate all prediciton arrays into one big matrix.
% Make sure that all prediction arrays are of the same type, I am assumming here that they
% are type double. I am also assuming that all prediction arrays are column vectors.
Prediction = [svm,rforest,DT,dt,sk];
Final_decision = zeros(length(test_rows),1);
all_results = [1,2]; %possible outcomes
for row = 1:length(test_rows)
election_array = zeros(1,2);
for col = 1:5 %your five different classifiers
election_array(Prediction(row,col)) = ...
election_array(Prediction(row,col)) + 1;
end
[~,I] = max(res_mat);
Final_decision(row) = all_results(I);
end
There are two errors:
Subscript indices must either be real positive integers or logicals.
Error in classify2 (line 170)
election_array(Prediction(row,col)) = election_array(Prediction(row,col)) + 1;
Undefined function or variable 'res_mat'.
Error in classify2 (line 172)
[~,I] = max(res_mat);
I would be very grateful for your advice on how to remove these errors.
Thanks
As for the first error, it is because the predictions are used directly as indices into election_array, so your prediction matrix must be of type double and contain only the positive integers 1 and 2, not boolean values (or labels such as 0).
As for the second error, that was my mistake again.
At the time I wrote this I had just copied the code from one of my workstations.
I fixed the error, so please check my edited answer now. You should use your variable name, which is election_array, not res_mat.
Ahmad
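
If the class labels are not the positive integers 1 and 2 (for example 0/1 or -1/1), one way around the indexing error is to map every label to its position in the sorted list of classes, count votes on those positions, and map back. The following is only a sketch under the assumption of numeric labels; the prediction matrix is made up.

```matlab
% Hypothetical predictions with labels 0 and 1 (placeholder values),
% one row per test sample, one column per classifier.
Prediction = [0 1 1 0 1;
              0 0 1 0 0;
              1 1 0 1 1];

% classes: sorted distinct labels; idx: position of each entry in classes.
[classes, ~, idx] = unique(Prediction(:));
idx = reshape(idx, size(Prediction));   % same shape as Prediction, values 1..K

numSamples = size(Prediction, 1);
Final_decision = zeros(numSamples, 1);
for row = 1:numSamples
    % Count how many classifiers voted for each class index.
    votes = accumarray(idx(row, :)', 1, [numel(classes), 1]);
    [~, I] = max(votes);
    Final_decision(row) = classes(I);   % map the index back to the original label
end
```

Because idx only ever contains values from 1 to the number of classes, it is always a valid subscript, whatever the original label values are.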
Thanks Mr. Ahmad, it is working for me.
Hi Mr. Ahmad, I want to apply majority voting to two classification models. Can you help me?
unzip('Preprocessing.zip');
imds = imageDatastore('Preprocessing', ...
'IncludeSubfolders',true, ...
'LabelSource','foldernames');
%Use countEachLabel to summarize the number of images per category.
tbl1 = countEachLabel(imds)
%Divide the data into training and validation data sets
rng('default') % For reproducibility
[trainingSet, testSet] = splitEachLabel(imds, 0.3, 'randomize');
%Load Pretrained Network1
net1 = alexnet;
% Inspect the first layer
net1.Layers(1)
% Create augmentedImageDatastore from training and test sets to resize
% images in imds to the size required by the network1.
imageSize1 = net1.Layers(1).InputSize;
augmentedTrainingSet1 = augmentedImageDatastore(imageSize1, trainingSet, 'ColorPreprocessing', 'gray2rgb');
augmentedTestSet1 = augmentedImageDatastore(imageSize1, testSet, 'ColorPreprocessing', 'gray2rgb');
featureLayer1 = 'fc7';
trainingFeatures1 = activations(net1, augmentedTrainingSet1, featureLayer1, ...
'MiniBatchSize', 32, 'OutputAs', 'columns');
%Load Pretrained Network2
net2 = resnet50;
% Visualize the first section of the network.
figure
plot(net2)
title('First section of ResNet-50')
set(gca,'YLim',[150 170]);
% Inspect the first layer
net2.Layers(1)
% Create augmentedImageDatastore from training and test sets to resize
% images in imds to the size required by the network2.
imageSize2 = net2.Layers(1).InputSize;
augmentedTrainingSet2 = augmentedImageDatastore(imageSize2, trainingSet, 'ColorPreprocessing', 'gray2rgb');
augmentedTestSet2 = augmentedImageDatastore(imageSize2, testSet, 'ColorPreprocessing', 'gray2rgb');
featureLayer2 = 'fc1000';
trainingFeatures2 = activations(net2, augmentedTrainingSet2, featureLayer2, ...
'MiniBatchSize', 32, 'OutputAs', 'columns');
%% Train A Multiclass SVM Classifier Using CNN1 Features
% Get training labels from the trainingSet
trainingLabels = trainingSet.Labels;
% Train multiclass SVM classifier using a fast linear solver, and set
% 'ObservationsIn' to 'columns' to match the arrangement used for training
% features.
% Note: the posted code passed an undefined variable 'newfeatures' here;
% trainingFeatures1 (the CNN1 features extracted above) is assumed.
classifier1 = fitcecoc(trainingFeatures1, trainingLabels, ...
'Learners', 'Linear', 'Coding', 'onevsall', 'ObservationsIn', 'columns');
% Extract test features using the CNN1
testFeatures1 = activations(net1, augmentedTestSet1, featureLayer1, ...
'MiniBatchSize', 32, 'OutputAs', 'columns');
% Pass CNN image features to trained classifier
predictedLabels1 = predict(classifier1, testFeatures1, 'ObservationsIn', 'columns');
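
With only two models, plain majority voting cannot settle the samples where the models disagree, so some tie-breaking rule is needed. One possible rule, sketched below, is to keep the label of the model that is more confident for that sample. All values are made up, numeric labels stand in for the image categories, and the confidence vectors conf1 and conf2 are hypothetical; the sketch assumes both models report confidences on a comparable scale.

```matlab
% Hypothetical labels and per-sample confidences from two models
% (placeholder values, not outputs of the networks above).
labels1 = [1; 2; 1; 2];
labels2 = [1; 1; 2; 2];
conf1   = [0.9; 0.6; 0.8; 0.7];
conf2   = [0.8; 0.9; 0.5; 0.6];

% Start from model 1 and switch to model 2 only where the models
% disagree and model 2 is the more confident one.
combined  = labels1;
useSecond = (labels1 ~= labels2) & (conf2 > conf1);
combined(useSecond) = labels2(useSecond);

disp(combined')   % displays 1 1 1 2
```

Where the two models agree, the combined label is simply the shared prediction.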


More Answers (3)

I think you can just put the outputs of the five models together as a matrix and then use the mode function.
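
A minimal sketch of that idea, assuming numeric labels and one column per classifier (the prediction matrix is made up):

```matlab
% Hypothetical predictions: one row per test sample, one column per classifier.
Prediction = [1 1 2 1 2;
              2 2 2 1 2;
              2 1 1 2 2];

% mode along dimension 2 returns the most frequent value in each row.
Final_decision = mode(Prediction, 2);

disp(Final_decision')   % displays 1 2 2
```

If several values are tied for most frequent, mode returns the smallest of them, which matters when the number of classifiers is even or there are more than two classes.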
How can three classifiers such as fitcnb, fitcecoc, and fitensemble be combined to get averaged results?
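
One common reading of "averaged results" is soft voting: average the per-class scores of the models and pick the class with the highest mean. The sketch below assumes each model provides a score matrix with one column per class, in the same class order and on a comparable scale (for example posterior probabilities). The matrices are made up; raw scores from different model types are not necessarily comparable, so this may need adapting.

```matlab
% Hypothetical class-score matrices from three models
% (rows: test samples, columns: classes 1 and 2; placeholder values).
score1 = [0.9 0.1; 0.4 0.6; 0.3 0.7];
score2 = [0.6 0.4; 0.7 0.3; 0.2 0.8];
score3 = [0.8 0.2; 0.3 0.7; 0.6 0.4];
classes = [1; 2];

avgScore = (score1 + score2 + score3) / 3;   % element-wise mean of the scores
[~, I] = max(avgScore, [], 2);               % best class index for each sample
Final_decision = classes(I);

disp(Final_decision')   % displays 1 2 2
```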
