I am new to MATLAB and can't figure out the problem with this code; it won't execute. It is a naive Bayes classifier that separates setosa (class = 1) from versicolor and virginica combined (class = 0) without using a Gaussian pdf.
load iris.dat
%discretize data
numBins = 5;
dd = zeros(150,5);
for q = 1:size(iris,2)-1
x = iris(:,q);
binEdges = linspace(min(x), max(x)+1 , numBins+1);
[bincount ,whichBin] = histc(x, binEdges);
dd(:,q) = whichBin;
end
dd(:,5)= iris(:,5);
%change class values 2 and 3 to 0 in the discretized data
for w=1:size(dd,1)
if dd(w,5)~= 1
dd(w,5) = 0;
end
end
%divide into train and test
[train_indices,validation_set_indices,test_indices] = dividerand(150,0.5,0,0.5);
traind = dd(train_indices,:);
testd = dd(test_indices,:);
% calculate P(class)
s1 = 0;
s2 = 0;
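for row = 1:size(traind,1)
if traind(row,5) == 1
s1 = s1 + 1; %count of training rows with class = 1
elseif traind(row,5) == 0
s2 = s2 + 1; %count of training rows with class = 0
end %end if
end %end for for all rows
p1 = s1/size(traind,1); % P(class = 1)
p0 = s2/size(traind,1); % P(class = 0)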
%the following cell array will have the vector of unique values for each column/feature
v = cell(1,1);
for c = 1:size(traind,2)-1
v{c}=unique(traind(:,c)); %cell array stores vectors of unique values for each feature
% v{1} = vector of unique values in column 1 and so on.
end
%the following matrix helps compute P(feature and class)
%for every unique value of the feature, it stores a count for when the class is 1
f1 = zeros(4,5); %all sums need to be initialized to zero
%the following matrix helps compute P(feature and not class)
%for every unique value of the feature, it stores a count for when the class is 0
f2 = zeros(4,5); %all sums need to be initialized to zero
col = 1;
u = 1;
while col<=size(traind,2)-1
for row = 1:size(traind,1)
if traind(row,col) ==v{col}( u)
if traind(row,5) == 1
f1(col,u) = f1(col,u) + 1;
elseif traind(row,5) == 0
f2(col,u) = f2(col,u) + 1;
end %end if
end %end if
end %end for for all rows
u = u + 1;
if u>length(v{col})
u=1;
col = col + 1;
end %end of if
end %end of while
%Here, we begin classification for each data point(row)
pfgc = []; %empty vector
%p(feature given class), pfgc(1) = p(feature and class)/p(class)
%it does this for each row because each whole row represents a data point
pfgnc = []; %empty vector
%p(feature given not class), pfgnc(1) = p(feature and not class)/p(not class)
%it does this for each row because each whole row represents a data point
for row = 1:size(traind,1)
for c = 1:size(traind,2)-1
for u = 1: length(v{c})
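if traind(row,c) == v{c}(u)
pfgc(c) = f1(c,u)/s1; %f1 holds counts, so divide by the class-1 count s1
pfgnc(c) = f2(c,u)/s2; %f2 holds counts, so divide by the class-0 count s2
break %value matched, stop scanning the unique values of this feature
end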
end
end
%compute the final p(features given class) as a multiplication of all p(feature given class) values
fpfgc = prod(pfgc); %prod returns a single product; cumprod would return a vector
%compute the final p(features given not class) as a multiplication of all p(feature given not class) values
fpfgnc = prod(pfgnc);
%use bayes theorem to compute p(class given feature).
pcgf = fpfgc * p1;
pncgf = fpfgnc * p0;
%The right hand side has not been normalized (no denominator) because the denominator is the same for both p(class given feature) and p(not class given feature). For the comparison it is sufficient to compare the numerators.
%Finally, determine class of datapoint
if pcgf>pncgf
disp(['The class of datapoint ' num2str(row) ' is ' num2str(1)]);
else
disp(['The class of datapoint ' num2str(row) ' is ' num2str(0)]);
end
end %end of loop running through all rows
%to test run the same program with testd instead of traind
Answers (1)
John D'Errico
on 23 Sep 2016
The crystal ball is so foggy. Don't just tell someone that it won't execute. Tell us what DOES happen when you try. If there is an error message, then show the ENTIRE message.
Would you go to your doctor and say only that something hurts, then refuse to say what hurts?
I'll guess that the very first line of your code fails, since you have given us absolutely no information otherwise. Is it possible that you don't have the file iris.dat?
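One way to check that guess and to capture the complete error text is sketched below. It assumes the data file is named iris.dat, as in the question, and that it should sit in the current folder or on the MATLAB path. The variable names dataFile and err are only illustrative.

```matlab
% Sketch: confirm iris.dat is visible to MATLAB, then load it and
% print the full error report if loading fails.
dataFile = 'iris.dat';   % file name taken from the question

if exist(dataFile, 'file') ~= 2
    % exist(...,'file') returns 2 when the name resolves to a file
    fprintf('%s was not found in %s or on the MATLAB path.\n', dataFile, pwd);
else
    try
        iris = load(dataFile);   % functional form returns the numeric matrix
        fprintf('Loaded %d rows and %d columns.\n', size(iris, 1), size(iris, 2));
    catch err
        % getReport gives the complete message, including the call stack
        disp(getReport(err, 'extended'));
    end
end
```

If the file is found and loads cleanly, the same try/catch pattern can be wrapped around the rest of the script to get the entire message for whichever line fails.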