# Elbow Method (Error Warning: Failed to converge in 100 iterations)

MAT NIZAM UTI on 27 Dec 2021
Answered: Tanmay Das on 29 Dec 2021
function [IDX,C,SUMD,K]=kmeans_opt(X,varargin)
%%% [IDX,C,SUMD,K]=kmeans_opt(X,varargin) returns the output of the k-means
%%% algorithm with the optimal number of clusters, as determined by the ELBOW
%%% method. This function treats NaNs as missing data, and ignores any rows of X that
%%% contain NaNs.
%%%
%%% [IDX]=kmeans_opt(X) returns the cluster membership for each datapoint in
%%% vector X.
%%%
%%% [IDX]=kmeans_opt(X,MAX) returns the cluster membership for each datapoint in
%%% vector X. The Elbow method will be tried from 1 to MAX number of
%%% clusters (default: square root of the number of samples)
%%% [IDX]=kmeans_opt(X,MAX,CUTOFF) returns the cluster membership for each datapoint in
%%% vector X. The Elbow method will be tried from 1 to MAX number of
%%% clusters and will choose the number which explains a fraction CUTOFF of
%%% the variance (default: 0.95)
%%% [IDX]=kmeans_opt(X,MAX,CUTOFF,REPEATS) returns the cluster membership for each datapoint in
%%% vector X. The Elbow method will be tried from 1 to MAX number of
%%% clusters and will choose the number which explains a fraction CUTOFF of
%%% the variance, taking the best of REPEATS runs of k-means (default: 3).
%%% [IDX,C]=kmeans_opt(X,varargin) returns in addition, the location of the
%%% centroids of each cluster.
%%% [IDX,C,SUMD]=kmeans_opt(X,varargin) returns in addition, the sum of
%%% point-to-cluster-centroid distances.
%%% [IDX,C,SUMD,K]=kmeans_opt(X,varargin) returns in addition, the number of
%%% clusters.
%%% sebastien.delandtsheer@uni.lu
%%% sebdelandtsheer@gmail.com
%%% Thomas.sauter@uni.lu
[m,~]=size(X); %getting the number of samples
if nargin>1, ToTest=cell2mat(varargin(1)); else, ToTest=ceil(sqrt(m)); end
if nargin>2, Cutoff=cell2mat(varargin(2)); else, Cutoff=0.95; end
if nargin>3, Repeats=cell2mat(varargin(3)); else, Repeats=3; end
%unit-normalize
MIN=min(X); MAX=max(X);
X=(X-MIN)./(MAX-MIN);
D=zeros(ToTest,1); %initialize the results matrix
for c=1:ToTest %for each candidate number of clusters
[~,~,dist]=kmeans(X,c,'emptyaction','drop'); %compute the sum of intra-cluster distances
tmp=sum(dist); %best so far
for cc=2:Repeats %repeat the algo
[~,~,dist]=kmeans(X,c,'emptyaction','drop');
tmp=min(sum(dist),tmp);
end
D(c,1)=tmp; %collect the best so far in the results vector
end
Var=D(1:end-1)-D(2:end); %calculate %variance explained
PC=cumsum(Var)/(D(1)-D(end));
[r,~]=find(PC>Cutoff); %find the best index
K=1+r(1,1); %get the optimal number of clusters
[IDX,C,SUMD]=kmeans(X,K); %now rerun one last time with the optimal number of clusters
C=C.*(MAX-MIN)+MIN;
end
But I always get this warning:
Warning: Failed to converge in 100 iterations.
> In kmeans/loopBody (line 438)
In internal.stats.parallel.smartForReduce (line 136)
In kmeans (line 316)
In elbow_method (line 46)

Tanmay Das on 29 Dec 2021
Hi,
This is a warning rather than an error. It most likely appears because the 'kmeans' function in MATLAB allows at most 100 iterations by default, and at least one run reached that limit without converging, which can happen when the data to be clustered is relatively large. You can view the parameter information for the kmeans function by typing the following command in the MATLAB Command Window:
help kmeans
Here are some of its functional explanations for your reference:
'Options' - Options for the iterative algorithm used to minimize the fitting criterion, as created by "statset".
Choices of "statset" parameters are:
'Display' - Level of display output. Choices are 'off' (the default), 'iter', and 'final'.
'MaxIter' - Maximum number of iterations allowed. Default is 100.
One possible workaround is to pass these options to the kmeans function: 'Display' controls how much iteration output is printed ('final' prints only the final result), and 'MaxIter' sets the maximum number of iterations allowed. The same 'Options' argument would need to go into every kmeans call in your function, not just one of them. For example:
opts = statset('Display','final','MaxIter',1000);
[idx, ctrs] = kmeans(X,c,'Options',opts);
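
As a rough sketch of how this could look inside the elbow loop itself: the snippet below raises the iteration cap for every candidate number of clusters. It is not the original author's code. The data X is synthetic stand-in data, the values of ToTest and Repeats are picked only for illustration, and the manual repeat loop is swapped for the 'Replicates' option of kmeans, which reruns the algorithm and keeps the best run. It also leaves 'EmptyAction' at its default instead of 'drop'.

```matlab
% Sketch only: requires the Statistics and Machine Learning Toolbox.
rng(0);                                % make the example reproducible
X = [randn(100,2); randn(100,2)+5];    % hypothetical data: two separated blobs
ToTest  = 5;                           % try 1..ToTest clusters (illustrative value)
Repeats = 3;                           % restarts per cluster count (illustrative value)
opts = statset('MaxIter',1000);        % raise the default limit of 100 iterations

D = zeros(ToTest,1);                   % best total within-cluster distance per c
for c = 1:ToTest
    % 'Replicates' repeats k-means from new starting points and returns
    % the solution with the lowest total sum of distances.
    [~,~,sumd] = kmeans(X,c,'Replicates',Repeats,'Options',opts);
    D(c) = sum(sumd);
end
```

If the warning still shows up with a larger 'MaxIter', the limit is probably not the only issue, and it may be worth checking the data itself (for example, constant columns make MAX-MIN zero in the normalization step).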