cophenet

Cophenetic correlation coefficient

Syntax

c = cophenet(Z,Y)
[c,d] = cophenet(Z,Y)

Description

c = cophenet(Z,Y) computes the cophenetic correlation coefficient for the hierarchical cluster tree represented by Z. Z is the output of the linkage function. Y contains the distances or dissimilarities used to construct Z, as output by the pdist function. For m observations, Z is a matrix of size (m–1)-by-3, with distance information in the third column, and Y is a vector of length m*(m–1)/2.

[c,d] = cophenet(Z,Y) returns the cophenetic distances d in the same lower triangular distance vector format as Y.

The cophenetic correlation for a cluster tree is defined as the linear correlation coefficient between the cophenetic distances obtained from the tree, and the original distances (or dissimilarities) used to construct the tree. Thus, it is a measure of how faithfully the tree represents the dissimilarities among observations.

The cophenetic distance between two observations is represented in a dendrogram by the height of the link at which those two observations are first joined. That height is the distance between the two subclusters that are merged by that link.
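As a small illustration of this definition, the following sketch uses three made-up points on a line (the data and the choice of 'single' linkage are assumptions for the example, not part of the original page). Observations 1 and 2 are joined first at height 1, and observation 3 joins that cluster at height 4, so the cophenetic distance from observation 3 to either of the others is 4, even though the original distances are 5 and 4.

```matlab
% Three hypothetical observations on a line (illustrative data only)
X = [0; 1; 5];

% Pairwise distances in the order (1,2), (1,3), (2,3): [1 5 4]
Y = pdist(X);

% Single linkage: 1 and 2 merge at height 1, then 3 joins at height 4
Z = linkage(Y,'single');

% d holds the cophenetic distances in the same order as Y: [1 4 4]
[c,d] = cophenet(Z,Y);

% View the cophenetic distances as a square matrix for easier reading
D = squareform(d)
```

The pair (1,3) has original distance 5 but cophenetic distance 4, because those two observations are first joined by the link of height 4.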

The output value, c, is the cophenetic correlation coefficient. The magnitude of this value should be very close to 1 for a high-quality solution. This measure can be used to compare alternative cluster solutions obtained using different algorithms.
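One way such a comparison might look is sketched below. The data, the random seed, and the particular list of linkage methods are assumptions chosen for illustration; the variable name linkageMethods is likewise hypothetical.

```matlab
% Illustrative data: two loosely separated groups (assumed for this sketch)
rng(1);                                  % fix the seed so runs are repeatable
X = [randn(10,2); randn(10,2)+4];
Y = pdist(X);

% Candidate linkage methods to compare
linkageMethods = {'single','complete','average'};
c = zeros(size(linkageMethods));

for k = 1:numel(linkageMethods)
    Z = linkage(Y,linkageMethods{k});    % build the tree with this method
    c(k) = cophenet(Z,Y);                % cophenetic correlation for this tree
    fprintf('%-8s  c = %.4f\n',linkageMethods{k},c(k));
end
```

Under this criterion, the method whose coefficient is closest to 1 yields the tree that most faithfully preserves the original dissimilarities.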

The cophenetic correlation between Z(:,3) and Y is defined as

$$c = \frac{\sum_{i<j} (Y_{ij} - y)(Z_{ij} - z)}{\sqrt{\sum_{i<j} (Y_{ij} - y)^2 \, \sum_{i<j} (Z_{ij} - z)^2}}$$

where:

  • Yij is the distance between objects i and j in Y.

  • Zij is the cophenetic distance between objects i and j, from Z(:,3).

  • y and z are the averages of the distances Yij and of the cophenetic distances Zij, respectively.
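The definition can be checked numerically with a short sketch like the one below, which evaluates the linear correlation by hand from the cophenetic distances returned in the second output and compares it with c. The data, the seed, and the name cManual are assumptions made for this example.

```matlab
% Illustrative data (assumed for this sketch)
rng(0);                                  % fix the seed so runs are repeatable
X = rand(8,2);
Y = pdist(X);
Z = linkage(Y,'average');

% c is the coefficient; d holds the cophenetic distances, ordered like Y
[c,d] = cophenet(Z,Y);

% Evaluate the correlation formula directly
y = mean(Y);                             % average of the original distances
z = mean(d);                             % average of the cophenetic distances
cManual = sum((Y - y).*(d - z)) / sqrt(sum((Y - y).^2) * sum((d - z).^2));

% The difference is expected to be zero up to round-off
abs(c - cManual)
```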

Examples

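% Generate three groups of 10 random points in 3-D, offset by 0, 1, and 2.
% The data are random, so the value of r varies from run to run.
X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
Y = pdist(X);               % pairwise distances between the 30 points
Z = linkage(Y,'average');   % hierarchical cluster tree, average linkage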

% Compute Spearman's rank correlation between the
% dissimilarities and the cophenetic distances
[c,D] = cophenet(Z,Y);
r = corr(Y',D','type','spearman')
r =
   0.8279 

Version History

Introduced before R2006a