How to combine MATLAB's built-in functions dtw and pdist?

Hi, I'm trying to perform hierarchical clustering on my data. I've tried several distance metrics, but now I would like to use the built-in function for dynamic time warping (Signal Processing Toolbox) by passing the function handle @dtw to the function pdist. The following problem occurred:
Error using pdist (line 391)
Error evaluating distance function 'dtw'.
Caused by:
Error using dtw (line 87)
The number of rows between X and Y must be equal when X and Y are matrices
Here is the code I'm using:
D = pdist(data,@dtw); %data is a 1184x38 double matrix, where 1184 is the number of time-series
Z = linkage(D,'ward');
res = cluster(Z, 'maxclust', numClusters); %e.g. numClusters = 5
Many thanks in advance!

Answers (1)

Greg Dionne on 9 Aug 2018
You'll want to take the output of DTW and put it into a form that PDIST can recognize.
This should get you started:
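function d = dtwdist(Xi, Xj, varargin)
% pdist calls this with Xi as one row (1-by-n) and Xj as a block of rows
% (m-by-n); it expects an m-by-1 vector of distances back.
m = size(Xj,1);
% preallocate
d = zeros(m,1);
for j=1:m
    % DTW distance between Xi and the j-th row of Xj
    d(j) = dtw(Xi, Xj(j,:), varargin{:});
end
end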
Use it like:
X = randn(118,38);
maxsamp = 10;
d = pdist(X,@(Xi,Xj) dtwdist(Xi,Xj,maxsamp,'squared'));
imagesc(squareform(d));
colorbar;
title('Distance Matrix')
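To tie this back to the clustering code in the question, here is a minimal end-to-end sketch. It is only an illustration under a few assumptions: the data is random toy data, the inline wrapper dtwrows is a hypothetical stand-in for dtwdist so the snippet runs without a separate file, and 'average' linkage is swapped in for 'ward' because, as far as I know, the linkage documentation describes 'ward' as intended for Euclidean distances, which DTW distances are not.

```matlab
% Toy data standing in for the real matrix: 20 series (rows), 38 samples each.
rng(1);
data = randn(20,38);
numClusters = 5;

% Hypothetical inline wrapper in the form pdist expects:
% Xi is 1-by-n, Xj is m-by-n, the result is an m-by-1 column of distances.
dtwrows = @(Xi,Xj) arrayfun(@(j) dtw(Xi, Xj(j,:)), (1:size(Xj,1))');

D = pdist(data, dtwrows);                  % condensed DTW distance vector
Z = linkage(D, 'average');                 % assumption: 'average' instead of 'ward'
res = cluster(Z, 'maxclust', numClusters); % one cluster index per series
```

With 1184 series, pdist has to evaluate roughly 700,000 DTW distances, so it may be worth trying a subset of the rows first.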
  3 Comments
Amila on 20 Mar 2023
Dear Greg Dionne, this was still useful to me.
Thank you for helping.
I'm trying to run the same kind of test. The problem is that my time series (columns) are not the same length. I aligned them in time and built a matrix, something like X = randn(118,38), but because of the uneven lengths (in time), some columns have NaN at the beginning and at the end.
Could you kindly help me?
NaN NaN NaN NaN 0.7989 -0.6898 ...
NaN NaN -3.0292 -0.6045 0.1202 -0.6667 ...
NaN NaN -0.4570 0.1034 0.5712 0.8641 ...
NaN NaN 1.2424 0.5632 0.4128 0.1134 ...
NaN 1.3790 -1.0667 0.1136 -0.9870 0.3984 ...
-1.3077 -1.0582 0.9337 -0.9047 0.7596 0.8840 ...
-0.4336 -0.4686 0.3503 -0.4677 -0.6572 0.1803
0.3426 -0.2725 -0.0290 -0.1249 -0.6039 0.5509
... ... ... ... ... ..
-0.8396 0.4434 0.3275 -0.7236 -1.6387 0.1222 ...
1.3546 0.3919 0.6647 -0.5933 -0.7601 1.0470 ...
NaN NaN 0.0852 0.4013 -0.8188 -0.2269 ...
NaN NaN NaN NaN 0.5197 -0.1625 ...
NaN NaN NaN NaN NaN 0.6901 ...
NaN NaN NaN NaN NaN 0.5558 ...
Greg Dionne on 20 Mar 2023
Hi Amila,
DTW will not work in the presence of NaN values.
Maybe if you give a little back-story on what each column represents? If all you need is to compute a distance metric between each column (without NaN), maybe you could filter them out before sending them to DTW?
If you have a vector, v, you can remove NaN values via:
v = v(~isnan(v));
For matrices, it's probably easier to operate on them one column/row at a time before sending to DTW.
row = M(j, :);
row = row(~isnan(row));
So (assuming I understood your question correctly) something like:
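function d = dtwdist(Xi, Xj, varargin)
m = size(Xj,1);
% preallocate
d = zeros(m,1);
% strip NaN from the reference row once, outside the loop
x = Xi(~isnan(Xi));
for j=1:m
    % strip NaN from the j-th row before calling dtw
    y = Xj(j,:);
    y = y(~isnan(y));
    d(j) = dtw(x, y, varargin{:});
end
end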
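If the time series really are stored as columns, as in the matrix you posted, another option is to skip pdist and loop over the column pairs directly. The following is a rough sketch under assumptions: the matrix M is made-up toy data, and every NaN in a column is dropped, so it only makes sense if the NaN values are padding at the ends rather than gaps inside a series.

```matlab
% Hypothetical example: 4 series stored as columns, NaN-padded at the ends.
rng(2);
M = randn(30,4);
M(1:3,1) = NaN;  M(28:30,2) = NaN;  M([1 30],3) = NaN;

nSeries = size(M,2);
D = zeros(nSeries);                       % full symmetric distance matrix
for a = 1:nSeries-1
    x = M(~isnan(M(:,a)), a).';           % series a as a row, NaN removed
    for b = a+1:nSeries
        y = M(~isnan(M(:,b)), b).';       % series b as a row, NaN removed
        D(a,b) = dtw(x, y);
        D(b,a) = D(a,b);
    end
end
d = squareform(D);                        % condensed vector, same layout as pdist output
```

The vector d can then be passed to linkage in the same way as the output of pdist.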
Hope that helps.
