Computing difference between maximum values in columns of matrix within predefined time interval

Hello, I have a matrix of values of size 8760 x 6 (a small excerpt is shown below), in which 1st column is the year, 2nd month, 3rd is the day, 4th is the hour, 5th is value 1 and 6th is value 2. The nature of graph for val 1 and val 2 is sinusoidal, hence there is alternate peaks and troughs. I need to identify the peaks in val 1 and corresponding peaks in val 2 within +/- 3 hr window and then difference between the two values. For example, in the matrix below, the 1st peak value is 2.08 in val 1 and next value is +1 hr next to it, i.e., 1.75 in val 2. In the next positive cycle, the peak value is 2.02 and corresponding peak value in val2 is 1.82 and so on.
Year Mo Day hr val 1 val 2
2014 1 1 0 1.84 1.67
2014 1 1 1 1.74 1.56
2014 1 1 2 1.26 1.04
2014 1 1 3 0.47 0.23
2014 1 1 4 -0.4 -0.65
2014 1 1 5 -1.11 -1.40
2014 1 1 6 -1.43 -1.78
2014 1 1 7 -1.30 -1.68
2014 1 1 8 -0.76 -1.12
2014 1 1 9 0.078 -0.27
2014 1 1 10 0.89 0.58
2014 1 1 11 1.62 1.29
2014 1 1 12 2.08 1.74
2014 1 1 13 2.06 1.75
2014 1 1 14 1.61 1.33
2014 1 1 15 0.87 0.56
2014 1 1 16 -0.05 -0.37
2014 1 1 17 -0.87 -1.25
2014 1 1 18 -1.47 -1.86
2014 1 1 19 -1.59 -2.00
2014 1 1 20 -1.28 -1.63
2014 1 1 21 -0.56 -0.87
2014 1 1 22 0.327 0.022
2014 1 1 23 1.181 0.85
2014 1 2 0 1.85 1.521
2014 1 2 1 2.020 1.819
2014 1 2 2 1.736 1.637
2014 1 2 3 1.074 1.021
2014 1 2 4 0.142 0.115
2014 1 2 5 -0.81 -0.83
2014 1 2 6 -1.54 -1.58
2014 1 2 7 -1.82 -1.90
2014 1 2 8 -1.59 -1.68
2014 1 2 9 -0.92 -0.98
2014 1 2 10 0.002 -0.058
2014 1 2 11 0.892 0.825
2014 1 2 12 1.693 1.529
2014 1 2 13 2.138 1.914
2014 1 2 14 2.057 1.829
2014 1 2 15 1.547 1.287
2014 1 2 16 0.696 0.4018
Finally, the desired output is:
2014 1 1 2.08 1.76 0.32
2014 1 2 2.02 1.82 0.20
2014 1 2 2.14 1.91 0.22
Any help?

 Accepted Answer

The first step would be to identify the local maxima (and their row index) in val1 and val2 using findpeaks (requires signal processing toolbox) or the FileExchange peakfinder:
[val1max, rowval1] = findpeaks(yourtable.value1);
[val2max, rowval2] = findpeaks(yourtable.value2);
The next thing is to compute the time difference between the rows. Assuming that there's so many peaks that you'd run out of memory you could do it all in one go. But first, I'd convert your ymdh vectors into datetime objects:
dt = datetime([yourtable{:, 1:4}, repmat([0 0], height(yourtable), 1)]);
The time difference in hours is then:
hourdiff = hours(dt(rowval1) - dt(rowval2)') %require R2016b, in earlier versions use bsxfun for the subtraction
Then find where the time difference is within +/-3 hours:
[row, col] = find(abs(hourdiff) <= 3);
row is the index of val1max and rowval1 and col is the corresponding index of val2max and rowval2 where the absolute difference is within 3 hours.
You can build your final table with:
result = table(dt(rowvar1(row)), val1max(row), val2max(col), val2max(col)-val1max(row))

14 Comments

In some cases, size of rowval1 and rowval2 are dismilar, for example, in one case I found size of rowval1 is 318x1, while rowval2 is 319x1, what should be the way to calculate in that case?
The number of peaks in each column does not matter for the code above. There is no assumption that they're the same. It will compare the timing of all the peaks in val1 with the timing of all the peaks in val2 and only keep those within your interval.
but I receive an error message with
hourdiff = hours(dt(rowval1) - dt(rowval2))
Further since, my values are in matrix format not table I had to use length() instead of height. Please note that values are in matrix format not in table.
Don't use length on 2D matrices, use size(yourmatrix, 1).
My code transposes dt(rowval2), this is essential to obtain the time difference between the cartesian product of the two sets:
hourdiff = hours(dt(rowval1) - dt(rowval2)'); % ' is essential
Otherwise, yes you'd get an error if the two sets are not the same length, but more importantly, it won't do what is intended at all.
As per my original comment this requires at least R2016b. In earlier versions:
hourdiff = hours(bsxfun(@minus, dt(rowval1), dt(rowval2)'));
Hello, I tested the code, it's now working but difference is computed only for the first peak & not moving forward to the next peak. With another set of data I am getting error as:
Data inputs must be the same size, or any of them can be a scalar.
Please, show the exact code you're using. Also attach a mat file of the original data.
And which version of matlab?
I am using R2017a
Sl_tide = load 'Sl_tide.mat';
Sl_MSLfre = Sl_tide(:,5);
tide = Sl_tide(:,6);
[val1max,rowval1] = findpeaks(Sl_MSLfre(Sl_MSLfre >= 0));
[val2max,rowval2] = findpeaks(tide(tide >= 0));
dt = datetime([Sl_tide(:,1:4), repmat([0 0], size(Sl_tide,1),1)]);
hourdiff = hours(dt(rowval1) - dt(rowval2)');
[row, col] = find(abs(hourdiff) <= 3);
result = table(dt(rowval1(row)),val1max(row),val2max(col),val1max(row)-val2max(col));
Y = datevec(table2array(result(:,1)));
All = [Y(:,1:4) table2array(result(:,2)) table2array(result(:,3)) table2array(result(:,4))];
Please find the attached code and .mat file
Note: there is no point in creating a table and then deconstructing that table if you don't want to use table. Personally, I prefer tables, just one variable to hold everything but it's your choice.
The problem is that I didn't realise that the new implicit expansion (or the old bsxfun) don't support datetime objects. A bit of a silly shortcoming. Easily fixed (at the expense of memory use)
[rowval2, rowval1] = ndgrid(rowval2, rowval1);
hourdiff = hours(dt(rowval1) - dt(rowval2)); %no transpose this time
and the result, without transitioning through a table:
[row, col] = find(abs(hourdiff) <= 3);
result = [Sl_tide(rowval1(1, col), 1:4), val1max(col), val2max(row), val1max(col)-val2max(row)];
Note: I've changed the order of a few things so that the output is ordered by date.
Note2: don't use all as a variable name. It's already the name of a function.
Many Thanks, its working now. In some cases, within +/- 3hr time difference, I find negative peaks for val2. In that case, if I have to enlarge the window size to +/-6hr, any way I can modify the codes. I think, I have to add a condition for the check, if no +peak value for val2 within +/-3hr then enlarge the window size to +/-6hr?
I think
[row, col] = find(abs(hourdiff) <= 3 | ...
(~any(abs(hourdiff) <= 3) & abs(hourdiff) <= 6));
is what you're after. For each hourdiff column (correspond to val1 peakes), it looks for the +/- 3 hours differences and only if none is found at all looks at the +/- 6 hours differences
Hi ndgrid doesn't work for very large matrices due to memory issues. Anyways to overcome it?
It's a trade off between speed and memory. If you don't have enough memory you have to sacrify speed and implement the calculation as a for loop:
row = [];
col = [];
for c = 1:numel(rowval2)
hourdiff = dt(rowval1) - dt(rowval2(c);
r = find(abs(hourdiff) <= 3 | ...
(~any(abs(hourdiff) <= 3) & abs(hourdiff) <= 6));
row = [row; r(:)];
col = [col; repmat(c, numel(r), 1)];
end
hmm now I am using a 12-core processor, now its working. But good to know about alternatives. Thanks a bunch!

Sign in to comment.

More Answers (0)

Categories

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!