Find closest value in array

I have two vectors (which are time stamps), like:
V                   N
1375471092848936    1375473384440853
1375473388165900    1375471277856598
1375471320476780    1375473388165900
1375473388947681    1375471322465961
1375473392527002    1375471335206288
..................  ..................
My goal is to find the closest time in N with respect to V (i.e. find the time in N which is nearly equal to V). My frame is W = 1e4; furthermore, V should lie between N-W and N+W. So how do I get the closest time through MATLAB? Any help would be appreciated. Thanks

 Accepted Answer

Joe S on 14 Nov 2024
Edited: MathWorks Support Team on 14 Nov 2024

18 votes

To compute the closest value in a vector “N” for each element of “V”, try the following code with example vectors “N” and “V”:
V = randi(10,[5 1])
N = randi(10,[5 1])
A = repmat(N,[1 length(V)])
[minValue,closestIndex] = min(abs(A-V'))
closestValue = N(closestIndex)
Note that if there is a tie for the minimum value in each column, MATLAB chooses the first element in the column.

7 Comments

For MATLAB R2016a and earlier (releases without implicit expansion, which arrived in R2016b), use:
[minValue,closestIndex] = min(abs(bsxfun(@minus,A, V')))
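On those older releases the repmat step can arguably be skipped too, since bsxfun performs the expansion itself. A minimal sketch with made-up example data, assuming N and V are real column vectors:

```matlab
% Example data (arbitrary values chosen for illustration)
N = [3; 8; 5; 1; 9];
V = [4; 10; 2; 6; 7];

% bsxfun expands the column N against the row V.' into a
% numel(N)-by-numel(V) matrix of differences, so repmat is not needed.
% The explicit dimension argument keeps the result per-column.
[minValue, closestIndex] = min(abs(bsxfun(@minus, N, V.')), [], 1);
closestValue = N(closestIndex);
```

Ties (for example V = 4 sits equally far from 3 and 5) resolve to the first matching element of N, as noted in the answer.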
At least as of R2020a, it seems as though
[minValue, closestIndex] = min(abs(N - V.'))
closestValue = N(closestIndex)
produces the same result, is more efficient, and uses less RAM than:
A = repmat(N,[1 length(V)])
[minValue,closestIndex] = min(abs(A-V.'))
closestValue = N(closestIndex)
So that would reduce the script to:
V = randi(10,[5 1])
N = randi(10,[5 1])
[minValue, closestIndex] = min(abs(N - V.'))
closestValue = N(closestIndex)
The biggest issue with repmat is that when the vectors become very large, out of memory errors are more likely to occur - that's what led me to try and find a solution to this problem that didn't use repmat.
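For very large vectors, one way to avoid building any numel(N)-by-numel(V) matrix at all is to sort N once and look up each element of V by nearest-neighbour interpolation. This is only a sketch, not something proposed in this thread: it assumes N has at least two distinct values, and it does not pin down which neighbour interp1 picks when a query lies exactly halfway between two values of N. The data are the small vectors used in another answer on this page.

```matlab
N = [1990 1998 2001 2004 2001];
V = [2000 2011 2010 2001 1998];

% unique sorts N and drops duplicates; ia maps each sorted value
% back to the position of one of its occurrences in the original N.
[Ns, ia] = unique(N);

% Interpolating the positions 1:numel(Ns) with 'nearest' returns, for
% each query in V, the position of the nearest sorted value. 'extrap'
% lets queries outside the range of N snap to the nearest end value.
posInSorted = interp1(Ns, 1:numel(Ns), V, 'nearest', 'extrap');

closestValue = Ns(posInSorted);
closestIndex = reshape(ia(posInSorted), size(V));
```

Memory use here grows with numel(N) + numel(V) rather than with their product.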
To make this solution consistent when the length of N is 1, a suggestion would be to change
[minValue,closestIndex] = min(abs(A-V'))
closestValue = N(closestIndex)
to
[minValue,closestIndex] = min(abs(A-V'),[],1)
closestValue = N(closestIndex')
Otherwise the min will return a scalar when the length of N is 1. Note the transpose on closestIndex also, otherwise you won't get a column vector when the length of N is 1.
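The scalar-N edge case can be reproduced with a few lines. This sketch uses implicit expansion (R2016b or later) instead of repmat, and the values are made up for illustration:

```matlab
N = 7;            % a single candidate value
V = [1; 5; 9];    % three query values

d = abs(N - V.');                    % 1-by-3 row, because N is scalar

[~, idxDefault] = min(d);            % min along the row: one scalar index
[~, idxDim1]    = min(d, [], 1);     % min down each column: 1-by-3 indices
closestValue    = N(idxDim1.');      % 3-by-1 column of sevens
```

Without the dimension argument, idxDefault is a position within the row d, not an index into N, so N(idxDefault) would be out of range here.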
Great answer. I would only use dot-apostrophe (.') instead of only apostrophe to make sure you are taking the non-conjugate transpose. Of course, this is only relevant if you are working with complex numbers.
This just saved my night!
For anybody just searching for the index, a faster variant:
[~,closestIndex] = min(abs(N-V));
Great variant!
@David Very helpful, thanks!


More Answers (5)

Andrew Reibold on 25 Aug 2014
Edited: Andrew Reibold on 25 Aug 2014
This finds the value in N which is closest to the V value I am calling.
N = [1990 1998 2001 2004 2001]
V = [2000 2011 2010 2001 1998]
[c index] = min(abs(N-V(1)))
In this case I'm looking for the closest value to 'V(1)', which is 2000. It should return the 3rd or 5th value of N, which is 2001.
Note: 'index' is the index of the closest value. If two are the same, like in this example with two different '2001's, it will return the index of the first one.
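If every tied position is wanted rather than just the first, one possible sketch (not part of the original answer) is to compare the distances against their minimum:

```matlab
N = [1990 1998 2001 2004 2001];
V = [2000 2011 2010 2001 1998];

d = abs(N - V(1));               % distances from every element of N to V(1)
tiedIndex = find(d == min(d));   % all positions sharing the smallest distance
```

For this data tiedIndex contains both 3 and 5, the two positions holding 2001.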

4 Comments

So simple but yet so effective and elegant! I am a bit ashamed that I did not figure this out by my self.
Caution/Note: This solution only compares a row to the very same row in the other vector. My solution compares all rows to all other rows. So this solution might come up with, say, row 40 as the closest distance, but my solution might come up with a closer distance between row 34 of N and row 53 of V.
If you have the Statistics and Machine Learning Toolbox, you can also compute the distance between every element and every element of the other array using the function pdist2().
So it really depends if you want the closest distance between corresponding rows (this solution), or if you want the overall closest distance no matter what rows they may occur in (my solution).
If N is just a decimal number and it is to be searched for in a matrix V (containing decimal numbers), how would the code change?
reetu, if N is just a single number then you can do this
[minDistance, indexOfMin] = min(abs(V-N));


How about this:
clc;
% Sample data
numberOfRows = 5;
V = rand(numberOfRows, 1)
N = rand(numberOfRows, 1)
% Find min distance
minDistance = inf;
for ni = 1 : numberOfRows
    for vi = 1 : numberOfRows
        distances(vi, ni) = abs(N(ni) - V(vi));
        if distances(vi, ni) < minDistance
            minNRow = ni;
            minVRow = vi;
            minDistance = distances(vi, ni);
        end
    end
end
% Report to command window:
distances
fprintf('Closest distance is %f which occurs between row %d of N and row %d of V\n',...
    minDistance, minNRow, minVRow);
In the command window:
V =
0.5309
0.6544
0.4076
0.8200
0.7184
N =
0.9686
0.5313
0.3251
0.1056
0.6110
distances =
0.4378 0.0005 0.2057 0.4252 0.0801
0.3142 0.1231 0.3293 0.5488 0.0435
0.5610 0.1237 0.0825 0.3020 0.2033
0.1487 0.2886 0.4948 0.7144 0.2090
0.2503 0.1870 0.3932 0.6127 0.1074
Closest distance is 0.000470 which occurs between row 2 of N and row 1 of V

3 Comments

@Image Analyst - is there a shortcut method for it, or a way of getting rid of the for loop and including all values between 0 and 0.2?
You can try this:
% Sample data
numberOfRows = 5;
V = rand(numberOfRows, 1)
N = rand(numberOfRows, 1)
% Find min distance
distances = pdist2(V, N)
[minDistance, index] = min(distances(:))
[minVRow, minNRow] = ind2sub(size(distances), index)
fprintf('The closest distance is %f which occurs between\nrow %d of V (%f) and\nrow %d of N (%f)\n',...
minDistance, minVRow, V(minVRow), minNRow, N(minNRow));
% Double-check / Prove it
V(minVRow) - N(minNRow)
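pdist2 needs the Statistics and Machine Learning Toolbox. If that is not available, the same all-pairs search can be sketched with implicit expansion (R2016b or later). The data below are the rounded values printed in the command window output of the answer, so the reported distance differs slightly from the 0.000470 shown there:

```matlab
V = [0.5309; 0.6544; 0.4076; 0.8200; 0.7184];
N = [0.9686; 0.5313; 0.3251; 0.1056; 0.6110];

% distances(vi, ni) = |V(vi) - N(ni)|, built without any toolbox function
distances = abs(V - N.');

% Smallest entry of the whole matrix and its row/column position
[minDistance, index] = min(distances(:));
[minVRow, minNRow] = ind2sub(size(distances), index);
fprintf('Closest distance is %f between row %d of V and row %d of N\n', ...
    minDistance, minVRow, minNRow);
```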
What's wrong with a for loop? And what is ni and vi?


To be honest, the easiest way is to use knnsearch. It works well in one dimension, as you have here, and it should be quite efficient.
V = [1375471092848936; 1375473388165900; 1375471320476780; 1375473388947681; 1375473392527002];
N = [1375473384440853; 1375471277856598; 1375473388165900; 1375471322465961; 1375471335206288];
help knnsearch
KNNSEARCH Find K nearest neighbors. IDX = KNNSEARCH(X,Y) finds the nearest neighbor in X for each point in Y. X is an MX-by-N matrix and Y is an MY-by-N matrix. Rows of X and Y correspond to observations and columns correspond to variables. IDX is a column vector with MY rows. Each row in IDX contains the index of the nearest neighbor in X for the corresponding row in Y. [IDX, D] = KNNSEARCH(X,Y) returns a MY-by-1 vector D containing the distances between each row of Y and its closest point in X. [IDX, D]= KNNSEARCH(X,Y,'NAME1',VALUE1,...,'NAMEN',VALUEN) specifies optional argument name/value pairs: Name Value 'K' A positive integer, K, specifying the number of nearest neighbors in X to find for each point in Y. Default is 1. IDX and D are MY-by-K matrices. D sorts the distances in each row in ascending order. Each row in IDX contains the indices of K closest neighbors in X corresponding to the K smallest distances in D. 'NSMethod' Nearest neighbors search method. Value is either: 'kdtree' - Creates and uses a kd-tree to find nearest neighbors. 'kdtree' is only valid when the distance metric is one of the following metrics: - 'euclidean' - 'cityblock' - 'minkowski' - 'chebychev' 'exhaustive' - Uses the exhaustive search algorithm. The distance values from all the points in X to each point in Y are computed to find nearest neighbors. Default is 'kdtree' when the number of columns of X is not greater than 10, X is not sparse, and the distance metric is one of the above 4 metrics; otherwise, default is 'exhaustive'. 'IncludeTies' A logical value indicating whether KNNSEARCH will include all the neighbors whose distance values are equal to the Kth smallest distance. Default is false. If the value is true, KNNSEARCH includes all these neighbors. In this case, IDX and D are MY-by-1 cell arrays. Each row in IDX and D contains a vector with at least K numeric numbers. D sorts the distances in each vector in ascending order. 
Each row in IDX contains the indices of the closest neighbors corresponding to these smallest distances in D. 'Distance' A string or a function handle specifying the distance metric. The value can be one of the following: 'euclidean' - Euclidean distance (default). 'seuclidean' - Standardized Euclidean distance. Each coordinate difference between X and a query point is scaled by dividing by a scale value S. The default value of S is the standard deviation computed from X, S=NANSTD(X). To specify another value for S, use the 'Scale' argument. 'fasteuclidean' - Euclidean distance computed by using an alternative algorithm that saves time. This faster algorithm can, in some cases, reduce accuracy. 'fastseuclidean' - Standardized Euclidean distance computed by using an alternative algorithm that saves time. This faster algorithm can, in some cases, reduce accuracy. 'cityblock' - City Block distance. 'chebychev' - Chebychev distance (maximum coordinate difference). 'minkowski' - Minkowski distance. The default exponent is 2. To specify a different exponent, use the 'P' argument. 'mahalanobis' - Mahalanobis distance, computed using a positive definite covariance matrix C. The default value of C is the sample covariance matrix of X, as computed by NANCOV(X). To specify another value for C, use the 'Cov' argument. 'cosine' - One minus the cosine of the included angle between observations (treated as vectors). 'correlation' - One minus the sample linear correlation between observations (treated as sequences of values). 'spearman' - One minus the sample Spearman's rank correlation between observations (treated as sequences of values). 'hamming' - Hamming distance, percentage of coordinates that differ. 'jaccard' - One minus the Jaccard coefficient, the percentage of nonzero coordinates that differ. function - A distance function specified using @ (for example @DISTFUN). 
A distance function must be of the form function D2 = DISTFUN(ZI, ZJ), taking as arguments a 1-by-N vector ZI containing a single row of X or Y, an M2-by-N matrix ZJ containing multiple rows of X or Y, and returning an M2-by-1 vector of distances D2, whose Jth element is the distance between the observations ZI and ZJ(J,:). 'P' A positive scalar indicating the exponent of Minkowski distance. This argument is only valid when 'Distance' is 'minkowski'. Default is 2. 'Cov' A positive definite matrix indicating the covariance matrix when computing the Mahalanobis distance. This argument is only valid when 'Distance' is 'mahalanobis'. Default is NANCOV(X). 'Scale' A vector S containing non-negative values, with length equal to the number of columns in X. Each coordinate difference between X and a query point is scaled by the corresponding element of S. This argument is only valid when 'Distance' is 'seuclidean'. Default is NANSTD(X). 'BucketSize' The maximum number of data points in the leaf node of the kd-tree (default is 50). This argument is only meaningful when kd-tree is used for finding nearest neighbors. 'SortIndices' A flag to indicate if output distances and the corresponding indices should be sorted in the order of distances ranging from the smallest to the largest distance. Default is true. 'CacheSize' A positive scalar or 'maximal'. The default is 1e3. This argument is only meaningful when the alternative algorithm of computing Euclidean distance is used which requires an intermediate matrix (when 'NSMethod' is 'exhaustive', and 'Distance' is one of {'fasteuclidean','fastseuclidean'}). If numeric, this argument specifies the cache size in megabytes (MB) to allocate for an intermediate matrix. If 'maximal', knnsearch attempts to allocate enough memory for an entire intermediate matrix whose size is MX-by-MY (MX is the number of rows of the input data X, and MY is the number of rows of the input data Y). 
'CacheSize' does not have to be large enough for an entire intermediate matrix, but must be at least large enough to hold an MX-by-1 vector. Otherwise, the regular algorithm of computing Euclidean distance will be used instead. If the specified cache size exceeds the available memory, MATLAB issues an out-of-memory error. Example: % Find 2 nearest neighbors in X and the corresponding values to each % point in Y using the distance metric 'cityblock' X = randn(100,5); Y = randn(25, 5); [idx, dist] = knnsearch(X,Y,'dist','cityblock','k',2); See also CREATENS, ExhaustiveSearcher, KDTreeSearcher, RANGESEARCH. Documentation for knnsearch doc knnsearch Other uses of knnsearch ExhaustiveSearcher/knnsearch KDTreeSearcher/knnsearch gpuArray/knnsearch tall/knnsearch hnswSearcher/knnsearch textanalytics/knnsearch
ids = knnsearch(N,V)
ids = 5x1
    2
    3
    4
    3
    3
There is no need to look at differences, compute absolute values, etc. Just use the tool that is designed to solve your problem directly.
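The question also asks that a match only count when it lies within W = 1e4 of the query. One way to add that on top of knnsearch, sketched here as an assumption rather than as part of the answer, is to request its second output (the distance to each nearest neighbour) and mask on it. knnsearch requires the Statistics and Machine Learning Toolbox.

```matlab
V = [1375471092848936; 1375473388165900; 1375471320476780; 1375473388947681; 1375473392527002];
N = [1375473384440853; 1375471277856598; 1375473388165900; 1375471322465961; 1375471335206288];
W = 1e4;   % window from the question

% ids(k) is the index in N of the nearest neighbour of V(k),
% d(k) is the distance between V(k) and N(ids(k)).
[ids, d] = knnsearch(N, V);

% Keep only the matches that fall inside the window; the rest stay NaN
withinWindow = d <= W;
closestTimes = NaN(size(V));
closestTimes(withinWindow) = N(ids(withinWindow));
```

With this sample data only the second element of V has a neighbour inside the window (it matches N(3) exactly); every other nearest neighbour is hundreds of thousands of units away or more.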
As of release R2021b, you don't need to use abs on the input to the min function. Use the 'ComparisonMethod' option to tell MATLAB to take the minimum value by absolute value.
V = randi(100,[5 1])
V = 5×1
    32
    16
    70
    49
    22
N = randi(100,[5 1])
N = 5×1
    39
    75
    70
    10
    59
[minDifference,closestIndex] = min(N-V.', [], ComparisonMethod = "abs")
minDifference = 1×5
7 -6 0 10 -12
closestIndex = 1×5
1 4 3 5 4
closestValue = N(closestIndex)
closestValue = 5×1
    39
    10
    70
    59
    10

4 Comments

And if you're trying to perform matching, where you're trying to match elements of V with elements of N on a one-to-one basis (no two elements of V can match with the same element of N and vice versa) consider the matchpairs function.
From the sounds of it I was hoping matchpairs would find a matching pair in an array of pairs. That would be convenient since ismember is so tricky that I have to consult the documentation every time I use it. But I guess matchpairs doesn't do that or else I'm not using it correctly.
% Define variables to search and search for.
allValues = [1,20; 2,30; 3, 40; 4, 50; 5, 60]
allValues = 5×2
     1    20
     2    30
     3    40
     4    50
     5    60
sought = [3, 40];
% ismember() works, though it's always tricky to figure out.
[a, b] = ismember(sought, allValues) % Find the row where it occurs
a = 1x2 logical array
1 1
b = 1×2
3 8
rowFound = allValues(b(1), :) % Extract the row to make sure it matches sought
rowFound = 1×2
3 40
% Matchpairs() doesn't work like this (like I'd hope for)
rowFound = matchpairs(sought, allValues, 'min')
Error using matchpairs (line 61)
costUnmatched must be a real scalar of data type double or single.
For the case you described you probably want to use the 'rows' option for ismember. [This assumes the row you're searching for is in the matrix you're searching, not 'close to' a row in the matrix. In that case you'd want to use ismembertol with ByRows=true.]
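For the 'close to' case, a small sketch of the ismembertol variant might look like the following. The query row and the tolerance of 1e-3 are hypothetical values picked for illustration:

```matlab
allValues = [1,20; 2,30; 3,40; 4,50; 5,60];
sought = [3.0001, 40];   % hypothetical query: close to row 3, but not equal

% Exact row matching fails because 3.0001 is not equal to 3
[~, exactRow] = ismember(sought, allValues, 'rows');

% ismembertol scales the tolerance by the magnitude of the data by default,
% so 1e-3 is comfortably loose enough for a 1e-4 discrepancy here.
[~, closeRow] = ismembertol(sought, allValues, 1e-3, 'ByRows', true);
```

exactRow comes back as 0 (no match), while closeRow identifies row 3.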
% Define variables to search and search for.
allValues = [1,20; 2,30; 3, 40; 4, 50; 5, 60]
allValues = 5×2
     1    20
     2    30
     3    40
     4    50
     5    60
sought = [3, 40];
[~, whichRow] = ismember(sought, allValues, 'rows')
whichRow = 3
matchpairs is intended to solve a different problem. It solves the combinatorial optimization assignment problem. It is intended to associate the elements of two different sets along with a cost or value for each association to minimize cost or maximize value, matching pairs of elements from those sets. One example is if you have workers and tasks, and each worker is more or less skilled with different tasks or is more or less satisfied with those tasks. You may want to assign each task to a worker maximizing the overall skill or the job satisfaction of the group of workers with their assignments.
Thanks Steve. I forgot about the 'rows' option. 🙂


Korosh Agha Mohammad Ghasemi
Moved: Voss on 25 Jun 2024
% Example V and N vectors
V = [1375471092848936; 1375473388165900; 1375471320476780; 1375473388947681; 1375473392527002];
N = [1375473384440853; 1375471277856598; 1375473388165900; 1375471322465961; 1375471335206288];
W = 1e4; % Window size
% Initialize the closest times array
closest_times = zeros(size(V));
% Find the closest time in N for each time in V within the window
for i = 1:length(V)
    % Calculate the absolute differences
    diffs = abs(N - V(i));
    % Find the indices of N that lie within the window
    window_idx = find(diffs <= W);
    if ~isempty(window_idx)
        % Find the closest time among the candidates in the window
        [~, closest_pos] = min(diffs(window_idx));
        % Map that position back to the actual index in N
        closest_times(i) = N(window_idx(closest_pos));
    else
        % No times within the window
        closest_times(i) = NaN;
    end
end
% Display the closest times
disp('Closest times:');
disp(closest_times);
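The same windowed search can also be written without a loop. This is a sketch that assumes R2016b or later (implicit expansion) and enough memory for a numel(N)-by-numel(V) matrix; for very long vectors the loop may be the safer choice.

```matlab
V = [1375471092848936; 1375473388165900; 1375471320476780; 1375473388947681; 1375473392527002];
N = [1375473384440853; 1375471277856598; 1375473388165900; 1375471322465961; 1375471335206288];
W = 1e4; % Window size

% Column k of abs(N - V.') holds the distances from every N to V(k);
% min along dimension 1 gives the nearest N for each V.
[minDiff, closestIdx] = min(abs(N - V.'), [], 1);

% Look up the nearest times, then blank out those outside the window
closest_times = N(closestIdx);
closest_times(minDiff > W) = NaN;
```

For the sample vectors only V(2) has a time in N within the window, so every other entry of closest_times ends up NaN.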


Asked: on 25 Aug 2014
Commented: on 8 Feb 2025
