ExhaustiveSearcher

Create exhaustive nearest neighbor searcher

Description

`ExhaustiveSearcher` model objects store the training data, distance metric, and parameter values of the distance metric for an exhaustive nearest neighbor search. The exhaustive search algorithm finds the distance from each query observation to all n observations in the training data, which is an n-by-K numeric matrix.

Once you create an `ExhaustiveSearcher` model object, find neighboring points in the training data to the query data by performing a nearest neighbor search using `knnsearch` or a radius search using `rangesearch`. The exhaustive search algorithm is more efficient than the Kd-tree algorithm when K is large (that is, K > 10), and it is more flexible than the Kd-tree algorithm with respect to distance metric choices. The `ExhaustiveSearcher` model object also supports sparse data.
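To make the idea of an exhaustive search concrete, the following sketch computes the Euclidean distance from each query row to every training row and keeps the closest ones. It uses only base MATLAB. The variable names (`Xtrain`, `Yquery`, `k`) and the toy data are illustrative assumptions, and the loop is not meant to reflect how the toolbox implements the search internally.

```matlab
% Toy training data (n-by-K) and query data (m-by-K); values are illustrative.
Xtrain = [0 0; 1 0; 0 2; 3 3];
Yquery = [0.9 0.1; 2.5 2.5];
k = 2;                                   % Number of neighbors to keep

m = size(Yquery,1);
idx  = zeros(m,k);                       % Indices of nearest training rows
dist = zeros(m,k);                       % Corresponding distances
for i = 1:m
    % Distance from query row i to all n training rows
    d = sqrt(sum((Xtrain - Yquery(i,:)).^2, 2));
    [dSorted,order] = sort(d);           % Ascending distance
    idx(i,:)  = order(1:k)';
    dist(i,:) = dSorted(1:k)';
end
idx
```

Because every query is compared against all n training rows, the cost per query grows linearly with n, which is the trade-off the exhaustive method accepts in exchange for supporting any distance metric.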

Creation

Use either the `createns` function or the `ExhaustiveSearcher` function (described here) to create an `ExhaustiveSearcher` object. Both functions use the same syntax, except that `createns` has the `'NSMethod'` name-value pair argument, which you use to choose the nearest neighbor search method. The `createns` function can also create a `KDTreeSearcher` object. Specify `'NSMethod','exhaustive'` to create an `ExhaustiveSearcher` object. The default is `'exhaustive'` if K > 10, the training data is sparse, or the distance metric is not Euclidean, city block, Chebychev, or Minkowski.
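As a quick check of the default selection rule, the sketch below calls `createns` on random data and inspects the class of each result. The data sizes and variable names are assumptions chosen only to trigger each case.

```matlab
rng('default')                    % For reproducibility
Xlow  = rand(100,3);              % K = 3 predictors
Xhigh = rand(100,12);             % K = 12 predictors (K > 10)

MdlLow  = createns(Xlow);                          % Default search method
MdlHigh = createns(Xhigh);                         % Default search method, K > 10
MdlCos  = createns(Xlow,'Distance','cosine');      % Metric outside the four listed above
MdlEx   = createns(Xlow,'NSMethod','exhaustive');  % Explicit choice

% Display the class of each searcher
class(MdlLow)
class(MdlHigh)
class(MdlCos)
class(MdlEx)
```

Under the rule described above, only `MdlLow` should be a `KDTreeSearcher` object; the other three should be `ExhaustiveSearcher` objects.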

Syntax

``Mdl = ExhaustiveSearcher(X)``
``Mdl = ExhaustiveSearcher(X,Name,Value)``

Description

`Mdl = ExhaustiveSearcher(X)` creates an exhaustive nearest neighbor searcher object (`Mdl`) using the n-by-K numeric matrix of training data (`X`).

`Mdl = ExhaustiveSearcher(X,Name,Value)` specifies additional options using one or more name-value pair arguments. You can specify the distance metric and set the distance metric parameter (`DistParameter`) property. For example, `ExhaustiveSearcher(X,'Distance','chebychev')` creates an exhaustive nearest neighbor searcher object that uses the Chebychev distance. To specify `DistParameter`, use the `Cov`, `P`, or `Scale` name-value pair argument.
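The following sketch shows how the `P` and `Scale` name-value arguments are expected to populate `DistParameter`. The exponent `3` and the scale vector `s` are hypothetical values picked for illustration.

```matlab
load fisheriris
X = meas;

% Minkowski distance with exponent 3 (set through the 'P' argument)
MdlMink = ExhaustiveSearcher(X,'Distance','minkowski','P',3);
MdlMink.DistParameter

% Standardized Euclidean distance with a user-supplied scale vector
s = [1 1 2 2];   % Hypothetical per-predictor scale values
MdlSE = ExhaustiveSearcher(X,'Distance','seuclidean','Scale',s);
MdlSE.DistParameter
```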

Input Arguments

Training data that prepares the exhaustive searcher algorithm, specified as a numeric matrix. `X` has n rows, each corresponding to an observation (that is, an instance or example), and K columns, each corresponding to a predictor (that is, a feature).

Data Types: `single` | `double`

Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `'Distance','mahalanobis','Cov',eye(3)` specifies to use the Mahalanobis distance when searching for nearest neighbors and a 3-by-3 identity matrix for the covariance matrix in the Mahalanobis distance metric.

Properties

Training data that prepares the exhaustive searcher algorithm, specified as a numeric matrix. `X` has n rows, each corresponding to an observation (that is, an instance or example), and K columns, each corresponding to a predictor (that is, a feature).

The input argument `X` of `createns` or `ExhaustiveSearcher` sets this property.

Data Types: `single` | `double`

Distance metric used when you call `knnsearch` or `rangesearch` to find nearest neighbors for future query points, specified as a character vector or string scalar (`'chebychev'`, `'cityblock'`, `'correlation'`, `'cosine'`, `'euclidean'`, `'fasteuclidean'`, `'fastseuclidean'`, `'hamming'`, `'jaccard'`, `'minkowski'`, `'mahalanobis'`, `'seuclidean'`, or `'spearman'`), or a function handle.

The `'Distance'` name-value pair argument of `createns` or `ExhaustiveSearcher` sets this property.

The software does not use the distance metric for creating an `ExhaustiveSearcher` model object, so you can alter it by using dot notation. Algorithms starting with `'fast'` do not support sparse data.
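The sketch below illustrates both ways of setting the metric: changing a named metric with dot notation, and supplying a function handle. The custom metric `weightedCityBlock` and the weights `w` are hypothetical. The handle is assumed to follow the usual convention for custom distance functions in this toolbox: it takes a 1-by-K row `ZI` and an m-by-K matrix `ZJ`, and returns an m-by-1 vector of distances.

```matlab
load fisheriris
X = meas;
Mdl = ExhaustiveSearcher(X);

% Switch to another built-in metric after creation by using dot notation
Mdl.Distance = 'cityblock';

% Hypothetical custom metric: weighted city block distance.
% ZI is 1-by-K, ZJ is m-by-K, and the result is m-by-1.
w = [1 1 0.5 0.5];
weightedCityBlock = @(ZI,ZJ) sum(abs(ZJ - ZI).*w, 2);
MdlCustom = ExhaustiveSearcher(X,'Distance',weightedCityBlock);

% Query with a training row; the nearest neighbor is at distance zero
[idx,d] = knnsearch(MdlCustom,X(1,:),'K',1)
```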

Distance metric parameter values, specified as empty (`[]`), a positive scalar, a positive numeric vector, or a positive definite matrix, depending on the distance metric.

This table describes the distance parameters of the supported distance metrics.

| Distance Metric | Parameter Description |
| --- | --- |
| `'mahalanobis'` | A positive definite matrix representing the covariance matrix used for computing the Mahalanobis distance. By default, the software sets the covariance using `cov(Mdl.X,'omitrows')`. The `'Cov'` name-value pair argument of `createns` or `ExhaustiveSearcher` sets this property. You can alter `DistParameter` by using dot notation, for example, `Mdl.DistParameter = CovNew`, where `CovNew` is a K-by-K positive definite numeric matrix. |
| `'minkowski'` | A positive scalar indicating the exponent of the Minkowski distance. By default, the exponent is `2`. The `'P'` name-value pair argument of `createns` or `ExhaustiveSearcher` sets this property. You can alter `DistParameter` by using dot notation, for example, `Mdl.DistParameter = PNew`, where `PNew` is a positive scalar. |
| `'seuclidean'` | A positive numeric vector indicating the values used by the software to scale the predictors when computing the standardized Euclidean distance. By default, the software (1) estimates the standard deviation of each predictor (column) of `X` using `scale = std(Mdl.X,'omitnan')`, and (2) scales each coordinate difference between the rows in `X` and the query matrix by dividing by the corresponding element of `scale`. The `'Scale'` name-value pair argument of `createns` or `ExhaustiveSearcher` sets this property. You can alter `DistParameter` by using dot notation, for example, `Mdl.DistParameter = sNew`, where `sNew` is a K-dimensional positive numeric vector. |

If `Mdl.Distance` is not one of the distance metrics listed in this table, then `Mdl.DistParameter` is `[]`, which means that the specified distance metric formula has no parameters.

Data Types: `single` | `double`
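To show how the default parameters enter the distance calculation, the following sketch recomputes the standardized Euclidean and Mahalanobis distances by hand and compares them with the distances returned by `knnsearch`. The query point `y` is a hypothetical value, and the manual formulas are the textbook definitions, assumed here to match the toolbox computation up to rounding.

```matlab
load fisheriris
X = meas;
y = [5 3 4 1];                        % Hypothetical query point

% Standardized Euclidean: default scale is the per-column standard deviation
MdlSE = ExhaustiveSearcher(X,'Distance','seuclidean');
scale = std(X,'omitnan');
[iSE,dSE] = knnsearch(MdlSE,y);
dManualSE = sqrt(sum(((X(iSE,:) - y)./scale).^2));

% Mahalanobis: default parameter is the covariance of the training data
MdlMah = ExhaustiveSearcher(X,'Distance','mahalanobis');
C = cov(X,'omitrows');
[iMah,dMah] = knnsearch(MdlMah,y);
delta = X(iMah,:) - y;
dManualMah = sqrt(delta/C*delta');    % Same as sqrt(delta*inv(C)*delta')

% Each row pairs the toolbox distance with the manual one
[dSE dManualSE; dMah dManualMah]
```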

Object Functions

| Function | Description |
| --- | --- |
| `knnsearch` | Find k-nearest neighbors using searcher object |
| `rangesearch` | Find all neighbors within specified distance using searcher object |

Examples

```
load fisheriris
X = meas;
[n,k] = size(X)
```
```
n = 150
```
```
k = 4
```

`X` has 150 observations and 4 predictors.

Prepare an exhaustive nearest neighbor searcher using the entire data set as training data.

`Mdl1 = ExhaustiveSearcher(X)`
```
Mdl1 = 
  ExhaustiveSearcher with properties:

         Distance: 'euclidean'
    DistParameter: []
                X: [150x4 double]
```

`Mdl1` is an `ExhaustiveSearcher` model object, and its properties appear in the Command Window. The object contains information about the trained algorithm, such as the distance metric. You can alter property values using dot notation.

Alternatively, you can prepare an exhaustive nearest neighbor searcher by using `createns` and specifying `'exhaustive'` as the search method.

`Mdl2 = createns(X,'NSMethod','exhaustive')`
```
Mdl2 = 
  ExhaustiveSearcher with properties:

         Distance: 'euclidean'
    DistParameter: []
                X: [150x4 double]
```

`Mdl2` is also an `ExhaustiveSearcher` model object, and it is equivalent to `Mdl1`.

To search `X` for the nearest neighbors to a batch of query data, pass the `ExhaustiveSearcher` model object and the query data to `knnsearch` or `rangesearch`.

Load Fisher's iris data set. Focus on the petal dimensions.

```
load fisheriris
X = meas(:,[3 4]); % Predictors
```

Prepare an exhaustive nearest neighbor searcher. Specify the Mahalanobis distance metric.

`Mdl = createns(X,'Distance','mahalanobis')`
```
Mdl = 
  ExhaustiveSearcher with properties:

         Distance: 'mahalanobis'
    DistParameter: [2x2 double]
                X: [150x2 double]
```

Because the distance metric is Mahalanobis, `createns` creates an `ExhaustiveSearcher` model object by default.

Access properties of `Mdl` by using dot notation. For example, use `Mdl.DistParameter` to access the Mahalanobis covariance parameter.

`Mdl.DistParameter`
```
ans = 2×2

    3.1163    1.2956
    1.2956    0.5810
```

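You can pass query data and `Mdl` to `knnsearch` to find the k-nearest neighbors, or to `rangesearch` to find all neighbors within a specified distance.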

Create an `ExhaustiveSearcher` model object and alter the `Distance` property by using dot notation.

```
load fisheriris
X = meas;
```

Train a default exhaustive searcher algorithm using the entire data set as training data.

`Mdl = ExhaustiveSearcher(X)`
```
Mdl = 
  ExhaustiveSearcher with properties:

         Distance: 'euclidean'
    DistParameter: []
                X: [150x4 double]
```

Specify that the neighbor searcher use the Mahalanobis metric to compute the distances between the training and query data.

`Mdl.Distance = 'mahalanobis'`
```
Mdl = 
  ExhaustiveSearcher with properties:

         Distance: 'mahalanobis'
    DistParameter: [4x4 double]
                X: [150x4 double]
```

You can pass `Mdl` and the query data to either `knnsearch` or `rangesearch` to find the nearest neighbors to the points in the query data based on the Mahalanobis distance.
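As a sketch of the radius search mentioned above, the following code passes the altered searcher to `rangesearch`. The query rows `Y` and the radius `r` are hypothetical choices made only for illustration.

```matlab
load fisheriris
X = meas;
Mdl = ExhaustiveSearcher(X);
Mdl.Distance = 'mahalanobis';

Y = X([1 75 150],:);                 % Hypothetical query points taken from the data
r = 1;                               % Hypothetical search radius

% Idx and D are cell arrays with one cell per query row
[Idx,D] = rangesearch(Mdl,Y,r);
numWithinRadius = cellfun(@numel,Idx)
```

Unlike `knnsearch`, which returns a fixed number of neighbors per query, `rangesearch` can return a different number of neighbors for each query row, which is why its outputs are cell arrays.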

Create an exhaustive searcher object by using the `createns` function. Pass the object and query data to the `knnsearch` function to find k-nearest neighbors.

`load fisheriris`

Remove five irises randomly from the predictor data to use as a query set.

```
rng('default');                  % For reproducibility
n = size(meas,1);                % Sample size
qIdx = randsample(n,5);          % Indices of query data
X = meas(~ismember(1:n,qIdx),:);
Y = meas(qIdx,:);
```

Prepare an exhaustive nearest neighbor searcher using the training data. Specify the Mahalanobis distance for finding nearest neighbors.

`Mdl = createns(X,'Distance','mahalanobis')`
```
Mdl = 
  ExhaustiveSearcher with properties:

         Distance: 'mahalanobis'
    DistParameter: [4x4 double]
                X: [145x4 double]
```

Because the distance metric is Mahalanobis, `createns` creates an `ExhaustiveSearcher` model object by default.

The software uses the covariance matrix of the predictors (columns) in the training data for computing the Mahalanobis distance. To display this value, use `Mdl.DistParameter`.

`Mdl.DistParameter`
```
ans = 4×4

    0.6547   -0.0368    1.2320    0.5026
   -0.0368    0.1914   -0.3227   -0.1193
    1.2320   -0.3227    3.0671    1.2842
    0.5026   -0.1193    1.2842    0.5800
```

Find the indices of the training data (`Mdl.X`) that are the two nearest neighbors of each point in the query data (`Y`).

`IdxNN = knnsearch(Mdl,Y,'K',2)`
```
IdxNN = 5×2

     5     6
    98    95
   104   128
   135    65
   102   115
```

Each row of `IdxNN` corresponds to a query data observation. The column order corresponds to the order of the nearest neighbors with respect to ascending distance. For example, based on the Mahalanobis metric, the second nearest neighbor of `Y(3,:)` is `X(128,:)`.
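If you also need the distances, `knnsearch` can return them as a second output. The sketch below repeats the setup of this example and requests both outputs; the variable names `D` and `nearestRows` are illustrative.

```matlab
load fisheriris
rng('default');                  % For reproducibility
n = size(meas,1);                % Sample size
qIdx = randsample(n,5);          % Indices of query data
X = meas(~ismember(1:n,qIdx),:);
Y = meas(qIdx,:);
Mdl = createns(X,'Distance','mahalanobis');

% D(i,j) is the distance from Y(i,:) to its j-th nearest neighbor
[IdxNN,D] = knnsearch(Mdl,Y,'K',2);
D

% Training observations that are closest to each query point
nearestRows = X(IdxNN(:,1),:)
```

Within each row of `D`, the distances are in ascending order, matching the column order of `IdxNN`.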

Version History

Introduced in R2010a
