# predict

Predict responses of multinomial regression model

Since R2023a

## Syntax

``Ypred = predict(mdl,XNew)``
``[Ypred,probs] = predict(mdl,XNew)``
``[Ypred,probs,lower,upper] = predict(mdl,XNew)``
``[___] = predict(___,Name=Value)``

## Description


`Ypred = predict(mdl,XNew)` returns predicted class labels for the predictor data `XNew` and `MultinomialRegression` model object `mdl`.
`[Ypred,probs] = predict(mdl,XNew)` also returns probability estimates for the response categories.
`[Ypred,probs,lower,upper] = predict(mdl,XNew)` also returns the lower and upper confidence interval bounds for the probability estimates in `probs`.
`[___] = predict(___,Name=Value)` specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify the type of probability for the probability estimates returned in `probs`.

## Examples


Load the `fisheriris` sample data set.

`load fisheriris`

The column vector `species` contains the names of three iris species: setosa, versicolor, and virginica. The matrix `meas` contains four measurements for each flower: the length and width of the sepals and petals, in centimeters.

Divide the species and measurement data into training and test data by using the `cvpartition` function. Get the indices of the training data rows by using the `training` function.

```
n = length(species);
partition = cvpartition(n,'Holdout',0.05);
idx_train = training(partition);
```

Create training data by using the indices of the training data rows to create a matrix of measurements and a vector of species labels.

```
meastrain = meas(idx_train,:);
speciestrain = species(idx_train,:);
```

Fit a multinomial regression model using the training data.

`mdl = fitmnr(meastrain,speciestrain)`
```
mdl = 
Multinomial regression with nominal responses

                               Value       SE        tStat       pValue
                              _______    ______    ________    __________

    (Intercept_setosa)         86.293    12.541      6.8806    5.9582e-12
    x1_setosa                 -1.0614    3.5795    -0.29651       0.76684
    x2_setosa                  23.851    3.1238      7.6353    2.2535e-14
    x3_setosa                 -27.264    3.5009     -7.7879     6.815e-15
    x4_setosa                 -59.678    7.0214     -8.4994    1.9057e-17
    (Intercept_versicolor)     42.637    5.2214      8.1659    3.1906e-16
    x1_versicolor              2.4652    1.1263      2.1887      0.028619
    x2_versicolor              6.6808     1.474      4.5325     5.829e-06
    x3_versicolor             -9.4292    1.2946     -7.2837     3.248e-13
    x4_versicolor             -18.286    2.0833     -8.7775     1.671e-18

143 observations, 276 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 302.0378, p-value = 1.5168e-60
```

`mdl` is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data. The table output shows coefficient statistics for each predictor in `meas`. By default, `fitmnr` uses `virginica` as the reference category.

Get the indices of the test data rows by using the `test` function. Create test data by using the indices of the test data rows to create a matrix of measurements and a vector of species labels.

```
idx_test = test(partition);
meastest = meas(idx_test,:);
speciestest = species(idx_test,:);
```

Predict the iris species for the measurements in `meastest`.

`speciespredict = predict(mdl,meastest)`
```
speciespredict = 7x1 cell
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'versicolor'}
    {'versicolor'}
```

Compare the predictions in `speciespredict` with the category names in `speciestest`.

`speciestest`
```
speciestest = 7x1 cell
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'setosa'    }
    {'versicolor'}
    {'versicolor'}
```

The output shows that the model correctly predicts the iris species for all seven observations in `meastest`.
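For larger test sets, comparing two cell arrays by eye becomes impractical. The following is a minimal sketch of one way to summarize the comparison numerically, using `strcmp` for the fraction of matching labels and `confusionmat` for a per-class breakdown. The `rng("default")` call and the variable names `accuracy`, `confmat`, and `classorder` are assumptions added for illustration and are not part of the original example, so the partition (and therefore the result) can differ from the output shown above.

```matlab
% Illustrative sketch: quantify agreement between predicted and true labels.
rng("default")                      % assumption: fix the seed so the partition is repeatable
load fisheriris

n = length(species);
partition = cvpartition(n,'Holdout',0.05);
idx_train = training(partition);
idx_test = test(partition);

mdl = fitmnr(meas(idx_train,:),species(idx_train));
speciespredict = predict(mdl,meas(idx_test,:));
speciestest = species(idx_test);

% Fraction of test observations whose predicted label matches the true label
accuracy = mean(strcmp(speciespredict,speciestest))

% Rows are true classes and columns are predicted classes
[confmat,classorder] = confusionmat(speciestest,speciespredict)
```

With only seven held-out observations, the accuracy estimate is coarse; a larger holdout fraction gives a more stable figure.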

Load the `carbig` sample data set.

`load carbig`

The variables `Acceleration` and `Displacement` contain data for car acceleration and displacement, respectively. The variable `Cylinders` contains data for the number of cylinders in each car engine.

Create a table from the car data variables using the `table` function.

`tbl = table(Acceleration,Displacement,Cylinders,VariableNames=["Acceleration","Displacement","Cylinders"])`
```
tbl=406×3 table
    Acceleration    Displacement    Cylinders
    ____________    ____________    _________

         12             307             8
        11.5            350             8
         11             318             8
         12             304             8
        10.5            302             8
         10             429             8
          9             454             8
         8.5            440             8
         10             455             8
         8.5            390             8
        17.5            133             4
        11.5            350             8
         11             351             8
        10.5            383             8
         11             360             8
         10             383             8
          ⋮
```

The `Cylinders` data has an inherent ordering. Fit an ordinal multinomial regression model using `Acceleration` and `Displacement` as predictor variables and `Cylinders` as the response.

`mdl = fitmnr(tbl,"Cylinders",ModelType="ordinal");`

`mdl` is a multinomial regression model object that contains the results of fitting an ordinal multinomial regression model to the data.

Predict the response category, cumulative category probabilities, and 99% confidence interval bounds for a car with an acceleration of 16 and an engine displacement of 80.

`[cylinderspredict,cumprobs,lower,upper] = predict(mdl,[16 80],Alpha=0.01,ProbabilityType="cumulative")`
```cylinderspredict = 4 ```
```
cumprobs = 1×4

    0.0792    1.0000    1.0000    1.0000
```
```
lower = 1×4

    0.0787    1.0000    1.0000    1.0000
```
```
upper = 1×4

    0.0798    1.0000    1.0000    1.0000
```

The output shows that the predicted response category is 4. The vector `cumprobs` shows the cumulative probabilities for each category in `Cylinders`. To view the category probabilities on which the prediction is based, calculate the category probabilities.

`[~,catprobs] = predict(mdl,[16 80])`
```
catprobs = 1×5

    0.0792    0.9208    0.0000    0.0000    0.0000
```

The second value in the vector `catprobs` has the highest probability. Display an ordered list of the categories in `Cylinders`.

`mdl.ClassNames`
```
ans = 5×1

     3
     4
     5
     6
     8
```

The output shows that the second category corresponds to cars with four cylinders. Therefore, the category with the highest category probability is `4`.
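The same lookup can be done programmatically. The following sketch refits the model so that it runs on its own, finds the column of `catprobs` with the largest value, and uses that index to look up the label in `mdl.ClassNames`. The variable names `idxmax` and `mostlikely` are hypothetical, and the table is built without explicit variable names, which relies on `table` taking the names from the workspace variables.

```matlab
load carbig
tbl = table(Acceleration,Displacement,Cylinders);
mdl = fitmnr(tbl,"Cylinders",ModelType="ordinal");

[cylinderspredict,catprobs] = predict(mdl,[16 80]);

% Column index of the largest category probability
[~,idxmax] = max(catprobs,[],2);

% Map the column index back to a category label
mostlikely = mdl.ClassNames(idxmax)

% Expected to match the label returned by predict
isequal(mostlikely,cylinderspredict)
```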

## Input Arguments


`mdl`: Multinomial regression model object, specified as a `MultinomialRegression` model object created with the `fitmnr` function.

`XNew`: New predictor input values, specified as a table or an n-by-p matrix, where n is the number of observations to predict and p is the number of predictor variables used to fit `mdl`.

• If `XNew` is a table, it must contain all the names of the predictors used to fit `mdl`. You can find the predictor names in the `mdl.PredictorNames` property.

• If `XNew` is a matrix, it must have the same number of columns as the number of predictor variables used to fit `mdl`. You can find the number of predictor variables in the `mdl.NumPredictors` property. You can specify `XNew` as a matrix only when all names in `mdl.PredictorNames` refer to numeric predictors.

Example: `predict(mdl,[6.2 3.4; 5.9 3.0])` evaluates the two-predictor model `mdl` at the points `p1 = [6.2 3.4]` and `p2 = [5.9 3.0]`.

Data Types: `single` | `double` | `table`
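As a minimal sketch of the two input forms, the following code refits the ordinal `carbig` model from the examples and evaluates the same observation once as a matrix and once as a table. The variable names `XNewMatrix`, `XNewTable`, `yMatrix`, and `yTable` are made up for illustration. The sketch assumes that the matrix columns follow the order of `mdl.PredictorNames`, whereas the table variables are matched by name.

```matlab
load carbig
tbl = table(Acceleration,Displacement,Cylinders);
mdl = fitmnr(tbl,"Cylinders",ModelType="ordinal");

mdl.PredictorNames          % predictor names that a table input must contain
mdl.NumPredictors           % number of columns that a matrix input must have

% Matrix input: one row per observation, one column per predictor
XNewMatrix = [16 80];
yMatrix = predict(mdl,XNewMatrix);

% Table input: variable names must match the predictor names
XNewTable = table(16,80,VariableNames=["Acceleration","Displacement"]);
yTable = predict(mdl,XNewTable);

isequal(yMatrix,yTable)
```

A table is the safer choice when the column order of new data is not guaranteed, because the predictors are identified by name rather than by position.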

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: `[Ypred,probs,lower,upper] = predict(mdl,XNew,Alpha=0.01,ProbabilityType="cumulative")` returns cumulative probability estimates together with their 99% confidence interval bounds.

`Alpha`: Significance level for the confidence intervals on the probability estimates, specified as a scalar value in the range (0,1). The confidence level of the confidence intervals is 100(1 − α)%. The default value for `Alpha` is `0.05`, which returns 95% confidence intervals for the estimates.

Example: `Alpha=0.01`

Data Types: `single` | `double`
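To illustrate the effect of `Alpha`, the following sketch requests intervals at the default 95% level and at the 99% level for the same observations and compares their widths. It refits the ordinal `carbig` model from the examples; the query points in `XNew` and the variable names `width95` and `width99` are arbitrary choices made for illustration. A smaller `Alpha` is expected to give intervals that are at least as wide.

```matlab
load carbig
tbl = table(Acceleration,Displacement,Cylinders);
mdl = fitmnr(tbl,"Cylinders",ModelType="ordinal");

XNew = [16 80; 12 307];

% Default Alpha=0.05 gives 95% confidence intervals
[~,~,lower95,upper95] = predict(mdl,XNew);

% Alpha=0.01 gives 99% confidence intervals
[~,~,lower99,upper99] = predict(mdl,XNew,Alpha=0.01);

% Interval widths for each observation (rows) and category (columns)
width95 = upper95 - lower95
width99 = upper99 - lower99
```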

`ProbabilityType`: Type of probability estimates to return in `probs`, specified as one of the following options.

| Option | Description |
| --- | --- |
| `"category"` (default) | Calculate a distinct probability for each response category. |
| `"cumulative"` | Calculate a cumulative probability for each response category. |
| `"conditional"` | Calculate a conditional probability for each response category. |

Example: `ProbabilityType="conditional"`

Data Types: `char` | `string`
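The relationship between the `"category"` and `"cumulative"` options can be checked numerically. In the ordinal example on this page, the cumulative output has one fewer column than the category output. The following sketch assumes that the cumulative probabilities are the running sums of the category probabilities for all but the last category. The variable name `cumfromcat` is hypothetical, and the `"conditional"` option is not covered here.

```matlab
load carbig
tbl = table(Acceleration,Displacement,Cylinders);
mdl = fitmnr(tbl,"Cylinders",ModelType="ordinal");

XNew = [16 80];
[~,catprobs] = predict(mdl,XNew,ProbabilityType="category");
[~,cumprobs] = predict(mdl,XNew,ProbabilityType="cumulative");

% Category probabilities cover every response category and sum to 1
sum(catprobs,2)

% Running sum of the category probabilities, dropping the last column
% (whose cumulative probability is always 1), for comparison with cumprobs
cumfromcat = cumsum(catprobs(:,1:end-1),2);
max(abs(cumfromcat - cumprobs),[],"all")
```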

## Output Arguments


`Ypred`: Predicted response categories, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors. `Ypred` has the same data type as `mdl.ClassNames`.

`probs`: Probability estimates for the response categories, returned as a numeric matrix. Each column of `probs` corresponds to the entry at the same index in `mdl.ClassNames`.

`lower`: Lower confidence interval bound for the probability estimates in `probs`, returned as a numeric matrix.

`upper`: Upper confidence interval bound for the probability estimates in `probs`, returned as a numeric matrix.

## Alternative Functionality

• `feval` returns the same predictions as `predict`. The `feval` function can take multiple input arguments, with one input for each predictor variable. Note that the `feval` function does not give confidence intervals on predictions.

• `random` returns predictions with added noise.
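The difference in calling convention between `predict` and `feval` can be sketched as follows. The example refits the ordinal `carbig` model from this page and assumes that `feval` accepts the two predictor values as separate arguments in the order given by `mdl.PredictorNames`. The variable names `ypredict` and `yfeval` are hypothetical.

```matlab
load carbig
tbl = table(Acceleration,Displacement,Cylinders);
mdl = fitmnr(tbl,"Cylinders",ModelType="ordinal");

% predict takes all predictor values in a single matrix or table
ypredict = predict(mdl,[16 80]);

% feval takes one input argument per predictor variable
yfeval = feval(mdl,16,80);

isequal(ypredict,yfeval)
```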

## Version History

Introduced in R2023a