Prediction Using Discriminant Analysis Models

predict uses three quantities to classify observations: posterior probability, prior probability, and cost.

predict classifies so as to minimize the expected classification cost:

$\hat{y} = \underset{y = 1, ..., K}{\arg \min} \sum_{k = 1}^{K} \hat{P} (k | x) C (y | k),$

where

$\hat{y}$ is the predicted classification.
K is the number of classes.
$\hat{P} (k | x)$ is the posterior probability of class k for observation x.
$C (y | k)$ is the cost of classifying an observation as y when its true class is k.
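As a minimal numeric sketch of this rule, the snippet below takes a made-up posterior vector for one observation and two made-up cost matrices, then picks the class with the smallest expected cost. All values are illustrative assumptions, not output from a fitted model. Rows of each cost matrix index the true class and columns the predicted class.

```matlab
% Hypothetical posterior probabilities P(k|x) for one observation, K = 3.
posterior = [0.5 0.3 0.2];

% Default 0-1 cost: C(i,j) = 1 when true class i differs from predicted j.
costDefault = ones(3) - eye(3);

% Hypothetical custom cost: missing true class 3 is ten times as costly.
costCustom = [0 1 1; 1 0 1; 10 10 0];

% Expected cost of predicting each class y: sum over k of P(k|x)*C(y|k).
expDefault = posterior * costDefault;   % [0.5 0.7 0.8]
expCustom  = posterior * costCustom;    % [2.3 2.5 0.8]

% The prediction is the class with the smallest expected cost.
[~, yDefault] = min(expDefault);        % class 1
[~, yCustom]  = min(expCustom);         % class 3
```

With the 0-1 cost, the rule reduces to choosing the class with the largest posterior. The asymmetric cost moves the prediction to class 3 even though its posterior is the smallest.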

The space of X values divides into regions, and the predicted class is constant within each region. For linear discriminant analysis, the boundaries between regions are hyperplanes (straight lines when there are two predictors). For quadratic discriminant analysis, the boundaries are quadric surfaces (conic sections such as ellipses, hyperbolas, or parabolas when there are two predictors). For a visualization of these regions, see Create and Visualize Discriminant Analysis Classifier.

Posterior Probability

The posterior probability that a point x belongs to class k is proportional to the product of the prior probability of class k and the multivariate normal density of class k at x. The density function of the multivariate normal with 1-by-d mean μ_k and d-by-d covariance Σ_k at a 1-by-d point x is

$P (x | k) = \frac{1}{{({(2 π)}^{d} | Σ_{k} |)}^{1 / 2}} \exp (- \frac{1}{2} (x - μ_{k}) Σ_{k}^{- 1} {(x - μ_{k})}^{T}),$

where $| Σ_{k} |$ is the determinant of Σ_k, and $Σ_{k}^{- 1}$ is the inverse matrix.

Let P(k) represent the prior probability of class k. Then the posterior probability that an observation x is of class k is

$\hat{P} (k | x) = \frac{P (x | k) P (k)}{P (x)},$

where P(x) is a normalization constant, namely, the sum over k of P(x|k)P(k).
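The two formulas above can be evaluated by hand in a few lines. The following is a sketch under assumed values: the class means, the shared identity covariance, the prior, and the query point are all hypothetical, and `mvnDensity` is a helper defined here for illustration rather than a toolbox function.

```matlab
% Multivariate normal density for a 1-by-d point x, mean mu, covariance S.
mvnDensity = @(x, mu, S) exp(-0.5 * ((x - mu) / S) * (x - mu)') ...
    / sqrt((2*pi)^numel(x) * det(S));

% Hypothetical two-class setup in d = 2 with a shared covariance.
mu    = [0 0; 2 2];      % one class mean per row
Sigma = eye(2);
prior = [0.8 0.2];
x     = [1 1];           % equidistant from both class means

% Class-conditional densities P(x|k).
lik = [mvnDensity(x, mu(1,:), Sigma), mvnDensity(x, mu(2,:), Sigma)];

% Posterior P(k|x) = P(x|k)P(k) / P(x), with P(x) the sum over classes.
joint     = lik .* prior;
posterior = joint / sum(joint);   % matches the prior, since lik(1) == lik(2)
```

Because x is equally likely under both classes, the data carry no information and the posterior equals the prior. Moving x toward either mean shifts the posterior toward that class.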

Prior Probability

The prior probability is one of three choices:

'uniform' — The prior probability of class k is 1 over the total number of classes.
'empirical' — The prior probability of class k is the number of training samples of class k divided by the total number of training samples.
A numeric vector — The prior probability of class k is the kth element of the Prior vector. See fitcdiscr.

After creating a classifier obj, you can set the prior using dot notation:

obj.Prior = v;

where v is a vector of positive elements representing the relative frequency with which each class occurs. You do not need to retrain the classifier when you set a new prior.
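The three prior choices can be illustrated with plain arithmetic. The sketch below uses a made-up label vector and a made-up frequency vector. It computes the priors directly rather than calling fitcdiscr, and the rescaling of v to sum to 1 is an assumption of this illustration.

```matlab
% Hypothetical training labels for K = 3 classes, coded 1..K.
labels = [1 1 1 2 2 3];
K = 3;

% 'uniform': every class gets 1/K.
priorUniform = ones(1, K) / K;

% 'empirical': class counts divided by the number of training samples.
counts = accumarray(labels(:), 1, [K 1])';
priorEmpirical = counts / numel(labels);    % [1/2 1/3 1/6]

% Numeric vector: relative frequencies rescaled to sum to 1
% (the rescaling step is an assumption of this sketch).
v = [2 1 1];
priorCustom = v / sum(v);                   % [0.5 0.25 0.25]
```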

Cost

There are two costs associated with discriminant analysis classification: the true misclassification cost per class, and the expected misclassification cost per observation.

True Misclassification Cost per Class

Cost(i,j) is the cost of classifying an observation into class j if its true class is i. By default, Cost(i,j)=1 if i~=j, and Cost(i,j)=0 if i=j. In other words, the cost is 0 for correct classification, and 1 for incorrect classification.

You can set any cost matrix you like when creating a classifier. Pass the cost matrix in the Cost name-value pair in fitcdiscr.

After you create a classifier obj, you can set a custom cost using dot notation:

obj.Cost = B;

B is a square matrix of size K-by-K when there are K classes. You do not need to retrain the classifier when you set a new cost.
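As a hedged sketch of this workflow, the following trains a classifier on the fisheriris sample data and then assigns a custom cost without retraining. It assumes Statistics and Machine Learning Toolbox is installed. The penalty value 5 is arbitrary, chosen only for illustration.

```matlab
% Train a linear discriminant classifier on the Fisher iris data.
load fisheriris
obj = fitcdiscr(meas, species);

% Start from the default 0-1 cost for K classes.
K = numel(obj.ClassNames);
B = ones(K) - eye(K);

% Hypothetical penalty: true class 3 predicted as class 2 costs 5.
B(3,2) = 5;

% Assign the new cost using dot notation; no retraining is needed.
obj.Cost = B;

% Predictions now minimize expected cost under the new matrix.
labelsCustom = predict(obj, meas);
```

Row and column indices of B follow the class order in obj.ClassNames.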

Expected Misclassification Cost per Observation

Suppose you have Nobs observations that you want to classify with a trained discriminant analysis classifier obj. Suppose you have K classes. You place the observations into a matrix Xnew with one observation per row. The command

[label,score,cost] = predict(obj,Xnew)

returns, among other outputs, a cost matrix of size Nobs-by-K. Each row of the cost matrix contains the expected (average) cost of classifying the observation into each of the K classes. cost(n,k) is

$\sum_{i = 1}^{K} \hat{P} (i | x_{n}) C (k | i),$

where

K is the number of classes.
$\hat{P} (i | x_{n})$ is the posterior probability of class i for observation $x_{n}$, the nth row of Xnew.
$C (k | i)$ is the cost of classifying an observation as k when its true class is i, that is, Cost(i,k).
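In matrix form, the sum above is the product of the posterior matrix and the cost matrix. The following sketch compares that product with the cost output of predict on three rows of the fisheriris sample data. It assumes Statistics and Machine Learning Toolbox is installed and that score holds the posterior probabilities. The choice of rows is arbitrary.

```matlab
% Train on the Fisher iris data and pick one observation per class.
load fisheriris
obj  = fitcdiscr(meas, species);
Xnew = meas([1 51 101], :);

% score is Nobs-by-K (posteriors); cost is Nobs-by-K (expected costs).
[label, score, cost] = predict(obj, Xnew);

% Expected cost from the formula: cost(n,k) = sum_i score(n,i)*Cost(i,k).
expected = score * obj.Cost;

% Largest absolute difference; should be near zero if the formula holds.
maxDiff = max(abs(cost(:) - expected(:)));

% The predicted label should be the class with the smallest expected cost.
[~, idx] = min(cost, [], 2);
sameLabels = isequal(obj.ClassNames(idx), label);
```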
