Reports informedness, plus recall, precision, F factor, G factor, correlation and kappa.
Updated 14 Sep 2016


The informedness of a prediction method as captured by a contingency matrix is defined as the probability that the prediction method will make a correct decision as opposed to guessing and is calculated using the bookmaker algorithm. The markedness is the informedness for the inverse problem, predicting the classifier labels from the real classes (so how well the real world class marks the predictor). Their correlation is the geometric mean of the two - which are understood here as regression coefficients.
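For the two-class case the definitions above reduce to simple closed forms, sketched below in Python. This is only an illustration of the dichotomous case, not the multiclass Bookmaker algorithm and not the code of bm itself; the function name and the example counts are hypothetical, and every row and column total is assumed to be non-zero.

```python
import math

def informedness_markedness(tp, fn, fp, tn):
    """Binary informedness, markedness and correlation from 2x2 counts.

    Hypothetical helper for illustration; assumes no marginal total is zero.
    """
    recall = tp / (tp + fn)             # true positive rate
    inverse_recall = tn / (tn + fp)     # true negative rate
    precision = tp / (tp + fp)          # positive predictive value
    inverse_precision = tn / (tn + fn)  # negative predictive value

    informedness = recall + inverse_recall - 1
    markedness = precision + inverse_precision - 1

    # Both quantities share the sign of tp*tn - fp*fn, so their product is
    # non-negative up to rounding; the correlation takes that shared sign.
    product = max(0.0, informedness * markedness)
    correlation = math.copysign(math.sqrt(product), informedness)
    return informedness, markedness, correlation

# Made-up counts: 50 real positives, 50 real negatives.
inf, mark, corr = informedness_markedness(tp=40, fn=10, fp=20, tn=30)
print(inf, mark, corr)  # roughly 0.4, 0.417, 0.408
```

In this binary case the correlation computed this way coincides with the Matthews correlation coefficient of the same table.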

Alternative measures of the usefulness of a prediction method are all defective in at least one respect: they do not take into account all cells of the contingency matrix, they do not take into account the baseline performance due to chance/guessing, or they are concerned with significance or information rather than correctness. Significance tests and ROC tuning are complementary analyses that should also be performed and are provided (pearsonXsq --> px).
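As a rough illustration of how significance relates to the measures above, the following Python sketch computes Pearson's chi-square statistic for a 2x2 table in two ways: directly from observed and expected counts, and as N times the product of informedness and markedness. The function names are hypothetical and this is not claimed to be how the pearsonXsq field is computed; all marginal totals are assumed to be non-zero.

```python
def pearson_chi_square(tp, fn, fp, tn):
    """Pearson chi-square for a 2x2 table from observed vs expected counts."""
    n = tp + fn + fp + tn
    observed = [[tp, fn], [fp, tn]]   # rows: real class, columns: prediction
    row_totals = [tp + fn, fp + tn]
    col_totals = [tp + fp, fn + tn]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

def chi_square_from_informedness(tp, fn, fp, tn):
    """Same statistic via N * informedness * markedness (binary case only)."""
    n = tp + fn + fp + tn
    informedness = tp / (tp + fn) + tn / (tn + fp) - 1
    markedness = tp / (tp + fp) + tn / (tn + fn) - 1
    return n * informedness * markedness

direct = pearson_chi_square(40, 10, 20, 30)
via_bm = chi_square_from_informedness(40, 10, 20, 30)
print(direct, via_bm)  # both roughly 16.67
```

The two routes agree because, for a 2x2 table, the squared correlation equals chi-square divided by N.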

The Recall, Precision and Rank average are calculated for comparison, along with the F and G measures corresponding to their harmonic and geometric means. The geometric mean is more generally valid and mathematically justifiable, being the GM of Lp and L-p for any p. The F1 score overemphasizes Rec/Prec differences and assumes a base distribution around the mean of the actual and predicted labels (see 5 below).
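The relationship between the F and G measures can be sketched in a few lines of Python. The sketch assumes that "Lp" refers to the power (generalized) mean of order p of recall and precision; that reading, the function names and the example values are assumptions made for illustration only.

```python
import math

def f_measure(recall, precision):
    """F1: harmonic mean of recall and precision."""
    return 2 * recall * precision / (recall + precision)

def g_measure(recall, precision):
    """G: geometric mean of recall and precision."""
    return math.sqrt(recall * precision)

def power_mean(a, b, p):
    """Power mean of order p of two positive values (p must be non-zero)."""
    return ((a ** p + b ** p) / 2) ** (1 / p)

recall, precision = 0.8, 0.5   # made-up values
f1 = f_measure(recall, precision)
g = g_measure(recall, precision)
print(f1, g)  # roughly 0.615 and 0.632

# For two values, the geometric mean of the order-p and order-(-p) power
# means equals the plain geometric mean, whatever non-zero p is chosen.
for p in (1, 2, 3.5):
    combined = math.sqrt(power_mean(recall, precision, p) *
                         power_mean(recall, precision, -p))
    print(p, combined)
```

With p = 1 this says that G is the geometric mean of the arithmetic mean and the harmonic mean (F1) of recall and precision, so F1 never exceeds G.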

The Kappa calculated is Cohen's Kappa, not Fleiss' Kappa (see 3 below). Fleiss' Kappa, like the F-measure, assumes both Real and Predicted labels are drawn from the mean distribution. This may be reasonable in the context for which it was designed, but for classification the prediction distribution is produced with knowledge of, and the ability to exploit, the actual distribution to increase accuracy and kappa. Informedness and Markedness can't be biased by changing the prediction distribution, and are stable in relation to changes in the real class distribution (e.g. applying in a different demographic area). They also have a relationship with ROC and AUC: choosing the optimal operating point with ROC maximizes Informedness, while maximizing AUC maximizes the ability to adapt to different demographics/distributions (see 4/4a below).
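A small Python sketch can make the stability claim concrete for the two-class case. It computes Cohen's Kappa and informedness for a table, then triples the number of real positives while keeping the per-class hit rates fixed. The function names and counts are hypothetical, and the example illustrates only the effect of a change in the real class distribution on these two measures.

```python
def cohen_kappa(tp, fn, fp, tn):
    """Cohen's Kappa for a 2x2 table: chance agreement from both marginals."""
    n = tp + fn + fp + tn
    observed_agreement = (tp + tn) / n
    chance_agreement = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)

def informedness(tp, fn, fp, tn):
    """Binary informedness: recall + inverse recall - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1

# Balanced population (made-up counts): 50 real positives, 50 real negatives.
balanced = dict(tp=40, fn=10, fp=20, tn=30)
# Same classifier behaviour per class, but three times as many real positives.
shifted = dict(tp=120, fn=30, fp=20, tn=30)

kappa_balanced = cohen_kappa(**balanced)
kappa_shifted = cohen_kappa(**shifted)
inf_balanced = informedness(**balanced)
inf_shifted = informedness(**shifted)
print(kappa_balanced, kappa_shifted)  # roughly 0.4 and 0.375
print(inf_balanced, inf_shifted)      # both roughly 0.4
```

Kappa moves with the class mix even though the classifier behaves identically within each class, while informedness depends only on the per-class rates.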

For the original technical paper see
For the original tutorial poster see

References - all by David MW Powers, developer of the multiclass Bookmaker informedness, markedness and correlation evaluation statistics.
1. Recall & Precision versus The Bookmaker
(2003:6pp) International Conference on Cognitive Science (ICCS), 529-534
2. Evaluation: from precision, recall and F-factor to ROC, informedness, markedness and correlation
(2007:13pp) Flinders University School of Informatics & Engineering TR SIE-07-001
2a. Evaluation: from precision, recall & F-measure to ROC, informedness, markedness & correlation
(2011:27pp) International Journal of Machine Learning Technology 2 (1), 37-63
2b. Evaluation Evaluation
(2008:2pp) Proceedings of the European Conference on Artificial Intelligence (ECAI)
2c. Evaluation Evaluation a Monte Carlo study
(2008:5pp) arXiv preprint arXiv:1504.00854 (longer submitted version of 2b, highlights of 2/2a)
3. The problem with kappa
(2012) Proceedings of the 13th Conference of the European Chapter of ACL (EACL)
4. The problem of area under the curve
(2012) International Conference on Information Science and Technology (ICIST)
4a. ROC-ConCert: ROC-Based Measurement of Consistency and Certainty
(2012) Spring Congress on Engineering and Technology (SCET), 2:238-241
5. What the F-measure doesn't measure: Features, Flaws, Fallacies and Fixes
(2015) arXiv preprint arXiv:1503.06410
6. Visualization of Tradeoff in Evaluation: from Precision-Recall & PN to LIFT, ROC & BIRD
(2015) arXiv preprint arXiv:1505.00401

Cite As

David Powers (2024). bm(cm), MATLAB Central File Exchange.

MATLAB Release Compatibility
Created with R10
Compatible with any release
Platform Compatibility
Windows macOS Linux

Inspired: paralled buddy prima

Release Notes

clarified references with standard acronyms for conferences (removed redundant dates)
more whitespace/formatting and ref (re)fixes/clarifications
removed redundant dates
fixed grammar - hard to check in a 3-line field
try again with whitespace
pearsonxsq --> pearsonXsq
updated description/refs with more detail
whitespace added back in (thanks mathworks)
renamed some fields to be clearer - cohenKappa replaces kappa and informedness replaces bookmaker. Bookmaker is the algorithm/method used to derive the multiclass measure of informedness, markedness and correlation.
Whitespace between one pair (of half a dozen) paragraphs stolen yet again.

readded "Evaluation Statistics" to title
added eps to allow for degenerate matrices (missing label or class)

cleans up, and documents additional features relating to kappa, correlation and significance; adds example & figure