Updated Wed, 14 Sep 2016 22:47:38 +0000
The informedness of a prediction method, as captured by a contingency matrix, is defined as the probability that the prediction method will make a correct decision as opposed to guessing, and is calculated using the Bookmaker algorithm. The markedness is the informedness of the inverse problem, predicting the classifier labels from the real classes (that is, how well the real-world class marks the predictor). Their correlation is the geometric mean of the two, which are understood here as regression coefficients.
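For the two-class case these definitions reduce to simple formulas over the four cells of the contingency matrix: informedness is Recall + Inverse Recall - 1 and markedness is Precision + Inverse Precision - 1. The following Python sketch illustrates only that dichotomous case. It is not the MATLAB bm code and does not reproduce the multiclass Bookmaker calculation; the function name and the example counts are invented for illustration.

```python
import math


def binary_informedness_markedness(tp, fn, fp, tn):
    """Informedness, markedness and correlation for a 2x2 contingency table.

    tp/fn/fp/tn are counts of true positives, false negatives,
    false positives and true negatives (hypothetical argument names).
    """
    recall = tp / (tp + fn)             # real positives predicted positive
    inverse_recall = tn / (tn + fp)     # real negatives predicted negative
    precision = tp / (tp + fp)          # predicted positives that are real
    inverse_precision = tn / (tn + fn)  # predicted negatives that are real

    informedness = recall + inverse_recall - 1
    markedness = precision + inverse_precision - 1

    # Geometric mean of the two. In the 2x2 case both share the same sign,
    # so the product is non-negative up to rounding error.
    magnitude = math.sqrt(max(0.0, informedness * markedness))
    correlation = math.copysign(magnitude, informedness)
    return informedness, markedness, correlation


informedness, markedness, correlation = binary_informedness_markedness(
    tp=40, fn=10, fp=20, tn=30
)
print(informedness, markedness, correlation)
```

In the 2x2 case the correlation computed this way coincides with the Matthews (phi) correlation of the table.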
Alternative measures of the usefulness of a prediction method are all defective in at least one of three ways: they do not take into account all cells of the contingency matrix, they do not take into account the baseline performance due to chance/guessing, or they are concerned with significance or information rather than correctness. Significance tests and ROC tuning are complementary analyses that should also be performed and are provided (pearsonXsq --> px).
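The chance-baseline problem can be seen in a small Python sketch (again an illustration with made-up counts and a hypothetical function name, not part of the bm code). A predictor that always answers "positive" on data with 90% real positives scores well on accuracy, Recall, Precision and F1, yet its informedness is zero because it is only guessing.

```python
def chance_sensitive_scores(tp, fn, fp, tn):
    """Accuracy, recall, precision, F1 and informedness for a 2x2 table."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    recall = tp / (tp + fn)
    inverse_recall = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    informedness = recall + inverse_recall - 1
    return accuracy, recall, precision, f1, informedness


# Always predict "positive": 90 real positives, 10 real negatives.
names = ("accuracy", "recall", "precision", "F1", "informedness")
always_positive = chance_sensitive_scores(tp=90, fn=0, fp=10, tn=0)
for name, value in zip(names, always_positive):
    print(f"{name}: {value:.3f}")
```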
The Recall, Precision and Rank average are calculated for comparison, along with the F and G measures corresponding to their harmonic and geometric means. The geometric mean is more generally valid and mathematically justifiable, being the GM of the Lp and L-p means for any p, whereas the F1 score overemphasizes Rec/Prec differences and assumes a base distribution around the mean of the actual and predicted labels (see 5 below).
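The relationship between the F and G measures can be checked numerically. The sketch below is a Python illustration with hypothetical helper names, assuming "Lp" refers to the power mean of order p of Recall and Precision. It shows that the harmonic mean (F) falls further below the geometric mean (G) as Recall and Precision diverge, and that G equals the geometric mean of the order p and order -p power means for several values of p.

```python
import math


def power_mean(a, b, p):
    """Power (Lp) mean of two positive numbers for non-zero p."""
    return ((a ** p + b ** p) / 2) ** (1 / p)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


def g_measure(precision, recall):
    """Geometric mean of precision and recall."""
    return math.sqrt(precision * recall)


precision, recall = 0.9, 0.1
print("F:", f_measure(precision, recall))
print("G:", g_measure(precision, recall))

# G is the geometric mean of the order-p and order-(-p) power means.
for p in (1, 2, 3.5):
    paired = math.sqrt(power_mean(precision, recall, p)
                       * power_mean(precision, recall, -p))
    print(p, paired)
```

With p = 1 the pair is the arithmetic and harmonic means, so G is also the geometric mean of the arithmetic mean and F.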
The Kappa calculated is Cohen's Kappa, not Fleiss' Kappa (see 3 below). Fleiss' Kappa, like the F-measure, assumes that both Real and Predicted labels are drawn from the mean distribution. This may be reasonable in the context for which it was designed, but in classification the prediction distribution is produced with knowledge of, and the ability to exploit, the actual distribution in order to increase accuracy and kappa. Informedness and Markedness cannot be biased by changing the prediction distribution and are stable under changes in the real class distribution (e.g. when applying the method in a different demographic area). They are also related to ROC and AUC: choosing the optimal operating point with ROC maximizes Informedness, while maximizing AUC maximizes the ability to adapt to different demographics/distributions (see 4/4a below).
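The difference between the two kappas lies only in how chance agreement is estimated. The Python sketch below is a two-class illustration with hypothetical names, treating the real and predicted labels as two raters; it is not the bm implementation. Cohen's expected agreement uses the product of the real and predicted marginals, while the Fleiss-style estimate uses the square of their mean (which for two raters coincides with Scott's pi).

```python
def cohen_and_fleiss_kappa(tp, fn, fp, tn):
    """Cohen's and Fleiss-style kappa for a 2x2 contingency table."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n  # observed agreement (accuracy)

    # Marginal distributions of the real and the predicted labels.
    real_pos, real_neg = (tp + fn) / n, (fp + tn) / n
    pred_pos, pred_neg = (tp + fp) / n, (fn + tn) / n

    # Cohen: each side keeps its own distribution.
    cohen_expected = real_pos * pred_pos + real_neg * pred_neg

    # Fleiss-style: both sides are assumed drawn from the mean distribution.
    mean_pos = (real_pos + pred_pos) / 2
    mean_neg = (real_neg + pred_neg) / 2
    fleiss_expected = mean_pos ** 2 + mean_neg ** 2

    cohen = (observed - cohen_expected) / (1 - cohen_expected)
    fleiss = (observed - fleiss_expected) / (1 - fleiss_expected)
    return cohen, fleiss


cohen, fleiss = cohen_and_fleiss_kappa(tp=40, fn=10, fp=20, tn=30)
print(cohen, fleiss)
```

The two values agree when the real and predicted marginals are identical and drift apart as the prediction distribution moves away from the real one.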
For the original technical paper see www.infoeng.flinders.edu.au/papers/20030007.doc
For the original tutorial poster see www.infoeng.flinders.edu.au/papers/20030003.ppt
References - all by David MW Powers, developer of the multiclass Bookmaker informedness, markedness and correlation evaluation statistics.
1. Recall & Precision versus The Bookmaker
(2003:6pp) International Conference on Cognitive Science (ICCS), 529-534
2. Evaluation: from precision, recall and F-factor to ROC, informedness, markedness and correlation
(2007:13pp) Flinders University School of Informatics & Engineering TR SIE-07-001
2a. Evaluation: from precision, recall & F-measure to ROC, informedness, markedness & correlation
(2011:27pp) International Journal of Machine Learning Technology 2 (1), 37-63
2b. Evaluation Evaluation
(2008:2pp) Proceedings of the European Conference on Artificial Intelligence (ECAI)
2c. Evaluation Evaluation a Monte Carlo study
(2008:5pp) arXiv preprint arXiv:1504.00854 (longer submitted version of 2b, highlights of 2/2a)
3. The problem with kappa
(2012) Proceedings of the 13th Conference of the European Chapter of ACL (EACL)
4. The problem of area under the curve
(2012) International Conference on Information Science and Technology (ICIST)
4a. ROC-ConCert: ROC-Based Measurement of Consistency and Certainty
(2012) Spring Congress on Engineering and Technology (SCET), 2:238-241
5. What the F-measure doesn't measure: Features, Flaws, Fallacies and Fixes
(2015) arXiv preprint arXiv:1503.06410
6. Visualization of Tradeoff in Evaluation: from Precision-Recall & PN to LIFT, ROC & BIRD
(2015) arXiv preprint arXiv:1505.00401
David Powers (2023). bm(cm) (https://www.mathworks.com/matlabcentral/fileexchange/5648-bm-cm), MATLAB Central File Exchange. Retrieved .
Platform Compatibility: Windows, macOS, Linux
Inspired: paralled buddy prima
clarified references with standard acronyms for conferences (removed redundant dates)
readded "Evaluation Statistics" to title
cleans up, and documents additional features relating to kappa, correlation and significance; adds example & figure