Distribution-balanced stratified cross-validation

Version 1.1.0 (1.9 KB) by Jan Motl
An improvement to stratified cross-validation for small imbalanced data sets.
116 Downloads
Updated 12 Oct 2019

Distribution optimally balanced stratified cross-validation (DOB-SCV) partitions a data set into n folds in such a way that a balanced distribution in feature space is maintained for each class, in addition to stratification based on the label.
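
The fold assignment can be illustrated with a minimal Python sketch, assuming the procedure from the paper referenced on this page: within each class, pick a random unassigned sample, find its nearest unassigned neighbours of the same class, and place each of them in a different fold. This is not the MATLAB code shipped with this submission; the function name `dob_scv_folds`, its parameters, and the use of Euclidean distance are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def dob_scv_folds(X, y, n_folds, seed=0):
    """Return a fold index (0..n_folds-1) for every sample.

    X is a list of numeric feature vectors, y a list of class labels.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    fold_of = [None] * len(X)

    # Group sample indices by class label (the stratification step).
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    # Sorting the labels keeps runs reproducible; assumes labels are orderable.
    for label in sorted(by_class):
        unassigned = set(by_class[label])
        while unassigned:
            # Pick a random unassigned sample of this class.
            start = rng.choice(sorted(unassigned))
            # Its nearest unassigned same-class neighbours, closest first
            # (Euclidean distance, ties broken by index).
            neighbours = sorted(
                unassigned - {start},
                key=lambda j: (math.dist(X[start], X[j]), j),
            )[: n_folds - 1]
            # Spread the sample and its neighbours over distinct folds.
            for fold, idx in enumerate([start] + neighbours):
                fold_of[idx] = fold
                unassigned.discard(idx)
    return fold_of
```

Calling `dob_scv_folds(X, y, n_folds=2)` on a class whose samples form tight pairs places the two members of each pair in different folds, so every fold sees each region of the feature space. When a class size is not a multiple of `n_folds`, the leftover samples go to the lowest-numbered folds in this sketch.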

In practice, replacing ordinary stratified cross-validation with DOB-SCV yields slightly higher testing accuracy. The largest improvements can be expected on small, class-imbalanced data sets.

The implementation can be used as a drop-in replacement for CVPARTITION.

Reference: "Study on the Impact of Partition-Induced Dataset Shift on k-Fold Cross-Validation", available from https://ieeexplore.ieee.org/document/6226477

Cite As

Jan Motl (2024). Distribution-balanced stratified cross-validation (https://www.mathworks.com/matlabcentral/fileexchange/72963-distribution-balanced-stratified-cross-validation), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2018a; compatible with any release

Platform Compatibility
Windows, macOS, Linux

Version History
1.1.0: Speed up
1.0.0