Use Ground Truth for Training AI Models

Preprocess, augment, and split ground truth data for training and evaluating AI models

Labeled ground truth data is essential for training supervised AI models across a wide range of computer vision tasks, including object detection, semantic segmentation, image classification, and video activity recognition. Computer Vision Toolbox™ provides tools to help you prepare labeled ground truth data for deep learning training by selecting relevant labels, modifying file paths, merging ground truth objects, and organizing data sets for training and evaluation.

The Image Labeler and Video Labeler apps export labeled ground truth data as a groundTruth object. To convert labeled ground truth into training data sets in formats compatible with AI models, use functions such as objectDetectorTrainingData, pixelLabelTrainingData, and sceneLabelTrainingData, which support object detection, semantic segmentation, and scene classification tasks, respectively. For more information, see Training Data for Object Detection and Semantic Segmentation and Postprocess Exported Labels for Instance Segmentation Training. You can also create labeled blocked image representations from ROIs using polyToBlockedImage, which enables you to process large images efficiently.
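
As a minimal sketch of this conversion, the following code creates object detection training data from a groundTruth object. It assumes that the sample file stopSignsAndCarsGroundTruth.mat and the stopSignImages folder that ship with Computer Vision Toolbox are available in your installation. With your own data, replace gTruth with the object exported from a labeler app.

```matlab
% Assumption: the toolbox sample images and ground truth file are installed.
imageDir = fullfile(matlabroot,'toolbox','vision','visiondata','stopSignImages');
addpath(imageDir);
load('stopSignsAndCarsGroundTruth.mat','stopSignsAndCarsGroundTruth');
gTruth = stopSignsAndCarsGroundTruth;

% Convert the rectangle labels into an image datastore and a box label datastore.
[imds,blds] = objectDetectorTrainingData(gTruth);

% Pair each image with its boxes and labels in a single datastore.
trainingData = combine(imds,blds);
```

You can typically pass a combined datastore of this kind to an object detector training function.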

To filter and organize annotations based on task requirements, select specific labels from ground truth data using selectLabelsByGroup, selectLabelsByType, or selectLabelsByName. The toolbox also supports postprocessing of labeled data with functions such as merge, which combines multiple ground truth objects, changeFilePaths, which updates file paths in ground truth data, and gatherLabelData, which extracts label information. For video data, writeVideoScenes and sceneTimeRanges help you manage scene-level annotations.
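
The following sketch shows one way to select labels by name and by type. It assumes that the toolbox sample file stopSignsAndCarsGroundTruth.mat is installed and that it contains a rectangle label named stopSign. Substitute your own groundTruth object and label names as needed.

```matlab
% Assumption: the toolbox sample images and ground truth file are installed.
imageDir = fullfile(matlabroot,'toolbox','vision','visiondata','stopSignImages');
addpath(imageDir);
load('stopSignsAndCarsGroundTruth.mat','stopSignsAndCarsGroundTruth');
gTruth = stopSignsAndCarsGroundTruth;

% Keep only the label named stopSign (label name assumed from the sample data).
stopSignGT = selectLabelsByName(gTruth,'stopSign');

% Keep only labels drawn as rectangles, regardless of name.
rectangleGT = selectLabelsByType(gTruth,labelType.Rectangle);
```

Each call returns a new groundTruth object and leaves the original unchanged, so you can derive several task-specific subsets from one labeling session.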

To share and review image labels with colleagues, consider creating a team project within the Image Labeler app. For more details, see Get Started with Team-Based Labeling.

Apps

Image Labeler	Label images for computer vision applications
Video Labeler	Label video for computer vision applications

Functions


Select Labels

`selectLabelsByGroup`	Select ground truth labels by label group
`selectLabelsByType`	Select ground truth labels by label type
`selectLabelsByName`	Select ground truth labels by label name

Store and Postprocess Labels

Store Labeled Ground Truth

`groundTruth`	Ground truth label data
`pixelLabelDatastore`	Datastore for pixel label data
`boxLabelDatastore`	Datastore for bounding box label data

Postprocess Labeled Ground Truth Object

`merge`	Merge two or more ground truth objects (Since R2023b)
`changeFilePaths`	Change file paths in ground truth data
`writeVideoScenes`	Write video sequence to video file (Since R2021b)
`sceneTimeRanges`	Time ranges of scene labels from ground truth data (Since R2021b)
`gatherLabelData`	Gather label data from ground truth

Postprocess Label Datastores

`combine`	Combine data from multiple datastores
`transform`	Transform datastore
`splitlabels`	Find indices to split labels according to specified proportions
`countlabels`	Count number of unique labels
`countEachLabel`	Count occurrence of pixel or box labels
`folders2labels`	Get list of labels from folder names

Create Training Data

`objectDetectorTrainingData`	Create training data for an object detector
`pixelLabelTrainingData`	Create training data for semantic segmentation from ground truth
`sceneLabelTrainingData`	Create training data for scene classification from ground truth (Since R2022b)
`polyToBlockedImage`	Create labeled `blockedImage` object from set of ROIs (Since R2021b)

Enumerate Attribute and Label Types

`attributeType`	Attribute type enumerations for labeling
`labelType`	Label type enumerations for labeling

Topics

Elements of Ground Truth Objects
Understand how to save and pass data using a ground truth data object.
Share and Store Labeled Ground Truth Data
Share and store labeled ground truth data exported from labeling apps.
How Labeler Apps Store Exported Pixel Labels
Learn how the labeling apps store pixel label data.
Training Data for Object Detection and Semantic Segmentation
Create training data for object detection or semantic segmentation using the Image Labeler or Video Labeler.
Postprocess Exported Labels for Instance Segmentation Training
Postprocess exported ground truth labels and create training datastore for training instance segmentation networks such as SOLOv2 or Mask R-CNN.
Datastores for Deep Learning (Deep Learning Toolbox)
Learn how to use datastores in deep learning applications.
Get Started with Image Preprocessing and Augmentation for Deep Learning
Preprocess data for deep learning applications with deterministic operations such as resizing, or augment training data with randomized operations such as random cropping.

Featured Examples

Create Instance Segmentation Training Data From Ground Truth

Create instance segmentation training data from a groundTruth object. To train a Mask R-CNN or SOLOv2 network using the trainMaskRCNN or trainSOLOv2 function, respectively, format your input data as a 1-by-4 cell array containing the RGB training image, bounding boxes, instance labels, and instance masks. Convert your polygon ground truth data into a set of binary instance masks and axis-aligned rectangles for training.
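
The conversion described above can be sketched for a single polygon as follows. This is an illustration under stated assumptions, not the code from the example: the polygon vertices, image size, placeholder image, and class name vehicle are all hypothetical, and poly2mask requires Image Processing Toolbox.

```matlab
% Hypothetical polygon ROI given as [x y] vertices in pixel coordinates.
polygon = [10 10; 40 10; 40 30; 10 30];
imageSize = [50 60];                       % [rows cols] of the hypothetical image
img = zeros([imageSize 3],'uint8');        % placeholder RGB image

% Rasterize the polygon into a binary instance mask.
mask = poly2mask(polygon(:,1),polygon(:,2),imageSize(1),imageSize(2));

% Axis-aligned rectangle, in [x y width height] form, that encloses the polygon.
xyMin = min(polygon,[],1);
xyMax = max(polygon,[],1);
bbox = [xyMin, xyMax - xyMin];

label = categorical("vehicle");            % hypothetical class name

% One training sample as a 1-by-4 cell array: image, boxes, labels, masks.
sample = {img,bbox,label,mask};
```

For an image with several instances, you would stack one box per row, one label per row, and one mask per page along the third dimension.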
