Use Ground Truth for Training AI Models

Preprocess, augment, and split ground truth data for training and evaluating AI models

Labeled ground truth data is essential for training supervised AI models across a wide range of computer vision tasks, including object detection, semantic segmentation, image classification, and video activity recognition. Computer Vision Toolbox™ provides tools to help you prepare labeled ground truth data for deep learning training by selecting relevant labels, modifying file paths, merging ground truth objects, and organizing data sets for training and evaluation.

The Image Labeler and Video Labeler apps export labeled ground truth data as a groundTruth object. To convert labeled ground truth into training data sets compatible with AI models, use objectDetectorTrainingData for object detection, pixelLabelTrainingData for semantic segmentation, and sceneLabelTrainingData for scene classification. For more information, see Training Data for Object Detection and Semantic Segmentation and Postprocess Exported Labels for Instance Segmentation Training. You can also use polyToBlockedImage to create a labeled blockedImage object from a set of ROIs, which enables you to process large images efficiently.
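
As a minimal sketch of the conversion step, the following code builds a small groundTruth object programmatically and passes it to objectDetectorTrainingData. It assumes that the sample file stopSignsAndCars.mat and its images are available on the path with Computer Vision Toolbox. The label names and bounding box values are illustrative, not exported from a labeling session. In practice, you would typically load a groundTruth object exported from one of the labeler apps instead.

```matlab
% Assumption: stopSignsAndCars.mat and its images ship with the toolbox.
data = load('stopSignsAndCars.mat');
imageFilenames = data.stopSignsAndCars.imageFilename(1:2);
imageFilenames = fullfile(toolboxdir('vision'),'visiondata',imageFilenames);
dataSource = groundTruthDataSource(imageFilenames);

% Define two rectangle labels (names chosen for illustration).
ldc = labelDefinitionCreator();
addLabel(ldc,'stopSign',labelType.Rectangle);
addLabel(ldc,'carRear',labelType.Rectangle);
labelDefs = create(ldc);

% Illustrative bounding boxes [x y width height], one row per image.
stopSignTruth = {[856 318 39 41]; [445 523 52 54]};
carRearTruth  = {[398 378 315 210]; [332 633 691 287]};
labelData = table(stopSignTruth,carRearTruth, ...
    'VariableNames',{'stopSign','carRear'});

gTruth = groundTruth(dataSource,labelDefs,labelData);

% Convert the ground truth into an image datastore and a box label
% datastore, then pair them so each read returns an image with its boxes.
[imds,blds] = objectDetectorTrainingData(gTruth);
trainingData = combine(imds,blds);
```

The combined datastore is the form that detector training functions commonly accept. Calling objectDetectorTrainingData with a single output returns a table instead.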

To filter and organize annotations based on task requirements, select specific labels from ground truth data using selectLabelsByGroup, selectLabelsByType, or selectLabelsByName. The toolbox also supports postprocessing of labeled data: use merge to combine multiple ground truth objects, changeFilePaths to update the file paths that the ground truth references, and gatherLabelData to extract label information. For video data, writeVideoScenes and sceneTimeRanges help you manage scene-level annotations.
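
The label selection functions can be sketched as follows. The example assumes the same sample data (stopSignsAndCars.mat, shipped with Computer Vision Toolbox) and uses illustrative label names and box values. It is not a definitive workflow, only a small demonstration of selecting by name and by type.

```matlab
% Assumption: stopSignsAndCars.mat and its images ship with the toolbox.
data = load('stopSignsAndCars.mat');
imageFilenames = data.stopSignsAndCars.imageFilename(1:2);
imageFilenames = fullfile(toolboxdir('vision'),'visiondata',imageFilenames);
dataSource = groundTruthDataSource(imageFilenames);

% Two rectangle labels with illustrative names and box values.
ldc = labelDefinitionCreator();
addLabel(ldc,'stopSign',labelType.Rectangle);
addLabel(ldc,'carRear',labelType.Rectangle);
labelDefs = create(ldc);
stopSignTruth = {[856 318 39 41]; [445 523 52 54]};
carRearTruth  = {[398 378 315 210]; [332 633 691 287]};
labelData = table(stopSignTruth,carRearTruth, ...
    'VariableNames',{'stopSign','carRear'});
gTruth = groundTruth(dataSource,labelDefs,labelData);

% Keep only the stop sign label.
gTruthStopSigns = selectLabelsByName(gTruth,'stopSign');

% Keep every label of rectangle type (both labels in this sketch).
gTruthRectangles = selectLabelsByType(gTruth,labelType.Rectangle);
```

Each call returns a new groundTruth object and leaves the original unchanged, so you can derive several task-specific subsets from one labeling session.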

To share and review image labels with colleagues, consider creating a team project within the Image Labeler app. For more details, see Get Started with Team-Based Labeling.

Apps

Image Labeler: Label images for computer vision applications
Video Labeler: Label video for computer vision applications

Functions

selectLabelsByGroup: Select ground truth labels by label group
selectLabelsByType: Select ground truth labels by label type
selectLabelsByName: Select ground truth labels by label name

Store Labeled Ground Truth

groundTruth: Ground truth label data
pixelLabelDatastore: Datastore for pixel label data
boxLabelDatastore: Datastore for bounding box label data

Postprocess Labeled Ground Truth Object

merge: Merge two or more ground truth objects (Since R2023b)
changeFilePaths: Change file paths in ground truth data
writeVideoScenes: Write video sequence to video file (Since R2021b)
sceneTimeRanges: Time ranges of scene labels from ground truth data (Since R2021b)
gatherLabelData: Gather label data from ground truth

Postprocess Label Datastores

combine: Combine data from multiple datastores
transform: Transform datastore
splitlabels: Find indices to split labels according to specified proportions
countlabels: Count number of unique labels
countEachLabel: Count occurrence of pixel or box labels
folders2labels: Get list of labels from folder names
objectDetectorTrainingData: Create training data for an object detector
pixelLabelTrainingData: Create training data for semantic segmentation from ground truth
sceneLabelTrainingData: Create training data for scene classification from ground truth (Since R2022b)
polyToBlockedImage: Create labeled blockedImage object from set of ROIs (Since R2021b)
attributeType: Attribute type enumerations for labeling
labelType: Label type enumerations for labeling

Topics

Featured Examples