Detect objects in monocular camera using Faster R-CNN deep learning detector


The fasterRCNNObjectDetectorMonoCamera object contains information about a Faster R-CNN (regions with convolutional neural networks) object detector that is configured for use with a monocular camera sensor. To detect objects in an image that was captured by the camera, pass the detector to the detect function.

When using the detect function with fasterRCNNObjectDetectorMonoCamera, use of a CUDA® enabled NVIDIA® GPU with a compute capability of 3.0 or higher is highly recommended. The GPU reduces computation time significantly. Usage of the GPU requires Parallel Computing Toolbox™.
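As a minimal sketch of choosing the hardware at run time, the snippet below selects an execution environment depending on whether a supported GPU is available. It assumes MATLAB R2019b or later for canUseGPU, and it assumes that detect accepts an 'ExecutionEnvironment' name-value argument for this detector. The names env, configuredDetector, and I are illustrative.

```matlab
% Pick an execution environment based on GPU availability.
% canUseGPU returns false when no supported GPU (or no Parallel
% Computing Toolbox) is available.
if canUseGPU
    env = 'gpu';
else
    env = 'cpu';
end
fprintf('Running detection on: %s\n',env);

% Assumed usage with a configured detector and an image I (not run here):
% [bboxes,scores] = detect(configuredDetector,I,'ExecutionEnvironment',env);
```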


Creation

  1. Create a fasterRCNNObjectDetector (Computer Vision Toolbox) object by calling the trainFasterRCNNObjectDetector (Computer Vision Toolbox) function with training data (requires Deep Learning Toolbox™).

    detector = trainFasterRCNNObjectDetector(trainingData,...);

    Alternatively, create a pretrained detector by using the vehicleDetectorFasterRCNN function.

  2. Create a monoCamera object to model the monocular camera sensor.

    sensor = monoCamera(...);

  3. Create a fasterRCNNObjectDetectorMonoCamera object by passing the detector and sensor as inputs to the configureDetectorMonoCamera function. The configured detector inherits property values from the original detector.

    configuredDetector = configureDetectorMonoCamera(detector,sensor,...);


Properties

ModelName

This property is read-only.

Name of the classification model, specified as a character vector or string scalar. By default, the name is set to the heading of the second column of the trainingData table specified in the trainFasterRCNNObjectDetector (Computer Vision Toolbox) function. You can modify this name after creating your fasterRCNNObjectDetectorMonoCamera object.

Network

This property is read-only.

Trained Faster R-CNN object detection network, specified as a DAGNetwork (Deep Learning Toolbox) object. This object stores the layers that define the convolutional neural network used within the Faster R-CNN detector.

AnchorBoxes

This property is read-only.

Size of anchor boxes, specified as an M-by-2 matrix, where each row is in the format [height width]. This value is set during training.
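To make the [height width] row format concrete, here is a small sketch that uses hypothetical anchor box values (not taken from any trained detector) and derives the aspect ratio and area of each box.

```matlab
% Hypothetical anchor boxes, one per row, in [height width] format (pixels).
anchorBoxes = [32 32; 64 48; 96 128];

heights = anchorBoxes(:,1);
widths  = anchorBoxes(:,2);

% Aspect ratio (width/height) and area of each anchor box.
aspectRatios = widths ./ heights;
areas = heights .* widths;

disp(table(heights,widths,aspectRatios,areas));
```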

ClassNames

This property is read-only.

Names of the object classes that the Faster R-CNN detector was trained to find, specified as a cell array. This property is set by the trainingData input argument for the trainFasterRCNNObjectDetector (Computer Vision Toolbox) function. Specify the class names as part of the trainingData table.

MinObjectSize

This property is read-only.

Minimum object size supported by the Faster R-CNN network, specified as a [height width] vector. The minimum size depends on the network architecture.

Camera

This property is read-only.

Camera configuration, specified as a monoCamera object. The object contains the camera intrinsics, the location, the pitch, yaw, and roll placement, and the world units for the parameters. Use the intrinsics to transform the object points in the image to world coordinates, which you can then compare to the values in the WorldObjectSize property.
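The following sketch illustrates the kind of image-to-world transform described above. It is not the detector's internal implementation. It assumes a hypothetical bounding box whose bottom edge touches the road surface and uses the imageToVehicle function of monoCamera to estimate the box width in world units. The camera values are the ones used in the example on this page.

```matlab
% Camera model (values taken from the example on this page).
focalLength = [309.4362 344.2161];    % [fx fy]
principalPoint = [318.9034 257.5352]; % [cx cy]
imageSize = [480 640];                % [mrows ncols]
intrinsics = cameraIntrinsics(focalLength,principalPoint,imageSize);
sensor = monoCamera(intrinsics,2.1798,'Pitch',14);

% Hypothetical bounding box in [x y width height] format (pixels).
bbox = [280 300 90 70];

% Bottom-left and bottom-right corners, assumed to lie on the road surface.
bottomCorners = [bbox(1)          bbox(2)+bbox(4);
                 bbox(1)+bbox(3)  bbox(2)+bbox(4)];

% Convert the image points to vehicle coordinates [x y], in world units.
vehiclePoints = imageToVehicle(sensor,bottomCorners);

% Lateral extent of the box on the ground plane.
estimatedWidth = abs(vehiclePoints(1,2) - vehiclePoints(2,2));
fprintf('Estimated object width: %.2f (world units)\n',estimatedWidth);
```

An estimate like this is what you would compare against the range stored in WorldObjectSize.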

WorldObjectSize

Range of object widths and lengths in world units, specified as a 1-by-2 vector of the form [minWidth maxWidth] or a 2-by-2 matrix of the form [minWidth maxWidth; minLength maxLength]. Specifying the range of object lengths is optional.
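As a short illustration of the two accepted shapes, the sketch below builds a width-only range and a width-and-length range, then applies the width range to some hypothetical size estimates. The length range and the estimated widths are made-up values, and the filtering step only mimics the idea of a size constraint. It does not reproduce the detector's actual logic.

```matlab
% Width range only: [minWidth maxWidth], in world units.
widthOnly = [1.5 2.5];

% Width and length ranges: [minWidth maxWidth; minLength maxLength].
% The length range is a hypothetical value chosen for illustration.
widthAndLength = [1.5 2.5; 3.5 6.0];

% Hypothetical estimated widths of candidate objects, in world units.
estimatedWidths = [0.8 1.9 2.2 3.4];

% Keep candidates whose width lies within the allowed range.
inRange = estimatedWidths >= widthAndLength(1,1) & ...
          estimatedWidths <= widthAndLength(1,2);
disp(estimatedWidths(inRange));
```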

Object Functions

detect: Detect objects using Faster R-CNN object detector configured for monocular camera


Examples

Configure a Faster R-CNN object detector for use with a monocular camera mounted on an ego vehicle. Use this detector to detect vehicles within an image captured by the camera.

Load a fasterRCNNObjectDetector object pretrained to detect vehicles.

detector = vehicleDetectorFasterRCNN;

Model a monocular camera sensor by creating a monoCamera object. This object contains the camera intrinsics and the location of the camera on the ego vehicle.

focalLength = [309.4362 344.2161];    % [fx fy]
principalPoint = [318.9034 257.5352]; % [cx cy]
imageSize = [480 640];                % [mrows ncols]
height = 2.1798;                      % height of camera above ground, in meters
pitch = 14;                           % pitch of camera, in degrees
intrinsics = cameraIntrinsics(focalLength,principalPoint,imageSize);

monCam = monoCamera(intrinsics,height,'Pitch',pitch);

Configure the detector for use with the camera. Limit the width of detected objects to a typical range for vehicle widths: 1.5–2.5 meters. The configured detector is a fasterRCNNObjectDetectorMonoCamera object.

vehicleWidth = [1.5 2.5];
detectorMonoCam = configureDetectorMonoCamera(detector,monCam,vehicleWidth);

Read in an image captured by the camera.

I = imread('carsinfront.png');

Detect the vehicles in the image by using the detector. Annotate the image with the bounding boxes for the detections and the detection confidence scores.

[bboxes,scores] = detect(detectorMonoCam,I);
I = insertObjectAnnotation(I,'rectangle',bboxes,scores,'Color','g');
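If you want to keep only the more confident detections before annotating, one possible post-processing step is sketched below. The boxes, scores, and the 0.5 cutoff are hypothetical stand-ins for the outputs of detect, so the snippet runs without any toolbox. The commented lines at the end show the assumed follow-up with insertObjectAnnotation and imshow.

```matlab
% Hypothetical detections: boxes in [x y width height] format, one score per box.
bboxes = [100 200 80 60; 300 210 90 70; 500 190 40 30];
scores = [0.95; 0.62; 0.31];

minScore = 0.5;  % assumed cutoff, chosen for illustration

% Keep only detections at or above the cutoff.
keep = scores >= minScore;
strongBoxes = bboxes(keep,:);
strongScores = scores(keep);

% Build one text label per remaining detection (cell array of char vectors).
labels = compose('vehicle: %.2f',strongScores);
disp(labels);

% With Computer Vision Toolbox and an image I (not run here):
% I = insertObjectAnnotation(I,'rectangle',strongBoxes,labels,'Color','g');
% imshow(I)
```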

Introduced in R2017a