Description

What Are Convolutional Neural Networks? | Introduction to Deep Learning

From the series: Introduction to Deep Learning

Explore the basics behind convolutional neural networks (CNNs) in this MATLAB^® Tech Talk. Broadly, convolutional neural networks are a common deep learning architecture – but what exactly is a CNN? This video breaks down this sometimes complicated concept into easy-to-understand parts. You’ll learn about 3 concepts: local receptive fields, shared weights and biases, and activation and pooling.

The video pulls together these three concepts and shows you how to configure the layers in a convolutional neural network.

You’ll also learn about the 3 ways to train convolutional neural networks for image analysis. These include: 1.) Training the model from scratch; 2.) Using transfer learning (based on the idea that you can use knowledge of one type of problem to solve a similar problem); 3.) Using a pretrained CNN to extract features for training a machine learning model.

Learn more about using MATLAB for deep learning.

Recorded: 24 Mar 2017

Full Transcript

A convolutional neural network, or CNN, is a network architecture for deep learning. It learns directly from images. A CNN is made up of several layers that process and transform an input to produce an output.

You can train a CNN to do image analysis tasks, including scene classification, object detection and segmentation, and image processing. In order to understand how CNNs work, we'll cover three key concepts: local receptive fields, shared weights and biases, and activation and pooling.

Finally, we'll briefly discuss the three ways to train CNNs for image analysis.

So let's start with the concept of local receptive fields. In a typical neural network, each neuron in the input layer is connected to a neuron in the hidden layer. However, in a CNN, only a small region of input layer neurons connects to neurons in the hidden layer. These regions are referred to as local receptive fields.

The local receptive field is translated across an image to create a feature map from the input layer to the hidden layer neurons. You can use convolution to implement this process efficiently. That's why it is called a convolutional neural network. The second concept we'll discuss is about shared weights and biases.

Like a typical neural network, a CNN has neurons with weights and biases. The model learns these values during the training process, and it continuously updates them with each new training example. However, in the case of CNNs, the weights and bias values are the same for all hidden neurons in a given layer.

This means that all hidden neurons are detecting the same feature, such as an edge or a blob, in different regions of the image. This makes the network tolerant to translation of objects in an image. For example, a network trained to recognize cats will be able to do so whenever the cat is in the image.

Our third and final concept is activation and pooling. The activation step applies a transformation to the output of each neuron by using activation functions. Rectified linear unit, or ReLU, is an example of a commonly used activation function. It takes the output of a neuron and maps it to the highest positive value.

Or if the output is negative, the function maps it to zero. You can further transform the output of the activation step by applying a pooling step. Pooling reduces the dimensionality of the featured map by condensing the output of small regions of neurons into a single output. This helps simplify the following layers and reduces the number of parameters that the model needs to learn.

Now let's pull it all together. Using these three concepts, we can configure the layers in a CNN. A CNN can have tens or hundreds of hidden layers that each learn to detect different features in an image. In this feature map, we can see that every hidden layer increases the complexity of the learned image features.

For example, the first hidden layer learns how to detect edges, and the last learns how to detect more complex shapes. Just like in a typical neural network, the final layer connects every neuron, from the last hidden layer to the output neurons. This produces the final output. There are three ways to use CNNs for image analysis.

The first method is to train the CNN from scratch. This method is highly accurate, although it is also the most challenging, as you might need hundreds of thousands of labeled images and significant computational resources.

The second method relies on transfer learning, which is based on the idea that you can use knowledge of one type of problem to solve a similar problem. For example, you could use a CNN model that has been trained to recognize animals to initialize and train a new model that differentiates between cars and trucks.

This method requires less data and fewer computational resources than the first. With the third method, you can use a pre-trained CNN to extract features for training a machine learning model. For example, a hidden layer that has learned how to detect edges in an image is broadly relevant to images from many different domains. This method requires the least amount of data and computational resources.

I hope you found this video useful. For more information, visit MathWorks.com/deep-learning.

Related Resources

Related Products

Learn More

Deep Learning with MATLAB (5 videos)

Deep Learning with MATLAB (Ebook)

FREE EBOOK

Introducing Deep Learning with MATLAB

Download ebook

Featured Product

Deep Learning Toolbox

Related Videos:

Virtual engineering technology has undergone rapid progress in recent years and has been widely accepted for commercial product development. Product design and manufacturing organizations are moving from the traditional multiple and serial test cycle — Optimal Neural Network for Automotive Product Development

MathWorks and Brainstorm engineers will demonstrate the essential tools offered by Brainstorm to analyse and visualize multidimensional, complex datasets obtained from electrophysiological recordings, with an emphasis on functional brain imaging. We — Brainstorm: Imaging Neural Activity at the Speed of Brain...

Discover how to use <code>configureKalmanFilter</code> and <code>vision.KalmanFilter</code> to track a moving object in video. Learn how to handle the challenges of inaccurate or missing object detection while keeping track of its location in video. — Introduction to Kalman Filters for Object Tracking

Engineers complete communication payload design simulation and hardware validation and verification with Simulink. — Indian Space Research Organization Simulates Hybrid...

Learn how MATLAB addresses common challenges encountered while developing object recognition systems and see new capabilities for deep learning, machine learning, and computer vision. — Deep Learning for Computer Vision

View more related videos