
Sample Data Sets for Shallow Neural Networks

Deep Learning Toolbox™ includes several sample data sets that you can use to experiment with shallow neural networks. To list the available data sets, use the following command:

help nndatasets
  Neural Network Datasets
  -----------------------
 
  Function Fitting, Function approximation and Curve fitting.
 
  Function fitting is the process of training a neural network on a
  set of inputs in order to produce an associated set of target outputs.
  Once the neural network has fit the data, it forms a generalization of
  the input-output relationship and can be used to generate outputs for
  inputs it was not trained on.
 
   simplefit_dataset     - Simple fitting dataset.
   abalone_dataset       - Abalone shell rings dataset.
   bodyfat_dataset       - Body fat percentage dataset.
   building_dataset      - Building energy dataset.
   chemical_dataset      - Chemical sensor dataset.
   cho_dataset           - Cholesterol dataset.
   engine_dataset        - Engine behavior dataset.
   vinyl_dataset         - Vinyl bromide dataset.
 
  ----------
 
  Pattern Recognition and Classification
 
  Pattern recognition is the process of training a neural network to assign
  the correct target classes to a set of input patterns.  Once trained the
  network can be used to classify patterns it has not seen before.
 
   simpleclass_dataset     - Simple pattern recognition dataset.
   cancer_dataset          - Breast cancer dataset.
   crab_dataset            - Crab gender dataset.
   glass_dataset           - Glass chemical dataset.
   iris_dataset            - Iris flower dataset.
   ovarian_dataset         - Ovarian cancer dataset.
   thyroid_dataset         - Thyroid function dataset.
   wine_dataset            - Italian wines dataset.
   digitTrain4DArrayData   - Synthetic handwritten digit dataset for
                             training in form of 4-D array.
   digitTrainCellArrayData - Synthetic handwritten digit dataset for
                             training in form of cell array.
   digitTest4DArrayData    - Synthetic handwritten digit dataset for
                             testing in form of 4-D array.
   digitTestCellArrayData  - Synthetic handwritten digit dataset for
                             testing in form of cell array.
   digitSmallCellArrayData - Subset of the synthetic handwritten digit 
                             dataset for training in form of cell array.
 
  ----------
 
  Clustering, Feature extraction and Data dimension reduction
 
  Clustering is the process of training a neural network on patterns
  so that the network comes up with its own classifications according
  to pattern similarity and relative topology.  This is useful for gaining
  insight into data, or simplifying it before further processing.
 
   simplecluster_dataset - Simple clustering dataset.
  
  The inputs of fitting or pattern recognition datasets may also be clustered.
 
  ----------
 
  Input-Output Time-Series Prediction, Forecasting, Dynamic modeling
  Nonlinear autoregression, System identification and Filtering
 
  Input-output time series problems consist of predicting the next value
  of one time series given another time series. Past values of both series
  (for best accuracy), or only one of the series (for a simpler system)
  may be used to predict the target series.
 
   simpleseries_dataset  - Simple time series prediction dataset.
   simplenarx_dataset    - Simple time series prediction dataset.
   exchanger_dataset     - Heat exchanger dataset.
   maglev_dataset        - Magnetic levitation dataset.
   ph_dataset            - Solution PH dataset.
   pollution_dataset     - Pollution mortality dataset.
   refmodel_dataset      - Reference model dataset
   robotarm_dataset      - Robot arm dataset
   valve_dataset         - Valve fluid flow dataset.
 
  ----------
 
  Single Time-Series Prediction, Forecasting, Dynamic modeling,
  Nonlinear autoregression, System identification, and Filtering
 
  Single time series prediction involves predicting the next value of
  a time series given its past values.
 
   simplenar_dataset     - Simple single series prediction dataset.
   chickenpox_dataset    - Monthly chickenpox instances dataset.
   ice_dataset           - Global ice volume dataset.
   laser_dataset         - Chaotic far-infrared laser dataset.
   oil_dataset           - Monthly oil price dataset.
   river_dataset         - River flow dataset.
   solar_dataset         - Sunspot activity dataset

Notice that most of the data sets have file names of the form name_dataset; the digit data sets, such as digitTrain4DArrayData, are the exception. Inside the name_dataset files are the arrays nameInputs and nameTargets. You can load a data set into the workspace with a command such as

load simplefit_dataset

This loads simplefitInputs and simplefitTargets into the workspace. To assign the input and target arrays to variable names of your choice, call the data set as a function with a command such as

[x,t] = simplefit_dataset;

This assigns the inputs and targets to the arrays x and t. You can get a description of a data set with a command such as

help maglev_dataset
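
The two loading forms described above can be compared side by side, and the loaded arrays passed to a shallow network. The following is a minimal sketch, not a recommended workflow: it assumes the standard toolbox functions fitnet, train, and perform, and the hidden layer size of 10 is an arbitrary choice made only for illustration.

```matlab
% Form 1: load the data set; the variables keep their built-in names.
load simplefit_dataset
whos simplefitInputs simplefitTargets

% Form 2: call the data set as a function to choose your own names.
[x,t] = simplefit_dataset;

% Both forms are expected to yield the same arrays.
sameData = isequal(x,simplefitInputs) && isequal(t,simplefitTargets)

% Fit a small network to the data (10 hidden neurons is an assumption).
net = fitnet(10);
net.trainParam.showWindow = false;   % suppress the training window
net = train(net,x,t);

% Evaluate the trained network on the same inputs.
y = net(x);
mseValue = perform(net,t,y)
```

Because train initializes the weights and divides the data randomly, the reported error varies from run to run.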
