Asked by Eli
on 23 Apr 2016

Hello. For teaching purposes, I am interested in using MATLAB to calculate the weights for a simple neural network classification problem, and then doing the classification myself using net1.IW, net1.LW, net1.b, etc. If I put the code below into an nn_bare.m file and run it, y_matlab and y_my are the same the first time, but y_my is different the next time I run it. Any hints as to why the behavior is not consistent? I tried disabling adaptation and adding a random number seed, but this did not help. Many thanks... E

%%data to be classified:

X=[2 7 0 7 4 3 9 1 8 4 10 6 4 5 2 3 5 8 3 4;

10 6 7 10 8 1 3 4 4 1 1 7 10 8 0 0 2 4 6 9];

y=[3 3 1 3 3 1 2 1 2 1 2 3 3 3 1 1 2 2 1 3];

%%for classification, turn labels into matrix format:

T=zeros(max(y),length(y)); for i=1:length(y); T(y(i),i)=1; end

rng('default'); % for reproducible results, as weights are initialized randomly

net1 = patternnet([5 5]);

net1.divideFcn=''; % don't divide data into training, testing, validation.

[net1,tr] = train(net1,X,T);

net1.adaptFcn=''; % don't change network during usage after training

%%use the trained network to classify a new point:

Xtest=[7 2]'

y_matlab=net1(Xtest)

%%Test matlab's classification manually:

a1 = tansig(net1.IW{1,1}*Xtest + net1.b{1});

a2 = tansig(net1.LW{2,1}*a1 + net1.b{2});

y_my = softmax(net1.LW{3,2}*a2 + net1.b{3});

y_my

Output:

>> nn_bare

Xtest =
     7
     2

y_matlab =
    0.0000
    1.0000
    0.0000

y_my =
    0.0000
    1.0000
    0.0000

>> nn_bare

Xtest =
     7
     2

y_matlab =
    0.0000
    1.0000
    0.0000

y_my =
    0.0000
    0.9307
    0.0693

>>

Answer by Greg Heath
on 24 Apr 2016

Accepted Answer

You did not take into account the default normalization of the inputs to the range [-1,1], and the corresponding un-normalization of the outputs from that range.

Type, without the ending semicolons:

net = patternnet

inputprocessFcns = net.input.processFcns

outputprocessFcns = net.output.processFcns
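For illustration, here is a minimal sketch (my addition, not code from the thread) of reproducing net1(Xtest) by hand for the patternnet([5 5]) in the question, applying the stored mapminmax settings to the input and reversing them on the output. The processSettings{2} index is an assumption based on the default {'removeconstantrows','mapminmax'} ordering:

% Minimal sketch: manual forward pass including the default pre/post-processing.
xs = net1.inputs{1}.processSettings{2};    % input mapminmax settings
ts = net1.outputs{end}.processSettings{2}; % output mapminmax settings
xn = mapminmax('apply', Xtest, xs);        % normalize the input to [-1,1]
a1 = tansig(net1.IW{1,1}*xn + net1.b{1});
a2 = tansig(net1.LW{2,1}*a1 + net1.b{2});
yn = softmax(net1.LW{3,2}*a2 + net1.b{3});
y_manual = mapminmax('reverse', yn, ts)    % un-normalize the output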

Hope this helps.

Thank you for formally accepting my answer

Greg

Eli
on 24 Apr 2016

Thanks, I can see that this would make a difference; here is the output. I am all set now: when I add net1.inputs{1}.processFcns = {}; net1.outputs{1}.processFcns = {}; things work just fine (a sketch of the placement follows the output below). I realize this is not optimal for a real application, but it is probably good enough for demonstration purposes. Thanks again for the helpful response!

>> net = patternnet
inputprocessFcns = net.input.processFcns
outputprocessFcns = net.output.processFcns

net =

    Neural Network

              name: 'Pattern Recognition Neural Network'
          userdata: (your custom info)

    dimensions:

         numInputs: 1
         numLayers: 2
        numOutputs: 1
    numInputDelays: 0
    numLayerDelays: 0
 numFeedbackDelays: 0
 numWeightElements: 10
        sampleTime: 1

    connections:

       biasConnect: [1; 1]
      inputConnect: [1; 0]
      layerConnect: [0 0; 1 0]
     outputConnect: [0 1]

    subobjects:

             input: Equivalent to inputs{1}
            output: Equivalent to outputs{2}

            inputs: {1x1 cell array of 1 input}
            layers: {2x1 cell array of 2 layers}
           outputs: {1x2 cell array of 1 output}
            biases: {2x1 cell array of 2 biases}
      inputWeights: {2x1 cell array of 1 weight}
      layerWeights: {2x2 cell array of 1 weight}

    functions:

          adaptFcn: 'adaptwb'
        adaptParam: (none)
          derivFcn: 'defaultderiv'
         divideFcn: 'dividerand'
       divideParam: .trainRatio, .valRatio, .testRatio
        divideMode: 'sample'
           initFcn: 'initlay'
        performFcn: 'crossentropy'
      performParam: .regularization, .normalization
          plotFcns: {'plotperform', 'plottrainstate', 'ploterrhist',
                     'plotconfusion', 'plotroc'}
        plotParams: {1x5 cell array of 5 params}
          trainFcn: 'trainscg'
        trainParam: .showWindow, .showCommandLine, .show, .epochs,
                    .time, .goal, .min_grad, .max_fail, .sigma,
                    .lambda

    weight and bias values:

                IW: {2x1 cell} containing 1 input weight matrix
                LW: {2x2 cell} containing 1 layer weight matrix
                 b: {2x1 cell} containing 2 bias vectors

    methods:

             adapt: Learn while in continuous use
         configure: Configure inputs & outputs
            gensim: Generate Simulink model
              init: Initialize weights & biases
           perform: Calculate performance
               sim: Evaluate network outputs given inputs
             train: Train network with examples
              view: View diagram
       unconfigure: Unconfigure inputs & outputs

          evaluate: outputs = net(inputs)

inputprocessFcns =

    'removeconstantrows'    'mapminmax'

outputprocessFcns =

    'removeconstantrows'    'mapminmax'
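A minimal sketch of where those lines go (my illustration; they must be set before train, because the processing settings are fixed when the network is configured, and outputs{end} is used here to address the output layer of the three-layer net):

net1 = patternnet([5 5]);
net1.inputs{1}.processFcns = {};    % no input normalization
net1.outputs{end}.processFcns = {}; % no output un-normalization
net1.divideFcn = '';
[net1,tr] = train(net1,X,T);        % now net1(Xtest) matches the manual pass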

Greg Heath
on 25 Apr 2016

There is nothing keeping you from doing the normalization/denormalization explicitly, before and after training.
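A minimal sketch of that approach (my illustration, not code from this thread): normalize explicitly, keep the mapminmax settings, and train with the built-in processing disabled.

[xn, xs] = mapminmax(X);             % normalize inputs to [-1,1], keep settings
net1 = patternnet([5 5]);
net1.inputs{1}.processFcns = {};     % built-in processing off (set before training)
net1.outputs{end}.processFcns = {};
net1.divideFcn = '';
[net1,tr] = train(net1, xn, T);
xtn = mapminmax('apply', Xtest, xs); % apply the same normalization at test time
y = net1(xtn)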

Hope this helps.

Greg

Greg Heath
on 26 Apr 2016

I ran your code multiple times but always got the same result, which differs from yours.

I also made comments that may be helpful.

GEH1 = ' Use lowercase for double variables (uppercase for cells)'

X=[2 7 0 7 4 3 9 1 8 4 10 6 4 5 2 3 5 8 3 4;

10 6 7 10 8 1 3 4 4 1 1 7 10 8 0 0 2 4 6 9];

y=[3 3 1 3 3 1 2 1 2 1 2 3 3 3 1 1 2 2 1 3];

GEH2 = ' [ I N ] = size(X) , [ O N ] = size(T)' % [2 20], [3 20]

% for classification, turn labels into matrix format:
T = zeros(max(y),length(y)); for i=1:length(y); T(y(i),i)=1; end

GEH3 = 'Use functions ind2vec and, later, vec2ind'
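For example (a sketch of GEH3; ind2vec returns a sparse matrix, hence the full):

T = full(ind2vec(y));      % 3x20 one-hot target matrix
labels = vec2ind(net1(X)); % (after training) outputs back to class indices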

rng('default'); % for reproducible results, as weights are initialized randomly
net1 = patternnet([5 5]);

GEH4 = 'One hidden layer is sufficient for a universal approximator'

net1.divideFcn=''; % don't divide data into training, testing, validation.

GEH5 = ' Number of training equations Ntrneq = N*O = 20*3 = 60'

GEH6 = [ ' Number of unknown weights Nw = ( I + 1 )*H1 + ' ...
         '( H1 + 1 )*H2 + ( H2 + 1 )*O = 15+30+18 = 63' ]

GEH7 = [ 'More unknowns (63) than equations (60) ==> ' ...
         'OVERFITTING ==> ' ...
         ' 1. Reduce the number of hidden nodes, or ' ...
         ' 2. use a validation set, or ' ...
         ' 3. use Bayesian regularization ' ]
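A sketch of remedy 3 (my addition, not Greg's code; trainbr is Jacobian-based, so the performance function is switched from patternnet's default crossentropy to mse, which trainbr requires):

net2 = patternnet(5, 'trainbr'); % also remedy 1: one smaller hidden layer
net2.performFcn = 'mse';         % trainbr needs mse/sse performance
net2.divideFcn = '';
[net2,tr2] = train(net2, X, T);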

[net1,tr] = train(net1,X,T);
net1.adaptFcn=''; % don't change network during usage after training
% use the trained network to classify a new point:

GEH8 = 'Delete the above command: it''s irrelevant'

Xtest=[7 2]'

Xtest =
     7
     2

GEH9 = ' Where is verification of good performance on training data ???'
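One way to check (a sketch using vec2ind and perform):

ytrn = net1(X);                    % outputs on the training set
trnacc = mean(vec2ind(ytrn) == y)  % fraction of correct classifications
trnperf = perform(net1, T, ytrn)   % crossentropy on the training data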

y_matlab=net1(Xtest)

y_matlab =
   2.7747e-07
   1
   3.5459e-07

% Test matlab's classification manually:
a1 = tansig(net1.IW{1,1}*Xtest + net1.b{1});
a2 = tansig(net1.LW{2,1}*a1 + net1.b{2});
y_my = softmax(net1.LW{3,2}*a2 + net1.b{3});

y_my =
   6.5673e-07
   0.71821
   0.28179

% Output: >> nn_bare
%
% Xtest =
%      7
%      2
%
% y_matlab =
%     0.0000
%     1.0000
%     0.0000
%
% y_my =
%     0.0000
%     1.0000
%     0.0000
%
% >> nn_bare
%
% Xtest =
%      7
%      2
%
% y_matlab =
%     0.0000
%     1.0000
%     0.0000
%
% y_my =
%     0.0000
%     0.9307
%     0.0693

Not sure why my results are different from yours.

Hope this helps.

Greg
