Time Delay Neural Network: How to separate measurements in dynamic data for the use either in the training or the test data?

Question

Carina on 8 Jul 2017

Direct link to this question

https://ch.mathworks.com/matlabcentral/answers/347847-time-delay-neural-network-how-to-separate-measurements-in-dynamic-data-for-the-use-either-in-the-tr

Answered: Greg Heath on 24 Jul 2017

Hi, I'm training a time delay neural network. My data consists of 140 measurements, and the network has 5 inputs and 2 outputs. Because my longest measurement has 2791 samples, the preprocessed dynamic target data is a 1x2791 cell array in which each cell contains a 2x140 matrix. (I prepared the data according to these tutorials: https://de.mathworks.com/help/nnet/ug/understanding-neural-network-toolbox-data-structures.html and https://de.mathworks.com/help/nnet/ug/multiple-sequences-with-dynamic-neural-networks.html).
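The layout described above can be sketched in plain MATLAB as follows. This is only an illustration with toy sizes: the measurement lengths, the random placeholder data, and the NaN padding of shorter measurements are assumptions, since the question does not say how shorter measurements were filled up.

```matlab
% Toy sizes standing in for the 140 measurements / 2791 time steps (assumed values).
numMeas = 4;              % number of measurements (140 in the question)
numOut  = 2;              % output signals per time step
lens    = [5 3 4 2];      % hypothetical length of each measurement
maxLen  = max(lens);      % plays the role of 2791 in the question

% Each measurement is a numOut-by-length matrix (random placeholder data).
meas = cell(1, numMeas);
for k = 1:numMeas
    meas{k} = rand(numOut, lens(k));
end

% Concurrent-sequence layout: one cell per time step, one column per measurement.
% Assumption: shorter measurements are padded with NaN.
T = cell(1, maxLen);
for t = 1:maxLen
    T{t} = NaN(numOut, numMeas);
    for k = 1:numMeas
        if t <= lens(k)
            T{t}(:, k) = meas{k}(:, t);
        end
    end
end

disp(size(T));       % 1-by-maxLen cell array
disp(size(T{1}));    % numOut-by-numMeas matrix in every cell
```

With the real data, `T` would be the 1x2791 cell array of 2x140 matrices, and column k of every cell belongs to measurement k.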

If I use divideblock, divideind, divideint or dividerand I can only pick samples out of the measurements in different ways. This results in a trainMask like:

{[1 1 1 ...; 1 1 1 ...] [1 1 1 ...; 1 1 1 ...] ... [0 0 0 ...; 0 0 0 ...]}

However, the trainMask that I need should look this way:

{[1 1 0 ...; 1 1 0 ...] [1 1 0 ...; 1 1 0 ...] ... [1 1 0 ...; 1 1 0 ...]}

I want the measurements to be either in the test data or in the training data. How can this be done?
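To make the difference between the two mask patterns concrete, here is a small sketch in plain MATLAB. It uses the 0/1 notation from the question and toy sizes; the cut-off step and the choice of training columns are arbitrary, hypothetical values.

```matlab
% Toy sizes (hypothetical); the question has 2 outputs, 140 measurements, 2791 steps.
numOut = 2; numMeas = 5; numSteps = 6;

% (a) Time-wise split, the pattern reported for the built-in division
%     functions: early time steps are training data, late ones are not.
lastTrainStep = 4;                        % arbitrary cut-off
timeMask = cell(1, numSteps);
for t = 1:numSteps
    timeMask{t} = double(t <= lastTrainStep) * ones(numOut, numMeas);
end

% (b) Measurement-wise split, the pattern asked for: the same columns
%     (whole measurements) are training data at every time step.
trainCols = [1 2 4];                      % hypothetical training measurements
colMask = zeros(numOut, numMeas);
colMask(:, trainCols) = 1;
measMask = repmat({colMask}, 1, numSteps);
```

In (a) the mask changes from cell to cell, while in (b) every cell holds the same matrix, so a measurement is either entirely inside or entirely outside the training data.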

The reason I do not want to cut each measurement and put the first part in the training data and the second part in the test data is this: during one measurement, only one input parameter changes while the other 4 input parameters keep their values. If the net is trained on the first part of a measurement and then tested on the second part of the same measurement, the test data is not really new to the network. Trained that way, the mse is good, but the performance on truly new data (where all 5 input parameters differ) is bad. I already tried separating the measurements when training a static network, and that solved the problem. However, I do not know how to accomplish this for the dynamic data structure.

Thank you for your help.

Answer 1

Mukul Rao on 18 Jul 2017

Direct link to this answer

https://ch.mathworks.com/matlabcentral/answers/347847-time-delay-neural-network-how-to-separate-measurements-in-dynamic-data-for-the-use-either-in-the-tr#answer_274649

Hi,

I am not sure I understand your question correctly.

If the training set looked like:

{ (a1a2...a140) (b1b2b3...b140) (c1c2...c140) .... }

that is, a cell array of matrices in which a1, a2, etc. are the columns of the first matrix, and so on. Given this structure, you would like the columns for the training, validation and test subsets not to be confined to samples belonging to an entire time step's worth of data, meaning the test set could include combinations like (a1a4...a140c1c3..). Is this correct?

1 Comment

Carina on 24 Jul 2017

Hi,

thank you for dealing with the problem.

No. If c3 is part of the test data, it is important that a3 is not part of the training or validation data. Therefore, if the test set includes c3, it should also include a3.

If the training set after division looks like:

{(a2 a3 ... a139) (b2 b3 ... b139) (c2 c3 ... c139)...}

the test set should look like:

{(a1 a4 ... a140) (b1 b4 ... b140) (c1 c4 ... c140)...}

Training and test data should not look like this:

 training set {(a2 a4 ... a139) (b2 b3 ... b140)...}
 test set     {(a1 a3 ... a140) (b1 b4 ... b139)...}

Neither should it look like that:

 training set {(a1 a2 ... a140) (d1 d2 ... d140)...}
 test set     {(b1 b2 ... b140) (c1 c2 ... c140)...}

Did that answer your question? Thank you for your help.
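One possible way to get such a column-wise split is sketched below. It is untested here and rests on assumptions: it assumes the Neural Network Toolbox is installed, that setting the network property `divideMode` to 'sample' makes `divideind` index the columns (measurements) instead of the time steps, and it uses placeholder delays, a placeholder hidden layer size, and an arbitrary choice of held-out measurements.

```matlab
% Sketch only: requires the Neural Network Toolbox (now Deep Learning Toolbox).
numMeas = 140;                      % measurements = columns of each cell (from the question)

% Hypothetical split: hold out whole measurements by column index.
testInd  = 1:7:numMeas;             % arbitrary choice of test measurements
valInd   = 2:7:numMeas;             % arbitrary choice of validation measurements
trainInd = setdiff(1:numMeas, [testInd valInd]);

net = timedelaynet(1:2, 10);        % input delays and hidden size are placeholders
net.divideFcn  = 'divideind';       % divide by explicit index lists
net.divideMode = 'sample';          % assumption: indices then refer to columns, not time steps
net.divideParam.trainInd = trainInd;
net.divideParam.valInd   = valInd;
net.divideParam.testInd  = testInd;

% X and T would be the 1x2791 input and target cell arrays from the question:
% [Xs, Xi, Ai, Ts] = preparets(net, X, T);
% [net, tr] = train(net, Xs, Ts, Xi, Ai);
```

After training, inspecting the masks in the training record `tr` should show whether each measurement ended up entirely in one subset, which is the property asked for above.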

Answer 2

Greg Heath on 24 Jul 2017

Direct link to this answer

https://ch.mathworks.com/matlabcentral/answers/347847-time-delay-neural-network-how-to-separate-measurements-in-dynamic-data-for-the-use-either-in-the-tr#answer_275319

Your explanation is too confusing.

1. Timeseries analysis contains one basic assumption:

Every contiguous subset of the timeseries has the same approximate summary statistics of mean, standard deviation and correlation coefficients.

2. If you use DIVIDEBLOCK, you do not have to worry about causality or training with nontraining data. Using the default 0.7/0.15/0.15 ratios, the second (validation) and third (testing) blocks each contain Nval = Ntst = floor(0.15*N) points of NONTRAINING DATA. They are not used for weight estimation.

However, the val block will stop training if its error rate increases continually for 6 epochs. Therefore, although it is not part of the training set, it can be considered part of the design subset, i.e.,

 data        = training + nontraining
 nontraining = validation + testing
 data        = design + nondesign
 nondesign   = testing
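As a rough illustration of the block sizes and the grouping above, the following plain MATLAB sketch applies the floor(0.15*N) formula to N = 2791 time steps (the length from the question). The index ranges are an assumption about how contiguous blocks would be laid out; the toolbox's own `divideblock` may round slightly differently.

```matlab
N    = 2791;                    % number of time steps, taken from the question
Nval = floor(0.15 * N);         % validation block size, formula from the answer
Ntst = floor(0.15 * N);         % test block size, same formula
Ntrn = N - Nval - Ntst;         % assumption: the remainder goes to training

% Contiguous blocks in time order: training, then validation, then testing.
trnIdx = 1:Ntrn;
valIdx = Ntrn + (1:Nval);
tstIdx = Ntrn + Nval + (1:Ntst);

% Grouping used in the answer.
designIdx    = [trnIdx valIdx]; % training + validation
nondesignIdx = tstIdx;          % testing only

fprintf('train %d, val %d, test %d\n', Ntrn, Nval, Ntst);
```

For N = 2791 this gives 418 validation points, 418 test points and 1955 training points.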

Since the val subset is not directly involved in weight estimation, its slightly biased error rate is not a bad estimate of net performance on all nondesign data.

Finally, the test subset yields a completely unbiased estimate of net performance.

Unfortunately, many timeseries do not have stationary summary statistics, and multiple nets are required to model the series over its entire length.

Again, the validation subset performance is the best indicator that the assumption of stationarity is becoming invalid.

Therefore, use divideblock and don't worry unless plots or other evidence show that the series is not approximately stationary.
