Partitioning data for Time Series TCN model Training, Validation, and Testing

Question

Isabelle Museck on 5 Jun 2024


Direct link to this question

https://ch.mathworks.com/matlabcentral/answers/2125801-partitioning-data-for-time-series-tcn-model-training-validation-and-testing

Answered: Krishna on 6 Jun 2024

Hello there, I am trying to build a TCN model to predict a continuous variable. I have time series data in which I use 3 input features (accelerometer measurements in the x, y, and z directions) to estimate a continuous variable. The accelerometer data come from 10 different trials and are stored in a 10x1 cell array, where each cell holds the three accelerometer measurements over time for that trial as a 500x3 table. The target continuous variable is similarly stored in a 10x1 cell array, with each cell containing a 500x1 table named "Taget" that holds the true value of the predicted variable over time. If I want to build a TCN model with this data, what is the best way to partition it for training, testing (10%), and validation (10%)? I think I need to use the tspartition function but am not sure how to use it for this type of data. Do I need to combine the data from all 10 trials into one large table and then partition? Or should I partition each trial separately, train the model on a single trial, then retrain the model on the next trial, and so on? Any help would be greatly appreciated!


Answer 1

Krishna on 6 Jun 2024


Direct link to this answer

https://ch.mathworks.com/matlabcentral/answers/2125801-partitioning-data-for-time-series-tcn-model-training-validation-and-testing#answer_1468256

Hello Isabelle,

Based on your description, I think you're seeking the correct method for dividing your time series data into training, testing, and validation sets. I can share an effective approach that I have personally utilized.

You've mentioned having 10 observations (trials), each comprising both input and output data. Specifically, the input is a time series of 500 steps with 3 features, and the output is a sequence of 500 steps for a single variable. You can therefore organize the data as a 1x10 cell array of sequences, where each sequence is a 500x4 matrix holding the 3 inputs and the 1 output.
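As a minimal sketch of that layout, the snippet below joins each trial's input and target tables into one 500x4 numeric matrix. The variable names (accelData, targetData, sequences) and the random placeholder data are assumptions for illustration, not taken from your workspace.

```matlab
% Placeholder data standing in for the real trials (assumption):
% 10 trials, each with a 500x3 input table and a 500x1 target table.
numTrials = 10;
numSteps = 500;
accelData = cell(numTrials, 1);
targetData = cell(numTrials, 1);
for k = 1:numTrials
    accelData{k} = array2table(randn(numSteps, 3));
    targetData{k} = array2table(randn(numSteps, 1));
end

% Convert each pair of tables into one 500x4 matrix: [x y z target].
sequences = cell(1, numTrials);
for k = 1:numTrials
    sequences{k} = [table2array(accelData{k}), table2array(targetData{k})];
end
```

Keeping one matrix per trial means every trial stays intact as a single sequence when the data is partitioned.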
To partition this data into training, validation, and test sets, you can use the cvpartition function. Note that cvpartition produces only two sets at a time, so you need to call it twice. First, divide the data into a training set and a combined validation/test set. Then split the latter into separate validation and test sets. After this, the training set contains 8 sequences (80 percent), and the validation and test sets contain 1 sequence each (10 percent each).
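The two-step split could look roughly like the sketch below. It assumes the Statistics and Machine Learning Toolbox is available for cvpartition, and it passes integer holdout counts so that exactly 2 trials, then 1, are held out. The index variable names are hypothetical.

```matlab
rng(0);  % fix the random seed so the split is reproducible
numTrials = 10;

% First split: hold out 2 of the 10 trials (20%) for validation + testing.
cvOuter = cvpartition(numTrials, 'HoldOut', 2);
idxTrain = find(training(cvOuter));
idxHeld = find(test(cvOuter));

% Second split: divide the 2 held-out trials into 1 validation and 1 test trial.
cvInner = cvpartition(numel(idxHeld), 'HoldOut', 1);
idxVal = idxHeld(training(cvInner));
idxTest = idxHeld(test(cvInner));
```

The resulting indices refer to whole trials, so they can be used to index directly into the cell array of sequences.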
Once partitioned, organize the training data into Xtrain, which holds the 500x3 input sequences, and Ytrain, which holds the 500x1 output sequences.
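One possible way to do this is sketched below, assuming each sequence is a 500x4 matrix with the inputs in columns 1 to 3 and the target in column 4. The placeholder data, the fixed split indices, and the helper names getX and getY are hypothetical. The same step is applied to the validation and test trials for completeness.

```matlab
% Placeholder sequences (assumption): 10 trials, each 500x4 as [x y z target].
sequences = arrayfun(@(k) randn(500, 4), 1:10, 'UniformOutput', false);

% Hypothetical split indices; in practice these come from the partitioning step.
idxTrain = 1:8;
idxVal = 9;
idxTest = 10;

% Columns 1-3 are the inputs, column 4 is the target.
getX = @(idx) cellfun(@(s) s(:, 1:3), sequences(idx), 'UniformOutput', false);
getY = @(idx) cellfun(@(s) s(:, 4), sequences(idx), 'UniformOutput', false);

Xtrain = getX(idxTrain);
Ytrain = getY(idxTrain);
Xval = getX(idxVal);
Yval = getY(idxVal);
Xtest = getX(idxTest);
Ytest = getY(idxTest);
```

Depending on the training function and MATLAB release you use, the sequences may need to be transposed to features x time steps, so check the documentation for the expected layout before training.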
