How can I do data cleaning/ data smoothing?

lil brain on 15 Mar 2023
Answered: Gayatri Rathod on 3 Apr 2023
I have a cell array called "pre_data" with 1 column and 27 rows. Each element in the column contains a cell with 21 columns and a varying number of rows.
I want to scan the columns in the cells of "pre_data". For each column separately, if any values in that column are above 3 standard deviations of that column, I want the rows containing those values to be removed.
Additionally, I want to create a cell array called "removed_pre_data" that has the same structure as "pre_data" but includes all the values that were removed.
How would I go about doing that?

Answers (1)

Gayatri Rathod on 3 Apr 2023
Hi Lil,
To accomplish this task, you can use a loop to iterate through each cell in the pre_data cell array and, within it, through each column. For each column, you can compute the mean and standard deviation of the values and identify the rows whose values lie more than 3 standard deviations from the column mean. You can then remove those rows and store them in a separate cell array called removed_pre_data.
You can follow the steps below to achieve the desired result:
  • Initialize an empty cell array called "removed_pre_data" with the same structure as "pre_data".
removed_pre_data = cell(size(pre_data));
  • Loop through each cell of "pre_data" and compute the mean and standard deviation of each column using the mean and std functions. Here col_data stands for one column of the current cell.
Col_mean = mean(col_data); % returns the mean of the elements of col_data
Col_std = std(col_data); % returns the standard deviation of the elements of col_data
  • Within the same loop, use the find function to identify the rows that contain values more than 3 standard deviations away from the mean, so that they can be removed.
outlier_rows = find(abs(col_data - Col_mean) > 3*Col_std); % row indices of the outliers in this column
  • Store the removed rows in the corresponding cell of "removed_pre_data", before deleting them.
  • Assign the remaining rows back to the corresponding cell of "pre_data".
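The steps above can be put together in a minimal sketch. It is only one possible way to do it and makes a few assumptions that are not stated in the question: each inner cell holds numeric data (if it is stored as a cell array of numeric scalars, it is converted with cell2mat), a row is dropped when any of its columns is flagged, and "above 3 standard deviations" is read as "more than 3 standard deviations away from the column mean" in either direction. The small pre_data defined at the top is made-up demo data so the snippet runs on its own; replace it with your own variable. The subtraction uses implicit expansion, so it needs R2016b or later.

```matlab
% Made-up demo data (assumption): 2 cells with 3 columns each.
% The real pre_data has 27 cells with 21 columns each.
pre_data = {[ones(19,3); 100 1 1]; [zeros(19,3); 0 0 -50]};

removed_pre_data = cell(size(pre_data));

for k = 1:numel(pre_data)
    data = pre_data{k};
    if iscell(data)
        % Assumes every entry of the inner cell is a numeric scalar
        data = cell2mat(data);
    end

    col_mean = mean(data, 1);    % 1-by-nCols row vector of column means
    col_std  = std(data, 0, 1);  % 1-by-nCols row vector of column standard deviations

    % Logical matrix: true where a value is more than 3 standard deviations
    % from its own column mean (drop abs() to flag only values above the mean)
    is_outlier = abs(data - col_mean) > 3 * col_std;

    % A row is removed if any of its columns is flagged
    bad_rows = any(is_outlier, 2);

    removed_pre_data{k} = data(bad_rows, :);   % keep a copy of the removed rows
    pre_data{k}         = data(~bad_rows, :);  % keep only the remaining rows
end
```

Because logical indexing is used, find is not strictly needed here; find(bad_rows) would return the same rows as indices. If a column contains NaN values, mean and std return NaN for it and nothing in that column is flagged, so you may want to pass the 'omitnan' option to both functions.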
You can read more about the cell, mean, std, and find functions in the MATLAB documentation.
Hope it helps!  
Regards,
Gayatri Rathod
