Filling missing values of selected columns of a table with previous values

23 views (last 30 days)
I am trying to fill the NaNs from a selected columns of interest with previous values.
For example lets say i have a table Data as:
SN English Science Maths Health
1 100 80 89 74
2 NaN 85 NaN 89
3 NaN NaN 75 NaN
4 80 NaN NaN NaN
5 NaN 75 NaN 60
Lets say if i only want to fill with previous values for column English and Maths. How do i do it? I tried this line. I went to error as
Data=fillmissing(Data{:,'English','Science'},'previous');
You cannot subscript a table using linear indexing (one subscript) or multidimensional indexing (three or more subscripts). Use a
row subscript and a variable subscript.

Answers (3)

Guillaume
Guillaume on 24 Apr 2019
Star's answer works (please don't delete it!) or you can use:
Data(:, {'English', 'Science'}) = fillmissing(Data(:, {'English', 'Science'}), 'previous');
You need to understand the difference between {} indexing (extracts content of a table as a cell array or matrix) and () indexing (extracts portion of the table, resut is still a table) for tables.

Star Strider
Star Strider on 23 Apr 2019
You most likely have to use fillmissing with each variable separately:
Data.English = fillmissing(Data.English,'previous');
Data.Science = fillmissing(Data.Science,'previous');
That works.
  2 Comments
Star Strider
Star Strider on 24 Apr 2019
You can, but apparently only one at a time, as in my code. I experimented with varfun, using an anonymous function with fillmissing, however varfun threw an error about anonymous functions (that made no sense to me, since the anonymous function worked outside of varfun).

Sign in to comment.


Sam17
Sam17 on 24 Apr 2019
So, I wrote few lines of Code. I wrote the variables I want to interpolate in excel sheet and imported the excel in matlab and wrote these lines of code. By code works but it takes so much of time may be because my file size is big. Is there any way I can rewrite this code and make it more simpler and less time and memory consuming ?
[FileName,PathName] = uigetfile('*.xls*','Select excel file');
[~,~,interpolate] = xlsread(strcat(PathName,FileName));
interpolate = interpolate(2:end,2)';
variables=Total_Data.Properties.VariableNames;
to_delete= interpolate(2:end);
index=find(ismember(variables,to_delete));
Total_Data(:,index)=[];
index=find(ismember(variables,interpolate));
Data=Total_Data(:,index);
Data = fillmissing(Data,'previous');
Total_Data = outerjoin(Total_Data,Data,'MergeKeys',true);
  2 Comments
Guillaume
Guillaume on 24 Apr 2019
This is a different question from your original one, so please start a new question. It certainly shouldn't be an answer to the current question (for which you should accept an answer if one actually solved your original problem).
When you ask your new question, you will most likely be asked: Have you used the profiler to find which part of the code is slow?
Note that none of the find are needed in your code (but the impact is most likely very minimal)
index = ismember(variables,interpolate);
Data=Total_Data(:,index);
works just as well.

Sign in to comment.

Categories

Find more on Matrices and Arrays in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!