How to fill in missing data?

J K on 26 Nov 2016
Answered: Star Strider on 7 Oct 2024
Hello everybody,
I have a dataset (a txt file) that contains some missing values (represented by 0). I would like to replace all of these zeros with numbers. How can I do this?

Answers (2)

Ayush on 7 Oct 2024
Hi,
One way to replace missing values that are currently represented as 0 is interpolation. First identify the indices of the zeros in each column, then use the “interp1” function to estimate them from the surrounding non-zero values. The code below illustrates the approach:
% Step 1: Load the data
data = readmatrix('your_dataset.txt');
x = (1:size(data, 1))'; % Row indices, used as the interpolation grid
% Step 2: Interpolate column by column to replace zeros
for col = 1:size(data, 2)
    y = data(:, col); % Data values in this column
    % Logical masks for non-zero (known) and zero (missing) elements
    nonZeroIndices = y ~= 0;
    zeroIndices = y == 0;
    % interp1 needs at least two known points, so skip columns with fewer
    if nnz(nonZeroIndices) >= 2
        % Estimate the zeros from the non-zero elements (extrapolating at the ends)
        y(zeroIndices) = interp1(x(nonZeroIndices), y(nonZeroIndices), x(zeroIndices), 'linear', 'extrap');
    end
    % Update the column in the dataset
    data(:, col) = y;
end
% Step 3: Save the modified data back to a file (optional)
writematrix(data, 'modified_dataset.txt');
For more information on using the “interp1” function, please refer to its MATLAB documentation.
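To see what the interpolation step does, here is a minimal sketch on a small made-up column vector (the values are illustrative only, not taken from the original dataset). It assumes that every 0 really is a missing value; genuine zero measurements would be overwritten as well.

```matlab
% Hypothetical sample column: zeros mark the missing entries
y = [2; 0; 6; 0; 10; 0];
x = (1:numel(y))';                 % Row indices used as the interpolation grid

known = y ~= 0;                    % Logical mask of the known (non-zero) values

% Linear interpolation between known points; 'extrap' also fills the
% trailing zero by extending the last linear segment
y(~known) = interp1(x(known), y(known), x(~known), 'linear', 'extrap');

disp(y')                           % y is now [2 4 6 8 10 12]
```

Note that 'extrap' extends the trend of the nearest segment, which can give implausible values when many entries are missing at the start or end of a column.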

Star Strider on 7 Oct 2024
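If you have R2016b or later, use the fillmissing function (introduced in R2016b). Note that the Name=Value argument syntax used here requires R2021a or later; in earlier releases, pass the names as quoted strings instead: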
T1 = array2table(randi([0 9],10,5)) % Original
T1 = 10x5 table
    Var1    Var2    Var3    Var4    Var5
    ____    ____    ____    ____    ____
     2       8       7       7       1
     8       7       5       6       7
     8       4       4       5       3
     4       3       3       1       3
     4       6       5       5       7
     0       4       7       0       4
     8       2       2       7       2
     9       7       3       0       9
     5       0       5       8       3
     2       2       6       3       4
loc = table2array(T1) == 0 % Logical Matrix Of ‘0’ Locations
loc = 10x5 logical array
   0   0   0   0   0
   0   0   0   0   0
   0   0   0   0   0
   0   0   0   0   0
   0   0   0   0   0
   1   0   0   1   0
   0   0   0   0   0
   0   0   0   1   0
   0   1   0   0   0
   0   0   0   0   0
T1 = fillmissing(T1, 'linear', MissingLocations=loc, EndValues='nearest') % 'linear' Interpolation Of Missing Values With 'nearest' For End Values
T1 = 10x5 table
    Var1    Var2    Var3    Var4    Var5
    ____    ____    ____    ____    ____
     2       8       7       7       1
     8       7       5       6       7
     8       4       4       5       3
     4       3       3       1       3
     4       6       5       5       7
     6       4       7       6       4
     8       2       2       7       2
     9       7       3       7.5     9
     5       4.5     5       8       3
     2       2       6       3       4
There are several methods to fill (interpolate) the missing values. See the documentation for those and other options.
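As an illustration of how the choice of method changes the result, the sketch below compares three of the fillmissing methods on a small made-up vector (the data are hypothetical). It first converts the zeros to NaN with standardizeMissing, which is an alternative to passing MissingLocations, and it assumes that every 0 is a missing value rather than a real measurement.

```matlab
% Hypothetical sample column: zeros mark the missing entries
v = [1; 0; 3; 0; 0; 6];

% Convert the zero placeholders to NaN so fillmissing recognizes them
v = standardizeMissing(v, 0);           % [1 NaN 3 NaN NaN 6]'

vLinear   = fillmissing(v, 'linear');   % [1 2 3 4 5 6]'  straight line between neighbors
vPrevious = fillmissing(v, 'previous'); % [1 1 3 3 3 6]'  carry the last known value forward
vNext     = fillmissing(v, 'next');     % [1 3 3 6 6 6]'  use the next known value

% Show the three results side by side
disp(table(vLinear, vPrevious, vNext))
```

Which method is appropriate depends on the data: 'linear' suits smoothly varying signals, while 'previous' or 'next' may be a better fit for step-like or categorical-style values.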