readtable(​'filename.​csv') is not importing all columns

6 views (last 30 days)
I have the following attached csv file. The csv file has two columns X and Y. In the columns X, Y, I have several values in one cell
On the using the following matlab commands:
M = readtable('intima_coords.csv');
data = array2table(M)
MATLAB is returning only one column and ignoring the other.
What should be done to read both the colums?

Answers (1)

Simon Chan
Simon Chan on 30 Jan 2022
The data in the csv file contains both strings and numbers.
I would use function readcell to retrieve the data as strings and convert them to number in a number of steps as follows:
rawdata = readcell('intima_coords.csv'); % Use readcell
% The second row contains the required X & Y data
data = cellfun(@(x) strtrim(erase(x,{'[',']'})),rawdata(2,:),'uni',0); % Remove the characters such as '[' & ']'
datacell = cellfun(@(x) str2double(strsplit(x,' ')),data,'uni',0); % Split them and convert to numbers
% datacell{1} contains the data X
% datacell{2} contains the data Y
plot(datacell{1},datacell{2},'g*'); % Try to plot the data
grid on
  3 Comments
Simon Chan
Simon Chan on 31 Jan 2022
I use the following code which may easier to explain, actually they are doing the same thing:
rawdata = readcell('intima_coords.csv');
xydata = rawdata(2,:);
for k = 1:2
erase_characters = erase(xydata{k},{'[',']'});
erase_space = strtrim(erase_characters);
split_space = strsplit(erase_space,' ');
datacell{1,k} = str2double(split_space);
end
The XY data you are looking for are on the second row of the data after readcell and they are character vector like the following:
xydata{1}
'[ 5.68064747e-01 8.85456026e-01 1.21352549e+00 8.85456026e-01
5.68064747e-01 1.21352549e+00 8.81677878e-01 8.81677878e-01
1.00000000e+00 1.42658477e+00 1.42658477e+00 1.20536680e-01
4.63525492e-01 1.20536680e-01 4.63525492e-01 1.50000000e+00
-3.54604887e-01 -7.48510748e-01 -8.81677878e-01 -7.48510748e-01
-3.54604887e-01 -8.81677878e-01 -9.70941817e-01 -9.70941817e-01
-1.50000000e+00 -4.63525492e-01 -4.63525492e-01 -1.42658477e+00
-1.42658477e+00 -1.21352549e+00 -1.21352549e+00 7.71806935e-15
-5.23722504e-15 -1.94548764e-01 -1.94548764e-01 -1.09671525e+00
-1.09671525e+00]'
Step 1 is to erase the characters '[' and ']' (In the beginning and at the end of the character vector) using function erase
erase_characters = erase(xydata{k},{'[',']'});
erase_characters =
' 5.68064747e-01 8.85456026e-01 1.21352549e+00 8.85456026e-01
5.68064747e-01 1.21352549e+00 8.81677878e-01 8.81677878e-01
1.00000000e+00 1.42658477e+00 1.42658477e+00 1.20536680e-01
4.63525492e-01 1.20536680e-01 4.63525492e-01 1.50000000e+00
-3.54604887e-01 -7.48510748e-01 -8.81677878e-01 -7.48510748e-01
-3.54604887e-01 -8.81677878e-01 -9.70941817e-01 -9.70941817e-01
-1.50000000e+00 -4.63525492e-01 -4.63525492e-01 -1.42658477e+00
-1.42658477e+00 -1.21352549e+00 -1.21352549e+00 7.71806935e-15
-5.23722504e-15 -1.94548764e-01 -1.94548764e-01 -1.09671525e+00
-1.09671525e+00'
Step 2 is to remove the leading whitespace on the strings using function strtrim. Otherwise it will create an extra number when split the string in the next step.
erase_space = strtrim(erase_characters);
erase_space =
'5.68064747e-01 8.85456026e-01 1.21352549e+00 8.85456026e-01
5.68064747e-01 1.21352549e+00 8.81677878e-01 8.81677878e-01
1.00000000e+00 1.42658477e+00 1.42658477e+00 1.20536680e-01
4.63525492e-01 1.20536680e-01 4.63525492e-01 1.50000000e+00
-3.54604887e-01 -7.48510748e-01 -8.81677878e-01 -7.48510748e-01
-3.54604887e-01 -8.81677878e-01 -9.70941817e-01 -9.70941817e-01
-1.50000000e+00 -4.63525492e-01 -4.63525492e-01 -1.42658477e+00
-1.42658477e+00 -1.21352549e+00 -1.21352549e+00 7.71806935e-15
-5.23722504e-15 -1.94548764e-01 -1.94548764e-01 -1.09671525e+00
-1.09671525e+00'
Step 3 is to split the string with delimiter ' ' (actually a space between them) using function strsplit
split_space = strsplit(erase_space,' ')
split_space =
Columns 1 through 5
{'5.68064747e-01'} {'8.85456026e-01'} {'1.21352549e+00'} {'8.85456026e-01↵'} {'5.68064747e-01'}
Columns 6 through 10
{'1.21352549e+00'} {'8.81677878e-01'} {'8.81677878e-01↵'} {'1.00000000e+00'} {'1.42658477e+00'}
Columns 11 through 15
{'1.42658477e+00'} {'1.20536680e-01↵'} {'4.63525492e-01'} {'1.20536680e-01'} {'4.63525492e-01'}
Columns 16 through 20
{'1.50000000e+00↵'} {'-3.54604887e-01'} {'-7.48510748e-01'} {'-8.81677878e-01'} {'-7.48510748e-01↵'}
Columns 21 through 25
{'-3.54604887e-01'} {'-8.81677878e-01'} {'-9.70941817e-01'} {'-9.70941817e-01↵'} {'-1.50000000e+00'}
Columns 26 through 30
{'-4.63525492e-01'} {'-4.63525492e-01'} {'-1.42658477e+00↵'} {'-1.42658477e+00'} {'-1.21352549e+00'}
Columns 31 through 35
{'-1.21352549e+00'} {'7.71806935e-15↵'} {'-5.23722504e-15'} {'-1.94548764e-01'} {'-1.94548764e-01'}
Columns 36 through 37
{'-1.09671525e+00↵'} {'-1.09671525e+00'}
Finally convert them into numbers using function str2double
datacell{1}
Columns 1 through 12
0.5681 0.8855 1.2135 0.8855 0.5681 1.2135 0.8817 0.8817 1.0000 1.4266 1.4266 0.1205
Columns 13 through 24
0.4635 0.1205 0.4635 1.5000 -0.3546 -0.7485 -0.8817 -0.7485 -0.3546 -0.8817 -0.9709 -0.9709
Columns 25 through 36
-1.5000 -0.4635 -0.4635 -1.4266 -1.4266 -1.2135 -1.2135 0.0000 -0.0000 -0.1945 -0.1945 -1.0967
Column 37
-1.0967
The function cellfun just doing the same thing without using for loop.
Avishek Mukherjee
Avishek Mukherjee on 19 May 2022
Back in January, you helped me with formatting CSV files imported to MATLAB from Python. Now I have been using your suggestion all this time since then. But now I have run into a strange problem.
There are two csv files attached. The following code snippet (suggested by you back in Jan) works fine for the medcoordsdof.csv file:
rawdata = readcell('medcoordsdof.csv'); % Use readcell
% The second row contains the required X & Y data
data = cellfun(@(x) strtrim(erase(x,{'[' ']'})),rawdata(2,:),'uni',0); % Remove the characters such as '[' & ']'
datacell = cellfun(@(x) str2double(strsplit(x,' ')),data,'uni',0);
% datacell{1}
numOfColumn = size(rawdata, 2);
Now acdof.csv have the same nature of data as that in medcoordsdof.csv. These two csv files were created with similar codes in Python. However, the following code snippet raises error for the acdof.csv file:
rawdata = readcell('acdof.csv'); % Use readcell
% The second row contains the required X & Y data
data = cellfun(@(x) strtrim(erase(x,{'[' ']'})),rawdata(2,:),'uni',0); % Remove the characters such as '[' & ']'
datacell = cellfun(@(x) str2double(strsplit(x,' ')),data,'uni',0);
% datacell{1}
numOfColumn = size(rawdata, 2);
But if we open both the csv files in excel, we see similar data structures. So I don't think (I can be wrong and correct me if that's so) the acdof.csv is anyway getting corrupted.
I am really confused as to why there is a problem with reading cells for one file while for the other it works fine.
Any suggestetion would be extremely appreciated!
Thank you.

Sign in to comment.

Categories

Find more on Data Type Conversion in Help Center and File Exchange

Products


Release

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!