Multi-batch csv processing error

3 views (last 30 days)
I have a series of csv files where each file containts data that I want to extract located at data(1, [1 3]) (please see attached file). The data will then be exported to a one csv file named test.csv with there conrresonsding label (filename) at the first column, as seen below.
What I got so far is this code.
clear all; clc
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
for i = 1:numel(list)
F = fullfile(path, list(i).name);
data = readtable (F);
T = data(1, [1 3]);
energy = table2array(T);
dlmwrite('test.csv',energy,'-append')
end
However, when I tried exporting, it produces a series of number that is contained in each cell.

Accepted Answer

Jorg Woehl
Jorg Woehl on 16 Mar 2021
Edited: Jorg Woehl on 16 Mar 2021
Hi Denxybel, the main problem lies in the readtable statement. Althought the documentation states that "by default, readtable creates variables that have data types that are appropriate for the data values detected in each column of the input file", this does not work here:
data = readtable('DoseAtDemoSphereR1Q1C1.csv')
data =
1×14 table
x_ Results xFor scorer DoseAtDemoSphereR1Q1C1 Var6 Var7 Var8 Var9 Var10 Var11 Var12 Var13 Var14
________________________ __________________________ ________ __________________________ __________________________ _________________________ __________________________ _________ ____ _____ _____ _____ _____ _____
{'0.02753477916121483,'} {'2.753477962114824e-10,'} {'170,'} {'2.866090017161038e-05,'} {'2.866090045821939e-13,'} {'5.35358762496883e-07,'} {'4.827696467064431e-07,'} 0.0036571 NaN NaN NaN NaN NaN NaN
As you can see, most variables are read in as character vectors, especially the first and third column that you are interested in. Also, the comma has not been properly recognized as the delimiter in the csv file; instead, it appears as part of the data character vectors.
You will therefore have to provide more details to readtable so that it can parse the input file correctly. There are multiple ways for doing this, for example:
data = readtable('DoseAtDemoSphereR1Q1C1.csv', 'Format', '%f', 'Delimiter', ',')
data =
8×1 table
Var1
__________
0.027535
2.7535e-10
170
2.8661e-05
2.8661e-13
5.3536e-07
4.8277e-07
0.0036571
In addition (although not strictly necessary here), you could supply details about the header comments by adding 'CommentStyle','#' to the list of arguments.
Your output is now a 8-by-1 table, so to reduce it to a 2-by-1 table containing only the first and third value you would use
data([1 3],1)
Instead of converting your table to an array and then use dlmwrite (which is no longer recommended), you can simply write the table directly to a file using writetable. I would also suggest to first collect all data inside the loop and write them to a file all at once after the loop; this should be much more efficient than appending data to a file inside the loop.
Your final code could then look something like this:
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
% preallocate table and define variables
T = table('Size', [numel(list) 3],...
'VariableTypes', {'string','double','double'},...
'VariableNames', {'Name','Energy','Number of Particles'});
for i = 1:numel(list)
F = fullfile(path, list(i).name);
data = readtable(F, 'Format', '%f', 'Delimiter', ',');
T{i,1} = string(F);
T{i,2} = data{1,1};
T{i,3} = data{3,1};
end
% write data to file
writetable(T, 'test.csv');
  1 Comment
Denxybel Montinola
Denxybel Montinola on 17 Mar 2021
Thank you very much for the comprehensive explanation @Jorg Woehl! It is really helpful.
I have actually figured out how to spit the data and write it in a matrix form yesterday. This is what I got
clear all; clc
path = '/Applications/trial';
list = dir(fullfile(path, '*.csv'));
for i = 1:numel(list)
F = fullfile(path, list(i).name);
rawdata = readtable([F], 'Delimiter', 'comma', 'readvariable',false);
data = table2array(rawdata(1,[1 3]));
writematrix(data,'test.csv','WriteMode','append');
end
but with your help, everthing works great!
P.S. Under the name column, I think this should be this in order to write the filename not the whole path. but no biggies, it works really well!
T{i,1}=string(list(i).name);

Sign in to comment.

More Answers (0)

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!