one time Multiple data extraction from exel files for analysis

2 views (last 30 days)
karan
karan on 6 Dec 2011
Hi i want to read data from multiple .xls files located in the folder.
I want to read the data such that i can recall the data from a specific file later on.
ultimately i want to add index(like column full of 1's next to the data extracted from file1 ,column full of 2's next to the data extracted from file2,column full of 3's next to the data extracted from file3) so that i can keep track where is a specific data point taken from
i am currently using :
d = dir('*.xls');
N_File = numel(d);
for num = 1:N_File
k = fullfile(d(num).name);
[data1(num) txt1(num) ] = xlsread(d(num).name);
num =num+1;
end
this does not seem to work because of the matrix dimensions... My experience is limited in this kind of issues....
Help is much appreciated
FINAL GOAL: i need help in extracting the data(both text and number data) and make a big concatenated array which contains all the data from these n number of files.

Accepted Answer

Matt Tearle
Matt Tearle on 6 Dec 2011
This basically just stacks all the spreadsheets on top of each other, adding a first column with the number, into one big cell array. I think that's what you were requesting as your "FINAL GOAL", but it assumes that all the spreadsheets have the same number of columns.
d = dir('*.xls');
N_File = numel(d);
C = {};
for num = 1:N_File
k = fullfile(d(num).name);
[~,~,dat] = xlsread(d(num).name);
C = [C;num2cell(num*ones(size(dat,1),1)),dat];
end
EDIT TO ADD (based on further feedback): OK, using the CSV format you've shown, I'd do something like this
d = dir('*.csv');
N_File = numel(d);
C = {};
for num = 1:N_File
k = fullfile(d(num).name);
dat = myfileread(d(num).name,num);
C = [C;dat];
end
with
function outdata = myfileread(fname,k)
% read from file
fid = fopen(fname);
data = textscan(fid,'%s%f%s%s%s%s%f%f','delimiter',',');
fclose(fid);
% extract time stamps and convert to serial date numbers
t1 = datenum(strrep(strcat(data{3},'-',data{4}),'-',' '));
t2 = datenum(strrep(strcat(data{5},'-',data{6}),'-',' '));
% combine together into one big cell array
outdata = [num2cell(k*ones(size(data{1}))),data{1},num2cell([data{2},t1,t2,data{7},data{8}])];
This will return everything as a cell array, with the number of the file in the first column. You can sort by column, using the sortrows function:
sortrows(C,3) % sort by 3rd column
If you need to extract the values of the kth column as a vector, use the syntax [C{:,k}]:
plot([C{:,6}],[C{:,3}],'o')
Use C(:,2) for string work:
idx = strncmp(C(:,2),'54F',3);
foo = [C{idx,3}];
  1 Comment
karan
karan on 17 Dec 2011
With the help of your code and little bit of modifications i got the results that i wanted...Thank You very much.

Sign in to comment.

More Answers (1)

karan
karan on 6 Dec 2011
well it is somewhat computing to what i want ...But there are still concerns such as:- 1. I only get upto 65536 data points although the files have 300,000+ points each. Output looks like this.
Result : Not the desired output.
[ 1.9670] [ 0]
[ 1.9250] [ 0]
2. if you see the output format above it is giving me this bracket sign after every column.
Format in excel file. Note I converted .CSV to a .XLS file
23D7B909B3 1.53943E+11 08/18/11 0:23:00 08/18/11 0:26:25 3.417 0
6C9B0270BFBF 1.19413E+14 08/18/11 0:23:03 08/18/11 0:26:14 3.183 0
3C8BFE4764B9 6.6572E+13 08/18/11 0:23:15 08/18/11 0:26:40 3.408 0
3.Firstly it reads 8 columns .If you see above it has time-formatted data,which does not show up in matlab command window.
I hope you can help out with this and i hope my commnets made above are clear enough for understanding what i want.
Thanks in advance.
  1 Comment
Matt Tearle
Matt Tearle on 6 Dec 2011
1. I don't know why that's happening. Nothing I showed is inherently limited, except that cell arrays use more memory, so you might be running into an out-of-memory error.
2. Yes, it's a cell array. You said you wanted "a big concatenated array which contains all the data". The only way to contain all the data (text and numeric) in one array is to use a cell array.
3. I don't really understand the problem. What you've shown looks like it has 8 columns. Also, what do you mean that the time data "does not show up" in the Command Window?
Given that you've converted CSV to XLS, does that mean you still have the CSV? Is there any reason to use XLS instead? I wonder if it might be easier to read directly from the CSV.
Can you explain how you want the data to appear in MATLAB. How do you want those 8 columns interpreted? Strings? Numbers? Date strings? Serial date numbers? Do you want everything in one array or do you want them as separate variables?

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!