Help with extracting data from excel

29 views (last 30 days)
Hi folks,
I'm trying to extract data from excel files in a specific way.
Essentially, I want to loop through each file, get the contents of column A (Area), rename it from "Area" to the name of the sheet, then paste the data onto a new sheet or create a struct/table/cell array of all the column A data from all the spreadsheets. I'm not sure if this is possible but am encountering some issues with my code (which is far from complete). May I please ask for some help in tackling this issue?
for i = 1 : numFolders
pathVar = [subFolders(i).folder '\' subFolders(i).name];
excelPath = [pathVar '\Results.xls'];
nameString = sheetnames(excelPath);
nameString = extractAfter(nameString, ' ');
nameCell(i) = nameString;
tempdata = readtable(excelPath, "VariableNamingRule","preserve");
data{i} = tempdata.Area;
% writetable(data.Area, outPath);
end

Accepted Answer

Devyani Maladkar
Devyani Maladkar on 24 Aug 2021
It is my understanding that you want to extract column data from excel sheets in subdirectories and rename each column by the sheet name and write all the extracted columns to one file.
To insert columns into a new table with a certain column name you can use the dot syntax with parenthesis and quotation since the name of the sheet can be not a valid MATALB identifier, refer to this documentation for more details on tables. The code below shows how to read the files from subdirectories and extract the columns and rename them as filename_sheetname (to avoid the file from being if sheet name is same) The final table is written as an output excel file. The sample data used was a folder with two excel files, you can replicate the same using the commands below.
mkdir demo2
load patients
T = table(Gender,Smoker,Weight);
T2= table(Gender,Smoker,Height);
writetable(T,'demo2/allPatientsBMI_Weight.xls');
writetable(T2,'demo2/allPatientsBMI_Height.xls');
dircontent=dir(); %current directory listing
subdirs=dircontent([dircontent.isdir]) % filter all dir from current directory list
data=table(); %final table
%iterate all subdirs
for i=1:numel(subdirs)
if strcmp(subdirs(i).name,'.') || strcmp(subdirs(i).name,'..')
continue
end
%obtain all files from the subdir
subdirPath=fullfile(subdirs(i).folder,subdirs(i).name);
subdirContent=dir(subdirPath);
subdirsFiles=subdirContent(~[subdirContent.isdir])
%iterating the subdirs for file
for j=1:numel(subdirsFiles)
filePath=fullfile(subdirsFiles(j).folder,subdirsFiles(j).name)
tempData=readtable(filePath,"VariableNamingRule","preserve");
sheetName=sheetnames(filePath);
colName=extractBefore(subdirsFiles(j).name,'.')+"_"+sheetName
data.(colName) = tempData.Gender;
end
end
writetable(data,'output.xls')
  1 Comment
Teshan Rezel
Teshan Rezel on 24 Aug 2021
@Devyani Maladkar thank you, this works really well! Now, the only issue I have is that my data table has different numbers of elements per entry, so it always comes up as an mx1 array. May I ask how to get around this please?

Sign in to comment.

More Answers (0)

Categories

Find more on Data Import from MATLAB in Help Center and File Exchange

Products


Release

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!