Combine excel files into one

29 views (last 30 days)
Ancalagon8
Ancalagon8 on 19 Oct 2022
Edited: Ancalagon8 on 6 Jan 2025
I want to run a code inside every folder i have an excel file and i want to compare 10 excel files into 1. Can anyone help me?
  1 Comment
Mathieu NOE
Mathieu NOE on 19 Oct 2022
hello
you have to explain what data you want to extract , combine and save.
for the time being your doing things in the for loop that are not even in the saved output excel file (you r'e saving what you simply loaded from the input file)
so you have to explain your logic and probably share some input files as well if we want to test the code
all the best

Sign in to comment.

Accepted Answer

Mathieu NOE
Mathieu NOE on 19 Oct 2022
Edited: Mathieu NOE on 9 Nov 2022
so I relicated the same folder tree and file name (same in every folder)
this is a slightly modified code , try it :
path = 'D:\2019';
S = dir(fullfile(path, '*', 'data.xls'))
for k = 1:numel(S)
F = fullfile(S(k).folder,S(k).name);
data = readtable(F);
d = strcat(table2array((data(:,1))), {' '}, table2array((data(:,2 ))));
out = [d data(:,3)]; % date / time / acceleration
out = renamevars(out,'Var1','Date & Time');
writetable(out,'file.xls',"Sheet",k)
end
  43 Comments
Mathieu NOE
Mathieu NOE on 17 Apr 2024
ok, this is the new vesion with detrending (done in the function [x1,y1,z1,x2,y2,z2] = extract_xyz(out))
I also did a modification on how to generate the time vector. This new version can handle the case of missing csv input files and creates a gap for the missing file
in my situation, I have 3 files corresponding to t = 34 , 35 and 36 minutes
if for any reason, there is no "35 min" file (or any number of missing files) , I wanted that we see a gap corresponding to the missing file(s).
In the previous code this gap would not happen
now, with the new code (below) , if for example, you remove the file 03_05_23_35_00.csv from the folder, you see this gap and the time vector matches the fact that we have started at t = 34 min and we must stop at t = 36+1 = 37 min
Code :
% you
path = 'D:\';
S = dir(fullfile(path, '*00.csv'));
N = numel(S); %
%% main loop
% sensor = 'Sensor1_'; % add this prefix to the output excel file name
sensor = ''; % no prefix
day_previous = 1; % init to default value (day 1)
out = [];
time = [];
time_total = [];
for ci = 1:N % loop over number of files
folder_name = S(ci).folder;
filename = S(ci).name; % format mm_dd_HH_MM_ss.csv
tmp = str2double(split(filename(1:length(filename) - 4),'_'));
% compute month / day / ...
month = tmp(1);
day = tmp(2);
hour = tmp(3);
minutes = tmp(4);
seconds = tmp(5);
if day > day_previous % we are switching to the next day
% 1 / save previous daily data (if data is non empty !)
if ~isempty(out)
filename_out = [sprintf('%02d',month) '-' sprintf('%02d',day_previous) '.csv']; % the output name contains month and day values
% save the data in output file (comment this section if you don't
% need to save)
dlmwrite(filename_out,out,'delimiter','');
% plot the data
[x1,y1,z1,x2,y2,z2] = extract_xyz(out); % see function below
% extract x,y,z datas from "out" char array
% data are organized as follows (two sets of y/x/z datas side by side)
% {'y': 1.22, 'x': 1.98, 'z': -10.15}{'y': 0.55, 'x': -2.22, 'z': -0.85}
figure
plot(time_total,x1,time_total,y1,time_total,z1,time_total,x2,time_total,y2,time_total,z2)
title(['Detrended Data plot for day = ' num2str(day_previous)]);
legend('x1','y1','z1','x2','y2','z2');
xlabel('minutes');
end
% 2 / start a new data concat (1st iteration)
day_previous = day; % update
F = fullfile(folder_name,filename);
data = freadtxt(F);
out = [data]; % store data
% define sampling time - we have 1 minute record per file
samples = count(data,'''y''')/2; % division by 2 because the pattern 'y' appears twice per line
dt = 60/samples; % sampling time in seconds
% create one dedicated time vector per input (csv) file
% values are in minutes
time = ((0:samples-1)*dt + seconds)/60 + minutes; % division by (60) to go from seconds to minutes
time_total = [time]; % horizontal concatenation
else % we are still in the same day so keep concat data
F = fullfile(folder_name,filename);
data = freadtxt(F);
out = [out data]; % horizontal concatenation
% define sampling time - we have 1 minute record per file
samples = count(data,'''y''')/2; % division by 2 because the pattern 'y' appears twice per line
dt = 60/samples; % sampling time in seconds
% create one dedicated time vector per input (csv) file
% values are in minutes
time = ((0:samples-1)*dt + seconds)/60 + minutes; % division by (60) to go from seconds to minutes
time_total = [time_total time]; % horizontal concatenation
end
end
% save last daily data
if ci == N
filename_out = [sprintf('%02d',month) '-' sprintf('%02d',day) '.csv']; % the output name contains month and day values
% save the data in output file (comment this section if you don't
% need to save)
dlmwrite(filename_out,out,'delimiter','');
% plot the data
[x1,y1,z1,x2,y2,z2] = extract_xyz(out); % see function below
% extract x,y,z datas from "out" char array
% data are organized as follows (two sets of y/x/z datas side by side)
% {'y': 1.22, 'x': 1.98, 'z': -10.15}{'y': 0.55, 'x': -2.22, 'z': -0.85}
figure
plot(time_total,x1,time_total,y1,time_total,z1,time_total,x2,time_total,y2,time_total,z2)
title(['Detrended Data plot for day = ' num2str(day_previous)]);
legend('x1','y1','z1','x2','y2','z2');
xlabel('minutes');
end
%%%%%%%%%%%%%%%%%%% functions %%%%%%%%%%%%%%%%%%%%%%%%%%%
function [x1,y1,z1,x2,y2,z2] = extract_xyz(out)
% extract x,y,z datas from "out" char array
% data are organized as follows (two sets of y/x/z datas side by side)
% {'y': 1.22, 'x': 1.98, 'z': -10.15}{'y': 0.55, 'x': -2.22, 'z': -0.85}
tmp = extractBetween(out,'{','}'); % we get both sets of data in alternate way
[m,~] = size(tmp);
% first data set
tmp1 = tmp((1:2:m),:);
% Calling the regexp() function over the above cell array to extract number part
B = regexp(tmp1,'-?[0-9]+.[0-9]+', 'match'); % extract numeric parts inside the cell
for k=1:numel(B)
tmp1_out(k,:) = str2double([B{k}]);
end
y1 = tmp1_out(:,1);
x1 = tmp1_out(:,2);
z1 = tmp1_out(:,3);
% second data set
tmp2 = tmp((2:2:m),:);
% Calling the regexp() function over the above cell array to extract number part
B = regexp(tmp2,'-?[0-9]+.[0-9]+', 'match'); % extract numeric parts inside the cell
for k=1:numel(B)
tmp2_out(k,:) = str2double([B{k}]);
end
y2 = tmp2_out(:,1);
x2 = tmp2_out(:,2);
z2 = tmp2_out(:,3);
% detrend the data
x1 = detrend(x1); % removes the best fit linear trend
y1 = detrend(y1); % removes the best fit linear trend
z1 = detrend(z1); % removes the best fit linear trend
x2 = detrend(x2); % removes the best fit linear trend
y2 = detrend(y2); % removes the best fit linear trend
z2 = detrend(z2); % removes the best fit linear trend
end
Mathieu NOE
Mathieu NOE on 17 Apr 2024
as always, my pleasure !!

Sign in to comment.

More Answers (0)

Products


Release

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!