Extracting strings between different characters from a cell array without loop
5 views (last 30 days)
Show older comments
Hello,
I would like to rewrite the code below to remove the loop. I assume I need to use cellfun but after several attempts I am not managing.
In the following cell array, I need to extract the characters between "LS=" and "DS=", between "DS=" and "CS=", and between "CS=" and "]" and save them in their corresponsing field in a table.
This is how my orgiginal cell array looks like (I've cut it to 3 rows but there are actually a lot more in my dataset):
commentO =
3×1 cell array
{'LS=Students201509DS=Teachers201509CS=ConfigVS=English]' }
{'LS=Preschoolers201801DS=Students201910CS=AbsVS=Italian]' }
{'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]'}
This is the code I have written to extract the strings I need and store them in their corresponding field in a table:
commentO = {'LS=Students201509DS=Teachers201509CS=ConfigVS=English]'; 'LS=Preschoolers201801DS=Students201910CS=AbsVS=Italian]'; 'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]'};
MyDataTable = table;
MyDataTable.LS = cell(size(commentO));
MyDataTable.DS = cell(size(commentO));
MyDataTable.CS = cell(size(commentO));
MyDataTable.VS = cell(size(commentO));
for i = 1:length(commentO)
tempStr = commentO{i,1};
MyDataTable.LS{i} = tempStr(strfind(tempStr, 'LS=')+3:(strfind(tempStr, 'DS=')-1));
MyDataTable.DS{i} = tempStr(strfind(tempStr, 'DS=')+3:(strfind(tempStr, 'CS=')-1));
MyDataTable.CS{i} = tempStr(strfind(tempStr, 'CS=')+3:(strfind(tempStr, 'VS=')-1));
MyDataTable.VS{i} = tempStr(strfind(tempStr, 'VS=')+3:(end-1));
end
It takes ages so if someone could help me out with the correct cellfun call to do this much more efficiently. I would be very grateful.
Thank you very much.
Best Regards;
Cecile
1 Comment
dpb
on 4 Jan 2020
cellfun alone is unlikely to make much difference performance-wise--it's just a loop under the hood.
Accepted Answer
Stephen23
on 4 Jan 2020
>> C = {'LS=Students201509DS=Teachers201509CS=ConfigVS=English]'
'LS=Preschoolers201801DS=Students201910CS=AbsVS=Italian]'
'LS=Preschoolers201902DS=Assistants201902CS=NAVS=Italian]'};
>> [M,S] = regexp(C,'[A-Z]S=','match','split');
>> M = strrep(vertcat(M{:}),'=','');
>> S = strrep(vertcat(S{:}),']','');
>> T = cell2table(S(:,2:end),'VariableNames',M(1,:))
T =
LS DS CS VS
____________________ __________________ ________ _________
'Students201509' 'Teachers201509' 'Config' 'English'
'Preschoolers201801' 'Students201910' 'Abs' 'Italian'
'Preschoolers201902' 'Assistants201902' 'NA' 'Italian'
More Answers (0)
See Also
Categories
Find more on Characters and Strings in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!