Textscan and Regexp Cell Data Layout
Info
This question is closed. Reopen it to edit or answer.
Show older comments
I'm trying to make some logic work with some legacy matlab code. I figured the easist thing was to make the data look the same as what the code is expecting.
I'm reading the relevant data from a CSV file, it's pretty simple -- but the format for the IDs changed from a simple number to an ID of the form [YY,ZZZZ].
As an example, the 'previous' CSV data looked like:
1,Simple,Data
2,More,Data
3,Even,More
4,Really,More
The 'new' CSV data looks like:
[01,0001],Simple,Data
[02,1001],More,Data
[03,9876],Even,More
[04,1234],Really,More
Previously, to read in the data, this logic was used:
fid = fopen(fileName);
data = textscan(fid,'%s%s%s%*s','Delimiter',',');
When this was done against the 'previous' CSV data, it returned data that look like this:
data =
1×3 cell array
{4×1 cell} {4×1 cell} {4×1 cell}
The cells then look like:
K>> data{:}
ans =
4×1 cell array
'1'
'2'
'3'
'4'
ans =
4×1 cell array
'Simple'
'More'
'Even'
'Data'
ans =
4×1 cell array
'Data'
'Data'
'More'
'Data'
So to handle the ID of the form [YY,ZZZZ], I had to modify the 'textscan' logic to handle the new ID format that we're using. To do that, I'm using a regexp function:
fid = fopen(fileName);
rawData = textscan(fid,'%s','Delimiter','\n');
data = regexp(rawData{1},'[ \-\/\w]*([\[][^\)\]]*[\]])?', 'match')
This then, after it reads in the data, gives me data that is formatted like this:
K>> data
data =
4×1 cell array
{1×3 cell}
{1×3 cell}
{1×3 cell}
{1×3 cell}
K>> data{:}
ans =
1×3 cell array
'[01,0001]' 'Simple' 'Data'
ans =
1×3 cell array
'[02,1001]' 'More' 'Data'
ans =
1×3 cell array
'[03,9876]' 'Even' 'More'
ans =
1×3 cell array
'[04,1234]' 'Really' 'More'
So you can see that it has the correct data in it -- but the data is laid out differently which is breaking on the legacy code. So my question is how can I make the 'new' data be laid out like this as it was coming out of the 'textscan' logic:
data =
1×3 cell array
{4×1 cell} {4×1 cell} {4×1 cell}
Answers (1)
Pujitha Narra
on 9 Oct 2019
Hi Mark,
This could be a possible workaround for your requirement:
data = textscan(fid,'%s%s%s%s%*s','Delimiter',',');
data{1}=strcat(data{1},{','},data{2}); % to retrieve the ID in its original format
>> data{:}
ans =
4×1 cell array
{'[01,0001]'}
{'[02,1001]'}
{'[03,9876]'}
{'[04,1234]'}
ans =
4×1 cell array
{'0001]'}
{'1001]'}
{'9876]'}
{'1234]'}
ans =
4×1 cell array
{'Simple'}
{'More' }
{'Even' }
{'Really'}
ans =
4×1 cell array
{'Data'}
{'Data'}
{'More'}
{'More'}
Hope this helps!
This question is closed.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!