Why are spaces ignored when a character string is scanned?
So I have a file titled text.txt which contains the string, 'Hi There!'. Now, when I use the following code to extract the string into my variable, data, MATLAB seems to ignore the space in the string.
%% Code Snippet
fileName = 'text.txt'; %name of file containing the data to be sent
formatSpec = '%s'; %data type
fileID = fopen(fileName,'r');
% data = textscan(fileID,formatSpec);
data = fscanf(fileID,formatSpec);
fclose(fileID);
%%
So, when I try to check the contents of 'data' in the command prompt:
>> data
data =
'HiThere!'
How do I make MATLAB scan the spaces in my character string?
Accepted Answer
Adam Danz
on 5 Aug 2019
Edited: Adam Danz
on 5 Aug 2019
"How do I make MATLAB scan the spaces in my character string?"
Use the character format specifier, %c, which reads every character, including white space:
formatSpec = '%c'; %data type
Alternatively,
data = fileread('text.txt');
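As a minimal sketch of the difference, the snippet below writes the text to a scratch file and reads it back three ways. The file name built with tempname and the variable names noSpaces, withSpaces, and wholeFile are only illustrative; they are not from the original question.

```matlab
% Sketch: compare '%s', '%c', and fileread on a scratch file.
tmpFile = [tempname '.txt'];        % hypothetical scratch file
fid = fopen(tmpFile, 'w');
fprintf(fid, 'Hi There!');          % no trailing newline in this example
fclose(fid);

fid = fopen(tmpFile, 'r');
noSpaces = fscanf(fid, '%s');       % '%s' skips white space between fields
frewind(fid);                       % go back to the start of the file
withSpaces = fscanf(fid, '%c');     % '%c' keeps every character
fclose(fid);

wholeFile = fileread(tmpFile);      % whole file as one character vector
delete(tmpFile);

disp(noSpaces)
disp(withSpaces)
disp(wholeFile)
```

Both '%c' and fileread keep the space; fileread also saves the explicit fopen/fclose pair when the whole file is wanted as text.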
7 Comments
dpb
on 5 Aug 2019
Note in particular (and I did it on purpose :) ) that '%c' even returns the \n -- the trailing single quote marking the end of the cellstr is displayed on the following line. That behavior is unlikely to be what is wanted in most cases, either.
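If the trailing newline is not wanted, one option (a sketch, not the only way) is to trim the result afterwards. The variable names raw and clean are illustrative, and sscanf on an in-memory string stands in for reading the file.

```matlab
% Sketch: '%c' keeps the line terminator, so trim it afterwards.
raw = sscanf(sprintf('Hi There!\n'), '%c');   % newline is part of the result
clean = strtrim(raw);                         % removes leading/trailing white space

fprintf('raw has %d characters, clean has %d\n', numel(raw), numel(clean));
```

Note that strtrim also removes leading and trailing spaces, not just the newline, so it is only appropriate when those are not significant.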
More Answers (2)
David K.
on 5 Aug 2019
This is because of the way fscanf reads the text in. If you go to the linked documentation and scroll down to Character Fields, it notes that %s reads all characters excluding spaces. It then says to use %c, here together with a size argument, in order to read in spaces as well. That would look like this:
formatSpec = '%c';
sizeA = 20; % Some value larger than the number of characters in your file
data = fscanf(fileID,formatSpec,sizeA);
dpb
on 5 Aug 2019
WAD (Working As Documented). From the documentation--
....
This table lists available conversion specifiers for character inputs.
Character Field Type | Conversion Specifier | Description
Character vector or string scalar | %s | Read the text until fscanf encounters white space.
...
If you keep reading to the Output section, you'll see that when the formatSpec contains only %s or %c, the return value is a character array (as you got). MATLAB "vectorizes" the formatSpec, applying it as many times as it can to the file, and so, with %s, returns all the characters it finds other than white space in one long character array.
So, the answer to the Q? of "Why?" is "Because it's supposed to."
You can use textscan which will be somewhat different...
>> str=textscan('Hi There','%s')
str =
1×1 cell array
{2×1 cell}
>>
The whitespace is still gone, but you do have the two substrings and can reinsert whitespace if needed, where needed.
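A minimal sketch of reinserting the whitespace, assuming a single space between words is acceptable (the original spacing is not recoverable from the textscan output). The names parts, words, and rejoined are illustrative.

```matlab
% Sketch: split on white space with textscan, then rejoin with one space.
parts = textscan('Hi There', '%s');   % 1x1 cell holding a 2x1 cell array
words = parts{1};                     % the two substrings, one per cell
rejoined = strjoin(words.', ' ');     % transpose to a row, join with a space

disp(rejoined)
```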
Or, you can use quoted strings and '%q' with textscan (but not with fscanf or sscanf; they don't understand it):
>> str=textscan('"Hi There"','%q')
str =
1×1 cell array
{1×1 cell}
>> str{:}
ans =
1×1 cell array
{'Hi There'}
>>
Or, in the event the string is the only data in the given record, one could just use fgetl, but that is useful only in that specific circumstance, which isn't generally the case.
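For that single-record case, a sketch with fgetl might look like the following. The scratch file name and the variable firstLine are illustrative assumptions, not part of the original question.

```matlab
% Sketch: read one whole line, spaces included, with fgetl.
tmpFile = [tempname '.txt'];        % hypothetical scratch file
fid = fopen(tmpFile, 'w');
fprintf(fid, 'Hi There!\n');
fclose(fid);

fid = fopen(tmpFile, 'r');
firstLine = fgetl(fid);             % returns the line without its newline
fclose(fid);
delete(tmpFile);

disp(firstLine)
```

Unlike '%c', fgetl drops the line terminator, so no trimming is needed afterwards.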
C is, in general, pretty relentless in "eating up" whitespace wherever it can find it. That helps sometimes, but it can also definitely get in the way on occasion.