Search string for special characters

Hi all,
I have a program that saves and loads data to and from a .mat file. Currently the UIPUTFILE and save function method I am using allows the user to save files with special characters. They then cannot load this file again. The most common issue is users saving the file with a period.
How can I search the string for special characters? I want to throw up an error message if the user tries to save the file with a filename that MATLAB will not be able to load with UIGETFILE and the load function.
I have tried regexp but I am struggling to make it do what I want, even using backslashes in front of the characters in places.
if ~isempty(regexp(filename, '[/\*:?"<>|!]'))
    uiwait(msgbox('Filename contains illegal characters.', 'Filename Error', 'error', 'modal'));
else
    save(SavePath, 'DataStructure');
end
Thanks,
Matt
  3 Comments
Matt on 8 Sep 2017
Thanks for that. I have looked into this more deeply. I think the problem with periods in file names is caused by the way my software constructs the path for the load command:
handles.ChosenTPath=fullfile('C:\Sim\UserFiles\Circ',handles.TSelection);
This leads to:
>> Unable to read file 'C:\Sim\UserFiles\Circ\a.4'. No such file or directory.
This happens when trying to load a data file saved as "a.4.mat", for example.
This explains why users were managing to save the data file, but not load it when the program runs, or when they try to edit it.
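One possible way around this, sketched under the assumption that handles.TSelection holds the file name without its '.mat' extension: load only appends '.mat' when the name has no extension at all, so a name like 'a.4' is read as having the extension '.4'. Appending '.mat' explicitly before calling fullfile avoids that. The variable names below are placeholders, not taken from the original program.

```matlab
% Sketch: build the load path with an explicit .mat extension.
folder = 'C:\Sim\UserFiles\Circ';    % folder from the example above
name   = 'a.4';                      % selected name, assumed to lack '.mat'

[~, ~, ext] = fileparts(name);       % ext is '.4' here, not '.mat'
if ~strcmpi(ext, '.mat')
    name = [name '.mat'];            % append the real extension explicitly
end

chosenPath = fullfile(folder, name); % path now ends in 'a.4.mat'
% S = load(chosenPath);              % load call left commented out in this sketch
```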
Milos Matovic on 11 Nov 2020
Matt, your code works for me, and it covers all the invalid file-name characters specified by Windows.
The only change I made was to add a double backslash: the backslash is an escape character, so the original regular expression did not account for it.
if ~isempty(regexp(filename, '[/\\*:?"<>|!]'))
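A small check of the difference between the two patterns, using a made-up name that contains a backslash. Inside the character class, '\*' is read as an escaped asterisk, so only the '\\' version matches a backslash itself.

```matlab
name = 'bad\name';   % hypothetical file name containing a backslash

% Original pattern: '\*' is an escaped asterisk, the backslash is not matched.
idxSingle = regexp(name, '[/\*:?"<>|!]', 'once');

% Corrected pattern: '\\' is a literal backslash, followed by a literal '*'.
idxDouble = regexp(name, '[/\\*:?"<>|!]', 'once');

disp(isempty(idxSingle));   % 1: no forbidden character reported
disp(idxDouble);            % 4: position of the backslash
```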


Accepted Answer

Stephen23 on 8 Sep 2017
Edited: Stephen23 on 8 Sep 2017
It is invariably easier to build a short list of permitted characters than a long (and most likely incomplete) list of forbidden characters (trust me, there are more characters out there than you would believe).
Adapt this to your list of "permitted characters":
>> rgx = '^[\w-]+\.[\w]+$';
>> regexp('okayname.txt',rgx)
ans = 1
>> regexp('bad()nam!e.txt',rgx)
ans = []
I know that it can be a challenge to create a working regular expression, so to help with this I wrote a tool, iregexp, which you can download from the MATLAB File Exchange (FEX).
It lets you try different parse and match string combinations, and shows regexp's outputs in real time as you type.
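As a rough illustration of how the whitelist idea could be applied before saving, here is a minimal sketch. It assumes the name must end in '.mat' and that only letters, digits, underscores and hyphens are permitted in the base name; the variable names and the message text are invented for the example.

```matlab
filename = 'my.data.mat';                  % hypothetical user input

% Split off the extension so the check applies to the base name only.
[~, base, ext] = fileparts(filename);      % base = 'my.data', ext = '.mat'

% Whitelist: the whole base name must consist of word characters or hyphens.
isValid = strcmpi(ext, '.mat') && ~isempty(regexp(base, '^[\w-]+$', 'once'));

if ~isValid
    msg = sprintf('"%s" may only contain letters, digits, underscores and hyphens.', base);
    disp(msg);
    % In a GUI this could instead be shown with:
    % uiwait(msgbox(msg, 'Filename Error', 'error', 'modal'));
end
```

Here the period inside 'my.data' fails the whitelist, so isValid is false and the message is displayed.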

More Answers (2)

Pal Szabo on 8 Sep 2017
Can't you use strrep? You can replace the special characters with something which works. https://uk.mathworks.com/help/matlab/ref/strrep.html
  1 Comment
Matt on 8 Sep 2017
Hi, thanks, but I need to identify them to provide an error message. I may then remove the illegal characters with strrep to pre-fill the name box on the UIPUTFILE window when the error message is dismissed.
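A sketch of the clean-up step described here, using regexprep rather than strrep so that every character outside a permitted set is removed in one call. The permitted set (word characters, space, hyphen) and the example name are assumptions made for illustration.

```matlab
name = 'T!e$st fi_le-1';                       % hypothetical base name, no extension

% Remove every character that is not a word character, a space or a hyphen.
cleanName = regexprep(name, '[^\w -]', '');    % 'Test fi_le-1'

% True when something was removed, i.e. an error message is warranted.
hadIllegalChars = ~strcmp(name, cleanName);
```

The cleaned name could then be offered as the default in the save dialog, for example with uiputfile('*.mat', 'Save as', [cleanName '.mat']).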



Matt on 8 Sep 2017
I think I have answered my own question.
This finds any character other than A-Z, a-z, 0-9, a space, a hyphen, or an underscore.
file = '?T!e[s$t%f.i _l-e_1.mat'
file_without_extension = file(1:(length(file)-4)) % to prevent removal of period before file extension
illegal_chars = regexp(file_without_extension,'[^\w \s \-]+','match')
  1 Comment
Stephen23 on 8 Sep 2017
Edited: Stephen23 on 8 Sep 2017
Some notes:
  • Do not do this, it is an unreliable and obfuscated way to remove a file extension:
file_without_extension = file(1:(length(file)-4))
Instead simply use fileparts: it is simpler and correct for any length extension.
  • \s matches all whitespace characters. Do you really want vertical tab in your filenames?
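Both notes can be sketched together: fileparts splits off the extension regardless of its length, and a literal space in the character class replaces \s so that tabs and other whitespace are reported as illegal. The pattern below is one possible choice, not a definitive list of permitted characters.

```matlab
file = '?T!e[s$t%f.i _l-e_1.mat';              % example name from the answer above

% fileparts splits at the last period, whatever the extension length.
[~, base, ext] = fileparts(file);              % base = '?T!e[s$t%f.i _l-e_1'

% Literal space instead of \s: only the ordinary space is permitted.
pattern = '[^\w \-]';
illegal_chars = regexp(base, pattern, 'match');

% A tab character is now reported as illegal as well.
tabIsIllegal = ~isempty(regexp(sprintf('a\tb'), pattern, 'once'));
```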
