How do I make the desired Regular Expression?
I am trying to use regexprep to remove certain parts of strings that meet a specific criterion. Here is the criterion in the best English I can think of:
Remove the ## if it occurs at the beginning of the first word, even if that word is indented by whitespace.
And here is what I have so far. It works for most of the test cases but fails on the last two:
% Test strings
str = {'##hello','h##ello',' ##hello','hello','####hello',...
    ' ####hello','h## ello','##','h ##ello','##hello ##hello'}';
% Match at beginning of word and look behind for whitespace
regs = '(?<!\S)\>(##)';
str2 = regexprep(str,regs,'');
% What I actually want
str3 = {'hello','h##ello',' hello','hello','##hello',...
    ' ##hello','h## ello','','h ##ello','hello ##hello'}';
% Pretty visualization
ds = dataset(str,str2,str3,'VarNames',{'Input','Actual_Results','Wanted_Results'})
So it works for every case where no word precedes the ##. However, if there is text before the whitespace, the lookbehind does not detect it, since it only checks the single character immediately before the ##. I also can't simply anchor the match to the beginning of the line, because the ## may follow leading whitespace of any length, and that whitespace counts as part of the line.
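One possible way around both problems, offered as a sketch and not necessarily the approach from the accepted answer, is to anchor at the start of the string, capture the leading whitespace in a token, and write that token back in the replacement. This assumes the default anchor behaviour of regexprep, where ^ matches the start of each string (not of each line), and the variable name str4 is made up for illustration:

```matlab
% Test strings from the question
str = {'##hello','h##ello',' ##hello','hello','####hello',...
    ' ####hello','h## ello','##','h ##ello','##hello ##hello'}';

% ^      anchor at the start of each string
% (\s*)  capture any leading whitespace (possibly none) as token 1
% ##     the two characters to remove
regs = '^(\s*)##';

% Put the captured whitespace back, dropping only the ##
str4 = regexprep(str, regs, '$1');

% Show input and result side by side
disp([str, str4])
```

Because the pattern is anchored with ^, it can match at most once per string, so a later ## (as in '##hello ##hello') or a ## that follows other text (as in 'h ##ello') is left alone.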
1 Comment
Sean de Wolski
on 16 Oct 2012