Multiple named tokens in a regexp
Hi all,
I am having a little trouble finding out if you can have named tokens that are optional in a regular expression.
My goal is to separate a string such as:
'CO(g) [atm] = 1000'
where it has the form
'Parameter [Units] = Value '
However, some parameters don't have units:
'name = carbon'
If I have a regular expression pattern
'(?<Parameter>[^\[=]+)\s?\[(?<Units>[^\]]+)\]\s?=\s?(?<Value>.*)'
this will only work if all three named tokens are present.
Is there a way of modifying this so that it captures either Parameter, Units, Value or just Parameter, Value? I tried to use a non-capturing group '(?:\[?<Units>[^\]]+)\])?' but that doesn't seem to work right.
Basically, can named tokens be optionally captured? If so, how do you construct the regular expression?
UPDATE:
I used:
(?<parameter>[^\[=]+)\s?\[(?<units>[^\]]+)\]?\s?=\s?(?<value>.*)|(?<parameter>[^\[=]+)\s?=\s?(?<value>.*)
So,
'co [k] = 5'
Parameter = 'co '
Units = 'k'
Value = '5'
And,
'co = 5'
Parameter = 'co '
Units = []
Value = '5'
However, the regular expression looks inelegant because of the redundancy after the '|'. Does anyone have suggestions for making it cleaner?
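One possible way to avoid the duplicated alternative, sketched here on the assumption that the pattern is passed to MATLAB's regexp with the 'names' option (the post does not show the actual call): wrap the brackets and the Units token together in a single optional non-capturing group, (?:\[(?<Units>[^\]]+)\])?. The non-capturing attempt quoted in the question is missing the opening parenthesis of the named token, which is probably why it misbehaved. The lazy +? and the \s* (instead of \s?) are my own additions to keep trailing blanks out of Parameter, and the variable names expr, inputs, and tok are only illustrative.

```matlab
% Sketch: make the Units token optional with an optional non-capturing group.
expr = ['(?<Parameter>[^\[=]+?)\s*' ...   % name; lazy so trailing blanks are left to \s*
        '(?:\[(?<Units>[^\]]+)\])?' ...   % optional "[units]" block around the named token
        '\s*=\s*(?<Value>.*)'];           % "=" followed by the value

inputs = {'CO(g) [atm] = 1000', 'name = carbon'};   % sample strings from the question

for k = 1:numel(inputs)
    tok = regexp(inputs{k}, expr, 'names', 'once');
    if isempty(tok.Units)
        % The optional group did not take part in the match
        fprintf('Parameter=%s, Value=%s (no units)\n', tok.Parameter, tok.Value);
    else
        fprintf('Parameter=%s, Units=%s, Value=%s\n', tok.Parameter, tok.Units, tok.Value);
    end
end
```

With this form the Units field should still exist in the output struct when the brackets are absent, just empty, so isempty(tok.Units) can be used to tell the two cases apart. Note also that the pattern in the update uses \]?, which makes the closing bracket optional; the sketch requires both brackets whenever units are present.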