Multiple named tokens in a regexp

1 view (last 30 days)
Joseph
Joseph on 6 Dec 2013
Edited: Cedric on 28 Dec 2013
Hi all,
I am having a little trouble finding out if you can have named tokens that are optional in a regular expression.
My goal is to seperate a string such as:
'CO(g) [atm] = 1000'
where it has the form
'Parameter [Units] = Value '
However, some Parameter don't have Units
'name = carbon'
If I have a regular expression pattern
'(?<Parameter>[^\[=]+)\s?\[(?<Units>[^\]]+)\]\s?=\s?(?<Value>.*)'
this will only work if all three named tokens are present.
Is there a way of modifying this to make it capture either Parameters,Units,Value or Parameters,Value? I tried to use a none capturing grouping '(?:\[?<Units>[^\]]+)\])?' but that doesn't seem to work right.
Basically, can there be optionally captured Named Tokens? If so, how do you construct the regular expression.
UPDATE:
I used:
(?<parameter>[^\[=]+)\s?\[(?<units>[^\]]+)\]?\s?=\s?(?<value>.*)|(?<parameter>[^\[=]+)\s?=\s?(?<value>.*)
So,
'co [k] = 5'
Parameter = 'co '
Units = 'k'
Value = '5'
And,
'co = 5'
Parameter = 'co '
Units = []
Value = '5'
However, the regular expressions looks very unelegant due to the redundance after the '|'. Anyone have any suggestions how to make it look better?
  1 Comment
Cedric
Cedric on 28 Dec 2013
Edited: Cedric on 28 Dec 2013
The following is a bit simpler, but I wouldn't call it "more elegant" ..
pattern = '(?<parameter>\S+) \[?(?<unit>[^=\]]*)\]?\s*=\s*(?<value>\w+)' ;

Sign in to comment.

Answers (0)

Categories

Find more on Characters and Strings in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!