How can I sort my data from regexp?

Question

Linus Dock on 14 Oct 2016

Direct link to this question

https://ch.mathworks.com/matlabcentral/answers/307295-how-can-i-sort-my-data-from-regexp

Edited: Guillaume on 14 Oct 2016

Accepted Answer: Guillaume

Hi, I have a problem when using regexp with this command:

RVRtmp=regexp(TXTmod,'R\d\d\w\/\w*\d\d\d\D\>','match')

The output cell is mostly empty and looks like this:

[]
[]
[]
[]
[]
[]
[]
<1x4 cell>
<1x4 cell>
<1x4 cell>
<1x4 cell>
<1x4 cell>
[]

I would like to obtain the information in the <1x4 cell> entries. The contents of one of these cells look like this:

'R01L/P1500N' 'R19R/0900VP1500N' 'R01R/0800V1400D' 'R19L/1000N'

Here I would like to obtain the runway designator 'R01L' as a variable or string and the corresponding value '1500' as a vector or cell. I'm having trouble extracting the data because my command does not work on the empty cells:

RVR1=regexp(RVRtmp{1072}{1},'\d{4}','match')

I would like to arrange the data like this:

TXTmod looks like this:

'METAR ESSA 200901220720Z 03003KT 1500 R01L/P1500N R19R/P1500N R01R/0700N R19L/0800V1000N BR VV002 M00/M00 Q1006 01710173 08710164 51710170 TEMPO 2000'
'METAR ESSA 200901220750Z 04003KT 020V090 1500 R01L/P1500N R19R/P1500N R01R/0800V1000N R19L/0900N BR VV002 M00/M00 Q1006 01710173 08710164 51710170 TEMPO 2000'
'METAR ESSA 200901220820Z 02003KT 320V100 1000 R01L/P1500N R19R/0900VP1500N R01R/0800V1400D R19L/1000N BR VV002 M00/M00 Q1006 01710173 08710164 51710170 TEMPO 2000'
'METAR ESSA 200901220850Z 06004KT 0900 R01L/P1500N R19R/1100V1500U R01R/1000V1400N R19L/1200N FZFG VV002 M00/M00 Q1006 01710173 08710164 51710170 TEMPO 0700'
'METAR ESSA 200901220920Z 04003KT 360V060 1000 R01L/P1500N R19R/1200U R01R/0700N R19L/1000VP1500N BR VV002 M00/M00 Q1006 01710173 08710164 51710170 TEMPO 1500'
'METAR ESSA 200901220950Z 04004KT 1500 BR VV002 M00/M00 Q1005 01710173 08710164 51710170 NOSIG'
'METAR ESSA 200901221020Z 01003KT 1700 BR BKN002 BKN017 M00/M00 Q1005 01710173 08710164 51710170 NOSIG'
'METAR ESSA 200901221050Z 35004KT 2500 BKN002 BKN019 00/00 Q1004 01710173 08710164 51710170 NOSIG'


Answer 1

Guillaume on 14 Oct 2016

Direct link to this answer

https://ch.mathworks.com/matlabcentral/answers/307295-how-can-i-sort-my-data-from-regexp#answer_238956

Edited: Guillaume on 14 Oct 2016

There is no real need for the intermediate regexp; you can get it all with just one regular expression:

tokens = regexp(TXTmod, '(R\d\d\w)/\w*(\d\d\d\d)\D\>', 'tokens'); %You were missing a \d in your regexp (which was captured by the \w* so it didn't matter)

Or, more efficiently (but a bit longer):

tokens = regexp(TXTmod, '\<(R\d{2}[A-Z])/(?:(?:\d{4})?[A-Z]+)?(\d{4})[A-Z]\>', 'tokens')

Note the inefficiency in your original expression: the \w*\d\d\d is going to cause a lot of backtracking by the regular expression engine, because \w matches digits as well as letters, so the \w* always swallows the digits that the following \d are meant for. Because * is greedy, the engine first matches the whole group with \w* and then finds that nothing is left for the three \d. So it backtracks by one character and tries again, and again finds that it can't match all of the \d. It has to keep backtracking, one character at a time, until \w* has given up enough characters for each \d to match a digit and \D to match the final letter.
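To see the greedy behaviour described above in isolation, here is a minimal sketch on a single group taken from the question's data. The variable names are only illustrative; the two token patterns are the ones given in this answer.

```matlab
% \w matches letters, digits and underscore, so a greedy \w* takes the whole group.
greedy = regexp('P1500N', '\w*', 'match', 'once');          % 'P1500N'

% Restricting the prefix to letters leaves the digits for the \d that follow.
lettersOnly = regexp('P1500N', '[A-Z]*', 'match', 'once');  % 'P'

% Both patterns still extract the same tokens; only the amount of backtracking differs.
group = 'R19R/0900VP1500N';
t1 = regexp(group, '(R\d\d\w)/\w*(\d\d\d\d)\D\>', 'tokens', 'once');
t2 = regexp(group, '\<(R\d{2}[A-Z])/(?:(?:\d{4})?[A-Z]+)?(\d{4})[A-Z]\>', 'tokens', 'once');
sameResult = isequal(t1, t2);   % true
```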

The new regular expression matches an optional group of 4 digits followed by one or more letters, and then captures the final group of 4 digits before the last letter. I've also added a start-of-word match: \<.
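As a quick check of what the second expression captures, here is a small sketch run on one line copied from the TXTmod sample in the question. The variable names str and pattern are just for illustration.

```matlab
% One line taken from the TXTmod sample in the question.
str = 'METAR ESSA 200901220820Z 02003KT 320V100 1000 R01L/P1500N R19R/0900VP1500N R01R/0800V1400D R19L/1000N BR VV002 M00/M00 Q1006 01710173 08710164 51710170 TEMPO 2000';
pattern = '\<(R\d{2}[A-Z])/(?:(?:\d{4})?[A-Z]+)?(\d{4})[A-Z]\>';

% For a char vector input, 'tokens' returns a 1xN cell with one entry per match;
% each entry is itself a 1x2 cell: {runway, value}.
tokens = regexp(str, pattern, 'tokens');

for k = 1:numel(tokens)
    fprintf('%s -> %s\n', tokens{k}{1}, tokens{k}{2});
end
% Prints:
% R01L -> 1500
% R19R -> 1500
% R01R -> 1400
% R19L -> 1000
```

Note that for a group with two values, such as R01R/0800V1400D, only the last four-digit group (1400) is captured.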

Another note: to rearrange the tokens of each string into a two-column cell array:

cellfun(@(t) vertcat(t{:}), tokens, 'UniformOutput', false)
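From there, one possible way to get the runway names as strings and the values as a numeric vector, while skipping the reports that have no matches, is sketched below. The three input lines are shortened from the question's TXTmod, and the names pairs, hasRVR, first, runways and values are assumptions made for this example only.

```matlab
% Three lines shortened from the question's TXTmod; the last has no runway groups.
TXTmod = {'METAR ESSA 200901220720Z 03003KT 1500 R01L/P1500N R19R/P1500N R01R/0700N R19L/0800V1000N BR VV002'
          'METAR ESSA 200901220850Z 06004KT 0900 R01L/P1500N R19R/1100V1500U R01R/1000V1400N R19L/1200N FZFG VV002'
          'METAR ESSA 200901220950Z 04004KT 1500 BR VV002 M00/M00 Q1005'};

tokens = regexp(TXTmod, '\<(R\d{2}[A-Z])/(?:(?:\d{4})?[A-Z]+)?(\d{4})[A-Z]\>', 'tokens');
pairs  = cellfun(@(t) vertcat(t{:}), tokens, 'UniformOutput', false);

% Flag the reports that actually contain matches, so empty ones can be skipped.
hasRVR = ~cellfun(@isempty, pairs);

% Runway names and numeric values of the first report that has matches.
first   = pairs{find(hasRVR, 1)};
runways = first(:, 1);               % column cell array of char vectors
values  = str2double(first(:, 2));   % numeric column vector
```

Indexing with hasRVR (for example pairs(hasRVR)) keeps only the non-empty entries, which avoids the errors that come from indexing into an empty cell.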
