Extract number after specific words

Rachele Franceschini on 19 Aug 2021
Edited: DGM on 27 Aug 2021
I have an Excel file containing text and numbers. A single cell holds text together with latitude and longitude values. Is there a way to extract the numbers that follow specific words?
For example: extract the number after the word "latitude" and the number after the word "longitude".
  2 Comments
Kevin Holly on 19 Aug 2021
Rachele,
Is the text consistent?
If the string is "latitude30" or "latitude3", for instance, you could use sscanf.
str = "latitude30";
sscanf(str,'latitude%f')
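If the number is written with a decimal comma (for example "latitude 45,0000"), %f stops reading at the comma. A minimal sketch of one way around that, assuming the comma is always a decimal separator and never a thousands separator; the sample string is hypothetical:

```matlab
% Hypothetical input: keyword, a space, then a number with a decimal comma
str = 'latitude 45,0000';
% Swap the comma for a dot so that %f can parse the full value
val = sscanf(strrep(str,',','.'),'latitude %f')
```

This only works when the string starts with the keyword; text in front of it would make sscanf return an empty result.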
Rachele Franceschini on 19 Aug 2021
For example, the column is named "entities".
Its cells contain text that also includes latitude 45,0000 and longitude 2,00000.
I would like to get the numbers that follow those words.


Accepted Answer

DGM on 19 Aug 2021
Edited: DGM on 19 Aug 2021
Something like this?
C = {'latitude 45,0000','longitude 2,0000';
'latitude 47,5000','longitude 5,0000';
'latitude 50,8000','longitude 10,0000'};
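% look behind for either keyword; [latitude |longitude ] would be a character class, not an alternation
D = regexp(C,'(?:(?<=latitude )|(?<=longitude ))\d+,?\d*','match');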
D = str2double(reshape(strrep(vertcat(D{:}),',','.'),size(C)))
D = 3×2
   45.0000    2.0000
   47.5000    5.0000
   50.8000   10.0000
That's assuming that all the lines are formatted the same and that the comma is the decimal separator.
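If the keyword itself matters (say, latitude and longitude can appear in either order), capturing it as a token next to the number is one option. This is only a sketch under the same decimal-comma assumption; the sample string and the variable names tok, vals, lat and lon are made up for illustration:

```matlab
% Hypothetical cell content with both values in one string
str = 'aaa bbbb cc, latitude 46,50000 longitude 2,25000';
% Token 1 is the keyword, token 2 is the number text that follows it
tok = regexp(str,'(latitude|longitude)\s+(\d+[,.]?\d*)','tokens');
tok = vertcat(tok{:});                          % N-by-2 cell: {keyword, number text}
vals = str2double(strrep(tok(:,2),',','.'));    % decimal comma -> dot, then convert
lat = vals(strcmp(tok(:,1),'latitude'))
lon = vals(strcmp(tok(:,1),'longitude'))
```

Selecting by keyword means the result does not depend on which value comes first in the text.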
  2 Comments
Rachele Franceschini on 20 Aug 2021
Edited: Rachele Franceschini on 20 Aug 2021
Thank you, but the lines do not all have the same format, and the decimal separator is sometimes a comma and sometimes a dot.
What can I do?
For example, in this case:
C = {aa bbb cccccc dddddd, latitude 45,0000 longitude 2,0000;
aaa bbbb cc, latitude 46,00000 longitude 2,00000;
aaaaa bbbbb cccc ddd eeee fffffff, latitude 46,00000 longitude 2,00000 latitude 49,00000 longitude 9,00000}
DGM on 20 Aug 2021
Edited: DGM on 27 Aug 2021
Without formatting, this is ambiguous.
C = {aa bbb cccccc dddddd, latitude 45,0000 longitude 2,0000;
aaa bbbb cc, latitude 46,00000 longitude 2,00000;
aaaaa bbbbb cccc ddd eeee fffffff, latitude 46,00000 longitude 2,00000 latitude 49,00000 longitude 9,00000}
I'm going to assume it's just a 3x1 array with a double entry on the third line.
% extra prefix chars per line
% mixed comma/dot decimal sep
% multiple lat/lon per row
C = { 'aa bbb cccccc dddddd, latitude 45.0500 longitude 2.0500';
'aaa bbbb cc, latitude 46,00000 longitude 2,00000';
'aaaaa bbbbb cccc ddd eeee fffffff, latitude 46,00000 longitude 2,00000 latitude 49,00000 longitude 9,00000'};
% look behind for either keyword; [,.] accepts a comma or a dot as the decimal separator
D = regexp(C,'(?:(?<=latitude )|(?<=longitude ))\d+[,.]?\d*','match');
D = cellfun(@(x) reshape(x.',2,[]).',D,'uniform',false);
D = str2double(reshape(strrep(vertcat(D{:}),',','.'),[],2))
D = 4×2
   45.0500    2.0500
   46.0000    2.0000
   46.0000    2.0000
   49.0000    9.0000
Note that there are more rows in D than in C, since some lines have multiple entries.
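Because the output can have more rows than the input, it may help to record which input row each pair came from. A rough sketch, assuming every latitude is immediately followed by its longitude; the sample data and the names M, nPairs, rowIdx, P and T are illustrative only:

```matlab
% Hypothetical two-row input; the second row holds two lat/lon pairs
C = {'aa bbb, latitude 45.0500 longitude 2.0500';
     'aaa, latitude 46,00000 longitude 2,00000 latitude 49,00000 longitude 9,00000'};
M = regexp(C,'(?:(?<=latitude )|(?<=longitude ))\d+[,.]?\d*','match');
nPairs = cellfun(@numel,M)/2;                  % number of lat/lon pairs in each row of C
rowIdx = repelem((1:numel(C)).',nPairs);       % source row index for every pair
P = cellfun(@(x) reshape(x,2,[]).',M,'UniformOutput',false);  % one pair per row
D = str2double(strrep(vertcat(P{:}),',','.')); % decimal comma -> dot, then convert
T = table(rowIdx,D(:,1),D(:,2),'VariableNames',{'row','latitude','longitude'})
```

The row column can then be used to join the coordinates back to the other columns of the spreadsheet.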


Release

R2021a
