Returning negative numbers from text file using regexp

Hi! I'm writing code that takes a text file containing three separate variables of numbers: time, x-displacement and y-displacement.
The three variables are all surrounded by unique spacers like '&&*' and ',,,*'. I've used fileread and regexp to read the file and put all the numbers into an array. The only issue is that they all become positive, when some of the y-displacement values need to be negative.
experiment = ('(?<=*[^0-9]*)[0-9]*\.?[0-9]+');
inputText = str2double(regexp(dataFile,experiment, 'match'))
Thanks for any help
Jack

Accepted Answer

Johannes Fischer on 19 Sep 2019
With questions like these it's very helpful to have an example of what exactly your data looks like. If you know exactly what your spacers look like, you could also try strsplit (https://www.mathworks.com/help/matlab/ref/strsplit.html?searchHighlight=strsplit&s_tid=doc_srchtitle), which can detect multiple delimiters.
If you want to use regular expressions, add '\-' inside your brackets to also match the minus sign:
(?<=[^0-9])([0-9\-]*)\.*([0-9\-]+)
I tested it with &&-1234...-4562 at https://regex101.com/, which reported an error for the '*' inside the lookbehind of your original pattern.
str = '&&-1234...4562';
vals = cell2mat(cellfun(@str2double, regexp(str,'(?<=[^0-9])([0-9\-]*)\.*([0-9\-]+)', 'tokens'), 'UniformOutput', false))
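As an alternative to putting '\-' inside the brackets, the minus sign can be made an optional prefix of the number. The following is only a minimal sketch: the sample line and the variable name `pattern` are illustrative, and it assumes a '-' never appears as a separator directly in front of a digit.

```matlab
% Illustrative sample line in the delimiter style described in the question
str = '&&*0.0000&&,,* 5.6054*,* -1.1018&*,,,&';

% -?   optional minus sign
% \d*  optional integer digits
% \.?  optional decimal point
% \d+  at least one digit
pattern = '-?\d*\.?\d+';

% 'match' returns a cell array of char vectors, which str2double
% converts to a numeric row vector
vals = str2double(regexp(str, pattern, 'match'));
```

Because the sign is matched only once and only at the start of a number, a stray '-' in the middle of a digit run cannot become part of the match, which the bracket version would allow.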
  2 Comments
Jack Wheatley on 21 Sep 2019
Edited: Jack Wheatley on 22 Sep 2019
Sorry for the lack of info but your answer was actually a massive help! It works!!
&&*0.0000&&,,* 5.6054*,* -1.1018&*,,,&
&&* &&,,* *,* &*,,,&
&&*0.0100&&,,* 5.5384*,* -1.0429&*,,,&
&&*0.0200&&,,* 5.4716*,* -0.9850&*,,,&
&&*0.0300&&,,* 5.4049*,* -0.9283&*,,,&
&&*0.0400&&,,* 5.3384*,* -0.8727&*,,,&
This is an example of the text I was moving into arrays. The experiment variable I had earlier returned all the values, but all of them positive.
experiment = ('(?<=*[^0-9]*)[0-9\-?]*\.?[0-9]+');
inputText = str2double(regexp(dataFile,experiment, 'match'))
I added the \- like you suggested and it's returning
Columns 1 through 7
0 5.6054 -1.1018 0.0100 5.5384 -1.0429 0.0200
Columns 8 through 14
5.4716 -0.9850 0.0300 5.4049 -0.9283 0.0400 5.3384
Columns 15 through 21
-0.8727 0.0500 5.2720 -0.8182 0.0600 5.2058 -0.7648
Thanks for all your help mate
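The flat row vector shown above could be reshaped into one column per variable. This is only a sketch: it assumes the values always arrive in complete triples in the order time, x-displacement, y-displacement (the order given in the question), and the names `data`, `t`, `x` and `y` are illustrative.

```matlab
% First nine values of the row vector shown above
inputText = [0 5.6054 -1.1018 0.0100 5.5384 -1.0429 0.0200 5.4716 -0.9850];

% reshape fills column by column, so build a 3-by-N matrix first
% and transpose it to get one row per sample
data = reshape(inputText, 3, []).';

t = data(:, 1);   % time
x = data(:, 2);   % x-displacement
y = data(:, 3);   % y-displacement
```

If the number of values is not a multiple of three, reshape raises an error, which is a useful early warning that a line was only partially matched.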
Johannes Fischer on 23 Sep 2019
Edited: Johannes Fischer on 23 Sep 2019
Depending on how your code is used or shared and how permanent this style of data saving is (it looks a bit 'unusual', I have to say), you might want to use strsplit in this case. That would make your code much more readable for somebody not familiar with regexp, and if your delimiters never change, or change only slightly, debugging would be easier. I cannot tell you, however, which way is faster. This is how it could work:
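a = '&&*0.0000&&,,* 5.6054*,* -1.1018&*,,,&';
% Split on all four delimiters; strsplit interprets '\t' as a tab character
substr = strsplit(a, {'&&*', '&&,,*\t', '*,*\t', '&*,,,&'});
% Elements 2 to 4 hold the three numeric fields
data = cellfun(@str2double, substr(2:4));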
The '\t' entries represent the tab stops that seem to be present in your delimiters, but they are not necessary, as str2double just ignores them.
If this code is just for you and you don't want your regexp skills to get rusty, go for it :)
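The strsplit idea above could be extended to several lines at once. The sketch below leaves '\t' out of the delimiters, so it does not matter whether the gap after a delimiter is a tab or a space. The cell array `lines` and the names `delims` and `parts` are illustrative, and the code assumes every line contains all four delimiters.

```matlab
% Two sample lines in the format posted above
lines = { ...
    '&&*0.0000&&,,* 5.6054*,* -1.1018&*,,,&', ...
    '&&*0.0100&&,,* 5.5384*,* -1.0429&*,,,&'};

% Delimiters without '\t'; leading whitespace is left for str2double
delims = {'&&*', '&&,,*', '*,*', '&*,,,&'};

data = zeros(numel(lines), 3);
for k = 1:numel(lines)
    % The first and last parts are empty because each line
    % starts and ends with a delimiter
    parts = strsplit(lines{k}, delims);
    data(k, :) = str2double(parts(2:4));
end
```

Fields that contain only whitespace come back from str2double as NaN, so incomplete rows can be dropped afterwards, for example with `data(~any(isnan(data), 2), :)`.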

Release

R2019b