Reading in a text file with numbers and strings in

I have a .txt which I can read into a matrix fine.
I currently do this as follows:
FILENAME = 'inputData.csv';
sondeData = readmatrix(FILENAME,'numheaderlines',1); %ignore first line as this is heading
The text file contains mostly numerical values, along with some parameters that are presented as strings.
In those categories the data is "GREEN" or "RED", which is essentially a pass / fail status.
A sample of the data is below
inputData =
GREEN GREEN 11.88 876 1332 GREEN 758 GREEN 4.006 19.08 5.09265 GREEN
GREEN GREEN 12.09 863 1519 GREEN 761 GREEN 4.006 19.08 5.070016 GREEN
When I read the file in I get NaNs in all the columns that contain the RED / GREEN.
What I want to do is clean the data up - so remove any rows where the value in a certain column is RED. This means something has failed and so data is junk.
How can I load the data in such that I can do something like the below, to remove the junk rows
inputData(inputData(:, 2)== "RED", :)= []
I am not able to change how the inputData is provided to me, but I could do an intermediary step to turn those into booleans along the lines of GREEN = 1, RED = 0 if I had to.

Accepted Answer

Star Strider on 18 Oct 2022
Edited: Star Strider on 18 Oct 2022
It would help to have the file.
The readtable function would likely be best for this, since it will also import the variable names (likely the first line of the file).
T1 = cell2table({'GREEN' 'GREEN' 11.88 876 1332 'GREEN' 758 'GREEN' 4.006 19.08 5.09265 'GREEN';
'RED' 'RED' rand rand rand 'RED' rand 'RED' rand rand rand 'RED';
'GREEN' 'GREEN' 12.09 863 1519 'GREEN' 761 'GREEN' 4.006 19.08 5.070016 'GREEN'})
T1 = 3×12 table
      Var1         Var2        Var3       Var4      Var5        Var6        Var7        Var8        Var9       Var10      Var11       Var12
    _________    _________    _______    ______    _______    _________    _______    _________    _______    _______    _______    _________
    {'GREEN'}    {'GREEN'}      11.88       876       1332    {'GREEN'}        758    {'GREEN'}      4.006      19.08     5.0926    {'GREEN'}
    {'RED'  }    {'RED'  }    0.59669    0.9725    0.41824    {'RED'  }    0.18383    {'RED'  }    0.48095    0.27379    0.99147    {'RED'  }
    {'GREEN'}    {'GREEN'}      12.09       863       1519    {'GREEN'}        761    {'GREEN'}      4.006      19.08       5.07    {'GREEN'}
Lv = any(strcmpi(T1{:,[1 2 6 8 12]},'RED'),2) % Test Variables
Lv = 3×1 logical array
   0
   1
   0
T1 = T1(~Lv,:)
T1 = 2×12 table
      Var1         Var2       Var3     Var4    Var5      Var6       Var7      Var8       Var9     Var10    Var11       Var12
    _________    _________    _____    ____    ____    _________    ____    _________    _____    _____    ______    _________
    {'GREEN'}    {'GREEN'}    11.88     876    1332    {'GREEN'}     758    {'GREEN'}    4.006    19.08    5.0926    {'GREEN'}
    {'GREEN'}    {'GREEN'}    12.09     863    1519    {'GREEN'}     761    {'GREEN'}    4.006    19.08      5.07    {'GREEN'}
Experiment with the actual data.
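Applied to a file rather than a hand-built table, the same idea might look like the sketch below. It writes a small stand-in file first so that it runs on its own; the file name, the header names and the space delimiter are assumptions, not details of the real data.

```matlab
% Write a small stand-in file (hypothetical name and headers) so the sketch is self-contained.
fname = fullfile(tempdir, 'sondeDemo.txt');
fid = fopen(fname, 'w');
fprintf(fid, 'StatusA StatusB Temp Count\n');
fprintf(fid, 'GREEN GREEN 11.88 876\n');
fprintf(fid, 'GREEN RED 12.01 870\n');
fprintf(fid, 'GREEN GREEN 12.09 863\n');
fclose(fid);

% readtable keeps the text columns as text instead of turning them into NaN.
T = readtable(fname, 'Delimiter', ' ', 'ReadVariableNames', true);

% Logical index of the rows where the second column is RED, then delete those rows.
isBad = strcmpi(T{:, 2}, 'RED');
T(isBad, :) = [];

delete(fname);   % remove the stand-in file again
```

With the real file, only the readtable call and the column number in the test should need adjusting.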
EDIT — (18 Oct 2022 at 16:26)
Use curly braces for the table reference here, so that the result is the numeric value rather than a table:
a = inputData{1,3};
b = a^2;
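As a rough illustration of the difference between the indexing styles, the sketch below uses a two-row stand-in table. The variable names Status and Temp are invented for the example and will differ in the real data.

```matlab
% Small stand-in table; the variable names are made up for this sketch.
inputData = table({'GREEN'; 'GREEN'}, [11.88; 12.09], ...
    'VariableNames', {'Status', 'Temp'});

sub = inputData(1, 2);    % parentheses return a 1x1 table, not a number
a   = inputData{1, 2};    % curly braces return the underlying double
b   = a^2;                % ordinary arithmetic works on the double

col = inputData.Temp;     % dot indexing returns the whole column as a vector
colSquared = col.^2;      % element-wise maths on that column
```

Dot indexing by variable name is often the more readable choice once the column names are known.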

More Answers (1)

Les Beckham on 18 Oct 2022
Try using readtable instead of readmatrix. This will read in the strings properly and allow you to do your cleanup based on the values of those strings.
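One possible shape for that cleanup, including the optional GREEN = 1 / RED = 0 step mentioned in the question, is sketched below. The table is a stand-in for what readtable would return, and the variable names Status, Temp and Count are assumptions.

```matlab
% Stand-in for the table readtable would return; the variable names are invented here.
T = table({'GREEN'; 'RED'; 'GREEN'}, [11.88; 0.6; 12.09], [876; 1; 863], ...
    'VariableNames', {'Status', 'Temp', 'Count'});

% Optional intermediate step: GREEN becomes true, RED becomes false.
T.Status = strcmpi(T.Status, 'GREEN');

% Keep only the rows that passed.
T = T(T.Status, :);

% Pull the numeric columns out as a plain matrix for further maths.
M = T{:, {'Temp', 'Count'}};
meanTemp = mean(M(:, 1));
```

Once the numeric columns are in a matrix such as M, they behave just like the output of readmatrix.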
  1 Comment
Richard Nash on 18 Oct 2022
I looked at that, but when I tried it I couldn't seem to then do mathematical functions using the numerical data.
So if I read the example data in as a table and do the following:
a = inputData(1,3);
b = a^2;
I get an error "Operator '.^' is not supported for operands of type 'table'."

Release

R2021b
