Find matched string in table

Hi,
To find data matching certain conditions in a table we use:
rows = (T.Smoker==true & T.Age<40);
What if the T.Smoker field were not a logical but text, either 'yes' or 'no'?
rows = (T.Smoker=='yes' & T.Age<40);
This latter code does not work. How can I make the condition match a certain string?
Thank you,
TD

Accepted Answer

Steven Lord on 9 Mar 2020
Compare the text data stored in the table with a string array. Let's build a sample using the example from the documentation for the table function.
load patients
patients = table(LastName,Gender,Age,Height,Weight,Smoker,Systolic,Diastolic);
Let's extract patients under 40, separated by gender.
malePatientsUnder40 = patients(patients.Gender == "Male" & patients.Age < 40, :)
femalePatientsUnder40 = patients(patients.Gender == "Female" & patients.Age < 40, :)
Let's check against the full list of patients under 40.
patientsUnder40 = patients(patients.Age < 40, :);
Combine the male and female patients into one larger table and compare it against the full list.
isequal(sortrows([malePatientsUnder40; femalePatientsUnder40]), sortrows(patientsUnder40))
  4 Comments
Steven Lord on 10 Mar 2020
If you build the patients table using the patients.mat file the Gender variable is actually a cellstr, a cell array containing char vectors. It's not a string array, though you could use the cellstr to make one.
When comparing char arrays with string or categorical arrays, we try to do "the nice thing" and treat the text more like "words" rather than "bunches of characters". See the documentation on comparing char and string arrays, and on comparing char and categorical arrays.
How would these work? Let's rebuild the patients table.
load patients
patients = table(LastName,Gender,Age,Height,Weight,Smoker,Systolic,Diastolic);
What type is the Gender variable in patients? It's a cellstr.
class(patients.Gender)
iscellstr(patients.Gender)
Since in this data set Gender only takes two values, we can store and display them as two categories using a categorical array. We can match a categorical array using a char vector, a string, or a categorical value. [I'm switching the Age threshold to 30 just so the filtered table arrays are shorter.]
patients.GenderCat = categorical(patients.Gender);
F1 = patients(patients.GenderCat == 'Female' & patients.Age < 30, :)
F2 = patients(patients.GenderCat == "Female" & patients.Age < 30, :)
FCategory = patients.GenderCat(3) % The first Female patient is in row 3
F3 = patients(patients.GenderCat == FCategory & patients.Age < 30, :)
Or you can turn that cellstr data into string.
patients.GenderStr = string(patients.Gender);
F4 = patients(patients.GenderStr == "Female" & patients.Age < 30, :)
F5 = patients(patients.GenderStr == 'Female' & patients.Age < 30, :)
The Finnish Rein Deer on 11 Mar 2020
Thank you so much! All clear.


More Answers (2)

Fangjun Jiang on 9 Mar 2020
isequal(T.Smoker, 'yes')
strcmpi(T.Smoker, 'yes')
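The two calls above behave quite differently on a text column. Here is a minimal sketch, assuming T.Smoker is a cellstr; the sample table is hypothetical and only mirrors the variable names from the question. isequal tests the whole column against the char vector and returns a single logical, whereas strcmp and strcmpi return one logical per row, which is what row indexing needs.

```matlab
% Hypothetical sample data; the names mirror the question, the values are made up.
Smoker = {'yes'; 'no'; 'Yes'; 'no'};
Age = [35; 28; 52; 39];
T = table(Smoker, Age);

% isequal compares the entire cell array with 'yes': a single false.
whole = isequal(T.Smoker, 'yes');

% strcmp compares row by row and is case-sensitive.
maskExact = strcmp(T.Smoker, 'yes');

% strcmpi compares row by row and ignores case, so 'Yes' also matches.
maskAnyCase = strcmpi(T.Smoker, 'yes');

% Combine the text mask with the numeric condition and index the table.
rows = maskAnyCase & T.Age < 40;
youngSmokers = T(rows, :);
```

Whether to use strcmp or strcmpi depends on whether 'Yes' and 'yes' should count as the same answer in your data.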
  1 Comment
The Finnish Rein Deer on 10 Mar 2020
Thanks. isequal does not work, and strcmpi is better replaced with strcmp, since the former matches regardless of case.



Image Analyst on 9 Mar 2020
Try contains:
rows = contains(T.Smoker, 'yes') & T.Age<40;
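One thing to keep in mind is that contains looks for a substring, so it also matches longer text that merely includes 'yes'. The sketch below uses a hypothetical table (the 'not yes' entry is contrived purely to show the difference) and compares it with an exact match via strcmp.

```matlab
% Hypothetical sample data; 'not yes' is contrived to show substring matching.
Smoker = {'yes'; 'no'; 'not yes'; 'yes'};
Age = [35; 28; 31; 60];
T = table(Smoker, Age);

% contains is true wherever 'yes' appears anywhere in the text.
rowsContains = contains(T.Smoker, 'yes') & T.Age < 40;

% strcmp is true only where the whole text equals 'yes'.
rowsExact = strcmp(T.Smoker, 'yes') & T.Age < 40;

matchedByContains = T(rowsContains, :);
matchedExactly = T(rowsExact, :);
```

If the column only ever holds 'yes' or 'no', both approaches select the same rows.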
  1 Comment
The Finnish Rein Deer on 10 Mar 2020
Actually, I realized that the code I suggested would have worked if only I had changed the single quotes to double ones!
rows = (T.Smoker=='yes' & T.Age<40); % does not work
rows = (T.Smoker=="yes" & T.Age<40); % does work
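To see the difference between the two lines above, here is a small sketch that assumes T.Smoker is a cellstr (the sample table is hypothetical). The == operator is not defined between a cell array and a char vector, so the single-quote version throws an error, while comparing against a string scalar gives one logical per row. Double-quoted strings need R2017a or later.

```matlab
% Hypothetical sample data stored as a cellstr column.
Smoker = {'yes'; 'no'; 'yes'};
Age = [35; 28; 52];
T = table(Smoker, Age);

% Single quotes: cell == char is not supported, so this errors.
try
    rowsChar = (T.Smoker == 'yes' & T.Age < 40);
    charCompareWorked = true;
catch err
    charCompareWorked = false;
    disp(err.message)
end

% Double quotes: the cellstr is compared element-wise with the string "yes".
rows = (T.Smoker == "yes" & T.Age < 40);
matched = T(rows, :);
```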
