Replacing a numberless string in matrix with a number

Hello,
I have a matrix that contains a column with participants' gender written as strings ("female" vs. "male").
I would like to change females to 1 and males to 0. How would I do that?
Thank you!

Answers (2)

Where V is that column:
V = ["female";"male";"female";"male";"male";"female"]
V = 6×1 string array
    "female"
    "male"
    "female"
    "male"
    "male"
    "female"
X = strcmpi(V,"female")
X = 6×1 logical array
   1
   0
   1
   0
   0
   1

7 Comments

Unfortunately that did not work. :(
The matrix is called "mat" and is a 3138x19 cell array.
The gender is in column 15.
This is what I tried, but I get errors:
mat(mat(:,15(strcmpi(mat,'female',15))= 1;
idx = strcmpi(mat(:,15),'female')
mat(:,15) = num2cell(idx)
why num2cell? I don't want to create a new cell array. I want to replace the two different types of strings within the existing column of my matrix with numbers.
And where do you specify in your code, which number is desired as a replacement?
That does not make sense to me...
and when I try it, that's the error code I get..
changeidx = strcmpi(mat(:15),'female');
Error: Invalid expression. Check for missing
multiplication operator, missing or unbalanced
delimiters, or other syntax error. To construct
matrices, use brackets instead of parentheses.
"why num2cell?"
Because you cannot directly assign the elements of a numeric vector to the corresponding elements of a cell array. It is possible to assign the elements of one cell array to another cell array, which is what my answer does.
"I don't want to create a new cell array."
The output of my code is the same cell array as the input. Which is apparently what you want to do.
"I want to replace the two different types of strings within the existing column of my matrix with numbers."
That is exactly what my code does.
"And where do you specify in your code, which number is desired as a replacement?"
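The output of strcmpi is a logical vector: 1 (true) where the text matches 'female' and 0 (false) everywhere else. Those are the replacement values. If you need class double rather than logical, wrap the result in double() before calling num2cell.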
"That does not make sense to me... "
What I did is very basic MATLAB: calling functions and then assigning the output using indexing. The introductory tutorials are a good place to learn the basic MATLAB concepts that you need to know.
"and when I try it, that's the error code I get.. "
Did you read the error message? It tells you that you made this mistake when copying my code:
strcmpi(mat(:15),'female')  % your incorrect code
strcmpi(mat(:,15),'female') % my original code
%            ^ you missed this comma
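To make the points above concrete, here is a minimal sketch on a small made-up cell array. The name mat follows the thread, but the contents, the 4x3 size, and the use of column 3 instead of column 15 are assumptions chosen only to keep the example short.

```matlab
% Hypothetical 4x3 cell array standing in for the 3138x19 "mat" from the thread.
% In this toy example the gender text is in column 3 rather than column 15.
mat = {101, 23, 'female';
       102, 31, 'male';
       103, 27, 'Female';
       104, 45, 'male'};

genderCol = 3;                                % assumed column index for this example
idx = strcmpi(mat(:, genderCol), 'female');   % 4x1 logical, true where the text is 'female' (case-insensitive)

% mat(:, genderCol) = idx;   % would error: () assignment into a cell array needs a cell array on the right
mat(:, genderCol) = num2cell(double(idx));    % put each number in its own cell and overwrite the column in place

disp(mat)
```

After this runs, mat is still the same 4x3 cell array; only column 3 has changed, from text to the numbers 1 and 0.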
Use num2cell because it works. The created cell is inserted in the original one.
% Change:
changeidx = strcmpi(mat(:15),'female');
% to
changeidx = strcmpi(mat(:, 15),'female');
%                        ^ comma inserted
OK, thank you! It seems to have been the comma.
But it now seems lucky that I only had two different strings in my current data.
Out of interest for the future:
What would I do if I had, for example, a column dedicated to colour containing 'green', 'blue', 'yellow' and 'purple', and I wanted to replace these strings as follows:
'green' = 1
'blue' = 2
'yellow' = 3
'purple' = 4
Where and how would I then specify the values I want to use to replace the strings?
For the values 1..N I would simply use the second output of ismember.
If you want to specify totally arbitrary values, then probably the easiest way would be to define a vector of those arbitrary values and then use indexing (e.g. the second output of ismember) to select from that vector.
You will find num2cell useful for both of these.
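A minimal sketch of both suggestions, assuming the colour column has already been pulled out of the cell array as a cell array of character vectors. The sample data in col and the replacement numbers in values are made up for illustration.

```matlab
colours = {'green', 'blue', 'yellow', 'purple'};        % position in this list is the code 1..4
col = {'blue'; 'purple'; 'green'; 'yellow'; 'blue'};    % hypothetical column taken from a cell array

% Case 1: values 1..N. The second output of ismember is the position in colours (0 if not found).
[found, code] = ismember(col, colours);                 % code is [2; 4; 1; 3; 2]

% Case 2: arbitrary values. Index into a vector of values using those positions.
values = [10; 20; 30; 40];                              % made-up replacements, one per entry in colours
mapped = NaN(size(col));                                % entries that match nothing stay NaN
mapped(found) = values(code(found));                    % mapped is [20; 40; 10; 30; 20]

% Either result can be written back into a cell array column via num2cell,
% e.g. mat(:, k) = num2cell(code) for whichever column k holds the colours.
colAsCells = num2cell(code);
```

Guarding with found matters only when the column may contain text that is not in colours; in that case code is 0 for those rows, and indexing values with 0 would error.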

I would likely use a string array.
colors = ["green"; "blue"; "yellow"; "purple"];
% Make some sample data
ind = randi(numel(colors), 10, 1);
arrayOfColors = colors(ind);
whichColor = NaN(size(arrayOfColors));
for c = 1:numel(colors)
    whichColor(arrayOfColors == colors(c)) = c;
end
% Display results side by side
results = table(ind, arrayOfColors, whichColor)
results = 10x3 table
    ind    arrayOfColors    whichColor
    ___    _____________    __________
     4       "purple"           4
     3       "yellow"           3
     1       "green"            1
     2       "blue"             2
     2       "blue"             2
     1       "green"            1
     1       "green"            1
     3       "yellow"           3
     4       "purple"           4
     4       "purple"           4
This also works if arrayOfColors is a categorical array. For a categorical array, though, there is an easier way: calling double on it returns, for each element, the position in the value set of the category that element belongs to.
colorCats = categorical(arrayOfColors, colors);
results.colorCats = colorCats;
results.whichCat = double(colorCats)
results = 10x5 table
    ind    arrayOfColors    whichColor    colorCats    whichCat
    ___    _____________    __________    _________    ________
     4       "purple"           4          purple         4
     3       "yellow"           3          yellow         3
     1       "green"            1          green          1
     2       "blue"             2          blue           2
     2       "blue"             2          blue           2
     1       "green"            1          green          1
     1       "green"            1          green          1
     3       "yellow"           3          yellow         3
     4       "purple"           4          purple         4
     4       "purple"           4          purple         4
all(whichColor == ind)
ans = logical
1
all(results.whichCat == ind)
ans = logical
1
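The same categorical idea can probably be applied directly to a column of a cell array like the one in the original question. The sketch below assumes a small made-up cell array whose second column holds the colour names; the variable names valueset, mat and codes are illustrative only.

```matlab
valueset = {'green', 'blue', 'yellow', 'purple'};        % categories in the desired numbering order
mat = {1, 'blue'; 2, 'purple'; 3, 'green'; 4, 'blue'};   % hypothetical 4x2 cell array

% Build a categorical from the text column, then read off each category's index.
codes = double(categorical(mat(:, 2), valueset));        % codes is [2; 4; 1; 2]

mat(:, 2) = num2cell(codes);                             % overwrite the text column with the numeric codes
```

Text that does not appear in valueset becomes an undefined categorical element, which double reports as NaN, so unexpected entries are easy to spot afterwards.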

Release

R2020b

Asked:

on 27 Jan 2021

Commented:

on 27 Jan 2021