How do I average all the values for each column in a cell array?

Question

lil brain on 12 Dec 2022


Commented: Voss on 13 Dec 2022

Accepted Answer: Voss

new_mat.mat

Hi,

I have a cell array called new_mat. I would like to compute the mean of all the values in each column and save the result in a new array called averages. I would then have a numerical array with one row and five columns, so five values in total.

How would I do that?

I tried this

avg_cols = cellfun(@(x) mean(x, 1), new_mat, 'UniformOutput', false);

But I still get this

avg_cols =
  5×5 cell array
    {[ 2.9473]}    {[ 0.7736]}    {[24.7335]}    {[-32.1028]}    {[  5.4609]}
    {[ 7.9357]}    {[15.6115]}    {[28.3915]}    {[ 51.8624]}    {[       1]}
    {[38.3376]}    {[62.5463]}    {[35.4955]}    {[ 17.6059]}    {[ 35.9168]}
    {[15.0732]}    {[24.9668]}    {[ 3.2505]}    {[-21.6557]}    {1×0 double}
    {[57.9756]}    {[49.9486]}    {[53.4301]}    {[ 45.9361]}    {[-17.1092]}

Any ideas why the columns are not averaged?


Answer 1

Voss on 13 Dec 2022


Edited: Voss on 13 Dec 2022

cellfun operates on the contents of each cell independently, performing the specified function (in this case the function is mean(x,1)). If the outputs of those function calls are all scalars of the same class, then cellfun is able to combine all the results into an array (which it will do by default). Otherwise, you need to use 'UniformOutput',false to have cellfun return a cell array.

Examples:

C1 = {[1 1] [2]} % cell array with two cells: 1st contains a 1x2 vector, 2nd contains a scalar
C1 = 1×2 cell array
    {[1 1]}    {[2]}
try
    result = cellfun(@(x)x,C1) % the function @(x)x just returns the contents of the cell
catch e % error: non-scalar output at 1st cell (which is [1 1] - obviously non-scalar).
        % That is, cellfun can't combine [1 1] and [2] into a 1-by-2 matrix (the size of C1)
    disp(e.message);
end
Non-scalar in Uniform output, at index 1, output 1.
Set 'UniformOutput' to false.
C2 = {single(1) double(2)} % cell array with two cells: both containing scalars but different classes
C2 = 1×2 cell array
    {[1]}    {[2]}
try
    result = cellfun(@(x)x,C2) % the function @(x)x just returns the contents of the cell (again)
catch e % error: mismatch in type of outputs (single vs double)
        % That is, cellfun doesn't know what class the result should be
    disp(e.message);
end
Mismatch in type of outputs, at index 2, output 1 (single versus double).
Set 'UniformOutput' to false.

In both of those examples, you must use 'UniformOutput',false to have cellfun return a cell array instead of trying to construct a numeric matrix and erroring out. Of course, since the function is @(x)x, the resulting cell array from cellfun will be the same as what you gave it.

result = cellfun(@(x)x,C1,'UniformOutput',false)
result = 1×2 cell array
    {[1 1]}    {[2]}
isequal(result,C1) % the same as what you started with
ans = logical
   1
result = cellfun(@(x)x,C2,'UniformOutput',false)
result = 1×2 cell array
    {[1]}    {[2]}
isequal(result,C2) % the same as what you started with
ans = logical
   1

Now, to turn to your cell array:

load new_mat
new_mat % notice the cell in the 4th row, 5th column contains an empty array
new_mat = 5×5 cell array
    {[ 2.9473]}    {[ 0.7736]}    {[24.7335]}    {[-32.1028]}    {[  5.4609]}
    {[ 7.9357]}    {[15.6115]}    {[28.3915]}    {[ 51.8624]}    {[       1]}
    {[38.3376]}    {[62.5463]}    {[35.4955]}    {[ 17.6059]}    {[ 35.9168]}
    {[15.0732]}    {[24.9668]}    {[ 3.2505]}    {[-21.6557]}    {0×0 double}
    {[57.9756]}    {[49.9486]}    {[53.4301]}    {[ 45.9361]}    {[-17.1092]}

new_mat consists of 24 cells that contain a scalar and one cell that contains an empty array. The function you want to run is @(x)mean(x,1), which will return a scalar on each of the 24 cells containing scalars and will return an empty array on the cell that contains an empty array. Since not all results will be scalars, you must use 'UniformOutput',false. Of course, the mean of a scalar is the scalar itself and the mean of an empty array is an empty array, so the result you get is essentially what you started with (the difference is that the 0x0 empty array becomes 1x0 when passed through mean(x,1)).

result = cellfun(@(x)mean(x,1),new_mat,'UniformOutput',false)
result = 5×5 cell array
    {[ 2.9473]}    {[ 0.7736]}    {[24.7335]}    {[-32.1028]}    {[  5.4609]}
    {[ 7.9357]}    {[15.6115]}    {[28.3915]}    {[ 51.8624]}    {[       1]}
    {[38.3376]}    {[62.5463]}    {[35.4955]}    {[ 17.6059]}    {[ 35.9168]}
    {[15.0732]}    {[24.9668]}    {[ 3.2505]}    {[-21.6557]}    {1×0 double}
    {[57.9756]}    {[49.9486]}    {[53.4301]}    {[ 45.9361]}    {[-17.1092]}
isequal(result,new_mat)                       % not the same
ans = logical
   0
isequal(result([1:23 25]),new_mat([1:23 25])) % but the only difference is in the 24th cell (row 4, column 5)
ans = logical
   1

OK, so that's a cellfun primer. As I said, cellfun operates on each cell independently. But you want to operate on columns of cells together, so that makes cellfun ill-suited to the task. It's straightforward to write a loop to do what you want:

N = size(new_mat,2);
result = zeros(1,N);
for ii = 1:N
    result(ii) = mean([new_mat{:,ii}]);
end
disp(result)
   24.4539   30.7694   29.0602   12.3292    6.3171

which could also be written:

N = size(new_mat,2);
result = zeros(1,N);
for ii = 1:N
    result(ii) = mean(vertcat(new_mat{:,ii}));
end
disp(result)
   24.4539   30.7694   29.0602   12.3292    6.3171

The difference is that the first loop horizontally concatenates the contents of the cells in a given column, and the second loop vertically concatenates them. In either case the result of that concatenation is a vector, so no dimension argument is required for mean (that is, it's mean(x), not mean(x,1)), but you could include one (it would be 2 for the horizontal concatenation case and 1 for the vertical).

Note that the dimension argument sent to mean has nothing to do with how the cells are arranged in new_mat! You're wanting to do @(x)mean(x,1) because you are thinking of averaging a column of cells, but the function @(x)mean(x,1) when used in cellfun doesn't operate on a column of cells - it operates on one cell at a time. Each cell contains a scalar (or empty array), so the dimension argument passed in to mean() is irrelevant.
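A quick way to convince yourself of this is the small sketch below. The values are made-up stand-ins (not taken from new_mat.mat): one scalar and one 0x0 empty array, which are the two kinds of content the cells hold.

```matlab
% Stand-in values (assumed for illustration, not from new_mat.mat)
s = 5;     % a scalar, like 24 of the cells
e = [];    % a 0x0 empty array, like the cell at row 4, column 5

% For a scalar, the dimension argument changes nothing
m1 = mean(s,1);
m2 = mean(s,2);

% For a 0x0 empty array, mean along dimension 1 returns a 1x0 empty array,
% which is a non-scalar result that cellfun cannot put into a numeric output
sz = size(mean(e,1));

disp([m1 m2])
disp(sz)
```

Both means of the scalar come back as the scalar itself, and the size of the mean of the empty array is [1 0], matching the 1×0 double shown in the cellfun result above.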

In order to take the mean of several elements at a time, you've got to concatenate them together somehow - that's what the code inside the loops does ([new_mat{:,ii}] to horizontally concatenate the contents of the cells in column ii of new_mat, or vertcat(new_mat{:,ii}) to concatenate the same things vertically).
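If you prefer a one-liner over the loop, the same concatenate-then-average idea can be sketched with arrayfun over the column indices. This is only an alternative spelling of the loop above, not a different method. The cell array is retyped here from the displayed (rounded) values so the snippet runs on its own; with the attached file you would use load new_mat instead.

```matlab
% Values as displayed above (rounded to 4 decimals); cell (4,5) holds an empty array
new_mat = { 2.9473,  0.7736, 24.7335, -32.1028,   5.4609;
            7.9357, 15.6115, 28.3915,  51.8624,   1;
           38.3376, 62.5463, 35.4955,  17.6059,  35.9168;
           15.0732, 24.9668,  3.2505, -21.6557,  [];
           57.9756, 49.9486, 53.4301,  45.9361, -17.1092};

% One mean per column: concatenate that column's cell contents, then average.
% The empty cell contributes nothing to the concatenation, so column 5 is
% averaged over its 4 remaining values.
averages = arrayfun(@(ii) mean([new_mat{:,ii}]), 1:size(new_mat,2));
disp(averages)
```

The result is a 1-by-5 numeric array, which is the shape asked for in the question. Note that this silently skips the empty cell; whether that is the right treatment of the missing value depends on your application.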


lil brain on 13 Dec 2022

Hi @Voss! Wow, this is super helpful and detailed. Many thanks for that! I learned a lot just reading this. I understand the problem much better now.

One last thing however, since you mentioned that taking the mean of a scalar simply gives you the scalar itself, would it make sense to convert the cells that contain scalars to something else? Would that allow me to create the mean along the column dimension then?

Thanks again for the help!

Voss on 13 Dec 2022

@lil brain: You're welcome! I'm glad it's useful.

"would it make sense to convert the cells that contain scalars to something else?"

I don't know what you'd convert them to.

If you didn't have that one cell that contains an empty array, then you could convert the entire 5-by-5 cell array to a numeric matrix. Let's say you replace that empty array in that one cell with a scalar NaN:

load new_mat
new_mat{4,5} = NaN
new_mat = 5×5 cell array
    {[ 2.9473]}    {[ 0.7736]}    {[24.7335]}    {[-32.1028]}    {[  5.4609]}
    {[ 7.9357]}    {[15.6115]}    {[28.3915]}    {[ 51.8624]}    {[       1]}
    {[38.3376]}    {[62.5463]}    {[35.4955]}    {[ 17.6059]}    {[ 35.9168]}
    {[15.0732]}    {[24.9668]}    {[ 3.2505]}    {[-21.6557]}    {[     NaN]}
    {[57.9756]}    {[49.9486]}    {[53.4301]}    {[ 45.9361]}    {[-17.1092]}

Now all the cells contain scalars, so you can put all those scalars together into a numeric matrix the same size as your original cell array:

% using cell2mat:
M = cell2mat(new_mat)
M = 5×5
    2.9473    0.7736   24.7335  -32.1028    5.4609
    7.9357   15.6115   28.3915   51.8624    1.0000
   38.3376   62.5463   35.4955   17.6059   35.9168
   15.0732   24.9668    3.2505  -21.6557       NaN
   57.9756   49.9486   53.4301   45.9361  -17.1092
% or, concatenating and reshaping:
M = reshape([new_mat{:}],size(new_mat))
M = 5×5
    2.9473    0.7736   24.7335  -32.1028    5.4609
    7.9357   15.6115   28.3915   51.8624    1.0000
   38.3376   62.5463   35.4955   17.6059   35.9168
   15.0732   24.9668    3.2505  -21.6557       NaN
   57.9756   49.9486   53.4301   45.9361  -17.1092

Now, I don't know, in your application, whether using a NaN instead of an empty array is a good idea (maybe you still need to distinguish NaN from empty, in which case you don't want to replace one with the other), but if it makes sense to do that (or use some other scalar place-holder value like Inf), then it's convenient to use a numeric matrix like above instead of a cell array where all the cells contain a scalar.
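Once you have the numeric matrix, the column means are a single call to mean. One thing to watch, sketched below under the assumption that NaN is your placeholder: a plain mean(M,1) returns NaN for any column containing a NaN, so you would pass the 'omitnan' flag to skip it. The cell array is retyped from the displayed (rounded) values so the snippet is self-contained.

```matlab
% Values as displayed above (rounded), with NaN already placed in cell (4,5)
new_mat = { 2.9473,  0.7736, 24.7335, -32.1028,   5.4609;
            7.9357, 15.6115, 28.3915,  51.8624,   1;
           38.3376, 62.5463, 35.4955,  17.6059,  35.9168;
           15.0732, 24.9668,  3.2505, -21.6557,  NaN;
           57.9756, 49.9486, 53.4301,  45.9361, -17.1092};

M = cell2mat(new_mat);               % 5x5 numeric matrix

avg_with_nan = mean(M,1);            % column 5 comes out as NaN
averages     = mean(M,1,'omitnan');  % column 5 is averaged over its 4 non-NaN values

disp(avg_with_nan)
disp(averages)
```

With 'omitnan' the result agrees with the loop in the answer above, because skipping the NaN has the same effect as the empty array dropping out of the concatenation.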


Answer 2

Walter Roberson on 12 Dec 2022


avg_cols = cellfun(@(x) mean(x, 1), new_mat);

You only need non-uniform output if some of the outputs can be a different size or datatype than the others, or if the output datatype is one that cannot be concatenated into an array (for example, function handles)


lil brain on 12 Dec 2022

Thanks!

However, if I try this it gives me the error:

Error using cellfun
Non-scalar in Uniform output, at index 24, output 1.
Set 'UniformOutput' to false.

Is this because some columns are not equal in length?

Walter Roberson on 13 Dec 2022


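Ah, you have a cell that contains an empty array. With a dimension argument, the mean of an empty array is itself empty (mean(x,1) of a 0×0 array is 1×0). That prevents you from creating a numeric array of results.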

If you were to

mask = cellfun(@isempty, new_mat);
new_mat(mask) = {nan};   % assign a cell: () indexing into a cell array needs a cell on the right-hand side

Then the cellfun would return nan for those entries
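Here is a small self-contained sketch of that idea. The 2-by-2 cell array C is a made-up stand-in (not the poster's data), and the NaN is assigned as a cell, {NaN}, since parenthesis indexing into a cell array expects a cell on the right-hand side.

```matlab
% Hypothetical stand-in data: three scalars and one empty array
C = {1, 2; [], 4};

% Find the cells holding empty arrays and put a scalar NaN in each
mask = cellfun(@isempty, C);
C(mask) = {NaN};

% Every cell now yields a scalar, so the default uniform output works
% and cellfun returns a numeric matrix the same size as C
A = cellfun(@(x) mean(x,1), C);
disp(A)

% Column means, skipping the NaN placeholder
col_means = mean(A, 1, 'omitnan');
disp(col_means)
```

Note that the cellfun call still only maps each cell to its own value; the per-column averaging happens in the final mean over the numeric matrix.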

