How to sample single values from a field in a *non-scalar structure array*?

Hello Matlab experts,
I'm working on the optimization of some code following the Profiler's measurements.
Here, one main bottleneck is sampling an individual element from each of the fields contained in the struct array.
For context, let's consider a non-scalar struct array whose field mat holds uniformly sized sparse matrices, built as
A = repmat(struct('mat',speye(3)),10,1);
for k = 1:10
    A(k).mat = A(k).mat * k;
end
For the sake of simplicity, I'm interested in sampling each field at indexes (i,j) = (2,2) as
col = zeros(10,1);
for k = 1:10
    col(k) = A(k).mat(2,2);
end
For the next step, I collect each sampled value into a column vector.
Thus the expected output is:
>> col
col =
1
2
3
4
5
6
7
8
9
10
Question: is there a much better (hopefully faster) way to do this? (mainly getting rid of the for-loop)
Disclaimer: I have tried converting this struct array into flat array, [A.mat], or into an array of cells, {A.mat}, but I run into the trouble of the sparse nature of the fields (Get warnings with SPFUN). I'm sure there must be a clever way to do this with arrayfun() or cellfun() but the solution eludes me.

8 Comments

"I'm sure there must be a clever way to do this with arrayfun() or cellfun() but the solution eludes me."
Both are likely to be slower than a well-written loop.
Is there a way to vectorize col(k) = A(k).mat(2,2); ?
i.e.:
col = A.mat(2,2)
currently this line outputs the following error message:
  • Intermediate dot '.' indexing produced a comma-separated list with 10 values, but it must produce a single value when followed by subsequent indexing operations.
"Is there a way to vectorize"
No. Every field contains a different array: in MATLAB it is not possible to write one index into multiple arrays (that you generate with that comma-separated list). It is possible to index into one array.
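To illustrate the point about comma-separated lists, here is a minimal sketch using the struct array from the question. The variable names c and numArrays are only for illustration. It shows that A.mat expands to ten separate arrays, which can be captured in a cell array but still have to be indexed one at a time.

```matlab
% Build the struct array from the question.
A = repmat(struct('mat', speye(3)), 10, 1);
for k = 1:10
    A(k).mat = A(k).mat * k;
end

% A.mat expands to a comma-separated list of 10 separate arrays.
% Wrapping it in braces captures the list in a 1-by-10 cell array.
c = {A.mat};
numArrays = numel(c);

% There is no single array to apply (2,2) to, so each array
% is indexed individually, here with a plain loop.
col = zeros(numel(A), 1);
for k = 1:numel(A)
    col(k) = A(k).mat(2,2);
end
```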
"Is there a way to vectorize? No."
Great news then!
"It is possible to index into one array"
Not sure I get you.
"Not sure I get you."
This is one array:
A = rand(2,3);
here is some indexing into it:
A(1,3)
In general a comma-separated list is not one array. Therefore it cannot be indexed into like you showed.
Thank you captain obvious
Yes! I'm aware of it.
Although it allowed me to have three indexes, it didn't do the trick in my case ...
Nevertheless thx for the suggestion! : )
-M

 Accepted Answer

For the sake of documentation, I'm going to respond to my own question.
Turns out (like always) that the solution is very simple:
This problem arises because we are trying to use structures (A(k).mat(i,j)) or cells (A{k}(i,j)) to handle sparse fields as if they were 3-dimensional arrays (i.e. A(i,j,k)). As @Stephen23 commented, vectorizing this is currently not possible in MATLAB. However, the answer from @Bruno Luong showed me that the problem can simply be flattened so that plain indexing can be used.
TL;DR: a simple way to speed up this problem is:
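% Get Data
A = repmat(struct('mat',speye(3)),10,1);
for k = 1:10
    A(k).mat = A(k).mat * k;
end
% Get Data sizes
[~,n] = size(A(1).mat); % columns per field, because we have uniformly sized fields
% Compute
A_flat = [A.mat]; % Flatten the structure fields (placed side by side)
% Element (2,2) of field k sits in column 2 + (k-1)*n of A_flat.
% Note: 2 + 0:n:n*10 is parsed as (2+0):n:(n*10), so the offsets are written explicitly here.
col = transpose(A_flat(2,2 + (0:9)*n));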
col =
(1,1) 1
(2,1) 2
(3,1) 3
(4,1) 4
(5,1) 5
(6,1) 6
(7,1) 7
(8,1) 8
(9,1) 9
(10,1) 10
Voila !
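The column arithmetic behind the flattening can be sketched for a general element (r,c). This is only a sketch under the assumption that all fields have the same size. The names r, c, N, cols, col_flat and col_loop are hypothetical and not part of the original code. Because the fields are concatenated horizontally, the stride between blocks is the number of columns of one field.

```matlab
% Build the struct array from the question.
A = repmat(struct('mat', speye(3)), 10, 1);
for k = 1:numel(A)
    A(k).mat = A(k).mat * k;
end

r = 2;  c = 2;                 % element to sample in every field
N = numel(A);                  % number of struct elements
n = size(A(1).mat, 2);         % columns per field (assumes uniform sizes)

A_flat = [A.mat];              % sparse, size(A(1).mat,1)-by-(n*N)

% Column c of field k lands in column (k-1)*n + c of A_flat.
cols = c + (0:N-1) * n;
col_flat = full(A_flat(r, cols)).';   % N-by-1 full column vector

% Cross-check against the original loop.
col_loop = zeros(N, 1);
for k = 1:N
    col_loop(k) = A(k).mat(r, c);
end
assert(isequal(col_flat, col_loop));
```

Writing the offsets as (0:N-1)*n avoids relying on operator precedence, since the colon operator binds more loosely than +.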
To show that the above idea is correct, I share screen captures of the Profiler measurements for the original and *this-> solution:
[Screenshot: Profiler results, original problem]
[Screenshot: Profiler results, using *this-> reformulation]

7 Comments

A_flat = [A.mat];
You just build a potentially huge intermediate matrix. The code looks shorter but honestly the for-loop would be more efficient.
Why insist on avoiding the for-loop?
"You just build a potentially huge intermediate matrix."
Yes, but the key here is that it is a huge 2-D sparse intermediate matrix.
Indexing into it is faster than into a 3-D dense matrix. My goal is to avoid huge 3-D, ..., N-D dense arrays because of how slow indexing becomes as my problem scales in size. 2-D is still manageable for my current application.
Honestly, if I could reformulate my problem with a single index and longer 1-D arrays while making it easier to read and faster, I'd be over the moon. But for this special product formulation, I simply can't.
"Why insist on avoiding the for-loop?"
I roll with whatever is faster.
The (original) for-loop solution was too slow in my profile measurements, with either a struct or a cell array. As you can see, *this-> solution reduces my computation time 10x, although it is harder to read than the for-loop.
Are you sure you profiled this line?
A_flat = [A.mat]; % Flatten the structure fields
It must be called 379913 times as well for the comparison to be fair. In that case it should take a large amount of time and memory IMO, and I don't see it in your profiler result.
Unless you call it once, in which case your description of the problem is incomplete (repeated retrievals with changing indexes on the same matrix).
And you said in your question
"Disclaimer: I have tried converting this struct array into flat array, [A.mat], ..., but I run into the trouble of the sparse nature of the fields (Get warnings with SPFUN). "
Why are you suddenly back to using
A_flat = [A.mat];
?
Yes, I reversed course. Earlier I was doing something like A_col = structfun(@(x) x(i,j), A_mat), where A_mat is already a concatenated sparse array, and structfun() would always complain that it can't handle sparse objects.
Once I understood the nature of the problem, it became clear what a more plausible path to speed up my bottleneck was.
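The fairness question raised above comes down to how often the flattening runs. Here is a minimal sketch of the case where it pays off, assuming the struct array stays the same while only the sampled indexes change. The query vectors rq and cq and the names offsets and out are hypothetical. The flat matrix is built once and reused for every query.

```matlab
% Build the struct array from the question.
A = repmat(struct('mat', speye(3)), 10, 1);
for k = 1:numel(A)
    A(k).mat = A(k).mat * k;
end

N = numel(A);
n = size(A(1).mat, 2);      % columns per field (assumes uniform sizes)

A_flat  = [A.mat];          % flatten ONCE, outside the query loop
offsets = (0:N-1) * n;      % column offset of each field inside A_flat

% Hypothetical list of (row, column) queries on the same data.
rq = [1 2 3 1];
cq = [1 2 3 2];

out = zeros(N, numel(rq));  % one result column per query
for q = 1:numel(rq)
    out(:, q) = full(A_flat(rq(q), cq(q) + offsets)).';
end
```

If the fields change between queries, A_flat has to be rebuilt each time and its cost belongs in the timing.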
The speed-up depends on the size of mat. Below a size of about 100 the speed-up is real; beyond that the runtime of the flattened version increases with the size, whereas the for-loop time is constant, independent of the size of the matrix, as shown in this test script.
matsize = 2.^(1:12);
ntest = length(matsize);
t1 = zeros(1, ntest);
t2 = zeros(1, ntest);
t3 = zeros(1, ntest);
for i = 1:ntest
    A = repmat(struct('mat',speye(matsize(i))),10000,1);
    for k = 1:numel(A)
        A(k).mat = A(k).mat * k;
    end
    t1(i) = timeit(@() forloop(A), 1);
    t2(i) = timeit(@() flatA(A), 1);
    t3(i) = timeit(@() afun(A), 1);
end
close all
semilogx(matsize, t1)
hold on
semilogx(matsize, t2)
semilogx(matsize, t3)
xlabel('sizemat')
ylabel('time [s]')
legend('for loop', 'flat mat', 'arrayfun')
function col = forloop(A)
    n = numel(A);
    col = zeros(n,1);
    for k = 1:n
        col(k) = A(k).mat(2,2);
    end
end
function col = afun(A)
    col = arrayfun(@(s) full(s.mat(2,2)), A);
end
function col = flatA(A)
    n = numel(A);
    % Get Data sizes
    m = size(A(1).mat,1); % because we have uniformly sized (square) fields
    % Compute
    A_flat = [A.mat]; % Flatten the structure fields
    % Same as 2 + 0:m:m*n, since + binds tighter than the colon operator
    col = transpose(A_flat(2,2:m:m*n));
end
Wow ... you went the extra mile too!
Disclosure: I'm working in the matsize range [20,400], with a struct size of roughly [20^3,400^3].
I have been observing the same this afternoon.
Not satisfied with the loop solution. Thus, I'm currently coding a MEX file instead ; )
Thx again
-M

More Answers (1)

Just shorter code, not necessarily better.
A = repmat(struct('mat',speye(3)),10,1);
for k = 1:10
    A(k).mat = A(k).mat * k;
end
col = arrayfun(@(s) full(s.mat(2,2)), A)
col = 10×1
1
2
3
4
5
6
7
8
9
10

1 Comment

Indeed, it works!
... but the profiler indicates that the arrayfun() does little to improve over the for-loop solution.
Nevertheless, I was not aware that full() can be used in this manner. Thus it gives me a hint to reformulate my problem.
Thx much!
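For completeness, the same full() trick can be sketched with cellfun() on the cell array {A.mat} that was mentioned in the question. The name vals is only for illustration. As noted in the comments above, this is likely no faster than a well-written loop, so treat it as a shorter spelling rather than an optimization.

```matlab
% Build the struct array from the question.
A = repmat(struct('mat', speye(3)), 10, 1);
for k = 1:10
    A(k).mat = A(k).mat * k;
end

% {A.mat} is a 1-by-10 cell array; full() converts each sparse
% scalar into a plain double before cellfun collects the results.
vals = cellfun(@(M) full(M(2,2)), {A.mat});

% Reshape the 1-by-10 result into a column vector.
col = vals(:);
```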

Release

R2022a
