Does sharing a large structure with nested functions called through function handles increase main function overhead?

Good day!
I am building a large simulation model using MATLAB. I have a quite complicated structure with many sub structures and data containing model states and I have found that much of the computational power is spent on shuffling this structure to and from functions where I edit the model states. I decided to try out some nested functions that share this complicated structure, so no copying is necessary. However, the nested functions seem to incur a massive overhead: using just one nested function increases the self time of the main function (the 'end' line in the profiler) to 60% of its total, compared to using a .m file function instead of a nested one.
Some relevant information:
I'm calling the function from a for loop a few thousand times (4000-6000).
I'm using function handles in a cell array and calling them like so: Output = cell_array{i}(Input);
The shared structure is modified several times inside the for loop by different functions that are specified in m-files.
The nested function itself seems to execute faster than the .m function, but the total execution time is around 100% longer.
Is there something that can explain this increase in overhead? It would be a massive improvement not having to copy data between functions so I would very much appreciate some input!
Thanks!
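The setup described in the question can be sketched roughly as follows. This is only an illustration under assumptions: the function name runSharedStateDemo, the nested functions addStep and doubleIt, and the field S.counter are all hypothetical and stand in for the real model.

```matlab
function S = runSharedStateDemo(nSteps)
% Sketch: nested functions share the struct S with the parent function,
% so S is never passed to or returned from the update functions.
% All names in this file are hypothetical.
S.counter = 0;                       % shared state, visible to the nested functions
updates = {@addStep, @doubleIt};     % function handles stored in a cell array

for i = 1:nSteps
    for k = 1:numel(updates)
        out = updates{k}(i);         %#ok<NASGU> same call style as cell_array{i}(Input)
    end
end

    function out = addStep(in)
        S.counter = S.counter + in;  % modifies the parent's S directly
        out = S.counter;
    end

    function out = doubleIt(~)
        S.counter = 2 * S.counter;   % input is ignored here
        out = S.counter;
    end
end
```

Calling S = runSharedStateDemo(3) runs both nested functions three times against the same S. Profiling this against a variant that uses .m file functions is one way to reproduce the overhead in isolation.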

3 Comments

I cannot say for sure what the cause is, but it is likely that workflows involving nested functions are less optimized than flat functions in m-files.
Since you have found that copying parts of the structure is consuming lots of processing time, I wonder if converting your structure into a handle class hierarchy would help. Passing handle objects between functions and editing their properties should be a lightweight operation.
Are you willing to share more details about the structure you're working with, and how the functions edit the state? It may help us come up with some more ideas.
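A handle class along the lines suggested above might look like the sketch below. The class name ModelState, its properties, and the event fields time and type are assumptions made for illustration; the real model has a different layout. Whether this is faster than a plain struct depends on the access pattern, so it is worth profiling both.

```matlab
classdef ModelState < handle
    % Hypothetical handle class wrapping part of the model state.
    % Because it derives from handle, functions that receive a ModelState
    % modify the same object and do not need to return it.
    properties
        events        % pre-allocated struct array of events (assumed layout)
        nEvents = 0   % number of slots currently in use
    end
    methods
        function obj = ModelState(capacity)
            % Pre-allocate the event array once, up front.
            obj.events = repmat(struct('time', 0, 'type', 0), 1, capacity);
        end
        function addEvent(obj, time, type)
            % Append an event in the next free slot.
            obj.nEvents = obj.nEvents + 1;
            obj.events(obj.nEvents).time = time;
            obj.events(obj.nEvents).type = type;
        end
    end
end
```

With this saved as ModelState.m, state = ModelState(100); addEvent(state, 0.5, 2); updates state in place, and any other function holding state sees the change.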
"I have a quite complicated structure with many sub structures and data containing model states and I have found that much of the computational power is spent on shuffling this structure to and from functions where I edit the model states."
Flatter data is often much more efficient to process than deeply nested data. It usually makes code simpler too.
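One possible reading of "flatter" is to keep the events in a top-level struct of plain numeric vectors instead of a struct array buried several levels deep. The sketch below assumes hypothetical field names (time, type, count) and a hypothetical helper insertFlatEvent; it is not the asker's actual layout, and any speedup would need to be measured.

```matlab
function events = insertFlatEvent(events, idx, time, type)
% Hypothetical helper: insert one event at position idx in a flat layout.
% events.time and events.type are pre-allocated numeric row vectors and
% events.count is the number of slots currently in use.
n = events.count;
events.time(idx+1:n+1) = events.time(idx:n);   % shift later events up by one
events.type(idx+1:n+1) = events.type(idx:n);
events.time(idx) = time;                       % write the new event
events.type(idx) = type;
events.count = n + 1;
end
```

A pre-allocated container for this helper could be created with events = struct('time', zeros(1, 1000), 'type', zeros(1, 1000), 'count', 0);.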
"It would be a massive improvement not having to copy data between functions so I would very much appreciate some input!"
MATLAB does not copy data every time you pass something to a function. Search this forum for "copy on write" to learn how it really works, and read this documentation:
Thank you for your input!
@Ted: I tried the handle class previously but it did not seem to help the situation. So maybe it is not the copying of data that is the problem like Matt J is suggesting but the editing. As for the structure... there are around 5 substructures that contain read-only static information. However, there are a few "dynamic" ones containing events that are read and written to; they have pre-allocated memory and are buried four levels deep in the main structure. Reading your answers, I suspect that this is the main problem.
I am editing these as follows:
Read:
event = A.b.c.d(idx);
Write/add:
A.b.c.d(idx + 1: end_idx + 1) = A.b.c.d(idx: end_idx);
A.b.c.d(idx) = event;
---
The model slows down considerably as struct "d" is filled with data. I thought this was due to it having to be copied, but now I am not so sure. The structure "d" has around 20 fields.
@Stephen23: Thank you for the input. I could consider flattening the data; do you have any idea what kind of performance improvement I could expect from that? I nested the structure because of code complexity and readability, so there is a balance there I guess. I'm afraid I have some try/catch blocks around the code as well that I would be reluctant to remove. Am I correct in my understanding that "file functions (.m)" do not get the "copy on write" memory optimization that local functions do?
Thank you for your answers!

 Accepted Answer

where I edit the model states
It is the editing of the variable contents that takes the time, not the passing of them to functions. Note that if one of the struct variables is a large numeric matrix whose data is shared with another variable, changing even one element will allocate memory for a complete new matrix. Similarly, if you have lots of small matrices that you are making changes to, that would consume a lot of time as well.

10 Comments

Interesting! You're right, it's the editing that is taking the time; I just assumed that it was due to copying. As I am more used to passing variables by reference, I find it difficult to understand exactly how this works in MATLAB. I suspect that my only chance to improve the performance is to make the structure that I am editing as small as possible... would you agree?
Thanks!
it is not the copying of data that is the problem like Matt J is suggesting but the editing.
But bear in mind that these are often the same thing. Editing events trigger deep copying of data, whereas renaming variables and passing them to functions alone do not. For example, in the simple code below, no new memory allocation occurs until "Line 2", where a completely new 1e4 x 1e4 matrix is allocated for B to occupy.
A = rand(1e4);   % Line 0: Memory allocated for A
B = passIt(A);   % Passing A to the function does not copy it

function B = passIt(A)
    B = A;       % Line 1: no memory allocation
    B(1) = 3;    % Line 2: Memory allocated for B
end
With structs, though, it is a little more complicated. Memory is not allocated for struct fields until you actually try to modify the fields, and the operations you've shown above don't show any modification of the fields of d. The memory duplication that is happening in
A.b.c.d(idx + 1: end_idx + 1) = A.b.c.d(idx: end_idx);
is really just in the pointers that are pointing to the different d(i), so unless end_idx - idx is a very large number (or these operations are happening very often), there shouldn't be any significant memory copying here.
Okay that makes it a bit clearer!
I forgot to mention it before, but I also edit struct "d" like this:
A.b.c.d(idx).field = x; %This one is called thousands of times
I suppose this is the command that is generating the memory copying? I am curious: does MATLAB do the copying whenever there is a possibility that a field is edited, or only at the moment the field is actually edited? For example, if there were an if statement that prevented the edit in most cases, would that avoid the unnecessary overhead?
I ran into this post: https://se.mathworks.com/matlabcentral/answers/1640135-matlab-s-inefficient-copy-on-write-implementation, where you suggest wrapping the data in a handle class, which is what @Ted suggested as well. Do you think this could help the situation in this case?
Thanks, I really appreciate your answers!
A.b.c.d(idx).field = x; %This one is called thousands of times
This doesn't look like it should allocate any new memory to me, as long as d(idx).field already exists. Replacing the entire contents of an existing field shouldn't result in any data copying. If the field doesn't already exist, then some memory is allocated for the field definition.
I did some quick testing on a simple function with the code profiler. Writing to the structure takes around 1 ms per call on my laptop (95% of the function's computation time). Stopping unnecessary writes does speed up the associated function a lot! As far as I can tell, the computation time improves when I limit both the number of writes to structure "d" and its size. As for the reason, I am still confused.
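A guard of the kind described here could look like the sketch below. The helper name setFieldIfChanged and the second output didWrite are hypothetical; the idea is simply to skip the assignment when the stored value already equals x. The isequal comparison has its own cost, so this only pays off if most writes turn out to be redundant, which the profiler can confirm.

```matlab
function [A, didWrite] = setFieldIfChanged(A, idx, x)
% Hypothetical helper: write A.b.c.d(idx).field only when the value differs.
% didWrite reports whether the assignment was actually performed.
didWrite = ~isequal(A.b.c.d(idx).field, x);
if didWrite
    A.b.c.d(idx).field = x;   % the expensive write happens only when needed
end
end
```

The same test can also be written inline as if ~isequal(A.b.c.d(idx).field, x), A.b.c.d(idx).field = x; end, which avoids the extra function call.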
I reiterate what I said before. The expense of the operation depends on whether d(idx) and/or d(idx).field pre-exists.
The field and the structure both pre-exist, which is why I was confused that the operation is still expensive.
The only thing I can think of is to try to avoid deep indexing expressions by making temporary struct arrays, e.g.,
temp=A.b.c;
temp.d(idx + 1: end_idx + 1) = temp.d(idx: end_idx);
temp.d(idx) = event;
A.b.c=temp;
I second Matt J's advice, particularly if you have situations where particular access is repeated many times:
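temp = A.b.c.d;       % pull the nested array out once
for k = lots
    % ..temp..        (read and modify temp here instead of A.b.c.d)
end
A.b.c.d = temp;       % write it back once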
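To check whether the temporary-variable approach helps on a given machine and release, a small timing harness such as the one below can be used. It is a sketch under assumptions: the function name compareDeepVsTemp, the single field named field, and the random write pattern are all made up for illustration and do not reproduce the real model.

```matlab
function [tDeep, tTemp] = compareDeepVsTemp(nEvents, nWrites)
% Hypothetical micro-benchmark: time repeated field writes through the
% deep index A.b.c.d(...) against writes to a temporary copy of d.
A.b.c.d = repmat(struct('field', 0), 1, nEvents);   % pre-allocated events
idx = randi(nEvents, 1, nWrites);                   % random write positions

tic;
for k = 1:nWrites
    A.b.c.d(idx(k)).field = k;      % deep indexing on every write
end
tDeep = toc;

tic;
temp = A.b.c.d;                     % pull the nested array out once
for k = 1:nWrites
    temp(idx(k)).field = k;         % shallow indexing inside the loop
end
A.b.c.d = temp;                     %#ok<STRNU> write the result back once
tTemp = toc;
end
```

For example, [tDeep, tTemp] = compareDeepVsTemp(5000, 1e5) returns the two elapsed times in seconds. The numbers vary between runs and releases, so it is best to repeat the call a few times before drawing conclusions.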
Okay I will try that, I have a much better grasp of what to focus on now!
Thanks!

Release: R2022b
Asked: 17 Nov 2023
Commented: 22 Nov 2023