Is it better to have one function or multiple instances of one code?
Hello everyone. I was wondering whether it is computationally cheaper to have MATLAB run multiple inlined copies of the same code rather than calling a custom function repeatedly. For example, I could write a simple set of conditional statements and a couple of very fast matrix operations and paste that wherever it is needed (copy-and-paste, with maybe a slight adjustment each time). The code probably takes only a few milliseconds, which leads me to believe that wrapping it in a function would slow it down. In my case, I want to run the same piece of code three or four times. Would the overhead of a function call make it significantly slower?
Also, can someone give me a hyperlink to specific documentation that explains when it's best to start thinking about making a function rather than extending the current script?
I don't really know how to search for this answer on the internet. Thanks!
Jan on 14 Jul 2017
Edited: Jan on 14 Jul 2017
I never write scripts, because the variables declared in one script interfere with other scripts. Any possibility of influencing parts of the code that are far away is a source of bugs - see https://en.wikipedia.org/wiki/Action_at_a_distance_(computer_science).
But the question remains: when should a part of a function be separated into a new function? This can be decided by code complexity, but how you measure it and where you set the limit is a question of taste. As soon as a piece of code is a compact unit that can also be reused from other functions, it is worth moving it into its own function. Then it can be tested exhaustively by a unit test. If it is a bottleneck for any code, it can be optimized or mex-ed, and all code using this subfunction will profit from the speed.
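For instance, such a compact reusable unit might look like this (the name and task are made up for illustration; since R2016b a local function like this can also live at the end of a script):

```matlab
function y = clampToRange(x, lo, hi)
% clampToRange - limit every element of x to the interval [lo, hi].
% A compact, reusable unit: easy to unit-test now and to mex later.
y = min(max(x, lo), hi);
end
```

A one-line unit test for it could then be: assert(isequal(clampToRange([-1, 0.5, 2], 0, 1), [0, 0.5, 1]))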
There is a certain overhead in calling an M- or built-in function. The latter is faster, e.g. calling mean(X) is slower than sum(X) / numel(X). But this overhead is small.
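You can measure this overhead yourself with timeit, which ships with MATLAB; a minimal sketch:

```matlab
% Compare the call overhead of mean(X) against the manual sum/numel form.
X = rand(1, 1e4);
t_mean   = timeit(@() mean(X));          % built-in with argument checks
t_manual = timeit(@() sum(X) / numel(X)); % same math, less overhead
fprintf('mean: %.3g s,  sum/numel: %.3g s\n', t_mean, t_manual);
```

The absolute numbers depend on your machine, but the difference is typically in the microsecond range - small compared to any real computation.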
Sometimes the error checks in functions take more time than the actual computations (see polyfit). But this is a different problem.
While inlined code can save milliseconds of runtime compared to functions, the time for programming and debugging can profit massively from using functions. If you have 4 almost identical copies, applying a modification to the code is a mess compared to having 1 external function with a parameter for the differences. That is why copy&paste programming is considered an anti-pattern: maintaining or expanding such code is too prone to typos.
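As a sketch of what "1 external function with a parameter for the differences" means (all names here are illustrative, not from the question):

```matlab
function y = processBlock(x, threshold)
% processBlock - one function instead of four pasted copies.
% The "slight adjustment" between the copies becomes an argument.
y = x;
y(abs(y) < threshold) = 0;   % zero out small entries
end
```

The former four copy-and-paste sites then become four calls such as a = processBlock(a, 0.1) and b = processBlock(b, 0.25), and a bug fix has to be applied in exactly one place.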
Caring too much about runtime while programming is often premature optimization: the goal is to solve the problem. Concentrating on runtime from the beginning can lead to unreadable code that cannot be adjusted or expanded anymore. It is smarter to create stable and clean code first and find the runtime bottlenecks later. Then you can be sure not to "optimize" code that takes only 1% of the total runtime, where even a 100% speedup of that part reduces the total runtime by only 0.5%.
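Finding those bottlenecks afterwards is exactly what the built-in profiler is for (the function name below is illustrative):

```matlab
% Profile the finished program, then optimize only what actually matters.
profile on
myMainComputation();   % illustrative name for your top-level code
profile viewer         % opens a report with time spent per function and line
profile off
```

The report shows directly which parts consume the 99% and which the 1%.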
I've seen the advice that subfunctions must be created when the main function exceeds 1000 lines of code. Others use a limit on the cyclomatic complexity, e.g. a McCabe complexity > 10, to determine when code must be moved to subfunctions to allow complete testing. Another idea: "If you cannot describe the job of a function in one sentence anymore, split it into subfunctions."
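MATLAB can compute the McCabe complexity for you via checkcode (the file name is illustrative):

```matlab
% Report the cyclomatic (McCabe) complexity of each function in a file,
% in addition to the usual Code Analyzer messages.
checkcode('myFunction.m', '-cyc')
```

Functions whose reported complexity exceeds your chosen limit (e.g. 10) are candidates for splitting.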
You see: runtime does not appear among the criteria, but rules for reducing complexity do, to avoid bugs and improve maintainability.
More Answers (1)
Walter Roberson on 15 Jul 2017
It appears that anonymous functions have lots of overhead. See test code attached.
It appears that on my system, each layer of plain function call adds about 0.05 * 10^(-6) seconds of overhead (that is, about 5E-8 seconds), but each level of anonymous function adds 2 to 3 * 10^(-6) seconds (that is, about 2.5E-6 seconds).
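A minimal version of such a comparison, using timeit (the choice of sin as the wrapped function is arbitrary; your numbers will differ from the ones above):

```matlab
% Compare a direct function handle against an anonymous wrapper
% that adds one extra call layer.
f_direct = @sin;          % handle straight to the built-in
f_anon   = @(x) sin(x);   % anonymous function wrapping the same call
x = 0.5;
t_direct = timeit(@() f_direct(x));
t_anon   = timeit(@() f_anon(x));
fprintf('direct: %.3g s,  anonymous: %.3g s\n', t_direct, t_anon);
```

Nesting several anonymous layers (e.g. g = @(x) f_anon(x)) multiplies the anonymous-call overhead accordingly.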