Is it better to have one function or multiple instances of one code?

Hello everyone. I was wondering whether it is computationally cheaper to have MATLAB run multiple copies of the same code rather than call a custom function repeatedly. For example, I could write a simple set of conditional statements and a couple of really fast matrix operations and place that wherever it is needed (copy-and-paste, maybe with slight adjustments). The code probably takes a few milliseconds, which makes me believe that a function would slow it down. In my case, I want to run the same block of code three or four times. Would the overhead of a function make it significantly slower?
Also, can someone give me a hyperlink to specific documentation that explains when it's best to start thinking about making a function rather than extending the current script?
I don't really know how to search for this answer on the internet. Thanks!
  1 Comment
Stephen23 on 15 Jul 2017
Edited: Stephen23 on 15 Jul 2017
"when it's best to start thinking about making a function rather than extending the current script?"
Right now! Scripts are less reliable than functions, what with all of those variables hanging around interfering with each other, and slower too: "Use functions instead of scripts. Functions are generally faster."
"The code probably takes a few milliseconds which causes me believe that a function would make it slow"
MATLAB does all kinds of intelligent runtime optimizations behind the scenes, so any optimization should be based on tests and good practice. If you write code around what you presume might be slow, you skip the step that matters: first write good code, then optimize if required. Good code in this case means avoiding copy-and-paste, which is prone to errors, makes code maintenance a horror, and makes code less understandable (which leads to more bugs). You would be much better off writing clearly defined functions and testing them thoroughly. Worry about optimization later, once you know where the actual bottlenecks are.


Accepted Answer

Jan on 14 Jul 2017
Edited: Jan on 14 Jul 2017
I never write scripts, because the variables declared in one script interfere with other scripts. Any possibility of influencing parts of the code that are far away is a source of bugs - see https://en.wikipedia.org/wiki/Action_at_a_distance_(computer_science).
But the question remains when to separate a part of a function to create a new function. This can be decided by the code complexity, but how you measure it and where the limit lies is a question of taste. As soon as a piece of code is a compact unit that can also be reused from other functions, it is worth moving it to its own function. Then it can be tested exhaustively by a unit test. If it is a bottleneck for any code, it can be optimized or mex-ed, and then all code using this subfunction will profit from the speed.
There is a certain overhead in calling any function, whether it is an M-function or a built-in. Built-ins are faster: for example, calling the M-function mean(X) is slower than computing sum(X) / numel(X) with built-ins. But this overhead is small.
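A rough way to check this on your own machine is sketched below. It assumes X is a vector (for a matrix, mean(X) works column by column, so the two expressions would no longer agree), and the array size is arbitrary. The numbers will vary with MATLAB release and hardware.

```matlab
% Compare the M-function mean with the equivalent built-in expression.
X = rand(1, 1e4);                        % vector input, so both forms agree
tMean = timeit(@() mean(X));             % timeit repeats the call and returns a typical time
tSum  = timeit(@() sum(X) / numel(X));
fprintf('mean: %.3g s, sum/numel: %.3g s\n', tMean, tSum);
```

Note that the anonymous-function wrappers add their own small cost to both measurements, so only the difference between the two times is meaningful.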
Sometimes the error checks in functions take more time than the actual computations (see polyfit). But this is a different problem.
While inlined code can save milliseconds of runtime compared to functions, the time for programming and debugging profits massively from using functions. If you have 4 almost identical parts, applying a modification to all of them is a mess compared to having 1 external function with a parameter for the differences. Therefore copy&paste programming is considered an anti-pattern. Maintaining or expanding such code is too prone to typos.
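As an illustration of "1 function with a parameter for the differences", here is a minimal sketch. The helper scaleAndClip and its parameters gain and limit are made up for this example and only stand in for the asker's conditional plus matrix operations. Local functions at the end of a script require R2016b or later; otherwise put the function in its own file.

```matlab
% One function with parameters replaces several near-identical pasted blocks.
A = magic(4);
lowGain  = scaleAndClip(A, 0.5, 5);    % first use
highGain = scaleAndClip(A, 2.0, 20);   % second use, only the parameters differ

function B = scaleAndClip(A, gain, limit)
% Hypothetical helper: scale A by gain, then clip values above limit.
B = gain * A;
if any(B(:) > limit)
    B(B > limit) = limit;
end
end
```

A fix or change to the clipping rule now happens in one place instead of in every pasted copy.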
Caring too much about the runtime during the programming is often a premature optimization: The goal is to solve the problem. Concentrating on the runtime from the beginning can lead to unreadable code that can no longer be adjusted or expanded. It is smarter to create stable and clean code first and find the runtime bottlenecks later. Then you can be sure not to "optimize" code that takes only 1% of the total runtime, where a 100% speedup of this part (making it twice as fast) reduces the total runtime by only 0.5%.
I've seen the advice that subfunctions must be created when the main function exceeds 1000 lines of code. Others use a limit on the "cyclomatic complexity", e.g. McCabe complexity > 10, to determine the need to move code to subfunctions so that complete testing is possible. Another idea is: "If you cannot describe the job of a function in one sentence anymore, split it into subfunctions."
You see: runtime does not appear in these criteria. What counts is reducing complexity to avoid bugs and improve maintainability.
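If you want to try the McCabe criterion, the Code Analyzer can report the complexity through checkcode with the '-cyc' option. The sketch below points it at polyfit.m only because that file ships with MATLAB; substitute your own file. The variable names are arbitrary.

```matlab
% Report the McCabe complexity of each function in an existing file.
targetFile = which('polyfit');           % full path to an M-file that ships with MATLAB
info = checkcode(targetFile, '-cyc');    % struct array of Code Analyzer messages
disp({info.message}');                   % the complexity lines appear among the messages
```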

More Answers (1)

Walter Roberson on 15 Jul 2017
It appears that anonymous functions have lots of overhead. See test code attached.
It appears that on my system, each layer of pure function call adds about .05 * 10^(-6) seconds of overhead (that is, about 5E-8 seconds), but each level of anonymous function adds 2 to 3 * 10^(-6) seconds (that is, about 2.5E-6 seconds).
  3 Comments
Tyler Warner on 20 Jul 2017
Thanks! It is true that anonymous functions have nasty overhead times. I converted an anonymous function to inline code. I'm not sure why we bothered with a function in the beginning.
Walter Roberson on 20 Jul 2017
I extended Philip's extension of my tests. I estimate from the results that a single level of anonymous function adds about 0.47 seconds per million calls.
I also experimented using timeit() instead of tic/toc, but the individual function times were too small for timeit to measure with any accuracy.
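For readers who want to repeat this kind of measurement, a rough tic/toc sketch follows. It is not the attached test code. The helper affineLocal, the loop count N, and the trivial expression are assumptions chosen for illustration, and MATLAB's runtime optimizations may distort a loop this simple, so treat the results as order-of-magnitude only. Local functions at the end of a script require R2016b or later.

```matlab
% Compare inline code, a local function, and an anonymous function over many calls.
N = 1e6;
x = 0.5;

tic
for k = 1:N
    y = x * 2 + 1;             % inline expression
end
tInline = toc;

tic
for k = 1:N
    y = affineLocal(x);        % call to a local (named) function
end
tLocal = toc;

affineAnon = @(v) v * 2 + 1;   % anonymous function with the same body
tic
for k = 1:N
    y = affineAnon(x);
end
tAnon = toc;

fprintf('per call: inline %.3g s, local %.3g s, anonymous %.3g s\n', ...
    tInline / N, tLocal / N, tAnon / N);

function y = affineLocal(v)
% Hypothetical helper used only for this comparison.
y = v * 2 + 1;
end
```

Timing the whole loop and dividing by N sidesteps the problem that a single call is too short to measure on its own.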
