Global workspace usage for efficiency?

Are there cases when using global variables is the most efficient implementation?
I hate using the global workspace, but I end up doing it all the time. There is only one reason I have for doing this: performance/efficiency.
For concrete examples, I have two heavily optimized submissions on the FEX: a particle simulator and real-time motion from video.
I am well aware of how JIT acceleration works, and how MATLAB's in-place operations work. The functions setappdata/getappdata seem to be good candidates for replacing the global workspace; however, consider a situation where you modify the data in the calling function. If, as is the case in the two FEX submissions described above, I call functions often and share large amounts of data, isn't the global workspace more efficient than other solutions?
For MATLAB before 2007 I would say the answer to this question is definitely yes: the global workspace is the fast but ugly way. I don't feel too old to learn new tricks, so please enlighten me here, and the next code I write will be free of globals.
Note that milliseconds count.
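To make the pattern concrete, here is a minimal sketch (all names hypothetical) of the kind of situation I mean: a timer callback that must read and modify large shared state on every tick, with no way to pass that state in as an argument.

```matlab
function startSim()
    % Large shared state that every timer tick must read and modify.
    global simState                 % hypothetical variable name
    simState = zeros(1e6, 3);      % e.g. particle positions

    % The callback cannot receive simState through "normal" argument
    % passing -- it is invoked by the timer, not by my code.
    t = timer('ExecutionMode', 'fixedRate', 'Period', 0.01, ...
              'TimerFcn', @onTick);
    start(t);
end

function onTick(~, ~)
    global simState
    simState = simState + 0.01;    % read-modify-write of the shared state
end
```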

Answers (2)

However, I suspect a lot of the time cost is in locating the data. If you are in a mex routine, you would grab the appropriate pointer (and maybe a data pointer as well), and there would be no further cost penalty beyond copy-on-write semantics.

6 Comments

Thank you Walter. It's a good reference link. If one has the option to provide the data to the function through "normal" means, that is definitely the best option.
It's the situations where this cannot be done that I am considering; I should have written that more clearly. In the example submissions referenced, I am forced to have some form of data-gathering inside the function, because they are event or timer callbacks.
I did mention "setappdata". If you are developing a GUI for example, I can think of the following ways of sharing data between callbacks:
  • function guidata
  • functions setappdata/getappdata
  • nested functions, multi-scope variables (same m-file for all your functions)
  • persistent memory in mex functions (essentially making your own setappdata)
  • global variables
When I get into such situations, I just use global variables. I want something else, however, but I have always thought globals to be the fastest of the list, with the exception of nested functions... but you can't use those when your project grows.
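For reference, here is a hedged sketch of how two of the options above are typically used (the figure handle and field names are my own, hypothetical, choices). The key difference is that guidata works on one struct holding everything, while appdata lets you pull back a single named item:

```matlab
fig = figure;

% Option: setappdata/getappdata -- retrieve one named item at a time.
setappdata(fig, 'bigData', rand(1e6, 3));
d = getappdata(fig, 'bigData');

% Option: guidata -- one struct holding everything; each update stores
% the whole struct back onto the figure.
h = guidata(fig);           % empty until something has been stored
h.bigData = rand(1e6, 3);
guidata(fig, h);
```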
... as a quick side-note... the issue of optimizing with mex functions has become far less intuitive with JIT and copy-on-write.
As I understand it these days, the only way to implement a mex function is by having the data copied from the MATLAB workspace. It used to be that you could get a pointer straight into the memory allocated by MATLAB, and that you could even modify it by reference from within the mex. If you have information on how to get a shared memory buffer between MATLAB and a mex module, especially one that would not throw a wrench in the machinery for the rest of MATLAB, I would be very happy to hear about it.
shared variables via nested functions would be the most efficient for that case.
setappdata() has the benefit of being able to pull back a particular substructure, and so is more efficient than guidata, which has to copy everything.
handle objects can also provide efficient access to data in multiple scopes -- though you do have to be careful to not allow all references to the objects to go out of scope. (Persistence of objects in MATLAB is something I have not read much about.)
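A minimal sketch of the handle-object approach mentioned above (the class and property names are hypothetical). Because the class derives from handle, every copy of the variable references the same underlying object, so a callback can mutate shared data without forcing a copy of the whole payload:

```matlab
% SharedState.m -- must live in its own file.
% Deriving from handle gives reference semantics: passing a SharedState
% to a function passes a reference, not a deep copy of .data.
classdef SharedState < handle
    properties
        data
    end
end
```

Usage would be along the lines of `s = SharedState(); s.data = rand(1e6, 3);` and then capturing `s` in a callback, e.g. `'Callback', @(~,~) doStep(s)`, where doStep modifies `s.data` in place. As noted, one has to keep at least one reference alive, or the object is destroyed.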
Nested functions indeed.
What that amounts to is an agile workspace stretching across several functions. I wish I could keep this agility. Wouldn't it be great to have something like:
using workspace myProjSpace
clearing it could be done in the same manner as the default workspaces:
clear myProjSpace
as part of MATLAB? This way, when your project grows into multiple m-files, you could get the same shared-workspace benefits as with nested functions. If something like that already exists, I would love to know about it.
So, making GUI with efficiency in mind:
Start by writing nested functions and use shared variables between them. These nested functions are registered as event handlers for the many GUI elements.
When the size of the m-file becomes unmanageable, split the file into several m-files. The functions are no longer nested, shared variables become globals; a cost to efficiency is noted, but smaller than with guidata and set/getappdata.
This has the benefit of being faster to implement as well, and the new code resembles the old (syntax highlighting even gives global variables and nested-function variables the same color coding).
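The starting point of that progression, a GUI built on nested callbacks sharing the parent workspace, might look like this minimal sketch (control and variable names hypothetical):

```matlab
function myGui()
    % Shared state lives in this workspace; nested callbacks see it
    % directly, so no data is passed or copied through arguments.
    bigData = rand(1e6, 3);

    fig = figure;
    uicontrol(fig, 'Style', 'pushbutton', 'String', 'Step', ...
              'Callback', @onStep);

    function onStep(~, ~)
        bigData = bigData * 1.01;   % modifies the shared variable
    end
end
```

When this file is later split up, each `bigData` reference would become a `global bigData` declaration, which is why the two styles read so similarly.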
but...
using the global workspace opens a can of worms for development, so I really would like a better solution.
I think getappdata() of a unique name should be faster than global of the same name, but I would need to test to be sure.
I could see how reading data retrieved from getappdata might be faster, but I can't see how it could be faster for a combined read and write. That would force a copy of the entire data, while with a global variable there is no demand for a local-workspace copy. This is the assumption I have been working with, anyway...
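That assumption could be tested with something like this rough timing sketch (sizes and iteration counts are arbitrary; here the appdata is stored on the root object, handle 0):

```matlab
function timingSketch()
    % Compare read-modify-write through a global vs. through appdata.
    global G
    G = rand(1e6, 1);
    setappdata(0, 'G', rand(1e6, 1));   % store on the root object

    tic
    for k = 1:100
        G = G + 1;                      % in-place candidate via global
    end
    tGlobal = toc;

    tic
    for k = 1:100
        d = getappdata(0, 'G');        % fetch into local workspace
        d = d + 1;
        setappdata(0, 'G', d);         % store the modified copy back
    end
    tAppdata = toc;

    fprintf('global: %.4f s, appdata: %.4f s\n', tGlobal, tAppdata);
end
```

This is only a sketch; real timings would depend on the MATLAB version, JIT behavior, and whether copy-on-write elides the copies.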


Walter's good points above are not an answer to my question. Let me try to give my point of view on this, and then anyone can feel free to rip it apart. As I said before, I want to be corrected, because I absolutely hate using the global workspace.
However, I believe there are times when using the global workspace is the most efficient. This is both in terms of efficiency of the finished app, and in terms of development speed and readability of code.
The situation occurs with callbacks, for example in GUIs and timer callbacks.
For those applications, there are only two ways I know of where you can be guaranteed that no extra copying of data occurs in MATLAB: nested-scope variables, and global variables.
Variables created in the topmost function are reachable from the nested functions below, which requires the entire project to be defined in a single m-file. Starting application development with efficiency in mind, one may therefore begin with a nested-functions solution and, when the m-file grows, split the functions into many files. When multiple files are introduced, with one function per file, one needs to replace the nested variables; at that point there are several ways to make the variables reachable. Only two methods are worth mentioning here, considering efficiency:
  • global variables
  • setappdata/getappdata
If the data shared this way is large, and the application will need to modify this data in multiple files, then I believe the following to be true about the global variables solution:
  • globals are the most efficient, as in requiring the smallest amount of memory
  • globals are the easiest and fastest way to implement the app
This is not an answer meant to suggest global variables as a good development habit. If anything, it's to point out shortcomings of the language. In MathWorks' defence, improving this probably did not seem a worthwhile effort prior to the introduction of JIT and copy-on-write. Now that those are in the language, the issues mentioned above can be efficiency bottlenecks. As a possible future addition to the language... consider something like:
top of the first function:
define workspace myProjSpace
at the top of all the reliant m-files (the formerly nested functions):
using workspace myProjSpace
A hack I could write is a MATLAB preprocessing script that looks for "using workspace" at the top of every file, and then inserts those functions as nested versions into the file that contains "define workspace". This would probably break numerous other things in MATLAB, but would work for specific callback use-cases.
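Short of a language change, a named workspace can be roughly approximated today with a persistent struct inside a single accessor function. This is purely a hypothetical sketch of my own (function name chosen to echo the proposal above), and note it still pays a lookup and copy cost on every access, unlike true nested variables:

```matlab
function out = myProjSpace(name, value)
    % Poor man's named workspace: one persistent struct, shared by every
    % caller of this function, in any m-file.
    persistent store
    if isempty(store), store = struct(); end

    if nargin == 2
        store.(name) = value;   % set:   myProjSpace('bigData', d)
    elseif nargin == 1
        out = store.(name);     % get:   d = myProjSpace('bigData')
    else
        store = struct();       % reset: myProjSpace()  (like "clear")
    end
end
```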

Asked: on 14 Jan 2016
Answered: on 20 Jan 2016
