# Big O Complexity for Matlab code - Sum of Matrix Operation

47 views (last 30 days)
Catherine on 2 Feb 2014
Commented: Walter Roberson on 2 Feb 2014
Hello,
This is my first time really programming with MATLAB, and I'm trying to analyze the complexity. The challenge I have is how to think of the sum of matrix operation: e.g., say A is of (nxm), and I want to do sum(A), which returns an answer of (1xm). In this case, should I interpret it as O(m) for going through the columns? Or...?
I guess I'm having trouble how to interpret matrix operations -- is it similar to having a "loop" in java/c terms?
Catherine

Walter Roberson on 2 Feb 2014
Usually big-O notation is used with respect to one variable, with the others held fixed.
For a fixed m, sum(A) with A being n x m, is equivalent to doing m summations of length n. Each length n summations requires O(n) operations, so you would have m * O(n) . But multiplicative constants are ignored for big-O notation, so you would just say O(n)
If m varies according to n, then you have a different situation and you need to know the relationship. For example for sum(A) where A is n x n, that would give you n * O(n) which is O(n^2)
##### 3 CommentsShow 1 older commentHide 1 older comment
Catherine on 2 Feb 2014
I'm looking at set of operations done on a network's adjacency matrix, so it would be A would be n x n. I guess that makes it O(n^2).
Catherine
Walter Roberson on 2 Feb 2014
When you have n^2 elements and you have to "touch" them all at least once, you cannot do it without at least n^2 operations.

Jan on 2 Feb 2014
Edited: Jan on 2 Feb 2014
The big-O notation is meaningful in coding theory. In practize analysing the computational complexity must consider the physical machine:
• Matlab's sum uses multi-threading above a certain limit of data - it was 89000 elements in some Matlab versions. Therefore the computing time for summing 88999 and 89000 elements can vary by up to the number of built-in cores. Unfortunately for single-thread machines, the trial to start mutliple threads wastes a lot of processing time (this might be cleaned in the newest Matlab versions).
• When an algorithm is multi-threaded, it depends on the details of the implementation, if the creation of the output can cause an invalidation of the cache-line: The RAM is copied in chunks of e.g. 64 bytes to the caches. When the different cores write to the same chunk of data, caring for the consistency of the output data requires a lot of time, such that the total processing speed can be degraded to a single-threaded version.
• The calculations are much faster, when the data match into the processor's data cache, because accessing the RAM is slow.
• Above a certain limit, data are stored in the virtual memory and swapped to the hard-disk. This is roughly a factor of 1000 slower than the RAM.
• The time for reserving memory for the result depends on the availability of free and zeroed memory. This detail cannot be controlled from the Matlab level, but depends on the operating system.
• Other applications should not occupy processor power, e.g. virus-scanners.
• Modern processors can vary their speed widely, e.g. by thermal throtteling or "turbo boost" (this actually means, that the processor runs slower, when all cores are under load). Therefore the processing speed depends e.g. on the cooling capabilities also.
In consequence the O-value of the algorithm does not have a strong correlation to the processing time, except if the problem is specified exactly: data size, cache sizes, enough free RAM, number of involved cores, processor type and temperature.
Nevertheless, Matlab's sum cannot do any magic and the underlying code is based on simple C-loops or perhaps some improved assembler code. Most matrix operations are performed by the BLAS library.
Catherine on 2 Feb 2014
Jan,
Thanks for your reply! I'm looking at the computational complexity because I'm trying to write it in a theory section of a paper. I may compare the actual time (given that it's a worst case scenario) next to it. I guess it makes sense that matlab can't do any magic -- I didn't know whether I have to break down matlab's each operation into normal C/Java style code before I can do the analysis, but I guess I should. Thank you for your helpful response :)
Catherine

Amit on 2 Feb 2014
Edited: Amit on 2 Feb 2014
Matlab sum without specification of dimension, for a matrix, will be for the column.
However for a vector, either row or column, will be for the whole vector.
The Matlab documentation is really good. You can check easily either online or just by typing on command window:
help function name, Like
help sum
This will be very useful when you are programming and get stuck!
##### 3 CommentsShow 1 older commentHide 1 older comment
Amit on 2 Feb 2014
This makes this question quite interesting. I am not sure which order this algorithm will fall but you can possibly do a test for this where you record time based on the number of elements.
I did a quick test (I am not sure how accurate that is) but it seems that it O(nm)
Catherine on 2 Feb 2014
Thank you for going through the testing process for me! It's nice to get some confirmation, experimentally as well :)