Parallel rank calculation for sparse matrices -- suggestions?
Show older comments
I need to calculate the rank of large (> 1 terabyte of non-zero elements) sparse matrices with MATLAB. Exploring the Parallel Toolbox, but can't seem to find anything that convinces me what is offers will be helpful. If I'm wrong, can someone point me in the right direction? Ideally, I'd just want to take my existing code that has "r=sprank(A)" in it and have that library call run in parallel, perhaps with some additional annotation as needed. Would using a GPUARRAY help here, for example? Doesn't seem like it, but perhaps I'm wrong. Hoping someone here can help me out. Thanks!
Barry Fagin
Professor of Computer Science
barry.fagin@afacademy.af.edu
9 Comments
James Tursa
on 30 Aug 2022
Edited: James Tursa
on 30 Aug 2022
What is 1T? Is there any exploitable structure to the sparse matrix?
Barry
on 30 Aug 2022
Bruno Luong
on 30 Aug 2022
I'll move my comment so you can accept it if it helps
Walter Roberson
on 30 Aug 2022
sparse arrays are fairly inefficient on gpu array. Do not use gpuarray for this purpose.
Barry
on 31 Aug 2022
Matt J
on 5 Sep 2022
A = sparse(tall,skinny)
How tall? How skinny? What currently rules out the approach already discussed below of using rank(full(A.'*A))?
Bruno Luong
on 6 Sep 2022
"After doing a little digging, I now believe sprank() merely counts the number of non-zero columns. This is not what I require."
It does something more intelligent than that it match the row to column through Dulmange Mendelsohn permutation. In your case might be all the non-zero column can be matched. But that is exactly what structural rank means.
Accepted Answer
More Answers (2)
Matt J
on 30 Aug 2022
0 votes
Is the matrix square or is it tall and thin? If the later, then it may be an easier computation to compute rank(A.'*A).
9 Comments
NOTE: sprank is NOT rank
S = sparse(ones(3));
sprank(S)
rank(full(S))
Matt J
on 31 Aug 2022
Interesting. But the OP seems to use "rank" and "sprank" interchangeably, so either could be the intended computation.
Barry
on 31 Aug 2022
John D'Errico
on 31 Aug 2022
Edited: John D'Errico
on 31 Aug 2022
Yes, but I think you do not understand that IF your matrix is tall but sufficiently narrow, then you CAN do this:
rank(full(A.'*A))
The problem with this solution is it loses a lot of precision. So a matrix that is CLOSE to singular but is not truly singular will suffer a problem. For example:
format long g
A = [1:5;2:6]';
A(:,3) = A(:,2) + randn(5,1)*1e-12
So it is not truly singular, nor is it even sparse, but the sparsity or the size of it are irrelevant here.
rank(A)
So rank knows it to be full rank. However, the trick by forming A'*A causes rank to be confused.
rank(full(A.'*A))
As you can see, rank fails here, even though we know that A itself was of full rank. The problem is, when you form A'*A, this operation squares the singular values of A.
svd(A)
but here, squaring the singular values causes a loss of precision, and rank is unable to see the true rank of the product matrix, only computing the numerical rank of that matrix, which is now only 2.
svd(A'*A)
"it may be an easier computation to compute rank(A.'*A)."
This is not true in general for complex matrix
A=[1 i; i -1]
rank(A.'*A)
rank(A)
the correct formula is
rank(A'*A)
Bruno Luong
on 31 Aug 2022
btw, For fat and wide matrix
rank(A) == rank(A*A')
There is no reason why the trick works for one diimension but not the other.
Barry
on 31 Aug 2022
I would take caution before relying on rref. It has been often found to lack robustness, see e.g.,
For thin and tall sparse matrix A of size (m x n), m>>n and n in the order of 1000s, it might be possible to compute the rank using q-less qr, which is better than rank(A'*A) which has the drawback of square the condition number.
A=sprand(100000,10,0.1)*sprand(10,100,0.1);
R=qr(A,0);
rank(full(R))
Categories
Find more on Linear Algebra in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!