Parallel rank calculation for sparse matrices -- suggestions?

Question

0 votes

I need to calculate the rank of large (> 1 terabyte of non-zero elements) sparse matrices with MATLAB. Exploring the Parallel Toolbox, but can't seem to find anything that convinces me what is offers will be helpful. If I'm wrong, can someone point me in the right direction? Ideally, I'd just want to take my existing code that has "r=sprank(A)" in it and have that library call run in parallel, perhaps with some additional annotation as needed. Would using a GPUARRAY help here, for example? Doesn't seem like it, but perhaps I'm wrong. Hoping someone here can help me out. Thanks!

Barry Fagin

Professor of Computer Science

barry.fagin@afacademy.af.edu

9 Comments
Show 7 older comments Hide 7 older comments

Matt J on 5 Sep 2022

A = sparse(tall,skinny)

How tall? How skinny? What currently rules out the approach already discussed below of using rank(full(A.'*A))?

Bruno Luong on 6 Sep 2022

"After doing a little digging, I now believe sprank() merely counts the number of non-zero columns. This is not what I require."

It does something more intelligent than that it match the row to column through Dulmange Mendelsohn permutation. In your case might be all the non-zero column can be matched. But that is exactly what structural rank means.

Sign in to comment.

Sign in to answer this question.

Follow Question

Answer 1

Bruno Luong on 30 Aug 2022

Moved: Bruno Luong on 30 Aug 2022

0 votes

Something I don't get : sprank doc tells when you run witth threadpool it runs in parallel.

5 Comments
Show 3 older comments Hide 3 older comments

Bruno Luong on 31 Aug 2022

IMO the doc should clarify this.

But then I still have some doubt. If user runs multiple sprank in parallel, it will take less worker to work on one sprank. So at the end do we really speed up the whole thing?

Edric Ellis on 1 Sep 2022

Yes, the "extended capability" description could probably do with some refinement to make it clear exactly what works. (Today, there are a good number of MATLAB functions that cannot be run on a thread pool worker, and this "extended capability" really means simply that the function can run on a thread pool worker).

The performance situation is ... complicated, unfortunately. There is no single simple answer as to whether running N copies of a given function concurrently on workers is faster than running N copies sequentially on the client. It depends on all sorts of details of the implementation of that function - but primarily whether the function is already intrinsically multithreaded by MATLAB itself.

Sign in to comment.

Answer 2

Matt J on 30 Aug 2022

0 votes

Is the matrix square or is it tall and thin? If the later, then it may be an easier computation to compute rank(A.'*A).

9 Comments
Show 7 older comments Hide 7 older comments

John D'Errico on 31 Aug 2022

Edited: John D'Errico on 31 Aug 2022

Open in MATLAB Online

@Barry

Yes, but I think you do not understand that IF your matrix is tall but sufficiently narrow, then you CAN do this:

rank(full(A.'*A))

The problem with this solution is it loses a lot of precision. So a matrix that is CLOSE to singular but is not truly singular will suffer a problem. For example:

format long g
A = [1:5;2:6]';
A(:,3) = A(:,2) + randn(5,1)*1e-12
A = 5×3
                         1                         2          1.99999999999926
                         2                         3          2.99999999999993
                         3                         4          3.99999999999839
                         4                         5          5.00000000000203
                         5                         6          6.00000000000069

So it is not truly singular, nor is it even sparse, but the sparsity or the size of it are irrelevant here.

rank(A)
ans = 
     3

So rank knows it to be full rank. However, the trick by forming A'*A causes rank to be confused.

rank(full(A.'*A))
ans = 
     2

As you can see, rank fails here, even though we know that A itself was of full rank. The problem is, when you form A'*A, this operation squares the singular values of A.

svd(A)
ans = 3×1
          15.3157988606437
         0.652920562027581
      1.62159646645945e-12

but here, squaring the singular values causes a loss of precision, and rank is unable to see the true rank of the product matrix, only computing the numerical rank of that matrix, which is now only 2.

svd(A'*A)
ans = 3×1
          234.573694739694
         0.426305260318399
      4.07630245872306e-17

Barry on 5 Sep 2022

Edited: Barry on 5 Sep 2022

I like the A.'*A trick. Or one could ...wait for it ...

... use MATLAB's rref() command, which works on sparse matrices, and then count the nonzero rows. Sigh.

But that in turn begs the question of roundoff errors. Time to move to a new question, which I'll post shortly.

--BF

Matt J on 5 Sep 2022

Edited: Matt J on 5 Sep 2022

I would take caution before relying on rref. It has been often found to lack robustness, see e.g.,

https://www.mathworks.com/matlabcentral/answers/586889-rref-gives-an-impossible-answer#answer_487541

Sign in to comment.

Answer 3

Bruno Luong on 5 Sep 2022

Open in MATLAB Online

0 votes

For thin and tall sparse matrix A of size (m x n), m>>n and n in the order of 1000s, it might be possible to compute the rank using q-less qr, which is better than rank(A'*A) which has the drawback of square the condition number.

A=sprand(100000,10,0.1)*sprand(10,100,0.1);
R=qr(A,0);
rank(full(R))
ans = 10

0 Comments
Show -2 older comments Hide -2 older comments

Sign in to comment.

Parallel rank calculation for sparse matrices -- suggestions?

9 Comments
Show 7 older comments Hide 7 older comments

Accepted Answer

5 Comments
Show 3 older comments Hide 3 older comments

More Answers (2)

9 Comments
Show 7 older comments Hide 7 older comments

0 Comments
Show -2 older comments Hide -2 older comments

Categories

Products

Release

Tags

Community Treasure Hunt

Parallel rank calculation for sparse matrices -- suggestions?

9 Comments Show 7 older comments Hide 7 older comments

Accepted Answer

5 Comments Show 3 older comments Hide 3 older comments

More Answers (2)

9 Comments Show 7 older comments Hide 7 older comments

0 Comments Show -2 older comments Hide -2 older comments

Categories

Products

Release

Tags

See Also

Community Treasure Hunt

9 Comments
Show 7 older comments Hide 7 older comments

5 Comments
Show 3 older comments Hide 3 older comments

9 Comments
Show 7 older comments Hide 7 older comments

0 Comments
Show -2 older comments Hide -2 older comments