Modifying a large sparse matrix efficiently

My matrix has a lot of zero elements, but it also has a lot of elements that are small enough that I want to zero them out. What is the best way to do this? Keep in mind that it's a sparse matrix, so wouldn't indexing into it take too much time? Converting it back to a full matrix may not be memory efficient either.

Accepted Answer

Matt J on 12 Mar 2013
Edited: Matt J on 13 Mar 2013
The fastest method would probably be to rebuild the matrix from scratch, without the elements that fall below the threshold. The examples below give some evidence of this. Setting elements directly to zero by indexing isn't so inefficient (as long as you don't change one element at a time in a loop), but generating the indices can be subtly tricky with sparse matrices and results in some overhead.
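X=sprand(16000,16000,.0003);
X1=X;
tol=0.5;
% Method 1: zero the small entries by linear indexing
tic;
[i,j,s]=find(X);
i(s>tol)=[];   % drop entries above tol; what remains are the ones to zero
j(s>tol)=[];
idx=sub2ind(size(X),i,j);
X1(idx)=0;
toc %Elapsed time is 0.008526 seconds.
% Method 2: rebuild the matrix from the entries above the threshold
% (note: find(X>tol) returns s as logical ones, not the original values)
tic;
[i,j,s]=find(X>tol);
X2=sparse(i,j,s);
toc %Elapsed time is 0.004636 seconds.
% Method 3: apply the threshold to the nonzero entries with spfun
tic;
X3=spfun(@(a)a.*(a>=tol),X);
toc %Elapsed time is 0.007094 seconds.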
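Single tic/toc measurements at the millisecond scale can be noisy. As a rough sketch (not part of the original answer), the comparison could be repeated with timeit, which calls each function handle several times and reports a typical run time. The variable names tMask and tFun are made up for this illustration, and timeit requires a MATLAB release that includes it.

```matlab
% Sketch: time two of the thresholding approaches with timeit
X = sprand(16000, 16000, 0.0003);
tol = 0.5;

% Logical masking of the sparse matrix
tMask = timeit(@() (X > tol) .* X);

% spfun applies the function to the nonzero entries only
tFun = timeit(@() spfun(@(a) a .* (a >= tol), X));

fprintf('mask: %.6f s, spfun: %.6f s\n', tMask, tFun);
```

Absolute numbers will differ from those quoted above, depending on machine and MATLAB version.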
  4 Comments
Matt J on 13 Mar 2013
Edited: Matt J on 13 Mar 2013
Walter's answer shows how to fix this, but I've now identified a 4th method that's even faster (and simpler) than the other three:
tic;
X4=(X>tol).*X;
toc;
%Elapsed time is 0.001780 seconds.
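The mask X>tol only handles positive entries. For a matrix with signed values, the same idea can presumably be applied to abs(X), as in Walter's answer. The sketch below assumes tol is positive (otherwise the mask would also be true at the zero entries) and checks the result against a rebuild from triplets. The names Xr and keep are illustrative only.

```matlab
% Sketch: threshold by magnitude and cross-check against a triplet rebuild
X = sprandn(2000, 2000, 0.001);   % sparse matrix with signed nonzeros
tol = 0.5;                        % assumed positive

% Masking: keep entries whose magnitude exceeds tol
X4 = (abs(X) > tol) .* X;

% Rebuild from (row, column, value) triplets, passing the size explicitly
[m, n] = size(X);
[i, j, s] = find(X);
keep = abs(s) > tol;
Xr = sparse(i(keep), j(keep), s(keep), m, n);

% The two results should be identical
disp(isequal(X4, Xr));
```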


More Answers (3)

Walter Roberson on 12 Mar 2013
tol = 0.001;
[m, n] = size(YourSparseMatrix);
[i, j, s] = find(YourSparseMatrix);
idx = abs(s) < tol;   % entries to drop
i(idx) = [];
j(idx) = [];
s(idx) = [];
YourSparseMatrix = sparse(i, j, s, m, n); %write over it with new sparse matrix, keeping the original size
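One detail to watch when rebuilding from triplets: sparse(i,j,s) infers the matrix dimensions from the largest row and column indices, so the result can shrink if the trailing rows or columns lose all their entries. A tiny illustration, with made-up values:

```matlab
% Sketch: size inference in sparse(i,j,s) versus explicit dimensions
A = sparse([1 2], [1 2], [5 1e-6], 4, 4);   % 4-by-4 with two nonzeros
tol = 1e-3;

[i, j, s] = find(A);
keep = abs(s) >= tol;                        % only the entry at (1,1) survives

B1 = sparse(i(keep), j(keep), s(keep));          % dimensions inferred from indices
B2 = sparse(i(keep), j(keep), s(keep), 4, 4);    % dimensions given explicitly

disp(size(B1));
disp(size(B2));
```

Passing the original dimensions as the fourth and fifth arguments avoids the surprise.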

James Tursa on 14 Mar 2013
I couldn't resist making a small mex function to do this. In addition to zeroing out small values, it can zero out values within a user-defined range (not necessarily small) and zero out or replace NaN values. It works with real or complex matrices, and can either generate a new matrix or operate in-place. For in-place operations, it throws an error if the matrix is shared, to avoid unintended side effects (but this behavior can be overridden). You can find it here:
The advantage of this mex implementation is that it doesn't generate potentially large intermediate arrays to get the job done.

Azzi Abdelmalek on 12 Mar 2013
Define your tolerance tol
tol=0.001
a(abs(a)<tol)=0
  1 Comment
Matt J on 12 Mar 2013
Edited: Matt J on 13 Mar 2013
No, that will be very slow and memory consuming, because abs(a)<tol will include all the many many zero entries of a, too. It will generate a very large and dense logical matrix (still in sparse form, though) with 1's in place of all the 0's of a.
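A quick way to see the effect is to compare nonzero counts on a small example. This is only a sketch with an arbitrarily chosen size; the variable mask is illustrative.

```matlab
% Sketch: the mask abs(a)<tol is true at every zero entry of a
a = sprand(1000, 1000, 0.001);   % roughly 1000 nonzeros out of 1e6 entries
tol = 0.001;

mask = abs(a) < tol;             % sparse logical, but almost completely full

fprintf('nnz(a) = %d, nnz(mask) = %d, numel(a) = %d\n', ...
        nnz(a), nnz(mask), numel(a));
```

Because 0 < tol holds at every zero position, nnz(mask) comes out close to numel(a), which is where the time and memory go.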
