Create/deal big binary sparse matrices

Hi, I'm dealing with really big binary sparse matrices and I need to manipulate them (i.e allocate memory, multiplication etc.) I'm aware of the function sparse and I use it in my code. the first step is taking 4.5 second with parameters (t=8000,n0=2000).
And with bigger matrices, it takes about 2min whereas the rest of the code is taking about 5secs...
The question how one can efficiently allocate/create (a big) random binary sparse matrix?
tic
%%step 1 create random matrix with proba p=0.05
%I allocated first with sparse(t,n0) but the result was the same
%tried also false(t,n0)
A=rand(t,n0)<p;
toc
%step 3
tic
%finding number of rows of A that have 1 at both column i and column j
%by multiplying it with its transpose
B=sparse(A)'*sparse(A);
%getting numbers (i.e counts)
W=triu(B,1);
edges=(W>=meanvalue);
toc
Thanks in advance for your time and help.

1 Comment

It is not clear to me what take time. One thing for sure: don't use SPARSE as you did: i.e., generate full matrix then convert with sparse command. The efficient SPARSE command is with the form
SPARSE(rows, cols, values, ...). DO GENERATE SPARSE from the start.

Sign in to comment.

Answers (3)

I doubt it is the memory allocation or creation of the sparse matrix that is taking the time. I would think it much more likely that it is the matrix multiplication that is taking the time, as that will result in a matrix which is less sparse than the original matrix.

4 Comments

Yes indeed it seems so. but when I try :
A=rand(20000,15000)<0.05;
Matlab seems frozen. (4gb macbook pro, 64bit, 2.5ghz). I tried also to optimize the second step but no luck. The second step sometimes is really fast and sometime not. I think it has to do with the ratio #cols/#rows
Seeing as random values are independent of each other, build in pieces.
NP = 10; %number of fragments to piece together
PSize = n0/NP; %fragment width
rowcol = cell(K,2);
for K = 0 : NP - 1
[row,col] = find(rand(t,PSize) < p);
rowcol{K,1} = row;
rowcol{K,2} = col + Psize * K;
end
row = vertcat(rowcol{:,1});
col = vertcat(rowcol{:,2});
A = sparse(row,col,1);
I tried to use it but I think that there may be a problem in your code you used twice K.
rowcol = cell(K,2);
for K = 0 : NP - 1
what is the first K?
Sorry, the line should be
rowcol = cell(NP,2);

Sign in to comment.

See my comment above.
Instead of
A=rand(t,n0)<p;
Use sparse directly
A = logical(sprand(t, n0, p)); % OR
A = spones(sprand(t, n0, p));
Bruno

3 Comments

I saw your comment above but I don't know the correct way to create random binary sparse matrix. is it this way:
A=sparse(randomvector1,randomvector2,1); As W.Robertson suggested.
by the way here is the benchmark for rand(m,n)<0.05 vs sprandt(m,n,0.05) (m=n=10'000)
Elapsed time is 2.836211 seconds. rand(m,n)<0.05
Elapsed time is 5.449751 seconds. logical(sprand(t, n0, p));
Elapsed time is 5.134477 seconds. spones(sprand(t, n0, p));
Your timing does not mean much:
1) the full matrix cannot even be use for large dimension
2) the time needed later to convert to sparse is not taken into account.
correcting values
Elapsed time is 2.790257 seconds.
Elapsed time is 0.982951 seconds.
Elapsed time is 0.944833 seconds.

Sign in to comment.

m = 80000;
n = 10000;
p = 0.001;
nel = m*n*p;
rows = ceil(m*rand(1,nel));
cols = ceil(n*rand(1,nel));
A = sparse(rows, cols, 1);
B = A'*A;
...

Products

Asked:

on 28 Feb 2011

Edited:

on 20 Oct 2020

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!