How can I extract a certain 'cluster' of elements according to a particular condition on the elements?
I have a matrix (about 342 by 342) denoted by C(k,l), and I want to identify all clusters of indices of the original matrix that satisfy the condition C(k,l) > rho. That is, I want all square matrices C'(a,b) drawn from C(k,l) such that C'(a,b) > rho for all pairs of indices a and b.
For example, if I have the matrix C(i,j) as:
C = 1 0.8 0.7
0.8 1 0.5
0.7 0.5 1
With rho = 0.6, one correct square matrix I want my code to identify is:
C'= 1 0.7
0.7 1
This is not unique of course, and the result as given by the example above is not necessarily a submatrix. I am not sure of the best way to do this in MATLAB. If possible, I would also like to identify what a and b are for each possible matrix, e.g. for my example above a and b can be 1 or 3. The matrices are always symmetric and the diagonal entries are always 1.
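To make the condition concrete, here is a minimal sketch that tests whether one candidate index set satisfies it for the 3-by-3 example above. The variable names `idx`, `Cs`, and `isCluster` are illustrative and not part of the original question.

```matlab
% Example matrix and threshold from the question
C = [1 0.8 0.7; 0.8 1 0.5; 0.7 0.5 1];
rho = 0.6;

% Candidate index set (a and b both range over these indices)
idx = [1 3];

% Take the same rows and columns, then test every entry against rho
Cs = C(idx, idx);
isCluster = all(Cs(:) > rho);

% The full index set fails because C(2,3) = 0.5 is not greater than rho
Call = C(1:3, 1:3);
isClusterAll = all(Call(:) > rho);
```

Here `Cs` is the matrix `[1 0.7; 0.7 1]` from the example, so `isCluster` is true, while `isClusterAll` is false.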
Torsten
on 21 Jan 2016
C' = 1
or
C' = [1 0.8
0.8 1 ]
are also possible solutions ?
Best wishes
Torsten.
Guillaume
on 21 Jan 2016
Do you have an upper bound for the number of matrices that satisfy your condition?
A 342 x 342 matrix has nchoosek(342, 171)^2 ≈ 1.5e203 square submatrices of size 171 x 171. What if all of them satisfy your condition?
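As a rough check of that figure, the count can be evaluated in log space. This is a sketch only: it uses `gammaln` instead of calling `nchoosek` directly, since `nchoosek` may warn about limited precision for inputs this large. The variable names are illustrative.

```matlab
n = 342;
k = 171;

% log10 of nchoosek(n,k), computed via log-gamma: log(n!) = gammaln(n+1)
log10Choose = (gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1)) / log(10);

% Rows and columns are chosen independently, hence the square
log10Count = 2 * log10Choose;

fprintf('about 10^%.1f submatrices of size %d x %d\n', log10Count, k, k);
```

The exponent comes out a little above 203, consistent with the estimate of roughly 1.5e203.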
Kirby Fears
on 21 Jan 2016
Edited: Kirby Fears
on 21 Jan 2016
Ansh,
Do you only want submatrices along the diagonal? Is each submatrix required to have 1's along the diagonal?
Also, if there is a 3x3 submatrix S with values greater than rho, there are also 4 2x2 submatrices inside of S which are greater than rho. Do you want to count those too?
Image Analyst
on 21 Jan 2016
Obviously, as you can tell from the responses, this is confusing and not well described. We don't know why 0.8 is not in any of your output(s) even though it is > 0.6. Why is 0.8 not included anywhere even though it's greater than 0.6????
What if the connected blob consisting of numbers more than 0.6 is not a rectangle but some irregularly shaped blob? What then?
Help me get excited about this by describing the use case. Why do you want to do this? Knowing that may give us a clue as to a good approach to solving the main problem.
Ansh
on 22 Jan 2016
Answers (2)
Kirby Fears
on 21 Jan 2016
Edited: Kirby Fears
on 21 Jan 2016
Assuming you only want to find submatrices along the diagonal of C, the following code extracts all contiguous square submatrices (sizes 2 through sizeC-1) whose entries all exceed rho into a table S. This should be a good starting point for whatever assumptions you end up deciding on.
% make data
sizeC = 342;
rho = 0.6;
c = rand(sizeC);
c(1:(sizeC+1):end) = 1;   % set the diagonal to 1
% prep
S = cell((sizeC-2)*(sizeC-1),3);   % over-allocated; trimmed below
varNames = {'S','sizeS','diagC'};
idxRho = c>rho;
counterS = 1;
% traverse submatrix size, largest first
for sizeS = (sizeC-1):-1:2
    % traverse diagonal of c; the last valid start index is sizeC-sizeS+1
    for d = 1:(sizeC-sizeS+1)
        idx = d:(d+sizeS-1);
        block = idxRho(idx,idx);
        % store valid submatrix with meta info
        if all(block(:))
            S(counterS,:) = {c(idx,idx),sizeS,d};
            counterS = counterS + 1;
        end
    end
end
% drop extra rows of S
if counterS<=size(S,1)
    S(counterS:end,:)=[];
end
% convert S to table
S = array2table(S,'VariableNames',varNames);
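For a quick sanity check, the same diagonal-block scan can be run on a small matrix where the result is easy to verify by hand. The sketch below uses the 4-by-4 matrix D that appears later in this thread. The `found` array and its column layout are assumptions made for illustration, and the block sizes here run up to the full matrix.

```matlab
D = [1 0.8 0.9 0.5; 0.8 1 0.6 0.1; 0.9 0.6 1 0.7; 0.5 0.1 0.7 1];
rho = 0.6;
n = size(D, 1);
mask = D > rho;

% Each row of found is [blockSize, startIndex] for one valid diagonal block
found = zeros(0, 2);
for sizeS = 2:n
    for d = 1:(n - sizeS + 1)
        idx = d:(d + sizeS - 1);
        blk = mask(idx, idx);
        if all(blk(:))
            found(end+1, :) = [sizeS, d]; %#ok<SAGROW>
        end
    end
end
disp(found)
```

Only the 2-by-2 blocks starting at rows 1 and 3 pass. The block starting at row 2 fails because D(2,3) = 0.6 is not strictly greater than rho.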
Hope this helps.
Ansh
on 22 Jan 2016
Kirby Fears
on 22 Jan 2016
Edited: Kirby Fears
on 22 Jan 2016
Ansh,
The matrix you are describing:
Cs = 1 0.7
0.7 1
Is not a submatrix of:
C = 1 0.8 0.7
0.8 1 0.5
0.7 0.5 1
Can you state exactly what you are trying to extract from C? In the following example, what are all of the Ds results you want to extract from D?
D = 1 0.8 0.9 0.5
0.8 1 0.6 0.1
0.9 0.6 1 0.7
0.5 0.1 0.7 1
Kirby Fears
on 25 Jan 2016
You can get the set of indices i,j where C(i,j) > rho with one line:
[i,j] = find(C>rho);
You can re-use these indices in C to retrieve values like this:
C(i(1),j(1))
C(i(2),j(2))
... etc
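Because the matrix is symmetric with ones on the diagonal, a possible refinement is to search only the strict upper triangle, so that each pair is reported once and the diagonal is skipped. This is a sketch using the 3-by-3 example from the question; `pairs` and `vals` are illustrative names.

```matlab
C = [1 0.8 0.7; 0.8 1 0.5; 0.7 0.5 1];
rho = 0.6;

% Strict upper triangle only: no diagonal, no mirrored duplicates
[r, c] = find(triu(C, 1) > rho);

% One row per index pair (a, b) with C(a,b) > rho
pairs = [r, c];

% Retrieve all matching values at once via linear indexing
vals = C(sub2ind(size(C), r, c));
```

For this example `pairs` is `[1 2; 1 3]` and `vals` is `[0.8; 0.7]`.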
Stephen23
on 27 Jan 2016
... which is exactly what my Answer does.
Kirby Fears
on 27 Jan 2016
Hi Ansh,
You asked for submatrices, which I provided in one manner and Stephen in another since you did not clarify what you mean by "submatrix". Then you said no, you want the index of C elements greater than rho, which I provided in the comment above. Now you want submatrices again.
For these reasons, I'm not going to respond anymore since my time is better spent on other questions. Nonetheless, good luck with your code.
Assuming that the input matrix is always square and symmetric:
>> D = [1,0.8,0.9,0.5;0.8,1,0.6,0.1;0.9,0.6,1,0.7;0.5,0.1,0.7,1]
D =
1 0.8 0.9 0.5
0.8 1 0.6 0.1
0.9 0.6 1 0.7
0.5 0.1 0.7 1
>> rho = 0.6;
>> [R,C] = find(tril(D,-1)>rho);
>> out = arrayfun(@(r,c)D([r,c],[r,c]),R,C,'UniformOutput',false);
>> out{:}
ans =
1 0.8
0.8 1
ans =
1 0.9
0.9 1
ans =
1 0.7
0.7 1
I understood your question to mean that you want all 2x2 submatrices: "I want all square matrices C'(a,b) ... for all pairs of indices a and b". This is also what you stated in a comment to Kirby Fears' answer: "I want to extract the following clusters:"
[ 1 0.8 , [ 1 0.9 , [ 1 0.7
0.8 1 ] 0.9 1 ] 0.7 1 ]
I answered your question using your examples: I cannot read your mind and discover other secret examples that you have hiding there, I can only use the data that you have shown us, which specifically listed only 2x2 matrices as being the correct output.
Do you really want all such submatrices, regardless of size? Do the submatrices have to be contiguous?
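If the answer is "all sizes, not necessarily contiguous", one possible reading is: every index set whose rows and columns give a matrix with all entries above rho. The brute-force sketch below enumerates those sets for the 4-by-4 example. It is an illustration under that assumed reading, and the number of subsets grows exponentially, so it is only practical for small matrices. The `clusters` variable is an illustrative name.

```matlab
D = [1 0.8 0.9 0.5; 0.8 1 0.6 0.1; 0.9 0.6 1 0.7; 0.5 0.1 0.7 1];
rho = 0.6;
n = size(D, 1);
mask = D > rho;

clusters = {};                       % each cell holds one valid index set
for k = 2:n
    subsets = nchoosek(1:n, k);      % each row is one candidate index set
    for s = 1:size(subsets, 1)
        idx = subsets(s, :);
        blk = mask(idx, idx);
        if all(blk(:))
            clusters{end+1} = idx;   %#ok<SAGROW>
        end
    end
end
celldisp(clusters)
```

For this D the only valid sets are the pairs [1 2], [1 3], and [3 4], which matches the three 2x2 matrices listed above. No set of three or more indices qualifies.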
Stephen23
on 27 Jan 2016
This task might not be solvable using a standard PC: there are potentially a lot of such matrices: