Finding and counting numbers from one matrix in another

Hello all
I have a question, which seems complex to me but probably isn't.
I have two matrices and I am trying to find and count each number from one matrix in another.
For example: Let's say I have two matrices A and B. A is a list of nodes (mx1) and B is a list of triangles that contain those nodes (nx3).
I need to find all the nodes from A that are in B, count how many times they appear and in which triangles they appear.
So in the end I will get two new matrices, C=[node ID, No. of times it appears in all columns in B] and D=[node ID, list of triangles (1:9) that contain that node (padded with 0s for nodes that have fewer elements attached)].
I have tried to use find and ismember but they don't seem to do exactly what I want.
I hope I've explained that well enough. Help is very much appreciated.
Meghan

Accepted Answer

dpb on 15 Mar 2017
Is fairly simple, yes, but have to think about it a while to find "the Matlab way"... :)
There's always another way, but what came to me first...
>> A=[1:13].'; % Assume total of 13 nodes
>> B=randi(20,20,3) % Random set of triangles that include more nodes than in the list
B =
2 2 19
11 5 4
16 19 6
19 4 3
3 17 3
12 11 18
10 20 12
1 2 11
7 9 3
4 3 18
16 20 13
7 1 8
11 16 11
4 17 9
13 18 2
6 2 5
14 8 3
14 6 4
15 17 5
10 9 9
>> [n,bin]=histc(B(:),A); % count how many of each and locate where they are
>> C=[A n] % 'C' is now easy-peasy...
C =
1 2
2 5
3 6
4 5
5 3
6 3
7 2
8 2
9 4
10 2
11 5
12 2
13 2
>>
D takes a little thinking...first reshape bin to match the shape of B so can get the row corresponding to position...
>> bin=reshape(bin,[],3);
>> t=zeros(length(A),size(B,1)); % preallocate for all possible locations (one column per triangle)
>> for i=1:length(A) % for each node in A
[r,c]=find(bin==i); % get rows where located; bin holds positions in A (same as the node ID here)
r=unique(r); % save only the unique rows for repeated; probably not needed
t(i,r)=r; % and populate array at those locations with the value
end
>> D=[n t]
D =
2 0 0 0 0 0 0 0 8 0 0 0 12 0 0 0 0 0 0 0 0
5 1 0 0 0 0 0 0 8 0 0 0 0 0 0 15 16 0 0 0 0
6 0 0 0 4 5 0 0 0 9 10 0 0 0 0 0 0 17 0 0 0
5 0 2 0 4 0 0 0 0 0 10 0 0 0 14 0 0 0 18 0 0
3 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 16 0 0 19 0
3 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 16 0 18 0 0
2 0 0 0 0 0 0 0 0 9 0 0 12 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 17 0 0 0
4 0 0 0 0 0 0 0 0 9 0 0 0 0 14 0 0 0 0 0 20
2 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 20
5 0 2 0 0 0 6 0 8 0 0 0 0 13 0 0 0 0 0 0 0
2 0 0 0 0 0 6 7 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 11 0 0 0 15 0 0 0 0 0
>>
The duplicated nodes within a triangle aren't possible in a real dataset, I'd presume; I just left them in so I could use the randomized array w/o having to clean it up here.
Oh, the above assumes the node IDs are 1:N, so that a node's value is the same as its row index in A; if the array of nodes isn't 1:N, the indexing needs to use the position of each node in A instead.
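For node lists that aren't 1:N, the counting step can be sketched with ismember and accumarray instead of histc. histc treats A as bin edges, so an ID in B that falls between two entries of A would be counted in the lower bin. The values below are made up for illustration and are not from the thread; this is a minimal sketch, not a tested drop-in replacement for the code above.

```matlab
% Hypothetical data: node IDs that are not 1:N
A = [10; 20; 30];            % listed node IDs (column vector)
B = [10 20 15;               % 15 is a node that is NOT listed in A
     20 30 10];

% loc(k) is the row of A that matches B(k), or 0 when B(k) is not in A
[tf, loc] = ismember(B(:), A);

% Count occurrences per listed node; the size argument keeps one row
% per node even when a node never appears in B
n = accumarray(loc(tf), 1, [numel(A), 1]);

C = [A, n];                  % [node ID, number of occurrences]
```

Here the unlisted node 15 is simply ignored rather than being added to the count for node 10.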
  2 Comments
Meghan Rochford on 16 Mar 2017
Edited: Meghan Rochford on 16 Mar 2017
Thank you for that. It has been really helpful, however I'm getting an error when trying to get D. D is being calculated as having 649x1 values, while n only has 367x1 values so they cannot be concatenated. Do you have any idea why that would happen?
Also D should be a 367x10 array, with the first column containing the node ID (from A), followed by the list of triangles associated with that node. The maximum number of elements that can be related to a node is 9. Right now the way you have it makes it a length(A) x length(B) sized matrix.
I really appreciate the help, I've been trying to work this out for a while :)
EDIT: I think I've figured out why it's not working properly. My list of nodes isn't 1:N. They are a list of nodes taken from a bigger matrix so the node IDs vary all the way up to like 500000 ish. I guess what I need then is to change r to index from the actual node ID, rather than 1:367. I thought it would be easy to do but I keep getting errors. Any ideas?
EDIT 2: I've just realised that I think you misunderstood me. In matrix D I need the list of nodes in the first column, followed by the triangle ID associated with that node. I have been trying to think of a way to do it and the only thing I can come up with is taking the nodes line by line and running through the triangle matrix(B) to find each and every location of that node, and putting that location in a new matrix that will be 367x10 (maximum number of locations of nodes in the triangle matrix will be 9). Does that make sense? For example, D should look something like (with the first column containing node ID):
20 38 39 0 0 0 0 0 0 0
21 39 40 41 0 0 0 0 0 0
22 1 42 43 44 45 3 2 1 0
23 42 1 0 0 0 0 0 0 0
24 3 45 46 5 4 3 0 0 0
25 5 46 47 48 49 7 6 5 0
dpb on 16 Mar 2017
Edited: dpb on 16 Mar 2017
Yes, I mentioned above that if A has missing values there's an issue. I've a conflicting engagement here in just a few minutes but the size length(A) is still correct for row dimension; the dimension I used was maximum possible, for usage not accounting for some external constraints (like, what happens if somebody goofs and there are more than 9 connections? was what I figured the code was for, perhaps).
Use
t(A(i),1:length(r))=r; % (*)
should, I think, give you the result you're looking for...gotta' run, sorry.
(*) Which is Guillaume's loop below, with one correction: in my rush to make the appointment I forgot the length() expression for the column position. And, of course, I appended t after the first (count) column, whereas he stores the list beginning with column 2.


More Answers (1)

Guillaume on 16 Mar 2017
Edited: Guillaume on 16 Mar 2017
Another method of obtaining the result, which works regardless of the values of the node ids, and whether or not a node is present in any triangle
%A: column vector of ids
[~, id] = ismember(B(:), A);
C = [A, accumarray(nonzeros(id), 1, [numel(A), 1])]; %nonzeros not required if all nodes are sure to be found in triangles; the size argument keeps one row per node
trigids = repmat((1:size(B, 1))', 1, size(B, 2));
triglist = accumarray(nonzeros(id), trigids(find(id)), [numel(A), 1], @(list) {[list.', nan(1, 9-numel(list))]}); %again nonzeros and find(id) not needed if all nodes are sure to be found
triglist(cellfun(@isempty, triglist)) = {nan(1, 9)};
D = [A, cell2mat(triglist)]
Alternatively, D could be generated with a loop:
D = [A, nan(numel(A), 9)];
for rowid = 1:numel(A)
[trigid, ~] = find(B == A(rowid));
D(rowid, 2:numel(trigid)+1) = trigid;
end
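If it isn't certain that no node touches more than 9 triangles, a variant of the loop can collect the lists first and size D from the longest one. The sketch below uses a small made-up mesh (not data from the thread) and pads with zeros, as in the question, rather than NaN; treat it as one possible arrangement, not the method used above.

```matlab
% Hypothetical small mesh: 4 nodes, 3 triangles
A = [1; 2; 3; 4];
B = [1 2 3;
     2 3 4;
     1 3 4];

% First pass: collect the triangle rows for every node
lists  = cell(numel(A), 1);
maxTri = 0;
for rowid = 1:numel(A)
    [trigid, ~]  = find(B == A(rowid));   % rows of B that contain this node
    lists{rowid} = unique(trigid).';      % sorted row vector, duplicates removed
    maxTri       = max(maxTri, numel(lists{rowid}));
end

% Second pass: D is as wide as the longest list requires, zero padded
D = [A, zeros(numel(A), maxTri)];
for rowid = 1:numel(A)
    D(rowid, 2:numel(lists{rowid})+1) = lists{rowid};
end
```

With this data, node 3 belongs to all three triangles, so D ends up with 1 + 3 columns; a check such as maxTri > 9 could be added after the first pass to flag unexpected connectivity.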
  4 Comments
Meghan Rochford on 16 Mar 2017
Edited: Meghan Rochford on 16 Mar 2017
Thank you both so much. Unfortunately, dpb, I couldn't get your loop to work so I used the one Guillaume provided.
I have a question about that loop though. The triangle matrix also contains out-of-sequence values, ranging up to the 500000 mark, because they were taken from a larger matrix as well. When I use that loop it does exactly what I want, except that it puts the triangle's row position in the final D matrix rather than the actual triangle ID. I know it's probably a simple fix but I can't figure out where in that loop to actually put the indexing?
EDIT: never mind, I got it. Thanks a million :)
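The fix that was found isn't shown in the thread. One way it could look, assuming the actual triangle IDs are held in a separate column vector with one entry per row of B (called triID here, a hypothetical name), is to index that vector with the row positions returned by find. All values below are made up for illustration.

```matlab
% Hypothetical data: out-of-sequence node IDs and triangle IDs
A     = [101; 205; 330];        % node IDs of interest
B     = [101 205 330;           % node IDs of each triangle, one row per triangle
         205 330 999];
triID = [5001; 7002];           % assumed: actual ID of the triangle in each row of B

D = [A, zeros(numel(A), 9)];    % node ID plus up to 9 triangle IDs, zero padded
for rowid = 1:numel(A)
    [trigrow, ~] = find(B == A(rowid));      % row positions in B
    ids = triID(sort(trigrow)).';            % translate positions to actual IDs
    D(rowid, 2:numel(ids)+1) = ids;
end
```

The only change from the loop above is the lookup triID(...), which replaces the row position with the ID stored for that row.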
dpb on 16 Mar 2017
"...why I then offered the loop option which may actually be faster, and certainly easier to understand."
Indeed, it dawned on me that the loop over the bins was equivalent to a direct loop over the array while writing the posted answer but had it tested and with the time constraint of an appointment in town decided better just leave good-enough alone.
