Comparing each vector in a cell array to another vector
Hello. I am wondering if there is an easier way to check whether the vectors in my cell array are subsets of another given vector. I want to keep track of how many times the vectors in C are found in T, for many different T's. For example:
T = [1 2 3 4 5 6]
C = { [1 2], [3 4], [5,6], [7 8], [9, 10] }
% Loop through each candidate (here t is the current T and Ck is C)
for j = 1:numCands
    c = Ck{j};
    % Candidate Counts. Check if candidate is a subset of the
    % current transaction. Note: "if" on a logical vector is true
    % only when it is non-empty and every element is true.
    if ismember(c,t)
        candCounts(j) = candCounts(j) + 1;
    end
    % Candidate Tail Counts. Check if the tail of the candidate is
    % a subset of the current transaction
    if ismember(c(2:end),t)
        candTailCounts(j) = candTailCounts(j) + 1;
    end
end
What I have done is loop through each member of C and call ismember() on it. This becomes slow when C is very large, so I was wondering if there is a clever way to vectorize it for a faster runtime. I should add that the T vectors can vary in size; I am not sure whether that is relevant.
Thanks in advance!
Edit: For future reference, here is what I ended up doing, which is similar to what the answers suggested:
% Make all candidates into matrix for faster calculation
allCands = vertcat(Ck{:});
% Get number of columns to get where there are subsets.
[~,col] = size(allCands);
% Find locations where the candidates are subsets of the
% transactions for both the candidates and tail counts.
memb1 = ismember(allCands,t);
memb2 = ismember(allCands(:,2:end),t);
% Subset check
subCheck1 = find(sum(memb1,2) == col);
subCheck2 = find(sum(memb2,2) == col - 1);
% Increment counts
candCounts(subCheck1) = candCounts(subCheck1) + 1;
candTailCounts(subCheck2) = candTailCounts(subCheck2) + 1;
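For readers who want to try the vectorized version end to end, here is a minimal sketch that wraps it in a loop over several transactions. It assumes every candidate is a row vector of the same length (otherwise vertcat fails). The `transactions` cell array and its second entry are made up for illustration and are not from the original post. It also calls ismember() once per transaction and reuses the result for the tail check, which is a small variation on the code above.

```matlab
% Sketch: count subset hits for several transactions.
% Assumes all candidates are row vectors of equal length.
Ck = { [1 2], [3 4], [5 6], [7 8], [9 10] };       % candidates from the question
transactions = { [1 2 3 4 5 6], [2 3 4 9 10] };    % hypothetical transactions

allCands = vertcat(Ck{:});                 % one candidate per row
numCands = size(allCands, 1);
candCounts     = zeros(numCands, 1);
candTailCounts = zeros(numCands, 1);

for i = 1:numel(transactions)
    t = transactions{i};                   % transactions may differ in length
    memb = ismember(allCands, t);          % logical matrix, same size as allCands
    isSub     = all(memb, 2);              % every element of the candidate is in t
    isTailSub = all(memb(:, 2:end), 2);    % every element except the first is in t
    candCounts     = candCounts + isSub;
    candTailCounts = candTailCounts + isTailSub;
end
```

Using all(memb, 2) is equivalent to comparing sum(memb, 2) against the column count, and adding the logical column directly avoids the find() step.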
1 Comment
Walter Roberson
on 5 Feb 2017
Edited: Walter Roberson
on 5 Feb 2017
Does it happen to be the case that C and T only have integer elements in the range 0 to 65535?
Are all of the entries in C the same length? Are they always row vectors?
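The comment does not say why the range matters. One possibility (an assumption, not something stated in this thread) is that small non-negative integers allow membership to be tested with a logical lookup table instead of ismember(). A rough sketch under those assumptions, with all candidates being row vectors of equal length:

```matlab
% Sketch: membership via a logical lookup table.
% Assumes every value in t and Ck is an integer in 0..65535 and that
% all candidates are row vectors of the same length.
t  = [1 2 3 4 5 6];
Ck = { [1 2], [3 4], [5 6], [7 8], [9 10] };

allCands = vertcat(Ck{:});        % one candidate per row

present = false(1, 65536);        % present(v+1) is true when value v occurs in t
present(t + 1) = true;            % +1 because MATLAB indices start at 1

memb  = present(allCands + 1);    % result has the same shape as allCands
isSub = all(memb, 2);             % true for rows whose elements all occur in t
```

The table has to be rebuilt for each new T, so whether this beats ismember() depends on the sizes involved and would need to be timed.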
Accepted Answer
More Answers (1)
Walter Roberson
on 5 Feb 2017
1 vote
If all members of C are the same length, you can use buffer() with overlap to build an array containing every subsequence of T of that length. Transpose that array, convert C to a matrix with one candidate per row, and then call ismember() with the 'rows' option.
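A minimal sketch of that idea follows, using the example data from the question. It assumes buffer() from the Signal Processing Toolbox is available, that all candidates are row vectors of equal length, and that T has at least that many elements. The 'nodelay' option is my choice here so that the first window starts at T(1) rather than with leading zeros. Note that this matches candidates only where they occur as a contiguous, ordered run in T, which is stricter than the subset test in the question (for example, [1 3] would not match).

```matlab
% Sketch: sliding windows of T compared against candidates row by row.
% Requires buffer() (Signal Processing Toolbox).
T = [1 2 3 4 5 6];
C = { [1 2], [3 4], [5 6], [7 8], [9 10] };

n    = numel(C{1});                          % common candidate length
Cmat = vertcat(C{:});                        % one candidate per row

% Overlap of n-1 advances one sample per frame; transpose so that
% each row of windows is T(k:k+n-1).
windows = buffer(T, n, n-1, 'nodelay').';

found = ismember(Cmat, windows, 'rows');     % true where a candidate equals a window
```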
1 Comment
Christopher Tran
on 5 Feb 2017
Edited: Christopher Tran
on 5 Feb 2017