How to I find X unique combinations (order doesn't matter) of X numbers within an array of X numbers.
25 views (last 30 days)
Show older comments
Accepted Answer
Bruno Luong
on 29 Aug 2020
You might just miss the feature of NCHOOSEK that can accept the vector array as input
>> X=rand(1,5)
X =
0.3566 0.3205 0.8137 0.1779 0.9431
>> nchoosek(X,3)
ans =
0.3566 0.3205 0.8137
0.3566 0.3205 0.1779
0.3566 0.3205 0.9431
0.3566 0.8137 0.1779
0.3566 0.8137 0.9431
0.3566 0.1779 0.9431
0.3205 0.8137 0.1779
0.3205 0.8137 0.9431
0.3205 0.1779 0.9431
0.8137 0.1779 0.9431
2 Comments
the cyclist
on 29 Aug 2020
I guess I missed that, too -- even though I did take a quick peek at the documentation to refresh my memory.
sigh -- must be getting old.
More Answers (2)
the cyclist
on 29 Aug 2020
Edited: the cyclist
on 29 Aug 2020
There's probably a more efficient way, since I effectively first find all combinations, but I think this does what you want, in an admittedly obfuscated method.
M = 10;
N = 5;
% Get *all* combinations of whether each element should be selected,
% where 1 means yes, and 0 means no.
allComboIndex = dec2bin(0:2^M-1)' - '0';
% Reduce the above set to only columns with N elements selected
fiveComboIndex = allComboIndex(:,sum(allComboIndex)==N);
% Find where the 1's are located.
[r,~] = find(fiveComboIndex);
% Reshape to get the N values of each column
fiveCombos = reshape(r,N,[]);
1 Comment
the cyclist
on 29 Aug 2020
On my machine, this worked for M up to 24, taking a few seconds. For M larger than that, the intermediate array is too big, and MATLAB bogs down.
John D'Errico
on 29 Aug 2020
Youtr title is still far too vague to provide an answer, but the comment you made may be sufficient.
You have lets say 10 numbers, 1:10. Apparently your real problem may be much larger. (EXPLAIN THESE THINGS! Don't feel you need to put it all in the title, and be as short and sweet as you can. The question can be quite long, and that is why you have more than jjust a title given to you to use.)
Now you wish to select some subset of those 10 numbers, here in blocks of 5. The numbers themselves are references to another array, and will be used to find the average of the corresponding elements. So a simple increasing order of each subset is sufficient. Your original question was to find 1000 such subsets.
>> nchoosek(10,5)
ans =
252
And of course, nchoosek tells us this is impossible. THere are exactly 252 subsets. In fact, nchoosek gives us the complete set.
>> nchoosek(1:10,5)
ans =
1 2 3 4 5
1 2 3 4 6
1 2 3 4 7
1 2 3 4 8
1 2 3 4 9
1 2 3 4 10
1 2 3 5 6
1 2 3 5 7
1 2 3 5 8
1 2 3 5 9
1 2 3 5 10
1 2 3 6 7
...
I've just pasted in the first 11 such subsets, but there will be 252 of them as generated.
Now, suppose I wanted to use these to find averages of some arrays of numbers? First, get used to storing your array in multi-dimensional arrays.
Data = rand(2,3,10);
So here, I might have 10 arrays, each of size 2x3. Now I will decide to select all 252 possible subsets of tose arrays, where the subsets are of size 5, and form the average of those arrays.
subsets = nchoosek(1:10,5);
Datasubs = reshape(Data(:,:,subsets),[2,3,nchoosek(10,5),5]);
size(Datasubs)
ans =
2 3 252 5
Dataave = mean(Datasubs,4);
size(Dataave)
ans =
2 3 252
So I took all 252 of the possible subsets. I created the set of arrays you wanted. And then I formed the averages. The result is now 252 arrays. Here is one of them:
Dataave(:,:,1)
ans =
0.606133417842019 0.633298374281431 0.348760616827367
0.605002524439109 0.570316578527122 0.392097473426195
So you can easily do what you wanted, within limits. But we still don't know the real problem. Because apparently you are trying to do this with much larger sets of numbers. And for that, of course things will fail if you go too large.
For example, suppose you ask to find all unique sets of numbers of size 50 in subsets of size 25, of course things will fail miserably!
nchoosek(50,25)
ans =
126410606437752
You need to understand the limits of your computer and the memory you have, and the time you are willing to invest in allowing your computer to perform some huge thing. Even a smaller problem will become pretty large, pretty fast.
nchoosek(100,5)
ans =
75287520
Suppose for example, I had 20 such arrays to average, and I wish to find the average of blocks of 5 elements. Again, your question is far ttoo vague. But now your goal will be to find 1000 such sets, chosen randomly. But you wish to avoid replication.
subs = randi(20,[2000 5]);
subs = unique(sort(subs,2),'rows');
size(subs)
ans =
1950 5
subs = subs(randperm(size(subs,1),1000),:);
So I selected integers from 1:20. 2000 sets of 5 of them. Then I sorted them, and used unique to toss the replicates.
Finally, I selected 1000 of those sets randomly.
Essentially, you just need to learn to use MATLAB. At the same time, you need to understand the limits of your computer.
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!