Randsample with matrix: extract multiple values from every column of a matrix without loop!

I have a matrix of weights pV = rand(N,F),
e.g. pV= [0.5522 0.3922 0.0221 1 0;...
0.0947 0.4357 0.0000 0 0.0064;...
0.0214 0.0000 0.0062 0 0;...
0.3317 0.1720 0.9717 0 0.9936];
For each of the F columns, I want to draw z_k numbers from the vector 1:N, using the weights in that column of pV. The loop version of the code is:
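F = 5;
N = 4;
z_k = 2;
seller = zeros(F,z_k); % preallocate: one row per column of pV
for f=1:F
seller(f,:)=randsample(1:N,z_k,true,pV(:,f)); % logical true = sample with replacement
end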
I am looking for a solution without the loop to improve efficiency. I have found the following approach (pV is already normalised to sum to 1 in each column, i.e. sum(pV) = 1 1 1 1 1), but I do not know how to fill the final matrix "seller" for those columns where fewer than z_k numbers satisfying the condition below are found.
If not enough numbers are found for a column, I would like to fill it first with the numbers satisfying the condition and then with random numbers.
pV= [0.5522 0.3922 0.0221 1.0000 0;...
0.0947 0.4357 0.0000 0 0.0064;...
0.0214 0.0000 0.0062 0 0;...
0.3317 0.1720 0.9717 0 0.9936];
[rand_pV_val,rand_pV_rank_idx] = sort(pV,1);
pV_cdf = cumsum(rand_pV_val,1);
%rand_pV = rand(1,size(pV,2)); Use this for reproducibility:
rand_pV = [0.1119 0.2180 0.0649 0.4878 0.7268];
seller_full = repmat(rand_pV,size(pV,1),1)<pV_cdf;
seller_zk = cumsum(seller_full,1)==1|cumsum(seller_full,1)==2 & seller_full; %alternative ways?
enough_sellers = any(cumsum(seller_zk,1)>=z_k);
seller = zeros(z_k,F);
seller(:,enough_sellers) = reshape(rand_pV_rank_idx(seller_zk(:,enough_sellers)),z_k,[]);
seller =
2 1 0 0 0
4 2 0 0 0
% How can I fill the matrix seller where I didn't find enough sellers, i.e. seller(:,~enough_sellers)?
My desired results:
seller =
2 1 4 1 4
4 2 2 3 3
where 4, 1, 4 are the rand_pV_rank_idx in the position of the ones of the seller_zk matrix; the 2, 3, 3 are just random numbers different from 4, 1, 4. It can also happen that some columns of seller_zk have only zeros.
How can I do this, filling the matrix seller(:,~ enough_sellers) as I wrote above?
Also, I'm looking for alternative ways to find the matrix seller_zk as I would like it to be flexible to different values of z_k.
Thanks

Accepted Answer

the cyclist on 2 Feb 2023
Edited: the cyclist on 2 Feb 2023
Here is a pretty obfuscated one-liner, but I think it does what you want, and should be fast:
% Your data
z_k = 2;
pV= [0.5522 0.3922 0.0221 1.0000 0;...
0.0947 0.4357 0.0000 0 0.0064;...
0.0214 0.0000 0.0062 0 0;...
0.3317 0.1720 0.9717 0 0.9936];
F = width(pV);
N = height(pV);
% The algorithm
seller = (squeeze(sum(rand(1,F,z_k) >= cumsum(pV))) + 1)'
seller = 2×5
1 2 4 1 4
1 1 4 1 4
There are two potentially non-intuitive elements to this:
  • Generating z_k*F draws from a uniform distribution, but lining those up in the 3rd dimension. This is going to take advantage of the fact that MATLAB will implicitly expand those vectors into an array for the comparison I describe next.
  • Compare those random draws to the cumulative sum of pV. This comparison is checking when the number is smaller than the cumulative distribution function (CDF) of your weights. This is a well known (but perhaps not widely known) method of doing the weighting you want.
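To make the CDF comparison concrete, here is a minimal single-column sketch. It uses the first column of pV and three hand-picked values in place of random draws, so the result is deterministic. The names w, cdf, u and idx are just illustrative, and the implicit expansion it relies on needs R2016b or later.

```matlab
% Weighted index selection for ONE column, with fixed "draws" instead of rand.
w   = [0.5522; 0.0947; 0.0214; 0.3317];  % first column of pV (weights sum to 1)
cdf = cumsum(w);                          % approx. [0.5522; 0.6469; 0.6683; 1.0000]
u   = [0.1119 0.66 0.70];                 % stand-ins for uniform draws in (0,1)

% u is 1-by-3 and cdf is 4-by-1, so the comparison expands to 4-by-3.
% Counting how many CDF steps each draw has already passed, plus one,
% gives the selected index.
idx = sum(u >= cdf, 1) + 1
% idx = 1 3 4
```

The draw 0.66 lands between 0.6469 and 0.6683, which is the narrow interval of width 0.0214 that belongs to index 3, so index 3 is picked with exactly its weight.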
I suggest you unpeel the algorithm from the "inside out", to understand what it is doing. I did a little bit of testing to make sure the results are sensible and accurate, but not a ton.
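For anyone who wants to do that unpeeling, here is one way the one-liner might be split into named steps, followed by a rough frequency check. This is a sketch, not a replacement for the answer: the intermediate names (cdf, u, passed, big, freq, maxErr) are illustrative, and two things differ from the original on purpose. The line that rescales the CDF is an added guard, because with the rounded weights shown here column 2 sums to 0.9999, so a draw above that would otherwise give index N+1. And reshape is used instead of squeeze so that the output stays z_k-by-F even when z_k is 1.

```matlab
% Step-by-step version of the one-liner, with array sizes noted.
z_k = 2;
pV = [0.5522 0.3922 0.0221 1.0000 0;...
      0.0947 0.4357 0.0000 0      0.0064;...
      0.0214 0.0000 0.0062 0      0;...
      0.3317 0.1720 0.9717 0      0.9936];
[N, F] = size(pV);

cdf = cumsum(pV, 1);          % N-by-F: running sum of the weights down each column
cdf = cdf ./ cdf(end, :);     % added guard (assumption): force the last row to exactly 1

u      = rand(1, F, z_k);     % 1-by-F-by-z_k: one uniform draw per column per sample
passed = sum(u >= cdf, 1);    % 1-by-F-by-z_k: CDF steps each draw has passed
seller = reshape(passed, F, z_k).' + 1   % z_k-by-F matrix of sampled row indices

% Rough check: draw many samples and compare the observed frequencies
% with the (normalised) weights.
n    = 1e5;
big  = reshape(sum(rand(1, F, n) >= cdf, 1), F, n).' + 1;   % n-by-F
cols = repelem((1:F)', n);                                  % column label for each entry of big(:)
freq = accumarray([big(:), cols], 1, [N F]) / n;            % N-by-F observed frequencies
maxErr = max(abs(freq - pV ./ sum(pV, 1)), [], 'all')
```

With 1e5 draws per column, maxErr should come out at a few thousandths at most. Column 4 should always give 1, and column 5 should only ever give 2 or 4, since rows 1 and 3 have zero weight there.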
  3 Comments
the cyclist on 2 Feb 2023
I am confused by your comments here (specifically about not wanting repeated elements), and by the first code you posted.
My first point of confusion is that the 4th column of pV is [1; 0; 0; 0]. This means that a weighted draw from that column will always select the value 1. So, it is not consistent to say you want no repeats, but also that you want that weighting.
Second, in the call to randsample in your original code, you set the 'replacement' parameter to 'true', meaning that you are explicitly saying that repeated elements should be allowed.
Maybe I did not read your question carefully enough the first time, but when I look at it now, the second half of your question seems to ask for something quite different from the first half of the question. I think perhaps you edited it, after you first posted it?
I have to admit, I am now pretty lost in trying to fully understand what you want as output.
esperanta on 3 Feb 2023
Sorry, my mistake. I was actually looking for randsample with replacement, so your answer is correct. Thanks!

Release

R2021a
