How to select multiple sets of data points from a scatter plot (like gating in flow cytometry/cell sorting)?

I have 2 column vectors that I plotted on a scatter plot. This is cell sorting data. I gated the data using reference lines (see png). I want to be able to export the (x,y) data points between specific lines. Is this possible? I'm using R2019b.

Accepted Answer

Adam Danz on 28 Sep 2021
Edited: Adam Danz on 30 Sep 2021
Using the slope and y-intercept of each reference line, find the points whose y-values lie below the upper line and above the lower line.
Create demo data. I assume you've computed the log() for each x and y since the axis scales are not log.
x = exp(linspace(2.2,5.8,500));
y = exp(log(x)+rand(size(x)).*linspace(2,.1,numel(x))-1);
% Define slope and y-intercepts for each line
slope = 0.74484;
yint = -1:.5:2;
% Convert x,y to log
xlog = log(x);
ylog = log(y);
Plot results
h = plot(xlog, ylog, 'o');
xlim([2,6])
ylim([.5,7.7])
Add reference lines. I assume you already have the slope and y-intercept info.
arrayfun(@(y)refline(slope,y), yint)
% Label lines
text(6*ones(size(yint)), slope*6+yint, compose('%d',1:numel(yint)))
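If the slope and y-intercept are not already known, one way to get them is from two points read off a reference line. This is only a sketch: the two points below are made-up values in the same log-transformed coordinates as the plot, and the names slopeEst and yintEst are hypothetical.

```matlab
% Two points read off one reference line (hypothetical values, log coordinates)
p1 = [2, 1.5];   % [x1, y1]
p2 = [6, 4.5];   % [x2, y2]

% Fit a first-degree polynomial through the two points: y = slopeEst*x + yintEst
coeffs   = polyfit([p1(1), p2(1)], [p1(2), p2(2)], 1);
slopeEst = coeffs(1);   % slope
yintEst  = coeffs(2);   % y-intercept

fprintf('slope = %.4f, y-intercept = %.4f\n', slopeEst, yintEst)
```

For parallel lines, the slope only needs to be estimated once; each remaining line then needs a single point to fix its intercept (y1 - slopeEst*x1).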
Isolate dots between two reference lines. For this demo, we're isolating dots between the 3rd and 5th lines.
% isolate dots between lines 3 and 5
isBetween = ylog > slope*xlog+yint(3) & ylog < slope*xlog+yint(5);
% yint(3) is the lower line and yint(5) is the upper line; change these indices to select other lines
Plot the isolated points and return their x,y values
% Label selected dots
hold on
xBetween = xlog(isBetween);
yBetween = ylog(isBetween);
h2 = plot(xBetween, yBetween, 'r.');
legend([h,h2], 'All data', 'Data between lines 3 and 5')
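To export the selected points, as the question asks, the same logical index can be applied to the original (untransformed) vectors and the result written to a file. A minimal sketch, assuming the slope and intercepts from the demo; the stand-in data and the file name gate_3_5.csv are hypothetical, and writematrix requires R2019a or later.

```matlab
% Small stand-in data so the snippet runs on its own (hypothetical values)
x = exp([2.5; 3.0; 4.0; 5.0]);
y = exp([2.0; 4.0; 3.5; 3.0]);
slope = 0.74484;
yint  = -1:.5:2;

% Same test as in the answer, applied to the log-transformed values
xlog = log(x);
ylog = log(y);
isBetween = ylog > slope*xlog+yint(3) & ylog < slope*xlog+yint(5);

% Collect the selected points on the original scale, one row per point.
% reshape(...,[],1) forces columns, so this works for row or column input.
selected = [reshape(x(isBetween),[],1), reshape(y(isBetween),[],1)];

% Write to a CSV file (location and file name are arbitrary choices)
outFile = fullfile(tempdir, 'gate_3_5.csv');
writematrix(selected, outFile)
```

Indexing x and y rather than xlog and ylog keeps the exported values in the original units; swap them if the log-transformed values are wanted instead.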
  2 Comments
Hannah Haller on 30 Sep 2021
Edited: Hannah Haller on 30 Sep 2021
Thank you for your answer! This is what I was looking for. Could you elaborate on how you generated the reference lines? Like how are you able to refer to them by number? Is it just like assigning them a variable?
Adam Danz on 30 Sep 2021
It looks like your reference lines are parallel and therefore share the same slope. If that's the case, they only vary by y-intercept. I've defined a single slope and 7 y-intercepts (see variables slope and yint). The y-intercepts are sorted in ascending order so the bottom line is line #1 and the top line is line #7.
In the section "Isolate dots between two reference lines", I've selected lines 3 and 5.
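Because the lines are parallel, the same numbering can be extended to label every point with the band it falls in. The sketch below is one possible approach, not part of the original answer: it computes the intercept of the parallel line through each point, ylog - slope*xlog, and bins it against yint with discretize. The stand-in data and the names pointInt and gate are assumptions.

```matlab
slope = 0.74484;
yint  = -1:.5:2;             % intercepts of lines 1..7, ascending

% Stand-in log-transformed data (hypothetical values)
xlog = [2.5; 3.0; 4.0; 5.0];
ylog = [2.0; 4.0; 3.5; 3.0];

% Intercept of the parallel line passing through each point
pointInt = ylog - slope*xlog;

% gate(k) = n means point k lies between line n and line n+1;
% NaN means the point is below line 1 or above line 7
gate = discretize(pointInt, yint);

% Points between lines 3 and 5 are those in band 3 or band 4
isBetween = gate == 3 | gate == 4;
```

Note that discretize includes the lower edge of each bin, so a point lying exactly on a line is assigned to the band above it, whereas the strict inequalities in the answer exclude such points.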
