How to find intersection between two outputs?

Dear Coder,
I have to output generated from KSDENSITY, namely f_smoke and f_Nonsmoke. The idea was to find intersection between the f_smoke and f_Nonsmoke. To achieve this, I used the function INTERSECT. However, MATLAB return me 1×0 empty double row vector instead.
C = intersect(f_smoke,f_Nonsmoke);.
I expect, the intersection should occur as depicted in the figure
Any idea what I have been missing?. The full code is as below. The patients.mat is an in-build matlab mat file (Version 2017).
load patients
[tf,Ioc_Alert_LessThree] = find (ismember(Smoker, 1));
for i = 1: length (tf)
M (i,1) = Height (tf(i));
end
[tgf,Ioc_Alert_two] = find (ismember(Smoker, 0));
for i = 1: length (Ioc_Alert_two)
M (i,2) = Height (tgf(i));
end
M(M == 0) = NaN;
[f_smoke,xi_smoke] Th= ksdensity(M (1:end,2));
[f_Nonsmoke,xi_Nonsmoke] = ksdensity(M (1:34,1));
plot (xi_smoke, f_smoke);
hold on
plot (xi_Nonsmoke, f_Nonsmoke);
C = intersect(f_smoke,f_Nonsmoke);
Thanks for the time and idea!

 Accepted Answer

What you're missing is that intersect is a comparison of the elements in the two arrays for exact equality of the members thereof, not a solution to the geometric solution of the intersection of two functions. Even if there had been a point or two that were absolutely identical (to the last bit, remember?) all you would have gotten would have been that one point and it could have been anywhere along the path that just happened to return the same pdf estimate from the two distributions.
What you need to do is to solve for the intersection by writing the difference and find the zero --
On the way, let's make use of some of the more user-friendly Matlab syntax features, ok?
s=load('patients'); % load the data into structure in preparation for...
pat=struct2table(s); % creating a table structure from it...
pat.Gender=categorical(pat.Gender); % just a sample cleanup to turn from cellstr to categorical
[fSmk,xSmk]=ksdensity(pat.Height(pat.Smoker==true)); % fit the empircal density to smokers
[fNSmk,xNSmk]=ksdensity(pat.Height(pat.Smoker==false)); % and non in turn...
Now comes "the trick"...
Build an anonymous function that computes the output difference between the two epdf's as function of the independent variable...use interp1 because the form is numerical, not functional so must interpolate to find points other than those given explicitly.
fnX=@(x) interp1(xSmk,fSmk,x)-interp1(xNSmk,fNSmk,x)
fnX =
@(x)interp1(xSmk,fSmk,x)-interp1(xNSmk,fNSmk,x)
>> fnX(65) % just a sample use of the function to check it works...skip this, just demo
ans =
-0.0346
Now use the function we just defined in fzero to find the intersection point--
>> X0 =fzero(fnX,65) % solve for the intersection; use 65 as initial guess
X0 =
67.5852
>>

10 Comments

Dear DPB, I really amazed with your approach. Basically, you solve two of my question in one post. I had run the proposed code, and it work perfectly.
Thank you.
How to accept you answer? I only can vote it!
You have to unaccept Geoff's answer first, THEN you can accept dpb's answer.
Don't forget to at least take a peak at my answer!
No problem, such things are what makes Matlab such a kick to mess with...you can do so much w/ so little code owing to all the supplied functionality. Of course, there's a fair amount of "time in grade" to become familiar with even a significant fraction of what's there and then thinking how to use that for a given problem is also at least partially an experience factor of what has/hasn't worked in the past.
I will note that the above solution works as is for the given epdfs--it wouldn't take much of a change in the underlying data that the two curves could also have one or more other intersections. If that were the case it could require making sure the initial value was the closest one to the desired intersection; fzero will find the nearest one and only one if there are multiple solutions.
IA, while there's a peak in each distribution, he probably wants to peek at your technique... GD&R :) (Sorry, 'de debbil made me do it!)
"have to unaccept Geoff's answer first,..."
And, not to plug for points specifically, :) Geoff won't lose any reputation points for the selection shift -- TMW recognized order may play a role in finding alternate solutions so doesn't penalize first responders if a subsequent comes along.
Hi DPB Thanks for explanation. Yup, the proposal by IA should solve multiple intersection/ solutions
Out of curiosity, is DPB = de debbil ? Or it is diff person?.
dpb
dpb on 24 Jul 2017
Edited: dpb on 24 Jul 2017
dpb --> my actual initials hopefully not related to the above in actuality! :)
As you likely are far too young, if you're curious google "Flip Wilson"
I'd still use fsolve meself, just be aware of the possibility of there being multiple intersections.
Hi DPB, You really contribute a lot in this community, my honor to know you.
dpb
dpb on 26 Jul 2017
Edited: dpb on 26 Jul 2017

Sign in to comment.

More Answers (2)

balandong - remember, the data that you are plotting are doubles and you are drawing a curve where every two consecutive points is "connected" to each other. So when the two curves are drawn, their intersection (where the two curves cross) may not be represented by data from your sets...and so intersect is not guaranteed to be non-empty. Consider the following
plot (xi_smoke, f_smoke, 'r*');
hold on
plot (xi_Nonsmoke, f_Nonsmoke,'g*');
Each point from the first set is represented by a red star, and each from the second set by a green star. Now when you run the code, you can see while there are points close to one another, there doesn't appear to be any that actually intersect.
I suspect that you will need to loop through your data and try to find the two (or more points) that are "close" to one another using a Euclidean or some other distance metric. You may be able to discover the "intersection" from those points.

1 Comment

Thanks Geoff, I thought INTERSECT will take care both the intersection line and overlap point. I will find another alternative based on your recommendation.
Thanks

Sign in to comment.

You can do it approximately and numerically by using comparison to find the index where the one curve goes above or below the other curve. Let's say non-smokers are red (to the right because they live longer) and smokers are above and to the left in blue. Find the index where the blue dips below the red:
index = find(f_smoke < f_Nonsmoke, 1, 'first');
% Get values of the two curves at that index.
smokeCrossoverValue = f_smoke(index);
nonsmokeCrossoverValue = f_Nonsmoke(index);
If the x value is not the index, then you can get the x value (Age, I presume) by indexing it:
ageAtCrossover = x(index);
If you want you could home in on it a little more by doing bilinear interpolation between index and index-1.

4 Comments

Hi IA, Thanks for the reply. I try to understand your code with great interest. However, I am having difficulties in digesting it.
index = find(f_smoke < f_Nonsmoke, 1, 'first');
The line above will give index equal
  1. Case A:Equal to 1 if, = f_smoke < f_Nonsmoke, 1, 'first'.
  2. Case B: Equal to [1 2]if, = f_smoke < f_Nonsmoke, 2, 'first'.
Then
ageAtCrossover = Height(index);
I thought the line will give me the intersection. However, it just access the first element of Height (for case A) and two first element of Height ( for case B). Do i understand it wrong? because it seems the lines does not give the intersection point
dpb
dpb on 24 Jul 2017
Edited: dpb on 24 Jul 2017
The problem is again that the logical expression f_smoke<f_Nonsmoke is a 1:1 comparison of the elements in the arrays which are not at all at the same locations of x since the two are computed independently. To do this you'd have either
  1. Align the two distributions at comparable x values before the comparison, or
  2. Compute the two density functional values at a set of matching x values before the comparison.
Having plotted the two on the same axis is misleading in that one can easily assume the x- values are the same but they have no relationship to each other the way ksdensity works with default outputs.
Right. I agree it turned out to be trickier than I thought because they don't have the same x-values.
Thanks for the input both of you. I really appreciate it

Sign in to comment.

Categories

Tags

Asked:

on 23 Jul 2017

Edited:

dpb
on 26 Jul 2017

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!