# Passing a line through 2 determined points of a dataset automatically

Alfonso on 30 May 2018
Closed: Alfonso on 13 Jun 2018
I am trying to draw a line that cuts approximately through the middle of a cloud of points. The data sets have similar shapes with some variation, so for each cloud of points I am trying to automatically find (using a loop, for example) the 2 points that define a line cutting through it as shown in the images (I used imline to create the desired line just as an example to show what I mean):
If you look carefully you will see that the points that define the line are in approximately the same zone (judging from the plots) for the 2 datasets. The idea is that the line should cross through these characteristic zones, which are present in all the datasets.
2 examples: [images]
The problem is that I am having difficulty finding an effective way of obtaining this. I thought of maybe using the centroid and calculating Euclidean distances from the centroid to the points, etc. But it is very unlikely that the centroid will lie on the desired line.
I have uploaded the 2 datasets as data1.mat and data2.mat.
Thank you for any help in advance.

John D'Errico on 30 May 2018
Edited: John D'Errico on 30 May 2018
Your problem is one of definition. Until you do that, i.e., define what you mean by "middle", you can do nothing.
In fact, there are infinitely many lines that will pass through the mean of your data. Surely that is a good definition of "middle". We could equally easily define the "middle" as a line that has 50% of the points on either side of the line. Whoops, again, infinitely many such potential lines.
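To make the ambiguity concrete, here is a minimal sketch on a synthetic cloud (random points, an assumption, not the uploaded data). It takes several lines through the mean at different angles and counts the points on each side. The names `pts`, `c`, `angles`, `n` and `s` are made up for illustration, and the subtraction `pts - c` relies on implicit expansion (R2016b or later).

```matlab
% Synthetic scattered cloud (an assumption; not the poster's data).
rng(0);
pts = randn(200, 2);
c = mean(pts, 1);                      % the mean of the cloud

% Every direction through the mean gives another candidate "middle" line.
angles = linspace(0, pi, 7);
angles(end) = [];                      % angle pi is the same line as angle 0
for a = angles
    n = [-sin(a), cos(a)];             % unit normal of the line at angle a
    s = (pts - c) * n.';               % signed distance of each point to the line
    fprintf('angle %5.1f deg: %3d points on one side, %3d on the other\n', ...
        a*180/pi, sum(s > 0), sum(s < 0));
end
```

Each of these lines passes through the mean, and the signed distances to any of them sum to zero (up to rounding), so "passes through the mean" alone cannot single out one line.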
You seem to be drawing a line that is essentially a line of symmetry. But you never said anything at all about that. Sadly, that is only a line of approximate symmetry in both cases. The problem is the eye is good at seeing patterns like that. Computers? Not really. You need to tell them carefully and accurately what you mean, which essentially involves code.
So, are you looking for a line of symmetry? Must the line pass through two of the points? (It does not appear you have drawn them like that.) So, now you have an extra problem, in that you also need to learn to interpolate such a curve.
Next, is this set of points a completely scattered one, in that they are provided in random order? Or do you have them in sequence around the circumference of this cloud, essentially as a polygon? I loaded one of those files. They are scattered. So worse yet, while your eye easily sees this as a heart shaped curve, the computer just sees a cloud. Again, that makes it a more difficult problem. Is it a solvable problem? Probably.
Finally, are these point clouds as you have shown them similar to the REAL problem you are facing, or have you just drawn a couple of nice examples? Is 200 points or so a reasonable estimate of the real problems you will face? Or may you have 200000 points sometimes?
How might I solve the problem of finding a line of approximate symmetry?
I'd start by sorting the points in order, IF possible. Since your data seems to be simply deformable to a circle, polar coordinates is a good way to do that.
```matlab
% Assumes data1 is an N-by-2 array of [x y] coordinates already in the workspace.
x = data1(:,1);
y = data1(:,2);
plot(x,y,'-o')
% So random order initially.

% Sort the points by polar angle about the mean.
[theta,r] = cart2pol(x-mean(x),y-mean(y));
[theta,thetatags] = sort(theta);
x = x(thetatags);
y = y(thetatags);
plot(x,y,'-o')
grid on
axis equal
```

A traveling salesman solver would help in worse cases, finding the shortest path to traverse between the list of points. But we don't need anything like that for these simple cases.
How might I find a line of symmetry?
The simplest approach seems to be to write a function that accepts any two points around the curve, which define a line. Then take all points on ONE side of the line and create a curve from them. Next, mirror all points on the other side of the line across the line. For each mirror image, find the closest point on the curve, and sum up all of those distances.
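A rough sketch of that mirrored-distance cost follows. It is an illustration under stated assumptions, not tested on the uploaded data: the cloud is a synthetic heart-shaped curve that is symmetric about the y-axis, and "closest point on the curve" is simplified to "closest sample point on the other side", with no interpolation. All variable names are made up for the example, and the array arithmetic relies on implicit expansion (R2016b or later).

```matlab
% Synthetic heart-shaped cloud, symmetric about the y-axis (an assumption;
% this is not the data from data1.mat or data2.mat).
t = linspace(0, 2*pi, 201).';
t(end) = [];                                   % 2*pi duplicates the point at 0
P = [16*sin(t).^3, 13*cos(t) - 5*cos(2*t) - 2*cos(3*t) - cos(4*t)];

% Two candidate lines, each given by the indices of two points of P.
% Points 1 and 101 lie on the y-axis; points 1 and 51 define a skewed line.
pairs = [1 101; 1 51];
cost = zeros(size(pairs, 1), 1);
for k = 1:size(pairs, 1)
    p1 = P(pairs(k,1), :);
    p2 = P(pairs(k,2), :);
    d = (p2 - p1) / norm(p2 - p1);             % unit direction of the line
    n = [-d(2), d(1)];                         % unit normal of the line
    s = (P - p1) * n.';                        % signed distance to the line
    A = P(s >= 0, :);                          % points kept as the "curve"
    B = P(s < 0, :);                           % points to be mirrored
    Bm = p1 + 2*((B - p1)*d.')*d - (B - p1);   % B reflected across the line
    % Distance from every mirrored point to every point of A.
    D = sqrt((Bm(:,1) - A(:,1).').^2 + (Bm(:,2) - A(:,2).').^2);
    cost(k) = sum(min(D, [], 2));              % sum of nearest distances
end
costGood = cost(1);
costBad = cost(2);
fprintf('line along the y-axis: %.3g, skewed line: %.3g\n', costGood, costBad);
```

For this synthetic curve the cost along the true symmetry axis is essentially zero, while the skewed line gives a much larger value. Note that this one-sided cost can also become small when almost all points fall on one side of the line, so a real implementation would need to guard against that case.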
Finally, I would use an optimizer to find the BEST two such points, such that the total mirrored sum of distances is minimum.
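As a hedged illustration of the search step, the sketch below swaps the optimizer for a plain exhaustive search over pairs of candidate points, which is affordable for a few hundred points. It also uses a symmetrised variant of the cost: every point is mirrored and compared with the whole cloud, which avoids the degenerate case where nearly all points fall on one side of a candidate line. The synthetic heart-shaped data, the helper names `mirror`, `nearestSum` and `symCost`, and the choice of every 10th point as a candidate are all assumptions made for the example. Implicit expansion (R2016b or later) is required.

```matlab
% Synthetic heart-shaped cloud, symmetric about the y-axis (an assumption;
% this is not the data from data1.mat or data2.mat).
t = linspace(0, 2*pi, 201).';
t(end) = [];                                   % 2*pi duplicates the point at 0
P = [16*sin(t).^3, 13*cos(t) - 5*cos(2*t) - 2*cos(3*t) - cos(4*t)];
N = size(P, 1);

% Mirror every row of Q across the line through p with unit direction d.
mirror = @(Q, p, d) p + 2*((Q - p)*d.')*d - (Q - p);
% Sum, over the rows of Q, of the distance to the nearest row of R.
nearestSum = @(Q, R) sum(min(sqrt((Q(:,1) - R(:,1).').^2 + ...
                                  (Q(:,2) - R(:,2).').^2), [], 2));
% Symmetrised cost of the line through p1 and p2.
symCost = @(p1, p2) nearestSum(mirror(P, p1, (p2 - p1)/norm(p2 - p1)), P);

% Exhaustive search over pairs of candidate points (not a true optimizer).
cand = 1:10:N;                                 % every 10th point as a candidate
bestCost = Inf;
bestPair = [0 0];
for i = 1:numel(cand) - 1
    for j = i + 1:numel(cand)
        c = symCost(P(cand(i), :), P(cand(j), :));
        if c < bestCost
            bestCost = c;
            bestPair = [cand(i), cand(j)];
        end
    end
end
fprintf('best pair: points %d and %d, cost %.3g\n', ...
    bestPair(1), bestPair(2), bestCost);
```

On this exactly symmetric synthetic curve the search should pick the two points that lie on the y-axis. Real, noisy data will only give an approximate minimum, and an optimizer over a continuous line parameterisation (for example with fminsearch) could refine the result from there.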
I'm not going to write that code, or even give you any more depth in it than this without validation of my wild guesses.

Alfonso on 31 May 2018
Okay, I sent you a message through the blue contact link of your profile a few mins ago before reading your last comment.
John D'Errico on 31 May 2018
Hmm. I've not gotten anything yet in my mail. Now I realize why. I bastardized the address in that link, because I was getting too much junk mail from kids wanting me to do their homework. Sigh. I changed it back. Send another mail.
Direct mail is fine. I'll check again this afternoon.
Alfonso on 31 May 2018
Okay, I sent it again.