Split Set Of Circles into 2 Equal Halves with a Line - algorithm

I have been trying to answer this question for months now but I am still stuck.
The question requires me to write a program to output YES or NO to whether the given set has a line that can divide it.
I am looking for a possible algorithm to determine the answer; I want to translate it into code once I have a firm grasp of it.
Given an even-sized set of circles on a 2D plane that are guaranteed not to touch, determine whether it is possible to draw a line that divides the set exactly in two without intersecting any circle.
Circle Radius is greater than zero
No Circle touches or contains one another
A set of length 2 is always possible
every Circle can be unique in size
Input Format:
N - number of circles in set
x y r - N lines of: x coordinate, y coordinate, radius
Input repeats until EOF
Output YES or NO for each test case
Example input:
4
0 0 20
0 40 20
0 30 10
40 -30 10
4
0 0 20
0 40 20
20 40 20
20 -40 20
Output:
YES
NO
Edit: My attempts to solve
My first attempt was to find every line that would solve the problem if all circles were zero-radius points, giving me a set of candidate solutions.
(See the linked question "Dividing a plane of points into two equal halves".)
Afterwards I would restore the radii and test each candidate line.
This algorithm was extremely slow (I didn't bother to calculate its time complexity, since the solution is required to run within about a second).
My second attempt was to project the circles onto the x and y axes and rotate the set until some section of the x or y axis had no "shadow" of any circle while splitting the set in two.
This method only requires rotating through at most π/2 radians to determine the answer, but my attempts to program it were complex and slow.
I cannot find the question anywhere online as it was presented on paper last year created by a professor at my university.

Simple algorithm with cubic complexity:
Find the common tangents of every pair of circles. There are up to 4*n*(n-1)/2 ~ n^2 tangents.
For every tangent, check it against all circles: it must not cut any circle and must leave half of the circles on each side. That is n*n^2 = n^3 operations.
I think algorithms with better complexity might exist (based on sorting tangent directions).
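The cubic approach above can be sketched in Python as follows. This is only a sketch under stated assumptions: the function names `common_tangents` and `can_split` are made up for illustration, circles are `(x, y, r)` tuples, and a tangent is accepted only if every other circle is strictly clear of it (so the line can be nudged off the two circles it touches). Degenerate inputs where three or more circles share a tangent line may need extra care.

```python
import math
from itertools import combinations

EPS = 1e-9

def common_tangents(c1, c2):
    """Lines (a, b, c) with a*a + b*b == 1 that are tangent to both circles."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    dx, dy = x2 - x1, y2 - y1
    z = dx * dx + dy * dy
    lines = []
    for s1 in (-1, 1):
        for s2 in (-1, 1):
            ra, rb = s1 * r1, s2 * r2      # wanted signed distances of the centres
            r = rb - ra
            d = z - r * r
            if d < -EPS:                   # this tangent does not exist
                continue
            d = math.sqrt(abs(d))
            a = (dx * r + dy * d) / z
            b = (dy * r - dx * d) / z
            lines.append((a, b, ra - (a * x1 + b * y1)))
    return lines

def can_split(circles):
    """True if some common tangent leaves half of the circles on each side."""
    n = len(circles)
    for i, j in combinations(range(n), 2):
        for a, b, c in common_tangents(circles[i], circles[j]):
            positive = 0
            ok = True
            for k, (x, y, r) in enumerate(circles):
                dist = a * x + b * y + c   # signed distance of the centre
                if k != i and k != j and abs(dist) <= r + EPS:
                    ok = False             # line cuts or touches another circle
                    break
                positive += dist > 0
            if ok and 2 * positive == n:
                return True
    return False
```

Reading the input until EOF and printing YES or NO per test case is left out; `can_split` only answers the geometric question for one set.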

Related

Algorithm to divide region such that sum of distance is minimized

Suppose we have n points in a bounded region of the plane. The problem is to divide it in 4 regions (with a horizontal and a vertical line) such that the sum of a metric in each region is minimized.
The metric can be for example, the sum of the distances between the points in each region ; or any other measure about the spreadness of the points. See the figure below.
I don't know if any clustering algorithm might help me tackle this problem, or if, for instance, it can be formulated as a simple optimization problem where the decision variables are the "axes".
I believe this can be formulated as a MIP (Mixed Integer Programming) problem.
Let's introduce 4 quadrants A, B, C, D: A is upper right, B is lower right, etc. Then define a binary variable
delta(i,k) = 1 if point i is in quadrant k
0 otherwise
and continuous variables
Lx, Ly : coordinates of the lines
Obviously we have:
sum(k, delta(i,k)) = 1
xlo <= Lx <= xup
ylo <= Ly <= yup
where xlo,xup are the minimum and maximum x-coordinate. Next we need to implement implications like:
delta(i,'A') = 1 ==> x(i)>=Lx and y(i)>=Ly
delta(i,'B') = 1 ==> x(i)>=Lx and y(i)<=Ly
delta(i,'C') = 1 ==> x(i)<=Lx and y(i)<=Ly
delta(i,'D') = 1 ==> x(i)<=Lx and y(i)>=Ly
These can be handled by so-called indicator constraints or written as linear inequalities, e.g.
x(i) <= Lx + (delta(i,'A')+delta(i,'B'))*(xup-xlo)
Similar for the others. Finally the objective is
min sum((i,j,k), delta(i,k)*delta(j,k)*d(i,j))
where d(i,j) is the distance between points i and j. This objective can be linearized as well.
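One standard way to linearize such a product of binaries, shown here as a sketch and not necessarily the exact reformulation used for the reported experiment, introduces a continuous variable z(i,j,k) for each pair of points and each quadrant (the name z is an assumption for illustration). Because every d(i,j) is nonnegative and the objective is minimized, lower bounds on z are enough:

```latex
\begin{aligned}
\min \quad & \sum_{i<j} \sum_{k} d_{ij}\, z_{ijk} \\
\text{s.t.} \quad & z_{ijk} \ge \delta_{ik} + \delta_{jk} - 1 \qquad \forall\, i<j,\ \forall\, k \\
& z_{ijk} \ge 0
\end{aligned}
```

Here each unordered pair is counted once; summing over all ordered pairs (i,j) only scales the objective by a factor of two.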
After applying a few tricks, I could prove global optimality for 100 random points in about 40 seconds using Cplex. This approach is not really suited for large datasets (the computation time quickly increases when the number of points becomes large).
I suspect this cannot be shoe-horned into a convex problem. Also, I am not sure this objective is really what you want. It will try to make all clusters about the same size (adding a point to a large cluster introduces lots of distances to be added to the objective; adding a point to a small cluster is cheap). Maybe an average distance for each cluster is a better measure (but that makes the linearization more difficult).
Note - probably incorrect. I will try and add another answer
The one dimensional version of minimising sums of squares of differences is convex. If you start with the line at the far left and move it to the right, each point crossed by the line stops accumulating differences with the points to its right and starts accumulating differences to the points to its left. As you follow this the differences to the left increase and the differences to the right decrease, so you get a monotonic decrease, possibly a single point that can be on either side of the line, and then a monotonic increase.
I believe that the one dimensional problem of clustering points on a line is convex, but I no longer believe that the problem of drawing a single vertical line in the best position is convex. I worry about sets of points that vary in y co-ordinate so that the left hand points are mostly high up, the right hand points are mostly low down, and the intermediate points alternate between high up and low down. If this is not convex, the part of the answer that tries to extend to two dimensions fails.
So for the one dimensional version of the problem you can pick any point and work out in time O(n) whether that point should be to the left or right of the best dividing line. So by binary chop you can find the best line in time O(n log n).
I don't know whether the two dimensional version is convex or not but you can try all possible positions for the horizontal line and, for each position, solve for the position of the vertical line using a similar approach as for the one dimensional problem (now you have the sum of two convex functions to worry about, but this is still convex, so that's OK). Therefore you solve at most O(n) one-dimensional problems, giving cost O(n^2 log n).
If the points aren't very strangely distributed, I would expect that you could save a lot of time by using the solution of the one dimensional problem at the previous iteration as a first estimate of the position of solution for the next iteration. Given a starting point x, you find out if this is to the left or right of the solution. If it is to the left of the solution, go 1, 2, 4, 8... steps away to find a point to the right of the solution and then run binary chop. Hopefully this two-stage chop is faster than starting a binary chop of the whole array from scratch.
Here's another attempt. Lay out a grid so that, except in the case of ties, each point is the only point in its column and the only point in its row. Assuming no ties in any direction, this grid has N rows, N columns, and N^2 cells. If there are ties the grid is smaller, which makes life easier.
Separating the cells with a horizontal and vertical line is pretty much picking out a cell of the grid and saying that cell is the cell just above and just to the right of where the lines cross, so there are roughly O(N^2) possible such divisions, and we can calculate the metric for each such division. I claim that when the metric is the sum of the squares of distances between points in a cluster the cost of this is pretty much a constant factor in an O(N^2) problem, so the whole cost of checking every possibility is O(N^2).
The metric within a rectangle formed by the dividing lines is SUM_i,j[ (X_i - X_j)^2 + (Y_i-Y_j)^2]. We can calculate the X contributions and the Y contributions separately. If you do some algebra (which is easier if you first subtract a constant so that everything sums to zero) you will find that the metric contribution from a co-ordinate is linear in the variance of that co-ordinate. So we want to calculate the variances of the X and Y co-ordinates within the rectangles formed by each division. https://en.wikipedia.org/wiki/Algebraic_formula_for_the_variance gives us an identity which tells us that we can work out the variance given SUM_i Xi and SUM_i Xi^2 for each rectangle (and the corresponding information for the y co-ordinate). This calculation can be inaccurate due to floating point rounding error, but I am going to ignore that here.
Given a value associated with each cell of a grid, we want to make it easy to work out the sum of those values within rectangles. We can create partial sums along each row, transforming 0 1 2 3 4 5 into 0 1 3 6 10 15, so that each cell in a row contains the sum of all the cells to its left and itself. If we take these values and do partial sums up each column, we have just worked out, for each cell, the sum of the rectangle whose top right corner lies in that cell and which extends to the bottom and left sides of the grid. These calculated values at the far right column give us the sum for all the cells on the same level as that cell and below it. If we subtract off the rectangles we know how to calculate we can find the value of a rectangle which lies at the right hand side of the grid and the bottom of the grid. Similar subtractions allow us to work out first the value of the rectangles to the left and right of any vertical line we choose, and then to complete our set of four rectangles formed by two lines crossing by any cell in the grid. The expensive part of this is working out the partial sums, but we only have to do that once, and it costs only O(N^2). The subtractions and lookups used to work out any particular metric have only a constant cost. We have to do one for each of O(N^2) cells, but that is still only O(N^2).
(So we can find the best clustering in O(N^2) time by working out the metrics associated with all possible clusterings in O(N^2) time and choosing the best).
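The grid and partial-sum idea can be sketched in Python roughly as below. Assumptions made for illustration: the function name `best_cross` is hypothetical, the metric is the sum of squared distances over unordered pairs within each quadrant, computed as n*SUM(x^2) - (SUM x)^2 per coordinate, and floating point rounding is ignored as in the text.

```python
def best_cross(points):
    """Return (cost, i, j): vertical line before column i, horizontal before row j."""
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    col = {x: i for i, x in enumerate(xs)}
    row = {y: j for j, y in enumerate(ys)}
    w, h = len(xs), len(ys)

    # Per-cell statistics: count, sum x, sum x^2, sum y, sum y^2.
    cell = [[[0.0] * 5 for _ in range(h)] for _ in range(w)]
    for x, y in points:
        stats = cell[col[x]][row[y]]
        for t, v in enumerate((1, x, x * x, y, y * y)):
            stats[t] += v

    # 2D prefix sums: pre[i][j] covers columns < i and rows < j.
    pre = [[[0.0] * 5 for _ in range(h + 1)] for _ in range(w + 1)]
    for i in range(w):
        for j in range(h):
            pre[i + 1][j + 1] = [
                cell[i][j][t] + pre[i][j + 1][t] + pre[i + 1][j][t] - pre[i][j][t]
                for t in range(5)
            ]

    def cost(i0, i1, j0, j1):
        # Statistics of the rectangle [i0, i1) x [j0, j1) by inclusion-exclusion.
        n, sx, sxx, sy, syy = (
            pre[i1][j1][t] - pre[i0][j1][t] - pre[i1][j0][t] + pre[i0][j0][t]
            for t in range(5)
        )
        return n * sxx - sx * sx + n * syy - sy * sy

    best = None
    for i in range(w + 1):
        for j in range(h + 1):
            total = (cost(0, i, 0, j) + cost(i, w, 0, j)
                     + cost(0, i, j, h) + cost(i, w, j, h))
            if best is None or total < best[0]:
                best = (total, i, j)
    return best
```

The returned indices refer to the sorted distinct coordinates, so the actual lines can be placed anywhere between the neighbouring coordinate values.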

Fragmented line fitting

I want to fit a line to line fragments, i.e. a small number (often less than 10) of line segments that approximately belong to the line. The line has a small slope. But there are outliers: segments (usually smaller) outside the line. The figure below shows a typical case. There is no horizontal overlap between the pieces.
I would prefer to avoid trying a fit on all subsets of segments and keeping the best. I also wouldn't rely on RANSAC as the sample is too small.
Any suggestions?
Update:
I now plan to recast the problem as that of fitting a line to points, namely the infinitely many points on the individual line segments, assuming a constant linear density. By rewriting the least-squares equations in integral form, one sees that we can consider the segments as concentrated at their midpoints, with a weight equal to their length; there is also an extra term taking their slope into account. This gives a good grounding for fitting on segments.
Now I still have to incorporate outlier detection. Inspired by RANSAC, I can pick the longest segments and use them in isolation or in pairs to get candidate lines. For each line, evaluate the total error, and keep the line giving the smallest value. From there, some criterion (yet to be found) should allow rejecting the outliers and performing the final least-squares fit on the inliers.
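The integral form of the least-squares fit mentioned above can be sketched as follows, without any outlier handling. Assumptions: the function name `fit_line_to_segments` is illustrative, segments are pairs of `(x, y)` endpoints, the error is measured vertically, and the density is constant along each segment. Integrating over a segment with midpoint (mx, my), extent (dx, dy) and length w gives the moments w*mx, w*my, w*(mx^2 + dx^2/12) and w*(mx*my + dx*dy/12); the 1/12 terms are the extra slope-dependent contribution.

```python
import math

def fit_line_to_segments(segments):
    """Least-squares line y = a*x + b over all points of the given segments."""
    s1 = sx = sy = sxx = sxy = 0.0
    for (x0, y0), (x1, y1) in segments:
        dx, dy = x1 - x0, y1 - y0
        w = math.hypot(dx, dy)                  # weight = segment length
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2   # midpoint
        s1 += w
        sx += w * mx
        sy += w * my
        sxx += w * (mx * mx + dx * dx / 12)     # extra term from the segment's extent
        sxy += w * (mx * my + dx * dy / 12)
    a = (s1 * sxy - sx * sy) / (s1 * sxx - sx * sx)
    b = (sy - a * sx) / s1
    return a, b
```

The denominator vanishes only if all segments are vertical and share the same x, which the small-slope assumption rules out.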
I'd guess the slope is going to be around the average of the line fragment slopes, weighted by the length of each fragment (or by the square of the length, depending on how the length of the outlying fragments compares). Then best-fit a line with that slope.
So take the line fragments, convert the slopes to angles (arctan2(y1-y0, x1-x0)), multiply each by the fragment's length, add them all up, and divide by the total length of all fragments. Do the same for the position: sum (midpoint of the fragment * length of the fragment) and divide by the total length of all fragments. Then make sure the line with that slope passes through that point.
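A minimal sketch of this length-weighted averaging, assuming each fragment is given as a pair of endpoints ordered left to right (so that arctan2 returns a consistent direction). The function name `weighted_line` is made up for illustration, and no outlier rejection is attempted.

```python
import math

def weighted_line(segments):
    """Return (slope, intercept) from length-weighted angles and midpoints."""
    total = angle_sum = mx = my = 0.0
    for (x0, y0), (x1, y1) in segments:
        length = math.hypot(x1 - x0, y1 - y0)
        angle_sum += math.atan2(y1 - y0, x1 - x0) * length
        mx += (x0 + x1) / 2 * length
        my += (y0 + y1) / 2 * length
        total += length
    slope = math.tan(angle_sum / total)
    px, py = mx / total, my / total   # weighted mean position
    return slope, py - slope * px     # line with that slope through (px, py)
```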
Update:
If we don't want to pay much attention to the slopes, we should instead just best-fit the line positionally, taking into account the impact of the various segments, which we again weight by their length.
Find the total length of the fragments. Iterate the fragments until you are 1/3rd of the way through the total length of the fragments. That is going to be your x of your first point. Then pick some arbitrarily small value and iterate through fragments again, sampling at your chosen rate. Then the impact of that sample is the given y multiplied by the linear distance of the x from the x 1/3rd of the way through the total fragments all normalized by the total sum of the linear distances across the all the fragments. Do the same for 2/3rds of the way. And draw a line between the two resultant points.
As you have asked, I have some suggestions. Complete and working answer would be a bit too much for me to arrive at. My suggestion contains two major parts. Taking them one by one:
Handling outliers:
One suggestion for getting rid of the outliers is to cluster the line segments. From then on, don't worry about the segments that fall outside the cluster. But how do you cluster the segments? Divide the entire 2D plane into stripes y = 0 to a, y = a to 2a, y = 2a to 3a, etc. The stripe y = i to j that contains the most line segments is the one you will use, and its bounds give you the i and j values.
There is however one issue: what if the line segments are not well divided horizontally? What if the majority of lines are inclined at 38 degrees instead of being close to 0? In that case, you may do a Principal Component Analysis. Sorry to link you to such an open-ended idea; your question kinda demands it.
Realign your lines so that they are mostly parallel to the X-axis and then, as I mentioned above, find the stripe that contains most of the lines.
Approximating the best fitting line:
Now, after you have finalized the correct stripe, take all the line segments that fell in the stripe and densify them. Densification is the step of approximating the line segments as a collection of points. Since all of these line segments are between y = i and y = j, therefore, you may start with the line y = (i + j) / 2 as the best fit line. Then:
Find the distance of all the points from this line, keeping the distance negative when the point is above the line and positive when the point is below the line.
Sum all the distances. Let's call this summed value approximationError.
Your target is now to find the y value for which approximationError is 0.
Decrease y if the majority of points lie below the line, increase it if the majority lie above it.
You will finally arrive at a line like y = c.
Now, incline this line by the same angle by which you rotated all your input line segments during the Principal Component Analysis step.
To get the line segment, cut this line by x-value of the two x-farthest points in the stripe.
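As a side note on the search for y = c: if the densified points are written as (x_i, y_i) for i = 1..m (notation assumed here for illustration), the condition that approximationError is 0 can be solved directly, so the iteration converges to the mean of the y values:

```latex
\sum_{i=1}^{m} (c - y_i) = 0
\quad\Longrightarrow\quad
c = \frac{1}{m} \sum_{i=1}^{m} y_i
```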
I realize that this all may not be easy to visualize. Here is a link to the wikipedia image for PCA. Here is a link to another answer demonstrating line densification.

Minimum cost path to visit a set of points in a given order exactly once?

The Problem:
We are given a set of N points on a 2D plane. We have to find a path which visits all the points in the order 1-2-3....N and comes back to 1 such that the time taken is minimized. We can move one step north, east, west or south, which takes 1 unit of time. We cannot visit any of the N points more than once, except for 1, which we cannot visit more than twice.
N <= 100
The x and y coordinates of every point are <= 1000000
(This is the complete problem statement which appeared in a past USACO contest)
My Algorithm:
The x and y coordinates of the points can be very large, but there are just <= 100 points, so we remap the x coordinates so that when the points are arranged in ascending order of x, the difference between the x coordinates of adjacent points is 1. We do the same for the y coordinates.
Now we find the shortest path from point 1 to 2, from 2 to 3, ... and from N to 1 without visiting any of the given points other than the source and target. We cannot use a straightforward BFS to find the shortest path, because the distance from a point x,y to a point x+1,y is not 1, but the original value of x+1 minus the original value of x. So I used Dijkstra's algorithm with a binary heap to find the shortest path.
But this algorithm fails half of the test cases: it outputs a solution larger than the correct one.
Why is this algorithm wrong? How do we solve this problem otherwise?
The x and y coordinates of the points can be very large, but there are just <= 100 points, so we remap the x coordinates so that when the points are arranged in ascending order of x, the difference between the x coordinates of adjacent points is 1. We do the same for the y coordinates.
This essentially means you remove “unused” coordinates. But that will cost you space to maneuver. Take the following example:
4
1 1
3 3
3 2
1 2
The shortest path here takes 8 steps. Assuming positive x is east and positive y is north, that best path would be ennESwWS, with capital letters indicating arrival at the next farm.
   /--2
   |  |
4--|--3
|  |
1--/
Now if you apply your compression scheme, you'll remove the x=2 column, and in effect will be left without any column where you could pass from farm 1 to farm 2 without visiting farm 3 or 4. So I see no gain from this compression, but lots of trouble.
So I used Dijkstra's algorithm
On what graph? If you use Dijkstra on the farms only, you'll be in trouble, since you have to take the non-farm locations into account. If you take those as well, things should work, as far as I can see. Except for the compression up front.
What you can do if you want to keep this idea is to compress consecutive ranges of empty rows or columns into a single one. That way, your graph will stay reasonably small (201 rows and columns max), but where there is space to maneuver around farms, your graph will represent that fact.
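A small sketch of that gentler compression for one axis, under assumptions of my own: the helper name `compress_axis` is hypothetical, each run of unused coordinates is represented by a single kept coordinate, and one margin line is kept on each side. The cost of a step between two neighbouring kept coordinates is then the difference of their original values.

```python
def compress_axis(values):
    """Return the sorted coordinates to keep along one axis."""
    used = sorted(set(values))
    kept = [used[0] - 1]                 # margin before the first farm
    for prev, cur in zip(used, used[1:]):
        kept.append(prev)
        if cur - prev > 1:
            kept.append(prev + 1)        # one representative for the empty run
    kept.append(used[-1])
    kept.append(used[-1] + 1)            # margin after the last farm
    return kept
```

With 100 distinct values this keeps at most 100 + 99 + 2 = 201 coordinates per axis, which matches the bound quoted above.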
I guess I'd use a “detour metric” for Dijkstra: every step which brings you closer to the target has zero cost, while every step that takes you away has cost one. In the end you can take the detour cost, multiply it by two (since every step you take away is also one more step you have to take towards your goal) and add the Manhattan distance of the end points (which is the zero detour cost) and you are back at your original cost. This is basically the idea from A*, but with the performance (and existing implementation) of Dijkstra.
If you compress this
..2
...
3.4
...
1..
to this
.2
34
1.
then you increase the length of the path from 1 to 2 because 34 constitute a spurious obstacle. You could compress multiple empty rows/columns to one empty row/column instead of none.
My thinking is: when is the distance from point i to point i + 1 not the Manhattan Distance? It seems to me that the only scenario for that is when there is a full horizontal or vertical block (or both), e.g.,
(i+1) X (i+1) (i+1)
X
XXX XXXX X XXXX
X X X
i i X i X
I haven't coded anything yet, but perhaps it would be useful to scan for either block when calculating the route to the next point, and calculate the minimal detour if a block exists.

what is the algorithm to determine x and y for something so that it follows a curve between 2 keyframes?

Back in uni I remember there being an algorithm that's used to calculate the x and y position of a point between the x and y values of 2 keyframes. I know the one for a straight line:
x = ((KeyFrame2.x - KeyFrame1.x)/duration)*time
My understanding is that the difference between the 2 keyframes divided by the duration gives you how many units of measurement (mostly pixels) the object moves for every unit of time (normally 1 frame), so you just multiply that value by how far through the timeline you are.
ie.
x = ((KeyFrame2.x - KeyFrame1.x)/duration)*time
x = ((10 - 0)/10)*3
x = (10/10)*3
x = 1*3
x = 3 (after 3 units of time, the object's position will be +3 pixels along the x axis from KeyFrame1)
This one I understand. However, I was also told about one that is used for curved paths, say a ball bouncing forward where the keyframes are when it hits the ground and when it's at the peak of its bounce. This is the one I've forgotten, and I have no idea where my notes for it are.
What I am asking for is the algorithm used to calculate the x and y positions for an object with a path like this. I am asking for the mathematical algorithm, which is code independent. What I'm trying to do is animate a number of orbs that will circle the center of the screen for a logo. I've got the objects in code to move, however I need to adjust the calculations between keyframes.
NOTE: even though I'm not asking for code, the algorithms are used in animation programming and as such my question relates to programming in general.
NOTE2: KeyFrame2.x and KeyFrame1.x are not code; I see keyframes as instances of a class which holds values such as x, y, z, duration from the previous frame, etc.
You're not asking for algorithms, you're asking for equations. You can use various equations based on what kind of motion you're simulating; for example, projectile motion under gravity is described by a parabola, a curve of the form (with y as the height)
y = a * x^2 + b * x + c
For motion controlled by some intelligent force, curve-fitting based on higher-order polynomials or Bezier curves are more appropriate. Google is your friend here.
If you know that you have circular movement, you can use the circle equation to predict the next position or to interpolate in between. Since a circle is uniquely defined by 3 points, you need 3 points to interpolate in between. That makes sense: with only 2 points you can't even know whether the arc is convex or concave.
Based on the movement your points will have, pick an equation that is at least similar to that movement. In most cases the linear equation you pointed out is good enough.
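To make these options concrete, here is a small Python sketch of three interpolation formulas. All function names and parameters are assumptions chosen for illustration; `t` is the normalized time, i.e. elapsed time divided by the duration between the two keyframes, running from 0 to 1.

```python
import math

def lerp(a, b, t):
    """Straight-line interpolation between the values a and b."""
    return a + (b - a) * t

def bounce_arc(x0, x1, height, t):
    """Parabolic arc: leaves the ground at x0, lands at x1, peaks at t = 0.5."""
    return lerp(x0, x1, t), 4 * height * t * (1 - t)

def orbit(cx, cy, radius, start_angle, end_angle, t):
    """Circular path around (cx, cy); angles are in radians."""
    angle = lerp(start_angle, end_angle, t)
    return cx + radius * math.cos(angle), cy + radius * math.sin(angle)
```

For orbs circling the center of the screen, keyframing the angle and calling something like `orbit` each frame avoids having to fit a curve through x/y keyframes at all.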

How can I inscribe a rectangle or circle inside an arbitrary quadrilateral

This may be a more math focused question, but wanted to ask here because it is in a CS context. I'm looking to inscribe a rectangle inside another (arbitrary) quad with the inscribed quad having the largest height and width possible. Since I think the algorithm will be similar, I'm looking to see if I can do this with a circle as well.
To be more clear, here is what I mean by the bounding quadrilateral, as an example.
Here are 2 examples of the inscribed maximization I'm trying to achieve:
I have done some preliminary searching but have not found anything definitive. It seems that some form of dynamic programming could be the solution. It seems that this should be a linear optimization problem that should be more common than I have found, and perhaps I'm searching for the wrong terms.
Notes: For the inscribed rectangle, assume that we know a target w/h ratio we are looking for (e.g. 4:3). For the quad, assume that the sides will not cross and that it will be convex (if that simplifies the calculation).
1) Circle.
For a triangle, this is a standard math problem from the school curriculum.
For a quadrilateral, notice that the maximum inner circle will touch at least three of its sides. So take every combination of three sides, solve the problem for each resulting triangle, and keep the largest circle that fits inside the quadrilateral.
The case of parallel sides has to be considered separately (since they don't form a triangle), but it's not terribly difficult.
2) Rectangle.
You can't have "largest height and width"; you need to choose another criterion. E.g., in your picture I can increase the width by reducing the height and vice versa.
This is a 4-year-old thread, but I happened to stumble across it when googling my problem.
I have a problem like this in a current CV application. I came up with a simple and somewhat clumsy solution for finding the largest rectangle. It is not exactly the same though, because I maximize the area of the rectangle without a fixed ratio of sides.
I don't know yet whether my solution finds the optimum or whether it works in all cases. I also think there should be a more efficient way, so I am looking forward to your input.
First, assume a set of 4 points forming our (convex) quadrilateral:
      x    y
P1   -2   -5
P2    1    7
P3    4    5
P4    3   -2
For this procedure the leftmost point is P1, the following points are numbered clockwise. It looks like this:
We then create the linear functions between the points. For each function we need the slope k and the y-intercept d.
k is simply the difference in y of the two points divided by the difference in x.
d can be calculated by solving the linear function for d. So we have
k=dy/dx
d=y1-k*x1
We will also want the inverse functions.
k_inv = 1/k
d_inv = -d/k
We then create the function and inverse function for each side of the quadrilateral
        k      d                k      d
p1p2    4      3     p1p2_inv   0.25  -0.75
p2p3   -0.67   7.67  p2p3_inv  -1.5   11.5
p3p4    7    -23     p3p4_inv   0.14   3.29
p4p1    0.6   -3.8   p4p1_inv   1.67   6.33
If we had completely horizontal or vertical lines we would end up with a DIV/0 in one of the functions or inverse functions, thus we would need to handle this case separately.
Now we go through all corners whose two adjacent sides have slopes k of different sign. In our case that would be P2 and P3.
We start at P2 and iterate through the y values between P2 and the higher one of P1 and P3 with an appropriate step size, using the inverse functions to calculate the horizontal distance between the two sides. This gives us one side of the rectangle:
a=p2p3_inv(y)-p1p2_inv(y)
At the two x values x = p2p3_inv(y) and x = p1p2_inv(y) we then calculate the y values of the two opposite functions and take their distance to our current y position as candidates for the second side of our rectangle.
b_candidate_1 = y-p4p1(p2p3_inv(y))
b_candidate_2 = y-p4p1(p1p2_inv(y))
b_candidate_3 = y-p3p4(p2p3_inv(y))
b_candidate_4 = y-p3p4(p1p2_inv(y))
The smallest of the four candidates is the solution for side b.
The area obviously becomes a*b.
I did a quick example in excel to demonstrate:
The minimum b here is 6.9, so the upper right corner of the solution lies on p2p3, and the rectangle extends by a to the left and by b downward.
The four points of the rectangle are thus
Rect   x      y
R1     0.65  -1.3
R2     0.65   5.6
R3     3.1    5.6
R4     3.1   -1.3
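The evaluation of one candidate rectangle at a given y can be sketched in Python as below, using the example points from above. The helper names `line_through`, `f`, `f_inv` and `candidate_at` are assumptions for illustration, vertical and horizontal sides are not handled, and the loop over y values (and over the second corner P3) is left out.

```python
def line_through(p, q):
    """Return (k, d) of the line y = k*x + d through p and q (non-vertical)."""
    k = (q[1] - p[1]) / (q[0] - p[0])
    return k, p[1] - k * p[0]

P1, P2, P3, P4 = (-2, -5), (1, 7), (4, 5), (3, -2)
sides = {
    "p1p2": line_through(P1, P2),
    "p2p3": line_through(P2, P3),
    "p3p4": line_through(P3, P4),
    "p4p1": line_through(P4, P1),
}

def f(name, x):
    k, d = sides[name]
    return k * x + d

def f_inv(name, y):
    k, d = sides[name]
    return (y - d) / k

def candidate_at(y):
    """Sides a, b and area of the candidate rectangle whose top edge is at y."""
    x_left = f_inv("p1p2", y)
    x_right = f_inv("p2p3", y)
    a = x_right - x_left
    # Smallest vertical distance from y down to the two opposite sides.
    b = min(y - f(lower, x)
            for lower in ("p4p1", "p3p4")
            for x in (x_left, x_right))
    return a, b, a * b
```

Calling `candidate_at(5.6)` gives a of about 2.45 and b of about 6.9, matching the rectangle listed in the table above.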
I will have to put this into C++ code and will run a few tests to see if the solution generalizes or if this was just "luck".
I think it should also be possible to substitute a and b in A=a*b by the functions and put it into one linear formula that has to be maximized under the condition that p1p2 is only defined between P1 and P2 etc...
