Square intersection solver - computational-geometry

I'm looking for an efficient way to move hundreds of uniform, possibly intersecting squares away from each other, so they no longer intersect. The resulting new positions should be as close as possible to the original coordinates.
Is there such an algorithm?

Introduce the shift variables Xi+, Xi-, Yi+, Yi- and solve the linear problem that minimizes the sum of the variables under constraints that express the non-overlap like (Ui + Xi+) - (Uj - Xj-) >= S, (Vi + Yi+) - (Vj - Yj-) >= S or similar.
If you are not familiar with linear programming, you should read about: http://en.wikipedia.org/wiki/Linear_programming
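One way to write that program out in full, as a sketch rather than a definitive formulation. It assumes axis-aligned squares of side S centred at (U_i, V_i). It also assumes that for every pair to be separated you have already chosen an axis and an order; the sets P_x and P_y are hypothetical names for those choices, needed because "separated in x or separated in y" is not itself a linear constraint.

```latex
\begin{aligned}
\min\quad & \sum_i \left( X_i^+ + X_i^- + Y_i^+ + Y_i^- \right) \\
\text{s.t.}\quad
& (U_j + X_j^+ - X_j^-) - (U_i + X_i^+ - X_i^-) \ge S && (i,j) \in P_x \\
& (V_j + Y_j^+ - Y_j^-) - (V_i + Y_i^+ - Y_i^-) \ge S && (i,j) \in P_y \\
& X_i^+,\ X_i^-,\ Y_i^+,\ Y_i^- \ge 0
\end{aligned}
```

Here the new centre of square i is (U_i + X_i^+ - X_i^-, V_i + Y_i^+ - Y_i^-), so the objective is the total L1 displacement. Pairs left out of P_x and P_y are unconstrained, so the result should be rechecked for new overlaps.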

Related

Finding closest pair of points in the plane with non-distinct x-coordinates in O(nlogn)

Most of the implementations of the algorithm to find the closest pair of points in the plane that I've seen online have one of two deficiencies: either they fail to meet an O(nlogn) runtime, or they fail to accommodate the case where some points share an x-coordinate. Is a hash map (or equivalent) required to solve this problem optimally?
Roughly, the algorithm in question is (per CLRS Ch. 33.4):
For an array of points P, create additional arrays X and Y such that X contains all points in P, sorted by x-coordinate and Y contains all points in P, sorted by y-coordinate.
Divide the points in half - drop a vertical line so that you split X into two arrays, XL and XR, and divide Y similarly, so that YL contains all points left of the line and YR contains all points right of the line, both sorted by y-coordinate.
Make recursive calls for each half, passing XL and YL to one and XR and YR to the other, and finding the minimum distance, d in each of those halves.
Lastly, determine if there's a pair with one point on the left and one point on the right of the dividing line with distance smaller than d; through a geometric argument, we find that we can adopt the strategy of just searching through the next 7 points for every point within distance d of the dividing line, meaning the recombination of the divided subproblems is only an O(n) step (even if it looks like O(n^2) at first glance).
This has some tricky edge cases. One way people deal with this is sorting the strip of points of distance d from the dividing line at every recombination step (e.g. here), but this is known to result in an O(n log^2 n) solution.
Another way people deal with edge cases is by assuming each point has a distinct x-coordinate (e.g. here): note the snippet in closestUtil which adds to Pyl (or YL as we call it) if the x-coordinate of a point in Y is <= the line, or to Pyr (YR) otherwise. Note that if all points lie on the same vertical line, this would result in us writing past the end of the array in C++, as we write all n points to YL.
So the tricky bit when points can have the same x-coordinate is dividing the points in Y into YL and YR depending on whether a point p in Y is in XL or XR. The pseudocode in CLRS for this is (edited slightly for brevity):
for i = 1 to Y.length
    if Y[i] in X_L
        Y_L.length = Y_L.length + 1
        Y_L[Y_L.length] = Y[i]
    else
        Y_R.length = Y_R.length + 1
        Y_R[Y_R.length] = Y[i]
However, outside of pseudocode, if we're working with plain arrays, we don't have a magic function that can determine whether Y[i] is in X_L in O(1) time. If we're assured that all x-coordinates are distinct, sure - we know that anything with an x-coordinate less than the dividing line is in XL, so with one comparison we know what array to partition any point p in Y into. But in the case where x-coordinates are not necessarily distinct (e.g. in the case where they all lie on the same vertical line), do we require a hash map to determine whether a point in Y is in XL or XR and successfully break down Y into YL and YR in O(n) time? Or is there another strategy?
Yes, there are at least two approaches that work here.
The first, as Bing Wang suggests, is to apply a rotation. If the angle is sufficiently small, this amounts to breaking ties by y coordinate after comparing by x, no other math needed.
The second is to adjust the algorithm on G4G to use a linear-time partitioning algorithm to divide the instance, and a linear-time sorted merge to conquer it. Presumably this was not done because the author valued the simplicity of sorting relative to the previously mentioned algorithms in most programming languages.
Kleinberg & Tardos suggest annotating each point with its position (index) in X.
You could do this in O(n) time, or, if you really, really want to, you could do it "for free" in the sorting operation.
With this annotation, you could do your O(1) partitioning, and then take the position pr of the right-most point in Xl in O(1), using it to determine whether a point in Y goes in Yl (position <= pr), or Yr (position > pr). This does not require an extra data structure like a hash map, but it does require that those same positions are used in X and Y.
NB:
It is not immediately obvious to me that the partitioning of Y is the only problem that arises when multiple points have the same coordinate on the x-axis. It seems to me that the proof of linearity of the comparisons necessary across partitions breaks, but I have seen only the proof that you need only 15 comparisons, not the proof for the stricter 7-point version, so I cannot be sure.
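The index annotation and the tie-breaking by y described above can be combined in a short Python sketch. This is an illustration under assumptions, not the CLRS or Kleinberg & Tardos code: points are (x, y) tuples, and the names closest_pair, _solve and _dist are made up for the example. Each point carries its rank in the x-sorted order, so deciding whether a point of Y belongs to YL is one integer comparison even when every x-coordinate is equal.

```python
import math

def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def _solve(X, Y):
    # X: points sorted by (x, y); Y: the same points sorted by y.
    n = len(X)
    if n <= 3:
        return min((_dist(X[i], X[j]) for i in range(n) for j in range(i + 1, n)),
                   default=math.inf)
    mid = n // 2
    XL, XR = X[:mid], X[mid:]
    pr = XL[-1][2]        # rank of the right-most point in XL
    line_x = XL[-1][0]    # x-coordinate of the dividing line
    # Linear-time split of Y using the rank, not the x-coordinate.
    YL = [p for p in Y if p[2] <= pr]
    YR = [p for p in Y if p[2] > pr]
    d = min(_solve(XL, YL), _solve(XR, YR))
    # Points within d of the line, still in y order.
    strip = [p for p in Y if abs(p[0] - line_x) < d]
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 8]:   # the next 7 points
            d = min(d, _dist(p, q))
    return d

def closest_pair(points):
    """Smallest distance between two of the given (x, y) points."""
    xs = sorted(points)  # by x, ties broken by y
    tagged = [(x, y, rank) for rank, (x, y) in enumerate(xs)]
    ys = sorted(tagged, key=lambda p: (p[1], p[0]))
    return _solve(tagged, ys)
```

For example, closest_pair([(0, 0), (0, 4), (0, 1)]) handles three points on one vertical line without any hash map. The two sorts happen once up front; every recursive step only slices X and filters Y.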

Querying large amount of multidimensional points in R^N

I'm looking at listing/counting the number of integer points in R^N (in the sense of Euclidean space), within certain geometric shapes, such as circles and ellipses, subject to various conditions, for small N. By this I mean that N < 5, and the conditions are polynomial inequalities.
As a concrete example, take R^2. One of the queries I might like to run is "How many integer points are there in an ellipse (parameterised by x = 4 cos(theta), y = 3 sin(theta) ), such that y * x^2 - x * y = 4?"
I could implement this in Haskell like this:
ghci> let latticePoints = [(x,y) | x <- [-4..4], y <-[-3..3], 9*x^2 + 16*y^2 <= 144, y*x^2 - x*y == 4]
and then I would have:
ghci> latticePoints
[(-1,2),(2,2)]
Which indeed answers my question.
Of course, this is a very naive implementation, but it demonstrates what I'm trying to achieve. (I'm also only using Haskell here as I feel it most directly expresses the underlying mathematical ideas.)
Now, if I had something like "In R^5, how many integer points are there in a 4-sphere of radius 1,000, satisfying x^3 - y + z = 20?", I might try something like this:
ghci> :{
Prelude| let latticePoints2 = [(x,y,z,w,v) | x <-[-1000..1000], y <- [-1000..1000],
Prelude| z <- [-1000..1000], w <- [-1000..1000], v <- [-1000..1000],
Prelude| x^2 + y^2 + z^2 + w^2 + v^2 <= 1000000, x^3 - y + z == 20]
Prelude| :}
so if I now type:
ghci> latticePoints2
Not much will happen...
I imagine the issue is that it's effectively looping through 2000^5 (32 quadrillion!) points, and it's clearly unreasonable of me to expect my computer to deal with that. I can't imagine doing a similar implementation in Python or C would help matters much either.
So if I want to tackle a large number of points in such a way, what would be my best bet in terms of general algorithms or data structures? I saw in another thread (Count number of points inside a circle fast), someone mention quadtrees as well as K-D trees, but I wouldn't know how to implement those, nor how to appropriately query one once it was implemented.
I'm aware some of these numbers are quite large, but the biggest circles, ellipses, etc I'd be dealing with are of radius 10^12 (one trillion), and I certainly wouldn't need to deal with R^N with N > 5. If the above is NOT possible, I'd be interested to know what sort of numbers WOULD be feasible?
There is no general way to solve this problem. The problem of finding integer solutions to algebraic equations (equations of this sort are called Diophantine equations) is known to be undecidable. Apparently, you can write equations of this sort such that solving the equations ends up being equivalent to deciding whether a given Turing machine will halt on a given input.
In the examples you've listed, you've always constrained the points to be on some well-behaved shape, like an ellipse or a sphere. While this particular class of problem is definitely decidable, I'm skeptical that you can efficiently solve these problems for more complex curves. I suspect that it would be possible to construct short formulas that describe curves that are mostly empty but have a huge bounding box.
If you happen to know more about the structure of the problems you're trying to solve - for example, if you're always dealing with spheres or ellipses - then you may be able to find fast algorithms for this problem. In general, though, I don't think you'll be able to do much better than brute force. I'm willing to admit that (and in fact, hopeful that) someone will prove me wrong about this, though.
The idea behind the kd-tree method is that you recursively subdivide the search box and try to rule out whole boxes at a time. Given the current box, use some method that either (a) declares that all points in the box match the predicate, (b) declares that no points in the box match the predicate, or (c) makes no declaration (one possibility, which may be particularly convenient in Haskell: interval arithmetic). On (c), cut the box in half (say along the longest dimension) and recursively count in the halves. Obviously the method can choose (c) all the time, which devolves to brute force; the goal here is to do (a) or (b) as much as possible.
The performance of this method is very dependent on how it's instantiated. Try it -- it shouldn't be more than a couple dozen lines of code.
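A minimal Python sketch of that subdivision, assuming the predicate is just "inside a ball of squared radius r2 centred at the origin" (the equality constraint from the question is left out). The function name count_in_ball and the inclusive integer box given by lo and hi are choices made for this example. The interval bounds on the sum of squares play the role of the (a)/(b)/(c) test.

```python
def count_in_ball(lo, hi, r2):
    """Count integer points p with lo[i] <= p[i] <= hi[i] and sum(p[i]**2) <= r2."""
    # Smallest and largest value of the sum of squares anywhere in the box.
    min_sq = sum(0 if l <= 0 <= h else min(l * l, h * h) for l, h in zip(lo, hi))
    max_sq = sum(max(l * l, h * h) for l, h in zip(lo, hi))
    if min_sq > r2:
        return 0                      # (b) no point in the box matches
    if max_sq <= r2:
        total = 1                     # (a) every point in the box matches
        for l, h in zip(lo, hi):
            total *= h - l + 1
        return total
    # (c) undecided: cut the box in half along its longest dimension.
    axis = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    mid = (lo[axis] + hi[axis]) // 2
    left_hi = list(hi)
    left_hi[axis] = mid
    right_lo = list(lo)
    right_lo[axis] = mid + 1
    return count_in_ball(lo, left_hi, r2) + count_in_ball(right_lo, hi, r2)
```

A call such as count_in_ball([-5, -5], [5, 5], 25) counts the lattice points in a disc of radius 5. Boxes entirely inside or outside the ball are settled in one step, so the work concentrates near the boundary rather than over the whole volume.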
For a nicely connected region, assuming your shape is significantly smaller than your containing search space, and given a seed point, you could do a growth/building algorithm:
Given a seed point:
    Push seed point into test-queue
    while test-queue has items:
        Pop item from test-queue
        If item tests to be within region (eg using a callback function):
            Add item to inside-set
            for each neighbour point (generated on the fly):
                if neighbour not in outside-set and neighbour not in inside-set:
                    Add neighbour to test-queue
        else:
            Add item to outside-set
    return inside-set
The trick is to find an initial seed point that is inside the region.
Make sure your set implementation gives O(1) duplicate checking. This method will eventually break down with large numbers of dimensions as the surface area exceeds the volume, but for 5 dimensions should be fine.
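The growth algorithm above could look roughly like this in Python. It is a sketch under assumptions: points are integer tuples, neighbours differ by 1 along a single axis, and the names grow_region and inside are invented for the example. One deliberate change from the pseudocode: a single `queued` set replaces the inside-set and outside-set membership checks, so no point is enqueued twice.

```python
from collections import deque

def grow_region(seed, inside):
    """Collect the lattice points connected to `seed` for which inside(point) is true."""
    dims = len(seed)
    # Offsets to the 2 * dims axis-aligned neighbours.
    steps = [tuple(s if k == i else 0 for k in range(dims))
             for i in range(dims) for s in (-1, 1)]
    inside_set = set()
    queued = {seed}          # everything ever placed on the queue
    queue = deque([seed])
    while queue:
        p = queue.popleft()
        if not inside(p):
            continue         # outside point: do not expand it
        inside_set.add(p)
        for step in steps:
            q = tuple(a + b for a, b in zip(p, step))
            if q not in queued:
                queued.add(q)
                queue.append(q)
    return inside_set
```

For instance, grow_region((0, 0), lambda p: p[0] ** 2 + p[1] ** 2 <= 25) collects the lattice points of a disc of radius 5. Python's built-in set gives the O(1) average-case duplicate check the answer asks for.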

Algorithm for finding all combinations of (x,y,z,j) that satisfy x+y = z+j, where x,y,z,j are integers between -N...N inclusive

I'm working on a problem that requires an array (dA[j], j=-N..N) to be calculated from the values of another array (A[i], i=-N..N) based on a conservation of momentum rule (x+y=z+j). This means that for a given index j for all the valid combinations of (x,y,z) I calculate A[x]A[y]A[z]. dA[j] is equal to the sum of these values.
I'm currently precomputing the valid indices for each dA[j] by looping x=-N...+N,y=-N...+N and calculating z=x+y-j and storing the indices if abs(z) <= N.
Is there a more efficient method of computing this?
The reason I ask is that in future I'd like to also be able to efficiently find, for each dA[j], all the terms that contain a specific A[i]. Essentially, to be able to compute the Jacobian of dA[j] with respect to A[i].
Update
For the sake of completeness I figured out a way of doing this without any if statements: if you parametrize the equation x+y=z+j given that j is a constant you get the equation for a plane. The constraint that x,y,z need to be integers between -N..N create boundaries on this plane. The points that define this boundary are functions of N and j. So all you have to do is loop over your parametrized variables (s,t) within these boundaries and you'll generate all the valid points by using the vectors defined by the plane (s*u + t*v + j*[0,0,1]).
For example, if you choose u=[1,0,-1] and v=[0,1,1] all the valid solutions for every value of j are bounded by a 6 sided polygon with points (-N,-N),(-N,-j),(j,N),(N,N),(N,-j), and (j,-N).
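A simpler variant of the same "no if statements" idea, sketched in Python: instead of parametrizing the plane, derive the bounds on y directly from the requirement that z = x + y - j stays within -N..N. The names triples_for and dA, and the convention that A[i + N] stores the value for index i, are assumptions made for this example.

```python
def triples_for(j, N):
    """Yield every (x, y, z) with x + y == z + j and x, y, z all in -N..N."""
    for x in range(-N, N + 1):
        # -N <= x + y - j <= N  gives  j - N - x <= y <= j + N - x.
        y_lo = max(-N, j - N - x)
        y_hi = min(N, j + N - x)
        for y in range(y_lo, y_hi + 1):
            yield x, y, x + y - j

def dA(A, N):
    """A is a list of length 2N + 1; A[i + N] holds the value for index i."""
    return [sum(A[x + N] * A[y + N] * A[z + N] for x, y, z in triples_for(j, N))
            for j in range(-N, N + 1)]
```

Every triple the inner loop visits is valid, so no work is spent on rejected candidates, although the total per j is still proportional to N^2.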
So for each j, you go through all (2N+1)^2 combinations to find the correct x's and y's such that x+y = z+j; the running time of your application (per j) is O(N^2). I don't think your current idea is bad (and after playing with some pseudocode for this, I couldn't improve it significantly). I would like to note that once you've picked a j and a z, there are at most 2N+1 choices for x and y. So overall, the best algorithm would still complete in O(N^2) per j.
But consider the following improvement by a factor of 2 (for the overall program, not per j): if z+j= x+y, then (-z)+(-j)= (-x)+(-y) also.

Google Interview : Find the maximum sum of a polygon [closed]

Given a polygon with N vertices and N edges. There is an int (possibly negative) on every vertex and an operation from the set {*, +} on every edge. At each step, we remove an edge E from the polygon and merge the two vertices linked by that edge (V1, V2) into a new vertex with value V1 op(E) V2. The last case is two vertices with two edges; the result is the bigger one.
Return the maximum result value that can be obtained from a given polygon.
For the last case we might not need to merge, as the other number could be negative, so in that case we would just return the larger number.
How I am approaching the problem:
p[i,j] denotes the maximum value we can obtain by merging the nodes labelled i to j.
p[i,i] = v[i] -- base case
p[i,j] = max over k from i to j-1 of (p[i,k] op p[k+1,j]), where op is the operator on the edge between nodes k and k+1.
and then p[0,n] will be my answer.
Second point: I will have to start from each of the vertices and do the same as above, as this is cyclic (n vertices, n edges).
The time complexity for this is n^3 * n, i.e. n^4.
Can I do better than this?
As you have identified (tagged) correctly, this is indeed very similar to the matrix chain multiplication problem (in what order do I multiply matrices in order to do it quickly).
This can be solved in polynomial time using dynamic programming.
I'm going to instead solve a similar, more classic (and equivalent) problem: given a formula with numbers, additions and multiplications, which way of parenthesizing it gives the maximal value? For example,
6+1 * 2 becomes (6+1)*2 which is more than 6+(1*2).
Let us denote our input a1 to an, real numbers, and o(1),...,o(n-1), each either * or +. Our approach works as follows: we consider the subproblem F(i,j), which represents the maximal value of the formula (after parenthesizing) for ai,...,aj. We will create a table of such subproblems and observe that F(1,n) is exactly the result we are looking for.
Define
F(i,j)
    - If i>j return 0 // no sub-formula of negative length
    - If i=j return ai // the maximal formula for one number is the number
    - If i<j return the maximal value over all m between i (inclusive) and j (exclusive) of:
        F(i,m) (o(m)) F(m+1,j) // check all places for possible parenthesis insertion
This goes through all possible options. Proof of correctness is done by induction on the size n=j-i and is pretty trivial. (Note that this relies on the inputs being non-negative; with negative numbers the minimum of each sub-formula has to be tracked as well, because the product of two very negative values can be the maximum.)
Let's go through the runtime analysis:
If we do not save the values for smaller subproblems this runs pretty slowly; however, with a table we can make this algorithm run in O(n^3).
We create an n*n table T in which the cell at index i,j contains F(i,j). Filling F(i,i) and F(i,j) for j smaller than i takes O(1) per cell, since we can calculate these values directly. Then we go diagonally and fill F(i,i+1) (which we can do quickly since we already know all the previous values in the recursive formula). We repeat this n times for n diagonals (all the diagonals in the table, really). Filling each cell takes O(n), and since each diagonal has O(n) cells, we fill each diagonal in O(n^2), meaning we fill the whole table in O(n^3). After filling the table we know F(1,n), which is the solution to your problem.
Now back to your problem
If you translate the polygon into n different formulas (one for starting at each vertex) and run the algorithm for formula values on it, you get exactly the value you want.
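The table-filling scheme and the translation into n formulas might be sketched in Python as follows. Two things here go beyond the answer and are assumptions of this example: the table stores the minimum as well as the maximum of each sub-formula so that negative vertex values are handled, and edge_ops[k] is taken to be the operation on the edge joining vertex k and vertex (k + 1) % n. The names best_chain and best_polygon are made up.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}

def best_chain(values, ops):
    """Maximum of values[0] ops[0] values[1] ... over all parenthesizations."""
    n = len(values)
    # hi[i][j] / lo[i][j]: largest / smallest value obtainable from values[i..j].
    hi = [[None] * n for _ in range(n)]
    lo = [[None] * n for _ in range(n)]
    for i, v in enumerate(values):
        hi[i][i] = lo[i][i] = v
    for length in range(2, n + 1):            # one diagonal per length
        for i in range(n - length + 1):
            j = i + length - 1
            cands = [OPS[ops[m]](a, b)
                     for m in range(i, j)      # last operator applied
                     for a in (lo[i][m], hi[i][m])
                     for b in (lo[m + 1][j], hi[m + 1][j])]
            hi[i][j], lo[i][j] = max(cands), min(cands)
    return hi[0][n - 1]

def best_polygon(values, edge_ops):
    """Try each edge as the one that is never applied, then solve the open chain."""
    n = len(values)
    results = []
    for cut in range(n):
        chain_values = [values[(cut + 1 + k) % n] for k in range(n)]
        chain_ops = [edge_ops[(cut + 1 + k) % n] for k in range(n - 1)]
        results.append(best_chain(chain_values, chain_ops))
    return max(results)
```

With the example from the text, best_chain([6, 1, 2], ["+", "*"]) evaluates (6+1)*2. Each chain costs O(n^3) and there are n chains, which matches the n^4 total estimated in the question.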
I think you can reduce the need for a brute force search. For example: if there is a chain of
x + y + z
You can replace it with a single vertex whose value is the sum; you can't do better than that. You need to do the multiplying after the addition when you're dealing with +ve integers. So if it's all positive, simply reduce all + chains and then multiply.
So that leaves the cases where there are -ve numbers. Seems to me that the strategy for a single -ve number is pretty obvious, for two -ve numbers there are a few cases (remembering that negative times negative is positive), and for more than 2 -ve numbers it seems to get tricky :-)

Minimizing a function of vectors

I need to minimize the following sum:
minimize the sum over i = 1 to n of fi(v(i), v(i - 1), tangent(i))
v and tangent are vectors.
fi takes the 3 vectors as arguments and returns a cost associated with these 3 vectors. For this function, v(i - 1) is the vector chosen in the previous iteration. tangent(i) is also known. fi calculates the cost of choosing a vector v(i), given the other two vectors v(i - 1) and tangent(i). The v(0) and v(n) vectors are known. tangent(i) values are also known in advance for all i = 0 to n.
My task is to determine all such v(i)s such that the total cost of the function values for i = 1 to n is minimized.
Can you please give me any ideas to solve this?
So far I can think of Branch and Bound or dynamic programming methods.
Thanks!
I think this is a problem in mathematical optimisation, with an objective function built up of dot products and arcCosines, subject to the constraint that your vectors should be unit vectors. You could enforce this either with Lagrange multipliers, or by including a normalising step in the arc-Cosine. If Ti is a unit vector then for Vi calculate cos^-1(Ti.Vi/sqrt(Vi.Vi)). I would have a go at using a conjugate gradient optimiser for this, or perhaps even Newton's method, with my starting point Vi = Ti.
I would hope that this would be reasonably tractable, because the Vi are only related to neighbouring Vi. You might even get somewhere by repeatedly adjusting each Vi in isolation, one by one, to optimise the objective function. It might be worth just seeing what happens if you repeatedly set Vi to be the average of Ti, Vi+1, and Vi-1, and then scaled Vi to be a unit vector again.
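The last suggestion, repeated averaging followed by rescaling, is easy to try. The Python sketch below is only that experiment, not an optimiser for any particular fi: it assumes vectors are tuples of floats, that no average ever comes out as the zero vector, and that a fixed number of sweeps is enough. The names relax, normalize and the sweeps parameter are hypothetical.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def relax(v_start, v_end, tangents, sweeps=100):
    """tangents[i] is T_i for i = 0..n; returns unit vectors V_0..V_n."""
    n = len(tangents) - 1
    T = [normalize(t) for t in tangents]
    # Starting point V_i = T_i, with the two known end vectors pinned.
    V = list(T)
    V[0], V[n] = normalize(v_start), normalize(v_end)
    for _ in range(sweeps):
        for i in range(1, n):
            # Average of T_i and the two neighbouring V's, rescaled to unit length.
            avg = tuple((t + a + b) / 3 for t, a, b in zip(T[i], V[i - 1], V[i + 1]))
            V[i] = normalize(avg)
    return V
```

Whether the result actually lowers the true objective depends on fi, so it is worth evaluating the real cost before and after rather than trusting the smoothing on its own.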
