Dividing a plane of points into two equal halves [closed] - algorithm

Given a 2-dimensional plane containing n points, I need to generate the equation of a line that divides the plane so that n/2 points lie on one side and n/2 points on the other.

I have assumed the points are distinct, otherwise there might not even be such a line.
If points are distinct, then such a line always exists and is possible to find using a deterministic O(nlogn) time algorithm.
Say the points are P1, P2, ..., P2n. Assume they are not all on the same line. If they were, then we can easily form the splitting line.
First translate the points so that all the co-ordinates (x and y) are positive.
Now suppose we magically had a point Q on the y-axis such that no line formed by those points (i.e. any infinite line Pi-Pj) passes through Q.
Now since Q does not lie within the convex hull of the points, we can easily see that we can order the points by a rotating line passing through Q. For some angle of rotation, half the points will lie on one side and the other half will lie on the other of this rotating line, or, in other words, if we consider the points being sorted by the slope of the line Pi-Q, we could pick a slope between the (median)th and (median+1)th points. This selection can be done in O(n) time by any linear time selection algorithm without any need for actually sorting the points.
Now to pick the point Q.
Say Q was (0,b).
Suppose Q was collinear with P1 (x1,y1) and P2 (x2,y2).
Then we have that
(y1-b)/x1 = (y2-b)/x2 (note we translated the points so that xi > 0).
Solving for b gives
b = (x1y2 - y1x2)/(x1-x2)
(Note, if x1 = x2, then P1 and P2 cannot be collinear with a point on the Y axis).
Consider |b|.
|b| = |x1y2 - y1x2| / |x1 -x2|
Now let xmax be the x-coordinate of the rightmost point and ymax the y-coordinate of the topmost point.
Also let D be the smallest non-zero x-coordinate difference between two points (this exists because not all x-coordinates are the same, since the points are not all collinear).
Then we have that |b| <= xmax*ymax/D.
Thus, pick our point Q (0,b) to be such that |b| > b_0 = xmax*ymax/D
D can be found in O(nlogn) time.
The magnitude of b_0 can get quite large and we might have to deal with precision issues.
Of course, a better option is to pick Q randomly! With probability 1, you will find the point you need, thus making the expected running time O(n).
If we could find a way to pick Q in O(n) time (by finding some other criterion), then we can make this algorithm run in O(n) time.
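The construction above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `halving_line_via_q` is made up here, the input is a list of distinct `(x, y)` pairs with an even count, exact `Fraction` arithmetic is used to sidestep the precision issue mentioned above, and a plain sort stands in for the linear-time selection step (so this sketch is O(n log n) overall).

```python
from fractions import Fraction

def halving_line_via_q(points):
    """Return (m, c) so that y = m*x + c leaves half of the points on each side.

    Hypothetical helper: assumes distinct points and an even number of them.
    """
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    # Translate so that every coordinate is strictly positive (>= 1).
    pts = [(Fraction(x - min_x + 1), Fraction(y - min_y + 1)) for x, y in points]

    xs = sorted({x for x, _ in pts})
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if gaps:
        # D is the smallest non-zero x difference; pick b above xmax*ymax/D.
        x_max = xs[-1]
        y_max = max(y for _, y in pts)
        b = x_max * y_max / min(gaps) + 1
    else:
        # All points share one x value: no pair can be collinear with Q.
        b = Fraction(0)

    # Slopes of the lines Pi-Q; they are pairwise distinct by the choice of b.
    slopes = sorted((y - b) / x for x, y in pts)
    half = len(pts) // 2
    m = (slopes[half - 1] + slopes[half]) / 2

    # Undo the translation: y - min_y + 1 = m*(x - min_x + 1) + b.
    c = m * (1 - min_x) + b + min_y - 1
    return m, c
```

Because all translated x-coordinates are positive, a point lies below the returned line exactly when its slope to Q is smaller than `m`, which is why the median slope gives an even split.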

Create an arbitrary line in that plane. Project each point onto that line, i.e. for each point, get the closest point on the line to that point.
Order those points along the line in either direction, and choose a point on that line such that there is an equal number of points on the line in either direction.
Get the line perpendicular to the first line which passes through that point. This line will have half the original points on either side.
There are some cases to avoid when doing this. Most importantly, if all the points are themselves on a single line, don't choose a perpendicular line which passes through one of them. In fact, choose that line itself as the first line so you don't have to worry about projecting the points. In terms of the actual mathematics behind this, vector projections will be very useful.
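A minimal sketch of the projection idea, assuming an even number of points and a caller-supplied direction vector (the name `split_by_projection` and the return convention are invented for this example). The dot product with the direction is the projection scaled by a constant factor, which preserves the ordering along the line.

```python
def split_by_projection(points, direction=(1, 0)):
    """Return (dx, dy, c) describing the line dx*x + dy*y = c, or None on a tie.

    The returned line is perpendicular to `direction` and has half of the
    points on each side. Hypothetical helper for illustration only.
    """
    dx, dy = direction
    # Scalar position of each point along the chosen direction.
    proj = sorted(x * dx + y * dy for x, y in points)
    half = len(points) // 2
    lo, hi = proj[half - 1], proj[half]
    if lo == hi:
        # The two middle projections coincide: this direction cannot split.
        return None
    return dx, dy, (lo + hi) / 2
```

When the function returns `None`, trying a different direction is the natural next step.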

This is a modification of Dividing a plane of points into two equal halves which allows for the case with overlapping points (in which case, it will say whether or not the answer exists).
1) If the number of points is odd, return "impossible".
2) Pick a random line (two random points).
3) Project all points onto this line (`O(N)` operation), i.e. we pretend this line is our new X'-axis, and write down the X'-coordinate of each point.
4) Perform any median-finding algorithm on the X'-coordinates (`O(N)` or faster-if-desired operation); this returns 2 medians if there are no overlapping points.
5) Return the line perpendicular to the original random line that splits the medians.
6) In the rare case of overlapping points, repeat a few times (it would take a pathological case to prevent a solution from existing).
This is O(N) unlike other proposed solutions.
Assuming a solution exists, the above method will probably terminate, though I don't have a proof.
Try the above algorithm a few times unless you detect overlapping points. If you detect a high number of overlapping points, you may be in for a rough ride, but there is a terribly inefficient brute-force solution that involves checking all possible angles:
For every "critical slope range", perform the above algorithm
by choosing a line with a slope within the range.
If all critical slope ranges fail, the solution is impossible.
A critical angle is defined as an angle which could possibly change the result (imagine the solution to a previous answer, and rotate the entire set of points until one or more points swap position with one or more other points, crossing the dividing line). There are only finitely many of these, and I think they are bounded by the number of points, so you're probably looking at something in the range O(N^2)-O(N^2 log(N)) if you have overlapping points, for a brute-force approach.

I'd guess that a good way is to sort/sequence/order the points (e.g. from left to right), and then choose a line which passes through (or between) the middle point[s] in the sequence.

There are obvious cases where no solution is possible. E.g. when you have three heaps of points. One point at location A, Two points at location B, and five points at location C.
If you expect some decent distribution, you can probably get a good result with tlayton's algorithm. To select the initial line slant, you could determine the extent of the whole point set, and choose the angle of the largest diagonal.

The median equally divides a set of numbers in the manner similar to what you're trying to accomplish, and it can be computed in O(n) time using a selection algorithm (the writeup in Cormen et al is better, so you may want to look there instead). So, find the median of your x values Mx (or your y values My if you prefer) and set x = Mx (or y = My) and that line will be axially aligned and split your points equally.
If the nature of your problem requires that no more than one point lies on the line (if you have an odd number of points in your set, at least one of them will be on the line) and you discover that's what's happened (or you just want to guard against the possibility), rotate all of your points by some random angle, θ, and compute the median of the rotated points. You then rotate the median line you computed by -θ and it will evenly divide points.
The likelihood of randomly choosing θ such that the problem manifests itself again is very small with a finite number of points, but if it does, try again with a different θ.

Here is how I approach this problem (with the assumption that n is even and NO three points are collinear):
1) Pick up the point with smallest Y value. Let's call it point P.
2) Take this point as the new origin point, so that all other points will have positive Y values after this transformation.
3) For every other point (there are n - 1 points remaining), consider it in polar coordinates. Each other point can be represented by a radius and an angle. You can ignore the radius and just focus on the angle.
4) How can you find a line that splits the points evenly? Find the median of the (n - 1) angles. The line from point P to the point with that median angle will split the points evenly.
Time complexity for this algorithm is O(n).
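The four steps can be sketched as below, under the same assumptions (n even, no three points collinear). The name `split_through_lowest` is made up for this sketch, and it sorts the angles for brevity; replacing the sort with a selection algorithm would be needed to reach the O(n) bound claimed above.

```python
import math

def split_through_lowest(points):
    """Return (p, q): the line through p and q has equally many points on each side.

    Hypothetical helper: assumes an even number of points, no three collinear.
    """
    # Step 1: the lowest point (ties broken by smaller x).
    p = min(points, key=lambda pt: (pt[1], pt[0]))
    others = [pt for pt in points if pt != p]
    # Steps 2-3: angles relative to p; all lie in [0, pi] since p is lowest.
    others.sort(key=lambda pt: math.atan2(pt[1] - p[1], pt[0] - p[0]))
    # Step 4: the point with the median angle among the n - 1 others.
    return p, others[len(others) // 2]
```

Both returned points lie on the dividing line itself, so the remaining n - 2 points are split into two groups of (n - 2)/2.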

I don't know how useful this is, but I have seen a similar problem.
If you already have the direction vector (i.e. the coefficients of the dimensions of your plane),
you can find two points inside that plane, and by simply using the midpoint formula you can find the midpoint of that plane.
Then, using the coefficients of that plane and the midpoint, you can find a plane that is equidistant from both points, using the general equation of a plane.
A line would then amount to expressing one variable in terms of the other,
so you would find a line equidistant from both planes.
There are different methods of doing this such as projection using the distance equation from a plane but I believe that would complicate your math a lot.

To add to M's answer: a method to generate a Q (that's not so far away) in O(n log n).
To begin with, let Q be any point on the y-axis ie. Q = (0,b) - some good choices might be (0,0) or (0, (ymax-ymin)/2).
Now check if there are two points (x1, y1), (x2, y2) collinear with Q. The line between any point and Q is y = mx + b; since b is constant, this means two points are collinear with Q if their slopes m are equal. So determine the slopes mi for all points and check if there are any duplicates (amortized O(n) using a hash table).
If all the m's are distinct, we're done; we found Q, and M's algorithm above generates the line in O(n) steps.
If two points are collinear with Q, we'll move Q up just a tiny amount ε, Qnew = (0, b + ε), and show that Qnew will not be collinear with two other points.
The criterion for ε, derived below, is:
ε < mminΔ*xmin
To begin with, our m's look like this:
mi = yi/xi - b/xi
Let's find the minimum difference between any two distinct mi and call it mminΔ (O(n log n) by, for instance, sorting and then comparing the differences between mi and mi+1 for all i).
If we fudge b up by ε, the new equation for m becomes:
mi,new = yi/xi - b/xi - ε/xi
= mi,old - ε/xi
Since ε > 0 and xi > 0, all m's are reduced, and all are reduced by a maximum of ε/xmin. Thus, if
ε/xmin < mminΔ, ie.
ε < mminΔ*xmin
is true, then two mi which were previously unequal will be guaranteed to remain unequal.
All that's left is to show that if m1,old = m2,old, then m1,new =/= m2,new. Since both mi were reduced by an amount ε/xi, this is equivalent to showing x1 =/= x2. If they were equal, then:
y1 = m1,oldx1 + b = m2,oldx2 + b = y2
Contradicting our assumption that all points are distinct. Thus, m1, new =/= m2, new, and no two points are collinear with Q.
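A small sketch of the ε adjustment derived above, using exact rationals so that equal slopes are detected reliably. The function name `nudge_q` is invented here, and the sketch assumes the points are distinct and already translated so that every x is positive; it sorts the slopes instead of hashing them.

```python
from fractions import Fraction

def nudge_q(points, b=Fraction(0)):
    """Return an intercept b' such that no two points are collinear with (0, b').

    Hypothetical helper: assumes distinct points with x > 0.
    """
    slopes = sorted((Fraction(y) - b) / x for x, y in points)
    # Differences between neighbouring slopes that are actually distinct.
    gaps = [t - s for s, t in zip(slopes, slopes[1:]) if t != s]
    if len(gaps) == len(slopes) - 1:
        return b  # all slopes already distinct, Q = (0, b) works as is
    x_min = min(x for x, _ in points)
    # Criterion from the text: eps < mminΔ * xmin. Any eps > 0 works when
    # every slope was equal, because then there is no distinct pair to protect.
    eps = min(gaps) * x_min / 2 if gaps else Fraction(1)
    return b + eps
```

For the points (1,1), (2,2), (1,3), (3,5) and b = 0, the first two share slope 1, the smallest gap between distinct slopes is 2/3, and the sketch moves Q up by 1/3.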

I picked up the idea from Moron and andand and
continued it to form a deterministic O(n) algorithm.
I also assumed that the points are distinct and
n is even (though the algorithm can be
changed so that odd n with one point
on the dividing line is also supported).
The algorithm tries to divide the points with a vertical line between them. This only fails if the points in the middle have the same x value. In that case the algorithm determines how many points with the same x value have to be on the left and lower side and rotates the line accordingly.
I'll try to explain with an example.
Let's assume we have 16 points on a plane.
First we need to get the point with the 8th greatest x-value
and the point with the 9th greatest x-value.
With a selection algorithm this is possible in O(n),
as pointed out in another answer.
If the x-values of those points are different, we are done.
We create a vertical line between those two points,
and that line splits the points equally.
The problem arises when the x-values are equal. Then we have 3 sets of points:
those on the left side (x < xa), those in the middle (x = xa)
and those on the right side (x > xa).
The idea now is to count the points on the left side and
calculate how many points from the middle need to go there,
so that half of the points are on that side. We can ignore the right side here
because if we have half of the points on the left side, the other half must be on the right side.
So let's assume we have 3 points (c=3) on the left side,
6 in the middle and 7 on the right side
(the algorithm doesn't know the count from the middle or right side,
because it doesn't need it, but we could also determine it in O(n)).
We need 8-3=5 points from the middle to go on the left side.
The points we already got in the first step are useless now,
because they are only determined by the x-value
and can be any of the points in the middle.
We want the 5 points from the middle with the lowest y-value on the left side and
the point with the highest y-value on the right side.
Again using the selection algorithm, we get the point with the 5th lowest y-value
and the point with the 6th lowest y-value.
Both points will have the x-value equal to xa,
else we wouldn't get to this step,
because there would be a vertical line.
Now we can create the point Q in the middle of those two points.
That's one point of the resulting line.
Another point is needed, so that no points from the left or right side are divided.
To get that point we need the point from the left side
that has the lowest angle (bh) between the vertical line at xa
and the line determined by that point and Q.
We also need that point from the right side (with angle ag).
The new point R is between the point with the lower angle
and a point on the vertical line
(if the lower angle is on the left side a point above Q
and if the lower angle is on the right side a point below Q).
The line determined by Q and R divides the points in the middle
so that there is an equal number of points on both sides.
It doesn't divide any points on the left or right side,
because if it did, that point would have a lower angle and
would have been chosen to calculate R.
From the view of a mathematician that should work well in O(n).
For computer programs it is fairly easy to find a case
where precision becomes a problem. An example with 4 points would be
A(0, 100000000), B(0, 100000001), C(0, 0), D(0.0000001, 0).
In this example Q would be (0, 100000000.5) and R (0.00000005, 0).
Which gives B and C on the left side and A and D on the right side.
But it is possible that A and B are both on the dividing line,
because of rounding errors. Or maybe only one of them.
So whether this algorithm suits the requirements depends on the input values.
1) Get the two points Pa(xa, ya) and Pb(xb, yb) that are the medians by x value. > O(n)
2) If xa != xb, you can stop here, because a line parallel to the y-axis between those two points is the result. > O(1)
3) Get all points whose x value equals xa. > O(n)
4) Count the points with x value less than xa as c. > O(n)
5) Get the lowest point Pc by y value among the points from 3. > O(n)
6) Get the highest point Pd by y value among the points from 3. > O(n)
7) Get the (n/2-c)th lowest point Pe by y value among the points from 3. > O(n)
8) Also get the next higher point Pf by y value among the points from 3. > O(n)
9) Create a new point Q (xa, (ye+yf)/2) between Pe and Pf. > O(1)
10) For all points Pi, calculate the angle ai between Pc, Q and Pi and the angle bi between Pd, Q and Pi. > O(n)
11) Get the point Pg with the lowest angle ag (with ag>0° and ag<180°). > O(n)
12) Get the point Ph with the lowest angle bh (with bh>0° and bh<180°). > O(n)
13) If there aren't any Pg or Ph (all points have the same x value), create a new point R (xa+1, 0), or any other point with an x value different from xa; else if ag is lower than bh, create a new point R ((xc+xg)/2, (yc+yg)/2) between Pc and Pg; else create a new point R ((xd+xh)/2, (yd+yh)/2) between Pd and Ph. > O(1)
14) The line determined by Q and R divides the points. > O(1)
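The sketch below is a simplified variant of this idea, not the exact procedure listed above: it sorts the points lexicographically by (x, y) instead of using selection (so it is O(n log n)), and it bounds the tilt of the line directly instead of searching for the smallest angle. The name `split_with_tilt` is made up, and exact `Fraction` arithmetic is an assumption chosen to avoid the rounding problem from the 4-point example.

```python
from fractions import Fraction

def split_with_tilt(points):
    """Return (a, b, c) such that a*x + b*y + c is negative for one half of
    the points and positive for the other half.

    Hypothetical helper: assumes distinct points and an even count.
    """
    pts = sorted((Fraction(x), Fraction(y)) for x, y in points)
    half = len(pts) // 2
    (xe, ye), (xf, yf) = pts[half - 1], pts[half]
    if xe != xf:
        # The two medians differ in x: a vertical line between them suffices.
        return Fraction(1), Fraction(0), -(xe + xf) / 2

    # Both medians sit on the vertical line x = xa; Q lies between them.
    xa, yq = xe, (ye + yf) / 2
    # Keep the tilt so small that no point with x != xa can change sides:
    # tilt * |y - yq| stays strictly below |x - xa| for every such point.
    ratios = [abs(x - xa) / (abs(y - yq) + 1) for x, y in pts if x != xa]
    tilt = min(ratios) / 2 if ratios else Fraction(1)
    # Line through Q: (x - xa) + tilt * (y - yq) = 0.
    return Fraction(1), tilt, -xa - tilt * yq
```

With a positive tilt, points on x = xa below Q evaluate negative and those above Q evaluate positive, which matches the left half of the lexicographic order.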

Related

Find two rectangles with minimum areas that cover all points

You're given n points, unsorted in an array. You're supposed to find two rectangles that cover all points and do not overlap. The edges of the rectangles should be parallel to the x or y axis.
The program should return the minimum total area covering all these points, i.e. the area of the first rectangle + the area of the second rectangle.
I tried to solve this problem. I sorted the points by x-coordinate, and the first one is the leftmost point of the first rectangle. As we go through the points we find the highest and lowest one. I was thinking that where the difference in x between two adjacent points is the biggest, the first point is the rightmost one of the first rectangle, and the second point is the leftmost one of the second rectangle.
It should work when the points are given as in first example, however, if the example is the second one it doesn't work. As it would return something like this and that's wrong:
This should be correct:
Then I was thinking of sorting twice, the second time by y-coordinate, and then comparing the two total areas: the area when the points are sorted by x and the area when the points are sorted by y. The smaller area is the correct answer.
The two rectangles cannot overlap, so one must be either completely to the right or on top of the other. Your idea to sort the points by x-value and find the biggest gap is good, but you should do that for both directions, as you suggested. That would find the correct rectangles in your example.
The biggest gap isn't necessarily the ideal splitting point, however. Depending on the extent of the bounding boxes in the perpendicular direction, the split may be somewhere else. Consider a rectangular area with four quadrants, where two diagonally opposite quadrants are populated with points:
Here, the ideal split isn't where the largest gap is.
           
You can find the ideal location by considering all possible splits between points with adjacent x- and y-coordinates.
Sort the points by x-coordinate.
Scan the sorted array in ascending order. Keep track of the minimum rectangle to the left of the current point by storing the minimum and maximum y-coordinates. Store these running top and bottom borders for each point.
Now do the same in descending order, where you keep running top and bottom borders for the right rectangle.
Finally, loop through the points again and calculate the areas of the left and right minimal rectangles for a split between two adjacent nodes. Keep track of the minimum area sum.
Then do the same for minimum top and bottom rectangles. The last two steps can be combined, which will save arrays for the minimum bounds for the right rectangle.
This should be O(n · log n) in time: Sorting is O(n · log n) and the individual passes are O(n). You need O(n) additional memory for the running bounding boxes for the first rectangle.
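The sweep described above could look roughly like this. The function name `min_two_rect_area` is invented, and the sketch makes a few assumptions of its own: splits are only tried between points whose sweep coordinates differ, degenerate (zero-area) rectangles are allowed, and the single bounding box is used as a fallback when no split exists.

```python
def min_two_rect_area(points):
    """Smallest total area of two axis-parallel rectangles covering all points.

    Hypothetical helper: the rectangles are separated by a vertical or a
    horizontal line.
    """
    def best_split(pts):
        pts = sorted(pts)
        n = len(pts)
        # Running bottom/top borders of the left rectangle for each prefix.
        lo, hi = [], []
        cur_lo = cur_hi = pts[0][1]
        for _, y in pts:
            cur_lo, cur_hi = min(cur_lo, y), max(cur_hi, y)
            lo.append(cur_lo)
            hi.append(cur_hi)
        # Fallback: one bounding box around everything.
        best = (pts[-1][0] - pts[0][0]) * (hi[-1] - lo[-1])
        # Descending pass keeps the borders of the right rectangle on the fly.
        r_lo = r_hi = pts[-1][1]
        for i in range(n - 1, 0, -1):
            r_lo, r_hi = min(r_lo, pts[i][1]), max(r_hi, pts[i][1])
            if pts[i - 1][0] < pts[i][0]:
                left = (pts[i - 1][0] - pts[0][0]) * (hi[i - 1] - lo[i - 1])
                right = (pts[-1][0] - pts[i][0]) * (r_hi - r_lo)
                best = min(best, left + right)
        return best

    # Left/right split, then top/bottom split via swapped coordinates.
    return min(best_split(points), best_split([(y, x) for x, y in points]))
```

Combining the descending pass with the area computation is what saves the second pair of arrays mentioned above.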
The first observation is that any edge of a rectangle must touch one of the points. Edges that didn't touch a point could be pulled back, resulting in less area.
Given n points, there are thus n selections total for left1, right1, bottom1, top1, left2, right2, bottom2 and top2. This gives a simple O(n^8) algorithm already: try all possible assignments and remember the one giving the least total area (right1 - left1)(top1 - bottom1) + (right2 - left2)(top2 - bottom2). Indeed, you can skip any combinations with right < left or top < bottom. This gives a speedup, though it does not change the O(n^8) bound.
Another observation is that the edges should stay within the minimum and maximum X and Y bounds of all points. Find the minimum and maximum X and Y values of any points. Call these minX, maxX, minY and maxY. At least one of your rectangles will need to have its left, right, bottom and top edges, respectively, at those levels.
Because minX, maxX, minY and maxY must each be assigned to one of the two rectangles, and there are exactly 2^4 = 16 ways to do this, you can try each of the 16 possible assignments with the remaining coordinates assigned as above. This gives an O(n^4) algorithm: O(n) to get minX, maxX, minY and maxY, and then O(n^4) to fill in the four unassigned variables for each of 16 assignments of minX, maxX, minY and maxY to the eight edge coordinates.
We have so far ignored the requirement that rectangles not overlap. To accommodate that, we must ensure at least one of the following four conditions holds true:
a horizontal line at Y coordinate h with top1 <= h <= bottom2
a horizontal line at Y coordinate h with top2 <= h <= bottom1
a vertical line at X coordinate w with right1 <= w <= left2
a vertical line at X coordinate w with right2 <= w <= left1
The two rectangles overlap if and only if all four of these conditions are simultaneously false. This allows us to skip over candidate solutions, giving a speedup but not changing the asymptotic bound O(n^4). Note that we need to check this condition specifically since, otherwise, optimal solutions might have overlap (exercise: show such an example).
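As a quick illustration, the four conditions translate into a one-line test. The tuple layout `(left, bottom, right, top)` and the name `rects_disjoint` are assumptions made for this sketch; rectangles that merely touch along an edge count as non-overlapping here, matching the `<=` in the conditions.

```python
def rects_disjoint(r1, r2):
    """True if a horizontal or vertical line separates the two rectangles.

    Each rectangle is a (left, bottom, right, top) tuple (assumed layout).
    """
    left1, bottom1, right1, top1 = r1
    left2, bottom2, right2, top2 = r2
    return (top1 <= bottom2 or top2 <= bottom1
            or right1 <= left2 or right2 <= left1)
```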
Let's try to shave some more time off of this. Assume we have non-overlapping rectangles by condition #1 above. Then there are n choices for h; we can try each of these n choices and then determine the area of the resulting selections by finding the minimum and maximum coordinates of points in the resulting halves. By trying all n selections for h, we can determine the "best case" vertical split. We need not try condition #2, since the only difference is in the ordering of the rectangles which is arbitrary. We must also try condition #3 with a horizontal split. This suggests an O(n^2) algorithm:
For each point, choose h = point.y
Separate the points into groups with point.y <= h and point.y > h.
Find the minimum and maximum X and Y coordinates of both subsets of points.
Compute the sum of the areas of the two rectangles.
Remember the minimum area obtained from the above and the corresponding h.
Repeat, but using w and X coordinates.
Determine whether minimum area was obtained for a vertical or horizontal split
Return the corresponding rectangles as the answer
Can we do even better? This is O(n^2) and not O(n) because for each choice of h and w we need to then find the minimum and maximum coordinates of each subgroup. This assumes a linear scan through both subgroups. We don't actually need to do this for the min and max X/Y coordinates when scanning horizontally/vertically, since those will be known. What we need is a solution to this problem:
Given n points and a value h, find the maximum X coordinate of any point whose Y coordinate is no greater than h.
The obvious solution I give above is O(n^2), but you might be able to find an O(n log n) solution by clever application of sorting or maybe even an O(n) solution by some even more clever method. I will not attempt this.
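One possible way to answer that sub-problem for many values of h, sketched here as an assumption rather than as part of the answer above: sort once by Y, keep a running prefix maximum of X, and binary-search each h. The helper names are made up.

```python
import bisect

def build_prefix_max_x(points):
    """Preprocess in O(n log n): sorted Y values and prefix maxima of X."""
    pts = sorted(points, key=lambda p: p[1])
    ys = [y for _, y in pts]
    prefix = []
    best = float("-inf")
    for x, _ in pts:
        best = max(best, x)
        prefix.append(best)
    return ys, prefix

def max_x_at_or_below(ys, prefix, h):
    """Largest X among points with Y <= h, or None if there is no such point."""
    i = bisect.bisect_right(ys, h)
    return prefix[i - 1] if i else None
```

Each query costs O(log n) after the sort, which would bring the split search down to O(n log n) overall.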
Our solution is O(n^2); the theoretically optimal solution is Omega(n) since you must at least look at all the points. So we're pretty close but there is room for improvement.

Tangents range for all pairs of points in a box

Suppose I have a box with a lot of points. I need to be able to calculate the min and max angles over all lines which go through all possible pairs of the points. I can do it in O(n^2) time by just pairing every point with all the others. But is there a faster algorithm?
Taking the idea of dual plane proposed by Evgeny Kluev, and my comment about finding left-most intersection point, I'll try to give an equivalent direct solution without any dual space.
The solution is simple: sort your points by (x, y) lexicographically. Now draw a line through each two adjacent points in the sorted order. It can be proved that the minimal angle is achieved by one of these lines. In order to get maximal angle, you need to sort by (x, -y) lexicographically, and also check only adjacent pairs of points.
Let's prove the idea for the min angle. Consider the two points A and B which yield the minimal possible angle. Among such pairs we can choose the one with the minimal difference of x coordinates.
Suppose that they have same y. If there is no other point between them, then they are adjacent. If there are any points between them, then clearly at least one of them is adjacent to A in our order, and all of them yield the same angle.
Suppose that there exists a point P with x-coordinate in between A and B, i.e. Ax < Px < Bx. If P lies on AB, then AP has same angle but less difference of x coordinates, hence a contradiction. When P is not on AB, then either AP or PB would give you less angle, which also gives contradiction.
Now we have points A and B lying on two adjacent vertical lines. There are no other points between these lines. If A and B are the only points on their vertical lines, then the AB pair is clearly adjacent in sorted order and QED. If there are many points on these lines, obviously the minimal angle is achieved by taking the highest point on the left vertical line (which must be A) and the lowest point on the right vertical line (which must be B). Since we sort points of equal x by y, these two points are also adjacent.
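A short sketch of the adjacent-pairs idea, expressed with slopes instead of angles. The assumptions are mine: integer coordinates (so that `Fraction` gives exact slopes), at least two distinct x values, and vertical pairs skipped rather than treated as ±90°. The name `slope_range` is hypothetical.

```python
from fractions import Fraction

def slope_range(points):
    """Return (min_slope, max_slope) over all pairs with different x.

    Only neighbours in two sorted orders are inspected, so the cost is
    dominated by sorting: O(n log n).
    """
    def adjacent_slopes(order):
        return [Fraction(y2 - y1, x2 - x1)
                for (x1, y1), (x2, y2) in zip(order, order[1:])
                if x1 != x2]

    # Sorting by (x, y) puts the highest point of one column next to the
    # lowest point of the following column: candidates for the minimum.
    lowest = adjacent_slopes(sorted(points))
    # Sorting by (x, -y) does the opposite: candidates for the maximum.
    highest = adjacent_slopes(sorted(points, key=lambda p: (p[0], -p[1])))
    return min(lowest), max(highest)
```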
Sort the points (or use hash map) to find out if there are any horizontal lines.
Then solve this problem on dual plane. Here you only need to find the leftmost and the rightmost intersection points. Use binary searches to find a pair of horizontal coordinates such that all intersection points are between them. (You could quickly find approximate results just by continuing binary searches from these coordinates).
Then sort lines according to their tangents on dual plane. And for pairs of adjacent lines in this sorted order find intersections closest to those horizontal coordinates. This does not guarantee good complexity in the worst case (when some lines on primal plane are almost horizontal). But in most cases time complexity would be determined by sorting: O(N log N) + O(binary_search_complexity).

find a point non collinear with all other points in a plane

Given a list of N points in the plane in general position (no three are collinear), find a new point p that is not collinear with any pair of the N original points.
We obviously cannot search every point in the plane. I started by finding the intersection points of all the lines that can be formed from the given points, or by making a circle with them or something. I don't have any clue how to check all the points.
Question found in http://introcs.cs.princeton.edu/java/42sort/
I found this question in a renowned algorithm book, which means it is answerable, but I cannot think of an optimal solution. That's why I am posting it here, so that anyone who knows it can answer.
The best I can come up with is an N^2 algorithm. Here goes:
1) Choose a tolerance e to control how close you're willing to come to a line formed from the points in the set.
2) Compute the convex hull of your set of points.
3) Choose a line L parallel to one of the sides of the convex hull, at a distance 3e outside the hull.
4) Choose a point P on L, so that P is outside the projection of the convex hull on L. The projection of the convex hull on L is an interval of L. P must be placed outside this interval.
5) Test each pair of points in the set. If the line M formed by the 2 test points intersects a disc of radius 2e around P, move P out further along L until M no longer intersects the disc. By the construction of L, there can be no line intersecting the disc parallel to L, so this can always be done.
6) If M crosses L beyond P, move P beyond that intersection, again far enough that M doesn't pass through the disc.
7) After all this is done, choose your point at distance e, on the perpendicular to L at P. It cannot be collinear with any line of the set.
I'll leave the details of how to choose the next position of P along L in step 5 to you.
There are some obvious trivial rejection tests you can do so that you do the more expensive checks only when the test line M is "parallel enough" to L.
Finally, I should mention that it is probably possible to push P far enough out that numerical problems occur. In that case the best I can suggest is to try another line outside of the convex hull by a distance of at least 3e.
You can actually solve it using a simple O(n log n) algorithm, which we will then improve to O(n). Call the bottom-most point A (in case of a tie, choose the one that has the smaller x coordinate). You can now sort the rest of the points in clockwise order using the CCW test. Now as you process each point in sorted order, you can see that between any two successive points having different angles with point A and the bottom axis (let these be U, V) there is no point having angle c, with U <= c <= V. So we can add any point in this section and it is guaranteed that it won't be collinear with any other points from the set.
So, all you need is to find one pair of adjacent points and you are done. So, find the minimum and the second minimum angle with A (these should be different) in O(n) time and select any point in between them.

Given two lines on a plane, how to find integer points closest to their intersection?

I can't solve it:
You are given 8 integers:
A, B, C representing a line on a plane with equation Ax + By = C
a, b, c representing another line
x, y representing a point on a plane
The two lines are not parallel therefore divide plane into 4 pieces.
Point (x, y) lies inside of one these pieces.
Problem:
Write a fast algorithm that will find a point with integer coordinates in the same piece as (x,y) that is closest to the cross point of the two given lines.
Note:
This is not a homework, this is old Euler-type task that I have absolutely no idea how to approach.
Update:
You can assume that the 8 numbers on input are 32-bit signed integers.
But you cannot assume that the solution will be 32 bit.
Update 2:
Difficult case - when lines are almost parallel - is the heart of the problem
Update 3:
Author of the problem states that the solution is linear O(n) algorithm. Where n is the size of the input (in bits). Ie: n = log(A) + log(B) + ... + log(y)
But I still can't solve it.
Please state complexities of published algorithms (even if they are exponential).
[Diagram: http://imagebin.ca/img/yhFOHb.png]
First find the intersection of the lines L1: Ax+By=C and L2: ax+by=c, i.e. point A(x1,y1).
Define two more lines y = ceil(y1) and y = floor(y1) parallel to the X-axis and find their intersections with L1 and L2, i.e. points B(x2,y2) and C(x3,y3).
Then the point you require is D or E, whichever is closer to point A. A similar procedure applies to the other parts of the plane.
D ( ceil(x2), ceil(y1) )
E ( ceil(x3), floor(y1) )
This problem falls into the category of Integer Convex Optimization.
Presented here is a mathematical way to approach the problem. I don't expect you to actually use it - a lot of complicated techniques are required, and other algorithmic approaches (such as "searching" for the appropriate point) will likely do just fine. However, interest has been expressed in the "true" solution, so here it is.
It can be solved in three stages:
First, determine which side of each line the answer will be on, as illustrated by TheMachineCharmer's answer.
Once that is known, the problem can be rewritten as a convex optimization problem (see Wikipedia for details). The function to be optimized is minimizing (x - x0)^2 + (y - y0)^2, with x0 and y0 the coordinates of the intersection of the two lines. The two lines each become a linear inequality, e.g. "x+y >= 0", together forming the convex region the answer can be found in. I will note that the solution will be (x=x0, y=y0); what you need from this stage is a way of expressing the problem, analogous to a feasible tableau for the simplex method.
Third, an integer solution can be found by repeatedly adding cuts to further constrain the feasible region until the solution to the convex optimization problem is integral. This stage may take a lot of iterations in the general case, but looking at the problem presented, and in particular the 2D nature of it, I believe it will be solved with at most two cuts.
I show here how a "difficult" instance of this problem can be solved. I think this method can be generalized. I have put another simpler instance in the comments of the original post.
Consider the two lines:
10000019 * X - 10000015 * Y + 909093 >= 0 (L1)
-10000022 * X + 10000018 * Y + 1428574 >= 0 (L2)
A = 10000019, B = -10000015, C = -909093
The intersection point is H:
Hx = -5844176948071/3, Hy = -5844179285738/3
For a point M(X,Y), the squared distance HM^2 is:
HM^2 = (9*X^2+35065061688426*X
+68308835724213587680825685
+9*Y^2+35065075714428*Y)/9
g = gcd(A,B) = 1: the equation of L1 A*X+B*Y+909093
can take any integer value.
Bezout coefficients, U=2500004 and V=2500005 verify:
A * U + B * V = 1
We now rewrite the problem in the "dual" basis (K,T) defined by:
X = T*U - K*B
Y = T*V + K*A
After substitution, we get:
T+909093 >= 0
2*T+12*K+1428574 >= 0
minimize 112500405000369*T^2
+900003150002790*T*K
+1800006120005274*K^2
+175325659092760325844*T
+701302566240903900522*K
+Constant
After further translating (first on T, then on K to minimize the
constant in the second equation), T=T1-909093, K=K1+32468:
T1 >= 0
2*T1+4+12*K1 >= 0
minimize 112500405000369*T1^2
+300001050000930*T1
+900003150002790*T1*K1
+1200004080003516*K1
+1800006120005274*K1^2
+Constant
The algorithm I proposed is to loop on T1. Actually, we don't need to
loop here, since the best result is given by T1=K1=0, corresponding to:
X = -1948055649352, Y = -1948056428573
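As a sanity check on the numbers above, here is a small Python sketch that recomputes H exactly and maps T1 = K1 = 0 back to (X, Y). The helper names `intersection` and `from_dual` are made up for illustration; the coefficients are copied from the example, and the lines are assumed to be written as a*X + b*Y + c >= 0.

```python
from fractions import Fraction

# Coefficients copied from the example (lines written as a*X + b*Y + c >= 0).
A, B, C1 = 10000019, -10000015, 909093       # L1
A2, B2, C2 = -10000022, 10000018, 1428574    # L2
U, V = 2500004, 2500005                      # Bezout coefficients quoted above


def intersection(a1, b1, c1, a2, b2, c2):
    """Exact intersection of a1*X+b1*Y+c1=0 and a2*X+b2*Y+c2=0 (Cramer's rule)."""
    det = a1 * b2 - a2 * b1
    return Fraction(b1 * c2 - b2 * c1, det), Fraction(a2 * c1 - a1 * c2, det)


def from_dual(t, k):
    """Map dual coordinates (T, K) back to (X, Y) with X = T*U - K*B, Y = T*V + K*A."""
    return t * U - k * B, t * V + k * A


H = intersection(A, B, C1, A2, B2, C2)

# T1 = K1 = 0 corresponds to T = -909093 and K = 32468.
X, Y = from_dual(-909093, 32468)

print(H, (X, Y))
print(A * X + B * Y + C1, A2 * X + B2 * Y + C2)  # left-hand sides of L1 and L2
```

Python integers are arbitrary precision, so none of these products overflow.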
My initial post below.
Here is another idea of algorithm. It may work, but I did not implement it...
With appropriate change of signs to match the position of (x,y), the problem can be written:
A*X+B*Y>=C (line D)
a*X+b*Y>=c (line d)
minimize the distance between M(X,Y) and H, the intersection point
A*b != a*B (intersection is defined)
A,B,C,a,b,c,X,Y all integers
The set of all values reached by (AX+BY) is the set of all multiples of g=gcd(A,B), and there exist integers (u,v) such that Au+Bv=g (Bezout theorem). From a point with integer coordinates (X0,Y0), all points with integer coordinates and the same value of AX+BY are (X0-KB/g,Y0+KA/g), for all integers K.
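The parametrization in the paragraph above can be sketched in Python as follows. `ext_gcd` and `points_on_level` are hypothetical helper names, and the sketch assumes A and B are not both zero.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b) and g >= 0."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    if old_r < 0:  # normalise the sign so that g is non-negative
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v


def points_on_level(a, b, t, ks):
    """Integer points with a*X + b*Y == t*g, one for each K in ks."""
    g, u, v = ext_gcd(a, b)
    # (t*u, t*v) is one solution; stepping by (-b/g, a/g) stays on the same line.
    return [(t * u - k * (b // g), t * v + k * (a // g)) for k in ks]


if __name__ == "__main__":
    for x, y in points_on_level(6, -4, 3, range(-2, 3)):
        print(x, y, 6 * x - 4 * y)
```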
To solve the problem, we can loop on lines parallel to D at increasing distance from H, and containing points with integer coordinates.
Compute g,u,v, and H (the coordinates of H are probably not needed, we only need the coefficients of the quadratic form corresponding to the distance).
T0 = ceil(C/g)
Loop from T = T0
a. Find K the smallest (or largest, depending on the sign of aB-bA) integer verifying a*(Tu-KB/g)+b*(Tv+KA/g)>=c
b. Keep point (Tu-KB/g,Tv+KA/g) if closer to H
c. Exit the loop when (T-T0) corresponds to a distance from D larger than the best result so far, otherwise continue with T+=1
I have researched the problem in the past (both because it's fun and because I ran into something related at a place where I worked).
To my knowledge, there is no efficient (FPTIME) algorithm for this problem.
The only known (to me) solution is to basically enumerate integer coordinates (starting from around the intersection) until you find the one you want. This is of course not at all efficient when the angle between the two lines is very small. You can do some pruning to improve efficiency and, when the slope is small, efficiency is decent.
A generalization of this (ILP - integer linear programming) is known to be NP-complete.
The more I think about this, the more it seems like it turns into Integer Linear Programming, which is NP-complete in the general case. http://en.wikipedia.org/wiki/Linear_programming#Integer_unknowns
My line of reasoning started out like TheMachineCharmer's answer until I reached that point. The problem there is that the approach of examining the lines along the ceil/floor of the point of intersection only works if the section is aligned with the vertical or horizontal axis though the intersection point. More likely, the thin section will be inclined at some angle away from the axis and the ceil/floor neighbors will not intersect the section on integer coordinates.
At that point we're looking for some integer combination of the natural unit vectors that satisfies the inequalities that define our selected section and also minimizes the distance to the point of intersection. To me, that seems like an integer linear programming problem.
There are special cases of integer linear programming that are easier than NP-hard, and this problem could easily be one of them since it seems like it's more constrained than the general integer linear programming case. The Wikipedia article links to a few methods, but that's beyond my math level to apply.
As a few others have pointed out, this is a problem in integer linear programming (aka linear Diophantine inequalities).
Check out this reference: ABS Algorithm For Solving a Class Of Linear Diophantine Inequalities and Integer LP Problems. The authors claim to be able to solve systems like
max(c^T x) for Ax ≤ b, x ∈ Z^n, where c ∈ Z^n, b ∈ Z^m, A ∈ Z^(m×n), m ≤ n.
In particular, setting m=2, n=2, we get the problem of finding
max(c^T x) for Ax ≤ b, x ∈ Z^2, where c ∈ Z^2, b ∈ Z^2, A ∈ Z^(2×2).
Here, A is a 2x2 matrix, and b, c and x are 2x1 column vectors.
The problem stated by the OP can be restated in this fashion (if asked, I'll try to spell this out in more detail).
The matrix algorithm presented in the paper may look hairy to the uninitiated, but matrix algorithms are like that. Not that I've gone through it line by line, or understand it, but it looks pretty tame compared to some stuff I've seen.
This seems to be something in the general class of ABS methods, which appear to be gaining traction in several problem domains.
The last sentence of section 2 of the paper also refers to another solution method.
As @Alan points out, whereas the general ILP problem is NP-Hard, the problem stated here may not be. I'm not sure why that is, but it may be because the matrix A is 2x2 (rather than nx2), and because the constraints can be expressed as integers.
Edit1: the complexity of the algorithm appears to be O(1) (It appears to be O(d), where d is the dimension of the lattice. In this case, d=2). My surprise at this is O(!!) and understanding and implementing this is still O(??), although I've gone through it a few times now and it is looking more straightforward than I thought.
Here's a partial idea which may be useful in getting a full solution. Imagine the two lines are very, very close to each other. Then any integral solution between them would also be an integral point which is very close to each line. Let's try to find close integral points to the line ax+by=c. Since y = (c - ax)/b, we need to have y very close to an integer, so b approximately divides c-ax. In other words, c-ax+D == 0 mod b for a small integer D.
We can solve c-ax+D == 0 mod b for x: x = a^-1(c+D) mod b (a^-1 will exist if a and b are relatively prime, not sure if that is the case here).
So the algorithm is to evaluate x = a^-1(c+D) mod b for D=0,+1,-1,+2,-2,... and try the resulting x's to see if they work. If there are close integral points to the intersection of the two lines, they should show up early in this iteration. Of course, you may have to reach D=b-1 in the worst case...
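A minimal Python sketch of that iteration, assuming gcd(a, b) = 1 and b != 0. The function name `near_line_points` and the `max_offset` cut-off are not from the answer; they are only there to make the example finite. Each candidate still has to be tested against the second line by the caller.

```python
def near_line_points(a, b, c, max_offset):
    """Yield (D, x, y) with a*x + b*y == c + D for D = 0, +1, -1, +2, -2, ...

    Assumes gcd(a, b) == 1 and b != 0. x is reduced to the range [0, |b|).
    """
    m = abs(b)
    a_inv = pow(a, -1, m)  # modular inverse; needs Python 3.8 or later
    offsets = [0]
    for d in range(1, max_offset + 1):
        offsets += [d, -d]
    for D in offsets:
        x = (a_inv * (c + D)) % m
        y = (c + D - a * x) // b  # exact: c + D - a*x is a multiple of b
        yield D, x, y


if __name__ == "__main__":
    for D, x, y in near_line_points(3, 7, 5, 2):
        print(D, x, y)
```

Other solutions for the same D are obtained by shifting x by multiples of b, which is how one would move the candidate towards the actual intersection.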
Actually it may be possible to solve this with a modified Bresenham's line drawing algorithm.
It is usually used to do scan conversion of lines, and only requires increments of some step inside a loop if you know the end points of the line.
Once you have worked out which sector the point is in, move the origin to the intersection, keeping note of the non-integer error. Work out the slope of the line from the intersection to the bottom line, then do a normal to the horizontal at an integer x value (if the slope is small) or a normal from the y (if the slope is high) and find where it intersects the other axis at an integer point.
You should be able to check each integer step in one axis to determine if the point you are testing is above or between your two lines (make a new vector to that spot from the intersection and determine the slope). If the point is above, increment your integer step. Because you are testing from the smallest gradient difference from one of the lines, it should be O(n). In Bresenham's algorithm there are 8 sectors, not just 4.
You guys are missing the point! haha, sorry, couldn't resist.
Hey, let's imagine a slightly simpler situation.
You've got one line emanating from the origin forming an angle of less than 90 degrees with the x axis. Find the nearest integer point.
The problem with just searching lattice points until you hit one that's in the quadrant we want is that one doesn't know how far out to search. In the case of a very, very acute angle, we could consider a bazillion points before we hit one that's in our region.
Solution:
Solve: (Slope of Line) * Delta(x) = 1.
i.e. Delta(x) = 1/(Slope of Line), is where we start searching. Subject to the constraint Delta(x) > 1.
In other words, we go just far out enough that there must have been at least an integer difference between x and y coordinates.
In our problem we'd have to transform appropriately and tweedle the numbers to give an appropriate error range. Delta(x) >= 2, Delta(x) = 2/(Slope of Line) I think will do it off of the top of my head, but I don't have a pencil.
No?
Well, it depends on what is considered as fast enough.
Let's name the point [x,y] P. Also I'll call points with integer coordinates 'integer points'.
Algorithm I propose:
1. Find the point Q where these two lines intersect. (Q=[x_q, y_q])
2. Get the function of the line between Q and P, y=f(x) or inverse x=g(y).
3. Determine whether QP is more vertical or horizontal according to its angle. Let's say it's vertical to simplify the following solution (if it's horizontal, the axes would simply invert and where I write x it'd be y and vice versa).
4. Take the first integer coordinate y_1 we get going along the line from Q to P.
5. Calculate the second coordinate of that point: x_1=g(y_1). That point is in our segment.
6. Check whether the surrounding integer points with coordinates [floor(x_1);y_1] and [floor(x_1+1);y_1] are in the segment we're interested in.
6.1 If yes, then we iterate along the horizontal line y=y_1 to find the integer point x_3 which is still in our segment and has (x_3-x_q) -> min. That point is our answer.
6.2 If not, then increment y_1 by one and repeat from step 5.
I think there are 3 pieces to this.
calculate the intersection of the 2 lines, and hold on to the X and Y coordinates of that point
find the section that the given point is in. This should be easy enough, because you have the slope of the 2 lines, and the slope of the line created by the given point and the point of intersection. Call them m_line1, m_line2 and m_intersect. If m_intersect There's a formula to figure out the section using these values and the location of the given point.
find the closest integer. There is also a straightforward calculation of this once you know the values from #1 above, and the slopes from #2. You can brute-force it, but there is an elegant mathematical solution.
These are the steps I took, at least.
Updated to add more substance
OK, I'll start us off with a discussion on #2.
If you calculate the slope of the given point and the intersection point, then you arrive at m_intersection. This is the slope of a line that passes through the intersection point. Let's assume that m_line1 is the greater of the 2 slopes, so that line1 is "above" line2 as x increases after the intersection. It makes it easier to think about for section names. We'll call section A the section given by the sliver between line1 and line2 for x larger than the intersection coordinate x, and then we'll name the other 3 sections clockwise, so that A and C are opposite each other.
If m_intersection is between m_line1 and m_line2, then the point must be in one of the 2 sections A or C. Which section is a simple test of the x coordinate value against the intersection's x coordinate. We defined A to be the section with greater value. A similar calculation can be made if the slope is outside m_line1 or m_line2.
This gives you the section that your point lies in. All you did was calculate 1 intersection (5 multiplications, 2 divisions and a handful of subtractions if you do it the traditional way), 3 slopes, and then a couple integer comparisons.
Edit #3 - back by (un)popular demand!
So here is how I calculated #3, the closest integer point to the intersection. It may not be the best, but it uses binary search, so it's O(log n) where n is related to the inverse of the difference of the line slopes. The closer they are together, the larger n is.
First, take the difference between the slopes of the two lines. Say it's 1/8. This means that from the point of intersection, you have to go out 8 units along the x axis before you are guaranteed that there is a whole integer on the y axis in between the two lines (it may be on one of the lines). Now, if the intersection itself is not on an integer x coordinate, then you'll need to step out further to guarantee that your starting point is on an integer x coordinate, but it is bounded. If the intersection is at x = 1.2, then in the above example, at worst you'd start at x = 41, then move down ~5 units along the y axis. Choose either the ceil or floor of the y value that you get. It's not terribly critical.
Now that you have a starting point, the closest point can be approximated by binary search. Your new line segment is between the intersection and the starting point, and your units of movement are multiples of that line segment's slope. Calculate the midpoint of the line segment and see if it lies in between the two lines. Add or subtract 1 to it if it is not a direct hit, and if either of those hits, cut the remaining distance in half and do it again. Otherwise search the further half of the segment.
If you don't have a slope difference < 1, I think the problem may be simpler (brute force the space around the intersection). But it's just a special case of the search above, where you don't need to step out that far to find a starting point.
I was doing something similar when I had to find a point for labeling of a polygon.
The final result was 70000 polygons in 5 seconds on pentium 3 in Autocad. So that's about 3 seconds if you exclude Autocad.
First you need to find an intersection point.
Next, you have to find where your point (x, y) lies and draw a horizontal or vertical line through it, so that your 2 lines (A, B, C) and (a, b, c) and the new horizontal/vertical line form a triangle.
How to find if it's vertical or horizontal line:
Draw both horizontal and vertical lines through your (x, y) point and then check:
-for horizontal:
- if the intersections of your horizontal line with line A,B,C and with line a,b,c satisfy (intersection with A,B,C).x < x < (intersection with a,b,c).x, then you know you're inside (you can switch A,B,C and a,b,c, as long as x is between them).
- similar for y, just check for y and not x.
So now you have a triangle and you know where it is (left, right, up, down).
For example, if it's a right triangle (like the graph above), you take the x of the intersection point and ceil it (if it's on the left, you floor it). Similarly for the y coordinate if you have an up/down triangle.
Then you draw a scanline through it that's parallel to your scanline through your (x,y) point and check if you have a point inside of the intersections (similar to x < x < x above, but with the new line).
If you don't have an integer inside, then you have to move your ceil point further away from the intersection point. You should calculate an appropriate 'step' based on the angle between your two lines (if the lines are nearly parallel and very close to each other, then the angle will be small, so you have to increase the step; if the angle is wide, a small step is enough).
When you find a point, it may not be the closest one. So you'll have to do a bisection between the last not good point (when you're increasing step) and the last point (where you found an integer).
The problem of checking whether a point is part of a mathematical cone is fairly simple. Given 2 vectors v, w, any point in the cone defined by (v, w) will be of the form z = a*v + b*w, where a, b >= 0. Note that for this to work, you will have to move the origin to the intersection of the 2 lines. Since we cannot assume finite precision of the intersection, you will have to do floating point math and decide whether something is close enough to what you want.
Find vectors defining the 4 cones (there's infinitely many of them, we just need 2 for each cone), that are defined by the 2 lines.
Find out which cone contains our point, call that cone for C.
Take the 2 vectors defining C, and find the median vector (the vector that would split C in 2 identical cones), call it m.
Now it is time to initiate the loop. For simplicity's sake I'm going to assume that we limit ourselves to n bits on the x and y axes. Note you'll need an integer larger than n bits for the length of m. Now do a binary search along the length of m, and check the 2 rings around every time (I suspect 1 ring around will be enough). When you've found the smallest length that does contain points in C, check which of those points is the closest.
The worst case growth would be O(log(sqrt(2*n^2))), where n is the length we use to represent the x and y axes.
It is possible to do a "reverse binary search", so to speak, if you don't know the length of m. Just keep doubling the length you go out until you find points in C. Then you know 2 points on m and you can do a binary search between them.
The main problem with all this is precision, so keep this in mind. Alternative ways to pursue could include the 2 halfplanes that make up a cone. Each cone above are defined by the intersection of 2 halfplanes, and it is possible that checking whether a point is member of a halfplane is simple enough, I'm not sure.
Edit: it is indeed a whole lot easier with the half planes, the 2 lines divide R^2 into 2 half planes each, this gives the 4 combinations that would be the 4 cones. So every time you want to check if a point is member of a cone, you have to check if it's a member of the 2 half planes that make up that particular cone. How to do so are explained here:
http://www.mathsteacher.com.au/year9/ch04_linear_graphs/07_half/planes.htm
and would be easier than moving the origin and fiddling around with precision. Replacing the method of checking membership and keeping everything else the same, you arrive at the same growth.
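A small Python sketch of the half-plane membership test, assuming each line is given as integer coefficients (a, b, c) of a*x + b*y + c = 0 and that the reference point is not on either line. The names `side` and `in_same_cone` are illustrative only.

```python
def side(line, p):
    """Sign (-1, 0 or +1) of a*x + b*y + c for line = (a, b, c) and p = (x, y)."""
    a, b, c = line
    s = a * p[0] + b * p[1] + c
    return (s > 0) - (s < 0)


def in_same_cone(line1, line2, ref, q):
    """True if q lies in the closed cone (one of the four) that contains ref.

    A point on a boundary line (sign 0) is counted as inside.
    """
    for line in (line1, line2):
        s_ref, s_q = side(line, ref), side(line, q)
        if s_q != 0 and s_q != s_ref:
            return False
    return True


if __name__ == "__main__":
    l1 = (1, -1, 0)  # x - y = 0
    l2 = (1, 1, 0)   # x + y = 0
    print(in_same_cone(l1, l2, (2, 0), (5, 1)))  # same cone as (2, 0)
    print(in_same_cone(l1, l2, (2, 0), (0, 3)))  # different cone
```

With integer coefficients and integer test points the signs are exact, so no tolerance is needed.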
Here is a linear time (i.e., O(# bits of A, B, C, etc.), assuming the bits fit into O(1) words of memory) solution using line-side tests and binary search:
Suppose w.l.o.g. that B != 0 (else we swap A with a, B with b, and C with c). Perform a line-side test to see which side of line (A, B, C) the point is on. Assume w.l.o.g. that the point is below (or on) the line.
Note that for an arbitrary x-coordinate x', we can compute the smallest y' such that (x', y') is above the line (A, B, C) in O(1) time via y' = (C - A * x') / B.
Now, assume w.l.o.g. that the input point (x, y) is to the right of (a, b, c), or below in the case of a horizontal line. We can then perform a line-side test of (x', y') relative to line (a, b, c) and determine whether we need to increase x' or decrease x' to find the minimum x' such that (x', y') falls on the correct side of (a, b, c).
Binary searching for this point takes at most O(w) time where w is the number of bits in A, B, etc. Note that because the input coordinates x and y each fit in an integer, so will the return value. Even if x and y were not necessarily within these bounds and the lines were nearly parallel, a suitable x will be found within O(w) time because the difference in slopes satisfies |A / B - a / b| = |Ab - aB| / |Bb| >= 1 / 2^(2w), so the x-coordinate of the answer will fit within O(1) words of memory. We still need to find the y-coordinate of the answer, which can also be found via binary search.
I suspect this is a mathematical optimization problem that can be solved with a Lagrange multiplier...
Of those four pieces of the plane, one is to the left of both lines, one is to the right of both lines, one is to the right of one and to the left of the other line, and the last one is to the left of one and to the right of the other line. It's easier to see if you draw it.
The relative position of a point from a line depends on the result of this determinant:
[1 p1x p1y; 1 p2x p2y; 1 p3x p3y], where p1 and p2 are two arbitrary points in the line and p3 is the given point.
If it equals zero, the point is on the line; if it's greater or lower than zero, it's to one side. Which side depends on the relative position of p1 and p2 on the line and on what you consider left and right on the plane.
The problem is choosing two points that follow the same criteria in both lines, so that results are consistent, maybe p1 always has lower value of x coordinate than p2 (y coordinate if the line is vertical).
When you have both points for each line, calculate both determinants and you are done.
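For reference, the determinant above expands to a two-term cross product. A minimal Python sketch (the function name `orientation` is my own):

```python
def orientation(p1, p2, p3):
    """Determinant of [[1, p1x, p1y], [1, p2x, p2y], [1, p3x, p3y]].

    Positive when p3 is to the left of the directed line p1 -> p2,
    negative when it is to the right, zero when the three points are collinear.
    """
    return (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p2[1] - p1[1]) * (p3[0] - p1[0])


if __name__ == "__main__":
    print(orientation((0, 0), (1, 0), (0, 1)))   # left of the x axis direction
    print(orientation((0, 0), (1, 0), (0, -1)))  # right
    print(orientation((0, 0), (1, 0), (2, 0)))   # on the line
```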
EDIT
Oops, this solves the problem only partially. Anyway, you can calculate the side the XY point is on with this, calculate the intersection, and then calculate the relative position of all valid points (floor(x), floor(y)), (floor(x), ceil(y)), ...
line 1 is defined as y1 = m1 * x1 + b1.
line 2 is defined as y2 = m2 * x2 + b2.
m1, m2, b1, b2 are all known values [constants].
make sure m1 <> m2.
find point of intersection, ie where y1 == y2 and x1 == x2 , defined as (X,Y).
X = (b2 - b1)/(m1 - m2)
Y = (m1*b2 - m2*b1)/(m1 - m2)
the nearest point can be found by rounding X and Y to the nearest integers. you decide what to do with .5
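A short Python sketch of the intersection step for lines in slope-intercept form, using exact rational arithmetic so that the rounding decision is not affected by floating point error. The function name `intersect` is an assumption for illustration.

```python
from fractions import Fraction


def intersect(m1, b1, m2, b2):
    """Intersection of y = m1*x + b1 and y = m2*x + b2 as exact fractions."""
    if m1 == m2:
        raise ValueError("lines are parallel")
    # Equate the two right-hand sides and solve for x, then substitute back.
    x = Fraction(b2 - b1) / Fraction(m1 - m2)
    y = m1 * x + b1
    return x, y


if __name__ == "__main__":
    x, y = intersect(2, 1, -1, 4)
    print(x, y, round(x), round(y))
```

Note that Python's built-in `round` sends exact halves to the nearest even integer, which is one possible answer to the ".5" question.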
My proposal is this. Assume that the section of the plane which contains our target point spans entirely in lower left quadrant, looking from the cross point of two lines (other quadrants are analogous, and case when section of plane spans more than one quadrant will be considered later).
Let the two given lines be l1 and l2 (l1 is 'less steep' than l2)
1. find X = (a, b), the cross point of l1 and l2.
2. let k = 0
3. let vk be the vertical line with the x coordinate xk = floor(a-k)
4. find the cross points of vk with l1 and l2 (points V1 = (x1, y1), V2 = (x2, y2)).
5. if floor(y1) != floor(y2), target point is (x1, floor(y1)) END.
6. if floor(y1) == floor(y2), increment k and go to step 3.
Since l1 and l2 are not parallel, abs(y1 - y2) must grow with k. When abs(y1 - y2) gets larger than 1, algorithm will surely stop (it might stop earlier though).
Now let us consider the (easy) case when our section of plane spans more than one quadrant, looking from the cross point of two lines (it may span two or three quadrants).
find X = (a, b), the cross point of l1 and l2.
find A, the set of four closest points to X that have integer coordinates
find B, the set of points from A which are in the target section of the plane.
point from B that is closest to the cross point of l1 and l2 is the target point
(This case runs in constant time.)

Connected points with-in a grid

Given a collection of random points within a grid, how do you check efficiently that they all lie within a fixed range of other points? I.e., pick any one random point and you can then navigate to any other point in the grid.
To clarify further: If you have a 1000 x 1000 grid and randomly placed 100 points in it how can you prove that any one point is within 100 units of a neighbour and all points are accessible by walking from one point to another?
I've been writing some code and came up with an interesting problem: Very occasionally (just once so far) it creates an island of points which exceeds the maximum range from the rest of the points. I need to fix this problem but brute force doesn't appear to be the answer.
It's being written in Java, but I am good with either pseudo-code or C++.
I like @joel.neely's construction approach, but if you want to ensure a more uniform density this is more likely to work (though it would probably produce more of a cluster rather than an overall uniform density):
Randomly place an initial point P_0 by picking x,y from a uniform distribution within the valid grid
For i = 1:N-1
Choose random j = uniformly distributed from 0 to i-1, identify point P_j which has been previously placed
Choose random point P_i where distance(P_i,P_j) < 100, by repeating the following until a valid P_i is chosen in substep 4 below:
Choose (dx,dy) each uniformly distributed from -100 to +100
If dx^2+dy^2 > 100^2, the distance is too large (fails 21.5% of the time), go back to previous step.
Calculate candidate coords(P_i) = coords(P_j) + (dx,dy).
P_i is valid if it is inside the overall valid grid.
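The steps above could be sketched in Python roughly as follows. The function name `place_points` and its parameters (`size`, `r`, `seed`) are assumptions for the example, and the grid is taken to be the square [0, size] x [0, size].

```python
import random


def place_points(n, size=1000, r=100, seed=None):
    """Place n points so that each new point is within r of an earlier one."""
    rng = random.Random(seed)
    pts = [(rng.uniform(0, size), rng.uniform(0, size))]  # P_0
    while len(pts) < n:
        px, py = pts[rng.randrange(len(pts))]  # random earlier point P_j
        while True:
            dx, dy = rng.uniform(-r, r), rng.uniform(-r, r)
            if dx * dx + dy * dy > r * r:
                continue  # outside the disc of radius r, draw again
            x, y = px + dx, py + dy
            if 0 <= x <= size and 0 <= y <= size:
                pts.append((x, y))
                break
    return pts


if __name__ == "__main__":
    for p in place_points(5, seed=1):
        print(p)
```

Because every point is attached to an earlier one, the whole set is connected by hops of at most r by construction.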
Just a quick thought: If you divide the grid into 50x50 patches and when you place the initial points, you also record which patch they belong to. Now, when you want to check if a new point is within 100 pixels of the others, you could simply check the patch plus the 8 surrounding it and see if the point counts match up.
E.g., you know you have 100 random points, and each patch contains the number of points they contain, you can simply sum up and see if it is indeed 100 — which means all points are reachable.
I'm sure there are other ways, though.
EDIT: The distance from the upper left corner to the lower right corner of a 50x50 patch is sqrt(50^2 + 50^2) ≈ 70.7, so you'd probably have to choose a smaller patch size. Maybe 35 will do (50 = sqrt(x^2 + x^2) => x = 35.355...).
Find the convex hull of the point set, and then use the rotating calipers method. The two most distant points on the convex hull are the two most distant points in the set. Since all other points are contained in the convex hull, they are guaranteed to be closer than the two extremal points.
As far as evaluating existing sets of points, this looks like a type of Euclidean minimum spanning tree problem. The wikipedia page states that this is a subgraph of the Delaunay triangulation; so I would think it would be sufficient to compute the Delaunay triangulation (see prev. reference or google "computational geometry") and then the minimum spanning tree and verify that all edges have length less than 100.
From reading the references it appears that this is O(N log N), maybe there is a quicker way but this is sufficient.
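A Delaunay-based check needs a computational geometry library. As a simpler stand-in, the sketch below tests the same property (every point reachable from every other through hops of at most R) with a brute-force union-find over all pairs. This is O(N^2), not the O(N log N) approach described above, and the name `all_connected` is my own.

```python
def all_connected(points, r):
    """True if the graph joining points at distance <= r is connected."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the set representative, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = n
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if dx * dx + dy * dy <= r * r:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    components -= 1
    return components <= 1


if __name__ == "__main__":
    print(all_connected([(0, 0), (10, 150), (20, 75)], 100))
    print(all_connected([(0, 0), (500, 500)], 100))
```

For 100 points this is only 4950 distance checks, so the quadratic cost is unlikely to matter at the sizes mentioned in the question.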
A simpler (but probably less efficient) algorithm would be something like the following:
1. Given: the points are in an array from index 0 to N-1.
2. Sort the points in x-coordinate order, which is O(N log N) for an efficient sort.
3. Initialize i = 0.
4. Increment i. If i == N, stop with success. (All points can be reached from another with radius R)
5. Initialize j = i.
6. Decrement j.
7. If j<0 or P[i].x - P[j].x > R, stop with failure. (There is a gap and all points cannot be reached from each other with radius R)
8. Otherwise, we get here if P[i].x and P[j].x are within R of each other. Check if point P[j] is sufficiently close to P[i]: if (P[i].x-P[j].x)^2 + (P[i].y-P[j].y)^2 < R^2, then point P[i] is reachable by one of the previous points within radius R, and go back to step 4.
9. Keep trying: go back to step 6.
Edit: this could be modified to something that should be O(N log N) but I'm not sure:
1. Given: the points are in an array from index 0 to N-1.
2. Sort the points in x-coordinate order, which is O(N log N) for an efficient sort.
3. Maintain a sorted set YLIST of points in y-coordinate order, initializing YLIST to the set {P[0]}. We'll be sweeping the x-coordinate from left to right, adding points one by one to YLIST and removing points that have an x-coordinate that is too far away from the newly-added point.
4. Initialize i = 0, j = 0.
5. Loop invariant always true at this point: All points P[k] where k <= i form a network where they can be reached from each other with radius R. All points within YLIST have x-coordinates that are between P[i].x-R and P[i].x
6. Increment i. If i == N, stop with success.
7. If P[i].x-P[j].x <= R, go to step 10. (this is automatically true if i == j)
8. Point P[j] is not reachable from point P[i] with radius R. Remove P[j] from YLIST (this is O(log N)).
9. Increment j, go to step 7.
10. At this point, all points P[j] with j<i and x-coordinates between P[i].x-R and P[i].x are in the set YLIST.
11. Add P[i] to YLIST (this is O(log N)), and remember the index k within YLIST where YLIST[k]==P[i].
12. Points YLIST[k-1] and YLIST[k+1] (if they exist; P[i] may be the only element within YLIST or it may be at an extreme end) are the closest points in YLIST to P[i].
13. If point YLIST[k-1] exists and is within radius R of P[i], then P[i] is reachable with radius R from at least one of the previous points. Go to step 5.
14. If point YLIST[k+1] exists and is within radius R of P[i], then P[i] is reachable with radius R from at least one of the previous points. Go to step 5.
15. P[i] is not reachable from any of the previous points. Stop with failure.
New and Improved ;-)
Thanks to Guillaume and Jason S for comments that made me think a bit more. That has produced a second proposal whose statistics show a significant improvement.
Guillaume remarked that the earlier strategy I posted would lose uniform density. Of course, he is right, because it's essentially a "drunkard's walk" which tends to orbit the original point. However, uniform random placement of the points yields a significant probability of failing the "path" requirement (all points being connectible by a path with no step greater than 100). Testing for that condition is expensive; generating purely random solutions until one passes is even more so.
Jason S offered a variation, but statistical testing over a large number of simulations leads me to conclude that his variation produces patterns that are just as clustered as those from my first proposal (based on examining mean and std. dev. of coordinate values).
The revised algorithm below produces point sets whose stats are very similar to those of purely (uniform) random placement, but which are guaranteed by construction to satisfy the path requirement. Unfortunately, it's a bit easier to visualize than to explain verbally. In effect, it requires the points to stagger randomly in a vaguely consistent direction (NE, SE, SW, NW), only changing directions when "bouncing off a wall".
Here's the high-level overview:
Pick an initial point at random, set horizontal travel to RIGHT and vertical travel to DOWN.
Repeat for the remaining number of points (e.g. 99 in the original spec):
2.1. Randomly choose dx and dy whose distance is between 50 and 100. (I assumed Euclidean distance -- square root of sums of squares -- in my trial implementation, but "taxicab" distance -- sum of absolute values -- would be even easier to code.)
2.2. Apply dx and dy to the previous point, based on horizontal and vertical travel (RIGHT/DOWN -> add, LEFT/UP -> subtract).
2.3. If either coordinate goes out of bounds (less than 0 or at least 1000), reflect that coordinate around the boundary violated, and replace its travel with the opposite direction. This means four cases (2 coordinates x 2 boundaries):
2.3.1. if x < 0, then x = -x and reverse LEFT/RIGHT horizontal travel.
2.3.2. if 1000 <= x, then x = 1999 - x and reverse LEFT/RIGHT horizontal travel.
2.3.3. if y < 0, then y = -y and reverse UP/DOWN vertical travel.
2.3.4. if 1000 <= y, then y = 1999 - y and reverse UP/DOWN vertical travel.
Note that the reflections under step 2.3 are guaranteed to leave the new point within 100 units of the previous point, so the path requirement is preserved. However, the horizontal and vertical travel constraints force the generation of points to "sweep" randomly across the entire space, producing more total dispersion than the original pure "drunkard's walk" algorithm.
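Here is a minimal Python sketch of the overview above, assuming integer coordinates in 0..999, Euclidean distance, and y growing downward. The function name `staggered_points` and its parameters are hypothetical, not taken from the trial implementation mentioned in the answer.

```python
import random


def staggered_points(n, size=1000, lo=50, hi=100, seed=None):
    """Generate n integer points, each between lo and hi units from the previous
    one before any wall reflection, bouncing off the grid walls.
    Assumes hi < size so a reflected coordinate always lands inside the grid."""
    rng = random.Random(seed)
    x, y = rng.randrange(size), rng.randrange(size)
    sx, sy = 1, 1  # horizontal travel RIGHT, vertical travel DOWN
    pts = [(x, y)]
    while len(pts) < n:
        # Step 2.1: rejection-sample a non-negative offset in the annulus [lo, hi].
        while True:
            dx, dy = rng.randint(0, hi), rng.randint(0, hi)
            if lo * lo <= dx * dx + dy * dy <= hi * hi:
                break
        # Step 2.2: apply the offset in the current travel directions.
        x, y = x + sx * dx, y + sy * dy
        # Step 2.3: reflect around the violated boundary and reverse travel.
        if x < 0:
            x, sx = -x, -sx
        elif x >= size:
            x, sx = 2 * size - 1 - x, -sx  # 1999 - x for size = 1000
        if y < 0:
            y, sy = -y, -sy
        elif y >= size:
            y, sy = 2 * size - 1 - y, -sy
        pts.append((x, y))
    return pts


if __name__ == "__main__":
    print(staggered_points(5, seed=1))
```

A reflection can only bring the new point closer to the previous one, so consecutive points stay within `hi` units even after a bounce (though possibly closer than `lo`).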
If I understand your problem correctly, given a set of sites, you want to test whether the nearest neighbor (for the L1 distance, i.e. the grid distance) of each site is at distance less than a value K.
This is easily obtained for the Euclidean distance by computing the Delaunay triangulation of the set of points: the nearest neighbor of a site is one of its neighbors in the Delaunay triangulation. Interestingly, the L1 distance is at least the Euclidean distance and at most sqrt(2) times it.
It follows that a way of testing your condition is the following:
compute the Delaunay triangulation of the sites
for each site s, start a breadth-first search from s in the triangulation, so that you discover all the vertices at Euclidean distance less than K from s (the Delaunay triangulation has the property that the set of vertices at distance less than K from a given site is connected in the triangulation)
for each site s, among these vertices at distance less than K from s, check if any of them is at L1 distance less than K from s. If not, the property is not satisfied.
This algorithm can be improved in several ways:
the breadth-first search at step 2 should of course be stopped as soon as a site at L1 distance less than K is found.
during the search for a valid neighbor of s, if a site s' is found to be at L1 distance less than K from s, there is no need to look for a valid neighbor for s': s is obviously one of them.
a complete breadth-first search is not needed: after visiting all triangles incident to s, if none of the neighbors of s in the triangulation is a valid neighbor (i.e. a site at L1 distance less than K), denote by (v1,...,vn) the neighbors. There are at most four edges (vi, vi+1) which intersect the horizontal and vertical axis. The search should only be continued through these four (or less) edges. [This follows from the shape of the L1 sphere]
Force the desired condition by construction. Instead of placing all points solely by drawing random numbers, constrain the coordinates as follows:
Randomly place an initial point.
Repeat for the remaining number of points (e.g. 99):
2.1. Randomly select an x-coordinate within some range (e.g. 90) of the previous point.
2.2. Compute the legal range for the y-coordinate that will make it within 100 units of the previous point.
2.3. Randomly select a y-coordinate within that range.
If you want to completely obscure the origin, sort the points by their coordinate pair.
This will not require much overhead vs. pure randomness, but will guarantee that each point is within 100 units of at least one other point (actually, except for the first and last, each point will be within 100 units of two other points).
As a variation on the above, in step 2, randomly choose any already-generated point and use it as the reference instead of the previous point.

Resources