Related: Is there a simple algorithm for calculating the maximum inscribed circle into a convex polygon?
I'm writing a graphics program whose goals are artistic rather than mathematical. It composes a picture step by step, using geometric primitives such as line segments or arcs of small angle. As it goes, it looks for open areas to fill in with more detail; as the available open areas get smaller, the detail gets finer, so it's loosely fractal.
At a given step, in order to decide what to do next, we want to find out: where is the largest circular area that's still free of existing geometric primitives?
Some constraints of the problem
It does not need to be exact. A close-enough answer is fine.
Imprecision should err on the conservative side: an almost-maximal circle is acceptable, but a circle that's not quite empty isn't acceptable.
CPU efficiency is a priority, because it will be called often.
The program will run in a browser, so memory efficiency is a priority too.
I'll have to set a limit on level of detail, constrained presumably by memory space.
We can keep track of the primitives already drawn in any way desired, e.g. a spatial index. Exactness of these is not required; e.g. storing bounding boxes instead of arcs would be OK. However the more precision we have, the better, because it will allow the program to draw to a higher level of detail. But, given that the number of primitives can increase exponentially with the level of detail, we'd like storage of past detail not to increase linearly with the number of primitives.
To summarize the order of priorities
Memory efficiency
CPU efficiency
Precision
P.S.
I framed this question in terms of circles, but if it's easier to find the largest clear golden rectangle (or golden ellipse), that would work too.
P.P.S.
This image gives some idea of what I'm trying to achieve. Here is the start of a tendril-drawing program, in which decisions about where to sprout a tendril, and how big, are made without regard to remaining open space. But now we want to know, where is there room to draw a tendril next, and how big? And where after that?
One very efficient way would be to recursively divide your area into rectangular sub-areas, splitting them when necessary to divide occupied areas from unoccupied areas. Then you would simply need to keep track of the largest unoccupied area at each time. See https://en.wikipedia.org/wiki/Quadtree - but you needn't split into squares.
Given any rectangle, you can draw a line inside it, so that at least one of the rectangles to either side of the line is a golden rectangle. Therefore you can recursively erect partitions within a rectangle so that all but one of the rectangles formed by the partitions are golden rectangles, and the add rectangle left over is vanishingly small. You could do this to create a quadtree-like structure, where almost all of the rectangles left over were golden rectangles.
This seems like the kind of situation where a randomized algorithm might be helpful. Choose points at random, reject and choose more if they're inappropriate for some reason, then find the min distance from your choices to each of the figures already included. The random point with the max of the mins would be your choice.
The number of sample points might have to increase as the complexity of the figure increases.
The random algorithm could be improved by checking points nearby good choices. Keep checking neighbors until no more improvement is possible.
Here's a simple way that uses a fixed amount of memory and time per update, regardless of how many drawing primitives you use. How much memory (and time per update) is needed can be controlled according to how high a "resolution" you need:
Divide the space up into a grid of points. We will maintain a 2D array, d[], which records the minimum distance from the grid point (x, y) to any already-drawn primitive in the entry d[x, y]. Initially, set every element in this array to infinity (or some huge number).
Whenever you draw some primitive, iterate over all grid points (x, y) calculating the minimum distance (or some conservative approximation to it) from (x, y) to the just-drawn primitive. E.g., if the primitive just drawn was a circle of radius r centered at (p, q), then this distance would be sqrt((x-p)^2 + (y-q)^2) - r. Then update d[x, y] with this new distance value if it is smaller than its current value.
The grid point at which the largest circle can be drawn without touching any already-drawn primitive is the grid point that is the farthest away from any primitive drawn so far. To find it, simply scan through d[] to find its maximum value, and note the corresponding indices (x, y). d[x, y] will be the maximum radius you could safely use for this circle.
Repeat steps 2 and 3 as necessary.
A couple of points:
For primitives that have area, you can assign 0 or a negative value to all d[x, y] corresponding to grid points inside the primitive.
For any convex primitive, you can often avoid updating most of the d[] array by scanning rows (or columns) "outward" from the just-drawn primitive's border: the distance from the current grid point to the primitive will never decrease, so if this distance becomes larger than the previous maximum value in d[] then we know that we can stop scanning this row, because no further distance value that we would compute on it could possibly be less than an existing distance on it.
Related
In physics simulations (for example n-body systems) it is sometimes necessary to keep track of which particles (points in 3D space) are close enough to interact (within some cutoff distance d) in some kind of index. However, particles can move around, so it is necessary to update the index, ideally on the fly without recomputing it entirely. Also, for efficiency in calculating interactions it is necessary to keep the list of interacting particles in the form of tiles: a tile is a fixed size array (eg 32x32) where the rows and columns are particles, and almost every row-particle is close enough to interact with almost every column particle (and the array keeps track of which ones actually do interact).
What algorithms may be used to do this?
Here is a more detailed description of the problem:
Initial construction: Given a list of points in 3D space (on the order of a few thousand to a few million, stored as array of floats), produce a list of tiles of a fixed size (NxN), where each tile has two lists of points (N row points and N column points), and a boolean array NxN which describes whether the interaction between each row and column particle should be calculated, and for which:
a. every pair of points p1,p2 for which distance(p1,p2) < d is found in at least one tile and marked as being calculated (no missing interactions), and
b. if any pair of points is in more than one tile, it is only marked as being calculated in the boolean array in at most one tile (no duplicates),
and also the number of tiles is relatively small if possible (but this is less important than being able to update the tiles efficiently)
Update step: If the positions of the points change slightly (by much less than d), update the list of tiles in the fastest way possible so that they still meet the same conditions a and b (this step is repeated many times)
It is okay to keep any necessary data structures that help with this, for example the bounding boxes of each tile, or a spatial index like a quadtree. It is probably too slow to calculate all particle pairwise distances for every update step (and in any case we only care about particles which are close, so we can skip most possible pairs of distances just by sorting along a single dimension for example). Also it is probably too slow to keep a full (quadtree or similar) index of all particle positions. On the other hand is perfectly fine to construct the tiles on a regular grid of some kind. The density of particles per unit volume in 3D space is roughly constant, so the tiles can probably be built from (essentially) fixed size bounding boxes.
To give an example of the typical scale/properties of this kind of problem, suppose there is 1 million particles, which are arranged as a random packing of spheres of diameter 1 unit into a cube with of size roughly 100x100x100. Suppose the cutoff distance is 5 units, so typically each particle would be interacting with (2*5)**3 or ~1000 other particles or so. The tile size is 32x32. There are roughly 1e+9 interacting pairs of particles, so the minimum possible number of tiles is ~1e+6. Now assume each time the positions change, the particles move a distance around 0.0001 unit in a random direction, but always in a way such that they are at least 1 unit away from any other particle and the typical density of particles per unit volume stays the same. There would typically be many millions of position update steps like that. The number of newly created pairs of interactions per step due to the movement is (back of the envelope) (10**2 * 6 * 0.0001 / 10**3) * 1e+9 = 60000, so one update step can be handled in principle by marking 60000 particles as non-interacting in their original tiles, and adding at most 60000 new tiles (mostly empty - one per pair of newly interacting particles). This would rapidly get to a point where most tiles are empty, so it is definitely necessary to combine/merge tiles somehow pretty often - but how to do it without a full rebuild of the tile list?
P.S. It is probably useful to describe how this differs from the typical spatial index (eg octrees) scenario: a. we only care about grouping close by points together into tiles, not looking up which points are in an arbitrary bounding box or which points are closest to a query point - a bit closer to clustering that querying and b. the density of points in space is pretty constant and c. the index has to be updated very often, but most moves are tiny
Not sure my reasoning is sound, but here's an idea:
Divide your space into a grid of 3d cubes, like this in three dimensions:
The cubes have a side length of d. Then do the following:
Assign all points to all cubes in which they're contained; this is fast since you can derive a point's cube from just their coordinates
Now check the following:
Mark all points in the top left of your cube as colliding; they're less than d apart. Further, every "quarter cube" in space is only the top left quarter of exactly one cube, so you won't check the same pair twice.
Check fo collisions of type (p, q), where p is a point in the top left quartile, and q is a point not in the top left quartile. In this way, you will check collision between every two points again at most once, because very pair of quantiles is checked exactly once.
Since every pair of points is either in the same quartile or in neihgbouring quartiles, they'll be checked by the first or the second algorithm. Further, since points are approximately distributed evenly, your runtime is much less than n^2 (n=no points); in aggregate, it's k^2 (k = no points per quartile, which appears to be approximately constant).
In an update step, you only need to check:
if a point crossed a boundary of a box, which should be fast since you can look at one coordinate at a time, and box' boundaries are a simple multiple of d/2
check for collisions of the points as above
To create the tiles, divide the space into a second grid of (non-overlapping) cubes whose width is chosen s.t. the average count of centers between two particles that almost interact with each other that fall into a given cube is less than the width of your tiles (i.e. 32). Since each particle is expected to interact with 300-500 particles, the width will be much smaller than d.
Then, while checking for interactions in step 1 & 2, assigne particle interactions to these new cubes according to the coordinates of the center of their interaction. Assign one tile per cube, and mark interacting particles assigned to that cube in the tile. Visualization:
Further optimizations might be to consider the distance of a point's closest neighbour within a cube, and derive from that how many update steps are needed at least to change the collision status of that point; then ignore that point for this many steps.
I suggest the following algorithm. E.g we have cube 1x1x1 and the cutoff distance is 0.001
Let's choose three base anchor points: (0,0,0) (0,1,0) (1,0,0)
Associate array of size 1000 ( 1 / 0.001) with each anchor point
Add three numbers into each regular point. We will store the distance between the given point and each anchor point inside these fields
At the same time this distance will be used as an index in an array inside the anchor point. E.g. 0.4324 means index 432.
Let's store the set of points inside of each three arrays
Calculate distance between the regular point and each anchor point every time when update point
Move point between sets in arrays during the update
The given structures will give you an easy way to find all closer points: it is the intersection between three sets. And we choose these sets based on the distance between point and anchor points.
In short, it is the intersection between three spheres. Maybe you need to apply additional filtering for the result if you want to erase the corners of this intersection.
Consider using the Barnes-Hut algorithm or something similar. A simulation in 2D would use a quadtree data structure to store particles, and a 3D simulation would use an octree.
The benefit of using a a tree structure is that it stores the particles in a way that nearby particles can be found quickly by traversing the tree, and far-away particles are in traversal paths that can be ignored.
Wikipedia has a good description of the algorithm:
The Barnes–Hut tree
In a three-dimensional n-body simulation, the Barnes–Hut algorithm recursively divides the n bodies into groups by storing them in an octree (or a quad-tree in a 2D simulation). Each node in this tree represents a region of the three-dimensional space. The topmost node represents the whole space, and its eight children represent the eight octants of the space. The space is recursively subdivided into octants until each subdivision contains 0 or 1 bodies (some regions do not have bodies in all of their octants). There are two types of nodes in the octree: internal and external nodes. An external node has no children and is either empty or represents a single body. Each internal node represents the group of bodies beneath it, and stores the center of mass and the total mass of all its children bodies.
demo
Given
A set of many points in 3D space
(each represented as 3 floating-point coordinates (x, y, z))
A set of many infinite lines in 3D space
(each represented by an arbitary point on the line, and a 3D direction vector)
Is there a way to find out which of the points lie on which of the lines (with a little tolerance to account for floating-point errors), that is more efficient than the trivial O(n²) approach of testing every point against every line in a nested loop?
I'm thinking along the lines of storing one of the two sets in a special data structure that helps with the intersection tests. But what would such a data structure look like?
(Links to relevant academic literature are also appreciated.)
This is a complement to Pinhead's answer.
Consider the bounding box of the P points, assumed roughly cubic, and subdivide it in C³ cells. If the spreading of the points is uniform, every cell will contain P/C³ points on average.
Now for every line you will find all the cells it intersects by a process similar to drawing a digital line, and you will intersect on average αC cells, where α is a small constant. Hence the total workload for the search is proportional to L P/C³.
Anyway, you have to account for the initialization time of the grid, proportional to C³. Hence the total including initialization is of the form C³+ ß L P/C², and is minimized by C~(L P)^(1/5), giving time and space complexity O((L P)^(3/5)), a significant saving over O(L P).
You can also think of a binary tree that contains a hierarchy of bounding spheres, starting from the tolerance radius around each point. You could create the tree in the same way one creates a kD-tree, by recursive subvisions on some dimension.
It is much harder to make precise estimates of the speedup, but you could expect a time complexity like O((L + P) Log(P)).
Divide 3d space into cubes of side N, so that a cube can contain more than one point. An infinite line will intersect an infinite amount of cubes, but checking if a cube has a point takes O(1). When a line intersects a cube with more than one point, you only check for the points in that cube, which is going to be smaller than the total size of points if 1) they are evenly distributed 2) you are using amortized constants(Constant Amortized Time). From the perspective of average scenario, you would only check all points at once at O(N^2) if all points are packed inside one cube. You can also have cubes inside the cubes, but that is more bookkeeping.
This idea is inspired by a quadtree:
https://en.wikipedia.org/wiki/Quadtree
N.B: there's a major edit at the bottom of the question - check it out
Question
Say I have a set of points:
I want to find the point with the most points surrounding it, within radius (ie a circle) or within (ie a square) of the point for 2 dimensions. I'll refer to it as the densest point function.
For the diagrams in this question, I'll represent the surrounding region as circles. In the image above, the middle point's surrounding region is shown in green. This middle point has the most surrounding points of all the points within radius and would be returned by the densest point function.
What I've tried
A viable way to solve this problem would be to use a range searching solution; this answer explains further and that it has " worst-case time". Using this, I could get the number of points surrounding each point and choose the point with largest surrounding point count.
However, if the points were extremely densely packed (in the order of a million), as such:
then each of these million points () would need to have a range search performed. The worst-case time , where is the number of points returned in the range, is true for the following point tree types:
kd-trees of two dimensions (which are actually slightly worse, at ),
2d-range trees,
Quadtrees, which have a worst-case time of
So, for a group of points within radius of all points within the group, it gives complexity of for each point. This yields over a trillion operations!
Any ideas on a more efficient, precise way of achieving this, so that I could find the point with the most surrounding points for a group of points, and in a reasonable time (preferably or less)?
EDIT
Turns out that the method above is correct! I just need help implementing it.
(Semi-)Solution
If I use a 2d-range tree:
A range reporting query costs , for returned points,
For a range tree with fractional cascading (also known as layered range trees) the complexity is ,
For 2 dimensions, that is ,
Furthermore, if I perform a range counting query (i.e., I do not report each point), then it costs .
I'd perform this on every point - yielding the complexity I desired!
Problem
However, I cannot figure out how to write the code for a counting query for a 2d layered range tree.
I've found a great resource (from page 113 onwards) about range trees, including 2d-range tree psuedocode. But I can't figure out how to introduce fractional cascading, nor how to correctly implement the counting query so that it is of O(log n) complexity.
I've also found two range tree implementations here and here in Java, and one in C++ here, although I'm not sure this uses fractional cascading as it states above the countInRange method that
It returns the number of such points in worst case
* O(log(n)^d) time. It can also return the points that are in the rectangle in worst case
* O(log(n)^d + k) time where k is the number of points that lie in the rectangle.
which suggests to me it does not apply fractional cascading.
Refined question
To answer the question above therefore, all I need to know is if there are any libraries with 2d-range trees with fractional cascading that have a range counting query of complexity so I don't go reinventing any wheels, or can you help me to write/modify the resources above to perform a query of that complexity?
Also not complaining if you can provide me with any other methods to achieve a range counting query of 2d points in in any other way!
I suggest using plane sweep algorithm. This allows one-dimensional range queries instead of 2-d queries. (Which is more efficient, simpler, and in case of square neighborhood does not require fractional cascading):
Sort points by Y-coordinate to array S.
Advance 3 pointers to array S: one (C) for currently inspected (center) point; other one, A (a little bit ahead) for nearest point at distance > R below C; and the last one, B (a little bit behind) for farthest point at distance < R above it.
Insert points pointed by A to Order statistic tree (ordered by coordinate X) and remove points pointed by B from this tree. Use this tree to find points at distance R to the left/right from C and use difference of these points' positions in the tree to get number of points in square area around C.
Use results of previous step to select "most surrounded" point.
This algorithm could be optimized if you rotate points (or just exchange X-Y coordinates) so that width of the occupied area is not larger than its height. Also you could cut points into vertical slices (with R-sized overlap) and process slices separately - if there are too many elements in the tree so that it does not fit in CPU cache (which is unlikely for only 1 million points). This algorithm (optimized or not) has time complexity O(n log n).
For circular neighborhood (if R is not too large and points are evenly distributed) you could approximate circle with several rectangles:
In this case step 2 of the algorithm should use more pointers to allow insertion/removal to/from several trees. And on step 3 you should do a linear search near points at proper distance (<=R) to distinguish points inside the circle from the points outside it.
Other way to deal with circular neighborhood is to approximate circle with rectangles of equal height (but here circle should be split into more pieces). This results in much simpler algorithm (where sorted arrays are used instead of order statistic trees):
Cut area occupied by points into horizontal slices, sort slices by Y, then sort points inside slices by X.
For each point in each slice, assume it to be a "center" point and do step 3.
For each nearby slice use binary search to find points with Euclidean distance close to R, then use linear search to tell "inside" points from "outside" ones. Stop linear search where the slice is completely inside the circle, and count remaining points by difference of positions in the array.
Use results of previous step to select "most surrounded" point.
This algorithm allows optimizations mentioned earlier as well as fractional cascading.
I would start by creating something like a https://en.wikipedia.org/wiki/K-d_tree, where you have a tree with points at the leaves and each node information about its descendants. At each node I would keep a count of the number of descendants, and a bounding box enclosing those descendants.
Now for each point I would recursively search the tree. At each node I visit, either all of the bounding box is within R of the current point, all of the bounding box is more than R away from the current point, or some of it is inside R and some outside R. In the first case I can use the count of the number of descendants of the current node to increase the count of points within R of the current point and return up one level of the recursion. In the second case I can simply return up one level of the recursion without incrementing anything. It is only in the intermediate case that I need to continue recursing down the tree.
So I can work out for each point the number of neighbours within R without checking every other point, and pick the point with the highest count.
If the points are spread out evenly then I think you will end up constructing a k-d tree where the lower levels are close to a regular grid, and I think if the grid is of size A x A then in the worst case R is large enough so that its boundary is a circle that intersects O(A) low level cells, so I think that if you have O(n) points you could expect this to cost about O(n * sqrt(n)).
You can speed up whatever algorithm you use by preprocessing your data in O(n) time to estimate the number of neighbouring points.
For a circle of radius R, create a grid whose cells have dimension R in both the x- and y-directions. For each point, determine to which cell it belongs. For a given cell c this test is easy:
c.x<=p.x && p.x<=c.x+R && c.y<=p.y && p.y<=c.y+R
(You may want to think deeply about whether a closed or half-open interval is correct.)
If you have relatively dense/homogeneous coverage, then you can use an array to store the values. If coverage is sparse/heterogeneous, you may wish to use a hashmap.
Now, consider a point on the grid. The extremal locations of a point within a cell are as indicated:
Points at the corners of the cell can only be neighbours with points in four cells. Points along an edge can be neighbours with points in six cells. Points not on an edge are neighbours with points in 7-9 cells. Since it's rare for a point to fall exactly on a corner or edge, we assume that any point in the focal cell is neighbours with the points in all 8 surrounding cells.
So, if a point p is in a cell (x,y), N[p] identifies the number of neighbours of p within radius R, and Np[y][x] denotes the number of points in cell (x,y), then N[p] is given by:
N[p] = Np[y][x]+
Np[y][x-1]+
Np[y-1][x-1]+
Np[y-1][x]+
Np[y-1][x+1]+
Np[y][x+1]+
Np[y+1][x+1]+
Np[y+1][x]+
Np[y+1][x-1]
Once we have the number of neighbours estimated for each point, we can heapify that data structure into a maxheap in O(n) time (with, e.g. make_heap). The structure is now a priority-queue and we can pull points off in O(log n) time per query ordered by their estimated number of neighbours.
Do this for the first point and use a O(log n + k) circle search (or some more clever algorithm) to determine the actual number of neighbours the point has. Make a note of this point in a variable best_found and update its N[p] value.
Peek at the top of the heap. If the estimated number of neighbours is less than N[best_found] then we are done. Otherwise, repeat the above operation.
To improve estimates you could use a finer grid, like so:
along with some clever sliding window techniques to reduce the amount of processing required (see, for instance, this answer for rectangular cases - for circular windows you should probably use a collection of FIFO queues). To increase security you can randomize the origin of the grid.
Considering again the example you posed:
It's clear that this heuristic has the potential to save considerable time: with the above grid, only a single expensive check would need to be performed in order to prove that the middle point has the most neighbours. Again, a higher-resolution grid will improve the estimates and decrease the number of expensive checks which need to be made.
You could, and should, use a similar bounding technique in conjunction with mcdowella's answers; however, his answer does not provide a good place to start looking, so it is possible to spend a lot of time exploring low-value points.
I have a rectangular area where there are circles with equal radius. I want to find which circles overlap with other circles (the output is a list of 2-element sets of overlapping circles).
I know how to check if two of the circles overlap (the distance between their centers is less than the diameter). I can perform this check for every pair of circles, but I was wondering if there is a better algorithm (faster than O(n^2)).
EDIT
The number of circles is usually about 100 and overlappings won't happen very often.
Here is some context:
The rectangle is a battlefield in a game. The movement of the units is done on small steps and I'm trying to detect collisions between units.
Given the new explanation of the problem statement, I would recommend a different approach.
Overlay a square grid over the battlefield, with a grid step equal to one circle diameter. Every circle can overlap at most four cells. In each cell, keep a list of the overlapping circles (and update it on every move).
Detecting potential collisions will now take about four cell/circle tests per circle, i.e. close to linear time.
For a simple solution, insert the centers in a 2d-tree and perform circular range queries around every center with a query radius 2R. In good conditions, this can be O(N Log(N)).
Alternatively, just sort the centers on X and try all circles in turn: by dichotomic search, locate the abscissa Xc and scan to Xc-2R and to Xc+2R, then check the 2D distance.
The cost of the dichotomic searches will be O(N Log(N)). If the circles are uniformly spread out in a square of side S, a stripe of width 4R contains 4RN/S circles, hence a total comparison cost of 4RN²/S. This is a good performance if S is large (think that for N tightly packed circles in a square, S~2R√N, hence 2N√N comparisons).
Direct answer: You cannot get better than O(n^2) in general since the circles could potentially all overlap, so you have to generate n^2 answers.
If you get more specific, you might get better answers. For example, if what you are really trying to do is find bounding spheres in a 2D simulation, you can profit from the fact that entities only move so far between frames, if the circles are sparse it's different from when they are tightly packed, etc. So let us know more about what it's all about.
EDIT: You edited your question - you indeed are looking for collision detection in a 2D simulation. If you check out https://en.wikipedia.org/wiki/Collision_detection , they point to several algorithms for exactly your case.
I like the one detailed right on that page where you keep one list of bounding intervals per axis (2 in "2D") and only need to "work hard" when those bounding intervals (which are themself by definition one-dimensional) change (i.e., there "overlap state"). This removes the O(n²) for well-behaved cases. They don't give an estimate for the complexity of that, but as it basically comes down to sorting, it looks more or less O(n logn) to me, and less when there are only minimal changes between frames.
What is the most efficient way to randomly fill a space with as many non-overlapping shapes? In my specific case, I'm filling a circle with circles. I'm randomly placing circles until either a certain percentage of the outer circle is filled OR a certain number of placements have failed (i.e. were placed in a position that overlapped an existing circle). This is pretty slow, and often leaves empty spaces unless I allow a huge number of failures.
So, is there some other type of filling algorithm I can use to quickly fill as much space as possible, but still look random?
Issue you are running into
You are running into the Coupon collector's problem because you are using a technique of Rejection sampling.
You are also making strong assumptions about what a "random filling" is. Your algorithm will leave large gaps between circles; is this what you mean by "random"? Nevertheless it is a perfectly valid definition, and I approve of it.
Solution
To adapt your current "random filling" to avoid the rejection sampling coupon-collector's issue, merely divide the space you are filling into a grid. For example if your circles are of radius 1, divide the larger circle into a grid of 1/sqrt(2)-width blocks. When it becomes "impossible" to fill a gridbox, ignore that gridbox when you pick new points. Problem solved!
Possible dangers
You have to be careful how you code this however! Possible dangers:
If you do something like if (random point in invalid grid){ generateAnotherPoint() } then you ignore the benefit / core idea of this optimization.
If you do something like pickARandomValidGridbox() then you will slightly reduce the probability of making circles near the edge of the larger circle (though this may be fine if you're doing this for a graphics art project and not for a scientific or mathematical project); however if you make the grid size 1/sqrt(2) times the radius of the circle, you will not run into this problem because it will be impossible to draw blocks at the edge of the large circle, and thus you can ignore all gridboxes at the edge.
Implementation
Thus the generalization of your method to avoid the coupon-collector's problem is as follows:
Inputs: large circle coordinates/radius(R), small circle radius(r)
Output: set of coordinates of all the small circles
Algorithm:
divide your LargeCircle into a grid of r/sqrt(2)
ValidBoxes = {set of all gridboxes that lie entirely within LargeCircle}
SmallCircles = {empty set}
until ValidBoxes is empty:
pick a random gridbox Box from ValidBoxes
pick a random point inside Box to be center of small circle C
check neighboring gridboxes for other circles which may overlap*
if there is no overlap:
add C to SmallCircles
remove the box from ValidBoxes # possible because grid is small
else if there is an overlap:
increase the Box.failcount
if Box.failcount > MAX_PERGRIDBOX_FAIL_COUNT:
remove the box from ValidBoxes
return SmallCircles
(*) This step is also an important optimization, which I can only assume you do not already have. Without it, your doesThisCircleOverlapAnother(...) function is incredibly inefficient at O(N) per query, which will make filling in circles nearly impossible for large ratios R>>r.
This is the exact generalization of your algorithm to avoid the slowness, while still retaining the elegant randomness of it.
Generalization to larger irregular features
edit: Since you've commented that this is for a game and you are interested in irregular shapes, you can generalize this as follows. For any small irregular shape, enclose it in a circle that represent how far you want it to be from things. Your grid can be the size of the smallest terrain feature. Larger features can encompass 1x2 or 2x2 or 3x2 or 3x3 etc. contiguous blocks. Note that many games with features that span large distances (mountains) and small distances (torches) often require grids which are recursively split (i.e. some blocks are split into further 2x2 or 2x2x2 subblocks), generating a tree structure. This structure with extensive bookkeeping will allow you to randomly place the contiguous blocks, however it requires a lot of coding. What you can do however is use the circle-grid algorithm to place the larger features first (when there's lot of space to work with on the map and you can just check adjacent gridboxes for a collection without running into the coupon-collector's problem), then place the smaller features. If you can place your features in this order, this requires almost no extra coding besides checking neighboring gridboxes for collisions when you place a 1x2/3x3/etc. group.
One way to do this that produces interesting looking results is
create an empty NxM grid
create an empty has-open-neighbors set
for i = 1 to NumberOfRegions
pick a random point in the grid
assign that grid point a (terrain) type
add the point to the has-open-neighbors set
while has-open-neighbors is not empty
foreach point in has-open-neighbors
get neighbor-points as the immediate neighbors of point
that don't have an assigned terrain type in the grid
if none
remove point from has-open-neighbors
else
pick a random neighbor-point from neighbor-points
assign its grid location the same (terrain) type as point
add neighbor-point to the has-open-neighbors set
When done, has-open-neighbors will be empty and the grid will have been populated with at most NumberOfRegions regions (some regions with the same terrain type may be adjacent and so will combine to form a single region).
Sample output using this algorithm with 30 points, 14 terrain types, and a 200x200 pixel world:
Edit: tried to clarify the algorithm.
How about using a 2-step process:
Choose a bunch of n points randomly -- these will become the centres of the circles.
Determine the radii of these circles so that they do not overlap.
For step 2, for each circle centre you need to know the distance to its nearest neighbour. (This can be computed for all points in O(n^2) time using brute force, although it may be that faster algorithms exist for points in the plane.) Then simply divide that distance by 2 to get a safe radius. (You can also shrink it further, either by a fixed amount or by an amount proportional to the radius, to ensure that no circles will be touching.)
To see that this works, consider any point p and its nearest neighbour q, which is some distance d from p. If p is also q's nearest neighbour, then both points will get circles with radius d/2, which will therefore be touching; OTOH, if q has a different nearest neighbour, it must be at distance d' < d, so the circle centred at q will be even smaller. So either way, the 2 circles will not overlap.
My idea would be to start out with a compact grid layout. Then take each circle and perturb it in some random direction. The distance in which you perturb it can also be chosen at random (just make sure that the distance doesn't make it overlap another circle).
This is just an idea and I'm sure there are a number of ways you could modify it and improve upon it.