Collision detection of huge number of circles - algorithm

What is the best way to check collision of huge number of circles?
It's very easy to detect collision between two circles, but if we check every combination then it is O(n2) which definitely not an optimal solution.
We can assume that circle object has following properties:
Coordinates
Radius
Velocity
Direction
Velocity is constant, but direction can change.
I've come up with two solutions, but maybe there are some better solutions.
Solution 1
Divide whole space into overlapping squares and check for collision only with circles that are in the same square. Squares need to overlap so there won't be a problem when a circle moves from one square to another.
Solution 2
At the beginning distances between every pair of circles need to be calculated.
If the distance is small then these pair is stored in some list, and we need to check for collision in every update.
If the distance is big then we store after which update there can be a collision (it can be calculated because we know the distance and velocitites). It needs to be stored in some kind of priority queue. After previously calculated number of updates distance needs to be checked again and then we do the same procedure - put it on the list or again in the priority queue.
Answers to Mark Byers questions
Is it for a game?
It's for simulation, but it can be treated also as a game
Do you want to recalculate the new position every n milliseconds, and also check for collisions at this time?
Yes, time between update is constant.
Do you want to find the time at which the first/every collision occurs?
No, I want to find every collision and do 'something' when it occures.
How important is accuracy?
It depends on what do you mean by accuracy. I need to detect all collisions.
Is it a big problem if very small fast moving circles can pass through each other occasionally?
It can be assumed that speed is so small that it won't happen.

There are "spatial index" data-structures for storing your circles for quick comparison later; Quadtree, r-tree and kd-tree are examples.
Solution 1 seems to be a spatial index, and solution 2 would benefit from a spatial index every time you recalculate your pairs.
To complicate matters, your objects are moving - they have velocity.
It is normal to use spatial indexes for objects in games and simulations, but mostly for stationary objects, and typically objects that don't react to a collision by moving.
It is normal in games and such that you compute everything at set time intervals (discrete), so it might be that two objects pass through each other but you fail to notice because they moved so fast. Many games actually don't even evaluate collisions in strict chronological order. They have a spatial index for stationary objects e.g. walls, and lists for all the moving objects that they check exhaustively (although with relaxed discrete checks as I outlined).
Accurate continuous collision detection and where the objects react to collisions in simulations is usually much more demanding.
The pairs approach you outlined sounds promising. You might keep the pairs sorted by next collision, and reinsert them when they have collided in the appropriate new positions. You only have to sort the new generated collision list (O(n lg n)) for the two objects and then to merge two lists (the new collisions for each object, and the existing list of collisions; inserting the new collisions, removing those stale collisions that listed the two objects that collided) which is O(n).
Another solution to this is to adapt your spatial index to store the objects not strictly in one sector but in each that it has passed through since the last calculation, and do things discretely. This means storing fast moving objects in your spatial structure, and you'd need to optimise it for this case.
Remember that linked lists or lists of pointers are very bad for caching on modern processors. I'd advocate that you store copies of your circles - their important properties for collision detection at any rate - in an array (sequential memory) in each sector of any spatial index, or in the pairs you outlined above.
As Mark says in the comments, it could be quite simple to parallelise the calculations.

I assume you are doing simple hard-sphere molecular dynamic simulation, right? I came accros the same problem many times in Monte Carlo and molecular dynamic simulations. Both of your solutions are very often mentioned in literature about simulations. Personaly I prefer solution 1, but slightly modified.
Solution 1
Divide your space into rectangular cells that don't overlap. So when you check one circle for collision you look for all circles inside a cell that your first circle is, and look X cells in each direction around. I've tried many values of X and found that X=1 is the fastest solution. So you have to divide space into cells size in each direction equal to:
Divisor = SimulationBoxSize / MaximumCircleDiameter;
CellSize = SimulationBoxSize / Divisor;
Divisor should be bigger than 3, otherwise it will cause errors (if it is too small, you should enlarge your simulation box).
Then your algorithm will look like this:
Put all circles inside the box
Create cell structure and store indexes or pointers to circles inside a cell (on array or on a list)
Make a step in time (move everything) and update circles positions inside on cells
Look around every circle for collision. You should check one cell around in every direction
If there is a collision - do something
Go to 3.
If you will write it correctly then you would have something about O(N) complexity, because maximum number of circles inside 9 cells (in 2D) or 27 cells (in 3D) is constant for any total number of circles.
Solution 2
Ususaly this is done like this:
For each circle create a list of circles that are in distance R < R_max, calculate time after which we should update lists (something about T_update = R_max / V_max; where V_max is maximum current velocity)
Make a step in time
Check distance of each circle with circles on its list
If there is a collision - do something
If current time is bigger then T_update, go to 1.
Else go to 2.
This solution with lists is very often improved by adding another list with R_max_2 > R_max and with its own T_2 expiration time. In this solution this second list is used to update the first list. Of course after T_2 you have to update all lists which is O(N^2). Also be carefull with this T and T_2 times, because if collision can change velocity then those times would change. Also if you introduce some foreces to your system, then it will also cause velocity change.
Solution 1+2
You can use lists for collision detection and cells for updating lists. In one book it was written that this is the best solution, but I think that if you create small cells (like in my example) then solution 1 is better. But it is my opinion.
Other stuff
You can also do other things to improve speed of simulation:
When you calculate distance r = sqrt((x1-x2)*(x1-x2) + (y1-y2)*(y1-y2) + ...) you don't have to do square root operation. You can compare r^2 to some value - it's ok. Also you don't have to do all (x1-x2)*(x1-x2) operations (I mean, for all dimentions), because if x*x is bigger than some r_collision^2 then all other y*y and so on, summed up, would be bigger.
Molecular dynamics method is very easy to parallelise. You can do it with threads or even on GPU. You can calculate each distance in different thread. On GPU you can easly create thousends of threads almost costless.
For hard-spheres there is also effective algorithm that doesn't do steps in time, but instead it looks for nearest collision in time and jumps to this time and updates all positions. It can be good for not dense systems where collisions are not very probable.

one possible technique is to use the Delaunay triangulation on the center of your circles.
consider the center of each circle and apply the delaunay triangulation. this will tesselate your surface into triangles. this allows you to build a graph where each node stores the center of a triangle, and each edge connects to the center of a neighbour circle. the tesselation operated above will limit the number of neighbours to a reasonable value (6 neighbours on average)
now, when a circle moves, you have a limited set of circles to consider for collision. you then have to apply the tesselation again to the set of circles which are impacted by the move, but this operation involves only a very small subset of circles (the neighbours of the moving circle, and some neighbours of the neighbours)
the critical part is the first tesselation, which will take some time to perform, later tesselations are not a problem. and of course you need an efficient implementation of a graph in term of time and space...

Sub-divide your space up into regions and maintain a list of which circles are centred in each region.
Even if you use a very simple scheme, such as placing all the circles in a list, sorted by centre.x, then you can speed things up massively. To test a given circle, you only need to test it against the circles on either side of it in the list, going out until you reach one that has an x coordinate more than radius away.

You could make a 2D version of a "sphere tree" which is a special (and really easy to implement) case of the "spatial index" that Will suggested. The idea is to "combine" circles into a "containing" circle until you've got a single circle that "contains" the "huge number of circles".
Just to indicate the simplicity of computing a "containing circle" (top-of-my-head):
1) Add the center-locations of the two circles (as vectors) and scale by 1/2, thats the center of the containing circle
2) Subtract the center locations of the two circles (as vectors), add the radii and scale by 1/2, thats the radius of the containing circle

What answer is most efficient will depend somewhat on the density of circles. If the density is low, then placing placing a low-resolution grid over the map and marking those grid elements that contain a circle will likely be the most efficient. This will take approximately O(N*m*k) per update, where N is the total number of circles, m is the average number of circles per grid point, and k is the average number of grid points covered by one circle. If one circle moves more than one grid point per turn, then you have to modify m to include the number of grid points swept.
On the other hand, if the density is extremely high, you're best off trying a graph-walking approach. Let each circle contain all neighbors within a distance R (R > r_i for every circle radius r_i). Then, if you move, you query all the circles in the "forward" direction for neighbors they have and grab any that will be within D; then you forget all the ones in the backward direction that are now farther than D. Now a complete update will take O(N*n^2) where n is the average number of circles within a radius R. For something like a closely-spaced hexagonal lattice, this will give you much better results than the grid method above.

A suggestion - I am no game developer
Why not precalculate when the collisions are going to occur
as you specify
We can assume that circle object has following properties:
-Coordinates
-Radius
-Velocity
-Direction
Velocity is constant, but direction can change.
Then as the direction of one object changes, recalculate those pairs that are affected. This method is effective if directions do not change too frequently.

As Will mentioned in his answer, spacial partition trees are the common solution to this problem. Those algorithms sometimes take some tweaking to handle moving objects efficiently though. You'll want to use a loose bucket-fitting rule so that most steps of movement don't require an object to change buckets.
I've seen your "solution 1" used for this problem before and referred to as a "collision hash". It can work well if the space you're dealing with is small enough to be manageable and you expect your objects to be at least vaguely close to uniformly distributed. If your objects may be clustered, then it's obvious how that causes a problem. Using a hybrid approach of some type of a partition tree inside each hash-box can help with this and can convert a pure tree approach into something that's easier to scale concurrently.
Overlapping regions is one way to deal with objects that straddle the boundaries of tree buckets or hash boxes. A more common solution is to test any object that crosses the edge against all objects in the neighboring box, or to insert the object into both boxes (though that requires some extra handling to avoid breaking traversals).

If your code depends on a "tick" (and tests to determine if objects overlap at the tick), then:
when objects are moving "too fast" they skip over each other without colliding
when multiple objects collide in the same tick, the end result (e.g. how they bounce, how much damage they take, ...) depends on the order that you check for collisions and not the order that collisions would/should occur. In rare cases this can cause a game to lock up (e.g. 3 objects collide in the same tick; object1 and object2 are adjusted for their collision, then object2 and object3 are adjusted for their collision causing object2 to be colliding with object1 again, so the collision between object1 and object2 has to be redone but that causes object2 to be colliding with object3 again, so ...).
Note: In theory this second problem can be solved by "recursive tick sub-division" (if more than 2 objects collide, divide the length of the tick in half and retry until only 2 objects are colliding in that "sub-tick"). This can also cause games to lock up and/or crash (when 3 or more objects collide at the exact same instant you end up with a "recurse forever" scenario).
In addition; sometimes when game developers use "ticks" they also say "1 fixed length tick = 1 / variable frame rate", which is absurd because something that is supposed to be a fixed length can't depend on something variable (e.g. when the GPU is failing to achieve 60 frames per second the entire simulation goes in slow motion); and if they don't do this and have "variable length ticks" instead then both of the problems with "ticks" become significantly worse (especially at low frame rates) and the simulation becomes non-deterministic (which can be problematic for multi-player, and can result in different behavior when the player saves, loads or pauses the game).
The only correct way is to add a dimension (time), and give each object a line segment described as "starting coordinates and ending coordinates", plus a "trajectory after ending coordinates". When any object changes its trajectory (either because something unpredicted happened or because it reached its "ending coordinates") you'd find the "soonest" collision by doing a "distance between 2 lines < (object1.radius + object2.radius)" calculation for the object that changed and every other object; then modify the "ending coordinates" and "trajectory after ending coordinates" for both objects.
The outer "game loop" would be something like:
while(running) {
frame_time = estimate_when_frame_will_be_visible(); // Note: Likely to be many milliseconds after you start drawing the frame
while(soonest_object_end_time < frame_time) {
update_path_of_object_with_soonest_end_time();
}
for each object {
calculate_object_position_at_time(frame_time);
}
render();
}
Note that there are multiple ways to optimize this, including:
split the world into "zones" - e.g. so that if you know object1 would be passing through zones 1 and 2 then it can't collide with any other object that doesn't also pass through zone 1 or zone 2
keep objects in "end_time % bucket_size" buckets to minimize time taken to find "next soonest end time"
use multiple threads to do the "calculate_object_position_at_time(frame_time);" for each object in parallel
do all the "advance simulation state up to next frame time" work in parallel with "render()" (especially if most rendering is done by GPU, leaving CPU/s free).
For performance:
When collisions occur infrequently it can be significantly faster than "ticks" (you can do almost no work for relatively long periods of time); and when you have spare time (for whatever reason - e.g. including because the player paused the game) you can opportunistically calculate further into the future (effectively, "smoothing out" the overhead over time to avoid performance spikes).
When collisions occur frequently it will give you the correct results, but can be slower than a broken joke that gives you incorrect results under the same conditions.
It also makes it trivial to have an arbitrary relationship between "simulation time" and "real time" - things like fast forward and slow motion will not cause anything to break (even if the simulation is running as fast as hardware can handle or so slow that its hard to tell if anything is moving at all); and (in the absence of unpredictability) you can calculate ahead to an arbitrary time in the future, and (if you store old "object line segment" information instead of discarding it when it expires) you can skip to an arbitrary time in the past, and (if you only store old information at specific points in time to minimize storage costs) you can skip back to a time described by stored information and then calculate forward to an arbitrary time. These things combined also make it easy to do things like "instant slow motion replay".
Finally; it's also more convenient for multiplayer scenarios, where you don't want to waste a huge amount of bandwidth sending a "new location" for every object to every client at every tick.
Of course the downside is complexity - as soon as you want to deal with things like acceleration/deceleration (gravity, friction, lurching movement), smooth curves (elliptical orbits, splines) or different shaped objects (e.g. arbitrary meshes/polygons and not spheres/circles) the mathematics involved in calculating when the soonest collision will occur becomes significantly harder and more expensive; which is why game developers resort to the inferior "ticks" approach for simulations that are more complex than the case of N spheres or circles with linear motion.

Related

How to index nearby 3D points on the fly?

In physics simulations (for example n-body systems) it is sometimes necessary to keep track of which particles (points in 3D space) are close enough to interact (within some cutoff distance d) in some kind of index. However, particles can move around, so it is necessary to update the index, ideally on the fly without recomputing it entirely. Also, for efficiency in calculating interactions it is necessary to keep the list of interacting particles in the form of tiles: a tile is a fixed size array (eg 32x32) where the rows and columns are particles, and almost every row-particle is close enough to interact with almost every column particle (and the array keeps track of which ones actually do interact).
What algorithms may be used to do this?
Here is a more detailed description of the problem:
Initial construction: Given a list of points in 3D space (on the order of a few thousand to a few million, stored as array of floats), produce a list of tiles of a fixed size (NxN), where each tile has two lists of points (N row points and N column points), and a boolean array NxN which describes whether the interaction between each row and column particle should be calculated, and for which:
a. every pair of points p1,p2 for which distance(p1,p2) < d is found in at least one tile and marked as being calculated (no missing interactions), and
b. if any pair of points is in more than one tile, it is only marked as being calculated in the boolean array in at most one tile (no duplicates),
and also the number of tiles is relatively small if possible (but this is less important than being able to update the tiles efficiently)
Update step: If the positions of the points change slightly (by much less than d), update the list of tiles in the fastest way possible so that they still meet the same conditions a and b (this step is repeated many times)
It is okay to keep any necessary data structures that help with this, for example the bounding boxes of each tile, or a spatial index like a quadtree. It is probably too slow to calculate all particle pairwise distances for every update step (and in any case we only care about particles which are close, so we can skip most possible pairs of distances just by sorting along a single dimension for example). Also it is probably too slow to keep a full (quadtree or similar) index of all particle positions. On the other hand is perfectly fine to construct the tiles on a regular grid of some kind. The density of particles per unit volume in 3D space is roughly constant, so the tiles can probably be built from (essentially) fixed size bounding boxes.
To give an example of the typical scale/properties of this kind of problem, suppose there is 1 million particles, which are arranged as a random packing of spheres of diameter 1 unit into a cube with of size roughly 100x100x100. Suppose the cutoff distance is 5 units, so typically each particle would be interacting with (2*5)**3 or ~1000 other particles or so. The tile size is 32x32. There are roughly 1e+9 interacting pairs of particles, so the minimum possible number of tiles is ~1e+6. Now assume each time the positions change, the particles move a distance around 0.0001 unit in a random direction, but always in a way such that they are at least 1 unit away from any other particle and the typical density of particles per unit volume stays the same. There would typically be many millions of position update steps like that. The number of newly created pairs of interactions per step due to the movement is (back of the envelope) (10**2 * 6 * 0.0001 / 10**3) * 1e+9 = 60000, so one update step can be handled in principle by marking 60000 particles as non-interacting in their original tiles, and adding at most 60000 new tiles (mostly empty - one per pair of newly interacting particles). This would rapidly get to a point where most tiles are empty, so it is definitely necessary to combine/merge tiles somehow pretty often - but how to do it without a full rebuild of the tile list?
P.S. It is probably useful to describe how this differs from the typical spatial index (eg octrees) scenario: a. we only care about grouping close by points together into tiles, not looking up which points are in an arbitrary bounding box or which points are closest to a query point - a bit closer to clustering that querying and b. the density of points in space is pretty constant and c. the index has to be updated very often, but most moves are tiny
Not sure my reasoning is sound, but here's an idea:
Divide your space into a grid of 3d cubes, like this in three dimensions:
The cubes have a side length of d. Then do the following:
Assign all points to all cubes in which they're contained; this is fast since you can derive a point's cube from just their coordinates
Now check the following:
Mark all points in the top left of your cube as colliding; they're less than d apart. Further, every "quarter cube" in space is only the top left quarter of exactly one cube, so you won't check the same pair twice.
Check fo collisions of type (p, q), where p is a point in the top left quartile, and q is a point not in the top left quartile. In this way, you will check collision between every two points again at most once, because very pair of quantiles is checked exactly once.
Since every pair of points is either in the same quartile or in neihgbouring quartiles, they'll be checked by the first or the second algorithm. Further, since points are approximately distributed evenly, your runtime is much less than n^2 (n=no points); in aggregate, it's k^2 (k = no points per quartile, which appears to be approximately constant).
In an update step, you only need to check:
if a point crossed a boundary of a box, which should be fast since you can look at one coordinate at a time, and box' boundaries are a simple multiple of d/2
check for collisions of the points as above
To create the tiles, divide the space into a second grid of (non-overlapping) cubes whose width is chosen s.t. the average count of centers between two particles that almost interact with each other that fall into a given cube is less than the width of your tiles (i.e. 32). Since each particle is expected to interact with 300-500 particles, the width will be much smaller than d.
Then, while checking for interactions in step 1 & 2, assigne particle interactions to these new cubes according to the coordinates of the center of their interaction. Assign one tile per cube, and mark interacting particles assigned to that cube in the tile. Visualization:
Further optimizations might be to consider the distance of a point's closest neighbour within a cube, and derive from that how many update steps are needed at least to change the collision status of that point; then ignore that point for this many steps.
I suggest the following algorithm. E.g we have cube 1x1x1 and the cutoff distance is 0.001
Let's choose three base anchor points: (0,0,0) (0,1,0) (1,0,0)
Associate array of size 1000 ( 1 / 0.001) with each anchor point
Add three numbers into each regular point. We will store the distance between the given point and each anchor point inside these fields
At the same time this distance will be used as an index in an array inside the anchor point. E.g. 0.4324 means index 432.
Let's store the set of points inside of each three arrays
Calculate distance between the regular point and each anchor point every time when update point
Move point between sets in arrays during the update
The given structures will give you an easy way to find all closer points: it is the intersection between three sets. And we choose these sets based on the distance between point and anchor points.
In short, it is the intersection between three spheres. Maybe you need to apply additional filtering for the result if you want to erase the corners of this intersection.
Consider using the Barnes-Hut algorithm or something similar. A simulation in 2D would use a quadtree data structure to store particles, and a 3D simulation would use an octree.
The benefit of using a a tree structure is that it stores the particles in a way that nearby particles can be found quickly by traversing the tree, and far-away particles are in traversal paths that can be ignored.
Wikipedia has a good description of the algorithm:
The Barnes–Hut tree
In a three-dimensional n-body simulation, the Barnes–Hut algorithm recursively divides the n bodies into groups by storing them in an octree (or a quad-tree in a 2D simulation). Each node in this tree represents a region of the three-dimensional space. The topmost node represents the whole space, and its eight children represent the eight octants of the space. The space is recursively subdivided into octants until each subdivision contains 0 or 1 bodies (some regions do not have bodies in all of their octants). There are two types of nodes in the octree: internal and external nodes. An external node has no children and is either empty or represents a single body. Each internal node represents the group of bodies beneath it, and stores the center of mass and the total mass of all its children bodies.
demo

How to equally subdivide a closed CGPath?

I've an indeterminate number of closed CGPath elements of various shapes and sizes all containing a single concave bezier curve, like the red and blue shapes in the diagram below.
What is the simplest and most efficient method of dividing these shapes into n regions of (roughly) equal size?
What you want is Delaunay triangulation. Here is an example which resembles what you want to do. It uses an as3 library. Here is an iOS port, that should help you:
https://github.com/czgarrett/delaunay-ios
I don't really understand the context of what you want to achieve and what the constraints are. For instance, is there a hard requirement that the subdivided regions are equal size?
Often the solutions to a performance problem is not a faster algorithm but a different approach, usually one or more of the following:
Pre-compute the values, or compute as much as possible offline. Say by using another server API which is able to do the subdivision offline and cache the results for multiple clients. You could serve the post-computed result as a bitmap where each colour indexes into the table of values you want to display. Looking up the value would be a simple matter of indexing the pixel at the touch position.
Simplify or approximate a solution. Would a grid sub-division be accurate enough? At 500 x 6 = 3000 subdivisions, you only have about 51 square points for each region, that's a region of around 7x7 points. At that size the user isn't going to notice if the region is perfectly accurate. You may need to end up aggregating adjacent regions anyway due to touch resolution.
Progressive refinement. You often don't need to compute the entire algorithm up front. Very often algorithms run in discrete (often symmetrical) units, meaning you're often re-using the information from previous steps. You could compute just the first step up front, and then use a background thread to progressively fill in the rest of the detail. You could also defer final calculation until the the touch occurs. A delay of up to a second is still tolerable at that point, or in the worst case you can display an animation while the calculation is in progress.
You could use some hybrid approach, and possibly compute one or two levels using Delaunay triangulation, and then using a simple, fast triangular sub-division for two more levels.
Depending on the required accuracy, and if discreet samples are not required, the final levels could be approximated using a weighted average between the points of the triangle, i.e., if the touch is halfway between two points, pick the average value between them.

Randomly and efficiently filling space with shapes

What is the most efficient way to randomly fill a space with as many non-overlapping shapes? In my specific case, I'm filling a circle with circles. I'm randomly placing circles until either a certain percentage of the outer circle is filled OR a certain number of placements have failed (i.e. were placed in a position that overlapped an existing circle). This is pretty slow, and often leaves empty spaces unless I allow a huge number of failures.
So, is there some other type of filling algorithm I can use to quickly fill as much space as possible, but still look random?
Issue you are running into
You are running into the Coupon collector's problem because you are using a technique of Rejection sampling.
You are also making strong assumptions about what a "random filling" is. Your algorithm will leave large gaps between circles; is this what you mean by "random"? Nevertheless it is a perfectly valid definition, and I approve of it.
Solution
To adapt your current "random filling" to avoid the rejection sampling coupon-collector's issue, merely divide the space you are filling into a grid. For example if your circles are of radius 1, divide the larger circle into a grid of 1/sqrt(2)-width blocks. When it becomes "impossible" to fill a gridbox, ignore that gridbox when you pick new points. Problem solved!
Possible dangers
You have to be careful how you code this however! Possible dangers:
If you do something like if (random point in invalid grid){ generateAnotherPoint() } then you ignore the benefit / core idea of this optimization.
If you do something like pickARandomValidGridbox() then you will slightly reduce the probability of making circles near the edge of the larger circle (though this may be fine if you're doing this for a graphics art project and not for a scientific or mathematical project); however if you make the grid size 1/sqrt(2) times the radius of the circle, you will not run into this problem because it will be impossible to draw blocks at the edge of the large circle, and thus you can ignore all gridboxes at the edge.
Implementation
Thus the generalization of your method to avoid the coupon-collector's problem is as follows:
Inputs: large circle coordinates/radius(R), small circle radius(r)
Output: set of coordinates of all the small circles
Algorithm:
divide your LargeCircle into a grid of r/sqrt(2)
ValidBoxes = {set of all gridboxes that lie entirely within LargeCircle}
SmallCircles = {empty set}
until ValidBoxes is empty:
pick a random gridbox Box from ValidBoxes
pick a random point inside Box to be center of small circle C
check neighboring gridboxes for other circles which may overlap*
if there is no overlap:
add C to SmallCircles
remove the box from ValidBoxes # possible because grid is small
else if there is an overlap:
increase the Box.failcount
if Box.failcount > MAX_PERGRIDBOX_FAIL_COUNT:
remove the box from ValidBoxes
return SmallCircles
(*) This step is also an important optimization, which I can only assume you do not already have. Without it, your doesThisCircleOverlapAnother(...) function is incredibly inefficient at O(N) per query, which will make filling in circles nearly impossible for large ratios R>>r.
This is the exact generalization of your algorithm to avoid the slowness, while still retaining the elegant randomness of it.
Generalization to larger irregular features
edit: Since you've commented that this is for a game and you are interested in irregular shapes, you can generalize this as follows. For any small irregular shape, enclose it in a circle that represent how far you want it to be from things. Your grid can be the size of the smallest terrain feature. Larger features can encompass 1x2 or 2x2 or 3x2 or 3x3 etc. contiguous blocks. Note that many games with features that span large distances (mountains) and small distances (torches) often require grids which are recursively split (i.e. some blocks are split into further 2x2 or 2x2x2 subblocks), generating a tree structure. This structure with extensive bookkeeping will allow you to randomly place the contiguous blocks, however it requires a lot of coding. What you can do however is use the circle-grid algorithm to place the larger features first (when there's lot of space to work with on the map and you can just check adjacent gridboxes for a collection without running into the coupon-collector's problem), then place the smaller features. If you can place your features in this order, this requires almost no extra coding besides checking neighboring gridboxes for collisions when you place a 1x2/3x3/etc. group.
One way to do this that produces interesting looking results is
create an empty NxM grid
create an empty has-open-neighbors set
for i = 1 to NumberOfRegions
pick a random point in the grid
assign that grid point a (terrain) type
add the point to the has-open-neighbors set
while has-open-neighbors is not empty
foreach point in has-open-neighbors
get neighbor-points as the immediate neighbors of point
that don't have an assigned terrain type in the grid
if none
remove point from has-open-neighbors
else
pick a random neighbor-point from neighbor-points
assign its grid location the same (terrain) type as point
add neighbor-point to the has-open-neighbors set
When done, has-open-neighbors will be empty and the grid will have been populated with at most NumberOfRegions regions (some regions with the same terrain type may be adjacent and so will combine to form a single region).
Sample output using this algorithm with 30 points, 14 terrain types, and a 200x200 pixel world:
Edit: tried to clarify the algorithm.
How about using a 2-step process:
Choose a bunch of n points randomly -- these will become the centres of the circles.
Determine the radii of these circles so that they do not overlap.
For step 2, for each circle centre you need to know the distance to its nearest neighbour. (This can be computed for all points in O(n^2) time using brute force, although it may be that faster algorithms exist for points in the plane.) Then simply divide that distance by 2 to get a safe radius. (You can also shrink it further, either by a fixed amount or by an amount proportional to the radius, to ensure that no circles will be touching.)
To see that this works, consider any point p and its nearest neighbour q, which is some distance d from p. If p is also q's nearest neighbour, then both points will get circles with radius d/2, which will therefore be touching; OTOH, if q has a different nearest neighbour, it must be at distance d' < d, so the circle centred at q will be even smaller. So either way, the 2 circles will not overlap.
My idea would be to start out with a compact grid layout. Then take each circle and perturb it in some random direction. The distance in which you perturb it can also be chosen at random (just make sure that the distance doesn't make it overlap another circle).
This is just an idea and I'm sure there are a number of ways you could modify it and improve upon it.

Broad-phase collision detection methods?

I'm building a 2D physics engine and I want to add broad-phase collision detection, though I only know of 2 or 3 types:
Check everything against everything else (O(n^2) complexity)
Sweep and Prune (sort and sweep)
something about Binary Space Partition (not sure how to do this)
But surely there's more options right? what are they? And can either a basic description of each be provided or links to descriptions?
I've seen this but I'm asking for a list of algorithms available, not the best one for my needs.
In this case, "Broad phase collision detection" is a method used by physics engines to determine which bodies in their simulation are close enough to warrant further investigation and possibly collision resolution.
The best approach depends on the specific use, but the bottom line is that you want to subdivide your world space such that (a) every body is in exactly one subdivision, (b) every subdivision is large enough that a a body in a particular subdivision can only collide with bodies in that same subdivision or an adjacent subdivision, and (c) the number of bodies in a particular subdivision is as small as possible.
How you do that depends on how many bodies you have, how they're moving, what your performance requirements are, and how much time you want to spend on your engine. If you're talking about bodies moving around in a largely open space, the simplest technique would be divide the world into a grid where each cell is larger than your largest object, and track the list of objects in each cell. If you're building something on the scale of a classic arcade game, this solution may well suffice.
If you're dealing with bodies moving in a larger open world, a simple grid will become overwhelming pretty quickly, and you'll probably want some sort of a tree-based structure like quadtrees, as Arriu suggests.
If you're talking about moving bodies around within bounded spaces instead of open spaces, then you may consider a BSP tree; the tree partitions the world into 'space you can walk in' and 'walls', and clipping a body into the tree determines whether it's in a legal position. Depending on the world geometry, you can also use a BSP for your broad-phase detection of collisions between bodies in the world.
Another option for bodies moving in bounded space would be a portal engine; if your world can consist of convex polygonal regions where each side of the polygon is either a solid wall or a 'portal' to another concave space, you can easily determine whether a body is within a region with a point-in-polygon test and simplify collision detection by only looking at bodies in the same region or connected regions.
An alternative to QuadTrees or BSPTrees are SphereTrees (CircleTrees in 2D, the implementation would be more or less the same). The advantage that SphereTrees have are that they handle large loads of dynamic objects very well. If you're objects are constantly moving, BSPTrees and QuadTrees are much slower in their updates than a Sphere/Circle Tree would be.
If you have a good mix of static and dynamic objects, a reasonably good solution is to use a QuadTree/BSPTree for the statics and a Sphere/Cicle Tree for the dynamic objects. Just remember that for any given object, you would need to check it against both trees.
I recommend quadtree partitioning. It's pretty simple and it works really well. Here is a Flash demo of brute-force collision detection vs. quadtree collision detection. (You can tell it to show the quadtree structure.) Did you notice how quadtree collision detection is only 3% of brute force in that demo?
Also, if you are serious about your engine then I highly recommend you pick up real-time collision detection. It's not expensive and it's a really great book which covers everything you would ever want to know. (Including GPU based collision detection.)
All of the acceleration algorithms depend on using an inexpensive test to quickly rule out objects (or groups of objects) and thereby cut down on the number of expensive tests you have to do. I view the algorithms in categories, each of which has many variations.
Spatial partitioning. Carve up space and cheaply exclude candidates that are in different regions. For example, BSP, grids, octrees, etc.
Object partitioning. Similar to #1, but the clustering is focused on the objects themselves more than the space they reside in. For example, bounding volume hierarchies.
Sort and sweep methods. Put the objects in order spatially and rule out collisions among ones that aren't adjacent.
1 and 2 are often hierarchical, recursing into each partition as needed. With 3, you can optionally iterate along different dimensions.
Trade-offs depend a lot on scene geometry. Do objects cluster or are they evenly or sparsely distributed? Are they all about the same size or are there huge variations in size? Is the scene dynamic? Can you afford a lot of preprocessing time?
The "inexpensive" tests are actually along a spectrum of really-cheap to kind-of-expensive. Choosing the best algorithm means minimizing the ratio of the cost of the inexpensive testing to the reduction in the number of expensive tests. Beyond the algorithmic concerns, you get into tuning, like worrying about cache locality.
An alternative are plain grids, say 20x20 or 100x100 (depends on your world and memory size). The implementation is simpler than a recursive structure such as quad/bsp-trees (or sphere trees for that matter).
Objects crossing cell borders are a bit simpler in this case, and do not degenerate as much as an naive implementation of a bsp/quad/oct-tree might do.
Using that (or other techinques), you should be able to quickly cull many pairs and get a set of potential collisions that need further investigation.
I just came up with a solution that doesn't depend on grid size and is probably O(nlogn) (that is the optimum when there are no collisions) though worst at O(nnlogn) (when everything collides).
I also implemented and tested it, here is the link to the source. But I haven't compared it to the brute force solution.
A description of how it works:
(I'm looking for collisions of rectangles)
on x axis I sort the rectangles according to their right edge ( O(nlogn) )
for every rect in the sorted list I take the left edge and do a binary search until I find the rightmost edge at the left of the left edge and insert these rectangles between these indices in a possible_Collision_On_X_Axis set ( those are n rectangles, logn for the binary search, n inserts int the set at O(log n)per insert)
on y axis I do the same
in each of the sets I now have possible collisions on x (in one set) and on y(int the other), I intersect these sets and now I have the collisions on both the x axis and y axis (that means I take the common elements) (O(n))
sorry for the poor description, I hope you understand better from the source, also an example illustrated here: image
You might want to check out what Scott did in Chipmunk with spacial hashing. The source is freely available. I think he used a similar technique to Box-2D if not for collision, definitely for contact persistence.
I've used a quad-tree in a larger project, which is good for game objects that don't move much (less removals & re-insertions). The same principle applies for octrees.
The basic Idea Is, you create a recursive tree structure, which stores 4(for quad), or 8(oct) child objects of the same type as the tree root. Each node in the tree represents a section of Cartesian space, for example, -100 -> +100 on each applicable axis. each child represents that same space, but subdivided by half (an immediate child of the example would represent, for example, -100->0 on X axis, and -100->0 on Y axis).
Imagine a square, or plane, with a given set of dimensions. Now draw a line through the centre on each axis, dividing that plane into 4 smaller planes. Now take one of them and do It again, and again, until you reach a point when the size of the plane segment Is roughly the size of a game object. Now you have your tree. Only objects occupying the same plane, can possibly collide. When you have determined which objects can collide, You generate pairs of possible collisions from them. At this stage, broadphase Is complete, and you can move onto narrow phase collision detection, which Is where your more precise, and expensive calculations are.
The purpose of Broadphase, Is to use inexpensive calculations to generate possible collisions, and cull out contacts that cannot occur, thus reducing the work your narrow phase algorithm has to perform, In turn, making your entire collision detection algorithm more efficient.
So before you go ahead and attempt to implement such an algorithm, as yourself:
How many objects are in my game?
If there are a lot, I probably need a broadphase. If not, then the Nnarrowphase should suffice.
Also, am I dealing with many moving objects?
Tree structures generally are slowed down by moving objects, as they can change their position in the tree over time, simply by moving. This requires that objects be removed and reinserted each frame (potentially), which is less than Ideal.
If this is the case, you would be better off with Sweep and Prune, which maintains min/max heaps of the extents of the shapes in your world. Objects do not need to be reinserted, but the heaps need to be resorted each frame, thought this is generally faster than a tree wide, traversal with removals and reinsertions.
Depending on your coding experience, one may be more complicated to code over another. Personally I have found trees to be more intuitive to code, but less efficient, and more prone to error, since they raise other issues, such as what to do if you have an object that sits directly on top of an axis, or the centre of the root node. This can be solved by using loose trees, which have some spacial overlap, so that one child node is always prioritised over others when such a case occurs.
If the space that your objects move within is bounded, then you could use a grid to subdivide your objects. Put each object into every grid cell which the object covers (fully or partially). To check if object A collides with any other object, determine which grid cells object A covers, then get the list of unique objects in those cells, and finally test object A against each unique object. This approach works best if most objects are usually contained in a single grid cell.
If your space is not bounded, then you will need to implement some sort of dynamic grid that can grow on demand in each of the four directions (in 2D).
The advantage of this approach over more adaptive algorithsm (i.e. BSP, Quadtree, Circletree) is that the cells can be accessed in O(1) time (i.e. constant time) rather than O(log N) time (i.e. logarithmic time). However these latter algorithms are able to adapt themselves to large variability in the density of objects. The grid approach works best when object density is fairly constant across the space.
I would like to recommend the introductory reference to game physics from Ian Millington, Game Physics Engine Development. It has a great section on broad phase collision detection with sample code.

Spatial Data Structures for moving objects?

I was wondering what is the best data structure to deal with a lot of moving objects(spheres, triangles, boxes, points, etc)? I'm trying to answer two questions, Nearest Neighbor and Collsion detection.
I do realize that traditionally, data structures like R trees are used for nearest neighbor queries and Oct/Kd/BSP are used for collision detection problems dealing with static objects, or with very few moving objects.
I'm just hoping that there is something else out there that is better.
I appreciate all the help.
You can partition the scene in an octree and utilize scene coherence. Your moving object that you are testing is going to be in a specific node of the tree for a specific amount of frames depending on its speed. This means you can assume it will be in the node and therefore quickly cut out a lot of objects that are not in the node. Of course the tricky bit is when your object is close to the edge of your partition, you would have to make sure you update which node the object is in.
A moving object has a direction and a speed. You can easily divide the scene in two sections by using a perpendicular division from your objects direction of movement. Any object on the wrong side of this division do not need to be checked. Of course compensate for the other object's velocity. So if the other object is reasonable slow, you can easily cut it out from further checks.
Always simplify whatever shape you are testing with something like an axis aligned bounding box. An initial collision test
You can take the distance between your object and another moving object plus the velocities. If the other moving object is moving in the same general direction at a faster velocity, you can eliminate it from your check.
There are many other optimizations depending on the object's shape. Circles or squares or more complex shapes all have specific optimizations you can do while moving.
Assuming some objects do stop or can stop moving, you can keep track of objects that stop. these objects can than be treated like static objects and therefore the checks are faster and you can apply all the static optimization techniques to them.
I know of more but can't think of any off the top of my head. Haven't done this in a while.
Now of course this depends on how you are doing the collision detection. Are you incrementally updating the object's position based on velocity and checking as though it is static. Or are you compensating for velocity by extruding the shape and figuring out initial collision points. The former needs a small step for fast moving object. The latter is more complicated and costly but gives better results. Also if you are going to have rotating objects then things get a little more tricky.
Bounding Spheres probably would help you with many moving objects; you calculate the radius squared of the object, and track it from it's center. If the squared distance between the centers of two objects is less than the sum of the squared radii of the two objects, then you have a potential collision. Everything done with squared distances; no square roots.
You can sort nearest neighbors by the minimum squared distance between your objects. Collision detection can get complex, of course, and with non-spherically shaped objects, Bounding Spheres won't necessarily get you collision information, but it can prune your tree of objects you need to compare for collision quite nicely.
You'll need, of course, to track the CENTER of your object; and ideally you'd like for each object to be rigid, to avoid having to recalculate bounding sphere sizes (although the recalculation isn't particularly tough, particularly if you use a tree of rigid objects each with their own bounding sphere for the objects which are non-rigid; but it gets complicated).
Basically, to answer your question about data structures, you can keep all of your objects in a master array; I'd have a set of "region" arrays which consist of references to the objects in the master array that you can sort the objects into quickly based upon their cartesian coordinates; the "region' arrays should have an overlap defined of at least 2x the largest object radius in your master object array (if that's feasible; a question of object bounding sphere scaling vs. number of objects obviously comes up).
Once you've got that, you can do a quick collision test by comparing distances of all objects within a region to each other; again, this is where region definition becomes important, because you're doing a significant tradeoff of number of regions to number of comparisons. However, it's a little simpler just because your distance comparisons come down to simple subtractions (and abs() operations, of course).
Of course, then you have to do actual collision detection between your non-spherical objects, and that can be non-trivial, but you've reduced the number of potential comparisons very dramatically at that point.
Basically, it's a two-tiered system; the first is the region array, whereby you do a rough sort on your scene. Second, you have your intra-region distance comparison; wherein you're going to do your basic collision detection and collision flagging on the objects which have collided.
There probably is room in this sort of algorithm for use of trees in dynamic region determination to even out your region sizes to assure that your region size doesn't grow too rapidly with "crowded" regions; that sort of thing, though, is non-trivial, because with objects of differing sizes, your sort on density becomes... complex, to say the least.
One interesting method for doing collision detection is to use axially aligned bounding boxes (AABB's) organised within a special octree structure. The AABB's help because they make the coarse collision computations very quick to do because you only need to perform up to 6 comparisons.
There are a couple of tricks you should use with the octree structure:
1) The bounding area which a node occupies should be twice as large as it would be for a normal octree (where the octree partitions the space without overlap). As each node should overlap half of the area of its adjacent nodes. Since the AABB can belong to only one node this extra size and overlap allows it to remain in the one node for a longer period of time.
2) Also in each node - and probably in each level of the hierarchy - you keep links to the node's neighbours. This will involve a lot of extra code but it will allow you to move elements between nodes in close to O(1) time rather than O(2logn) time.
If the octree is taking up too much memory you could change it to use a sparse octree structure, only keeping the tree for the parts of the game world that actually contained entities. However this would mean that you'd have to perform more computations for each frame when entities move through the world.
Some other ideas you might want to try instead of an octree is to use a kd-tree (I believe that's the correct name), or use AABB's to build the structure from the bottom up.
KD trees (from memory) partition the space using axially aligned planes, thus they are a good fit for use with AABB.
The other idea is instead of forcing the octree generation from the top down (starting with a box envoloping the whole world), you build it up from the base entities and construct bigger AABB's that grow until the biggest one encompasses the whole world. Though I'm unsure of how it would work with many fast moving entities.
(It's been a while since I've done this kind of spatial game development coding.)
RDC may be of use, especially if your objects are sparse (not many intersections).
sweep and prune broad phase + gjk narrow phase

Resources