How to group shapes into sets of sets of overlapping shapes - algorithm

Please note I am interested in the most efficient way to solve the problem, and not looking for a recommendation to use a particular library.
I have a large number (~200,000) of 2D shapes (rectangles, polygons, circles, etc) that I want to sort into overlapping groups. For example, in the picture, the green and orange would be marked as group 1 and the black, red, and blue would be marked as group 2.
Example
Let's say I will be given a list<Shape>. A slow solution would be:
(I haven't run this code - just an example algorithm)
// Everything initialized with groupid = 0
int groupid = 1;
for (int i = 0; i < shapes.size(); ++i)
{
if (shapes[i].groupid)
{
// The shape hasn't been assigned a group yet
// so assign it now
shapes[i].groupid = groupid;
++groupid;
}
// As we compare shapes, we may find that a shape overlaps with something
// that was already assigned a group. This keeps track of groups that
// should be merged.
set<int> mergingGroups = set<int>();
// Compare to all other shapes
for (int j = i + 1; j < shapes.size(); ++j)
{
// If this is one of the groups that we are merging, then
// we don't need to check overlap, just merge them
if (shapes[j].groupid && mergingGroups.contains(shapes[j].groupid))
{
shapes[j].groupid = shapes[i].groupid;
}
// Otherwise, if they overlap, then mark them as the same group
else if (shapes[i].overlaps(shapes[j]))
{
if (shapes[j].groupid >= 0)
{
// Already have a group assigned
mergingGroups.insert(shapes[j].groupid;
}
// Mark them as part of the same group
shapes[j].groupid = shapes[i].groupid
}
}
}
A faster solution would be to put the objects into a spacial tree to reduce the number of j object overlap comparisons (the inner loop), but I would still have to iterate over the rest to merge groups.
Is there anything faster?
Thanks!

Hopefully this helps someone - this is what I actually implemented (in pseudocode).
tree = new spatial tree
for each shape in shapes
set shape not in group
add shape to tree
for each shape in shapes
if shape in any group
skip shape
cur_group = new group
set shape in cur_group
regions = new stack
insert bounds(shape) into regions
while regions has items
cur_bounds = pop(regions)
for test_shape in find(tree, cur_bounds)
if test_shape has group
skip test_shape
if overlaps(any in cur_group, test_shape)
insert bounds(tester) into regions
set test_shape group = cur_group

If you effectively find all pairwise intersections with a spatial tree, you can use union-find algorithm to group objects

Related

Is there a simple way to separate randomly spawned squares?

What I am currently doing is spawning some rooms which are basically just rectangles. What I want to do now is just to separate them so that they are not overlapping.
I'm working on a small SDL based framework.
What I'm looking for is some simple algorithm or way that I can achieve this.
What I currently have is the following:
static void SeparateRooms(Room& r1, Room& r2)
{
if ( utils::IsOverlapping(r1.GetRect(), r2.GetRect()) )
{
constexpr int moveIncrement{ 3 };
if (r1.GetArea() > r2.GetArea())
{
const Vector2f smallToBigVector{ r1.GetPosition() - r2.GetPosition() };
const Vector2f smallToBigNormalized{ smallToBigVector.Normalized() };
r1.m_Rect.left += moveIncrement * smallToBigNormalized.x;
r1.m_Rect.bottom += moveIncrement * smallToBigNormalized.y;
}
else
{
const Vector2f smallToBigVector{ r2.GetPosition() - r1.GetPosition() };
const Vector2f smallToBigNormalized{ smallToBigVector.Normalized() };
r2.m_Rect.left += moveIncrement * smallToBigNormalized.x;
r2.m_Rect.bottom += moveIncrement * smallToBigNormalized.y;
}
}
}
This works, but only when the rectangles are not surrounding another rectangle. Is there some tweak that I can do to make this work or another way I can approach the problem?
Here's a different approach:
Start with a single rectangle the size of your bounding box.
To add a rectangle, take the largest rectangle, find its longest axis, and split it into 2 rectangles along that axis.
Rinse & repeat
If asymmetry is desirable, then when making the split you can pick a random point along the access (possibly avoiding some area near the endpoints to avoid very tiny rooms.
If you want rooms that aren't adjacent, shrink the rectangles towards their centers by some percentage.
If you want more empty space and asymmetry, generate extra rectangles but don't make them rooms.
If you want even more asymmetry, after shrinking the rooms, move them to a random location inside their pre-shrinkage boundaries.
If you wan't more variability in room size, in the second bullet, take a random room instead of the largest room.

Given a mesh face, find its neighboring faces

I'm trying to efficiently find all neighboring faces of a given face.
I'm taking a sly approach, but I wonder if can be improved upon.
The approach I've taken so far is to create a data structure after my mesh geometry is created. I build a hash of arrays to map vertices to the faces that are comprised of them:
var vertexToFace = [];
function crossReference(g) {
for (var fx = 0; fx < g.faces.length; fx++) {
vertexToFace[fx] = new Array();
}
for (var fx = 0; fx < g.faces.length; fx++) {
var f = g.faces[fx];
var ax = f.a;
var bx = f.b;
var cx = f.c;
vertexToFace[ax].push(fx);
vertexToFace[bx].push(fx);
vertexToFace[cx].push(fx);
}
}
Now that I have the hash of arrays, I can retrieve a given face's neighbors:
var neighbors = [];
neighbors.push( vertexToFace(face.a), vertexToFace(face.b), vertexToFace(face.c) );
This works fine, but I'm wondering if its over-kill. I know that each face in geometry.faces contains members a,b,c that are indices into geometry.vertices.
I don't believe the reverse information is stored, although, tantalizingly, each vertex in geometry.vertices does have the member .index, but it doesn't seem to correspond to faces.
Am I missing something obvious?
Thanks/.
I think that
for (var fx = 0; fx < g.faces.length; fx++) {
vertexToFace[fx] = new Array();
}
should be changed to
for (var fx = 0; fx < g.vertices.length; fx++) {
vertexToFace[fx] = new Array();
}
since otherwise, it seems that there won't be enough elements in vertexToFace if the number of vertices is greater than the number of faces. You can also simplify a bit with
vertexToFace[fx] = [];
To me it depends on your use case. If you wanna do this just once then to me it's indeed and overkill and I would just iterate through the faces to find the neighbours. Cost is O(#G) to speak in complexity terms #G being the number of faces. So not bad. The next time you do it cost will again be O(#G).
In your case cost is O(#G) to create the structure and then for retrieving neighbors it depends on array implementation but probably O(log(#V)) where #V is number of vertices. So first retrieval cost is O(#G+log(#V)) so > O(#G) Next retrieval however is again O(log(#V)) so in case you have lots of those retrieval operations it seems to make sense to use your way in most cases. However you will need to keep track when the geometry is changed as this would invalidate your structure. So as so often - it depends...
Just a side note - question is also what you consider a neighbor. One vertex in common or and edge... Just mentioning it cause I just had a case where it was the later.

Determine if a point is in the union of rectangles

I have a set of axis parallel 2d rectangles defined by their top left and bottom right hand corners(all in integer coordinates). Given a point query, how can you efficiently determine if it is in one of the rectangles? I just need a yes/no answer and don't need to worry about which rectangle it is in.
I can check if (x,y) is in ((x1, y1), (x2, y2)) by seeing if x is between x1 and x2 and y is between y1 and y2. I can do this separately for each rectangle which runs in linear time in the number of rectangles. But as I have a lot of rectangles and I will do a lot of point queries I would like something faster.
The answer depends a little bit on how many rectangles you have. The brute force method checks your coordinates against each rectangular pair in turn:
found = false
for each r in rectangles:
if point.x > r.x1 && point.x < r.x2:
if point.y > r.y1 && point.y < r.y2
found = true
break
You can get more efficient by sorting the rectangles into regions, and looking at "bounding rectangles". You then do a binary search through a tree of ever-decreasing bounding rectangles. This takes a bit more work up front, but it makes the lookup O(ln(n)) rather than O(n) - for large collections of rectangles and many lookups, the performance improvement will be significant. You can see a version of this (which looks at intersection of a rectangle with a set of rectangles - but you easily adapt to "point within") in this earlier answer. More generally, look at the topic of quad trees which are exactly the kind of data structure you would need for a 2D problem like this.
A slightly less efficient (but faster) method would sort the rectangles by lower left corner (for example) - you then need to search only a subset of the rectangles.
If the coordinates are integer type, you could make a binary mask - then the lookup is a single operation (in your case this would require a 512MB lookup table). If your space is relatively sparsely populated (i.e. the probability of a "miss" is quite large) then you could consider using an undersampled bit map (e.g. using coordinates/8) - then map size drops to 8M, and if you have "no hit" you save yourself the expense of looking more closely. Of course you have to round down the left/bottom, and round up the top/right coordinates to make this work right.
Expanding a little bit with an example:
Imagine coordinates can be just 0 - 15 in x, and 0 - 7 in y. There are three rectangles (all [x1 y1 x2 y2]: [2 3 4 5], [3 4 6 7] and [7 1 10 5]. We can draw these in a matrix (I mark the bottom left hand corner with the number of the rectangle - note that 1 and 2 overlap):
...xxxx.........
...xxxx.........
..xxxxx.........
..x2xxxxxxx.....
..1xx..xxxx.....
.......xxxx.....
.......3xxx.....
................
You can turn this into an array of zeros and ones - so that "is there a rectangle at this point" is the same as "is this bit set". A single lookup will give you the answer. To save space you could downsample the array - if there is still no hit, you have your answer, but if there is a hit you would need to check "is this real" - so it saves less time, and savings depend on sparseness of your matrix (sparser = faster). Subsampled array would look like this (2x downsampling):
.oxx....
.xxooo..
.oooxo..
...ooo..
I use x to mark "if you hit this point, you are sure to be in a rectangle", and o to say "some of these are a rectangle". Many of the points are now "maybe", and less time is saved. If you did more severe downsampling you might consider having a two-bit mask: this would allow you to say "this entire block is filled with rectangles" (i.e. - no further processing needed: the x above) or "further processing needed" (like the o above). This soon starts to be more complicated than the Q-tree approach...
Bottom line: the more sorting / organizing of the rectangles you do up front, the faster you can do the lookup.
My favourite for a variety of 2D geometry queries is Sweep Line Algorithm. It's widely utilize in CAD software, which would be my wild guess for the purpose of your program.
Basically, you order all points and all polygon vertices (all 4 rectangle corners in your case) along X-axis, and advance along X-axis from one point to the next. In case of non-Manhattan geometries you would also introduce intermediate points, the segment intersections.
The data structure is a balanced tree of the points and polygon (rectangle) edge intersections with the vertical line at the current X-position, ordered in Y-direction. If the structure is properly maintained it's very easy to tell whether a point at the current X-position is contained in a rectangle or not: just examine Y-orientation of the vertically adjacent to the point edge intersections. If rectangles are allowed to overlap or have rectangle holes it's just a bit more complicated, but still very fast.
The overall complexity for N points and M rectangles is O((N+M)*log(N+M)). One can actually prove that this is asymptotically optimal.
Store the coordinate parts of your rectangles to a tree structure. For any left value make an entry that points to corresponding right values pointing to corresponding top values pointing to corresponding bottom values.
To search you have to check the x value of your point against the left values. If all left values do not match, meaning they are greater than your x value, you know the point is outside any rectangle. Otherwise you check the x value against the right values of the corresponding left value. Again if all right values do not match, you're outside. Otherwise the same with top and bottom values. Once you find a matching bottom value, you know you are inside of any rectangle and you are finished checking.
As I stated in my comment below, there are much room for optimizations, for example minimum left and top values and also maximum right and botom values, to quick check if you are outside.
The following approach is in C# and needs adaption to your preferred language:
public class RectangleUnion
{
private readonly Dictionary<int, Dictionary<int, Dictionary<int, HashSet<int>>>> coordinates =
new Dictionary<int, Dictionary<int, Dictionary<int, HashSet<int>>>>();
public void Add(Rectangle rect)
{
Dictionary<int, Dictionary<int, HashSet<int>>> verticalMap;
if (coordinates.TryGetValue(rect.Left, out verticalMap))
AddVertical(rect, verticalMap);
else
coordinates.Add(rect.Left, CreateVerticalMap(rect));
}
public bool IsInUnion(Point point)
{
foreach (var left in coordinates)
{
if (point.X < left.Key) continue;
foreach (var right in left.Value)
{
if (right.Key < point.X) continue;
foreach (var top in right.Value)
{
if (point.Y < top.Key) continue;
foreach (var bottom in top.Value)
{
if (point.Y > bottom) continue;
return true;
}
}
}
}
return false;
}
private static void AddVertical(Rectangle rect,
IDictionary<int, Dictionary<int, HashSet<int>>> verticalMap)
{
Dictionary<int, HashSet<int>> bottomMap;
if (verticalMap.TryGetValue(rect.Right, out bottomMap))
AddBottom(rect, bottomMap);
else
verticalMap.Add(rect.Right, CreateBottomMap(rect));
}
private static void AddBottom(
Rectangle rect,
IDictionary<int, HashSet<int>> bottomMap)
{
HashSet<int> bottomList;
if (bottomMap.TryGetValue(rect.Top, out bottomList))
bottomList.Add(rect.Bottom);
else
bottomMap.Add(rect.Top, new HashSet<int> { rect.Bottom });
}
private static Dictionary<int, Dictionary<int, HashSet<int>>> CreateVerticalMap(
Rectangle rect)
{
var bottomMap = CreateBottomMap(rect);
return new Dictionary<int, Dictionary<int, HashSet<int>>>
{
{ rect.Right, bottomMap }
};
}
private static Dictionary<int, HashSet<int>> CreateBottomMap(Rectangle rect)
{
var bottomList = new HashSet<int> { rect.Bottom };
return new Dictionary<int, HashSet<int>>
{
{ rect.Top, bottomList }
};
}
}
It's not beautiful, but should point you in the right direction.

Rotating a two dimensional array by 90 degrees

I am studying this piece of code on rotating an NxN matrix; I have traced the program countless times, and I sort of understand how the actual rotation happens. It basically rotates the corners first and the elements after the corners in a clockwise direction. I just do not understand a couple of lines, and the code is still not "driven home" in my brain, so to speak. Please help. I am rotating it 90 degrees, given a 4x4 matrix as my tracing example.
[1][2][3][4]
[5][6][7][8]
[9][0][1][2]
[3][4][5][6]
becomes
[3][9][5][1]
[4][0][6][2]
[5][1][7][3]
[6][2][8][4]
public static void rotate(int[][] matrix, int n){
for(int layer=0; layer < n/2; ++layer) {
int first=layer; //It moves from the outside in.
int last=n-1-layer; //<--This I do not understand
for(int i=first; i<last;++i){
int offset=i-first; //<--A bit confusing for me
//save the top left of the matrix
int top = matrix[first][i];
//shift left to top;
matrix[first][i]=matrix[last-offset][first];
/*I understand that it needs
last-offset so that it will go up the column in the matrix,
and first signifies it's in the first column*/
//shift bottom to left
matrix[last-offset][first]=matrix[last][last-offset];
/*I understand that it needs
last-offset so that the number decreases and it may go up the column (first
last-offset) and left (latter). */
//shift right to bottom
matrix[last][last-offset]=matrix[i][last];
/*I understand that it i so that in the next iteration, it moves down
the column*/
//rightmost top corner
matrix[i][last]=top;
}
}
}
It's easier to understand an algorithm like this if you draw a diagram, so I made a quick pic in Paint to demonstrate for a 5x5 matrix :D
The outer for(int layer=0; layer < n/2; ++layer) loop iterates over the layers from outside to inside. The outer layer (layer 0) is depicted by coloured elements. Each layer is effectively a square of elements requiring rotation. For n = 5, layer will take on values from 0 to 1 as there are 2 layers since we can ignore the centre element/layer which is unaffected by rotation. first and last refer to the first and last rows/columns of elements for a layer; e.g. layer 0 has elements from Row/Column first = 0 to last = 4 and layer 1 from Row/Column 1 to 3.
Then for each layer/square, the inner for(int i=first; i<last;++i) loop rotates it by rotating 4 elements in each iteration. Offset represents how far along the sides of the square we are. For our 5x5 below, we first rotate the red elements (offset = 0), then yellow (offset = 1), then green and blue. Arrows 1-5 demonstrate the 4-element rotation for the red elements, and 6+ for the rest which are performed in the same fashion. Note how the 4-element rotation is essentially a 5-assignment circular swap with the first assignment temporarily putting aside an element. The //save the top left of the matrix comment for this assignment is misleading since matrix[first][i] isn't necessarily the top left of the matrix or even the layer for that matter. Also, note that the row/column indexes of elements being rotated are sometimes proportional to offset and sometimes proportional to its inverse, last - offset.
We move along the sides of the outer layer (delineated by first=0 and last=4) in this manner, then move onto the inner layer (first = 1 and last = 3) and do the same thing there. Eventually, we hit the centre and we're done.
This trigger a WTF. The easiest way to rotate a matrix in place is by
first transposing the matrix (swap M[i,j] with M[j,i])
then swapping M[i,j] with M[i, nColumns - j]
When matrices are column-major, the second operation is swapping columns, and hence has good data locality properties. If the matrix is row major, then first permute rows, and then transpose.
Here is a recursive way of solving this:
// rotating a 2 D array (mXn) by 90 degrees
public void rotateArray(int[][] inputArray) {
System.out.println("Input Array: ");
print2D(inputArray);
rotateArray(inputArray, 0, 0, inputArray.length - 1,
inputArray[0].length - 1);
System.out.println("\n\nOutput Array: ");
print2D(inputArray);
}
public void rotateArray(int[][] inputArray, int currentRow,
int currentColumn, int lastRow, int lastColumn) {
// condition to come out of recursion.
// if all rows are covered or all columns are covered (all layers
// covered)
if (currentRow >= lastRow || currentColumn >= lastColumn)
return;
// rotating the corner elements first
int top = inputArray[currentRow][currentColumn];
inputArray[currentRow][currentColumn] = inputArray[lastRow][currentColumn];
inputArray[lastRow][currentColumn] = inputArray[lastRow][lastColumn];
inputArray[lastRow][lastColumn] = inputArray[currentRow][lastColumn];
inputArray[currentRow][lastColumn] = top;
// clockwise rotation of remaining elements in the current layer
for (int i = currentColumn + 1; i < lastColumn; i++) {
int temp = inputArray[currentRow][i];
inputArray[currentRow][i] = inputArray[lastRow - i][currentColumn];
inputArray[lastRow - i][currentColumn] = inputArray[lastRow][lastColumn
- i];
inputArray[lastRow][lastColumn - i] = inputArray[currentRow + i][lastColumn];
inputArray[currentRow + i][lastColumn] = temp;
}
// call recursion on remaining layers
rotateArray(inputArray, ++currentRow, ++currentColumn, --lastRow,
--lastColumn);
}

Data structure for storing thousands of vectors

I have upto 10,000 randomly positioned points in a space and i need to be able to tell which the cursor is closest to at any given time. To add some context, the points are in the form of a vector drawing, so they can be constantly and quickly added and removed by the user and also potentially be unbalanced across the canvas space..
I am therefore trying to find the most efficient data structure for storing and querying these points. I would like to keep this question language agnostic if possible.
After the Update to the Question
Use two Red-Black Tree or Skip_list maps. Both are compact self-balancing data structures giving you O(log n) time for search, insert and delete operations. One map will use X-coordinate for every point as a key and the point itself as a value and the other will use Y-coordinate as a key and the point itself as a value.
As a trade-off I suggest to initially restrict the search area around the cursor by a square. For perfect match the square side should equal to diameter of your "sensitivity circle” around the cursor. I.e. if you’re interested only in a nearest neighbour within 10 pixel radius from the cursor then the square side needs to be 20px. As an alternative, if you’re after nearest neighbour regardless of proximity you might try finding the boundary dynamically by evaluating floor and ceiling relative to cursor.
Then retrieve two subsets of points from the maps that are within the boundaries, merge to include only the points within both sub sets.
Loop through the result, calculate proximity to each point (dx^2+dy^2, avoid square root since you're not interested in the actual distance, just proximity), find the nearest neighbour.
Take root square from the proximity figure to measure the distance to the nearest neighbour, see if it’s greater than the radius of the “sensitivity circle”, if it is it means there is no points within the circle.
I suggest doing some benchmarks every approach; it’s two easy to go over the top with optimisations. On my modest hardware (Duo Core 2) naïve single-threaded search of a nearest neighbour within 10K points repeated a thousand times takes 350 milliseconds in Java. As long as the overall UI re-action time is under 100 milliseconds it will seem instant to a user, keeping that in mind even naïve search might give you sufficiently fast response.
Generic Solution
The most efficient data structure depends on the algorithm you’re planning to use, time-space trade off and the expected relative distribution of points:
If space is not an issue the most efficient way may be to pre-calculate the nearest neighbour for each point on the screen and then store nearest neighbour unique id in a two-dimensional array representing the screen.
If time is not an issue storing 10K points in a simple 2D array and doing naïve search every time, i.e. looping through each point and calculating the distance may be a good and simple easy to maintain option.
For a number of trade-offs between the two, here is a good presentation on various Nearest Neighbour Search options available: http://dimacs.rutgers.edu/Workshops/MiningTutorial/pindyk-slides.ppt
A bunch of good detailed materials for various Nearest Neighbour Search algorithms: http://simsearch.yury.name/tutorial.html, just pick one that suits your needs best.
So it's really impossible to evaluate the data structure is isolation from algorithm which in turn is hard to evaluate without good idea of task constraints and priorities.
Sample Java Implementation
import java.util.*;
import java.util.concurrent.ConcurrentSkipListMap;
class Test
{
public static void main (String[] args)
{
Drawing naive = new NaiveDrawing();
Drawing skip = new SkipListDrawing();
long start;
start = System.currentTimeMillis();
testInsert(naive);
System.out.println("Naive insert: "+(System.currentTimeMillis() - start)+"ms");
start = System.currentTimeMillis();
testSearch(naive);
System.out.println("Naive search: "+(System.currentTimeMillis() - start)+"ms");
start = System.currentTimeMillis();
testInsert(skip);
System.out.println("Skip List insert: "+(System.currentTimeMillis() - start)+"ms");
start = System.currentTimeMillis();
testSearch(skip);
System.out.println("Skip List search: "+(System.currentTimeMillis() - start)+"ms");
}
public static void testInsert(Drawing d)
{
Random r = new Random();
for (int i=0;i<100000;i++)
d.addPoint(new Point(r.nextInt(4096),r.nextInt(2048)));
}
public static void testSearch(Drawing d)
{
Point cursor;
Random r = new Random();
for (int i=0;i<1000;i++)
{
cursor = new Point(r.nextInt(4096),r.nextInt(2048));
d.getNearestFrom(cursor,10);
}
}
}
// A simple point class
class Point
{
public Point (int x, int y)
{
this.x = x;
this.y = y;
}
public final int x,y;
public String toString()
{
return "["+x+","+y+"]";
}
}
// Interface will make the benchmarking easier
interface Drawing
{
void addPoint (Point p);
Set<Point> getNearestFrom (Point source,int radius);
}
class SkipListDrawing implements Drawing
{
// Helper class to store an index of point by a single coordinate
// Unlike standard Map it's capable of storing several points against the same coordinate, i.e.
// [10,15] [10,40] [10,49] all can be stored against X-coordinate and retrieved later
// This is achieved by storing a list of points against the key, as opposed to storing just a point.
private class Index
{
final private NavigableMap<Integer,List<Point>> index = new ConcurrentSkipListMap <Integer,List<Point>> ();
void add (Point p,int indexKey)
{
List<Point> list = index.get(indexKey);
if (list==null)
{
list = new ArrayList<Point>();
index.put(indexKey,list);
}
list.add(p);
}
HashSet<Point> get (int fromKey,int toKey)
{
final HashSet<Point> result = new HashSet<Point> ();
// Use NavigableMap.subMap to quickly retrieve all entries matching
// search boundaries, then flatten resulting lists of points into
// a single HashSet of points.
for (List<Point> s: index.subMap(fromKey,true,toKey,true).values())
for (Point p: s)
result.add(p);
return result;
}
}
// Store each point index by it's X and Y coordinate in two separate indices
final private Index xIndex = new Index();
final private Index yIndex = new Index();
public void addPoint (Point p)
{
xIndex.add(p,p.x);
yIndex.add(p,p.y);
}
public Set<Point> getNearestFrom (Point origin,int radius)
{
final Set<Point> searchSpace;
// search space is going to contain only the points that are within
// "sensitivity square". First get all points where X coordinate
// is within the given range.
searchSpace = xIndex.get(origin.x-radius,origin.x+radius);
// Then get all points where Y is within the range, and store
// within searchSpace the intersection of two sets, i.e. only
// points where both X and Y are within the range.
searchSpace.retainAll(yIndex.get(origin.y-radius,origin.y+radius));
// Loop through search space, calculate proximity to each point
// Don't take square root as it's expensive and really unneccessary
// at this stage.
//
// Keep track of nearest points list if there are several
// at the same distance.
int dist,dx,dy, minDist = Integer.MAX_VALUE;
Set<Point> nearest = new HashSet<Point>();
for (Point p: searchSpace)
{
dx=p.x-origin.x;
dy=p.y-origin.y;
dist=dx*dx+dy*dy;
if (dist<minDist)
{
minDist=dist;
nearest.clear();
nearest.add(p);
}
else if (dist==minDist)
{
nearest.add(p);
}
}
// Ok, now we have the list of nearest points, it might be empty.
// But let's check if they are still beyond the sensitivity radius:
// we search area we have evaluated was square with an side to
// the diameter of the actual circle. If points we've found are
// in the corners of the square area they might be outside the circle.
// Let's see what the distance is and if it greater than the radius
// then we don't have a single point within proximity boundaries.
if (Math.sqrt(minDist) > radius) nearest.clear();
return nearest;
}
}
// Naive approach: just loop through every point and see if it's nearest.
class NaiveDrawing implements Drawing
{
final private List<Point> points = new ArrayList<Point> ();
public void addPoint (Point p)
{
points.add(p);
}
public Set<Point> getNearestFrom (Point origin,int radius)
{
int prevDist = Integer.MAX_VALUE;
int dist;
Set<Point> nearest = Collections.emptySet();
for (Point p: points)
{
int dx = p.x-origin.x;
int dy = p.y-origin.y;
dist = dx * dx + dy * dy;
if (dist < prevDist)
{
prevDist = dist;
nearest = new HashSet<Point>();
nearest.add(p);
}
else if (dist==prevDist) nearest.add(p);
}
if (Math.sqrt(prevDist) > radius) nearest = Collections.emptySet();
return nearest;
}
}
I would like to suggest creating a Voronoi Diagram and a Trapezoidal Map (Basically the same answer as I gave to this question). The Voronoi Diagram will partition the space in polygons. Every point will have a polygon describing all points that are closest to it.
Now when you get a query of a point, you need to find in which polygon it lies. This problem is called Point Location and can be solved by constructing a Trapezoidal Map.
The Voronoi Diagram can be created using Fortune's algorithm which takes O(n log n) computational steps and costs O(n) space.
This website shows you how to make a trapezoidal map and how to query it. You can also find some bounds there:
Expected creation time: O(n log n)
Expected space complexity: O(n) But
most importantly, expected query
time: O(log n).
(This is (theoretically) better than O(√n) of the kD-tree.)
Updating will be linear (O(n)) I think.
My source(other than the links above) is: Computational Geometry: algorithms and applications, chapters six and seven.
There you will find detailed information about the two data structures (including detailed proofs). The Google books version only has a part of what you need, but the other links should be sufficient for your purpose. Just buy the book if you are interested in that sort of thing (it's a good book).
The most efficient data structure would be a kd-tree link text
Are the points uniformly distributed?
You could build a quad-tree up to a certain depth, say, 8. At the top you have a tree node that divides the screen into four quadrants. Store at each node:
The top left and the bottom right coordinate
Pointers to four child nodes, which divide the node into four quadrants
Build the tree up to a depth of 8, say, and at the leaf nodes, store a list of points associated with that region. That list you can search linearly.
If you need more granularity, build the quad-tree to a greater depth.
It depends on the frequency of updates and query. For fast query, slow updates, a Quadtree (which is a form of jd-tree for 2-D) would probably be best. Quadtree are very good for non-uniform point too.
If you have a low resolution you could consider using a raw array of width x height of pre-computed values.
If you have very few points or fast update, a simple array is enough, or may be a simple partitioning (which goes toward the quadtree).
So the answer depends on parameters of you dynamics. Also I would add that nowadays the algo isn't everything; making it use multiple processors or CUDA can give a huge boost.
You haven't specified the dimensions of you points, but if it's a 2D line drawing then a bitmap bucket - a 2D array of lists of points in a region, where you scan the buckets corresponding to and near to a cursor can perform very well. Most systems will happily handle bitmap buckets of the 100x100 to 1000x1000 order, the small end of which would put a mean of one point per bucket. Although asymptotic performance is O(N), real-world performance is typically very good. Moving individual points between buckets can be fast; moving objects around can also be made fast if you put the objects into the buckets rather than the points ( so a polygon of 12 points would be referenced by 12 buckets; moving it becomes 12 times the insertion and removal cost of the bucket list; looking up the bucket is constant time in the 2D array ). The major cost is reorganising everything if the canvas size grows in many small jumps.
If it is in 2D, you can create a virtual grid covering the whole space (width and height are up to your actual points space) and find all the 2D points which belong to every cell. After that a cell will be a bucket in a hashtable.

Resources