So I have n objects that can be dragged on a stage, and I want them to snap (lock, magnet...) to each other when you drag them close enough. That means calculating the distance from the dragged object to every other object on the stage on every mouse move event. Can this be optimized? Something like comparing the distance only to the closest objects, but how do I know which are the closest without calculating all the distances first?
Thanks
You can use a k-d tree. A k-d tree has an efficient "find closest neighbor" operation. Just make sure every object that is already 'in place' is in your tree, and when you move an element, find its closest neighbor and check whether it is close enough or not.
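For concreteness, here is a minimal TypeScript sketch of that idea. Everything here (the `KdTree` class, `maybeSnap`, the `SNAP_DISTANCE` threshold) is invented for illustration; a production tree would also need deletion or periodic rebuilding as objects move.

```typescript
interface Point { x: number; y: number; }

class KdNode {
  left: KdNode | null = null;
  right: KdNode | null = null;
  constructor(public point: Point, public axis: 0 | 1) {}
}

class KdTree {
  private root: KdNode | null = null;

  insert(p: Point): void {
    const go = (node: KdNode | null, axis: 0 | 1): KdNode => {
      if (node === null) return new KdNode(p, axis);
      const key = node.axis === 0 ? "x" : "y";
      const next = (1 - node.axis) as 0 | 1;
      if (p[key] < node.point[key]) node.left = go(node.left, next);
      else node.right = go(node.right, next);
      return node;
    };
    this.root = go(this.root, 0);
  }

  // Classic nearest-neighbor descent: explore the near side first, and the
  // far side only if the splitting plane is closer than the best hit so far.
  nearest(target: Point): Point | null {
    let best: Point | null = null;
    let bestDistSq = Infinity;
    const visit = (node: KdNode | null): void => {
      if (node === null) return;
      const dx = node.point.x - target.x;
      const dy = node.point.y - target.y;
      const d = dx * dx + dy * dy;
      if (d < bestDistSq) { bestDistSq = d; best = node.point; }
      const key = node.axis === 0 ? "x" : "y";
      const diff = target[key] - node.point[key];
      visit(diff < 0 ? node.left : node.right);
      if (diff * diff < bestDistSq) visit(diff < 0 ? node.right : node.left);
    };
    visit(this.root);
    return best;
  }
}

// On each mouse move: snap the dragged object to its nearest placed
// neighbor if that neighbor is within the (made-up) SNAP_DISTANCE.
const SNAP_DISTANCE = 10;
function maybeSnap(tree: KdTree, dragged: Point): Point {
  const n = tree.nearest(dragged);
  if (n !== null && Math.hypot(n.x - dragged.x, n.y - dragged.y) <= SNAP_DISTANCE)
    return { x: n.x, y: n.y };
  return dragged;
}
```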
An alternative (mostly usable for grids) is the observer pattern. Whenever you place an object, mark the line where its 'influence' starts. Attach an observer to each 'cell' on this line, and when you move an object into a cell that has observers attached, invoke them.
Two ideas:
You could divide your stage into 'tiles' of some size (say, 64x64 pixels?). If you know which objects overlap (partially or completely) a given tile, you can limit your hit detection to those objects.
You could maintain a (sorted!) map from each 'y' coordinate to all the object(s) at that 'y' position, with each per-row list sorted by 'x' as well. Using lower/upper bound searches (which are efficient on sorted sequences), you can quickly get handles to all objects within a certain bounding rect, as sketched below.
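A minimal sketch of that second idea, simplified to a single array kept sorted by (y, x) instead of a map of per-row lists; `lowerBound` and `queryRect` are illustrative names:

```typescript
interface Obj { x: number; y: number; }

// First index i with arr[i].y >= y (binary search on the sorted array).
function lowerBound(arr: Obj[], y: number): number {
  let lo = 0, hi = arr.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (arr[mid].y < y) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Objects must be sorted by (y, x); returns those inside [x0,x1] x [y0,y1].
function queryRect(sortedByY: Obj[], x0: number, y0: number, x1: number, y1: number): Obj[] {
  const out: Obj[] = [];
  for (let i = lowerBound(sortedByY, y0); i < sortedByY.length && sortedByY[i].y <= y1; i++) {
    const o = sortedByY[i];
    if (o.x >= x0 && o.x <= x1) out.push(o);
  }
  return out;
}
```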
One classical solution to this kind of problem is to cut the space recursively into areas, so that you only search for neighbors in nearby areas. A classical data structure for this is the quadtree, where each square is cut into four smaller squares.
I'm trying to find a spatial index structure suitable for a particular problem: using a union-find data structure, I want to connect/associate points that are within a certain range of each other.
I have a lot of points and I'm trying to optimize an existing solution by using a better spatial index.
Right now, I'm using a simple 2D grid over my point map, with each square's width equal to the threshold distance, and I look for potential unions by searching for points in adjacent squares of the grid.
Then I compute the squared Euclidean distances for the adjacent-cell combinations, compare them to my squared threshold, and use the union-find structure (optimized with path compression, etc.) to build groups of points.
Here is an illustration of the method: each single black point represents the set of points belonging to one cell of the grid, and the outgoing colored arrows represent the actual distance comparisons with the outside points.
(I'm also checking for potentially connected points that belong to the same cell.)
With this pattern, I make sure no distance comparison is done twice, by using a "neighbor cell" pattern that doesn't overlap cells already tested as I iterate over the grid.
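(For reference, a minimal union-find with path compression and union by rank, of the kind described above; this is an illustrative sketch, not the asker's actual code.)

```typescript
class UnionFind {
  private parent: number[];
  private rank: number[];
  constructor(n: number) {
    this.parent = Array.from({ length: n }, (_, i) => i);
    this.rank = new Array(n).fill(0);
  }
  find(i: number): number {
    while (this.parent[i] !== i) {
      this.parent[i] = this.parent[this.parent[i]]; // path halving
      i = this.parent[i];
    }
    return i;
  }
  union(a: number, b: number): void {
    let ra = this.find(a), rb = this.find(b);
    if (ra === rb) return;
    if (this.rank[ra] < this.rank[rb]) [ra, rb] = [rb, ra];
    this.parent[rb] = ra; // attach the shorter tree under the taller one
    if (this.rank[ra] === this.rank[rb]) this.rank[ra]++;
  }
}
```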
The issue is: this approach is not even close to being fast enough, and I'm trying to replace the "spatial grid index" method with something faster.
I've looked into quadtrees as a suitable spatial index for this problem, but I don't think they are suitable for solving it (I don't see any way of performing the repeated "neighbor" checks for a particular cell more effectively using a quadtree), though maybe I'm wrong about that.
Therefore, I'm looking for a better algorithm/data structure to effectively index my points and query them for proximity.
Thanks in advance.
I have some comments:
1) I think your problem is equivalent to a "spatial join". A spatial join takes two sets of geometries, for example a set R of rectangles and a set P of points, and finds for every rectangle all points in that rectangle. In your case, R would be the rectangles (edge length = 2 * max distance) around each point and P the set of your points. Searching for "spatial join" may give you some useful references.
2) You may want to have a look at space-filling curves. Space-filling curves create a linear order for a set of spatial entities (points) with the property that points that are close in the linear ordering are usually also close in space (and vice versa). This may be useful when developing an algorithm.
3) Have a look at OpenVDB. OpenVDB has a spatial index structure that is highly optimized for traversing 'voxel' cells and their neighbors.
4) Have a look at the PH-Tree (disclaimer: this is my own project). The PH-Tree is somewhat like a quadtree but uses low-level bit operations to optimize navigation. It is also Z-ordered/Morton-ordered (see space-filling curves above). You can create a window query for each point which returns all points within that rectangle. To my knowledge, the PH-Tree is the fastest index structure for this kind of operation, especially if you typically have only 9 points in a rectangle. If you are interested in the code, the V13 implementation is probably the fastest; however, the V16 should be much easier to understand and modify.
I tried it on my rather old desktop machine: with about 1,000,000 points I can do about 200,000 window queries per second, so it should take about 5 seconds to find all neighbors for every point.
If you are using Java, my spatial index collection may also be useful.
A standard approach to this is the "sweep and prune" algorithm. Sort all the points by X coordinate, then iterate through them. As you do, maintain the lowest index of any point within the threshold distance (in X) of the current point; the points in that range are candidates for merging. Then do the same thing sorted by Y. Finally, check the Euclidean distance only for those pairs which showed up in both the X and Y scans.
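A sketch of that idea in TypeScript; `axisPairs`, `closePairs`, and the string-encoded pair set are illustrative choices, and note that the per-axis candidate sets can still be large for dense input:

```typescript
interface Pt { x: number; y: number; }

// Canonical key for an unordered index pair.
function pairKey(i: number, j: number): string {
  return i < j ? `${i},${j}` : `${j},${i}`;
}

// All index pairs whose chosen coordinate differs by at most `threshold`,
// found with one sorted sweep and a trailing window.
function axisPairs(pts: Pt[], key: "x" | "y", threshold: number): Set<string> {
  const idx = pts.map((_, i) => i).sort((a, b) => pts[a][key] - pts[b][key]);
  const pairs = new Set<string>();
  let lo = 0;
  for (let hi = 0; hi < idx.length; hi++) {
    while (pts[idx[hi]][key] - pts[idx[lo]][key] > threshold) lo++;
    for (let k = lo; k < hi; k++) pairs.add(pairKey(idx[k], idx[hi]));
  }
  return pairs;
}

// Only pairs close on both axes get the exact Euclidean check.
function closePairs(pts: Pt[], threshold: number): [number, number][] {
  const xPairs = axisPairs(pts, "x", threshold);
  const result: [number, number][] = [];
  for (const key of axisPairs(pts, "y", threshold)) {
    if (!xPairs.has(key)) continue;
    const [i, j] = key.split(",").map(Number);
    const dx = pts[i].x - pts[j].x, dy = pts[i].y - pts[j].y;
    if (dx * dx + dy * dy <= threshold * threshold) result.push([i, j]);
  }
  return result;
}
```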
Note that with your current union-find approach, you can end up unioning points which are quite far from each other if there is a chain of nearby points "bridging" them. So your basic approach, unioning groups of points based on proximity, can introduce an arbitrary amount of distance error, not just the threshold distance.
I am working on an interactive web application, and I'm currently implementing a multi-select feature similar to the way Windows lets you select multiple desktop icons by dragging a rectangle.
Due to limitations of the library I'm required to use, implementing this has already become quite resource intensive:
On initial click, store the position of the mouse cursor.
On each pixel that the mouse cursor moves, perform the following:
Destroy the previous selection rectangle, if it exists, so it doesn't appear on the screen anymore.
Calculate the width and height of the new selection rectangle using the original cursor position and the current cursor position.
Create a new selection rectangle from the original cursor position, the width, and the height.
Display this rectangle on the screen
As you can see, there are quite a few things happening every time the cursor moves a single pixel. I've looked into this as much as I can and there's no way I can make it any more efficient or any faster.
My next step is actually selecting the objects on the screen when the selection rectangle moves over them. I need to implement this algorithm myself, so I am free to make it as efficient and fast as possible. What I need to do is iterate through the objects on the screen and check each one to see whether it lies in the rectangle. This loop is going to consume more resources, so the check needs to be done as efficiently as possible.
Each object that can be selected can be represented by a single point, P(x, y).
How can I check if P(x, y) is within the rectangles I create in the fastest/most efficient way?
Here's the relevant information:
There can be an arbitrary number of selectable objects on the screen at any one time
The selection rectangles will always be axis-aligned
The information I have about each rectangle is its origin point, its height, and its width.
How can I achieve what I need to do as fast as possible?
Checking whether point P lies inside rectangle R is simple and fast (in a coordinate system with the origin in the top-left corner):

(P.X >= R.Left) and (P.X <= R.Right) and (P.Y >= R.Top) and (P.Y <= R.Bottom)

Precalculate the Right and Bottom coordinates of each rectangle once.
You may be able to accelerate the overall algorithm if the objects satisfy some conditions that let you avoid checking all the objects at every step.
For example: sort the object list by X coordinate and check only those objects whose X lies in the Left..Right range.
A more advanced approach: organize the objects in a space-partitioning data structure like a k-d tree and execute range searches very quickly.
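For the simple case, here is a minimal sketch of the precomputed check applied to a whole point list (names are illustrative; the rectangle is assumed to be given by its top-left origin, width, and height):

```typescript
interface P { x: number; y: number; }
interface Rect { left: number; top: number; width: number; height: number; }

function selectPoints(points: P[], r: Rect): P[] {
  const right = r.left + r.width;   // precompute once per rectangle
  const bottom = r.top + r.height;  // (origin in the top-left corner)
  return points.filter(p =>
    p.x >= r.left && p.x <= right && p.y >= r.top && p.y <= bottom);
}
```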
You can iterate through every object on screen and check whether it lies in the rectangle in a Cartesian coordinate system using the following condition:
p.x >= rect.left && p.x <= rect.right && p.y <= rect.top && p.y >= rect.bottom
If you are going to have no more than about 1000 points on screen, just use the naive O(n) method and iterate through each point. If you are completely sure that you need to optimize further, read on.
Depending on the frequency of updating the points and number of points being updated each frame, you may want to use a different method potentially involving a data structure like Range Trees, or settle for the naive O(n) method.
If the points aren't going to move around much and are sparse (i.e. far apart from each other), you can use a Range Tree or similar for O(log n) checks. Bear in mind though that updating such a spatial partitioning structure is resource intensive, and if you have a lot of points that are going to be moving around quite a bit, you may want to look at something else.
If a few points are going to be moving around over large distances, you may want to look at partitioning the screen into a grid of "buckets", and check only those buckets that are contained by the rectangle. Whenever a point moves from one bucket to another, the grid will have to update the affected buckets.
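A rough sketch of that bucket grid, assuming a hash map keyed by cell coordinates; the `Grid` API and the cell size handling are invented for illustration:

```typescript
interface P2 { x: number; y: number; }

class Grid {
  private cells = new Map<string, P2[]>();
  constructor(private cellSize: number) {}

  private key(cx: number, cy: number): string { return `${cx},${cy}`; }

  insert(p: P2): void {
    const k = this.key(Math.floor(p.x / this.cellSize), Math.floor(p.y / this.cellSize));
    const cell = this.cells.get(k);
    if (cell) cell.push(p); else this.cells.set(k, [p]);
  }

  // Points inside the axis-aligned rect [x0,x1] x [y0,y1]: visit only the
  // buckets the rectangle covers, then verify each candidate exactly.
  query(x0: number, y0: number, x1: number, y1: number): P2[] {
    const out: P2[] = [];
    for (let cx = Math.floor(x0 / this.cellSize); cx <= Math.floor(x1 / this.cellSize); cx++)
      for (let cy = Math.floor(y0 / this.cellSize); cy <= Math.floor(y1 / this.cellSize); cy++)
        for (const p of this.cells.get(this.key(cx, cy)) ?? [])
          if (p.x >= x0 && p.x <= x1 && p.y >= y0 && p.y <= y1) out.push(p);
    return out;
  }
}
```

Moving a point is then a removal from its old bucket and an insert into its new one, which only touches two short lists.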
If memory is a constraint, you may want to look at using a modified Quad Tree which is limited by tree depth instead of bucket size, if the grid approach is not efficient enough.
If you have a lot of points moving around a lot every frame, you may be better off with the grid approach or just the naive O(n) approach. Experiment and choose the approach that best suits your problem.
I'll do my best to keep my scenario simple.
1-) Let's suppose we need to store a lot of rectangles in some kind of array/data structure. The rectangles are of different sizes and are positioned at various parts of the screen. Rectangles can't overlap.
2-) Then we click on the screen with the mouse at point [x,y].
Now we need to determine if we clicked on a part of one of the rectangles. It would be insane to iterate through all the rectangles making comparisons, especially if there is a huge number of them.
What would be the fastest technique/algorithm to do it with as little steps as possible? What would be the best data structure to use in such case?
One way would be to use a quadtree to store the rectangles: The root represents the whole area that contains all rectangles, and this area is then recursively subdivided as required.
If you want to test whether a certain coordinate is within one of the rectangles, you start at the root and walk down the tree until you either find a rectangle or determine that there is none.
This can be done in O(log n) time.
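A minimal sketch of such a quadtree walk; the node layout and the fixed-depth subdivision policy are simplifications for illustration:

```typescript
interface R { x: number; y: number; w: number; h: number; }

class QuadNode {
  rects: R[] = [];                    // rectangles that straddle a split line
  children: QuadNode[] | null = null; // NW, NE, SW, SE once subdivided
  constructor(public x: number, public y: number, public size: number) {}

  private childContaining(r: R): QuadNode | null {
    if (!this.children) return null;
    for (const c of this.children)
      if (r.x >= c.x && r.y >= c.y && r.x + r.w <= c.x + c.size && r.y + r.h <= c.y + c.size)
        return c;
    return null; // straddles a split line, so it stays at this node
  }

  insert(r: R, depth = 0): void {
    if (!this.children && depth < 8) {
      const h = this.size / 2;
      this.children = [
        new QuadNode(this.x, this.y, h), new QuadNode(this.x + h, this.y, h),
        new QuadNode(this.x, this.y + h, h), new QuadNode(this.x + h, this.y + h, h),
      ];
    }
    const c = this.childContaining(r);
    if (c) c.insert(r, depth + 1); else this.rects.push(r);
  }

  // Walk down: at most one child per level can contain the query point.
  findAt(px: number, py: number): R | null {
    for (const r of this.rects)
      if (px >= r.x && px <= r.x + r.w && py >= r.y && py <= r.y + r.h) return r;
    if (!this.children) return null;
    for (const c of this.children)
      if (px >= c.x && px < c.x + c.size && py >= c.y && py < c.y + c.size)
        return c.findAt(px, py);
    return null;
  }
}
```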
For a 2D game I am working on, I am using y-axis sorting in a simple rectangle-based collision detection scheme. This is working fine, and now I want to efficiently find the nearest empty rectangle of a given size at a given location. How can I do this? Is there an algorithm?
I could think of a simple brute-force grid test (with each grid cell the size of the empty space we're looking for), but obviously this is slow, and it's not even a complete test.
Consider using quad-trees to store your rectangles.
See http://en.wikipedia.org/wiki/Quadtree for more information.
If you're already using axis sorting, then presumably you've computed a list of your rectangles sorted by their positions.
Perhaps I am misunderstanding, but could you not just look at the two rectangles before and after the rectangle in question and decide which one is closer? If you're talking about finding the closest rectangle to an arbitrary point, you could simply walk through the list until you find the first rectangle with a greater position than your point, and then compare that rectangle and the one before it.
I'm looking for a data structure that provides indexing for Rectangles. I need the insert algorithm to be as fast as possible since the rectangles will be moving around the screen (think of dragging a rectangle with your mouse to a new position).
I've looked into R-Trees, R+Trees, kD-Trees, Quad-Trees and B-Trees, but from my understanding inserts are usually slow. I'd prefer inserts with sub-linear time complexity, so maybe someone can prove me wrong about one of the listed data structures.
I should be able to query the data structure for which rectangles are at point (x, y), or which rectangles intersect rectangle (x, y, width, height).
EDIT: The reason I want inserts to be so fast is that if you think of a rectangle being moved around the screen, it will have to be removed and then re-inserted.
Thanks!
I'd use a multiscale grid approach (equivalent to quad-trees in some form).
I'm assuming you're using integer coordinates (i.e. pixels) and have plenty of space to hold all the pixels.
Have an array of lists of rectangles, one for each pixel. Then, bin two-by-two and do it again. And again, and again, and again, until you have one pixel that covers everything.
Now, the key is that you insert each rectangle at the level that is a good match for its size: something like (pixel size) ~= min(height, width)/2. Then for each rectangle you have only a handful of inserts to do into the lists (you can bound it above by a constant, e.g. pick the level where the rectangle covers between 4 and 16 pixels).
If you want to search for all rectangles at (x,y), you look in the list of the smallest pixel, then in the list of the 2x2 binned pixel that contains it, then the 4x4, etc.; you have log2(# of pixels) steps to look through. (For larger pixels, you then have to check whether (x,y) was really in the rectangle; you expect about half of the candidates to succeed on borders and all of them to succeed inside the rectangle, so you'd expect no worse than 2x more work than if you could look up the rectangle directly.)
Now, what about insert? That's very inexpensive--O(1) to stick yourself on the front of a list.
What about delete? That's more expensive; you have to look through and heal each list for each pixel you're entered in. That's approximately O(n) in the number of rectangles overlapping at that position in space and of approximately the same size. If you have really large numbers of rectangles, then you should use some other data structure to hold them (hash set, RB tree, etc.).
(Note that if your smallest rectangle must be larger than a pixel, you don't need to actually form the multiscale structure all the way to the pixel level; just go down until the smallest rectangle won't get hopelessly lost inside your binned pixel.)
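A rough sketch of this scheme, using one hash-grid per level with the cell size doubling at each level; the class and method names are invented, and integer pixel coordinates are assumed:

```typescript
interface Box { x: number; y: number; w: number; h: number; }

class MultiscaleGrid {
  // levels[k] maps "cx,cy" (cells of size 2^k) to the boxes stored there.
  private levels: Map<string, Box[]>[] = [];
  constructor(private maxLevel: number) {
    for (let k = 0; k <= maxLevel; k++) this.levels.push(new Map());
  }

  // Level whose cell size roughly matches min(width, height) / 2.
  private levelFor(b: Box): number {
    const target = Math.max(1, Math.min(b.w, b.h) / 2);
    return Math.min(this.maxLevel, Math.floor(Math.log2(target)));
  }

  insert(b: Box): void {
    const k = this.levelFor(b);
    const size = 1 << k;
    // A box overlaps only a handful of cells at its own scale.
    for (let cx = Math.floor(b.x / size); cx <= Math.floor((b.x + b.w) / size); cx++)
      for (let cy = Math.floor(b.y / size); cy <= Math.floor((b.y + b.h) / size); cy++) {
        const key = `${cx},${cy}`;
        const list = this.levels[k].get(key);
        if (list) list.push(b); else this.levels[k].set(key, [b]);
      }
  }

  // One cell to check per level, coarsest to finest; verify each candidate.
  findAt(px: number, py: number): Box[] {
    const hits: Box[] = [];
    for (let k = 0; k <= this.maxLevel; k++) {
      const size = 1 << k;
      const cell = this.levels[k].get(`${Math.floor(px / size)},${Math.floor(py / size)}`) ?? [];
      for (const b of cell)
        if (px >= b.x && px <= b.x + b.w && py >= b.y && py <= b.y + b.h) hits.push(b);
    }
    return hits;
  }
}
```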
The data structures you mention are quite a mixed bag: in particular B-Trees should be fast (cost to insert grows with the logarithm of the number of items present) but won't speed up your intersection queries.
Ignoring that - and hoping for the best - the spatial data structures come in two parts. The first part tells you how to build a tree structure from the data. The second part tells you how to keep track of information at each node that describes the items stored below that node, and how to use it to speed up queries.
You can usually pinch the ideas about keeping track of information at each node without using the (expensive) ideas about exactly how the tree should be built. For instance, you could create a key for each rectangle by bit-interleaving the coordinates of its points and then use a perfectly ordinary tree structure (such as a B-tree, an AVL tree, or a red-black tree) to store it, while still keeping information at each node. In practice this might speed up your queries enough, although you won't know until you've implemented and tested it on real data. The purpose of the tree-building instructions in most schemes is to provide performance guarantees.
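As a concrete illustration of bit-interleaving, here is the standard Morton (Z-order) key for 16-bit coordinates; sorting items by this key gives exactly the kind of linear, locality-preserving order described above:

```typescript
// Spread the low 16 bits of n so there is a zero bit between each bit.
function part1By1(n: number): number {
  n &= 0xffff;
  n = (n | (n << 8)) & 0x00ff00ff;
  n = (n | (n << 4)) & 0x0f0f0f0f;
  n = (n | (n << 2)) & 0x33333333;
  n = (n | (n << 1)) & 0x55555555;
  return n;
}

// Interleave the bits of x and y into a single 32-bit Z-order key.
function mortonKey(x: number, y: number): number {
  return (part1By1(x) | (part1By1(y) << 1)) >>> 0;
}
```

Points sorted by `mortonKey(p.x, p.y)` end up in an order where spatially close points are usually adjacent, so an ordinary sorted tree over these keys behaves like a crude spatial index.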
Two postscripts:
1) I like Patricia trees for this - they are reasonably easy to implement, and adding or deleting entries does not disturb the tree structure much, so you won't have too much work to do updating information stored at nodes.
2) Last time I looked at a window system, it didn't bother about any of this clever stuff at all - it just kept a linear list of items and searched all the way through it when it needed to: that was fast enough.
This is perhaps an extended comment rather than an answer.
I'm a bit puzzled about what you really want. I could guess that you want a data structure to support quick answers to questions such as 'Given the ID of a rectangle, return its current coordinates'. Is that right?
Or do you want to answer 'what rectangle is at position (x,y)'? In that case an array with dimensions matching the height and width of your display might suffice, with each element in the array being a (presumably short) list of the rectangles on that pixel.
But then you state that you need an insert algorithm to be as fast as possible to cope with rectangles moving constantly. If you had only, say, 10 rectangles on screen, you could simply have a 10-element array containing the coordinates of each of the rectangles. Updating their positions would not then require any inserts into the data structure.
How many rectangles? How quickly are they created, and destroyed? How do you want to cope with overlaps? Is a rectangle just a boundary, or does it include the interior?