Need help optimizing runtime on a problem

I have two types of operations possible:
[0,a,b] - Create and save a rectangle of size axb
[1,a,b] - check whether every PREVIOUS saved rectangle can be fit inside this rectangle of axb. You can rotate rectangles by 90 degrees. We try to fit each rectangle one at a time and not together.
I need to return an array of booleans representing answers to second operation in the order in which they appear.
e.g.
operations = [[1,1,1]]
output = [true]
No rectangles are saved, so they can fit inside anything.
e.g.
operations = [[0,1,3], [0,4,2], [1,3,4], [1,3,2]]
output = [true,false]
[1,3] and [4,2] can fit inside [3,4], so true.
[1,3] can fit but [4,2] cannot fit inside [3,2], so false.
I've tried the brute force solution of rotating each previous rectangle and checking if it fits, and it is too slow.
What's the faster solution?

If you want a true/false for every saved rectangle, in the order they were saved, then at some point you have to check every saved rectangle.
To speed things up, when you save a rectangle, also save its area and its longer dimension.
At test time, if the area of a saved rectangle is greater than the area of the test rectangle, then it cannot fit. This is only one test instead of two. If the area is less than or equal, then the longer dimension of the saved rectangle must be less than or equal to the longer dimension of the test rectangle, and likewise for the shorter dimensions. So sometimes you will still have to do more than one test.
You will gain some speed at test time for a very small loss of speed at save time.
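A different angle, not the area test described above: allowing 90 degree rotation, a rectangle fits inside another exactly when its shorter side fits the shorter side and its longer side fits the longer side. So it should be enough to remember only the largest short side and the largest long side seen so far, which makes each query constant time. A minimal sketch, assuming the input format from the question (the function name `answer_queries` is made up for illustration):

```python
def answer_queries(operations):
    """Answer every [1, a, b] query in O(1) using two running maxima."""
    max_short = 0  # largest shorter side among saved rectangles
    max_long = 0   # largest longer side among saved rectangles
    results = []
    for op, a, b in operations:
        short, long_ = min(a, b), max(a, b)
        if op == 0:
            # Saving: only the running maxima need to be kept.
            max_short = max(max_short, short)
            max_long = max(max_long, long_)
        else:
            # Every saved rectangle fits iff both maxima fit.
            results.append(max_short <= short and max_long <= long_)
    return results


print(answer_queries([[0, 1, 3], [0, 4, 2], [1, 3, 4], [1, 3, 2]]))
```

With no saved rectangles both maxima are 0, so any query returns true, matching the first example.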

Related

How to index nearby 3D points on the fly?

In physics simulations (for example n-body systems) it is sometimes necessary to keep track of which particles (points in 3D space) are close enough to interact (within some cutoff distance d) in some kind of index. However, particles can move around, so it is necessary to update the index, ideally on the fly without recomputing it entirely. Also, for efficiency in calculating interactions it is necessary to keep the list of interacting particles in the form of tiles: a tile is a fixed size array (eg 32x32) where the rows and columns are particles, and almost every row-particle is close enough to interact with almost every column particle (and the array keeps track of which ones actually do interact).
What algorithms may be used to do this?
Here is a more detailed description of the problem:
Initial construction: Given a list of points in 3D space (on the order of a few thousand to a few million, stored as array of floats), produce a list of tiles of a fixed size (NxN), where each tile has two lists of points (N row points and N column points), and a boolean array NxN which describes whether the interaction between each row and column particle should be calculated, and for which:
a. every pair of points p1,p2 for which distance(p1,p2) < d is found in at least one tile and marked as being calculated (no missing interactions), and
b. if any pair of points is in more than one tile, it is only marked as being calculated in the boolean array in at most one tile (no duplicates),
and also the number of tiles is relatively small if possible (but this is less important than being able to update the tiles efficiently)
Update step: If the positions of the points change slightly (by much less than d), update the list of tiles in the fastest way possible so that they still meet the same conditions a and b (this step is repeated many times)
It is okay to keep any necessary data structures that help with this, for example the bounding boxes of each tile, or a spatial index like a quadtree. It is probably too slow to calculate all particle pairwise distances for every update step (and in any case we only care about particles which are close, so we can skip most possible pairs of distances just by sorting along a single dimension for example). Also it is probably too slow to keep a full (quadtree or similar) index of all particle positions. On the other hand it is perfectly fine to construct the tiles on a regular grid of some kind. The density of particles per unit volume in 3D space is roughly constant, so the tiles can probably be built from (essentially) fixed size bounding boxes.
To give an example of the typical scale/properties of this kind of problem, suppose there are 1 million particles, which are arranged as a random packing of spheres of diameter 1 unit into a cube of size roughly 100x100x100. Suppose the cutoff distance is 5 units, so typically each particle would be interacting with (2*5)**3 or ~1000 other particles or so. The tile size is 32x32. There are roughly 1e+9 interacting pairs of particles, so the minimum possible number of tiles is ~1e+6. Now assume each time the positions change, the particles move a distance around 0.0001 unit in a random direction, but always in a way such that they are at least 1 unit away from any other particle and the typical density of particles per unit volume stays the same. There would typically be many millions of position update steps like that. The number of newly created pairs of interactions per step due to the movement is (back of the envelope) (10**2 * 6 * 0.0001 / 10**3) * 1e+9 = 60000, so one update step can be handled in principle by marking 60000 particles as non-interacting in their original tiles, and adding at most 60000 new tiles (mostly empty - one per pair of newly interacting particles). This would rapidly get to a point where most tiles are empty, so it is definitely necessary to combine/merge tiles somehow pretty often - but how to do it without a full rebuild of the tile list?
P.S. It is probably useful to describe how this differs from the typical spatial index (e.g. octrees) scenario: a. we only care about grouping close by points together into tiles, not looking up which points are in an arbitrary bounding box or which points are closest to a query point - a bit closer to clustering than querying, b. the density of points in space is pretty constant, and c. the index has to be updated very often, but most moves are tiny.

Not sure my reasoning is sound, but here's an idea:
Divide your space into a grid of 3d cubes, like this in three dimensions:
The cubes have a side length of d. Then do the following:
Assign all points to all cubes in which they're contained; this is fast since you can derive a point's cube from just their coordinates
Now check the following:
Mark all points in the top left of your cube as colliding; they're less than d apart. Further, every "quarter cube" in space is only the top left quarter of exactly one cube, so you won't check the same pair twice.
Check for collisions of type (p, q), where p is a point in the top left quartile, and q is a point not in the top left quartile. In this way, you will check collision between every two points at most once, because every pair of quartiles is checked exactly once.
Since every pair of points is either in the same quartile or in neighbouring quartiles, they'll be checked by the first or the second algorithm. Further, since points are approximately distributed evenly, your runtime is much less than n^2 (n = number of points); in aggregate, it's k^2 (k = number of points per quartile, which appears to be approximately constant).
In an update step, you only need to check:
if a point crossed a boundary of a box, which should be fast since you can look at one coordinate at a time, and the boxes' boundaries are simple multiples of d/2
check for collisions of the points as above
To create the tiles, divide the space into a second grid of (non-overlapping) cubes whose width is chosen s.t. the average count of centers between two particles that almost interact with each other that fall into a given cube is less than the width of your tiles (i.e. 32). Since each particle is expected to interact with 300-500 particles, the width will be much smaller than d.
Then, while checking for interactions in step 1 & 2, assign particle interactions to these new cubes according to the coordinates of the center of their interaction. Assign one tile per cube, and mark interacting particles assigned to that cube in the tile.
Further optimizations might be to consider the distance of a point's closest neighbour within a cube, and derive from that how many update steps are needed at least to change the collision status of that point; then ignore that point for this many steps.
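To make the grid idea concrete, here is a minimal sketch of a plain cell list. It is a simplification, not the overlapping-cube scheme described above: it assumes non-overlapping cells of side d and checks each cell against its 27 surrounding cells. The names `build_cells` and `close_pairs` are made up for illustration.

```python
import math
from collections import defaultdict
from itertools import product


def build_cells(points, d):
    """Map each integer cell coordinate to the indices of the points inside it."""
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(points):
        cells[(math.floor(x / d), math.floor(y / d), math.floor(z / d))].append(i)
    return cells


def close_pairs(points, d):
    """Return all index pairs (i, j), i < j, whose distance is below d."""
    cells = build_cells(points, d)
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        # Points closer than d must sit in the same or an adjacent cell.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            other = cells.get((cx + dx, cy + dy, cz + dz))
            if other is None:
                continue
            for i in members:
                for j in other:
                    if i < j and math.dist(points[i], points[j]) < d:
                        pairs.add((i, j))
    return pairs
```

For the update step, only points whose cell coordinate changed need to be moved between cell lists; with roughly constant density each cell holds a bounded number of points, so the work per point stays roughly constant.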

I suggest the following algorithm. E.g. suppose we have a 1x1x1 cube and the cutoff distance is 0.001.
Choose three base anchor points: (0,0,0), (0,1,0), (1,0,0).
Associate an array of size 1000 (1 / 0.001) with each anchor point.
Add three numbers to each regular point. These fields store the distance between the given point and each anchor point.
At the same time, this distance is used as an index into the array of the anchor point. E.g. 0.4324 means index 432.
Store a set of points in each slot of the three arrays.
Recalculate the distance between the regular point and each anchor point every time the point is updated.
Move the point between sets in the arrays during the update.
These structures give you an easy way to find all nearby points: it is the intersection of three sets, chosen based on the distance between the point and the anchor points.
In short, it is the intersection of three spheres. You may need to apply additional filtering to the result if you want to trim the corners of this intersection.

Consider using the Barnes-Hut algorithm or something similar. A simulation in 2D would use a quadtree data structure to store particles, and a 3D simulation would use an octree.
The benefit of using a tree structure is that it stores the particles in a way that nearby particles can be found quickly by traversing the tree, and far-away particles are in traversal paths that can be ignored.
Wikipedia has a good description of the algorithm:
The Barnes–Hut tree
In a three-dimensional n-body simulation, the Barnes–Hut algorithm recursively divides the n bodies into groups by storing them in an octree (or a quad-tree in a 2D simulation). Each node in this tree represents a region of the three-dimensional space. The topmost node represents the whole space, and its eight children represent the eight octants of the space. The space is recursively subdivided into octants until each subdivision contains 0 or 1 bodies (some regions do not have bodies in all of their octants). There are two types of nodes in the octree: internal and external nodes. An external node has no children and is either empty or represents a single body. Each internal node represents the group of bodies beneath it, and stores the center of mass and the total mass of all its children bodies.
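As a rough illustration of the tree described in the quote, here is a minimal octree insertion sketch. The class `OctreeNode` and its fields are hypothetical names, and the sketch assumes all bodies have distinct positions (two bodies at the same point would subdivide forever).

```python
class OctreeNode:
    """Cubic region that is empty, holds one body, or has eight children."""

    def __init__(self, center, half):
        self.center = center          # (x, y, z) of the cube center
        self.half = half              # half of the cube's side length
        self.body = None              # (position, mass) for an external node
        self.children = None          # list of 8 nodes for an internal node
        self.mass = 0.0               # total mass of all bodies below this node
        self.com = (0.0, 0.0, 0.0)    # center of mass of those bodies

    def _child_index(self, pos):
        # One bit per axis: set when pos lies on the upper side of the center.
        return sum(1 << k for k in range(3) if pos[k] >= self.center[k])

    def _subdivide(self):
        q = self.half / 2
        self.children = [
            OctreeNode(
                tuple(self.center[k] + (q if (i >> k) & 1 else -q) for k in range(3)),
                q,
            )
            for i in range(8)
        ]

    def insert(self, pos, mass):
        if self.children is None and self.body is None:
            self.body = (pos, mass)   # empty external node: store the body here
        else:
            if self.children is None:
                # Occupied external node: split and push the old body down.
                old, self.body = self.body, None
                self._subdivide()
                self.children[self._child_index(old[0])].insert(*old)
            self.children[self._child_index(pos)].insert(pos, mass)
        # Keep the aggregate mass and center of mass up to date.
        total = self.mass + mass
        self.com = tuple((self.com[k] * self.mass + pos[k] * mass) / total for k in range(3))
        self.mass = total
```

A neighbour search would then descend only into children whose cube lies within the cutoff distance of the query point.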

Minimum amount of rectangles from multi-colored grid

I've been working for some time in an XNA roguelike game and I can't get my head around the following problem: developing an algorithm to divide a matrix of non-binary values into the fewest rectangles grouping these values.
Example: given the following matrix
  01234567
0 ---##*##
1 ---##*##
2 --------
The algorithm should return:
3x3 rectangle of '-'s starting at (0,0)
2x2 rectangle of '#'s starting at (3, 0)
1x2 rectangle of '*'s starting at (5, 0)
2x2 rectangle of '#'s starting at (6, 0)
5x1 rectangle of '-'s starting at (3, 2)
Why am I doing this: I have a pretty big dungeon type with a size of approximately 500x500. If I were to individually call the "Draw" method for each tile's Sprite, my FPS would be far too low. It is possible to optimize this process by grouping similar-textured tiles and applying texture repetition to them, which would dramatically decrease the number of GPU draw calls. For example, if my map were the previous matrix, instead of calling draw 24 times (once per tile), I'd call it only 5 times.
I've looked at some algorithms which can give you the biggest rectangle of a type inside a given binary matrix, but that doesn't fit my problem.
Thanks in advance!

You can use breadth first searches to separate each area of different tile type.
Picking a partitioning within the individual shapes is an NP-hard problem (see https://en.wikipedia.org/wiki/Graph_partition), so you can't find an efficient solution that guarantees the minimum number of rectangles. However if you don't mind an extra rectangle or two for each shape and your shapes are relatively small, you can come up with algorithms that split the shape into a number of rectangles close to the minimum.
An off the top of my head guess for something that could potentially work would be to pick a tile with the maximum connecting tiles and start growing a rectangle from it using a recursive algorithm to maximize the size. Remove the resulting rectangle from the shape, then repeat until there are no more tiles not included in a rectangle. Again, this won't produce perfect results, there are graphs on which this will return with more than the minimum amount of rectangles, but it's an easy to implement ballpark solution. With a little more effort I'm sure you will be able to find better heuristics to use and get better results too.
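A simpler greedy than the "most connected tile" heuristic above can be sketched as follows: scan the grid in reading order, and from each uncovered cell grow a rectangle to the right, then downward, as long as every cell matches. This is only a ballpark heuristic with no minimality guarantee; the function name `greedy_rectangles` is made up for illustration.

```python
def greedy_rectangles(grid):
    """Cover the grid with same-valued rectangles; returns (value, x, y, w, h)."""
    rows, cols = len(grid), len(grid[0])
    covered = [[False] * cols for _ in range(rows)]
    rects = []
    for y in range(rows):
        for x in range(cols):
            if covered[y][x]:
                continue
            kind = grid[y][x]
            # Grow to the right while the cells match and are still free.
            w = 1
            while x + w < cols and not covered[y][x + w] and grid[y][x + w] == kind:
                w += 1
            # Grow downward while the whole next row segment matches.
            h = 1
            while y + h < rows and all(
                not covered[y + h][x + i] and grid[y + h][x + i] == kind
                for i in range(w)
            ):
                h += 1
            for yy in range(y, y + h):
                for xx in range(x, x + w):
                    covered[yy][xx] = True
            rects.append((kind, x, y, w, h))
    return rects


for rect in greedy_rectangles(["---##*##", "---##*##", "--------"]):
    print(rect)
```

On the example matrix from the question this yields the same five rectangles the question lists, but other inputs can produce more rectangles than the minimum.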

One possible building block is a routine to check, given two points, whether the rectangle formed by using those points as opposite corners is all of the same type. I think that a fast (but unreliable) means of testing this can be based on mapping each type to a large random number, and then working out the sum of the numbers within a rectangle modulo a large prime. Take one of the numbers within the rectangle. If the sum of the numbers within the rectangle is the size of the rectangle times the one number sampled, assume that the all of the numbers in the rectangle are the same.
In one dimension we can work out all of the cumulative sums a, a+b, a+b+c, a+b+c+d,... in time O(N) and then, for any two points, work out the sum for the interval between them by subtracting cumulative sums: b+c+d = a+b+c+d - a. In two dimensions, we can use cumulative sums to work out, for each point, the sum of all of the numbers from positions which have x and y co-ordinates no greater than the (x, y) coordinate of that position. For any rectangle we can work out the sum of the numbers within that rectangle by working out A-B-C+D where A,B,C,D are two-dimensional cumulative sums.
So with pre-processing O(N) we can work out a table which allows us to compute the sum of the numbers within a rectangle specified by its opposite corners in time O(1). Unless we are very unlucky, checking this sum against the size of the rectangle times a number extracted from within the rectangle will tell us whether the rectangle is all of the same type.
Based on this, repeatedly start with a random point not covered. Take a point just to its left and move that point left as long as the interval between the two points is of the same type. Then move that point up as long as the rectangle formed by the two points is of the same type. Now move the first point to the right and down as long as the rectangle formed by the two points is of the same type. Now you think you have a large rectangle covering the original point. Check it. In the unlikely event that it is not all of the same type, add that rectangle to a list of "fooled me" rectangles you can check against in future and try again. If it is all of the same type, count that as one extracted rectangle and mark all of the points in it as covered. Continue until all points are covered.
This is a greedy algorithm that makes no attempt at producing the optimal solution, but it should be reasonably fast - the most expensive part is checking that the rectangle really is all of the same type, and - assuming you pass that test - each cell checked is also a cell covered so the total cost for the whole process should be O(N) where N is the number of cells of input data.
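The cumulative-sum test described above can be sketched like this. The names `build_prefix` and `probably_uniform`, the choice of prime, and the fixed seed are assumptions for illustration; as the answer says, the test is probabilistic, so a positive result still needs the final exact check.

```python
import random

PRIME = (1 << 61) - 1  # a large prime modulus (assumed choice)


def build_prefix(grid, seed=0):
    """Give each tile type a random code and build 2D cumulative sums mod PRIME."""
    rng = random.Random(seed)
    code = {}
    rows, cols = len(grid), len(grid[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        for x in range(cols):
            t = grid[y][x]
            if t not in code:
                code[t] = rng.randrange(1, PRIME)
            prefix[y + 1][x + 1] = (
                code[t] + prefix[y][x + 1] + prefix[y + 1][x] - prefix[y][x]
            ) % PRIME
    return prefix, code


def probably_uniform(grid, prefix, code, x0, y0, x1, y1):
    """O(1) check that the rectangle with inclusive corners holds one type."""
    total = (
        prefix[y1 + 1][x1 + 1] - prefix[y0][x1 + 1]
        - prefix[y1 + 1][x0] + prefix[y0][x0]
    ) % PRIME
    area = (x1 - x0 + 1) * (y1 - y0 + 1)
    # Uniform rectangles always pass; mixed ones pass only by rare coincidence.
    return total == (area * code[grid[y0][x0]]) % PRIME
```

Building the table is one pass over the grid, and each rectangle query afterwards touches only four table entries.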

How to randomly place rectangle with minimal overlap and nice dispersion

I'd like to place an arbitrary number of rectangles into a fixed size parent such that they are:
Randomly placed
Randomly rotated to within a given degree range
Nicely dispersed around the center point (not all clumped into one corner)
Not overlapping unless necessary due to lack of space
With minimum overlap when it's necessary
To help you visualize the problem, I would like to scatter images inside a window for the user to choose one.
Googling has led me to various algorithms for packing etc., but nothing really addresses my requirements.
Does anyone have any good ideas?

It shouldn't be much more complicated than:
Place new rectangle in random location with random rotation. Simply using three random values (x, y, r) should do it, unless you want random size as well (in which case you'd need w and h too). This shouldn't give any corner-clumping (though random is random).
For every rectangle already placed, check for collisions. Here's one way. Also check for collisions with the side of the window (if you don't want things to extend past the screen); putting four dummy rectangles around the border may be a cheap way to do this.
If there are any collisions, then there are two choices: either move the new rectangle to a new random location or move both the new rectangle and the blocking rectangle away from each other until they no longer touch. Both have yays and nays - moving the new one only is faster and easier though it might not ever find a place where it fits if the page is really full; moving both is almost sure to be successful but takes longer and may result in chain-reaction collisions that would all have to be sorted out recursively.
In any case you'll want to try and keep the number of rectangles small, because the number of comparisons can quickly get really big. Using a short-circuit (such as "if they're halfway across the screen then don't bother looking closely") may help but isn't guaranteed.
EDIT: Okay, so requirement #5. Chances are that the push-both-rectangles-until-they-no-longer-collide-recursively method of adding new rectangles will end up being the simplest way to do this - just cut off the loop after a few thousand iterations and everything will have attempted to move as far away from everything else as possible, leaving minimum overlap. Or, leave the method running in a separate thread so that the user can see them spreading out as more are added (also stopping it from looking like it's locking up while it's thinking), stopping once no rectangle has moved more than X units in one iteration.
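The "try a new random location" option can be sketched as below. To keep the collision test simple, this sketch replaces each rotated rectangle by its bounding circle (which is conservative for any rotation), and when no free spot is found it keeps the candidate with the least overlap. That fallback, the function name `scatter`, and all parameter names are assumptions, not part of the answer above.

```python
import math
import random


def scatter(count, w, h, width, height, max_angle=15.0, tries=100, seed=None):
    """Place `count` w-by-h rectangles; returns (x, y, angle_degrees) per rectangle."""
    rng = random.Random(seed)
    r = math.hypot(w, h) / 2  # bounding-circle radius, valid for any rotation
    placed = []
    for _ in range(count):
        best, best_overlap = None, math.inf
        for _ in range(tries):
            # Keep the bounding circle fully inside the parent area.
            x = rng.uniform(r, width - r)
            y = rng.uniform(r, height - r)
            overlap = sum(
                max(0.0, 2 * r - math.hypot(x - px, y - py))
                for px, py, _angle in placed
            )
            if overlap < best_overlap:
                best, best_overlap = (x, y), overlap
            if overlap == 0.0:
                break  # found a collision-free spot
        placed.append((best[0], best[1], rng.uniform(-max_angle, max_angle)))
    return placed
```

Because the bounding circle is larger than the rectangle, this rejects some placements that would actually fit; an exact rotated-rectangle test would pack tighter at the cost of more code.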

How about this? Consider the rectangles you have to place as shaped, charged particles which repel one another and are also repelled by the walls of the container. You could start by (randomly) distributing them (and giving them random angles) in the container, then running a simulation where each "particle" moves in response to the forces acting on it (angles will change according to the turning moments of these forces). Stop when you hit a configuration within your tolerances.
You could simplify the calculations by treating each rectangle as an ellipse, which can be further simplified by treating each ellipse as a circle which has undergone scaling and rotation.
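A minimal sketch of the repulsion idea, simplified further than suggested above: each rectangle is treated as a point charge (no shape, no turning moments), with inverse-square repulsion between points and from the four walls. The function name `repel`, the constants `k` and `max_step`, and the fixed step count are assumptions for illustration.

```python
import math


def repel(centers, width, height, steps=100, k=100.0, max_step=2.0):
    """Move 2D points apart under inverse-square repulsion inside a box."""
    pts = [list(c) for c in centers]
    eps = 1e-3  # avoids division by zero right at a wall
    for _ in range(steps):
        moves = []
        for i, (xi, yi) in enumerate(pts):
            fx = fy = 0.0
            for j, (xj, yj) in enumerate(pts):
                if i == j:
                    continue
                dx, dy = xi - xj, yi - yj
                dist = math.hypot(dx, dy) or 1e-9
                f = k / (dist * dist)
                fx += f * dx / dist
                fy += f * dy / dist
            # Walls push the point back toward the interior.
            fx += k / max(xi, eps) ** 2 - k / max(width - xi, eps) ** 2
            fy += k / max(yi, eps) ** 2 - k / max(height - yi, eps) ** 2
            # Cap the displacement so one step cannot overshoot wildly.
            mag = math.hypot(fx, fy)
            if mag > max_step:
                fx, fy = fx * max_step / mag, fy * max_step / mag
            moves.append((fx, fy))
        for p, (mx, my) in zip(pts, moves):
            p[0] = min(max(p[0] + mx, 0.0), width)
            p[1] = min(max(p[1] + my, 0.0), height)
    return [tuple(p) for p in pts]
```

In practice you would stop once the largest move in a step drops below a tolerance rather than after a fixed number of steps.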

I don't understand requirement 2. Are you saying that the rectangles themselves are rotated around the rectangle center point, or that the rectangles only cover part of the 360 degree circle around the center point of all the rectangles?
I'm not sure that random is the way to go.
Simply divide 360 degrees by the number of rectangles desired. That's the number of degrees to offset each rectangle as it's being drawn. This should cover requirements 3, 4, and 5.

Randomly and efficiently filling space with shapes

What is the most efficient way to randomly fill a space with as many non-overlapping shapes as possible? In my specific case, I'm filling a circle with circles. I'm randomly placing circles until either a certain percentage of the outer circle is filled OR a certain number of placements have failed (i.e. were placed in a position that overlapped an existing circle). This is pretty slow, and often leaves empty spaces unless I allow a huge number of failures.
So, is there some other type of filling algorithm I can use to quickly fill as much space as possible, but still look random?

Issue you are running into
You are running into the Coupon collector's problem because you are using a technique of Rejection sampling.
You are also making strong assumptions about what a "random filling" is. Your algorithm will leave large gaps between circles; is this what you mean by "random"? Nevertheless it is a perfectly valid definition, and I approve of it.
Solution
To adapt your current "random filling" to avoid the rejection sampling coupon-collector's issue, merely divide the space you are filling into a grid. For example if your circles are of radius 1, divide the larger circle into a grid of 1/sqrt(2)-width blocks. When it becomes "impossible" to fill a gridbox, ignore that gridbox when you pick new points. Problem solved!
Possible dangers
You have to be careful how you code this however! Possible dangers:
If you do something like if (random point in invalid grid){ generateAnotherPoint() } then you ignore the benefit / core idea of this optimization.
If you do something like pickARandomValidGridbox() then you will slightly reduce the probability of making circles near the edge of the larger circle (though this may be fine if you're doing this for a graphics art project and not for a scientific or mathematical project); however if you make the grid size 1/sqrt(2) times the radius of the circle, you will not run into this problem because it will be impossible to draw blocks at the edge of the large circle, and thus you can ignore all gridboxes at the edge.
Implementation
Thus the generalization of your method to avoid the coupon-collector's problem is as follows:
Inputs: large circle coordinates/radius(R), small circle radius(r)
Output: set of coordinates of all the small circles
Algorithm:
    divide your LargeCircle into a grid of r/sqrt(2)
    ValidBoxes = {set of all gridboxes that lie entirely within LargeCircle}
    SmallCircles = {empty set}
    until ValidBoxes is empty:
        pick a random gridbox Box from ValidBoxes
        pick a random point inside Box to be center of small circle C
        check neighboring gridboxes for other circles which may overlap*
        if there is no overlap:
            add C to SmallCircles
            remove the box from ValidBoxes # possible because grid is small
        else if there is an overlap:
            increase the Box.failcount
            if Box.failcount > MAX_PERGRIDBOX_FAIL_COUNT:
                remove the box from ValidBoxes
    return SmallCircles
(*) This step is also an important optimization, which I can only assume you do not already have. Without it, your doesThisCircleOverlapAnother(...) function is incredibly inefficient at O(N) per query, which will make filling in circles nearly impossible for large ratios R>>r.
This is the exact generalization of your algorithm to avoid the slowness, while still retaining the elegant randomness of it.
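A runnable sketch of the pseudocode above, under a few assumptions: the large circle is centred at the origin, a box counts as valid only if it lies within radius R - r (so small circles do not poke outside), and the names `fill_circle` and `max_fails` are made up for illustration.

```python
import math
import random


def fill_circle(R, r, max_fails=20, seed=None):
    """Randomly pack radius-r circles into a radius-R circle centred at (0, 0)."""
    rng = random.Random(seed)
    cell = r / math.sqrt(2)  # box diagonal is r, so a box holds at most one center
    n = math.ceil(R / cell)
    limit = R - r            # assumption: keep whole small circles inside
    valid = [
        (i, j)
        for i in range(-n, n)
        for j in range(-n, n)
        if all(math.hypot(x * cell, y * cell) <= limit
               for x in (i, i + 1) for y in (j, j + 1))
    ]
    fails = dict.fromkeys(valid, 0)
    occupied = {}  # box -> center of the circle placed in it
    circles = []
    while valid:
        k = rng.randrange(len(valid))
        i, j = valid[k]
        center = ((i + rng.random()) * cell, (j + rng.random()) * cell)
        # Centers closer than 2r are at most 3 boxes apart along each axis.
        clash = any(
            (i + di, j + dj) in occupied
            and math.dist(center, occupied[(i + di, j + dj)]) < 2 * r
            for di in range(-3, 4)
            for dj in range(-3, 4)
        )
        if not clash:
            occupied[(i, j)] = center
            circles.append(center)
            done = True
        else:
            fails[(i, j)] += 1
            done = fails[(i, j)] > max_fails
        if done:
            # Swap-remove keeps the random pick O(1).
            valid[k] = valid[-1]
            valid.pop()
    return circles
```

Every loop iteration either retires a box or increases its fail count, so the loop always terminates.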
Generalization to larger irregular features
edit: Since you've commented that this is for a game and you are interested in irregular shapes, you can generalize this as follows. For any small irregular shape, enclose it in a circle that represents how far you want it to be from things. Your grid can be the size of the smallest terrain feature. Larger features can encompass 1x2 or 2x2 or 3x2 or 3x3 etc. contiguous blocks. Note that many games with features that span large distances (mountains) and small distances (torches) often require grids which are recursively split (i.e. some blocks are split into further 2x2 or 2x2x2 subblocks), generating a tree structure. This structure with extensive bookkeeping will allow you to randomly place the contiguous blocks, however it requires a lot of coding. What you can do however is use the circle-grid algorithm to place the larger features first (when there's a lot of space to work with on the map and you can just check adjacent gridboxes for a collision without running into the coupon-collector's problem), then place the smaller features. If you can place your features in this order, this requires almost no extra coding besides checking neighboring gridboxes for collisions when you place a 1x2/3x3/etc. group.

One way to do this that produces interesting looking results is
create an empty NxM grid
create an empty has-open-neighbors set
for i = 1 to NumberOfRegions
    pick a random point in the grid
    assign that grid point a (terrain) type
    add the point to the has-open-neighbors set
while has-open-neighbors is not empty
    foreach point in has-open-neighbors
        get neighbor-points as the immediate neighbors of point
            that don't have an assigned terrain type in the grid
        if none
            remove point from has-open-neighbors
        else
            pick a random neighbor-point from neighbor-points
            assign its grid location the same (terrain) type as point
            add neighbor-point to the has-open-neighbors set
When done, has-open-neighbors will be empty and the grid will have been populated with at most NumberOfRegions regions (some regions with the same terrain type may be adjacent and so will combine to form a single region).
Sample output using this algorithm with 30 points, 14 terrain types, and a 200x200 pixel world:
Edit: tried to clarify the algorithm.
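The region-growing pseudocode above could look like this in Python. This is a sketch assuming 4-connected neighbours and integer terrain types; the function name `grow_regions` and its parameters are made up for illustration.

```python
import random


def grow_regions(n_rows, n_cols, n_regions, n_types, seed=None):
    """Fill a grid by growing randomly seeded regions until no cell is empty."""
    rng = random.Random(seed)
    grid = [[None] * n_cols for _ in range(n_rows)]
    frontier = set()  # the has-open-neighbors set
    for _ in range(n_regions):
        y, x = rng.randrange(n_rows), rng.randrange(n_cols)
        grid[y][x] = rng.randrange(n_types)
        frontier.add((y, x))
    while frontier:
        # Iterate over a snapshot because the set changes inside the loop.
        for y, x in list(frontier):
            free = [
                (ny, nx)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < n_rows and 0 <= nx < n_cols and grid[ny][nx] is None
            ]
            if not free:
                frontier.discard((y, x))
            else:
                ny, nx = rng.choice(free)
                grid[ny][nx] = grid[y][x]  # spread this point's terrain type
                frontier.add((ny, nx))
    return grid
```

Each pass either retires a frontier point or claims a new cell, so the loop ends once every cell has a type.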

How about using a 2-step process:
Choose a bunch of n points randomly -- these will become the centres of the circles.
Determine the radii of these circles so that they do not overlap.
For step 2, for each circle centre you need to know the distance to its nearest neighbour. (This can be computed for all points in O(n^2) time using brute force, although it may be that faster algorithms exist for points in the plane.) Then simply divide that distance by 2 to get a safe radius. (You can also shrink it further, either by a fixed amount or by an amount proportional to the radius, to ensure that no circles will be touching.)
To see that this works, consider any point p and its nearest neighbour q, which is some distance d from p. If p is also q's nearest neighbour, then both points will get circles with radius d/2, which will therefore be touching; OTOH, if q has a different nearest neighbour, it must be at distance d' < d, so the circle centred at q will be even smaller. So either way, the 2 circles will not overlap.
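Step 2 can be sketched with the brute-force O(n^2) nearest-neighbour search mentioned above. The function name `safe_radii` and the `shrink` factor are made up for illustration, and the sketch assumes at least two distinct points.

```python
import math


def safe_radii(points, shrink=1.0):
    """Radius for each center: half the distance to its nearest neighbour."""
    radii = []
    for i, p in enumerate(points):
        nearest = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # shrink < 1 leaves a gap so that circles never touch.
        radii.append(shrink * nearest / 2)
    return radii


print(safe_radii([(0, 0), (2, 0), (10, 0)]))
```

For large n, a spatial grid or k-d tree would avoid the quadratic scan.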

My idea would be to start out with a compact grid layout. Then take each circle and perturb it in some random direction. The distance in which you perturb it can also be chosen at random (just make sure that the distance doesn't make it overlap another circle).
This is just an idea and I'm sure there are a number of ways you could modify it and improve upon it.
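One way to make the perturbation safe without any collision test, as a sketch: if each circle moves by at most half of the free gap between neighbouring grid circles, no two circles can overlap. The function name `jittered_grid` and its parameters are assumptions for illustration.

```python
import math
import random


def jittered_grid(nx, ny, spacing, radius, seed=None):
    """Centers on an nx-by-ny grid, each displaced randomly without overlap."""
    rng = random.Random(seed)
    # Two neighbours start `spacing` apart; each may move at most `slack`,
    # so their distance stays >= spacing - 2 * slack = 2 * radius.
    slack = max(0.0, (spacing - 2 * radius) / 2)
    centers = []
    for i in range(nx):
        for j in range(ny):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            dist = rng.uniform(0.0, slack)
            centers.append((i * spacing + dist * math.cos(angle),
                            j * spacing + dist * math.sin(angle)))
    return centers
```

The result still shows traces of the grid; a looser grid (larger `spacing`) looks more random but fills less of the space.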

Efficient way to check if N number of (x,y) coordinates are in K number of rectangles

Is there an efficient way to see if N number of (x,y) points are inside K number of rectangles? Right now I am doing a brute force approach and looping through all points and rectangles, but it's taking about 2 minutes and 30 seconds with 200,000 points and 44 rectangles.
I am working with Google maps and creating a program to check if points are close to a route on a map. I calculate multiple Rectangles and Circles along the path and test to see if the existing points lie within these rectangles and circles.
1. The rectangles can overlap depending on the nature of the route.
2. The point only has to be in ONE of the rectangles.
3. If the point is on the edge of the rectangle I would like to make it count as inside the rectangle (but if it's easier to not count then I won't count it).
4. The rectangles depend on the area I want to search off the route. Typically they will be 2 miles High (1 mile each direction from point) and the distance from point1 to point2 Wide.

In theory at the very best you'll have to iterate through all 200,000 points -- and in the worst case you'll have to check all of those points against all 44 rectangles (which is what you're doing right now).
Since you know you'll have to loop through all 200,000 points the best you can do is attempt to not have to check all 44 rectangles.
In order to do this you'll have to do some calculations on the rectangles: find the closest two rectangles and form a larger rectangle which encloses both of them (a cluster, if you will). Then find the next closest rectangle to the rectangle you just formed and form another cluster rectangle. Keep doing this until you enclose all of the rectangles (you'll end up having 43 cluster rectangles).
Now loop through the points and check each against the largest rectangle. If the point falls within that rectangle, check the next largest rectangle. If it doesn't fall within that one, then you only need to check whether it falls within the rectangle which was used to form it. If it doesn't fall within that rectangle either, then you can move on to the next point, because that point doesn't fall within any rectangle (and you discovered this with only 3 checks).
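A much simpler one-level version of this idea, as a sketch: build a single bounding rectangle around all rectangles and use it to reject far-away points before testing the individual rectangles. It assumes axis-aligned rectangles given as (xmin, ymin, xmax, ymax) and counts edge points as inside; the function name `points_in_any_rect` is made up for illustration.

```python
def points_in_any_rect(points, rects):
    """For each point, report whether it lies in at least one rectangle."""
    # Bounding box that encloses every rectangle.
    bx0 = min(r[0] for r in rects)
    by0 = min(r[1] for r in rects)
    bx1 = max(r[2] for r in rects)
    by1 = max(r[3] for r in rects)
    results = []
    for x, y in points:
        # Cheap rejection first; any() stops at the first matching rectangle.
        hit = (
            bx0 <= x <= bx1
            and by0 <= y <= by1
            and any(x0 <= x <= x1 and y0 <= y <= y1 for x0, y0, x1, y1 in rects)
        )
        results.append(hit)
    return results


print(points_in_any_rect([(1, 1), (3, 3), (5, 6), (10, 10)],
                         [(0, 0, 2, 2), (5, 5, 6, 6)]))
```

The full scheme from the answer would nest such boxes in a hierarchy so that points inside the outer box are also rejected after a few checks.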

Here are a few possible ideas:
Fuzzy match - if your points don't have to be 100% accurately marked as being within a particular rectangle, you could write an algorithm that makes a "best guess" that is more efficient, but sacrifices being 100% accurate
Fuzzy first, accurate after - give an approximate answer quickly, maybe by just calculating the distance between a given point, and the top-left corner of a rectangle, or the center of a circle. This will give an approximate answer which may not be 100% accurate, but then allow you to asynchronously continue the calculation, to refine it, and update the display after some time with the 100% accurate data
Group the points - when the point is created, have it "register" itself with a rectangle, basically pre-calculating whether or not a point is in a given rectangle, if you can.
Pre-calculate+cache - and then cache the list of rectangles a particular point fits into, in the point itself. Then it becomes a simple lookup instead of having to re-calculate it each time
Asynchronous loading - can you start displaying the answers as it's being calculated? If it takes 2.5 minutes to do the whole batch, can you show the results in 1,000 point chunks, as you calculate them? This way the user quickly begins to get some feedback while the calculation is finishing the work. At 2.5 minutes, that's 150 seconds. If you could deliver results in 1,000-point chunks (about 1/200th of the data at a time), you might be able to update the point map once every second with results as they become available.
