Algorithm to evenly distribute points in N-dimensional space

I need to (roughly) evenly distribute points in space, but the dimensionality isn't fixed.
I've seen the Fibonacci Sphere algorithm, but as it uses sin+cos for x,z it seems it's only suited for 3D space. I've also seen the sunflower spiral algorithm, but it similarly is limited to 2D.
Is there a general algorithm that takes
a number of points
a number of dimensions
and spreads points throughout?

We can fill your space with k^n n-dimensional hypercubes by dividing each dimension into k equally-sized regions.
Given r points and n dimensions, we want r = k^n, so k = r^(1/n).
E.g., for 1000 points and 2 dimensions we'd want k = 1000^(1/2) = 31.6 regions per dimension, but for 3 dimensions we'd want k = 1000^(1/3) = 10 regions per dimension.
For non-integer values, I'd recommend rounding up (so 31.6 becomes 32). This will give you a few more cells than points. You can either select which cells don't get points at random, or distribute them towards the edges or however you like.
Once you have the cells that should have points, assign one point to a random location within each cell: for each dimension, choose a float between 0 and 1 as the point's position along that dimension's axis segment within the cell.
Since the cells are perfectly distributed (except possibly a few extra empty cells) and there is one point per cell, the points are reasonably distributed in space while still being random.
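For concreteness, here is a minimal sketch of this grid-jitter idea (the function name and the use of NumPy are my own choices, not part of the answer above):

```python
import numpy as np

def jittered_grid(r, n, rng=None):
    """Spread roughly r points over the unit n-cube: one random point per grid cell."""
    rng = np.random.default_rng() if rng is None else rng
    k = int(np.ceil(r ** (1.0 / n)))          # cells per dimension, rounded up
    # Enumerate all k^n cell indices, then keep a random subset of exactly r cells.
    cells = np.stack(np.meshgrid(*[np.arange(k)] * n, indexing="ij"),
                     axis=-1).reshape(-1, n)
    chosen = cells[rng.choice(len(cells), size=r, replace=False)]
    # One uniform point inside each chosen cell.
    return (chosen + rng.random((r, n))) / k

points = jittered_grid(1000, 3)   # 1000 points in the unit cube
```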

Related

Finding random k points at least d apart in 3D confined space

For my simulation purposes, I want to generate a randomly distributed number k of spheres (having the same radii) in a confined 3D space (inside a rectangular box), where k is on the order of 1000. Those spheres should not impinge on one another.
So, I want to generate random k points in a 3D space at least d distance away from one another; considering the number of points and the frequency at which I need those points for simulation, I don't want to apply brute force; I'm looking for some efficient algorithms achieving this.
How about just starting with some regular tessellation of the space (i.e. some primitive 3d lattice) and putting a single point somewhere in each tile? You'd then only need to check a small number of neighboring tiles for proximity.
To get a more statistically uniform, i.e. less regular, set of points, you could:
perturb points in space
generate an overly dense lattice and reject some points
"warp" the space so that the lattice was more dense in certain areas
You could perturb the points sequentially, giving you a Monte Carlo chain over their coordinates, and potentially saving work elsewhere. Presumably you could tailor this so that the equilibrium distribution was what you wanted.
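As a concrete illustration of the lattice idea, here is a small Python sketch. It uses my own variant of the perturbation: each point is jittered only within the central region of its cubic cell, at least d/2 away from every cell face, which guarantees the pairwise distance d outright, so no neighbour check is needed. The cell side s and the NumPy usage are assumptions, not from the answer.

```python
import numpy as np

def lattice_points(box, d, s, rng=None):
    """One jittered point per cubic cell of side s (> d) inside a box of dimensions `box`.
    Keeping every point at least d/2 away from its cell's faces guarantees that any two
    points end up at least d apart, without any explicit neighbour checks."""
    assert s > d
    rng = np.random.default_rng() if rng is None else rng
    counts = (np.asarray(box) // s).astype(int)              # number of cells per axis
    idx = np.stack(np.meshgrid(*[np.arange(c) for c in counts], indexing="ij"),
                   axis=-1).reshape(-1, 3)                   # integer index of every cell
    jitter = rng.uniform(d / 2, s - d / 2, size=idx.shape)   # stay clear of the cell faces
    return idx * s + jitter

pts = lattice_points(box=(100.0, 80.0, 60.0), d=2.0, s=3.0)
```

If you need exactly k points, choose s so that the grid has at least k cells and keep a random subset of the generated points.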

Minimum amount of rectangles from multi-colored grid

I've been working for some time on an XNA roguelike game and I can't get my head around the following problem: developing an algorithm that divides a matrix of non-binary values into the fewest rectangles, each grouping equal values.
Example: given the following matrix
  01234567
0 ---##*##
1 ---##*##
2 --------
The algorithm should return:
3x3 rectangle of '-'s starting at (0,0)
2x2 rectangle of '#'s starting at (3, 0)
1x2 rectangle of '*'s starting at (5, 0)
2x2 rectangle of '#'s starting at (6, 0)
5x1 rectangle of '-'s starting at (3, 2)
Why am I doing this: I've got a pretty big dungeon map with a size of approximately 500x500. If I were to individually call the "Draw" method for each tile's Sprite, my FPS would be far too low. It is possible to optimize this process by grouping similar-textured tiles and applying texture repetition to them, which would dramatically decrease the number of GPU draw calls. For example, if my map were the previous matrix, instead of calling draw 24 times, I'd call it only 5 times.
I've looked at some algorithms which can give you the biggest rectangle of a type inside a given binary matrix, but that doesn't fit my problem.
Thanks in advance!
You can use breadth first searches to separate each area of different tile type.
Picking a partitioning within the individual shapes is an NP-hard problem (see https://en.wikipedia.org/wiki/Graph_partition), so you can't find an efficient solution that guarantees the minimum number of rectangles. However, if you don't mind an extra rectangle or two for each shape and your shapes are relatively small, you can come up with algorithms that split the shape into a number of rectangles close to the minimum.
An off-the-top-of-my-head guess for something that could potentially work would be to pick a tile with the maximum number of connecting tiles and start growing a rectangle from it, using a recursive algorithm to maximize the size. Remove the resulting rectangle from the shape, then repeat until there are no more tiles not included in a rectangle. Again, this won't produce perfect results; there are shapes for which this will return more than the minimum number of rectangles, but it's an easy-to-implement ballpark solution. With a little more effort I'm sure you will be able to find better heuristics to use and get better results too.
One possible building block is a routine to check, given two points, whether the rectangle formed by using those points as opposite corners is all of the same type. I think that a fast (but unreliable) means of testing this can be based on mapping each type to a large random number, and then working out the sum of the numbers within a rectangle modulo a large prime. Take one of the numbers within the rectangle. If the sum of the numbers within the rectangle is the size of the rectangle times the one number sampled, assume that all of the numbers in the rectangle are the same.
In one dimension we can work out all of the cumulative sums a, a+b, a+b+c, a+b+c+d,... in time O(N) and then, for any two points, work out the sum for the interval between them by subtracting cumulative sums: b+c+d = a+b+c+d - a. In two dimensions, we can use cumulative sums to work out, for each point, the sum of all of the numbers from positions which have x and y co-ordinates no greater than the (x, y) coordinate of that position. For any rectangle we can work out the sum of the numbers within that rectangle by working out A-B-C+D where A,B,C,D are two-dimensional cumulative sums.
So with pre-processing O(N) we can work out a table which allows us to compute the sum of the numbers within a rectangle specified by its opposite corners in time O(1). Unless we are very unlucky, checking this sum against the size of the rectangle times a number extracted from within the rectangle will tell us whether the rectangle is all of the same type.
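A small Python sketch of this fingerprinting check (the function names and the choice of modulus are mine, not from the answer):

```python
import random

def build_prefix(grid, prime=(1 << 61) - 1):
    """Map each tile type to a random value and build 2D cumulative sums modulo a prime."""
    h, w = len(grid), len(grid[0])
    value = {t: random.randrange(prime) for row in grid for t in row}
    pre = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            pre[y + 1][x + 1] = (pre[y][x + 1] + pre[y + 1][x] - pre[y][x]
                                 + value[grid[y][x]]) % prime
    return pre, value, prime

def probably_uniform(grid, pre, value, prime, x0, y0, x1, y1):
    """True if the rectangle with opposite corners (x0, y0) and (x1, y1), inclusive, is
    (very likely) all one tile type: its sum must equal area times one sampled tile's value."""
    s = (pre[y1 + 1][x1 + 1] - pre[y0][x1 + 1] - pre[y1 + 1][x0] + pre[y0][x0]) % prime
    area = (x1 - x0 + 1) * (y1 - y0 + 1)
    return s == (area * value[grid[y0][x0]]) % prime
```

On the example matrix above (stored as a list of strings), probably_uniform(grid, pre, value, prime, 3, 0, 4, 1) would report the 2x2 block of '#'s as uniform.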
Based on this, repeatedly start with a random point not covered. Take a point just to its left and move that point left as long as the interval between the two points is of the same type. Then move that point up as long as the rectangle formed by the two points is of the same type. Now move the first point to the right and down as long as the rectangle formed by the two points is of the same type. Now you think you have a large rectangle covering the original point. Check it. In the unlikely event that it is not all of the same type, add that rectangle to a list of "fooled me" rectangles you can check against in future and try again. If it is all of the same type, count that as one extracted rectangle and mark all of the points in it as covered. Continue until all points are covered.
This is a greedy algorithm that makes no attempt at producing the optimal solution, but it should be reasonably fast - the most expensive part is checking that the rectangle really is all of the same type, and - assuming you pass that test - each cell checked is also a cell covered so the total cost for the whole process should be O(N) where N is the number of cells of input data.

Find clusters in 3D point data using a massively parallel algorithm

I have a large number of points in 3D space (x, y, z), represented as an array of structs of three floats. I also have access to a strong graphics card with CUDA capability. I want the following:
Divide the points in the array into clusters so that every point within a cluster has a maximum euclidean distance of X to at least one other point within the cluster.
Example in 2D:
The "brute force" way of doing this is of course to calculate the distance between every point and every other point, to see if any of the distances is below the threshold X, and if so mark those points as belonging to the same cluster. This is an O(n²) algorithm.
This can be done in parallel in CUDA of course with n² threads, but is there a better way?
The algorithm can be reduced to O(n) by using binning:
impose a 3D grid with spacing X, that is a 3D lattice (each cell of the lattice is a cubic bin);
assign each point in space to the corresponding bin (the bin that geometrically contains that point);
every time you need to evaluate the distances from one point, use only the points in that point's own bin and in the 26 neighbouring bins (3x3x3 = 27)
The points in the other bins are further than X, so you don't need to evaluate the distances at all.
In this way, assuming a constant density of points, you will only have to compute the distance for a number of point pairs proportional to the total number of points.
Assigning the points to the bins is O(n) as well.
If the points are not uniformly distributed, the bins can be smaller (and you must consider more than 26 neighbours to evaluate the distances) and possibly sparse.
This is a typical trick used for molecular dynamics, ray tracing, meshing, ... I know the term binning from molecular dynamics simulation: the name can change (link-cell; kd-trees use the same principle too, if in a more elaborate way), but the algorithm remains the same!
And, good news, the algorithm is well suited for parallel implementation.
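As a serial illustration of the binning idea (the CUDA parallelisation is not shown; the function name and the pair-list output format are my own choices):

```python
from collections import defaultdict
from itertools import product
import numpy as np

def neighbor_pairs(points, X):
    """Return index pairs (i, j) with Euclidean distance <= X, using cubic bins of side X.
    Each point is only compared against the points in its own bin and the 26 neighbouring bins."""
    bins = defaultdict(list)
    for i, p in enumerate(points):                        # points: array of shape (n, 3)
        bins[tuple((p // X).astype(int))].append(i)
    pairs = []
    for cell, members in bins.items():
        for off in product((-1, 0, 1), repeat=3):         # the 3x3x3 = 27 cells
            for j in bins.get(tuple(c + o for c, o in zip(cell, off)), ()):
                for i in members:
                    if i < j and np.linalg.norm(points[i] - points[j]) <= X:
                        pairs.append((i, j))
    return pairs
```

The clusters themselves can then be obtained from these pairs with a union-find or connected-components pass.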
refs:
https://en.wikipedia.org/wiki/Cell_lists

Generating random points with defined minimum and maximum distance

I need algorithm ideas for generating points in 2D space with defined minimum and maximum possible distances between points.
Basically, I want to find a good way to insert a point into a 2D space already filled with points, in such a manner that the point has a random location, but is also more than MINIMUM_DISTANCE_NUM and less than MAXIMUM_DISTANCE_NUM away from the nearest points.
I need it for a game, so it should be fast and should not rely on chance.
Store the set of points in a Kd tree. Generate a new point at random, then examine its nearest neighbors which can be looked up quickly in the Kd tree. If the point is accepted (i.e. MIN_DIST < nearest neighbor < MAX_DIST), then add it to the tree.
I should note that this will work best under conditions where the points are not too tightly packed, i.e. N * MIN² << L², where N is the number of points and L is the side length of the box you are putting them in. If this is not true, then most new points will be rejected. But think about it: in this limit, you're packing marbles into a box, and the arrangement can't be very "random" at all above a certain density.
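A rough sketch of this rejection scheme using SciPy (note that scipy.spatial.cKDTree is static, so this naive version simply rebuilds the tree after every accepted point; an incremental k-d tree, or periodic rebuilding, would be faster):

```python
import numpy as np
from scipy.spatial import cKDTree

def sample_points(n, box, min_d, max_d, max_tries=100_000, rng=None):
    """Sample up to n points in a box x box square so that each new point's nearest
    accepted neighbour lies strictly between min_d and max_d."""
    rng = np.random.default_rng() if rng is None else rng
    pts = [rng.uniform(0, box, size=2)]
    tree = cKDTree(pts)
    for _ in range(max_tries):
        if len(pts) == n:
            break
        cand = rng.uniform(0, box, size=2)
        dist, _ = tree.query(cand)               # distance to the nearest accepted point
        if min_d < dist < max_d:
            pts.append(cand)
            tree = cKDTree(pts)                  # rebuild; see the caveat above
    return np.array(pts)
```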
You could use a 2D regular grid of points (P0, P1, P2, P3, ..., P(m*n), where m is the width and n is the height of the grid).
Each point is associated with 1) a boolean saying whether this grid point has been used or not, and 2) a 'shift' from this grid position, to avoid too much regularity (or you can store the point+shift coordinates in your grid already).
Then when you need a new point, just pick a random point of your grid which has not been used, mark it as 'used' and use the point+shift in your game.
Depending on n, m, the width/height of your 2D space, and the number of points you're going to use, this could be just fine.
A good option for this is to use Poisson-Disc sampling. The algorithm is efficient (O(n)) and "produces points that are tightly-packed, but no closer to each other than a specified minimum distance, resulting in a more natural pattern".
https://www.jasondavies.com/poisson-disc/
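For reference, here is a compact Python sketch of Bridson's algorithm, the standard O(n) Poisson-disc sampler (parameter names and the 2D setup are my own; only the minimum-distance constraint is enforced, as in the description above):

```python
import numpy as np

def poisson_disc(width, height, r, k=30, rng=None):
    """Bridson's Poisson-disc sampling: points no closer than r, fairly tightly packed."""
    rng = np.random.default_rng() if rng is None else rng
    cell = r / np.sqrt(2)                                # each grid cell holds at most one point
    cols, rows = int(np.ceil(width / cell)), int(np.ceil(height / cell))
    grid = -np.ones((rows, cols), dtype=int)             # -1 = empty, else an index into pts
    pts, active = [rng.uniform((0.0, 0.0), (width, height))], [0]
    grid[int(pts[0][1] // cell), int(pts[0][0] // cell)] = 0

    def far_enough(p):
        cx, cy = int(p[0] // cell), int(p[1] // cell)
        for y in range(max(cy - 2, 0), min(cy + 3, rows)):
            for x in range(max(cx - 2, 0), min(cx + 3, cols)):
                j = grid[y, x]
                if j >= 0 and np.hypot(*(p - pts[j])) < r:
                    return False
        return True

    while active:
        i = active[rng.integers(len(active))]
        for _ in range(k):                               # try k candidates in the annulus [r, 2r)
            ang, rad = rng.uniform(0.0, 2.0 * np.pi), rng.uniform(r, 2.0 * r)
            p = pts[i] + rad * np.array([np.cos(ang), np.sin(ang)])
            if 0.0 <= p[0] < width and 0.0 <= p[1] < height and far_enough(p):
                grid[int(p[1] // cell), int(p[0] // cell)] = len(pts)
                pts.append(p)
                active.append(len(pts) - 1)
                break
        else:
            active.remove(i)                             # no valid candidate: retire this point
    return np.array(pts)
```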
How many points are you talking about? If you have an upper limit on the number of points, you can pre-compute an array of points and store it in a file.
You'll have all the hard computational work done before the map loads (so you can use any random point generating algorithm), and then you have a nice fast way to get your points.
That way you can generate a ton of different maps and then randomly select one of the maps to generate your points.
This only works if you have an upper limit and you can precompute the points before the game loads.

Best parallel method for calculating the integral of a 2D function

In a number-crunching program, I have a function which can be just 1 or 0 in three dimensions. I do not know the function in advance, but I need to know the total "surface" over which the function is equal to zero. As a similar problem, I could draw a rectangle over the 2D representation of the map of the United Kingdom. The function is equal to 0 at sea, and 1 on land. I need to know the total water surface. I wonder what the best parallel algorithm or method for doing this is.
I thought first about the following approach: a) divide the 2D map area into a rectangular grid. For each point that belongs to the center of each cell, check whether it is land or water. This can be done in parallel. At the end of the procedure I will have a matrix of ones and zeroes, and I will get the area with some precision. Now I want to increase this precision, so b) choose the cells that are in the border regions between zeroes and ones (what is the best criterion for doing this?), divide those cells again into successive cells, and repeat the process until one gets the desired accuracy. I guess that in this process the critical parameters are the grid size for each new stage, and how to store and check the cells that belong to the border area. Finally, the best method from the computational point of view is the one that performs the minimal number of checks in order to get the value of the total surface with the desired accuracy.
First of all, it looks like you are talking about a function of two variables, e.g. for two coordinates x and y you have f(x, y) = 0 if (x, y) belongs to the sea, and f(x, y) = 1 otherwise.
Having said that, you can use the following simple approach.
Split your rectangle into N subrectangles, where N is the number of your processors (or processor cores, or nodes in a cluster, etc.)
For each subrectangle use the Monte Carlo method to calculate the surface of the water.
Add the N values to calculate the total surface of the water.
Of course, you can use any other method to calculate the surface; Monte Carlo was just an example. But the idea is the same: subdivide your problem into N subproblems, solve them in parallel, then combine the results.
Update: For the Monte Carlo method the error estimate decreases as 1/sqrt(N) where N is the number of samples. For instance, to reduce the error by a factor of 2 requires a 4-fold increase in the number of sample points.
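A minimal sketch of the subdivide-and-Monte-Carlo idea using Python's multiprocessing (the function f below is a hypothetical stand-in for the unknown 0/1 map; the domain, sample count and strip decomposition are my own choices):

```python
import numpy as np
from multiprocessing import Pool

def f(x, y):
    """Hypothetical stand-in for the unknown map function: 0 = water, 1 = land."""
    return 0 if x * x + y * y < 1.0 else 1

def water_area(args):
    """Monte Carlo estimate of the water surface inside one subrectangle."""
    (x0, x1, y0, y1), samples, seed = args
    rng = np.random.default_rng(seed)
    xs, ys = rng.uniform(x0, x1, samples), rng.uniform(y0, y1, samples)
    water = sum(1 for x, y in zip(xs, ys) if f(x, y) == 0)
    return (x1 - x0) * (y1 - y0) * water / samples       # hit fraction times subrectangle area

if __name__ == "__main__":
    n_workers, samples = 4, 100_000
    # Split the domain [0, 2] x [0, 2] into one vertical strip per worker.
    strips = [((2.0 * i / n_workers, 2.0 * (i + 1) / n_workers, 0.0, 2.0), samples, i)
              for i in range(n_workers)]
    with Pool(n_workers) as pool:
        print(sum(pool.map(water_area, strips)))         # ~ pi/4 for this test function
```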
I believe that your approach is reasonable.
Choose the cells that are in the border regions between zeroes and ones (what is the best criterion for doing this?)
Each cell has 8 surrounding cells (3x3), or 24 surrounding cells (5x5). If at least one of the 9 or 25 cells contains land, and at least one of these cells contains water - increase the accuracy for the whole block of cells (3x3 or 5x5) and query again.
When the accuracy is good enough - instead of splitting, just add the land area to the sum.
Efficiency
Use a producer-consumer queue. Create n threads, where n equals the number of cores on your machine. All threads should do the same job:
Dequeue a geo-cell from the queue
If the area of the cell is still large - divide it into 3x3 or 5x5 cells, and for each of the split cells check for land/sea. If there is a mix - enqueue all these cells. If it is only land: just add the area. Only sea: do nothing.
To start, just divide the whole area into reasonably sized cells and enqueue all of them.
You can also optimize by not adding all the 9 or 25 cells when there is a mix, but examine the pattern (only top/bottom/left/right cells).
Edit:
There is a tradeoff between accuracy and performance: if the initial cell size is too large, you may miss small lakes or small islands. Therefore the optimization criterion should be: start with the largest cells possible that still assure enough accuracy.
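A serial sketch of the adaptive refinement described above, with a plain deque standing in for the threaded producer-consumer queue (f, min_size and the 3x3 split factor are assumptions; f(x, y) returns 1 for land and 0 for sea):

```python
from collections import deque

def land_area(f, x0, y0, size, min_size):
    """Adaptively refine mixed (coastline) cells and sum up the land area."""
    total = 0.0
    queue = deque([(x0, y0, size)])
    while queue:
        x, y, s = queue.popleft()
        step = s / 3.0
        cells = [(x + i * step, y + j * step) for i in range(3) for j in range(3)]
        values = [f(cx + step / 2, cy + step / 2) for cx, cy in cells]  # sample sub-cell centres
        if all(values):                            # all land: add the whole cell's area
            total += s * s
        elif any(values):                          # mixed land/sea cell
            if step <= min_size:                   # accuracy reached: count the land sub-cells
                total += step * step * sum(values)
            else:                                  # otherwise refine all nine sub-cells
                queue.extend((cx, cy, step) for cx, cy in cells)
        # all sea: contributes nothing
    return total
```

For example, land_area(lambda x, y: 1 if x * x + y * y < 1 else 0, -1.5, -1.5, 3.0, 0.01) should come out close to pi.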
