How can I get a random distribution that "clusters" objects?

I'm working on a game, and I want to place some objects randomly throughout the world. However, I want the objects to be "clustered" in clumps. Is there any random distribution that clusters like this? Or is there some other technique I could use?

Consider using a bivariate normal (a.k.a. Gaussian) distribution: generate independent normal values for the X and Y coordinates. A bivariate normal is denser toward its center and sparser farther out, so your choice of standard deviation determines how tight the clustering is - along each axis, about 68% of the items fall within 1 standard deviation of the center, 95% within 2 standard deviations, and nearly all within 3.
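As a minimal sketch, assuming Python/NumPy and arbitrary world bounds and spread:

    import numpy as np

    rng = np.random.default_rng()

    # hypothetical parameters: world size, spread, and point count are up to you
    center = rng.uniform(0, 1000, size=2)     # random cluster center in the world
    sigma = 25.0                              # smaller sigma -> tighter cluster
    points = rng.normal(loc=center, scale=sigma, size=(50, 2))

Repeating this for several random centers gives several clumps.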

Calculate rotation/translation matrix to match measurement points to nominal points

I have two matrices, one containing 3D coordinates that are nominal positions per a CAD model, and the other containing 3D coordinates of the actual positions measured with a CMM. Every nominal point has a corresponding measurement; in other words, the two matrices have the same dimensions. I'm not sure of the best way to fit the measured points to the nominal points. I need a way of calculating the translation and rotation to apply to all of the measured points that minimizes the distance between each nominal/measured pair of points while not exceeding the allowed tolerance on distance at any point. This is similar to Registration of point clouds, but differs in that each nominal/measured pair has its own tolerance/limit on how far apart the points are allowed to be; that limit is higher for some pairs and lower for others. I'm programming in .NET and have looked into the Point Cloud Library (PCL), OpenCV, Excel, and basic matrix operations as possible approaches.
This is a sample of the data
X Nom    Y Nom   Z Nom    X Meas   Y Meas  Z Meas   Upper Tol  Lower Tol
118.81   2.24    -14.14   118.68   2.24    -14.14   1.00       -0.50
118.72   1.71    -17.19   118.52   1.70    -17.16   1.00       -0.50
115.36   1.53    -24.19   115.14   1.52    -23.98   0.50       -0.50
108.73   1.20    -27.75   108.66   1.20    -27.41   0.20       -0.20
Below is the type of matrix I need to calculate in order to best fit the measured points to the nominal points. I will multiply it by the measured point matrix to best fit to the nominal point matrix.
Transformation
0.999897324 -0.000587540 0.014317661
0.000632725 0.999994834 -0.003151567
-0.014315736 0.003160302 0.999892530
-0.000990993 0.001672040 0.001672040
This is indeed a job for a rigid registration algorithm.
In order to handle your tolerances you have a couple of options:
Simple option: run rigid registration, then check afterwards whether the result is within tolerances.
Bit harder option: offset your points in the CAD where you have imbalanced tolerances;
the rest is the same as the previous option.
Hardest option: what you probably want to do is apply the offset as in the second option, and also add a weight function based on measured position and set tolerance. This weight function should affect the energy function in such a way that the individual residual vectors are larger where you have a small tolerance and smaller where you have a larger tolerance (see the sketch at the end of this answer).
So now, about implementation: for options 1 and 2 your fastest way to a result would probably be one of the following:
Use the PCL C++ version in a Visual Studio 2010 environment. There's lots of information about installing PCL with VS2010 and getting it running. PCL also has a nice ICP registration tutorial that should get you going.
Use VTK for Python; it has an ICP algorithm:
Installing VTK for Python
http://www.vtk.org/Wiki/VTK/Examples/Python/IterativeClosestPoints
If you really want option 3, you can:
Add the weight function in the PCL library source code and compile it, or
Build the complete ICP algorithm yourself in .NET:
http://www.math.tau.ac.il/~dcor/Graphics/adv-slides/ICP.ppt
Use Math.NET Numerics sparse matrix/vector algebra and solvers to create your own optimizer
Implement the Levenberg-Marquardt or Gauss-Newton optimizer from:
"Methods for Non-Linear Least Squares Problems", K. Madsen et al., IMM, 2004
Generate your own function vector and Jacobian matrix (with the weight function)
Have quite some patience to get it all working together :)
Post the result for the others on Stack Overflow who are waiting for ICP in C# .NET
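For the weighting idea in the hardest option, here is a minimal sketch of a weighted rigid fit (weighted Kabsch), written in Python/NumPy for brevity rather than .NET; deriving the weights from the tolerance window is an assumption for illustration, not part of the question:

    import numpy as np

    def rigid_fit(nominal, measured, weights=None):
        """Weighted Kabsch: rotation R and translation t minimizing the
        weighted squared distances sum_i w_i * ||R p_i + t - q_i||^2."""
        Q = np.asarray(nominal, float)        # (N, 3) nominal points
        P = np.asarray(measured, float)       # (N, 3) measured points
        w = np.ones(len(P)) if weights is None else np.asarray(weights, float)
        w = w / w.sum()
        p_bar, q_bar = w @ P, w @ Q           # weighted centroids
        H = (P - p_bar).T @ ((Q - q_bar) * w[:, None])   # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = q_bar - R @ p_bar
        return R, t

    # Hypothetical weighting: pairs with tighter tolerance windows pull harder.
    # w = 1.0 / (upper_tol - lower_tol)
    # R, t = rigid_fit(nominal, measured, w)
    # fitted = measured @ R.T + t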

What type of smoothing to use?

Not sure if this is valid here on SO, but I was hoping someone could advise on the correct algorithm to use.
I have the following RAW data.
In the image you can see "steps". Essentially I wish to keep these steps, but take a moving average of all the data between them. In the following image, you can see the moving average:
However, you will notice that at the "steps" the moving average reduces the gradient, whereas I wish to keep the steep vertical gradient.
Is there any smoothing technique that will take into account a large vertical "offset", but smooth the other data?
Yup, I had to do something similar with images from a spacecraft.
Simple technique #1: use a median filter with a modest width - say about 5 samples, or 7. This provides an output value that is the median of the corresponding input value and several of its immediate neighbors on either side. It will get rid of those spikes, and do a good job preserving the step edges.
The median filter is provided in every number-crunching toolkit I know of (Matlab, Python/NumPy, IDL, etc.), and in libraries for compiled languages such as C++ and Java (though specific names don't come to mind right now...)
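For example, with SciPy (the step data here is made up):

    import numpy as np
    from scipy.signal import medfilt

    # hypothetical data: two flat levels (a "step") plus noise and one spike
    y = np.concatenate([np.ones(50), 5 * np.ones(50)]) + 0.1 * np.random.randn(100)
    y[30] += 3.0                        # an isolated spike

    y_med = medfilt(y, kernel_size=5)   # 5-sample median window; the step edge survives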
Technique #2, perhaps not quite as good: use a Savitzky-Golay smoothing filter. This works by effectively making least-squares polynomial fits to the data at each output sample, using the corresponding input sample and a neighborhood of points (much like the median filter). The SG smoother is known for being fairly good at preserving peaks and sharp transitions.
The SG filter is usually provided by most signal processing and number crunching packages, but might not be as common as the median filter.
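A comparable sketch with SciPy's Savitzky-Golay filter (the window length and polynomial order are illustrative guesses):

    import numpy as np
    from scipy.signal import savgol_filter

    y = np.concatenate([np.ones(50), 5 * np.ones(50)]) + 0.1 * np.random.randn(100)

    # quadratic local fits over an 11-sample window at each point
    y_sg = savgol_filter(y, window_length=11, polyorder=2)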
Technique #3, the most work and requiring the most experience and judgement: Go ahead and use a smoother - moving box average, Gaussian, whatever - but then create an output that blends between the original with the smoothed data. The blend, controlled by a new data series you create, varies from all-original (blending in 0% of the smoothed) to all-smoothed (100%).
To control the blending, start with an edge detector to detect the jumps. You may want to median-filter the data first to get rid of the spikes. Then broaden (dilation, in image-processing jargon) or smooth and renormalize the edge detector's output, and flip it around so it gives 0.0 at and near the jumps and 1.0 everywhere else, perhaps with a smooth transition joining them. It is an art to get this right, and it depends on how the data will be used - for me, it's usually images to be viewed by humans. An automated embedded control system might work best if tweaked differently.
The main advantage of this technique is you can plug in whatever kind of smoothing filter you like. It won't have any effect where the blend control value is zero. The main disadvantage is that the jumps, the small neighborhood defined by the manipulated edge detector output, will contain noise.
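A rough sketch of the whole blend pipeline, assuming SciPy; every filter width and the normalization are judgment calls, as the answer says:

    import numpy as np
    from scipy.signal import medfilt
    from scipy.ndimage import grey_dilation, gaussian_filter1d

    y = np.concatenate([np.ones(50), 5 * np.ones(50)]) + 0.1 * np.random.randn(100)

    despiked = medfilt(y, kernel_size=5)              # remove spikes first
    edges = np.abs(np.diff(despiked, prepend=despiked[0]))
    mask = grey_dilation(edges, size=7)               # broaden the jump response
    mask = gaussian_filter1d(mask, sigma=2.0)         # smooth it
    blend = 1.0 - mask / mask.max()                   # ~0 near jumps, ~1 elsewhere

    smoothed = gaussian_filter1d(y, sigma=3.0)        # any smoother works here
    out = blend * smoothed + (1.0 - blend) * y        # original wins near jumps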
I recommend first detecting the steps and then smoothing each step individually.
You know how to do the smoothing, and edge/step detection is pretty easy too (see here, for example). A typical edge-detection scheme is to smooth your data and then convolve/cross-correlate it with some filter (for example the kernel [-1, 1], whose response shows you where the steps are). In a mathematical context this can be viewed as studying the derivative of your plot to find inflection points (for some of the filters).
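A tiny sketch of that [-1, 1] scheme (the detection threshold is a made-up value you would tune):

    import numpy as np

    y = np.concatenate([np.ones(50), 5 * np.ones(50)]) + 0.1 * np.random.randn(100)

    response = np.convolve(y, [-1.0, 1.0], mode='same')  # spikes at the steps
    step_locations = np.flatnonzero(np.abs(response) > 1.0)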
An alternative "hackish" solution would be to do a moving average but exclude outliers from the smoothing. You can decide what an outlier is by using some threshold t. In other words, for each point p with value v, take x points surrounding it and find the subset of those points which are between v - t and v + t, and take the average of these points as the new value of p.
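A sketch of that outlier-excluding average (window size and threshold t are arbitrary):

    import numpy as np

    def robust_moving_average(y, half_window=5, t=1.0):
        """Average each point with only those neighbors within +/- t of its value."""
        y = np.asarray(y, float)
        out = np.empty_like(y)
        for i, v in enumerate(y):
            lo, hi = max(0, i - half_window), min(len(y), i + half_window + 1)
            window = y[lo:hi]
            out[i] = window[np.abs(window - v) <= t].mean()
        return out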

gis polygon map overlay intersection operation

There are many algorithms for the binary map overlay operation in vector data format, which take two map layers and produce a resultant (overlaid) layer as output. I am wondering whether there are any algorithms which take more than two layers, say 3 layers, simultaneously and produce the overlay result?
There are a variety of geographic computational overlay procedures available for multiple layers. These fall into the group of multiple-criteria decision analysis, whereby multiple criteria (map) layers are standardized and combined (overlaid) to produce a resulting (map) layer. However, many of these are for raster data inputs!
If in fact you just want to combine vector data to produce an intersection, a procedural model would work best, as @Thomas has commented. This can be done via Python (standalone) or with ModelBuilder inside ArcGIS. There are also other methods that can be used to script the procedural overlay process.
I would like you to think about what exactly you're aiming to do. Let's think about the following scenarios:
You have a vector polygon of some City, and your goal is to overlay all the industrial, residential and commercial land usage. This would leave you to subtract the different land uses from your City polygon, one by one. Or, you can merge your three land uses into one polygon and subtract that from your City polygon.
Given the wide range of multiple-criteria decision analysis methodologies (e.g. weighted linear combination), a raster methodology might be suitable if you're looking for the "optimal location". For instance, if you were looking for a location in the City that has an optimal combination of industrial, commercial and retail land use, weighted linear combination could be used.
Let us define our land use weights as 20%, 40%, 40% (industrial, commercial, retail). We must also standardize our land use layer values between 0 and 1. The following combination of layer values give the most optimal combination of the three criteria: 0.2, 0.4 and 0.4 = 1.
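As a toy sketch of that weighted linear combination (the random rasters stand in for real standardized land-use layers):

    import numpy as np

    # hypothetical standardized layers, values already scaled to [0, 1]
    industrial = np.random.rand(100, 100)
    commercial = np.random.rand(100, 100)
    retail = np.random.rand(100, 100)

    # weights 20% / 40% / 40%, as in the example above
    suitability = 0.2 * industrial + 0.4 * commercial + 0.4 * retail
    best_cell = np.unravel_index(suitability.argmax(), suitability.shape)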

How to generate Bad Random Numbers

I'm sure the opposite has been asked many times but I couldn't find any answers on how to generate bad random numbers.
I want to write a small program for cluster analysis and want to generate some random points for testing. If I just inserted 1000 points with random coordinates, they would be scattered all over the field, which would make a cluster analysis worthless.
Is there a simple way to generate Random Numbers which build clusters?
I already thought about using random()*random() instead of random(), which biases values toward one end rather than spreading them uniformly (I think I read about this somewhere here on Stack Overflow).
Second approach would be picking a few areas at random and run the point generation again in this area which would of course produce a cluster in this area.
Do you have a better idea?
If you are deliberately producing well-formed clusters (rather than completely random scatter), you could combine your two ideas: pick a cluster center at random, and then put lots of points around it in a normal distribution.
As well as working in Cartesian coordinates (x, y), you could use a radial method to distribute points for a particular cluster: choose a random angle (0 to 2π radians), then choose a radius.
Note that, as circumference is proportional to radius, a uniform radius makes the area distribution denser close to the centre, while the number of points at any specific radius stays the same. Modify the radial distribution to produce a more tightly packed cluster.
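A minimal sketch of the radial method (the parameters are arbitrary):

    import math, random

    def radial_cluster(cx, cy, n=50, max_r=10.0):
        """Scatter n points around (cx, cy) using a uniform angle and radius.
        The uniform radius makes points naturally denser near the centre."""
        pts = []
        for _ in range(n):
            theta = random.uniform(0.0, 2.0 * math.pi)
            r = random.uniform(0.0, max_r)
            pts.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
        return pts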
OR you could use real-world data for semi-random point distributions with natural clustering. Recently I've been doing quite a bit of geospatial cluster analysis, and for this I have used real-world data: zipcode centroids (which form natural clusters around cities) and restaurant locations. Another suggestion: you could use a stellar or galactic catalogue.
Generate a few anchors - true random numbers. Then generate noise around them:
anchor + dist * (random() - 0.5)
This will generate clustered numbers, uniformly spread over an interval of width dist around each anchor.
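A sketch of that recipe in Python (anchor count, points per anchor, and the range are arbitrary):

    import random

    def clustered_numbers(n_anchors=5, per_anchor=100, dist=4.0, lo=0.0, hi=100.0):
        anchors = [random.uniform(lo, hi) for _ in range(n_anchors)]
        return [a + dist * (random.random() - 0.5)   # the formula from above
                for a in anchors
                for _ in range(per_anchor)]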
Add an additional dimension to your model.
Draw an irregular (i.e. not flat) surface.
Generate numbers in the extended space.
Discard all numbers which are on one side of the surface.
From every number left, drop the additional dimension.
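This is rejection sampling in disguise: the point density ends up proportional to the surface height. A sketch with a made-up bumpy surface:

    import math, random

    def surface(x, y):
        # hypothetical irregular surface; its bumps become the clusters
        return 0.5 + 0.5 * math.sin(0.7 * x) * math.cos(1.3 * y)

    points = []
    while len(points) < 1000:
        x, y = random.uniform(0, 10), random.uniform(0, 10)
        z = random.uniform(0, 1)          # the additional dimension
        if z < surface(x, y):             # keep only one side of the surface
            points.append((x, y))         # drop the additional dimension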
Maybe I have misunderstood, but the GNU Scientific Library (written in C) has many distributions written within it - could you not pick coordinates from the Gaussian, Poisson, etc. distributions in that library?
http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Distributions.html
They provide a simple example with the Poisson distribution at that link, too.
If you need your distribution to be bounded (for example, a y-coordinate not less than -1), then you can achieve that by rejection sampling, using the uniform distribution in the GSL.
Blessings, Tom
My first thought was that you could implement your own generator using a linear congruential generator and experiment with the coefficients until you get a low enough period to suit your needs. A really low modulus m should do the trick.
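For instance, a deliberately bad LCG might look like this (the tiny coefficients are the point; a real PRNG would use enormous ones):

    def bad_lcg(seed, a=5, c=3, m=32):
        """x_{n+1} = (a*x_n + c) mod m; a tiny modulus m caps the period at m."""
        x = seed
        while True:
            x = (a * x + c) % m
            yield x

    gen = bad_lcg(seed=7)
    values = [next(gen) for _ in range(64)]   # the sequence repeats quickly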
I also like your second idea of running a good RNG around a few pre-selected points to create clusters. You could either target specific areas for the clusters with this method, or generate those randomly as well.

What's a good way to generate random clusters and paths?

I'm toying around with writing a random map generator, and am not quite sure how to randomly generate realistic landscapes. I'm working with these sorts of local-scale maps, which present some interesting problems.
One of the simplest cases is the forest:
                    Sparse   Medium   Dense
Typical trees         50%      70%     80%
Massive trees          —       10%     20%
Light undergrowth     50%      70%     50%
Heavy undergrowth      —       20%     50%
Trees and undergrowth can exist in the same space, so an average sparse forest has 25% typical trees with light undergrowth, 25% typical trees alone, 25% light undergrowth alone, and 25% open space. Medium and dense forests will take a bit more thinking, but that's not where my problem lies either, as it's all evenly dispersed.
My problem lies in generating clusters and paths, while keeping the percentage constraints. Marshes are a good example of this:
                    Moor    Swamp
Shallow bog          20%     40%
Deep bog              5%     20%
Light undergrowth    30%     20%
Heavy undergrowth    10%     20%
Deep bog squares are usually clustered together and surrounded by an irregular ring of shallow bog squares.
An additional map element, a hedgerow, may also be present, as well as a path of open ground, snaking through the bog. Both of these types of map elements (clusters and paths) present problems, as the total composition of the map should contain X% of the element, but it's not evenly distributed. Other elements, such as streams, ponds, and quicksand need either a cluster or path-type generation as well.
What technique can I use to generate realistic maps given these constraints?
I'm using C#, FYI (but this isn't a C#-specific question.)
Realistic "random" distribution is often done using Perlin noise, which can be used to give a distribution with "clumps" like you mention. It works by summing/combining multiple layers of linearly interpolated values from random data points. Each layer (or "octave") has twice as many data points as the last and is confined to a narrower range of values. The result is realistic-looking random texture.
Here is a beautiful demonstration of the theory behind Perlin Noise by Hugo Elias.
Here is the first thing I found on Perlin Noise in C#.
What you can do is generate a Perlin Noise image and set a "threshold", where anything above a value is "on" and everything below it is "off". What you will end up with is clumps where things are above the threshold, which look irregular and awesome. Simply assign the ones above the threshold to where you want your terrain feature to be.
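Here is a self-contained sketch of the idea; it substitutes simple multi-octave value noise for true Perlin noise, which has the same clumpy character with much less code:

    import numpy as np

    def value_noise(size, octaves=4, seed=0):
        """Sum bilinearly interpolated random grids; each octave is twice as
        fine and half as strong. A cheap stand-in for Perlin noise."""
        rng = np.random.default_rng(seed)
        out, amp, total = np.zeros((size, size)), 1.0, 0.0
        for o in range(octaves):
            cells = 2 ** (o + 2)                  # grid resolution this octave
            grid = rng.random((cells + 1, cells + 1))
            xs = np.linspace(0, cells, size, endpoint=False)
            i, t = xs.astype(int), xs - xs.astype(int)
            c00 = grid[np.ix_(i, i)]
            c10 = grid[np.ix_(i + 1, i)]
            c01 = grid[np.ix_(i, i + 1)]
            c11 = grid[np.ix_(i + 1, i + 1)]
            tx, ty = t[:, None], t[None, :]
            out += amp * (c00 * (1 - tx) * (1 - ty) + c10 * tx * (1 - ty)
                          + c01 * (1 - tx) * ty + c11 * tx * ty)
            total += amp
            amp *= 0.5
        return out / total

    noise_map = value_noise(50)
    trees = noise_map > noise_map.mean()   # "on" above the threshold, "off" below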
Here is a demonstration of a program generating a Perlin Noise bitmap and then adjusting the cut-off threshold over time. A clear "clumping" is visible. It could be just what you want.
Notice that, with a high threshold, very few points are above it, and the result is sparse. But as the threshold lowers, those points "grow" into clumps (by the nature of Perlin noise), and some of these clumps join each other, basically creating something very natural and terrain-like.
Note that you could also set the "clump factor", or the tendency of features to clump, by setting the "turbulence" of your Perlin Noise function, which basically causes peaks and valleys of your PN function to be accentuated and closer together.
Now, where to set the threshold? The higher the threshold, the lower the percentage of the feature on the final map. The lower the threshold, the higher the percentage. You can mess around with them. You could probably get exact percentages by fiddling around with a little math (it seems that the distribution of values follows a Normal Distribution; I could be wrong). Tweak it until it's just right :)
EDIT: As pointed out in the comments, you can find the exact percentage by creating a cumulative histogram (an index of what % of the map is under each threshold) and picking the threshold that gives you the percentage you need.
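With NumPy, that cumulative-histogram lookup collapses to a percentile call (reusing the value_noise sketch above; the 40% figure is just an example):

    import numpy as np

    noise_map = value_noise(50)                     # from the sketch above
    threshold = np.percentile(noise_map, 100 - 40)  # want 40% coverage
    feature = noise_map > threshold                 # True on ~40% of tiles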
The coolest thing here is that you can create features that clump around certain other features (like your marsh features) trivially here -- just use the same Perlin Noise map twice -- the second time, lowering the threshold. The first one will be clumpy, and the second one will be clumpy around the same areas, but with the clumps enlarged (refer to the flash animation posted earlier).
As for other features like hedgerows, you could try modeling simple random walk lines that have a higher tendency to go straight than turn, and place them anywhere randomly on your perlin-based map.
Samples
Here is a sample 50x50 tile Sparse Forest Map. The undergrowth is colored brown and the trees are colored blue (sorry) to make it clear which is which.
For this map I didn't make exact thresholds to match 50%; I only set the threshold at 50% of the maximum. Statistically, this will average out to exactly 50% every time. But it might not be exact enough for your purposes; see the earlier note for how to do this.
Here is a demo of your Marsh features (not including undergrowth, for clarity), with shallow marsh in grey and deep marsh in black:
This is just 50x50, so there are some artifacts from that, but you can see how easily you can make the shallow marsh "grow" from the deep marsh -- simply by adjusting the threshold on the same Perlin map. For this one, I eyeballed the threshold level to give the most eye-pleasing results, but for your own purposes, you could do what was mentioned before.
Here is a marsh map generated from the same Perlin Noise map, but stretched out over a 250x250 tile map instead:
I've never done this sort of thing, but here are some thoughts.
You can obtain clusters by biasing random selection toward grid locations that are close to existing elements of that type. Assign a default value of 1 to all squares. For squares with existing clustered elements, add a clustering value to adjacent squares (the higher the clustering value, the stronger the clustering will be). Then do the random selection for the next element of that type from the probability distribution formed by all the squares.
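A sketch of that probability-weighted placement (the grid size and clustering boost are arbitrary):

    import random

    SIZE = 50
    weight = [[1.0] * SIZE for _ in range(SIZE)]      # default value of 1

    def add_cluster_bias(x, y, boost=4.0):
        """Raise the weight of squares adjacent to an existing element."""
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if 0 <= x + dx < SIZE and 0 <= y + dy < SIZE:
                    weight[y + dy][x + dx] += boost

    def pick_next():
        """Weighted random selection over all squares."""
        cells = [(x, y) for y in range(SIZE) for x in range(SIZE)]
        return random.choices(cells, weights=[weight[y][x] for x, y in cells])[0]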
For paths, you could have a similar procedure, except that paths would be extended stepwise (the probability of extending the path is finite at squares next to the end of the path and zero everywhere else). Directional paths could be made by increasing the probability of selection in the direction of the path. Meandering paths could have a direction that changes over the course of the random extension (new_direction = mf * old_direction + (1 - mf) * rand_direction, where mf is a momentum factor between 0 and 1).
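The momentum update might look like this (mf, the step count, and the unit-vector representation are illustrative choices):

    import math, random

    def meander(x, y, steps=100, mf=0.8):
        """Path whose direction mixes momentum with a random unit vector."""
        dx, dy, pts = 1.0, 0.0, [(x, y)]
        for _ in range(steps):
            a = random.uniform(0.0, 2.0 * math.pi)
            dx = mf * dx + (1.0 - mf) * math.cos(a)   # new_direction =
            dy = mf * dy + (1.0 - mf) * math.sin(a)   #   mf*old + (1-mf)*random
            n = math.hypot(dx, dy) or 1.0
            dx, dy = dx / n, dy / n                   # renormalize to unit length
            x, y = x + dx, y + dy
            pts.append((round(x), round(y)))
        return pts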
To expand on academicRobot's comments, you could start with a default marsh or forest seed in some of the grid cells and let them grow from the source using a correlated random number. For instance, a bog might have eight adjacent grid cells, each of which has a 90% probability of also being a bog and a 10% probability of being something else. You can let the ecosystem form from the seed and adjust the correlation until you get something that looks right. It is probably pretty easy to implement, even in a spreadsheet.
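A sketch of that seeded growth; note that the per-ring decay is my addition, since a flat 90% spread probability would flood the whole map:

    import random

    SIZE = 50
    grid = [['other'] * SIZE for _ in range(SIZE)]

    def grow(seed_x, seed_y, kind='bog', p=0.9, decay=0.75):
        """Spread outward from a seed cell; the spread probability shrinks
        with each ring so the blob stays finite."""
        grid[seed_y][seed_x] = kind
        frontier = [(seed_x, seed_y)]
        while frontier:
            ring = []
            for x, y in frontier:
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nx, ny = x + dx, y + dy
                    if (0 <= nx < SIZE and 0 <= ny < SIZE
                            and grid[ny][nx] == 'other'
                            and random.random() < p):
                        grid[ny][nx] = kind
                        ring.append((nx, ny))
            frontier, p = ring, p * decay
        return grid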
You could start reading the links here. I remember looking at a much better document; I will post it if I find it (it was also based on L-systems).
But that's on the general side; on the particular problem you face I guess you should model it in terms of
percentages
other rules (clusters and paths)
The point is that even though you don't know how to construct a map with the given properties, if you are able to evaluate the properties (clustering ratio, path niceness) and score on them, you can then brute-force or do some other problem-space traversal.
If you still want to take the generative approach, then you will have to examine the generative rules a bit more closely; here's an idea that I would pursue:
create patterns of different terrains and terrain covers that have required properties of 'clusterness', 'pathness' or uniformity
create the patterns in such a way that the values for deep bog are not discrete but carry a probability value; after the pattern has been created, you can normalize this probability so that it produces the required percentage of cover
mix different patterns together
You might have some success for certain types of area with a Voronoi pattern. I've never seen it used to create maps, but I have seen it used in a number of similar fields.
