Algorithm for distributing images to form a square

I'm building an application that creates a spritesheet from a series of images. Currently the application requires the user to indicate the number of columns, but I would like to add an option that suggests this parameter automatically so as to obtain an almost square spritesheet.
If the images were square, the square root of the total number of images would suffice, but this is not the case.
The images must all be the same size, but they may be taller than wide or the opposite.
For example: the sprite sheet has a walking character and each image is 331 pixels high and 160 pixels wide. The number of frames is 25.
I should find an algorithm that suggests the number of columns (and rows) that allows me to obtain a sheet that is as square as possible.
Unfortunately I have no code to present, simply because I have no idea what kind of reasoning to apply.
Do you have any suggestions to work on?

The basic idea is that, if the image height is twice the width, you will need twice as many columns as rows.
If:
q is the image ratio (width/height)
c is the number of columns
r is the number of rows
n is the total number of images
then we have:
r / c = q and r * c = n
After a few substitutions:
c = sqrt(n / q)
r = q * c
In your case, q = 160 / 331 = 0.48
c = sqrt(25 / 0.48) = 7.2
r = 0.48 * c = 3.5
So (after rounding) the answer is 4 rows and 7 columns.
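As a minimal sketch of that computation (Python, using the question's example numbers; the helper name suggest_grid is my own):

import math

def suggest_grid(image_w, image_h, n):
    # q = width/height; c = sqrt(n / q), rounded; r = enough rows for n frames
    q = image_w / image_h
    c = round(math.sqrt(n / q))
    r = math.ceil(n / c)
    return c, r

print(suggest_grid(160, 331, 25))  # -> (7, 4), i.e. 7 columns and 4 rows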

Mathematically, this is an interesting question.
I don't have time to think about it extensively, but here are my two cents:
Let the width and height of the sprites be w and h, respectively, and let the number of sprites be n.
If you put these in a grid consisting of c columns and r rows, the total width of the grid will be cw and the total height rh. You want the quotient cw/rh to be as close to 1 as possible.
Now, if you chose c and r freely, the number of grid cells, N := cr, might well be slightly larger than n. In most cases, I would expect you to accept a partially empty last row.
Since N is close to n, we have r = N/c ≈ n/c, and therefore cw/rh ≈ c²w/(nh).
Hence, we want to find c such that |c²w/(nh) - 1|
is as small as possible. Clearly this happens when c² = nh/w, i.e. c = √(nh/w).
Hence, if you let the number of columns be √(nh/w) rounded to the nearest integer, you will probably get a fairly square grid.
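If you want to be careful about the rounding, a small sketch (my own addition, not from the answer) can compare the floor and ceiling candidates around √(nh/w) and keep whichever grid has an aspect ratio closest to 1:

import math

def squarest_columns(n, w, h):
    # Try the two integer candidates around sqrt(n*h/w) and keep the one
    # whose grid aspect ratio (c*w)/(r*h) is closest to 1 (|log ratio| = 0).
    c0 = math.sqrt(n * h / w)
    best = None
    for c in {max(1, math.floor(c0)), math.ceil(c0)}:
        r = math.ceil(n / c)
        badness = abs(math.log((c * w) / (r * h)))
        if best is None or badness < best[0]:
            best = (badness, c, r)
    return best[1], best[2]

# For the question's 25 frames of 160 x 331 this returns (8, 4): an 8-column
# grid is slightly more square than 7 x 4, at the cost of more empty cells.
print(squarest_columns(25, 160, 331))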

Related

How to enlarge the difference between large values that have small variation?

I have four values: [A B C D]
and I have four graphical lines where each one's height is to represent the sizes of A, B, C and D. I don't want to scale the heights of each line to the exact values, but instead to a percentage relative to the other values. The minimum line height should be 1 and the max height should be 4. I'm having trouble because the differences between the values are very small.
I have
a_height = 1 + (a / (a+b+c+d))
b_height = 1 + (b / (a+b+c+d))
c_height = 1 + (c / (a+b+c+d))
d_height = 1 + (d / (a+b+c+d))
But the difference between A B C D are very small for example:
A = 14500
B = 14510
C = 14496
D = 14507
(Collectively all four values will vary together from 10,000-20,000 during simulation but the variation between each will remain small)
So all the heights end up being roughly the same, and visually you can't tell the difference. What can be done here to scale up the differences in the values so that the heights of the lines are clearly different from each other but still proportional?
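No answer is recorded here, but for illustration, one common remedy (my suggestion, not from the original post) is min-max rescaling: map the smallest value to height 1 and the largest to height 4, spreading everything else linearly in between. A minimal sketch with the example values:

# Hypothetical sketch: linear (min-max) rescaling of the example values,
# mapping the smallest value to height 1 and the largest to height 4.
values = [14500, 14510, 14496, 14507]  # A, B, C, D from the question

lo, hi = min(values), max(values)      # assumes hi > lo
heights = [1 + 3 * (v - lo) / (hi - lo) for v in values]
print(heights)  # -> [1.857..., 4.0, 1.0, 3.357...]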

Dataset for meaningless “Nearest Neighbor”?

In the paper "When Is 'Nearest Neighbor' Meaningful?" we read that, "We show that under certain broad conditions (in terms of data and query distributions, or workload), as dimensionality increases, the distance to the nearest neighbor approaches the distance to the farthest neighbor. In other words, the contrast in distances to different data points becomes nonexistent. The conditions we have identified in which this happens are much broader than the independent and identically distributed (IID) dimensions assumption that other work assumes."
My question is: how should I generate a dataset that exhibits this effect? I have created three points, each with 1000 dimensions, with random numbers ranging from 0-255 for each dimension, but the points still produce clearly different distances and do not reproduce what is mentioned above. Changing the dimensions (e.g. 10 or 100 or 1000 dimensions) and the ranges (e.g. [0,1]) does not seem to change anything. I still get different distances, which should not be any problem for e.g. clustering algorithms!
I hadn't heard of this before either, so I am a little defensive, since I have seen that real and synthetic datasets in high dimensions really do not support the claim of the paper in question.
As a result, what I would suggest, as a first, dirty, clumsy and maybe not good attempt, is to generate a sphere in a dimension of your choice (I do it like this) and then place a query at the center of the sphere.
In that case, every point lies at the same distance from the query point, thus the Nearest Neighbor has a distance equal to the Farthest Neighbor.
This, of course, is independent of the dimension, but it's what came to mind after looking at the figures of the paper. It should be enough to get you started, but surely better datasets may be generated, if any.
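A minimal sketch of that sphere construction (my code, assuming NumPy; normalizing Gaussian samples gives uniformly distributed directions):

import numpy as np

def sphere_dataset(n_points, dim, radius=1.0, seed=0):
    # Isotropic Gaussian samples projected onto the sphere are uniform on it.
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n_points, dim))
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    return radius * g

points = sphere_dataset(1000, 100)
query = np.zeros(100)                           # query at the center
dists = np.linalg.norm(points - query, axis=1)
print(dists.min(), dists.max())                 # both 1.0 by construction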
Edit about:
"distances for each point got bigger with more dimensions!!!!"
This is expected, since the higher-dimensional the space, the sparser it is, and thus the greater the distances are. Moreover, this is expected if you consider, for example, the Euclidean distance, which gets greater as the dimensions grow.
I think the paper is right. First, your test: one problem with your test may be that you are using too few points. I used 10000 points, and below are my results (evenly distributed points in [0.0 ... 1.0] in all dimensions). For DIM=2, min/max differ almost by a factor of 1000; for DIM=1000, they only differ by a factor of 1.6; for DIM=10000, by a factor of 1.248. So I'd say these results confirm the paper's hypothesis.
DIM/N = 2 / 10000
min/avg/max= 1.0150906548224441E-5 / 0.019347838262624064 / 0.9993862941797146
DIM/N = 10 / 10000
min/avg/max= 0.011363500131326938 / 0.9806472676701363 / 1.628460468042207
DIM/N = 100 / 10000
min/avg/max= 0.7701271349716637 / 1.3380320375218808 / 2.1878136533925328
DIM/N = 1000 / 10000
min/avg/max= 2.581913326565635 / 3.2871335447262178 / 4.177669393187736
DIM/N = 10000 / 10000
min/avg/max= 8.704666143050158 / 9.70540814778645 / 10.85760200249862
DIM/N = 100000 / 1000 (N=1000!)
min/avg/max= 30.448610133282717 / 31.14936583713578 / 31.99082677476165
I guess the explanation is as follows: let's take three randomly generated vectors, A, B and C. The total distance is based on the sum of the distances of each individual element of these vectors. The more dimensions the vectors have, the more the total sum of differences will approach a common average. In other words, it is highly unlikely that a vector C has, in all elements, a larger distance to A than another vector B has to A. With increasing dimensions, C and B will have increasingly similar distances to A (and to each other).
My test dataset was created as follows. The dataset is essentially a cube ranging from 0.0 to 1.0 in every dimension; the coordinates were created with a uniform distribution in every dimension between 0.0 and 1.0. Example code (N=10000, DIM=[2..10000]):
private final Random R = new Random();  // java.util.Random

public double[] generate(int N, int DIM) {
    double[] data = new double[N * DIM];  // N points, flattened row by row
    for (int i = 0; i < N; i++) {
        int pos = DIM * i;
        for (int d = 0; d < DIM; d++) {
            data[pos + d] = R.nextDouble();  // uniform in [0.0, 1.0)
        }
    }
    return data;
}
Following the equation given at the bottom of the accepted answer here, we get:
d=2 -> 98460
d=10 -> 142.3
d=100 -> 1.84
d=1,000 -> 0.618
d=10,000 -> 0.247
d=100,000 -> 0.0506 (using N=1000)
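A quick way to reproduce the qualitative effect (a sketch in Python with NumPy, not the original Java harness; the absolute numbers will differ from the table above) is to measure the distances from one uniform query point to N uniform points in the cube:

import numpy as np

rng = np.random.default_rng(42)
N = 10000
for DIM in (2, 10, 100, 1000):
    data = rng.random((N, DIM))                 # uniform points in [0,1]^DIM
    query = rng.random(DIM)
    d = np.linalg.norm(data - query, axis=1)    # distances to the query
    print(f"DIM={DIM:5d}  min/avg/max = {d.min():.4f} / {d.mean():.4f} / "
          f"{d.max():.4f}  (max/min = {d.max() / d.min():.2f})")

The max/min ratio collapses toward 1 as DIM grows, which is exactly the loss of contrast the paper describes.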

Weighted random number (without predefined values!)

Currently I need a function which gives a weighted random number.
It should choose a random number between two doubles/integers (for example 4 and 8), while the value in the middle (6) occurs, on average, about twice as often as the limit values 4 and 8.
If this were only about integers, I could predefine the values with variables and custom probabilities, but I need the function to give a double with at least 2 decimal places (meaning thousands of different numbers)!
The environment I use is "Game Maker", which provides all sorts of basic random generators, but not weighted ones.
Could anyone possibly lead me in the right direction how to achieve this?
Thanks in advance!
The sum of two independent continuous uniform(0,1)'s, U1 and U2, has a continuous symmetrical triangle distribution between 0 and 2. The distribution has its peak at 1 and tapers to zero at either end. We can easily translate that to a range of (4,8) via scaling by 2 and adding 4, i.e., 4 + 2*(U1 + U2).
However, you don't want a height of zero at the endpoints, you want half the peak's height. In other words, you want a triangle sitting on a rectangular base (i.e., uniform), with height h at the endpoints and height 2h in the middle. That makes life easy, because the triangle must have a peak of height h above the rectangular base, and a triangle with height h has half the area of a rectangle with the same base and height h. It follows that 2/3 of your probability is in the base, 1/3 is in the triangle.
Combining the elements above leads to the following pseudocode algorithm. If rnd() is a function call that returns continuous uniform(0,1) random numbers:
define makeValue()
    if rnd() <= 2/3   # Caution, may want to use 2.0/3.0 for many languages
        return 4 + (4 * rnd())
    else
        return 4 + (2 * (rnd() + rnd()))
I cranked out a million values using that and plotted a histogram; it shows the expected trapezoidal shape (height h at the endpoints, 2h in the middle).
In case someone needs this in Game Maker (or a different language) as a universal function:
if random(1) <= argument0
    return argument1 + ((argument2-argument1) * random(1))
else
    return argument1 + (((argument2-argument1)/2) * (random(1) + random(1)))
Called as follows (similar to the standard random_range function):
val = weight_random_range(FACTOR, FROM, TO)
"FACTOR" determines how much of the whole probability figure is the "base" for constant probability. E.g. 2/3 for the figure above.
0 will provide a perfect triangle and 1 a rectangle (no weightning).
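For reference, the same universal function translated to Python (my translation; factor, lo and hi play the roles of FACTOR, FROM and TO):

import random

def weight_random_range(factor, lo, hi):
    # factor = share of the probability mass in the flat rectangular base
    if random.random() <= factor:
        return lo + (hi - lo) * random.random()                         # base
    else:
        return lo + (hi - lo) / 2 * (random.random() + random.random()) # triangle

val = weight_random_range(2/3, 4, 8)  # the 4..8 example with a 2/3 base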

Algorithm to split set of objects into certain number of groups?

For example, say I have a 2D array of pixels (in other words, an image) and I want to arrange them into groups so that the number of groups adds up perfectly to a certain number (say, the total items in another 2D array of pixels). At the moment I try using a combination of ratios and pixels, but this fails on anything other than perfect integer ratios (1:2, 1:3, 1:4, etc.). When it does fail, it just scales down to the integer below it, so, for example, a 1:2.93 ratio scale would use a 1:2 scale with part of the image cut off. I'd rather not do this, so what are some algorithms I could use that do not get into matrix multiplication? I remember seeing something similar to what I described mentioned before, but I cannot find it. Is this an NP-type problem?
For example, say I have a 12-by-12 pixel image and I want to split it up into exactly 64 sub-images of n-by-m size. Through analysis one could see that I could break it up into 8 2-by-2 sub-images, and 56 2-by-1 sub-images in order to get that exact number of sub-images. So, in other words, I would get 8+56=64 sub-images using all 4(8)+56(2)=144 pixels.
Similarly, if I had a 13 by 13 pixel image and I wanted 81 sub-images of n-by-m size, I would need to break it up into 4 2-by-2 sub-images, 76 2-by-1 sub-images, and 1 1-by-1 sub-image to get the exact number of sub-images needed. In other words, 4(4)+76(2)+1=169 and 4+76+1=81.
Yet another example: if I wanted to split the same 13 by 13 image into 36 sub-images of n-by-m size, I would need 14 4-by-2 sub-images, 7 2-by-2 sub-images, 14 2-by-1 sub-images, and 1 1-by-1 sub-image. In other words, 8(14)+4(7)+2(14)+1=169 and 14+7+14+1=36.
Of course, the image need not be square, and neither does the number of sub-images, but neither should be prime. In addition, the number of sub-images should be less than the number of pixels in the image. I'd probably want to stick to powers of two for the width and height of the sub-images, for ease of translating one larger sub-image into multiple sub-images, but if I can find an algorithm which doesn't require that, it'd be better. That is basically what I'm trying to find an algorithm for.
I understand that you want to split a rectangular image of a given size into n rectangular sub-images. Let's say that you have:
an image of size w * h
and you want to split into n sub-images of size x * y
I think that what you want is
R = { (x, y) | x in [1..w], y in [1..h], x * y == (w * h) / n }
That is the set of pairs (x, y) such that x * y is equal to (w * h) / n, where / is the integer division. Also, you probably want to take the x * y rectangle having the smallest perimeter, i.e. the smallest value of x + y.
For the three examples in the questions:
splitting a 12 x 12 image into 64 sub-images, you get R = {(1,2),(2,1)}, and so you have either 64 1 x 2 sub-images, or 64 2 x 1 sub-images
splitting a 13 x 13 image into 81 sub-images, you get R = {(1,2),(2,1)}, and so you have either 81 1 x 2 sub-images, or 81 2 x 1 sub-images
splitting a 13 x 13 image into 36 sub-images, you get R = {(1,4),(2,2),(4,1)}, and so you could use 36 2 x 2 sub-images (smallest perimeter)
For every example, you can of course combine rectangles of different sizes.
If you want to do something else, maybe tiling your original image, you may want to have a look at rectangle tiling algorithms
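A sketch of that computation in Python (my code; the last line picks the smallest-perimeter rectangle):

def sub_image_sizes(w, h, n):
    # The set R: all (x, y) with x * y == (w * h) // n  (integer division)
    target = (w * h) // n
    return [(x, y) for x in range(1, w + 1)
                   for y in range(1, h + 1) if x * y == target]

R = sub_image_sizes(13, 13, 36)
print(R)                                     # [(1, 4), (2, 2), (4, 1)]
print(min(R, key=lambda xy: xy[0] + xy[1]))  # (2, 2), smallest perimeter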
If you don't care about the sub-images being differently sized, a simple way to do this is to repeatedly split sub-images in two. Every new split increases the number of sub-images by one.
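A minimal sketch of that idea (my code; it always halves the largest remaining rectangle along its longer side):

def split_into(w, h, n):
    # Start with the whole image; each split removes one rectangle (x, y, w, h)
    # and adds two, so the count grows by one per iteration.
    rects = [(0, 0, w, h)]
    while len(rects) < n:
        x, y, rw, rh = max(rects, key=lambda r: r[2] * r[3])  # biggest piece
        rects.remove((x, y, rw, rh))
        if rw >= rh:
            rects += [(x, y, rw // 2, rh), (x + rw // 2, y, rw - rw // 2, rh)]
        else:
            rects += [(x, y, rw, rh // 2), (x, y + rh // 2, rw, rh - rh // 2)]
    return rects

pieces = split_into(12, 12, 64)
print(len(pieces), sum(r[2] * r[3] for r in pieces))  # 64 pieces, 144 pixels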

Generating random points within a hexagon for procedural game content

I'm using procedural techniques to generate graphics for a game I am writing.
To generate some woods I would like to scatter trees randomly within a regular hexagonal area centred at <0,0>.
What is the best way to generate these points in a uniform way?
If you can find a good rectangular bounding box for your hexagon, the easiest way to generate uniformly random points is by rejection sampling (http://en.wikipedia.org/wiki/Rejection_sampling)
That is, find a rectangle that entirely contains your hexagon, and then generate uniformly random points within the rectangle (this is easy, just independently generate random values for each coordinate in the right range). Check if the random point falls within the hexagon. If yes, keep it. If no, draw another point.
So long as you can find a good bounding box (the area of the rectangle should not be more than a constant factor larger than the area of the hexagon it encloses), this will be extremely fast.
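A minimal sketch of that (my code, for a regular hexagon centred at the origin with circumradius 1 and vertices at angles 0, 60, ..., 300 degrees; the bounding box is 2 wide and sqrt(3) tall, so about 3/4 of candidates are accepted):

import math, random

SQRT3 = math.sqrt(3)

def random_point_in_hexagon():
    while True:
        x = random.uniform(-1.0, 1.0)              # bounding-box candidate
        y = random.uniform(-SQRT3 / 2, SQRT3 / 2)
        # The four slanted edges satisfy sqrt(3)/2 * |x| + |y| / 2 <= sqrt(3)/2
        if SQRT3 / 2 * abs(x) + abs(y) / 2 <= SQRT3 / 2:
            return (x, y)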
A possibly simple way is the following:
  F ____ B
   /\  /\
A /__\/__\ E
  \  /\  /
   \/__\/
   D    C
Consider the parallelograms ADCO (center is O) and AOBF.
Any point in these can be written as a linear combination of two vectors, AO and AF (or AO and AD).
A point P in those two parallelograms satisfies
P = x * AO + y * AF  or  P = x * AO + y * AD,
where 0 <= x < 1 and 0 <= y <= 1 (we discount the edges shared with BECO).
Similarly, any point Q in the parallelogram BECO can be written as a linear combination of the vectors BO and BE such that
Q = x * BO + y * BE, where 0 <= x <= 1 and 0 <= y <= 1.
Thus, to select a random point,
we select
A with probability 2/3 and B with probability 1/3.
If you selected A, select x in [0,1) (note the half-open interval [0,1)) and y in [-1,1], and choose the point P = x*AO + y*AF if y > 0, else P = x*AO + |y|*AD.
If you selected B, select x in [0,1] and y in [0,1] and choose the point Q = x*BO + y*BE.
So it will take three random number calls to select one point, which might be good enough, depending on your situation.
If it's a regular hexagon, the simplest method that comes to mind is to divide it into three rhombuses. That way (a) they have the same area, and (b) you can pick a random point in any one rhombus with two random variables from 0 to 1. Here is Python code that works.
from math import sqrt
from random import randrange, random
from matplotlib import pyplot

# The three rhombuses share the origin; each is spanned by two
# consecutive vectors of this list.
vectors = [(-1., 0.), (.5, sqrt(3.) / 2.), (.5, -sqrt(3.) / 2.)]

def randinunithex():
    t = randrange(3)                               # pick one of the rhombuses
    (v1, v2) = (vectors[t], vectors[(t + 1) % 3])
    (x, y) = (random(), random())                  # coordinates in the rhombus
    return (x * v1[0] + y * v2[0], x * v1[1] + y * v2[1])

for n in range(500):
    v = randinunithex()
    pyplot.plot([v[0]], [v[1]], 'ro')
pyplot.show()
A couple of people in the discussion raised the question of uniformly sampling a discrete version of the hexagon. The most natural discretization is with a triangular lattice, and there is a version of the above solution that still works. You can trim the rhombuses a little bit so that they each contain the same number of points. They only miss the origin, which has to be allowed separately as a special case. Here is code for that:
from math import sqrt
from random import randrange, random
from matplotlib import pyplot

size = 10
vectors = [(-1., 0.), (.5, sqrt(3.) / 2.), (.5, -sqrt(3.) / 2.)]

def randinunithex():
    # The origin is a special case shared by all three trimmed rhombuses.
    if not randrange(3 * size * size + 1):
        return (0, 0)
    t = randrange(3)
    (v1, v2) = (vectors[t], vectors[(t + 1) % 3])
    (x, y) = (randrange(0, size), randrange(1, size))  # lattice coordinates
    return (x * v1[0] + y * v2[0], x * v1[1] + y * v2[1])

# Plot 500 random points in the hexagon
for n in range(500):
    v = randinunithex()
    pyplot.plot([v[0]], [v[1]], 'ro')

# Show the trimmed rhombuses
for t in range(3):
    (v1, v2) = (vectors[t], vectors[(t + 1) % 3])
    corners = [(0, 1), (0, size - 1), (size - 1, size - 1), (size - 1, 1), (0, 1)]
    corners = [(x * v1[0] + y * v2[0], x * v1[1] + y * v2[1]) for (x, y) in corners]
    pyplot.plot([x for (x, y) in corners], [y for (x, y) in corners], 'b')
pyplot.show()
And here is a picture: http://www.freeimagehosting.net/uploads/0f80ad5d9a.png
The traditional approach (applicable to regions of any polygonal shape) is to perform trapezoidal decomposition of your original hexagon. Once that is done, you can select your random points through the following two-step process:
1) Select a random trapezoid from the decomposition. Each trapezoid is selected with probability proportional to its area.
2) Select a random point uniformly in the trapezoid chosen on step 1.
You can use triangulation instead of trapezoidal decomposition, if you prefer to do so.
Chop it up into six triangles (hence this applies to any regular polygon), randomly choose one triangle, and randomly choose a point in the selected triangle.
Choosing random points in a triangle is a well-documented problem.
And of course, this is quite fast and you'll only have to generate 3 random numbers per point --- no rejection, etc.
Update:
Since you will have to generate two random numbers, this is how you do it:
R = random(); // Generate a random number called R between 0-1
S = random(); // Generate a random number called S between 0-1

if (R + S >= 1)
{
    R = 1 - R;
    S = 1 - S;
}
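Putting the pieces together, here is a sketch of the whole six-triangle scheme in Python (the unit-hexagon coordinates are my own choice, not from the answer; the fold is the snippet above, and the point is R*B + S*C because the shared vertex is the origin):

import math, random

# Vertices of a regular hexagon with circumradius 1, centred at the origin.
VERTS = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]

def random_point_in_hexagon():
    i = random.randrange(6)                    # the six triangles have equal area
    (bx, by), (cx, cy) = VERTS[i], VERTS[(i + 1) % 6]
    r, s = random.random(), random.random()
    if r + s >= 1:                             # fold into the triangle r + s <= 1
        r, s = 1 - r, 1 - s
    return (r * bx + s * cx, r * by + s * cy)  # A = (0,0), so P = r*B + s*C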
You may check my 2009 paper, where I derived an "exact" approach to generate "random points" inside different lattice shapes: "hexagonal", "rhombus", and "triangular". As far as I know it is the "most optimized approach" because for every 2D position you only need two random samples. Other works derived earlier require 3 samples for each 2D position!
Hope this answers the question!
http://arxiv.org/abs/1306.0162
1) Make a bijection from points to numbers (just enumerate them), get a random number -> get the point.
Another solution:
2) If N is the length of the hexagon's side, get 3 random numbers from [1..N], start from some corner and move 3 times with these numbers in 3 directions.
The rejection sampling solution above is intuitive and simple, but uses a rectangle and (presumably) Euclidean X/Y coordinates. You could make this slightly more efficient (though still suboptimal) by using a circle with radius r and generating random points using polar coordinates from the center instead, where the distance would be r*sqrt(rand()) (the square root is needed to keep the distribution uniform over the disc) and theta (in radians) would be rand()*2*PI.
