I have a two-dimensional random vector x = [x1, x2]^T with a known joint probability density function (PDF). The PDF is non-Gaussian and the two entries of the random vector are statistically dependent. I need to show that, for example, x1 is more important than x2 in terms of the amount of information it carries. Is there a classical solution to this problem? Can I show that, say, n% of the total information carried by x is in x1 and the remaining (100-n)% is carried by x2?
I assume that the standard way of measuring the amount of information is by calculating the entropy. Any clues?
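For concreteness: the classical tools here are differential entropy and mutual information, where H(x1) and H(x2) measure the information in each entry alone and I(x1;x2) = H(x1) + H(x2) - H(x1,x2) measures the part they share. A minimal C++ sketch of computing these numerically, assuming the joint PDF has been tabulated on a grid (the example density used here is hypothetical):

```cpp
// Minimal sketch: marginal entropies and mutual information from a joint PDF
// tabulated on an N x N grid. The density below is a hypothetical example of
// a dependent, non-Gaussian PDF; it is renormalized numerically.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const int N = 400;
    const double lo = -5.0, hi = 5.0;
    const double d = (hi - lo) / N;   // grid step
    const double dA = d * d;          // cell area

    std::vector<std::vector<double>> p(N, std::vector<double>(N));
    double total = 0.0;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            double x1 = lo + (i + 0.5) * d, x2 = lo + (j + 0.5) * d;
            p[i][j] = std::exp(-std::fabs(x1) - std::fabs(x2 - 0.5 * x1));
            total += p[i][j] * dA;
        }
    for (auto& row : p) for (double& v : row) v /= total;  // normalize

    // Marginal densities p1(x1) and p2(x2).
    std::vector<double> p1(N, 0.0), p2(N, 0.0);
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) { p1[i] += p[i][j] * d; p2[j] += p[i][j] * d; }

    // Differential entropies and mutual information (in nats).
    auto nlogn = [](double v) { return v > 0.0 ? -v * std::log(v) : 0.0; };
    double H1 = 0, H2 = 0, H12 = 0;
    for (int i = 0; i < N; ++i) {
        H1 += nlogn(p1[i]) * d;
        H2 += nlogn(p2[i]) * d;
        for (int j = 0; j < N; ++j) H12 += nlogn(p[i][j]) * dA;
    }
    std::printf("H(x1)=%.4f H(x2)=%.4f H(x1,x2)=%.4f I(x1;x2)=%.4f\n",
                H1, H2, H12, H1 + H2 - H12);
}
```

One caveat: differential entropy can be negative, so a literal "n% vs. (100-n)%" split is not always meaningful; the exact decomposition is H(x1,x2) = H(x1) + H(x2|x1).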
I don't mean a function that generates random numbers, but an algorithm that generates a random function.
"High dimension" means the function is multi-variable; e.g., a 100-dimensional function has 100 different variables.
Let's say the domain is [0,1], so we need to generate a function f: [0,1]^n -> [0,1]. This function is chosen from a certain class of functions, such that the probability of choosing any function in the class is the same.
(This class of functions can be either all continuous functions, or all K-times differentiable functions, whichever is convenient for the algorithm.)
Since the functions on a closed-interval domain are uncountably infinite, we only require the algorithm to be pseudo-random.
Is there a polynomial-time algorithm to solve this problem?
I just want to add a possible algorithm to the question (though it is not feasible due to its exponential time complexity). The algorithm was proposed by the friend who actually brought up this question in the first place:
The algorithm can be described simply as follows. First, assume the dimension d = 1 and consider smooth functions on the interval I = [a, b]. We split the domain [a, b] into N small intervals. For each interval I_i, we generate a random number f_i drawn from some specified distribution (Gaussian or uniform). Finally, we interpolate the series (a_i, f_i), where a_i is a characteristic point of I_i (e.g., we can choose a_i as the midpoint of I_i). After interpolation, we obtain a smooth curve, which can be regarded as a one-dimensional random function living in the function space C^m[a, b] (where m depends on the interpolation algorithm we choose).
This is just to say that the algorithm does not need to be that formal and rigorous; it simply has to provide something that works.
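For d = 1 this idea fits in a few lines of C++; piecewise-linear interpolation is used here for brevity (it only gives a C^0 curve, so a cubic spline would have to be swapped in for the smoother C^m result described above):

```cpp
// Minimal sketch for d = 1: random values f_i on N subintervals of [a,b],
// interpolated between the midpoints a_i. Linear interpolation keeps the
// code short but only gives a C^0 curve; a cubic spline would give C^2.
#include <cstdio>
#include <random>
#include <vector>

struct RandomFunction1D {
    double a, b;                 // domain [a,b]
    std::vector<double> f;       // random value f_i for each subinterval I_i

    RandomFunction1D(double a_, double b_, int N, unsigned seed)
        : a(a_), b(b_), f(N) {
        std::mt19937 rng(seed);
        std::uniform_real_distribution<double> u(0.0, 1.0);
        for (double& v : f) v = u(rng);      // f_i ~ uniform on [0,1]
    }

    double operator()(double x) const {      // interpolate between midpoints
        int N = (int)f.size();
        double h = (b - a) / N;
        double t = (x - a) / h - 0.5;        // midpoint a_i sits at t == i
        if (t <= 0.0) return f.front();
        if (t >= N - 1.0) return f.back();
        int i = (int)t;
        double w = t - i;
        return (1.0 - w) * f[i] + w * f[i + 1];
    }
};

int main() {
    RandomFunction1D g(0.0, 1.0, 16, 42);
    for (double x = 0.0; x <= 1.0; x += 0.125)
        std::printf("g(%.3f) = %.4f\n", x, g(x));
}
```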
So if I get it right, you need a function returning a scalar from a vector.
The easiest way I see is to use a dot product.
For example, let n be the dimensionality you need.
Create a random vector a[n] containing random coefficients in the range [0,1]
such that the sum of all coefficients is 1:
create float a[n]
fill it with positive random numbers (no zeros)
compute the sum of the a[i]
divide a[n] by this sum
Now the function y = f(x[n]) is simply
y = dot(a[n],x[n]) = a[0]*x[0] + a[1]*x[1] + ... + a[n-1]*x[n-1]
If I didn't miss something, the target range should be [0,1]:
if x == (0,0,0,...,0) then y = 0;
if x == (1,1,1,...,1) then y = 1;
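A minimal C++ sketch of this construction (the helper names are mine, not from any library):

```cpp
// Minimal sketch of the normalized dot-product construction: positive random
// weights a[n] that sum to 1 map x in [0,1]^n to y = dot(a,x) in [0,1].
#include <cstdio>
#include <random>
#include <vector>

std::vector<double> random_weights(int n, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::vector<double> a(n);
    double sum = 0.0;
    for (double& v : a) {
        do { v = u(rng); } while (v == 0.0);  // positive numbers, no zeros
        sum += v;
    }
    for (double& v : a) v /= sum;             // normalize: sum(a) == 1
    return a;
}

double f(const std::vector<double>& a, const std::vector<double>& x) {
    double y = 0.0;       // y = a[0]*x[0] + a[1]*x[1] + ... + a[n-1]*x[n-1]
    for (size_t i = 0; i < a.size(); ++i) y += a[i] * x[i];
    return y;
}

int main() {
    const int n = 5;
    auto a = random_weights(n, 123);
    std::printf("f(0,...,0) = %.4f\n", f(a, std::vector<double>(n, 0.0)));  // 0
    std::printf("f(1,...,1) = %.4f\n", f(a, std::vector<double>(n, 1.0)));  // 1
}
```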
If you need something more complex, use a higher-order polynomial,
something like y = dot(a0[n],x[n]) * dot(a1[n],x[n]^2) * dot(a2[n],x[n]^3) * ...
where x[n]^2 means (x[0]*x[0], x[1]*x[1], ...).
Both approaches result in a function with the same "direction":
if any x[i] rises, then y rises too.
If you want to change that, then you also have to allow negative values for a[],
but to make that work you need to add some offset to y to shift it away from negative values,
and the a[] normalization process will be a bit more complex,
because you need to find the min and max values.
An easier option is to add a random flag vector m[n] to the process:
m[i] flags whether 1-x[i] should be used instead of x[i].
This way everything above stays as is (see the sketch below).
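A minimal sketch of the flag-vector variant (again, the helper names are hypothetical):

```cpp
// Minimal sketch of the flag-vector variant: m[i] == 1 means use 1-x[i]
// instead of x[i], flipping the direction in that coordinate while keeping
// the weights and the [0,1] target range unchanged.
#include <cstdio>
#include <random>
#include <vector>

std::vector<int> random_flags(int n, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution coin(0.5);
    std::vector<int> m(n);
    for (int& v : m) v = coin(rng) ? 1 : 0;
    return m;
}

double f_flagged(const std::vector<double>& a,   // weights summing to 1
                 const std::vector<int>&    m,   // per-coordinate flip flags
                 const std::vector<double>& x) {
    double y = 0.0;
    for (size_t i = 0; i < a.size(); ++i)
        y += a[i] * (m[i] ? 1.0 - x[i] : x[i]);
    return y;
}

int main() {
    std::vector<double> a = {0.2, 0.3, 0.5};  // example weights, sum == 1
    std::vector<int>    m = random_flags(3, 7);
    std::vector<double> x = {0.1, 0.9, 0.4};
    std::printf("y = %.4f\n", f_flagged(a, m, x));  // still lands in [0,1]
}
```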
You can create more types of mapping to make it even more variable.
This might not only be hard, but impossible, if you actually want to be able to generate every continuous function.
For the one-dimensional case you might be able to create a useful approximation by looking into the Faber-Schauder system (also see wiki). This gives you a Schauder basis for the continuous functions on an interval. This kind of basis only covers the whole vector space if you include infinite linear combinations of basis vectors. Thus you can create some random functions by building random finite linear combinations from this basis, but in general you won't be able to create functions that are actually represented by an infinite number of basis vectors this way.
Edit in response to your update:
It seems like choosing a random polynomial function of order K (for the class of K-times differentiable functions) might be sufficient for you, since any such function can be approximated (around a given point) by a polynomial (see Taylor's theorem). Choosing a random polynomial function is easy, since you can just pick K+1 random real numbers as coefficients for your polynomial. (Note that this will, for example, not return functions similar to abs(x).)
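A minimal C++ sketch of that; the coefficient range [-1,1] is an arbitrary choice for illustration:

```cpp
// Minimal sketch: a random order-K polynomial from K+1 random coefficients.
// The coefficient range [-1,1] is an arbitrary choice for illustration.
#include <cstdio>
#include <random>
#include <vector>

std::vector<double> random_polynomial(int K, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> u(-1.0, 1.0);
    std::vector<double> c(K + 1);             // c[i] multiplies x^i
    for (double& v : c) v = u(rng);
    return c;
}

// Horner evaluation of c[0] + c[1]*x + ... + c[K]*x^K.
double eval(const std::vector<double>& c, double x) {
    double y = 0.0;
    for (int i = (int)c.size() - 1; i >= 0; --i) y = y * x + c[i];
    return y;
}

int main() {
    auto p = random_polynomial(4, 99);
    for (double x = 0.0; x <= 1.0; x += 0.25)
        std::printf("p(%.2f) = %+.4f\n", x, eval(p, x));
}
```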
I wrote a Perlin noise generator, but it depends on something pretty inconvenient: I have to pregenerate a grid of random vectors.
Ideally, I'd like to have some function double f(int x, int y, int z, int seed) (or similar) that will always return an identical value given identical arguments, but whose results appear random enough for the noise generation over small ranges of x, y, and z. Then, in my noise-generation algorithm, instead of indexing into the precomputed grid, it could generate a "random" vector on the fly. This way I could sample the noise function at coordinates bounded only by the limits of an integer, rather than by the limits of memory.
Is such a thing possible? Obviously the randomness wouldn't be nearly as high-quality as precomputing the vectors, but I only really need something that appears random enough visually. Are there any existing methods of doing this?
This is called a hash function. There are many very good ones: for your usage, where security is less of a concern than speed, I would use something like MD5 or SHA1.
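For illustration, here is a minimal sketch of the idea using a lightweight integer bit-mixing hash instead of MD5/SHA1 (which would also work, but needs a library); the splitmix64-style mixing constants are conventional choices, not anything unique:

```cpp
// Minimal sketch: a deterministic "random-looking" value for (x, y, z, seed)
// via integer bit mixing -- NOT MD5/SHA1, just a lightweight stand-in with
// the same "same arguments in, same value out" behavior.
#include <cstdint>
#include <cstdio>

double f(int x, int y, int z, int seed) {
    uint64_t h = (uint64_t)(uint32_t)x;
    h = h * 0x9E3779B97F4A7C15ull ^ (uint64_t)(uint32_t)y;
    h = h * 0x9E3779B97F4A7C15ull ^ (uint64_t)(uint32_t)z;
    h = h * 0x9E3779B97F4A7C15ull ^ (uint64_t)(uint32_t)seed;
    h ^= h >> 30; h *= 0xBF58476D1CE4E5B9ull;   // splitmix64-style finalizer
    h ^= h >> 27; h *= 0x94D049BB133111EBull;   // to scramble all the bits
    h ^= h >> 31;
    return (h >> 11) * (1.0 / 9007199254740992.0);  // top 53 bits -> [0,1)
}

int main() {
    std::printf("%f %f\n", f(1, 2, 3, 42), f(1, 2, 3, 42));  // identical
    std::printf("%f %f\n", f(1, 2, 4, 42), f(2, 2, 3, 42));  // look unrelated
}
```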
When given a set of values derived from a probability density function f, like this
{f(X1), f(X2), ..., f(Xn)}
we don't know the exact form of f; all we know is that the probability density function is a generalized Gaussian distribution.
Is it possible to generate the random numbers Xi if Xi belongs to the range [-3,3]?
The most straightforward way that I can see is this. Assuming that you have a large number of points {f(X1), ..., f(Xn)}, plot them as a distribution and fit a generalized Gaussian distribution curve through them. After this, you can use rejection sampling to generate further numbers from the same distribution.
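A minimal C++ sketch of the fit-then-reject step, assuming the fitted generalized Gaussian parameters (mu, alpha, beta) are already in hand; the values below are hypothetical placeholders:

```cpp
// Minimal sketch of rejection sampling on [-3,3] from a generalized Gaussian
// density exp(-(|x-mu|/alpha)^beta); normalization constants can be ignored.
// The parameter values are hypothetical placeholders for a real fit.
#include <cmath>
#include <cstdio>
#include <random>

double gg(double x, double mu, double alpha, double beta) {
    return std::exp(-std::pow(std::fabs(x - mu) / alpha, beta));
}

int main() {
    const double mu = 0.0, alpha = 1.0, beta = 1.5;  // pretend fit result
    const double lo = -3.0, hi = 3.0;
    const double M = gg(mu, mu, alpha, beta);  // bound on [lo,hi]: peak at mu

    std::mt19937 rng(2024);
    std::uniform_real_distribution<double> ux(lo, hi), uy(0.0, M);

    // Accept a uniform candidate x when a uniform height falls under the curve.
    for (int accepted = 0; accepted < 10; ) {
        double x = ux(rng);
        if (uy(rng) <= gg(x, mu, alpha, beta)) {
            std::printf("%f\n", x);
            ++accepted;
        }
    }
}
```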
I want to use rejection sampling to generate random numbers from a given distribution. I want to be quite general, so I don't want to rely on things like the Box-Muller transformation, which can generate only normally distributed random numbers. I am using a linear congruential generator to generate a random sequence between 0 and 1 with uniform distribution. To use rejection sampling, I need to generate two sequences of random numbers so that I can generate uniform points inside a 2-D region. This can be done using two random sequences (one for the x coordinate and the other for the y coordinate). I searched the Internet, but nowhere did I find how to make sure that these two sequences are really uncorrelated. Is there any way to choose seeds for them such that the sequences are uncorrelated? If I pick seeds at random, the final distribution of the points is not quite what I am looking for.
Thank you
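For reference, one common way to sidestep the two-seed problem is to draw both coordinates from a single stream, consuming two successive values per 2-D point. A minimal sketch using the classic Numerical Recipes LCG constants:

```cpp
// Minimal sketch: a single LCG stream feeds both coordinates, two draws per
// 2-D point, instead of two separately seeded (and possibly correlated)
// streams. Multiplier/increment are the classic Numerical Recipes constants.
#include <cstdint>
#include <cstdio>

struct Lcg {
    uint32_t state;
    explicit Lcg(uint32_t seed) : state(seed) {}
    double next() {                              // uniform in [0,1)
        state = 1664525u * state + 1013904223u;  // modulus 2^32 via wraparound
        return state * (1.0 / 4294967296.0);
    }
};

int main() {
    Lcg rng(12345u);
    for (int i = 0; i < 5; ++i) {
        double x = rng.next();   // x coordinate
        double y = rng.next();   // y coordinate: next value of the SAME stream
        std::printf("(%f, %f)\n", x, y);
    }
}
```

Note that consecutive outputs of an LCG are known to fall on a lattice in two dimensions, so for serious sampling a stronger generator (e.g., std::mt19937) is preferable.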