Function derivatives - algorithm

I have some function:
int somefunction(int x) { // parameters here; let's say int x
return x*x + 2*x + 3; // or anything else, it does not matter
}
How do I find the derivative of this function? If I have
int f(int x) {
return sin(x);
}
then its derivative should return cos(x).

You can approximate the derivative by looking at the gradient over a small interval. E.g.
const double DELTA=0.0001;
double dfbydx(double x) {
return (f(x+DELTA) - f(x)) / DELTA;
}
Depending on where you're evaluating the function, you might get better results from (f(x+DELTA) - f(x-DELTA)) / (2*DELTA) instead.
(I assume 'int' in your question was a typo. If you really are using integers, you might have problems with precision this way.)
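As a rough illustration of the two formulas above, here is a minimal Python sketch. The function names and the default step h are assumptions made for the example, not part of the original answer, and the best step size depends on the function and the point of evaluation.

```python
import math

def forward_difference(f, x, h=1e-4):
    # One-sided estimate: slope between x and x + h.
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h=1e-4):
    # Symmetric estimate: slope between x - h and x + h.
    # Note the parentheses around 2 * h.
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    x = 1.0
    print("forward:", forward_difference(math.sin, x))
    print("central:", central_difference(math.sin, x))
    print("exact:  ", math.cos(x))
```

For a smooth function such as sin, the central difference is usually noticeably closer to cos(x) than the forward difference with the same h.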

You can get the numerical integral of almost any function using one of many numerical techniques, such as those used for numerical ordinary differential equations.
Look at: Another question
But you can also get the integration result as a function definition with a library such as Maple, Mathematica, Sage, or SymPy.


How to generate a seed from an xy coordinate

I've been working on a Perlin noise script but have been having problems creating simple pseudo-random values.
I need to be able to create a seed value from an xy coordinate, but x+y has obvious problems with recurring values. Also, the coordinates go into negative space, so x^y doesn't work.
Sorry if this has already been answered somewhere else, but either I didn't understand it or couldn't find it.
Do you want to assign a repeatable random number to each x,y pair?
Using a linear combination (or, in general, a simple function) of x and y as the seed will give artifacts in the distribution, at least unless you use a very complex function.
Try this; I had the same problem and it worked for me:
//seeded random for JS - integer
function irnd2()
{
var a=1664525;
var c=1013904223;
var m=4294967296;
rnd2.r=(rnd2.r*a+c)%m;
return rnd2.r;
}
//seeded random for JS - double [0,1]
function rnd2()
{
var a=1664525;
var c=1013904223;
var m=4294967296;
rnd2.r=(rnd2.r*a+c)%m;
return rnd2.r/m;
}
rnd2.r=192837463;
//seed function
function seed2(s)
{
s=s>0?s:-s;
rnd2.r=192837463^s;
}
//my smart seed from 2 integer
function myseed(x,y)
{
seed2(x);//x is integer
var sx=irnd2();//sx is integer
seed2(y);//y is integer
var sy=irnd2();//sy is integer
seed2(sx^sy);//using binary xor you won't lose information
}
In order to use it :
myseed(x,y);
irnd2();
In this manner you can obtain a good uncorrelated random sequence.
I use it in JS but it should work also in other languages supposing the argument of seed and the returned value of rnd is an integer.
You need to better define the problem to get an optimal answer.
If your x and y values are relatively small, you could place them into the high and low portions of an integer (assuming the seed in your language is an integer), e.g. for a 32-bit platform:
int seed = (x << 16) + y;
If the seed value is not allowed to be negative (I didn't fully understand what you meant by "negative space" in your question, whether you were referring to geography or the seed value), you can take the absolute value of the seed.
If you meant that the coordinates can have negative values, your best course of action depends on whether you want the same seed for a coordinate and for its inverse.
Take the absolute value of both x and y first; then x^y will work fine. One of the easiest ways to create a pseudo-random source is with time. You might try multiplying x^y by the current system time; this method has an extremely low chance of generating recurring seed values.
If you know the range of values you have, you could simply cast x and y as strings padded with zeroes, append the two strings, then run the resulting string through a hash function.
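The string-and-hash idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the names seed_from_xy and value_at, the choice of SHA-256, and the fixed field width of 11 characters (enough for signed 32-bit coordinates) are all choices made for this example.

```python
import hashlib
import random

def seed_from_xy(x, y):
    # Format each coordinate with an explicit sign and zero padding so that
    # (1, 23) and (12, 3) cannot produce the same string.
    key = f"{x:+011d}{y:+011d}".encode("ascii")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as a non-negative integer seed.
    return int.from_bytes(digest[:8], "big")

def value_at(x, y):
    # Repeatable pseudo-random value in [0, 1) for a given coordinate.
    return random.Random(seed_from_xy(x, y)).random()

if __name__ == "__main__":
    print(seed_from_xy(3, -7))
    print(value_at(3, -7), value_at(3, -7))  # same value both times
```

Because the sign is part of the formatted string, negative coordinates need no special handling, and (x, y) and (y, x) map to different seeds.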
In C#, adapted and improved from alexroat's answer. Just set Random.seed = MyUtils.GetSeedXY(x, y) and you're good to go.
public static class MyUtils
{
static int seed2(int _s)
{
var s = 192837463 ^ System.Math.Abs(_s);
var a = 1664525;
var c = 1013904223;
var m = 4294967296;
return (int) ((s * a + c) % m);
}
public static int GetSeedXY(int x, int y)
{
int sx = seed2(x * 1947);
int sy = seed2(y * 2904);
return seed2(sx ^ sy);
}
}

Interval arithmetic of real powers

I've found many definitions on how to compute the power of an interval, where the power is an integer, but I'd like to find a formula for computing the power more generally.
In other words, I'd like to implement something like the 'pow' method below:
class Interval {
public final double min;
public final double max;
public Interval pow(Interval i) {
return new Interval(..., ...);
}
}
In Interval Arithmetic, the standard four operators are relatively straight-forward (read: available on Wikipedia), transcendental functions are not much harder, and integer powers of intervals are not difficult... But interval powers of intervals are throwing me for a loop.
I haven't been able to find any open-source libraries in any language that implement this generalized power function.
I'd appreciate any complete answers, suggestions, and references.
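One partial approach, sketched here in Python for the case of a strictly positive base interval only: for x > 0, x^y = exp(y ln x) is monotone in x for fixed y and monotone in y for fixed x, so over a box its extremes are attained at the four corners. The class and method names mirror the question, but the code is an illustrative assumption and not a library API. It also ignores outward rounding, so the result is not a rigorous enclosure, and bases that touch zero or go negative are left out entirely.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def pow(self, exponent):
        # Assumption: the base interval lies strictly above zero.
        if self.lo <= 0.0:
            raise ValueError("this sketch only handles strictly positive bases")
        # Evaluate x**y at the four corners of the (base, exponent) box.
        corners = [
            math.pow(base, exp)
            for base in (self.lo, self.hi)
            for exp in (exponent.lo, exponent.hi)
        ]
        return Interval(min(corners), max(corners))

if __name__ == "__main__":
    print(Interval(2.0, 3.0).pow(Interval(1.0, 2.0)))
    print(Interval(0.5, 2.0).pow(Interval(-1.0, 2.0)))
```

The second call shows why all four corners are needed: when the base interval straddles 1, the direction of monotonicity in the exponent changes across the interval.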

floating point operations

I have read that some machines can't represent certain floating-point numbers exactly, for example 1.1.
Let's take this code:
float x=0.1;
do{
x+=0.1;
printf("%f\n",x);
} while(x!=1.1);
This code never finishes. How can I make it finish? Maybe by converting to double, or something else?
For numerical problems, it is common to specify an epsilon of accuracy:
bool within_epsilon(float x, float y, float e) {
// fabsf (from <math.h>) stays in floating point; abs() would truncate to int
return fabsf(x - y) <= e;
}
The epsilon you choose determines your accuracy, and the smallest epsilon you can meaningfully choose depends on your floating point implementation (see machine epsilon).
For example, compare within an acceptable margin, i.e.
while (fabs(x-1.1)>0.001);
Doubles will have the same issue, just with more precision. Some languages also offer you rational types, where you can specify a number as the fraction 1/10, or fixed point data types.
In this case, checking "<" will do the trick:
float x=0.1;
do{
x+=0.1;
printf("%f\n",x);
} while(x<1.05);
In general, you should test against an "epsilon". Look here for further information.
Work in fixed point for that kind of task.
The decimal type, for example, might help. It's not the solution for all problems, though.
If you want the code to behave exactly as you describe, then you want to use a type like decimal (where available), which is a base-10 floating point implementation rather than a base-2 one.
Further reading: http://en.wikipedia.org/wiki/Floating_point#Accuracy_problems and http://en.wikipedia.org/wiki/Decimal_floating_point
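The two approaches suggested in the answers above (comparing within a tolerance, and using a base-10 type) can be sketched in Python. Python's float is a binary double, so the same representation issue applies. The function names, the tolerance of 1e-9 and the max_steps safety limit are assumptions made for this example.

```python
from decimal import Decimal

def steps_with_tolerance(target=1.1, step=0.1, tol=1e-9, max_steps=1000):
    # Mirror the do/while loop from the question, but stop once x is
    # within tol of the target instead of testing for exact equality.
    x = 0.1
    steps = 0
    while True:
        x += step
        steps += 1
        if abs(x - target) <= tol or steps >= max_steps:
            return steps

def steps_with_decimal():
    # Decimal stores 0.1 and 1.1 exactly, so an exact comparison is safe here.
    x = Decimal("0.1")
    steps = 0
    while True:
        x += Decimal("0.1")
        steps += 1
        if x == Decimal("1.1"):
            return steps

if __name__ == "__main__":
    print(steps_with_tolerance())
    print(steps_with_decimal())
```

The max_steps guard is there only so that a badly chosen tolerance cannot turn the loop into an infinite one again.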

How to calculate the sum of two normal distributions

I have a value type that represents a gaussian distribution:
struct Gauss {
double mean;
double variance;
}
I would like to perform an integral over a series of these values:
Gauss eulerIntegrate(double dt, Gauss iv, Gauss[] values) {
Gauss r = iv;
foreach (Gauss v in values) {
r += v*dt;
}
return r;
}
My question is how to implement addition for these normal distributions.
The multiplication by a scalar (dt) seemed simple enough. But it wasn't simple! Thanks FOOSHNICK for the help:
public static Gauss operator * (Gauss g, double d) {
return new Gauss(g.mean * d, g.variance * d * d);
}
However, addition eludes me. I assume I can just add the means; it's the variance that's causing me trouble. Either of these definitions seems "logical" to me.
public static Gauss operator + (Gauss a, Gauss b) {
double mean = a.mean + b.mean;
// Is it this? (Yes, it is!)
return new Gauss(mean, a.variance + b.variance);
// Or this? (nope)
//return new Gauss(mean, Math.Max(a.variance, b.variance));
// Or how about this? (nope)
//return new Gauss(mean, (a.variance + b.variance)/2);
}
Can anyone help define a statistically correct - or at least "reasonable" - version of the + operator?
I suppose I could switch the code to use interval arithmetic instead, but I was hoping to stay in the world of prob and stats.
The sum of two normal distributions is itself a normal distribution:
N(mean1, variance1) + N(mean2, variance2) ~ N(mean1 + mean2, variance1 + variance2)
This is all on the Wikipedia page.
Be careful that these really are variances and not standard deviations.
// X + Y
public static Gauss operator + (Gauss a, Gauss b) {
//NOTE: this is valid if X,Y are independent normal random variables
return new Gauss(a.mean + b.mean, a.variance + b.variance);
}
// X*b
public static Gauss operator * (Gauss a, double b) {
return new Gauss(a.mean*b, a.variance*b*b);
}
To be more precise:
If a random variable Z is defined as the linear combination of two uncorrelated Gaussian random variables X and Y, then Z is itself a Gaussian random variable, e.g.:
if Z = aX + bY,
then mean(Z) = a * mean(X) + b * mean(Y), and variance(Z) = a^2 * variance(X) + b^2 * variance(Y).
If the random variables are correlated, then you have to account for that. Variance(X) is defined by the expected value E([X-mean(X)]^2). Working this through for Z = aX + bY, we get:
variance(Z) = a^2 * variance(X) + b^2 * variance(Y) + 2ab * covariance(X,Y)
If you are summing two uncorrelated random variables which do not have Gaussian distributions, then the distribution of the sum is the convolution of the two component distributions.
If you are summing two correlated non-Gaussian random variables, you have to work through the appropriate integrals yourself.
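The formulas above can be collected into a small Python sketch. The Gauss class mirrors the struct from the question. The helper linear_combination and its covariance parameter are names assumed for this example. The + operator and euler_integrate assume the summed variables are independent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gauss:
    mean: float
    variance: float  # a variance, not a standard deviation

    def __add__(self, other):
        # Valid for independent normal variables: means and variances add.
        return Gauss(self.mean + other.mean, self.variance + other.variance)

    def __mul__(self, k):
        # Scaling by k scales the mean by k and the variance by k^2.
        return Gauss(self.mean * k, self.variance * k * k)

def linear_combination(a, x, b, y, covariance=0.0):
    # Z = aX + bY, with an optional covariance term for correlated X and Y.
    mean = a * x.mean + b * y.mean
    variance = a * a * x.variance + b * b * y.variance + 2 * a * b * covariance
    return Gauss(mean, variance)

def euler_integrate(dt, iv, values):
    # Same loop as in the question; treats every value as independent.
    r = iv
    for v in values:
        r = r + v * dt
    return r

if __name__ == "__main__":
    print(Gauss(1.0, 4.0) + Gauss(2.0, 9.0))
    print(euler_integrate(0.5, Gauss(0.0, 0.0), [Gauss(2.0, 4.0), Gauss(4.0, 8.0)]))
```

If the values being integrated are correlated in time, the independence assumption understates or overstates the variance, and the covariance term has to be carried explicitly.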
Well, your multiplication by scalar is wrong - you should multiply variance by the square of d. If you're adding a constant, then just add it to the mean, the variance stays the same. If you're adding two distributions, then add the means and add the variances.
Can anyone help define a statistically correct - or at least "reasonable" - version of the + operator?
Arguably not, as adding two distributions means different things. Having worked in reliability and maintainability, my first reaction from the title would be the distribution of a system's MTBF, if the MTBF of each part is normally distributed and the system has no redundancy. You are talking about the distribution of the sum of two normally distributed independent variates, not the (logical) sum of two normal distributions' effect. Very often, operator overloading has surprising semantics. I'd leave it as a function and call it 'normalSumDistribution' unless your code has a very specific target audience.
Hah, I thought you couldn't add gaussian distributions together, but you can!
http://mathworld.wolfram.com/NormalSumDistribution.html
In fact, the mean is the sum of the individual means, and the variance is the sum of the individual variances.
I'm not sure that I like what you're calling "integration" over a series of values. Do you mean that word in a calculus sense? Are you trying to do numerical integration? There are other, better ways to do that. Yours doesn't look right to me, let alone optimal.
The Gaussian distribution is a nice, smooth function. I think a nice quadrature approach or Runge-Kutta would be a much better idea.
I would have thought it depends on what type of addition you are doing. If you just want to get a normal distribution with properties (mean, standard deviation etc.) equal to the sum of two distributions then the addition of the properties as given in the other answers is fine. This is the assumption used in something like PERT where if a large number of normal probability distributions are added up then the resulting probability distribution is another normal probability distribution.
The problem comes when the two distributions being added are not similar. Take for instance adding a probability distribution with a mean of 2 and a standard deviation of 1 and a probability distribution with a mean of 10 and a standard deviation of 2. If you add these two distributions up, you get a probability distribution with two peaks, one at 2ish and one at 10ish. The result is therefore not a normal distribution. The assumption about adding distributions is only really valid if the original distributions are either very similar or you have a lot of original distributions so that the peaks and troughs can be evened out.

How best to sum up lots of floating point numbers?

Imagine you have a large array of floating point numbers, of all kinds of sizes. What is the most correct way to calculate the sum, with the least error? For example, when the array looks like this:
[1.0, 1e-10, 1e-10, ... 1e-10]
and you add up from left to right with a simple loop, like
sum = 0
numbers.each do |val|
sum += val
end
then the smaller numbers might fall below the precision threshold each time one is added, so the error gets bigger and bigger. As far as I know, the best way is to sort the array and add the numbers from lowest to highest, but I am wondering if there is an even better way (faster, more precise)?
EDIT: Thanks for the answer, I now have a working code that perfectly sums up double values in Java. It is a straight port from the Python post of the winning answer. The solution passes all of my unit tests. (A longer but optimized version of this is available here Summarizer.java)
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
/**
* Adds up numbers in an array with perfect precision, and in O(n).
*
* @see http://code.activestate.com/recipes/393090/
*/
public class Summarizer {
/**
* Perfectly sums up numbers, without rounding errors (if at all possible).
*
* @param values
* The values to sum up.
* @return The sum.
*/
public static double msum(double... values) {
List<Double> partials = new ArrayList<Double>();
for (double x : values) {
int i = 0;
for (double y : partials) {
if (Math.abs(x) < Math.abs(y)) {
double tmp = x;
x = y;
y = tmp;
}
double hi = x + y;
double lo = y - (hi - x);
if (lo != 0.0) {
partials.set(i, lo);
++i;
}
x = hi;
}
if (i < partials.size()) {
partials.set(i, x);
partials.subList(i + 1, partials.size()).clear();
} else {
partials.add(x);
}
}
return sum(partials);
}
/**
* Sums up the rest of the partial numbers which cannot be summed up without
* loss of precision.
*/
public static double sum(Collection<Double> values) {
double s = 0.0;
for (Double d : values) {
s += d;
}
return s;
}
}
For "more precise": this recipe in the Python Cookbook has summation algorithms which keep the full precision (by keeping track of the subtotals). Code is in Python but even if you don't know Python it's clear enough to adapt to any other language.
All the details are given in this paper.
See also: Kahan summation algorithm. It does not require O(n) storage but only O(1).
There are many algorithms, depending on what you want. Usually they require keeping track of the partial sums. If you keep only the sums x[k+1] - x[k], you get the Kahan algorithm. If you keep track of all the partial sums (hence yielding an O(n^2) algorithm), you get @dF's answer.
Note that, in addition to your problem, summing numbers of different signs is very problematic.
Now, there are simpler recipes than keeping track of all the partial sums:
Sort the numbers before summing, and sum all the negatives and the positives independently. If your numbers are already sorted, fine; otherwise you have an O(n log n) algorithm. Sum by increasing magnitude.
Sum by pairs, then pairs of pairs, etc.
Personal experience shows that you usually don't need fancier things than Kahan's method.
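For reference, Kahan's compensated summation and the "sum by pairs" recipe mentioned above can be sketched in Python as follows. The function names are assumptions for this example. math.fsum from the standard library is used only as an accurate reference value to compare against.

```python
import math

def kahan_sum(values):
    # Compensated summation: 'compensation' carries the low-order bits
    # that were lost when the previous term was added to 'total'.
    total = 0.0
    compensation = 0.0
    for v in values:
        y = v - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total

def pairwise_sum(values):
    # Sum by pairs, then pairs of pairs, and so on (recursive halving).
    n = len(values)
    if n == 0:
        return 0.0
    if n == 1:
        return float(values[0])
    mid = n // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])

if __name__ == "__main__":
    data = [1.0] + [1e-10] * 100000
    reference = math.fsum(data)
    print("naive error:   ", abs(sum(data) - reference))
    print("kahan error:   ", abs(kahan_sum(data) - reference))
    print("pairwise error:", abs(pairwise_sum(data) - reference))
```

Kahan summation needs only O(1) extra storage, while this pairwise version copies slices for readability; an index-based variant would avoid the copies.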
Well, if you don't want to sort then you could simply keep the total in a variable with a type of higher precision than the individual values (e.g. use a double to keep the sum of floats, or a "quad" to keep the sum of doubles). This will impose a performance penalty, but it might be less than the cost of sorting.
If your application relies on numeric processing, search for an arbitrary-precision arithmetic library; however, I don't know whether there are Python libraries of this kind. Of course, it all depends on how many digits of precision you want: you can achieve good results with standard IEEE floating point if you use it with care.
