Triple integral converges to wrong number (TI-84)

I made a program for finding a triple integral of an f(x,y,z) function over a general region, but it's not working for some reason.
Here's an excerpt of the program, for when the order of integration is dz dy dx:
(B-A)/N→D
0→V
Dsum(seq(fnInt(Y₅,Y,Y₉,Y₀),X,A+.5D,B,D))→V
For(K,1,P)
A+(B-A)rand→X
Y₉+(Y₀-Y₉)rand→Y
Y₇+(Y₈-Y₇)rand→Z
Y₆→ʟW(K)
End
Vmean(ʟW)→V
The variables used are explained below:
Y₆: Equation of f(x,y,z)
Y₇,Y₈: Lower and upper bounds of the innermost integral (dz)
Y₉,Y₀: Lower and upper bounds of the middle integral (dy)
A,B: Lower and upper bounds of the outermost integral (dx)
Y₅: Y₈-Y₇
N: Number of Δx intervals
D: Size of Δx interval
P: Number of random points used to estimate the average value of f(x,y,z)
ʟW: List of various values of f(x,y,z)
V: Volume of the region of integration, then the value of the entire triple integral
So here's how I'm approaching it:
I first find the volume of just the region of integration using Dsum(seq(fnInt(Y₅,Y,Y₉,Y₀),X,A+.5D,B,D)). Then I pick a bunch of random (x,y,z) points in that region and plug them into f(x,y,z) to build a long list of values w = f(x,y,z). The average of those w-values should be a pretty good estimate of the average "height" of the 4D solid that the triple integral represents, so multiplying the region-of-integration "base" by the average w-value "height" (Vmean(ʟW)) should give a good estimate of the hypervolume of the triple integral.
It should naturally follow that as the number of (x,y,z) points tested increases, the value of the triple integral should more or less converge to the actual value.
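For concreteness, here is the same scheme written out in plain Python rather than TI-BASIC (just a sketch of the idea, not the actual calculator program, using the example integral discussed below; the function names are mine):
import random

# f(x,y,z) = 2x over the tetrahedron 0 <= z <= 2-x-y, 0 <= y <= 2-x, 0 <= x <= 2
f    = lambda x, y, z: 2 * x
y_lo = lambda x: 0.0
y_hi = lambda x: 2.0 - x
z_lo = lambda x, y: 0.0
z_hi = lambda x, y: 2.0 - x - y
A, B = 0.0, 2.0

# Step 1: volume of the region of integration (midpoint sums in x and y,
# standing in for the fnInt/seq/sum combination on the calculator).
N = 400
D = (B - A) / N
V = 0.0
for i in range(N):
    x = A + (i + 0.5) * D
    dy = (y_hi(x) - y_lo(x)) / N
    for j in range(N):
        y = y_lo(x) + (j + 0.5) * dy
        V += (z_hi(x, y) - z_lo(x, y)) * dy * D

# Step 2: average f over P random (x,y,z) points, picked the same way the
# For(K,1,P) loop picks them.
P = 100_000
total = 0.0
for _ in range(P):
    x = A + (B - A) * random.random()
    y = y_lo(x) + (y_hi(x) - y_lo(x)) * random.random()
    z = z_lo(x, y) + (z_hi(x, y) - z_lo(x, y)) * random.random()
    total += f(x, y, z)

print(V)              # about 1.333, the volume of the region
print(V * total / P)  # lands near 2.67 rather than 4/3, same as on the calculator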
For some reason, the program doesn't converge to the right value. For some integrals it works fantastically; for others it misses by a long shot. A good example is ∫[0, 2] ∫[0, 2-x] ∫[0, 2-x-y] 2x dz dy dx. The correct answer is 4/3 = 1.333..., but the program converges to a completely different number: 2.67, give or take.
Why is it doing this? Why is the triple integral converging to a wrong number?
EDIT: My guess (assuming I didn't make any mistakes, for which there are no promises) is that the RNG algorithm used by the calculator can only generate numbers slightly greater than 0, never 0 itself, and that this is throwing the program off. But I have no way to confirm this, nor to account for it, since "slightly greater than 0" isn't quantified.

Related

Numerical instability?

I am working on a program that involves optimizing an objective function obj over a scalar beta. The true global minimum is at beta0 = 1.
In the MWE below you can see that obj is constructed as the sum of the 100-R smallest eigenvalues (here I use R = 3) of the 100x100 symmetric matrix u'*u. Around the true global minimum obj "looks good", but when I plot the objective function evaluated at much larger values of beta it becomes very unstable: running the MWE you can see that multiple local minima (and maxima) appear, associated with values of obj(beta) smaller than the value at the true global minimum.
My guess is that there is some sort of "numerical instability" going on, but I am unable to find the source.
%Matrix dimensions
N=100;
T=100;
%Reproducibility
rng('default');
%True global minimum
beta0=1;
%Generating data
l=1+randn(N,2);
s=randn(T+1,2);
la=1+randn(N,2);
X(1,:,:)=1+(3*l+la)*(3*s(1:T,:)+s(2:T+1,:))';
s=s(1:T,:);
a=(randn(N,T));
Y=beta0*squeeze(X(1,:,:))+l*s'+a;
%Give "beta" a large value
beta=1e6;
%Compute objective function
u=Y-beta*squeeze(X(1,:,:));
ev=sort(eig(u'*u)); % sort eigenvalues
obj=sum(ev(1:100-3))/(N*T); % "obj" is sum of 97 smallest eigenvalues
This evaluates the objective function at beta = 1e6. I have noticed that some of the eigenvalues returned by eig(u'*u) are negative (see the vector ev), even though by construction the matrix u'*u is positive semidefinite.
I am guessing this may have to do with floating-point arithmetic issues and may (partly) explain the instability of my function, but I am not sure.
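For what it's worth, the same effect can be reproduced on a toy rank-deficient Gram matrix (a small numpy sketch, purely for illustration; it has nothing to do with my actual data):
import numpy as np

# Build a 100x100 matrix u of rank 3, so that u'*u is positive semidefinite
# with 97 eigenvalues that are exactly zero in exact arithmetic.
rng = np.random.default_rng(0)
u = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 100))
A = u.T @ u

ev = np.sort(np.linalg.eigvalsh(A))
print(ev[:5])   # the "zero" eigenvalues come back as tiny values, some of them negative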
Finally, this is what the objective function obj looks like when evaluated over a wide range of values of beta:
% Now plot "obj" for a wide range of values of "beta"
clear obj
betaGrid=-5e5:100:5e5;
for i=1:length(betaGrid)
u=Y-betaGrid(i)*squeeze(X(1,:,:));
ev=sort(eig(u'*u));
obj(i)=sum(ev(1:100-3))/(N*T);
end
plot(betaGrid,obj,"*")
xlabel('\beta')
ylabel('obj')
The resulting figure shows how unstable obj becomes for extreme values of beta.
The key here is noticing that computing eigenvalues can be a hard problem.
Actually the condition number for this problem is K = norm(A) * norm(inv(A)) (don't compute it that way, use cond()). This means that a relative perturbation in the input (i.e. the matrix entries) gets amplified by up to the condition number when computing the output. I modified your code a little bit to compute and plot the condition number at each step. It turns out that for a large part of the range you are interested in it is greater than 10^17, which is abysmal. (Note that double-precision floating point numbers are accurate to not quite 16 significant decimal digits, so here even the representation error of the inputs produces errors that make every digit insignificant.) This already explains the bad behaviour. You should also note that the largest eigenvalues can usually be computed quite accurately; the errors in the smaller (in magnitude) ones are usually worse.
If the condition number were better (closer to 1), I would have suggested computing the singular values instead, since the matrix is symmetric (and positive semidefinite) and they coincide with the eigenvalues. The SVD is numerically more stable, but with a condition number this bad even that will not help. In the following modification of the final snippet I added a graph that plots the condition number.
The only case where anything is salvageable is R = 0: there we actually want the sum of all eigenvalues, which happens to be the trace of the matrix and can be computed exactly by just summing the diagonal entries.
To summarize: this problem seems to be inherently badly conditioned, so it doesn't really matter how you compute it. If you have a completely different formulation of the same problem, that might help.
% Now plot "obj" for a wide range of values of "beta"
clear obj
L = 5e5; % decrease to 5e-1 to see that the condition number is still >1e9 around the optimum
betaGrid=linspace(-L,L,1000);
condition = nan(size(betaGrid));
for i=1:length(betaGrid)
disp(i/length(betaGrid))
u=Y-betaGrid(i)*squeeze(X(1,:,:));
A = u'*u;
ev=sort(eig(A));
condition(i) = cond(A);
obj(i)=sum(ev(1:100-3))/(N*T); % for R=0 use trace(A)/(N*T);
end
subplot(1,2,1);
plot(betaGrid,obj,"*")
xlabel('\beta')
ylabel('obj')
subplot(1,2,2);
semilogy(betaGrid, condition);
title('condition number');

Rational approximation of rational exponentiation root with error control

I am looking for an algorithm that efficiently calculates b^e, where b and e are rational numbers, ensuring that the approximation error won't exceed a given err (also rational). Explicitly, I am looking for a function:
rational exp(rational base, rational exp, rational err)
that satisfies |exp(b, e, err) - b^e| < err
Rational numbers are represented as pairs of big integers. Let's assume that all rationality preserving operations like addition, multiplication etc. are already defined.
I have found several approaches, but they did not allow me to control the error clearly enough. In this problem I don't care about integer overflow. What is the best approach to achieve this?
This one is complicated, so I'm going to outline the approach that I'd take. I do not promise no errors, and you'll have a lot of work left.
I will change notation from what you wrote: exp(x, y, err) computes x^y to within error err. If y is not in the range 0 <= y < 1, then we can easily multiply by an appropriate x^k with k an integer to make it so. So we only need to worry about fractional y.
If all numerators and denominators were small, it would be easy to tackle this by first taking an integer power, and then taking a root using Newton's method. But that naive idea will fall apart painfully when you try to estimate something like (1000001/1000000)^(2000001/1000000). So the challenge is to keep that from blowing up on you.
I would recommend looking at the problem of calculating x^y as x^y = (x0^y0) * (x0^(y-y0)) * (x/x0)^y = (x0^y0) * e^((y-y0) * log(x0)) * e^(y * log(x/x0)). And we will choose x0 and y0 such that the calculations are easier and the errors are bounded.
To bound the errors, we can first come up with a naive upper bound b on x0^y0, something like the next integer above x raised to the next integer above y. We will pick x0 and y0 to be close enough to x and y that the latter two terms are under 2. Then we just need to estimate the three terms to within err/12, err/(6*b) and err/(6*b) respectively. (You might want to make those error targets tighter, say half of that, and then round the final answer to a nearby rational.)
Now when we pick x0 and y0 we will be aiming for "close rational with smallish numerator/denominator". For that we start calculating the continued fraction. This gives a sequence of rational numbers that quickly converges to a target real. If we just cut off the sequence fairly soon, we can quickly find a rational number that is within any desired distance of a target real while keeping relatively small numerators and denominators.
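A sketch of that step in Python, using fractions.Fraction as a stand-in for your big-integer rationals (the helper names are mine):
from fractions import Fraction

def convergents(x):
    # Continued-fraction convergents of a rational x, via the usual recurrence
    # h_n = a_n*h_(n-1) + h_(n-2), k_n = a_n*k_(n-1) + k_(n-2). Terminates,
    # since the partial quotients come from the Euclidean algorithm.
    h_prev, h_curr = 0, 1
    k_prev, k_curr = 1, 0
    while True:
        a = x.numerator // x.denominator       # next partial quotient
        h_prev, h_curr = h_curr, a * h_curr + h_prev
        k_prev, k_curr = k_curr, a * k_curr + k_prev
        yield Fraction(h_curr, k_curr)
        frac = x - a
        if frac == 0:
            return                             # reached x exactly
        x = 1 / frac

def close_rational(x, tol):
    # First convergent within tol of x; its numerator and denominator stay
    # small compared to x's unless tol is absurdly tight.
    for c in convergents(x):
        if abs(c - x) < tol:
            return c

# e.g. close_rational(Fraction(314159265358979, 10**14), Fraction(1, 10**6))
# returns Fraction(355, 113)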
Let's work from the third term backwards.
We want y * log(x/x0) < log(2). But from the Taylor series if x/2 < x0 < 2x then log(x/x0) < x/x0 - 1. So we can search the continued fraction for an appropriate x0.
Once we have found it, we can use the Taylor series for log(1+z) to calculate log(x/x0) to within err/(12*y*b). And then the Taylor series for e^z to calculate the term to our desired error.
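For the Taylor series step, here is a sketch of evaluating log(1+z) in exact rational arithmetic with an explicit bound on the truncated tail (again Fraction standing in for your rational type; the tail bound |z|^n / (n*(1 - |z|)) is the usual geometric-series estimate, valid for |z| < 1):
from fractions import Fraction

def log1p_rational(z, err):
    # log(1+z) = z - z^2/2 + z^3/3 - ...  for |z| < 1.
    # Stop once the geometric bound on the remaining tail drops below err.
    z, err = Fraction(z), Fraction(err)
    assert abs(z) < 1
    total = Fraction(0)
    power, n = z, 1                      # power == z^n throughout
    while abs(power) / (n * (1 - abs(z))) >= err:
        total += power / n if n % 2 == 1 else -power / n
        power *= z
        n += 1
    return total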
The second term is more complicated. We need to estimate log(x0). What we do is find an appropriate integer k such that 1.1^k <= x0 < 1.1^(k+1). Then we can estimate both k * log(1.1) and log(x0 / 1.1^k) fairly precisely. Find a naive upper bound on that log and use it to find a y0 close enough to y that the second term is under 2. Then use the Taylor series to estimate e^((y-y0) * log(x0)) to our desired precision.
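The bracketing step can be done naively by repeated multiplication or division (a sketch; if x0 can be astronomically large or small you would square the ratio to get there in logarithmically many steps):
from fractions import Fraction

def bracket_by_ratio(x0, r=Fraction(11, 10)):
    # Find the integer k with r^k <= x0 < r^(k+1), returning k and the
    # reduced value x0 / r^k, which lies in [1, r).
    k = 0
    while x0 >= r:
        x0 /= r
        k += 1
    while x0 < 1:
        x0 *= r
        k -= 1
    return k, x0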
For the first term we use the naive method of raising x0 to an integer and then Newton's method to take a root, to give x0^y0 to our desired precision.
Then multiply the three terms together, and we have an answer. (If you chose the "tighter errors, nicer answer" option, you would now run a continued fraction on that answer to pick a better rational to return.)

Parametric Scoring Function or Algorithm

I'm trying to come up with a way to arrive at a "score" based on an integer number of "points" that is adjustable using a small number (3-5?) of parameters. Preferably it would be simple enough to reasonably enter as a function/calculation in a spreadsheet for tuning the parameters by the "designer" (not a programmer or mathematician). The first point has the most value and eventually additional points have a fixed or nearly fixed value. The transition from the initial slope of point value to final slope would be smooth. See example shapes below.
Points values are always positive integers (0 pts = 0 score)
At some point, curve is linear (or nearly), all additional points have fixed value
Preferably, parameters are understandable to a lay person, e.g.: "smoothness of the curve", "value of first point", "place where the additional value of points is fixed", etc
For parameters, an example of something ideal would be:
Value of first point: 10
Value of point #3 is: 5
Minimum value of additional points: 0.75
Exact shape of curve not too important as long as the corner can be more smooth or more sharp.
This is not for a game but more of a rating system in which multiple components (several of which might use this kind of scale) will be combined.
This seems like a non-traditional kind of question for SO/SE. I've done mostly financial software in my career, and I'm hoping there's some domain wisdom for this kind of thing I can tap into.
Implementation of Prune's Solution:
Google Sheet
Parameters:
Initial value (a)
Second value (b)
Minimum value (z)
Your decay ratio is b/a. It's simple from here: iterate through your values, applying the decay at each step, until you "peg" at the minimum:
x[n] = max( z, a * (b/a)^n )
// Take the larger of the computed "decayed" value,
// and the specified minimum.
The sequence x is your values list.
You can also truncate intermediate results if you want integers up to a certain point. Just apply the floor function to each computed value, but still allow z to override that if it gets too small.
Is that good enough? I know there's a discontinuity in the derivative, which will be noticeable if the minimum and the decay aren't pleasantly aligned. You can adjust this with a relative decay, translating the exponential decay curve so that its asymptote is y = z instead of y = 0.
base = z
diff = a-z
ratio = (b-z) / diff
x[n] = z + diff * ratio^n
In this case, you don't need the max function, since the decaying part diff * ratio^n has a natural asymptote of 0, so x[n] approaches z on its own.
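As a quick sanity check, here are both versions side by side in Python, with made-up parameters (a = 10 for the first point, b = 6 for the second, z = 0.75 as the minimum); in both, x[0] comes out as a and x[1] as b by construction:
a, b, z = 10.0, 6.0, 0.75

# Version 1: plain decay, clamped at the minimum.
v1 = [max(z, a * (b / a) ** n) for n in range(10)]

# Version 2: decay translated so it approaches z instead of 0 (no clamp needed).
diff = a - z
ratio = (b - z) / diff
v2 = [z + diff * ratio ** n for n in range(10)]

print(v1)   # 10, 6, 3.6, 2.16, ... then pegged at 0.75
print(v2)   # 10, 6, 3.73..., smoothly approaching 0.75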

How to calculate algorithm complexity by having its runtime?

I'm testing a sorting algorithm and tried it with different amounts of data:
100 thousand elements
1 million elements
up to 10 million elements
I need to determine the complexity of this algorithm from the measured time each sort took.
How can I do that?
While you can't establish the asymptotic complexity of an algorithm without doing mathematical analysis, empirical measurements can give you a reasonable idea of how the running time of the algorithm (or rather, of the program) behaves.
For example, if you have n measurements (x1, y1), (x2, y2), ..., (xn, yn), where xi is the size of the input and yi is the time of the program on the input of that size, then you can plot the function to see whether it's a polynomial. In practice it often is. However, it's hard to see what the exponent should be from the plot.
To find the exponent you could find the slope of the line that best fits the points (log xi, log yi). This is because if y=C*x^k+lower order terms, then since the term C*x^k dominates we expect log y =~ k*log x + log C, i.e., the log-log equation is a linear one whenever the "original" equation is a polynomial one. (Whenever you see a linear function in the log-log plot, your running time is polynomial; the slope of the line tells you the degree of the polynomial.)
Here's a plot of the quadratic function y(x)=x^2:
And here's the corresponding log-log plot:
We can see that it's a line with slope 2 (in practice you would compute this using, for example, linear least squares). This is expected because log y(x) = 2 * log(x).
The code I used:
x = 1:1:100;
y = x.^2;
plot(x, y);
plot(log(x), log(y));
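And for the least-squares fit mentioned above, something like this works (numpy here for illustration; the timings are invented just to show the mechanics):
import numpy as np

# Input sizes and measured times (seconds). These numbers are made up and
# roughly mimic a quadratic-time program.
xs = np.array([1e5, 1e6, 1e7])
ys = np.array([0.04, 4.1, 395.0])

slope, intercept = np.polyfit(np.log(xs), np.log(ys), 1)
print(slope)   # comes out close to 2 here, i.e. roughly O(n^2)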
In practice the function looks messier and the slope can (or should) only be used as a rule of thumb when nothing else is available.
I imagine there are many other tricks to learn about program behavior from running time measurements. I'll give others a chance to share their experience.

How can a random number be produced with decreasing odds (approaching zero) towards its upper boundary?

I want a random number generator that almost never produces numbers near a given upper boundary. The odds should drop linearly to 0 at the upper boundary.
This is possibly best suited as a math-only question, but I need it in code form (pseudo-code is fine, more specifically any C-based language) for my use, so I'm putting it here.
If you want a linear drop-off what you're describing is called a triangle (or triangular) distribution. Given U, a source of uniformly distributed random numbers on the range [0,1), you can generate a triangle on the range [a,b) with its mode at a using:
def triangle(a,b)
return a + (b-a)*(1 - sqrt(U))
end
This can be derived by writing the equation of a triangle for the specified range, scaling it so it has area 1 to make it a valid density, integrating to get the CDF, and using inversion.
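Spelled out for the mode-at-a case, the derivation is short:
density:   f(x) = 2*(b - x)/(b - a)^2   for a <= x < b
CDF:       F(x) = 1 - ((b - x)/(b - a))^2
inversion: u = F(x)  =>  x = b - (b - a)*sqrt(1 - u)
and since 1 - U is uniform on (0, 1] whenever U is uniform on [0, 1), this is the same as a + (b - a)*(1 - sqrt(U)).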
As an interesting aside, this will still work if a >= b. For equality, you always get a (which makes sense if the range is zero). Otherwise, you get a triangle which goes from b to a and has its mode at a.
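A quick empirical check (a Python sketch; the per-bin counts should fall off roughly linearly toward b):
import random

def triangle(a, b):
    u = random.random()                   # uniform on [0, 1)
    return a + (b - a) * (1 - u ** 0.5)   # mode at a, density hits 0 at b

samples = [triangle(0.0, 10.0) for _ in range(100_000)]
counts = [sum(1 for s in samples if i <= s < i + 1) for i in range(10)]
print(counts)   # roughly 19000, 17000, 15000, ..., 1000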
