Round-off to zero behavior in Maxima float coefficients - precision

So I am working on a Maxima program that involves a bunch of iterations (the Souriau-Frame Drazin Inverse Algorithm, to be specific), each step of which yields a polynomial. I need to check and stop my iterations when the polynomial goes to zero (i.e., all coefficients go to zero).
Maxima seems to never truncate small numbers to zero up until it reaches something absurd like $10^{-323}$ and so on.
The following code snippet gives an idea of what I need:
(%i3) rat(1e-300);
rat: replaced 1.0E-300 by 1/999999999999999903803069407426113968898218766118103141789833949572356552411722264192305659040010509526872994217248819197070144216063125530186267630296136203765329090687113225440746189048800695790727969805197112921161540803823920273299782054992133678869364753954248541633605124057805104488924519071744 = 1.0E-300
(%o3)/R/ 1/9999999999999999038030694074261139688982187661181031417898339495723\
565524117222641923056590400105095268729942172488191970701442160631255301862676\
302961362037653290906871132254407461890488006957907279698051971129211615408038\
23920273299782054992
133678869364753954248541633605124057805104488924519071744
(%i4) rat(1e-323);
rat: replaced 9.0E-324 by^C
Maxima encountered a Lisp error:
SIMPLE-ERROR: Console interrupt.
Automatically continuing.
To enable the Lisp debugger set *debugger-hook* to nil.
(%i5) rat(1e-325);
rat: replaced 0.0 by 0/1 = 0.0
(%o5)/R/ 0
(%i6)
As one can see, it's not truncating $10^{-300}$ to zero, it hangs (and I had to sigkill it) for $10^{-323}$ and everything smaller than $10^{-325}$ is set to zero.
I don't know where this 324 is coming from. And I'd like to know if it's possible to reduce this for my code.
Edit 1: Here's the output if I used rationalize instead of rat:
(%i3) rationalize(1e-300);
(%o3) 6032057205060441/6032057205060440848842124543157735677050252251748505781\
796615064961622344493727293370973578138265743708225425014400837164813540499979\
063179105919597766951022193355091707896034850684039059079180396788349106095584\
290087446076413771468940477241550670753145517602931224392424029547429993824129\
889235158145614364972941312
(%i4) rationalize(1e-323);
(%o4) 1/1012011266536553091762476733594586535247783248820710591784506790137151\
697839976734459801918507185622475935389321584059556949043686928967384335066999\
703692549607587121382831806822334538710466081706198838392363725342810037417123\
463493090516778245797781704050282561793847761667073076152512660931637543230031\
31653853870546747392
(%i5) rationalize(1e-324);
(%o5) 0
Edit 2: Here's the output to "build_info();":
(%i6) build_info();
(%o6)
Maxima version: "5.43.2"
Maxima build date: "2020-02-21 05:22:38"
Host type: "x86_64-pc-linux-gnu"
Lisp implementation type: "GNU Common Lisp (GCL)"
Lisp implementation version: "GCL 2.6.12"
User dir: "/home/nidish/.maxima"
Temp dir: "/tmp"
Object dir: "/home/nidish/.maxima/binary/5_43_2/gcl/GCL_2_6_12"
Frontend: false

I gather that the goal is to replace floats that are small in absolute value with zero. There doesn't appear to be a built-in function for that. Here's an attempt at an implementation via the pattern matching machinery.
First define a rule to replace small floats, and define a function which applies the rule to an expression.
(%i4) matchdeclare(xx,floatnump) $
(%i5) defrule(squashing_rule,xx, if abs(xx) <= squashing_tolerance then 0 else xx);
(%o5) squashing_rule : xx -> (if abs(xx) <= squashing_tolerance then 0 else xx)
(%i6) squashing_tolerance:0.01 $
(%i7) squash_floats(expr):=applyb1(expr,squashing_rule) $
Now create a random polynomial.
(%i8) e:makelist(float((((2*random(2)-1)*(1+random(8)))/8) *10^-random(4)) *x^k,k,1,6);
(%o8) [- 3.75e-4 x, - 0.00625 x^2, - 0.05 x^3, 0.00625 x^4, 0.005 x^5, 0.5 x^6]
(%i9) e1:apply("+",e);
(%o9) 0.5 x^6 + 0.005 x^5 + 0.00625 x^4 - 0.05 x^3 - 0.00625 x^2 - 3.75e-4 x
Apply squash_floats to the generated polynomial.
(%i10) squash_floats(e1);
(%o10) 0.5 x^6 - 0.05 x^3
Change the squashing tolerance.
(%i11) squashing_tolerance:0.001;
(%i12) squash_floats(e1);
(%o12) 0.5 x^6 + 0.005 x^5 + 0.00625 x^4 - 0.05 x^3 - 0.00625 x^2
Verify the replacement happens in nested expressions.
(%i13) squash_floats(sin(1+1/e1));
(%o13) sin(1/(0.5 x^6 + 0.005 x^5 + 0.00625 x^4 - 0.05 x^3 - 0.00625 x^2) + 1)
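For readers who keep the coefficients outside Maxima, the same thresholding idea can be sketched in Python. This is only an illustration, assuming the polynomial is stored as a plain list of float coefficients; the names squash_floats and is_zero_polynomial are made up for this sketch and are not Maxima functions.

```python
def squash_floats(coeffs, tol):
    """Return a copy of coeffs with every entry of magnitude <= tol set to 0.0."""
    return [0.0 if abs(c) <= tol else c for c in coeffs]


def is_zero_polynomial(coeffs, tol):
    """True when every coefficient is within tol of zero (a possible stop test)."""
    return all(abs(c) <= tol for c in coeffs)


# Coefficients of x^1 .. x^6 taken from the example polynomial e1.
e1 = [-3.75e-4, -0.00625, -0.05, 0.00625, 0.005, 0.5]

print(squash_floats(e1, 0.01))    # only the x^3 and x^6 terms survive
print(squash_floats(e1, 0.001))   # only the x^1 term is dropped
print(is_zero_polynomial([1e-300, -1e-310], 1e-12))
```

An absolute tolerance is used here because that is what the Maxima rule does; a relative tolerance may suit an iteration whose coefficients vary widely in scale.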

First let's step back a moment. What is the behavior you are hoping to find? If you need to convert very small floats to rational numbers accurately, try rationalize instead of rat. Does that work correctly for 1e-323?
If you want floats smaller than a tolerance to be converted to zero, we'll need to take a different approach. I'll hold off on that for the moment.
About the specific behavior you have observed, it appears to be implementation-dependent; I get a different (still buggy) behavior with Maxima + SBCL, which reports a floating point overflow. What does build_info(); report?
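I don't know if it matters, but 1e-323 is a so-called denormalized (subnormal) float -- it is smaller than the smallest normalized (full precision) double, which is about 2.2e-308. The smallest positive denormalized double is about 4.9e-324, which is where the 324 comes from: anything much smaller rounds to zero as soon as it is read.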
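The thresholds involved are properties of IEEE 754 double precision rather than of Maxima, so they can be checked in any language that uses doubles. A small Python sketch, offered only as an illustration of the number format (it says nothing about why rat hangs in GCL):

```python
import math
import sys

smallest_normal = sys.float_info.min          # about 2.2e-308
smallest_subnormal = math.ldexp(1.0, -1074)   # about 4.9e-324, the last step before zero

print(smallest_normal)
print(smallest_subnormal)
print(1e-323 > 0.0)    # still representable, but only as a subnormal
print(1e-324 == 0.0)   # below half of the smallest subnormal, so it parses as zero
```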

First you say "I want to know when a polynomial is exactly going to zero." And then you say "if a coefficient in a polynomial drops below a threshold, I want that terms to be completely thrown out of the polynomial". So you don't want the polynomial to be exactly zero, you want it to be zero within some threshold (relative? absolute?).
I'm afraid I'm not familiar with the Souriau-Frame Drazin algorithm, but looking at the Greville paper about it, it seems that all the calculations are rational (no square roots etc.), so I wonder if it's feasible to perform your calculations with completely exact rational numbers instead of using floating-point numbers. Then presumably exact means exact, and you don't need to worry about thresholds at all.
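To illustrate the difference between "zero within a threshold" and "exactly zero", here is a tiny Python sketch using the standard fractions module. It is not the Souriau-Frame algorithm, just a demonstration that the same arithmetic leaves a residue in floats and none in exact rationals.

```python
from fractions import Fraction

# The same computation in binary floating point and in exact rationals.
float_residue = 0.1 + 0.2 - 0.3
exact_residue = Fraction(1, 10) + Fraction(2, 10) - Fraction(3, 10)

print(float_residue == 0.0)   # False: a tiny residue is left over
print(exact_residue == 0)     # True: exact arithmetic lands on zero
```

In Maxima the analogous approach would be to enter the matrix entries as rationals from the start, so that no float ever enters the iteration.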


How to fix skew trapezoidal distribution sampling output sample size

I am trying to generate a skewed trapezoidal distribution using inverse transform sampling.
The inputs are the values where the ramps start and end (a, b, c, d) and the sample size.
a=-3;b=-1;c=1;d=8;
SampleSize=10e4;
h=2/(d+c-a-b);
Then I calculate the ratio of the length of ramps and flat components to get sample size for each:
firstramp=round(((b-a)/(d-a)),3);
flat=round((c-b)/(d-a),3);
secondramp=round((d-c)/(d-a),3);
n1=firstramp*SampleSize; %sample size for first ramp
n3=secondramp*SampleSize; %sample size for second ramp
n2=flat*SampleSize;
And then finally I get the histogram from the following code:
quartile1=h/2*(b-a);
quartile2=1-h/2*(d-c);
y1=linspace(0,quartile1,n1);
y2=linspace(quartile1,quartile2,n2);
y3=linspace(quartile2,1,n3);
%inverse cumulative distribution functions
invcdf1=a+sqrt(2*(b-a)/h)*sqrt(y1);
invcdf2=(a+b)/2+y2/h;
invcdf3=d-sqrt(2*(d-c)/h)*sqrt(1-y3);
distr=[invcdf1 invcdf2 invcdf3];
histogram(distr,100)
However the sampling of ramps and flat components are not equal, looks like this:
I fixed this by trial and error, by reducing the sample size of the ramps by half:
n1=0.5*firstramp*SampleSize; %sample size for first ramp
n3=0.5*secondramp*SampleSize; %sample size for second ramp
n2=flat*SampleSize;
This made the distribution look like this:
However this makes the output sample less than what is given in input.
I've also tried different combinations of changing the sample sizes of ramps and flat.
This also works:
n1=0.75*firstramp*SampleSize; %sample size for first ramp
n3=0.75*secondramp*SampleSize; %sample size for second ramp
n2=1.5*flat*SampleSize;
It increases the output samples, but it's still not close.
Any help will be appreciated.
Full code:
a=-3;b=-1;c=1;d=8;
SampleSize=10e4;%*1.33333333333333;
h=2/(d+c-a-b);
firstramp=round(((b-a)/(d-a)),3);
flat=round((c-b)/(d-a),3);
secondramp=round((d-c)/(d-a),3);
n1=firstramp*SampleSize; %sample size for first ramp
n3=secondramp*SampleSize; %sample size for second ramp
n2=flat*SampleSize;
quartile1=h/2*(b-a);
quartile2=1-h/2*(d-c);
y1=linspace(0,quartile1,.75*n1);
y2=linspace(quartile1,quartile2,1.5*n2);
y3=linspace(quartile2,1,.75*n3);
%inverse cumulative distribution functions
invcdf1=a+sqrt(2*(b-a)/h)*sqrt(y1);
invcdf2=(a+b)/2+y2/h;
invcdf3=d-sqrt(2*(d-c)/h)*sqrt(1-y3);
distr=[invcdf1 invcdf2 invcdf3];
histogram(distr,100)
%end
I don't know Matlab so I was hoping somebody else would jump in on this, but since nobody did here goes.
If I'm reading your code correctly what you did is not an inversion. Inversion is 1-1, i.e., one uniform input produces one outcome. You seem to be using a technique known as the "composition method". In composition the overall distribution is comprised of component pieces, each of which is straightforward to generate. You choose which component to generate from based on their proportions/probabilities relative to the whole. For density functions, probability is found as the area under the density curve, so your first mistake was in sampling the components relative to the width of each component rather than using their areas. The correct sampling proportions are 2/13, 4/13, and 7/13 for what you designated the firstramp, flat, and secondramp components, respectively. A second mistake (which is relatively minor) was to assign exact sample sizes to each of the components. Having probability 2/13 does not mean that exactly 2*SampleSize/13 of your samples will be from the firstramp, it means that's the expected sample size for that component. The expected value of a random variate is not necessarily (or even likely to be) the outcome you will actually get.
In pseudocode, the composition approach would be
generate U ~ Uniform(0,1)
if U <= 2/13:
    generate and return a value from firstramp
else if U <= 6/13:
    generate and return a value from flat
else:
    generate and return a value from secondramp
Note that since each of the generate options will use one or more uniforms, and choosing between the options requires a uniform U, this is not an inversion.
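A minimal Python sketch of the composition approach, assuming the question's break points a=-3, b=-1, c=1, d=8. The way each component is generated (a square root of a second uniform for the ramps) is my own choice for the sketch, not something taken from the question's code.

```python
import random

A, B, C, D = -3.0, -1.0, 1.0, 8.0   # break points from the question


def sample_composition(rng):
    """Pick a component by its area, then draw a value from that component."""
    u = rng.random()
    if u <= 2 / 13:
        # rising ramp on [A, B]: density grows linearly away from A
        return A + (B - A) * rng.random() ** 0.5
    elif u <= 6 / 13:
        # flat section on [B, C]
        return rng.uniform(B, C)
    else:
        # falling ramp on [C, D]: density shrinks linearly towards D
        return D - (D - C) * rng.random() ** 0.5


rng = random.Random(0)
samples = [sample_composition(rng) for _ in range(10000)]
first_ramp_share = sum(s < B for s in samples) / len(samples)
print(first_ramp_share)   # should land near 2/13, but not exactly on it
```

The total sample count is exactly what was requested; only the split between components is random.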
If you want an actual inversion, you need to quantify your density, integrate it to get the cumulative distribution function, then apply the inversion technique by setting F(X) = U and solving for X. Since your distribution is made of distinct components, both the density and the cumulative distribution function will be piecewise functions.
After deriving the height based on the requirement that the areas of the two triangles and the flat section must add up to 1, I came up with the following for your density:
       | (x + 3) / 13         -3 <= x <= -1
       |
f(x) = | 2 / 13               -1 <= x <= 1
       |
       | 2 * (8 - x) / 91      1 <= x <= 8
Integrating this and collecting terms produces the CDF:
       | (x + 3)**2 / 26                    -3 <= x <= -1
       |
F(x) = | (2 + x) * 2 / 13                   -1 <= x <= 1
       |
       | 6 / 13 + [49 - (x - 8)**2] / 91     1 <= x <= 8
Finally, determining the values of F(x) at the break points between the segments and applying inversion yields the following pseudocode algorithm:
generate U ~ Uniform(0,1)
if U <= 2 / 13:
    return 2 * sqrt( (13 * U) / 2 ) - 3
else if U <= 6 / 13:
    return (13 * U) / 2 - 2
else:
    return 8 - sqrt( 91 * (1 - U) )
Note that this is a true inversion. The outcome is determined by generating a single U, and transforming it in different ways depending on which range it falls in.
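Since I don't know Matlab, here is the same inversion as a runnable Python sketch. The constants are specific to a=-3, b=-1, c=1, d=8, and the function name trapezoid_inverse_cdf is just a label chosen for this example.

```python
import math
import random


def trapezoid_inverse_cdf(u):
    """Map one uniform u in [0, 1] to one sample of the trapezoid on [-3, 8]."""
    if u <= 2 / 13:
        return 2 * math.sqrt(13 * u / 2) - 3
    elif u <= 6 / 13:
        return 13 * u / 2 - 2
    else:
        return 8 - math.sqrt(91 * (1 - u))


rng = random.Random(0)
samples = [trapezoid_inverse_cdf(rng.random()) for _ in range(100000)]
print(len(samples), min(samples), max(samples))
```

Each uniform yields exactly one sample, so the output size always equals the requested sample size.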

Binary multiplication with addition

I have an exercise where I am supposed to multiply an unknown number by 5. The number is at most 8 bits and stored in 2's complement. So if the number is 5 the result should be 25, and if it's -12 it should be -60.
My only tools are load a register with a bit pattern and add two bit patterns together and store the result. I'm only allowed to use addition 3 times in my program.
With a known number I think I could solve it by adding the appropriate bit pattern to make it 5 times bigger but will that work with an unknown number?
Ex. I have the bit pattern 0001010 representing 10 knowing I want 50 as the result I would add the bit pattern 0101000 to get the resulting 0110010 which is 50.
Can someone please give me a hint how I can solve this problem thank you!
You can compute 5x=2x+2x+x using three additions. Something like this:
y = x + x // y = 2x
z = y + y // z = 2y = 4x
r = z + x // r = 4x + x = 5x
If x is the unknown value, then at the end of this code, r is x * 5.
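To check that this also works for negative inputs, here is a small Python simulation of an 8-bit register. The helper names add8 and to_signed8 are made up for the sketch; the only operation applied to the bit patterns is addition, used three times.

```python
def to_signed8(pattern):
    """Interpret an 8-bit pattern (0..255) as a two's complement value."""
    return pattern - 256 if pattern & 0x80 else pattern


def add8(p, q):
    """Add two 8-bit patterns, discarding any carry out of bit 7."""
    return (p + q) & 0xFF


def times5(x):
    p = x & 0xFF       # load the register with x's two's complement bit pattern
    y = add8(p, p)     # 2x
    z = add8(y, y)     # 4x
    r = add8(z, p)     # 5x
    return to_signed8(r)


print(times5(5), times5(-12), times5(10))
```

The result is only meaningful while 5x still fits in 8 bits (-128 to 127); beyond that the pattern wraps around.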

Square root calculation using continued fractions to n bits of precision

This is an unsolved problem from my past arbitrary-precision rational numbers C++ assignment.
For calculation, I used this expression from Wikipedia (a being the initial guess, r being its remainder):
I ended up, just by guessing from experiments, with this approach:
Use an integer square root function on the numerator/denominator, use that as the guess
Iterate the continued fraction until the binary length of the denominator was at least the target precision
This worked well enough to get me through the official tests, however, from my testing, the precision was too high (sometimes almost double) – i.e. the code was inefficient – and I had no proof it worked on any input (and hence no confidence in the code).
A simplified excerpt from the code (natural/rational store arbitrary length numbers, assume all operations return fractions in their simplest form):
rational sqrt(rational input, int precision) {
    rational guess(isqrt(input.numerator), isqrt(input.denominator)); // a
    rational remainder = input - power(guess, 2); // r
    rational result = guess;
    rational expansion; // assumed to default-construct to zero
    while (result.denominator.size() <= precision) {
        expansion = remainder / (2 * guess + expansion);
        result = guess + expansion;
        // Handle rational results
        if (power(result, 2) == input) {
            break;
        }
    }
    return result;
}
Can it be done better? If so, how?
Square roots can easily and very accurately be calculated by the General Continued Fractions (GCF). Being general means it can have any positive number as the numerator in contrast to the Regular or Simple Continued Fractions (RCF) where the numerators are all 1s. In order to comprehend the answer as a whole, it is best to start from the beginning.
To find the square root of any positive number n with a GCF, write √n = a + x, where a is the integral part and x is the continued fractional part:
√n = a + x ⇒ n = a^2 + 2ax + x^2 ⇒ n − a^2 = x(2a + x) ⇒ x = (n − a^2) / (2a + x)
At this point you have a GCF, since x sits in the denominator, and once you replace x with its definition you get an indefinitely extending definition of x. Regarding a, you are free to choose it among the integers that are less than √n. So if you want to find √11, then a can be chosen among 1, 2 or 3. However, it's always better to choose the biggest one in order to be able to simplify the GCF into an RCF at the next stage.
Remember that x = (n − a^2) / (2a + x) and n = 11 and a = 3. Now if we write the first two terms then we may simplify the GCF to RCF with all numerators as 1.
x = 2/(6 + x) ⇒ 2/(6 + 2/(6 + x)) ⇒ divide both numerator and denominator by 2 ⇒ 1/(3 + 1/(6 + x)) = x
Accordingly our RCF for √11 is;
√11 = 3 + x ⇒ 3 + 1/(3 + 1/(6 + 1/(3 + 1/(6 + ...)))) = [3; 3, 6], with the block 3, 6 repeating
Notice the coefficient notation [3; 3, 6, 3, 6, ...] which in this particular case resembles an infinite array. This is how RCF's are expressed in coefficient notation, the first item being the a and the tail after ; are the RCF coefficients of x. These two are sufficient since we already know that in RCF all numerators are fixed to 1.
Coming back to your precision question. You now have √11 = 3 + x, with the RCF [3;3,6,3,6,3,6...]. Normally you can pick a depth and reduce from the right, as in [3,3,6,3,6,3,6...].reduceRight((p,c) => c + 1/p) in JS. Is the result not precise enough? Then try again from another depth. This is in fact how it is described in the linked Wikipedia topic, as bottom up. However, it is much more efficient to go from left to right (top to bottom), calculating the intermediate convergents one after the other in a single pass. Every next intermediate convergent yields better precision for you to test and decide whether to stop or continue. When you reach a coefficient that is sufficient, just stop there. Having said that, once you reach the desired coefficient you may still do some fine tuning by increasing or decreasing that coefficient. Decreasing the coefficients at even indices or increasing the ones at odd indices would decrease the convergent and vice versa.
So in order to be able to do a left to right (top to bottom) analysis there is a special rule as
n2/d2 = (xn * n1 + n0)/(xn * d1 + d0)
We need to know last two interim convergents (n0/d0 and n1/d1) along with the current coefficient xn in order to be able calculate the next convergent (n2/d2).
We will start with two initial convergents as Infinity (n0/d0 = 1/0) and the a that we've chosen above (Remember √n = a + x) which is 3 so (n1/d1 = 3/1). Knowing that the 3 before the semicolon is in fact a, our first xn is the 3 right after the semicolon in our coefficients array [3;»» 3 ««,6,3,6,3,6...].
After we calculate n2/d2 and do our test, if need be, for the next step we will shift our convergents to the left so that we have the last two ready to calculate the next convergent. n0/d0 <- n1/d1 <- n2/d2
Here I present the table for the n2/d2 = (xn * n1 + n0)/(xn * d1 + d0) rule.
n0/d0    n1/d1     xn   index   n2/d2      decimal val.
_____    ______    __   _____   ________   ____________
1/0      3/1       3    1 odd   10/3       3.33333333..
3/1      10/3      6    2 evn   63/19      3.31578947..
10/3     63/19     3    3 odd   199/60     3.31666666..
63/19    199/60    6    4 evn   1257/379   3.31662269..
.        .         .    .       .          .
.        .         .    .       .          .
So as you may notice, we are very quickly approaching √11, which is 3.31662479... Note that the odd indices overshoot and the evens undershoot due to cascading reciprocals. Since √11 is irrational, this will continue converging indefinitely until we say enough.
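The left-to-right rule can be sketched in a few lines of Python. This is only an illustration of the recurrence with plain integers; the function name convergents is chosen for this example, and a real arbitrary-precision version would stop once the denominator has enough bits.

```python
from itertools import cycle, islice


def convergents(a, coefficients):
    """Yield successive convergents (n2, d2) of [a; c1, c2, ...] left to right."""
    n0, d0 = 1, 0          # the "infinity" convergent 1/0
    n1, d1 = a, 1          # the integral part a/1
    for xn in coefficients:
        n2, d2 = xn * n1 + n0, xn * d1 + d0
        yield n2, d2
        n0, d0, n1, d1 = n1, d1, n2, d2   # shift the window to the left


# First four convergents of sqrt(11) = [3; 3, 6, 3, 6, ...]
for n, d in convergents(3, islice(cycle([3, 6]), 4)):
    print(f"{n}/{d} = {n / d:.8f}")
```

Because the generator is lazy, the caller can test each convergent and stop as soon as the precision is sufficient.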
Remember, as mentioned earlier, once you reach to the desired coefficient you may still do some fine tuning by increasing or decreasing that coefficient (xn). Decreasing the coefficients at even indices or increasing the ones at odd indices would decrease the convergent and vice versa.
The problem here is, not all √n can simply be turned into RCF by a simple division as shown above. For a more generalized way to generate RCF from any √n you may check a more recent answer of mine.

convert real number to radicals

Suppose I have a real number. I want to approximate it with something of the form a+sqrt(b) for integers a and b. But I don't know the values of a and b. Of course I would prefer to get a good approximation with small values of a and b. Let's leave it undefined for now what is meant by "good" and "small". Any sensible definitions of those terms will do.
Is there a sane way to find them? Something like the continued fraction algorithm for finding fractional approximations of decimals. For more on the fractions problem, see here.
EDIT: To clarify, it is an arbitrary real number. All I have are a bunch of its digits. So depending on how good of an approximation we want, a and b might or might not exist. Brute force is naturally not a particularly good algorithm. The best I can think of would be to start adding integers to my real, squaring the result, and seeing if I come close to an integer. Pretty much brute force, and not a particularly good algorithm. But if nothing better exists, that would itself be interesting to know.
EDIT: Obviously b has to be zero or positive. But a could be any integer.
No need for continued fractions; just calculate the square-root of all "small" values of b (up to whatever value you feel is still "small" enough), remove everything before the decimal point, and sort/store them all (along with the b that generated it).
Then when you need to approximate a real number, find the radical whose decimal portion is closest to the real number's decimal portion. This gives you b - choosing the correct a is then a simple matter of subtraction.
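A rough Python sketch of that table lookup, assuming "small" means b up to some bound bmax. The function names build_table and approximate are invented for the sketch, and it ignores the wrap-around case where a fractional part near 0 is actually closest to one near 1.

```python
import bisect
import math


def build_table(bmax):
    """Sorted list of (fractional part of sqrt(b), b) for b = 1 .. bmax."""
    table = []
    for b in range(1, bmax + 1):
        root = math.sqrt(b)
        table.append((root - math.floor(root), b))
    table.sort()
    return table


def approximate(r, table):
    """Return (a, b) such that a + sqrt(b) has nearly the same fractional part as r."""
    frac = r - math.floor(r)
    i = bisect.bisect_left(table, (frac, 0))
    candidates = table[max(i - 1, 0): i + 1]      # the neighbours on either side
    _, b = min(candidates, key=lambda entry: abs(entry[0] - frac))
    a = math.floor(r) - math.isqrt(b)             # subtract the integer part of sqrt(b)
    return a, b


print(approximate(math.pi, build_table(99)))
```

The table costs O(bmax) space once; each query is then a binary search.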
This is actually more of a math problem than a computer problem, but to answer the question I think you are right that you can use continued fractions. What you do is first represent the target number as a continued fraction. For example, if you want to approximate pi (3.14159265) then the CF is:
3: 7, 15, 1, 288, 1, 2, 1, 3, 1, 7, 4 ...
The next step is create a table of CFs for square roots, then you compare the values in the table to the fractional part of the target value (here: 7, 15, 1, 288, 1, 2, 1, 3, 1, 7, 4...). For example, let's say your table had square roots for 1-99 only. Then you would find the closest match would be sqrt(51) which has a CF of 7: 7,14 repeating. The 7,14 is the closest to pi's 7,15. Thus your answer would be:
sqrt(51)-4
This is the closest approximation given b < 100, and it is off by about 0.00016. If you allow larger b's then you could get a better approximation.
The advantage of using CFs is that it is faster than working in, say, doubles or using floating point. For example, in the above case you only have to compare two integers (7 and 15), and you can also use indexing to make finding the closest entry in the table very fast.
This can be done using mixed integer quadratic programming very efficiently (though there are no run-time guarantees as MIQP is NP-complete.)
Define:
d := the real number you wish to approximate
b, a := two integers such that a + sqrt(b) is as "close" to d as possible
r := (d - a)^2 - b, is the residual of the approximation
The goal is to minimize r. Setup your quadratic program as:
x := [ s b t ]
D := | 1 0 0 |
| 0 0 0 |
| 0 0 0 |
c := [0 -1 0]^T
with the constraint that s - t = f (where f is the fractional part of d)
and b,t are integers (s is not)
This is a convex (therefore optimally solvable) mixed integer quadratic program since D is positive semi-definite.
Once s,b,t are computed, simply derive the answer using b=b, s=d-a and t can be ignored.
Your problem may be NP-complete, it would be interesting to prove if so.
Some of the previous answers use methods that are of time or space complexity O(n), where n is the largest “small number” that will be accepted. By contrast, the following method is O(sqrt(n)) in time, and O(1) in space.
Suppose that positive real number r = x + y, where x=floor(r) and 0 ≤ y < 1. We want to approximate r by a number of the form a + √b. If x+y ≈ a+√b then x+y-a ≈ √b, so √b ≈ h+y for some integer offset h, and b ≈ (h+y)^2. To make b an integer, we want to minimize the fractional part of (h+y)^2 over all eligible h. There are at most √n eligible values of h. See following python code and sample output.
import math, random
def findb(y, rhi):
    bestb = loerror = 1
    for r in range(2,rhi):
        v = (r+y)**2
        u = round(v)
        err = abs(v-u)
        if round(math.sqrt(u))**2 == u: continue
        if err < loerror:
            bestb, loerror = u, err
    return bestb
#random.seed(123456) # set a seed if testing repetitively
f = [math.pi-3] + sorted([random.random() for i in range(24)])
print (' frac sqrt(b) error b')
for frac in f:
    b = findb(frac, 12)
    r = math.sqrt(b)
    t = math.modf(r)[0] # Get fractional part of sqrt(b)
    print ('{:9.5f} {:9.5f} {:11.7f} {:5.0f}'.format(frac, r, t-frac, b))
(Note 1: This code is in demo form; the parameters to findb() are y, the fractional part of r, and rhi, the square root of the largest small number. You may wish to change usage of parameters. Note 2: The
if round(math.sqrt(u))**2 == u: continue
line of code prevents findb() from returning perfect-square values of b, except for the value b=1, because no perfect square can improve upon the accuracy offered by b=1.)
Sample output follows. About a dozen lines have been elided in the middle. The first output line shows that this procedure yields b=51 to represent the fractional part of pi, which is the same value reported in some other answers.
frac sqrt(b) error b
0.14159 7.14143 -0.0001642 51
0.11975 4.12311 0.0033593 17
0.12230 4.12311 0.0008085 17
0.22150 9.21954 -0.0019586 85
0.22681 11.22497 -0.0018377 126
0.25946 2.23607 -0.0233893 5
0.30024 5.29150 -0.0087362 28
0.36772 8.36660 -0.0011170 70
0.42452 8.42615 0.0016309 71
...
0.93086 6.92820 -0.0026609 48
0.94677 8.94427 -0.0024960 80
0.96549 11.95826 -0.0072333 143
0.97693 11.95826 -0.0186723 143
With the following code added at the end of the program, the output shown below also appears. This shows closer approximations for the fractional part of pi.
frac, rhi = math.pi-3, 16
print (' frac sqrt(b) error b bMax')
while rhi < 1000:
    b = findb(frac, rhi)
    r = math.sqrt(b)
    t = math.modf(r)[0] # Get fractional part of sqrt(b)
    print ('{:11.7f} {:11.7f} {:13.9f} {:7.0f} {:7.0f}'.format(frac, r, t-frac, b,rhi**2))
    rhi = 3*rhi//2 # integer division keeps rhi usable in range() under Python 3
frac sqrt(b) error b bMax
0.1415927 7.1414284 -0.000164225 51 256
0.1415927 7.1414284 -0.000164225 51 576
0.1415927 7.1414284 -0.000164225 51 1296
0.1415927 7.1414284 -0.000164225 51 2916
0.1415927 7.1414284 -0.000164225 51 6561
0.1415927 120.1415831 -0.000009511 14434 14641
0.1415927 120.1415831 -0.000009511 14434 32761
0.1415927 233.1415879 -0.000004772 54355 73441
0.1415927 346.1415895 -0.000003127 119814 164836
0.1415927 572.1415909 -0.000001786 327346 370881
0.1415927 911.1415916 -0.000001023 830179 833569
I do not know if there is any kind of standard algorithm for this kind of problem, but it does intrigue me, so here is my attempt at developing an algorithm that finds the needed approximation.
Call the real number in question r. Then, first I assume that a can be negative, in that case we can reduce the problem and now only have to find a b such that the decimal part of sqrt(b) is a good approximation of the decimal part of r. Let us now write r as r = x.y with x being the integer and y the decimal part.
Now:
b = r^2
= (x.y)^2
= (x + .y)^2
= x^2 + 2 * x * .y + .y^2
= 2 * x * .y + .y^2 (mod 1)
We now only have to find an x such that 0 = .y^2 + 2 * x * .y (mod 1) (approximately).
Filling that x into the formulas above we get b = round((x + .y)^2) and can then calculate a as a = r - sqrt(b), rounded to the nearest integer. (All of these calculations have to be carefully rounded of course.)
Now, for the time being I am not sure if there is a way to find this x without brute forcing it. But even then, one can simply use a loop to find an x that is good enough.
I am thinking of something like this(semi pseudo code):
max_diff_low = 0.01 // arbitrary accuracy
max_diff_high = 1 - max_diff_low
y = r % 1
v = y^2
addend = 2 * y
x = 0
while (v < max_diff_high && v > max_diff_low)
    x++;
    v = (v + addend) % 1
c = (x + y) ^ 2
b = round(c)
a = round(r - sqrt(c)) // r - (x + y), i.e. floor(r) - x
Now, I think this algorithm is fairly efficient, while even allowing you to specify the wished accuracy of the approximation. One thing that could be done that would turn it into an O(1) algorithm is calculating all the x and putting them into a lookup table. If one only cares about the first three decimal digits of r(for example), the lookup table would only have 1000 values, which is only 4kb of memory(assuming that 32bit integers are used).
Hope this is helpful at all. If anyone finds anything wrong with the algorithm, please let me know in a comment and I will fix it.
EDIT:
Upon reflection I retract my claim of efficiency. There is in fact as far as I can tell no guarantee that the algorithm as outlined above will ever terminate, and even if it does, it might take a long time to find a very large x that solves the equation adequately.
One could maybe keep track of the best x found so far and relax the accuracy bounds over time to make sure the algorithm terminates quickly, at the possible cost of accuracy.
These problems are of course non-existent, if one simply pre-calculates a lookup table.

Generating strongly biased random numbers for tests

I want to run tests with randomized inputs and need to generate 'sensible' random
numbers, that is, numbers that match good enough to pass the tested function's
preconditions, but hopefully wreak havoc deeper inside its code.
math.random() (I'm using Lua) produces uniformly distributed random
numbers. Scaling these up will give far more big numbers than small numbers,
and there will be very few integers.
I would like to skew the random numbers (or generate new ones using the old
function as a randomness source) in a way that strongly favors 'simple' numbers,
but will still cover the whole range, i.e., extending up to positive/negative infinity
(or ±1e309 for double). This means:
numbers up to, say, ten should be most common,
integers should be more common than fractions,
numbers ending in 0.5 should be the most common fractions,
followed by 0.25 and 0.75; then 0.125,
and so on.
A different description: Fix a base probability x such that probabilities
will sum to one and define the probability of a number n as x^k
where k is the generation in which n is constructed as a surreal
number [1]. That assigns x to 0, x^2 to -1 and +1,
x^3 to -2, -1/2, +1/2 and +2, and so on. This
gives a nice description of something close to what I want (it skews a bit too
much), but is near-unusable for computing random numbers. The resulting
distribution is nowhere continuous (it's fractal!), I'm not sure how to
determine the base probability x (I think for infinite precision it would be
zero), and computing numbers based on this by iteration is awfully
slow (spending near-infinite time to construct large numbers).
Does anyone know of a simple approximation that, given a uniformly distributed
randomness source, produces random numbers very roughly distributed as
described above?
I would like to run thousands of randomized tests, quantity/speed is more
important than quality. Still, better numbers mean less inputs get rejected.
Lua has a JIT, so performance is usually not much of an issue. However, jumps based
on randomness will break every prediction, and many calls to math.random()
will be slow, too. This means a closed formula will be better than an
iterative or recursive one.
1 Wikipedia has an article on surreal numbers, with
a nice picture. A surreal number is a pair of two surreal
numbers, i.e. x := {n|m}, and its value is the number in the middle of the
pair, i.e. (for finite numbers) {n|m} = (n+m)/2 (as rational). If one side
of the pair is empty, that's interpreted as increment (or decrement, if right
is empty) by one. If both sides are empty, that's zero. Initially, there are
no numbers, so the only number one can build is 0 := { | }. In generation
two one can build numbers {0| } =: 1 and { |0} =: -1, in three we get
{1| } =: 2, {|1} =: -2, {0|1} =: 1/2 and {-1|0} =: -1/2 (plus some
more complex representations of known numbers, e.g. {-1|1} = 0). Note that
e.g. 1/3 is never generated by finite numbers because it is an infinite
fraction – the same goes for floats, 1/3 is never represented exactly.
How's this for an algorithm?
Generate a random float in (0, 1) with a library function
Generate a random integral roundoff point according to a desired probability density function (e.g. 0 with probability 0.5, 1 with probability 0.25, 2 with probability 0.125, ...).
'Round' the float by that roundoff point (e.g. floor((float_val << roundoff)+0.5))
Generate a random integral exponent according to another PDF (e.g. 0, 1, 2, 3 with probability 0.1 each, and decreasing thereafter)
Multiply the rounded float by 2^exponent.
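The steps above can be sketched as follows. I'm writing it in Python rather than Lua, and both distributions are assumptions made for the sketch: the roundoff point follows the 0.5, 0.25, 0.125, ... example, while the exponent simply reuses a geometric draw with a different parameter instead of the exact PDF suggested above.

```python
import math
import random


def round_to_bits(value, bits):
    """Round value to the nearest multiple of 2**-bits (halves round up)."""
    scale = 2 ** bits
    return math.floor(value * scale + 0.5) / scale


def geometric(rng, p=0.5):
    """Return k >= 0 with probability p * (1 - p)**k."""
    k = 0
    while rng.random() >= p:
        k += 1
    return k


def simple_random(rng):
    mantissa = round_to_bits(rng.random(), geometric(rng))  # 0, 1, 2, ... fractional bits
    exponent = geometric(rng, 0.25)                         # hypothetical exponent PDF
    return mantissa * 2 ** exponent


rng = random.Random(0)
print([simple_random(rng) for _ in range(10)])
```

This version only produces non-negative numbers and uses loops for the geometric draws, so it is a starting point rather than the closed formula the question asks for.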
For a surreal-like decimal expansion, you need a random binary number.
Even bits tell you whether to stop or continue, odd bits tell you whether to go right or left on the tree:
> 0... => 0.0 [50%] Stop
> 100... => -0.5 [<12.5%] Go, Left, Stop
> 110... => 0.5 [<12.5%] Go, Right, Stop
> 11100... => 0.25 [<3.125%] Go, Right, Go, Left, Stop
> 11110... => 0.75 [<3.125%] Go, Right, Go, Right, Stop
> 1110100... => 0.125
> 1110110... => 0.375
> 1111100... => 0.625
> 1111110... => 0.875
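The table can be reproduced with a short decoder. This is a Python sketch of my reading of the scheme (bit pairs of "go/stop" then "right/left", starting with a step of 0.5 and halving each time); the function name decode_go_stop is made up for the example.

```python
def decode_go_stop(bits):
    """Decode a string of '0'/'1' characters read as (go, direction) pairs."""
    value, step = 0.0, 0.5
    for i in range(0, len(bits) - 1, 2):
        if bits[i] == "0":                     # stop bit: ignore everything after it
            break
        value += step if bits[i + 1] == "1" else -step   # 1 = right, 0 = left
        step /= 2
    return value


for pattern in ("0", "100", "110", "11100", "11110", "1110100", "1111110"):
    print(pattern, decode_go_stop(pattern))
```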
One way to quickly generate a random binary number is to look at the decimal digits of math.random() and replace 0-4 with '0' and 5-9 with '1':
0.8430419054348022
becomes
1000001010001011
which becomes -0.5
0.5513009827118367
becomes
1100001101001011
which becomes 0.5
etc
Haven't done much lua programming, but in Javascript you can do:
Math.random().toString().substring(2).split("").map(
function(digit) { return digit >= "5" ? 1 : 0 }
);
or true binary expansion:
Math.random().toString(2).substring(2)
Not sure which is more genuinely "random" -- you'll need to test it.
You could generate surreal numbers in this way, but most of the results will be decimals in the form a/2^b, with relatively few integers. On Day 3, only 2 integers are produced (-3 and 3) vs. 6 decimals, on Day 4 it is 2 vs. 14, and on Day n it is 2 vs (2^n-2).
If you add two uniform random numbers from math.random(), you get a new distribution which has a "triangle" like distribution (linearly decreasing from the center). Adding 3 or more will get a more 'bell curve' like distribution centered around 0:
math.random() + math.random() + math.random() - 1.5
Dividing by a random number will get a truly wild number:
A/(math.random()+1e-300)
This will return results between A and (theoretically) A*1e+300,
though my tests show that 50% of the time the results are between A and 2*A
and about 75% of the time between A and 4*A.
Putting them together, we get:
round(6*(math.random()+math.random()+math.random() - 1.5)/(math.random()+1e-300))
This has over 70% of the number returned between -9 and 9 with a few big numbers popping up rarely.
Note that the average and sum of this distribution will tend to diverge towards a large negative or positive number, because the more times you run it, the more likely it is for a small number in the denominator to cause the number to "blow up" to a large number such as 147,967 or -194,137.
See gist for sample code.
Josh
You can immediately calculate the nth born surreal number.
Example, the 1000th Surreal number is:
convert to binary:
1000 dec = 1111101000 bin
1's become pluses and 0's minuses:
1111101000
+++++-+---
The first '1' bit is 0 value, the next set of similar numbers is +1 (for 1's) or -1 (for 0's), then the value is 1/2, 1/4, 1/8, etc for each subsequent bit.
1 1 1 1 1 0 1 0 0 0
+ + + + + - + - - -
0 1 1 1 1 h h h h h
+0+1+1+1+1-1/2+1/4-1/8-1/16-1/32
= 3+17/32
= 113/32
= 3.53125
The binary length in bits of this representation is equal to the day on which that number was born.
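The rule above is short enough to write down directly. Here is a Python sketch using exact fractions; nth_surreal is just a name picked for this example.

```python
from fractions import Fraction


def nth_surreal(n):
    """Value of the n-th born surreal number (n >= 1), read off the bits of n."""
    signs = bin(n)[3:]                 # drop '0b' and the leading 1, which is worth 0
    value = Fraction(0)
    i = 0
    # The initial run of identical bits counts +1 (for 1's) or -1 (for 0's) each.
    while i < len(signs) and signs[i] == signs[0]:
        value += 1 if signs[0] == "1" else -1
        i += 1
    # Every remaining bit adds or subtracts 1/2, 1/4, 1/8, ...
    step = Fraction(1, 2)
    while i < len(signs):
        value += step if signs[i] == "1" else -step
        step /= 2
        i += 1
    return value


print(nth_surreal(1000))                       # 113/32, as in the worked example
print([nth_surreal(n) for n in range(1, 8)])
```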
Left and right numbers of a surreal number are the binary representation with its tail stripped back to the last 0 or 1 respectively.
Surreal numbers have an even distribution between -1 and 1 where half of the numbers created to a particular day will exist. 1/4 of the numbers exists evenly distributed between -2 to -1 and 1 to 2 and so on. The max range will be negative to positive integers matching the number of days you provide. The numbers go to infinity slowly because each day only adds one to the negative and positive ranges and days contain twice as many numbers as the last.
Edit:
A good name for this bit representation is "sinary"
Negative numbers are transpositions. ex:
100010101001101s -> negative number (always start 10...)
111101010110010s -> positive number (always start 01...)
and we notice that all bits flip except the first one, which is a transposition.
NaN is => 0s (since all other numbers start with 1), which makes it ideal for representation in bit registers in a computer since leading zeros are required (we don't make ternary computers anymore... too bad)
All Conway surreal algebra can be done on these numbers without needing to convert to binary or decimal.
The sinary format can be seen as a one plus a simple one's counter with a 2's complement decimal representation attached.
Here is an incomplete report on finary (similar to sinary): https://github.com/peawormsworth/tools/blob/master/finary/Fine%20binary.ipynb
