How to find the multiplier that produces a smaller output for every double value? - algorithm

How would you find the greatest positive IEEE-754 binary-64 value C such that every IEEE-754 product of a positive, normalized binary-64 value A with C is smaller than A?
I know it must be close to 0.999999... but I'd like to find exactly the greatest one.
Suppose round-to-nearest, ties to even.

There've been a couple of experimental approaches; here's a proof that C = 1 - ε, where ε is machine epsilon (that is, the distance between 1 and the smallest representable number greater than 1.)
We know that C < 1, of course, so it makes sense to try C = 1 - ε/2 because it's the next representable number smaller than 1. (The ε/2 is because C is in the [0.5, 1) bucket of representable numbers.) Let's see if it works for all A.
I'm going to assume in this paragraph that 1 <= A < 2. If both A and AC are in the "normal" region then it doesn't really matter what the exponent is, the situation will be the same with the exponent 2^0. Now, that choice of C obviously works for A=1, so we are left with the region 1 < A < 2. Looking at A = 1 + ε, we see that AC (the exact value, not the rounded result) is already greater than 1; and for A = 2 - ε we see that it's less than 2. That's important, because if AC is between 1 and 2, we know that the distance between AC and round(AC) (that is, rounding it to the nearest representable value) is at most ε/2. Now, if A - AC < ε/2, then round(AC) = A which we don't want. (If A - AC = ε/2 then it might round to A given the "ties to even" part of the normal FP rounding rules, but let's see if we can do better.) Since we've chosen C = 1 - ε/2, we can see that A - AC = A - A(1 - ε/2) = A * ε/2. Since that's greater than ε/2 (remember, A>1), it's far enough away from A to round away from it.
BUT! The one other value of A we have to check is the minimum representable normal value, since there AC is not in the normal range and so our "relative distance to nearest" rule doesn't apply. And what we find is that in that case A-AC is exactly half of machine epsilon in the region. "Round to nearest, ties to even" kicks in and the product rounds back up to equal A. Drat.
Going through the same thing with C = 1 - ε, we see that round(AC) < A, and that nothing else even comes close to rounding towards A (we end up asking whether A * ε > ε/2, which of course it is). So the punchline is that C = 1-ε/2 almost works but the boundary between normals and denormals screws us up, and C = 1-ε gets us into the end zone.
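The boundary case in this argument can be checked numerically. The following is a minimal Python sketch, not part of the original answer. It assumes Python floats are IEEE-754 binary64 with round-to-nearest, ties-to-even (the case on common platforms); the names c_half, c_eps and always_smaller are illustrative.

```python
import random
import sys

eps = sys.float_info.epsilon      # 2**-52, spacing of doubles just above 1.0
min_normal = sys.float_info.min   # 2**-1022, smallest positive normal double

c_half = 1.0 - eps / 2            # largest double below 1.0
c_eps = 1.0 - eps                 # the candidate from the proof

# At the smallest normal value, the exact product with c_half is a tie
# that rounds back up to A; with c_eps the product is strictly smaller.
print(min_normal * c_half == min_normal)   # True
print(min_normal * c_eps < min_normal)     # True

def always_smaller(c, trials=100_000, seed=0):
    """Spot-check (not a proof) that A*c < A for random normal A over all exponents."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.uniform(1.0, 2.0) * 2.0 ** rng.randint(-1022, 1022)
        if not a * c < a:
            return False
    return True

print(always_smaller(c_eps))
```

The random sweep only samples the normal range; it is the proof, not the sweep, that covers every A.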

Due to the nature of floating-point types, C will vary depending on how big the value of A is. You can use nextafter to get the largest value less than 1, which will be a rough value for C.
However, it's possible that if A is too large or too small, A*C will be the same as A. I'm not able to mathematically prove that nextafter(1.0, 0) will work for all possible A's, so I'm suggesting a solution like this:
#include <math.h> // nextafter

double largestCfor(double A)
{
    double C = nextafter(1.0, 0);
    while (C*A >= A) // step C down until the product is strictly smaller than A
        C = nextafter(C, 0);
    return C;
}
If you want a C value that works for any A, even if C*A might not be the largest possible value, then you'll need to check every exponent that the type can represent:
double C = 1;
for (double A = 0x1p-1022; isfinite(A); A *= 2) // loop through all possible exponents
{
    double c = largestCfor(A);
    if (c < C) C = c;
}
I've tried running on Ideone and got the result
C = 0.999999999999999777955395074969
nextafter(1.0, 0) = 0.999999999999999888977697537484
Edit:
0.999999999999999777955395074969 is 0x1.ffffffffffffep-1 which is also 1 - DBL_EPSILON. That aligns with Sneftel's proof above

Related

Computing the most accurate result for: the sum of a list of fixed point numbers

Let's say you have to add some 32-bit fixed point numbers stored in a huge array L and you'd like to get the most accurate result possible. Furthermore, you're not allowed to use anything other than L and the 32-bit fixed point numbers (i.e. you're not allowed to convert them to 64-bit). What would be your approach to get the most accurate result for the sum of the numbers in L?
This is my current approach, written in pseudocode:
L = sort(L)
result = 0
lastMax = false -- indicates whether we've extracted the maximum from L last time
while (not empty(L)) and (result not equals +INF or -INF) do:
    current = 0
    if lastMax:
        current = extractMin(L) -- gets and removes minimum from L
    else:
        current = extractMax(L) -- gets and removes maximum from L
    result = safeAdd(result, current)
    lastMax = not lastMax
safeAdd(a,b):
    if a = +INF: return +INF
    else if a = -INF: return -INF
    else: return a + b
So I'm alternating between adding the minimum and the maximum of the remaining list L in order to stay within the range of L. The way safeAdd is implemented shows that once we've left the representable range (i.e., a+b has yielded +INF or -INF, just as it's done in C), we will not alter the result anymore.
Do you have any suggestions on how to improve the approach?
Sidenote: If we want to be very precise: We further assume that the + operation can yield +INF or -INF which can be represented as fixed point numbers in the programming language. But we assume that the values +INF, -INF do not occur in L. And we ignore the fact the fixed point standard may also have a representation for NaN.
If you know that the result is in range, you don't need to care about overflow or underflow.
Here is an example with only 4 bits to keep it simple:
   7   0111
  +1   0001
   =   1000   <-- -8, overflow
  -1   0001
   =   0111
If you are not sure whether the result is in range, you need to count overflows and underflows.
For a = b + c
Overflow if b and c are positive and a is negative
Underflow if b and c are negative and a is positive
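The counting rule can be sketched in Python by simulating wraparound addition on the raw integer values of the fixed-point numbers. This is an illustrative sketch under that assumption, not the answerer's code; wrap and sum_with_overflow_count are hypothetical helper names, and zero is treated as non-negative when applying the sign rule.

```python
def wrap(value, bits):
    """Reduce an integer to the signed two's-complement range of `bits` bits."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half

def sum_with_overflow_count(values, bits=32):
    """Add raw fixed-point integers with wraparound, counting overflows and underflows."""
    total = 0
    carries = 0  # +1 per overflow, -1 per underflow
    for v in values:
        s = wrap(total + v, bits)
        if total >= 0 and v >= 0 and s < 0:
            carries += 1      # both operands non-negative, result negative: overflow
        elif total < 0 and v < 0 and s >= 0:
            carries -= 1      # both operands negative, result non-negative: underflow
        total = s
    return total, carries

print(sum_with_overflow_count([7, 1, -1], bits=4))   # (7, 0)
print(sum_with_overflow_count([7, 7, 7], bits=4))    # (5, 1)
```

The first call reproduces the 4-bit example: the overflow and the later underflow cancel, so the wrapped result is already correct. In the second call the exact sum is total + carries * 2**bits = 5 + 16 = 21.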

Implementing the square root method through successive approximation

Determining the square root through successive approximation is implemented using the following algorithm:
1. Begin by guessing that the square root is x / 2. Call that guess g.
2. The actual square root must lie between g and x/g. At each step in the successive approximation, generate a new guess by averaging g and x/g.
3. Repeat step 2 until the values of g and x/g are as close together as the precision of the hardware allows. In Java, the best way to check for this condition is to test whether the average is equal to either of the values used to generate it.
What really confuses me is the last statement of step 3. I interpreted it as follows:
private double sqrt(double x) {
    double g = x / 2;
    while(true) {
        double average = (g + x/g) / 2;
        if(average == g || average == x/g) break;
        g = average;
    }
    return g;
}
This seems to just cause an infinite loop. I am following the algorithm exactly: if the average equals either g or x/g (the two values used to generate it), then we have our answer?
Why would anyone ever use that approach, when they could simply use the formulas (2n)^2 = 4n^2 and (n + 1)^2 = n^2 + 2n + 1 to populate each bit in the mantissa, and divide the exponent by two, multiplying the mantissa by two iff the exponent mod two equals 1?
To check whether g and x/g are as close as the hardware allows, look at the relative difference and compare it with the epsilon for your floating-point format. If it is within a small integer multiple of epsilon, you are OK.
Relative difference of x and y, see https://en.wikipedia.org/wiki/Relative_change_and_difference
The epsilon for 32-bit IEEE floats is about 1.0e-7, as in one of the other answers here, but that answer used the absolute rather than the relative difference.
In practice, that means something like:
Math.abs(g-x/g)/Math.max(Math.abs(g),Math.abs(x/g)) < 3.0e-7
Never compare floating point values for equality. The result is not reliable.
Use an epsilon like so:
if(Math.abs(average-g) < 1e-7 || Math.abs(average-x/g) < 1e-7)
You can change the epsilon value to be whatever you need. Probably best is something related to the original x.
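Putting the relative-difference stopping rule together with the averaging step gives something like the following Python sketch. It is one possible reading of the answers above, not the textbook's method. The tolerance of four machine epsilons and the iteration cap are assumptions chosen for double precision, and x is assumed to be zero or a positive normal double.

```python
import sys

def sqrt_successive(x, rel_tol=4 * sys.float_info.epsilon, max_iter=2000):
    """Square root by repeatedly averaging g and x/g, stopping on relative closeness."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    g = x / 2                      # initial guess from step 1
    for _ in range(max_iter):      # the cap guards against a loop that never settles
        q = x / g
        if abs(g - q) <= rel_tol * max(g, q):
            break                  # g and x/g agree to within a few epsilons
        g = (g + q) / 2            # new guess from step 2
    return g

print(sqrt_successive(2.0))
print(sqrt_successive(1e10))
```

Scaling the tolerance by max(g, q) is what makes the test relative, so the same rel_tol works for very large and very small x.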

convert real number to radicals

Suppose I have a real number. I want to approximate it with something of the form a+sqrt(b) for integers a and b. But I don't know the values of a and b. Of course I would prefer to get a good approximation with small values of a and b. Let's leave it undefined for now what is meant by "good" and "small". Any sensible definitions of those terms will do.
Is there a sane way to find them? Something like the continued fraction algorithm for finding fractional approximations of decimals. For more on the fractions problem, see here.
EDIT: To clarify, it is an arbitrary real number. All I have are a bunch of its digits. So depending on how good of an approximation we want, a and b might or might not exist. Brute force is naturally not a particularly good algorithm. The best I can think of would be to start adding integers to my real, squaring the result, and seeing if I come close to an integer. Pretty much brute force, and not a particularly good algorithm. But if nothing better exists, that would itself be interesting to know.
EDIT: Obviously b has to be zero or positive. But a could be any integer.
No need for continued fractions; just calculate the square-root of all "small" values of b (up to whatever value you feel is still "small" enough), remove everything before the decimal point, and sort/store them all (along with the b that generated it).
Then when you need to approximate a real number, find the radical whose decimal portion is closest to the real number's decimal portion. This gives you b; choosing the correct a is then a simple matter of subtraction.
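A minimal Python sketch of this lookup-table idea follows. The function names and the bound of 99 are illustrative assumptions, and the sketch ignores wraparound between fractional parts near 0 and near 1.

```python
import bisect
import math

def build_table(b_max):
    """Sorted list of (fractional part of sqrt(b), b) for b = 1..b_max."""
    return sorted((math.sqrt(b) % 1.0, b) for b in range(1, b_max + 1))

def approximate(d, table):
    """Return (a, b) such that a + sqrt(b) is close to d, using the precomputed table."""
    frac = d % 1.0
    i = bisect.bisect_left(table, (frac, 0))
    neighbours = table[max(i - 1, 0):i + 1]   # entries just below and just above frac
    _, b = min(neighbours, key=lambda entry: abs(entry[0] - frac))
    a = round(d - math.sqrt(b))               # "a simple matter of subtraction"
    return a, b

table = build_table(99)
print(approximate(math.pi, table))   # (-4, 51)
```

Building the table costs O(n log n) once; each query is then a binary search.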
This is actually more of a math problem than a computer problem, but to answer the question I think you are right that you can use continued fractions. What you do is first represent the target number as a continued fraction. For example, if you want to approximate pi (3.14159265) then the CF is:
3: 7, 15, 1, 288, 1, 2, 1, 3, 1, 7, 4 ...
The next step is create a table of CFs for square roots, then you compare the values in the table to the fractional part of the target value (here: 7, 15, 1, 288, 1, 2, 1, 3, 1, 7, 4...). For example, let's say your table had square roots for 1-99 only. Then you would find the closest match would be sqrt(51) which has a CF of 7: 7,14 repeating. The 7,14 is the closest to pi's 7,15. Thus your answer would be:
sqrt(51)-4
This is the closest approximation for b < 100; it is off by about 0.00016. If you allow larger b's, you could get a better approximation.
The advantage of using CFs is that it is faster than working in, say, doubles or using floating point. For example, in the above case you only have to compare two integers (7 and 15), and you can also use indexing to make finding the closest entry in the table very fast.
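The continued-fraction terms quoted in this answer can be reproduced with exact integer arithmetic. The Python sketch below is illustrative only: cf_rational expands the truncated decimal 3.14159265 as a fraction, and cf_sqrt uses the standard integer recurrence for square roots of non-squares. Both function names are assumptions, not part of the answer.

```python
from fractions import Fraction
from math import isqrt

def cf_rational(q):
    """Continued-fraction terms of a non-negative rational number."""
    terms = []
    num, den = q.numerator, q.denominator
    while den:
        a, rem = divmod(num, den)
        terms.append(a)
        num, den = den, rem
    return terms

def cf_sqrt(n, count):
    """First `count` continued-fraction terms of sqrt(n) for a non-square n."""
    a0 = isqrt(n)
    m, d, a = 0, 1, a0
    terms = [a0]
    while len(terms) < count:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        terms.append(a)
    return terms

print(cf_rational(Fraction(314159265, 100000000)))  # [3, 7, 15, 1, 288, 1, 2, 1, 3, 1, 7, 4]
print(cf_sqrt(51, 5))                               # [7, 7, 14, 7, 14]
```

Comparing the terms after the leading integer (7, 15, ... against 7, 14, ...) is the table match described above.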
This can be done using mixed integer quadratic programming very efficiently (though there are no run-time guarantees as MIQP is NP-complete.)
Define:
d := the real number you wish to approximate
b, a := two integers such that a + sqrt(b) is as "close" to d as possible
r := (d - a)^2 - b, is the residual of the approximation
The goal is to minimize r. Setup your quadratic program as:
x := [ s b t ]
D := | 1 0 0 |
     | 0 0 0 |
     | 0 0 0 |
c := [ 0 -1 0 ]^T
with the constraint that s - t = f (where f is the fractional part of d)
and b,t are integers (s is not)
This is a convex (therefore optimally solvable) mixed integer quadratic program since D is positive semi-definite.
Once s,b,t are computed, simply derive the answer using b=b, s=d-a and t can be ignored.
Your problem may be NP-complete, it would be interesting to prove if so.
Some of the previous answers use methods that are of time or space complexity O(n), where n is the largest “small number” that will be accepted. By contrast, the following method is O(sqrt(n)) in time, and O(1) in space.
Suppose that positive real number r = x + y, where x=floor(r) and 0 ≤ y < 1. We want to approximate r by a number of the form a + √b. If x+y ≈ a+√b then x+y-a ≈ √b, so √b ≈ h+y for some integer offset h, and b ≈ (h+y)^2. To make b an integer, we want to minimize the fractional part of (h+y)^2 over all eligible h. There are at most √n eligible values of h. See the following Python code and sample output.
import math, random
def findb(y, rhi):
    bestb = loerror = 1
    for r in range(2,rhi):
        v = (r+y)**2
        u = round(v)
        err = abs(v-u)
        if round(math.sqrt(u))**2 == u: continue
        if err < loerror:
            bestb, loerror = u, err
    return bestb
#random.seed(123456) # set a seed if testing repetitively
f = [math.pi-3] + sorted([random.random() for i in range(24)])
print (' frac sqrt(b) error b')
for frac in f:
    b = findb(frac, 12)
    r = math.sqrt(b)
    t = math.modf(r)[0] # Get fractional part of sqrt(b)
    print ('{:9.5f} {:9.5f} {:11.7f} {:5.0f}'.format(frac, r, t-frac, b))
(Note 1: This code is in demo form; the parameters to findb() are y, the fractional part of r, and rhi, the square root of the largest small number. You may wish to change usage of parameters. Note 2: The
if round(math.sqrt(u))**2 == u: continue
line of code prevents findb() from returning perfect-square values of b, except for the value b=1, because no perfect square can improve upon the accuracy offered by b=1.)
Sample output follows. About a dozen lines have been elided in the middle. The first output line shows that this procedure yields b=51 to represent the fractional part of pi, which is the same value reported in some other answers.
     frac   sqrt(b)       error     b
  0.14159   7.14143  -0.0001642    51
  0.11975   4.12311   0.0033593    17
  0.12230   4.12311   0.0008085    17
  0.22150   9.21954  -0.0019586    85
  0.22681  11.22497  -0.0018377   126
  0.25946   2.23607  -0.0233893     5
  0.30024   5.29150  -0.0087362    28
  0.36772   8.36660  -0.0011170    70
  0.42452   8.42615   0.0016309    71
...
  0.93086   6.92820  -0.0026609    48
  0.94677   8.94427  -0.0024960    80
  0.96549  11.95826  -0.0072333   143
  0.97693  11.95826  -0.0186723   143
With the following code added at the end of the program, the output shown below also appears. This shows closer approximations for the fractional part of pi.
frac, rhi = math.pi-3, 16
print (' frac sqrt(b) error b bMax')
while rhi < 1000:
    b = findb(frac, rhi)
    r = math.sqrt(b)
    t = math.modf(r)[0] # Get fractional part of sqrt(b)
    print ('{:11.7f} {:11.7f} {:13.9f} {:7.0f} {:7.0f}'.format(frac, r, t-frac, b,rhi**2))
    rhi = 3*rhi//2 # integer division keeps rhi an int for range() under Python 3
       frac     sqrt(b)         error       b    bMax
  0.1415927   7.1414284  -0.000164225      51     256
  0.1415927   7.1414284  -0.000164225      51     576
  0.1415927   7.1414284  -0.000164225      51    1296
  0.1415927   7.1414284  -0.000164225      51    2916
  0.1415927   7.1414284  -0.000164225      51    6561
  0.1415927 120.1415831  -0.000009511   14434   14641
  0.1415927 120.1415831  -0.000009511   14434   32761
  0.1415927 233.1415879  -0.000004772   54355   73441
  0.1415927 346.1415895  -0.000003127  119814  164836
  0.1415927 572.1415909  -0.000001786  327346  370881
  0.1415927 911.1415916  -0.000001023  830179  833569
I do not know if there is any kind of standard algorithm for this kind of problem, but it does intrigue me, so here is my attempt at developing an algorithm that finds the needed approximation.
Call the real number in question r. Then, first I assume that a can be negative, in that case we can reduce the problem and now only have to find a b such that the decimal part of sqrt(b) is a good approximation of the decimal part of r. Let us now write r as r = x.y with x being the integer and y the decimal part.
Now:
b = r^2
= (x.y)^2
= (x + .y)^2
= x^2 + 2 * x * .y + .y^2
= 2 * x * .y + .y^2 (mod 1)
We now only have to find an x such that 0 = .y^2 + 2 * x * .y (mod 1) (approximately).
Filling that x into the formulas above we get b and can then calculate a as a = r - sqrt(b). (All of these calculations have to be carefully rounded of course.)
Now, for the time being I am not sure if there is a way to find this x without brute forcing it. But even then, one can simply use a simple loop to find an x that is good enough.
I am thinking of something like this (semi pseudo code):
max_diff_low = 0.01 // arbitrary accuracy
max_diff_high = 1 - max_diff_low
y = r % 1
v = y^2
addend = 2 * y
x = 0
while (v < max_diff_high && v > max_diff_low)
    x++;
    v = (v + addend) % 1
c = (x + y) ^ 2
b = round(c)
a = round(r - sqrt(b))
Now, I think this algorithm is fairly efficient, while even allowing you to specify the desired accuracy of the approximation. One thing that could be done that would turn it into an O(1) algorithm is calculating all the x and putting them into a lookup table. If one only cares about the first three decimal digits of r (for example), the lookup table would only have 1000 values, which is only 4 KB of memory (assuming that 32-bit integers are used).
Hope this is helpful at all. If anyone finds anything wrong with the algorithm, please let me know in a comment and I will fix it.
EDIT:
Upon reflection I retract my claim of efficiency. There is in fact as far as I can tell no guarantee that the algorithm as outlined above will ever terminate, and even if it does, it might take a long time to find a very large x that solves the equation adequately.
One could maybe keep track of the best x found so far and relax the accuracy bounds over time to make sure the algorithm terminates quickly, at the possible cost of accuracy.
These problems are of course non-existent, if one simply pre-calculates a lookup table.

Split a number into three buckets with constraints

Is there a good algorithm to split a randomly generated number into three buckets, each with constraints as to how much of the total they may contain?
For example, say my randomly generated number is 1,000 and I need to split it into buckets a, b, and c.
These ranges are only an example. See my edit for possible ranges.
Bucket a may only be between 10% - 70% of the number (100 - 700)
Bucket b may only be between 10% - 50% of the number (100 - 500)
Bucket c may only be between 5% - 25% of the number (50 - 250)
a + b + c must equal the randomly generated number
You want the amounts assigned to be completely random, so that bucket a is just as likely to hit its max as bucket c, and all three buckets are equally likely to land around their percentage mean.
EDIT: The following will most likely always be true: low end of a + b + c < 100%, high end of a + b + c > 100%. These percentages are only to indicate acceptable values of a, b, and c. In a case where a is 10% while b and c are their max (50% and 25% respectively) the numbers would have to be reassigned since the total would not equal 100%. This is the exact case I'm trying to avoid by finding a way to assign these numbers in one pass.
I'd like to find a way to pick these number randomly within their range in one pass.
The problem is equivalent to selecting a random point in an N-dimensional object (in your example N=3), the object being defined by the equations (in your example):
0.1 <= x <= 0.7
0.1 <= y <= 0.5
0.05 <= z <= 0.25
x + y + z = 1 (*)
Clearly because of the last equation (*) one of the coordinates is redundant, i.e. picking values for x and y dictates z.
Eliminating (*) and one of the other equations leaves us with an (N-1)-dimensional box, e.g.
0.1 <= x <= 0.7
0.1 <= y <= 0.5
that is cut by the inequality
0.05 <= (1 - x - y) <= 0.25 (**)
that derives from (*) and the equation for z. This is basically a diagonal stripe through the box.
In order for the results to be uniform, I would just repeatedly sample the (N-1)-dimensional box, and accept the first sampled point that fulfills (**). Single-pass solutions might end up having biased distributions.
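The accept/reject loop described here might look like the following in Python. This is a sketch using the example ranges from the question; the function name is hypothetical. Rounding to whole units is an added assumption, and it can push c one unit past its bounds.

```python
import random

def split_by_rejection(total, rng=random):
    """Sample (x, y) in the box until z = 1 - x - y also lies inside its range."""
    while True:
        x = rng.uniform(0.10, 0.70)
        y = rng.uniform(0.10, 0.50)
        z = 1.0 - x - y
        if 0.05 <= z <= 0.25:          # the diagonal stripe (**)
            a = round(total * x)
            b = round(total * y)
            return a, b, total - a - b  # c absorbs the rounding so the parts sum to total

print(split_by_rejection(1000))
```

Because every accepted point is drawn uniformly from the box, the accepted points are uniform over the stripe; the price is an unpredictable number of retries.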
Update: Yes, you're right, the result is not uniformly distributed.
Let's say your percent values are natural numbers (if this assumption is wrong, you don't have to read further :) In that case I don't have a solution).
Let's define an event e as a tuple of 3 values (the percentage of each bucket): e = (pa, pb, pc). Next, create all possible events e_n. What you have here is a tuple space consisting of a discrete number of events. All of the possible events should have the same probability of occurring.
Let's say we have a function f(n) => e_n. Then all we have to do is take a random number n and return e_n in a single pass.
Now, the problem remains to create such a function f :)
In pseudo code, a very slow method (just for illustration):
function f(n) {
    int c = 0
    for i in [10..70] {
        for j in [10..50] {
            for k in [5..25] {
                if(i + j + k == 100) {
                    if(n == c) {
                        return (i, j, k) // found event!
                    } else {
                        c = c + 1
                    }
                }
            }
        }
    }
}
What you have now is a single-pass solution, but the problem has only been moved: the function f is very slow. You can do better, though: I think you can calculate everything a bit faster if you set your ranges correctly and calculate offsets instead of iterating through your ranges.
Is this clear enough?
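One way to make the event-enumeration idea cheap is to build the list of events once and then draw an index. The Python sketch below assumes whole-number percentages and the example ranges from the question; all_events and random_split are illustrative names.

```python
import random

def all_events():
    """Every integer-percentage triple (a, b, c) within the example ranges that sums to 100."""
    events = []
    for a in range(10, 71):
        for b in range(10, 51):
            c = 100 - a - b        # c is determined by a and b, so no third loop is needed
            if 5 <= c <= 25:
                events.append((a, b, c))
    return events

EVENTS = all_events()              # built once, reused for every draw

def random_split(rng=random):
    """Pick one event uniformly at random in a single step."""
    return rng.choice(EVENTS)

print(random_split())
```

Each valid triple appears exactly once in EVENTS, so every event is equally likely.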
First of all, you probably have to adjust your ranges. 10% in bucket a is not possible, since the condition a+b+c = number cannot hold then (b and c can contribute at most 75%).
Concerning your question: (1) Pick a random number for bucket a inside your range, then (2) update the range for bucket b with its minimum and maximum percentage (you should only narrow the range). Then (3) pick a random number for bucket b. In the end, (4) c should be calculated so that your condition holds.
Example:
n = 1000
(1) a = 40%
(2) range b [35,50], because 40+35+25 = 100%
(3) b = 45%
(4) c = 100-40-45 = 15%
Or:
n = 1000
(1) a = 70%
(2) range b [10,25], because 70+25+5 = 100%
(3) b = 20%
(4) c = 100-70-20 = 10%
It remains to check whether all the events are uniformly distributed. If that should be a problem, you might want to randomize the range update in step 2.
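Steps (1) to (4) can be sketched in Python as follows, working in whole percentages with the example ranges. The constant and function names are assumptions made for illustration. As the answer cautions, nothing here makes the resulting triples uniformly distributed.

```python
import random

A_RANGE, B_RANGE, C_RANGE = (10, 70), (10, 50), (5, 25)

def split_sequential(rng=random):
    # Narrow a's range first so that a valid b and c always exist.
    a_lo = max(A_RANGE[0], 100 - B_RANGE[1] - C_RANGE[1])
    a_hi = min(A_RANGE[1], 100 - B_RANGE[0] - C_RANGE[0])
    a = rng.randint(a_lo, a_hi)                     # step (1)
    b_lo = max(B_RANGE[0], 100 - a - C_RANGE[1])    # step (2): narrow b's range
    b_hi = min(B_RANGE[1], 100 - a - C_RANGE[0])
    b = rng.randint(b_lo, b_hi)                     # step (3)
    return a, b, 100 - a - b                        # step (4)

print(split_sequential())
```

For a = 40 the narrowed range for b is [35, 50], and for a = 70 it is [10, 25], matching the two worked examples.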

Does a range of integers contain at least one perfect square?

Given two integers a and b, is there an efficient way to test whether there is another integer n such that a ≤ n^2 < b?
I do not need to know n, only whether at least one such n exists or not, so I hope to avoid computing square roots of any numbers in the interval.
Although testing whether an individual integer is a perfect square is faster than computing the square root, the range may be large and I would also prefer to avoid performing this test for every number within the range.
Examples:
intervalContainsSquare(2, 3) => false
intervalContainsSquare(5, 9) => false (note: 9 is outside this interval)
intervalContainsSquare(9, 9) => false (this interval is empty)
intervalContainsSquare(4, 9) => true (4 is inside this interval)
intervalContainsSquare(5, 16) => true (9 is inside this interval)
intervalContainsSquare(1, 10) => true (1, 4 and 9 are all inside this interval)
Computing whether or not a number is a square isn't really faster than computing its square root in hard cases, as far as I know. What is true is that you can do a precomputation to know that it isn't a square, which might save you time on average.
Likewise for this problem, you can do a precomputation to determine that sqrt(b)-sqrt(a) >= 1, which then means that a and b are far enough apart that there must be a square between them. With some algebra, this inequality is equivalent to the condition that (b-a-1)^2 >= 4*a, or if you want it in a more symmetric form, that (a-b)^2+1 >= 2*(a+b). So this precomputation can be done with no square roots, only with one integer product and some additions and subtractions.
If a and b are almost exactly the same, then you can still use the trick of looking at low order binary digits as a precomputation to know that there isn't a square between them. But they have to be so close together that this precomputation might not be worth it.
If these precomputations are inconclusive, then I can't think of anything other than everyone else's solution, a <= ceil(sqrt(a))^2 < b.
Since there was a question of doing the algebra right:
sqrt(b)-sqrt(a) >= 1
sqrt(b) >= 1+sqrt(a)
b >= 1+2*sqrt(a)+a
b-a-1 >= 2*sqrt(a)
(b-a-1)^2 >= 4*a
Also: Generally when a is a large number, you would compute sqrt(a) with Newton's method, or with a lookup table followed by a few Newton's method steps. It is faster in principle to compute ceil(sqrt(a)) than sqrt(a), because the floating point arithmetic can be simplified to integer arithmetic, and because you don't need as many Newton's method steps to nail down high precision that you're just going to throw away. But in practice, a numerical library function can be much faster if it uses square roots implemented in microcode. If for whatever reason you don't have that microcode to help you, then it might be worth it to hand-code ceil(sqrt(a)). Maybe the most interesting case would be if a and b are unbounded integers (like, a thousand digits). But for ordinary-sized integers on an ordinary non-obsolete computer, you can't beat the FPU.
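A Python sketch of the precomputation followed by the fallback test is given below. It uses math.isqrt (Python 3.8+) for an exact integer square root instead of floating point, which is a substitution made for this sketch. The function name follows the question's examples, and clamping negative a to zero is an added assumption.

```python
import math

def interval_contains_square(a, b):
    """True if some integer n satisfies a <= n*n < b."""
    if a >= b or b <= 0:
        return False               # empty interval, or no non-negative values in it
    a = max(a, 0)                  # squares are never negative
    # Cheap precheck, no square root: sqrt(b) - sqrt(a) >= 1 guarantees a square.
    if (b - a - 1) ** 2 >= 4 * a:
        return True
    n = math.isqrt(a)              # floor(sqrt(a)), exact for any integer size
    if n * n < a:
        n += 1                     # n is now ceil(sqrt(a))
    return n * n < b

print(interval_contains_square(5, 9))    # False
print(interval_contains_square(4, 9))    # True
```

When the precheck fails, the interval is short relative to sqrt(a), so the single integer square root decides the answer.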
Get the square root of the lower number. If this is an integer then you are done.
Otherwise round up and square the number. If this is less than b then it is true.
You only need to compute one square root this way.
To avoid a problem when a is equal to b, you should check that first, as this case is always false.
If you will accept calculating two square roots, because of its monotonicity you have this inequality which is equivalent to your starting one:
sqrt(a) <= n < sqrt(b)
thus, if ceil(sqrt(a)) < ceil(sqrt(b)), then ceil(sqrt(a)) is guaranteed to be such an n.
1. Get the square root of the lower number and round it up.
2. Get the square root of the higher number and round it down.
3. If the result of step 1 is less than or equal to the result of step 2, there will be a perfect square.
Find the integral part of sqrt(a) and sqrt(b), say sa and sb.
If sa^2 = a, then output yes.
If sb^2 = b and sa = sb-1, then output no.
If sa < sb output yes.
Else output no.
You can optimize the above to get rid of the computation of sqrt(b) (similar to JDunkerly's answer).
Or did you want to avoid computing square roots of a and b too?
You can avoid computing square roots completely by using a method similar to binary search.
You start with a guess for n, n = 1, and compute n^2.
If a <= n^2 < b, you can stop.
If n^2 < a < b, you double your guess n.
If a < b <= n^2, you make it close to the average of the current and previous guesses.
This will be O(log b) time.
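The doubling-then-bisecting search can be sketched in Python using only integer multiplications and comparisons. This is one possible arrangement of the idea rather than the answerer's exact procedure: it searches for the smallest n with n^2 >= a and then tests that n against b. The function name is hypothetical.

```python
def contains_square_by_search(a, b):
    """True if some integer n satisfies a <= n*n < b, found without any square root."""
    if a >= b or b <= 0:
        return False
    a = max(a, 0)
    hi = 1
    while hi * hi < a:             # doubling phase: find an upper bound for n
        hi *= 2
    lo = 0
    while lo < hi:                 # bisection phase: smallest n with n*n >= a
        mid = (lo + hi) // 2
        if mid * mid >= a:
            hi = mid
        else:
            lo = mid + 1
    return lo * lo < b

print(contains_square_by_search(5, 16))   # True
print(contains_square_by_search(5, 9))    # False
```

Both phases take a number of steps proportional to the number of bits in sqrt(a), which is consistent with the logarithmic bound stated above.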
In addition to JDunkerley's nice solution (+1), there could be a possible improvement that needs to be tested and uses integer square roots to calculate integer square roots
Why are you hoping to avoid square roots entirely? Even before you get to the most efficient way of solving this, you have seen methods that call for only 2 square roots. That's done in O(1) time, so it seems to me that any improvement you could hope to make would take more time to think about than it would EVER save you computing time. Am I wrong?
One way is to use Newton's method to find the integer square root for b. Then you can check if that number falls in the range. I doubt that it is faster than simply calling the square root function, but it is certainly more interesting:
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char* argv[] )
{
    int a, b;
    double xk=0, xk1;
    int root;
    int iter=0;
    if ( argc < 3 ) return 1; // expects a and b on the command line
    a = atoi( argv[1] );
    b = atoi( argv[2] );
    xk1 = b / 32 + 1; // +1 to ensure > 0
    xk1 = b; // overrides the guess above: Newton's method starts from b itself
    while( fabs( xk1 - xk ) >= .5 ) {
        xk = xk1;
        xk1 = ( xk + b / xk ) / 2.;
        printf( "%d) xk = %f\n", ++iter, xk1 );
    }
    root = (int)xk1;
    // If b is a perfect square, then this finds that root, so it also
    // needs to check if (n-1)^2 falls in the range.
    // And this does a lot more multiplications than it needs
    if ( ( root*root >= a && root*root < b ) ||
         ( (root-1)*(root-1) >= a && (root-1)*(root-1) < b ) )
        printf( "Contains perfect square\n" );
    else
        printf( "Does not contain perfect square\n" );
    return 0;
}
