Precision Arithmetic and Overflow for Factorial

Recall IEEE double-precision arithmetic. Now, for which n > 1 can binom(n, k) be computed in IEEE double precision? Additionally, on the same interval, when will intermediate factorial values overflow?
For my first question, I have found the interval n < 2^53, but I'm not sure whether this is correct.

For a given n the largest binom(n, k) value is attained at k = [n/2] (the integer part of n/2). For binom(n, k) to be representable in double-precision format, it is therefore sufficient that binom(n, [n/2]) be representable.
The following lists the number of bits (binary digits) required for the exact representation of binom(n, [n/2]) (retrieved from Wolfram Alpha using queries similar to this one).
n     bits in binom(n, [n/2])
56    53
57    54
The following lists the values in binary exponent form for binom(n, [n/2]).
n       binom(n, [n/2])
1029    1.1... * 2^1023
1030    1.1... * 2^1024
The max n for which all binom(n, k) can be exactly represented in a double precision floating point (53 bits mantissa) is 56.
The max n for which all binom(n, k) can be approximately represented in a double precision floating point (11 bit exponent) is 1029.
The similar max limits for n! are n = 18 (exact representation) and n = 170 (floating point approximation).
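These thresholds are easy to reproduce with exact integer arithmetic. Here is a small sanity check in Python (math.comb is exact); this sketch is mine, not part of the derivation above:
import math

# Bit length of the central binomial coefficient binom(n, [n/2]).
# At most 53 bits: every binom(n, k) fits exactly in a double's significand.
# At most 1024 bits: the value stays below the overflow threshold 2^1024.
for n in (56, 57, 1029, 1030):
    print(n, math.comb(n, n // 2).bit_length(), "bits")
# prints: 56 53 bits, 57 54 bits, 1029 1024 bits, 1030 1025 bits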


Determine which fraction is better

Given two fractions, determine which fraction has more potential to give the maximum value.
For example, we have the fractions 1/2 and 2/4. For this example I picked two random fractions, 3/4 and 5/3 (I don't know what these fractions will be in advance; I just know that the numerators and denominators are in 1..100).
Sum all the numerators and all the denominators, then divide:
1. Fraction 1/2, sum with 3/4, 5/3
(1+3+5) / (2+4+3) = 9/9 = 1
2. Fraction 2/4 sum with 3/4, 5/3
(2+3+5) / (4+4+3) = 10/11 ≈ 0.91
For the above example the output would be 1/2. But will this be true when instead of 3/4, 5/3 we have all the fractions [1..100]/[1..100]?
Let's fix the sum of the numerators of the two random fractions (denote it num) and the sum of their denominators (denom). Then we know the result of the comparison. We just need to find the number of ways to represent num and denom as such sums. The number of ways to write a sum x as a + b with a, b in 1..100 is exactly min(100, x - 1) - max(x - 100, 1) + 1.
So we just need to iterate over all possible numerator and denominator sums; there are only 199 * 199 different options (sketched below).
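A sketch of this counting approach in Python (the fixed pair (1, 2) vs (2, 4) and the helper names are mine, for illustration):
# Number of ways to write x as a + b with a, b in 1..100.
def ways(x):
    return min(100, x - 1) - max(x - 100, 1) + 1

# Count how often each fixed fraction gives the larger combined value,
# over all pairs of random fractions with entries in 1..100.
def compare(f1, f2):
    wins1 = wins2 = ties = 0
    for num in range(2, 201):        # sum of the two random numerators
        for denom in range(2, 201):  # sum of the two random denominators
            w = ways(num) * ways(denom)
            # cross-multiply to compare (f1 + sums) against (f2 + sums)
            lhs = (f1[0] + num) * (f2[1] + denom)
            rhs = (f2[0] + num) * (f1[1] + denom)
            if lhs > rhs:
                wins1 += w
            elif lhs < rhs:
                wins2 += w
            else:
                ties += w
    return wins1, wins2, ties

print(compare((1, 2), (2, 4)))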
However, 100^4 is a pretty small number, so you can just iterate over all possible numerators and denominators of the two random fractions, too.

Making a very large calculation

I want to calculate the value X = n!/2^r,
where n < 10^6 and r < 10^6,
and it is guaranteed that the value of X is between 0 and 10.
How do I calculate X? I can't simply divide the factorial by the power term, since both overflow a long integer.
My Approach
Work with modular arithmetic. Take a prime number greater than 10, say 101:
X = ((n! mod 101) * (modular inverse of 2^r)) mod 101
Note that the modular inverse can easily be calculated, and 2^r mod 101 can also be computed.
Problem:
It is not guaranteed that X is an integer; it can be fractional as well.
My method works fine when X is an integer. How do I deal with the case where X is a floating-point number?
If approximate results are OK and you have access to a math library with base-2 exponential (exp2 in C), natural log gamma (lgamma in C), and natural log (log in C), then you can do
exp2(lgamma(n+1)/log(2) - r).
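In Python the same one-liner reads as follows (math.lgamma and math.log are the standard-library counterparts of the C functions; 2.0 ** x stands in for exp2):
import math

# X = n!/2^r computed in log space: log2(n!) = lgamma(n + 1) / ln(2).
def x_approx(n, r):
    return 2.0 ** (math.lgamma(n + 1) / math.log(2) - r)

print(x_approx(10, 20))  # 10!/2^20 = 3628800/1048576 ≈ 3.4607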
Find the power to which 2 appears in n!. This is:
P = [n/2] + [n/2^2] + [n/2^3] + ...
using integer division until you reach a 0 term.
If P >= r, then you have an integer result. You can find it by computing the factorial while skipping r factors of 2. In Python:
factorial = 1
for i in range(2, n + 1):
    factor = i
    # cancel factors of 2 against the remaining powers of the denominator
    while factor % 2 == 0 and r != 0:
        factor //= 2
        r -= 1
    factorial *= factor
If P < r, set r = P, apply the same algorithm and divide the result by 2^(initial_r - P) in the end.
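Putting both cases together, a minimal sketch in Python (exact big-integer arithmetic, so practical only for moderate n; the function name is mine):
def x_from_factorial(n, r):
    # Exponent of 2 in n! (the P above), via repeated integer division.
    P, m = 0, n
    while m:
        m //= 2
        P += m
    todo = min(P, r)  # factors of 2 we can cancel inside the factorial
    factorial = 1
    for i in range(2, n + 1):
        factor = i
        while factor % 2 == 0 and todo:
            factor //= 2
            todo -= 1
        factorial *= factor
    # If r > P, the leftover power of two cannot be cancelled exactly.
    return factorial / 2 ** (r - P) if r > P else factorial

print(x_from_factorial(10, 20))  # 10!/2^20 ≈ 3.4607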
Except for a very few cases (with small n and r), X will not be an integer: if n >= 11, then 11 divides n! but does not divide any power of two, so if X were an integer it would be a multiple of 11, and hence at least 11, contradicting X <= 10.
One method would be: initialise X to one; then loop: if X > 10, divide by 2 until it's not; if X < 10, multiply by the next factor until it's not; repeat until you run out of factors and powers of 2.
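A quick sketch of that loop in Python, relying on the guarantee that the final X lies in (0, 10) so the running value never strays far from it:
def x_streaming(n, r):
    x = 1.0
    i = 2
    while i <= n or r > 0:
        if x > 10.0 and r > 0:
            x /= 2.0   # too big: spend one of the powers of two
            r -= 1
        elif i <= n:
            x *= i     # multiply in the next factor
            i += 1
        else:
            x /= 2.0   # factors exhausted, powers of two remain
            r -= 1
    return x

print(x_streaming(10, 20))  # ≈ 3.4607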
An approach that would be tunable for precision/performance would be the following:
Store the factorial in an integer with a fixed number of bits. We can drop the lowest bits whenever the number gets too large, since they won't affect the overall result that much. By scaling this integer larger or smaller, the algorithm can be tuned for either performance or precision.
Whenever the integer would overflow due to a multiplication, shift it to the right by a few places and subtract the shift amount from r. At the end there should be a small value of r left, and an integer v containing the most significant bits of the factorial. This v can then be interpreted as a fixed-point number with r fractional binary digits.
Depending on the required precision this approach might even work with long, though I haven't had time to test it beyond a bit of experimenting with a calculator.
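A sketch of this fixed-point idea in Python, where the bit width is the tuning knob (the function and parameter names are mine):
def x_fixed_point(n, r, bits=64):
    v = 1
    for i in range(2, n + 1):
        v *= i
        # On "overflow", shift right and charge the shift against r;
        # the dropped low bits are the tunable precision loss.
        while v.bit_length() > bits and r > 0:
            v >>= 1
            r -= 1
    return v / 2 ** r  # the leftover r acts as the binary scale factor

print(x_fixed_point(10, 20))  # ≈ 3.4607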

Why is naive multiplication n^2 time?

I've read that operations such as addition and subtraction take linear time, and that "grade-school" long multiplication takes n^2 time. Why is this true?
Isn't addition floor(log n) time, where n is the smaller operand? The same argument goes for subtraction; and for multiplication, if we write a program that does long multiplication instead of repeated addition, shouldn't the complexity be floor(log a) * floor(log b), where a and b are the operands?
The answer depends on what is "n." When they say that addition is O(n) and multiplication (with the naïve algorithm) is O(n^2), n is the length of the number, either in bits or some other unit. This definition is used because arbitrary precision arithmetic is implemented as operations on lists of "digits" (not necessarily base 10).
If n is the number being added or multiplied, the complexities would be log n and (log n)^2 for positive n, as long as the numbers are stored in log n space.
The naive approach to multiplication of (for example) 273 x 12 is expanded out (using the distributive rule) as (200 + 70 + 3) x (10 + 2) or:
200 x 10 + 200 x 2
+ 70 x 10 + 70 x 2
+ 3 x 10 + 3 x 2
The idea of this simplification is to reduce the multiplications to something that can be done easily. For primary-school math, that means working with single digits, assuming you know the times tables from zero to nine. For bignum libraries where each "digit" may be a value from 0 to 9999 (for ease of decimal printing), the same rules apply, since multiplying two numbers less than 10,000 can be treated as a constant-time operation.
Hence, if n is the number of digits, the complexity is indeed O(n^2), since the number of "constant" operations rises with the product of the "digit" counts.
This is true even if your definition of digit varies slightly (such as being a value from 0 to 9999 or even being one of the binary digits 0 or 1).
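To make the digit-count argument concrete, here is grade-school multiplication over digit lists in Python (a sketch; the inner statement runs exactly len(a) * len(b) times):
def long_multiply(a, b, base=10):
    # a, b: little-endian digit lists; returns the digits of the product.
    result = [0] * (len(a) + len(b))
    for i, da in enumerate(a):
        carry = 0
        for j, db in enumerate(b):
            total = result[i + j] + da * db + carry
            result[i + j] = total % base
            carry = total // base
        result[i + len(b)] += carry
    return result

# 273 x 12 with digits stored least-significant first:
print(long_multiply([3, 7, 2], [2, 1]))  # [6, 7, 2, 3, 0], i.e. 3276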

Fast algorithm to calculate large n! mod 2³²

I want to calculate the exact value of N! mod 2^32, where N can be up to 2^31.
Any language is fine, but I would appreciate a detailed explanation of the algorithm.
The time limit is < 1 sec.
In Python:
from functools import reduce

def factorial_mod_2_32(n):
    if n > 33:
        return 0
    return reduce(lambda x, y: x * y, range(1, n + 1), 1) % 2**32
Justification:
We know that 34! is divisible by 2^32 because in the sequence:
1 * 2 * 3 * 4 * ... * 34
there are:
17 multiples of 2
8 multiples of 4
4 multiples of 8
2 multiples of 16
1 multiple of 32
--
32 factors of 2 in total
Since 34! is a factor of every larger factorial, all larger factorials are also 0 mod 2^32.
For small values of N, if you don't have bignum arithmetic available, you can do the individual multiplications mod 2^32, and/or you can pre-factor out the power of 2 in the factorial, which is easy to compute (see above).
Calculate the factorial normally (multiply the numbers 1,2,3,...), performing the modulo after each multiplication. This will give you the result for small values of N.
For larger values of N, do the same. Pretty soon your intermediate result will be 0, and then you can stop the loop immediately and return 0. That point is reached quickly: for N == 64 the result is certainly 0, because the product of 1..64 contains 32 even numbers and is therefore divisible by 2^32. (The actual minimal N where you get 0 is smaller; as shown above, it is 34.)
In general, you can implement algorithms modulo small powers of two without bignums or explicit modular reduction, using the integer types (int, long) available in most programming languages. For modulo 2^32 you would use a 32-bit unsigned int: "integer overflow" takes care of the modular arithmetic.
In this case, since there are only 34 distinct results, a lookup table may be faster than computing the factorial, assuming the factorials are used often enough that the table gets loaded into the CPU cache. The execution time will be measured in microseconds.
When multiplying two numbers of arbitrary length, the lower bits are always exact, because they do not depend on the high-order bits. Basically a×b mod m = [(a mod m)×(b mod m)] mod m, so to compute N! mod m just do
1×2×...×N mod m = (...(((1×2 mod m)×3 mod m)×4 mod m)...)×N mod m
Modulo 2^n is a special case, because the modulus can be taken with a simple AND operation. Modulo 2^32 is even more special, because all unsigned operations in C and most C-like languages are already reduced modulo 2^32 for a 32-bit unsigned type.
As a result you can just multiply the numbers in a twice-as-wide type and then AND with 2^32 - 1 to get the modulus:
uint64_t p = 1;
for (uint32_t i = 1; i <= n; i++) {
    p = p * i & 0xFFFFFFFFU;  /* keep only the low 32 bits */
    if (p == 0) break;        /* once zero, the product stays zero */
}
return p;
Calculating a modulo is a very fast operation, especially the modulo of a power of 2. A multiplication is very costly in comparison.
The fastest algorithm would factorize the terms of the factorial into prime numbers (which is very fast, since the numbers are at most 33), then get the result by multiplying all the prime factors together, taking the modulo between each multiplication and starting with the big numbers.
E.g., to calculate 10! mod 2^32, use de Polignac's formula to get the prime factors of 10!,
which gives you:
10! = 7 * 5 * 5 * 3 * 3 * 3 * 3 * 2 * ... * 2 (eight factors of 2)
This would be faster than the basic algorithm, because calculating
(29! mod 2^32) × 30
is much harder than multiplying by 5, 3 and 2, taking the modulo in between each time.
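For what it's worth, here is a sketch of that route in Python (sieve and helper names are mine; pow(p, e, m) performs the repeated modular multiplication):
def factorial_mod_by_primes(n, mod=2**32):
    # de Polignac's formula: exponent of the prime p in n!.
    def exponent(p):
        e, q = 0, p
        while q <= n:
            e += n // q
            q *= p
        return e
    # Sieve of Eratosthenes up to n (n <= 33 in the cases that matter here).
    sieve = [True] * (n + 1)
    result = 1
    for p in range(2, n + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
            result = result * pow(p, exponent(p), mod) % mod
    return result

print(factorial_mod_by_primes(10))  # 3628800 == 10!, still below 2^32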

Newton's Method for finding the reciprocal of a floating point number for division

I am trying to divide two numbers, a numerator N by a divisor D.
I am using the Newton–Raphson division method, which uses Newton's method to find the reciprocal of D (1/D). The result of the division is then found by multiplying the numerator N by the reciprocal 1/D to get N/D.
The Newton-Raphson algorithm can be found here
So the first step of the algorithm is to start with an initial guess for 1/D which we call X_0.
X_0 is defined as X_0 = 48/17 - 32/17 * D.
However, we must first apply a bit-shift to the divisor D to scale it so that 0.5 ≤ D ≤ 1. The same bit-shift should be applied to the numerator N so that the quotient does not change.
We then find X_(i+1) using the formula X_(i+1) = X_i*(2-D*X_i)
Since the numerator N, the divisor D, and the result are all in IEEE-754 32-bit floating-point format, I am wondering how to apply this scaling properly, because my value for 1/D does not converge; it just heads toward -Inf or +Inf (depending on D).
What I have found works, though, is that if I make X_0 less than 1/D, the algorithm seems to always converge. So if I just use a lookup table storing a bunch of 1/D values, and I always pick a stored value that is below the true 1/D, then I should be okay. But is that standard practice?
To set the sign bit correctly, perform the XOR on the sign of the original dividend and divisor.
Make the sign of the divisor and dividend positive now.
First, set the new dividend exponent to dividend_exponent - divisor_exponent - 1 + 127 (biased).
The +127 restores the bias that the subtraction removed. This scales the dividend by the same amount we will scale the divisor by.
Change the divisor exponent to 126 (biased), i.e. -1 (unbiased). This scales the divisor to between 0.5 and 1.
Proceed to find X_0 with the scaled D value from the previous step: X_0 = 48/17 - 32/17 * D.
Proceed to find X_(i+1) with the new D, iterating enough times for the precision we need: X_(i+1) = X_i * (2 - D * X_i). The number of steps S needed is S = ceil(log_2((P + 1)/log_2(17))), where P is the desired precision in binary digits (bits).
Multiply X_S * N = (1/D) * N = N/D, and your result should be correct.
Update: This algorithm works correctly.
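For reference, here is the whole scheme as a floating-point sketch in Python (positive inputs only; math.frexp does the power-of-two scaling that the exponent manipulation above performs on the raw bits):
import math

def newton_reciprocal(d, steps=4):
    m, e = math.frexp(d)                # d = m * 2**e with m in [0.5, 1)
    x = 48.0 / 17.0 - 32.0 / 17.0 * m   # initial guess, error at most 1/17
    for _ in range(steps):
        x = x * (2.0 - m * x)           # Newton step, quadratic convergence
    return math.ldexp(x, -e)            # undo the scaling: 1/d = x * 2**(-e)

def divide(n, d):
    return n * newton_reciprocal(d)

print(divide(355.0, 113.0))  # ≈ 3.1415929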
