Calculating powers (e.g. 2^11) quickly [duplicate] - algorithm

Possible Duplicate: The most efficient way to implement an integer based power function pow(int, int)
How can I calculate powers with better runtime?
E.g. 2^13.
I remember seeing somewhere that it has something to do with the following calculation:
2^13 = 2^8 * 2^4 * 2^1
But I can't see how calculating each component of the right side of the equation and then multiplying them would help me.
Any ideas?
Edit: I did mean with any base. How do the algorithms you've mentioned below, in particular "exponentiation by squaring", improve the runtime / complexity?

There is a generalized algorithm for this, but in languages that have bit-shifting, there's a much faster way to compute powers of 2. You just put in 1 << exp (assuming your bit shift operator is << as it is in most languages that support the operation).
I assume you're looking for the generalized algorithm and just chose an unfortunate base as an example. I will give this algorithm in Python.
def intpow(base, exp):
    if exp == 0:
        return 1
    elif exp == 1:
        return base
    elif (exp & 1) != 0:
        return base * intpow(base * base, exp // 2)
    return intpow(base * base, exp // 2)
This lets the power be computed with O(log2 exp) multiplications instead of exp - 1. It's a divide and conquer algorithm. :-) As someone else said, it is called exponentiation by squaring.
If you plug your example into this, you can see how it works and is related to the equation you give:
intpow(2, 13)
2 * intpow(4, 6)
2 * intpow(16, 3)
2 * 16 * intpow(256, 1)
2 * 16 * 256 == 2^1 * 2^4 * 2^8
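The same idea can also be written as a loop that walks the bits of the exponent from least to most significant. The following is only a sketch: the name power_by_squaring is made up for illustration, and it assumes a non-negative integer exponent.

```python
def power_by_squaring(base, exp):
    """Compute base**exp for an integer exp >= 0 using O(log exp) multiplications."""
    if exp < 0:
        raise ValueError("exp must be non-negative")
    result = 1
    while exp > 0:
        if exp & 1:          # this bit of the exponent is set,
            result *= base   # so the current power of base is a factor
        base *= base         # base, base^2, base^4, base^8, ...
        exp >>= 1
    return result


print(power_by_squaring(2, 13))  # multiplies together 2^1, 2^4 and 2^8
```

For exponent 13 (binary 1101) the loop squares the base four times and multiplies it into the result three times, which matches the decomposition in the question.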

Use bitwise shifting. Ex. 1 << 11 returns 2^11.

Powers of two are the easy ones. In binary 2^13 is a one followed by 13 zeros.
You'd use bit shifting, which is a built in operator in many languages.

You can use exponentiation by squaring. This is also known as "square-and-multiply" and works for bases != 2, too.

If you're not limiting yourself to powers of two, then:
k^(2n) = (k^n)^2, and for odd exponents k^(2n+1) = k * (k^n)^2

The fastest free algorithm I know of is by Phillip S. Pang, Ph.D., and the source code can be found here.
It uses table-driven decomposition, which makes it possible to write an exp() function that is 2-10 times faster than the native exp() of the Pentium(R) processor.


Best way to generate U(1,5) from U(1,3)?

I am given a uniform integer random number generator ~ U3(1,3) (inclusive). I would like to generate integers ~ U5(1,5) (inclusive) using U3. What is the best way to do this?
The simplest approach I can think of is to sample twice from U3 and then use rejection sampling. I.e., sampling twice from U3 gives us 9 possible combinations. We can assign the first 5 combinations to 1,2,3,4,5, and reject the last 4 combinations.
This approach expects to sample from U3 9/5 * 2 = 18/5 = 3.6 times.
Another approach could be to sample three times from U3. This gives us a sample space of 27 possible combinations. We can make use of 25 of these combinations and reject the last 2. This approach expects to use U3 27/25 * 3 = 81/25 = 3.24 times. It would be a little more tedious to write out, since there are many more combinations than in the first approach, but the expected number of samples from U3 is lower.
Are there other, perhaps better, approaches to doing this?
I have this marked as language agnostic, but I'm primarily looking into doing this in either Python or C++.
You do not need combinations. A slight tweak using base 3 arithmetic removes the need for a table. Rather than using the 1..3 result directly, subtract 1 to get it into the range 0..2 and treat it as a base 3 digit. For three samples you could do something like:
function sample3()
    result <- 0
    result <- result + 9 * (randU3() - 1) // High digit: 9
    result <- result + 3 * (randU3() - 1) // Middle digit: 3
    result <- result + 1 * (randU3() - 1) // Units digit: 1
    return result
end function
That will give you a number in the range 0..26, or 1..27 if you add one. You can use that number directly in the rest of your program.
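To get from that 0..26 value to a U(1,5) sample, the remaining step is the rejection described in the question. Here is a minimal Python sketch of how the pieces might fit together; rand_u3 is only a stand-in for the generator you are given, and rand_u5 is a hypothetical name.

```python
import random


def rand_u3():
    """Stand-in for the given U(1,3) generator (an assumption for this sketch)."""
    return random.randint(1, 3)


def rand_u5(u3=rand_u3):
    """Return an integer uniform on 1..5 using only calls to u3()."""
    while True:
        # Three base-3 digits give a uniform value in 0..26.
        value = 9 * (u3() - 1) + 3 * (u3() - 1) + (u3() - 1)
        if value < 25:             # reject 25 and 26
            return value % 5 + 1   # each result 1..5 is hit by 5 of the 25 values


samples = [rand_u5() for _ in range(10000)]
print({k: samples.count(k) for k in range(1, 6)})
```

Taking the accepted value modulo 5 replaces the lookup table: the 25 accepted values split evenly into five groups of five.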
For the range [1, 3] to [1, 5], this is equivalent to rolling a 5-sided die with a 3-sided one.
However, this can't be done without "wasting" randomness (or running forever in the worst case), since all the prime factors of 5 (namely 5) don't divide 3. Thus, the best that can be done is to use rejection sampling to get arbitrarily close to no "waste" of randomness (such as by batching multiple rolls of the 3-sided die until 3^n is "close enough" to a power of 5). In other words, the approaches you give in your question are as good as they can get.
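The expected costs quoted in the question can be checked with a few lines of Python. This sketch assumes one U5 value is produced per accepted batch of n rolls, as in the question; expected_u3_calls is an illustrative name.

```python
from fractions import Fraction


def expected_u3_calls(n):
    """Expected U3 calls per U5 sample when batching n rolls and rejecting leftovers."""
    outcomes = 3 ** n
    usable = (outcomes // 5) * 5   # largest multiple of 5 that fits
    if usable == 0:
        raise ValueError("need n >= 2 so that 3**n >= 5")
    # Each attempt costs n calls and succeeds with probability usable / outcomes.
    return Fraction(n * outcomes, usable)


for n in (2, 3, 4, 5):
    print(n, float(expected_u3_calls(n)))
```

For n = 2 and n = 3 this reproduces the 3.6 and 3.24 from the question. Getting closer to the theoretical minimum requires extracting more than one U5 value from each accepted batch.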
More generally, an algorithm to roll a k-sided die with a p-sided die will inevitably "waste" randomness (and run forever in the worst case) unless "every prime number dividing k also divides p", according to Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner. For example:
Take the much more practical case that p is a power of 2 (and any block of random bits is the same as rolling a die with a power of 2 number of faces) and k is arbitrary. In this case, this "waste" and indefinite running time are inevitable unless k is also a power of 2.
This result applies to any case of rolling an n-sided die with an m-sided die, where n and m are different prime numbers. For example, look at the answers to a question for the case n = 7 and m = 5.
See also this question: Frugal conversion of uniformly distributed random numbers from one range to another.
Peter O. is right: you cannot avoid losing some randomness. So the only choice is a trade-off between how expensive calls to U(1,3) are, code clarity, simplicity, etc.
Here is my variant, making bits from U(1,3) and combining them together with rejection:
C/C++ (untested!)
int U13(); // your U(1,3)
int getBit() { // single unbiased random bit
    for (;;) {
        int v = U13();
        if (v != 3) // reject 3 so that 1 and 2 are equally likely
            return v - 1;
    }
}
int U15() {
    for (;;) {
        int q = getBit() + 2*getBit() + 4*getBit(); // uniform in [0...8)
        if (q < 5) // need range [0...5)
            return q + 1; // q accepted, make it in [1...5]
    }
}

fast multiplications

When I am going to compute the following series 1+x+x^2+x^3+..., I would prefer to do like this: (1+x)(1+x^2)(1+x^4)... (which is like some sort of repeated squaring) so that the number of multiplications can be significantly reduced.
Now I want to compute the series 1+x/1!+(x^2)/2!+(x^3)/3!+..., how can I use the similar techniques to improve the number of multiplications?
Any suggestions are warmly welcome!
The method of optimization you refer to is probably Horner's method:
a + bx + cx^2 + dx^3 = ((c + dx)x + b)x + a
The product A*(1-x)(1+x^2)(1+x^4)(1+x^8)..., OTOH, is useful for approximating the division A/(1+x) when x is small, since 1/(1+x) = (1-x)/(1-x^2).
The Taylor series sigma x^n/n! for exp(x) converges quite badly; other approximations are better suited to getting accurate values. If there's a trick to do it with fewer multiplications, it is to iterate with a temporary value:
sum=1; temp=x; k=1;
// The sum after the first iteration is (1+x), i.e. 1+x^1/1!
for (i=1;i<=N;i++) { sum=sum+temp/k; k=k*(i+1); temp=temp*x; }
// or
prod=1.0; for (i=N;i>0;i--) prod = prod * x/(double)i + 1.0;
Keeping the factorial k separate from the power temp should increase accuracy a bit. In a real-life situation it may be advisable either to combine the two into temp=temp*x/(i+1) (and use sum=sum+temp) in order to be able to iterate much further, or to use a lookup table for the constants a_n / n!, as one typically needs just a few terms (4 or 5 terms for sin/cos).
As it turned out, Horner's rule didn't have much of a role in the transformation of the geometric series Sigma x^n to product form. To calculate the exponential, other powerful techniques have to be applied: typically range reduction and rational (Padé) or polynomial (Chebyshev) approximations and such.
Converting comment to an answer:
Note that for first series, there is exact equivalence:
1+x+x^2+x^3+...+x^n = (1-x^(n+1))/(1-x)
Using it, you can compute it much, much faster.
The second one is the Taylor series for e^x; you might want to use the standard math library functions exp(x) or pow(e, x) instead.
For the first series, don't you think that using 1 + x(1 + x(1 + x(1 + x)....)) would be a better approach? A similar approach can be applied to the second series: 1 + x/1 (1 + x/2 (1 + x/3 (1 + x/4 (.....))))
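Both nested forms translate directly into short loops. The following Python sketch evaluates the two truncated series from the innermost bracket outward; the function names are illustrative, and n is the index of the last term kept.

```python
import math


def geometric_nested(x, n):
    """1 + x + x^2 + ... + x^n via 1 + x(1 + x(1 + ...)), using n multiplications."""
    total = 1.0
    for _ in range(n):
        total = 1.0 + x * total
    return total


def exp_nested(x, n):
    """1 + x/1! + ... + x^n/n! via 1 + x/1 (1 + x/2 (1 + ... (1 + x/n)))."""
    total = 1.0
    for i in range(n, 0, -1):
        total = 1.0 + (x / i) * total
    return total


print(geometric_nested(0.5, 10), (1 - 0.5 ** 11) / (1 - 0.5))
print(exp_nested(1.0, 15), math.exp(1.0))
```

Each loop iteration costs one multiplication (plus one division for the exponential series), so n terms need about n multiplications and no explicit factorials or powers.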

Factorial of a big number [duplicate]

Possible Duplicate: Calculating factorial of large numbers in C
Consider the problem of calculating the factorial of a number.
When the result is bigger than 2^32, we get an overflow error.
How can we design a program to calculate factorial of big numbers?
EDIT: assume we are using C++ language.
EDIT2: it is a duplicate question of this one
As this question is tagged only "algorithm": your 2^32 is not an issue, because an algorithm can never have an overflow error. Implementations of an algorithm can and do have overflow errors. So what language are you using?
Most languages have a BigNumber or BigInteger that can be used.
Here's a C++ BigInteger library:
I suggest that you google for: c++ biginteger
If you can live with approximate values, consider using the Stirling approximation and compute it in double precision.
If you want exact values, you'll need arbitrary-precision arithmetic and a lot of computation time...
Doing this requires you to take one of a few approaches, but basically boils down to:
splitting your number across multiple variables (stored in an array) and
managing your operations across the array.
That way each int/element in the array has a positional magnitude and can be strung together in the end to make your whole number.
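As a rough illustration of the array approach, here is a Python sketch that stores one decimal digit per list element, least significant first. Python integers are already arbitrary-precision, so this is purely to show the mechanics; the function names are made up for this example.

```python
def multiply_digits(digits, factor):
    """Multiply a little-endian list of base-10 digits by a small positive integer."""
    carry = 0
    for pos in range(len(digits)):
        carry, digits[pos] = divmod(digits[pos] * factor + carry, 10)
    while carry:                     # append any remaining high-order digits
        carry, digit = divmod(carry, 10)
        digits.append(digit)


def big_factorial(n):
    """Return n! as a decimal string, computed digit by digit."""
    digits = [1]                     # least significant digit first
    for factor in range(2, n + 1):
        multiply_digits(digits, factor)
    return "".join(str(d) for d in reversed(digits))


print(big_factorial(25))
```

A real implementation would normally pack several decimal digits (or a full machine word) into each element to cut down the number of operations.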
A good example here in C:
Test this script:
import gmpy as gm
print(gm.fac(3000))
For very big numbers, it is difficult to store or print the result.
For some purposes, such as working out the number of combinations, it is sufficient to compute the logarithm of the factorial, because you will be dividing factorials by factorials and the final result is of a more reasonable size - you just subtract logarithms before taking the exponential of the result.
You can compute the logarithm of the factorial by adding logarithms, or by using the log-gamma function, which is often available in mathematical libraries (there are good ways to approximate this).
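In Python, for instance, the standard library exposes the log-gamma function as math.lgamma, and lgamma(n + 1) equals ln(n!). The sketch below uses it to approximate a number of combinations; log_factorial and combinations are illustrative names, and the result is a floating-point approximation, not an exact integer.

```python
import math


def log_factorial(n):
    """Natural log of n!, using lgamma(n + 1) = ln(n!)."""
    return math.lgamma(n + 1)


def combinations(n, k):
    """Approximate n-choose-k by subtracting logarithms before exponentiating."""
    log_result = log_factorial(n) - log_factorial(k) - log_factorial(n - k)
    return math.exp(log_result)


print(combinations(10, 3))      # close to 120, up to floating-point rounding
print(log_factorial(3000))      # ln(3000!), even though 3000! overflows a double
```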
First invent a way to store and use big numbers. A common way is to interpret an array of integers as the digits of a big number. Then add basic operations to your system, such as multiplication. Then multiply.
Or use already made solutions. Google for: c++ big integer library
You can use BigInteger to find the factorial of big numbers. A long holds exact results only up to 20!; beyond that the product silently overflows, and from 66! onward it wraps around to exactly 0. Please refer to the Java code below. Hope it helps:
import java.math.BigInteger;

public class factorial {
    public static void main(String args[]) {
        factorial f = new factorial();
        System.out.println(f.fact(100)); // example input
    }

    // Multiplies num * (num-1) * ... * 2 using arbitrary-precision integers.
    public BigInteger fact(int num) {
        BigInteger sum = BigInteger.valueOf(1);
        for (int i = num; i >= 2; i--) {
            sum = sum.multiply(BigInteger.valueOf(i));
        }
        return sum;
    }
}
If you want to improve the range of your measurement, you can use logarithms. Logarithms will convert your multiplication to additions making it much smaller to store.
factorial(n) => n * factorial(n-1)
log(factorial(n)) => log(n) + log(factorial(n-1))
5! = 5*4*3*2*1 = 120
log(5!) = log(5) + log(4) + log(3) + log(2) + log(1) = 2.0791812460476247
In this example, I used base 10 logarithms, but any base works.
The result can be read back as 5! = 10^0.0791812460476247 * 10^2 = 1.2 * 10^2.
Implementation example in javascript

How to find a specific sequence within the decimals of PI?

I want to find a specific sequence of digits in the decimals of PI and that involves first computing PI to (quite possibly) infinity. The problem is that I don't know how to make a variable store that many digits or how to just use the newly computed digit so I can compare it to my sequence.
So how can I calculate PI and keep only the last decimal as an integer?
Thanks in advance.
This kind of problem can be solved very elegantly using lazy evaluation, like that found in Haskell, or using generators in Python: produce one digit of Pi at a time and check it against the corresponding position in the target sequence being searched for.
The advantage of either approach is that you don't have to generate a (potentially) infinite sequence of numbers, only generate as much as needed until you find what you're looking for. Of course, if the specific sequence really doesn't appear in the number Pi the algorithm will iterate forever, but at least the computer executing the program won't run out of memory.
Alternatively, you could use the BBP formula, or a similar algorithm that allows the extraction of a specific digit of Pi (the BBP formula itself yields hexadecimal digits).
You can use an iterative algorithm for calculating Pi, for example, the Gauss–Legendre algorithm.
To implement it, you will need a library that does arbitrary-precision arithmetic; one such library is GMP.
Apparently, someone has done most of the work for you:
Here is an implementation in Python of a streaming algorithm described by Jeremy Gibbons in Unbounded Spigot Algorithms for the Digits of Pi (2004), Chapter 6:
def generate_digits_of_pi():
    q = 1
    r = 180
    t = 60
    i = 2
    while True:
        digit = ((i * 27 - 12) * q + r * 5) // (t * 5)
        yield digit
        u = i * 3
        u = (u + 1) * 3 * (u + 2)
        r = u * 10 * (q * (i * 5 - 2) + r - t * digit)
        q *= 10 * i * (i * 2 - 1)
        t *= u
        i += 1
# Demo: print the first 1000 digits
digits = generate_digits_of_pi()
for i in range(1000):
    print(next(digits), end="")
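To search such a stream for a given sequence without storing all the digits, one option is to keep only a sliding window of the most recent digits. The sketch below is one possible way to do that; find_sequence is a hypothetical helper, and the demo uses a short hard-coded list of digits so that it finishes immediately.

```python
from collections import deque


def find_sequence(digits, target):
    """Return the 0-based index where target first appears in the digit stream.

    digits is any iterable of ints (it may be infinite); target is a string
    of decimal digits. Only the last len(target) digits are kept in memory.
    Returns None if a finite stream ends without a match.
    """
    wanted = [int(ch) for ch in target]
    window = deque(maxlen=len(wanted))
    for index, digit in enumerate(digits):
        window.append(digit)
        if list(window) == wanted:
            return index - len(wanted) + 1
    return None


# Demo on the first few digits of pi; a generator such as
# generate_digits_of_pi() could be passed in the same way.
first_digits = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]
print(find_sequence(first_digits, "265"))
```

Index 0 here refers to the leading 3. With an infinite generator the function only returns once the sequence is found, so it will loop forever if the sequence never occurs.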

How to implement square root and exponentiation on arbitrary length numbers?

I'm working on a new data type for arbitrary length numbers (only non-negative integers) and I got stuck at implementing square root and exponentiation functions (only for natural exponents). Please help.
I store the arbitrary length number as a string, so all operations are made char by char.
Please don't include advice to use a different (existing) library or another way to store the number than a string. It's meant to be a programming exercise, not a real-world application, so optimization and performance are not so necessary.
If you include code in your answer, I would prefer it to be in either pseudo-code or in C++. The important thing is the algorithm, not the implementation itself.
Thanks for the help.
Square root: Babylonian method. I.e.
function sqrt(N):
    oldguess = -1
    guess = 1
    while abs(guess-oldguess) > 1:
        oldguess = guess
        guess = (guess + N/guess) / 2
    return guess
Exponentiation: by squaring.
function exp(base, pow):
    result = 1
    bits = toBinary(pow)
    for bit in bits:
        result = result * result
        if (bit):
            result = result * base
    return result
where toBinary returns a list/array of 1s and 0s, MSB first, for instance as implemented by this Python function:
def toBinary(x):
    return [1 if b == '1' else 0 for b in bin(x)[2:]]
Note that if your implementation is done using binary numbers, this can be implemented using bitwise operations without needing any extra memory. If using decimal, then you will need the extra memory to store the binary encoding.
However, there is a decimal version of the algorithm, which looks something like this:
function exp(base, pow):
    lookup = [1, base, base*base, base*base*base, ...] #...up to base^9
    #The above line can be optimised using exp-by-squaring if desired
    result = 1
    digits = toDecimal(pow)
    for digit in digits:
        #Moving one decimal digit along raises result to the 10th power, not the 2nd
        result = (result ^ 10) * lookup[digit] #^ means "to the power of" here
    return result
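For a concrete check of the decimal variant, here is a small Python sketch that takes the exponent as a decimal string, most significant digit first, which is close to how a string-based number type would store it. It uses Python's built-in integers for the arithmetic, and power_decimal is an illustrative name.

```python
def power_decimal(base, exponent_digits):
    """Compute base**e where e is given as a decimal string, most significant digit first."""
    # lookup[d] == base**d for d in 0..9
    lookup = [1]
    for _ in range(9):
        lookup.append(lookup[-1] * base)

    result = 1
    for ch in exponent_digits:
        # Shifting the exponent one decimal place raises the result to the 10th power.
        squared = result * result            # result^2
        fourth = squared * squared           # result^4
        result = fourth * fourth * squared   # result^8 * result^2 = result^10
        result *= lookup[int(ch)]
    return result


print(power_decimal(3, "13") == 3 ** 13)
```

Each exponent digit costs four multiplications for the tenth power plus one for the table entry, so the work grows with the number of digits in the exponent and not with its value.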
Exponentiation is trivially implemented with multiplication - the most basic implementation is just a loop,
result = 1;
for (int i = 0; i < power; ++i) result *= base;
You can (and should) implement a better version using squaring with divide & conquer - i.e. a^5 = a^4 * a = (a^2)^2 * a.
Square root can be found using Newton's method - you have to get an initial guess (a good one is to take a square root from the highest digit, and to multiply that by base of the digits raised to half of the original number's length), and then to refine it using division: if a is an approximation to sqrt(x), then a better approximation is (a + x / a) / 2. You should stop when the next approximation is equal to the previous one, or to x / a.
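A minimal Python sketch of this iteration for non-negative integers follows. It deviates from the description in one respect, which is an assumption of this example: the initial guess is a power of two that is at least sqrt(n) (not derived from the highest decimal digit), because starting at or above the root makes the sequence decrease until it reaches the floor of the square root. The name integer_sqrt is illustrative.

```python
def integer_sqrt(n):
    """Return floor(sqrt(n)) for an integer n >= 0 using Newton's iteration."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Initial guess: a power of two that is at least sqrt(n).
    guess = 1 << ((n.bit_length() + 1) // 2)
    while True:
        better = (guess + n // guess) // 2
        if better >= guess:      # no longer decreasing: guess is the floor root
            return guess
        guess = better


print(integer_sqrt(10 ** 40))
```

The only operations needed on the big-number type are addition, integer division and comparison, so the same loop carries over to a string-based representation once those are implemented.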
