Run time of this prime factor function? - algorithm

I wrote this prime factorization function; can someone explain its runtime to me? It seems fast to me because it continuously decomposes a number into primes without having to check whether the factors are prime, and in the worst case it runs from 2 up to the number.
I know that no known algorithm can factor integers in polynomial time. Also, how does the runtime relate asymptotically to factoring large numbers?
function getPrimeFactors(num) {
    var factors = [];
    for (var i = 2; i <= num; i++) {
        if (num % i === 0) {
            num = num / i;
            factors.push(i);
            i--;
        }
    }
    return factors;
}

In your example, if num is prime the loop takes exactly num - 1 steps, so the algorithm's runtime is O(num) (where big O describes the pessimistic, worst case). But for algorithms that operate on numbers, things get a little trickier (thanks for noticing, thegreatcontini and Chris)! We always describe complexity as a function of the input size. Here the input is the number num, and it is represented with log(num) bits, so the input size is k = log(num). Because num = 2^log(num) = 2^k, your algorithm is of complexity O(2^k), that is, exponential in the size of the input.
This is what makes the problem hard: the input is very, very small, so any runtime polynomial in num is exponential in the input size ...
On a side note, @rici is right: you only need to check up to sqrt(num), easily reducing the runtime to O(sqrt(num)), or more precisely O(sqrt(2)^k) = O(2^(k/2)).
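For illustration, here is a minimal sketch of that sqrt(num) variant (the changed loop bound and the final push are my additions, not part of the original code):

function getPrimeFactorsFast(num) {
    var factors = [];
    // Trial-divide only up to sqrt(num): once i * i > num,
    // whatever is left of num is itself prime (or 1).
    for (var i = 2; i * i <= num; i++) {
        while (num % i === 0) {
            factors.push(i);
            num = num / i;
        }
    }
    if (num > 1) {
        factors.push(num); // the remaining factor is prime
    }
    return factors;
}

getPrimeFactorsFast(84); // [2, 2, 3, 7]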

Does manipulating n have any impact on the O of an algorithm?
Recursive code, for example:
public void Foo(int n)
{
    n -= 1;
    if (n <= 0) return;
    n -= 1;
    if (n <= 0) return;
    Foo(n);
}
Does the reassignment of n impact O(N)? Sounds intuitive to me...
Is this algorithm O(N) once we drop the constant? Technically, since it decrements n by 2, it does not have the same mathematical effect as this:
public void Foo(int n) // O(log n)
{
    if (n <= 0) return;
    Console.WriteLine(n);
    Foo(n / 2);
}
But wouldn't the halving of n contribute to O(N), since you are only touching half of n? To be clear, I am learning big O notation and its subtleties. I have been looking for cases like the first example, but I am having a hard time finding such a specific answer.
The reassignment of n itself is not really what matters when talking about O notation. As an example consider a simple for-loop:
for i in range(n):
    do_something()
In this algorithm, we do something n times. This is equivalent to the following algorithm:
while n > 0:
    do_something()
    n -= 1
It is also equivalent to the first recursive function you presented. So what really matters is how many computations are done relative to the input size, which is the original value of n.
For this reason, all three of these algorithms are O(n) algorithms, since all three of them decrease the 'input size' by one each time. Even if they had decreased it by 2, they would still be O(n) algorithms, since constants don't matter in O notation. Thus the following algorithms are also O(n):
while n > 0:
    do_something()
    n -= 2
or
while n > 0:
    do_something()
    n -= 100000
However, the second recursive function you presented is an O(log n) algorithm (even though it does not have a base case and would technically run until the stack overflows), as you've written in the comments. Intuitively, what happens is that when you halve the input size every time, the number of halvings is exactly the logarithm in base two of the original input number. Consider the following:
n = 32. The algorithm halves every time: 32 -> 16 -> 8 -> 4 -> 2 -> 1.
In total, we did 5 computations. Equivalently log2(32) = 5.
So to recap: what matters is the original input size and how many computations are done relative to that input size. Whatever constant may affect the computations does not matter.
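To make that concrete, here is a small sketch (the helper is mine) that just counts steps for the decrementing and halving patterns:

function countSteps(n, next) {
    var steps = 0;
    while (n > 0) {
        steps++;
        n = next(n);
    }
    return steps;
}

// Decrementing by a constant grows linearly in n:
countSteps(1024, function (n) { return n - 1; });             // 1024
countSteps(1024, function (n) { return n - 2; });             // 512, still O(n)
// Halving grows logarithmically:
countSteps(1024, function (n) { return Math.floor(n / 2); }); // 11, about log2(n)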
If I misunderstood your question, or you have follow-up questions, feel free to comment on this answer.

Is the measurement of the big O notation correct in my algorithm?

Here is the link to the algorithm without comments, to see it better:
function getLastFactor(n) {
    var tempArr = [2, 3], max = Math.sqrt(n); /* (1 + 1)
        negligible
    */
    for (var i = 5; i < max; i += 2) { /*
        (sqrt(n) - 5) ---> (the 5 is negligible)
    */
        if (n % i) continue; /*
            sqrt(n) - 5 operations
            PS: Should I add this, or is it already included in the for loop count (the sqrt(n) - 5 above)?
        */
        if (check(tempArr, i)) continue; // How do I measure this if I am not sure how many iterations it does?
        tempArr.push(i); // 1 operation, negligible
    }
    return tempArr[tempArr.length - 1]; // 1 operation
}
function check(array, i) {
    for (var j = 0, l = array.length; j < l; ++j) { // sqrt(n) operations
        var cur = array[j]; // 1 operation
        if (!(i % cur)) return true; // 1 operation
    }
}
// O(3 * sqrt(n))
I do not know what really adds up. I have read that it is not important, since big O notation standardizes that by eliminating lower-order terms. But I have many doubts, like the ones I left in the code itself, and also:
1) Should I count operations that depend on a conditional? Imagine I have a condition; if it evaluates true, a loop runs that adds n operations. Should that be taken into account, since it makes a big difference?
2) I think the efficiency of this code is O(sqrt(n) * 3); is this correct?
This question is not a duplicate of another. I have read a lot around the web, and especially in this community, and almost all the questions and answers are purely theoretical; I have almost never seen theoretical and practical examples at the same time.
First of all, when using big O notation, drop all constants, so O(sqrt(n) * 3) is actually O(sqrt(n)).
To correctly analyze the asymptotic complexity of this code we need some background in number theory. What this algorithm basically does is determine the prime factors of n (only the greatest is returned). The main part of the program is a for loop that iterates over all odd numbers from 5 to sqrt(n), so the number of iterations is (sqrt(n) - 5) / 2, or in big O terms, O(sqrt(n)).
Next, the statement if (n % i) continue; eliminates all numbers that are not divisors of n (regardless of whether they are prime or not). So the rest of the code is executed only when i is a divisor of n. An asymptotic bound for the number of divisors is O(n^(1 / ln(ln(n)))).
Finally, the function check iterates over the array tempArr, which contains the prime divisors of n found so far. How many prime divisors can a positive integer n have? An asymptotic bound for the number of prime divisors in the worst case (when n is a so-called primorial number) is O(ln(n) / ln(ln(n))).
Let's now sum everything up. Even if we assume that n is a primorial and that all prime divisors are found quickly (so array.length has the maximum possible value), the asymptotic complexity of the code after if (n % i) continue; is O(n^(1 / ln(ln(n))) * ln(n) / ln(ln(n))). It is not easy to see, but this grows more slowly than O(sqrt(n)), so the total complexity is O(sqrt(n)).
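As a rough numeric check (my own illustration, not part of the original answer), a primorial makes the prime-divisor bound visible:

// n = 2*3*5*7*11*13 is a primorial: a worst case for the number of
// distinct prime divisors. Compare that count with ln(n) / ln(ln(n)).
var primes = [2, 3, 5, 7, 11, 13];
var n = primes.reduce(function (acc, p) { return acc * p; }, 1); // 30030
var bound = Math.log(n) / Math.log(Math.log(n));
console.log(primes.length, bound.toFixed(2)); // 6 vs "4.42" - same order of growth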

Calculation of the Euler phi function

int phi(int n) {
    int result = n;
    for (int i = 2; i * i <= n; ++i)
        if (n % i == 0) {
            while (n % i == 0)
                n /= i;
            result -= result / i;
        }
    if (n > 1)
        result -= result / n;
    return result;
}
I saw the above implementation of the Euler phi function, which is O(sqrt(n)). I don't understand the use of i*i <= n in the for loop, or the need to change n. It is said that it can be done in time much smaller than O(sqrt(n)). How? link (in Russian)
i*i <= n is the same as i <= sqrt(n), so the iteration runs only on the order of sqrt(n) steps.
Using the straight definition of the Euler totient function, you are supposed to find the prime numbers that divide n.
The function is a straightforward implementation of integer factorization by trial division, except that instead of reporting the factors as it finds them, the function uses the factors to calculate phi. Phi can be computed in time less than O(sqrt(n)) by using a better algorithm to find the factors; the best way to do that depends on the magnitude of n.
If the biggest number N that you will ever want the totient of is small enough to hold a table of size N in memory, then you can do a lot better per evaluation, at the cost of having to build the table before any evaluations.
One approach would be to build a table of primes first, and then instead of using trial division by every integer at most sqrt(n), use trial division by every prime at most sqrt(n).
You could improve on this by building, instead of a table of primes, a table that gives, for each integer 2..N, the smallest prime that divides it. A simple modification of the usual Sieve of Eratosthenes can build such a table. Then to compute the totient of a number, use the table to find the smallest prime dividing it (and accumulate that into your answer), divide the number by that table entry, use the table to find the smallest prime dividing the quotient, and so on; see the sketch below.
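A minimal sketch of that table-based approach (all names are mine, and it assumes N is small enough for an in-memory array):

// Build a table spf[2..N] of each number's smallest prime factor,
// via a small modification of the Sieve of Eratosthenes.
function buildSpfTable(N) {
    var spf = new Array(N + 1).fill(0);
    for (var i = 2; i <= N; i++) {
        if (spf[i] === 0) {                    // i is prime
            for (var j = i; j <= N; j += i) {
                if (spf[j] === 0) spf[j] = i;
            }
        }
    }
    return spf;
}

// phi(n) = n * product over distinct primes p dividing n of (1 - 1/p),
// accumulated as result -= result / p, exactly like the code above.
function phiFromTable(n, spf) {
    var result = n;
    while (n > 1) {
        var p = spf[n];
        result -= result / p;
        while (n % p === 0) n /= p;            // strip all copies of p
    }
    return result;
}

var spf = buildSpfTable(100);
phiFromTable(36, spf); // 12, since phi(36) = 36 * (1/2) * (2/3)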

The efficiency of an algorithm

I am having a hard time understanding the efficiency of an algorithm and how you really determine it: how do you tell that a particular statement or part is lg n, O(N), or log base 2 of n?
I have two examples over here.
doIt() can be expressed as O(n^2).
First example.
i = 1
loop (i < n)
    doIt(…)
    i = i × 2
end loop
The cost of the above is as follows:
i = 1          ... 1
loop (i < n)   ... lg n
    doIt(…)    ... n^2 lg n
    i = i × 2  ... lg n
end loop
Second example:
static int myMethod(int n) {
    int i = 1;
    for (i = 1; i <= n; i = i * 2)
        doIt();
    return 1;
}
The cost of the above is as follows:
static int myMethod(int n) {           ... 1
    int i = 1;                         ... 1
    for (i = 1; i <= n; i = i * 2)     ... log base 2 (n)
        doIt();                        ... log base 2 (n) * n^2
    return 1;                          ... 1
}
All this has left me wondering: how do you really find out which cost is which? I've been asking around, trying to understand, but there is really no one who can explain this to me. I really want to understand how to determine the cost. Can anyone help me with this?
Big O notation does not measure how long the program will run. It says how fast the running time increases as the size of the problem grows.
For example, if calculating something is O(1), that could be a very long time, but it is independent of the size of the problem.
Normally, you are not expected to estimate the cost of things like the loop iterator (storing one integer value and changing it N times is too minor to include in the estimate).
What really matters is that, in terms of Big O, Big Theta, etc., you are expected to find a functional dependence, i.e. a function of one argument (N) for which:
Big O: the algorithm's total operation count grows no faster than F(N)
Big Theta: the algorithm's total operation count grows at the same rate as F(N)
Big Omega: the algorithm's total operation count grows at least as fast as F(N)
So remember: you are not trying to find the exact number of operations; you are trying to find a functional estimate, i.e. the dependence between the amount of incoming data N and some function of N that indicates how fast the operation count grows.
So O(1), for example, indicates that the whole algorithm does not depend on N (it is constant). You can read more here.
Also, there are different kinds of estimates. You can estimate memory or execution time, for example; in general those are different estimates.

Reverse factorial

Well, we all know that if N is given it's easy to calculate N!. But what about the inverse?
N! is given and you have to find N. Is that possible? I'm curious.
1. Set X = 1.
2. Generate F = X!
3. Is F equal to the input? If yes, then X is N.
4. If not, set X = X + 1, then start again at step 2.
You can optimize by using the previous result of F to compute the new F (new F = new X * old F).
It's just as fast as going the opposite direction, if not faster, given that division generally takes longer than multiplication. A given factorial A! is guaranteed to have all integers less than A as factors in addition to A, so you'd spend just as much time factoring those out as you would just computing a running factorial.
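A minimal sketch of that running-product idea, using BigInt so large inputs don't overflow (the function name is mine):

function inverseFactorial(q) { // q is assumed to be a positive BigInt
    var x = 1n, f = 1n;        // f tracks x! incrementally: new f = new x * old f
    while (f < q) {
        x += 1n;
        f *= x;
    }
    return f === q ? x : null; // null means q is not a factorial
}

inverseFactorial(120n);     // 5n
inverseFactorial(3628800n); // 10n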
If you have Q=N! in binary, count the trailing zeros. Call this number J.
If N is 2K or 2K+1, then J is equal to 2K minus the number of 1's in the binary representation of 2K. So starting from J, add 1 over and over until the number of 1's you have added equals the number of 1's in the result; that result is 2K.
Now you know 2K, and N is either 2K or 2K+1. To tell which one it is, count the factors of the biggest prime (or any prime, really) in 2K+1, and use that to test Q=(2K+1)!.
For example, suppose Q (in binary) is
1111001110111010100100110000101011001111100000110110000000000000000000
(Sorry it's so small, but I don't have tools handy to manipulate larger numbers.)
There are 19 trailing zeros, which is
10011
Now increment:
1: 10100
2: 10101
3: 10110 bingo!
So N is 22 or 23. I need a prime factor of 23, and, well, I have to pick 23 (it happens that 2K+1 is prime, but I didn't plan that and it isn't needed). So 23^1 should divide 23!, it doesn't divide Q, so
N=22
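To make the method above concrete, here is one possible sketch in JavaScript with BigInt (it assumes the input really is N! for some N >= 2; all helper names are mine):

function popcount(n) { // number of 1 bits in a small non-negative integer
    var c = 0;
    while (n > 0) { c += n & 1; n >>= 1; }
    return c;
}

function legendre(m, p) { // exponent of prime p in m! (Legendre's formula)
    var e = 0;
    for (var pk = p; pk <= m; pk *= p) e += Math.floor(m / pk);
    return e;
}

function valuation(x, p) { // exponent of prime p in the BigInt x
    var P = BigInt(p), e = 0;
    while (x % P === 0n) { x /= P; e++; }
    return e;
}

function smallestPrimeFactor(m) {
    for (var d = 2; d * d <= m; d++) if (m % d === 0) return d;
    return m;
}

function inverseFactorialByTrailingZeros(q) {
    // J = trailing zero bits of N!, and m - popcount(m) === J for m = 2K and 2K+1.
    var J = 0;
    while ((q & 1n) === 0n) { q >>= 1n; J++; }
    // Increment from J until the count of added 1's matches the popcount: that is 2K.
    var twoK = J;
    while (twoK - popcount(twoK) !== J) twoK++;
    // N is 2K or 2K+1; disambiguate with a prime factor p of 2K+1:
    // (2K+1)! contains p to a strictly higher power than (2K)! does.
    var p = smallestPrimeFactor(twoK + 1);
    return valuation(q, p) === legendre(twoK + 1, p) ? twoK + 1 : twoK;
}

inverseFactorialByTrailingZeros(1124000727777607680000n); // 22 (the Q from the example)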
int inverse_factorial(int factorial) {
    int current = 1;
    while (factorial > current) {
        if (factorial % current) {
            return -1; // not divisible, so not a factorial
        }
        factorial /= current;
        ++current;
    }
    if (current == factorial) {
        return current;
    }
    return -1;
}
Yes. Let's call your input x. For small values of x, you can just try all values of n and see if n! = x. For larger x, you can binary-search over n to find the right n (if one exists). Note that n! ≈ e^(n ln n - n) (this is Stirling's approximation), so you know approximately where to look.
The problem, of course, is that very few numbers are factorials, so your question makes sense for only a small set of inputs. If your input is small (e.g. fits in a 32-bit or 64-bit integer), a lookup table would be the best solution.
(You could of course consider the more general problem of inverting the Gamma function. Again, binary search would probably be the best way, rather than something analytic. I'd be glad to be shown wrong here.)
Edit: Actually, in the case where you don't know for sure that x is a factorial number, you may not gain all that much (or anything) with binary search using Stirling's approximation or the Gamma function, over simple solutions. The inverse factorial grows slower than logarithmic (this is because the factorial is superexponential), and you have to do arbitrary-precision arithmetic to find factorials and multiply those numbers anyway.
For instance, see Draco Ater's answer for an idea that (when extended to arbitrary-precision arithmetic) will work for all x. Even simpler, and probably even faster because multiplication is faster than division, is Dav's answer which is the most natural algorithm... this problem is another triumph of simplicity, it appears. :-)
Well, if you know that M is really the factorial of some integer, then you can use
n! = Gamma(n+1) = sqrt(2*PI) * exp(-n) * n^(n+1/2) * (1 + O(1/n))
You can solve this (or, really, solve ln(n!) = ln Gamma(n+1)) and find the nearest integer.
It is still nonlinear, but you can get an approximate solution by iteration easily (in fact, I expect the n^(n+1/2) factor is enough).
Multiple ways. Use lookup tables, use binary search, use a linear search...
Lookup tables is an obvious one:
for (i = 0; i < MAX; ++i)
    Lookup[i!] = i; // you can calculate i! incrementally in O(1)
You could implement this using hash tables for example, or if you use C++/C#/Java, they have their own hash table-like containers.
This is useful if you have to do this a lot of times and each time it has to be fast, but you can afford to spend some time building this table.
Binary search: the answer lies between 1 and x, where x = N! is the input. Take the midpoint m = (1 + x) / 2. Is m! larger than x? If yes, continue the search between 1 and m; otherwise between m + 1 and x. Apply this logic recursively.
Of course, these numbers might be very big and you might end up doing a lot of unwanted operations. A better idea is to search between 1 and sqrt(N!) using binary search, or try to find even better approximations, though this might not be easy. Consider studying the gamma function.
Linear search: probably the best in this case. Calculate 1*2*3*...*k until the product equals N!, and output k.
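A sketch of the binary-search idea with BigInt (finding an upper bound by doubling first; the names are mine, and factorial(mid) is recomputed at each probe to keep the sketch short):

function factorial(n) { // n is a BigInt
    var f = 1n;
    for (var i = 2n; i <= n; i++) f *= i;
    return f;
}

function inverseFactorialBinary(x) { // x is a BigInt, presumed to be N!
    var hi = 1n;
    while (factorial(hi) < x) hi *= 2n; // doubling gives an upper bound on N
    var lo = 1n;
    while (lo < hi) {                   // classic lower-bound binary search
        var mid = (lo + hi) / 2n;       // BigInt division truncates
        if (factorial(mid) < x) lo = mid + 1n;
        else hi = mid;
    }
    return factorial(lo) === x ? lo : null;
}

inverseFactorialBinary(3628800n); // 10n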
If the input number is really N!, it's fairly simple to calculate N.
A naive approach computing factorials will be too slow, due to the overhead of big integer arithmetic. Instead we can notice that, when N ≥ 7, each factorial can be uniquely identified by its length (i.e. number of digits).
The length of an integer x can be computed as floor(log10(x)) + 1.
Product rule of logarithms: log(a*b) = log(a) + log(b)
By using the above two facts, we can say that the length of N! is floor(log10(1) + log10(2) + ... + log10(N)) + 1, since log(1*2*3*...*N) = log(1) + log(2) + ... + log(N). So it can be computed by simply adding log10(i) until we reach the length of our input number.
This C++ code should do the trick:
double result = 0;
for (int i = 1; i <= 1000000; ++i) { // this should work up to 1000000! (where N! has about 10^7 digits)
    result += log10(i);
    if ((int)result + 1 == (int)inputNumber.size()) { // assuming inputNumber is N! given as a string of digits
        std::cout << i << std::endl;
        break;
    }
}
(Remember to check the cases where n < 7; a basic factorial calculation is fine there.)
Complete code: https://pastebin.com/9EVP7uJM
Here is some Clojure code:
(defn- reverse-fact-help [n div]
  (cond (not (= 0 (rem n div))) nil
        (= 1 (quot n div)) div
        :else (reverse-fact-help (/ n div) (+ div 1))))

(defn reverse-fact [n] (reverse-fact-help n 2))
Suppose n=120, div=2. 120/2=60, 60/3=20, 20/4=5, 5/5=1, return 5
Suppose n=12, div=2. 12/2=6, 6/3=2, 2/4=.5, return 'nil'
int p = 1, i;
// assume the variable fact_n holds the value n!
for (i = 2; p < fact_n; i++) p = p * i;
// i - 1 is the number you are looking for if p == fact_n; otherwise fact_n is not a factorial
I know it isn't pseudocode, but it's pretty easy to understand
inverse_factorial(X)
{
    X_LOCAL = X;
    ANSWER = 1;
    while (1) {
        if (X_LOCAL / ANSWER == 1)
            return ANSWER;
        X_LOCAL = X_LOCAL / ANSWER;
        ANSWER = ANSWER + 1;
    }
}
This function is based on successive approximations! I created it and implemented it in Advanced Trigonometry Calculator 1.7.0
double arcfact(double f) {
    double result = 0, precision = 1000;
    int i = 0;
    if (f > 0) {
        while (precision > 1E-309) {
            while (f > fact(result + precision) && i < 10) { // fact() is the calculator's own factorial routine
                result = result + precision;
                i++;
            }
            precision = precision / 10;
            i = 0;
        }
    }
    else {
        result = 0;
    }
    return result;
}
If you do not know whether a number M is N! or not, a decent test is to check that it is divisible by all the small primes up to the point where the Stirling approximation of the factorial of that prime becomes larger than M. Alternatively, if you have a table of factorials but it doesn't go high enough, you can pick the largest factorial in your table and make sure M is divisible by it.
In C, from my app Advanced Trigonometry Calculator v1.6.8:
double arcfact(double f) {
    double i = 1, result = f;
    while ((result / (i + 1)) >= 1) {
        result = result / i;
        i++;
    }
    return result;
}
What do you think about that? It works correctly for factorials of integers.
Simply divide by successive positive integers, i.e.: 5! = 120 ->> 120/2 = 60 || 60/3 = 20 || 20/4 = 5 || 5/5 = 1
So the last number before the result reaches 1 is your number.
In code you could do the following:
res = number
for (x = 2; res != x; x++) {
    res = res / x
}
// x is your answer
or something like that. This calculation needs improvement for numbers that are not exact factorials.
Most numbers are not in the range of outputs of the factorial function. If that is what you want to test, it's easy to get an approximation using Stirling's formula or the number of digits of the target number, as others have mentioned, then perform a binary search to determine factorials above and below the given number.
What is more interesting is constructing the inverse of the Gamma function, which extends the factorial function to positive real numbers (and to most complex numbers, too). It turns out construction of an inverse is a difficult problem. However, it was solved explicitly for most positive real numbers in 2012 in the following paper: http://www.ams.org/journals/proc/2012-140-04/S0002-9939-2011-11023-2/S0002-9939-2011-11023-2.pdf . The explicit formula is given in Corollary 6 at the end of the paper.
Note that it involves an integral on an infinite domain, but with a careful analysis I believe a reasonable implementation could be constructed. Whether that is better than a simple successive approximation scheme in practice, I don't know.
C/C++ code to recover the number from its factorial (r is the given factorial):
int wtf(int r) {
    int f = 1;
    while (r > 1)
        r /= ++f;
    return f;
}
Sample tests:
Call: wtf(1)
Output: 1
Call: wtf(120)
Output: 5
Call: wtf(3628800)
Output: 10
Based on the suggested calculation above: a full inverse factorial, valid for x > 1, when the factorial is given in full binary form. The algorithm is:
1. Suppose the input is the factorial x, x = n!
2. Return 1 for 1
3. Find the number of trailing 0's in the binary expansion of x; call it t
4. Calculate x / fact(t), i.e. x divided by the factorial of t, mathematically x / (t!)
5. Take the integer part of log base (t + 1) of x / fact(t); call it m
6. Return m + t
#include <math.h>

__uint128_t factorial(int n); // assumed to be defined elsewhere

int invert_factorial(__uint128_t fact)
{
    if (fact == 1) return 1;
    // __builtin_ffs only looks at 32 bits; the 64-bit variant on the low
    // word is enough here, since the trailing-zero count stays below 64
    int t = __builtin_ffsll((unsigned long long)fact) - 1;
    int res = fact / factorial(t);
    return t + (int)(log(res) / log(t + 1.0));
}

128-bit arithmetic gives in around 34! (35! no longer fits in 128 bits).
