Why is the naive primality test algorithm not polynomial?

I would like to understand why the following naive primality test algorithm is not polynomial.
IsPrime (n: an integer)
Begin
    For i = 2 to n-1 do
        If (n % i == 0) then
            return (no)
        EndIf
    EndFor
    return (yes)
End
This algorithm is said to be exponential in the size of the input n. Why is that true? And why is the following sorting test algorithm said to be polynomial rather than exponential?
IsSorted (T[n]: an array of n integers)
Begin
    For i = 1 to n-1 do
        If (T[i] > T[i+1]) then
            return (no)
        EndIf
    EndFor
    return (yes)
End
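For concreteness, here is a direct Python transcription of the two routines (a sketch; the Python function names are mine):

def is_prime(n):
    for i in range(2, n):          # n - 2 iterations in the worst case
        if n % i == 0:
            return False
    return True

def is_sorted(t):
    for i in range(len(t) - 1):    # always at most n - 1 comparisons
        if t[i] > t[i + 1]:
            return False
    return True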

The input size is typically measured in bits: to represent the number n, the input size is about log2(n) bits. The naive primality test is linear in n, but exponential in log2(n). For example, a 64-bit n forces the loop to run on the order of 2^64 times.

The naive primality test is polynomial in the value of the input (that is, the actual number the function receives), but exponential in the size (bits, bytes, etc.) of the input.
If you have a number n consisting of b bits, we have b = O(log n) and also n = O(2^b).
The running time is thus O(n), or equivalently O(2^b).
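To see the blow-up concretely, here is a small Python sketch (the function name is mine) relating worst-case iteration counts to the bit length b:

def is_prime_naive(n):
    # trial division from 2 to n-1, exactly as in the question's pseudocode
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

# Worst case (n prime): the loop body runs n - 2 times, i.e. about 2^b times
# for a b-bit input, so each extra input bit doubles the worst-case work.
for b in (10, 20, 30, 40):
    print(f"{b}-bit n: up to ~2**{b} = {2**b:,} iterations")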

To second the answer given by Henry, the algorithm in the original question in fact has a polynomially bounded running time - if unary encoding is used for the input!
More precisely, the runtime bound depends not only on the algorithm itself but also on the encoding scheme used for the input. Consider the following algorithm in C-like syntax.
INPUT: integer n
for (int i = 0; i < n; i++)
{
    wait one second
}
Clearly, the algorithm takes n seconds to terminate; the time is linear in n. If the input is encoded in unary, the running time scales linearly in the encoding length of n. However, if n is encoded in binary, the running time scales exponentially in the encoding length of n (as the encoding length of n scales logarithmically in the value of n).
In a nutshell: without further information about the encoding, the statement that the algorithm in the question is not polynomial is not correct. However, it is a convention that binary encoding (or any other positional notation) is assumed unless otherwise stated.
That being said, I admit that the dependence of the runtime bound on the encoding scheme tends to be taught a bit imprecisely. The term pseudo-polynomial is also floating around in this context.
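As a tiny illustration of this encoding dependence (a sketch; the value of n is arbitrary):

n = 1000

unary = "1" * n         # unary encoding: length n
binary = bin(n)[2:]     # binary encoding: length ~log2(n)

print(len(unary))       # 1000 -> n units of work are linear in this length
print(len(binary))      # 10   -> the same work is ~2^length for this input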

Related

Time complexity of a loop with value increasing in powers of 2

for(i=1;i<=n;i=pow(2,i)) { print i }
What will be the time complexity of this?
The approximate k-th value of i will be pow(2, pow(2, pow(2, ...))) nested k times.
How can that k-th value of i, subject to i < n, be solved for k?
What you have is similar to tetration(2,n), but it is not quite that, because your ending condition is different.
The complexity depends greatly on the domain and the implementation. From your sample code I infer an integer domain.
This function grows really fast, so after 5 iterations you need bigints, where even +, -, *, /, <<, >> are no longer O(1). The implementations of pow and print also have a great impact.
For small n < tetration(2,4) you can assume the complexity is O(1), as there is no asymptotic behaviour to speak of for such small n.
Beware that pow is floating-point in most languages, and raising 2 to the power i can be translated into a simple bit shift, so let us assume this:
for (i=1;i<=n;i=1<<i) print(i);
We could use the previous state of i to compute 1<<i incrementally, since the new i equals the old one shifted left by (i - i0), where i0 is the value of i from the iteration before (initially 0):
t=i; i<<=(i-i0); i0=t;
but there is no speedup on such big numbers.
Now the complexity of decimal print(i) is one of the following:
O( log(i))               // power-of-10 datawords (like 1000000000 for 32 bit)
O((log(i))^2)            // power-of-2 datawords, naive print implementation
O( log(i).log(log(i)))   // power-of-2 datawords, subdivision or FFT based print implementation
The complexity of the bit shift 1<<i and the comparison i<=n is:
O(log(i))                // power-of-2 datawords
So choosing the best print implementation for power-of-2 datawords leads to a per-iteration cost of:
O(log(i).log(log(i)) + log(i) + log(i)) -> O(log(i).log(log(i)))
At first glance one would think we need to know the number of iterations k as a function of n:
n = tetration(2,k)
k = slog2(n)
or in Knuth's up-arrow notation, which is directly related to the Ackermann function:
n = 2↑↑k
k = 2↓↓n
but the number of iterations is so small in comparison to the cost of the work inside the loop, and each iteration grows so fast that the previous one is a negligible fraction of the next, that we can ignore all of them and consider only the last term/iteration...
After all these assumptions I got the final complexity:
O(log(n).log(log(n)))
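As a sanity check, here is a small sketch in Python (which has bigints built in) counting the actual iterations for an enormous bound; the variable names are mine:

n = 1 << 1000          # a 1001-bit bound
i, iterations = 1, 0
while i <= n:
    iterations += 1
    i = 1 << i         # i = 2^i, as in the loop in the question
print(iterations)      # 5: i visits 1, 2, 4, 16, 65536, and 2^65536 > n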

Does time complexity change as we change the programming language?

In Python, if we need to find the maximum element of a list, we use
>>> max(listname)
to get the maximum number from the list.
Time complexity: O(1)
If we use C/C++, we need to iterate over the list in a loop and track the max.
Time complexity: O(n) or O(n^2)
Hence, does the time complexity change with the programming language?
No. max(listname) is always O(N), where N is the length of listname (*). This is just by definition of complexity: the iteration has to happen somewhere - maybe in your code (in the case of C/C++), or maybe in library code (in the case of Python).
Sometimes we ignore some complexity, usually because it is beneath our notice; for example, x * y in Python (with x and y being integers) is actually not O(1), because x and y are of arbitrary length in Python, and the * operation thus does execute in loops, but we do simplify and treat it as O(1), as it is most often the case that the integers are of small enough size for it not to matter.
The choice of programming language only matters inasmuch as it affects our perception of what we can ignore; but it does not actually change the time complexity.
*) It is only O(N) when N is proportional to the input size; it is O(1) if N is known in advance or bounded. For example, lst = read_list(); m = max(lst) is O(N), but lst = [1, 2, 3, 4]; m = max(lst) is O(1), as is lst = read_list(); lst = lst[:4]; m = max(lst).
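To make the hidden loop explicit, here is a sketch of the linear scan that any max implementation must perform (my_max is a hypothetical stand-in, not Python's actual source):

def my_max(lst):
    best = lst[0]
    for x in lst[1:]:      # N - 1 comparisons: O(N) in every language
        if x > best:
            best = x
    return best

print(my_max([3, 1, 4, 1, 5, 9, 2, 6]))  # 9, after scanning all 8 elements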

How to calculate the runtime complexity of nested loops with variable lengths

Suppose I have a task to write an algorithm that runs through an array of strings and checks whether each string in the array contains the character 'c'. The algorithm has two nested loops; here is the pseudocode:
for (let i = 0; i < a.length; i++)
    for (let j = 0; j < a[i].length; j++)
        if (a[i][j] === 'c')
            do something
Now, the task is to identify the runtime complexity of the algorithm. Here is my reasoning:
Let the number of elements in the array be n, and the maximum length of the string values m. The general formula for the complexity is then
n x m
Now the possible cases.
If the maximum length of string values is equal to the number of elements, I get the complexity:
n^2
If the maximum length of elements is less than the number of elements by some number a, the complexity is
n x (n - a) = n^2 - na
If the maximum length of elements is more than the number of elements by some number a, the complexity is
n x (n + a) = n^2 + na
Since we discard lower-order terms, it seems that the complexity of the algorithm is n^2. Is my reasoning correct?
Your time complexity is just the total number of characters. Which analysis applies depends entirely on which of your assumptions about the relationship between word length and word count holds true. Note in particular that your statement that the time complexity is N x M, where M is the longest string in the array, is not correct. (It is correct in the sense that it places an upper bound, but that bound is not tight, so it is not very interesting; it is correct in the same sense that N^2 x M^2 is correct.)
I think that in many real cases of interest, your analysis is incorrect. The total number of characters equals the number of strings times the average number of characters per string, i.e. the average word length (note: average, not maximum!). As the number of strings becomes large, the average sampled word length approaches the mean of whatever distribution you are sampling from. So, at least for any well-behaved distribution with iid sampling, the time complexity is simply N.
A good practical example is a database that stores names. It depends, of course, on which people happen to be in the database, but if you are storing names of, say, American citizens, then as N becomes large the number of inner operations approaches N times the average number of characters in a US name. The latter quantity does not depend on N at all, so the total is linear in N.
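A minimal sketch of this counting argument (the word list is made up):

words = ["anna", "bob", "charlotte", "dave"]

inner_steps = 0
for w in words:        # the outer loop from the question
    for ch in w:       # the inner loop
        inner_steps += 1

print(inner_steps)                    # 20
print(sum(len(w) for w in words))     # 20: the total number of characters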

Run time of this prime factorization function?

I wrote this prime factorization function; can someone explain the runtime to me? It seems fast to me, as it continuously decomposes a number into primes without having to check whether the factors are prime, and it runs from 2 up to the number in the worst case.
I know that no known algorithm can factor integers in polynomial time. Also, how does the runtime relate asymptotically to factoring large numbers?
function getPrimeFactors(num) {
    var factors = [];
    for (var i = 2; i <= num; i++) {
        if (num % i === 0) {
            num = num / i;
            factors.push(i);
            i--;
        }
    }
    return factors;
}
In your example, if num is prime then it takes exactly num - 1 steps, which means the algorithm's runtime is O(num) (worst case). But with algorithms that operate on numbers, things get a little trickier (thanks for noticing, thegreatcontini and Chris)! We always describe complexity as a function of the input size. Here the input is the number num, and it is represented with log(num) bits, so the input size is log(num). Because num = 2^(log(num)), your algorithm has complexity O(2^k), where k = log(num) is the size of your input.
This is what makes the problem hard: the input is very, very small, and any runtime polynomial in num is therefore exponential in the size of the input ...
On a side note, @rici is right: you only need to check divisors up to sqrt(num), easily reducing the runtime to O(sqrt(num)), or more precisely O(sqrt(2)^k).
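For illustration, here is a sketch of that sqrt-bounded variant in Python (a rewrite of the idea, not the questioner's JavaScript): once i*i > num, whatever remains of num must itself be prime.

def get_prime_factors(num):
    factors = []
    i = 2
    while i * i <= num:        # divisors are only tried up to sqrt(num)
        while num % i == 0:    # strip out each prime factor completely
            factors.append(i)
            num //= i
        i += 1
    if num > 1:                # any leftover > 1 is the last (prime) factor
        factors.append(num)
    return factors

print(get_prime_factors(360))  # [2, 2, 2, 3, 3, 5]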

Program that checks whether every even number greater than 4 is a sum of two prime numbers

I have the following problem:
Given that every even number greater than 4 can be obtained as the sum of two prime numbers, I have to write an algorithm which checks this. The algorithm should take less time than O(n^2).
For example, take the set of numbers from 6 to n: for 6 the answer is 6 = 3 + 3, for 22 it is 22 = 17 + 5, and so on.
My first idea:
S - set of n numbers
for i=1 to n {
    // removing odd numbers
    if (S[i] % 2 != 0)
        continue;
    result = false;
    for j=2 to S[i]-2 {
        if (j.isPrime) {            // prime test can be done in O(log^2(n))
            if ((S[i]-j).isPrime) {
                result = true;
                break;
            } else {
                continue;
            }
        }
    }
    if (result == false)
        break;
}
Since I use 2 nested for-loops, the total running time of this algorithm should be
O(n*n) * O(log^2(n)) = O(n^2 * log^2(n)), which is not less than O(n^2).
Does anybody have an idea to reduce the running time to get the required time of less than O(n^2)?
If the set contains big numbers, I've got nothing.
If max(S) < n^2 / log(n), then:
You should preprocess which numbers in the interval [1, max(S)] are primes.
For the preprocessing you can use the sieve of Eratosthenes.
Then you are able to check whether a number is prime in O(1), and the complexity of your solution becomes O(n^2).
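A minimal sketch of this approach in Python (function names are mine):

def sieve(limit):
    # sieve of Eratosthenes: is_prime[m] answers "is m prime?" in O(1)
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime

def goldbach_pair(n, is_prime):
    # return (p, q) with p + q == n and both prime, or None
    for p in range(2, n // 2 + 1):
        if is_prime[p] and is_prime[n - p]:
            return p, n - p
    return None

is_prime = sieve(100)
print(goldbach_pair(22, is_prime))  # (3, 19) - the question's 17 + 5 also works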
This is Goldbach's conjecture. Primality testing is known to be in P (polynomial time), but the break-even is ridiculously high - in practice, you will not be able to do this in anywhere near O(n^2).
If we assume you only need to deal with relatively small numbers, and can precompute the primes up to a certain limit, you still need to find candidate pairs. The prime counting function gives approximately n / ln(n) primes less than n. Subtracting a candidate prime p from n gives an odd number q. If you can look up the primality of q in O(1) - i.e., a lookup table for all odd numbers below the limit - the total work is about n * (n / ln(n)) = n^2 / ln(n), which is better than O(n^2).
You can run only up to the square root of N; this is sufficient to determine whether the number is prime or not, and it will reduce your running time.
Also take a look at the following question: Program to find prime numbers
