Time complexity in ascending order - performance

What is the ascending order of growth rate of the following functions (log n is with base 2)?
(1) 2^((log n)^(1/2))
(2) 2^n
(3) 2^(n/2)
(4) n^(4/3)
(5) n(log n)^3
(6) n^(log n)
(7) 2^(n^2)
(8) n!

We can immediately deduce that (8) = n! exceeds every function except (7): by Stirling's approximation, n! ~ sqrt(2*pi*n) * (n/e)^n, and the n^n part far exceeds 2^n, since log2(n!) is approximately n log2 n > n. However, (7) = 2^(n^2) is larger still, because its exponent n^2 dominates n log2 n. Thus (2) < (8) < (7).
Since sqrt(log n) grows more slowly than c log n for every constant c > 0, we have 2^((log n)^(1/2)) < 2^(c log n) = n^c eventually. So (1) is less than the other functions with n as the base, e.g. (4), (5) and (6). In fact it is less than all of the other functions.
(3) < (2), since the latter is the former squared: 2^n = (2^(n/2))^2.
(2) < (7), since the latter is the former to the power n: 2^(n^2) = (2^n)^n.
(4) < (6), since log n > 4/3 for sufficiently large n.
log n grows more slowly than any positive power of n, so (log n)^3 < n^(1/3) eventually; multiplying both sides by n gives n(log n)^3 < n^(4/3). Thus (5) < (4), (6).
Using a logarithm law transformation we obtain n^(log n) = 2^((log n)^2), and (log n)^2 < n/2 eventually. Thus (6) < (3).
Compiling all of the reasoning steps above, we deduce the ascending order to be:
(1).
(5).
(4).
(6).
(3).
(2).
(8).
(7).
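These comparisons can be sanity-checked numerically. Comparing log2 of each function at a large n keeps astronomically large values in floating-point range; the labels and helper name below are mine. Note that n must be large: at n = 2^10, for instance, n(log n)^3 still exceeds n^(4/3).

```python
import math

def log2_values(n):
    """Return log2 of each function, so huge values stay comparable as floats."""
    lg = math.log2(n)
    return {
        "(1) 2^sqrt(lg n)": math.sqrt(lg),
        "(2) 2^n":          float(n),
        "(3) 2^(n/2)":      n / 2,
        "(4) n^(4/3)":      (4 / 3) * lg,
        "(5) n (lg n)^3":   lg + 3 * math.log2(lg),
        "(6) n^(lg n)":     lg * lg,
        "(7) 2^(n^2)":      float(n) ** 2,
        "(8) n!":           math.lgamma(n + 1) / math.log(2),  # log2(n!)
    }

vals = log2_values(2 ** 64)
for name in sorted(vals, key=vals.get):
    print(name)
```

Sorting by these values reproduces the order (1), (5), (4), (6), (3), (2), (8), (7).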


Is the algorithm that involves enumerating n choose k exponential?

Say we have an algorithm that needs to list all possibilities of choosing k elements from n elements (k <= n). Is the time complexity of this particular algorithm exponential, and why?
No.
There are n choose k = n!/(k!(n-k)!) possibilities [1].
Consider that n choose k <= n^k / k!, and for fixed k it is Theta(n^k) [2].
Assuming you are keeping k constant, as n grows, the number of possibilities grows polynomially.
For this example, ignore the 1/k! factor because it is constant. If k = 2 and you increase n from 2 to 3, then the count goes from 2^2 to 3^2. An exponential change would be from 2^2 to 2^3. This is not the same.
Keeping k constant and changing n results in a big O of O(n^k) (the 1/k! factor is constant, so you ignore it).
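To see the polynomial growth for fixed k concretely, here is a quick check using Python's math.comb: doubling n multiplies the count by roughly 2^k, exactly as a degree-k polynomial would.

```python
from math import comb

k = 3
for n in (10, 20, 40, 80):
    # C(n, 3) grows like n^3 / 3!: doubling n multiplies the count by ~2^3 = 8
    print(n, comb(n, k))
```

The ratios between successive counts approach 8 = 2^3, not the constant base an exponential would give.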
Thinking carefully about the size of the input instance is required since the input instance contains numbers - a basic familiarity with weak NP-hardness can also be helpful.
Assume that we fix k=1 and encode n in binary. Since the algorithm must visit n choose 1 = n numbers, it takes at least n steps. Since the magnitude of the number n may be exponential in the size of the input (the number of bits used to encode n), the algorithm in the worst case consumes exponential time.
You can get a feel for this exponential-time behavior by writing a simple C program that prints all the numbers from 1 to n with n = 2^64 and see how far you get in a minute. While the input is only 64 bits long, it would take you about 600 years to print all the numbers assuming that your device can print a million numbers per second.
An algorithm that finds all possibilities of choosing k elements from n unique elements (k <= n) never needs factorial time: n choose k is at most 2^n, since summing it over all k gives exactly 2^n. But if k is allowed to grow with n, e.g. k = n/2, the count is Theta(2^n/sqrt(n)), which IS exponential in n. The relevant formula is:
p = n!/(k!(n-k)!)

How many operations are needed for sorting?

This is a 2016 entrance exam question:
We have n balls with distinct and unknown weights, labeled 1 to n. We are given a two-pan balance and want to use it to weigh these balls in pairs and write down the results in order to sort all of the balls. In the worst case, how many weighing operations are needed? Choose the best answer.
a) Ceil[ n log2 n ]
b) Floor[ n log2 n ]
c) n − 1
d) Ceil[ log2 n! ]
According to the answer sheet, the correct solution is: Ceil[ log2 n! ]
My question is: how is this solution achieved (how does this algorithm work; is there any pseudocode)?
If you look at Number of Comparisons in Merge-Sort you will find my answer there arguing that the total number of comparisons for mergesort (which is known to have good asymptotic behavior) is
n ⌈log2 n⌉ − 2^⌈log2 n⌉ + 1
Since 2^⌈log2 n⌉ ≥ n, this is at most n ⌈log2 n⌉ − n + 1, which in turn is at most ⌈n log2 n⌉, so for n ≥ 1 this confirms answer (a) as an upper bound.
Is (b) a tighter upper bound? If you write ⌈log2 n⌉ = log2 n + d for some 0 ≤ d < 1, then 2^⌈log2 n⌉ = 2^d n and you get
n (log2 n + d) − 2^d n + 1 = n (log2 n + d − 2^d) + 1 = (n log2 n) + n (d − 2^d + 1/n)
and if you write m := ⌈log2 n⌉ and n = 2^(m − d), that last parenthesis becomes (d − 2^d + 2^(d − m)).
Plotting this for some values of m shows that for integers m ≥ 1 it is at most zero. You get m = 0 for n = 1, which means d = 0, so the whole parenthesis becomes zero. So when you work out the details of the proof, this will show that (b) is indeed an upper bound for mergesort.
How about (c)? There is an easy counterexample for n = 3. If you know that ball 1 is lighter than ball 2 and lighter than ball 3, this doesn't tell you how to order 2 and 3. Nor can you blame this on a suboptimal choice of comparisons: by the symmetry of the problem, comparing 1 to both 2 and 3 is a generic situation. So (c) is not an upper bound. Can it be a lower bound? Sure: even to confirm that the balls are already ordered, you have to weigh each consecutive pair, resulting in n − 1 comparisons, and even the best algorithm can't do better than guessing the correct order and then confirming its guess.
Is (d) a tighter lower bound? Plots again suggest that it is at least as great as (c), with the exception of a small region with no integer values. So if it is a lower bound, it will be tighter. Now think of a decision tree. Every algorithm to order these n balls can be written as a binary decision tree: you compare the two balls named in a given node, and depending on the result of the comparison you proceed with one of two possible next steps. That decision tree has to have n! leaves, since every permutation has to be a distinct leaf so that you know the exact permutation once you have reached a leaf. And a binary tree with n! leaves has to have a depth of at least ⌈log2 n!⌉. So yes, this is a lower bound as well.
Summarizing all of this, you have (c) ≤ (d) ≤ x ≤ (b) ≤ (a), where x denotes the number of comparisons an optimal algorithm would need to order all the balls. As a comment by Mark Dickinson pointed out, A036604 on OEIS gives explicit lower bounds for a few small n, and for n = 12 the inequality (d) < x is strict. So (d) does not describe the optimal algorithm exactly either.
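These bounds are easy to tabulate. A small sketch (the function names are mine) comparing the information-theoretic lower bound ⌈log2 n!⌉ with the mergesort worst-case count:

```python
import math

def mergesort_worst(n):
    # Worst-case comparisons of top-down mergesort: n*ceil(lg n) - 2^ceil(lg n) + 1
    if n <= 1:
        return 0
    c = math.ceil(math.log2(n))
    return n * c - 2 ** c + 1

def lower_bound(n):
    # ceil(log2 n!): minimum depth of a binary decision tree with n! leaves
    return math.ceil(math.log2(math.factorial(n)))

for n in range(2, 13):
    print(n, lower_bound(n), mergesort_worst(n))
```

For n = 12 this prints a lower bound of 29, while A036604 gives 30 as the true optimum, consistent with the strict inequality for (d).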
By the way (and to answer your "how does this algorithm work"), finding the optimal algorithm for a given n is fairly easy, at least in theory: compute all possible decision trees for those n! orderings, and choose one with minimal depth. Of course this approach becomes impractical fairly quickly.
Now that we know that none of the answers gives the correct count of the optimal sorting algorithm, which answer is “best”? That depends a lot on context. In many applications, knowing an upper bound to the worst time behavior is more valuable than knowing a lower limit, so (b) would be superior to (d). But apparently the person creating the solution sheet had a different opinion, and went for (d), either because it is closer to the optimum (which I assume but have not proven) or because a lower bound is more useful to the application at hand. If you wanted to, you could likely challenge the whole question on the grounds that “best” wasn't adequately defined in the scope of the question.

Time complexity of algorithm with number of operations described by polynomial of more than one variable

I know that a certain algorithm I am using does 2Nk - 4k^2 operations, with parameters N and k. Now, the first derivative of that function is 2N - 8k (I know that N and k can only be positive integers here, but bear with me). That derivative is positive when k < N/4 and negative when k > N/4. So the complexity actually reduces if we increase k past a certain point. How will I express this in Big O notation? Also note that k <= (N - 1)/2, so there is an upper bound on k.
NOTE:
I know that a similar question has been asked here, but it does not consider the case where the first derivative changes sign if one of the variables reaches a certain point.
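A quick numeric sketch of the claim in the question (the helper name is mine, and the formula is the hypothetical operation count from the question): f(k) = 2Nk − 4k^2 peaks at k = N/4 and decreases afterwards, up to the constraint k <= (N − 1)/2.

```python
def ops(N, k):
    # hypothetical operation count taken from the question
    return 2 * N * k - 4 * k ** 2

N = 100
ks = range(1, (N - 1) // 2 + 1)   # k is bounded above by (N - 1) / 2
peak = max(ks, key=lambda k: ops(N, k))
print(peak, ops(N, peak))         # peak lands at k = N // 4
```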

SICP, Fermat Test Issue

Section 1.2.6 of SICP describes an algorithm for Fermat prime testing as follows (my own words):
To test whether n is prime:
Choose a random integer a between 1 and n−1 inclusive.
If a^n mod n = a, then n is probably prime.
The part I'm getting stuck on is the fact that we allow a = 1: in this case, regardless of our choice of n (prime or not), the test will always pass.
You're right; there's no reason to choose a = 1. That being said, the statistical distance between the uniform distribution on [1, n-1] and the uniform distribution on [2, n-1] is O(1/n), so when n is very large (large enough that you don't just want to do trial division), the practical impact is very small (remember that this is already a probabilistic test, so a good number of other choices of a won't work either).
The text you link to actually says (emphasis mine):
Fermat's Little Theorem: If n is a prime number and a is any positive integer less than n, then a raised to the nth power is congruent to a modulo n.
(Two numbers are said to be congruent modulo n if they both have the same remainder when divided by n. The remainder of a number a when divided by n is also referred to as the remainder of a modulo n, or simply as a modulo n.)
If n is not prime, then, in general, most of the numbers a < n will not satisfy the above relation. This leads to the following algorithm for testing primality: Given a number n, pick a random number a < n and compute the remainder of a^n modulo n. If the result is not equal to a, then n is certainly not prime. If it is a, then chances are good that n is prime. Now pick another random number a and test it with the same method. If it also satisfies the equation, then we can be even more confident that n is prime. By trying more and more values of a, we can increase our confidence in the result. This algorithm is known as the Fermat test.
Up until now it never says to actually pick 1. It does later on, though. I think that's a mistake, although not a big one. Even if the test passes for a given value of a, you should test multiple values to be sure.
The pseudocode on Wikipedia uses [2, n - 1] as the range for example. You should probably use this range in practice (although the Fermat test isn't really used in practice, since Miller-Rabin is better).
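A minimal sketch of such a test with the [2, n − 1] range (the function name is mine). Note that Carmichael numbers such as 561 satisfy a^n ≡ a (mod n) for every a, so they still fool this test no matter how many rounds you run:

```python
import random

def fermat_test(n, rounds=30):
    """Probabilistic: False means definitely composite, True means probably prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n)   # skip a = 1, which always passes
        if pow(a, n, n) != a:        # check a^n ≡ a (mod n); a < n, so a mod n == a
            return False
    return True
```

This is the reason Miller–Rabin is preferred in practice: it has no analogue of Carmichael numbers.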

Why is Sieve of Eratosthenes more efficient than the simple "dumb" algorithm?

If you need to generate primes from 1 to N, the "dumb" way to do it would be to iterate through all the numbers from 2 to N and check whether each number is divisible by any prime found so far that is less than the square root of the number in question.
As I see it, the Sieve of Eratosthenes does the same, except the other way round: when it finds a prime p, it marks off all the numbers that are multiples of p.
But whether you mark off x when you find p, or check whether x is divisible by p, the fundamental complexity, the big-O, stays the same. You still do one constant-time operation per number-prime pair. In fact, the dumb algorithm breaks off as soon as it finds a prime factor, but the Sieve of Eratosthenes marks each number several times: once for every prime it is divisible by. That's a minimum of twice as many operations for every number except primes.
Am I misunderstanding something here?
In the trial division algorithm, the most work that may be needed to determine whether a number n is prime, is testing divisibility by the primes up to about sqrt(n).
That worst case is met when n is a prime or the product of two primes of nearly the same size (including squares of primes). If n has more than two prime factors, or two prime factors of very different sizes, at least one of them is much smaller than sqrt(n), so even the accumulated work needed for all these numbers (which form the vast majority of all numbers up to N, for sufficiently large N) is relatively insignificant. I shall ignore it and work with the fiction that composite numbers are determined without doing any work (the products of two approximately equal primes are few in number, so although individually they cost as much as a prime of similar size, altogether that's a negligible amount of work).
So, how much work does the testing of the primes up to N take?
By the prime number theorem, the number of primes <= n is (for n sufficiently large), about n/log n (it's n/log n + lower order terms). Conversely, that means the k-th prime is (for k not too small) about k*log k (+ lower order terms).
Hence, testing the k-th prime requires trial division by pi(sqrt(p_k)), approximately 2*sqrt(k/log k), primes. Summing that for k <= pi(N) ~ N/log N yields roughly 4/3*N^(3/2)/(log N)^2 divisions in total. So by ignoring the composites, we found that finding the primes up to N by trial division (using only primes), is Omega(N^1.5 / (log N)^2). Closer analysis of the composites reveals that it's Theta(N^1.5 / (log N)^2). Using a wheel reduces the constant factors, but doesn't change the complexity.
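One can also count the divisions empirically. A sketch of trial division by primes up to sqrt(m), with an operation counter (the names are mine):

```python
def trial_division(limit):
    """Return the primes up to limit and the number of trial divisions performed."""
    primes, divisions = [], 0
    for m in range(2, limit + 1):
        composite = False
        for p in primes:
            if p * p > m:        # only primes up to sqrt(m) matter
                break
            divisions += 1
            if m % p == 0:       # found a factor: m is composite
                composite = True
                break
        if not composite:
            primes.append(m)
    return primes, divisions

primes, divisions = trial_division(10 ** 4)
print(len(primes), divisions)
```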
In the sieve, on the other hand, each composite is crossed off as a multiple of at least one prime. Depending on whether you start crossing off at 2*p or at p*p, a composite is crossed off as many times as it has distinct prime factors, or distinct prime factors <= sqrt(n). Since any number has at most one prime factor exceeding sqrt(n), the difference isn't large and has no influence on complexity, but there are a lot of numbers with only two prime factors (or three, with one larger than sqrt(n)), so it makes a noticeable difference in running time. Anyhow, a number n > 0 has only a few distinct prime factors; a trivial estimate shows that the number of distinct prime factors is bounded by lg n (base-2 logarithm), so an upper bound for the number of crossings-off the sieve does is N*lg N.
By counting not how often each composite gets crossed off, but how many multiples of each prime are crossed off, as IVlad already did, one easily finds that the number of crossings-off is in fact Theta(N*log log N). Again, using a wheel doesn't change the complexity but reduces the constant factors. However, here it has a larger influence than for the trial division, so at least skipping the evens should be done (apart from reducing the work, it also reduces storage size, so improves cache locality).
So, even disregarding that division is more expensive than addition and multiplication, we see that the number of operations the sieve requires is much smaller than the number of operations required by trial division (if the limit is not too small).
Summarising:
Trial division does futile work by dividing primes, the sieve does futile work by repeatedly crossing off composites. There are relatively few primes, but many composites, so one might be tempted to think trial division wastes less work.
But: Composites have only few distinct prime factors, while there are many primes below sqrt(p).
In the naive method, you do O(sqrt(num)) operations for each number num you check for primality. This is O(n*sqrt(n)) total.
In the sieve method, you do n/2 operations when marking multiples of 2, n/3 when marking those of 3, n/5 when marking those of 5, etc. This is n*(1/2 + 1/3 + 1/5 + 1/7 + ...), which is O(n log log n). See here for that result.
So the asymptotic complexity is not the same, contrary to what you said. Even a naive sieve will beat the naive prime-generation method pretty fast. Optimized versions of the sieve can get much faster, but the big-O remains unchanged.
The two are not equivalent, contrary to what you say. For each number, you check divisibility by the same primes 2, 3, 5, 7, ... in the naive prime-generation algorithm, and as you progress you keep checking against more and more of them as you approach your n. For the sieve, you do less and less work as you approach n: first you cross off in increments of 2, then of 3, then of 5, and so on. This hits n and stops much faster.
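The sieve's total crossing-off count can be measured directly; a sketch with a counter (the names are mine), illustrating the roughly n log log n total:

```python
def sieve(limit):
    """Sieve of Eratosthenes; also count the crossings-off performed."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    crossings = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):  # start at p*p, not 2*p
                is_prime[m] = False
                crossings += 1
        p += 1
    return [i for i, ok in enumerate(is_prime) if ok], crossings

primes, crossings = sieve(10 ** 4)
print(len(primes), crossings)
```

Comparing this counter with the division counter of trial division for the same limit shows the gap between Theta(N log log N) and Theta(N^1.5/(log N)^2) in practice.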
Because with the sieve method, you stop marking mutiples of the running primes when the running prime reaches the square root of N.
Say, you want to find all primes less than a million.
First you set up an array:
for i = 2 to 1000000
    primetest[i] = true
Then you iterate:
for j = 2 to 1000                       <--- 1000 is the square root of 1000000
    if primetest[j]                     <--- if j is prime
        for k = j*j to 1000000 step j   <--- mark all multiples of j (except j itself) as "not a prime"
            primetest[k] = false
You don't have to check j beyond 1000, because j*j would then be more than a million.
And you start from j*j: you don't have to mark multiples of j less than j^2, because they are already marked as multiples of previously found, smaller primes.
So, in the end, you have run the outer loop 1000 times, and the if-body only for those j's that are primes.
The second reason is that with the sieve, you only do multiplication, not division. If you do it cleverly, you only do addition, not even multiplication.
And division has larger complexity than addition: the usual way to divide n-digit numbers takes O(n^2) time, while addition takes O(n).
Explained in this paper: http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
I think it's quite readable even without Haskell knowledge.
The first difference is that division is much more expensive than addition. Even though each number is 'marked' several times, this is trivial compared with the huge number of divisions needed by the 'dumb' algorithm.
A "naive" Sieve of Eratosthenes will mark non-prime numbers multiple times.
But if you keep your numbers in a linked list and remove numbers that are multiples (you will still need to walk the remainder of the list), the work left to do after finding a prime is always smaller than it was before finding that prime.
http://en.wikipedia.org/wiki/Prime_number#Number_of_prime_numbers_below_a_given_number
the "dumb" algorithm does about pi(sqrt(i)) ~ 2*sqrt(i)/log(i) trial divisions for each prime i it confirms
the sieve does N/p crossings-off for each prime p
Summed over the roughly N/log(N) primes, that is Theta(N^1.5/(log N)^2) divisions versus Theta(N log log N) crossings-off.
A while ago I was trying to find an efficient way of computing the sum of the primes less than x. I decided to use an N-by-N square table and to check only numbers whose unit digit is in [1, 3, 7, 9]. The method of Eratosthenes made this a little easier. Here is the intuition:
Suppose you want to know whether N is prime, and you start looking for factors. The larger the factor you divide N by, the smaller the quotient: if K = int(sqrt(N)) divides N, the quotient is close to K itself.
Now suppose you divide N by some u < K. If u is not prime, then u has a prime factor v < u, and v also divides N. So why not test whether N is prime by dividing N only by the primes less than or equal to K = int(sqrt(N))?
The number of times the loop then executes is pi(sqrt(N)).
This is where the brilliant idea behind Eratosthenes' sieve starts to take shape and gives you the intuition behind it all.
By the way, using the Sieve of Eratosthenes one can find the sum of primes below a multiple of 10, because for a given column you only need to check the unit digits [1, 3, 7, 9] and count how many times each particular unit digit repeats.
I'm new to the Stack Overflow community, so I'd welcome suggestions if anything here is wrong.
