Big-O efficiency of the next-permutation (lexicographic order) algorithm

I'm just wondering what the efficiency (in big-O terms) of this algorithm is:
1. Find the largest index k such that a[k] < a[k + 1]. If no such index exists, the permutation is the last permutation.
2. Find the largest index l such that a[k] < a[l]. Since k + 1 is such an index, l is well defined and satisfies k < l.
3. Swap a[k] with a[l].
4. Reverse the sequence from a[k + 1] up to and including the final element a[n].
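A direct C++ transcription of these four steps (essentially what std::next_permutation does) might look like the sketch below; each step is at most one linear pass over the array:

    #include <algorithm>
    #include <vector>

    // Advances `a` to the next permutation in lexicographic order.
    // Returns false if `a` is already the last permutation.
    bool nextPermutation(std::vector<int>& a) {
        int n = static_cast<int>(a.size());
        int k = n - 2;
        while (k >= 0 && a[k] >= a[k + 1]) --k;   // step 1: largest k with a[k] < a[k+1]
        if (k < 0) return false;                  // no such k: this is the last permutation
        int l = n - 1;
        while (a[k] >= a[l]) --l;                 // step 2: largest l with a[k] < a[l]
        std::swap(a[k], a[l]);                    // step 3: swap a[k] and a[l]
        std::reverse(a.begin() + k + 1, a.end()); // step 4: reverse the suffix after k
        return true;
    }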
As I understand it, the worst case is n operations (when k is the first index of the previous permutation) and the best case is 1 (when k is the last index of the previous permutation).
Can I say that O(n) = n/2 ?

O(n) = n/2 makes no sense. Let f(n) = n be the running time of your algorithm. Then the right way to say it is that f(n) is in O(n). O(n) is a set of functions that are at most asymptotically linear in n.
Your averaging gives an expected running time of g(n) = n/2. g(n) is also in O(n). In fact, O(n) = O(n/2), so saving half the time does not change the asymptotic complexity.

All steps in the algorithm take O(n) time asymptotically.
Your averaging is incorrect. Just because the best case is O(1) and the worst case is O(n), you can't say the algorithm takes O(n) = n/2. Big-O notation simply gives an upper bound on the algorithm's running time.
So the algorithm is still O(n) irrespective of the best case scenario.

There is no such thing as O(n) = n/2.
When you do a big-O analysis you are only trying to find the functional dependency on n; you don't care about constant coefficients. So there is no O(n) = n/2, just like there is no O(n) = 5n.

Asymptotically, O(n) is the same as O(n/2). In any case, if you run the algorithm for each of the n! permutations, the total work is far greater than your estimate (it is on the order of n!).

Related

Sorting algorithm proof and running-time

Hydrosort is a sorting algorithm. Below is the pseudocode.
/* A is the array to sort, i = start index, j = end index */
Hydrosort(A, i, j):                      // let T(n) be the running time, where n = j - i + 1
    n = j - i + 1                        O(1)
    if (n < 10) {                        O(1)
        sort A[i…j] by insertion-sort    O(n^2)   // insertion sort is O(n^2) in the worst case
        return                           O(1)
    }
    m1 = i + 3 * n / 4                   O(1)
    m2 = i + n / 4                       O(1)
    Hydrosort(A, i, m1)                  T(n/2)
    Hydrosort(A, m2, j)                  T(n/2)
    Hydrosort(A, i, m1)                  T(n/2)
T(n) = O(n^2) + 3T(n/2), so T(n) is O(n^2). I used the 3rd case of the Master Theorem to solve this recurrence.
I have 2 questions:
Have I calculated the worst-case running time here correctly?
how would I prove that Hydrosort(A, 1, n) correctly sorts an array A of n elements?
Have I calculated the worst-case running time here correctly?
I am afraid not.
The complexity function is:
T(n) = 3T(3n/4) + CONST
This is because:
You have three recursive calls for a problem of size 3n/4
The constant term here is O(1), since all the non-recursive operations take constant time (in particular, the insertion sort only runs when n < 10, so it is O(1)).
If you go on and solve it (a = 3, b = 4/3, so the exponent is log_{4/3}(3) ≈ 3.8), you get complexity worse than O(n^2), roughly Θ(n^3.8).
how would I prove that Hydrosort(A, 1, n) correctly sorts an array A
of n elements?
By induction. Assume the algorithm correctly sorts every input of size smaller than n, and examine an input of size n. For n < 10 it is handled directly by insertion sort, so assume n >= 10.
After the first recursive call, the first 3/4 of the array is sorted (by the induction hypothesis); in particular, the first n/4 elements are the smallest ones in this part. They therefore cannot be among the n/4 biggest elements of the whole array, because at least n/2 elements in this part are bigger than they are. This means the n/4 biggest elements are somewhere between m2 and j.
After the second recursive call, since its range is guaranteed to contain the n/4 biggest elements, it places them, in order, at the end of the array. This means the part between m1 and j is now sorted properly, and the 3n/4 smallest elements are somewhere between i and m1.
The third recursive call sorts those 3n/4 elements properly, and since the n/4 biggest elements are already in place, the whole array is now sorted.
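For concreteness, a direct C++ transcription of the pseudocode above (the function name hydrosort and the use of std::vector are just illustrative choices) could look like this:

    #include <vector>

    // Sorts A[i..j] (inclusive), mirroring the pseudocode: insertion sort below
    // the cutoff of 10, otherwise three recursive calls on overlapping
    // three-quarter slices of the range.
    void hydrosort(std::vector<int>& A, int i, int j) {
        int n = j - i + 1;
        if (n < 10) {
            for (int p = i + 1; p <= j; ++p) {      // insertion sort of A[i..j]
                int key = A[p];
                int q = p - 1;
                while (q >= i && A[q] > key) { A[q + 1] = A[q]; --q; }
                A[q + 1] = key;
            }
            return;
        }
        int m1 = i + 3 * n / 4;   // end of the first three-quarter slice
        int m2 = i + n / 4;       // start of the last three-quarter slice
        hydrosort(A, i, m1);      // sort the first 3/4
        hydrosort(A, m2, j);      // sort the last 3/4
        hydrosort(A, i, m1);      // sort the first 3/4 again
    }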

Why is the Big-O complexity of this algorithm O(n^2)?

I know the big-O complexity of this algorithm is O(n^2), but I cannot understand why.
int sum = 0;
int i = 1, j = n * n;
while (i++ < j--)
sum++;
Even though we set j = n * n at the beginning, we increment i and decrement j during each iteration, so shouldn't the resulting number of iterations be a lot less than n*n?
During every iteration you increment i and decrement j, which is equivalent to just incrementing i by 2. Therefore, the total number of iterations is about n^2 / 2, and that is still O(n^2).
Big-O complexity ignores coefficients. For example, O(n), O(2n), and O(1000n) all denote the same O(n) running time. Likewise, O(n^2) and O(0.5n^2) are both O(n^2) running time.
In your situation, you're essentially incrementing your loop counter by 2 each time through your loop (since j-- has the same effect as i++). So your running time is O(0.5n^2), but that's the same as O(n^2) when you remove the coefficient.
You will have exactly n*n/2 loop iterations (or (n*n-1)/2 if n is odd).
In big-O notation, O((n*n-1)/2) = O(n*n/2) = O(n*n), because constant factors "don't count".
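As a quick empirical check of that count (a throwaway C++ snippet, not part of the original question):

    #include <cstdio>

    int main() {
        for (int n = 1; n <= 6; ++n) {
            long long sum = 0;
            long long i = 1, j = 1LL * n * n;
            while (i++ < j--)   // the same loop as in the question
                ++sum;
            // prints roughly n*n/2 iterations, i.e. O(n^2) up to the factor 1/2
            std::printf("n=%d  iterations=%lld  n*n/2=%lld\n", n, sum, 1LL * n * n / 2);
        }
        return 0;
    }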
Your algorithm is equivalent to
while ((i += 2) < n*n)
    ...
which is O(n^2/2), and that is the same as O(n^2), because big-O complexity does not care about constants.
Let m be the number of iterations taken. After m iterations, i has grown to i + m and j has shrunk to n^2 - m, and the loop stops when they meet, so
i + m = n^2 - m,
which gives
m = (n^2 - i)/2, i.e. roughly n^2/2 for i = 1.
In big-O notation, this implies a complexity of O(n^2).
Yes, this algorithm is O(n^2).
To reason about the complexity, it helps to keep the usual hierarchy of complexity classes in mind:
O(1)
O(log n)
O(n)
O(n log n)
O(n²)
O(n^a)
O(a^n)
O(n!)
Each row represents a set of algorithms, and each set is contained in the later ones: an algorithm that is in O(1) is also in O(n), in O(n²), and so on, but not the other way around. Your algorithm executes about n*n/2 statements, and
O(n) ⊂ O(n log n) ⊂ O(n*n/2) = O(n²),
so the class in this list that describes your algorithm is O(n²); O(n) and O(n log n) are too small.
For example, for n = 100 the loop body runs sum = 5000 times: 100 (n) < 200 (n·log₁₀ n) < 5000 (n·n/2) < 10000 (n²).
Even though we set j = n * n at the beginning, we increment i and decrement j during each iteration, so shouldn't the resulting number of iterations be a lot less than n*n?
Yes! That's why it's O(n^2). By the same logic, it's a lot less than n * n * n, which makes it O(n^3). It's even O(6^n), by similar logic.
Big-O gives you information about upper bounds.
I believe you are really asking why the complexity is Θ(n^2) (or at least Ω(n^2)), but if you're just trying to understand what big-O is, you first and foremost need to understand that it gives upper bounds on functions.

Proof of Ω(n log k) worst-case complexity in a comparison sort algorithm

I'm writing a comparison algorithm that takes n numbers and a number k as input.
It separates the n numbers into k groups so that all numbers in group 1 are smaller than all numbers in group 2, ..., which are smaller than all numbers in group k.
The numbers within the same group are not necessarily sorted.
I'm using selection(A[], left, right, k) to find the k'th element, which in my case is the (n/k)'th element (to divide the whole array into 2 pieces), and then I repeat this for each piece until the initial array is divided into k parts of n/k numbers each.
This has a complexity of Θ(n log k), since it is a recursion tree of log k levels (depth), each level costing at most cn operations. This is linear time when log k is treated as a constant.
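A sketch of that recursive splitting in C++, using std::nth_element as the selection step (the function name partitionIntoGroups and the boundary arithmetic are illustrative assumptions, not the asker's exact code):

    #include <algorithm>
    #include <vector>

    // Rearranges A[left, right) into `groups` consecutive blocks such that every
    // value in one block is <= every value in the next block; the order inside a
    // block is arbitrary. The recursion depth is about log2(groups), and each
    // level does O(n) expected work in the selection step.
    void partitionIntoGroups(std::vector<int>& A, int left, int right, int groups) {
        if (groups <= 1 || right - left <= 1) return;
        int half = groups / 2;
        long long len = right - left;
        int mid = left + static_cast<int>(len * half / groups);   // block boundary
        std::nth_element(A.begin() + left, A.begin() + mid, A.begin() + right);
        partitionIntoGroups(A, left, mid, half);             // first `half` groups
        partitionIntoGroups(A, mid, right, groups - half);   // remaining groups
    }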
I am asked to prove that every comparison algorithm that sorts an array of n elements into k groups in this way costs Ω(n log k) in the worst case.
I've searched here, on Google, and in my algorithms book (Jon Kleinberg, Éva Tardos), but I only find proofs for comparison algorithms that sort ALL the elements. Those proofs are not acceptable in my case because they all rest on assumptions that do not match my problem, nor can they be altered to match it. (Also consider that regular quicksort with random pivot selection results in Θ(n log n), which is not linear, whereas n log k is for constant k.)
You can find the general algorithm proof here:
https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0913.pdf
where it is also clearly explained why my problem does not fall under the O(n log n) comparison-sort case
Sorting requires lg(n!) = Omega(n log n) comparisons because there are n! different output permutations.
For this problem there are
n! / ((n/k)!)^k
equivalence classes of output permutations, because the order within the k independent groups of n/k elements does not matter. We compute
lg( n! / ((n/k)!)^k ) = lg(n!) - k lg((n/k)!)
= n lg n - n - k ((n/k) lg (n/k) - n/k) ± O(lg n + k lg (n/k))
(write lg (...!) as a sum, bound with two integrals;
see https://en.wikipedia.org/wiki/Stirling's_approximation)
= n (lg n - lg (n/k)) ± O(lg n + k lg (n/k))
= n lg k ± O(lg n + k lg (n/k))
= Omega(n lg k).
(O(lg n + k lg (n/k)) = O(n), since k <= n)
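As a numerical sanity check of that bound (an illustrative C++ snippet added here; lgamma(x + 1) = ln(x!)):

    #include <cmath>
    #include <cstdio>

    // Compares lg(n! / ((n/k)!)^k) with n*lg(k) for a few (n, k) pairs.
    int main() {
        const double LOG2E = 1.0 / std::log(2.0);   // converts ln to lg
        const int cases[][2] = {{1000, 10}, {10000, 100}, {100000, 10}};
        for (const auto& c : cases) {
            int n = c[0], k = c[1];
            double lg_classes =
                (std::lgamma(n + 1.0) - k * std::lgamma(n / k + 1.0)) * LOG2E;
            std::printf("n=%d k=%d  lg(#classes)=%.0f  n*lg(k)=%.0f\n",
                        n, k, lg_classes, n * std::log2(static_cast<double>(k)));
        }
        return 0;
    }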
prove that every comparison algorithm that sorts an array of n elements into k groups in this way costs Ω(n log k) in the worst case.
I think the statement is false. If you use quickselect with a poor pivot choice (such as always using the first or last element), then the worst case is probably O(n^2).
Only some comparison algorithms will have a worst case of O(n log k). Using median of medians (the n/5 version) to choose the pivot fixes the pivot issue for quickselect. There are other algorithms that would also be O(n log k).

Average time complexity of finding top-k elements

Consider the task of finding the top-k elements in a set of N independent and identically distributed floating point values. By using a priority queue / heap, we can iterate once over all N elements and maintain a top-k set by the following operations:
if the element x is "worse" than the heap's head: discard x ⇒ complexity O(1)
if the element x is "better" than the heap's head: remove the head and insert x ⇒ complexity O(log k)
The worst-case time complexity of this approach is obviously O(N log k), but what about the average time complexity? Due to the i.i.d. assumption, the probability of the O(1) operation increases over time, and we rarely have to perform the costly O(log k) update, especially for k << N.
Is this average time complexity documented in any citable reference? What is the average time complexity? If you have a citable reference for your answer, please include it.
Consider the i'th largest element and a particular permutation. It will be inserted into the k-sized heap if no more than k - 1 of the (i - 1) larger elements appear before it in the permutation.
The probability of that heap-insertion happening is 1 if i <= k, and k/i if i > k.
From this, you can compute the expected number of heap adjustments using linearity of expectation:
sum(i = 1 to k) 1 + sum(i = k+1 to n) k/i = k + sum(i = k+1 to n) k/i = k * (1 + H(n) - H(k)),
where H(n) is the n'th harmonic number.
This is approximately k * log(n/k), which is on the order of k * log(n) for k << n, and you can compute your average cost from there.
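To sanity-check that expectation, a small C++ simulation (the parameters N, k, and the trial count are arbitrary choices for illustration) can count the heap replacements and compare them to k * (H(N) - H(k)) ≈ k * ln(N/k):

    #include <cmath>
    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <random>
    #include <vector>

    int main() {
        const int N = 100000, k = 100, trials = 20;
        std::mt19937 rng(42);
        std::uniform_real_distribution<double> dist(0.0, 1.0);

        double avg_replacements = 0.0;
        for (int t = 0; t < trials; ++t) {
            // min-heap holding the current top-k; its head is the "worst" kept value
            std::priority_queue<double, std::vector<double>, std::greater<double>> heap;
            long long replacements = 0;
            for (int i = 0; i < N; ++i) {
                double x = dist(rng);
                if (static_cast<int>(heap.size()) < k) {
                    heap.push(x);                // the first k elements always enter
                } else if (x > heap.top()) {     // "better" than the heap's head
                    heap.pop();
                    heap.push(x);
                    ++replacements;              // the O(log k) case
                }                                // otherwise: O(1) discard
            }
            avg_replacements += static_cast<double>(replacements) / trials;
        }

        // predicted number of replacements after the first k insertions
        double predicted = k * std::log(static_cast<double>(N) / k);
        std::printf("measured ~%.0f, predicted ~%.0f\n", avg_replacements, predicted);
        return 0;
    }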

How do you calculate the big-O of the binary search algorithm?

I'm looking for the mathematical proof, not just the answer.
The recurrence relation of binary search is (in the worst case)
T(n) = T(n/2) + O(1)
Using Master's theorem
n is the size of the problem.
a is the number of subproblems in the recursion.
n/b is the size of each subproblem. (Here it is assumed that all subproblems are essentially the same size.)
f(n) is the cost of the work done outside the recursive calls, which includes the cost of dividing the problem and the cost of merging the solutions to the subproblems.
Here a = 1, b = 2 and f(n) = O(1) [Constant]
We have f(n) = O(1) = O(n^(log_b a)), since log_b a = log_2 1 = 0.
=> T(n) = O(n^(log_b a) * log2 n) = O(log2 n)
The proof is quite simple: with each recursive step you halve the number of remaining items (if you have not already found the item you were looking for). And since you can divide a number n into halves at most log2(n) times, this also bounds the number of recursive steps:
2·2·…·2·2 = 2^x ≤ n ⇒ log2(2^x) = x ≤ log2(n)
Here x is also the number of recursions. And with a local cost of O(1) it’s O(log n) in total.
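For concreteness, an iterative C++ sketch that makes the halving explicit (each loop iteration discards half of the remaining range, so there are at most about log2(n) iterations):

    #include <vector>

    // Returns the index of `target` in the sorted vector `a`, or -1 if absent.
    int binarySearch(const std::vector<int>& a, int target) {
        int lo = 0, hi = static_cast<int>(a.size()) - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;        // avoids overflow of lo + hi
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1;   // discard the left half
            else                 hi = mid - 1;   // discard the right half
        }
        return -1;
    }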
