Can this be approximated? - algorithm

The time complexity of finding the k largest elements using a min-heap is given as O(k + (n-k) log k), as mentioned here: link. Can it be approximated to O((n-k) log k)?
Since O(N + N log(k)) = O(N log(k)), is the above approximation also true?

No, you can't simplify it like that. This can be shown with a few example values of k that are close to n:
k = n
Now the complexity is O(n + 0 * log n) = O(n). If you had left out the first term of the sum, you would have ended up with O(0), which is obviously wrong.
k = n - 1
We get O((n-1) + 1 * log(n-1)) = O(n + log n) = O(n). Without the first term, you would get O(log n), which again is wrong.
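For reference, here is a minimal Python sketch of the min-heap approach the question refers to (the function name and the use of heapq are my own illustration, not taken from the linked answer). The heapify call is the O(k) term; the loop over the remaining n-k elements contributes the (n-k) log k term.

import heapq

def k_largest(arr, k):
    # Build a min-heap from the first k elements: O(k)
    heap = arr[:k]
    heapq.heapify(heap)
    # For each of the remaining n-k elements, replace the heap minimum
    # if the new element is larger: O((n-k) log k)
    for x in arr[k:]:
        if x > heap[0]:
            heapq.heapreplace(heap, x)
    return heap  # the k largest elements (heap order may vary)

print(k_largest([7, 2, 9, 4, 1, 8, 5], 3))   # the three largest: 7, 8, 9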

Related

Time Complexity in asymptotic analysis log n and log (n+m)

Just some interesting discussion inspired by a conversation in my class.
There are two algorithms, one has time complexity log n and another log (n+m).
Am I correct to argue that for average cases the log(n+m) one will perform faster, while asymptotically they make no difference in running time? Because taking the limit of both and computing f1'/f2' results in a constant, they therefore have the same order of growth.
Thanks!
As I can see from the question, both n and m are independent variables. So when stating that
O(log(m + n)) = O(log(n))
it should hold for any m, which it does not: a counterexample is
m = exp(n)
O(log(m + n)) = O(log(n + exp(n))) = O(log(exp(n))) = O(n) > O(log(n))
That's why in the general case we can only say that
O(log(m + n)) >= O(log(n))
An interesting question is when O(log(m + n)) = O(log(n)) does hold. If m grows no faster than some polynomial in n, i.e. O(m) <= O(P(n)), then
O(log(m + n)) = O(log(P(n) + n)) = O(log(P(n))) = k * O(log(n)) = O(log(n))
In the case of graphs we seldom have that many edges, i.e. O(m) > O(P(n)): even the complete graph K_n contains only m = n * (n - 1) / 2 = P(n) edges. That's why
O(log(m + n)) = O(log(n))
holds for ordinary graphs (no parallel/multiple edges, no loops).
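A tiny numeric illustration of the two regimes above (my own sketch, not from the answer): when m = exp(n), log(n+m) grows like n, while for a polynomially bounded m such as n^2 it stays within a constant factor of log(n).

import math

for n in (10, 20, 40, 80):
    m_exp = math.exp(n)   # m grows exponentially in n
    m_poly = n * n        # m bounded by a polynomial in n
    print(n,
          round(math.log(n + m_exp) / math.log(n), 2),    # ratio keeps growing (roughly n / log n)
          round(math.log(n + m_poly) / math.log(n), 2))   # ratio stays close to 2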

Sorting algorithm proof and running-time

Hydrosort is a sorting algorithm. Below is the pseudocode.
/* A is the array to sort, i = start index, j = end index */
Hydrosort(A, i, j):                        // Let T(n) be the running time, where n = j - i + 1
    n = j - i + 1                          O(1)
    if (n < 10) {                          O(1)
        sort A[i..j] by insertion-sort     O(n^2)   // insertion sort is O(n^2) worst-case
        return                             O(1)
    }
    m1 = i + 3 * n / 4                     O(1)
    m2 = i + n / 4                         O(1)
    Hydrosort(A, i, m1)                    T(n/2)
    Hydrosort(A, m2, j)                    T(n/2)
    Hydrosort(A, i, m1)                    T(n/2)
T(n) = O(n^2) + 3T(n/2), so T(n) is O(n^2). I used the 3rd case of the Master Theorem to solve this recurrence.
I have 2 questions:
Have I calculated the worst-case running time here correctly?
How would I prove that Hydrosort(A, 1, n) correctly sorts an array A of n elements?
Have I calculated the worst-case running time here correctly?
I am afraid not.
The complexity function is:
T(n) = 3T(3n/4) + CONST
This is because:
You have three recursive calls, each on a subproblem of size 3n/4.
The additive term is O(1), since all non-recursive operations take constant time (specifically, insertion sort for n < 10 is O(1)).
If you go on and solve it (Master Theorem, case 1), you get Θ(n^(log_(4/3) 3)) ≈ Θ(n^3.82), which is worse than O(n^2).
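As a quick sanity check of that exponent (a one-line computation of my own):

import math

print(math.log(3, 4 / 3))   # log base 4/3 of 3 ≈ 3.82, so the complexity is about Θ(n^3.82)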
how would I prove that Hydrosort(A, 1, n) correctly sorts an array A
of n elements?
By induction. Assume the algorithm works for all problems of size smaller than n, and let's examine a problem of size n. For n < 10 the insertion-sort base case handles it trivially, so we ignore this case.
After the first recursive call, you are guaranteed that the first 3/4 of the array is sorted, and in particular that the first n/4 positions hold the smallest elements of that part. These elements cannot be among the n/4 biggest of the array, because there are at least n/2 elements bigger than each of them. This means the n/4 biggest elements are somewhere between m2 and j.
The second recursive call, since it is guaranteed to be invoked on a range containing the n/4 biggest elements, places those elements at the end of the array. This means the part between m1 and j is now sorted properly, and the 3n/4 smallest elements are somewhere between i and m1.
The third recursive call sorts those 3n/4 elements properly, and since the n/4 biggest elements are already in place, the array is now sorted.
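For concreteness, here is a runnable Python transcription of the pseudocode above (my own sketch; indices are inclusive, as in the pseudocode, and sorted() stands in for insertion sort on the tiny base case):

def hydrosort(A, i, j):
    # Sorts A[i..j] in place; i and j are inclusive indices.
    n = j - i + 1
    if n < 10:
        A[i:j + 1] = sorted(A[i:j + 1])   # base case, stands in for insertion sort
        return
    m1 = i + 3 * n // 4
    m2 = i + n // 4
    hydrosort(A, i, m1)   # sort the first three quarters
    hydrosort(A, m2, j)   # sort the last three quarters (now holds the n/4 largest)
    hydrosort(A, i, m1)   # sort the first three quarters again

data = [5, 2, 9, 1, 7, 3, 8, 6, 4, 0, 11, 10]
hydrosort(data, 0, len(data) - 1)
print(data)   # [0, 1, 2, ..., 11]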

Is O(K + (N-K)logK) equivalent to O(K + N log K)?

Can we say O(K + (N-K) log K) is equivalent to O(K + N log K) for 1 <= K <= N?
The short answer is that they are not equivalent, and it depends on the value of K. If K is equal to N, then the first complexity is O(N), and the second complexity is O(N + N log N), which is equivalent to O(N log N). However, O(N) is not equivalent to O(N log N).
Moreover, any function that is in O(K + (N-K) log K) is also in O(K + N log K) (for every positive K), and the proof of this is straightforward.
Yes, because (N-K) log K is at most N log K given your constraint 1 <= K <= N.
Not exactly.
If they are equivalent, then every function in O(k + (n-k)log k) is also in O(k + n log k) and vice-versa.
Let f(n,k) = n log k
This function is certainly in O(k + n log k), but not in O(k + (n-k)log k).
Let g(n,k) = k + (n-k)log k
Then as x approaches infinity, f(x,x)/g(x,x) grows without bound, since:
f(x,x) / g(x,x)
= (x log x) / x
= log x
See the definition of big-O notation for multiple variables: http://mathwiki.cs.ut.ee/asymptotics/04_multiple_variables
Wikipedia provides the same information, but in less accessible notation:
https://en.wikipedia.org/wiki/Big_O_notation#Multiple_variables
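A small numeric check of that counterexample (my own sketch): along the diagonal n = k, the ratio f(x,x)/g(x,x) is exactly log x, so no constant can bound it.

import math

def f(n, k):
    return n * math.log(k)

def g(n, k):
    return k + (n - k) * math.log(k)

for x in (10, 100, 1000, 10000):
    print(x, f(x, x) / g(x, x))   # prints log x: about 2.3, 4.6, 6.9, 9.2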

Are my Big O approximations correct?

I have an algorithm to determine if two strings are permutations of each other. The code can be found here: https://jsfiddle.net/bxohcgjn/
N = length of string A
M = length of string B
For time complexity, I have: O(N(N log N + M log M))
For space complexity, I have: O(N + M)
N log N for sorting A
M log M for sorting B
I understand that the browser's implementation for sort will change it, but I am assuming quicksort.
Just want to see if my thinking is correct on this.
About your time complexity: the for loop (the linear term) must not be multiplied by the sum of both sorts.
If an algorithm consists of a sequence of steps, its order is the sum of their orders:
O(alg) = O(step_1) + O(step_2) + ... + O(step_n)
In your case there are three steps (both sorts and the for loop):
O(is_permutation) = O(sort_A) + O(sort_B) + O(for)
                  = O(n log n) + O(m log m) + O(n)
which asymptotically is the maximum of them:
O(is_permutation) = max(O(n log n), O(m log m), O(n))
But since you check that both strings have the same length before applying the sorts, in the worst case (which is what you are analyzing) there is only one size n, so the expression becomes:
O(is_permutation) = max(O(n log n), O(n log n), O(n))
                  = max(O(n log n), O(n))
                  = O(n log n)
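The fiddle itself isn't reproduced here, but a minimal sketch of the sort-and-compare approach being analyzed (my own Python, not the original JavaScript) looks like this; the length check is what lets m collapse into n in the worst-case analysis above:

def is_permutation(a, b):
    # Strings of different lengths can never be permutations: O(1)
    if len(a) != len(b):
        return False
    # Sort both strings and compare: O(n log n) time, O(n) extra space
    return sorted(a) == sorted(b)

print(is_permutation("listen", "silent"))   # True
print(is_permutation("abc", "abd"))         # False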

Complexity of a particular divide and conquer algorithm

An algorithm decomposes (divides) a problem of size n into b sub-problems, each of size n/b, where b is an integer. The cost of decomposition is n, and C(1) = 1. Show, using repeated substitution, that for all values of b >= 2 the complexity of the algorithm is O(n lg n).
This is what I use for my initial equation: C(n) = C(n/b) + n,
and after k steps of substitution I get C(n) = C(n/b^k) + n * [sum from i=0 to k-1 of (1/b)^i],
with k = log_b(n).
I'm not sure I'm getting all of this right, because when I finish I don't get n lg n. Can anybody help me figure out what to do?
I think your recurrence is wrong. Since there are b separate subproblems of size n/b, there should be a coefficient of b in front of the C(n / b) term. The recurrence should be
C(1) = 1
C(n) = b C(n/b) + O(n).
Using the Master Theorem, this solves to O(n log n). Another way to see this is that after expanding the recurrence k times, we get
C(n) = b^k C(n / b^k) + kn
This terminates when k = log_b n. Plugging in that value of k gives b^(log_b n) * C(1) + n log_b n = n + n log_b n, which is O(n log n).
Hope this helps!
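A quick numeric sanity check of the corrected recurrence (my own sketch, assuming the decomposition cost is exactly n): computing C(n) = b*C(n/b) + n for powers of b and dividing by n*log_b(n) shows the ratio approaching 1.

import math

def C(n, b):
    # Corrected recurrence: b subproblems of size n/b plus linear decomposition cost.
    if n <= 1:
        return 1
    return b * C(n // b, b) + n

b = 2
for k in (4, 8, 12, 16):
    n = b ** k
    print(n, C(n, b) / (n * math.log(n, b)))   # 1.25, 1.125, 1.083, 1.0625 -> tends to 1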
