Can't determine the runtime of this algorithm

I have an algorithm here.
[algorithm image]
What it does: it traverses an array, finds the k largest values (3 in the example below), and returns their sum.
For example, the array [1,2,3,4,5] returns 12 (3+4+5=12).
The image says the algorithm is O(nlogk), and that is what I cannot understand.
The following is my reasoning about the first for loop in the image:
The heap methods insert() and deleteMin() both take O(logn). So each iteration of the first for loop costs O(2*logn) by adding their runtimes, which is simply O(logn). Since the first for loop iterates over every element in the array, its total runtime is O(nlogn).
The following is my reasoning about the second while loop in the image:
In the previous for loop, we deleted some of the minimum values whenever h.size() > k, so the heap currently holds k values. "sum=sum+h.min()" takes O(logn) because, if I understand correctly, searching for the minimum value in a heap takes O(logn), and "h.deleteMin()" also takes O(logn) because it has to search again and then delete. That is O(2*logn) by adding their runtimes, which is simply O(logn). Since we iterate this while loop only k times (the heap holds k elements), the second while loop results in O(k*logn).
So we have O(nlogn) from the first for loop and O(klogn) from the second while loop. O(nlogn) clearly dominates O(klogn), since k is some constant. Thus the algorithm should end up being O(nlogn).
But the answer says it is O(nlogk) instead of O(nlogn).
Can you explain the reason?

Operations on a heap take O(log(size_of_heap)). In the case of this algorithm, the heap size is k (except during the first several iterations).
So we get O(total_number_of_elements*log(size_of_heap)) = O(n*log(k)).

Your assumption that insert() and deleteMin() take O(log n) is not correct. The 'n' in O(log n) is the number of elements in the heap, which in this case is k.
Hence, for the first loop you have O(2*logk) per element, O(nlogk) in total, and the second loop is O(klogk).
Together the total complexity can be stated as O(n*logk).
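
For concreteness, here is a minimal sketch of the algorithm as described, assuming Python's heapq stands in for the heap in the image (the function name and exact structure are my reconstruction, not the original pseudocode):

    import heapq

    def sum_of_k_largest(arr, k):
        h = []  # min-heap, never allowed to grow past k elements
        for x in arr:
            heapq.heappush(h, x)       # insert(): O(log k), since len(h) <= k+1
            if len(h) > k:
                heapq.heappop(h)       # deleteMin(): evict the smallest, O(log k)
        # The heap now holds exactly the k largest values; add them up.
        total = 0
        while h:
            total += heapq.heappop(h)  # runs k times, O(log k) each
        return total

    print(sum_of_k_largest([1, 2, 3, 4, 5], 3))  # prints 12

Because the heap is capped at k elements, every push and pop costs O(log k) rather than O(log n), and running that over all n elements is what gives O(n*logk).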

A heap with n elements that supports Insert and Extract-Min: which of the following tasks can you achieve in O(logn) time?

For the following question:
Question 3
You are given a heap with n elements that supports Insert and Extract-Min. Which of the following tasks can you achieve in O(logn) time?
Find the median of the elements stored in the heap.
Find the fifth-smallest element stored in the heap.
Find the largest element stored in the heap.
Why is "Find the largest element stored in the heap."not correct, my understanding here is that you can use logN time to go to the bottom of the heap, and one of the element there must be the largest element.
"Find the fifth-smallest element stored in the heap." this should take constant time right, because you only need to go down 5 layers at most?
"Find the median of the elements stored in the heap. " should this take O(n) time? because we extract min for the n elements to get a sorted array, and take o(1) to find the median of it?
It depends on what the running times of the operations insert and extract-min are. In traditional heaps, both take Θ(log n) time. However, in finger-tree-based heaps, only insert takes Θ(log n) time, while extract-min takes O(1) time. There, you can find the fifth-smallest element in O(5) = O(1) time and the median in O(n/2) = O(n) time. You can also find the largest element in O(n) time.
Why is "Find the largest element stored in the heap."not correct, my understanding here is that you can use logN time to go to the bottom of the heap, and one of the element there must be the largest element.
The lowest level of the heap contains half of the elements. More precisely, half of the elements of the heap are leaves (they have no children). The largest element in the heap is one of those, so finding it requires examining n/2 items. Except that the heap only supports insert and extract-min, so you end up having to call extract-min on every element. Finding the largest element will take O(n log n) time.
"Find the fifth-smallest element stored in the heap." this should take constant time right, because you only need to go down 5 layers at most?
This can be done in log(n) time. Actually 5*log(n), because you have to call extract-min five times, but we ignore constant factors. However, it's not constant time, because the complexity of extract-min depends on the size of the heap.
"Find the median of the elements stored in the heap." should this take O(n) time? because we extract min for the n elements to get a sorted array, and take o(1) to find the median of it?
The median is the middle element, so you only have to remove n/2 elements from the heap. But removing an item from the heap is a log(n) operation, so the complexity is O((n/2) log n), and since we ignore constant factors in algorithmic analysis, it's O(n log n).
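
To make the counting concrete, here is a small sketch under the traditional-heap assumption, using Python's heapq (the copies exist only to keep the sketch non-destructive and themselves cost O(n)):

    import heapq

    def fifth_smallest(h):
        # Four extract-mins plus a peek: five heap operations at O(log n)
        # each, so O(5 log n) = O(log n) overall, not O(1).
        tmp = list(h)              # a copy of a heap array is still a heap
        for _ in range(4):
            heapq.heappop(tmp)
        return tmp[0]

    def median(h):
        # n/2 extract-mins at O(log n) each: O((n/2) log n) = O(n log n).
        tmp = list(h)
        for _ in range(len(h) // 2):
            heapq.heappop(tmp)
        return tmp[0]

    data = [7, 1, 5, 9, 3, 8, 2]
    heapq.heapify(data)
    print(fifth_smallest(data))    # 7
    print(median(data))            # 5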

Quick sort time and space complexity?

Quicksort is an in-place algorithm which does not use any auxiliary array. So why is its memory complexity O(nlog(n))?
Similarly, I understand that its worst-case time complexity is O(n^2), but I am not getting why the average-case time complexity is O(nlog(n)). Basically, I am not sure what we mean when we say average-case complexity.
To your second point, here is an excerpt from Wikipedia:
The most unbalanced partition occurs when the partitioning routine returns one of the sublists of size n − 1. This may occur if the pivot happens to be the smallest or largest element in the list, or in some implementations (e.g., the Lomuto partition scheme as described above) when all the elements are equal.
If this happens repeatedly in every partition, then each recursive call processes a list of size one less than the previous list. Consequently, we can make n − 1 nested calls before we reach a list of size 1. This means that the call tree is a linear chain of n − 1 nested calls. The i-th call does O(n − i) work to do the partition, and Σ_{i=0}^{n} (n − i) = O(n²), so in that case quicksort takes O(n²) time.
Because you usually don't know which exact numbers you have to sort, and you don't know which pivot element you will choose, there is a good chance that your pivot element isn't the smallest or biggest number in the array. If you have an array of n distinct numbers, the chance that a given pivot avoids the worst case is (n − 2)/n.
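
For reference, here is a minimal randomized quicksort sketch (my own illustration, not from the question). The only extra memory it uses is the recursion stack, which is O(log n) deep on average and O(n) deep in the worst case; "average case" means the expected running time over random pivot choices, which works out to O(n log n):

    import random

    def quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        # A random pivot makes the unbalanced O(n^2) case unlikely on any input.
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        i = lo
        for j in range(lo, hi):        # O(hi - lo) partition work per call
            if a[j] < a[hi]:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]      # the pivot lands at its final position
        quicksort(a, lo, i - 1)        # the recursion stack is the space cost
        quicksort(a, i + 1, hi)

    nums = [5, 2, 9, 1, 7]
    quicksort(nums)
    print(nums)  # [1, 2, 5, 7, 9]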

The time complexity of quick select

I read that the time complexity of quick select is:
T(n) = T(n/5) + T(7n/10) + O(n)
I read the above as: time to quickselect from n elements = (time to quickselect from n/5 elements) + (time to quickselect from 7n/10 elements) + (some constant * n).
So I understand that once we find a decent pivot, only 7n/10 elements are left to recurse on, and one round of arranging the elements around the pivot takes time n.
But the n/5 part confuses me. I know it has to do with the median of medians, but I don't quite get it.
Median of medians, from what I understood, is recursively splitting into groups of 5 and finding their medians, until you get 1 value.
I found that the time taken to do that is about n.
So T_mom(n) = n.
How do you get from that to the T(n/5) term in T_quickselect(n)?
In other words, this is what I think the equation should read:
T(n)= O(n)+n+T(7n/10)
where,
O(n) -> for finding the median
n -> for getting the pivot into its position
T(7n/10) -> for doing the same thing on the remaining 7n/10 elements (worst case)
Can someone tell me where I'm going wrong?
In this setup, T(n) refers to the number of steps required to compute MoM on an array of n elements. Let's go through the algorithm one step at a time and see what happens.
First, we break the input into blocks of size 5, sort each block, form a new array of the medians of those blocks, and recursively call MoM to get the median of that new array. Let's see how long each of those steps takes:
Break the input into blocks of size 5: this could be done in time O(1) by just implicitly partitioning the array into blocks without moving anything.
Sort each block: sorting an array of any constant size takes time O(1). There are O(n) such blocks (specifically, ⌈n / 5⌉), so this takes time O(n).
Get the median of each block and form a new array from those medians. The median element of each block can be found in time O(1) by just looking at the center element. There are O(n) blocks, so this step takes time O(n).
Recursively call MoM on that new array. This takes time T(⌈n/5⌉), since we're making a recursive call on the array of that size we formed in the previous step.
So this means that the logic to get the actual median of medians takes time O(n) + T(⌈n/5⌉).
So where does the T(7n/10) part come from? Well, the next step in the algorithm is to use the median of medians we found in step (4) as a partition element to split the elements into elements less than that pivot and elements greater than that pivot. From there, we can determine whether we've found the element we're looking for (if it's at the right spot in the array) or whether we need to recurse on the left or right regions of the array. The advantage of picking the median of the block medians as the splitting point is that it guarantees a worst-case 70/30 split in this step between the smaller and larger elements, so if we do have to recursively continue the algorithm, in the worst case we do so with roughly 7n/10 elements.
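As a sanity check (a standard substitution argument, not part of the answer above): since 1/5 + 7/10 = 9/10 < 1, guessing T(n) <= 10cn in T(n) <= T(n/5) + T(7n/10) + cn gives T(n) <= 2cn + 7cn + cn = 10cn, so the whole recurrence solves to T(n) = O(n).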
In the median-of-medians part, we do the following:
We take the median of sublists, each of which has at most 5 elements. Each of these lists needs O(1) operations, and there are n/5 such lists, so in total it takes O(n) just to find the median of each of them.
We take the median of those n/5 medians (the median of medians). This needs T(n/5), because there are only n/5 elements we have to select from.
So the median-of-medians part is actually T(n/5) + O(n). By the way, the T(7n/10) part is not exactly what you said.
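
Putting the pieces together, here is a compact Python sketch of the selection algorithm being analyzed (k is 1-indexed; the function name and the list-comprehension partitioning are my own choices, not from the question):

    def mom_select(a, k):
        # Base case: constant-sized arrays are sorted directly in O(1).
        if len(a) <= 5:
            return sorted(a)[k - 1]
        # Step 1: median of each block of 5, O(n) total.
        medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
                   for i in range(0, len(a), 5)]
        # Step 2: recursively find the median of those medians, the T(n/5) term.
        pivot = mom_select(medians, (len(medians) + 1) // 2)
        # Step 3: partition around the pivot, O(n).
        lo = [x for x in a if x < pivot]
        hi = [x for x in a if x > pivot]
        eq = len(a) - len(lo) - len(hi)
        # Step 4: recurse into one side, at most ~7n/10 elements, the T(7n/10) term.
        if k <= len(lo):
            return mom_select(lo, k)
        if k <= len(lo) + eq:
            return pivot
        return mom_select(hi, k - len(lo) - eq)

    print(mom_select([9, 1, 7, 3, 5, 8, 2, 6, 4], 5))  # 5 (the median)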

Get the k smallest elements of an array using quick sort

How would you find the k smallest elements of an unsorted array using quicksort (other than just sorting and taking the first k elements)? Would the worst-case running time be the same, O(n^2)?
You could optimize quicksort: all you have to do is not run the recursion on the portions of the array other than the "first" half, until your partition lands at position k. If you don't need the output sorted, you can stop there.
Warning: non-rigorous analysis ahead.
However, I think the worst-case time complexity will still be O(n^2). That occurs when you always pick the biggest or smallest element as your pivot, and the algorithm devolves into something like bubble sort (i.e., you aren't able to pick a pivot that divides and conquers).
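
Here is a minimal sketch of that idea (a randomized, in-place partial quicksort, essentially quickselect; the names and the random-pivot choice are mine):

    import random

    def k_smallest(arr, k):
        a = list(arr)              # work on a copy; assume 1 <= k <= len(a)

        def partition(lo, hi):
            # Lomuto partition around a random pivot; returns its final index.
            p = random.randint(lo, hi)
            a[p], a[hi] = a[hi], a[p]
            i = lo
            for j in range(lo, hi):
                if a[j] < a[hi]:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[i], a[hi] = a[hi], a[i]
            return i

        lo, hi = 0, len(a) - 1
        while lo < hi:
            p = partition(lo, hi)
            if p == k - 1:         # positions 0..k-1 now hold the k smallest
                break
            if p < k - 1:
                lo = p + 1         # only recurse into the side containing k
            else:
                hi = p - 1
        return a[:k]               # the k smallest, not necessarily sorted

    print(k_smallest([9, 4, 7, 1, 3, 8], 3))  # [1, 3, 4] in some order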
Another solution (if the only purpose of this collection is to pick out the k min elements) is to use a max-heap limited to k nodes (tree height ceil(log(k))): insert each element, and whenever the heap grows past k nodes, remove its maximum. Each insert and removal then costs O(log(k)), so processing all n elements takes O(n*log(k)) in total (versus O(n*log(n)) for a full heapsort). Popping the heap at the end gives the k smallest elements back in sorted order, in linearithmic worst-case time. Same with mergesort.
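
And a sketch of that heap variant (Python's heapq is a min-heap, so the max-heap is simulated by negating values; the cap at k is what keeps every operation at O(log k)):

    import heapq

    def k_smallest_heap(arr, k):
        h = []                        # negatives of the k smallest seen so far
        for x in arr:
            heapq.heappush(h, -x)     # insert: O(log k)
            if len(h) > k:
                heapq.heappop(h)      # evict the current maximum: O(log k)
        return sorted(-v for v in h)  # the k smallest, in increasing order

    print(k_smallest_heap([9, 4, 7, 1, 3, 8], 3))  # [1, 3, 4]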

Sorting m sets of total O(n) elements in O(n)

Suppose we have m sets S1,S2,...,Sm of elements from {1...n}
Given that m=O(n) and |S1|+|S2|+...+|Sm|=O(n),
sort all the sets in O(n) time and O(n) space.
I was thinking of using the counting sort algorithm on each set.
Counting sort on each set will be O(|S1|)+O(|S2|)+...+O(|Sm|) < O(n),
and even in the worst case, where one set consists of n elements, it will still take O(n).
But will it solve the problem, and will it still hold that it uses only O(n) space?
Your approach won't necessarily work in O(n) time. Imagine you have n sets of one element each, where each set holds just the value n. Then each run of counting sort takes time Θ(n) to complete, because its counting array has to span {1...n}, so the total runtime will be Θ(n²).
However, you can use a modified counting sort to solve this by effectively running counting sort on all the sets at the same time. Create an array of length n that stores lists of numbers. Then iterate over all the sets, and for each element, if its value is k and its set number is r, append r to the list at position k. This process essentially builds a histogram of the distribution of the elements in the sets, where each element is annotated with the set it came from. Then iterate over the array and reconstruct the sets in sorted order using logic similar to counting sort.
Overall, this algorithm takes time Θ(n): it takes Θ(n) to initialize the array, O(n) total time to distribute the elements, and O(n) time to write them back. It also uses only Θ(n) space, since the array holds n lists and, across all of them, a total of O(n) distributed elements.
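
A sketch of that distribution idea (the function name and the 1-based value handling are my own choices):

    def sort_all_sets(sets, n):
        # buckets[v] records which sets contain the value v, built in one
        # pass over all elements: O(n) total work and space.
        buckets = [[] for _ in range(n + 1)]
        for r, s in enumerate(sets):
            for v in s:
                buckets[v].append(r)
        # Sweep the values 1..n in increasing order and hand each one back
        # to its set, so every set comes out sorted.
        result = [[] for _ in sets]
        for v in range(1, n + 1):
            for r in buckets[v]:
                result[r].append(v)
        return result

    print(sort_all_sets([{3, 1}, {2}, {5, 4, 1}], 5))
    # [[1, 3], [2], [1, 4, 5]]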
Hope this helps!
