Using an algorithm Tree-Insert(T, v) that inserts a new value v into a binary search tree T, the following algorithm grows a binary search tree by repeatedly inserting each value in a given section of an array into the tree:
Tree-Grow(A, first, last, T)
1 for i ← first to last
2 do Tree-Insert(T, A[i])
If the tree is initially empty, and the length of the array section (i.e., last - first + 1) is n, what are the best-case and worst-case asymptotic running times of the above algorithm, respectively?
When n = 7, give a best-case instance (as an array containing the digits 1 to 7, in some order), and a worst-case instance (in the same form) of the algorithm.
If the array is sorted and all the values are distinct, find a way to modify Tree-Grow, so that it will always build the shortest tree.
What are the best-case and worst-case asymptotic running times of the modified algorithm, respectively?
Please tag homework questions with the homework tag. In order to do well on your final exam, I suggest you actually learn this stuff, but I'm not here to judge you.
1) It takes O(n) to iterate from first to last. In the best case each insertion into the binary search tree takes O(lg n), because the tree stays balanced, so the algorithm you have shown takes O(n lg n) in the best case.
The worst case of inserting into a binary search tree is when the tree is tall and skinny rather than bushy, essentially a linked list. In that case each insertion takes O(n), so the algorithm takes O(n^2) in the worst case.
2) Best Case: [4, 2, 6, 1, 3, 5, 7], Worst Case: [1, 2, 3, 4, 5, 6, 7]
3) Insert the element at the middle index (n/2) first, so that it becomes the root, then recursively do the same for the left half and the right half of the array (see the sketch below).
4) O(n lg n) in the best and worst case.
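A minimal sketch of 3), written as Java-style code like the array literals elsewhere on this page. The Node class and treeInsert method are placeholders standing in for the question's Tree-Insert; what matters is only the order in which treeGrow feeds values to it.

class Node {
    int key;
    Node left, right;
    Node(int key) { this.key = key; }
}

class BalancedTreeGrow {
    Node root;                               // the tree T

    // Ordinary BST insertion, standing in for Tree-Insert(T, v).
    void treeInsert(int v) {
        Node parent = null, cur = root;
        while (cur != null) {
            parent = cur;
            cur = (v < cur.key) ? cur.left : cur.right;
        }
        Node n = new Node(v);
        if (parent == null) root = n;
        else if (v < parent.key) parent.left = n;
        else parent.right = n;
    }

    // Modified Tree-Grow: insert the middle element of A[first..last] before
    // either half, so it becomes the root of that subtree and the tree built
    // from a sorted array is as short as possible.
    void treeGrow(int[] a, int first, int last) {
        if (first > last) return;
        int mid = (first + last) / 2;
        treeInsert(a[mid]);            // median first
        treeGrow(a, first, mid - 1);   // then the left half
        treeGrow(a, mid + 1, last);    // then the right half
    }
}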
I hope this helps.
Given an already sorted array of n distinct elements where only the last element is out of order, would insertion sort be the fastest algorithm to be used here?
Ex: [1, 3, 5, 6, 7, 9, 2]
If it was an array, yes, insertion sort.
Worst case complexity: O(n)
Worst case scenario: Unsorted element is the smallest element.
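A minimal insertion-sort sketch (standard algorithm, nothing beyond the question assumed): on an array that is sorted except for its last element, the first n - 2 passes shift nothing and only the final element is walked back to its place, so the total work is O(n).

// Standard insertion sort. For [1, 3, 5, 6, 7, 9, 2] only the last pass
// does any shifting, moving 2 past the six larger elements.
static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {   // shift larger elements one slot right
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}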
If the data were instead in a structure where insertion costs constant time but elements can still be accessed by index (note that a plain linked list does not support the random access that binary search needs), then binary searching for the insertion point would be the most efficient way.
Worst case complexity of the search: O(log(n))
In Cormen's own words - "The difference is that with the deterministic algorithm, a particular input can elicit that worst-case behavior. With the randomized algorithm, however, no input can always elicit the worst-case behavior."
How does adding a randomized pivot change anything? The algorithm is still going to perform badly on some particular input, and if every kind of input is considered equally likely, this is no better than standard quicksort; the only difference is that we don't actually know which particular input is going to cause the worst-case time complexity. So why is the randomized version considered better?
Consider the following version of quicksort, where we always pick the last element as the pivot. Now consider the following array:
int[] arr = {9, 8, 7, 6, 5, 4, 3, 2, 1};
When this array is sorted using our version of quicksort, the first call will pick the last element, 1, as its pivot, which happens to be the smallest value in the array. In the first partition step, it will change the array like this:
arr = [1, 8, 7, 6, 5, 4, 3, 2, 9];
Now, it will recurse on the sub-arrays on either side of the pivot:
s1 = [];                            (nothing is smaller than the pivot 1)
s2 = [8, 7, 6, 5, 4, 3, 2, 9];
In s2 it will pick the last element, 9, as its pivot, and again every other element ends up on a single side. Each call splits off only its pivot, so if we try to formulate a recurrence relation for the complexity, it will be
T(n) = T(n-1) + O(n)
which corresponds to O(n^2).
So, for this array, the standard version will always take O(n^2) time.
In the randomized version, we first exchange the last element with a randomly chosen element of the array and then select it as the pivot. For the given array, this pivot will usually split the array into two parts of comparable size, so in expectation the recurrence behaves like
T(n) = 2T(n/2) + O(n)
which works out to O(n log n).
That's why we consider randomized quicksort better than standard quicksort: whatever the input looks like, the probability of getting a long run of bad splits is very low.
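A minimal sketch of that randomized version (method names are illustrative): swap a uniformly random element into the last slot, then partition around it exactly as the deterministic version would.

import java.util.Random;

static final Random RNG = new Random();

// Randomized quicksort: a random element becomes the pivot on every call.
static void randomizedQuicksort(int[] a, int lo, int hi) {
    if (lo >= hi) return;
    int r = lo + RNG.nextInt(hi - lo + 1);   // random index in [lo, hi]
    swap(a, r, hi);                          // move the random pivot to the end
    int p = partition(a, lo, hi);            // Lomuto partition around a[hi]
    randomizedQuicksort(a, lo, p - 1);
    randomizedQuicksort(a, p + 1, hi);
}

static int partition(int[] a, int lo, int hi) {
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; j++)
        if (a[j] <= pivot) swap(a, ++i, j);  // grow the "<= pivot" prefix
    swap(a, i + 1, hi);                      // place the pivot between the halves
    return i + 1;
}

static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }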
The difference is that with the deterministic algorithm, a particular input can elicit that worst-case behavior. With the randomized algorithm, however, no input can always elicit the worst-case behavior.
This should be clarified to mean a truly randomized algorithm. If a deterministic pseudo-random generator with a predictable seed is used instead, then a deliberately crafted input can still elicit worst-case behavior.
With the randomized algorithm, however, no input can always elicit the worst-case behavior.
This should also be clarified: even with a truly randomized algorithm, some specific input could still elicit worst-case behavior on one or more invocations, but no input can elicit worst-case behavior on every invocation of a truly randomized quicksort.
Most library implementations of single-pivot quicksort use median-of-3 or median-of-9 pivot selection, since they can't rely on having fast instructions for random numbers, like x86 RDRAND, and a fast divide (for the modulo operation). If a quicksort were somehow part of an encryption scheme, then a truly randomized algorithm could be used to avoid timing-based attacks.
I have created a data structure that implements a maximum binary heap. I'm trying to find two sequences of n numbers for which inserting all n of them takes O(n) time and O(n log n) time, respectively.
Is this possible?
Let me try to restate what you are asking; please correct me if this is wrong.
So a binary heap data structure has O(log n) time complexity per insertion. The process of insertion in a max-heap is as follows:
the tree is a complete binary tree, i.e. all levels are full except possibly the last one.
insert the new node at the leftmost open spot in the last level.
if the node is larger than its parent, the two are swapped.
the process is repeated until the node reaches its appropriate level (this sift-up is the heapify step mentioned below).
So for your question,
you want a sequence of n numbers whose total insertion time is O(n). This means that each insertion takes O(1), i.e. constant time, so we need a sequence where the heapify (sift-up) step never swaps: every newly inserted value must already be no larger than its parent. A sequence like the following obviates the need for any swaps:
[10, 8, 9, 4, 5, 6, 7 ]
for the second one, you want O(n log n), which means each insertion takes about log n swaps, the worst case for insertion into a binary heap. An increasing sequence forces exactly that, because every new element is larger than everything already in the heap:
[ 1, 2, 3, 4, 5, 6, 7]
each element from the 2nd one onward has to be compared with and swapped past every ancestor, all the way up to the root.
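A minimal array-backed max-heap insertion sketch (class and method names are illustrative) that makes the two cases concrete: for [10, 8, 9, 4, 5, 6, 7] the while loop never runs, while for [1, 2, 3, 4, 5, 6, 7] every new element is swapped all the way up to the root.

import java.util.ArrayList;
import java.util.List;

class MaxHeap {
    private final List<Integer> a = new ArrayList<>();

    void insert(int v) {
        a.add(v);                          // leftmost open spot in the last level
        int i = a.size() - 1;
        while (i > 0 && a.get(i) > a.get((i - 1) / 2)) {  // larger than parent?
            int parent = (i - 1) / 2;
            int tmp = a.get(i);            // swap with the parent and move up
            a.set(i, a.get(parent));
            a.set(parent, tmp);
            i = parent;
        }
    }
}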
How is it different if I select a randomized pivot versus just selecting the first element as the pivot in an unordered set/list?
If the set is unordered, isn't selecting the first value in the set random in itself? So essentially, I am trying to understand how/if randomizing promises a better worst-case runtime.
I think you may be mixing up the concepts of arbitrary and random. It's arbitrary to pick the first element of the array - you could pick any element you'd like and it would work equally well - but it's not random. A random choice is one that can't be predicted in advance. An arbitrary choice is one that can be.
Let's imagine that you're using quicksort on the sorted sequence 1, 2, 3, 4, 5, 6, ..., n. If you choose the first element as a pivot, then you'll choose 1 as the pivot. All n - 1 other elements then go to the right and nothing goes to the left, and you'll recursively quicksort 2, 3, 4, 5, ..., n.
When you quicksort that range, you'll choose 2 as the pivot. Partitioning the elements then puts nothing on the left and the numbers 3, 4, 5, 6, ..., n on the right, so you'll recursively quicksort 3, 4, 5, 6, ..., n.
More generally, after k steps, you'll choose the number k as a pivot, put the numbers k+1, k+2, ..., n on the right, then recursively quicksort them.
The total work done here ends up being Θ(n^2), since on the first pass (to partition 2, 3, ..., n around 1) you have to look at n-1 elements, on the second pass (to partition 3, 4, 5, ..., n around 2) you have to look at n-2 elements, etc. This means that the work done is (n-1) + (n-2) + ... + 1 = n(n-1)/2 = Θ(n^2), quite inefficient!
Now, contrast this with randomized quicksort. In randomized quicksort, you truly choose a random element as your pivot at each step. This means that while you technically could choose the same pivots as in the deterministic case, it's very unlikely (the probability would be roughly 2^(2 - n), which is quite low) that this will happen and trigger the worst-case behavior. You're more likely to choose pivots closer to the center of the array, and when that happens the recursion branches more evenly and thus terminates a lot faster.
The advantage of randomized quicksort is that there's no one input that will always cause it to run in time Θ(n^2), and the runtime is expected to be O(n log n) on every input. Deterministic quicksort algorithms usually have the drawback that either (1) they run in worst-case time O(n log n), but with a high constant factor, or (2) they run in worst-case time O(n^2) and the sort of input that triggers this case is fixed and known in advance.
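For contrast, here is a minimal sketch of the deterministic variant discussed above, with the first element as the pivot (moved to the end so a standard Lomuto partition can be reused). On the sorted input 1, 2, ..., n every call puts all the remaining elements on one side, which is exactly the Θ(n^2) pattern described.

// Deterministic quicksort using the first element as the pivot. On already
// sorted input every partition is maximally unbalanced, so the recursion
// depth is n and the total work is Theta(n^2).
static void quicksortFirstPivot(int[] a, int lo, int hi) {
    if (lo >= hi) return;
    int t = a[lo]; a[lo] = a[hi]; a[hi] = t;    // move the first element to the end
    int pivot = a[hi], i = lo - 1;
    for (int j = lo; j < hi; j++)               // Lomuto partition
        if (a[j] <= pivot) { int s = a[++i]; a[i] = a[j]; a[j] = s; }
    t = a[i + 1]; a[i + 1] = a[hi]; a[hi] = t;  // place the pivot
    quicksortFirstPivot(a, lo, i);              // elements <= pivot
    quicksortFirstPivot(a, i + 2, hi);          // elements > pivot
}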
In standard quicksort the pivot is always the element at the rightmost index of the selected sub-array, whereas in randomized quicksort the pivot can be any element of the sub-array, chosen at random.
Actually, I am teaching myself algorithms, and here I am trying to solve the following problem:
We have an array A of n positive integers in arbitrary order, and we have k with 1 <= k <= n. The question is to output the k smallest odd integers. If the number of odd integers in A is less than k, we should report all of them. For example,
if A = [2, 17, 3, 10, 28, 5, 9, 4, 12,13, 7] and k = 3, the output should be 3, 5, 9.
I want to solve this problem in O(n) time.
My current solution is to build another array containing only the odd numbers, and then apply a selection algorithm: find the median, partition the list into L (smaller than the median), M (equal to the median), and R (larger), and compare k as follows:
If |L| < k <= |L| + |M|, return the median
else if k <= |L|, solve the problem recursively on L
else solve the problem recursively on (R, k - (|L| + |M|))
Any help is appreciated.
Assuming the output can be in any order:
Create a separate array with only odd numbers.
Use a selection algorithm to determine the k-th item. One such algorithm is quickselect (which runs in O(n) on average), which is related to quicksort - it partitions the array by some pivot, and then recursively goes to one of the partitioned sides, based on the sizes of each. See this question for more details.
Since quickselect partitions the input, you will be able to output the results directly after running this algorithm (as Karoly mentioned).
Both of the above steps take O(n), thus the overall running time is O(n).
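A minimal sketch of those steps (helper names are illustrative, not from the question); the quickselect here uses a random pivot, so it runs in O(n) on average.

import java.util.Arrays;
import java.util.Random;

static final Random RNG = new Random();

// Return the k smallest odd values of a (in arbitrary order), or all odd
// values if there are fewer than k of them.
static int[] kSmallestOdd(int[] a, int k) {
    // Step 1: copy only the odd numbers into a separate array.
    int[] odds = Arrays.stream(a).filter(x -> x % 2 != 0).toArray();
    if (odds.length <= k) return odds;
    // Step 2: quickselect the k-th smallest; it also partitions the array so
    // that the k smallest odd numbers end up in the first k slots.
    quickselect(odds, 0, odds.length - 1, k - 1);
    // Step 3: output the first k slots (in arbitrary order).
    return Arrays.copyOf(odds, k);
}

static void quickselect(int[] a, int lo, int hi, int target) {
    while (lo < hi) {
        int r = lo + RNG.nextInt(hi - lo + 1);        // random pivot index
        int t = a[r]; a[r] = a[hi]; a[hi] = t;        // move pivot to the end
        int pivot = a[hi], i = lo - 1;
        for (int j = lo; j < hi; j++)                 // Lomuto partition
            if (a[j] <= pivot) { int s = a[++i]; a[i] = a[j]; a[j] = s; }
        int p = i + 1; t = a[p]; a[p] = a[hi]; a[hi] = t;
        if (p == target) return;                      // pivot landed on the k-th spot
        if (p < target) lo = p + 1; else hi = p - 1;  // continue on one side only
    }
}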
If you need the output in ascending order:
If k = n, and all the numbers are odd, then an O(n) solution to this would be an O(n) sorting algorithm, but no-one knows of such an algorithm.
To anyone who's considering disagreeing by saying that some non-comparison-based sort is O(n): it's not, since each of these algorithms has some other factor in its complexity, such as the size of the numbers.
The best you can do here, with unbounded numbers, is to use the approach suggested in Proger's answer (O(n + k log n)), or iterate through the input, maintaining a max-heap of the k smallest odd numbers seen so far (O(n log k)).
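A minimal sketch of that second option, a max-heap capped at k entries, using java.util.PriorityQueue (the method name is illustrative); popping the heap at the end yields the k smallest odd numbers in descending order.

import java.util.Collections;
import java.util.PriorityQueue;

// O(n log k): scan once, keeping only the k smallest odd values seen so far.
static PriorityQueue<Integer> kSmallestOddHeap(int[] a, int k) {
    PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
    for (int x : a) {
        if (x % 2 == 0) continue;           // skip even numbers
        if (heap.size() < k) {
            heap.offer(x);
        } else if (x < heap.peek()) {       // smaller than the current k-th smallest
            heap.poll();                    // drop the largest of the kept values
            heap.offer(x);
        }
    }
    return heap;
}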