Given an already sorted array of n distinct elements where only the last element is out of order, would insertion sort be the fastest algorithm to be used here?
Ex: [1, 3, 5, 6, 7, 9, 2]
If it was an array, yes, insertion sort.
Worst case complexity: O(n)
Worst case scenario: Unsorted element is the smallest element.
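If it was a linked list, the insertion itself is constant time once the position is known, but binary search needs random access, so finding the position still means walking the list.
Worst case complexity: O(n) to locate the position on a plain linked list; O(log(n)) is only achievable on a structure that supports both fast search and fast insertion, such as a balanced search tree.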
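As a minimal sketch of the array case, assuming the input is sorted except for its last element, one insertion-sort pass is enough. The function name fix_last is hypothetical and only used for illustration.

```python
def fix_last(a):
    """Insert the last element of `a` into the sorted prefix a[:-1], in place.

    Returns the number of elements shifted, which is at most len(a) - 1.
    """
    if not a:
        return 0
    key = a[-1]
    i = len(a) - 2
    shifts = 0
    # Shift larger elements one slot to the right until key's slot is found.
    while i >= 0 and a[i] > key:
        a[i + 1] = a[i]
        i -= 1
        shifts += 1
    a[i + 1] = key
    return shifts


data = [1, 3, 5, 6, 7, 9, 2]
shifts = fix_last(data)
print(data, shifts)  # [1, 2, 3, 5, 6, 7, 9] 5
```

The number of shifts grows with how far the last element has to travel, which is why the smallest element in last position is the worst case.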
I encountered a question like this for mergesort specifically and was wondering how one approaches a question like this for other algorithms (insertion sort, heapsort, quicksort, etc.).
Is it safe to assume that the nth best/worst arrangement for any algorithm is the nth step of solving the best/worst arrangement for the same set of data?
Example:
If the worst case for mergesort with the following array of integers [1,2,3,4,5,6,7,8] is [1,5,3,7,2,6,4,8], what is the next worst case for this array of integers?
I assumed it would be the next arrangement when solving the worst case which is [1,3,5,7,2,6,4,8]. Am I approaching such a question wrongly?
The concept of a "next-best" or "next-worst" case is not really well-defined in the first place. Neither is the concept of "the state of the array after one step", because not all algorithms modify an array in-place.
When we say the "worst case" of an algorithm, we don't mean a single input to an algorithm. For example, the array [5, 4, 3, 2, 1] is not - by itself - the worst case of the insertion sort algorithm. This array is one of the worst inputs (i.e. highest number of steps to compute) for insertion sort out of arrays of length 5, but we are very rarely interested in arrays of one specific length.
What we mean by "best case" or "worst case" is actually an infinite family of inputs, such that each member of that family is a best or worst input for its own value of n, and the family must contain inputs for arbitrarily large values of n. So, for example:
The infinite set of arrays {[1], [2, 1], [3, 2, 1], [4, 3, 2, 1], ...} is a worst case for insertion sort. For inputs from this infinite set, the asymptotic complexity of insertion sort is Θ(n^2).
The infinite set of arrays {[1], [1, 2], [1, 2, 3], [1, 2, 3, 4], ...} is a best case for insertion sort. For inputs from this infinite set, the asymptotic complexity of insertion sort is Θ(n).
Note that the (larger) infinite set of all arrays which are in descending order is also a worst case for insertion sort, and the (larger) infinite set of all arrays in ascending order is also a best case. So the family is not unique, but the asymptotic complexity of the algorithm on inputs from any two "best case" (or any two "worst case") families is the same.
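To make the two families concrete, here is a small sketch that counts element shifts in a textbook insertion sort. The function name insertion_sort_shifts is an assumption for illustration, and shifts are used as a stand-in for "steps".

```python
def insertion_sort_shifts(values):
    """Sort a copy of `values` with insertion sort and return the shift count."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Move every larger element of the sorted prefix one slot right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts


for n in (4, 8, 16):
    descending = range(n, 0, -1)   # member of the worst-case family
    ascending = range(1, n + 1)    # member of the best-case family
    print(n, insertion_sort_shifts(descending), insertion_sort_shifts(ascending))
```

For the descending inputs the count is n(n-1)/2, which grows quadratically, while for the ascending inputs it stays at zero and only the linear outer loop remains.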
Now we've got that out of the way, let's think about what a "next-best" or "next-worst" case would have to mean. If the asymptotic complexity of insertion sort on some family of inputs is also Θ(n^2), then that family is a worst case for insertion sort; so the asymptotic complexity of a "next-worst" case would have to be something lower than Θ(n^2).
But however small a gap you choose, it is not the "next-worst":
If you choose a family where the complexity is Θ(n^1.999), then it is not "next-worst" because I can find another family where the complexity is Θ(n^1.9999).
If you choose a family where the complexity is Θ(n^2 / log n), I can find one where it's Θ(n^2 / log log n).
That is, the asymptotic complexities of different families of possible inputs for insertion sort form a dense order: for any two different complexities there is another complexity in between those two, so there is no "next" or "previous" one.
I am reading about use cases of Selection Sort, and this source says:
(selection sort is used when...) cost of writing to a memory matters like in flash memory (number of writes/swaps is O(n) as compared to O(n^2) of bubble sort)
We can even see O(n^2) swaps in this example:
[1, 2, 3, 4, 5]. It's going to have 4 swaps, then 3, then 2, and 1. That is O(n^2), not O(n) swaps. Why do they say the opposite?
A selection sort has a time complexity of O(n^2), but only O(n) swaps.
In each iteration i, you go over all the remaining items (at indexes i and onwards), find the right value to populate that index, and swap it there. Each iteration therefore performs at most one swap, so in total you perform O(n^2) comparisons, but only O(n) swaps.
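The difference between the two counts can be checked with a short sketch. The function name selection_sort_counts is hypothetical, and the code assumes the usual "find the minimum, then swap once" formulation of selection sort.

```python
def selection_sort_counts(values):
    """Return (sorted list, comparisons, swaps) for a basic selection sort."""
    a = list(values)
    n = len(a)
    comparisons = 0
    swaps = 0
    for i in range(n - 1):
        smallest = i
        # Scan the unsorted suffix for the minimum: this is the O(n^2) part.
        for j in range(i + 1, n):
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        # At most one swap per outer iteration: this is the O(n) part.
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
            swaps += 1
    return a, comparisons, swaps


print(selection_sort_counts([2, 3, 4, 5, 1]))  # ([1, 2, 3, 4, 5], 10, 4)
print(selection_sort_counts([1, 2, 3, 4, 5]))  # ([1, 2, 3, 4, 5], 10, 0)
```

Both inputs need 10 comparisons for n = 5, but the swap count never exceeds n - 1.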
I have created a data structure that implements a maximum binary heap. I'm trying to find 2 sequences of n numbers for which inserting all of them takes O(n) and O(n log n) time, respectively.
Is this possible?
Let me try to restate what you are asking; please correct me if this is wrong.
So a binary heap data structure has a worst-case time complexity of O(log n) for insertion. The process of insertion in a max-heap is as follows:
the tree is a complete binary tree, i.e. all levels are full except possibly the last one.
insert the new node at the leftmost free spot in the last level.
if the node is larger than its parent, a swap is performed.
the process is repeated until the node is no larger than its parent or it has become the root.
So for your question,
you want a sequence of n numbers whose insertions take O(n) time in total. This means that each insertion takes O(1), or constant, time, so we need a sequence in which no inserted element ever has to be swapped upward. I think a sequence like the following would do that:
[10, 8, 9, 4, 5, 6, 7 ]
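for the second one, you want O(n log n) in total, which means each insertion takes O(log n), the worst case for a single insertion into a binary heap. An ascending sequence does this, because every new element is the largest so far and has to move all the way up to the root:
[ 1, 2, 3, 4, 5, 6, 7]
for each element from the 2nd onward, you compare it to its parent node and swap, repeating until it reaches the root.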
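A rough way to check both sequences is to count the swaps made while inserting. This is a sketch assuming the usual array-backed max-heap where the parent of index i is (i - 1) // 2; the names heap_insert and total_swaps are made up for the example.

```python
def heap_insert(heap, value):
    """Append `value` to the max-heap list and sift it up; return swaps made."""
    heap.append(value)
    i = len(heap) - 1
    swaps = 0
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break  # heap property already holds, nothing more to do
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent
        swaps += 1
    return swaps


def total_swaps(values):
    """Insert all values into an empty max-heap and return the total swaps."""
    heap = []
    return sum(heap_insert(heap, v) for v in values)


print(total_swaps([10, 8, 9, 4, 5, 6, 7]))  # 0
print(total_swaps([1, 2, 3, 4, 5, 6, 7]))   # 10
```

The first sequence never swaps, so the total cost is one constant-time step per element. In the second, every element climbs to the root, so the swap count per insertion equals the depth at which the element was placed.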
How is it different if I select a randomized pivot versus just selecting the first element as the pivot in an unordered set/list?
If the set is unordered, isn't selecting the first value in the set random in itself? So essentially, I am trying to understand how/if randomizing promises a better worst-case runtime.
I think you may be mixing up the concepts of arbitrary and random. It's arbitrary to pick the first element of the array - you could pick any element you'd like and it would work equally well - but it's not random. A random choice is one that can't be predicted in advance. An arbitrary choice is one that can be.
Let's imagine that you're using quicksort on the sorted sequence 1, 2, 3, 4, 5, 6, ..., n. If you choose the first element as a pivot, then you'll choose 1 as the pivot. All n - 1 other elements then go to the right and nothing goes to the left, and you'll recursively quicksort 2, 3, 4, 5, ..., n.
When you quicksort that range, you'll choose 2 as the pivot. Partitioning the elements then puts nothing on the left and the numbers 3, 4, 5, 6, ..., n on the right, so you'll recursively quicksort 3, 4, 5, 6, ..., n.
More generally, after k steps, you'll choose the number k as a pivot, put the numbers k+1, k+2, ..., n on the right, then recursively quicksort them.
The total work done here ends up being Θ(n^2), since on the first pass (to partition 2, 3, ..., n around 1) you have to look at n-1 elements, on the second pass (to partition 3, 4, 5, ..., n around 2), you have to look at n-2 elements, etc. This means that the work done is (n-1)+(n-2)+ ... +1 = n(n-1)/2 = Θ(n^2), quite inefficient!
Now, contrast this with randomized quicksort. In randomized quicksort, you truly choose a random element as your pivot at each step. This means that while you technically could choose the same pivots as in the deterministic case, it's very unlikely (the probability would be roughly 2^(2 - n), which is quite low) that this will happen and trigger the worst-case behavior. You're more likely to choose pivots closer to the center of the array, and when that happens the recursion branches more evenly and thus terminates a lot faster.
The advantage of randomized quicksort is that there's no one input that will always cause it to run in time Θ(n^2), and the runtime is expected to be O(n log n). Deterministic quicksort algorithms usually have the drawback that either (1) they run in worst-case time O(n log n), but with a high constant factor, or (2) they run in worst-case time O(n^2) and the sort of input that triggers this case is deterministic.
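The contrast can be sketched by counting pivot comparisons on an already sorted input. This is an illustrative, list-building quicksort rather than an in-place one, it assumes distinct keys, and the helper names (quicksort_comparisons, first_pivot, random_pivot) are invented for the example.

```python
import random


def quicksort_comparisons(a, choose_pivot):
    """Return how many element-vs-pivot comparisons quicksort makes on `a`."""
    if len(a) <= 1:
        return 0
    pivot = a[choose_pivot(a)]
    left = [x for x in a if x < pivot]
    right = [x for x in a if x > pivot]
    # Each non-pivot element is compared with the pivot once per partition.
    return (len(a) - 1
            + quicksort_comparisons(left, choose_pivot)
            + quicksort_comparisons(right, choose_pivot))


def first_pivot(a):
    return 0  # arbitrary but predictable choice


def random_pivot(a):
    return random.randrange(len(a))  # unpredictable choice


n = 200
sorted_input = list(range(1, n + 1))
print(quicksort_comparisons(sorted_input, first_pivot))   # 19900 = n(n-1)/2
print(quicksort_comparisons(sorted_input, random_pivot))  # varies per run
```

With the first element as pivot the count is always n(n-1)/2 on sorted input. With a random pivot the count changes from run to run and is expected to be on the order of n log n.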
In the basic deterministic quicksort, the pivot is always taken from a fixed position, such as the rightmost index of the current subarray, whereas in randomized quicksort the pivot can be any element of that subarray, chosen at random.
Using an algorithm Tree-Insert(T, v) that inserts a new value v into a binary search tree T, the following algorithm grows a binary search tree by repeatedly inserting each value in a given section of an array into the tree:
Tree-Grow(A, first, last, T)
    for i ← first to last
        do Tree-Insert(T, A[i])
If the tree is initially empty, and the length of array section (i.e., last-first+1) is n, what are the best-case and the worst-case asymptotic running time of the above algorithm, respectively?
When n = 7, give a best-case instance (as an array containing digits 1 to 7, in certain order), and a worst-case instance (in the same form) of the algorithm.
If the array is sorted and all the values are distinct, find a way to modify Tree-Grow, so that it will always build the shortest tree.
What are the best-case and the worst-case asymptotic running time of the modified algorithm, respectively?
Please tag homework questions with the homework tag. In order to do well on your final exam, I suggest you actually learn this stuff, but I'm not here to judge you.
1) It takes O(n) to iterate from first to last. It takes O(lg n) to insert into a balanced binary search tree, therefore the algorithm that you have shown takes O(n lg n) in the best case.
The worst case of inserting into a binary tree is when the tree is really long, but not very bushy; similar to a linked list. In that case, it would take O(n) to insert, therefore it would take O(n^2) in the worst case.
2) Best Case: [4, 2, 6, 1, 3, 5, 7], Worst Case: [1, 2, 3, 4, 5, 6, 7]
3) Insert the element at the middle index (n/2) first so that it becomes the root, then recursively do the same for the left half and the right half of the array.
4) O(n lg n) in the best and worst case.
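Here is one possible sketch of answers 2) to 4), assuming Tree-Insert is a plain unbalanced BST insertion. The Python names (Node, tree_insert, tree_grow, tree_grow_balanced, height) are my own and not part of the original exercise.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def tree_insert(root, key):
    """Plain (unbalanced) BST insertion; returns the root of the tree."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def tree_grow(a, first, last, root=None):
    """Direct translation of Tree-Grow: insert a[first..last] left to right."""
    for i in range(first, last + 1):
        root = tree_insert(root, a[i])
    return root


def tree_grow_balanced(a, first, last, root=None):
    """Modified Tree-Grow for sorted input: insert the middle element first."""
    if first > last:
        return root
    mid = (first + last) // 2
    root = tree_insert(root, a[mid])
    root = tree_grow_balanced(a, first, mid - 1, root)
    return tree_grow_balanced(a, mid + 1, last, root)


def height(root):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


worst = [1, 2, 3, 4, 5, 6, 7]
best = [4, 2, 6, 1, 3, 5, 7]
print(height(tree_grow(worst, 0, 6)))           # 7
print(height(tree_grow(best, 0, 6)))            # 3
print(height(tree_grow_balanced(worst, 0, 6)))  # 3
```

Because the modified version still calls the ordinary insertion from the root each time, every insertion walks a path of at most about lg n nodes, which is where the O(n lg n) bound in 4) comes from.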
I hope this helps.