Insertion sort time complexity - algorithm

The usual Θ(n²) implementation of Insertion Sort to sort an array uses linear search to identify the position where an element is to be inserted into the already sorted part of the array. If, instead, we use binary search to identify the position, the worst case running time will then
A) remain Θ(n²)
B) become Θ(n(log n)²)
C) become Θ(n log n)
D) become Θ(n)
This is my first question on Stack Overflow, so please forgive any mistakes.

First of all, the question is about Insertion Sort, not Quicksort as you display above.
The correct answer is A, remain Θ(n²). Even if you binary search for the position of the element in the already sorted part of the array, you still have to move every element greater than it one position to the right, which costs Θ(k) moves when the original array is ordered from greatest to lowest, where k is the initial index of the element being added to the sorted part. Do the math and the total running time comes out to Θ(n²).
Question answer aside: the average-case time complexity of Randomized Quicksort is O(n log n), and it can be proved if you have a mathematical background in expected values (probabilities). You can find more about it by reading the quicksort chapter in the book Introduction to Algorithms (Cormen).
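To make the shifting cost concrete, here is a minimal sketch (my own illustration, assuming a plain Python list and the standard bisect module) of insertion sort that finds the position with binary search: the search is O(log k) per element, but the slice assignment still moves up to k elements, so the worst case remains Θ(n²) on a reverse-ordered input.

from bisect import bisect_right

def binary_insertion_sort(a):
    # Binary search finds the insert position in O(log i) comparisons,
    # but every larger element still has to shift one slot to the right.
    for i in range(1, len(a)):
        x = a[i]
        pos = bisect_right(a, x, 0, i)   # O(log i) comparisons
        a[pos + 1:i + 1] = a[pos:i]      # O(i - pos) element moves
        a[pos] = x
    return a

print(binary_insertion_sort([5, 4, 3, 2, 1]))  # reverse order: maximal moves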

Related

Getting the time complexity of a selection sort

I created a code for selection sort, but my teacher asked me what the time complexity of my code is, so I need help to get it. I'm not sure if my code is the same as the other selection sorts, with a worst case time of O(n^2) and a best case time of O(n).
code:
def selection(collection):
    # Repeatedly move the largest remaining element to the end of the
    # unsorted prefix collection[0..endnum].
    for endnum in range(len(collection) - 1, 0, -1):
        print(collection)
        max_idx = endnum
        # Find the index of the largest element in the prefix; note that
        # max() and index() each rescan the prefix.
        if max(collection[0:endnum]) > collection[endnum]:
            max_idx = collection.index(max(collection[0:endnum]))
        collection[endnum], collection[max_idx] = collection[max_idx], collection[endnum]
Selection sort doesn't have a better best case. It's always O(n²), because each step needs to find the largest (or smallest) element in the unsorted portion of the array, which requires scanning the entire unsorted segment.
Your version is no different, except that you rather unnecessarily compute the maximum twice and then do a third scan to find its index (a single-scan version is sketched below). However, doing three times as much work as necessary is "just" a constant factor, so the asymptotic complexity doesn't change. The cycles you waste are real, though.
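For illustration, here is a minimal sketch (my own variant, not the asker's code) that finds the maximum and its index in a single scan per pass; the asymptotic cost is still O(n²), but the redundant max()/index() scans are gone.

def selection_single_scan(collection):
    # Same algorithm, but each pass finds the maximum and its index in one scan.
    for endnum in range(len(collection) - 1, 0, -1):
        max_idx = 0
        for i in range(1, endnum + 1):
            if collection[i] > collection[max_idx]:
                max_idx = i
        collection[endnum], collection[max_idx] = collection[max_idx], collection[endnum]
    return collection

print(selection_single_scan([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]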
Your code has the same complexity, O(n^2), as the usual selection sort; you just fill in sorted items from the end rather than from the start.
There are n-1 loops with run lengths n-1, n-2, ..., 1, so the sum of the arithmetic progression gives about n*(n-1)/2 comparisons and n exchanges.
Also note that the best case of selection sort is quadratic, not linear: selection sort gathers no information that would let it stop early, even on an already sorted input.

Best running time

What is the best running time using theta notation for:
Find an element in a sorted array
Find an element in a sorted linked list
Inserting an element in a sorted array, once the position is found
Inserting an element in a sorted linked list, once the position is found
So far I have that 2) is theta(n) and 4) is theta(1), only because I remember my prof just said the answers in class, but is there an explanation of how to get these?
First of all, reading one of your answers, it seems like you might be asking for complexity in O (big O). Theta notation is used when the complexity is bounded asymptotically both above and below; big O notation is used when the complexity is bounded asymptotically only from above.
1. Find an element in a sorted array:
Using binary search it can be done in O(log n). But in the best case it is Ω(1).
2. Find an element in a sorted linked list:
You can't use binary search here. You have to traverse the list to find a particular number; there is no way of jumping to a particular position without traversing the nodes before (or after) it. So in the worst case you traverse all n nodes (the length), which is O(n).
It is Ω(1) because in the best case you can find it right at the beginning.
3. Inserting an element in a sorted array, once the position is found:
O(n), since you have to shift all the numbers to the right of the insertion position one place over (see the sketch after this list).
Ω(1), because in the best case you might just add it at the end.
4. Inserting an element in a sorted linked list, once the position is found:
Θ(1), i.e. both O(1) and Ω(1), because adding a new element at a particular position, once you know the position and have a pointer to it, takes constant time.
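As an illustration of points 3 and 4, here is a minimal Python sketch (my own helper names, using a plain list for the array and a tiny node class for the linked list): the array insertion shifts every later element, while the linked-list insertion only rewires pointers.

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_into_sorted_array(arr, pos, value):
    # O(n): every element at index >= pos is shifted one slot to the right.
    arr.append(None)
    for i in range(len(arr) - 1, pos, -1):
        arr[i] = arr[i - 1]
    arr[pos] = value

def insert_after_node(node, value):
    # O(1): only two pointers change, no matter how long the list is.
    node.next = Node(value, node.next)

arr = [1, 3, 5, 7]
insert_into_sorted_array(arr, 2, 4)
print(arr)  # [1, 3, 4, 5, 7]

head = Node(1, Node(3, Node(5)))
insert_after_node(head.next, 4)  # inserts 4 after the node holding 3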

The complexity of bubble sort and insertion sort for a list with a given number of inversions

Let the length of a list be n, and the number of inversions be d. Why does insertion sort run in O(n+d) time and why does bubble sort not?
When I consider this problem I am thinking of the worst case scenario. Since the worst case number of inversions is n(n-1)/2, both bubble and insertion sort run in the same time. But then I don't know how to answer the question, since I find them the same. Can someone help me with this?
For bubble sort, if the last element needs to get to the first position (n-1 inversions), you need to loop over the entire array close to n times, each time moving the element only one position forward, so you end up with about n^2 steps; the running time is O(n^2) even though d is only about n.
The same setup in insertion sort will take only about n + n steps to get everything sorted, which is O(n + d). In fact d is exactly the total number of swaps insertion sort needs to do to get the array sorted.
You went wrong when you assumed the worst-case value of d is n(n-1)/2. While this is true, if you want to express the complexity in terms of d you can't replace it with its worst-case value, unless you're OK with a looser bound.
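Here is a minimal sketch (my own illustration, not from the question) that counts insertion sort's element shifts and checks the count against a brute-force inversion count, showing that the work is about n + d:

def insertion_sort_shift_count(a):
    # Returns the number of element shifts, which equals the number of inversions d.
    a = list(a)
    shifts = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]   # shift one larger element to the right
            shifts += 1
            j -= 1
        a[j + 1] = x
    return shifts

def inversions(a):
    # Brute-force O(n^2) inversion count, only used for checking.
    return sum(1 for i in range(len(a)) for j in range(i + 1, len(a)) if a[i] > a[j])

data = [2, 3, 4, 5, 1]  # the smallest element sits at the end: d = n - 1
print(insertion_sort_shift_count(data), inversions(data))  # 4 4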

O(nlogn) in-place sorting algorithm

This question was in the preparation exam for my midterm in introduction to computer science.
There exists an algorithm which can find the kth element in a list in
O(n) time, and suppose that it is in place. Using this algorithm,
write an in place sorting algorithm that runs in worst case time
O(n*log(n)), and prove that it does. Given that this algorithm exists,
why is mergesort still used?
I assume I must write some alternate form of the quicksort algorithm, which has a worst case of O(n^2), since merge sort is not an in-place algorithm. What confuses me is the given algorithm to find the kth element in a list. Isn't a simple loop iteration through the elements of an array already an O(n) algorithm?
How can the provided algorithm make any difference in the running time of the sorting algorithm if it does not change anything in the execution time? I don't see how, used with quicksort, insertion sort or selection sort, it could lower the worst case to O(n log n). Any input is appreciated!
Check the Wikipedia article on selection algorithms, namely the "Selection by sorting" section (a sketch of the idea follows below):
Similarly, given a median-selection algorithm or general selection algorithm applied to find the median, one can use it as a pivot strategy in Quicksort, obtaining a sorting algorithm. If the selection algorithm is optimal, meaning O(n), then the resulting sorting algorithm is optimal, meaning O(n log n). The median is the best pivot for sorting, as it evenly divides the data, and thus guarantees optimal sorting, assuming the selection algorithm is optimal. A sorting analog to median of medians exists, using the pivot strategy (approximate median) in Quicksort, and similarly yields an optimal Quicksort.
The short answer why mergesort is preferred over quicksort in some cases is that it is stable (while quicksort is not).
Reasons for merge sort: merge sort is stable, and it does more moves but fewer compares than quicksort. If the compare overhead is greater than the move overhead, then merge sort is faster. One situation where compare overhead may be greater is sorting an array of indices or pointers to objects, like strings.
If sorting a linked list, then merge sort using an array of pointers to the first nodes of working lists is the fastest method I'm aware of. This is how HP / Microsoft std::list::sort() is implemented. In the array of pointers, array[i] is either NULL or points to a list of length pow(2,i) (except the last pointer points to a list of unlimited length).
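To make the quoted idea concrete, here is a minimal sketch assuming you are handed an O(n) selection routine; statistics.median_low is used only as a stand-in (it is not O(n)). Picking the median as the pivot splits each range evenly, so the recurrence becomes T(n) = 2T(n/2) + O(n) = O(n log n) in the worst case.

import statistics

def select_median(a, lo, hi):
    # Stand-in for the assumed O(n) in-place selection algorithm.
    return statistics.median_low(a[lo:hi + 1])

def quicksort_with_selection(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = select_median(a, lo, hi)
    # Three-way partition around the median value, in place.
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    quicksort_with_selection(a, lo, lt - 1)
    quicksort_with_selection(a, gt + 1, hi)

data = [9, 1, 8, 2, 7, 3, 6, 4, 5]
quicksort_with_selection(data)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]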
I found the solution:
if (start > stop)                       2 op.
pivot <- partition(A, start, stop)      2 op. + n
quickSort(A, start, pivot - 1)          2 op. + T(n/2)
quickSort(A, pivot + 1, stop)           2 op. + T(n/2)
T(n) = 8 + 2T(n/2) + n                              (k = 1)
     = 8 + 2(8 + 2T(n/4) + n/2) + n
     = 24 + 4T(n/4) + 2n                            (k = 2)
...
     = (2^k - 1)*8 + 2^k*T(n/2^k) + k*n
The recursion finishes when n = 2^k, i.e. k = log2(n), with T(1) = 2:
T(n) = (2^log2(n) - 1)*8 + 2^log2(n)*2 + log2(n)*n
     = 8n - 8 + 2n + n*log2(n)
     = 10n + n*log2(n) - 8
     = n*(10 + log2(n)) - 8
which is O(n log n).
Quicksort has a worst case of O(n^2), but that only occurs if you have bad luck when choosing the pivot. If you can select the kth element in O(n), that means you can choose a good pivot (the median) by doing O(n) extra steps. That yields a worst-case O(n log n) algorithm. There are a couple of reasons why mergesort is still used. First, this selection algorithm is more or less cumbersome to implement in place, and it also adds several extra operations to the regular quicksort, so in practice it is not as fast as one might expect.
Nevertheless, mergesort does not survive only because of its worst-case time complexity; in fact HeapSort achieves the same worst-case bound, is also in place, and still didn't replace mergesort, though it also has other disadvantages against quicksort. The main reason why mergesort survives is that it is the fastest stable sorting algorithm known so far. There are several applications in which it is paramount to have a stable sorting algorithm, and that is the strength of mergesort.
A stable sort is one in which equal items preserve their original relative order. For example, this is very useful when you have two keys: you sort by the second key first and then stably by the first key, and records with equal first keys stay ordered by the second key.
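A minimal illustration (my own example, using Python's sorted, which is stable): sorting by the second key first and then stably by the first key leaves records with equal first keys ordered by the second key.

records = [("bob", 3), ("alice", 2), ("bob", 1), ("alice", 5)]

# First sort by the second key (the number)...
by_second = sorted(records, key=lambda r: r[1])
# ...then stably re-sort by the first key (the name).
by_both = sorted(by_second, key=lambda r: r[0])

print(by_both)
# [('alice', 2), ('alice', 5), ('bob', 1), ('bob', 3)]
# Equal names keep the number order because the second sort is stable.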
The problem with HeapSort compared to quicksort is that it is cache-inefficient, since it swaps/compares elements that are far apart in the array, while quicksort compares consecutive elements, which are more likely to be in the cache at the same time.

Sorting algorithm best case/worst case scenario

This is a practice exam question I'm working on. I have a general idea of what the answer is but would like some clarification.
The following is a sorting algorithm for n integers in an array. In step 1, you iterate through the array, and compare each pair of adjacent integers and swap each pair if they are in the wrong order. In step 2, you repeat step 1 as many times as necessary until there is an iteration where no swaps are made (in which case the list is sorted and you can stop).
What is the worst case complexity of this algorithm? What is the best case complexity of this algorithm?
Basically the algorithm presented here is a bubble sort.
The worst case complexity here is O(n^2).
The best case complexity is O(n).
Here is the explanation:
The best case here is an already sorted array: all you need is n comparisons (to be precise, it's n-1), so the complexity is O(n).
The worst case is a reverse-ordered array.
To better understand why it's O(n^2), consider the last element of a reverse-ordered array, which is the smallest one; to make the array sorted, that element has to reach the first index. With the algorithm explained in the question, each iteration moves it only one index toward its actual position (the first index here), and each iteration requires O(n) comparisons; hence it takes O(n^2) comparisons to move it into place.
In the best case, no swapping will be required and a single pass of the array would suffice. So the complexity is O(n).
In the worst case, the elements of the array could be in reverse order. So the first pass makes (n-1) swaps, the next one (n-2), and so on...
So it would lead to O(n^2) complexity.
As others have said, this is bubble sort. But if you are measuring complexity in terms of comparisons, you can easily be more precise than big-O.
In the best case, you need only compare n-1 pairs to verify they're all in the right order.
In the worst case, the last element is the one that should be in the first position; each pass advances that element only one more position toward the front of the list, so n-1 passes will be needed, each requiring n-1 comparisons. In all, then, roughly (n-1)^2 comparisons are needed.
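For completeness, here is a minimal sketch (my own code, not from the exam) of the described algorithm with a comparison counter, so you can check the n-1 best case and the quadratic worst case yourself:

def adaptive_bubble_sort(a):
    # Repeat full passes until a pass makes no swap; return the comparison count.
    comparisons = 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(a) - 1):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return comparisons

print(adaptive_bubble_sort([1, 2, 3, 4, 5]))  # best case: 4 comparisons (n - 1)
print(adaptive_bubble_sort([5, 4, 3, 2, 1]))  # worst case: 20 comparisons (n passes of n - 1)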
