You are given a list of 100 integers that have been read from a file. If all values are zero, what would be the running time (in terms of O-notation) of a selection sort algorithm?
I thought it was O(n), because selection sort starts with the leftmost number as the sorted side. Then it goes through the rest of the array to find the smallest number and swaps it with the first number on the sorted side. But since they are all zeros, it won't swap any numbers (or so I think).
My teacher said that it is O(n^2). Can anyone explain why?
Selection sort is not adaptive. Each element will always be compared with every other element (compare n elements with n other elements → n^2 comparisons). Thus, selection sort always makes O(n^2) comparisons. It does, however, make only O(n) swaps.
Think of a table with n rows and n columns, where each cell needs a comparison to fill its value (except the diagonal).
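To make this concrete, here is a minimal selection sort sketch in Java (my own illustration, not code from the question). The loop bounds depend only on n, never on the values, so an all-zero array triggers exactly as many comparisons as any other input; only the swap step is skipped.

    static void selectionSort(int[] a) {
        int n = a.length;
        for (int i = 0; i < n - 1; i++) {
            int min = i;
            for (int j = i + 1; j < n; j++) {
                // this comparison executes n(n-1)/2 times regardless of the data
                if (a[j] < a[min]) min = j;
            }
            // at most n - 1 swaps in total; for an all-zero array this never fires
            if (min != i) { int t = a[i]; a[i] = a[min]; a[min] = t; }
        }
    }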
So I guess it's because it just compares A[k] and A[k-1] and does the insertion in one sweep, but it's still not clear. Can someone explain this better?
Thanks
This link shows a graphical representation of sorting algorithms on different types of data sets.
As you can see, when the data is already sorted the complexity drops to O(N), that is, linear in the number of input elements.
The link provided gives a clear picture of how it's more efficient.
You answered your own question: for a nearly sorted array, insertion sort will only need a handful of O(n) passes to complete. Contrast that with a divide-and-conquer sorting algorithm like merge sort, which takes O(n lg n). For any non-trivial value of n, a divide-and-conquer algorithm will need many O(n) passes, even if the array is almost completely sorted, whereas insertion sort might only require a few.
Insertion sort is a faster, more refined sorting algorithm than selection sort. In selection sort the algorithm iterates through all of the remaining data on every pass, whether it is already sorted or not. Insertion sort works differently: instead of traversing all of the data after every pass, it only traverses the data it needs to, until the segment being sorted is sorted. Again there are two loops required by insertion sort, and therefore two main variables, here named 'i' and 'j'. Variables 'i' and 'j' begin on the same index after every pass of the outer loop; the inner loop only executes while variable 'j' is greater than index 0 AND arr[j] < arr[j - 1]. In other words, it runs while 'j' hasn't reached the left end of the data AND the value at 'j' is smaller than the value just to the left of 'j'; after each swap 'j' is decremented. As long as these two conditions are met, the inner loop keeps executing; this is what sets insertion sort apart from selection sort. Only the data that needs to be moved is moved.
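Here is a minimal sketch of the loop structure just described (my own rendering in Java; the names 'i' and 'j' follow the explanation above):

    static void insertionSort(int[] arr) {
        for (int i = 1; i < arr.length; i++) {
            int j = i;
            // run only while there is something left to fix
            while (j > 0 && arr[j] < arr[j - 1]) {
                int t = arr[j]; arr[j] = arr[j - 1]; arr[j - 1] = t;
                j--;
            }
        }
    }

On an already sorted (or all-equal) array the while-condition fails immediately for every 'i', so the total work is one comparison per element: O(n).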
The general goal of a sorting algorithm is to minimize the number of comparisons. Sorting algorithms have a lower bound and an upper bound on the number of comparisons (n log n worst case for merge and heap sorts, n log n average case for quicksort). In the most general case, you'd go with an algorithm that happens to have the best average or best worst-case number of comparisons. However, when you know something about the data (e.g., the array is already sorted, or almost sorted), you can exploit the fact that insertion sort's lower bound is far lower than that of the "n log n" sorts.
For example, if you have the array [1,2,3,4,5,6,7,9] and you need to insert 8 into it, you can either append it at the end and sort the array with a vanilla n log n sort (which will do roughly 28 comparisons to produce [1,2,3,4,5,6,7,8,9]), or you can let insertion sort place the 8 in the right position with at most about 8 comparisons.
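As a sketch of that last point (a hypothetical helper, not from the answer): append the new key and run a single insertion pass from the right, so only the elements larger than the key are ever examined.

    // a[0..hi-1] is sorted and a[hi] is free; places key so that a[0..hi] is sorted
    static void insertSorted(int[] a, int hi, int key) {
        int j = hi;
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];   // shift the larger element one slot to the right
            j--;
        }
        a[j] = key;
    }

For [1,2,3,4,5,6,7,9] and key 8, this looks at 9, shifts it, sees 7, and stops: two comparisons.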
I need to create an algorithm to solve the following problem:
Given two sorted arrays (both have n elements), I need to modify them so that each element in the first array is smaller than any element in the second array. The operations I can perform are comparing two elements and swapping two elements.
My first solution is this:
Let a be the last element of the first array and let b be the first element of the second array. If a < b then we stop; otherwise we swap them and continue with arrays that are one element smaller (erase the last element of the first array and the first element of the second).
This is obviously linear.
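Rendered as code, the idea looks something like this (my own sketch; it assumes both arrays are sorted ascending and have the same length n):

    static void separate(int[] a, int[] b) {
        int n = a.length;
        for (int i = 0; i < n; i++) {
            // largest remaining element of a vs. smallest remaining element of b
            if (a[n - 1 - i] <= b[i]) break;   // already separated: done
            int t = a[n - 1 - i];
            a[n - 1 - i] = b[i];
            b[i] = t;
        }
    }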
But what if I wanted to minimize the number of comparisons made by this algorithm? As it stands, I make a linear number of swaps and a linear number of comparisons. Could I use fewer comparisons?
I could do a double binary search, I think: meaning I search for an element a' in the first array that is bigger than some element b' in the second array but smaller than the one next to it. This has complexity O((lg n)^2). Can I do better?
Your algorithm will not run in O((lg n)^2) overall.
With binary search you can decrease the number of comparisons from O(n) to O(lg n), but the number of swap operations will still be O(n).
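A sketch of that idea (hypothetical code, not from this answer): the out-of-order pairs (a[n-1-t], b[t]) form a prefix t = 0..k-1, because a[n-1-t] only decreases while b[t] only increases, so a binary search finds k with O(lg n) comparisons; the k swaps remain.

    static void separateWithFewCompares(int[] a, int[] b) {
        int n = a.length;
        int lo = 0, hi = n;                             // k lies in [lo, hi]
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (a[n - 1 - mid] > b[mid]) lo = mid + 1;  // pair mid must be swapped
            else hi = mid;                              // pair mid is already in order
        }
        for (int t = 0; t < lo; t++) {                  // exactly k = lo swaps
            int tmp = a[n - 1 - t]; a[n - 1 - t] = b[t]; b[t] = tmp;
        }
    }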
Yes, your algorithm can be slightly improved, but the complexity stays the same: you still have to do k swaps, where 0 <= k <= n.
If anyone can give some input on my logic, I would very much appreciate it.
Which method runs faster for an array with all keys identical, selection sort or insertion sort?
I think that this would be similar to when the array is already sorted, so that insertion sort will be linear, and the selection sort quadratic.
Which method runs faster for an array in reverse order, selection sort or insertion sort?
I think that they would run similarly, since the values at every position will have to be changed. The worst case scenario for insertion sort is reverse order, so that would mean it is quadratic, and then the selection sort would already be quadratic as well.
Suppose that we use insertion sort on a randomly ordered array where elements have only one of three values. Is the running time linear, quadratic, or something in between?
Since it is randomly ordered, I think that means the insertion sort would have to perform many more operations than the number of values. If that's the case, then it's not linear. So it would likely be quadratic, or perhaps a little below quadratic.
What is the maximum number of times during the execution of Quick.sort() that the largest item can be exchanged, for an array of length N?
The largest item cannot be exchanged more times than there are positions available, since it should always be approaching its correct position. So, going from the first spot to the last, it would be exchanged N times.
About how many compares will Quick.sort() make when sorting an array of N items that are all equal?
When drawing out the quicksort, a triangle can be drawn around the compared objects at every phase that is N tall and N wide; the area of this triangle equals the number of compares performed, which would be (N^2)/2.
Here are my comments on your comments:
Which method runs faster for an array with all keys identical, selection sort or insertion sort?
I think that this would be similar to when the array is already sorted, so that insertion sort will be linear, and the selection sort quadratic.
Yes, that's correct. Insertion sort will do O(1) work per element and visit O(n) elements for a total runtime of O(n). Selection sort always runs in time Θ(n^2) regardless of the input structure, so its runtime will be quadratic.
Which method runs faster for an array in reverse order, selection sort or insertion sort?
I think that they would run similarly, since the values at every position will have to be changed. The worst case scenario for insertion sort is reverse order, so that would mean it is quadratic, and then the selection sort would already be quadratic as well.
You're right that both algorithms have quadratic runtime. The algorithms should actually have relatively comparable performance, since they'll make the same total number of comparisons.
Suppose that we use insertion sort on a randomly ordered array where elements have only one of three values. Is the running time linear, quadratic, or something in between?
Since it is randomly ordered, I think that means the insertion sort would have to perform many more operations than the number of values. If that's the case, then it's not linear. So it would likely be quadratic, or perhaps a little below quadratic.
This should take quadratic time (time Θ(n^2)). Take just the elements in the back third of the array. About a third of these n/3 elements will be 1's, and in order to insert them into the sorted sequence they'd need to be moved past roughly 2/3 of the elements ahead of them. Therefore, the work done would be on the order of (n/9)(2n/3) = 2n^2/27, which is quadratic.
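A quick experiment that backs this up (a hypothetical harness of my own; the seed and sizes are arbitrary): count insertion sort's comparisons on random arrays over {1, 2, 3} and watch the count roughly quadruple each time n doubles.

    import java.util.Random;

    public class ThreeValueDemo {
        static long compares;

        static void insertionSort(int[] a) {
            for (int i = 1; i < a.length; i++) {
                for (int j = i; j > 0; j--) {
                    compares++;                      // one compare of a[j] with a[j-1]
                    if (a[j] >= a[j - 1]) break;     // in place: stop scanning left
                    int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
                }
            }
        }

        public static void main(String[] args) {
            Random rnd = new Random(42);
            for (int n = 2000; n <= 16000; n *= 2) {
                int[] a = new int[n];
                for (int i = 0; i < n; i++) a[i] = 1 + rnd.nextInt(3);  // values in {1,2,3}
                compares = 0;
                insertionSort(a);
                System.out.println("n = " + n + ": " + compares + " compares");
            }
        }
    }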
What is the maximum number of times during the execution of Quick.sort() that the largest item can be exchanged, for an array of length N?
The largest item cannot be exchanged more times than there are positions available, since it should always be approaching its correct position. So, going from the first spot to the last, it would be exchanged N times.
There's an off-by-one error here. When the array has size 1, the largest element can't be moved any more, so the maximum number of moves would be N - 1.
About how many compares will Quick.sort() make when sorting an array of N items that are all equal?
When drawing out the quicksort, a triangle can be drawn around the compared objects at every phase that is N tall and N wide; the area of this triangle equals the number of compares performed, which would be (N^2)/2.
This really depends on the implementation of Quick.sort(). Quicksort with ternary partitioning would only do O(n) total work, because all values equal to the pivot are excluded from the recursive calls. If this isn't done, then your analysis would be correct.
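For reference, here is a standard Dijkstra-style 3-way partitioning quicksort in Java (a sketch of the ternary scheme mentioned above, not necessarily what your Quick.sort() does). Everything equal to the pivot stays in the middle and is never passed to a recursive call, so an all-equal array is finished after a single O(n) pass.

    static void quick3way(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int lt = lo, gt = hi, i = lo + 1;
        int pivot = a[lo];
        while (i <= gt) {
            if      (a[i] < pivot) swap(a, lt++, i++);
            else if (a[i] > pivot) swap(a, i, gt--);
            else                   i++;        // equal to the pivot: leave it in the middle
        }
        quick3way(a, lo, lt - 1);   // strictly smaller than the pivot
        quick3way(a, gt + 1, hi);   // strictly larger than the pivot
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }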
Hope this helps!
I am reading Algorithms in C++ by Robert Sedgewick, the chapters on sorting.
Property 1: Insertion sort and bubble sort use a linear number of comparisons and exchanges for files with at most a constant number of inversions corresponding to each element.
In another type of partially sorted file, we perhaps have appended a few elements to a sorted file or have edited a few elements in a sorted file to change their keys. Insertion sort is an efficient method for such files; bubble and selection sort are not.
Property 2: Insertion sort uses a linear number of comparisons and exchanges for files with at most a constant number of elements having more than a constant number of corresponding inversions.
My questions on above properties are
I am not able to see the difference between Property 1 and Property 2. Can anyone explain it here?
On what basis does the author say, for Property 2, that insertion sort is efficient but bubble and selection sort are not?
It would be good if this were explained with an example.
Thanks for your time and help
So, an inversion (where the sort order is <) is a pair of positions i < j with a[i] > a[j].
Property 1. Consider the sequence 2 1 4 3 6 5 8 7 10 9.... Every element is out of order with respect to its neighbor to the left or to the right, but is in order with respect to all other elements. So each element has a constant number of inversions, one, in this case. This property says that all the elements can be a little out of order.
Both bubble sort and insertion sort will run in linear time. Bubble sort will take just one pass to correct the order since it swaps neighboring elements and another pass to confirm. Insertion sort will only have to do one compare and swap per element.
Property 2. This property is stronger. In addition to being able to have all the elements a little out of order, now you can have a few that are very out of order. Consider the same sequence as before, but the smallest element and largest elements moved to opposite ends: n 2 4 3 6 5 8 7 10 9...1. Now 1 and n are out of order with respect to all other elements.
Insertion sort will still perform in linear time. As before, most of the elements require only a few compares and swaps, but there are a few that can take order-n compares and swaps. In this example, the first n-1 elements take a couple of compares and swaps each (OK, the 2 only takes one) to get into place, and the last takes n-1 compares and swaps; 2*(n-1) + 1*(n-1) is order n.
Bubble sort has a much harder time in this example. Each pass through can only move the 1 a single step backwards. Thus it will take at least (n-1) passes in which (n-1) comparisons are done before completion -- this is multiplicative (n-1)*(n-1) is order n^2. (You could also run bubble sort in the opposite direction, in which case the largest element at the beginning would slowly move to the other end instead.)
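A small experiment illustrating the difference (a hypothetical harness of my own; it builds an approximation of the sequence n 2 4 3 6 5 ... 1 described above and counts comparisons):

    public class PartiallySortedDemo {
        static long insertionCompares(int[] a) {
            long c = 0;
            for (int i = 1; i < a.length; i++) {
                for (int j = i; j > 0; j--) {
                    c++;                            // compare a[j] with a[j-1]
                    if (a[j] >= a[j - 1]) break;
                    int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
                }
            }
            return c;
        }

        static long bubbleCompares(int[] a) {
            long c = 0;
            boolean swapped = true;
            while (swapped) {                       // ends with one confirming pass
                swapped = false;
                for (int j = 0; j + 1 < a.length; j++) {
                    c++;
                    if (a[j] > a[j + 1]) {
                        int t = a[j]; a[j] = a[j + 1]; a[j + 1] = t;
                        swapped = true;
                    }
                }
            }
            return c;
        }

        public static void main(String[] args) {
            int n = 2000;
            int[] a = new int[n];
            a[0] = n;                                        // largest value up front
            a[n - 1] = 1;                                    // smallest value at the back
            for (int i = 1; i <= n - 2; i++) a[i] = i + 1;   // middle: 2, 3, ..., n-1
            for (int i = 2; i + 1 <= n - 2; i += 2) {        // pairwise swaps: 2, 4, 3, 6, 5, ...
                int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
            }
            System.out.println("insertion: " + insertionCompares(a.clone()) + " compares");
            System.out.println("bubble:    " + bubbleCompares(a.clone()) + " compares");
        }
    }

Insertion sort should report a count linear in n, while bubble sort's count should be near n^2, since the trailing 1 moves only one position per pass.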
This is a homework question, and I'm not that good at finding the complexity, but I'm trying my best!
Three-way partitioning is a modification of quicksort that partitions elements into groups smaller than, equal to, and larger than the pivot. Only the groups of smaller and larger elements need to be recursively sorted. Show that if there are N items but only k unique values (in other words there are many duplicates), then the running time of this modification to quicksort is O(Nk).
my try:
on the average case:
the three subranges will be at these indices:
I assume that the subrange holding the duplicated items has size (n-k)
first: from 0 to (i-1)
second: from i to (i+(n-k-1))
third: from (i+n-k) to (n-1)
number of comparisons = (n-k) - 1
So,
T(n) = (n-k) - 1 + Σ (from i = 0 to n-k-1) [T(i) + T(i-k)]
then I'm not sure how I'm going to continue :S
It might be a very bad start though :$
Hope to find some help
First of all, you shouldn't look at the average case since the upper bound of O(nk) can be proved for the worst case, which is a stronger statement.
You should look at the maximum possible depth of recursion. In normal quicksort, the maximum depth is n. For each level, the total number of operations done is O(n), which gives O(n^2) total in the worst case.
Here, it's not hard to prove that the maximum possible depth is k (since one unique value will be removed at each level), which leads to O(nk) total.
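A quick sanity check of the depth argument (a hypothetical harness; it instruments the usual 3-way partition with a depth counter):

    import java.util.Random;

    public class DepthCheck {
        static int maxDepth = 0;

        static void quick3(int[] a, int lo, int hi, int depth) {
            if (lo >= hi) return;
            maxDepth = Math.max(maxDepth, depth);   // count only levels that really partition
            int lt = lo, gt = hi, i = lo + 1, pivot = a[lo];
            while (i <= gt) {
                if      (a[i] < pivot) { int t = a[lt]; a[lt] = a[i]; a[i] = t; lt++; i++; }
                else if (a[i] > pivot) { int t = a[i]; a[i] = a[gt]; a[gt] = t; gt--; }
                else                   i++;
            }
            quick3(a, lo, lt - 1, depth + 1);       // the pivot's value is gone from both calls
            quick3(a, gt + 1, hi, depth + 1);
        }

        public static void main(String[] args) {
            Random rnd = new Random(7);
            int n = 200000, k = 12;
            int[] a = new int[n];
            for (int i = 0; i < n; i++) a[i] = rnd.nextInt(k);   // only k distinct values
            quick3(a, 0, n - 1, 1);
            System.out.println("max partition depth = " + maxDepth + ", k = " + k);
        }
    }

The printed depth never exceeds k, which is exactly the bound the O(nk) argument relies on.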
I don't have a formal education in complexity theory, but if you treat it as a mathematical problem, you can argue it like a mathematical proof.
For any sorting algorithm, the best case still takes at least linear time, because to sort n elements you have to consider each one at least once. Now, for your particular optimisation of quicksort, what you have done is simplify the issue: you are effectively only sorting the unique values. All the values equal to the pivot are already considered sorted, and, by its nature, quicksort guarantees that every unique value will feature as the pivot at some point, so this eliminates the duplicates.
This means that for an N-element list, quicksort must perform some operation N times (once for every position in the list), and because it is trying to sort the list, that operation is finding the position of the value relative to the others. But since you are effectively dealing with just the unique values, and there are k of those, the algorithm needs at most k comparisons per element. So it performs about Nk operations for an N-element list with k unique values.
To summarise:
This algorithm eliminates checking against duplicate values.
But all sorting algorithms must look at every value in the list at least once: N operations.
For every value in the list, the operation is to find its position relative to the other values in the list.
Because duplicates are skipped, this leaves only k values to check against: k comparisons per element.
Total: O(Nk).