Insertion sort in best case - algorithm

With reference to Algorithms, Fourth Edition by Robert Sedgewick and Kevin Wayne, I am having difficulty understanding the best-case complexity of insertion sort for the code below:
public class Insertion
{
public static void sort(Comparable[] a)
{ // Sort a[] into increasing order.
int N = a.length;
for (int i = 1; i < N; i++)
{ // Insert a[i] among a[i-1], a[i-2], a[i-3]... ..
for (int j = i; j > 0 && less(a[j], a[j-1]); j--)
exch(a, j, j-1);
}
}
// See page 245 for less(), exch(), isSorted(), and main().
}
The book says that in the best case (a sorted array) the number of exchanges is 0 and the number of compares is N-1. I understand why there are 0 exchanges, but I am having a hard time seeing how the number of compares can be N-1 in the best case.

If the array is already sorted, then in the specific implementation of insertion-sort that you provide, each element will only be compared to its immediate predecessor. Since it's not less than that predecessor, the inner for-loop then aborts immediately, without requiring any further comparisons or exchanges.
Note that other implementations of insertion-sort do not necessarily have that property.
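To see the N-1 figure directly, here is a minimal sketch that instruments the same loop structure with counters. It works on int[] instead of Comparable[] to stay short, and the class name and the compares/exchanges fields are made up for illustration; they are not from the book.

```java
// Instrumented insertion sort: counts calls to less() and exch().
// CountingInsertion, compares and exchanges are illustrative names.
final class CountingInsertion {
    static int compares;
    static int exchanges;

    static void sort(int[] a) {
        compares = 0;
        exchanges = 0;
        for (int i = 1; i < a.length; i++) {
            // "j > 0" is an index check, not an element compare, so it is not counted.
            for (int j = i; j > 0 && less(a[j], a[j - 1]); j--) {
                exch(a, j, j - 1);
            }
        }
    }

    private static boolean less(int v, int w) {
        compares++;
        return v < w;
    }

    private static void exch(int[] a, int i, int j) {
        exchanges++;
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

Calling CountingInsertion.sort on an already sorted array of length 5 leaves compares at 4 and exchanges at 0: each i from 1 to N-1 does exactly one compare, which fails and ends the inner loop.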

how can number of compares be N-1 in best case?
The best case happens when you have an already sorted array. The number of comparisons is N-1 because each element from the 2nd to the last is compared exactly once, with its immediate predecessor.
This can also be observed from your given code:
for (int i = 1; i < N; i++) //int i=1 (start comparing from 2nd element)

The source code for the specific implementation is:
public class Insertion
{
public static void sort(Comparable[] a)
{ // Sort a[] into increasing order.
int N = a.length;
boolean exc = false;
for (int i = 1; i < N; i++)
{ // Insert a[i] among a[i-1], a[i-2], a[i-3]... ..
for (int j = i; j > 0 && less(a[j], a[j-1]); j--) {
exch(a, j, j-1);
exc = true;
}
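// Caution: stopping here is not correct in general. For {1, 2, 0} the loop exits at i = 1 and the array stays unsorted.
if (!exc)
break;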
}
}
// See page 245 for less(), exch(), isSorted(), and main().
}

Related

First missing Integer approach's time complexity

I want to understand the time complexity of my algorithm below, which is an accepted answer for the well-known first missing integer problem:
public int firstMissingPositive(int[] A) {
int l = A.length;
int i = 0;
while (i < l) {
int j = A[i];
while (j > 0 && j <= l) {
int k = A[j - 1];
A[j - 1] = Integer.MAX_VALUE;
j = k;
}
i++;
}
for (i = 0; i < l; i++) {
if (A[i] != Integer.MAX_VALUE)
break;
}
return i + 1;
}
Observations and findings:
Looking at the loop structure I thought that the complexity should be more than n as I may visit every element more than twice in some cases. But to my surprise, the solution got accepted. I am not able to understand the complexity.
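You are probably looking at the nested loops and thinking O(N^2), but it's not that simple.
Every iteration of the inner loop sets an item in A to Integer.MAX_VALUE. An iteration either marks an item that was not yet marked, which can happen at most N times in total, or lands on an item that is already marked, in which case j becomes Integer.MAX_VALUE and the inner loop ends, which can happen at most once per outer iteration.
The inner loop therefore runs at most 2N times in total, and the total time is O(N).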
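One way to check this empirically is to count the inner-loop iterations. The sketch below is the posted method with a counter added; the class name and the innerSteps field are made up for illustration. Each inner iteration either marks a fresh slot or hits an already marked slot and stops, so the count should never exceed 2N.

```java
// The posted algorithm with a counter on the inner loop.
// FirstMissingPositiveCount and innerSteps are illustrative names.
final class FirstMissingPositiveCount {
    static long innerSteps;

    static int firstMissingPositive(int[] A) {
        innerSteps = 0;
        int l = A.length;
        for (int i = 0; i < l; i++) {
            int j = A[i];
            // Follow the chain of values, marking slot j-1 for every value j in 1..l.
            while (j > 0 && j <= l) {
                innerSteps++;
                int k = A[j - 1];
                A[j - 1] = Integer.MAX_VALUE;
                j = k;
            }
        }
        // The first unmarked slot gives the first missing positive.
        int i = 0;
        while (i < l && A[i] == Integer.MAX_VALUE) {
            i++;
        }
        return i + 1;
    }
}
```

For the input {3, 4, -1, 1} this returns 2 after 4 inner iterations, well under the 2N = 8 ceiling.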

(with example) Why is KMP string matching O(n). Shouldn't it be O(n*m)?

Why is KMP O(n + m)?
I know this question has probably been asked a million times on here, but I haven't found a solution that convinced me or that I understood, or a question that matched my example.
/**
* KMP algorithm of pattern matching.
*/
public boolean KMP(char []text, char []pattern){
int lps[] = computeTemporaryArray(pattern);
int i=0;
int j=0;
while(i < text.length && j < pattern.length){
if(text[i] == pattern[j]){
i++;
j++;
}else{
if(j!=0){
j = lps[j-1];
}else{
i++;
}
}
}
if(j == pattern.length){
return true;
}
return false;
}
n = size of text
m = size of pattern
I know why it's + m: that's the time it takes to build the lps array used for lookups. I'm not sure why the code I posted above is O(n).
I see that "i" always moves forward EXCEPT when there is a mismatch and j != 0. In that case we can run iterations of the while loop where i doesn't move forward, so it's not obviously O(n).
Suppose the lps array is increasing, like [1,2,3,4,5,6,0]. If we fail to match at index 6, j gets updated to 5, then 4, then 3, and so on, so we effectively go through m extra iterations (assuming every comparison mismatches). This can occur at every step.
so it would look like
for (int i = 0; i < n; i++) {
for (int j = i; j >=0; j--) {
}
}
and to store all the possible (i, j) combinations, aka states, would require an n*m array, so wouldn't the runtime be O(nm)?
So is my reading of the code wrong, or is the runtime analysis of the for loop wrong, or is my example impossible?
Actually, now that I think about it, it is O(n + m). I just visualized it as two windows shifting.
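A counting argument makes the O(n) bound concrete: every pass through the while loop either advances i or moves j back through lps, and j can only move back as many times as it was advanced, which is at most once per advance of i. So the loop runs at most 2n times. The sketch below instruments the posted loop with a step counter. computeLps is a standard prefix-table construction standing in for computeTemporaryArray, whose body is not shown in the question, and the class and field names are made up for illustration.

```java
// KMP search with a counter on the main loop.
// KmpStepCount, steps and computeLps are illustrative names.
final class KmpStepCount {
    static int steps;

    // lps[i] = length of the longest proper prefix of p[0..i] that is also a suffix of it.
    static int[] computeLps(char[] p) {
        int[] lps = new int[p.length];
        int len = 0;
        int i = 1;
        while (i < p.length) {
            if (p[i] == p[len]) {
                lps[i++] = ++len;
            } else if (len != 0) {
                len = lps[len - 1];
            } else {
                lps[i++] = 0;
            }
        }
        return lps;
    }

    static boolean kmp(char[] text, char[] pattern) {
        int[] lps = computeLps(pattern);
        int i = 0;
        int j = 0;
        steps = 0;
        while (i < text.length && j < pattern.length) {
            steps++;
            if (text[i] == pattern[j]) {
                i++;               // i advances, j advances with it
                j++;
            } else if (j != 0) {
                j = lps[j - 1];    // j falls back by at least 1, i stays
            } else {
                i++;               // i advances
            }
        }
        return j == pattern.length;
    }
}
```

After any call, steps stays at or below 2 * text.length, including on repetitive inputs such as the text "aaaaaaaaaa" with the pattern "aaaab".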

Insertion sort with sentinel

I would like to know whether there is any point in adding a sentinel to this code.
public void Sort(ArrayToSort<T> array) {
for (var i = 0; i < array.Length; i++) {
for (var j = i; j > 0; j--) {
if (array.isLess(j, j - 1)) {
array.Swap(j, j - 1);
} else {
break;
}
}
}
}
If the answer is yes, how should I do it? Because if I have to copy the whole array, I'm pretty sure it's better to do without a sentinel...
thanks ;)
There is a way to create a natural sentinel in insertion sort: make a first pass through the whole array, find the smallest element, and shift it into the first position.
After that you can get rid of the index check in the inner loop. Example code for the second stage, from Sedgewick's book (Algorithms in C):
for (i = l+2; i <= r; i++)
{ int j = i; Item v = a[i];
while (less(v, a[j-1]))
{ a[j] = a[j-1]; j--; }
a[j] = v;
}
Also note that insertion sort uses element shifts rather than swaps, for efficiency.
With this method, the worst case takes about n^2/2 element comparisons, versus about n^2/2 element comparisons plus n^2/2 index comparisons in the plain version.
I believe there should be a speed gain, but not a very large one (element comparisons might be more expensive, and the number of shift operations is the same in both cases). You can profile both approaches to see the result for your specific case.
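Both stages together might look like the sketch below. It is written in Java on an int[] rather than on the ArrayToSort<T> type from the question, and the class name is made up for illustration. The first stage shifts the minimum to the front instead of copying the array, so no extra storage is needed.

```java
// Insertion sort that uses the smallest element as a natural sentinel.
// SentinelInsertion is an illustrative name.
final class SentinelInsertion {
    static void sort(int[] a) {
        int n = a.length;
        if (n < 2) {
            return;
        }
        // Stage 1: find the smallest element and shift it into a[0].
        int min = 0;
        for (int i = 1; i < n; i++) {
            if (a[i] < a[min]) {
                min = i;
            }
        }
        int smallest = a[min];
        System.arraycopy(a, 0, a, 1, min); // move a[0..min-1] one slot to the right
        a[0] = smallest;
        // Stage 2: plain insertion by shifting. No "j > 0" test is needed,
        // because a[0] is <= every other element and stops the scan.
        for (int i = 2; i < n; i++) {
            int v = a[i];
            int j = i;
            while (v < a[j - 1]) {
                a[j] = a[j - 1];
                j--;
            }
            a[j] = v;
        }
    }
}
```

The extra first pass costs n-1 comparisons, which is small next to the roughly n^2/2 index checks it removes in the worst case.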

Sort a given array whose elements range from 1 to n , in which one element is missing and one is repeated

I have to sort this array in O(n) time and O(1) space.
I know how to sort an array in O(n), but that doesn't work with missing and repeated numbers. If I find the repeated and missing numbers first (which can be done in O(n)) and then sort, that seems costly.
static void sort(int[] arr)
{
for(int i=0;i<arr.length;i++)
{
if(i>=arr.length)
break;
if(arr[i]-1 == i)
continue;
else
{
while(arr[i]-1 != i)
{
int temp = arr[arr[i]-1];
arr[arr[i]-1] = arr[i];
arr[i] = temp;
}
}
}
}
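First, you need to find the missing and repeated numbers. You do this by solving the following system of equations (presumably the sum and the sum of squares, given the description below):
a[1] + a[2] + ... + a[n] = (1 + 2 + ... + n) - m + r
a[1]^2 + a[2]^2 + ... + a[n]^2 = (1^2 + 2^2 + ... + n^2) - m^2 + r^2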
The left sums are computed simultaneously in one pass over the array. The right sums are even simpler: you can use the formulas for arithmetic progressions to avoid looping. So now you have a system of two equations with two unknowns: the missing number m and the repeated number r. Solve it.
Next, you "sort" the array by filling it with the numbers 1 to n from left to right, omitting m and duplicating r. Thus, the overall algorithm requires only two passes over the array.
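A sketch of this two-pass approach, under the assumption that the two equations compare the sum and the sum of squares of the array with those of 1..n. The class name is made up for illustration. With r - m and r^2 - m^2 known, r + m follows by division, and then r and m individually.

```java
// Two-pass "sort" for an array holding 1..n with one value missing (m)
// and one value repeated (r). MissingRepeatedFill is an illustrative name.
// Assumes the input really has that shape; long sums are safe for n up to about 2 million.
final class MissingRepeatedFill {
    static void sort(int[] a) {
        long n = a.length;
        long sum = 0;
        long sumSq = 0;
        for (int x : a) {          // pass 1: both left-hand sums at once
            sum += x;
            sumSq += (long) x * x;
        }
        long diff = sum - n * (n + 1) / 2;                    // r - m, never 0
        long diffSq = sumSq - n * (n + 1) * (2 * n + 1) / 6;  // r^2 - m^2
        long total = diffSq / diff;                           // r + m
        int r = (int) ((total + diff) / 2);
        int m = (int) ((total - diff) / 2);
        int k = 0;
        for (int v = 1; v <= n; v++) {  // pass 2: refill in sorted order
            if (v == m) {
                continue;               // omit the missing number
            }
            a[k++] = v;
            if (v == r) {
                a[k++] = v;             // write the repeated number twice
            }
        }
    }
}
```

For {3, 1, 3, 4, 5} the sums give r - m = 1 and r + m = 5, so r = 3 and m = 2, and the array is refilled as {1, 3, 3, 4, 5}.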
void sort() {
for (int i = 1; i <= N; ++i) {
while (a[i] != a[a[i]]) {
std::swap(a[i], a[a[i]]);
}
}
for (int i = 1; i <= N; ++i) {
if (a[i] == i) continue;
for (int j = a[i] - 1; j >= i; --j) a[j] = j + 1;
for (int j = a[i] + 1; j <= i; ++j) a[j] = j - 1;
break;
}
}
Explanation:
Let m denote the missing number and d the duplicated number.
Note that the while loop's condition is a[i] != a[a[i]], so the loop stops when a[i] == a[a[i]], which covers both a[i] == i and a[i] being a duplicate.
After the first for loop, every non-duplicate number i has been encountered once or twice and moved into the i-th position of the array at most once.
The first d found is moved to the d-th position, at most once.
The second d is moved around at most N-1 times and ends up in the m-th position, because every other i-th slot is occupied by the number i.
The second outer for loop locates the first i where a[i] != i. The only i that satisfies this is i = m.
The two inner for loops handle the two cases m < d and m > d, respectively.
Full implementation at http://ideone.com/VDuLka
After
int temp = arr[arr[i]-1];
add a check for duplicate in the loop:
if((temp-1) == i){ // found duplicate
...
} else {
arr[arr[i]-1] = arr[i];
arr[i] = temp;
}
See if you can figure out the rest of the code.

Interview - Find magnitude pole in an array

Magnitude pole: an element in an array whose left-hand-side elements are all less than or equal to it and whose right-hand-side elements are all greater than or equal to it.
example input
3,1,4,5,9,7,6,11
desired output
4,5,11
I was asked this question in an interview; I had to return the index of the element, and only for the first element that meets the condition.
My logic
Take two multisets (so that duplicates are handled as well), one for the right-hand side of the element (the pole) and one for the left-hand side.
Start with the 0th element and put all the remaining elements in the "right set".
Base condition: if this 0th element is less than or equal to every element in the "right set", return its index.
Otherwise put it into the "left set" and continue with the element at index 1.
Traverse the array, each time picking the maximum value from the "left set" and the minimum value from the "right set", and compare.
At any instant, for any element, all the values to its left are in the "left set" and all the values to its right are in the "right set".
Code
int magnitudePole (const vector<int> &A) {
multiset<int> left, right;
int left_max, right_min;
int size = A.size();
for (int i = 1; i < size; ++i)
right.insert(A[i]);
right_min = *(right.begin());
if(A[0] <= right_min)
return 0;
left.insert(A[0]);
for (int i = 1; i < size; ++i) {
right.erase(right.find(A[i]));
left_max = *(--left.end());
if (right.size() > 0)
right_min = *(right.begin());
if (A[i] > left_max && A[i] <= right_min)
return i;
else
left.insert(A[i]);
}
return -1;
}
My questions
I was told that my logic is incorrect, but I am not able to understand why (I have checked some cases and it returns the right index).
Out of curiosity, how can this be done in O(n) time without using any set/multiset?
For an O(n) algorithm:
Compute the largest element from n[0] to n[k] for all k in [0, length(n)) and save the answers in an array maxOnTheLeft. This costs O(n);
Compute the smallest element from n[k] to n[length(n)-1] for all k in [0, length(n)) and save the answers in an array minOnTheRight. This costs O(n);
Loop through the whole array and find any n[k] with maxOnTheLeft[k] <= n[k] <= minOnTheRight[k]. This costs O(n).
And your code is (at least) wrong here:
if (A[i] > left_max && A[i] <= right_min) // <-- should be >= and <=
Create two bool[N] arrays called NorthPole and SouthPole (just to be humorous).
Step forward through A[], tracking the maximum element found so far, and set SouthPole[i] true if A[i] >= Max(A[0..i-1]).
Step backward through A[] and set NorthPole[i] true if A[i] <= Min(A[i+1..N-1]).
Step forward through NorthPole and SouthPole to find the first element with both set true.
Each step above is O(N), as it visits each element once, so the whole thing is O(N) overall.
Java implementation:
Collection<Integer> magnitudes(int[] A) {
int length = A.length;
// what's the maximum number from the beginning of the array till the current position
int[] maxes = new int[A.length];
// what's the minimum number from the current position till the end of the array
int[] mins = new int[A.length];
// build mins
int min = mins[length - 1] = A[length - 1];
for (int i = length - 2; i >= 0; i--) {
if (A[i] < min) {
min = A[i];
}
mins[i] = min;
}
// build maxes
int max = maxes[0] = A[0];
for (int i = 1; i < length; i++) {
if (A[i] > max) {
max = A[i];
}
maxes[i] = max;
}
Collection<Integer> result = new ArrayList<>();
// use them to find the magnitudes if any exists
for (int i = 0; i < length; i++) {
if (A[i] >= maxes[i] && A[i] <= mins[i]) {
// return here if first one only is needed
result.add(A[i]);
}
}
return result;
}
Your logic seems perfectly correct (didn't check the implementation, though) and can be implemented to give an O(n) time algorithm! Nice job thinking in terms of sets.
Your right set can be implemented as a stack that supports min, and the left set as a stack that supports max, which gives an O(n) time algorithm.
Having a stack that supports max/min is a well-known interview question, and it can be done so that each operation (push/pop/min/max) is O(1).
To use this for your logic, the pseudocode would look something like this:
foreach elem in a[n-1 to 0]
right_set.push(elem)
while (right_set.has_elements()) {
candidate = right_set.pop();
if (left_set.has_elements() && left_set.max() <= candidate <= right_set.min()) {
break;
} else if (!left_set.has_elements() && candidate <= right_set.min()) {
break;
}
left_set.push(candidate);
}
return candidate
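For reference, a stack with O(1) min can be sketched as below: each entry stores its value together with the minimum of everything at or below it, so the current minimum is always on top. The class name and method names are made up for illustration; a max-stack is the same with Math.max.

```java
// Stack of ints that also reports its current minimum in O(1).
// MinStack is an illustrative name, not a library class.
final class MinStack {
    // Each entry is {value, minimum of this entry and everything below it}.
    private final java.util.ArrayDeque<int[]> items = new java.util.ArrayDeque<>();

    void push(int x) {
        int min = items.isEmpty() ? x : Math.min(x, items.peek()[1]);
        items.push(new int[] {x, min});
    }

    int pop() {
        return items.pop()[0];
    }

    // Minimum of all values currently on the stack; the stack must not be empty.
    int min() {
        return items.peek()[1];
    }

    boolean isEmpty() {
        return items.isEmpty();
    }
}
```

Note that min() has nothing to return on an empty stack, so a caller following the pseudocode needs to treat the last candidate (empty right set) as a separate case.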
I saw this problem on Codility, solved it with Perl:
sub solution {
my (@A) = @_;
my ($max, $min) = ($A[0], $A[-1]);
my %candidates;
for my $i (0..$#A) {
if ($A[$i] >= $max) {
$max = $A[$i];
$candidates{$i}++;
}
}
for my $i (reverse 0..$#A) {
if ($A[$i] <= $min) {
$min = $A[$i];
return $i if $candidates{$i};
}
}
return -1;
}
How about the following code? I think its efficiency is not good in the worst case, but its expected efficiency should be good.
int getFirstPole(int* a, int n)
{
    if(n == 0)
        return -1;
    int leftMax = a[0]; // maximum of the elements seen so far
    for(int i = 0; i < n; i++)
    {
        if(a[i] >= leftMax) // a[i] passes the left-hand test
        {
            leftMax = a[i];
            int j = i + 1;
            for(; j < n; j++)
            {
                if(a[j] < a[i])
                {
                    i = j; // jump the elements between i and j; the loop's i++ resumes at j+1
                    break;
                }
                else if (a[j] > leftMax)
                    leftMax = a[j];
            }
            if(j == n) // if no one is less than a[i] then return i
                return i;
        }
    }
    return -1; // no pole found
}
Create an array of ints called mags and an int variable called maxMag.
For each element in the source array, check whether the element is greater than or equal to maxMag.
If it is: add the element to the mags array and set maxMag = element.
If it isn't: loop through the mags array and remove all elements greater than the current element.
Result: the array of magnitude poles.
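These steps can be sketched as follows. Candidates are only added when they are at least the running maximum, so their values are non-decreasing, and "remove all greater elements" becomes popping from the end, which keeps the whole pass O(n). The sketch stores indices rather than values so it can return the first pole's index; the class and method names are made up for illustration.

```java
// Candidate-list scan for magnitude poles.
// PoleCandidates and firstPole are illustrative names.
final class PoleCandidates {
    // Returns the index of the first magnitude pole, or -1 if there is none.
    static int firstPole(int[] a) {
        int[] cand = new int[a.length]; // indices of surviving candidates
        int size = 0;
        int maxSoFar = Integer.MIN_VALUE;
        for (int i = 0; i < a.length; i++) {
            if (a[i] >= maxSoFar) {
                // Nothing to the left is larger: a[i] becomes a candidate.
                cand[size++] = i;
                maxSoFar = a[i];
            } else {
                // a[i] disqualifies every candidate with a larger value.
                while (size > 0 && a[cand[size - 1]] > a[i]) {
                    size--;
                }
            }
        }
        return size == 0 ? -1 : cand[0];
    }
}
```

On the example input 3,1,4,5,9,7,6,11 the surviving candidates are the indices of 4, 5 and 11, and the method returns 2.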
Interesting question. Here is my own solution in C#; read the comments to understand my approach.
public int MagnitudePoleFinder(int[] A)
{
//if list is empty there is no pole
if (A.Length == 0) return -1;
//if list has only one value, it is the pole: return its index
if (A.Length == 1) return 0;
//Create a variable to store Maximum Valued Item i.e. maxOfUp
int maxOfUp = A[0];
//create a collection for all candidates for magnitude pole that will be found in the iteration
var magnitudeCandidates = new List<KeyValuePair<int, int>>();
//add the first element as first candidate
var a = A[0];
magnitudeCandidates.Add(new KeyValuePair<int, int>(0, a));
//lets iterate
for (int i = 1; i < A.Length; i++)
{
a = A[i];
//if this item is maximum or equal to all above items ( maxofUp will hold max value of all the above items)
if (a >= maxOfUp)
{
//add it to candidate list
magnitudeCandidates.Add(new KeyValuePair<int, int>(i, a));
maxOfUp = a;
}
else
{
//remove all the candidates having values greater than this item
magnitudeCandidates = magnitudeCandidates.Except(magnitudeCandidates.Where(c => c.Value > a)).ToList();
}
}
//if no candidate return -1
if (magnitudeCandidates.Count == 0) return -1;
else
//return index of first candidate
return magnitudeCandidates.First().Key;
}
