Heap-sort 'Heapify' iterative procedure - algorithm

I was checking the iterative approach for the max-heapify algorithm and the following is what is given in CLRS solutions.
while i < A.heap-size do
    l = LEFT(i)
    r = RIGHT(i)
    largest = i
    if l ≤ A.heap-size and A[l] > A[i] then
        largest = l
    end if
    if r ≤ A.heap-size and A[r] > A[largest] then
        largest = r
    end if
    if largest not equal i then
        exchange A[i] and A[largest]
        i = largest
    else return A
    end if
end while
return A
My question is: why is the loop condition given as i < A.heap-size? The left and right children must lie within the heap size, so any node that has children must satisfy i <= A.heap-size/2. Why can't we use i <= A.heap-size/2 as the loop condition instead?

Yes, you are correct: it is sufficient to check up to heap-size/2. Nodes beyond that index are leaves and have no children.
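A minimal Python sketch of the iterative procedure with the tighter loop bound, under some assumptions: the list is treated as 1-based with a[0] unused, and the function name and the explicit heap_size parameter are illustrative rather than taken from CLRS. Because i <= heap_size // 2 guarantees that the left child exists, only the right child needs a bounds check.

```python
def max_heapify(a, i, heap_size):
    """Iteratively sift a[i] down. Indices are 1-based; a[0] is unused."""
    # Nodes with index > heap_size // 2 are leaves, so the loop can stop there.
    while i <= heap_size // 2:
        l, r = 2 * i, 2 * i + 1
        largest = i
        # l <= heap_size is guaranteed by the loop condition.
        if a[l] > a[largest]:
            largest = l
        if r <= heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            break  # heap property already holds at i
        a[i], a[largest] = a[largest], a[i]
        i = largest
    return a
```

For example, calling max_heapify([None, 1, 9, 8, 4, 5, 6, 7], 1, 7) sifts the 1 down along the larger children until it reaches a leaf.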

Related

How does this method, which finds the smallest factor of a given number, work?

I've recently come across a method which returns the smallest factor of a given number:
public static int findFactor(int n)
{
    int i = 1;
    int j = n - 1;
    int p = j; // invariant: p = i * j
    while (p != n && i < j)
    {
        i++;
        p += j;
        while (p > n)
        {
            j--;
            p -= i;
        }
    }
    return p == n ? i : n;
}
After examining the method, I've been able to (most likely incorrectly) determine the quantities which some of its variables respectively represent:
n = the int that is subject to factorization for
the purposes of determining its smallest factor
i = the next potential factor of n to be tested
j = the smallest integer which i can be multiplied by to yield a value >= n
The problem is I don't know what quantity p represents. The inner loop seems to treat (p+=j) - n as a potential multiple of i, but given what I believe j represents, I don't understand how that can be true for all i, or how the outer loop accounts for the "extra" iteration of the inner loop that is carried out before the latter terminates as a result of p < n.
Assuming I've correctly determined what n, i, and j represent, what quantity does p represent?
If any of my determinations are incorrect, what do each of the quantities represent?
p stands for “product”. The invariant, as stated, is p == i*j; and the algorithm tries different combinations of i and j until the product (p) equals n. If it never does (the while loop falls through), you get p != n, and hence n is returned (n is prime).
At the end of the outer while loop's body, j is the largest integer which i can be multiplied by to yield a value ≤ n.
The algorithm avoids explicit division, and tries to limit the number of j values inspected for each i. At the beginning of the outer loop, p==i*j is just less than n. As i is gradually increased, j needs to gradually shrink. In each outer loop, i is increased (and p is corrected to match the invariant). The inner loop then decreases j (and corrects p) until p is ≤ n again. Since i*j is only just less than n at the beginning of the next outer loop, increasing i makes the product greater than n again, and the process repeats.
The algorithm tries each candidate divisor i in increasing order for as long as i < n / i (continuing past that point is of no use, as the corresponding quotients have already been tried).
So the outer loop actually performs
i = 2
while (i * (n / i) != n && i < n / i)
{
    i++;
}
It does it in a clever way, by avoiding divisions. As the annotation says, the invariant p = i * j is maintained; more precisely, p is the largest multiple of i that doesn't exceed n, and this actually establishes j = n / i.
There is a little adjustment to perform when i is incremented: i becoming i + 1 makes p = i * j become (i + 1) * j = p + j, and p may become too large. This is fixed by decrementing j as many times as necessary (j--, p -= i) to compensate.
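The two answers can be checked against each other with a short Python sketch. It assumes n >= 1; the function names are illustrative, and the assertion inside the loop is an addition that verifies the invariant p == i * j together with j == n // i after each adjustment.

```python
def find_factor(n):
    """Division-free search mirroring the Java method, with the invariant asserted."""
    i, j = 1, n - 1
    p = j
    while p != n and i < j:
        i += 1
        p += j          # restores p == i * j after incrementing i
        while p > n:
            j -= 1
            p -= i      # restores p == i * j after decrementing j
        # p is now the largest multiple of i not exceeding n.
        assert p == i * j and j == n // i
    return i if p == n else n


def find_factor_with_division(n):
    """The same search written with integer division, starting at i = 2."""
    i = 2
    while i * (n // i) != n and i < n // i:
        i += 1
    return i if i * (n // i) == n else n
```

Both functions return the smallest factor greater than 1, or n itself when no such factor below n exists. The first one only ever adds and subtracts, which is the point of the original method.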

Proving that an algorithm is correct using a loop invariant

From Introduction to Algorithms by Cormen et al. (3rd ed.), I am doing exercise 2.1-3. Basically, given a vector A of length n and a value v, the algorithm outputs an index i (indices start at 1, rather than 0) such that v = A[i], or NIL if such an index does not exist.
My pseudocode is the following:
for j = 1 to length(A):
    value = A[j]
    if v == value:
        return j
return 'NIL'
How do I use a loop invariant to prove that this is correct? I'm not sure how to extend their discussion of the loop invariant on the insertion-sort algorithm to this algorithm here (known as the linear-search algorithm).
I suppose, when j = 1, you have a (sub-)vector of length 1, which has a component that is trivially either v or not v.
When you have a sub-vector of length j = k, if we assume that the algorithm works, then for j = k+1, it is trivial (I think?).
I'm clearly not understanding this method of proving that an algorithm is correct. I am very familiar with mathematical induction, but I have no idea how to pursue this problem.
The loop invariant is: A[i] != v for all 1 <= i < j
The loop invariant holds at the start of every iteration. Suppose otherwise, i.e. that there exists an i < j such that A[i] = v. Then the algorithm would have returned i before reaching the jth iteration.
The loop invariant helps prove correctness because upon termination there are two possible cases. Either (1) j <= length(A), where the if-statement implies that A[j] = v and the algorithm correctly returns j; or (2) j > length(A), where the loop invariant implies that A[i] != v for all i <= length(A), in which case the algorithm correctly returns NIL.
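The argument can be made concrete with a small Python sketch in which the invariant is checked by assertions. The assertions are for illustration only (they make each iteration linear), and returning None in place of NIL is an assumption of this sketch.

```python
def linear_search(a, v):
    """Return the 1-based index of v in a, or None (NIL) if v is absent."""
    for j in range(1, len(a) + 1):
        # Loop invariant: v does not occur in a[1..j-1] (1-based positions).
        assert all(x != v for x in a[:j - 1])
        if a[j - 1] == v:
            return j  # case (1): the if-statement confirms a[j] == v
    # Case (2): the loop ran to completion, so the invariant covers the whole array.
    assert all(x != v for x in a)
    return None
```

For example, linear_search([3, 1, 4, 1, 5], 1) stops at the first occurrence, and linear_search([3, 1, 4], 7) falls through to the final return.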

Find the value in range L to R in given array

Given an array A and two indexes L and R, find the value of
Summation(AS[i]*AS[j]*AS[k])
where L<=i<j<k<=R holds, and AS is the sorted set of all elements of A in range L to R inclusive.
Example:
Let A=(4,4,1,6,1,3) L=0 and R=3 gives AS=(1,4,6), so Ans=1*4*6=24
I don't have any approach better than O(n^3), which is very slow.
Please suggest a faster approach.
The number of elements in A is up to 10^5.
As the commenters on the question said, AS can be determined using a hash table H. Iterate through the elements of A from index L to R and insert each element into H; the result is the set of distinct elements you need. You still need to sort the set, for example by copying the elements of H into an array and sorting that array. The result is AS. This takes no more than O(N log N) steps, where N = R-L+1.
What the commentators did not say is how to compute the sum efficiently. It can be done in O(N) steps. Here is how.
We first make the following observation:
Sum(AS[j]*AS[k], a <= j < k <= b) =
1/2*(AS[a] + AS[a+1] + ... + AS[b])^2 -
1/2*(AS[a]^2 + AS[a+1]^2 + ... + AS[b]^2)
We expand our target sum as follows:
S = Sum(AS[i]*AS[j]*AS[k]) =
AS[L] * Sum(AS[j]*AS[k], L+1 <= j < k <= R) + (iteration 1)
AS[L+1] * Sum(AS[j]*AS[k], L+2 <= j < k <= R) + (iteration 2)
...
AS[R-2] * Sum(AS[j]*AS[k], R-1 <= j < k <= R). (iteration R-L-1)
We now apply the observation.
To determine the sums of the form Sum(AS[j]*AS[k], a <= j < k <= b) efficiently we can first compute
S1 = AS[L] + AS[L+1] + ... + AS[R]
S2 = AS[L]^2 + AS[L+1]^2 + ... + AS[R]^2
and then incrementally subtract the first term from each sum as we iterate through the elements of AS from index L to R-2.
Thus, determining the sum you want can be done in O(N) steps after you determine AS. Provided that you use some comparison sort method the whole algorithm should take O(|A|) + O(NlogN) + O(N) steps.
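The running-sum idea can be sketched in Python as follows. This is one possible rendering rather than a definitive implementation: it assumes integer inputs, uses a built-in set instead of an explicit hash table, indexes the sorted values from 0 instead of from L, and the function name is made up for this example.

```python
def triple_product_sum(a, left, right):
    """Sum of x*y*z over all triples x < y < z of distinct values in a[left..right]."""
    values = sorted(set(a[left:right + 1]))  # AS: distinct values in sorted order
    s1 = sum(values)                 # sum of the elements not yet visited
    s2 = sum(x * x for x in values)  # sum of their squares
    total = 0
    for x in values:
        # Drop x so that s1 and s2 cover only the elements after it.
        s1 -= x
        s2 -= x * x
        # Pair-sum identity: the sum of y*z over later pairs is (s1^2 - s2) / 2.
        # s1^2 - s2 is always even, so the integer division is exact.
        total += x * (s1 * s1 - s2) // 2
    return total
```

With the example from the question, triple_product_sum([4, 4, 1, 6, 1, 3], 0, 3) gives 24. After sorting, the loop itself is a single linear pass.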

Get the first x elements of a Heapsort

I'm preparing for a Google developer interview and working on algorithm questions. I need to figure out how to get the first x elements in an array of size n using the Heapsort algorithm. What part of the algorithm needs to be modified to get just the first x smallest elements?
This is the Heapsort algorithm from Introduction to Algorithms by Cormen et al. (page 155):
HEAPSORT(A)
{
    BUILD-MAX-HEAP(A)
    for i = A.length down to 2
        exchange A[1] with A[i]
        A.heap-size = A.heap-size - 1
        MAX-HEAPIFY(A, 1)
}
These are the component algorithms:
BUILD-MAX-HEAP(A)
    A.heap-size = A.length
    for i = floor(A.length / 2) down to 1
        MAX-HEAPIFY(A, i)

MAX-HEAPIFY(A, i)
    l = LEFT(i)
    r = RIGHT(i)
    if l <= A.heap-size and A[l] > A[i]
        largest = l
    else largest = i
    if r <= A.heap-size and A[r] > A[largest]
        largest = r
    if largest != i
        exchange A[i] with A[largest]
        MAX-HEAPIFY(A, largest)
I can't figure out what part to modify to get the x smallest elements of the sorted array. Also need to find the time complexity of the modified algorithm.
By reversing the comparisons in MAX-HEAPIFY, we can turn it into MIN-HEAPIFY and use it to build a min-heap.
The first element of this heap is then the smallest element. We remove it, move the last element of the heap to the first position, and call MIN-HEAPIFY again to restore the heap property. Repeating this process x times yields the x smallest elements in ascending order.
Time complexity: building the heap takes O(n), and the x extractions take log(n) + log(n - 1) + ... + log(n - x + 1) = O(x log n), so the total is O(n + x log n).
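As a rough illustration, the same procedure can be sketched in Python with the standard heapq module standing in for a hand-written MIN-HEAPIFY. heapq implements a binary min-heap, so heapify plays the role of the build step and each heappop performs one extraction. The function name is illustrative.

```python
import heapq


def x_smallest(a, x):
    """Return the x smallest elements of a in ascending order."""
    heap = list(a)       # copy so the caller's list is left untouched
    heapq.heapify(heap)  # build a min-heap in O(n)
    # Each pop removes the minimum and restores the heap property in O(log n).
    return [heapq.heappop(heap) for _ in range(min(x, len(heap)))]
```

For example, x_smallest([5, 2, 9, 1, 7, 3], 3) returns the three smallest values without sorting the rest of the array.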

Longest Increasing Sub sequence

Here is the pseudo code for longest increasing sub sequence given on Wikipedia
L = 0
for i = 1, 2, ... n:
    binary search for the largest positive j ≤ L
        such that X[M[j]] < X[i] (or set j = 0 if no such value exists)
    P[i] = M[j]
    if j == L or X[i] < X[M[j+1]]:
        M[j+1] = i
        L = max(L, j+1)
I have understood how the code works. The only thing I cannot understand is the necessity of this statement: if j == L or X[i] < X[M[j+1]].
I have tried running the algorithm on many examples, and what I could make out is that in all cases either j == L or X[i] < X[M[j+1]], so the if condition always evaluates to True. Could you give me an example where the if condition is false, and thus show why it is needed?
When there are duplicates, the if condition fails.
Consider X = {2, 2, 2}.
At i = 2 the search gives j = 0 while L = 1, so j != L, and X[2] < X[M[1]] is 2 < 2, which is false.
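To see the condition evaluate to false, here is a Python sketch of the pseudocode that records the positions where the if branch is skipped. It is an illustration under a few assumptions: positions in x are 0-based (unlike the 1-based pseudocode), the predecessor array P is left out because only the condition is of interest, and the function name and the skipped list are additions for this example.

```python
def lis_with_skips(x):
    """Return (LIS length, 0-based positions where the if condition was false)."""
    n = len(x)
    m = [0] * (n + 1)  # m[j]: position of the smallest tail of a length-j subsequence
    length = 0
    skipped = []
    for i in range(n):
        # Binary search for the largest j <= length with x[m[j]] < x[i].
        lo, hi = 1, length
        while lo <= hi:
            mid = (lo + hi) // 2
            if x[m[mid]] < x[i]:
                lo = mid + 1
            else:
                hi = mid - 1
        j = lo - 1  # j == 0 when no such value exists
        if j == length or x[i] < x[m[j + 1]]:
            m[j + 1] = i
            length = max(length, j + 1)
        else:
            skipped.append(i)  # x[i] equals the current tail for length j + 1
    return length, skipped
```

Running it on [2, 2, 2] reports that the second and third elements are skipped, while an input with distinct values such as [1, 3, 2, 4] never takes the else branch.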
