How many times is the function called? - algorithm

Algorithm(a - array, n - length):
    for (i = 2; i <= n; i++)          // first pass: move the largest element to a[1]
        if (a[1] < a[i]) Swap(a, 1, i);
    for (i = n-1; i >= 2; i--)        // second pass: move the largest of a[2..n] to a[n]
        if (a[n] < a[i]) Swap(a, n, i);
I'm interested in determining how many times Swap is called in the code above in the worst case, so I have some questions.
What's the worst case there?
If I had only the first for loop, it could be said that the worst case for this algorithm is that the array a is already sorted in ascending order, and Swap would be called n-1 times.
If I had only the second loop, the worst case would also be that a is already sorted, but this time in descending order. That means that if we take the first worst case, Swap wouldn't be called in the second loop at all, and vice versa; i.e., it can't be called in both loops on every iteration.
What should I do now? How to combine those two worst cases that are opposite to each other?
Worst case means that I want to have as many Swap calls as possible. : )
P.S. I see that the complexity is O(n), but I need to estimate as precisely as possible how many times Swap is executed.
EDIT 1: Swap(a,i,j) swaps the elements a[i] and a[j].

Let s and r be the positions (1-based, as in the algorithm) of the largest and second-largest elements in the original array. At the end of the first loop:
the largest element is at position 1;
if r < s, the second largest is swapped into a[1] at iteration i = r and swapped back out at iteration i = s, so it ends up at position s; if r > s, it stays at position r.
At the end of the second loop, the second-largest element is at position n.
For the first loop, the worst case for a fixed s is when all elements up to position s are in ascending order; then every iteration i = 2, ..., s swaps, giving s - 1 swaps.
For the second loop, the worst case occurs when all elements after the largest were in descending order in the original array (the first loop leaves them untouched) and all smaller than the element just before the largest; the loop then swaps at each tail position and once more at position s, giving n - s swaps.
Total = (s - 1) + (n - s) = n - 1 swaps in the worst case, independent of r and s.
E.g. A = [1 2 5 7 4 3]: ascending up to the maximum element 7 and descending after it.
Number of swaps = 5 = n - 1.
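
To sanity-check the count, here is a direct Python transcription of the algorithm (a sketch; the helper name swap_count is mine, and Python's 0-based indexing stands in for the pseudocode's 1-based indexing):

def swap_count(a):
    # Run both passes and return how many times Swap is called.
    a = list(a)                      # work on a copy
    n = len(a)
    swaps = 0
    for i in range(1, n):            # pseudocode i = 2..n
        if a[0] < a[i]:
            a[0], a[i] = a[i], a[0]
            swaps += 1
    for i in range(n - 2, 0, -1):    # pseudocode i = n-1..2
        if a[n - 1] < a[i]:
            a[n - 1], a[i] = a[i], a[n - 1]
            swaps += 1
    return swaps

print(swap_count([1, 2, 5, 7, 4, 3]))   # prints 5, i.e. n - 1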

The worst case for the first loop is that a_i is smaller than a_j for all 1 ≤ i < j ≤ n, i.e. the array is in ascending order. In that case, every a_j is swapped into a_1, so that at the end a_1 holds the largest number. This swapping can happen at most n-1 times, e.g.:
[1,2,3,4,5] ⟶ [5,1,2,3,4]
Similarly, the worst case for the second loop is that a_i is greater than a_j for all 2 ≤ i < j ≤ n, i.e. the tail a_2, …, a_n is in descending order. In that case, every a_i is swapped with a_n, so that at the end a_n holds the largest number of the sub-array a_2, …, a_n. This swapping can happen at most n-2 times, e.g.:
[x,4,3,2,1] ⟶ [x,3,2,1,4]
Now the tricky part is to combine both conditions, as the conditions for a Swap call in the two loops are mutually exclusive: for a pair a_i, a_j with 1 ≤ i < j ≤ n, the first loop may call Swap only when a_i < a_j, but for any such pair the second loop won't call Swap, as it expects the opposite: a_i > a_j.
So the maximum number of Swap calls is n-1.
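If you want to convince yourself empirically, a small brute-force check over all permutations (a sketch, feasible only for tiny n; it reuses the hypothetical swap_count helper shown above) agrees with the n-1 bound:

from itertools import permutations

for n in range(2, 8):
    worst = max(swap_count(p) for p in permutations(range(n)))
    print(n, worst)          # worst comes out as n - 1 for every n tested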

Related

Searching for time complexity of my recursive algorithm

So I have an assignment from my school, and I only wondered what time complexity my algorithm has (not needed for the answer per se; the algorithm just needs to run in O(n) in the worst case).
The question is: Given n (n ≥ 3) distinct elements, design a divide and conquer algorithm to compute the first three smallest elements. Your algorithm should return a triple (x, y, z) such that x < y < z < the rest n − 3 input elements, and run in linear time in the worst case.
And my solution is as follows:
Solution: Since n must be ≥ 3, the given array cannot have fewer than 3 elements.
If n == 3, simply compare the elements to each other (a maximum of 3 comparisons is needed):
Compare the first 2 elements to each other and label them "smallest element" and "2nd smallest element"; we'll call them "x" (smallest) and "y" (2nd smallest) from here on out. Then compare "x" (smallest) to the 3rd element.
If x > 3rd element, we know that the 3rd element is the smallest in this array, therefore: "y" becomes the new "z", "x" becomes the new "y", and the 3rd element becomes the new "x". You're now left with the desired output; return the triple (x < y < z). (2 comparisons used)
Else if x < 3rd element, make the 3rd element "z", and compare "y" with the 3rd element (now "z").
If y < z, do nothing; you already have a triple with the desired outcome (x < y < z).
If y > z, swap the elements. You now have a triple with the desired outcome (x < y < z). (3 comparisons used)
Now that the first case (n == 3) has been handled, let’s handle what should happen when n > 3.
If n > 3:
1. Split the array recursively into parts of n/2 (if n is an odd number, split the array into parts of n/2 + 1 and n/2).
2. Keep doing step 1 until the split arrays have a size of 2 or 3.
3. Compare the elements in each sub-array to each other, and assign the lowest and 2nd lowest values to them ("x" and "y").
4. When merging two sub-arrays (a1 and a2), compare the lowest element of one sub-array to the other sub-array's lowest value.
5. If a1(lowest) < a2(lowest), make a1(lowest) "x", a2(lowest) the new "y", and a2(2nd lowest) the new "z"; then compare a1(2nd lowest) with the largest value of the current triple ("z" in this case).
6. If a1(2nd lowest) > "z", do nothing ("x", "y" and "z" are already the three lowest values from this pair of sub-arrays).
7. Else if a1(2nd lowest) < "z", swap the elements, making a1(2nd lowest) the new "z"; then compare the new "z" with "y". If "z" < "y", swap the elements.
8. No more comparisons are needed, as "x" is the lowest element from a1, which means the three elements "x", "y" and "z" are now the lowest possible from this subsection of the array.
9. Repeat from step 4 until reaching the highest layer of the call, and you now have a triple that satisfies the condition (x < y < z).
Sorry about the wall of text (and pseudocode). I've read about time complexity, and I understand the simple ones (like a statement has constant time, a for-loop has time depending on how long it is, etc.) but I have a hard time understanding the time complexity for my own algorithm. Thanks in advance.
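
For what it's worth, here is a compact Python sketch of the same divide-and-conquer idea (a simplification of the steps above, not your exact pseudocode; the function name three_smallest is mine). Each half returns its three smallest elements and the merge picks the three smallest of the at most six candidates in constant time, so the recurrence is T(n) = 2T(n/2) + O(1), which solves to O(n):

def three_smallest(a, lo=0, hi=None):
    # Return up to three smallest elements of a[lo:hi], in ascending order.
    if hi is None:
        hi = len(a)
    if hi - lo <= 4:                    # base case: sort a tiny slice directly
        return sorted(a[lo:hi])[:3]
    mid = (lo + hi) // 2                # divide
    left = three_smallest(a, lo, mid)   # conquer: three smallest of each half
    right = three_smallest(a, mid, hi)
    return sorted(left + right)[:3]     # combine: <= 6 candidates, O(1) work

print(three_smallest([9, 4, 7, 1, 8, 2, 6, 3]))   # [1, 2, 3]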

Big O of BubbleSort on a simple list of 5 values

I believe that a BubbleSort is of the order O(n^2). As I read in previous postings, this has to do with nested iteration. But when I dry-run a simple unsorted list (see below), I have the list sorted in 10 comparisons.
In my example, here is my list of integer values:
5 4 3 2 1
To get 5 into position, I did n-1 swap operations. (4)
To get 4 into position, I did n-2 swap operations. (3)
To get 3 into position, I did n-3 swap operations. (2)
To get 2 into position, I did n-4 swap operations. (1)
I can't see where (n^2) comes from, as when I have a list of n=5 items, I only need 10 swap operations.
BTW, I've seen (n-1)·(n-1), which doesn't make sense to me, as this would give 16 swap operations.
I'm only concerned with basic BubbleSort...a simple nested FOR loop, in the interest of simplicity and clarity.
You don't seem to understand the concept of big O notation very
well. It refers to how the number of operations or the time grows in
relation to the size of the input, asymptotically, considering only the
fastest-growing term, and without considering the constant of
proportionality.
A single measurement like your 5:10 result is completely meaningless.
Imagine looking for a function that maps 5 to 10. Is it 2N? N + 5?
4N − 10? 0.4N²? N² − 15? 4·log₅(N) + 6? The
possibilities are limitless.
Instead, you have to analyze the algorithm to see how the number of
operations grows as N does, or measure the operations or time over many
runs, using various values of N and the most general datasets you can
devise. Note that your test case is not general at all: when checking
the average performance of a sorting algorithm, you want the input to be
in random order (the most likely case), not sorted or reverse-sorted.
If you want to be precise, there are n·(n-1)/2 operations, because you are actually computing (n-1) + (n-2) + ... + 1: the first element needs n-1 swaps, the second element needs n-2 swaps, and so on. So the algorithm is O(n²/2 - n/2), which in asymptotic notation is equal to O(n²).
But what actually happens in bubble sort is different. In bubble sort you perform a pass over the array and swap misplaced neighbours, until there is no misplacement, which means the array has become sorted. As each pass over the array takes O(n) time, and in the worst case you have to perform n passes, the algorithm is O(n²). Note that we are counting the number of comparisons, not the number of swaps.
There are two versions of bubble sort mentioned on Wikipedia:
procedure bubbleSort( A : list of sortable items )
    n = length(A)
    repeat
        swapped = false
        for i = 1 to n-1 inclusive do
            /* if this pair is out of order */
            if A[i-1] > A[i] then
                /* swap them and remember something changed */
                swap( A[i-1], A[i] )
                swapped = true
            end if
        end for
    until not swapped
end procedure
This version performs about (n-1)·(n-1) comparisons -> O(n²)
Optimizing bubble sort
The bubble sort algorithm can be easily
optimized by observing that the n-th pass finds the n-th largest
element and puts it into its final place. So, the inner loop can avoid
looking at the last n-1 items when running for the n-th time:
procedure bubbleSort( A : list of sortable items )
    n = length(A)
    repeat
        swapped = false
        for i = 1 to n-1 inclusive do
            if A[i-1] > A[i] then
                swap(A[i-1], A[i])
                swapped = true
            end if
        end for
        n = n - 1
    until not swapped
end procedure
This version performs (n-1) + (n-2) + (n-3) + ... + 1 operations, which is n·(n-1)/2 comparisons -> O(n²)
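
A quick empirical check (a sketch; the comparison counter is my addition, not part of the quoted pseudocode) shows the optimized version performing exactly 4 + 3 + 2 + 1 = 10 comparisons on the reverse-sorted 5-element list from the question:

def bubble_sort_comparisons(a):
    # Optimized bubble sort; returns the number of comparisons made.
    a = list(a)
    n = len(a)
    comparisons = 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(1, n):
            comparisons += 1
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
        n -= 1          # the largest remaining element is now in place
    return comparisons

print(bubble_sort_comparisons([5, 4, 3, 2, 1]))   # 10 = 5*4/2 = n(n-1)/2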

Correctness of greedy algorithm

In a non-decreasing sequence of (positive) integers, two elements a_i, a_j (i < j) can be removed when 2·a_i ≤ a_j. How many pairs can be removed at most from this sequence?
So I have thought of the following solution:
I take the given sequence and divide it into two halves (first and second).
Assign an iterator to each of them, it_first := 0 and it_second := 0 respectively, and set count := 0.
while it_second != second.length
    if 2 * first[it_first] <= second[it_second]
        count++, it_first++, it_second++
    else
        it_second++
count is the answer
Example:
count := 0
[1,5,8,10,12,13,15,24] --> first := [1,5,8,10], second := [12,13,15,24]
2 * 1 <= 12? --> true, count++, it_first++ and it_second++
2 * 5 <= 13? --> true, count++, it_first++ and it_second++
2 * 8 <= 15? --> false, it_second++
2 * 8 <= 24? --> true, count++; it_second reaches past the last element - END.
count == 3
Linear complexity (the worst case is when no elements can be removed: n/2 elements are compared with n/2 elements).
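A runnable version of the two-pointer idea (a sketch following the pseudocode above; the function name is mine, and I added a bounds check on it_first that the pseudocode leaves implicit):

def max_removable_pairs(a):
    # a is sorted non-decreasing; greedily pair elements of the first
    # half with elements of the second half under the rule 2*x <= y.
    n = len(a)
    first, second = a[:n // 2], a[n // 2:]
    it_first = it_second = count = 0
    while it_second < len(second) and it_first < len(first):
        if 2 * first[it_first] <= second[it_second]:
            count += 1
            it_first += 1
        it_second += 1
    return count

print(max_removable_pairs([1, 5, 8, 10, 12, 13, 15, 24]))   # 3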
So my missing part is the 'correctness' of the algorithm - I've read about proofs of greedy algorithms, but mostly involving trees, and I cannot find the analogy. Any help would be appreciated. Thanks!
EDIT:
By correctness I mean:
* It works
* It cannot be done faster (in log n or constant time)
I would like to put some graphics here, but due to reputation points < 10, I can't.
(I meant a LaTeX formula at the beginning ;))
Correctness:
Let's assume that the maximum number of pairs that can be removed is k. Claim: there is an optimal solution in which the first elements of all pairs are the k smallest elements of the array.
Proof: I will show that it is possible to transform any solution into one that uses the k smallest elements as the first elements of all pairs.
Let's assume that we have two pairs (a, b), (c, d) such that a <= b <= c <= d, 2 * a <= b and 2 * c <= d. In this case, the pairs (a, c) and (b, d) are valid, too: 2 * a <= b <= c, and 2 * b <= 2 * c <= d. Thus, we can always rearrange our pairs in such a way that the first element of any pair is not greater than the second element of any pair.
Once we have this property, we can simply substitute the smallest element among all first elements of all pairs with the smallest element in the array, the second smallest among all first elements with the second smallest element in the array, and so on, without invalidating any pair.
Now we know that there is an optimal solution that contains the k smallest elements as first elements. It is clear that we cannot make the answer worse by pairing each of them with the smallest unused element that fits it (taking a bigger one can only reduce the options for the next elements). Thus, this solution is correct.
A note about the case when the length of the array is odd: it doesn't matter which half the middle element goes to. In the first half it is useless (there are not enough elements in the second half to pair it with). If we put it in the second half, it is useless too (assume that we took it; that means there is "free space" somewhere in the second half, so we can shift some elements by one and get rid of it).
Optimality in terms of time complexity: the time complexity of this solution is O(n). We cannot find the answer without reading the entire input in the worst case, and reading already takes O(n) time. Thus, this algorithm is optimal.
Presuming your method. Indices are 0-based.
Denote in general:
end_1 = floor(N/2), the boundary (inclusive) of the first part.
Denote while iterating:
i the index in the first part, j the index in the second part,
sol(i,j) the optimal solution up to this point (computed by the algorithm running from the front),
rem(i,j) the pairs that remain to be paired up optimally behind the point (i,j), i.e. from (i+1,j+1) onward (can be computed by running the algorithm from the back),
so the final optimal solution can be expressed, at any point, as sol(i,j) + rem(i,j).
Observation #1: when running the algorithm from the front, all points in the [0, i] range are used, while some points from the [end_1+1, j] range are not used (we skip an a(j) that is not large enough). When running the algorithm from the back, some points in [i+1, end_1] are not used, and all points in [j+1, N] are used (we skip an a(i) that is not small enough).
Observation #2: rem(i,j) >= rem(i,j+1), because rem(i,j) = rem(i,j+1) + M, where M can be 0 or 1 depending on whether we can pair up a(j) with some unused element from the [i+1, end_1] range.
Argument (by contradiction): let's assume 2*a(i) <= a(j) and that not pairing up a(i) and a(j) gives an at least as good final solution. By the algorithm, we would next try to pair up a(i) and a(j+1). Since:
rem(i,j) >= rem(i,j+1) (see above),
sol(i,j+1) = sol(i,j) (since we didn't pair up a(i) and a(j))
we get that sol(i,j) + rem(i,j) >= sol(i,j+1) + rem(i,j+1) which contradicts the assumption.

How to find number of expected swaps in bubble sort in better than O(n^2) time

I am stuck on problem http://www.codechef.com/JULY12/problems/LEBOBBLE
Here it is required to find number of expected swaps.
I tried an O(n^2) solution but it is timing out.
The code is like:
swaps = 0
for (i = 0; i < n-1; i++)
    for (j = i+1; j < n; j++)
    {
        swaps += expected swap of A[i] and A[j]
    }
Since the probabilities of the elements vary, every pair needs to be compared. So, as far as I can tell, the snippet above should already be the most efficient approach, but it is timing out.
Can it be done in O(n log n) or any complexity better than O(n^2)?
Give me any hint if possible.
Alright, let's think about this.
We realize that every number needs to be swapped, sooner or later, with every number after it that's less than it. Thus, the total number of swaps for a given number is the count of numbers after it which are less than it. However, computing this directly still takes O(n^2) time.
For every pass of the outer loop of bubble sort, one element gets put in the correct position. Without loss of generality, we'll say that for every pass, the largest element remaining gets sorted to the end of the list.
So, in the first pass of the outer loop, the largest number is put at the end. This takes q swaps, where q is the number of positions the number started away from the final position.
Thus, we can say that it will take q_1 + q_2 + ... + q_n swaps to complete this bubble sort. However, keep in mind that each swap moves a number either one position closer to or one position farther from its final position. In our specific case, if a number is in front of a larger number, and at or in front of its correct position, one more swap will be required. However, if a number is behind a larger number and behind its correct position, one less swap will be required.
We can see that this is true with the following example:
5 3 1 2 4
=> 3 5 1 2 4
=> 3 1 5 2 4
=> 3 1 2 5 4
=> 3 1 2 4 5
=> 1 3 2 4 5
=> 1 2 3 4 5 (6 swaps total)
"5" moves 4 spaces. "3" moves 1 space. "1" moves 2 spaces. "2" moves 2 spaces. "4" moves 1 space. Total: 10 spaces.
Note that 3 is behind 5 and in front of its correct position, thus one more swap will be needed. 1 and 2 are behind 3 and 5, so four fewer swaps will be needed. 4 is behind 5 and behind its correct position, thus one less swap will be needed. We can now see that the expected value (10 + 1 - 4 - 1 = 6) matches the actual value.
We can compute Σq by sorting the list first, keeping the original position of each element in memory while doing the sort. This is possible in O(n log n + n) time.
We can also see what numbers are behind what other numbers, but this is impossible to do in faster than O(n^2) time. However, we can get a faster solution.
Every swap moves two numbers, each by one position. Ideally both numbers get closer to their correct positions, but some swaps do only half the work: one number gets closer and the other gets farther. The first swap in our previous example, between "3" and "5", is the only such swap in our example.
We have to calculate the total number of said swaps. This is left as an exercise to the reader, but here's one last hint: you only have to loop through the first half of the list. Though this is still, in the end, O(n^2), we only have to do the O(n^2) operations on the first half of the list, making it much faster overall.
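For the Σq part, here is a short sketch (my own illustration of the hint above, not the full solution) that computes the total displacement in O(n log n) by sorting while remembering original positions:

def total_displacement(a):
    # Sum of |original index - final index| over all elements (the q's).
    # Ties are broken by original index, matching a stable bubble sort.
    order = sorted(range(len(a)), key=lambda i: (a[i], i))
    return sum(abs(orig - final) for final, orig in enumerate(order))

print(total_displacement([5, 3, 1, 2, 4]))   # 10, as in the example above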
Use divide and conquer
divide: split the sequence of size n into two lists of size n/2
conquer: count recursively in the two lists
combine: this is the tricky part (doing it in linear time)
In the combine step, use merge-and-count. Suppose the two lists are A and B. They are already sorted. Produce an output list L from A and B while also counting the number of inversions: pairs (a,b) where a is in A, b is in B and a > b.
The idea is similar to "merge" in merge sort. Merge the two sorted lists into one output list, but also count the inversions along the way.
Every time a_i is appended to the output, no new inversions are encountered, since a_i is smaller than everything left in list B. If b_j is appended to the output, then it is smaller than all the remaining items in A, so we increase the count of inversions by the number of elements remaining in A.
merge-and-count(A, B)
    ; A, B: two sorted input lists
    ; C: output list
    ; i, j: current pointers into each list, starting at the beginning
    ; a_i, b_j: elements pointed to by i, j
    ; count: number of inversions, initially 0
    while A, B != empty
        append min(a_i, b_j) to C
        if b_j < a_i
            count += number of elements remaining in A
            j++
        else
            i++
    ; now one list is empty
    append the remainder of the nonempty list to C
    return count, C
With merge-and-count, we can design the count-inversions algorithm as follows:
sort-and-count(L)
    if L has one element
        return 0, L
    else
        divide L into A, B
        (rA, A) = sort-and-count(A)
        (rB, B) = sort-and-count(B)
        (r, L) = merge-and-count(A, B)
        return rA + rB + r, L
T(n) = O(n lg n)
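A direct Python transcription of the pseudocode above (a sketch; for the CodeChef task you would still need to weight each pair by its swap probability, but this shows the O(n log n) counting structure):

def sort_and_count(xs):
    # Returns (inversion count, sorted list) in O(n log n).
    if len(xs) <= 1:
        return 0, xs
    mid = len(xs) // 2
    ra, a = sort_and_count(xs[:mid])
    rb, b = sort_and_count(xs[mid:])
    merged, count, i, j = [], 0, 0, 0
    while i < len(a) and j < len(b):       # merge-and-count
        if b[j] < a[i]:
            count += len(a) - i            # b[j] beats all remaining in A
            merged.append(b[j]); j += 1
        else:
            merged.append(a[i]); i += 1
    merged += a[i:] + b[j:]                # one of these is empty
    return ra + rb + count, merged

print(sort_and_count([5, 3, 1, 2, 4]))     # (6, [1, 2, 3, 4, 5])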

What is the complexity of this specialized sort

I would like to know the complexity (as in O(...) ) of the following sorting algorithm:
There are B barrels
that contain a total of N elements, spread unevenly across the barrels.
The elements in each barrel are already sorted.
The sort combines all the elements from each barrel in a single sorted list:
use an array of size B to store the index of the last sorted element of each barrel (starting at 0)
check each barrel (at its stored index) and find the smallest element
copy that element into the final sorted array, and increment the output counter
increment the stored index for the barrel we picked from
perform those steps N times
or in pseudo code:
for i from 0 to N-1
    smallest = MAX_ELEMENT
    foreach b in B
        if bIndex[b] < b.length && b[bIndex[b]] < smallest
            smallest_barrel = b
            smallest = b[bIndex[b]]
    result[i] = smallest
    bIndex[smallest_barrel] += 1
I thought the complexity would be O(N), but my problem with pinning it down is that if B grows, it has a larger impact than if N grows, because it adds another round to the B loop. Or maybe that has no effect on the complexity?
Since you already have pseudo-code, finding the complexity is easy.
You have 2 nested loops. The outer one always runs N times; the inner one always runs |B| times, where |B| is the size of the set B (the number of barrels). Therefore the complexity is N·|B|, which is O(NB).
You are correct that the number of barrels changes the complexity. Just look at your pseudocode: for every element you are about to pick, you have to search the array of candidates, the length of which is B. So you are performing an operation of complexity O(B) N times.
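For reference, a runnable transcription of the barrel merge (a sketch; barrels are modeled as Python lists, and the name barrel_merge is mine). The two nested loops make the O(N·B) cost visible:

def barrel_merge(barrels):
    # Repeatedly scan all B barrels for the smallest unconsumed element.
    # Total work: N picks, each scanning B barrels -> O(N*B).
    n = sum(len(b) for b in barrels)
    index = [0] * len(barrels)          # last sorted position per barrel
    result = []
    for _ in range(n):                  # performed N times
        smallest = None                 # barrel holding the current minimum
        for k, b in enumerate(barrels):
            if index[k] < len(b):
                if smallest is None or b[index[k]] < barrels[smallest][index[smallest]]:
                    smallest = k
        result.append(barrels[smallest][index[smallest]])
        index[smallest] += 1
    return result

print(barrel_merge([[1, 4, 9], [2, 3], [0, 8]]))   # [0, 1, 2, 3, 4, 8, 9]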
