Counting the frequency of a target element in an unsorted array using divide and conquer

Suppose I have an unsorted array A of n integers and an integer b. I want to write an algorithm to compute the frequency of b in A (i.e., count the number of times b appears in A) using divide and conquer.
Here is a recursive divide-and-conquer algorithm to count the frequency of b in the array A:
1. Divide the array A into two sub-arrays: the left half and the right half.
2. Recursively count the frequency of b in the left half of A and in the right half of A.
3. Combine the results from step 2: the frequency of b in A is equal to the sum of the frequency of b in the left half and the frequency of b in the right half.
4. Base case: if the length of the array A is 1, return 1 if A[0] equals b, otherwise return 0 (see the code sketch below).
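For concreteness, here is a minimal C sketch of that recursion (the function name countFreq, the zero-based index interface and the example array are my own choices, not part of the question):

#include <stdio.h>

/* countFreq: divide-and-conquer count of b in A[lo..hi] (inclusive). */
int countFreq(const int A[], int lo, int hi, int b)
{
    if (lo == hi)                        /* base case: one element */
        return A[lo] == b ? 1 : 0;
    int mid = lo + (hi - lo) / 2;        /* divide */
    return countFreq(A, lo, mid, b)      /* conquer left half */
         + countFreq(A, mid + 1, hi, b); /* conquer right half, combine by summing */
}

int main(void)
{
    int A[] = {3, 1, 3, 7, 3, 2, 3};
    printf("%d\n", countFreq(A, 0, 6, 3)); /* prints 4 */
    return 0;
}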
The recurrence relation of the algorithm is T(n) = 2T(n/2) + O(1), where O(1) is the time to divide the array and combine the results. The solution of the recurrence is T(n) = O(n), so the time complexity of the algorithm is O(n).
This is because each recursive call divides the array into two sub-arrays of equal size, and each element is visited once at the bottom level of the recursion. Therefore, the algorithm visits each element of the array once, leading to a linear time complexity.
Correct me if I'm wrong.

Let's just use concrete constants and do the math. The constant step during the splitting is O(1), so let's call it c, and the step at the very end (returning 1 or 0 for a length-1 array) is just one step.
So then:
T(n) = 2T(n/2) + c
T(1) = 1
We make the educated guess (or ansatz, if you want to use fancy language) that T(n) = a*n + b, i.e., a linear function. Let's plug that into the relations:
T(n) = a*n + b = 2 * (a*n/2 + b) + c = a*n + 2b + c
from which it follows that b = 2b + c, i.e. b = -c.
Next, we plug the ansatz into the base case:
T(1) = a*1 + b = a + b = 1
from which we can deduce that a = 1 - b = 1 + c.
So there! We solved for a and b without making a mess and indeed we have
T(n) = (1 + c) * n - c
which is indeed O(n).
Note that this is the "pedestrian" way. If we're not interested in the actual coefficients a and b but really just in the complexity, we can be more efficient like so:
T(n) = 2 T(n/2) + O(1) = 4 T(n/4) + 2 * O(1) + O(1) = ...
= 2^k T(1) + 2^(k-1) O(1) + 2^(k-2) O(1) ... + O(1)
= 2^k O(1) + 2^(k-1) O(1) + ...
where k = log_2(n).
Summing up all those coefficients (a geometric series, 2^k + 2^(k-1) + ... + 1 < 2^(k+1)), we get roughly
T(n) = 2^(k+1) O(1) = 2 * n * O(1) = O(n)
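As a quick sanity check (purely my own illustration, not part of the original answer), you can evaluate the recurrence directly for powers of two and compare it with the closed form T(n) = (1 + c)*n - c; the choice c = 3 below is arbitrary:

#include <stdio.h>

/* Evaluate T(n) = 2*T(n/2) + c with T(1) = 1, for n a power of two. */
long T(long n, long c)
{
    if (n == 1)
        return 1;
    return 2 * T(n / 2, c) + c;
}

int main(void)
{
    long c = 3;  /* arbitrary constant for the divide/combine cost */
    for (long n = 1; n <= 1024; n *= 2)
        printf("n=%4ld  T(n)=%5ld  (1+c)*n-c=%5ld\n", n, T(n, c), (1 + c) * n - c);
    return 0;
}

The two columns agree for every n, confirming the linear closed form.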

Related

Analyzing Worst Case Performance of Quicksort by Substitution Method

I am trying to solve the recurrence of the quicksort algorithm by the substitution method.
I cannot find any way to prove that this will lead to O(n²). What further steps do I need to take to make this work?
The worst case of quicksort is when you choose a pivot element which is the minimum or maximum element from the array, so that all remaining elements go to one side of the partition, and the other side of the partition is empty. In this case, the non-empty partition has a size of (n - 1), and it takes linear time (kn for some constant k > 0) to do the partitioning itself, so the recurrence relation is
T(n) = T(n - 1) + T(0) + kn
If we guess that T(n) = an² + bn + c for some constants a, b, c, then we can substitute:
an² + bn + c = [ a(n - 1)² + b(n - 1) + c ] + [ c ] + kn
where the two square-bracketed terms are T(n - 1) and T(0) respectively. By expanding the brackets and equating coefficients, we get
an² = an²
bn = -2an + bn + kn
c = a - b + 2c
It follows that there is a family of solutions, parameterised by c = T(0), where a = k/2 and b = k/2 + c. This family of solutions can be written exactly as
T(n) = (k/2) n² + (k/2 + c) n + c
which is not just O(n²), but Θ(n²), meaning the running time is a quadratic function, not merely bounded above by a quadratic function. Note that the actual value of c doesn't change the asymptotic behaviour of the function, so long as k > 0 (i.e. the partitioning step does take a positive amount of time).
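To see the quadratic behaviour concretely, here is a small illustrative C program (my own sketch, not part of the original answer) that always picks the first element as the pivot and counts comparisons; on already-sorted input every partition is maximally unbalanced, and the count comes out to exactly n(n-1)/2, matching the (k/2)n² leading term:

#include <stdio.h>

static long comparisons = 0;

/* Lomuto-style partition using the first element as the pivot. */
static int partition(int a[], int lo, int hi)
{
    int pivot = a[lo], i = lo;
    for (int j = lo + 1; j <= hi; j++) {
        comparisons++;
        if (a[j] < pivot) {
            i++;
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }
    int t = a[lo]; a[lo] = a[i]; a[i] = t;
    return i;
}

static void quicksort(int a[], int lo, int hi)
{
    if (lo >= hi) return;
    int p = partition(a, lo, hi);
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
}

int main(void)
{
    enum { N = 1000 };
    int a[N];
    for (int i = 0; i < N; i++) a[i] = i;   /* sorted input = worst case for this pivot choice */
    quicksort(a, 0, N - 1);
    printf("comparisons = %ld, n(n-1)/2 = %d\n", comparisons, N * (N - 1) / 2);
    return 0;
}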

Running Time of Randomized Binary Search

Consider the following silly randomized variant of binary search. You are given a sorted array
A of n integers and the integer v that you are searching for is chosen uniformly at random from A.
Then, instead of comparing v with the value in the middle of the array, the randomized binary search
variant chooses a random number r from 1 to n and it compares v with A[r]. Depending on whether
v is larger or smaller, this process is repeated recursively on the left sub-array or the right sub-array,
until the location of v is found. Prove a tight bound on the expected running time of this algorithm.
Here is what I got for T(n):
T(n) = T(n-r) + T(r) + Θ(1)
However, I have no clue how to get a tight bound.
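For reference, a minimal C sketch of the variant being analysed might look like the following (my own illustration; the function name randSearch is an arbitrary choice, and it assumes the target v is guaranteed to be in the array):

#include <stdio.h>
#include <stdlib.h>

/* Randomized "binary" search: probe a random index in [lo, hi] instead of the middle.
   Assumes A is sorted and v is present in A[lo..hi]. */
int randSearch(const int A[], int lo, int hi, int v)
{
    int r = lo + rand() % (hi - lo + 1);     /* random probe position */
    if (A[r] == v)
        return r;
    else if (A[r] < v)
        return randSearch(A, r + 1, hi, v);  /* recurse on the right sub-array */
    else
        return randSearch(A, lo, r - 1, v);  /* recurse on the left sub-array */
}

int main(void)
{
    int A[] = {2, 3, 5, 8, 13, 21, 34, 55};
    printf("index of 13: %d\n", randSearch(A, 0, 7, 13));
    return 0;
}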
Your formulation of T(n) is not completely correct. Actually,
Let's try to look over all the cases. When we reduce the problem size by partitioning the array at some random point, the reduced sub-problem can have any size from 1 to n with uniform probability. Hence, with probability 1/n, the search space becomes r. So the expected running time becomes
T(n) = sum ( T(r)*Pr(search space becomes r) ) + O(1) = sum ( T(r) )/n + O(1)
Which gives,
T(n) = average(T(r)) + O(1)
Let the expected time complexity of randomized binary search be T(n).
T(n) = [ T(1)+T(2)+...+T(n)]/n + 1
n*T(n) = T(1)+T(2)+...+T(n) + n
(n-1)*T(n-1) = T(1)+T(2)+...+T(n-1) + n-1 [substituting n-1 for n]
n*T(n) - (n-1)*T(n-1) = T(n) + 1
(n-1)*T(n) - (n-1)*T(n-1) = 1
(n-1)*T(n) = (n-1)*T(n-1) + 1
T(n) = 1/(n-1) + T(n-1)
T(n) = 1/(n-1) + 1/(n-2) + T(n-2) [ T(n-1) = T(n-2) + 1/(n-2) ]
...
T(n) = 1 + 1/2 + 1/3 + ... + 1/(n-1) = H(n-1) < H(n) = O(log n)
[ H(n) is the n-th harmonic number: the sum of the reciprocals of the first n natural numbers ]
so, T(n) = O(log n)
See: harmonic numbers and the O(log n) bound on H(n).

recurrence relation on a Merge Sort algorithm

The question is:
UNBALANCED MERGE SORT is a sorting algorithm which is a modified version of the standard MERGE SORT algorithm. The only difference is that instead of dividing the input into 2 equal parts in each stage, we divide it into two unequal parts – the first 2/5 of the input, and the other 3/5.
a. Write the recurrence relation for the worst case time complexity of the UNBALANCED MERGE SORT algorithm.
b. What is the worst case time complexity of the UNBALANCED MERGE SORT algorithm? Solve the recurrence relation from the previous section.
So I'm thinking the recurrence relation is: T(n) <= T(2n/5) + T(3n/5) + dn.
I'm not sure how to solve it.
Thanks in advance.
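For concreteness, here is a short illustrative C sketch of the modified algorithm (my own, including an arbitrary guard for very small sub-arrays): the only change from standard merge sort is the split point at 2n/5 instead of n/2.

#include <string.h>
#include <stdio.h>

/* Merge the sorted runs a[lo..mid-1] and a[mid..hi-1] using a temporary buffer. */
static void merge(int a[], int tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof(int));
}

/* Unbalanced merge sort on a[lo..hi-1]: split into the first 2/5 and the last 3/5. */
static void umsort(int a[], int tmp[], int lo, int hi)
{
    int n = hi - lo;
    if (n <= 1) return;
    int mid = lo + (2 * n) / 5;
    if (mid == lo) mid = lo + 1;     /* guard for very small n so both parts shrink */
    umsort(a, tmp, lo, mid);
    umsort(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}

int main(void)
{
    int a[] = {9, 4, 7, 1, 8, 2, 6, 3, 5};
    int tmp[9];
    umsort(a, tmp, 0, 9);
    for (int i = 0; i < 9; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}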
I like to look at it as "runs", where the ith "run" is ALL the recursive steps with depth exactly i.
In each such run, at most n elements are processed in total (we will prove this soon), so the total complexity is bounded by O(n * MAX_DEPTH). Now, MAX_DEPTH is logarithmic: at each step the bigger part has size 3n/5, so at depth i the biggest array has size (3/5)^i * n.
Solve the equation:
(3/5)^i * n = 1
and you will find that i = log_a(n) for some constant base a (here a = 5/3).
So, let's be more formal:
Claim:
Each element is being processed by at most one recursive call at depth
i, for all values of i.
Proof:
By induction, at depth 0, all elements are processed exactly once by the first call.
Let there be some element x, and let's look at it at depth i+1. We know (by the induction hypothesis) that x was processed at most once at depth i, by some recursive call. That call later invoked (or not; we claim at most one) the recursive call of depth i+1, and sent the element x to the left OR to the right part, never to both. So at depth i+1, the element x is processed at most once.
Conclusion:
Since at each depth i of the recursion, each element is processed at most once, and the maximal depth of the recursion is logarithmic, we get an upper bound of O(nlogn).
We can similarly prove a lower bound of Omega(nlogn), but that is not needed, since sorting is already an Omega(nlogn) problem - so we can conclude the modified algorithm is still Theta(nlogn).
If you want to prove it with "basic arithmetic", it can also be done by induction.
Claim: T(n) = T(3n/5) + T(2n/5) + n <= 5nlog(n) + n
The proof is similar when +n is replaced by +dn; I simplified it here, but the same idea of proof works with T(n) <= 5dn*log(n) + dn.
Proof:
Base: T(1) = 1 <= 5*1*log(1) + 1 = 1
T(n) = T(3n/5) + T(2n/5) + n
<= 5*(3n/5)*log(3n/5) + 3n/5 + 5*(2n/5)*log(2n/5) + 2n/5 + n
< 5*(3n/5)*log(3n/5) + 5*(2n/5)*log(3n/5) + 2n
= 5n*log(3n/5) + 2n
= 5n*log(n) + 5n*log(3/5) + 2n
(**) < 5n*log(n) - n + 2n
= 5n*log(n) + n
(**) is because log(3/5) ≈ -0.22 (base 10), so 5n*log(3/5) < -n, and hence 5n*log(3/5) + 2n < n

Complexity of trominoes algorithm

What is or what should be complexity of (divide and conquer) trominoes algorithm and why?
I've been given a 2^k * 2^k sized board, and one of the tiles is randomly removed, making it a deficient board. The task is to fill the board with "trominoes", which are L-shaped figures made of 3 tiles.
Tiling Problem
– Input: An n by n square board, with one of the 1 by 1 squares missing, where n = 2^k for some k ≥ 1.
– Output: A tiling of the board using trominoes – a tromino is a three-square tile obtained by deleting the upper right 1 by 1 corner from a 2 by 2 square.
– You are allowed to rotate the tromino for tiling the board.
Base Case: A 2 by 2 square (with one cell missing) can be tiled with a single tromino.
Induction:
– Divide the square into four n/2 by n/2 squares.
– Place a tromino at the "center", so that it does not overlap the n/2 by n/2 square which already contains the missing 1 by 1 square; it then covers exactly one cell in each of the other three quadrants.
– Solve each of the four n/2 by n/2 boards inductively.
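For concreteness, here is a rough C sketch of this induction (my own illustration; the board size, the position of the missing square and the cell-numbering scheme are arbitrary choices, not part of the question):

#include <stdio.h>

#define K 3                 /* board is 2^K by 2^K; illustrative choice */
#define N (1 << K)

static int board[N][N];     /* 0 = empty, -1 = missing square, >0 = tromino id */
static int next_id = 1;

/* Tile the sub-board with top-left corner (top, left) and side length size,
   which contains exactly one already-filled cell ("the hole") at (hr, hc). */
static void tile(int top, int left, int size, int hr, int hc)
{
    if (size == 1) return;                       /* a 1x1 sub-board is just the hole */
    int half = size / 2;
    int id = next_id++;
    int cr = top + half, cc = left + half;       /* centre of the sub-board */
    /* Corner cell of each quadrant nearest the centre (TL, TR, BL, BR). */
    int qr[4] = {cr - 1, cr - 1, cr, cr};
    int qc[4] = {cc - 1, cc, cc - 1, cc};
    int holes_r[4], holes_c[4];
    for (int q = 0; q < 4; q++) {
        int t = top + (q / 2) * half, l = left + (q % 2) * half;
        if (hr >= t && hr < t + half && hc >= l && hc < l + half) {
            holes_r[q] = hr; holes_c[q] = hc;    /* this quadrant already has the hole */
        } else {
            board[qr[q]][qc[q]] = id;            /* one cell of the central tromino */
            holes_r[q] = qr[q]; holes_c[q] = qc[q];
        }
    }
    for (int q = 0; q < 4; q++) {                /* recurse into the four quadrants */
        int t = top + (q / 2) * half, l = left + (q % 2) * half;
        tile(t, l, half, holes_r[q], holes_c[q]);
    }
}

int main(void)
{
    board[0][5] = -1;                            /* the randomly removed square */
    tile(0, 0, N, 0, 5);
    for (int r = 0; r < N; r++) {
        for (int c = 0; c < N; c++) printf("%3d", board[r][c]);
        printf("\n");
    }
    return 0;
}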
This algorithm runs in time O(n^2) = O(4^k). To see why, notice that your algorithm does O(1) work per grid, then makes four subcalls to grids whose width and height are half the original size. If we use n as a parameter denoting the width or height of the grid, we have the following recurrence relation:
T(n) = 4T(n / 2) + O(1)
By the Master Theorem, this solves to O(n^2). Since n = 2^k, we see that n^2 = 4^k, so this is also O(4^k) if you want to use k as your parameter.
We could also let N denote the total number of squares on the board (so N = n^2), in which case the subcalls are to four grids of size N / 4 each. This gives the recurrence
S(N) = 4S(N / 4) + O(1)
This solves to O(N) = O(n^2), confirming the above result.
Hope this helps!
To my understanding, the complexity can be determined as follows. Let T(n) denote the number of steps needed to solve a board of side length n. From the description in the original question above, we have
T(2) = c
where c is a constant and
T(n) = 4*T(n/2) + b
where b is a constant for placing the tromino. Using the master theorem, the runtime bound is
O(n^2)
via case 1.
I'll try to offer less formal solutions but without making use of the Master theorem.
– Place the tromino at the “center”, where the tromino does not overlap the n/2 by n/2 square which was earlier missing out 1 by 1 square.
I'm guessing this is an O(1) operation? In that case, if n is the board size:
T(1) = O(1)
T(n) = 4T(n / 4) + O(1) =
= 4(4T(n / 4^2) + O(1)) + O(1) =
= 4^2T(n / 4^2) + 4*O(1) + O(1) =
= ... =
= 4^k*T(n / 4^k) + (4^(k-1) + 4^(k-2) + ... + 1)*O(1)
But n = 2^k x 2^k = 2^(2k) = (2^2)^k = 4^k, so the whole algorithm is O(n).
Note that this does not contradict @Codor's answer, because he took n to be the side length of the board, while I took it to be the entire area.
If the middle step is not O(1) but O(n):
T(n) = 4T(n / 4) + O(n) =
= 4(4*T(n / 4^2) + O(n / 4)) + O(n) =
= 4^2T(n / 4^2) + 2*O(n) =
= ... =
= 4^kT(n / 4^k) + k*O(n)
We have:
k*O(n) = O(n*log(n)), because 4^k = n means k = log_4(n)
So the entire algorithm would be O(n log n).
You do O(1) work per tromino placed. Since there are (n^2 - 1)/3 trominoes to place, the algorithm takes O(n^2) time.

Proving this recursive Fibonacci implementation runs in time O(2^n)?

I'm having difficulty proving that the 'bad' version of Fibonacci is O(2^n).
That is, given the function
int fib(int x)
{
    if ( x == 1 || x == 2 )
    {
        return 1;
    }
    else
    {
        return ( fib( x - 1 ) + fib( x - 2 ) );
    }
}
Can I get help with the proof that this is O(2^n)?
Let's start off by writing a recurrence relation for the runtime:
T(1) = 1
T(2) = 1
T(n+2) = T(n) + T(n + 1) + 1
Now, let's take a guess that
T(n) ≤ 2^n
If we try to prove this by induction, the base cases check out:
T(1) = 1 ≤ 2 = 2^1
T(2) = 1 ≤ 4 = 2^2
Then, in the inductive step, we see this:
T(n + 2) = T(n) + T(n + 1) + 1
≤ 2^n + 2^(n+1) + 1
< 2^(n+1) + 2^(n+1)
= 2^(n+2)
Therefore, by induction, we can conclude that T(n) ≤ 2^n for any n, and therefore T(n) = O(2^n).
With a more precise analysis, you can prove that T(n) = 2*F_n - 1, where F_n is the nth Fibonacci number. This proves, more accurately, that T(n) = Θ(φ^n), where φ is the Golden Ratio, which is approximately 1.618. Note that φ^n = o(2^n) (using little-o notation), so this is a much better bound.
Hope this helps!
Try manually doing a few test cases like f(5) and take note of how many times the method f() is called.
A fat hint would be to notice that every time the method f() is called (except for x is 1 or 2), f() is called twice. Each of those call f() two more times each, and so on...
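Following that hint, here is a small illustrative C program (my own sketch) that counts the calls explicitly; the count comes out to 2*fib(x) - 1, which matches the analysis in the next answer:

#include <stdio.h>

static long calls = 0;

int fib(int x)
{
    calls++;                         /* count every invocation of fib */
    if (x == 1 || x == 2)
        return 1;
    return fib(x - 1) + fib(x - 2);
}

int main(void)
{
    for (int x = 1; x <= 10; x++) {
        calls = 0;
        int f = fib(x);
        printf("x=%2d  fib=%3d  calls=%4ld  2*fib-1=%4ld\n", x, f, calls, 2L * f - 1);
    }
    return 0;
}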
There's actually a pretty simple proof that the total number of calls to the f is going to be 2Fib(n)-1, where Fib(n) is the n'th Fibonacci number. It goes like this:
The set of calls to f form a binary tree, where each call is either a leaf (for x=1 or x=2) or else the call spawns two child calls (for x>2).
Each leaf contributes exactly 1 to the total returned by the original call, therefore there are Fib(n) total leaves.
The total number of internal nodes in any binary tree is equal to L-1, where L is the number of leaves, so the total number of nodes in this tree is 2L-1.
This shows that the running time (measured in terms of total calls to f) is
T(n)=2Fib(n)-1=O(Fib(n))
and since Fib(n) = Θ(φ^n), where φ is the golden ratio
φ = (1 + sqrt(5))/2 = 1.618...
this proves that T(n) = Θ(1.618...^n), which in particular is O(2^n).
Using the Recursion Tree Method :
T(n)
↙ ↘
T(n-1)   T(n-2)
↙ ↘       ↙ ↘
T(n-2) T(n-3)   T(n-3) T(n-4)
Each node of the tree represents a call, which in turn makes the calls fib(x - 1) and fib(x - 2). If you complete the recursion tree in this manner, you stop when x = 1 or x = 2 (the base case); the tree above shows only the first three levels. To solve this tree you need two important pieces of information: 1) the height of the tree, and 2) how much work is done at each level.
The height of this tree is at most n, and each node does O(1) work. Since the number of nodes at most doubles from one level to the next, level i has at most 2^i nodes, so the total work is bounded by 2^0 + 2^1 + ... + 2^n = O(2^n).
