Finding Minimum Nodes in AVL Tree without recursive call? - data-structures

Can we find a general formula for the minimum number of nodes in an AVL tree without using the recurrence relation? For example, finding the minimum number of nodes in an AVL tree of height 1000 by unrolling the recurrence takes far too long to do on paper.

This is a great place to draw some pictures and look for a pattern.
Here are the smallest AVL trees of heights 0 and 1:
  *        *
           |
           *
The smallest AVL tree of height 2 would be made by using trees of heights 0 and 1 (if you used heights 1 and 1, you could improve the number of nodes by using trees of height 0 and 1), and the smallest option is
    *
   / \
  *   *
  |
  *
Then, the smallest tree of height 3 would use trees of heights 1 and 2, which would look like this:
      *
     / \
    *   *
    |  / \
    *  *  *
       |
       *
And, more generally, the smallest tree of height k, where k ≥ 2, is given by taking the two smallest trees of heights k-1 and k-2 and linking them together with a new node at the root.
This gives a recurrence relation:
T(n) = T(n - 1) + T(n - 2) + 1
T(0) = 1
T(1) = 2
Iterating this recurrence gives this pattern:
h    =  0   1   2   3   4   5   6   7   8
T(h) =  1   2   4   7  12  20  33  54  88
There are a couple of things you could do next:
Could you write some code to evaluate this recurrence relation on larger and larger values of n? That might give you a function that directly computes the number you need.
The recurrence T(n) = T(n-2) + T(n-1) + 1 looks a lot like the Fibonacci recurrence F(n) = F(n-2) + F(n-1). Do you see any relation between the T(n) numbers and the Fibonacci sequence? That might let you directly compute the answer.
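Both suggestions can be tried in a few lines. The sketch below is only an illustration: the names min_avl_nodes and fib are made up here, and the relation T(h) = F(h + 3) - 1 (with F(1) = F(2) = 1) is the pattern one observes by comparing the two sequences, checked numerically rather than taken from the answer above. It follows from noticing that T(h) + 1 satisfies the plain Fibonacci recurrence.

```python
def min_avl_nodes(h: int) -> int:
    """Minimum node count of an AVL tree of height h, with T(0) = 1, T(1) = 2."""
    if h < 0:
        raise ValueError("height must be non-negative")
    prev, cur = 1, 2  # T(0), T(1)
    if h == 0:
        return prev
    for _ in range(h - 1):
        # T(k) = T(k-1) + T(k-2) + 1
        prev, cur = cur, prev + cur + 1
    return cur


def fib(n: int) -> int:
    """n-th Fibonacci number with F(1) = F(2) = 1, computed iteratively."""
    a, b = 0, 1  # F(0), F(1)
    for _ in range(n):
        a, b = b, a + b
    return a


# First values of the recurrence, for comparison with the table above.
print([min_avl_nodes(h) for h in range(9)])
# Numerical check of the observed pattern T(h) = F(h + 3) - 1.
print(all(min_avl_nodes(h) == fib(h + 3) - 1 for h in range(200)))
# Height 1000 needs only about a thousand additions.
print(min_avl_nodes(1000))
```

Python integers have arbitrary precision, so height 1000 is not a problem; in a language with fixed-width integers the value would overflow long before that.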
Hope this helps!

Related

Is the following way correct for determining the complexity of the Fibonacci function? [duplicate]

I understand Big-O notation, but I don't know how to calculate it for many functions. In particular, I've been trying to figure out the computational complexity of the naive version of the Fibonacci sequence:
int Fibonacci(int n)
{
    if (n <= 1)
        return n;
    else
        return Fibonacci(n - 1) + Fibonacci(n - 2);
}
What is the computational complexity of the Fibonacci sequence and how is it calculated?
You model the time function to calculate Fib(n) as sum of time to calculate Fib(n-1) plus the time to calculate Fib(n-2) plus the time to add them together (O(1)). This is assuming that repeated evaluations of the same Fib(n) take the same time - i.e. no memoization is used.
T(n<=1) = O(1)
T(n) = T(n-1) + T(n-2) + O(1)
You solve this recurrence relation (using generating functions, for instance) and you'll end up with the answer.
Alternatively, you can draw the recursion tree, which has depth n, and intuitively figure out that this function is asymptotically O(2^n). You can then prove your conjecture by induction.
Base: n = 1 is obvious
Assume T(n-1) = O(2^(n-1)), therefore
T(n) = T(n-1) + T(n-2) + O(1) which is equal to
T(n) = O(2^(n-1)) + O(2^(n-2)) + O(1) = O(2^n)
However, as noted in a comment, this is not the tight bound. An interesting fact about this function is that the T(n) is asymptotically the same as the value of Fib(n) since both are defined as
f(n) = f(n-1) + f(n-2).
The value of Fib(n) is the sum of all values returned by the leaves of the recursion tree. Each leaf returns 0 or 1, and the number of leaves that return 1 is exactly Fib(n), so the total leaf count is within a constant factor of Fib(n). Since each leaf takes O(1) to compute, T(n) is equal to Fib(n) x O(1). Consequently, the tight bound for this function is the Fibonacci sequence itself (~Θ(1.6^n)). You can find this tight bound by using generating functions, as I mentioned above.
Just ask yourself how many statements need to execute for F(n) to complete.
For F(1), the answer is 1 (the first part of the conditional).
For F(n), the answer is F(n-1) + F(n-2).
So what function satisfies these rules? Try a^n (a > 1):
a^n == a^(n-1) + a^(n-2)
Divide through by a^(n-2):
a^2 == a + 1
Solve for a and you get (1+sqrt(5))/2 = 1.6180339887, otherwise known as the golden ratio.
So it takes exponential time.
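A quick numerical check of this argument, as a sketch only: the helper calls below is a made-up name, and it counts every invocation of the naive function (so it uses C(n) = C(n-1) + C(n-2) + 1 with C(0) = C(1) = 1, an assumption slightly different from the simplified count above). The extra "+ 1" does not change the growth rate.

```python
import math

# Positive root of a^2 = a + 1, i.e. the golden ratio.
phi = (1 + math.sqrt(5)) / 2


def calls(n: int) -> int:
    """Invocations of the naive recursion: C(n) = C(n-1) + C(n-2) + 1, C(0) = C(1) = 1."""
    a, b = 1, 1  # C(0), C(1)
    for _ in range(n):
        a, b = b, a + b + 1
    return a


# phi satisfies the quadratic up to floating-point rounding.
print(phi, phi ** 2 - (phi + 1))
# The ratio of consecutive call counts settles close to phi, not 2.
print(calls(31) / calls(30))
```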
I agree with pgaur and rickerbh, recursive-fibonacci's complexity is O(2^n).
I came to the same conclusion by a rather simplistic but I believe still valid reasoning.
First, it's all about figuring out how many times recursive fibonacci function ( F() from now on ) gets called when calculating the Nth fibonacci number. If it gets called once per number in the sequence 0 to n, then we have O(n), if it gets called n times for each number, then we get O(n*n), or O(n^2), and so on.
So, when F() is called for a number n, the number of times F() is called for a given number between 0 and n-1 grows as we approach 0.
As a first impression, it seems to me that if we put it in a visual way, drawing a unit per time F() is called for a given number, we get a sort of pyramid shape (that is, if we center units horizontally). Something like this:
n *
n-1 **
n-2 ****
...
2 ***********
1 ******************
0 ***************************
Now, the question is, how fast is the base of this pyramid enlarging as n grows?
Let's take a real case, for instance F(6)
F(6) * <-- only once
F(5) * <-- only once too
F(4) **
F(3) ****
F(2) ********
F(1) **************** <-- 16
F(0) ******************************** <-- 32
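In this picture F(0) gets called 32 times, which is 2^5, or 2^(n-1) for this sample case. (The picture doubles the count at every level; the true counts grow more slowly, so treat it as an upper estimate.)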
Now, we want to know how many times F(x) gets called at all, and we can see the number of times F(0) is called is only a part of that.
If we mentally move all the *'s from F(6) to F(2) lines into F(1) line, we see that F(1) and F(0) lines are now equal in length. Which means, total times F() gets called when n=6 is 2x32=64=2^6.
Now, in terms of complexity:
O( F(6) ) = O(2^6)
O( F(n) ) = O(2^n)
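The picture can be compared with the real call counts by instrumenting the recursion. This is a minimal sketch, assuming the base case n <= 1 from the question; count_calls is a hypothetical helper name. The per-argument counts it reports are Fibonacci numbers rather than powers of two, which is why 2^n is a valid upper bound but not a tight one.

```python
from collections import Counter


def count_calls(n: int) -> Counter:
    """Return how many times the naive recursion is invoked for each argument."""
    calls = Counter()

    def fib(k: int) -> int:
        calls[k] += 1
        if k <= 1:
            return k
        return fib(k - 1) + fib(k - 2)

    fib(n)
    return calls


calls = count_calls(6)
for k in range(6, -1, -1):
    print(f"F({k}) called {calls[k]} time(s)")
print("total calls:", sum(calls.values()), "vs 2^6 =", 2 ** 6)
```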
There's a very nice discussion of this specific problem over at MIT. On page 5, they make the point that, if you assume that an addition takes one computational unit, the time required to compute Fib(N) is very closely related to the result of Fib(N).
As a result, you can skip directly to the very close approximation of the Fibonacci series:
Fib(N) = (1/sqrt(5)) * 1.618^(N+1) (approximately)
and say, therefore, that the worst case performance of the naive algorithm is
O((1/sqrt(5)) * 1.618^(N+1)) = O(1.618^(N+1))
PS: There is a discussion of the closed form expression of the Nth Fibonacci number over at Wikipedia if you'd like more information.
You can expand it to visualize the bound:
T(n) = T(n-1) + T(n-2)
     < T(n-1) + T(n-1)
     = 2*T(n-1)
     < 2*2*T(n-2)
     < 2*2*2*T(n-3)
     ....
     < 2^i*T(n-i)
     ...
==> O(2^n)
The time complexity of a recursive algorithm can often be estimated by drawing its recursion tree. In this case the recurrence relation behind the tree is T(n) = T(n-1) + T(n-2) + O(1).
Note that each step takes O(1), meaning constant time, since it does only one comparison to check the value of n in the if block. The recursion tree looks like this:
n
(n-1) (n-2)
(n-2)(n-3) (n-3)(n-4) ...so on
Let each level of the tree above be denoted by i,
hence:
i
0 n
1 (n-1) (n-2)
2 (n-2) (n-3) (n-3) (n-4)
3 (n-3)(n-4) (n-4)(n-5) (n-4)(n-5) (n-5)(n-6)
Suppose the tree ends at a particular value of i. That happens when n-i = 1, hence i = n-1, meaning that the height of the tree is n-1.
Now let's see how much work is done in each of the n layers of the tree. Note that each step takes O(1) time, as stated in the recurrence relation.
2^0=1 n
2^1=2 (n-1) (n-2)
2^2=4 (n-2) (n-3) (n-3) (n-4)
2^3=8 (n-3)(n-4) (n-4)(n-5) (n-4)(n-5) (n-5)(n-6) ..so on
2^i for the ith level
Since i = n-1 is the height of the tree, the work done at each level will be at most
i work
1 2^1
2 2^2
3 2^3 ..so on
Hence the total work done is the sum of the work done at each level, which is at most 2^0+2^1+2^2+2^3+...+2^(n-1), since i = n-1.
By the geometric series this sum is 2^n - 1, hence the total time complexity here is O(2^n).
The proof answers are good, but I always have to do a few iterations by hand to really convince myself. So I drew out a small calling tree on my whiteboard, and started counting the nodes. I split my counts out into total nodes, leaf nodes, and interior nodes. Here's what I got:
IN | OUT | TOT | LEAF | INT
1 | 1 | 1 | 1 | 0
2 | 1 | 1 | 1 | 0
3 | 2 | 3 | 2 | 1
4 | 3 | 5 | 3 | 2
5 | 5 | 9 | 5 | 4
6 | 8 | 15 | 8 | 7
7 | 13 | 25 | 13 | 12
8 | 21 | 41 | 21 | 20
9 | 34 | 67 | 34 | 33
10 | 55 | 109 | 55 | 54
What immediately leaps out is that the number of leaf nodes is fib(n). What took a few more iterations to notice is that the number of interior nodes is fib(n) - 1. Therefore the total number of nodes is 2 * fib(n) - 1.
Since you drop the coefficients when classifying computational complexity, the final answer is θ(fib(n)).
It is bounded on the lower end by 2^(n/2) and on the upper end by 2^n (as noted in other comments). And an interesting fact of that recursive implementation is that it has a tight asymptotic bound of Fib(n) itself. These facts can be summarized:
T(n) = Ω(2^(n/2)) (lower bound)
T(n) = O(2^n) (upper bound)
T(n) = Θ(Fib(n)) (tight bound)
The tight bound can be reduced further using its closed form if you like.
It is simple to calculate by diagramming function calls. Simply add the function calls for each value of n and look at how the number grows.
The Big O is O(Z^n) where Z is the golden ratio or about 1.62.
Both the Leonardo numbers and the Fibonacci numbers approach this ratio as we increase n.
Unlike other Big O questions there is no variability in the input and both the algorithm and implementation of the algorithm are clearly defined.
There is no need for a bunch of complex math. Simply diagram out the function calls below and fit a function to the numbers.
Or if you are familiar with the golden ratio you will recognize it as such.
This answer is more correct than the accepted answer which claims that it will approach f(n) = 2^n. It never will. It will approach f(n) = golden_ratio^n.
2 (2 -> 1, 0)
4 (3 -> 2, 1) (2 -> 1, 0)
8 (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
14 (5 -> 4, 3) (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
(3 -> 2, 1) (2 -> 1, 0)
24 (6 -> 5, 4)
(5 -> 4, 3) (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
(3 -> 2, 1) (2 -> 1, 0)
(4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
The naive recursion version of Fibonacci is exponential by design due to repetition in the computation:
At the root you are computing:
F(n) depends on F(n-1) and F(n-2)
F(n-1) depends on F(n-2) again and F(n-3)
F(n-2) depends on F(n-3) again and F(n-4)
Each level thus makes 2 recursive calls that repeat a lot of work already done elsewhere in the calculation. The time function looks like this:
T(n) = T(n-1) + T(n-2) + C, with C constant
T(n-1) = T(n-2) + T(n-3) + C > T(n-2), then
T(n) > 2*T(n-2)
...
T(n) > 2^(n/2) * T(1) = Ω(2^(n/2))
This is just a lower bound, which should be enough for the purpose of your analysis, but the real time function is within a constant factor of the Fibonacci formula itself, whose closed form is known to be exponential in the golden ratio.
In addition, you can find optimized versions of Fibonacci using dynamic programming like this:
static int fib(int n)
{
    /* memory: n + 2 entries so that f[1] exists even when n == 0 */
    int[] f = new int[n + 2];
    int i;
    /* Init */
    f[0] = 0;
    f[1] = 1;
    /* Fill */
    for (i = 2; i <= n; i++)
    {
        f[i] = f[i - 1] + f[i - 2];
    }
    return f[n];
}
That version is optimized and does only n steps, but it is still exponential in the size of the input.
Cost functions map the input size to the number of steps needed to solve the problem. When you see the dynamic version of Fibonacci (n steps to compute the table) or the easiest algorithm to check whether a number is prime (sqrt(n) steps to test the candidate divisors), you may think that these algorithms are O(n) or O(sqrt(n)), but this is not true when the cost is measured against the input size, for the following reason:
The input to your algorithm is a number n. In binary notation the input size for an integer n is log2(n), so make the change of variable
m = log2(n) // your real input size
Let's find the number of steps as a function of the input size:
m = log2(n)
2^m = 2^log2(n) = n
then the cost of your algorithm as a function of the input size is:
T(m) = n steps = 2^m steps
and this is why the cost is exponential.
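The change of variable can be made concrete with a small experiment. The sketch below is illustrative only: fib_steps is a hypothetical helper that counts loop iterations of a table-style computation, and the inputs are chosen as the largest values that fit in m bits.

```python
def fib_steps(n: int) -> int:
    """Count the loop iterations needed to reach Fib(n) bottom-up."""
    steps = 0
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
        steps += 1
    return steps


for m in (4, 8, 16):
    n = 2 ** m - 1  # largest value that fits in m bits
    # Columns: input size in bits, value of n, number of loop iterations.
    print(n.bit_length(), n, fib_steps(n))
```

Each extra bit of input roughly doubles the number of iterations, which is the sense in which the cost is exponential in m while being linear in the value n.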
Well, in my view it is O(2^n), since the recursion is the only part of this function that takes considerable time (divide and conquer). The function keeps branching like a tree until it reaches the leaves, at level F(n-(n-1)), i.e. F(1). So when we write down the work done at each depth of the tree, the series is at most:
1 + 2 + 4 + ... + 2^(n-1)
= 1((2^n)-1)/(2-1)
= 2^n - 1
that is, order of 2^n [ O(2^n) ].
No answer emphasizes probably the fastest and most memory efficient way to calculate the sequence. There is a closed form exact expression for the Fibonacci sequence. It can be found by using generating functions or by using linear algebra as I will now do.
Let f_1, f_2, ... be the Fibonacci sequence with f_1 = f_2 = 1. Now consider a sequence of two dimensional vectors
v_1 = [f_1]   v_2 = [f_2]   v_3 = [f_3]   ...
      [f_2]         [f_3]         [f_4]
Observe that the next element v_{n+1} in the vector sequence is M.v_{n} where M is a 2x2 matrix given by
M = [0 1]
    [1 1]
due to f_{n+1} = f_{n+1} and f_{n+2} = f_{n} + f_{n+1}
M is diagonalizable over the complex numbers (in fact it is diagonalizable over the reals as well, though that is not usually the case for a matrix). There are two distinct eigenvectors of M given by
[ 1 ]   [ 1 ]
[x_1]   [x_2]
where x_1 = (1+sqrt(5))/2 and x_2 = (1-sqrt(5))/2 are the distinct solutions to the polynomial equation x*x-x-1 = 0. The corresponding eigenvalues are x_1 and x_2. Think of M as a linear transformation and change your basis to see that it is equivalent to
D = [x_1 0]
    [0 x_2]
In order to find f_n find v_n and look at the first coordinate. To find v_n apply M n-1 times to v_1. But applying M n-1 times is easy, just think of it as D. Then using linearity one can find
f_n = 1/sqrt(5)*(x_1^n-x_2^n)
Since the absolute value of x_2 is smaller than 1, the corresponding term vanishes as n tends to infinity; therefore, rounding (x_1^n)/sqrt(5) to the nearest integer is enough to find the answer exactly. By making use of the trick of repeated squaring, this can be done using only O(log_2(n)) multiplication (and addition) operations. Memory complexity is even more impressive because it can be implemented in a way that you always need to hold at most 1 number in memory whose value is smaller than the answer. However, since this number is not a natural number, memory complexity here changes depending on whether you use fixed bits to represent each number (hence do calculations with error, giving O(1) memory complexity in this case) or use a better model like Turing machines, in which case some more analysis is needed.
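The repeated-squaring idea can be sketched without floating point by raising M itself to a power using exact integers. Note that this swaps the diagonal form D for the original matrix M, so it is a variation on the approach above rather than a transcription of it; mat_mul, mat_pow and fib are names chosen for this example.

```python
import math


def mat_mul(a, b):
    """Multiply two 2x2 integer matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def mat_pow(m, e):
    """Raise a 2x2 matrix to a non-negative integer power by repeated squaring."""
    result = ((1, 0), (0, 1))  # identity matrix
    while e > 0:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


def fib(n):
    """n-th Fibonacci number with f_1 = f_2 = 1, for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # v_n = M^(n-1) v_1 with v_1 = (f_1, f_2) = (1, 1); f_n is the first coordinate.
    p = mat_pow(((0, 1), (1, 1)), n - 1)
    return p[0][0] + p[0][1]


print([fib(n) for n in range(1, 11)])
# For small n, rounding x_1^n / sqrt(5) gives the same values in floating point.
x1 = (1 + math.sqrt(5)) / 2
print(all(round(x1 ** n / math.sqrt(5)) == fib(n) for n in range(1, 40)))
```

The loop in mat_pow runs once per bit of the exponent, so it performs O(log n) matrix multiplications; the cost of each multiplication still grows with the size of the numbers involved.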

Max-heapify with convergent series

I am going through max-heapify, and below are my observations:
1- Observe that max-heapify takes O(1) for nodes that are one level above the leaves, and in general O(L) time for nodes that are L levels above the leaves
2- There are n/4 nodes at level 1, n/8 nodes at level 2, and so on.
The total amount of work in the for loop:
n/4 (1 c) + n/8 (2 c) + n/16 (3c) + ... + 1(log n c)
Set n/4 = 2^k:
C * 2^k * (1/2^0 + 2/2^1 + 3/2^2 + ... + (k+1)/2^k)
The series in the brackets is a convergent series bounded by a constant.
Algorithm is:
build max-heap (A)
    for i=n/2 down to 1:
        do max-heapify (A,i)
I understand most of the lecture, but I am confused on some points:
1- Why are we using n/4 (1 c) and not n/2? And how do we know that n/4 corresponds to level 1?
2- How does this convergent series lead us to Θ(n) complexity?
1: Consider some (complete for simplicity) binary tree. It has one root, two nodes below the root, 4 below them and so on. Hence the number of nodes in a binary tree of height h is
1 + 2 + 2^2 + ... + 2^(h-1) = 2^h - 1
The bottom level alone holds 2^(h-1) of these n nodes, so roughly half of all nodes (n/2) are leaves. The level above has half as many nodes as there are leaves, so n/4.
2: You have a runtime of
C * 2^k * (1 + 2/2^1 + ... + (k+1)/2^k)
Let's call the term in the brackets S(k) (it depends on k). S(k) converges as k tends to infinity. Furthermore it is increasing, since S(k+1) is S(k) plus a positive term. Hence it always stays below its limit: if it ever exceeded the limit, it could never come back down. Therefore there is a constant A (independent of k) such that S(k) < A for all k.
Hence we can write the runtime as
C * 2^k * (1 + 2/2^1 + ... + (k+1)/2^k) < C * 2^k * A = A * C * n/4 = O(n)
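To see the bound concretely, the partial sums S(k) can be evaluated exactly. This is a small sketch; S is the bracketed sum from above, and the closed form 4 - (k+3)/2^k used in the check is derived here (via the standard sum of (i+1)x^i), not taken from the answer. It suggests A = 4 works as the constant.

```python
from fractions import Fraction


def S(k: int) -> Fraction:
    """Partial sum 1/2^0 + 2/2^1 + 3/2^2 + ... + (k+1)/2^k, computed exactly."""
    return sum(Fraction(i + 1, 2 ** i) for i in range(k + 1))


for k in (1, 5, 10, 20, 50):
    print(k, float(S(k)))

# Every partial sum stays strictly below 4 and matches the closed form.
print(all(S(k) == 4 - Fraction(k + 3, 2 ** k) and S(k) < 4 for k in range(100)))
```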
It is kind of long explanation, I had the same difficulty to understand the heapify() before. So, I’d like to share it. Please let me know if you find some issue. Thanks.

Complexity of trominoes algorithm

What is or what should be complexity of (divide and conquer) trominoes algorithm and why?
I've been given a 2^k * 2^k sized board, and one of the tiles is randomly removed, making it a deficient board. The task is to fill the board with "trominoes", which are L-shaped figures made of 3 tiles.
Tiling Problem
– Input: An n by n square board, with one of the 1 by 1 squares missing, where n = 2^k for some k ≥ 1.
– Output: A tiling of the board using a tromino, a three square tile obtained by deleting the upper right 1 by 1 corner from a 2 by 2 square.
– You are allowed to rotate the tromino, for tiling the board.
Base Case: A 2 by 2 square can be tiled.
Induction:
– Divide the square into 4, n/2 by n/2 squares.
– Place the tromino at the “center”, where the tromino does not overlap the n/2 by n/2 square which was earlier missing out 1 by 1 square.
– Solve each of the four n/2 by n/2 boards inductively.
This algorithm runs in time O(n^2) = O(4^k). To see why, notice that your algorithm does O(1) work per grid, then makes four subcalls to grids whose width and height are half the original size. If we use n as a parameter denoting the width or height of the grid, we have the following recurrence relation:
T(n) = 4T(n / 2) + O(1)
By the Master Theorem, this solves to O(n^2). Since n = 2^k, we see that n^2 = 4^k, so this is also O(4^k) if you want to use k as your parameter.
We could also let N denote the total number of squares on the board (so N = n^2), in which case the subcalls are to four grids of size N / 4 each. This gives the recurrence
S(N) = 4S(N / 4) + O(1)
This solves to O(N) = O(n^2), confirming the above result.
Hope this helps!
To my understanding, the complexity can be determined as follows. Let T(n) denote the number of steps needed to solve a board of side length n. From the description in the original question above, we have
T(2) = c
where c is a constant and
T(n) = 4*T(n/2) + b
where b is a constant for placing the tromino. Using the master theorem, the runtime bound is
O(n^2)
via case 1.
I'll try to offer less formal solutions but without making use of the Master theorem.
– Place the tromino at the “center”, where the tromino does not overlap the n/2 by n/2 square which was earlier missing out 1 by 1 square.
I'm guessing this is an O(1) operation? In that case, if n is the board size:
T(1) = O(1)
T(n) = 4T(n / 4) + O(1) =
= 4(4T(n / 4^2) + O(1)) + O(1) =
= 4^2T(n / 4^2) + 4*O(1) + O(1) =
= ... =
= 4^kT(n / 4^k) + (4^(k - 1) + ... + 4 + 1)*O(1)
But n = 2^k x 2^k = 2^(2k) = (2^2)^k = 4^k, so T(n / 4^k) = T(1) and both terms are O(4^k): the whole algorithm is O(n).
Note that this does not contradict @Codor's answer, because he took n to be the side length of the board, while I took it to be the entire area.
If the middle step is not O(1) but O(n):
T(n) = 4T(n / 4) + O(n) =
= 4(4*T(n / 4^2) + O(n / 4)) + O(n) =
= 4^2T(n / 4^2) + 2*O(n) =
= ... =
= 4^kT(n / 4^k) + k*O(n)
We have:
k*O(n) = O(n log n) because 4^k = n, i.e. k = log_4(n)
So the entire algorithm would be O(n log n).
You do O(1) work per tromino placed. Since there's (n^2-1)/3 trominos to place, the algorithm takes O(n^2) time.
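The divide-and-conquer description can be turned into a short program, which also makes the tromino count easy to check. This is a minimal sketch under a few assumptions: the board is a list of lists, the hole is marked -1, each tromino gets its own positive id, and the names tile and solve are invented for the example.

```python
def tile(n, hole):
    """Tile an n x n board (n a power of two, n >= 2) that has one missing cell.

    Returns a grid in which the hole is -1 and the three cells of each
    tromino share one positive id.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    board = [[0] * n for _ in range(n)]
    board[hole[0]][hole[1]] = -1
    next_id = 0

    def solve(top, left, size, hr, hc):
        nonlocal next_id
        next_id += 1
        tid = next_id
        half = size // 2
        for q in range(4):
            qtop = top + (q // 2) * half
            qleft = left + (q % 2) * half
            # Cell of this quadrant that touches the centre of the current board.
            cr = top + half - 1 + q // 2
            cc = left + half - 1 + q % 2
            if qtop <= hr < qtop + half and qleft <= hc < qleft + half:
                sub_hole = (hr, hc)      # this quadrant already has its gap
            else:
                board[cr][cc] = tid      # covered by the central tromino
                sub_hole = (cr, cc)
            if half > 1:
                solve(qtop, qleft, half, *sub_hole)

    solve(0, 0, n, hole[0], hole[1])
    return board


for row in tile(4, (1, 2)):
    print(" ".join(f"{cell:2d}" for cell in row))
```

Each call to solve places exactly one tromino and does a constant amount of work besides its four subcalls, so the number of calls equals the number of trominoes, (n^2 - 1)/3.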

Proving this recursive Fibonacci implementation runs in time O(2^n)?

I'm having difficulty proving that the 'bad' version of fibonacci is O(2^n).
Ie.
Given the function
int fib(int x)
{
    if (x == 1 || x == 2)
    {
        return 1;
    }
    else
    {
        /* two recursive calls for every non-base argument */
        return fib(x - 1) + fib(x - 2);
    }
}
Can I get help for the proof of this being O(2^n).
Let's start off by writing a recurrence relation for the runtime:
T(1) = 1
T(2) = 1
T(n+2) = T(n) + T(n + 1) + 1
Now, let's take a guess that
T(n) ≤ 2^n
If we try to prove this by induction, the base cases check out:
T(1) = 1 ≤ 2 = 2^1
T(2) = 1 ≤ 4 = 2^2
Then, in the inductive step, we see this:
T(n + 2) = T(n) + T(n + 1) + 1
         ≤ 2^n + 2^(n+1) + 1
         < 2^(n+1) + 2^(n+1)
         = 2^(n+2)
Therefore, by induction, we can conclude that T(n) ≤ 2^n for any n, and therefore T(n) = O(2^n).
With a more precise analysis, you can prove that T(n) = 2F_n - 1, where F_n is the nth Fibonacci number. This proves, more accurately, that T(n) = Θ(φ^n), where φ is the Golden Ratio, which is approximately 1.618. Note that φ^n = o(2^n) (using little-o notation), so this is a much better bound.
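One way to carry out that more precise analysis is a second induction on the same recurrence. The derivation below is a sketch, assuming F_1 = F_2 = 1 and the recurrence T(1) = T(2) = 1, T(n+2) = T(n) + T(n+1) + 1 stated earlier in this answer:

```latex
\begin{align*}
\text{Base cases: } & T(1) = 1 = 2F_1 - 1, \qquad T(2) = 1 = 2F_2 - 1. \\
\text{Inductive step: } & T(n+2) = T(n) + T(n+1) + 1 \\
                        & \phantom{T(n+2)} = (2F_n - 1) + (2F_{n+1} - 1) + 1 \\
                        & \phantom{T(n+2)} = 2(F_n + F_{n+1}) - 1 \\
                        & \phantom{T(n+2)} = 2F_{n+2} - 1.
\end{align*}
```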
Hope this helps!
Try manually doing a few test cases like fib(5) and take note of how many times the function is called.
A fat hint would be to notice that every time fib() is called (except when x is 1 or 2), it calls itself twice. Each of those calls makes two more calls, and so on...
There's actually a pretty simple proof that the total number of calls to fib is going to be 2Fib(n)-1, where Fib(n) is the n'th Fibonacci number. It goes like this:
The set of calls to fib forms a binary tree, where each call is either a leaf (for x=1 or x=2) or else spawns two child calls (for x>2).
Each leaf contributes exactly 1 to the total returned by the original call, therefore there are Fib(n) total leaves.
The total number of internal nodes in any binary tree whose internal nodes all have exactly two children is equal to L-1, where L is the number of leaves, so the total number of nodes in this tree is 2L-1.
This shows that the running time (measured in terms of total calls to fib) is
T(n)=2Fib(n)-1=O(Fib(n))
and since Fib(n)=Θ(φ^n), where φ is the golden ratio
φ=(1+sqrt{5})/2 = 1.618...
this proves that T(n) = Θ(1.618...^n) = O(2^n).
Using the Recursion Tree Method:
              T(n)
            ↙      ↘
        n-1          n-2
       ↙   ↘        ↙   ↘
    n-2    n-3    n-3    n-4
Each tree level corresponds to the calls fib(x - 1) and fib(x - 2). If you complete the recursion tree in this manner, you stop when x = 1 or x = 2 (the base case); the tree above shows only the first three levels. To solve this tree you need two important pieces of information: 1- the height of the tree. 2- how much work is done at each level.
The height of this tree is at most n, level i contains at most 2^i calls, and each call does O(1) work, so the total work is at most 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1 = O(2^n).

The Recurrence W(n)= 2W(floor(n/2)) + 3

I have this recurrence:
W(n)= 2W(floor(n/2)) + 3
W(2)=2
My attempt is as follows.
The tree is like this:
W(n) = 2W(floor(n/2)) + 3
W(n/2) = 2W(floor(n/4)) + 3
W(n/4) = 2W(floor(n/8)) + 3
...
The height of the tree: I assume it's lg n because the tree has 2 branches at every expansion step, not sure though.
The cost of the last level: 2^lgn * W(2) = 2n
The cost of all levels up to level h-1: 3 * sigma from 0 to lgn-1 of (2^i), which is a geometric series = 3(n-1)
So, T(n) = 5n - 3, which belongs to Theta(n).
My question is: is that right?
I don't think it's exactly 5n-3 unless n is 2^t, but your theta is right. If you look at the Master Theorem, there is no need to calculate it (but it's good practice for a start):
Assume you have:
T(n) = aT(n/b) + f(n), where a>=1, b>1. Then:
If f(n) = O(n^(log_b(a) - eps)) for some eps > 0, then T(n) = Theta(n^(log_b(a))). This is your case, in which a=b=2 and f(n) = O(1).
If f(n) = Theta(n^(log_b(a)) * log^k(n)), then T(n) = Theta(n^(log_b(a)) * log^(k+1)(n)).
Otherwise T(n) is Theta(f(n)). (See the details of the constraints for this case in CLRS or wiki, ...)
For details see wiki.
Well, if you calculate W(4), you find W(4) = 2*W(2) + 3 = 2*2 + 3 = 7, but 5*4 - 3 = 17, so your result for T(n) is not correct. It is close, though, there's just a minor slip in your reasoning (or possibly in a certain other place).
Edit: To be specific, your calculation would work if W(1) was given, but it's W(2) in the question. Either the latter is a typo or you're off by one with the height. (and of course, what Saeed Amiri said.)
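The off-by-one can be checked by evaluating the recurrence directly. The sketch below restricts n to powers of two (an assumption, since the question only gives W(2) as a base case), and the closed form 5n/2 - 3 it compares against is derived here by unrolling with the tree stopping at W(2); it is not stated in the answers above.

```python
def W(n: int) -> int:
    """W(n) = 2*W(n // 2) + 3 with W(2) = 2, for n a power of two, n >= 2."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    if n == 2:
        return 2
    return 2 * W(n // 2) + 3


# Columns: n, W(n) from the recurrence, 5n/2 - 3, and the 5n - 3 from the question.
for t in range(1, 8):
    n = 2 ** t
    print(n, W(n), 5 * n // 2 - 3, 5 * n - 3)
```

Both expressions are linear in n, so the Theta(n) conclusion is unaffected; only the constant changes.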
