Binary Tree Line by Line Level Order Traversal Time Complexity

Here is the code for level-order traversal, line by line. How come the time complexity is O(n) and not O(n^2)?
def levelOrder(root):
    queue = [root]
    while len(queue):
        count = len(queue)
        while count:
            current = queue.pop(0)
            print(current.data, end='\t')
            if current.left:
                queue.append(current.left)
            if current.right:
                queue.append(current.right)
            count -= 1
        print()

Why should it be O(n^2)? Just because you have two nested loops doesn't mean the complexity is O(n^2). It all depends on what you are iterating over and what you are doing inside the loops.
If you trace the execution of the code, you'll see that every node in the tree is inserted into the queue and popped from it exactly once, and every iteration of the loops is productive (there are no iterations that do nothing). Therefore, the total number of iterations is bounded by N, the number of nodes in the tree, and the overall complexity is O(N).
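A quick way to convince yourself is to count the inner-loop iterations and compare against the node count. Here is a minimal sketch of that check (the Node class and the small test tree are my own illustration, not from the question):

class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def count_iterations(root):
    # Same loop structure as the code in the question, but counting
    # inner-loop iterations instead of printing.
    iterations = 0
    queue = [root]
    while len(queue):
        count = len(queue)
        while count:
            current = queue.pop(0)
            iterations += 1  # exactly one productive iteration per node
            if current.left:
                queue.append(current.left)
            if current.right:
                queue.append(current.right)
            count -= 1
    return iterations

# A 5-node tree: the inner loop runs exactly 5 times.
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
assert count_iterations(root) == 5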

No, this has O(N*L) complexity, where N is the number of nodes and L is the number of levels the tree has. Here is why:
Assume the tree has N nodes.
queue = [root]             | O(1)
while len(queue):          | number of levels the tree has: O(L)
    count = len(queue)     | O(1)
    while count:           | roughly the number of nodes left after processing the
                           | left and right subtrees of the current node:
                           | O(left subtree nodes) + O(right subtree nodes) => O(L+R) => O(N)
        count -= 1         | O(1)
In terms of an upper bound on the algorithm, this wraps up to O(N * L * 1) => O(N*L), where N is the number of nodes and L is the number of levels the tree has.

Related

Why is time complexity of recursive tree equal to number of leaf nodes instead of total number of nodes?

I'm looking at the time and space complexity of the simple recursive function below:
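The snippet itself is not preserved in the post; a hypothetical function matching the description (two recursive calls per invocation, recursion depth n) might look like this:

def f(n):
    # Hypothetical reconstruction, not the poster's actual code.
    if n <= 1:
        return 1                    # leaf call
    return f(n - 1) + f(n - 1)      # two recursive calls at every internal node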
It has time complexity O(2^n), which is the number of leaf nodes. But there is a function call at every node of the tree. Why is the time complexity equal to the number of leaf nodes and not the total number of nodes?
The tree has a depth of 5 and 16 leaf nodes; last time I checked, 2^5 is 32, not 16...
It's 2^n because there are 2^(n-1) + 2^(n-2) + ... + 2^0 nodes, which comes to exactly 2^n - 1 calls; discarding the -1 you get O(2^n).
It doesn't change the complexity whether you count only the leaves or all nodes.
A tree with 2^n leaves has 2^n - 1 non-leaf nodes, so
O(2^n + 2^n - 1) = O(2^n).

Solving while loop time complexity algorithm

I am having quite a hard time figuring out the time complexity of my algorithm. I know that the "for" portion of the algorithm will run in O(n), but I am unsure about my while loop. The problem involves creating a binary tree from a given vector. Each node is evaluated for its value and its index in the vector, so, essentially, every following node must be to the right of the previous one and, depending on whether its value is greater or smaller, it will be a child node or a parent node. The children of the parent nodes must be smaller in value.
I have used the while loop for the case where a child node is smaller than the next node to be placed, and I follow up through the parents until I find the spot for the new node to be placed. I believe this will run, in the worst case, k-1 times, k being the depth of the tree, but how would I represent this as a time complexity? O(kn)? Is that linear?
for(int i = 0; i < vecteur_source.size(); i++){
    if(i == 0){
        // do bla....
    } else if((vecteur_source.at(i) > vecteur_source.at(i-1)) && (m_map_index_noeud.at(i-1)->parent)){
        int v = m_map_index_noeud.at(i-1)->parent->index;
        while((vecteur_source.at(i) >= vecteur_source.at(v))){
            v = m_map_index_noeud.at(v)->parent->index;
        }
    }
}
Allow me to simplify this into pseudocode:
# I'm assuming this takes constant or linear time
do the thing for i = 0

for i ← 1 to n-1:
    if source[i] > source[i-1] and nodes[i-1] is not the root node:
        v ← nodes[i-1].parent.index
        while source[i] >= source[v]:
            v ← nodes[v].parent.index
If I've understood it properly from your code, then your analysis is correct: the outer loop iterates O(n) times, the inner loop iterates up to O(h) times where h is the height of the tree, so therefore the time complexity is O(nh).
This is not linear time, unless h is guaranteed to be at most a constant. More usually, h is O(log n) for balanced trees, or O(n) in the worst case for unbalanced trees.

Big O time complexity for two inserts in an array in a loop?

Each insert into a Python list is O(n), so for the snippet of code below, is the worst-case time complexity O(n + 2k) or O(nk), where k is the number of elements we move during the insert?
def bfs_binary_tree(root):
    queue = [root]
    result = []
    while queue:
        node = queue.pop()
        result.append(node.val)
        if node.left:
            queue.insert(0, node.left)
        if node.right:
            queue.insert(0, node.right)
    return result
I am using a Python list as a FIFO queue, but inserting an element at the start of the list has O(k) complexity, so I am trying to figure out the total complexity for n elements in the queue.
Since each node ends up in the queue at most once, the outer loop will execute n times (where n is the number of nodes in the tree).
Two inserts are performed during each iteration of the loop and these inserts will require size_of_queue + 1 steps.
So we have n steps and size_of_queue steps as the two variables of interest.
The question is: the size of the queue changes, so what is the overall runtime complexity?
Well, the size of the queue grows until it is full of leaf nodes; the number of leaf nodes is therefore an upper bound on the size of the queue, and the queue will never be larger than that.
Therefore, we know that the algorithm will never take more than n * leaf nodes steps. This is our upper bound.
So let's find out what the relationship between n and leaf_nodes is.
Note: I am assuming a balanced complete binary tree
The number of nodes at any level of a complete binary tree is 2^level (level 0 being the root). The max level of a tree is called its depth.
For example, a tree with a root and two children has 2 levels (0 and 1) and therefore has a depth of 1 and a height of 2.
The total number of nodes in the tree is 2^0 + 2^1 + ... + 2^depth = 2^(depth+1) - 1.
n=2^(depth+1)-1
We can also use this relationship to identify the depth of the balanced binary tree, given the total number of nodes:
If n=2^(depth+1) - 1
n + 1 = 2^(depth+1)
log(n+1) = depth+1 = number of levels, including the root. Subtract 1 to get the depth (i.e., the max level); in a balanced tree with 4 levels, level 3 is the max level because the root is level 0.
What do we have so far
number_of_nodes = 2^(depth+1) - 1
depth = log(number_of_nodes)
number_of_nodes_at_level_k = 2^k
What we need
A way to derive the number of leaf nodes.
Since the depth == last_level and since the number_of_nodes_at_level_k = 2^k, it follows that the number of nodes at the last level (the leaf nodes) = 2^depth
So: leaf_nodes = 2^depth
Your runtime complexity is n * leaf_nodes = n * 2^depth = n * 2^(log n) = n * n = n^2.
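For comparison, here is a sketch of the same traversal using collections.deque, whose append and popleft are both O(1); this removes the O(k) inserts and makes the whole traversal O(n). (This is my own illustration, not part of the original question.)

from collections import deque

def bfs_binary_tree_deque(root):
    # Same BFS order as above, but every queue operation is O(1),
    # unlike list.insert(0, ...), which is O(k).
    queue = deque([root])
    result = []
    while queue:
        node = queue.popleft()
        result.append(node.val)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return result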

Disjoint Set in a special ways?

We implement a disjoint-set data structure with trees. In this data structure, makeset() creates a set with one element, and merge(i, j) merges the trees of sets i and j in such a way that the tree with the lower height becomes a child of the root of the other tree. If we do n makeset() operations and n-1 merge() operations in a random manner, and then do one find operation, what is the cost of this find operation in the worst case?
I) O(n)
II) O(1)
III) O(n log n)
IV) O(log n)
Answer: IV.
Could anyone give a hint as to how the author arrived at this solution?
The O(log n) find is only true when you use union by rank (also known as weighted union). With this optimisation, we always place the tree with the lower rank under the root of the tree with the higher rank. If both have the same rank, we choose arbitrarily, but increase the rank of the resulting tree by one. This gives an O(log n) bound on the depth of the tree. We can prove this by showing that a node that is i levels below the root (equivalent to being in a tree of rank >= i) is in a tree of at least 2^i nodes (this is the same as showing that a tree of n nodes has depth at most log n). This is easily done with induction.
Induction hypothesis: tree size is >= 2^j for all j < i.
Case i = 0: the node is the root, and the size is 1 = 2^0.
Case i + 1: the length of a path is i + 1 if it was i and the tree was then placed underneath another tree. By the induction hypothesis, the node was in a tree of size >= 2^i at that time. It is being placed under another tree, which by our merge rules means that tree has at least rank i as well, and therefore also had >= 2^i nodes. The new tree therefore has >= 2^i + 2^i = 2^(i+1) nodes.
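For concreteness, here is a minimal sketch of a disjoint-set forest with union by rank; the class and method names are my own illustration, not from the question.

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # rank: an upper bound on tree height

    def find(self, x):
        # Walk up to the root: O(log n), because union by rank
        # keeps every tree's height at most log2(n).
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def merge(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return
        # Attach the lower-rank tree under the higher-rank root.
        if self.rank[ri] < self.rank[rj]:
            self.parent[ri] = rj
        elif self.rank[ri] > self.rank[rj]:
            self.parent[rj] = ri
        else:
            # Equal ranks: choose arbitrarily, bump the new root's rank.
            self.parent[rj] = ri
            self.rank[ri] += 1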

Heap-sort time complexity deep understanding

When I studied the Data Structures course in the university, I learned the following axioms:
Insertion of a new number into the heap takes O(log n) in the worst case (depending on how high in the tree it reaches when inserted as a leaf).
Building a heap of n nodes, using n insertions starting from an empty heap, sums to O(n) time, using amortized analysis.
Removal of the minimum takes O(log n) time in the worst case (depending on how low the new top node reaches after it is swapped with the last leaf).
Removal of all the minimums one by one, until the heap is empty, takes O(n log n) time.
Reminder: the steps of the heapsort algorithm are:
Add all the array values to a heap: sums to O(n) time using the amortized-analysis trick.
Pop the minimum out of the heap n times and place the i-th value in the i-th index of the array: O(n log n) time, as the amortized-analysis trick does not work when popping the minimum.
My question is: why does the amortized-analysis trick not work when emptying the heap, causing heapsort to take O(n log n) time rather than O(n)?
Edit/Answer
When the heap is stored in an array (rather than as dynamic tree nodes with pointers), we can build the heap bottom-up, i.e., starting from the leaves and moving up to the root; amortized analysis then gives a total time complexity of O(n). There is no analogous bottom-up way to extract the minima when emptying the heap.
Assuming you're only allowed to learn about the relative ranking of two objects by comparing them, then there's no way to dequeue all elements from a binary heap in time O(n). If you could do this, then you could sort a list in time O(n) by building a heap in time O(n) and then dequeuing everything in time O(n). However, the sorting lower bound says that comparison sorts, in order to be correct, must have a runtime of Ω(n log n) on average. In other words, you can't dequeue from a heap too quickly or you'd break the sorting barrier.
There's also the question about why dequeuing n elements from a binary heap takes time O(n log n) and not something faster. This is a bit tricky to show, but here's the basic idea. Consider the first half of the dequeues you make on the heap. Look at the values that actually got dequeued and think about where they were in the heap to begin with. Excluding the ones on the bottom row, everything else that was dequeued had to percolate up to the top of the heap one swap at a time in order to be removed. You can show that there are enough elements in the heap to guarantee that this alone takes time Ω(n log n) because roughly half of those nodes will be deep in the tree. This explains why the amortized argument doesn't work - you're constantly pulling deep nodes up the heap, so the total distance the nodes have to travel is large. Compare this to the heapify operation, where most nodes travel very little distance.
Let me show you "mathematically" how we can compute the complexity of transforming an arbitrary array into a heap (let me call this "heap build") and then sorting it with heapsort.
Heap build time analysis
In order to transform the array into a heap, we have to look at each node with children and "heapify" (sink) that node. You should ask yourself how many compares we perform; if you think about it, you see that (with h = tree height):
For each node at level i, we make h-i compares: #comparesOneNode(i) = h-i
At level i, we have 2^i nodes: #nodes(i) = 2^i
So, generally, T(n,i) = #nodes(i) * #comparesOneNode(i) = 2^i * (h-i) is the time spent on compares at level i.
Let's make an example. Suppose we have an array of 15 elements, i.e., the height of the tree is h = floor(log2(15)) = 3:
At level i=3, we have 2^3=8 nodes and we make 3-3 compares for each node: correct, since at level 3 we have only nodes without children, i.e., leaves. T(n, 3) = 2^3*(3-3) = 0
At level i=2, we have 2^2=4 nodes and we make 3-2 compares for each node: correct, since at level 2 we have only level 3 with which we can compare. T(n, 2) = 2^2*(3-2) = 4 * 1
At level i=1, we have 2^1=2 nodes and we make 3-1 compares for each node: T(n, 1) = 2^1*(3-1) = 2 * 2
At level i=0, we have 2^0=1 node, the root, and we make 3-0 compares: T(n, 0) = 2^0*(3-0) = 1 * 3
Ok, generally:
T(n) = sum(i=0 to h) 2^i * (h-i)
but if you remember that h = log2(n), we have
T(n) = sum(i=0 to log2(n)) 2^i * (log2(n) - i) =~ 2n
(Substituting j = h-i turns the sum into 2^h * sum(j=0 to h) j/2^j; since sum j/2^j converges to 2, the total is about 2 * 2^h =~ 2n.)
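As a concrete sketch of what is being counted (my own illustration, not the poster's code), a bottom-up build of a max-heap looks like this:

def sink(a, i, n):
    # Sink a[i] down until the heap property holds in a[0:n];
    # each swap corresponds to descending one level.
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and a[child + 1] > a[child]:
            child += 1                   # pick the larger child
        if a[i] >= a[child]:
            break
        a[i], a[child] = a[child], a[i]
        i = child

def build_heap(a):
    # Heapify every node with children, from the last internal node
    # up to the root: a node at level i sinks at most h-i levels,
    # which is exactly the sum computed above.
    for i in range(len(a) // 2 - 1, -1, -1):
        sink(a, i, len(a))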
Heapsort time analysis
Now the analysis is really similar. Every time we "remove" the max element (the root), we move the last leaf in the tree to the root, heapify it, and repeat until the heap is empty. So, how many compares do we perform here?
At level i, we have 2^i nodes: #nodes(i) = 2^i
For each node at level i, heapify will, in the worst case, do a number of compares exactly equal to the level i (we take one node from level i, move it to the root, call heapify, and heapify in the worst case brings the node back down to level i, performing i compares): #comparesOneNode(i) = i
So, generally, T(n,i) = #nodes(i) * #comparesOneNode(i) = 2^i * i is the time spent removing 2^i roots and sinking the temporary roots back to their correct positions.
Let's make an example. Suppose again we have an array of 15 elements, i.e., h = floor(log2(15)) = 3:
At level i=3, we have 2^3=8 nodes; each of them ends up at the root and is then heapified. Each heapify performs, in the worst case, i compares, because the root could sink back down to the still-existing level i. T(n, 3) = 2^3 * 3 = 8*3
At level i=2, we have 2^2=4 nodes and we make 2 compares for each node: T(n, 2) = 2^2*2 = 4 * 2
At level i=1, we have 2^1=2 nodes and we make 1 compare for each node: T(n, 1) = 2^1*1 = 2 * 1
At level i=0, we have 2^0=1 node, the root, and we make 0 compares: T(n, 0) = 0
Ok, generally:
T(n) = sum(i=0 to h) 2^i * i
but if you remember that h = log2(n), we have
T(n) = sum(i=0 to log2(n)) 2^i * i =~ 2n*log2(n)
(The sum evaluates to (h-1) * 2^(h+1) + 2, which with h = log2(n) is roughly 2n*log2(n).)
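Continuing the sketch above (again, my own illustration), heapsort then repeatedly swaps the root with the last element and sinks the new root:

def heapsort(a):
    build_heap(a)                      # O(n), as computed above
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]    # move the current max to its final slot
        sink(a, 0, end)                # worst case: sinks about log2(end) levels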
Heap build vs. heapsort
Intuitively, you can see that heapsort cannot "amortize" its cost: as the level grows, both the number of nodes and the compares per node grow, while in heap build it is exactly the opposite. You can see it here:
Heap build: T(n, i) ~ 2^i * (h-i); as i increases, #nodes increases but #compares decreases
Heapsort: T(n, i) ~ 2^i * i; as i increases, #nodes increases and #compares increases
So:
Level i=3, #nodes(3)=8, Heap build does 0 compares, heapsort does 8*3 = 24 compares
Level i=2, #nodes(2)=4, Heap build does 4 compares, heapsort does 4*2 = 8 compares
Level i=1, #nodes(1)=2, Heap build does 4 compares, heapsort does 2*1 = 2 compares
Level i=0, #nodes(0)=1, Heap build does 3 compares, heapsort does 1*0 = 0 compares
