AVL tree worst case number of rotations during insertion and deletion - data-structures

In an AVL tree, what is the worst-case number of rotations during the insertion and deletion of n elements?
I think for insertion it should be O(n) and for deletion it should be O(n log n). However, I am not that sure about deletion.
Am I correct?

The two operations differ. When deleting a node x, there are cases that require a rotation at every node on the path from x up to the root; since the height of a tree with n nodes is O(log n), a single deletion takes O(log n) rotations in the worst case, and n deletions take O(n log n). Insertion is cheaper: the first rotation (single or double) restores the affected subtree to its pre-insertion height, so no ancestor above it ever needs rebalancing, and each insertion performs at most one rotation. For n insertions that gives O(n) rotations in total, so your figures are correct.
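To see why counting rotations is the whole story, note that each individual rotation is constant-time pointer surgery. Here is a minimal sketch in Python (the Node class and function name are mine, for illustration only):

    class Node:
        def __init__(self, key, left=None, right=None):
            self.key = key
            self.left = left
            self.right = right

    def rotate_right(y):
        """Right-rotate the subtree rooted at y; returns the new subtree root.
        Only a constant number of pointers change, so each rotation is O(1)."""
        x = y.left
        y.left = x.right   # x's right subtree becomes y's left subtree
        x.right = y        # y moves down to become x's right child
        return x

So O(log n) rotations per deletion means O(log n) rebalancing work per deletion, and one rotation per insertion means O(1) rebalancing work per insertion.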

Related

Asymptotic running time insertion and searching in AVL

I am learning about AVL trees. AVL trees are Binary Search Trees that balance themselves through rotations. Because they are balanced, the query time is O(log n). But the order in which the entries are added is also important to avoid the worst-case O(log n) rotations per insertion.
What is the asymptotic running time of:
(a) adding n entries with consecutive keys to an AVL tree (the total time to insert all of them, not the time per insertion);
(b) searching for a key that is not in the tree.
What I understand is this: the height is O(log N), so insertion into an AVL tree has a worst case of O(log N). Searching an AVL tree is completely unchanged from a plain BST, and so also takes time proportional to the height of the tree, i.e. O(log N).
Is it correct?
Insertion into an AVL tree requires at most one rotation, not O(log n) rotations (two if you count the two halves of a double rotation individually). Asymptotically, the order of insertion does not matter, since a rotation takes constant time.
(a) With n insertions, the cost is n × (finding the proper place to insert + creating and linking the node + a rotation if needed) = n × (O(log n) + O(1) + O(1)) = O(n log n).
(b) Searching for an element is O(log n), since the tree is balanced.
(c) Deleting a single element requires at most O(log n) rotations, so the complexity of a deletion is also O(log n).
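To make the cost breakdown above concrete, here is a compact textbook-style recursive AVL insertion sketch in Python (the class and helper names are my own, not from the question). The descent costs O(log n); on the way back up, balance is checked at each ancestor, but for an insertion at most one node ever rotates, because the rotation restores that subtree's original height:

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None
            self.height = 1

    def height(node):
        return node.height if node else 0

    def update_height(node):
        node.height = 1 + max(height(node.left), height(node.right))

    def balance(node):
        return height(node.left) - height(node.right)

    def rotate_right(y):
        x = y.left
        y.left, x.right = x.right, y
        update_height(y)
        update_height(x)
        return x

    def rotate_left(x):
        y = x.right
        x.right, y.left = y.left, x
        update_height(x)
        update_height(y)
        return y

    def insert(root, key):
        # O(log n) descent to the insertion point.
        if root is None:
            return Node(key)
        if key < root.key:
            root.left = insert(root.left, key)
        else:
            root.right = insert(root.right, key)
        update_height(root)
        b = balance(root)
        # At most one of these cases fires per insertion; each is O(1).
        if b > 1 and key < root.left.key:        # left-left
            return rotate_right(root)
        if b > 1:                                # left-right
            root.left = rotate_left(root.left)
            return rotate_right(root)
        if b < -1 and key >= root.right.key:     # right-right
            return rotate_left(root)
        if b < -1:                               # right-left
            root.right = rotate_right(root.right)
            return rotate_left(root)
        return root

Inserting n consecutive keys with this routine therefore costs n · O(log n) = O(n log n) in total, matching part (a).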

Is building a BST with N given elements O(n lg n)?

What would be the worst-case time complexity to build a binary search tree from N arbitrary given elements?
I think there is a difference between being given all N elements up front and having the elements arrive one by one while building a BST of N elements in total.
In the former case it is O(n log n), and in the second it is O(n^2). Am I right?
If the binary search tree (BST) is not self-balancing, the worst-case time complexity is O(n^2). Generally a BST is built by repeated insertion, so the worst case will be O(n^2). But if you can sort the input first (in O(n log n)), the tree can be built in O(n), for an overall complexity of O(n log n).
If the BST is self-balancing, then the worst-case time complexity is O(n log n) even with repeated insertion.
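As a sketch of the sort-then-build route (the function names here are mine, for illustration): choose the middle element of the sorted range as the root and recurse on the two halves. Each element is visited exactly once, so after the O(n log n) sort the build itself is O(n).

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def build_balanced(keys, lo=0, hi=None):
        """Build a height-balanced BST from sorted keys[lo:hi] in O(n)."""
        if hi is None:
            hi = len(keys)
        if lo >= hi:
            return None
        mid = (lo + hi) // 2
        node = Node(keys[mid])
        node.left = build_balanced(keys, lo, mid)
        node.right = build_balanced(keys, mid + 1, hi)
        return node

    root = build_balanced(sorted([7, 3, 9, 1, 5]))   # O(n log n) sort + O(n) build

Note the index arguments rather than list slicing: slicing would copy subranges at every level and push the build back up to O(n log n).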

Time complexity to find an element if it is already present in a binary heap?

What will be the time complexity to find an element if it is already in a binary heap?
I think traversal operations are not possible in heap trees!
In the worst case, one would end up traversing the entire binary heap to search for an element, and therefore the time complexity would be O(N), i.e. linear in the number of elements in the binary heap.
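A sketch of that worst case, using the common array representation of a binary heap (as in Python's heapq module, where the heap is just a list; the helper name is mine): the heap property relates parents to children but says nothing about where an arbitrary key lives, so an equality search has to scan.

    import heapq

    heap = [9, 5, 8, 1, 7, 3]
    heapq.heapify(heap)          # rearrange the list into a min-heap

    def heap_search(heap, key):
        """O(N): heap order gives no guidance for an equality search."""
        return any(item == key for item in heap)

    print(heap_search(heap, 7))  # True
    print(heap_search(heap, 4))  # False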
Binary heaps are not meant to be used as a search data structure. They are used for implementing priority queues, and they handle the operations that are used frequently in better than linear time:
insert: O(log n)
delete: O(log n)
increase_key: O(log n)
decrease_key: O(log n)
extract_max or extract_min: O(log n)
If you want the search operation to always be O(log n), use balanced search trees such as AVL or red-black trees.

Complexity of balancing an unbalanced/partially balanced BST?

In an AVL tree, it takes a constant number of single and double rotations every time we rebalance on insertion or deletion, since we only have to check the path from the point of insertion or deletion to the root.
If we had an unbalanced tree, we would have to check if every possible node is balanced, so it would cost O(n) to rebalance an unbalanced tree. Is this correct?
It does take O(n) time to rebalance an unbalanced tree, but not for the reason you mentioned. In an AVL tree, a deletion may require Θ(log n) rotations, and rebuilding the tree by reinserting each element costs O(log n) per insertion. Fixing the tree that way could therefore take O(n log n) time, since you might do O(log n) work for each of the n nodes.
However, using other algorithms, you can rebalance a tree in O(n) time. One simple option is to do an inorder traversal of the tree to get the elements in sorted order, then reconstruct an optimal BST from those elements by recursively building the tree bottom-up. Alternatively, you can use the Day-Stout-Warren algorithm, which balances any tree in O(n) time and O(1) space.
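Here is a minimal sketch of the first option (inorder flatten plus recursive rebuild; the Node class and function names are illustrative). The traversal emits the keys in sorted order in O(n), and the rebuild visits each key once, so the whole rebalance is O(n) time, at the cost of O(n) extra space for the key list:

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def inorder(node, out):
        """Append the keys of a BST in sorted order: O(n)."""
        if node is not None:
            inorder(node.left, out)
            out.append(node.key)
            inorder(node.right, out)

    def rebuild(keys, lo, hi):
        """Rebuild a height-balanced BST from sorted keys[lo:hi]: O(n)."""
        if lo >= hi:
            return None
        mid = (lo + hi) // 2
        node = Node(keys[mid])
        node.left = rebuild(keys, lo, mid)
        node.right = rebuild(keys, mid + 1, hi)
        return node

    def rebalance(root):
        keys = []
        inorder(root, keys)
        return rebuild(keys, 0, len(keys))

Day-Stout-Warren achieves the same O(n) time but rebalances in place with rotations, which is how it gets the extra space down to O(1).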
Hope this helps!

In Big-O notation for tree structures: why do some sources refer to O(log N) and some to O(h)?

In researching complexity for any algorithm that traverses a binary search tree, I see two different ways to express the same thing:
Version #1: The traversal algorithm at worst case makes one comparison per level of the tree; therefore complexity is O(h).
Version #2: The traversal algorithm at worst case makes one comparison per level of the tree; therefore complexity is O(log N).
It seems to me that the same logic is at work, yet different authors use either logN or h. Can someone explain to me why this is the case?
The correct bound for the worst-case time to search a tree is O(h), where h is the height of the tree. If you are using a balanced search tree (one where the height of the tree is O(log n)), then the lookup time is worst-case O(log n). That said, not all trees are balanced. For example, here's a tree with height n - 1:
1
 \
  2
   \
    3
     \
      ...
       \
        n
Here, h = O(n), so the lookup is O(n). It's correct to say that the lookup time is also O(h), but h ≠ O(log n) in this case and it would be erroneous to claim that the lookup time was O(log n).
In short, O(h) is the correct bound. O(log n) is the correct bound in a balanced search tree when the height is at most O(log n), but not all trees have lookup time O(log n) because they may be imbalanced.
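A quick sketch of how such a degenerate tree arises (plain BST insertion with no rebalancing; the names are mine): inserting keys in sorted order hangs every node off the right spine, so h = n - 1 and lookups cost O(n).

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def bst_insert(root, key):
        """Plain BST insertion, no rebalancing."""
        if root is None:
            return Node(key)
        if key < root.key:
            root.left = bst_insert(root.left, key)
        else:
            root.right = bst_insert(root.right, key)
        return root

    def tree_height(node):
        return -1 if node is None else 1 + max(tree_height(node.left),
                                               tree_height(node.right))

    root = None
    for key in range(1, 101):     # sorted insertion order: the worst case
        root = bst_insert(root, key)
    print(tree_height(root))      # 99, i.e. h = n - 1, so lookups are O(n)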
Hope this helps!
If your binary tree is balanced so that every node has exactly two child nodes, then the number of nodes in the tree will be exactly N = 2^h - 1, so the height is the logarithm of the number of elements (and similarly for any complete n-ary tree).
An arbitrary, unconstrained tree may look totally different, though; for instance, it could just have one node at every level, so N = h in that case. So the height is the more general measure, as it relates to actual comparisons, but under the additional assumption of balance you can express the height as the logarithm of the number of elements.
O(h) would refer to a binary tree that is sorted but not balanced
O(logn) would refer to a tree that is sorted and balanced
It's sort of two ways of saying the same thing, because your average balanced binary tree of height 'h' will have around 2^h nodes.
Depending on the context, either height or #nodes may be more relevant, and so that's what you'll see referenced.
because the (h)eight of a balanced tree varies as the log of the (N)umber of elements
