How many balance checks are needed in an AVL tree insertion/deletion algorithm?

I've been reading the literature on AVL trees and found that it doesn't elaborate much on how many balance checks are needed during an AVL tree insertion or deletion.
For example, after inserting a node, do we need to check balance from the new node all the way up to the root? Or can we stop once a rotation (or pair of rotations) has been performed?
How about a deletion that uses the strategy of copying the rightmost node of the left subtree? Do we check upward from the removed node (the rightmost node in the left subtree) to the root, or can we stop once a rotation has been performed?

After an insertion, you may need to update the balance factor of each ancestor on the path up to the root, so that's at most O(log n) updates. But only a single restructuring (a single or double rotation) is needed to restore the tree's invariants.
After a deletion, as with insertion, you may have to update balance factors all the way up the tree, so again O(log n) updates. But unlike insertion, multiple restructuring rotations may be needed to restore the tree's invariants.
http://en.wikipedia.org/wiki/AVL_tree

I've dug a bit deeper and found when you can stop checking:
When an ancestor of the inserted node ends up with a balance factor of 0 (the insertion filled its shorter subtree, so its height is unchanged).
After a rotation during insertion. This is a consequence of the previous statement: the rotation restores the subtree to its pre-insertion height.
http://www.superstarcoders.com/blogs/posts/efficient-avl-tree-in-c-sharp.aspx
http://www.eternallyconfuzzled.com/tuts/datastructures/jsw_tut_avl.aspx
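To make the two stopping rules concrete, here is a minimal sketch of an AVL insert in the style of Wirth's classic algorithm, returning a "grew" flag alongside each subtree root. All names (`Node`, `rebalance`, `insert`) are my own, not from any of the linked posts:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.bf = 0  # balance factor: height(right) - height(left)

def rotate_left(a):
    b = a.right
    a.right = b.left
    b.left = a
    return b

def rotate_right(a):
    b = a.left
    a.left = b.right
    b.right = a
    return b

def rebalance(a):
    # Called when |a.bf| == 2 immediately after an insertion below a
    # (so the taller child's bf is +1 or -1, never 0). One single or
    # double rotation restores the height the subtree had before the
    # insert, which is why no ancestor needs any further checks.
    if a.bf == 2:
        b = a.right
        if b.bf == 1:                     # right-right: single left rotation
            a.bf = b.bf = 0
            return rotate_left(a)
        c = b.left                        # right-left: double rotation
        a.right = rotate_right(b)
        a.bf = -1 if c.bf == 1 else 0
        b.bf = 1 if c.bf == -1 else 0
        c.bf = 0
        return rotate_left(a)
    else:                                 # a.bf == -2, mirror image
        b = a.left
        if b.bf == -1:                    # left-left: single right rotation
            a.bf = b.bf = 0
            return rotate_right(a)
        c = b.right                       # left-right: double rotation
        a.left = rotate_left(b)
        a.bf = 1 if c.bf == -1 else 0
        b.bf = -1 if c.bf == 1 else 0
        c.bf = 0
        return rotate_right(a)

def insert(node, key):
    # Returns (subtree root, grew); grew is True iff the subtree became
    # taller. Once grew is False at some level, no ancestor's balance
    # factor changes, so the walk toward the root stops early.
    if node is None:
        return Node(key), True
    if key == node.key:
        return node, False
    if key < node.key:
        node.left, grew = insert(node.left, key)
        if not grew:
            return node, False            # stop: nothing above changes
        node.bf -= 1
    else:
        node.right, grew = insert(node.right, key)
        if not grew:
            return node, False
        node.bf += 1
    if node.bf == 0:
        # The insert filled the shorter side; height is unchanged, so
        # every ancestor's balance factor is already correct (rule 1).
        return node, False
    if node.bf in (-1, 1):
        return node, True                 # taller: the parent must be checked
    # |bf| == 2: one rotation restores the pre-insert height, after
    # which the walk stops (rule 2).
    return rebalance(node), False
```

In this formulation the two stopping rules are the two `return node, False` branches after the balance-factor update: reaching bf == 0 and having just rebalanced both mean the subtree's height is unchanged.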

Related

How many nodes in an AVL tree change depth after a rotation

When adding or deleting a node in an AVL tree, rebalancing might occur. I can understand how there can be O(log(n)) rebalancing operations, but when those rotations occur to balance the tree, how many nodes actually change level? I can't seem to find this anywhere. I thought it was O(log(n)) but can't figure out why. Help would be greatly appreciated.
The answer is O(n).
Suppose that each node had a "depth" field; how much would it cost to maintain it?
There is a theorem: if the information in field F of node N depends solely on its direct children, then it can be maintained in logarithmic time during updates (insertions and deletions).
(The theorem can be proved by induction.)
The "depth" field doesn't depend on a node's children - it depends on its parent.
Note, however, that the theorem only goes one way: it says when a field can be maintained in logarithmic time, but not when it cannot. Therefore it can't be said with certainty that the "depth" field can be maintained in logarithmic time (the way height or balance-factor fields can be), and indeed an insertion can change the depth of O(n) nodes:
In the first rotation the depths of t2 and t4 change, and in the second, the depths of t1, t2, and t3!
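The O(n) behavior can be checked directly. A minimal self-contained demonstration (the perfectly balanced 127-node tree and the single rotation at the root are my own illustrative choices, not from the question):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def build(lo, hi):
    # perfectly balanced BST on the keys lo..hi
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return Node(mid, build(lo, mid - 1), build(mid + 1, hi))

def depths(node, d=0, out=None):
    # map key -> depth for every node in the tree
    if out is None:
        out = {}
    if node is not None:
        out[node.key] = d
        depths(node.left, d + 1, out)
        depths(node.right, d + 1, out)
    return out

def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    return x

root = build(1, 127)           # 127 nodes, height 7
before = depths(root)
root = rotate_right(root)      # one single rotation at the root
after = depths(root)
changed = sum(1 for k in before if before[k] != after[k])
# The pivot and its left subtree (32 nodes) move up one level, and the
# old root and its right subtree (64 nodes) move down one level, so
# 96 of the 127 nodes change depth - a constant fraction of n.
```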

Graph-heap-based implementation of priority queue

The OCaml reference manual provides an example of a priority queue implementation.
It's a graph-based (pointer-based) implementation of a heap. I say 'heap' because each node has 0, 1, or 2 children and each parent is less than or equal to its children. However, it's not a 'binary heap', because the insertion algorithm doesn't force the leaves to be left-aligned (as the Wikipedia definition requires), so the tree isn't complete.
My intuition is that the tree is balanced, though, because on each insertion the left subtree becomes the new right subtree, while the previous right subtree receives the new node and becomes the new left subtree; the next insertion swaps them again and inserts into the other side.
So the depth of the left subtree never differs from the depth of the right subtree by more than 1, and the tree is balanced. Hence we should never end up with a tree degenerated into a linked list, and the worst-case complexity should remain O(log n), while the insertion algorithm is much simpler, since it doesn't have to keep the tree complete (only balanced).
Is my intuition correct here? I did some research and didn't find this algorithm anywhere else (most algorithms focus on the array-based implementation, which obviously requires a complete tree, otherwise some slots would be 'invalid').
Thanks
You are correct about the way the heap maintains balance during inserts.
The removeMin operation, however, can disturb the balance, because, for example, all the elements on the left can be smaller than all the elements on the right, so removals keep descending into the same side. There is nothing to restore the balance, and so the balance may be lost.
So this heap does not provide an O(log N) guarantee if N is the size of the heap. It does if N is the total number of inserts, though, and that's not too bad: it doesn't hurt the complexity of most algorithms that use heaps.
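A sketch of the two operations in Python (a translation of the idea in the OCaml manual's PrioQueue example, with my own names; tuples stand in for the OCaml variant type):

```python
# An empty queue is None; a node is a (value, left, right) triple.

def insert(q, x):
    # The larger of x and the root is pushed down into the RIGHT child,
    # and the children are then swapped. Successive inserts therefore
    # alternate between the two subtrees, keeping their sizes within
    # one of each other - this is the insert-time balance the asker
    # describes.
    if q is None:
        return (x, None, None)
    v, left, right = q
    if x <= v:
        return (x, insert(right, v), left)
    return (v, insert(right, x), left)

def remove_min(q):
    # Returns (minimum, remaining queue). The hole left by the root is
    # filled by repeatedly promoting the smaller child, which can keep
    # descending into the same side - this step has no counterpart to
    # the insert-time swap, so balance can be lost over many removals.
    v, _, _ = q
    return v, _remove_top(q)

def _remove_top(q):
    _, left, right = q
    if right is None:
        return left
    if left is None:
        return right
    if left[0] <= right[0]:
        return (left[0], _remove_top(left), right)
    return (right[0], left, _remove_top(right))
```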

Delete a subtree from a BST and balance the tree in O(log n) time

Is it possible to perform m insert and delete operations on a balanced binary search tree, where a delete operation removes a node and the whole subtree below it and then rebalances the tree, with the whole process done in amortized O(log n) time per operation?
Short answer: yes, it is possible.
What you are describing is a self-balancing binary search tree, like an AVL tree or a red-black tree. Both take O(log n) for deletion, which includes restructuring the nodes. Here is a link to a page describing such trees and how they work in much more detail than I can, including illustrations. You can also check out the Wikipedia page on AVL trees; it has a decent explanation as well as animations of the insertions. Here is a short version of what you were most interested in:
Deletions in an AVL tree take O(log n), and the rebalancing may take up to O(log n) rotations in the worst case (each individual rotation is O(1)). This is done by doing rotations, again well explained in both sources.
The Wikipedia page also includes some code, if you need to implement it.
EDIT:
For removing a subtree, you will still be able to do the same thing. Here is a link to a very good explanation of this. Short version: detaching the subtree can be done in O(log n) (keep in mind that the deletion, regardless of the number of nodes deleted, is still O(log n) as long as you do not rebalance along the way), and then the tree rebalances itself using rotations. This can also change the root of your tree. Removing a whole subtree will of course create a bigger height difference than deleting a single node at the bottom of the tree. Still, the tree can rebalance itself using rotations, by finding the first unbalanced node and applying the AVL rebalancing scheme. Because only rotations are used, this should all still be O(log n). Here you will find how the tree rebalances itself after a deletion that creates a height imbalance.

Insertion and deletion of nodes in Splay Trees

I have 2 questions regarding splay trees:
1. Deletion of a node
The book I am using says the following: "When deleting a key k, we splay the parent of the node w that gets removed." Example: deletion of 8:
However, what I am doing is this: if the deleted node is not the root, I splay it to the root, delete it, and then splay the rightmost node of the left subtree. But since in this case the deleted node is the root, I simply remove it and immediately splay the rightmost node of the left subtree, like this:
Is this way also correct? Notice that it gives a totally different tree (my root is 7, not 6 as in my book).
2. In which order were the values in a splay tree inserted?
Is it possible to recover the order in which the values were inserted in the left tree example above? In other words, how was this tree made (in which order were the nodes inserted to generate it)? Is there a way to figure this out?
Re deleting a node: both algorithms are correct, and both take O(log n) amortized time. Splaying a node costs O(log n), and creating a new link near the root costs O(log n). Splay trees allow a lot of flexibility in how they are accessed and restructured.
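The asker's delete variant can be sketched as follows. This is a standard recursive splay plus splay-based insert with my own names, not code from the book in question:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y

def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y

def splay(root, key):
    # Brings key (or the last node on its search path) to the root via
    # zig-zig / zig-zag rotations, written recursively.
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                         # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                       # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:
        if root.right is None:
            return root
        if key > root.right.key:                        # zag-zag
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:                      # zag-zig
            root.right.left = splay(root.right.left, key)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)

def insert(root, key):
    # splay-insert: splay key's position to the root, then split there
    if root is None:
        return Node(key)
    root = splay(root, key)
    if root.key == key:
        return root
    node = Node(key)
    if key < root.key:
        node.left, node.right = root.left, root
        root.left = None
    else:
        node.left, node.right = root, root.right
        root.right = None
    return node

def delete(root, key):
    # The asker's variant: splay key to the root, remove it, splay the
    # rightmost node of the left subtree up, attach the right subtree.
    if root is None:
        return None
    root = splay(root, key)
    if root.key != key:
        return root                   # key was not in the tree
    left, right = root.left, root.right
    if left is None:
        return right
    left = splay(left, key)           # key exceeds every key in left,
    left.right = right                # so this splays left's maximum up
    return left

def inorder(n):
    return inorder(n.left) + [n.key] + inorder(n.right) if n else []
```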
Re reconstructing the sequence of insertions: assuming that the insert method is the usual unbalanced insert followed by a splay, the root is the last insertion. Unfortunately, there are in general several ways it could have been splayed to the root. An asymptotic improvement on the obvious O(n! poly(n))-time brute-force algorithm is an exhaustive search with memoization, which costs O(4^n poly(n)).

Data Structure to maintain numbers

Please suggest a data structure to maintain numbers in such a way that I can answer the following queries:
Find(int n) - O(log(n))
Count the numbers less than k - O(log(n))
Insert - O(log(n))
It's not homework, but a smaller problem I encountered while solving a bigger one - counting the number of students with better grades and a lower JEE rank.
I have thought of an AVL tree that maintains the number of nodes in the subtree at each node, but I don't know how to maintain this count at each node while an insert and its rebalancing are being done.
I would also try an AVL tree. Without looking much deeper into it, I don't think this would be too hard to add. In an AVL tree you always need to know the height of each subtree at each node anyway (or at least the balance factor), so it should not be too hard to propagate the sizes of the subtrees as well. In the case of a rotation, you know exactly where each node and each subtree will land, so it is just a simple recalculation for the nodes that are rotated.
Finding in a balanced binary tree is O(log(n)), and so is inserting.
If you store the subtree size in each node:
when returning from a successful insert into a subtree, increment that node's counter;
on deletion, decrement it the same way.
So a subtree-size query is like a find: O(log(n)).
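Putting the two answers together, here is a sketch of an AVL tree augmented with subtree sizes; `count_less` answers the "how many numbers are less than k" query in O(log n). All names are my own, and the rotations recompute both height and size on the way back up, which is exactly the "simple recalculation" mentioned above:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1
        self.size = 1            # number of nodes in this subtree

def h(n): return n.height if n else 0
def s(n): return n.size if n else 0

def update(n):
    # the only maintenance needed: both fields depend only on the children
    n.height = 1 + max(h(n.left), h(n.right))
    n.size = 1 + s(n.left) + s(n.right)

def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y); update(x)         # recompute for the two rotated nodes
    return x

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x); update(y)
    return y

def balance(n):
    update(n)
    bf = h(n.left) - h(n.right)
    if bf > 1:                                     # left heavy
        if h(n.left.left) < h(n.left.right):
            n.left = rotate_left(n.left)           # left-right case
        return rotate_right(n)
    if bf < -1:                                    # right heavy
        if h(n.right.right) < h(n.right.left):
            n.right = rotate_right(n.right)        # right-left case
        return rotate_left(n)
    return n

def insert(n, key):
    if n is None:
        return Node(key)
    if key < n.key:
        n.left = insert(n.left, key)
    elif key > n.key:
        n.right = insert(n.right, key)
    else:
        return n                 # ignore duplicates
    return balance(n)

def count_less(n, k):
    # number of stored keys strictly less than k, in O(log n)
    if n is None:
        return 0
    if k <= n.key:
        return count_less(n.left, k)
    return s(n.left) + 1 + count_less(n.right, k)

def find(n, key):
    while n is not None:
        if key == n.key:
            return True
        n = n.left if key < n.key else n.right
    return False
```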
Have a look at the different variants of heap data structures, e.g. here.
