Keeping an AVL tree balanced without rotations - algorithm

A B-tree is a self-balancing tree, like an AVL tree. HERE we can see how left and right rotations are used to keep an AVL tree balanced.
And HERE is a link which explains insertion in a B-tree. This insertion technique does not involve any rotations, if I am not wrong, to keep the tree balanced, and therefore it looks simpler.
Question: Is there any similar technique (or any other technique that avoids rotations) to keep an AVL tree balanced?

The answer is... yes and no.
B-trees don't need to perform rotations because they have some slack with how many different keys they can pack into a node. As you add more and more keys into a B-tree, you can avoid the tree becoming lopsided by absorbing those keys into the nodes themselves.
Binary trees don't have this luxury. If you insert a key into a binary tree, it will increase the height of some branch in that tree by 1 in all cases because that key needs to go into its own node. Rotations combat the overall growth of the tree by ensuring that if certain branches grow too much, that height is shuffled into the rest of the tree.
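To make the mechanics concrete, here is a minimal sketch of the two rotations on a bare-bones binary node. The Node class and method names are illustrative assumptions for this answer, not any particular library's API:

// Minimal illustrative BST node; the field names are assumptions for this sketch.
class Node {
    int key;
    Node left, right;
    Node(int key) { this.key = key; }
}

// Right rotation: lifts the left child above the old subtree root.
// The left branch gets one level shorter and the right branch one level taller.
static Node rotateRight(Node root) {
    Node pivot = root.left;
    root.left = pivot.right;   // the pivot's inner subtree switches sides
    pivot.right = root;
    return pivot;              // the pivot is the new subtree root
}

// Left rotation is the mirror image.
static Node rotateLeft(Node root) {
    Node pivot = root.right;
    root.right = pivot.left;
    pivot.left = root;
    return pivot;
}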
Most balanced BSTs have some sort of rebalancing strategy that involves rotations, but not all do. One notable example of a strategy that doesn't directly involve rotations is the scapegoat tree, which rebalances by tearing huge subtrees out of the main tree, optimally rebuilding them, and then gluing the subtrees back into the main tree. This approach doesn't technically involve any rotations and is a pretty clean way to implement a balanced tree.
That said - the most space-efficient implementations of scapegoat trees do indeed use rotations to convert an imbalanced tree into a perfectly balanced one. You don't have to use rotations to do this, though if space is short it's probably the best way to do so.
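To illustrate the rebuild step, here is a hedged sketch of the rotation-free rebalance (reusing the illustrative Node class from the sketch above; this is a simplification, not a complete scapegoat tree): flatten the offending subtree into sorted order with an in-order walk, then rebuild it perfectly balanced by recursively taking the middle element as the root.

import java.util.ArrayList;
import java.util.List;

// Collect the subtree's nodes in sorted order via an in-order traversal.
static void flatten(Node node, List<Node> out) {
    if (node == null) return;
    flatten(node.left, out);
    out.add(node);
    flatten(node.right, out);
}

// Rebuild a perfectly balanced subtree: the middle node becomes the root.
static Node rebuild(List<Node> nodes, int lo, int hi) {
    if (lo > hi) return null;
    int mid = (lo + hi) / 2;
    Node root = nodes.get(mid);
    root.left = rebuild(nodes, lo, mid - 1);
    root.right = rebuild(nodes, mid + 1, hi);
    return root;
}

// No rotations anywhere: tear out, rebuild, and glue back the result.
static Node rebalance(Node subtreeRoot) {
    List<Node> nodes = new ArrayList<>();
    flatten(subtreeRoot, nodes);
    return rebuild(nodes, 0, nodes.size() - 1);
}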
Hope this helps!

Rotations can be made simple (if you need only simplicity).
If the insertion traffic is left, the balance -1 is the red-light.
If the insertion traffic is right, the balance 1 is the red-light.
This is a (simplified) coarse-graining (2-adic rounding) of the normalized fundamental AVL balance:
{left,even,right} ~ {low,even,high} ~ {green,green,red}
Walk the insertion route and rotate at every red-light (before the insertion). If the next light is green, you can just rotate the red-light once or twice. You may have to rebalance the next subtrees before each rotation, because the inner subtrees are invariant. This is simple, but it takes a very long time: you have to move the green-light down before each rotation. You can always move the green-light down to the root, and you can rotate the tree-top to generate a new green-light.
The red-light rotations naturally move down the green-light.
At this point, you have only the green-lights for the insertion.
The cost structure of this naive method is topologically simplified as

df(h)/dh = ∫ f(h) dh

with solutions such as sin(h), sinh(h), etc.

Related

Is it always possible to turn one BST into another using at most O(n) tree rotations?

This earlier question asks whether it's always possible to turn one BST for a set of values into another BST for the same set of values purely using tree rotations (the answer is yes). However, is it always possible to do this using at most O(n) total tree rotations?
Yes, it is always possible to turn one BST into another using at most O(n) tree rotations. This answer follows the same general approach as the other answer by picking some canonical tree shape T* and bounding the number of rotations needed to turn an arbitrary tree into our canonical tree. Then you can turn an arbitrary tree T₁ into another tree T₂ by transforming T₁ into T* and then transforming T* into T₂.
As suggested in comments, you can choose your canonical tree to be a degenerate linked list. For trees of n nodes, this upper bounds the number of rotations needed at 2n−2.
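As a sketch of that bound (reusing the illustrative two-field Node from the earlier sketch; the names are assumptions): repeatedly right-rotating at any spine node that still has a left child moves exactly one node onto the right spine per rotation, so a tree of n nodes becomes a linked list after at most n−1 rotations. Doing this to both trees and replaying one sequence in reverse gives the 2n−2 total.

// Convert a BST into a right spine (a sorted "linked list") using only
// right rotations; at most n - 1 rotations are performed in total.
static Node toRightSpine(Node root) {
    Node pseudoRoot = new Node(0);       // dummy node; its key is never used
    pseudoRoot.right = root;
    Node tail = pseudoRoot;
    while (tail.right != null) {
        Node current = tail.right;
        if (current.left != null) {      // right-rotate current around its left child
            Node pivot = current.left;
            current.left = pivot.right;
            pivot.right = current;
            tail.right = pivot;
        } else {
            tail = current;              // current is on the spine; move past it
        }
    }
    return pseudoRoot.right;
}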
In the paper Rotation Distance, Triangulations, and Hyperbolic Geometry, Daniel Sleator, Robert Tarjan, and William Thurston proved that the rotation distance between any two binary trees of n nodes is at most 2n−6 for n ≥ 11 (better than the bound we get when transforming into a linked list).
At a high level, they did this by introducing a way to represent any binary tree as a polygon triangulation, where a tree rotation has a corresponding triangulation operation. Then, instead of reasoning about binary trees in their usual representation, the paper picks a canonical triangulation and shows how to transform an arbitrary triangulation into their desired one.
The canonical triangulation they chose is one where all diagonals emanate from a single vertex in a fan-like shape, which ends up corresponding to a somewhat unintuitive binary tree shape (a generalization of linked lists that also includes diamond-shaped trees consisting of a root, a left child whose right child is a linked list, and a right child whose left child is a linked list).
It's a very cool technique that illustrates the power of isometries in data structures, showing how changing our representation can give us a new way of approaching a problem. Some friends and I recently put together a writeup walking through Sleator, Tarjan, and Thurston's proof if you would like to explore this in more detail.
Yes, this is always possible. I fear that the best I can do right now is give you a silly algorithm that proves it's possible, though I suspect that there must be a much better way to do this.
The Day-Stout-Warren algorithm is an algorithm that, starting with any BST, uses tree rotations to convert it to a perfectly balanced BST. It runs in time O(n) and does O(n) total rotations.
So suppose that you want to turn one tree T1 into another tree T2 using tree rotations. Run Day-Stout-Warren on both trees to convert them to the same balanced tree T*, and record the rotations that you needed to make in both cases. Then you can turn T1 into T2 by first running all the rotations needed to perfectly balance T1, then running the reverse of the rotations needed to turn T2 into a balanced tree. This turns T1 into T* and then turns T* into T2. Since the Day-Stout-Warren algorithm makes only O(n) total rotations, this too makes only O(n) total rotations.
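For reference, here is a hedged sketch of the second half of Day-Stout-Warren (the vine-to-tree phase), assuming the tree has already been flattened into a right spine as in the earlier sketch; each "compression" pass is just a row of left rotations along the spine:

// One compression pass: perform `count` left rotations along the right spine.
static void compress(Node pseudoRoot, int count) {
    Node scanner = pseudoRoot;
    for (int i = 0; i < count; i++) {
        Node child = scanner.right;
        scanner.right = child.right;   // left-rotate child under its right child
        scanner = scanner.right;
        child.right = scanner.left;
        scanner.left = child;
    }
}

// Vine-to-tree: turn a right spine of n nodes into a balanced BST
// using O(n) left rotations in total.
static Node vineToTree(Node spine, int n) {
    Node pseudoRoot = new Node(0);     // dummy; its key is never used
    pseudoRoot.right = spine;
    int leaves = n + 1 - Integer.highestOneBit(n + 1);
    compress(pseudoRoot, leaves);      // fill the bottom level of the final tree
    for (int size = n - leaves; size > 1; size /= 2) {
        compress(pseudoRoot, size / 2);
    }
    return pseudoRoot.right;
}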
I feel like there has to be a better way to do this, but I'm not sure off the top of my head how to achieve this. If I think of anything, I'll let you know!

Is a kd-tree always balanced?

I have used the kd-tree algorithm to build a tree, but I found that the tree is not balanced. My question is: if we use the kd-tree algorithm, is the resulting tree always balanced? If not, how can we make it balanced?
Can we use another algorithm, like AVL or Red-Black, for balancing a kd-tree?
Here is some sample data for which I used the kd-tree algorithm, but the resulting tree is not balanced:
(14,31), (15,32), (17,42), (16,44), (18,52), (16,62)
This is a fairly broad topic and the questions themselves are kind of general.
Hopefully this will give you some useful insights and material to work with:
A kd-tree is not always balanced.
AVL and Red-Black will not work with kd-trees; you will have to either construct a balanced variant such as the K-D-B-tree or use other balancing techniques.
Kd-trees are commonly used to store geospatial data because they let you search over more than one key, contrary to a 'traditional' tree, which only supports single-dimensional search. Geospatial data certainly cannot be represented in a single dimension.
Note that there are also specialized databases for working with geospatial data, so it might be worth checking whether the overhead could be shifted to them instead of building your own solution. Although I don't have much experience with this, it may be worth checking out PostGIS.
Here are some useful links showing how to build balanced K-D tree variant and usage of K-D trees with Spatial data:
balancing K-D-Tree
K-D-B-tree
spatial data k-d-trees
It depends on how you build the tree.
If built as originally published, the tree will be balanced, i.e. only at the leaf level will it have a height difference of at most 1. If your data set has 2^n-1 elements, the tree will be perfectly balanced.
When constructed with the median, half of the objects must be on either branch of the tree, so it has minimal height and is balanced.
However, such a tree cannot be changed afterwards. I am not aware of an insert or remove algorithm that would preserve this property, but YMMV. I bet there are two dozen kd-tree extensions that aim at rebalancing and at making insertions/deletions more efficient.
The kd-tree is not designed for changes and will quickly lose efficiency. It relies on the median, so any change to the tree would, in the worst case, propagate through all of the tree. Therefore, you need to allow some tolerance in tree quality to support changes. It appears to be a common approach to just keep track of insertions/deletions and eventually rebuild the tree. You cannot combine it with red-black or AVL trees, because data with more than one dimension is not ordered, and these trees only work for ordered data. Upon rotation of the tree, the splitting axis changes, and there may be elements in either half that would suddenly need to move to the other branch. This does not happen in AVL or red-black trees.
But as you can imagine, people have published several indexes that remain balanced, such as K-D-B-trees and R-trees. These also work better for large data sets that need to be stored on disk.
In order to make your kd-tree balanced, use the median value.
(14,31), (15,32), (17,42), (16,44), (18,52), (16,62)
At the root, choose the median of the x-coordinates [14,15,16,16,17,18], which is 16, so all the elements with x less than 16 go to the left part of the tree, and those greater than or equal to 16 go to the right side.
As of now, the left subtree consists of [14,31], [15,32]. Next, for the y-axis, find the median of [31,32], and keep alternating axes in the same way so that the tree is balanced.
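A hedged sketch of that construction (the KdNode class and build signature are made up for illustration): sort the current slice of points on the splitting axis, take the median as the node, and recurse with the axis alternating between x and y.

import java.util.Arrays;
import java.util.Comparator;

class KdNode {
    int[] point;          // {x, y}
    KdNode left, right;
    KdNode(int[] point) { this.point = point; }
}

// Build a balanced 2-d tree by always splitting at the median.
static KdNode build(int[][] points, int lo, int hi, int axis) {
    if (lo > hi) return null;
    // Sort this slice on the current axis and pick the middle point.
    Arrays.sort(points, lo, hi + 1, Comparator.comparingInt(p -> p[axis]));
    int mid = (lo + hi) / 2;
    KdNode node = new KdNode(points[mid]);
    node.left = build(points, lo, mid - 1, 1 - axis);   // smaller side of the median
    node.right = build(points, mid + 1, hi, 1 - axis);  // larger-or-equal side
    return node;
}

Running this on the six sample points picks 16 as the root's median x, matching the hand-worked split above.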

Splay tree rotation algorithm: Why use zig-zig and zig-zag instead of simpler rotations?

I don't quite understand why the rotation in the splay tree data structure takes into account not only the parent of the rotated node, but also the grandparent (the zig-zag and zig-zig operations). Why would the following not work:
as we insert, for instance, a new node into the tree, we check whether we insert into the left or the right subtree. If we insert into the left, we rotate the result RIGHT, and vice versa for the right subtree. Recursively it would be something like this:
Tree insert(Tree root, Key k) {
    if (root == null) return new Tree(k);         // new key becomes a leaf
    if (k.compareTo(root.key) < 0) {              // assumes Key is Comparable
        root.setLeft(insert(root.getLeft(), k));
        return rotateRight(root);                 // lift the freshly updated side
    } else {                                      // vice versa for right subtree
        root.setRight(insert(root.getRight(), k));
        return rotateLeft(root);
    }
}
That should avoid the whole "splay" procedure, don't you think?
The algorithm you're proposing on the tree is called the "move-to-root" heuristic and is discussed on page four of Sleator and Tarjan's original paper on splay trees. They cite an older paper by Allen and Munro where it is shown that if you try to use move-to-root as a means for reshaping trees, it is possible for the amortized cost of each lookup to be O(n), which is quite slow. Splaying is a very carefully designed algorithm for reshaping the tree, and it guarantees amortized O(log n) lookups no matter what sequence of accesses is performed.
Intuitively, move-to-root is not a very good way to reshape the tree because it moves down all the nodes on the path from the node to the root while trying to make the accessed node easier to reach in the future. As a result, the overall tree can get worse when doing this version of tree reorganizing. On the other hand, the splay method tends to decrease the height of the splayed node and all of the nodes on its access path, which means that as a whole the tree tends to get better during a splay.
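To make the case analysis concrete, here is a hedged sketch of the bottom-up splay step, assuming nodes carry parent pointers (the SplayNode class and field names are illustrative):

class SplayNode {
    int key;
    SplayNode parent, left, right;
}

// Rotate x above its parent, fixing all parent pointers.
static void rotate(SplayNode x) {
    SplayNode p = x.parent, g = p.parent;
    if (p.left == x) {
        p.left = x.right;
        if (x.right != null) x.right.parent = p;
        x.right = p;
    } else {
        p.right = x.left;
        if (x.left != null) x.left.parent = p;
        x.left = p;
    }
    p.parent = x;
    x.parent = g;
    if (g != null) {
        if (g.left == p) g.left = x; else g.right = x;
    }
}

// Splay x to the root. The grandparent test is the whole point:
// in the zig-zig case the PARENT is rotated first, which folds the
// access path in half; move-to-root would just rotate x twice.
static void splay(SplayNode x) {
    while (x.parent != null) {
        SplayNode p = x.parent, g = p.parent;
        if (g == null) {
            rotate(x);                              // zig
        } else if ((g.left == p) == (p.left == x)) {
            rotate(p); rotate(x);                   // zig-zig
        } else {
            rotate(x); rotate(x);                   // zig-zag
        }
    }
}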
Hope this helps!

Why is an AVL tree faster for searching than a red-black tree?

I have read in a couple of places that AVL trees search faster, but I am not able to understand why. As I understand it:
max height of red-black tree = 2*log(N+1)
height of AVL tree = 1.44*log(N+1)
Is it because AVL is shorter?
Yes.
The number of steps required to find an item depends on the distance between the item and the root.
Since the AVL tree is packed more tightly (i.e. it has a lower maximum height), more items are closer to the root than in the red-black case.
The tighter packing also means the AVL tree requires more work when inserting elements.
The best choice for any app depends on whether it is insert-intensive or search-intensive...
An AVL tree is better than a red-black tree when the input keys are almost ascending/descending, because then we only have to do a single rotation (the left-left or right-right case) to add each element. Also, since the tree would be tightly balanced, searching would be faster.
But for randomly selected input keys, red-black trees are better, since they require fewer rotations for insertion in comparison to AVL.
Overall, it depends on the input sequence, which decides how tilted our tree is, and on the operations performed. For insert-intensive use a red-black tree, and for search-intensive use AVL.
AVL trees and red-black trees each have their respective advantages and disadvantages. You'll perceive that better once you've learned how they work.
AVL is slightly faster than a red-black tree for insertion because there is at most one rotation involved in an insertion, while there may be two for a red-black tree.
A red-black tree requires at most three rotations for deletion, but this is not guaranteed in AVL, so it can delete nodes faster than AVL.
However, above all, they both have strictly logarithmic tree height.
Pick any subtree: the property that makes AVL "balanced" guarantees that the difference in height between its two child subtrees is at most one, which is to say, intuitively, that the whole tree is rigidly balanced.
But when it comes to a red-black tree, the rule becomes "looser", since the properties of a red-black tree can only guarantee that the depth of the tree is no larger than twice the logarithm of the total number of nodes.
Here are some facts that may be more precise:
An AVL tree's height is strictly less than 1.44 log2(n+2) − 0.328 (approximately).
A red-black tree's height is at most 2 log2(n+1).
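Plugging in a concrete n shows the gap. For n = 1,000,000 keys (a small worked example; log2(10^6) ≈ 19.93):

AVL: height < 1.44 × log2(10^6 + 2) − 0.328 ≈ 28.4
Red-black: height ≤ 2 × log2(10^6 + 1) ≈ 39.9

So in the worst case an AVL search inspects roughly 28 nodes, versus roughly 40 for a red-black tree.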
See https://en.wikipedia.org/wiki/AVL_tree#Comparison_to_other_structures for detailed information.

How does a red-black tree work?

There are lots of questions around about red-black trees but none of them answer how they work. Why is it called red-black? How does this keep the tree balanced (thus increasing performance over an unbalanced normal binary search tree)? I'm just looking for an overview of how and why it works.
For searches and traversals, it's the same as any binary tree.
For inserts and deletes, more sophisticated algorithms are applied which aim to ensure that the tree cannot be too unbalanced. These guarantee that all single-item operations will always run in at worst O(log n) time, whereas a simple binary tree can become so unbalanced that it's effectively a linked list, giving O(n) worst-case performance for each single-item operation.
The basic idea of the red-black tree is to imitate a B-tree with up to 3 keys and 4 children per node. B-trees (or variations such as B+ trees) are mainly used for database indexes and for data stored on hard disk.
Each binary tree node has a "colour" - red or black. Each black node is, in the B-tree analogy, the subtree root for the subtree that fits within that B-tree node. If this node has red children, they are also considered part of the same B-tree node. So it is possible (though not done in practice) to convert a red-black tree to a B-tree and back, with (most) structure preserved. The only possible anomaly is that when a B-tree node has two keys and three children, you have a choice of which key goes into the black node in the equivalent red-black tree.
For example, with red-black trees, every path from root to leaf has the same number of black nodes. This rule is derived from the B-tree rule that all leaf nodes are at the same depth.
Although this is the basic idea from which red-black trees are derived, the algorithms used in practice for inserts and deletes are modified to enforce all the B-tree rules (there might be a minor exception - I forget) during updates, but are tailored for the binary tree form. This means that a red-black tree insert or delete may give a different structure for the result than you'd expect from doing the equivalent B-tree insert or delete.
For more detail, follow the Wikipedia link that MigDus already supplied.
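As a hedged illustration of those invariants, here is a small validator sketch (the RBNode class and field names are made up for the example): it rejects a red node with a red child and checks that every path from a node down to a null leaf passes the same number of black nodes, returning that black height.

class RBNode {
    int key;
    boolean red;          // false means black
    RBNode left, right;
}

// Returns the black height of the subtree, or throws if an invariant breaks.
// Null leaves count as black, contributing a height of 1.
static int checkInvariants(RBNode node) {
    if (node == null) return 1;
    if (node.red) {
        boolean redChild = (node.left != null && node.left.red)
                        || (node.right != null && node.right.red);
        if (redChild) throw new IllegalStateException("red node with red child");
    }
    int leftBlack = checkInvariants(node.left);
    int rightBlack = checkInvariants(node.right);
    if (leftBlack != rightBlack)
        throw new IllegalStateException("unequal black counts on some path");
    return leftBlack + (node.red ? 0 : 1);
}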
A red-black tree is an ordered binary tree where each vertex is coloured red or black. The intuition is that a red vertex should be seen as being at the same height as its parent (i.e., an edge to a red vertex is thought of as "horizontal" rather than "descending").
[I don't believe the Wikipedia entry makes this point clear.]
The usual rules for red-black trees require that a red vertex never point to another red vertex. This means that the possible vertex arrangements for any subtree rooted at a black vertex (bbb, bbr, rbb, rbr -- for [left child][root][right child]) correspond to 2-3-4 trees.
Searching a red-black tree is just the same as searching an ordinary binary tree. Insertion and deletion are similar, except that a "fix-up" rotation may be required at some point to preserve the red-black invariant.
Cheers!
