Using a 2-3-4 tree instead of a splay tree - data-structures

I'm in a data structures course right now and we learned about 2-3-4 trees and splay trees. I was wondering in what circumstances would you use a 2-3-4 tree instead of a splay tree? They're both self balancing and sorted so I don't see that much of a difference between them.

A 2-3-4 tree only changes the structure on insertions and deletions, while a splay-tree also re-organizes the nodes on searches.
Splay trees will, thanks to the re-organization on lookup, provide faster responses if your typical usage pattern happens to look up a small subset of elements most of the time.
It is possible to implement a 2-3-4 tree such that the smallest element can be looked up in O(1), but generally both offer insertion and deletion in O(log n): worst-case for the 2-3-4 tree, amortized for the splay tree.
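To make the "re-organizes on lookup" point concrete, here is a minimal Python sketch of a splay tree. It is not taken from the answer above: the names Node, splay, insert, build, depth and in_order are made up for illustration, and this is the common recursive textbook formulation rather than a tuned implementation. It only shows that a lookup moves the accessed key to the root, so repeated lookups of the same few keys become cheap.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y

def splay(root, key):
    """Move key (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                    # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                         # zig-zig (mirror)
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                       # zig-zag (mirror)
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)

def insert(root, key):
    """Splay the nearest key to the root, then hang the old tree under a new node."""
    if root is None:
        return Node(key)
    root = splay(root, key)
    if root.key == key:
        return root
    node = Node(key)
    if key < root.key:
        node.right, node.left, root.left = root, root.left, None
    else:
        node.left, node.right, root.right = root, root.right, None
    return node

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

def depth(root, key):
    """Number of edges from the root to key, or -1 if key is absent."""
    d = 0
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
        d += 1
    return d if root is not None else -1

def in_order(n):
    return [] if n is None else in_order(n.left) + [n.key] + in_order(n.right)

root = build(range(1, 8))      # sorted inserts leave key 1 at the bottom
print(depth(root, 1))          # 6
root = splay(root, 1)          # a lookup of 1 restructures the tree
print(depth(root, 1))          # 0
```

A 2-3-4 tree (or any tree that only rebalances on updates) would leave key 1 at the same depth no matter how often it is looked up.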

Related

Comparison of Avl Tree and Red Black Tree

I have an exam tomorrow and there are 3 questions in my notes that I cannot understand.
1- #searches >> #insertions and #deletions=0 Which tree is that? (Avl or Red-Black Tree) (Answer is Avl)
2- #insertions>0 and #searches=#deletions=0 Which tree is that? (Avl or Red-Black Tree) (Answer is Red-Black)
3- #insertions=#deletions and #searches=0 Which tree is that? (Avl or Red-Black Tree) (Answer is Red-Black)
Can you explain them please?
Thanks for help
AVL trees, compared to red/black trees, usually have smaller height because the AVL invariants give less room for imbalance. However, red/black trees, compared to AVL trees, have faster insertions and deletions (the fixup cost of maintaining the red/black invariants is lower than the fixup cost of maintaining the AVL invariants.)
For case (1), an AVL tree is probably better because the cost of the lookups will be lower and, if the number of lookups is truly much larger than the number of insertions, the AVL tree will have a comparative advantage.
For case (2), the red/black tree will probably be faster because it supports faster insertions.
For case (3), for the same reason as part (2), the red/black tree will probably be faster.
Hope this helps!

How are insertion and deletion faster in a red-black tree than in an AVL tree?

I would like to understand the difference a bit better, but haven't found a source that can break it down to my level.
I am aware that both trees require at most 2 rotations per insertion. Then how is insertion faster in red-black trees?
And how does insertion require O(log n) rotations in an AVL tree but only O(1) in a red-black tree?
Well, I don't know what your level is, exactly, but to put it simply, red-black trees are less balanced than AVL trees. For red-black trees, the path from the root to the furthest leaf is no more than twice as long as the path from the root to the nearest leaf, while for AVL trees there is never more than one level difference between two neighboring subtrees. This makes insertions and deletions slightly more costly in AVL trees but lookup faster. The asymptotic and worst-case behavior of the two data structures is identical though (the runtime (not number of rotations) is O(log n) for insertions in both cases, the O(1) you mentioned is the so-called amortized runtime).
See this paragraph for a short comparison of the two data structures.
Insertion and deletion is not faster in red-black trees. This is a common ASSUMPTION and the assumption is based on the fact that red-black trees perform slightly fewer rotations on average per insert than AVL (.6 vs .7).
You can check for yourself in Java comparing TreeMap(red-black) to this implementation of TreeMapAVL and you can get exact numbers instead of the common, but incorrect, assumptions. https://github.com/dmcmanam/bbst-showdown

Red black tree over avl tree

AVL and red-black trees are both self-balancing; the visible difference is the red and black colouring of the nodes. What's the main reason for choosing red-black trees instead of AVL trees? What are the applications of red-black trees?
What's the main reason for choosing Red black trees instead of AVL trees?
Red-black trees and AVL trees are the most commonly used balanced binary search trees, and both support insertion, deletion and look-up in guaranteed O(log N) time. However, there are the following points of comparison between the two:
AVL trees are more rigidly balanced and hence provide faster look-ups. Thus for a look-up intensive task use an AVL tree.
For an insert-intensive task, use a red-black tree.
AVL trees store the balance factor at each node. This takes O(N) extra space. However, if we know that the keys that will be inserted in the tree will always be greater than zero, we can use the sign bit of the keys to store the colour information of a red-black tree. Thus, in such cases red-black tree takes no extra space.
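The sign-bit trick from the last point can be sketched in a few lines of Python. This is only an illustration under the stated assumption that every key is strictly positive; the helper names pack, key_of and colour_of are hypothetical, and Python integers have no fixed-width sign bit, so the sign of the stored number stands in for it here.

```python
RED, BLACK = "red", "black"

def pack(key, colour):
    """Store a red node's key as -key and a black node's key as +key."""
    if key <= 0:
        raise ValueError("the trick only works for strictly positive keys")
    return -key if colour == RED else key

def key_of(stored):
    """Recover the original key, whatever the colour."""
    return abs(stored)

def colour_of(stored):
    """Recover the colour from the sign alone."""
    return RED if stored < 0 else BLACK

stored = pack(42, RED)
print(key_of(stored), colour_of(stored))   # 42 red
```

Every comparison inside the tree then has to go through key_of, which is the price paid for not having a separate colour field.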
What are the applications of red-black trees?
Red-black trees are more general purpose. They do relatively well on add, remove, and look-up, but AVL trees have faster look-ups at the cost of slower add/remove. Red-black trees are used in the following:
Java: java.util.TreeMap, java.util.TreeSet
C++ STL (in most implementations): map, multimap, multiset
Linux kernel: completely fair scheduler, linux/rbtree.h
Try reading this article
It offers some good insights on differences, similarities, performance, etc.
Here's a quote from the article:
RB-Trees are, as well as AVL trees, self-balancing. Both of them provide O(log n) lookup and insertion performance.
The difference is that RB-Trees guarantee O(1) rotations per insert operation. That is what actually costs performance in real implementations.
Simplified, RB-Trees gain this advantage from conceptually being 2-3 trees without carrying around the overhead of dynamic node structures. Physically RB-Trees are implemented as binary trees, the red/black-flags simulate 2-3 behaviour
As far as my own understanding goes, AVL trees and RB trees are not very far off in terms of performance. An RB tree is simply a variant of a B-tree and balancing is implemented differently than an AVL tree.
Our understanding of the differences in performance has improved over the years, and now the main reason to use red-black trees over AVL trees would be not having access to a good AVL implementation; these are slightly less common, perhaps because AVL trees are not covered in CLRS.
Both trees are now considered forms of rank-balanced trees, but red-black trees are consistently slower by about 20% in real-world tests, or even 30-40% slower when sequential data is inserted.
So people who have studied red-black trees but not AVL trees tend to choose red-black trees. The primary uses for red-black trees are detailed on the Wikipedia entry for them.
Other answers here sum up the pros & cons of RB and AVL trees well, but I found this difference particularly interesting:
AVL trees do not support constant amortized update cost [but red-black trees do]
Source: Mehlhorn & Sanders (2008) (section 7.4)
So, while both RB and AVL trees guarantee O(log(N)) worst-case time for lookup, insert and delete, restoring the AVL/RB property after inserting or deleting a node can be done in O(1) amortized time for red-black trees.
Insertions in AVL trees and in RB trees both require a maximum of 2 rotations. From https://adtinfo.org/ :
The primary advantage of red-black trees is that, in AVL trees, deleting one node from a tree containing n nodes may require log2(n) rotations, but deletion in a red-black tree never requires more than three rotations.
They're both among the most commonly used tree types, but red-black trees have faster insertions because their balancing is more relaxed. Apart from that, both have O(log N) search time. They are both equally good, but an AVL tree is usually better balanced: its worst-case height is about 1.44 log N, versus about 2 log N for a red-black tree.
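The 1.44 log N versus 2 log N figures can be checked numerically. The sketch below (function names are made up for illustration) uses the standard recurrence for the fewest nodes an AVL tree of height h can have, N(h) = N(h-1) + N(h-2) + 1, and the usual red-black bound of 2 log2(n + 1). Height is counted in edges here, which is an assumption of this sketch.

```python
import math

def min_avl_nodes(h):
    """Fewest nodes an AVL tree of height h (counted in edges) can have."""
    a, b = 1, 2                  # N(0) = 1, N(1) = 2
    for _ in range(h):
        a, b = b, a + b + 1      # N(h) = N(h-1) + N(h-2) + 1
    return a

def max_avl_height(n):
    """Largest height any AVL tree with n >= 1 nodes can reach."""
    h = 0
    while min_avl_nodes(h + 1) <= n:
        h += 1
    return h

def rb_height_bound(n):
    """Upper bound on the height of a red-black tree with n nodes."""
    return 2 * math.log2(n + 1)

n = 1_000_000
print(max_avl_height(n))         # 27
print(1.44 * math.log2(n))       # the approximate AVL bound, for comparison
print(rb_height_bound(n))        # the red-black bound, for comparison
```

For a million keys the worst AVL tree is 27 levels deep, while the red-black bound allows nearly 40, which is the gap the look-up argument relies on.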

Applications of red-black trees

What are the applications of red-black (RB) trees? Is there any application where only RB Trees can be used and no other data structures?
A red-black tree is a particular implementation of a self-balancing binary search tree, and today it seems to be the most popular choice of implementation.
Binary search trees are used to implement finite maps, where you store a set of keys with associated values. You can also implement sets by only using the keys and not storing any values.
Balancing the tree is needed to guarantee good performance, as otherwise the tree could degenerate into a list, for example if you insert keys which are already sorted.
The advantage of search trees over hash tables is that you can traverse the tree efficiently in sort order.
AVL-trees are another variant of balanced binary search trees. They were popular before red-black trees were known. They are more carefully balanced, with a maximal difference of one between the heights of the left and right subtree (RB trees guarantee at most a factor of two). Their main drawback is that rebalancing takes more effort.
So red-black trees are certainly a good but not the only choice for this application.
Red Black Trees are from a class of self balancing BSTs and as answered by others, any such self balancing tree can be used. I would like to add that Red-black trees are widely used as system symbol tables. For example they are used in implementing the following:
Java: java.util.TreeMap , java.util.TreeSet .
C++ STL: map, multimap, multiset.
Linux kernel: completely fair scheduler, linux/rbtree.h
Unless you have very specific performance requirements, an R-B tree could be replaced by some other self-balancing binary tree, for example an AVL tree. Choosing between the two of them is basically a performance optimization - they offer the same basic operations.
Not that either of them is definitively "faster" than the other, just that they're different enough that specific uses of them will tend to have slightly different performance, all else being equal. So if you draw your requirements carefully enough, or just by chance, you could end up with one of them being "fast enough" for your use, and the other not. R-B offers slightly faster insertion than AVL, at the cost of slightly slower lookup.
There is no rule that a red-black tree can only be used in a particular case.
It depends on the application. When you have to build the tree only once and query it many times, you can go for an AVL tree, because searching in an AVL tree is quite fast. But it is strictly balanced, so insertion and deletion may take some time.
An AVL tree may be used for a language dictionary, where you have to build the data structure just once,
and the red-black tree is used in the Completely Fair Scheduler in current Linux kernels.
The constraints applied to the red-black tree also ensure that the path from the root to the furthest leaf is no more than twice as long as the path from the root to the nearest leaf.
BTW, the search, insert, etc. times for a red-black tree are listed here:
          Average     Worst case
Space     O(n)        O(n)
Search    O(log n)    O(log n)
Insert    O(log n)    O(log n)
Delete    O(log n)    O(log n)

Tree Datastructures

I've tried to understand what sorted trees are and binary trees and avl and and and ...
I'm still not sure: what makes a sorted tree sorted? And what is the difference in complexity (Big-Oh) between searching in a sorted tree and searching in an unsorted tree? Hope you can help me.
Binary Trees
There exist two main types of binary trees, balanced and unbalanced. A balanced tree aims to keep the height of the tree (height = the number of nodes between the root and the furthest leaf) as small as possible. There are several balanced-tree algorithms, the two most famous being AVL and red-black trees. The complexity of insert/delete/search operations on both AVL and red-black trees is O(log n) or better, which is the important part. Other self-balancing trees are AA, splay and scapegoat trees.
Balanced trees gain their property (and name) of being balanced from the fact that after every delete or insert operation on the tree the algorithm introspects the tree to make sure it's still balanced, if it's not it will try to fix this (which is done differently with each algorithm) by rotating nodes around in the tree.
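As a rough illustration of what "rotating nodes around" means, here is a single left rotation in Python. The Node class and helper names are made up for this sketch, and real AVL or red-black code additionally decides when to rotate; this only shows that a rotation lowers the height while keeping the sorted order intact.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_left(x):
    """Lift x's right child above x and return the new subtree root."""
    y = x.right
    x.right = y.left     # y's left subtree sits between x and y in sort order
    y.left = x
    return y

def height(n):
    """Height in edges; an empty tree has height -1."""
    return -1 if n is None else 1 + max(height(n.left), height(n.right))

def in_order(n):
    return [] if n is None else in_order(n.left) + [n.key] + in_order(n.right)

chain = Node(1, right=Node(2, right=Node(3)))   # 1 -> 2 -> 3, leaning right
print(height(chain), in_order(chain))           # 2 [1, 2, 3]
root = rotate_left(chain)                       # 2 becomes the root
print(height(root), in_order(root))             # 1 [1, 2, 3]
```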
Normal (or unbalanced) binary trees do not modify their structure to keep themselves balanced and risk becoming very inefficient over time (especially if the values are inserted in order). However, if performance is not an issue and you mainly want a sorted data structure, they might do. The complexity of insert/delete/search operations on an unbalanced tree ranges from O(1) (best case, if you want the root) to O(n) (worst case, if you inserted all nodes in order and want the largest node).
There exists another variation which is called a randomized binary tree which uses some kind of randomization to make sure the tree doesn't become fully unbalanced (which is the same as a linked list)
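The "inserted in order" worst case is easy to reproduce. The following Python sketch (all names are illustrative) builds a plain, non-balancing binary search tree twice, once from sorted keys and once from the same keys shuffled, and compares the heights. The fixed seed is only there to make the run repeatable.

```python
import random

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Plain BST insert with no rebalancing; returns the (unchanged) root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                break
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                break
            node = node.right
    return root

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

def height(root):
    """Height in edges, computed level by level to avoid deep recursion."""
    h, level = -1, [root] if root is not None else []
    while level:
        h += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return h

keys = list(range(200))
print(height(build(keys)))            # 199: the tree is effectively a linked list

random.Random(0).shuffle(keys)
print(height(build(keys)))            # far smaller for a shuffled order
```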
A binary search tree is a tree structure where every node has at most two child nodes.
All nodes in a node's left subtree are less than that node, and all nodes in its right subtree are greater.
The interesting thing about a binary search tree is that we can search for a value in O(log n) when the tree is balanced. Doing the same search in a linked list, for example, would take O(n).
The best way to go about learning datastructures would be to do a day of googling and reading wikipedia articles.
This might get you started
http://en.wikipedia.org/wiki/Binary_search_tree
Do a google search for the following:
site:stackoverflow.com binary trees
to get a list of SO questions which will answer your several questions.
There isn't really a lot of point in using a tree structure if it isn't sorted in some fashion - if you are planning on searching for a node in the tree and it is unsorted, you will have to traverse the entire tree (O(n)). If you have a tree which is sorted in some fashion, then it is only necessary to traverse down a single branch of the tree (typically O(log n)).
In a binary search tree the left child is always smaller than the head and the right child is always bigger, so you can search a sorted (and balanced) tree in O(log(n)): you just go left if the key is smaller than the head and right if it is bigger.
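Using the usual convention (smaller keys to the left, larger keys to the right), the search described in these answers can be sketched in Python as follows. The names Node, build_balanced and search are made up for the example, and build_balanced is only a shortcut to get a well-shaped tree from sorted keys.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def build_balanced(keys):
    """Build a balanced BST from an already sorted list of keys."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid],
                build_balanced(keys[:mid]),
                build_balanced(keys[mid + 1:]))

def search(root, key):
    """Return (found, nodes_visited) while following a single branch."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.key:
            return True, visited
        node = node.left if key < node.key else node.right
    return False, visited

root = build_balanced(list(range(1, 16)))   # 15 keys, 4 levels
print(search(root, 8))     # (True, 1): 8 is the root
print(search(root, 15))    # (True, 4): one node per level
print(search(root, 16))    # (False, 4): still only one branch visited
```

With 15 keys no search looks at more than 4 nodes, whereas an unsorted tree would have to be traversed in full in the worst case.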

Resources